stemedb/applications/aphoria/dogfood/dbpool/eval-archive-2026-02-09/initial-observations-2026-02-09-run2.md


Initial Observations - Run 2

Timestamp: 2026-02-09T22:50:00Z
Evaluator: aphoria-doc-evaluator
Phase: Pre-implementation review


Team State vs Reality

Observation 1: .aphoria/config.toml Status

Team Said:

"Not Started Yet: Day 3: Scanning (no .aphoria/config.toml yet)"

Reality:

$ ls -la /home/jml/Workspace/stemedb/applications/aphoria/dogfood/dbpool/.aphoria/
-rw-rw-r-- 1 jml jml 2126 Feb  9 21:28 config.toml

Analysis:

  • .aphoria/config.toml EXISTS (created during reset)
  • Team incorrectly believes it doesn't exist
  • This is NOT a team error: a plain ls does not list hidden directories such as .aphoria/, so they may never have seen it

Potential Documentation Gap:

  • CHECKLIST.md Day 3 says "Create .aphoria/config.toml" but it already exists
  • Should say "Verify .aphoria/config.toml" or "Update .aphoria/config.toml"
  • Need to check: Does Day 3 acknowledge config.toml is pre-created?
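
The hidden-directory point can be checked mechanically. The following is a minimal sketch, assuming a project root that contains `.aphoria/config.toml` as in the listing above. The helper names `visible_entries` and `config_status` are illustrative and are not part of aphoria; the sketch only reads the filesystem and never creates or edits the config.

```python
import tempfile
from pathlib import Path


def visible_entries(root: Path) -> list[str]:
    """Names a plain `ls` would show: entries that do not start with a dot."""
    return sorted(p.name for p in root.iterdir() if not p.name.startswith("."))


def config_status(root: Path) -> str:
    """Report whether .aphoria/config.toml is present under root.

    Hypothetical helper for illustration: it checks for the file only.
    """
    config = root / ".aphoria" / "config.toml"
    return "present" if config.is_file() else "missing"


if __name__ == "__main__":
    # Build a throwaway layout that mimics the dogfood project directory.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "src").mkdir()
        (root / ".aphoria").mkdir()
        (root / ".aphoria" / "config.toml").write_text("# placeholder\n")

        print(visible_entries(root))  # the hidden directory is not listed
        print(config_status(root))    # the config file is still found
```

A "Verify .aphoria/config.toml" step in the checklist could rest on a check of this kind rather than on the reader spotting the directory by eye.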

Observation 2: Pre-Flight Validator Positioning

Team Question:

"Do you want me to: ... 2. Run the pre-flight validator first? To ensure the environment is ready."

Documentation Says (CHECKLIST.md:10-13):

### ⚡ Quick Start: Run Pre-Flight Validator

Before manually checking each item, run the automated validator:

Analysis:

  • Pre-flight validator IS documented as first step
  • Team found it (mentioned in "What's Working Well")
  • But the team is ASKING whether to run it instead of just running it
  • This suggests the positioning is not strong enough: "Quick Start" may read as optional

Potential Documentation Gap:

  • The heading says "Quick Start", which implies an optional or alternative path
  • Should be: "REQUIRED: Run Pre-Flight Validator First"
  • Or move it above "Pre-Execution Requirements" as mandatory Step 0
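
To make "mandatory Step 0" concrete, here is a minimal sketch of a checklist runner in which a failed step blocks everything after it. The `run_checklist` function, the step names, and the `validator_ran` flag are all hypothetical; nothing here reflects how CHECKLIST.md or the real validator is implemented.

```python
from typing import Callable

# A step is a label plus a check that returns True when the step passes.
Step = tuple[str, Callable[[], bool]]


def run_checklist(steps: list[Step]) -> str:
    """Run steps in order and stop at the first one whose check fails."""
    for name, check in steps:
        if not check():
            return f"blocked at: {name}"
    return "all steps passed"


if __name__ == "__main__":
    validator_ran = False  # assumption: set to True once the validator has passed
    steps = [
        ("Step 0: pre-flight validator", lambda: validator_ran),
        ("Pre-Execution Requirements", lambda: True),
    ]
    print(run_checklist(steps))
```

With the validator placed first, later steps are never reached until it passes, which is the behavior a "REQUIRED" heading would be asking readers to follow by hand.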

Observation 3: Claim Creation Confidence

Team Question:

"Help create the 27 corpus claims now? I can extract claims from the authority source documents and generate the CLI commands."

Team Plan:

"Create 3 practice claims following the example"

"Create remaining 24-27 claims using templates in CHECKLIST.md"

Analysis:

  • The team's offer to help create claims could mean the templates are not sufficient on their own
  • BUT the team also plans to use the templates, so they did find them
  • This is GOOD: the team is cautious and wants to verify the approach before creating all 27

Not a Documentation Gap (yet):

  • Need to wait and see if templates are actually insufficient
  • Could be team being thorough rather than templates being unclear

Observation 4: Sequential Understanding

Team Understanding:

"The workflow: corpus → code violations → scan → fix → re-scan"

Analysis:

  • Team correctly understands Day 1 → 2 → 3 → 4 sequence
  • Team verified current state (0 claims)
  • Team plans to start with 3 practice claims (following docs)

Not a Gap:

  • Sequential flow is well understood

Early Warning Signals

Signal 1: Hidden Files Visibility

Team didn't notice that .aphoria/config.toml exists. Possible causes:

  1. Ran ls rather than ls -la, so the hidden directory was not listed
  2. Documentation doesn't explicitly say "config.toml is pre-created"
  3. Day 3 wording "Create .aphoria/config.toml" is misleading

Action: Check Day 3 wording when team reaches it

Signal 2: Mandatory vs Optional

Team is treating the pre-flight validator as an optional choice. Possible causes:

  1. "Quick Start" heading implies an alternative path
  2. Validator is not positioned as a blocking requirement
  3. Reader can skip directly to the "Manual Verification" section

Action: If team skips validator, this is a HIGH PRIORITY gap

Signal 3: Template Self-Sufficiency

Team raised the question of help with creating claims. Need to monitor:

  1. Do templates provide enough examples?
  2. Are authority source documents clear enough?
  3. Is claim-extraction-example.md sufficient?

Action: Wait to see if team successfully creates claims using templates


Predictions

Likely Success

  • Team will successfully create 3 practice claims
  • Team will find and use templates in CHECKLIST.md
  • Team will verify claims with curl command

Likely Confusion Points

  • Day 3: "Create .aphoria/config.toml" when it already exists
  • Authority tier selection (which tier for which source)
  • Explanation format (maintaining WHAT+WHY+CONSEQUENCE structure)

Unlikely Issues

  • Team seems well-prepared and thorough
  • Documentation review was comprehensive
  • Understanding of concepts is solid

Next Evaluation Checkpoint

When to log next progress:

  • After team runs validator (or skips it)
  • After team creates 3 practice claims
  • After team attempts to create all 27 claims

When to trigger review:

  • Team says "Day 1 complete"
  • Team says "claims ready for review"
  • Team reports confusion or blocking issue

What to watch:

  1. Did they run validator?
  2. Did templates suffice for claim creation?
  3. Did they run into the config.toml issue on Day 3?
  4. Did they successfully create 25-30 claims?

Documentation Hypotheses (To Test)

Hypothesis 1: "Quick Start" heading makes validator seem optional

  • Test: Did team skip validator?
  • Expected if gap: Team proceeds to Day 1 without running validator

Hypothesis 2: Day 3 "Create config.toml" is misleading when the file already exists

  • Test: Team confusion when reaching Day 3
  • Expected if gap: Team tries to create file that already exists

Hypothesis 3: Templates are insufficient for claim creation

  • Test: Team asks for help or creates wrong format
  • Expected if gap: Team can't complete claims without assistance
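
All three hypotheses share one shape: a statement, a test, and the behavior expected if the gap is real. The sketch below shows one way to record them so each checkpoint can be logged consistently. The `Hypothesis` class, the `Outcome` values, and the `record` method are illustrative assumptions, not an existing evaluator tool, and no outcomes are filled in because none have been observed yet.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    UNTESTED = "untested"
    GAP_CONFIRMED = "gap confirmed"
    GAP_NOT_OBSERVED = "gap not observed"


@dataclass
class Hypothesis:
    statement: str
    test: str
    expected_if_gap: str
    outcome: Outcome = Outcome.UNTESTED

    def record(self, gap_behavior_seen: bool) -> None:
        """Set the outcome from whether the expected-if-gap behavior occurred."""
        self.outcome = (
            Outcome.GAP_CONFIRMED if gap_behavior_seen else Outcome.GAP_NOT_OBSERVED
        )


# The three hypotheses from this log, all still untested.
HYPOTHESES = [
    Hypothesis(
        statement='"Quick Start" heading makes validator seem optional',
        test="Did team skip validator?",
        expected_if_gap="Team proceeds to Day 1 without running validator",
    ),
    Hypothesis(
        statement='Day 3 "Create config.toml" misleading when file exists',
        test="Team confusion when reaching Day 3",
        expected_if_gap="Team tries to create file that already exists",
    ),
    Hypothesis(
        statement="Templates insufficient for claim creation",
        test="Team asks for help or creates wrong format",
        expected_if_gap="Team can't complete claims without assistance",
    ),
]

if __name__ == "__main__":
    for number, hypothesis in enumerate(HYPOTHESES, start=1):
        print(f"Hypothesis {number}: {hypothesis.outcome.value}")
```

At a later checkpoint, calling `record(True)` or `record(False)` on a hypothesis would capture whether the expected-if-gap behavior was actually seen.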

Status

Current Phase: Pre-implementation (Day 1 about to start)
Team Confidence: High
Blocking Issues: None identified yet
Ready for Next Phase: Yes

Next Action: Wait for team to proceed with validator and claim creation, then log results.