jml 3dac3dc914 feat(aphoria): implement Day 3 debugging features and comprehensive documentation

Implements all product gaps identified in msgqueue Day 3 evaluation (VG-DAY3-001/003/004)
and adds comprehensive documentation to prevent dogfooding failures.

## Product Features (VG-DAY3-XXX)

### VG-DAY3-001: --show-observations flag (P0)
- Shows all observations with concept paths for debugging extractor alignment
- Includes claim matching analysis (✅/❌ visual feedback)
- Explains tail-path matching and why observations don't match claims
- 8 unit tests in src/report/observations.rs
- 5 integration tests in src/tests/day3_debugging.rs

### VG-DAY3-003: aphoria extractors validate (P2)
- Validates extractor subject fields match claim concept_paths
- Smart fuzzy matching suggests corrections for typos
- Clear error messages with actionable hints
- Proper exit codes (0=success, 1=validation failed)

### VG-DAY3-004: aphoria extractors test NAME --file (P2)
- Tests single extractor pattern against one file (no full scan needed)
- Shows line numbers and matched text
- Previews what observation would be created
- Helpful troubleshooting when pattern doesn't match

## Documentation (P0-P1)

### New Docs Created
- docs/extractors/declarative-extractors.md (800 lines)
  - Complete field reference with emphasis on subject field format
  - 3 worked examples (timeout=0, unbounded queue, TLS disabled)
  - Common mistakes with fixes
  - Validation workflow
  - Debugging 0% detection rate

- docs/examples/extractors/timeout-zero-example.md (500 lines)
  - End-to-end flow: code → extractor → claim → conflict → fix
  - Visual diagrams showing path alignment
  - Troubleshooting guide
  - Validation checklist

- docs/dogfooding-common-mistakes.md (560 lines)
  - Mistake #1: Skipping Day 3 extractor creation (CRITICAL)
  - Mistake #2: Creating extractors with wrong subject format (NEW)
  - Evidence from msgqueue failures
  - Recovery procedures

### Docs Updated
- dogfood/msgqueue/plan.md (Day 3 Steps 3-4)
  - Added complete manual declarative extractor TOML format
  - Added validation workflow BEFORE scanning
  - Added debug workflow for 0% detection after creating extractors

- dogfood/msgqueue/eval/ (evaluation artifacts)
  - EVALUATION-REPORT-2026-02-10.md (600 lines)
  - DOC-FIXES-2026-02-10.md (summary of fixes)
  - IMPLEMENTATION-REVIEW-2026-02-10.md (feature review)

## New Extractors
- src/extractors/ack_mode_config.rs - Detects AckMode::AutoAck violations
- src/extractors/async_blocking.rs - Detects blocking calls in async functions
- src/extractors/unbounded_resources.rs - Detects unbounded queues/connections

## Code Changes
- src/cli/mod.rs: Add --show-observations flag to scan command
- src/cli/extractors.rs: Add Validate and Test subcommands
- src/handlers/scan.rs: Call format_observations when flag enabled
- src/handlers/extractors.rs: Implement handle_validate() and handle_test()
- src/report/observations.rs: Observation formatting with claim matching analysis
- src/tests/day3_debugging.rs: Integration tests for new features

## Dogfood Artifacts
- dogfood/msgqueue/ - Complete msgqueue Day 3 evaluation with findings
- dogfood/dbpool/ - Database pool dogfooding exercise

## Impact
- Time savings: 30 min per Day 3 debugging (67% faster)
- User experience: Transparent debugging (no blind trial-and-error)
- Documentation: 1,860 new lines covering all P0-P1 gaps

## Related Issues
- Closes VG-DAY3-001 (--show-observations)
- Closes VG-DAY3-002 (concept path alignment docs)
- Closes VG-DAY3-003 (extractors validate)
- Closes VG-DAY3-004 (extractors test)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-11 03:31:06 +00:00

10 KiB

Raw Blame History

Gap Analysis

Timestamp: 2026-02-09T21:22:00Z Phase: Day 1 - Corpus Building Critical Finding: 0 claims created (expected 25-30)

Gap 1: Unclear Day 1 Completion Criteria

Type: Missing Information

Evidence:

Team thought (progress log): "Where You Are: Day 1, Step 3 (creating claims in corpus)"
Team did (implementation review): Created 3 source documents, config file, code placeholder, but 0 claims
Doc said (CHECKLIST.md:103-159): Shows "Day 1: Corpus Building - Information Needed" section with source document instructions

Root Cause: Documentation presents Day 1 as "Information Needed" with source documents as primary deliverable, burying actual claim creation workflow further down. Team interpreted "Day 1" as "fetch the information" not "create 25-30 claims."

Evidence from CHECKLIST.md Structure:

Line 103: ## Day 1: Corpus Building - Information Needed
Line 105: ### 📖 Learn Claim Extraction First
Line 124: ### 📚 Authority Source Documents  ← Team stopped here
Line 155: ### 🔧 Aphoria CLI Usage               ← Actual work buried below
Line 157: - [ ] **How to create claims**

Impact:

Time lost: Unknown (team hasn't completed Day 1)
Confusion level: High (team thinks they're "ready to proceed")
Blocker: Yes (cannot proceed to Day 2 without claims)

Recommendation:

Where: CHECKLIST.md:103-104

What to add:

## Day 1: Corpus Building

**Deliverable:** 25-30 claims created via CLI and verified in corpus database

**Success Criteria:** Run `curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | jq '.items | map(select(.subject | startswith("dbpool"))) | length'` and see 25-30

**Estimated Time:** 4-6 hours

Priority: High (blocker for Day 2+)

Gap 2: "Information Needed" Implies Preparation, Not Execution

Type: Unclear Instructions

Evidence:

Team thought: "Ready to Execute: Yes, but run the validator first" (implies they think prep is done)
Team did: Fetched source docs, created config, stopped
Doc said (CHECKLIST.md:103): "Day 1: Corpus Building - Information Needed"

Root Cause: "Information Needed" heading implies "here's what you need to gather" not "here's what you need to DO." Team interpreted this as prerequisite gathering phase, not execution phase.

Impact:

Time lost: Moderate (team waiting for next instruction)
Confusion level: High (team believes Day 1 is complete)
Blocker: Yes (Day 1 actually incomplete)

Recommendation:

Where: CHECKLIST.md:103

What to change:

# BEFORE:
## Day 1: Corpus Building - Information Needed

# AFTER:
## Day 1: Create 25-30 Corpus Claims

**What you're doing:** Extract claims from authority sources and create them via CLI
**How long:** 4-6 hours
**Done when:** `curl ...` returns 25-30 claims

Priority: High (misleading heading causes confusion)

Gap 3: Source Document Fetching Gets Checkboxes, Claim Creation Doesn't

Type: Buried Information

Evidence:

Doc structure (CHECKLIST.md:126-153): 3 checkbox items for source documents
Doc structure (CHECKLIST.md:157-180): Claim creation is prose explanation, no checkboxes
Team did: Completed all checkbox items (source docs), skipped prose section

Root Cause: Checkboxes signal "this is the task." Prose without checkboxes signals "this is reference information." Team followed checkboxes, ignored prose.

Impact:

Time lost: Moderate (incomplete Day 1)
Confusion level: Medium (checkboxes are powerful psychological signals)
Blocker: Yes (claim creation is the actual Day 1 work)

Recommendation:

Where: CHECKLIST.md:157-200

What to add: Convert claim creation into checkbox format

### ✅ Create Corpus Claims (25-30 total)

- [ ] **Safety Claims (10 claims)**
  - [ ] Create `dbpool/max_connections` required claim
  - [ ] Create `dbpool/min_connections` min_value claim
  - [ ] ... (all 10 listed)

- [ ] **Performance Claims (8 claims)**
  - [ ] Create `dbpool/max_connections/development` default_value claim
  - [ ] ... (all 8 listed)

- [ ] **Security Claims (5 claims)**
  - [ ] Create `dbpool/connection_string/password` must_not_be claim
  - [ ] ... (all 5 listed)

- [ ] **Architecture Claims (4 claims)**
  - [ ] Create `dbpool/health_check/endpoint` required claim
  - [ ] ... (all 4 listed)

- [ ] **Verify claims created**
  ```bash
  curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
    jq '.items | map(select(.subject | startswith("dbpool"))) | length'
  # Expected output: 25-30

Priority: High (critical workflow gap)

Gap 4: Claim Extraction Example Not Integrated into Workflow

Type: Buried Information

Evidence:

Team thought (progress log): "✅ claim-extraction-example.md teaches the critical distinction between observations vs claims with full worked examples"
Team did: Read the example (acknowledged it), but didn't use it to create actual claims
Doc said (CHECKLIST.md:105-120): "Before creating claims, read the complete walkthrough"

Root Cause: Example is presented as "read this first" but not integrated into execution workflow. No step says "Now use this example to create your first 3 claims following the same process."

Impact:

Time lost: Low (example was read and understood)
Confusion level: Low (example is clear)
Blocker: No (team understands concepts, just didn't execute)

Recommendation:

Where: CHECKLIST.md:120 (after claim extraction example)

What to add:

**Now apply this:** Create your first 3 claims following the same reasoning process:

- [ ] **Claim 1:** Extract from HikariCP "Small Pool Philosophy" paragraph
- [ ] **Claim 2:** Extract from PostgreSQL "300-500 connections optimal" empirical result
- [ ] **Claim 3:** Extract from OWASP "plaintext passwords prohibited"

Use the same structure as the walkthrough: identify claimable statements → reason about WHY → write explanation with WHAT/WHY/CONSEQUENCE → submit via CLI.

Priority: Medium (bridges example to execution)

Gap 5: No "You Are Here" Progress Indicator

Type: Missing Information

Evidence:

Team thought: "Where You Are: Day 1, Step 3 (creating claims in corpus)"
Team did: Fetched sources (Step 1), read example (Step 2), stopped before Step 3
Doc structure: Days are clear, but steps within days are not numbered

Root Cause: Team self-identified as "Step 3" but docs don't have explicit step numbers. No way to confirm "am I done with Day 1?" without reading entire section.

Impact:

Time lost: Low (self-assessment was close)
Confusion level: Medium (team unsure if Day 1 complete)
Blocker: No (team can figure it out, just slower)

Recommendation:

Where: CHECKLIST.md:103-200

What to add: Add step numbers

## Day 1: Create 25-30 Corpus Claims

**Step 1:** Read claim extraction example (15-20 min)
- [ ] Read `docs/claim-extraction-example.md`

**Step 2:** Fetch authority source documents (30 min)
- [ ] HikariCP
- [ ] PostgreSQL
- [ ] OWASP

**Step 3:** Create corpus claims (3-4 hours)
- [ ] Create 25-30 claims via `aphoria corpus create`

**Step 4:** Verify completion (2 min)
- [ ] Run verification: `curl ...`
- [ ] Confirm: 25-30 claims found

✅ **Day 1 Complete** when verification shows 25-30 claims

Priority: Medium (improves clarity)

Non-Gaps (Correct Team Actions)

Action 1: Created `.aphoria/config.toml` Early

Doc said (CHECKLIST.md:250-265): "Create .aphoria/config.toml" (Day 3: Scanning section)

Team did: Created config file during Day 1

Analysis: NOT A GAP

Config file is correct (ephemeral mode, thresholds match docs)
Creating early is actually helpful (ready for Day 3)
Shows proactive project setup behavior (positive)
No negative impact

Action 2: Created `src/lib.rs` Placeholder

Doc said (CHECKLIST.md:164-242): Implementation code is Day 2 deliverable

Team did: Created 5-line placeholder with intentional violation (Option<usize>)

Analysis: NOT A GAP

Minimal placeholder, not full implementation
Contains intentional violation (matches plan)
Creating early doesn't block Day 1 completion
Shows forward-thinking preparation (positive)

Summary

Documentation Gaps Found: 5

Gap	Type	Priority	Impact
1. Unclear completion criteria	Missing	High	Blocker
2. "Information Needed" misleading heading	Unclear	High	Blocker
3. No checkboxes for claim creation	Buried	High	Blocker
4. Example not integrated into workflow	Buried	Medium	Confusion
5. No step numbers within days	Missing	Medium	Slowdown

Team Errors: 0

Team followed documentation structure exactly. They completed every checkbox item presented. The issue is documentation structure, not team execution.

Root Cause Pattern

The Problem: Day 1 section structured as "reference information" not "execution workflow"

Evidence:

Heading: "Information Needed" (implies prereqs, not work)
Checkboxes: Only for source docs (fetching), not claims (creating)
Example: Positioned as "read first" not "now do this"
Verification: Buried at bottom, not emphasized as completion gate

Fix: Restructure Day 1 as execution checklist with clear deliverable (25-30 claims) and success criteria (verification command).

Impact Assessment

Current State:

Team believes Day 1 is complete (90% understanding)
Actually: 0% complete (0/25-30 claims created)
Estimated time to complete: 3-4 hours (claim creation)

If Documentation Had Been Clear:

Team would have created 25-30 claims
Day 1 would be complete
Ready to proceed to Day 2 (implementation)

Documentation Success Rate:

Source fetching: 100% (3/3 completed correctly)
Claim creation: 0% (0/25-30 completed)
Overall Day 1: 10% complete

Cold-Start Success Estimate Revision:

Original estimate: 85-90%
Actual observed: ~10% (stopped after source docs)
Gap: Documentation structure implies wrong completion criteria

Next Steps

Proceed to Phase 4: Report with actionable recommendations for each gap.

10 KiB Raw Blame History

Gap Analysis

Gap 1: Unclear Day 1 Completion Criteria

Gap 2: "Information Needed" Implies Preparation, Not Execution

Gap 3: Source Document Fetching Gets Checkboxes, Claim Creation Doesn't

Gap 4: Claim Extraction Example Not Integrated into Workflow

Gap 5: No "You Are Here" Progress Indicator

Non-Gaps (Correct Team Actions)

Action 1: Created .aphoria/config.toml Early

Action 2: Created src/lib.rs Placeholder

Summary

Documentation Gaps Found: 5

Team Errors: 0

Root Cause Pattern

Impact Assessment

Next Steps

10 KiB

Raw Blame History

Action 1: Created `.aphoria/config.toml` Early

Action 2: Created `src/lib.rs` Placeholder