Gap Analysis - Run 2

Timestamp: 2026-02-09T23:20:00Z


Executive Summary

Root Cause: Documentation presents Days 1-5 as parallel information sections, not sequential prerequisites.

Evidence:

  • Team skipped Day 1 entirely (0/27 claims created)
  • Team executed Day 2 perfectly (7/7 files, 100% adherence)
  • No documentation indicates Day 1 BLOCKS Day 2
  • Day 2 section doesn't reference Day 1 completion

Impact: CRITICAL - Dogfood demonstration premise broken (cannot scan without claims)

Gap Count: 5 documentation gaps identified


Gap 1: No Prerequisite Relationship Documented

Type: Missing Information

Evidence:

  • Team understood Day 1 requirements (progress log):

    "📍 Current Status: Day 1, Step 3 (Claim Creation)"

  • Team proceeded to Day 2 anyway:

    • Created all 7 files from Day 2 checklist
    • Implemented all 7 violations
    • Never created claims from Day 1
  • Doc doesn't say Day 1 blocks Day 2:

    CHECKLIST.md:103:

    ## Day 1: Create 25-30 Corpus Claims
    

    CHECKLIST.md:276:

    ## Day 2: Implementation - Information Needed
    

    No text between these sections says "Complete Day 1 before proceeding to Day 2"

  • Doc presents days as parallel info:

    • plan.md shows days with equal status (🔄/⏳)
    • README.md shows table with all days visible simultaneously
    • CHECKLIST.md uses same heading level for all days (##)

Root Cause:

Documentation structure implies Days 1-5 are sections of a reference document, not sequential steps in a workflow.

Impact:

  • Blocker: Team completed Day 2 but cannot proceed to Day 3 (scan requires claims)
  • Time lost: Day 2 took an estimated 4-5 hours to implement; the team must now backfill Day 1 (4-6 hours) before that work can be scanned
  • Confusion: High - team will discover scan returns 0 violations and have to diagnose why
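
The diagnosis described in this impact can be made mechanical. The following is a minimal sketch, assuming the claim count comes from the Day 1 verification query and the violation count from the Day 3 scan. The function name `explain_zero_violations` and its messages are hypothetical and are not part of aphoria.

```bash
#!/bin/bash
# explain_zero_violations CLAIM_COUNT VIOLATION_COUNT
# Hypothetical helper: separates "nothing to compare against" from
# "claims exist but nothing was flagged".
explain_zero_violations() {
    local claims=$1 violations=$2
    if [ "$violations" -gt 0 ]; then
        echo "scan found $violations violation(s); nothing to diagnose"
    elif [ "$claims" -eq 0 ]; then
        echo "0 violations because the corpus has 0 claims: Day 1 claims are missing"
    else
        echo "0 violations with $claims claims present: the cause is not a skipped Day 1"
    fi
}
```

For this run the call would be `explain_zero_violations 0 0`, which points straight at the missing Day 1 claims instead of leaving the team to investigate the scan.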

Recommendation:

Where: CHECKLIST.md between the Day 1 and Day 2 sections (immediately before the Day 2 heading at line 276)

What to add:

---

✅ **Day 1 Complete** when verification shows 25-30 claims in corpus

**CHECKPOINT: DO NOT PROCEED TO DAY 2 WITHOUT COMPLETING DAY 1**

Day 2 implementation requires corpus claims to exist for Day 3 scanning.
Without claims, scan will return 0 violations and the dogfood demo cannot proceed.

**Verify before continuing:**
```bash
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
  jq '.items | map(select(.subject | startswith("dbpool"))) | length'
# Must show: 25-30
```

If verification fails, complete Day 1 before proceeding.

---

## Day 2: Implementation - Information Needed

Priority: HIGH (Blocker)


Gap 2: Day 2 Heading Implies "Information" Not "Prerequisites"

Type: Unclear Instructions

Evidence:

  • Team thought: Day 2 heading is "Implementation - Information Needed"

  • Team interpreted: "Information Needed" = reference material to read

  • Team did: Implemented Day 2 without checking Day 1 completion

  • Doc said (CHECKLIST.md:276):

    ## Day 2: Implementation - Information Needed
    
  • Comparison with Day 1 heading (CHECKLIST.md:103):

    ## Day 1: Create 25-30 Corpus Claims
    

Root Cause:

Day 1 heading says "Create" (action verb), Day 2 says "Information Needed" (passive/reference tone).

Inconsistent heading style suggests Day 2 is reference material, not a sequential action.

Impact:

  • Confusion: Medium - heading tone mismatch suggests different purposes
  • Time lost: N/A (team proceeded anyway)
  • Blocker: No (but contributes to Gap 1)

Recommendation:

Where: CHECKLIST.md:276

What to change:

-## Day 2: Implementation - Information Needed
+## Day 2: Implement Code with Intentional Violations
+
+**Prerequisites:** Day 1 complete (25-30 claims in corpus)
+
+**Deliverable:** Working Rust library with 7 intentional violations
+
+**Success Criteria:**
+```bash
+cargo test
+# Expected: All tests pass (violations are semantic, not syntax errors)
+```
+
+**Estimated Time:** 4-5 hours

Priority: MEDIUM


Gap 3: No Automated Verification Between Days

Type: Missing Information

Evidence:

  • Team skipped Day 1: No manual check prevented this

  • No validator exists: scripts/validate-setup.sh checks environment, not day completion

  • Doc doesn't mention verification:

    • Day 1 has success criteria (CHECKLIST.md:105-110)
    • But no instruction to RUN it before Day 2
    • Day 2 doesn't reference Day 1 verification
  • Doc said (CHECKLIST.md:105-110):

    **Success Criteria:**
    ```bash
    curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
      jq '.items | map(select(.subject | startswith("dbpool"))) | length'
    # Expected output: 25-30
    ```
    
  • Team did:

    • Did not run verification command
    • Did not check if claims exist before Day 2

Root Cause:

Success criteria shown as "expected output" documentation, not as "you must run this" checkpoint.

Impact:

  • Blocker: Yes - team proceeded to Day 2 without Day 1 complete
  • Time lost: Will discover on Day 3 when scan returns 0 violations
  • Confusion: High - requires diagnosis to determine Day 1 was skipped

Recommendation:

Where: Create new script scripts/verify-day1.sh

What to add:

#!/bin/bash
# Verify Day 1 completion before proceeding to Day 2

set -e

echo "=== Day 1 Verification ==="
echo

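# Count vendor claims whose subject starts with "dbpool".
# Default to 0 if the server is unreachable or returns invalid JSON,
# so the failure message below still prints under set -e.
CLAIMS_COUNT=$(curl -s 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' 2>/dev/null | \
  jq '.items | map(select(.subject | startswith("dbpool"))) | length' 2>/dev/null || true)
CLAIMS_COUNT=${CLAIMS_COUNT:-0}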

if [ "$CLAIMS_COUNT" -ge 25 ] && [ "$CLAIMS_COUNT" -le 30 ]; then
    echo "✓ Day 1 complete: $CLAIMS_COUNT claims in corpus"
    exit 0
else
    echo "✗ Day 1 incomplete: $CLAIMS_COUNT claims (expected 25-30)"
    echo
    echo "Please complete Day 1 before proceeding to Day 2:"
    echo "  1. Read: cat docs/claim-extraction-example.md"
    echo "  2. Create: Follow CHECKLIST.md Day 1, Step 3 (27 checkbox items)"
    echo "  3. Verify: Run this script again"
    exit 1
fi

Also add to CHECKLIST.md Day 2 start:

## Day 2: Implement Code with Intentional Violations

**Prerequisites:** Day 1 complete

- [ ] **Verify Day 1 completion**
  ```bash
  ./scripts/verify-day1.sh
  ```
  **Must pass before proceeding**

Priority: HIGH (Prevents sequence violation)


Gap 4: plan.md Shows Days with Equal Status (Visual Parity)

Type: Buried Information

Evidence:

  • Team saw (plan.md:88-96):

    | Phase | Status | Completed | Notes |
    |-------|--------|-----------|-------|
    | Day 1: Preparation | 🔄 IN PROGRESS | 2026-02-09 | Corpus building |
    | Day 2: Implementation | ⏳ PENDING | - | - |
    | Day 3: First Scan | ⏳ PENDING | - | - |
    | Day 4: Remediation | ⏳ PENDING | - | - |
    | Day 5: Documentation | ⏳ PENDING | - | - |
    
  • Team interpreted: All days shown equally, can work on any

  • Visual issue: Status emojis (🔄/⏳) don't indicate blocking relationship

Root Cause:

Status table shows "what to do" but not "what blocks what". All days have equal visual weight.

Impact:

  • Confusion: Low (table is for tracking, not instructions)
  • Time lost: N/A (team didn't use this for sequencing)
  • Blocker: No (but contributes to Gap 1)

Recommendation:

Where: plan.md:88-96

What to change:

| Phase | Status | Prerequisites | Completed | Notes |
|-------|--------|---------------|-----------|-------|
| Day 1: Preparation | 🔄 IN PROGRESS | None | 2026-02-09 | Corpus building |
| Day 2: Implementation | ⏳ PENDING | Day 1 ✓ | - | Requires claims in corpus |
| Day 3: First Scan | ⏳ PENDING | Day 2 ✓ | - | Requires code with violations |
| Day 4: Remediation | ⏳ PENDING | Day 3 ✓ | - | Requires scan results |
| Day 5: Documentation | ⏳ PENDING | Day 4 ✓ | - | Requires fixed code |

Priority: LOW (Table is for tracking, not primary instructions)
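
Once the days are treated as sequential, the status table encodes a simple rule: the first phase that is not complete is the only one that may be worked on. The following is a minimal sketch of that rule. It assumes finished phases carry a status containing `COMPLETE`; the table only shows IN PROGRESS and PENDING, so that marker is an assumption. The helper name `current_phase` is hypothetical.

```bash
#!/bin/bash
# current_phase
# Reads a plan.md-style status table on stdin and prints the first phase
# whose Status cell (second column) does not contain COMPLETE.
# ASSUMPTION: completed phases are marked with a status containing "COMPLETE".
current_phase() {
    awk -F'|' '
        /^[|] *Day [0-9]/ {
            if ($3 !~ /COMPLETE/) {
                phase = $2
                gsub(/^ +| +$/, "", phase)   # trim the cell padding
                print phase
                exit
            }
        }
    '
}
```

Running `current_phase < plan.md` against the table in this gap would print `Day 1: Preparation`, which makes it explicit that Day 2 was not yet open for work.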


Gap 5: README Day-by-Day Overview Shows All Days Equally

Type: Unclear Instructions

Evidence:

  • Team saw (README.md:70-78):

    | Day | Focus | Key Deliverable | Time |
    |-----|-------|-----------------|------|
    | **Day 1** | Corpus Building | 25-30 claims created via CLI | 4-6 hours |
    | **Day 2** | Implementation | Working code with 7-8 intentional violations | 4-5 hours |
    | **Day 3** | Scanning | Initial scan showing all violations | 2-3 hours |
    | **Day 4** | Remediation | Progressive fixes with re-scans | 4-5 hours |
    | **Day 5** | Documentation | Success story, demo materials | 3-4 hours |
    
  • Visual problem: All rows have equal weight, no arrows/dependencies shown

  • Team interpreted: Days are sections to complete, not sequential steps

Root Cause:

Table shows "what" but not "when" or "depends on what". All days visually parallel.

Impact:

  • Confusion: Medium - first thing team sees when opening README
  • Time lost: N/A (team proceeded to CHECKLIST anyway)
  • Blocker: No (but contributes to overall sequence confusion)

Recommendation:

Where: README.md:70-78

What to change:

| Day | Focus | Key Deliverable | Prerequisites | Time |
|-----|-------|-----------------|---------------|------|
| **Day 1** | Corpus Building | 25-30 claims created via CLI | *(start here)* | 4-6 hours |
| **Day 2** | Implementation | Working code with 7-8 intentional violations | Day 1 ✓ | 4-5 hours |
| **Day 3** | Scanning | Initial scan showing all violations | Day 2 ✓ | 2-3 hours |
| **Day 4** | Remediation | Progressive fixes with re-scans | Day 3 ✓ | 4-5 hours |
| **Day 5** | Documentation | Success story, demo materials | Day 4 ✓ | 3-4 hours |

**IMPORTANT:** Days must be completed sequentially. Each day requires the previous day's deliverable.

Priority: MEDIUM (Improves first impression, prevents confusion)


Non-Gaps (Team Did Right)

Not a Gap 1: Day 2 Implementation Quality

What team did:

  • Created all 7 files exactly as specified
  • Implemented all 7 violations correctly
  • Added comprehensive tests (21/21 passing)
  • Documented violations inline with clear explanations

Doc was clear (CHECKLIST.md:276-357):

  • File structure fully specified
  • Violations listed with examples
  • Dependencies shown in Cargo.toml
  • Tests described

Evaluation: NOT A GAP - Team followed Day 2 instructions perfectly


Not a Gap 2: Code Quality

What team did:

  • Clean architecture (lib.rs, config.rs, pool.rs, connection.rs, error.rs)
  • Proper async/await usage
  • Good error handling with thiserror
  • Comprehensive test coverage

Evaluation: NOT A GAP - Team has strong Rust skills, executed well


Not a Gap 3: Violation Documentation

What team did:

  • Every violation labeled with VIOLATION N
  • Clear explanation of what claim is violated
  • Consequence described ("If X, then Y breaks")
  • Example:
    /// **VIOLATION 1**: Set to `None` (unbounded growth)
    /// - Violates: `dbpool/max_connections` required claim
    /// - Consequence: Pool grows without limit, exhausts database connections
    

Evaluation: NOT A GAP - Team understood violation requirements perfectly


Summary of Gaps

| Gap | Type | Priority | Impact |
|-----|------|----------|--------|
| Gap 1: No prerequisite relationship | Missing Information | HIGH | BLOCKER - Team skipped Day 1 |
| Gap 2: Day 2 heading tone | Unclear Instructions | MEDIUM | Contributed to confusion |
| Gap 3: No automated verification | Missing Information | HIGH | Prevents sequence violation |
| Gap 4: plan.md status table | Buried Information | LOW | Visual parity issue |
| Gap 5: README day overview | Unclear Instructions | MEDIUM | First impression confusion |

Total Gaps: 5

  • Critical (High Priority): 2
  • Medium Priority: 2
  • Low Priority: 1


Root Cause Chain

Documentation presents days as parallel sections
              ↓
Team interprets: "Day 1 = reference, Day 2 = work"
              ↓
Team executes Day 2 first (perfect implementation)
              ↓
Day 1 skipped (0/27 claims created)
              ↓
Day 3 scan will return 0 violations (BLOCKER)
              ↓
Team must backfill Day 1 (4-6 hours lost)

Primary failure point: No explicit "Day 1 BLOCKS Day 2" statement in documentation

Contributing factors:

  • Visual parity (all days shown equally in tables)
  • Inconsistent heading tone ("Create" vs "Information Needed")
  • No automated verification checkpoints
  • No dependency relationships documented

Recommendations Summary

Immediate (Before Next Team)

  1. Add checkpoint text between Day 1 and Day 2 (Gap 1)

    • Location: CHECKLIST.md, immediately before the Day 2 heading (line 276)
    • Content: "DO NOT PROCEED WITHOUT DAY 1 COMPLETE"
    • Priority: HIGH
  2. Create verify-day1.sh script (Gap 3)

    • Location: scripts/verify-day1.sh
    • Content: Check claims count 25-30
    • Priority: HIGH
  3. Update Day 2 heading (Gap 2)

    • Location: CHECKLIST.md:276
    • Content: Add prerequisites, deliverable, success criteria
    • Priority: MEDIUM

Short Term (This Week)

  1. Add prerequisites column to README table (Gap 5)

    • Location: README.md:70-78
    • Content: Show Day 1 ✓, Day 2 ✓, etc.
    • Priority: MEDIUM
  2. Add prerequisites column to plan.md table (Gap 4)

    • Location: plan.md:88-96
    • Content: Show blocking relationships
    • Priority: LOW

Long Term (Next Month)

  1. Create automated day sequencer
    • New script: scripts/check-day-sequence.sh
    • Checks: Day N complete before Day N+1 starts
    • Integration: Add to pre-flight validator
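
The sequencer could take roughly the following shape. This is a sketch under stated assumptions, not a finished script: it assumes each day N has a verifier at `$VERIFY_DIR/verify-dayN.sh` that exits 0 when that day is complete. Only `verify-day1.sh` is proposed in this analysis; the other verifiers, the `VERIFY_DIR` variable, and the function name `check_day_sequence` are hypothetical.

```bash
#!/bin/bash
# Sketch of the proposed scripts/check-day-sequence.sh.
# ASSUMPTION: one verifier per day, named verify-dayN.sh, exiting 0 when
# Day N is complete. Only verify-day1.sh is proposed so far.
VERIFY_DIR="${VERIFY_DIR:-./scripts}"

# check_day_sequence TARGET_DAY
# Runs the verifiers for Days 1..TARGET_DAY-1 in order and stops at the
# first one that is missing or fails.
check_day_sequence() {
    local target=$1
    local day=1
    local verifier
    while [ "$day" -lt "$target" ]; do
        verifier="$VERIFY_DIR/verify-day$day.sh"
        if [ ! -x "$verifier" ]; then
            echo "BLOCKED: missing verifier for Day $day ($verifier)"
            return 1
        fi
        if ! "$verifier" >/dev/null 2>&1; then
            echo "BLOCKED: Day $day is incomplete; finish it before starting Day $target"
            return 1
        fi
        day=$((day + 1))
    done
    echo "OK: all prerequisites for Day $target are verified"
}
```

Calling `check_day_sequence 3` before the Day 3 scan would have stopped this run at Day 1, since the Day 1 verifier fails with 0 claims in the corpus. Treating a missing verifier as a failure is a deliberate choice: a day without a check cannot be assumed complete.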

Lessons Learned

Documentation Principle Violated

Violated: "Explicit > Implicit"

What we did:

  • Implicitly suggested sequence through day numbers (1, 2, 3)
  • Implicitly suggested prerequisites through "you'll need claims for scanning"

What we should have done:

  • Explicitly state "Complete Day 1 before Day 2"
  • Explicitly check prerequisite completion
  • Explicitly block progression without verification
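
One way to keep "Explicit > Implicit" from regressing is to lint the checklist itself. The following is a minimal sketch, assuming the `## Day N` heading style used in CHECKLIST.md and the `**Prerequisites:**` line proposed in Gap 2. The helper name `lint_day_prerequisites` is hypothetical, and the pattern only covers Days 2 through 9.

```bash
#!/bin/bash
# lint_day_prerequisites FILE
# Prints every "## Day N" heading (N = 2..9) whose section has no line
# starting with "**Prerequisites:**". Returns 1 if any heading is printed.
lint_day_prerequisites() {
    awk '
        function report() {
            if (heading != "" && !has_prereq) { print heading; missing = 1 }
        }
        /^## / {
            report()                       # close the previous section
            heading = ""; has_prereq = 0
            if ($0 ~ /^## Day [2-9]/) heading = $0
            next
        }
        /^\*\*Prerequisites:\*\*/ { has_prereq = 1 }
        END { report(); exit missing + 0 }
    ' "$1"
}
```

Run as `lint_day_prerequisites CHECKLIST.md`, this would have flagged the original Day 2 heading, because its section stated no prerequisite at all.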

Agent vs Human Documentation

New insight: Agent interpreters may need more explicit sequencing than humans.

Humans might intuit: "Day 1 comes before Day 2, so I should do Day 1 first"

Agents might interpret: "Both sections are present, I can execute either one"

Implication: Documentation for agent workflows needs explicit prerequisite statements, not implicit ordering.


Next Steps

  1. User needs to be informed:

    • Day 1 was skipped (0/27 claims)
    • Day 2 implementation is excellent (perfect execution)
    • Day 3 will fail (scan returns 0 violations)
    • Must backfill Day 1 before continuing
  2. Documentation fixes needed:

    • Implement Gap 1 fix (checkpoint between days)
    • Implement Gap 3 fix (verify-day1.sh script)
    • Consider the Gap 2 and Gap 5 fixes for clarity
  3. Team recovery path:

    • Run verify-day1.sh (will fail)
    • Complete Day 1 (create 25-30 claims)
    • Re-run verify-day1.sh (will pass)
    • Proceed to Day 3 (scan will now detect violations)