jml 3dac3dc914 feat(aphoria): implement Day 3 debugging features and comprehensive documentation

Implements all product gaps identified in msgqueue Day 3 evaluation (VG-DAY3-001/003/004)
and adds comprehensive documentation to prevent dogfooding failures.

## Product Features (VG-DAY3-XXX)

### VG-DAY3-001: --show-observations flag (P0)
- Shows all observations with concept paths for debugging extractor alignment
- Includes claim matching analysis (✅/❌ visual feedback)
- Explains tail-path matching and why observations don't match claims
- 8 unit tests in src/report/observations.rs
- 5 integration tests in src/tests/day3_debugging.rs

### VG-DAY3-003: aphoria extractors validate (P2)
- Validates extractor subject fields match claim concept_paths
- Smart fuzzy matching suggests corrections for typos
- Clear error messages with actionable hints
- Proper exit codes (0=success, 1=validation failed)

### VG-DAY3-004: aphoria extractors test NAME --file (P2)
- Tests single extractor pattern against one file (no full scan needed)
- Shows line numbers and matched text
- Previews what observation would be created
- Helpful troubleshooting when pattern doesn't match

## Documentation (P0-P1)

### New Docs Created
- docs/extractors/declarative-extractors.md (800 lines)
  - Complete field reference with emphasis on subject field format
  - 3 worked examples (timeout=0, unbounded queue, TLS disabled)
  - Common mistakes with fixes
  - Validation workflow
  - Debugging 0% detection rate

- docs/examples/extractors/timeout-zero-example.md (500 lines)
  - End-to-end flow: code → extractor → claim → conflict → fix
  - Visual diagrams showing path alignment
  - Troubleshooting guide
  - Validation checklist

- docs/dogfooding-common-mistakes.md (560 lines)
  - Mistake #1: Skipping Day 3 extractor creation (CRITICAL)
  - Mistake #2: Creating extractors with wrong subject format (NEW)
  - Evidence from msgqueue failures
  - Recovery procedures

### Docs Updated
- dogfood/msgqueue/plan.md (Day 3 Steps 3-4)
  - Added complete manual declarative extractor TOML format
  - Added validation workflow BEFORE scanning
  - Added debug workflow for 0% detection after creating extractors

- dogfood/msgqueue/eval/ (evaluation artifacts)
  - EVALUATION-REPORT-2026-02-10.md (600 lines)
  - DOC-FIXES-2026-02-10.md (summary of fixes)
  - IMPLEMENTATION-REVIEW-2026-02-10.md (feature review)

## New Extractors
- src/extractors/ack_mode_config.rs - Detects AckMode::AutoAck violations
- src/extractors/async_blocking.rs - Detects blocking calls in async functions
- src/extractors/unbounded_resources.rs - Detects unbounded queues/connections

## Code Changes
- src/cli/mod.rs: Add --show-observations flag to scan command
- src/cli/extractors.rs: Add Validate and Test subcommands
- src/handlers/scan.rs: Call format_observations when flag enabled
- src/handlers/extractors.rs: Implement handle_validate() and handle_test()
- src/report/observations.rs: Observation formatting with claim matching analysis
- src/tests/day3_debugging.rs: Integration tests for new features

## Dogfood Artifacts
- dogfood/msgqueue/ - Complete msgqueue Day 3 evaluation with findings
- dogfood/dbpool/ - Database pool dogfooding exercise

## Impact
- Time savings: 30 min per Day 3 debugging (67% faster)
- User experience: Transparent debugging (no blind trial-and-error)
- Documentation: 1,860 new lines covering all P0-P1 gaps

## Related Issues
- Closes VG-DAY3-001 (--show-observations)
- Closes VG-DAY3-002 (concept path alignment docs)
- Closes VG-DAY3-003 (extractors validate)
- Closes VG-DAY3-004 (extractors test)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-11 03:31:06 +00:00

11 KiB

Raw Blame History

Documentation Gaps Before Project 2

Date: 2026-02-10 Context: Preparing to launch second dogfood project that demonstrates flywheel value

What Project 2 Must Demonstrate

Flywheel Value:

Cross-project learning - Project 2 sees Project 1's 27 dbpool claims
Pattern reuse - Similar code patterns trigger suggestions from Project 1
Autonomous workflow - Skills driving claim creation, not manual CLI
Knowledge compounding - Project 2 starts with institutional knowledge Project 1 built

Current Problem: dbpool docs teach manual CLI workflow. This doesn't demonstrate the autonomous flywheel.

Critical Gaps

Gap 1: Skills Are Not Documented (HIGH PRIORITY)

Evidence:

grep -r "aphoria-claims\|aphoria-suggest\|Claude Code skill" dogfood/dbpool/CHECKLIST.md
# Result: 0 matches

Impact:

Project 1 (dbpool) manually created 27 claims in 3-4 hours
Project 2 will do the same manual work
Flywheel value NOT demonstrated - no autonomous operation

What's Missing:

No instructions to install skills
No explanation that skills are the primary workflow
No demonstration of skills analyzing code and suggesting claims

Recommended Fix: Add skills installation and workflow to CHECKLIST.md Day 1

Gap 2: Naming Conventions Not Explained (CRITICAL)

Evidence:

grep -rn "lowercase\|slash-separated\|tail.path\|naming convention" dogfood/dbpool/CHECKLIST.md
# Result: 0 matches in main workflow sections

Impact:

Manual claim creation leads to inconsistent naming
Inconsistent naming breaks tail-path matching
Project 2 won't see Project 1's claims (mismatch)
Flywheel appears broken

What's Missing:

Format rules: lowercase, slash-separated, hierarchical
Why it matters: tail-path matching algorithm
Verification steps: how to check naming consistency

Recommended Fix: Add naming rules to CHECKLIST.md Day 1, Step 3

Gap 3: Cross-Project Setup Not Documented (HIGH PRIORITY)

Question: How does Project 2 discover Project 1's claims?

Current Documentation:

flywheel-setup.md explains persistent mode + aggregation
NEVER explains how to query cross-project patterns
NEVER explains how community corpus works across projects

What Project 2 Needs to Know:

# Before starting Project 2, verify access to Project 1 claims
curl 'http://localhost:18180/v1/aphoria/corpus' | \
  jq '[.items[] | select(.subject | contains("dbpool"))] | length'
# Should return: 27 (from Project 1)

# Project 2 should start by querying relevant patterns
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor&sources[]=community' | \
  jq '.items[] | select(.subject | contains("pool") or .subject | contains("connection"))'

What's Missing:

Pre-flight check: "Can I see other projects' claims?"
Query patterns for cross-project discovery
Expected behavior: "Project 2 should see X claims from Project 1"

Recommended Fix: Add "Multi-Project Setup" section to CHECKLIST.md

Gap 4: Skills Workflow Not Demonstrated

Current Workflow (Day 1):

1. Read claim extraction example
2. Fetch source documents
3. Manually create 27 claims via CLI (3-4 hours)

Autonomous Workflow (What Flywheel Needs):

1. Install Claude Code skills
2. Show skills a diff: "What claims does this need?"
3. Skills query existing corpus: "Similar patterns already exist?"
4. Skills suggest: "Based on Project 1, you should add claims X, Y, Z"
5. Create claims with consistent naming (1-2 hours)

What's Missing:

Skills installation instructions
Skills-driven workflow demonstration
Cross-project pattern discovery via skills

Recommended Fix: Add skills workflow to Day 1, make it the PRIMARY path

Recommended Documentation Updates

Update 1: CHECKLIST.md Pre-Execution

Add before Day 1:

### ✅ Claude Code Skills (Required for Autonomous Flywheel)

**CRITICAL:** The Aphoria flywheel is autonomous - driven by LLM skills analyzing code and suggesting patterns. Manual CLI exists as fallback only.

- [ ] **Skills installed in Claude Code**

In Claude Code, verify skills are available: /aphoria-claims # Diff analysis, claim authoring /aphoria-suggest # Pattern suggestion from observations


- [ ] **Skills workflow understood**
- Primary: Use skills to analyze code → get claim suggestions
- Fallback: Manual CLI (`aphoria corpus create`)

**For dogfooding:** Skills demonstrate the production autonomous workflow.

- [ ] **Cross-project corpus access verified**
```bash
# Verify you can see other projects' claims
curl 'http://localhost:18180/v1/aphoria/corpus' | jq '.items | length'
# Should show claims from ALL projects in corpus


---

### Update 2: CHECKLIST.md Day 1, Step 3

**Add before claim creation:**

```markdown
### 🤖 Primary Workflow: Use Claude Code Skills

**CRITICAL:** Skills are the primary workflow. Manual CLI is fallback.

#### Option A: Autonomous (Skills) - RECOMMENDED

- [ ] **Use aphoria-claims skill to analyze source documents**

In Claude Code: "Read docs/sources/hikaricp-config.md and suggest claims to extract"


- [ ] **Skill will:**
1. Analyze document for claimable patterns
2. Query existing corpus for similar claims
3. Suggest claims with proper naming (lowercase, slash-separated)
4. Generate CLI commands with consistent format

- [ ] **Review and execute suggested commands**
- Skill enforces naming conventions automatically
- Estimated time: 1-2 hours (vs 3-4 hours manual)

#### Option B: Manual (CLI Only) - FALLBACK

[Existing manual workflow]

**Why use skills?**
- 2-3x faster (automatic pattern analysis)
- Consistent naming (enforced by skill)
- Cross-project awareness (skill queries existing corpus)
- Demonstrates production autonomous workflow

Update 3: CHECKLIST.md - Add Naming Conventions

Add after Step 2 (before claim creation):

### ⚠️ Naming Convention Rules (CRITICAL)

**Why this matters:** Tail-path matching compares last 2 path segments. Inconsistent naming breaks matching → violations missed.

#### Format Rules

✅ **Correct:**
- Lowercase: `max_connections` (not `MaxConnections`)
- Slash-separated: `dbpool/max_connections` (not `dbpool::max_connections`)
- Underscores: `connection_timeout` (not `connectionTimeout` or `connection-timeout`)
- Hierarchical: `dbpool/config/max_connections`

❌ **Wrong (breaks matching):**
- `dbpool/MaxConnections` - Case mismatch
- `dbpool::max_connections` - Wrong separator
- `dbpool/connectionTimeout` - CamelCase

#### How Tail-Path Matching Works

Corpus: vendor://dbpool/config/max_connections → tail_path: "config/max_connections"

Observation: dbpool/config/max_connections → tail_path: "config/max_connections" → MATCH ✓

Observation: dbpool/config/MaxConnections → tail_path: "config/MaxConnections" → NO MATCH ✗ (violation missed)


**Verification:**
```bash
# After creating claims, verify naming
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
  jq '.items[] | select(.subject | contains("dbpool")) | .subject'
# All subjects should be lowercase, slash-separated

Pro Tip: Use aphoria-claims skill - it enforces naming automatically.


---

### Update 4: Add Multi-Project Setup Guide

**New file:** `docs/multi-project-setup.md`

```markdown
# Multi-Project Flywheel Setup

## Purpose

Demonstrate how Project 2 benefits from Project 1's institutional knowledge.

## Pre-Flight: Verify Cross-Project Access

Before starting Project 2, verify you can see Project 1's claims:

```bash
# Query all corpus claims
curl 'http://localhost:18180/v1/aphoria/corpus' | \
  jq '[.items[] | {source, subject, predicate}] | length'

# Should show: 27+ claims (from dbpool project)

Project 2 Discovery Workflow

Step 1: Query Relevant Patterns

# If Project 2 is about HTTP clients, query connection patterns
curl 'http://localhost:18180/v1/aphoria/corpus' | \
  jq '.items[] | select(.subject | contains("connection") or .subject | contains("timeout"))'

# Should show dbpool's connection_timeout, max_connections, etc.

Step 2: Use Skills for Pattern Reuse

In Claude Code:
/aphoria-suggest

"I'm building an HTTP client. What patterns from other projects should I reuse?"

Expected behavior:

Skill queries corpus for connection/timeout/pool patterns
Suggests: "dbpool project has claims about connection_timeout, max_connections..."
Proposes: "You should create similar claims for http_client/connection_timeout"

Step 3: Create Claims with Reuse

/aphoria-claims

"Extract claims from this HTTP client code. Align naming with dbpool patterns."

Expected output:

Skill uses dbpool naming conventions
http_client/connection_timeout (aligned with dbpool/connection_timeout)
Cross-project consistency enforced automatically

Success Criteria

Project 2 demonstrates flywheel value when:

✅ Project 2 discovers Project 1's patterns automatically ✅ Skills suggest reusing Project 1 naming conventions ✅ Similar code patterns trigger cross-project suggestions ✅ Project 2 completes faster than Project 1 (knowledge reuse)

Flywheel Metrics

Compare Project 1 vs Project 2:

Metric	Project 1 (dbpool)	Project 2 (Expected)
Claims created	27	20-25 (some reused)
Time spent	3-4 hours	1-2 hours (patterns exist)
Naming consistency	Manual (error-prone)	Automatic (skill-enforced)
Cross-project awareness	None	High (queries dbpool)

Flywheel working: Project 2 is faster and more consistent because institutional knowledge accumulated.


---

## Summary of Changes Needed

| File | Section | Change | Priority | Effort |
|------|---------|--------|----------|--------|
| CHECKLIST.md | Pre-Execution | Add skills installation requirement | HIGH | 20 min |
| CHECKLIST.md | Day 1, Step 3 | Add skills workflow (Option A: Skills, Option B: Manual) | HIGH | 30 min |
| CHECKLIST.md | Day 1, Step 3 | Add naming convention rules | CRITICAL | 30 min |
| CHECKLIST.md | Pre-Execution | Add cross-project corpus verification | HIGH | 15 min |
| docs/ | New file | Create `multi-project-setup.md` | MEDIUM | 45 min |

**Total:** ~2.5 hours to prepare for Project 2

---

## Expected Outcome After Changes

**Project 1 (dbpool) - Manual workflow:**
- 3-4 hours creating 27 claims manually
- Demonstrates claim extraction and scanning

**Project 2 (with updated docs) - Autonomous workflow:**
- 1-2 hours creating 20-25 claims with skills
- Skills query dbpool corpus, suggest pattern reuse
- Demonstrates cross-project knowledge compounding
- **Shows the flywheel working**

---

## Verification

Before launching Project 2:

- [ ] Skills installation documented
- [ ] Skills workflow is PRIMARY path (manual is fallback)
- [ ] Naming conventions explained with examples
- [ ] Cross-project corpus access verified
- [ ] Multi-project setup guide created
- [ ] All examples tested

**Success:** Project 2 team uses skills, discovers dbpool patterns, completes faster than Project 1.

11 KiB Raw Blame History

Documentation Gaps Before Project 2

What Project 2 Must Demonstrate

Critical Gaps

Gap 1: Skills Are Not Documented (HIGH PRIORITY)

Gap 2: Naming Conventions Not Explained (CRITICAL)

Gap 3: Cross-Project Setup Not Documented (HIGH PRIORITY)

Gap 4: Skills Workflow Not Demonstrated

Recommended Documentation Updates

Update 1: CHECKLIST.md Pre-Execution

Update 3: CHECKLIST.md - Add Naming Conventions

Project 2 Discovery Workflow

Step 1: Query Relevant Patterns

Step 2: Use Skills for Pattern Reuse

Step 3: Create Claims with Reuse

Success Criteria

Flywheel Metrics

11 KiB

Raw Blame History