jml e758f2ebfb feat(aphoria): implement programmatic extractors for Option<T> semantics

Completes Task #3 of httpclient dogfooding with 100% detection rate (7/7 violations).

## New Extractors

- **OptionBoundsExtractor**: Detects Option<T> fields set to None (unbounded)
- **OptionValueExtractor**: Extracts values from Some(n) for threshold checks

Both extractors use context-aware pattern matching to understand Rust Option<T>
semantics, which declarative extractors cannot handle.

## Implementation

**Files Created**:
- applications/aphoria/src/extractors/option_bounds.rs (257 lines)
- applications/aphoria/src/extractors/option_value.rs (277 lines)
- applications/aphoria/docs/examples/extractors/programmatic-option-semantics.md

**Files Modified**:
- applications/aphoria/src/extractors/mod.rs - Added module declarations
- applications/aphoria/src/extractors/registry.rs - Registered extractors
- applications/aphoria/dogfood/httpclient/.aphoria/claims.toml - Added 4 claims
- applications/aphoria/dogfood/httpclient/TASK-1-SUMMARY.md - Task #3 completion

## Results

| Metric | Value |
|--------|-------|
| Detection Rate | 100% (7/7 violations) |
| Improvement | +29 percentage points (from 71%) |
| New Violations | 2 (max_redirects, max_retries unbounded) |
| Unit Tests | 13 (all passing) |

## Two-Claim Strategy

For each bounded Option<T> field:
1. **configured** claim - Detects None (unbounded)
2. **max_value** claim - Validates Some(n) threshold

Example:
- `max_redirects: None` → CONFLICT (not configured)
- `max_redirects: Some(20)` → CONFLICT (exceeds 10)
- `max_redirects: Some(5)` → PASS

## Enterprise Quality

✓ Proper error handling (no unwrap/expect)
✓ Comprehensive tests (6+7 unit tests)
✓ Full documentation with examples
✓ Reusable for 10+ similar patterns
✓ Screening patterns for performance

## Cachewrap Dogfood

Also includes complete cachewrap dogfood exercise:
- 10 claims for Redis cache wrapper
- Day 1-5 summaries
- Full retrospective and evaluation
- Declarative extractors for all patterns

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-11 06:43:10 +00:00

Common Dogfooding Mistakes

This document catalogs common mistakes made during Aphoria dogfooding exercises, with evidence from real failures and guidance on how to avoid them.

Mistake #1: Skipping Day 3 Extractor Creation (CRITICAL)

Severity: 🚨 CRITICAL - Breaks the entire flywheel

What People Do Wrong

# Day 3 (incorrect execution):
aphoria scan --format json > scan-results-v1.json
# Looks at results (0/8 violations detected)
# Moves on to Day 4 without creating extractors

Result:

0 extractors created (should be 8)
No .aphoria/extractors/ directory
No scan-v2.json file
No DAY3-SUMMARY.md
Detection rate: 0% (no improvement)
Flywheel completely broken

Why It's Wrong

No knowledge captured - The 8 patterns that should have been learned are lost
No corpus growth - Next msgqueue dogfood will ALSO have 0% detection
Flywheel doesn't compound - No benefit from previous work
Misses the point - Day 3 IS the autonomous learning validation
Product not validated - Can't prove Aphoria creates extractors dynamically

Evidence from msgqueue Dogfood (2026-02-10)

What was done:

✅ Day 1: 22 claims authored (50% reused)
✅ Day 2: 8 violations embedded in code
❌ Day 3: Scan ran once, showed 0/8 violations, stopped there
- No extractors created
- No gap analysis
- No re-scan
- No DAY3-SUMMARY.md
⚠️ Day 4-5: Can't proceed without working extractors

Scan results:

v1: 0/8 violations detected (0%)
v2: Not run (should have been 8/8 = 100%)
Missing: 20/22 claims had no observations
Detection rate improvement: 0 percentage points (should have been +100)

Files that should exist but don't:

$ ls .aphoria/extractors/
# No such directory

$ ls scan-v2.json
# No such file

$ ls DAY3-SUMMARY.md
# No such file

What To Do Instead

Day 3 (correct execution - 6 phases):

Phase 1: Pre-Flight Check (5 min)

/help | grep aphoria-custom-extractor-creator  # Verify skill available
grep -r "@aphoria:claim" src/ | wc -l          # Verify markers (should be 8)
cargo check                                    # Verify code compiles

Phase 2: Baseline Scan (15 min)

aphoria scan --format json > scan-v1.json
# Result: 0/8 violations detected (expected for new domain)

Phase 3: Gap Analysis (15 min)

Analyze why 0/8 detected:

No extractors exist for msgqueue patterns
Need to create 8 extractors (one per violation)

Phase 4: Extractor Creation (30 min) [CRITICAL]

/aphoria-custom-extractor-creator --violation "timeout=0" --claim msgqueue-001
/aphoria-custom-extractor-creator --violation "prefetch_count=u16::MAX" --claim msgqueue-012
/aphoria-custom-extractor-creator --violation "verify_certificates=false" --claim msgqueue-002
/aphoria-custom-extractor-creator --violation "blocking in async fn" --claim msgqueue-009
/aphoria-custom-extractor-creator --violation "max_queue_size=None" --claim msgqueue-015
/aphoria-custom-extractor-creator --violation "ack_mode=AutoAck" --claim msgqueue-013
/aphoria-custom-extractor-creator --violation "max_requeue_count=None" --claim msgqueue-018
/aphoria-custom-extractor-creator --violation "max_connections=None" --claim msgqueue-003

# Verify:
ls .aphoria/extractors/*.toml | wc -l  # Should be: 8

Phase 5: Verification Scan (15 min)

aphoria scan --format json > scan-v2.json
# Result: 8/8 violations detected (0% → 100%)

Phase 6: Documentation (15 min)

Create DAY3-SUMMARY.md with:

Detection rate v1 vs v2 (0% → 100%)
Extractors created (8 total)
Time breakdown
Learning captured

How to Verify Correct Execution

After Day 3, these MUST exist:

# 1. Extractor directory with 8 files
$ ls .aphoria/extractors/*.toml | wc -l
8

# 2. Verification scan
$ ls scan-v2.json
scan-v2.json

# 3. Daily summary
$ ls DAY3-SUMMARY.md
DAY3-SUMMARY.md

# 4. Detection rate improvement
$ jq '.summary.claims_conflict' scan-v1.json
0
$ jq '.summary.claims_conflict' scan-v2.json
8
# Improvement: +8 (0 → 8)

If ANY of these checks fail, Day 3 was not completed correctly. Redo from Phase 4.
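
The file-existence checks above can be bundled into one function that fails loudly. The following is a minimal sketch, not part of Aphoria: the function name `day3_evidence_check` is hypothetical, and it assumes the file layout described in this section (`.aphoria/extractors/*.toml`, `scan-v2.json`, `DAY3-SUMMARY.md`). It deliberately leaves out the `jq` comparison so it runs without a scan file's contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an Aphoria command): checks the Day 3 artifacts
# listed above inside a project directory.
# Usage: day3_evidence_check <project_dir> [expected_extractor_count]
day3_evidence_check() {
  local dir="${1:-.}" expected="${2:-8}" failures=0 count=0 f

  # Count extractor TOML files; an unmatched glob stays literal and is skipped.
  for f in "$dir"/.aphoria/extractors/*.toml; do
    if [ -e "$f" ]; then
      count=$((count + 1))
    fi
  done
  if [ "$count" -ne "$expected" ]; then
    echo "FAIL: found $count extractor files, expected $expected"
    failures=$((failures + 1))
  fi

  # The verification scan and the daily summary must both exist.
  for f in scan-v2.json DAY3-SUMMARY.md; do
    if [ ! -f "$dir/$f" ]; then
      echo "FAIL: missing $f"
      failures=$((failures + 1))
    fi
  done

  if [ "$failures" -eq 0 ]; then
    echo "OK: Day 3 evidence present"
  fi
  return "$failures"
}
```

Calling `day3_evidence_check .` from the dogfood project root returns a non-zero status (the number of failed checks) when anything is missing, so it can gate the start of Day 4.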

Why This Mistake Happens

Root cause: Mental model mismatch

People think:

❌ "Aphoria is a CLI tool you run manually"
❌ "Scan shows results, that's the end"
❌ "Low detection rate means Aphoria doesn't work"

Reality:

✅ "Aphoria is an autonomous learning system"
✅ "Low initial detection is EXPECTED, creation phase fixes it"
✅ "The flywheel requires LLM to create extractors dynamically"

Contributing factors:

plan.md Step 3 could be read as optional
CLI worked without errors (reinforced wrong model)
No pre-flight check to verify skill availability
Scan output doesn't suggest next action

How We're Fixing This

Documentation updates:

✅ plan.md now emphasizes Step 3 as REQUIRED
✅ SKILL.md rewritten with 6 explicit phases
✅ Pre-flight checks added to verify skill availability
✅ Success criteria now includes "8 extractors created"
✅ Evidence checklist added (ls commands to verify)

Product improvements (planned):

Scan output will suggest: "Run /aphoria-custom-extractor-creator"
New CLI command: aphoria extractors coverage (show gaps)
New CLI command: aphoria dogfood metrics (track Day 3 progress)
Pre-flight check command: aphoria dogfood preflight --day 3

Related Issues

VG-025: No default extractors ship for common patterns
VG-027: No skill availability check
VG-028: No example extractor TOML files
VG-031: No visual diff between scan-v1 and scan-v2

Mistake #2: Creating Extractors with Wrong Subject Format (CRITICAL)

Severity: 🚨 CRITICAL - Breaks extractor matching

What People Do Wrong

Create extractors that run successfully but don't match claims due to incorrect subject field:

# Claim has:
concept_path = "msgqueue/queue/max_size"

# Extractor uses (WRONG):
[extractors.declarative.claim]
subject = "queue/max_size"  # ❌ Missing "msgqueue/" prefix

Result:

✅ Extractors run (no errors)
✅ Observations created (+7 observations)
❌ 0% detection rate (observations don't match claims)
❌ Day 3 still incomplete (can't proceed without working extractors)

Why It's Wrong

The subject field MUST exactly match the claim's concept_path.

Aphoria uses tail-path matching (last 2 segments), so these two paths look as if they should align. In practice, an observation at queue/max_size did not match the claim at msgqueue/queue/max_size:

Observation tail: queue/max_size
Claim tail: queue/max_size (identical)
The observation lacks the msgqueue/ namespace prefix, and the match fails despite the identical tails

Rule: Copy claim's concept_path EXACTLY into extractor's subject.
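
To see why the rule asks for an exact copy rather than a matching tail, the sketch below compares the two paths both ways. It is plain string handling for illustration only and does not reproduce Aphoria's matcher; `path_tail` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last two '/'-separated segments of a path.
path_tail() {
  printf '%s\n' "$1" | awk -F/ '{ if (NF >= 2) print $(NF-1) "/" $NF; else print $0 }'
}

claim="msgqueue/queue/max_size"   # concept_path from claims.toml
observation="queue/max_size"      # subject used by the broken extractor

# The tails agree, which is what makes the mismatch easy to overlook.
if [ "$(path_tail "$claim")" = "$(path_tail "$observation")" ]; then
  echo "tails match"
fi

# The full strings differ, which is the condition the rule above guards against.
if [ "$claim" != "$observation" ]; then
  echo "full paths differ: fix the extractor subject"
fi
```

Both messages are printed for this pair, which mirrors the failure described above: identical tails, yet no usable match.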

Evidence from msgqueue Dogfood (2026-02-10, Second Attempt)

What was done:

✅ Day 1-2: Claims authored, code written
✅ Day 3 Step 3: Created 7 extractors (fixed from first attempt)
✅ Day 3 Step 4: Verification scan ran
❌ Result: 0% detection rate (same as before creating extractors)

Extractor subjects used (all WRONG):

subject = "queue/max_size"                    # ❌ Should be: msgqueue/queue/max_size
subject = "consumer/prefetch_count"           # ❌ Should be: msgqueue/consumer/prefetch_count
subject = "tls/certificate_validation"        # ❌ Should be: msgqueue/tls/certificate_validation
subject = "async/runtime"                     # ❌ Should be: msgqueue/async/runtime
subject = "consumer/ack_mode"                 # ❌ Should be: msgqueue/consumer/ack_mode
subject = "consumer/requeue_limit"            # ❌ Should be: msgqueue/consumer/requeue_limit
subject = "connection/max_connections"        # ❌ Should be: msgqueue/connection/max_connections

Pattern: All missing msgqueue/ prefix.

Scan results:

scan-v1.json: 0/8 violations (0%) - before extractors
scan-v2.json: 0/8 violations (0%) - after extractors (7 observations, no conflicts)
Improvement: 0% (no change despite creating extractors)

Files that exist but don't work:

$ ls .aphoria/extractors/
# (No directory - wrong location, should be in config.toml)

$ grep "subject =" .aphoria/config.toml
subject = "queue/max_size"
subject = "consumer/prefetch_count"
# ... (all missing prefix)

$ grep "concept_path =" .aphoria/claims.toml
concept_path = "msgqueue/queue/max_size"
concept_path = "msgqueue/consumer/prefetch_count"
# ... (all have msgqueue/ prefix)

# Mismatch!

What To Do Instead

Step 1: Copy concept_path from claim EXACTLY

# Find your claim's concept_path:
grep "id = \"msgqueue-015\"" -A 1 .aphoria/claims.toml
# Output: concept_path = "msgqueue/queue/max_size"

# Copy this EXACTLY into extractor subject:
subject = "msgqueue/queue/max_size"  # ✅ CORRECT (exact copy)

Step 2: Validate BEFORE scanning

# Compare subjects vs concept_paths
grep "subject =" .aphoria/config.toml | sort
grep "concept_path =" .aphoria/claims.toml | sort

# Verify: Every subject should appear in a concept_path
# If subject = "queue/max_size" and no concept_path = "queue/max_size" → WRONG
# Must use full path: "msgqueue/queue/max_size"

Step 3: Test pattern matches code

# For each extractor pattern, verify it matches code:
grep -rE 'max_queue_size:\s*None' src/
# Should find: src/config.rs:45: max_queue_size: None

# If no match → pattern is wrong, fix regex
# If matches → pattern is correct, issue is subject field
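
When the real source tree is not at hand, the same pattern test can be run against a small sample. The snippet below is a sketch under assumptions: the sample Rust lines are invented for illustration, and the pattern uses `[[:space:]]` because `\s` is not part of POSIX extended regular expressions and some grep builds may not accept it.

```shell
#!/usr/bin/env bash
# Hypothetical sample of the kind of code the extractor should match.
sample=$(mktemp)
printf '%s\n' \
  'let cfg = QueueConfig {' \
  '    max_queue_size: None,' \
  '    max_requeue_count: Some(3),' \
  '};' > "$sample"

# Count the lines matched by the portable form of the extractor pattern.
matches=$(grep -cE 'max_queue_size:[[:space:]]*None' "$sample")
echo "lines matched: $matches"
rm -f "$sample"
```

A count of zero here points at the regex; a non-zero count points back at the subject field.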

Step 4: Create extractor with correct format

[[extractors.declarative]]
name = "queue_max_size_unbounded"
pattern = 'max_queue_size:\s*None'
languages = ["rust"]

[extractors.declarative.claim]
subject = "msgqueue/queue/max_size"  # ✅ Copied from claim concept_path
predicate = "bounded"
value = false
confidence = 0.95

How to Verify Correct Format

After creating extractors, before scanning:

# 1. Check all subjects
grep "subject =" .aphoria/config.toml

# 2. Check all concept_paths
grep "concept_path =" .aphoria/claims.toml

# 3. Verify alignment
# For each subject, there MUST be a claim with matching concept_path
# "msgqueue/queue/max_size" → MUST exist in claims.toml

# Example check:
for subject in $(grep "subject =" .aphoria/config.toml | cut -d'"' -f2); do
  if ! grep -q "concept_path = \"$subject\"" .aphoria/claims.toml; then
    echo "❌ MISMATCH: $subject not found in claims"
  else
    echo "✅ OK: $subject"
  fi
done

Expected output: All subjects show ✅ OK

Debug 0% Detection After Creating Extractors

If you created extractors and detection rate is still 0%:

Step 1: Were observations created?

jq '.observations | length' scan-v2.json
# Expected: > 0

If 0 observations → Pattern doesn't match code (test with grep -rE "pattern" src/)
If >0 observations → Observations don't match claims (subject mismatch, proceed to Step 2)

Step 2: Compare observation paths vs claim paths

# Observation paths (what extractors created):
jq '.observations[].concept_path' scan-v2.json | sort -u

# Claim paths (what exists in claims.toml):
grep "concept_path =" .aphoria/claims.toml | cut -d'"' -f2 | sort -u

# Compare: Do observation paths END with same tail as claim paths?

Example mismatch:

Observation: queue/max_size
Claim: msgqueue/queue/max_size
Tail: Both have queue/max_size (last 2 segments)
Problem: Observation missing msgqueue/ prefix

Fix: Update extractor subject to match claim's full path.
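
The comparison in Step 2 can be done mechanically with `comm` once both lists are sorted. This is a minimal sketch using sample data in place of the real `jq` and `grep` output; the two path lists are taken from the mismatch described in this section.

```shell
#!/usr/bin/env bash
# Sample data standing in for the output of the two commands in Step 2.
obs=$(mktemp)
claims=$(mktemp)
printf '%s\n' 'consumer/prefetch_count' 'queue/max_size' | sort -u > "$obs"
printf '%s\n' 'msgqueue/consumer/prefetch_count' 'msgqueue/queue/max_size' | sort -u > "$claims"

# comm -23 prints lines found only in the first file:
# observation paths that have no exactly matching claim path.
unmatched=$(comm -23 "$obs" "$claims")
printf 'unmatched observation paths:\n%s\n' "$unmatched"
rm -f "$obs" "$claims"
```

Every path listed as unmatched is an extractor subject that needs the full concept_path copied in.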

How We're Fixing This

Documentation updates (2026-02-10):

✅ Created docs/extractors/declarative-extractors.md with subject field reference
✅ Created docs/examples/extractors/timeout-zero-example.md with worked example
✅ Updated plan.md Day 3 Step 3 to show manual extractor format
✅ Updated plan.md Day 3 Step 4 with debug workflow for 0% detection
✅ Added validation steps (grep subject vs concept_path)

Product improvements (planned):

VG-DAY3-001: aphoria scan --show-observations to see observation concept paths
VG-DAY3-002: Better error messages when subject doesn't match any claim
VG-DAY3-003: aphoria extractors validate to check subject alignment
VG-DAY3-004: aphoria extractors test NAME --file path.rs for single-extractor testing

Comparison: Two Failure Modes

| Attempt | Extractors Created | Detection Rate | Failure Reason |
|---------|--------------------|----------------|----------------|
| First | 0 | 0% | Skipped Phase 4 entirely (docs unclear) |
| Second | 7 | 0% | Wrong subject format (undocumented requirement) |
| Correct | 7 | 100% | Subject matches concept_path exactly |

Progress: First fix got team to CREATE extractors. Second fix ensures extractors WORK.

Mistake #3: Treating Aphoria as Static Scanner

Severity: 🚨 CRITICAL - Fundamental misunderstanding

What People Think

"Aphoria is a CLI tool you run to check code, like a linter."

What It Actually Is

"Aphoria is an autonomous learning system where LLM skills drive the workflow, and CLI is a debug interface."

How This Manifests

Wrong workflow:

Run aphoria scan
Look at output
Done

Correct workflow:

Use /aphoria-suggest to discover patterns (Day 1)
Use /aphoria-claims to author claims (Day 1)
Write code with violations (Day 2)
Run aphoria scan to get baseline (Day 3)
Use /aphoria-custom-extractor-creator to close gaps (Day 3)
Re-scan to verify (Day 3)
Fix violations progressively (Day 4)

Key difference: LLM skills (/aphoria-*) are PRIMARY, CLI is FALLBACK.

How to Avoid

Before starting dogfood:

Verify skills are available: /help | grep aphoria
Understand: Skills drive the process, not manual CLI
Reference: Read applications/aphoria/vision.md sections on autonomous workflows

Mistake #4: Not Verifying Prerequisites

Severity: ⚠️ MAJOR - Wastes time mid-execution

What People Do Wrong

Start Day 3 without checking:

Is /aphoria-custom-extractor-creator skill available?
Are inline markers present in code?
Does code compile?

Result: Workflow fails mid-execution, must backtrack.

What To Do Instead

Pre-flight check at start of EACH day:

Day 1:

/help | grep aphoria-suggest           # Skill available?
/help | grep aphoria-claims            # Skill available?
ls .aphoria/config.toml                # Config exists?

Day 2:

ls src/                                # Project structure exists?
cargo check                            # Dependencies resolve?

Day 3:

/help | grep aphoria-custom-extractor-creator  # Skill available?
grep -r "@aphoria:claim" src/ | wc -l          # Markers present?
cargo check                                    # Code compiles?

Day 4:

ls scan-v2.json                        # Verification scan exists?
jq '.summary.claims_conflict' scan-v2.json  # Violations detected?

If any check fails, STOP and fix before proceeding.
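
The "STOP and fix" rule is easier to follow when a check returns a non-zero status instead of printing a number to eyeball. Below is a sketch of the Day 3 marker check in that style; `require_markers` is a hypothetical name, and the skill-availability and `cargo check` steps are left out because they depend on the local environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only when the expected number of
# "@aphoria:claim" marker lines exists under a source directory.
# Usage: require_markers <src_dir> <expected_count>
require_markers() {
  local src="$1" expected="$2" found
  found=$(grep -r "@aphoria:claim" "$src" 2>/dev/null | wc -l | tr -d ' ')
  if [ "$found" -ne "$expected" ]; then
    echo "STOP: found $found markers in $src, expected $expected"
    return 1
  fi
  echo "OK: $found markers"
}
```

For the msgqueue exercise this would be called as `require_markers src 8 && cargo check`, so a missing marker halts the pre-flight before anything else runs.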

Mistake #5: Skipping Gap Analysis

Severity: ⚠️ MAJOR - Can't prioritize what to fix

What People Do Wrong

See "20/22 claims MISSING" in the scan output but don't investigate why.

What To Do Instead

Create gap analysis table after scan-v1:

## Gap Analysis

| Violation | Location | Marker Present? | Observation Found? | Extractor Exists? | Action |
|-----------|----------|----------------|-------------------|------------------|--------|
| timeout=0 | config.rs:20 | ✅ | ❌ | ❌ | Create extractor |
| prefetch=MAX | config.rs:33 | ✅ | ❌ | ❌ | Create extractor |
| verify_tls=false | config.rs:68 | ✅ | ❌ | ❌ | Create extractor |
... (8 total)

**Summary:**
- Total violations: 8
- Markers present: 8/8
- Observations found: 0/8
- Extractors needed: 8

**Root cause:** Zero extractors exist for msgqueue domain patterns.

This makes it clear WHAT to create in Phase 4.

Mistake #6: No Time Tracking

Severity: ℹ️ MINOR - Can't optimize workflow

What People Do Wrong

Don't track time per phase, so efficiency can't be calculated.

What To Do Instead

Track time in daily summary:

## Time Breakdown

| Phase | Target | Actual | Delta |
|-------|--------|--------|-------|
| Pre-flight check | 5 min | 3 min | -2 min ✅ |
| Baseline scan | 15 min | 12 min | -3 min ✅ |
| Gap analysis | 15 min | 18 min | +3 min |
| Extractor creation | 30 min | 35 min | +5 min |
| Verification scan | 15 min | 10 min | -5 min ✅ |
| Documentation | 15 min | 12 min | -3 min ✅ |
| **Total** | **95 min** | **90 min** | **-5 min ✅** |

This shows where time is spent and where to optimize.
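
The totals row can be computed rather than added by hand. The sketch below assumes the timings are kept as `phase,target_min,actual_min` lines (a format chosen here for illustration) and uses the values from the table above; `sum_timings` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum target and actual minutes from CSV lines on stdin.
sum_timings() {
  awk -F, '
    { target += $2; actual += $3 }
    END { printf "target=%d actual=%d delta=%+d\n", target, actual, actual - target }'
}

# Timings copied from the Time Breakdown table.
timings='Pre-flight check,5,3
Baseline scan,15,12
Gap analysis,15,18
Extractor creation,30,35
Verification scan,15,10
Documentation,15,12'

printf '%s\n' "$timings" | sum_timings
# prints: target=95 actual=90 delta=-5
```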

Mistake #7: No Detection Rate Calculation

Severity: ℹ️ MINOR - Can't prove success

What People Do Wrong

Scan results exist, but no explicit detection rate is calculated.

What To Do Instead

## Detection Rate

| Scan | Violations Detected | Total Violations | Detection Rate | Target | Pass? |
|------|---------------------|-----------------|----------------|--------|-------|
| v1 (baseline) | 0 | 8 | 0% | N/A | Baseline |
| v2 (after extractors) | 8 | 8 | 100% | ≥90% | ✅ PASS |

**Improvement:** +100 percentage points (0% → 100%)

**Root Cause of Initial 0%:** Zero extractors existed for msgqueue patterns. After creating 8 extractors, 100% detection achieved.
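
The arithmetic behind the table is detected violations divided by total violations, with the improvement expressed in percentage points. The following sketch uses whole-number shell arithmetic (so rates are truncated, not rounded); `detection_rate` is a hypothetical helper, and in practice the detected counts would come from `jq '.summary.claims_conflict'` on each scan file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: detection rate as a truncated whole-number percentage.
# Usage: detection_rate <detected> <total>
detection_rate() {
  local detected="$1" total="$2"
  if [ "$total" -le 0 ]; then
    echo "total must be positive" >&2
    return 1
  fi
  echo $(( $detected * 100 / $total ))
}

v1=$(detection_rate 0 8)   # baseline scan
v2=$(detection_rate 8 8)   # after extractor creation
echo "v1=${v1}% v2=${v2}% improvement: +$((v2 - v1)) percentage points"
# prints: v1=0% v2=100% improvement: +100 percentage points
```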

Mistake #8: Not Comparing to httpclient

Severity: ℹ️ MINOR - Misses learning opportunity

What People Do Wrong

Don't examine why httpclient succeeded (100% detection on first scan) while msgqueue failed (0% detection).

What To Do Instead

## Comparison: httpclient vs msgqueue

| Metric | httpclient | msgqueue | Why Different? |
|--------|-----------|----------|----------------|
| Initial detection | 7/7 (100%) | 0/8 (0%) | httpclient had extractors from corpus |
| Extractors created | 0 (existed) | 8 (new) | msgqueue required new extractors |
| Final detection | 7/7 (100%) | 8/8 (100%) | After creation, both 100% |

**Lesson:** First dogfood in new domain requires extractor creation (Day 3 Phase 4). Subsequent dogfoods reuse extractors (corpus compounding).

**Corpus growth:** These 8 msgqueue extractors will benefit:
- Next msgqueue project (100% detection on first scan)
- Any async Rust project (timeout, TLS, blocking-in-async patterns reusable)

Checklist: "Did I Do Day 3 Correctly?"

Use this checklist after completing Day 3:

✅ Pre-Flight (5 min)

Verified skill availability (/help | grep aphoria-custom-extractor-creator)
Verified inline markers present (grep -r "@aphoria:claim" src/)
Verified code compiles (cargo check)

✅ Baseline Scan (15 min)

Ran aphoria scan > scan-v1.json
Reviewed results (expected: low detection rate for new domain)

✅ Gap Analysis (15 min)

Created gap table (violations vs observations)
Identified which extractors are needed (8 total)

✅ Extractor Creation (30 min) [CRITICAL]

Invoked /aphoria-custom-extractor-creator 8 times (one per violation)
Created .aphoria/extractors/ directory
8 .toml files exist in extractors/ directory
Each extractor file has: name, pattern, subject (copied from the claim's concept_path), predicate, value

✅ Verification Scan (15 min)

Ran aphoria scan > scan-v2.json
Compared v1 vs v2 (detection rate improved from 0% to ≥90%)
Zero false positives

✅ Documentation (15 min)

Created DAY3-SUMMARY.md
Included metrics table (v1 vs v2 detection rate)
Listed all 8 extractors created
Documented time per phase
Described learning captured (patterns identified)

Evidence Check

Run these commands to verify:

# 1. Extractor files exist
ls .aphoria/extractors/*.toml | wc -l
# Expected: 8

# 2. Verification scan exists
ls scan-v2.json
# Expected: file exists

# 3. Daily summary exists
ls DAY3-SUMMARY.md
# Expected: file exists

# 4. Detection improved
jq '.summary.claims_conflict' scan-v1.json  # Should be: 0
jq '.summary.claims_conflict' scan-v2.json  # Should be: 8
# Improvement: +8 violations detected

If ANY check fails, Day 3 is incomplete. Redo from Phase 4 (extractor creation).

How to Recover from Mistakes

If You Skipped Day 3 Extractor Creation

Symptoms:

No .aphoria/extractors/ directory
Only scan-v1.json exists (no v2)
No DAY3-SUMMARY.md
Detection rate still 0%

Recovery:

Load skill: /aphoria-custom-extractor-creator
Create extractors (Phase 4 of Day 3)
Run verification scan (Phase 5)
Write summary (Phase 6)
Mark Day 3 as complete

Time: ~1 hour

If You Forgot Pre-Flight Check

Symptoms:

Workflow failed mid-execution
Skill not found errors
Code doesn't compile

Recovery:

Run pre-flight check now
Fix blockers (load skills, fix compilation)
Resume from where you stopped

Time: ~15 minutes

If You Have No Gap Analysis

Symptoms:

Can't explain why violations were missed
Don't know which extractors to create

Recovery:

Review scan-v1.json
Create gap table (template above)
Proceed with extractor creation

Time: ~15 minutes

Prevention: What We Fixed

Documentation Updates (2026-02-10)

✅ plan.md:

Day 3 Step 3 now says [REQUIRED - DO NOT SKIP]
Added pre-flight check section
Broke Day 3 into 6 explicit phases
Added evidence checklist (ls commands)

✅ SKILL.md (aphoria-dogfood):

Rewrote Day 3 section with emphasis on extractor creation
Added Phase 1-6 breakdown
Added warning: "THIS IS THE CORE FLYWHEEL STEP"

✅ This document (dogfooding-common-mistakes.md):

Documents msgqueue failure as cautionary example
Provides recovery procedures
Includes verification checklists

Product Improvements (Planned)

🔜 Scan output enhancement:

Show "Run /aphoria-custom-extractor-creator" suggestion when claims are MISSING

🔜 New CLI commands:

aphoria extractors coverage - Show which extractors exist vs needed
aphoria dogfood metrics --day 3 - Calculate detection rate improvement
aphoria scan diff scan-v1.json scan-v2.json - Visual diff

🔜 Pre-flight validation:

aphoria dogfood preflight --day 3 - Verify prerequisites before starting

Mistake #9: Not Refining Extractors After Low Detection

Severity: ⚠️ MAJOR - Leaves false negatives unaddressed

What People Do Wrong

Day 3 achieves 50% detection (5/10 violations) and Day 5 documents "use programmatic for complex patterns," but no programmatic extractors are ever created.

Evidence from cachewrap dogfood (2026-02-11):

Day 3: Created 10 declarative extractors
Result: 50% detection (5/10 violations)
Day 4: Fixed all violations manually
Day 5: Wrote extensive documentation recommending programmatic extractors
But never created programmatic extractors to fix the 5 false negatives

Why It's Wrong

False negatives persist - 5 violations undetected (cache key validation, TLS, sync blocking, pooling, metrics)
No knowledge refinement - Next cache project will ALSO have 50% detection
Documentation-code gap - Says "use programmatic" but only shows declarative
Flywheel incomplete - Learning cycle stops at 50%, doesn't reach 90%+ target
Pattern persists - Next dogfood will repeat the same mistake

What To Do Instead

Day 5 should include extractor refinement workflow:

Phase 1: Analyze Day 3 Failures (15 min)

# Compare Day 3 expectations vs results
jq '.summary.claims_conflict' scan-v3.json
# Output: 5 (expected: 9-10)

# Identify which violations were missed
jq '.claim_verification[] | select(.verdict == "MISSING") | .claim_id' scan-v3.json
# Output: cache-tls-validation-001, cache-async-blocking-001, etc.

Create analysis table:

## Day 3 False Negatives

| Violation | Declarative Pattern | Why It Failed | Needs Programmatic? |
|-----------|---------------------|---------------|---------------------|
| cache-key-validation-001 | `pub async fn get\(&self, key: &str\)` | Can't see function body (validate_key() call) | ✅ Yes |
| cache-tls-validation-001 | `verify_tls:\s*false` | Declaration vs value context | ✅ Yes |
| cache-async-blocking-001 | `self\.client\.get_connection\(\)` | Escaping issue or not matching | ⚠️ Maybe |
| cache-max-connections-001 | Long pattern | Too complex for regex | ✅ Yes |
| cache-metrics-enabled-001 | `metrics_enabled:\s*false` | Declaration vs value context | ✅ Yes |

**Summary:** 5 false negatives, 4 require programmatic extractors

Phase 2: Create Programmatic Extractors (45 min)

Use /aphoria-custom-extractor-creator with programmatic implementations:

Example: cache-key-validation-001

# Create programmatic extractor
/aphoria-custom-extractor-creator \
  --violation "Missing key validation in get() function body" \
  --claim cache-key-validation-001 \
  --type programmatic \
  --file src/client.rs

Expected output: src/extractors/cache_key_validation.rs with AST parsing

Phase 3: Re-Scan with Hybrid Extractors (10 min)

# Rebuild Aphoria with new extractors
cd ../.. # Back to Aphoria root (assumes the dogfood lives at <aphoria root>/dogfood/cachewrap)
cargo build --release --bin aphoria

# Run scan with hybrid extractors (declarative + programmatic)
cd dogfood/cachewrap
/path/to/aphoria scan --format json > scan-final-refined.json

# Compare detection rates
jq '.summary.claims_conflict' scan-v3.json      # Declarative only: 5
jq '.summary.claims_conflict' scan-final-refined.json  # Hybrid: 9-10

Phase 4: Document Refinement (15 min)

Update DAY5-SUMMARY.md with:

## Extractor Refinement (Day 5, Phase 4)

### Detection Rate Improvement

| Approach | Extractors | Detection | Rate |
|----------|-----------|-----------|------|
| Declarative only (Day 3) | 10 | 5/10 | 50% |
| Hybrid (Day 5 refined) | 10 declarative + 4 programmatic | 9/10 | 90% |

### Programmatic Extractors Created

1. **cache-key-validation-001** - AST parsing to detect validate_key() call in function body
2. **cache-tls-validation-001** - Context-aware detection of verify_tls value in Default impl
3. **cache-max-connections-001** - Simplified pattern with screening
4. **cache-metrics-enabled-001** - Context-aware detection in Default impl

### Lessons Learned

- Declarative extractors are 50-70% effective for initial pass
- Programmatic extractors necessary for 90%+ detection
- Hybrid strategy: declarative for rapid prototyping, programmatic for refinement

How to Verify Correct Execution

After Day 5, these MUST exist if detection rate was <90% in Day 3:

# 1. Programmatic extractors created
$ ls src/extractors/*.rs | grep -v mod.rs | grep -v registry.rs | wc -l
4  # Should match number of false negatives needing programmatic

# 2. Refined scan exists
$ ls scan-final-refined.json
scan-final-refined.json

# 3. Detection rate improved
$ jq '.summary.claims_conflict' scan-v3.json
5
$ jq '.summary.claims_conflict' scan-final-refined.json
9  # Should be ≥9 (90%+)

# 4. DAY5-SUMMARY includes refinement section
$ grep "Extractor Refinement" DAY5-SUMMARY.md
## Extractor Refinement (Day 5, Phase 4)

If declarative detection was ≥90%, refinement is optional but recommended for completeness.

Why This Mistake Happens

Root cause: Skill bias + missing workflow

Skill says "Declarative First" - Creates strong default
No threshold trigger - No guidance on "when detection <70%, switch to programmatic"
Effort imbalance - Declarative framed as "fast/easy", programmatic as "hard/slow"
No Day 5 workflow - Plan doesn't include extractor refinement
Documentation-code gap - Write "use programmatic" but never actually do it

How We're Fixing This

Skill updates (2026-02-11):

✅ Changed principle from "Declarative First" to "Hybrid Strategy"
✅ Added detection threshold: "<70% → create programmatic"
✅ Updated "Do" list: "Upgrade to programmatic when detection <70%"
✅ Updated "Do Not" list: "Do NOT stop at declarative when detection <70%"
✅ Added section: "When to switch from declarative to programmatic"
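
The thresholds in this section (programmatic extractors below 70%, refinement required below the 90% target, optional at or above it) can be written down as a small decision function. This is a sketch of that reading of the rules, not skill code; `refinement_action` is a hypothetical name and the rate is truncated to a whole percentage.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a Day 3 detection result to the next action,
# using the <70% and 90% thresholds stated in this section.
# Usage: refinement_action <detected> <total>
refinement_action() {
  local detected="$1" total="$2" rate
  if [ "$total" -le 0 ]; then
    echo "total must be positive" >&2
    return 1
  fi
  rate=$(( $detected * 100 / $total ))
  if [ "$rate" -lt 70 ]; then
    echo "create programmatic extractors"
  elif [ "$rate" -lt 90 ]; then
    echo "refine extractors to reach the 90% target"
  else
    echo "refinement optional"
  fi
}
```

For the cachewrap result, `refinement_action 5 10` selects the programmatic path, while `refinement_action 9 10` reports that refinement is optional.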

Documentation updates (2026-02-11):

✅ Added Mistake #9 to dogfooding-common-mistakes.md (this section)
✅ Added Day 5 Phase 4: Extractor Refinement workflow
✅ Created programmatic extractor example (see below)

Next: Update plan.md template to include Day 5 refinement workflow

Comparison: Declarative vs Hybrid

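| Dogfood | Approach | Day 3 Detection | Day 5 Refined | Final Rate |
|---------|----------|-----------------|---------------|------------|
| cachewrap (before fix) | Declarative only | 50% (5/10) | N/A (skipped) | 50% |
| cachewrap (after fix) | Hybrid (declarative → programmatic) | 50% (5/10) | 90% (9/10) | 90% |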

Lesson: Day 5 refinement turns 50% declarative detection into 90% hybrid detection.

Summary

Most Critical Mistake: Skipping Day 3 extractor creation (breaks flywheel completely)

How to Avoid:

Understand Aphoria is autonomous learning system (not static scanner)
Follow plan.md Day 3 phases 1-6 WITHOUT skipping any
Verify evidence after Day 3 (8 extractors, scan-v2.json, DAY3-SUMMARY.md)
Run pre-flight check before each day

How to Verify Success:

ls .aphoria/extractors/*.toml | wc -l  # Must be: 8
ls scan-v2.json                        # Must exist
ls DAY3-SUMMARY.md                     # Must exist

If ANY check fails, Day 3 is incomplete.

Last Updated: 2026-02-10 (after msgqueue dogfood Day 3 failure)
