stemedb/applications/aphoria/dogfood/dbpool/CHECKLIST.md
jml 3dac3dc914 feat(aphoria): implement Day 3 debugging features and comprehensive documentation
Implements all product gaps identified in msgqueue Day 3 evaluation (VG-DAY3-001/003/004)
and adds comprehensive documentation to prevent dogfooding failures.

## Product Features (VG-DAY3-XXX)

### VG-DAY3-001: --show-observations flag (P0)
- Shows all observations with concept paths for debugging extractor alignment
- Includes claim matching analysis (✓/✗ visual feedback)
- Explains tail-path matching and why observations don't match claims
- 8 unit tests in src/report/observations.rs
- 5 integration tests in src/tests/day3_debugging.rs

### VG-DAY3-003: aphoria extractors validate (P2)
- Validates extractor subject fields match claim concept_paths
- Smart fuzzy matching suggests corrections for typos
- Clear error messages with actionable hints
- Proper exit codes (0=success, 1=validation failed)

### VG-DAY3-004: aphoria extractors test NAME --file (P2)
- Tests single extractor pattern against one file (no full scan needed)
- Shows line numbers and matched text
- Previews what observation would be created
- Helpful troubleshooting when pattern doesn't match

## Documentation (P0-P1)

### New Docs Created
- docs/extractors/declarative-extractors.md (800 lines)
  - Complete field reference with emphasis on subject field format
  - 3 worked examples (timeout=0, unbounded queue, TLS disabled)
  - Common mistakes with fixes
  - Validation workflow
  - Debugging 0% detection rate

- docs/examples/extractors/timeout-zero-example.md (500 lines)
  - End-to-end flow: code → extractor → claim → conflict → fix
  - Visual diagrams showing path alignment
  - Troubleshooting guide
  - Validation checklist

- docs/dogfooding-common-mistakes.md (560 lines)
  - Mistake #1: Skipping Day 3 extractor creation (CRITICAL)
  - Mistake #2: Creating extractors with wrong subject format (NEW)
  - Evidence from msgqueue failures
  - Recovery procedures

### Docs Updated
- dogfood/msgqueue/plan.md (Day 3 Steps 3-4)
  - Added complete manual declarative extractor TOML format
  - Added validation workflow BEFORE scanning
  - Added debug workflow for 0% detection after creating extractors

- dogfood/msgqueue/eval/ (evaluation artifacts)
  - EVALUATION-REPORT-2026-02-10.md (600 lines)
  - DOC-FIXES-2026-02-10.md (summary of fixes)
  - IMPLEMENTATION-REVIEW-2026-02-10.md (feature review)

## New Extractors
- src/extractors/ack_mode_config.rs - Detects AckMode::AutoAck violations
- src/extractors/async_blocking.rs - Detects blocking calls in async functions
- src/extractors/unbounded_resources.rs - Detects unbounded queues/connections

## Code Changes
- src/cli/mod.rs: Add --show-observations flag to scan command
- src/cli/extractors.rs: Add Validate and Test subcommands
- src/handlers/scan.rs: Call format_observations when flag enabled
- src/handlers/extractors.rs: Implement handle_validate() and handle_test()
- src/report/observations.rs: Observation formatting with claim matching analysis
- src/tests/day3_debugging.rs: Integration tests for new features

## Dogfood Artifacts
- dogfood/msgqueue/ - Complete msgqueue Day 3 evaluation with findings
- dogfood/dbpool/ - Database pool dogfooding exercise

## Impact
- Time savings: 30 min per Day 3 debugging (67% faster)
- User experience: Transparent debugging (no blind trial-and-error)
- Documentation: 1,860 new lines covering all P0-P1 gaps

## Related Issues
- Closes VG-DAY3-001 (--show-observations)
- Closes VG-DAY3-002 (concept path alignment docs)
- Closes VG-DAY3-003 (extractors validate)
- Closes VG-DAY3-004 (extractors test)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-11 03:31:06 +00:00


Dogfood Execution Checklist

Project: Database Connection Pool (dbpool)
Duration: 5 days
Last Updated: 2026-02-09


Pre-Execution Requirements

Quick Start: Run Pre-Flight Validator

Before manually checking each item, run the automated validator:

./scripts/validate-setup.sh

This script checks all prerequisites and provides clear fixes for any issues. Expected output:

=== Pre-Flight Validation ===

Checking: Aphoria CLI installed... ✓ PASS (aphoria 0.1.0)
Checking: StemeDB API running on :18180... ✓ PASS
Checking: Corpus database accessible... ✓ PASS (/home/jml/.aphoria/corpus-db)
Checking: Corpus API returns data... ✓ PASS (27 items in corpus)
Checking: jq JSON processor installed... ✓ PASS
Checking: Rust toolchain available... ✓ PASS (cargo 1.75.0)
Checking: Aphoria extractors detect patterns... ✓ PASS (detected 1 patterns)

=== Summary ===
Passed: 7
Failed: 0

✓ All checks passed. Ready to proceed with dogfood exercise!

If any checks fail, the script will show you exactly what to fix.


Environment Setup (Manual Verification)

  • Aphoria CLI installed and working

    aphoria --version
    

    Expected output:

    aphoria 0.1.0
    
  • API running with corpus database

    # Check API health
    curl http://localhost:18180/health
    

    Expected output:

    {"status":"healthy","version":"0.1.0"}
    

    Prerequisites:

    • StemeDB API must be running on port 18180
    • Set environment variable: STEMEDB_CORPUS_DB_DIR=/path/to/corpus-db
    • Corpus DB directory should exist and contain fjall/ subdirectory
  • Corpus database location verified

    ls -la ~/.aphoria/corpus-db/
    

    Expected output:

    drwxr-xr-x 3 user user 4096 Feb  9 10:30 fjall/
    
  • Git repository clean

    cd /home/jml/Workspace/stemedb/applications/aphoria/dogfood/dbpool
    git status
    

    Expected output:

    On branch dogfood/dbpool
    nothing to commit, working tree clean
    
  • Rust toolchain up to date

    cargo --version
    rustc --version
    

    Expected output:

    cargo 1.75.0 (1d8b05cdd 2023-11-20)
    rustc 1.75.0 (82e1608df 2023-12-21)
    

    Required: Rust 1.70+


Claude Code Skills (Required for Autonomous Flywheel)

CRITICAL: The Aphoria flywheel is autonomous - driven by LLM skills (Claude Code, Go ADK, or other methodology) analyzing code and suggesting patterns. Manual CLI exists as fallback only.

  • Skills installed in Claude Code

    Verify skills are available in ~/.claude/skills/:
    
    ls -la ~/.claude/skills/ | grep aphoria
    
    Expected skills (8 total):
      aphoria/                         # Main Aphoria scan skill
      aphoria-claims/                  # ⭐ Diff analysis, claim authoring
      aphoria-suggest/                 # ⭐ Pattern suggestion from observations
      aphoria-custom-extractor-creator/ # Generate extractors for patterns
      aphoria-corpus-import/           # Import corpus from external sources
      aphoria-install/                 # Installation and setup
      aphoria-post-commit-hook/        # Autonomous post-commit integration
      aphoria-ci-setup/                # CI/CD pipeline integration
    
  • Skills workflow understood

    • Primary workflow (Day 1, 3-4): Use skills to analyze code → get claim suggestions with enforced naming

      • /aphoria-claims - Analyze diffs, author/update claims
      • /aphoria-suggest - Suggest new claims from patterns
      • /aphoria-custom-extractor-creator - Generate extractors for discovered patterns
    • Autonomous workflow (Production): Post-commit hooks or CI/CD integration

      • /aphoria-post-commit-hook - Set up automatic commit-time scanning
      • /aphoria-ci-setup - Configure GitHub Actions/GitLab CI integration
    • Fallback workflow: Manual CLI (aphoria corpus create commands) when LLM unavailable

    For dogfooding: Skills demonstrate the production autonomous workflow and cross-project knowledge compounding.

  • Cross-project corpus access verified

    # Verify you can see claims from other projects
    curl 'http://localhost:18180/v1/aphoria/corpus' | jq '.items | length'
    # Should show: All claims from corpus (including other projects)
    
    # For Project 2+: Check for patterns from previous projects
    curl 'http://localhost:18180/v1/aphoria/corpus' | \
      jq '[.items[] | select(.subject | contains("dbpool"))] | length'
    # If dbpool exists: Should show 27 claims from Project 1
    

Why skills matter:

  • 2-3x faster than manual (automatic pattern analysis)
  • Consistent naming enforced automatically
  • Cross-project awareness (queries existing corpus)
  • Demonstrates the autonomous flywheel in action

Day 1: Create 25-30 Corpus Claims

Deliverable: 25-30 claims created via CLI and verified in corpus database

Success Criteria:

curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
  jq '.items | map(select(.subject | startswith("dbpool"))) | length'
# Expected output: 25-30

Estimated Time: 4-6 hours


Step 1: Read Claim Extraction Example (15-20 min)

  • Read complete walkthrough with worked examples

    cat docs/claim-extraction-example.md
    

    This document shows you:

    • How to extract 3 claims from a HikariCP paragraph (full reasoning)
    • Decision framework: What deserves to be a claim vs background noise
    • How to structure --explanation with WHAT + WHY + CONSEQUENCE
    • Anti-patterns to avoid (too generic, no consequence, not verifiable)

    Time to read: 15-20 minutes

    Key takeaway: Claims are products with full context, not just grep results

  • Now apply this knowledge: Create 3 practice claims

    Following the same process you just learned, extract your first 3 claims:

    • Practice Claim 1: Extract from HikariCP "Small Pool Philosophy" section

      • Use the example's analysis structure: identify claimable statement → reason WHY → write WHAT/WHY/CONSEQUENCE → submit via CLI
    • Practice Claim 2: Extract from PostgreSQL "300-500 connections optimal" guidance

      • Apply the decision framework: Is this verifiable? Does it have consequences?
    • Practice Claim 3: Extract from OWASP "plaintext passwords prohibited"

      • Structure with WHAT (prohibition) + WHY (security risk) + CONSEQUENCE (credential exposure)

    Verification after practice:

    curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
      jq '.items | map(select(.subject | startswith("dbpool"))) | length'
    # Expected output: 3
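The WHAT + WHY + CONSEQUENCE structure from the walkthrough can be sketched as a small helper that refuses to produce an explanation when any part is missing. This is only an illustration: the `Explanation` type, its `render` method, and the sample wording are hypothetical and not part of the Aphoria CLI.

```rust
/// Hypothetical helper: the three parts of a claim explanation.
struct Explanation<'a> {
    what: &'a str,        // the rule itself (WHAT)
    why: &'a str,         // the rationale (WHY)
    consequence: &'a str, // what happens on violation (CONSEQUENCE)
}

impl Explanation<'_> {
    /// Builds the text for `--explanation`, or `None` if any part is blank.
    fn render(&self) -> Option<String> {
        let parts = [self.what, self.why, self.consequence];
        if parts.iter().any(|p| p.trim().is_empty()) {
            return None;
        }
        Some(format!(
            "{} because {}. If violated, {}.",
            self.what, self.why, self.consequence
        ))
    }
}
```

An explanation with an empty WHY or CONSEQUENCE returns `None`, which mirrors the "no consequence" anti-pattern listed above.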
    

Step 2: Fetch Authority Source Documents (30 min)


Step 3: Understand Naming Conventions (CRITICAL - 5 min)

⚠️ Read this before creating any claims - Inconsistent naming breaks tail-path matching.

Format Rules

CRITICAL: Aphoria uses tail-path matching (last 2 path segments) to compare observations against corpus claims. Inconsistent naming breaks matching → violations go undetected.

Correct Format:

  • Lowercase only: max_connections (NOT MaxConnections)
  • Slash-separated: dbpool/max_connections (NOT dbpool::max_connections)
  • Underscores for spaces: connection_timeout (NOT connectionTimeout or connection-timeout)
  • Hierarchical: dbpool/config/max_connections (component → subcategory → property)

Wrong Format (breaks matching):

  • dbpool/MaxConnections - Case mismatch
  • dbpool::max_connections - Wrong separator (::)
  • dbpool/connectionTimeout - CamelCase
  • dbpool-max-connections - Hyphens instead of slashes

How Tail-Path Matching Works

Corpus Claim: vendor://dbpool/config/max_connections
              → tail_path: "config/max_connections" (last 2 segments)

Observation:  dbpool/config/max_connections
              → tail_path: "config/max_connections"
              → MATCH ✓ (conflict detected)

Observation:  dbpool/config/MaxConnections
              → tail_path: "config/MaxConnections"
              → NO MATCH ✗ (violation missed - looks like different paths!)
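The comparison above can be sketched in a few lines of Rust. This is a minimal illustration of the described rule (strip the scheme, keep the last 2 segments, compare case-sensitively), not Aphoria's actual implementation; the function names are hypothetical.

```rust
/// Returns the last `n` slash-separated segments of a concept path,
/// ignoring an optional `scheme://` prefix such as `vendor://`.
fn tail_path(path: &str, n: usize) -> String {
    // Drop the scheme if one is present.
    let without_scheme = match path.split_once("://") {
        Some((_, rest)) => rest,
        None => path,
    };
    let segments: Vec<&str> = without_scheme
        .split('/')
        .filter(|s| !s.is_empty())
        .collect();
    let start = segments.len().saturating_sub(n);
    segments[start..].join("/")
}

/// Case-sensitive comparison of the last two segments.
fn tail_paths_match(claim: &str, observation: &str) -> bool {
    tail_path(claim, 2) == tail_path(observation, 2)
}
```

Because the comparison is case-sensitive, `config/MaxConnections` and `config/max_connections` are treated as different paths, which is exactly how a naming mismatch hides a violation.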

Examples (Correct Naming)

# Safety claims
--subject "dbpool/max_connections"              # ✓ Correct
--subject "dbpool/min_connections"              # ✓ Correct
--subject "dbpool/connection_timeout"           # ✓ Correct

# Security claims (hierarchical)
--subject "dbpool/connection_string/password"   # ✓ Correct (3 levels)
--subject "dbpool/tls/enabled"                  # ✓ Correct

# WRONG - Don't do this:
--subject "dbpool/MaxConnections"               # ✗ Case mismatch
--subject "dbpool::max_connections"             # ✗ Wrong separator
--subject "dbpool/max-connections"              # ✗ Hyphens

Verification After Creating Claims

# Check all subjects use correct naming
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
  jq '.items[] | select(.subject | contains("dbpool")) | .subject'

# All should be:
# - Lowercase
# - Slash-separated
# - No special characters except underscores

Pro Tip: Use /aphoria-claims skill - it enforces naming conventions automatically.
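If you create claims manually, the format rules can also be checked mechanically before submitting. The sketch below is a hypothetical validator, not an Aphoria API; it assumes digits are allowed alongside lowercase letters and underscores, and that a subject has at least two segments.

```rust
/// Hypothetical check of a claim subject against the format rules:
/// lowercase only, slash-separated, underscores instead of spaces.
/// Assumption: ASCII digits are also permitted inside a segment.
fn is_valid_subject(subject: &str) -> bool {
    let segments: Vec<&str> = subject.split('/').collect();
    // Requires at least `component/property`, with no empty segments.
    segments.len() >= 2
        && segments.iter().all(|seg| {
            !seg.is_empty()
                && seg
                    .chars()
                    .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
        })
}
```

All four "Wrong Format" examples above are rejected by this check, while `dbpool/connection_string/password` passes.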


Step 4: Create Corpus Claims (Primary: Skills / Fallback: CLI)

Estimated Time:

  • With skills: 1-2 hours (recommended)
  • Manual CLI: 3-4 hours (fallback)

Why use skills:

  • 2-3x faster (automatic pattern analysis)
  • Naming conventions enforced automatically
  • Cross-project awareness (queries existing corpus)
  • Demonstrates autonomous flywheel

Available Skills: (Installed in ~/.claude/skills/)

| Skill | Use When | Purpose |
|-------|----------|---------|
| /aphoria-claims | Analyzing diffs, authoring claims | Extract claims from docs/diffs with enforced naming |
| /aphoria-suggest | Growing coverage, finding gaps | Suggest new claims from unclaimed observations |
| /aphoria-corpus-import | Importing external corpuses | Bulk import from wikis, RFCs, compliance docs |
| /aphoria-custom-extractor-creator | Day 3-4 (if needed) | Generate extractors for custom patterns |

Steps:

  • Use aphoria-claims skill to analyze source documents

    In Claude Code:

    /aphoria-claims
    
    "Read docs/sources/hikaricp-config.md and extract claims following the dbpool naming pattern (dbpool/property_name)."
    
  • Skill will:

    1. Analyze document for claimable patterns
    2. Query existing corpus for similar claims (cross-project awareness)
    3. Suggest claims with proper naming (lowercase, slash-separated)
    4. Generate aphoria corpus create commands with consistent format
    5. Enforce tail-path matching rules (last 2 segments for concept_path)
  • Review skill suggestions and execute commands

    # Example skill output:
    aphoria corpus create \
      --subject "dbpool/max_connections" \
      --predicate "required" \
      --value "true" \
      --explanation "..." \
      --authority "HikariCP" \
      --category "safety" \
      --tier 2
    
  • Repeat for all source documents

    • HikariCP: Extract 15-18 claims
    • PostgreSQL: Extract 5-7 claims
    • OWASP: Extract 5 claims
  • Verify naming consistency

    curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
      jq '.items[] | select(.subject | contains("dbpool")) | .subject'
    # All subjects should be lowercase, slash-separated
    

Estimated time with skills: 1-2 hours


📝 Option B: Manual CLI Workflow (FALLBACK)

Use only if:

  • Skills are unavailable
  • You need to understand the low-level CLI

Trade-offs:

  • 2-3x slower than skills
  • Manual naming consistency (error-prone)
  • No cross-project pattern awareness
  • Does not demonstrate autonomous flywheel

If using manual CLI, follow naming rules in Step 3 strictly.


Aphoria CLI Commands (Manual)

  • How to create claims manually

    # Template command (follow naming rules from Step 3!)
    aphoria corpus create \
      --subject "dbpool/{component}/{property}" \
      --predicate "{required|recommended|bounded|minimum|maximum}" \
      --value "{value}" \
      --explanation "{What} MUST {do} because {why}. If {violation}, {consequence}." \
      --authority "{Source Name}" \
      --category "{safety|security|performance|architecture}" \
      --tier {0-3}
    
    # Real example
    aphoria corpus create \
      --subject "dbpool/max_connections" \
      --predicate "required" \
      --value "true" \
      --explanation "Connection pools MUST have max_connections set to prevent unbounded growth that exhausts database connections" \
      --authority "HikariCP Configuration Guide" \
      --category "safety" \
      --tier 2
    

    Expected output:

    ✓ Created claim: vendor://dbpool/max_connections
    Subject: dbpool/max_connections
    Predicate: required
    Value: true
    Authority: HikariCP Configuration Guide
    Tier: 2 (Vendor)
    Category: safety
    
  • How to query the corpus

    # Query all corpus items
    curl 'http://localhost:18180/v1/aphoria/corpus?limit=100' | jq .
    
    # Query specific source
    curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor&limit=100' | jq .
    
    # Count items for dbpool
    curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
      jq '.items | map(select(.subject | startswith("dbpool"))) | length'
    

    Expected output (after creating claims):

    {
      "items": [
        {
          "subject": "dbpool/max_connections",
          "predicate": "required",
          "value": true,
          "explanation": "Connection pools MUST have max_connections set to prevent unbounded growth that exhausts database connections",
          "authority_source": "HikariCP Configuration Guide",
          "tier": 2,
          "category": "safety",
          "evidence": [],
          "tags": []
        },
        {
          "subject": "dbpool/connection_timeout",
          "predicate": "maximum",
          "value": 30,
          "explanation": "Connection timeout SHOULD NOT exceed 30 seconds. Long timeouts delay error detection and can cause thread starvation under load.",
          "authority_source": "HikariCP Configuration Guide",
          "tier": 2,
          "category": "performance",
          "evidence": [],
          "tags": []
        }
      ],
      "total_matching": 27,
      "page_size": 100,
      "offset": 0
    }
    
  • Understanding authority tiers

    Tier 0: Regulatory (RFCs, Standards) - Highest authority
    Tier 1: Clinical (OWASP, NIST) - Security/compliance
    Tier 2: Vendor (HikariCP, PostgreSQL docs) - Industry best practices
    Tier 3: Expert (Team policy) - Project-specific rules
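The tier table can be modeled as a small enum when scripting around the corpus. This is a sketch only: the `AuthorityTier` type is hypothetical, and the `outranks` rule assumes that a lower tier number always means higher authority (the table states this explicitly only for Tier 0).

```rust
/// Hypothetical model of the four authority tiers listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AuthorityTier {
    Regulatory, // Tier 0: RFCs, standards
    Clinical,   // Tier 1: OWASP, NIST
    Vendor,     // Tier 2: HikariCP, PostgreSQL docs
    Expert,     // Tier 3: team policy
}

impl AuthorityTier {
    /// Maps the numeric `--tier` value to a tier, rejecting values above 3.
    fn from_level(level: u8) -> Option<Self> {
        match level {
            0 => Some(Self::Regulatory),
            1 => Some(Self::Clinical),
            2 => Some(Self::Vendor),
            3 => Some(Self::Expert),
            _ => None,
        }
    }

    /// Numeric level as passed to `--tier`.
    fn level(self) -> u8 {
        match self {
            Self::Regulatory => 0,
            Self::Clinical => 1,
            Self::Vendor => 2,
            Self::Expert => 3,
        }
    }

    /// Assumption: a lower tier number carries more authority.
    fn outranks(self, other: Self) -> bool {
        self.level() < other.level()
    }
}
```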
    

Create All 27 Claims (Grouped by Category)

  • Safety Claims (10 claims)

    • dbpool/max_connections - required: true
    • dbpool/min_connections - minimum: 2
    • dbpool/connection_timeout - maximum: 30
    • dbpool/idle_timeout - required: true
    • dbpool/idle_timeout - bounded: true
    • dbpool/max_lifetime - required: true
    • dbpool/max_lifetime - default_value: 1800
    • dbpool/validation_timeout - maximum: 3
    • dbpool/leak_detection_threshold - recommended: true
    • dbpool/max_connections - bounded: true
  • Performance Claims (8 claims)

    • dbpool/max_connections/development - default_value: 10
    • dbpool/max_connections/production - recommended_range: 50-100
    • dbpool/checkout_timeout - default_value: 5
    • dbpool/validation/frequency - required: on_checkout
    • dbpool/connection_test_query - recommended: SELECT 1
    • dbpool/prefill - recommended: true (production)
    • dbpool/fair_queue - default_value: true
    • dbpool/metrics/enabled - recommended: true
  • Security Claims (5 claims)

    • dbpool/connection_string/password - must_not_be: plaintext
    • dbpool/connection_string/source - required: environment_variable
    • dbpool/tls/enabled - recommended: true (production)
    • dbpool/tls/certificate_validation - required: true
    • dbpool/credentials/rotation - recommended: true
  • Architecture Claims (4 claims)

    • dbpool/health_check/endpoint - required: true
    • dbpool/metrics/exposed - required: pool_size,active,idle,waiting
    • dbpool/error_handling/connection_failure - must: return_error_not_panic
    • dbpool/shutdown/graceful - required: true

Step 5: Verify Completion (2 min)

  • Run verification command

    curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
      jq '.items | map(select(.subject | startswith("dbpool"))) | length'
    

    Expected output: 25-30

  • Verify claim quality (spot check 5 random claims)

    curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
      jq '.items[] | select(.subject | startswith("dbpool")) | {subject, predicate, value, explanation}' | head -20
    

    Check for:

    • Clear WHAT + WHY + CONSEQUENCE in explanation
    • Correct authority attribution
    • Appropriate tier (1 for OWASP, 2 for vendor)

Day 1 Complete when verification shows 25-30 claims in corpus


📊 Additional Verification (Optional)

  • Inspect individual claim structure

    curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor&limit=5' | \
      jq '.items[] | select(.subject | contains("dbpool")) | {subject, predicate, value, explanation}'
    

    Expected format:

    {
      "subject": "dbpool/max_connections",
      "predicate": "required",
      "value": true,
      "explanation": "Connection pools MUST have max_connections... [WHAT/WHY/CONSEQUENCE]"
    }
    

Day 2: Implementation - Information Needed

🏗️ Project Structure

  • Directory layout

    applications/aphoria/dogfood/dbpool/
    ├── Cargo.toml           # Create this
    ├── src/
    │   ├── lib.rs           # Create this
    │   ├── config.rs        # Create this (with violations)
    │   ├── pool.rs          # Create this (with violations)
    │   ├── connection.rs    # Create this
    │   └── error.rs         # Create this
    └── tests/
        └── basic.rs         # Create this
    
  • Cargo.toml dependencies

    [dependencies]
    tokio = { version = "1", features = ["full"] }
    tokio-postgres = "0.7"
    serde = { version = "1", features = ["derive"] }
    thiserror = "1"
    
    [dev-dependencies]
    tempfile = "3"
    

🐛 Intentional Violations Guide

  • Violation 1: Unbounded max_connections

    // ❌ This violates: dbpool/max_connections required
    pub max_connections: Option<usize>,  // Set to None
    
  • Violation 2: Plaintext password

    // ❌ This violates: dbpool/connection_string/password must_not_be plaintext
    pub connection_string: String,  // Include "postgres://user:password@..."
    
  • Violation 3: Missing max_lifetime

    // ❌ This violates: dbpool/max_lifetime required
    pub max_lifetime: Option<Duration>,  // Set to None
    
  • Violation 4: Excessive timeout

    // ❌ This violates: dbpool/connection_timeout maximum 30
    pub connection_timeout: Duration,  // Initialize with Duration::from_secs(60) - too long
    
  • Violation 5: Zero min_connections

    // ❌ This violates: dbpool/min_connections minimum 2
    pub min_connections: usize,  // Initialize with 0 - should be >= 2
    
  • Violation 6: No validation

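    // ❌ This violates: dbpool/validation/frequency required on_checkout
    pub async fn get(&self) -> Result<Connection> {
        // No validation before returning the connection.
        // PoolError::Exhausted is a hypothetical variant for an empty pool.
        self.connections.pop().ok_or(PoolError::Exhausted)
    }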
    
  • Violation 7: No metrics

    // ❌ This violates: dbpool/metrics/enabled recommended
    // Don't create PoolMetrics struct
    
  • Verification: Code compiles

    cargo build
    # Should succeed (violations are semantic, not syntax)
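The configuration fragments above can be assembled into one compilable type. The sketch below is one possible layout, not the required one: the `PoolConfig` field set, the `Default` impl, and the placeholder host in the connection string are assumptions for illustration.

```rust
use std::time::Duration;

/// Hypothetical config type that carries violations 1-5 on purpose.
#[derive(Debug, Clone)]
pub struct PoolConfig {
    pub max_connections: Option<usize>, // Violation 1: unbounded when None
    pub connection_string: String,      // Violation 2: plaintext password
    pub max_lifetime: Option<Duration>, // Violation 3: missing when None
    pub connection_timeout: Duration,   // Violation 4: set above 30s
    pub min_connections: usize,         // Violation 5: set below 2
}

impl Default for PoolConfig {
    fn default() -> Self {
        Self {
            max_connections: None,
            // Placeholder host and database name, for illustration only.
            connection_string: "postgres://user:password@localhost/example".to_string(),
            max_lifetime: None,
            connection_timeout: Duration::from_secs(60),
            min_connections: 0,
        }
    }
}
```

Every value here is valid Rust, so `cargo build` succeeds; the problems are only visible when the values are compared against the corpus claims.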
    

Day 3: Scanning - Information Needed

⚙️ Configure Flywheel Before Scanning

CRITICAL: Read flywheel setup guide before proceeding:

cat docs/flywheel-setup.md

This covers:

  • Persistent vs ephemeral modes (you need persistent for pattern learning)
  • Pattern aggregation (how observations feed back into corpus)
  • Community corpus (cross-project pattern sharing)
  • Verification steps (how to confirm flywheel is working)

Time to read: 10-15 minutes

  • Update .aphoria/config.toml for flywheel mode

    Change from ephemeral to persistent:

    [episteme]
    mode = "persistent"  # Required for pattern learning
    
    [corpus]
    aggregation_enabled = true  # Enable flywheel
    

    See docs/flywheel-setup.md for complete configuration options.


🔍 Aphoria Scan Configuration

  • Verify .aphoria/config.toml is properly configured

    grep -A 3 '^\[\(episteme\|corpus\)\]' .aphoria/config.toml
    

    Should show:

    [episteme]
    mode = "persistent"
    
    [corpus]
    aggregation_enabled = true
    use_community = true
    include_vendor = true
    

    If not configured: See docs/flywheel-setup.md for setup instructions

  • How to run scan (with persistent mode)

    # Persistent scan (recommended - enables learning)
    aphoria scan --persist
    
    # With JSON output
    aphoria scan --persist --format json > scan-results-v1.json
    
    # With markdown report
    aphoria scan --persist --format markdown > SCAN-REPORT-v1.md
    
    # With table output (default)
    aphoria scan --persist --format table
    
    # Optional: Sync to community corpus
    aphoria scan --persist --sync
    

    Expected output (table format):

    ┌──────────────────────┬──────┬─────────┬──────────────────────────────────────────────────────┐
    │ File                 │ Line │ Verdict │ Explanation                                          │
    ├──────────────────────┼──────┼─────────┼──────────────────────────────────────────────────────┤
    │ src/config.rs        │ 12   │ BLOCK   │ max_connections is None - violates required field   │
    │                      │      │         │ (HikariCP: Tier 2, confidence: 0.95)                │
    ├──────────────────────┼──────┼─────────┼──────────────────────────────────────────────────────┤
    │ src/config.rs        │ 37   │ BLOCK   │ Plaintext password in connection_string              │
    │                      │      │         │ (OWASP A07: Tier 1, confidence: 0.98)               │
    ├──────────────────────┼──────┼─────────┼──────────────────────────────────────────────────────┤
    │ src/config.rs        │ 28   │ BLOCK   │ max_lifetime is None - violates required field      │
    │                      │      │         │ (HikariCP: Tier 2, confidence: 0.92)                │
    ├──────────────────────┼──────┼─────────┼──────────────────────────────────────────────────────┤
    │ src/config.rs        │ 45   │ FLAG    │ connection_timeout (60s) exceeds maximum (30s)      │
    │                      │      │         │ (HikariCP: Tier 2, confidence: 0.68)                │
    ├──────────────────────┼──────┼─────────┼──────────────────────────────────────────────────────┤
    │ src/config.rs        │ 21   │ FLAG    │ min_connections (0) below minimum (2)               │
    │                      │      │         │ (PostgreSQL: Tier 2, confidence: 0.62)              │
    ├──────────────────────┼──────┼─────────┼──────────────────────────────────────────────────────┤
    │ src/pool.rs          │ 67   │ FLAG    │ Missing validation before checkout                   │
    │                      │      │         │ (HikariCP: Tier 2, confidence: 0.58)                │
    └──────────────────────┴──────┴─────────┴──────────────────────────────────────────────────────┘
    
    Summary: 3 BLOCK, 3 FLAG, 0 PASS
    Scan completed in 0.24s
    
  • Understanding scan output

    {
      "findings": [
        {
          "claim": {
            "concept_path": "code://rust/dbpool/config/max_connections",
            "predicate": "value",
            "value": null,
            "file": "src/config.rs",
            "line": 12
          },
          "conflicts": [
            {
              "subject": "dbpool/max_connections",
              "predicate": "required",
              "value": true,
              "tier": 2,
              "confidence": 0.95,
              "authority": "HikariCP Configuration Guide"
            }
          ],
          "verdict": "BLOCK",
          "conflict_score": 0.95
        }
      ]
    }
    
  • How to interpret verdicts

    BLOCK:  Conflict score >= 0.7 (critical violations)
    FLAG:   Conflict score >= 0.5 and < 0.7 (errors)
    PASS:   Below thresholds (compliant)
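The thresholds translate directly into a small classification function. This is a sketch of the rule as stated above, not Aphoria's internal code; the `Verdict` enum and `verdict_for` function are hypothetical names.

```rust
/// Hypothetical verdict type mirroring the three outcomes above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Block,
    Flag,
    Pass,
}

/// Classifies a conflict score using the documented thresholds:
/// >= 0.7 is BLOCK, >= 0.5 is FLAG, anything lower is PASS.
fn verdict_for(conflict_score: f64) -> Verdict {
    if conflict_score >= 0.7 {
        Verdict::Block
    } else if conflict_score >= 0.5 {
        Verdict::Flag
    } else {
        Verdict::Pass
    }
}
```

Applied to the sample table, the scores 0.95, 0.98 and 0.92 give BLOCK, while 0.68, 0.62 and 0.58 give FLAG, which matches the summary line of 3 BLOCK and 3 FLAG.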
    

Verification Checklist

  • All intentional violations detected

    # Count BLOCK verdicts (should be 3)
    jq '.findings | map(select(.verdict == "BLOCK")) | length' scan-results-v1.json
    

    Expected output: 3

    # Count FLAG verdicts (should be 3)
    jq '.findings | map(select(.verdict == "FLAG")) | length' scan-results-v1.json
    

    Expected output: 3

    # Count total conflicts (should be 6-8)
    jq '.findings | length' scan-results-v1.json
    

    Expected output: 6 to 8

  • No false positives

    # Review all findings - none should be incorrect
    jq '.findings[] | {file: .claim.file, line: .claim.line, verdict, authorities: [.conflicts[].authority]}' scan-results-v1.json
    

    Expected: Every finding should correspond to an intentional violation. Review each one to ensure it's catching real issues.

  • Scan performance acceptable

    time aphoria scan
    

    Expected output:

    real    0m0.247s
    user    0m0.198s
    sys     0m0.045s
    

    Target: ≤0.3 seconds (ephemeral mode)

⚠️ Troubleshooting: When Scan Returns 0 Observations

Symptom: Scan completes but shows:

{
  "observations_extracted": 0,
  "observations_recorded": 0,
  "authority_conflicts": 0,
  "files_scanned": 7
}

Message: "No claims found. Run 'aphoria claims create' to author claims."

This message is MISLEADING. It appears when extractors find 0 patterns, not when corpus is empty.

Diagnosis Steps

  1. Verify claims exist in corpus (they should - you created 27 in Day 1):

    curl 'http://localhost:18180/v1/aphoria/corpus' | \
      jq '[.items[] | select(.subject | contains("dbpool"))] | length'
    # Expected: 27
    
  2. Check if extractors are enabled:

    grep "enabled =" .aphoria/config.toml
    

    CRITICAL: If you see:

    [extractors]
    enabled = ["imports", "struct_field", "const_value", ...]
    

    These are fictional extractor names that don't exist in Aphoria!

    Fix: Remove the entire enabled = [...] array from config.toml:

    # Edit .aphoria/config.toml and DELETE the enabled array
    # This allows all 42 built-in extractors to run
    
  3. Verify built-in extractor coverage:

    Built-in extractors detect security patterns (TLS, secrets, injection) but NOT struct field validation.

    # Re-scan with all built-in extractors
    aphoria scan --format json | jq '.summary'
    

    Expected: Some violations detected (plaintext password, excessive timeout)

    Still 0 observations? Built-in extractors don't cover your violation types.

Solution: Build Custom Extractors

Why this happens: Aphoria's 42 built-in extractors focus on security patterns (TLS, JWT, secrets, injection, rate limits). They don't detect library API design patterns like:

  • Optional struct fields (Option<usize> when required)
  • Missing struct fields (no max_lifetime field)
  • Type mismatches (String when SecretString expected)

Solution: Create declarative extractors for your patterns.

Guide: See complete walkthrough at:

cat docs/CUSTOM-EXTRACTOR-GUIDE.md

Time estimate: 2-3 hours to create all 7 extractors

Quick example - Add to .aphoria/config.toml:

[[extractors.declarative]]
name = "dbpool_max_connections_optional"
description = "Detects Option<usize> for max_connections (should be required)"
languages = ["rust"]
pattern = 'pub\s+max_connections:\s+Option<(?:usize|u64|u32)>'

[extractors.declarative.claim]
subject = "dbpool/max_connections"
predicate = "is_option"
value = { boolean = true }

confidence = 0.92
source = "dogfood"

Verification after adding extractors:

aphoria scan --format json | jq '.summary.observations_extracted'
# Expected: 7 (one per custom extractor)
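To sanity-check which source lines the example pattern would catch, the regex can be approximated by hand with only the standard library. This is an illustrative stand-in for `pub\s+max_connections:\s+Option<(?:usize|u64|u32)>`, not how Aphoria evaluates declarative extractors; both function names are hypothetical.

```rust
/// Consumes at least one whitespace character (the regex `\s+`).
fn skip_whitespace1(s: &str) -> Option<&str> {
    let trimmed = s.trim_start();
    if trimmed.len() < s.len() {
        Some(trimmed)
    } else {
        None
    }
}

/// Matches `\s+max_connections:\s+Option<` and returns what follows.
fn after_option_open(rest: &str) -> Option<&str> {
    let rest = skip_whitespace1(rest)?;
    let rest = rest.strip_prefix("max_connections:")?;
    let rest = skip_whitespace1(rest)?;
    rest.strip_prefix("Option<")
}

/// Hand-written approximation of the example extractor pattern.
fn matches_optional_max_connections(line: &str) -> bool {
    // The regex is unanchored, so try every occurrence of "pub".
    line.match_indices("pub").any(|(idx, _)| {
        after_option_open(&line[idx + "pub".len()..]).is_some_and(|inner| {
            // (?:usize|u64|u32) followed by the closing '>'
            ["usize", "u64", "u32"]
                .iter()
                .any(|ty| inner.strip_prefix(*ty).is_some_and(|r| r.starts_with('>')))
        })
    })
}
```

Note that the pattern requires whitespace after the colon, so a line written as `pub max_connections:Option<usize>` would not be detected.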

Day 4: Remediation - Information Needed

🔧 Fix Workflow

  • Git workflow for incremental fixes

    # Create branch for dogfood
    git checkout -b dogfood/dbpool
    
    # Make fix
    # Edit src/config.rs
    
    # Commit with descriptive message
    git add src/config.rs
    git commit -m "fix(dbpool): set max_connections to prevent unbounded growth"
    
    # Tag milestone
    git tag v0.2.0-fix-unbounded
    
    # Re-scan
    aphoria scan --format json > scan-results-v2.json
    
    # Verify improvement
    jq '.findings | length' scan-results-v2.json
    # Should decrease after each fix
    
  • Fix templates for each violation

    Fix 1: Set max_connections

    // Before
    pub max_connections: Option<usize>,
    
    // After
    pub max_connections: usize,  // Required field
    
    impl Default for PoolConfig {
        fn default() -> Self {
            Self {
                max_connections: 10,  // Development default
                // ...
            }
        }
    }
    

    Fix 2: Environment variable for password

    // Before
    pub connection_string: String,  // "postgres://user:password@..."
    
    // After
    pub fn from_env() -> Result<Self> {
        let connection_string = std::env::var("DATABASE_URL")
            .map_err(|_| PoolError::MissingConnectionString)?;
    
        // Reject key=value DSNs that embed a password; note that URL-style
        // credentials (user:password@host) are not caught by this check
        if connection_string.contains("password=") {
            return Err(PoolError::PlaintextPassword);
        }
    
        Ok(Self {
            connection_string,
            // ...
        })
    }
    

    Fix 3: Set max_lifetime

    // Before
    pub max_lifetime: Option<Duration>,
    
    // After
    pub max_lifetime: Duration,
    
    impl Default for PoolConfig {
        fn default() -> Self {
            Self {
                max_lifetime: Duration::from_secs(1800),  // 30 minutes
                // ...
            }
        }
    }
    
  • Progressive scan results

    # After each fix, save new scan results (replace {N} with the scan number: 2, 3, ...)
    aphoria scan --format json > scan-results-v{N}.json
    
    # Track improvement
    echo "Version,Conflicts" > improvement.csv
    for i in {1..6}; do
      count=$(jq '.findings | length' scan-results-v${i}.json)
      echo "v${i},${count}" >> improvement.csv
    done
    
    # Expected: the count decreases with every fix, from 8 in the initial scan to 0 in the final scan
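
The three fix templates in this section can be assembled into one compilable sketch. This is an illustration under stated assumptions, not the dbpool implementation: the `PoolError` variant set, the `from_connection_string` constructor, and the `has_inline_password` helper are hypothetical names introduced here, and the defaults are applied in the constructor rather than in `impl Default` to keep the example short. The helper also extends the Fix 2 check to URL-style credentials, which the `contains("password=")` test alone would miss.

```rust
use std::time::Duration;

/// Errors named in the fix templates; the exact variant set is an assumption.
#[derive(Debug, PartialEq)]
pub enum PoolError {
    MissingConnectionString,
    PlaintextPassword,
}

#[derive(Debug)]
pub struct PoolConfig {
    pub connection_string: String,
    pub max_connections: usize, // Fix 1: required, no Option
    pub max_lifetime: Duration, // Fix 3: required, no Option
}

/// Hypothetical helper: true if the string embeds a password, either as
/// `password=...` (key=value DSN) or as `user:password@host` (URL userinfo).
pub fn has_inline_password(conn: &str) -> bool {
    if conn.contains("password=") {
        return true;
    }
    // For URLs, inspect only the authority part (after "://", before path or query).
    let Some((_, rest)) = conn.split_once("://") else {
        return false;
    };
    let authority = rest
        .split(|c| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or(rest);
    match authority.rsplit_once('@') {
        Some((userinfo, _host)) => userinfo.contains(':'),
        None => false,
    }
}

impl PoolConfig {
    /// Validates the connection string and applies the defaults from Fix 1 and Fix 3.
    pub fn from_connection_string(connection_string: String) -> Result<Self, PoolError> {
        if has_inline_password(&connection_string) {
            return Err(PoolError::PlaintextPassword);
        }
        Ok(Self {
            connection_string,
            max_connections: 10,                     // development default
            max_lifetime: Duration::from_secs(1800), // 30 minutes
        })
    }

    /// Fix 2: read the connection string from DATABASE_URL.
    pub fn from_env() -> Result<Self, PoolError> {
        let connection_string =
            std::env::var("DATABASE_URL").map_err(|_| PoolError::MissingConnectionString)?;
        Self::from_connection_string(connection_string)
    }
}
```

Splitting validation out of `from_env` lets the check be unit-tested without touching the process environment. A connection string with a user but no password, such as `postgres://user@localhost:5432/db`, is accepted. The password would then have to come from a separate channel, which this sketch does not cover.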
    

Day 5: Documentation - Information Needed

📝 Success Story Template

  • Structure to follow
    # Aphoria Success Story: dbpool
    
    ## Executive Summary
    - What we built
    - What Aphoria caught
    - What was prevented
    
    ## The Challenge
    - Connection pools are safety-critical
    - Misconfigurations cause P0 incidents
    - Best practices exist but are easy to miss
    
    ## Violations Detected
    For each violation:
    - What the code did wrong
    - What Aphoria detected
    - What would have happened in production
    - Estimated cost of incident
    
    ## Before/After Comparison
    - Screenshots of initial scan (8 violations)
    - Progressive fixes
    - Final clean scan (0 violations)
    
    ## Prevented Incidents
    - Connection exhaustion outage (est. $50K)
    - Security audit finding (compliance risk)
    - Production debugging hours (20 engineer-hours)
    
    ## Metrics
    - Detection accuracy: 100% (8/8 violations found)
    - False positives: 0
    - Scan performance: 0.25s
    - Time to remediation: 4 days
    
    ## Conclusion
    - Aphoria caught all violations before first deployment
    - Production-ready code in 5 days
    - Clear ROI demonstration
    

🎬 Demo Preparation

  • Demo script template

    #!/bin/bash
    # demo.sh - Live demonstration of Aphoria dogfood
    
    echo "=== Aphoria Dogfood: Database Connection Pool ==="
    echo
    
    echo "Step 1: Initial state (8 violations)"
    git checkout v0.1.0-violations
    aphoria scan --format table
    read -p "Press enter to see first fix..."
    
    echo "Step 2: Fix unbounded connections (CRITICAL)"
    git checkout v0.2.0-fix-unbounded
    git diff v0.1.0-violations src/config.rs
    aphoria scan --format table
    read -p "Press enter to continue..."
    
    # ... repeat for each fix
    
    echo "Final: Production ready (0 violations)"
    git checkout v1.0.0-production-ready
    aphoria scan --format table
    echo
    echo "✅ All violations fixed - production ready!"
    
  • Screenshots needed

    • Initial scan showing 8 violations
    • Each fix with before/after code
    • Progressive violation count graph
    • Final clean scan
    • Markdown report example
    • JSON output example

📊 Metrics to Collect

  • Scan performance

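    # Run 10 scans, collect wall-clock seconds (bash: TIMEFORMAT=%R prints only the elapsed time)
    TIMEFORMAT=%R
    for i in {1..10}; do
      { time aphoria scan > /dev/null 2>&1; } 2>&1
    done > timings.txt
    
    # Calculate average
    awk '{ sum += $1 } END { if (NR) printf "average: %.3fs over %d runs\n", sum / NR, NR }' timings.txt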
    
  • Detection accuracy

    True Positives: 8 (all intentional violations detected)
    False Positives: 0 (no incorrect violations)
    False Negatives: 0 (no missed violations)
    
    Precision: 8/8 = 100%
    Recall: 8/8 = 100%
    
  • Lines of code

    # Count lines in src/
    find src -name "*.rs" -exec wc -l {} + | tail -1
    # Expected: ~600 lines
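
The precision and recall figures under "Detection accuracy" follow from the true-positive, false-positive, and false-negative counts. A minimal sketch of that arithmetic, with hypothetical function names that are not part of Aphoria:

```rust
/// Precision = TP / (TP + FP); returns None when nothing was flagged.
pub fn precision(tp: u32, fp: u32) -> Option<f64> {
    let flagged = tp + fp;
    (flagged > 0).then(|| f64::from(tp) / f64::from(flagged))
}

/// Recall = TP / (TP + FN); returns None when there was nothing to find.
pub fn recall(tp: u32, fn_count: u32) -> Option<f64> {
    let actual = tp + fn_count;
    (actual > 0).then(|| f64::from(tp) / f64::from(actual))
}
```

With the counts listed above, `precision(8, 0)` and `recall(8, 0)` both return `Some(1.0)`. If one intentional violation were missed, `recall(7, 1)` would return `Some(0.875)`.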
    

Communication & Support

📞 Who to Contact

  • For Aphoria CLI issues

    • Check: applications/aphoria/README.md
    • Debug logs: RUST_LOG=aphoria=debug aphoria scan
  • For API issues

    • Check: API is running on http://localhost:18180
    • Health check: curl http://localhost:18180/health
    • Logs: /tmp/stemedb-api.log
  • For corpus issues

    • Verify corpus DB: ls ~/.aphoria/corpus-db/
    • Query API: curl 'http://localhost:18180/v1/aphoria/corpus'

🐛 Common Issues & Solutions

  • Aphoria not found

    # Build and install
    cd applications/aphoria
    cargo build --release
    sudo cp target/release/aphoria /usr/local/bin/
    
  • Corpus empty after creating claims

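    # Verify API is using correct corpus DB (Linux: read the process environment;
    # plain `ps aux` does not show environment variables)
    tr '\0' '\n' < "/proc/$(pgrep -n -f stemedb-api)/environ" | grep STEMEDB_CORPUS_DB_DIR
    # Should show: STEMEDB_CORPUS_DB_DIR=/home/jml/.aphoria/corpus-db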
    
    # If not, restart API with env var
    
  • Scan finds no violations

    # Verify extractors are working
    RUST_LOG=aphoria=debug aphoria scan
    # Check logs for extractor output
    
    # Verify claims exist in corpus (-g stops curl from treating [] as a glob range)
    curl -g 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
      jq '.items[] | select(.subject | contains("dbpool"))'
    

Final Deliverables Checklist

📦 Required Files

  • plan.md - The master plan
  • CHECKLIST.md - This checklist
  • src/ - Implementation code
  • tests/ - Test suite
  • docs/sources/ - Authority source documents
  • docs/SUCCESS-STORY.md - Case study
  • docs/DEMO-SCRIPT.md - Live demo guide
  • demo.sh - Automated demo script
  • scan-results-v1.json through scan-results-v6.json - Progressive scans
  • SCAN-REPORT-v1.md - Initial markdown report
  • SCAN-REPORT-FINAL.md - Clean scan report
  • screenshots/ - Visual evidence
  • Updated applications/aphoria/roadmap.md

Success Criteria

  • 25-30 claims in corpus
  • All claims queryable via API
  • 7-8 violations detected in initial scan
  • 100% detection accuracy (no false positives/negatives)
  • Scan performance ≤0.3s
  • Progressive fixes reduce violations to 0
  • Final code is production-ready
  • Comprehensive documentation completed
  • Demo materials prepared

Status: 🎯 READY TO START

Next Step: Begin Day 1 - Fetch authority sources and create claims

Estimated Time: 4-6 hours for Day 1