# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

# Aphoria Dogfood Project: Database Connection Pool (`dbpool`)

**Purpose:** Demonstrate Aphoria's code-level truth linting by building a PostgreSQL connection pool library with intentional violations, then using Aphoria to detect and guide remediation.

**Status:** 5-day dogfood project (see `plan.md` for detailed schedule)

**Parent Project:** This is a dogfood demonstration within the larger Aphoria/StemeDB project located at `/home/jml/Workspace/stemedb/`

## Critical Context

### What Makes This Project Special

This is **not** a normal implementation project. The workflow is:

1. **Day 1:** Create 25-30 authoritative claims in corpus database (HikariCP, PostgreSQL, OWASP best practices)
2. **Day 2:** Write working code that **intentionally violates** 7-8 of those claims
3. **Day 3:** Run `aphoria scan` to verify all violations are detected
4. **Day 4:** Fix violations incrementally, re-scanning after each fix
5. **Day 5:** Document the success story with before/after evidence

**The violations are intentional and educational.** When writing code on Day 2, we **want** to violate the claims to demonstrate detection.

### The Two Modes

- **Violation Mode (Day 2):** Write code that deliberately violates best practices
- **Remediation Mode (Day 4):** Fix code to comply with all claims

Always check `plan.md` to understand which mode we're in.

## Quick Start

### Pre-Flight Check

Before starting the dogfood exercise, validate your environment:

```bash
./scripts/validate-setup.sh
```

This checks all prerequisites (Aphoria CLI, API running, corpus DB, extractors working) and shows you exactly what to fix if anything is missing.

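
To make the pre-flight idea concrete, here is a minimal sketch of the kind of checks described above. It is **not** the contents of `scripts/validate-setup.sh`: the function names `check` and `preflight` are hypothetical, the corpus DB fallback path is an assumption, and the extractor check is left out because this file does not say how extractors are verified.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight sketch; NOT the contents of scripts/validate-setup.sh.

# Run a command quietly, print PASS/FAIL for it, and return non-zero on failure.
check() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $name"
  else
    echo "FAIL  $name"
    return 1
  fi
}

# Run every check, counting failures instead of stopping at the first one.
preflight() {
  local failures=0
  check "aphoria CLI on PATH" command -v aphoria || failures=$((failures + 1))
  check "stemedb-api health" curl -fsS --max-time 2 http://localhost:18180/health || failures=$((failures + 1))
  # Assumed default location; override by exporting STEMEDB_CORPUS_DB_DIR.
  check "corpus DB directory" test -d "${STEMEDB_CORPUS_DB_DIR:-$HOME/.aphoria/corpus-db}" || failures=$((failures + 1))
  echo "$failures check(s) failed"
  [ "$failures" -eq 0 ]
}
```

Calling `preflight` at the end of such a script prints one line per check and returns non-zero if any check failed, so every missing prerequisite is reported in a single run.
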
### Learn Claim Extraction

Read the complete walkthrough before creating claims:

```bash
cat docs/claim-extraction-example.md
```

This teaches you how to extract claims from prose documentation with:

- Complete worked example (HikariCP paragraph → 3 claims)
- Decision framework (what deserves to be a claim vs noise)
- Anti-patterns to avoid (too generic, no consequences)

**Time:** 15-20 minutes | **Worth it:** Prevents creating garbage claims

---

## Development Commands

### Corpus Management (Day 1)

```bash
# Create a claim in the corpus database
aphoria corpus create \
  --subject "dbpool/{component}/{property}" \
  --predicate "{required|recommended|bounded|minimum|maximum}" \
  --value "{value}" \
  --explanation "{What} MUST {do} because {why}. If {violation}, {consequence}." \
  --authority "{Source Name}" \
  --category "{safety|security|performance|architecture}" \
  --tier {0-3}

# Query corpus via API (requires stemedb-api running on :18180)
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor&limit=100' | jq .
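
# Worked example of the create template above, filled in for the
# `dbpool/connection_timeout` maximum-30 claim from the Expected Violations table.
# The explanation wording, the authority string, and the category are
# illustrative assumptions, not text taken from the corpus.
aphoria corpus create \
  --subject "dbpool/connection_timeout" \
  --predicate "maximum" \
  --value "30" \
  --explanation "connection_timeout MUST be at most 30 seconds because callers block while waiting for a connection. If the timeout is longer, requests queue behind an exhausted pool instead of failing fast." \
  --authority "HikariCP" \
  --category "performance" \
  --tier 2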

# Count claims for dbpool
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
  jq '.items | map(select(.subject | startswith("dbpool"))) | length'
```

### Build & Test (Day 2+)

```bash
# Build library
cargo build

# Run tests
cargo test

# Build release
cargo build --release
```

### Aphoria Scanning (Day 3+)

**Before Day 3:** Configure flywheel mode (see `docs/flywheel-setup.md`)

```bash
# Persistent scan (enables pattern learning)
aphoria scan --persist

# Persistent with sync (contributes to community corpus)
aphoria scan --persist --sync

# Ephemeral scan (fast, in-memory, ~0.25s - no learning)
aphoria scan

# JSON output (for programmatic analysis)
aphoria scan --format json > scan-results-v1.json

# Markdown report (human-readable)
aphoria scan --format markdown > SCAN-REPORT-v1.md

# Table output (default, terminal-friendly)
aphoria scan --format table
```

### Analyze Scan Results

```bash
# Count violations by severity
jq '.findings | group_by(.verdict) | map({verdict: .[0].verdict, count: length})' scan-results-v1.json

# Count BLOCK verdicts (critical violations)
jq '.findings | map(select(.verdict == "BLOCK")) | length' scan-results-v1.json

# List all findings with explanations
jq '.findings[] | {file, line, verdict, explanation}' scan-results-v1.json
```

## Architecture

### File Structure

```
applications/aphoria/dogfood/dbpool/
├── plan.md                          # Master plan with 5-day schedule
├── CHECKLIST.md                     # Execution checklist with templates
├── CLAUDE.md                        # This file
├── Cargo.toml                       # Rust library manifest
├── .aphoria/
│   └── config.toml                  # Aphoria scan configuration
├── src/
│   ├── lib.rs                       # Library root
│   ├── config.rs                    # PoolConfig (violations → fixes)
│   ├── pool.rs                      # ConnectionPool implementation
│   ├── connection.rs                # Connection wrapper
│   ├── metrics.rs                   # Pool metrics (added in Day 4)
│   └── error.rs                     # Error types
├── tests/
│   └── basic.rs                     # Functionality tests
├── docs/
│   ├── claim-extraction-example.md  # COMPLETE WALKTHROUGH (read this first!)
│   ├── flywheel-setup.md            # Flywheel configuration guide
│   ├── sources/                     # Authority source documents
│   │   ├── hikaricp-config.md
│   │   ├── postgresql-pooling.md
│   │   └── owasp-credentials.md
│   ├── SUCCESS-STORY.md             # Case study (Day 5)
│   └── DEMO-SCRIPT.md               # Demo guide (Day 5)
├── scripts/
│   └── validate-setup.sh            # Pre-flight validator
└── scan-results-v*.json             # Progressive scan results
```

### Expected Violations (Day 2-3)

| # | Violation | Severity | Claim Violated |
|---|-----------|----------|----------------|
| 1 | Unbounded `max_connections: Option` | BLOCK | `dbpool/max_connections` required |
| 2 | Plaintext password in connection string | BLOCK | `dbpool/connection_string/password` must_not_be plaintext |
| 3 | Missing `max_lifetime` | BLOCK | `dbpool/max_lifetime` required |
| 4 | Excessive `connection_timeout` (60s vs 30s max) | FLAG | `dbpool/connection_timeout` maximum 30 |
| 5 | Zero `min_connections` (should be ≥2) | FLAG | `dbpool/min_connections` minimum 2 |
| 6 | No connection validation before checkout | FLAG | `dbpool/validation/frequency` required on_checkout |
| 7 | No metrics exposed | WARNING | `dbpool/metrics/enabled` recommended |
| 8 | No leak detection threshold | WARNING | `dbpool/leak_detection_threshold` recommended |

**Target:** 8/8 violations detected (100% accuracy), 0 false positives

## Dependencies & Environment

### Required Services

- **stemedb-api** running on `:18180` with corpus database
  ```bash
  STEMEDB_CORPUS_DB_DIR=/home/jml/.aphoria/corpus-db target/release/stemedb-api
  ```
- **Aphoria CLI** installed and working
  ```bash
  aphoria --version  # Should show version
  ```

### Rust Dependencies

```toml
[dependencies]
tokio = { version = "1", features = ["full"] }
tokio-postgres = "0.7"
serde = { version = "1", features = ["derive"] }
thiserror = "1"
```

## Authority Tiers

Claims in the corpus use these authority tiers:

- **Tier 0:** Regulatory (RFCs, Standards) - Highest authority
- **Tier 1:** Clinical (OWASP, NIST) - Security/compliance
- **Tier 2:** Vendor (HikariCP, PostgreSQL) - Industry best practices
- **Tier 3:** Expert (Team policy) - Project-specific rules

Our claims use **Tier 1** (OWASP A07) for security and **Tier 2** (HikariCP, PostgreSQL) for safety/performance.

## Git Workflow

### Progressive Tagging (Day 4)

Each fix gets a tag for easy demo navigation:

```bash
# Initial state with violations
git tag v0.1.0-violations

# After each fix
git tag v0.2.0-fix-unbounded       # Fixed max_connections
git tag v0.3.0-fix-credentials     # Fixed plaintext password
git tag v0.4.0-fix-lifetime        # Fixed max_lifetime
git tag v0.5.0-fix-timeouts        # Fixed timeouts
git tag v0.6.0-fix-validation      # Added validation
git tag v0.7.0-fix-observability   # Added metrics

# Final state
git tag v1.0.0-production-ready    # All violations fixed
```

### Commit Messages

```bash
# Format: fix(dbpool): -
git commit -m "fix(dbpool): set max_connections to prevent unbounded growth

Aphoria detected missing max_connections configuration which would allow
unbounded connection growth and exhaust database connections under load.

Added required max_connections field with development default of 10.

Resolves: BLOCK violation from HikariCP claim (Tier 2)"
```

## Success Metrics

### Objective Targets

| Metric | Target | How to Measure |
|--------|--------|----------------|
| Claims Extracted | 25-30 | `curl corpus API \| jq '.total_matching'` |
| Violations Detected | 7-8 | `jq '.findings \| length' scan-results-v1.json` |
| Detection Accuracy | 100% | All intentional violations found, 0 false positives |
| Scan Performance | ≤0.3s | `time aphoria scan` (ephemeral mode) |
| Final Scan Result | 0 conflicts | `scan-v6.json` shows all PASS |

### Qualitative Outcomes

- **Compelling Story:** "Aphoria prevented 3 potential P0 incidents before first deployment"
- **Educational Value:** Each violation includes explanation of real-world consequence
- **Production Ready:** Final code is genuinely production-worthy (can be extracted as real library)
- **Demonstrable:** 5-minute demo shows clear value proposition

## Critical Rules

1. **Read `plan.md` First:** Always check the plan to understand current phase and goals
2. **Intentional Violations:** Day 2 involves deliberately writing bad code (it's educational)
3. **Progressive Fixes:** Day 4 fixes violations one at a time with re-scans after each
4. **Evidence Collection:** Save all scan results (`scan-results-v*.json`) for documentation
5. **Authority Attribution:** Every claim must cite specific authority source (HikariCP docs, PostgreSQL guide, OWASP)

## Common Tasks

### Start a New Day

```bash
# 1. Read the plan
cat plan.md | grep "^### Day X"

# 2. Check current status
cat CHECKLIST.md | grep "^### Day X"

# 3. Verify environment
aphoria --version
curl http://localhost:18180/health
```

### Verify Corpus Setup

```bash
# Count dbpool claims
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
  jq '.items | map(select(.subject | startswith("dbpool"))) | length'
# Should return: 25-30 after Day 1
```

### Check Violation Status

```bash
# Current violation count
aphoria scan --format json | jq '.findings | length'

# Breakdown by severity
aphoria scan --format json | \
  jq '.findings | group_by(.verdict) | map({verdict: .[0].verdict, count: length})'
```

## Troubleshooting

### Aphoria not found

```bash
cd /home/jml/Workspace/stemedb/applications/aphoria
cargo build --release
sudo cp target/release/aphoria /usr/local/bin/
```

### Corpus empty after creating claims

```bash
# Verify API is using correct corpus DB
# (plain `ps aux` omits environment variables; the `e` flag prints them, `ww` prevents truncation)
ps auxeww | grep stemedb-api
# Should show: STEMEDB_CORPUS_DB_DIR=/home/jml/.aphoria/corpus-db
# If not, restart API with correct env var
```

### Scan finds no violations

```bash
# Enable debug logging
RUST_LOG=aphoria=debug aphoria scan

# Verify claims exist
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
  jq '.items[] | select(.subject | contains("dbpool"))'
```

## Documentation Requirements

All documentation must include:

- **Before/After Evidence:** Screenshots of violations → clean scans
- **Cost Analysis:** Estimated impact of prevented incidents ($50K connection exhaustion, 20 engineer-hours debugging)
- **Metrics:** Detection accuracy (100%), scan performance (≤0.3s), false positive rate (0%)
- **Authority Attribution:** Every claim linked to specific source (HikariCP wiki page, PostgreSQL docs, OWASP A07)

## Related Documentation

- `plan.md` - Detailed 5-day implementation plan
- `CHECKLIST.md` - Execution checklist with templates and examples
- `/home/jml/Workspace/stemedb/CLAUDE.md` - Parent project guidance
- `/home/jml/Workspace/stemedb/applications/aphoria/README.md` - Aphoria documentation