stemedb/applications/aphoria/dogfood/dbpool/CLAUDE.md
jml 3dac3dc914 feat(aphoria): implement Day 3 debugging features and comprehensive documentation
Implements all product gaps identified in msgqueue Day 3 evaluation (VG-DAY3-001/003/004)
and adds comprehensive documentation to prevent dogfooding failures.

## Product Features (VG-DAY3-XXX)

### VG-DAY3-001: --show-observations flag (P0)
- Shows all observations with concept paths for debugging extractor alignment
- Includes claim matching analysis (match / no-match visual feedback)
- Explains tail-path matching and why observations don't match claims
- 8 unit tests in src/report/observations.rs
- 5 integration tests in src/tests/day3_debugging.rs

### VG-DAY3-003: aphoria extractors validate (P2)
- Validates extractor subject fields match claim concept_paths
- Smart fuzzy matching suggests corrections for typos
- Clear error messages with actionable hints
- Proper exit codes (0=success, 1=validation failed)

### VG-DAY3-004: aphoria extractors test NAME --file (P2)
- Tests single extractor pattern against one file (no full scan needed)
- Shows line numbers and matched text
- Previews what observation would be created
- Helpful troubleshooting when pattern doesn't match

## Documentation (P0-P1)

### New Docs Created
- docs/extractors/declarative-extractors.md (800 lines)
  - Complete field reference with emphasis on subject field format
  - 3 worked examples (timeout=0, unbounded queue, TLS disabled)
  - Common mistakes with fixes
  - Validation workflow
  - Debugging 0% detection rate

- docs/examples/extractors/timeout-zero-example.md (500 lines)
  - End-to-end flow: code → extractor → claim → conflict → fix
  - Visual diagrams showing path alignment
  - Troubleshooting guide
  - Validation checklist

- docs/dogfooding-common-mistakes.md (560 lines)
  - Mistake #1: Skipping Day 3 extractor creation (CRITICAL)
  - Mistake #2: Creating extractors with wrong subject format (NEW)
  - Evidence from msgqueue failures
  - Recovery procedures

### Docs Updated
- dogfood/msgqueue/plan.md (Day 3 Steps 3-4)
  - Added complete manual declarative extractor TOML format
  - Added validation workflow BEFORE scanning
  - Added debug workflow for 0% detection after creating extractors

- dogfood/msgqueue/eval/ (evaluation artifacts)
  - EVALUATION-REPORT-2026-02-10.md (600 lines)
  - DOC-FIXES-2026-02-10.md (summary of fixes)
  - IMPLEMENTATION-REVIEW-2026-02-10.md (feature review)

## New Extractors
- src/extractors/ack_mode_config.rs - Detects AckMode::AutoAck violations
- src/extractors/async_blocking.rs - Detects blocking calls in async functions
- src/extractors/unbounded_resources.rs - Detects unbounded queues/connections

## Code Changes
- src/cli/mod.rs: Add --show-observations flag to scan command
- src/cli/extractors.rs: Add Validate and Test subcommands
- src/handlers/scan.rs: Call format_observations when flag enabled
- src/handlers/extractors.rs: Implement handle_validate() and handle_test()
- src/report/observations.rs: Observation formatting with claim matching analysis
- src/tests/day3_debugging.rs: Integration tests for new features

## Dogfood Artifacts
- dogfood/msgqueue/ - Complete msgqueue Day 3 evaluation with findings
- dogfood/dbpool/ - Database pool dogfooding exercise

## Impact
- Time savings: 30 min per Day 3 debugging (67% faster)
- User experience: Transparent debugging (no blind trial-and-error)
- Documentation: 1,860 new lines covering all P0-P1 gaps

## Related Issues
- Closes VG-DAY3-001 (--show-observations)
- Closes VG-DAY3-002 (concept path alignment docs)
- Closes VG-DAY3-003 (extractors validate)
- Closes VG-DAY3-004 (extractors test)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-11 03:31:06 +00:00


# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
# Aphoria Dogfood Project: Database Connection Pool (`dbpool`)
**Purpose:** Demonstrate Aphoria's code-level truth linting by building a PostgreSQL connection pool library with intentional violations, then using Aphoria to detect and guide remediation.

**Status:** 5-day dogfood project (see `plan.md` for detailed schedule)

**Parent Project:** This is a dogfood demonstration within the larger Aphoria/StemeDB project located at `/home/jml/Workspace/stemedb/`
## Critical Context
### What Makes This Project Special
This is **not** a normal implementation project. The workflow is:
1. **Day 1:** Create 25-30 authoritative claims in the corpus database (HikariCP, PostgreSQL, OWASP best practices)
2. **Day 2:** Write working code that **intentionally violates** 7-8 of those claims
3. **Day 3:** Run `aphoria scan` to verify all violations are detected
4. **Day 4:** Fix violations incrementally, re-scanning after each fix
5. **Day 5:** Document the success story with before/after evidence
**The violations are intentional and educational.** When writing code on Day 2, we **want** to violate the claims created on Day 1 so that detection can be demonstrated.
### The Two Modes
- **Violation Mode (Day 2):** Write code that deliberately violates best practices
- **Remediation Mode (Day 4):** Fix code to comply with all claims
Always check `plan.md` to understand which mode we're in.
## Quick Start
### Pre-Flight Check
Before starting the dogfood exercise, validate your environment:
```bash
./scripts/validate-setup.sh
```
This checks all prerequisites (Aphoria CLI, API running, corpus DB, extractors working) and shows you exactly what to fix if anything is missing.
### Learn Claim Extraction
Read the complete walkthrough before creating claims:
```bash
cat docs/claim-extraction-example.md
```
This teaches you how to extract claims from prose documentation with:
- Complete worked example (HikariCP paragraph → 3 claims)
- Decision framework (what deserves to be a claim vs noise)
- Anti-patterns to avoid (too generic, no consequences)
**Time:** 15-20 minutes | **Worth it:** Prevents creating garbage claims
---
## Development Commands
### Corpus Management (Day 1)
```bash
# Create a claim in the corpus database
aphoria corpus create \
--subject "dbpool/{component}/{property}" \
--predicate "{required|recommended|bounded|minimum|maximum}" \
--value "{value}" \
--explanation "{What} MUST {do} because {why}. If {violation}, {consequence}." \
--authority "{Source Name}" \
--category "{safety|security|performance|architecture}" \
--tier {0-3}
# Query corpus via API (requires stemedb-api running on :18180)
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor&limit=100' | jq .
# Count claims for dbpool
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
jq '.items | map(select(.subject | startswith("dbpool"))) | length'
```
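
The explanation template in the `aphoria corpus create` call above is easier to apply consistently when it is filled in mechanically. The following is a minimal sketch, not part of Aphoria: `ClaimDraft`, `explanation`, and `to_command` are hypothetical names, the fields simply mirror the CLI flags, and the explanation wording and the authority chosen for this claim are illustrative assumptions rather than text from a source document.

```rust
/// Hypothetical draft of one claim. The fields mirror the
/// `aphoria corpus create` flags above; this is not Aphoria's own type.
struct ClaimDraft {
    subject: String,
    predicate: String,
    value: String,
    explanation: String,
    authority: String,
    category: String,
    tier: u8,
}

/// Fill the explanation template:
/// "{What} MUST {do} because {why}. If {violation}, {consequence}."
fn explanation(what: &str, must_do: &str, why: &str, violation: &str, consequence: &str) -> String {
    format!("{} MUST {} because {}. If {}, {}.", what, must_do, why, violation, consequence)
}

impl ClaimDraft {
    /// Render the draft as a command line (display only, no shell escaping).
    fn to_command(&self) -> String {
        format!(
            "aphoria corpus create --subject \"{}\" --predicate \"{}\" --value \"{}\" \
             --explanation \"{}\" --authority \"{}\" --category \"{}\" --tier {}",
            self.subject, self.predicate, self.value,
            self.explanation, self.authority, self.category, self.tier
        )
    }
}

fn main() {
    let draft = ClaimDraft {
        subject: "dbpool/connection_timeout".into(),
        predicate: "maximum".into(),
        value: "30".into(),
        // Illustrative wording only; take the real text from the source document.
        explanation: explanation(
            "connection_timeout",
            "stay at or below 30 seconds",
            "callers should fail fast when the pool is exhausted",
            "the timeout is 60 seconds",
            "requests queue up behind an unavailable database",
        ),
        authority: "HikariCP".into(), // assumed authority for this example
        category: "performance".into(),
        tier: 2,
    };
    println!("{}", draft.to_command());
}
```

Forcing every claim through the five template slots makes it obvious when a draft has no stated consequence, which is one of the anti-patterns the walkthrough warns about.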
### Build & Test (Day 2+)
```bash
# Build library
cargo build
# Run tests
cargo test
# Build release
cargo build --release
```
### Aphoria Scanning (Day 3+)
**Before Day 3:** Configure flywheel mode (see `docs/flywheel-setup.md`)
```bash
# Persistent scan (enables pattern learning)
aphoria scan --persist
# Persistent with sync (contributes to community corpus)
aphoria scan --persist --sync
# Ephemeral scan (fast, in-memory, ~0.25s - no learning)
aphoria scan
# JSON output (for programmatic analysis)
aphoria scan --format json > scan-results-v1.json
# Markdown report (human-readable)
aphoria scan --format markdown > SCAN-REPORT-v1.md
# Table output (default, terminal-friendly)
aphoria scan --format table
```
### Analyze Scan Results
```bash
# Count violations by severity
jq '.findings | group_by(.verdict) | map({verdict: .[0].verdict, count: length})' scan-results-v1.json
# Count BLOCK verdicts (critical violations)
jq '.findings | map(select(.verdict == "BLOCK")) | length' scan-results-v1.json
# List all findings with explanations
jq '.findings[] | {file, line, verdict, explanation}' scan-results-v1.json
```
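
For readers more comfortable with Rust than jq, the grouping and filtering above can be sketched as follows. This assumes only what the jq filters imply (each finding has `file`, `line`, and `verdict`); the `Finding` struct, the `count_by_verdict` helper, and the sample data are hypothetical and do not reflect Aphoria's actual JSON schema beyond those three fields.

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down finding. Field names follow the jq filters above.
struct Finding {
    file: &'static str,
    line: u32,
    verdict: &'static str,
}

/// Rough equivalent of `group_by(.verdict) | map({verdict, count})`.
fn count_by_verdict(findings: &[Finding]) -> BTreeMap<&'static str, usize> {
    let mut counts = BTreeMap::new();
    for finding in findings {
        *counts.entry(finding.verdict).or_insert(0) += 1;
    }
    counts
}

fn main() {
    // Made-up sample data, not real scan output.
    let findings = [
        Finding { file: "src/config.rs", line: 12, verdict: "BLOCK" },
        Finding { file: "src/config.rs", line: 20, verdict: "FLAG" },
        Finding { file: "src/pool.rs", line: 41, verdict: "BLOCK" },
    ];

    for (verdict, count) in count_by_verdict(&findings) {
        println!("{verdict}: {count}");
    }

    // Rough equivalent of `map(select(.verdict == "BLOCK"))`.
    for f in findings.iter().filter(|f| f.verdict == "BLOCK") {
        println!("BLOCK at {}:{}", f.file, f.line);
    }
}
```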
## Architecture
### File Structure
```
applications/aphoria/dogfood/dbpool/
├── plan.md # Master plan with 5-day schedule
├── CHECKLIST.md # Execution checklist with templates
├── CLAUDE.md # This file
├── Cargo.toml # Rust library manifest
├── .aphoria/
│ └── config.toml # Aphoria scan configuration
├── src/
│ ├── lib.rs # Library root
│ ├── config.rs # PoolConfig (violations → fixes)
│ ├── pool.rs # ConnectionPool implementation
│ ├── connection.rs # Connection wrapper
│ ├── metrics.rs # Pool metrics (added in Day 4)
│ └── error.rs # Error types
├── tests/
│ └── basic.rs # Functionality tests
├── docs/
│ ├── claim-extraction-example.md # COMPLETE WALKTHROUGH (read this first!)
│ ├── flywheel-setup.md # Flywheel configuration guide
│ ├── sources/ # Authority source documents
│ │ ├── hikaricp-config.md
│ │ ├── postgresql-pooling.md
│ │ └── owasp-credentials.md
│ ├── SUCCESS-STORY.md # Case study (Day 5)
│ └── DEMO-SCRIPT.md # Demo guide (Day 5)
├── scripts/
│ └── validate-setup.sh # Pre-flight validator
└── scan-results-v*.json # Progressive scan results
```
### Expected Violations (Day 2-3)
| # | Violation | Severity | Claim Violated |
|---|-----------|----------|----------------|
| 1 | Unbounded `max_connections: Option<usize>` | BLOCK | `dbpool/max_connections` required |
| 2 | Plaintext password in connection string | BLOCK | `dbpool/connection_string/password` must_not_be plaintext |
| 3 | Missing `max_lifetime` | BLOCK | `dbpool/max_lifetime` required |
| 4 | Excessive `connection_timeout` (60s vs 30s max) | FLAG | `dbpool/connection_timeout` maximum 30 |
| 5 | Zero `min_connections` (should be ≥2) | FLAG | `dbpool/min_connections` minimum 2 |
| 6 | No connection validation before checkout | FLAG | `dbpool/validation/frequency` required on_checkout |
| 7 | No metrics exposed | WARNING | `dbpool/metrics/enabled` recommended |
| 8 | No leak detection threshold | WARNING | `dbpool/leak_detection_threshold` recommended |

**Target:** 8/8 violations detected (100% accuracy), 0 false positives
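
To make the table concrete, here is a minimal sketch of what violations 1, 3, 4, and 5 might look like in `src/config.rs` on Day 2, next to a Day 4 version. Everything here is an assumption for illustration: the field types, the constructor names, and the 30-minute `max_lifetime` are not taken from `plan.md`, and `violated_subjects` is a toy check, not how Aphoria matches observations to claims. The `max_connections` default of 10 follows the commit message example in this file.

```rust
use std::time::Duration;

/// Hypothetical shape of `PoolConfig`; field names come from the table above.
struct PoolConfig {
    max_connections: Option<usize>, // None means unbounded (violation 1)
    min_connections: usize,         // 0 violates the minimum of 2 (violation 5)
    connection_timeout: Duration,   // 60s exceeds the 30s maximum (violation 4)
    max_lifetime: Option<Duration>, // None means never recycled (violation 3)
}

impl PoolConfig {
    /// Day 2: deliberately wrong values.
    fn day2_violations() -> Self {
        Self {
            max_connections: None,
            min_connections: 0,
            connection_timeout: Duration::from_secs(60),
            max_lifetime: None,
        }
    }

    /// Day 4: values that satisfy the four claims (lifetime value is assumed).
    fn day4_fixed() -> Self {
        Self {
            max_connections: Some(10),
            min_connections: 2,
            connection_timeout: Duration::from_secs(30),
            max_lifetime: Some(Duration::from_secs(30 * 60)),
        }
    }
}

/// Toy check that returns the claim subjects a config violates.
fn violated_subjects(cfg: &PoolConfig) -> Vec<&'static str> {
    let mut hits = Vec::new();
    if cfg.max_connections.is_none() {
        hits.push("dbpool/max_connections");
    }
    if cfg.max_lifetime.is_none() {
        hits.push("dbpool/max_lifetime");
    }
    if cfg.connection_timeout > Duration::from_secs(30) {
        hits.push("dbpool/connection_timeout");
    }
    if cfg.min_connections < 2 {
        hits.push("dbpool/min_connections");
    }
    hits
}

fn main() {
    println!("Day 2: {:?}", violated_subjects(&PoolConfig::day2_violations()));
    println!("Day 4: {:?}", violated_subjects(&PoolConfig::day4_fixed()));
}
```

A real fix for violation 1 would more likely change the field to a plain `usize` so an unbounded pool cannot be expressed at all; the sketch keeps `Option` only so both states fit one type.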
## Dependencies & Environment
### Required Services
- **stemedb-api** running on `:18180` with corpus database
```bash
STEMEDB_CORPUS_DB_DIR=/home/jml/.aphoria/corpus-db target/release/stemedb-api
```
- **Aphoria CLI** installed and working
```bash
aphoria --version # Should show version
```
### Rust Dependencies
```toml
[dependencies]
tokio = { version = "1", features = ["full"] }
tokio-postgres = "0.7"
serde = { version = "1", features = ["derive"] }
thiserror = "1"
```
## Authority Tiers
Claims in the corpus use these authority tiers:
- **Tier 0:** Regulatory (RFCs, Standards) - Highest authority
- **Tier 1:** Clinical (OWASP, NIST) - Security/compliance
- **Tier 2:** Vendor (HikariCP, PostgreSQL) - Industry best practices
- **Tier 3:** Expert (Team policy) - Project-specific rules
Our claims use **Tier 1** (OWASP A07) for security and **Tier 2** (HikariCP, PostgreSQL) for safety/performance.
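
The tier numbers passed to `--tier` can be read as an ordered enum in which a lower number means higher authority. This is a sketch under that reading only: `AuthorityTier`, `from_tier`, and `outranks` are hypothetical names, and nothing here describes how Aphoria represents or compares tiers internally.

```rust
/// Hypothetical enum mirroring the four tiers listed above.
/// Variants are declared in tier order, so the derived ordering
/// puts Regulatory (0) before Expert (3).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AuthorityTier {
    Regulatory = 0,
    Clinical = 1,
    Vendor = 2,
    Expert = 3,
}

impl AuthorityTier {
    /// Map the numeric `--tier` value to a tier, rejecting anything above 3.
    fn from_tier(n: u8) -> Option<Self> {
        match n {
            0 => Some(Self::Regulatory),
            1 => Some(Self::Clinical),
            2 => Some(Self::Vendor),
            3 => Some(Self::Expert),
            _ => None,
        }
    }

    /// A lower tier number carries more authority.
    fn outranks(self, other: Self) -> bool {
        self < other
    }
}

fn main() {
    let owasp = AuthorityTier::from_tier(1);
    let hikaricp = AuthorityTier::from_tier(2);
    println!("OWASP claims: {:?}, HikariCP claims: {:?}", owasp, hikaricp);
    if let (Some(a), Some(b)) = (owasp, hikaricp) {
        println!("OWASP outranks HikariCP: {}", a.outranks(b));
    }
}
```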
## Git Workflow
### Progressive Tagging (Day 4)
Each fix gets a tag for easy demo navigation:
```bash
# Initial state with violations
git tag v0.1.0-violations
# After each fix
git tag v0.2.0-fix-unbounded # Fixed max_connections
git tag v0.3.0-fix-credentials # Fixed plaintext password
git tag v0.4.0-fix-lifetime # Fixed max_lifetime
git tag v0.5.0-fix-timeouts # Fixed timeouts
git tag v0.6.0-fix-validation # Added validation
git tag v0.7.0-fix-observability # Added metrics
# Final state
git tag v1.0.0-production-ready # All violations fixed
```
### Commit Messages
```bash
# Format: fix(dbpool): <what> - <why/consequence prevented>
git commit -m "fix(dbpool): set max_connections to prevent unbounded growth

Aphoria detected missing max_connections configuration which would allow
unbounded connection growth and exhaust database connections under load.
Added required max_connections field with development default of 10.

Resolves: BLOCK violation from HikariCP claim (Tier 2)"
```
## Success Metrics
### Objective Targets
| Metric | Target | How to Measure |
|--------|--------|----------------|
| Claims Extracted | 25-30 | `curl corpus API \| jq '.total_matching'` |
| Violations Detected | 7-8 | `jq '.findings \| length' scan-results-v1.json` |
| Detection Accuracy | 100% | All intentional violations found, 0 false positives |
| Scan Performance | ≤0.3s | `time aphoria scan` (ephemeral mode) |
| Final Scan Result | 0 conflicts | `scan-results-v6.json` shows all PASS |
### Qualitative Outcomes
- **Compelling Story:** "Aphoria prevented 3 potential P0 incidents before first deployment"
- **Educational Value:** Each violation includes explanation of real-world consequence
- **Production Ready:** Final code is genuinely production-worthy (can be extracted as real library)
- **Demonstrable:** 5-minute demo shows clear value proposition
## Critical Rules
1. **Read `plan.md` First:** Always check the plan to understand current phase and goals
2. **Intentional Violations:** Day 2 involves deliberately writing bad code (it's educational)
3. **Progressive Fixes:** Day 4 fixes violations one at a time with re-scans after each
4. **Evidence Collection:** Save all scan results (`scan-results-v*.json`) for documentation
5. **Authority Attribution:** Every claim must cite specific authority source (HikariCP docs, PostgreSQL guide, OWASP)
## Common Tasks
### Start a New Day
```bash
# 1. Read the plan
grep "^### Day X" plan.md        # replace X with the current day number
# 2. Check current status
grep "^### Day X" CHECKLIST.md   # replace X with the current day number
# 3. Verify environment
aphoria --version
curl http://localhost:18180/health
```
### Verify Corpus Setup
```bash
# Count dbpool claims
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
jq '.items | map(select(.subject | startswith("dbpool"))) | length'
# Should return: 25-30 after Day 1
```
### Check Violation Status
```bash
# Current violation count
aphoria scan --format json | jq '.findings | length'
# Breakdown by severity
aphoria scan --format json | \
jq '.findings | group_by(.verdict) | map({verdict: .[0].verdict, count: length})'
```
## Troubleshooting
### Aphoria not found
```bash
cd /home/jml/Workspace/stemedb/applications/aphoria
cargo build --release
sudo cp target/release/aphoria /usr/local/bin/
```
### Corpus empty after creating claims
```bash
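# Verify the API process was started with the correct corpus DB.
# `ps aux` does not list environment variables, so read them from /proc (Linux only).
tr '\0' '\n' < "/proc/$(pgrep -o -x stemedb-api)/environ" | grep STEMEDB_CORPUS_DB_DIR
# Should show: STEMEDB_CORPUS_DB_DIR=/home/jml/.aphoria/corpus-db
# If not, restart the API with the correct env var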
```
### Scan finds no violations
```bash
# Enable debug logging
RUST_LOG=aphoria=debug aphoria scan
# Verify claims exist
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
jq '.items[] | select(.subject | contains("dbpool"))'
```
## Documentation Requirements
All documentation must include:
- **Before/After Evidence:** Screenshots of violations → clean scans
- **Cost Analysis:** Estimated impact of prevented incidents ($50K connection exhaustion, 20 engineer-hours debugging)
- **Metrics:** Detection accuracy (100%), scan performance (≤0.3s), false positive rate (0%)
- **Authority Attribution:** Every claim linked to specific source (HikariCP wiki page, PostgreSQL docs, OWASP A07)
## Related Documentation
- `plan.md` - Detailed 5-day implementation plan
- `CHECKLIST.md` - Execution checklist with templates and examples
- `/home/jml/Workspace/stemedb/CLAUDE.md` - Parent project guidance
- `/home/jml/Workspace/stemedb/applications/aphoria/README.md` - Aphoria documentation