# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

# Aphoria Dogfood Project: Database Connection Pool (`dbpool`)

**Purpose:** Demonstrate Aphoria's code-level truth linting by building a PostgreSQL connection pool library with intentional violations, then using Aphoria to detect and guide remediation.

**Status:** 5-day dogfood project (see `plan.md` for detailed schedule)

**Parent Project:** This is a dogfood demonstration within the larger Aphoria/StemeDB project located at `/home/jml/Workspace/stemedb/`

## Critical Context

### What Makes This Project Special

This is **not** a normal implementation project. The workflow is:

1. **Day 1:** Create 25-30 authoritative claims in the corpus database (HikariCP, PostgreSQL, OWASP best practices)
2. **Day 2:** Write working code that **intentionally violates** 7-8 of those claims
3. **Day 3:** Run `aphoria scan` to verify all violations are detected
4. **Day 4:** Fix violations incrementally, re-scanning after each fix
5. **Day 5:** Document the success story with before/after evidence

**The violations are intentional and educational.** When writing code on Day 2, we **want** to violate the claims created on Day 1 to demonstrate detection.

### The Two Modes

- **Violation Mode (Day 2):** Write code that deliberately violates best practices
- **Remediation Mode (Day 4):** Fix code to comply with all claims

Always check `plan.md` to understand which mode we're in.

## Quick Start

### Pre-Flight Check

Before starting the dogfood exercise, validate your environment:

```bash
./scripts/validate-setup.sh
```

This checks all prerequisites (Aphoria CLI, API running, corpus DB, extractors working) and shows you exactly what to fix if anything is missing.

### Learn Claim Extraction

Read the complete walkthrough before creating claims:

```bash
cat docs/claim-extraction-example.md
```

This teaches you how to extract claims from prose documentation with:
- Complete worked example (HikariCP paragraph → 3 claims)
- Decision framework (what deserves to be a claim vs noise)
- Anti-patterns to avoid (too generic, no consequences)

**Time:** 15-20 minutes | **Worth it:** Prevents creating garbage claims

---

## Development Commands

### Corpus Management (Day 1)

```bash
# Create a claim in the corpus database
aphoria corpus create \
  --subject "dbpool/{component}/{property}" \
  --predicate "{required|recommended|bounded|minimum|maximum}" \
  --value "{value}" \
  --explanation "{What} MUST {do} because {why}. If {violation}, {consequence}." \
  --authority "{Source Name}" \
  --category "{safety|security|performance|architecture}" \
  --tier {0-3}

# Query corpus via API (requires stemedb-api running on :18180)
# -g turns off curl's URL globbing so the literal [] in sources[] is sent as-is
curl -g 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor&limit=100' | jq .

# Count claims for dbpool
curl -g 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
  jq '.items | map(select(.subject | startswith("dbpool"))) | length'
```
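
The fields of the `aphoria corpus create` template can be modeled as a small Rust type for checking a claim draft before it is entered. This is only a sketch: `ClaimDraft`, `problems`, and `explanation` are hypothetical names and are not part of the Aphoria CLI or API. The category list, tier range, and explanation template are copied from the command above; predicates are deliberately not checked here.

```rust
/// Hypothetical claim draft mirroring the `aphoria corpus create` flags.
#[derive(Debug, Clone)]
pub struct ClaimDraft {
    pub subject: String,
    pub predicate: String,
    pub value: String,
    pub authority: String,
    pub category: String,
    pub tier: u8,
}

/// Categories listed in the command template.
const CATEGORIES: [&str; 4] = ["safety", "security", "performance", "architecture"];

impl ClaimDraft {
    /// Returns human-readable problems; an empty list means the draft looks well-formed.
    pub fn problems(&self) -> Vec<String> {
        let mut out = Vec::new();
        if !self.subject.starts_with("dbpool/") {
            out.push(format!("subject '{}' should start with dbpool/", self.subject));
        }
        if !CATEGORIES.contains(&self.category.as_str()) {
            out.push(format!("unknown category '{}'", self.category));
        }
        if self.tier > 3 {
            out.push(format!("tier {} is outside 0-3", self.tier));
        }
        if self.authority.trim().is_empty() {
            out.push("authority must name a source".to_string());
        }
        out
    }
}

/// Fills the template "{What} MUST {do} because {why}. If {violation}, {consequence}."
pub fn explanation(
    what: &str,
    must_do: &str,
    why: &str,
    violation: &str,
    consequence: &str,
) -> String {
    format!(
        "{} MUST {} because {}. If {}, {}.",
        what, must_do, why, violation, consequence
    )
}
```

A draft with category `style` or tier `7` would be reported by `problems()` before any time is spent on the CLI call.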

### Build & Test (Day 2+)

```bash
# Build library
cargo build

# Run tests
cargo test

# Build release
cargo build --release
```

### Aphoria Scanning (Day 3+)

**Before Day 3:** Configure flywheel mode (see `docs/flywheel-setup.md`)

```bash
# Persistent scan (enables pattern learning)
aphoria scan --persist

# Persistent with sync (contributes to community corpus)
aphoria scan --persist --sync

# Ephemeral scan (fast, in-memory, ~0.25s - no learning)
aphoria scan

# JSON output (for programmatic analysis)
aphoria scan --format json > scan-results-v1.json

# Markdown report (human-readable)
aphoria scan --format markdown > SCAN-REPORT-v1.md

# Table output (default, terminal-friendly)
aphoria scan --format table
```

### Analyze Scan Results

```bash
# Count violations by severity
jq '.findings | group_by(.verdict) | map({verdict: .[0].verdict, count: length})' scan-results-v1.json

# Count BLOCK verdicts (critical violations)
jq '.findings | map(select(.verdict == "BLOCK")) | length' scan-results-v1.json

# List all findings with explanations
jq '.findings[] | {file, line, verdict, explanation}' scan-results-v1.json
```
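
For analysis inside Rust code rather than the shell, the same grouping can be sketched as follows. The `Finding` fields are inferred from the jq queries above and may not match the full scan schema. JSON parsing is left out because `serde_json` is not among the listed dependencies, and `count_by_verdict` and `block_count` are hypothetical helper names.

```rust
use std::collections::BTreeMap;

/// One scan finding, limited to the fields used in the jq queries above.
#[derive(Debug, Clone)]
pub struct Finding {
    pub file: String,
    pub line: u32,
    pub verdict: String,
    pub explanation: String,
}

/// Counts findings per verdict, like the `group_by(.verdict)` query.
pub fn count_by_verdict(findings: &[Finding]) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for finding in findings {
        *counts.entry(finding.verdict.clone()).or_insert(0) += 1;
    }
    counts
}

/// Counts BLOCK verdicts, like the `select(.verdict == "BLOCK")` query.
pub fn block_count(findings: &[Finding]) -> usize {
    findings.iter().filter(|f| f.verdict == "BLOCK").count()
}
```

A `BTreeMap` keeps the verdicts in a stable alphabetical order, which makes before/after comparisons between scan versions easier to read.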

## Architecture

### File Structure

```
applications/aphoria/dogfood/dbpool/
├── plan.md                      # Master plan with 5-day schedule
├── CHECKLIST.md                 # Execution checklist with templates
├── CLAUDE.md                    # This file
├── Cargo.toml                   # Rust library manifest
├── .aphoria/
│   └── config.toml              # Aphoria scan configuration
├── src/
│   ├── lib.rs                   # Library root
│   ├── config.rs                # PoolConfig (violations → fixes)
│   ├── pool.rs                  # ConnectionPool implementation
│   ├── connection.rs            # Connection wrapper
│   ├── metrics.rs               # Pool metrics (added on Day 4)
│   └── error.rs                 # Error types
├── tests/
│   └── basic.rs                 # Functionality tests
├── docs/
│   ├── claim-extraction-example.md  # COMPLETE WALKTHROUGH (read this first!)
│   ├── flywheel-setup.md            # Flywheel configuration guide
│   ├── sources/                     # Authority source documents
│   │   ├── hikaricp-config.md
│   │   ├── postgresql-pooling.md
│   │   └── owasp-credentials.md
│   ├── SUCCESS-STORY.md             # Case study (Day 5)
│   └── DEMO-SCRIPT.md               # Demo guide (Day 5)
├── scripts/
│   └── validate-setup.sh            # Pre-flight validator
└── scan-results-v*.json         # Progressive scan results
```

### Expected Violations (Day 2-3)

| # | Violation | Severity | Claim Violated |
|---|-----------|----------|----------------|
| 1 | Unbounded `max_connections: Option<usize>` | BLOCK | `dbpool/max_connections` required |
| 2 | Plaintext password in connection string | BLOCK | `dbpool/connection_string/password` must_not_be plaintext |
| 3 | Missing `max_lifetime` | BLOCK | `dbpool/max_lifetime` required |
| 4 | Excessive `connection_timeout` (60s vs 30s max) | FLAG | `dbpool/connection_timeout` maximum 30 |
| 5 | Zero `min_connections` (should be ≥2) | FLAG | `dbpool/min_connections` minimum 2 |
| 6 | No connection validation before checkout | FLAG | `dbpool/validation/frequency` required on_checkout |
| 7 | No metrics exposed | WARNING | `dbpool/metrics/enabled` recommended |
| 8 | No leak detection threshold | WARNING | `dbpool/leak_detection_threshold` recommended |

**Target:** 8/8 violations detected (100% accuracy), 0 false positives
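
As an illustration of rows 1, 3, 4, and 5, a Day 2 configuration might look like the sketch below. The struct layout, the `violating` constructor, and the `violated_rows` helper are assumptions made for this example; the real `src/config.rs` may differ, and this is not how Aphoria itself detects violations. The thresholds (30s maximum timeout, minimum of 2 connections) come from the table.

```rust
use std::time::Duration;

/// Hypothetical pool configuration covering the numeric rows of the table.
#[derive(Debug, Clone)]
pub struct PoolConfig {
    pub max_connections: Option<usize>,
    pub min_connections: usize,
    pub connection_timeout: Duration,
    pub max_lifetime: Option<Duration>,
}

impl PoolConfig {
    /// Day 2 values that deliberately break rows 1, 3, 4, and 5.
    pub fn violating() -> Self {
        PoolConfig {
            max_connections: None,                        // row 1: unbounded
            min_connections: 0,                           // row 5: below minimum of 2
            connection_timeout: Duration::from_secs(60),  // row 4: above 30s maximum
            max_lifetime: None,                           // row 3: missing
        }
    }

    /// Returns the table row numbers this configuration violates.
    pub fn violated_rows(&self) -> Vec<u8> {
        let mut rows = Vec::new();
        if self.max_connections.is_none() {
            rows.push(1);
        }
        if self.max_lifetime.is_none() {
            rows.push(3);
        }
        if self.connection_timeout > Duration::from_secs(30) {
            rows.push(4);
        }
        if self.min_connections < 2 {
            rows.push(5);
        }
        rows
    }
}
```

On Day 4, each fix should remove exactly one row from this list, which mirrors the one-fix-per-scan workflow.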

## Dependencies & Environment

### Required Services

- **stemedb-api** running on `:18180` with corpus database
  ```bash
  STEMEDB_CORPUS_DB_DIR=/home/jml/.aphoria/corpus-db target/release/stemedb-api
  ```

- **Aphoria CLI** installed and working
  ```bash
  aphoria --version  # Should show version
  ```

### Rust Dependencies

```toml
[dependencies]
tokio = { version = "1", features = ["full"] }
tokio-postgres = "0.7"
serde = { version = "1", features = ["derive"] }
thiserror = "1"
```

## Authority Tiers

Claims in the corpus use these authority tiers:

- **Tier 0:** Regulatory (RFCs, Standards) - Highest authority
- **Tier 1:** Clinical (OWASP, NIST) - Security/compliance
- **Tier 2:** Vendor (HikariCP, PostgreSQL) - Industry best practices
- **Tier 3:** Expert (Team policy) - Project-specific rules

Our claims use **Tier 1** (OWASP A07) for security and **Tier 2** (HikariCP, PostgreSQL) for safety/performance.
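
The tier list maps naturally onto a Rust enum. The sketch below assumes only what the list states (four tiers, with Tier 0 as the highest authority); `AuthorityTier`, `from_number`, and `outranks` are hypothetical names, and whether Aphoria compares tiers this way internally is not confirmed here.

```rust
/// Authority tiers from the list above; a lower number means higher authority.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthorityTier {
    Regulatory = 0,
    Clinical = 1,
    Vendor = 2,
    Expert = 3,
}

impl AuthorityTier {
    /// Converts the numeric `--tier` value into a tier, rejecting values above 3.
    pub fn from_number(n: u8) -> Option<Self> {
        match n {
            0 => Some(AuthorityTier::Regulatory),
            1 => Some(AuthorityTier::Clinical),
            2 => Some(AuthorityTier::Vendor),
            3 => Some(AuthorityTier::Expert),
            _ => None,
        }
    }

    /// True when `self` carries more authority than `other`.
    pub fn outranks(self, other: Self) -> bool {
        (self as u8) < (other as u8)
    }
}
```

Under this ordering, a Tier 1 OWASP claim outranks a Tier 2 HikariCP claim.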

## Git Workflow

### Progressive Tagging (Day 4)

Each fix gets a tag for easy demo navigation:

```bash
# Initial state with violations
git tag v0.1.0-violations

# After each fix
git tag v0.2.0-fix-unbounded      # Fixed max_connections
git tag v0.3.0-fix-credentials    # Fixed plaintext password
git tag v0.4.0-fix-lifetime       # Fixed max_lifetime
git tag v0.5.0-fix-timeouts       # Fixed timeouts
git tag v0.6.0-fix-validation     # Added validation
git tag v0.7.0-fix-observability  # Added metrics

# Final state
git tag v1.0.0-production-ready   # All violations fixed
```

### Commit Messages

```bash
# Format: fix(dbpool): <what> - <why/consequence prevented>
git commit -m "fix(dbpool): set max_connections to prevent unbounded growth

Aphoria detected missing max_connections configuration which would allow
unbounded connection growth and exhaust database connections under load.
Added required max_connections field with development default of 10.

Resolves: BLOCK violation from HikariCP claim (Tier 2)"
```

## Success Metrics

### Objective Targets

| Metric | Target | How to Measure |
|--------|--------|----------------|
| Claims Extracted | 25-30 | `curl corpus API \| jq '.total_matching'` |
| Violations Detected | 7-8 | `jq '.findings \| length' scan-results-v1.json` |
| Detection Accuracy | 100% | All intentional violations found, 0 false positives |
| Scan Performance | ≤0.3s | `time aphoria scan` (ephemeral mode) |
| Final Scan Result | 0 conflicts | `scan-results-v6.json` shows all PASS |
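
The Detection Accuracy row can be checked mechanically by comparing the intended violations with what the scan reported. The sketch below assumes findings can be keyed by claim subject (for example `dbpool/max_connections`); `DetectionReport` and `compare` are hypothetical names, not Aphoria output.

```rust
use std::collections::BTreeSet;

/// Result of comparing intended violations with reported findings.
#[derive(Debug)]
pub struct DetectionReport {
    pub detected: usize,
    pub missed: Vec<String>,
    pub false_positives: Vec<String>,
}

impl DetectionReport {
    /// True when every intended violation was found and nothing else was reported.
    pub fn meets_target(&self) -> bool {
        self.missed.is_empty() && self.false_positives.is_empty()
    }
}

/// Compares the expected claim subjects with the subjects the scan reported.
pub fn compare(expected: &[&str], found: &[&str]) -> DetectionReport {
    let expected: BTreeSet<&str> = expected.iter().copied().collect();
    let found: BTreeSet<&str> = found.iter().copied().collect();
    DetectionReport {
        detected: expected.intersection(&found).count(),
        missed: expected.difference(&found).map(|s| (*s).to_string()).collect(),
        false_positives: found.difference(&expected).map(|s| (*s).to_string()).collect(),
    }
}
```

Listing the missed and false-positive subjects by name, rather than reporting only a percentage, gives the Day 5 write-up concrete evidence.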

### Qualitative Outcomes

- **Compelling Story:** "Aphoria prevented 3 potential P0 incidents before first deployment"
- **Educational Value:** Each violation includes explanation of real-world consequence
- **Production Ready:** Final code is genuinely production-worthy (can be extracted as real library)
- **Demonstrable:** 5-minute demo shows clear value proposition

## Critical Rules

1. **Read `plan.md` First:** Always check the plan to understand current phase and goals
2. **Intentional Violations:** Day 2 involves deliberately writing bad code against the Day 1 claims (it's educational)
3. **Progressive Fixes:** Day 4 fixes violations one at a time with re-scans after each
4. **Evidence Collection:** Save all scan results (`scan-results-v*.json`) for documentation
5. **Authority Attribution:** Every claim must cite specific authority source (HikariCP docs, PostgreSQL guide, OWASP)

## Common Tasks

### Start a New Day

```bash
# 1. Read the plan (replace X with the day number)
grep "^### Day X" plan.md

# 2. Check current status
grep "^### Day X" CHECKLIST.md

# 3. Verify environment
aphoria --version
curl http://localhost:18180/health
```

### Verify Corpus Setup

```bash
# Count dbpool claims (-g stops curl from treating [] as a glob pattern)
curl -g 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
  jq '.items | map(select(.subject | startswith("dbpool"))) | length'

# Should return: 25-30 after Day 1
```

### Check Violation Status

```bash
# Current violation count
aphoria scan --format json | jq '.findings | length'

# Breakdown by severity
aphoria scan --format json | \
  jq '.findings | group_by(.verdict) | map({verdict: .[0].verdict, count: length})'
```

## Troubleshooting

### Aphoria not found

```bash
cd /home/jml/Workspace/stemedb/applications/aphoria
cargo build --release
sudo cp target/release/aphoria /usr/local/bin/
```

### Corpus empty after creating claims

```bash
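# Verify API is using correct corpus DB
# (ps aux does not list environment variables; on Linux, read them from /proc)
tr '\0' '\n' < "/proc/$(pgrep -x stemedb-api | head -n 1)/environ" | grep STEMEDB_CORPUS_DB_DIR
# Should show: STEMEDB_CORPUS_DB_DIR=/home/jml/.aphoria/corpus-db

# If not, restart API with correct env var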
```

### Scan finds no violations

```bash
# Enable debug logging
RUST_LOG=aphoria=debug aphoria scan

# Verify claims exist (-g stops curl from treating [] as a glob pattern)
curl -g 'http://localhost:18180/v1/aphoria/corpus?sources[]=vendor' | \
  jq '.items[] | select(.subject | contains("dbpool"))'
```

## Documentation Requirements

All documentation must include:

- **Before/After Evidence:** Screenshots of violations → clean scans
- **Cost Analysis:** Estimated impact of prevented incidents ($50K connection exhaustion, 20 engineer-hours debugging)
- **Metrics:** Detection accuracy (100%), scan performance (≤0.3s), false positive rate (0%)
- **Authority Attribution:** Every claim linked to specific source (HikariCP wiki page, PostgreSQL docs, OWASP A07)

## Related Documentation

- `plan.md` - Detailed 5-day implementation plan
- `CHECKLIST.md` - Execution checklist with templates and examples
- `/home/jml/Workspace/stemedb/CLAUDE.md` - Parent project guidance
- `/home/jml/Workspace/stemedb/applications/aphoria/README.md` - Aphoria documentation