stemedb/applications/aphoria/dogfood/cachewrap/plan.md

# Dogfood Project: Distributed Cache Client (cachewrap)

**Start Date:** 2026-02-11
**Hypothesis:** Connection patterns + resource limits + TTL semantics from 3 corpora (httpclient, dbpool, msgqueue) transfer to cache clients with 35-40% pattern reuse, demonstrating multi-domain flywheel strength.
**Corpus Overlap:** httpclient + dbpool + msgqueue → **35-40%** pattern reuse expected
**Target Metrics:**
- Time savings: **≥60%** vs manual (Day 1: <2 hrs vs ~4 hrs manual)
- Pattern reuse: **≥35%** of claims (7/20 claims)
- Detection rate: **≥90%** of violations (9/10 detected)
- Naming errors: **<2**
- Total time: **12-16 hours** (reflects ★★★★☆ difficulty)

---

## Day 1: Claims Extraction (1-2 hours)

**Goal:** Author **20 claims** (7 reused from corpus, 13 new) with full provenance

**Skills:**
- `/aphoria-suggest --corpus httpclient,dbpool,msgqueue` - Discover reusable patterns
- `/aphoria-claims` - Author claims with full provenance

**Process:**

### 1. Discover Reusable Patterns (30 min)

```bash
cd /path/to/aphoria/dogfood/cachewrap
/aphoria-suggest --corpus httpclient,dbpool,msgqueue --domain cache
```

Expected reusable patterns (7 total):
- httpclient: timeout, TLS verification, retry, async (4)
- dbpool: max_connections, connection lifecycle (2)
- msgqueue: metrics (1)

### 2. Draft New Claims (30 min)

Read authority sources in `docs/sources/`:
- `redis-spec.md` - TTL, eviction, consistency
- `aws-elasticache.md` - Best practices, security
- `redis-rs-lib.md` - Rust patterns

Draft 13 new claims covering:
- TTL and expiration (3 claims)
- Security (key validation, injection) (2 claims)
- Eviction policies (2 claims)
- Resource limits (cache size, memory) (2 claims)
- Consistency and sharding (2 claims)
- Serialization and compression (2 claims)

### 3. Author All Claims (30 min)

Use `/aphoria-claims` to author each claim with:
- **Provenance:** Redis spec, AWS docs, or library docs
- **Invariant:** What MUST stay true
- **Consequence:** What breaks if violated
- **Authority tier:** Tier 1 (spec), Tier 2 (vendor), Tier 3 (library)
- **Category:** security, safety, performance, correctness, observability

Example:
```bash
/aphoria-claims create \
  --subject "cache/ttl" \
  --predicate "required" \
  --value "true" \
  --provenance "Redis SETEX command spec" \
  --invariant "TTL MUST be set for all cached values" \
  --consequence "Missing TTL causes memory leak - unbounded growth" \
  --tier "expert" \
  --category "safety"
```

### 4. Verify Claims (10 min)

```bash
cat .aphoria/claims.toml
# Verify all 20 claims present with full fields
```
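The "full fields" check and the reuse-rate target can also be expressed programmatically. The following is a minimal sketch only: the real `.aphoria/claims.toml` schema is not shown in this plan, so the `Claim` struct, its field names, and `reuse_percent` are all hypothetical stand-ins for whatever the file actually contains.

```rust
/// Hypothetical in-memory view of one claim; field names are assumptions,
/// not the actual `.aphoria/claims.toml` schema.
pub struct Claim {
    pub subject: String,
    pub provenance: String,
    pub invariant: String,
    pub consequence: String,
    pub tier: String,
    /// Name of the source corpus when the claim was reused, `None` when new.
    pub reused_from: Option<String>,
}

impl Claim {
    /// A claim is complete when none of the required text fields is blank.
    pub fn is_complete(&self) -> bool {
        [&self.provenance, &self.invariant, &self.consequence, &self.tier]
            .iter()
            .all(|field| !field.trim().is_empty())
    }
}

/// Percentage of claims reused from a corpus, rounded down.
pub fn reuse_percent(claims: &[Claim]) -> usize {
    if claims.is_empty() {
        return 0;
    }
    let reused = claims.iter().filter(|c| c.reused_from.is_some()).count();
    reused * 100 / claims.len()
}
```

With 7 reused claims out of 20, `reuse_percent` returns 35, which is exactly the Day 1 threshold.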

**Target Output:**
- 20 claims in `.aphoria/claims.toml`
- 7 reused from corpus (35% reuse rate)
- 13 new claims specific to caching
- Daily summary: `DAY1-SUMMARY.md`

**Success Criteria:**
- ✅ All claims have: provenance, invariant, consequence, authority tier
- ✅ Reuse rate ≥ 35% (7/20 claims)
- ✅ Time ≤ 2 hours
- ✅ 0 naming errors (consistent with corpus)

---

## Day 2: Implementation (3-4 hours)

**Goal:** Write cachewrap library with **10 intentional violations** (security + performance + correctness + observability)

**Violations (Intentional) - Cross-Cutting:**

### Security Violations (3):

1. **Key Injection Vulnerability**
   - Consequence: Attacker controls cache keys → data breach, cache poisoning
   - Marker: `@aphoria:claim[security] Cache keys MUST be validated -- unvalidated keys enable injection attacks`
   - Location: `src/client.rs:get()` method
   - Pattern: Accept user input as key without validation/sanitization

2. **TLS Verification Disabled**
   - Consequence: MITM attacks intercept cache traffic → credential theft
   - Marker: `@aphoria:claim[security] TLS certificate verification MUST be enabled -- disabled TLS enables MITM attacks`
   - Location: `src/config.rs:verify_tls = false`
   - Pattern: `verify_tls: false` in config

3. **Hardcoded Credentials**
   - Consequence: Credentials in version control → unauthorized access
   - Marker: `@aphoria:claim[security] Credentials MUST NOT be hardcoded -- hardcoded passwords leak in VCS`
   - Location: `src/config.rs:password = "secret123"`
   - Pattern: Plaintext password string in struct

### Performance Violations (3):

4. **Missing TTL**
   - Consequence: Memory leak - unbounded cache growth → OOM
   - Marker: `@aphoria:claim[safety] TTL MUST be set for cached values -- missing TTL causes memory leak`
   - Location: `src/client.rs:set()` method
   - Pattern: `SET key value` without `EX ttl`

5. **Unbounded Cache Size**
   - Consequence: OOM under sustained load
   - Marker: `@aphoria:claim[safety] Cache MUST have max_size limit -- unbounded cache causes OOM`
   - Location: `src/config.rs:max_size = None`
   - Pattern: `Option<usize>` instead of required field

6. **Synchronous Blocking**
   - Consequence: Throughput collapse - blocks event loop
   - Marker: `@aphoria:claim[performance] Cache I/O MUST be async -- synchronous blocking kills throughput`
   - Location: `src/client.rs:blocking_get()`
   - Pattern: Blocking Redis call in async context

### Correctness Violations (3):

7. **No Eviction Policy**
   - Consequence: Unpredictable behavior when cache full
   - Marker: `@aphoria:claim[correctness] Eviction policy MUST be configured -- missing policy causes undefined behavior`
   - Location: `src/config.rs:eviction_policy = None`
   - Pattern: Missing LRU/LFU configuration

8. **Zero Timeout**
   - Consequence: Indefinite blocking → hung threads
   - Marker: `@aphoria:claim[safety] Timeout MUST be > 0 -- timeout=0 causes indefinite blocking`
   - Location: `src/config.rs:timeout = Duration::from_secs(0)`
   - Pattern: `Duration::from_secs(0)`

9. **No Connection Pooling**
   - Consequence: Resource exhaustion - new connection per request
   - Marker: `@aphoria:claim[performance] Connection pooling MUST be enabled -- no pooling exhausts resources`
   - Location: `src/client.rs:new_connection()` called per request
   - Pattern: `redis::Client::open()` in hot path

### Observability Violation (1):

10. **No Metrics**
    - Consequence: Cannot debug cache hit/miss behavior in production
    - Marker: `@aphoria:claim[observability] Metrics MUST track hit/miss rates -- no metrics prevents debugging`
    - Location: `src/config.rs:metrics_enabled = false`
    - Pattern: No hit/miss counter fields

**Process:**

### 1. Create Project Structure (30 min)

```bash
cargo init --lib
# Or appropriate build setup
```

Files to create:
- `src/lib.rs` - Library root
- `src/config.rs` - CacheConfig (violations 2, 3, 5, 7, 8, 10)
- `src/client.rs` - CacheClient (violations 1, 4, 6, 9)
- `src/error.rs` - Error types
- `tests/basic.rs` - Integration tests

### 2. Implement Happy Path (1.5 hours)

Core functionality:
- `CacheClient::new(config)` - Initialize with config
- `async fn get(&self, key: &str) -> Result<Option<String>>` - Fetch from cache
- `async fn set(&self, key: &str, value: &str) -> Result<()>` - Store in cache
- `async fn delete(&self, key: &str) -> Result<()>` - Remove from cache
- `fn health_check(&self) -> Result<bool>` - Connection health

**Keep implementation simple** - focus on violations, not production quality.

### 3. Embed Violations (1 hour)

For each violation:
1. Write code that violates the claim
2. Add inline marker comment (`@aphoria:claim[category] invariant -- consequence`)
3. Document why this is realistic (common mistake, copy-paste error, etc.)
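The marker format in step 2 (`@aphoria:claim[category] invariant -- consequence`) is regular enough to sanity-check before scanning. The sketch below is one possible way to split a marker line into its three parts; it is not Aphoria's actual parser, and `Marker` and `parse_marker` are hypothetical names introduced for illustration.

```rust
/// The three parts of an inline marker (hypothetical type, for illustration).
#[derive(Debug, PartialEq)]
pub struct Marker<'a> {
    pub category: &'a str,
    pub invariant: &'a str,
    pub consequence: &'a str,
}

/// Parses `@aphoria:claim[category] invariant -- consequence` out of a line.
/// Returns `None` when the marker is absent or any part is empty.
pub fn parse_marker(line: &str) -> Option<Marker<'_>> {
    const PREFIX: &str = "@aphoria:claim[";
    let start = line.find(PREFIX)? + PREFIX.len();
    let rest = &line[start..];
    // Everything up to the closing bracket is the category.
    let (category, body) = rest.split_once(']')?;
    // The first " -- " separates the invariant from the consequence.
    let (invariant, consequence) = body.split_once(" -- ")?;
    let (invariant, consequence) = (invariant.trim(), consequence.trim());
    if category.is_empty() || invariant.is_empty() || consequence.is_empty() {
        return None;
    }
    Some(Marker { category, invariant, consequence })
}
```

Running a check like this over `src/` before Day 3 would catch markers that are missing the ` -- ` separator or the category.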

Example (Violation 1 - Key Injection):
```rust
// @aphoria:claim[security] Cache keys MUST be validated -- unvalidated keys enable injection attacks
pub async fn get(&self, key: &str) -> Result<Option<String>> {
    // ❌ VIOLATION: No key validation - enables injection
    let value = self.conn.get(key).await?;  // User input directly to Redis
    Ok(value)
}

// ✅ COMPLIANT (for Day 4):
// pub async fn get(&self, key: &str) -> Result<Option<String>> {
//     validate_key(key)?;  // Check for control chars, length, etc.
//     let value = self.conn.get(key).await?;
//     Ok(value)
// }
```
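The compliant version above calls `validate_key`, which this plan does not define. A minimal sketch follows, assuming an allow-list of ASCII alphanumerics plus `:`, `_`, `-`, `.` and an arbitrary 256-byte length limit; both rules, and the `KeyError` type, are assumptions chosen for illustration rather than requirements from the authority sources.

```rust
/// Hypothetical error type; the plan's `src/error.rs` is not shown.
#[derive(Debug, PartialEq)]
pub enum KeyError {
    Empty,
    TooLong(usize),
    InvalidChar(char),
}

/// Assumed limit, chosen for illustration only.
const MAX_KEY_LEN: usize = 256;

/// Rejects empty, oversized, or suspicious cache keys.
pub fn validate_key(key: &str) -> Result<(), KeyError> {
    if key.is_empty() {
        return Err(KeyError::Empty);
    }
    if key.len() > MAX_KEY_LEN {
        return Err(KeyError::TooLong(key.len()));
    }
    // Allow-list: anything else (whitespace, CR/LF, control chars) is rejected.
    let first_bad = key
        .chars()
        .find(|&c| !(c.is_ascii_alphanumeric() || matches!(c, ':' | '_' | '-' | '.')));
    match first_bad {
        Some(bad) => Err(KeyError::InvalidChar(bad)),
        None => Ok(()),
    }
}
```

An allow-list is used instead of a deny-list so that unexpected characters fail closed; the exact character set should be adjusted to the key scheme the library ends up using.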

### 4. Add Tests (30 min)

Create 15+ tests covering:
- Basic get/set/delete operations
- Error handling (connection failures, invalid keys)
- Configuration validation
- Async behavior verification

Tests should **pass** despite violations (violations are configuration/usage issues, not logic errors).

### 5. Document Violations (10 min)

In `src/lib.rs` doc comment, list all 10 violations with consequences:

```rust
//! # ⚠️ INTENTIONAL VIOLATIONS (Dogfooding Exercise)
//!
//! This library contains 10 intentional violations for Aphoria detection:
//! 1. Key injection (no validation) → Data breach
//! 2. TLS disabled → MITM attacks
//! ...
//! 10. No metrics → Cannot debug production
//!
//! These will be fixed progressively in Day 4 after detection in Day 3.
```

**Target Output:**
- Working cachewrap library (basic functionality)
- 10 embedded violations with inline markers
- 15+ tests passing
- Daily summary: `DAY2-SUMMARY.md`

**Success Criteria:**
- ✅ All 10 violations have inline markers
- ✅ Code is realistic (not contrived toy example)
- ✅ Tests pass (violations don't break logic)
- ✅ Time ≤ 4 hours

---

## Day 3: Scanning (1.5-2 hours)

**Goal:** Detect **9/10 violations** (≥90%) via `aphoria scan` AND create extractors for gaps

**⚠️ THIS IS THE CORE FLYWHEEL STEP** - Day 3 validates autonomous learning. Do NOT skip extractor creation.

**Process:**

### Phase 1: Pre-Flight Check (5 min) **[REQUIRED]**

```bash
# Verify skill availability
/help | grep aphoria-custom-extractor-creator
# Expected: skill listed and available

# Verify inline markers present
grep -r "@aphoria:claim" src/ | wc -l
# Expected: 10 markers

# Verify code compiles
cargo check
# Expected: 0 errors (warnings OK)
```

If any check fails, STOP and fix before proceeding.

### Phase 2: Baseline Scan (15 min)

```bash
cd /path/to/aphoria/dogfood/cachewrap
aphoria scan --format json > scan-v1.json
aphoria scan --format markdown > scan-v1.md
```

**Expected on FIRST scan:**
- Low detection rate (0-30%, i.e. at most 2-3 of 10 violations) is **NORMAL** for new domain
- Built-in extractors may catch: hardcoded credentials, TLS=false
- Most violations (TTL, key injection, eviction) will be **MISSING**
- This is NOT a failure - it's the signal that Phase 4 is needed

### Phase 3: Gap Analysis (15 min) **[REQUIRED]**

Analyze `scan-v1.json`:

```bash
jq '.findings[] | select(.verdict == "MISSING") | .claim_id' scan-v1.json
```

Create gap table in `DAY3-SUMMARY.md`:

| Violation | Claim ID | Detected? | Why Not? |
|-----------|----------|-----------|----------|
| Key injection | cache-001 | ❌ | No key validation extractor |
| TLS disabled | cache-002 | ✅ | Built-in TLS extractor |
| Hardcoded password | cache-003 | ✅ | Built-in secrets extractor |
| Missing TTL | cache-004 | ❌ | No TTL presence extractor |
| Unbounded size | cache-005 | ❌ | No max_size extractor |
| Sync blocking | cache-006 | ❌ | No async/await extractor |
| No eviction policy | cache-007 | ❌ | No eviction config extractor |
| Zero timeout | cache-008 | ⚠️ | Maybe (timeout extractor exists) |
| No pooling | cache-009 | ❌ | No connection pool extractor |
| No metrics | cache-010 | ❌ | No metrics field extractor |

**Expected:** 2-3 detected (built-in), 7-8 missing (need extractors)
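To turn the gap table into the detection-rate figure used in the metrics, a small helper is enough. This is a sketch under the assumption that each table row is reduced to one of three states; `Status` and `detection_rate` are hypothetical names, and an uncertain row (⚠️) is counted as not detected.

```rust
/// One row of the gap table, reduced to its "Detected?" column.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status {
    Detected,
    Missing,
    /// The "maybe" case; counted as not detected until confirmed.
    Uncertain,
}

/// Detection rate as a percentage of all rows.
pub fn detection_rate(rows: &[Status]) -> f64 {
    if rows.is_empty() {
        return 0.0;
    }
    let detected = rows.iter().filter(|s| **s == Status::Detected).count();
    // Multiply before dividing to keep simple ratios exact.
    detected as f64 * 100.0 / rows.len() as f64
}

/// True when the rate reaches the target (e.g. 90.0 for the v2 scan).
pub fn meets_target(rows: &[Status], target_pct: f64) -> bool {
    detection_rate(rows) >= target_pct
}
```

For the example table (2 detected, 1 uncertain, 7 missing) this gives 20%; the same helper applied to the v2 scan results checks the 90% target.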

### Phase 4: Extractor Creation (40 min) **[REQUIRED - DO NOT SKIP]**

**⚠️ CRITICAL:** This step is REQUIRED. Skipping this breaks the autonomous learning flywheel.

For EACH missed violation (7-8 total), use the skill:

```bash
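# The negative lookahead must precede `.*`; 'SET.*(?!EX|PX)' would match every SET.
# Assumes the extractor's regex engine supports lookahead.
/aphoria-custom-extractor-creator \
  --violation "cache SET without TTL" \
  --claim "cache-004" \
  --pattern '\bSET\b(?!.*\b(?:EX|PX)\b)' \
  --language rust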
```

Repeat for:
- Key injection (no `validate_key()` call)
- Unbounded cache size (`max_size: None`)
- Synchronous blocking (`blocking_get()` in async)
- No eviction policy (`eviction_policy: None`)
- No connection pooling (`Client::open()` in loop)
- No metrics (`metrics_enabled: false`)

**Expected:** 7-8 extractor files created in `.aphoria/extractors/`
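The intent of the TTL extractor (flag a `SET` that carries no expiry option) can be checked without any look-around support, which matters if the extractor engine turns out to be built on a regex flavor that lacks it. The sketch below assumes the command is available as a whitespace-separated string; `set_missing_ttl` is a hypothetical helper, not part of Aphoria, and it does not handle quoted values containing spaces.

```rust
/// Returns true when `cmd` is a Redis `SET` that carries no expiry option.
/// Assumes whitespace-separated tokens: `SET key value [options...]`.
pub fn set_missing_ttl(cmd: &str) -> bool {
    let mut tokens = cmd.split_whitespace();
    let is_set = tokens
        .next()
        .map_or(false, |t| t.eq_ignore_ascii_case("SET"));
    if !is_set {
        return false; // Not a SET command (e.g. SETEX, GET): nothing to flag.
    }
    // Skip key and value so a value that happens to be "EX" is not misread.
    let mut options = tokens.skip(2);
    let has_expiry = options.any(|t| {
        ["EX", "PX", "EXAT", "PXAT"]
            .iter()
            .any(|opt| t.eq_ignore_ascii_case(opt))
    });
    !has_expiry
}
```

Tokenizing instead of pattern matching avoids the false positive where the cached value itself contains `EX` or `PX`.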

### Phase 5: Verification Scan (20 min) **[REQUIRED]**

```bash
aphoria scan --format json > scan-v2.json
```

**Expected:**
- Detection rate ≥90% (9/10 or 10/10 violations)
- Gap closed: Missing → Detected
- 0 false positives

Compare scans:
```bash
echo "Scan v1 detections:"
jq '.summary.authority_conflicts' scan-v1.json
echo "Scan v2 detections:"
jq '.summary.authority_conflicts' scan-v2.json
```

### Phase 6: Documentation (15 min) **[REQUIRED]**

Complete `DAY3-SUMMARY.md` (started in Phase 3 with the gap table) with:

```markdown
# Day 3 Summary: Scanning & Extractor Creation

**Date:** 2026-02-XX
**Duration:** X hours

## Metrics

| Metric | Target | Actual | Delta |
|--------|--------|--------|-------|
| Detection rate (v1) | 20% | X% | +/- |
| Detection rate (v2) | ≥90% | X% | +/- |
| Extractors created | 7-8 | X | +/- |
| Time spent | ≤2 hrs | X hrs | +/- |

## Extractors Created

1. `key_validation_check.toml` - Detects missing `validate_key()`
2. `ttl_presence.toml` - Detects SET without EX/PX
3. `max_size_check.toml` - Detects `max_size: None`
...

## What Worked

- ✅ Built-in extractors caught TLS + hardcoded secrets
- ✅ Custom extractors closed gap to 90%+
- ✅ Flywheel workflow (scan → gap → extract → verify) smooth

## What Broke

- ❌ {Any issues encountered}

## Next Steps

- [ ] Day 4: Fix violations progressively
```

**Target Output:**
- `scan-v1.json` and `scan-v2.json` (baseline + verification)
- **7-8 extractor files** in `.aphoria/extractors/`
- `DAY3-SUMMARY.md` with metrics

**Success Criteria:**
- ✅ Pre-flight checks pass
- ✅ **7-8 extractors created** (one per missed violation) - **CRITICAL**
- ✅ Detection rate ≥ 90% in v2 scan
- ✅ Detection rate improvement documented (v1 → v2)
- ✅ Zero false positives
- ✅ Time ≤ 2 hours

**Evidence of Correct Execution:**
```bash
ls .aphoria/extractors/*.toml | wc -l  # Should be: 7-8
ls scan-v2.json                        # Should exist
ls DAY3-SUMMARY.md                     # Should exist
```

If ANY of these are missing, Day 3 was not completed correctly.

---

## Day 4: Remediation (3-4 hours)

**Goal:** Progressive fixes - remove all 10 violations, verify 0 conflicts

**Process:**

### 1. Fix Violations One-by-One (2.5 hours)

Fix in order of severity (security → performance → correctness → observability):

**Round 1: Security (30 min)**
- Fix violation 1: Add `validate_key()` function
- Fix violation 2: Set `verify_tls: true`
- Fix violation 3: Load credentials from `env::var("REDIS_PASSWORD")`
- After each fix: `aphoria scan` → verify conflict count decreases
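For violation 3, loading the password from the environment might look like the sketch below. `ConfigError`, `password_from`, and `load_password` are hypothetical names; the lookup is passed in as a closure so the logic can be tested without mutating the process environment, which is a design choice of this example and not something the plan prescribes.

```rust
use std::env;

/// Hypothetical error type for credential loading.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    MissingPassword,
    EmptyPassword,
}

/// Resolves the Redis password through `lookup`, which abstracts the
/// environment so tests do not need to touch real process state.
pub fn password_from<F>(lookup: F) -> Result<String, ConfigError>
where
    F: Fn(&str) -> Option<String>,
{
    match lookup("REDIS_PASSWORD") {
        None => Err(ConfigError::MissingPassword),
        Some(p) if p.is_empty() => Err(ConfigError::EmptyPassword),
        Some(p) => Ok(p),
    }
}

/// Production entry point: reads the real process environment.
pub fn load_password() -> Result<String, ConfigError> {
    password_from(|name| env::var(name).ok())
}
```

Failing on a missing or empty variable, instead of falling back to a default, keeps a placeholder password from silently reappearing in the code.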

**Round 2: Performance (45 min)**
- Fix violation 4: Add TTL parameter to `set()` method
- Fix violation 5: Set `max_size: Some(1000)` in config
- Fix violation 6: Make all methods `async`, remove blocking calls
- After each fix: Re-scan
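For violation 4, one way to make the TTL impossible to forget is to require it in the function signature and reject zero. The sketch below only builds the argument list for `SET key value EX seconds`; how those arguments reach Redis depends on the client code, which is not shown here. `set_with_ttl_args` and `TtlError` are hypothetical names.

```rust
use std::time::Duration;

/// Hypothetical error for an unusable TTL.
#[derive(Debug, PartialEq)]
pub enum TtlError {
    Zero,
}

/// Builds the argument list for `SET key value EX seconds`.
/// The TTL is a required parameter, so callers cannot omit it.
pub fn set_with_ttl_args(
    key: &str,
    value: &str,
    ttl: Duration,
) -> Result<Vec<String>, TtlError> {
    // EX takes whole seconds; round sub-second TTLs up so they do not become 0.
    let secs = ttl.as_secs() + u64::from(ttl.subsec_nanos() > 0);
    if secs == 0 {
        return Err(TtlError::Zero);
    }
    Ok(vec![
        "SET".to_string(),
        key.to_string(),
        value.to_string(),
        "EX".to_string(),
        secs.to_string(),
    ])
}
```

Because the TTL becomes a mandatory argument, this is a breaking API change for existing `set(key, value)` callers.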

**Round 3: Correctness (45 min)**
- Fix violation 7: Set `eviction_policy: Some(EvictionPolicy::LRU)`
- Fix violation 8: Change `timeout` to `Duration::from_secs(5)`
- Fix violation 9: Use an async pool such as `bb8` for connection pooling (`r2d2` is synchronous and would reintroduce blocking in async code)
- After each fix: Re-scan
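The config-level fixes (violations 5, 7, and 8) can be guarded by a validation step so they do not regress. This is a minimal sketch assuming a `CacheConfig` with the three fields named in the violation list; the real struct in `src/config.rs` is not shown in this plan, and `ConfigViolation` and `violations()` are hypothetical.

```rust
use std::time::Duration;

/// Eviction policies named in the plan (LRU/LFU).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EvictionPolicy {
    LRU,
    LFU,
}

/// Assumed shape of the config; only the fields relevant here.
pub struct CacheConfig {
    pub timeout: Duration,
    pub max_size: Option<usize>,
    pub eviction_policy: Option<EvictionPolicy>,
}

/// Hypothetical names for the config-level violations.
#[derive(Debug, PartialEq)]
pub enum ConfigViolation {
    ZeroTimeout,
    UnboundedSize,
    NoEvictionPolicy,
}

impl CacheConfig {
    /// Lists every config-level violation still present.
    pub fn violations(&self) -> Vec<ConfigViolation> {
        let mut found = Vec::new();
        if self.timeout.is_zero() {
            found.push(ConfigViolation::ZeroTimeout);
        }
        if self.max_size.is_none() {
            found.push(ConfigViolation::UnboundedSize);
        }
        if self.eviction_policy.is_none() {
            found.push(ConfigViolation::NoEvictionPolicy);
        }
        found
    }
}
```

Calling `violations()` from `CacheClient::new(config)` and refusing to start on a non-empty result would make these checks fail fast at startup.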

**Round 4: Observability (30 min)**
- Fix violation 10: Add `hit_count`, `miss_count` metrics fields
- Final scan: `aphoria scan --format json > scan-final.json`
- Verify: `jq '.summary.authority_conflicts' scan-final.json` → 0
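For violation 10, the `hit_count` and `miss_count` fields could be implemented with atomics so they can be updated through `&self` from concurrent tasks. The sketch below is one possible shape; `CacheMetrics`, `record`, and `hit_rate` are hypothetical names, and exporting the counters to a metrics backend is out of scope here.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hit/miss counters that can be shared across tasks behind `&self`.
#[derive(Default)]
pub struct CacheMetrics {
    hit_count: AtomicU64,
    miss_count: AtomicU64,
}

impl CacheMetrics {
    /// Records the outcome of one `get` call.
    pub fn record(&self, hit: bool) {
        let counter = if hit { &self.hit_count } else { &self.miss_count };
        // Relaxed is enough: the counters are independent statistics.
        counter.fetch_add(1, Ordering::Relaxed);
    }

    /// Fraction of lookups that were hits, or `None` before any lookup.
    pub fn hit_rate(&self) -> Option<f64> {
        let hits = self.hit_count.load(Ordering::Relaxed);
        let misses = self.miss_count.load(Ordering::Relaxed);
        let total = hits + misses;
        if total == 0 {
            None
        } else {
            Some(hits as f64 / total as f64)
        }
    }
}
```

In `get()`, a call such as `metrics.record(value.is_some())` after the cache lookup would be enough to populate both counters.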

### 2. Document Fix Times (30 min)

In `DAY4-SUMMARY.md`, record the actual fix time per violation (the values below are estimates):

| Violation | Fix Time | Complexity | Notes |
|-----------|----------|------------|-------|
| 1. Key injection | 10 min | Low | Added `validate_key()` regex |
| 2. TLS disabled | 2 min | Trivial | Config flip |
| 3. Hardcoded password | 5 min | Low | `env::var()` |
| 4. Missing TTL | 15 min | Medium | API change (breaking) |
| 5. Unbounded size | 2 min | Trivial | Config value |
| 6. Sync blocking | 20 min | Medium | Async conversion |
| 7. No eviction | 10 min | Low | Config + enum |
| 8. Zero timeout | 2 min | Trivial | Config value |
| 9. No pooling | 25 min | High | Add pooling dependency (e.g. `bb8`) |
| 10. No metrics | 15 min | Medium | Add struct fields |

**Total:** ~106 min (~1.8 hours) for fixes

### 3. Verify All Tests Still Pass (30 min)

```bash
cargo test
# All tests should pass with compliant code
```

If tests fail, fix issues before considering Day 4 complete.

**Target Output:**
- All 10 violations fixed
- Progressive scan results (scan-v1, scan-v2, scan-final)
- `DAY4-SUMMARY.md` with fix times
- Final scan: 0 conflicts

**Success Criteria:**
- ✅ Final scan: 0 conflicts
- ✅ Each fix verified independently via scan
- ✅ All tests passing
- ✅ Time ≤ 4 hours

---

## Day 5: Documentation (2-3 hours)

**Goal:** Comprehensive report with metrics, findings, product gaps

**Process:**

### 1. Write Final Report (2 hours)

Create `DAY5-DOGFOODING-REPORT.md` with sections:

**Executive Summary (15 min)**
- Hypothesis result (validated/partial/invalidated)
- Key findings (2-3 bullet points)
- Metrics snapshot

**Metrics Table (15 min)**

| Metric | Target | Actual | Delta | Analysis |
|--------|--------|--------|-------|----------|
| Total time | 12-16 hrs | X hrs | +/- | Why different? |
| Pattern reuse | ≥35% | X% | +/- | Which patterns reused? |
| Detection rate | ≥90% | X% | +/- | What missed? |
| Naming errors | <2 | X | +/- | Examples? |
| Time savings | ≥60% | X% | +/- | vs manual |

**What Worked (30 min)**
- Multi-domain corpus transfer (3 corpora → cache)
- Cross-cutting violation detection (security + performance + correctness)
- Extractor creation workflow
- Skills integration

**What Broke (30 min)**
- Product gaps discovered (prioritize by severity)
- Blockers encountered
- Workarounds applied
- Root cause analysis

**Product Gap Analysis (20 min)**

| Gap ID | Title | Severity | Effort | ROI | Priority |
|--------|-------|----------|--------|-----|----------|
| VG-XXX | {Title} | High/Med/Low | High/Med/Low | High/Med/Low | P1/P2/P3 |

**Recommendations (20 min)**
- Immediate (this sprint)
- Short-term (next 2 sprints)
- Long-term (roadmap)

### 2. Update README (15 min)

Add completion status to README.md:

```markdown
## Status

- [x] **Day 1:** Claims extraction (X hrs) - Y claims, Z% reuse
- [x] **Day 2:** Implementation (X hrs) - 10 violations, N tests
- [x] **Day 3:** Scanning (X hrs) - Y/10 detected
- [x] **Day 4:** Remediation (X hrs) - 0 conflicts
- [x] **Day 5:** Documentation (X hrs) - Report complete

**Final Metrics:**
- Time: X hrs (target: 12-16)
- Reuse: Y% (target: ≥35%)
- Detection: Z% (target: ≥90%)
```

### 3. Archive Artifacts (15 min)

Organize files:
- Move `DAY{1-4}-SUMMARY.md` to `summaries/`
- Keep `DAY5-DOGFOODING-REPORT.md` at root
- Archive scan results in `scans/`

**Target Output:**
- `DAY5-DOGFOODING-REPORT.md` (comprehensive, 600-800 lines)
- Updated README with completion status
- Organized artifacts

**Success Criteria:**
- ✅ All metrics quantified
- ✅ Product gaps prioritized (P1/P2/P3)
- ✅ Recommendations actionable
- ✅ Time ≤ 3 hours

---

## Success Metrics

| Metric | Target | Actual | Delta |
|--------|--------|--------|-------|
| Total time | 12-16 hrs | ___ | ___ |
| Pattern reuse | ≥35% | ___ | ___ |
| Detection rate | ≥90% | ___ | ___ |
| Naming errors | <2 | ___ | ___ |
| Time savings | ≥60% | ___ | ___ |

---

## Authority Sources

### Redis Protocol Specification (Tier 1)
- **URL:** https://redis.io/docs/reference/protocol-spec/
- **Relevance:** TTL commands (SETEX, EXPIRE), eviction policies, consistency
- **Covered Claims:** TTL, eviction, key formats, command semantics

### AWS ElastiCache Best Practices (Tier 2)
- **URL:** https://docs.aws.amazon.com/elasticache/
- **Relevance:** Security (TLS, auth), performance (connection pooling), monitoring
- **Covered Claims:** TLS verification, connection limits, metrics, timeouts

### redis-rs Library Documentation (Tier 3)
- **URL:** https://docs.rs/redis/
- **Relevance:** Rust-specific patterns, connection management, async usage
- **Covered Claims:** Connection pooling, async patterns, error handling

---

## References

- **httpclient dogfood:** `dogfood/httpclient/` (gold standard)
- **dbpool dogfood:** `dogfood/dbpool/` (connection patterns)
- **msgqueue dogfood:** `dogfood/msgqueue/` (async patterns)
- **Claims authoring:** `.claude/skills/aphoria-claims/`
- **Pattern discovery:** `.claude/skills/aphoria-suggest/`
- **Extractor creation:** `.claude/skills/aphoria-custom-extractor-creator/`

---

**You are ready to start Day 1!** Follow this plan and track metrics daily.