# Dogfood: Distributed Cache Client Library (cachewrap)

**Hypothesis:** Connection patterns, resource limits, and TTL semantics from three corpora (httpclient, dbpool, msgqueue) transfer to cache clients with **35-40%** pattern reuse, showing that the flywheel holds across multiple domains and can detect cross-cutting concerns.

**Corpus Overlap:** httpclient + dbpool + msgqueue → **~35-40%** pattern reuse expected

**Target Metrics:**
- Time savings: **≥60%** vs manual
- Pattern reuse: **≥35%** of claims (7+/20)
- Detection rate: **≥90%** of violations (9+/10)
- Naming errors: **<2**
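
The count-based targets above reduce to simple integer arithmetic. The following is a minimal sketch, assuming a hypothetical `RunResults` struct (not part of Aphoria) that holds the raw counts for one run; the time-savings target is left out because it needs a manual baseline.

```rust
/// Raw counts for one dogfood run. The struct and its field names are
/// illustrative assumptions, not an Aphoria data type.
#[derive(Debug, Clone, Copy)]
pub struct RunResults {
    pub claims_total: u32,
    pub claims_reused: u32,
    pub violations_total: u32,
    pub violations_detected: u32,
    pub naming_errors: u32,
}

/// Integer percentage, rounded down; returns 0 when `whole` is 0.
pub fn percent(part: u32, whole: u32) -> u32 {
    if whole == 0 {
        0
    } else {
        part * 100 / whole
    }
}

impl RunResults {
    /// Pattern reuse target: at least 35% of claims.
    pub fn reuse_ok(&self) -> bool {
        percent(self.claims_reused, self.claims_total) >= 35
    }

    /// Detection target: at least 90% of embedded violations.
    pub fn detection_ok(&self) -> bool {
        percent(self.violations_detected, self.violations_total) >= 90
    }

    /// Naming target: fewer than 2 naming errors.
    pub fn naming_ok(&self) -> bool {
        self.naming_errors < 2
    }
}
```

With 20 claims and 10 embedded violations, 7 reused claims meet the reuse target exactly, while the detection target requires at least 9 detected violations.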

---

## Why This Domain? (Difficulty: ★★★★☆)

Cache clients test whether patterns from **3 different domains** (HTTP, DB, messaging) transfer to a fourth domain with **cross-cutting violations**:

✅ **Connection patterns** from httpclient (timeout, TLS, async, retry)
✅ **Resource limits** from dbpool (max connections, lifecycle, cleanup)
✅ **Semantic patterns** from msgqueue (backpressure, metrics)
✅ **New patterns** unique to caching (TTL, eviction, sharding, consistency)

**What Makes This Harder:**
- **Lower corpus overlap** (35-40% vs msgqueue's 50%)
- **Cross-cutting violations** (security + performance + correctness)
- **Stateful semantics** (cache invalidation, TTL expiry, consistency)
- **Subtle bugs** (key injection, unbounded growth, race conditions)

This exercise tests **multi-domain flywheel adaptability**: whether knowledge compounds across domains.

---

## Quick Start

1. **Read the plan:** `plan.md` (detailed 5-day workflow)
2. **Start Day 1:** Use `/aphoria-suggest --corpus httpclient,dbpool,msgqueue` to discover reusable patterns
3. **Follow the workflow:** Track metrics daily, write summaries
4. **Reference examples:** See `dogfood/httpclient/` for complete example

---

## Status

- [x] **Day 1:** Claims extraction (11 min) - ✅ 20 claims (7 reused = 35%)
- [x] **Day 2:** Implementation (10 min) - ✅ 10 violations embedded, 16 tests pass
- [x] **Day 3:** Scanning (9 min) - ⚠️ 5/10 violations detected (50%)
- [x] **Day 4:** Remediation (25 min) - ✅ All 10 violations fixed
- [ ] **Day 5:** Documentation (in progress) - Comprehensive report + retrospective

**Total Time:** About 56 minutes (Days 1-4), more than 90% below the 12-16 hour target

**Final Status:** ✅ Production-ready with secure defaults
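
The total and the savings can be derived from the per-day minutes in the checklist above. This is a minimal sketch with hypothetical helper names; the listed per-day values may be rounded, so the computed total can differ slightly from the reported one, and the 12-16 hour baseline is taken from the stated target rather than from a measured manual run.

```rust
/// Minutes recorded for Days 1-4, copied from the status checklist.
pub const DAY_MINUTES: [u32; 4] = [11, 10, 9, 25];

/// Sum of the per-day minutes.
pub fn total_minutes(days: &[u32]) -> u32 {
    days.iter().sum()
}

/// Percentage of time saved relative to a baseline given in hours.
pub fn time_savings_percent(actual_min: u32, baseline_hours: u32) -> f64 {
    let baseline_min = f64::from(baseline_hours) * 60.0;
    (1.0 - f64::from(actual_min) / baseline_min) * 100.0
}
```

Calling `time_savings_percent(total_minutes(&DAY_MINUTES), 12)` compares against the low end of the target range; passing `16` compares against the high end.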

---

## Expected Pattern Reuse (7/20 = 35%)

### From httpclient Corpus (4 patterns):
- `timeout` → `cache/timeout`
- `tls/certificate_validation` → `tls/certificate_validation`
- `retry/max_attempts` → `retry/max_attempts`
- `async/runtime` → `async/runtime`

### From dbpool Corpus (2 patterns):
- `max_connections` → `connection/max_connections`
- `connection_lifecycle` → `connection/lifecycle`

### From msgqueue Corpus (1 pattern):
- `metrics/enabled` → `metrics/enabled`

### New for Cache Client (13 patterns):
- `cache/ttl` (Time To Live)
- `cache/eviction_policy`
- `cache/max_size`
- `cache/key_prefix`
- `cache/serialization`
- `cache/compression`
- `cache/consistency_mode`
- `cache/sharding_strategy`
- `cache/read_through`
- `cache/write_through`
- `cache/stampede_prevention`
- `cache/key_validation`
- `cache/circuit_breaker`

**Total:** 20 claims (7 reused = 35% reuse rate)
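
The mapping above can be written down as data, which makes the per-corpus counts and the reuse rate easy to check. This is an illustrative sketch only: the constant and function names are assumptions, and the format is not the one used by `claims.toml` or `claims-template.sh`.

```rust
/// (source corpus, source pattern, cachewrap claim path), copied from the lists above.
pub const REUSED: [(&str, &str, &str); 7] = [
    ("httpclient", "timeout", "cache/timeout"),
    ("httpclient", "tls/certificate_validation", "tls/certificate_validation"),
    ("httpclient", "retry/max_attempts", "retry/max_attempts"),
    ("httpclient", "async/runtime", "async/runtime"),
    ("dbpool", "max_connections", "connection/max_connections"),
    ("dbpool", "connection_lifecycle", "connection/lifecycle"),
    ("msgqueue", "metrics/enabled", "metrics/enabled"),
];

/// Claim paths that are new for the cache client.
pub const NEW_CLAIMS: [&str; 13] = [
    "cache/ttl",
    "cache/eviction_policy",
    "cache/max_size",
    "cache/key_prefix",
    "cache/serialization",
    "cache/compression",
    "cache/consistency_mode",
    "cache/sharding_strategy",
    "cache/read_through",
    "cache/write_through",
    "cache/stampede_prevention",
    "cache/key_validation",
    "cache/circuit_breaker",
];

/// Number of reused patterns contributed by one corpus.
pub fn reused_from(corpus: &str) -> usize {
    REUSED.iter().filter(|entry| entry.0 == corpus).count()
}

/// Reused claims as an integer percentage of all claims (rounded down).
pub fn reuse_rate_percent() -> usize {
    REUSED.len() * 100 / (REUSED.len() + NEW_CLAIMS.len())
}
```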

---

## Violations to Embed (Day 2) - Cross-Cutting

### Security Violations (3):
1. ❌ **Key injection vulnerability** - No key validation → Data breach
2. ❌ **verify_tls = false** - No TLS verification → MITM attacks
3. ❌ **Plaintext credential storage** - Hardcoded password → Credential exposure
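
For the first violation, the remediation amounts to validating untrusted key material before it reaches the cache. The following is a minimal sketch of one possible check, not the cachewrap implementation: `KeyError`, `MAX_KEY_LEN`, `validate_key`, and `namespaced_key` are hypothetical names, and the 256-byte limit is an assumption chosen for illustration.

```rust
/// Reasons a cache key can be rejected (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum KeyError {
    Empty,
    TooLong,
    IllegalChar(char),
}

/// Assumed length limit in bytes for this sketch; not taken from the source.
pub const MAX_KEY_LEN: usize = 256;

/// Rejects empty keys, overlong keys, and keys containing whitespace or
/// control characters.
pub fn validate_key(key: &str) -> Result<(), KeyError> {
    if key.is_empty() {
        return Err(KeyError::Empty);
    }
    if key.len() > MAX_KEY_LEN {
        return Err(KeyError::TooLong);
    }
    // CR/LF and other control characters are the usual vehicle for smuggling
    // extra commands into a line-oriented protocol, so they are refused.
    if let Some(c) = key.chars().find(|c| c.is_control() || c.is_whitespace()) {
        return Err(KeyError::IllegalChar(c));
    }
    Ok(())
}

/// Builds a prefixed key only after the untrusted part passes validation.
pub fn namespaced_key(prefix: &str, user_part: &str) -> Result<String, KeyError> {
    validate_key(user_part)?;
    Ok(format!("{}:{}", prefix, user_part))
}
```

An allow-list of permitted characters would be stricter than this deny-list; which one fits depends on the key formats the client has to support.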

### Performance Violations (3):
4. ❌ **Missing TTL** - No expiration → Memory leak (unbounded growth)
5. ❌ **Unbounded cache size** - No max_size → OOM under load
6. ❌ **Synchronous blocking** - No async I/O → Throughput collapse
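
Violations 4 and 5 both describe unbounded growth, and the fix for each is a hard limit: an expiry time per entry and a cap on the entry count. The sketch below shows the idea with a hypothetical in-memory `BoundedCache`; it is not the cachewrap client, it does not address synchronous blocking, and its "evict the entry closest to expiry" rule is an assumption standing in for a real LRU/LFU policy. The current time is passed in explicitly so the behavior is deterministic.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// In-memory cache with a TTL on every entry and a cap on entry count.
pub struct BoundedCache {
    entries: HashMap<String, (String, Instant)>, // value, expiry time
    max_entries: usize,
    default_ttl: Duration,
}

impl BoundedCache {
    pub fn new(max_entries: usize, default_ttl: Duration) -> Self {
        BoundedCache { entries: HashMap::new(), max_entries, default_ttl }
    }

    /// Inserts a value that expires `default_ttl` after `now`.
    pub fn insert_at(&mut self, key: &str, value: &str, now: Instant) {
        // Drop entries whose TTL has already elapsed.
        self.entries.retain(|_, (_, expires)| *expires > now);

        // If still full, evict the entry closest to expiry (assumed policy).
        if !self.entries.contains_key(key) && self.entries.len() >= self.max_entries {
            let victim: Option<String> = self
                .entries
                .iter()
                .min_by_key(|(_, (_, expires))| *expires)
                .map(|(k, _)| k.clone());
            if let Some(k) = victim {
                self.entries.remove(&k);
            }
        }

        self.entries
            .insert(key.to_string(), (value.to_string(), now + self.default_ttl));
    }

    /// Returns the value only if it has not expired at `now`.
    pub fn get_at(&self, key: &str, now: Instant) -> Option<&str> {
        self.entries
            .get(key)
            .filter(|(_, expires)| *expires > now)
            .map(|(value, _)| value.as_str())
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```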

### Correctness Violations (3):
7. ❌ **No eviction policy** - Missing LRU/LFU → Unpredictable behavior
8. ❌ **timeout = 0** - Indefinite blocking → Hung threads
9. ❌ **No connection pooling** - New conn per request → Resource exhaustion

### Observability Violation (1):
10. ⚠️ **No metrics** - Missing hit/miss tracking → Debugging impossible
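
Several of the violations above are visible in configuration alone. The sketch below shows what such a check could look like; it is an illustration, not Aphoria's scanner or extractor output. `CacheConfig` and its fields are hypothetical, while the returned strings reuse the claim paths listed earlier in this README. Key injection, hardcoded credentials, and synchronous blocking need code-level inspection and are not covered.

```rust
/// Hypothetical client configuration; field names are assumptions.
#[derive(Debug, Clone)]
pub struct CacheConfig {
    pub verify_tls: bool,
    pub timeout_ms: u64,
    pub default_ttl_secs: Option<u64>,
    pub max_size: Option<usize>,
    pub eviction_policy: Option<String>,
    pub max_connections: Option<u32>,
    pub metrics_enabled: bool,
}

/// Returns the claim path of every configuration-level violation found.
pub fn find_violations(cfg: &CacheConfig) -> Vec<&'static str> {
    let mut found = Vec::new();
    if !cfg.verify_tls {
        found.push("tls/certificate_validation"); // MITM exposure
    }
    if cfg.timeout_ms == 0 {
        found.push("cache/timeout"); // indefinite blocking
    }
    if cfg.default_ttl_secs.is_none() {
        found.push("cache/ttl"); // entries never expire
    }
    if cfg.max_size.is_none() {
        found.push("cache/max_size"); // unbounded growth
    }
    if cfg.eviction_policy.is_none() {
        found.push("cache/eviction_policy"); // unpredictable behavior when full
    }
    if cfg.max_connections.is_none() {
        found.push("connection/max_connections"); // no pooling limit
    }
    if !cfg.metrics_enabled {
        found.push("metrics/enabled"); // no hit/miss tracking
    }
    found
}
```

A configuration with secure defaults should produce an empty list, which gives a quick regression check after remediation.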

---

## Files

```
cachewrap/
├── README.md                    # This file
├── plan.md                      # Detailed 5-day workflow
├── .aphoria/
│   ├── config.toml              # Persistent mode, corpus enabled
│   └── claims.toml              # (empty, fill on Day 1)
├── docs/
│   └── sources/                 # Authority sources
│       ├── redis-spec.md        # Redis protocol (Tier 1)
│       ├── aws-elasticache.md   # AWS best practices (Tier 2)
│       └── redis-rs-lib.md      # Rust library patterns (Tier 3)
├── src/                         # (create on Day 2)
│   └── .gitkeep
├── claims-template.sh           # Batch claim import (20 claims)
└── DAY1-SUMMARY.md              # (create after Day 1)
```

---

## References

- **Plan:** `plan.md` (start here)
- **Authority sources:** `docs/sources/` (use for provenance)
- **Complete example:** `dogfood/httpclient/` (gold standard)
- **Similar domains:** `dogfood/dbpool/`, `dogfood/msgqueue/`
- **Skills:**
  - `/aphoria-suggest` - Day 1 pattern discovery
  - `/aphoria-claims` - Day 1 claim authoring
  - `/aphoria-custom-extractor-creator` - Day 3 extractor generation

---

## Success Criteria

| Metric | Target | Validates |
|--------|--------|-----------|
| Pattern reuse | ≥35% | Multi-domain flywheel works |
| Time savings | ≥60% | Automation value at lower reuse rate |
| Detection rate | ≥90% | Cross-cutting violation detection |
| Naming errors | <2 | 3-corpus consistency |
| Total time | 12-16 hrs | Difficulty calibration |

---

## What This Tests (vs Previous Exercises)

| Exercise | Corpus Sources | Reuse % | Difficulty | What It Tests |
|----------|----------------|---------|------------|---------------|
| httpclient | None (baseline) | 0% | ★★☆☆☆ | Async patterns, HTTP |
| dbpool | httpclient | 30% | ★★★☆☆ | Connection lifecycle |
| msgqueue | httpclient + dbpool | 50% | ★★★☆☆ | Cross-domain transfer (2→1) |
| **cachewrap** | **httpclient + dbpool + msgqueue** | **35%** | **★★★★☆** | **Multi-domain (3→1), cross-cutting** |

**Progressive Challenge:**
- msgqueue: 2 corpora → 50% reuse (easier)
- **cachewrap: 3 corpora → 35% reuse (harder, more discovery)**

---

**Next step:** Complete Day 5 (documentation and retrospective). Follow `plan.md` and keep tracking metrics daily.