# Setup Evaluation: cachewrap Dogfood Project

**Evaluation Date:** 2026-02-11
**Evaluator:** Claude (Setup Review Agent)
**Status:** ⚠️ **MOSTLY READY** (2 gaps to fix before starting)

---

## Executive Summary

The cachewrap dogfood project is **90% correctly set up** with excellent structure, hypothesis, and documentation. However, it's **missing critical Day 3 enhancements** that were added to msgqueue after its Day 3 failure.

**Must fix before Day 1:**
1. Add manual fallback format to Day 3 Phase 4
2. Add debug workflow to Day 3 Phase 5

**These fixes take ~10 minutes and prevent a Day 3 failure like msgqueue experienced.**

---

## Setup Checklist

### ✅ Correctly Set Up

#### Directory Structure (Perfect)

```
cachewrap/
├── README.md                  ✅ Excellent (hypothesis, metrics, status)
├── plan.md                    ⚠️ Good (needs Day 3 updates)
├── .aphoria/
│   ├── config.toml            ✅ Perfect (persistent mode, 3 corpus sources)
│   └── claims.toml            ✅ Ready (empty with instructions)
├── docs/
│   └── sources/               ✅ Perfect (3 authority sources)
│       ├── redis-spec.md      ✅ Template with extraction guide
│       ├── aws-elasticache.md ✅ Template ready
│       └── redis-rs-lib.md    ✅ Template ready
└── src/
    └── .gitkeep               ✅ Placeholder with instructions
```

**All expected directories and files present.**

---

#### README.md Quality (⭐⭐⭐⭐⭐ Excellent)

✅ **Hypothesis clearly stated:**
> "Connection patterns + resource limits + TTL semantics from 3 corpora (httpclient, dbpool, msgqueue) transfer to cache clients with 35-40% pattern reuse, demonstrating multi-domain flywheel strength"

✅ **Target metrics defined:**
- Time savings: ≥60% vs manual
- Pattern reuse: ≥35% (7/20 claims)
- Detection rate: ≥90% (9/10 violations)
- Naming errors: <2
- Total time: 12-16 hours

✅ **Difficulty calibrated:** ★★★★☆ (harder than msgqueue ★★★☆☆)

✅ **Corpus overlap explained:**
- httpclient: 4 patterns (timeout, TLS, retry, async)
- dbpool: 2 patterns (max_connections, lifecycle)
- msgqueue: 1 pattern (metrics)
- New: 13 cache-specific patterns

✅ **Violations categorized by type:**
- 3 security (key injection, TLS disabled, plaintext credentials)
- 3 performance (missing TTL, unbounded size, sync blocking)
- 3 correctness (no eviction, timeout=0, no pooling)
- 1 observability (no metrics)

✅ **Cross-cutting nature emphasized:** Tests whether flywheel works across security + performance + correctness boundaries simultaneously.

**This is gold-standard README quality.**

---

#### .aphoria/config.toml (⭐⭐⭐⭐⭐ Perfect)

✅ **Persistent mode enabled:**
```toml
[episteme]
mode = "persistent"
corpus_db = "/home/jml/.aphoria/corpus-db"
```

✅ **3 corpus sources configured:**
```toml
[corpus]
sources = ["httpclient", "dbpool", "msgqueue"]
```

✅ **Corpus flags enabled:**
```toml
include_rfc = true
include_owasp = true
include_vendor = true
use_community = true
```

✅ **Inline markers enabled:**
```toml
[extractors.inline_markers]
enabled = true
sync_to_pending = true
```

✅ **Comments explain extractor expectations:**
```toml
# Built-in extractors that may detect violations:
# - hardcoded_secrets: Detects violation 3
# - tls_config: Detects violation 2
# - timeout_config: May detect violation 8
#
# Custom extractors needed (created on Day 3):
# - key_validation: Violation 1
# - ttl_presence: Violation 4
# ...
```

**This config is production-ready.**

---

#### Authority Sources (⭐⭐⭐⭐☆ Very Good)

**redis-spec.md (Tier 1):**
- ✅ Template structure correct
- ✅ Extraction guide included
- ✅ Key claims identified (TTL, eviction, key validation, connection pooling)
- ✅ Placeholders for user to fill ("> **User fills in:** Fetch Redis command docs")

**aws-elasticache.md (Tier 2):**
- ✅ Template ready
- ✅ Best practices focus

**redis-rs-lib.md (Tier 3):**
- ✅ Template ready
- ✅ Community patterns focus

**Minor improvement:** Could pre-populate some quotes from well-known Redis docs, but templates are sufficient for dogfooding.

---

#### plan.md Day 1-2 (⭐⭐⭐⭐⭐ Excellent)

✅ **Day 1 process clear:**
- Step 1: Discover reusable patterns (30 min)
- Step 2: Draft new claims (30 min)
- Step 3: Author all claims (30 min)
- Step 4: Verify claims (10 min)

✅ **Day 2 process detailed:**
- Files to create listed (config.rs, client.rs, error.rs, lib.rs)
- Each violation mapped to file + line
- Inline marker syntax shown
- Test requirements specified (15+ tests)

✅ **Violations are realistic:**
- Not contrived (e.g., key injection via user input directly to Redis)
- Have clear consequences
- Inline markers documented

**Day 1-2 are production-ready.**

---

### ⚠️ Gaps to Fix (Day 3)

#### Gap 1: Missing Manual Fallback Format (Day 3 Phase 4)

**Problem:** plan.md Day 3 Phase 4 only shows skill invocation:

```bash
/aphoria-custom-extractor-creator \
  --violation "cache SET without TTL" \
  --claim "cache-004"
```

**But it doesn't show what to do if the skill is unavailable.**

**From msgqueue evaluation:** Teams need manual fallback with:
1. Complete declarative extractor TOML format
2. Emphasis that `subject` must EXACTLY match claim `concept_path`
3. Validation steps BEFORE scanning
4. Link to comprehensive reference doc

**What's needed:** Add after Phase 4 skill invocations:

```markdown
**If skill is unavailable:** You can manually create declarative extractors. Follow the format below:

**Manual Fallback (Declarative Extractor):**

Add to `.aphoria/config.toml` for EACH violation:

\```toml
[[extractors.declarative]]
name = "descriptive_name"
pattern = 'regex_pattern_matching_code'
languages = ["rust"]

[extractors.declarative.claim]
subject = "FULL_CLAIM_CONCEPT_PATH"  # ← Copy from claim's concept_path EXACTLY
predicate = "claim_predicate"
value = inverted_value  # false if claim expects true
confidence = 0.95
\```

**⚠️ CRITICAL:** `subject` must EXACTLY match your claim's `concept_path`.

**Example (TTL presence):**

\```toml
[[extractors.declarative]]
name = "ttl_presence_check"
# The lookahead must come BEFORE ".*"; written as 'SET.*(?!EX|PX)' it succeeds on every SET line.
pattern = '\bSET\b(?!.*\b(?:EX|PX)\b)'
languages = ["rust"]

[extractors.declarative.claim]
subject = "cachewrap/cache/ttl"  # ← Matches claim concept_path exactly
predicate = "required"
value = false  # Observing "NOT required" (violation)
confidence = 0.95
\```

**Validation Before Scanning:**

\```bash
# 1. Check subject matches claim concept_path
grep "subject =" .aphoria/config.toml
grep "concept_path =" .aphoria/claims.toml
# Subjects should match concept_paths EXACTLY

# 2. Test regex pattern matches code
#    (-P is needed: grep -E does not support lookahead)
grep -rP '\bSET\b(?!.*\b(?:EX|PX)\b)' src/
# Should find the violation line

# 3. Verify TOML syntax
cargo install taplo-cli
taplo lint .aphoria/config.toml
\```

**See also:** `../../docs/extractors/declarative-extractors.md` for complete reference.
```

**Why this matters:** msgqueue Day 3 failed TWICE because:
1. First attempt: Skipped extractor creation entirely
2. Second attempt: Created extractors with wrong `subject` format (missing prefix)

Manual fallback with validation prevents both failures.

---

#### Gap 2: Missing Debug Workflow (Day 3 Phase 5)

**Problem:** plan.md Day 3 Phase 5 shows the expected result but doesn't explain **what to do if detection rate is still 0%**.

**From msgqueue evaluation:** After creating 7 extractors, the team had 0% detection because extractor `subject` fields didn't match claim `concept_path` fields.

**What's needed:** Add after Phase 5 scan commands:

```markdown
**If detection rate is still 0% (extractors don't match claims):**

This means extractors ran but observations didn't align with claims. Debug:

\```bash
# Step 1: Verify observations were created
jq '.observations | length' scan-v2.json
# Expected: > 0 (if 0, patterns don't match code)

# Step 2: Compare observation paths vs claim paths
jq '.observations[].concept_path' scan-v2.json | sort -u
grep "concept_path =" .aphoria/claims.toml | sort -u
# Observation paths should END with same tail as claim paths

# Step 3: Check for tail-path mismatch
# Example mismatch:
#   - Observation: cache/ttl (extractor subject too short)
#   - Claim: cachewrap/cache/ttl (needs full path)
#   - Fix: Update extractor subject = "cachewrap/cache/ttl"

# Step 4: Verify predicate alignment
jq '.observations[].predicate' scan-v2.json | sort -u
grep "predicate =" .aphoria/claims.toml | sort -u
# Must match exactly
\```

**Common Issue:** Extractor `subject` doesn't match claim `concept_path`.

**Fix:** Update extractor subject to use full path matching claim.

**Example Fix:**

\```toml
# Before (WRONG):
[extractors.declarative.claim]
subject = "cache/ttl"  # ❌ Missing "cachewrap/" prefix

# After (CORRECT):
[extractors.declarative.claim]
subject = "cachewrap/cache/ttl"  # ✅ Matches claim exactly
\```

Re-scan after fixing:

\```bash
aphoria scan --format json > scan-v3.json
# Should now show 9/10 conflicts
\```
```

**Why this matters:** Without a debug workflow, teams spend hours in trial-and-error. With it, they can diagnose and fix alignment issues in 10 minutes.

---

### ✅ Not Missing (But Expected)

These are intentionally empty (correct for pre-Day-1 state):

- ✅ **No Cargo.toml** - Created on Day 2 when implementing code
- ✅ **No claims-template.sh** - Optional (can use CLI directly)
- ✅ **No src/*.rs files** - Created on Day 2
- ✅ **Empty claims.toml** - Filled on Day 1 via `/aphoria-claims`
- ✅ **No DAY1-SUMMARY.md** - Created after completing Day 1

---

## Comparison: cachewrap vs msgqueue Setup

| Aspect | msgqueue (before fixes) | cachewrap (current) | Status |
|--------|-------------------------|---------------------|--------|
| **Directory structure** | ✅ Complete | ✅ Complete | Equal |
| **README quality** | ✅ Excellent | ✅ Excellent | Equal |
| **Config.toml** | ✅ Perfect | ✅ Perfect | Equal |
| **Authority sources** | ✅ Complete | ✅ Complete | Equal |
| **Day 1-2 plan** | ✅ Detailed | ✅ Detailed | Equal |
| **Day 3 manual fallback** | ❌ Missing → caused failure | ❌ **Missing** | **Needs fix** |
| **Day 3 debug workflow** | ❌ Missing → caused failure | ❌ **Missing** | **Needs fix** |

**cachewrap is in the same state msgqueue was in BEFORE its Day 3 failures.**

**Good news:** We know exactly what to add (manual fallback + debug workflow) because the msgqueue failures showed what was missing.

---

## Validation Against Dogfooding Standards

### From `aphoria-dogfood` Skill Requirements:

✅ **1. Test Something New (Hypothesis Required):**
- Clear hypothesis: "3 corpora → 35-40% reuse in 4th domain"
- Specific and measurable

✅ **2. Reuse Is the Magic (30%+ Corpus Overlap):**
- Expected: 35% (7/20 claims)
- Justified by pattern analysis (4 from httpclient, 2 from dbpool, 1 from msgqueue)

✅ **3. Violations Must Be Intentional (7-10 with Consequences):**
- 10 violations planned
- Each has consequence
- Each has inline marker syntax documented

✅ **4. Quantify Everything (Metrics Required):**
- Time savings: ≥60%
- Pattern reuse: ≥35%
- Detection rate: ≥90%
- Naming errors: <2
- Total time: 12-16 hours

✅ **5. Follow the 5-Day Arc:**
- Day 1: Claims (1-2 hrs)
- Day 2: Implementation (3-4 hrs)
- Day 3: Scanning (1.5-2 hrs)
- Day 4: Remediation (3-4 hrs)
- Day 5: Documentation (2-3 hrs)

**All standards met except Day 3 manual fallback + debug workflow.**

---

## Difficulty Assessment

**Rated:** ★★★★☆ (4/5 stars)

**Justification (from README):**
- Lower corpus overlap (35% vs msgqueue's 50%)
- Cross-cutting violations (security + performance + correctness)
- Stateful semantics (cache invalidation, TTL, consistency)
- Subtle bugs (key injection, race conditions)

**Time estimate:** 12-16 hours (vs msgqueue's 8-10 hours)

**Is this realistic?**

Comparing to completed exercises:
- httpclient: 8-10 hrs (baseline, 0% reuse) ✅ Realistic
- msgqueue: 8-10 hrs (50% reuse) ✅ Realistic
- cachewrap: 12-16 hrs (35% reuse, higher complexity) ✅ **Realistic**

**Why longer despite corpus:**
- 3 corpus sources = more discovery time (Day 1 takes longer)
- 13 new patterns (vs msgqueue's 11) = more authoring (Day 1)
- 10 violations (vs msgqueue's 8) = more implementation (Day 2)
- Cross-cutting violations = more complex extractors (Day 3)

**Difficulty rating is well-calibrated.**

---

## Domain Validation

### Why Cache Client? (From README)

✅ **Tests multi-domain transfer:** Patterns from HTTP + DB + messaging → caching

✅ **Tests cross-cutting concerns:** Security + performance + correctness simultaneously

✅ **Tests stateful semantics:** TTL, eviction, consistency (harder than stateless HTTP)

✅ **Tests corpus adaptability:** 3 sources with 35% overlap

**This is a valid progression:**
1. httpclient: Baseline (no corpus)
2. dbpool: Single-source transfer (httpclient → dbpool)
3. msgqueue: Dual-source transfer (httpclient + dbpool → msgqueue)
4. **cachewrap: Triple-source transfer (httpclient + dbpool + msgqueue → cache)**

Each exercise increases complexity and validates a deeper aspect of the flywheel.

---

## Corpus Overlap Analysis

### Claimed Reuse (7/20 = 35%)

**From httpclient (4 patterns):**
- `timeout` → cache timeout ✅ Valid (connection timeout)
- `tls/certificate_validation` → cache TLS ✅ Valid (secure connection)
- `retry/max_attempts` → cache retry ✅ Valid (operation retry)
- `async/runtime` → cache async ✅ Valid (async I/O)

**From dbpool (2 patterns):**
- `max_connections` → cache max connections ✅ Valid (connection pooling)
- `connection_lifecycle` → cache connection lifecycle ✅ Valid (cleanup)

**From msgqueue (1 pattern):**
- `metrics/enabled` → cache metrics ✅ Valid (observability)

**Assessment:** All 7 reuse claims are **legitimate pattern transfers**. Not forced.

---

### New Patterns (13 cache-specific)

- TTL and expiration (3) ✅ Cache-specific
- Key validation and injection (2) ✅ Cache-specific
- Eviction policies (2) ✅ Cache-specific
- Serialization and compression (2) ✅ Cache-specific
- Consistency and sharding (2) ✅ Cache-specific
- Circuit breaker, stampede prevention (2) ✅ Cache-specific

**Assessment:** 13 new patterns are **genuinely cache-specific**, not variations of existing patterns.

**35% reuse estimate is realistic.**

---

## Recommendations

### Immediate (Before Starting Day 1) - ~10 minutes

**1. Add manual fallback to plan.md Day 3 Phase 4:**
- Copy format from `dogfood/msgqueue/plan.md` lines 303-341
- Adapt example from msgqueue → cachewrap
- Link to `../../docs/extractors/declarative-extractors.md`

**2. Add debug workflow to plan.md Day 3 Phase 5:**
- Copy format from `dogfood/msgqueue/plan.md` lines 342-385
- Adapt commands for cachewrap (subject paths, predicates)

**Impact:** Prevents Day 3 failure like msgqueue experienced (70 minutes wasted)

---

### Optional (Before Starting) - ~30 minutes

**3. Create `claims-template.sh`:**
- Batch script to create all 20 claims
- Reduces Day 1 time from 1-2 hours → 45 minutes
- See `dogfood/httpclient/create-claims.sh` for template

**4. Pre-populate authority sources:**
- Add 2-3 actual quotes from Redis docs to `redis-spec.md`
- Reduces Day 1 discovery time
- But templates are sufficient - not critical

---

### During Execution

**5. Track detection rate pattern:**

On Day 3, track:
- Baseline scan: X/10 detected
- After extractor creation: Y/10 detected
- Expected: 0-2 → 9-10 (big improvement)

This validates the **cross-domain flywheel hypothesis**.

**6. Compare to msgqueue metrics:**

After Day 5, compare:
- msgqueue: 50% reuse, 8-10 hours, 100% detection
- cachewrap: 35% reuse, 12-16 hours, ≥90% detection

If cachewrap takes **<60% more time** despite **30% less reuse** in relative terms (50% → 35%), the flywheel scales well.

---

## Final Verdict

### Status: ⚠️ **90% Ready - Fix 2 Gaps**

**What's excellent:**
- ⭐⭐⭐⭐⭐ README (hypothesis, metrics, difficulty)
- ⭐⭐⭐⭐⭐ Config (persistent mode, 3 corpus sources)
- ⭐⭐⭐⭐⭐ Day 1-2 plan (detailed, realistic)
- ⭐⭐⭐⭐☆ Authority sources (templates ready)
- ⭐⭐⭐⭐⭐ Domain choice (validates multi-domain transfer)

**What needs fixing:**
- ⚠️ Day 3 Phase 4: Add manual fallback format
- ⚠️ Day 3 Phase 5: Add debug workflow

**Time to fix:** ~10 minutes

**After fixes:** ✅ Ready to start Day 1

---

## Comparison to Gold Standard (httpclient)

| Aspect | httpclient | cachewrap | Rating |
|--------|-----------|-----------|--------|
| Directory structure | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Equal |
| README hypothesis | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Equal |
| Config quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Equal |
| Authority sources | ⭐⭐⭐⭐⭐ (filled) | ⭐⭐⭐⭐☆ (templates) | Slightly lower |
| Day 1-2 plan | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Equal |
| Day 3 plan | ⭐⭐⭐⭐⭐ (complete) | ⭐⭐⭐☆☆ (missing 2 features) | **Needs update** |
| Day 4-5 plan | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Equal |

**Overall:** cachewrap is **httpclient-quality** except for Day 3 gaps (which are easy to fix).
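
The scaling criterion from recommendation 6 (less than 60% more time despite a 30% relative drop in reuse) is simple arithmetic, and it may help to pin it down before Day 5 so the result is not argued after the fact. The sketch below is only an illustration under the numbers quoted in this evaluation. The helper names `relative_change` and `flywheel_scales` are hypothetical, and the hour totals passed in would be whatever is actually measured.

```python
def relative_change(baseline, value):
    """Fractional change of value relative to baseline (0.5 means +50%)."""
    return (value - baseline) / baseline


def flywheel_scales(msgqueue_hours, cachewrap_hours,
                    msgqueue_reuse=0.50, cachewrap_reuse=0.35,
                    max_time_increase=0.60):
    """Apply the 'less than 60% more time' criterion to measured totals.

    Default reuse figures are the ones quoted in this evaluation
    (msgqueue 50%, cachewrap 35%).
    """
    time_increase = relative_change(msgqueue_hours, cachewrap_hours)
    reuse_change = relative_change(msgqueue_reuse, cachewrap_reuse)
    return {
        "time_increase": time_increase,   # e.g. 0.5 for 8h -> 12h
        "reuse_change": reuse_change,     # about -0.30 for 50% -> 35%
        "scales_well": time_increase < max_time_increase,
    }
```

Plugging in the planned estimates shows how tight the criterion is: the low ends (8 → 12 hours) give +50%, which passes, while the high ends (10 → 16 hours) give exactly +60%, which does not satisfy a strict "<60%" test.
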

---

## Action Items

### For Setup Owner (Do Now)

- [ ] Copy manual fallback format from msgqueue to cachewrap plan.md Phase 4
- [ ] Copy debug workflow from msgqueue to cachewrap plan.md Phase 5
- [ ] Review additions for cachewrap-specific terminology
- [ ] Commit changes

**Time:** 10 minutes

### For Day 1 Executor (When Starting)

- [ ] Read `plan.md` completely before starting
- [ ] Verify `/aphoria-suggest` skill available
- [ ] Verify `/aphoria-claims` skill available
- [ ] Have `docs/extractors/declarative-extractors.md` open for reference

### For Day 3 Executor (Critical)

- [ ] **DO NOT skip Phase 4 (extractor creation)** - This is the flywheel validation
- [ ] Follow 6-phase workflow exactly (pre-flight → scan → gap → create → verify → document)
- [ ] If 0% detection after Phase 5 → Use debug workflow immediately
- [ ] Document detection rate improvement (v1 → v2)

---

**Evaluation complete:** 2026-02-11
**Next step:** Fix 2 Day 3 gaps, then **ready to start Day 1**.