# Gap Analysis - cachewrap Documentation

**Timestamp:** 2026-02-11

---

## CRITICAL FIRST CHECK: Aphoria Nature Question

**Question:** Did the team use LLM workflows (skills) or manual CLI?

### Evidence Review:

**Day 1 (Claims):**
- ✅ `.aphoria/claims.toml` shows `created_by = "aphoria-suggest"` for all 20 claims
- ✅ DAY1-SUMMARY.md mentions "Pattern Discovery via LLM"
- ✅ Time: 18.2 seconds per claim (suggests automation)
- **Verdict:** USED `/aphoria-suggest` skill ✅

**Day 3 (Extractors):**
- ❌ DAY3-SUMMARY.md shows manual `.aphoria/config.toml` editing
- ❌ 3 iterations with manual debugging (separate files → config → path alignment)
- ❌ No mention of `/aphoria-custom-extractor-creator` skill invocations
- ❌ gap-analysis.md mentions skill but no evidence of actual usage
- **Verdict:** MANUAL workflow ❌

### Conclusion: PARTIAL Product Misunderstanding

**Type:** Documentation Gap (Not Product Misunderstanding)

**Reason:** Team used LLM skills for Day 1 but manual workflow for Day 3

**Root Cause:** Documentation failed to emphasize **continuous LLM requirement** across all phases. Skills presented as "recommended tools" not "core mechanism."

**Impact:**
- Team experienced extractor creation challenges (3 iterations)
- Manual workflow slower and more error-prone than autonomous
- Knowledge capture happened but inefficiently

**Recommendation:**
- Emphasize: "LLM workflows REQUIRED for ALL phases" (not just Day 1)
- Distinguish: "Autonomous workflow" (skills) vs "Debug mode" (manual CLI)
- See Finding 2 in main evaluation report

---

## Gap 1: Day 3 Workflow Not Emphasized as Flywheel Core

**Type:** Missing Information + Unclear Instructions

**Evidence:**
- **Team thought (DAY3-SUMMARY.md:1-8):** "Day 3: Scanning & Extractor Creation - 9 minutes"
- **Team did (DAY3-SUMMARY.md:80-152):** 3 iterations before achieving 50% detection
  - Iteration 1: Created separate .toml files (wrong approach)
  - Iteration 2: Added to config.toml but concept path mismatch
  - Iteration 3: Fixed paths, achieved 50%
- **Doc said (plan.md:111):** "Day 3: Scanning (1.5-2 hrs) - 6-phase workflow"

**Root Cause:** Day 3 presented as "validation day" when it's **knowledge capture day** (Steps 4-5 of flywheel)

**Impact:**
- Time lost: None (team completed in 9 min vs 2 hr target)
- Confusion level: Medium (3 iterations to find correct approach)
- Blocker: No (team discovered correct pattern via trial and error)

**Recommendation:**
- **Where:** plan.md Day 3 introduction (lines 111-115)
- **What to add:**
  ```markdown
  ## ⚠️ CRITICAL: Day 3 is Flywheel Steps 4-5

  This is NOT "run scan and check results." This IS:
  - Step 4: Identify claims without extractors (MISSING verdicts)
  - Step 5: Create extractors for those claims (autonomous learning)

  **Without extractor creation, NO knowledge is captured.**

  Evidence of correct execution:
  - `.aphoria/extractors/` directory with 8+ .toml files, OR
  - `.aphoria/config.toml` with `[[extractors.declarative]]` sections
  - `scan-v2.json` exists (verification scan AFTER extractor creation)
  - DAY3-SUMMARY.md documents detection rate improvement (v1 → v2)
  ```
- **Priority:** High (Critical for flywheel narrative)

---

## Gap 2: Continuous LLM Requirement Not Explicit

**Type:** Buried Information

**Evidence:**
- **Team thought:** Skills are optional tools, manual CLI is primary
- **Team did:**
  - Day 1: Used `/aphoria-suggest` skill ✅
  - Day 3: Manually edited config.toml ❌
- **Doc said (plan.md:121):** "Skills: /aphoria-suggest, /aphoria-claims, /aphoria-custom-extractor-creator"
- **Doc said (README.md:142):** Lists skills with "when to use"

**Root Cause:** Documentation doesn't distinguish "autonomous workflow" (LLM-driven) vs "manual CLI" (debug mode)

**Impact:**
- Time lost: Unknown (team still completed fast)
- Confusion level: Medium (used skills inconsistently)
- Blocker: No (partial LLM usage still worked)

**Recommendation:**
- **Where:** README.md top section + plan.md Day 1 & Day 3
- **What to add:**
  ```markdown
  ## 🤖 Autonomous Workflow (REQUIRED)

  Aphoria IS an LLM-driven continuous learning system. Skills ARE the product:

  - **Day 1:** `/aphoria-suggest` discovers patterns → `/aphoria-claims` authors claims
  - **Day 3:** `/aphoria-custom-extractor-creator` generates extractors for gaps

  **Manual CLI exists for debugging only.** If you find yourself:
  - Running `aphoria claims create` manually → Use `/aphoria-suggest` instead
  - Editing `.aphoria/config.toml` manually → Use `/aphoria-custom-extractor-creator`

  The dogfood exercise validates the **autonomous workflow**, not manual fallbacks.
  ```
- **Priority:** High (Product positioning)

---

## Gap 3: Detection Rate Target Not Contextualized

**Type:** Unclear Instructions

**Evidence:**
- **Team thought (DAY3-SUMMARY.md:18):** "⚠️ Below target" (50% vs ≥90%)
- **Team did (DAY3-SUMMARY.md:186-229):** Analyzed "Why 50% Instead of ≥90%?" with root causes
- **Doc said (plan.md:7):** "Detection rate: ≥90% of violations"

**Root Cause:** Target implies built-in extractors should catch 90%, doesn't account for baseline scan expectations

**Impact:**
- Time lost: 0 (team understood through analysis)
- Confusion level: High (thought they failed)
- Blocker: No (DAY5 retrospective clarified success)

**Recommendation:**
- **Where:** plan.md Day 3 success criteria (lines 170-178)
- **What to add:**
  ```markdown
  **Detection Rate Expectations:**

  - **Baseline scan (v1):** 0-20% expected (built-in extractors don't know cache patterns)
  - **After declarative extractors (v2):** 50-70% achievable (regex limitations)
  - **After programmatic extractors (v3):** 90-100% target (AST analysis)

  **Success = improvement, not perfection.** 0% → 50% validates the flywheel.
  ```
- **Priority:** High (Affects success interpretation)

---

## Gap 4: Concept Path Alignment Not Pre-Explained

**Type:** Missing Information

**Evidence:**
- **Team thought:** Extractor subject can be any string
- **Team did (DAY3-SUMMARY.md:126-148):**
  - Iteration 2: `claim.subject = "timeout"` → tail `config/timeout` ❌
  - Iteration 3: `claim.subject = "cache/timeout"` → tail `cache/timeout` ✅
- **Doc said:** config.toml has examples but no explanation

**Root Cause:** Tail-path matching algorithm not documented for extractor authors

**Impact:**
- Time lost: ~1 minute (Iteration 2)
- Confusion level: Medium
- Blocker: No (discovered via trial and error)

**Recommendation:**
- **Where:** plan.md Day 3 Phase 4 (lines 130-145)
- **What to add:**
  ```markdown
  **⚠️ Concept Path Alignment:**

  Extractor `claim.subject` creates observation tail-path (last 2 segments).

  Example:
  - Claim: `cache/timeout`
  - Subject: `timeout` → Tail: `config/timeout` ❌
  - Subject: `cache/timeout` → Tail: `cache/timeout` ✅

  **Pattern:** Prefix subjects with claim namespace.
  ```
- **Priority:** Medium (Reduces iteration count)

---

## Gap 5: Declarative Extractor Limitations Not Documented

**Type:** Buried Information

**Evidence:**
- **Team thought:** Declarative extractors can detect any pattern
- **Team did (DAY3-SUMMARY.md:193-221):**
  - 5/10 violations detected (50%)
  - 5 undetected due to: declaration vs value, escaping, multi-line, context
- **Doc said:** No mention of trade-offs in plan.md or README.md

**Root Cause:** Extractor type selection guidance missing

**Impact:**
- Time lost: 0 (discovered post-exercise)
- Confusion level: Low (understood limitations through execution)
- Blocker: No (50% validates mechanism)

**Recommendation:**
- **Where:** plan.md Day 3 + docs/extractors/ guide
- **What to add:**
  ```markdown
  ## Declarative vs Programmatic Extractors

  **Use declarative (regex in config.toml) when:**
  - ✅ Config values (`max_size: None`)
  - ✅ Function signatures (`pub async fn get`)

  **Use programmatic (Rust code) when:**
  - ❌ Function bodies (need to see `validate_key()` call)
  - ❌ Multi-line patterns with context

  **Day 3:** Use declarative for speed. Refine to programmatic in Day 5 if needed.
  ```
- **Priority:** Medium (Improves extractor selection)

---

## Gap 6: Authority Tier Mapping Not Explicit

**Type:** Missing Information

**Evidence:**
- **Team thought:** Tiers are subjective
- **Team did:** Mix of "expert" and "community" tiers (reasonable choices)
- **Doc said (docs/sources/):** Mentions tiers but no criteria

**Root Cause:** Decision framework for tier selection not documented

**Impact:**
- Time lost: 0
- Confusion level: Low
- Blocker: No

**Recommendation:**
- **Where:** plan.md Day 1 section
- **What to add:**
  ```markdown
  ## Authority Tier Selection

  | Tier | Source | When |
  |------|--------|------|
  | Tier 1 | RFCs, Standards | Normative (MUST) |
  | Tier 2 | Vendor docs | Official recommendations |
  | Tier 3 | Community | Implementation patterns |
  ```
- **Priority:** Low (Nice-to-have)

---

## Non-Gaps (Team Errors)

### Error 1: Separate TOML Files

**What team did wrong:**
- Created 10 `.toml` files in `.aphoria/extractors/` directory
- Assumed directory-based loading

**Doc was clear:**
- config.toml:64 shows `[[extractors.declarative]]` syntax inline

**Reason:** Misread extractor configuration format

**Impact:** 1 minute wasted (Iteration 1)

**Not a gap:** Syntax was documented and visible

---

## Summary

**Total Gaps:** 6 (3 Critical, 2 Medium, 1 Low)

**Total Errors:** 1 (Iteration 1 wrong approach)

**Critical Pattern:** Documentation presents Day 3 and LLM workflows as optional when they're core to the autonomous learning flywheel.

**Recommendation:** Emphasize REQUIRED status for:
1. Day 3 extractor creation (Steps 4-5 of flywheel)
2. Continuous LLM usage (skills for ALL phases)
3. Detection rate context (0% → 50% → 90% progression)