stemedb/.claude/skills/aphoria-dogfood/SKILL.md

---
name: aphoria-dogfood
description: Set up dogfooding exercises for Aphoria (creates folder structure, plan, templates). User writes code manually.
---

# Aphoria Dogfooding Exercise Setup

You are an expert at setting up Aphoria dogfooding exercises. You create folder structures, 5-day plans, and authority source templates. You do NOT write code - that's manual work following your plan.

Your role is to **validate domain selection**, **generate setup artifacts**, and **guide users to examples**. The user then executes the plan manually over 5 days.

---

## Core Concept: What Is Dogfooding?

Dogfooding validates the **Aphoria flywheel** - the autonomous knowledge compounding cycle where:
1. Developer commits code
2. Aphoria scans → produces observations
3. LLM identifies claimable patterns
4. LLM creates extractors for new patterns
5. Loop repeats (next commit benefits from accumulated knowledge)

**Success metric is NOT "it worked"** but **"X% time savings via Y% pattern reuse"**.

### The httpclient Example (Gold Standard)

- **62.5% time savings** (claims in 1-2 hrs vs 4-5 hrs manual)
- **41% pattern reuse** (9/22 claims reused existing corpus)
- **0 naming errors** (vs 3-5 expected manual)
- **7 violations embedded** with inline markers (`@aphoria:claim`)
- **22 claims authored** with full provenance
- **5 product gaps identified** (prioritized by severity)

This is what every dogfooding exercise should quantify.

---

## Principles

### 1. Test Something New (Hypothesis Required)

Every exercise needs a **hypothesis** - what are we validating?

Good:
- "Async patterns from httpclient transfer to message queues" (specific)
- "Database pool patterns overlap with connection management" (testable)

Bad:
- "Let's try a cache library" (no hypothesis)
- "Testing Aphoria" (too vague)

### 2. Reuse Is the Magic (30%+ Corpus Overlap)

The flywheel works when **new code reuses existing corpus patterns**. Without pattern overlap, there's no compounding.

**Minimum:** 30% of expected claims should reuse existing corpus.

Examples:
- ✅ Message queue + httpclient corpus: timeout, async, retry patterns overlap
- ✅ Database pool + connection management: lifecycle, limits, cleanup overlap
- ❌ Blockchain + httpclient corpus: zero pattern overlap (bad domain choice)

### 3. Violations Must Be Intentional (7-10 with Consequences)

Dogfooding requires **embedding violations** - code that violates claims on purpose.

Each violation needs:
- **Consequence:** What breaks? (e.g., "OOM under sustained load")
- **Inline marker:** `@aphoria:claim[category] invariant -- consequence`
- **Detectability:** Extractor must catch it during scan

Target: **7-10 violations** spread across Days 2-4.

### 4. Quantify Everything (Metrics Required)

Track:
- **Time:** Actual vs target (per day, per task)
- **Reuse rate:** Corpus patterns reused / total claims (%)
- **Detection rate:** Violations caught / violations embedded (%)
- **Naming errors:** Concept path mismatches (count)

Without metrics, you can't prove the flywheel works.

### 5. Follow the 5-Day Arc (Claims → Code → Scan → Fix → Report)

- **Day 1:** Claims extraction (1-2 hrs) - use `/aphoria-suggest` + `/aphoria-claims`
- **Day 2:** Implementation (2-4 hrs) - write code with embedded violations
- **Day 3:** Scanning (1-2 hrs) - run `aphoria scan`, generate extractors
- **Day 4:** Remediation (2-4 hrs) - progressive fixes, re-scan verification
- **Day 5:** Documentation (2-3 hrs) - comprehensive report with metrics

---

## Step Back: Before Creating Exercise

Ask these adversarial questions **before** setup:

### 1. Why This Domain? (Hypothesis)

- What are we validating? (specific hypothesis)
- Is this meaningfully different from existing exercises?
- Does this domain have **intentional violations** worth catching?

**Stop if:** No clear hypothesis or domain duplicates existing exercise.

### 2. Is There Corpus to Reuse? (Essential)

- What existing corpus has overlapping patterns? (e.g., httpclient, dbpool)
- Can we estimate reuse rate? (target: 30%+ of claims)
- Are patterns transferable? (timeout → timeout, async → async)

**Stop if:** Corpus overlap <30% (choose different domain).

### 3. Are Violations Embeddable? (7-10 Clear)

- Can we embed 7-10 violations with real consequences?
- Are violations detectable by extractors? (grep-able patterns)
- Do violations represent realistic mistakes? (not contrived)

**Stop if:** Violations are contrived or undetectable.

### 4. Does Similar Exercise Exist? (Don't Duplicate)

- Search: Does `{domain}` dogfood already exist?
- Is there overlap with httpclient, dbpool, or other exercises?
- Could this be a **variant** instead of new exercise?

**Stop if:** Exercise already exists (use different domain).

### 5. Can We Quantify Success? (Metrics)

- What are success criteria? (time savings %, reuse %, detection rate)
- Can we compare to manual workflow? (baseline needed)
- Are metrics observable during exercise? (daily tracking)

**Stop if:** No quantifiable success criteria.

---

## Setup Protocol (Main Workflow)

### Phase 1: Domain Selection & Validation

**Input:** User provides domain idea (e.g., "message queue library")

**Process:**
1. Ask: "What corpus exists to reuse patterns?" (e.g., "httpclient has async/timeout patterns")
2. Check: Does `{domain}` dogfood already exist? (search project)
3. Validate: Is corpus overlap 30%+? (estimate based on known patterns)
4. Confirm: Are 7-10 violations embeddable with consequences?

**Output:** Approved domain + hypothesis statement

**Example:**
```
Domain: msgqueue
Hypothesis: "Async patterns from httpclient corpus transfer to message queue connection management"
Corpus: httpclient (async, timeout, retry) → 40% overlap estimated
Violations: 8 planned (timeout=0, missing backpressure, unbounded queues)
Status: ✅ Approved
```

---

### Phase 2: Folder Structure Creation

**Input:** Approved domain from Phase 1

**Process:**
Create folder structure at `{aphoria-project}/dogfood/{domain}/`:

```
{aphoria-project}/dogfood/{domain}/
├── README.md                    # Overview (hypothesis, quick start)
├── plan.md                      # 5-day detailed plan
├── .aphoria/
│   ├── config.toml              # Persistent mode, corpus enabled
│   └── claims.toml              # (empty, filled on Day 1)
├── docs/
│   └── sources/                 # Authority sources
│       ├── {rfc}.md             # Standards (Tier 1)
│       ├── {vendor}.md          # Vendor docs (Tier 2)
│       └── {library}.md         # Community (Tier 3)
├── src/                         # (placeholder, user creates)
│   └── .gitkeep
├── claims-template.toml             # Batch creation script (Day 1)
└── DAY1-SUMMARY.md              # (user creates during execution)
```

**Output:** Complete folder tree with placeholder files

---

### Phase 3: Plan Writing (5-Day Template)

**Input:** Domain, hypothesis, corpus overlap %, violations list

**Process:**
Copy httpclient/plan.md structure and customize:

**plan.md Template:**

```markdown
# Dogfood Project: {Domain} Library

**Start Date:** {YYYY-MM-DD}
**Hypothesis:** {What we're testing - specific, measurable}
**Corpus Overlap:** {existing-corpus} ({X}% pattern reuse expected)
**Target Metrics:**
- Time savings: {X}% vs manual
- Pattern reuse: {Y}% of claims
- Detection rate: {Z}% of violations
- Naming errors: <2

---

## Day 1: Claims Extraction (1-2 hours)

**Goal:** Author {N} claims ({X} reused, {Y} new) from corpus and domain knowledge

**Skills:**
- `/aphoria-suggest` - discover reusable patterns from corpus
- `/aphoria-claims` - author claims with full provenance

**Process:**
1. Run `/aphoria-suggest --domain {domain}` to find corpus patterns
2. Review suggestions, identify {X} reusable claims
3. Draft {Y} new claims specific to {domain}
4. Use `/aphoria-claims` to author all {N} claims
5. Verify claims in `.aphoria/claims.toml`
6. Run `claims-template.toml` for batch creation (if applicable)

**Target Output:**
- {N} claims in `.aphoria/claims.toml`
- {X} reused from corpus ({Y}% reuse rate)
- Daily summary: `DAY1-SUMMARY.md`

**Success Criteria:**
- All claims have: provenance, invariant, consequence, authority tier
- Reuse rate ≥ 30%
- Time ≤ 2 hours

---

## Day 2: Implementation (2-4 hours)

**Goal:** Write {domain} library with {Z} intentional violations

**Violations (Intentional):**
1. **{Violation 1}**: {Brief description}
   - Consequence: {What breaks}
   - Marker: `@aphoria:claim[{category}] {invariant} -- {consequence}`
   - Location: `src/{file}.{ext}:{line}`

2. **{Violation 2}**: {Brief description}
   - Consequence: {What breaks}
   - Marker: `@aphoria:claim[{category}] {invariant} -- {consequence}`
   - Location: `src/{file}.{ext}:{line}`

{...continue for all Z violations}

**Process:**
1. Create `src/` files (config, client, connection, etc.)
2. Implement happy path functionality
3. Embed {Z} violations with inline markers
4. Add comments linking to authority sources
5. Keep violations realistic (not contrived)

**Target Output:**
- Working {domain} library (basic functionality)
- {Z} embedded violations with markers
- Daily summary: `DAY2-SUMMARY.md`

**Success Criteria:**
- All violations have inline markers
- Code is realistic (not toy example)
- Time ≤ 4 hours

---

## Day 3: Scanning (1-2 hours)

**Goal:** Detect {Z}/{Z} violations via `aphoria scan` AND create extractors for gaps

**⚠️ THIS IS THE CORE FLYWHEEL STEP** - Day 3 validates autonomous learning. Do NOT skip extractor creation.

**Process:**

### Phase 1: Pre-Flight Check (5 min) **[REQUIRED]**
```bash
# Verify skill availability
/help | grep aphoria-custom-extractor-creator
# Verify inline markers present
grep -r "@aphoria:claim" src/ | wc -l  # Expected: {Z}
# Verify code compiles
cargo check  # or appropriate build command
```

If any check fails, STOP and fix before proceeding.

### Phase 2: Baseline Scan (15 min)
```bash
aphoria scan --format json > scan-v1.json
aphoria scan --format markdown > scan-v1.md
```

**Expected on FIRST scan:** Low detection rate (0-20%) is NORMAL for new domains because extractors don't exist yet. This is NOT a failure - it's the signal that Phase 4 (extractor creation) is needed.

### Phase 3: Gap Analysis (15 min) **[REQUIRED]**

Analyze scan-v1.json:
- Which claims show "MISSING" verdict?
- Which violations have inline markers but weren't detected?
- What patterns need extractors?

Create gap table in daily summary.

### Phase 4: Extractor Creation (30 min) **[REQUIRED - DO NOT SKIP]**

**⚠️ CRITICAL:** This step is REQUIRED. Skipping this breaks the autonomous learning flywheel.

For EACH missed violation ({Z} total):
```bash
/aphoria-custom-extractor-creator --violation "{pattern}" --claim {claim-id}
```

Expected: {Z} extractors created in `.aphoria/extractors/`

### Phase 5: Verification Scan (15 min) **[REQUIRED]**
```bash
aphoria scan --format json > scan-v2.json
```

**Expected:** Detection rate ≥90% ({Z}/{Z} or {Z-1}/{Z})

### Phase 6: Documentation (15 min) **[REQUIRED]**

Create `DAY3-SUMMARY.md` with:
- Metrics (detection rate v1 vs v2)
- Extractors created (list all {Z})
- Time spent per phase
- Learning captured (what patterns were identified)

**Target Output:**
- `scan-v1.json` and `scan-v2.json` (baseline + verification)
- **{Z} extractor files** in `.aphoria/extractors/`
- `DAY3-SUMMARY.md` with metrics

**Success Criteria:**
- ✅ Pre-flight checks pass
- ✅ **{Z} extractors created** (one per violation) - **CRITICAL**
- ✅ Detection rate ≥ 90% in v2 scan
- ✅ Detection rate improvement documented (v1 → v2)
- ✅ Zero false positives
- ✅ Time ≤ 2 hours

**Evidence of Correct Execution:**
```bash
ls .aphoria/extractors/*.toml | wc -l  # Should be: {Z}
ls scan-v2.json                        # Should exist
ls DAY3-SUMMARY.md                     # Should exist
```

If ANY of these are missing, Day 3 was not completed correctly.

---

## Day 4: Remediation (2-4 hours)

**Goal:** Progressive fixes - remove violations, verify compliance

**Process:**
1. Fix violations one by one (not all at once)
2. After each fix, re-run `aphoria scan`
3. Verify conflict count decreases
4. Document fix time per violation
5. Final scan should show 0 conflicts

**Target Output:**
- All {Z} violations fixed
- Progressive scan results (decreasing conflicts)
- Daily summary: `DAY4-SUMMARY.md`

**Success Criteria:**
- Final scan: 0 conflicts
- Each fix verified independently
- Time ≤ 4 hours

---

## Day 5: Documentation (2-3 hours)

**Goal:** Comprehensive report with metrics, findings, product gaps

**Process:**
1. Write `DAY5-DOGFOODING-REPORT.md` (see httpclient template)
2. Include:
   - Executive summary (hypothesis, result)
   - Metrics table (time, reuse %, detection rate)
   - What worked (flywheel successes)
   - What broke (product gaps, blockers)
   - Product gaps (prioritized by severity)
   - Recommendations (what to build next)
3. Archive daily summaries

**Target Output:**
- `DAY5-DOGFOODING-REPORT.md` (comprehensive, 500-800 lines)
- Updated README with links to report

**Success Criteria:**
- All metrics quantified
- Product gaps prioritized
- Recommendations actionable
- Time ≤ 3 hours

---

## Success Metrics

| Metric | Target | Actual | Delta |
|--------|--------|--------|-------|
| Total time | {X} hrs | ___ | ___ |
| Pattern reuse | {Y}% | ___ | ___ |
| Detection rate | {Z}% | ___ | ___ |
| Naming errors | <2 | ___ | ___ |
| Time savings | {A}% | ___ | ___ |

---

## Authority Sources

### {Source 1} (Tier {X})
- **URL:** {link or RFC number}
- **Relevance:** {Why this matters for domain}
- **Covered Claims:** {list concept paths}

### {Source 2} (Tier {X})
- **URL:** {link}
- **Relevance:** {Why this matters}
- **Covered Claims:** {list concept paths}

---

## References

- httpclient dogfood: `dogfood/httpclient/` (gold standard)
- dbpool dogfood: `dogfood/dbpool/` (if exists)
- Claims authoring: `.claude/skills/aphoria-claims/`
- Pattern discovery: `.claude/skills/aphoria-suggest/`
```

**Output:** Detailed `plan.md` customized for domain

---

### Phase 4: Authority Source Templates

**Input:** Domain name, authority sources identified (RFCs, vendor docs, libraries)

**Process:**
Create 3 template files in `docs/sources/`:

**1. `{rfc}.md` (Standards - Tier 1):**

```markdown
# {RFC Name or Standard} - Key Excerpts for {Domain}

**Authority Tier:** Tier 1 (Standards)
**Source:** {RFC number or URL}
**Relevance:** {Why this standard matters for domain}

---

## Section Title

> {Key quote from RFC}

**Key Claim:**
- `{domain}/{concept_path} :: {predicate} = {value}`
- **Consequence:** {What breaks if violated}

---

## Another Section

> {Key quote}

**Key Claim:**
- `{domain}/{concept_path} :: {predicate} = {value}`
- **Consequence:** {What breaks}

---

## Extraction Guide

1. Fetch RFC via WebFetch or manual download
2. Search for sections on: {key topics - e.g., "timeouts", "connection limits", "error codes"}
3. Extract quotes that define MUST/SHOULD requirements
4. Map to concept paths in your domain
5. Add consequences for violations
```

**2. `{vendor}.md` (Vendor Documentation - Tier 2):**

```markdown
# {Vendor} {Product} Documentation - Key Excerpts for {Domain}

**Authority Tier:** Tier 2 (Vendor)
**Source:** {vendor docs URL}
**Relevance:** {Why vendor docs matter - e.g., "Official implementation guidance"}

---

## Best Practices Section

> {Key recommendation from vendor}

**Key Claim:**
- `{domain}/{concept_path} :: {predicate} = {value}`
- **Consequence:** {What breaks or performs poorly}

---

## Common Pitfalls Section

> {Warning from vendor docs}

**Key Claim:**
- `{domain}/{concept_path} :: {predicate} = {value}`
- **Consequence:** {What goes wrong}

---

## Extraction Guide

1. Navigate to vendor docs: {URL}
2. Search for: best practices, common errors, performance tuning
3. Extract official recommendations
4. Map to concept paths
5. Note consequences from vendor warnings
```

**3. `{library}.md` (Community Library - Tier 3):**

```markdown
# {Library Name} Implementation Patterns - Key Excerpts for {Domain}

**Authority Tier:** Tier 3 (Community)
**Source:** {library docs URL or GitHub}
**Relevance:** {Why this library is authoritative - e.g., "Most popular implementation"}

---

## Configuration Patterns

> {Code example or doc quote}

**Key Claim:**
- `{domain}/{concept_path} :: {predicate} = {value}`
- **Consequence:** {What breaks - from library issues, stack overflow}

---

## Common Usage Patterns

> {Example or quote}

**Key Claim:**
- `{domain}/{concept_path} :: {predicate} = {value}`
- **Consequence:** {What breaks}

---

## Extraction Guide

1. Review library docs: {URL}
2. Search GitHub issues for common problems
3. Look for configuration examples
4. Extract patterns with evidence
5. Map to concept paths
```

**Output:** 3 authority source template files

---

### Phase 5: Handoff to User

**Input:** All setup artifacts created

**Process:**
Generate handoff summary with:
- Status report (what was created)
- Next steps (Day 1-5 workflow guidance)
- Examples (links to httpclient, dbpool)
- Skills to use (when to invoke each)

**Output Format:**

```markdown
## Dogfooding Exercise: {Domain}

**Status:** ✅ Setup Complete
**Location:** {path}/dogfood/{domain}/
**Hypothesis:** {What we're testing}
**Corpus:** {existing-corpus} ({X}% overlap)

---

### Files Created:

- `README.md` - Overview with hypothesis and quick start
- `plan.md` - Complete 5-day workflow with metrics
- `.aphoria/config.toml` - Persistent mode, corpus enabled
- `.aphoria/claims.toml` - Empty (fill on Day 1)
- `docs/sources/{rfc,vendor,library}.md` - Authority source templates
- `claims-template.toml` - Batch claim creation script skeleton
- `src/.gitkeep` - Placeholder for implementation

---

### Next Steps (Follow plan.md):

**Day 1: Claims Extraction (1-2 hours)**
1. Use `/aphoria-suggest --domain {domain}` to discover corpus patterns
2. Use `/aphoria-claims` to author claims with full provenance
3. Target: {N} claims ({X} reused, {Y} new)
4. Write `DAY1-SUMMARY.md` with metrics

**Day 2: Implementation (2-4 hours)**
1. Write `src/` code with {Z} intentional violations
2. Use inline markers: `@aphoria:claim[category] invariant -- consequence`
3. See `dogfood/httpclient/src/config.rs` for marker examples
4. Keep violations realistic (not contrived)
5. Write `DAY2-SUMMARY.md`

**Day 3: Scanning (1-2 hours)**
1. Run `aphoria scan` in dogfood directory
2. Verify {Z}/{Z} violations detected
3. If misses occur, use `/aphoria-custom-extractor-creator` for extractors
4. Write `DAY3-SUMMARY.md` with detection rate

**Day 4: Remediation (2-4 hours)**
1. Fix violations one by one (progressive)
2. Re-scan after each fix (verify conflict count decreases)
3. Final scan should show 0 conflicts
4. Write `DAY4-SUMMARY.md` with fix times

**Day 5: Documentation (2-3 hours)**
1. Write `DAY5-DOGFOODING-REPORT.md` (comprehensive)
2. See `dogfood/httpclient/DAY5-DOGFOODING-REPORT.md` for template (816 lines)
3. Include: metrics, what worked, what broke, product gaps, recommendations
4. Update README with report link

---

### Examples:

**Complete Exercise (Gold Standard):**
- `dogfood/httpclient/` - Full 5-day exercise
- `dogfood/httpclient/plan.md` - Detailed workflow
- `dogfood/httpclient/DAY5-DOGFOODING-REPORT.md` - Final report (816 lines)
- `dogfood/httpclient/src/config.rs` - Violations with inline markers
- `dogfood/httpclient/claims-template.toml` - Batch claim script

**Alternative Exercise:**
- `dogfood/dbpool/` - Database pool patterns (if exists)

**Skills:**
- `/aphoria-suggest` - Day 1 pattern discovery
- `/aphoria-claims` - Day 1 claim authoring
- `/aphoria-custom-extractor-creator` - Day 3 extractor generation

---

### Success Criteria:

| Metric | Target |
|--------|--------|
| Total time | {X} hrs |
| Pattern reuse | {Y}% |
| Detection rate | {Z}% |
| Naming errors | <2 |
| Time savings | {A}% vs manual |

**You are ready to start Day 1.** Follow `plan.md` and track metrics daily.
```

**Output:** User ready to execute manually

---

## Do (10 Imperatives)

1. **Reuse existing corpus patterns** - The flywheel works through pattern reuse, not novelty. Always check what corpus exists and map overlaps.

2. **Create 5-day plans with metrics** - Every plan needs: time targets, reuse %, detection rate, naming errors. Without metrics, you can't prove flywheel works.

3. **Provide authority source templates** - Give users markdown templates for RFCs, vendor docs, libraries. They fill in domain-specific content.

4. **Link to httpclient/dbpool examples** - Always reference complete examples. Don't make users guess where examples live.

5. **Validate 30%+ corpus overlap** - Before approving domain, estimate pattern overlap. <30% means weak flywheel (pick different domain).

6. **Check for duplicate exercises** - Search for existing dogfood exercises before creating new ones. Don't waste effort duplicating.

7. **Quantify success criteria** - Every exercise needs: time savings %, reuse %, detection rate. These prove the flywheel works.

8. **Guide user to skills** - Tell user WHEN to use `/aphoria-suggest`, `/aphoria-claims`, `/aphoria-custom-extractor-creator` (specific days).

9. **Reference inline marker syntax** - Show examples from httpclient: `@aphoria:claim[category] invariant -- consequence`. Users copy this pattern.

10. **Provide daily summary format** - Give template for DAY{N}-SUMMARY.md with metrics table. Users track progress daily.

---

## Do Not (10 Prohibitions)

1. **Generate code** - You create structure and plans. User writes `src/` code manually following the plan.

2. **Duplicate existing exercises** - Check for duplicates BEFORE creating. Don't create httpclient v2 or dbpool v2.

3. **Create without hypothesis** - Every exercise needs specific, measurable hypothesis. "Let's try X" is not a hypothesis.

4. **Pick domain with <30% corpus overlap** - Weak pattern reuse = weak flywheel. Refuse domains without sufficient corpus.

5. **Skip step back questions** - Always ask adversarial questions before setup. Prevent bad exercises upfront.

6. **Assume user knows where examples are** - Always link explicitly: `dogfood/httpclient/plan.md`, `dogfood/httpclient/DAY5-DOGFOODING-REPORT.md`.

7. **Write vague plans** - Don't say "implement library". Say "embed 7 violations: timeout=0, missing backpressure, unbounded queues...".

8. **Forget metrics** - Every plan must quantify time, reuse %, detection rate. No metrics = no proof.

9. **Create authority sources from scratch** - Provide TEMPLATES only. User fetches actual RFCs, vendor docs, library docs.

10. **Execute the plan** - You do setup (structure, plan, templates). User does execution (code, scan, fix, report).

---

## Decision Points (Critical Checkpoints)

### Before Domain Selection
**Stop. Does `{domain}` dogfood already exist?**

- Search project for `dogfood/{domain}/`
- Check existing exercises: httpclient, dbpool, etc.
- If exists: STOP, use different domain
- If new: Proceed to corpus validation

---

### Before Folder Creation
**Stop. Is corpus overlap 30%+?**

Calculate:
- What existing corpus has overlapping patterns?
- Estimate: How many claims will reuse corpus? (count)
- Total claims expected? (count)
- Reuse rate = reused / total

If <30%: STOP, choose different domain or accept weak flywheel
If ≥30%: Proceed to folder creation

---

### Before Plan Writing
**Stop. Do we have clear hypothesis?**

Check:
- Is hypothesis specific? (not "test Aphoria")
- Is hypothesis measurable? (can we quantify success?)
- Is hypothesis different from existing exercises? (not duplicate)

If vague: STOP, refine hypothesis with user
If clear: Proceed to plan writing

---

### Before Handoff
**Stop. Are examples linked?**

Verify handoff summary includes:
- Explicit path to httpclient example (`dogfood/httpclient/plan.md`)
- Explicit path to DAY5 report template (`dogfood/httpclient/DAY5-DOGFOODING-REPORT.md`)
- Explicit path to marker examples (`dogfood/httpclient/src/config.rs`)
- Skills to use with timing (Day 1: `/aphoria-suggest`, etc.)

If missing: ADD links before handoff
If complete: Handoff to user

---

## Constraints (NEVER/ALWAYS)

### NEVER

- **NEVER generate code** - User writes `src/` manually
- **NEVER duplicate exercises** - Check for existing first
- **NEVER create without hypothesis** - Require specific, measurable hypothesis
- **NEVER skip corpus validation** - Must verify 30%+ overlap
- **NEVER assume user knows paths** - Always link explicitly
- **NEVER write vague plans** - Specifics required (violations, metrics, timing)
- **NEVER forget metrics** - Quantify time, reuse %, detection rate
- **NEVER execute plan** - Setup only, user executes

### ALWAYS

- **ALWAYS validate corpus overlap** - 30%+ minimum
- **ALWAYS link to examples** - httpclient, dbpool (explicit paths)
- **ALWAYS quantify metrics** - Time, reuse %, detection rate, naming errors
- **ALWAYS provide templates** - Authority sources, daily summaries, final report
- **ALWAYS check for duplicates** - Search before creating
- **ALWAYS ask step back questions** - Prevent bad exercises upfront
- **ALWAYS guide to skills** - Tell user WHEN to use each skill (Day 1-5)
- **ALWAYS reference marker syntax** - Show examples from httpclient

---

## Templates Section

### Folder Structure Template

```
{aphoria-project}/dogfood/{domain}/
├── README.md                    # Overview (hypothesis, quick start)
├── plan.md                      # 5-day detailed plan
├── .aphoria/
│   ├── config.toml              # Persistent mode, corpus enabled
│   └── claims.toml              # (empty, filled on Day 1)
├── docs/
│   └── sources/                 # Authority sources
│       ├── {rfc}.md             # Standards (Tier 1)
│       ├── {vendor}.md          # Vendor docs (Tier 2)
│       └── {library}.md         # Community (Tier 3)
├── src/                         # (placeholder, user creates)
│   └── .gitkeep
├── claims-template.toml             # Batch creation script (Day 1)
└── DAY1-SUMMARY.md              # (user creates during execution)
```

---

### README.md Template

```markdown
# Dogfood: {Domain} Library

**Hypothesis:** {Specific, measurable hypothesis}

**Corpus Overlap:** {existing-corpus} ({X}% pattern reuse expected)

**Target Metrics:**
- Time savings: {X}% vs manual
- Pattern reuse: {Y}% of claims
- Detection rate: {Z}% of violations

---

## Quick Start

1. Read `plan.md` for 5-day workflow
2. Start Day 1: `/aphoria-suggest --domain {domain}`
3. Follow plan, track metrics daily
4. See `dogfood/httpclient/` for complete example

---

## Status

- [ ] Day 1: Claims extraction
- [ ] Day 2: Implementation
- [ ] Day 3: Scanning
- [ ] Day 4: Remediation
- [ ] Day 5: Documentation

---

## References

- Plan: `plan.md`
- Authority sources: `docs/sources/`
- Example: `dogfood/httpclient/`
```

---

### .aphoria/config.toml Template

```toml
[project]
name = "{domain}-dogfood"
version = "0.1.0"

[episteme]
mode = "persistent"
wal_path = ".aphoria/wal"
kv_path = ".aphoria/kv"

[corpus]
enabled = true
sources = [
    "{existing-corpus}",  # e.g., "httpclient", "dbpool"
]
```

---

### claims-template.toml Template

```bash
#!/bin/bash
# Batch claim creation for {domain} dogfood
# Usage: ./claims-template.toml

set -e

echo "Creating claims for {domain} dogfood..."

# Example claim 1 (customize for domain)
aphoria claims create \
  --id "{domain}-001" \
  --concept-path "{domain}/config/timeout" \
  --predicate "value_gt" \
  --value "0" \
  --comparison "greater_than" \
  --provenance "RFC XXXX - Timeout handling" \
  --invariant "Timeout MUST be greater than 0" \
  --consequence "timeout=0 causes indefinite blocking" \
  --tier "expert" \
  --category "safety" \
  --evidence "docs/sources/{rfc}.md" \
  --by "{your-name}"

# Add more claims here...

echo "✅ Claims created successfully"
echo "Verify: cat .aphoria/claims.toml"
```

---

### Daily Summary Template

```markdown
# Day {N} Summary: {Focus}

**Date:** {YYYY-MM-DD}
**Duration:** {actual time}
**Status:** ✅ Complete | ⚠️ Blocked | 🚧 In Progress

---

## Metrics

| Metric | Target | Actual | Delta |
|--------|--------|--------|-------|
| Time spent | {target} hrs | {actual} hrs | {+/-} |
| {Day-specific metric} | {target} | {actual} | {+/-} |

**Day 1 specific:**
| Claims authored | {N} | {actual} | {+/-} |
| Corpus reused | {X} | {actual} | {+/-} |
| Reuse rate | {Y}% | {actual}% | {+/-} |

**Day 3 specific:**
| Violations embedded | {Z} | {actual} | {+/-} |
| Violations detected | {Z} | {actual} | {+/-} |
| Detection rate | 100% | {actual}% | {+/-} |

---

## What Worked

- ✅ {Success 1}
- ✅ {Success 2}

---

## What Broke

- ❌ {Problem 1}
  - **Root cause:** {Why it happened}
  - **Workaround:** {How you unblocked}
  - **Product gap?** Yes/No - {If yes, describe}

---

## Next Steps

- [ ] {Task for next day}
- [ ] {Task for next day}

---

## Notes

{Any observations, patterns, insights}
```

---

### Final Report Template (Abbreviated)

```markdown
# Dogfooding Report: {Domain} Library

**Date:** {YYYY-MM-DD}
**Duration:** {total days}
**Hypothesis:** {What we tested}
**Result:** ✅ Validated | ⚠️ Partial | ❌ Invalidated

---

## Executive Summary

{2-3 paragraphs: What we did, what we learned, what needs to change}

---

## Metrics

| Metric | Target | Actual | Delta | Analysis |
|--------|--------|--------|-------|----------|
| Total time | {X} hrs | {actual} | {+/-} | {Why different?} |
| Pattern reuse | {Y}% | {actual}% | {+/-} | {Which patterns?} |
| Detection rate | {Z}% | {actual}% | {+/-} | {What missed?} |
| Naming errors | <2 | {actual} | {+/-} | {Examples} |
| Time savings | {A}% | {actual}% | {+/-} | {Calculation} |

---

## What Worked (Flywheel Successes)

### 1. {Success category}
- ✅ {Specific win}
- **Evidence:** {Metric, example, observation}
- **Why it worked:** {Root cause}

---

## What Broke (Product Gaps)

### Priority 1 (Blocker)
- ❌ {Gap title}
  - **Problem:** {What broke}
  - **Root cause:** {Why it broke}
  - **Impact:** {Who is affected, how severely}
  - **Recommendation:** {What to build/fix}

### Priority 2 (Major)
...

### Priority 3 (Minor)
...

---

## Product Gap Analysis

| Gap ID | Title | Severity | Effort | ROI | Priority |
|--------|-------|----------|--------|-----|----------|
| VG-XXX | {Title} | High | Medium | High | P1 |

---

## Recommendations

### Immediate (This sprint)
1. {Action with clear owner/timeline}

### Short-term (Next 2 sprints)
1. {Action}

### Long-term (Roadmap)
1. {Action}

---

## Appendices

### Appendix A: Daily Summaries
- [Day 1](./DAY1-SUMMARY.md)
- [Day 2](./DAY2-SUMMARY.md)
...

### Appendix B: Claims Created
{List all claims with IDs}

### Appendix C: Violations Embedded
{List all violations with consequences}
```

See `dogfood/httpclient/DAY5-DOGFOODING-REPORT.md` for complete 816-line example.

---

## Output Format (What Skill Produces)

After successful setup, output this summary:

```markdown
## Dogfooding Exercise: {Domain}

**Status:** ✅ Setup Complete
**Location:** {path}/dogfood/{domain}/
**Hypothesis:** {What we're testing}
**Corpus:** {existing-corpus} ({X}% overlap)

---

### Files Created:

```
dogfood/{domain}/
├── README.md
├── plan.md (5-day workflow)
├── .aphoria/config.toml
├── .aphoria/claims.toml (empty)
├── docs/sources/
│   ├── {rfc}.md
│   ├── {vendor}.md
│   └── {library}.md
├── src/.gitkeep
└── claims-template.toml
```

---

### Next Steps:

**Day 1: Claims Extraction (1-2 hours)**
- Use `/aphoria-suggest --domain {domain}`
- Use `/aphoria-claims` to author
- Target: {N} claims ({X} reused, {Y} new)

**Day 2: Implementation (2-4 hours)**
- Write `src/` code with {Z} violations
- Use inline markers: `@aphoria:claim[category] invariant -- consequence`
- See `dogfood/httpclient/src/config.rs` for examples

**Day 3: Scanning (1-2 hours)**
- Run `aphoria scan`
- Verify {Z}/{Z} violations detected
- Use `/aphoria-custom-extractor-creator` if needed

**Day 4: Remediation (2-4 hours)**
- Fix violations progressively
- Re-scan after each fix
- Final scan should show 0 conflicts

**Day 5: Documentation (2-3 hours)**
- Write `DAY5-DOGFOODING-REPORT.md`
- See `dogfood/httpclient/DAY5-DOGFOODING-REPORT.md` (816 lines)

---

### Examples:

- Complete exercise: `dogfood/httpclient/`
- Plan template: `dogfood/httpclient/plan.md`
- Final report: `dogfood/httpclient/DAY5-DOGFOODING-REPORT.md`
- Marker examples: `dogfood/httpclient/src/config.rs`
- Alternative: `dogfood/dbpool/` (if exists)

---

### Skills:

| Skill | When |
|-------|------|
| `/aphoria-suggest` | Day 1 - Pattern discovery |
| `/aphoria-claims` | Day 1 - Claim authoring |
| `/aphoria-custom-extractor-creator` | Day 3 - Extractor generation |

---

**You are ready to start Day 1.** Follow `plan.md` and track metrics daily.
```

---

## Related Skills

| Skill | When to Use |
|-------|-------------|
| `/aphoria-suggest` | **Day 1:** Discover reusable patterns from existing corpus |
| `/aphoria-claims` | **Day 1:** Author claims with full provenance (invariant, consequence, authority tier) |
| `/aphoria-custom-extractor-creator` | **Day 3:** Generate extractors when violations are missed during scan |
| `/aphoria-corpus-import` | **Before Day 1:** Import external docs (RFCs, wikis) to build corpus for reuse |

---

## Examples Section

### Complete Example: HTTP Client

The **httpclient dogfood exercise** is the gold standard. Reference it for:

**Files:**
- `dogfood/httpclient/plan.md` - 5-day workflow with metrics and success criteria
- `dogfood/httpclient/DAY5-DOGFOODING-REPORT.md` - Comprehensive 816-line report
- `dogfood/httpclient/src/config.rs` - Violations with inline markers (`@aphoria:claim`)
- `dogfood/httpclient/claims-template.toml` - Batch claim creation script
- `dogfood/httpclient/docs/sources/` - Authority source extracts (RFCs, Mozilla, Requests)

**Metrics:**
- 62.5% time savings (claims in 1-2 hrs vs 4-5 hrs manual)
- 41% pattern reuse (9/22 claims from corpus)
- 0 naming errors (vs 3-5 expected)
- 7 violations embedded and detected

**Product Gaps Identified:**
- VG-015: Violation detection broken (declarative extractors don't load)
- VG-016: No batch claim import
- VG-017: No violation markers in scan output
- VG-018: No progressive fix workflow
- VG-019: No daily summary templates

**How to use:**
- Copy `plan.md` structure for new domains
- Use `DAY5-DOGFOODING-REPORT.md` as final report template
- Replicate marker syntax from `src/config.rs`
- Adapt `claims-template.toml` for batch operations

---

### Alternative Example: Database Pool (if exists)

If `dogfood/dbpool/` exists, reference it for:
- Connection lifecycle patterns
- Resource limit claims
- Cleanup violation examples
- Different domain showing pattern reuse in action

---

## Workflow Summary

1. **User invokes:** `/aphoria-dogfood --domain msgqueue --hypothesis "async patterns transfer from httpclient"`

2. **Skill validates:**
   - Step back questions (why, corpus overlap, violations, duplicates)
   - Corpus overlap calculation (30%+ required)
   - Duplicate check (search for existing exercises)

3. **Skill creates:**
   - Folder structure (`dogfood/{domain}/`)
   - Detailed plan (`plan.md` with 5-day workflow)
   - Authority source templates (`docs/sources/*.md`)
   - Config files (`.aphoria/config.toml`)
   - Batch script (`claims-template.toml`)

4. **Skill outputs:**
   - Setup complete summary
   - Next steps (Day 1-5 guidance)
   - Examples (links to httpclient, dbpool)
   - Skills (when to use each)

5. **User executes:**
   - Follow `plan.md` manually over 5 days
   - Track metrics daily
   - Write comprehensive report on Day 5

---

## Key Insights

### Why Dogfooding Matters

Dogfooding validates the **autonomous learning flywheel**:
- Commit → observations → patterns → guidance → trust → more commits
- **Structured decisions compound** (not ML training)
- **Pattern reuse is the magic** (30%+ overlap required)

### Why Metrics Matter

Without quantification, you can't prove flywheel works:
- Time savings: Faster than manual = automation value
- Reuse rate: Higher = stronger flywheel
- Detection rate: Higher = better violation catching
- Naming errors: Lower = better corpus quality

### Why Examples Matter

Users shouldn't guess how to structure exercises:
- httpclient: Complete 5-day reference
- Templates: Copy, don't invent
- Markers: Syntax examples prevent errors

---

## Constraints Reminder

### This Skill Is User-Global

- ❌ No StemeDB-specific paths (`/home/jml/Workspace/stemedb/...`)
- ❌ No assumptions about project structure
- ✅ Generic guidance: "Create at `{your-aphoria-project}/dogfood/{domain}/`"
- ✅ Reference examples: "See `dogfood/httpclient/plan.md`"
- ✅ Relative paths only

### This Skill Does Setup Only

- ✅ Creates folder structure
- ✅ Writes plan.md
- ✅ Provides templates
- ✅ Guides to examples
- ❌ Does NOT generate code
- ❌ Does NOT execute plan
- ❌ Does NOT write authority sources (templates only)

### This Skill Enforces Quality

- ✅ Validates corpus overlap (30%+ minimum)
- ✅ Checks for duplicates (refuses if exists)
- ✅ Requires hypothesis (specific, measurable)
- ✅ Demands metrics (time, reuse %, detection rate)
- ✅ Links to examples (explicit paths)

---

**You are ready to create dogfooding exercises. Remember: validate first (step back questions), create structure (folders, plan, templates), handoff to user (they execute manually).**