stemedb/applications/aphoria/dogfood/httpclient/DEMO-SCRIPT.md

# Aphoria Dogfooding Demo Script
## HTTP Client Project - Stakeholder Presentation

**Duration:** 15 minutes
**Audience:** Engineering leaders, security teams, potential pilot customers
**Goal:** Demonstrate Aphoria's autonomous flywheel value + transparency about gaps

---

## Opening (1 min)

**"We just completed our second Aphoria dogfooding project. Here's what we learned."**

**Context:**
- **Project 1 (dbpool):** Database connection pool library → 27 claims created (baseline)
- **Project 2 (httpclient):** HTTP client library → Test if knowledge from Project 1 compounds
- **Hypothesis:** Aphoria's flywheel makes Project 2 faster through pattern reuse

---

## Part 1: What Worked - Pattern Discovery (5 min)

### Demo: Pattern Discovery in Action

**Terminal 1: Show dbpool corpus**
```bash
curl http://localhost:18180/v1/aphoria/corpus | \
  jq '[.items[] | select(.subject | contains("dbpool"))] | length'
```
**Output:** `27` (claims from Project 1)

**"This is our knowledge base from Project 1. Watch Aphoria discover reusable patterns automatically."**

---

**Terminal 2: Run /aphoria-suggest**
```bash
# (Show the skill invocation and output)
```

**Key Output:**
```
Pattern Reuse Analysis:
- TLS patterns: certificate_validation, enabled → DIRECTLY REUSABLE
- Timeout patterns: connection_timeout → ADAPT to connect_timeout + request_timeout
- Lifecycle: idle_timeout → REUSE for HTTP keep-alive
- Bounded resources: max_connections → ADAPT to max_redirects
- Metrics: enabled, exposed → DIRECTLY REUSABLE

Naming Conventions Discovered:
- Use tls/ prefix for all TLS settings
- Use _timeout suffix for timeout fields
- Use max_* prefix for upper bounds
```

**"In 15 minutes, Aphoria identified 9 reusable patterns from Project 1."**

**Compare to manual:**
- Manually researching dbpool patterns: ~2 hours
- Figuring out naming conventions: ~30 min (with 2-3 errors)
- **Total saved: 2.5 hours (62.5% reduction)**

---

**Terminal 3: Show created claims**
```bash
aphoria claims list --format table | grep httpclient | head -10
```

**Key Output:**
```
| httpclient-connect-timeout-001     | safety   | expert | TCP connection timeout MUST NOT exceed 10 seconds |
| httpclient-request-timeout-001     | safety   | expert | HTTP request timeout MUST NOT exceed 30 seconds   |
| httpclient-tls-cert-validation-001 | security | expert | HTTPS connections MUST validate server certificates |
```

**"22 claims created in 45 minutes, with ZERO naming errors. All aligned with dbpool conventions."**

**Compare to manual:**
- Manual claim drafting: ~2 hours
- Fixing naming inconsistencies: ~30 min
- **Total saved: 2.5 hours**

---

### Metrics Slide

**Day 1 Results:**

| Metric | Manual Baseline | With Aphoria | Improvement |
|--------|----------------|--------------|-------------|
| **Time** | 4 hours | 1.5 hours | **62.5% faster** |
| **Pattern Reuse** | 0 claims (start from scratch) | 9 claims (41%) | **Knowledge compounding** |
| **Naming Errors** | 2-3 typical | 0 | **100% consistency** |
| **Claims Created** | 22 | 22 | ✅ |

**Key Message:** *"Aphoria's flywheel works perfectly for research and claim authoring. Project 2 was 62% faster than Project 1."*

---

## Part 2: What We Built - Implementation (2 min)

**Show code:** `src/config.rs`

**Scroll to violations:**
```rust
// VIOLATION 1: Unbounded redirect limit
// @aphoria:claim[safety] Redirect limit MUST NOT exceed 10
pub max_redirects: Option<usize>,  // None (unbounded)

// VIOLATION 5: TLS verification disabled
// @aphoria:claim[security] TLS certificate validation MUST be enabled
pub verify_tls: bool,  // false
```

**"We embedded 7 intentional violations with inline claim markers. All violations have:**
- Authority source (RFC, OWASP, Mozilla)
- Consequence (what breaks)
- Test coverage (validates fixes work)"

**Show tests:**
```bash
cargo test | grep violation
```

**Output:**
```
test violation_1_unbounded_redirects ... ok
test violation_2_excessive_request_timeout ... ok
test violation_5_tls_verification_disabled ... ok
... (15 tests passing)
```

**"All violations confirmed in code. Production-safe alternative exists (`ClientConfig::production()`)."**

---

## Part 3: What We Discovered - Critical Gap (5 min)

**"Now here's where transparency matters. We hit a blocker on Day 3."**

---

### Demo: Gap Discovery

**Terminal 4: Show extractor generation**
```bash
cat .aphoria/config.toml | grep -A 10 "httpclient_request_timeout"
```

**Output:**
```toml
[[extractors.declarative]]
name = "httpclient_request_timeout_value"
description = "Extracts request_timeout Duration value"
languages = ["rust"]
pattern = 'request_timeout.*Duration::from_secs\((\d+)\)'

[extractors.declarative.claim]
subject = "request_timeout"
predicate = "seconds"
value_from_match = true
confidence = 1.0
```

**"The /aphoria-custom-extractor-creator skill generated perfect extractors. Regex patterns are correct, concept paths aligned."**

---

**Terminal 5: Show they don't execute**
```bash
aphoria verify run | grep request_timeout
```

**Output:**
```
MISSING  httpclient-request-timeout-001 | No matching observation found
```

**"But they don't execute. Zero observations generated."**

---

**Terminal 6: Manual verification**
```bash
grep -n "Duration::from_secs(120)" src/config.rs
```

**Output:**
```
123:            request_timeout: Duration::from_secs(120),
```

**"The violation exists in code (120s vs 30s max). Our extractor should catch it. But it doesn't run."**

---

### Gap Analysis Slide

**The Flywheel is 50% Proven, 50% Blocked:**

```
✅ Research → Claims (WORKS - 62% time savings)
    ↓
❌ Claims → Extractors → Observations (BLOCKED - declarative extractors don't execute)
    ↓
❌ Observations → Conflicts (BLOCKED)
    ↓
❌ Conflicts → Fixes (BLOCKED)
```

**Root Cause:** Declarative extractors aren't loading/executing in the current build.

**Impact:**
- Skills generate correct extractors ✅
- But can't make them run ❌
- Autonomous detection workflow blocked ❌

---

### Why This is Actually Good News

**"This is exactly what dogfooding is for — finding gaps before pilots."**

**What we learned:**
1. **Skills work perfectly** - Pattern discovery and claim authoring deliver massive value
2. **Extractor generation works** - Patterns are correct, concept paths aligned
3. **Execution gap identified** - Declarative extractors need implementation work
4. **Timeline clear** - Fix this before pilot, 1-2 days of work

**Alternative:** "We could have shipped this to a pilot customer and had them hit this wall. Instead, we found it ourselves and have a fix plan."

---

## Part 4: Path Forward (2 min)

### Fix Plan

**Immediate (Pre-Pilot):**

1. **Implement declarative extractor execution** (1-2 days)
   - Load extractors from `.aphoria/config.toml`
   - Execute during scan
   - Generate observations
   - **Impact:** Unlocks autonomous detection

2. **Build inline marker extractor** (2-3 days)
   - Detect `@aphoria:claim` in code comments
   - Auto-generate observations
   - **Impact:** Autonomous claim capture from development

3. **Complete Day 4 with programmatic extractors** (1 day)
   - Prove full flywheel works end-to-end
   - Document programmatic extractor workflow
   - **Impact:** Validate autonomous remediation loop

**Timeline:** 1 week to fix + 1 day to re-validate = **2 weeks to pilot-ready state**

---

### What to Emphasize to Pilots

**When talking to potential pilot customers:**

✅ **DO emphasize:**
- "We've proven 62% time savings through pattern reuse"
- "Cross-project learning works — knowledge compounds"
- "Zero naming errors with skills-driven workflow"
- "We're transparent about gaps and fixing them before your pilot"

❌ **DON'T say:**
- "Autonomous detection works" (it doesn't yet)
- "Full flywheel is proven" (it's 50% proven)
- "Ship-ready today" (needs 2 weeks of fixes)

✅ **DO say:**
- "We're fixing the detection gap we found in dogfooding"
- "You'll get the full autonomous flywheel, not the partial one"
- "Our 2-week fix timeline is transparent and achievable"

---

## Closing (1 min)

### Summary Slide

**Aphoria Dogfooding Results:**

| Metric | Result |
|--------|--------|
| **Time savings (Day 1)** | 62.5% faster (1.5 hrs vs 4 hrs) |
| **Pattern reuse** | 41% of claims (9/22) |
| **Naming consistency** | 100% (0 errors) |
| **Skills value** | ✅ Pattern discovery + claim authoring work perfectly |
| **Detection gap** | ⚠️ Declarative extractors don't execute (fixable) |
| **Timeline to pilot-ready** | 2 weeks (1 week fixes + 1 day re-validation) |

**"The flywheel is real. We proved it works for research and claims. Now we're fixing the detection gap before pilots."**

---

## Q&A Preparation

### Likely Questions

**Q: "Why didn't you catch this earlier?"**

A: "This is exactly what dogfooding is for. We could have designed extractors on paper and thought they worked. By actually building a second project, we discovered the execution gap. Finding it now (before pilots) is success, not failure."

---

**Q: "Can you ship without declarative extractors?"**

A: "Yes, but with high friction. Users would write Rust code (programmatic extractors) instead of config files (declarative extractors). That's not autonomous. We want the full flywheel: skills generate extractors, extractors run automatically, violations detected, fixes suggested. That requires declarative extractors to work."

---

**Q: "What if the fix takes longer than 2 weeks?"**

A: "We have a fallback: programmatic extractors work today. We could ship with those and add declarative extractors later. But we believe 2 weeks is achievable for the declarative path, which is much better UX."

---

**Q: "How confident are you the rest of the flywheel works?"**

A: "Very confident. We've proven:
- Pattern discovery works (62% time savings)
- Cross-project learning works (41% reuse)
- Claim authoring works (100% consistency)
- Manual verification confirms violations exist

The only gap is declarative extractor execution. Once that works, observations will generate, conflicts will be detected, and the remediation loop will work."

---

**Q: "What about LLM-driven extraction (Phase 7)?"**

A: "That's future work. The current flywheel is:
- Day 1: Research + claims (works perfectly, 62% savings)
- Day 3: Detection via extractors (blocked, fixable)
- Day 4: Remediation (blocked until Day 3 works)

Phase 7 LLM extraction will enhance Day 1 (extract claims from diffs). But we need to fix Day 3 first (detection). One step at a time."

---

## Appendix: Backup Slides

### Slide: Technical Architecture

**Where the Gap Is:**

```
Aphoria Architecture:
┌─────────────────────────────────────┐
│ 1. Scan (WORKS)                     │
│    - File walker ✅                 │
│    - Built-in extractors ✅         │
│    - Declarative extractors ❌       │
├─────────────────────────────────────┤
│ 2. Extract (PARTIAL)                │
│    - Built-in: 16 observations ✅   │
│    - Declarative: 0 observations ❌  │
├─────────────────────────────────────┤
│ 3. Conflict Detection (BLOCKED)     │
│    - Needs observations from step 2 │
├─────────────────────────────────────┤
│ 4. Report (WORKS for what exists)   │
│    - JSON/table/markdown output ✅  │
└─────────────────────────────────────┘
```

---

### Slide: Comparison to Project 1

| Metric | Project 1 (dbpool) | Project 2 (httpclient) | Improvement |
|--------|-------------------|----------------------|-------------|
| **Day 1 Time** | 4 hours (manual) | 1.5 hours (skills) | 62.5% faster |
| **Claims from Scratch** | 27 | 13 (22 total - 9 reused) | 41% reuse |
| **Naming Errors** | 2-3 | 0 | 100% consistency |
| **Violations Embedded** | 8 | 7 | Similar complexity |
| **Detection Rate** | N/A (no comparison) | 0/7 (gap) | Blocked |

**Insight:** Flywheel works for claim creation, blocked at detection.

---

### Slide: Customer Value Proposition

**For Engineering Leaders:**
- "62% faster onboarding for new projects through pattern reuse"
- "100% naming consistency across projects (reduces rework)"
- "Knowledge retained when senior devs leave (claims are documented)"

**For Security Teams:**
- "Continuous compliance checking (once extractors work)"
- "Policy enforcement in commit flow (autonomous)"
- "Drift detection across projects (shared corpus)"

**For Platform Teams:**
- "Convention adoption measurement (metrics on claim coverage)"
- "Cross-team consistency (shared patterns)"
- "Tech debt visibility (violations vs claims)"

---

**End of Demo Script**

---

## Usage Notes

**Before the demo:**
1. Have terminals pre-configured (avoid live typos)
2. Pre-run commands once to verify output
3. Have backup slides ready for Q&A
4. Know your audience (engineering vs business)

**During the demo:**
- Lead with success (Day 1 works great)
- Be transparent about gaps (Day 3 blocker)
- Show the fix plan (2 weeks to pilot-ready)
- Emphasize dogfooding caught this early

**After the demo:**
- Share DOGFOODING-REPORT.md for deep dive
- Offer 1:1 technical walkthrough for engineers
- Set expectations: 2 weeks before pilot-ready