# Aphoria Dogfooding Demo Script
## HTTP Client Project - Stakeholder Presentation

**Duration:** 15 minutes
**Audience:** Engineering leaders, security teams, potential pilot customers
**Goal:** Demonstrate the value of Aphoria's autonomous flywheel and be transparent about its gaps

---

## Opening (1 min)

**"We just completed our second Aphoria dogfooding project. Here's what we learned."**

**Context:**
- **Project 1 (dbpool):** Database connection pool library → 27 claims created (baseline)
- **Project 2 (httpclient):** HTTP client library → Test whether knowledge from Project 1 compounds
- **Hypothesis:** Aphoria's flywheel makes Project 2 faster through pattern reuse

---

## Part 1: What Worked - Pattern Discovery (5 min)

### Demo: Pattern Discovery in Action

**Terminal 1: Show dbpool corpus**

```bash
# -s hides curl's progress meter so only the count is printed
curl -s http://localhost:18180/v1/aphoria/corpus | \
  jq '[.items[] | select(.subject | contains("dbpool"))] | length'
```

**Output:** `27` (claims from Project 1)

**"This is our knowledge base from Project 1. Watch Aphoria discover reusable patterns automatically."**

---

**Terminal 2: Run /aphoria-suggest**

```bash
# (Show the skill invocation and output)
```

**Key Output:**

```
Pattern Reuse Analysis:
- TLS patterns: certificate_validation, enabled → DIRECTLY REUSABLE
- Timeout patterns: connection_timeout → ADAPT to connect_timeout + request_timeout
- Lifecycle: idle_timeout → REUSE for HTTP keep-alive
- Bounded resources: max_connections → ADAPT to max_redirects
- Metrics: enabled, exposed → DIRECTLY REUSABLE

Naming Conventions Discovered:
- Use tls/ prefix for all TLS settings
- Use _timeout suffix for timeout fields
- Use max_* prefix for upper bounds
```

**"In 15 minutes, Aphoria identified 9 reusable patterns from Project 1."**

**Compare to manual:**
- Manually researching dbpool patterns: ~2 hours
- Figuring out naming conventions: ~30 min (with 2-3 errors)
- **Total saved: 2.5 hours (62.5% reduction)**

---

**Terminal 3: Show created claims**

```bash
aphoria claims list --format table | grep httpclient | head -10
```

**Key Output:**

```
| httpclient-connect-timeout-001     | safety   | expert | TCP connection timeout MUST NOT exceed 10 seconds   |
| httpclient-request-timeout-001     | safety   | expert | HTTP request timeout MUST NOT exceed 30 seconds     |
| httpclient-tls-cert-validation-001 | security | expert | HTTPS connections MUST validate server certificates |
```

**"22 claims created in 45 minutes, with ZERO naming errors. All aligned with dbpool conventions."**

**Compare to manual:**
- Manual claim drafting: ~2 hours
- Fixing naming inconsistencies: ~30 min
- **Total saved: 2.5 hours**

---

### Metrics Slide

**Day 1 Results:**

| Metric | Manual Baseline | With Aphoria | Improvement |
|--------|----------------|--------------|-------------|
| **Time** | 4 hours | 1.5 hours | **62.5% faster** |
| **Pattern Reuse** | 0 claims (start from scratch) | 9 claims (41%) | **Knowledge compounding** |
| **Naming Errors** | 2-3 typical | 0 | **100% consistency** |
| **Claims Created** | 22 | 22 | ✅ |

**Key Message:** *"Aphoria's flywheel works perfectly for research and claim authoring. Project 2 was 62% faster than Project 1."*

---

## Part 2: What We Built - Implementation (2 min)

**Show code:** `src/config.rs`

**Scroll to violations:**

```rust
// VIOLATION 1: Unbounded redirect limit
// @aphoria:claim[safety] Redirect limit MUST NOT exceed 10
pub max_redirects: Option<u32>, // None (unbounded); integer type assumed

// VIOLATION 5: TLS verification disabled
// @aphoria:claim[security] TLS certificate validation MUST be enabled
pub verify_tls: bool, // false
```

**"We embedded 7 intentional violations with inline claim markers. Every violation has:"**
- Authority source (RFC, OWASP, Mozilla)
- Consequence (what breaks)
- Test coverage (validates that fixes work)

**Show tests:**

```bash
cargo test | grep violation
```

**Output:**

```
test violation_1_unbounded_redirects ... ok
test violation_2_excessive_request_timeout ... ok
test violation_5_tls_verification_disabled ... ok
...
(15 tests passing)
```

**"All violations are confirmed in code. A production-safe alternative exists (`ClientConfig::production()`)."**

---

## Part 3: What We Discovered - Critical Gap (5 min)

**"Now here's where transparency matters. We hit a blocker on Day 3."**

---

### Demo: Gap Discovery

**Terminal 4: Show extractor generation**

```bash
# -B 1 includes the [[extractors.declarative]] header above the matching name line
grep -B 1 -A 10 "httpclient_request_timeout" .aphoria/config.toml
```

**Output:**

```toml
[[extractors.declarative]]
name = "httpclient_request_timeout_value"
description = "Extracts request_timeout Duration value"
languages = ["rust"]
pattern = 'request_timeout.*Duration::from_secs\((\d+)\)'

[extractors.declarative.claim]
subject = "request_timeout"
predicate = "seconds"
value_from_match = true
confidence = 1.0
```

**"The /aphoria-custom-extractor-creator skill generated well-formed extractors. The regex patterns are correct and the concept paths are aligned."**

---

**Terminal 5: Show they don't execute**

```bash
# The claim ID uses hyphens, so grep for the hyphenated form
aphoria verify run | grep request-timeout
```

**Output:**

```
MISSING httpclient-request-timeout-001 | No matching observation found
```

**"But they don't execute. Zero observations generated."**

---

**Terminal 6: Manual verification**

```bash
grep -n "Duration::from_secs(120)" src/config.rs
```

**Output:**

```
123: request_timeout: Duration::from_secs(120),
```

**"The violation exists in code (120s against a 30s maximum). Our extractor should catch it. But it doesn't run."**

---

### Gap Analysis Slide

**The Flywheel is 50% Proven, 50% Blocked:**

```
✅ Research → Claims (WORKS - 62% time savings)
    ↓
❌ Claims → Extractors → Observations (BLOCKED - declarative extractors don't execute)
    ↓
❌ Observations → Conflicts (BLOCKED)
    ↓
❌ Conflicts → Fixes (BLOCKED)
```

**Root Cause:** Declarative extractors are not loaded or executed in the current build.
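
For Q&A, it can help to show concretely what the blocked step should produce. The following is a minimal, std-only Rust sketch of the observation the `httpclient_request_timeout_value` extractor would be expected to yield for line 123 of `src/config.rs`, and of the comparison against the 30-second claim. It is not Aphoria's extractor engine: the function names, the constant, and the hand-rolled string matching (standing in for the regex `request_timeout.*Duration::from_secs\((\d+)\)`) are all assumptions made for illustration.

```rust
/// Upper bound taken from claim httpclient-request-timeout-001 (30 seconds).
const MAX_REQUEST_TIMEOUT_SECS: u64 = 30;

/// Hypothetical stand-in for the declarative pattern
/// `request_timeout.*Duration::from_secs\((\d+)\)`.
/// Returns the captured number of seconds if the line matches.
fn extract_request_timeout_secs(line: &str) -> Option<u64> {
    const FIELD: &str = "request_timeout";
    const CALL: &str = "Duration::from_secs(";

    // Everything after the first `request_timeout` (the `.*` part of the pattern).
    let rest = &line[line.find(FIELD)? + FIELD.len()..];

    // Check each `Duration::from_secs(` after the field name, last one first
    // (mirroring the greedy `.*`), and accept one followed by digits and `)`.
    rest.rmatch_indices(CALL).find_map(|(idx, _)| {
        let args = &rest[idx + CALL.len()..];
        let digits_len = args.bytes().take_while(|b| b.is_ascii_digit()).count();
        if digits_len > 0 && args[digits_len..].starts_with(')') {
            args[..digits_len].parse::<u64>().ok()
        } else {
            None
        }
    })
}

/// Returns a conflict description when the extracted value exceeds the claim's
/// maximum, or `None` when the line does not match or the value is in bounds.
fn check_request_timeout(line: &str) -> Option<String> {
    let secs = extract_request_timeout_secs(line)?;
    (secs > MAX_REQUEST_TIMEOUT_SECS).then(|| {
        format!(
            "request_timeout is {}s, exceeds the {}s maximum",
            secs, MAX_REQUEST_TIMEOUT_SECS
        )
    })
}
```

Calling `check_request_timeout("    request_timeout: Duration::from_secs(120),")` returns a conflict message, which is the result the verify step cannot currently reach because no observation is generated.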

**Impact:**
- Skills generate correct extractors ✅
- But the extractors can't be made to run ❌
- Autonomous detection workflow blocked ❌

---

### Why This is Actually Good News

**"This is exactly what dogfooding is for: finding gaps before pilots."**

**What we learned:**
1. **Skills work perfectly** - Pattern discovery and claim authoring deliver massive value
2. **Extractor generation works** - Patterns are correct, concept paths aligned
3. **Execution gap identified** - Declarative extractors need implementation work
4. **Timeline clear** - Fix this before pilot, 1-2 days of work

**Alternative:** "We could have shipped this to a pilot customer and had them hit this wall. Instead, we found it ourselves and have a fix plan."

---

## Part 4: Path Forward (2 min)

### Fix Plan

**Immediate (Pre-Pilot):**

1. **Implement declarative extractor execution** (1-2 days)
   - Load extractors from `.aphoria/config.toml`
   - Execute during scan
   - Generate observations
   - **Impact:** Unlocks autonomous detection

2. **Build inline marker extractor** (2-3 days)
   - Detect `@aphoria:claim` in code comments
   - Auto-generate observations
   - **Impact:** Autonomous claim capture from development

3. **Complete Day 4 with programmatic extractors** (1 day)
   - Prove full flywheel works end-to-end
   - Document programmatic extractor workflow
   - **Impact:** Validate autonomous remediation loop

**Timeline:** 1 week to fix + 1 day to re-validate, so **pilot-ready within 2 weeks**

---

### What to Emphasize to Pilots

**When talking to potential pilot customers:**

✅ **DO emphasize:**
- "We've proven 62% time savings through pattern reuse"
- "Cross-project learning works: knowledge compounds"
- "Zero naming errors with skills-driven workflow"
- "We're transparent about gaps and fixing them before your pilot"

❌ **DON'T say:**
- "Autonomous detection works" (it doesn't yet)
- "Full flywheel is proven" (it's 50% proven)
- "Ship-ready today" (needs 2 weeks of fixes)

✅ **DO say:**
- "We're fixing the detection gap we found in dogfooding"
- "You'll get the full autonomous flywheel, not the partial one"
- "Our 2-week fix timeline is transparent and achievable"

---

## Closing (1 min)

### Summary Slide

**Aphoria Dogfooding Results:**

| Metric | Result |
|--------|--------|
| **Time savings (Day 1)** | 62.5% faster (1.5 hrs vs 4 hrs) |
| **Pattern reuse** | 41% of claims (9/22) |
| **Naming consistency** | 100% (0 errors) |
| **Skills value** | ✅ Pattern discovery + claim authoring work perfectly |
| **Detection gap** | ⚠️ Declarative extractors don't execute (fixable) |
| **Timeline to pilot-ready** | Within 2 weeks (1 week fixes + 1 day re-validation) |

**"The flywheel is real. We proved it works for research and claims. Now we're fixing the detection gap before pilots."**

---

## Q&A Preparation

### Likely Questions

**Q: "Why didn't you catch this earlier?"**

A: "This is exactly what dogfooding is for. We could have designed extractors on paper and assumed they worked. By actually building a second project, we discovered the execution gap. Finding it now (before pilots) is success, not failure."

---

**Q: "Can you ship without declarative extractors?"**

A: "Yes, but with high friction. Users would write Rust code (programmatic extractors) instead of config files (declarative extractors). That's not autonomous. We want the full flywheel: skills generate extractors, extractors run automatically, violations are detected, fixes are suggested. That requires declarative extractors to work."

---

**Q: "What if the fix takes longer than 2 weeks?"**

A: "We have a fallback: programmatic extractors work today. We could ship with those and add declarative extractors later. But we believe 2 weeks is achievable for the declarative path, which is a much better UX."

---

**Q: "How confident are you the rest of the flywheel works?"**

A: "Very confident. We've proven:
- Pattern discovery works (62% time savings)
- Cross-project learning works (41% reuse)
- Claim authoring works (100% consistency)
- Manual verification confirms violations exist

The only gap is declarative extractor execution. Once that works, observations will be generated, conflicts will be detected, and the remediation loop will work."

---

**Q: "What about LLM-driven extraction (Phase 7)?"**

A: "That's future work. The current flywheel is:
- Day 1: Research + claims (works perfectly, 62% savings)
- Day 3: Detection via extractors (blocked, fixable)
- Day 4: Remediation (blocked until Day 3 works)

Phase 7 LLM extraction will enhance Day 1 (extract claims from diffs). But we need to fix Day 3 (detection) first. One step at a time."

---

## Appendix: Backup Slides

### Slide: Technical Architecture

**Where the Gap Is:**

```
Aphoria Architecture:
┌─────────────────────────────────────┐
│ 1. Scan (WORKS)                     │
│    - File walker ✅                 │
│    - Built-in extractors ✅         │
│    - Declarative extractors ❌      │
├─────────────────────────────────────┤
│ 2. Extract (PARTIAL)                │
│    - Built-in: 16 observations ✅   │
│    - Declarative: 0 observations ❌ │
├─────────────────────────────────────┤
│ 3. Conflict Detection (BLOCKED)     │
│    - Needs observations from step 2 │
├─────────────────────────────────────┤
│ 4. Report (WORKS for what exists)   │
│    - JSON/table/markdown output ✅  │
└─────────────────────────────────────┘
```

---

### Slide: Comparison to Project 1

| Metric | Project 1 (dbpool) | Project 2 (httpclient) | Improvement |
|--------|-------------------|----------------------|-------------|
| **Day 1 Time** | 4 hours (manual) | 1.5 hours (skills) | 62.5% faster |
| **Claims from Scratch** | 27 | 13 (22 total - 9 reused) | 41% reuse |
| **Naming Errors** | 2-3 | 0 | 100% consistency |
| **Violations Embedded** | 8 | 7 | Similar complexity |
| **Detection Rate** | N/A (no comparison) | 0/7 (gap) | Blocked |

**Insight:** The flywheel works for claim creation and is blocked at detection.

---

### Slide: Customer Value Proposition

**For Engineering Leaders:**
- "62% faster onboarding for new projects through pattern reuse"
- "100% naming consistency across projects (reduces rework)"
- "Knowledge retained when senior devs leave (claims are documented)"

**For Security Teams:**
- "Continuous compliance checking (once extractors work)"
- "Policy enforcement in commit flow (autonomous)"
- "Drift detection across projects (shared corpus)"

**For Platform Teams:**
- "Convention adoption measurement (metrics on claim coverage)"
- "Cross-team consistency (shared patterns)"
- "Tech debt visibility (violations vs claims)"

---

**End of Demo Script**

---

## Usage Notes

**Before the demo:**
1. Have terminals pre-configured (avoid live typos)
2. Pre-run commands once to verify output
3. Have backup slides ready for Q&A
4. Know your audience (engineering vs business)

**During the demo:**
- Lead with success (Day 1 works great)
- Be transparent about gaps (Day 3 blocker)
- Show the fix plan (2 weeks to pilot-ready)
- Emphasize that dogfooding caught this early

**After the demo:**
- Share DOGFOODING-REPORT.md for a deep dive
- Offer a 1:1 technical walkthrough for engineers
- Set expectations: 2 weeks before pilot-ready
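
As part of pre-demo verification, the headline figures on the metrics and summary slides can be re-derived from the raw numbers in the tables. The helpers below are a small illustrative sketch; their names are hypothetical and not part of Aphoria.

```rust
/// Percentage reduction when going from `baseline` to `actual`
/// (for example, hours spent manually vs. hours spent with skills).
fn percent_reduction(baseline: f64, actual: f64) -> f64 {
    (baseline - actual) / baseline * 100.0
}

/// Share of `part` within `total`, as a percentage
/// (for example, reused claims out of all claims created).
fn percent_share(part: u32, total: u32) -> f64 {
    f64::from(part) / f64::from(total) * 100.0
}

/// Claims that still had to be written from scratch after reuse.
fn claims_from_scratch(total: u32, reused: u32) -> u32 {
    total - reused
}
```

With the Day 1 numbers, `percent_reduction(4.0, 1.5)` gives the 62.5% time saving, `percent_share(9, 22)` rounds to the 41% reuse figure, and `claims_from_scratch(22, 9)` gives the 13 claims listed in the comparison slide.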