# StemeDB Demo Script
> **Duration:** 15-20 minutes + Q&A
>
> **URLs:** Slides at `localhost:3000` → Dashboard at `localhost:18188`
---
## Pre-Demo Checklist
```bash
# Run each command in its own terminal. Paths are relative to the repository root.
# Terminal 1: Start API server
cargo run --bin stemedb-api
# Terminal 2: Seed demo data
cd cmd/demo-seed && go run .
# Terminal 3: Start dashboard
cd applications/stemedb-dashboard && npm run dev -- -p 18188
# Terminal 4: Start slides
cd applications/pitch && npm run dev
```
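If you would rather confirm that the slides and dashboard are actually listening before you open the browser tabs, a small script can probe the two ports named above. This is a minimal sketch in Python and not part of the repository: the function names, the host, and the timeout are assumptions, and the API server is left out because this script does not state its port.

```python
import socket

# Ports come from this demo script; host and timeout are assumptions.
DEMO_SERVICES = {"slides": 3000, "dashboard": 18188}


def port_is_open(port: int, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False


def check_services(services: dict[str, int]) -> dict[str, bool]:
    """Map each service name to whether its port is reachable."""
    return {name: port_is_open(port) for name, port in services.items()}


if __name__ == "__main__":
    for name, ok in check_services(DEMO_SERVICES).items():
        print(f"{name:10s} {'up' if ok else 'DOWN'}")
```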
**Verify before presenting:**
- [ ] Skeptic query returns CONTESTED results
- [ ] Sources page shows entries (find NEJM source)
- [ ] Quarantine page shows entries
- [ ] Both browser tabs ready (slides + dashboard)
---
## Part 1: Slides (localhost:3000)
### Slide 1: The Hook
**On screen:** "In 2024, **79%** of FDA Warning Letters cited data integrity failures"

**Say:**
> "In fiscal year 2024, 79 percent of FDA Warning Letters cited data integrity failures. The core violation: failure to maintain audit trails that can reconstruct who did what, when, and why."
>
> "And here's what's hidden: 85 percent of safety and efficacy issues in Complete Response Letters are never disclosed by companies. The FDA sees patterns the public doesn't."
>
> "In February 2025, the FDA issued a Warning Letter to Exer Labs for marketing an AI diagnostic without a quality management system. They thought they were exempt. They weren't."

**Then:** Press → to reveal "With AI making more decisions, the audit trail matters more than ever."

---
### Slide 2: Why This Keeps Happening
**On screen:** Three pain points

**Say (reveal each with →):**
> "Your data warehouse stores the current answer. But sources disagree."
>
> "When a study is retracted, which decisions did it affect? Can you answer that today?"
>
> "AI recommended X. Can you reconstruct why? For an auditor?"

**Key point:** "'Black box' is a documented rejection reason. Traditional databases overwrite history. You need to preserve it."

---
### Slide 3: Introducing StemeDB
**On screen:** StemeDB logo + tagline

**Say:**
> "StemeDB is a knowledge graph that stores claims, not facts. Append-only. Auditable. Built for regulated industries."

**Don't linger** - next slide explains the approach.

---
### Slide 4: Every Claim Has a Source
**On screen:** Three benefits

**Say (reveal each with →):**
> "When sources disagree, you see the disagreement. Not hidden. Visible."
>
> "When a source is retracted, you know what's affected in seconds. Not days."
>
> "History is preserved. Nothing gets silently overwritten."
---
### Slide 5: What This Enables
**On screen:** Three capability cards

**Say:**
> "Conflict visibility - see when sources disagree, with confidence scores."
>
> "Cascade invalidation - retract a source, see every downstream decision affected."
>
> "Complete audit trail - every query logged with provenance. Export for regulators."

**Reveal:** "Time-travel queries: What did we believe on January 1st?"

---
### Slide 6: Demo Preview
**On screen:** "FDA guidance now requires audit trails for all AI-enabled devices."

**Say:**
> "The FDA has authorized over 1,200 AI-enabled devices. Every single one requires an audit trail. This is what compliance looks like."
>
> "I'm going to run this exact query live. Semaglutide gastroparesis risk. FDA labels say low incidence. Patient reports say otherwise. Watch how StemeDB surfaces that conflict."

**Then:** Say "Let me show you" and switch to the dashboard tab (localhost:18188). Don't switch silently: a brief verbal bridge prevents dead air.

---
## Part 2: Live Demo (localhost:18188)
### Demo Step 1: Conflict Visibility
**Page:** `/skeptic`

**Action:** Query already typed in, click "Query Claims"

**What they see:**
- Status: **CONTESTED**
- Conflict Score: ~0.72 (high)
- Weight Distribution chart showing FDA vs Reddit

**Say:**
> "Notice the status: CONTESTED. This immediately tells your analyst there's no clean answer."
>
> "FDA regulatory data says 0.2% incidence. Patient reports say something different. Both are visible."
>
> "Most databases would give you the FDA number and call it done. We show you the disagreement. Your medical affairs team can investigate. Nobody is blindsided."

**AMAZE MOMENT:** "This is not a recommendation from a black box. This is a recommendation with a complete evidence chain."

---
### Demo Step 2: Audit Trail
**Page:** Click "View Audit Trail" button → `/audit`

**What they see:**
- Query ID, timestamp, agent, latency
- Every query logged with subject/predicate

**Say:**
> "Every query, every agent, every decision - logged. Click any entry and you see exactly what assertions contributed."
>
> "Your auditor asks 'walk me through the evidence' - you show them this."
**AMAZE MOMENT:** "Audit response time drops dramatically. What used to require manual log archaeology is now a single click."
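#### Background sketch: what an audit entry could hold

If a technical attendee asks what "every query logged" means in practice, the sketch below may help frame the answer. It is an illustration in Python, not StemeDB's implementation: the `AuditEntry` and `AuditLog` names and every field name are hypothetical, chosen to mirror the columns shown on the `/audit` page (query ID, timestamp, agent, latency, subject/predicate) plus the contributing assertions.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class AuditEntry:
    """One logged query. Field names are illustrative, not StemeDB's schema."""
    agent: str
    subject: str
    predicate: str
    latency_ms: float
    assertion_ids: tuple[str, ...]  # assertions that contributed to the answer
    query_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: float = field(default_factory=time.time)


class AuditLog:
    """Append-only: entries can be added and exported, never edited or removed."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def record(self, entry: AuditEntry) -> str:
        """Append an entry and return its query ID."""
        self._entries.append(entry)
        return entry.query_id

    def export_jsonl(self) -> str:
        """Serialize every entry as one JSON object per line."""
        return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in self._entries)
```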
---
### Demo Step 3: Cascade Invalidation
**Page:** `/sources`

**Action:** Find the NEJM cardiovascular study, click "Preview Impact"

**What they see:**
- Source status, DOI, tier
- Assertion count citing this source
- List of impacted queries/decisions

**Say:**
> "This is a landmark cardiovascular study. Over 100 assertions cite it. Now imagine the journal retracts it this morning. What do you do?"
>
> "A JAMA study found that devices cleared using predicates with recall history had a 6.4-fold higher risk of future Class I recalls. When you can't trace which studies supported which decisions, you inherit that risk silently."
>
> "One click - Preview Impact. Here's every decision that relied on this study. Your team can review them in priority order."
>
> "In a traditional system, you'd be scrambling for days. Here, you know instantly."
**AMAZE MOMENT:** "Time to identify impact goes from days to seconds."
**MANDATORY:** Click "Quarantine Source" to show the status change live. This is your most differentiated feature—do not skip it. (Can restore after demo.)
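#### Background sketch: how an impact preview can be computed

For the "how does it know what's affected?" question, cascade invalidation can be described as a graph walk from the retracted source to everything that depends on it. The sketch below is a minimal Python illustration under assumptions: the `dependents` map, the `preview_impact` name, and the ID format are hypothetical, and StemeDB's actual traversal is not documented in this script.

```python
from collections import deque


def preview_impact(source_id: str, dependents: dict[str, list[str]]) -> list[str]:
    """Return every node that directly or transitively depends on source_id.

    `dependents` maps a node to the nodes that cite or rely on it.
    Results come back in breadth-first order, nearest dependents first.
    """
    seen = {source_id}  # guards against cycles and duplicate visits
    impacted: list[str] = []
    queue = deque([source_id])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, ()):
            if child not in seen:
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


if __name__ == "__main__":
    # Hypothetical IDs: one source, two assertions citing it, two decisions.
    graph = {
        "source:nejm-cv": ["assertion:1", "assertion:2"],
        "assertion:1": ["decision:A"],
        "assertion:2": ["decision:A", "decision:B"],
    }
    print(preview_impact("source:nejm-cv", graph))
```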
---
### Demo Step 4: Time-Travel
**Page:** `/skeptic`

**Action:** Same query, but select a date 6 months ago in the date picker

**What they see:**
- Different results based on what was known then
- Possibly different confidence scores

**⚠️ FRESH DATA NOTE:** If you just ran demo-seed, all data has today's timestamp. Time-travel to past dates will show fewer/no results. This is CORRECT behavior: it proves the system respects temporal boundaries. Say: "This demo database was just seeded. In production, you'd see the historical state."

**Say:**
> "A patient had an adverse event 8 months ago. Their lawyer asks: 'What information was available to your system at the time?' Can you reconstruct that state?"
>
> "We can. This is the exact state of the knowledge graph on that date."
>
> "For legal and regulatory defense, this is invaluable. You're not saying 'we think we knew X.' You're showing exactly what evidence was available."
**AMAZE MOMENT:** "Point-in-time reconstruction is native, not a manual log archaeology project."
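#### Background sketch: point-in-time filtering

One way to explain time-travel on an append-only store: nothing is overwritten, so a past state is just the set of claims recorded on or before the chosen date. The Python sketch below assumes a hypothetical `Assertion` record with a `recorded_at` timestamp; it is not StemeDB's query engine. It also shows why freshly seeded data returns nothing for a date six months back.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Assertion:
    """A recorded claim. Field names are hypothetical."""
    subject: str
    predicate: str
    value: str
    source: str
    recorded_at: datetime  # when the claim entered the log


def query_as_of(log: list[Assertion], subject: str, predicate: str,
                as_of: datetime) -> list[Assertion]:
    """Return matching claims that had already been recorded at `as_of`."""
    return [
        a for a in log
        if a.subject == subject
        and a.predicate == predicate
        and a.recorded_at <= as_of
    ]
```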
---
### Demo Step 5: Trust & Safety
**Page:** `/quarantine` then `/circuit`

**What they see on Quarantine:**
- Suspicious assertions pending review
- Reason for quarantine (untrusted agent, high confidence, etc.)

**What they see on Circuit:**
- Blocked agents with failure counts
- Auto-recovery timers

**Say:**
> "What happens when things go wrong? A competitor - or an overeager intern - tries to inject high-confidence assertions without proper credentials."
>
> "A new agent claiming 95% confidence? Suspicious. Goes to review queue, not production."
>
> "After 5 failures in a minute, the agent is blocked. Automatic. No human intervention needed at 3am."
>
> "Nothing is deleted. Your team reviews and approves or rejects. Full audit trail."
**AMAZE MOMENT:** "Your knowledge base cannot be poisoned. And when something gets blocked, you know about it."
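#### Background sketch: a failure-window circuit breaker

If someone asks how "5 failures in a minute" blocking might work, a sliding-window circuit breaker is the usual pattern. The Python sketch below is an illustration only: the class name is hypothetical, the 5-failure and 60-second defaults come from the talk track above, and the 300-second cooldown standing in for the auto-recovery timer is an assumption. Timestamps are passed in explicitly so the behavior is easy to test.

```python
from collections import deque
from typing import Optional


class AgentCircuitBreaker:
    """Blocks an agent after too many failures inside a sliding time window."""

    def __init__(self, max_failures: int = 5, window_s: float = 60.0,
                 cooldown_s: float = 300.0) -> None:
        self.max_failures = max_failures
        self.window_s = window_s
        self.cooldown_s = cooldown_s  # assumed auto-recovery delay
        self._failures: deque[float] = deque()
        self._blocked_until: Optional[float] = None

    def record_failure(self, now: float) -> None:
        """Register a failure at time `now` (seconds) and trip if over the limit."""
        self._failures.append(now)
        # Drop failures that have aged out of the window.
        while self._failures and now - self._failures[0] >= self.window_s:
            self._failures.popleft()
        if len(self._failures) >= self.max_failures:
            self._blocked_until = now + self.cooldown_s
            self._failures.clear()

    def is_blocked(self, now: float) -> bool:
        """Return True while the cooldown is running; recover automatically after."""
        if self._blocked_until is not None and now >= self._blocked_until:
            self._blocked_until = None
        return self._blocked_until is not None
```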
---
## Part 3: Return to Slides
**Verbal bridge:** "That's the core of what StemeDB does. Let me recap what you just saw."
### Slide 7: Questions
**Page:** Back to localhost:3000, press → to reach Q&A slide

**What they see:** Recap of what they just saw

**Be ready for:**
| Question | Answer |
|----------|--------|
| "What's the latency?" | "340ms on 2.3M assertions for a typical Skeptic query. Measured on [specify your demo hardware]. Happy to run live queries to verify." |
| "SOC 2?" | "In progress. Not yet certified. Pilot deploys on your infrastructure." |
| "How do I get my data out?" | "Full API export. Standard JSON. Documented schema." |
| "Who else uses this?" | "We're onboarding our first enterprise pilots." Be honest. |
| "Why not Postgres?" | "You could build this. 12-18 months, 3-4 engineers. We've done the hard work." |
| "Can this touch PHI?" | "Hash or tokenize before ingestion. Provenance still works." |
| "What if system goes down?" | "WAL replay on recovery. Multi-node for production." |
---
## The Five Aha Moments (Summary)
| # | Moment | What Impresses Them |
|---|--------|---------------------|
| 1 | Conflict Visibility | CONTESTED status + weight distribution - disagreement is visible |
| 2 | Audit Trail | Every query logged with full provenance - single click, not log archaeology |
| 3 | Cascade Invalidation | Source retraction → instant impact list - seconds, not days (**Don't skip this**) |
| 4 | Time-Travel | Point-in-time queries - reconstruct exactly what was known |
| 5 | Trust & Safety | Quarantine + circuit breakers - data poisoning mitigated |
---
## Keyboard Shortcuts (Slides)
| Key | Action |
|-----|--------|
| `→` / `Space` | Next slide/fragment |
| `←` | Previous |
| `S` | Speaker notes (new window) |
| `ESC` | Overview mode |
| `B` | Blackout |
| `F` | Fullscreen |
---
## If Something Goes Wrong
| Problem | Recovery |
|---------|----------|
| No data in Skeptic | Re-run `go run .` in cmd/demo-seed |
| Dashboard won't load | Check port 18188, restart npm |
| Slides won't advance | Click in the slide area first |
| Quarantine empty | That's OK - mention "clean system, nothing suspicious" |
| Query returns SUPPORTED not CONTESTED | "Even consensus has provenance" - show the audit trail instead |
| Latency is slow (>1s) | "This is demo hardware. Production runs on [X]. Let me show the architecture." |
| NEJM source missing | Use whatever source IS in the data - the cascade demo works with any source |
| Results different than script | Acknowledge it: "Live data, live results. Let me walk you through what we're seeing." |
| Time-travel shows no results | "This is fresh demo data. It proves temporal boundaries work—nothing existed 6 months ago." |
| Audience asks to try their own query | Say yes - this is a confidence moment. Type exactly what they say. |
---
## If You Only Have 10 Minutes
Skip to essentials:
1. **Slide 1** (hook with 79% stat) → 1 min
2. **Slide 3** (StemeDB intro) → 30 sec
3. **Demo Step 1** (Conflict Visibility) → 2 min
4. **Demo Step 3** (Cascade Invalidation - MANDATORY) → 3 min
5. **Demo Step 2** (Audit Trail) → 2 min
6. **Q&A Slide** → remaining time

**Cut:** Slides 2, 4, 5. Demo Steps 4, 5.

---
## Source Attribution (Presenter Notes)
| Statistic | Source | Link |
|-----------|--------|------|
| 79% of Warning Letters cite data integrity | FY2024 FDA Form 483 inspection statistics | [Pharmaceutical Online](https://www.pharmaceuticalonline.com/doc/trends-in-fda-fy-2024-inspection-based-warning-letters-0001) |
| 85% of CRL issues never disclosed | 2015 BMJ study — validated by FDA's July 2025 "radical transparency" initiative where FDA published 200+ CRLs themselves | [PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC7885096/), [FDA CRL Database](https://open.fda.gov/crltable/) |
| Exer Labs Warning Letter | FDA enforcement action, Feb 10, 2025 (inspection Oct 2024) | [FDA.gov](https://www.fda.gov/inspections-compliance-enforcement-and-criminal-investigations/warning-letters/exer-labs-inc-699218-02102025) |
| 6.4x higher recall risk | JAMA January 2023 predicate network analysis | [JAMA Network](https://jamanetwork.com/journals/jama/fullarticle/2800187) |
| 1,200+ AI-enabled devices | FDA AI/ML device clearance database | [Bipartisan Policy Center](https://bipartisanpolicy.org/issue-brief/fda-oversight-understanding-the-regulation-of-health-ai-tools/) |
| Median 510(k) review: 108 days | FDA 2024 review timeline data | [Hardian Health](https://www.hardianhealth.com/insights/how-long-does-an-fda-510k-actually-take) |
---
*Last updated: 2026-02-06*