# StemeDB Demo Script

> **Duration:** 15-20 minutes + Q&A
> **URLs:** Slides at `localhost:3000` → Dashboard at `localhost:18188`

---

## Pre-Demo Checklist

```bash
# Terminal 1: Start API server
cargo run --bin stemedb-api

# Terminal 2: Seed demo data
cd cmd/demo-seed && go run .

# Terminal 3: Start dashboard
cd applications/stemedb-dashboard && npm run dev -- -p 18188

# Terminal 4: Start slides
cd applications/pitch && npm run dev
```

**Verify before presenting:**
- [ ] Skeptic query returns CONTESTED results
- [ ] Sources page shows entries (find NEJM source)
- [ ] Quarantine page shows entries
- [ ] Both browser tabs ready (slides + dashboard)

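**Optional preflight (sketch):** Before ticking the boxes above by hand, it can help to confirm that the slides and dashboard ports are accepting connections. The snippet below is a minimal sketch, not part of the StemeDB repo: `port_open` and `preflight` are hypothetical helper names, and the only values taken from this script are the two ports. It checks that something is listening; it does not check that the seed data loaded.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Ports taken from this script: slides on 3000, dashboard on 18188.
CHECKS = {"slides": 3000, "dashboard": 18188}


def preflight(host="localhost", checks=CHECKS):
    """Map each named service to whether its port is reachable."""
    return {name: port_open(host, port) for name, port in checks.items()}


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'up' if ok else 'DOWN'}")
```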
---

## Part 1: Slides (localhost:3000)

### Slide 1: The Hook
**On screen:** "In 2024, **79%** of FDA Warning Letters cited data integrity failures"

**Say:**
> "In fiscal year 2024, 79 percent of FDA Warning Letters cited data integrity failures. The core violation: failure to maintain audit trails that can reconstruct who did what, when, and why."
>
> "And here's what's hidden: 85 percent of safety and efficacy issues in Complete Response Letters are never disclosed by companies. The FDA sees patterns the public doesn't."
>
> "In February 2025, the FDA issued a Warning Letter to Exer Labs for marketing an AI diagnostic without a quality management system. They thought they were exempt. They weren't."

**Then:** Press → to reveal "With AI making more decisions, the audit trail matters more than ever."

---

### Slide 2: Why This Keeps Happening
**On screen:** Three pain points

**Say (reveal each with →):**
> "Your data warehouse stores the current answer. But sources disagree."
>
> "When a study is retracted, which decisions did it affect? Can you answer that today?"
>
> "AI recommended X. Can you reconstruct why? For an auditor?"

**Key point:** "'Black box' is a documented rejection reason. Traditional databases overwrite history. You need to preserve it."

---

### Slide 3: Introducing StemeDB
**On screen:** StemeDB logo + tagline

**Say:**
> "StemeDB is a knowledge graph that stores claims, not facts. Append-only. Auditable. Built for regulated industries."

**Don't linger** - next slide explains the approach.

---

### Slide 4: Every Claim Has a Source
**On screen:** Three benefits

**Say (reveal each with →):**
> "When sources disagree, you see the disagreement. Not hidden. Visible."
>
> "When a source is retracted, you know what's affected in seconds. Not days."
>
> "History is preserved. Nothing gets silently overwritten."

---

### Slide 5: What This Enables
**On screen:** Three capability cards

**Say:**
> "Conflict visibility - see when sources disagree, with confidence scores."
>
> "Cascade invalidation - retract a source, see every downstream decision affected."
>
> "Complete audit trail - every query logged with provenance. Export for regulators."

**Reveal:** "Time-travel queries: What did we believe on January 1st?"

---

### Slide 6: Demo Preview
**On screen:** "FDA guidance now requires audit trails for all AI-enabled devices."

**Say:**
> "The FDA has authorized over 1,200 AI-enabled devices. Every single one requires an audit trail. This is what compliance looks like."
>
> "I'm going to run this exact query live. Semaglutide gastroparesis risk. FDA labels say low incidence. Patient reports say otherwise. Watch how StemeDB surfaces that conflict."

**Then:** Say "Let me show you" and switch to the dashboard tab (localhost:18188). Don't switch silently: a brief verbal bridge prevents dead air.

---

## Part 2: Live Demo (localhost:18188)

### Demo Step 1: Conflict Visibility
**Page:** `/skeptic`

**Action:** Query already typed in, click "Query Claims"

**What they see:**
- Status: **CONTESTED**
- Conflict Score: ~0.72 (high)
- Weight Distribution chart showing FDA vs Reddit

**Say:**
> "Notice the status: CONTESTED. This immediately tells your analyst there's no clean answer."
>
> "FDA regulatory data says 0.2% incidence. Patient reports say something different. Both are visible."
>
> "Most databases would give you the FDA number and call it done. We show you the disagreement. Your medical affairs team can investigate. Nobody is blindsided."

**AMAZE MOMENT:** "This is not a recommendation from a black box. This is a recommendation with a complete evidence chain."

---

### Demo Step 2: Audit Trail
**Page:** Click "View Audit Trail" button → `/audit`

**What they see:**
- Query ID, timestamp, agent, latency
- Every query logged with subject/predicate

**Say:**
> "Every query, every agent, every decision - logged. Click any entry and you see exactly what assertions contributed."
>
> "Your auditor asks 'walk me through the evidence' - you show them this."

**AMAZE MOMENT:** "Audit response time drops dramatically. What used to require manual log archaeology is now a single click."

---

### Demo Step 3: Cascade Invalidation
**Page:** `/sources`

**Action:** Find the NEJM cardiovascular study, click "Preview Impact"

**What they see:**
- Source status, DOI, tier
- Assertion count citing this source
- List of impacted queries/decisions

**Say:**
> "This is a landmark cardiovascular study. Over 100 assertions cite it. Now imagine the journal retracts it this morning. What do you do?"
>
> "A JAMA study found that devices cleared using predicates with recall history had a 6.4-fold higher risk of future Class I recalls. When you can't trace which studies supported which decisions, you inherit that risk silently."
>
> "One click - Preview Impact. Here's every decision that relied on this study. Your team can review them in priority order."
>
> "In a traditional system, you'd be scrambling for days. Here, you know instantly."

**AMAZE MOMENT:** "Time to identify impact goes from days to seconds."

**MANDATORY:** Click "Quarantine Source" to show the status change live. This is your most differentiated feature, so do not skip it. (You can restore the source after the demo.)

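**If asked how Preview Impact works:** One way to picture it is a walk from a source to the assertions that cite it, then on to the decisions that relied on those assertions. The sketch below is illustrative only and assumes simple in-memory indexes; `cites`, `used_by`, `record`, and `preview_impact` are hypothetical names, not StemeDB's API, and the IDs are made up. The preview is read-only, which is why quarantining stays a separate, deliberate click.

```python
from collections import defaultdict

# Hypothetical in-memory indexes; StemeDB's real storage layout is not shown here.
cites = defaultdict(set)    # source_id -> assertion_ids citing that source
used_by = defaultdict(set)  # assertion_id -> decision_ids that relied on it


def record(source_id, assertion_id, decision_ids=()):
    """Register that an assertion cites a source and fed into some decisions."""
    cites[source_id].add(assertion_id)
    used_by[assertion_id].update(decision_ids)


def preview_impact(source_id):
    """Return the assertions and decisions downstream of a source.

    Nothing is modified: this only lists what a retraction would touch.
    """
    assertions = sorted(cites.get(source_id, ()))
    decisions = sorted({d for a in assertions for d in used_by.get(a, ())})
    return {"assertions": assertions, "decisions": decisions}


# Made-up sample data for illustration.
record("nejm-cv-study", "a1", ["decision-7", "decision-9"])
record("nejm-cv-study", "a2", ["decision-9"])
record("other-source", "a3", ["decision-1"])

impact = preview_impact("nejm-cv-study")
print(impact)
```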
---

### Demo Step 4: Time-Travel
**Page:** `/skeptic`

**Action:** Same query, but select a date 6 months ago in the date picker

**What they see:**
- Different results based on what was known then
- Possibly different confidence scores

**⚠️ FRESH DATA NOTE:** If you just ran demo-seed, all data has today's timestamp. Time-travel to past dates will show fewer/no results. This is CORRECT behavior: it proves the system respects temporal boundaries. Say: "This demo database was just seeded. In production, you'd see the historical state."

**Say:**
> "A patient had an adverse event 8 months ago. Their lawyer asks: 'What information was available to your system at the time?' Can you reconstruct that state?"
>
> "We can. This is the exact state of the knowledge graph on that date."
>
> "For legal and regulatory defense, this is invaluable. You're not saying 'we think we knew X.' You're showing exactly what evidence was available."

**AMAZE MOMENT:** "Point-in-time reconstruction is native, not a manual log archaeology project."

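**If asked why time-travel is cheap:** Over an append-only log, a point-in-time query can be pictured as a filter on when each claim was recorded. The sketch below is a simplified illustration, not StemeDB's query engine; `Assertion`, `recorded_at`, and `as_of` are hypothetical names, and the sample rows and dates are made up. It also shows why a freshly seeded database returns nothing for past dates.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    value: str
    source: str
    recorded_at: datetime  # when the claim entered the log; never rewritten


def as_of(log, subject, predicate, when):
    """Return the claims about (subject, predicate) that existed at `when`.

    Because the log is append-only, a historical view is only a filter on
    recorded_at; no rows are restored or rewritten.
    """
    return [
        a for a in log
        if a.subject == subject and a.predicate == predicate and a.recorded_at <= when
    ]


now = datetime(2026, 2, 6, tzinfo=timezone.utc)
log = [
    Assertion("semaglutide", "gastroparesis_risk", "low incidence", "fda-label",
              now - timedelta(days=400)),
    Assertion("semaglutide", "gastroparesis_risk", "elevated", "patient-reports",
              now - timedelta(days=30)),
]

six_months_ago = now - timedelta(days=182)
past = as_of(log, "semaglutide", "gastroparesis_risk", six_months_ago)
present = as_of(log, "semaglutide", "gastroparesis_risk", now)
print(len(past), len(present))  # 1 2

# A freshly seeded log has only today's timestamps, so the past view is empty.
fresh = [Assertion("semaglutide", "gastroparesis_risk", "low incidence", "fda-label", now)]
print(as_of(fresh, "semaglutide", "gastroparesis_risk", six_months_ago))  # []
```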
---

### Demo Step 5: Trust & Safety
**Page:** `/quarantine` then `/circuit`

**What they see on Quarantine:**
- Suspicious assertions pending review
- Reason for quarantine (untrusted agent, high confidence, etc.)

**What they see on Circuit:**
- Blocked agents with failure counts
- Auto-recovery timers

**Say:**
> "What happens when things go wrong? A competitor - or an overeager intern - tries to inject high-confidence assertions without proper credentials."
>
> "A new agent claiming 95% confidence? Suspicious. Goes to review queue, not production."
>
> "After 5 failures in a minute, the agent is blocked. Automatic. No human intervention needed at 3am."
>
> "Nothing is deleted. Your team reviews and approves or rejects. Full audit trail."

**AMAZE MOMENT:** "Your knowledge base cannot be poisoned. And when something gets blocked, you know about it."

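**If asked how the circuit breaker decides:** The "5 failures in a minute" rule can be pictured as a sliding-window counter per agent with a recovery timer. This is a minimal sketch, not StemeDB's implementation: the threshold and window come from the talk track above, while the 300-second cooldown, the class name `AgentCircuitBreaker`, and its methods are assumptions made for illustration.

```python
from collections import defaultdict, deque


class AgentCircuitBreaker:
    """Block an agent after too many failures inside a sliding time window."""

    def __init__(self, max_failures=5, window_s=60.0, cooldown_s=300.0):
        self.max_failures = max_failures
        self.window_s = window_s
        self.cooldown_s = cooldown_s        # assumed auto-recovery delay
        self.failures = defaultdict(deque)  # agent_id -> failure timestamps
        self.blocked_until = {}             # agent_id -> time the block lifts

    def record_failure(self, agent_id, now):
        """Record one failure; return True if the agent is now blocked."""
        window = self.failures[agent_id]
        window.append(now)
        # Drop failures that have aged out of the window.
        while window and now - window[0] > self.window_s:
            window.popleft()
        if len(window) >= self.max_failures:
            self.blocked_until[agent_id] = now + self.cooldown_s
            window.clear()
        return self.is_blocked(agent_id, now)

    def is_blocked(self, agent_id, now):
        """True while the agent's cooldown timer has not expired."""
        until = self.blocked_until.get(agent_id)
        if until is None:
            return False
        if now >= until:
            del self.blocked_until[agent_id]  # auto-recovery
            return False
        return True


breaker = AgentCircuitBreaker()
results = [breaker.record_failure("agent-x", t) for t in (0, 10, 20, 30, 40)]
print(results)                             # [False, False, False, False, True]
print(breaker.is_blocked("agent-x", 100))  # True
print(breaker.is_blocked("agent-x", 341))  # False
```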
---

## Part 3: Return to Slides

**Verbal bridge:** "That's the core of what StemeDB does. Let me recap what you just saw."

### Slide 7: Questions
**Page:** Back to localhost:3000, press → to reach Q&A slide

**What they see:** Recap of what they just saw

**Be ready for:**

| Question | Answer |
|----------|--------|
| "What's the latency?" | "340ms on 2.3M assertions for a typical Skeptic query. Measured on [specify your demo hardware]. Happy to run live queries to verify." |
| "SOC 2?" | "In progress. Not yet certified. Pilot deploys on your infrastructure." |
| "How do I get my data out?" | "Full API export. Standard JSON. Documented schema." |
| "Who else uses this?" | "We're onboarding our first enterprise pilots." Be honest. |
| "Why not Postgres?" | "You could build this. 12-18 months, 3-4 engineers. We've done the hard work." |
| "Can this touch PHI?" | "Hash or tokenize before ingestion. Provenance still works." |
| "What if system goes down?" | "WAL replay on recovery. Multi-node for production." |

---

## The Five Aha Moments (Summary)

| # | Moment | What Impresses Them |
|---|--------|---------------------|
| 1 | Conflict Visibility | CONTESTED status + weight distribution - disagreement is visible |
| 2 | Audit Trail | Every query logged with full provenance - single click, not log archaeology |
| 3 | Cascade Invalidation | Source retraction → instant impact list - seconds, not days (**Don't skip this**) |
| 4 | Time-Travel | Point-in-time queries - reconstruct exactly what was known |
| 5 | Trust & Safety | Quarantine + circuit breakers - data poisoning mitigated |

---

## Keyboard Shortcuts (Slides)

| Key | Action |
|-----|--------|
| `→` / `Space` | Next slide/fragment |
| `←` | Previous |
| `S` | Speaker notes (new window) |
| `ESC` | Overview mode |
| `B` | Blackout |
| `F` | Fullscreen |

---

## If Something Goes Wrong

| Problem | Recovery |
|---------|----------|
| No data in Skeptic | Re-run `go run .` in `cmd/demo-seed` |
| Dashboard won't load | Check port 18188, restart npm |
| Slides won't advance | Click in the slide area first |
| Quarantine empty | That's OK - mention "clean system, nothing suspicious" |
| Query returns SUPPORTED not CONTESTED | "Even consensus has provenance" - show the audit trail instead |
| Latency is slow (>1s) | "This is demo hardware. Production runs on [X]. Let me show the architecture." |
| NEJM source missing | Use whatever source IS in the data - the cascade demo works with any source |
| Results different than script | Acknowledge it: "Live data, live results. Let me walk you through what we're seeing." |
| Time-travel shows no results | "This is fresh demo data. It proves temporal boundaries work: nothing existed 6 months ago." |
| Audience asks to try their own query | Say yes - this is a confidence moment. Type exactly what they say. |

---

## If You Only Have 10 Minutes

Skip to essentials:

1. **Slide 1** (hook with 79% stat) → 1 min
2. **Slide 3** (StemeDB intro) → 30 sec
3. **Demo Step 1** (Conflict Visibility) → 2 min
4. **Demo Step 3** (Cascade Invalidation - MANDATORY) → 3 min
5. **Demo Step 2** (Audit Trail) → 2 min
6. **Q&A Slide** → remaining time

**Cut:** Slides 2, 4, 5. Demo Steps 4, 5.

---

## Source Attribution (Presenter Notes)

| Statistic | Source | Link |
|-----------|--------|------|
| 79% of Warning Letters cite data integrity | FY2024 FDA Form 483 inspection statistics | [Pharmaceutical Online](https://www.pharmaceuticalonline.com/doc/trends-in-fda-fy-2024-inspection-based-warning-letters-0001) |
| 85% of CRL issues never disclosed | 2015 BMJ study; validated by FDA's July 2025 "radical transparency" initiative, in which FDA published 200+ CRLs itself | [PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC7885096/), [FDA CRL Database](https://open.fda.gov/crltable/) |
| Exer Labs Warning Letter | FDA enforcement action, Feb 10, 2025 (inspection Oct 2024) | [FDA.gov](https://www.fda.gov/inspections-compliance-enforcement-and-criminal-investigations/warning-letters/exer-labs-inc-699218-02102025) |
| 6.4x higher recall risk | JAMA January 2023 predicate network analysis | [JAMA Network](https://jamanetwork.com/journals/jama/fullarticle/2800187) |
| 1,200+ AI-enabled devices | FDA AI/ML device clearance database | [Bipartisan Policy Center](https://bipartisanpolicy.org/issue-brief/fda-oversight-understanding-the-regulation-of-health-ai-tools/) |
| Median 510(k) review: 108 days | FDA 2024 review timeline data | [Hardian Health](https://www.hardianhealth.com/insights/how-long-does-an-fda-510k-actually-take) |

---

*Last updated: 2026-02-06*