jordan 157dbbb9eb feat: Complete Aphoria Phase 8-9 + UAT suite (90/90 tests passing)

2026-02-06 22:50:55 -07:00

StemeDB Demo Script

Duration: 15-20 minutes + Q&A

URLs: Slides at localhost:3000 → Dashboard at localhost:18188

Pre-Demo Checklist

```bash
# Terminal 1: Start API server
cargo run --bin stemedb-api

# Terminal 2: Seed demo data
cd cmd/demo-seed && go run .

# Terminal 3: Start dashboard
cd applications/stemedb-dashboard && npm run dev -- -p 18188

# Terminal 4: Start slides
cd applications/pitch && npm run dev
```

Verify before presenting:

Skeptic query returns CONTESTED results
Sources page shows entries (find NEJM source)
Quarantine page shows entries
Both browser tabs ready (slides + dashboard)
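
The "is it running" part of the checks above can be partly automated. This is a minimal sketch that assumes only the two ports named in this script. The helper names are hypothetical, and it confirms only that something accepts TCP connections on each port, not that the data is seeded or that the Skeptic query returns CONTESTED.

```python
import socket

# Ports taken from this script; the labels are only for display.
SERVICES = {"slides": 3000, "dashboard": 18188}


def port_is_open(port: int, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def preflight(services: dict[str, int] = SERVICES) -> dict[str, bool]:
    """Map each service name to whether its port is reachable."""
    return {name: port_is_open(port) for name, port in services.items()}
```

Calling `preflight()` a few minutes before presenting returns something like `{"slides": True, "dashboard": True}` when both dev servers are up. The data checks in the list above still need to be done by eye.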

Part 1: Slides (localhost:3000)

Slide 1: The Hook

On screen: "In 2024, 79% of FDA Warning Letters cited data integrity failures"

Say:

"In fiscal year 2024, 79 percent of FDA Warning Letters cited data integrity failures. The core violation: failure to maintain audit trails that can reconstruct who did what, when, and why."

"And here's what's hidden: 85 percent of safety and efficacy issues in Complete Response Letters are never disclosed by companies. The FDA sees patterns the public doesn't."

"In February 2025, the FDA issued a Warning Letter to Exer Labs for marketing an AI diagnostic without a quality management system. They thought they were exempt. They weren't."

Then: Press → to reveal "With AI making more decisions, the audit trail matters more than ever."

Slide 2: Why This Keeps Happening

On screen: Three pain points

Say (reveal each with →):

"Your data warehouse stores the current answer. But sources disagree."

"When a study is retracted, which decisions did it affect? Can you answer that today?"

"AI recommended X. Can you reconstruct why? For an auditor?"

Key point: "'Black box' is a documented rejection reason. Traditional databases overwrite history. You need to preserve it."

Slide 3: Introducing StemeDB

On screen: StemeDB logo + tagline

Say:

"StemeDB is a knowledge graph that stores claims, not facts. Append-only. Auditable. Built for regulated industries."

Don't linger - next slide explains the approach.

Slide 4: Every Claim Has a Source

On screen: Three benefits

Say (reveal each with →):

"When sources disagree, you see the disagreement. Not hidden. Visible."

"When a source is retracted, you know what's affected in seconds. Not days."

"History is preserved. Nothing gets silently overwritten."

Slide 5: What This Enables

On screen: Three capability cards

Say:

"Conflict visibility - see when sources disagree, with confidence scores."

"Cascade invalidation - retract a source, see every downstream decision affected."

"Complete audit trail - every query logged with provenance. Export for regulators."

Reveal: "Time-travel queries: What did we believe on January 1st?"

Slide 6: Demo Preview

On screen: "FDA guidance now requires audit trails for all AI-enabled devices."

Say:

"The FDA has authorized over 1,200 AI-enabled devices. Every single one requires an audit trail. This is what compliance looks like."

"I'm going to run this exact query live. Semaglutide gastroparesis risk. FDA labels say low incidence. Patient reports say otherwise. Watch how StemeDB surfaces that conflict."

Then: Say "Let me show you" and switch to dashboard tab (localhost:18188). Don't switch silently—brief verbal bridge prevents dead air.

Part 2: Live Demo (localhost:18188)

Demo Step 1: Conflict Visibility

Page: /skeptic

Action: Query already typed in, click "Query Claims"

What they see:

Status: CONTESTED
Conflict Score: ~0.72 (high)
Weight Distribution chart showing FDA vs Reddit

Say:

"Notice the status: CONTESTED. This immediately tells your analyst there's no clean answer."

"FDA regulatory data says 0.2% incidence. Patient reports say something different. Both are visible."

"Most databases would give you the FDA number and call it done. We show you the disagreement. Your medical affairs team can investigate. Nobody is blindsided."

AMAZE MOMENT: "This is not a recommendation from a black box. This is a recommendation with a complete evidence chain."
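
Presenter background, in case someone asks what a conflict score means. This script does not say how StemeDB computes it, so the following is only one plausible definition for illustration: one minus the gap between the two largest normalized weights. The function names, the 0.5 threshold, and the example weights are all assumptions, not values from the product or the demo data.

```python
def conflict_score(weights: dict[str, float]) -> float:
    """1 minus the gap between the two largest normalized weights.

    0.0 means one position holds all the weight; 1.0 means the top two are tied.
    """
    total = sum(weights.values())
    if total <= 0 or len(weights) < 2:
        return 0.0
    top, second = sorted((w / total for w in weights.values()), reverse=True)[:2]
    return 1.0 - (top - second)


def status(weights: dict[str, float], threshold: float = 0.5) -> str:
    """Label a result CONTESTED when the score reaches the (assumed) threshold."""
    return "CONTESTED" if conflict_score(weights) >= threshold else "SUPPORTED"


# Illustrative weights only, chosen so the score lands near the ~0.72 shown on screen.
example = {"fda_label": 0.64, "patient_reports": 0.36}
print(round(conflict_score(example), 2), status(example))  # 0.72 CONTESTED
```

Under this definition a lopsided split such as 0.95 / 0.05 scores 0.10 and would read SUPPORTED, which is the fallback case covered in the recovery table.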

Demo Step 2: Audit Trail

Page: Click "View Audit Trail" button → /audit

What they see:

Query ID, timestamp, agent, latency
Every query logged with subject/predicate

Say:

"Every query, every agent, every decision - logged. Click any entry and you see exactly what assertions contributed."

"Your auditor asks 'walk me through the evidence' - you show them this."

AMAZE MOMENT: "Audit response time drops dramatically. What used to require manual log archaeology is now a single click."
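
For technical audiences, the shape of an audit entry can be sketched as below. The fields mirror what the /audit page shows (query ID, timestamp, agent, latency, subject/predicate) plus the contributing assertions. The class and method names are hypothetical and this is not StemeDB's actual schema.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One logged query; frozen so an entry cannot be edited after the fact."""
    agent: str
    subject: str
    predicate: str
    latency_ms: float
    assertion_ids: tuple[str, ...]
    query_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class AuditLog:
    """Append-only: entries can be added and read, never changed or removed."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def record(self, entry: AuditEntry) -> str:
        self._entries.append(entry)
        return entry.query_id

    def evidence_for(self, query_id: str) -> tuple[str, ...]:
        """Answer 'walk me through the evidence' for one logged query."""
        for entry in self._entries:
            if entry.query_id == query_id:
                return entry.assertion_ids
        raise KeyError(query_id)
```

The point to make verbally is that the evidence list is captured at query time, so answering the auditor is a lookup rather than a reconstruction.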

Demo Step 3: Cascade Invalidation

Page: /sources

Action: Find the NEJM cardiovascular study, click "Preview Impact"

What they see:

Source status, DOI, tier
Assertion count citing this source
List of impacted queries/decisions

Say:

"This is a landmark cardiovascular study. Over 100 assertions cite it. Now imagine the journal retracts it this morning. What do you do?"

"A JAMA study found that devices cleared using predicates with recall history had a 6.4-fold higher risk of future Class I recalls. When you can't trace which studies supported which decisions, you inherit that risk silently."

"One click - Preview Impact. Here's every decision that relied on this study. Your team can review them in priority order."

"In a traditional system, you'd be scrambling for days. Here, you know instantly."

AMAZE MOMENT: "Time to identify impact goes from days to seconds."

MANDATORY: Click "Quarantine Source" to show the status change live. This is your most differentiated feature—do not skip it. (Can restore after demo.)
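
If asked how "Preview Impact" can be instant, the idea can be sketched as two lookups: source to citing assertions, then assertions to the decisions that used them. This is a simplified illustration with hypothetical names, not the product's implementation. It also shows why quarantine is reversible: it sets a flag and deletes nothing.

```python
from collections import defaultdict


class ProvenanceIndex:
    """Tracks which assertions cite a source and which decisions used an assertion."""

    def __init__(self) -> None:
        self._assertions_by_source: dict[str, set[str]] = defaultdict(set)
        self._decisions_by_assertion: dict[str, set[str]] = defaultdict(set)
        self.quarantined: set[str] = set()

    def cite(self, assertion_id: str, source_id: str) -> None:
        self._assertions_by_source[source_id].add(assertion_id)

    def use(self, decision_id: str, assertion_id: str) -> None:
        self._decisions_by_assertion[assertion_id].add(decision_id)

    def preview_impact(self, source_id: str) -> dict[str, list[str]]:
        """Read-only: list what a retraction would touch, without changing anything."""
        assertions = self._assertions_by_source.get(source_id, set())
        decisions: set[str] = set()
        for assertion_id in assertions:
            decisions |= self._decisions_by_assertion.get(assertion_id, set())
        return {"assertions": sorted(assertions), "decisions": sorted(decisions)}

    def quarantine(self, source_id: str) -> None:
        """Flag the source; nothing is deleted, so it can be restored later."""
        self.quarantined.add(source_id)

    def restore(self, source_id: str) -> None:
        self.quarantined.discard(source_id)
```

Because the links are recorded when assertions are written, the impact list is an index read, not a search through logs.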

Demo Step 4: Time-Travel

Page: /skeptic

Action: Same query, but select a date 6 months ago in the date picker

What they see:

Different results based on what was known then
Possibly different confidence scores

⚠️ FRESH DATA NOTE: If you just ran demo-seed, all data has today's timestamp. Time-travel to past dates will show fewer/no results. This is CORRECT behavior—it proves the system respects temporal boundaries. Say: "This demo database was just seeded. In production, you'd see the historical state."

Say:

"A patient had an adverse event 6 months ago. Their lawyer asks: 'What information was available to your system at the time?' Can you reconstruct that state?"

"We can. This is the exact state of the knowledge graph on that date."

"For legal and regulatory defense, this is invaluable. You're not saying 'we think we knew X.' You're showing exactly what evidence was available."

AMAZE MOMENT: "Point-in-time reconstruction is native, not a manual log archaeology project."
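
The fresh-data behavior above is easy to explain with a small sketch of a point-in-time read over an append-only log: a query "as of" a date only sees entries recorded on or before it. The `Assertion` fields and the `as_of` helper are assumptions for illustration, not StemeDB's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    value: str
    source: str
    recorded_at: datetime  # when the store first learned this claim


def as_of(log: list[Assertion], subject: str, predicate: str, when: datetime) -> list[Assertion]:
    """Return the claims about (subject, predicate) that existed at `when`."""
    return [
        a for a in log
        if a.subject == subject and a.predicate == predicate and a.recorded_at <= when
    ]


now = datetime.now(timezone.utc)
# A freshly seeded log: every entry carries today's timestamp.
fresh = [Assertion("semaglutide", "gastroparesis_risk", "low incidence", "fda_label", now)]

print(len(as_of(fresh, "semaglutide", "gastroparesis_risk", now)))                        # 1
print(len(as_of(fresh, "semaglutide", "gastroparesis_risk", now - timedelta(days=180))))  # 0
```

The second query returns nothing because no entry existed 180 days ago, which is the same result the dashboard shows right after running demo-seed.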

Demo Step 5: Trust & Safety

Page: /quarantine then /circuit

What they see on Quarantine:

Suspicious assertions pending review
Reason for quarantine (untrusted agent, high confidence, etc.)

What they see on Circuit:

Blocked agents with failure counts
Auto-recovery timers

Say:

"What happens when things go wrong? A competitor - or an overeager intern - tries to inject high-confidence assertions without proper credentials."

"A new agent claiming 95% confidence? Suspicious. Goes to review queue, not production."

"After 5 failures in a minute, the agent is blocked. Automatic. No human intervention needed at 3am."

"Nothing is deleted. Your team reviews and approves or rejects. Full audit trail."

AMAZE MOMENT: "Your knowledge base cannot be poisoned. And when something gets blocked, you know about it."
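
For engineers in the room, the two rules described above can be sketched as follows: a sliding-window circuit breaker (5 failures within 60 seconds) and a routing rule that sends high-confidence claims from untrusted agents to review. The class and function names and the 300-second cooldown are assumptions; this script only states the failure count, the one-minute window, and the 95% example.

```python
from collections import defaultdict, deque


class CircuitBreaker:
    """Blocks an agent after `max_failures` failures inside `window_s` seconds."""

    def __init__(self, max_failures: int = 5, window_s: float = 60.0, cooldown_s: float = 300.0):
        self.max_failures = max_failures
        self.window_s = window_s
        self.cooldown_s = cooldown_s  # assumed value; the script does not state one
        self._failures: dict[str, deque] = defaultdict(deque)
        self._blocked_until: dict[str, float] = {}

    def record_failure(self, agent: str, now: float) -> None:
        recent = self._failures[agent]
        recent.append(now)
        # Drop failures that have aged out of the sliding window.
        while recent and now - recent[0] > self.window_s:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self._blocked_until[agent] = now + self.cooldown_s
            recent.clear()

    def is_blocked(self, agent: str, now: float) -> bool:
        # Auto-recovery: the block expires once the timer runs out.
        return now < self._blocked_until.get(agent, 0.0)


def route(agent: str, confidence: float, trusted: set[str], threshold: float = 0.95) -> str:
    """Send high-confidence claims from untrusted agents to review, not production."""
    if agent not in trusted and confidence >= threshold:
        return "quarantine"
    return "accept"
```

Timestamps are passed in explicitly so the behavior is easy to reason about: five failures at seconds 0 through 4 block the agent, while five failures spread over two minutes do not.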

Part 3: Return to Slides

Verbal bridge: "That's the core of what StemeDB does. Let me recap what you just saw."

Slide 7: Questions

Page: Back to localhost:3000, press → to reach Q&A slide

What they see: Recap of what they just saw

Be ready for:

| Question | Answer |
| --- | --- |
| "What's the latency?" | "340ms on 2.3M assertions for a typical Skeptic query. Measured on [specify your demo hardware]. Happy to run live queries to verify." |
| "SOC 2?" | "In progress. Not yet certified. Pilot deploys on your infrastructure." |
| "How do I get my data out?" | "Full API export. Standard JSON. Documented schema." |
| "Who else uses this?" | "We're onboarding our first enterprise pilots." Be honest. |
| "Why not Postgres?" | "You could build this. 12-18 months, 3-4 engineers. We've done the hard work." |
| "Can this touch PHI?" | "Hash or tokenize before ingestion. Provenance still works." |
| "What if system goes down?" | "WAL replay on recovery. Multi-node for production." |

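If the PHI question gets a follow-up, "hash or tokenize before ingestion" can be illustrated with a keyed hash. This is a generic sketch using Python's standard `hmac` module, not a StemeDB feature. The function name and key handling are hypothetical, and it is not a statement about what satisfies any particular privacy regulation.

```python
import hashlib
import hmac


def tokenize(identifier: str, key: bytes) -> str:
    """Replace an identifier with a keyed, deterministic token (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same identifier and key always produce the same token, so assertions about one patient still link together and provenance keeps working. A key is used instead of a bare hash because identifiers such as record numbers are short and guessable.
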
The Five Aha Moments (Summary)

| # | Moment | What Impresses Them |
| --- | --- | --- |
| 1 | Conflict Visibility | CONTESTED status + weight distribution - disagreement is visible |
| 2 | Audit Trail | Every query logged with full provenance - single click, not log archaeology |
| 3 | Cascade Invalidation | Source retraction → instant impact list - seconds, not days (Don't skip this) |
| 4 | Time-Travel | Point-in-time queries - reconstruct exactly what was known |
| 5 | Trust & Safety | Quarantine + circuit breakers - data poisoning mitigated |

Keyboard Shortcuts (Slides)

| Key | Action |
| --- | --- |
| `→` / `Space` | Next slide/fragment |
| `←` | Previous |
| `S` | Speaker notes (new window) |
| `ESC` | Overview mode |
| `B` | Blackout |
| `F` | Fullscreen |

If Something Goes Wrong

| Problem | Recovery |
| --- | --- |
| No data in Skeptic | Re-run `go run .` in cmd/demo-seed |
| Dashboard won't load | Check port 18188, restart npm |
| Slides won't advance | Click in the slide area first |
| Quarantine empty | That's OK - mention "clean system, nothing suspicious" |
| Query returns SUPPORTED not CONTESTED | "Even consensus has provenance" - show the audit trail instead |
| Latency is slow (>1s) | "This is demo hardware. Production runs on [X]. Let me show the architecture." |
| NEJM source missing | Use whatever source IS in the data - the cascade demo works with any source |
| Results different than script | Acknowledge it: "Live data, live results. Let me walk you through what we're seeing." |
| Time-travel shows no results | "This is fresh demo data. It proves temporal boundaries work: nothing existed 6 months ago." |
| Audience asks to try their own query | Say yes - this is a confidence moment. Type exactly what they say. |

If You Only Have 10 Minutes

Skip to essentials:

Slide 1 (hook with 79% stat) → 1 min
Slide 3 (StemeDB intro) → 30 sec
Demo Step 1 (Conflict Visibility) → 2 min
Demo Step 3 (Cascade Invalidation - MANDATORY) → 3 min
Demo Step 2 (Audit Trail) → 2 min
Q&A Slide → remaining time

Cut: Slides 2, 4, 5, 6. Demo Steps 4, 5.

Source Attribution (Presenter Notes)

| Statistic | Source | Link |
| --- | --- | --- |
| 79% of Warning Letters cite data integrity | FY2024 FDA Form 483 inspection statistics | Pharmaceutical Online |
| 85% of CRL issues never disclosed | 2015 BMJ study; validated by FDA's July 2025 "radical transparency" initiative, in which FDA itself published 200+ CRLs | PMC, FDA CRL Database |
| Exer Labs Warning Letter | FDA enforcement action, Feb 10, 2025 (inspection Oct 2024) | FDA.gov |
| 6.4x higher recall risk | JAMA January 2023 predicate network analysis | JAMA Network |
| 1,200+ AI-enabled devices | FDA AI/ML device clearance database | Bipartisan Policy Center |
| Median 510(k) review: 108 days | FDA 2024 review timeline data | Hardian Health |

Last updated: 2026-02-06
