# BioTech/Pharma: The Living Systematic Review (GLP-1)
> **Tier:** Strategic Pilot
> **Pillars Used:** First-Class Contradiction, Paradigm Supersession (Epochs), Multi-Signature Consensus, Semantic Decay, Visual Anchoring
> **Postgres Test:** FAILED - Negation blindness in vector search leads to dangerous clinical recommendations; manual decay calculations in SQL cannot handle rapidly shifting paradigms; provenance for "Dark Matter" (failed trials) is lost.
## The Hazard (Without Episteme)
I observed a medical research team spend 4 months updating a systematic review on "GLP-1 muscle loss," only for it to be obsolete 72 hours after publication.
The problem wasn't their talent; it was their database. They used a standard RAG system (Vector DB + Postgres) to ingest 5,000 papers. When a new trial (STEP-1) contradicted earlier pilot data regarding lean mass preservation, the vector database retrieved *both* papers with nearly identical similarity scores.
Because standard embeddings suffer from **Negation Blindness**, the search for "Does Semaglutide cause muscle loss?" returned:
1. "Trial A: Semaglutide causes significant muscle loss."
2. "Trial B: Semaglutide does *not* cause significant muscle loss."
The LLM, attempting to synthesize these into a "canonical" answer, hallucinated a cautious "maybe," averaging out the truth. The team missed the fact that Trial B was a Phase III RCT while Trial A was a small, unblinded pilot. By failing to structurally model the **Conflict**, they served a dangerous "Generative Soup" instead of a valid scientific consensus.
**The failure mode:** Vector databases optimize for plausibility (similarity), not validity. In Pharma, plausibility is a patient safety risk.
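A toy illustration of the retrieval problem described above. It uses bag-of-words cosine similarity as a crude stand-in for a learned embedding model, which is an assumption made only to keep the sketch self-contained; real embeddings differ in detail. The helper names `bag_of_words` and `cosine` are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query = "Does Semaglutide cause muscle loss?"
trial_a = "Trial A: Semaglutide causes significant muscle loss."
trial_b = "Trial B: Semaglutide does not cause significant muscle loss."

q = bag_of_words(query)
sim_a = cosine(q, bag_of_words(trial_a))
sim_b = cosine(q, bag_of_words(trial_b))

# Both sentences look "relevant"; the score says nothing about polarity.
print(f"Trial A: {sim_a:.2f}  Trial B: {sim_b:.2f}")
```

In this toy setup the negated sentence scores higher than the affirmative one, simply because it shares more surface tokens with the query. A similarity score alone cannot tell a retriever that the two results assert opposite things.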
---
## The Scenario: "Operation LeanMass"
A pharmaceutical R&D team is monitoring the "GLP-1 Agonist" landscape (Semaglutide, Tirzepatide). Evidence is exploding from PubMed, bioRxiv, and ClinicalTrials.gov.
The system must:
1. Model contradictory study results explicitly (Muscle Loss vs. Preservation).
2. Handle "Paradigm Shifts" (e.g., a new FDA label update superseding all prior trial assertions).
3. Weight results by Journal Reputation and Agent TrustRank (Multi-Sig).
4. Apply aggressive "Semantic Decay" to knowledge with a 73-day half-life.
5. Anchor claims to pixels (Screenshots of primary data charts) to detect data drift.
---
## Feature 1: First-Class Contradiction (The Skeptic Lens)
### The Failure Mode
In medical research, "Truth" isn't a binary; it's a distribution. When two studies disagree, picking a "winner" or averaging the results hides the very signal a researcher needs: **Variance**.
### The Episteme Solution
```
POST /assert
{
  "subject": "Semaglutide",
  "predicate": "muscle_sparing_effect",
  "object": { "Boolean": false },
  "source_hash": "study_low_n_2021",
  "confidence": 0.7,
  "signatures": [{ "agent_id": "pubmed_crawler_01", ... }]
}

POST /assert
{
  "subject": "Semaglutide",
  "predicate": "muscle_sparing_effect",
  "object": { "Boolean": true },
  "source_hash": "step_1_trial_2023",
  "confidence": 0.95,
  "signatures": [{ "agent_id": "reviewer_agent_alpha", ... }]
}
```
Querying the "State of Truth":
```
GET /query?subject=Semaglutide&predicate=muscle_sparing_effect&lens=skeptic
-> Returns {
  conflict_score: 0.88,
  variance: "High",
  candidates: [
    { val: false, trust: 0.12 },
    { val: true, trust: 0.86 }
  ]
}
```
**Pillar:** First-Class Contradiction. The `Lens::Skeptic` identifies that the scientific community is in disagreement, preventing the "Hallucination Cascade" where the agent averages two opposites.
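One way a skeptic-style view could be computed is sketched below. The scoring rule is an assumption, not Episteme's actual formula: trust mass is `confidence * source_weight`, and the conflict score is normalised Shannon entropy. It will not reproduce the exact numbers in the response above. `Assertion`, `source_weight`, and `skeptic_view` are hypothetical names.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    value: bool
    confidence: float     # confidence supplied at write time
    source_weight: float  # hypothetical authority weight of the source


def skeptic_view(assertions):
    """Group assertions by value and report each candidate's share of trust."""
    mass = defaultdict(float)
    for a in assertions:
        mass[a.value] += a.confidence * a.source_weight
    total = sum(mass.values())
    if total == 0:
        return {"conflict_score": 0.0, "candidates": {}}
    shares = {value: m / total for value, m in mass.items()}
    if len(shares) < 2:
        conflict = 0.0  # only one candidate value: nothing to disagree about
    else:
        # Normalised entropy: 0.0 = full agreement, 1.0 = evenly split.
        entropy = -sum(p * math.log2(p) for p in shares.values() if p > 0)
        conflict = entropy / math.log2(len(shares))
    return {"conflict_score": conflict, "candidates": shares}


view = skeptic_view([
    Assertion(value=False, confidence=0.7, source_weight=0.2),   # small pilot
    Assertion(value=True, confidence=0.95, source_weight=1.0),   # Phase III RCT
])
print(view)
```

The point of the design is that both candidates survive in the response. The caller sees the disagreement and its size instead of a single blended answer.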
---
## Feature 2: Paradigm Supersession (Epochs)
### The Failure Mode
The FDA releases a new "Warning Label" for a drug class. Instantly, 500 assertions regarding "Safe Use Guidelines" derived from older trials are now legally and clinically superseded. In Postgres, you either run O(N) updates or build complex `is_active` logic that fails to capture *why* things changed.
### The Episteme Solution
Assertions are tagged with an **Epoch**. When the paradigm shifts, we supersede the entire epoch in one O(1) operation.
```
POST /v1/epoch
{
  "name": "post_fda_label_2024",
  "supersedes": "<hex-encoded-id-of-pre_fda_label_2024>",
  "supersession_type": "Invalidate"
}
```
The `supersedes` field is the hex-encoded 32-byte ID of the prior epoch. The `supersession_type`
can be `Invalidate` (factually incorrect), `Temporal` (outdated but was correct), `Refinement`
(more precise), `RequiresReview` (flagged for review), or `Additive` (extends without replacing).
Additional context like the reason can be stored in assertions tagged with this epoch.
**Effect:**
Queries using `Lens::EpochAware` automatically ignore the 500 assertions from the `pre_fda` epoch. They remain in the `Lens::History` for audit but are "excreted" from the current reasoning context.
**Pillar:** Paradigm Supersession. Truth isn't just updated; it is evolved. Epochs allow the system to "change its mind" at scale.
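A minimal in-memory sketch of why supersession can be a constant-time write: the new epoch sets one pointer on the old epoch record, and no assertion is touched. Everything here is an assumption for illustration. `EpochStore` and its methods are hypothetical, epochs are keyed by name instead of the hex-encoded 32-byte ID, the claim strings are invented, and treating `Additive` as "still visible" is a guess at the semantics.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Epoch:
    name: str
    superseded_by: Optional[str] = None
    supersession_type: Optional[str] = None


@dataclass
class EpochStore:
    epochs: dict = field(default_factory=dict)      # epoch name -> Epoch
    assertions: list = field(default_factory=list)  # (epoch name, claim) pairs

    def add_epoch(self, name, supersedes=None, supersession_type=None):
        self.epochs[name] = Epoch(name)
        if supersedes is not None:
            # The only write needed: one pointer on the old epoch record.
            old = self.epochs[supersedes]
            old.superseded_by = name
            old.supersession_type = supersession_type

    def assert_claim(self, epoch, claim):
        self.assertions.append((epoch, claim))

    def _is_current(self, epoch_name):
        epoch = self.epochs[epoch_name]
        # Assumption: "Additive" supersession keeps the old epoch visible.
        return epoch.superseded_by is None or epoch.supersession_type == "Additive"

    def query_epoch_aware(self):
        return [c for e, c in self.assertions if self._is_current(e)]

    def query_history(self):
        return [c for _, c in self.assertions]


store = EpochStore()
store.add_epoch("pre_fda_label_2024")
for i in range(500):
    store.assert_claim("pre_fda_label_2024", f"safe_use_guideline_{i}")

store.add_epoch("post_fda_label_2024", supersedes="pre_fda_label_2024",
                supersession_type="Invalidate")
store.assert_claim("post_fda_label_2024", "boxed_warning_applies")

print(len(store.query_epoch_aware()), len(store.query_history()))
```

The read path in this toy scans every assertion; a real store would presumably index by epoch. The write path is the part that stays constant regardless of how many assertions the old epoch holds.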
---
## Feature 3: Multi-Signature Consensus (The Hive)
### The Failure Mode
A pre-print on bioRxiv claims a breakthrough. A week later, a peer-reviewed letter in *The Lancet* refutes it. In a standard database, these are just two rows of text.
### The Episteme Solution
Agents don't just "write" data; they **Co-Sign** it.
```
// Agent A (Researcher) finds a fact
POST /assert { ... object: "High Efficacy", agent_id: "researcher_bot" }

// Agent B (Peer Reviewer) validates the fact
POST /cosign {
  "assertion_hash": "...",
  "agent_id": "lancet_reviewer_agent",
  "signature_weight": 100 // Tier 1 Authority
}
```
**Pillar:** Multi-Signature Consensus. The database implements **Supreme Court** logic: expert agents (Tier 1) can override the noise of the "Worker Agent" swarm without deleting the history of the debate.
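The override behaviour could be sketched as follows. The resolution rule is an assumption, not the documented algorithm: candidates are ranked first by their single highest signature weight and only then by total weight, so no number of low-tier signatures outranks one Tier 1 signature. `SignedAssertion`, `resolve`, the worker agent IDs, and the refuting claim text are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SignedAssertion:
    claim: str
    signatures: dict = field(default_factory=dict)  # agent_id -> signature weight

    def cosign(self, agent_id: str, weight: int) -> None:
        """Attach (or update) one agent's signature on this assertion."""
        self.signatures[agent_id] = weight

    def rank(self) -> tuple:
        """Sort key: highest single authority first, total weight as tie-breaker."""
        weights = self.signatures.values()
        return (max(weights, default=0), sum(weights))


def resolve(candidates):
    """Pick the prevailing assertion without removing the others."""
    return max(candidates, key=lambda a: a.rank())


preprint = SignedAssertion("High Efficacy")
for i in range(150):
    preprint.cosign(f"worker_agent_{i}", 1)  # many low-tier signatures

refutation = SignedAssertion("Efficacy Not Replicated")
refutation.cosign("lancet_reviewer_agent", 100)  # one Tier 1 signature

debate = [preprint, refutation]
winner = resolve(debate)
print(winner.claim, "| assertions retained:", len(debate))
```

Ranking by maximum weight before total weight is what makes this tier-based; a plain sum would let 150 worker signatures outvote the reviewer.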
---
## Feature 4: Semantic Decay (Knowledge Half-Life)
### The Failure Mode
Medical knowledge has a t½ of ~73 days. A "cutting-edge" study from 6 months ago is often "Old News" or "Stale." In Postgres, data lives forever until deleted.
### The Episteme Solution
Episteme applies a **Confidence Half-Life** at read time.
```
GET /query?subject=Tirzepatide&predicate=weight_loss_pct&lens=authority&decay=73d
```
- **Study (10 days old):** 0.95 Confidence -> **0.86 Effective Confidence**
- **Study (200 days old):** 0.95 Confidence -> **0.14 Effective Confidence**
The old data "fades" from the hot path automatically. If a "Super Curator" (The Judge) re-verifies the old study, it triggers a **Resurrection Event**, resetting the decay timer.
**Pillar:** Semantic Decay. Episteme handles the "Metabolism" of knowledge, ensuring agents don't hallucinate based on "Context Pollution" from stale research.
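Read-time decay with a half-life can be sketched as plain exponential decay, `effective = confidence * 0.5 ** (age_days / half_life_days)`. This assumes the decay curve is exponential, which the term "half-life" implies but the text does not state outright. The function name and the idea of measuring age from the last verification are illustrative assumptions.

```python
def effective_confidence(confidence: float, age_days: float,
                         half_life_days: float = 73.0) -> float:
    """Discount a stored confidence by exponential decay with the given half-life."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return confidence * 0.5 ** (age_days / half_life_days)


# Age is measured from the last verification, so a "Resurrection Event"
# is modelled here as simply resetting age_days to 0.
fresh = effective_confidence(0.95, age_days=10)
stale = effective_confidence(0.95, age_days=200)
resurrected = effective_confidence(0.95, age_days=0)

for label, value in [("10 days", fresh), ("200 days", stale), ("re-verified", resurrected)]:
    print(f"{label}: {value:.2f}")
```

Because the discount is applied at read time, the stored confidence never changes. The same assertion can be queried with a different `decay` value and yield a different effective confidence.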
---
## Feature 5: Visual Anchoring (AVAM)
### The Failure Mode
An agent extracts "15% Weight Loss" from a PDF. It turns out the OCR misread "1.5%". The text assertion is now a lie.
### The Episteme Solution
Assertions are anchored to a **Visual Hash (pHash)** of the primary data source.
```
POST /assert
{
  "subject": "STEP-1_Trial",
  "predicate": "primary_endpoint",
  "object": { "Percent": 14.9 },
  "visual_hash": "0x8f3c...", // pHash of the results table in the PDF
  "confidence": 1.0
}
```
When the **Super Curator** audits the fact, it uses a multimodal LLM to look at the *pixels* of the chart, not the *text* of the assertion. If the pixels don't match the claim, the assertion is invalidated.
**Pillar:** Visual Anchoring. StemeDB anchors truth to the physical evidence (pixels), providing the "Eye" that prevents text-based drift.
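To show how a visual hash can detect that the underlying pixels changed, here is a sketch using an average hash (aHash), a simpler relative of the DCT-based pHash named above. It is a swapped-in technique chosen to keep the example dependency-free. The 4x4 grayscale grids stand in for a downscaled image of a results table, and all names and pixel values are hypothetical.

```python
def average_hash(pixels):
    """aHash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Downscaled grayscale stand-ins for the anchored chart.
original = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
# Same chart after lossy recompression: small per-pixel noise.
recompressed = [
    [198, 203, 12, 9],
    [201, 197, 11, 14],
    [8, 13, 202, 199],
    [12, 9, 198, 204],
]
# A different chart entirely.
altered = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]

anchor = average_hash(original)
same_distance = hamming(anchor, average_hash(recompressed))
drift_distance = hamming(anchor, average_hash(altered))
print(same_distance, drift_distance)
```

A small Hamming distance means the source image is visually unchanged; a large one flags drift and would be the cue to send the assertion for a pixel-level audit. The threshold between the two is a tuning choice that this sketch leaves open.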
---
## The Home Run: "The Simulator"
By running "Operation LeanMass" on Episteme, the team passively builds the **"Simulator"**:
- A dataset of every "Failed Experiment" (Negative Trajectories).
- A log of every "High-Confidence Failure" (Conflict).
- A library of "Golden Paths" (Resolved Consensus).
This data is licensed to model labs to train **Medical Reasoning Adapters**, making StemeDB the primary supplier of "experience" for the next generation of Scientific AGI.
---
## Summary: Why Episteme for BioTech?
| Problem | Vector DB Approach | Episteme Approach |
|---------|--------------------|-------------------|
| "Muscle Loss" vs "No Loss" | Averages/Hallucinates | **Skeptic Lens** flags variance |
| FDA Label Update | O(N) Manual Update | **Epoch Supersession** (O(1)) |
| Pre-print vs Lancet | Text Similarity | **Multi-Sig** reputation weight |
| Knowledge Half-Life | Metadata sorting | **Semantic Decay** (auto-fading) |
| OCR Errors | Trust the text | **Visual Anchoring** (pHash) |
In Pharma, the "Git for Truth" isn't a feature; it's the only way to avoid the liability of a hallucinating research swarm.