BioTech/Pharma: The Living Systematic Review (GLP-1)

Tier: Strategic Pilot

Pillars Used: First-Class Contradiction, Paradigm Supersession (Epochs), Multi-Signature Consensus, Semantic Decay, Visual Anchoring

Postgres Test: FAILED - Negation blindness in vector search leads to dangerous clinical recommendations; manual decay calculations in SQL cannot handle rapidly shifting paradigms; provenance for "Dark Matter" (failed trials) is lost.

The Hazard (Without Episteme)

I observed a medical research team spend 4 months updating a systematic review on "GLP-1 muscle loss," only for it to be obsolete 72 hours after publication.

The problem wasn't their talent; it was their database. They used a standard RAG system (Vector DB + Postgres) to ingest 5,000 papers. When a new trial (STEP-1) contradicted earlier pilot data regarding lean mass preservation, the vector database retrieved both papers with nearly identical similarity scores.

Because standard embeddings suffer from Negation Blindness, the search for "Does Semaglutide cause muscle loss?" returned:

  1. "Trial A: Semaglutide causes significant muscle loss."
  2. "Trial B: Semaglutide does not cause significant muscle loss."

The LLM, attempting to synthesize these into a "canonical" answer, hallucinated a cautious "maybe," averaging out the truth. The team missed the fact that Trial B was a Phase III RCT while Trial A was a small, unblinded pilot. By failing to structurally model the Conflict, they served a dangerous "Generative Soup" instead of a valid scientific consensus.

The failure mode: Vector databases optimize for plausibility (similarity), not validity. In Pharma, plausibility is a patient safety risk.
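The retrieval problem can be reproduced with a toy example. The sketch below uses a bag-of-words cosine similarity as a crude stand-in for a learned embedding (real embedding models are dense and behave differently in detail); the helper names `bag_of_words` and `cosine_similarity` are illustrative and not part of any Episteme API.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count word tokens (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query = bag_of_words("Does Semaglutide cause muscle loss?")
trial_a = bag_of_words("Semaglutide causes significant muscle loss.")
trial_b = bag_of_words("Semaglutide does not cause significant muscle loss.")

score_a = cosine_similarity(query, trial_a)
score_b = cosine_similarity(query, trial_b)
print(f"Trial A: {score_a:.2f}  Trial B: {score_b:.2f}")
```

Both statements score 0.6 or higher against the query even though they make opposite claims. The similarity score measures topical overlap and carries no information about which claim is asserted or negated.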


The Scenario: "Operation LeanMass"

A pharmaceutical R&D team is monitoring the "GLP-1 Agonist" landscape (Semaglutide, Tirzepatide). Evidence is exploding from PubMed, bioRxiv, and ClinicalTrials.gov.

The system must:

  1. Model contradictory study results explicitly (Muscle Loss vs. Preservation).
  2. Handle "Paradigm Shifts" (e.g., a new FDA label update superseding all prior trial assertions).
  3. Weight results by Journal Reputation and Agent TrustRank (Multi-Sig).
  4. Apply aggressive "Semantic Decay" to knowledge with a 73-day half-life.
  5. Anchor claims to pixels (Screenshots of primary data charts) to detect data drift.

Feature 1: First-Class Contradiction (The Skeptic Lens)

The Failure Mode

In medical research, "Truth" isn't a binary; it's a distribution. When two studies disagree, picking a "winner" or averaging the results hides the very signal a researcher needs: Variance.

The Episteme Solution

POST /assert
{
  "subject": "Semaglutide",
  "predicate": "muscle_sparing_effect",
  "object": { "Boolean": false },
  "source_hash": "study_low_n_2021",
  "confidence": 0.7,
  "signatures": [{ "agent_id": "pubmed_crawler_01", ... }]
}

POST /assert
{
  "subject": "Semaglutide",
  "predicate": "muscle_sparing_effect",
  "object": { "Boolean": true },
  "source_hash": "step_1_trial_2023",
  "confidence": 0.95,
  "signatures": [{ "agent_id": "reviewer_agent_alpha", ... }]
}

Querying the "State of Truth":

GET /query?subject=Semaglutide&predicate=muscle_sparing_effect&lens=skeptic
-> Returns {
     "conflict_score": 0.88,
     "variance": "High",
     "candidates": [
       { "val": false, "trust": 0.12 },
       { "val": true, "trust": 0.86 }
     ]
   }

Pillar: First-Class Contradiction. The Lens::Skeptic identifies that the scientific community is in disagreement, preventing the "Hallucination Cascade" where the agent averages two opposites.
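A minimal sketch of the idea follows: keep every competing value visible instead of collapsing them into one answer. The scoring formula behind `conflict_score` and `trust` is not specified here, so this sketch does not try to reproduce those figures. It uses a hypothetical `share` (each value's fraction of total stated confidence) and a hypothetical `contested` flag; the `Assertion` class and `skeptic_view` function are illustrative names only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    value: bool
    source_hash: str
    confidence: float


def skeptic_view(assertions: list[Assertion]) -> dict:
    """Group assertions by value and report every side instead of picking a winner."""
    weight_by_value: dict[bool, float] = defaultdict(float)
    for a in assertions:
        weight_by_value[a.value] += a.confidence
    total = sum(weight_by_value.values())
    candidates = [
        # "share" is a hypothetical measure, not Episteme's trust score.
        {"val": value, "share": weight / total if total else 0.0}
        for value, weight in sorted(weight_by_value.items())
    ]
    return {"contested": len(candidates) > 1, "candidates": candidates}


assertions = [
    Assertion("Semaglutide", "muscle_sparing_effect", False, "study_low_n_2021", 0.7),
    Assertion("Semaglutide", "muscle_sparing_effect", True, "step_1_trial_2023", 0.95),
]
view = skeptic_view(assertions)
print(view)
```

The point of the design is that the caller receives both candidates and an explicit disagreement signal, and decides what to do with them, rather than receiving an averaged answer.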


Feature 2: Paradigm Supersession (Epochs)

The Failure Mode

The FDA releases a new "Warning Label" for a drug class. Instantly, 500 assertions regarding "Safe Use Guidelines" derived from older trials are now legally and clinically superseded. In Postgres, you either run O(N) updates or build complex is_active logic that fails to capture why things changed.

The Episteme Solution

Assertions are tagged with an Epoch. When the paradigm shifts, we supersede the entire epoch in one O(1) operation.

POST /v1/epoch
{
  "name": "post_fda_label_2024",
  "supersedes": "<hex-encoded-id-of-pre_fda_label_2024>",
  "supersession_type": "Invalidate"
}

The supersedes field is the hex-encoded 32-byte ID of the prior epoch. The supersession_type can be Invalidate (factually incorrect), Temporal (outdated but was correct), Refinement (more precise), RequiresReview (flagged for review), or Additive (extends without replacing). Additional context like the reason can be stored in assertions tagged with this epoch.

Effect: Queries using Lens::EpochAware automatically ignore the 500 assertions tagged with the pre_fda_label_2024 epoch. Those assertions remain available through Lens::History for audit but are "excreted" from the current reasoning context.

Pillar: Paradigm Supersession. Truth isn't just updated; it is evolved. Epochs allow the system to "change its mind" at scale.
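One way to picture why supersession is a constant-time write: assertions carry an epoch tag, and retiring an epoch records a single entry rather than touching each assertion. The sketch below is a toy in-memory model under that assumption; `EpochStore` and its methods are hypothetical, it keys epochs by name instead of the hex-encoded 32-byte ID, and its read path is a plain linear scan.

```python
from dataclasses import dataclass, field


@dataclass
class EpochStore:
    """Toy model: assertions carry an epoch name; supersession is one map write."""
    assertions: list[tuple[str, str]] = field(default_factory=list)  # (epoch, claim)
    superseded_by: dict[str, tuple[str, str]] = field(default_factory=dict)

    def assert_claim(self, epoch: str, claim: str) -> None:
        self.assertions.append((epoch, claim))

    def supersede(self, old_epoch: str, new_epoch: str, supersession_type: str) -> None:
        # A single dictionary entry retires every assertion tagged with old_epoch.
        self.superseded_by[old_epoch] = (new_epoch, supersession_type)

    def epoch_aware(self) -> list[str]:
        """Current view: skip assertions whose epoch has been superseded."""
        return [c for e, c in self.assertions if e not in self.superseded_by]

    def history(self) -> list[str]:
        """Audit view: every assertion, superseded or not."""
        return [c for _, c in self.assertions]


store = EpochStore()
store.assert_claim("pre_fda_label_2024", "guideline A")
store.assert_claim("pre_fda_label_2024", "guideline B")
store.supersede("pre_fda_label_2024", "post_fda_label_2024", "Invalidate")
store.assert_claim("post_fda_label_2024", "guideline C")
print(store.epoch_aware(), store.history())
```

Nothing is deleted or rewritten: the superseded assertions stay in the history view, and the recorded supersession type preserves why they were retired.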


Feature 3: Multi-Signature Consensus (The Hive)

The Failure Mode

A pre-print on bioRxiv claims a breakthrough. A week later, a peer-reviewed letter in The Lancet refutes it. In a standard database, these are just two rows of text.

The Episteme Solution

Agents don't just "write" data; they Co-Sign it.

-- Agent A (Researcher) finds a fact
POST /assert { ... object: "High Efficacy", agent_id: "researcher_bot" }

-- Agent B (Peer Reviewer) validates the fact
POST /cosign { 
  "assertion_hash": "...", 
  "agent_id": "lancet_reviewer_agent", 
  "signature_weight": 100 // Tier 1 Authority
}

Pillar: Multi-Signature Consensus. The database implements a Supreme Court logic where expert agents (Tier 1) can override the noise of the "Worker Agent" swarm without deleting the history of the debate.
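The weighting idea can be sketched as a simple tally: sum the signature weights attached to each assertion and prefer the heaviest, while keeping the full tally as the record of the debate. This is an assumption about how weights might combine, not a description of Episteme's resolution logic; `resolve_by_signatures`, the assertion hashes, and the worker weight of 1 are all hypothetical.

```python
from collections import defaultdict


def resolve_by_signatures(cosigns: list[dict]) -> tuple[str, dict[str, int]]:
    """Sum signature weights per assertion hash; return the heaviest plus the full tally."""
    if not cosigns:
        raise ValueError("no signatures to resolve")
    totals: dict[str, int] = defaultdict(int)
    for c in cosigns:
        totals[c["assertion_hash"]] += c["signature_weight"]
    winner = max(totals, key=totals.get)
    return winner, dict(totals)


# Five worker agents back the pre-print; one Tier 1 reviewer backs the refutation.
cosigns = [
    {"assertion_hash": "preprint_claim", "agent_id": f"worker_{i}", "signature_weight": 1}
    for i in range(5)
]
cosigns.append(
    {"assertion_hash": "lancet_refutation", "agent_id": "lancet_reviewer_agent", "signature_weight": 100}
)

winner, tally = resolve_by_signatures(cosigns)
print(winner, tally)
```

The losing assertion and its signatures remain in the tally, so the disagreement stays auditable after the higher-authority signature prevails.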


Feature 4: Semantic Decay (Knowledge Half-Life)

The Failure Mode

Medical knowledge has a t½ of ~73 days. A "cutting-edge" study from 6 months ago is often "Old News" or "Stale." In Postgres, data lives forever until deleted.

The Episteme Solution

Episteme applies a Confidence Half-Life at read time.

GET /query?subject=Tirzepatide&predicate=weight_loss_pct&lens=authority&decay=73d
  • Study (10 days old): 0.95 Confidence -> 0.86 Effective Confidence
  • Study (200 days old): 0.95 Confidence -> 0.14 Effective Confidence

The old data "fades" from the hot path automatically. If a "Super Curator" (The Judge) re-verifies the old study, it triggers a Resurrection Event, resetting the decay timer.

Pillar: Semantic Decay. Episteme handles the "Metabolism" of knowledge, ensuring agents don't hallucinate based on "Context Pollution" from stale research.
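Assuming the decay is a plain exponential half-life, the effective confidence is the stored confidence multiplied by 0.5 raised to (age / half-life). The sketch below implements that formula; the function name `effective_confidence` is illustrative, and measuring age from the last verification is an assumption about how a Resurrection Event resets the timer.

```python
def effective_confidence(confidence: float, age_days: float, half_life_days: float = 73.0) -> float:
    """Exponential decay: the confidence halves every half_life_days."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return confidence * 0.5 ** (age_days / half_life_days)


recent = effective_confidence(0.95, age_days=10)
stale = effective_confidence(0.95, age_days=200)
# Assumed resurrection behavior: age is counted from the last verification,
# so a freshly re-verified study starts again at its full stored confidence.
revived = effective_confidence(0.95, age_days=0)

print(f"recent={recent:.2f} stale={stale:.2f} revived={revived:.2f}")
```

Under this model the 200-day-old study drops to roughly 0.14. Because the decay is computed at read time, the stored confidence never changes and different queries can apply different half-lives to the same data.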


Feature 5: Visual Anchoring (AVAM)

The Failure Mode

An agent extracts "15% Weight Loss" from a PDF. It turns out the OCR misread "1.5%". The text assertion is now a lie.

The Episteme Solution

Assertions are anchored to a Visual Hash (pHash) of the primary data source.

POST /assert
{
  "subject": "STEP-1_Trial",
  "predicate": "primary_endpoint",
  "object": { "Percent": 14.9 },
  "visual_hash": "0x8f3c...", // pHash of the results table in the PDF
  "confidence": 1.0
}

When the Super Curator audits the fact, it uses a multimodal LLM to look at the pixels of the chart, not the text of the assertion. If the pixels don't match the claim, the assertion is invalidated.

Pillar: Visual Anchoring. StemeDB anchors truth to the physical evidence (pixels), providing the "Eye" that prevents text-based drift.
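Perceptual hashes are usually compared by Hamming distance: visually similar images produce hashes that differ in only a few bits. The sketch below shows only that comparison step, which can flag that the anchored source image has changed; it does not check the claim against the pixels, which is the multimodal audit's job. The function names, the example hash values, and the 5-bit threshold are all hypothetical.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Number of differing bits between two hex-encoded hashes of equal length."""
    a = hash_a.lower().removeprefix("0x")
    b = hash_b.lower().removeprefix("0x")
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return bin(int(a, 16) ^ int(b, 16)).count("1")


def anchor_matches(stored: str, recomputed: str, max_bits: int = 5) -> bool:
    """Treat the anchor as intact when the hashes differ by at most max_bits (assumed threshold)."""
    return hamming_distance(stored, recomputed) <= max_bits


stored_hash = "0xffff0000ffff0000"      # made-up 64-bit value for illustration
rescanned_hash = "0xffff0000ffff0001"   # same table, minor rendering noise
replaced_hash = "0x0000ffff0000ffff"    # a different image entirely

print(anchor_matches(stored_hash, rescanned_hash), anchor_matches(stored_hash, replaced_hash))
```

A small tolerance lets re-rendered or recompressed copies of the same chart still match, while a substituted or altered figure falls outside the threshold and can be routed to the audit.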


The Home Run: "The Simulator"

By running "Operation LeanMass" on Episteme, the team passively builds the "Simulator":

  • A dataset of every "Failed Experiment" (Negative Trajectories).
  • A log of every "High-Confidence Failure" (Conflict).
  • A library of "Golden Paths" (Resolved Consensus).

This data is licensed to model labs to train Medical Reasoning Adapters, making StemeDB the primary supplier of "experience" for the next generation of Scientific AGI.


Summary: Why Episteme for BioTech?

| Problem | Vector DB Approach | Episteme Approach |
|---|---|---|
| "Muscle Loss" vs "No Loss" | Averages/Hallucinates | Skeptic Lens flags variance |
| FDA Label Update | O(N) Manual Update | Epoch Supersession (O(1)) |
| Pre-print vs Lancet | Text Similarity | Multi-Sig reputation weight |
| Knowledge Half-Life | Metadata sorting | Semantic Decay (auto-fading) |
| OCR Errors | Trust the text | Visual Anchoring (pHash) |

In Pharma, the "Git for Truth" isn't a feature; it's the only way to avoid the liability of a hallucinating research swarm.