jordan 1ce4004807 feat: Complete Phase 2 (The Cortex) - query, lens, and API layers

This commit adds the read path (Cortex) to complement the write path (Spine):

## Crates
- stemedb-api: HTTP API with axum + utoipa OpenAPI
  - /v1/assert, /v1/query, /v1/epoch, /v1/skeptic, /v1/trace, /v1/audit
  - Metered endpoints with quota enforcement
  - Ed25519 signature verification
- stemedb-lens: Truth resolution lenses
  - RecencyLens, ConsensusLens, ConfidenceLens
  - VoteAwareConsensusLens (Ballot Box pattern)
  - TrustAwareAuthorityLens (The Hive pattern)
  - SkepticLens (conflict analysis)
  - EpochAwareLens (paradigm-safe queries)
- stemedb-query: Query engine with materialized views

## Storage Extensions
- VoteStore: Vote aggregation with cached counts
- TrustRankStore: Agent reputation with decay
- AuditStore: Query audit trail
- IndexStore: SP/P/S index structures
- SupersessionStore: Epoch supersession chains

## SDKs
- sdk/go/steme: Go HTTP client with Ed25519 signing
- sdk/go/adk: ADK-Go tools for AI agents

## Documentation
- Updated CLAUDE.md, architecture.md, roadmap.md
- New ai-lookup entries for all services
- Use case docs for consumer health intelligence
- Arena roadmap for simulation advancement

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-01 13:22:44 -07:00


Consumer Health Intelligence: The Living Truth Layer

Tier: Strategic Pilot
Pillars Used: First-Class Contradiction, Invalidation Cascades, Multi-Signature Consensus, Semantic Decay
Postgres Test: FAILED - Source-class hierarchy with cross-tier signal detection requires graph traversal impossible in relational schema; anecdotal clustering with escalation triggers has no SQL equivalent; temporal consensus snapshots across millions of assertions from heterogeneous sources break materialized views; decay rates that vary by source class cannot be expressed in WHERE clauses.

The Catastrophe

In July 2023, a woman considering Semaglutide for weight loss Googled "Ozempic side effects." Here is what she encountered in one hour:

Her doctor said: "Well-tolerated, nausea is common but transient."
The FDA label said: Risk of thyroid C-cell tumors (boxed warning, animal studies).
Reddit r/Ozempic (1.2M members): Hundreds of posts about "Ozempic face," hair loss, muscle wasting, gastroparesis.
A TikTok with 4M views: "Ozempic gave me stomach paralysis."
A NEJM meta-analysis: "No statistically significant increase in gastroparesis."
A FAERS query: 8,500+ gastroparesis reports filed for Semaglutide.
Her friend: "I lost 40 lbs and feel amazing."

She had seven sources. They contradicted each other. She had no way to weigh them. She made her decision based on whichever source she encountered last.

Six months later, the FDA updated the Ozempic label to include intestinal obstruction warnings. The Reddit community had been flagging this for a year.

The failure mode: Consumers navigate health decisions using search engines that rank by engagement, not validity. Contradictions are hidden behind algorithmic curation. Anecdotal signals that precede clinical validation are dismissed as noise -- until they aren't. And when guidance changes, there is no mechanism to notify people whose decisions were based on the old consensus.

This is the same failure mode that drove vaccine hesitancy. Not the existence of misinformation -- the absence of a system that could hold "this is what the clinical trials showed" alongside "this is what people are reporting" alongside "this is what changed since you last looked" without collapsing into either false certainty or false equivalence.

The Domain: Living Health Topics

A "living health topic" is any subject where:

Evidence is actively accumulating (new trials, new reports, new guidance)
Sources span from gold-standard to anecdotal (RCTs to Reddit)
Public interest is high and information spreads faster than validation
Guidance changes and prior guidance doesn't self-destruct
Individual experience varies and population-level data doesn't capture edge cases

Examples: GLP-1 agonists, COVID vaccines, SSRI discontinuation, hormone replacement therapy, long COVID, PFAS exposure, mRNA technology, microplastics, ultra-processed food.

Every one of these has the same structural problem: the truth is not a single value. It is a distribution across source classes, changing over time, with different validity at different moments.

What Episteme Enables

Source-Class Hierarchy

The core innovation for consumer health is treating source authority as structural, not cosmetic. Every assertion entering Episteme carries a source class that determines its weight, decay rate, and role in consensus calculation.

POST /assert
{
  "subject": "semaglutide/adverse-effects/gastroparesis",
  "predicate": "risk_level",
  "object": { "Text": "No statistically significant increase" },

  "source_class": "tier-1",
  "source": {
    "type": "meta-analysis",
    "journal": "NEJM",
    "doi": "10.1056/NEJMoa2403123",
    "sample_size": 14847,
    "study_design": "RCT"
  },

  "confidence": 0.92,
  "lifecycle": "current"
}

POST /assert
{
  "subject": "semaglutide/adverse-effects/gastroparesis",
  "predicate": "risk_level",
  "object": { "Text": "Experienced severe gastroparesis after 3 months" },

  "source_class": "tier-5",
  "source": {
    "type": "patient-report",
    "platform": "reddit",
    "subreddit": "r/Ozempic",
    "upvotes": 847,
    "replies": 234
  },

  "confidence": 0.3,
  "lifecycle": "current"
}

Both assertions coexist. Neither overwrites the other. But they are not equal, and the system knows it.

Source-Class Tiers:

Tier	Source Type	Base Weight	Decay Half-Life	Example
0	Regulatory action	1.0	None (until superseded)	FDA label change, EMA withdrawal
1	Peer-reviewed RCT, meta-analysis	0.9	2 years	NEJM, Lancet, JAMA
2	Observational study, real-world evidence	0.7	1 year	Insurance claims data, EHR studies
3	Pharmacovigilance	0.5	18 months	FAERS reports, EudraVigilance
4	Clinician anecdote, case report	0.4	6 months	Conference presentations, case series
5	Patient community	0.2	3 months	Reddit, forums, patient registries
6	Media, influencer, commercial	0.1	30 days	TikTok, news articles, pharma marketing

Postgres Test: A relational schema can store these tiers as an enum column. It cannot make the tier structurally affect query resolution, decay calculation, and consensus weighting without rebuilding Episteme's lens system in application code.
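
The tier table above can be written down as a small lookup structure. This is a minimal sketch, not Episteme's actual data model: the names `SourceClass`, `SOURCE_CLASSES`, and `parse_source_class` are hypothetical, and the half-lives are converted to approximate day counts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceClass:
    tier: int
    label: str
    base_weight: float
    half_life_days: Optional[int]  # None = no decay until superseded


# Weights and half-lives copied from the tier table; day counts are approximate.
SOURCE_CLASSES = {
    0: SourceClass(0, "Regulatory action", 1.0, None),
    1: SourceClass(1, "Peer-reviewed RCT, meta-analysis", 0.9, 730),
    2: SourceClass(2, "Observational study, real-world evidence", 0.7, 365),
    3: SourceClass(3, "Pharmacovigilance", 0.5, 548),
    4: SourceClass(4, "Clinician anecdote, case report", 0.4, 183),
    5: SourceClass(5, "Patient community", 0.2, 91),
    6: SourceClass(6, "Media, influencer, commercial", 0.1, 30),
}


def parse_source_class(value: str) -> SourceClass:
    """Map a 'tier-N' string, as used in the /assert payloads, to its class."""
    prefix, _, number = value.partition("-")
    if prefix != "tier" or not number.isdigit() or int(number) not in SOURCE_CLASSES:
        raise ValueError(f"unrecognized source_class: {value!r}")
    return SOURCE_CLASSES[int(number)]
```

Keeping weight and half-life on the class, rather than on each assertion, is what lets a lens look them up during resolution instead of trusting per-assertion values.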

Feature 1: The Nuance Query (Layered Consensus)

The Failure Mode

Consumer health tools give you one answer. Google's AI Overview picks one. WebMD picks one. Your doctor picks one. None of them show you the shape of the disagreement.

The vaccine parallel: "Are COVID vaccines safe?" had a single clinical answer (yes, statistically) and a complex experiential answer (most people were fine; some had significant adverse reactions; the definition of "significant" was itself contested). Collapsing this to a single answer created the trust gap.

The Episteme Solution

GET /query
  ?subject=semaglutide/adverse-effects/gastroparesis
  &lens=layered-consensus

-> Returns:
{
  "subject": "semaglutide/adverse-effects/gastroparesis",
  "consensus_by_tier": {
    "regulatory": {
      "position": "Listed as adverse reaction (post-marketing)",
      "effective_date": "2024-01-12",
      "assertion_count": 2
    },
    "clinical_evidence": {
      "position": "No statistically significant increase in Phase III",
      "conflict_score": 0.34,
      "assertion_count": 12,
      "note": "3 observational studies show elevated signal"
    },
    "pharmacovigilance": {
      "position": "8,547 FAERS reports (disproportionality signal detected)",
      "trend": "increasing",
      "assertion_count": 8547
    },
    "patient_community": {
      "position": "Widely reported, high-engagement posts",
      "sentiment": "negative",
      "cluster_size": 4200,
      "assertion_count": 23891
    }
  },
  "overall_conflict_score": 0.72,
  "guidance_changed_since": "2024-01-12",
  "summary": "Clinical trials show low incidence; post-marketing reports and patient communities report higher rates; FDA added to label January 2024."
}

The consumer sees layers, not a single answer. They see where the tiers agree and where they don't. The disagreement is the information.

Pillar: First-Class Contradiction. Episteme doesn't resolve the gastroparesis question -- it shows the consumer the shape of the evidence across every source class.
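
One way to picture the layered-consensus lens is as a group-by over source class. The sketch below is illustrative only: the tier-to-layer mapping (including the `clinician_reports` and `media` layer names) and the rule of reporting the highest-confidence assertion as each layer's position are assumptions, not the documented behavior of the lens.

```python
from collections import defaultdict

# Assumed mapping from tier number to response layer; layers for tiers 4 and 6
# do not appear in the example response and are hypothetical.
LAYER_BY_TIER = {
    0: "regulatory",
    1: "clinical_evidence",
    2: "clinical_evidence",
    3: "pharmacovigilance",
    4: "clinician_reports",
    5: "patient_community",
    6: "media",
}


def layered_consensus(assertions):
    """Group assertions by layer and report one position per layer."""
    layers = defaultdict(list)
    for a in assertions:
        source_class = a["source_class"]
        if not source_class.startswith("tier-"):
            continue  # e.g. "meta" escalation assertions are not a tier
        tier = int(source_class.split("-")[1])
        layers[LAYER_BY_TIER[tier]].append(a)

    result = {}
    for name, group in layers.items():
        # Assumed rule: the highest-confidence assertion speaks for the layer.
        top = max(group, key=lambda a: a["confidence"])
        result[name] = {
            "position": top["object"]["Text"],
            "assertion_count": len(group),
        }
    return result
```

Because each layer is resolved separately, a large patient-community layer never overwrites the clinical layer; both positions are returned side by side.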

Feature 2: Anecdotal Signal Detection (Cluster Escalation)

The Failure Mode

Patient communities flagged gastroparesis risk, "Ozempic face" (facial volume loss), and hair loss months to years before these appeared in clinical literature or regulatory action. In every traditional system, this signal was noise -- unstructured, self-reported, no controlled comparison.

The same pattern played out with COVID vaccines and myocarditis. VAERS reports were accumulating signal before the clinical confirmation. The problem was not that the signal didn't exist -- it was that no system could surface it structurally without either dismissing it (underweighting) or amplifying it beyond its evidentiary basis (overweighting).

The Episteme Solution

Episteme tracks assertion clusters at lower tiers. When a cluster crosses a density threshold, it generates an escalation assertion -- a meta-claim that says "this topic has unusual anecdotal density and deserves investigation."

# Background process (The Gardener) detects clustering
POST /assert
{
  "subject": "semaglutide/adverse-effects/hair-loss",
  "predicate": "escalation_signal",
  "object": {
    "Text": "Anecdotal cluster detected"
  },

  "source_class": "meta",
  "meta": {
    "trigger": "cluster_threshold",
    "tier_5_assertions": 1847,
    "tier_5_growth_rate": "312/month",
    "tier_1_assertions": 0,
    "clinical_gap": true,
    "earliest_report": "2022-06-14",
    "sentiment_polarity": -0.78
  },

  "confidence": 0.6,
  "lifecycle": "under-review"
}

The escalation assertion does not claim hair loss is a real side effect. It claims that patient communities are reporting it at a rate that warrants clinical attention. This is the difference between amplifying noise and surfacing signal.

GET /query
  ?subject=semaglutide/adverse-effects/hair-loss
  &lens=layered-consensus

-> Returns:
{
  "clinical_evidence": {
    "position": "Not reported in Phase III trials",
    "assertion_count": 0
  },
  "patient_community": {
    "position": "Widely reported",
    "cluster_size": 1847,
    "growth_rate": "312/month"
  },
  "escalation": {
    "status": "under-review",
    "signal": "Anecdotal cluster with no clinical counterpart",
    "gap_detected": "2023-09-01"
  }
}

A consumer sees: "Clinical trials didn't study this, but a lot of people are reporting it. This gap has been flagged for investigation." That is an honest answer. It is not "yes," it is not "no," and it is not "we don't know." It is "here is exactly what each source class says, and here is where they disagree."

Pillar: Multi-Signature Consensus + Invalidation Cascades. The cluster detection acts as a low-tier consensus mechanism. The escalation assertion triggers review without granting anecdotal data clinical authority.
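
The cluster-threshold check behind an escalation assertion can be sketched in a few lines. The function name `escalation_for` is hypothetical, and the default threshold of 500 is borrowed from the time-travel example later in this document; the real Gardener process may use a different rule, such as growth rate.

```python
from typing import Optional

ESCALATION_THRESHOLD = 500  # assumed default, taken from the time-travel example


def escalation_for(subject: str,
                   tier_5_assertions: int,
                   tier_1_assertions: int,
                   threshold: int = ESCALATION_THRESHOLD) -> Optional[dict]:
    """Return an escalation assertion payload, or None below the threshold."""
    if tier_5_assertions < threshold:
        return None
    return {
        "subject": subject,
        "predicate": "escalation_signal",
        "object": {"Text": "Anecdotal cluster detected"},
        "source_class": "meta",
        "meta": {
            "trigger": "cluster_threshold",
            "tier_5_assertions": tier_5_assertions,
            "tier_1_assertions": tier_1_assertions,
            # A clinical gap means patients report it but no trial covers it.
            "clinical_gap": tier_1_assertions == 0,
        },
        "lifecycle": "under-review",
    }
```

The payload makes a claim about report density only; nothing in it asserts that the side effect is real.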

Feature 3: Guidance Change Propagation

The Failure Mode

In January 2024, the FDA updated the Semaglutide label to include intestinal obstruction. Everyone who researched Semaglutide before January 2024 made their decision based on a label that no longer reflects the regulatory position. There is no mechanism to propagate this change to prior consumers of that information.

With COVID vaccines, guidance changed repeatedly: eligibility criteria, booster recommendations, age restrictions, brand preferences. Each change invalidated prior guidance, but people continued operating on the version they last encountered.

The Episteme Solution

Episteme's epoch system and invalidation cascades propagate changes structurally:

POST /epoch/supersede
{
  "old_epoch": "semaglutide-label-pre-2024",
  "new_epoch": "semaglutide-label-2024-01",
  "type": "Augment",
  "reason": "FDA label update: intestinal obstruction warning added",
  "sections_affected": ["adverse-reactions", "warnings-and-precautions"],
  "effective_date": "2024-01-12"
}

When a consumer returns to the topic -- or when an agent queries on their behalf -- the system surfaces what changed:

GET /query
  ?subject=semaglutide/adverse-effects
  &lens=layered-consensus
  &since=2023-10-01

-> Returns:
{
  "changes_since_query": [
    {
      "date": "2024-01-12",
      "type": "regulatory",
      "change": "FDA label updated: intestinal obstruction added to warnings",
      "impact": "12 prior assertions in adverse-effects now reference superseded label"
    }
  ],
  "current_consensus": { ... }
}

The consumer doesn't just get today's answer. They get "here is what changed since you last looked." This is the mechanism that was missing for vaccine guidance -- not a static FAQ, but a living diff.

Pillar: Invalidation Cascades. Label changes propagate to all downstream assertions. Prior consensus snapshots remain for audit ("what did we believe in October?") but the current query reflects the updated state.
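
The "what changed since you last looked" query reduces to filtering supersession records by date and counting the assertions tied to the superseded epoch. This is a rough sketch under assumptions: the `epoch` field on assertions and the `affected_assertions` output field are hypothetical, and the real cascade presumably walks the DAG rather than scanning a list.

```python
from datetime import date


def changes_since(supersessions, assertions, since: date):
    """List supersessions that took effect after `since`, oldest first."""
    changes = []
    for s in supersessions:
        if date.fromisoformat(s["effective_date"]) <= since:
            continue  # the consumer has already seen this change
        # Count assertions still anchored to the superseded epoch.
        affected = sum(1 for a in assertions if a.get("epoch") == s["old_epoch"])
        changes.append({
            "date": s["effective_date"],
            "change": s["reason"],
            "superseded_epoch": s["old_epoch"],
            "affected_assertions": affected,
        })
    # ISO dates sort correctly as strings.
    return sorted(changes, key=lambda c: c["date"])
```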

Feature 4: Source-Aware Decay

The Failure Mode

Medical knowledge decays at different rates depending on its source. A Phase III RCT from 2022 is still highly relevant. A Reddit post from 2022 about side effects may reflect a formulation or dosing protocol that has since changed. A TikTok from 2022 may reference guidance that has been updated three times.

Uniform decay (or no decay) means stale anecdotes compete with current clinical evidence. This is how "Ozempic causes X" persists in search results long after the claim has been investigated and contextualized.

The Episteme Solution

Decay rates are tied to source class:

GET /query
  ?subject=semaglutide/efficacy/weight-loss
  &lens=authority
  &decay=source-aware

-> Resolution:
  NEJM meta-analysis (2023):  base 0.90 * decay(2yr, age=8mo)  = 0.71 effective
  Reddit anecdote (2022):     base 0.20 * decay(3mo, age=26mo)  = 0.00 effective (expired)
  FAERS aggregate (2024):     base 0.50 * decay(18mo, age=2mo)  = 0.46 effective
  TikTok claim (2023):        base 0.10 * decay(30d, age=14mo)  = 0.00 effective (expired)
  FDA label (2024):           base 1.00 * decay(none)           = 1.00 effective

Old anecdotes fade. Old clinical evidence persists. Regulatory actions persist until superseded. The hot path reflects the current evidentiary landscape without manual curation.

A consumer querying today doesn't see a 2022 TikTok claim alongside a 2024 FDA label update. The system's decay has already handled the temporal relevance.

Pillar: Semantic Decay. Source-class-aware decay ensures the knowledge metabolism matches the actual shelf life of each evidence type.
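
Source-aware decay can be sketched as a single function. This assumes plain exponential half-life decay, which the tier table's "Decay Half-Life" column suggests but the document does not specify; the function name `effective_weight` is hypothetical, and the exact effective values depend on the curve chosen.

```python
from typing import Optional


def effective_weight(base_weight: float,
                     half_life_days: Optional[float],
                     age_days: float) -> float:
    """Weight after decay: halves every `half_life_days`; None means no decay."""
    if half_life_days is None:
        return base_weight  # regulatory actions persist until superseded
    return base_weight * 0.5 ** (age_days / half_life_days)


# A Tier 1 source exactly one half-life old keeps half its base weight.
tier_1_after_two_years = effective_weight(0.9, 730, 730)  # 0.45

# A Tier 5 anecdote 26 months old has passed more than eight half-lives.
reddit_after_26_months = effective_weight(0.2, 91, 790)
```

A lens could treat anything below a small cutoff as expired and drop it from the hot path, which is how old anecdotes fade without manual curation.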

Feature 5: Time Travel for Personal Audit

The Failure Mode

A consumer started Semaglutide in June 2023. In January 2024, new warnings were added. They want to know: "What was the known risk profile when I started?" -- not to assign blame, but to understand whether their decision was reasonable given available evidence.

This same question was asked millions of times about COVID vaccines: "What was known about myocarditis risk when I got my second dose in April 2021?"

No consumer tool can answer this. Google results reflect today's knowledge. The FDA website reflects the current label. The clinical evidence has been updated. The historical state is gone.

The Episteme Solution

GET /query
  ?subject=semaglutide/adverse-effects
  &lens=layered-consensus
  &as_of=2023-06-15

-> Returns the consensus snapshot from June 2023:
{
  "regulatory": {
    "position": "Boxed warning: thyroid C-cell tumors (animal). No GI obstruction warning.",
    "label_version": "2021-06-04"
  },
  "clinical_evidence": {
    "position": "Nausea common, no gastroparesis signal in Phase III",
    "assertion_count": 8
  },
  "patient_community": {
    "position": "Growing reports of gastroparesis, hair loss",
    "cluster_size": 340,
    "escalation": "not yet triggered (threshold: 500)"
  }
}

The consumer can see exactly what was known, at every tier, at the moment they made their decision. This is the audit trail that health decisions currently lack.

Pillar: First-Class Contradiction (temporal). The append-only DAG preserves every historical state. Time travel is a hash lookup, not a reconstruction.
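
The effect of `as_of` on an append-only log can be shown with a simple filter. Note the swap: this sketch scans an in-memory list rather than using the hash lookup described above, and the `asserted_at` field and the function name `as_of` are hypothetical.

```python
from datetime import date


def as_of(log, subject_prefix: str, when: date):
    """Return the assertions under a subject that existed on or before `when`."""
    return [
        a for a in log
        if a["subject"].startswith(subject_prefix)
        and date.fromisoformat(a["asserted_at"]) <= when
    ]
```

Because nothing is ever overwritten, the June 2023 view is simply the log with later entries left out; no reconstruction from backups is needed.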

Feature 6: The Disagreement Dashboard

The Failure Mode

Consumer health tools present certainty. "Ozempic is safe" or "Ozempic is dangerous." This false binary creates two failure modes: blind trust or conspiracy-driven rejection. The vaccine discourse demonstrated both at scale.

The actual evidence is nuanced: safe for most, with specific risks for specific populations, with emerging signals that may or may not be validated, and with guidance that changes as evidence accumulates. No consumer-facing tool presents this structure.

The Episteme Solution

The Skeptic Lens generates a disagreement map -- not to create doubt, but to show where certainty exists and where it doesn't:

GET /query
  ?subject=semaglutide
  &lens=skeptic
  &scope=adverse-effects

-> Returns:
{
  "resolved": [
    {
      "topic": "nausea",
      "consensus": "Common, dose-dependent, generally transient",
      "conflict_score": 0.08,
      "all_tiers_agree": true
    }
  ],
  "active_disagreement": [
    {
      "topic": "gastroparesis",
      "conflict_score": 0.72,
      "tier_positions": {
        "clinical": "Low incidence in trials",
        "pharmacovigilance": "Disproportionality signal",
        "patient_community": "Widely reported"
      },
      "note": "FDA added to label 2024-01"
    },
    {
      "topic": "muscle_loss",
      "conflict_score": 0.61,
      "tier_positions": {
        "clinical": "Lean mass loss ~40% of total weight lost",
        "patient_community": "Significant concern, exercise mitigation discussed"
      }
    }
  ],
  "emerging_signal": [
    {
      "topic": "hair_loss",
      "source": "tier-5 cluster",
      "clinical_evidence": "none",
      "cluster_size": 1847,
      "status": "under-review"
    }
  ]
}

Three categories: resolved (everyone agrees), active disagreement (tiers disagree, with each position shown), and emerging signal (anecdotal clusters without clinical counterpart). A consumer can see where the science is settled, where it's contested, and where patient experience is ahead of clinical evidence.
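
The three-way split can be expressed as a small classifier. The cutoffs here are assumptions chosen to fit the example response (a conflict score under 0.2 counts as resolved, and a cluster of 500 or more with no clinical assertions counts as emerging); the Skeptic Lens may draw these lines differently.

```python
def classify_topic(conflict_score: float,
                   clinical_assertions: int,
                   community_cluster_size: int,
                   resolved_below: float = 0.2,
                   cluster_threshold: int = 500) -> str:
    """Place a topic in one of the three dashboard categories."""
    # Patient reports with no clinical counterpart: experience is ahead of evidence.
    if clinical_assertions == 0 and community_cluster_size >= cluster_threshold:
        return "emerging_signal"
    # Tiers broadly agree.
    if conflict_score < resolved_below:
        return "resolved"
    # Tiers hold different positions, each of which should be shown.
    return "active_disagreement"
```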

The Vaccine Parallel

Everything described above applies directly to COVID vaccines (and any future vaccine discourse):

GLP-1 Scenario	Vaccine Equivalent
FDA label update adds gastroparesis	EUA conditions change, booster guidance shifts
Reddit flags hair loss 12 months before clinical study	VAERS myocarditis reports accumulate before confirmation
Trial data and FAERS reports disagree on gastroparesis incidence	Different efficacy numbers from different trial endpoints
TikTok influencer claims "stomach paralysis"	Social media claims about fertility, magnetism, etc.
Patient makes decision based on June 2023 data	Person gets vaccinated based on March 2021 guidance
Clinicians and patients weigh the same lean-mass finding differently	Different researchers interpret same trial differently

The structural problem is identical: heterogeneous sources, varying authority, temporal validity, and a consumer who has no mechanism to see the shape of the evidence.

Episteme doesn't tell the consumer what to believe. It shows them what each source class says, where they agree, where they disagree, what changed, and when.

The "Everything" Ingestion Challenge

This use case intentionally targets the hardest version of the problem: ingest all data about a topic. Not a curated corpus. Everything.

Data Sources

Source	Volume	Source Class	Ingestion Method
PubMed / bioRxiv	~5,000 papers for GLP-1	Tier 1-2	API crawl, structured extraction
ClinicalTrials.gov	~800 registered trials	Tier 1-2	API, protocol/results parsing
FDA labels, safety communications	~50 documents	Tier 0	DailyMed / FDA.gov scraping
FAERS adverse event reports	~50,000 for semaglutide	Tier 3	openFDA API
Reddit (r/Ozempic, r/Semaglutide, r/loseit)	~500,000 posts/comments	Tier 5	API, NLP extraction
Patient forums (MyFitnessPal, etc.)	~100,000 posts	Tier 5	Crawl, NLP extraction
News articles	~10,000	Tier 6	News API, extraction
Social media (TikTok, Instagram, X)	~1,000,000+ mentions	Tier 6	Firehose sample, NLP
Manufacturer press releases	~200	Tier 6	Crawl
Insurance claims data (if available)	Millions of records	Tier 2	Batch ingest

Total assertion volume: millions. This is where Episteme's architecture is tested. The append-only DAG, source-class-aware decay, and lens-based resolution must operate at scale without degrading query latency.

The Noise Problem

At this volume, Tier 5-6 assertions outnumber Tier 0-2 by 100:1 or more. Without source-class hierarchy, any consensus mechanism would be dominated by volume rather than validity. This is exactly what happened with vaccine discourse -- the volume of social media claims overwhelmed the signal from clinical evidence in every system that treated all sources equally.

Episteme's response: source class is not a filter applied after retrieval. It is structural. The lens system weights assertions by class during resolution, not after. A million Tier 6 assertions cannot outvote a single Tier 0 regulatory action. But they can signal that something is happening -- and the escalation mechanism ensures that signal is visible.
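
The "cannot outvote" property does not follow from weights alone: if weights were simply summed, a million Tier 6 assertions at 0.1 each would total 100,000 against a single Tier 0 weight of 1.0. One rule that does guarantee it is tier precedence, sketched below as an assumption about how a lens could resolve; the function name `resolve` and the flat `tier`/`text`/`confidence` fields are hypothetical.

```python
from collections import Counter


def resolve(assertions):
    """Let the most authoritative tier present decide; report volume separately."""
    if not assertions:
        return None
    deciding_tier = min(a["tier"] for a in assertions)  # tier 0 is most authoritative
    candidates = [a for a in assertions if a["tier"] == deciding_tier]
    top = max(candidates, key=lambda a: a["confidence"])
    return {
        "position": top["text"],
        "deciding_tier": deciding_tier,
        # Volume is kept as signal for escalation, not as votes.
        "counts_by_tier": dict(Counter(a["tier"] for a in assertions)),
    }
```

Under this rule the lower tiers never change the resolved position, but their counts remain visible, which is what lets the escalation mechanism pick them up.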

Summary: Why Episteme for Consumer Health?

Problem	Current Approach	Episteme Approach
Doctor says X, Reddit says Y	Pick one (or Google picks for you)	Layered consensus shows both with source class
Guidance changes, old advice persists	Manual updates, no propagation	Invalidation cascades propagate changes
Anecdotal signal precedes clinical validation	Dismissed as noise	Cluster escalation surfaces signal without false authority
"Was my decision reasonable?"	No historical record	Time travel to consensus at decision point
Stale TikTok claims compete with current RCTs	Same search ranking	Source-aware decay fades noise, preserves evidence
False certainty creates distrust	One answer, take it or leave it	Disagreement dashboard shows shape of evidence
1M social posts vs 5 clinical trials	Volume wins	Source-class hierarchy weights by validity, not volume

The trust crisis in health information is not caused by the existence of contradictory claims. It is caused by the absence of a system that can hold all claims, from every source, and show a consumer the structure of the evidence without either collapsing it into false certainty or abandoning them to navigate it alone.

Episteme is that system.
