stemedb/vision.md

Episteme: Git for Truth

Internal Codename: StemeDB
Category: Infrastructure / Probabilistic Knowledge Database
Role: The shared memory for AI research agents that disagree

The Problem: Databases Force False Certainty

Current databases (Postgres, Neo4j, Vector DBs) suffer from The Tower of Babel problem: they store Data, not Evidence. They are deterministic, stateless, and brittle.

When multiple agents observe the world and report different things, traditional databases force you to:

  • Pick a winner (losing the disagreement)
  • Build version tables (complexity explodes)
  • Scatter application logic everywhere (authority weighting, decay, cascades)

Real example: A woman researching Semaglutide found her doctor saying "well-tolerated" while Reddit flagged gastroparesis months before the FDA added the warning. She had no way to weigh these sources structurally. The Reddit signal was right. The system failed her.

The Solution: Store Claims, Resolve at Read Time

Episteme rejects the idea of a single, static "database state." Instead, it models knowledge as a Probabilistic Marketplace:

  • Assertions are immutable. Every claim is signed, timestamped, and preserved forever.
  • Contradictions coexist. The database holds disagreement without forcing resolution.
  • Lenses resolve at query time. Different readers can apply different resolution strategies.
  • Source authority is structural. A regulatory filing outweighs a Reddit post by design.

The Four Pillars

Every use case must demonstrate at least one pillar. If Postgres could do it, it's not a compelling use case.

| Pillar | What It Enables | Postgres Gap |
|---|---|---|
| First-Class Contradiction | Hold conflicting facts without forcing resolution | Must pick one value or version-table chaos |
| Invalidation Cascades | Retracted evidence flags all downstream decisions | Recursive CTEs don't scale, app logic drifts |
| Multi-Signature Consensus | Weighted trust via cryptographic co-signatures | Join tables have no cryptographic proof |
| Semantic Decay | Old data fades from hot path but remains auditable | Manual WHERE clauses, inconsistent decay rates |
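
To make the Invalidation Cascades pillar concrete, here is a minimal sketch of the traversal it implies: walk the "cited by" edges from a retracted assertion and flag everything downstream. The `cascade` function, the `u32` ids, and the `dependents` map are hypothetical stand-ins for illustration, not StemeDB's actual API or storage layout.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first walk over "cited by" edges (hypothetical layout).
/// `dependents` maps an assertion id to the ids that build on it.
/// Returns every downstream id that should be flagged for review.
/// The retracted id itself is not included unless a cycle leads back to it.
pub fn cascade(dependents: &HashMap<u32, Vec<u32>>, retracted: u32) -> HashSet<u32> {
    let mut flagged = HashSet::new();
    let mut queue = VecDeque::from([retracted]);
    while let Some(id) = queue.pop_front() {
        // A missing key simply means nothing depends on `id`.
        for &child in dependents.get(&id).into_iter().flatten() {
            // `insert` returns false for ids already seen, so each node is visited once.
            if flagged.insert(child) {
                queue.push_back(child);
            }
        }
    }
    flagged
}
```

Each node is enqueued at most once, so the walk is linear in the number of reachable edges. The point of the pillar is that this traversal lives in the database rather than in every application.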

The Core Data Model: The Signed Assertion

The atomic unit is not a Row, Document, or Embedding. It is the Signed Assertion:

struct Assertion {
    // The Proposition (What is being claimed)
    subject: EntityId,           // "semaglutide", "Tesla_Inc"
    predicate: RelationId,       // "has_side_effect", "annual_revenue"
    object: ObjectValue,         // "gastroparesis", "$96.7B"

    // The Lineage (Why we believe it)
    source_hash: Hash,           // Content-addressed link to source document
    source_class: SourceClass,   // Authority tier (0=Regulatory...5=Anecdotal)
    source_metadata: Option<JSON>, // Rich provenance (journal, DOI, etc.)
    visual_hash: Option<PHash>,  // Perceptual hash for image provenance
    epoch: Option<EpochId>,      // Paradigm context ("covid-2020", "gaap-2024")

    // The Meta-Cognition (Who said it, how confident)
    signatures: Vec<SignatureEntry>,  // Ed25519 cryptographic proofs
    confidence: f32,                   // 0.0-1.0 subjective certainty
    timestamp: u64,                    // When created
    lifecycle: LifecycleStage,         // Proposed → Approved → Deprecated

    // The Semantic (Meaning for similarity search)
    vector: Option<Vec<f32>>,    // Embedding for k-NN queries
}

The Source Class Hierarchy

Every assertion has a source class that structurally affects resolution weight and decay:

| Tier | Class | Example | Decay Half-Life | Authority Weight |
|---|---|---|---|---|
| 0 | Regulatory | FDA label, SEC filing | Never | 1.0 |
| 1 | Clinical | Peer-reviewed RCTs | 2 years | 0.9 |
| 2 | Observational | Real-world evidence | 1 year | 0.7 |
| 3 | Expert | Physician guidelines | 6 months | 0.5 |
| 4 | Community | Patient registries | 3 months | 0.2 |
| 5 | Anecdotal | Reddit posts, social | 30 days | 0.1 |

A million Tier-5 anecdotal assertions cannot outvote a single Tier-0 regulatory assertion. But the million anecdotes can signal "something is happening here" via cluster escalation.
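
One way to read this rule is that tier precedence is lexicographic: counts only matter within a tier, and a separate check watches for anecdotal clusters. The sketch below assumes that reading. `TieredClaim`, `resolve_by_tier`, `needs_escalation`, and the threshold parameter are hypothetical names for illustration, not the actual resolution or escalation code.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified claim: a tier (0 = Regulatory ... 5 = Anecdotal)
/// and the asserted object value.
#[derive(Debug, Clone, PartialEq)]
pub struct TieredClaim {
    pub tier: u8,
    pub object: String,
}

/// Pick the most common object value within the most authoritative tier present.
/// Less authoritative tiers are never consulted, so no number of Tier-5 claims
/// can override a single Tier-0 claim.
pub fn resolve_by_tier(claims: &[TieredClaim]) -> Option<String> {
    let best_tier = claims.iter().map(|c| c.tier).min()?;
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for c in claims.iter().filter(|c| c.tier == best_tier) {
        *counts.entry(c.object.as_str()).or_insert(0) += 1;
    }
    // Equal counts are broken by the smaller object string, for determinism.
    counts
        .into_iter()
        .max_by(|a, b| a.1.cmp(&b.1).then_with(|| b.0.cmp(a.0)))
        .map(|(object, _)| object.to_string())
}

/// Flag an object value for review once enough Tier-5 claims cluster on it,
/// even though those claims cannot win resolution on their own.
pub fn needs_escalation(claims: &[TieredClaim], object: &str, threshold: usize) -> bool {
    claims
        .iter()
        .filter(|c| c.tier == 5 && c.object == object)
        .count()
        >= threshold
}
```

Under this sketch, one Tier-0 "well-tolerated" claim still wins resolution against a thousand Tier-5 "gastroparesis" claims, while `needs_escalation` reports the cluster so a reviewer can look at it.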

The Query Engine: Truth Lenses

Reading applies a Lens to collapse the probabilistic field into a concrete answer. Materialized Views ensure sub-millisecond latency for common patterns.

Resolution Lenses (Pick a Winner)

| Lens | Behavior |
|---|---|
| Recency | Last writer wins |
| Consensus | Highest cluster density of object values |
| Authority | Filter by TrustRank reputation |
| Vote-Aware | Weight by Ballot Box votes |
| EpochAware | Filter out superseded paradigms |

Analysis Lenses (Surface Disagreement)

| Lens | Behavior |
|---|---|
| Skeptic | Return all claims with conflict score and weight shares |
| Layered Consensus | Per-source-class resolution (tier-by-tier visibility) |
| Constraints | Pre-flight check for must_use/forbidden predicates |

The Skeptic and Layered Consensus lenses are key differentiators: they answer "where do sources agree and disagree?" rather than hiding the variance.
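
The difference between the two lens families can be sketched over the same set of claims. This is an illustration under stated assumptions: `WeightedClaim`, `recency`, `skeptic`, and `SkepticReport` are hypothetical names, and the conflict score used here (one minus the largest weight share) is just one plausible definition, not necessarily the one the Skeptic lens uses.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified claim for lens illustration.
#[derive(Debug, Clone)]
pub struct WeightedClaim {
    pub object: String,
    pub weight: f32, // e.g. the authority weight of the claim's source class
    pub timestamp: u64,
}

/// Resolution lens: last writer wins.
pub fn recency(claims: &[WeightedClaim]) -> Option<&WeightedClaim> {
    claims.iter().max_by_key(|c| c.timestamp)
}

/// Output of the analysis lens: nothing is discarded.
#[derive(Debug)]
pub struct SkepticReport {
    /// Fraction of total weight behind each object value.
    pub shares: BTreeMap<String, f32>,
    /// Assumed definition: 0.0 when one value holds all the weight,
    /// approaching 1.0 as weight spreads across many values.
    pub conflict: f32,
}

/// Analysis lens: report the shape of disagreement instead of a winner.
pub fn skeptic(claims: &[WeightedClaim]) -> SkepticReport {
    let total: f32 = claims.iter().map(|c| c.weight).sum();
    let mut shares: BTreeMap<String, f32> = BTreeMap::new();
    if total > 0.0 {
        for c in claims {
            *shares.entry(c.object.clone()).or_insert(0.0) += c.weight / total;
        }
    }
    let top = shares.values().cloned().fold(0.0_f32, f32::max);
    let conflict = if shares.is_empty() { 0.0 } else { 1.0 - top };
    SkepticReport { shares, conflict }
}
```

Both functions read the same immutable claims. Only the lens differs, which is why different readers can get different answers from one store.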

Key Capabilities

Time-Travel Queries

"What was the known risk profile when I started Semaglutide in June 2023?"

GET /query?subject=semaglutide&predicate=side_effects&as_of=1687000000

The append-only DAG preserves every historical state. Time travel is a hash lookup, not a reconstruction.
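
How an `as_of` timestamp maps to a historical state is not spelled out here. One possible shape, sketched below purely as an assumption, is an ordered index from commit timestamp to a state root hash. The query then finds the latest root at or before `as_of` and reads from that root directly. `SnapshotIndex` and `StateRoot` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Hypothetical content hash identifying one historical state of the DAG.
pub type StateRoot = [u8; 32];

/// Ordered index from commit timestamp (seconds) to the state root at that time.
#[derive(Default)]
pub struct SnapshotIndex {
    roots: BTreeMap<u64, StateRoot>,
}

impl SnapshotIndex {
    /// Record the root produced by a commit. Existing roots are never rewritten
    /// in an append-only store; this sketch does not enforce that.
    pub fn record(&mut self, timestamp: u64, root: StateRoot) {
        self.roots.insert(timestamp, root);
    }

    /// Latest root at or before `timestamp`, or None if nothing existed yet.
    pub fn as_of(&self, timestamp: u64) -> Option<&StateRoot> {
        self.roots
            .range(..=timestamp)
            .next_back()
            .map(|(_, root)| root)
    }
}
```

Under this assumption, the June 2023 query above resolves to whichever root was current at `1687000000`, without replaying later writes.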

Semantic Decay

Confidence decays based on source class. Old Reddit posts fade; regulatory filings persist:

GET /query?subject=semaglutide&predicate=efficacy&source_class_decay=true
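
A half-life implies exponential decay: confidence halves every time one half-life elapses. The sketch below applies the half-lives from the Source Class Hierarchy table, approximating a year as 365 days and a month as 30 days. The function names are hypothetical, and whether the real engine also multiplies in the authority weight is not assumed here.

```rust
/// Half-life in days per source tier (0 = Regulatory ... 5 = Anecdotal).
/// `None` means the tier never decays. Unknown tiers are treated as
/// Anecdotal, which is an assumption of this sketch.
pub fn half_life_days(tier: u8) -> Option<f32> {
    match tier {
        0 => None,
        1 => Some(730.0), // 2 years
        2 => Some(365.0), // 1 year
        3 => Some(180.0), // 6 months
        4 => Some(90.0),  // 3 months
        _ => Some(30.0),  // 30 days
    }
}

/// confidence * 0.5^(age / half_life), or unchanged for non-decaying tiers.
pub fn decayed_confidence(confidence: f32, tier: u8, age_days: f32) -> f32 {
    match half_life_days(tier) {
        None => confidence,
        Some(half_life) => confidence * 0.5_f32.powf(age_days / half_life),
    }
}
```

For example, a Tier-5 claim asserted at confidence 0.8 reads as roughly 0.4 after 30 days, while a Tier-0 claim at 0.8 still reads as 0.8 decades later. The original assertion is never modified. Decay is applied at read time.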

Conflict Analysis

Instead of "here is the answer," show "here is the shape of disagreement":

GET /skeptic?subject=semaglutide&predicate=gastroparesis_risk

Returns: which tiers agree, which disagree, and which emerging signals still lack clinical evidence.

Query Audit Trail

Every query is logged with full provenance. "Why did you believe that?" is answerable:

GET /audit/query/{query_id}

The Ballot Box: High-Velocity Consensus

To avoid write contention on assertions, agents vote separately:

struct Vote {
    assertion_hash: Hash,
    agent_id: PublicKey,
    weight: f32,
    signature: Signature,
}

Votes are append-only. A background Materializer aggregates votes to update O(1) read views.
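
The split between an append-only vote log and a materialized tally can be sketched as follows. This is a single-threaded illustration with hypothetical names (`VoteRecord`, `TallyMaterializer`). The real `Vote` also carries `agent_id` and `signature`, and signature verification is omitted here.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the assertion content hash.
pub type AssertionHash = [u8; 32];

/// Simplified vote: only the fields the tally needs.
pub struct VoteRecord {
    pub assertion_hash: AssertionHash,
    pub weight: f32,
}

/// Append-only log plus a read view that is updated in batches.
#[derive(Default)]
pub struct TallyMaterializer {
    log: Vec<VoteRecord>,                  // writers only ever push here
    cursor: usize,                         // how much of the log is already folded in
    tallies: HashMap<AssertionHash, f32>,  // O(1) read view
}

impl TallyMaterializer {
    /// Write path: append the vote without touching the assertion or the view.
    pub fn append(&mut self, vote: VoteRecord) {
        self.log.push(vote);
    }

    /// Background step: fold votes appended since the last run into the view.
    /// Returns how many votes were folded.
    pub fn materialize(&mut self) -> usize {
        let folded = self.log.len() - self.cursor;
        for vote in &self.log[self.cursor..] {
            *self.tallies.entry(vote.assertion_hash).or_insert(0.0) += vote.weight;
        }
        self.cursor = self.log.len();
        folded
    }

    /// Read path: a single hash-map lookup.
    pub fn tally(&self, hash: &AssertionHash) -> f32 {
        self.tallies.get(hash).copied().unwrap_or(0.0)
    }
}
```

Reads may lag the log until the next `materialize` run. That is the trade being made: writers never contend on the assertion itself.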

Trust Packs: The Curator Economy

Users subscribe to "Trust Packs" (curated lists of trusted agents) to filter reality:

  • "The Skeptical Cardio Pack" filters out low-quality cardiac studies
  • "Mayo Clinic Curated" only shows assertions from verified Mayo researchers

Trust Packs are BitSet overlays that filter the Consensus Lens efficiently.
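
A bit-set overlay can be sketched in a few lines, assuming each agent is assigned a dense integer index. That mapping, along with the `TrustPack` and `visible` names, is hypothetical. The sketch filters assertions by signer before any lens runs, which is one plausible place to apply the overlay.

```rust
/// One bit per agent index: 1 means the pack trusts that agent.
#[derive(Default)]
pub struct TrustPack {
    bits: Vec<u64>,
}

impl TrustPack {
    /// Add an agent to the pack, growing the bit set if needed.
    pub fn trust(&mut self, agent: usize) {
        let word = agent / 64;
        if word >= self.bits.len() {
            self.bits.resize(word + 1, 0);
        }
        self.bits[word] |= 1u64 << (agent % 64);
    }

    /// Membership test: one index computation and one bit test.
    pub fn contains(&self, agent: usize) -> bool {
        self.bits
            .get(agent / 64)
            .map_or(false, |&word| (word >> (agent % 64)) & 1 == 1)
    }
}

/// Keep only assertions signed by agents in the pack.
/// Each input pair is (agent index, assertion label).
pub fn visible<'a>(pack: &TrustPack, assertions: &[(usize, &'a str)]) -> Vec<&'a str> {
    assertions
        .iter()
        .filter(|(agent, _)| pack.contains(*agent))
        .map(|(_, label)| *label)
        .collect()
}
```

Because membership is a bit test, subscribing to a different pack changes which claims a reader sees without rewriting any stored assertion.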

The Meter: Economics of Reasoning

Deep Research is computationally expensive. Episteme enforces token-bucket quotas:

  • Assert: 10 tokens
  • Vote: 1 token
  • Query: 5 + lens complexity tokens
  • Default: 10,000 tokens/agent/hour
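
A token bucket with these costs can be sketched as below. Whether quotas refill continuously or reset each hour is not stated above. This sketch assumes continuous refill at capacity per hour, and takes the current time as an argument so the behavior is deterministic. All names are hypothetical.

```rust
/// Costs from the list above.
pub const ASSERT_COST: f64 = 10.0;
pub const VOTE_COST: f64 = 1.0;
pub const QUERY_BASE_COST: f64 = 5.0;

/// Query cost: base plus a lens-dependent complexity term.
pub fn query_cost(lens_complexity: f64) -> f64 {
    QUERY_BASE_COST + lens_complexity
}

/// Per-agent bucket that starts full and refills continuously.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill_secs: f64,
}

impl TokenBucket {
    /// A bucket holding `capacity` tokens that fully refills in one hour.
    pub fn per_hour(capacity: f64, now_secs: f64) -> Self {
        TokenBucket {
            capacity,
            tokens: capacity,
            refill_per_sec: capacity / 3600.0,
            last_refill_secs: now_secs,
        }
    }

    /// Refill for the elapsed time, then spend `cost` if enough tokens remain.
    /// Returns false, leaving the balance untouched, when the agent is over quota.
    pub fn try_spend(&mut self, cost: f64, now_secs: f64) -> bool {
        // Ignore clocks that run backwards rather than refilling twice.
        let elapsed = (now_secs - self.last_refill_secs).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill_secs = self.last_refill_secs.max(now_secs);
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}
```

With the default of 10,000 tokens per hour, an agent can make 1,000 assertions back to back before being refused, then regains roughly 2.8 tokens per second.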

Architecture: The Biological Stack

| Layer | Crate | Role |
|---|---|---|
| The Spine | stemedb-wal | Append-only WAL for durability |
| The Lattice | stemedb-storage | KV store, indexes, vector/visual indices |
| The Cortex | stemedb-query, stemedb-lens | Query engine, Lenses, Materializer |
| The Surface | stemedb-api | HTTP API with OpenAPI docs |

The biological metaphor:

  • Spine: Raw persistence. Never loses a claim.
  • Lattice: Connectivity. O(1) lookups via compound indexes.
  • Cortex: Reasoning. Collapse probability into answers.
  • Surface: Access. Expose the stack over the HTTP API.

Future Vision

Forking Reality (Planned)

Copy-on-Write branching over Sparse Merkle Trees lets agents simulate futures without polluting the main branch.

The Super Curator (Planned)

A specialized swarm of reviewer agents that audits high-variance facts and escalates emerging signals.

The Simulator (Planned)

A pipeline that converts high-confidence failure logs into synthetic training trajectories.

The Git Analogy

| Git Concept | Episteme Equivalent |
|---|---|
| Commit | Assertion (immutable, content-addressed) |
| Branch | Epoch (paradigm context) |
| Merge | Lens resolution |
| Revert | Epoch supersession cascade |
| Blame | Signature/agent audit trail |
| History | Append-only DAG preserved forever |

When to Use Episteme

Use Episteme when:

  • Multiple sources report conflicting information
  • You need to weight sources by authority, not just timestamp
  • You need to surface disagreement, not hide it
  • Guidance changes and you need to notify prior consumers
  • You need to audit "why did you believe that?"
  • You need historical snapshots ("what was true on this date?")

Use Postgres when:

  • You have a single source of truth
  • Data never conflicts
  • Temporal validity doesn't matter
  • Consensus has already been reached by humans

For everything else: Episteme is the database.