# Episteme: The Probabilistic Knowledge Lattice

> **Internal Codename:** StemeDB
> **Category:** Infrastructure / Database
> **Role:** The Cortex (Reasoning & Truth)

## 1. The Manifesto: "A Marketplace of Truth"

We are building the shared, long-term memory for autonomous research agents.

Current databases (Postgres, Neo4j, vector DBs) suffer from **The Tower of Babel** problem: they store *Data*, not *Evidence*. They are deterministic, stateless, and brittle.

**Episteme** rejects the idea of a single, static "database state." Instead, it models knowledge as a **Probabilistic Marketplace**.

* **Democracy:** Truth is established via high-velocity consensus (Voting), not just overwrite privileges.
* **Federation:** "Waze for Deep Research." Free users contribute to a Global Lattice; Paid users get private silos.
* **Economics:** Reasoning has a cost. The system enforces efficiency via "The Meter."
* **Curation:** We are not the Ministry of Truth. We are the App Store for Trust. Users subscribe to "Trust Packs" (e.g., "Mayo Clinic", "Rust Experts") to filter reality.

## 2. The Core Data Model: The Hyper-Edge

The atomic unit of Episteme is not a Row, Document, or Embedding. It is the **Signed Assertion**.

```rust
struct Assertion {
    // The Proposition (The "What")
    subject: EntityId,      // e.g., "Tesla_Inc"
    predicate: RelationId,  // e.g., "has_annual_revenue"
    object: Value,          // e.g., "$96.7B"

    // The Meta-Cognition (The "Why")
    confidence: f32,            // 0.0 to 1.0 (Agent's subjective certainty)
    source_hash: Hash,          // Content-addressed link to source (PDF, URL, Log)
    visual_hash: Option<Hash>,  // pHash for visual anchoring against web drift
    agent_id: PublicKey,        // Who made this claim? (Cryptographic multi-sig)
    timestamp: u64,             // When?

    // The Semantic Vector (The "Meaning")
    vector: Vec<f32>,  // Embedding for semantic navigation

    // The Paradigm (The "Context")
    epoch: Option<String>,  // "covid-guidelines-2020", "gaap-2024"
}
```

## 3. The Query Engine: "Truth Lenses"

Reading is a compute-heavy operation.
You must apply a **Lens** to collapse the probabilistic field into a concrete answer. To ensure sub-millisecond latency, Episteme uses **Materialized Views** to pre-calculate the results of standard lenses.

### Standard Lenses

1. **Lens::Consensus:** Returns the value with the highest cluster density (Weighted by Trust Pack).
2. **Lens::Authority:** Filters by subscribed **Trust Packs** (e.g., "Show me reality according to The Financial Times").
3. **Lens::Recency:** Returns the latest assertion, ignoring history.
4. **Lens::EpochAware:** Validates assertions against the *current* paradigm, filtering superseded epochs.
5. **Lens::Skeptic:** Returns the *variance* between claims (identifies high-conflict/unstable truth).

## 4. Features for the Agentive Team

### 4.1. "Forking Reality" (Branching)

Agents need to simulate futures without polluting the main branch. Episteme supports **Copy-on-Write Branching** via Sparse Merkle Trees.

### 4.2. The Ballot Box: High-Velocity Consensus

To avoid write contention, Episteme separates the "Candidate" (Assertion) from the "Votes" (Signatures).

* Agents write **Votes** to a high-speed append-only log ("The Ballot Box").
* A background process aggregates these votes to update the Materialized View.

### 4.3. The Hive: Learning & Trust

* **Trust Packs (The Curator Economy):** Users can publish and subscribe to Lists of Trusted Agents.
    * *Example:* "The Skeptical Cardio Pack" filters out low-quality studies.
    * *Mechanism:* A BitSet overlay that filters the Consensus Lens efficiently.
* **The Simulator (Mid-Training):** A pipeline that converts high-confidence failure logs into **Synthetic Trajectories**.
* **The Super Curator (Judicial Branch):** A specialized swarm of "Reviewer Agents" that audits high-variance facts.

### 4.4. The Meter: Economics of Reasoning

Deep Research is computationally expensive. Episteme enforces **Temporal Advantage Normalization (TAN)**.

* **Budgeting:** Every Job carries a budget (tokens/dollars).
* **Throttling:** The system rejects "Fork Reality" requests if the projected cost exceeds the Value of Information.

## 5. Architecture: The Rust Stack

Episteme follows **"Defensive by Default"** practices.

### Tier 1: The Spine (Durability)

* **Component:** `episteme-wal` (Quarantine Pattern)
* **Role:** Raw, serialized append-only log. Ensures we never lose a claim.

### Tier 2: The Lattice (Graph/Index)

* **Component:** `episteme-core` (Hot/Warm memory)
* **Warm Tier:** `sled` (embedded key-value store) for the Merkle DAG + `hnsw` for vector search.
* **Ballot Box:** High-velocity stream for vote ingestion.

### Tier 3: The Cortex (Compute)

* **Component:** `episteme-lens`
* **Role:** The WASM runtime for executing Lenses and resolving probabilistic state.
* **Materializer:** Background worker maintaining O(1) read views.

## 6. The Ecosystem Triad

| System | Biological Analogy | Function | Question Answered |
| :--- | :--- | :--- | :--- |
| **LogDB** | **The Spine** | Immutable Event Log | "What happened?" |
| **AssociativeDB** | **The Hippocampus** | Associative Memory | "What is this like?" |
| **Episteme** | **The Cortex** | Structured Reasoning | "Is this true?" |
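
## Appendix: A Sketch of the Standard Lenses

To make the Standard Lenses from section 3 more concrete, here is a minimal sketch of how three of them (Consensus, Recency, Skeptic) might collapse a set of competing claims about one (subject, predicate) pair. Everything in it is an assumption made for illustration and not the actual `episteme-lens` API: the `Claim` struct is a stripped-down stand-in for `Assertion`, agents are identified by a plain `u32`, values are integers, a Trust Pack is modeled as a `HashSet` of agent ids in place of a BitSet overlay, and "cluster density" is approximated by summing confidence over exactly equal values.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified stand-in for `Assertion`: one agent's claim
/// about a single (subject, predicate) pair.
#[derive(Debug, Clone, PartialEq)]
pub struct Claim {
    pub agent_id: u32,   // stand-in for the agent's PublicKey
    pub value: i64,      // e.g., annual revenue in millions of USD
    pub confidence: f32, // 0.0 to 1.0
    pub timestamp: u64,
}

/// Recency lens: the claim with the greatest timestamp, ignoring history.
pub fn recency(claims: &[Claim]) -> Option<&Claim> {
    claims.iter().max_by_key(|c| c.timestamp)
}

/// Consensus lens (assumed form): sum confidence per distinct value,
/// counting only agents in the trust pack, and return the heaviest value.
/// Ties on weight are broken by the larger value to stay deterministic.
pub fn consensus(claims: &[Claim], trust_pack: &HashSet<u32>) -> Option<i64> {
    let mut weight: HashMap<i64, f32> = HashMap::new();
    for c in claims.iter().filter(|c| trust_pack.contains(&c.agent_id)) {
        *weight.entry(c.value).or_insert(0.0) += c.confidence;
    }
    weight
        .into_iter()
        .max_by(|a, b| a.1.total_cmp(&b.1).then(a.0.cmp(&b.0)))
        .map(|(value, _)| value)
}

/// Skeptic lens (assumed form): population variance of the claimed values.
/// A large result flags a high-conflict, unstable fact.
pub fn skeptic(claims: &[Claim]) -> Option<f64> {
    if claims.is_empty() {
        return None;
    }
    let n = claims.len() as f64;
    let mean = claims.iter().map(|c| c.value as f64).sum::<f64>() / n;
    let sum_sq: f64 = claims
        .iter()
        .map(|c| (c.value as f64 - mean).powi(2))
        .sum();
    Some(sum_sq / n)
}
```

Note that the same claims can yield different answers depending on which Trust Pack is applied, which is the behavior the Consensus and Authority lenses describe. A real implementation would presumably cluster nearby values and read from the Materialized View instead of scanning claims on every query.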