This commit adds the read path (Cortex) to complement the write path (Spine):

## Crates
- stemedb-api: HTTP API with axum + utoipa OpenAPI
  - /v1/assert, /v1/query, /v1/epoch, /v1/skeptic, /v1/trace, /v1/audit
  - Metered endpoints with quota enforcement
  - Ed25519 signature verification
- stemedb-lens: Truth resolution lenses
  - RecencyLens, ConsensusLens, ConfidenceLens
  - VoteAwareConsensusLens (Ballot Box pattern)
  - TrustAwareAuthorityLens (The Hive pattern)
  - SkepticLens (conflict analysis)
  - EpochAwareLens (paradigm-safe queries)
- stemedb-query: Query engine with materialized views

## Storage Extensions
- VoteStore: Vote aggregation with cached counts
- TrustRankStore: Agent reputation with decay
- AuditStore: Query audit trail
- IndexStore: SP/P/S index structures
- SupersessionStore: Epoch supersession chains

## SDKs
- sdk/go/steme: Go HTTP client with Ed25519 signing
- sdk/go/adk: ADK-Go tools for AI agents

## Documentation
- Updated CLAUDE.md, architecture.md, roadmap.md
- New ai-lookup entries for all services
- Use case docs for consumer health intelligence
- Arena roadmap for simulation advancement

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Episteme: The Probabilistic Knowledge Lattice
Internal Codename: StemeDB
Category: Infrastructure / Database
Role: The Cortex (Reasoning & Truth)
1. The Manifesto: "A Marketplace of Truth"
We are building the shared, long-term memory for autonomous research agents.
Current databases (Postgres, Neo4j, Vector DBs) suffer from The Tower of Babel problem: they store Data, not Evidence. They are deterministic, stateless, and brittle.
Episteme rejects the idea of a single, static "database state." Instead, it models knowledge as a Probabilistic Marketplace.
- Democracy: Truth is established via high-velocity consensus (Voting), not just overwrite privileges.
- Federation: "Waze for Deep Research." Free users contribute to a Global Lattice; Paid users get private silos.
- Economics: Reasoning has a cost. The system enforces efficiency via "The Meter."
- Curation: We are not the Ministry of Truth. We are the App Store for Trust. Users subscribe to "Trust Packs" (e.g., "Mayo Clinic", "Rust Experts") to filter reality.
2. The Core Data Model: The Hyper-Edge
The atomic unit of Episteme is not a Row, Document, or Embedding. It is the Signed Assertion.
    struct Assertion {
        // The Proposition (The "What")
        subject: EntityId,         // e.g., "Tesla_Inc"
        predicate: RelationId,     // e.g., "has_annual_revenue"
        object: Value,             // e.g., "$96.7B"

        // The Meta-Cognition (The "Why")
        confidence: f32,           // 0.0 to 1.0 (Agent's subjective certainty)
        source_hash: Hash,         // Content-addressed link to source (PDF, URL, Log)
        visual_hash: Option<Hash>, // pHash for visual anchoring against web drift
        agent_id: PublicKey,       // Who made this claim? (Cryptographic multi-sig)
        timestamp: u64,            // When?

        // The Semantic Vector (The "Meaning")
        vector: Vec<f32>,          // Embedding for semantic navigation

        // The Paradigm (The "Context")
        epoch: Option<EpochId>,    // "covid-guidelines-2020", "gaap-2024"
    }
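The struct above references several types the document does not define. The following is a minimal sketch that makes it compile, assuming simple stand-ins (`String` identifiers, 32-byte arrays for hashes and keys); these aliases, the `validate_confidence` helper, and the `example_assertion` constructor are illustrative assumptions, not the project's actual definitions.

```rust
// Hypothetical stand-ins: the document names these types but does not define them.
type EntityId = String;
type RelationId = String;
type Value = String;
type Hash = [u8; 32];
type PublicKey = [u8; 32];
type EpochId = String;

#[derive(Debug, Clone, PartialEq)]
struct Assertion {
    subject: EntityId,
    predicate: RelationId,
    object: Value,
    confidence: f32,
    source_hash: Hash,
    visual_hash: Option<Hash>,
    agent_id: PublicKey,
    timestamp: u64,
    vector: Vec<f32>,
    epoch: Option<EpochId>,
}

/// Rejects confidences outside the documented 0.0..=1.0 range (NaN included).
fn validate_confidence(a: &Assertion) -> Result<(), String> {
    if (0.0..=1.0).contains(&a.confidence) {
        Ok(())
    } else {
        Err(format!("confidence {} is outside 0.0..=1.0", a.confidence))
    }
}

/// Builds the Tesla example from the field comments.
/// The confidence, hashes, key, and timestamp are placeholders.
fn example_assertion() -> Assertion {
    Assertion {
        subject: "Tesla_Inc".to_string(),
        predicate: "has_annual_revenue".to_string(),
        object: "$96.7B".to_string(),
        confidence: 0.9,
        source_hash: [0u8; 32],
        visual_hash: None,
        agent_id: [0u8; 32],
        timestamp: 0,
        vector: Vec::new(),
        epoch: Some("gaap-2024".to_string()),
    }
}
```

Validating `confidence` at the boundary keeps out-of-range or NaN values from reaching any downstream weighting logic.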
3. The Query Engine: "Truth Lenses"
Reading is a compute-heavy operation. You must apply a Lens to collapse the probabilistic field into a concrete answer. To ensure sub-millisecond latency, Episteme uses Materialized Views to pre-calculate the results of standard lenses.
Standard Lenses
- Lens::Consensus: Returns the value with the highest cluster density (Weighted by Trust Pack).
- Lens::Authority: Filters by subscribed Trust Packs (e.g., "Show me reality according to The Financial Times").
- Lens::Recency: Returns the latest assertion, ignoring history.
- Lens::EpochAware: Validates assertions against the current paradigm, filtering superseded epochs.
- Lens::Skeptic: Returns the variance between claims (identifies high-conflict/unstable truth).
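To make the lens idea concrete, here is a rough sketch of three of the lenses above as plain functions over a slice of claims. It assumes a simplified, hypothetical `Claim` type with an integer value and a precomputed `trust_weight`; exact-value grouping stands in for the cluster-density calculation, and population variance stands in for the Skeptic metric. None of these names come from the actual crates.

```rust
use std::collections::BTreeMap;

// Hypothetical simplified claim: an integer value (e.g. revenue in millions),
// a timestamp, and a weight derived from the reader's Trust Packs.
#[derive(Debug, Clone, PartialEq)]
struct Claim {
    value: i64,
    timestamp: u64,
    trust_weight: f64,
}

/// Recency: the value of the claim with the largest timestamp.
fn recency(claims: &[Claim]) -> Option<i64> {
    claims.iter().max_by_key(|c| c.timestamp).map(|c| c.value)
}

/// Consensus: the value whose supporting claims carry the most total weight.
/// Grouping by exact value is a stand-in for real clustering.
fn consensus(claims: &[Claim]) -> Option<i64> {
    let mut weight_by_value: BTreeMap<i64, f64> = BTreeMap::new();
    for c in claims {
        *weight_by_value.entry(c.value).or_insert(0.0) += c.trust_weight;
    }
    weight_by_value
        .into_iter()
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(value, _)| value)
}

/// Skeptic: population variance of the claimed values; None when there are no claims.
fn skeptic(claims: &[Claim]) -> Option<f64> {
    if claims.is_empty() {
        return None;
    }
    let n = claims.len() as f64;
    let mean = claims.iter().map(|c| c.value as f64).sum::<f64>() / n;
    let variance = claims
        .iter()
        .map(|c| (c.value as f64 - mean).powi(2))
        .sum::<f64>()
        / n;
    Some(variance)
}
```

The same set of claims can yield different answers per lens: two older claims that agree can win under `consensus` while a newer dissenting claim wins under `recency`, and `skeptic` reports how far apart they are.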
4. Features for the Agentive Team
4.1. "Forking Reality" (Branching)
Agents need to simulate futures without polluting the main branch. Episteme supports Copy-on-Write Branching via Sparse Merkle Trees.
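The isolation property of branching can be illustrated with a much simpler structure than a Sparse Merkle Tree. The sketch below is an assumption-laden stand-in: a hypothetical `Branch` type that keeps its own writes in an overlay map and falls through to a shared, immutable parent for reads. It shows only that a fork never mutates its parent; it does no hashing and produces no proofs.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical branch: local writes live in `overlay`; everything else is
// read from the shared parent, which is never modified by the fork.
#[derive(Debug, Default)]
struct Branch {
    parent: Option<Arc<Branch>>,
    overlay: HashMap<String, String>,
}

impl Branch {
    /// Creates a child branch that shares the parent's data without copying it.
    fn fork(parent: &Arc<Branch>) -> Branch {
        Branch {
            parent: Some(Arc::clone(parent)),
            overlay: HashMap::new(),
        }
    }

    /// Writes only to this branch's overlay.
    fn set(&mut self, key: &str, value: &str) {
        self.overlay.insert(key.to_string(), value.to_string());
    }

    /// Reads from the overlay first, then walks up the parent chain.
    fn get(&self, key: &str) -> Option<&str> {
        match self.overlay.get(key) {
            Some(v) => Some(v.as_str()),
            None => self.parent.as_ref().and_then(|p| p.get(key)),
        }
    }
}
```

Forking costs one `Arc` clone regardless of how much data the parent holds, which is the property that makes speculative branches cheap to create and discard.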
4.2. The Ballot Box: High-Velocity Consensus
To avoid write contention, Episteme separates the "Candidate" (Assertion) from the "Votes" (Signatures).
- Agents write Votes to a high-speed append-only log ("The Ballot Box").
- A background process aggregates these votes to update the Materialized View.
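The split between the append-only vote log and the aggregated view can be sketched as follows. This is a single-threaded illustration under stated assumptions: `Vote`, `BallotBox`, and `MaterializedTally` are hypothetical names, votes are simple approve/reject flags rather than signatures, and the "background process" is modeled as an explicit `catch_up` call that consumes only the votes appended since its last run.

```rust
use std::collections::HashMap;

// Hypothetical vote record; a real vote would carry a signature.
#[derive(Debug, Clone)]
struct Vote {
    assertion_id: u64,
    agent_id: u32,
    approve: bool,
}

/// Append-only log: writers only push, they never touch the tally.
#[derive(Default)]
struct BallotBox {
    log: Vec<Vote>,
}

impl BallotBox {
    fn cast(&mut self, vote: Vote) {
        self.log.push(vote);
    }
}

/// The aggregated view plus a cursor marking how much of the log it has consumed.
#[derive(Default)]
struct MaterializedTally {
    cursor: usize,
    net_votes: HashMap<u64, i64>,
}

impl MaterializedTally {
    /// Folds in votes appended since the last call; returns how many were processed.
    fn catch_up(&mut self, ballot_box: &BallotBox) -> usize {
        let new_votes = &ballot_box.log[self.cursor..];
        for v in new_votes {
            let delta = if v.approve { 1 } else { -1 };
            *self.net_votes.entry(v.assertion_id).or_insert(0) += delta;
        }
        self.cursor = ballot_box.log.len();
        new_votes.len()
    }

    /// Constant-time read of the cached net count.
    fn net(&self, assertion_id: u64) -> i64 {
        self.net_votes.get(&assertion_id).copied().unwrap_or(0)
    }
}
```

Readers see a tally that may lag the log slightly; in exchange, casting a vote is a plain append that never contends with reads.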
4.3. The Hive: Learning & Trust
- Trust Packs (The Curator Economy): Users can publish and subscribe to Lists of Trusted Agents.
  - Example: "The Skeptical Cardio Pack" filters out low-quality studies.
  - Mechanism: A BitSet overlay that filters the Consensus Lens efficiently.
- The Simulator (Mid-Training): A pipeline that converts high-confidence failure logs into Synthetic Trajectories.
- The Super Curator (Judicial Branch): A specialized swarm of "Reviewer Agents" that audits high-variance facts.
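The BitSet overlay mentioned for Trust Packs can be sketched with a plain `Vec<u64>`. This assumes each agent has been assigned a dense integer index, which the document does not specify; `TrustPack` and `filter_trusted` are hypothetical names used only for illustration.

```rust
// Hypothetical Trust Pack: bit i is set when the agent with dense index i is trusted.
struct TrustPack {
    bits: Vec<u64>,
}

impl TrustPack {
    /// Builds a pack from a list of trusted agent indices.
    fn from_agents(agents: &[usize]) -> Self {
        let mut bits: Vec<u64> = Vec::new();
        for &agent in agents {
            let word = agent / 64;
            if word >= bits.len() {
                bits.resize(word + 1, 0);
            }
            bits[word] |= 1u64 << (agent % 64);
        }
        TrustPack { bits }
    }

    /// Membership test: one index computation and one mask, no hashing.
    fn contains(&self, agent: usize) -> bool {
        match self.bits.get(agent / 64) {
            Some(word) => (word & (1u64 << (agent % 64))) != 0,
            None => false,
        }
    }
}

/// Keeps only the claims whose author is in the pack.
/// Each claim is an (agent index, claimed value) pair.
fn filter_trusted<'a>(pack: &TrustPack, claims: &[(usize, &'a str)]) -> Vec<&'a str> {
    claims
        .iter()
        .filter(|(agent, _)| pack.contains(*agent))
        .map(|(_, value)| *value)
        .collect()
}
```

Because membership is a single bit test, the filter can run in front of a consensus calculation without changing its asymptotic cost.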
4.4. The Meter: Economics of Reasoning
Deep Research is computationally expensive. Episteme enforces Temporal Advantage Normalization (TAN).
- Budgeting: Every Job carries a budget (tokens/dollars).
- Throttling: The system rejects "Fork Reality" requests if the projected cost exceeds the Value of Information.
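The two rules above reduce to a small admission check. The sketch below covers only those rules and makes no attempt to model TAN itself, which the document does not define; it assumes cost, budget, and Value of Information are expressed in the same integer unit, and `MeterDecision` and `admit_fork` are hypothetical names.

```rust
// Hypothetical outcome of a metering check for a "Fork Reality" request.
#[derive(Debug, PartialEq)]
enum MeterDecision {
    Admit { remaining_budget: u64 },
    RejectLowValue,
    RejectOverBudget,
}

/// Applies the throttling rule first (cost must not exceed the value of
/// information), then the budgeting rule (cost must fit the job's budget).
fn admit_fork(remaining_budget: u64, projected_cost: u64, value_of_information: u64) -> MeterDecision {
    if projected_cost > value_of_information {
        return MeterDecision::RejectLowValue;
    }
    match remaining_budget.checked_sub(projected_cost) {
        Some(left) => MeterDecision::Admit { remaining_budget: left },
        None => MeterDecision::RejectOverBudget,
    }
}
```

Returning the remaining budget on admission lets the caller carry it forward to the job's next request.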
5. Architecture: The Rust Stack
Episteme follows "Defensive by Default" best practices.
Tier 1: The Spine (Durability)
- Component: episteme-wal (Quarantine Pattern)
- Role: Raw, serialized append-only log. Ensures we never lose a claim.
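As an illustration of what a raw, serialized append-only log involves, here is a minimal in-memory sketch using length-prefixed records. The framing (a 4-byte little-endian length followed by the payload) and the function names are assumptions for illustration only; the actual episteme-wal format and its Quarantine Pattern are not described in this document.

```rust
/// Appends one record as [u32 little-endian length][payload bytes].
/// Payloads larger than u32::MAX bytes are not supported in this sketch.
fn append_record(log: &mut Vec<u8>, payload: &[u8]) {
    log.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    log.extend_from_slice(payload);
}

/// Replays every complete record in order. A truncated final record
/// (for example, from a crash mid-write) is skipped rather than returned.
fn replay(log: &[u8]) -> Vec<Vec<u8>> {
    let mut records = Vec::new();
    let mut pos = 0;
    while pos + 4 <= log.len() {
        let len = u32::from_le_bytes([log[pos], log[pos + 1], log[pos + 2], log[pos + 3]]) as usize;
        let start = pos + 4;
        let end = start + len;
        if end > log.len() {
            break; // incomplete tail: stop at the last fully written record
        }
        records.push(log[start..end].to_vec());
        pos = end;
    }
    records
}
```

A durable version would write to a file and sync before acknowledging, and would normally add a per-record checksum so corruption can be told apart from truncation.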
Tier 2: The Lattice (Graph/Index)
- Component: episteme-core (Hot/Warm memory)
- Warm Tier: sled (log-structured embedded store) for the Merkle DAG + hnsw for vector search.
- Ballot Box: High-velocity stream for vote ingestion.
Tier 3: The Cortex (Compute)
- Component: episteme-lens
- Role: The WASM runtime for executing Lenses and resolving probabilistic state.
- Materializer: Background worker maintaining O(1) read views.
6. The Ecosystem Triad
| System | Biological Analogy | Function | Question Answered |
|---|---|---|---|
| LogDB | The Spine | Immutable Event Log | "What happened?" |
| AssociativeDB | The Hippocampus | Associative Memory | "What is this like?" |
| Episteme | The Cortex | Structured Reasoning | "Is this true?" |