StemeDB Data Structures
Last Updated: 2026-01-31
Source: crates/stemedb-core/src/types.rs
This document describes the core data structures in StemeDB (Episteme). These types form the foundation of the "Git for Truth" knowledge graph.
Design Principles
- Append-Only: Data is never mutated. New assertions create new records.
- Content-Addressed: Every assertion's ID is a BLAKE3 hash of its content.
- Zero-Copy: Uses rkyv for serialization; data can be read directly from disk without parsing.
- Provenance-First: Every fact carries its source, signers, and confidence.
Primitive Types
pub type Hash = [u8; 32]; // BLAKE3 256-bit hash
pub type PHash = [u8; 8]; // Perceptual hash for images (8 bytes)
pub type EntityId = String; // Subject or object identifier
pub type RelationId = String; // Predicate identifier
pub type EpochId = Hash; // Paradigm/era identifier
pub type QueryId = Hash; // Query audit record identifier
The Assertion (Atomic Unit of Knowledge)
The Assertion is the fundamental unit. It represents a single claim about the world.
pub struct Assertion {
    // ═══════════════════════════════════════════════════════════
    // 1. THE FACT (What is being claimed)
    // ═══════════════════════════════════════════════════════════
    /// The entity this assertion is about (e.g., "Semaglutide", "Tesla_Inc")
    pub subject: EntityId,
    /// The relationship or property (e.g., "has_side_effect", "annual_revenue")
    pub predicate: RelationId,
    /// The claimed value
    pub object: ObjectValue,

    // ═══════════════════════════════════════════════════════════
    // 2. THE LINEAGE (Why we believe it)
    // ═══════════════════════════════════════════════════════════
    /// If this modifies/forks another assertion, its hash
    pub parent_hash: Option<Hash>,
    /// Hash of the source evidence (PDF, URL, database export)
    pub source_hash: Hash,
    /// Authority tier of the source (enables indexing and decay rates)
    pub source_class: SourceClass,
    /// Perceptual hash of a visual anchor (e.g., screenshot of table)
    pub visual_hash: Option<PHash>,
    /// Which paradigm/era this belongs to (for paradigm shifts)
    pub epoch: Option<EpochId>,
    /// Lifecycle stage (Proposed → Approved → Deprecated)
    pub lifecycle: LifecycleStage,

    // ═══════════════════════════════════════════════════════════
    // 3. META-COGNITION (Who said it, how sure are they)
    // ═══════════════════════════════════════════════════════════
    /// Cryptographic signatures from agents vouching for this
    pub signatures: Vec<SignatureEntry>,
    /// Subjective confidence score (0.0 to 1.0)
    pub confidence: f32,
    /// Unix timestamp when created
    pub timestamp: u64,
    /// Semantic embedding vector for similarity search
    pub vector: Option<Vec<f32>>,
}
ObjectValue
The value in a subject-predicate-object triple:
pub enum ObjectValue {
    Text(String),        // "muscle loss"
    Number(f64),         // 96.7
    Boolean(bool),       // true
    Reference(EntityId), // Points to another entity (graph edge)
}
LifecycleStage
Assertions progress through stages (as new assertions, not mutations):
Proposed → UnderReview → Approved
                ↘ Rejected
                             ↘ Deprecated
pub enum LifecycleStage {
    Proposed,    // Initial idea, not for production use
    UnderReview, // Gathering votes and feedback
    Approved,    // Accepted as current truth
    Deprecated,  // Was true, now superseded
    Rejected,    // Explicitly declined
}
SourceClass
Authority tier classification for sources. Enables indexing by tier and tier-based decay rates:
| Tier | Class | Example | Default Decay |
|---|---|---|---|
| 0 | Regulatory | FDA, EMA, WHO | Never |
| 1 | Clinical | Phase III trials, peer-reviewed RCTs | 2 years |
| 2 | Observational | Real-world evidence, cohort studies | 1 year |
| 3 | Expert | Medical professional opinions, guidelines | 6 months |
| 4 | Community | Curated forums, patient advocacy groups | 3 months |
| 5 | Anecdotal | Reddit posts, individual testimonials | 1 month |
pub enum SourceClass {
    Regulatory,    // Tier 0: Highest authority, never decays
    Clinical,      // Tier 1: Peer-reviewed research
    Observational, // Tier 2: Real-world evidence
    Expert,        // Tier 3: Professional opinions (default)
    Community,     // Tier 4: Curated community knowledge
    Anecdotal,     // Tier 5: Individual reports, fast decay
}

impl SourceClass {
    pub fn tier(&self) -> u8;                        // Returns 0-5
    pub fn default_decay_days(&self) -> Option<u32>; // None = never decays
    pub fn authority_weight(&self) -> f32;           // 1.0 for Regulatory, 0.1 for Anecdotal
}
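The signatures above are listed without bodies. The following is a minimal sketch of how tier() and default_decay_days() could be implemented so that they agree with the tier table. The day counts are assumptions (a year approximated as 365 days, a month as 30), and is_stale is a hypothetical helper that is not part of the documented API. authority_weight is left out because the source only gives its two endpoint values.

```rust
/// Authority tiers, mirroring the enum documented above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SourceClass {
    Regulatory,
    Clinical,
    Observational,
    Expert,
    Community,
    Anecdotal,
}

impl SourceClass {
    /// Tier number from the table (0 = highest authority).
    pub fn tier(&self) -> u8 {
        match self {
            SourceClass::Regulatory => 0,
            SourceClass::Clinical => 1,
            SourceClass::Observational => 2,
            SourceClass::Expert => 3,
            SourceClass::Community => 4,
            SourceClass::Anecdotal => 5,
        }
    }

    /// Default decay window in days; `None` means the class never decays.
    /// ASSUMPTION: years are counted as 365 days and months as 30 days.
    pub fn default_decay_days(&self) -> Option<u32> {
        match self {
            SourceClass::Regulatory => None,
            SourceClass::Clinical => Some(730),
            SourceClass::Observational => Some(365),
            SourceClass::Expert => Some(180),
            SourceClass::Community => Some(90),
            SourceClass::Anecdotal => Some(30),
        }
    }

    /// HYPOTHETICAL helper: true once an assertion is older than its decay window.
    pub fn is_stale(&self, age_days: u32) -> bool {
        match self.default_decay_days() {
            Some(limit) => age_days > limit,
            None => false,
        }
    }
}
```

Under these assumptions a Regulatory source is never reported as stale, while an Anecdotal one is stale after its 30-day window.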
Key Benefits:
- Indexing: The SC:{source_class} index enables "show me only regulatory sources"
- Decay rates: Anecdotal claims decay faster than clinical evidence
- Trust weighting: Lenses can weight sources by authority tier in conflict resolution
SignatureEntry
Cryptographic proof that an agent vouches for an assertion:
pub struct SignatureEntry {
    pub agent_id: [u8; 32],  // Ed25519 public key
    pub signature: [u8; 64], // Ed25519 signature over assertion content
    pub timestamp: u64,      // When the agent signed
}
The Vote (High-Velocity Consensus)
Votes are separated from assertions to enable thousands of agents to vote simultaneously without lock contention (the "Ballot Box" pattern).
pub struct Vote {
    /// Hash of the assertion being voted on
    pub assertion_hash: Hash,
    /// Ed25519 public key of the voter
    pub agent_id: [u8; 32],
    /// Weight of the vote (0.0 = reject, 1.0 = full endorsement)
    pub weight: f32,
    /// Signature over the assertion_hash
    pub signature: [u8; 64],
    /// When the vote was cast
    pub timestamp: u64,
}
Key Insight: Votes are append-only. An agent can change their vote by submitting a new one with a later timestamp.
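Because votes are never overwritten, a reader has to decide which of an agent's votes counts. Here is a minimal sketch of "latest vote wins" aggregation. It assumes all votes in the slice target the same assertion, uses a simplified Vote without the signature field, and breaks timestamp ties in favor of the vote seen first. The function name effective_weight is hypothetical, not the VoteStore API.

```rust
use std::collections::HashMap;

/// Simplified vote for illustration; the signature field is omitted.
#[derive(Debug, Clone, PartialEq)]
pub struct Vote {
    pub assertion_hash: [u8; 32],
    pub agent_id: [u8; 32],
    pub weight: f32,
    pub timestamp: u64,
}

/// HYPOTHETICAL helper: keep only each agent's most recent vote, then sum
/// the weights. Assumes every vote in `votes` targets the same assertion.
pub fn effective_weight(votes: &[Vote]) -> f32 {
    let mut latest: HashMap<[u8; 32], &Vote> = HashMap::new();
    for vote in votes {
        // ASSUMPTION: on equal timestamps, the vote seen first is kept.
        let is_newer = latest
            .get(&vote.agent_id)
            .map_or(true, |prev| vote.timestamp > prev.timestamp);
        if is_newer {
            latest.insert(vote.agent_id, vote);
        }
    }
    latest.values().map(|v| v.weight).sum()
}
```

An agent that endorses an assertion and later submits a weight of 0.0 contributes nothing to the total, even though both records remain in the log.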
The Epoch (Paradigm Shifts)
Epochs represent distinct periods of truth. When knowledge paradigms shift, old epochs can be superseded.
pub struct Epoch {
    pub id: EpochId,
    pub name: String,                // "Pre-2024", "Newtonian"
    pub supersedes: Option<EpochId>, // What this replaces
    pub supersession_type: Option<SupersessionType>,
    pub start_timestamp: u64,
    pub end_timestamp: Option<u64>,
}

pub enum SupersessionType {
    Invalidation, // Old epoch was factually wrong (e.g., "Earth is flat")
    Temporal,     // Old epoch was correct but outdated (e.g., "President is Obama")
    Refinement,   // Old epoch was a simplification (e.g., Newtonian → Relativity)
}
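Since each epoch points at the one it supersedes, the history of a paradigm forms a linked chain. Below is a minimal sketch of walking that chain, assuming epochs are held in an in-memory map keyed by ID. The function name supersession_chain and the trimmed-down Epoch struct are illustrative, not the SupersessionStore API.

```rust
use std::collections::{HashMap, HashSet};

pub type EpochId = [u8; 32];

/// Trimmed-down epoch: only the fields needed to follow the chain.
#[derive(Debug, Clone)]
pub struct Epoch {
    pub id: EpochId,
    pub name: String,
    pub supersedes: Option<EpochId>,
}

/// HYPOTHETICAL helper: follow `supersedes` links from `start` back to the
/// oldest epoch and return the names newest-first.
/// Stops early if an epoch is missing from the map or a cycle is detected.
pub fn supersession_chain(epochs: &HashMap<EpochId, Epoch>, start: EpochId) -> Vec<String> {
    let mut chain = Vec::new();
    let mut seen = HashSet::new();
    let mut current = Some(start);
    while let Some(id) = current {
        if !seen.insert(id) {
            break; // already visited: guard against malformed cyclic data
        }
        match epochs.get(&id) {
            Some(epoch) => {
                chain.push(epoch.name.clone());
                current = epoch.supersedes;
            }
            None => break,
        }
    }
    chain
}
```

The cycle guard is a defensive choice: nothing in the types prevents two epochs from naming each other, so the walk refuses to loop.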
Query Results
MaterializedView (O(1) Winner Lookup)
Pre-computed resolution stored at MV:{subject}:{predicate}:
pub struct MaterializedView {
    /// The winning assertion from lens resolution
    pub winner: Assertion,
    /// Which lens produced this (e.g., "VoteAwareConsensus")
    pub lens_name: String,
    /// Confidence in the resolution (0.0 to 1.0)
    pub resolution_confidence: f32,
    /// How many candidates were considered
    pub candidates_count: usize,
    /// When this view was computed
    pub materialized_at: u64,
}
ConflictAnalysis (Trust but Verify)
Returned by the SkepticLens, which surfaces all competing claims instead of picking a winner:
pub struct ConflictAnalysis {
    /// Overall status: Unanimous, Agreed, or Contested
    pub status: ResolutionStatus,
    /// Conflict score (0.0 = unanimous, 1.0 = maximum chaos)
    /// Calculated using normalized Shannon entropy
    pub conflict_score: f32,
    /// All distinct claims, ranked by weight_share descending
    pub claims: Vec<ClaimSummary>,
    /// Total candidates considered
    pub candidates_count: usize,
}

pub enum ResolutionStatus {
    Unanimous, // All agree (entropy < 0.1)
    Agreed,    // Strong majority (entropy < 0.4)
    Contested, // Significant disagreement (entropy >= 0.4)
}
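The conflict score is described as normalized Shannon entropy with thresholds at 0.1 and 0.4. The sketch below shows one way to compute it from the claims' weight shares. It assumes normalization by ln(n) for n distinct claims and a score of 0.0 when there are fewer than two claims. The function names conflict_score and classify are illustrative, not the SkepticLens implementation.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResolutionStatus {
    Unanimous,
    Agreed,
    Contested,
}

/// Normalized Shannon entropy of the weight shares, in [0, 1].
/// ASSUMPTION: entropy is divided by ln(n), where n is the number of
/// distinct claims; fewer than two claims yields 0.0.
pub fn conflict_score(weight_shares: &[f32]) -> f32 {
    let n = weight_shares.len();
    if n < 2 {
        return 0.0;
    }
    let total: f32 = weight_shares.iter().sum();
    if total <= 0.0 {
        return 0.0;
    }
    let entropy: f32 = weight_shares
        .iter()
        .map(|w| *w / total) // renormalize so the shares sum to 1
        .filter(|p| *p > 0.0) // 0 * ln(0) is treated as 0
        .map(|p| -p * p.ln())
        .sum();
    (entropy / (n as f32).ln()).clamp(0.0, 1.0)
}

/// Apply the documented thresholds (0.1 and 0.4) to a conflict score.
pub fn classify(score: f32) -> ResolutionStatus {
    if score < 0.1 {
        ResolutionStatus::Unanimous
    } else if score < 0.4 {
        ResolutionStatus::Agreed
    } else {
        ResolutionStatus::Contested
    }
}
```

With this normalization, an even split between two claims scores close to 1.0 and is classified as Contested, while a single claim scores 0.0 and is classified as Unanimous.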
ClaimSummary
A single competing claim within a ConflictAnalysis:
pub struct ClaimSummary {
    /// The claimed value
    pub value: ObjectValue,
    /// This claim's share of total support (0.0 to 1.0)
    pub weight_share: f32,
    /// Number of assertions making this claim
    pub assertion_count: u32,
    /// Hash of the highest-confidence assertion (for drill-down)
    pub representative_hash: Hash,
    /// Source provenance
    pub source: SourceSummary,
    /// Agents who signed assertions for this claim
    pub supporting_agents: Vec<AgentSummary>,
}
SourceSummary & AgentSummary
Provenance types for "show me the proof" UX:
pub struct SourceSummary {
    pub source_hash: Hash,          // Hash of source document
    pub visual_hash: Option<PHash>, // Visual anchor (screenshot)
}

pub struct AgentSummary {
    pub agent_id: [u8; 32], // Agent's public key
    pub trust_score: f32,   // Trust score at query time
}
Query Audit Trail
Every query is logged for "Why did you think that?" debugging:
pub struct QueryAudit {
    pub query_id: QueryId,
    pub agent_id: Option<[u8; 32]>, // Who queried (from X-Agent-Id header)
    pub timestamp: u64,
    pub params: QueryParams,
    pub result_hash: Option<Hash>,  // Winning assertion hash
    pub result_confidence: f32,
    pub contributing_assertions: Vec<ContributingAssertion>,
}

pub struct QueryParams {
    pub subject: Option<EntityId>,
    pub predicate: Option<RelationId>,
    pub lifecycle: Option<LifecycleStage>,
    pub epoch: Option<EpochId>,
    pub lens: Option<String>,
}

pub struct ContributingAssertion {
    pub assertion_hash: Hash,
    pub weight: f32, // How much this influenced the result
    pub source_hash: Hash,
    pub lifecycle: LifecycleStage,
}
Storage Layout
Key patterns in the KV store:
| Key Pattern | Value | Purpose |
|---|---|---|
| H:{hash} | Serialized Assertion | Primary assertion storage |
| S:{subject} | Vec<Hash> | Subject index |
| SP:{subject}:{predicate} | Vec<Hash> | Compound index (O(1) lookup) |
| MV:{subject}:{predicate} | MaterializedView | Pre-computed winner |
| V:{assertion_hash}:{vote_hash} | Vote | Individual votes |
| VC:{assertion_hash} | u64 | Vote count cache |
| VW:{assertion_hash} | f32 | Aggregate vote weight cache |
| TR:{agent_id} | TrustRank | Agent reputation |
| TP:{pack_id} | TrustPack | Curated agent lists |
| AUD:{query_id} | QueryAudit | Query audit record |
| E:{epoch_id} | Epoch | Epoch definitions |
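The key patterns above can be built with small formatting helpers. The following sketch assumes keys are UTF-8 strings and that binary components such as hashes are hex-encoded inside the key. The source does not state the key encoding, so treat both as assumptions. All function names are hypothetical.

```rust
/// Lowercase hex encoding for binary key components.
/// ASSUMPTION: hashes appear hex-encoded in keys; the real store may use raw bytes.
pub fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// `H:{hash}`: primary assertion storage.
pub fn assertion_key(hash: &[u8; 32]) -> String {
    format!("H:{}", to_hex(hash))
}

/// `SP:{subject}:{predicate}`: compound index.
pub fn sp_index_key(subject: &str, predicate: &str) -> String {
    format!("SP:{}:{}", subject, predicate)
}

/// `MV:{subject}:{predicate}`: pre-computed winner.
pub fn materialized_view_key(subject: &str, predicate: &str) -> String {
    format!("MV:{}:{}", subject, predicate)
}

/// `V:{assertion_hash}:{vote_hash}`: individual votes.
pub fn vote_key(assertion_hash: &[u8; 32], vote_hash: &[u8; 32]) -> String {
    format!("V:{}:{}", to_hex(assertion_hash), to_hex(vote_hash))
}

/// Prefix shared by every vote on one assertion, usable for a range scan.
pub fn vote_prefix(assertion_hash: &[u8; 32]) -> String {
    format!("V:{}:", to_hex(assertion_hash))
}
```

One consequence of a colon-delimited layout is that a subject or predicate containing ":" would make SP and MV keys ambiguous, so a real implementation would need to escape or reject that character.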
The Trust Pack (Curator Economy)
Trust Packs are the "App Store for Trust" - curated lists of trusted agents that filter consensus through domain expertise.
pub struct TrustPack {
    /// Content-addressed pack ID (BLAKE3 hash)
    pub id: PackId,
    /// Human-readable name (e.g., "Mayo_Clinic_Experts")
    pub name: String,
    /// Ed25519 public key of the pack maintainer
    pub maintainer: [u8; 32],
    /// Agent public keys in this pack
    /// Future: Replace with RoaringBitmap for O(1) membership
    pub agents: Vec<[u8; 32]>,
    /// Unix timestamp when pack was created
    pub created_at: u64,
    /// Unix timestamp of last modification
    pub updated_at: u64,
}
Key Methods:
- add_agent(agent_id) - Idempotent agent addition
- remove_agent(agent_id) - Safe removal
- contains_agent(agent_id) -> bool - Membership check
Use Case: Users subscribe to packs like "Skeptical Cardio Pack" to filter GLP-1 side effect claims through vetted cardiologists.
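The three key methods could look roughly like the sketch below, which uses a trimmed-down TrustPack holding only the name and agent list. The boolean return values on add_agent and remove_agent are assumptions added for illustration (the source gives a return type only for contains_agent), and the updated_at bookkeeping is omitted.

```rust
/// Trimmed-down pack: only the fields needed for membership operations.
pub struct TrustPack {
    pub name: String,
    pub agents: Vec<[u8; 32]>,
}

impl TrustPack {
    /// Membership check: a linear scan over the Vec.
    pub fn contains_agent(&self, agent_id: &[u8; 32]) -> bool {
        self.agents.contains(agent_id)
    }

    /// Idempotent addition: adding an existing agent changes nothing.
    /// ASSUMPTION: returns true only if the agent was newly added.
    pub fn add_agent(&mut self, agent_id: [u8; 32]) -> bool {
        if self.contains_agent(&agent_id) {
            return false;
        }
        self.agents.push(agent_id);
        true
    }

    /// Safe removal: removing an absent agent is a no-op.
    /// ASSUMPTION: returns true only if an agent was actually removed.
    pub fn remove_agent(&mut self, agent_id: &[u8; 32]) -> bool {
        let before = self.agents.len();
        self.agents.retain(|a| a != agent_id);
        self.agents.len() != before
    }
}
```

With a Vec-backed list every operation is a linear scan, which is the cost the "Future" note in the struct comment aims to remove.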
Serialization
All types use rkyv for zero-copy deserialization:
use stemedb_core::serde::{serialize, deserialize};
// Serialize
let bytes: Vec<u8> = serialize(&assertion)?;
// Deserialize (zero-copy when possible)
let assertion: Assertion = deserialize(&bytes)?;
Critical Rule: Never use raw AllocSerializer in production code. Always use stemedb_core::serde::{serialize, deserialize}.
Relationship Diagram
┌─────────────────────────────────────────────┐
│                  ASSERTION                  │
│ ┌─────────┐  ┌───────────┐  ┌─────────────┐ │
│ │ subject │  │ predicate │  │   object    │ │
│ └─────────┘  └───────────┘  └─────────────┘ │
│                                             │
│  ┌─────────────────┐  ┌──────────────────┐  │
│  │   source_hash   │  │   signatures[]   │  │
│  └────────┬────────┘  └────────┬─────────┘  │
│           │                    │            │
└───────────┼────────────────────┼────────────┘
            │                    │
┌───────────▼───────┐   ┌────────▼────────┐
│  SOURCE DOCUMENT  │   │     AGENTS      │
│  (PDF, URL...)    │   │ (Ed25519 keys)  │
└───────────────────┘   └────────┬────────┘
                                 │
                        ┌────────▼────────┐
                        │   TRUST RANK    │
                        │  (reputation)   │
                        └─────────────────┘

┌─────────────────┐         ┌─────────────────┐
│      VOTE       │◄────────│    ASSERTION    │
│  (Ballot Box)   │  votes  │    (target)     │
│  weight: 0.0-1.0│   on    │                 │
└─────────────────┘         └─────────────────┘

┌─────────────────┐         ┌─────────────────┐
│     EPOCH B     │◄────────│     EPOCH A     │
│  supersedes: A  │  older  │                 │
│  type: Temporal │  epoch  │                 │
└─────────────────┘         └─────────────────┘
API Representation
All binary data (hashes, signatures, agent IDs) is hex-encoded in JSON APIs:
{
  "subject": "Semaglutide",
  "predicate": "muscle_effect",
  "object": { "type": "Text", "value": "Significant loss" },
  "source_hash": "a1b2c3d4e5f6...",
  "signatures": [
    {
      "agent_id": "deadbeef...",
      "signature": "cafebabe...",
      "timestamp": 1706745600
    }
  ],
  "confidence": 0.85,
  "timestamp": 1706745600
}
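Hex encoding at the API boundary means each 32-byte hash travels as a 64-character string. Here is a minimal sketch of the conversion in both directions using only the standard library. The function names are hypothetical, and accepting uppercase digits on input is an assumption rather than documented API behavior.

```rust
/// Encode a 32-byte hash as 64 lowercase hex characters.
pub fn hash_to_hex(hash: &[u8; 32]) -> String {
    hash.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Decode a 64-character hex string into a 32-byte hash.
/// ASSUMPTION: both lowercase and uppercase hex digits are accepted.
pub fn hex_to_hash(s: &str) -> Result<[u8; 32], String> {
    if s.len() != 64 {
        return Err(format!("expected 64 hex characters, got {}", s.len()));
    }
    // Checking every byte up front also guarantees the slicing below
    // never splits a multi-byte UTF-8 character.
    if !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("input contains non-hex characters".to_string());
    }
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16).map_err(|e| e.to_string())?;
    }
    Ok(out)
}
```

The truncated values in the JSON example (such as "a1b2c3d4e5f6...") would be rejected by this decoder, since it requires the full 64 characters.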