Vision: Epistemic Logits (The Neuro-Symbolic Cortex)
Status: Vision / L9 Roadmap
Target: Solves "Intrinsic Hallucination"
Core Concept: StemeDB is no longer just a database we query; it is a constraint layer applied to the model's probability distribution during inference.
1. The Problem: The "RAG Ceiling"
Current architectures (including our own Aphoria/ADK stack) rely on Retrieval Augmented Generation (RAG). This is a "Glass Box" system, but it is composed of two disconnected brains:
- The Retriever (StemeDB): Knows what is true, what is conflicted, and who said what.
- The Generator (LLM): Knows how to predict the next token based on statistical patterns.
Today, we paste the Retriever's truth into the Generator's context window and hope the Generator attends to it.
The Failure Mode: The Generator can still ignore the context. It can hallucinate. It can state a high-conflict fact with absolute certainty ("X is true") instead of qualified uncertainty ("Some sources claim X").
We cannot fix this by prompting. We must fix it by math.
2. The Solution: Epistemic Logits
Epistemic Logits is a decoding strategy that modifies the probability distribution of the LLM's output layer in real-time, based on the ConflictScore and TrustRank of the concepts being generated.
We move StemeDB from the Input Layer (Prompt) to the Activation Layer (Logits).
The Core Equation
P_{final}(token) = P_{model}(token) \times E(Subject, Predicate)
Where E is the Epistemic Function:
- If ConflictScore > 0.8 (High Disagreement) AND Token implies certainty ("is", "proven", "fact"), then E \to 0 (Penalty).
- If ConflictScore > 0.8 AND Token implies uncertainty ("reported", "alleged", "contested"), then E \to 1 (Boost).
- If SourceTier is Low (Anecdotal) AND Time is old (Decayed), then E \to 0.
Result: The model physically cannot state a contested claim as a fact. It effectively has a "physics engine" for Truth.
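The first two rules can be sketched as a small reweighting function. This is a minimal illustration under stated assumptions, not the project's implementation: the certainty word list, the 0.05 floor standing in for "E \to 0", and the names `epistemic_factor` and `reweight` are all hypothetical, and the SourceTier/Time rule is left out. Multiplying by E leaves the distribution unnormalized, so the sketch renormalizes afterwards; that step is what turns E \to 1 into a relative boost for hedged tokens.

```python
from typing import Dict

# Hypothetical word list taken from the examples in the rules above.
CERTAIN = {"is", "proven", "fact"}


def epistemic_factor(token: str, conflict_score: float,
                     threshold: float = 0.8, penalty: float = 0.05) -> float:
    """Return E for one candidate token (hypothetical rule set)."""
    if conflict_score > threshold and token in CERTAIN:
        return penalty  # stands in for E -> 0
    return 1.0  # hedged and neutral tokens keep their mass


def reweight(probs: Dict[str, float], conflict_score: float) -> Dict[str, float]:
    """Apply P_final = P_model * E, then renormalize so the result sums to 1."""
    scaled = {t: p * epistemic_factor(t, conflict_score) for t, p in probs.items()}
    total = sum(scaled.values())
    if total == 0.0:
        raise ValueError("all candidate tokens were suppressed")
    return {t: v / total for t, v in scaled.items()}


model_probs = {"is": 0.6, "reported": 0.3, "the": 0.1}
final_probs = reweight(model_probs, conflict_score=0.9)
```

With a conflict score of 0.9, most of the mass that "is" held moves to "reported" and "the" after renormalization. With a low conflict score the distribution is returned unchanged.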
3. Architecture: The Neuro-Symbolic Stack
[ User Query ]
      │
      ▼
[ 1. Semantic Router ] ───► [ StemeDB (The Graph) ]
      │                              │
      │ (Context)                    │ (Constraints & Scores)
      ▼                              ▼
[ 2. LLM Core ]             [ 3. Epistemic Decoder ]
 (Transformer)                 (Logit Processor)
      │                              │
      └──► [ Raw Logits ] ──────────►│
                                     │ ◄── "Don't say 'proven' if Conflict > 0.5"
                                     │
                                     ▼
                              [ Final Token ]
Component 1: The Lookahead Mapper
To constrain logits, we must know what the model is about to say. We implement a lightweight "Concept Probe" (a small BERT model or sparse autoencoder) that runs parallel to the main LLM.
- Input: Current generation stream.
- Output: The StemeDB::SubjectID the stream is discussing.
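To make the mapper's input/output contract concrete, the sketch below resolves the most recently mentioned subject in a generation stream. It deliberately swaps the proposed BERT model or sparse autoencoder for plain substring matching against an alias table; the table contents, the subject ID strings, and the function name `resolve_subject` are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical alias table: lowercase surface form -> subject ID.
ALIASES: Dict[str, str] = {
    "semaglutide": "Semaglutide",
    "ozempic": "Semaglutide",
    "metformin": "Metformin",
}


def resolve_subject(stream: str, aliases: Dict[str, str] = ALIASES) -> Optional[str]:
    """Return the subject whose alias occurs latest in the stream, if any."""
    text = stream.lower()
    best_pos, best_subject = -1, None
    for alias, subject in aliases.items():
        pos = text.rfind(alias)  # -1 when the alias is absent
        if pos > best_pos:
            best_pos, best_subject = pos, subject
    return best_subject
```

A learned probe would replace the alias lookup, but the surrounding interface (text in, subject ID or nothing out) could stay the same.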
Component 2: The Constraint Projector
Once the Subject is identified, StemeDB projects the Epistemic State of that subject into a set of forbidden/boosted tokens.
- State: Semaglutide::has_side_effect -> Conflict: High.
- Constraint: Ban absolute assertions. Boost attribution markers ("According to FDA...", "Patients report...").
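The projection step could look roughly like the following. It is a sketch only: `TokenConstraints`, `project_constraints`, the 0.8 threshold, and the word-level token sets are assumptions for illustration. A real projector would work on tokenizer IDs and would need to handle multi-token attribution phrases such as "According to FDA".

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class TokenConstraints:
    """Tokens to suppress and tokens to favor for one subject."""
    banned: Set[str] = field(default_factory=set)
    boosted: Set[str] = field(default_factory=set)


def project_constraints(conflict_score: float, threshold: float = 0.8) -> TokenConstraints:
    """Map a subject's conflict score to banned and boosted token sets."""
    if conflict_score > threshold:
        return TokenConstraints(
            banned={"is", "proven", "fact"},
            boosted={"reported", "alleged", "contested"},
        )
    return TokenConstraints()  # low conflict: no constraints applied


constraints = project_constraints(conflict_score=0.92)
```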
Component 3: The Reward Loop (RLHF on Reality)
We use the VoteStore not just for consensus, but to train a Reward Model.
- Data: Millions of historical votes where Agents disagreed.
- Training: Fine-tune the LLM to prefer outputs that align with the weighted consensus of the Graph.
- Outcome: The model "intuitively" knows which sources are trustworthy (Tier 0/1) without needing RAG retrieval for every fact.
4. Implementation Roadmap (The Path to L9)
Phase 1: Structured Decoding (The "Guardrails")
Integrate StemeDB with grammar-constrained generation libraries (like guidance or outlines).
- Mechanism: Force the LLM to output a citation struct { claim: "...", source_id: "...", confidence: 0.0-1.0 } for every assertion.
- Validation: If the generated source_id does not exist in StemeDB, or if the confidence doesn't match the VoteStore, reject the token stream and regenerate.
- Deliverable: crates/stemedb-guidance: A Rust binding for grammar-constrained sampling backed by the KV store.
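The validation rule can be sketched as a single check on each generated citation. Everything here is illustrative: the in-memory `KNOWN_SOURCES` dict stands in for a StemeDB/VoteStore lookup, the source IDs are made up, and the 0.1 tolerance is one possible reading of "doesn't match", which the roadmap does not define.

```python
from typing import Dict

# Hypothetical stand-in for a VoteStore lookup: source_id -> stored confidence.
KNOWN_SOURCES: Dict[str, float] = {"fda-2023-label": 0.9, "forum-post-17": 0.2}


def validate_citation(citation: dict, store: Dict[str, float] = KNOWN_SOURCES,
                      tolerance: float = 0.1) -> bool:
    """Accept a citation only if its source exists and its confidence matches."""
    source_id = citation.get("source_id")
    confidence = citation.get("confidence")
    if source_id not in store:
        return False  # unknown source: reject and regenerate
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        return False  # missing or out-of-range confidence
    return abs(confidence - store[source_id]) <= tolerance
```

A caller would regenerate whenever this returns False, for example on a citation that names a source the store has never seen.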
Phase 2: DPO Pipeline (The "Training")
Direct Preference Optimization using StemeDB history.
- Mechanism: Export the VoteStore history as (Prompt, Chosen, Rejected) tuples.
  - Chosen: An assertion supported by Tier 0 (Regulatory) sources.
  - Rejected: A conflicting assertion supported only by Tier 5 (Anecdotal) sources.
- Action: Fine-tune a Llama-3 8B model on this dataset.
- Deliverable: crates/stemedb-rlhf: A pipeline that turns WAL segments into HuggingFace datasets.
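A rough sketch of the export step follows. The `Assertion` record, the prompt template, and the function `build_preference_pairs` are hypothetical, and the sketch assumes that any Tier 0 and Tier 5 assertions sharing a subject and predicate conflict with each other; a real pipeline would read conflict information from the store instead of inferring it.

```python
from itertools import product
from typing import Dict, List, NamedTuple, Tuple


class Assertion(NamedTuple):
    subject: str
    predicate: str
    text: str
    tier: int  # 0 = Regulatory ... 5 = Anecdotal


def build_preference_pairs(assertions: List[Assertion]) -> List[Dict[str, str]]:
    """Pair Tier 0 assertions with Tier 5 ones on the same subject and predicate."""
    groups: Dict[Tuple[str, str], List[Assertion]] = {}
    for a in assertions:
        groups.setdefault((a.subject, a.predicate), []).append(a)

    pairs = []
    for (subject, predicate), group in groups.items():
        chosen = [a for a in group if a.tier == 0]
        rejected = [a for a in group if a.tier == 5]
        for c, r in product(chosen, rejected):
            pairs.append({
                "prompt": f"What is known about {subject} {predicate}?",
                "chosen": c.text,
                "rejected": r.text,
            })
    return pairs
```

Each returned dict mirrors the (Prompt, Chosen, Rejected) tuple described above, so the list can be written out as one record per line for a preference-tuning dataset.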
Phase 3: The Logit Processor (The "Cortex")
Real-time intervention.
- Mechanism: A custom sampler (integrated into llama.cpp or vLLM) that queries StemeDB's MaterializedView in real-time (sub-millisecond) during inference.
- Optimization: This requires the HybridStore to be memory-mapped into the inference engine's address space for zero-latency lookups.
- Deliverable: episteme-inference: A standalone inference server that speaks OpenAI API but enforces StemeDB truth constraints.
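Inside a sampler, the core equation is usually applied in logit space: adding log E to a token's logit before the softmax is equivalent to multiplying its probability by E and renormalizing. The sketch below shows that form under stated assumptions. The `factors` dict stands in for a MaterializedView lookup, the word-level vocabulary is a simplification, and the function names are hypothetical; the actual llama.cpp or vLLM integration points are not shown.

```python
import math
from typing import Dict, List


def apply_epistemic_bias(logits: List[float], vocab: List[str],
                         factors: Dict[str, float]) -> List[float]:
    """Add log(E) to each token's logit; E == 0 becomes -inf (a hard ban)."""
    biased = []
    for logit, token in zip(logits, vocab):
        e = factors.get(token, 1.0)  # tokens without a rule are untouched
        biased.append(logit + math.log(e) if e > 0.0 else float("-inf"))
    return biased


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax; assumes at least one finite logit."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [x / total for x in exps]


vocab = ["proven", "reported", "the"]
probs = softmax(apply_epistemic_bias([2.0, 1.0, 0.5], vocab, {"proven": 0.0}))
```

Here "proven" ends up with probability exactly zero and the remaining mass is shared between the other two tokens. A factor strictly between 0 and 1 yields a soft penalty instead of a ban.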
5. The Impact
When we achieve Epistemic Logits, we solve the Liability Gap.
Currently, no enterprise can deploy an autonomous agent for critical tasks (Medical, Legal, Finance) because it cannot guarantee the output.
With Epistemic Logits, we provide a mathematical guarantee: "This system is incapable of stating a claim with higher confidence than the underlying evidence supports."
This transforms AI from a creative writing tool into a fiduciary instrument.