# Vision: Epistemic Logits (The Neuro-Symbolic Cortex)

> **Status:** Vision / L9 Roadmap
> **Target:** Solves "Intrinsic Hallucination"
> **Core Concept:** StemeDB is no longer just a database we query; it is a constraint layer applied to the model's probability distribution during inference.

---

## 1. The Problem: The "RAG Ceiling"

Current architectures (including our own Aphoria/ADK stack) rely on **Retrieval Augmented Generation (RAG)**. This is a "Glass Box" system, but it is composed of two disconnected brains:

1. **The Retriever (StemeDB):** Knows what is true, what is conflicted, and who said what.
2. **The Generator (LLM):** Knows how to predict the next token based on statistical patterns.

In our current architecture, we paste the Truth (1) into the Context Window of the Generator (2) and *hope* the Generator attends to it.

**The Failure Mode:** The Generator can still ignore the context. It can hallucinate. It can state a high-conflict claim with absolute certainty ("X is true") instead of qualified uncertainty ("Some sources claim X").

**We cannot fix this by prompting. We must fix it by math.**

---

## 2. The Solution: Epistemic Logits

**Epistemic Logits** is a decoding strategy that modifies the probability distribution of the LLM's output layer in real-time, based on the `ConflictScore` and `TrustRank` of the concepts being generated.

We move StemeDB from the **Input Layer** (Prompt) to the **Output Layer** (Logits).

### The Core Equation

$$ P_{final}(token) \propto P_{model}(token) \times E(token, Subject, Predicate) $$

Where $E$ is the **Epistemic Function**, evaluated for each candidate token, and the product is renormalized over the vocabulary so that $P_{final}$ sums to 1:

* If `ConflictScore > 0.8` (High Disagreement) AND `Token` implies certainty ("is", "proven", "fact"), then $E \to 0$ (Penalty).
* If `ConflictScore > 0.8` AND `Token` implies uncertainty ("reported", "alleged", "contested"), then $E \to 1$ (Boost: the token is not penalized, so after renormalization it absorbs the probability mass removed from certainty tokens).
* If `SourceTier` is Low (Anecdotal) AND `Time` is old (Decayed), then $E \to 0$.

**Result:** Because certainty tokens are driven to near-zero probability, the model *physically cannot* state a contested claim as a fact.
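
As a minimal sketch of the core equation, assuming a toy three-word vocabulary and a hard-coded word list, the rescaling could look like the Python below. The names `CERTAINTY`, `epistemic_factor`, `apply_epistemic_logits`, and the `penalty` value are illustrative assumptions, not StemeDB APIs. The sketch covers only the `ConflictScore` rules and omits the `SourceTier`/`Time` rule.

```python
# Hypothetical word list; the source gives these tokens only as examples.
CERTAINTY = {"is", "proven", "fact"}


def epistemic_factor(token, conflict_score, threshold=0.8, penalty=1e-6):
    """Return E for one token.

    Certainty words on a high-conflict subject get a near-zero factor;
    every other token (including uncertainty words) keeps a factor of 1.
    """
    if conflict_score > threshold and token in CERTAINTY:
        return penalty
    return 1.0


def apply_epistemic_logits(probs, conflict_score):
    """Multiply each P_model(token) by E, then renormalize to sum to 1."""
    scaled = {
        token: p * epistemic_factor(token, conflict_score)
        for token, p in probs.items()
    }
    total = sum(scaled.values())
    return {token: s / total for token, s in scaled.items()}


# Toy next-token distribution for a subject with ConflictScore = 0.9.
p_model = {"proven": 0.6, "reported": 0.3, "contested": 0.1}
p_final = apply_epistemic_logits(p_model, conflict_score=0.9)
```

With `conflict_score=0.9`, nearly all of the mass on "proven" moves to "reported" and "contested"; below the threshold the distribution comes back unchanged. A real decoder would more likely add $\log E$ to the logits before the softmax, which is equivalent, and the model would then have only the hedged wording left to sample from.
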
The model effectively has a "physics engine" for Truth.

---

## 3. Architecture: The Neuro-Symbolic Stack

```ascii
[ User Query ]
      │
      ▼
[ 1. Semantic Router ] ───► [ StemeDB (The Graph) ]
      │                          │
      │ (Context)                │ (Constraints & Scores)
      ▼                          ▼
[ 2. LLM Core ]             [ 3. Epistemic Decoder ]
  (Transformer)               (Logit Processor)
      │                          │
      └──► [ Raw Logits ] ──────►│
                                 │ ◄── "Don't say 'proven' if Conflict > 0.5"
                                 │
                                 ▼
                          [ Final Token ]
```

### Component 1: The Lookahead Mapper

To constrain logits, we must know what the model is *about* to say. We implement a lightweight "Concept Probe" (a small BERT model or sparse autoencoder) that runs in parallel with the main LLM.

* **Input:** Current generation stream.
* **Output:** The `StemeDB::SubjectID` the stream is discussing.

### Component 2: The Constraint Projector

Once the Subject is identified, StemeDB projects the **Epistemic State** of that subject into a set of forbidden/boosted tokens.

* *State:* `Semaglutide::has_side_effect` -> Conflict: High.
* *Constraint:* Ban absolute assertions. Boost attribution markers ("According to FDA...", "Patients report...").

### Component 3: The Reward Loop (RLHF on Reality)

We use the `VoteStore` not just for consensus, but to train a **Reward Model**.

* **Data:** Millions of historical votes where Agents disagreed.
* **Training:** Fine-tune the LLM to prefer outputs that align with the *weighted consensus* of the Graph.
* **Outcome:** The model "intuitively" knows which sources are trustworthy (Tier 0/1) without needing RAG retrieval for every fact.

---

## 4. Implementation Roadmap (The Path to L9)

### Phase 1: Structured Decoding (The "Guardrails")

*Integrate StemeDB with grammar-constrained generation libraries (like `guidance` or `outlines`).*

* **Mechanism:** Force the LLM to output a citation struct `{ claim: "...", source_id: "...", confidence: 0.0-1.0 }` for every assertion.
* **Validation:** If the generated `source_id` does not exist in StemeDB, or if the `confidence` doesn't match the `VoteStore`, reject the token stream and regenerate.
* **Deliverable:** `crates/stemedb-guidance`: A Rust binding for grammar-constrained sampling backed by the KV store.

### Phase 2: DPO Pipeline (The "Training")

*Direct Preference Optimization using StemeDB history.*

* **Mechanism:** Export the `VoteStore` history as `(Prompt, Chosen, Rejected)` tuples.
    * *Chosen:* An assertion supported by Tier 0 (Regulatory) sources.
    * *Rejected:* A conflicting assertion supported only by Tier 5 (Anecdotal) sources.
* **Action:** Fine-tune a Llama-3 8B model on this dataset.
* **Deliverable:** `crates/stemedb-rlhf`: A pipeline that turns WAL segments into Hugging Face datasets.

### Phase 3: The Logit Processor (The "Cortex")

*Real-time intervention.*

* **Mechanism:** A custom sampler (integrated into `llama.cpp` or `vLLM`) that queries StemeDB's `MaterializedView` in real-time (sub-millisecond) during inference.
* **Optimization:** This requires the `HybridStore` to be memory-mapped into the inference engine's address space for near-zero-latency lookups.
* **Deliverable:** `episteme-inference`: A standalone inference server that exposes an OpenAI-compatible API but enforces StemeDB truth constraints.

---

## 5. The Impact

When we achieve Epistemic Logits, we solve the **Liability Gap**.

Currently, no enterprise can deploy an autonomous agent for critical tasks (Medical, Legal, Finance) because it cannot guarantee the output.

With Epistemic Logits, we provide a mathematical guarantee: **"This system is incapable of stating a claim with higher confidence than the underlying evidence supports."**

This transforms AI from a creative writing tool into a **fiduciary instrument**.
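
The guarantee above maps most directly onto the Phase 1 validation rule, so a minimal Python sketch of that check may help make it concrete. Everything here is an assumption for illustration: the `Citation` dataclass mirrors the citation struct from Phase 1, `EVIDENCE` is an in-memory stand-in for a StemeDB/`VoteStore` lookup, the source IDs are made up, and the one-sided `tolerance` check is one possible reading of "doesn't match".

```python
from dataclasses import dataclass


@dataclass
class Citation:
    claim: str
    source_id: str
    confidence: float  # expected range: 0.0 to 1.0


# Hypothetical stand-in for the VoteStore: source_id -> supported confidence.
EVIDENCE = {
    "src-tier0-001": 0.9,
    "src-tier5-042": 0.2,
}


def validate(citation, evidence=EVIDENCE, tolerance=0.05):
    """Return (ok, reason) for one generated citation.

    Rejects unknown sources and any confidence that exceeds what the
    evidence supports by more than `tolerance`.
    """
    supported = evidence.get(citation.source_id)
    if supported is None:
        return False, "unknown source_id"
    if citation.confidence > supported + tolerance:
        return False, "confidence exceeds evidence"
    return True, "ok"


ok_case = validate(Citation("X is reported", "src-tier0-001", 0.9))
unknown_case = validate(Citation("X is proven", "src-missing", 0.9))
overconfident_case = validate(Citation("X is proven", "src-tier5-042", 0.8))
```

In a generation loop, a failed check would trigger the reject-and-regenerate step from Phase 1; the sketch only shows the decision itself.
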