# Vision: Epistemic Logits (The Neuro-Symbolic Cortex)
> **Status:** Vision / L9 Roadmap
> **Target:** Solves "Intrinsic Hallucination"
> **Core Concept:** StemeDB is no longer just a database we query; it is a constraint layer applied to the model's probability distribution during inference.
---
## 1. The Problem: The "RAG Ceiling"
Current architectures (including our own Aphoria/ADK stack) rely on **Retrieval Augmented Generation (RAG)**. This is a "Glass Box" system, but it is composed of two disconnected brains:
1. **The Retriever (StemeDB):** Knows what is true, what is conflicted, and who said what.
2. **The Generator (LLM):** Knows how to predict the next token based on statistical patterns.
In our current architecture, we paste the Truth (1) into the Context Window of the Generator (2) and *hope* the Generator attends to it.
**The Failure Mode:** The Generator can still ignore the context. It can hallucinate. It can state a high-conflict fact with absolute certainty ("X is true") instead of qualified uncertainty ("Some sources claim X").
**We cannot fix this by prompting. We must fix it by math.**

---
## 2. The Solution: Epistemic Logits
**Epistemic Logits** is a decoding strategy that modifies the probability distribution of the LLM's output layer at every decoding step, based on the `ConflictScore` and `TrustRank` of the concepts being generated.
We move StemeDB from the **Input Layer** (Prompt) to the **Activation Layer** (Logits).
### The Core Equation
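$$ P_{final}(token) \propto P_{model}(token) \times E(token, Subject, Predicate) $$
The product is renormalized over the vocabulary so that $P_{final}$ remains a probability distribution.
Where $E \in [0, 1]$ is the **Epistemic Function**:
* If `ConflictScore > 0.8` (High Disagreement) AND `Token` implies certainty ("is", "proven", "fact"), then $E \to 0$ (Penalty).
* If `ConflictScore > 0.8` AND `Token` implies uncertainty ("reported", "alleged", "contested"), then $E \to 1$ (Boost: after renormalization, these tokens absorb the probability mass removed from certainty tokens).
* If `SourceTier` is Low (Anecdotal) AND `Time` is old (Decayed), then $E \to 0$ for tokens that assert the claim.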
**Result:** When $E = 0$, a certainty token has zero probability and cannot be sampled, so the model *physically cannot* use it to state a contested claim as a fact. It effectively has a "physics engine" for Truth.
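
A minimal sketch of the core equation on a toy four-token vocabulary, with the renormalization step written out. The function names, the `CERTAINTY` word list, and the example probabilities are illustrative assumptions, not StemeDB APIs; only the 0.8 threshold comes from the rules above.

```python
# Illustrative only: the word list and probabilities below are assumptions.
CERTAINTY = {"is", "proven", "fact"}


def epistemic_factor(token: str, conflict_score: float) -> float:
    """Return E for one candidate token given the subject's ConflictScore."""
    if conflict_score > 0.8 and token in CERTAINTY:
        return 0.0  # penalty: certainty marker on a contested claim
    return 1.0  # hedging and neutral tokens pass through unpenalized


def apply_epistemic_function(
    probs: dict[str, float], conflict_score: float
) -> dict[str, float]:
    """Multiply each probability by E, then renormalize so the result sums to 1."""
    weighted = {t: p * epistemic_factor(t, conflict_score) for t, p in probs.items()}
    total = sum(weighted.values())
    if total == 0.0:
        raise ValueError("every candidate token was suppressed")
    return {t: w / total for t, w in weighted.items()}


model_probs = {"is": 0.5, "proven": 0.1, "reported": 0.3, "alleged": 0.1}
final_probs = apply_epistemic_function(model_probs, conflict_score=0.9)
```

With a conflict score of 0.9, "is" and "proven" drop to zero and their mass moves to "reported" (about 0.75) and "alleged" (about 0.25). The boost comes from renormalization, not from a factor above 1.
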
---
## 3. Architecture: The Neuro-Symbolic Stack
```ascii
[ User Query ]
[ 1. Semantic Router ] ───► [ StemeDB (The Graph) ]
        │                            │
        │ (Context)                  │ (Constraints & Scores)
        ▼                            ▼
[ 2. LLM Core ]             [ 3. Epistemic Decoder ]
 (Transformer)                (Logit Processor)
        │                            │
        └──► [ Raw Logits ] ────────►│
                                     │ ◄── "Don't say 'proven' if Conflict > 0.5"
                              [ Final Token ]
```
### Component 1: The Lookahead Mapper
To constrain logits, we must know what the model is *about* to say. We implement a lightweight "Concept Probe" (a small BERT model or sparse autoencoder) that runs in parallel with the main LLM.
* **Input:** Current generation stream.
* **Output:** The `StemeDB::SubjectID` the stream is discussing.
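
To make the input/output contract concrete, here is a deliberately simplified stand-in for the Concept Probe. It swaps the BERT model or sparse autoencoder for a plain alias dictionary. The `ALIASES` table, the `subject:` ID format, and the `window` parameter are hypothetical and chosen only for illustration.

```python
from typing import Optional

# Hypothetical alias table; a real probe would be a learned model, not a lookup.
ALIASES = {
    "semaglutide": "subject:semaglutide",
    "ozempic": "subject:semaglutide",
}


def map_stream_to_subject(generated_tokens: list[str], window: int = 8) -> Optional[str]:
    """Return the subject ID of the most recent alias in the last `window` tokens."""
    for token in reversed(generated_tokens[-window:]):
        subject = ALIASES.get(token.lower().strip(".,;:"))
        if subject is not None:
            return subject
    return None  # no known subject in the recent stream


current_subject = map_stream_to_subject(["Patients", "taking", "Ozempic", "often"])
```

The interface is the same as the one described above: the generation stream goes in, and a subject identifier (or nothing) comes out.
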
### Component 2: The Constraint Projector
Once the Subject is identified, StemeDB projects the **Epistemic State** of that subject into a set of forbidden/boosted tokens.
* *State:* `Semaglutide::has_side_effect` -> Conflict: High.
* *Constraint:* Ban absolute assertions. Boost attribution markers ("According to FDA...", "Patients report...").
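
The projection step can be sketched as a function from an epistemic state to additive logit biases. The `EpistemicState` type, the marker word lists, the `boost` value, and the toy vocabulary are all assumptions for illustration. Real tokenizers split phrases such as "According to FDA" into several subword tokens, which this single-word sketch ignores.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EpistemicState:
    subject: str
    predicate: str
    conflict_score: float  # 0.0 (consensus) to 1.0 (full disagreement)


# Hypothetical marker lists; a real system would derive these per tokenizer.
ABSOLUTE_MARKERS = ("proven", "definitely", "always")
ATTRIBUTION_MARKERS = ("according", "reportedly", "report")


def project_constraints(
    state: EpistemicState,
    vocab: dict[str, int],
    threshold: float = 0.8,
    boost: float = 2.0,
) -> dict[int, float]:
    """Map an epistemic state to additive logit biases keyed by token id."""
    if state.conflict_score <= threshold:
        return {}  # low conflict: no constraints
    biases: dict[int, float] = {}
    for word in ABSOLUTE_MARKERS:
        if word in vocab:
            biases[vocab[word]] = -math.inf  # ban: zero probability after softmax
    for word in ATTRIBUTION_MARKERS:
        if word in vocab:
            biases[vocab[word]] = boost  # boost attribution markers
    return biases


def apply_biases(logits: list[float], biases: dict[int, float]) -> list[float]:
    """Add each bias to the logit at its token id; other logits are unchanged."""
    return [logit + biases.get(i, 0.0) for i, logit in enumerate(logits)]


vocab = {"proven": 0, "according": 1, "the": 2}
state = EpistemicState("Semaglutide", "has_side_effect", conflict_score=0.9)
constrained = apply_biases([1.0, 0.5, 0.2], project_constraints(state, vocab))
```

Working in logit space, a bias of negative infinity corresponds to $E = 0$ in the core equation, and a positive bias raises a token's relative probability.
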
### Component 3: The Reward Loop (RLHF on Reality)
We use the `VoteStore` not just for consensus, but to train a **Reward Model**.
* **Data:** Millions of historical votes where Agents disagreed.
* **Training:** Fine-tune the LLM to prefer outputs that align with the *weighted consensus* of the Graph.
* **Outcome:** The model "intuitively" knows which sources are trustworthy (Tier 0/1) without needing RAG retrieval for every fact.
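
One possible reading of "weighted consensus" is a tier-weighted vote share, sketched below. The `TIER_WEIGHT` values and the `(claim, tier)` vote shape are assumptions; the source does not specify how the `VoteStore` weights votes.

```python
from collections import defaultdict

# Assumed trust weights per source tier (0 = regulatory, 5 = anecdotal).
TIER_WEIGHT = {0: 1.0, 1: 0.8, 2: 0.6, 3: 0.4, 4: 0.2, 5: 0.1}


def weighted_consensus(votes: list[tuple[str, int]]) -> tuple[str, float]:
    """Return the claim with the most tier-weighted support and its weight share."""
    totals: dict[str, float] = defaultdict(float)
    for claim, tier in votes:
        totals[claim] += TIER_WEIGHT[tier]
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(totals.values())


# Made-up votes: one Tier 0 source against three Tier 5 sources.
votes = [
    ("X causes nausea", 0),
    ("X is side-effect free", 5),
    ("X is side-effect free", 5),
    ("X is side-effect free", 5),
]
winner, share = weighted_consensus(votes)
```

Here the single Tier 0 vote outweighs three Tier 5 votes (weight 1.0 against 0.3), so the consensus follows source quality and not the head count.
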
---
## 4. Implementation Roadmap (The Path to L9)
### Phase 1: Structured Decoding (The "Guardrails")
*Integrate StemeDB with grammar-constrained generation libraries (like `guidance` or `outlines`).*
* **Mechanism:** Force the LLM to output a citation struct `{ claim: "...", source_id: "...", confidence: 0.0-1.0 }` for every assertion.
* **Validation:** If the generated `source_id` does not exist in StemeDB, or if the `confidence` differs from the value recorded in the `VoteStore`, reject the token stream and regenerate.
* **Deliverable:** `crates/stemedb-guidance`: A Rust binding for grammar-constrained sampling backed by the KV store.
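
The validation rule can be sketched as a check on one parsed citation struct. The in-memory `known_confidence` dictionary stands in for StemeDB and the `VoteStore`, and the `tolerance` parameter is an assumption, since the source does not say how close the two confidence values must be.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    claim: str
    source_id: str
    confidence: float


def validate_citation(
    citation: Citation,
    known_confidence: dict[str, float],  # stand-in for StemeDB: source_id -> confidence
    tolerance: float = 0.1,  # assumed; the source only says the values must match
) -> tuple[bool, str]:
    """Return (accepted, reason) for one generated citation struct."""
    if not 0.0 <= citation.confidence <= 1.0:
        return False, "confidence out of range"
    if citation.source_id not in known_confidence:
        return False, "unknown source_id"
    if abs(citation.confidence - known_confidence[citation.source_id]) > tolerance:
        return False, "confidence does not match stored votes"
    return True, "ok"


store = {"src-001": 0.85}  # hypothetical source id and stored confidence
accepted = validate_citation(Citation("X causes nausea", "src-001", 0.9), store)
rejected = validate_citation(Citation("X cures everything", "src-999", 0.9), store)
```

A caller would regenerate whenever the first element of the result is `False`, using the reason string for logging.
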
### Phase 2: DPO Pipeline (The "Training")
*Direct Preference Optimization using StemeDB history.*
* **Mechanism:** Export the `VoteStore` history as `(Prompt, Chosen, Rejected)` tuples.
* *Chosen:* An assertion supported by Tier 0 (Regulatory) sources.
* *Rejected:* A conflicting assertion supported only by Tier 5 (Anecdotal) sources.
* **Action:** Fine-tune a Llama-3 8B model on this dataset.
* **Deliverable:** `crates/stemedb-rlhf`: A pipeline that turns WAL segments into HuggingFace datasets.
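
A sketch of the export step, assuming assertions arrive as dictionaries with `prompt`, `text`, and `tier` keys (a hypothetical shape, not the WAL format). It pairs the best-tier assertion with the worst-tier conflicting one and writes JSON Lines, a format that dataset tooling commonly accepts.

```python
import json


def build_preference_pairs(assertions: list[dict]) -> list[dict]:
    """Pair the lowest-tier (most trusted) and highest-tier assertion per prompt."""
    by_prompt: dict[str, list[dict]] = {}
    for a in assertions:
        by_prompt.setdefault(a["prompt"], []).append(a)
    pairs = []
    for prompt, group in by_prompt.items():
        chosen = min(group, key=lambda a: a["tier"])
        rejected = max(group, key=lambda a: a["tier"])
        # Skip prompts without a real conflict between tiers.
        if chosen["tier"] < rejected["tier"] and chosen["text"] != rejected["text"]:
            pairs.append(
                {"prompt": prompt, "chosen": chosen["text"], "rejected": rejected["text"]}
            )
    return pairs


def to_jsonl(pairs: list[dict]) -> str:
    """Serialize one preference pair per line."""
    return "\n".join(json.dumps(p, sort_keys=True) for p in pairs)


# Made-up records for illustration only.
assertions = [
    {"prompt": "Does drug X cause nausea?", "text": "Yes, per the regulatory label.", "tier": 0},
    {"prompt": "Does drug X cause nausea?", "text": "No, a forum post says it never does.", "tier": 5},
    {"prompt": "Is drug Y approved?", "text": "Yes.", "tier": 0},
]
pairs = build_preference_pairs(assertions)
jsonl = to_jsonl(pairs)
```

Prompts with only one assertion, or with no tier gap, produce no pair, which keeps uncontested facts out of the preference dataset.
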
### Phase 3: The Logit Processor (The "Cortex")
*Real-time intervention.*
* **Mechanism:** A custom sampler (integrated into `llama.cpp` or `vLLM`) that queries StemeDB's `MaterializedView` at every decoding step, with a sub-millisecond lookup budget.
* **Optimization:** This requires the `HybridStore` to be memory-mapped into the inference engine's address space, so that lookups avoid network and IPC round trips.
* **Deliverable:** `episteme-inference`: A standalone inference server that speaks OpenAI API but enforces StemeDB truth constraints.
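
The memory-mapped lookup idea can be illustrated with fixed-width records and a binary search over a mapped file. The record layout (`u64` subject id plus `f32` conflict score) is an assumption for this sketch and is not the `HybridStore` format.

```python
import mmap
import os
import struct
import tempfile

# Assumed layout: little-endian u64 subject id followed by f32 conflict score.
RECORD = struct.Struct("<Qf")


def write_view(path: str, rows: list[tuple[int, float]]) -> None:
    """Write rows sorted by subject id as fixed-width records."""
    with open(path, "wb") as f:
        for subject_id, score in sorted(rows):
            f.write(RECORD.pack(subject_id, score))


def lookup(view, subject_id: int):
    """Binary-search the mapped records; return the score, or None if absent."""
    lo, hi = 0, len(view) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        key, score = RECORD.unpack_from(view, mid * RECORD.size)
        if key == subject_id:
            return score
        if key < subject_id:
            lo = mid + 1
        else:
            hi = mid
    return None


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name
write_view(path, [(42, 0.9), (7, 0.1), (1000, 0.5)])
with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
    hit = lookup(view, 42)   # present: returns the stored score as a float32 value
    miss = lookup(view, 8)   # absent: returns None
os.remove(path)
```

Because the sampler reads the mapping directly, each lookup is a handful of in-process memory reads with no serialization or socket hop. Whether that meets a sub-millisecond budget at production scale would need to be measured.
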
---
## 5. The Impact
When we achieve Epistemic Logits, we solve the **Liability Gap**.
Currently, no enterprise can safely deploy an autonomous agent for critical tasks (Medical, Legal, Finance) because it cannot guarantee the agent's output.
With Epistemic Logits, we provide a mathematical guarantee: **"This system is incapable of stating a claim with higher confidence than the underlying evidence supports."**
This transforms AI from a creative writing tool into a **fiduciary instrument**.