jordan 02ecac9a07 fix: merge upstream 10 commits, fix DashMap deadlock, deterministic sim ingestion

2026-02-20 20:27:32 -07:00

5.9 KiB


Vision: Epistemic Logits (The Neuro-Symbolic Cortex)

Status: Vision / L9 Roadmap
Target: Solves "Intrinsic Hallucination"
Core Concept: StemeDB is no longer just a database we query; it is a constraint layer applied to the model's probability distribution during inference.

1. The Problem: The "RAG Ceiling"

Current architectures (including our own Aphoria/ADK stack) rely on Retrieval Augmented Generation (RAG). This is a "Glass Box" system, but it is composed of two disconnected brains:

1. The Retriever (StemeDB): Knows what is true, what is conflicted, and who said what.
2. The Generator (LLM): Knows how to predict the next token based on statistical patterns.

In our current architecture, we paste the Truth (1) into the Context Window of the Generator (2) and hope the Generator attends to it.

The Failure Mode: The Generator can still ignore the context. It can hallucinate. It can state a high-conflict fact with absolute certainty ("X is true") instead of qualified uncertainty ("Some sources claim X").

We cannot fix this by prompting. We must fix it by math.

2. The Solution: Epistemic Logits

Epistemic Logits is a decoding strategy that modifies the probability distribution of the LLM's output layer in real time, based on the ConflictScore and TrustRank of the concepts being generated.

We move StemeDB from the Input Layer (Prompt) to the Activation Layer (Logits).

The Core Equation

 P_{final}(token) \propto P_{model}(token) \times E(token, Subject, Predicate)

Where E is the Epistemic Function, and P_{final} is renormalized so that it sums to 1 over the vocabulary:

If ConflictScore > 0.8 (High Disagreement) AND Token implies certainty ("is", "proven", "fact"), then E \to 0 (Penalty).
If ConflictScore > 0.8 AND Token implies uncertainty ("reported", "alleged", "contested"), then E \to 1 (no penalty; renormalization moves the probability mass removed from certainty tokens onto these tokens, which is the Boost).
If SourceTier is Low (Anecdotal) AND Time is old (Decayed), then E \to 0.

Result: The model physically cannot state a contested claim as a fact. It effectively has a "physics engine" for Truth.
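A minimal sketch of the Epistemic Function as a multiplicative mask followed by renormalization. The names `EpistemicState`, `epistemic_factor`, and `apply_epistemic` are hypothetical, as is the small `floor` used in place of an exact zero. Applying the low-tier/decayed rule only to certainty tokens is also an assumption, since the rules above do not say which tokens it targets.

```python
from dataclasses import dataclass

# Certainty markers taken from the examples in the text.
CERTAINTY_TOKENS = {"is", "proven", "fact"}


@dataclass
class EpistemicState:
    conflict_score: float  # 0.0 (consensus) to 1.0 (full disagreement)
    low_tier: bool         # True if only anecdotal sources back the claim
    decayed: bool          # True if the evidence is considered stale


def epistemic_factor(token, state, floor=1e-6):
    """Return E for one token: `floor` to penalize, 1.0 to leave it untouched."""
    weak_evidence = state.low_tier and state.decayed
    if token in CERTAINTY_TOKENS and (state.conflict_score > 0.8 or weak_evidence):
        return floor
    # Hedging tokens ("reported", "alleged", ...) keep E = 1; they gain
    # probability only through the renormalization below.
    return 1.0


def apply_epistemic(p_model, state):
    """Scale each token probability by E, then renormalize to sum to 1."""
    scaled = {tok: p * epistemic_factor(tok, state) for tok, p in p_model.items()}
    total = sum(scaled.values())
    return {tok: value / total for tok, value in scaled.items()}


p_model = {"is": 0.6, "reported": 0.3, "the": 0.1}
state = EpistemicState(conflict_score=0.9, low_tier=False, decayed=False)
p_final = apply_epistemic(p_model, state)
```

In this toy distribution the mass originally assigned to "is" is redistributed across "reported" and "the" in proportion to their model probabilities.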

3. Architecture: The Neuro-Symbolic Stack

[ User Query ]
      │
      ▼
[ 1. Semantic Router ] ───► [ StemeDB (The Graph) ]
      │                          │
      │ (Context)                │ (Constraints & Scores)
      ▼                          ▼
[ 2. LLM Core ]          [ 3. Epistemic Decoder ]
(Transformer)            (Logit Processor)
      │                          │
      └──► [ Raw Logits ] ──────►│
                                 │ ◄── "Don't say 'proven' if Conflict > 0.8"
                                 │
                                 ▼
                         [ Final Token ]

Component 1: The Lookahead Mapper

To constrain logits, we must know what the model is about to say. We implement a lightweight "Concept Probe" (a small BERT model or sparse autoencoder) that runs in parallel with the main LLM.

Input: Current generation stream.
Output: The StemeDB::SubjectID the stream is discussing.
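To make the input/output contract concrete, here is a deliberately simple stand-in for the probe. It swaps the learned model described above for an alias lookup over the most recent words, so it only illustrates the interface (stream in, subject ID out). `SUBJECT_ALIASES`, `probe_subject`, and the `window` parameter are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical alias table; a real probe would be a learned classifier.
SUBJECT_ALIASES = {
    "semaglutide": "Semaglutide",
    "ozempic": "Semaglutide",
}


def probe_subject(stream: str, window: int = 12) -> Optional[str]:
    """Return the subject ID mentioned most recently in the last `window` words."""
    words = re.findall(r"[a-z0-9]+", stream.lower())
    for word in reversed(words[-window:]):
        if word in SUBJECT_ALIASES:
            return SUBJECT_ALIASES[word]
    return None


subject = probe_subject("Patients taking Ozempic often")
```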

Component 2: The Constraint Projector

Once the Subject is identified, StemeDB projects the Epistemic State of that subject into a set of forbidden/boosted tokens.

State: Semaglutide::has_side_effect -> Conflict: High.
Constraint: Ban absolute assertions. Boost attribution markers ("According to FDA...", "Patients report...").
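The projection step can be sketched as a function from an epistemic state to per-token logit biases. This is an illustration under assumptions: `project_constraints` and the `boost` value are hypothetical, and the sketch works on single tokens, whereas attribution markers such as "According to FDA..." span several tokens and would need phrase-level handling.

```python
import math

# Token lists reuse the examples given for the Epistemic Function.
ABSOLUTE_TOKENS = ("is", "proven", "fact")
ATTRIBUTION_TOKENS = ("reported", "alleged", "contested")


def project_constraints(conflict: str, boost: float = 2.0) -> dict:
    """Map a conflict level to additive logit biases keyed by token."""
    if conflict != "High":
        return {}  # no constraints for low-conflict subjects
    biases = {tok: -math.inf for tok in ABSOLUTE_TOKENS}     # banned
    biases.update({tok: boost for tok in ATTRIBUTION_TOKENS})  # boosted
    return biases


biases = project_constraints("High")
```

A bias of negative infinity removes a token entirely once the logits pass through softmax, while a finite positive bias raises its relative probability.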

Component 3: The Reward Loop (RLHF on Reality)

We use the VoteStore not just for consensus, but to train a Reward Model.

Data: Millions of historical votes where Agents disagreed.
Training: Fine-tune the LLM to prefer outputs that align with the weighted consensus of the Graph.
Outcome: The model "intuitively" knows which sources are trustworthy (Tier 0/1) without needing RAG retrieval for every fact.
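As a sketch of what "weighted consensus" could mean as a training signal, the snippet below sums votes weighted by source tier and reports the winning claim with its share of total weight. The weights in `TIER_WEIGHT` and the function name `weighted_consensus` are illustrative assumptions, not values from StemeDB.

```python
from collections import defaultdict

# Hypothetical weights: Tier 0 (regulatory) counts most, Tier 5 (anecdotal) least.
TIER_WEIGHT = {0: 1.0, 1: 0.8, 2: 0.6, 3: 0.4, 4: 0.2, 5: 0.1}


def weighted_consensus(votes):
    """votes: iterable of (claim, source_tier). Returns (winning_claim, share)."""
    totals = defaultdict(float)
    for claim, tier in votes:
        totals[claim] += TIER_WEIGHT[tier]
    winner = max(totals, key=totals.get)
    return winner, totals[winner] / sum(totals.values())


# One regulatory vote for claim "A" against three anecdotal votes for "B".
winner, share = weighted_consensus([("A", 0), ("B", 5), ("B", 5), ("B", 5)])
```

Under these weights a single Tier 0 vote outweighs three Tier 5 votes, which is the behavior a reward signal would need if the model is to learn tier trustworthiness.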

4. Implementation Roadmap (The Path to L9)

Phase 1: Structured Decoding (The "Guardrails")

Integrate StemeDB with grammar-constrained generation libraries (like guidance or outlines).

Mechanism: Force the LLM to output a citation struct { claim: "...", source_id: "...", confidence: 0.0-1.0 } for every assertion.
Validation: If the generated source_id does not exist in StemeDB, or if the confidence doesn't match the VoteStore, reject the token stream and regenerate.
Deliverable: crates/stemedb-guidance: A Rust binding for grammar-constrained sampling backed by the KV store.
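The validation rule can be sketched as follows, with plain Python containers standing in for StemeDB and the VoteStore. `Citation`, `validate_citation`, the tolerance `tol`, and the treatment of claims with no recorded votes are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    claim: str
    source_id: str
    confidence: float  # expected to lie in 0.0-1.0


def validate_citation(cit, known_sources, vote_confidence, tol=0.05):
    """Return (ok, reason). A False result means: reject and regenerate."""
    if cit.source_id not in known_sources:
        return False, "unknown source_id"
    expected = vote_confidence.get(cit.claim)
    if expected is None:
        return False, "claim has no recorded votes"
    if abs(cit.confidence - expected) > tol:
        return False, "confidence mismatch"
    return True, "ok"


known_sources = {"src-001"}            # stand-in for the source registry
vote_confidence = {"claim-x": 0.6}     # stand-in for VoteStore confidence

good = validate_citation(Citation("claim-x", "src-001", 0.62), known_sources, vote_confidence)
bad = validate_citation(Citation("claim-x", "src-999", 0.62), known_sources, vote_confidence)
```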

Phase 2: DPO Pipeline (The "Training")

Direct Preference Optimization using StemeDB history.

Mechanism: Export the VoteStore history as (Prompt, Chosen, Rejected) tuples.
- Chosen: An assertion supported by Tier 0 (Regulatory) sources.
- Rejected: A conflicting assertion supported only by Tier 5 (Anecdotal) sources.
Action: Fine-tune a Llama-3 8B model on this dataset.
Deliverable: crates/stemedb-rlhf: A pipeline that turns WAL segments into HuggingFace datasets.
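A sketch of the export step, assuming each assertion record already carries the best tier among its supporting sources. The record fields (`subject`, `predicate`, `text`, `tier`), the prompt template, and the function names are hypothetical; the real pipeline would read WAL segments rather than an in-memory list.

```python
import json


def to_dpo_records(assertions):
    """Pair each Tier 0 assertion with each conflicting Tier 5 assertion
    about the same (subject, predicate)."""
    by_key = {}
    for a in assertions:
        by_key.setdefault((a["subject"], a["predicate"]), []).append(a)

    records = []
    for (subject, predicate), group in by_key.items():
        chosen = [a for a in group if a["tier"] == 0]
        rejected = [a for a in group if a["tier"] == 5]
        for c in chosen:
            for r in rejected:
                if c["text"] != r["text"]:  # only conflicting pairs
                    records.append({
                        "prompt": f"What is known about {subject} {predicate}?",
                        "chosen": c["text"],
                        "rejected": r["text"],
                    })
    return records


def to_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)


assertions = [
    {"subject": "Semaglutide", "predicate": "has_side_effect", "text": "Claim A", "tier": 0},
    {"subject": "Semaglutide", "predicate": "has_side_effect", "text": "Claim B", "tier": 5},
    {"subject": "Semaglutide", "predicate": "has_side_effect", "text": "Claim C", "tier": 2},
]
records = to_dpo_records(assertions)
jsonl = to_jsonl(records)
```

Assertions from intermediate tiers (such as "Claim C" here) are left out of the pairs, which keeps the preference signal limited to the clearest contrasts.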

Phase 3: The Logit Processor (The "Cortex")

Real-time intervention.

Mechanism: A custom sampler (integrated into llama.cpp or vLLM) that queries StemeDB's MaterializedView in real time (sub-millisecond) during inference.
Optimization: This requires the HybridStore to be memory-mapped into the inference engine's address space, so that lookups avoid network and IPC round-trips.
Deliverable: episteme-inference: A standalone inference server that exposes an OpenAI-compatible API but enforces StemeDB truth constraints.
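A sketch of the per-step intervention, written in pure Python rather than against the llama.cpp or vLLM sampler interfaces. The class `EpistemicLogitProcessor` is hypothetical, and a plain dict stands in for the MaterializedView. Working on logits instead of probabilities turns the multiplicative Epistemic Function into an additive bias of log E, so E = 0 becomes a bias of negative infinity.

```python
import math


class EpistemicLogitProcessor:
    """Masks certainty tokens when the current subject is highly contested."""

    def __init__(self, view, vocab, threshold=0.8):
        self.view = view            # (subject, predicate) -> conflict score
        self.vocab = vocab          # token id -> token string
        self.threshold = threshold
        self.certainty = {"is", "proven", "fact"}

    def __call__(self, subject, predicate, logits):
        conflict = self.view.get((subject, predicate), 0.0)
        if conflict <= self.threshold:
            return list(logits)     # unknown or low-conflict: no intervention
        return [
            -math.inf if self.vocab[i] in self.certainty else x
            for i, x in enumerate(logits)
        ]


def softmax(logits):
    """Numerically stable softmax; assumes at least one finite logit."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


vocab = ["proven", "reported", "the"]
view = {("Semaglutide", "has_side_effect"): 0.9}
processor = EpistemicLogitProcessor(view, vocab)

logits = [2.0, 1.0, 0.5]
masked = processor("Semaglutide", "has_side_effect", logits)
probs = softmax(masked)
```

The lookup sits on the hot path of every decoding step, which is why the text asks for sub-millisecond access to the view.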

5. The Impact

When we achieve Epistemic Logits, we solve the Liability Gap.

Currently, no enterprise can deploy an autonomous agent for critical tasks (Medical, Legal, Finance) because it cannot guarantee the output.

With Epistemic Logits, we provide a mathematical guarantee: "This system is incapable of stating a claim with higher confidence than the underlying evidence supports."

This transforms AI from a creative writing tool into a fiduciary instrument.
