This commit adds the read path (Cortex) to complement the write path (Spine):

## Crates
- stemedb-api: HTTP API with axum + utoipa OpenAPI
  - /v1/assert, /v1/query, /v1/epoch, /v1/skeptic, /v1/trace, /v1/audit
  - Metered endpoints with quota enforcement
  - Ed25519 signature verification
- stemedb-lens: Truth resolution lenses
  - RecencyLens, ConsensusLens, ConfidenceLens
  - VoteAwareConsensusLens (Ballot Box pattern)
  - TrustAwareAuthorityLens (The Hive pattern)
  - SkepticLens (conflict analysis)
  - EpochAwareLens (paradigm-safe queries)
- stemedb-query: Query engine with materialized views

## Storage Extensions
- VoteStore: Vote aggregation with cached counts
- TrustRankStore: Agent reputation with decay
- AuditStore: Query audit trail
- IndexStore: SP/P/S index structures
- SupersessionStore: Epoch supersession chains

## SDKs
- sdk/go/steme: Go HTTP client with Ed25519 signing
- sdk/go/adk: ADK-Go tools for AI agents

## Documentation
- Updated CLAUDE.md, architecture.md, roadmap.md
- New ai-lookup entries for all services
- Use case docs for consumer health intelligence
- Arena roadmap for simulation advancement

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

# Materializer

**Last Updated:** 2026-01-31
**Confidence:** High
**Status:** Implemented

## Summary

The Materializer is a background worker that pre-computes winning assertions for each subject+predicate pair, storing them at `MV:{subject}:{predicate}` for O(1) reads. It bridges the gap between O(N) lens resolution and sub-millisecond query latency.

**Key Facts:**
- Scans all `SP:` compound indexes to discover pairs
- Resolves each pair through an `AsyncLens` (default: VoteAwareConsensus)
- Stores `MaterializedView` with winner + provenance metadata
- `step()` for one-shot, `run()` for polling loop, `run_notified()` for event-driven mode
- Event-driven: IngestWorker signals `tokio::sync::Notify` on new data; Materializer reacts immediately
- QueryEngine uses fast-path: checks `MV:` key before falling back to `SP:` index
- Error-resilient: individual pair failures are logged and skipped

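The "logged and skipped" behaviour in the last bullet can be sketched as a single pass over the discovered pairs. This is a minimal, synchronous illustration and not the code in `materializer.rs`: `PassReport`, `run_pass`, and the `resolve` callback are hypothetical names, the winner is reduced to a `String`, and a `HashMap` stands in for the KV store.

```rust
use std::collections::HashMap;

/// Hypothetical summary of one pass. The fields of the real
/// `MaterializeReport` are not shown in this document.
#[derive(Debug, Default, PartialEq)]
pub struct PassReport {
    pub materialized: usize,
    pub skipped: usize,
    pub failed: usize,
}

/// Runs one pass over `pairs`. A failure on one pair is logged and
/// counted, and the loop moves on to the next pair.
pub fn run_pass<F>(
    pairs: &[(String, String)],
    resolve: F,
    views: &mut HashMap<String, String>,
) -> PassReport
where
    F: Fn(&str, &str) -> Result<Option<String>, String>,
{
    let mut report = PassReport::default();
    for (subject, predicate) in pairs {
        match resolve(subject, predicate) {
            // A winner was found: store it under the MV: key.
            Ok(Some(winner)) => {
                views.insert(format!("MV:{subject}:{predicate}"), winner);
                report.materialized += 1;
            }
            // No candidates for this pair: nothing to store.
            Ok(None) => report.skipped += 1,
            // Resolution failed: log it and keep going.
            Err(err) => {
                eprintln!("materialize {subject}/{predicate} failed: {err}");
                report.failed += 1;
            }
        }
    }
    report
}
```

Returning counts rather than the first error means one bad pair cannot block the views for every other pair.
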
**File Pointers:**
- `crates/stemedb-query/src/materializer.rs` - Materializer worker
- `crates/stemedb-core/src/types.rs` - MaterializedView type

## Storage Layout

| Key Pattern | Value | Purpose |
|-------------|-------|---------|
| `MV:{subject}:{predicate}` | Serialized `MaterializedView` | Pre-computed winner + metadata |

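As a rough illustration of the key patterns, the helpers below build `MV:` and `SP:` keys and recover a pair from an `SP:` key. The function names are hypothetical. The sketch also assumes the raw subject and predicate are joined with `:` without escaping and that the predicate contains no `:`; the real encoding may differ.

```rust
/// Builds the key under which a materialized view is stored.
/// Assumption: components are concatenated as-is, with no escaping.
pub fn mv_key(subject: &str, predicate: &str) -> String {
    format!("MV:{subject}:{predicate}")
}

/// Builds the compound-index key that the Materializer scans.
pub fn sp_key(subject: &str, predicate: &str) -> String {
    format!("SP:{subject}:{predicate}")
}

/// Recovers (subject, predicate) from an `SP:` key by splitting on the
/// last ':'. Returns None for keys that do not start with `SP:`.
/// Assumption: the predicate itself contains no ':'.
pub fn parse_sp_key(key: &str) -> Option<(&str, &str)> {
    let rest = key.strip_prefix("SP:")?;
    rest.rsplit_once(':')
}
```
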
## MaterializedView Type

```rust
pub struct MaterializedView {
    pub winner: Assertion,           // The resolved winner
    pub lens_name: String,           // Which lens produced this (e.g., "VoteAwareConsensus")
    pub resolution_confidence: f32,  // Confidence in the resolution [0.0, 1.0]
    pub candidates_count: usize,     // How many candidates were considered
    pub materialized_at: u64,        // When this view was last computed
}
```

## Materializer Interface

```rust
pub struct Materializer<S> {
    store: Arc<S>,
    index_store: GenericIndexStore<Arc<S>>,
    lens: Box<dyn AsyncLens>,
}

impl<S: KVStore + 'static> Materializer<S> {
    /// Create with any AsyncLens implementation
    pub fn new(store: Arc<S>, lens: Box<dyn AsyncLens>) -> Self;

    /// One full materialization pass over all SP: pairs
    pub async fn step(&self) -> Result<MaterializeReport>;

    /// Materialize a single subject+predicate pair
    pub async fn materialize_pair(&self, subject: &str, predicate: &str)
        -> Result<Option<MaterializedView>>;

    /// Read a pre-computed view (O(1))
    pub async fn get_materialized_view(&self, subject: &str, predicate: &str)
        -> Result<Option<MaterializedView>>;

    /// Run continuously with configurable interval (polling mode)
    pub async fn run(&self, interval: Duration);

    /// Run in event-driven mode, triggered by IngestWorker notifications
    pub async fn run_notified(&self, notify: Arc<Notify>, max_interval: Duration);
}
```

## Read Path Integration

The `QueryEngine` automatically uses the fast path when both subject and predicate are specified.

**Fast Path (O(1)):**
```
QueryEngine::execute() -> MV:{subject}:{predicate} -> MaterializedView.winner -> QueryResult
```

**Slow Path (O(N)):**
```
QueryEngine::execute() -> SP:{subject}:{predicate} -> [H:{hash}...] -> candidates -> filter -> QueryResult
```

The fast path is used when a materialized view exists and the winner matches query filters (lifecycle, epoch). The slow path is the fallback when no MV exists, the winner doesn't match filters, or only a subject is specified.

**File:** `crates/stemedb-query/src/engine.rs` - `try_fast_path()` method

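The fallback rule described above can be sketched with in-memory maps. This is an illustration under stated assumptions, not the body of `try_fast_path()`: `Store`, `Route`, and `execute` are hypothetical stand-ins, `Assertion` is reduced to two fields, and only an epoch filter is modelled (the lifecycle filter is omitted).

```rust
use std::collections::HashMap;

/// Simplified stand-in for the real `Assertion` type.
#[derive(Clone, Debug, PartialEq)]
pub struct Assertion {
    pub value: String,
    pub epoch: u64,
}

/// Hypothetical in-memory stand-in for the KV store.
pub struct Store {
    pub views: HashMap<String, Assertion>,      // MV: key -> winner
    pub index: HashMap<String, Vec<Assertion>>, // SP: key -> candidates
}

/// Which path answered the query.
#[derive(Debug, PartialEq)]
pub enum Route {
    Fast,
    Slow,
}

/// Tries the MV: key first, then falls back to scanning SP: candidates.
pub fn execute(
    store: &Store,
    subject: &str,
    predicate: &str,
    epoch: Option<u64>,
) -> (Route, Vec<Assertion>) {
    // With no epoch filter, every assertion matches.
    let matches = |a: &Assertion| epoch.map_or(true, |e| a.epoch == e);

    // Fast path: one lookup, used only if the winner passes the filter.
    if let Some(winner) = store.views.get(&format!("MV:{subject}:{predicate}")) {
        if matches(winner) {
            return (Route::Fast, vec![winner.clone()]);
        }
    }

    // Slow path: load every candidate for the pair and filter them.
    let candidates: Vec<Assertion> = store
        .index
        .get(&format!("SP:{subject}:{predicate}"))
        .map(|c| c.iter().filter(|&a| matches(a)).cloned().collect())
        .unwrap_or_default();
    (Route::Slow, candidates)
}
```

Checking the filter before returning the cached winner is what keeps the fast path safe: a view that does not satisfy the query is ignored, not returned as a wrong answer.
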
## Event-Driven Mode

The Materializer supports two operating modes:

**Polling mode** (`run(interval)`): Fixed-interval passes. Simple, but it wastes cycles when idle and a new write can wait up to one full `interval` before it is materialized.

**Event-driven mode** (`run_notified(notify, max_interval)`): The IngestWorker signals a `tokio::sync::Notify` after each successful record ingestion. The Materializer awaits this signal and runs a pass immediately when new data arrives. A `max_interval` timeout acts as a safety net for missed notifications.

```
IngestWorker::step() -> notify.notify_one() -> Materializer::run_notified() wakes -> step()
```

**Wiring:**
```rust
let notify = Arc::new(tokio::sync::Notify::new());
let worker = IngestWorker::new(journal, store.clone()).await?.with_notify(Arc::clone(&notify));
let materializer = Materializer::new(store, Box::new(lens));
// In separate tasks:
tokio::spawn(async move { worker.run().await });
tokio::spawn(async move { materializer.run_notified(notify, Duration::from_secs(30)).await });
```

**File:** `crates/stemedb-ingest/src/worker.rs` - `with_notify()` method

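The wake-or-timeout loop behind event-driven mode can be illustrated with the standard library alone. The sketch below swaps `tokio::sync::Notify` for a `std::sync::mpsc` channel so that it runs without an async runtime; `Wake`, `wait_for_wake`, and `run_notified_sketch` are hypothetical names, and the real `run_notified()` may be structured differently.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Why the loop woke up.
#[derive(Debug, PartialEq)]
pub enum Wake {
    Notified,
    Timeout,
    Shutdown,
}

/// Blocks until a signal arrives or `max_interval` elapses.
pub fn wait_for_wake(rx: &Receiver<()>, max_interval: Duration) -> Wake {
    match rx.recv_timeout(max_interval) {
        Ok(()) => {
            // Drain queued signals so a burst of writes triggers one pass.
            // `Notify::notify_one` stores at most one permit, which has a
            // similar coalescing effect.
            while rx.try_recv().is_ok() {}
            Wake::Notified
        }
        // Safety net: run a pass even if no signal was seen.
        Err(RecvTimeoutError::Timeout) => Wake::Timeout,
        // All senders are gone: the ingest side has shut down.
        Err(RecvTimeoutError::Disconnected) => Wake::Shutdown,
    }
}

/// Runs `step` on every wake-up until the channel closes.
/// Returns the number of passes executed.
pub fn run_notified_sketch(
    rx: Receiver<()>,
    max_interval: Duration,
    mut step: impl FnMut(),
) -> usize {
    let mut passes = 0;
    loop {
        match wait_for_wake(&rx, max_interval) {
            Wake::Shutdown => return passes,
            Wake::Notified | Wake::Timeout => {
                step();
                passes += 1;
            }
        }
    }
}
```

Coalescing matters because each pass already covers all `SP:` pairs, so running one pass per ingested record would repeat the same work.
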
## Design Rationale

### Why a Background Worker?

Inline materialization (on every write) would:
1. Add latency to the write path
2. Create contention when many agents write simultaneously
3. Couple write and read concerns

The background worker approach:
1. Keeps the write path fast (append-only)
2. Batches resolution work efficiently
3. Tolerates temporary staleness (eventual consistency)

### Why Store Metadata?

The `MaterializedView` includes `lens_name`, `resolution_confidence`, `candidates_count`, and `materialized_at` because:
1. **Provenance:** Agents can verify how truth was determined
2. **Debugging:** "Why does the system think Tesla's revenue is X?"
3. **Staleness detection:** Readers can check `materialized_at` to decide if a slow-path re-resolution is needed

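A reader-side staleness check along the lines of point 3 might look like the sketch below. It assumes `materialized_at` holds seconds since the Unix epoch, which this document does not state, and `is_stale` and `unix_now` are hypothetical helpers, not part of the crate.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Returns true when the view is older than `max_age`.
/// Assumption: both timestamps are seconds since the Unix epoch.
pub fn is_stale(materialized_at: u64, now: u64, max_age: Duration) -> bool {
    // saturating_sub treats a view stamped in the future (clock skew)
    // as fresh and avoids integer underflow.
    now.saturating_sub(materialized_at) > max_age.as_secs()
}

/// Current time in seconds since the Unix epoch (0 if the clock is
/// set before the epoch).
pub fn unix_now() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}
```

A caller would compare `view.materialized_at` against `unix_now()` and re-resolve through the slow path when `is_stale` returns true.
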
## Related Topics

- [Ballot Box](./ballot-box.md) - Vote data consumed by the Materializer
- [Storage](./storage.md) - KV layout and key patterns
- [Architecture](../../architecture.md) - Section 3 (Write Path) and Section 4 (Read Path)