# Storage

**Last Updated:** 2026-02-19
**Confidence:** High

## Summary

Episteme uses a Log-Structured, Content-Addressed storage model. Writes are appended to the WAL and then indexed asynchronously. Reads query the indexes and apply Lenses.

**Key Facts:**
- Append-only (never mutate)
- WAL for durability (fsync on write)
- KV store: HybridStore (fjall for writes, redb for reads)
- Content-addressed by BLAKE3 hash

**File Pointers:**
- `crates/stemedb-storage/src/traits.rs` - KVStore trait
- `crates/stemedb-storage/src/key_codec.rs` - Centralized key encoding (40+ builders, subject validation, extraction)
- `crates/stemedb-storage/src/hybrid_backend.rs` - HybridStore (routes to fjall or redb)
- `crates/stemedb-storage/src/fjall_backend.rs` - FjallStore (write-heavy keys)
- `crates/stemedb-storage/src/redb_backend.rs` - RedbStore (read-heavy keys)
- `crates/stemedb-storage/src/serde_helpers.rs` - Storage-layer serialize/deserialize helpers
- `crates/stemedb-storage/src/vote_store.rs` - VoteStore (Ballot Box)
- `crates/stemedb-storage/src/index_store.rs` - IndexStore (S: and SP: indexes)
- `crates/stemedb-storage/src/trust_rank_store.rs` - TrustRankStore (TR:)

## KV Layout

All keys are built by the centralized `key_codec` module (`crates/stemedb-storage/src/key_codec.rs`). Subject-scoped keys use a `{subject}\x00` prefix so that all keys for a subject are co-located; global keys use a bare `\x00` prefix so that they sort first.

### Subject-Prefixed Keys (co-located per subject)

| Key Pattern | Value | Purpose |
|-------------|-------|---------|
| `{subject}\x00H:{hash}` | `Assertion` (serialized) | Main content store |
| `{subject}\x00S:{hash_list}` | `Vec<Hash>` (rkyv) | Subject index (IndexStore) |
| `{subject}\x00SP:{predicate}` | `Vec<Hash>` (rkyv) | Compound index (IndexStore) |
| `{subject}\x00MV:{predicate}` | `MaterializedView` (rkyv) | Pre-computed winner (Materializer) |
| `{subject}\x00V:{hash}:{vh}` | `Vote` (serialized) | Ballot Box votes |
| `{subject}\x00VC:{hash}` | `u64` (LE bytes) | Vote count cache |
| `{subject}\x00VW:{hash}` | `f32` (LE bytes) | Aggregate weight cache |
| `{subject}\x00GS:{predicate}` | `GoldStandard` (rkyv) | Gold standard entries |
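
The `VC:` and `VW:` rows store fixed-width little-endian numbers. A minimal sketch of that encoding follows, assuming the stored value is exactly the 8 (or 4) raw bytes with no header. The function names are hypothetical and are not taken from `vote_store.rs`.

```rust
/// Encode a vote count as 8 little-endian bytes (the `VC:` value shape).
fn encode_vote_count(count: u64) -> [u8; 8] {
    count.to_le_bytes()
}

/// Decode a vote count; returns `None` if the value is not exactly 8 bytes.
fn decode_vote_count(bytes: &[u8]) -> Option<u64> {
    if bytes.len() != 8 {
        return None;
    }
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    Some(u64::from_le_bytes(buf))
}

/// Encode an aggregate weight as 4 little-endian bytes (the `VW:` value shape).
fn encode_weight(weight: f32) -> [u8; 4] {
    weight.to_le_bytes()
}

/// Decode an aggregate weight; returns `None` if the value is not exactly 4 bytes.
fn decode_weight(bytes: &[u8]) -> Option<f32> {
    if bytes.len() != 4 {
        return None;
    }
    let mut buf = [0u8; 4];
    buf.copy_from_slice(bytes);
    Some(f32::from_le_bytes(buf))
}
```

For example, `decode_vote_count(&encode_vote_count(42))` yields `Some(42)`, and a value of the wrong length decodes to `None` rather than panicking.
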

### Global Keys (sort first via `\x00` prefix)

| Key Pattern | Value | Purpose |
|-------------|-------|---------|
| `\x00TRUST:{agent_id}` | `TrustRank` (rkyv) | Agent reputation (TrustRankStore) |
| `\x00QUOTA:{agent_id}:{window}` | Quota record | Per-agent per-window quota |
| `\x00QLIMIT:{agent_id}` | Quota limit | Per-agent quota limit |
| `\x00E:{epoch_id}` | `Epoch` (serialized) | Paradigm definitions |
| `\x00SUPERSEDED:{epoch_id}` | Supersession marker | O(1) epoch supersession lookup |
| `\x00SUP:{hash}` | Supersession record | Supersession data |
| `\x00AUD:{query_id}` | `QueryAudit` (rkyv) | Query audit trail |
| `\x00ESC:{ts}:{id}` | `EscalationEvent` (rkyv) | Escalation events |
| `\x00TP:{pack_id}` | `TrustPack` (rkyv) | Trust packs |
| `\x00META:{key}` | Varies | System metadata (e.g., cursor) |
| `\x00HASH_SUBJECT:{hash}` | Subject string | Reverse lookup: hash → subject |
| `\x00SUBJECTS:{subject}` | Marker | Known subjects index |
| `\x00GS_LIST:{subj}:{pred}` | Listing data | Gold standard listing |
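
As an illustration of the layout in both tables, the sketch below builds keys by hand and uses a `BTreeMap` as a stand-in for an ordered KV store. `subject_key`, `global_key`, and `scan_subject` are hypothetical names (the real builders live in `key_codec.rs`), and the rule that a subject must be non-empty and free of NUL bytes is an assumption about what subject validation enforces.

```rust
use std::collections::BTreeMap;

/// Build `{subject}\x00{tag}:{suffix}`. Hypothetical helper, not the real key_codec API.
fn subject_key(subject: &str, tag: &str, suffix: &[u8]) -> Option<Vec<u8>> {
    // Assumed validation: an empty subject or an embedded NUL would collide
    // with the separator or with the global keyspace.
    if subject.is_empty() || subject.as_bytes().contains(&0) {
        return None;
    }
    let mut key = Vec::with_capacity(subject.len() + tag.len() + suffix.len() + 2);
    key.extend_from_slice(subject.as_bytes());
    key.push(0); // separator between subject and the rest of the key
    key.extend_from_slice(tag.as_bytes());
    key.push(b':');
    key.extend_from_slice(suffix);
    Some(key)
}

/// Build `\x00{tag}:{suffix}`. The leading NUL byte makes global keys sort first.
fn global_key(tag: &str, suffix: &[u8]) -> Vec<u8> {
    let mut key = vec![0u8];
    key.extend_from_slice(tag.as_bytes());
    key.push(b':');
    key.extend_from_slice(suffix);
    key
}

/// Return every key stored under one subject, using an ordered prefix scan.
fn scan_subject(kv: &BTreeMap<Vec<u8>, Vec<u8>>, subject: &str) -> Vec<Vec<u8>> {
    let mut prefix = subject.as_bytes().to_vec();
    prefix.push(0);
    kv.range(prefix.clone()..)
        .take_while(|(k, _)| k.starts_with(&prefix))
        .map(|(k, _)| k.clone())
        .collect()
}
```

Because `\x00` is the smallest byte value, a global key sorts ahead of every subject key, and the `\x00` separator keeps a scan for `earth` from picking up keys that belong to `earthquake`.
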

## Serialization

### stemedb-core (shared types)

For core types, use the canonical module:

```rust
use stemedb_core::serde::{serialize, deserialize};

let bytes = serialize(&my_value)?;
let value: MyType = deserialize(&bytes)?;
```

**File:** `crates/stemedb-core/src/serde.rs`

Raw `AllocSerializer` usage is prohibited in production code (enforced via CLAUDE.md).

### stemedb-storage (store implementations)

In storage modules, use the storage-layer helpers that map to `StorageError`:

```rust
use crate::serde_helpers::{serialize, deserialize};

let bytes = serialize(&my_value)?; // Returns Result<Vec<u8>, StorageError>
let value: MyType = deserialize(&bytes)?;
```

**File:** `crates/stemedb-storage/src/serde_helpers.rs`

This provides unified error handling across all store implementations (VoteStore, IndexStore, TrustRankStore, AuditStore, TrustPackStore, QuotaStore).

For types with schema evolution (rkyv compat), use the dedicated compat functions:

```rust
use crate::serde_helpers::deserialize_source_record_compat;

let record: SourceRecord = deserialize_source_record_compat(&bytes)?;
```

Available compat deserializers: `deserialize_source_record_compat` (SourceRecord). For assertions, use `stemedb_core::serde::deserialize_assertion_compat` directly.
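
The general shape of a compat deserializer can be sketched without rkyv: try the current layout first, then fall back to the legacy layout and upgrade the result. Everything below is an assumption made for illustration. The byte format is a toy and not rkyv's, the `name` field and the `0xFF` tag are invented, and the sketch returns `Option` where the real function returns a `Result` with `StorageError`. Only `content: Option<String>` and the `SourceRecord` / `LegacySourceRecord` type names come from the project.

```rust
#[derive(Debug, PartialEq)]
struct SourceRecord {
    name: String,            // hypothetical field
    content: Option<String>, // the field added by schema evolution
}

struct LegacySourceRecord {
    name: String, // hypothetical field
}

impl From<LegacySourceRecord> for SourceRecord {
    fn from(old: LegacySourceRecord) -> Self {
        // Legacy records never carried content, so the upgrade fills in `None`.
        SourceRecord { name: old.name, content: None }
    }
}

const CURRENT_TAG: u8 = 0xFF; // toy marker for the current layout

/// Read a `u32` LE length followed by that many UTF-8 bytes, advancing `pos`.
fn read_str(bytes: &[u8], pos: &mut usize) -> Option<String> {
    let start = *pos;
    let len_end = start.checked_add(4)?;
    let l = bytes.get(start..len_end)?;
    let len = u32::from_le_bytes([l[0], l[1], l[2], l[3]]) as usize;
    let end = len_end.checked_add(len)?;
    let s = std::str::from_utf8(bytes.get(len_end..end)?).ok()?.to_owned();
    *pos = end;
    Some(s)
}

/// Current toy layout: tag, name, then a 0/1 flag and optional content.
fn decode_current(bytes: &[u8]) -> Option<SourceRecord> {
    if *bytes.first()? != CURRENT_TAG {
        return None;
    }
    let mut pos = 1;
    let name = read_str(bytes, &mut pos)?;
    let content = match *bytes.get(pos)? {
        0 => { pos += 1; None }
        1 => { pos += 1; Some(read_str(bytes, &mut pos)?) }
        _ => return None,
    };
    if pos == bytes.len() { Some(SourceRecord { name, content }) } else { None }
}

/// Legacy toy layout: just the name, with no tag and no content.
fn decode_legacy(bytes: &[u8]) -> Option<LegacySourceRecord> {
    let mut pos = 0;
    let name = read_str(bytes, &mut pos)?;
    if pos == bytes.len() { Some(LegacySourceRecord { name }) } else { None }
}

/// Try the current layout, then fall back to legacy and upgrade.
fn deserialize_source_record_compat_sketch(bytes: &[u8]) -> Option<SourceRecord> {
    decode_current(bytes).or_else(|| decode_legacy(bytes).map(SourceRecord::from))
}
```

The point of routing every read through one function like this is that callers always receive the current type, so records written before the schema change need no migration pass.
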

## Write Path

```
1. Agent submits signed Assertion
2. Validate signature
3. Append to WAL (fsync)
4. Return 202 Accepted with Hash
5. Background: tail WAL -> update indexes
```
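
Steps 2 to 4 can be sketched with the standard library alone. This is a sketch under stated assumptions, not the actual WAL: the `Wal` type, the length-prefixed framing, and the `submit` function are hypothetical, and signature verification and BLAKE3 hashing are passed in as closures because both require external crates.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical append-only log file.
struct Wal {
    file: File,
}

impl Wal {
    fn open(path: &Path) -> io::Result<Self> {
        // `append(true)` means every write lands at the end of the file.
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Wal { file })
    }

    /// Append one record as `u32 LE length + payload`, then fsync.
    fn append(&mut self, payload: &[u8]) -> io::Result<()> {
        if payload.len() > u32::MAX as usize {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "record too large"));
        }
        let mut frame = Vec::with_capacity(4 + payload.len());
        frame.extend_from_slice(&(payload.len() as u32).to_le_bytes());
        frame.extend_from_slice(payload);
        self.file.write_all(&frame)?;
        // Flush to disk before acknowledging, matching "fsync on write".
        self.file.sync_all()
    }
}

#[derive(Debug)]
enum SubmitError {
    BadSignature,
    Io(io::Error),
}

/// Validate, append durably, then acknowledge with status 202 and the content hash.
fn submit<V, H>(
    wal: &mut Wal,
    payload: &[u8],
    verify: V,
    hash: H,
) -> Result<(u16, Vec<u8>), SubmitError>
where
    V: Fn(&[u8]) -> bool,
    H: Fn(&[u8]) -> Vec<u8>,
{
    if !verify(payload) {
        return Err(SubmitError::BadSignature); // nothing is written for a bad signature
    }
    wal.append(payload).map_err(SubmitError::Io)?;
    Ok((202, hash(payload)))
}
```

Indexing (step 5) is deliberately absent from `submit`: the caller gets its acknowledgement as soon as the record is durable, and a separate task would tail the log to update indexes.
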

## Read Path

```
1. Query: GET(Subject, Predicate, Lens)
2. Lookup: {subject}\x00SP:{predicate} -> [Hash...]
3. Hydrate: Load assertions from {subject}\x00H:{hash}
4. Resolve: Apply Lens
5. Return: Deterministic answer
```
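
The same five steps can be sketched in memory. The `Store` layout, the `Assertion` fields, and the `latest_wins` lens below are all hypothetical stand-ins chosen for illustration; in particular, two typed maps replace the serialized `SP:` and `H:` entries, and the real Lens semantics are not specified here.

```rust
use std::collections::BTreeMap;

type Hash = [u8; 32];

#[derive(Clone, Debug, PartialEq)]
struct Assertion {
    object: String, // hypothetical field
    timestamp: u64, // hypothetical field
}

struct Store {
    /// Stands in for `{subject}\x00SP:{predicate}` -> list of hashes.
    sp_index: BTreeMap<(String, String), Vec<Hash>>,
    /// Stands in for `{subject}\x00H:{hash}` -> assertion.
    assertions: BTreeMap<(String, Hash), Assertion>,
}

/// Lookup, hydrate, then resolve with the supplied lens.
fn get<L>(store: &Store, subject: &str, predicate: &str, lens: L) -> Option<Assertion>
where
    L: Fn(&[Assertion]) -> Option<Assertion>,
{
    // Step 2: compound index lookup.
    let hashes = store
        .sp_index
        .get(&(subject.to_owned(), predicate.to_owned()))?;
    // Step 3: hydrate each hash, skipping any that are not stored yet.
    let candidates: Vec<Assertion> = hashes
        .iter()
        .filter_map(|h| store.assertions.get(&(subject.to_owned(), *h)).cloned())
        .collect();
    // Steps 4 and 5: the lens picks the answer.
    lens(&candidates)
}

/// Example lens: newest timestamp wins; ties fall back to comparing `object`,
/// so the same candidates always produce the same answer.
fn latest_wins(candidates: &[Assertion]) -> Option<Assertion> {
    candidates
        .iter()
        .max_by(|a, b| {
            a.timestamp
                .cmp(&b.timestamp)
                .then_with(|| a.object.cmp(&b.object))
        })
        .cloned()
}
```

Passing the lens as a parameter mirrors the query shape `GET(Subject, Predicate, Lens)`: the stored data stays the same, and the resolution rule is chosen per query.
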

## Related Topics

- [Assertion](./assertion.md)
- [Ballot Box](./ballot-box.md) - High-velocity vote storage
- [Architecture](../../../architecture.md)