Phase 1 delivers the complete durability and storage layer:
- WAL with crash recovery: Append-only journal with BLAKE3 checksums,
fsync guarantees, and proper seek-to-EOF on reopen
- Storage engine: sled-backed KVStore with scan_prefix for range queries
- Content-addressed storage: H:{hash}, V:{hash}, E:{hash} key patterns
- Ingestor: Background worker tailing WAL, writing to KV with 8-byte
aligned record headers for rkyv zero-copy deserialization
- Comprehensive tests: 31 tests covering crash recovery, round-trips,
and multi-cycle durability
New crates: stemedb-wal, stemedb-storage, stemedb-ingest
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Ingestor Service
Crate: stemedb-ingest
Status: Implemented (Phase 1)
Purpose
The Ingestor is the background worker that bridges the Write-Ahead Log (WAL) to the KV storage engine. It continuously tails the WAL and persists records to sled using content-addressed keys.
Architecture
[WAL Journal] ---> [IngestWorker] ---> [KVStore (sled)]
|
v
[Subject Index]
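The tailing behaviour in the diagram can be illustrated with in-memory stand-ins. This is a minimal sketch, not the crate's implementation: `MemJournal`, `MemIngestor`, and the `key_of` callback are hypothetical names, a `BTreeMap` stands in for sled, and the caller-supplied `key_of` stands in for BLAKE3 content addressing.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for the WAL: an append-only list of raw records.
#[derive(Default)]
pub struct MemJournal {
    records: Vec<Vec<u8>>,
}

impl MemJournal {
    pub fn append(&mut self, record: Vec<u8>) {
        self.records.push(record);
    }
}

/// Hypothetical stand-in worker: remembers how many records it has already ingested.
#[derive(Default)]
pub struct MemIngestor {
    cursor: usize,
    /// Stand-in for the KV store (sled in the real system).
    pub store: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl MemIngestor {
    /// Ingest every record appended since the previous call.
    /// `key_of` is an assumed hook that derives the storage key from the record.
    /// Returns the number of records consumed from the journal.
    pub fn poll<F>(&mut self, journal: &MemJournal, key_of: F) -> usize
    where
        F: Fn(&[u8]) -> Vec<u8>,
    {
        let fresh = &journal.records[self.cursor..];
        for record in fresh {
            // Keys derived from content make re-ingesting identical bytes a no-op overwrite.
            self.store.insert(key_of(record), record.clone());
        }
        self.cursor += fresh.len();
        fresh.len()
    }
}
```

Because the key is derived from the record content, appending the same bytes twice consumes two journal entries but leaves a single entry in the store.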
Key Components
RecordType
Discriminator for WAL payloads (8-byte aligned header):
- Assertion = 0 - Knowledge claims
- Vote = 1 - Consensus votes
- Epoch = 2 - Paradigm definitions
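A discriminator like this is commonly modelled as a `#[repr(u8)]` enum with a checked conversion from the raw tag byte. The sketch below uses the three values listed above; the derive list and the `TryFrom` error type are assumptions, not taken from the crate.

```rust
use std::convert::TryFrom;

/// Sketch of the record discriminator; values match the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum RecordType {
    Assertion = 0,
    Vote = 1,
    Epoch = 2,
}

impl TryFrom<u8> for RecordType {
    /// Assumed error type: the unrecognised tag byte is handed back to the caller.
    type Error = u8;

    fn try_from(tag: u8) -> Result<Self, Self::Error> {
        match tag {
            0 => Ok(RecordType::Assertion),
            1 => Ok(RecordType::Vote),
            2 => Ok(RecordType::Epoch),
            other => Err(other),
        }
    }
}
```

Rejecting unknown tags explicitly lets a reader skip or report a corrupt record rather than misinterpret its payload.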
Storage Layout
| Key Pattern | Value | Description |
|---|---|---|
| H:{blake3_hash} | Serialized Assertion | Content-addressed assertion store |
| V:{assertion_hash}:{vote_hash} | Serialized Vote | Votes on assertions |
| E:{epoch_id_hex} | Serialized Epoch | Epoch definitions |
| S:{subject} | BLAKE3 hash bytes | Subject adjacency index |
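The key patterns can be built with small helpers like the ones below. This is a sketch under stated assumptions: the function names are hypothetical, hashes are taken as precomputed 32-byte values, and lowercase hex is assumed for every hash component (the table only states hex for epoch IDs).

```rust
/// Lowercase hex encoding. Assumed; the source does not state how hashes are encoded in keys.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// `H:{blake3_hash}`: content-addressed assertion key.
pub fn assertion_key(hash: &[u8; 32]) -> String {
    format!("H:{}", to_hex(hash))
}

/// `V:{assertion_hash}:` prefix, usable with a prefix scan to list all votes on one assertion.
pub fn vote_prefix(assertion_hash: &[u8; 32]) -> String {
    format!("V:{}:", to_hex(assertion_hash))
}

/// `V:{assertion_hash}:{vote_hash}`: key for a single vote.
pub fn vote_key(assertion_hash: &[u8; 32], vote_hash: &[u8; 32]) -> String {
    format!("{}{}", vote_prefix(assertion_hash), to_hex(vote_hash))
}

/// `E:{epoch_id_hex}`: epoch definition key.
pub fn epoch_key(epoch_id: &[u8]) -> String {
    format!("E:{}", to_hex(epoch_id))
}

/// `S:{subject}`: subject index key.
pub fn subject_key(subject: &str) -> String {
    format!("S:{}", subject)
}
```

Putting the assertion hash first in the vote key means every vote on one assertion shares a common prefix, which is what makes a `scan_prefix` lookup over that prefix work.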
Usage
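use std::sync::Arc;
use tokio::sync::Mutex; // assumed: `.lock().await` below implies an async mutex such as Tokio's
use stemedb_ingest::{Ingestor, serialize_assertion};
use stemedb_wal::Journal;
use stemedb_storage::SledStore;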
// Create components
let journal = Arc::new(Mutex::new(Journal::open("./wal")?));
let store = Arc::new(SledStore::open("./db")?);
// Create and start ingestor
let mut ingestor = Ingestor::new(journal.clone(), store);
ingestor.start(); // Spawns background task
// Write to WAL (records will be ingested automatically)
let assertion = Assertion { ... };
let payload = serialize_assertion(&assertion)?;
journal.lock().await.append(payload)?;
Serialization
Records are serialized with an 8-byte header to maintain rkyv alignment:
[type: u8][padding: 7 bytes][rkyv payload...]
Helper functions:
- serialize_assertion(&Assertion) -> Result<Vec<u8>>
- serialize_vote(&Vote) -> Result<Vec<u8>>
- serialize_epoch(&Epoch) -> Result<Vec<u8>>
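The header layout above can be sketched as a pair of framing helpers. `frame_record`, `split_record`, and `HEADER_LEN` are hypothetical names for illustration; the payload is treated as opaque bytes, so no rkyv calls are shown.

```rust
/// Header size from the layout above: 1 type byte + 7 padding bytes.
pub const HEADER_LEN: usize = 8;

/// Prepend the 8-byte header to an already-serialized payload.
pub fn frame_record(record_type: u8, payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(HEADER_LEN + payload.len());
    buf.push(record_type);
    // Zero padding keeps the payload at an offset that is a multiple of 8.
    buf.extend_from_slice(&[0u8; HEADER_LEN - 1]);
    buf.extend_from_slice(payload);
    buf
}

/// Split a framed record into its type byte and payload slice.
/// Returns `None` if the record is shorter than the header.
pub fn split_record(record: &[u8]) -> Option<(u8, &[u8])> {
    if record.len() < HEADER_LEN {
        return None;
    }
    Some((record[0], &record[HEADER_LEN..]))
}
```

The padding only guarantees that the payload offset within the record is a multiple of 8. Whether the payload is actually 8-byte aligned in memory also depends on the alignment of the buffer that holds the record.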
Testing
The ingestor has integration tests covering:
- Single assertion ingestion
- Vote ingestion
- Epoch ingestion
- Multiple record processing
- Subject index creation
Related
- Storage Service - KVStore trait and SledStore
- Content Addressing - BLAKE3 hashing