The Ballot Box
Last Updated: 2026-01-31
Confidence: High
Status: Implemented
Summary
The Ballot Box is Episteme's high-velocity vote ingestion system. It separates votes from assertions to enable thousands of agents to vote simultaneously without lock contention.
Key Facts:
- Votes are append-only (immutable)
- Content-addressed by BLAKE3 hash
- O(1) vote counts via cached counters
- O(1) aggregate weights for Materializer
- Decoupled from Assertion mutations
File Pointer: crates/stemedb-storage/src/vote_store.rs
Storage Layout
| Key Pattern | Value | Purpose |
|---|---|---|
| V:{assertion_hash}:{vote_hash} | Serialized Vote | Individual votes |
| VC:{assertion_hash} | u64 (LE bytes) | Vote count cache |
| VW:{assertion_hash} | f32 (LE bytes) | Aggregate weight cache |
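A minimal sketch of how keys and cache values in this layout could be built. The hex encoding of hashes, the `Hash` alias, and all helper names here are assumptions made for illustration; the actual key codec may encode hashes differently (for example as raw bytes).

```rust
/// 32-byte content hash. The alias is an assumption for this sketch.
type Hash = [u8; 32];

/// Lowercase hex encoding (assumed; the real key codec may differ).
fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Key for an individual vote: V:{assertion_hash}:{vote_hash}
fn vote_key(assertion: &Hash, vote: &Hash) -> String {
    format!("V:{}:{}", hex(assertion), hex(vote))
}

/// Key for the cached vote count: VC:{assertion_hash}
fn count_key(assertion: &Hash) -> String {
    format!("VC:{}", hex(assertion))
}

/// Key for the cached aggregate weight: VW:{assertion_hash}
fn weight_key(assertion: &Hash) -> String {
    format!("VW:{}", hex(assertion))
}

/// Cache values are stored as fixed-width little-endian bytes.
fn encode_count(n: u64) -> [u8; 8] {
    n.to_le_bytes()
}

fn decode_count(b: [u8; 8]) -> u64 {
    u64::from_le_bytes(b)
}

fn encode_weight(w: f32) -> [u8; 4] {
    w.to_le_bytes()
}

fn decode_weight(b: [u8; 4]) -> f32 {
    f32::from_le_bytes(b)
}
```

Because the cache values are fixed width (8 bytes for the count, 4 for the weight), reading them back needs no deserialization step beyond `from_le_bytes`.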
VoteStore Trait
#[async_trait]
pub trait VoteStore: Send + Sync {
    /// Store a vote and return its content-addressed hash
    async fn put_vote(&self, vote: &Vote) -> Result<Hash>;

    /// Get a specific vote by hash
    async fn get_vote(&self, assertion_hash: &Hash, vote_hash: &Hash) -> Result<Option<Vote>>;

    /// Get all votes for an assertion (O(n))
    async fn get_votes_for_assertion(&self, assertion_hash: &Hash) -> Result<Vec<Vote>>;

    /// Get vote count (O(1) via cache)
    async fn get_vote_count(&self, assertion_hash: &Hash) -> Result<u64>;

    /// Get aggregate weight (O(1) via cache)
    async fn get_aggregate_weight(&self, assertion_hash: &Hash) -> Result<f32>;

    /// Check if any votes exist
    async fn has_votes(&self, assertion_hash: &Hash) -> Result<bool>;
}
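The O(n) cost of get_votes_for_assertion follows from the key layout: all votes for one assertion share the prefix V:{assertion_hash}:, so listing them is a prefix scan over an ordered keyspace. The sketch below illustrates that idea with a standard-library `BTreeMap` standing in for the real key-value store; `scan_prefix` is a hypothetical helper, not part of the VoteStore API.

```rust
use std::collections::BTreeMap;

/// Return every value whose key starts with `prefix`, in key order.
/// A BTreeMap stands in for the ordered key-value store here.
fn scan_prefix<'a>(store: &'a BTreeMap<String, Vec<u8>>, prefix: &str) -> Vec<&'a Vec<u8>> {
    store
        // Jump directly to the first key >= prefix.
        .range(prefix.to_string()..)
        // Stop as soon as a key no longer shares the prefix.
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(_, v)| v)
        .collect()
}
```

Including the trailing colon in the prefix matters: it keeps a scan for one assertion from picking up keys of another assertion whose hash merely starts with the same characters, and it never matches the VC: or VW: cache keys.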
Usage Example
use stemedb_storage::{HybridStore, GenericVoteStore, VoteStore};
use stemedb_core::types::Vote;

// Create vote store backed by HybridStore (fjall + redb)
let kv_store = HybridStore::open("./data")?;
let vote_store = GenericVoteStore::new(kv_store);

// High-velocity vote ingestion
// (sig_bytes and now are assumed to be supplied by the caller)
let vote = Vote {
    assertion_hash: [1u8; 32],
    agent_id: [2u8; 32],
    weight: 0.85,
    signature: sig_bytes,
    timestamp: now,
};
let vote_hash = vote_store.put_vote(&vote).await?;

// O(1) aggregation queries (for Materializer)
let count = vote_store.get_vote_count(&vote.assertion_hash).await?;
let total_weight = vote_store.get_aggregate_weight(&vote.assertion_hash).await?;
Design Rationale
Why Separate Votes from Assertions?
Traditional databases would store votes as a column or join table:
-- Naive approach: votes as assertion metadata
UPDATE assertions SET vote_count = vote_count + 1 WHERE hash = ?;
Problems:
- Lock contention when many agents vote on same assertion
- Lost history (can't see who voted when)
- Violates append-only semantics
Ballot Box Solution:
- Votes are separate, immutable records
- Each vote is content-addressed
- Caches enable O(1) aggregation
- Full audit trail preserved
Cache Update Strategy
When put_vote() is called:
- Serialize vote with rkyv
- Compute BLAKE3 hash (content address)
- Store at V:{assertion_hash}:{vote_hash}
- Increment VC:{assertion_hash} counter
- Add weight to VW:{assertion_hash} sum
The caches are updated atomically with the vote write, ensuring consistency.
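The sequence above can be sketched with an in-memory store. This is an illustration under stated assumptions, not the GenericVoteStore implementation: `MemVoteStore` and `Inner` are hypothetical names, a single mutex stands in for whatever atomic write mechanism the real store uses, the `Vote` struct is trimmed to three fields, and the standard library's `DefaultHasher` is swapped in for BLAKE3 (and skips rkyv serialization) so the example has no external dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

/// Trimmed-down vote for this sketch (the real Vote has more fields).
#[derive(Clone)]
struct Vote {
    assertion_hash: [u8; 32],
    agent_id: [u8; 32],
    weight: f32,
}

#[derive(Default)]
struct Inner {
    votes: BTreeMap<([u8; 32], u64), Vote>, // stands in for V:{assertion}:{vote}
    counts: BTreeMap<[u8; 32], u64>,        // stands in for VC:{assertion}
    weights: BTreeMap<[u8; 32], f32>,       // stands in for VW:{assertion}
}

/// Hypothetical in-memory store; one mutex makes each put atomic.
#[derive(Default)]
struct MemVoteStore {
    inner: Mutex<Inner>,
}

impl MemVoteStore {
    fn put_vote(&self, vote: &Vote) -> u64 {
        // Stand-in content address: DefaultHasher instead of BLAKE3.
        let mut h = DefaultHasher::new();
        vote.assertion_hash.hash(&mut h);
        vote.agent_id.hash(&mut h);
        vote.weight.to_bits().hash(&mut h);
        let vote_hash = h.finish();

        // Vote record and both caches change under the same lock,
        // so readers never observe a vote without its counters.
        // Assumption: every submission is counted, including resubmissions.
        let mut g = self.inner.lock().unwrap();
        g.votes.insert((vote.assertion_hash, vote_hash), vote.clone());
        *g.counts.entry(vote.assertion_hash).or_insert(0) += 1;
        *g.weights.entry(vote.assertion_hash).or_insert(0.0) += vote.weight;
        vote_hash
    }

    fn get_vote_count(&self, assertion: &[u8; 32]) -> u64 {
        self.inner.lock().unwrap().counts.get(assertion).copied().unwrap_or(0)
    }

    fn get_aggregate_weight(&self, assertion: &[u8; 32]) -> f32 {
        self.inner.lock().unwrap().weights.get(assertion).copied().unwrap_or(0.0)
    }
}
```

Reads of the count and weight are single map lookups, which is what makes the aggregation queries O(1) regardless of how many votes an assertion has received.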
Duplicate Vote Handling
The VoteStore does NOT prevent duplicate votes - it stores whatever is submitted. Duplicate detection is a higher-level concern (e.g., "one vote per agent per assertion") that should be enforced at the API layer.
This design choice keeps the storage layer simple and lets policy be defined elsewhere.
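As an illustration of where such a policy could live, here is a minimal sketch of a "one vote per agent per assertion" guard that an API layer might consult before calling put_vote(). `OneVotePerAgent` and `admit` are hypothetical names, and the in-memory set is an assumption; a real deployment would likely need this state to be persistent and shared across API nodes.

```rust
use std::collections::HashSet;

/// 32-byte identifier; the alias is an assumption for this sketch.
type Hash = [u8; 32];

/// Hypothetical API-layer policy: at most one vote per (assertion, agent).
#[derive(Default)]
struct OneVotePerAgent {
    seen: HashSet<(Hash, Hash)>,
}

impl OneVotePerAgent {
    /// Returns true if this is the agent's first vote on the assertion
    /// (the caller may proceed to store it), false if it is a duplicate.
    fn admit(&mut self, assertion_hash: Hash, agent_id: Hash) -> bool {
        // HashSet::insert returns false when the pair was already present.
        self.seen.insert((assertion_hash, agent_id))
    }
}
```

Keeping this check outside the storage layer means the policy can change (for example, allowing an agent to revise a vote) without touching the VoteStore.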
Integration with Materializer
The Materializer (Phase 2) will use the VoteStore to update Materialized Views:
// Materializer pseudocode
for assertion in new_assertions {
    let votes = vote_store.get_votes_for_assertion(&assertion.hash).await?;
    let weighted_score = calculate_consensus(votes, trustrank);
    if should_update_mv(weighted_score, current_mv) {
        store.put(
            format!("MV:{}:{}", assertion.subject, assertion.predicate),
            assertion.serialize()
        ).await?;
    }
}
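The pseudocode leaves calculate_consensus undefined. One possible shape, shown purely as a hypothetical sketch, is a trust-weighted mean of vote weights. The formula, the `trustrank` map type, the trimmed `Vote` struct, and the choice to give unknown agents zero trust are all assumptions and are not taken from the Episteme source.

```rust
use std::collections::HashMap;

/// 32-byte agent identifier; the alias is an assumption for this sketch.
type AgentId = [u8; 32];

/// Trimmed-down vote for this sketch (the real Vote has more fields).
struct Vote {
    agent_id: AgentId,
    weight: f32,
}

/// Hypothetical consensus score: mean of vote weights, each weighted by
/// the voting agent's trust. Agents missing from `trustrank` get trust 0.
fn calculate_consensus(votes: &[Vote], trustrank: &HashMap<AgentId, f32>) -> f32 {
    let mut weighted_sum = 0.0f32;
    let mut trust_total = 0.0f32;
    for v in votes {
        let trust = trustrank.get(&v.agent_id).copied().unwrap_or(0.0);
        weighted_sum += trust * v.weight;
        trust_total += trust;
    }
    // With no trusted voters there is no signal; return a neutral 0.0.
    if trust_total > 0.0 {
        weighted_sum / trust_total
    } else {
        0.0
    }
}
```

A score like this needs each vote's agent, so it relies on the O(n) get_votes_for_assertion call; a Materializer that only needs the unweighted total could read the cached aggregate weight instead.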