Implements the foundation of tidalDB's data pipeline:

**Phase 1 – Schema primitives**
- EntityId newtype (u64, big-endian ordering)
- SignalTypeDefinition with pre-computed decay λ, deduped/sorted windows
- SchemaBuilder with full constraint validation (duplicates, identifiers, half-life, windows, velocity)
- LumenError wrapping all subsystems with required From impls

**Phase 2 – Write-Ahead Log**
- Length-prefixed, BLAKE3-protected entry format
- Group-commit writer (batch up to 100 events / 10 ms)
- Double-buffered content-hash deduplication
- Checkpoint, truncation, and crash-recovery with full replay
- Integration, property, and UAT tests (incl. 5,500-event deterministic UAT)
- Proptest coverage scaled to 10,000 events/run (was ≤500) to meet acceptance criterion; cases reduced 100→10 to keep runtime comparable

**Phase 3 – Storage engine**
- StorageEngine trait (get/put/delete/scan/batch/flush)
- Key encoding: [EntityId][0x00][Tag][suffix] with ordering/prefix helpers
- InMemoryBackend (BTreeMap + RwLock)
- FjallStorage with three isolated keyspaces and atomic batch helper
- Property tests for key ordering and round-trip correctness

Also adds planning docs for phases 4-5, research docs, architecture overview, and roadmap updates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 02: FjallBackend
Context
Milestone: 1 -- Signal Engine
Phase: m1p3 -- Storage Engine Trait and fjall Backend
Status: COMPLETE
Depends On: Task 01 (StorageEngine trait, WriteBatch, StorageError)
Blocks: None (Task 03 is parallel, not sequential)
Complexity: M
Objective
Implement FjallBackend, the production storage engine backed by fjall 3's LSM-tree. Wrap it in FjallStorage which manages three keyspaces (one per EntityKind) and provides entity-kind routing. Implement FjallAtomicBatch for cross-keyspace atomic writes.
fjall was chosen (over RocksDB and sled) because it is pure Rust, supports #![forbid(unsafe_code)] at the tidalDB level (fjall uses unsafe internally but the API surface is safe), has fast compile times, and exposes the OwnedWriteBatch API needed for cross-keyspace atomicity.
Requirements
- FjallBackend wraps a single fjall::Keyspace and implements StorageEngine
- scan_prefix returns a PrefixIterator<'_> using fjall's range scan over the keyspace
- write_batch uses fjall's batch write API for atomicity within a keyspace
- FjallStorage owns a fjall::Database with three partitions: "items", "users", "creators"
- FjallStorage::backend(EntityKind) -> &FjallBackend routes to the correct partition
- Entity-kind isolation: writes to EntityKind::Item never collide with EntityKind::User for the same key
- FjallAtomicBatch enables cross-partition atomic writes via fjall::OwnedWriteBatch
- Data persists across close and reopen: write → flush_all() → drop → reopen → read succeeds
- MSRV: 1.91 (required for fjall 3)
Technical Design
Architecture
FjallStorage
├── items_backend: FjallBackend (fjall partition "items")
├── users_backend: FjallBackend (fjall partition "users")
└── creators_backend: FjallBackend (fjall partition "creators")
Each FjallBackend wraps one fjall partition. Entity data is isolated by partition (keyspace), not by key prefix. This means the same encoded key [entity_id][NUL][Tag] can exist in both "items" and "users" without collision, because each partition is its own namespace.
Within each partition, the subject-prefix key encoding enables efficient entity-scoped scans (scan_prefix(entity_prefix(id))).
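To make the isolation and prefix-scan behaviour concrete, here is a minimal sketch that models two partitions as in-memory BTreeMaps. It is not the fjall-backed implementation: the u64/u8 parameter types, the TAG_META constant, and the Partitions struct are assumptions made for illustration, and the real encode_key and entity_prefix helpers take EntityId and Tag values.

```rust
use std::collections::BTreeMap;

/// Hypothetical tag byte; the real Tag enum lives elsewhere in the crate.
pub const TAG_META: u8 = 0x01;

/// Encode `[entity_id (8 bytes, big-endian)][0x00][tag][suffix]`.
/// Big-endian ids make byte-wise key order match numeric id order.
pub fn encode_key(entity_id: u64, tag: u8, suffix: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(10 + suffix.len());
    key.extend_from_slice(&entity_id.to_be_bytes());
    key.push(0x00);
    key.push(tag);
    key.extend_from_slice(suffix);
    key
}

/// Prefix shared by every key of one entity: `[entity_id][0x00]`.
pub fn entity_prefix(entity_id: u64) -> Vec<u8> {
    let mut prefix = entity_id.to_be_bytes().to_vec();
    prefix.push(0x00);
    prefix
}

/// One ordered map per partition, standing in for a fjall partition.
#[derive(Default)]
pub struct Partitions {
    pub items: BTreeMap<Vec<u8>, Vec<u8>>,
    pub users: BTreeMap<Vec<u8>, Vec<u8>>,
}

/// Collect the values of all keys that start with `prefix`, in key order.
pub fn scan_prefix(map: &BTreeMap<Vec<u8>, Vec<u8>>, prefix: &[u8]) -> Vec<Vec<u8>> {
    map.range(prefix.to_vec()..)
        .take_while(|(key, _)| key.starts_with(prefix))
        .map(|(_, value)| value.clone())
        .collect()
}
```

Because each partition is a separate map, inserting the same encoded key into items and users stores two independent values, and scan_prefix over one entity's prefix never returns another entity's keys.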
Public API
// === fjall.rs ===

/// Production storage engine backed by a single fjall partition.
pub struct FjallBackend {
    partition: fjall::PartitionHandle,
}

impl StorageEngine for FjallBackend { /* ... */ }

impl FjallBackend {
    /// Create a backend from an existing fjall partition handle.
    pub fn new(partition: fjall::PartitionHandle) -> Self;
}

/// Manages three fjall partitions, one per EntityKind.
pub struct FjallStorage {
    keyspace: fjall::Keyspace,
    items: FjallBackend,
    users: FjallBackend,
    creators: FjallBackend,
}

impl FjallStorage {
    /// Open or create a FjallStorage at the given path.
    pub fn open(path: impl AsRef<std::path::Path>) -> Result<Self, StorageError>;

    /// Route to the backend for the given entity kind.
    pub fn backend(&self, kind: EntityKind) -> &FjallBackend;

    /// Flush all partitions to durable storage.
    pub fn flush_all(&self) -> Result<(), StorageError>;

    /// Begin a cross-partition atomic write batch.
    pub fn atomic_batch(&self) -> FjallAtomicBatch;
}

/// Cross-partition atomic write batch.
///
/// Accumulates put/delete operations across multiple partitions
/// and applies them all atomically.
pub struct FjallAtomicBatch {
    batch: fjall::OwnedWriteBatch,
    keyspace: fjall::Keyspace,
}

impl FjallAtomicBatch {
    pub fn put(&mut self, partition: &FjallBackend, key: &[u8], value: &[u8]);
    pub fn delete(&mut self, partition: &FjallBackend, key: &[u8]);

    /// Commit the batch atomically across all partitions.
    pub fn commit(self) -> Result<(), StorageError>;
}
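The accumulate-then-commit shape of FjallAtomicBatch can be illustrated without fjall. The sketch below is an in-memory model only: ModelStorage, ModelAtomicBatch, and the single Mutex guarding all three partitions are hypothetical names and design choices, not the actual implementation, and unlike the real API the batch is addressed by EntityKind rather than by a backend reference.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EntityKind {
    Item,
    User,
    Creator,
}

/// A buffered operation, tagged with the partition it targets.
enum Op {
    Put(EntityKind, Vec<u8>, Vec<u8>),
    Delete(EntityKind, Vec<u8>),
}

type Partition = BTreeMap<Vec<u8>, Vec<u8>>;

/// Hypothetical in-memory model: one lock guards all three partitions,
/// so a commit is observed by readers either entirely or not at all.
#[derive(Default)]
pub struct ModelStorage {
    partitions: Mutex<[Partition; 3]>,
}

impl ModelStorage {
    fn index(kind: EntityKind) -> usize {
        match kind {
            EntityKind::Item => 0,
            EntityKind::User => 1,
            EntityKind::Creator => 2,
        }
    }

    pub fn get(&self, kind: EntityKind, key: &[u8]) -> Option<Vec<u8>> {
        let guard = self.partitions.lock().unwrap();
        guard[Self::index(kind)].get(key).cloned()
    }

    pub fn atomic_batch(&self) -> ModelAtomicBatch<'_> {
        ModelAtomicBatch { storage: self, ops: Vec::new() }
    }
}

/// Buffers operations; nothing is visible until `commit`.
pub struct ModelAtomicBatch<'a> {
    storage: &'a ModelStorage,
    ops: Vec<Op>,
}

impl ModelAtomicBatch<'_> {
    pub fn put(&mut self, kind: EntityKind, key: &[u8], value: &[u8]) {
        self.ops.push(Op::Put(kind, key.to_vec(), value.to_vec()));
    }

    pub fn delete(&mut self, kind: EntityKind, key: &[u8]) {
        self.ops.push(Op::Delete(kind, key.to_vec()));
    }

    /// Apply every buffered operation while holding the single lock.
    pub fn commit(self) {
        let storage = self.storage;
        let mut guard = storage.partitions.lock().unwrap();
        for op in self.ops {
            match op {
                Op::Put(kind, key, value) => {
                    guard[ModelStorage::index(kind)].insert(key, value);
                }
                Op::Delete(kind, key) => {
                    guard[ModelStorage::index(kind)].remove(&key);
                }
            }
        }
    }
}
```

In this model a batch that is dropped without commit leaves the store untouched. The fjall-backed version gets the same property from the database-level batch rather than from a lock, and additionally has to cover crash atomicity, which an in-memory model cannot show.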
Test Strategy
Integration Tests (require tempdir)
#[test]
fn fjall_backend_get_put_delete() {
    let dir = tempfile::tempdir().unwrap();
    let storage = FjallStorage::open(dir.path()).unwrap();
    let backend = storage.backend(EntityKind::Item);

    backend.put(b"key1", b"value1").unwrap();
    assert_eq!(backend.get(b"key1").unwrap(), Some(b"value1".to_vec()));

    backend.delete(b"key1").unwrap();
    assert_eq!(backend.get(b"key1").unwrap(), None);
}

#[test]
fn fjall_backend_scan_prefix() {
    let dir = tempfile::tempdir().unwrap();
    let storage = FjallStorage::open(dir.path()).unwrap();
    let backend = storage.backend(EntityKind::Item);

    let id = EntityId::new(42);
    backend.put(&encode_key(id, Tag::Meta, b"a"), b"v1").unwrap();
    backend.put(&encode_key(id, Tag::Meta, b"b"), b"v2").unwrap();
    backend.put(&encode_key(EntityId::new(43), Tag::Meta, b"a"), b"v3").unwrap();

    let prefix = entity_prefix(id);
    let results: Vec<_> = backend.scan_prefix(&prefix).collect::<Result<Vec<_>, _>>().unwrap();
    assert_eq!(results.len(), 2); // only entity 42's keys
}

#[test]
fn fjall_entity_kind_isolation() {
    let dir = tempfile::tempdir().unwrap();
    let storage = FjallStorage::open(dir.path()).unwrap();

    let key = encode_key(EntityId::new(1), Tag::Meta, b"");
    storage.backend(EntityKind::Item).put(&key, b"item_value").unwrap();
    storage.backend(EntityKind::User).put(&key, b"user_value").unwrap();

    assert_eq!(storage.backend(EntityKind::Item).get(&key).unwrap(), Some(b"item_value".to_vec()));
    assert_eq!(storage.backend(EntityKind::User).get(&key).unwrap(), Some(b"user_value".to_vec()));
}

#[test]
fn fjall_persistence_survives_reopen() {
    let dir = tempfile::tempdir().unwrap();
    {
        let storage = FjallStorage::open(dir.path()).unwrap();
        storage.backend(EntityKind::Item).put(b"k", b"v").unwrap();
        storage.flush_all().unwrap();
    } // storage dropped here

    let storage2 = FjallStorage::open(dir.path()).unwrap();
    assert_eq!(storage2.backend(EntityKind::Item).get(b"k").unwrap(), Some(b"v".to_vec()));
}

#[test]
fn fjall_atomic_batch_all_or_nothing() {
    let dir = tempfile::tempdir().unwrap();
    let storage = FjallStorage::open(dir.path()).unwrap();

    let mut batch = storage.atomic_batch();
    batch.put(storage.backend(EntityKind::Item), b"item_key", b"item_val");
    batch.put(storage.backend(EntityKind::User), b"user_key", b"user_val");
    batch.commit().unwrap();

    assert_eq!(storage.backend(EntityKind::Item).get(b"item_key").unwrap(), Some(b"item_val".to_vec()));
    assert_eq!(storage.backend(EntityKind::User).get(b"user_key").unwrap(), Some(b"user_val".to_vec()));
}

#[test]
fn fjall_write_batch_atomic_within_partition() {
    let dir = tempfile::tempdir().unwrap();
    let storage = FjallStorage::open(dir.path()).unwrap();
    let backend = storage.backend(EntityKind::Item);

    let mut batch = WriteBatch::new();
    batch.put(b"k1".to_vec(), b"v1".to_vec());
    batch.put(b"k2".to_vec(), b"v2".to_vec());
    batch.delete(b"k_missing".to_vec());
    backend.write_batch(batch).unwrap();

    assert_eq!(backend.get(b"k1").unwrap(), Some(b"v1".to_vec()));
    assert_eq!(backend.get(b"k2").unwrap(), Some(b"v2".to_vec()));
}
Acceptance Criteria
- FjallBackend implements all StorageEngine methods
- scan_prefix returns keys in lexicographic order (guaranteed by fjall's LSM-tree)
- FjallStorage creates three partitions: "items", "users", "creators"
- FjallStorage::backend(EntityKind) routes to the correct partition
- Same key written to different entity kind partitions does not collide
- FjallAtomicBatch::commit() applies operations across partitions atomically
- Data persists across close and reopen (flush_all + reopen test passes)
- cargo clippy -- -D warnings passes with fjall 3
Research References
- thoughts.md — Part V.9 (fjall chosen over RocksDB: pure Rust, fast compile, trait-abstracted for swap; sled not considered due to maintenance uncertainty)
- CODING_GUIDELINES.md — Section 10 (fjall as primary backend, RocksDB deferred indefinitely unless benchmarks demand it)
Implementation Notes
- fjall 3 requires MSRV 1.91. The rust-version field in tidal/Cargo.toml is set accordingly.
- FjallBackend::scan_prefix uses fjall's range scan from prefix to prefix + 1 (lexicographic upper bound). Construct the upper bound by incrementing the last non-0xFF byte of the prefix and dropping every byte after it; if the prefix is empty or consists only of 0xFF bytes, no finite upper bound exists and the scan is unbounded above.
- FjallAtomicBatch holds a reference to the fjall::Keyspace (not the individual partitions) because OwnedWriteBatch needs to be committed against the keyspace, not a partition.
- StorageError::Backend(String) captures fjall errors via format!("{}", fjall_err). The fjall error type is not re-exported because higher modules should not depend on fjall directly.
- The #![forbid(unsafe_code)] directive applies to the tidal crate; fjall's internal unsafe code is behind a dependency boundary and does not violate this rule.
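The upper-bound construction for scan_prefix mentioned in the notes above can be sketched as follows. The function names prefix_upper_bound and scan_prefix_range are hypothetical, and a BTreeMap stands in for the fjall partition, so this shows the bound arithmetic only, not the actual fjall call.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Smallest byte string that is strictly greater than every key starting
/// with `prefix`. Returns `None` when no finite bound exists, i.e. the
/// prefix is empty or consists only of 0xFF bytes.
pub fn prefix_upper_bound(prefix: &[u8]) -> Option<Vec<u8>> {
    // Index of the last byte that can be incremented without overflow.
    let last = prefix.iter().rposition(|&b| b != 0xFF)?;
    // Drop the trailing 0xFF bytes, then increment the byte at `last`.
    let mut upper = prefix[..=last].to_vec();
    upper[last] += 1;
    Some(upper)
}

/// Return all keys in `[prefix, upper_bound)`, which is exactly the set of
/// keys that start with `prefix`.
pub fn scan_prefix_range(map: &BTreeMap<Vec<u8>, Vec<u8>>, prefix: &[u8]) -> Vec<Vec<u8>> {
    let lower = Bound::Included(prefix.to_vec());
    let upper = match prefix_upper_bound(prefix) {
        Some(bound) => Bound::Excluded(bound),
        None => Bound::Unbounded,
    };
    map.range((lower, upper)).map(|(key, _)| key.clone()).collect()
}
```

Truncating after the incremented byte matters: for the prefix [0x01, 0xFF] the bound is [0x02], not [0x02, 0xFF], since the latter would also admit keys such as [0x02, 0x00] that do not share the prefix.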