# Task 02: FjallBackend

## Context

**Milestone:** 1 -- Signal Engine
**Phase:** m1p3 -- Storage Engine Trait and fjall Backend
**Status:** COMPLETE
**Depends On:** Task 01 (`StorageEngine` trait, `WriteBatch`, `StorageError`)
**Blocks:** None (Task 03 is parallel, not sequential)
**Complexity:** M

## Objective

Implement `FjallBackend`, the production storage engine backed by fjall 3's LSM-tree. Wrap it in `FjallStorage`, which manages three keyspaces (one per `EntityKind`) and routes each request by entity kind. Implement `FjallAtomicBatch` for cross-keyspace atomic writes.

fjall was chosen over RocksDB and sled because it is pure Rust, lets tidalDB keep `#![forbid(unsafe_code)]` at its own crate level (fjall uses unsafe internally, but its API surface is safe), compiles quickly, and exposes the `OwnedWriteBatch` API needed for cross-keyspace atomicity.

## Requirements

- `FjallBackend` wraps a single `fjall::Keyspace` and implements `StorageEngine`
- `scan_prefix` returns a `PrefixIterator<'_>` using fjall's range scan over the keyspace
- `write_batch` uses fjall's batch write API for atomicity within a keyspace
- `FjallStorage` owns a `fjall::Database` with three partitions: "items", "users", "creators"
- `FjallStorage::backend(EntityKind) -> &FjallBackend` routes to the correct partition
- Entity-kind isolation: writes to `EntityKind::Item` never collide with `EntityKind::User` for the same key
- `FjallAtomicBatch` enables cross-partition atomic writes via `fjall::OwnedWriteBatch`
- Data persists across close and reopen: write → `flush_all()` → drop → reopen → read succeeds
- MSRV: 1.91 (required for fjall 3)

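
The routing requirement can be illustrated with a small, self-contained sketch. Everything here is a stand-in: `Backend` and `Storage` are hypothetical placeholders for `FjallBackend` and `FjallStorage`, and the `EntityKind::Creator` variant is an assumption inferred from the "creators" partition name. No fjall calls are made.

```rust
/// Entity kinds, one per partition. `Creator` is assumed from the
/// "creators" partition name; the real enum may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EntityKind {
    Item,
    User,
    Creator,
}

/// Hypothetical stand-in for `FjallBackend`: it only records which
/// partition it would wrap.
pub struct Backend {
    pub partition_name: &'static str,
}

/// Hypothetical stand-in for `FjallStorage`: one backend per entity kind.
pub struct Storage {
    items: Backend,
    users: Backend,
    creators: Backend,
}

impl Storage {
    pub fn new() -> Self {
        Storage {
            items: Backend { partition_name: "items" },
            users: Backend { partition_name: "users" },
            creators: Backend { partition_name: "creators" },
        }
    }

    /// Route to the backend for the given entity kind. The match is
    /// exhaustive, so adding a new kind fails to compile until it is routed.
    pub fn backend(&self, kind: EntityKind) -> &Backend {
        match kind {
            EntityKind::Item => &self.items,
            EntityKind::User => &self.users,
            EntityKind::Creator => &self.creators,
        }
    }
}
```

Returning a shared reference keeps routing allocation-free, and callers never need to know partition names.
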
## Technical Design

### Architecture

```
FjallStorage
├── items_backend: FjallBackend (fjall partition "items")
├── users_backend: FjallBackend (fjall partition "users")
└── creators_backend: FjallBackend (fjall partition "creators")
```

Each `FjallBackend` wraps one fjall partition. Entity data is isolated by partition (keyspace), not by key prefix. The same encoded key `[entity_id][NUL][Tag]` can therefore exist in both "items" and "users" without collision, because each partition is its own namespace.

Within each partition, the subject-prefix key encoding enables efficient entity-scoped scans (`scan_prefix(entity_prefix(id))`).

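
As a rough illustration of both points, the sketch below models each partition as a separate `BTreeMap` and runs an entity-scoped prefix scan over it. The `encode_key` and `entity_prefix` functions are simplified stand-ins, not the project's real helpers: they assume an 8-byte big-endian entity id, a NUL separator, a one-byte tag, and a suffix, and they assume the prefix includes the separator.

```rust
use std::collections::BTreeMap;

/// Stand-in key encoder (assumed layout): 8-byte big-endian entity id,
/// NUL separator, one-byte tag, then the suffix.
fn encode_key(entity_id: u64, tag: u8, suffix: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(10 + suffix.len());
    key.extend_from_slice(&entity_id.to_be_bytes());
    key.push(0x00);
    key.push(tag);
    key.extend_from_slice(suffix);
    key
}

/// Stand-in prefix shared by every key of one entity (id plus separator).
fn entity_prefix(entity_id: u64) -> Vec<u8> {
    let mut prefix = entity_id.to_be_bytes().to_vec();
    prefix.push(0x00);
    prefix
}

/// Collect the values of all keys that start with `prefix`, in key order.
/// Seeks to the first key >= prefix, then stops at the first non-match.
fn scan_prefix<'a>(
    partition: &'a BTreeMap<Vec<u8>, Vec<u8>>,
    prefix: &[u8],
) -> Vec<&'a [u8]> {
    partition
        .range(prefix.to_vec()..)
        .take_while(|(key, _)| key.starts_with(prefix))
        .map(|(_, value)| value.as_slice())
        .collect()
}
```

Because the id is big-endian, all keys of one entity are contiguous in sorted order, so the scan touches only that entity's keys. Writing the same key into two separate maps (two partitions) keeps both values, which is the isolation property described above.
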
### Public API

```rust
// === fjall.rs ===

/// Production storage engine backed by a single fjall partition.
pub struct FjallBackend {
    partition: fjall::PartitionHandle,
}

impl StorageEngine for FjallBackend { /* ... */ }

impl FjallBackend {
    /// Create a backend from an existing fjall partition handle.
    pub fn new(partition: fjall::PartitionHandle) -> Self;
}

/// Manages three fjall partitions, one per EntityKind.
pub struct FjallStorage {
    keyspace: fjall::Keyspace,
    items: FjallBackend,
    users: FjallBackend,
    creators: FjallBackend,
}

impl FjallStorage {
    /// Open or create a FjallStorage at the given path.
    pub fn open(path: impl AsRef<std::path::Path>) -> Result<Self, StorageError>;

    /// Route to the backend for the given entity kind.
    pub fn backend(&self, kind: EntityKind) -> &FjallBackend;

    /// Flush all partitions to durable storage.
    pub fn flush_all(&self) -> Result<(), StorageError>;

    /// Begin a cross-partition atomic write batch.
    pub fn atomic_batch(&self) -> FjallAtomicBatch;
}

/// Cross-partition atomic write batch.
///
/// Accumulates put/delete operations across multiple partitions
/// and applies them all atomically.
pub struct FjallAtomicBatch {
    batch: fjall::OwnedWriteBatch,
    keyspace: fjall::Keyspace,
}

impl FjallAtomicBatch {
    pub fn put(&mut self, partition: &FjallBackend, key: &[u8], value: &[u8]);
    pub fn delete(&mut self, partition: &FjallBackend, key: &[u8]);
    /// Commit the batch atomically across all partitions.
    pub fn commit(self) -> Result<(), StorageError>;
}
```

## Test Strategy

### Integration Tests (require tempdir)

```rust
#[test]
fn fjall_backend_get_put_delete() {
    let dir = tempfile::tempdir().unwrap();
    let storage = FjallStorage::open(dir.path()).unwrap();
    let backend = storage.backend(EntityKind::Item);

    backend.put(b"key1", b"value1").unwrap();
    assert_eq!(backend.get(b"key1").unwrap(), Some(b"value1".to_vec()));

    backend.delete(b"key1").unwrap();
    assert_eq!(backend.get(b"key1").unwrap(), None);
}

#[test]
fn fjall_backend_scan_prefix() {
    let dir = tempfile::tempdir().unwrap();
    let storage = FjallStorage::open(dir.path()).unwrap();
    let backend = storage.backend(EntityKind::Item);

    let id = EntityId::new(42);
    backend.put(&encode_key(id, Tag::Meta, b"a"), b"v1").unwrap();
    backend.put(&encode_key(id, Tag::Meta, b"b"), b"v2").unwrap();
    backend.put(&encode_key(EntityId::new(43), Tag::Meta, b"a"), b"v3").unwrap();

    let prefix = entity_prefix(id);
    let results: Vec<_> = backend.scan_prefix(&prefix).collect::<Result<Vec<_>, _>>().unwrap();
    assert_eq!(results.len(), 2); // only entity 42's keys
}

#[test]
fn fjall_entity_kind_isolation() {
    let dir = tempfile::tempdir().unwrap();
    let storage = FjallStorage::open(dir.path()).unwrap();
    let key = encode_key(EntityId::new(1), Tag::Meta, b"");

    storage.backend(EntityKind::Item).put(&key, b"item_value").unwrap();
    storage.backend(EntityKind::User).put(&key, b"user_value").unwrap();

    assert_eq!(storage.backend(EntityKind::Item).get(&key).unwrap(), Some(b"item_value".to_vec()));
    assert_eq!(storage.backend(EntityKind::User).get(&key).unwrap(), Some(b"user_value".to_vec()));
}

#[test]
fn fjall_persistence_survives_reopen() {
    let dir = tempfile::tempdir().unwrap();
    {
        let storage = FjallStorage::open(dir.path()).unwrap();
        storage.backend(EntityKind::Item).put(b"k", b"v").unwrap();
        storage.flush_all().unwrap();
    } // storage dropped here

    let storage2 = FjallStorage::open(dir.path()).unwrap();
    assert_eq!(storage2.backend(EntityKind::Item).get(b"k").unwrap(), Some(b"v".to_vec()));
}

#[test]
fn fjall_atomic_batch_all_or_nothing() {
    let dir = tempfile::tempdir().unwrap();
    let storage = FjallStorage::open(dir.path()).unwrap();

    let mut batch = storage.atomic_batch();
    batch.put(storage.backend(EntityKind::Item), b"item_key", b"item_val");
    batch.put(storage.backend(EntityKind::User), b"user_key", b"user_val");
    batch.commit().unwrap();

    assert_eq!(storage.backend(EntityKind::Item).get(b"item_key").unwrap(), Some(b"item_val".to_vec()));
    assert_eq!(storage.backend(EntityKind::User).get(b"user_key").unwrap(), Some(b"user_val".to_vec()));
}

#[test]
fn fjall_write_batch_atomic_within_partition() {
    let dir = tempfile::tempdir().unwrap();
    let storage = FjallStorage::open(dir.path()).unwrap();
    let backend = storage.backend(EntityKind::Item);

    let mut batch = WriteBatch::new();
    batch.put(b"k1".to_vec(), b"v1".to_vec());
    batch.put(b"k2".to_vec(), b"v2".to_vec());
    batch.delete(b"k_missing".to_vec());
    backend.write_batch(batch).unwrap();

    assert_eq!(backend.get(b"k1").unwrap(), Some(b"v1".to_vec()));
    assert_eq!(backend.get(b"k2").unwrap(), Some(b"v2".to_vec()));
}
```

## Acceptance Criteria

- [x] `FjallBackend` implements all `StorageEngine` methods
- [x] `scan_prefix` returns keys in lexicographic order (guaranteed by fjall's LSM-tree)
- [x] `FjallStorage` creates three partitions: "items", "users", "creators"
- [x] `FjallStorage::backend(EntityKind)` routes to the correct partition
- [x] Same key written to different entity kind partitions does not collide
- [x] `FjallAtomicBatch::commit()` applies operations across partitions atomically
- [x] Data persists across close and reopen (flush_all + reopen test passes)
- [x] `cargo clippy -- -D warnings` passes with fjall 3

## Research References

- [thoughts.md](../../../../thoughts.md): Part V.9 (fjall chosen over RocksDB: pure Rust, fast compile, trait-abstracted for swap; sled not considered due to maintenance uncertainty)
- [CODING_GUIDELINES.md](../../../../CODING_GUIDELINES.md): Section 10 (fjall as primary backend, RocksDB deferred indefinitely unless benchmarks demand it)

## Implementation Notes

- fjall 3 requires MSRV 1.91. The `rust-version` field in `tidal/Cargo.toml` is set accordingly.
- `FjallBackend::scan_prefix` uses fjall's range scan from `prefix` (inclusive) to `prefix + 1` (exclusive lexicographic upper bound). Construct the upper bound by incrementing the last non-0xFF byte of the prefix and dropping every byte after it. A prefix that is empty or consists only of 0xFF bytes has no finite upper bound, so the scan must be open-ended.
- `FjallAtomicBatch` holds a handle to the `fjall::Keyspace` (not to the individual partitions) because an `OwnedWriteBatch` is committed against the keyspace, not against a partition.
- `StorageError::Backend(String)` captures fjall errors via `format!("{}", fjall_err)`. The fjall error type is not re-exported because higher modules should not depend on fjall directly.
- The `#![forbid(unsafe_code)]` directive applies to the `tidal` crate; fjall's internal unsafe code is behind a dependency boundary and does not violate this rule.

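
The upper-bound construction for `scan_prefix` can be sketched with the standard library alone. The function names `prefix_upper_bound` and `prefix_range` are hypothetical and are not taken from the tidalDB source; the sketch only shows the byte arithmetic, not how the bounds are handed to fjall.

```rust
use std::ops::Bound;

/// Smallest byte string strictly greater than every key that starts with
/// `prefix`. Returns `None` when no finite bound exists, which happens for
/// an empty prefix or one made up entirely of 0xFF bytes.
fn prefix_upper_bound(prefix: &[u8]) -> Option<Vec<u8>> {
    // Index of the last byte that can be incremented without overflow.
    let pos = prefix.iter().rposition(|&byte| byte != 0xFF)?;
    // Keep everything up to and including that byte, dropping trailing 0xFFs.
    let mut bound = prefix[..=pos].to_vec();
    bound[pos] += 1;
    Some(bound)
}

/// Range bounds covering exactly the keys that start with `prefix`.
fn prefix_range(prefix: &[u8]) -> (Bound<Vec<u8>>, Bound<Vec<u8>>) {
    let upper = match prefix_upper_bound(prefix) {
        Some(bound) => Bound::Excluded(bound),
        None => Bound::Unbounded,
    };
    (Bound::Included(prefix.to_vec()), upper)
}
```

Dropping the trailing 0xFF bytes matters: for the prefix `[0x01, 0xFF]` the bound is `[0x02]`, not `[0x02, 0xFF]`, which would wrongly admit keys such as `[0x02, 0x00]`.
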
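
The error-capture note can be sketched the same way. The enum below is reduced to the single `Backend(String)` variant named in this document (the real `StorageError` may have more), `FakeBackendError` is a hypothetical stand-in for the fjall error type, and the `backend` constructor is an assumed helper rather than part of the documented API.

```rust
use std::fmt;

/// Reduced stand-in for the storage error type: only the variant named
/// in this document is modelled.
#[derive(Debug, PartialEq, Eq)]
pub enum StorageError {
    Backend(String),
}

impl StorageError {
    /// Capture any displayable backend error as a string, so the backend's
    /// own error type never appears in the public signature.
    pub fn backend<E: fmt::Display>(err: E) -> Self {
        StorageError::Backend(err.to_string())
    }
}

/// Hypothetical stand-in for an error returned by the storage library.
#[derive(Debug)]
pub struct FakeBackendError(pub &'static str);

impl fmt::Display for FakeBackendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "fake backend failure: {}", self.0)
    }
}

/// Example call site: convert the library error at the module boundary.
pub fn failing_put() -> Result<(), StorageError> {
    let raw: Result<(), FakeBackendError> = Err(FakeBackendError("disk full"));
    raw.map_err(StorageError::backend)
}
```

Stringifying at the boundary loses the structured error, which is the trade-off the note accepts in exchange for keeping higher modules free of a direct fjall dependency.
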