Implements the foundation of tidalDB's data pipeline:

**Phase 1 – Schema primitives**
- EntityId newtype (u64, big-endian ordering)
- SignalTypeDefinition with pre-computed decay λ, deduped/sorted windows
- SchemaBuilder with full constraint validation (duplicates, identifiers, half-life, windows, velocity)
- LumenError wrapping all subsystems with required From impls

**Phase 2 – Write-Ahead Log**
- Length-prefixed, BLAKE3-protected entry format
- Group-commit writer (batch up to 100 events / 10 ms)
- Double-buffered content-hash deduplication
- Checkpoint, truncation, and crash-recovery with full replay
- Integration, property, and UAT tests (incl. 5,500-event deterministic UAT)
- Proptest coverage scaled to 10 000 events/run (was ≤500) to meet acceptance criterion; cases reduced 100→10 to keep runtime comparable

**Phase 3 – Storage engine**
- StorageEngine trait (get/put/delete/scan/batch/flush)
- Key encoding: [EntityId][0x00][Tag][suffix] with ordering/prefix helpers
- InMemoryBackend (BTreeMap + RwLock)
- FjallStorage with three isolated keyspaces and atomic batch helper
- Property tests for key ordering and round-trip correctness

Also adds planning docs for phases 4-5, research docs, architecture overview, and roadmap updates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 01: WAL Wire Format and Segment Files
Context
Milestone: 1 -- Signal Engine
Phase: m1p2 -- Write-Ahead Log
Status: COMPLETE
Depends On: None
Blocks: Task 02 (Group Commit Writer), Task 03 (Crash Recovery and Replay)
Complexity: M
Objective
Define the on-disk binary format for WAL batches and event records, implement the segment file writer that manages 16 MB rotating files, and define the WalError type. This is the foundation everything else builds on — the format dictates how writers produce batches, how readers parse them, and how crash recovery validates them.
The key design decision (already resolved in docs/research/tidaldb_wal.md) is batch-oriented framing: frame entire batches rather than individual events. Each batch starts with a 64-byte header (the size of one cache line) that carries a BLAKE3 checksum, followed by tightly packed 21-byte event records. This matches the group-commit write path exactly and amortizes both checksum and fsync cost across the events in a batch (up to 100).
Requirements
- BatchHeader is exactly 64 bytes (#[repr(C)], compile-time assertion)
- Magic bytes 0x54494C44 ("TIDL") at offset 0 for human-readable crash dumps
- BLAKE3 hash at bytes [32..64] covers header[0..32] || all_event_bytes, NOT the hash field itself
- EventRecord is exactly 21 bytes, little-endian throughout: entity_id (u64), signal_type (u8), weight (f32), timestamp_nanos (u64)
- SegmentWriter opens or creates a segment file and appends batches
- Segment files named wal-{first_seq:020}.seg: zero-padded 20-digit, lexicographic = numeric order
- list_segments(dir) returns Vec<(first_seq, PathBuf)> sorted by first sequence number
- WalError covers: Io(std::io::Error), Corruption(String), Closed, SendFailed, ShutdownFailed
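As a minimal sketch of how the error type and the size assertions in the requirements above might be written: only the variant names and the 64/21-byte sizes come from this task. The Display messages, the `source()` wiring, and the `From<std::io::Error>` impl are assumptions made for illustration.

```rust
use std::fmt;
use std::io;

pub const HEADER_SIZE: usize = 64;
pub const EVENT_SIZE: usize = 21;

// Compile-time checks: the build fails if the field widths from the
// wire format stop adding up to the declared sizes.
const _: () = assert!(HEADER_SIZE == 4 + 1 + 1 + 2 + 8 + 8 + 4 + 4 + 32);
const _: () = assert!(EVENT_SIZE == 8 + 1 + 4 + 8);

#[derive(Debug)]
pub enum WalError {
    Io(io::Error),
    Corruption(String),
    Closed,
    SendFailed,
    ShutdownFailed,
}

impl fmt::Display for WalError {
    // Message wording is hypothetical; the task only names the variants.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            WalError::Io(e) => write!(f, "WAL I/O error: {}", e),
            WalError::Corruption(msg) => write!(f, "WAL corruption: {}", msg),
            WalError::Closed => write!(f, "WAL is closed"),
            WalError::SendFailed => write!(f, "failed to send to WAL writer"),
            WalError::ShutdownFailed => write!(f, "WAL shutdown failed"),
        }
    }
}

impl std::error::Error for WalError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            WalError::Io(e) => Some(e),
            _ => None,
        }
    }
}

// Lets `?` convert std::io::Error into WalError inside SegmentWriter methods.
impl From<io::Error> for WalError {
    fn from(e: io::Error) -> Self {
        WalError::Io(e)
    }
}
```

With the `From` impl in place, file operations inside a function returning `Result<_, WalError>` can use `?` directly on `std::io` calls.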
Technical Design
Wire Format
BATCH FRAME:
+============================================================================+
| Offset | Size | Field               | Encoding              | Notes        |
+--------+------+---------------------+-----------------------+--------------+
| 0      | 4    | Magic               | [0x54,0x49,0x4C,0x44] | "TIDL"       |
| 4      | 1    | Version             | u8                    | Currently 1  |
| 5      | 1    | Flags               | u8                    | Reserved (0) |
| 6      | 2    | Event count         | u16 LE                | 1..=65535    |
| 8      | 8    | First sequence no.  | u64 LE                | Monotonic    |
| 16     | 8    | Batch timestamp     | u64 LE                | Nanos epoch  |
| 24     | 4    | Payload byte length | u32 LE                | count * 21   |
| 28     | 4    | Reserved            | [0u8; 4]              | Future use   |
| 32     | 32   | BLAKE3 checksum     | [u8; 32]              | See below    |
+--------+------+---------------------+-----------------------+--------------+
| 64     | N*21 | Event records       | packed structs        |              |
+============================================================================+
BLAKE3 INPUT: blake3(header[0..32] || event_bytes[..])
(hash covers magic through reserved; the hash field [32..64] is excluded)
EVENT RECORD (21 bytes each, tightly packed):
| Offset | Size | Field | Encoding |
|--------|------|----------------|-----------|
| 0 | 8 | Entity ID | u64 LE |
| 8 | 1 | Signal type | u8 |
| 9 | 4 | Weight | f32 LE |
| 13 | 8 | Timestamp nanos| u64 LE |
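The 21-byte event record in the table above maps directly onto fixed byte ranges. The sketch below fills in bodies for the `encode`/`decode` signatures listed under Public API Surface; the bodies are an illustration of the layout, not code taken from the repository.

```rust
pub const EVENT_SIZE: usize = 21;

#[derive(Debug, Clone, PartialEq)]
pub struct EventRecord {
    pub entity_id: u64,
    pub signal_type: u8,
    pub weight: f32,
    pub timestamp_nanos: u64,
}

impl EventRecord {
    /// Pack the record into 21 little-endian bytes with no padding.
    pub fn encode(&self) -> [u8; EVENT_SIZE] {
        let mut buf = [0u8; EVENT_SIZE];
        buf[0..8].copy_from_slice(&self.entity_id.to_le_bytes());
        buf[8] = self.signal_type;
        // The weight is stored as its IEEE 754 bit pattern.
        buf[9..13].copy_from_slice(&self.weight.to_bits().to_le_bytes());
        buf[13..21].copy_from_slice(&self.timestamp_nanos.to_le_bytes());
        buf
    }

    /// Inverse of `encode`; infallible because the input length is fixed by the type.
    pub fn decode(bytes: &[u8; EVENT_SIZE]) -> Self {
        let mut id = [0u8; 8];
        id.copy_from_slice(&bytes[0..8]);
        let mut weight = [0u8; 4];
        weight.copy_from_slice(&bytes[9..13]);
        let mut ts = [0u8; 8];
        ts.copy_from_slice(&bytes[13..21]);
        EventRecord {
            entity_id: u64::from_le_bytes(id),
            signal_type: bytes[8],
            weight: f32::from_bits(u32::from_le_bytes(weight)),
            timestamp_nanos: u64::from_le_bytes(ts),
        }
    }
}
```

Because the struct is packed by hand into a byte array, its in-memory layout (which would include padding after `signal_type`) never reaches the disk.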
Module Structure
tidal/src/wal/
format.rs -- BatchHeader, EventRecord: encode/decode
segment.rs -- SegmentWriter, list_segments
error.rs -- WalError
Public API Surface
// === format.rs ===

pub const MAGIC: [u8; 4] = [0x54, 0x49, 0x4C, 0x44]; // "TIDL"
pub const HEADER_SIZE: usize = 64;
pub const EVENT_SIZE: usize = 21;
pub const FORMAT_VERSION: u8 = 1;

#[derive(Debug, Clone, PartialEq)]
pub struct BatchHeader {
    pub event_count: u16,
    pub first_seq: u64,
    pub batch_timestamp_nanos: u64,
    pub payload_len: u32,
    pub checksum: [u8; 32],
}

impl BatchHeader {
    pub fn encode(&self) -> [u8; HEADER_SIZE];
    pub fn decode(bytes: &[u8; HEADER_SIZE]) -> Result<Self, WalError>;
    pub fn compute_checksum(header_prefix: &[u8; 32], events: &[u8]) -> [u8; 32];
}

#[derive(Debug, Clone, PartialEq)]
pub struct EventRecord {
    pub entity_id: u64,
    pub signal_type: u8,
    pub weight: f32,
    pub timestamp_nanos: u64,
}

impl EventRecord {
    pub fn encode(&self) -> [u8; EVENT_SIZE];
    pub fn decode(bytes: &[u8; EVENT_SIZE]) -> Self;
}

// === segment.rs ===

pub struct SegmentWriter { /* file handle, current size, segment_size limit */ }

impl SegmentWriter {
    pub fn open(dir: &Path, first_seq: u64, segment_size: u64) -> Result<Self, WalError>;
    /// Append raw batch bytes. Returns true if segment is now full.
    pub fn append_batch(&mut self, bytes: &[u8]) -> Result<bool, WalError>;
    pub fn flush(&mut self) -> Result<(), WalError>;
    pub fn segment_size(&self) -> u64;
    pub fn current_size(&self) -> u64;
}

pub fn segment_path(dir: &Path, first_seq: u64) -> PathBuf;
pub fn list_segments(dir: &Path) -> Result<Vec<(u64, PathBuf)>, WalError>;
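The segment naming scheme behind `segment_path` and `list_segments` can be sketched as below. Assumptions: `parse_segment_name` is a hypothetical helper, files that do not match the pattern are silently skipped, and the sketch returns `std::io::Result` where the API above returns `WalError`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// `wal-{first_seq:020}.seg`: u64::MAX has 20 decimal digits, so zero-padding
/// to 20 makes lexicographic order equal numeric order.
pub fn segment_path(dir: &Path, first_seq: u64) -> PathBuf {
    dir.join(format!("wal-{:020}.seg", first_seq))
}

/// Parse `wal-<20 digits>.seg`; returns None for any other file name.
fn parse_segment_name(name: &str) -> Option<u64> {
    let digits = name.strip_prefix("wal-")?.strip_suffix(".seg")?;
    if digits.len() != 20 || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse().ok()
}

/// List segment files in `dir`, sorted ascending by first sequence number.
pub fn list_segments(dir: &Path) -> io::Result<Vec<(u64, PathBuf)>> {
    let mut segments = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let seq = path
            .file_name()
            .and_then(|n| n.to_str())
            .and_then(parse_segment_name);
        if let Some(seq) = seq {
            segments.push((seq, path));
        }
    }
    // read_dir order is platform-dependent, so sort explicitly.
    segments.sort_by_key(|(seq, _)| *seq);
    Ok(segments)
}
```

Sorting by the parsed number rather than by file name keeps the result correct even if a file with a non-canonical name ever ends up in the directory.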
Test Strategy
Unit Tests
// Assumes a #[cfg(test)] module next to the items under test,
// with tempfile available as a dev-dependency.
use super::*;
use std::path::Path;

#[test]
fn batch_header_roundtrip() {
    let header = BatchHeader {
        event_count: 42,
        first_seq: 1000,
        batch_timestamp_nanos: 1_700_000_000_000_000_000,
        payload_len: 42 * 21,
        checksum: [0xAB; 32],
    };
    let encoded = header.encode();
    let decoded = BatchHeader::decode(&encoded).unwrap();
    assert_eq!(header, decoded);
}

#[test]
fn event_record_roundtrip() {
    let event = EventRecord { entity_id: 999, signal_type: 3, weight: 2.5, timestamp_nanos: 42_000_000_000 };
    let encoded = event.encode();
    let decoded = EventRecord::decode(&encoded);
    assert_eq!(decoded.entity_id, 999);
    assert_eq!(decoded.signal_type, 3);
    // Compare bit patterns so the check is exact rather than float equality.
    assert_eq!(decoded.weight.to_bits(), 2.5_f32.to_bits());
    assert_eq!(decoded.timestamp_nanos, 42_000_000_000);
}

#[test]
fn magic_bytes_in_header() {
    let header = BatchHeader { event_count: 1, first_seq: 1, batch_timestamp_nanos: 0, payload_len: 21, checksum: [0u8; 32] };
    let encoded = header.encode();
    assert_eq!(&encoded[0..4], &[0x54, 0x49, 0x4C, 0x44]);
}

#[test]
fn segment_naming_is_ordered() {
    let p1 = segment_path(Path::new("/tmp"), 1);
    let p2 = segment_path(Path::new("/tmp"), 1000);
    // Lexicographic order matches numeric order
    assert!(p1.file_name() < p2.file_name());
}

#[test]
fn list_segments_returns_sorted() {
    let dir = tempfile::tempdir().unwrap();
    // Create segment files out of order
    std::fs::write(segment_path(dir.path(), 200), b"").unwrap();
    std::fs::write(segment_path(dir.path(), 1), b"").unwrap();
    std::fs::write(segment_path(dir.path(), 100), b"").unwrap();
    let segments = list_segments(dir.path()).unwrap();
    assert_eq!(segments.len(), 3);
    assert_eq!(segments[0].0, 1);
    assert_eq!(segments[1].0, 100);
    assert_eq!(segments[2].0, 200);
}

#[test]
fn header_decode_rejects_wrong_magic() {
    let mut bytes = [0u8; 64];
    bytes[0] = 0xFF; // wrong magic
    assert!(matches!(BatchHeader::decode(&bytes), Err(WalError::Corruption(_))));
}

#[test]
fn header_decode_rejects_wrong_version() {
    // Start from a valid header so the version byte is the only invalid field.
    let header = BatchHeader { event_count: 1, first_seq: 1, batch_timestamp_nanos: 0, payload_len: 21, checksum: [0u8; 32] };
    let mut bytes = header.encode();
    bytes[4] = 99; // wrong version
    assert!(matches!(BatchHeader::decode(&bytes), Err(WalError::Corruption(_))));
}
Acceptance Criteria
- BatchHeader encodes to exactly 64 bytes (compile-time assertion)
- EventRecord encodes to exactly 21 bytes (compile-time assertion)
- Magic bytes 0x54494C44 appear at bytes [0..4] of every encoded header
- BLAKE3 checksum covers header[0..32] || event_bytes (excludes the hash field itself)
- BatchHeader::decode() returns WalError::Corruption on wrong magic or unknown version
- EventRecord::encode/decode roundtrip is lossless for all finite f32 weights
- Segment files are named wal-{seq:020}.seg; list_segments() returns them sorted ascending
- SegmentWriter::append_batch() writes raw bytes and returns true when the segment has exceeded its size limit
- All little-endian encoding: no byte-swap cost on x86/ARM
- cargo clippy -- -D warnings passes
Research References
- docs/research/tidaldb_wal.md — Section 1 (Approach 3: batch-oriented framing with wire format table), Section 5 (segment rotation at 16 MB, naming convention)
Implementation Notes
- payload_len is always event_count * 21. The redundancy allows Phase 1 crash validation (check bounds before computing BLAKE3) without reading the event data.
- The hash field at header[32..64] is written AFTER computing the hash. The hash input is header[0..32] || events; the hash field itself is never fed to the hasher, so it does not need to be zeroed first.
- f32::to_bits() / f32::from_bits() are used for weight encoding: safe, const, and exact. Never cast f32 to u32 via as, which converts the numeric value (2.5 becomes 2) instead of preserving the bit pattern.
- Segment files do not need pre-allocation in m1p2. Defer fallocate until disk write performance is a measured bottleneck.
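Two of the notes above lend themselves to a short illustration: the cheap payload_len bounds check that runs before any hashing, and the difference between a bit-pattern conversion and a numeric cast. The function names and the `bytes_remaining` parameter are hypothetical; this is a sketch under those assumptions rather than the recovery code itself.

```rust
pub const EVENT_SIZE: usize = 21;

/// Cheap header sanity check that needs no event data and no hashing:
/// the count is in range, payload_len matches it, and the payload fits
/// in what is left of the segment file.
pub fn payload_len_is_consistent(event_count: u16, payload_len: u32, bytes_remaining: u64) -> bool {
    event_count >= 1
        && payload_len == u32::from(event_count) * EVENT_SIZE as u32
        && u64::from(payload_len) <= bytes_remaining
}

/// Encode a weight as its IEEE 754 bit pattern in little-endian order.
pub fn encode_weight(weight: f32) -> [u8; 4] {
    weight.to_bits().to_le_bytes()
}

/// Exact inverse of `encode_weight`.
pub fn decode_weight(bytes: [u8; 4]) -> f32 {
    f32::from_bits(u32::from_le_bytes(bytes))
}
```

For comparison, `2.5_f32.to_bits()` is `0x4020_0000`, while `2.5_f32 as u32` is `2`: the cast truncates the value and would silently corrupt every non-integer weight.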