# Task 01: WAL Wire Format and Segment Files

## Context

**Milestone:** 1 -- Signal Engine
**Phase:** m1p2 -- Write-Ahead Log
**Status:** COMPLETE
**Depends On:** None
**Blocks:** Task 02 (Group Commit Writer), Task 03 (Crash Recovery and Replay)
**Complexity:** M

## Objective

Define the on-disk binary format for WAL batches and event records, implement the segment file writer that manages 16 MB rotating files, and define the `WalError` type. This is the foundation everything else builds on: the format dictates how writers produce batches, how readers parse them, and how crash recovery validates them.

The key design decision (already resolved in `docs/research/tidaldb_wal.md`) is batch-oriented framing: frame entire batches rather than individual events. Each batch is a 64-byte, cache-line-aligned header carrying a BLAKE3 checksum, followed by tightly packed 21-byte event records. This matches the group-commit write path exactly and amortizes both checksum and fsync cost across 100 events per batch.

## Requirements

- `BatchHeader` is exactly 64 bytes (`#[repr(C)]`, compile-time assertion)
- Magic bytes `0x54494C44` (ASCII "TILD") at offset 0 for human-readable crash dumps
- BLAKE3 hash at bytes [32..64] covers `header[0..32] || all_event_bytes`, NOT the hash field itself
- `EventRecord` is exactly 21 bytes, little-endian throughout: entity_id (u64), signal_type (u8), weight (f32), timestamp_nanos (u64)
- `SegmentWriter` opens or creates a segment file and appends batches
- Segment files are named `wal-{first_seq:020}.seg` (zero-padded to 20 digits, so lexicographic order equals numeric order)
- `list_segments(dir)` returns `Vec<(first_seq, PathBuf)>` sorted by first sequence number
- `WalError` covers: `Io(std::io::Error)`, `Corruption(String)`, `Closed`, `SendFailed`, `ShutdownFailed`

## Technical Design

### Wire Format

```
BATCH FRAME:
+========+======+=====================+=======================+==============+
| Offset | Size | Field               | Encoding              | Notes        |
+--------+------+---------------------+-----------------------+--------------+
| 0      | 4    | Magic               | [0x54,0x49,0x4C,0x44] | ASCII "TILD" |
| 4      | 1    | Version             | u8                    | Currently 1  |
| 5      | 1    | Flags               | u8                    | Reserved (0) |
| 6      | 2    | Event count         | u16 LE                | 1..=65535    |
| 8      | 8    | First sequence no.  | u64 LE                | Monotonic    |
| 16     | 8    | Batch timestamp     | u64 LE                | Nanos epoch  |
| 24     | 4    | Payload byte length | u32 LE                | count * 21   |
| 28     | 4    | Reserved            | [0u8; 4]              | Future use   |
| 32     | 32   | BLAKE3 checksum     | [u8; 32]              | See below    |
+--------+------+---------------------+-----------------------+--------------+
| 64     | N*21 | Event records       | packed structs        |              |
+========+======+=====================+=======================+==============+

BLAKE3 INPUT:
  blake3(header[0..32] || event_bytes[..])
  (hash covers magic through reserved; the hash field [32..64] is excluded)

EVENT RECORD (21 bytes each, tightly packed):
| Offset | Size | Field           | Encoding |
|--------|------|-----------------|----------|
| 0      | 8    | Entity ID       | u64 LE   |
| 8      | 1    | Signal type     | u8       |
| 9      | 4    | Weight          | f32 LE   |
| 13     | 8    | Timestamp nanos | u64 LE   |
```

### Module Structure

```
tidal/src/wal/
  format.rs   -- BatchHeader, EventRecord: encode/decode
  segment.rs  -- SegmentWriter, list_segments
  error.rs    -- WalError
```

### Public API Surface

```rust
// === format.rs ===

pub const MAGIC: [u8; 4] = [0x54, 0x49, 0x4C, 0x44]; // ASCII "TILD"
pub const HEADER_SIZE: usize = 64;
pub const EVENT_SIZE: usize = 21;
pub const FORMAT_VERSION: u8 = 1;

#[derive(Debug, Clone, PartialEq)]
pub struct BatchHeader {
    pub event_count: u16,
    pub first_seq: u64,
    pub batch_timestamp_nanos: u64,
    pub payload_len: u32,
    pub checksum: [u8; 32],
}

impl BatchHeader {
    pub fn encode(&self) -> [u8; HEADER_SIZE];
    pub fn decode(bytes: &[u8; HEADER_SIZE]) -> Result<Self, WalError>;
    pub fn compute_checksum(header_prefix: &[u8; 32], events: &[u8]) -> [u8; 32];
}

#[derive(Debug, Clone, PartialEq)]
pub struct EventRecord {
    pub entity_id: u64,
    pub signal_type: u8,
    pub weight: f32,
    pub timestamp_nanos: u64,
}

impl EventRecord {
    pub fn encode(&self) -> [u8; EVENT_SIZE];
    pub fn decode(bytes: &[u8; EVENT_SIZE]) -> Self;
}

// === segment.rs ===

pub struct SegmentWriter { /* file handle, current size, segment_size limit */ }

impl SegmentWriter {
    pub fn open(dir: &Path, first_seq: u64, segment_size: u64) -> Result<Self, WalError>;
    /// Append raw batch bytes. Returns true if segment is now full.
    pub fn append_batch(&mut self, bytes: &[u8]) -> Result<bool, WalError>;
    pub fn flush(&mut self) -> Result<(), WalError>;
    pub fn segment_size(&self) -> u64;
    pub fn current_size(&self) -> u64;
}

pub fn segment_path(dir: &Path, first_seq: u64) -> PathBuf;
pub fn list_segments(dir: &Path) -> Result<Vec<(u64, PathBuf)>, WalError>;
```

## Test Strategy

### Unit Tests

```rust
#[test]
fn batch_header_roundtrip() {
    let header = BatchHeader {
        event_count: 42,
        first_seq: 1000,
        batch_timestamp_nanos: 1_700_000_000_000_000_000,
        payload_len: 42 * 21,
        checksum: [0xAB; 32],
    };
    let encoded = header.encode();
    let decoded = BatchHeader::decode(&encoded).unwrap();
    assert_eq!(header, decoded);
}

#[test]
fn event_record_roundtrip() {
    let event = EventRecord {
        entity_id: 999,
        signal_type: 3,
        weight: 2.5,
        timestamp_nanos: 42_000_000_000,
    };
    let encoded = event.encode();
    let decoded = EventRecord::decode(&encoded);
    assert_eq!(decoded.entity_id, 999);
    assert_eq!(decoded.weight.to_bits(), 2.5_f32.to_bits());
}

#[test]
fn magic_bytes_in_header() {
    let header = BatchHeader {
        event_count: 1,
        first_seq: 1,
        batch_timestamp_nanos: 0,
        payload_len: 21,
        checksum: [0u8; 32],
    };
    let encoded = header.encode();
    assert_eq!(&encoded[0..4], &[0x54, 0x49, 0x4C, 0x44]);
}

#[test]
fn segment_naming_is_ordered() {
    let p1 = segment_path(Path::new("/tmp"), 1);
    let p2 = segment_path(Path::new("/tmp"), 1000);
    // Lexicographic order matches numeric order
    assert!(p1.file_name() < p2.file_name());
}

#[test]
fn list_segments_returns_sorted() {
    let dir = tempfile::tempdir().unwrap();
    // Create segment files out of order
    std::fs::write(segment_path(dir.path(), 200), b"").unwrap();
    std::fs::write(segment_path(dir.path(), 1), b"").unwrap();
    std::fs::write(segment_path(dir.path(), 100), b"").unwrap();
    let segments = list_segments(dir.path()).unwrap();
    assert_eq!(segments[0].0, 1);
    assert_eq!(segments[1].0, 100);
    assert_eq!(segments[2].0, 200);
}

#[test]
fn header_decode_rejects_wrong_magic() {
    let mut bytes = [0u8; 64];
    bytes[0] = 0xFF; // wrong magic
    assert!(BatchHeader::decode(&bytes).is_err());
}

#[test]
fn header_decode_rejects_wrong_version() {
    let mut bytes = [0u8; 64];
    bytes[0..4].copy_from_slice(&[0x54, 0x49, 0x4C, 0x44]); // correct magic
    bytes[4] = 99; // wrong version
    assert!(BatchHeader::decode(&bytes).is_err());
}
```

## Acceptance Criteria

- [x] `BatchHeader` encodes to exactly 64 bytes (compile-time assertion)
- [x] `EventRecord` encodes to exactly 21 bytes (compile-time assertion)
- [x] Magic bytes `0x54494C44` appear at bytes [0..4] of every encoded header
- [x] BLAKE3 checksum covers `header[0..32] || event_bytes` (excludes the hash field itself)
- [x] `BatchHeader::decode()` returns `WalError::Corruption` on wrong magic or unknown version
- [x] `EventRecord::encode`/`decode` roundtrip is lossless for all finite f32 weights
- [x] Segment files are named `wal-{seq:020}.seg`; `list_segments()` returns them sorted ascending
- [x] `SegmentWriter::append_batch()` writes raw bytes and returns `true` when the segment has exceeded its size limit
- [x] All encoding is little-endian, so there is no byte-swap cost on little-endian x86/ARM targets
- [x] `cargo clippy -- -D warnings` passes

## Research References

- [docs/research/tidaldb_wal.md](../../../research/tidaldb_wal.md): Section 1 (Approach 3: batch-oriented framing with wire format table), Section 5 (segment rotation at 16 MB, naming convention)

## Implementation Notes

- `payload_len` is always `event_count * 21`. The redundancy allows Phase 1 crash validation (check bounds before computing BLAKE3) without reading the event data.
- The hash field at `header[32..64]` is written AFTER computing the hash. The header can be encoded with a zeroed hash field first; only `header[0..32]` is then fed to the hasher, followed by the event bytes, so the hash input is exactly `header[0..32] || events`.
- `f32::to_bits()` / `f32::from_bits()` are used for weight encoding: safe, const, and exact. Never cast f32 to u32 via `as`, which converts the numeric value instead of reinterpreting the bits.
- Segment files do not need pre-allocation in m1p2. Defer `fallocate` until disk write performance is a measured bottleneck.
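
The byte offsets in the Wire Format tables can be sketched in std-only Rust as follows. This is a minimal illustration, not the project's actual `format.rs`: `encode_header_prefix` is a hypothetical helper that builds only the 32 checksummed header bytes, and the BLAKE3 call is deliberately left out so the sketch has no external dependencies.

```rust
pub const MAGIC: [u8; 4] = [0x54, 0x49, 0x4C, 0x44];
pub const FORMAT_VERSION: u8 = 1;
pub const EVENT_SIZE: usize = 21;

// Compile-time check that the field widths add up to the record size.
const _: () = assert!(EVENT_SIZE == 8 + 1 + 4 + 8);

#[derive(Debug, Clone, PartialEq)]
pub struct EventRecord {
    pub entity_id: u64,
    pub signal_type: u8,
    pub weight: f32,
    pub timestamp_nanos: u64,
}

impl EventRecord {
    /// Pack the record little-endian at offsets 0, 8, 9, and 13.
    pub fn encode(&self) -> [u8; EVENT_SIZE] {
        let mut out = [0u8; EVENT_SIZE];
        out[0..8].copy_from_slice(&self.entity_id.to_le_bytes());
        out[8] = self.signal_type;
        // to_bits() reinterprets the f32 bit pattern; `as u32` would truncate the value.
        out[9..13].copy_from_slice(&self.weight.to_bits().to_le_bytes());
        out[13..21].copy_from_slice(&self.timestamp_nanos.to_le_bytes());
        out
    }

    /// Inverse of `encode`; infallible because every 21-byte pattern is a valid record.
    pub fn decode(bytes: &[u8; EVENT_SIZE]) -> Self {
        let mut wide = [0u8; 8];
        let mut narrow = [0u8; 4];
        wide.copy_from_slice(&bytes[0..8]);
        let entity_id = u64::from_le_bytes(wide);
        narrow.copy_from_slice(&bytes[9..13]);
        let weight = f32::from_bits(u32::from_le_bytes(narrow));
        wide.copy_from_slice(&bytes[13..21]);
        let timestamp_nanos = u64::from_le_bytes(wide);
        Self { entity_id, signal_type: bytes[8], weight, timestamp_nanos }
    }
}

/// Hypothetical helper (not part of the API surface above): builds header
/// bytes [0..32], which is the portion of the header covered by the checksum.
pub fn encode_header_prefix(event_count: u16, first_seq: u64, batch_timestamp_nanos: u64) -> [u8; 32] {
    let mut p = [0u8; 32];
    p[0..4].copy_from_slice(&MAGIC);
    p[4] = FORMAT_VERSION;
    p[5] = 0; // flags: reserved
    p[6..8].copy_from_slice(&event_count.to_le_bytes());
    p[8..16].copy_from_slice(&first_seq.to_le_bytes());
    p[16..24].copy_from_slice(&batch_timestamp_nanos.to_le_bytes());
    // payload_len is redundant with event_count by design (count * 21).
    let payload_len = u32::from(event_count) * EVENT_SIZE as u32;
    p[24..28].copy_from_slice(&payload_len.to_le_bytes());
    // bytes 28..32 stay zero (reserved)
    p
}
```

A full header would append the 32-byte digest of `prefix || event_bytes` after this prefix; because the prefix is built separately, the hash field can never leak into its own input.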
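
The segment naming rule from the Requirements works because `u64::MAX` has exactly 20 decimal digits, so a 20-digit zero-padded name sorts the same way as the number it encodes. The sketch below assumes that rule and nothing else about the real `segment.rs`: `parse_segment_name` is a hypothetical helper, and the functions return `std::io::Result` instead of `WalError` to stay self-contained.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Build `wal-{first_seq:020}.seg` inside `dir`.
pub fn segment_path(dir: &Path, first_seq: u64) -> PathBuf {
    dir.join(format!("wal-{first_seq:020}.seg"))
}

/// Hypothetical helper: recover the first sequence number from a file name,
/// or None if the name does not match the segment pattern exactly.
pub fn parse_segment_name(name: &str) -> Option<u64> {
    let digits = name.strip_prefix("wal-")?.strip_suffix(".seg")?;
    if digits.len() != 20 || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse().ok()
}

/// List segment files in `dir`, sorted ascending by first sequence number.
/// Files that do not match the naming pattern are skipped (an assumption;
/// the real implementation may choose to report them instead).
pub fn list_segments(dir: &Path) -> io::Result<Vec<(u64, PathBuf)>> {
    let mut out = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let seq = path
            .file_name()
            .and_then(|n| n.to_str())
            .and_then(parse_segment_name);
        if let Some(seq) = seq {
            out.push((seq, path));
        }
    }
    // read_dir gives no ordering guarantee, so sort explicitly.
    out.sort_by_key(|(seq, _)| *seq);
    Ok(out)
}
```

Sorting on the parsed number instead of the file name keeps the result correct even if a stray file with a different padding width ends up in the directory.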
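
One possible shape for `SegmentWriter`, given the API surface above, is an append-only file handle plus a running byte count. Several details here are assumptions rather than documented behavior: the "full" test uses `>=`, `flush` is mapped to `File::sync_data`, errors are plain `std::io::Error`, and no pre-allocation is done (consistent with the note on deferring `fallocate`).

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

pub struct SegmentWriter {
    file: File,
    current_size: u64,
    segment_size: u64,
}

impl SegmentWriter {
    /// Open or create `wal-{first_seq:020}.seg` in `dir` for appending.
    pub fn open(dir: &Path, first_seq: u64, segment_size: u64) -> io::Result<Self> {
        let path = dir.join(format!("wal-{first_seq:020}.seg"));
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        // Resume from the existing length when reopening a segment.
        let current_size = file.metadata()?.len();
        Ok(Self { file, current_size, segment_size })
    }

    /// Append raw batch bytes; returns true once the segment has reached
    /// its size limit and the caller should rotate to a new file.
    pub fn append_batch(&mut self, bytes: &[u8]) -> io::Result<bool> {
        self.file.write_all(bytes)?;
        self.current_size += bytes.len() as u64;
        Ok(self.current_size >= self.segment_size)
    }

    /// Ask the OS to persist file contents (assumed meaning of `flush`).
    pub fn flush(&mut self) -> io::Result<()> {
        self.file.sync_data()
    }

    pub fn segment_size(&self) -> u64 {
        self.segment_size
    }

    pub fn current_size(&self) -> u64 {
        self.current_size
    }
}
```

Because the limit is checked after the write, a batch is never split across two segments; a segment can overshoot its nominal size by at most one batch.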