tidaldb/docs/planning/milestone-1/phase-2/task-01-wal-format-and-segment-files.md

# Task 01: WAL Wire Format and Segment Files

## Context

**Milestone:** 1 -- Signal Engine
**Phase:** m1p2 -- Write-Ahead Log
**Status:** COMPLETE
**Depends On:** None
**Blocks:** Task 02 (Group Commit Writer), Task 03 (Crash Recovery and Replay)
**Complexity:** M

## Objective

Define the on-disk binary format for WAL batches and event records, implement the segment file writer that manages 16 MB rotating files, and define the `WalError` type. This is the foundation everything else builds on — the format dictates how writers produce batches, how readers parse them, and how crash recovery validates them.

The key design decision (already resolved in `docs/research/tidaldb_wal.md`) is batch-oriented framing: frame entire batches rather than individual events. Each batch is a 64-byte, cache-line-aligned header carrying a BLAKE3 checksum, followed by tightly packed 21-byte event records. This matches the group-commit write path exactly and amortizes both checksum and fsync cost across the events of each batch (100 events per batch).

## Requirements

- `BatchHeader` is exactly 64 bytes (`#[repr(C)]`, compile-time assertion)
- Magic bytes `0x54494C44` (ASCII "TILD", stored in that byte order) at offset 0 for human-readable crash dumps
- BLAKE3 hash at bytes [32..64] covers `header[0..32] || all_event_bytes` — NOT the hash field itself
- `EventRecord` is exactly 21 bytes, little-endian throughout: entity_id (u64), signal_type (u8), weight (f32), timestamp_nanos (u64)
- `SegmentWriter` opens or creates a segment file and appends batches
- Segment files named `wal-{first_seq:020}.seg` — zero-padded 20-digit, lexicographic = numeric order
- `list_segments(dir)` returns `Vec<(first_seq, PathBuf)>` sorted by first sequence number
- `WalError` covers: `Io(std::io::Error)`, `Corruption(String)`, `Closed`, `SendFailed`, `ShutdownFailed`
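
The error type in the last requirement could be sketched as follows. Only the variant names and the two payload types come from the list above; the doc comments on `Closed`, `SendFailed`, and `ShutdownFailed`, the `Display` messages, and the `From<io::Error>` conversion are assumptions made for illustration.

```rust
use std::fmt;
use std::io;

/// Sketch of the WAL error type. Variant names follow the requirement list;
/// the per-variant meanings documented here are assumptions.
#[derive(Debug)]
pub enum WalError {
    /// Underlying filesystem or device error.
    Io(io::Error),
    /// A batch failed validation (for example bad magic or unknown version).
    Corruption(String),
    /// The WAL was used after it was closed (assumed meaning).
    Closed,
    /// Handing a batch to the writer failed (assumed meaning).
    SendFailed,
    /// The writer did not shut down cleanly (assumed meaning).
    ShutdownFailed,
}

impl fmt::Display for WalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            WalError::Io(e) => write!(f, "WAL I/O error: {}", e),
            WalError::Corruption(msg) => write!(f, "WAL corruption: {}", msg),
            WalError::Closed => write!(f, "WAL is closed"),
            WalError::SendFailed => write!(f, "failed to send to WAL writer"),
            WalError::ShutdownFailed => write!(f, "WAL writer shutdown failed"),
        }
    }
}

impl std::error::Error for WalError {
    /// Expose the wrapped I/O error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            WalError::Io(e) => Some(e),
            _ => None,
        }
    }
}

impl From<io::Error> for WalError {
    /// Lets `?` convert `std::io::Error` into `WalError::Io`.
    fn from(e: io::Error) -> Self {
        WalError::Io(e)
    }
}
```

With the `From` conversion in place, file operations inside `SegmentWriter` can use `?` directly and still return `Result<_, WalError>`.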

## Technical Design

### Wire Format

```
BATCH FRAME:
+==============================================================================+
| Offset | Size | Field               | Encoding              | Notes          |
+--------+------+---------------------+-----------------------+----------------+
| 0      | 4    | Magic               | [0x54,0x49,0x4C,0x44] | ASCII "TILD"   |
| 4      | 1    | Version             | u8                    | Currently 1    |
| 5      | 1    | Flags               | u8                    | Reserved (0)   |
| 6      | 2    | Event count         | u16 LE                | 1..=65535      |
| 8      | 8    | First sequence no.  | u64 LE                | Monotonic      |
| 16     | 8    | Batch timestamp     | u64 LE                | Nanos epoch    |
| 24     | 4    | Payload byte length | u32 LE                | count * 21     |
| 28     | 4    | Reserved            | [0u8; 4]              | Future use     |
| 32     | 32   | BLAKE3 checksum     | [u8; 32]              | See below      |
+--------+------+---------------------+-----------------------+----------------+
| 64     | N*21 | Event records       | packed structs        |                |
+==============================================================================+

BLAKE3 INPUT: blake3(header[0..32] || event_bytes[..])
(hash covers magic through reserved; the hash field [32..64] is excluded)

EVENT RECORD (21 bytes each, tightly packed):
| Offset | Size | Field           | Encoding  |
|--------|------|-----------------|-----------|
| 0      | 8    | Entity ID       | u64 LE    |
| 8      | 1    | Signal type     | u8        |
| 9      | 4    | Weight          | f32 LE    |
| 13     | 8    | Timestamp nanos | u64 LE    |
```
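
As a rough illustration of the layout above, the following sketch encodes event records and assembles a full batch frame at the documented offsets. The free functions `encode_event`, `encode_header_prefix`, and `encode_frame` are hypothetical names, not the project's API, and the `hash` parameter stands in for BLAKE3 so that the sketch needs only the standard library.

```rust
pub const MAGIC: [u8; 4] = [0x54, 0x49, 0x4C, 0x44];
pub const HEADER_SIZE: usize = 64;
pub const EVENT_SIZE: usize = 21;
pub const FORMAT_VERSION: u8 = 1;

// Compile-time checks that the field sizes in the tables add up.
const _: () = assert!(4 + 1 + 1 + 2 + 8 + 8 + 4 + 4 + 32 == HEADER_SIZE);
const _: () = assert!(8 + 1 + 4 + 8 == EVENT_SIZE);

/// Encode one 21-byte event record, little-endian throughout.
pub fn encode_event(entity_id: u64, signal_type: u8, weight: f32, timestamp_nanos: u64) -> [u8; EVENT_SIZE] {
    let mut out = [0u8; EVENT_SIZE];
    out[0..8].copy_from_slice(&entity_id.to_le_bytes());
    out[8] = signal_type;
    // Store the exact IEEE 754 bit pattern of the weight.
    out[9..13].copy_from_slice(&weight.to_bits().to_le_bytes());
    out[13..21].copy_from_slice(&timestamp_nanos.to_le_bytes());
    out
}

/// Build header bytes [0..32]: magic through reserved.
pub fn encode_header_prefix(event_count: u16, first_seq: u64, batch_timestamp_nanos: u64) -> [u8; 32] {
    let mut p = [0u8; 32];
    p[0..4].copy_from_slice(&MAGIC);
    p[4] = FORMAT_VERSION;
    p[5] = 0; // flags: reserved
    p[6..8].copy_from_slice(&event_count.to_le_bytes());
    p[8..16].copy_from_slice(&first_seq.to_le_bytes());
    p[16..24].copy_from_slice(&batch_timestamp_nanos.to_le_bytes());
    let payload_len = u32::from(event_count) * EVENT_SIZE as u32;
    p[24..28].copy_from_slice(&payload_len.to_le_bytes());
    // p[28..32] stays zero (reserved).
    p
}

/// Assemble header + events. `hash` is a stand-in for BLAKE3 (assumption).
pub fn encode_frame(
    first_seq: u64,
    batch_timestamp_nanos: u64,
    events: &[[u8; EVENT_SIZE]],
    hash: impl Fn(&[u8]) -> [u8; 32],
) -> Vec<u8> {
    assert!(!events.is_empty() && events.len() <= u16::MAX as usize, "a batch holds 1..=65535 events");
    let prefix = encode_header_prefix(events.len() as u16, first_seq, batch_timestamp_nanos);

    // Hash input is header[0..32] || event bytes; the hash field is excluded.
    let mut input = Vec::with_capacity(32 + events.len() * EVENT_SIZE);
    input.extend_from_slice(&prefix);
    for ev in events {
        input.extend_from_slice(ev);
    }
    let checksum = hash(&input);

    // Final frame: prefix, checksum at [32..64], then the packed events.
    let mut frame = Vec::with_capacity(HEADER_SIZE + events.len() * EVENT_SIZE);
    frame.extend_from_slice(&prefix);
    frame.extend_from_slice(&checksum);
    frame.extend_from_slice(&input[32..]);
    frame
}
```

With the `blake3` crate as a dependency, the closure passed as `hash` could be `|bytes| *blake3::hash(bytes).as_bytes()`. Taking the hash as a parameter is only a convenience of this sketch.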

### Module Structure

```
tidal/src/wal/
  format.rs   -- BatchHeader, EventRecord: encode/decode
  segment.rs  -- SegmentWriter, list_segments
  error.rs    -- WalError
```

### Public API Surface

```rust
// === format.rs ===

pub const MAGIC: [u8; 4] = [0x54, 0x49, 0x4C, 0x44]; // ASCII "TILD"
pub const HEADER_SIZE: usize = 64;
pub const EVENT_SIZE: usize = 21;
pub const FORMAT_VERSION: u8 = 1;

#[derive(Debug, Clone, PartialEq)]
pub struct BatchHeader {
    pub event_count: u16,
    pub first_seq: u64,
    pub batch_timestamp_nanos: u64,
    pub payload_len: u32,
    pub checksum: [u8; 32],
}

impl BatchHeader {
    pub fn encode(&self) -> [u8; HEADER_SIZE];
    pub fn decode(bytes: &[u8; HEADER_SIZE]) -> Result<Self, WalError>;
    pub fn compute_checksum(header_prefix: &[u8; 32], events: &[u8]) -> [u8; 32];
}

#[derive(Debug, Clone, PartialEq)]
pub struct EventRecord {
    pub entity_id: u64,
    pub signal_type: u8,
    pub weight: f32,
    pub timestamp_nanos: u64,
}

impl EventRecord {
    pub fn encode(&self) -> [u8; EVENT_SIZE];
    pub fn decode(bytes: &[u8; EVENT_SIZE]) -> Self;
}

// === segment.rs ===

pub struct SegmentWriter { /* file handle, current size, segment_size limit */ }

impl SegmentWriter {
    pub fn open(dir: &Path, first_seq: u64, segment_size: u64) -> Result<Self, WalError>;
    /// Append raw batch bytes. Returns true if segment is now full.
    pub fn append_batch(&mut self, bytes: &[u8]) -> Result<bool, WalError>;
    pub fn flush(&mut self) -> Result<(), WalError>;
    pub fn segment_size(&self) -> u64;
    pub fn current_size(&self) -> u64;
}

pub fn segment_path(dir: &Path, first_seq: u64) -> PathBuf;
pub fn list_segments(dir: &Path) -> Result<Vec<(u64, PathBuf)>, WalError>;
```
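
One possible shape for the two free functions in `segment.rs` is sketched below. It assumes that files not matching `wal-{first_seq:020}.seg` are silently skipped, and it returns `std::io::Result` instead of `Result<_, WalError>` to stay self-contained; the helper `parse_segment_name` is hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Build `dir/wal-{first_seq:020}.seg`. Twenty digits fit any u64.
pub fn segment_path(dir: &Path, first_seq: u64) -> PathBuf {
    dir.join(format!("wal-{:020}.seg", first_seq))
}

/// Extract the first sequence number from a segment file name
/// (hypothetical helper). Returns None for anything else.
fn parse_segment_name(name: &str) -> Option<u64> {
    let digits = name.strip_prefix("wal-")?.strip_suffix(".seg")?;
    if digits.len() != 20 || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse().ok()
}

/// List segments in `dir`, sorted ascending by first sequence number.
/// Files that are not segments are ignored (assumption of this sketch).
pub fn list_segments(dir: &Path) -> io::Result<Vec<(u64, PathBuf)>> {
    let mut segments = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let name = entry.file_name();
        if let Some(seq) = name.to_str().and_then(parse_segment_name) {
            segments.push((seq, entry.path()));
        }
    }
    // read_dir order is platform-dependent, so sort explicitly.
    segments.sort_by_key(|(seq, _)| *seq);
    Ok(segments)
}
```

Sorting on the parsed number rather than the file name keeps the result correct even if the directory listing order changes, while the zero padding keeps a plain `ls` output in the same order.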

## Test Strategy

### Unit Tests

```rust
#[test]
fn batch_header_roundtrip() {
    let header = BatchHeader {
        event_count: 42,
        first_seq: 1000,
        batch_timestamp_nanos: 1_700_000_000_000_000_000,
        payload_len: 42 * 21,
        checksum: [0xAB; 32],
    };
    let encoded = header.encode();
    let decoded = BatchHeader::decode(&encoded).unwrap();
    assert_eq!(header, decoded);
}

#[test]
fn event_record_roundtrip() {
    let event = EventRecord { entity_id: 999, signal_type: 3, weight: 2.5, timestamp_nanos: 42_000_000_000 };
    let encoded = event.encode();
    let decoded = EventRecord::decode(&encoded);
    assert_eq!(decoded.entity_id, 999);
    assert_eq!(decoded.weight.to_bits(), 2.5_f32.to_bits());
}

#[test]
fn magic_bytes_in_header() {
    let header = BatchHeader { event_count: 1, first_seq: 1, batch_timestamp_nanos: 0, payload_len: 21, checksum: [0u8; 32] };
    let encoded = header.encode();
    assert_eq!(&encoded[0..4], &[0x54, 0x49, 0x4C, 0x44]);
}

#[test]
fn segment_naming_is_ordered() {
    // 9 vs 10 only sorts correctly with zero padding:
    // unpadded, "wal-9.seg" would sort after "wal-10.seg".
    let p1 = segment_path(Path::new("/tmp"), 9);
    let p2 = segment_path(Path::new("/tmp"), 10);
    // Lexicographic order matches numeric order
    assert!(p1.file_name() < p2.file_name());
}

#[test]
fn list_segments_returns_sorted() {
    let dir = tempfile::tempdir().unwrap();
    // Create segment files out of order
    std::fs::write(segment_path(dir.path(), 200), b"").unwrap();
    std::fs::write(segment_path(dir.path(), 1), b"").unwrap();
    std::fs::write(segment_path(dir.path(), 100), b"").unwrap();
    let segments = list_segments(dir.path()).unwrap();
    assert_eq!(segments[0].0, 1);
    assert_eq!(segments[1].0, 100);
    assert_eq!(segments[2].0, 200);
}

#[test]
fn header_decode_rejects_wrong_magic() {
    let mut bytes = [0u8; 64];
    bytes[0] = 0xFF; // wrong magic
    assert!(BatchHeader::decode(&bytes).is_err());
}

#[test]
fn header_decode_rejects_wrong_version() {
    let mut bytes = [0u8; 64];
    bytes[0..4].copy_from_slice(&[0x54, 0x49, 0x4C, 0x44]); // correct magic
    bytes[4] = 99; // wrong version
    assert!(BatchHeader::decode(&bytes).is_err());
}
```

## Acceptance Criteria

- [x] `BatchHeader` encodes to exactly 64 bytes (compile-time assertion)
- [x] `EventRecord` encodes to exactly 21 bytes (compile-time assertion)
- [x] Magic bytes `0x54494C44` appear at bytes [0..4] of every encoded header
- [x] BLAKE3 checksum covers `header[0..32] || event_bytes` (excludes the hash field itself)
- [x] `BatchHeader::decode()` returns `WalError::Corruption` on wrong magic or unknown version
- [x] `EventRecord::encode`/`decode` roundtrip is lossless for all finite f32 weights
- [x] Segment files are named `wal-{seq:020}.seg`; `list_segments()` returns them sorted ascending
- [x] `SegmentWriter::append_batch()` writes raw bytes and returns `true` when the segment has exceeded its size limit
- [x] All little-endian encoding — no byte-swap cost on x86/ARM
- [x] `cargo clippy -- -D warnings` passes
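
The `append_batch()` criterion above could be met by something like the following minimal sketch. Several details are assumptions: it opens the file in append mode, treats the segment as full once the size reaches or passes the limit, implements `flush` as `sync_data`, and returns `std::io::Result` instead of `Result<_, WalError>`.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

pub struct SegmentWriter {
    file: File,
    current_size: u64,
    segment_size: u64,
}

impl SegmentWriter {
    /// Open or create `dir/wal-{first_seq:020}.seg` for appending.
    pub fn open(dir: &Path, first_seq: u64, segment_size: u64) -> io::Result<Self> {
        let path = dir.join(format!("wal-{:020}.seg", first_seq));
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        // Resume from the existing length when the file already exists.
        let current_size = file.metadata()?.len();
        Ok(Self { file, current_size, segment_size })
    }

    /// Append raw batch bytes; returns true when the caller should rotate.
    pub fn append_batch(&mut self, bytes: &[u8]) -> io::Result<bool> {
        self.file.write_all(bytes)?;
        self.current_size += bytes.len() as u64;
        // Boundary choice (>=) is an assumption of this sketch.
        Ok(self.current_size >= self.segment_size)
    }

    /// Force written data to stable storage (assumed meaning of flush).
    pub fn flush(&mut self) -> io::Result<()> {
        self.file.sync_data()
    }

    pub fn segment_size(&self) -> u64 {
        self.segment_size
    }

    pub fn current_size(&self) -> u64 {
        self.current_size
    }
}
```

Because the check runs after the write, a batch is never split across segments; the final batch may push a segment slightly past its nominal size.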

## Research References

- [docs/research/tidaldb_wal.md](../../../research/tidaldb_wal.md) — Section 1 (Approach 3: batch-oriented framing with wire format table), Section 5 (segment rotation at 16 MB, naming convention)

## Implementation Notes

- `payload_len` is always `event_count * 21`. The redundancy allows Phase 1 crash validation (check bounds before computing BLAKE3) without reading the event data.
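- The hash field at `header[32..64]` is written AFTER computing the hash. The hash input is exactly `header[0..32] || events`; the hash field is left out of the input rather than zeroed and included, since hashing a zeroed 64-byte header would produce a different digest.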
- `f32::to_bits()` / `f32::from_bits()` are used for weight encoding: safe, const, and exact. Never cast f32 to u32 via `as`, which performs a numeric conversion (`2.5_f32 as u32 == 2`) rather than preserving the bit pattern.
- Segment files do not need pre-allocation in m1p2. Defer `fallocate` until disk write performance is a measured bottleneck.
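
The first two notes can be illustrated with a small sketch: a cheap structural check that runs before any hashing, and the bit-exact weight conversion. The names `precheck_header`, `weight_to_wire`, and `weight_from_wire` are hypothetical, and errors are plain `String`s here rather than `WalError::Corruption`.

```rust
pub const MAGIC: [u8; 4] = [0x54, 0x49, 0x4C, 0x44];
pub const HEADER_SIZE: usize = 64;
pub const EVENT_SIZE: usize = 21;
pub const FORMAT_VERSION: u8 = 1;

/// Validate a header without touching the event bytes.
/// `bytes_after_header` is how much of the segment remains past the header.
/// Returns the payload length to read and hash next.
pub fn precheck_header(header: &[u8; HEADER_SIZE], bytes_after_header: u64) -> Result<usize, String> {
    if header[0..4] != MAGIC {
        return Err("bad magic".into());
    }
    if header[4] != FORMAT_VERSION {
        return Err(format!("unknown version {}", header[4]));
    }
    let event_count = u16::from_le_bytes([header[6], header[7]]);
    let payload_len = u32::from_le_bytes([header[24], header[25], header[26], header[27]]);
    if event_count == 0 {
        return Err("empty batch".into());
    }
    // The redundant length must agree with the event count.
    if payload_len != u32::from(event_count) * EVENT_SIZE as u32 {
        return Err("payload_len does not match event_count * 21".into());
    }
    // A torn write shows up as a payload that runs past the end of the file.
    if u64::from(payload_len) > bytes_after_header {
        return Err("batch truncated".into());
    }
    Ok(payload_len as usize)
}

/// Weight to wire bytes: exact bit pattern, little-endian.
pub fn weight_to_wire(w: f32) -> [u8; 4] {
    w.to_bits().to_le_bytes()
}

/// Wire bytes back to weight, with no rounding or conversion.
pub fn weight_from_wire(b: [u8; 4]) -> f32 {
    f32::from_bits(u32::from_le_bytes(b))
}
```

Only batches that pass this check need the more expensive BLAKE3 verification over `header[0..32] || events`.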