# Task 03: BatchHeader v2

## Delivers

Extend `BatchHeader` in `tidal/src/wal/format/batch.rs` to the v2 format, which adds `shard_id` and `region_id` fields at bytes 60-63; update encode/decode; ensure v1 backward compatibility (zeros decode as shard 0, region 0). Bumps `FORMAT_VERSION` to 2.

## Complexity: S

## Dependencies

- Task 01 (ShardId, RegionId types)

## Technical Design

The existing `BatchHeader` is 64 bytes. The current layout (from WAL research doc):

```
Bytes 0-3:   MAGIC (0x54494441 = "TIDA")
Bytes 4-7:   FORMAT_VERSION (u32 LE)
Bytes 8-15:  first_seq (u64 LE)
Bytes 16-23: last_seq (u64 LE)
Bytes 24-31: event_count (u64 LE)
Bytes 32-39: uncompressed_size (u64 LE)
Bytes 40-47: compressed_size (u64 LE)
Bytes 48-55: timestamp_ns (u64 LE)
Bytes 56-59: checksum (u32 LE)   <- BLAKE3 first 4 bytes
Bytes 60-61: [RESERVED / ZERO]
Bytes 62-63: [RESERVED / ZERO]
```

v2 places `shard_id` and `region_id` in the four bytes that v1 left as zero padding:

```
Bytes 56-59: checksum (u32 LE)
Bytes 60-61: shard_id (u16 LE)   <- NEW in v2 (was zero padding in v1)
Bytes 62-63: region_id (u16 LE)  <- NEW in v2 (was zero padding in v1)
```

This is backward compatible: v1 always wrote zeros at bytes 60-63, so v2 code reading a v1 segment decodes shard_id=0 and region_id=0 without any version-specific branch.

```rust
// tidal/src/wal/format/batch.rs
//
// `ShardId` and `RegionId` come from Task 01; `MAGIC` and `WalError` are the
// existing definitions already used by this module.

pub const FORMAT_VERSION_V1: u32 = 1;
pub const FORMAT_VERSION_V2: u32 = 2;
pub const FORMAT_VERSION: u32 = FORMAT_VERSION_V2;

#[derive(Debug, Clone, PartialEq)]
pub struct BatchHeader {
    pub first_seq: u64,
    pub last_seq: u64,
    pub event_count: u64,
    pub uncompressed_size: u64,
    pub compressed_size: u64,
    pub timestamp_ns: u64,
    pub checksum: u32,
    // v2 fields -- default to 0 for single-node deployments
    pub shard_id: ShardId,
    pub region_id: RegionId,
}

impl BatchHeader {
    /// Encode to the 64-byte wire format.
    ///
    /// Always writes the current `FORMAT_VERSION` (v2).
    pub fn encode(&self) -> [u8; 64] {
        let mut buf = [0u8; 64];
        buf[0..4].copy_from_slice(&MAGIC.to_le_bytes());
        buf[4..8].copy_from_slice(&FORMAT_VERSION.to_le_bytes());
        buf[8..16].copy_from_slice(&self.first_seq.to_le_bytes());
        buf[16..24].copy_from_slice(&self.last_seq.to_le_bytes());
        buf[24..32].copy_from_slice(&self.event_count.to_le_bytes());
        buf[32..40].copy_from_slice(&self.uncompressed_size.to_le_bytes());
        buf[40..48].copy_from_slice(&self.compressed_size.to_le_bytes());
        buf[48..56].copy_from_slice(&self.timestamp_ns.to_le_bytes());
        buf[56..60].copy_from_slice(&self.checksum.to_le_bytes());
        buf[60..62].copy_from_slice(&self.shard_id.0.to_le_bytes());
        buf[62..64].copy_from_slice(&self.region_id.0.to_le_bytes());
        buf
    }

    /// Decode from a 64-byte buffer.
    ///
    /// Accepts both v1 (shard_id=0, region_id=0) and v2 format.
    pub fn decode(buf: &[u8; 64]) -> Result<Self, WalError> {
        let magic = u32::from_le_bytes(buf[0..4].try_into().unwrap());
        if magic != MAGIC {
            return Err(WalError::Corruption("bad magic".into()));
        }
        let version = u32::from_le_bytes(buf[4..8].try_into().unwrap());
        if version != FORMAT_VERSION_V1 && version != FORMAT_VERSION_V2 {
            return Err(WalError::Corruption(format!("unknown version {version}")));
        }
        // v1 headers carry zeros here, so no per-version branch is needed.
        let shard_id = ShardId(u16::from_le_bytes(buf[60..62].try_into().unwrap()));
        let region_id = RegionId(u16::from_le_bytes(buf[62..64].try_into().unwrap()));
        Ok(Self {
            first_seq: u64::from_le_bytes(buf[8..16].try_into().unwrap()),
            last_seq: u64::from_le_bytes(buf[16..24].try_into().unwrap()),
            event_count: u64::from_le_bytes(buf[24..32].try_into().unwrap()),
            uncompressed_size: u64::from_le_bytes(buf[32..40].try_into().unwrap()),
            compressed_size: u64::from_le_bytes(buf[40..48].try_into().unwrap()),
            timestamp_ns: u64::from_le_bytes(buf[48..56].try_into().unwrap()),
            checksum: u32::from_le_bytes(buf[56..60].try_into().unwrap()),
            shard_id,
            region_id,
        })
    }
}
```

## Acceptance Criteria

- [ ] `BatchHeader` has `shard_id: ShardId` and `region_id: RegionId` fields
- [ ] `BatchHeader::encode()` writes shard_id at bytes 60-61 (LE) and region_id at bytes 62-63 (LE)
- [ ] `BatchHeader::decode()` reads these bytes; v1 batches (zeros at 60-63) decode as `ShardId(0)`, `RegionId(0)`
- [ ] `FORMAT_VERSION` is bumped to 2; the v2 reader (`decode`) accepts both v1 and v2 version values and rejects anything else
- [ ] Property test: encode + decode roundtrips for random shard_id, region_id values
- [ ] Property test: a buffer created with v1 code (shard bytes zeroed) decodes correctly
- [ ] All existing WAL tests pass (write/read/recovery) -- single-node uses shard=0, region=0 by default
- [ ] `cargo clippy -- -D warnings` and `cargo fmt` pass
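
The two roundtrip criteria can be illustrated with a small standalone sketch. It is not the real test suite: `ShardId` and `RegionId` below are hypothetical stand-ins for the Task 01 types, the helper names (`put_routing`, `get_routing`, `v1_style_header`, `roundtrip_holds`) are invented for illustration, and a fixed set of boundary values stands in for the random inputs a property-testing crate would generate. Only the byte offsets and the magic/version values are taken from the layout in this task.

```rust
// Hypothetical stand-ins for the Task 01 newtypes; the real definitions
// live elsewhere in the crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ShardId(pub u16);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RegionId(pub u16);

pub const HEADER_LEN: usize = 64;

/// Write the v2 routing fields into bytes 60-63 of a header buffer.
pub fn put_routing(buf: &mut [u8; HEADER_LEN], shard: ShardId, region: RegionId) {
    buf[60..62].copy_from_slice(&shard.0.to_le_bytes());
    buf[62..64].copy_from_slice(&region.0.to_le_bytes());
}

/// Read the routing fields back. A v1 header has zeros in these bytes.
pub fn get_routing(buf: &[u8; HEADER_LEN]) -> (ShardId, RegionId) {
    let shard = u16::from_le_bytes([buf[60], buf[61]]);
    let region = u16::from_le_bytes([buf[62], buf[63]]);
    (ShardId(shard), RegionId(region))
}

/// Build a header the way v1 code would have: version 1, bytes 60-63 zero.
/// The sequence, size, and checksum fields are left zero for brevity.
pub fn v1_style_header() -> [u8; HEADER_LEN] {
    let mut buf = [0u8; HEADER_LEN];
    buf[0..4].copy_from_slice(&0x5449_4441u32.to_le_bytes()); // MAGIC
    buf[4..8].copy_from_slice(&1u32.to_le_bytes()); // FORMAT_VERSION_V1
    buf
}

/// Check that every (shard, region) pair drawn from a few boundary values
/// survives a write followed by a read.
pub fn roundtrip_holds() -> bool {
    let samples = [0u16, 1, 255, 256, 0x1234, u16::MAX];
    samples.iter().all(|&s| {
        samples.iter().all(|&r| {
            let mut buf = v1_style_header();
            put_routing(&mut buf, ShardId(s), RegionId(r));
            get_routing(&buf) == (ShardId(s), RegionId(r))
        })
    })
}
```

In the real tests the same two checks would go through `BatchHeader::encode()` and `BatchHeader::decode()` directly; the sketch isolates bytes 60-63 only so that it compiles without the rest of the WAL module.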