# Task 03: BatchHeader v2

## Delivers

Extend `BatchHeader` in `tidal/src/wal/format/batch.rs` to the v2 format by adding `shard_id` and `region_id` fields at bytes 60-63, and update encode/decode to match. v1 headers stay readable: their zero padding at those bytes decodes as shard 0, region 0. Bumps `FORMAT_VERSION` to 2.

## Complexity: S

## Dependencies

- Task 01 (`ShardId`, `RegionId` types)

## Technical Design

The existing `BatchHeader` is 64 bytes. The current layout (from the WAL research doc):

```
Bytes  0-3:   MAGIC (0x54494441 = "TIDA")
Bytes  4-7:   FORMAT_VERSION (u32 LE)
Bytes  8-15:  first_seq (u64 LE)
Bytes 16-23:  last_seq (u64 LE)
Bytes 24-31:  event_count (u64 LE)
Bytes 32-39:  uncompressed_size (u64 LE)
Bytes 40-47:  compressed_size (u64 LE)
Bytes 48-55:  timestamp_ns (u64 LE)
Bytes 56-59:  checksum (u32 LE)    <- BLAKE3 first 4 bytes
Bytes 60-61:  [RESERVED / ZERO]
Bytes 62-63:  [RESERVED / ZERO]
```

v2 adds `shard_id` and `region_id` in the bytes that v1 left as zero padding:

```
Bytes 56-59:  checksum (u32 LE)
Bytes 60-61:  shard_id (u16 LE)    <- NEW in v2 (was zero padding in v1)
Bytes 62-63:  region_id (u16 LE)   <- NEW in v2 (was zero padding in v1)
```

This is backward compatible: v1 writers always wrote zeros at bytes 60-63, so v2 code reading a v1 segment decodes `shard_id = 0` and `region_id = 0` without any version-specific branch.

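The compatibility argument can be checked with a minimal sketch that touches only the magic, the version, and the last four bytes. `header_bytes` and `decode_ids` are hypothetical helpers written for this illustration, not part of the WAL module, and the error type is simplified to `String`.

```rust
/// Magic value from the header layout ("TIDA").
const MAGIC: u32 = 0x5449_4441;

/// Builds only the header bytes that matter here: magic, version, and the
/// trailing four bytes. `ids` is `None` for a v1 writer, which leaves
/// bytes 60-63 as zero padding.
fn header_bytes(version: u32, ids: Option<(u16, u16)>) -> [u8; 64] {
    let mut buf = [0u8; 64];
    buf[0..4].copy_from_slice(&MAGIC.to_le_bytes());
    buf[4..8].copy_from_slice(&version.to_le_bytes());
    if let Some((shard, region)) = ids {
        buf[60..62].copy_from_slice(&shard.to_le_bytes());
        buf[62..64].copy_from_slice(&region.to_le_bytes());
    }
    buf
}

/// Reads (shard_id, region_id) the same way for v1 and v2 headers.
/// Unknown versions are rejected before the trailing bytes are trusted.
fn decode_ids(buf: &[u8; 64]) -> Result<(u16, u16), String> {
    let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    if magic != MAGIC {
        return Err("bad magic".to_string());
    }
    let version = u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]);
    if version != 1 && version != 2 {
        return Err(format!("unknown version {version}"));
    }
    let shard = u16::from_le_bytes([buf[60], buf[61]]);
    let region = u16::from_le_bytes([buf[62], buf[63]]);
    Ok((shard, region))
}
```

Under these assumptions, `decode_ids(&header_bytes(1, None))` yields `Ok((0, 0))`, while a v2 buffer built with `Some((7, 3))` yields `Ok((7, 3))`.
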
```rust
// tidal/src/wal/format/batch.rs
//
// `MAGIC` and `WalError` are assumed to be defined already in the WAL module;
// `ShardId` and `RegionId` come from Task 01.

pub const FORMAT_VERSION_V1: u32 = 1;
pub const FORMAT_VERSION_V2: u32 = 2;
pub const FORMAT_VERSION: u32 = FORMAT_VERSION_V2;

#[derive(Debug, Clone, PartialEq)]
pub struct BatchHeader {
    pub first_seq: u64,
    pub last_seq: u64,
    pub event_count: u64,
    pub uncompressed_size: u64,
    pub compressed_size: u64,
    pub timestamp_ns: u64,
    pub checksum: u32,
    // v2 fields -- default to 0 for single-node deployments
    pub shard_id: ShardId,
    pub region_id: RegionId,
}

impl BatchHeader {
    /// Encode to the 64-byte wire format.
    ///
    /// Always writes the current `FORMAT_VERSION` (v2).
    pub fn encode(&self) -> [u8; 64] {
        let mut buf = [0u8; 64];
        buf[0..4].copy_from_slice(&MAGIC.to_le_bytes());
        buf[4..8].copy_from_slice(&FORMAT_VERSION.to_le_bytes());
        buf[8..16].copy_from_slice(&self.first_seq.to_le_bytes());
        buf[16..24].copy_from_slice(&self.last_seq.to_le_bytes());
        buf[24..32].copy_from_slice(&self.event_count.to_le_bytes());
        buf[32..40].copy_from_slice(&self.uncompressed_size.to_le_bytes());
        buf[40..48].copy_from_slice(&self.compressed_size.to_le_bytes());
        buf[48..56].copy_from_slice(&self.timestamp_ns.to_le_bytes());
        buf[56..60].copy_from_slice(&self.checksum.to_le_bytes());
        buf[60..62].copy_from_slice(&self.shard_id.0.to_le_bytes());
        buf[62..64].copy_from_slice(&self.region_id.0.to_le_bytes());
        buf
    }

    /// Decode from a 64-byte buffer.
    ///
    /// Accepts both v1 (shard_id=0, region_id=0) and v2 format.
    pub fn decode(buf: &[u8; 64]) -> Result<Self, WalError> {
        let magic = u32::from_le_bytes(buf[0..4].try_into().unwrap());
        if magic != MAGIC {
            return Err(WalError::Corruption("bad magic".into()));
        }
        let version = u32::from_le_bytes(buf[4..8].try_into().unwrap());
        if version != FORMAT_VERSION_V1 && version != FORMAT_VERSION_V2 {
            return Err(WalError::Corruption(format!("unknown version {version}")));
        }

        // v1 headers carry zeros here, so no version branch is needed.
        let shard_id = ShardId(u16::from_le_bytes(buf[60..62].try_into().unwrap()));
        let region_id = RegionId(u16::from_le_bytes(buf[62..64].try_into().unwrap()));

        Ok(Self {
            first_seq: u64::from_le_bytes(buf[8..16].try_into().unwrap()),
            last_seq: u64::from_le_bytes(buf[16..24].try_into().unwrap()),
            event_count: u64::from_le_bytes(buf[24..32].try_into().unwrap()),
            uncompressed_size: u64::from_le_bytes(buf[32..40].try_into().unwrap()),
            compressed_size: u64::from_le_bytes(buf[40..48].try_into().unwrap()),
            timestamp_ns: u64::from_le_bytes(buf[48..56].try_into().unwrap()),
            checksum: u32::from_le_bytes(buf[56..60].try_into().unwrap()),
            shard_id,
            region_id,
        })
    }
}
```

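The slice bounds in `encode` and `decode` repeat the layout as bare numbers, so an offset typo would go unnoticed until a test fails. One possible guard, sketched here under the assumption that it fits the module's style, is to name each field's position once and check at compile time that the fields tile the 64 bytes exactly. `Field`, the constant names, and the helper functions are all hypothetical; only the offsets and widths come from the layout in this document.

```rust
use std::ops::Range;

/// One fixed-width header field: byte offset plus length.
#[derive(Clone, Copy)]
pub struct Field {
    pub offset: usize,
    pub len: usize,
}

impl Field {
    /// Byte range of this field inside the header buffer.
    pub const fn range(self) -> Range<usize> {
        self.offset..self.offset + self.len
    }
}

pub const HEADER_LEN: usize = 64;

pub const MAGIC_FIELD: Field = Field { offset: 0, len: 4 };
pub const VERSION: Field = Field { offset: 4, len: 4 };
pub const FIRST_SEQ: Field = Field { offset: 8, len: 8 };
pub const LAST_SEQ: Field = Field { offset: 16, len: 8 };
pub const EVENT_COUNT: Field = Field { offset: 24, len: 8 };
pub const UNCOMPRESSED_SIZE: Field = Field { offset: 32, len: 8 };
pub const COMPRESSED_SIZE: Field = Field { offset: 40, len: 8 };
pub const TIMESTAMP_NS: Field = Field { offset: 48, len: 8 };
pub const CHECKSUM: Field = Field { offset: 56, len: 4 };
pub const SHARD_ID: Field = Field { offset: 60, len: 2 };
pub const REGION_ID: Field = Field { offset: 62, len: 2 };

const FIELDS: [Field; 11] = [
    MAGIC_FIELD, VERSION, FIRST_SEQ, LAST_SEQ, EVENT_COUNT, UNCOMPRESSED_SIZE,
    COMPRESSED_SIZE, TIMESTAMP_NS, CHECKSUM, SHARD_ID, REGION_ID,
];

/// True when the fields cover bytes 0..HEADER_LEN with no gap or overlap.
pub const fn layout_is_contiguous() -> bool {
    let mut expected = 0;
    let mut i = 0;
    while i < FIELDS.len() {
        if FIELDS[i].offset != expected {
            return false;
        }
        expected += FIELDS[i].len;
        i += 1;
    }
    expected == HEADER_LEN
}

// Fails the build if a field is moved or resized inconsistently.
const _: () = assert!(layout_is_contiguous());

/// Writes the shard id (little-endian) at its named position.
pub fn write_shard_id(buf: &mut [u8; HEADER_LEN], shard: u16) {
    buf[SHARD_ID.range()].copy_from_slice(&shard.to_le_bytes());
}

/// Reads the shard id (little-endian) from its named position.
pub fn read_shard_id(buf: &[u8; HEADER_LEN]) -> u16 {
    let r = SHARD_ID.range();
    u16::from_le_bytes([buf[r.start], buf[r.start + 1]])
}
```

Whether the extra indirection is worth it for an 11-field header is a judgment call; the literal ranges in the design above are equally valid as long as the tests pin them down.
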
## Acceptance Criteria

- [ ] `BatchHeader` has `shard_id: ShardId` and `region_id: RegionId` fields
- [ ] `BatchHeader::encode()` writes shard_id at bytes 60-61 (LE) and region_id at bytes 62-63 (LE)
- [ ] `BatchHeader::decode()` reads these bytes; v1 batches (zeros at 60-63) decode as `ShardId(0)`, `RegionId(0)`
- [ ] `FORMAT_VERSION` is bumped to 2; the v2 reader accepts both v1 and v2 version values
- [ ] Property test: encode + decode roundtrips for random shard_id, region_id values
- [ ] Property test: a buffer created with v1 code (shard bytes zeroed) decodes correctly
- [ ] All existing WAL tests pass (write/read/recovery) -- single-node uses shard=0, region=0 by default
- [ ] `cargo clippy -- -D warnings` and `cargo fmt` pass

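The roundtrip criterion could be exercised along the following lines. This is a dependency-free sketch, not the project's test: `Header` is a simplified stand-in for `BatchHeader` (plain integers instead of `ShardId`/`RegionId`, `Option` instead of `WalError`), and a small xorshift generator stands in for whatever property-testing crate the repository actually uses.

```rust
const MAGIC: u32 = 0x5449_4441;
const FORMAT_VERSION: u32 = 2;

/// Stand-in header. `words` holds the six u64 fields in wire order:
/// first_seq, last_seq, event_count, uncompressed_size, compressed_size, timestamp_ns.
#[derive(Debug, Clone, PartialEq)]
struct Header {
    words: [u64; 6],
    checksum: u32,
    shard_id: u16,
    region_id: u16,
}

fn encode(h: &Header) -> [u8; 64] {
    let mut buf = [0u8; 64];
    buf[0..4].copy_from_slice(&MAGIC.to_le_bytes());
    buf[4..8].copy_from_slice(&FORMAT_VERSION.to_le_bytes());
    for (i, w) in h.words.iter().enumerate() {
        let at = 8 + i * 8;
        buf[at..at + 8].copy_from_slice(&w.to_le_bytes());
    }
    buf[56..60].copy_from_slice(&h.checksum.to_le_bytes());
    buf[60..62].copy_from_slice(&h.shard_id.to_le_bytes());
    buf[62..64].copy_from_slice(&h.region_id.to_le_bytes());
    buf
}

/// Returns `None` on bad magic or an unknown version.
fn decode(buf: &[u8; 64]) -> Option<Header> {
    let magic = u32::from_le_bytes(buf[0..4].try_into().ok()?);
    let version = u32::from_le_bytes(buf[4..8].try_into().ok()?);
    if magic != MAGIC || !(version == 1 || version == 2) {
        return None;
    }
    let mut words = [0u64; 6];
    for (i, w) in words.iter_mut().enumerate() {
        let at = 8 + i * 8;
        *w = u64::from_le_bytes(buf[at..at + 8].try_into().ok()?);
    }
    Some(Header {
        words,
        checksum: u32::from_le_bytes(buf[56..60].try_into().ok()?),
        shard_id: u16::from_le_bytes(buf[60..62].try_into().ok()?),
        region_id: u16::from_le_bytes(buf[62..64].try_into().ok()?),
    })
}

/// xorshift64: a tiny deterministic PRNG. The state must be nonzero.
fn next(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

fn random_header(state: &mut u64) -> Header {
    let mut words = [0u64; 6];
    for w in words.iter_mut() {
        *w = next(state);
    }
    Header {
        words,
        checksum: next(state) as u32,
        shard_id: next(state) as u16,
        region_id: next(state) as u16,
    }
}

/// Panics on the first header that does not survive encode + decode;
/// otherwise returns the number of cases checked.
fn check_roundtrip(cases: usize, seed: u64) -> usize {
    let mut state = seed;
    for _ in 0..cases {
        let h = random_header(&mut state);
        assert_eq!(decode(&encode(&h)), Some(h));
    }
    cases
}
```

A call such as `check_roundtrip(10_000, 0x9E37_79B9_7F4A_7C15)` covers the random-values criterion. The v1 criterion needs a separate case, because `encode` here always writes version 2.
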