tidaldb/docs/planning/milestone-8/phase-2/OVERVIEW.md
jordan f4cfd6c81f feat: complete M8 replication primitives + forage enhancements + docs
Milestone 8 (phases 1-4):
- Shard-aware WAL segment naming, BatchHeader v2, ShardRouter
- Transport trait, InProcessTransport, WalShipper, FollowerDb
- HLC, PNCounter, LWWRegister, CrdtSignalState, ReconciliationEngine
- Session replication bridge with SeqNo/HWM, idempotency store

Forage application:
- Multi-source discovery engine with MAB exploration
- Embedding-based label system, server handlers, UI refresh

Other:
- QUICKSTART.md, README.md, milestone-8 planning docs
- Hard negative union semantics, RLHF export enhancements
- Recovery benchmark and visibility test expansions
- Split 8 oversized source files per CODING_GUIDELINES §9

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 13:17:19 -07:00


# m8p2: WAL Shipping and Follower Replay
## Delivers
One-way WAL replication from leader to followers. The leader ships sealed WAL
segments over an abstract transport trait. Followers receive segments, validate
checksums, and replay them idempotently through the existing signal ledger
`apply_wal_event()` path. A replication lag metric is emitted. A follower can
serve read queries (RETRIEVE, SEARCH) with bounded staleness.
This is the "read replicas" capability -- the foundation for multi-region deployment.
Deliverables:
- `Transport` trait: `async fn send_segment(peer: ShardId, segment: &WalSegmentPayload)` and `async fn recv_segment() -> WalSegmentPayload`
- `InProcessTransport`: for testing, uses `tokio::sync::mpsc` channels between co-located instances
- `WalShipper`: background task on leader that watches for sealed segments, ships them to registered followers
- `SegmentReceiver`: background task on follower that receives segments, validates BLAKE3, replays events
- `ReplicationLagGauge`: tracks the delta between leader's latest seqno and each follower's applied seqno
- `FollowerDb`: a `TidalDb` variant that does not accept writes, only replays segments; serves read queries from its local state
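
As a rough illustration of how the `Transport` trait and `InProcessTransport` above might fit together, here is a minimal sketch using only the standard library. It is deliberately simplified: the methods are blocking rather than `async`, and `std::sync::mpsc::sync_channel` stands in for the bounded `tokio::sync::mpsc` channel. The fields of `WalSegmentPayload`, the `u32` newtype for `ShardId`, the `TransportError` enum, and the `pair` constructor are all assumptions made for illustration, not the real definitions.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Hypothetical shard identifier; the real `ShardId` comes from Phase 8.1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ShardId(pub u32);

/// Hypothetical payload layout: a sealed segment's seqno range and raw bytes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WalSegmentPayload {
    pub first_seq: u64,
    pub last_seq: u64,
    pub bytes: Vec<u8>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum TransportError {
    /// The other end of the channel was dropped.
    Disconnected,
}

/// Blocking stand-in for the async `Transport` trait.
pub trait Transport {
    fn send_segment(&self, peer: ShardId, segment: &WalSegmentPayload)
        -> Result<(), TransportError>;
    fn recv_segment(&self) -> Result<WalSegmentPayload, TransportError>;
}

/// One endpoint of an in-process link between two co-located instances.
pub struct InProcessTransport {
    tx: SyncSender<WalSegmentPayload>,
    rx: Receiver<WalSegmentPayload>,
}

impl InProcessTransport {
    /// Builds two connected endpoints; each direction buffers up to
    /// `capacity` segments before `send_segment` blocks.
    pub fn pair(capacity: usize) -> (Self, Self) {
        let (a_tx, b_rx) = sync_channel(capacity);
        let (b_tx, a_rx) = sync_channel(capacity);
        (Self { tx: a_tx, rx: a_rx }, Self { tx: b_tx, rx: b_rx })
    }
}

impl Transport for InProcessTransport {
    fn send_segment(&self, _peer: ShardId, segment: &WalSegmentPayload)
        -> Result<(), TransportError> {
        // With one peer per link, the peer id is unused in this sketch.
        self.tx.send(segment.clone()).map_err(|_| TransportError::Disconnected)
    }

    fn recv_segment(&self) -> Result<WalSegmentPayload, TransportError> {
        self.rx.recv().map_err(|_| TransportError::Disconnected)
    }
}
```

The bounded channel gives natural backpressure: a leader that outruns its follower blocks on `send_segment` instead of buffering segments without limit.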
## Dependencies
- **Requires:** Phase 8.1 (ShardId, RegionId, WalSegmentId, BatchHeader v2, ReplicationState)
- **Files modified:**
  - `tidal/src/wal/segment.rs` -- `sealed_segments_since(seqno)` helper
  - `tidal/src/db/open.rs` -- support `NodeRole::Follower` startup
  - `tidal/src/db/mod.rs` -- `TidalDb::is_follower()` guard on write paths
  - `tidal/src/signals/ledger/mod.rs` -- ensure `apply_wal_event()` is idempotent when replaying duplicate segments
- **Files created:**
  - `tidal/src/replication/transport.rs` -- `Transport` trait, `WalSegmentPayload`
  - `tidal/src/replication/in_process.rs` -- `InProcessTransport`
  - `tidal/src/replication/shipper.rs` -- `WalShipper`
  - `tidal/src/replication/receiver.rs` -- `SegmentReceiver`
  - `tidal/src/replication/lag.rs` -- `ReplicationLagGauge`
## Research References
- `docs/research/tidaldb_wal.md` -- Segment sealing, batch checksum validation
- `thoughts.md` -- Part V.5 (quarantine-first ingestion; WAL is source of truth)
## Acceptance Criteria (Phase Level)
- [ ] `Transport` trait has `send_segment` and `recv_segment` async methods; `InProcessTransport` implements them via bounded mpsc channels
- [ ] `WalShipper` runs as a background `tokio::task`; polls for newly sealed segments every 2 seconds (configurable); ships segments to all registered followers in parallel
- [ ] `SegmentReceiver` validates BLAKE3 checksum of each received segment before replay; rejects corrupted segments with `WalError::Corruption`
- [ ] Follower replay is idempotent: replaying a segment whose `first_seq` is <= the follower's high-water-mark (`applied_seqno`) is a no-op (no duplicate signal counting)
- [ ] `ReplicationLagGauge` reports `leader_seqno - follower_applied_seqno` per follower; accessible via `MetricsState`
- [ ] Leader writes 1,000 signals -> follower replays all 1,000 -> `read_decay_score` on follower matches leader to 6 decimal places (analytical equivalence)
- [ ] Follower rejects write operations (`db.signal()`, `db.write_item()`) with `TidalError::ReadOnly`
- [ ] Replication lag converges to 0 within 5 seconds after leader quiesces (in-process transport)
- [ ] Leader crash and restart: follower continues serving reads from last replayed state; leader resumes shipping from last sealed segment
- [ ] `FollowerDb` serves `db.retrieve()` and `db.search()` queries against its local replayed state
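
The lag criterion above (`leader_seqno - follower_applied_seqno` per follower) could be tracked along these lines. This is a sketch under assumptions: the method names, the `u32` follower key, and the use of a plain `HashMap` rather than whatever `MetricsState` actually exposes are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical gauge: the leader's latest seqno plus each follower's
/// applied seqno, keyed by an illustrative numeric follower id.
#[derive(Debug, Default)]
pub struct ReplicationLagGauge {
    leader_seqno: u64,
    applied: HashMap<u32, u64>,
}

impl ReplicationLagGauge {
    /// Records the leader's latest seqno; never moves backwards.
    pub fn record_leader(&mut self, seqno: u64) {
        self.leader_seqno = self.leader_seqno.max(seqno);
    }

    /// Records a follower's applied seqno; never moves backwards.
    pub fn record_applied(&mut self, follower: u32, seqno: u64) {
        let entry = self.applied.entry(follower).or_insert(0);
        *entry = (*entry).max(seqno);
    }

    /// Lag in seqnos, or `None` if the follower has never reported.
    pub fn lag(&self, follower: u32) -> Option<u64> {
        self.applied
            .get(&follower)
            // saturating_sub avoids underflow if a follower report
            // arrives before the matching leader update.
            .map(|applied| self.leader_seqno.saturating_sub(*applied))
    }
}
```

Under this model, "lag converges to 0" means `lag(follower)` returns `Some(0)` once the follower has applied the leader's last sealed seqno.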
## Task Execution Order
```
Task 01: Transport Trait ──────┐
                               ├──> Task 03: WalShipper
Task 02: InProcessTransport ───┘             │
                                             v
                                    Task 04: SegmentReceiver
                                             v
                                    Task 05: FollowerDb
                                             v
                                    Task 06: ReplicationLagGauge
                                             v
                                    Task 07: Integration Tests
```
Tasks 01 and 02 are parallelizable. Task 03 requires Task 01. Tasks 04-07 are sequential.
## Module Location
| File | Status | Contains |
|------|--------|----------|
| `tidal/src/replication/transport.rs` | NEW | `Transport` trait, `WalSegmentPayload` |
| `tidal/src/replication/in_process.rs` | NEW | `InProcessTransport` (channel-based) |
| `tidal/src/replication/shipper.rs` | NEW | `WalShipper` background task |
| `tidal/src/replication/receiver.rs` | NEW | `SegmentReceiver` with checksum validation and replay |
| `tidal/src/replication/lag.rs` | NEW | `ReplicationLagGauge` |
| `tidal/src/wal/segment.rs` | MODIFIED | `sealed_segments_since(seqno)` |
| `tidal/src/db/open.rs` | MODIFIED | Follower startup path |
| `tidal/src/db/mod.rs` | MODIFIED | Write-rejection guard for followers |
| `tidal/src/signals/ledger/mod.rs` | MODIFIED | Idempotency guard on `apply_wal_event` |
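
The write-rejection guard listed for `tidal/src/db/mod.rs` might look roughly like the sketch below. Only `NodeRole::Follower`, `TidalDb::is_follower()`, `db.signal()`, and `TidalError::ReadOnly` are named in this plan; the `Leader` variant, the `open` signature, the `signal` arguments, `ensure_writable`, `replay_signal`, and the toy `Vec` storage are assumptions for illustration.

```rust
/// Hypothetical role flag; the plan names `NodeRole::Follower` only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeRole {
    Leader,
    Follower,
}

/// Only the variant named in the acceptance criteria is modeled here.
#[derive(Debug, PartialEq, Eq)]
pub enum TidalError {
    ReadOnly,
}

/// Toy stand-in for `TidalDb`: it just records (item_id, weight) pairs.
pub struct TidalDb {
    role: NodeRole,
    signals: Vec<(u64, f32)>,
}

impl TidalDb {
    pub fn open(role: NodeRole) -> Self {
        Self { role, signals: Vec::new() }
    }

    pub fn is_follower(&self) -> bool {
        self.role == NodeRole::Follower
    }

    /// Shared guard, called at the top of every public write path.
    fn ensure_writable(&self) -> Result<(), TidalError> {
        if self.is_follower() {
            Err(TidalError::ReadOnly)
        } else {
            Ok(())
        }
    }

    /// Public write path: rejected on followers.
    pub fn signal(&mut self, item_id: u64, weight: f32) -> Result<(), TidalError> {
        self.ensure_writable()?;
        self.signals.push((item_id, weight));
        Ok(())
    }

    /// Replay path used by the segment receiver; bypasses the guard so a
    /// follower can still apply events shipped from the leader.
    pub fn replay_signal(&mut self, item_id: u64, weight: f32) {
        self.signals.push((item_id, weight));
    }

    pub fn signal_count(&self) -> usize {
        self.signals.len()
    }
}
```

Keeping the guard in one private helper means a newly added write method cannot skip the check by accident, as long as it calls `ensure_writable` first.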
## Notes
### In-process transport only in this phase
A TCP/gRPC transport is deferred to Phase 8.5. The `Transport` trait is async to support both in-process channels and future network transports.
### Idempotency via seqno
Followers track their high-water-mark `applied_seqno`. Segments with `first_seq <= applied_seqno` are skipped entirely. This reuses the existing checkpoint format from M1.
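
A minimal sketch of the skip rule described above, assuming seqnos start at 1, segments arrive whole, and a segment never straddles the high-water-mark. The `Segment` and `FollowerState` types and their fields are hypothetical; the real replay goes through `apply_wal_event()`.

```rust
/// Hypothetical view of a received segment: its seqno range and events.
pub struct Segment {
    pub first_seq: u64,
    pub last_seq: u64,
    pub events: Vec<u64>, // illustrative: one item id per signal event
}

/// Hypothetical follower-side replay state.
#[derive(Debug, Default)]
pub struct FollowerState {
    /// High-water-mark: the last seqno that has been applied (0 = none).
    pub applied_seqno: u64,
    /// Stand-in for ledger state: total signals counted so far.
    pub signal_count: u64,
}

impl FollowerState {
    /// Replays a segment unless it was already applied.
    /// Returns `true` if events were applied, `false` if skipped.
    pub fn replay(&mut self, seg: &Segment) -> bool {
        // Skip rule from the text: first_seq <= applied_seqno means the
        // segment is a duplicate, so it is skipped entirely.
        if seg.first_seq <= self.applied_seqno {
            return false;
        }
        for _event in &seg.events {
            // The real path would call apply_wal_event() here.
            self.signal_count += 1;
        }
        self.applied_seqno = seg.last_seq;
        true
    }
}
```

Delivering the same segment twice leaves `signal_count` unchanged, which is the "no duplicate signal counting" property the acceptance criteria ask for.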
### Timer-based segment sealing
The existing `WalHandle` seals segments when they reach `max_size`. For replication, we add a timer-based seal: every `wal_ship_interval` (default 2s), the active segment is sealed even if not full. This bounds replication lag.
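
The combined size-or-timer seal decision can be sketched as a small pure function. The `SealPolicy` type and `should_seal` method are hypothetical names, and skipping empty segments is an assumption of this sketch rather than something the plan specifies.

```rust
use std::time::Duration;

/// Hypothetical seal policy combining the size and timer triggers.
pub struct SealPolicy {
    /// Size threshold in bytes (the existing `max_size` trigger).
    pub max_size: u64,
    /// Timer threshold (the new `wal_ship_interval`, default 2s per the text).
    pub wal_ship_interval: Duration,
}

impl SealPolicy {
    /// Decides whether the active segment should be sealed now.
    pub fn should_seal(&self, active_bytes: u64, since_last_seal: Duration) -> bool {
        // Assumption: an empty active segment is never sealed, so an idle
        // leader does not produce a stream of empty segments.
        if active_bytes == 0 {
            return false;
        }
        active_bytes >= self.max_size || since_last_seal >= self.wal_ship_interval
    }
}
```

With a 2-second interval, a write that lands just after a seal waits at most about one interval before its segment becomes eligible for shipping, which is what bounds the lag.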
### No Raft, no consensus
This is primary-backup replication. One leader, N followers. Promotion is manual or triggered by the control plane (Phase 8.5).
## Done When
A developer can start a leader and a follower using `InProcessTransport`, write 10,000 signals to the leader, observe the follower replay all events with lag < 5 seconds, and execute `db.retrieve()` on the follower with results matching the leader's state (modulo staleness of up to 1 batch).