# Task 06: ReplicationLagGauge

## Delivers

`ReplicationLagGauge` in `tidal/src/replication/lag.rs`, tracking per-follower lag (`leader_seqno - follower_applied_seqno`). It is exposed via `MetricsState` so that existing Prometheus scraping picks it up automatically.

## Complexity: S

## Dependencies

- Phase 8.1 (ReplicationState)
- Task 03 (WalShipper -- for leader_seqno)

## Technical Design

```rust
// tidal/src/replication/lag.rs

use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

use dashmap::DashMap;
// `ShardId` and `ReplicationState` come from existing crate modules
// (Phase 8.1); import them from wherever they live in the crate.

/// Tracks per-follower replication lag.
///
/// Lag = leader's latest shipped seqno - follower's applied seqno.
/// A lag of 0 means the follower is fully caught up.
#[derive(Debug, Default)]
pub struct ReplicationLagGauge {
    /// Per-follower: last seqno the leader has shipped.
    leader_seqno: DashMap<ShardId, AtomicU64>,
    /// Per-follower: last seqno the follower has applied.
    follower_applied: Arc<ReplicationState>,
}

impl ReplicationLagGauge {
    pub fn new(replication_state: Arc<ReplicationState>) -> Self {
        Self {
            leader_seqno: DashMap::new(),
            follower_applied: replication_state,
        }
    }

    /// Update the leader's known shipped seqno for a follower.
    pub fn update_leader_seqno(&self, follower: ShardId, seqno: u64) {
        self.leader_seqno
            .entry(follower)
            .or_insert_with(|| AtomicU64::new(0))
            .store(seqno, Ordering::Release);
    }

    /// Get the current lag for a follower in seqno units.
    ///
    /// A follower with no recorded leader or applied seqno is treated as 0.
    pub fn lag_seqno(&self, follower: ShardId) -> i64 {
        let leader = self
            .leader_seqno
            .get(&follower)
            .map(|a| a.load(Ordering::Acquire))
            .unwrap_or(0);
        let applied = self
            .follower_applied
            .applied_seqno(follower)
            .unwrap_or(0);
        leader as i64 - applied as i64
    }

    /// Collect Prometheus-style gauge values for all followers.
    pub fn collect_metrics(&self) -> Vec<(ShardId, i64)> {
        self.leader_seqno
            .iter()
            .map(|entry| {
                let follower = *entry.key();
                (follower, self.lag_seqno(follower))
            })
            .collect()
    }
}
```

### MetricsState integration

```rust
// tidal/src/db/metrics.rs (existing metrics module)

impl MetricsState {
    // Add to existing collect() method:
    pub fn replication_lag_seqno(&self, follower_shard: u16) -> i64 {
        // Reports 0 when no lag gauge is configured.
        self.lag_gauge
            .as_ref()
            .map(|g| g.lag_seqno(ShardId(follower_shard)))
            .unwrap_or(0)
    }
}
```

## Acceptance Criteria

- [ ] `ReplicationLagGauge::lag_seqno(follower)` returns `leader_seqno - follower_applied_seqno`
- [ ] `lag_seqno` returns 0 when the follower is fully caught up
- [ ] `lag_seqno` returns a value > 0 when the follower is behind
- [ ] `collect_metrics()` returns a snapshot of all follower lags
- [ ] Integrated into `MetricsState` so the existing `/metrics` endpoint exposes a `replication_lag_seqno` gauge
- [ ] Integration test: leader writes 100 segments; before the follower applies them, lag = 100; after apply, lag = 0
- [ ] `cargo clippy -- -D warnings` and `cargo fmt` pass
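
The integration-test criterion above could be exercised along the following lines. This is a minimal, std-only sketch and not the crate's actual test. `ShardId` and `ReplicationState` here are hypothetical stand-ins for the real types, `set_applied` is an invented helper, the gauge uses `RwLock<HashMap<..>>` in place of `DashMap` so the example compiles without external crates, and it assumes each shipped segment advances the seqno by exactly one.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the crate's `ShardId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ShardId(pub u16);

/// Hypothetical stand-in for `ReplicationState` (Phase 8.1):
/// records the last applied seqno per follower.
#[derive(Debug, Default)]
pub struct ReplicationState {
    applied: RwLock<HashMap<ShardId, u64>>,
}

impl ReplicationState {
    /// Invented helper: record that `follower` has applied up to `seqno`.
    pub fn set_applied(&self, follower: ShardId, seqno: u64) {
        self.applied.write().unwrap().insert(follower, seqno);
    }

    pub fn applied_seqno(&self, follower: ShardId) -> Option<u64> {
        self.applied.read().unwrap().get(&follower).copied()
    }
}

/// Std-only variant of the gauge; same lag arithmetic as the design above.
#[derive(Debug)]
pub struct ReplicationLagGauge {
    leader_seqno: RwLock<HashMap<ShardId, AtomicU64>>,
    follower_applied: Arc<ReplicationState>,
}

impl ReplicationLagGauge {
    pub fn new(replication_state: Arc<ReplicationState>) -> Self {
        Self {
            leader_seqno: RwLock::new(HashMap::new()),
            follower_applied: replication_state,
        }
    }

    pub fn update_leader_seqno(&self, follower: ShardId, seqno: u64) {
        self.leader_seqno
            .write()
            .unwrap()
            .entry(follower)
            .or_insert_with(|| AtomicU64::new(0))
            .store(seqno, Ordering::Release);
    }

    pub fn lag_seqno(&self, follower: ShardId) -> i64 {
        let leader = self
            .leader_seqno
            .read()
            .unwrap()
            .get(&follower)
            .map(|a| a.load(Ordering::Acquire))
            .unwrap_or(0);
        let applied = self.follower_applied.applied_seqno(follower).unwrap_or(0);
        leader as i64 - applied as i64
    }
}

/// Ships `n` segments to one follower and returns the lag observed
/// (before the follower applies them, after it applies all of them).
pub fn lag_before_and_after_apply(n: u64) -> (i64, i64) {
    let state = Arc::new(ReplicationState::default());
    let gauge = ReplicationLagGauge::new(Arc::clone(&state));
    let follower = ShardId(1);

    for seqno in 1..=n {
        gauge.update_leader_seqno(follower, seqno);
    }
    let before = gauge.lag_seqno(follower);

    state.set_applied(follower, n);
    let after = gauge.lag_seqno(follower);

    (before, after)
}
```

Calling `lag_before_and_after_apply(100)` mirrors the criterion: the first value is the lag while the follower has applied nothing, the second is the lag once it has applied everything shipped. A real integration test would drive `WalShipper` and the follower apply path instead of setting seqnos by hand.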