# Task 02: ShardRouter

## Delivers

`ShardRouter` with an `EntityIdRange` type, range-based and hash-based routing, validation that ranges partition the full u64 space, and property tests for deterministic routing. The `ShardRouter` maps any `EntityId` to exactly one `ShardId` and is the single source of truth for shard assignment.

## Complexity: M

## Dependencies

- Task 01 (ShardId, RegionId types)

## Technical Design

```rust
// tidal/src/replication/shard.rs

use crate::EntityId;
// `ShardId` is provided by Task 01.

/// A contiguous, half-open range of EntityIds: [start, end).
///
/// Used to define shard boundaries in range-based routing.
#[derive(Debug, Clone, Copy, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
pub struct EntityIdRange {
    pub start: u64, // inclusive
    pub end: u64,   // exclusive; u64::MAX means "includes the last entity"
}

impl EntityIdRange {
    /// Returns true if `id` falls in this range.
    ///
    /// `end == u64::MAX` is treated as inclusive so that `u64::MAX` itself
    /// belongs to the last range, as the field comment above states.
    pub fn contains(&self, id: u64) -> bool {
        id >= self.start && (id < self.end || self.end == u64::MAX)
    }

    /// The full u64 space (single-shard default).
    pub fn full() -> Self {
        Self { start: 0, end: u64::MAX }
    }
}

/// Routing strategy for entity-to-shard mapping.
#[derive(Debug, Clone)]
pub enum RoutingStrategy {
    /// All entities route to the default single shard.
    /// Used for single-node deployments (shard_id=0).
    Single,
    /// Hash-based routing: `hash(entity_id) % num_shards`.
    /// Uniform distribution; no explicit range boundaries.
    Hash { num_shards: u16 },
    /// Range-based routing: each shard owns a contiguous range of EntityIds.
    /// Production deployments use this for controlled data placement.
    Range(Vec<(ShardId, EntityIdRange)>),
}

/// Routes EntityIds to ShardIds.
///
/// Immutable after construction, so it is safe to share across threads.
/// `Clone` copies the strategy (including the range table); wrap the router
/// in an `Arc` if clones must be cheap.
#[derive(Debug, Clone)]
pub struct ShardRouter {
    strategy: RoutingStrategy,
}

impl ShardRouter {
    /// Create a single-node router (always returns ShardId(0)).
    pub fn single() -> Self {
        Self { strategy: RoutingStrategy::Single }
    }

    /// Create a hash-based router with `num_shards` shards.
    pub fn hash(num_shards: u16) -> Result<Self, RouterError> {
        if num_shards == 0 {
            return Err(RouterError::ZeroShards);
        }
        Ok(Self { strategy: RoutingStrategy::Hash { num_shards } })
    }

    /// Create a range-based router from a list of (ShardId, EntityIdRange) pairs.
    ///
    /// Validates that:
    /// - Ranges are non-overlapping
    /// - Ranges cover the full u64 space (no gaps)
    /// - ShardIds are unique
    pub fn range(ranges: Vec<(ShardId, EntityIdRange)>) -> Result<Self, RouterError> {
        Self::validate_ranges(&ranges)?;
        Ok(Self { strategy: RoutingStrategy::Range(ranges) })
    }

    /// Route an EntityId to its owning ShardId.
    ///
    /// Always returns exactly one shard. Never panics.
    pub fn route(&self, entity_id: EntityId) -> ShardId {
        let id = entity_id.as_u64();
        match &self.strategy {
            RoutingStrategy::Single => ShardId::SINGLE,
            RoutingStrategy::Hash { num_shards } => {
                // FNV-1a hash for uniform distribution without dependencies.
                let hash = fnv1a_hash(id);
                // Truncate to the low 16 bits, then reduce modulo the shard count.
                ShardId((hash as u16) % *num_shards)
            }
            RoutingStrategy::Range(ranges) => {
                for (shard_id, range) in ranges {
                    if range.contains(id) {
                        return *shard_id;
                    }
                }
                // Invariant: validated at construction time that ranges cover
                // the full space, so this is unreachable.
                ShardId::SINGLE
            }
        }
    }

    /// Returns all ShardIds known to this router.
    pub fn all_shards(&self) -> Vec<ShardId> {
        match &self.strategy {
            RoutingStrategy::Single => vec![ShardId::SINGLE],
            RoutingStrategy::Hash { num_shards } => {
                (0..*num_shards).map(ShardId).collect()
            }
            RoutingStrategy::Range(ranges) => {
                let mut shards: Vec<_> = ranges.iter().map(|(s, _)| *s).collect();
                shards.sort();
                shards.dedup();
                shards
            }
        }
    }

    fn validate_ranges(ranges: &[(ShardId, EntityIdRange)]) -> Result<(), RouterError> {
        if ranges.is_empty() {
            return Err(RouterError::EmptyRanges);
        }

        // Sort by start position to check coverage and overlap.
        let mut sorted: Vec<_> = ranges.iter().collect();
        sorted.sort_by_key(|(_, r)| r.start);

        // Check no gaps and no overlaps.
        let mut expected_start = 0u64;
        for (_, range) in &sorted {
            // A start above `expected_start` is a gap; a start below it is an
            // overlap. Both are reported as `RouterError::Gap`.
            if range.start != expected_start {
                return Err(RouterError::Gap {
                    expected: expected_start,
                    found: range.start,
                });
            }
            if range.end <= range.start {
                return Err(RouterError::EmptyRange { start: range.start });
            }
            expected_start = range.end;
        }

        // Check coverage of full space.
        if expected_start != u64::MAX {
            return Err(RouterError::IncompleteCoverage { ends_at: expected_start });
        }

        // Enforce the uniqueness guarantee documented on `ShardRouter::range`.
        // The `DuplicateShard` variant is an addition to the original error set.
        let mut ids: Vec<ShardId> = ranges.iter().map(|(id, _)| *id).collect();
        ids.sort();
        ids.dedup();
        if ids.len() != ranges.len() {
            return Err(RouterError::DuplicateShard);
        }

        Ok(())
    }
}

/// 64-bit FNV-1a over the little-endian bytes of `value`.
#[inline]
fn fnv1a_hash(value: u64) -> u64 {
    const FNV_OFFSET: u64 = 14_695_981_039_346_656_037;
    const FNV_PRIME: u64 = 1_099_511_628_211;

    let mut hash = FNV_OFFSET;
    for byte in value.to_le_bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

#[derive(Debug, thiserror::Error)]
pub enum RouterError {
    #[error("shard count must be > 0")]
    ZeroShards,
    #[error("range list is empty")]
    EmptyRanges,
    #[error("gap in range: expected start {expected}, found {found}")]
    Gap { expected: u64, found: u64 },
    #[error("empty range starting at {start}")]
    EmptyRange { start: u64 },
    #[error("ranges don't cover full u64 space: ends at {ends_at}")]
    IncompleteCoverage { ends_at: u64 },
    #[error("duplicate shard id in range list")]
    DuplicateShard,
}
```

## Acceptance Criteria

- [ ] `ShardRouter::single()` always returns `ShardId(0)` for any input
- [ ] `ShardRouter::hash(n)` distributes entities uniformly; property test with 10K IDs shows max deviation < 15% from expected bucket size
- [ ] `ShardRouter::range(ranges)` returns the correct shard at range boundaries; property test with 10K random IDs within each range
- [ ] `RouterError::Gap` when ranges have a gap; `RouterError::IncompleteCoverage` when ranges don't reach u64::MAX
- [ ] `ShardRouter::all_shards()` returns all shards for each routing strategy
- [ ] Routing is a pure function: same input always returns same output (property test with proptest)
- [ ] `cargo clippy -- -D warnings` and `cargo fmt --check` pass

## Test Strategy

```rust
#[cfg(test)]
mod tests {
    use super::*;
    use proptest::prelude::*;

    #[test]
    fn single_router_always_returns_shard_zero() {
        let router = ShardRouter::single();
        for id in [0u64, 1, 100, u64::MAX - 1, u64::MAX] {
            assert_eq!(router.route(EntityId::from(id)), ShardId(0));
        }
    }

    #[test]
    fn range_router_validates_gap() {
        let result = ShardRouter::range(vec![
            (ShardId(0), EntityIdRange { start: 0, end: 1000 }),
            (ShardId(1), EntityIdRange { start: 2000, end: u64::MAX }),
        ]);
        assert!(matches!(result, Err(RouterError::Gap { .. })));
    }

    proptest! {
        #[test]
        fn hash_routing_is_deterministic(id in any::<u64>()) {
            let router = ShardRouter::hash(5).unwrap();
            let entity = EntityId::from(id);
            prop_assert_eq!(router.route(entity), router.route(entity));
        }

        #[test]
        fn hash_routing_stays_in_range(id in any::<u64>()) {
            let router = ShardRouter::hash(5).unwrap();
            let shard = router.route(EntityId::from(id));
            prop_assert!(shard.0 < 5);
        }
    }
}
```
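
The acceptance criteria call for correct routing at range boundaries, which the tests above do not yet exercise. The following is a minimal standalone sketch of such a check, not the task's implementation. It assumes local stand-ins for `ShardId` and `EntityIdRange`, and the names `route_sorted` and `example_table` are hypothetical. It also swaps the linear scan in `route` for a binary search via the standard `slice::partition_point`, which only works if the table is kept sorted by `start`.

```rust
/// Stand-in for the `ShardId` from Task 01 (assumed here to wrap a `u16`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ShardId(pub u16);

/// Half-open range `[start, end)`; `end == u64::MAX` also covers `u64::MAX`.
#[derive(Debug, Clone, Copy)]
pub struct EntityIdRange {
    pub start: u64,
    pub end: u64,
}

/// Hypothetical lookup over a table sorted by `start`.
///
/// Precondition (what range validation is meant to guarantee): the first
/// range starts at 0 and the ranges are contiguous up to `u64::MAX`.
pub fn route_sorted(table: &[(ShardId, EntityIdRange)], id: u64) -> ShardId {
    // Count the ranges that start at or before `id`; the last of them owns it.
    let idx = table.partition_point(|(_, r)| r.start <= id);
    let (shard, range) = table[idx - 1];
    debug_assert!(id < range.end || range.end == u64::MAX);
    shard
}

/// Three-shard example table used for the boundary checks.
pub fn example_table() -> Vec<(ShardId, EntityIdRange)> {
    vec![
        (ShardId(0), EntityIdRange { start: 0, end: 1_000 }),
        (ShardId(1), EntityIdRange { start: 1_000, end: 2_000 }),
        (ShardId(2), EntityIdRange { start: 2_000, end: u64::MAX }),
    ]
}
```

With this table, `route_sorted(&example_table(), 999)` yields `ShardId(0)` and `route_sorted(&example_table(), 1_000)` yields `ShardId(1)`, so each boundary ID lands on the shard whose range starts there. The same boundary IDs could be fed to `ShardRouter::range(...)` in the real test module.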