tidaldb/docs/planning/milestone-8/phase-5/task-02-tenant-router.md

Task 02: TenantRouter

Delivers

TenantRouter in tidal/src/replication/tenant.rs (the same file as TenantId and TenantConfig). It wraps ShardRouter with tenant-aware routing: (TenantId, EntityId) -> (RegionId, ShardId). By default, entities are placed by consistent hashing over the shard list. A residency policy restricts the eligible shards to those in the tenant's required regions.

Complexity: M

Dependencies

  • Task 01 (TenantId, TenantConfig)
  • Phase 8.1, Task 02 (ShardRouter)

Technical Design

// tidal/src/replication/tenant.rs (continued)

/// Maps (TenantId, EntityId) -> (RegionId, ShardId) for data placement.
///
/// Wraps `ShardRouter` and adds:
/// 1. Tenant-to-shard affinity (consistent hash or explicit assignment)
/// 2. Residency policy enforcement (required_regions constraint)
/// 3. Tenant registry for O(1) config lookup
pub struct TenantRouter {
    /// Inner shard router (entity-level routing).
    shard_router: Arc<ShardRouter>,
    /// Per-tenant configuration.
    tenants: DashMap<TenantId, TenantConfig>,
    /// Cluster topology: which shards are in which regions.
    topology: Arc<ClusterTopology>,
}

/// Cluster topology snapshot: maps ShardId -> RegionId.
#[derive(Debug, Clone)]
pub struct ClusterTopology {
    /// Ordered list of (ShardId, RegionId) assignments.
    shards: Vec<ShardAssignment>,
}

#[derive(Debug, Clone, Copy)]
pub struct ShardAssignment {
    pub shard_id: ShardId,
    pub region_id: RegionId,
}

impl TenantRouter {
    pub fn new(shard_router: Arc<ShardRouter>, topology: Arc<ClusterTopology>) -> Self {
        Self {
            shard_router,
            tenants: DashMap::new(),
            topology,
        }
    }

    /// Register or update a tenant's configuration.
    pub fn register_tenant(&self, config: TenantConfig) {
        self.tenants.insert(config.tenant_id, config);
    }

    /// Look up routing for a (TenantId, EntityId) pair.
    ///
    /// Returns `(RegionId, ShardId)` for data placement.
    /// Applies residency policy if configured.
    pub fn route(
        &self,
        tenant_id: TenantId,
        entity_id: EntityId,
    ) -> Result<ShardAssignment> {
        // 1. Get eligible shards (all shards if no policy; filtered by region if policy set).
        let eligible_shards = self.eligible_shards_for(tenant_id)?;

        // 2. Consistent hash over eligible shards.
        let shard = self.consistent_hash(entity_id, &eligible_shards);
        Ok(shard)
    }

    /// Returns the shards eligible to hold a tenant's data.
    ///
    /// With a residency policy: only shards in `required_regions`.
    /// Without one (or for an unregistered tenant): every shard in the topology.
    fn eligible_shards_for(&self, tenant_id: TenantId) -> Result<Vec<ShardAssignment>> {
        let config = self.tenants.get(&tenant_id);

        if let Some(config) = config {
            if !config.required_regions.is_empty() {
                // Filter topology to only shards in required regions.
                let eligible: Vec<_> = self.topology.shards.iter()
                    .copied()
                    .filter(|s| config.required_regions.contains(&s.region_id))
                    .collect();
                if eligible.is_empty() {
                    return Err(TidalError::Configuration(format!(
                        "tenant {tenant_id:?} residency policy has no eligible shards"
                    )));
                }
                return Ok(eligible);
            }
        }

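        // No residency constraint: all shards eligible.
        // Reject an empty topology here so `consistent_hash` never indexes an empty slice.
        if self.topology.shards.is_empty() {
            return Err(TidalError::Configuration(
                "cluster topology has no shards".to_string(),
            ));
        }
        Ok(self.topology.shards.clone())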
    }

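    /// Consistent hash: jump hash over the eligible shard list.
    ///
    /// Uses Jump Consistent Hash (Lamping & Veach, 2014). Remapping is minimal
    /// only when shards are appended to, or removed from, the end of the list;
    /// the slot is an index, so the list order must be stable. `shards` must be
    /// non-empty.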
    fn consistent_hash(&self, entity_id: EntityId, shards: &[ShardAssignment]) -> ShardAssignment {
        let n = shards.len() as u64;
        let slot = jump_hash(entity_id.0, n);
        shards[slot as usize]
    }

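    /// Rate limiter for a tenant, or `None` if no `max_signals_per_sec` is configured.
    ///
    /// NOTE: as written this builds a fresh limiter on every call, so no state is
    /// shared between calls; the limiter must be cached per tenant to enforce the rate.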
    pub fn rate_limiter_for(&self, tenant_id: TenantId) -> Option<Arc<TenantRateLimiter>> {
        self.tenants.get(&tenant_id)
            .and_then(|c| c.max_signals_per_sec)
            .map(|rate| Arc::new(TenantRateLimiter::new(rate)))
    }
}

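/// Jump Consistent Hash (O(ln n) time, O(1) space).
///
/// `num_buckets` must be non-zero: with zero buckets the loop never runs and
/// `b` stays at -1, which would cast to `u64::MAX`.
fn jump_hash(key: u64, num_buckets: u64) -> u64 {
    debug_assert!(num_buckets > 0, "jump_hash requires at least one bucket");
    let mut k = key;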
    let mut b: i64 = -1;
    let mut j: i64 = 0;
    while j < num_buckets as i64 {
        b = j;
        k = k.wrapping_mul(2862933555777941757).wrapping_add(1);
        j = ((b + 1) as f64 * (((1u64 << 31) as f64) / (((k >> 33) + 1) as f64))) as i64;
    }
    b as u64
}
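
The two routing steps above (filter by residency, then hash over what remains) can be sketched in isolation. This is a minimal illustration, not the planned implementation: `Shard`, `route`, and `example_topology` are hypothetical stand-ins, the ID types are simplified to plain newtypes, and the error case is reduced to `None` where the design returns `TidalError::Configuration`.

```rust
// Stand-in types for illustration; the real RegionId and ShardId live in
// tidal/src/replication and may be defined differently.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RegionId(u16);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ShardId(u16);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Shard {
    shard_id: ShardId,
    region_id: RegionId,
}

/// Jump Consistent Hash, same loop as in the design above.
fn jump_hash(key: u64, num_buckets: u64) -> u64 {
    assert!(num_buckets > 0, "need at least one bucket");
    let mut k = key;
    let mut b: i64 = -1;
    let mut j: i64 = 0;
    while j < num_buckets as i64 {
        b = j;
        k = k.wrapping_mul(2862933555777941757).wrapping_add(1);
        j = ((b + 1) as f64 * (((1u64 << 31) as f64) / (((k >> 33) + 1) as f64))) as i64;
    }
    b as u64
}

/// Filter the topology by required regions (empty slice = no constraint),
/// then pick one shard by hashing the entity id over the filtered list.
fn route(topology: &[Shard], required_regions: &[RegionId], entity_id: u64) -> Option<Shard> {
    let eligible: Vec<Shard> = topology
        .iter()
        .copied()
        .filter(|s| required_regions.is_empty() || required_regions.contains(&s.region_id))
        .collect();
    if eligible.is_empty() {
        return None; // the design returns TidalError::Configuration here
    }
    let slot = jump_hash(entity_id, eligible.len() as u64) as usize;
    Some(eligible[slot])
}

/// Hypothetical cluster: shards 0 and 1 in region 0, shard 2 in region 1.
fn example_topology() -> Vec<Shard> {
    vec![
        Shard { shard_id: ShardId(0), region_id: RegionId(0) },
        Shard { shard_id: ShardId(1), region_id: RegionId(0) },
        Shard { shard_id: ShardId(2), region_id: RegionId(1) },
    ]
}
```

With this topology, `route(&example_topology(), &[RegionId(1)], id)` lands on shard 2 for every `id`, because the filtered list has exactly one entry. Note that the slot is an index into the filtered list, so two tenants with different residency policies hash the same entity over different lists.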

Integration with TidalDb Write Path

// tidal/src/db/mod.rs (additions to signal write path)

impl TidalDb {
    pub fn signal_for_tenant(
        &self,
        tenant_id: TenantId,
        signal_type: &str,
        entity_id: EntityId,
        weight: f64,
        timestamp: Timestamp,
    ) -> crate::Result<()> {
        // 1. Check rate limit.
        if let Some(limiter) = self.tenant_router.rate_limiter_for(tenant_id) {
            limiter.try_acquire()?;
        }

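        // 2. Route to shard. The assignment is not yet passed to the write below;
        //    the underscore prefix keeps the build clean under `-D warnings` until it is.
        let _assignment = self.tenant_router.route(tenant_id, entity_id)?;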

        // 3. Write signal to the tenant-scoped signal ledger.
        self.signal_impl(signal_type, entity_id, weight, timestamp)
    }
}
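
The rate check in step 1 only enforces a limit if the same limiter instance is reused across calls for a given tenant. The sketch below shows one way to get that, under stated assumptions: `WindowLimiter`, `LimiterCache`, and `try_acquire_at` are hypothetical names (the document does not define `TenantRateLimiter`), tenants are keyed by a plain `u64`, and a fixed one-second window with a caller-supplied clock is used only to keep the example deterministic.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical error standing in for the QuotaExceeded error named in this task.
#[derive(Debug, PartialEq, Eq)]
struct QuotaExceeded;

/// Fixed-window limiter: at most `max_per_sec` acquisitions per second.
/// The caller passes the current second so the sketch stays deterministic.
struct WindowLimiter {
    max_per_sec: u32,
    /// (current window in seconds, acquisitions counted in that window)
    state: Mutex<(u64, u32)>,
}

impl WindowLimiter {
    fn new(max_per_sec: u32) -> Self {
        Self { max_per_sec, state: Mutex::new((0, 0)) }
    }

    fn try_acquire_at(&self, now_secs: u64) -> Result<(), QuotaExceeded> {
        let mut state = self.state.lock().unwrap();
        if state.0 != now_secs {
            *state = (now_secs, 0); // new window: reset the counter
        }
        if state.1 >= self.max_per_sec {
            return Err(QuotaExceeded); // rejected before any write happens
        }
        state.1 += 1;
        Ok(())
    }
}

/// Per-tenant cache so repeated lookups share one limiter and its counter.
#[derive(Default)]
struct LimiterCache {
    limiters: Mutex<HashMap<u64, Arc<WindowLimiter>>>,
}

impl LimiterCache {
    /// Returns the tenant's limiter, creating it on first use.
    fn limiter_for(&self, tenant_id: u64, max_per_sec: u32) -> Arc<WindowLimiter> {
        let mut map = self.limiters.lock().unwrap();
        Arc::clone(
            map.entry(tenant_id)
                .or_insert_with(|| Arc::new(WindowLimiter::new(max_per_sec))),
        )
    }
}
```

Because the check runs before routing and before the write, a rejected signal leaves no state behind other than the limiter's own counter. A production limiter would more likely use a token bucket and a monotonic clock; the caching pattern is the point here.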

Acceptance Criteria

  • TenantRouter::route(tenant_id, entity_id) returns a ShardAssignment from the eligible shards
  • Residency policy: if TenantConfig::required_regions = [RegionId(1)] and only shard 2 is in region 1, all entities for that tenant route to shard 2
  • Residency policy violation: if required regions have no shards in ClusterTopology, returns TidalError::Configuration
  • Consistent hash is stable: same (tenant_id, entity_id) always maps to the same shard unless the topology or the tenant's residency policy changes
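  • Jump hash: growing from N to N+1 shards (new shard appended at the end) remaps approximately 1/(N+1) of keys (property test: 10K keys, add 1 shard, verify < 15% remapping; the bound assumes N is large enough that 1/(N+1) sits comfortably below 15%, e.g. N >= 8)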
  • TidalDb::signal_for_tenant applies rate limiting before write; QuotaExceeded is returned before WAL write (no partial state)
  • cargo clippy -- -D warnings and cargo fmt pass
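
The jump hash criterion above could be checked along these lines. This is a sketch rather than the project's test: `mix` and `remapped_keys` are hypothetical helpers, the keys are generated with a SplitMix64-style mixer instead of a property-testing crate, and the starting shard count of 8 is an assumption chosen so the expected remap fraction (about 1/9) sits well under the 15% bound.

```rust
/// Jump Consistent Hash, same loop as in the Technical Design section.
fn jump_hash(key: u64, num_buckets: u64) -> u64 {
    assert!(num_buckets > 0, "need at least one bucket");
    let mut k = key;
    let mut b: i64 = -1;
    let mut j: i64 = 0;
    while j < num_buckets as i64 {
        b = j;
        k = k.wrapping_mul(2862933555777941757).wrapping_add(1);
        j = ((b + 1) as f64 * (((1u64 << 31) as f64) / (((k >> 33) + 1) as f64))) as i64;
    }
    b as u64
}

/// SplitMix64-style mixer, used here only to spread the test keys.
fn mix(mut x: u64) -> u64 {
    x = x.wrapping_add(0x9E3779B97F4A7C15);
    x = (x ^ (x >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    x = (x ^ (x >> 27)).wrapping_mul(0x94D049BB133111EB);
    x ^ (x >> 31)
}

/// Counts keys whose bucket changes when growing from `n` to `n + 1` buckets,
/// and checks that every moved key lands in the newly added bucket `n`.
fn remapped_keys(num_keys: u64, n: u64) -> u64 {
    let mut moved = 0;
    for i in 0..num_keys {
        let key = mix(i);
        let before = jump_hash(key, n);
        let after = jump_hash(key, n + 1);
        if before != after {
            assert_eq!(after, n, "moved keys may only go to the new bucket");
            moved += 1;
        }
    }
    moved
}
```

A test would then assert that `remapped_keys(10_000, 8)` is below 1,500. The check only covers appending a shard at the end of the list; removing a shard from the middle of a residency-filtered list shifts indices and can remap far more keys.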