Task 01: TenantId + TenantConfig
Delivers
TenantId(u64) and TenantConfig in tidal/src/replication/tenant.rs. Per-tenant quotas (signals/sec token bucket, max entities, max storage bytes). WAL segment directories namespaced under {data_dir}/tenants/{tenant_id}/wal/, except for TenantId::DEFAULT, which keeps {data_dir}/wal/. TidalError::QuotaExceeded returned when limits are breached.
Complexity: M
Dependencies
- Phase 8.1 (ShardId, RegionId)
- Phase 8.2 (WAL segment naming)
Technical Design
// tidal/src/replication/tenant.rs

use std::sync::atomic::{AtomicU64, Ordering};

// `RegionId` (Phase 8.1), `TidalError`, and the crate's `Result` alias are
// assumed to be in scope. `AtomicF64` is not part of `std`; it is assumed to
// come from a helper type or an external crate.

/// Tenant identity type.
///
/// A tenant is an agent workspace or an isolated application namespace.
/// All data (WAL segments, signal ledger state, entity metadata) is
/// scoped to a tenant's filesystem directory.
///
/// `TenantId(0)` is the default single-tenant ID used by non-multi-tenant
/// deployments. This ensures backward compatibility with all existing code.
#[derive(
    Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash,
    Default,
    serde::Serialize, serde::Deserialize,
)]
pub struct TenantId(pub u64);

impl TenantId {
    /// The default tenant ID for single-tenant deployments.
    pub const DEFAULT: Self = Self(0);
}

impl std::fmt::Display for TenantId {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "t{}", self.0)
    }
}

/// Per-tenant resource configuration.
///
/// Enforced at write time. Violations return `TidalError::QuotaExceeded`.
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct TenantConfig {
    pub tenant_id: TenantId,

    /// Maximum signals per second (token bucket rate limit).
    ///
    /// `None` means unlimited (trusted internal tenant).
    pub max_signals_per_sec: Option<u32>,

    /// Maximum number of distinct entities (items + users + creators).
    ///
    /// Checked on entity create; `None` means unlimited.
    pub max_entities: Option<u64>,

    /// Maximum total storage in bytes for this tenant's data directory.
    ///
    /// Checked on WAL segment seal; `None` means unlimited.
    pub max_storage_bytes: Option<u64>,

    /// Residency policy: which regions this tenant's data must reside in.
    ///
    /// Empty = no restriction. Used by `TenantRouter` to constrain placement.
    pub required_regions: Vec<RegionId>,

    /// Human-readable label for this tenant (for monitoring/logging).
    pub label: String,
}

impl TenantConfig {
    /// Default config: unlimited quotas, no residency constraint.
    pub fn default_tenant() -> Self {
        Self {
            tenant_id: TenantId::DEFAULT,
            max_signals_per_sec: None,
            max_entities: None,
            max_storage_bytes: None,
            required_regions: Vec::new(),
            label: "default".to_string(),
        }
    }
}

/// Token bucket rate limiter for per-tenant signal ingestion.
///
/// Refills at `max_signals_per_sec` tokens per second.
/// Costs 1 token per signal write. Bucket max = 2x rate (burst headroom).
#[derive(Debug)]
pub struct TenantRateLimiter {
    /// Current tokens (f64 for sub-token precision).
    tokens: AtomicF64,
    /// Refill rate (tokens/ns).
    refill_rate_per_ns: f64,
    /// Maximum bucket size (tokens).
    max_tokens: f64,
    /// Last refill timestamp (ns).
    last_refill_ns: AtomicU64,
}

impl TenantRateLimiter {
    /// Create a limiter whose bucket starts full.
    pub fn new(max_signals_per_sec: u32) -> Self {
        let rate_per_ns = f64::from(max_signals_per_sec) / 1_000_000_000.0;
        // Burst capacity: two seconds' worth of tokens.
        let max_tokens = f64::from(max_signals_per_sec) * 2.0;
        Self {
            tokens: AtomicF64::new(max_tokens),
            refill_rate_per_ns: rate_per_ns,
            max_tokens,
            last_refill_ns: AtomicU64::new(crate::util::now_ns()),
        }
    }

    /// Try to consume 1 token. Returns `Ok(())` if allowed, `Err(QuotaExceeded)` if throttled.
    ///
    /// Note: the load, refill, and store steps below are separate `Relaxed`
    /// operations, not one atomic read-modify-write. Concurrent callers can
    /// therefore read the same token count and over-admit, or apply the same
    /// refill twice. Serialize calls per tenant, or guard the bucket state
    /// with a lock or a compare-and-swap loop, if the limit must hold under
    /// contention.
    pub fn try_acquire(&self) -> Result<()> {
        let now = crate::util::now_ns();
        let last = self.last_refill_ns.load(Ordering::Relaxed);
        let elapsed_ns = now.saturating_sub(last);
        let refill = elapsed_ns as f64 * self.refill_rate_per_ns;
        let new_tokens = (self.tokens.load(Ordering::Relaxed) + refill)
            .min(self.max_tokens);
        if new_tokens < 1.0 {
            // Nothing is stored here, so elapsed time keeps accruing
            // against the old `last_refill_ns` until a call succeeds.
            return Err(TidalError::QuotaExceeded("signal rate limit exceeded".into()));
        }
        self.tokens.store(new_tokens - 1.0, Ordering::Relaxed);
        self.last_refill_ns.store(now, Ordering::Relaxed);
        Ok(())
    }
}
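
The refill-then-consume step in try_acquire can also be written as one critical section, so that concurrent callers never observe a half-updated bucket. The following is a minimal sketch under stated assumptions, not the project's implementation: it swaps the AtomicF64/AtomicU64 pair for a std::sync::Mutex, takes the timestamp as a parameter instead of calling crate::util::now_ns() so the behavior can be checked deterministically, and uses hypothetical names (TokenBucket, BucketState, try_acquire_at, and a unit QuotaExceeded error standing in for TidalError::QuotaExceeded).

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for `TidalError::QuotaExceeded`.
#[derive(Debug, PartialEq)]
pub struct QuotaExceeded;

/// Bucket state kept behind one lock so refill and consume happen together.
#[derive(Debug)]
struct BucketState {
    tokens: f64,
    last_refill_ns: u64,
}

/// Illustrative token bucket; burst capacity is 2x the per-second rate.
#[derive(Debug)]
pub struct TokenBucket {
    state: Mutex<BucketState>,
    refill_rate_per_ns: f64,
    max_tokens: f64,
}

impl TokenBucket {
    /// Starts full, mirroring `TenantRateLimiter::new`.
    pub fn new(max_signals_per_sec: u32, now_ns: u64) -> Self {
        let max_tokens = f64::from(max_signals_per_sec) * 2.0;
        Self {
            state: Mutex::new(BucketState {
                tokens: max_tokens,
                last_refill_ns: now_ns,
            }),
            refill_rate_per_ns: f64::from(max_signals_per_sec) / 1_000_000_000.0,
            max_tokens,
        }
    }

    /// Refill for the elapsed time, then try to take one token.
    /// The whole sequence runs while holding the lock.
    pub fn try_acquire_at(&self, now_ns: u64) -> Result<(), QuotaExceeded> {
        // Panics only if another thread panicked while holding the lock.
        let mut state = self.state.lock().unwrap();
        let elapsed_ns = now_ns.saturating_sub(state.last_refill_ns);
        let refilled = (state.tokens + elapsed_ns as f64 * self.refill_rate_per_ns)
            .min(self.max_tokens);
        state.tokens = refilled;
        // Never move the refill timestamp backwards if the clock regresses.
        state.last_refill_ns = now_ns.max(state.last_refill_ns);
        if state.tokens < 1.0 {
            return Err(QuotaExceeded);
        }
        state.tokens -= 1.0;
        Ok(())
    }
}
```

With a frozen clock, a bucket configured for 100 signals/sec admits its full burst capacity of 200 writes and rejects the rest; advancing the clock by 15 ms adds 1.5 tokens, which admits exactly one more write. A lock is the simplest way to get this guarantee; a compare-and-swap loop over packed state would avoid blocking at the cost of more intricate code.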
WAL Directory Namespacing
// tidal/src/wal/segment.rs (additions)

use std::path::{Path, PathBuf};

/// Build the tenant-scoped WAL directory path.
///
/// For `TenantId::DEFAULT` (backward compat): returns `{data_dir}/wal/` unchanged.
/// For other tenants: returns `{data_dir}/tenants/{tenant_id}/wal/`.
pub fn tenant_wal_dir(data_dir: &Path, tenant_id: TenantId) -> PathBuf {
    if tenant_id == TenantId::DEFAULT {
        data_dir.join("wal")
    } else {
        data_dir
            .join("tenants")
            .join(tenant_id.0.to_string())
            .join("wal")
    }
}
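
To make the two branches of tenant_wal_dir concrete, here is a small standalone sketch. It repeats the function so the example compiles on its own, uses a trimmed TenantId without the serde derives, and assumes a hypothetical data directory /var/lib/tidal; the helper example_paths is illustrative only.

```rust
use std::path::{Path, PathBuf};

/// Trimmed stand-in for the document's `TenantId` (serde derives omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct TenantId(pub u64);

impl TenantId {
    pub const DEFAULT: Self = Self(0);
}

/// Same logic as `tenant_wal_dir` above, repeated so this block stands alone.
pub fn tenant_wal_dir(data_dir: &Path, tenant_id: TenantId) -> PathBuf {
    if tenant_id == TenantId::DEFAULT {
        data_dir.join("wal")
    } else {
        data_dir
            .join("tenants")
            .join(tenant_id.0.to_string())
            .join("wal")
    }
}

/// Illustrative helper: WAL directories for the default tenant and tenant 42
/// under a hypothetical data directory.
pub fn example_paths() -> (PathBuf, PathBuf) {
    let data_dir = Path::new("/var/lib/tidal");
    (
        tenant_wal_dir(data_dir, TenantId::DEFAULT),
        tenant_wal_dir(data_dir, TenantId(42)),
    )
}
```

Note that the directory component is the raw integer (42), not the Display form (t42), because the function joins tenant_id.0.to_string() rather than tenant_id.to_string().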
Acceptance Criteria
- TenantId is Copy + Clone + Debug + Eq + Hash + Ord + Default + Serialize + Deserialize
- TenantId::DEFAULT is TenantId(0); all existing code using TenantId(0) works unchanged
- TenantRateLimiter::try_acquire() returns TidalError::QuotaExceeded within 1ms when token bucket is empty
- Token bucket refills at the configured rate: after sleeping 1/rate seconds, one token is available
- WAL directory for TenantId::DEFAULT is {data_dir}/wal/ (unchanged from m1p5)
- WAL directory for TenantId(42) is {data_dir}/tenants/42/wal/
- Unit test: configure 100 signals/sec, write 400 signals in a tight loop, verify ~200 succeed (the bucket starts full at 2x the rate, i.e. 200 tokens) and ~200 receive QuotaExceeded
- cargo clippy -- -D warnings and cargo fmt pass
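
The trait and default-value criteria for TenantId can be checked at compile time with a generic function whose bounds list the required traits. This is a rough sketch rather than the project's test code: the Serialize and Deserialize bounds are left out so the block has no external dependencies, and the function names are hypothetical.

```rust
use std::fmt::Debug;
use std::hash::Hash;

/// Trimmed stand-in for the document's `TenantId` (serde derives omitted).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Default)]
pub struct TenantId(pub u64);

impl TenantId {
    pub const DEFAULT: Self = Self(0);
}

/// Compiles only if `T` implements every trait in the bound list.
fn assert_tenant_id_traits<T: Copy + Clone + Debug + Eq + Hash + Ord + Default>() {}

/// Returns true when the derived `Default` agrees with `TenantId::DEFAULT`
/// and both equal `TenantId(0)`.
pub fn check_tenant_id_criteria() -> bool {
    // A missing derive would turn this line into a compile error.
    assert_tenant_id_traits::<TenantId>();
    TenantId::default() == TenantId::DEFAULT && TenantId::DEFAULT == TenantId(0)
}
```

In the real crate the bound list would presumably also include serde::Serialize and serde::de::DeserializeOwned, which requires the serde dependency.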