# Task 01: TenantId + TenantConfig

## Delivers

`TenantId(u64)` and `TenantConfig` in `tidal/src/replication/tenant.rs`. Per-tenant quotas (signals/sec token bucket, max entities, max storage bytes). WAL segment directories are namespaced under `{data_dir}/tenants/{tenant_id}/wal/`; the default tenant keeps the existing `{data_dir}/wal/`. `TidalError::QuotaExceeded` is returned when a limit is breached.

## Complexity: M

## Dependencies

- Phase 8.1 (ShardId, RegionId)
- Phase 8.2 (WAL segment naming)

## Technical Design

```rust
// tidal/src/replication/tenant.rs

use std::sync::atomic::{AtomicU64, Ordering};

// `RegionId`, `TidalError`, `Result`, and `AtomicF64` are assumed to be in
// scope (crate-local imports omitted). `AtomicF64` is not part of `std`.

/// Tenant identity type.
///
/// A tenant is an agent workspace or an isolated application namespace.
/// All data (WAL segments, signal ledger state, entity metadata) is
/// scoped to a tenant's filesystem directory.
///
/// `TenantId(0)` is the default single-tenant ID used by non-multi-tenant
/// deployments. This ensures backward compatibility with all existing code.
#[derive(
    Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash,
    Default,
    serde::Serialize, serde::Deserialize,
)]
pub struct TenantId(pub u64);

impl TenantId {
    /// The default tenant ID for single-tenant deployments.
    pub const DEFAULT: Self = Self(0);
}

impl std::fmt::Display for TenantId {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "t{}", self.0)
    }
}

/// Per-tenant resource configuration.
///
/// Enforced at write time. Violations return `TidalError::QuotaExceeded`.
#[derive(Debug, Clone, serde::Serialize, serde::Deserialize)]
pub struct TenantConfig {
    pub tenant_id: TenantId,

    /// Maximum signals per second (token bucket rate limit).
    ///
    /// `None` means unlimited (trusted internal tenant).
    pub max_signals_per_sec: Option<u32>,

    /// Maximum number of distinct entities (items + users + creators).
    ///
    /// Checked on entity create; `None` means unlimited.
    pub max_entities: Option<u64>,

    /// Maximum total storage in bytes for this tenant's data directory.
    ///
    /// Checked on WAL segment seal; `None` means unlimited.
    pub max_storage_bytes: Option<u64>,

    /// Residency policy: which regions this tenant's data must reside in.
    ///
    /// Empty = no restriction. Used by `TenantRouter` to constrain placement.
    pub required_regions: Vec<RegionId>,

    /// Human-readable label for this tenant (for monitoring/logging).
    pub label: String,
}

impl TenantConfig {
    /// Default config: unlimited quotas, no residency constraint.
    pub fn default_tenant() -> Self {
        Self {
            tenant_id: TenantId::DEFAULT,
            max_signals_per_sec: None,
            max_entities: None,
            max_storage_bytes: None,
            required_regions: Vec::new(),
            label: "default".to_string(),
        }
    }
}

/// Token bucket rate limiter for per-tenant signal ingestion.
///
/// Refills at `max_signals_per_sec` tokens per second.
/// Costs 1 token per signal write. Bucket max = 2x rate (burst headroom).
#[derive(Debug)]
pub struct TenantRateLimiter {
    /// Current tokens (f64 for sub-token precision).
    tokens: AtomicF64,
    /// Refill rate (tokens/ns).
    refill_rate_per_ns: f64,
    /// Maximum bucket size (tokens).
    max_tokens: f64,
    /// Last refill timestamp (ns).
    last_refill_ns: AtomicU64,
}

impl TenantRateLimiter {
    pub fn new(max_signals_per_sec: u32) -> Self {
        let rate_per_ns = f64::from(max_signals_per_sec) / 1_000_000_000.0;
        // Two seconds' worth of tokens; the bucket starts full.
        let max_tokens = f64::from(max_signals_per_sec) * 2.0;
        Self {
            tokens: AtomicF64::new(max_tokens),
            refill_rate_per_ns: rate_per_ns,
            max_tokens,
            last_refill_ns: AtomicU64::new(crate::util::now_ns()),
        }
    }

    /// Try to consume 1 token. Returns `Ok(())` if allowed, `Err(QuotaExceeded)` if throttled.
    ///
    /// NOTE: the load/compute/store sequence below is not a single atomic
    /// step, so concurrent callers can race and admit more than the budget.
    pub fn try_acquire(&self) -> Result<()> {
        let now = crate::util::now_ns();
        let last = self.last_refill_ns.load(Ordering::Relaxed);
        let elapsed_ns = now.saturating_sub(last);

        // Credit the tokens earned since the last successful acquire, capped at the bucket size.
        let refill = elapsed_ns as f64 * self.refill_rate_per_ns;
        let new_tokens = (self.tokens.load(Ordering::Relaxed) + refill)
            .min(self.max_tokens);

        if new_tokens < 1.0 {
            // State is left untouched, so elapsed time keeps accruing toward the next token.
            return Err(TidalError::QuotaExceeded("signal rate limit exceeded".into()));
        }

        self.tokens.store(new_tokens - 1.0, Ordering::Relaxed);
        self.last_refill_ns.store(now, Ordering::Relaxed);
        Ok(())
    }
}
```
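
The refill-then-spend step in `try_acquire` reads and writes two separate atomics, so one way to make it safe under concurrent writers is to guard both fields with a single lock. The following is a minimal sketch of that idea, not the planned implementation: `LockedRateLimiter`, `BucketState`, `QuotaExceeded`, and `try_acquire_at` are hypothetical names, and the current time is passed in as a parameter (instead of calling `crate::util::now_ns()`) so the behavior can be tested deterministically.

```rust
use std::sync::Mutex;

/// Stand-in for `TidalError::QuotaExceeded` so the sketch compiles on its own.
#[derive(Debug, PartialEq, Eq)]
pub struct QuotaExceeded;

/// Mutable bucket state, kept behind one lock so refill and spend happen together.
#[derive(Debug)]
struct BucketState {
    tokens: f64,
    last_refill_ns: u64,
}

/// Hypothetical lock-based variant of the per-tenant limiter.
#[derive(Debug)]
pub struct LockedRateLimiter {
    refill_rate_per_ns: f64,
    max_tokens: f64,
    state: Mutex<BucketState>,
}

impl LockedRateLimiter {
    /// `now_ns` is injected (assumption) so tests can drive time by hand.
    pub fn new(max_signals_per_sec: u32, now_ns: u64) -> Self {
        // Same sizing as the design above: bucket holds 2x the per-second rate.
        let max_tokens = f64::from(max_signals_per_sec) * 2.0;
        Self {
            refill_rate_per_ns: f64::from(max_signals_per_sec) / 1_000_000_000.0,
            max_tokens,
            state: Mutex::new(BucketState {
                tokens: max_tokens,
                last_refill_ns: now_ns,
            }),
        }
    }

    /// Credit tokens for the elapsed time, then try to spend one.
    pub fn try_acquire_at(&self, now_ns: u64) -> Result<(), QuotaExceeded> {
        // A poisoned lock still holds usable numeric state, so recover it.
        let mut state = self.state.lock().unwrap_or_else(|e| e.into_inner());

        let elapsed_ns = now_ns.saturating_sub(state.last_refill_ns);
        let refill = elapsed_ns as f64 * self.refill_rate_per_ns;
        state.tokens = (state.tokens + refill).min(self.max_tokens);
        // Never move the timestamp backwards if callers pass a stale clock reading.
        state.last_refill_ns = state.last_refill_ns.max(now_ns);

        if state.tokens < 1.0 {
            return Err(QuotaExceeded);
        }
        state.tokens -= 1.0;
        Ok(())
    }
}
```

With a rate of 100 signals/sec the bucket starts with 200 tokens, so a burst at a fixed timestamp admits 200 calls before throttling. The lock trades some throughput for correctness; whether that cost is acceptable on the signal write path would need to be measured.
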

### WAL Directory Namespacing

```rust
// tidal/src/wal/segment.rs (additions)

use std::path::{Path, PathBuf};

use crate::replication::tenant::TenantId;

/// Build the tenant-scoped WAL directory path.
///
/// For `TenantId::DEFAULT` (backward compat): returns `{data_dir}/wal/` unchanged.
/// For other tenants: returns `{data_dir}/tenants/{tenant_id}/wal/`.
pub fn tenant_wal_dir(data_dir: &Path, tenant_id: TenantId) -> PathBuf {
    if tenant_id == TenantId::DEFAULT {
        data_dir.join("wal")
    } else {
        // Uses the raw integer, not the `Display` form (`t42`).
        data_dir
            .join("tenants")
            .join(tenant_id.0.to_string())
            .join("wal")
    }
}
```
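
To make the resulting layout concrete, here is a small self-contained sketch that reuses the `tenant_wal_dir` logic above with a stripped-down stand-in for `TenantId`. The `wal_dirs` helper and the `/var/lib/tidal` data directory are hypothetical and exist only for illustration.

```rust
use std::path::{Path, PathBuf};

/// Minimal stand-in for `TenantId` so the snippet compiles on its own.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TenantId(pub u64);

impl TenantId {
    pub const DEFAULT: Self = Self(0);
}

/// Same logic as `tenant_wal_dir` in the design above.
pub fn tenant_wal_dir(data_dir: &Path, tenant_id: TenantId) -> PathBuf {
    if tenant_id == TenantId::DEFAULT {
        data_dir.join("wal")
    } else {
        data_dir
            .join("tenants")
            .join(tenant_id.0.to_string())
            .join("wal")
    }
}

/// Hypothetical helper: resolve the WAL directory for each tenant in `tenants`.
pub fn wal_dirs(data_dir: &Path, tenants: &[TenantId]) -> Vec<PathBuf> {
    tenants
        .iter()
        .map(|&tenant| tenant_wal_dir(data_dir, tenant))
        .collect()
}
```

Calling `wal_dirs(Path::new("/var/lib/tidal"), &[TenantId::DEFAULT, TenantId(42)])` yields `/var/lib/tidal/wal` for the default tenant and `/var/lib/tidal/tenants/42/wal` for tenant 42, so single-tenant deployments keep their existing on-disk layout.
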

## Acceptance Criteria

- [ ] `TenantId` is `Copy + Clone + Debug + Eq + Hash + Ord + Default + Serialize + Deserialize`
- [ ] `TenantId::DEFAULT` is `TenantId(0)`; all existing code using `TenantId(0)` works unchanged
- [ ] `TenantRateLimiter::try_acquire()` returns `TidalError::QuotaExceeded` within 1ms when token bucket is empty
- [ ] Token bucket refills at the configured rate: after sleeping `1/rate` seconds, one token is available
- [ ] WAL directory for `TenantId::DEFAULT` is `{data_dir}/wal/` (unchanged from m1p5)
- [ ] WAL directory for `TenantId(42)` is `{data_dir}/tenants/42/wal/`
- [ ] Unit test: configure 100 signals/sec, write 400 signals in a tight loop, verify ~200 succeed (the bucket starts full at 2x the rate) and ~200 receive `QuotaExceeded`
- [ ] `cargo clippy -- -D warnings` and `cargo fmt --check` pass