Task 01: Hot-Tier Signal State
Context
- Milestone: 1 -- Signal Engine
- Phase: m1p4 -- Signal Ledger
- Depends On: None (uses types from m1p1 but no m1p4 tasks)
- Blocks: Task 03 (Signal Ledger and Velocity)
- Complexity: L
Objective
Deliver HotSignalState, the cache-line-aligned, lock-free struct that holds running exponential decay scores for a single signal type on a single entity. This is the structure touched on every ranking query -- it must be exactly 64 bytes, use atomic operations for concurrent read/write, and implement the running decay formula with mathematical exactness. The struct handles both in-order and out-of-order signal events, and provides lazy decay at read time so ranking queries pay only one exp() call per entity per decay rate.
This is the single most performance-critical data structure in tidalDB. Every design choice is driven by the hot-path constraint: a ranking query scoring 200 candidates must complete in under 5 microseconds. That means ~25 nanoseconds per entity for decay score reads, which allows exactly one L1 cache miss and one exp() call.
Requirements
- HotSignalState must be #[repr(C, align(64))] -- exactly one L1 cache line
- static_assert!(size_of::<HotSignalState>() == 64)
- Running decay formula: S(t) = S(t_prev) * exp(-lambda * dt) + weight
- on_signal() updates decay scores via CAS loop with correct memory ordering
- current_score() applies lazy decay at read time: stored_score * exp(-lambda * dt)
- Out-of-order events: when t_event < last_update_ns, pre-decay the weight instead of advancing time
- Decay scores are non-negative (debug assertion)
- All atomic operations use Acquire/Release/AcqRel -- no Relaxed without explicit justification
- Send + Sync (ensured by atomic-only fields)
- No unsafe code
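The running decay formula and the lazy read from the requirements above can be sketched as plain functions. This is a minimal illustration, not the HotSignalState implementation: the names lambda_from_half_life, decay_step, and lazy_read are hypothetical, and time deltas are assumed to be already converted to seconds.

```rust
/// Hypothetical helper: convert a half-life in seconds to a decay rate
/// lambda (per second), the same conversion the unit tests in this task use.
pub fn lambda_from_half_life(half_life_secs: f64) -> f64 {
    std::f64::consts::LN_2 / half_life_secs
}

/// One write-path step of the running decay formula:
/// S(t) = S(t_prev) * exp(-lambda * dt) + weight
pub fn decay_step(prev_score: f64, lambda: f64, dt_secs: f64, weight: f64) -> f64 {
    prev_score * (-lambda * dt_secs).exp() + weight
}

/// Read-path lazy decay: the stored score only ages, no weight is added.
pub fn lazy_read(stored_score: f64, lambda: f64, dt_secs: f64) -> f64 {
    stored_score * (-lambda * dt_secs).exp()
}
```

With a one-hour half-life, two unit-weight events one hour apart give a stored score of 1.5, and reading one hour later yields 0.75, which matches summing each event's individually decayed weight.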
Technical Design
Module Structure
tidal/src/signals/
hot.rs -- HotSignalState, all methods
Public API
// === signals/hot.rs ===
use std::sync::atomic::{AtomicU64, Ordering};
/// Hot-path signal state for a single signal type on a single entity.
///
/// One cache line (64 bytes). Touched on every ranking query involving this
/// signal. Contains running decay scores for up to 3 decay rates and the
/// timestamp of the last update for lazy decay at read time.
///
/// # Memory Layout
///
/// ```text
/// Offset Size Field
/// 0..8 8 entity_id (u64)
/// 8..16 8 last_update_ns (AtomicU64)
/// 16..18 2 signal_type_id (u16)
/// 18..20 2 flags (u16)
/// 20..24 4 _pad0
/// 24..32 8 decay_scores[0] (AtomicU64, f64 via to_bits/from_bits)
/// 32..40 8 decay_scores[1] (AtomicU64)
/// 40..48 8 decay_scores[2] (AtomicU64)
/// 48..64 16 _pad1
/// ```
///
/// # Concurrency
///
/// - Writers: CAS loop on each `decay_scores[i]`, then conditional store on
/// `last_update_ns`. Multiple concurrent writers are serialized by CAS retry.
/// - Readers: Acquire load on `last_update_ns`, then Acquire load on
/// `decay_scores[i]`. Lazy decay applied from stored time to query time.
/// - A reader may see a stale score with a fresh timestamp (under-decaying) or
/// a fresh score with a stale timestamp (over-decaying) while a write is in flight.
/// Both produce ranking-correct results within floating-point epsilon.
#[repr(C, align(64))]
pub struct HotSignalState {
entity_id: u64,
last_update_ns: AtomicU64,
signal_type_id: u16,
flags: u16,
_pad0: [u8; 4],
decay_scores: [AtomicU64; 3],
_pad1: [u8; 16],
}
// Compile-time size assertion
const _: () = assert!(std::mem::size_of::<HotSignalState>() == 64);
const _: () = assert!(std::mem::align_of::<HotSignalState>() == 64);
/// Maximum number of decay rate slots per signal type.
pub const MAX_DECAY_RATES: usize = 3;
impl HotSignalState {
/// Construct a new, zeroed state for the given entity and signal type.
pub fn new(entity_id: u64, signal_type_id: u16) -> Self;
/// Construct with the velocity_enabled flag set.
pub fn with_flags(entity_id: u64, signal_type_id: u16, velocity_enabled: bool) -> Self;
/// The entity this state belongs to.
pub fn entity_id(&self) -> u64;
/// The signal type index.
pub fn signal_type_id(&self) -> u16;
/// Whether velocity computation is enabled for this signal.
pub fn velocity_enabled(&self) -> bool;
/// Update running decay scores on a new signal event.
///
/// For each configured lambda, applies the decay formula:
/// new_score = old_score * exp(-lambda * dt) + effective_weight
///
/// For in-order events (event_time_ns >= last_update_ns):
/// dt = (event_time_ns - last_update_ns) as seconds
/// effective_weight = weight
/// last_update_ns is advanced to event_time_ns
///
/// For out-of-order events (event_time_ns < last_update_ns):
/// The existing score is not decayed (dt=0 for the score shift).
/// Instead, the weight is pre-decayed:
/// effective_weight = weight * exp(-lambda * age), where age = (last_update_ns - event_time_ns) as seconds
/// last_update_ns is NOT changed.
///
/// Cost: K * exp() calls where K = number of configured decay rates.
/// At K=1 (M1 default): ~12ns. At K=3: ~36ns.
pub fn on_signal(
&self,
weight: f64,
event_time_ns: u64,
lambdas: &[f64],
);
/// Read the current decay score at query time.
///
/// Applies lazy decay from last_update to query_time_ns:
/// score = stored_score * exp(-lambda * dt)
///
/// Cost: 2 loads (timestamp, score) + 1 exp() + 1 multiply = ~15ns.
pub fn current_score(
&self,
decay_rate_idx: usize,
query_time_ns: u64,
lambda: f64,
) -> f64;
/// Read the raw stored score without lazy decay.
/// Used only for checkpoint serialization.
pub fn stored_score(&self, decay_rate_idx: usize) -> f64;
/// Read the last update timestamp in nanoseconds.
pub fn last_update_ns(&self) -> u64;
/// Restore state from a checkpoint (sets last_update_ns and the decay scores).
/// Called during crash recovery before WAL replay.
pub fn restore(
&self,
last_update_ns: u64,
scores: &[f64],
);
}
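The memory layout table in the doc comment above can be checked mechanically. The sketch below mirrors the field order of HotSignalState in a hypothetical LayoutSketch struct and measures field offsets with safe pointer-to-integer casts; field_offsets is an illustrative helper, not part of the planned API.

```rust
use std::sync::atomic::AtomicU64;

/// Hypothetical mirror of the HotSignalState field order, used only to
/// check the offsets listed in the memory layout table.
#[repr(C, align(64))]
pub struct LayoutSketch {
    entity_id: u64,
    last_update_ns: AtomicU64,
    signal_type_id: u16,
    flags: u16,
    _pad0: [u8; 4],
    decay_scores: [AtomicU64; 3],
    _pad1: [u8; 16],
}

/// Byte offsets of entity_id, last_update_ns, signal_type_id, flags, and
/// decay_scores, measured from the start of the struct (no unsafe needed).
pub fn field_offsets() -> [usize; 5] {
    let s = LayoutSketch {
        entity_id: 0,
        last_update_ns: AtomicU64::new(0),
        signal_type_id: 0,
        flags: 0,
        _pad0: [0; 4],
        decay_scores: [AtomicU64::new(0), AtomicU64::new(0), AtomicU64::new(0)],
        _pad1: [0; 16],
    };
    let base = &s as *const LayoutSketch as usize;
    [
        &s.entity_id as *const u64 as usize - base,
        &s.last_update_ns as *const AtomicU64 as usize - base,
        &s.signal_type_id as *const u16 as usize - base,
        &s.flags as *const u16 as usize - base,
        &s.decay_scores as *const [AtomicU64; 3] as usize - base,
    ]
}
```

Under #[repr(C)] the three decay score slots start at offset 24 and end at 48, so all hot fields sit in the first 48 bytes of the cache line.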
Internal Design
Atomic memory ordering rationale:
The critical invariant is that a reader who loads last_update_ns via Acquire must see decay scores that are consistent with (or more recent than) that timestamp. Without this, a reader could pair a new timestamp with an old score, applying too little decay to the old score and omitting the new event's weight.
- last_update_ns loads: Ordering::Acquire -- establishes a happens-before edge with the Release store from the writer.
- last_update_ns stores: Ordering::Release -- makes all prior decay score CAS operations visible to readers who Acquire this timestamp.
- decay_scores[i] loads: Ordering::Acquire -- ensures we read the most recent value stored by any CAS.
- decay_scores[i] CAS: Ordering::AcqRel (success), Ordering::Acquire (failure) -- AcqRel on success makes the new score visible and acquires the latest value; Acquire on failure loads the freshest competing write.
The write order is critical: CAS all decay scores FIRST, then conditionally store last_update_ns. If a writer stalls (or the process crashes) between the CAS and the timestamp store, the worst case is that a reader applies lazy decay from the older timestamp to an already-updated score, producing an over-decayed (too small) score. This is safe for ranking because it is bounded and self-correcting once the timestamp store becomes visible.
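The write order and orderings described above can be sketched with a single decay rate. This is an illustration under assumptions, not the planned implementation: SketchState is a hypothetical two-field stand-in without the 64-byte layout, and the conditional timestamp store is modeled with AtomicU64::fetch_max, which is one possible way to keep the timestamp from regressing.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical single-rate stand-in for HotSignalState.
pub struct SketchState {
    last_update_ns: AtomicU64,
    score_bits: AtomicU64, // f64 stored via to_bits/from_bits
}

impl SketchState {
    pub fn new() -> Self {
        Self { last_update_ns: AtomicU64::new(0), score_bits: AtomicU64::new(0) }
    }

    pub fn on_signal(&self, weight: f64, event_time_ns: u64, lambda: f64) {
        let last = self.last_update_ns.load(Ordering::Acquire);
        // In-order: decay the stored score forward. Out-of-order: leave the
        // score alone and pre-decay the weight instead.
        let (score_factor, effective_weight) = if event_time_ns >= last {
            let dt = (event_time_ns - last) as f64 / 1e9;
            ((-lambda * dt).exp(), weight)
        } else {
            let age = (last - event_time_ns) as f64 / 1e9;
            (1.0, weight * (-lambda * age).exp())
        };
        // Step 1: CAS loop on the score; retry on contention or spurious failure.
        let mut old = self.score_bits.load(Ordering::Acquire);
        loop {
            let new = f64::from_bits(old) * score_factor + effective_weight;
            match self.score_bits.compare_exchange_weak(
                old, new.to_bits(), Ordering::AcqRel, Ordering::Acquire,
            ) {
                Ok(_) => break,
                Err(actual) => old = actual,
            }
        }
        // Step 2: advance the timestamp, never backwards (assumed fetch_max).
        self.last_update_ns.fetch_max(event_time_ns, Ordering::AcqRel);
    }

    pub fn current_score(&self, query_time_ns: u64, lambda: f64) -> f64 {
        let last = self.last_update_ns.load(Ordering::Acquire);
        let stored = f64::from_bits(self.score_bits.load(Ordering::Acquire));
        let dt = query_time_ns.saturating_sub(last) as f64 / 1e9;
        stored * (-lambda * dt).exp()
    }

    pub fn last_update_ns(&self) -> u64 {
        self.last_update_ns.load(Ordering::Acquire)
    }
}
```

The sketch computes the decay factor from one snapshot of last_update_ns, so it is exact for a single writer or for concurrent writers that share a timestamp. Writers racing with different timestamps would need the snapshot re-validated, which is left out here.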
Out-of-order event handling:
When event_time_ns < last_update_ns, the event arrived late. We cannot "rewind" the running score. Instead, we pre-decay the weight to account for the event's age relative to the current state:
adjusted_weight = weight * exp(-lambda * (last_update_ns - event_time_ns) / 1e9)
This is mathematically equivalent to having processed the event at its original time: the contribution of the late event to the score at last_update_ns is exactly weight * exp(-lambda * age).
For the CAS loop on out-of-order events, dt is 0 (the score is not decayed), and the adjusted weight is added:
new_score = old_score + adjusted_weight
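The equivalence claimed for late events can be checked numerically. The sketch below is illustrative only: predecayed_weight and analytical_score are hypothetical helper names, and events are (weight, timestamp in nanoseconds) pairs as in the property tests of this task.

```rust
/// Weight of a late event expressed as its contribution at last_update_ns:
/// weight * exp(-lambda * age), with age converted from nanoseconds to seconds.
pub fn predecayed_weight(
    weight: f64,
    lambda: f64,
    last_update_ns: u64,
    event_time_ns: u64,
) -> f64 {
    debug_assert!(event_time_ns < last_update_ns, "only for out-of-order events");
    let age_secs = (last_update_ns - event_time_ns) as f64 / 1e9;
    weight * (-lambda * age_secs).exp()
}

/// Brute-force reference: every event's weight decayed independently
/// from its own timestamp to the query time.
pub fn analytical_score(events: &[(f64, u64)], lambda: f64, query_time_ns: u64) -> f64 {
    events
        .iter()
        .map(|&(w, t)| w * (-lambda * ((query_time_ns - t) as f64 / 1e9)).exp())
        .sum()
}
```

For example, with an event of weight 2.0 at t=10s already applied and a late event of weight 3.0 at t=4s, the stored score becomes 2.0 + 3.0 * exp(-lambda * 6). Decaying that sum to any later query time multiplies both terms by the same factor, which is why the result agrees with the brute-force sum.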
f64 via AtomicU64:
Decay scores are f64 values stored as u64 bit patterns using f64::to_bits() and f64::from_bits(). Both functions are safe, const, and produce well-defined results for all finite f64 values including 0.0, negative zero, and subnormals. NaN bit patterns are never stored because the decay formula cannot produce NaN from finite, non-negative inputs.
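A minimal sketch of the bit-pattern round trip, assuming two hypothetical helpers load_f64 and store_f64 that wrap a single AtomicU64 slot:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Read an f64 that was stored as its raw bit pattern.
pub fn load_f64(slot: &AtomicU64) -> f64 {
    f64::from_bits(slot.load(Ordering::Acquire))
}

/// Store an f64 as its raw bit pattern.
pub fn store_f64(slot: &AtomicU64, value: f64) {
    slot.store(value.to_bits(), Ordering::Release);
}
```

A slot created with AtomicU64::new(0) reads back as 0.0, and the round trip preserves the exact bits, including the sign bit of negative zero and subnormal values.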
Error Handling
No fallible operations. on_signal() and current_score() are infallible. decay_rate_idx out of bounds is a caller error -- debug-asserted but saturated to 0 in release (never panics on the hot path).
Test Strategy
Property Tests
use proptest::prelude::*;
// P1: Decay scores decrease monotonically without new events.
proptest! {
#[test]
fn decay_monotonic_decrease(
initial_score in 0.0f64..1e12,
lambda in 1e-7f64..1e-3,
dt_secs in 1.0f64..1e7,
) {
let decayed = initial_score * (-lambda * dt_secs).exp();
prop_assert!(decayed <= initial_score);
prop_assert!(decayed >= 0.0);
}
}
// P2: Running score matches analytical sum to 6 decimal places.
proptest! {
#[test]
fn running_score_matches_analytical(
events in prop::collection::vec(
(0.1f64..10.0, 1_000_000u64..1_000_000_000),
1..100,
),
lambda in 1e-7f64..1e-3,
) {
// Sort events by time for in-order processing
let mut sorted_events = events.clone();
sorted_events.sort_by_key(|e| e.1);
let query_time_ns = sorted_events.last().unwrap().1 + 1_000_000_000; // +1 second
// Build HotSignalState and process events
let state = HotSignalState::new(42, 0);
for &(weight, time_ns) in &sorted_events {
state.on_signal(weight, time_ns, &[lambda]);
}
let running = state.current_score(0, query_time_ns, lambda);
// Compute analytical sum
let analytical: f64 = sorted_events.iter()
.map(|&(w, t)| w * (-lambda * (query_time_ns - t) as f64 / 1e9).exp())
.sum();
let relative_error = if analytical.abs() < 1e-15 {
running.abs()
} else {
(running - analytical).abs() / analytical
};
prop_assert!(
relative_error < 1e-6,
"running={running}, analytical={analytical}, relative_error={relative_error}"
);
}
}
// P4: Out-of-order events produce same final score as in-order.
proptest! {
#[test]
fn out_of_order_events_commutative(
events in prop::collection::vec(
(0.1f64..10.0, 1_000_000u64..1_000_000_000),
2..50,
),
lambda in 1e-7f64..1e-3,
) {
let query_time_ns = events.iter().map(|e| e.1).max().unwrap() + 1_000_000_000;
// Process in-order
let mut sorted = events.clone();
sorted.sort_by_key(|e| e.1);
let state_ordered = HotSignalState::new(42, 0);
for &(w, t) in &sorted {
state_ordered.on_signal(w, t, &[lambda]);
}
let score_ordered = state_ordered.current_score(0, query_time_ns, lambda);
// Process in reverse order (all out-of-order except first)
sorted.reverse();
let state_reversed = HotSignalState::new(42, 0);
for &(w, t) in &sorted {
state_reversed.on_signal(w, t, &[lambda]);
}
let score_reversed = state_reversed.current_score(0, query_time_ns, lambda);
// Also compare to analytical sum
let analytical: f64 = events.iter()
.map(|&(w, t)| w * (-lambda * (query_time_ns - t) as f64 / 1e9).exp())
.sum();
let error_ordered = if analytical.abs() < 1e-15 {
score_ordered.abs()
} else {
(score_ordered - analytical).abs() / analytical
};
let error_reversed = if analytical.abs() < 1e-15 {
score_reversed.abs()
} else {
(score_reversed - analytical).abs() / analytical
};
prop_assert!(error_ordered < 1e-6,
"ordered: running={score_ordered}, analytical={analytical}, error={error_ordered}");
prop_assert!(error_reversed < 1e-6,
"reversed: running={score_reversed}, analytical={analytical}, error={error_reversed}");
}
}
// Decay scores are always non-negative (INV-SIG-3).
proptest! {
#[test]
fn decay_scores_non_negative(
events in prop::collection::vec(
(0.0f64..100.0, 0u64..2_000_000_000),
1..200,
),
lambda in 1e-7f64..1e-3,
query_offset in 0u64..2_000_000_000,
) {
let state = HotSignalState::new(1, 0);
for &(w, t) in &events {
state.on_signal(w, t, &[lambda]);
}
let query_time = events.iter().map(|e| e.1).max().unwrap_or(0) + query_offset;
let score = state.current_score(0, query_time, lambda);
prop_assert!(score >= 0.0, "score was {score}");
}
}
Unit Tests
#[test]
fn hot_signal_state_size_and_alignment() {
assert_eq!(std::mem::size_of::<HotSignalState>(), 64);
assert_eq!(std::mem::align_of::<HotSignalState>(), 64);
}
#[test]
fn new_state_is_zeroed() {
let state = HotSignalState::new(42, 5);
assert_eq!(state.entity_id(), 42);
assert_eq!(state.signal_type_id(), 5);
assert_eq!(state.last_update_ns(), 0);
assert_eq!(state.stored_score(0), 0.0);
assert_eq!(state.stored_score(1), 0.0);
assert_eq!(state.stored_score(2), 0.0);
}
#[test]
fn single_event_sets_score_to_weight() {
let state = HotSignalState::new(1, 0);
let lambda = std::f64::consts::LN_2 / (7.0 * 24.0 * 3600.0); // 7-day half-life
let t = 1_000_000_000u64; // 1 second in nanos
state.on_signal(1.0, t, &[lambda]);
// Immediately after, with no time elapsed, score should be ~1.0
let score = state.current_score(0, t, lambda);
assert!((score - 1.0).abs() < 1e-10);
}
#[test]
fn score_halves_after_half_life() {
let half_life_secs = 3600.0; // 1 hour
let lambda = std::f64::consts::LN_2 / half_life_secs;
let state = HotSignalState::new(1, 0);
let t0 = 0u64;
state.on_signal(1.0, t0, &[lambda]);
// Read after exactly one half-life
let t1 = (half_life_secs * 1e9) as u64;
let score = state.current_score(0, t1, lambda);
assert!((score - 0.5).abs() < 1e-10, "score was {score}, expected ~0.5");
}
#[test]
fn two_events_accumulate() {
let lambda = std::f64::consts::LN_2 / 3600.0; // 1h half-life
let state = HotSignalState::new(1, 0);
let t0 = 0u64;
let t1 = 1_000_000_000u64; // 1 second later
state.on_signal(1.0, t0, &[lambda]);
state.on_signal(1.0, t1, &[lambda]);
let score = state.current_score(0, t1, lambda);
// score = 1.0 * exp(-lambda * 1.0) + 1.0
let expected = 1.0_f64 * (-lambda * 1.0).exp() + 1.0;
assert!((score - expected).abs() < 1e-10, "score={score}, expected={expected}");
}
#[test]
fn out_of_order_event_predecays_weight() {
let lambda = std::f64::consts::LN_2 / 3600.0;
let state = HotSignalState::new(1, 0);
// Process event at t=10s first
let t_late = 10_000_000_000u64;
state.on_signal(1.0, t_late, &[lambda]);
// Then process event at t=5s (out of order)
let t_early = 5_000_000_000u64;
state.on_signal(1.0, t_early, &[lambda]);
// Query at t=10s -- should match analytical result
let analytical = 1.0 * (-lambda * 0.0).exp() // event at t=10, age=0
+ 1.0 * (-lambda * 5.0).exp(); // event at t=5, age=5s
let actual = state.current_score(0, t_late, lambda);
assert!((actual - analytical).abs() < 1e-10,
"actual={actual}, analytical={analytical}");
}
#[test]
fn last_update_ns_not_regressed_by_out_of_order() {
let lambda = std::f64::consts::LN_2 / 3600.0;
let state = HotSignalState::new(1, 0);
state.on_signal(1.0, 10_000_000_000, &[lambda]);
let ts_before = state.last_update_ns();
state.on_signal(1.0, 5_000_000_000, &[lambda]); // older event
let ts_after = state.last_update_ns();
assert_eq!(ts_before, ts_after, "timestamp should not regress");
assert_eq!(ts_after, 10_000_000_000);
}
#[test]
fn score_decays_to_near_zero_after_many_half_lives() {
let lambda = std::f64::consts::LN_2 / 3600.0; // 1h half-life
let state = HotSignalState::new(1, 0);
state.on_signal(1.0, 0, &[lambda]);
// After 100 half-lives (100 hours), score should be essentially zero
let t = (100.0 * 3600.0 * 1e9) as u64;
let score = state.current_score(0, t, lambda);
assert!(score < 1e-20, "score was {score}");
}
#[test]
fn velocity_flag() {
let state = HotSignalState::with_flags(1, 0, true);
assert!(state.velocity_enabled());
let state2 = HotSignalState::with_flags(1, 0, false);
assert!(!state2.velocity_enabled());
}
#[test]
fn restore_sets_all_fields() {
let state = HotSignalState::new(1, 0);
state.restore(42_000_000_000, &[1.5, 2.5, 3.5]);
assert_eq!(state.last_update_ns(), 42_000_000_000);
assert!((state.stored_score(0) - 1.5).abs() < 1e-15);
assert!((state.stored_score(1) - 2.5).abs() < 1e-15);
assert!((state.stored_score(2) - 3.5).abs() < 1e-15);
}
#[test]
fn multiple_lambdas() {
let lambda_fast = std::f64::consts::LN_2 / 3600.0; // 1h half-life
let lambda_slow = std::f64::consts::LN_2 / 604800.0; // 7d half-life
let lambdas = [lambda_fast, lambda_slow];
let state = HotSignalState::new(1, 0);
state.on_signal(1.0, 0, &lambdas);
// After 1 hour, fast score ~0.5, slow score ~0.9959
let t = (3600.0 * 1e9) as u64;
let score_fast = state.current_score(0, t, lambda_fast);
let score_slow = state.current_score(1, t, lambda_slow);
assert!((score_fast - 0.5).abs() < 1e-6);
assert!((score_slow - (-lambda_slow * 3600.0).exp()).abs() < 1e-6);
assert!(score_slow > score_fast, "slow decay should retain more");
}
Acceptance Criteria
- HotSignalState is #[repr(C, align(64))] with compile-time size assertion == 64
- on_signal() implements the running decay formula with CAS loops using AcqRel/Acquire ordering
- current_score() applies lazy decay with Acquire loads
- Out-of-order events pre-decay the weight and do not regress last_update_ns
- Running score matches analytical brute-force sum to 6 decimal places (property test P2)
- Decay scores monotonically decrease without new events (property test P1)
- Decay scores are always non-negative across all property test inputs (INV-SIG-3)
- Out-of-order processing produces same score as in-order to 6 decimal places (property test P4)
- restore() correctly sets all fields for checkpoint recovery
- No unsafe code
- cargo clippy -- -D warnings passes
- All property tests and unit tests pass
Research References
- docs/research/tidaldb_signal_ledger.md -- Section 3 (running-score formula proof), Section 4 (EntityState struct layout), Section 5 (f64 precision analysis: "adequate through year 18,000"), performance estimates (12ns per exp(), 36ns for 3 rates)
- Cormode, G. et al., "Forward Decay: A Practical Time Decay Model for Streaming Systems," ICDE 2009 -- mathematical foundation for running score exactness
Spec References
- docs/specs/03-signal-system.md -- Section 3 (HotSignalState layout), Section 4 (decay computation: write-path on_signal, read-path current_score, out-of-order handling, numerical stability), invariants INV-SIG-2 (monotonic decrease), INV-SIG-3 (non-negative), INV-SIG-5 (running score exactness), INV-CON-1 (lock-free reads), INV-CON-2 (CAS correctness), performance targets (Section 12: hot-tier update < 50ns, decay score read ~15ns)
- docs/specs/00-architecture-overview.md -- Section 8 code module map showing signal/hot.rs
Implementation Notes
- f64::from_bits(0u64) returns 0.0 and (0.0f64).to_bits() returns 0u64. This means a zeroed AtomicU64 reads as 0.0 through from_bits, which is the correct initial decay score. No special initialization needed.
- compare_exchange_weak is used instead of compare_exchange because we are in a retry loop. The weak variant may fail spuriously but is faster on architectures with LL/SC (ARM). On x86, both compile to CMPXCHG.
- The _pad0 and _pad1 fields ensure the struct is exactly 64 bytes. Without them, the compiler might add different padding that changes the size. #[repr(C)] makes the layout deterministic.
- Do NOT implement the Jacobs forward-decay trick in this task. It eliminates read-time computation but requires log-space arithmetic and overflow prevention. Deferred to M2+ as an optimization.
- Do NOT add benchmark harness in this task. Benchmarks are added in Task 03 after the full signal ledger is assembled. Property tests are the correctness gate for this task.