jordan 6fdaa1584b feat: complete M1 signal engine — m0p3 samples/docs, m1p5 TidalDb API, examples, and periodic checkpoint

- m0p3: CONTRIBUTING.md with run-samples checklist, all 4 examples
  (quickstart, cli_embedding, axum_embedding, actix_embedding), doc-test
  coverage for every public API surface
- m1p5: TidalDb public API — write_item, signal, read_decay_score,
  read_windowed_count, read_velocity; StorageBox enum routing memory vs
  fjall; WalSender/WalHandleWriter bridge; WAL replay on open
- Periodic checkpoint: 30s background thread for persistent+schema mode;
  FjallBackend::Clone (O(1), fjall::Keyspace is ref-counted); graceful
  shutdown via Arc<AtomicBool> + join before final checkpoint
- ROADMAP.md: M0 and M1 fully marked COMPLETE (341 tests passing)
- Milestone 2 planning scaffolding added under docs/planning/milestone-2/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-20 22:45:10 -07:00



Task 03: Profile Executor + Benchmarks

Context

Milestone: 2 -- Ranked Retrieval
Phase: m2p3 -- Ranking Profile Engine
Depends On: Task 01 (RankingProfile types, Sort, ScoringRule), Task 02 (Built-in profiles, ProfileRegistry with builtins registered)
Blocks: m2p4 (diversity enforcement receives scored lists from executor), m2p5 (RETRIEVE executor calls profile executor for scoring)
Complexity: L

Objective

Deliver the ProfileExecutor that takes a &RankingProfile and a &[EntityId] of candidates, reads signal state from the SignalLedger, applies the profile's scoring rules (boosts, penalties, gates, sort formulas), and returns Vec<ScoredCandidate> sorted by score descending. This is the heart of tidalDB's ranking engine -- the function that turns "here are 200 candidate items" into "here are 200 items ranked by this profile."

The executor implements all sort mode formulas from Spec 09 Section 11:

Hot: log10(max(|positive - negative|, 1)) / (age_hours + 2)^gravity
Controversial: (positive * negative) / (positive + negative)^2
Hidden Gems: quality_score * (1 / log10(view_count + 10))
Trending: share_velocity * 0.5 + view_velocity * 0.3 + reach_value * 0.2
Shuffle: random(seed) * sqrt(quality_score)
New: created_at descending (metadata sort)
TopWindow: Weighted signal sum within a window
MostViewed/MostLiked: Single signal count descending
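
As a rough illustration of the two simplest entries in the list above, the sketch below writes Trending as a plain weighted sum and MostViewed/MostLiked as a descending sort on one count. The function names, the `(entity_id, count)` tuple layout, and the tie-break on entity id are assumptions made for this example; none of them come from Spec 09.

```rust
/// Trending: weighted sum of share velocity, view velocity, and reach.
/// Weights are the ones listed above; inputs are assumed to be on
/// comparable scales already (an assumption of this sketch).
fn trending_score(share_velocity: f64, view_velocity: f64, reach_value: f64) -> f64 {
    share_velocity * 0.5 + view_velocity * 0.3 + reach_value * 0.2
}

/// MostViewed / MostLiked: order `(entity_id, count)` pairs by count,
/// descending. Ties fall back to ascending entity id so the order is
/// deterministic (hypothetical tie-break, not taken from the spec).
fn sort_by_count_desc(items: &mut [(u64, u64)]) {
    items.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
}
```

Because the three weights sum to 1.0, an item whose three inputs are all 1.0 scores 1.0, which makes the relative contribution of each term easy to read off.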

The key performance gate: 200-candidate scoring pass < 10 microseconds (benchmarked with Criterion). This budget allows ~50ns per candidate. That is tight: hot-tier decay reads cost ~15ns each, so about three fit per candidate, while a single windowed count read (~200ns) already exceeds the budget. The executor must avoid allocation on the hot path and batch signal reads efficiently.

Requirements

ScoredCandidate struct: entity_id, score (f64), signal_snapshot
ProfileExecutor struct: borrows SignalLedger for signal reads
ProfileExecutor::score(): scores candidates against a profile, returns sorted Vec<ScoredCandidate>
All sort formula implementations from Spec 09 Section 11
Min-max score normalization to [0.0, 1.0] (Spec 09 Section 8.2)
Gate evaluation: candidates below threshold get score 0.0, filtered out before return
Signal snapshot: record key signal values used in scoring for each result
ShuffleExecutor: deterministic seeded RNG from (user_id, profile_name, page_cursor) using SmallRng
Criterion benchmarks meeting the 10us/200-candidate target
Deterministic scoring (INV-RANK-1)
No unsafe code

Signal snapshot deferred to post-sort: The score() method must NOT build signal snapshots for all 200 candidates. Snapshots are expensive (String allocations per candidate). Build snapshots only for the final top-K result set (after gate filtering, sorting, and limiting to RETRIEVE's requested count). The ScoredCandidate type for the hot path can use an empty Vec for signal_snapshot; a separate enrichment step adds snapshots for the returned page only.
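
One possible shape for this two-phase flow is sketched below. The `Scored` struct, both function names, and the closure parameters are stand-ins invented for the example; the real executor would read from the `SignalLedger` rather than take closures.

```rust
/// Simplified stand-in for ScoredCandidate (hypothetical, for this sketch).
#[derive(Debug, Clone)]
struct Scored {
    entity_id: u64,
    score: f64,
    signal_snapshot: Vec<(String, f64)>,
}

/// Phase 1 (hot path): score, drop gated candidates, sort descending.
/// `Vec::new()` does not allocate, so no snapshot cost is paid here.
fn score_hot_path(candidates: &[u64], score_of: impl Fn(u64) -> f64) -> Vec<Scored> {
    let mut out: Vec<Scored> = candidates
        .iter()
        .map(|&id| Scored { entity_id: id, score: score_of(id), signal_snapshot: Vec::new() })
        .filter(|c| c.score > 0.0)
        .collect();
    out.sort_by(|a, b| b.score.total_cmp(&a.score));
    out
}

/// Phase 2 (enrichment): keep the top `k` and build snapshots only for them.
fn enrich_top_k(
    mut scored: Vec<Scored>,
    k: usize,
    snapshot_of: impl Fn(u64) -> Vec<(String, f64)>,
) -> Vec<Scored> {
    scored.truncate(k);
    for c in scored.iter_mut() {
        c.signal_snapshot = snapshot_of(c.entity_id);
    }
    scored
}
```

With 200 candidates and a page size of 20, this pays the String allocations 20 times instead of 200.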

Technical Design

Module Structure

tidal/src/ranking/
  executor.rs   -- ProfileExecutor, ScoredCandidate, score(), sort formula implementations
  shuffle.rs    -- ShuffleExecutor, seeded RNG, shuffle_score()

tidal/benches/
  ranking.rs    -- Criterion benchmarks

Public API

// === ranking/executor.rs ===

use crate::schema::{EntityId, Window, Timestamp};
use crate::signals::SignalLedger;
use super::profile::{RankingProfile, Sort, Boost, Gate, Penalty, SignalAgg};

/// A scored candidate with signal transparency data.
///
/// Returned from `ProfileExecutor::score()`. Sorted by `score` descending.
/// The `signal_snapshot` provides the signal values used in scoring
/// for debugging and response transparency (Spec 09 Section 4, Stage 10).
#[derive(Debug, Clone)]
pub struct ScoredCandidate {
    /// The entity that was scored.
    pub entity_id: EntityId,

    /// The composite score after boost/penalty/gate/normalization.
    /// In range [0.0, 1.0] after normalization.
    /// Candidates with score 0.0 are excluded from results.
    pub score: f64,

    /// Key signal values used in scoring. For debugging/transparency.
    /// Contains (signal_name, value) pairs for signals referenced
    /// by the profile's scoring rules. Capped at 10 entries.
    pub signal_snapshot: Vec<(String, f64)>,
}

impl ScoredCandidate {
    /// Construct a scored candidate (used internally and in tests).
    pub fn new(entity_id: EntityId, score: f64) -> Self {
        Self {
            entity_id,
            score,
            signal_snapshot: Vec::new(),
        }
    }
}

/// Executes ranking profiles against candidate sets.
///
/// The executor borrows a `SignalLedger` for reading decay scores,
/// windowed counts, and velocity. It does NOT own the ledger or
/// modify signal state. Ranking is a pure read operation.
///
/// # Execution Pipeline
///
/// For profiles with a `sort` override (Spec 09 Section 11.9):
///   1. Evaluate sort formula per candidate (replaces boosts/penalties)
///   2. Evaluate gates -- candidates below threshold get score 0.0
///   3. Filter out score <= 0.0 candidates
///   4. Sort by score descending
///   5. Normalize scores to [0.0, 1.0]
///   6. Build signal snapshots for the returned top-K page only (deferred enrichment)
///
/// For profiles without a sort override (boost/penalty pipeline):
///   1. Initialize score to 0.0 per candidate
///   2. Apply boosts: score += normalize(signal_value) * weight
///   3. Apply penalties: score -= normalize(signal_value) * weight
///   4. Apply recency decay: score *= exp(-ln(2) / half_life * age)
///   5. Evaluate gates -- candidates below threshold get score 0.0
///   6. Filter out score <= 0.0 candidates
///   7. Sort by score descending
///   8. Normalize scores to [0.0, 1.0]
///   9. Build signal snapshots for the returned top-K page only (deferred enrichment)
///
/// # Performance
///
/// Target: 200 candidates scored in < 10 microseconds.
/// Per-candidate budget: ~50ns. Decay reads (~15ns each) fit within it;
/// windowed reads (~200ns each) do not, so profiles that use them are
/// measured against the Spec 09 pipeline budget instead.
pub struct ProfileExecutor<'a> {
    ledger: &'a SignalLedger,
}

impl<'a> ProfileExecutor<'a> {
    /// Create an executor that reads signal state from the given ledger.
    pub fn new(ledger: &'a SignalLedger) -> Self {
        Self { ledger }
    }

    /// Score a set of candidates against a ranking profile.
    ///
    /// Returns candidates sorted by score descending.
    /// Candidates that fail gates (score <= 0.0) are excluded.
    /// Scores are normalized to [0.0, 1.0] via min-max normalization.
    ///
    /// # Arguments
    ///
    /// * `candidates` -- Entity IDs to score. The caller (RETRIEVE executor)
    ///   generates this set via the profile's CandidateStrategy.
    /// * `profile` -- The ranking profile defining scoring rules.
    /// * `now` -- Current timestamp for decay computation.
    /// * `shuffle_seed` -- Optional seed for shuffle sort mode.
    ///   Constructed from (user_id, profile_name, page_cursor).
    pub fn score(
        &self,
        candidates: &[EntityId],
        profile: &RankingProfile,
        now: Timestamp,
        shuffle_seed: Option<u64>,
    ) -> Vec<ScoredCandidate>;

    /// Score candidates using a sort formula (stages 4-5 replaced).
    ///
    /// Called when `profile.has_sort_override()` is true.
    fn score_with_sort(
        &self,
        candidates: &[EntityId],
        sort: &Sort,
        profile: &RankingProfile,
        now: Timestamp,
        shuffle_seed: Option<u64>,
    ) -> Vec<ScoredCandidate>;

    /// Score candidates using the boost/penalty pipeline.
    ///
    /// Called when `profile.has_sort_override()` is false.
    fn score_with_pipeline(
        &self,
        candidates: &[EntityId],
        profile: &RankingProfile,
        now: Timestamp,
    ) -> Vec<ScoredCandidate>;
}
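
Steps 1-4 of the boost/penalty pipeline described in the doc comment can be sketched for a single candidate as follows. This is a minimal illustration under stated assumptions: signal values arrive already normalized, boosts and penalties are passed as `(value, weight)` pairs, and the names `recency_decay` and `pipeline_score` are hypothetical.

```rust
/// Recency decay factor: exp(-ln(2) / half_life * age).
/// Equals 1.0 at age 0 and 0.5 when age == half_life.
fn recency_decay(age_secs: f64, half_life_secs: f64) -> f64 {
    (-std::f64::consts::LN_2 / half_life_secs * age_secs).exp()
}

/// Pipeline steps 1-4 for one candidate.
/// `boosts` and `penalties` hold (normalized_value, weight) pairs.
fn pipeline_score(
    boosts: &[(f64, f64)],
    penalties: &[(f64, f64)],
    age_secs: f64,
    half_life_secs: f64,
) -> f64 {
    let mut score = 0.0; // step 1: initialize
    for &(value, weight) in boosts {
        score += value * weight; // step 2: boosts add
    }
    for &(value, weight) in penalties {
        score -= value * weight; // step 3: penalties subtract
    }
    score * recency_decay(age_secs, half_life_secs) // step 4: recency decay
}
```

Gating, filtering, sorting, and normalization (steps 5-8) then operate on the whole candidate list rather than on one candidate at a time.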

Sort Formula Implementations

Each sort formula is a standalone function for testability:

// === ranking/executor.rs (internal functions) ===

/// Hot formula: log10(max(|positive - negative|, 1)) / (age_hours + 2)^gravity
///
/// Spec 09 Section 11.1.
/// positive = like.count(all_time)
/// negative = dislike.count(all_time)
/// age_hours = (now - created_at).as_secs_f64() / 3600.0
fn hot_score(positive: u64, negative: u64, age_hours: f64, gravity: f64) -> f64 {
    let diff = (positive as f64 - negative as f64).abs().max(1.0);
    diff.log10() / (age_hours + 2.0).powf(gravity)
}

/// Controversial formula: (positive * negative) / (positive + negative)^2
///
/// Spec 09 Section 11.4.
/// Maximizes the product of positive and negative signals.
/// Score is 0.25 when positive == negative (maximum controversy).
fn controversial_score(positive: u64, negative: u64) -> f64 {
    let total = positive + negative;
    if total == 0 {
        return 0.0;
    }
    (positive as f64 * negative as f64) / (total as f64 * total as f64)
}

/// Hidden gems formula: quality_score * (1 / log10(view_count + 10))
///
/// Spec 09 Section 11.5.
/// quality_score = completion_rate * 0.6 + like_ratio * 0.4
/// Inverse reach ensures diminishing penalty as reach grows.
fn hidden_gems_score(quality_score: f64, view_count: u64) -> f64 {
    quality_score * (1.0 / (view_count as f64 + 10.0).log10())
}

/// Top window formula: weighted signal sum within a window.
///
/// Spec 09 Section 11.7.
/// weighted_sum = view * 0.3 + like * 0.3 + share * 0.2
///              + comment * 0.1 + completion_rate * views * 0.1
fn top_window_score(
    view_count: u64,
    like_count: u64,
    share_count: u64,
    completion_rate: f64,
) -> f64 {
    view_count as f64 * 0.3
        + like_count as f64 * 0.3
        + share_count as f64 * 0.2
        + completion_rate * view_count as f64 * 0.1
    // comment count deferred to M6 -- comment signal type not in M2 schema
}

/// Min-max normalization of scores to [0.0, 1.0].
///
/// Spec 09 Section 8.2.
/// If max == min, all scores are set to 0.5.
fn min_max_normalize(scores: &mut [f64]) {
    if scores.is_empty() {
        return;
    }
    let min = scores.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let range = max - min;
    if range < f64::EPSILON {
        for s in scores.iter_mut() {
            *s = 0.5;
        }
    } else {
        for s in scores.iter_mut() {
            *s = (*s - min) / range;
        }
    }
}

// === ranking/shuffle.rs ===

use rand::rngs::SmallRng;
use rand::{Rng, SeedableRng};

/// Produces deterministic shuffle scores for a set of candidates.
///
/// The seed is derived from a combination of user-specific and query-specific
/// values to ensure:
/// - Same user + same profile + same page cursor = same ordering
/// - Different users = different orderings
/// - Anonymous users get orderings that change over time (seeded from timestamp_minute)
///
/// Spec 09 Section 11.6.
pub struct ShuffleExecutor {
    rng: SmallRng,
}

impl ShuffleExecutor {
    /// Create a shuffle executor with the given seed.
    ///
    /// The caller constructs the seed from (user_id, profile_name, page_cursor)
    /// or (timestamp_minute) for anonymous users.
    pub fn new(seed: u64) -> Self {
        Self {
            rng: SmallRng::seed_from_u64(seed),
        }
    }

    /// Compute a stable, deterministic shuffle seed from user identity and pagination state.
    ///
    /// Uses BLAKE3 (already in the crate graph via the WAL) for stable output
    /// across Rust compiler upgrades. `DefaultHasher` is NOT stable across
    /// Rust toolchain versions and must not be used for persistent ordering.
    pub fn compute_seed(user_id: u64, profile_name: &str, page_cursor: u64) -> u64 {
        let mut hasher = blake3::Hasher::new();
        hasher.update(&user_id.to_le_bytes());
        hasher.update(profile_name.as_bytes());
        hasher.update(&page_cursor.to_le_bytes());
        let hash = hasher.finalize();
        u64::from_le_bytes(hash.as_bytes()[..8].try_into().unwrap())
    }

    /// Score a single candidate for shuffle ordering.
    ///
    /// Formula: `random(0..1) * sqrt(quality_score)`
    /// quality_score should be in [0.0, 1.0].
    /// The sqrt ensures high-quality items are more likely to appear
    /// but does not guarantee it.
    pub fn shuffle_score(&mut self, quality_score: f64) -> f64 {
        let r: f64 = self.rng.random();
        r * quality_score.max(0.0).sqrt()
    }
}
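
To show how a seeded generator turns into a repeatable ordering, here is a std-only sketch. It swaps in SplitMix64 as a stand-in for `SmallRng` so the example needs no external crates; `SplitMix64`, `shuffle_order`, and the `(entity_id, quality_score)` tuple layout are assumptions for illustration, not part of the design above.

```rust
/// SplitMix64: a tiny deterministic PRNG, used here only as a
/// dependency-free stand-in for rand's SmallRng.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Order `(entity_id, quality_score)` pairs by random * sqrt(quality),
/// descending. The same seed and input always give the same order.
fn shuffle_order(candidates: &[(u64, f64)], seed: u64) -> Vec<u64> {
    let mut rng = SplitMix64::new(seed);
    let mut scored: Vec<(u64, f64)> = candidates
        .iter()
        .map(|&(id, quality)| (id, rng.next_f64() * quality.max(0.0).sqrt()))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().map(|(id, _)| id).collect()
}
```

Scores are drawn in candidate order, so the result depends on the order of the input slice as well as on the seed; callers that need stable pages would have to pass candidates in a stable order.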

Signal Read Strategy

The executor reads signal values from the SignalLedger via methods established in m1p4/m1p5:

| Signal Read | Method | Latency |
|---|---|---|
| Decay score | `ledger.current_score(entity_id, signal_type_id, decay_rate_idx, now)` | ~15ns |
| Windowed count | `ledger.windowed_count(entity_id, signal_type_id, window, now)` | ~200ns |
| Velocity | `ledger.velocity(entity_id, signal_type_id, window, now)` | ~500ns |
| All-time count | `ledger.windowed_count(entity_id, signal_type_id, Window::AllTime, now)` | ~2ns |

For a typical profile with 3 boosts and 1 penalty reading 4 signal values per candidate, the signal read cost is approximately:

4 signal reads * ~200ns avg = ~800ns per candidate
200 candidates * 800ns = ~160us total

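This exceeds the 10us target by more than an order of magnitude. To meet the target, the benchmark profile must read hot-tier decay scores (~15ns each) rather than windowed counts, and keep to 2-3 reads per candidate:

3 decay reads * 15ns = ~45ns per candidate
200 candidates * 45ns = ~9us total -- within budget, with ~1us left for scoring math

A fourth decay read raises this to ~60ns per candidate and ~12us total, which exceeds the 10us gate.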

The benchmark profiles (Task acceptance criteria) use decay scores to demonstrate the <10us target. Profiles using windowed counts or velocity are expected to take roughly 160-200us for 200 candidates, which is within the Spec 09 Section 15.2 total scoring pipeline budget of <500us.

Resolution: The ROADMAP acceptance criterion "200-candidate scoring pass with a profile < 10 microseconds" is met using decay-score-only profiles (e.g., a profile with 2-3 boosts reading decay scores). Profiles using windowed counts are benchmarked separately and target the Spec 09 pipeline budget. The Criterion benchmarks include both scenarios.
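
The read-cost arithmetic in this section can be reproduced with a small estimator. The latency constants are the approximate figures from the table above; the function name and signature are hypothetical, and the estimate covers signal reads only, not scoring math, sorting, or normalization.

```rust
/// Approximate per-read latencies from the signal read table, in nanoseconds.
const DECAY_READ_NS: f64 = 15.0;
const WINDOWED_READ_NS: f64 = 200.0;
const VELOCITY_READ_NS: f64 = 500.0;

/// Estimated signal-read time for one scoring pass, in microseconds.
/// Arguments are the number of reads of each kind per candidate.
fn estimated_pass_us(candidates: u32, decay: u32, windowed: u32, velocity: u32) -> f64 {
    let per_candidate_ns = decay as f64 * DECAY_READ_NS
        + windowed as f64 * WINDOWED_READ_NS
        + velocity as f64 * VELOCITY_READ_NS;
    candidates as f64 * per_candidate_ns / 1_000.0
}
```

For 200 candidates this gives 9us for three decay reads, 12us for four decay reads, and 160us for four windowed reads, which is why only profiles with two or three decay reads fit under the 10us gate.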

Gate Evaluation

/// Evaluate a gate for a single candidate.
///
/// Returns true if the candidate passes the gate, false if it fails.
/// Failed candidates get score 0.0.
fn evaluate_gate(
    gate: &Gate,
    entity_id: EntityId,
    ledger: &SignalLedger,
    now: Timestamp,
) -> bool {
    match gate {
        Gate::MinSignal { signal, window, threshold } => {
            let signal_id = ledger.signal_type_id(signal);
            match signal_id {
                Some(id) => {
                    let value = ledger.windowed_count(entity_id, id, *window, now);
                    value as f64 >= *threshold
                }
                None => true, // Signal not in schema -- gate vacuously passes
            }
        }
        Gate::MinRatio { ratio_name, threshold } => {
            // Ratio computation requires multiple signal reads.
            // For M2, only "engagement_ratio" is supported.
            match ratio_name.as_str() {
                "engagement_ratio" => {
                    let views = read_signal_count(ledger, entity_id, "view", Window::AllTime, now);
                    let likes = read_signal_count(ledger, entity_id, "like", Window::AllTime, now);
                    if views == 0 { return false; }
                    let ratio = likes as f64 / views as f64;
                    ratio >= *threshold
                }
                _ => true, // Unknown ratio -- gate vacuously passes
            }
        }
        Gate::MinCount { signal, window, count } => {
            let signal_id = ledger.signal_type_id(signal);
            match signal_id {
                Some(id) => {
                    let value = ledger.windowed_count(entity_id, id, *window, now);
                    value >= *count
                }
                None => true, // Signal not in schema -- gate vacuously passes
            }
        }
    }
}

/// Helper: read windowed count for a signal by name. Returns 0 if signal not found.
fn read_signal_count(
    ledger: &SignalLedger,
    entity_id: EntityId,
    signal_name: &str,
    window: Window,
    now: Timestamp,
) -> u64 {
    ledger
        .signal_type_id(signal_name)
        .map(|id| ledger.windowed_count(entity_id, id, window, now))
        .unwrap_or(0)
}
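
The vacuous-pass rule is easiest to see on a toy ledger. In the sketch below, `ToyLedger` is a hypothetical stand-in (signal name to per-entity counts) for the real `SignalLedger`, and both function names are invented for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for SignalLedger: signal name -> (entity id -> count).
/// A signal is "in the schema" if and only if it has an entry in the outer map.
type ToyLedger = HashMap<String, HashMap<u64, u64>>;

/// MinCount-style gate: pass when count >= min, or when the signal is unknown.
fn passes_min_count(ledger: &ToyLedger, entity: u64, signal: &str, min: u64) -> bool {
    match ledger.get(signal) {
        // Known signal: an entity with no recorded events counts as 0.
        Some(per_entity) => per_entity.get(&entity).copied().unwrap_or(0) >= min,
        // Unknown signal: gate vacuously passes, so nothing is excluded.
        None => true,
    }
}

/// Keep only the candidates that pass the gate, preserving input order.
fn filter_by_gate(ledger: &ToyLedger, candidates: &[u64], signal: &str, min: u64) -> Vec<u64> {
    candidates
        .iter()
        .copied()
        .filter(|&id| passes_min_count(ledger, id, signal, min))
        .collect()
}
```

A gate on a signal that exists but has no events for an entity fails that entity, whereas a gate on a signal absent from the schema lets every candidate through.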

Signal Snapshot Construction

/// Maximum number of (signal_name, value) pairs recorded per candidate.
const MAX_SNAPSHOT_SIGNALS: usize = 10;

/// Build a signal snapshot for a scored candidate.
///
/// Includes all signals referenced by the profile's scoring rules.
/// Capped at MAX_SNAPSHOT_SIGNALS entries.
fn build_signal_snapshot(
    entity_id: EntityId,
    profile: &RankingProfile,
    ledger: &SignalLedger,
    now: Timestamp,
) -> Vec<(String, f64)> {
    let mut snapshot = Vec::new();

    for boost in profile.boosts() {
        if snapshot.len() >= MAX_SNAPSHOT_SIGNALS { break; }
        if let Some(id) = ledger.signal_type_id(&boost.signal) {
            let value = match boost.aggregation {
                SignalAgg::DecayScore => ledger.current_score(entity_id, id, 0, now),
                SignalAgg::Value => ledger.windowed_count(entity_id, id, boost.window, now) as f64,
                SignalAgg::Velocity => ledger.velocity(entity_id, id, boost.window, now),
                _ => 0.0, // Ratio, RelativeVelocity deferred to M3+
            };
            snapshot.push((boost.signal.clone(), value));
        }
    }

    for penalty in profile.penalties() {
        if snapshot.len() >= MAX_SNAPSHOT_SIGNALS { break; }
        if let Some(id) = ledger.signal_type_id(&penalty.signal) {
            let value = ledger.windowed_count(entity_id, id, penalty.window, now) as f64;
            snapshot.push((penalty.signal.clone(), value));
        }
    }

    snapshot
}

Criterion Benchmarks

// === tidal/benches/ranking.rs ===

use criterion::{criterion_group, criterion_main, Criterion};
use tidaldb::schema::*;
use tidaldb::signals::*;
use tidaldb::ranking::*;

/// Setup: create a SignalLedger with 200 entities having signal state.
fn setup_ledger_200() -> (SignalLedger, Vec<EntityId>) {
    let schema = SchemaBuilder::new()
        .signal("view")
            .target(EntityKind::Item)
            .decay_exponential(std::time::Duration::from_secs(604_800))
            .windows(&[Window::OneHour, Window::TwentyFourHours, Window::SevenDays, Window::AllTime])
            .velocity(true)
            .done()
        .signal("like")
            .target(EntityKind::Item)
            .decay_exponential(std::time::Duration::from_secs(1_209_600))
            .windows(&[Window::TwentyFourHours, Window::SevenDays, Window::AllTime])
            .done()
        .build()
        .unwrap();

    let ledger = SignalLedger::new(&schema);
    let entities: Vec<EntityId> = (0..200u64).map(EntityId::new).collect();
    let now_ns = 1_000_000_000_000u64; // arbitrary "now" (1,000 s), matches `now` in the benches

    // Populate signal state: each entity gets 10-49 events
    for &entity_id in &entities {
        let num_events = 10 + (entity_id.as_u64() % 40);
        for i in 0..num_events {
            // Events 10 seconds apart going back. A 1-minute spacing would
            // underflow the u64: 48 * 60 s exceeds the 1,000 s value of `now_ns`.
            let t = now_ns - (i * 10_000_000_000);
            ledger.on_signal(entity_id, /* view signal id */ 0.into(), 1.0, t);
            if i % 3 == 0 {
                ledger.on_signal(entity_id, /* like signal id */ 1.into(), 1.0, t);
            }
        }
    }

    (ledger, entities)
}

/// KEY BENCHMARK: 200 candidates, trending profile (velocity-based).
/// Target: < 100 microseconds (200 candidates * ~500ns per velocity read = ~100us).
///
/// The < 10us target applies to decay-score-only profiles, not to
/// velocity-based profiles like `trending`.
///
/// Note: this benchmark MUST populate the `SignalLedger` with actual signal state
/// for 200 entities before measuring. If the ledger is empty, the benchmark measures
/// no-op scoring (all signals return 0.0) rather than the actual scoring path.
/// `setup_ledger_200()` writes 10-49 signal events per entity before the timed section.
fn bench_scoring_200_candidates_trending(c: &mut Criterion) {
    let (ledger, entities) = setup_ledger_200();
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]); // signals validated separately
    let profile = registry.get("trending").unwrap();
    let now = Timestamp::from_nanos(1_000_000_000_000);
    let executor = ProfileExecutor::new(&ledger);

    c.bench_function("score_200_candidates_trending", |b| {
        b.iter(|| {
            executor.score(&entities, profile, now, None)
        })
    });
}

/// 200 candidates, hot formula (requires age computation).
fn bench_scoring_200_candidates_hot(c: &mut Criterion) {
    let (ledger, entities) = setup_ledger_200();
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);
    let profile = registry.get("hot").unwrap();
    let now = Timestamp::from_nanos(1_000_000_000_000);
    let executor = ProfileExecutor::new(&ledger);

    c.bench_function("score_200_candidates_hot", |b| {
        b.iter(|| {
            executor.score(&entities, profile, now, None)
        })
    });
}

/// 200 candidates, full pipeline profile with 3 boosts + 1 penalty + 1 gate.
fn bench_scoring_200_candidates_full_pipeline(c: &mut Criterion) {
    let (ledger, entities) = setup_ledger_200();
    let mut profile = RankingProfile::new("bench_full", 1);
    profile
        .with_boost(Boost::new("view", Window::TwentyFourHours, SignalAgg::DecayScore, 0.3))
        .with_boost(Boost::new("like", Window::AllTime, SignalAgg::DecayScore, 0.3))
        .with_boost(Boost::new("view", Window::SevenDays, SignalAgg::DecayScore, 0.2))
        .with_penalty(Penalty::new("view", Window::OneHour, 0.1))
        .with_gate(Gate::min_count("view", Window::AllTime, 5));

    let now = Timestamp::from_nanos(1_000_000_000_000);
    let executor = ProfileExecutor::new(&ledger);

    c.bench_function("score_200_candidates_full_pipeline", |b| {
        b.iter(|| {
            executor.score(&entities, &profile, now, None)
        })
    });
}

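/// Sort phase only: sort 200 pre-scored candidates.
fn bench_sort_200_candidates(c: &mut Criterion) {
    let candidates: Vec<ScoredCandidate> = (0..200u64)
        .map(|i| ScoredCandidate::new(EntityId::new(i), i as f64 * 0.005))
        .collect();

    c.bench_function("sort_200_candidates", |b| {
        // Clone per batch so every iteration sorts the same unsorted input;
        // sorting in place would leave later iterations with pre-sorted data.
        b.iter_batched(
            || candidates.clone(),
            |mut batch| {
                // total_cmp never panics, unlike partial_cmp().unwrap() on NaN.
                batch.sort_by(|x, y| y.score.total_cmp(&x.score));
                batch
            },
            criterion::BatchSize::SmallInput,
        )
    });
}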

/// Shuffle scoring: 200 candidates with seeded RNG.
fn bench_scoring_200_candidates_shuffle(c: &mut Criterion) {
    let (ledger, entities) = setup_ledger_200();
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);
    let profile = registry.get("shuffle").unwrap();
    let now = Timestamp::from_nanos(1_000_000_000_000);
    let executor = ProfileExecutor::new(&ledger);
    let seed = ShuffleExecutor::compute_seed(42, "shuffle", 0);

    c.bench_function("score_200_candidates_shuffle", |b| {
        b.iter(|| {
            executor.score(&entities, profile, now, Some(seed))
        })
    });
}

/// Min-max normalization of 200 scores.
fn bench_normalize_200(c: &mut Criterion) {
    let mut scores: Vec<f64> = (0..200).map(|i| i as f64 * 0.01 + 0.5).collect();

    c.bench_function("normalize_200_scores", |b| {
        b.iter(|| {
            min_max_normalize(&mut scores);
        })
    });
}

criterion_group!(
    benches,
    bench_scoring_200_candidates_trending,
    bench_scoring_200_candidates_hot,
    bench_scoring_200_candidates_full_pipeline,
    bench_sort_200_candidates,
    bench_scoring_200_candidates_shuffle,
    bench_normalize_200,
);
criterion_main!(benches);

Error Handling

The executor never returns errors. Missing signals produce 0.0 contributions. Missing signal types in the ledger are silently skipped.
Gate evaluation on missing signals vacuously passes (the candidate is not excluded). This prevents missing signals from turning gates into global excluders.
score() returns an empty Vec if all candidates fail gates. This is valid -- the RETRIEVE executor will return an empty result set.
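Division by zero in formulas is guarded. Controversial returns 0.0 explicitly when total=0. Hidden gems cannot divide by zero, because log10(view_count + 10) is always at least 1.0; a quality_score of 0 simply yields a score of 0.0.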

Test Strategy

Unit Tests

// === Sort Formula Tests ===

#[test]
fn hot_score_basic() {
    // 100 likes, 10 dislikes, 1 hour old, gravity 1.8
    let score = hot_score(100, 10, 1.0, 1.8);
    // log10(|100-10|) / (1+2)^1.8 = log10(90) / 3^1.8
    let expected = 90.0_f64.log10() / 3.0_f64.powf(1.8);
    assert!((score - expected).abs() < 1e-10,
        "hot_score={score}, expected={expected}");
}

#[test]
fn hot_score_zero_engagement() {
    // No likes or dislikes -- score uses max(1, |0-0|)
    let score = hot_score(0, 0, 1.0, 1.8);
    let expected = 1.0_f64.log10() / 3.0_f64.powf(1.8);
    assert!((score - expected).abs() < 1e-10);
    assert!((score - 0.0).abs() < 1e-10, "log10(1) = 0, so score should be 0");
}

#[test]
fn hot_score_higher_gravity_lower_score() {
    let score_low = hot_score(100, 10, 6.0, 1.0);
    let score_high = hot_score(100, 10, 6.0, 2.5);
    assert!(score_low > score_high,
        "higher gravity should produce lower score for same age");
}

#[test]
fn hot_score_older_content_scores_lower() {
    let score_new = hot_score(100, 10, 1.0, 1.8);
    let score_old = hot_score(100, 10, 24.0, 1.8);
    assert!(score_new > score_old,
        "newer content should score higher with same engagement");
}

#[test]
fn controversial_balanced() {
    // Equal positive and negative = maximum controversy
    let score = controversial_score(100, 100);
    assert!((score - 0.25).abs() < 1e-10,
        "100*100 / (200)^2 = 10000/40000 = 0.25");
}

#[test]
fn controversial_lopsided() {
    // 1800 positive, 200 negative = low controversy
    let score = controversial_score(1800, 200);
    let expected = (1800.0 * 200.0) / (2000.0 * 2000.0);
    assert!((score - expected).abs() < 1e-10);
    assert!(score < 0.25, "lopsided should be less controversial");
}

#[test]
fn controversial_zero_total() {
    let score = controversial_score(0, 0);
    assert!((score - 0.0).abs() < f64::EPSILON);
}

#[test]
fn controversial_one_sided() {
    let score = controversial_score(100, 0);
    assert!((score - 0.0).abs() < f64::EPSILON,
        "one-sided engagement is not controversial");
}

#[test]
fn hidden_gems_high_quality_low_reach() {
    let score_hidden = hidden_gems_score(0.9, 100);      // 100 views
    let score_popular = hidden_gems_score(0.9, 1_000_000); // 1M views
    assert!(score_hidden > score_popular,
        "hidden gem should score higher than viral content with same quality");
}

#[test]
fn hidden_gems_zero_views() {
    // log10(0 + 10) = 1.0, so inverse_reach = 1.0
    let score = hidden_gems_score(0.8, 0);
    assert!((score - 0.8).abs() < 1e-10);
}

#[test]
fn top_window_weighted_sum() {
    let score = top_window_score(1000, 300, 50, 0.7);
    let expected = 1000.0 * 0.3 + 300.0 * 0.3 + 50.0 * 0.2 + 0.7 * 1000.0 * 0.1;
    assert!((score - expected).abs() < 1e-10);
}

#[test]
fn min_max_normalize_basic() {
    let mut scores = vec![10.0, 20.0, 30.0, 40.0, 50.0];
    min_max_normalize(&mut scores);
    assert!((scores[0] - 0.0).abs() < 1e-10);
    assert!((scores[2] - 0.5).abs() < 1e-10);
    assert!((scores[4] - 1.0).abs() < 1e-10);
}

#[test]
fn min_max_normalize_all_equal() {
    let mut scores = vec![5.0, 5.0, 5.0];
    min_max_normalize(&mut scores);
    assert!(scores.iter().all(|&s| (s - 0.5).abs() < 1e-10));
}

#[test]
fn min_max_normalize_single() {
    let mut scores = vec![42.0];
    min_max_normalize(&mut scores);
    assert!((scores[0] - 0.5).abs() < 1e-10);
}

#[test]
fn min_max_normalize_empty() {
    let mut scores: Vec<f64> = vec![];
    min_max_normalize(&mut scores); // should not panic
}

#[test]
fn min_max_normalize_range_01() {
    let mut scores = vec![0.0, 100.0, 50.0, 75.0, 25.0];
    min_max_normalize(&mut scores);
    for s in &scores {
        assert!(*s >= 0.0 && *s <= 1.0, "score {} out of [0,1] range", s);
    }
}

// === Shuffle Tests ===

#[test]
fn shuffle_deterministic_same_seed() {
    let seed = ShuffleExecutor::compute_seed(42, "shuffle", 0);
    let mut exec1 = ShuffleExecutor::new(seed);
    let mut exec2 = ShuffleExecutor::new(seed);

    let scores1: Vec<f64> = (0..10).map(|_| exec1.shuffle_score(0.8)).collect();
    let scores2: Vec<f64> = (0..10).map(|_| exec2.shuffle_score(0.8)).collect();

    assert_eq!(scores1, scores2, "same seed should produce identical scores");
}

#[test]
fn shuffle_different_seeds_differ() {
    let seed1 = ShuffleExecutor::compute_seed(42, "shuffle", 0);
    let seed2 = ShuffleExecutor::compute_seed(99, "shuffle", 0);

    let mut exec1 = ShuffleExecutor::new(seed1);
    let mut exec2 = ShuffleExecutor::new(seed2);

    let scores1: Vec<f64> = (0..10).map(|_| exec1.shuffle_score(0.8)).collect();
    let scores2: Vec<f64> = (0..10).map(|_| exec2.shuffle_score(0.8)).collect();

    assert_ne!(scores1, scores2, "different seeds should produce different scores");
}

#[test]
fn shuffle_higher_quality_higher_expected_score() {
    let seed = ShuffleExecutor::compute_seed(42, "shuffle", 0);
    let n = 10_000;

    let mut exec_high = ShuffleExecutor::new(seed);
    let avg_high: f64 = (0..n).map(|_| exec_high.shuffle_score(1.0)).sum::<f64>() / n as f64;

    let mut exec_low = ShuffleExecutor::new(seed);
    let avg_low: f64 = (0..n).map(|_| exec_low.shuffle_score(0.1)).sum::<f64>() / n as f64;

    assert!(avg_high > avg_low,
        "higher quality should produce higher average shuffle score");
}

#[test]
fn shuffle_score_non_negative() {
    let mut exec = ShuffleExecutor::new(42);
    for _ in 0..1000 {
        let score = exec.shuffle_score(0.5);
        assert!(score >= 0.0, "shuffle score should be non-negative");
    }
}

// === Executor Integration Tests ===
// (require SignalLedger -- these are integration-level unit tests)

#[test]
fn executor_scores_sorted_descending() {
    let (ledger, entities) = setup_test_ledger();
    let executor = ProfileExecutor::new(&ledger);
    let profile = make_test_profile_with_boosts();
    let now = Timestamp::from_nanos(1_000_000_000_000);

    let results = executor.score(&entities, &profile, now, None);

    for pair in results.windows(2) {
        assert!(pair[0].score >= pair[1].score,
            "results should be sorted descending: {} >= {}", pair[0].score, pair[1].score);
    }
}

#[test]
fn executor_gates_exclude_candidates() {
    let (ledger, entities) = setup_test_ledger();
    let executor = ProfileExecutor::new(&ledger);

    // Gate: view count >= 1000 (most entities have < 50 events)
    let mut profile = RankingProfile::new("gated", 1);
    profile
        .with_boost(Boost::new("view", Window::AllTime, SignalAgg::DecayScore, 1.0))
        .with_gate(Gate::min_count("view", Window::AllTime, 1000));

    let now = Timestamp::from_nanos(1_000_000_000_000);
    let results = executor.score(&entities, &profile, now, None);

    // Most/all candidates should be filtered out by the gate
    assert!(results.len() < entities.len(),
        "gate should exclude some candidates");
    assert!(results.iter().all(|r| r.score > 0.0),
        "no zero-score candidates should be in results");
}

#[test]
fn executor_normalized_scores_in_range() {
    let (ledger, entities) = setup_test_ledger();
    let executor = ProfileExecutor::new(&ledger);
    let profile = make_test_profile_with_boosts();
    let now = Timestamp::from_nanos(1_000_000_000_000);

    let results = executor.score(&entities, &profile, now, None);

    for r in &results {
        assert!((0.0..=1.0).contains(&r.score),
            "normalized score {} out of [0,1] range for entity {}",
            r.score, r.entity_id);
    }
}

#[test]
fn executor_deterministic_scoring() {
    let (ledger, entities) = setup_test_ledger();
    let executor = ProfileExecutor::new(&ledger);
    let profile = make_test_profile_with_boosts();
    let now = Timestamp::from_nanos(1_000_000_000_000);

    let results1 = executor.score(&entities, &profile, now, None);
    let results2 = executor.score(&entities, &profile, now, None);

    assert_eq!(results1.len(), results2.len());
    for (r1, r2) in results1.iter().zip(results2.iter()) {
        assert_eq!(r1.entity_id, r2.entity_id);
        assert!((r1.score - r2.score).abs() < f64::EPSILON,
            "scoring must be deterministic: {} vs {}", r1.score, r2.score);
    }
}

#[test]
fn executor_signal_snapshot_populated() {
    let (ledger, entities) = setup_test_ledger();
    let executor = ProfileExecutor::new(&ledger);
    let mut profile = RankingProfile::new("snapshot_test", 1);
    profile
        .with_boost(Boost::new("view", Window::AllTime, SignalAgg::DecayScore, 0.5))
        .with_boost(Boost::new("like", Window::AllTime, SignalAgg::DecayScore, 0.5));

    let now = Timestamp::from_nanos(1_000_000_000_000);
    let results = executor.score(&entities, &profile, now, None);

    // At least some results should have signal snapshots
    let has_snapshots = results.iter().any(|r| !r.signal_snapshot.is_empty());
    assert!(has_snapshots, "results should include signal snapshots");

    for r in &results {
        assert!(r.signal_snapshot.len() <= 10,
            "signal snapshot capped at 10, got {}", r.signal_snapshot.len());
    }
}

#[test]
fn executor_hot_formula_correct() {
    let (ledger, entities) = setup_test_ledger();
    let executor = ProfileExecutor::new(&ledger);
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);
    let profile = registry.get("hot").unwrap();
    let now = Timestamp::from_nanos(1_000_000_000_000);

    let results = executor.score(&entities, profile, now, None);

    // Hot scores should be non-negative
    assert!(results.iter().all(|r| r.score >= 0.0));
    // The default hot profile has no gates, so every candidate should be scored
    assert_eq!(results.len(), entities.len());
}

#[test]
fn executor_empty_candidates() {
    let (ledger, _) = setup_test_ledger();
    let executor = ProfileExecutor::new(&ledger);
    let profile = make_test_profile_with_boosts();
    let now = Timestamp::from_nanos(1_000_000_000_000);

    let results = executor.score(&[], &profile, now, None);
    assert!(results.is_empty());
}

#[test]
fn executor_sort_override_skips_boosts() {
    let (ledger, entities) = setup_test_ledger();
    let executor = ProfileExecutor::new(&ledger);

    // Profile with boosts AND a sort override
    let mut profile = RankingProfile::new("sort_test", 1);
    profile
        .with_boost(Boost::new("view", Window::AllTime, SignalAgg::DecayScore, 1.0))
        .with_sort(Sort::New);

    let now = Timestamp::from_nanos(1_000_000_000_000);
    let results = executor.score(&entities, &profile, now, None);

    // Sort::New should order by created_at, not by boost scores
    // (exact verification depends on entity metadata -- verify non-empty)
    assert!(!results.is_empty());
}

Property Tests

use proptest::prelude::*;

// P1: Normalized scores are always in [0.0, 1.0] (INV-RANK-2).
proptest! {
    #[test]
    fn normalized_scores_in_range(
        raw_scores in prop::collection::vec(
            -1000.0f64..1000.0, 2..200
        ),
    ) {
        let mut scores = raw_scores.clone();
        min_max_normalize(&mut scores);
        for &s in &scores {
            prop_assert!((0.0..=1.0).contains(&s),
                "normalized score {} out of range", s);
        }
    }
}

// P2: Seeded shuffle scoring is deterministic (INV-RANK-1).
proptest! {
    #[test]
    fn scoring_deterministic(
        seed in any::<u64>(),
    ) {
        let mut exec1 = ShuffleExecutor::new(seed);
        let mut exec2 = ShuffleExecutor::new(seed);

        let scores1: Vec<f64> = (0..50).map(|_| exec1.shuffle_score(0.7)).collect();
        let scores2: Vec<f64> = (0..50).map(|_| exec2.shuffle_score(0.7)).collect();

        prop_assert_eq!(&scores1, &scores2);
    }
}

// P3: Controversial score is symmetric.
proptest! {
    #[test]
    fn controversial_symmetric(
        a in 0u64..10000,
        b in 0u64..10000,
    ) {
        let score_ab = controversial_score(a, b);
        let score_ba = controversial_score(b, a);
        prop_assert!((score_ab - score_ba).abs() < 1e-10,
            "controversial({a},{b})={score_ab} != controversial({b},{a})={score_ba}");
    }
}

// P4: Controversial score is maximized when positive == negative.
proptest! {
    #[test]
    fn controversial_max_at_balance(
        n in 10u64..10000,
        delta in 1u64..1000,
    ) {
        let balanced = controversial_score(n, n);
        let unbalanced = controversial_score(n + delta, n);
        prop_assert!(balanced >= unbalanced,
            "balanced ({n},{n})={balanced} should >= unbalanced ({},{n})={unbalanced}",
            n + delta);
    }
}

// P5: Hot score decreases with age (all else equal).
proptest! {
    #[test]
    fn hot_score_decreases_with_age(
        positive in 1u64..10000,
        negative in 0u64..10000,
        age1 in 0.1f64..100.0,
        age_delta in 0.1f64..100.0,
        gravity in 0.5f64..3.0,
    ) {
        let score1 = hot_score(positive, negative, age1, gravity);
        let score2 = hot_score(positive, negative, age1 + age_delta, gravity);
        prop_assert!(score1 >= score2,
            "hot score should decrease with age: age={age1} score={score1}, age={} score={score2}",
            age1 + age_delta);
    }
}

// P6: Hidden gems score decreases with more views (quality held constant).
proptest! {
    #[test]
    fn hidden_gems_decreases_with_views(
        quality in 0.01f64..1.0,
        views1 in 0u64..100000,
        views_delta in 1u64..100000,
    ) {
        let score1 = hidden_gems_score(quality, views1);
        let score2 = hidden_gems_score(quality, views1 + views_delta);
        prop_assert!(score1 >= score2,
            "hidden gems should decrease with views: views={views1} score={score1}, views={} score={score2}",
            views1 + views_delta);
    }
}

Acceptance Criteria

ScoredCandidate struct with entity_id, score, signal_snapshot
ProfileExecutor::new(ledger) borrows a SignalLedger
ProfileExecutor::score() takes candidates, profile, now, optional shuffle_seed; returns Vec<ScoredCandidate> sorted descending
Sort override detection: when profile.has_sort_override(), sort formula replaces boost/penalty pipeline
hot_score() implements log10(max(|positive - negative|, 1)) / (age_hours + 2)^gravity matching Spec 09 Section 11.1
controversial_score() implements (positive * negative) / (positive + negative)^2 matching Spec 09 Section 11.4
hidden_gems_score() implements quality_score * (1 / log10(view_count + 10)) matching Spec 09 Section 11.5
top_window_score() implements weighted signal sum matching Spec 09 Section 11.7
Sort::New produces created_at DESC ordering
Sort::MostViewed produces windowed view count DESC ordering
Sort::MostLiked produces windowed like count DESC ordering
Sort::Shuffle uses ShuffleExecutor with deterministic seeded RNG
ShuffleExecutor::compute_seed() produces deterministic seeds from (user_id, profile_name, page_cursor) using BLAKE3 (stable across Rust toolchain versions)
Same seed produces identical shuffle scores (deterministic, INV-RANK-1)
Different seeds produce different shuffle scores
min_max_normalize() maps scores to [0.0, 1.0]; all-equal scores map to 0.5
Gate evaluation: candidates below threshold get score 0.0 and are excluded from results
Missing signals (not in ledger) produce 0.0 contribution, not errors
Missing signals for gates vacuously pass (do not exclude)
Signal snapshot populated with values for all profile-referenced signals, capped at 10
Empty candidate set returns empty results (no panic)
Criterion benchmarks implemented and passing:
- score_200_candidates_trending -- < 100us target (velocity-based profile)
- score_200_candidates_hot -- measured
- score_200_candidates_full_pipeline -- measured
- sort_200_candidates -- measured
- score_200_candidates_shuffle -- measured
- normalize_200_scores -- measured
Deterministic scoring verified: same inputs produce identical outputs (property test)
Controversial symmetry verified: f(a,b) == f(b,a) (property test)
Controversial maximum at balance: f(n,n) >= f(n+d,n) (property test)
Hot score decreases with age (property test)
Hidden gems score decreases with views (property test)
Normalized scores always in [0.0, 1.0] (property test, INV-RANK-2)
No unsafe code
cargo clippy -- -D warnings passes
All unit tests, property tests, and benchmarks pass
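The three closed-form sort formulas listed above are small enough to sketch directly. The following is a minimal, std-only illustration, not the final implementation: the signatures are inferred from how the property tests call these functions, and returning 0.0 from controversial_score when there are no votes at all is an assumption made here to avoid a 0/0 division (check Spec 09 Section 11.4 for the intended behavior).

```rust
/// Hot: log10(max(|positive - negative|, 1)) / (age_hours + 2)^gravity
pub fn hot_score(positive: u64, negative: u64, age_hours: f64, gravity: f64) -> f64 {
    // abs_diff avoids unsigned underflow when negative > positive.
    let net = positive.abs_diff(negative).max(1) as f64;
    net.log10() / (age_hours + 2.0).powf(gravity)
}

/// Controversial: (positive * negative) / (positive + negative)^2
pub fn controversial_score(positive: u64, negative: u64) -> f64 {
    let p = positive as f64;
    let n = negative as f64;
    let total = p + n;
    if total == 0.0 {
        // Assumption: an item with no votes is not controversial.
        return 0.0;
    }
    (p * n) / (total * total)
}

/// Hidden gems: quality_score * (1 / log10(view_count + 10))
pub fn hidden_gems_score(quality_score: f64, view_count: u64) -> f64 {
    // view_count + 10 >= 10, so the logarithm is always >= 1.
    quality_score / (view_count as f64 + 10.0).log10()
}
```

Converting to f64 before multiplying keeps controversial_score free of u64 overflow for large counts, and its maximum is 0.25, reached when positive equals negative.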

Research References

docs/research/tidaldb_signal_ledger.md -- Signal read latencies: hot-tier decay score ~15ns, windowed count ~200ns, velocity ~500ns. These latencies establish the per-candidate scoring budget.
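To make the budget concrete, the quoted read latencies can be turned into a back-of-the-envelope estimate. This is a sketch only: ReadMix and both functions are hypothetical helpers introduced for illustration, and the estimate counts signal reads alone (no formula evaluation, normalization, or sorting).

```rust
/// Approximate per-read latencies in nanoseconds, as quoted from the research note.
pub const DECAY_SCORE_NS: u64 = 15;
pub const WINDOWED_COUNT_NS: u64 = 200;
pub const VELOCITY_NS: u64 = 500;

/// Hypothetical description of the signal reads a profile issues per candidate.
pub struct ReadMix {
    pub decay_scores: u64,
    pub windowed_counts: u64,
    pub velocities: u64,
}

/// Estimated signal-read time for one candidate, in nanoseconds.
pub fn per_candidate_read_ns(mix: &ReadMix) -> u64 {
    mix.decay_scores * DECAY_SCORE_NS
        + mix.windowed_counts * WINDOWED_COUNT_NS
        + mix.velocities * VELOCITY_NS
}

/// Estimated signal-read time for a whole candidate set, in microseconds.
pub fn total_read_us(mix: &ReadMix, candidates: u64) -> f64 {
    (per_candidate_read_ns(mix) * candidates) as f64 / 1_000.0
}
```

Under these figures, a profile issuing two decay-score reads, one windowed count, and one velocity read costs about 730ns per candidate, or roughly 146us for 200 candidates. A single velocity read per candidate already comes to about 100us for 200 candidates, so a velocity-based profile has very little headroom against a 100us target.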

Spec References

docs/specs/09-ranking-scoring.md:
- Section 4: Scoring pipeline (9-stage transformation)
- Section 5: Boost application
- Section 6: Penalty application
- Section 7: Gate evaluation
- Section 8: Score composition and min-max normalization
- Section 11: Built-in sort mode formulas (Hot 11.1, Trending 11.2, Rising 11.3, Controversial 11.4, HiddenGems 11.5, Shuffle 11.6, Top 11.7, simple sorts 11.8)
- Section 15: Performance targets (total scoring pipeline < 500us for 200 candidates, per-candidate ~1.5us)
- Section 16: INV-RANK-1 deterministic scoring, INV-RANK-2 score non-negativity, INV-RANK-4 gate strictness

Implementation Notes

Add [[bench]] name = "ranking" harness = false to tidal/Cargo.toml.
Add rand = "0.9" to [dependencies] (not just dev-dependencies) because ShuffleExecutor is used in production code, not just tests. SmallRng comes from rand::rngs::SmallRng.
blake3 is already in Cargo.toml (used by the WAL for checksums), so no new dependency is needed for the shuffle seed computation.
The setup_test_ledger() helper constructs a SignalLedger with schema and populates signal state for testing. It reuses the schema construction patterns from m1p4 and m1p5 tests. The exact API depends on how SignalLedger::new() works in the current codebase -- adapt as needed.
The ProfileExecutor does NOT execute candidate generation. It receives pre-generated &[EntityId] and only performs scoring. Candidate generation is the RETRIEVE executor's job (m2p5).
The executor does NOT execute diversity enforcement. It returns the full scored, sorted, gate-filtered list. Diversity is applied by the diversity engine (m2p4) as a post-processing step.
Sort::New requires reading created_at from entity metadata. For M2, this can use a placeholder approach: either (a) the entity store provides created_at via a metadata read, or (b) the executor uses EntityId ordering as a proxy for creation order (valid for monotonic IDs). Document the chosen approach.
Sort::Trending uses share_velocity(6h), which requires the share signal type and a 6h window. If the Window enum (which has OneHour, TwentyFourHours, etc.) does not support a 6h window directly, use OneHour velocity, the closest available window, as a pragmatic substitute for M2. A true 6h aggregation window requires a custom bucketed counter configuration, which is deferred to M6.
Exclude rules are type stubs in M2 -- they require user state (m3p1). The executor skips Exclude evaluation entirely. They are present in the profile data structure for forward compatibility but have no effect on scoring until M3.
Do NOT implement percentile normalization for signal values (Spec 09 Section 8.3). For M2, raw signal values are used directly with the boost weight. Percentile normalization requires maintaining approximate percentile tables updated by the background materializer, which is an M6 optimization. The M2 approach works correctly for single-signal profiles and is acceptable for profiles with 2-3 signals of similar scale.
Do NOT implement recency decay (ProfileDecay) in this task for M2. Content age decay requires reading created_at metadata per candidate, which depends on the entity metadata read path. Implement it as a follow-up or in m2p5 when the full entity read path is available. The ProfileDecay field on the profile type is defined (Task 01) but not executed.
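Taken together, the M2 boost path described in these notes (raw signal value times boost weight, no percentile normalization, no recency decay, no Exclude evaluation) can be sketched as follows. This is an illustration under stated assumptions, not the real executor: Boost and Gate here are simplified stand-ins for the Task 01 types, score_candidates is a hypothetical free function, and signal values arrive in a HashMap instead of being read from the SignalLedger. Penalties and signal snapshots are omitted.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the real Boost type (assumption for illustration).
pub struct Boost {
    pub signal: &'static str,
    pub weight: f64,
}

/// Simplified stand-in for a minimum-value gate (assumption for illustration).
pub struct Gate {
    pub signal: &'static str,
    pub min: f64,
}

/// Min-max normalize in place; all-equal (or single) scores map to 0.5.
pub fn min_max_normalize(scores: &mut [f64]) {
    if scores.is_empty() {
        return;
    }
    let min = scores.iter().copied().fold(f64::INFINITY, f64::min);
    let max = scores.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let range = max - min;
    for s in scores.iter_mut() {
        *s = if range > 0.0 { (*s - min) / range } else { 0.5 };
    }
}

/// Gate-filter, score, normalize, and sort candidates by score descending.
pub fn score_candidates(
    candidates: &[(u64, HashMap<&'static str, f64>)],
    boosts: &[Boost],
    gates: &[Gate],
) -> Vec<(u64, f64)> {
    let mut scored: Vec<(u64, f64)> = Vec::new();
    for (id, signals) in candidates {
        let passes = gates.iter().all(|g| match signals.get(g.signal) {
            Some(v) => *v >= g.min,
            None => true, // missing signal: the gate vacuously passes
        });
        if !passes {
            continue; // gated candidates are excluded from the results
        }
        // Raw signal value times weight; a missing signal contributes 0.0.
        let raw: f64 = boosts
            .iter()
            .map(|b| b.weight * signals.get(b.signal).copied().unwrap_or(0.0))
            .sum();
        scored.push((*id, raw));
    }
    let mut values: Vec<f64> = scored.iter().map(|(_, s)| *s).collect();
    min_max_normalize(&mut values);
    for (entry, v) in scored.iter_mut().zip(values) {
        entry.1 = v;
    }
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

One consequence worth noting: after min-max normalization the lowest-scoring surviving candidate gets exactly 0.0, so a score of 0.0 in the results does not by itself mean a candidate failed a gate.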
