# Task 03: Cold Start and Exploration
## Context
**Milestone:** 3 -- Personalized Ranking
**Phase:** m3p3 -- Personalized Ranking Profiles
**Depends On:** Task 02 (Personalized Profiles: `for_you` profile with `exploration: 0.1`), Task 01 (`UserContext::is_cold_start`), m2p4 (diversity enforcement, `DiversitySelector`)
**Blocks:** m3p4 (User State Filters and UAT both need the complete `for_you` behavior)
**Complexity:** M
## Objective
Deliver cold-start handling for new users and new items, plus the exploration budget injection for the `for_you` profile. These mechanisms prevent the personalization system from collapsing into a filter bubble or failing silently for users/items with no history.
**Cold-start users**: A new user with no engagement history has no preference vector. The `for_you` profile falls back to population-level signals: trending, quality (completion rate), and recency. This ensures new users see a reasonable feed on their first visit.
**Cold-start items**: New items have no signal history and therefore no engagement data. Without intervention, they would never appear in personalized feeds, creating a chicken-and-egg problem. The exploration window gives new items a brief period (configurable, default 24h) during which they are eligible for inclusion in the exploration budget of `for_you` feeds.
**Exploration budget**: The `for_you` profile reserves 10% of its result set (e.g., 5 of 50 items) for exploration candidates: items from creators the user does NOT follow. This prevents the feed from becoming a closed loop of familiar content and exposes users to new creators.
## Requirements
### Cold-Start User Fallback
- When `UserContext::is_cold_start` is true, `for_you` scoring uses population-level signals only
- Fallback formula: `trending_velocity * 0.4 + completion_rate * 0.3 + recency * 0.3`
- No preference match (no vector to compare against)
- No creator affinity (no interaction history)
- Candidate strategy: full corpus scan ranked by the fallback formula above (not ANN, since there is no query vector)
- Diversity constraints still apply
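
As a worked instance of the fallback formula, the sketch below evaluates it for a single item. The helper name `fallback_score` and the input values are illustrative assumptions, and the inputs are assumed to be normalized to [0, 1] before the call.

```rust
/// Weighted population-level fallback:
/// 0.4 * trending + 0.3 * completion + 0.3 * recency.
/// Hypothetical helper for illustration; inputs are assumed to lie in [0, 1].
fn fallback_score(trending_velocity: f64, completion_rate: f64, recency: f64) -> f64 {
    trending_velocity * 0.4 + completion_rate * 0.3 + recency * 0.3
}
```

For an item with trending 0.5, completion 0.8, and recency 1.0 this gives `0.5 * 0.4 + 0.8 * 0.3 + 1.0 * 0.3 = 0.74`. Because the weights sum to 1.0, the result stays in [0, 1] whenever the inputs do.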
### Cold-Start Item Exploration Window
- `ExplorationWindow` struct: items created within the last N hours (default 24h) with fewer than M signals (default 10) are eligible for exploration
- Eligible items are added to an `exploration_pool` bitmap
- The exploration budget draws from this pool when filling exploration slots
- After the exploration window expires, items must earn organic engagement to appear
- The window is configurable via `ExplorationConfig`
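
The eligibility rule above can be sketched as a filter over item metadata. This is a minimal illustration, not the planned implementation: `ItemMeta` and `build_exploration_pool` are hypothetical names, and a `BTreeSet<u32>` stands in for the `exploration_pool` bitmap so the example needs no external crate.

```rust
use std::collections::BTreeSet;
use std::time::Duration;

/// Hypothetical per-item metadata; the real entity store is not shown in this task.
struct ItemMeta {
    id: u32,
    created_at_ns: u64,
    signal_count: u64,
}

/// Collect the IDs of items that are both new (created inside `window`)
/// and low-signal (fewer than `max_signals` signals).
fn build_exploration_pool(
    items: &[ItemMeta],
    now_ns: u64,
    window: Duration,
    max_signals: u64,
) -> BTreeSet<u32> {
    // Saturate instead of truncating if the window does not fit in u64 nanoseconds.
    let window_ns = u64::try_from(window.as_nanos()).unwrap_or(u64::MAX);
    items
        .iter()
        .filter(|item| now_ns.saturating_sub(item.created_at_ns) <= window_ns)
        .filter(|item| item.signal_count < max_signals)
        .map(|item| item.id)
        .collect()
}
```

With the defaults (24h, 10 signals), an item created one hour ago with no signals enters the pool, while a 48-hour-old item or a fresh item with 20 signals does not.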
### Exploration Budget
- The `for_you` profile has `exploration: 0.1` (10%)
- For `LIMIT 50`, 5 items come from the exploration budget
- Exploration candidates are items NOT from followed creators
- Selection from the exploration pool: random shuffle, then score by population signals
- Exploration candidates are placed at positions throughout the result set (interleaved), not clustered at the end
- The remaining 90% (45 items) come from the standard personalized scoring
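
The budget arithmetic can be sketched as follows. `budget_split` is a hypothetical helper, not part of the planned API; it rounds the exploration share up, matching the `ceil(limit * fraction)` rule in the Acceptance Criteria.

```rust
/// Split a result limit into (exploration slots, personalized slots).
/// Hypothetical helper: the exploration share is rounded up and can
/// never exceed the limit.
fn budget_split(limit: usize, exploration_fraction: f64) -> (usize, usize) {
    let fraction = exploration_fraction.clamp(0.0, 1.0);
    let exploration = (((limit as f64) * fraction).ceil() as usize).min(limit);
    (exploration, limit - exploration)
}
```

`budget_split(50, 0.1)` yields 5 exploration and 45 personalized slots. With a limit of 25 the share is 2.5, which rounds up to 3 exploration slots and leaves 22 personalized ones.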
## Technical Design
### Module Structure
```
tidal/src/
  ranking/
    exploration.rs   -- ExplorationConfig, ExplorationWindow, budget injection
```
### ExplorationConfig
```rust
// === ranking/exploration.rs ===
use std::time::Duration;

/// Configuration for cold-start and exploration behavior.
#[derive(Debug, Clone)]
pub struct ExplorationConfig {
    /// How long new items are eligible for exploration.
    /// Default: 24 hours.
    pub item_window: Duration,
    /// Signal-count threshold for "cold start" items.
    /// An item stays in the exploration pool only while it has fewer
    /// signals than this.
    /// Default: 10.
    pub max_signals_for_cold_item: u64,
    /// Fraction of results reserved for exploration.
    /// Default: 0.1 (10%).
    pub exploration_fraction: f64,
}

impl Default for ExplorationConfig {
    fn default() -> Self {
        Self {
            item_window: Duration::from_secs(24 * 3600),
            max_signals_for_cold_item: 10,
            exploration_fraction: 0.1,
        }
    }
}
```
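
A caller that wants different values can override individual fields with struct update syntax. The sketch below repeats the struct so it compiles on its own; `short_window_config` and the 6-hour window are arbitrary illustrations, not values taken from this task.

```rust
use std::time::Duration;

#[derive(Debug, Clone)]
pub struct ExplorationConfig {
    pub item_window: Duration,
    pub max_signals_for_cold_item: u64,
    pub exploration_fraction: f64,
}

impl Default for ExplorationConfig {
    fn default() -> Self {
        Self {
            item_window: Duration::from_secs(24 * 3600),
            max_signals_for_cold_item: 10,
            exploration_fraction: 0.1,
        }
    }
}

/// Hypothetical helper: a 6-hour item window, all other fields at their defaults.
pub fn short_window_config() -> ExplorationConfig {
    ExplorationConfig {
        item_window: Duration::from_secs(6 * 3600),
        ..ExplorationConfig::default()
    }
}
```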
### Cold-Start User Scoring
```rust
use crate::db::user_context::UserContext;
use crate::schema::{EntityId, Timestamp, Window};
use crate::signals::SignalLedger;
// `SignalAgg`, `read_agg`, and `recency_score` are assumed to be in scope;
// this task does not specify their module paths.

/// Score a candidate for a cold-start user.
///
/// Uses population-level signals only: trending velocity, completion rate,
/// and recency. No personalization factors.
pub fn cold_start_score(
    entity_id: EntityId,
    now: Timestamp,
    ledger: &SignalLedger,
    item_created_at: &dyn Fn(EntityId) -> Option<u64>,
) -> f64 {
    let view_vel = read_agg(
        entity_id, "view", &SignalAgg::Velocity, Window::TwentyFourHours, ledger,
    );
    let share_vel = read_agg(
        entity_id, "share", &SignalAgg::Velocity, Window::TwentyFourHours, ledger,
    );
    // Shares count double; clamp so the component stays in [0, 1].
    let trending = (view_vel + 2.0 * share_vel).clamp(0.0, 1.0);

    let completion = read_agg(
        entity_id, "completion", &SignalAgg::DecayScore, Window::AllTime, ledger,
    );
    let completion_rate = completion.clamp(0.0, 1.0);

    // Items with an unknown creation time get a neutral recency of 0.5.
    let recency = item_created_at(entity_id)
        .map_or(0.5, |ts| recency_score(ts, now));

    trending * 0.4 + completion_rate * 0.3 + recency * 0.3
}
```
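
`recency_score` is referenced above but not defined in this task. One plausible shape is an exponential decay with a fixed half-life. The sketch below is an assumption, not the project's definition: it takes both timestamps as nanosecond `u64` values (the real function receives a `Timestamp` for `now`), and the 24-hour half-life is an illustrative choice.

```rust
/// Assumed half-life of the decay, in nanoseconds (24 hours).
const HALF_LIFE_NS: f64 = 24.0 * 3600.0 * 1e9;

/// Hypothetical recency score: 1.0 for a brand-new item, halving every
/// `HALF_LIFE_NS` nanoseconds of age.
fn recency_score(created_at_ns: u64, now_ns: u64) -> f64 {
    // Items stamped in the future are treated as age zero.
    let age_ns = now_ns.saturating_sub(created_at_ns) as f64;
    0.5_f64.powf(age_ns / HALF_LIFE_NS)
}
```

Any function of this shape stays in (0, 1], which keeps `cold_start_score` inside the [0, 1] range required by the Acceptance Criteria.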
### Exploration Budget Injection
```rust
/// Inject exploration candidates into a personalized result set.
///
/// Takes the scored personalized results and injects exploration candidates
/// at regular intervals (interleaved, not appended).
///
/// # Parameters
///
/// - `personalized`: scored candidates from the personalized pipeline (sorted desc)
/// - `exploration_candidates`: candidates from unfollowed creators, scored by population signals
/// - `total_limit`: target result count (e.g., 50)
/// - `exploration_fraction`: fraction of results for exploration (e.g., 0.1)
///
/// # Returns
///
/// A merged result set with exploration candidates interleaved.
pub fn inject_exploration(
    personalized: &[ScoredCandidate],
    exploration_candidates: &[ScoredCandidate],
    total_limit: usize,
    exploration_fraction: f64,
) -> Vec<ScoredCandidate> {
    // Clamp the fraction so the exploration budget never exceeds the limit.
    let fraction = exploration_fraction.clamp(0.0, 1.0);
    let exploration_count = ((total_limit as f64) * fraction).ceil() as usize;
    let exploration_slice =
        &exploration_candidates[..exploration_count.min(exploration_candidates.len())];

    if exploration_slice.is_empty() {
        // No exploration candidates available: return personalized only.
        return personalized[..total_limit.min(personalized.len())].to_vec();
    }

    // Personalized items fill every slot the exploration budget does not use,
    // so a short exploration pool does not shrink the result set.
    let personalized_count = total_limit.saturating_sub(exploration_slice.len());
    let personalized_slice = &personalized[..personalized_count.min(personalized.len())];

    // Interleave: place one exploration candidate every `step` positions.
    // `step >= 1` because the slice is non-empty and no longer than `total_limit`.
    let step = total_limit / exploration_slice.len();
    let mut result = Vec::with_capacity(total_limit);
    let mut p_idx = 0;
    let mut e_idx = 0;
    for i in 0..total_limit {
        if e_idx < exploration_slice.len() && i > 0 && i % step == 0 {
            result.push(exploration_slice[e_idx].clone());
            e_idx += 1;
        } else if p_idx < personalized_slice.len() {
            result.push(personalized_slice[p_idx].clone());
            p_idx += 1;
        } else if e_idx < exploration_slice.len() {
            // Personalized items ran out: use remaining exploration items.
            result.push(exploration_slice[e_idx].clone());
            e_idx += 1;
        }
    }
    result
}

/// Select exploration candidates from the item corpus.
///
/// Exploration candidates are items NOT from followed creators.
/// They are scored by population-level signals and returned in
/// descending score order.
#[allow(clippy::too_many_arguments)]
pub fn select_exploration_candidates(
    all_item_ids: &[EntityId],
    user_ctx: &UserContext,
    // Not read yet: reserved for the cold-start boost described in the
    // Implementation Notes. The leading underscore avoids an unused warning.
    _exploration_config: &ExplorationConfig,
    now: Timestamp,
    ledger: &SignalLedger,
    item_creator_lookup: &dyn Fn(EntityId) -> Option<EntityId>,
    item_created_at: &dyn Fn(EntityId) -> Option<u64>,
    limit: usize,
) -> Vec<ScoredCandidate> {
    let mut candidates: Vec<ScoredCandidate> = all_item_ids
        .iter()
        .filter_map(|&item_id| {
            // Look the creator up once and reuse it for every check.
            let creator = item_creator_lookup(item_id);
            // Exclude items from followed or blocked creators.
            let excluded_creator = creator.is_some_and(|c| {
                let id = c.as_u64();
                user_ctx.followed_creators.contains(&id)
                    || user_ctx.blocked_creators.contains(&id)
            });
            // Exclude items the user has hidden.
            let hidden = user_ctx.hidden_items.contains(&(item_id.as_u64() as u32));
            if excluded_creator || hidden {
                return None;
            }
            Some(ScoredCandidate {
                entity_id: item_id,
                score: cold_start_score(item_id, now, ledger, item_created_at),
                signal_snapshot: vec![],
                creator_id: creator,
                format: None,
            })
        })
        .collect();

    // Sort by score descending. `total_cmp` gives a total order even for NaN.
    candidates.sort_by(|a, b| b.score.total_cmp(&a.score));
    candidates.truncate(limit);
    candidates
}

/// Check if an item is within the cold-start exploration window.
pub fn is_cold_start_item(
    item_id: EntityId,
    now: Timestamp,
    config: &ExplorationConfig,
    item_created_at: &dyn Fn(EntityId) -> Option<u64>,
    total_signal_count: &dyn Fn(EntityId) -> u64,
) -> bool {
    let now_ns = now.as_nanos();
    let window_ns = config.item_window.as_nanos() as u64;

    // Created within window. An unknown creation time maps to 0 (the epoch),
    // which falls outside any realistic window, so such items are not cold start.
    let created = item_created_at(item_id).unwrap_or(0);
    if now_ns.saturating_sub(created) > window_ns {
        return false;
    }

    // Below signal threshold.
    total_signal_count(item_id) < config.max_signals_for_cold_item
}
```
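
To see where exploration items actually land, the slot rule of `inject_exploration` can be replayed on plain counts. `exploration_positions` is a hypothetical helper written for this illustration; it assumes `1 <= n_exploration <= total_limit` and enough personalized items to fill the remaining slots.

```rust
/// Return the zero-based output positions that hold exploration items,
/// following the same slot rule as `inject_exploration`.
fn exploration_positions(total_limit: usize, n_exploration: usize) -> Vec<usize> {
    let step = total_limit / n_exploration;
    let n_personalized = total_limit - n_exploration;
    let (mut p_used, mut e_used) = (0, 0);
    let mut positions = Vec::with_capacity(n_exploration);
    for i in 0..total_limit {
        if e_used < n_exploration && i > 0 && i % step == 0 {
            // Regular exploration slot.
            positions.push(i);
            e_used += 1;
        } else if p_used < n_personalized {
            p_used += 1;
        } else if e_used < n_exploration {
            // Personalized items are exhausted: remaining slots go to exploration.
            positions.push(i);
            e_used += 1;
        }
    }
    positions
}
```

For `LIMIT 50` with 5 exploration items the positions are 10, 20, 30, 40, and 49. Position 0 is always personalized, so the last exploration item falls into the final slot; with a limit of 10 and a single exploration item, that item lands at position 9.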
## Test Strategy
### Unit Tests
```rust
#[test]
fn cold_start_score_uses_population_signals() {
    let ledger = test_ledger_with_signals();
    let entity = EntityId::new(1);
    let now = Timestamp::now();
    let score = cold_start_score(entity, now, &ledger, &|_| Some(now.as_nanos()));
    assert!(score > 0.0, "cold start score should be positive: {}", score);
    assert!(score <= 1.0, "cold start score should be <= 1.0: {}", score);
}

#[test]
fn inject_exploration_correct_count() {
    let personalized: Vec<ScoredCandidate> = (0..45)
        .map(|i| make_candidate(i + 1, (45 - i) as f64, Some(i as u64 + 1), None))
        .collect();
    let exploration: Vec<ScoredCandidate> = (0..5)
        .map(|i| make_candidate(i + 100, (5 - i) as f64, Some(i as u64 + 50), None))
        .collect();
    let result = inject_exploration(&personalized, &exploration, 50, 0.1);
    assert_eq!(result.len(), 50);
    // Count exploration items (IDs >= 100).
    let explore_count = result.iter().filter(|c| c.entity_id.as_u64() >= 100).count();
    assert_eq!(explore_count, 5, "should have 5 exploration items");
}

#[test]
fn inject_exploration_empty_exploration_pool() {
    let personalized: Vec<ScoredCandidate> = (0..50)
        .map(|i| make_candidate(i + 1, (50 - i) as f64, Some(1), None))
        .collect();
    let exploration: Vec<ScoredCandidate> = vec![];
    let result = inject_exploration(&personalized, &exploration, 50, 0.1);
    assert_eq!(result.len(), 50);
    // All items from personalized.
    assert!(result.iter().all(|c| c.entity_id.as_u64() <= 50));
}

#[test]
fn inject_exploration_interleaves() {
    let personalized: Vec<ScoredCandidate> = (0..45)
        .map(|i| make_candidate(i + 1, (45 - i) as f64, Some(1), None))
        .collect();
    let exploration: Vec<ScoredCandidate> = (0..5)
        .map(|i| make_candidate(i + 100, 1.0, Some(50), None))
        .collect();
    let result = inject_exploration(&personalized, &exploration, 50, 0.1);
    // Exploration items should be spread through the list, not all at the end.
    let first_half = &result[..25];
    let second_half = &result[25..];
    let explore_first = first_half.iter().filter(|c| c.entity_id.as_u64() >= 100).count();
    let explore_second = second_half.iter().filter(|c| c.entity_id.as_u64() >= 100).count();
    // At least one exploration item should be in each half.
    assert!(explore_first > 0 && explore_second > 0,
        "exploration items should appear in both halves: {} / {}",
        explore_first, explore_second);
}

#[test]
fn select_exploration_excludes_followed_creators() {
    let user_state = UserStateIndex::new();
    let user = EntityId::new(1);
    user_state.add_follow(user, EntityId::new(10));
    let iw_ledger = InteractionWeightLedger::new(InteractionWeightConfig::default());
    let ctx = UserContext::load(user, &user_state, &iw_ledger, &|_| None, Timestamp::now());
    let all_items: Vec<EntityId> = (1..=20).map(EntityId::new).collect();
    let creator_lookup = |id: EntityId| -> Option<EntityId> {
        // Items 1-10 from creator 10 (followed), 11-20 from creator 20 (not followed).
        if id.as_u64() <= 10 {
            Some(EntityId::new(10))
        } else {
            Some(EntityId::new(20))
        }
    };
    let ledger = empty_test_ledger();
    let config = ExplorationConfig::default();
    let now = Timestamp::now();
    let candidates = select_exploration_candidates(
        &all_items, &ctx, &config, now, &ledger,
        &creator_lookup, &|_| Some(now.as_nanos()), 10,
    );
    // Only items from creator 20 (unfollowed) should appear.
    assert!(candidates.iter().all(|c| c.creator_id == Some(EntityId::new(20))),
        "exploration should exclude followed creators");
}

#[test]
fn select_exploration_excludes_blocked_and_hidden() {
    let user_state = UserStateIndex::new();
    let user = EntityId::new(1);
    user_state.add_block(user, EntityId::new(30));
    // Hide an item whose creator is NOT blocked, so the hidden-item check
    // is exercised independently of the blocked-creator check.
    user_state.add_hide(user, EntityId::new(5));
    let iw_ledger = InteractionWeightLedger::new(InteractionWeightConfig::default());
    let ctx = UserContext::load(user, &user_state, &iw_ledger, &|_| None, Timestamp::now());
    let all_items: Vec<EntityId> = (1..=20).map(EntityId::new).collect();
    let creator_lookup = |id: EntityId| -> Option<EntityId> {
        if id.as_u64() <= 10 { Some(EntityId::new(20)) }
        else { Some(EntityId::new(30)) } // Items 11-20 from blocked creator
    };
    let ledger = empty_test_ledger();
    let config = ExplorationConfig::default();
    let now = Timestamp::now();
    let candidates = select_exploration_candidates(
        &all_items, &ctx, &config, now, &ledger,
        &creator_lookup, &|_| Some(now.as_nanos()), 20,
    );
    // Item 5 hidden, items 11-20 from blocked creator 30.
    assert!(candidates.iter().all(|c| c.entity_id.as_u64() != 5),
        "hidden items excluded");
    assert!(candidates.iter().all(|c| c.creator_id != Some(EntityId::new(30))),
        "blocked creator items excluded");
}

#[test]
fn is_cold_start_item_within_window() {
    let config = ExplorationConfig::default(); // 24h window
    let now = Timestamp::now();
    // Item created 1 hour ago with 0 signals: cold start.
    let one_hour_ago = now.as_nanos() - 3600 * 1_000_000_000;
    assert!(is_cold_start_item(
        EntityId::new(1), now, &config,
        &|_| Some(one_hour_ago), &|_| 0,
    ));
}

#[test]
fn is_cold_start_item_outside_window() {
    let config = ExplorationConfig::default();
    let now = Timestamp::now();
    // Item created 48 hours ago: outside 24h window.
    let forty_eight_hours_ago = now.as_nanos() - 48 * 3600 * 1_000_000_000;
    assert!(!is_cold_start_item(
        EntityId::new(1), now, &config,
        &|_| Some(forty_eight_hours_ago), &|_| 0,
    ));
}

#[test]
fn is_cold_start_item_too_many_signals() {
    let config = ExplorationConfig::default(); // max 10 signals
    let now = Timestamp::now();
    // Item created recently but has 20 signals: not cold start.
    assert!(!is_cold_start_item(
        EntityId::new(1), now, &config,
        &|_| Some(now.as_nanos()), &|_| 20,
    ));
}
```
### Property Tests
```rust
use proptest::prelude::*;

proptest! {
    #[test]
    fn inject_exploration_preserves_total_count(
        n_personalized in 0usize..100,
        n_exploration in 0usize..20,
        total_limit in 1usize..100,
        frac in 0.0f64..0.3,
    ) {
        let personalized: Vec<ScoredCandidate> = (0..n_personalized)
            .map(|i| make_candidate(i as u64 + 1, (n_personalized - i) as f64, Some(1), None))
            .collect();
        let exploration: Vec<ScoredCandidate> = (0..n_exploration)
            .map(|i| make_candidate(i as u64 + 1000, 1.0, Some(50), None))
            .collect();
        let result = inject_exploration(&personalized, &exploration, total_limit, frac);
        // The result can exceed neither the limit nor the number of available candidates.
        let expected_max = total_limit.min(n_personalized + n_exploration);
        prop_assert!(result.len() <= total_limit,
            "result {} > limit {}", result.len(), total_limit);
        prop_assert!(result.len() <= expected_max,
            "result {} > available {}", result.len(), expected_max);
    }

    #[test]
    fn cold_start_score_always_in_unit_range(
        entity_id in 1u64..1000,
    ) {
        let ledger = empty_test_ledger();
        let now = Timestamp::now();
        let score = cold_start_score(
            EntityId::new(entity_id), now, &ledger,
            &|_| Some(now.as_nanos()),
        );
        prop_assert!(score >= 0.0 && score <= 1.0,
            "cold start score out of range: {}", score);
    }
}
```
## Acceptance Criteria
- [ ] `ExplorationConfig` with configurable item window, signal threshold, and fraction
- [ ] `cold_start_score` returns population-level score in [0, 1]
- [ ] Cold-start users get `for_you` results ranked by trending + quality + recency
- [ ] `is_cold_start_item` correctly identifies new items within window and below signal threshold
- [ ] `inject_exploration` interleaves exploration candidates into personalized results
- [ ] Exploration count = `ceil(limit * fraction)` (e.g., 5 for limit=50, fraction=0.1)
- [ ] Exploration candidates exclude: followed creators, blocked creators, hidden items
- [ ] Exploration candidates are interleaved, not clustered at the end
- [ ] Empty exploration pool gracefully falls back to personalized-only results
- [ ] `select_exploration_candidates` returns items from unfollowed creators scored by population signals
- [ ] Property test: result count <= total_limit
- [ ] Property test: cold_start_score always in [0, 1]
- [ ] `cargo clippy -- -D warnings` passes
- [ ] All tests pass
## Research References
- [VISION.md](../../../../VISION.md) -- Exploration budget, cold-start handling
- [USE_CASES.md](../../../../USE_CASES.md) -- UC-01 (For You: exploration budget prevents filter bubbles)
- [thoughts.md](../../../../thoughts.md) -- Part V.16 (cold-start user defaults to population signals)
## Implementation Notes
- The exploration budget is enforced at the RETRIEVE executor level, after personalized scoring and before the final result assembly. The executor calls `select_exploration_candidates` to get exploration items, then `inject_exploration` to merge them with personalized results.
- For cold-start users, the candidate strategy switches from ANN (no query vector available) to full corpus scan sorted by `cold_start_score`. This is correct because without a preference vector, ANN retrieval has no meaningful query.
- The interleaving strategy is simple: place one exploration candidate every `total_limit / exploration_count` positions. This distributes exploration items evenly. More sophisticated interleaving (e.g., placing them at positions where the personalized score drops) is deferred to M6.
- `select_exploration_candidates` does a linear scan of all items. At M3 scale (10K items), this is fast. At larger scale, maintaining an exploration pool bitmap would be more efficient.
- The `ExplorationConfig` is stored as a field on `TidalDb`, initialized from defaults in `TidalDbBuilder::open()`. Custom values can be set via `builder.with_exploration_config(config)`.
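- Cold-start item detection (`is_cold_start_item`) is meant to be used during exploration candidate selection to prioritize genuinely new items: cold-start items get a small score boost within the exploration pool. All unfollowed items remain eligible for exploration, not just cold-start items. The `select_exploration_candidates` listing in this task does not apply the boost yet, because it has no signal-count lookup to pass to `is_cold_start_item`.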
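
The cold-start boost mentioned in the last note could take the form of a small additive bonus. The sketch below is one possible shape under stated assumptions: the task does not specify the boost size, so `COLD_START_BOOST` and its value of 0.05 are hypothetical, as is the helper name `boosted_exploration_score`.

```rust
/// Hypothetical additive boost for cold-start items inside the exploration
/// pool. The task does not specify the size; 0.05 is an assumed value.
const COLD_START_BOOST: f64 = 0.05;

/// Add the boost to cold-start items and keep the result in [0, 1].
fn boosted_exploration_score(population_score: f64, is_cold_start: bool) -> f64 {
    let boost = if is_cold_start { COLD_START_BOOST } else { 0.0 };
    (population_score + boost).clamp(0.0, 1.0)
}
```

Clamping after the addition keeps exploration scores in the same [0, 1] range as `cold_start_score`, so boosted and unboosted candidates remain directly comparable when sorted.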