# 12 -- Cold Start Specification
**Status:** Draft
**Authors:** tidalDB Engineering
**Date:** 2026-02-20
**Depends on:** [Entity Model](02-entity-model.md), [Signal System](03-signal-system.md), [Relationships](04-relationships.md), [Cohorts](05-cohorts.md), [Feedback Loop](10-feedback-loop.md), [Schema](11-schema.md)
**References:** [VISION.md](../../VISION.md) (Design Principles: "Cold start is handled by the database"), [USE_CASES.md](../../USE_CASES.md) (UC-01, UC-13), [API.md](../../API.md) (ProfileDef.exploration), [thoughts.md](../../thoughts.md) (Part III, Gap 5)
---
## Table of Contents
1. [Overview](#1-overview)
2. [Design Principles](#2-design-principles)
3. [Cold Start Lifecycle](#3-cold-start-lifecycle)
4. [New Item Cold Start](#4-new-item-cold-start)
5. [New User Cold Start](#5-new-user-cold-start)
6. [New Creator Cold Start](#6-new-creator-cold-start)
7. [Cold Start and Cohorts](#7-cold-start-and-cohorts)
8. [Graduation Metrics](#8-graduation-metrics)
9. [Cold Start Across Surfaces](#9-cold-start-across-surfaces)
10. [Edge Cases](#10-edge-cases)
11. [Configuration Reference](#11-configuration-reference)
12. [Performance Considerations](#12-performance-considerations)
13. [Invariants and Correctness Guarantees](#13-invariants-and-correctness-guarantees)
14. [Property Tests](#14-property-tests)
---
## 1. Overview
Cold start is the problem of ranking entities that have no signal history. It affects three entity types -- items, users, and creators -- and manifests at three scales: individual entity cold start (a new item enters the database), cohort cold start (a new user with no history arrives), and system cold start (a brand new database with no data at all).
In the traditional multi-system architecture, cold start is application logic. The application maintains fallback rules, special-cases new content injection, manages exploration budgets in Redis, and runs A/B tests on cold start strategies in a separate experimentation framework. This is exactly the kind of domain logic that tidalDB internalizes.
**Cold start is a database responsibility.** The application writes `db.write_item(...)`. The database decides how to rank that item when it has zero signals. The application writes `db.write_user(...)`. The database decides what to show that user when they have zero history. The application does not manage exploration budgets, quality estimation from metadata, or cohort-based priors. The database does.
### The Fundamental Tension
Cold start is a tension between exploitation and exploration:
- **Exploitation:** Show users content that the system is confident they will like. This maximizes short-term engagement but creates filter bubbles and starves new content of exposure.
- **Exploration:** Show users content the system knows nothing about. This enables discovery and gives new content a fair chance but risks showing low-quality content.
tidalDB resolves this tension with three mechanisms:
1. **Exploration budgets** -- a configurable percentage of results reserved for cold-start items, managed per ranking profile. Items in cold start are distributed evenly through the result set, not appended at the end.
2. **Proxy scoring** -- predicting item quality from creator history, category baselines, metadata completeness, embedding similarity, and freshness, before any engagement signals exist.
3. **Cohort-based priors** -- using cohort membership to provide warm-start behavior for new users, replacing the population-level default with a segment-level default.
### Integration Points
| Subsystem | Cold Start Integration |
|-----------|----------------------|
| [Signal System (03)](03-signal-system.md) | `all_time_count` counters provide graduation tracking. Hot-tier atomic counters enable O(1) state detection. |
| [Entity Model (02)](02-entity-model.md) | Entity lifecycle (Active/Archived/Deleted) gates cold start eligibility. Creator computed fields (`avg_item_quality`, `avg_engagement_rate`, `follower_count`) feed proxy scoring. |
| [Cohorts (05)](05-cohorts.md) | Cohort centroids provide preference vector initialization for new users. Three-layer trending model provides cohort-scoped content for cold user feeds. |
| [Feedback Loop (10)](10-feedback-loop.md) | Adaptive learning rate (`lr_max=0.10`, `lr_min=0.01`, `decay_k=0.003`) provides rapid adaptation during cold start. Preference vector update formula uses the same mechanism. |
| [Schema (11)](11-schema.md) | `ProfileDef.exploration` field controls per-profile exploration budget. Section 8 defines population priors and cold start configuration. |
---
## 2. Design Principles
**Cold start is a state, not a flag.** An entity's cold start status is a property of its signal ledger, not a flag the application manages. The database knows an entity is cold because its `all_time_count` is below the graduation threshold. It does not need to be told. There is no `mark_as_cold_start()` API.
**Exploration decays linearly as evidence accumulates.** A new item starts with maximum exploration weight. As signals accumulate, the weight decreases linearly toward zero. When enough signals exist for the ranking profile to score the item confidently, exploration weight reaches zero and the item competes on signals alone. There is no permanent "new item" status.
**Proxy scores are stopgaps, not ranking strategies.** Predicted quality from creator history, category baselines, metadata, and embeddings is used only until real signals exist. It is phased out linearly as real signals accumulate. Proxy scores never override strong real signals.
**Cohort priors replace population priors for new users.** A new user who provides locale, age range, and interests at signup should not see global trending. They should see cohort-scoped trending -- what is popular among users who look like them. Cohort priors are the bridge between "no history" and "personalized."
**The application does not manage cold start.** There is no `set_exploration_budget()` API. The database detects cold start conditions automatically from the signal ledger state and applies the exploration strategy declared in the ranking profile. The `ProfileDef.exploration` field is the single configuration knob.
**Every entity graduates or expires.** No item remains cold indefinitely. Either signals accumulate and the item graduates to signal-based ranking, or the exploration window expires and the item exits the exploration pool. Both outcomes are bounded by configurable thresholds.
---
## 3. Cold Start Lifecycle
### Entity Lifecycle Diagram
Every entity in tidalDB progresses through three cold start phases. The phase is determined by the entity's signal ledger, not by explicit flags.
```
                    ┌──────────────────┐
 write_item()       │  COLD START      │  signal_count = 0
 ────────────────>  │                  │  exploration_weight = 1.0
                    │  Score: 100%     │  Quality source: proxy scoring only
                    │  proxy           │
                    └────────┬─────────┘
                             │ first signal arrives
                    ┌────────▼─────────┐
                    │  ACCUMULATING    │  0 < signal_count < graduation_threshold
                    │                  │  exploration_weight = max(0, 1 - count/threshold)
                    │  Score: blended  │  Quality source: blended proxy + observed
                    │  proxy + signal  │
                    └────────┬─────────┘
                             │ signal_count >= graduation_threshold
                             │ OR dynamic graduation triggered
                    ┌────────▼─────────┐
                    │  GRADUATED       │  signal_count >= graduation_threshold
                    │                  │  exploration_weight = 0.0
                    │  Score: 100%     │  Quality source: observed signals only
                    │  signal-based    │
                    └──────────────────┘
```
### Phase Definitions
| Phase | Signal Count | Exploration Weight | Score Composition | Detection Cost |
|-------|-------------|-------------------|-------------------|---------------|
| Cold Start | 0 | 1.0 (maximum) | 100% proxy score | O(1) -- atomic counter read |
| Accumulating | 1 to `graduation_threshold - 1` | Linear decay toward 0 | Blended: `(1-ew) * signal_score + ew * proxy_score` | O(1) -- atomic counter read |
| Graduated | >= `graduation_threshold` | 0.0 | 100% signal-based score | O(1) -- atomic counter read |
### Exploration Weight Formula
The exploration weight decays linearly from 1.0 to 0.0 as signals accumulate:
```
exploration_weight = max(0, 1 - signal_count / graduation_threshold)
```
Where `graduation_threshold` is configurable per ranking profile (default: 100).
**Why linear, not sigmoid.** Linear decay is simpler, predictable, and debuggable. The exploration weight at 50 signals is exactly 0.5, not an opaque sigmoid output. The application developer can reason about the system: "my item has 30 signals out of 100, so 70% of its score comes from proxy estimation." Sigmoid introduces a parameter (`k`) that is difficult to tune and makes the relationship between signal count and exploration weight non-obvious.
### Blended Scoring Formula
During the Accumulating phase, an item's effective score is a linear blend:
```
score = exploration_weight * proxy_score + (1 - exploration_weight) * signal_score
```
Where:
- `proxy_score` is the quality estimate from Section 4.2
- `signal_score` is the score computed by the ranking profile's normal scoring pipeline
- `exploration_weight` decays linearly per the formula above
At Cold Start (0 signals): `score = 1.0 * proxy_score + 0.0 * signal_score = proxy_score`
At 50/100 signals: `score = 0.5 * proxy_score + 0.5 * signal_score`
At Graduated (100+ signals): `score = 0.0 * proxy_score + 1.0 * signal_score = signal_score`
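The weight and blend formulas above can be sketched as two pure functions. This is a minimal illustration rather than the database's implementation; the function names, the `f64` types, and the handling of a zero threshold are assumptions.

```rust
/// Linear exploration weight: 1.0 at zero signals, 0.0 at or beyond the threshold.
/// Assumption: a zero threshold is treated as "already graduated".
fn exploration_weight(signal_count: u64, graduation_threshold: u64) -> f64 {
    if graduation_threshold == 0 {
        return 0.0;
    }
    (1.0 - signal_count as f64 / graduation_threshold as f64).max(0.0)
}

/// Blend the proxy score and the signal score by the exploration weight.
fn blended_score(
    proxy_score: f64,
    signal_score: f64,
    signal_count: u64,
    graduation_threshold: u64,
) -> f64 {
    let ew = exploration_weight(signal_count, graduation_threshold);
    ew * proxy_score + (1.0 - ew) * signal_score
}
```

For an item with 30 of 100 signals, a proxy score of 0.8, and a signal score of 0.4, the blend is `0.7 * 0.8 + 0.3 * 0.4 = 0.68`.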
### Phase Detection
Phase detection is O(1). The `all_time_count` for the primary signal (typically `view`) is maintained as an atomic counter in the hot-tier signal state, as specified in Signal System Section 3.
```rust
/// Determine an item's cold start phase.
/// Cost: one atomic load. No scan, no disk read.
fn cold_start_phase(
    signal_ledger: &HotSignalState,
    graduation_threshold: u64,
) -> ColdStartPhase {
    let signal_count = signal_ledger.all_time_count("view");
    if signal_count == 0 {
        ColdStartPhase::ColdStart
    } else if signal_count < graduation_threshold {
        ColdStartPhase::Accumulating { signal_count }
    } else {
        ColdStartPhase::Graduated
    }
}
```
---
## 4. New Item Cold Start
### Problem Statement
A newly ingested item has zero signals. No views, no likes, no completions, no skips. The ranking function -- which relies on engagement velocity, decay scores, completion rate, and like ratio -- has nothing to work with. Without intervention, the item would score zero and never appear in any ranked result, creating a chicken-and-egg problem: the item cannot get engagement without exposure, and it cannot get exposure without engagement.
### Solution: Three Mechanisms
#### 4.1 Exploration Budget
Every ranking profile declares an exploration budget: the percentage of result slots reserved for cold-start items.
```rust
db.define_profile(ProfileDef {
    name: "for_you",
    // ... candidate, boosts, gates, diversity ...
    exploration: 0.10, // 10% of result slots reserved for exploration
})?;
```
The budget is applied after diversity enforcement, before pagination. For a query with `LIMIT 50` and `exploration: 0.10`, 5 result slots are reserved for exploration items. The remaining 45 slots are filled by the ranking profile's normal scoring pipeline.
**Budget bounds.** The exploration budget is clamped to `[0.0, 0.50]`. A budget above 50% would mean more exploration than ranked results, which defeats the purpose of ranking. A budget of 0.0 disables exploration entirely (used for surfaces like `trending` where cold items are ineligible by definition).
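The slot arithmetic and the clamp can be sketched as follows. The function name is hypothetical, and rounding the fractional slot count down is an assumption, since the spec does not state a rounding rule.

```rust
/// Number of result slots reserved for exploration.
/// The budget is clamped to [0.0, 0.50] as described above.
/// Assumption: a fractional slot count is rounded down.
fn exploration_slots(limit: usize, exploration_budget: f64) -> usize {
    let budget = exploration_budget.clamp(0.0, 0.50);
    (limit as f64 * budget).floor() as usize
}
```

With `LIMIT 50`, a budget of `0.10` yields 5 slots, and an out-of-range budget such as `0.90` is clamped to 25 slots.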
#### 4.2 Proxy Scoring
Before any engagement signals exist, the database estimates item quality from available metadata, the creator's track record, embedding similarity, and freshness. This proxy score determines which cold items are selected to fill the exploration budget and how they rank relative to each other.
```
proxy_score = weighted_sum(
    creator_quality_score      * 0.30,
    category_baseline_score    * 0.10,
    metadata_completeness      * 0.15,
    embedding_novelty_score    * 0.10,
    embedding_similarity_score * 0.25,
    freshness_score            * 0.10,
)
```
Each component:
**Creator Quality Score (weight: 0.30):**
The creator's track record is the strongest predictor of new item quality.
```rust
fn creator_quality_score(creator: &CreatorEntity) -> f64 {
    let avg_quality = creator.computed("avg_item_quality")
        .unwrap_or(0.5); // default for new creators
    let engagement_rate = creator.computed("avg_engagement_rate")
        .unwrap_or(0.03); // default
    let posting_freq = creator.computed("posting_frequency")
        .unwrap_or(1.0); // items per week

    let quality_norm = avg_quality.clamp(0.0, 1.0);
    let engagement_norm = (engagement_rate / 0.10).clamp(0.0, 1.0);
    let consistency_norm = (posting_freq / 7.0).clamp(0.0, 1.0);

    quality_norm * 0.50 + engagement_norm * 0.35 + consistency_norm * 0.15
}
```
For new creators (no `avg_item_quality`), the creator cohort comparison (Section 6) provides the baseline.
**Category Baseline Score (weight: 0.10):**
The average quality of recently published items in the same category.
```rust
fn category_baseline_score(category: &str, baselines: &CategoryBaselines) -> f64 {
    baselines.get(category)
        .map(|b| b.avg_quality_score)
        .unwrap_or(0.5) // neutral default for unknown categories
}
```
Category baselines are maintained by the background materializer as the mean quality score (completion rate * like ratio) of all items in the category published in the last 30 days with at least 100 views.
**Metadata Completeness Score (weight: 0.15):**
Items with complete metadata tend to be higher quality than items with sparse metadata.
```rust
fn metadata_completeness_score(item: &ItemEntity) -> f64 {
    let mut score = 0.0;
    // Title present and non-trivial (> 10 chars)
    if item.get("title").map(|t| t.len() > 10).unwrap_or(false) {
        score += 0.25;
    }
    // Description present and non-trivial (> 50 chars)
    if item.get("description").map(|d| d.len() > 50).unwrap_or(false) {
        score += 0.25;
    }
    // At least 2 tags
    if item.get_keywords("tags").map(|t| t.len() >= 2).unwrap_or(false) {
        score += 0.20;
    }
    // Category set
    if item.get("category").is_some() {
        score += 0.15;
    }
    // Has subtitles (accessibility = quality indicator)
    if item.get_bool("has_subtitles").unwrap_or(false) {
        score += 0.15;
    }
    score
}
```
**Embedding Novelty Score (weight: 0.10):**
Measures how different this item is from existing content. Items that fill gaps in the embedding space get a boost -- they provide genuine novelty rather than duplicating existing content.
```rust
fn embedding_novelty_score(
    _item_embedding: &[f32],        // unused here; the distance is precomputed
    nearest_neighbor_distance: f64, // from HNSW index
) -> f64 {
    // Higher distance = more novel. Mapped to [0, 1) by the saturating
    // curve 1 - exp(-3d), which is an exponential, not a sigmoid.
    // Items very close to existing content score low.
    // Items in underrepresented embedding regions score high.
    let novelty = 1.0 - (-3.0 * nearest_neighbor_distance).exp();
    novelty.clamp(0.0, 1.0)
}
```
**Embedding Similarity Score (weight: 0.25):**
How similar is this item's embedding to known high-quality items in the same category? This is the strongest content-based signal.
```rust
fn embedding_similarity_score(
    item_embedding: &[f32],
    category: &str,
    quality_centroids: &CategoryQualityCentroids,
) -> f64 {
    let centroid = quality_centroids.get(category);
    match centroid {
        Some(c) => {
            let similarity = cosine_similarity(item_embedding, c);
            (similarity + 1.0) / 2.0 // map [-1, 1] to [0, 1]
        }
        None => 0.5, // neutral default if no centroid computed yet
    }
}
```
**Category quality centroids** are computed by the background materializer as the weighted mean embedding of items in the category with `completion_rate > 0.7`, `like_ratio > 0.85`, published in the last 90 days, with at least 500 views.
**Freshness Score (weight: 0.10):**
More recent items receive a slight boost, ensuring newly published content is prioritized within the exploration pool.
```rust
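fn freshness_score(created_at: DateTime<Utc>, now: DateTime<Utc>) -> f64 {
    let age_hours = (now - created_at).num_hours() as f64;
    // Linear decay over 48 hours (the default exploration window).
    // Clamped so older items score 0 and a `created_at` slightly in the
    // future (clock skew) cannot push the score above 1.
    (1.0 - age_hours / 48.0).clamp(0.0, 1.0)
}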
```
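Putting the six components together, the weighted sum can be sketched as below. The `ProxyComponents` struct and the final clamp are assumptions made for illustration; the weights are the ones listed at the start of this section.

```rust
/// Component scores for a cold item, each expected to lie in [0, 1].
/// Hypothetical container type, used only for this sketch.
struct ProxyComponents {
    creator_quality: f64,
    category_baseline: f64,
    metadata_completeness: f64,
    embedding_novelty: f64,
    embedding_similarity: f64,
    freshness: f64,
}

/// Weighted sum of the components. The weights sum to 1.0, so the result
/// stays in [0, 1] when every component does; the clamp guards against
/// out-of-range inputs and floating-point drift.
fn proxy_score(c: &ProxyComponents) -> f64 {
    let score = c.creator_quality * 0.30
        + c.category_baseline * 0.10
        + c.metadata_completeness * 0.15
        + c.embedding_novelty * 0.10
        + c.embedding_similarity * 0.25
        + c.freshness * 0.10;
    score.clamp(0.0, 1.0)
}
```

Because creator quality and embedding similarity carry 55% of the weight between them, an item from a strong creator that resembles known high-quality content scores above 0.5 even with empty metadata.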
### Proxy Score Computation Timing
The proxy score is first computed at item ingestion (`write_item()`) and stored alongside the entity:
```
[entity_id][0x00][COLD:proxy_score] -> f32 (predicted quality)
[entity_id][0x00][COLD:created_at] -> u64 (creation timestamp)
```
The score is recomputed by the background materializer when:
- Creator's `avg_item_quality` is updated (daily)
- Category baselines change significantly (>20% relative change)
- The item accumulates signals (the blend ratio shifts)
#### 4.3 Exploration Distribution
Exploration items are distributed evenly through the result set, not clustered at the end. Placing all exploration items at positions 46-50 in a 50-item result means users who do not scroll past position 10 never see them, creating a systematic bias against new content.
**Exploration Distribution Algorithm:**
```
Given: LIMIT 50, exploration_count = 5
Exploration positions: 3, 12, 21, 30, 39
  (min_position = 3, spacing = (50 - 3) / 5 = 9, integer division)
Constraints:
  min_position >= 3 (never position 1 or 2 -- top slots are earned)
  spacing = max(3, (limit - min_position) / exploration_count)
  position[i] = min_position + i * spacing
```
```rust
fn exploration_positions(
    limit: usize,
    exploration_count: usize,
    min_position: usize,
) -> Vec<usize> {
    if exploration_count == 0 {
        return vec![];
    }
    let min_position = min_position.max(3); // never top 2
    let available = limit.saturating_sub(min_position);
    // Integer division; a spacing of at least 3 keeps exploration items apart.
    let spacing = (available / exploration_count).max(3);
    (0..exploration_count)
        .map(|i| min_position + i * spacing)
        // Drop positions past the end of the result set. Clamping them to
        // `limit` would assign several items to the same slot.
        .take_while(|&position| position <= limit)
        .collect()
}
```
**Rationale for min_position = 3.** Positions 1 and 2 are high-value real estate. Users judge the entire feed by the first two items. Inserting an unproven cold-start item there risks a poor first impression. Position 3 is the earliest safe insertion point -- the user has already seen two strong items.
**Rationale for even spacing.** With 5 exploration items in 50 slots, the formula yields a spacing of 9 (positions 3, 12, 21, 30, 39). Evenly spaced exploration items ensure that users who scroll to any depth encounter approximately the same density of new content. Clustering creates dead zones.
#### 4.4 Exploration Window
Cold items are exploration-eligible for a configurable duration after creation. The window defaults to 48 hours. After the window expires, the item must compete on signals alone -- it is no longer injected into exploration slots.
The window ensures that items which fail to attract any engagement during their exploration period are not perpetually given free exposure. Content that nobody engages with after 48 hours and hundreds of impressions is probably not interesting.
### Exploration Budget Mechanics Diagram
```
Query: RETRIEVE items FOR USER @u USING PROFILE for_you LIMIT 50

Step 1: Normal Ranking Pipeline
┌────────────────────────────────────────────┐
│ ANN retrieval (top 500 candidates)         │
│ Signal scoring (decay, velocity, gates)    │
│ Diversity enforcement (max 2/creator)      │
│ Top 45 results by score                    │
└─────────────────────┬──────────────────────┘
                      │
                      ▼
Step 2: Exploration Pool Selection (budget = 10% of 50 = 5 slots)
┌────────────────────────────────────────────┐
│ Select cold items from exploration pool:   │
│   - Created within last 48h                │
│   - signal_count < graduation_threshold    │
│   - Not already in top 45 results          │
│   - Not hidden/blocked for this user       │
│   - proxy_score > min_quality_floor (0.2)  │
│ Rank by proxy_score                        │
│ Take top 5                                 │
└─────────────────────┬──────────────────────┘
                      │
                      ▼
Step 3: Interleaving at Calculated Positions
┌────────────────────────────────────────────┐
│ Insert exploration items at positions:     │
│   3, 12, 21, 30, 39                        │
│                                            │
│ Result: [R R E R R R R R R R R E R ...]    │
│ R = ranked item, E = exploration item      │
└─────────────────────┬──────────────────────┘
                      │
                      ▼
Step 4: Impression Tracking
┌────────────────────────────────────────────┐
│ All returned items (including exploration) │
│ generate impression signals.               │
│                                            │
│ Exploration items MUST be tracked.         │
│ The feedback loop is how they accumulate   │
│ signals and graduate or get deprioritized. │
└────────────────────────────────────────────┘
```
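Step 3 of the diagram can be sketched as a single merge pass over the two lists. This is an illustrative helper, not the database's implementation: the name `interleave` is hypothetical, positions are 1-indexed, and `positions` is assumed to be strictly ascending, as produced by `exploration_positions`.

```rust
/// Merge ranked and exploration items, placing exploration items at the
/// given 1-indexed positions. If the exploration pool runs dry, the slot
/// is filled with the next ranked item instead.
fn interleave<T>(
    ranked: Vec<T>,
    exploration: Vec<T>,
    positions: &[usize],
    limit: usize,
) -> Vec<T> {
    let mut ranked = ranked.into_iter();
    let mut exploration = exploration.into_iter().peekable();
    let mut slots = positions.iter().copied().peekable();
    let mut out = Vec::with_capacity(limit);

    while out.len() < limit {
        let position = out.len() + 1; // 1-indexed position being filled
        let is_exploration_slot = slots.peek() == Some(&position);
        if is_exploration_slot {
            slots.next();
        }
        let next = if is_exploration_slot && exploration.peek().is_some() {
            exploration.next()
        } else {
            ranked.next()
        };
        match next {
            Some(item) => out.push(item),
            None => break, // ran out of ranked items
        }
    }
    out
}
```

The merge preserves the relative order of ranked items, so exploration never reorders content that earned its position through signals.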
### Exploration Pool Management
The exploration pool is the set of items eligible for exploration injection. It is maintained by the background materializer and cached in memory.
```
Exploration Pool:
  Items where:
    created_at > now() - exploration_window       (within 48h)
    AND signal_count < graduation_threshold       (not yet graduated)
    AND status = "published"                      (active)
    AND proxy_score > min_quality_floor (0.2)     (minimum quality)
  Sorted by: proxy_score DESC
  Size: typically 1,000 to 50,000 items
  Refresh: every 5 minutes (background materializer)
  Memory: ~50 bytes per item * 50K = ~2.5 MB
```
Items exit the exploration pool when:
1. They accumulate enough signals to graduate (`signal_count >= graduation_threshold`)
2. They exceed the exploration window age (48h)
3. They are archived or deleted
4. Dynamic graduation triggers early promotion (Section 8.2)
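The membership conditions can be sketched as a single predicate. The `PoolCandidate` and `PoolConfig` types are hypothetical, the defaults are the ones quoted in this section, and dynamic graduation (Section 8.2) is not modeled here.

```rust
use std::time::Duration;

/// Minimal view of an item for pool membership checks (hypothetical type).
struct PoolCandidate {
    age: Duration,     // time elapsed since created_at
    signal_count: u64, // all_time_count of the primary signal
    is_active: bool,   // published, not archived or deleted
    proxy_score: f64,
}

/// Pool parameters (hypothetical type) with this section's defaults.
struct PoolConfig {
    exploration_window: Duration,
    graduation_threshold: u64,
    min_quality_floor: f64,
}

impl Default for PoolConfig {
    fn default() -> Self {
        Self {
            exploration_window: Duration::from_secs(48 * 3600),
            graduation_threshold: 100,
            min_quality_floor: 0.2,
        }
    }
}

/// True while the item belongs in the exploration pool.
fn is_pool_eligible(item: &PoolCandidate, cfg: &PoolConfig) -> bool {
    item.age < cfg.exploration_window
        && item.signal_count < cfg.graduation_threshold
        && item.is_active
        && item.proxy_score > cfg.min_quality_floor
}
```

Each exit condition in the list above corresponds to one clause of the predicate turning false.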
---
## 5. New User Cold Start
### Problem Statement
A new user has no preference vector, no engagement history, no relationship graph. The personalized ranking profile -- which depends on ANN retrieval from the user's preference vector, interaction weights with creators, and seen/unseen state -- has nothing to work with. Without intervention, the For You feed would either be empty or fall back to global popularity, which is rarely a good first impression.
### Solution: Three-Stage Onboarding
#### 5.1 Preference Vector Initialization
When a new user is created, their preference vector must be initialized to something meaningful. The initialization follows a hierarchy, using the best available prior:
```
User created via db.write_user(...)
┌─────────────────────────────────────────┐
│ STEP 1: Check explicit_interests        │
│                                         │
│ Does the user have explicit_interests?  │
│ ["jazz", "cooking", "rust"]             │
└─────────────┬───────────────────────────┘
         ┌────┴────┐
         │         │
        YES        NO
         │         │
         ▼         ▼
  ┌────────────┐  ┌─────────────────────────────┐
  │ Centroid   │  │ STEP 2: Check cohort        │
  │ of interest│  │                             │
  │ embeddings │  │ Can the user be placed in   │
  │            │  │ a demographic cohort?       │
  │ Lookup     │  │ (locale, age_range present) │
  │ embedding  │  └──────┬──────────────────────┘
  │ for each   │         │
  │ interest   │    ┌────┴────┐
  │ keyword,   │    │         │
  │ compute    │   YES        NO
  │ mean       │    │         │
  └────┬───────┘    ▼         ▼
       │    ┌────────────┐  ┌────────────┐
       │    │ Cohort     │  │ Population │
       │    │ centroid   │  │ centroid   │
       │    │            │  │            │
       │    │ Mean pref  │  │ Mean pref  │
       │    │ vector of  │  │ vector of  │
       │    │ cohort     │  │ ALL users  │
       │    │ users with │  │ with 100+  │
       │    │ 100+       │  │ signals    │
       │    │ signals    │  │            │
       │    └────┬───────┘  └─────┬──────┘
       │         │                │
       └────┬────┘                │
            │                     │
            ▼                     │
  ┌────────────────────┐          │
  │ Shift toward       │          │
  │ cohort centroid    │◄─────────┘
  │ (if available)     │
  └────────┬───────────┘
  ┌────────▼───────────┐
  │ Normalize to       │
  │ unit length        │
  │                    │
  │ Insert into HNSW   │
  └────────────────────┘
```
**Priority hierarchy:**
1. **Explicit interests provided** -- compute centroid of interest embeddings, shift toward cohort centroid if available
2. **Demographic cohort available** -- use cohort centroid (mean preference vector of cohort users with 100+ signals)
3. **Neither available** -- use population centroid (mean preference vector of all users with 100+ signals)
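The priority hierarchy can be sketched as a small selection function over optional priors. All names here are hypothetical. In particular, `cohort_shift` (how far interest centroids move toward the cohort centroid) is an assumed parameter: the spec does not give a value for the shift.

```rust
/// Element-wise mean of equally sized vectors; `None` if the input is empty.
fn mean_vector(vectors: &[Vec<f32>]) -> Option<Vec<f32>> {
    let first = vectors.first()?;
    let mut mean = vec![0.0f32; first.len()];
    for v in vectors {
        for (m, x) in mean.iter_mut().zip(v) {
            *m += x;
        }
    }
    let n = vectors.len() as f32;
    mean.iter_mut().for_each(|m| *m /= n);
    Some(mean)
}

/// Scale a vector to unit length; a zero vector is returned unchanged.
fn normalize(mut v: Vec<f32>) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        v.iter_mut().for_each(|x| *x /= norm);
    }
    v
}

/// Pick the best available prior, then normalize it for HNSW insertion.
fn initial_preference_vector(
    interest_embeddings: &[Vec<f32>],
    cohort_centroid: Option<&[f32]>,
    population_centroid: &[f32],
    cohort_shift: f32, // assumed blend factor in [0, 1]
) -> Vec<f32> {
    let base: Vec<f32> = match (mean_vector(interest_embeddings), cohort_centroid) {
        // 1. Explicit interests, shifted toward the cohort centroid if present.
        (Some(interests), Some(cohort)) => interests
            .iter()
            .zip(cohort)
            .map(|(i, c)| (1.0 - cohort_shift) * i + cohort_shift * c)
            .collect(),
        (Some(interests), None) => interests,
        // 2. No interests: use the cohort centroid.
        (None, Some(cohort)) => cohort.to_vec(),
        // 3. Neither: fall back to the population centroid.
        (None, None) => population_centroid.to_vec(),
    };
    normalize(base)
}
```

A linear interpolation is only one way to realize the "shift toward cohort centroid" step; it is used here because it keeps the result between the two priors.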
#### 5.2 Early Personalization (Rapid Learning)
During the user's first signals, the adaptive learning rate is at its maximum (`lr_max = 0.10`). This means each signal moves the preference vector significantly:
```
lr = lr_min + (lr_max - lr_min) * exp(-decay_k * signal_count)
Where:
  lr_max = 0.10 (10% shift per signal at start)
  lr_min = 0.01 (1% shift per signal at maturity)
  decay_k = 0.003 (lr is within 0.001 of the floor at ~1500 signals)
```
| Signal Count | Learning Rate | Effect |
|-------------|---------------|--------|
| 0 | 0.100 | Each like moves preference vector ~10% toward item |
| 5 | 0.099 | Strong directional preference forming |
| 20 | 0.095 | Meaningfully different from initial centroid |
| 50 | 0.087 | Clear multi-interest profile emerging |
| 100 | 0.077 | Well-defined preferences |
| 500 | 0.030 | Stable but still responsive |
| 1000 | 0.014 | Near-stable |
| 1500+ | 0.011 | Effectively at floor -- stable |
These values match the Feedback Loop spec, Section 3. Cold start does not introduce different learning rates -- it relies on the adaptive learning rate mechanism that is naturally highest for new users.
**What "rapid learning" means in practice:** At `lr_max = 0.10` with a like (weight 1.0), 5 likes in the same category establish a strong directional preference. 10 likes across two categories establish a multi-interest profile. By 20 signals, the preference vector is meaningfully different from the initial centroid.
#### 5.3 Cold User Feed Strategy
New users receive two feed modifications:
**Elevated exploration budget.** New users get an exploration rate of `profile_exploration + new_user_exploration_boost` (default: `0.10 + 0.20 = 0.30`, i.e., 30% of results are exploration items). This decays linearly to the profile default as signals accumulate:
```
effective_exploration = profile_exploration
    + new_user_exploration_boost * max(0, 1 - signal_count / user_graduation_threshold)
Where:
  profile_exploration = 0.10 (from ProfileDef)
  new_user_exploration_boost = 0.20 (default)
  user_graduation_threshold = 50 (default)
```
| Signal Count | Boost | Effective Rate |
|-------------|-------|----------------|
| 0 | 0.20 | 0.30 (30%) |
| 10 | 0.16 | 0.26 |
| 25 | 0.10 | 0.20 |
| 50 | 0.00 | 0.10 (profile default) |
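The decay in the table can be reproduced with a short function. This is a sketch under the stated defaults; the function name and the zero-threshold handling are assumptions.

```rust
/// Effective exploration rate for a user with `signal_count` signals.
/// The new-user boost decays linearly to zero at the graduation threshold.
/// Assumption: a zero threshold means no boost is applied.
fn effective_exploration(
    profile_exploration: f64,
    new_user_exploration_boost: f64,
    signal_count: u64,
    user_graduation_threshold: u64,
) -> f64 {
    let remaining = if user_graduation_threshold == 0 {
        0.0
    } else {
        (1.0 - signal_count as f64 / user_graduation_threshold as f64).max(0.0)
    };
    profile_exploration + new_user_exploration_boost * remaining
}
```

With the defaults (`0.10`, `0.20`, threshold `50`), a user with 10 signals gets `0.10 + 0.20 * 0.8 = 0.26`, matching the table.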
**Cohort-to-personal transition.** As the user accumulates signals, candidate generation transitions from cohort-driven to preference-driven:
```
personal_weight = min(1.0, signal_count / cohort_blend_threshold)
cohort_weight = 1.0 - personal_weight
candidates = merge(
    cohort_trending(user_cohort, top_k * cohort_weight),
    ann_retrieval(user_preference, top_k * personal_weight),
)
Where cohort_blend_threshold = 50 (default)
```
| Signal Count | Cohort Weight | Personal Weight | Behavior |
|-------------|---------------|-----------------|----------|
| 0 | 1.00 | 0.00 | Entirely cohort-driven |
| 10 | 0.80 | 0.20 | Mostly cohort, some personal |
| 25 | 0.50 | 0.50 | Equal blend |
| 50 | 0.00 | 1.00 | Entirely personal |
| 100+ | 0.00 | 1.00 | Fully personalized |
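As a sketch of how the weights translate into candidate counts, the split of `top_k` between the two sources might look like this. The function name and the rounding to whole candidates are assumptions.

```rust
/// Split a candidate budget of `top_k` between cohort trending and
/// personal ANN retrieval. Returns (cohort_candidates, personal_candidates).
/// Assumption: the personal share is rounded to the nearest whole candidate
/// and the cohort source receives the remainder.
fn candidate_split(
    top_k: usize,
    signal_count: u64,
    cohort_blend_threshold: u64,
) -> (usize, usize) {
    let personal_weight = if cohort_blend_threshold == 0 {
        1.0
    } else {
        (signal_count as f64 / cohort_blend_threshold as f64).min(1.0)
    };
    let personal = (top_k as f64 * personal_weight).round() as usize;
    (top_k - personal, personal)
}
```

For `top_k = 500`, a user with 10 signals draws 400 candidates from cohort trending and 100 from ANN retrieval.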
```
Cold user For You feed composition evolution:
  Signal Count 0:
    Cohort-trending items: 70% (trending among users in same cohort)
    Exploration items:     30% (quality-weighted, diverse creators)
    Personal signal items:  0% (no history yet)
  Signal Count 25:
    Cohort-trending items: 40% (half of the non-exploration share)
    Exploration items:     20% (declining from 30%)
    Personal signal items: 40% (ANN from preference vector)
  Signal Count 50+:
    Cohort-trending items:  0% (transition complete)
    Exploration items:     10% (profile default)
    Personal signal items: 90% (fully personalized)
```
---
## 6. New Creator Cold Start
### Problem Statement
A new creator has no followers, no engagement baseline, no catalog embedding. Their items receive no social proof boost (nobody follows them), no interaction weight boost (nobody has engaged with them before), and no collaborative filtering signal (no overlap with other creators' audiences). Their first content is doubly cold: the item is cold AND the creator is cold.
### Solution: Four Mechanisms
#### 6.1 Discovery Boost
New creators receive an additional exploration budget boost on top of the standard item exploration budget. This boost is applied to items by creators whose `total_items` computed field is below a threshold.
```rust
fn creator_discovery_boost(creator: &CreatorEntity) -> f64 {
    let item_count = creator.computed("total_items").unwrap_or(0);
    let follower_count = creator.computed("follower_count").unwrap_or(0);
    if item_count <= NEW_CREATOR_ITEM_THRESHOLD // default: 5
        && follower_count <= NEW_CREATOR_FOLLOWER_THRESHOLD // default: 100
    {
        CREATOR_DISCOVERY_MULTIPLIER // default: 1.5
    } else {
        1.0
    }
}
```
The discovery boost means a new creator's item gets `10% * 1.5 = 15%` exploration budget instead of the standard 10%.
#### 6.2 Provisional Creator Signals
A new creator's signal data is statistically unreliable. Their `avg_item_quality` and `avg_engagement_rate` computed fields are based on too few data points. To prevent a single viral or flopped item from permanently defining a creator's quality estimate, creator-level signals are weighted at 50% until the creator has at least 5 graduated items.
```rust
fn creator_signal_confidence(creator: &CreatorEntity) -> f64 {
    let graduated_items = creator.computed("graduated_item_count")
        .unwrap_or(0);
    if graduated_items < CREATOR_MATURITY_THRESHOLD { // default: 5
        PROVISIONAL_SIGNAL_WEIGHT // default: 0.5
    } else {
        1.0
    }
}
```
When computing the creator quality component of an item's proxy score (Section 4.2), the creator score is multiplied by this confidence factor, and the remainder is filled by the category baseline:
```
adjusted_creator_score = creator_quality_score * creator_signal_confidence
                       + category_baseline * (1.0 - creator_signal_confidence)
```
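A direct transcription of this blend, as a sketch: the function name is hypothetical, and the clamp on the confidence factor is an added safeguard rather than something the spec requires.

```rust
/// Blend the creator's own quality score with the category baseline.
/// `confidence` is the output of `creator_signal_confidence`
/// (0.5 for provisional creators, 1.0 for mature ones).
fn adjusted_creator_score(
    creator_quality_score: f64,
    category_baseline: f64,
    confidence: f64,
) -> f64 {
    let confidence = confidence.clamp(0.0, 1.0);
    creator_quality_score * confidence + category_baseline * (1.0 - confidence)
}
```

A provisional creator scoring 0.9 in a category with baseline 0.5 is treated as 0.7 until five of their items have graduated.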
#### 6.3 Creator Cohort Comparison
Even without engagement history, a new creator has metadata: categories, tags, language, region. The quality estimation system compares new creators to established creators with similar metadata to establish baseline expectations.
```
creator_prior_quality = weighted_mean(
    quality_scores_of_similar_creators,
    weights = similarity_to_new_creator
)
where similar_creators = creators in same category AND region
                         with > 1000 total item views
                         sorted by tag overlap
                         top 20
```
This creator prior is used as the `category_baseline` fallback when the creator has no `avg_item_quality`.
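The weighted mean itself can be sketched as follows, assuming the similar creators have already been selected (same category and region, more than 1000 views, top 20 by tag overlap). The function name, the tuple layout, and the `fallback` parameter are assumptions for illustration.

```rust
/// Similarity-weighted mean quality of established creators.
/// Each entry is a (quality_score, similarity_to_new_creator) pair.
/// Assumption: `fallback` is returned when no comparable creators exist
/// or all similarities are zero.
fn creator_prior_quality(similar_creators: &[(f64, f64)], fallback: f64) -> f64 {
    let total_weight: f64 = similar_creators.iter().map(|(_, w)| w).sum();
    if total_weight <= 0.0 {
        return fallback;
    }
    let weighted: f64 = similar_creators.iter().map(|(q, w)| q * w).sum();
    weighted / total_weight
}
```

Creators with higher tag overlap pull the prior toward their own quality score, so a closely matching peer counts for more than a loosely related one.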
#### 6.4 First-Item Boost
A creator's very first published item receives extra exploration budget regardless of the creator's other signals. This ensures that every creator has at least one chance to be seen.
```rust
fn first_item_boost(creator: &CreatorEntity) -> f64 {
    let creator_item_count = creator.computed("total_items").unwrap_or(0);
    if creator_item_count <= 1 {
        FIRST_ITEM_BOOST_MULTIPLIER // default: 2.0
    } else {
        1.0
    }
}
```
A creator's first item gets `10% * 2.0 = 20%` exploration budget. Combined with the creator discovery boost: `10% * 1.5 * 2.0 = 30%` total exploration budget for a new creator's first item. This is the maximum exploration commitment the system makes.
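The combined arithmetic can be sketched with plain counts instead of a `CreatorEntity`. The function name is hypothetical, and composing the two multipliers by multiplication is taken from the 30% example above.

```rust
const NEW_CREATOR_ITEM_THRESHOLD: u64 = 5;
const NEW_CREATOR_FOLLOWER_THRESHOLD: u64 = 100;
const CREATOR_DISCOVERY_MULTIPLIER: f64 = 1.5;
const FIRST_ITEM_BOOST_MULTIPLIER: f64 = 2.0;

/// Exploration budget for an item after both creator boosts are applied.
fn boosted_exploration_budget(
    base_budget: f64,
    total_items: u64,
    follower_count: u64,
) -> f64 {
    let discovery = if total_items <= NEW_CREATOR_ITEM_THRESHOLD
        && follower_count <= NEW_CREATOR_FOLLOWER_THRESHOLD
    {
        CREATOR_DISCOVERY_MULTIPLIER
    } else {
        1.0
    };
    let first_item = if total_items <= 1 {
        FIRST_ITEM_BOOST_MULTIPLIER
    } else {
        1.0
    };
    base_budget * discovery * first_item
}
```

A creator with one item and no followers receives both boosts (0.30), while a creator with three items and 50 followers receives only the discovery boost (0.15).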
---
## 7. Cold Start and Cohorts
### Cohort-Based Priors for New Users
This is the critical capability enabled by the cohort system. When a new user is created with demographic attributes, they are immediately placed in matching cohorts. Instead of showing global trending (which skews toward majority demographics), the user sees cohort-scoped trending.
```
New user signs up:
  locale: "ja-JP"
  age_range: "18-24"
  explicit_interests: ["anime", "music"]
Immediate cohort resolution:
  region:JP        --> bitmap A
  age_range:18-24  --> bitmap B
  interest:anime   --> bitmap C
  interest:music   --> bitmap D
Primary cohort:  A AND B --> "young Japanese users"
Interest cohort: A AND C --> "Japanese anime fans"
Interest cohort: A AND D --> "Japanese music fans"
```
**Why cohort priors matter:** A 22-year-old user in Tokyo gets Japanese music, anime, and locally relevant content in their first session. A 45-year-old user in Texas gets country music, cooking shows, and locally relevant content. Neither sees the globally dominant content (typically English-language pop culture) unless it also happens to be trending in their cohort.
### Cohort Centroid Computation
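The cohort centroid is the mean preference vector of all users in the cohort who have at least 100 signals. In this section, "graduated" refers to this 100-signal bar (`min_signals_for_centroid`), which is stricter than the 50-signal user graduation threshold in Section 8.4. Users below 100 signals are excluded so that vectors still dominated by their initial (non-personalized) state do not dilute the centroid.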
```rust
fn compute_cohort_centroid(
    cohort_members: &[UserId],
    min_signal_count: u64, // default: 100
) -> Option<Vec<f32>> {
    let graduated_members: Vec<_> = cohort_members
        .iter()
        .filter(|u| signal_count(*u) >= min_signal_count)
        .collect();
    if graduated_members.len() < MIN_COHORT_SIZE_FOR_CENTROID { // default: 50
        return None; // not enough data -- fall back to population centroid
    }
    Some(mean_embedding(graduated_members.iter().map(|u| preference_vector(*u))))
}
```
**Minimum cohort size.** A cohort needs at least 50 graduated users (configurable) before its centroid is considered reliable. Below this threshold, the system falls back to the population centroid. This prevents small, possibly unrepresentative cohorts from creating misleading priors.
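The `mean_embedding` step referenced in the centroid computation is an element-wise average. A minimal sketch follows; the slice-based signature and the `Option` return are assumptions for illustration (the snippet above passes an iterator), and accumulating in `f64` is a choice made here to limit rounding drift over large cohorts.

```rust
/// Element-wise mean of equal-length preference vectors.
/// Returns None for an empty input or mismatched dimensions.
/// Assumption: signature is illustrative, not the engine's actual API.
fn mean_embedding(vectors: &[Vec<f32>]) -> Option<Vec<f32>> {
    let dim = vectors.first()?.len();
    if vectors.iter().any(|v| v.len() != dim) {
        return None;
    }
    // Accumulate in f64 so many small f32 additions lose less precision.
    let mut sums = vec![0.0f64; dim];
    for v in vectors {
        for (s, x) in sums.iter_mut().zip(v) {
            *s += f64::from(*x);
        }
    }
    let n = vectors.len() as f64;
    Some(sums.into_iter().map(|s| (s / n) as f32).collect())
}
```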
### Cohort-Scoped Trending for Cold Users
The three-layer trending model from the Cohorts spec (Section 6) directly serves cold user needs:
| Layer | What It Shows | When Used |
|-------|--------------|-----------|
| Global trending | What is popular with everyone | Fallback when no cohort available |
| Cohort-scoped trending | What is popular among users like this one | Primary feed for cold users with cohort data |
| Personal trending | What is popular among this user's followed creators | After user has follows and 50+ signals |
For a cold user with cohort data, the feed is composed primarily of cohort-scoped trending, supplemented by exploration items. This is the "zero query" experience -- the first feed the user sees without having done anything.
---
## 8. Graduation Metrics
### 8.1 Standard Graduation
An item graduates from cold start when its `all_time_count` for the primary signal (`view`) reaches the `graduation_threshold` (default: 100). At graduation:
1. `exploration_weight` drops to 0.0
2. The item exits the exploration pool
3. The item competes in the normal ranking pipeline on signals alone
4. The blended scoring formula produces `score = signal_score` (no proxy component)
Graduation is detected at query time via an O(1) atomic counter read. There is no explicit "graduation event" -- the item simply stops qualifying for exploration on its next query.
### 8.2 Dynamic Graduation for Viral Items
Items that accumulate signals at an exceptional rate should graduate early. Keeping a viral item in the exploration pool is wasteful -- it has proven quality and does not need exploration slots.
```
dynamic_threshold = min(
graduation_threshold,
max(10, engagement_velocity / baseline_velocity * 10)
)
Where:
engagement_velocity = view.velocity(1h) for this item
baseline_velocity = median view velocity (1h) across all items
in the same category with GRADUATED status
```
When `signal_count >= dynamic_threshold`, the item graduates immediately.
**Example:** Category baseline velocity is 50 views/hour. An item receives 500 views in its first hour (10x baseline). Dynamic threshold = `min(100, max(10, 500/50 * 10))` = `min(100, 100)` = `100`. The dynamic threshold equals the standard threshold, so there is no early graduation.
At 2,000 views/hour (40x baseline): `min(100, max(10, 2000/50 * 10))` = `min(100, 400)` = `100`. At 5,000 views/hour (100x baseline): `min(100, max(10, 5000/50 * 10))` = `min(100, 1000)` = `100`. In both cases the result is capped by `graduation_threshold`.
The formula therefore never lets a fast item graduate early: at 10x baseline or above it always returns `graduation_threshold`. It only drops below 100 when velocity is under 10x baseline (for example, 5x baseline gives `min(100, 50)` = `50`), which lowers the bar for slower items rather than faster ones. This is the opposite of the intent, so the formula is replaced by the breakout check below.
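Evaluating the `dynamic_threshold` formula directly makes the capping behavior easy to check. The function below is a direct transcription of the formula as written; the name and the integer truncation of the result are assumptions for illustration.

```rust
/// Direct transcription of:
///   min(graduation_threshold, max(10, engagement_velocity / baseline_velocity * 10))
/// Assumption: the result is truncated to a whole signal count.
fn dynamic_threshold(
    graduation_threshold: u64,
    engagement_velocity: f64, // view.velocity(1h) for the item
    baseline_velocity: f64,   // category baseline velocity (1h)
) -> u64 {
    let scaled = (engagement_velocity / baseline_velocity * 10.0).max(10.0);
    // `as u64` saturates, so an infinite ratio still ends up capped below.
    (scaled as u64).min(graduation_threshold)
}
```

With a baseline of 50 views/hour and a threshold of 100, velocities of 500, 2,000, and 5,000 all return 100, while 250 (5x baseline) returns 50 and 50 (1x baseline) returns 10.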
**Revised formula for early graduation -- breakout detection:**
The more useful form is detecting items that should graduate _before_ reaching 100 signals:
```rust
fn check_breakout(
    item: EntityId,
    signal_ledger: &HotSignalState,
    category: &str,
    category_baselines: &CategoryBaselines,
    breakout_multiplier: f64, // default: 3.0
) -> bool {
    // 1-hour view velocity for this specific item.
    let item_velocity = signal_ledger.velocity(item, "view", &Window::hours(1));
    // Fall back to 10 views/hour when the category has no baseline yet.
    let category_baseline = category_baselines
        .get(category)
        .map(|b| b.avg_velocity_1h)
        .unwrap_or(10.0);
    item_velocity > category_baseline * breakout_multiplier
}
```
When breakout is detected:
1. Item's `exploration_weight` is forced to 0.0
2. Item is removed from the exploration pool
3. Item competes in the normal ranking pipeline
4. The signal changelog records the breakout event for analytics
**Default `breakout_multiplier`: 3.0.** An item with 3x the category's average view velocity in its first hour is a breakout.
### 8.3 Graduation Curve
```
[Plot: exploration_weight (y-axis, 0.0 to 1.0) vs. signal_count (x-axis, 0 to 140).
 The weight falls linearly from 1.0 at 0 signals to 0.0 at 100 signals and stays at 0.0 beyond that.]
Linear decay: exploration_weight = max(0, 1 - signal_count / 100)
At 0 signals: exploration_weight = 1.00 (full proxy score)
At 25 signals: exploration_weight = 0.75
At 50 signals: exploration_weight = 0.50 (equal blend)
At 75 signals: exploration_weight = 0.25
At 100 signals: exploration_weight = 0.00 (graduated)
```
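The linear decay and the resulting blend can be written as two small functions. This is a sketch under the formula stated above; the function names are illustrative, and `proxy` and `signal` are assumed to be scores already normalized to `[0, 1]`.

```rust
/// exploration_weight = max(0, 1 - signal_count / graduation_threshold)
fn exploration_weight(signal_count: u64, graduation_threshold: u64) -> f64 {
    (1.0 - signal_count as f64 / graduation_threshold as f64).max(0.0)
}

/// Blends the proxy score and the signal score by the exploration weight.
/// At 0 signals this is the proxy score; at graduation it is the signal score.
fn blended_score(
    proxy: f64,
    signal: f64,
    signal_count: u64,
    graduation_threshold: u64,
) -> f64 {
    let ew = exploration_weight(signal_count, graduation_threshold);
    ew * proxy + (1.0 - ew) * signal
}
```

For example, an item with a proxy score of 0.8 and a signal score of 0.2 blends to about 0.5 at 50 signals with the default threshold of 100.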
### 8.4 User Graduation
Users graduate from cold start when their signal count reaches `user_graduation_threshold` (default: 50). At graduation:
1. Elevated exploration boost decays to zero
2. Cohort-to-personal transition completes (personal_weight = 1.0)
3. The user is counted toward cohort centroid computation (if they have 100+ signals)
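The decay of the elevated exploration boost can be sketched as follows. The linear shape is an assumption, chosen to match the form used in property test P11 (`profile_exploration + boost * max(0, 1 - signals / threshold)`); the function name is hypothetical.

```rust
/// Effective exploration budget for a user who may still be cold.
/// Assumption: the boost decays linearly with the user's signal count.
fn effective_user_exploration(
    profile_exploration: f64,        // e.g. 0.10
    new_user_exploration_boost: f64, // default: 0.20
    user_signal_count: u64,
    user_graduation_threshold: u64,  // default: 50
) -> f64 {
    let remaining =
        (1.0 - user_signal_count as f64 / user_graduation_threshold as f64).max(0.0);
    profile_exploration + new_user_exploration_boost * remaining
}
```

With the defaults, a brand-new user gets about 0.30, a user with 25 signals about 0.20, and a user at or past 50 signals the profile default of 0.10.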
### 8.5 Creator Graduation
Creators graduate from provisional status when they have at least `creator_maturity_threshold` (default: 5) graduated items. At graduation:
1. Creator signal confidence reaches 1.0
2. Creator quality score is no longer diluted with category baseline
3. Discovery multiplier no longer applies (but items still get standard exploration)
---
## 9. Cold Start Across Surfaces
Cold start behavior differs by surface because each surface has different signal requirements and different tolerance for unproven content.
### Surface Eligibility Matrix
| Surface | UC | Cold Item Eligible | Cold Item Strategy | Cold User Strategy |
|---------|-----|-------------------|-------------------|--------------------|
| For You | UC-01 | Yes (exploration budget) | Proxy-scored exploration injection | Cohort-trending + elevated exploration |
| Search | UC-02 | Yes (no engagement gate) | Ranked by text/semantic relevance | Reduced personalization boost |
| Trending | UC-03 | No (velocity required) | Excluded until 1h age + velocity | Global or cohort-scoped trending |
| Following | UC-04 | Yes (if user follows creator) | Chronological (no signal dependency) | N/A (requires follows) |
| Related | UC-05 | Yes (embedding available) | Ranked by semantic similarity | Anchor-based (user-independent) |
| Browse (new sort) | UC-06 | Yes | Chronological | N/A (not personalized) |
| Browse (hot/top sort) | UC-06 | Yes (proxy estimate) | Ranked by proxy score | N/A (not personalized) |
| Notifications | UC-07 | N/A | N/A | N/A |
| Creator Profile | UC-08 | Yes | Chronological or by popularity | N/A (creator-scoped) |
| User Library | UC-09 | N/A | N/A | Empty until engagement |
| People Search | UC-10 | Yes | Ranked by text relevance | N/A |
| Visual Search | UC-11 | Yes | Ranked by visual similarity | N/A |
| Live Content | UC-12 | Yes | Ranked by viewer count | N/A |
| Hidden Gems | UC-13 | Partial (50+ signals required) | Excluded below minimum | N/A (not personalized) |
| Controversial | UC-14 | No (dual-signal required) | Excluded until sufficient signals | N/A |
### Surface-Specific Details
**Search (UC-02).** New items are eligible for search results if their text relevance (BM25) or semantic similarity is high. There is no engagement gate for search -- withholding relevant results because the item has no signals would be incorrect for an intent-driven surface. However, the exploration budget for search is reduced to `0.05` (5%) because search users have explicit intent and should see primarily relevance-ranked results. Cold items in search must pass a relevance gate: `bm25_score > 0.3 OR semantic_similarity > 0.5`.
**Trending (UC-03).** New items are excluded from trending surfaces. Trending requires velocity signals, which require time to accumulate. **Minimum age for trending eligibility:** 1 hour (configurable). This prevents artificial trending from coordinated burst engagement on a new item.
**Hidden Gems (UC-13).** Hidden Gems explicitly favors items with high quality signals and low reach. Items in Cold Start phase are natural candidates for "low reach" -- but they must show quality signals. **Minimum requirement:** 50 signals with `completion_rate > 0.6` and `like_ratio > 0.8`. An item with zero completions is not a hidden gem; it is just unseen.
**Following (UC-04).** New items from followed creators appear immediately in the Following feed, sorted chronologically. No cold start mechanism is needed -- the user explicitly chose to follow this creator.
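The surface-specific gates above reduce to simple predicates. The sketch below uses the thresholds quoted in this section; the function names are hypothetical, and hard-coding the thresholds is a simplification (a real implementation would presumably read them from configuration).

```rust
use std::time::Duration;

/// Search (UC-02): a cold item must clear the relevance gate.
fn passes_search_relevance_gate(bm25_score: f64, semantic_similarity: f64) -> bool {
    bm25_score > 0.3 || semantic_similarity > 0.5
}

/// Trending (UC-03): items younger than the minimum age are excluded.
fn trending_eligible(item_age: Duration, min_age: Duration) -> bool {
    item_age >= min_age
}

/// Hidden Gems (UC-13): needs enough signals and strong quality ratios.
fn hidden_gem_eligible(signal_count: u64, completion_rate: f64, like_ratio: f64) -> bool {
    signal_count >= 50 && completion_rate > 0.6 && like_ratio > 0.8
}
```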
---
## 10. Edge Cases
### Edge Case Handling Table
| Edge Case | Behavior | Rationale |
|-----------|----------|-----------|
| **Item with no embedding** | Excluded from ANN-based exploration. Eligible for scan-based surfaces (browse, trending) once signals accumulate. Proxy score computed without embedding_similarity and embedding_novelty components (remaining weights renormalized). | Embedding is required for personalized ranking. Items without embeddings cannot participate in ANN retrieval. |
| **Creator with no items** | No impact on cold start. Creator embedding is zero vector until first item published. | Creator cold start only matters when they have items to rank. |
| **User signs up and immediately leaves** | Preference vector remains at initial centroid. No signals written. User contributes nothing to cohort centroids (below 100 signal threshold). No resources wasted. | The system does not eagerly compute anything for users who never engage. |
| **All items in exploration pool are from same creator** | Diversity enforcement applies to exploration items. Maximum `exploration_max_per_creator` (default: 1) items from the same creator in exploration slots. Remaining slots filled by next-best creators. | Prevents a single prolific creator from dominating exploration. |
| **Exploration pool is empty** | No exploration items injected. All result slots filled by ranked items. This is expected for mature platforms during low-publishing periods. | The system degrades gracefully -- no exploration is better than no results. |
| **User blocks a creator whose items are in exploration** | Blocked creator's items are excluded from exploration results, same as ranked results. INV-FL-2 (blocked creator exclusion) applies uniformly. | Block is a hard filter. No exceptions, including exploration. |
| **Item receives only negative signals** | Signals count toward graduation threshold. Item with 100 negative signals graduates with a very low signal score. It drops out of contention naturally. | Negative signals are data. They accumulate the same as positive signals for graduation purposes. |
| **Returning user after long absence** | If a user has been dormant for 30+ days (no signals), apply a temporary learning rate multiplier on their next signals. `lr_multiplier = min(2.0, 1.0 + (days_since_last_signal - 30) / 30)`. This allows the preference vector to readapt to potentially shifted interests without reverting to cold-start behavior. | A user who was active 3 months ago should not be treated as a new user, but their preferences may have drifted. The boost is temporary (decays linearly to zero over the first 20 signals after return) and bounded (max 2x). |
| **Burst of items from same creator** | Each item independently enters the exploration pool. Creator discovery boost applies per-item. Combined with diversity enforcement (`exploration_max_per_creator: 1`), at most 1 exploration slot per query goes to a single creator. | Prevents a creator from flooding the exploration pool by publishing many items at once. |
| **Cold item in cold category** | If the category has no baseline (fewer than 50 items with 100+ views), the category_baseline_score defaults to 0.5 (neutral). The embedding_similarity_score has no quality centroid to compare against, so it also defaults to 0.5. | New categories start neutral. The system does not penalize or reward items in unknown categories. |
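The per-creator cap in the table can be enforced with a single pass over the pool in score order. This is a sketch only: the `Candidate` type and `select_exploration` function are hypothetical, and it assumes exploration candidates are ranked by proxy score.

```rust
use std::collections::HashMap;

/// Hypothetical exploration candidate.
#[derive(Clone, Debug, PartialEq)]
struct Candidate {
    item_id: u64,
    creator_id: u64,
    proxy_score: f64,
}

/// Fills up to `slots` exploration slots in descending proxy-score order,
/// taking at most `max_per_creator` items from any single creator.
fn select_exploration(
    mut pool: Vec<Candidate>,
    slots: usize,
    max_per_creator: u32, // default: 1
) -> Vec<Candidate> {
    pool.sort_by(|a, b| b.proxy_score.total_cmp(&a.proxy_score));
    let mut per_creator: HashMap<u64, u32> = HashMap::new();
    let mut picked = Vec::with_capacity(slots);
    for candidate in pool {
        if picked.len() == slots {
            break;
        }
        let used = per_creator.entry(candidate.creator_id).or_insert(0);
        if *used < max_per_creator {
            *used += 1;
            picked.push(candidate);
        }
        // Otherwise skip: the slot goes to the next-best creator.
    }
    picked
}
```

If one creator holds the two highest-scoring items and the cap is 1, the second slot goes to the best item from a different creator.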
### Returning User Absence Boost
When a previously active user returns after extended absence (30+ days since last signal), their preferences may have drifted. Rather than treating them as a cold user (which would discard their history), the system temporarily increases their learning rate:
```rust
fn absence_boost_lr(
    base_lr: f64,
    days_since_last_signal: u64,
    signals_since_return: u64,
) -> f64 {
    if days_since_last_signal < 30 || signals_since_return > 20 {
        return base_lr; // no boost needed
    }
    // Linear multiplier from 1.0 (at 30 days) to 2.0 (at 60+ days)
    let multiplier = (1.0 + ((days_since_last_signal as f64 - 30.0) / 30.0))
        .min(2.0);
    // Decay the boost over the first 20 signals after return
    let decay = 1.0 - (signals_since_return as f64 / 20.0);
    let effective_multiplier = 1.0 + (multiplier - 1.0) * decay.max(0.0);
    base_lr * effective_multiplier
}
```
**Constraints:**
- Minimum absence for boost: 30 days
- Maximum learning rate multiplier: 2.0x
- Boost decays linearly over first 20 signals after return
- Does not revert to cold-start exploration budget or cohort priors
---
## 11. Configuration Reference
### Item Cold Start Configuration
```rust
pub struct ItemColdStartConfig {
    /// Signal count at which item graduates to signal-based ranking.
    /// Default: 100.
    pub graduation_threshold: u64,
    /// Exploration eligibility window after item creation.
    /// Default: 48 hours.
    pub exploration_window: Duration,
    /// Minimum proxy score for exploration pool eligibility.
    /// Default: 0.2.
    pub min_quality_floor: f64,
    /// Earliest result position for exploration items.
    /// Default: 3.
    pub min_exploration_position: usize,
    /// Minimum spacing between exploration items in results.
    /// Default: 3.
    pub min_exploration_spacing: usize,
    /// Breakout velocity multiplier over category baseline.
    /// Default: 3.0.
    pub breakout_multiplier: f64,
    /// Maximum items from same creator in exploration slots per query.
    /// Default: 1.
    pub exploration_max_per_creator: u32,
    /// Exploration pool refresh interval (background materializer).
    /// Default: 5 minutes.
    pub pool_refresh_interval: Duration,
    /// Maximum items in the exploration pool.
    /// Default: 50,000.
    pub max_pool_size: usize,
}
```
### User Cold Start Configuration
```rust
pub struct UserColdStartConfig {
    /// Additional exploration budget for cold users (added to profile default).
    /// Default: 0.20 (so a profile with 0.10 becomes 0.30 for cold users).
    pub new_user_exploration_boost: f64,
    /// Signal count at which user exploration boost decays to zero.
    /// Default: 50.
    pub user_graduation_threshold: u64,
    /// Signal count at which cohort-to-personal transition completes.
    /// Default: 50.
    pub cohort_blend_threshold: u64,
    /// Minimum cohort size (graduated users) for cohort centroid to be used.
    /// Default: 50.
    pub min_cohort_size_for_centroid: u64,
    /// Minimum signals per user for cohort centroid contribution.
    /// Default: 100.
    pub min_signals_for_centroid: u64,
    /// Minimum absence days before returning user boost applies.
    /// Default: 30.
    pub absence_boost_threshold_days: u64,
    /// Maximum learning rate multiplier for returning users.
    /// Default: 2.0.
    pub absence_boost_max_multiplier: f64,
    /// Signals after return over which absence boost decays.
    /// Default: 20.
    pub absence_boost_decay_signals: u64,
}
```
### Creator Cold Start Configuration
```rust
pub struct CreatorColdStartConfig {
    /// Maximum item count for a creator to qualify as "new."
    /// Default: 5.
    pub new_creator_item_threshold: u32,
    /// Maximum follower count for a creator to qualify as "new."
    /// Default: 100.
    pub new_creator_follower_threshold: u32,
    /// Exploration budget multiplier for new creator items.
    /// Default: 1.5.
    pub discovery_multiplier: f64,
    /// Exploration budget multiplier for a creator's very first item.
    /// Stacks with discovery_multiplier.
    /// Default: 2.0.
    pub first_item_multiplier: f64,
    /// Minimum graduated items before creator signals reach full confidence.
    /// Default: 5.
    pub creator_maturity_threshold: u32,
    /// Signal weight multiplier for provisional creators.
    /// Default: 0.5.
    pub provisional_signal_weight: f64,
    /// Number of similar creators to compare against for quality prior.
    /// Default: 20.
    pub similar_creator_count: usize,
}
```
### Configuration Defaults Summary
| Parameter | Default | Range | Rationale |
|-----------|---------|-------|-----------|
| `graduation_threshold` | 100 | 10-1000 | 100 signals provide statistically meaningful engagement data |
| `exploration` (per profile) | 0.10 | 0.0-0.50 | 10% discovery, 90% ranked. Balances quality and freshness |
| `exploration_window` | 48h | 1h-168h | 48h gives items a weekend cycle |
| `min_quality_floor` | 0.2 | 0.0-0.5 | Prevents obviously low-quality content from consuming exploration budget |
| `min_exploration_position` | 3 | 1-10 | Top 2 positions are earned, not given to unproven content |
| `breakout_multiplier` | 3.0 | 1.5-10.0 | 3x category baseline is clearly exceptional, not noise |
| `new_user_exploration_boost` | 0.20 | 0.0-0.40 | 30% total exploration for new users (0.10 + 0.20) |
| `user_graduation_threshold` | 50 | 10-200 | 50 signals = meaningful preference vector divergence |
| `cohort_blend_threshold` | 50 | 10-200 | 50 signals = sufficient for ANN retrieval to be useful |
| `min_cohort_size_for_centroid` | 50 | 10-500 | Below 50 graduated users, centroid is unreliable |
| `new_creator_item_threshold` | 5 | 1-20 | Creators with < 5 items have insufficient track record |
| `discovery_multiplier` | 1.5 | 1.0-3.0 | 50% boost for new creator items |
| `first_item_multiplier` | 2.0 | 1.0-5.0 | Every creator deserves one strong chance |
| `creator_maturity_threshold` | 5 | 1-20 | 5 graduated items = reliable creator quality signal |
| `provisional_signal_weight` | 0.5 | 0.1-1.0 | Half-weight creator signals until maturity |
| `absence_boost_threshold_days` | 30 | 7-90 | 30 days is meaningfully absent |
| `absence_boost_max_multiplier` | 2.0 | 1.0-5.0 | Double learning rate at most |
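The ranges in the summary table lend themselves to a startup validation step. The sketch below checks a subset of parameters against their documented ranges; `OutOfRange`, `check_range`, and `validate_cold_start_subset` are hypothetical names, and treating the range bounds as inclusive is an assumption.

```rust
/// Hypothetical error describing a parameter outside its documented range.
#[derive(Debug, PartialEq)]
struct OutOfRange {
    parameter: &'static str,
    value: f64,
    min: f64,
    max: f64,
}

/// Inclusive range check. NaN fails both comparisons and is rejected.
fn check_range(
    parameter: &'static str,
    value: f64,
    min: f64,
    max: f64,
) -> Result<(), OutOfRange> {
    if value >= min && value <= max {
        Ok(())
    } else {
        Err(OutOfRange { parameter, value, min, max })
    }
}

/// Validates a few parameters against the ranges in the table above.
fn validate_cold_start_subset(
    graduation_threshold: u64,
    exploration: f64,
    breakout_multiplier: f64,
    discovery_multiplier: f64,
) -> Result<(), OutOfRange> {
    check_range("graduation_threshold", graduation_threshold as f64, 10.0, 1000.0)?;
    check_range("exploration", exploration, 0.0, 0.50)?;
    check_range("breakout_multiplier", breakout_multiplier, 1.5, 10.0)?;
    check_range("discovery_multiplier", discovery_multiplier, 1.0, 3.0)?;
    Ok(())
}
```

The documented defaults (100, 0.10, 3.0, 1.5) pass; a `graduation_threshold` of 5 is rejected with the offending parameter named in the error.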
---
## 12. Performance Considerations
Cold start should not slow queries. The mechanisms described here must operate within the existing query latency budget (< 50ms end-to-end for RETRIEVE queries).
### Performance Budget
| Operation | Budget | Mechanism |
|-----------|--------|-----------|
| Cold start phase detection | < 100 ns | O(1) atomic counter read from hot tier |
| Exploration weight computation | < 10 ns | One subtraction + division + max |
| Proxy score lookup (per item) | < 100 ns | Pre-computed, stored in entity store |
| Proxy score computation (at ingestion) | < 5 us | Four lookups + weighted sum + two ANN lookups |
| Exploration pool selection | < 2 ms | Pre-sorted pool, take top N |
| Exploration position calculation | < 100 ns | Arithmetic on limit + count |
| Cohort centroid lookup | < 100 ns | Cached in memory |
| Interleaving | < 500 ns | Array merge at calculated positions |
| User exploration rate computation | < 10 ns | One subtraction + max |
| Breakout detection (per item) | < 200 ns | One velocity read + comparison |
| Absence boost computation | < 50 ns | Timestamp comparison + multiplication |
### Total Cold Start Overhead per Query
| Query Type | Without Cold Start | With Cold Start | Overhead |
|-----------|-------------------|-----------------|----------|
| RETRIEVE for_you (established user) | ~40 ms | ~42 ms | +2 ms (exploration pool selection) |
| RETRIEVE for_you (cold user) | N/A | ~45 ms | Cohort trending + elevated exploration |
| SEARCH | ~30 ms | ~30 ms | Negligible (no exploration pool for search) |
| RETRIEVE trending | ~20 ms | ~20 ms | Cold items excluded (no overhead) |
### Memory Budget
| Component | Size | Notes |
|-----------|------|-------|
| Exploration pool (50K items * 50 bytes) | 2.5 MB | Entity ID + proxy score + created_at |
| Category baselines (1000 categories * 64 bytes) | 64 KB | Median velocity, avg quality |
| Category quality centroids (1000 * 1536 * 2 bytes) | 3 MB | f16 embeddings |
| Population centroid (1 * 1536 * 4 bytes) | 6 KB | f32 for precision |
| Cohort centroids (100 cohorts * 1536 * 4 bytes) | 600 KB | f32 |
| Cold start state per item | 0 bytes | Uses existing `all_time_count` atomic counters |
| **Total** | **~6.2 MB** | Negligible vs. hot tier budget |
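The total can be reproduced from the per-row sizes. The function below is only a back-of-the-envelope check using the counts and element sizes from the table; its name is hypothetical.

```rust
/// Sums the memory budget rows above, in bytes.
fn cold_start_memory_bytes() -> u64 {
    let exploration_pool: u64 = 50_000 * 50; // entity ID + proxy score + created_at
    let category_baselines: u64 = 1_000 * 64;
    let category_centroids: u64 = 1_000 * 1_536 * 2; // f16 embeddings
    let population_centroid: u64 = 1_536 * 4; // f32
    let cohort_centroids: u64 = 100 * 1_536 * 4; // f32
    exploration_pool
        + category_baselines
        + category_centroids
        + population_centroid
        + cohort_centroids
}
```

The exact sum is 6,256,544 bytes (about 6.26 MB in decimal units), in line with the table's total, which adds up the rounded row values.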
### Background Computation Schedule
| Computation | Frequency | Cost | Trigger |
|-------------|-----------|------|---------|
| Exploration pool refresh | Every 5 min | ~100 ms (scan cold items, sort) | Timer |
| Category baselines | Every 1 hour | ~2 sec (scan items per category) | Materializer hourly cycle |
| Category quality centroids | Every 24 hours | ~30 sec (compute weighted means) | Materializer daily cycle |
| Population centroid | Every 24 hours | ~5 sec (mean of user preference vectors) | Materializer daily cycle |
| Cohort centroids | Every 24 hours | ~10 sec (mean per cohort) | Materializer daily cycle |
---
## 13. Invariants and Correctness Guarantees
### Cold Start Invariants
**INV-CS-1: No Permanent Cold State.** Every item either graduates through signal accumulation or exits the exploration pool through window expiration. No item remains in the exploration pool indefinitely.
Formally: For any item I, either:
- `signal_count(I, t) >= graduation_threshold` for some `t < created_at(I) + exploration_window`
- `t > created_at(I) + exploration_window` and I is no longer exploration-eligible
**INV-CS-2: Exploration Budget Bound.** The number of exploration items in any result set never exceeds `ceil(limit * budget)`. The budget is a hard cap, not a target.
Formally: For any query Q with `limit = L` and effective exploration budget `B`:
```
|exploration_items(results(Q))| <= ceil(L * B)
```
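One way to satisfy this bound by construction is to compute the slot count from the cap and then clamp it. The sketch below is illustrative; `max_exploration_items` is a hypothetical name, and clamping to the number of available cold items is an assumption about how an under-filled pool is handled.

```rust
/// Number of exploration slots for a query: at most ceil(limit * budget),
/// and never more than the cold items available or the result limit itself.
fn max_exploration_items(limit: usize, budget: f64, cold_item_count: usize) -> usize {
    let cap = (limit as f64 * budget).ceil() as usize;
    cap.min(cold_item_count).min(limit)
}
```

Because the cap is applied before anything else, a large exploration pool can never push the count above `ceil(limit * budget)`.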
**INV-CS-3: Quality Floor for Exploration.** No item with `proxy_score < min_quality_floor` (default: 0.2) appears as an exploration item.
**INV-CS-4: Blocked/Hidden Exclusion in Exploration.** Exploration items respect all user exclusions. A hidden item is never injected as an exploration item. A blocked creator's items are never injected as exploration items.
Formally: INV-FL-1 (hidden items never reappear) and INV-FL-2 (blocked creator exclusion) hold for exploration items identically to ranked items.
**INV-CS-5: Exploration Position Bound.** No exploration item appears at position 1 or 2 in the result set. The minimum position is `min_exploration_position` (default: 3).
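A placement routine that respects both the minimum position and the minimum spacing could look like the sketch below. The function name, the even-stride strategy, and the choice to drop positions that would fall past `limit` are all assumptions for illustration; positions are 1-indexed.

```rust
/// Spreads `count` exploration items across a result list of length `limit`.
/// Positions are 1-indexed, start at `min_position`, and are at least
/// `min_spacing` apart. Positions that would exceed `limit` are dropped.
fn spread_exploration_positions(
    limit: usize,
    count: usize,
    min_position: usize, // default: 3
    min_spacing: usize,  // default: 3
) -> Vec<usize> {
    if count == 0 || min_position == 0 || min_position > limit {
        return Vec::new();
    }
    let span = limit - min_position + 1;
    // Even stride across the usable span, but never tighter than min_spacing.
    let stride = (span / count).max(min_spacing).max(1);
    (0..count)
        .map(|i| min_position + i * stride)
        .take_while(|&p| p <= limit)
        .collect()
}
```

For `limit = 20` and five exploration items this yields positions 3, 6, 9, 12, and 15, so positions 1 and 2 are always left to ranked items.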
**INV-CS-6: Graduation Monotonicity.** Once an item's `signal_count >= graduation_threshold`, it never reverts to cold state. Graduation is a one-way transition. Signal counts are monotonically increasing (signals are append-only).
Formally: If `signal_count(I, t) >= graduation_threshold`, then for all `t' > t`:
```
signal_count(I, t') >= graduation_threshold
```
**INV-CS-7: Linear Blend Correctness.** The blended score at any point matches the analytical formula:
```
|effective_score - (ew * proxy + (1-ew) * signal)| < f64::EPSILON
where ew = max(0, 1 - signal_count / graduation_threshold)
```
**INV-CS-8: Cohort Prior Freshness.** A cold user's cohort centroid is at most 24 hours old (background materializer daily cycle). The population centroid is at most 24 hours old.
### Interaction with Other Invariants
| Invariant | Interaction |
|-----------|-------------|
| INV-FL-1 (hidden items never reappear) | Exploration items are filtered through the same exclusion bitmap as ranked items |
| INV-FL-2 (blocked creator exclusion) | Exploration items are filtered through the same blocked set as ranked items |
| INV-SIG-1 (no signal loss) | Signal loss would prevent graduation, keeping items cold longer than necessary. WAL durability prevents this. |
| INV-COH-7 (minimum population threshold) | Cohort priors are only used when the cohort meets the minimum population threshold. Below threshold, fall back to population centroid. |
---
## 14. Property Tests
```rust
// P1: Exploration budget never exceeds declared limit.
proptest! {
    #[test]
    fn exploration_budget_bounded(
        limit in 10usize..200,
        budget in 0.01f64..0.50,
        cold_item_count in 0usize..1000,
    ) {
        let max_exploration = (limit as f64 * budget).ceil() as usize;
        let actual = compute_exploration_count(limit, budget, cold_item_count);
        prop_assert!(actual <= max_exploration,
            "exploration count {} exceeds max {} (limit={}, budget={})",
            actual, max_exploration, limit, budget);
    }
}
// P2: Exploration weight is monotonically decreasing with signal count.
proptest! {
    #[test]
    fn exploration_weight_monotonic(
        signals_a in 0u64..10000,
        signals_b in 0u64..10000,
        threshold in 10u64..1000,
    ) {
        let weight_a = (1.0 - signals_a as f64 / threshold as f64).max(0.0);
        let weight_b = (1.0 - signals_b as f64 / threshold as f64).max(0.0);
        if signals_a <= signals_b {
            prop_assert!(weight_a >= weight_b - f64::EPSILON,
                "exploration weight not monotonic: f({})={} < f({})={}",
                signals_a, weight_a, signals_b, weight_b);
        }
    }
}
// P3: Exploration weight is exactly 0 at graduation threshold.
proptest! {
    #[test]
    fn exploration_weight_zero_at_graduation(
        threshold in 10u64..1000,
    ) {
        let weight = (1.0 - threshold as f64 / threshold as f64).max(0.0);
        prop_assert!((weight - 0.0).abs() < f64::EPSILON,
            "exploration weight at threshold = {}, expected 0.0", weight);
    }
}
// P4: Exploration weight is exactly 1.0 at zero signals.
proptest! {
    #[test]
    fn exploration_weight_one_at_zero(
        threshold in 10u64..1000,
    ) {
        let weight = (1.0 - 0.0f64 / threshold as f64).max(0.0);
        prop_assert!((weight - 1.0).abs() < f64::EPSILON,
            "exploration weight at 0 signals = {}, expected 1.0", weight);
    }
}
// P5: Proxy score is bounded [0, 1].
proptest! {
    #[test]
    fn proxy_score_bounded(
        creator_quality in 0.0f64..1.0,
        category_baseline in 0.0f64..1.0,
        metadata_complete in 0.0f64..1.0,
        embedding_novelty in 0.0f64..1.0,
        embedding_sim in -1.0f64..1.0,
        freshness in 0.0f64..1.0,
    ) {
        let score = proxy_score(
            creator_quality, category_baseline,
            metadata_complete, embedding_novelty,
            embedding_sim, freshness,
        );
        prop_assert!(score >= 0.0 && score <= 1.0,
            "proxy score {} out of bounds [0, 1]", score);
    }
}
// P6: Blended score equals proxy score at zero signals.
proptest! {
    #[test]
    fn blended_score_equals_proxy_at_zero(
        proxy in 0.0f64..1.0,
        signal_score in 0.0f64..1.0,
        threshold in 10u64..1000,
    ) {
        let ew = (1.0 - 0.0f64 / threshold as f64).max(0.0);
        let blended = ew * proxy + (1.0 - ew) * signal_score;
        prop_assert!((blended - proxy).abs() < f64::EPSILON,
            "blended score {} != proxy {} at 0 signals", blended, proxy);
    }
}
// P7: Blended score equals signal score at graduation.
proptest! {
    #[test]
    fn blended_score_equals_signal_at_graduation(
        proxy in 0.0f64..1.0,
        signal_score in 0.0f64..1.0,
        threshold in 10u64..1000,
    ) {
        let ew = (1.0 - threshold as f64 / threshold as f64).max(0.0);
        let blended = ew * proxy + (1.0 - ew) * signal_score;
        prop_assert!((blended - signal_score).abs() < f64::EPSILON,
            "blended score {} != signal {} at graduation", blended, signal_score);
    }
}
// P8: Hidden items never appear in exploration results.
proptest! {
    #[test]
    fn hidden_items_excluded_from_exploration(
        items in arb_items(100),
        hidden_indices in prop::collection::hash_set(0usize..100, 0..20),
    ) {
        let db = setup_test_db();
        let user = create_test_user(&db);
        for item in &items {
            db.write_item(item)?;
        }
        for &idx in &hidden_indices {
            db.signal(Signal {
                kind: "hide",
                item: items[idx].id,
                user,
                ..Default::default()
            })?;
        }
        let results = db.retrieve(Retrieve {
            for_user: Some(user),
            profile: "for_you",
            limit: 50,
            ..Default::default()
        })?;
        for &idx in &hidden_indices {
            prop_assert!(
                !results.results.iter().any(|r| r.id == items[idx].id),
                "Hidden item {} appeared in results (possibly as exploration)",
                items[idx].id
            );
        }
    }
}
// P9: Exploration items are never at position 1 or 2.
proptest! {
    #[test]
    fn exploration_positions_respect_minimum(
        limit in 10usize..200,
        exploration_count in 1usize..20,
        min_position in 2usize..10,
    ) {
        let exploration_count = exploration_count.min(limit / 3);
        if exploration_count == 0 { return Ok(()); }
        let positions = exploration_positions(limit, exploration_count, min_position);
        for &pos in &positions {
            prop_assert!(pos >= min_position.max(3),
                "exploration position {} below minimum {}", pos, min_position.max(3));
            prop_assert!(pos <= limit,
                "exploration position {} exceeds limit {}", pos, limit);
        }
    }
}
// P10: Exploration positions are evenly distributed (not clustered).
proptest! {
    #[test]
    fn exploration_positions_distributed(
        limit in 20usize..200,
        exploration_count in 2usize..20,
    ) {
        let exploration_count = exploration_count.min(limit / 4);
        if exploration_count < 2 { return Ok(()); }
        let positions = exploration_positions(limit, exploration_count, 3);
        // Verify minimum spacing between consecutive positions
        for window in positions.windows(2) {
            let gap = window[1].saturating_sub(window[0]);
            prop_assert!(gap >= 3,
                "exploration positions too close: {} and {} (gap={})",
                window[0], window[1], gap);
        }
    }
}
// P11: User exploration boost decays to profile default.
proptest! {
    #[test]
    fn user_exploration_decays_to_default(
        profile_exploration in 0.01f64..0.50,
        boost in 0.0f64..0.40,
        threshold in 10u64..200,
    ) {
        let effective = profile_exploration
            + boost * (1.0 - threshold as f64 / threshold as f64).max(0.0);
        prop_assert!((effective - profile_exploration).abs() < f64::EPSILON,
            "effective {} != profile {} at graduation threshold",
            effective, profile_exploration);
    }
}
// P12: Absence boost is bounded.
proptest! {
    #[test]
    fn absence_boost_bounded(
        base_lr in 0.001f64..0.1,
        days_absent in 0u64..365,
        signals_since in 0u64..100,
    ) {
        let boosted = absence_boost_lr(base_lr, days_absent, signals_since);
        prop_assert!(boosted >= base_lr - f64::EPSILON,
            "boosted lr {} below base {}", boosted, base_lr);
        prop_assert!(boosted <= base_lr * 2.0 + f64::EPSILON,
            "boosted lr {} exceeds 2x base {}", boosted, base_lr * 2.0);
    }
}
```
---
## Appendix A: Glossary
| Term | Definition |
|------|------------|
| **Cold Start** | The phase where an entity has zero signals and cannot participate in signal-based ranking |
| **Accumulating** | The phase where an entity has some signals but below the graduation threshold; scoring is blended |
| **Graduated** | The phase where an entity has sufficient signals for purely signal-based ranking |
| **Exploration Budget** | The fraction of query result slots reserved for cold-start items, per ranking profile |
| **Exploration Pool** | The pre-sorted set of cold items eligible for exploration injection |
| **Exploration Window** | The duration after item creation during which items are exploration-eligible (default: 48h) |
| **Exploration Weight** | Linear function of signal count that controls the blend between proxy and signal scores |
| **Proxy Score** | Predicted item quality from creator history, category baselines, metadata, embeddings, and freshness |
| **Graduation Threshold** | The signal count at which exploration weight reaches 0 and the item competes on signals alone |
| **Breakout Detection** | Identifying items whose early signal velocity far exceeds the category baseline, triggering early graduation |
| **Cohort Prior** | Using cohort-level statistics (centroid embedding, trending content) as the initial state for a new user |
| **Population Centroid** | The mean preference vector of all users with 100+ signals, used as the ultimate fallback for cold users |
| **Cohort Centroid** | The mean preference vector of users in a specific cohort with 100+ signals |
| **Creator Discovery Boost** | Additional exploration budget allocated to items from new creators |
| **First-Item Boost** | Extra exploration budget for a creator's very first published item |
| **Provisional Creator Signals** | Creator-level signal data weighted at 50% until the creator has 5 graduated items |
| **Absence Boost** | Temporary learning rate multiplier for users returning after 30+ days of inactivity |
| **Quality Floor** | Minimum proxy score required for exploration eligibility (default: 0.2) |
## Appendix B: References
1. VISION.md, Design Principles: "Cold start is handled by the database." (Architectural requirement)
2. USE_CASES.md, UC-01: "minimum 10% exploration budget (creators the user does not follow)." (Product requirement)
3. USE_CASES.md, UC-13: "Creator follower count -- small/new creators get priority." (Discovery equity requirement)
4. API.md, ProfileDef: `exploration: 0.10`. (API surface)
5. Feedback Loop Specification, Section 3: Preference Vector Management. (Cold start initialization, adaptive learning rate: lr_max=0.10, lr_min=0.01, decay_k=0.003)
6. Cohort Specification, Section 6: Three-Layer Trending Model. (Cohort-scoped trending as cold user prior)
7. Entity Model Specification: Cold Start State. (Entity lifecycle cold start definition, creator computed fields)
8. Signal System Specification, Section 3: `all_time_count` atomic counters. (O(1) graduation tracking)
9. Schema Specification, Section 8: Defaults and Population Priors. (Population centroid, exploration budget mechanics)
10. Li, L., Chu, W., Langford, J., Schapire, R. "A Contextual-Bandit Approach to Personalized News Article Recommendation." WWW 2010. (Exploration-exploitation tradeoff in recommendation)
11. Agarwal, D., Chen, B., Elango, P. "Explore/Exploit Schemes for Web Content Optimization." ICDM 2009. (Exploration budget allocation)