12 -- Cold Start Specification

Status: Draft
Authors: tidalDB Engineering
Date: 2026-02-20
Depends on: Entity Model, Signal System, Relationships, Cohorts, Feedback Loop, Schema
References: VISION.md (Design Principles: "Cold start is handled by the database"), USE_CASES.md (UC-01, UC-13), API.md (ProfileDef.exploration), thoughts.md (Part III, Gap 5)


Table of Contents

  1. Overview
  2. Design Principles
  3. Cold Start Lifecycle
  4. New Item Cold Start
  5. New User Cold Start
  6. New Creator Cold Start
  7. Cold Start and Cohorts
  8. Graduation Metrics
  9. Cold Start Across Surfaces
  10. Edge Cases
  11. Configuration Reference
  12. Performance Considerations
  13. Invariants and Correctness Guarantees
  14. Property Tests

1. Overview

Cold start is the problem of ranking entities that have no signal history. It affects three entity types -- items, users, and creators -- and manifests at three scales: individual entity cold start (a new item enters the database), cohort cold start (a new user with no history arrives), and system cold start (a brand new database with no data at all).

In the traditional multi-system architecture, cold start is application logic. The application maintains fallback rules, special-cases new content injection, manages exploration budgets in Redis, and runs A/B tests on cold start strategies in a separate experimentation framework. This is exactly the kind of domain logic that tidalDB internalizes.

Cold start is a database responsibility. The application writes db.write_item(...). The database decides how to rank that item when it has zero signals. The application writes db.write_user(...). The database decides what to show that user when they have zero history. The application does not manage exploration budgets, quality estimation from metadata, or cohort-based priors. The database does.

The Fundamental Tension

Cold start is a tension between exploitation and exploration:

  • Exploitation: Show users content that the system is confident they will like. This maximizes short-term engagement but creates filter bubbles and starves new content of exposure.
  • Exploration: Show users content the system knows nothing about. This enables discovery and gives new content a fair chance but risks showing low-quality content.

tidalDB resolves this tension with three mechanisms:

  1. Exploration budgets -- a configurable percentage of results reserved for cold-start items, managed per ranking profile. Items in cold start are distributed evenly through the result set, not appended at the end.
  2. Proxy scoring -- predicting item quality from creator history, category baselines, metadata completeness, embedding similarity, and freshness, before any engagement signals exist.
  3. Cohort-based priors -- using cohort membership to provide warm-start behavior for new users, replacing the population-level default with a segment-level default.

Integration Points

| Subsystem | Cold Start Integration |
|---|---|
| Signal System (03) | all_time_count counters provide graduation tracking. Hot-tier atomic counters enable O(1) state detection. |
| Entity Model (02) | Entity lifecycle (Active/Archived/Deleted) gates cold start eligibility. Creator computed fields (avg_item_quality, avg_engagement_rate, follower_count) feed proxy scoring. |
| Cohorts (05) | Cohort centroids provide preference vector initialization for new users. Three-layer trending model provides cohort-scoped content for cold user feeds. |
| Feedback Loop (10) | Adaptive learning rate (lr_max=0.10, lr_min=0.01, decay_k=0.003) provides rapid adaptation during cold start. Preference vector update formula uses the same mechanism. |
| Schema (11) | ProfileDef.exploration field controls per-profile exploration budget. Section 8 defines population priors and cold start configuration. |

2. Design Principles

Cold start is a state, not a flag. An entity's cold start status is a property of its signal ledger, not a flag the application manages. The database knows an entity is cold because its all_time_count is below the graduation threshold. It does not need to be told. There is no mark_as_cold_start() API.

Exploration decays linearly as evidence accumulates. A new item starts with maximum exploration weight. As signals accumulate, the weight decreases linearly toward zero. When enough signals exist for the ranking profile to score the item confidently, exploration weight reaches zero and the item competes on signals alone. There is no permanent "new item" status.

Proxy scores are stopgaps, not ranking strategies. Predicted quality from creator history, category baselines, metadata, and embeddings is used only until real signals exist. It is phased out linearly as real signals accumulate. Proxy scores never override strong real signals.

Cohort priors replace population priors for new users. A new user who provides locale, age range, and interests at signup should not see global trending. They should see cohort-scoped trending -- what is popular among users who look like them. Cohort priors are the bridge between "no history" and "personalized."

The application does not manage cold start. There is no set_exploration_budget() API. The database detects cold start conditions automatically from the signal ledger state and applies the exploration strategy declared in the ranking profile. The ProfileDef.exploration field is the single configuration knob.

Every entity graduates or expires. No item remains cold indefinitely. Either signals accumulate and the item graduates to signal-based ranking, or the exploration window expires and the item exits the exploration pool. Both outcomes are bounded by configurable thresholds.


3. Cold Start Lifecycle

Entity Lifecycle Diagram

Every entity in tidalDB progresses through three cold start phases. The phase is determined by the entity's signal ledger, not by explicit flags.

                    ┌──────────────────┐
  write_item()      │    COLD START    │   signal_count = 0
  ────────────────> │                  │   exploration_weight = 1.0
                    │  Score: 100%     │   Quality source: proxy scoring only
                    │  proxy           │
                    └────────┬─────────┘
                             │
                    first signal arrives
                             │
                    ┌────────▼─────────┐
                    │   ACCUMULATING   │   0 < signal_count < graduation_threshold
                    │                  │   exploration_weight = max(0, 1 - count/threshold)
                    │  Score: blended  │   Quality source: blended proxy + observed
                    │  proxy + signal  │
                    └────────┬─────────┘
                             │
                    signal_count >= graduation_threshold
                    OR dynamic graduation triggered
                             │
                    ┌────────▼─────────┐
                    │    GRADUATED     │   signal_count >= graduation_threshold
                    │                  │   exploration_weight = 0.0
                    │  Score: 100%     │   Quality source: observed signals only
                    │  signal-based    │
                    └──────────────────┘

Phase Definitions

| Phase | Signal Count | Exploration Weight | Score Composition | Detection Cost |
|---|---|---|---|---|
| Cold Start | 0 | 1.0 (maximum) | 100% proxy score | O(1) -- atomic counter read |
| Accumulating | 1 to graduation_threshold - 1 | Linear decay toward 0 | Blended: (1-ew) * signal_score + ew * proxy_score | O(1) -- atomic counter read |
| Graduated | >= graduation_threshold | 0.0 | 100% signal-based score | O(1) -- atomic counter read |

Exploration Weight Formula

The exploration weight decays linearly from 1.0 to 0.0 as signals accumulate:

exploration_weight = max(0, 1 - signal_count / graduation_threshold)

Where graduation_threshold is configurable per ranking profile (default: 100).

Why linear, not sigmoid. Linear decay is simpler, predictable, and debuggable. The exploration weight at 50 signals is exactly 0.5, not an opaque sigmoid output. The application developer can reason about the system: "my item has 30 signals out of 100, so 70% of its score comes from proxy estimation." Sigmoid introduces a parameter (k) that is difficult to tune and makes the relationship between signal count and exploration weight non-obvious.

Blended Scoring Formula

During the Accumulating phase, an item's effective score is a linear blend:

score = exploration_weight * proxy_score + (1 - exploration_weight) * signal_score

Where:

  • proxy_score is the quality estimate from Section 4.2
  • signal_score is the score computed by the ranking profile's normal scoring pipeline
  • exploration_weight decays linearly per the formula above

  • At Cold Start (0 signals): score = 1.0 * proxy_score + 0.0 * signal_score = proxy_score
  • At 50/100 signals: score = 0.5 * proxy_score + 0.5 * signal_score
  • At Graduated (100+ signals): score = 0.0 * proxy_score + 1.0 * signal_score = signal_score
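The two formulas above can be sketched together as follows. This is a minimal illustration, not the tidalDB implementation: the function names and the handling of a zero threshold are assumptions made for the example.

```rust
/// Linear exploration weight: 1.0 at zero signals, 0.0 at or beyond the threshold.
fn exploration_weight(signal_count: u64, graduation_threshold: u64) -> f64 {
    if graduation_threshold == 0 {
        // Assumption for this sketch: a zero threshold means "always graduated".
        return 0.0;
    }
    (1.0 - signal_count as f64 / graduation_threshold as f64).max(0.0)
}

/// Blend proxy and signal scores:
/// score = ew * proxy_score + (1 - ew) * signal_score.
fn blended_score(
    signal_count: u64,
    graduation_threshold: u64,
    proxy_score: f64,
    signal_score: f64,
) -> f64 {
    let ew = exploration_weight(signal_count, graduation_threshold);
    ew * proxy_score + (1.0 - ew) * signal_score
}
```

For an item with 30 of 100 signals, a proxy score of 0.8, and a signal score of 0.4, the blend is 0.7 * 0.8 + 0.3 * 0.4 = 0.68.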

Phase Detection

Phase detection is O(1). The all_time_count for the primary signal (typically view) is maintained as an atomic counter in the hot-tier signal state, as specified in Signal System Section 3.

/// Determine an item's cold start phase.
/// Cost: one atomic load. No scan, no disk read.
fn cold_start_phase(
    signal_ledger: &HotSignalState,
    graduation_threshold: u64,
) -> ColdStartPhase {
    let signal_count = signal_ledger.all_time_count("view");
    if signal_count == 0 {
        ColdStartPhase::ColdStart
    } else if signal_count < graduation_threshold {
        ColdStartPhase::Accumulating { signal_count }
    } else {
        ColdStartPhase::Graduated
    }
}

4. New Item Cold Start

Problem Statement

A newly ingested item has zero signals. No views, no likes, no completions, no skips. The ranking function -- which relies on engagement velocity, decay scores, completion rate, and like ratio -- has nothing to work with. Without intervention, the item would score zero and never appear in any ranked result, creating a chicken-and-egg problem: the item cannot get engagement without exposure, and it cannot get exposure without engagement.

Solution: Three Mechanisms

4.1 Exploration Budget

Every ranking profile declares an exploration budget: the percentage of result slots reserved for cold-start items.

db.define_profile(ProfileDef {
    name: "for_you",
    // ... candidate, boosts, gates, diversity ...
    exploration: 0.10,  // 10% of result slots reserved for exploration
})?;

The budget is applied after diversity enforcement, before pagination. For a query with LIMIT 50 and exploration: 0.10, 5 result slots are reserved for exploration items. The remaining 45 slots are filled by the ranking profile's normal scoring pipeline.

Budget bounds. The exploration budget is clamped to [0.0, 0.50]. A budget above 50% would mean more exploration than ranked results, which defeats the purpose of ranking. A budget of 0.0 disables exploration entirely (used for surfaces like trending where cold items are ineligible by definition).
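As a rough sketch of the slot arithmetic, assuming the slot count is rounded to the nearest integer (the rounding rule is not stated in this spec) and using a hypothetical helper name:

```rust
/// Upper bound on the exploration budget, per the clamp described above.
const MAX_EXPLORATION_BUDGET: f64 = 0.50;

/// Split a result limit into (exploration_slots, ranked_slots).
/// Rounding to the nearest slot is an assumption of this sketch.
fn split_slots(limit: usize, exploration: f64) -> (usize, usize) {
    let budget = if exploration.is_nan() {
        0.0 // treat an invalid budget as "exploration disabled"
    } else {
        exploration.clamp(0.0, MAX_EXPLORATION_BUDGET)
    };
    let exploration_slots = (limit as f64 * budget).round() as usize;
    (exploration_slots, limit - exploration_slots)
}
```

With LIMIT 50 and exploration: 0.10 this yields 5 exploration slots and 45 ranked slots; a requested budget of 0.9 is clamped to 0.50 and yields 25 and 25.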

4.2 Proxy Scoring

Before any engagement signals exist, the database estimates item quality from available metadata, the creator's track record, embedding similarity, and freshness. This proxy score determines which cold items are selected to fill the exploration budget and how they rank relative to each other.

proxy_score = weighted_sum(
    creator_quality_score     * 0.30,
    category_baseline_score   * 0.10,
    metadata_completeness     * 0.15,
    embedding_novelty_score   * 0.10,
    embedding_similarity_score * 0.25,
    freshness_score           * 0.10,
)

Each component:

Creator Quality Score (weight: 0.30): The creator's track record is the strongest predictor of new item quality.

fn creator_quality_score(creator: &CreatorEntity) -> f64 {
    let avg_quality = creator.computed("avg_item_quality")
        .unwrap_or(0.5);     // default for new creators
    let engagement_rate = creator.computed("avg_engagement_rate")
        .unwrap_or(0.03);    // default
    let posting_freq = creator.computed("posting_frequency")
        .unwrap_or(1.0);     // items per week

    let quality_norm = avg_quality.clamp(0.0, 1.0);
    let engagement_norm = (engagement_rate / 0.10).clamp(0.0, 1.0);
    let consistency_norm = (posting_freq / 7.0).clamp(0.0, 1.0);

    quality_norm * 0.50 + engagement_norm * 0.35 + consistency_norm * 0.15
}

For new creators (no avg_item_quality), the creator cohort comparison (Section 6.3) provides the baseline.

Category Baseline Score (weight: 0.10): The average quality of recently published items in the same category.

fn category_baseline_score(category: &str, baselines: &CategoryBaselines) -> f64 {
    baselines.get(category)
        .map(|b| b.avg_quality_score)
        .unwrap_or(0.5)   // neutral default for unknown categories
}

Category baselines are maintained by the background materializer as the mean quality score (completion rate * like ratio) of all items in the category published in the last 30 days with at least 100 views.

Metadata Completeness Score (weight: 0.15): Items with complete metadata tend to be higher quality than items with sparse metadata.

fn metadata_completeness_score(item: &ItemEntity) -> f64 {
    let mut score = 0.0;

    // Title present and non-trivial (> 10 chars)
    if item.get("title").map(|t| t.len() > 10).unwrap_or(false) {
        score += 0.25;
    }
    // Description present and non-trivial (> 50 chars)
    if item.get("description").map(|d| d.len() > 50).unwrap_or(false) {
        score += 0.25;
    }
    // At least 2 tags
    if item.get_keywords("tags").map(|t| t.len() >= 2).unwrap_or(false) {
        score += 0.20;
    }
    // Category set
    if item.get("category").is_some() {
        score += 0.15;
    }
    // Has subtitles (accessibility = quality indicator)
    if item.get_bool("has_subtitles").unwrap_or(false) {
        score += 0.15;
    }

    score
}

Embedding Novelty Score (weight: 0.10): Measures how different this item is from existing content. Items that fill gaps in the embedding space get a boost -- they provide genuine novelty rather than duplicating existing content.

fn embedding_novelty_score(
    item_embedding: &[f32],
    nearest_neighbor_distance: f64,  // from HNSW index
) -> f64 {
    // Higher distance = more novel. A saturating exponential maps distance to [0, 1).
    // Items very close to existing content score low.
    // Items in underrepresented embedding regions score high.
    let novelty = 1.0 - (-3.0 * nearest_neighbor_distance).exp();
    novelty.clamp(0.0, 1.0)
}

Embedding Similarity Score (weight: 0.25): How similar is this item's embedding to known high-quality items in the same category? This is the strongest content-based signal.

fn embedding_similarity_score(
    item_embedding: &[f32],
    category: &str,
    quality_centroids: &CategoryQualityCentroids,
) -> f64 {
    let centroid = quality_centroids.get(category);
    match centroid {
        Some(c) => {
            let similarity = cosine_similarity(item_embedding, c);
            (similarity + 1.0) / 2.0  // map [-1, 1] to [0, 1]
        }
        None => 0.5,  // neutral default if no centroid computed yet
    }
}

Category quality centroids are computed by the background materializer as the weighted mean embedding of items in the category with completion_rate > 0.7, like_ratio > 0.85, published in the last 90 days, with at least 500 views.

Freshness Score (weight: 0.10): More recent items receive a slight boost, ensuring newly published content is prioritized within the exploration pool.

fn freshness_score(created_at: DateTime<Utc>, now: DateTime<Utc>) -> f64 {
    let age_hours = (now - created_at).num_hours() as f64;
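    // Linear decay over 48 hours (the default exploration window); older items get 0.
    // The upper clamp guards against a created_at slightly in the future (clock skew).
    (1.0 - age_hours / 48.0).clamp(0.0, 1.0)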
}
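Putting the six components together, the weighted sum might look like the sketch below. The struct and function names are hypothetical, each component is assumed to already lie in [0, 1], and the final clamp is a defensive assumption rather than something this spec requires.

```rust
/// The six proxy-score components, each assumed to be in [0, 1].
struct ProxyComponents {
    creator_quality: f64,
    category_baseline: f64,
    metadata_completeness: f64,
    embedding_novelty: f64,
    embedding_similarity: f64,
    freshness: f64,
}

/// Weighted sum using the weights listed above (they total 1.0).
fn proxy_score(c: &ProxyComponents) -> f64 {
    let score = c.creator_quality * 0.30
        + c.category_baseline * 0.10
        + c.metadata_completeness * 0.15
        + c.embedding_novelty * 0.10
        + c.embedding_similarity * 0.25
        + c.freshness * 0.10;
    // Guard against floating-point drift just outside [0, 1].
    score.clamp(0.0, 1.0)
}
```

Because the weights sum to 1.0, an item whose components are all 0.5 (the neutral defaults used in several functions above) receives a proxy score of 0.5.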

Proxy Score Computation Timing

The proxy score is first computed at item ingestion (write_item()) and stored alongside the entity:

[entity_id][0x00][COLD:proxy_score]   ->  f32 (predicted quality)
[entity_id][0x00][COLD:created_at]    ->  u64 (creation timestamp)

The score is recomputed by the background materializer when:

  • Creator's avg_item_quality is updated (daily)
  • Category baselines change significantly (>20% relative change)
  • The item accumulates signals (the blend ratio shifts)

4.3 Exploration Distribution

Exploration items are distributed evenly through the result set, not clustered at the end. Placing all exploration items at positions 46-50 in a 50-item result means users who do not scroll past position 10 never see them, creating a systematic bias against new content.

Exploration Distribution Algorithm:

Given: LIMIT 50, exploration_count = 5

Exploration positions:  3, 12, 21, 30, 39
    (min_position = 3, spacing = 9)

Constraints:
    min_position >= 3      (never position 1 or 2 -- top slots are earned)
    spacing = max(3, (limit - min_position) / exploration_count)   (integer division)
    position[i] = min_position + i * spacing

fn exploration_positions(
    limit: usize,
    exploration_count: usize,
    min_position: usize,
) -> Vec<usize> {
    if exploration_count == 0 {
        return vec![];
    }
    let min_position = min_position.max(3); // never top 2
    let available = limit.saturating_sub(min_position);
    let spacing = if exploration_count <= 1 {
        available
    } else {
        (available / exploration_count).max(3)
    };

    (0..exploration_count)
        .map(|i| (min_position + i * spacing).min(limit))
        .collect()
}

Rationale for min_position = 3. Positions 1 and 2 are high-value real estate. Users judge the entire feed by the first two items. Inserting an unproven cold-start item there risks a poor first impression. Position 3 is the earliest safe insertion point -- the user has already seen two strong items.

Rationale for even spacing (spacing = 9 for 5 items in 50 slots). Evenly spaced exploration items ensure that users who scroll to any depth encounter approximately the same density of new content. Clustering creates dead zones.

4.4 Exploration Window

Cold items are exploration-eligible for a configurable duration after creation. The window defaults to 48 hours. After the window expires, the item must compete on signals alone -- it is no longer injected into exploration slots.

The window ensures that items which fail to attract any engagement during their exploration period are not perpetually given free exposure. Content that nobody engages with after 48 hours and hundreds of impressions is probably not interesting.

Exploration Budget Mechanics Diagram

Query: RETRIEVE items FOR USER @u USING PROFILE for_you LIMIT 50

Step 1: Normal Ranking Pipeline
    ┌──────────────────────────────────────────┐
    │  ANN retrieval (top 500 candidates)       │
    │  Signal scoring (decay, velocity, gates)  │
    │  Diversity enforcement (max 2/creator)     │
    │  Top 45 results by score                  │
    └───────────────────┬──────────────────────┘
                        │
Step 2: Exploration Pool Selection (budget = 10% of 50 = 5 slots)
    ┌──────────────────────────────────────────┐
    │  Select cold items from exploration pool:  │
    │    - Created within last 48h               │
    │    - signal_count < graduation_threshold   │
    │    - Not already in top 45 results         │
    │    - Not hidden/blocked for this user       │
    │    - proxy_score > min_quality_floor (0.2)  │
    │  Rank by proxy_score                       │
    │  Take top 5                                │
    └───────────────────┬──────────────────────┘
                        │
Step 3: Interleaving at Calculated Positions
    ┌──────────────────────────────────────────┐
    │  Insert exploration items at positions:    │
    │    3, 12, 21, 30, 39                      │
    │                                           │
    │  Result: [R R E R R R R R R R R E ...]    │
    │  R = ranked item, E = exploration item     │
    └───────────────────┬──────────────────────┘
                        │
Step 4: Impression Tracking
    ┌──────────────────────────────────────────┐
    │  All returned items (including exploration)│
    │  generate impression signals.              │
    │                                           │
    │  Exploration items MUST be tracked.        │
    │  The feedback loop is how they accumulate  │
    │  signals and graduate or get deprioritized.│
    └──────────────────────────────────────────┘
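Step 3 of the diagram can be sketched as a simple merge. This is an illustration under stated assumptions, not the tidalDB implementation: `interleave` is a hypothetical helper, positions are 1-indexed and ascending, and an exploration item whose position lies past the end of the ranked items is appended rather than dropped.

```rust
/// Merge ranked and exploration items, placing exploration[i] at
/// 1-indexed slot positions[i]. Assumes `positions` is ascending.
fn interleave<T: Clone>(ranked: &[T], exploration: &[T], positions: &[usize]) -> Vec<T> {
    // Exploration items without a position are ignored.
    let n_explore = exploration.len().min(positions.len());
    let total = ranked.len() + n_explore;
    let mut out = Vec::with_capacity(total);
    let (mut r, mut e) = (0, 0);
    for slot in 1..=total {
        let exploration_due = e < n_explore && positions[e] <= slot;
        let ranked_exhausted = r >= ranked.len();
        if exploration_due || ranked_exhausted {
            out.push(exploration[e].clone());
            e += 1;
        } else {
            out.push(ranked[r].clone());
            r += 1;
        }
    }
    out
}
```

The positions argument would come from exploration_positions(); keeping the two steps separate makes the placement policy testable on its own.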

Exploration Pool Management

The exploration pool is the set of items eligible for exploration injection. It is maintained by the background materializer and cached in memory.

Exploration Pool:
    Items where:
        created_at > now() - exploration_window       (within 48h)
        AND signal_count < graduation_threshold       (not yet graduated)
        AND status = "published"                       (active)
        AND proxy_score > min_quality_floor (0.2)      (minimum quality)

    Sorted by: proxy_score DESC

    Size: typically 1,000 to 50,000 items
    Refresh: every 5 minutes (background materializer)
    Memory: ~50 bytes per item * 50K = ~2.5 MB

Items exit the exploration pool when:

  1. They accumulate enough signals to graduate (signal_count >= graduation_threshold)
  2. They exceed the exploration window age (48h)
  3. They are archived or deleted
  4. Dynamic graduation triggers early promotion (Section 8.2)
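The membership rules above reduce to a single predicate. The sketch below uses hypothetical type and field names, represents time as seconds since an epoch, and maps the "published" status to the Active lifecycle state; dynamic graduation (Section 8.2) is not modeled.

```rust
#[allow(dead_code)]
#[derive(Clone, Copy, PartialEq, Eq)]
enum Status {
    Active,
    Archived,
    Deleted,
}

#[derive(Clone, Copy)]
struct PoolCandidate {
    created_at_secs: u64,
    signal_count: u64,
    status: Status,
    proxy_score: f64,
}

struct PoolConfig {
    exploration_window_secs: u64, // default in this spec: 48 hours
    graduation_threshold: u64,    // default in this spec: 100
    min_quality_floor: f64,       // default in this spec: 0.2
}

/// True when the item still belongs in the exploration pool.
fn is_pool_eligible(item: &PoolCandidate, cfg: &PoolConfig, now_secs: u64) -> bool {
    let age_secs = now_secs.saturating_sub(item.created_at_secs);
    age_secs < cfg.exploration_window_secs
        && item.signal_count < cfg.graduation_threshold
        && item.status == Status::Active
        && item.proxy_score > cfg.min_quality_floor
}
```

A materializer refresh could filter candidates with this predicate and then sort the survivors by proxy_score in descending order.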

5. New User Cold Start

Problem Statement

A new user has no preference vector, no engagement history, no relationship graph. The personalized ranking profile -- which depends on ANN retrieval from the user's preference vector, interaction weights with creators, and seen/unseen state -- has nothing to work with. Without intervention, the For You feed would either be empty or fall back to global popularity, which is rarely a good first impression.

Solution: Three-Stage Onboarding

5.1 Preference Vector Initialization

When a new user is created, their preference vector must be initialized to something meaningful. The initialization follows a hierarchy, using the best available prior:

   User created via db.write_user(...)
        │
        ▼
  ┌─────────────────────────────────────────┐
  │  STEP 1: Check explicit_interests       │
  │                                         │
  │  Does the user have explicit_interests? │
  │    ["jazz", "cooking", "rust"]          │
  └─────────────┬───────────────────────────┘
                │
           ┌────┴────┐
           │         │
          YES        NO
           │         │
           ▼         ▼
  ┌────────────┐  ┌─────────────────────────────┐
  │ Centroid   │  │  STEP 2: Check cohort        │
  │ of interest│  │                              │
  │ embeddings │  │  Can the user be placed in   │
  │            │  │  a demographic cohort?        │
  │ Lookup     │  │  (locale, age_range present) │
  │ embedding  │  └──────┬──────────────────────┘
  │ for each   │         │
  │ interest   │    ┌────┴────┐
  │ keyword,   │    │         │
  │ compute    │   YES        NO
  │ mean       │    │         │
  └────┬───────┘    ▼         ▼
       │    ┌────────────┐  ┌────────────┐
       │    │ Cohort     │  │ Population │
       │    │ centroid   │  │ centroid   │
       │    │            │  │            │
       │    │ Mean pref  │  │ Mean pref  │
       │    │ vector of  │  │ vector of  │
       │    │ cohort     │  │ ALL users  │
       │    │ users with │  │ with 100+  │
       │    │ 100+       │  │ signals    │
       │    │ signals    │  │            │
       │    └────┬───────┘  └─────┬──────┘
       │         │                │
       └────┬────┘                │
            │                     │
            ▼                     │
  ┌────────────────────┐          │
  │ Shift toward       │          │
  │ cohort centroid    │◄─────────┘
  │ (if available)     │
  └────────┬───────────┘
           │
           ▼
  ┌────────────────────┐
  │  Normalize to      │
  │  unit length       │
  │                    │
  │  Insert into HNSW  │
  └────────────────────┘

Priority hierarchy:

  1. Explicit interests provided -- compute centroid of interest embeddings, shift toward cohort centroid if available
  2. Demographic cohort available -- use cohort centroid (mean preference vector of cohort users with 100+ signals)
  3. Neither available -- use population centroid (mean preference vector of all users with 100+ signals)
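The hierarchy can be sketched as below. Two things are assumptions of this example rather than part of the spec: the function names, and the `cohort_shift` parameter, which stands in for the unspecified amount by which an interest centroid is shifted toward the cohort centroid. All vectors are assumed to share one dimensionality.

```rust
/// Element-wise mean of equally sized vectors; None if the slice is empty.
fn mean_vector(vectors: &[Vec<f32>]) -> Option<Vec<f32>> {
    let first = vectors.first()?;
    let mut mean = vec![0.0f32; first.len()];
    for v in vectors {
        for (m, x) in mean.iter_mut().zip(v.iter()) {
            *m += *x;
        }
    }
    let n = vectors.len() as f32;
    mean.iter_mut().for_each(|m| *m /= n);
    Some(mean)
}

/// Scale to unit length; a zero vector is left unchanged.
fn normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        v.iter_mut().for_each(|x| *x /= norm);
    }
}

/// Pick the best available prior, then normalize.
/// `cohort_shift` in [0, 1] is a hypothetical blend factor.
fn initial_preference(
    interest_embeddings: &[Vec<f32>],
    cohort_centroid: Option<&[f32]>,
    population_centroid: &[f32],
    cohort_shift: f32,
) -> Vec<f32> {
    let mut v = match (mean_vector(interest_embeddings), cohort_centroid) {
        // 1. Explicit interests, shifted toward the cohort when one exists.
        (Some(mut interests), Some(cohort)) => {
            for (i, c) in interests.iter_mut().zip(cohort.iter()) {
                *i = (1.0 - cohort_shift) * *i + cohort_shift * *c;
            }
            interests
        }
        (Some(interests), None) => interests,
        // 2. Demographic cohort only.
        (None, Some(cohort)) => cohort.to_vec(),
        // 3. Population fallback.
        (None, None) => population_centroid.to_vec(),
    };
    normalize(&mut v);
    v
}
```

The normalized vector is what would be inserted into the HNSW index in the final step of the diagram.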

5.2 Early Personalization (Rapid Learning)

During the user's first signals, the adaptive learning rate is at its maximum (lr_max = 0.10). This means each signal moves the preference vector significantly:

lr = lr_max * exp(-decay_k * signal_count) + lr_min

Where:
    lr_max   = 0.10    (10% shift per signal at start)
    lr_min   = 0.01    (1% shift per signal at maturity)
    decay_k  = 0.003   (lr reaches floor at ~1500 signals)

| Signal Count | Learning Rate | Effect |
|---|---|---|
| 0 | 0.10 | Each like moves preference vector ~10% toward item |
| 5 | 0.098 | Strong directional preference forming |
| 20 | 0.094 | Meaningfully different from initial centroid |
| 50 | 0.087 | Clear multi-interest profile emerging |
| 100 | 0.074 | Well-defined preferences |
| 500 | 0.023 | Stable but still responsive |
| 1000 | 0.015 | Near-stable |
| 1500+ | 0.010 | At floor -- stable |

These values match the Feedback Loop spec, Section 3. Cold start does not introduce different learning rates -- it relies on the adaptive learning rate mechanism that is naturally highest for new users.

What "rapid learning" means in practice: At lr_max = 0.10 with a like (weight 1.0), 5 likes in the same category establish a strong directional preference. 10 likes across two categories establish a multi-interest profile. By 20 signals, the preference vector is meaningfully different from the initial centroid.
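A minimal sketch of the learning-rate formula exactly as written in this section, with the three defaults as constants. Evaluated literally, it yields lr_max + lr_min = 0.11 at zero signals, so the tabulated values are best read as approximate; the Feedback Loop spec remains the authority on the exact form.

```rust
const LR_MAX: f64 = 0.10;
const LR_MIN: f64 = 0.01;
const DECAY_K: f64 = 0.003;

/// lr = lr_max * exp(-decay_k * signal_count) + lr_min
fn adaptive_learning_rate(signal_count: u64) -> f64 {
    LR_MAX * (-DECAY_K * signal_count as f64).exp() + LR_MIN
}
```

The rate decreases monotonically with signal count and approaches lr_min asymptotically, which is the property cold start relies on: early signals move the preference vector far more than later ones.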

5.3 Cold User Feed Strategy

New users receive two feed modifications:

Elevated exploration budget. New users get an exploration rate of profile_exploration + new_user_exploration_boost (default: 0.10 + 0.20 = 0.30, i.e., 30% of results are exploration items). This decays linearly to the profile default as signals accumulate:

effective_exploration = profile_exploration
    + new_user_exploration_boost * max(0, 1 - signal_count / user_graduation_threshold)

Where:
    profile_exploration        = 0.10 (from ProfileDef)
    new_user_exploration_boost = 0.20 (default)
    user_graduation_threshold  = 50   (default)

| Signal Count | Boost | Effective Rate |
|---|---|---|
| 0 | 0.20 | 0.30 (30%) |
| 10 | 0.16 | 0.26 |
| 25 | 0.10 | 0.20 |
| 50 | 0.00 | 0.10 (profile default) |
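The effective exploration rate can be reproduced with a few lines. The function name and the zero-threshold handling are assumptions of this sketch.

```rust
/// effective = profile_exploration
///           + new_user_boost * max(0, 1 - signal_count / user_graduation_threshold)
fn effective_exploration(
    profile_exploration: f64,
    new_user_boost: f64,
    signal_count: u64,
    user_graduation_threshold: u64,
) -> f64 {
    let remaining = if user_graduation_threshold == 0 {
        0.0 // assumption: a zero threshold disables the new-user boost
    } else {
        (1.0 - signal_count as f64 / user_graduation_threshold as f64).max(0.0)
    };
    profile_exploration + new_user_boost * remaining
}
```

With the defaults (0.10, 0.20, 50), signal counts of 0, 10, 25, and 50 give 0.30, 0.26, 0.20, and 0.10 respectively.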

Cohort-to-personal transition. As the user accumulates signals, candidate generation transitions from cohort-driven to preference-driven:

personal_weight = min(1.0, signal_count / cohort_blend_threshold)
cohort_weight   = 1.0 - personal_weight

candidates = merge(
    cohort_trending(user_cohort, top_k * cohort_weight),
    ann_retrieval(user_preference, top_k * personal_weight),
)

Where cohort_blend_threshold = 50 (default)

| Signal Count | Cohort Weight | Personal Weight | Behavior |
|---|---|---|---|
| 0 | 1.00 | 0.00 | Entirely cohort-driven |
| 10 | 0.80 | 0.20 | Mostly cohort, some personal |
| 25 | 0.50 | 0.50 | Equal blend |
| 50 | 0.00 | 1.00 | Entirely personal |
| 100+ | 0.00 | 1.00 | Fully personalized |

Cold user For You feed composition evolution:

Signal Count 0:
    Cohort-trending items:  70% (trending among users in same cohort)
    Exploration items:      30% (quality-weighted, diverse creators)
    Personal signal items:   0% (no history yet)

Signal Count 25:
    Cohort-trending items:  35%
    Exploration items:      20% (declining from 30%)
    Personal signal items:  45% (ANN from preference vector)

Signal Count 50+:
    Cohort-trending items:   0% (transition complete)
    Exploration items:      10% (profile default)
    Personal signal items:  90% (fully personalized)
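The cohort-to-personal transition above determines how many candidates each source contributes. A minimal sketch, assuming a hypothetical helper name and rounding the personal share to the nearest candidate:

```rust
/// Split top_k candidates into (cohort_k, personal_k) using
/// personal_weight = min(1, signal_count / cohort_blend_threshold).
fn candidate_split(
    top_k: usize,
    signal_count: u64,
    cohort_blend_threshold: u64,
) -> (usize, usize) {
    let personal_weight = if cohort_blend_threshold == 0 {
        1.0 // assumption: a zero threshold means "personal only"
    } else {
        (signal_count as f64 / cohort_blend_threshold as f64).min(1.0)
    };
    // personal_weight <= 1.0, so personal_k never exceeds top_k.
    let personal_k = (top_k as f64 * personal_weight).round() as usize;
    (top_k - personal_k, personal_k)
}
```

Note that this governs candidate generation only; the final feed mix after scoring and exploration injection need not match the candidate split exactly.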

6. New Creator Cold Start

Problem Statement

A new creator has no followers, no engagement baseline, no catalog embedding. Their items receive no social proof boost (nobody follows them), no interaction weight boost (nobody has engaged with them before), and no collaborative filtering signal (no overlap with other creators' audiences). Their first content is doubly cold: the item is cold AND the creator is cold.

Solution: Four Mechanisms

6.1 Discovery Boost

New creators receive an additional exploration budget boost on top of the standard item exploration budget. This boost is applied to items by creators whose total_items and follower_count computed fields are both at or below their thresholds.

fn creator_discovery_boost(creator: &CreatorEntity) -> f64 {
    let item_count = creator.computed("total_items").unwrap_or(0);
    let follower_count = creator.computed("follower_count").unwrap_or(0);

    if item_count <= NEW_CREATOR_ITEM_THRESHOLD       // default: 5
        && follower_count <= NEW_CREATOR_FOLLOWER_THRESHOLD // default: 100
    {
        CREATOR_DISCOVERY_MULTIPLIER  // default: 1.5
    } else {
        1.0
    }
}

The discovery boost means a new creator's item gets 10% * 1.5 = 15% exploration budget instead of the standard 10%.

6.2 Provisional Creator Signals

A new creator's signal data is statistically unreliable. Their avg_item_quality and avg_engagement_rate computed fields are based on too few data points. To prevent a single viral or flopped item from permanently defining a creator's quality estimate, creator-level signals are weighted at 50% until the creator has at least 5 graduated items.

fn creator_signal_confidence(creator: &CreatorEntity) -> f64 {
    let graduated_items = creator.computed("graduated_item_count")
        .unwrap_or(0);

    if graduated_items < CREATOR_MATURITY_THRESHOLD {  // default: 5
        PROVISIONAL_SIGNAL_WEIGHT  // default: 0.5
    } else {
        1.0
    }
}

When computing the creator quality component of an item's proxy score (Section 4.2), the creator score is multiplied by this confidence factor, and the remainder is filled by the category baseline:

adjusted_creator_score = creator_quality_score * creator_signal_confidence
                       + category_baseline * (1.0 - creator_signal_confidence)
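A compact sketch of the adjustment, folding the confidence lookup and the blend into one function. The function name is hypothetical; the constants use the defaults stated above.

```rust
const CREATOR_MATURITY_THRESHOLD: u64 = 5;
const PROVISIONAL_SIGNAL_WEIGHT: f64 = 0.5;

/// Blend the creator's own score with the category baseline,
/// trusting the creator fully only after enough graduated items.
fn adjusted_creator_score(
    creator_quality_score: f64,
    category_baseline: f64,
    graduated_items: u64,
) -> f64 {
    let confidence = if graduated_items < CREATOR_MATURITY_THRESHOLD {
        PROVISIONAL_SIGNAL_WEIGHT
    } else {
        1.0
    };
    creator_quality_score * confidence + category_baseline * (1.0 - confidence)
}
```

A creator with a quality score of 0.9 and two graduated items, in a category with a 0.5 baseline, is scored 0.9 * 0.5 + 0.5 * 0.5 = 0.7; once five items graduate, the full 0.9 applies.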

6.3 Creator Cohort Comparison

Even without engagement history, a new creator has metadata: categories, tags, language, region. The quality estimation system compares new creators to established creators with similar metadata to establish baseline expectations.

creator_prior_quality = weighted_mean(
    quality_scores_of_similar_creators,
    weights = similarity_to_new_creator
)

where similar_creators = creators in same category AND region
                         with > 1000 total item views
                         sorted by tag overlap
                         top 20

This creator prior is used as the category_baseline fallback when the creator has no avg_item_quality.
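The similarity-weighted mean can be sketched as follows. Inputs are assumed to be (quality_score, similarity) pairs for creators that already passed the category, region, and view-count filters; how similarity is derived from tag overlap is left out, and the function name is hypothetical.

```rust
const MAX_SIMILAR_CREATORS: usize = 20;

/// Weighted mean quality of the most similar creators.
/// Returns None when no creator has positive similarity.
fn creator_prior_quality(similar: &[(f64, f64)]) -> Option<f64> {
    let mut ranked: Vec<(f64, f64)> = similar
        .iter()
        .copied()
        .filter(|&(_, sim)| sim > 0.0)
        .collect();
    // Most similar first, then keep the top 20.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked.truncate(MAX_SIMILAR_CREATORS);

    let weight_sum: f64 = ranked.iter().map(|&(_, sim)| sim).sum();
    if weight_sum <= 0.0 {
        return None;
    }
    let weighted: f64 = ranked.iter().map(|&(q, sim)| q * sim).sum();
    Some(weighted / weight_sum)
}
```

Returning None lets the caller fall back to the plain category baseline when no comparable creators exist.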

6.4 First-Item Boost

A creator's very first published item receives extra exploration budget regardless of the creator's other signals. This ensures that every creator has at least one chance to be seen.

fn first_item_boost(creator: &CreatorEntity) -> f64 {
    let creator_item_count = creator.computed("total_items").unwrap_or(0);
    if creator_item_count <= 1 {
        FIRST_ITEM_BOOST_MULTIPLIER  // default: 2.0
    } else {
        1.0
    }
}

A creator's first item gets 10% * 2.0 = 20% exploration budget. Combined with the creator discovery boost: 10% * 1.5 * 2.0 = 30% total exploration budget for a new creator's first item. This is the maximum exploration commitment the system makes.
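
The stacking arithmetic above can be checked with a small sketch. The function and its boolean flags are hypothetical; the multipliers are the documented defaults.

```rust
/// Effective exploration budget for one item, given the profile budget.
/// Multipliers stack multiplicatively, as in the 10% * 1.5 * 2.0 example.
/// Hypothetical helper; constants mirror the documented defaults.
fn item_exploration_budget(
    profile_budget: f64,
    is_new_creator: bool,
    is_first_item: bool,
) -> f64 {
    const DISCOVERY_MULTIPLIER: f64 = 1.5; // new-creator discovery boost
    const FIRST_ITEM_MULTIPLIER: f64 = 2.0; // first-item boost

    let mut budget = profile_budget;
    if is_new_creator {
        budget *= DISCOVERY_MULTIPLIER;
    }
    if is_first_item {
        budget *= FIRST_ITEM_MULTIPLIER;
    }
    budget
}
```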


7. Cold Start and Cohorts

Cohort-Based Priors for New Users

This is the critical capability enabled by the cohort system. When a new user is created with demographic attributes, they are immediately placed in matching cohorts. Instead of showing global trending (which skews toward majority demographics), the user sees cohort-scoped trending.

New user signs up:
    locale: "ja-JP"
    age_range: "18-24"
    explicit_interests: ["anime", "music"]

Immediate cohort resolution:
    region:JP           --> bitmap A
    age_range:18-24     --> bitmap B
    interest:anime      --> bitmap C
    interest:music      --> bitmap D

    Primary cohort: A AND B     --> "young Japanese users"
    Interest cohort: A AND C    --> "Japanese anime fans"
    Interest cohort: A AND D    --> "Japanese music fans"

Why cohort priors matter: A 22-year-old user in Tokyo gets Japanese music, anime, and locally relevant content in their first session. A 45-year-old user in Texas gets country music, cooking shows, and locally relevant content. Neither sees the globally dominant content (typically English-language pop culture) unless it also happens to be trending in their cohort.
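
The cohort resolution shown above is a set intersection over membership bitmaps. The sketch below uses a toy 64-slot bitmap to make the `A AND B` step concrete; the `Bitmap` type and its methods are invented for the example and do not reflect the production bitmap representation.

```rust
/// Toy membership bitmap over user slots 0..64; bit i set means user i
/// belongs to the cohort. Illustrative only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Bitmap(u64);

impl Bitmap {
    /// Build a bitmap from user slots. Slots must be < 64 in this toy version.
    fn from_members(members: &[u32]) -> Self {
        Bitmap(members.iter().fold(0u64, |acc, &m| acc | (1u64 << m)))
    }

    /// Intersection: users present in both cohorts.
    fn and(self, other: Bitmap) -> Bitmap {
        Bitmap(self.0 & other.0)
    }

    fn count(self) -> u32 {
        self.0.count_ones()
    }

    fn contains(self, user: u32) -> bool {
        self.0 & (1u64 << user) != 0
    }
}
```

For example, if `region:JP` holds users {1, 2, 3, 5} and `age_range:18-24` holds {2, 3, 4}, the primary cohort is {2, 3}; intersecting `region:JP` with `interest:anime` = {3, 5, 6} gives the interest cohort {3, 5}.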

Cohort Centroid Computation

The cohort centroid is the mean preference vector of all users in the cohort who have at least 100 signals (referred to below as graduated members; note that this bar is stricter than the 50-signal user graduation threshold in Section 8.4). Users below 100 signals are excluded so that cold users do not dilute the centroid with their initial (non-personalized) vectors.

fn compute_cohort_centroid(
    cohort_members: &[UserId],
    min_signal_count: u64,  // default: 100
) -> Option<Vec<f32>> {
    let graduated_members: Vec<_> = cohort_members.iter()
        .filter(|u| signal_count(*u) >= min_signal_count)
        .collect();

    if graduated_members.len() < MIN_COHORT_SIZE_FOR_CENTROID {  // default: 50
        return None;  // not enough data -- fall back to population centroid
    }

    Some(mean_embedding(graduated_members.iter().map(|u| preference_vector(*u))))
}

Minimum cohort size. A cohort needs at least 50 graduated users (configurable) before its centroid is considered reliable. Below this threshold, the system falls back to the population centroid. This prevents small, possibly unrepresentative cohorts from creating misleading priors.
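
To make the fallback path concrete, here is a self-contained sketch that computes the centroid and falls back to the population centroid when the cohort is too small. The member representation (signal count paired with a preference vector) and the function names are assumptions for the example.

```rust
/// Element-wise mean of equally sized vectors; None if empty or ragged.
fn mean_embedding(vectors: &[Vec<f32>]) -> Option<Vec<f32>> {
    let dim = vectors.first()?.len();
    let mut sum = vec![0.0f32; dim];
    for v in vectors {
        if v.len() != dim {
            return None;
        }
        for (s, x) in sum.iter_mut().zip(v) {
            *s += *x;
        }
    }
    let n = vectors.len() as f32;
    Some(sum.into_iter().map(|s| s / n).collect())
}

/// Prior vector for a cold user: the cohort centroid when enough members
/// qualify, otherwise the population centroid.
/// `members` holds (signal_count, preference_vector) pairs (assumed shape).
fn cold_user_prior(
    members: &[(u64, Vec<f32>)],
    min_signal_count: u64,  // documented default: 100
    min_cohort_size: usize, // documented default: 50
    population_centroid: &[f32],
) -> Vec<f32> {
    let eligible: Vec<Vec<f32>> = members
        .iter()
        .filter(|(count, _)| *count >= min_signal_count)
        .map(|(_, v)| v.clone())
        .collect();

    if eligible.len() < min_cohort_size {
        return population_centroid.to_vec();
    }
    mean_embedding(&eligible).unwrap_or_else(|| population_centroid.to_vec())
}
```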

The three-layer trending model from the Cohorts spec (Section 6) directly serves cold user needs:

| Layer | What It Shows | When Used |
|---|---|---|
| Global trending | What is popular with everyone | Fallback when no cohort available |
| Cohort-scoped trending | What is popular among users like this one | Primary feed for cold users with cohort data |
| Personal trending | What is popular among this user's followed creators | After user has follows and 50+ signals |

For a cold user with cohort data, the feed is composed primarily of cohort-scoped trending, supplemented by exploration items. This is the "zero query" experience -- the first feed the user sees without having done anything.
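
One way to read the layer table is as a simple selection rule. The sketch below is an interpretation, not the specified behavior: the precedence order (personal first, then cohort, then global) and all names are assumptions.

```rust
#[derive(Debug, PartialEq, Eq)]
enum TrendingLayer {
    Global,
    CohortScoped,
    Personal,
}

/// Pick the primary trending layer for a user.
/// Precedence is an assumption made for this sketch.
fn primary_trending_layer(
    has_cohort: bool,
    follow_count: u32,
    signal_count: u64,
) -> TrendingLayer {
    const PERSONAL_MIN_SIGNALS: u64 = 50; // "follows and 50+ signals"

    if follow_count > 0 && signal_count >= PERSONAL_MIN_SIGNALS {
        TrendingLayer::Personal
    } else if has_cohort {
        TrendingLayer::CohortScoped
    } else {
        TrendingLayer::Global
    }
}
```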


8. Graduation Metrics

8.1 Standard Graduation

An item graduates from cold start when its all_time_count for the primary signal (view) reaches the graduation_threshold (default: 100). At graduation:

  1. exploration_weight drops to 0.0
  2. The item exits the exploration pool
  3. The item competes in the normal ranking pipeline on signals alone
  4. The blended scoring formula produces score = signal_score (no proxy component)

Graduation is detected at query time via an O(1) atomic counter read. There is no explicit "graduation event" -- the item simply stops qualifying for exploration on its next query.
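
A minimal sketch of query-time phase detection from a single counter read follows. The `Phase` enum mirrors the glossary terms; the function name and the use of relaxed ordering are assumptions for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Debug, PartialEq, Eq)]
enum Phase {
    ColdStart,    // zero signals
    Accumulating, // some signals, below the graduation threshold
    Graduated,    // at or above the graduation threshold
}

/// Derive the cold start phase from the item's all-time view counter.
/// No state is stored: the phase is recomputed on every query.
fn item_phase(all_time_view_count: &AtomicU64, graduation_threshold: u64) -> Phase {
    // Relaxed is assumed to be sufficient here because the counter is
    // only read, never used to synchronize other memory.
    match all_time_view_count.load(Ordering::Relaxed) {
        0 => Phase::ColdStart,
        n if n < graduation_threshold => Phase::Accumulating,
        _ => Phase::Graduated,
    }
}
```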

8.2 Dynamic Graduation for Viral Items

Items that accumulate signals at an exceptional rate should graduate early. Keeping a viral item in the exploration pool is wasteful -- it has proven quality and does not need exploration slots.

dynamic_threshold = min(
    graduation_threshold,
    max(10, engagement_velocity / baseline_velocity * 10)
)

Where:
    engagement_velocity = view.velocity(1h) for this item
    baseline_velocity   = median view velocity (1h) across all items
                          in the same category with GRADUATED status

When signal_count >= dynamic_threshold, the item graduates immediately.

Example: Category baseline velocity is 50 views/hour. An item receives 500 views in its first hour (10x baseline). Dynamic threshold = min(100, max(10, 500/50 * 10)) = min(100, 100) = 100. In this case, no early graduation because the dynamic threshold equals the standard threshold.

But if the item receives 2,000 views/hour (40x baseline): min(100, max(10, 2000/50 * 10)) = min(100, 400) = 100. The dynamic threshold is capped by graduation_threshold.

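Even at 5,000 views/hour (100x baseline): min(100, max(10, 5000/50 * 10)) = min(100, 1000) = 100. The cap at graduation_threshold means this formula never lowers the threshold for fast-moving items; it only drops below 100 when velocity is under 10x baseline, which is the opposite of the intent.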
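
The examples above can be reproduced by evaluating the formula directly. This sketch implements the formula exactly as written; the function name and the truncating conversion to an integer count are assumptions.

```rust
/// dynamic_threshold = min(graduation_threshold,
///                         max(10, engagement_velocity / baseline_velocity * 10))
/// `baseline_velocity` is expected to be positive.
fn dynamic_threshold(
    graduation_threshold: u64,
    engagement_velocity: f64,
    baseline_velocity: f64,
) -> u64 {
    let scaled = (engagement_velocity / baseline_velocity * 10.0).max(10.0);
    // Float-to-integer casts saturate in Rust, so very large ratios are safe.
    (scaled as u64).min(graduation_threshold)
}
```

With a baseline of 50 views/hour, velocities of 500, 2,000, and 5,000 all yield 100. A slower item at 150 views/hour (3x baseline) yields 30, so under this formula the threshold drops only for items below 10x baseline.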

Revised formula for early graduation -- breakout detection:

The more useful form is detecting items that should graduate before reaching 100 signals:

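fn check_breakout(
    item: EntityId,
    signal_ledger: &HotSignalState,
    category: &str,
    category_baselines: &CategoryBaselines,
    breakout_multiplier: f64,  // default: 3.0
) -> bool {
    // 1-hour view velocity for this item (lookup assumed to be keyed by item).
    let item_velocity = signal_ledger.velocity(item, "view", &Window::hours(1));
    // Fall back to 10 views/hour when the category has no baseline yet.
    let category_baseline = category_baselines.get(category)
        .map(|b| b.avg_velocity_1h)
        .unwrap_or(10.0);

    // Strict comparison: exactly 3x baseline is not a breakout.
    item_velocity > category_baseline * breakout_multiplier
}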

When breakout is detected:

  1. Item's exploration_weight is forced to 0.0
  2. Item is removed from the exploration pool
  3. Item competes in the normal ranking pipeline
  4. The signal changelog records the breakout event for analytics

Default breakout_multiplier: 3.0. An item with 3x the category's average view velocity in its first hour is a breakout.

8.3 Graduation Curve

exploration_weight
    1.0 ┌─────────────────────────────────────────────────┐
        │ ■                                               │
        │  ■                                              │
    0.8 │   ■                                             │
        │    ■■                                           │
        │      ■■                                         │
    0.6 │        ■■                                       │
        │          ■■                                     │
        │            ■■                                   │
    0.4 │              ■■                                 │
        │                ■■                               │
        │                  ■■                             │
    0.2 │                    ■■                           │
        │                      ■■                         │
        │                        ■■                       │
    0.0 └─────────────────────────■■■■■■■■■■■■■■■■■■■■──┘
        0     20     40     60     80    100    120    140
                            signal_count

    Linear decay: exploration_weight = max(0, 1 - signal_count / 100)

    At 0 signals:   exploration_weight = 1.00  (full proxy score)
    At 25 signals:  exploration_weight = 0.75
    At 50 signals:  exploration_weight = 0.50  (equal blend)
    At 75 signals:  exploration_weight = 0.25
    At 100 signals: exploration_weight = 0.00  (graduated)
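
The linear decay and the resulting blend can be sketched in a few lines. The blend formula is the one stated in INV-CS-7; the function names and the guard for a zero threshold are assumptions.

```rust
/// exploration_weight = max(0, 1 - signal_count / graduation_threshold)
fn exploration_weight(signal_count: u64, graduation_threshold: u64) -> f64 {
    if graduation_threshold == 0 {
        return 0.0; // assumed: a zero threshold means "always graduated"
    }
    (1.0 - signal_count as f64 / graduation_threshold as f64).max(0.0)
}

/// effective_score = ew * proxy + (1 - ew) * signal
fn blended_score(
    proxy_score: f64,
    signal_score: f64,
    signal_count: u64,
    graduation_threshold: u64,
) -> f64 {
    let ew = exploration_weight(signal_count, graduation_threshold);
    ew * proxy_score + (1.0 - ew) * signal_score
}
```

At 50 of 100 signals, an item with proxy score 0.8 and signal score 0.2 scores 0.5 * 0.8 + 0.5 * 0.2 = 0.5. Past the threshold the weight stays clamped at 0.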

8.4 User Graduation

Users graduate from cold start when their signal count reaches user_graduation_threshold (default: 50). At graduation:

  1. Elevated exploration boost decays to zero
  2. Cohort-to-personal transition completes (personal_weight = 1.0)
  3. The user becomes eligible for cohort centroid computation once they reach 100+ signals (the centroid bar is higher than the graduation threshold)
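
The decay of the elevated exploration boost can be sketched as below, using the same linear form as the P11 property test in Section 14. The function name and the zero-threshold guard are assumptions.

```rust
/// Effective exploration budget for a user:
/// profile budget plus a boost that decays linearly to zero at graduation.
fn effective_user_exploration(
    profile_exploration: f64,       // e.g. 0.10
    new_user_exploration_boost: f64, // documented default: 0.20
    user_signal_count: u64,
    user_graduation_threshold: u64, // documented default: 50
) -> f64 {
    let remaining = if user_graduation_threshold == 0 {
        0.0 // assumed: no boost when the threshold is disabled
    } else {
        (1.0 - user_signal_count as f64 / user_graduation_threshold as f64).max(0.0)
    };
    profile_exploration + new_user_exploration_boost * remaining
}
```

With the defaults, a brand new user sees 30% exploration, a user with 25 signals sees 20%, and a user at 50 signals is back to the profile's 10%.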

8.5 Creator Graduation

Creators graduate from provisional status when they have at least creator_maturity_threshold (default: 5) graduated items. At graduation:

  1. Creator signal confidence reaches 1.0
  2. Creator quality score is no longer diluted with category baseline
  3. Discovery multiplier no longer applies (but items still get standard exploration)

9. Cold Start Across Surfaces

Cold start behavior differs by surface because each surface has different signal requirements and different tolerance for unproven content.

Surface Eligibility Matrix

| Surface | UC | Cold Item Eligible | Cold Item Strategy | Cold User Strategy |
|---|---|---|---|---|
| For You | UC-01 | Yes (exploration budget) | Proxy-scored exploration injection | Cohort-trending + elevated exploration |
| Search | UC-02 | Yes (no engagement gate) | Ranked by text/semantic relevance | Reduced personalization boost |
| Trending | UC-03 | No (velocity required) | Excluded until 1h age + velocity | Global or cohort-scoped trending |
| Following | UC-04 | Yes (if user follows creator) | Chronological (no signal dependency) | N/A (requires follows) |
| Related | UC-05 | Yes (embedding available) | Ranked by semantic similarity | Anchor-based (user-independent) |
| Browse (new sort) | UC-06 | Yes | Chronological | N/A (not personalized) |
| Browse (hot/top sort) | UC-06 | Yes (proxy estimate) | Ranked by proxy score | N/A (not personalized) |
| Notifications | UC-07 | N/A | N/A | N/A |
| Creator Profile | UC-08 | Yes | Chronological or by popularity | N/A (creator-scoped) |
| User Library | UC-09 | N/A | N/A | Empty until engagement |
| People Search | UC-10 | Yes | Ranked by text relevance | N/A |
| Visual Search | UC-11 | Yes | Ranked by visual similarity | N/A |
| Live Content | UC-12 | Yes | Ranked by viewer count | N/A |
| Hidden Gems | UC-13 | Partial (50+ signals required) | Excluded below minimum | N/A (not personalized) |
| Controversial | UC-14 | No (dual-signal required) | Excluded until sufficient signals | N/A |

Surface-Specific Details

Search (UC-02). New items are eligible for search results if their text relevance (BM25) or semantic similarity is high. There is no engagement gate for search -- withholding relevant results because the item has no signals would be incorrect for an intent-driven surface. However, the exploration budget for search is reduced to 0.05 (5%) because search users have explicit intent and should see primarily relevance-ranked results. Cold items in search must pass a relevance gate: bm25_score > 0.3 OR semantic_similarity > 0.5.

Trending (UC-03). New items are excluded from trending surfaces. Trending requires velocity signals, which require time to accumulate. Minimum age for trending eligibility: 1 hour (configurable). This prevents artificial trending from coordinated burst engagement on a new item.

Hidden Gems (UC-13). Hidden Gems explicitly favors items with high quality signals and low reach. Items in Cold Start phase are natural candidates for "low reach" -- but they must show quality signals. Minimum requirement: 50 signals with completion_rate > 0.6 and like_ratio > 0.8. An item with zero completions is not a hidden gem; it is just unseen.

Following (UC-04). New items from followed creators appear immediately in the Following feed, sorted chronologically. No cold start mechanism is needed -- the user explicitly chose to follow this creator.
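
The per-surface gates described above reduce to a few predicates. The sketch below encodes the documented thresholds; the `ColdItem` struct and function names are hypothetical, and the trending check covers only the minimum-age part of eligibility (the velocity requirement is not modeled).

```rust
use std::time::Duration;

/// Minimal view of an item for gate checks (illustrative field set).
struct ColdItem {
    age: Duration,
    signal_count: u64,
    completion_rate: f64,
    like_ratio: f64,
    bm25_score: f64,
    semantic_similarity: f64,
}

/// Search: relevance gate, no engagement gate.
fn search_eligible(item: &ColdItem) -> bool {
    item.bm25_score > 0.3 || item.semantic_similarity > 0.5
}

/// Trending: minimum age of 1 hour (velocity requirement not shown).
fn trending_age_eligible(item: &ColdItem) -> bool {
    item.age >= Duration::from_secs(60 * 60)
}

/// Hidden Gems: at least 50 signals with strong quality ratios.
fn hidden_gems_eligible(item: &ColdItem) -> bool {
    item.signal_count >= 50 && item.completion_rate > 0.6 && item.like_ratio > 0.8
}
```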


10. Edge Cases

Edge Case Handling Table

| Edge Case | Behavior | Rationale |
|---|---|---|
| Item with no embedding | Excluded from ANN-based exploration. Eligible for scan-based surfaces (browse, trending) once signals accumulate. Proxy score computed without embedding_similarity and embedding_novelty components (remaining weights renormalized). | Embedding is required for personalized ranking. Items without embeddings cannot participate in ANN retrieval. |
| Creator with no items | No impact on cold start. Creator embedding is zero vector until first item published. | Creator cold start only matters when they have items to rank. |
| User signs up and immediately leaves | Preference vector remains at initial centroid. No signals written. User contributes nothing to cohort centroids (below 100 signal threshold). | No resources wasted. The system does not eagerly compute anything for users who never engage. |
| All items in exploration pool are from same creator | Diversity enforcement applies to exploration items. Maximum exploration_max_per_creator (default: 1) items from the same creator in exploration slots. Remaining slots filled by next-best creators. | Prevents a single prolific creator from dominating exploration. |
| Exploration pool is empty | No exploration items injected. All result slots filled by ranked items. This is expected for mature platforms during low-publishing periods. | The system degrades gracefully -- no exploration is better than no results. |
| User blocks a creator whose items are in exploration | Blocked creator's items are excluded from exploration results, same as ranked results. INV-FL-2 (blocked creator exclusion) applies uniformly. | Block is a hard filter. No exceptions, including exploration. |
| Item receives only negative signals | Signals count toward graduation threshold. Item with 100 negative signals graduates with a very low signal score. It drops out of contention naturally. | Negative signals are data. They accumulate the same as positive signals for graduation purposes. |
| Returning user after long absence | If a user has been dormant for 30+ days (no signals), apply a temporary learning rate multiplier on their next signals. lr_multiplier = min(2.0, 1.0 + (days_since_last_signal - 30) / 30). This allows the preference vector to readapt to potentially shifted interests without reverting to cold-start behavior. | A user who was active 3 months ago should not be treated as a new user, but their preferences may have drifted. The boost is temporary (decays after 20 signals) and bounded (max 2x). |
| Burst of items from same creator | Each item independently enters the exploration pool. Creator discovery boost applies per-item. Combined with diversity enforcement (exploration_max_per_creator: 1), at most 1 exploration slot per query goes to a single creator. | Prevents a creator from flooding the exploration pool by publishing many items at once. |
| Cold item in cold category | If the category has no baseline (fewer than 50 items with 100+ views), the category_baseline_score defaults to 0.5 (neutral). The embedding_similarity_score has no quality centroid to compare against, so it also defaults to 0.5. | New categories start neutral. The system does not penalize or reward items in unknown categories. |
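
The per-creator diversity cap that appears in several rows above can be sketched as a single pass over the pre-sorted pool. The `Candidate` struct and function name are hypothetical; the pool is assumed to be sorted by proxy score in descending order.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    item_id: u64,
    creator_id: u64,
    proxy_score: f64,
}

/// Fill up to `slots` exploration slots from a pool sorted by proxy score
/// (descending), taking at most `max_per_creator` items per creator.
/// An empty pool simply yields no exploration items.
fn select_exploration(
    pool_sorted_desc: &[Candidate],
    slots: usize,
    max_per_creator: u32, // documented default: 1
) -> Vec<Candidate> {
    let mut used_per_creator: HashMap<u64, u32> = HashMap::new();
    let mut picked = Vec::with_capacity(slots);

    for candidate in pool_sorted_desc {
        if picked.len() == slots {
            break;
        }
        let used = used_per_creator.entry(candidate.creator_id).or_insert(0);
        if *used < max_per_creator {
            *used += 1;
            picked.push(candidate.clone());
        }
    }
    picked
}
```

If one creator publishes a burst of items that dominate the top of the pool, only the best one is picked and the remaining slots go to other creators. When every pooled item belongs to that creator, the result is shorter than `slots` and ranked items fill the gap.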

Returning User Absence Boost

When a previously active user returns after extended absence (30+ days since last signal), their preferences may have drifted. Rather than treating them as a cold user (which would discard their history), the system temporarily increases their learning rate:

fn absence_boost_lr(
    base_lr: f64,
    days_since_last_signal: u64,
    signals_since_return: u64,
) -> f64 {
    if days_since_last_signal < 30 || signals_since_return > 20 {
        return base_lr;  // no boost needed
    }

    // Linear multiplier from 1.0 (at 30 days) to 2.0 (at 60+ days)
    let multiplier = (1.0 + ((days_since_last_signal as f64 - 30.0) / 30.0))
        .min(2.0);

    // Decay the boost over the first 20 signals after return
    let decay = 1.0 - (signals_since_return as f64 / 20.0);
    let effective_multiplier = 1.0 + (multiplier - 1.0) * decay.max(0.0);

    base_lr * effective_multiplier
}

Constraints:

  • Minimum absence for boost: 30 days
  • Maximum learning rate multiplier: 2.0x
  • Boost decays linearly over first 20 signals after return
  • Does not revert to cold-start exploration budget or cohort priors

11. Configuration Reference

Item Cold Start Configuration

pub struct ItemColdStartConfig {
    /// Signal count at which item graduates to signal-based ranking.
    /// Default: 100.
    pub graduation_threshold: u64,

    /// Exploration eligibility window after item creation.
    /// Default: 48 hours.
    pub exploration_window: Duration,

    /// Minimum proxy score for exploration pool eligibility.
    /// Default: 0.2.
    pub min_quality_floor: f64,

    /// Earliest result position for exploration items.
    /// Default: 3.
    pub min_exploration_position: usize,

    /// Minimum spacing between exploration items in results.
    /// Default: 3.
    pub min_exploration_spacing: usize,

    /// Breakout velocity multiplier over category baseline.
    /// Default: 3.0.
    pub breakout_multiplier: f64,

    /// Maximum items from same creator in exploration slots per query.
    /// Default: 1.
    pub exploration_max_per_creator: u32,

    /// Exploration pool refresh interval (background materializer).
    /// Default: 5 minutes.
    pub pool_refresh_interval: Duration,

    /// Maximum items in the exploration pool.
    /// Default: 50,000.
    pub max_pool_size: usize,
}
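
The documented defaults could be captured in a `Default` implementation, sketched below. The struct is repeated so the example is self-contained; whether the real type implements `Default` this way is an assumption.

```rust
use std::time::Duration;

pub struct ItemColdStartConfig {
    pub graduation_threshold: u64,
    pub exploration_window: Duration,
    pub min_quality_floor: f64,
    pub min_exploration_position: usize,
    pub min_exploration_spacing: usize,
    pub breakout_multiplier: f64,
    pub exploration_max_per_creator: u32,
    pub pool_refresh_interval: Duration,
    pub max_pool_size: usize,
}

impl Default for ItemColdStartConfig {
    /// Values taken from the "Default:" doc comments on each field.
    fn default() -> Self {
        Self {
            graduation_threshold: 100,
            exploration_window: Duration::from_secs(48 * 60 * 60), // 48 hours
            min_quality_floor: 0.2,
            min_exploration_position: 3,
            min_exploration_spacing: 3,
            breakout_multiplier: 3.0,
            exploration_max_per_creator: 1,
            pool_refresh_interval: Duration::from_secs(5 * 60), // 5 minutes
            max_pool_size: 50_000,
        }
    }
}
```

Callers could then override a single field with struct update syntax, for example `ItemColdStartConfig { graduation_threshold: 200, ..Default::default() }`.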

User Cold Start Configuration

pub struct UserColdStartConfig {
    /// Additional exploration budget for cold users (added to profile default).
    /// Default: 0.20 (so a profile with 0.10 becomes 0.30 for cold users).
    pub new_user_exploration_boost: f64,

    /// Signal count at which user exploration boost decays to zero.
    /// Default: 50.
    pub user_graduation_threshold: u64,

    /// Signal count at which cohort-to-personal transition completes.
    /// Default: 50.
    pub cohort_blend_threshold: u64,

    /// Minimum cohort size (graduated users) for cohort centroid to be used.
    /// Default: 50.
    pub min_cohort_size_for_centroid: u64,

    /// Minimum signals per user for cohort centroid contribution.
    /// Default: 100.
    pub min_signals_for_centroid: u64,

    /// Minimum absence days before returning user boost applies.
    /// Default: 30.
    pub absence_boost_threshold_days: u64,

    /// Maximum learning rate multiplier for returning users.
    /// Default: 2.0.
    pub absence_boost_max_multiplier: f64,

    /// Signals after return over which absence boost decays.
    /// Default: 20.
    pub absence_boost_decay_signals: u64,
}

Creator Cold Start Configuration

pub struct CreatorColdStartConfig {
    /// Maximum item count for a creator to qualify as "new."
    /// Default: 5.
    pub new_creator_item_threshold: u32,

    /// Maximum follower count for a creator to qualify as "new."
    /// Default: 100.
    pub new_creator_follower_threshold: u32,

    /// Exploration budget multiplier for new creator items.
    /// Default: 1.5.
    pub discovery_multiplier: f64,

    /// Exploration budget multiplier for a creator's very first item.
    /// Stacks with discovery_multiplier.
    /// Default: 2.0.
    pub first_item_multiplier: f64,

    /// Minimum graduated items before creator signals reach full confidence.
    /// Default: 5.
    pub creator_maturity_threshold: u32,

    /// Signal weight multiplier for provisional creators.
    /// Default: 0.5.
    pub provisional_signal_weight: f64,

    /// Number of similar creators to compare against for quality prior.
    /// Default: 20.
    pub similar_creator_count: usize,
}

Configuration Defaults Summary

| Parameter | Default | Range | Rationale |
|---|---|---|---|
| graduation_threshold | 100 | 10-1000 | 100 signals provide statistically meaningful engagement data |
| exploration (per profile) | 0.10 | 0.0-0.50 | 10% discovery, 90% ranked. Balances quality and freshness |
| exploration_window | 48h | 1h-168h | 48h gives items a weekend cycle |
| min_quality_floor | 0.2 | 0.0-0.5 | Prevents obviously low-quality content from consuming exploration budget |
| min_exploration_position | 3 | 1-10 | Top 2 positions are earned, not given to unproven content |
| breakout_multiplier | 3.0 | 1.5-10.0 | 3x category baseline is clearly exceptional, not noise |
| new_user_exploration_boost | 0.20 | 0.0-0.40 | 30% total exploration for new users (0.10 + 0.20) |
| user_graduation_threshold | 50 | 10-200 | 50 signals = meaningful preference vector divergence |
| cohort_blend_threshold | 50 | 10-200 | 50 signals = sufficient for ANN retrieval to be useful |
| min_cohort_size_for_centroid | 50 | 10-500 | Below 50 graduated users, centroid is unreliable |
| new_creator_item_threshold | 5 | 1-20 | Creators with < 5 items have insufficient track record |
| discovery_multiplier | 1.5 | 1.0-3.0 | 50% boost for new creator items |
| first_item_multiplier | 2.0 | 1.0-5.0 | Every creator deserves one strong chance |
| creator_maturity_threshold | 5 | 1-20 | 5 graduated items = reliable creator quality signal |
| provisional_signal_weight | 0.5 | 0.1-1.0 | Half-weight creator signals until maturity |
| absence_boost_threshold_days | 30 | 7-90 | 30 days is meaningfully absent |
| absence_boost_max_multiplier | 2.0 | 1.0-5.0 | Double learning rate at most |

12. Performance Considerations

Cold start should not slow queries. The mechanisms described here must operate within the existing query latency budget (< 50ms end-to-end for RETRIEVE queries).

Performance Budget

| Operation | Budget | Mechanism |
|---|---|---|
| Cold start phase detection | < 100 ns | O(1) atomic counter read from hot tier |
| Exploration weight computation | < 10 ns | One subtraction + division + max |
| Proxy score lookup (per item) | < 100 ns | Pre-computed, stored in entity store |
| Proxy score computation (at ingestion) | < 5 us | Four lookups + weighted sum + two ANN lookups |
| Exploration pool selection | < 2 ms | Pre-sorted pool, take top N |
| Exploration position calculation | < 100 ns | Arithmetic on limit + count |
| Cohort centroid lookup | < 100 ns | Cached in memory |
| Interleaving | < 500 ns | Array merge at calculated positions |
| User exploration rate computation | < 10 ns | One subtraction + max |
| Breakout detection (per item) | < 200 ns | One velocity read + comparison |
| Absence boost computation | < 50 ns | Timestamp comparison + multiplication |

Total Cold Start Overhead per Query

| Query Type | Without Cold Start | With Cold Start | Overhead |
|---|---|---|---|
| RETRIEVE for_you (established user) | ~40 ms | ~42 ms | +2 ms (exploration pool selection) |
| RETRIEVE for_you (cold user) | N/A | ~45 ms | Cohort trending + elevated exploration |
| SEARCH | ~30 ms | ~30 ms | Negligible (no exploration pool for search) |
| RETRIEVE trending | ~20 ms | ~20 ms | Cold items excluded (no overhead) |

Memory Budget

| Component | Size | Notes |
|---|---|---|
| Exploration pool (50K items * 50 bytes) | 2.5 MB | Entity ID + proxy score + created_at |
| Category baselines (1000 categories * 64 bytes) | 64 KB | Median velocity, avg quality |
| Category quality centroids (1000 * 1536 * 2 bytes) | 3 MB | f16 embeddings |
| Population centroid (1 * 1536 * 4 bytes) | 6 KB | f32 for precision |
| Cohort centroids (100 cohorts * 1536 * 4 bytes) | 600 KB | f32 |
| Cold start state per item | 0 bytes | Uses existing all_time_count atomic counters |
| Total | ~6.2 MB | Negligible vs. hot tier budget |
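
The memory total can be re-derived from the per-component figures. This is plain arithmetic over the numbers in the table; the function name is invented for the example.

```rust
/// Sum of the cold start memory components, in bytes.
fn cold_start_memory_bytes() -> u64 {
    let exploration_pool = 50_000 * 50;          // 50K items * 50 bytes
    let category_baselines = 1_000 * 64;         // 1000 categories * 64 bytes
    let category_centroids = 1_000 * 1_536 * 2;  // f16 embeddings
    let population_centroid = 1_536 * 4;         // one f32 embedding
    let cohort_centroids = 100 * 1_536 * 4;      // 100 cohorts, f32

    exploration_pool
        + category_baselines
        + category_centroids
        + population_centroid
        + cohort_centroids
}
```

The sum is 6,256,544 bytes, roughly 6.26 MB in decimal units, which is in line with the rounded total in the table.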

Background Computation Schedule

| Computation | Frequency | Cost | Trigger |
|---|---|---|---|
| Exploration pool refresh | Every 5 min | ~100 ms (scan cold items, sort) | Timer |
| Category baselines | Every 1 hour | ~2 sec (scan items per category) | Materializer hourly cycle |
| Category quality centroids | Every 24 hours | ~30 sec (compute weighted means) | Materializer daily cycle |
| Population centroid | Every 24 hours | ~5 sec (mean of user preference vectors) | Materializer daily cycle |
| Cohort centroids | Every 24 hours | ~10 sec (mean per cohort) | Materializer daily cycle |

13. Invariants and Correctness Guarantees

Cold Start Invariants

INV-CS-1: No Permanent Cold State. Every item either graduates through signal accumulation or exits the exploration pool through window expiration. No item remains in the exploration pool indefinitely.

Formally: For any item I, either:

  • signal_count(I, t) >= graduation_threshold for some t < created_at(I) + exploration_window
  • t > created_at(I) + exploration_window and I is no longer exploration-eligible

INV-CS-2: Exploration Budget Bound. The number of exploration items in any result set never exceeds ceil(limit * budget). The budget is a hard cap, not a target.

Formally: For any query Q with limit = L and effective exploration budget B:

|exploration_items(results(Q))| <= ceil(L * B)
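
A minimal sketch of the cap in INV-CS-2 follows. The function names are hypothetical (the second mirrors the `compute_exploration_count` call used in property test P1), and treating the available cold items as a second upper bound is an assumption.

```rust
/// Hard cap on exploration items: ceil(limit * budget).
fn max_exploration_items(limit: usize, budget: f64) -> usize {
    (limit as f64 * budget).ceil() as usize
}

/// Number of exploration items actually injected: never more than the cap,
/// and never more than the eligible cold items available.
fn exploration_count(limit: usize, budget: f64, available_cold_items: usize) -> usize {
    max_exploration_items(limit, budget).min(available_cold_items)
}
```

For a page of 20 results at a 10% budget the cap is 2; for 25 results it is ceil(2.5) = 3. Because the budget is a cap rather than a target, an empty pool yields zero exploration items.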

INV-CS-3: Quality Floor for Exploration. No item with proxy_score < min_quality_floor (default: 0.2) appears as an exploration item.

INV-CS-4: Blocked/Hidden Exclusion in Exploration. Exploration items respect all user exclusions. A hidden item is never injected as an exploration item. A blocked creator's items are never injected as exploration items.

Formally: INV-FL-1 (hidden items never reappear) and INV-FL-2 (blocked creator exclusion) hold for exploration items identically to ranked items.

INV-CS-5: Exploration Position Bound. No exploration item appears at position 1 or 2 in the result set. The minimum position is min_exploration_position (default: 3).

INV-CS-6: Graduation Monotonicity. Once an item's signal_count >= graduation_threshold, it never reverts to cold state. Graduation is a one-way transition. Signal counts are monotonically increasing (signals are append-only).

Formally: If signal_count(I, t) >= graduation_threshold, then for all t' > t:

signal_count(I, t') >= graduation_threshold

INV-CS-7: Linear Blend Correctness. The blended score at any point matches the analytical formula:

|effective_score - (ew * proxy + (1-ew) * signal)| < f64::EPSILON
where ew = max(0, 1 - signal_count / graduation_threshold)

INV-CS-8: Cohort Prior Freshness. A cold user's cohort centroid is at most 24 hours old (background materializer daily cycle). The population centroid is at most 24 hours old.

Interaction with Other Invariants

| Invariant | Interaction |
|---|---|
| INV-FL-1 (hidden items never reappear) | Exploration items are filtered through the same exclusion bitmap as ranked items |
| INV-FL-2 (blocked creator exclusion) | Exploration items are filtered through the same blocked set as ranked items |
| INV-SIG-1 (no signal loss) | Signal loss would prevent graduation, keeping items cold longer than necessary. WAL durability prevents this. |
| INV-COH-7 (minimum population threshold) | Cohort priors are only used when the cohort meets the minimum population threshold. Below threshold, fall back to population centroid. |

14. Property Tests

// P1: Exploration budget never exceeds declared limit.
proptest! {
    #[test]
    fn exploration_budget_bounded(
        limit in 10usize..200,
        budget in 0.01f64..0.50,
        cold_item_count in 0usize..1000,
    ) {
        let max_exploration = (limit as f64 * budget).ceil() as usize;
        let actual = compute_exploration_count(limit, budget, cold_item_count);
        prop_assert!(actual <= max_exploration,
            "exploration count {} exceeds max {} (limit={}, budget={})",
            actual, max_exploration, limit, budget);
    }
}

// P2: Exploration weight is monotonically decreasing with signal count.
proptest! {
    #[test]
    fn exploration_weight_monotonic(
        signals_a in 0u64..10000,
        signals_b in 0u64..10000,
        threshold in 10u64..1000,
    ) {
        let weight_a = (1.0 - signals_a as f64 / threshold as f64).max(0.0);
        let weight_b = (1.0 - signals_b as f64 / threshold as f64).max(0.0);
        if signals_a <= signals_b {
            prop_assert!(weight_a >= weight_b - f64::EPSILON,
                "exploration weight not monotonic: f({})={} < f({})={}",
                signals_a, weight_a, signals_b, weight_b);
        }
    }
}

// P3: Exploration weight is exactly 0 at graduation threshold.
proptest! {
    #[test]
    fn exploration_weight_zero_at_graduation(
        threshold in 10u64..1000,
    ) {
        let weight = (1.0 - threshold as f64 / threshold as f64).max(0.0);
        prop_assert!((weight - 0.0).abs() < f64::EPSILON,
            "exploration weight at threshold = {}, expected 0.0", weight);
    }
}

// P4: Exploration weight is exactly 1.0 at zero signals.
proptest! {
    #[test]
    fn exploration_weight_one_at_zero(
        threshold in 10u64..1000,
    ) {
        let weight = (1.0 - 0.0f64 / threshold as f64).max(0.0);
        prop_assert!((weight - 1.0).abs() < f64::EPSILON,
            "exploration weight at 0 signals = {}, expected 1.0", weight);
    }
}

// P5: Proxy score is bounded [0, 1].
proptest! {
    #[test]
    fn proxy_score_bounded(
        creator_quality in 0.0f64..1.0,
        category_baseline in 0.0f64..1.0,
        metadata_complete in 0.0f64..1.0,
        embedding_novelty in 0.0f64..1.0,
        embedding_sim in -1.0f64..1.0,
        freshness in 0.0f64..1.0,
    ) {
        let score = proxy_score(
            creator_quality, category_baseline,
            metadata_complete, embedding_novelty,
            embedding_sim, freshness,
        );
        prop_assert!(score >= 0.0 && score <= 1.0,
            "proxy score {} out of bounds [0, 1]", score);
    }
}

// P6: Blended score equals proxy score at zero signals.
proptest! {
    #[test]
    fn blended_score_equals_proxy_at_zero(
        proxy in 0.0f64..1.0,
        signal_score in 0.0f64..1.0,
        threshold in 10u64..1000,
    ) {
        let ew = (1.0 - 0.0f64 / threshold as f64).max(0.0);
        let blended = ew * proxy + (1.0 - ew) * signal_score;
        prop_assert!((blended - proxy).abs() < f64::EPSILON,
            "blended score {} != proxy {} at 0 signals", blended, proxy);
    }
}

// P7: Blended score equals signal score at graduation.
proptest! {
    #[test]
    fn blended_score_equals_signal_at_graduation(
        proxy in 0.0f64..1.0,
        signal_score in 0.0f64..1.0,
        threshold in 10u64..1000,
    ) {
        let ew = (1.0 - threshold as f64 / threshold as f64).max(0.0);
        let blended = ew * proxy + (1.0 - ew) * signal_score;
        prop_assert!((blended - signal_score).abs() < f64::EPSILON,
            "blended score {} != signal {} at graduation", blended, signal_score);
    }
}

// P8: Hidden items never appear in exploration results.
proptest! {
    #[test]
    fn hidden_items_excluded_from_exploration(
        items in arb_items(100),
        hidden_indices in prop::collection::hash_set(0usize..100, 0..20),
    ) {
        let db = setup_test_db();
        let user = create_test_user(&db);

        for item in &items {
            db.write_item(item)?;
        }

        for &idx in &hidden_indices {
            db.signal(Signal { kind: "hide", item: items[idx].id, user, ..Default::default() })?;
        }

        let results = db.retrieve(Retrieve {
            for_user: Some(user),
            profile: "for_you",
            limit: 50,
            ..Default::default()
        })?;

        for &idx in &hidden_indices {
            prop_assert!(
                !results.results.iter().any(|r| r.id == items[idx].id),
                "Hidden item {} appeared in results (possibly as exploration)",
                items[idx].id
            );
        }
    }
}

// P9: Exploration items are never at position 1 or 2.
proptest! {
    #[test]
    fn exploration_positions_respect_minimum(
        limit in 10usize..200,
        exploration_count in 1usize..20,
        min_position in 2usize..10,
    ) {
        let exploration_count = exploration_count.min(limit / 3);
        if exploration_count == 0 { return Ok(()); }

        let positions = exploration_positions(limit, exploration_count, min_position);

        for &pos in &positions {
            prop_assert!(pos >= min_position.max(3),
                "exploration position {} below minimum {}", pos, min_position.max(3));
            prop_assert!(pos <= limit,
                "exploration position {} exceeds limit {}", pos, limit);
        }
    }
}

// P10: Exploration positions are evenly distributed (not clustered).
proptest! {
    #[test]
    fn exploration_positions_distributed(
        limit in 20usize..200,
        exploration_count in 2usize..20,
    ) {
        let exploration_count = exploration_count.min(limit / 4);
        if exploration_count < 2 { return Ok(()); }

        let positions = exploration_positions(limit, exploration_count, 3);

        // Verify minimum spacing between consecutive positions
        for window in positions.windows(2) {
            let gap = window[1].saturating_sub(window[0]);
            prop_assert!(gap >= 3,
                "exploration positions too close: {} and {} (gap={})",
                window[0], window[1], gap);
        }
    }
}

// P11: User exploration boost decays to profile default.
proptest! {
    #[test]
    fn user_exploration_decays_to_default(
        profile_exploration in 0.01f64..0.50,
        boost in 0.0f64..0.40,
        threshold in 10u64..200,
    ) {
        let effective = profile_exploration
            + boost * (1.0 - threshold as f64 / threshold as f64).max(0.0);
        prop_assert!((effective - profile_exploration).abs() < f64::EPSILON,
            "effective {} != profile {} at graduation threshold",
            effective, profile_exploration);
    }
}

// P12: Absence boost is bounded: it never lowers the learning rate and never
// exceeds 2x the base rate.
proptest! {
    #[test]
    fn absence_boost_bounded(
        base_lr in 0.001f64..0.1,
        days_absent in 0u64..365,
        signals_since in 0u64..100,
    ) {
        let boosted = absence_boost_lr(base_lr, days_absent, signals_since);
        prop_assert!(boosted >= base_lr - f64::EPSILON,
            "boosted lr {} below base {}", boosted, base_lr);
        prop_assert!(boosted <= base_lr * 2.0 + f64::EPSILON,
            "boosted lr {} exceeds 2x base {}", boosted, base_lr * 2.0);
    }
}
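
The formula exercised by P11 can also be written as a standalone function, which makes the decay visible at signal counts other than the threshold. The following is a minimal sketch, not the spec's implementation: the function name `effective_exploration` and the zero-threshold guard are assumptions introduced for illustration; only the formula itself is taken from the property test above.

```rust
/// Hypothetical helper (name and signature are assumptions, not tidalDB API).
/// Returns the effective exploration rate for a user with `signal_count`
/// signals: the profile rate plus a boost that decays linearly to zero
/// as the user approaches `threshold` signals.
pub fn effective_exploration(
    profile_exploration: f64,
    boost: f64,
    signal_count: u64,
    threshold: u64,
) -> f64 {
    // Assumption: a zero threshold means the user is treated as graduated,
    // which also avoids dividing by zero.
    if threshold == 0 {
        return profile_exploration;
    }
    // Fraction of the boost still in effect; clamped so that users past the
    // threshold never drop below the profile default.
    let progress = signal_count as f64 / threshold as f64;
    let remaining = (1.0 - progress).max(0.0);
    profile_exploration + boost * remaining
}
```

With a profile rate of 0.10, a boost of 0.20, and a threshold of 50, this sketch yields roughly 0.30 at 0 signals, 0.20 at 25 signals, and 0.10 at 50 signals or more.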

Appendix A: Glossary

| Term | Definition |
|------|------------|
| Cold Start | The phase where an entity has zero signals and cannot participate in signal-based ranking |
| Accumulating | The phase where an entity has some signals but below the graduation threshold; scoring is blended |
| Graduated | The phase where an entity has sufficient signals for purely signal-based ranking |
| Exploration Budget | The fraction of query result slots reserved for cold-start items, per ranking profile |
| Exploration Pool | The pre-sorted set of cold items eligible for exploration injection |
| Exploration Window | The duration after item creation during which items are exploration-eligible (default: 48h) |
| Exploration Weight | Linear function of signal count that controls the blend between proxy and signal scores |
| Proxy Score | Predicted item quality from creator history, category baselines, metadata, embeddings, and freshness |
| Graduation Threshold | The signal count at which exploration weight reaches 0 and the item competes on signals alone |
| Breakout Detection | Identifying items whose early signal velocity far exceeds the category baseline, triggering early graduation |
| Cohort Prior | Using cohort-level statistics (centroid embedding, trending content) as the initial state for a new user |
| Population Centroid | The mean preference vector of all users with 100+ signals, used as the ultimate fallback for cold users |
| Cohort Centroid | The mean preference vector of users in a specific cohort with 100+ signals |
| Creator Discovery Boost | Additional exploration budget allocated to items from new creators |
| First-Item Boost | Extra exploration budget for a creator's very first published item |
| Provisional Creator Signals | Creator-level signal data weighted at 50% until the creator has 5 graduated items |
| Absence Boost | Temporary learning rate multiplier for users returning after 30+ days of inactivity |
| Quality Floor | Minimum proxy score required for exploration eligibility (default: 0.2) |
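
The three lifecycle terms and the Exploration Weight entry fit together as shown in the sketch below. It is illustrative only: the type and function names (`Phase`, `phase`, `exploration_weight`, `blended_score`) are hypothetical, and the blend is assumed to be a convex combination with a weight of 1 at zero signals. The glossary only states that the weight is linear in signal count and reaches 0 at the graduation threshold.

```rust
/// Lifecycle phases as named in the glossary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    ColdStart,
    Accumulating,
    Graduated,
}

/// Classify an entity by signal count (hypothetical helper).
pub fn phase(signal_count: u64, graduation_threshold: u64) -> Phase {
    if signal_count >= graduation_threshold {
        Phase::Graduated
    } else if signal_count == 0 {
        Phase::ColdStart
    } else {
        Phase::Accumulating
    }
}

/// Linear exploration weight. Assumption: 1.0 at zero signals, falling to
/// 0.0 at the graduation threshold and staying there afterwards.
pub fn exploration_weight(signal_count: u64, graduation_threshold: u64) -> f64 {
    if graduation_threshold == 0 {
        return 0.0; // no accumulation phase: signals only
    }
    (1.0 - signal_count as f64 / graduation_threshold as f64).clamp(0.0, 1.0)
}

/// Blend proxy and signal scores. Assumption: a convex combination driven by
/// the exploration weight; the exact blend is not specified in the glossary.
pub fn blended_score(
    proxy_score: f64,
    signal_score: f64,
    signal_count: u64,
    graduation_threshold: u64,
) -> f64 {
    let w = exploration_weight(signal_count, graduation_threshold);
    w * proxy_score + (1.0 - w) * signal_score
}
```

Under these assumptions, an item with proxy score 0.6 and signal score 0.9 scores 0.6 at zero signals, about 0.75 halfway to a threshold of 100, and 0.9 once graduated.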

Appendix B: References

  1. VISION.md, Design Principles: "Cold start is handled by the database." (Architectural requirement)
  2. USE_CASES.md, UC-01: "minimum 10% exploration budget (creators the user does not follow)." (Product requirement)
  3. USE_CASES.md, UC-13: "Creator follower count -- small/new creators get priority." (Discovery equity requirement)
  4. API.md, ProfileDef: exploration: 0.10. (API surface)
  5. Feedback Loop Specification, Section 3: Preference Vector Management. (Cold start initialization, adaptive learning rate: lr_max=0.10, lr_min=0.01, decay_k=0.003)
  6. Cohort Specification, Section 6: Three-Layer Trending Model. (Cohort-scoped trending as cold user prior)
  7. Entity Model Specification: Cold Start State. (Entity lifecycle cold start definition, creator computed fields)
  8. Signal System Specification, Section 3: all_time_count atomic counters. (O(1) graduation tracking)
  9. Schema Specification, Section 8: Defaults and Population Priors. (Population centroid, exploration budget mechanics)
  10. Li, L., Chu, W., Langford, J., Schapire, R. "A Contextual-Bandit Approach to Personalized News Article Recommendation." WWW 2010. (Exploration-exploitation tradeoff in recommendation)
  11. Agarwal, D., Chen, B., Elango, P. "Explore/Exploit Schemes for Web Content Optimization." ICDM 2009. (Exploration budget allocation)