# Task 02: Personalized Profiles

## Context

**Milestone:** 3 -- Personalized Ranking
**Phase:** m3p3 -- Personalized Ranking Profiles
**Depends On:** Task 01 (`UserContext` with preference vector, interaction weights, follows), m2p3 (profile engine, `ProfileExecutor`, `RankingProfile`), m2p1 (vector index for ANN retrieval), m2p4 (diversity enforcement)
**Blocks:** Task 03 (Cold Start and Exploration needs `for_you` profile to inject exploration candidates)
**Complexity:** L

## Objective

Deliver four personalized ranking profiles: `for_you`, `following`, `related`, and `notification`. These profiles are registered as builtins in the `ProfileRegistry` alongside the existing M2 profiles (trending, hot, new, etc.). Each profile uses the `UserContext` from Task 01 to personalize candidate retrieval and scoring.

The `ProfileExecutor` is extended with a `score_with_context` method that accepts an optional `UserContext`. When user context is available, the executor applies personalization factors: preference match (cosine similarity between user and item embeddings), creator affinity (interaction weight boost), and social proof (engagement from followed creators).

These four profiles cover UC-01 (For You Feed), UC-04 (Following Feed), UC-05 (Related/Up Next), and UC-07 (Notifications).

## Requirements

### for_you Profile

- Candidate strategy: ANN retrieval using user preference vector (top 200 candidates)
- Scoring formula: `preference_match * 0.4 + engagement_velocity * 0.3 + recency * 0.2 + creator_affinity * 0.1`
  - `preference_match`: cosine similarity between user preference vector and item embedding
  - `engagement_velocity`: normalized view + share velocity from signal ledger
  - `recency`: exponential decay from item age (half-life 48h)
  - `creator_affinity`: interaction weight between user and item's creator, normalized to [0, 1]
- Gate: `completion_rate > 0.02` (filters out very low-quality items)
- Diversity: max_per_creator from query, format_mix 0.6
- Exploration: 10% budget (injected by Task 03)

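As a worked illustration of the gate and the weighted sum above, the following is a minimal sketch. The function names `recency` and `gated_for_you` and the plain `f64` inputs are hypothetical and chosen only for this example; they are not the task's API, and all inputs are assumed to be pre-normalized to [0, 1].

```rust
/// Half-life of the recency decay, in hours (from the requirement above).
const HALF_LIFE_HOURS: f64 = 48.0;
/// Quality gate threshold on completion rate (from the requirement above).
const MIN_COMPLETION_RATE: f64 = 0.02;

/// Exponential decay: 1.0 at age 0, 0.5 after one half-life.
/// Negative ages (clock skew) are treated as age 0.
pub fn recency(age_hours: f64) -> f64 {
    0.5_f64.powf(age_hours.max(0.0) / HALF_LIFE_HOURS)
}

/// Hypothetical helper: returns `None` when the gate rejects the item,
/// otherwise the weighted for_you score.
pub fn gated_for_you(
    completion_rate: f64,
    preference_match: f64,
    engagement_velocity: f64,
    age_hours: f64,
    creator_affinity: f64,
) -> Option<f64> {
    // The gate is strict: exactly 0.02 does not pass.
    if completion_rate <= MIN_COMPLETION_RATE {
        return None;
    }
    Some(
        preference_match * 0.4
            + engagement_velocity * 0.3
            + recency(age_hours) * 0.2
            + creator_affinity * 0.1,
    )
}
```

For example, a 48-hour-old item with preference match 0.8, engagement velocity 0.5, and creator affinity 0.5 scores 0.32 + 0.15 + 0.10 + 0.05 = 0.62, provided its completion rate is above 0.02.
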
### following Profile

- Candidate strategy: relationship-based (items from followed creators only)
- Scoring: `created_at` DESC (chronological), with tiebreaker on `completion_rate`
- No engagement-based scoring -- chronological is the default for following feeds
- No diversity enforcement (creator identity IS the filter)
- No exploration budget

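The chronological ordering with a tiebreaker can be sketched as a comparator. This is an illustration only: `FollowingItem` is a hypothetical stand-in for the real candidate type, and the tiebreak direction (higher `completion_rate` first) is an assumption, since the requirement does not state it.

```rust
use std::cmp::Ordering;

/// Hypothetical candidate row for illustration; not the real `ScoredCandidate`.
#[derive(Debug, Clone, PartialEq)]
pub struct FollowingItem {
    pub id: u64,
    pub created_at_ns: u64,
    pub completion_rate: f64,
}

/// Newest first; ties on `created_at_ns` go to the higher completion rate
/// (assumed direction).
pub fn following_order(a: &FollowingItem, b: &FollowingItem) -> Ordering {
    b.created_at_ns
        .cmp(&a.created_at_ns)
        .then_with(|| b.completion_rate.total_cmp(&a.completion_rate))
}

/// Sort a following feed in place using the comparator above.
pub fn sort_following(items: &mut [FollowingItem]) {
    items.sort_by(following_order);
}
```

Using `f64::total_cmp` keeps the comparator a total order even if a completion rate is NaN, which `partial_cmp` would not guarantee.
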
### related Profile

- Candidate strategy: ANN retrieval using source item embedding (top 100 candidates)
- Scoring: `item_similarity * 0.5 + preference_match * 0.3 + engagement * 0.2`
  - `item_similarity`: cosine between source item and candidate item embeddings
  - `preference_match`: cosine between user preference vector and candidate, if available
  - `engagement`: normalized population signals (view count, like count)
- Filter: exclude the source item itself
- Diversity: max_per_creator:2

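The source-item filter and the per-creator cap can be illustrated with a simple greedy pass over score-sorted candidates. This is a sketch under stated assumptions: `RelatedCandidate` and `filter_related` are hypothetical names, and the greedy cap is a simplification, not the m2p4 diversity implementation.

```rust
use std::collections::HashMap;

/// Hypothetical candidate row for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct RelatedCandidate {
    pub item_id: u64,
    pub creator_id: u64,
    pub score: f64,
}

/// Drop the source item, sort by score descending, then keep at most
/// `max_per_creator` items for each creator (highest-scoring first).
pub fn filter_related(
    source_item: u64,
    mut candidates: Vec<RelatedCandidate>,
    max_per_creator: usize,
) -> Vec<RelatedCandidate> {
    candidates.retain(|c| c.item_id != source_item);
    candidates.sort_by(|a, b| b.score.total_cmp(&a.score));

    let mut per_creator: HashMap<u64, usize> = HashMap::new();
    candidates
        .into_iter()
        .filter(|c| {
            let seen = per_creator.entry(c.creator_id).or_insert(0);
            *seen += 1;
            *seen <= max_per_creator
        })
        .collect()
}
```

With `max_per_creator` set to 2, a creator who contributes three high-scoring candidates keeps only the top two, which leaves room for candidates from other creators.
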
### notification Profile

- Candidate strategy: scan recent items from followed creators (last 48h)
- Scoring: `relationship_strength * 0.6 + item_quality * 0.4`
  - `relationship_strength`: interaction weight between user and creator, normalized
  - `item_quality`: composite of view velocity + completion rate
- Filter: only items from followed creators, created within 48h
- Sort: descending by score

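The candidate filter for notifications (followed creators only, created within the last 48h) might look like the sketch below. `RecentItem` and `notification_candidates` are hypothetical names, and a plain `HashSet<u64>` stands in for whatever follow structure the real executor uses.

```rust
use std::collections::HashSet;

/// 48 hours expressed in nanoseconds.
const WINDOW_NS: u64 = 48 * 3600 * 1_000_000_000;

/// Hypothetical item row for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct RecentItem {
    pub item_id: u64,
    pub creator_id: u64,
    pub created_at_ns: u64,
}

/// Keep items whose creator is followed and whose creation time falls
/// in the closed interval [now - 48h, now]. Future-dated items are dropped.
pub fn notification_candidates(
    items: &[RecentItem],
    followed: &HashSet<u64>,
    now_ns: u64,
) -> Vec<u64> {
    // saturating_sub avoids underflow when `now_ns` is smaller than the window.
    let cutoff = now_ns.saturating_sub(WINDOW_NS);
    items
        .iter()
        .filter(|it| followed.contains(&it.creator_id))
        .filter(|it| it.created_at_ns >= cutoff && it.created_at_ns <= now_ns)
        .map(|it| it.item_id)
        .collect()
}
```

The surviving item IDs would then be scored with `relationship_strength * 0.6 + item_quality * 0.4` and sorted descending.
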
## Technical Design

### Module Structure

```
tidal/src/
  ranking/
    personalized.rs     -- Personalized scoring functions
    builtins.rs         -- Extended with new profile definitions
```

### Personalized Scoring Functions

```rust
// === ranking/personalized.rs ===

use crate::db::user_context::UserContext;
use crate::schema::{EntityId, Timestamp};

/// Compute the preference match score between a user and an item.
///
/// Computes the dot product of the two vectors, which equals cosine
/// similarity when both are unit-length, and remaps [-1.0, 1.0] to [0.0, 1.0].
/// Returns 0.5 (neutral) if either vector is unavailable or the
/// dimensions differ.
pub fn preference_match(
    user_ctx: &UserContext,
    item_embedding: Option<&[f32]>,
) -> f64 {
    match (&user_ctx.preference_vector, item_embedding) {
        (Some(pref), Some(item)) => {
            if let Some(pref_data) = pref.as_slice() {
                if pref_data.len() == item.len() {
                    let cosine: f64 = pref_data.iter()
                        .zip(item.iter())
                        .map(|(&a, &b)| f64::from(a) * f64::from(b))
                        .sum();
                    // Clamp first: an unnormalized item embedding can push the
                    // dot product outside [-1, 1]. Then remap [-1, 1] to [0, 1].
                    (cosine.clamp(-1.0, 1.0) + 1.0) / 2.0
                } else {
                    0.5 // Dimension mismatch: neutral score
                }
            } else {
                0.5 // Cold start: neutral score
            }
        }
        _ => 0.5, // Missing data: neutral score
    }
}

/// Compute the creator affinity score for a user-creator pair.
///
/// Normalizes the interaction weight to [0.0, 1.0] using a sigmoid-like
/// transformation: `affinity = weight / (weight + k)` where k is a
/// half-saturation constant (default 5.0).
pub fn creator_affinity(
    user_ctx: &UserContext,
    creator_id: Option<EntityId>,
) -> f64 {
    const K: f64 = 5.0; // Half-saturation constant
    match creator_id {
        Some(cid) => {
            // Treat negative weights as zero so the result stays in [0, 1)
            // and the denominator can never reach zero.
            let weight = user_ctx.interaction_weight(cid).max(0.0);
            weight / (weight + K)
        }
        None => 0.0,
    }
}

/// Compute a recency score based on item age.
///
/// Uses exponential decay with a 48-hour half-life.
/// Items created at `now` get score 1.0; items 48h old get 0.5.
/// Items with a creation time in the future are treated as created at `now`.
pub fn recency_score(
    created_at_ns: u64,
    now: Timestamp,
) -> f64 {
    let now_ns = now.as_nanos();
    if created_at_ns >= now_ns {
        return 1.0;
    }
    let age_secs = (now_ns - created_at_ns) as f64 / 1_000_000_000.0;
    let half_life_secs = 48.0 * 3600.0;
    let lambda = std::f64::consts::LN_2 / half_life_secs;
    (-lambda * age_secs).exp()
}

/// Composite for_you score for a single candidate.
pub fn for_you_score(
    pref_match: f64,
    engagement_vel: f64,
    recency: f64,
    affinity: f64,
) -> f64 {
    pref_match * 0.4 + engagement_vel * 0.3 + recency * 0.2 + affinity * 0.1
}

/// Composite related score for a single candidate.
pub fn related_score(
    item_similarity: f64,
    pref_match: f64,
    engagement: f64,
) -> f64 {
    item_similarity * 0.5 + pref_match * 0.3 + engagement * 0.2
}

/// Composite notification score for a single candidate.
pub fn notification_score(
    relationship_strength: f64,
    item_quality: f64,
) -> f64 {
    relationship_strength * 0.6 + item_quality * 0.4
}
```

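`preference_match` relies on a plain dot product, which only equals cosine similarity when both vectors are unit-length. If item embeddings are not guaranteed to be normalized, a full cosine with explicit norms is one alternative. The sketch below is an assumption-laden illustration: `cosine_unit_interval` is a hypothetical helper operating on raw slices, not part of the task's API.

```rust
/// Hypothetical helper: cosine similarity remapped from [-1, 1] to [0, 1].
/// Returns `None` for empty input, mismatched dimensions, or a zero-length
/// vector, so the caller can substitute the neutral score 0.5.
pub fn cosine_unit_interval(a: &[f32], b: &[f32]) -> Option<f64> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let (mut dot, mut norm_a_sq, mut norm_b_sq) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (&x, &y) in a.iter().zip(b) {
        let (x, y) = (f64::from(x), f64::from(y));
        dot += x * y;
        norm_a_sq += x * x;
        norm_b_sq += y * y;
    }
    if norm_a_sq == 0.0 || norm_b_sq == 0.0 {
        return None;
    }
    // Clamp to absorb floating-point drift slightly outside [-1, 1].
    let cosine = (dot / (norm_a_sq.sqrt() * norm_b_sq.sqrt())).clamp(-1.0, 1.0);
    Some((cosine + 1.0) / 2.0)
}
```

A caller would use `cosine_unit_interval(pref, item).unwrap_or(0.5)` to keep the neutral-score behavior for missing or degenerate data. The trade-off is two extra multiply-adds and a square root per pair compared with the dot-product shortcut.
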
### Profile Definitions

```rust
// === ranking/builtins.rs (extensions) ===

/// Register the personalized profiles.
pub fn register_personalized_builtins(registry: &mut ProfileRegistry) -> crate::Result<()> {
    // for_you
    registry.register(RankingProfile {
        name: "for_you".into(),
        version: 1,
        candidate_strategy: CandidateStrategy::Ann {
            slot: "user_preference".into(),
            limit: 200,
        },
        boosts: vec![],
        decay: None,
        gates: vec![Gate {
            signal: "completion".into(),
            agg: SignalAgg::Ratio,
            window: Window::AllTime,
            min_threshold: 0.02,
        }],
        penalties: vec![Penalty {
            signal: "skip".into(),
            agg: SignalAgg::Value,
            window: Window::TwentyFourHours,
            weight: 0.1,
        }],
        excludes: vec![],
        diversity: DiversitySpec {
            max_per_creator: Some(2),
            format_mix_max_fraction: Some(0.6),
        },
        exploration: 0.1, // 10%
        sort: None, // Custom scoring via score_with_context
        is_builtin: true,
    })?;

    // following
    registry.register(RankingProfile {
        name: "following".into(),
        version: 1,
        candidate_strategy: CandidateStrategy::Relationship,
        boosts: vec![],
        decay: None,
        gates: vec![],
        penalties: vec![],
        excludes: vec![],
        diversity: DiversitySpec::default(),
        exploration: 0.0,
        sort: Some(Sort::New), // Chronological
        is_builtin: true,
    })?;

    // related
    registry.register(RankingProfile {
        name: "related".into(),
        version: 1,
        candidate_strategy: CandidateStrategy::Ann {
            slot: "default".into(), // Source item embedding
            limit: 100,
        },
        boosts: vec![],
        decay: None,
        gates: vec![],
        penalties: vec![],
        excludes: vec![],
        diversity: DiversitySpec {
            max_per_creator: Some(2),
            format_mix_max_fraction: None,
        },
        exploration: 0.0,
        sort: None, // Custom scoring via score_with_context
        is_builtin: true,
    })?;

    // notification
    registry.register(RankingProfile {
        name: "notification".into(),
        version: 1,
        candidate_strategy: CandidateStrategy::Relationship,
        boosts: vec![],
        decay: None,
        gates: vec![],
        penalties: vec![],
        excludes: vec![],
        diversity: DiversitySpec::default(),
        exploration: 0.0,
        sort: None, // Custom scoring via score_with_context
        is_builtin: true,
    })?;

    Ok(())
}
```

### ProfileExecutor Extension

```rust
impl<'a> ProfileExecutor<'a> {
    /// Score candidates with user context for personalized ranking.
    ///
    /// When `user_ctx` is `Some`, the executor uses personalized scoring
    /// functions. The profile name determines which scoring formula is used:
    /// - `for_you`: preference match + engagement + recency + affinity
    /// - `related`: item similarity + preference match + engagement
    /// - `notification`: relationship strength + item quality
    /// - All others, or any profile when `user_ctx` is `None`: delegates to
    ///   the population-level path (`compute_raw_score`)
    pub fn score_with_context(
        &self,
        candidates: &[EntityId],
        profile: &RankingProfile,
        now: Timestamp,
        user_ctx: Option<&UserContext>,
        item_embeddings: &dyn Fn(EntityId) -> Option<Vec<f32>>,
        item_created_at: &dyn Fn(EntityId) -> Option<u64>,
    ) -> Vec<ScoredCandidate> {
        let profile_name = profile.name.as_str();

        let mut scored: Vec<ScoredCandidate> = candidates
            .iter()
            .filter(|&&eid| passes_gates(eid, &profile.gates, self.ledger))
            .map(|&entity_id| {
                let raw = match (profile_name, user_ctx) {
                    ("for_you", Some(ctx)) => {
                        self.score_for_you(entity_id, ctx, now, item_embeddings, item_created_at)
                    }
                    ("related", Some(ctx)) => self.score_related(entity_id, ctx, item_embeddings),
                    ("notification", Some(ctx)) => self.score_notification(entity_id, ctx),
                    _ => self.compute_raw_score(entity_id, profile, now),
                };
                ScoredCandidate {
                    entity_id,
                    score: raw,
                    signal_snapshot: vec![],
                    creator_id: None,
                    format: None,
                }
            })
            .collect();

        // Highest score first; total_cmp gives a total order even with NaN.
        scored.sort_by(|a, b| b.score.total_cmp(&a.score));
        normalize(&mut scored);
        scored
    }

    fn score_for_you(
        &self,
        entity_id: EntityId,
        user_ctx: &UserContext,
        now: Timestamp,
        item_embeddings: &dyn Fn(EntityId) -> Option<Vec<f32>>,
        item_created_at: &dyn Fn(EntityId) -> Option<u64>,
    ) -> f64 {
        let item_emb = item_embeddings(entity_id);
        let pref_match = preference_match(user_ctx, item_emb.as_deref());

        let view_vel = read_agg(entity_id, "view", &SignalAgg::Velocity, Window::TwentyFourHours, self.ledger);
        let share_vel = read_agg(entity_id, "share", &SignalAgg::Velocity, Window::TwentyFourHours, self.ledger);
        // Shares count double; clamp so the factor stays in [0, 1].
        let engagement_vel = (view_vel + 2.0 * share_vel).clamp(0.0, 1.0);

        // Unknown creation time: fall back to a neutral recency of 0.5.
        let recency = item_created_at(entity_id)
            .map_or(0.5, |ts| recency_score(ts, now));

        // Placeholder: the actual implementation reads the creator from item
        // metadata. With `None`, affinity is always 0.0.
        let creator_id = None;
        let affinity = creator_affinity(user_ctx, creator_id);

        for_you_score(pref_match, engagement_vel, recency, affinity)
    }

    fn score_related(
        &self,
        entity_id: EntityId,
        user_ctx: &UserContext,
        item_embeddings: &dyn Fn(EntityId) -> Option<Vec<f32>>,
    ) -> f64 {
        let item_emb = item_embeddings(entity_id);
        let pref_match = preference_match(user_ctx, item_emb.as_deref());

        let views = read_agg(entity_id, "view", &SignalAgg::Value, Window::AllTime, self.ledger);
        let likes = read_agg(entity_id, "like", &SignalAgg::Value, Window::AllTime, self.ledger);
        // log10(0) is -inf, which `.max(0.0)` maps to 0. Cap at 1.0 so very
        // popular items cannot push the factor above the unit range.
        let engagement = ((views.log10().max(0.0) + likes.log10().max(0.0)) / 10.0).min(1.0);

        // item_similarity is computed by the caller from ANN distances.
        // For now, use preference match as a proxy.
        let item_similarity = pref_match;

        related_score(item_similarity, pref_match, engagement)
    }

    fn score_notification(
        &self,
        entity_id: EntityId,
        user_ctx: &UserContext,
    ) -> f64 {
        // Placeholder: the actual implementation reads the creator from item
        // metadata. With `None`, relationship strength is always 0.0.
        let creator_id = None;
        let rel_strength = creator_affinity(user_ctx, creator_id);

        let view_vel = read_agg(entity_id, "view", &SignalAgg::Velocity, Window::TwentyFourHours, self.ledger);
        let completion = read_agg(entity_id, "completion", &SignalAgg::DecayScore, Window::AllTime, self.ledger);
        // Clamp so the composite stays in [0, 1] regardless of signal scale.
        let item_quality = ((view_vel.log10().max(0.0) + completion) / 2.0).clamp(0.0, 1.0);

        notification_score(rel_strength, item_quality)
    }
}
```

## Test Strategy

### Unit Tests

```rust
use std::collections::HashSet;

#[test]
fn preference_match_identical_vectors() {
    let pref = PreferenceVector::from_embedding(vec![1.0, 0.0, 0.0], 3).unwrap();
    let ctx = UserContext {
        user_id: EntityId::new(1),
        preference_vector: Some(pref),
        top_creators: vec![],
        followed_creators: HashSet::new(),
        blocked_creators: HashSet::new(),
        hidden_items: HashSet::new(),
        is_cold_start: false,
    };
    let item = [1.0f32, 0.0, 0.0];
    let score = preference_match(&ctx, Some(&item));
    assert!((score - 1.0).abs() < 0.01, "identical vectors: {score}");
}

#[test]
fn preference_match_orthogonal_vectors() {
    let pref = PreferenceVector::from_embedding(vec![1.0, 0.0, 0.0], 3).unwrap();
    let ctx = UserContext {
        user_id: EntityId::new(1),
        preference_vector: Some(pref),
        ..cold_start_context()
    };
    let item = [0.0f32, 1.0, 0.0];
    let score = preference_match(&ctx, Some(&item));
    assert!((score - 0.5).abs() < 0.01, "orthogonal: {score}");
}

#[test]
fn preference_match_opposite_vectors() {
    let pref = PreferenceVector::from_embedding(vec![1.0, 0.0, 0.0], 3).unwrap();
    let ctx = UserContext {
        user_id: EntityId::new(1),
        preference_vector: Some(pref),
        ..cold_start_context()
    };
    let item = [-1.0f32, 0.0, 0.0];
    let score = preference_match(&ctx, Some(&item));
    assert!((score - 0.0).abs() < 0.01, "opposite: {score}");
}

#[test]
fn preference_match_cold_start_returns_neutral() {
    let ctx = cold_start_context();
    let item = [1.0f32, 0.0, 0.0];
    let score = preference_match(&ctx, Some(&item));
    assert!((score - 0.5).abs() < f64::EPSILON);
}

#[test]
fn creator_affinity_zero_for_no_interaction() {
    let ctx = cold_start_context();
    let score = creator_affinity(&ctx, Some(EntityId::new(10)));
    assert!((score - 0.0).abs() < f64::EPSILON);
}

#[test]
fn creator_affinity_saturates() {
    let ctx = UserContext {
        user_id: EntityId::new(1),
        top_creators: vec![(EntityId::new(10), 100.0)],
        ..cold_start_context()
    };
    let score = creator_affinity(&ctx, Some(EntityId::new(10)));
    // weight=100, k=5: 100/(100+5) = 0.952
    assert!(score > 0.9, "high affinity: {score}");
}

#[test]
fn recency_score_now_is_one() {
    let now = Timestamp::now();
    let score = recency_score(now.as_nanos(), now);
    assert!((score - 1.0).abs() < 0.01);
}

#[test]
fn recency_score_48h_is_half() {
    let now = Timestamp::now();
    let forty_eight_hours_ago = now.as_nanos() - 48 * 3600 * 1_000_000_000;
    let score = recency_score(forty_eight_hours_ago, now);
    assert!((score - 0.5).abs() < 0.05, "48h recency: {score}");
}

#[test]
fn for_you_score_range() {
    let score = for_you_score(1.0, 1.0, 1.0, 1.0);
    assert!((score - 1.0).abs() < f64::EPSILON);
    let score_zero = for_you_score(0.0, 0.0, 0.0, 0.0);
    assert!((score_zero - 0.0).abs() < f64::EPSILON);
}

#[test]
fn related_score_range() {
    let score = related_score(1.0, 1.0, 1.0);
    assert!((score - 1.0).abs() < f64::EPSILON);
}

#[test]
fn notification_score_range() {
    let score = notification_score(1.0, 1.0);
    assert!((score - 1.0).abs() < f64::EPSILON);
}

/// Shared fixture: a user with no preference vector and no interactions.
fn cold_start_context() -> UserContext {
    UserContext {
        user_id: EntityId::new(1),
        preference_vector: None,
        top_creators: vec![],
        followed_creators: HashSet::new(),
        blocked_creators: HashSet::new(),
        hidden_items: HashSet::new(),
        is_cold_start: true,
    }
}
```

### Property Tests

```rust
use proptest::prelude::*;

proptest! {
    #[test]
    fn preference_match_always_in_unit_range(
        pref_vec in proptest::collection::vec(-1.0f32..1.0, 16),
        item_vec in proptest::collection::vec(-1.0f32..1.0, 16),
    ) {
        // `item_vec` is deliberately not normalized, so this also checks that
        // `preference_match` stays in range for non-unit item embeddings.
        if let Some(pref) = PreferenceVector::from_embedding(pref_vec, 16) {
            let ctx = UserContext {
                user_id: EntityId::new(1),
                preference_vector: Some(pref),
                ..cold_start_context()
            };
            let score = preference_match(&ctx, Some(&item_vec));
            prop_assert!((0.0..=1.0).contains(&score),
                "preference match out of range: {}", score);
        }
    }

    #[test]
    fn for_you_score_always_in_unit_range(
        pm in 0.0f64..1.0,
        ev in 0.0f64..1.0,
        r in 0.0f64..1.0,
        a in 0.0f64..1.0,
    ) {
        let score = for_you_score(pm, ev, r, a);
        prop_assert!((0.0..=1.0).contains(&score),
            "for_you score out of range: {}", score);
    }

    #[test]
    fn creator_affinity_always_in_unit_range(
        weight in 0.0f64..1000.0,
    ) {
        let ctx = UserContext {
            user_id: EntityId::new(1),
            top_creators: vec![(EntityId::new(10), weight)],
            ..cold_start_context()
        };
        let score = creator_affinity(&ctx, Some(EntityId::new(10)));
        prop_assert!((0.0..=1.0).contains(&score),
            "creator affinity out of range: {}", score);
    }
}
```

## Acceptance Criteria

- [ ] `preference_match` returns cosine similarity remapped to [0, 1], neutral 0.5 for missing data
- [ ] `creator_affinity` returns sigmoid-normalized interaction weight in [0, 1]
- [ ] `recency_score` returns exponential decay with 48h half-life
- [ ] `for_you_score` combines four factors with weights summing to 1.0
- [ ] `related_score` combines three factors with weights summing to 1.0
- [ ] `notification_score` combines two factors with weights summing to 1.0
- [ ] All scoring functions return values in [0.0, 1.0] (property tested)
- [ ] `for_you` profile registered as builtin with correct configuration
- [ ] `following` profile registered with `Sort::New` and `CandidateStrategy::Relationship`
- [ ] `related` profile registered with ANN candidate strategy
- [ ] `notification` profile registered with relationship-based candidates
- [ ] `ProfileExecutor::score_with_context` dispatches to correct scoring function by profile name
- [ ] Cold-start users get neutral scores (0.5 preference match, 0.0 affinity)
- [ ] `cargo clippy -- -D warnings` passes
- [ ] All tests pass

## Research References

- [docs/research/ann_for_tidaldb.md](../../../research/ann_for_tidaldb.md) -- Cosine similarity via dot product on unit vectors
- [VISION.md](../../../../VISION.md) -- Ranking profile formulas
- [USE_CASES.md](../../../../USE_CASES.md) -- UC-01, UC-04, UC-05, UC-07

## Implementation Notes

- The scoring functions are intentionally simple linear combinations. The weights (0.4/0.3/0.2/0.1 for for_you) are starting points that can be tuned without code changes if the profile system is extended to accept configurable weights. For M3, hardcoded weights are sufficient.
- `creator_affinity` uses a sigmoid-like `w/(w+k)` transformation instead of raw weight. This bounds the output to [0, 1] and prevents high-weight creators from completely dominating the score. The half-saturation constant `k=5.0` means a weight of 5 produces affinity 0.5.
- For the `related` profile, the `item_similarity` should ideally come from the ANN distance between the source item and candidate item. In this task, we use preference match as a proxy. The full implementation should pipe ANN distances through from the candidate retrieval phase.
- The `following` profile uses `Sort::New` from the existing sort system. No custom scoring is needed -- the executor's existing `score_by_sort` handles chronological ordering.
- The `notification` profile's `CandidateStrategy::Relationship` means candidates are sourced from followed creators' items. The RETRIEVE executor must implement this candidate sourcing strategy, which uses the `FollowsBitmap` from m3p1 Task 03.
- Do NOT implement the exploration budget injection in this task. The `exploration: 0.1` field on the `for_you` profile is defined here but not enforced. Enforcement is done in Task 03 (Cold Start and Exploration).

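For the `related` note about piping ANN distances through, one possible conversion is sketched below. It assumes the index reports cosine distance as `1 - cosine` (so values lie in [0, 2]); that convention should be checked against the actual index configuration. The function name `similarity_from_cosine_distance` is hypothetical.

```rust
/// Hypothetical helper: convert a cosine distance `d = 1 - cos` into a
/// similarity in [0, 1], using the same remap as `preference_match`:
/// ((1 - d) + 1) / 2 = 1 - d / 2.
pub fn similarity_from_cosine_distance(distance: f32) -> f64 {
    // Clamp to absorb floating-point drift outside the expected [0, 2] range.
    (1.0 - f64::from(distance) / 2.0).clamp(0.0, 1.0)
}
```

Under this assumption, a distance of 0 (identical direction) maps to 1.0, a distance of 1 (orthogonal) maps to 0.5, and a distance of 2 (opposite) maps to 0.0, which keeps `item_similarity` on the same scale as the other factors in `related_score`.
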