# Milestone 3, Phase 3: Personalized Ranking Profiles
## Phase Deliverable
Four personalized ranking profiles (`for_you`, `following`, `related`, `notification`) that incorporate user context into scoring, plus cold-start handling for new users and new items. The query parser accepts a `FOR USER @user_id` clause and resolves it into a `UserContext` that loads the user's preference vector, interaction weights, followed creators, and blocked state. The profile executor uses this context to score candidates with personalization factors.

Before m3p3, all ranking profiles operate on population-level signals (trending, hot, new, etc.). After m3p3, profiles can weight candidates by how well they match the user's learned preferences, boost items from creators the user frequently engages with, and inject exploration candidates from unfollowed creators to prevent filter bubbles.
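To make the deliverable concrete, here is a minimal sketch of what a resolved `UserContext` might carry. The field names, the use of cosine similarity for preference matching, and the `1.0 + weight` creator boost are all assumptions for illustration; none of them are taken from the tidal codebase.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical shape of the per-query user context. Field names are
/// illustrative assumptions, not the actual tidal types.
#[derive(Debug, Clone, Default)]
pub struct UserContext {
    pub user_id: String,
    /// Learned preference embedding; `None` for cold-start users.
    pub preference_vector: Option<Vec<f32>>,
    /// Interaction weight per creator id (assumed non-negative).
    pub interaction_weights: HashMap<String, f32>,
    pub followed_creators: HashSet<String>,
    pub blocked_creators: HashSet<String>,
}

impl UserContext {
    /// A user with no (or an empty) preference vector is treated as cold-start
    /// and would fall back to population-level signals.
    pub fn is_cold_start(&self) -> bool {
        self.preference_vector.as_ref().map_or(true, |v| v.is_empty())
    }

    /// Cosine similarity between the preference vector and an item embedding.
    /// Returns 0.0 when the vector is missing, dimensions differ, or either
    /// vector has zero norm. Cosine is an assumed metric for this sketch.
    pub fn preference_match(&self, item_embedding: &[f32]) -> f32 {
        let Some(pref) = self.preference_vector.as_deref() else {
            return 0.0;
        };
        if pref.is_empty() || pref.len() != item_embedding.len() {
            return 0.0;
        }
        let dot: f32 = pref.iter().zip(item_embedding).map(|(a, b)| a * b).sum();
        let norm_p = pref.iter().map(|a| a * a).sum::<f32>().sqrt();
        let norm_i = item_embedding.iter().map(|a| a * a).sum::<f32>().sqrt();
        if norm_p == 0.0 || norm_i == 0.0 {
            0.0
        } else {
            dot / (norm_p * norm_i)
        }
    }

    /// Multiplier for items from a creator: 0.0 if blocked, otherwise
    /// 1.0 plus the stored interaction weight (an assumed formula).
    pub fn creator_boost(&self, creator_id: &str) -> f32 {
        if self.blocked_creators.contains(creator_id) {
            return 0.0;
        }
        1.0 + self.interaction_weights.get(creator_id).copied().unwrap_or(0.0)
    }
}
```

In this sketch a profile would check `is_cold_start()` first and only call `preference_match()` when a vector is present, which mirrors the cold-start fallback described in the acceptance criteria.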
## Acceptance Criteria
- [ ] `FOR USER @user_id` clause parsed by the query parser
- [ ] `UserContext` loaded from `UserStateIndex`, `InteractionWeightLedger`, preference vector
- [ ] `for_you` profile: ANN retrieval using user preference vector, scoring = `preference_match * engagement * recency * social_proof`, 10% exploration budget
- [ ] `following` profile: candidates restricted to followed creators, sorted by `created_at` DESC
- [ ] `related` profile: ANN retrieval using source item embedding, re-ranked by user preference
- [ ] `notification` profile: candidates from followed creators' recent items, scored by `relationship_strength * item_quality`
- [ ] Cold-start users (no preference vector): fall back to population-level signals (trending/quality)
- [ ] Cold-start items (no signals): exploration window -- appear in ~2% of `for_you` feeds
- [ ] Exploration budget: ~5 of 50 `for_you` results from unfollowed creators
- [ ] `ProfileExecutor` extended with `score_with_user_context()` method
- [ ] Profile registration: `for_you`, `following`, `related`, `notification` added to `ProfileRegistry` as builtins
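The `for_you` scoring formula and the exploration budget from the criteria above can be sketched as follows. This is an illustration under stated assumptions: the `Factors` struct, the function names, the normalization of each factor to `[0, 1]`, and appending exploration items at the tail of the result are all hypothetical choices, not the `ProfileExecutor` implementation.

```rust
/// Hypothetical per-candidate factors, each assumed normalized to [0, 1].
#[derive(Debug, Clone, Copy)]
pub struct Factors {
    pub preference_match: f32,
    pub engagement: f32,
    pub recency: f32,
    pub social_proof: f32,
}

/// Multiplicative `for_you` score as written in the acceptance criteria.
pub fn for_you_score(f: Factors) -> f32 {
    f.preference_match * f.engagement * f.recency * f.social_proof
}

/// Number of result slots reserved for exploration, rounded to the nearest
/// integer and never larger than `limit`. With limit = 50 and budget = 0.10
/// this reserves 5 slots.
pub fn exploration_slots(limit: usize, budget: f32) -> usize {
    let raw = ((limit as f32) * budget.clamp(0.0, 1.0)).round() as usize;
    raw.min(limit)
}

/// Fill the result with the top ranked items, then the reserved number of
/// exploration items. Appending at the tail is a simplification; a real feed
/// might interleave exploration items instead.
pub fn merge_with_exploration<T: Clone>(
    ranked: &[T],
    exploration: &[T],
    limit: usize,
    budget: f32,
) -> Vec<T> {
    // Do not reserve more slots than there are exploration candidates.
    let slots = exploration_slots(limit, budget).min(exploration.len());
    let mut out: Vec<T> = ranked
        .iter()
        .take(limit.saturating_sub(slots))
        .cloned()
        .collect();
    out.extend(exploration.iter().take(slots).cloned());
    out
}
```

One consequence of the purely multiplicative form is that any single factor at zero drives the whole score to zero, so a cold-start item with no engagement signal cannot surface through scoring alone. That is consistent with giving such items a separate exploration window.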
## Dependencies
- **Requires:** m3p2 (feedback loop: preference vectors populated, interaction weights updated, user state bitmaps maintained), m2p3 (ranking profile engine, `ProfileExecutor`), m2p5 (query parser, RETRIEVE executor), m2p1 (vector index for ANN retrieval with user preference vector)
- **Blocks:** m3p4 (User State Filters need `FOR USER` parsing to inject user context into filter evaluation)
## Research References
- [docs/research/ann_for_tidaldb.md](../../../research/ann_for_tidaldb.md) -- ANN retrieval with user preference vector as query
- [VISION.md](../../../../VISION.md) -- Ranking profiles, personalization factors
- [USE_CASES.md](../../../../USE_CASES.md) -- UC-01 (For You), UC-04 (Following), UC-05 (Related), UC-07 (Notifications)
## Task Index
| # | Task | Delivers | Depends On | Complexity |
|---|------|----------|------------|------------|
| 01 | FOR USER Query Context | `UserContext` loader, query parser extension for `FOR USER`, planner integration | None | M |
| 02 | Personalized Profiles | `for_you`, `following`, `related`, `notification` profiles, executor extensions | Task 01 | L |
| 03 | Cold Start and Exploration | Cold-start fallback, exploration budget injection, new-item exploration window | Task 02 | M |
## Task Dependency DAG
```
Task 01: FOR USER Query Context
    |
    v
Task 02: Personalized Profiles
    |
    v
Task 03: Cold Start and Exploration
```
All three tasks are sequential: Task 02 needs the user context from Task 01, and Task 03 needs the profiles from Task 02 to inject exploration candidates.
## File Layout
```
tidal/src/
  db/
    user_context.rs      -- UserContext loader, query context resolution (Task 01)
  query/
    mod.rs               -- Extended parser with FOR USER clause (Task 01)
  ranking/
    personalized.rs      -- Personalized scoring functions, profile definitions (Task 02)
    exploration.rs       -- Cold-start fallback, exploration budget (Task 03)
    builtins.rs          -- Extended with for_you, following, related, notification (Task 02)
tidal/tests/
  m3p3_personalized.rs   -- Phase integration tests
```
## Open Questions
1. **Exploration budget implementation**: Should exploration candidates be selected randomly from the full corpus, or from a "new items" pool? Recommendation: random selection from the full corpus minus followed creators' items. This maximizes serendipity. New-item exploration is a separate budget within the exploration slice.

2. **Social proof signal**: How should "social proof" (engagement from followed creators' followers) be implemented? For M3, social proof is approximated by the item's population-level engagement signals (view velocity, like count). True social graph traversal ("trending among my follows' follows") is deferred to M6.

3. **SIMILAR TO clause**: The `related` profile needs a source item for ANN retrieval. Should `SIMILAR TO @item_id` be a separate parser clause, or embedded in the profile configuration? Recommendation: separate clause (`RETRIEVE items SIMILAR TO @item_abc USING PROFILE related ...`). This keeps the profile generic and the source item explicit.

4. **Notification frequency capping**: Should the `notification` profile enforce per-creator notification limits (e.g., max 3 per creator per day)? Recommendation: deferred to M6. For M3, the notification profile ranks by recency * relationship strength without capping.
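If the separate-clause recommendation in question 3 is adopted, clause extraction could look roughly like the sketch below. It is a whitespace-token scan written only to illustrate the idea; the real query parser's grammar, the `PersonalizationClauses` type, the helper names, and the placement of `FOR USER` at the end of the example query are assumptions.

```rust
/// Hypothetical parsed form of the personalization-related clauses.
#[derive(Debug, Default, PartialEq)]
pub struct PersonalizationClauses {
    pub user_id: Option<String>,
    pub similar_to: Option<String>,
    pub profile: Option<String>,
}

/// True if the token at `idx` exists and equals `word`, ignoring ASCII case.
fn keyword_at(tokens: &[&str], idx: usize, word: &str) -> bool {
    tokens.get(idx).map_or(false, |t| t.eq_ignore_ascii_case(word))
}

/// Read an `@identifier` token at `idx` and return it without the `@`.
fn reference_at(tokens: &[&str], idx: usize, clause: &str) -> Result<String, String> {
    match tokens.get(idx).and_then(|t| t.strip_prefix('@')) {
        Some(id) if !id.is_empty() => Ok(id.to_string()),
        _ => Err(format!("{clause} expects an @-prefixed identifier")),
    }
}

/// Scan a query string for `FOR USER @id`, `SIMILAR TO @id`, and
/// `USING PROFILE name`. All other tokens are skipped, so this is not a
/// full RETRIEVE parser.
pub fn parse_clauses(query: &str) -> Result<PersonalizationClauses, String> {
    let tokens: Vec<&str> = query.split_whitespace().collect();
    let mut out = PersonalizationClauses::default();
    let mut i = 0;
    while i < tokens.len() {
        if keyword_at(&tokens, i, "FOR") && keyword_at(&tokens, i + 1, "USER") {
            out.user_id = Some(reference_at(&tokens, i + 2, "FOR USER")?);
            i += 3;
        } else if keyword_at(&tokens, i, "SIMILAR") && keyword_at(&tokens, i + 1, "TO") {
            out.similar_to = Some(reference_at(&tokens, i + 2, "SIMILAR TO")?);
            i += 3;
        } else if keyword_at(&tokens, i, "USING") && keyword_at(&tokens, i + 1, "PROFILE") {
            let name = tokens
                .get(i + 2)
                .ok_or("USING PROFILE expects a profile name")?;
            out.profile = Some(name.to_string());
            i += 3;
        } else {
            i += 1;
        }
    }
    Ok(out)
}
```

Keeping `similar_to` as its own field, separate from `profile`, reflects the recommendation: the `related` profile stays generic, and the planner can reject a `related` query whose `similar_to` is `None`.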