M2: RETRIEVE query pipeline with 5-stage execution (candidate → filter → score → diversify → limit),
usearch HNSW vector index, bitmap/range/universe filters, ranking profiles with signal scoring,
MMR diversity enforcement, and m2_uat integration tests.
M3: Entity system with typed metadata, relationship graph (follows/blocks/interactions),
creator entities, session tracking, and m3_uat integration tests.
M4: Advanced ranking with builtin functions (freshness, trending, controversy, wilson),
ranking executor with explain mode, query executor integration, benchmarks for
query/ranking/vector/filters/diversity, and m4_uat integration tests.
Includes: 9 new blog posts, marketing site updates, updated roadmap, and updated vision doc.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Task 02: M3 UAT Integration Test
Context
Milestone: 3 -- Personalized Ranking
Phase: m3p4 -- User State Filters + M3 UAT Integration Test
Depends On: Task 01 (user-state filters: unseen, unblocked, saved, liked, in_progress), m3p3 (personalized profiles: for_you, following, related, notification; cold-start handling; exploration budget), m3p2 (feedback loop: signal dispatch, preference vectors, interaction weights, hard negatives), m3p1 (user/creator entities, relationships, user state index)
Blocks: Nothing (this is the final deliverable of Milestone 3)
Complexity: L
Objective
Deliver the end-to-end integration test that proves the complete Milestone 3 UAT scenario from the ROADMAP. This test is the pass/fail gate for Milestone 3. It exercises every component built across m3p1--m3p4 in a single scenario:
- A corpus of 10,000 items across 200 creators with embeddings
- 500 users with follows, blocks, and historical signal events
- A for_you query returns personalized, filtered, diversity-constrained results
- A following query returns chronologically-ordered items from followed creators
- A related query returns semantically similar items re-ranked by user preference
- A like signal atomically updates item signals, interaction weights, and preference vector
- Re-executing for_you reflects the like (results shift)
- A hide signal permanently excludes the hidden item
- A block signal permanently excludes all items from the blocked creator
- Re-executing for_you excludes the hidden item and blocked creator's items
This test is not a unit test or component test. It is a full-system acceptance test that creates a TidalDb instance, writes all test data through the public API, executes queries through the public retrieve() method, and verifies results against the UAT criteria. If this test passes, Milestone 3 is complete.
Requirements
Test Data Setup
- 10,000 items across 200 creators (50 items per creator)
- Each item has: metadata (title, category, format, duration, created_at, creator_id), 16-dimensional embedding
- 200 creator entities with metadata
- 500 users, each following 10--30 random creators, each blocking 0--3 random creators
- Signal types: view (7d decay), like (14d decay), skip (1d decay), share (3d decay), completion (30d decay)
- 500,000 historical signal events establishing user preference vectors and interaction weights
- Ranking profiles registered: for_you, following, related, notification, trending, hot, new
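The decay periods listed for each signal type are half-lives (the schema helper under Test Helpers passes them as DecaySpec::Exponential { half_life }). As a rough illustration of what a half-life means for one event's contribution, the sketch below assumes the standard formula weight * 0.5^(age / half_life). The name decayed_weight is illustrative and is not a tidaldb function; the engine's actual accumulation logic is not described in this document.

```rust
use std::time::Duration;

/// Contribution of a single event of weight `weight` after `age` has elapsed,
/// under exponential decay with the given half-life.
/// Illustrative helper only (assumption); not part of the tidaldb API.
fn decayed_weight(weight: f64, age: Duration, half_life: Duration) -> f64 {
    // Number of half-lives that have passed since the event.
    let halvings = age.as_secs_f64() / half_life.as_secs_f64();
    weight * 0.5_f64.powf(halvings)
}
```

Under this formula a view (7d half-life) is worth half its original weight after one week, while a completion (30d half-life) of the same age retains most of its weight.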
Note: The test uses 16-dimensional embeddings instead of 1536 for speed. The dimensionality does not affect the correctness of cosine similarity or ANN retrieval, only the semantic quality (which is irrelevant for UAT).
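To make the dimensionality point concrete, here is a minimal cosine similarity sketch. Nothing in it depends on the vector length, which is why 16 dimensions exercise the same code path as 1536. The function name is illustrative; the real computation happens inside the vector index and is not shown in this document.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero norm.
/// Illustrative helper only (assumption); not part of the tidaldb API.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```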
Test Scenario Steps
Each step corresponds to a "When" clause from the ROADMAP UAT scenario.
Step 1: For You query
RETRIEVE items FOR USER @user_42
USING PROFILE for_you
FILTER unseen, unblocked
DIVERSITY max_per_creator:2
LIMIT 50
Verify:
- Returns exactly 50 results
- Results are sorted by score descending
- No item appears that user_42 has already viewed (seen bitmap populated from historical signals)
- No item from a blocked creator appears
- No hidden items appear
- Max 2 items per creator in the result set
- Approximately 5 items (10% exploration budget) are from creators user_42 does not follow
- Items matching user_42's preference vector rank higher than random items (cosine similarity correlation)
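The exploration check interacts with the user's follow count and the diversity cap, so it is worth working the numbers before trusting a tolerance. The sketch below mirrors the follow-selection arithmetic of the user_relationships helper defined under Test Helpers and computes how many of the 50 slots followed creators can fill under max_per_creator:2. The names follows_for and max_followed_items are illustrative and do not appear in the test file.

```rust
const NUM_CREATORS: u64 = 200;

/// Mirrors the follow-selection arithmetic of `user_relationships`
/// (same constants and formula as the Test Helpers listing).
fn follows_for(user_id: u64) -> Vec<u64> {
    let follow_count = 10 + (user_id % 21);
    let mut follows: Vec<u64> = (0..follow_count)
        .map(|i| (user_id * 7 + i * 13) % NUM_CREATORS)
        .collect();
    follows.sort_unstable();
    follows.dedup();
    follows
}

/// Upper bound on how many results can come from followed creators
/// when each creator may contribute at most `max_per_creator` items.
fn max_followed_items(user_id: u64, max_per_creator: usize, limit: usize) -> usize {
    (follows_for(user_id).len() * max_per_creator).min(limit)
}
```

For user 42, follows_for(42) yields 10 creators (42 % 21 is 0), so max_followed_items(42, 2, 50) is 20. That leaves at least 30 of the 50 slots for creators user_42 does not follow. If exploration is counted as "items from unfollowed creators", as in the Verify list above, a target of roughly 5 would need either a test user with a larger follow set or a narrower definition of exploration. Treat this as something to confirm against the real pipeline rather than a settled conclusion.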
Step 2: Following query
RETRIEVE items FOR USER @user_42
FILTER relationship:follows
USING PROFILE following
LIMIT 50
Verify:
- All items are from creators user_42 follows
- Items are ordered by created_at descending (chronological)
- No items from unfollowed creators appear
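The ordering criterion reduces to a pairwise check over adjacent results. The sketch below assumes the created_at value (in nanoseconds) of each result can be obtained in result order, either from the result itself or from a metadata lookup; this document leaves open which of the two the API provides. The function name is illustrative.

```rust
/// Returns true if the `created_at` values (nanoseconds, in result order)
/// are non-increasing, i.e. the feed is newest-first.
/// Illustrative helper only (assumption); not part of the tidaldb API.
fn is_newest_first(created_at: &[u64]) -> bool {
    created_at.windows(2).all(|w| w[0] >= w[1])
}
```

An empty or single-item feed trivially passes, which matches the intent of the Verify list: ordering is only meaningful between pairs.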
Step 3: Related query
RETRIEVE items SIMILAR TO @item_500
FOR USER @user_42
USING PROFILE related
FILTER unseen
LIMIT 10
Verify:
- Returns up to 10 results
- @item_500 itself does not appear in results
- Items already seen by user_42 are excluded
- Results have semantic similarity to @item_500 (embedding distance)
Step 4: Like signal
SIGNAL like item:@item_xyz user:@user_42
Where @item_xyz is an item from a specific category/creator that user_42 has not previously engaged with heavily.
Verify:
- Item signal ledger updated (like count for item_xyz increased)
- Interaction weight between user_42 and creator of item_xyz increased
- User_42's preference vector shifted toward item_xyz's embedding
- All updates visible immediately (no eventual consistency)
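"Shifted toward item_xyz's embedding" can be made measurable: the cosine similarity between the preference vector and the liked item's embedding should be higher after the like than before. The sketch below shows that check alongside a hypothetical update rule (a normalized linear blend with rate alpha). The update rule is an assumption for illustration only; this document does not specify how tidaldb updates preference vectors, and all three function names are illustrative.

```rust
/// Cosine similarity; returns 0.0 when either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Hypothetical update rule (assumption, not tidaldb's documented behavior):
/// move the preference vector a fraction `alpha` of the way toward the item
/// embedding, then re-normalize to unit length.
fn nudge_toward(pref: &[f32], item: &[f32], alpha: f32) -> Vec<f32> {
    let mut out: Vec<f32> = pref
        .iter()
        .zip(item)
        .map(|(p, e)| (1.0 - alpha) * p + alpha * e)
        .collect();
    let norm = out.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for v in &mut out {
            *v /= norm;
        }
    }
    out
}

/// The check this step needs: similarity to the liked item went up.
fn shifted_toward(before: &[f32], after: &[f32], item: &[f32]) -> bool {
    cosine(after, item) > cosine(before, item)
}
```

Only shifted_toward is needed in a test; it works with whatever update rule the engine applies, provided the preference vector can be read before and after the signal.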
Step 5: Re-execute For You after like
Re-execute the same for_you query from Step 1.
Verify:
- Results are different from Step 1 (the like changed the preference vector)
- Items similar to item_xyz's topic/embedding rank higher than before
- Items from the creator of item_xyz may appear more frequently (interaction weight increased)
Step 6: Hide signal
SIGNAL hide item:@item_999 user:@user_42
Verify:
- @item_999 is marked as hidden for user_42
- @item_999 will never appear in future queries for user_42
Step 7: Block signal
SIGNAL block user:@user_42 target_creator:@creator_77
Verify:
- Creator_77 is blocked by user_42
- All items by creator_77 are excluded from future queries for user_42
Step 8: Re-execute For You after hide and block
Re-execute the same for_you query from Step 1.
Verify:
- @item_999 does not appear in results
- No items from creator_77 appear in results
- The preference shift from the like signal (Step 4) is still reflected
- All diversity constraints still hold
- Result count is still 50 (other items fill the slots)
Persistence and Recovery
After all 8 steps, close the database and reopen it. Re-execute the for_you query and verify:
- Hidden item (@item_999) still excluded (hard negatives survive restart)
- Blocked creator (@creator_77) still excluded
- Preference vector is restored (results are similar to Step 8, not Step 1)
- Interaction weights are restored
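"Similar to Step 8, not Step 1" is hard to assert exactly, but it can be approximated by comparing set overlap. The sketch below uses Jaccard similarity over result IDs; this is one possible measure chosen for illustration, not something the ROADMAP prescribes, and the function names are illustrative.

```rust
use std::collections::HashSet;

/// Jaccard similarity of two ID lists treated as sets (1.0 when both are empty).
fn jaccard(a: &[u64], b: &[u64]) -> f64 {
    let sa: HashSet<u64> = a.iter().copied().collect();
    let sb: HashSet<u64> = b.iter().copied().collect();
    let union = sa.union(&sb).count();
    if union == 0 {
        return 1.0;
    }
    sa.intersection(&sb).count() as f64 / union as f64
}

/// True if the recovered results overlap the post-signal results (Step 8)
/// at least as much as they overlap the initial results (Step 1).
fn closer_to_post_state(recovered: &[u64], step1: &[u64], step8: &[u64]) -> bool {
    jaccard(recovered, step8) >= jaccard(recovered, step1)
}
```

Because a single like may move only a few items, the comparison uses >= rather than >; a stricter bound would need tuning against real output.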
Cold-Start User
Create a brand-new user (user_501) with no history and execute:
RETRIEVE items FOR USER @user_501
USING PROFILE for_you
FILTER unseen, unblocked
DIVERSITY max_per_creator:2
LIMIT 50
Verify:
- Returns 50 results (not zero -- cold-start handling works)
- Results are ranked by population-level signals (trending, quality, recency)
- No crash or error from missing preference vector
Performance
- Each RETRIEVE query completes in < 100ms at 10K items
- Signal write with user context completes in < 1ms
- Database open (with 500K signal replay) completes in < 30 seconds
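A single timed call can be noisy on a shared test machine, so it may help to time several runs and compare the slowest one to the bound. The helpers below are a minimal sketch using only std::time; the names timed and worst_of are illustrative, and whether to use the worst case or a percentile is a judgment call this document does not settle.

```rust
use std::time::{Duration, Instant};

/// Runs `op` once and returns its result together with the wall-clock time taken.
fn timed<T>(op: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = op();
    (out, start.elapsed())
}

/// Runs `op` `runs` times and returns the slowest observed duration
/// (Duration::ZERO when `runs` is 0).
fn worst_of(runs: usize, mut op: impl FnMut()) -> Duration {
    (0..runs)
        .map(|_| timed(&mut op).1)
        .max()
        .unwrap_or(Duration::ZERO)
}
```

In the performance test, the closure passed to worst_of would wrap the retrieve call, and the returned duration would be compared against the 100ms bound.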
Technical Design
Test File
tidal/tests/
m3_uat.rs -- Full Milestone 3 UAT integration test
Test Helpers
// === tests/m3_uat.rs ===
#![allow(clippy::unwrap_used)]
use std::collections::{HashMap, HashSet};
use std::time::Duration;
use tempfile::tempdir;
use tidaldb::db::{TidalDb, UserSignalContext};
use tidaldb::schema::{
DecaySpec, EntityId, SchemaBuilder, Timestamp, Window,
};
use tidaldb::ranking::ScoredCandidate;
const NUM_ITEMS: u64 = 10_000;
const NUM_CREATORS: u64 = 200;
const ITEMS_PER_CREATOR: u64 = NUM_ITEMS / NUM_CREATORS; // 50
const NUM_USERS: u64 = 500;
const EMBEDDING_DIM: usize = 16;
const NUM_SIGNALS: usize = 500_000;
/// Build a schema with all required signal types.
fn build_test_schema() -> tidaldb::schema::Schema {
let mut builder = SchemaBuilder::new();
let _ = builder
.signal(
"view",
tidaldb::schema::EntityKind::Item,
DecaySpec::Exponential {
half_life: Duration::from_secs(7 * 24 * 3600),
},
)
.windows(&[Window::OneHour, Window::TwentyFourHours, Window::SevenDays])
.velocity(true)
.add();
let _ = builder
.signal(
"like",
tidaldb::schema::EntityKind::Item,
DecaySpec::Exponential {
half_life: Duration::from_secs(14 * 24 * 3600),
},
)
.windows(&[Window::TwentyFourHours, Window::SevenDays])
.velocity(false)
.add();
let _ = builder
.signal(
"skip",
tidaldb::schema::EntityKind::Item,
DecaySpec::Exponential {
half_life: Duration::from_secs(24 * 3600),
},
)
.windows(&[Window::OneHour, Window::TwentyFourHours])
.velocity(false)
.add();
let _ = builder
.signal(
"share",
tidaldb::schema::EntityKind::Item,
DecaySpec::Exponential {
half_life: Duration::from_secs(3 * 24 * 3600),
},
)
.windows(&[Window::TwentyFourHours])
.velocity(true)
.add();
let _ = builder
.signal(
"completion",
tidaldb::schema::EntityKind::Item,
DecaySpec::Exponential {
half_life: Duration::from_secs(30 * 24 * 3600),
},
)
.windows(&[Window::SevenDays])
.velocity(false)
.add();
builder.build().unwrap()
}
/// Generate a deterministic embedding for an item.
///
/// Items from the same creator cluster together, and items in similar
/// categories have overlapping components. This makes the ANN retrieval
/// meaningful for testing.
fn item_embedding(item_id: u64) -> Vec<f32> {
let creator_id = item_id / ITEMS_PER_CREATOR;
let item_within_creator = item_id % ITEMS_PER_CREATOR;
let mut emb = vec![0.0f32; EMBEDDING_DIM];
// Creator component: items from the same creator share a base direction.
let creator_angle = (creator_id as f32) * std::f32::consts::TAU / (NUM_CREATORS as f32);
emb[0] = creator_angle.cos();
emb[1] = creator_angle.sin();
// Category component (assume 10 categories, cycling).
let category = (item_id % 10) as f32;
emb[2] = (category * 0.3).sin();
emb[3] = (category * 0.3).cos();
// Item-specific variation.
let item_angle = (item_within_creator as f32) * 0.1;
emb[4] = item_angle.sin();
emb[5] = item_angle.cos();
// Normalize to unit length.
let norm: f32 = emb.iter().map(|&x| x * x).sum::<f32>().sqrt();
if norm > 0.0 {
for v in &mut emb {
*v /= norm;
}
}
emb
}
/// Generate metadata for an item.
fn item_metadata(item_id: u64) -> HashMap<String, String> {
let creator_id = item_id / ITEMS_PER_CREATOR;
let categories = ["jazz", "rock", "classical", "hip-hop", "electronic",
"pop", "folk", "blues", "country", "r-and-b"];
let formats = ["video", "audio", "article"];
let category = categories[(item_id % 10) as usize];
let format = formats[(item_id % 3) as usize];
let duration = 60 + (item_id % 600); // 1min to 11min
let created_at_offset = item_id * 3600 * 1_000_000_000; // spread items over time
let mut meta = HashMap::new();
meta.insert("title".into(), format!("Item {}", item_id));
meta.insert("creator_id".into(), creator_id.to_string());
meta.insert("category".into(), category.into());
meta.insert("format".into(), format.into());
meta.insert("duration".into(), duration.to_string());
meta.insert("created_at".into(), (Timestamp::now().as_nanos() - created_at_offset).to_string());
meta
}
/// Generate the creator_id for an item.
fn creator_for_item(item_id: u64) -> u64 {
item_id / ITEMS_PER_CREATOR
}
/// Generate follows/blocks for a user.
///
/// Uses a deterministic pseudo-random function seeded by user_id.
fn user_relationships(user_id: u64) -> (Vec<u64>, Vec<u64>) {
let mut follows = vec![];
let mut blocks = vec![];
// Follow 10-30 creators (deterministic based on user_id).
let follow_count = 10 + (user_id % 21);
for i in 0..follow_count {
let creator = (user_id * 7 + i * 13) % NUM_CREATORS;
follows.push(creator);
}
follows.sort_unstable();
follows.dedup();
// Block 0-3 creators.
let block_count = user_id % 4;
for i in 0..block_count {
let creator = (user_id * 11 + i * 17 + 100) % NUM_CREATORS;
if !follows.contains(&creator) {
blocks.push(creator);
}
}
(follows, blocks)
}
/// Generate historical signal events.
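///
/// Produces NUM_SIGNALS signal events spread evenly across the 7 days before `now`.
/// Users, items, and signal types are derived deterministically from the event
/// index (no RNG), with signal types weighted toward views.
/// Each tuple is (user_id, signal_type, item_id, weight, timestamp).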
fn generate_signals(now: Timestamp) -> Vec<(u64, &'static str, u64, f64, Timestamp)> {
let mut events = Vec::with_capacity(NUM_SIGNALS);
let signal_types = ["view", "view", "view", "view", "like", "skip", "completion", "share"];
let seven_days_ns = 7 * 24 * 3600 * 1_000_000_000u64;
for i in 0..NUM_SIGNALS {
let user_id = (i as u64) % NUM_USERS;
let item_id = ((i as u64) * 7 + user_id * 13) % NUM_ITEMS;
let signal_type = signal_types[i % signal_types.len()];
let weight = if signal_type == "completion" { 0.5 + (i % 10) as f64 * 0.05 } else { 1.0 };
let offset_ns = ((i as u64) * seven_days_ns) / (NUM_SIGNALS as u64);
let ts = Timestamp::from_nanos(now.as_nanos() - seven_days_ns + offset_ns);
events.push((user_id, signal_type, item_id, weight, ts));
}
events
}
Test Implementation
#[test]
fn milestone_3_uat() {
let dir = tempdir().unwrap();
let schema = build_test_schema();
// ── Open database ────────────────────────────────────────
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema.clone())
.open()
.unwrap();
// ── Write items ──────────────────────────────────────────
for item_id in 0..NUM_ITEMS {
db.write_item(
EntityId::new(item_id),
&item_metadata(item_id),
// TODO: pass the embedding here, e.g. Some(item_embedding(item_id)), once write_item takes an embedding argument.
).unwrap();
}
// ── Write user relationships ─────────────────────────────
for user_id in 0..NUM_USERS {
let (follows, blocks) = user_relationships(user_id);
for &creator_id in &follows {
db.add_relationship(
EntityId::new(user_id),
EntityId::new(creator_id),
"follows",
).unwrap();
}
for &creator_id in &blocks {
db.signal_with_user(
"block",
EntityId::new(creator_id), // creator_id
1.0,
Timestamp::now(),
&UserSignalContext::new(EntityId::new(user_id)),
).unwrap();
}
}
// ── Write historical signals ─────────────────────────────
let now = Timestamp::now();
let signals = generate_signals(now);
for &(user_id, signal_type, item_id, weight, ts) in &signals {
let user_ctx = UserSignalContext::new(EntityId::new(user_id));
db.signal_with_user(signal_type, EntityId::new(item_id), weight, ts, &user_ctx)
.unwrap();
}
// ── Step 1: For You query ────────────────────────────────
let user_42 = EntityId::new(42);
let (follows_42, blocks_42) = user_relationships(42);
let for_you_results = db.retrieve(
"RETRIEVE items FOR USER @42 USING PROFILE for_you \
FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50"
).unwrap();
// Returns 50 results.
assert_eq!(for_you_results.len(), 50, "for_you should return 50 items");
// Sorted by score descending.
for w in for_you_results.windows(2) {
assert!(w[0].score >= w[1].score,
"results should be sorted by score: {} >= {}", w[0].score, w[1].score);
}
// No seen items. (User 42 has viewed items from historical signals.)
let seen_items_42: HashSet<u64> = signals.iter()
.filter(|(uid, sig, _, _, _)| *uid == 42 && *sig == "view")
.map(|(_, _, iid, _, _)| *iid)
.collect();
for r in &for_you_results {
assert!(!seen_items_42.contains(&r.entity_id.as_u64()),
"seen item {} should NOT appear in for_you results", r.entity_id.as_u64());
}
// No blocked creators.
for r in &for_you_results {
if let Some(cid) = r.creator_id {
assert!(!blocks_42.contains(&cid),
"item from blocked creator {} should not appear", cid);
}
}
// Max 2 per creator.
let mut creator_counts: HashMap<u64, usize> = HashMap::new();
for r in &for_you_results {
if let Some(cid) = r.creator_id {
*creator_counts.entry(cid).or_default() += 1;
}
}
for (&creator, &count) in &creator_counts {
assert!(count <= 2,
"creator {} has {} items, max 2 allowed", creator, count);
}
// Exploration budget: ~10% from unfollowed creators.
let follows_set: HashSet<u64> = follows_42.iter().copied().collect();
let exploration_count = for_you_results.iter()
.filter(|r| r.creator_id.is_some_and(|cid| !follows_set.contains(&cid)))
.count();
// Allow a wide tolerance: 2-10 out of 50 (the target is ~5, i.e. 10%).
assert!((2..=10).contains(&exploration_count),
"exploration budget should be ~5 items, got {}", exploration_count);
// ── Step 2: Following query ──────────────────────────────
let following_results = db.retrieve(
"RETRIEVE items FOR USER @42 FILTER relationship:follows \
USING PROFILE following LIMIT 50"
).unwrap();
// All items from followed creators.
for r in &following_results {
if let Some(cid) = r.creator_id {
assert!(follows_set.contains(&cid),
"following feed item from creator {} who is not followed", cid);
}
}
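// Chronological order (created_at DESC).
// item_metadata() sets created_at to roughly now minus item_id hours, so a
// larger item_id means an older item: created_at DESC is item_id ASC.
for w in following_results.windows(2) {
assert!(w[0].entity_id.as_u64() < w[1].entity_id.as_u64(),
"following feed should be newest-first: item {} appeared before item {}",
w[0].entity_id.as_u64(), w[1].entity_id.as_u64());
}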
// ── Step 3: Related query ────────────────────────────────
let source_item = EntityId::new(500);
let related_results = db.retrieve(
"RETRIEVE items SIMILAR TO @500 FOR USER @42 \
USING PROFILE related FILTER unseen LIMIT 10"
).unwrap();
assert!(related_results.len() <= 10);
// Source item is excluded.
assert!(!related_results.iter().any(|r| r.entity_id == source_item),
"source item should not appear in related results");
// Seen items excluded.
for r in &related_results {
assert!(!seen_items_42.contains(&r.entity_id.as_u64()),
"seen item {} should not appear in related results", r.entity_id.as_u64());
}
// ── Step 4: Like signal ──────────────────────────────────
// Choose an item from a category/creator that user_42 hasn't engaged with much.
let target_item = EntityId::new(7777);
let target_creator = creator_for_item(7777);
let user_42_ctx = UserSignalContext::new(user_42);
// Read pre-like state.
let pre_like_score = db.read_decay_score(target_item, "like", 0).unwrap();
db.signal_with_user("like", target_item, 1.0, Timestamp::now(), &user_42_ctx)
.unwrap();
// Item like signal updated.
let post_like_score = db.read_decay_score(target_item, "like", 0).unwrap();
assert!(post_like_score.unwrap_or(0.0) > pre_like_score.unwrap_or(0.0),
"like should increase item signal score");
// ── Step 5: Re-execute For You after like ────────────────
let for_you_after_like = db.retrieve(
"RETRIEVE items FOR USER @42 USING PROFILE for_you \
FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50"
).unwrap();
assert_eq!(for_you_after_like.len(), 50);
// Results should differ from Step 1 because the preference vector shifted.
// We cannot assert exact ordering, but the result set should not be identical.
let step1_ids: Vec<u64> = for_you_results.iter().map(|r| r.entity_id.as_u64()).collect();
let step5_ids: Vec<u64> = for_you_after_like.iter().map(|r| r.entity_id.as_u64()).collect();
// The ordered ID lists must not be identical (preference shifted).
// A single like may not change the top 50 dramatically, so this only
// requires that membership or ordering differs somewhere.
assert_ne!(step1_ids, step5_ids,
"for_you results should change after the like signal");
// ── Step 6: Hide signal ──────────────────────────────────
let hide_target = EntityId::new(999);
db.signal_with_user("hide", hide_target, 1.0, Timestamp::now(), &user_42_ctx)
.unwrap();
// Immediately excluded.
let after_hide = db.retrieve(
"RETRIEVE items FOR USER @42 USING PROFILE for_you \
FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50"
).unwrap();
assert!(!after_hide.iter().any(|r| r.entity_id == hide_target),
"hidden item 999 should never appear after hide");
// ── Step 7: Block signal ─────────────────────────────────
let block_creator = 77u64;
db.signal_with_user(
"block",
EntityId::new(block_creator), // creator_id
1.0,
Timestamp::now(),
&user_42_ctx,
).unwrap();
// ── Step 8: Re-execute For You after hide and block ──────
let for_you_final = db.retrieve(
"RETRIEVE items FOR USER @42 USING PROFILE for_you \
FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50"
).unwrap();
assert_eq!(for_you_final.len(), 50, "should still return 50 results");
// Hidden item excluded.
assert!(!for_you_final.iter().any(|r| r.entity_id == hide_target),
"hidden item 999 must not appear");
// Blocked creator excluded.
for r in &for_you_final {
if let Some(cid) = r.creator_id {
assert_ne!(cid, block_creator,
"items from blocked creator 77 must not appear");
}
}
// Diversity still holds.
let mut creator_counts_final: HashMap<u64, usize> = HashMap::new();
for r in &for_you_final {
if let Some(cid) = r.creator_id {
*creator_counts_final.entry(cid).or_default() += 1;
}
}
for (&creator, &count) in &creator_counts_final {
assert!(count <= 2,
"creator {} has {} items in final results, max 2", creator, count);
}
// ── Persistence: Close and reopen ────────────────────────
db.close().unwrap();
let db2 = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema)
.open()
.unwrap();
// Re-execute for_you: hard negatives survive restart.
let for_you_recovered = db2.retrieve(
"RETRIEVE items FOR USER @42 USING PROFILE for_you \
FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50"
).unwrap();
// Hidden item still excluded.
assert!(!for_you_recovered.iter().any(|r| r.entity_id == hide_target),
"hidden item 999 must survive restart");
// Blocked creator still excluded.
for r in &for_you_recovered {
if let Some(cid) = r.creator_id {
assert_ne!(cid, block_creator,
"blocked creator 77 must survive restart");
}
}
// Preference vector restored (results should be similar to post-like, not pre-like).
// We cannot assert exact results, but the result set should reflect learned preferences.
assert_eq!(for_you_recovered.len(), 50, "recovered query should return 50 results");
// ── Cold-start user ──────────────────────────────────────
let cold_start_results = db2.retrieve(
"RETRIEVE items FOR USER @501 USING PROFILE for_you \
FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50"
).unwrap();
assert_eq!(cold_start_results.len(), 50,
"cold-start user should get 50 results from population signals");
// Results sorted by score.
for w in cold_start_results.windows(2) {
assert!(w[0].score >= w[1].score,
"cold-start results should be sorted");
}
db2.close().unwrap();
}
Performance Test
#[test]
fn milestone_3_performance() {
let dir = tempdir().unwrap();
let schema = build_test_schema();
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema)
.open()
.unwrap();
// Minimal setup: 10K items, a few users, moderate signals.
for item_id in 0..NUM_ITEMS {
db.write_item(EntityId::new(item_id), &item_metadata(item_id)).unwrap();
}
let user_ctx = UserSignalContext::new(EntityId::new(42));
for i in 0..1_000 {
let item_id = i % NUM_ITEMS;
db.signal_with_user("view", EntityId::new(item_id), 1.0, Timestamp::now(), &user_ctx)
.unwrap();
}
// Measure RETRIEVE latency.
let start = std::time::Instant::now();
let _results = db.retrieve(
"RETRIEVE items FOR USER @42 USING PROFILE for_you \
FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50"
).unwrap();
let elapsed = start.elapsed();
assert!(elapsed < Duration::from_millis(100),
"for_you query took {:?}, should be < 100ms", elapsed);
// Measure signal write latency.
let start = std::time::Instant::now();
db.signal_with_user("like", EntityId::new(42), 1.0, Timestamp::now(), &user_ctx)
.unwrap();
let signal_elapsed = start.elapsed();
assert!(signal_elapsed < Duration::from_millis(1),
"signal_with_user took {:?}, should be < 1ms", signal_elapsed);
db.close().unwrap();
}
Critical Invariant Tests
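These are property-style tests that accompany the UAT. They verify that the critical invariant (hidden items and items from blocked creators never appear in results) holds across many repeated queries and across a restart.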
/// Hidden items never leak -- not once, not ever.
///
/// This test writes a batch of hide/block signals, then executes
/// many queries and verifies that no hidden or blocked item appears.
#[test]
fn hidden_and_blocked_never_leak() {
let db = open_ephemeral_test_db_with_items(1000, 50);
let user_ctx = UserSignalContext::new(EntityId::new(1));
// Follow some creators.
for cid in 0..20 {
db.add_relationship(EntityId::new(1), EntityId::new(cid), "follows").unwrap();
}
// Hide 50 items.
let hidden: Vec<u64> = (100..150).collect();
for &iid in &hidden {
db.signal_with_user("hide", EntityId::new(iid), 1.0, Timestamp::now(), &user_ctx)
.unwrap();
}
// Block 5 creators.
let blocked: Vec<u64> = (30..35).collect();
for &cid in &blocked {
db.signal_with_user("block", EntityId::new(cid), 1.0, Timestamp::now(), &user_ctx)
.unwrap();
}
// Execute 100 queries.
for _ in 0..100 {
let results = db.retrieve(
"RETRIEVE items FOR USER @1 USING PROFILE for_you \
FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50"
).unwrap();
for r in &results {
let iid = r.entity_id.as_u64();
assert!(!hidden.contains(&iid),
"hidden item {} leaked into results!", iid);
if let Some(cid) = r.creator_id {
assert!(!blocked.contains(&cid),
"item from blocked creator {} leaked into results!", cid);
}
}
}
}
/// Hard negatives survive a restart (clean close and reopen; this test does not simulate a crash).
#[test]
fn hard_negatives_survive_restart() {
let dir = tempdir().unwrap();
let schema = build_test_schema();
// Open, write items, hide/block, close.
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema.clone())
.open()
.unwrap();
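// 200 items span creators 0..=3 (50 items each), so the block on creator 3 below has items to exclude.
for item_id in 0..200 {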
db.write_item(EntityId::new(item_id), &item_metadata(item_id)).unwrap();
}
let user_ctx = UserSignalContext::new(EntityId::new(1));
db.signal_with_user("hide", EntityId::new(42), 1.0, Timestamp::now(), &user_ctx)
.unwrap();
db.signal_with_user("block", EntityId::new(3), 1.0, Timestamp::now(), &user_ctx)
.unwrap();
db.close().unwrap();
}
// Reopen and verify.
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema)
.open()
.unwrap();
let results = db.retrieve(
"RETRIEVE items FOR USER @1 USING PROFILE for_you \
FILTER unseen, unblocked LIMIT 50"
).unwrap();
// Item 42 must not appear (hidden).
assert!(!results.iter().any(|r| r.entity_id.as_u64() == 42),
"hidden item 42 must survive restart");
// No items from creator 3 (blocked).
for r in &results {
if let Some(cid) = r.creator_id {
assert_ne!(cid, 3, "blocked creator 3 must survive restart");
}
}
db.close().unwrap();
}
}
Test Strategy
This task IS the test. The deliverable is the test file itself. However, the test must be structured for maintainability:
Structure
- Setup helpers: deterministic data generation functions (embeddings, metadata, relationships, signals)
- Main UAT test: milestone_3_uat() exercises the full 8-step scenario from the ROADMAP
- Performance test: milestone_3_performance() verifies latency bounds
- Invariant tests: hidden_and_blocked_never_leak() and hard_negatives_survive_restart() verify critical safety properties
- Cold-start test: embedded in the main UAT, verifies cold-start user handling
What "Pass" Means
The test passes when ALL of the following are true:
- for_you returns 50 personalized, filtered, diversity-constrained results
- following returns only items from followed creators in chronological order
- related returns semantically similar items excluding the source and seen items
- A like signal updates item signals, interaction weights, and preference vector immediately
- Post-like for_you reflects the preference shift
- hide permanently excludes the item for the user
- block permanently excludes all items from the creator for the user
- Hard negatives survive database close and reopen
- Cold-start users get population-level results
- No hidden or blocked content ever leaks into results (0 leaks across 100 queries)
Acceptance Criteria
- milestone_3_uat test passes (full 8-step scenario)
- Step 1: for_you returns 50 personalized results with diversity constraints
- Step 1: no seen items, no blocked creator items, no hidden items in results
- Step 1: max 2 per creator enforced
- Step 1: exploration budget produces ~5 items from unfollowed creators
- Step 2: following feed contains only items from followed creators
- Step 2: following feed is in chronological order
- Step 3: related query excludes source item and seen items
- Step 4: like signal updates item signals, interaction weights, and preference vector
- Step 5: post-like for_you reflects preference shift
- Step 6: hidden item excluded immediately
- Step 7: blocked creator's items excluded immediately
- Step 8: final for_you excludes hidden item and blocked creator, diversity holds
- Persistence: hard negatives survive close and reopen
- Persistence: preference vector restored on reopen
- Cold-start user gets 50 results from population-level signals
- hidden_and_blocked_never_leak test passes (0 leaks across 100 queries)
- hard_negatives_survive_restart test passes
- milestone_3_performance test passes (retrieve < 100ms, signal < 1ms)
- All tests run in < 60 seconds total (not per-test)
- cargo clippy -- -D warnings passes
- No #[ignore] attributes on UAT tests
- Test uses deterministic data generation (reproducible across runs)
Research References
- ROADMAP.md -- M3 UAT Scenario (the authoritative scenario this test implements)
- VISION.md -- End-state query, design principles
- USE_CASES.md -- UC-01 (For You), UC-04 (Following), UC-05 (Related)
- SEQUENCE.md -- Core Feedback Loop, For You Feed, Following Feed
Implementation Notes
- The test uses 16-dimensional embeddings for speed. At 10K items, 16 dimensions is sufficient to verify ANN retrieval and cosine similarity behavior. The full 1536 dimensions would be used in production benchmarks, not UAT.
- Deterministic data generation ensures the test is reproducible. There are no random seeds and no system time dependencies in data setup, except for the "now" timestamp that anchors signal timestamps and item created_at values. This is acceptable because signal decay and chronological ordering depend only on relative time.
- The generate_signals function distributes signals across users and items with a bias toward views (4x more views than likes/skips). This produces realistic signal distributions where most items have views but fewer have likes.
- The embedding generation clusters items by creator (shared base direction) and category (shared component). This ensures ANN retrieval produces meaningful clusters for testing personalization.
- The exploration budget assertion uses a wide tolerance (2--10 out of 50) because exploration candidates are selected with some randomness. The exact count depends on the corpus and the user's follow set.
- The "results differ after like" assertion is deliberately loose. A single like to a 16-dim embedding may not dramatically change the top-50, especially if the user already has a strong preference vector from 500K signals. The assertion checks that the result set is not bit-for-bit identical, not that it is dramatically different.
- The performance test uses a smaller signal count (1K instead of 500K) to keep the test fast. The latency assertion (< 100ms) is generous for a test environment. Production benchmarks with Criterion would use tighter bounds.
- The open_ephemeral_test_db_with_items helper creates an in-memory TidalDb with the given number of items and creators. This helper must be defined in a shared test utilities module.
- This test file imports from the tidaldb crate's public API only. It does not use pub(crate) internals. If the test cannot be written against the public API, the public API is incomplete and must be extended as part of this task.