# Task 03: M2 UAT Integration Test

## Context

**Milestone:** 2 -- Ranked Retrieval
**Phase:** m2p5 -- Query Parser and RETRIEVE Executor
**Depends On:** Task 01 (Retrieve, Results, QueryError types), Task 02 (RetrieveExecutor, TidalDb::retrieve())
**Blocks:** Milestone 3 (personalized ranking)
**Complexity:** M

## Objective

Deliver the Milestone 2 User Acceptance Test as a Rust integration test in `tidal/tests/m2_uat.rs`. This test exercises the complete M2 scenario from the roadmap: open a database with a full schema (5 signal types, 6 ranking profiles), write 10K items with metadata and embeddings, write 10K signal events, execute all 6 profile queries verifying ordering and filter correctness, write a signal burst and verify rank change, and re-verify after shutdown and reopen.

This is the milestone gate. If it passes, Milestone 2 is done. The test proves that "a single query retrieves, scores, and ranks content using live signals" -- the M2 thesis.

## Requirements

- Full M2 UAT scenario from ROADMAP.md implemented as `tidal/tests/m2_uat.rs`
- 10K items with metadata (category, format, creator_id) and 64-dim embeddings
- 10K signal events spanning 7 days across 5 signal types
- All 6 RETRIEVE queries executed and verified:
  1. `trending` with `max_per_creator:1` diversity -- 25 results, creator-diverse, score-sorted
  2. `hot` with `category:jazz` filter -- only jazz items, score-sorted
  3. `new` -- created_at descending
  4. `top_week` -- signal-based ordering within 7d window
  5. `hidden_gems` -- quality/reach ratio ordering
  6. `controversial` -- dual-signal ranking
- Signal burst for item #500, re-query trending, verify rank change
- Shutdown and reopen, re-verify all queries
- All tests use `tempfile::TempDir` for isolation
- Tests must pass `cargo test --test m2_uat`
- Deterministic test data (fixed timestamps, reproducible event sequences)
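
The determinism requirement can be met without a random number generator. The sketch below shows one way to do it, assuming a prime stride over the ID space and evenly spaced timestamps measured back from a fixed base time. The helper names `stride_permutation` and `evenly_spaced_timestamps` are hypothetical and are not part of tidaldb.

```rust
/// Hypothetical helper: visit `n` slots in a shuffled but reproducible order.
/// When `stride` is coprime with `n`, every slot is visited exactly once.
fn stride_permutation(n: u64, stride: u64) -> Vec<u64> {
    (0..n).map(|i| (i * stride) % n).collect()
}

/// Hypothetical helper: `count` timestamps evenly spaced over the window of
/// `span_nanos` that ends at `base_nanos`, emitted in stride-shuffled order.
fn evenly_spaced_timestamps(count: u64, base_nanos: u64, span_nanos: u64) -> Vec<u64> {
    // Width of one slot; `max(1)` only guards the division when count is 0.
    let step = span_nanos / count.max(1);
    let start = base_nanos.saturating_sub(span_nanos);
    stride_permutation(count, 104_729)
        .into_iter()
        .map(|slot| start + slot * step)
        .collect()
}
```

Calling `stride_permutation(10_000, 7_919)` yields each of the 10,000 slots exactly once, because 7,919 is prime and shares no factor with 10,000. Passing a constant `base_nanos` instead of the wall clock makes every run produce the same event sequence.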

## Technical Design

### Module Structure

```
tidal/tests/
  m2_uat.rs -- Full M2 UAT integration test
```

### Test Implementation

```rust
// === tidal/tests/m2_uat.rs ===

use std::collections::HashMap;
use std::time::Duration;
use tempfile::TempDir;

use tidaldb::query::retrieve::Retrieve;
use tidaldb::ranking::diversity::DiversityConstraints;
use tidaldb::schema::*;
use tidaldb::storage::indexes::filter::FilterExpr;
use tidaldb::{Config, TidalDB};

// ============================================================
// Test Helpers
// ============================================================

/// Build the M2 schema: 5 signal types, 6 ranking profiles, 64-dim embeddings.
fn m2_schema() -> Schema {
    let mut builder = SchemaBuilder::new();

    // Embedding slot for items: 64-dim (small for test speed)
    builder.embedding_slot("default", EntityKind::Item, 64);

    // Signal types
    builder
        .signal(
            "view",
            EntityKind::Item,
            DecaySpec::Exponential {
                half_life: Duration::from_secs(7 * 24 * 3600), // 7 days
            },
        )
        .windows(&[
            Window::OneHour,
            Window::TwentyFourHours,
            Window::SevenDays,
            Window::AllTime,
        ])
        .velocity(true)
        .add();

    builder
        .signal(
            "like",
            EntityKind::Item,
            DecaySpec::Exponential {
                half_life: Duration::from_secs(14 * 24 * 3600), // 14 days
            },
        )
        .windows(&[
            Window::TwentyFourHours,
            Window::SevenDays,
            Window::AllTime,
        ])
        .velocity(true)
        .add();

    builder
        .signal(
            "skip",
            EntityKind::Item,
            DecaySpec::Exponential {
                half_life: Duration::from_secs(24 * 3600), // 1 day
            },
        )
        .windows(&[Window::OneHour, Window::TwentyFourHours])
        .velocity(false)
        .add();

    builder
        .signal(
            "share",
            EntityKind::Item,
            DecaySpec::Exponential {
                half_life: Duration::from_secs(3 * 24 * 3600), // 3 days
            },
        )
        .windows(&[
            Window::OneHour,
            Window::TwentyFourHours,
            Window::SevenDays,
        ])
        .velocity(true)
        .add();

    builder
        .signal(
            "completion",
            EntityKind::Item,
            DecaySpec::Exponential {
                half_life: Duration::from_secs(30 * 24 * 3600), // 30 days
            },
        )
        .windows(&[Window::SevenDays, Window::AllTime])
        .velocity(false)
        .add();

    // Built-in profiles are auto-registered: trending, hot, new, top_week,
    // hidden_gems, controversial, most_viewed, most_liked, shuffle, etc.

    builder.build().unwrap()
}

/// Categories used for test items. 10 distinct values.
const CATEGORIES: &[&str] = &[
    "jazz", "rock", "classical", "electronic", "hip_hop",
    "country", "blues", "folk", "metal", "pop",
];

/// Formats used for test items. 4 distinct values.
const FORMATS: &[&str] = &["video", "audio", "article", "short"];

/// Generate deterministic item metadata.
///
/// Returns (category, format, creator_id, created_at_offset_nanos).
fn item_metadata(
    item_index: u64,
) -> (String, String, EntityId, u64) {
    let category = CATEGORIES[(item_index as usize) % CATEGORIES.len()].to_string();
    let format = FORMATS[(item_index as usize) % FORMATS.len()].to_string();
    // 200 creators, distributed round-robin
    let creator_id = EntityId::new((item_index % 200) + 1);
    // Spread creation times across 30 days. Callers subtract the offset from
    // "now", so index 0 is the newest item and the highest index is the oldest.
    let thirty_days_nanos = 30u64 * 24 * 3600 * 1_000_000_000;
    // Divide before multiplying: `item_index * thirty_days_nanos` overflows
    // u64 once item_index exceeds roughly 7_100.
    let created_at_offset = item_index * (thirty_days_nanos / 10_000);
    (category, format, creator_id, created_at_offset)
}

/// Generate a deterministic 64-dim embedding for an item.
///
/// Uses a simple deterministic formula based on the item index.
/// The embeddings are normalized to unit length for cosine similarity.
fn generate_embedding(item_index: u64, dimensions: usize) -> Vec<f32> {
    // Deterministic pseudo-random using item index and dimension
    let mut vec: Vec<f32> = (0..dimensions)
        .map(|d| (item_index as f32 * 0.7 + d as f32 * 1.3).sin())
        .collect();

    // L2 normalize
    let norm: f32 = vec.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for v in &mut vec {
            *v /= norm;
        }
    }

    vec
}

/// Generate deterministic signal events spanning a time range.
///
/// Distributes events across entities and signal types with a prime
/// stride for reproducible but varied patterns. When `count` equals
/// `entity_count` and the stride is coprime with it (as in the UAT),
/// every entity receives exactly one event.
fn generate_signal_events(
    count: usize,
    entity_count: u64,
    base_time_nanos: u64,
    span_nanos: u64,
) -> Vec<(EntityId, &'static str, f64, u64)> {
    let signal_types = ["view", "like", "skip", "share", "completion"];
    let mut events = Vec::with_capacity(count);

    for i in 0..count {
        // Entity distribution: walk the ID space with a prime stride
        let entity_raw = ((i as u64) * 7919 + 1) % entity_count;
        let entity_id = EntityId::new(entity_raw + 1);

        // Signal type: round-robin
        let signal = signal_types[i % signal_types.len()];

        // Weight: always 1.0 for count-based signals
        let weight = 1.0;

        // Timestamp: spread evenly across the time span. Slots are visited
        // with a prime stride so timestamps are not monotonic in `i`.
        let slot = ((i as u64) * 104_729) % (count as u64);
        let offset = slot * (span_nanos / count as u64);
        let ts = base_time_nanos.saturating_sub(span_nanos) + offset;

        events.push((entity_id, signal, weight, ts));
    }

    events
}

/// Count unique creators in a result set.
fn creator_counts(
    results: &[tidaldb::query::retrieve::RetrieveResult],
    db: &TidalDB,
) -> HashMap<EntityId, usize> {
    let mut counts: HashMap<EntityId, usize> = HashMap::new();
    for result in results {
        if let Ok(Some(meta)) = db.get_item_metadata(result.entity_id) {
            if let Some(creator_id) = meta.creator_id {
                *counts.entry(creator_id).or_insert(0) += 1;
            }
        }
    }
    counts
}

/// Get the category of an item from the database.
fn item_category(db: &TidalDB, entity_id: EntityId) -> Option<String> {
    db.get_item_metadata(entity_id)
        .ok()
        .flatten()
        .and_then(|m| m.category.clone())
}

// ============================================================
// THE M2 UAT TEST
// ============================================================
//
// This is the definitive acceptance test for Milestone 2.
// It matches the UAT scenario in ROADMAP.md.
#[test]
fn milestone_2_uat() {
    let dir = TempDir::new().unwrap();
    let schema = m2_schema();

    let db = TidalDB::open(Config {
        data_dir: dir.path().to_owned(),
        schema: schema.clone(),
    })
    .unwrap();

    // ============================================================
    // Setup: Write 10K items with metadata and embeddings
    // ============================================================

    let now = Timestamp::now();
    let now_nanos = now.as_nanos();

    for i in 0..10_000u64 {
        let (category, format, creator_id, created_at_offset) = item_metadata(i);
        let embedding = generate_embedding(i, 64);
        let created_at_nanos = now_nanos.saturating_sub(created_at_offset);

        db.write_item_with_metadata(
            EntityId::new(i + 1),
            &category,
            &format,
            creator_id,
            Timestamp::from_nanos(created_at_nanos),
            Some(&embedding),
        )
        .unwrap();
    }

    // Verify item count
    assert_eq!(db.item_count().unwrap(), 10_000);

    // ============================================================
    // Setup: Write 10K signal events spanning 7 days
    // ============================================================

    let seven_days_nanos = 7u64 * 24 * 3600 * 1_000_000_000;
    let events = generate_signal_events(10_000, 10_000, now_nanos, seven_days_nanos);

    for (entity_id, signal_type, weight, ts_nanos) in &events {
        db.signal(signal_type, *entity_id, *weight, Timestamp::from_nanos(*ts_nanos))
            .unwrap();
    }

    // ============================================================
    // Query 1: Trending with diversity
    // ============================================================
    // RETRIEVE items USING PROFILE trending DIVERSITY max_per_creator:1 LIMIT 25

    let trending_query = Retrieve::builder()
        .entity(EntityKind::Item)
        .profile("trending")
        .diversity(DiversityConstraints::new().max_per_creator(1))
        .limit(25)
        .build()
        .unwrap();

    let trending_results = db.retrieve(&trending_query).unwrap();

    // Verify: got results (up to 25)
    assert!(
        !trending_results.is_empty(),
        "trending query should return results"
    );
    assert!(
        trending_results.len() <= 25,
        "trending query should return at most 25 results, got {}",
        trending_results.len()
    );

    // Verify: scores are sorted descending
    for pair in trending_results.items.windows(2) {
        assert!(
            pair[0].score >= pair[1].score,
            "trending results should be sorted descending: {} >= {} (ranks {} and {})",
            pair[0].score,
            pair[1].score,
            pair[0].rank,
            pair[1].rank,
        );
    }

    // Verify: creator diversity (max 1 per creator)
    let creators = creator_counts(&trending_results.items, &db);
    for (creator_id, count) in &creators {
        assert!(
            *count <= 1,
            "max_per_creator:1 violated: creator {} appears {} times",
            creator_id,
            count,
        );
    }

    // Verify: ranks are 1-based and sequential
    for (i, item) in trending_results.items.iter().enumerate() {
        assert_eq!(
            item.rank,
            i + 1,
            "rank should be 1-based sequential, got {} at position {}",
            item.rank,
            i,
        );
    }

    // ============================================================
    // Query 2: Hot with category filter
    // ============================================================
    // RETRIEVE items FILTER category:jazz USING PROFILE hot LIMIT 20

    let jazz_query = Retrieve::builder()
        .entity(EntityKind::Item)
        .profile("hot")
        .filter(FilterExpr::eq("category", "jazz"))
        .limit(20)
        .build()
        .unwrap();

    let jazz_results = db.retrieve(&jazz_query).unwrap();

    // Verify: only jazz items returned
    for item in &jazz_results.items {
        let category = item_category(&db, item.entity_id);
        assert_eq!(
            category.as_deref(),
            Some("jazz"),
            "hot+jazz query returned non-jazz item: entity={}, category={:?}",
            item.entity_id,
            category,
        );
    }

    // Verify: scores are sorted descending
    for pair in jazz_results.items.windows(2) {
        assert!(
            pair[0].score >= pair[1].score,
            "jazz results should be sorted descending: {} >= {}",
            pair[0].score,
            pair[1].score,
        );
    }

    // ============================================================
    // Query 3: New (created_at descending)
    // ============================================================
    // RETRIEVE items USING PROFILE new LIMIT 20

    let new_query = Retrieve::builder()
        .entity(EntityKind::Item)
        .profile("new")
        .limit(20)
        .build()
        .unwrap();

    let new_results = db.retrieve(&new_query).unwrap();

    assert!(
        !new_results.is_empty(),
        "new query should return results"
    );
    assert!(
        new_results.len() <= 20,
        "new query should return at most 20 results"
    );

    // Verify: scores are sorted descending (new profile uses created_at as score)
    for pair in new_results.items.windows(2) {
        assert!(
            pair[0].score >= pair[1].score,
            "new results should be sorted descending: {} >= {} (entities {} and {})",
            pair[0].score,
            pair[1].score,
            pair[0].entity_id,
            pair[1].entity_id,
        );
    }

    // ============================================================
    // Query 4: Top week (signal-based ordering within 7d window)
    // ============================================================
    // RETRIEVE items USING PROFILE top_week LIMIT 20

    let top_week_query = Retrieve::builder()
        .entity(EntityKind::Item)
        .profile("top_week")
        .limit(20)
        .build()
        .unwrap();

    let top_week_results = db.retrieve(&top_week_query).unwrap();

    assert!(
        !top_week_results.is_empty(),
        "top_week query should return results"
    );

    // Verify: scores are sorted descending
    for pair in top_week_results.items.windows(2) {
        assert!(
            pair[0].score >= pair[1].score,
            "top_week results should be sorted descending: {} >= {}",
            pair[0].score,
            pair[1].score,
        );
    }

    // ============================================================
    // Query 5: Hidden gems
    // ============================================================
    // ROADMAP UAT: RETRIEVE items USING PROFILE hidden_gems FILTER min_completion_rate:0.7 LIMIT 10
    //
    // M2 limitation: `min_completion_rate` is a signal-derived filter (completion
    // rate = completion_count / view_count). The m2p2 filter engine supports
    // metadata field filters (BitmapIndex, RangeIndex) but not computed signal
    // ratios. Signal-derived predicates are an M3+ extension to the filter engine.
    // For M2, the hidden_gems query runs without the completion rate filter;
    // all items are candidates and the hidden_gems scoring formula naturally
    // surfaces items with high completion-to-view ratios.

    let hidden_gems_query = Retrieve::builder()
        .entity(EntityKind::Item)
        .profile("hidden_gems")
        // TODO M3: add .filter(FilterExpr::signal_ratio("completion", "view", 0.7))
        // once signal-derived predicates are supported in the filter engine.
        .limit(10)
        .build()
        .unwrap();

    let hidden_gems_results = db.retrieve(&hidden_gems_query).unwrap();

    assert!(
        !hidden_gems_results.is_empty(),
        "hidden_gems query should return results"
    );

    // Verify: scores are sorted descending
    for pair in hidden_gems_results.items.windows(2) {
        assert!(
            pair[0].score >= pair[1].score,
            "hidden_gems results should be sorted descending: {} >= {}",
            pair[0].score,
            pair[1].score,
        );
    }

    // ============================================================
    // Query 6: Controversial (dual-signal ranking)
    // ============================================================
    // RETRIEVE items USING PROFILE controversial LIMIT 10

    let controversial_query = Retrieve::builder()
        .entity(EntityKind::Item)
        .profile("controversial")
        .limit(10)
        .build()
        .unwrap();

    let controversial_results = db.retrieve(&controversial_query).unwrap();

    assert!(
        !controversial_results.is_empty(),
        "controversial query should return results"
    );

    // Verify: scores are sorted descending
    for pair in controversial_results.items.windows(2) {
        assert!(
            pair[0].score >= pair[1].score,
            "controversial results should be sorted descending: {} >= {}",
            pair[0].score,
            pair[1].score,
        );
    }

    // ============================================================
    // Signal Burst: Write 100 "share" signals for item #500
    // ============================================================

    // Record pre-burst trending results
    let pre_burst_trending = Retrieve::builder()
        .entity(EntityKind::Item)
        .profile("trending")
        .limit(50)
        .build()
        .unwrap();
    let pre_burst_results = db.retrieve(&pre_burst_trending).unwrap();
    let pre_burst_rank = pre_burst_results
        .items
        .iter()
        .position(|r| r.entity_id == EntityId::new(500));

    // Write 100 "share" signals for item #500 at the current time
    let burst_time = Timestamp::now();
    for _ in 0..100 {
        db.signal("share", EntityId::new(500), 1.0, burst_time)
            .unwrap();
    }

    // Re-execute trending query
    let post_burst_results = db.retrieve(&pre_burst_trending).unwrap();
    let post_burst_rank = post_burst_results
        .items
        .iter()
        .position(|r| r.entity_id == EntityId::new(500));

    // Verify: item #500 should be present (or rose from absent to present)
    // and its rank should have improved (or appeared)
    match (pre_burst_rank, post_burst_rank) {
        (None, Some(rank)) => {
            // Item was not in the top 50 before, now it is -- signal burst worked
            assert!(
                rank < 50,
                "item #500 should appear in top 50 after burst, found at position {}",
                rank
            );
        }
        (Some(pre), Some(post)) => {
            // Item was in top 50 and should have moved up
            assert!(
                post <= pre,
                "item #500 should rank higher after burst: pre={}, post={}",
                pre,
                post
            );
        }
        (None, None) => {
            // If item #500 still does not appear in top 50 after 100 share signals,
            // check that it at least has a higher score than before.
            // This can happen if the item is in a crowded ranking.
            // We verify signal write worked by reading the signal directly.
            let share_count = db
                .read_windowed_count(EntityId::new(500), "share", Window::AllTime)
                .unwrap();
            assert!(
                share_count >= 100,
                "item #500 should have at least 100 shares after burst, got {}",
                share_count
            );
        }
        (Some(_), None) => {
            panic!(
                "item #500 was in trending before burst but disappeared after -- this is wrong"
            );
        }
    }

    // ============================================================
    // Crash Recovery: Shutdown and reopen
    // ============================================================

    db.shutdown().unwrap();

    let db2 = TidalDB::open(Config {
        data_dir: dir.path().to_owned(),
        schema: schema.clone(),
    })
    .unwrap();

    // Re-verify: items survived
    assert_eq!(
        db2.item_count().unwrap(),
        10_000,
        "item count should survive restart"
    );

    // Re-verify: trending query still works
    let recovered_trending = Retrieve::builder()
        .entity(EntityKind::Item)
        .profile("trending")
        .limit(25)
        .build()
        .unwrap();
    let recovered_results = db2.retrieve(&recovered_trending).unwrap();
    assert!(
        !recovered_results.is_empty(),
        "trending query should work after restart"
    );

    // Re-verify: scores are sorted descending after restart
    for pair in recovered_results.items.windows(2) {
        assert!(
            pair[0].score >= pair[1].score,
            "trending results after restart should be sorted: {} >= {}",
            pair[0].score,
            pair[1].score,
        );
    }

    // Re-verify: hot+jazz filter still works
    let recovered_jazz = Retrieve::builder()
        .entity(EntityKind::Item)
        .profile("hot")
        .filter(FilterExpr::eq("category", "jazz"))
        .limit(20)
        .build()
        .unwrap();
    let recovered_jazz_results = db2.retrieve(&recovered_jazz).unwrap();
    for item in &recovered_jazz_results.items {
        let category = item_category(&db2, item.entity_id);
        assert_eq!(
            category.as_deref(),
            Some("jazz"),
            "jazz filter should still work after restart"
        );
    }

    // Re-verify: signal burst for item #500 survived
    let recovered_share_count = db2
        .read_windowed_count(EntityId::new(500), "share", Window::AllTime)
        .unwrap();
    assert!(
        recovered_share_count >= 100,
        "share signals for item #500 should survive restart, got {}",
        recovered_share_count
    );

    db2.shutdown().unwrap();
}

// ============================================================
// SIGNAL SNAPSHOT TRANSPARENCY TEST
// ============================================================
//
// Verifies that RETRIEVE results include signal snapshots
// for debugging and ranking transparency.
#[test]
fn retrieve_results_include_signal_snapshots() {
    let dir = TempDir::new().unwrap();
    let schema = m2_schema();

    let db = TidalDB::open(Config {
        data_dir: dir.path().to_owned(),
        schema,
    })
    .unwrap();

    // Write 100 items with embeddings
    for i in 0..100u64 {
        let (category, format, creator_id, created_at_offset) = item_metadata(i);
        let embedding = generate_embedding(i, 64);
        let now = Timestamp::now();

        db.write_item_with_metadata(
            EntityId::new(i + 1),
            &category,
            &format,
            creator_id,
            Timestamp::from_nanos(now.as_nanos().saturating_sub(created_at_offset)),
            Some(&embedding),
        )
        .unwrap();
    }

    // Write enough signals so profiles have data to score with
    let now = Timestamp::now();
    for i in 0..500u64 {
        let entity = EntityId::new((i % 100) + 1);
        db.signal("view", entity, 1.0, now).unwrap();
        if i % 3 == 0 {
            db.signal("like", entity, 1.0, now).unwrap();
        }
    }

    // Query with hot profile
    let query = Retrieve::builder()
        .profile("hot")
        .limit(10)
        .build()
        .unwrap();

    let results = db.retrieve(&query).unwrap();

    // At least some results should have signal snapshots
    let has_snapshots = results
        .items
        .iter()
        .any(|r| !r.signal_snapshot.is_empty());
    assert!(
        has_snapshots,
        "at least some results should include signal snapshots"
    );

    // Signal snapshots should be capped at 10
    for item in &results.items {
        assert!(
            item.signal_snapshot.len() <= 10,
            "signal snapshot should be capped at 10, got {}",
            item.signal_snapshot.len()
        );
    }

    db.shutdown().unwrap();
}

// ============================================================
// EXCLUDE LIST TEST
// ============================================================
//
// Verifies that EXCLUDE IDs are removed from results.
#[test]
fn retrieve_excludes_specified_ids() {
    let dir = TempDir::new().unwrap();
    let schema = m2_schema();

    let db = TidalDB::open(Config {
        data_dir: dir.path().to_owned(),
        schema,
    })
    .unwrap();

    // Write 50 items
    for i in 0..50u64 {
        let (category, format, creator_id, created_at_offset) = item_metadata(i);
        let embedding = generate_embedding(i, 64);
        let now = Timestamp::now();

        db.write_item_with_metadata(
            EntityId::new(i + 1),
            &category,
            &format,
            creator_id,
            Timestamp::from_nanos(now.as_nanos().saturating_sub(created_at_offset)),
            Some(&embedding),
        )
        .unwrap();
    }

    // Write signals
    let now = Timestamp::now();
    for i in 0..200u64 {
        let entity = EntityId::new((i % 50) + 1);
        db.signal("view", entity, 1.0, now).unwrap();
    }

    // Query without excludes
    let query_no_exclude = Retrieve::builder()
        .profile("hot")
        .limit(20)
        .build()
        .unwrap();
    let results_no_exclude = db.retrieve(&query_no_exclude).unwrap();

    // Pick the top 3 IDs to exclude
    let exclude_ids: Vec<EntityId> = results_no_exclude
        .items
        .iter()
        .take(3)
        .map(|r| r.entity_id)
        .collect();

    // Query with excludes
    let query_with_exclude = Retrieve::builder()
        .profile("hot")
        .exclude_ids(exclude_ids.clone())
        .limit(20)
        .build()
        .unwrap();
    let results_with_exclude = db.retrieve(&query_with_exclude).unwrap();

    // Verify: excluded IDs are not in results
    for item in &results_with_exclude.items {
        assert!(
            !exclude_ids.contains(&item.entity_id),
            "excluded entity {} should not appear in results",
            item.entity_id,
        );
    }

    db.shutdown().unwrap();
}

// ============================================================
// PAGINATION TEST
// ============================================================
//
// Verifies that offset-based cursor pagination works correctly
// in the absence of concurrent writes. Note: offset cursors are
// NOT stable under concurrent signal writes (the ranked list can
// shift between pages). This test only covers the non-concurrent
// case. See Cursor doc in task-01 for the full limitation note.
#[test]
fn retrieve_pagination_via_cursor() {
    let dir = TempDir::new().unwrap();
    let schema = m2_schema();

    let db = TidalDB::open(Config {
        data_dir: dir.path().to_owned(),
        schema,
    })
    .unwrap();

    // Write 100 items
    for i in 0..100u64 {
        let (category, format, creator_id, created_at_offset) = item_metadata(i);
        let embedding = generate_embedding(i, 64);
        let now = Timestamp::now();

        db.write_item_with_metadata(
            EntityId::new(i + 1),
            &category,
            &format,
            creator_id,
            Timestamp::from_nanos(now.as_nanos().saturating_sub(created_at_offset)),
            Some(&embedding),
        )
        .unwrap();
    }

    // Write signals
    let now = Timestamp::now();
    for i in 0..500u64 {
        let entity = EntityId::new((i % 100) + 1);
        db.signal("view", entity, 1.0, now).unwrap();
    }

    // Page 1: first 10 results
    let page1_query = Retrieve::builder()
        .profile("hot")
        .limit(10)
        .build()
        .unwrap();
    let page1 = db.retrieve(&page1_query).unwrap();

    assert_eq!(page1.len(), 10, "page 1 should have 10 results");
    assert!(
        page1.next_cursor.is_some(),
        "page 1 should have a next cursor"
    );

    // Page 2: next 10 results using cursor
    let page2_query = Retrieve::builder()
        .profile("hot")
        .limit(10)
        .cursor(page1.next_cursor.unwrap())
        .build()
        .unwrap();
    let page2 = db.retrieve(&page2_query).unwrap();

    assert_eq!(page2.len(), 10, "page 2 should have 10 results");

    // Verify: no overlap between pages
    let page1_ids: Vec<EntityId> = page1.items.iter().map(|r| r.entity_id).collect();
    let page2_ids: Vec<EntityId> = page2.items.iter().map(|r| r.entity_id).collect();
    for id in &page2_ids {
        assert!(
            !page1_ids.contains(id),
            "entity {} appears on both page 1 and page 2",
            id,
        );
    }

    // Verify: page 2 ranks continue from page 1
    assert_eq!(page2.items[0].rank, 11, "page 2 should start at rank 11");

    db.shutdown().unwrap();
}

// ============================================================
// QUERY VALIDATION ERROR TEST
// ============================================================
//
// Verifies that invalid queries produce clear errors.
#[test]
fn retrieve_rejects_invalid_queries() {
    let dir = TempDir::new().unwrap();
    let schema = m2_schema();

    let db = TidalDB::open(Config {
        data_dir: dir.path().to_owned(),
        schema,
    })
    .unwrap();

    // Unknown profile
    let unknown_profile = Retrieve::builder()
        .profile("nonexistent_profile")
        .limit(10)
        .build()
        .unwrap();
    let result = db.retrieve(&unknown_profile);
    assert!(
        matches!(result, Err(tidaldb::query::retrieve::QueryError::ProfileNotFound(_))),
        "unknown profile should return ProfileNotFound, got: {:?}",
        result,
    );

    // Limit = 0 (caught at builder level)
    let result = Retrieve::builder().profile("new").limit(0).build();
    assert!(
        matches!(result, Err(tidaldb::query::retrieve::QueryError::InvalidLimit { .. })),
        "limit=0 should return InvalidLimit"
    );

    // Limit > 500 (caught at builder level)
    let result = Retrieve::builder().profile("new").limit(501).build();
    assert!(
        matches!(result, Err(tidaldb::query::retrieve::QueryError::InvalidLimit { .. })),
        "limit=501 should return InvalidLimit"
    );

    db.shutdown().unwrap();
}

// ============================================================
|
|
// DETERMINISTIC RESULTS TEST
|
|
// ============================================================
|
|
//
|
|
// Verifies INV-QUERY-1: same query with same state produces
|
|
// identical results.
|
|
#[test]
|
|
fn retrieve_deterministic_results() {
|
|
let dir = TempDir::new().unwrap();
|
|
let schema = m2_schema();
|
|
|
|
let db = TidalDB::open(Config {
|
|
data_dir: dir.path().to_owned(),
|
|
schema,
|
|
})
|
|
.unwrap();
|
|
|
|
// Write 100 items with signals
|
|
let now = Timestamp::now();
|
|
for i in 0..100u64 {
|
|
let (category, format, creator_id, created_at_offset) = item_metadata(i);
|
|
let embedding = generate_embedding(i, 64);
|
|
db.write_item_with_metadata(
|
|
EntityId::new(i + 1),
|
|
&category,
|
|
&format,
|
|
creator_id,
|
|
Timestamp::from_nanos(now.as_nanos().saturating_sub(created_at_offset)),
|
|
Some(&embedding),
|
|
)
|
|
.unwrap();
|
|
|
|
db.signal("view", EntityId::new(i + 1), 1.0, now).unwrap();
|
|
if i % 3 == 0 {
|
|
db.signal("like", EntityId::new(i + 1), 1.0, now)
|
|
.unwrap();
|
|
}
|
|
}
|
|
|
|
let query = Retrieve::builder()
|
|
.profile("hot")
|
|
.limit(20)
|
|
.build()
|
|
.unwrap();
|
|
|
|
let results1 = db.retrieve(&query).unwrap();
|
|
let results2 = db.retrieve(&query).unwrap();
|
|
|
|
assert_eq!(results1.len(), results2.len(), "result counts must match");
|
|
|
|
for (r1, r2) in results1.items.iter().zip(results2.items.iter()) {
|
|
assert_eq!(
|
|
r1.entity_id, r2.entity_id,
|
|
"entity IDs must match at rank {}",
|
|
r1.rank,
|
|
);
|
|
assert!(
|
|
(r1.score - r2.score).abs() < f64::EPSILON,
|
|
"scores must be identical for entity {} at rank {}: {} vs {}",
|
|
r1.entity_id,
|
|
r1.rank,
|
|
r1.score,
|
|
r2.score,
|
|
);
|
|
}
|
|
|
|
db.shutdown().unwrap();
|
|
}
|
|
```
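
The determinism test above compares scores with an absolute tolerance of `f64::EPSILON`, which is the gap between 1.0 and the next representable `f64`, so its strictness depends on the magnitude of the scores. As a minimal sketch, assuming result rows expose `entity_id`, `rank`, and `score` as in the test, the comparison could be factored into a helper that takes the tolerance explicitly. The `Row` struct and `first_divergence` function are hypothetical names for illustration, not part of the crate's API.

```rust
/// Hypothetical stand-in for one ranked result row. Field names mirror
/// the ones used in the test above; the real type lives in the crate.
#[derive(Debug, Clone)]
struct Row {
    entity_id: u64,
    rank: usize,
    score: f64,
}

/// Returns the index of the first position where the two result lists
/// disagree, or `None` if they match. `tol` is an absolute score
/// tolerance; pass 0.0 to require exactly equal scores.
fn first_divergence(a: &[Row], b: &[Row], tol: f64) -> Option<usize> {
    let common = a.len().min(b.len());
    for i in 0..common {
        let (x, y) = (&a[i], &b[i]);
        // NaN scores never compare within tolerance, so they count as a divergence.
        let same_score = (x.score - y.score).abs() <= tol;
        if x.entity_id != y.entity_id || x.rank != y.rank || !same_score {
            return Some(i);
        }
    }
    // All shared positions match; a length mismatch diverges at the
    // end of the shorter list.
    if a.len() != b.len() {
        Some(common)
    } else {
        None
    }
}

fn main() {
    let run1 = vec![
        Row { entity_id: 7, rank: 1, score: 3.5 },
        Row { entity_id: 2, rank: 2, score: 1.25 },
    ];
    let mut run2 = run1.clone();
    assert_eq!(first_divergence(&run1, &run2, 0.0), None);

    // A tiny score drift is caught with tol = 0.0 and accepted with a looser bound.
    run2[1].score += 1e-9;
    assert_eq!(first_divergence(&run1, &run2, 0.0), Some(1));
    assert_eq!(first_divergence(&run1, &run2, 1e-6), None);
}
```

Whether `tol = 0.0` is achievable depends on whether both queries are scored at the same logical timestamp; if decay is evaluated against wall-clock time, a small explicit tolerance may be the more honest check.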

## Acceptance Criteria

- [ ] `milestone_2_uat` test passes: all 6 queries return correctly ordered results
- [ ] Query 1 (trending): results sorted descending, creator diversity enforced (max 1 per creator), ranks are 1-based sequential
- [ ] Query 2 (hot + jazz filter): only jazz items returned, sorted descending by hot score
- [ ] Query 3 (new): results sorted by created_at descending
- [ ] Query 4 (top_week): results sorted by 7d signal-based score
- [ ] Query 5 (hidden_gems): results sorted by quality/reach ratio
- [ ] Query 6 (controversial): results sorted by dual-signal score
- [ ] Signal burst: writing 100 "share" signals for item #500 causes it to rise in trending rank (or appear if previously absent)
- [ ] Crash recovery: shutdown and reopen preserves all items, signals, and query functionality
- [ ] `retrieve_results_include_signal_snapshots` test passes: at least some results have non-empty snapshots, all capped at 10
- [ ] `retrieve_excludes_specified_ids` test passes: excluded IDs never appear in results
- [ ] `retrieve_pagination_via_cursor` test passes: pages do not overlap, ranks continue correctly
- [ ] `retrieve_rejects_invalid_queries` test passes: clear errors for unknown profile, invalid limit
- [ ] `retrieve_deterministic_results` test passes: same query produces identical results (INV-QUERY-1)
- [ ] `cargo test --test m2_uat` passes
- [ ] No `unsafe` code in tests
- [ ] Test data is deterministic (fixed seeds, reproducible event sequences)

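
Several of the criteria above (descending order, 1-based sequential ranks, at most one result per creator) are structural checks that can be expressed independently of the database. The following is a rough sketch of such checks, assuming each result row carries `rank`, `score`, and `creator_id`; the `Row` struct and both helper functions are hypothetical and only illustrate the shape of the assertions.

```rust
use std::collections::HashMap;

/// Hypothetical result row; the field set is an assumption for illustration.
struct Row {
    rank: usize,
    score: f64,
    creator_id: u64,
}

/// Checks that ranks are 1, 2, 3, ... and that scores never increase.
fn check_ordering(rows: &[Row]) -> Result<(), String> {
    for (i, row) in rows.iter().enumerate() {
        if row.rank != i + 1 {
            return Err(format!("position {} has rank {}, expected {}", i, row.rank, i + 1));
        }
        if i > 0 && rows[i - 1].score < row.score {
            return Err(format!(
                "score increases at rank {}: {} then {}",
                row.rank,
                rows[i - 1].score,
                row.score
            ));
        }
    }
    Ok(())
}

/// Checks that no creator appears more than `max_per_creator` times.
fn check_creator_diversity(rows: &[Row], max_per_creator: usize) -> Result<(), String> {
    let mut counts: HashMap<u64, usize> = HashMap::new();
    for row in rows {
        let n = counts.entry(row.creator_id).or_insert(0);
        *n += 1;
        if *n > max_per_creator {
            return Err(format!(
                "creator {} appears {} times (max {})",
                row.creator_id, *n, max_per_creator
            ));
        }
    }
    Ok(())
}

fn main() {
    let rows = vec![
        Row { rank: 1, score: 9.0, creator_id: 10 },
        Row { rank: 2, score: 7.5, creator_id: 11 },
        Row { rank: 3, score: 7.5, creator_id: 12 },
    ];
    assert!(check_ordering(&rows).is_ok());
    assert!(check_creator_diversity(&rows, 1).is_ok());
}
```

Returning `Result<(), String>` rather than panicking inside the helper lets the calling test attach the UAT step name to the failure message.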
## Research References

- [docs/research/tidaldb_signal_ledger.md](../../../research/tidaldb_signal_ledger.md) -- Signal write/read latencies referenced in test timing expectations
- [docs/research/ann_for_tidaldb.md](../../../research/ann_for_tidaldb.md) -- ANN recall@k expectations for verifying retrieval correctness

## Spec References

- [docs/specs/08-query-engine.md](../../../specs/08-query-engine.md) -- Section 2 (RETRIEVE operation), Section 5 (execution pipeline), Section 8 (pagination), Section 15 (invariants: INV-QUERY-1 deterministic, INV-QUERY-2 filter correctness)
- [docs/specs/09-ranking-scoring.md](../../../specs/09-ranking-scoring.md) -- Section 11 (sort mode formulas verified by query ordering), Section 16 (INV-RANK-1 deterministic scoring, INV-RANK-5 diversity never reduces result count)

## Implementation Notes

- **Signal count (10K vs ROADMAP's 100K)**: The ROADMAP UAT specifies 100K signal events. This test uses 10K to keep `cargo test --test m2_uat` under 30 seconds. 10K signals across 10K items averages 1 signal per entity, which is sparse but sufficient for correctness testing of ranking logic. For scale validation, add a `#[ignore]` test:

  ```rust
  #[test]
  #[ignore = "scale test: takes 2-3 minutes, run with --ignored"]
  fn milestone_2_uat_100k_signals() {
      // same as milestone_2_uat but with 100K signals
  }
  ```

  Run with: `cargo test --test m2_uat -- --ignored milestone_2_uat_100k_signals`
- The `generate_embedding` function uses `sin()` to produce deterministic pseudo-random vectors. The embeddings are L2-normalized, so cosine similarity and L2 distance rank neighbors identically in the USearch index. Use 64 dimensions for test speed -- the trait abstraction handles any dimension.
- The `generate_signal_events` function uses prime strides (7919, 104729) for a reproducible distribution without a PRNG dependency. The distribution is skewed (roughly power-law-like): some entities get more events than others, which creates interesting ranking dynamics.
- The `write_item_with_metadata` API is a convenience wrapper expected to exist on `TidalDb` for M2. If it does not exist, this task must add it. It stores structured metadata (category, format, creator_id, created_at) that the bitmap/range indexes and the RETRIEVE executor can read. The exact API shape depends on how metadata is stored after m2p2 (bitmap indexes) is integrated.
- The signal burst test (100 "share" signals for item #500) verifies signal freshness: a signal written during the test is reflected in the very next query. Item #500 may not appear in the top 50 before or after the burst (possible given the skewed signal distribution); in that case the test falls back to verifying the signal count directly.
- The crash recovery section re-verifies item count, trending query, jazz filter, and signal persistence. It does NOT require exact score-level equality with the results from before the restart: decay scores advance with time, so scores computed after reopening will differ slightly. It verifies functional correctness: queries work, filters apply, signals survived.
- Test execution time target: < 30 seconds for the full `m2_uat` test. At 10K items with 64-dim embeddings and 10K signals, setup should take ~5 seconds (item writes + signal writes), and each of the 6 queries should take < 100ms. If the test is too slow, reduce item count to 5K or embedding dimension to 32.
- All test assertions include descriptive failure messages. A failing assertion should tell the developer exactly what went wrong and which UAT step failed.
- The `m2_uat.rs` file is an integration test (in `tidal/tests/`), not a unit test (in `src/`). It links against the compiled crate and tests the public API exactly as a user would.

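
The note on embeddings relies on a property of unit vectors: for L2-normalized `a` and `b`, the squared L2 distance equals `2 * (1 - a·b)`, so ordering by L2 distance and ordering by cosine similarity agree. The sketch below illustrates this with a sine-based generator. It is an assumption-laden stand-in, not the test file's actual `generate_embedding`; the name `sketch_embedding`, the `0.1` frequency constant, and the helper functions are all hypothetical.

```rust
/// Illustrative deterministic embedding: sine values seeded by `seed`,
/// then scaled to unit L2 norm. The formula is an assumption for this
/// sketch and is not taken from the test file's helper.
fn sketch_embedding(seed: u64, dim: usize) -> Vec<f32> {
    let raw: Vec<f64> = (0..dim)
        .map(|j| ((seed as f64 + 1.0) * (j as f64 + 1.0) * 0.1).sin())
        .collect();
    let norm = raw.iter().map(|x| x * x).sum::<f64>().sqrt();
    // Guard against a zero vector so the division is always defined.
    let scale = if norm > 0.0 { 1.0 / norm } else { 0.0 };
    raw.iter().map(|x| (x * scale) as f32).collect()
}

/// Dot product accumulated in f64 to limit rounding error.
fn dot(a: &[f32], b: &[f32]) -> f64 {
    a.iter().zip(b).map(|(x, y)| f64::from(*x) * f64::from(*y)).sum()
}

/// Squared Euclidean distance accumulated in f64.
fn sq_l2(a: &[f32], b: &[f32]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let d = f64::from(*x) - f64::from(*y);
            d * d
        })
        .sum()
}

fn main() {
    let a = sketch_embedding(1, 64);
    let b = sketch_embedding(2, 64);

    // Same seed, same vector: no PRNG state is involved.
    assert_eq!(a, sketch_embedding(1, 64));

    // Unit norm, up to f32 rounding.
    assert!((dot(&a, &a) - 1.0).abs() < 1e-4);

    // For unit vectors: ||a - b||^2 = 2 * (1 - a.b).
    assert!((sq_l2(&a, &b) - 2.0 * (1.0 - dot(&a, &b))).abs() < 1e-4);
}
```

Because the identity only holds for unit vectors, any change to the generator that skips normalization would make cosine-based and L2-based neighbor rankings diverge.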