Task 08: Hard Negative Crash Invariant Test
Delivers
Integration tests proving that after any crash scenario, RETRIEVE never returns items that the user has hidden (hard negatives) or content from creators that the user has blocked. This is the ultimate correctness invariant for crash recovery: no matter what goes wrong during a crash, the user's negative preferences are respected in query results.
The invariant under test: if a user has recorded a hide relationship on item X or a block relationship on creator C, then after any crash-and-recovery sequence, RETRIEVE ... FOR USER @user ... FILTER unblocked, unseen must never include item X or any item by creator C in the results.
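The invariant can also be written as a predicate over a result set. The following is a minimal sketch using only the standard library; `ResultItem`, `Violation`, and `find_violations` are hypothetical names for illustration and are not part of the tidaldb API.

```rust
use std::collections::HashSet;

/// Hypothetical view of one RETRIEVE result: the item and its creator.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ResultItem {
    item_id: u64,
    creator_id: u64,
}

/// The two ways a result set can break the invariant.
#[derive(Debug, PartialEq, Eq)]
enum Violation {
    HiddenItem(u64),
    BlockedCreator { item_id: u64, creator_id: u64 },
}

/// Returns every violation found in `results`. An empty vector means the
/// invariant holds for this result set.
fn find_violations(
    results: &[ResultItem],
    hidden_items: &HashSet<u64>,
    blocked_creators: &HashSet<u64>,
) -> Vec<Violation> {
    let mut violations = Vec::new();
    for r in results {
        if hidden_items.contains(&r.item_id) {
            violations.push(Violation::HiddenItem(r.item_id));
        }
        if blocked_creators.contains(&r.creator_id) {
            violations.push(Violation::BlockedCreator {
                item_id: r.item_id,
                creator_id: r.creator_id,
            });
        }
    }
    violations
}
```

An item that is both hidden and owned by a blocked creator is reported twice, once per rule, which keeps failure messages specific about which relationship was lost.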
Complexity: M
Dependencies
- Task 07 (M6 crash fencing -- ensures all state surfaces recover correctly, which is a prerequisite for this end-to-end invariant)
Technical Design
1. Test architecture
Each test follows this pattern:
- Setup: Open a persistent database. Write items with metadata (including creator_id). Write user relationships (hide, block). Write signals so the items are rankable.
- Verify pre-crash: Execute RETRIEVE and confirm hidden/blocked items are absent.
- Crash simulation: Close and reopen the database (simulating a clean restart, which is the minimal crash scenario; the property tests from tasks 02 and 07 cover unclean crashes).
- Verify post-crash: Execute the same RETRIEVE and confirm hidden/blocked items are still absent.
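The four phases can be illustrated with a toy model that has no dependency on tidaldb. This is only a sketch: `Disk`, `ToyDb`, and `run_four_phases` are hypothetical stand-ins, with `Disk` playing the role of durable storage and `ToyDb` holding volatile state that is rebuilt on every open.

```rust
use std::cell::RefCell;
use std::collections::HashSet;

/// Hypothetical stand-in for durable storage: anything written here
/// survives a "restart" (dropping and reopening a `ToyDb`).
#[derive(Default)]
struct Disk {
    items: RefCell<Vec<u64>>,
    hides: RefCell<Vec<u64>>,
}

/// Hypothetical stand-in for an open database handle. `hidden` is volatile
/// state that must be rebuilt from `Disk` each time the handle is opened.
struct ToyDb<'d> {
    disk: &'d Disk,
    hidden: RefCell<HashSet<u64>>,
}

impl<'d> ToyDb<'d> {
    fn open(disk: &'d Disk) -> Self {
        let hidden: HashSet<u64> = disk.hides.borrow().iter().copied().collect();
        ToyDb { disk, hidden: RefCell::new(hidden) }
    }

    fn write_item(&self, id: u64) {
        self.disk.items.borrow_mut().push(id);
    }

    fn hide(&self, id: u64) {
        self.disk.hides.borrow_mut().push(id);
        self.hidden.borrow_mut().insert(id);
    }

    /// Returns stored items in insertion order, minus hidden ones.
    fn retrieve(&self) -> Vec<u64> {
        let hidden = self.hidden.borrow();
        let items = self.disk.items.borrow();
        let visible: Vec<u64> =
            items.iter().copied().filter(|id| !hidden.contains(id)).collect();
        visible
    }
}

/// Runs setup, pre-crash verify, restart, and post-crash verify, returning
/// the result sets observed before and after the restart.
fn run_four_phases(disk: &Disk, item_count: u64, hide: &[u64]) -> (Vec<u64>, Vec<u64>) {
    // Phase 1: setup.
    let db = ToyDb::open(disk);
    for id in 1..=item_count {
        db.write_item(id);
    }
    for &id in hide {
        db.hide(id);
    }
    // Phase 2: pre-crash query.
    let before = db.retrieve();
    // Phase 3: simulated restart. Volatile state is discarded with the handle.
    drop(db);
    let db = ToyDb::open(disk);
    // Phase 4: post-crash query against rebuilt state.
    let after = db.retrieve();
    (before, after)
}
```

If `open` skipped the rebuild from `disk.hides`, the post-restart result would contain the hidden items again, which is exactly the class of bug the real tests are meant to catch.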
2. Hidden items invariant
// tidal/tests/m7_crash_invariant.rs
#![allow(clippy::unwrap_used)]
use std::collections::HashMap;
use std::time::Duration;
use tidaldb::schema::{
DecaySpec, EntityId, EntityKind, SchemaBuilder, Timestamp, Window,
};
use tidaldb::{TidalDb, TidalDbBuilder};
use tidaldb::query::retrieve::{Retrieve, RetrieveBuilder};
fn invariant_schema() -> tidaldb::schema::Schema {
let mut builder = SchemaBuilder::new();
let _ = builder
.signal(
"view",
EntityKind::Item,
DecaySpec::Exponential {
half_life: Duration::from_secs(7 * 24 * 3600),
},
)
.windows(&[Window::AllTime])
.velocity(false)
.add();
let _ = builder
.signal(
"like",
EntityKind::Item,
DecaySpec::Exponential {
half_life: Duration::from_secs(14 * 24 * 3600),
},
)
.windows(&[Window::AllTime])
.velocity(false)
.add();
// Add text fields for item metadata.
builder.text_field("title", tidaldb::schema::TextFieldType::Text);
// Add embedding slot for vector search (required by some profiles).
builder.embedding_slot(EntityKind::Item, "content", 128);
builder.build().unwrap()
}
fn write_items_with_creators(db: &TidalDb, count: u64) {
for i in 1..=count {
let creator_id = (i % 5) + 1; // 5 creators
let mut meta = HashMap::new();
meta.insert("title".to_string(), format!("Item {i}"));
meta.insert("creator_id".to_string(), creator_id.to_string());
meta.insert("category".to_string(), "music".to_string());
db.write_item_with_metadata(EntityId::new(i), &meta).unwrap();
// Write a signal so the item is rankable.
let ts = Timestamp::from_nanos(1_000_000_000_000 + i * 1_000_000);
db.signal("view", EntityId::new(i), 1.0, ts).unwrap();
}
}
/// Core invariant: hidden items never appear in RETRIEVE results.
#[test]
fn hidden_items_never_returned_after_restart() {
let dir = tempfile::tempdir().unwrap();
let schema = invariant_schema();
let user_id = 42u64;
let hidden_item_ids: Vec<u64> = vec![3, 7, 15, 22];
// Phase 1: Write data and hide items.
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema.clone())
.open()
.unwrap();
write_items_with_creators(&db, 30);
// Write user.
db.write_user(EntityId::new(user_id), &HashMap::new()).unwrap();
// Hide specific items.
for &item_id in &hidden_item_ids {
db.hide_item(EntityId::new(user_id), EntityId::new(item_id)).unwrap();
}
// Verify pre-crash: hidden items are absent from results.
let query = Retrieve::builder()
.for_user(user_id)
.using_profile("chronological")
.filter_unseen()
.limit(30)
.build();
let results = db.retrieve(&query).unwrap();
for item in &results.items {
assert!(
!hidden_item_ids.contains(&item.entity_id.as_u64()),
"hidden item {} appeared in pre-crash RETRIEVE",
item.entity_id.as_u64()
);
}
db.close().unwrap();
}
// Phase 2: Reopen and verify the invariant holds.
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema)
.open()
.unwrap();
let query = Retrieve::builder()
.for_user(user_id)
.using_profile("chronological")
.filter_unseen()
.limit(30)
.build();
let results = db.retrieve(&query).unwrap();
for item in &results.items {
assert!(
!hidden_item_ids.contains(&item.entity_id.as_u64()),
"INVARIANT VIOLATION: hidden item {} appeared in RETRIEVE after restart",
item.entity_id.as_u64()
);
}
// A direct state check (confirming the user state still holds each hide
// relationship) would depend on rebuild_entity_state scanning the users
// engine. No internal-state accessor is assumed here, so the invariant is
// asserted only through RETRIEVE above.
db.close().unwrap();
}
}
3. Blocked creators invariant
/// Core invariant: blocked creator content never appears in RETRIEVE results.
#[test]
fn blocked_creator_content_never_returned_after_restart() {
let dir = tempfile::tempdir().unwrap();
let schema = invariant_schema();
let user_id = 42u64;
let blocked_creator_id = 3u64; // creator 3
// Phase 1: Write data and block a creator.
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema.clone())
.open()
.unwrap();
write_items_with_creators(&db, 30);
db.write_user(EntityId::new(user_id), &HashMap::new()).unwrap();
// Block creator 3.
db.block_creator(EntityId::new(user_id), EntityId::new(blocked_creator_id))
.unwrap();
// Verify pre-crash: no items by creator 3 in results.
let query = Retrieve::builder()
.for_user(user_id)
.using_profile("chronological")
.filter_unblocked()
.limit(30)
.build();
let results = db.retrieve(&query).unwrap();
for item in &results.items {
// Items with creator_id == blocked_creator_id should be absent.
// creator_id = (item_id % 5) + 1. So items where (id % 5) + 1 == 3,
// i.e., id % 5 == 2, have creator 3.
let item_creator = (item.entity_id.as_u64() % 5) + 1;
assert_ne!(
item_creator, blocked_creator_id,
"blocked creator's item {} appeared in pre-crash RETRIEVE",
item.entity_id.as_u64()
);
}
db.close().unwrap();
}
// Phase 2: Reopen and verify.
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema)
.open()
.unwrap();
let query = Retrieve::builder()
.for_user(user_id)
.using_profile("chronological")
.filter_unblocked()
.limit(30)
.build();
let results = db.retrieve(&query).unwrap();
for item in &results.items {
let item_creator = (item.entity_id.as_u64() % 5) + 1;
assert_ne!(
item_creator, blocked_creator_id,
"INVARIANT VIOLATION: blocked creator's item {} appeared after restart",
item.entity_id.as_u64()
);
}
db.close().unwrap();
}
}
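The creator arithmetic used in the assertions above can be checked in isolation. This sketch mirrors the mapping in write_items_with_creators; the helper names `creator_of` and `items_by_creator` are hypothetical and exist only for this illustration.

```rust
/// Number of distinct creators used by the test fixture.
const CREATOR_COUNT: u64 = 5;

/// Mirrors the fixture's assignment: creator_id = (item_id % 5) + 1.
fn creator_of(item_id: u64) -> u64 {
    (item_id % CREATOR_COUNT) + 1
}

/// Lists the item IDs in 1..=item_count that belong to `creator_id`.
fn items_by_creator(creator_id: u64, item_count: u64) -> Vec<u64> {
    (1..=item_count)
        .filter(|&id| creator_of(id) == creator_id)
        .collect()
}
```

With 30 items, creator 3 owns items 2, 7, 12, 17, 22, and 27 (every ID with `id % 5 == 2`), so blocking creator 3 should remove six items from the result set. Note also that item 1 belongs to creator 2, so in the combined test it is excluded by both the hide and the block.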
4. Combined hide + block invariant
/// Both hidden items AND blocked creators must be absent after restart.
#[test]
fn combined_hide_and_block_after_restart() {
let dir = tempfile::tempdir().unwrap();
let schema = invariant_schema();
let user_id = 99u64;
let hidden_items = vec![1u64, 5, 10];
let blocked_creator = 2u64; // creator 2: items where (id % 5) + 1 == 2, i.e., id % 5 == 1
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema.clone())
.open()
.unwrap();
write_items_with_creators(&db, 50);
db.write_user(EntityId::new(user_id), &HashMap::new()).unwrap();
for &item_id in &hidden_items {
db.hide_item(EntityId::new(user_id), EntityId::new(item_id)).unwrap();
}
db.block_creator(EntityId::new(user_id), EntityId::new(blocked_creator))
.unwrap();
db.close().unwrap();
}
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema)
.open()
.unwrap();
let query = Retrieve::builder()
.for_user(user_id)
.using_profile("chronological")
.filter_unseen()
.filter_unblocked()
.limit(50)
.build();
let results = db.retrieve(&query).unwrap();
for item in &results.items {
let id = item.entity_id.as_u64();
let item_creator = (id % 5) + 1;
assert!(
!hidden_items.contains(&id),
"INVARIANT VIOLATION: hidden item {id} in results after restart"
);
assert_ne!(
item_creator, blocked_creator,
"INVARIANT VIOLATION: blocked creator {blocked_creator}'s item {id} in results after restart"
);
}
db.close().unwrap();
}
}
5. Property test: random hide/block patterns
use proptest::prelude::*;
proptest! {
#![proptest_config(ProptestConfig::with_cases(100))]
#[test]
fn no_phantom_items_after_restart(
item_count in 10usize..60,
hidden_count in 1usize..10,
block_creator in 1u64..6,
) {
let dir = tempfile::tempdir().unwrap();
let schema = invariant_schema();
let user_id = 42u64;
// Choose which items to hide (random subset).
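// Hide the first hidden_count item IDs. This is a deterministic prefix, not a
// random subset; only the counts and the blocked creator vary per case.
let hidden: Vec<u64> = (1..=hidden_count as u64).collect();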
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema.clone())
.open()
.unwrap();
write_items_with_creators(&db, item_count as u64);
db.write_user(EntityId::new(user_id), &HashMap::new()).unwrap();
for &h in &hidden {
if h <= item_count as u64 {
db.hide_item(EntityId::new(user_id), EntityId::new(h)).unwrap();
}
}
db.block_creator(EntityId::new(user_id), EntityId::new(block_creator))
.unwrap();
db.close().unwrap();
}
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema)
.open()
.unwrap();
let query = Retrieve::builder()
.for_user(user_id)
.using_profile("chronological")
.filter_unseen()
.filter_unblocked()
.limit(item_count as u32)
.build();
let results = db.retrieve(&query).unwrap();
for item in &results.items {
let id = item.entity_id.as_u64();
let creator = (id % 5) + 1;
prop_assert!(
!hidden.contains(&id),
"hidden item {id} appeared in results"
);
prop_assert_ne!(
creator, block_creator,
"blocked creator {block_creator}'s item {id} appeared"
);
}
db.close().unwrap();
}
}
}
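The property test above always hides the IDs 1..=hidden_count. If scattered IDs are wanted, one option is to derive the hidden set from a generated seed. The sketch below uses a small xorshift generator from the standard library only; `pick_hidden` is a hypothetical helper, and in the proptest it would be fed from an extra generated `u64` input, which is an assumption and not part of the original test.

```rust
use std::collections::BTreeSet;

/// Returns up to `want` distinct item IDs in 1..=item_count, derived
/// deterministically from `seed`. The same seed always yields the same set,
/// so a failing case stays reproducible.
fn pick_hidden(seed: u64, item_count: u64, want: usize) -> BTreeSet<u64> {
    let mut picked = BTreeSet::new();
    if item_count == 0 {
        return picked;
    }
    // Never ask for more distinct IDs than exist.
    let target = want.min(item_count as usize);
    // xorshift64 requires a non-zero state.
    let mut state = seed | 1;
    while picked.len() < target {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        picked.insert(state % item_count + 1);
    }
    picked
}
```

Because the set is a pure function of the seed, proptest shrinking still works on the seed and counts, and the assertions in the test body can use the returned set in place of the prefix vector.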
6. Hard negative leak detection
Test that hard negatives recorded via the session feedback path also survive restart. This covers the HardNegIndex rebuild from durable RelationshipType::Hide edges.
#[test]
fn hard_negatives_from_session_survive_restart() {
let dir = tempfile::tempdir().unwrap();
let schema = invariant_schema();
let user_id = 42u64;
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema.clone())
.open()
.unwrap();
write_items_with_creators(&db, 20);
db.write_user(EntityId::new(user_id), &HashMap::new()).unwrap();
// Start a session; the hides below are recorded while it is open.
let handle = db.start_session(user_id, "test-agent", "default").unwrap();
// invariant_schema() registers only "view" and "like", so there is no "skip"
// or "dislike" signal to send here. Record the hard negatives with hide_item.
db.hide_item(EntityId::new(user_id), EntityId::new(5)).unwrap();
db.hide_item(EntityId::new(user_id), EntityId::new(12)).unwrap();
db.close_session(handle.session_id()).unwrap();
db.close().unwrap();
}
{
let db = TidalDb::builder()
.with_data_dir(dir.path())
.with_schema(schema)
.open()
.unwrap();
// The hard negatives should be rebuilt from the users engine
// (RelationshipType::Hide edges).
let query = Retrieve::builder()
.for_user(user_id)
.using_profile("chronological")
.filter_unseen()
.limit(20)
.build();
let results = db.retrieve(&query).unwrap();
let result_ids: Vec<u64> = results.items.iter().map(|i| i.entity_id.as_u64()).collect();
assert!(
!result_ids.contains(&5),
"hidden item 5 leaked after restart"
);
assert!(
!result_ids.contains(&12),
"hidden item 12 leaked after restart"
);
db.close().unwrap();
}
}
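The rebuild this test depends on can be pictured as a replay of durable relationship edges into in-memory lookup sets. The following is a simplified sketch, not tidaldb's actual implementation: `RelKind`, `Edge`, `NegativeState`, and `rebuild` are hypothetical names, and the real HardNegIndex and rebuild_entity_state may be structured differently.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical mirror of the two relationship kinds discussed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RelKind {
    Hide,   // user -> item
    Blocks, // user -> creator
}

/// One durable edge as it might be read back from storage.
#[derive(Debug, Clone, Copy)]
struct Edge {
    user: u64,
    kind: RelKind,
    target: u64,
}

/// Volatile per-user negative state, rebuilt on open.
#[derive(Debug, Default)]
struct NegativeState {
    hidden_items: HashMap<u64, HashSet<u64>>,
    blocked_creators: HashMap<u64, HashSet<u64>>,
}

impl NegativeState {
    /// True if `item` by `creator` may be shown to `user`.
    fn allows(&self, user: u64, item: u64, creator: u64) -> bool {
        let hidden = self
            .hidden_items
            .get(&user)
            .map_or(false, |s| s.contains(&item));
        let blocked = self
            .blocked_creators
            .get(&user)
            .map_or(false, |s| s.contains(&creator));
        !hidden && !blocked
    }
}

/// Replays every durable edge into fresh in-memory state.
fn rebuild(edges: &[Edge]) -> NegativeState {
    let mut state = NegativeState::default();
    for e in edges {
        let map = match e.kind {
            RelKind::Hide => &mut state.hidden_items,
            RelKind::Blocks => &mut state.blocked_creators,
        };
        map.entry(e.user).or_default().insert(e.target);
    }
    state
}
```

In this model, a restart loses `NegativeState` entirely, and correctness rests on `rebuild` seeing every edge that was acknowledged before the crash.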
Acceptance Criteria
- hidden_items_never_returned_after_restart: hidden items absent from RETRIEVE after clean restart
- blocked_creator_content_never_returned_after_restart: blocked creator items absent after restart
- combined_hide_and_block_after_restart: both hidden items and blocked creator content absent
- no_phantom_items_after_restart: 100 proptest cases with random hide/block patterns, no invariant violations
- hard_negatives_from_session_survive_restart: session-recorded hard negatives persist through restart
- No test produces a false positive (the invariant is actually tested end-to-end through RETRIEVE, not just by checking internal state)
- All tests pass with cargo test --test m7_crash_invariant
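One way a test could pass for the wrong reason is an empty result set: every "absent" assertion holds vacuously if RETRIEVE returns nothing. A possible guard is to compare the results against the expected visible set in both directions. The sketch below assumes that a query whose limit is at least the item count returns every visible item under the chronological profile, which the source does not state; `expected_visible` and `check_results` are hypothetical helpers.

```rust
use std::collections::BTreeSet;

/// Items in 1..=item_count that are neither hidden nor owned by the blocked
/// creator, using the fixture mapping creator_id = (item_id % 5) + 1.
fn expected_visible(
    item_count: u64,
    hidden: &BTreeSet<u64>,
    blocked_creator: Option<u64>,
) -> Vec<u64> {
    (1..=item_count)
        .filter(|&id| !hidden.contains(&id))
        .filter(|&id| Some((id % 5) + 1) != blocked_creator)
        .collect()
}

/// Fails if any unexpected item leaked in, or if any expected item is
/// missing (which also rejects a vacuously empty result set).
fn check_results(results: &[u64], expected: &[u64]) -> Result<(), String> {
    let got: BTreeSet<u64> = results.iter().copied().collect();
    let want: BTreeSet<u64> = expected.iter().copied().collect();
    let leaked: Vec<u64> = got.difference(&want).copied().collect();
    let missing: Vec<u64> = want.difference(&got).copied().collect();
    if !leaked.is_empty() {
        return Err(format!("leaked items: {leaked:?}"));
    }
    if !missing.is_empty() {
        return Err(format!("missing visible items: {missing:?}"));
    }
    Ok(())
}
```

If the completeness assumption does not hold for a given profile, a weaker guard such as asserting that the result set is non-empty still rules out the vacuous case.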
Test Strategy
The tests above ARE the deliverable. Key design principles:
- End-to-end verification: Every invariant is verified by executing a RETRIEVE query through the public API. This catches bugs anywhere in the pipeline (state rebuild, filter evaluation, user state index, hard negative index).
- Persistent mode only: All tests use with_data_dir() to exercise the full durability pipeline.
- Property tests for coverage: The no_phantom_items_after_restart proptest generates random combinations of hidden items and blocked creators to catch edge cases in the rebuild logic (e.g., boundary conditions in RoaringBitmap serialization, off-by-one in entity ID casting).
- Explicit creator mapping: Items are assigned to creators deterministically (creator_id = (item_id % 5) + 1) so the test can verify which items should be blocked without needing to read metadata.
- Both hide and block paths: The tests exercise both hide_item (user -> item edge, Tag::Rel with RelationshipType::Hide) and block_creator (user -> creator edge, RelationshipType::Blocks). Both are rebuilt by rebuild_entity_state from the users engine scan.