tidaldb/docs/planning/milestone-7/phase-1/task-07-crash-fencing-m6-state.md

# Task 07: Crash Fencing for M6 State

## Delivers

Property tests proving crash recovery correctness for all M6 state surfaces: `CohortSignalLedger`, `CollectionIndex`, `CoEngagementIndex`, and active sessions. Each state surface is tested against its relevant crash points with random event sequences. After simulated crash + reopen, every invariant must hold.

## Complexity: L

## Dependencies

- Task 01 (CrashPoint enum + fault injection hooks)

## Technical Design

### 1. CohortSignalLedger crash recovery tests

The cohort ledger checkpoints atomically via `WriteBatch`. If the checkpoint is interrupted, the previous checkpoint (or empty state) is the restore point. The cohort ledger is best-effort (does not go through WAL), so signals written to the cohort ledger but not checkpointed are lost on crash. This is by design -- the acceptance criterion is that the restored state is consistent (no corrupt entries), not that every cohort signal survives.

```rust
// tidal/tests/m7_crash_m6.rs

use std::time::Duration;
use tidaldb::cohort::CohortSignalLedger;
use tidaldb::schema::{DecaySpec, EntityId, EntityKind, SchemaBuilder, Timestamp, Window};
use tidaldb::signals::SignalTypeId;
use tidaldb::storage::InMemoryBackend;
use tidaldb::signals::checkpoint::CheckpointMeta;

fn test_schema() -> tidaldb::schema::Schema {
    let mut builder = SchemaBuilder::new();
    let _ = builder
        .signal("view", EntityKind::Item,
            DecaySpec::Exponential { half_life: Duration::from_secs(7 * 24 * 3600) })
        .windows(&[Window::AllTime])
        .velocity(false)
        .add();
    builder.build().unwrap()
}

#[test]
fn cohort_ledger_checkpoint_restore_roundtrip() {
    let schema = test_schema();
    let ledger = CohortSignalLedger::new(&schema);
    let type_id = ledger.resolve_signal_type("view").unwrap();
    let ts_ns = Timestamp::now().as_nanos();

    // Record signals across multiple cohorts.
    for i in 1..=100u64 {
        ledger.record("cohort_a", EntityId::new(i), type_id, 1.0, ts_ns);
        if i % 3 == 0 {
            ledger.record("cohort_b", EntityId::new(i), type_id, 2.0, ts_ns);
        }
    }

    let storage = InMemoryBackend::new();
    let meta = CheckpointMeta {
        checkpoint_time_ns: ts_ns,
        wal_sequence: 42,
        payload_hash: [0u8; 32], // cohort uses same meta but no BLAKE3 yet
    };
    ledger.checkpoint(&storage, meta).unwrap();

    // Restore into a fresh ledger.
    let ledger2 = CohortSignalLedger::new(&schema);
    let restored_meta = ledger2.restore(&storage).unwrap().unwrap();

    assert_eq!(restored_meta.wal_sequence, 42);
    assert_eq!(ledger2.entry_count(), ledger.entry_count());

    // Verify specific counts.
    let count_a = ledger2
        .read_windowed_count("cohort_a", EntityId::new(1), "view", Window::AllTime)
        .unwrap();
    assert_eq!(count_a, 1);

    let count_b = ledger2
        .read_windowed_count("cohort_b", EntityId::new(3), "view", Window::AllTime)
        .unwrap();
    assert_eq!(count_b, 1);

    // Cohort not in the ledger returns 0.
    let count_c = ledger2
        .read_windowed_count("cohort_c", EntityId::new(1), "view", Window::AllTime)
        .unwrap();
    assert_eq!(count_c, 0);
}
```

### 2. CohortSignalLedger crash during checkpoint

```rust
use proptest::prelude::*;

proptest! {
    #![proptest_config(ProptestConfig::with_cases(500))]

    #[test]
    fn cohort_ledger_survives_checkpoint_crash(
        cohort_count in 1usize..5,
        entities_per_cohort in 1usize..20,
        signals_per_entity in 1usize..10,
    ) {
        let schema = test_schema();
        let storage = InMemoryBackend::new();
        let type_id = SignalTypeId::new(0); // "view" is first alphabetically
        let base_ns = 1_000_000_000_000u64;

        // Phase 1: Write signals and do a clean checkpoint.
        let ledger = CohortSignalLedger::new(&schema);
        for c in 0..cohort_count {
            let cohort_name = format!("cohort_{c}");
            for e in 1..=entities_per_cohort as u64 {
                for s in 0..signals_per_entity {
                    let ts_ns = base_ns + (s as u64) * 1_000_000;
                    ledger.record(&cohort_name, EntityId::new(e), type_id, 1.0, ts_ns);
                }
            }
        }

        let meta = CheckpointMeta {
            checkpoint_time_ns: base_ns,
            wal_sequence: 100,
            payload_hash: [0u8; 32],
        };
        ledger.checkpoint(&storage, meta).unwrap();
        let checkpoint_entry_count = ledger.entry_count();

        // Phase 2: Write more signals (these will be lost on crash).
        for e in 1..=3u64 {
            ledger.record("cohort_0", EntityId::new(e + 1000), type_id, 1.0, base_ns);
        }
        prop_assert!(ledger.entry_count() > checkpoint_entry_count);

        // Phase 3: Simulate crash (don't checkpoint the new signals).
        // Restore from the checkpoint.
        let ledger2 = CohortSignalLedger::new(&schema);
        let restored_meta = ledger2.restore(&storage).unwrap().unwrap();

        prop_assert_eq!(restored_meta.wal_sequence, 100);
        // Restored entry count should match the CHECKPOINT entry count,
        // not the live entry count (post-checkpoint signals are lost).
        prop_assert_eq!(ledger2.entry_count(), checkpoint_entry_count);
    }
}
```

### 3. CollectionIndex crash recovery

Collections are persisted to fjall via `Tag::Collection` keys. The `rebuild_collections` function scans these keys on open. Test that after a crash during `add_to_collection`, the collection state on reopen is consistent (either the item is in the collection or it is not -- no partial state).

```rust
#[test]
fn collection_persists_across_restart() {
    let dir = tempfile::tempdir().unwrap();
    let schema = test_schema_with_collections(); // schema with text fields for items

    // Phase 1: Create collection, add items, close.
    {
        let db = TidalDb::builder()
            .with_data_dir(dir.path())
            .with_schema(schema.clone())
            .open()
            .unwrap();

        // Write items first (collections reference items).
        for i in 1..=10u64 {
            db.write_item(EntityId::new(i), &HashMap::new()).unwrap();
        }

        let cid = db.create_collection(42, "favorites", Visibility::Private).unwrap();
        for i in 1..=5u64 {
            db.add_to_collection(cid, EntityId::new(i)).unwrap();
        }

        db.close().unwrap();
    }

    // Phase 2: Reopen and verify collection state.
    {
        let db = TidalDb::builder()
            .with_data_dir(dir.path())
            .with_schema(schema)
            .open()
            .unwrap();

        let collections = db.list_collections(42).unwrap();
        assert_eq!(collections.len(), 1);
        assert_eq!(collections[0].name, "favorites");
        assert_eq!(collections[0].item_count, 5);

        db.close().unwrap();
    }
}

proptest! {
    #![proptest_config(ProptestConfig::with_cases(200))]

    #[test]
    fn collection_items_survive_restart(
        item_count in 1usize..50,
        items_in_collection in 1usize..50,
    ) {
        let n_items = item_count;
        let n_in_coll = items_in_collection.min(n_items);
        let dir = tempfile::tempdir().unwrap();
        let schema = test_schema_with_collections();

        {
            let db = TidalDb::builder()
                .with_data_dir(dir.path())
                .with_schema(schema.clone())
                .open()
                .unwrap();

            for i in 1..=n_items as u64 {
                db.write_item(EntityId::new(i), &HashMap::new()).unwrap();
            }

            let cid = db.create_collection(1, "test", Visibility::Public).unwrap();
            for i in 1..=n_in_coll as u64 {
                db.add_to_collection(cid, EntityId::new(i)).unwrap();
            }

            db.close().unwrap();
        }

        {
            let db = TidalDb::builder()
                .with_data_dir(dir.path())
                .with_schema(schema)
                .open()
                .unwrap();

            let collections = db.list_collections(1).unwrap();
            prop_assert_eq!(collections.len(), 1);
            prop_assert_eq!(collections[0].item_count, n_in_coll);

            db.close().unwrap();
        }
    }
}
```

### 4. CoEngagementIndex crash recovery

The co-engagement index checkpoints during shutdown. Test that edges survive a clean restart and that the weight-based eviction invariant (edge_count <= capacity) holds after recovery.

```rust
#[test]
fn co_engagement_survives_restart() {
    let dir = tempfile::tempdir().unwrap();
    let schema = test_schema_with_entities();

    {
        let db = TidalDb::builder()
            .with_data_dir(dir.path())
            .with_schema(schema.clone())
            .open()
            .unwrap();

        // Write items and signals that trigger co-engagement.
        for i in 1..=20u64 {
            db.write_item(EntityId::new(i), &HashMap::new()).unwrap();
        }

        // Simulate user engagement: user 1 likes items 1-10 sequentially.
        for i in 1..=10u64 {
            let ts = Timestamp::from_nanos(1_000_000_000_000 + i * 1_000_000);
            db.signal("like", EntityId::new(i), 1.0, ts).unwrap();
        }

        let edge_count_before = db.co_engagement().edge_count();
        assert!(edge_count_before > 0, "co-engagement edges should exist");

        db.close().unwrap();
    }

    {
        let db = TidalDb::builder()
            .with_data_dir(dir.path())
            .with_schema(schema)
            .open()
            .unwrap();

        let edge_count_after = db.co_engagement().edge_count();
        // Edges should survive the restart (checkpoint in shutdown).
        assert!(edge_count_after > 0, "co-engagement edges should survive restart");

        db.close().unwrap();
    }
}

proptest! {
    #![proptest_config(ProptestConfig::with_cases(200))]

    #[test]
    fn co_engagement_capacity_invariant_after_restart(
        edge_count in 1usize..100,
    ) {
        let index = CoEngagementIndex::with_capacity(50);
        let storage = InMemoryBackend::new();

        // Insert edges directly.
        for i in 0..edge_count {
            index.insert_edge(i as u64, (i + 1) as u64, (i as f32) + 0.5);
        }

        // Checkpoint.
        index.checkpoint(&storage).unwrap();

        // Restore into a fresh index.
        let index2 = CoEngagementIndex::with_capacity(50);
        index2.restore(&storage).unwrap();

        // Edge count after restore matches the original (capped by checkpoint,
        // not by capacity -- capacity eviction happens on record_positive, not restore).
        let expected = edge_count.min(100); // all inserted edges, up to what we wrote
        prop_assert_eq!(index2.edge_count(), expected.min(index.edge_count()));
    }
}
```

### 5. Session crash recovery

Sessions are journaled to `sessions.log` via the WAL writer thread. If a session-start is written but no session-close, the session should be restored as active on reopen. Test this with the existing `session_durability` test pattern.

```rust
#[test]
fn active_session_restored_after_crash() {
    let dir = tempfile::tempdir().unwrap();
    let schema = test_schema_with_sessions();
    let mut session_id = 0u64;

    {
        let db = TidalDb::builder()
            .with_data_dir(dir.path())
            .with_schema(schema.clone())
            .open()
            .unwrap();

        // Start a session but do NOT close it (simulating crash).
        let handle = db.start_session(42, "test-agent", "default").unwrap();
        session_id = handle.session_id().as_u64();

        // Write some session signals.
        handle.signal("view", EntityId::new(1), 1.0, Timestamp::now()).unwrap();
        handle.signal("view", EntityId::new(2), 1.0, Timestamp::now()).unwrap();

        // Graceful close (writes checkpoint, but session is still "open").
        db.close().unwrap();
    }

    {
        let db = TidalDb::builder()
            .with_data_dir(dir.path())
            .with_schema(schema)
            .open()
            .unwrap();

        // The session should be restored with its signals.
        // The exact API depends on session restore implementation.
        // At minimum: the session's signal effects (view counts) should be present.
        let count = db.read_windowed_count(
            EntityId::new(1), "view", Window::AllTime
        ).unwrap();
        assert!(count >= 1, "session signals should survive restart");

        db.close().unwrap();
    }
}

#[test]
fn session_metadata_restored_after_crash() {
    let dir = tempfile::tempdir().unwrap();
    let schema = test_schema_with_sessions();

    {
        let db = TidalDb::builder()
            .with_data_dir(dir.path())
            .with_schema(schema.clone())
            .open()
            .unwrap();

        let handle = db.start_session(42, "test-agent", "default").unwrap();
        handle.annotate(EntityId::new(1), "good recommendation").unwrap();
        // Do NOT close the session -- simulate crash.
        db.close().unwrap();
    }

    {
        let db = TidalDb::builder()
            .with_data_dir(dir.path())
            .with_schema(schema)
            .open()
            .unwrap();

        // Session journal recovery should have replayed the session-start
        // and the annotation signal. The exact verification depends on
        // whether session state is queryable after restore.
        // At minimum: the database should open without error.
        db.health_check().unwrap();

        db.close().unwrap();
    }
}
```

### 6. Cross-surface crash test

Verify that a crash during a mixed workload (signals + collections + co-engagement + cohort) does not corrupt any surface.

```rust
#[test]
fn mixed_workload_crash_recovery() {
    let dir = tempfile::tempdir().unwrap();
    let schema = test_schema_full();

    {
        let db = TidalDb::builder()
            .with_data_dir(dir.path())
            .with_schema(schema.clone())
            .open()
            .unwrap();

        // Write items.
        for i in 1..=100u64 {
            let mut meta = HashMap::new();
            meta.insert("title".to_string(), format!("Item {i}"));
            meta.insert("creator_id".to_string(), format!("{}", i % 10));
            db.write_item_with_metadata(EntityId::new(i), &meta).unwrap();
        }

        // Write signals.
        for i in 1..=100u64 {
            let ts = Timestamp::from_nanos(1_000_000_000_000 + i * 1_000_000);
            db.signal("view", EntityId::new(i), 1.0, ts).unwrap();
        }

        // Create collection.
        let cid = db.create_collection(1, "test", Visibility::Public).unwrap();
        for i in 1..=10u64 {
            db.add_to_collection(cid, EntityId::new(i)).unwrap();
        }

        db.close().unwrap();
    }

    {
        let db = TidalDb::builder()
            .with_data_dir(dir.path())
            .with_schema(schema)
            .open()
            .unwrap();

        // Verify all surfaces are consistent.
        db.health_check().unwrap();

        // Signals survived.
        let count = db.read_windowed_count(
            EntityId::new(50), "view", Window::AllTime
        ).unwrap();
        assert!(count >= 1);

        // Collections survived.
        let collections = db.list_collections(1).unwrap();
        assert_eq!(collections.len(), 1);
        assert_eq!(collections[0].item_count, 10);

        // Items survived.
        let meta = db.get_item_metadata(EntityId::new(1)).unwrap();
        assert!(meta.is_some());

        db.close().unwrap();
    }
}
```

## Acceptance Criteria

- [ ] `cohort_ledger_checkpoint_restore_roundtrip`: cohort signal state round-trips correctly through checkpoint/restore
- [ ] `cohort_ledger_survives_checkpoint_crash`: 500 proptest cases, interrupted checkpoint does not corrupt state
- [ ] `collection_persists_across_restart`: collection with items survives clean restart
- [ ] `collection_items_survive_restart`: 200 proptest cases, collection item counts match after restart
- [ ] `co_engagement_survives_restart`: co-engagement edges persist through shutdown/reopen
- [ ] `co_engagement_capacity_invariant_after_restart`: 200 proptest cases, edge count preserved through checkpoint/restore
- [ ] `active_session_restored_after_crash`: session signals survive when session is not closed before shutdown
- [ ] `session_metadata_restored_after_crash`: session journal events replayed on open
- [ ] `mixed_workload_crash_recovery`: all surfaces consistent after mixed write + restart
- [ ] All tests pass with `cargo test --test m7_crash_m6`

## Test Strategy

All tests described in the Technical Design section are the deliverable. The key principle: each test writes state through the public `TidalDb` API, closes the database, reopens it, and verifies the state through the public API. This tests the full persistence pipeline (in-memory -> checkpoint/fjall -> restore/rebuild) without relying on internal implementation details.

Property tests use proptest with varied workload sizes to catch edge cases in serialization, key encoding, and scan/rebuild logic.