# TidalDB Roadmap

## Vision Statement

When tidalDB is complete, an engineering team building any content platform -- a media library, a social feed, a marketplace, a discovery surface -- can embed a single Rust database and replace the Elasticsearch + Redis + Kafka + feature store + vector database + ranking service stack. One process, one query interface, one operational model. The query `RETRIEVE items FOR USER @user_id USING PROFILE for_you FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50` executes in under 50ms, reflects signals written 100ms ago, enforces diversity without application logic, handles cold-start items without application intervention, and returns results a user would describe as "it knows what I want."

## Thesis

A single embeddable database can replace the 6-system content ranking stack by treating signals, ranking profiles, and diversity constraints as database primitives rather than application logic.

---

## Milestone Summary

| # | Name | Proves | Enables |
|---|------|--------|---------|
| M1 | Signal Engine | Signals are a database primitive with O(1) decay, not application math | UC-03 (partial), UC-06 (partial), UC-14 (partial) |
| M2 | Ranked Retrieval | A single query retrieves, scores, and ranks content using live signals | UC-03, UC-04, UC-06, UC-08, UC-13, UC-14 |
| M3 | Personalized Ranking | User context shapes retrieval and ranking -- the "For You" query works | UC-01, UC-05, UC-07, UC-09 (partial) |
| M4 | Hybrid Search | Text + semantic + signal-ranked search in one query | UC-02, UC-10, UC-11 |
| M5 | Full Surface Coverage | Every use case, every sort mode, every filter, every feedback loop | UC-01 through UC-14 complete |
| M6 | Production Hardening | Crash safety, graceful degradation, operational readiness | All UCs at production quality |

---

## Current Status

| Phase | Status | Tests |
|-------|--------|-------|
| **m1p1: Core Type System and Schema** | COMPLETE | 77 passing |
| **m1p2: Write-Ahead Log** | COMPLETE | passing (unit + integration) |
| **m1p3: Storage Engine Trait and fjall Backend** | COMPLETE | 140 passing (128 unit + 12 integration) |
| m1p4: Signal Ledger | NOT STARTED | -- |
| m1p5: Entity CRUD and Signal Write API | NOT STARTED | -- |

**Current phase:** m1p4 (Signal Ledger) is next. m1p2 and m1p3 are complete, unblocking m1p4.

**Lessons learned:**
- m1p3 keyspaces are organized per `EntityKind` ("items", "users", "creators"), not by data category. The `Tag` enum in key encoding provides the data-category namespace within each entity-kind keyspace.
- The `LumenError` name is a legacy artifact from a predecessor project. It will be renamed when convenient; the rename does not block progress.
- MSRV was bumped to 1.91 for fjall 3 compatibility.

---

## Milestone 1: Signal Engine -- "Signals are a database primitive"

### Milestone Thesis

A developer can open a tidalDB instance, define signal types with decay rates, write engagement events, and read back decay-correct scores and windowed aggregates -- all without computing any temporal math in application code. This proves that the hardest primitive (temporal signals with O(1) decay, velocity, and windowed aggregation) works correctly and meets the performance budget.

### UAT Scenario

```
Given:
  A tidalDB instance is opened with a schema defining:
    - Entity type: Item with metadata fields (title, category, created_at)
    - Signal type: "view" with exponential decay, half_life=7d, windows=[1h, 24h, 7d]
    - Signal type: "like" with exponential decay, half_life=14d, windows=[24h, 7d, all_time]
    - Signal type: "skip" with exponential decay, half_life=1d, windows=[1h, 24h]

When:
  1. Write 100 items with metadata
  2. Write 10,000 signal events across the items (views, likes, skips)
     with timestamps spanning the last 7 days
  3. Read the decay score for item #42, signal "view", at current time
  4. Read the windowed count for item #42, signal "view", window=24h
  5. Read the velocity for item #42, signal "view", window=1h
  6. Write a new "view" event for item #42
  7. Immediately re-read the decay score, windowed count, and velocity
  8. Close and reopen the tidalDB instance
  9. Re-read all values for item #42

Then:
  - Step 3: Decay score matches S(t) = sum(w_i * exp(-lambda * (t - t_i)))
    computed analytically from raw events, to 6 decimal places
  - Step 4: Windowed count equals the exact count of "view" events
    within the last 24h window
  - Step 5: Velocity equals windowed_count / window_duration
  - Step 7: All values reflect the new event immediately
    (decay score increased, count incremented, velocity updated)
  - Step 9: All values match step 7 (crash recovery preserves state)
  - Performance: decay score read < 100ns per entity,
    signal write < 100us including WAL fsync (amortized),
    200-entity scoring pass < 5us
```

### Phases

#### Phase 1: Core Type System and Schema -- COMPLETE

**Delivers:** The foundational type system -- entity IDs, signal type definitions, decay rate declarations, window specifications, and the error types that every subsequent module depends on. The schema module that validates and stores signal/entity definitions.

**Acceptance Criteria:**
- [x] `EntityId` is a u64 newtype with `Display`, `Hash`, `Eq`, `Ord`, `to_be_bytes()` (big-endian, preserves numeric ordering)
- [x] `EntityKind` enum: `Item`, `User`, `Creator`
- [x] `SignalTypeDef` captures: name, target `EntityKind`, `DecayModel` (exponential with pre-computed lambda / linear / permanent), `WindowSet`, velocity enabled flag
- [x] `DecayModel::Exponential` stores pre-computed `lambda = ln(2) / half_life.as_secs_f64()` -- no division on hot path
- [x] `Window` enum: `OneHour`, `TwentyFourHours`, `SevenDays`, `ThirtyDays`, `AllTime` with `duration()`, `label()`, `duration_secs_f64()`
- [x] `WindowSet` deduplicates and sorts windows; `empty()` for permanent signals
- [x] `LumenError` enum covers Storage, NotFound, Schema, Durability, Query, Internal variants with `From` impls for each sub-error
- [x] `SchemaError` enum validates: duplicate signal names, invalid identifiers, zero half-life/lifetime, empty windows for non-permanent signals, velocity without windows
- [x] Schema validation via `SchemaBuilder` rejects invalid configurations at construction time
- [x] Property tests: lambda correctness across half-life range, byte ordering preservation
- [x] `cargo fmt` clean, `cargo clippy -D warnings` clean, all 77 tests pass

**Depends On:** None
**Complexity:** M
**Research Reference:** `docs/research/tidaldb_signal_ledger.md` (decay formula, EntityState struct)
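
The two numeric properties in the criteria above (a pre-computed lambda and order-preserving big-endian IDs) can be illustrated with a minimal sketch. The `EntityId` here is a stand-in for the real newtype, and `lambda_from_half_life` and `decay_factor` are hypothetical helper names, not the crate's actual API.

```rust
use std::time::Duration;

/// Stand-in for the roadmap's `EntityId` newtype (assumption: a plain u64 wrapper).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct EntityId(pub u64);

impl EntityId {
    /// Big-endian bytes compare lexicographically in the same order as the numeric IDs.
    pub fn to_be_bytes(self) -> [u8; 8] {
        self.0.to_be_bytes()
    }
}

/// Hypothetical helper: compute the decay constant once, at schema-definition time.
/// Returns `None` for a zero half-life, which schema validation is meant to reject.
pub fn lambda_from_half_life(half_life: Duration) -> Option<f64> {
    if half_life.is_zero() {
        return None;
    }
    Some(std::f64::consts::LN_2 / half_life.as_secs_f64())
}

/// Hypothetical helper: the hot path only multiplies and exponentiates.
pub fn decay_factor(lambda: f64, elapsed_secs: f64) -> f64 {
    (-lambda * elapsed_secs).exp()
}
```

After exactly one half-life the factor is 0.5, which is the kind of invariant the lambda property test can check across the half-life range.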

#### Phase 2: Write-Ahead Log -- COMPLETE

**Delivers:** A durable, append-only log for signal events. Every signal write is fsync'd before acknowledgment. Group commit amortizes fsync cost. Content-addressed events via BLAKE3 for deduplication. The WAL is the source of truth -- all other state is derived.

**Acceptance Criteria:**
- [x] WAL entries are length-prefixed with BLAKE3 checksums
- [x] Group commit batches up to 100 events or 10ms, whichever comes first
- [x] Duplicate events (same BLAKE3 hash) are silently deduplicated
- [x] WAL replay from any checkpoint produces identical state to uninterrupted execution (property test with 10,000+ random event sequences)
- [x] `fsync` is called per batch, not per event
- [x] WAL can be truncated after a checkpoint without losing committed state
- [x] Crash simulation (kill at random WAL positions) never produces corrupt state -- either the event is committed or it is not

**Depends On:** Phase 1
**Complexity:** L
**Research Reference:** `docs/research/tidaldb_wal.md` (wire format, group commit, crash detection, deduplication), `thoughts.md` Part II.1 (WAL convergence), Part V.5-6 (quarantine-first, group commit)
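
The group-commit rule above (flush at 100 events or 10ms, whichever comes first) can be sketched as a small batching policy. This is an illustration only: `GroupCommit`, `push`, and `poll` are hypothetical names, events are opaque byte vectors, and the actual write, BLAKE3 checksum, and fsync are left to the caller.

```rust
use std::time::{Duration, Instant};

/// Hypothetical batching policy: flush at `max_events`, or once the oldest
/// pending event has waited `max_delay`, whichever comes first.
pub struct GroupCommit {
    max_events: usize,
    max_delay: Duration,
    pending: Vec<Vec<u8>>,
    oldest: Option<Instant>,
}

impl GroupCommit {
    pub fn new(max_events: usize, max_delay: Duration) -> Self {
        Self { max_events, max_delay, pending: Vec::new(), oldest: None }
    }

    /// Queue one encoded event. Returns a full batch when the size limit is hit;
    /// the caller would write that batch and issue a single fsync for it.
    pub fn push(&mut self, event: Vec<u8>, now: Instant) -> Option<Vec<Vec<u8>>> {
        self.oldest.get_or_insert(now);
        self.pending.push(event);
        if self.pending.len() >= self.max_events {
            self.take()
        } else {
            None
        }
    }

    /// Called periodically: returns a batch if the oldest event has waited long enough.
    pub fn poll(&mut self, now: Instant) -> Option<Vec<Vec<u8>>> {
        match self.oldest {
            Some(t) if now.duration_since(t) >= self.max_delay => self.take(),
            _ => None,
        }
    }

    /// Hand over all pending events and reset the timer.
    fn take(&mut self) -> Option<Vec<Vec<u8>>> {
        self.oldest = None;
        if self.pending.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.pending))
        }
    }
}
```

Passing `now` in explicitly keeps the policy deterministic under test, which fits the crash-simulation and replay property tests listed above.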

#### Phase 3: Storage Engine Trait and fjall Backend -- COMPLETE

**Delivers:** The `StorageEngine` trait abstraction and two implementations: `FjallBackend` (fjall 3 LSM-tree) for production and `InMemoryBackend` (BTreeMap + RwLock) for deterministic testing. Key encoding follows the subject-prefix pattern with a `Tag` discriminant. `FjallStorage` coordinates three keyspaces, one per entity kind. `FjallAtomicBatch` provides cross-keyspace atomic writes.

**Acceptance Criteria:**
- [x] `StorageEngine` trait with `get`, `put`, `delete`, `scan_prefix`, `write_batch`, `flush` operations
- [x] Key encoding: `[entity_id: 8 bytes BE][0x00][Tag: 1 byte][suffix...]` with `Tag` enum (`Evt`=0x01, `Sig`=0x02, `Meta`=0x03, `Rel`=0x04, `Mv`=0x05, `Idx`=0x06)
- [x] `encode_key`, `parse_key` roundtrip correctly for all tag variants and arbitrary suffixes
- [x] `entity_prefix` (9 bytes) and `entity_tag_prefix` (10 bytes) for scoped prefix scans
- [x] Byte-lexicographic key ordering matches numeric entity ID ordering (property tested)
- [x] `FjallBackend` wraps a single fjall `Keyspace`, implements `StorageEngine`
- [x] `FjallStorage` owns a fjall `Database` with three keyspaces: "items", "users", "creators" (one per `EntityKind`)
- [x] `FjallStorage::backend(EntityKind)` routes to the correct keyspace backend
- [x] Entity kind isolation: same key written to different entity kinds does not collide
- [x] `FjallAtomicBatch` provides cross-keyspace atomic writes via `fjall::OwnedWriteBatch`
- [x] Data persists across close and reopen (`flush_all` + reopen test)
- [x] `InMemoryBackend` uses `BTreeMap` + `RwLock` for deterministic, sorted, concurrent testing
- [x] `WriteBatch` and `BatchOp` types for atomic multi-operation writes
- [x] `PrefixIterator` type alias for boxed prefix scan iterators
- [x] Property tests with proptest: encode/parse roundtrip, prefix ordering, prefix containment
- [x] Criterion benchmarks passing
- [x] `cargo fmt` clean, `cargo clippy -D warnings` clean, all 140 tests pass (128 unit + 12 integration)

**Depends On:** Phase 1
**Complexity:** L
**Research Reference:** `thoughts.md` Part V.9 (hybrid storage), Part V.12 (subject-prefix keys), `CODING_GUIDELINES.md` section 2
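
The key layout in the criteria above can be sketched as follows. The byte layout and tag values are taken from the list; the function bodies are an illustrative reconstruction and may differ from the shipped `encode_key` and `parse_key`.

```rust
/// Tag discriminants as listed in the acceptance criteria.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Tag {
    Evt = 0x01,
    Sig = 0x02,
    Meta = 0x03,
    Rel = 0x04,
    Mv = 0x05,
    Idx = 0x06,
}

impl Tag {
    fn from_byte(b: u8) -> Option<Tag> {
        Some(match b {
            0x01 => Tag::Evt,
            0x02 => Tag::Sig,
            0x03 => Tag::Meta,
            0x04 => Tag::Rel,
            0x05 => Tag::Mv,
            0x06 => Tag::Idx,
            _ => return None,
        })
    }
}

/// Layout: [entity_id: 8 bytes BE][0x00][tag: 1 byte][suffix...]
pub fn encode_key(entity_id: u64, tag: Tag, suffix: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(10 + suffix.len());
    key.extend_from_slice(&entity_id.to_be_bytes());
    key.push(0x00);
    key.push(tag as u8);
    key.extend_from_slice(suffix);
    key
}

/// Inverse of `encode_key`; returns `None` for short keys, a bad separator, or an unknown tag.
pub fn parse_key(key: &[u8]) -> Option<(u64, Tag, &[u8])> {
    if key.len() < 10 || key[8] != 0x00 {
        return None;
    }
    let mut id = [0u8; 8];
    id.copy_from_slice(&key[..8]);
    Some((u64::from_be_bytes(id), Tag::from_byte(key[9])?, &key[10..]))
}

/// 10-byte prefix for scanning one data category of one entity.
pub fn entity_tag_prefix(entity_id: u64, tag: Tag) -> [u8; 10] {
    let mut prefix = [0u8; 10];
    prefix[..8].copy_from_slice(&entity_id.to_be_bytes());
    prefix[9] = tag as u8;
    prefix
}
```

Because the entity ID comes first and is big-endian, all keys for one entity are contiguous, and byte order across entities matches numeric ID order.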

#### Phase 4: Signal Ledger -- Decay Scores and Windowed Aggregation

**Delivers:** The in-memory per-entity signal state with running decay scores (O(1) update, O(1) read) and bucketed windowed counters. Signal writes update the running scores atomically. Signal reads return decay-correct values without scanning raw events. State is checkpointed to storage for crash recovery.

**Acceptance Criteria:**
- [ ] `EntitySignalState` is `#[repr(C, align(64))]` -- one L1 cache line per hot-path struct
- [ ] Running decay formula: `S(t) = S(t_prev) * exp(-lambda * dt) + weight` -- mathematically exact, verified against analytical brute-force computation to 6 decimal places across 10,000 random event sequences (property test)
- [ ] Out-of-order events handled correctly: when `t_event < last_update`, weight is pre-decayed: `score += weight * exp(-lambda * (last_update - t_event))`
- [ ] Windowed counts use per-minute bucketed counters (BucketedCounter) supporting 1h/24h/7d windows
- [ ] Velocity = windowed_count / window_duration_seconds
- [ ] Signal write latency < 100 microseconds including WAL write (amortized), benchmarked with criterion
- [ ] Decay score read latency < 100ns per entity per lambda, benchmarked with criterion
- [ ] 200-entity scoring pass < 5 microseconds, benchmarked with criterion
- [ ] State checkpointed to storage every 30 seconds; crash recovery reconstructs from checkpoint + WAL replay
- [ ] DashMap or sharded map for concurrent entity state access; signal counters use AtomicU64 with Relaxed ordering

**Depends On:** Phase 2, Phase 3
**Complexity:** XL
**Research Reference:** `docs/research/tidaldb_signal_ledger.md` (running-score formula, SWAG, BucketedCounter, EntityState struct, three-tier architecture)
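
The running-score update and the out-of-order rule above can be checked against the brute-force sum with a short sketch. `RunningDecay` is a hypothetical, simplified stand-in for one lambda slot of `EntitySignalState`; timestamps are assumed to be seconds from an arbitrary epoch.

```rust
/// Hypothetical single-lambda running state (the real struct would hold more fields).
#[derive(Debug, Clone, Copy)]
pub struct RunningDecay {
    pub score: f64,
    /// Timestamp (seconds) at which `score` is valid.
    pub last_update: f64,
}

impl RunningDecay {
    pub fn new() -> Self {
        Self { score: 0.0, last_update: 0.0 }
    }

    /// O(1) update. In-order events decay the old score forward and add the weight;
    /// late events are pre-decayed to `last_update` before being added.
    pub fn apply(&mut self, lambda: f64, weight: f64, t_event: f64) {
        if t_event >= self.last_update {
            let dt = t_event - self.last_update;
            self.score = self.score * (-lambda * dt).exp() + weight;
            self.last_update = t_event;
        } else {
            self.score += weight * (-lambda * (self.last_update - t_event)).exp();
        }
    }

    /// O(1) read: decay the stored score from `last_update` to `now`.
    pub fn read(&self, lambda: f64, now: f64) -> f64 {
        let dt = (now - self.last_update).max(0.0);
        self.score * (-lambda * dt).exp()
    }
}

/// Reference computation: S(t) = sum(w_i * exp(-lambda * (t - t_i))).
/// `events` holds (timestamp, weight) pairs in any order.
pub fn brute_force(lambda: f64, events: &[(f64, f64)], now: f64) -> f64 {
    events
        .iter()
        .map(|&(t, w)| w * (-lambda * (now - t)).exp())
        .sum()
}
```

The two computations agree up to floating-point rounding because each event's contribution is multiplied by the same total decay factor either way; only the grouping of the multiplications differs.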

#### Phase 5: Entity CRUD and Signal Write API

**Delivers:** The public API surface for Milestone 1. `TidalDB::open()`, `TidalDB::shutdown()`, entity write/read, signal write/read. This is the interface the UAT scenario tests against. Includes the `signal()` method that atomically writes to WAL, updates in-memory state, and returns immediately.

**Acceptance Criteria:**
- [ ] `TidalDB::open(config)` opens storage, restores in-memory state from checkpoint + WAL replay, returns `Result<TidalDB>`
- [ ] `TidalDB::shutdown()` checkpoints all in-memory state, syncs WAL, closes storage cleanly
- [ ] `db.write_item(id, metadata)` stores entity metadata
- [ ] `db.signal(signal_type, entity_id, weight, timestamp)` atomically: appends to WAL, updates decay scores, updates windowed counters
- [ ] `db.read_decay_score(entity_id, signal_type, lambda_index)` returns current decayed score
- [ ] `db.read_windowed_count(entity_id, signal_type, window)` returns count within window
- [ ] `db.read_velocity(entity_id, signal_type, window)` returns count / window_duration
- [ ] Full UAT scenario passes as an integration test
- [ ] `TidalDB` is `Send + Sync` -- safe to share across threads behind `Arc`

**Depends On:** Phase 4
**Complexity:** M
**Research Reference:** `CODING_GUIDELINES.md` section 9 (public API surface)
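
The windowed-count and velocity reads listed above could be served by a per-minute ring of counters, as Phase 4 describes. The sketch below is a guess at the shape, not the planned `BucketedCounter`: the names and the `u64` second timestamps are assumptions, and counts are only accurate to one-minute resolution at the window boundary.

```rust
/// Hypothetical per-minute ring of counters covering one window.
pub struct BucketedCounter {
    buckets: Vec<u64>,
    /// Minute index of the newest bucket seen so far.
    head_minute: u64,
}

impl BucketedCounter {
    pub fn new(window_minutes: usize) -> Self {
        Self { buckets: vec![0; window_minutes], head_minute: 0 }
    }

    /// Move the head forward, zeroing buckets that fell out of the window.
    fn advance(&mut self, minute: u64) {
        if minute <= self.head_minute {
            return;
        }
        let n = self.buckets.len() as u64;
        let steps = (minute - self.head_minute).min(n);
        for i in 1..=steps {
            let idx = ((self.head_minute + i) % n) as usize;
            self.buckets[idx] = 0;
        }
        self.head_minute = minute;
    }

    /// Count one event at `t_secs`. Events older than the window are ignored.
    pub fn record(&mut self, t_secs: u64) {
        let minute = t_secs / 60;
        self.advance(minute);
        let n = self.buckets.len() as u64;
        if self.head_minute - minute < n {
            self.buckets[(minute % n) as usize] += 1;
        }
    }

    /// Number of events in the window ending at `now_secs`.
    pub fn count(&mut self, now_secs: u64) -> u64 {
        self.advance(now_secs / 60);
        self.buckets.iter().sum()
    }
}

/// Velocity as defined in the criteria: windowed count divided by window duration in seconds.
pub fn velocity(count: u64, window_secs: f64) -> f64 {
    count as f64 / window_secs
}
```

Reads cost one pass over the buckets in this sketch; keeping a running total that is adjusted in `advance` and `record` would make the read O(1).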

### Deferred to Later Milestones

- **User entities and preference vectors** -- deferred to M3 because M1 proves the signal primitive without needing user context
- **Creator entities and relationship edges** -- deferred to M2/M3 because M1 only needs items to prove signal correctness
- **Vector index (USearch)** -- deferred to M2 because M1 does not need ANN retrieval
- **Text index (Tantivy)** -- deferred to M4 because M1 does not need full-text search
- **Ranking profiles** -- deferred to M2 because M1 proves signals work; M2 proves ranking over signals works
- **Query parser** -- deferred to M2; M1 uses the Rust API directly
- **Diversity enforcement** -- deferred to M2 because M1 does not produce ranked result sets
- **Signal rollups (hourly/daily materialization)** -- deferred to M5 because the bucketed counter approach serves the performance budget through M4; rollups become necessary only at scale for 30d+ windows
- **RocksDB backend** -- deferred indefinitely; fjall is the primary backend, RocksDB is the trait-abstracted fallback if benchmarks demand it

### Integration Test

```rust
#[test]
fn milestone_1_uat() {
    // Open tidalDB with signal schema
    let db = TidalDB::open(Config {
        data_dir: temp_dir(),
        schema: Schema::builder()
            .entity_type("item", &["title", "category", "created_at"])
            .signal("view", Decay::exponential(Duration::days(7)),
                    &[Window::OneHour, Window::TwentyFourHours, Window::SevenDays])
            .signal("like", Decay::exponential(Duration::days(14)),
                    &[Window::TwentyFourHours, Window::SevenDays, Window::AllTime])
            .signal("skip", Decay::exponential(Duration::days(1)),
                    &[Window::OneHour, Window::TwentyFourHours])
            .build(),
    }).unwrap();

    // Write 100 items
    for i in 0..100 {
        db.write_item(EntityId(i), metadata(i)).unwrap();
    }

    // Write 10,000 signal events spanning 7 days
    let events = generate_events(10_000, Duration::days(7));
    for e in &events {
        db.signal(e.signal_type, e.entity_id, e.weight, e.timestamp).unwrap();
    }

    // Read and verify item #42
    let now = Timestamp::now();
    let analytical_score = compute_analytical_decay(&events, EntityId(42), "view", now);
    let actual_score = db.read_decay_score(EntityId(42), "view", 0).unwrap();
    assert!((actual_score - analytical_score).abs() < 1e-6);

    let analytical_count = count_events_in_window(&events, EntityId(42), "view", now, Duration::hours(24));
    let actual_count = db.read_windowed_count(EntityId(42), "view", Window::TwentyFourHours).unwrap();
    assert_eq!(actual_count, analytical_count);

    // Write new event and verify immediate visibility
    db.signal("view", EntityId(42), 1.0, now).unwrap();
    let new_score = db.read_decay_score(EntityId(42), "view", 0).unwrap();
    assert!(new_score > actual_score);

    // Close, reopen, verify persistence
    db.shutdown().unwrap();
    let db2 = TidalDB::open(same_config()).unwrap();
    let recovered_score = db2.read_decay_score(EntityId(42), "view", 0).unwrap();
    assert!((recovered_score - new_score).abs() < 1e-6);
}
```

### Done When

A developer can embed tidalDB as a Rust dependency, define signal types with decay rates and windows in schema, write thousands of signal events, and read back decay-correct scores, windowed counts, and velocity values that match analytical computation to 6 decimal places -- including after a crash and restart. Performance benchmarks pass: signal write < 100us amortized, decay read < 100ns per entity, 200-entity scoring < 5us.

---

## Milestone 2: Ranked Retrieval -- "A single query retrieves, scores, and ranks content"

### Milestone Thesis

A developer can write items with metadata and embeddings, write signal events, and execute a RETRIEVE query that returns items ranked by a named profile using live signal scores -- with metadata filters and diversity constraints applied by the database, not the application. This proves that ranking is a database operation, not application logic.

### UAT Scenario

```
Given:
  A tidalDB instance with:
    - 10,000 items with metadata (title, category, format, duration, created_at)
      and 1536-dim embeddings
    - Signal types: view (7d decay), like (14d decay), skip (1d decay),
      share (3d decay), completion (30d decay)
    - 100,000 signal events spanning 7 days across the items
    - Ranking profiles defined:
      * "trending" -- share_velocity(6h) primary, view_velocity(6h) secondary,
        engagement_ratio gate > 0.03
      * "hot" -- score / (age_hours + 2)^1.8
      * "new" -- created_at DESC
      * "top_week" -- quality_score within 7d window
      * "hidden_gems" -- high completion_rate, inverse view_count
      * "controversial" -- max(likes * dislikes)

When:
  1. RETRIEVE items USING PROFILE trending DIVERSITY max_per_creator:1 LIMIT 25
  2. RETRIEVE items FILTER category:jazz USING PROFILE hot LIMIT 20
  3. RETRIEVE items USING PROFILE new LIMIT 20
  4. RETRIEVE items USING PROFILE top_week LIMIT 20
  5. RETRIEVE items USING PROFILE hidden_gems FILTER min_completion_rate:0.7 LIMIT 10
  6. RETRIEVE items USING PROFILE controversial LIMIT 10
  7. Write a burst of 100 "share" signals for item #500
  8. Re-execute the trending query

Then:
  - Step 1: Items ordered by share velocity, max 1 per creator, items with
    engagement_ratio < 0.03 excluded
  - Step 2: Only jazz items returned, ordered by hot formula
  - Step 3: Items ordered by created_at descending, no signal computation
  - Step 4: Items ordered by quality score computed from 7d-windowed signals
  - Step 5: Items with high completion but low views, sorted by quality/reach ratio
  - Step 6: Items with highest product of positive and negative signals
  - Step 7: All 100 signal writes succeed
  - Step 8: Item #500 appears higher in trending results (signal written 100ms ago
    is reflected)
  - Performance: end-to-end RETRIEVE < 50ms for 10K items
```

### Phases

#### Phase 1: Vector Index Integration (USearch)

**Delivers:** USearch wrapped behind a trait, with mmap persistence, f16 quantization, and the adaptive filtered search planner. Items can be inserted with embeddings and retrieved by ANN similarity.

**Acceptance Criteria:**
- [ ] `VectorIndex` trait with `insert(key, vector)`, `remove(key)`, `search(query, k)`, `filtered_search(query, k, predicate)`, `save()`, `load()`, `view()`
- [ ] USearch backend implements the trait with f16 quantization (default), mmap persistence
- [ ] Vectors normalized to unit length at insertion time (for unit vectors, ranking by L2 distance is equivalent to ranking by cosine similarity)
- [ ] Adaptive query planner: selectivity < 2% triggers pre-filter + brute-force; 2-100% uses `filtered_search` with predicate callback
- [ ] ANN retrieval at 10K vectors returns top-100 with recall@10 > 0.95
- [ ] ANN retrieval latency < 10ms at 10K vectors (benchmarked)
- [ ] Persistence: save on checkpoint, view() on restart for immediate read serving
- [ ] `#![forbid(unsafe_code)]` relaxed only in the USearch FFI boundary module with SAFETY comments

**Depends On:** m1p3 (storage traits)
**Complexity:** L
**Research Reference:** `docs/research/ann_for_tidaldb.md` (USearch architecture, filtered search, f16, mmap)
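
Two of the criteria above lend themselves to a small sketch: normalizing at insertion time, and choosing a search strategy from filter selectivity. All names here (`normalize`, `plan`, `SearchStrategy`) are hypothetical; the 2% threshold comes from the criteria, and no USearch calls are shown.

```rust
/// Scale a vector to unit length; returns `None` for the zero vector.
pub fn normalize(v: &[f32]) -> Option<Vec<f32>> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return None;
    }
    Some(v.iter().map(|x| x / norm).collect())
}

pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

pub fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Hypothetical planner output.
#[derive(Debug, PartialEq, Eq)]
pub enum SearchStrategy {
    /// Very selective filter: materialize the matching set and scan it exactly.
    PreFilterBruteForce,
    /// Otherwise: graph search with a per-candidate predicate callback.
    FilteredGraphSearch,
}

/// `selectivity` is the estimated fraction of items passing the filter, in [0, 1].
pub fn plan(selectivity: f64) -> SearchStrategy {
    if selectivity < 0.02 {
        SearchStrategy::PreFilterBruteForce
    } else {
        SearchStrategy::FilteredGraphSearch
    }
}
```

For unit vectors, squared L2 distance equals `2 - 2 * dot(a, b)`, so sorting by either quantity yields the same neighbor order.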

#### Phase 2: Metadata Indexes and Filter Engine

**Delivers:** Roaring bitmap indexes for categorical metadata, B-tree indexes for range attributes, and a composable filter engine that evaluates arbitrary filter combinations. The filter engine produces either a bitmap (for pre-filtering ANN) or a predicate closure (for in-graph filtering).

**Acceptance Criteria:**
- [ ] Roaring bitmap per distinct value of each categorical metadata field: category, format, creator_id
- [ ] B-tree index for range attributes: created_at, duration
- [ ] Filter expressions are composable: AND across dimensions, OR within a dimension
- [ ] `filter.selectivity()` estimates the fraction of items matching (for query planner)
- [ ] `filter.to_bitmap()` returns a RoaringBitmap for pre-filtering
- [ ] `filter.to_predicate()` returns a `Fn(EntityId) -> bool` for in-graph filtering
- [ ] Filters tested: category:jazz, format:video, duration_min:5m, created_within:7d, and arbitrary combinations
- [ ] Filter evaluation < 1 microsecond per candidate (benchmarked)

**Depends On:** m1p3 (storage engine)
**Complexity:** M
**Research Reference:** `docs/research/ann_for_tidaldb.md` (metadata indexes, selectivity estimation, roaring bitmaps)
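
The composition rule above (OR within a dimension, AND across dimensions) can be sketched with plain sets. This is a simplified illustration: `BTreeSet<u64>` stands in for a Roaring bitmap so the example needs only the standard library, `MetadataIndex` is a hypothetical type, and `selectivity` here is exact rather than estimated.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical inverted index: (field, value) -> IDs of matching items.
#[derive(Default)]
pub struct MetadataIndex {
    postings: HashMap<(String, String), BTreeSet<u64>>,
    total_items: usize,
}

impl MetadataIndex {
    pub fn insert(&mut self, id: u64, fields: &[(&str, &str)]) {
        self.total_items += 1;
        for (field, value) in fields {
            self.postings
                .entry((field.to_string(), value.to_string()))
                .or_default()
                .insert(id);
        }
    }

    /// Each filter entry is (field, accepted values).
    /// Values are OR-ed within a field; fields are AND-ed together.
    /// An empty filter yields an empty set in this sketch.
    pub fn evaluate(&self, filter: &[(&str, Vec<&str>)]) -> BTreeSet<u64> {
        let mut result: Option<BTreeSet<u64>> = None;
        for (field, values) in filter {
            let mut union = BTreeSet::new();
            for value in values {
                let key = (field.to_string(), value.to_string());
                if let Some(ids) = self.postings.get(&key) {
                    union.extend(ids.iter().copied());
                }
            }
            result = Some(match result {
                None => union,
                Some(acc) => acc.intersection(&union).copied().collect(),
            });
        }
        result.unwrap_or_default()
    }

    /// Fraction of indexed items matching the filter.
    pub fn selectivity(&self, filter: &[(&str, Vec<&str>)]) -> f64 {
        if self.total_items == 0 {
            return 0.0;
        }
        self.evaluate(filter).len() as f64 / self.total_items as f64
    }
}
```

With real bitmaps the union and intersection steps become bitwise OR and AND, and a selectivity estimate can come from bitmap cardinalities without materializing the result.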

#### Phase 3: Ranking Profile Engine

**Delivers:** Named ranking profiles declared as data (not compiled code), parsed, validated, stored, and executed by the database. Profiles reference signal scores, windowed aggregates, velocity, metadata fields, and define quality gates. Profiles are versioned and swappable at query time.

**Acceptance Criteria:**
- [ ] Profile declaration syntax supports: primary signal, secondary signals with weights, BOOST, GATE (minimum threshold), PENALIZE, EXCLUDE
- [ ] Profiles stored in schema, versioned, retrievable by name
- [ ] Profile execution: given a candidate set and a profile, produce a scored and sorted result list
- [ ] Built-in profiles implemented: `trending`, `hot`, `new`, `top_week`, `top_month`, `top_all_time`, `hidden_gems`, `controversial`, `most_viewed`, `most_liked`, `shuffle`
- [ ] `hot` formula: `score / (age_hours + 2)^gravity` with configurable gravity
- [ ] `controversial` formula: `max(positive_signals * negative_signals)`
- [ ] `hidden_gems` formula: `quality_score * (1 / log(1 + view_count))`, with zero-view items handled separately because `log(1 + 0) = 0` leaves the factor undefined
- [ ] Profile change does not require recompile -- profiles are runtime data
- [ ] 200-candidate scoring pass with a profile < 10 microseconds (benchmarked)

**Depends On:** m1p4 (signal ledger)
**Complexity:** L
**Research Reference:** `VISION.md` (ranking profile declarations), `ai-lookup/services/ranking-profiles.md`, `USE_CASES.md` Appendix B (sort mode formulas)
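
The sort-mode formulas above are simple enough to sketch directly. The function names are hypothetical, natural log is assumed for `hidden_gems`, and returning `None` for zero views is one possible way to handle the zero denominator, not a documented decision.

```rust
use std::cmp::Ordering;

/// `hot`: score / (age_hours + 2)^gravity. The UAT scenario uses gravity = 1.8.
pub fn hot(score: f64, age_hours: f64, gravity: f64) -> f64 {
    score / (age_hours + 2.0).powf(gravity)
}

/// `hidden_gems`: quality_score * (1 / log(1 + view_count)).
/// Assumption: natural log. With zero views the denominator is zero,
/// so this sketch returns `None` and leaves the policy to the caller.
pub fn hidden_gems(quality_score: f64, view_count: u64) -> Option<f64> {
    if view_count == 0 {
        return None;
    }
    Some(quality_score / (1.0 + view_count as f64).ln())
}

/// `controversial`: product of positive and negative signal totals.
pub fn controversial(positive: f64, negative: f64) -> f64 {
    positive * negative
}

/// Sort (item_id, score) pairs by score, highest first.
pub fn rank_desc(mut scored: Vec<(u64, f64)>) -> Vec<(u64, f64)> {
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(Ordering::Equal));
    scored
}
```

In the profile engine these would be expressed as runtime data rather than compiled functions; the sketch only pins down the arithmetic each built-in profile is expected to reproduce.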

#### Phase 4: Diversity Enforcement

**Delivers:** Post-scoring diversity pass that reorders results to satisfy constraints (max_per_creator, format_mix) without reducing result count. Implemented as a greedy selection pass over the scored candidate list.

**Acceptance Criteria:**
- [ ] `max_per_creator:N` enforced: no more than N items from any single creator in the result set
- [ ] `format_mix:true` enforced: no more than 60% of results from any single format
- [ ] Diversity pass does not reduce result count -- it selects the next-best candidate that satisfies constraints
- [ ] Diversity pass adds < 1ms for 200 candidates (benchmarked)
- [ ] When diversity constraints cannot be fully satisfied (too few creators), results are returned with a warning flag, not an error
- [ ] Property test: diversity constraints hold for 10,000 random candidate sets

**Depends On:** Phase 3 (ranking profiles produce scored lists)
**Complexity:** M
**Research Reference:** `VISION.md` (diversity as query constraint), `thoughts.md` Part V.14 (MMR post-scoring)
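
A greedy pass for `max_per_creator` might look like the sketch below. It assumes the input is already sorted by score, highest first. Backfilling from skipped candidates when the constraint cannot be met is an assumption about how "returned with a warning flag" would work; `Scored`, `DiverseResult`, and `enforce_max_per_creator` are hypothetical names.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Scored {
    pub item: u64,
    pub creator: u64,
    pub score: f64,
}

pub struct DiverseResult {
    pub items: Vec<Scored>,
    /// True when the constraint could not be satisfied and was relaxed to fill the page.
    pub constraint_relaxed: bool,
}

/// Walk the scored list in order, taking each candidate whose creator is under the cap.
pub fn enforce_max_per_creator(
    scored: &[Scored],
    max_per_creator: usize,
    limit: usize,
) -> DiverseResult {
    let mut per_creator: HashMap<u64, usize> = HashMap::new();
    let mut picked = Vec::with_capacity(limit);
    let mut skipped = Vec::new();

    for candidate in scored {
        if picked.len() == limit {
            break;
        }
        let count = per_creator.entry(candidate.creator).or_insert(0);
        if *count < max_per_creator {
            *count += 1;
            picked.push(candidate.clone());
        } else {
            skipped.push(candidate.clone());
        }
    }

    // Too few distinct creators: fill the remaining slots with the best
    // skipped candidates and report that the constraint was relaxed.
    let constraint_relaxed = picked.len() < limit && !skipped.is_empty();
    if constraint_relaxed {
        let need = limit - picked.len();
        picked.extend(skipped.into_iter().take(need));
    }
    DiverseResult { items: picked, constraint_relaxed }
}
```

Backfilled items are appended after the constrained picks, so a relaxed result is no longer strictly sorted by score; a real implementation would need to decide whether to re-sort.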

#### Phase 5: Query Parser and RETRIEVE Executor

**Delivers:** The query parser for the RETRIEVE operation and the executor that orchestrates candidate retrieval, filtering, scoring, diversity, and result assembly. This is the "one query" entry point. For M2, the RETRIEVE query does not require `FOR USER` (no personalization yet) -- it operates on the full item corpus with filters and profiles.

**Acceptance Criteria:**
- [ ] Parser handles: `RETRIEVE items`, `USING PROFILE <name>`, `FILTER <conditions>`, `DIVERSITY <constraints>`, `LIMIT <n>`, `EXCLUDE [ids]`
- [ ] Parser produces a typed AST; parse errors include position and helpful message
- [ ] Executor pipeline: candidate retrieval (ANN or full scan based on profile) -> filter -> score -> diversity -> limit -> return
- [ ] When profile uses velocity/decay signals, executor uses ANN retrieval over embeddings then scores with signal state
- [ ] When profile is `new` or `alphabetical`, executor skips ANN and uses metadata index directly
- [ ] End-to-end RETRIEVE latency < 50ms at 10K items (benchmarked)
- [ ] Results include: entity_id, score, and a signal snapshot (key signal values used in scoring) for debugging/transparency
- [ ] `SIGNAL` write command also parsed and routed to signal write path from M1
- [ ] Full M2 UAT scenario passes as an integration test

**Depends On:** Phase 1, Phase 2, Phase 3, Phase 4
**Complexity:** L
**Research Reference:** `ai-lookup/features/query-language.md`, `SEQUENCE.md` (all sequence diagrams)
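
To make the parser criteria concrete, here is a deliberately small sketch covering `RETRIEVE items`, `USING PROFILE`, `FILTER`, `DIVERSITY`, and `LIMIT`, with byte positions in errors. It is whitespace-tokenized and omits `EXCLUDE`; `RetrieveQuery`, `ParseError`, and `parse_retrieve` are hypothetical names, and the real grammar may differ.

```rust
#[derive(Debug, Default, PartialEq)]
pub struct RetrieveQuery {
    pub profile: Option<String>,
    pub filters: Vec<String>,
    pub diversity: Vec<(String, String)>,
    pub limit: Option<usize>,
}

#[derive(Debug, PartialEq)]
pub struct ParseError {
    /// Byte offset of the clause that failed to parse.
    pub position: usize,
    pub message: String,
}

pub fn parse_retrieve(input: &str) -> Result<RetrieveQuery, ParseError> {
    // Tokenize on whitespace, remembering each token's byte offset.
    let tokens: Vec<(usize, &str)> = input
        .split_whitespace()
        .map(|t| (t.as_ptr() as usize - input.as_ptr() as usize, t))
        .collect();
    let err = |position: usize, msg: &str| ParseError { position, message: msg.to_string() };
    let is_keyword = |t: &str| matches!(t, "USING" | "FILTER" | "DIVERSITY" | "LIMIT");

    let mut it = tokens.into_iter().peekable();
    match (it.next(), it.next()) {
        (Some((_, "RETRIEVE")), Some((_, "items"))) => {}
        _ => return Err(err(0, "expected `RETRIEVE items`")),
    }

    let mut query = RetrieveQuery::default();
    while let Some((pos, tok)) = it.next() {
        match tok {
            "USING" => match (it.next(), it.next()) {
                (Some((_, "PROFILE")), Some((_, name))) => query.profile = Some(name.to_string()),
                _ => return Err(err(pos, "expected `USING PROFILE <name>`")),
            },
            "LIMIT" => {
                let n = it.next().and_then(|(_, t)| t.parse::<usize>().ok());
                query.limit = Some(n.ok_or_else(|| err(pos, "expected `LIMIT <n>`"))?);
            }
            "FILTER" | "DIVERSITY" => {
                // Consume comma-separated items until the next clause keyword.
                while let Some(&(p, t)) = it.peek() {
                    if is_keyword(t) {
                        break;
                    }
                    it.next();
                    let item = t.trim_end_matches(',');
                    if tok == "FILTER" {
                        query.filters.push(item.to_string());
                    } else {
                        let (k, v) = item
                            .split_once(':')
                            .ok_or_else(|| err(p, "expected `<constraint>:<value>`"))?;
                        query.diversity.push((k.to_string(), v.to_string()));
                    }
                }
            }
            _ => return Err(err(pos, "unexpected token")),
        }
    }
    Ok(query)
}
```

The typed result is what the executor stages (candidate retrieval, filter, score, diversity, limit) would consume; keeping positions on errors supports the "position and helpful message" criterion.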

### Deferred to Later Milestones

- **FOR USER clause and user preference vectors** -- deferred to M3; M2 proves ranking works without personalization
- **SIMILAR TO clause (related content)** -- deferred to M3; requires user context for personalization layer
- **Relationship graph (follows, blocks)** -- deferred to M3; M2 filters on metadata, not relationships
- **SEARCH query (text + semantic)** -- deferred to M4; M2 proves RETRIEVE ranking
- **Full-text index (Tantivy)** -- deferred to M4
- **Exploration budget / cold start** -- deferred to M3; requires user context to be meaningful
- **User state filters (unseen, saved, liked)** -- deferred to M3; requires user entities
- **Engagement threshold filters (min_views, min_likes)** -- partially implemented via signal reads; full composable filter syntax deferred to M5

### Integration Test

```rust
#[test]
fn milestone_2_uat() {
    let db = open_with_full_schema();

    // Write 10K items with embeddings
    for i in 0..10_000 {
        db.write_item(EntityId(i), metadata(i), Some(embedding(i))).unwrap();
    }

    // Write 100K signal events
    for e in generate_events(100_000, Duration::days(7)) {
        db.signal(e.signal_type, e.entity_id, e.weight, e.timestamp).unwrap();
    }

    // Trending query with diversity
    let results = db.retrieve(
        "RETRIEVE items USING PROFILE trending DIVERSITY max_per_creator:1 LIMIT 25"
    ).unwrap();
    assert_eq!(results.len(), 25);
    assert!(results.windows(2).all(|w| w[0].score >= w[1].score));
    assert!(creator_counts(&results).values().all(|&c| c <= 1));

    // Category filter with hot sort
    let jazz = db.retrieve(
        "RETRIEVE items FILTER category:jazz USING PROFILE hot LIMIT 20"
    ).unwrap();
    assert!(jazz.iter().all(|r| r.metadata["category"] == "jazz"));

    // Signal freshness: write burst, verify ranking change
    let pre_burst = db.retrieve(
        "RETRIEVE items USING PROFILE trending LIMIT 10"
    ).unwrap();
    for _ in 0..100 {
        db.signal("share", EntityId(500), 1.0, Timestamp::now()).unwrap();
    }
    let post_burst = db.retrieve(
        "RETRIEVE items USING PROFILE trending LIMIT 10"
    ).unwrap();
    let pre_rank = pre_burst.iter().position(|r| r.id == EntityId(500));
    let post_rank = post_burst.iter().position(|r| r.id == EntityId(500));
    assert!(post_rank.unwrap() < pre_rank.unwrap_or(25));
}
```

### Done When

A developer can write items with embeddings and metadata, write signal events, and execute RETRIEVE queries with any of the 11+ built-in sort modes, metadata filters, and diversity constraints. Results are correctly ranked by the named profile. Signal events written 100ms ago are reflected in the next query. End-to-end latency < 50ms at 10K items. Diversity constraints hold in every result set.

---

## Milestone 3: Personalized Ranking -- "The For You query works"

### Milestone Thesis

A developer can write user entities with preference vectors, write relationship edges (follows, blocks), write engagement signals that update user profiles and relationship weights automatically, and execute `RETRIEVE items FOR USER @user_id USING PROFILE for_you` -- getting results shaped by the user's history, relationships, and implicit preferences. This proves that the feedback loop closes inside the database.

### UAT Scenario

```
Given:
  A tidalDB instance with:
    - 10,000 items across 200 creators, with embeddings
    - 500 users with initial preference embeddings
    - Relationship edges: follows, blocks
    - Signals: view, like, skip, hide, completion, share
    - 500,000 historical signal events establishing user preferences
    - Profiles: for_you, following, related, notification

When:
  1. RETRIEVE items FOR USER @user_42 USING PROFILE for_you
     FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50
  2. RETRIEVE items FOR USER @user_42 FILTER relationship:follows
     USING PROFILE following LIMIT 50
  3. RETRIEVE items SIMILAR TO @item_abc FOR USER @user_42
     USING PROFILE related FILTER unseen LIMIT 10
  4. SIGNAL like item:@item_xyz user:@user_42
  5. Re-execute the for_you query
  6. SIGNAL hide item:@item_999 user:@user_42
  7. SIGNAL block user:@user_42 target_creator:@creator_77
  8. Re-execute the for_you query

Then:
  - Step 1: Results personalized -- items matching user_42's preference vector
    rank higher; items from blocked creators excluded; items already seen excluded;
    max 2 per creator; 10% exploration budget (items from unfollowed creators)
  - Step 2: Only items from followed creators, reverse-chronological order (newest first)
  - Step 3: Items semantically similar to @item_abc, re-ranked by user_42's
    preference match, already-seen excluded
  - Step 4: Signal write atomically updates: item like count, user->creator
    interaction weight, user preference vector shifted toward item embedding
  - Step 5: Results shift -- items similar to @item_xyz's topic rank higher;
    creator of @item_xyz appears more frequently
  - Step 6: @item_999 never appears in any future query for user_42
  - Step 7: All items by creator_77 excluded from all queries for user_42
  - Step 8: No items from creator_77; no item_999; shift from like reflected
```

### Phases

#### Phase 1: User and Creator Entities with Relationships

**Delivers:** User and creator entity types with preference vectors and a relationship graph. Relationship edges are weighted, directional, and queryable. Follows, blocks, interaction weights are first-class.

**Acceptance Criteria:**
- [ ] User entities store: user_id, preference embedding (mutable, updated on signals), metadata
- [ ] Creator entities store: creator_id, catalog embedding (aggregated from items), metadata
- [ ] Relationship edges: `(from_entity, to_entity, type, weight, timestamp)` with types: follows, blocks, interaction_weight, hide, mute
- [ ] `follows` filter: efficiently enumerate all items by creators a user follows (roaring bitmap of creator's item set, intersected with follows set)
- [ ] `unblocked` filter: efficiently exclude all items by blocked creators
- [ ] `unseen` filter: roaring bitmap of user's seen item set, inverted
- [ ] Relationship write/read latency < 50 microseconds

**Depends On:** m1p3 (storage), m2p2 (bitmap indexes)
**Complexity:** L

#### Phase 2: Feedback Loop -- Signal Writes Update User State

**Delivers:** When a signal event is written (like, skip, hide, completion), the database atomically updates the item's signal ledger, the user-to-item relationship, the user-to-creator interaction weight, and the user's preference vector. One write, multiple state updates, no application logic.

**Acceptance Criteria:**
- [ ] `db.signal("like", item_id, user_id, weight, timestamp)` atomically:
  1. Appends event to WAL
  2. Updates item signal ledger (decay scores, windowed counts)
  3. Increments user->creator interaction_weight
  4. Shifts user preference vector toward item embedding (configurable learning rate)
- [ ] `db.signal("skip", ...)` atomically: updates item skip count, decays user->creator weight, shifts preference vector away from item embedding
- [ ] `db.signal("hide", ...)` sets permanent hard-negative on user->item relationship; item excluded from all future queries for this user
- [ ] `db.signal("block", user, creator)` sets permanent block; all items by creator excluded from all queries for this user
- [ ] Preference vector update: positive signals use an exponential moving average, `pref = alpha * item_embedding + (1 - alpha) * pref`; negative signals subtract a scaled embedding, `pref = pref - alpha * item_embedding`; the vector is re-normalized after either update
- [ ] All updates visible to the next query (no eventual consistency lag within the process)
- [ ] Property test: 10,000 random signal sequences never produce a state where a hidden item or blocked creator appears in query results

**Depends On:** Phase 1, m1p4 (signal ledger)
**Complexity:** XL
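
The preference-vector rule above can be sketched in a few lines. This is a minimal illustration, assuming plain `f32` slices and L2 normalization; the function names and the choice of norm are assumptions, not the tidalDB implementation.

```rust
/// Scale `v` to unit L2 length in place; an all-zero vector is left untouched.
fn normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Apply one signal to the user's preference vector.
/// positive: pref = alpha * item + (1 - alpha) * pref
/// negative: pref = pref - alpha * item
/// The result is normalized in both cases.
fn update_preference(pref: &mut [f32], item: &[f32], alpha: f32, positive: bool) {
    assert_eq!(pref.len(), item.len(), "embedding dimensions must match");
    for (p, e) in pref.iter_mut().zip(item) {
        *p = if positive {
            alpha * e + (1.0 - alpha) * *p
        } else {
            *p - alpha * e
        };
    }
    normalize(pref);
}
```

A like moves the vector toward the item embedding (cosine similarity rises) and a skip moves it away, while normalization keeps the vector usable as an ANN query regardless of how many signals have been applied.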

#### Phase 3: Personalized Ranking Profiles

**Delivers:** Ranking profiles that incorporate user context: preference match (embedding similarity between user and item), user-creator interaction weight, social proof (engagement from user's follows), and user-specific exclusions. The `for_you`, `following`, `related`, and `notification` profiles.

**Acceptance Criteria:**
- [ ] `for_you` profile: ANN retrieval using user preference vector, scoring = preference_match * engagement_velocity * recency_decay * social_proof, gates on completion_rate, penalizes skip count, 10% exploration budget
- [ ] `following` profile: candidate set restricted to followed creators' items, sorted by created_at DESC, tiebreaker on completion_rate
- [ ] `related` profile: ANN retrieval using source item's embedding, collaborative filtering boost (items co-engaged with source), personalization re-rank by user preference
- [ ] `notification` profile: candidates from followed creators' recent items, scored by relationship_strength * item_quality
- [ ] Exploration budget: 10% of for_you results are from creators the user does not follow, to prevent filter bubbles
- [ ] Cold start: new users with no signal history get results ranked by population-level signals (trending, quality)
- [ ] Cold start: new items with no signals get an exploration window (appear in a small % of for_you feeds)
- [ ] `FOR USER @user_id` clause parsed and user state loaded into query context

**Depends On:** Phase 2, m2p3 (ranking engine), m2p5 (query parser)
**Complexity:** L
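
To make the `for_you` scoring criteria concrete, here is a small sketch of the multiplicative score, the completion-rate gate, and the exploration budget. The struct, the gate threshold parameter, and the exact form of the skip penalty are assumptions chosen for illustration; the roadmap only fixes the four multiplied factors and the 10% budget.

```rust
// Hypothetical per-candidate features; field names follow the criteria above.
struct Features {
    preference_match: f32,
    engagement_velocity: f32,
    recency_decay: f32,
    social_proof: f32,
    completion_rate: f32,
    skip_count: u32,
}

/// Returns None when the item fails the completion-rate gate.
/// The penalty shape (divide by 1 + penalty * skips) is an assumption.
fn for_you_score(f: &Features, min_completion: f32, skip_penalty: f32) -> Option<f32> {
    if f.completion_rate < min_completion {
        return None;
    }
    let base = f.preference_match * f.engagement_velocity * f.recency_decay * f.social_proof;
    Some(base / (1.0 + skip_penalty * f.skip_count as f32))
}

/// Number of result slots reserved for exploration, rounded down.
fn exploration_slots(limit: usize, budget_percent: usize) -> usize {
    limit * budget_percent / 100
}
```

Under these assumptions a `LIMIT 50` query with a 10% budget reserves 5 slots for creators the user does not follow, and the remaining 45 are filled by descending score.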

#### Phase 4: User State Filters

**Delivers:** Filters that depend on user state: unseen, in_progress, saved, liked, in_collection. These require per-user bitmaps or sets maintained by the signal system.

**Acceptance Criteria:**
- [ ] `unseen` filter: excludes items the user has viewed (maintained as roaring bitmap per user, updated on view signal)
- [ ] `unblocked` filter: excludes items from blocked creators and hidden items
- [ ] `saved` filter: returns only items the user has saved
- [ ] `liked` filter: returns only items the user has liked
- [ ] `in_progress` filter: returns items with partial completion signal
- [ ] User state filters compose with all metadata filters from M2
- [ ] Per-user seen bitmap memory: at most ~125KB per user at 1M items (1M bits / 8 for a fully dense roaring bitmap; sparse histories use less), so roughly 1.25GB worst case for 10K users held in memory

**Depends On:** Phase 1, Phase 2
**Complexity:** M
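
The `unseen` and `unblocked` filters above amount to set-membership checks against per-user state. The sketch below shows that composition with `HashSet` standing in for the per-user roaring bitmaps; `UserState`, `Candidate`, and `apply_unseen_unblocked` are hypothetical names used only for this example.

```rust
use std::collections::HashSet;

// Hypothetical per-user state maintained by the signal system.
struct UserState {
    seen: HashSet<u32>,
    hidden: HashSet<u32>,
    blocked_creators: HashSet<u64>,
}

struct Candidate {
    item_id: u32,
    creator_id: u64,
}

/// Keep only candidates that pass both filters:
/// `unseen` drops viewed items; `unblocked` drops hidden items
/// and every item by a blocked creator.
fn apply_unseen_unblocked(candidates: Vec<Candidate>, user: &UserState) -> Vec<Candidate> {
    candidates
        .into_iter()
        .filter(|c| !user.seen.contains(&c.item_id))
        .filter(|c| !user.hidden.contains(&c.item_id))
        .filter(|c| !user.blocked_creators.contains(&c.creator_id))
        .collect()
}
```

Because each filter is an independent predicate, metadata filters from M2 can be chained in the same way, which is what the composition criterion asks for.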

### Deferred to Later Milestones

- **SEARCH query with personalization** -- deferred to M4; M3 proves personalized RETRIEVE
- **Tantivy integration** -- deferred to M4
- **People/creator search (UC-10)** -- deferred to M4
- **Social graph traversal for trending ("trending among my follows")** -- deferred to M5; requires graph query capabilities beyond simple follows filter
- **Collaborative filtering** -- basic co-engagement signals used in `related` profile; full matrix-factorization-style CF deferred to M5
- **User-created collections/boards (UC-09.4)** -- deferred to M5
- **Live content status tracking (UC-12)** -- deferred to M5

### Integration Test

```rust
#[test]
fn milestone_3_uat() {
    let db = open_with_users_and_relationships();

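    // Fixture assumptions: `user_42_seen`, `same_for_you_query()`, `creator_counts()`,
    // and `now()` are test-harness helpers defined outside this snippet.
    // User 42 likes jazz, follows creators 1-10, and has blocked creator 77.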
    let feed = db.retrieve(
        "RETRIEVE items FOR USER @42 USING PROFILE for_you \
         FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50"
    ).unwrap();
    assert_eq!(feed.len(), 50);
    assert!(feed.iter().all(|r| !user_42_seen.contains(&r.id)));
    assert!(feed.iter().all(|r| r.creator_id != CreatorId(77)));
    assert!(creator_counts(&feed).values().all(|&c| c <= 2));

    // Like an item, verify preference shift
    db.signal("like", EntityId(500), UserId(42), 1.0, now()).unwrap();
    let feed2 = db.retrieve(same_for_you_query()).unwrap();
    // Items topically similar to item 500 should rank higher
    let topic_500 = db.read_item(EntityId(500)).unwrap().category;
    let topic_match_before = feed.iter().filter(|r| r.category == topic_500).count();
    let topic_match_after = feed2.iter().filter(|r| r.category == topic_500).count();
    assert!(topic_match_after >= topic_match_before);

    // Hide and block, verify exclusion
    db.signal("hide", EntityId(999), UserId(42), 1.0, now()).unwrap();
    db.signal("block", UserId(42), CreatorId(77), 1.0, now()).unwrap();
    let feed3 = db.retrieve(same_for_you_query()).unwrap();
    assert!(feed3.iter().all(|r| r.id != EntityId(999)));
    assert!(feed3.iter().all(|r| r.creator_id != CreatorId(77)));
}
```

### Done When

The full "For You" query works: `RETRIEVE items FOR USER @user_id USING PROFILE for_you FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50` returns personalized, diversity-constrained results that reflect the user's engagement history, exclude hidden items and blocked creators, include an exploration budget, handle cold-start users and items, and update in response to new signal events within 100ms. The `following`, `related`, and `notification` profiles also work correctly.

---

## Milestone 4: Hybrid Search -- "Text + semantic + signals in one query"

### Milestone Thesis

A developer can execute `SEARCH items QUERY "rust tutorial beginner" VECTOR query_vector FOR USER @user_id USING PROFILE search LIMIT 20` and get results that combine BM25 text relevance, semantic similarity, and user personalization in a single ranked list. This proves that search and retrieval are the same system.

### UAT Scenario

```
Given:
  A tidalDB instance with:
    - 10,000 items with text fields (title, description, tags) indexed for full-text search
    - All items have embeddings
    - 500 users with engagement history
    - Search profile defined: text relevance as floor, semantic similarity,
      personalization adjustment

When:
  1. SEARCH items QUERY "rust tutorial beginner" VECTOR [query_embedding]
     FOR USER @user_42 USING PROFILE search DIVERSITY max_per_creator:2 LIMIT 20
  2. SEARCH items QUERY "jazz piano" FOR USER @user_42
     USING PROFILE search FILTER duration:short, format:video LIMIT 20
  3. SEARCH items QUERY "\"exact phrase match\"" USING PROFILE search LIMIT 10
  4. SEARCH items QUERY "jazz -beginner" USING PROFILE search LIMIT 10
  5. SEARCH creators QUERY "jazz" LIMIT 10
  6. User clicks result #3, record SIGNAL search_click
  7. User searches same query again

Then:
  - Step 1: Results combine BM25 + semantic similarity via RRF;
    personalization re-ranks within relevant set; user_42 (a beginner)
    sees beginner content elevated
  - Step 2: Text-only search (no vector), filtered by duration and format
  - Step 3: Exact phrase match -- only items containing "exact phrase match"
  - Step 4: Boolean exclusion -- no items matching "beginner"
  - Step 5: Creator search by name/topic
  - Step 6: Signal recorded with query context and rank position
  - Step 7: Clicked result may rank higher due to search_click signal
  - Performance: SEARCH < 50ms at 10K items
```

### Phases

#### Phase 1: Tantivy Integration

**Delivers:** Tantivy embedded as a derived index for full-text search. DB-primary consistency pattern: entity store is source of truth, Tantivy is a materialized view updated via outbox. BM25 scoring exposed via custom Collector and Weight/Scorer seek pattern.

**Acceptance Criteria:**
- [ ] Tantivy index created from schema text field definitions (title, description, tags)
- [ ] Background indexer reads entity store outbox and feeds Tantivy writer
- [ ] Tantivy commit stores last-processed sequence number in payload for crash recovery
- [ ] Custom `AllScoresCollector` returns all matching doc IDs with BM25 scores
- [ ] `Weight::scorer` + `DocSet::seek` pattern scores specific candidate IDs (for re-ranking ANN results)
- [ ] External entity ID -> DocAddress mapping maintained and updated on segment merge
- [ ] Boolean queries supported: AND, OR, NOT, exact phrase, field-scoped
- [ ] Commit interval: every 1-5 seconds or every N thousand documents
- [ ] Index rebuild from entity store completes in < 10 minutes at 10K items
- [ ] BM25 query latency < 10ms at 10K documents (benchmarked)

**Depends On:** m1p3 (storage engine), m1p5 (entity API)
**Complexity:** L
**Research Reference:** `docs/research/tantivy.md` (Collector API, consistency pattern, seek scoring, commit model)
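
The outbox and crash-recovery criteria above can be illustrated without Tantivy itself. The sketch below uses a mock derived index (all types here are hypothetical stand-ins, not Tantivy's API) to show the idea: the last-processed sequence number is persisted together with each commit, so after a crash the indexer replays only outbox entries newer than the committed sequence.

```rust
// Hypothetical outbox entry written by the entity store.
struct OutboxEntry {
    seq: u64,
    doc: String,
}

// Mock derived index: `docs` and `committed_seq` survive a crash,
// `pending` and `pending_seq` do not.
struct DerivedIndex {
    docs: Vec<String>,
    committed_seq: u64,
    pending: Vec<String>,
    pending_seq: u64,
}

impl DerivedIndex {
    fn new() -> Self {
        DerivedIndex { docs: Vec::new(), committed_seq: 0, pending: Vec::new(), pending_seq: 0 }
    }

    // Buffer a document; nothing is durable until `commit`.
    fn add(&mut self, entry: &OutboxEntry) {
        self.pending.push(entry.doc.clone());
        self.pending_seq = entry.seq;
    }

    // Make buffered documents durable and record the sequence with them.
    fn commit(&mut self) {
        self.docs.append(&mut self.pending);
        self.committed_seq = self.pending_seq;
    }

    // Simulated crash: uncommitted work is lost.
    fn crash(&mut self) {
        self.pending.clear();
        self.pending_seq = self.committed_seq;
    }
}

// Replay every outbox entry newer than the last committed sequence.
fn catch_up(index: &mut DerivedIndex, outbox: &[OutboxEntry]) {
    let resume_from = index.committed_seq;
    for entry in outbox.iter().filter(|e| e.seq > resume_from) {
        index.add(entry);
    }
    index.commit();
}
```

Because the sequence number and the documents become durable in the same commit, replay neither skips nor duplicates entries, which is the property the DB-primary pattern relies on.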

#### Phase 2: Hybrid Fusion (RRF)

**Delivers:** Reciprocal Rank Fusion combining BM25 ranked lists with ANN ranked lists into a single scored result set. The starting point is RRF with k=60; the architecture supports upgrading to tuned linear combination when relevance labels exist.

**Acceptance Criteria:**
- [ ] `RRF(d) = 1/(60 + rank_bm25(d)) + 1/(60 + rank_ann(d))` implemented
- [ ] Documents appearing in only one list contribute only their single-list term
- [ ] RRF results are re-rankable by personalization (user preference overlay)
- [ ] When only text query is provided (no vector), pure BM25 ranking used
- [ ] When only vector is provided (no text), pure ANN ranking used
- [ ] Fusion adds < 1ms to query time (benchmarked)
- [ ] k parameter configurable (default 60)

**Depends On:** Phase 1 (BM25 scores), m2p1 (ANN scores)
**Complexity:** S
**Research Reference:** `docs/research/tantivy.md` (RRF section, Cormack et al.)
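
The fusion formula in the criteria above is small enough to sketch directly. This version assumes each input list is already sorted best-first, uses 1-based ranks, and breaks score ties by ascending ID; the function name and the tie-break are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over two ranked ID lists (best first).
/// A document missing from one list contributes only its single-list term.
fn rrf_fuse(bm25: &[u64], ann: &[u64], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for list in [bm25, ann] {
        for (i, id) in list.iter().enumerate() {
            let rank = i as f64 + 1.0; // 1-based rank
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by ascending ID for determinism.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then(a.0.cmp(&b.0)));
    fused
}
```

Passing an empty slice for either list degenerates to the single-list ordering, which matches the text-only and vector-only criteria.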

#### Phase 3: SEARCH Query Parser and Executor

**Delivers:** The SEARCH query parser and executor that orchestrates text retrieval, semantic retrieval, fusion, personalization, filtering, diversity, and result assembly.

**Acceptance Criteria:**
- [ ] Parser handles: `SEARCH items/creators`, `QUERY "text"`, `VECTOR [embedding]`, `FOR USER`, `USING PROFILE`, `FILTER`, `DIVERSITY`, `LIMIT`
- [ ] Query text parsing: exact phrase (`"..."`), boolean operators (AND/OR/NOT/-), field-scoped (`title:...`), wildcard (`term*`)
- [ ] Executor pipeline: text retrieval -> ANN retrieval -> fusion -> personalization -> filter -> diversity -> return
- [ ] When both QUERY and VECTOR provided, hybrid fusion (RRF)
- [ ] When only QUERY, BM25-only retrieval
- [ ] When only VECTOR, ANN-only retrieval
- [ ] Search results include: entity_id, combined_score, bm25_score, semantic_score, rank
- [ ] `search_click` signal writes include query context and rank position
- [ ] End-to-end SEARCH < 50ms at 10K items (benchmarked)

**Depends On:** Phase 1, Phase 2, m2p5 (query parser infrastructure)
**Complexity:** M
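
The three QUERY/VECTOR cases in the criteria above reduce to one dispatch at the start of the executor pipeline. A minimal sketch follows; `RetrievalMode` and `retrieval_mode` are hypothetical names, and rejecting a SEARCH with neither clause is an assumption, since the roadmap does not say how that case is handled.

```rust
#[derive(Debug, PartialEq)]
enum RetrievalMode {
    Hybrid,   // QUERY and VECTOR: BM25 + ANN, fused with RRF
    Bm25Only, // QUERY only
    AnnOnly,  // VECTOR only
}

/// Choose the retrieval path from the clauses present in a parsed SEARCH.
fn retrieval_mode(query: Option<&str>, vector: Option<&[f32]>) -> Result<RetrievalMode, String> {
    match (query, vector) {
        (Some(_), Some(_)) => Ok(RetrievalMode::Hybrid),
        (Some(_), None) => Ok(RetrievalMode::Bm25Only),
        (None, Some(_)) => Ok(RetrievalMode::AnnOnly),
        // Assumption: a SEARCH with neither clause is a parse-time error.
        (None, None) => Err("SEARCH requires QUERY, VECTOR, or both".to_string()),
    }
}
```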

#### Phase 4: Creator and People Search

**Delivers:** Search over creator entities by name, topic, and attributes. "Creators like X" via creator embedding similarity. Enables UC-10.

**Acceptance Criteria:**
- [ ] Creator entities indexed in Tantivy (name, handle, bio, topics)
- [ ] Creator embeddings searchable via ANN (aggregated from catalog)
- [ ] `SEARCH creators QUERY "jazz" LIMIT 10` returns creators matching topic
- [ ] `SEARCH creators SIMILAR TO @creator_id LIMIT 10` returns similar creators by embedding
- [ ] Creator filters: verified, min_followers, language, followed_by_user
- [ ] Creator sort modes: follower_count, engagement_rate, posting_frequency

**Depends On:** Phase 1, m3p1 (creator entities)
**Complexity:** M

### Deferred to Later Milestones

- **Autocomplete and search suggestions (UC-02.3)** -- deferred to M5; requires prefix indexes and trending query tracking
- **Saved searches and alerts (UC-02.4)** -- deferred to M5; requires persistent query storage and push notification
- **Visual search / image search (UC-11)** -- deferred to M5; requires multi-modal embedding support
- **"Did you mean" typo correction** -- deferred to M5; requires edit-distance computation on term dictionary
- **Tuned linear combination (replacing RRF)** -- deferred to M5; requires relevance labels for alpha tuning

### Done When

A developer can execute SEARCH queries that combine full-text BM25 relevance with semantic vector similarity and user personalization in a single ranked result set. Boolean queries, phrase matching, field-scoped search, and creator search all work. Results reflect engagement signals. End-to-end SEARCH latency < 50ms at 10K items.

---

## Milestone 5: Full Surface Coverage -- "Every use case works"

### Milestone Thesis

Every one of the 14 use cases works end-to-end. Every sort mode, every filter dimension, every discovery surface described in USE_CASES.md is operational. The query `RETRIEVE items FOR USER @user_id CONTEXT feed USING PROFILE for_you FILTER unseen, unblocked, format:video, duration:short DIVERSITY max_per_creator:2, format_mix:true LIMIT 50` is the complete, production-quality end-state query.

### UAT Scenario

```
Given:
  A tidalDB instance loaded with:
    - 100,000 items across 1,000 creators
    - 10,000 users with engagement histories
    - All 14 use case scenarios configured
    - All sort modes and filter dimensions exercised

When:
  All 14 use cases are executed as described in USE_CASES.md:
    UC-01: For You Feed with full diversity and exploration
    UC-02: Search with all filter dimensions, autocomplete, saved searches
    UC-03: Trending (global, category, social-graph scoped)
    UC-04: Following feed (chronological, algorithmic modes)
    UC-05: Related/Up Next with collaborative filtering
    UC-06: Browse with all sort modes, faceted filters, mood filters
    UC-07: Notification prioritization with frequency capping
    UC-08: Creator profile (Top, New, Hot, For You modes)
    UC-09: User library (history, saved, liked, collections, continue watching)
    UC-10: People search with "creators like X"
    UC-11: Visual/semantic search with image embeddings
    UC-12: Live content with real-time viewer count
    UC-13: Hidden gems with breakout detection
    UC-14: Controversial and Hot with dual-signal ranking

Then:
  Every query returns correct results per use case specification.
  All 25+ sort modes produce correctly ordered results.
  All filter dimensions compose correctly.
  Performance: < 50ms for all queries at 100K items.
```

### Phases

(Phases for M5 are provisional -- detailed decomposition happens after M4 ships, informed by what was learned.)

#### Phase 1: Complete Sort Mode Coverage

**Delivers:** All 25+ sort modes from Appendix B operational. Windowed top sorts (hour, today, week, month, year, all_time), shuffle, alphabetical, shortest/longest, live_viewer_count, date_saved, creator_engagement_rate.

**Depends On:** M4 complete
**Complexity:** L

#### Phase 2: Complete Filter Coverage

**Delivers:** All filter dimensions from Appendix A operational and composable. Geographic filters, accessibility filters, community signal filters, availability filters, engagement threshold filters.

**Depends On:** Phase 1
**Complexity:** L

#### Phase 3: Social Graph Queries and Collaborative Filtering

**Delivers:** Social graph traversal for trending-among-follows, collaborative filtering for related/up-next, "creators followed by people I follow." The graph query capabilities needed for UC-03 (social trending), UC-05 (collaborative filtering), UC-10 (social creator discovery).

**Depends On:** Phase 1
**Complexity:** L

#### Phase 4: User Library, Collections, and Continue Watching

**Delivers:** UC-09 complete: watch history, saved items, liked items, user-created collections, continue watching (resume position), download state. Collections as rankable entities.

**Depends On:** Phase 2
**Complexity:** M

#### Phase 5: Advanced Search Features

**Delivers:** Autocomplete, search suggestions, trending searches, saved searches, "did you mean" typo correction, related query suggestions. UC-02.3 and UC-02.4.

**Depends On:** Phase 1
**Complexity:** L

#### Phase 6: Live Content and Notification Systems

**Delivers:** UC-12 (live content with real-time viewer count, scheduled content, reminders) and UC-07 (notification prioritization with frequency capping, per-creator limits). Real-time signal types for viewer count and schedule awareness.

**Depends On:** Phase 1
**Complexity:** M

### Deferred to Later Milestones

- **Signal rollups (hourly/daily materialization)** -- built if 100K-item benchmarks show bucketed counters exceeding the latency budget for 30d+ windows
- **Multi-vector user interest clustering (PinnerSage)** -- deferred to M6 or beyond; single preference vector serves through M5
- **ACORN-1 two-hop expansion for very selective filters** -- deferred to M6; USearch predicate callback sufficient through M5

### Done When

All 14 use cases pass their UAT scenarios as defined in USE_CASES.md. All 25+ sort modes work. All filter dimensions compose. Every sequence diagram in SEQUENCE.md can be executed. Performance: < 50ms for all queries at 100K items.

---

## Milestone 6: Production Hardening -- "Ready for real workloads"

### Milestone Thesis

tidalDB can be embedded in a production application and operated with confidence. Crash recovery is correct and fast. Graceful degradation works under load. Operational visibility exists. Performance meets targets at 1M+ items. The database is trustworthy.

### UAT Scenario

```
Given:
  A tidalDB instance with:
    - 1,000,000 items, 100,000 users, 10,000 creators
    - Sustained write load: 10,000 signal events/second
    - Concurrent read load: 1,000 RETRIEVE queries/second

When:
  1. Run full workload for 1 hour
  2. Kill the process at a random point
  3. Restart and measure recovery time
  4. Verify no data loss and no inconsistency
  5. Run workload at 3x expected load
  6. Verify graceful degradation (reduced precision, not errors)

Then:
  - Step 1: All queries < 50ms p99, all signal writes < 100us amortized
  - Step 3: Recovery time < 30 seconds
  - Step 4: WAL replay produces state identical to pre-crash;
    no phantom items, no lost signals, no inconsistent aggregates
  - Step 5: Under overload, tidalDB reduces candidate set size, uses coarser
    aggregates, skips diversity -- but never returns errors for well-formed queries
  - Step 6: Degradation follows the documented order:
    1. Reduce candidate set (500 -> 200)
    2. Use coarser aggregates
    3. Skip diversity
    4. Return from materialized cache
```

### Phases

(Phases for M6 are provisional -- detailed decomposition happens after M5 ships.)

#### Phase 1: Crash Recovery Hardening

**Delivers:** Comprehensive crash recovery testing and hardening. Fault injection at every write-path stage. Recovery time targets. WAL compaction and checkpoint optimization.

**Depends On:** M5 complete
**Complexity:** XL

#### Phase 2: Graceful Degradation Under Load

**Delivers:** Automatic quality reduction under load pressure. Configurable degradation order. Backpressure on write path. Never errors for well-formed queries.

**Depends On:** Phase 1
**Complexity:** L
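
The degradation order listed in the UAT scenario can be modelled as an ordered ladder in which each level keeps every reduction from the levels before it. The sketch below is one possible shape, assuming hypothetical `Degradation` and `QueryPlan` types; how load pressure maps to a level is left out, and only the 500 -> 200 candidate figures come from the scenario.

```rust
// Levels are declared in the documented degradation order, so the derived
// ordering lets later levels imply all earlier reductions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Degradation {
    Normal,
    ReducedCandidates,
    CoarserAggregates,
    SkipDiversity,
    MaterializedCache,
}

#[derive(Debug, PartialEq)]
struct QueryPlan {
    candidate_limit: usize,
    coarse_aggregates: bool,
    apply_diversity: bool,
    serve_from_cache: bool,
}

/// Translate a degradation level into concrete query-plan settings.
fn plan_for(level: Degradation) -> QueryPlan {
    QueryPlan {
        candidate_limit: if level >= Degradation::ReducedCandidates { 200 } else { 500 },
        coarse_aggregates: level >= Degradation::CoarserAggregates,
        apply_diversity: level < Degradation::SkipDiversity,
        serve_from_cache: level >= Degradation::MaterializedCache,
    }
}
```

Every level still yields a valid plan, which reflects the requirement that well-formed queries lose precision under overload rather than fail.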

#### Phase 3: Performance at Scale

**Delivers:** Benchmarks and optimization at 1M items, 100K users. USearch performance tuning (M, ef_search, quantization). Tantivy segment management. Signal state memory optimization. Hot/warm/cold tiering for signal state if memory budget requires it.

**Depends On:** Phase 1
**Complexity:** XL

#### Phase 4: Operational Visibility

**Delivers:** Metrics, diagnostics, and observability. Query execution stats (candidates considered, filters applied, scoring time, diversity adjustments). Signal system health (WAL lag, checkpoint age, memory usage). Index health (segment count, tombstone ratio). Error reporting with context.

**Depends On:** Phase 1
**Complexity:** M

### Deferred (Post-M6 / Future)

- **Horizontal distribution** -- the single-node architecture scales vertically first; distribution is a separate product decision
- **Multi-tenancy** -- per-tenant isolation within a single tidalDB instance
- **Streaming query results** -- cursor-based streaming for very large result sets
- **A/B testing infrastructure** -- comparing two profile versions within the database
- **Signal rollup to external cold storage** -- S3/GCS archival for compliance
- **Client libraries** -- language-specific wrappers beyond Rust embedding

### Done When

tidalDB operates correctly at 1M items under sustained concurrent read/write load. Crash recovery completes in < 30 seconds with zero data loss. Graceful degradation works under 3x overload without returning errors. All performance targets met at p99. A developer can embed tidalDB in a production application and operate it with confidence.

---

## Use Case Coverage Progression

| UC | Description | M1 | M2 | M3 | M4 | M5 | M6 |
|----|-------------|----|----|----|----|----|----|
| UC-01 | For You Feed | - | - | **Full** | Full | Full | Full |
| UC-02 | Search | - | - | - | **Core** | **Full** | Full |
| UC-03 | Trending/Rising | Signals | **Full** | Full | Full | Full | Full |
| UC-04 | Following Feed | - | Partial | **Full** | Full | Full | Full |
| UC-05 | Related/Up Next | - | - | **Core** | Core | **Full** | Full |
| UC-06 | Browse/Category | Signals | **Core** | Core | Core | **Full** | Full |
| UC-07 | Notifications | - | - | **Core** | Core | **Full** | Full |
| UC-08 | Creator Profile | - | **Core** | Core | Core | **Full** | Full |
| UC-09 | User Library | - | - | Partial | Partial | **Full** | Full |
| UC-10 | People Search | - | - | - | **Core** | **Full** | Full |
| UC-11 | Visual/Semantic | - | - | - | Partial | **Full** | Full |
| UC-12 | Live Content | - | - | - | - | **Full** | Full |
| UC-13 | Hidden Gems | - | **Full** | Full | Full | Full | Full |
| UC-14 | Controversial/Hot | Signals | **Full** | Full | Full | Full | Full |

Legend:
- `-` = Not addressed
- `Signals` = Signal primitives exist but no query surface
- `Partial` = Some functionality, not all modes
- `Core` = Primary query path works, some modes/filters missing
- `Full` = All modes, filters, and feedback loops per USE_CASES.md specification
- **Bold** = The milestone in which the use case first reaches `Core` or `Full`

---

## Dependency DAG

```
m1p1 (Types/Schema) ✓
  |
  +---> m1p2 (WAL) ✓
  |       |
  +---> m1p3 (Storage/fjall) ✓ ---+
  |       |                        |
  |       +---> m1p4 (Signal Ledger)
  |               |
  |               +---> m1p5 (Entity + Signal API)  = M1 COMPLETE
  |               |
  |               +---> m2p3 (Ranking Profiles)
  |                       |
  +---> m2p1 (USearch) ---+
  |                        |
  +---> m2p2 (Filters) ---+---> m2p4 (Diversity)
                           |       |
                           +-------+---> m2p5 (RETRIEVE Query) = M2 COMPLETE
                           |
                           +---> m3p1 (Users/Creators/Relationships)
                           |       |
                           |       +---> m3p2 (Feedback Loop)
                           |       |       |
                           |       |       +---> m3p3 (Personalized Profiles)
                           |       |
                           |       +---> m3p4 (User State Filters)
                           |
                           |       m3p3 + m3p4 = M3 COMPLETE
                           |
                           +---> m4p1 (Tantivy)
                                   |
                                   +---> m4p2 (RRF Fusion)
                                   |       |
                                   |       +---> m4p3 (SEARCH Query)
                                   |
                                   +---> m4p4 (Creator Search)

                                   m4p3 + m4p4 = M4 COMPLETE

                                   M5 phases (provisional) depend on M4
                                   M6 phases (provisional) depend on M5
```

**Parallelization opportunities:**
- m1p2 (WAL) and m1p3 (Storage) were parallelizable after m1p1 (both are now complete; m1p3 finished first, then m1p2)
- m2p1 (USearch) and m2p2 (Filters) can be built in parallel after m1p3
- m3p1 (Entities) and m4p1 (Tantivy) can start in parallel with later M2 phases
- m3p4 (User State Filters) can be built in parallel with m3p3 (Profiles)
- m4p2 (RRF) and m4p4 (Creator Search) can be built in parallel

---

## Architectural Decisions Locked In

These decisions are made. They are not revisited unless benchmarks prove them wrong.

| Decision | Chosen | Alternative | Rationale |
|----------|--------|-------------|-----------|
| Storage engine | fjall (pure Rust) | RocksDB | Pure Rust, `#![forbid(unsafe_code)]`, fast compile, trait-abstracted for swap |
| Vector index | USearch (C++ FFI) | hnsw_rs | 10-100x QPS, predicate callbacks, mmap, f16 quantization |
| Text search | Tantivy (embedded) | Custom BM25 | 40K lines of battle-tested code; Collector/Scorer API provides exact hooks needed |
| Decay formula | Running S(t)=S(prev)*exp(-lambda*dt)+w | Raw event scan | O(1) vs O(N), proven exact, 20-60x faster at 50+ events/entity |
| Windowed aggregation | Bucketed counters (Scotty pattern) | SWAG two-stacks | Simpler, serves multiple window sizes from one set of buckets |
| Hybrid fusion | RRF (k=60) | Tuned linear combination | Zero-config, robust; linear combo is the upgrade path with relevance labels |
| Consistency model | DB-primary, Tantivy as derived index | Two-phase commit | Simpler, deterministic recovery, source of truth is always the entity store |
| WAL checksums | BLAKE3 | CRC32C | Content-addressing enables deduplication; BLAKE3 is fast enough |
| Key encoding | Subject-prefix `[entity_id][0x00][TAG:suffix]` | Separate key namespaces | Co-locates entity data, natural shard boundary, single prefix scan |
| Embedding format | f16 quantization (default) | float32 | Half memory, < 1% recall loss at 1536D |
| Query language | Custom (RETRIEVE/SEARCH/SIGNAL) | SQL | Domain semantics cannot be expressed in SQL without losing optimization opportunities |
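
The decay row above claims the running formula is exact. The sketch below checks this against a brute-force scan over the raw events. It assumes events arrive in timestamp order and uses `f64` seconds for time; `DecayScore` and `brute_force` are hypothetical names, not the signal ledger's API.

```rust
/// O(1) running decay score: S(t) = S(prev) * exp(-lambda * dt) + w
struct DecayScore {
    score: f64,
    last_ts: f64,
    lambda: f64,
}

impl DecayScore {
    fn new(lambda: f64) -> Self {
        DecayScore { score: 0.0, last_ts: 0.0, lambda }
    }

    /// Fold one event in; assumes `ts` is not earlier than the previous event.
    fn add(&mut self, ts: f64, weight: f64) {
        let dt = ts - self.last_ts;
        self.score = self.score * (-self.lambda * dt).exp() + weight;
        self.last_ts = ts;
    }

    /// Read the score as of `ts` without mutating state.
    fn value_at(&self, ts: f64) -> f64 {
        self.score * (-self.lambda * (ts - self.last_ts)).exp()
    }
}

/// O(N) reference: sum every event's weight decayed from its own timestamp.
fn brute_force(events: &[(f64, f64)], lambda: f64, now: f64) -> f64 {
    events.iter().map(|&(ts, w)| w * (-lambda * (now - ts)).exp()).sum()
}
```

The two agree because `exp(-lambda * a) * exp(-lambda * b) = exp(-lambda * (a + b))`, so decaying the accumulated score once per event is the same as decaying each event from its own timestamp.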

---

## What This Roadmap Does NOT Cover

These are explicitly out of scope for the foreseeable future:

1. **Embedding generation** -- tidalDB retrieves and ranks over vectors. It does not generate them. Bring your own model.
2. **Horizontal distribution** -- Single-node first. Scale vertically. Distribution is a separate product.
3. **ACID transactions across entities** -- Signal writes are atomic within an entity's state. Cross-entity transactions are not needed for the ranking problem.
4. **SQL compatibility** -- The custom query language exists because SQL cannot express ranking semantics. No SQL layer.
5. **Multi-tenancy** -- One tidalDB instance serves one application. Tenant isolation is the application's concern.
6. **Content moderation, authentication, payments, CDN** -- tidalDB solves one problem: ranking. Everything else is someone else's job.