tidaldb/docs/planning/milestone-2/phase-5/OVERVIEW.md

# Milestone 2, Phase 5: Query Parser and RETRIEVE Executor

## Phase Deliverable

The RETRIEVE query operation, delivered as four pieces: a typed AST (`Retrieve` struct); a builder API for ergonomic query construction (no text parser in M2 -- that is M5); a `Signal` write command struct; and the `RetrieveExecutor`, which orchestrates m2p1 through m2p4 into a complete pipeline: candidate generation (ANN retrieval, full scan, or signal-ranked selection), filter evaluation, signal scoring, diversity enforcement, and result assembly. The full M2 UAT scenario passes as a Rust integration test.

This is the capstone phase. Everything built in M2 converges here. The vector index, the filter engine, the ranking profile executor, and the diversity selector are wired together by a single orchestrator. After this phase, a developer can write items with embeddings, write signal events, and execute `db.retrieve(query)` to get ranked, filtered, diverse results -- in under 50ms for 10K items.
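
As a rough illustration of how the stages fit together, the sketch below runs a toy in-memory pipeline: candidate generation, filter and exclude, score ordering, per-creator diversity, and limit. Every type in it (`Candidate`, `SketchError`, `run_pipeline`) is a hypothetical stand-in rather than the m2p1-m2p4 API, scores are assumed to be precomputed, and all implemented strategies simply return the whole corpus.

```rust
// Illustrative sketch only: every type here is a simplified stand-in,
// not the real m2p1-m2p4 API.
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub id: u64,
    pub creator: u32,
    pub category: &'static str,
    pub score: f64, // assumed precomputed; the real executor scores from signal state
}

#[derive(Debug, PartialEq)]
pub enum CandidateStrategy {
    Ann,
    Scan,
    SignalRanked,
    Hybrid, // stands in for the strategies that are stubs until M3+
}

// Hypothetical error type: the plan does not say which error the stubs return.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    StrategyNotImplemented,
}

pub fn run_pipeline(
    strategy: &CandidateStrategy,
    corpus: &[Candidate],
    category: &str,
    exclude: &HashSet<u64>,
    max_per_creator: usize,
    limit: usize,
) -> Result<Vec<Candidate>, SketchError> {
    // Stage 1: candidate generation. The toy version returns the whole corpus
    // for every implemented strategy; a stubbed strategy is an error.
    let candidates: Vec<Candidate> = match strategy {
        CandidateStrategy::Ann | CandidateStrategy::Scan | CandidateStrategy::SignalRanked => {
            corpus.to_vec()
        }
        CandidateStrategy::Hybrid => return Err(SketchError::StrategyNotImplemented),
    };

    // Stage 2: filter evaluation, plus removal of excluded ids before scoring.
    let mut kept: Vec<Candidate> = candidates
        .into_iter()
        .filter(|c| c.category == category && !exclude.contains(&c.id))
        .collect();

    // Stage 3: order by score, descending, with the id as a deterministic tiebreaker.
    kept.sort_by(|a, b| b.score.total_cmp(&a.score).then(a.id.cmp(&b.id)));

    // Stages 4 and 5: enforce the per-creator cap and stop once `limit` is reached.
    let mut per_creator: HashMap<u32, usize> = HashMap::new();
    let mut results = Vec::new();
    for c in kept {
        if results.len() >= limit {
            break;
        }
        let seen = per_creator.entry(c.creator).or_insert(0);
        if *seen < max_per_creator {
            *seen += 1;
            results.push(c);
        }
    }
    Ok(results)
}
```

With a cap of two per creator, a creator's third-best item is skipped in favor of a lower-scoring item from another creator, which is the behavior the diversity stage is meant to provide.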

## Acceptance Criteria

- [ ] `Retrieve` struct: `entity_kind`, `profile` (name + optional version), `filters` (`Vec<FilterExpr>`), `diversity` (`DiversityConstraints`), `limit` (default 50, max 500), `exclude` (`Vec<EntityId>`), `cursor` (`Option<Cursor>`), `for_user` (`Option<UserId>`, always `None` in M2)
- [ ] `RetrieveBuilder` with ergonomic builder pattern: `Retrieve::builder().entity(EntityKind::Item).profile("trending").filter(FilterExpr::eq("category", "jazz")).diversity(DiversityConstraints::new().max_per_creator(2)).limit(25).build()`
- [ ] Validation: limit out of range returns error, unknown profile name returns error, incompatible filters for entity kind returns error
- [ ] `Results` struct: `items` (`Vec<RetrieveResult>`), `next_cursor` (`Option<Cursor>`), `total_scored` (how many candidates were scored), `constraints_satisfied` (from diversity result)
- [ ] `RetrieveResult` struct: `entity_id`, `score` (f64), `rank` (usize), `signal_snapshot` (`Vec<(String, f64)>`)
- [ ] `Signal` struct: write command wired to existing `TidalDb::signal()` path from M1
- [ ] `Cursor` struct: offset-based opaque cursor encoded as base64 string
- [ ] `QueryError` enum: `ProfileNotFound`, `InvalidFilter`, `IndexNotAvailable`, `StorageError`, `InvalidLimit`, `InvalidCursor`
- [ ] `RetrieveExecutor` pipeline: candidate retrieval (ANN, full scan, or signal-ranked, chosen by the profile's `CandidateStrategy`) -> filter -> score -> diversity -> limit -> return
- [ ] When the profile uses velocity/decay signals (e.g., `trending`, `hot`), the executor retrieves candidates via ANN over embeddings, then scores them with signal state
- [ ] When the profile is `new` or `alphabetical`, the executor skips ANN and uses the metadata index directly (full scan sorted by `created_at` or the relevant metadata field)
- [ ] When the profile's `CandidateStrategy` is `SignalRanked` (e.g., `most_viewed`, `most_liked`), the executor reads signal state from the ledger without ANN
- [ ] `EXCLUDE` list applied before scoring (candidates in exclude list are removed from candidate set)
- [ ] End-to-end RETRIEVE latency < 50ms at 10K items (Criterion benchmarked)
- [ ] Results include signal snapshot for debugging/transparency (top signals used in scoring per result)
- [ ] `TidalDb::retrieve()` method wires `RetrieveExecutor` to the public API
- [ ] Full M2 UAT scenario passes as an integration test (`tidal/tests/m2_uat.rs`)
- [ ] `cargo clippy -- -D warnings` passes
- [ ] No `unsafe` code in `query/` module
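
The builder and validation criteria above might look roughly like the following. This is a minimal sketch under stated assumptions, not the Task 01 design: filters, diversity, and cursor are omitted, the profile registry is replaced by a plain slice of names, the error payloads are invented for illustration, a separate `validate()` method is assumed (the plan does not say whether `build()` itself returns a `Result`), and a limit of 0 is treated as out of range even though the plan only states a default of 50 and a maximum of 500.

```rust
// Illustrative sketch only: simplified stand-ins for the Task 01 types.
pub const DEFAULT_LIMIT: usize = 50;
pub const MAX_LIMIT: usize = 500;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EntityKind {
    Item,
}

#[derive(Debug, PartialEq)]
pub enum QueryError {
    ProfileNotFound(String), // payloads are assumptions, not part of the plan
    InvalidLimit(usize),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Retrieve {
    pub entity_kind: EntityKind,
    pub profile: String,
    pub limit: usize,
    pub exclude: Vec<u64>,
}

pub struct RetrieveBuilder {
    query: Retrieve,
}

impl Retrieve {
    pub fn builder() -> RetrieveBuilder {
        RetrieveBuilder {
            query: Retrieve {
                entity_kind: EntityKind::Item,
                profile: String::new(),
                limit: DEFAULT_LIMIT,
                exclude: Vec::new(),
            },
        }
    }

    // Checks the limit range and the profile name. `known_profiles` is a
    // hypothetical stand-in for a lookup against the profile registry.
    pub fn validate(&self, known_profiles: &[&str]) -> Result<(), QueryError> {
        if self.limit == 0 || self.limit > MAX_LIMIT {
            return Err(QueryError::InvalidLimit(self.limit));
        }
        if !known_profiles.contains(&self.profile.as_str()) {
            return Err(QueryError::ProfileNotFound(self.profile.clone()));
        }
        Ok(())
    }
}

impl RetrieveBuilder {
    pub fn entity(mut self, kind: EntityKind) -> Self {
        self.query.entity_kind = kind;
        self
    }

    pub fn profile(mut self, name: &str) -> Self {
        self.query.profile = name.to_string();
        self
    }

    pub fn limit(mut self, limit: usize) -> Self {
        self.query.limit = limit;
        self
    }

    pub fn exclude(mut self, ids: Vec<u64>) -> Self {
        self.query.exclude = ids;
        self
    }

    pub fn build(self) -> Retrieve {
        self.query
    }
}
```

Keeping validation separate from `build()` lets the builder stay infallible while the registry-dependent checks run where the registry is available; folding both into a fallible `build()` would be an equally reasonable choice.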

## Dependencies

- **Requires:** m2p1 (`VectorIndex` trait, `UsearchIndex`, `AdaptiveQueryPlanner`, `EmbeddingSlotRegistry`), m2p2 (`BitmapIndex`, `RangeIndex`, `FilterExpr`, `FilterEvaluator`, `FilterResult`), m2p3 (`RankingProfile`, `ProfileRegistry`, `ProfileExecutor`, `ScoredCandidate`, `Sort`, `CandidateStrategy`), m2p4 (`DiversityConstraints`, `DiversitySelector`, `DiversityResult`), m1p4 (`SignalLedger`), m1p5 (`TidalDb` struct, `Config`, `write_item`, `signal`, `item_exists`, `open`, `shutdown`)
- **Blocks:** Milestone 3 (personalized ranking adds `FOR USER` clause and user context to the RETRIEVE pipeline)

## Research References

- [docs/research/ann_for_tidaldb.md](../../../research/ann_for_tidaldb.md) -- Adaptive query planner integration, ANN candidate retrieval strategy, recall@k vs latency tradeoffs
- [docs/research/tidaldb_signal_ledger.md](../../../research/tidaldb_signal_ledger.md) -- Signal read latencies (~15ns hot-tier, ~200ns windowed) establishing per-candidate scoring budget
- [thoughts.md](../../../../thoughts.md) -- Part V.14 (MMR post-scoring), Part V.9 (vector index as derived state)

## Spec References

- [docs/specs/08-query-engine.md](../../../specs/08-query-engine.md) -- THE authoritative spec:
  - Section 2 (RETRIEVE operation: candidate generation, filtering, scoring, diversity, pagination)
  - Section 3 (Query parsing: `Retrieve` struct, validation, resolution, `QueryError` enum)
  - Section 4 (Query planning: `CandidateStrategy`, plan construction, decision tree)
  - Section 5 (Execution pipeline: 6-stage architecture, candidate generation, filter evaluation, signal loading, scoring, diversity enforcement, pagination)
  - Section 7 (Filter evaluation: bitmap-based architecture, filter push-down, short-circuit)
  - Section 8 (Pagination: cursor structure, cursor semantics, cursor encoding)
  - Section 15 (Invariants: INV-QUERY-1 deterministic results, INV-QUERY-2 filter correctness, INV-QUERY-3 diversity constraints)
- [docs/specs/09-ranking-scoring.md](../../../specs/09-ranking-scoring.md) -- Section 3 (CandidateStrategy variants), Section 4 (scoring pipeline stages), Section 9 (diversity enforcement)

## Task Index

| # | Task | Delivers | Depends On | Complexity |
|---|------|----------|------------|------------|
| 01 | RETRIEVE AST + Parser | `Retrieve` struct, `RetrieveBuilder`, `ProfileRef`, `Cursor`, `Results`, `RetrieveResult`, `Signal` write struct, `QueryError`, validation | None | M |
| 02 | RETRIEVE Executor Pipeline | `RetrieveExecutor`, 5-stage pipeline (candidate -> filter -> score -> diversity -> assemble), `TidalDb::retrieve()`, Criterion benchmarks | Task 01 | L |
| 03 | M2 UAT Integration Test | Full M2 UAT scenario as `tidal/tests/m2_uat.rs`: 10K items, 10K signals, all 6 profile queries, signal burst rank change, crash recovery | Task 01, Task 02 | M |

## Task Dependency DAG

```
Task 01: RETRIEVE AST + Parser
    |
    v
Task 02: RETRIEVE Executor Pipeline
    |
    v
Task 03: M2 UAT Integration Test
```

Linear dependency chain. Task 01 defines the types that Task 02 consumes. Task 03 exercises the complete system including Task 02's executor wired through `TidalDb::retrieve()`.

## File Layout

```
tidal/src/
  query/
    mod.rs       -- pub mod retrieve; pub mod executor; re-exports of Retrieve, Results,
                    RetrieveResult, RetrieveExecutor, QueryError, Cursor, Signal
    retrieve.rs  -- Retrieve struct, RetrieveBuilder, ProfileRef, Cursor, Results,
                    RetrieveResult, Signal struct, validation (Task 01)
    executor.rs  -- RetrieveExecutor, 5-stage pipeline, TidalDb::retrieve() wiring (Task 02)
  lib.rs         -- add `pub mod query;` and TidalDb::retrieve() method (Task 02)
tidal/tests/
  m2_uat.rs      -- Full M2 UAT integration test (Task 03)
tidal/benches/
  query.rs       -- Criterion benchmarks for end-to-end RETRIEVE (Task 02)
tidal/Cargo.toml -- add `base64` dependency for cursor encoding; add `[[bench]] name = "query"
                    harness = false`
```

## Open Questions

1. **Embedding dimension for M2 integration tests**: Use 64-dim vectors in tests by setting `dimensions: 64` in the test schema; 1536-dim vectors make the M2 UAT test slow (~2x indexing time). The `VectorIndex` trait abstraction handles any dimension, so production profiles can use any dimension it supports, and the USearch backend's f16 quantization works identically at 64d.

2. **`TidalDb::retrieve()` method**: The m1p5 `TidalDb` struct needs a `retrieve(&self, query: Retrieve) -> Result<Results, QueryError>` method. For M2, `TidalDb` must hold references to the vector index registry, filter evaluator, profile registry, and signal ledger. These are initialized at `TidalDb::open()` time. The `retrieve()` method constructs a `RetrieveExecutor` from these references and delegates to it.

3. **Filter + ANN interaction for M2**: In M2, ANN and filters are applied sequentially. The pipeline calls the adaptive query planner from m2p1, which already selects the ANN strategy based on filter selectivity, and then applies metadata filters to the ANN results. Pre-filtering through USearch predicate callbacks is available via `filtered_search()`, but the sequential approach is the simpler correct baseline. Document this as a known performance limitation to be refined in M3+.

4. **No `FOR USER` clause in M2**: The RETRIEVE query in M2 does not support user context. `CandidateStrategy::Ann` uses a query vector from the item embedding space (e.g., a representative category embedding), not a user preference vector. User preference vectors come in M3 when user entities are introduced. The `Retrieve` struct has a `for_user: Option<UserId>` field that is always `None` in M2.

5. **Cursor-based pagination**: For M2, implement a simple offset-based cursor that encodes the current offset as an opaque base64 string. True keyset-based pagination (score + entity_id tiebreaker, as described in Spec 08 Section 8.2) is an M5+ concern. The offset cursor is sufficient for the M2 UAT, which does not paginate.

6. **`CandidateStrategy` routing**: The `RetrieveExecutor` reads the profile's `CandidateStrategy` to decide how to generate candidates. For M2, three strategies are implemented: `Ann` (ANN search over embeddings), `Scan` (full entity scan sorted by metadata field), and `SignalRanked` (top-K by signal value). `Relationship`, `Hybrid`, and `CohortTrending` strategies are type stubs that produce errors if invoked -- they require M3+ infrastructure (user entities, text index, cohorts).
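
The offset-based cursor from question 5 could be sketched as below. The plan calls for the `base64` crate; this sketch hand-rolls standard base64 only so that it stays dependency-free, and the 8-byte big-endian payload and the `encode`/`decode` method names are assumptions rather than the format in Spec 08.

```rust
// Illustrative sketch only. The real implementation is expected to use the
// `base64` crate; the codec is hand-rolled here to avoid external dependencies.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

#[derive(Debug, PartialEq)]
pub enum QueryError {
    InvalidCursor,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cursor {
    pub offset: u64,
}

fn b64_encode(input: &[u8]) -> String {
    let mut out = String::new();
    for chunk in input.chunks(3) {
        // Pack up to three bytes into 24 bits, zero-filling a short final chunk.
        let b1 = *chunk.get(1).unwrap_or(&0);
        let b2 = *chunk.get(2).unwrap_or(&0);
        let n = (u32::from(chunk[0]) << 16) | (u32::from(b1) << 8) | u32::from(b2);
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

fn b64_decode(text: &str) -> Option<Vec<u8>> {
    let bytes = text.as_bytes();
    if bytes.is_empty() || bytes.len() % 4 != 0 {
        return None;
    }
    let last = bytes.len() / 4 - 1;
    let mut out = Vec::new();
    for (i, quad) in bytes.chunks(4).enumerate() {
        // Padding is only legal at the end of the final group of four characters.
        let pad = quad.iter().rev().take_while(|&&c| c == b'=').count();
        if pad > 2 || (pad > 0 && i != last) {
            return None;
        }
        let mut n: u32 = 0;
        for &c in &quad[..4 - pad] {
            let v = ALPHABET.iter().position(|&a| a == c)?;
            n = (n << 6) | v as u32;
        }
        n <<= 6 * pad as u32;
        let triple = [(n >> 16) as u8, (n >> 8) as u8, n as u8];
        out.extend_from_slice(&triple[..3 - pad]);
    }
    Some(out)
}

impl Cursor {
    // Encodes the offset as 8 big-endian bytes, then base64 (assumed layout).
    pub fn encode(&self) -> String {
        b64_encode(&self.offset.to_be_bytes())
    }

    // Rejects anything that is not valid base64 for exactly 8 bytes.
    pub fn decode(token: &str) -> Result<Cursor, QueryError> {
        let bytes = b64_decode(token).ok_or(QueryError::InvalidCursor)?;
        if bytes.len() != 8 {
            return Err(QueryError::InvalidCursor);
        }
        let mut raw = [0u8; 8];
        raw.copy_from_slice(&bytes);
        Ok(Cursor { offset: u64::from_be_bytes(raw) })
    }
}
```

Under these assumptions, `Cursor { offset: 50 }.encode()` yields `"AAAAAAAAADI="`, and malformed or wrong-length tokens map to `QueryError::InvalidCursor`. A fixed-width payload keeps every token the same length, which helps the cursor stay opaque to callers.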