# Task 02: Retrieval Mode Router

## Delivers

`RetrievalMode` enum and `route_results()` function. `RetrievalMode::determine()` selects text-only, vector-only, or hybrid retrieval based on what is present in the query. `route_results()` routes pre-retrieved result lists through the appropriate path: a direct single-list conversion for the single-mode cases, `HybridFusion::fuse()` for hybrid. A Criterion benchmark confirms that fusion adds < 1ms at 1000 candidates per list.

## Complexity: S

## Dependencies

- Task 01 COMPLETE: `HybridFusion` with `fuse()` method in `tidal/src/query/fusion.rs`
- m5p1 COMPLETE: `EntityId` type
- m2p1 COMPLETE: `VectorSearchResult { id: VectorId, distance: f32 }` in `tidal/src/storage/vector/`

## Technical Design

### RetrievalMode

```rust
// tidal/src/query/fusion.rs (additions)

/// Which retrieval system(s) to use for a search query.
///
/// Determined by what the query provides:
/// - `TextOnly`: only `query_text` is present
/// - `VectorOnly`: only `query_vector` is present
/// - `Hybrid`: both `query_text` and `query_vector` are present
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RetrievalMode {
    /// Execute BM25 text search only.
    TextOnly,
    /// Execute ANN vector search only.
    VectorOnly,
    /// Execute both and fuse results via RRF.
    Hybrid,
}

impl RetrievalMode {
    /// Determine the retrieval mode from query contents.
    ///
    /// Returns `None` if neither text nor vector is provided (invalid query).
    #[must_use]
    pub fn determine(has_text: bool, has_vector: bool) -> Option<Self> {
        match (has_text, has_vector) {
            (true, false) => Some(Self::TextOnly),
            (false, true) => Some(Self::VectorOnly),
            (true, true) => Some(Self::Hybrid),
            (false, false) => None,
        }
    }
}
```

### route_results()

```rust
/// Route pre-retrieved result lists through the appropriate fusion path.
///
/// - `TextOnly`: widens BM25 scores to `f64` and preserves the caller's descending order.
/// - `VectorOnly`: replaces each ANN distance with a rank-based score, preserving order.
/// - `Hybrid`: calls `HybridFusion::fuse()` and returns the fused result.
///
/// # Inputs
///
/// - `bm25_results`: `(EntityId, f32)` where f32 is BM25 score, **pre-sorted descending**.
/// - `ann_results`: `(EntityId, f32)` where f32 is L2-squared distance, **pre-sorted ascending**.
/// - Both slices may be empty; callers pass `&[]` for unused modes.
///
/// # Returns
///
/// `Vec<(EntityId, f64)>` sorted descending by score. For `TextOnly`, scores are the
/// raw BM25 scores widened to `f64`. For `VectorOnly`, scores are `1.0 / (k + rank)`
/// with 1-based rank. For `Hybrid`, scores are raw RRF values (typically 0.01–0.04 for k=60).
pub fn route_results(
    mode: RetrievalMode,
    bm25_results: &[(EntityId, f32)],
    ann_results: &[(EntityId, f32)],
    fusion: &HybridFusion,
) -> Vec<(EntityId, f64)> {
    match mode {
        RetrievalMode::TextOnly => {
            // Convert f32 BM25 scores to f64; already sorted descending by caller.
            bm25_results
                .iter()
                .map(|(id, score)| (*id, f64::from(*score)))
                .collect()
        }
        RetrievalMode::VectorOnly => {
            // Convert rank position to a score using the same RRF formula for
            // consistency: score = 1.0 / (k + rank). This puts ANN-only results
            // on the same scale as a single list's contribution in hybrid mode.
            let k = f64::from(fusion.k);
            ann_results
                .iter()
                .enumerate()
                .map(|(i, (id, _distance))| {
                    let rank = (i + 1) as f64; // 1-based rank
                    (*id, 1.0 / (k + rank))
                })
                .collect()
        }
        RetrievalMode::Hybrid => fusion.fuse(bm25_results, ann_results),
    }
}
```

### ann_to_ranked()

A helper to convert `Vec<VectorSearchResult>` (returned by `VectorIndex::search()`) to `Vec<(EntityId, f32)>` suitable as input to `fuse()` or `route_results()`:

```rust
use crate::storage::vector::VectorSearchResult;

/// Convert ANN search results to a ranked list for fusion input.
///
/// The `VectorSearchResult` slice is already sorted ascending by distance (best first).
/// This function maps it to `(EntityId, f32)` where the f32 is the distance exactly as
/// reported by the index. The caller passes the result to `fuse()` or `route_results()`,
/// which use position as rank and ignore the distance value.
#[must_use]
pub fn ann_to_ranked(ann_results: &[VectorSearchResult]) -> Vec<(EntityId, f32)> {
    ann_results
        .iter()
        .map(|r| (EntityId::new(r.id), r.distance))
        .collect()
}
```

### Module Integration

Add to `tidal/src/query/mod.rs`:

```rust
pub use fusion::{HybridFusion, RetrievalMode, ann_to_ranked, route_results};
```

### Criterion Benchmark

```rust
// tidal/benches/fusion.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
// Also import `EntityId` and `HybridFusion` from the `tidal` crate; the exact
// paths depend on how the crate exposes them.

fn bench_rrf_1k_per_list(c: &mut Criterion) {
    // 1000 BM25 results, scores descending
    let bm25: Vec<(EntityId, f32)> = (0u64..1000)
        .map(|i| (EntityId::new(i), (1000 - i) as f32))
        .collect();

    // 1000 ANN results, distances ascending, 50% overlap with BM25
    let ann: Vec<(EntityId, f32)> = (500u64..1500)
        .enumerate()
        .map(|(i, id)| (EntityId::new(id), i as f32 * 0.001))
        .collect();

    let fusion = HybridFusion::new();

    c.bench_function("rrf_fuse_1k_per_list", |b| {
        b.iter(|| {
            let results = fusion.fuse(black_box(&bm25), black_box(&ann));
            black_box(results)
        });
    });
}

criterion_group!(benches, bench_rrf_1k_per_list);
criterion_main!(benches);
```

## Acceptance Criteria

- [ ] `RetrievalMode` enum with `TextOnly`, `VectorOnly`, `Hybrid` variants in `fusion.rs`
- [ ] `RetrievalMode::determine(has_text, has_vector) -> Option<Self>` returns correct variant
- [ ] `determine(false, false)` returns `None`
- [ ] `route_results(mode, bm25, ann, fusion) -> Vec<(EntityId, f64)>` implemented
- [ ] `TextOnly` path: BM25 scores converted to f64, list order preserved
- [ ] `VectorOnly` path: ANN results converted to rank-based scores via `1.0 / (k + rank)`
- [ ] `Hybrid` path: calls `HybridFusion::fuse()` and returns result
- [ ] `ann_to_ranked(ann_results: &[VectorSearchResult]) -> Vec<(EntityId, f32)>` helper
- [ ] `RetrievalMode`, `route_results`, `ann_to_ranked` exported from `tidal/src/query/mod.rs`
- [ ] `tidal/benches/fusion.rs` created with Criterion benchmark `rrf_fuse_1k_per_list`
- [ ] Benchmark result confirms fusion < 1ms for 1000 candidates per list
- [ ] `[[bench]]` entry with `name = "fusion"` and `harness = false` added to `tidal/Cargo.toml`
- [ ] Unit tests: `determine_text_only`, `determine_vector_only`, `determine_hybrid`, `determine_none`, `route_text_only_passthrough`, `route_vector_only_rank_based`, `route_hybrid_calls_fuse`, `ann_to_ranked_converts_correctly`
- [ ] `cargo check`, `cargo fmt`, `cargo clippy -- -D warnings` all pass

## Test Strategy

```rust
#[test]
fn determine_text_only() {
    assert_eq!(RetrievalMode::determine(true, false), Some(RetrievalMode::TextOnly));
}

#[test]
fn determine_vector_only() {
    assert_eq!(RetrievalMode::determine(false, true), Some(RetrievalMode::VectorOnly));
}

#[test]
fn determine_hybrid() {
    assert_eq!(RetrievalMode::determine(true, true), Some(RetrievalMode::Hybrid));
}

#[test]
fn determine_none() {
    assert_eq!(RetrievalMode::determine(false, false), None);
}

#[test]
fn route_text_only_passthrough() {
    let bm25 = vec![(EntityId::new(1), 1.0f32), (EntityId::new(2), 0.5f32)];
    let fusion = HybridFusion::new();
    let results = route_results(RetrievalMode::TextOnly, &bm25, &[], &fusion);
    assert_eq!(results.len(), 2);
    assert!((results[0].1 - 1.0f64).abs() < 1e-6); // f32 → f64 exact
    assert!((results[1].1 - 0.5f64).abs() < 1e-6);
}

#[test]
fn route_vector_only_rank_based() {
    // Input order defines rank: rank 1 (index 0) gets score 1/(60+1) with the default k = 60.
    let ann = vec![
        (EntityId::new(1), 0.1f32), // rank 1
        (EntityId::new(2), 0.2f32), // rank 2
    ];
    let fusion = HybridFusion::new();
    let results = route_results(RetrievalMode::VectorOnly, &[], &ann, &fusion);
    assert_eq!(results.len(), 2);
    let expected_rank1 = 1.0 / (60.0 + 1.0);
    let expected_rank2 = 1.0 / (60.0 + 2.0);
    assert!((results[0].1 - expected_rank1).abs() < 1e-9);
    assert!((results[1].1 - expected_rank2).abs() < 1e-9);
}

#[test]
fn ann_to_ranked_converts_correctly() {
    use crate::storage::vector::VectorSearchResult;

    let ann_results = vec![
        VectorSearchResult { id: 42, distance: 0.1 },
        VectorSearchResult { id: 99, distance: 0.3 },
    ];
    let ranked = ann_to_ranked(&ann_results);
    assert_eq!(ranked.len(), 2);
    assert_eq!(ranked[0].0.as_u64(), 42);
    assert!((ranked[0].1 - 0.1f32).abs() < 1e-6);
    assert_eq!(ranked[1].0.as_u64(), 99);
}
```
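
The tests above call `determine()` with literal booleans. At a call site, those flags would come from the query itself. The following is a minimal sketch of that wiring, not part of this task's deliverables: `SearchQuery` and `mode_for` are hypothetical names introduced for illustration, and only the field names `query_text` and `query_vector` come from the `RetrievalMode` doc comment. The enum is copied here so the block compiles on its own.

```rust
/// Copy of the enum from this task so the sketch is self-contained.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RetrievalMode {
    TextOnly,
    VectorOnly,
    Hybrid,
}

impl RetrievalMode {
    /// Same decision table as in the Technical Design section.
    #[must_use]
    pub fn determine(has_text: bool, has_vector: bool) -> Option<Self> {
        match (has_text, has_vector) {
            (true, false) => Some(Self::TextOnly),
            (false, true) => Some(Self::VectorOnly),
            (true, true) => Some(Self::Hybrid),
            (false, false) => None,
        }
    }
}

/// Hypothetical query type (an assumption for this sketch); the real query
/// struct is defined elsewhere and may look different.
#[derive(Debug, Default)]
pub struct SearchQuery {
    pub query_text: Option<String>,
    pub query_vector: Option<Vec<f32>>,
}

/// Hypothetical helper: derive the mode from which optional fields are set.
/// Returns `None` when the query carries neither text nor a vector.
#[must_use]
pub fn mode_for(query: &SearchQuery) -> Option<RetrievalMode> {
    RetrievalMode::determine(query.query_text.is_some(), query.query_vector.is_some())
}
```

A caller would typically turn the `None` case into a validation error before any retrieval runs, then pass the selected mode to `route_results()`. Whether an empty string or an empty vector should count as "present" is a policy choice this sketch leaves open.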