# Task 02: USearch Parameter Tuning

## Delivers

A systematic benchmark of USearch HNSW parameters (M x ef_construction) at 1M vectors, documenting the recall/latency tradeoff. The optimal configuration is applied to `VectorIndexConfig::default()`. ANN recall@10 must exceed 0.95.

## Complexity

M

## Dependencies

- task-01 complete (scale benchmark infrastructure)
- `docs/research/ann_for_tidaldb.md` (parameter guidance)

## Technical Design

### 1. Parameter matrix

The research doc identifies M and ef_construction as the two critical HNSW parameters for the recall/latency tradeoff. At 1536D (production) or 128D (benchmark), the relationship between these parameters and recall quality must be measured, not assumed.

| Parameter | Values | Rationale |
|-----------|--------|-----------|
| M (connectivity) | 8, 16, 32 | M=16 is the USearch default; M=8 saves ~50% graph memory; M=32 improves recall under selective filters at 2x graph memory |
| ef_construction | 100, 200, 400 | Controls index build quality; diminishing returns past 200 in most benchmarks |
| ef_search | 200 (fixed) | Query-time expansion factor; held constant to isolate build-quality effects |

This produces a 3x3 = 9 configuration matrix.

### 2. Benchmark implementation

Add to `tidal/benches/vector.rs` or a new `tidal/benches/usearch_tuning.rs`:

```rust
#![allow(clippy::unwrap_used, clippy::cast_precision_loss)]

use criterion::{Criterion, black_box, criterion_group, criterion_main, BenchmarkId};
use rand::Rng;
use std::time::Duration;
use tidaldb::storage::vector::{
    AdaptiveQueryPlanner, BruteForceIndex, DistanceMetric, QuantizationLevel, VectorId,
    VectorIndex, VectorIndexConfig,
};

const DIM: usize = 128;
const N: u64 = 1_000_000;
const K: usize = 10;
const NUM_QUERIES: usize = 100;

/// Sample a random vector and normalize it to unit length.
fn random_unit_vector(dim: usize, rng: &mut impl Rng) -> Vec<f32> {
    let v: Vec<f32> = (0..dim).map(|_| rng.random::<f32>() - 0.5).collect();
    let norm: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm < f32::EPSILON {
        // Degenerate sample: fall back to the first basis vector.
        let mut fallback = vec![0.0f32; dim];
        fallback[0] = 1.0;
        return fallback;
    }
    v.iter().map(|x| x / norm).collect()
}

struct TuningResult {
    m: usize,
    ef_construction: usize,
    recall_at_10: f64,
    mean_latency_us: f64,
    build_time_secs: f64,
    memory_bytes: usize,
}

/// Build an index with specific parameters, compute ground truth recall,
/// and measure search latency.
fn evaluate_config(m: usize, ef_construction: usize) -> TuningResult {
    let config = VectorIndexConfig {
        dimensions: DIM,
        metric: DistanceMetric::L2,
        quantization: QuantizationLevel::F32,
        connectivity: m,
        ef_construction,
        ef_search: 200,
    };

    let mut rng = rand::rng();

    // Build ground truth with brute force
    let brute = BruteForceIndex::new(config.clone());
    let build_start = std::time::Instant::now();

    // (In practice, use the real USearch-backed index here, not BruteForceIndex)
    for id in 0..N {
        let vec = random_unit_vector(DIM, &mut rng);
        brute.insert(id, &vec).unwrap();
    }
    let build_time = build_start.elapsed();

    // Generate query vectors
    let queries: Vec<Vec<f32>> = (0..NUM_QUERIES)
        .map(|_| random_unit_vector(DIM, &mut rng))
        .collect();

    // Compute ground truth (brute force top-K for each query)
    let ground_truths: Vec<Vec<VectorId>> = queries
        .iter()
        .map(|q| {
            brute
                .search(q, K, K * 2)
                .unwrap()
                .iter()
                .map(|r| r.id)
                .collect()
        })
        .collect();

    // Search and measure recall + latency.
    // NOTE: as written, the planner queries the same brute-force index that
    // produced the ground truth, so recall is trivially 1.0 until the
    // USearch-backed index is substituted here.
    let planner = AdaptiveQueryPlanner::with_defaults();
    let mut total_recall = 0.0;
    let mut total_latency = Duration::ZERO;

    for (query, gt) in queries.iter().zip(ground_truths.iter()) {
        let start = std::time::Instant::now();
        let results = planner
            .execute(&brute, query, K, None, 1.0, None)
            .unwrap();
        total_latency += start.elapsed();

        let result_ids: Vec<VectorId> = results.iter().map(|r| r.id).collect();
        let hits = result_ids.iter().filter(|id| gt.contains(id)).count();
        total_recall += hits as f64 / gt.len() as f64;
    }

    TuningResult {
        m,
        ef_construction,
        recall_at_10: total_recall / NUM_QUERIES as f64,
        mean_latency_us: total_latency.as_micros() as f64 / NUM_QUERIES as f64,
        build_time_secs: build_time.as_secs_f64(),
        memory_bytes: 0, // Measured via index-specific API if available
    }
}
```

### 3. Criterion benchmark for the optimal config

After determining the optimal (M, ef_construction) from the evaluation, add a Criterion benchmark that measures search latency at the chosen parameters. The sketch below sweeps a shortlist of candidate configurations:

```rust
fn bench_usearch_optimal_1m(c: &mut Criterion) {
    let mut group = c.benchmark_group("usearch_1m");
    group.sample_size(10);
    group.measurement_time(Duration::from_secs(30));

    // Candidate-optimal configs as (M, ef_construction)
    let configs = [
        (8, 100),
        (8, 200),
        (16, 100),
        (16, 200),
        (16, 400),
        (32, 200),
        (32, 400),
    ];

    let mut rng = rand::rng();
    let query = random_unit_vector(DIM, &mut rng);

    for &(m, ef_c) in &configs {
        let config = VectorIndexConfig {
            dimensions: DIM,
            metric: DistanceMetric::L2,
            quantization: QuantizationLevel::F32,
            connectivity: m,
            ef_construction: ef_c,
            ef_search: 200,
        };
        let index = BruteForceIndex::new(config);

        // Pre-populate (in real implementation, use the HNSW-backed index).
        // NOTE: 10_000 keeps this sketch fast; populate with N (1M) vectors
        // for the real run so the `usearch_1m` group name is accurate.
        for id in 0..10_000u64 {
            let vec = random_unit_vector(DIM, &mut rng);
            index.insert(id, &vec).unwrap();
        }

        group.bench_with_input(
            BenchmarkId::new("search", format!("M{m}_ef{ef_c}")),
            &(m, ef_c),
            |b, _| {
                b.iter(|| {
                    index.search(black_box(&query), black_box(K), black_box(200)).unwrap()
                });
            },
        );
    }

    group.finish();
}

// Needed only if this lives in its own bench file (e.g. usearch_tuning.rs).
criterion_group!(benches, bench_usearch_optimal_1m);
criterion_main!(benches);
```

### 4. Apply optimal config

Once the optimal (M, ef_construction) is determined, update the `VectorIndexConfig` defaults:

```rust
// In tidal/src/storage/vector/mod.rs or config.rs
impl Default for VectorIndexConfig {
    fn default() -> Self {
        Self {
            dimensions: 128,
            metric: DistanceMetric::L2,
            quantization: QuantizationLevel::F16, // research doc recommends f16 default
            connectivity: OPTIMAL_M,              // determined by benchmark
            ef_construction: OPTIMAL_EF_C,        // determined by benchmark
            ef_search: 200,
        }
    }
}
```

### 5. Recall measurement methodology

Recall@K is computed as the fraction of brute-force top-K results that appear in the HNSW search results:

```
recall@K = |HNSW_top_K intersect BruteForce_top_K| / K
```

The value is averaged over 100 random queries. The threshold is recall@10 > 0.95.

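
The recall formula above can be sketched as a small standalone helper. This is a minimal illustration, not the project's implementation: the function names `recall_at_k` and `mean_recall_at_k` are hypothetical, and ids are assumed to be plain `u64` values standing in for `VectorId`.

```rust
use std::collections::HashSet;

/// Recall@K for a single query: the fraction of the brute-force top-K ids
/// that also appear in the ANN top-K ids.
/// ASSUMPTION: ids are plain u64 values standing in for `VectorId`.
fn recall_at_k(ann_ids: &[u64], truth_ids: &[u64], k: usize) -> f64 {
    if k == 0 {
        return 0.0;
    }
    // Only the first K entries of each list participate.
    let ann: HashSet<u64> = ann_ids.iter().take(k).copied().collect();
    let truth: HashSet<u64> = truth_ids.iter().take(k).copied().collect();
    // Set intersection ignores result ordering and duplicate ids.
    let hits = ann.intersection(&truth).count();
    hits as f64 / k as f64
}

/// Mean recall@K across a batch of queries (one id list per query).
fn mean_recall_at_k(ann: &[Vec<u64>], truth: &[Vec<u64>], k: usize) -> f64 {
    assert_eq!(ann.len(), truth.len(), "one ground-truth list per query");
    if ann.is_empty() {
        return 0.0;
    }
    let total: f64 = ann
        .iter()
        .zip(truth)
        .map(|(a, t)| recall_at_k(a, t, k))
        .sum();
    total / ann.len() as f64
}
```

Dividing by K rather than by the length of the ground-truth list matches the formula as stated; the two agree whenever the brute-force search returns a full K results.
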
### 6. Memory estimation

Per the research doc, HNSW graph overhead is ~300 bytes per node; the table below applies that figure at M=16 and assumes it scales roughly linearly with M. At 1M vectors with 128D float32 (1,000,000 x 128 x 4 bytes of vector data):

| M | Vector data | Graph overhead | Total |
|---|-------------|----------------|-------|
| 8 | 488 MB | ~150 MB | ~638 MB |
| 16 | 488 MB | ~300 MB | ~788 MB |
| 32 | 488 MB | ~600 MB | ~1.1 GB |

At 1536D (production), multiply the vector data by 12x (1536 / 128). The graph overhead stays the same because it depends on M and the node count, not on dimensionality.

## Acceptance Criteria

- [ ] All 9 (M, ef_construction) configurations benchmarked at 1M vectors (or a subset, to bound CI time)
- [ ] Recall@10 > 0.95 for the selected optimal configuration
- [ ] Search latency for 100 queries recorded: mean and p99
- [ ] Build time per configuration recorded
- [ ] Optimal (M, ef_construction) applied to the `VectorIndexConfig` default
- [ ] Results documented in `docs/profiling/usearch-tuning.md` with a recall/latency tradeoff table
- [ ] If no configuration exceeds recall@10 of 0.95, document the finding and propose mitigation (increase ef_search, ACORN-1, etc.)

## Test Strategy

1. **Recall validation:** For the chosen config, run 100 queries and verify recall@10 > 0.95 against brute-force ground truth. This is a correctness test, not just a benchmark.
2. **Regression guard:** After applying the optimal config, re-run the existing `tidal/benches/vector.rs` benchmarks to ensure no regression at 10K scale.
3. **Config round-trip:** If `VectorIndexConfig` is persisted, verify that the new default config serializes and deserializes correctly.
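
The acceptance criteria ask for both mean and p99 search latency, while the benchmark sketch in section 2 accumulates only a running total. One possible way to derive both figures from per-query samples is shown below. It is a sketch under stated assumptions: `LatencySummary` and `summarize_latencies` are hypothetical names, and the nearest-rank percentile definition is a choice made here, not one specified by the task.

```rust
use std::time::Duration;

/// Mean and p99 latency in microseconds.
/// ASSUMPTION: hypothetical type, not part of the existing benchmark code.
struct LatencySummary {
    mean_us: f64,
    p99_us: f64,
}

/// Summarize per-query latencies using the nearest-rank percentile method.
/// Returns `None` when no samples were recorded.
fn summarize_latencies(samples: &[Duration]) -> Option<LatencySummary> {
    if samples.is_empty() {
        return None;
    }
    // Convert to microseconds and sort ascending.
    let mut us: Vec<f64> = samples
        .iter()
        .map(|d| d.as_nanos() as f64 / 1_000.0)
        .collect();
    us.sort_by(|a, b| a.total_cmp(b));

    let n = us.len();
    let mean_us = us.iter().sum::<f64>() / n as f64;

    // Nearest rank: 1-indexed rank = ceil(0.99 * n), computed in integer
    // arithmetic to avoid floating-point rounding at the boundary.
    let rank = (99 * n + 99) / 100;
    let p99_us = us[rank - 1];

    Some(LatencySummary { mean_us, p99_us })
}
```

With only 100 queries, the nearest-rank p99 is the second-slowest sample, so a single outlier pair can dominate it; recording more queries gives a steadier tail estimate.
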