3.0 KiB
Scale Benchmarks: 1M-Item Baselines
Hardware
macOS Darwin 23.6.0 (Apple Silicon / x86-64 — see run date below).
Run command:
cargo bench --manifest-path tidal/Cargo.toml --bench scale
Date: 2026-02-23
Dataset
| Parameter | Value |
|---|---|
| Items | 1,000,000 |
| Creators | 10,000 (100 items/creator) |
| Categories | 20 |
| Embedding dim | 128 (not 1536 — reduced for bench RAM) |
| Signal coverage | 10% view, 5% like |
| Bench tool | Criterion (sample_size=10, 30s measurement, Flat mode) |
Acceptance Criteria
| Benchmark | Target | Measured | Status |
|---|---|---|---|
| RETRIEVE p99 | < 50ms | 152 µs (for_you) | ✅ PASS |
| SEARCH p99 | < 100ms | 28.9 ms (text_only) | ✅ PASS |
| Signal write p99 | < 100µs | 82 ns | ✅ PASS |
All three targets pass by a wide margin.
Benchmark Results
RETRIEVE (1M items)
retrieve_1m/for_you time: [151.88 µs 152.13 µs 152.40 µs]
retrieve_1m/trending time: [127.96 µs 128.25 µs 128.52 µs]
retrieve_1m/new_filtered time: [ 7.5636 µs 7.5855 µs 7.6058 µs]
All RETRIEVE queries < 200µs. The 50ms target is beaten by 3 orders of magnitude.
for_you: signal-scored ranking over full 1M-item universe — 152µstrending: windowed view count ranking — 128µsnew_filtered: category filter at ~5% selectivity — 7.6µs (bitmap pre-filter eliminates 95% of candidates)
SEARCH (1M items)
search_1m/text_only time: [28.844 ms 28.934 ms 29.021 ms]
search_1m/text_filtered time: [ 1.8972 ms 1.9104 ms 1.9220 ms]
Both SEARCH queries < 30ms. The 100ms target is beaten by 3-50×.
text_only: BM25 over 1M documents — 28.9ms (most expensive path; dominated by Tantivy posting list traversal)text_filtered: BM25 with category filter reduces candidate set — 1.9ms
Signal Write (1M-item DB, rotating 1K entities)
signal_write_1m/write_rotating_1k_entities time: [82.033 ns 82.286 ns 82.535 ns]
82 ns per write. The 100µs target is beaten by 1,200×. DashMap hot-path write amortises to sub-100ns across 1K rotating entity IDs.
Setup Notes
The LazyLock<TidalDb> pattern ensures the 1M-item database is built exactly once per bench run. Build time ~30s on the reference hardware above. The text syncer waits 3s after ingestion.
Database Build Time
Approximately 30 seconds on reference hardware (observed from [scale bench] Database ready log line).
Analysis
tidalDB operates well within all three acceptance-criteria targets at 1M items. The dominant cost is SEARCH text_only at ~29ms — driven by Tantivy posting list traversal across 1M documents. The LogMergePolicy tuning (< 20 segments at steady state) keeps this below the 100ms target with headroom.
Signal writes at 82ns confirm the DashMap hot-path is not a bottleneck at this scale. The 5M-entry LRU trimming threshold (DEFAULT_MAX_SIGNAL_ENTRIES) provides ample headroom for the 100K-item signal coverage in this benchmark (~200K entries = ~218MB).