# Tantivy Merge Policy Tuning

## Configuration Applied

`TextIndex` construction in `tidal/src/text/index.rs` configures `LogMergePolicy` with:

| Parameter | Value | Rationale |
|-----------|-------|-----------|
| `min_num_segments` | 4 | More aggressive than the default (8); triggers merges earlier to bound segment count at steady state |
| `max_docs_before_merge` | 5,000,000 | Smaller maximum segment than the default (10M); reduces worst-case merge duration |
| `del_docs_ratio_before_merge` | 0.3 | Triggers a merge when 30% of docs are deleted; tidalDB uses delete-then-add for updates, so deleted docs accumulate |

Applied to both Item and Creator `TextIndex` instances.

## API

```rust
// Returns the number of active Tantivy search segments.
// Useful for monitoring merge policy effectiveness.
db.text_index.segment_count()
```

Note: `TidalDb::text_segment_count()` is exposed in `tidal/src/db/items.rs`.

## Segment Count Target

**< 20 segments at steady state** after initial 1M-item ingest.

At 1M items with 1000-item commits (tidalDB's default syncer batch size), the initial ingest produces ~1000 commits. Without merge policy tuning, segment count can reach 50+. With `min_num_segments=4`, merges fire aggressively and keep steady-state count below 20.

## Verification

The integration tests in `tidal/tests/tantivy_merge.rs` (marked `#[ignore]`) verify:

- `tantivy_segment_evolution`: segment count stays < 20 during 10 rounds of steady-state writes
- `tantivy_concurrent_read_write_latency`: read p99 < 100ms during concurrent writes

To run:

```bash
cargo test --manifest-path tidal/Cargo.toml -- tantivy_segment_evolution --ignored --nocapture
cargo test --manifest-path tidal/Cargo.toml -- tantivy_concurrent_read_write_latency --ignored --nocapture
```

## Regression Guard

No regression in the `tidal/benches/text_index.rs` and `tidal/benches/search.rs` benchmarks after applying the `LogMergePolicy` changes.
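
As a rough check on two numbers used in this note, the commit count for the initial ingest and the 30% delete threshold, the arithmetic can be sketched in plain Rust. The function names below are hypothetical helpers written for illustration; they are not part of tidalDB or Tantivy. The strict greater-than comparison is an assumption about how the threshold is applied.

```rust
/// Number of commits needed to ingest `total_items` in batches of
/// `batch_size` (ceiling division). Hypothetical helper, not tidalDB code.
fn commits_for_ingest(total_items: usize, batch_size: usize) -> usize {
    assert!(batch_size > 0, "batch_size must be non-zero");
    (total_items + batch_size - 1) / batch_size
}

/// Fraction of a segment's documents that are marked deleted.
/// An empty segment is treated as having a ratio of 0.
fn deletes_ratio(num_deleted_docs: u32, max_doc: u32) -> f32 {
    if max_doc == 0 {
        return 0.0;
    }
    num_deleted_docs as f32 / max_doc as f32
}

/// True when a segment's delete ratio is above `threshold`
/// (0.3 in this note). Assumption: the comparison is strict.
fn exceeds_delete_ratio(num_deleted_docs: u32, max_doc: u32, threshold: f32) -> bool {
    deletes_ratio(num_deleted_docs, max_doc) > threshold
}
```

Under these assumptions, `commits_for_ingest(1_000_000, 1_000)` gives the ~1000 commits cited for the initial ingest, and a 1000-doc segment with 350 deleted docs would cross the 0.3 threshold while one with 250 would not.
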
## References

- `docs/research/tantivy.md`: LogMergePolicy background and parameter guidance
- Tantivy docs: https://docs.rs/tantivy/latest/tantivy/merge_policy/struct.LogMergePolicy.html