tidaldb/docs/planning/milestone-5/phase-4/task-03-creator-search-executor.md
jordan 192c473f55 feat: complete Milestone 5 — full-text search, RRF fusion, and creator search
- M5p1: BM25 text indexing via Tantivy with background syncer (0.26ms @ 10K docs)
- M5p2: RRF fusion layer combining BM25 + ANN scores (46µs @ 1K candidates)
- M5p3: unified Search query API (8-stage pipeline, BM25 + vector + ranking)
- M5p4: creator text + vector indexing and creator search executor (< 20ms @ 200 creators)
- Refactor db/mod.rs into focused sub-modules (creators, items, sessions, signals, etc.)
- Decompose monolithic files into directory modules (query/executor, ranking/diversity, etc.)
- Split brute.rs → brute/mod.rs + brute/tests.rs; extract search executor helpers
- Add benches: fusion, search, session, text_index
- Add M5 UAT test suites (m5_uat, m5_search, m5p4_creator_search, text_index)
- Update blog posts, roadmap, content strategy, and M5 planning docs
- Add tmp/ and .claude/worktrees/ to .gitignore

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 23:53:16 -07:00

Task 03: Creator Search Executor

Goal

Extend SearchExecutor to select both the text index and the ANN slot based on query.entity_kind. When entity_kind is EntityKind::Creator, use creator_text_index and the (EntityKind::Creator, "content") slot.

Files to Modify

  • tidal/src/query/search.rs — add creator_text_index field, routing in execute()
  • tidal/src/db/mod.rs — pass creator text index in search()
  • tidal/tests/m5p4_creator_search.rs — new integration tests

SearchExecutor Changes

Add a new field, creator_text_index: Option<&'a Arc<crate::text::TextIndex>>.

In execute() Stage 1a:

// Creator queries read the creator index; every other entity kind keeps the item index.
let effective_text_index = match query.entity_kind {
    EntityKind::Creator => self.creator_text_index,
    _ => self.text_index,
};

In execute() Stage 1b ANN, use query.entity_kind instead of hardcoded EntityKind::Item:

match registry.get(query.entity_kind, "content") { ... }
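The slot lookup above can be illustrated with a minimal, self-contained sketch. The SlotRegistry and AnnIndex types here are hypothetical stand-ins, and the Option return type of get() is an assumption; the real registry in tidal may differ. The point is only that the slot key is the pair (entity kind, slot name), so an item "content" slot and a creator "content" slot never collide.

```rust
use std::collections::HashMap;

// Stand-in for the real EntityKind; only the two variants discussed here.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum EntityKind {
    Item,
    Creator,
}

/// Hypothetical ANN index handle; `dims` exists only to tell slots apart.
#[derive(Debug, PartialEq)]
pub struct AnnIndex {
    pub dims: usize,
}

/// Hypothetical registry keyed by (entity kind, slot name).
#[derive(Default)]
pub struct SlotRegistry {
    slots: HashMap<(EntityKind, String), AnnIndex>,
}

impl SlotRegistry {
    pub fn register(&mut self, kind: EntityKind, slot: &str, index: AnnIndex) {
        self.slots.insert((kind, slot.to_string()), index);
    }

    /// Assumed to return None when no slot was registered for this kind.
    pub fn get(&self, kind: EntityKind, slot: &str) -> Option<&AnnIndex> {
        self.slots.get(&(kind, slot.to_string()))
    }
}
```

With this shape, a vector query against a kind that has no registered "content" slot simply yields None, which the executor can treat as "skip the ANN stage".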

Add builder method:

/// Attaches the creator text index; execute() uses it when entity_kind is Creator.
pub fn with_creator_text_index(mut self, idx: &'a Arc<crate::text::TextIndex>) -> Self {
    self.creator_text_index = Some(idx);
    self
}
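Put together, the field, the Stage 1a match, and the builder might look like the following sketch. It uses stub EntityKind and TextIndex types so that it compiles on its own. The `name` field and the `effective_text_index` helper are hypothetical and exist only to make the routing observable; the real executor has more fields and does this selection inside execute().

```rust
use std::sync::Arc;

// Stand-ins for the real tidal types; only the routing logic matters here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EntityKind {
    Item,
    Creator,
}

#[derive(Debug, PartialEq)]
pub struct TextIndex {
    pub name: &'static str, // hypothetical label showing which index was picked
}

pub struct SearchExecutor<'a> {
    text_index: Option<&'a Arc<TextIndex>>,
    creator_text_index: Option<&'a Arc<TextIndex>>,
}

impl<'a> SearchExecutor<'a> {
    /// The item text index always goes through new(); the creator index is opt-in.
    pub fn new(text_index: Option<&'a Arc<TextIndex>>) -> Self {
        Self { text_index, creator_text_index: None }
    }

    pub fn with_creator_text_index(mut self, idx: &'a Arc<TextIndex>) -> Self {
        self.creator_text_index = Some(idx);
        self
    }

    /// Stage 1a routing, pulled into a helper (hypothetical) so it can be tested.
    pub fn effective_text_index(&self, kind: EntityKind) -> Option<&'a Arc<TextIndex>> {
        match kind {
            EntityKind::Creator => self.creator_text_index,
            _ => self.text_index,
        }
    }
}
```

One consequence of this routing is worth noting: a Creator query on an executor that was never given a creator index resolves to None and does not fall back to the item index, so creator searches cannot silently return item hits.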

TidalDb::search() Routing

if query.entity_kind == EntityKind::Creator {
    if let Some(idx) = self.creator_text_index.as_ref() {
        base_executor = base_executor.with_creator_text_index(idx);
    }
}

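Note: there are two ways to wire this up. For Creator searches, TidalDb::search() could pass None as text_index (the item text index) to SearchExecutor::new(), or it could pass both indexes and let the executor route. The simplest option is the second: always pass the item text index to new(), attach the creator index via the builder method, and let the executor pick based on entity_kind.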

Integration Tests

Create tidal/tests/m5p4_creator_search.rs:

  • step01_creator_text_search_returns_results() — write 200 creators, search "jazz", assert ≥ 1 result with bm25_score.is_some()
  • step02_creator_verified_filter() — search with filter(verified = "true"), assert every returned creator has verified = "true" in its metadata
  • step03_creator_similar_to() — write embeddings, search with vector, assert results have semantic_score.is_some()
  • step04_creator_search_latency_under_20ms() — measure 10 iterations, assert p50 < 20ms
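
For the latency check in step04, one possible way to compute p50 is sketched below. The helper name p50_latency is hypothetical, and the closure stands in for a single creator search call. With an even number of samples it takes the upper median, which makes the under-20ms assertion slightly stricter.

```rust
use std::time::{Duration, Instant};

/// Runs `f` `iterations` times and returns the median (p50) wall-clock duration.
pub fn p50_latency<F: FnMut()>(iterations: usize, mut f: F) -> Duration {
    assert!(iterations > 0, "need at least one iteration");
    let mut samples: Vec<Duration> = (0..iterations)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed()
        })
        .collect();
    samples.sort();
    // Upper median for even counts, so the check errs on the strict side.
    samples[samples.len() / 2]
}
```

In the integration test this would be called with 10 iterations and a closure that runs the creator search, followed by an assertion that the returned duration is below Duration::from_millis(20).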

Acceptance Criteria

  • Creator search returns BM25-ranked results
  • Filter by verified = "true" works
  • Vector-only and hybrid search work for creators
  • All existing m5p3 item search tests still pass
  • p50 latency < 20ms at 200 creators