tidaldb/docs/planning/milestone-5/phase-4/task-01-creator-text-indexing.md
jordan 192c473f55 feat: complete Milestone 5 — full-text search, RRF fusion, and creator search
2026-02-21 23:53:16 -07:00

Task 01: Creator Text Indexing

Goal

Add a separate Tantivy text index for creator entities, parallel to the existing item text index. Creator text fields are declared in the schema via creator_text_field(). write_creator() enqueues writes for a dedicated background syncer, following the same design as the item text index.

Files to Modify

  • tidal/src/schema/validation.rs — add creator_text_fields vec to Schema and SchemaBuilder
  • tidal/src/schema/mod.rs — no new re-exports required (the types involved are already exported)
  • tidal/src/db/mod.rs — add creator text index fields, spawn creator syncer, extend write_creator(), add reload_creator_text_index()

Schema Changes

Add creator_text_fields: Vec<TextFieldDef> to both Schema and SchemaBuilder.

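impl SchemaBuilder {
    /// Declares a creator metadata key to be indexed in the creator text index.
    /// Returns `&mut Self` so declarations can be chained.
    pub fn creator_text_field(&mut self, key: &str, field_type: TextFieldType) -> &mut Self {
        self.creator_text_fields.push(TextFieldDef { key: key.to_owned(), field_type });
        self
    }
}

impl Schema {
    /// Returns the declared creator text fields, in declaration order.
    pub fn creator_text_fields(&self) -> &[TextFieldDef] {
        &self.creator_text_fields
    }
}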
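The builder and accessor above can be exercised in isolation. The following is a minimal sketch, not the tidaldb implementation: the TextFieldType variants, the build() method, the example_schema() helper, and the field keys "display_name" and "handle" are all assumptions made for illustration.

```rust
// Hypothetical stand-ins for tidaldb's schema types; the real definitions may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TextFieldType {
    Text,    // assumed variant: tokenized full-text field
    Keyword, // assumed variant: exact-match field
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TextFieldDef {
    pub key: String,
    pub field_type: TextFieldType,
}

#[derive(Debug, Default)]
pub struct Schema {
    creator_text_fields: Vec<TextFieldDef>,
}

impl Schema {
    pub fn creator_text_fields(&self) -> &[TextFieldDef] {
        &self.creator_text_fields
    }
}

#[derive(Debug, Default)]
pub struct SchemaBuilder {
    creator_text_fields: Vec<TextFieldDef>,
}

impl SchemaBuilder {
    pub fn creator_text_field(&mut self, key: &str, field_type: TextFieldType) -> &mut Self {
        self.creator_text_fields.push(TextFieldDef { key: key.to_owned(), field_type });
        self
    }

    /// Assumed build step: copies the declared fields into an immutable Schema.
    pub fn build(&self) -> Schema {
        Schema { creator_text_fields: self.creator_text_fields.clone() }
    }
}

/// Hypothetical helper: declares two creator text fields by chaining calls.
pub fn example_schema() -> Schema {
    let mut builder = SchemaBuilder::default();
    builder
        .creator_text_field("display_name", TextFieldType::Text)
        .creator_text_field("handle", TextFieldType::Keyword);
    builder.build()
}
```

Because creator_text_field() returns &mut Self, declarations chain on a mutable builder, and creator_text_fields() returns them in declaration order.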

TidalDb Changes

Add three new fields parallel to text_index, text_tx, text_syncer_thread:

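// Creator text index; None when text indexing is not enabled.
creator_text_index: Option<Arc<crate::text::TextIndex>>,
// Sender half of the queue consumed by the creator syncer.
creator_text_tx: std::sync::Mutex<Option<crossbeam::channel::Sender<crate::text::PendingWrite>>>,
// Join handle for the creator syncer thread.
creator_text_syncer_thread: std::sync::Mutex<Option<std::thread::JoinHandle<crate::Result<()>>>>,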

Spawn the creator syncer in from_parts() using the same pattern as the item syncer. In write_creator(), enqueue a write to creator_text_tx when the sender is present.

Add reload_creator_text_index() helper for tests.
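The enqueue-and-sync pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the tidaldb code: it substitutes std::sync::mpsc for crossbeam::channel so the example has no external dependencies, and the Db, TextIndex, and PendingWrite definitions, the "display_name" key, and the shutdown() and indexed_creator_count() helpers are hypothetical stand-ins.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for crate::text::PendingWrite.
pub struct PendingWrite {
    pub id: u64,
    pub text: String,
}

/// Hypothetical stand-in for crate::text::TextIndex: it only collects documents.
#[derive(Default)]
pub struct TextIndex {
    docs: Mutex<Vec<(u64, String)>>,
}

/// Simplified database handle holding the three creator text fields.
pub struct Db {
    creator_text_index: Option<Arc<TextIndex>>,
    creator_text_tx: Mutex<Option<Sender<PendingWrite>>>,
    creator_text_syncer_thread: Mutex<Option<JoinHandle<()>>>,
}

impl Db {
    /// Builds the handle; spawns the creator syncer only when indexing is enabled.
    pub fn new(index_creator_text: bool) -> Self {
        if !index_creator_text {
            return Db {
                creator_text_index: None,
                creator_text_tx: Mutex::new(None),
                creator_text_syncer_thread: Mutex::new(None),
            };
        }
        let index = Arc::new(TextIndex::default());
        let (tx, rx) = mpsc::channel::<PendingWrite>();
        let worker_index = Arc::clone(&index);
        // The syncer drains the queue until every sender has been dropped.
        let handle = thread::spawn(move || {
            for write in rx {
                worker_index.docs.lock().unwrap().push((write.id, write.text));
            }
        });
        Db {
            creator_text_index: Some(index),
            creator_text_tx: Mutex::new(Some(tx)),
            creator_text_syncer_thread: Mutex::new(Some(handle)),
        }
    }

    /// Enqueues a write only when a sender is present and the metadata
    /// contains the (assumed) indexed key.
    pub fn write_creator(&self, id: u64, metadata: &HashMap<String, String>) {
        if let Some(tx) = self.creator_text_tx.lock().unwrap().as_ref() {
            if let Some(text) = metadata.get("display_name") {
                // A send error means the syncer has already exited; ignored in this sketch.
                let _ = tx.send(PendingWrite { id, text: text.clone() });
            }
        }
    }

    /// Drops the sender to close the queue, then waits for the syncer to finish.
    pub fn shutdown(&self) {
        drop(self.creator_text_tx.lock().unwrap().take());
        if let Some(handle) = self.creator_text_syncer_thread.lock().unwrap().take() {
            handle.join().expect("creator syncer thread panicked");
        }
    }

    /// Number of creator documents the syncer has applied so far.
    pub fn indexed_creator_count(&self) -> usize {
        self.creator_text_index
            .as_ref()
            .map_or(0, |index| index.docs.lock().unwrap().len())
    }
}
```

Keeping the sender behind Mutex<Option<...>> lets shutdown take and drop it, which closes the queue and allows the syncer thread to exit before it is joined. Writes are applied asynchronously, so a test that reads the index right after write_creator() needs a synchronization point; in this sketch that is shutdown(), while the task itself specifies reload_creator_text_index() for that purpose.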

Acceptance Criteria

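  • SchemaBuilder::creator_text_field() compiles, and declared fields are returned by Schema::creator_text_fields()
  • Writing a creator whose metadata matches a declared creator text field enqueues a write to the creator text index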
  • reload_creator_text_index() reloads the reader for test synchronization
  • Existing item text index is unaffected