tidaldb/docs/planning/milestone-7/phase-4/task-03-index-health-metrics.md
2026-02-23 22:41:16 -07:00

Task 03: Index Health Metrics

Delivers

Prometheus gauges covering the health of every secondary index: Tantivy segment count and indexed document count, USearch vector count and estimated byte size, and the summed cardinality of the bitmap indexes. These metrics let operators detect stale derived indexes, growing segment fragmentation, and index size anomalies before they affect query latency.

Complexity: M

Dependencies

  • task-01 complete (establishes instrumentation pattern)
  • tidal/src/db/metrics.rs -- MetricsState to extend
  • tidal/src/text/index.rs -- TextIndex wraps Tantivy Index and IndexReader
  • tidal/src/storage/vector/registry.rs -- EmbeddingSlotRegistry owns USearch indexes
  • tidal/src/storage/indexes/bitmap.rs -- BitmapIndex with RoaringBitmap values

Technical Design

1. Add atomic gauges to MetricsState

In tidal/src/db/metrics.rs:

pub struct MetricsState {
    // ... existing + task-02 fields ...

    // ── Index health metrics (m7p4) ────────────────────────────────────
    /// Number of Tantivy segments for the items text index.
    #[cfg(feature = "metrics")]
    pub(crate) tantivy_segment_count: AtomicU64,
    /// Number of documents indexed in the items text index.
    #[cfg(feature = "metrics")]
    pub(crate) tantivy_indexed_docs: AtomicU64,
    /// Estimated byte size of the USearch indexes, summed across all embedding slots.
    #[cfg(feature = "metrics")]
    pub(crate) usearch_index_size_bytes: AtomicU64,
    /// Number of vectors stored in the USearch index.
    #[cfg(feature = "metrics")]
    pub(crate) usearch_vector_count: AtomicU64,
    /// Total cardinality across all bitmap index entries (category + format + creator + tag).
    #[cfg(feature = "metrics")]
    pub(crate) bitmap_index_cardinality: AtomicU64,
}
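
As a minimal sketch of how such gauges behave, the following std-only example uses a hypothetical `IndexGauges` struct (a stand-in, not the real `MetricsState`) with two of the fields above. It illustrates why a gauge is written with `store` rather than `fetch_add`: the observed value can go down, for example when Tantivy merges segments.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for the index-health slice of `MetricsState`.
#[derive(Default)]
pub struct IndexGauges {
    pub tantivy_segment_count: AtomicU64,
    pub usearch_vector_count: AtomicU64,
}

impl IndexGauges {
    /// Overwrite the gauge with the latest observed value.
    /// A gauge is a point-in-time reading, so the old value is discarded.
    pub fn set_segments(&self, n: u64) {
        self.tantivy_segment_count.store(n, Ordering::Relaxed);
    }

    /// Read the most recently stored segment count.
    pub fn segments(&self) -> u64 {
        self.tantivy_segment_count.load(Ordering::Relaxed)
    }
}
```

`Ordering::Relaxed` is likely sufficient here because each gauge is read independently; a scrape that lands mid-refresh may see some gauges from the previous pass and some from the current one, which is normally acceptable for monitoring.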

2. Expose index introspection methods

TextIndex

Add a method to TextIndex for segment and document count:

impl TextIndex {
    /// Return the number of segments and total indexed documents.
    ///
    /// Reads from the current IndexReader snapshot. Thread-safe.
    #[must_use]
    pub fn index_stats(&self) -> (usize, u64) {
        let searcher = self.reader.searcher();
        let segment_count = searcher.segment_readers().len();
        let doc_count = searcher
            .segment_readers()
            .iter()
            .map(|r| u64::from(r.num_docs()))
            .sum();
        (segment_count, doc_count)
    }
}

EmbeddingSlotRegistry

Add a method to report total vector count and estimated byte size:

impl EmbeddingSlotRegistry {
    /// Return the total vector count and estimated byte size across all slots.
    #[must_use]
    pub fn index_stats(&self) -> (u64, u64) {
        let mut total_vectors: u64 = 0;
        let mut total_bytes: u64 = 0;
        for slot in self.slots.values() {
            let count = slot.index.size() as u64;
            // Estimate raw vector storage as count * dimensions * sizeof(f16).
            // Assumes f16 storage; excludes HNSW graph overhead.
            let dim = slot.dimensions as u64;
            let bytes = count * dim * 2; // f16 = 2 bytes
            total_vectors += count;
            total_bytes += bytes;
        }
        (total_vectors, total_bytes)
    }
}

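If the USearch binding in use exposes a serialized_length() method, prefer it over this estimate. The estimate counts only raw vector data, assumes every slot stores f16 components, and excludes HNSW graph overhead, so it is a lower bound on the real index size.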
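
To make the arithmetic concrete, here is a small sketch of the estimate as a standalone function. The name `estimate_vector_bytes` and the saturating arithmetic are illustrative assumptions, not part of the planned API. For 1,000,000 vectors of 768 dimensions the estimate is 1,000,000 × 768 × 2 = 1,536,000,000 bytes.

```rust
/// Bytes per f16 component. Assumes every slot stores half-precision vectors.
const F16_BYTES: u64 = 2;

/// Lower-bound estimate of raw vector storage: count * dimensions * 2.
/// Saturates at `u64::MAX` instead of wrapping on overflow.
pub fn estimate_vector_bytes(count: u64, dimensions: u64) -> u64 {
    count.saturating_mul(dimensions).saturating_mul(F16_BYTES)
}
```

Saturating multiplication is a defensive choice for a metrics path: an implausibly large reading is easier to spot on a dashboard than a wrapped one.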

BitmapIndex

Add a method to report total cardinality:

impl BitmapIndex {
    /// Sum of per-entry cardinalities. An entity ID stored under several keys
    /// is counted once per key, so this is not a distinct-ID count.
    #[must_use]
    pub fn total_cardinality(&self) -> u64 {
        self.entries.iter().map(|e| e.value().len()).sum()
    }
}
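
The difference between summed cardinality and a distinct-ID count can be sketched with a std-only stand-in. `ToyBitmapIndex` below is hypothetical and uses `BTreeSet<u32>` in place of `RoaringBitmap`; it is only meant to show what the sum measures.

```rust
use std::collections::{BTreeSet, HashMap};

/// Std-only stand-in for a bitmap index: key -> set of entity IDs.
#[derive(Default)]
pub struct ToyBitmapIndex {
    entries: HashMap<String, BTreeSet<u32>>,
}

impl ToyBitmapIndex {
    /// Add `id` to the set stored under `key`.
    pub fn insert(&mut self, key: &str, id: u32) {
        self.entries.entry(key.to_string()).or_default().insert(id);
    }

    /// Sum of per-entry set sizes, mirroring `total_cardinality()`.
    pub fn total_cardinality(&self) -> u64 {
        self.entries.values().map(|s| s.len() as u64).sum()
    }

    /// Number of distinct IDs across all entries, for comparison.
    pub fn distinct_ids(&self) -> u64 {
        let all: BTreeSet<u32> = self.entries.values().flatten().copied().collect();
        all.len() as u64
    }
}
```

With ID 2 stored under both "jazz" and "rock" and ID 1 under "jazz" only, the summed cardinality is 3 while the distinct count is 2. The summed form is cheaper to compute and still tracks index growth, which is what the gauge is for.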

3. Periodic metrics refresh

In the checkpoint thread or a dedicated metrics-refresh interval (reuse the pattern from task-02), collect index stats:

#[cfg(feature = "metrics")]
fn refresh_index_metrics(db: &TidalDb) {
    // Tantivy
    if let Some(text_index) = &db.text_index {
        let (segments, docs) = text_index.index_stats();
        db.metrics.tantivy_segment_count.store(segments as u64, Ordering::Relaxed);
        db.metrics.tantivy_indexed_docs.store(docs, Ordering::Relaxed);
    }

    // USearch
    if let Ok(registry) = db.embedding_registry.read() {
        let (vectors, bytes) = registry.index_stats();
        db.metrics.usearch_vector_count.store(vectors, Ordering::Relaxed);
        db.metrics.usearch_index_size_bytes.store(bytes, Ordering::Relaxed);
    }

    // Bitmap indexes
    let cardinality = db.category_index.total_cardinality()
        + db.format_index.total_cardinality()
        + db.creator_index.total_cardinality()
        + db.tag_index.total_cardinality();
    db.metrics.bitmap_index_cardinality.store(cardinality, Ordering::Relaxed);
}

Call this function every 10 seconds from the checkpoint thread's periodic loop. Index stats are not hot-path -- 10-second staleness is acceptable for monitoring.
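
If the refresh ends up on a dedicated thread rather than inside the checkpoint loop, one possible shape is sketched below. `spawn_refresher` is a hypothetical helper, not an existing tidaldb function; it uses a channel with `recv_timeout` so that the interval doubles as a prompt shutdown signal.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Spawn a thread that calls `refresh` once per `interval` until a message is
/// sent on the returned sender or the sender is dropped.
pub fn spawn_refresher<F>(interval: Duration, mut refresh: F) -> (Sender<()>, JoinHandle<()>)
where
    F: FnMut() + Send + 'static,
{
    let (stop_tx, stop_rx) = mpsc::channel::<()>();
    let handle = thread::spawn(move || loop {
        match stop_rx.recv_timeout(interval) {
            // Interval elapsed with no stop signal: run one refresh pass.
            Err(RecvTimeoutError::Timeout) => refresh(),
            // Explicit stop, or the sender was dropped: exit the loop.
            Ok(()) | Err(RecvTimeoutError::Disconnected) => break,
        }
    });
    (stop_tx, handle)
}
```

In this sketch the production call would pass `Duration::from_secs(10)` and a closure that invokes `refresh_index_metrics`. Waiting on the channel instead of sleeping means shutdown does not have to wait out the remainder of the interval.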

4. Render in Prometheus format

Extend MetricsState::render_prometheus():

// Tantivy
write_gauge(&mut out, "tidaldb_tantivy_segment_count",
    "Number of Tantivy index segments",
    self.tantivy_segment_count.load(Ordering::Relaxed) as f64);

write_gauge(&mut out, "tidaldb_tantivy_indexed_docs",
    "Number of documents indexed in Tantivy",
    self.tantivy_indexed_docs.load(Ordering::Relaxed) as f64);

// USearch
write_gauge(&mut out, "tidaldb_usearch_index_size_bytes",
    "Estimated byte size of USearch vector indexes",
    self.usearch_index_size_bytes.load(Ordering::Relaxed) as f64);

write_gauge(&mut out, "tidaldb_usearch_vector_count",
    "Number of vectors stored in USearch indexes",
    self.usearch_vector_count.load(Ordering::Relaxed) as f64);

// Bitmap
write_gauge(&mut out, "tidaldb_bitmap_index_cardinality",
    "Total entity IDs across all bitmap indexes",
    self.bitmap_index_cardinality.load(Ordering::Relaxed) as f64);
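
The `write_gauge` helper is assumed to exist already from earlier tasks and is not defined in this document. For readers unfamiliar with the Prometheus text format, a hypothetical version with the same call shape might look like the sketch below; the real helper may differ.

```rust
use std::fmt::Write;

/// Hypothetical gauge writer: emits the HELP line, the TYPE line, and one
/// unlabelled sample line in Prometheus text exposition format.
pub fn write_gauge(out: &mut String, name: &str, help: &str, value: f64) {
    // Writing into a String cannot fail, so the Results are discarded.
    let _ = writeln!(out, "# HELP {name} {help}");
    let _ = writeln!(out, "# TYPE {name} gauge");
    let _ = writeln!(out, "{name} {value}");
}
```

Note that casting a `u64` to `f64` loses precision above 2^53. That is far beyond any realistic segment, document, or vector count, so the casts above should be harmless for these gauges.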

5. Metric names (string literals)

| Metric name | Type | Description |
|---|---|---|
| tidaldb_tantivy_segment_count | gauge | Number of Tantivy index segments |
| tidaldb_tantivy_indexed_docs | gauge | Number of documents indexed in Tantivy |
| tidaldb_usearch_index_size_bytes | gauge | Estimated byte size of USearch vector indexes |
| tidaldb_usearch_vector_count | gauge | Number of vectors stored in USearch indexes |
| tidaldb_bitmap_index_cardinality | gauge | Total entity IDs across all bitmap indexes |

Acceptance Criteria

  • TextIndex::index_stats() returns (segment_count, doc_count) correctly
  • EmbeddingSlotRegistry::index_stats() returns (vector_count, byte_size)
  • BitmapIndex::total_cardinality() sums across all entries
  • MetricsState extended with 5 atomic gauges, all #[cfg(feature = "metrics")]
  • Metrics refreshed periodically (every 10 seconds in checkpoint thread)
  • /metrics endpoint renders all 5 new metrics in valid Prometheus format
  • Metrics reflect actual index state after writes (verified in integration test)
  • cargo clippy -- -D warnings and cargo fmt --check pass

Test Strategy

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn text_index_stats_empty() {
        let fields = vec![TextFieldDef { key: "title".into(), field_type: TextFieldType::Text }];
        let idx = TextIndex::ephemeral(&fields).unwrap();
        let (segments, docs) = idx.index_stats();
        assert_eq!(docs, 0);
        // Tantivy may report 0 or 1 segments for an empty index
        assert!(segments <= 1);
    }

    #[test]
    fn bitmap_total_cardinality_empty() {
        let idx = BitmapIndex::new("test");
        assert_eq!(idx.total_cardinality(), 0);
    }

    #[test]
    fn bitmap_total_cardinality_after_inserts() {
        let idx = BitmapIndex::new("test");
        idx.insert("jazz", 1);
        idx.insert("jazz", 2);
        idx.insert("rock", 3);
        assert_eq!(idx.total_cardinality(), 3);
    }
}

Integration test:

#[test]
fn index_metrics_reflect_writes() {
    let db = make_test_db_with_text_schema();
    // Write items with metadata
    for i in 0..10 {
        db.write_item_with_metadata(
            EntityId::new(i),
            &HashMap::from([
                ("title".to_string(), format!("Item {i}")),
                ("category".to_string(), "jazz".to_string()),
            ]),
        ).unwrap();
    }
    db.flush_text_index().unwrap();

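    // Gauges only change when the periodic refresh runs, so trigger one pass
    // explicitly rather than waiting for the 10-second interval. This assumes
    // the test can call refresh_index_metrics (for example, it lives in-crate).
    refresh_index_metrics(&db);

    let metrics = db.metrics();
    let prom = metrics.render_prometheus();
    assert!(prom.contains("tidaldb_tantivy_indexed_docs"));
    assert!(prom.contains("tidaldb_bitmap_index_cardinality"));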
}
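
Checking only that a metric name is present does not show that the value tracks the index. One way to tighten the assertion is a small parsing helper; `gauge_value` below is a hypothetical test utility that assumes unlabelled samples of the form `<name> <value>`.

```rust
/// Find the sample line for `name` in Prometheus text output and parse its value.
/// Comment lines (HELP/TYPE) are skipped; labelled samples are not supported.
pub fn gauge_value(rendered: &str, name: &str) -> Option<f64> {
    rendered
        .lines()
        .filter(|line| !line.starts_with('#'))
        .find_map(|line| {
            let mut parts = line.split_whitespace();
            match (parts.next(), parts.next()) {
                // Require an exact name match so prefixes do not collide.
                (Some(n), Some(v)) if n == name => v.parse::<f64>().ok(),
                _ => None,
            }
        })
}
```

With such a helper the integration test could assert, for instance, that `gauge_value(&prom, "tidaldb_tantivy_indexed_docs")` equals `Some(10.0)` after the ten writes, provided the gauges have been refreshed before rendering.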