137 lines
5.6 KiB
Markdown
137 lines
5.6 KiB
Markdown
# Task 07: `metrics` Feature Flag + Zero-Overhead Verification
|
|
|
|
## Delivers
|
|
|
|
Verification that all metrics instrumentation (atomic counters, histogram, HTTP endpoint, Prometheus rendering) is gated behind `#[cfg(feature = "metrics")]` and compiles to zero overhead when the feature is disabled. Adds a CI-enforceable check that `--no-default-features` produces no metrics code.
|
|
|
|
## Complexity: S
|
|
|
|
## Dependencies
|
|
|
|
- task-01 complete (QueryStats is NOT gated -- it has value without Prometheus)
|
|
- task-02, task-03, task-04 complete (all atomic counters must exist before we verify gating)
|
|
- `tidal/Cargo.toml` -- `metrics` feature definition
|
|
|
|
## Technical Design
|
|
|
|
### 1. Audit all `#[cfg(feature = "metrics")]` gates
|
|
|
|
Every `AtomicU64` counter, the `LatencyHistogram`, and the HTTP server module must be gated. The audit covers:
|
|
|
|
| Location | Gated item |
|
|
|---|---|
|
|
| `db/metrics.rs` | `wal_lag_bytes`, `wal_compacted_segments_total`, `last_checkpoint_ns`, `signal_hot_entries`, `signal_writes_total`, `signal_write_latency`, `tantivy_segment_count`, `tantivy_indexed_docs`, `usearch_index_size_bytes`, `usearch_vector_count`, `bitmap_index_cardinality`, `active_sessions`, `closed_sessions_total`, `session_auto_closed_total`, `rate_limited_total`, `degradation_level` |
|
|
| `db/http.rs` | Entire module already gated: `#[cfg(feature = "metrics")] pub mod http;` |
|
|
| `db/signals.rs` | `Instant::now()` + `observe()` call in `signal()` |
|
|
| `db/sessions.rs` | `fetch_add`/`fetch_sub` calls on session counters |
|
|
| `db/mod.rs` | `metrics_handle` field, `refresh_index_metrics()` call |
|
|
| `lib.rs` | `pub use db::http::MetricsHandle` re-export |
|
|
|
|
### 2. Verify `QueryStats` is NOT gated
|
|
|
|
`QueryStats` is always available. It carries per-query telemetry that embedders use for their own logging and debugging, independent of Prometheus. The `stats` field on `Results` and `SearchResults` is always populated.
|
|
|
|
### 3. Verify `MetricsState` base fields are NOT gated
|
|
|
|
`MetricsState.opened_at` and `MetricsState.health_ok` exist regardless of the `metrics` feature. The `render_prometheus()` and `render_healthz()` methods exist regardless (they render the base metrics). Only the additional m7p4 counters are gated.
|
|
|
|
### 4. Zero-overhead verification
|
|
|
|
Add a compile-time check to CI:
|
|
|
|
```bash
|
|
# Verify the crate compiles without the metrics feature
|
|
cargo check --manifest-path tidal/Cargo.toml --no-default-features
|
|
|
|
# Verify no metrics-related symbols appear in the binary
|
|
cargo build --manifest-path tidal/Cargo.toml --release --no-default-features
|
|
# Inspect with nm or objdump for tidaldb_wal_lag_bytes etc.
|
|
# (This is a manual verification step documented in the task, not automated)
|
|
```
|
|
|
|
### 5. Conditional compilation pattern
|
|
|
|
Every instrumentation site follows this pattern:
|
|
|
|
```rust
|
|
// GOOD: zero overhead when feature is off
|
|
#[cfg(feature = "metrics")]
|
|
{
|
|
let elapsed_us = start.elapsed().as_micros() as u64;
|
|
self.metrics.signal_writes_total.fetch_add(1, Ordering::Relaxed);
|
|
self.metrics.signal_write_latency.observe(elapsed_us);
|
|
}
|
|
|
|
// BAD: timing overhead even when feature is off
|
|
let start = Instant::now(); // <-- this runs unconditionally
|
|
// ... work ...
|
|
#[cfg(feature = "metrics")]
|
|
{
|
|
self.metrics.signal_writes_total.fetch_add(1, Ordering::Relaxed);
|
|
}
|
|
```
|
|
|
|
The `Instant::now()` call must also be inside the `#[cfg]` block. On hot paths (signal writes, query execution), even `Instant::now()` is measurable overhead (~20ns on macOS).
|
|
|
|
Exception: `QueryStats` timing uses `Instant::now()` unconditionally because `QueryStats` is always populated. This is acceptable because query execution is not the write-path hot loop.
|
|
|
|
### 6. Feature flag definition
|
|
|
|
Verify `tidal/Cargo.toml` has:
|
|
|
|
```toml
|
|
[features]
|
|
default = ["metrics"]
|
|
metrics = []
|
|
```
|
|
|
|
The `metrics` feature is enabled by default so that out-of-the-box usage includes observability. Embedders who need zero overhead opt out with `default-features = false`.
|
|
|
|
## Acceptance Criteria
|
|
|
|
- [ ] All m7p4 atomic counters gated behind `#[cfg(feature = "metrics")]`
|
|
- [ ] `LatencyHistogram` struct and all its call sites gated
|
|
- [ ] HTTP metrics server module gated (already done, verify)
|
|
- [ ] `Instant::now()` for metric timing is inside `#[cfg]` blocks on write-path code
|
|
- [ ] `QueryStats` is NOT gated (always available)
|
|
- [ ] `MetricsState.opened_at` and `MetricsState.health_ok` are NOT gated
|
|
- [ ] `cargo check --manifest-path tidal/Cargo.toml --no-default-features` succeeds
|
|
- [ ] `cargo test --manifest-path tidal/Cargo.toml --no-default-features --lib` passes
|
|
- [ ] No dead code warnings when `metrics` feature is disabled
|
|
- [ ] `cargo clippy -D warnings` passes with and without `metrics` feature
|
|
|
|
## Test Strategy
|
|
|
|
```rust
|
|
// This test runs in both feature configurations via CI matrix
|
|
#[test]
|
|
fn query_stats_always_available() {
|
|
// QueryStats is not feature-gated
|
|
let stats = QueryStats::new("test".to_owned());
|
|
assert_eq!(stats.profile_name, "test");
|
|
}
|
|
|
|
#[test]
|
|
fn metrics_state_base_always_available() {
|
|
let state = MetricsState::new();
|
|
assert!(state.uptime_seconds() >= 0.0);
|
|
assert!((state.health_ok_value() - 1.0).abs() < f64::EPSILON);
|
|
}
|
|
|
|
#[cfg(feature = "metrics")]
|
|
#[test]
|
|
fn metrics_feature_counters_exist() {
|
|
let state = MetricsState::new();
|
|
state.signal_writes_total.fetch_add(1, Ordering::Relaxed);
|
|
assert_eq!(state.signal_writes_total.load(Ordering::Relaxed), 1);
|
|
}
|
|
```
|
|
|
|
CI verification (add to pre-commit or CI script):
|
|
|
|
```bash
|
|
# Verify no-default-features compilation
|
|
cargo check --manifest-path tidal/Cargo.toml --no-default-features 2>&1 | grep -c "error" && exit 1 || true
|
|
cargo test --manifest-path tidal/Cargo.toml --no-default-features --lib 2>&1
|
|
```
|