# Task 03: Observability + Export UAT Tests

## Delivers

Three integration tests in `tidal/tests/m7_uat.rs` proving operational visibility features work end-to-end:

1. **`uat_query_stats_populated`** -- Execute RETRIEVE and SEARCH; verify that the `stats` field on both results contains non-zero `candidates_considered`, `scoring_time_us`, and `total_time_us`, and that the RETRIEVE stats carry the expected `profile_name` (`"trending"`).
2. **`uat_metrics_content`** -- Verify that the Prometheus text output contains the expected metric names (`tidaldb_signal_hot_entries`, `tidaldb_signal_writes_total`) and is well-formed: every data line ends in a numeric value.
3. **`uat_export_and_session_aggregation`** -- Write signals in two sessions, close both, then call `export_signals()` to verify the events are present and `user_session_summary()` to verify the aggregated counts.

## Complexity: M

## Dependencies

- m7p4 complete (`QueryStats` struct, `Results.stats`, `SearchResults.stats`, Prometheus metrics export, `db.export_signals()`, `db.user_session_summary()`)
- m7p2 complete (`DegradationLevel` used in `QueryStats`)
- m7p1 complete (WAL segments readable for `export_signals`)

## Technical Design

### Test 7: QueryStats populated on RETRIEVE and SEARCH

```rust
#[test]
fn uat_query_stats_populated() {
    let db = TidalDb::builder()
        .ephemeral()
        .with_schema(m7_uat_schema())
        .open()
        .unwrap();

    let now = Timestamp::now();

    // Write 100 items with metadata + signals so both RETRIEVE and SEARCH
    // have non-trivial work to do.
    for id in 1..=100u64 {
        let mut meta = HashMap::new();
        meta.insert("title".to_string(), format!("Stats Item {id}"));
        meta.insert("description".to_string(), format!("Description for item {id}"));
        meta.insert("category".to_string(), "test".to_string());
        db.write_item_with_metadata(EntityId::new(id), &meta).unwrap();
    }
    for id in 1..=50u64 {
        for _ in 0..5 {
            db.signal("view", EntityId::new(id), 1.0, now).unwrap();
        }
    }

    // Flush text index so SEARCH has results.
    db.flush_text_index().unwrap();

    // -- RETRIEVE stats --
    let retrieve_results = db
        .retrieve(
            &Retrieve::builder()
                .profile("trending")
                .limit(20)
                .build()
                .unwrap(),
        )
        .unwrap();

    let stats = &retrieve_results.stats;
    assert!(
        stats.candidates_considered > 0,
        "RETRIEVE stats.candidates_considered should be > 0, got {}",
        stats.candidates_considered
    );
    assert!(
        stats.total_time_us > 0,
        "RETRIEVE stats.total_time_us should be > 0, got {}",
        stats.total_time_us
    );
    assert!(
        stats.scoring_time_us > 0,
        "RETRIEVE stats.scoring_time_us should be > 0, got {}",
        stats.scoring_time_us
    );
    assert!(
        !stats.profile_name.is_empty(),
        "RETRIEVE stats.profile_name should not be empty"
    );
    assert_eq!(
        stats.profile_name, "trending",
        "RETRIEVE stats.profile_name should be 'trending'"
    );

    // -- SEARCH stats --
    let search_results = db
        .search(
            &Search::builder()
                .query("Stats Item")
                .limit(20)
                .build()
                .unwrap(),
        )
        .unwrap();

    let search_stats = &search_results.stats;
    assert!(
        search_stats.candidates_considered > 0,
        "SEARCH stats.candidates_considered should be > 0, got {}",
        search_stats.candidates_considered
    );
    assert!(
        search_stats.total_time_us > 0,
        "SEARCH stats.total_time_us should be > 0, got {}",
        search_stats.total_time_us
    );
    assert!(
        search_stats.scoring_time_us > 0,
        "SEARCH stats.scoring_time_us should be > 0, got {}",
        search_stats.scoring_time_us
    );
}
```

### Test 8: Prometheus metrics content

```rust
#[test]
fn uat_metrics_content() {
    let db = TidalDb::builder()
        .ephemeral()
        .with_schema(m7_uat_schema())
        .open()
        .unwrap();

    let now = Timestamp::now();

    // Write some data so metrics have non-zero values.
    for id in 1..=50u64 {
        let mut meta = HashMap::new();
        meta.insert("title".to_string(), format!("Metrics Item {id}"));
        db.write_item_with_metadata(EntityId::new(id), &meta).unwrap();
    }
    for id in 1..=20u64 {
        db.signal("view", EntityId::new(id), 1.0, now).unwrap();
    }

    // Render Prometheus text metrics.
    let metrics_text = db.render_metrics();

    // Verify the output is non-empty.
    assert!(
        !metrics_text.is_empty(),
        "metrics output should not be empty"
    );

    // Verify expected metric names are present.
    let expected_metrics = [
        "tidaldb_signal_hot_entries",
        "tidaldb_signal_writes_total",
    ];
    for metric_name in &expected_metrics {
        assert!(
            metrics_text.contains(metric_name),
            "metrics output should contain '{metric_name}', got:\n{metrics_text}"
        );
    }

    // Verify Prometheus text format: lines starting with # are comments/HELP/TYPE,
    // data lines are "metric_name{labels} value" or "metric_name value".
    for line in metrics_text.lines() {
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Data line: should have at least a metric name and a numeric value.
        let parts: Vec<&str> = line.split_whitespace().collect();
        assert!(
            parts.len() >= 2,
            "malformed Prometheus line (expected 'name value'): {line}"
        );
        // The last token should be parseable as a number.
        let value_str = parts.last().unwrap();
        assert!(
            value_str.parse::<f64>().is_ok(),
            "Prometheus value should be numeric, got '{value_str}' in line: {line}"
        );
    }
}
```

### Test 9: RLHF export + cross-session aggregation

```rust
#[test]
fn uat_export_and_session_aggregation() {
    let mut builder = SchemaBuilder::new();
    for &(name, half_life_days) in &[("view", 7), ("like", 14), ("skip", 1)] {
        let _ = builder
            .signal(
                name,
                EntityKind::Item,
                DecaySpec::Exponential {
                    half_life: Duration::from_secs(half_life_days * 24 * 3600),
                },
            )
            .windows(&[Window::AllTime])
            .velocity(false)
            .add();
    }
    builder.session_policy(
        "default",
        tidaldb::schema::AgentPolicy::builder()
            .max_session_duration(Duration::from_secs(300))
            .build(),
    );
    let schema = builder.build().unwrap();

    let db = TidalDb::builder()
        .ephemeral()
        .with_schema(schema)
        .open()
        .unwrap();

    let now = Timestamp::now();

    // Write items.
    for id in 1..=10u64 {
        let mut meta = HashMap::new();
        meta.insert("title".to_string(), format!("Export Item {id}"));
        db.write_item_with_metadata(EntityId::new(id), &meta).unwrap();
    }

    let user_id = 42u64;

    // -- Session 1: 5 views + 2 likes --
    let session_1 = db
        .start_session(user_id, "agent_export", "default", HashMap::new())
        .unwrap();
    for id in 1..=5u64 {
        db.session_signal(&session_1, "view", EntityId::new(id), 1.0, now, None)
            .unwrap();
    }
    db.session_signal(&session_1, "like", EntityId::new(1), 1.0, now, Some("loved it".into()))
        .unwrap();
    db.session_signal(&session_1, "like", EntityId::new(2), 1.0, now, None)
        .unwrap();
    let summary_1 = db.close_session(session_1).unwrap();
    assert_eq!(summary_1.signals_written, 7, "session 1 should have 7 signals");

    // -- Session 2: 3 views + 1 skip --
    let session_2 = db
        .start_session(user_id, "agent_export", "default", HashMap::new())
        .unwrap();
    for id in 6..=8u64 {
        db.session_signal(&session_2, "view", EntityId::new(id), 1.0, now, None)
            .unwrap();
    }
    db.session_signal(&session_2, "skip", EntityId::new(9), 1.0, now, None)
        .unwrap();
    let summary_2 = db.close_session(session_2).unwrap();
    assert_eq!(summary_2.signals_written, 4, "session 2 should have 4 signals");

    // -- Export signals --
    let exported = db
        .export_signals(tidaldb::ExportRequest {
            user_id: Some(user_id),
            signal_types: vec!["view".into(), "like".into(), "skip".into()],
            since: Timestamp::from_nanos(0),
            until: Timestamp::from_nanos(u64::MAX),
            format: tidaldb::ExportFormat::JsonLines,
        })
        .unwrap();

    // Should have at least 11 exported events (7 from session 1 + 4 from session 2).
    assert!(
        exported.len() >= 11,
        "export should return at least 11 signal events, got {}",
        exported.len()
    );

    // Verify the annotation survived export.
    let annotated = exported
        .iter()
        .filter(|e| e.annotation.is_some())
        .count();
    assert!(
        annotated >= 1,
        "at least one exported event should have an annotation"
    );

    // Verify signal types are correct.
    let view_count = exported.iter().filter(|e| e.signal_type == "view").count();
    let like_count = exported.iter().filter(|e| e.signal_type == "like").count();
    let skip_count = exported.iter().filter(|e| e.signal_type == "skip").count();
    assert!(view_count >= 8, "should have at least 8 view events, got {view_count}");
    assert!(like_count >= 2, "should have at least 2 like events, got {like_count}");
    assert!(skip_count >= 1, "should have at least 1 skip event, got {skip_count}");

    // -- Cross-session aggregation --
    let summary = db
        .user_session_summary(user_id, Timestamp::from_nanos(0))
        .unwrap();

    assert_eq!(
        summary.sessions_count, 2,
        "user should have 2 closed sessions"
    );
    assert_eq!(
        summary.total_signals, 11,
        "total signals across sessions should be 11"
    );
    assert_eq!(
        summary.total_rejections, 0,
        "no policy rejections should have occurred"
    );

    // top_signal_types should include "view" as the most frequent.
    assert!(
        !summary.top_signal_types.is_empty(),
        "top_signal_types should not be empty"
    );
    let top_type = &summary.top_signal_types[0];
    assert_eq!(
        top_type.0, "view",
        "most frequent signal type should be 'view', got '{}'",
        top_type.0
    );
    assert!(
        top_type.1 >= 8,
        "view count should be at least 8, got {}",
        top_type.1
    );
}
```

### Imports needed (additions to the shared header)

```rust
use tidaldb::{ExportRequest, ExportFormat}; // From m7p4
```

### Note on `render_metrics`

The test calls `db.render_metrics()`, the m7p4 API for producing Prometheus text. If the actual API has a different name (e.g., `db.metrics_text()`), update the call. If it requires the `metrics` feature, gate the test accordingly:

```rust
#[test]
#[cfg(feature = "metrics")]
fn uat_metrics_content() { /* test body as shown in Test 8 */ }
```

This task document uses `render_metrics()` because that is the expected name per the m7p4 spec. The implementer should match whatever m7p4 actually delivers.

## Acceptance Criteria

- [ ] `uat_query_stats_populated` passes: both RETRIEVE and SEARCH return `stats` with non-zero `candidates_considered`, `total_time_us`, `scoring_time_us`; RETRIEVE `profile_name == "trending"`
- [ ] `uat_metrics_content` passes: Prometheus text output contains `tidaldb_signal_hot_entries` and `tidaldb_signal_writes_total`; all data lines have numeric values; output is non-empty
- [ ] `uat_export_and_session_aggregation` passes: export returns >= 11 events with correct signal type distribution; annotation preserved; `user_session_summary` returns `sessions_count == 2`, `total_signals == 11`, `total_rejections == 0`; `top_signal_types[0]` is `("view", >= 8)`
- [ ] Each test completes in under 10 seconds (no wall-clock waits)
- [ ] `cargo clippy --manifest-path tidal/Cargo.toml -- -D warnings` passes

## Test Strategy

**QueryStats test** validates the instrumentation wiring, not the exact values. The assertions check `> 0` rather than specific numbers because execution time varies by machine. The `profile_name` assertion verifies that the stats identify the profile that served the query.
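
One caveat with `> 0` timing assertions is that a very fast stage can round down to zero whole microseconds. The following is a minimal sketch of a timing helper that avoids this, assuming stages are measured with `std::time::Instant`. The helper name `timed_us` and the clamp to 1 are illustrative assumptions, not the m7p4 implementation.

```rust
use std::time::Instant;

/// Hypothetical helper (not part of the tidaldb API): runs `work` and returns
/// its result together with the elapsed time in whole microseconds.
fn timed_us<T>(work: impl FnOnce() -> T) -> (T, u64) {
    let start = Instant::now();
    let out = work();
    // `Duration::as_micros()` truncates, so a sub-microsecond run reports 0.
    // Clamping to 1 keeps "this stage ran" distinguishable from "never timed".
    let elapsed = start.elapsed().as_micros().max(1);
    // Saturate instead of wrapping if the u128 value does not fit in a u64.
    (out, elapsed.min(u64::MAX as u128) as u64)
}
```

If m7p4 records raw truncated values instead, the `scoring_time_us > 0` assertion could fail intermittently on fast machines, and the test may need a larger workload to stay reliable.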

**Metrics test** validates Prometheus text format compliance at a basic level: every non-empty line is either a comment (`#`) or a data line with `name value` structure whose value is numeric. The test does not exhaustively check every metric defined in m7p4 -- it verifies the two most important signal system metrics to confirm the pipeline is wired correctly. Additional metrics can be asserted as the m7p4 implementation stabilizes.
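
The line-level check could also be factored into a standalone helper so it can be exercised against hand-written samples without opening a database. This is a sketch only: `is_valid_metrics_line` is a hypothetical name, and it applies the same basic rule as the test (blank line, `#` comment, or a data line whose last token is numeric), not a full Prometheus parser.

```rust
/// Hypothetical helper: returns true if `line` passes the basic checks
/// described above (blank, a `#` comment, or `name[{labels}] value`).
fn is_valid_metrics_line(line: &str) -> bool {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return true;
    }
    // Split off the last whitespace-separated token and treat it as the value.
    match line.rsplit_once(char::is_whitespace) {
        Some((name_part, value)) => {
            !name_part.trim().is_empty() && value.parse::<f64>().is_ok()
        }
        // A single token has no value, so the line is malformed.
        None => false,
    }
}
```

For example, `tidaldb_signal_hot_entries 20` is accepted, while a bare `tidaldb_signal_hot_entries` with no value is rejected.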

**Export + aggregation test** writes a deliberate signal pattern across two sessions with known types and counts, then verifies the export and aggregation APIs return matching numbers. The annotation on one `like` signal verifies that annotations flow through the WAL and into the export path. `user_session_summary` is called with `since` set to `Timestamp::from_nanos(0)` so that it captures all historical sessions.
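
The expected numbers follow directly from the signal pattern, which can be checked with a small standalone sketch. The function below is a hypothetical stand-in for whatever aggregation backs `top_signal_types`. The `(name, count)` tuple shape is inferred from the `.0` / `.1` accesses in the test, and the tie-breaking rule is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the aggregation behind `top_signal_types`:
/// counts events per signal type and sorts by count, highest first.
fn top_signal_types(events: &[&str]) -> Vec<(String, u64)> {
    let mut counts: HashMap<&str, u64> = HashMap::new();
    for &signal_type in events {
        *counts.entry(signal_type).or_insert(0) += 1;
    }
    let mut ranked: Vec<(String, u64)> = counts
        .into_iter()
        .map(|(name, count)| (name.to_string(), count))
        .collect();
    // Descending by count; ties broken by name so the order is deterministic.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}

/// The signal pattern written by the test, flattened across both sessions.
fn test_signal_pattern() -> Vec<&'static str> {
    let mut events = Vec::new();
    events.extend(std::iter::repeat("view").take(5)); // session 1 views
    events.extend(std::iter::repeat("like").take(2)); // session 1 likes
    events.extend(std::iter::repeat("view").take(3)); // session 2 views
    events.extend(std::iter::repeat("skip").take(1)); // session 2 skip
    events
}
```

Applied to the test's pattern, this yields 8 `view`, 2 `like`, and 1 `skip` events, 11 in total, which is why the test expects `view` in first position with a count of at least 8.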