tidaldb/docs/planning/milestone-7/phase-5/task-03-observability-export-uat.md
2026-02-23 22:41:16 -07:00

13 KiB

Task 03: Observability + Export UAT Tests

Delivers

Three integration tests in tidal/tests/m7_uat.rs proving operational visibility features work end-to-end:

  1. uat_query_stats_populated -- Execute RETRIEVE and SEARCH; verify the stats field on both results contains non-zero candidates_considered, scoring_time_us, total_time_us, and a valid profile_name.
  2. uat_metrics_content -- Verify Prometheus text output contains expected metric names (tidaldb_signal_hot_entries, tidaldb_wal_lag_bytes, etc.) and is well-formed.
  3. uat_export_and_session_aggregation -- Write session signals, close the session, call export_signals() to verify events are present, call user_session_summary() to verify aggregated counts.

Complexity: M

Dependencies

  • m7p4 complete (QueryStats struct, Results.stats, SearchResults.stats, Prometheus metrics export, db.export_signals(), db.user_session_summary())
  • m7p2 complete (DegradationLevel used in QueryStats)
  • m7p1 complete (WAL segments readable for export_signals)

Technical Design

#[test]
fn uat_query_stats_populated() {
    let db = TidalDb::builder()
        .ephemeral()
        .with_schema(m7_uat_schema())
        .open()
        .unwrap();

    let now = Timestamp::now();

    // Write 100 items with metadata + signals so both RETRIEVE and SEARCH
    // have non-trivial work to do.
    for id in 1..=100u64 {
        let mut meta = HashMap::new();
        meta.insert("title".to_string(), format!("Stats Item {id}"));
        meta.insert("description".to_string(), format!("Description for item {id}"));
        meta.insert("category".to_string(), "test".to_string());
        db.write_item_with_metadata(EntityId::new(id), &meta).unwrap();
    }
    for id in 1..=50u64 {
        for _ in 0..5 {
            db.signal("view", EntityId::new(id), 1.0, now).unwrap();
        }
    }

    // Flush text index so SEARCH has results.
    db.flush_text_index().unwrap();

    // -- RETRIEVE stats --
    let retrieve_results = db
        .retrieve(
            &Retrieve::builder()
                .profile("trending")
                .limit(20)
                .build()
                .unwrap(),
        )
        .unwrap();

    let stats = &retrieve_results.stats;
    assert!(
        stats.candidates_considered > 0,
        "RETRIEVE stats.candidates_considered should be > 0, got {}",
        stats.candidates_considered
    );
    assert!(
        stats.total_time_us > 0,
        "RETRIEVE stats.total_time_us should be > 0, got {}",
        stats.total_time_us
    );
    assert!(
        stats.scoring_time_us > 0,
        "RETRIEVE stats.scoring_time_us should be > 0, got {}",
        stats.scoring_time_us
    );
    assert!(
        !stats.profile_name.is_empty(),
        "RETRIEVE stats.profile_name should not be empty"
    );
    assert_eq!(
        stats.profile_name, "trending",
        "RETRIEVE stats.profile_name should be 'trending'"
    );

    // -- SEARCH stats --
    let search_results = db
        .search(
            &Search::builder()
                .query("Stats Item")
                .limit(20)
                .build()
                .unwrap(),
        )
        .unwrap();

    let search_stats = &search_results.stats;
    assert!(
        search_stats.candidates_considered > 0,
        "SEARCH stats.candidates_considered should be > 0, got {}",
        search_stats.candidates_considered
    );
    assert!(
        search_stats.total_time_us > 0,
        "SEARCH stats.total_time_us should be > 0, got {}",
        search_stats.total_time_us
    );
}

Test 8: Prometheus metrics content

#[test]
fn uat_metrics_content() {
    let db = TidalDb::builder()
        .ephemeral()
        .with_schema(m7_uat_schema())
        .open()
        .unwrap();

    let now = Timestamp::now();

    // Write some data so metrics have non-zero values.
    for id in 1..=50u64 {
        let mut meta = HashMap::new();
        meta.insert("title".to_string(), format!("Metrics Item {id}"));
        db.write_item_with_metadata(EntityId::new(id), &meta).unwrap();
    }
    for id in 1..=20u64 {
        db.signal("view", EntityId::new(id), 1.0, now).unwrap();
    }

    // Render Prometheus text metrics.
    let metrics_text = db.render_metrics();

    // Verify the output is non-empty.
    assert!(
        !metrics_text.is_empty(),
        "metrics output should not be empty"
    );

    // Verify expected metric names are present.
    let expected_metrics = [
        "tidaldb_signal_hot_entries",
        "tidaldb_signal_writes_total",
    ];
    for metric_name in &expected_metrics {
        assert!(
            metrics_text.contains(metric_name),
            "metrics output should contain '{metric_name}', got:\n{metrics_text}"
        );
    }

    // Verify Prometheus text format: lines starting with # are comments/HELP/TYPE,
    // data lines are "metric_name{labels} value" or "metric_name value".
    for line in metrics_text.lines() {
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Data line: should have at least a metric name and a numeric value.
        let parts: Vec<&str> = line.split_whitespace().collect();
        assert!(
            parts.len() >= 2,
            "malformed Prometheus line (expected 'name value'): {line}"
        );
        // The last token should be parseable as a number.
        let value_str = parts.last().unwrap();
        assert!(
            value_str.parse::<f64>().is_ok(),
            "Prometheus value should be numeric, got '{value_str}' in line: {line}"
        );
    }
}

Test 9: RLHF export + cross-session aggregation

#[test]
fn uat_export_and_session_aggregation() {
    let mut builder = SchemaBuilder::new();
    for &(name, half_life_days) in &[("view", 7), ("like", 14), ("skip", 1)] {
        let _ = builder
            .signal(
                name,
                EntityKind::Item,
                DecaySpec::Exponential {
                    half_life: Duration::from_secs(half_life_days * 24 * 3600),
                },
            )
            .windows(&[Window::AllTime])
            .velocity(false)
            .add();
    }
    builder.session_policy(
        "default",
        tidaldb::schema::AgentPolicy::builder()
            .max_session_duration(Duration::from_secs(300))
            .build(),
    );
    let schema = builder.build().unwrap();

    let db = TidalDb::builder()
        .ephemeral()
        .with_schema(schema)
        .open()
        .unwrap();

    let now = Timestamp::now();

    // Write items.
    for id in 1..=10u64 {
        let mut meta = HashMap::new();
        meta.insert("title".to_string(), format!("Export Item {id}"));
        db.write_item_with_metadata(EntityId::new(id), &meta).unwrap();
    }

    let user_id = 42u64;

    // -- Session 1: 5 views + 2 likes --
    let session_1 = db
        .start_session(user_id, "agent_export", "default", HashMap::new())
        .unwrap();
    for id in 1..=5u64 {
        db.session_signal(&session_1, "view", EntityId::new(id), 1.0, now, None)
            .unwrap();
    }
    db.session_signal(&session_1, "like", EntityId::new(1), 1.0, now, Some("loved it".into()))
        .unwrap();
    db.session_signal(&session_1, "like", EntityId::new(2), 1.0, now, None)
        .unwrap();
    let summary_1 = db.close_session(session_1).unwrap();
    assert_eq!(summary_1.signals_written, 7, "session 1 should have 7 signals");

    // -- Session 2: 3 views + 1 skip --
    let session_2 = db
        .start_session(user_id, "agent_export", "default", HashMap::new())
        .unwrap();
    for id in 6..=8u64 {
        db.session_signal(&session_2, "view", EntityId::new(id), 1.0, now, None)
            .unwrap();
    }
    db.session_signal(&session_2, "skip", EntityId::new(9), 1.0, now, None)
        .unwrap();
    let summary_2 = db.close_session(session_2).unwrap();
    assert_eq!(summary_2.signals_written, 4, "session 2 should have 4 signals");

    // -- Export signals --
    let exported = db
        .export_signals(tidaldb::ExportRequest {
            user_id: Some(user_id),
            signal_types: vec!["view".into(), "like".into(), "skip".into()],
            since: Timestamp::from_nanos(0),
            until: Timestamp::from_nanos(u64::MAX),
            format: tidaldb::ExportFormat::JsonLines,
        })
        .unwrap();

    // Should have at least 11 exported events (7 from session 1 + 4 from session 2).
    assert!(
        exported.len() >= 11,
        "export should return at least 11 signal events, got {}",
        exported.len()
    );

    // Verify the annotation survived export.
    let annotated = exported
        .iter()
        .filter(|e| e.annotation.is_some())
        .count();
    assert!(
        annotated >= 1,
        "at least one exported event should have an annotation"
    );

    // Verify signal types are correct.
    let view_count = exported.iter().filter(|e| e.signal_type == "view").count();
    let like_count = exported.iter().filter(|e| e.signal_type == "like").count();
    let skip_count = exported.iter().filter(|e| e.signal_type == "skip").count();
    assert!(view_count >= 8, "should have at least 8 view events, got {view_count}");
    assert!(like_count >= 2, "should have at least 2 like events, got {like_count}");
    assert!(skip_count >= 1, "should have at least 1 skip event, got {skip_count}");

    // -- Cross-session aggregation --
    let summary = db
        .user_session_summary(user_id, Timestamp::from_nanos(0))
        .unwrap();

    assert_eq!(
        summary.sessions_count, 2,
        "user should have 2 closed sessions"
    );
    assert_eq!(
        summary.total_signals, 11,
        "total signals across sessions should be 11"
    );
    assert_eq!(
        summary.total_rejections, 0,
        "no policy rejections should have occurred"
    );

    // top_signal_types should include "view" as the most frequent.
    assert!(
        !summary.top_signal_types.is_empty(),
        "top_signal_types should not be empty"
    );
    let top_type = &summary.top_signal_types[0];
    assert_eq!(
        top_type.0, "view",
        "most frequent signal type should be 'view', got '{}'",
        top_type.0
    );
    assert!(
        top_type.1 >= 8,
        "view count should be at least 8, got {}",
        top_type.1
    );
}

Imports needed (additions to the shared header)

use tidaldb::{ExportRequest, ExportFormat};  // From m7p4

Note on render_metrics

The test calls db.render_metrics() which is the m7p4 API for producing Prometheus text. If the actual API name differs (e.g., db.metrics_text() or requires the metrics feature), the test should be gated accordingly:

#[test]
#[cfg(feature = "metrics")]
fn uat_metrics_content() { ... }

The task document uses render_metrics() as the expected name per the m7p4 spec. The implementer should match whatever m7p4 actually delivers.

Acceptance Criteria

  • uat_query_stats_populated passes: both RETRIEVE and SEARCH return stats with non-zero candidates_considered, total_time_us, scoring_time_us; RETRIEVE profile_name == "trending"
  • uat_metrics_content passes: Prometheus text output contains tidaldb_signal_hot_entries and tidaldb_signal_writes_total; all data lines have numeric values; output is non-empty
  • uat_export_and_session_aggregation passes: export returns >= 11 events with correct signal type distribution; annotation preserved; user_session_summary returns sessions_count == 2, total_signals == 11, total_rejections == 0; top_signal_types[0] is ("view", >= 8)
  • Each test completes in under 10 seconds (no wall-clock waits)
  • cargo clippy --manifest-path tidal/Cargo.toml -- -D warnings passes

Test Strategy

QueryStats test validates the instrumentation wiring, not the exact values. The assertions check > 0 rather than specific numbers because execution time varies by machine. The profile_name assertion verifies the stats carry semantic context.

Metrics test validates Prometheus text format compliance at a basic level: lines are either comments (#) or data lines with name value structure where value is numeric. The test does not exhaustively check every metric defined in m7p4 -- it verifies the two most important signal system metrics to confirm the pipeline is wired correctly. Additional metrics can be asserted as the m7p4 implementation stabilizes.

Export + aggregation test writes a deliberate signal pattern across two sessions with known types and counts, then verifies the export and aggregation APIs return matching numbers. The annotation on one like signal verifies that annotations flow through the WAL and into the export path. The user_session_summary is tested with since: 0 to capture all historical sessions.