tidaldb/docs/planning/milestone-7/phase-5/task-03-observability-export-uat.md

# Task 03: Observability + Export UAT Tests

## Delivers

Three integration tests in `tidal/tests/m7_uat.rs` proving operational visibility features work end-to-end:

1. **`uat_query_stats_populated`** -- Execute RETRIEVE and SEARCH; verify the `stats` field on both results contains non-zero `candidates_considered`, `scoring_time_us`, and `total_time_us`, and that the RETRIEVE stats report `profile_name == "trending"`.
2. **`uat_metrics_content`** -- Verify Prometheus text output contains the expected metric names (`tidaldb_signal_hot_entries`, `tidaldb_signal_writes_total`) and that every data line is well-formed.
3. **`uat_export_and_session_aggregation`** -- Write signals in two sessions, close both sessions, call `export_signals()` to verify the events are present, then call `user_session_summary()` to verify the aggregated counts.

## Complexity: M

## Dependencies

- m7p4 complete (`QueryStats` struct, `Results.stats`, `SearchResults.stats`, Prometheus metrics export, `db.export_signals()`, `db.user_session_summary()`)
- m7p2 complete (`DegradationLevel` used in `QueryStats`)
- m7p1 complete (WAL segments readable for `export_signals`)

## Technical Design

### Test 7: QueryStats populated on RETRIEVE and SEARCH

```rust
#[test]
fn uat_query_stats_populated() {
    let db = TidalDb::builder()
        .ephemeral()
        .with_schema(m7_uat_schema())
        .open()
        .unwrap();

    let now = Timestamp::now();

    // Write 100 items with metadata + signals so both RETRIEVE and SEARCH
    // have non-trivial work to do.
    for id in 1..=100u64 {
        let mut meta = HashMap::new();
        meta.insert("title".to_string(), format!("Stats Item {id}"));
        meta.insert("description".to_string(), format!("Description for item {id}"));
        meta.insert("category".to_string(), "test".to_string());
        db.write_item_with_metadata(EntityId::new(id), &meta).unwrap();
    }
    for id in 1..=50u64 {
        for _ in 0..5 {
            db.signal("view", EntityId::new(id), 1.0, now).unwrap();
        }
    }

    // Flush text index so SEARCH has results.
    db.flush_text_index().unwrap();

    // -- RETRIEVE stats --
    let retrieve_results = db
        .retrieve(
            &Retrieve::builder()
                .profile("trending")
                .limit(20)
                .build()
                .unwrap(),
        )
        .unwrap();

    let stats = &retrieve_results.stats;
    assert!(
        stats.candidates_considered > 0,
        "RETRIEVE stats.candidates_considered should be > 0, got {}",
        stats.candidates_considered
    );
    assert!(
        stats.total_time_us > 0,
        "RETRIEVE stats.total_time_us should be > 0, got {}",
        stats.total_time_us
    );
    assert!(
        stats.scoring_time_us > 0,
        "RETRIEVE stats.scoring_time_us should be > 0, got {}",
        stats.scoring_time_us
    );
    assert!(
        !stats.profile_name.is_empty(),
        "RETRIEVE stats.profile_name should not be empty"
    );
    assert_eq!(
        stats.profile_name, "trending",
        "RETRIEVE stats.profile_name should be 'trending'"
    );

    // -- SEARCH stats --
    let search_results = db
        .search(
            &Search::builder()
                .query("Stats Item")
                .limit(20)
                .build()
                .unwrap(),
        )
        .unwrap();

    let search_stats = &search_results.stats;
    assert!(
        search_stats.candidates_considered > 0,
        "SEARCH stats.candidates_considered should be > 0, got {}",
        search_stats.candidates_considered
    );
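    assert!(
        search_stats.total_time_us > 0,
        "SEARCH stats.total_time_us should be > 0, got {}",
        search_stats.total_time_us
    );
    // The acceptance criteria require non-zero scoring time on SEARCH as well.
    assert!(
        search_stats.scoring_time_us > 0,
        "SEARCH stats.scoring_time_us should be > 0, got {}",
        search_stats.scoring_time_us
    );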
}
```

### Test 8: Prometheus metrics content

```rust
#[test]
fn uat_metrics_content() {
    let db = TidalDb::builder()
        .ephemeral()
        .with_schema(m7_uat_schema())
        .open()
        .unwrap();

    let now = Timestamp::now();

    // Write some data so metrics have non-zero values.
    for id in 1..=50u64 {
        let mut meta = HashMap::new();
        meta.insert("title".to_string(), format!("Metrics Item {id}"));
        db.write_item_with_metadata(EntityId::new(id), &meta).unwrap();
    }
    for id in 1..=20u64 {
        db.signal("view", EntityId::new(id), 1.0, now).unwrap();
    }

    // Render Prometheus text metrics.
    let metrics_text = db.render_metrics();

    // Verify the output is non-empty.
    assert!(
        !metrics_text.is_empty(),
        "metrics output should not be empty"
    );

    // Verify expected metric names are present.
    let expected_metrics = [
        "tidaldb_signal_hot_entries",
        "tidaldb_signal_writes_total",
    ];
    for metric_name in &expected_metrics {
        assert!(
            metrics_text.contains(metric_name),
            "metrics output should contain '{metric_name}', got:\n{metrics_text}"
        );
    }

    // Verify Prometheus text format: lines starting with # are comments/HELP/TYPE,
    // data lines are "metric_name{labels} value" or "metric_name value".
    for line in metrics_text.lines() {
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Data line: should have at least a metric name and a numeric value.
        let parts: Vec<&str> = line.split_whitespace().collect();
        assert!(
            parts.len() >= 2,
            "malformed Prometheus line (expected 'name value'): {line}"
        );
        // The last token should be parseable as a number.
        let value_str = parts.last().unwrap();
        assert!(
            value_str.parse::<f64>().is_ok(),
            "Prometheus value should be numeric, got '{value_str}' in line: {line}"
        );
    }
}
```

### Test 9: RLHF export + cross-session aggregation

```rust
#[test]
fn uat_export_and_session_aggregation() {
    let mut builder = SchemaBuilder::new();
    for &(name, half_life_days) in &[("view", 7), ("like", 14), ("skip", 1)] {
        let _ = builder
            .signal(
                name,
                EntityKind::Item,
                DecaySpec::Exponential {
                    half_life: Duration::from_secs(half_life_days * 24 * 3600),
                },
            )
            .windows(&[Window::AllTime])
            .velocity(false)
            .add();
    }
    builder.session_policy(
        "default",
        tidaldb::schema::AgentPolicy::builder()
            .max_session_duration(Duration::from_secs(300))
            .build(),
    );
    let schema = builder.build().unwrap();

    let db = TidalDb::builder()
        .ephemeral()
        .with_schema(schema)
        .open()
        .unwrap();

    let now = Timestamp::now();

    // Write items.
    for id in 1..=10u64 {
        let mut meta = HashMap::new();
        meta.insert("title".to_string(), format!("Export Item {id}"));
        db.write_item_with_metadata(EntityId::new(id), &meta).unwrap();
    }

    let user_id = 42u64;

    // -- Session 1: 5 views + 2 likes --
    let session_1 = db
        .start_session(user_id, "agent_export", "default", HashMap::new())
        .unwrap();
    for id in 1..=5u64 {
        db.session_signal(&session_1, "view", EntityId::new(id), 1.0, now, None)
            .unwrap();
    }
    db.session_signal(&session_1, "like", EntityId::new(1), 1.0, now, Some("loved it".into()))
        .unwrap();
    db.session_signal(&session_1, "like", EntityId::new(2), 1.0, now, None)
        .unwrap();
    let summary_1 = db.close_session(session_1).unwrap();
    assert_eq!(summary_1.signals_written, 7, "session 1 should have 7 signals");

    // -- Session 2: 3 views + 1 skip --
    let session_2 = db
        .start_session(user_id, "agent_export", "default", HashMap::new())
        .unwrap();
    for id in 6..=8u64 {
        db.session_signal(&session_2, "view", EntityId::new(id), 1.0, now, None)
            .unwrap();
    }
    db.session_signal(&session_2, "skip", EntityId::new(9), 1.0, now, None)
        .unwrap();
    let summary_2 = db.close_session(session_2).unwrap();
    assert_eq!(summary_2.signals_written, 4, "session 2 should have 4 signals");

    // -- Export signals --
    // `ExportRequest` and `ExportFormat` come from the shared-header imports.
    let exported = db
        .export_signals(ExportRequest {
            user_id: Some(user_id),
            signal_types: vec!["view".into(), "like".into(), "skip".into()],
            since: Timestamp::from_nanos(0),
            until: Timestamp::from_nanos(u64::MAX),
            format: ExportFormat::JsonLines,
        })
        .unwrap();

    // Should have at least 11 exported events (7 from session 1 + 4 from session 2).
    assert!(
        exported.len() >= 11,
        "export should return at least 11 signal events, got {}",
        exported.len()
    );

    // Verify the annotation survived export.
    let annotated = exported
        .iter()
        .filter(|e| e.annotation.is_some())
        .count();
    assert!(
        annotated >= 1,
        "at least one exported event should have an annotation"
    );

    // Verify signal types are correct.
    let view_count = exported.iter().filter(|e| e.signal_type == "view").count();
    let like_count = exported.iter().filter(|e| e.signal_type == "like").count();
    let skip_count = exported.iter().filter(|e| e.signal_type == "skip").count();
    assert!(view_count >= 8, "should have at least 8 view events, got {view_count}");
    assert!(like_count >= 2, "should have at least 2 like events, got {like_count}");
    assert!(skip_count >= 1, "should have at least 1 skip event, got {skip_count}");

    // -- Cross-session aggregation --
    let summary = db
        .user_session_summary(user_id, Timestamp::from_nanos(0))
        .unwrap();

    assert_eq!(
        summary.sessions_count, 2,
        "user should have 2 closed sessions"
    );
    assert_eq!(
        summary.total_signals, 11,
        "total signals across sessions should be 11"
    );
    assert_eq!(
        summary.total_rejections, 0,
        "no policy rejections should have occurred"
    );

    // top_signal_types should include "view" as the most frequent.
    assert!(
        !summary.top_signal_types.is_empty(),
        "top_signal_types should not be empty"
    );
    let top_type = &summary.top_signal_types[0];
    assert_eq!(
        top_type.0, "view",
        "most frequent signal type should be 'view', got '{}'",
        top_type.0
    );
    assert!(
        top_type.1 >= 8,
        "view count should be at least 8, got {}",
        top_type.1
    );
}
```

### Imports needed (additions to the shared header)

```rust
use tidaldb::{ExportRequest, ExportFormat};  // From m7p4
```

### Note on `render_metrics`

The test calls `db.render_metrics()`, the m7p4 API for producing Prometheus text. If the delivered API has a different name (e.g., `db.metrics_text()`), rename the call. If it sits behind a `metrics` Cargo feature, gate the test accordingly:

```rust
#[test]
#[cfg(feature = "metrics")]
fn uat_metrics_content() { ... }
```

The task document uses `render_metrics()` as the expected name per the m7p4 spec. The implementer should match whatever m7p4 actually delivers.

## Acceptance Criteria

- [ ] `uat_query_stats_populated` passes: both RETRIEVE and SEARCH return `stats` with non-zero `candidates_considered`, `total_time_us`, `scoring_time_us`; RETRIEVE `profile_name == "trending"`
- [ ] `uat_metrics_content` passes: Prometheus text output contains `tidaldb_signal_hot_entries` and `tidaldb_signal_writes_total`; all data lines have numeric values; output is non-empty
- [ ] `uat_export_and_session_aggregation` passes: export returns >= 11 events with correct signal type distribution; annotation preserved; `user_session_summary` returns `sessions_count == 2`, `total_signals == 11`, `total_rejections == 0`; `top_signal_types[0]` is `("view", >= 8)`
- [ ] Each test completes in under 10 seconds (no wall-clock waits)
- [ ] `cargo clippy --manifest-path tidal/Cargo.toml -- -D warnings` passes

## Test Strategy

**QueryStats test** validates the instrumentation wiring, not the exact values. The assertions check `> 0` rather than specific numbers because execution time varies by machine. The `profile_name` assertion verifies the stats carry semantic context.
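One caveat on the `> 0` timing assertions: if the `*_time_us` fields are derived from a measured `std::time::Duration` converted to whole microseconds (an assumption; the m7p4 implementation may differ), any stage that finishes in under 1 µs reports 0. The sketch below uses a hypothetical helper, `elapsed_us`, that is not part of tidaldb, to show the truncation. This is one reason the test writes 100 items and 250 `view` signals rather than a handful.

```rust
use std::time::Duration;

/// Hypothetical helper (not a tidaldb API): converts a measured duration to
/// whole microseconds, the way a `*_time_us` field might be populated.
fn elapsed_us(elapsed: Duration) -> u64 {
    // `as_micros` truncates toward zero, so 999 ns becomes 0 us.
    // Clamp to u64::MAX instead of wrapping on (very unlikely) overflow.
    elapsed.as_micros().min(u64::MAX as u128) as u64
}
```

If the assertions prove flaky on fast hardware, increasing the data volume is likely a safer fix than relaxing them to `>= 0`, which would no longer prove the field is populated.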

**Metrics test** validates Prometheus text format compliance at a basic level: lines are either comments (`#`) or data lines with `name value` structure where value is numeric. The test does not exhaustively check every metric defined in m7p4 -- it verifies the two most important signal system metrics to confirm the pipeline is wired correctly. Additional metrics can be asserted as the m7p4 implementation stabilizes.
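The format check in the test is deliberately loose. As a point of comparison, here is a slightly stricter sketch of the same idea, written as a standalone function. The name `check_prometheus_text` is hypothetical and the function is not part of tidaldb. It skips the label set before tokenizing, so spaces inside label values cannot confuse the split, and it accepts the optional integer timestamp that the Prometheus text format allows after the value.

```rust
/// Hypothetical validator (not a tidaldb API). Checks every non-comment line
/// of Prometheus text output and returns the number of sample lines seen,
/// or a message describing the first malformed line.
fn check_prometheus_text(text: &str) -> Result<usize, String> {
    let mut samples = 0;
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue; // blank line, or a HELP / TYPE / comment line
        }
        // Everything after the label set (if any) or after the metric name.
        let rest = match line.rfind('}') {
            Some(idx) => &line[idx + 1..],
            None => line
                .split_once(char::is_whitespace)
                .map(|(_, r)| r)
                .unwrap_or(""),
        };
        let mut tokens = rest.split_whitespace();
        let value = tokens
            .next()
            .ok_or_else(|| format!("missing value: {line}"))?;
        value
            .parse::<f64>()
            .map_err(|_| format!("non-numeric value '{value}': {line}"))?;
        // An optional integer timestamp may follow the value.
        if let Some(ts) = tokens.next() {
            ts.parse::<i64>()
                .map_err(|_| format!("bad timestamp '{ts}': {line}"))?;
        }
        samples += 1;
    }
    Ok(samples)
}
```

Inside the UAT this could be used as `assert!(check_prometheus_text(&metrics_text).is_ok())`, assuming `render_metrics()` returns a `String`.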

**Export + aggregation test** writes a deliberate signal pattern across two sessions with known types and counts (8 `view`, 2 `like`, 1 `skip`), then verifies the export and aggregation APIs return matching numbers. The annotation on one `like` signal verifies that annotations flow through the WAL and into the export path. `user_session_summary` is called with `since` set to `Timestamp::from_nanos(0)` to capture all historical sessions.
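The expected numbers can be checked independently of the database. The sketch below reproduces the two-session signal pattern with a hypothetical `ExportedEvent` stand-in (the real m7p4 type may have different fields) and ranks signal types by frequency. It illustrates the arithmetic behind the assertions, not how `user_session_summary` is implemented.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an exported event; the real m7p4 type may differ.
struct ExportedEvent {
    signal_type: String,
    annotation: Option<String>,
}

/// Counts events per signal type and returns them most-frequent first,
/// breaking ties by name so the order is deterministic.
fn top_signal_types(events: &[ExportedEvent]) -> Vec<(String, u64)> {
    let mut counts: HashMap<&str, u64> = HashMap::new();
    for event in events {
        *counts.entry(event.signal_type.as_str()).or_insert(0) += 1;
    }
    let mut ranked: Vec<(String, u64)> = counts
        .into_iter()
        .map(|(name, n)| (name.to_string(), n))
        .collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}

/// Builds the 11-event pattern the test writes: 8 views, 2 likes, 1 skip.
fn sample_events() -> Vec<ExportedEvent> {
    let mut events = Vec::new();
    let mut push = |signal_type: &str, annotation: Option<&str>| {
        events.push(ExportedEvent {
            signal_type: signal_type.to_string(),
            annotation: annotation.map(String::from),
        });
    };
    for _ in 0..5 {
        push("view", None); // session 1 views
    }
    push("like", Some("loved it")); // the single annotated signal
    push("like", None);
    for _ in 0..3 {
        push("view", None); // session 2 views
    }
    push("skip", None);
    events
}
```

With this pattern the ranking starts with `("view", 8)`, which matches the `top_signal_types[0]` assertion, and exactly one event carries an annotation.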