jordan 213b8efcca feat: complete M6-M7 + Enterprise Readiness milestones; split oversized source files per CODING_GUIDELINES §9

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-23 22:41:16 -07:00

4.8 KiB

Raw Blame History

m7p5: M7 UAT Integration Test

Delivers

End-to-end M7 UAT integration test suite (tidal/tests/m7_uat.rs) proving all production hardening capabilities work together. Crash recovery, graceful degradation, rate limiting, session auto-cleanup, observability (QueryStats + Prometheus metrics), RLHF export, and cross-session aggregation all exercised in a single comprehensive test file with separate #[test] functions.

Dependencies

m7p1 (Crash Recovery Hardening) -- CrashPoint enum, WAL compaction, checkpoint BLAKE3 integrity, crash fencing for M6 state
m7p2 (Graceful Degradation, Rate Limiting, Session Cleanup) -- DegradationLevel, TidalError::RateLimited, TidalError::Backpressure, session TTL auto-cleanup sweeper, SessionSummary.auto_closed
m7p3 (Performance at Scale) -- benchmarks and optimizations; UAT validates behaviour, not scale numbers
m7p4 (Operational Visibility) -- QueryStats, Prometheus metrics export, db.export_signals(), db.user_session_summary()

Research References

All M7 phase specifications in docs/planning/ROADMAP.md
docs/research/tidaldb_wal.md -- crash recovery, segment format
docs/research/tidaldb_signal_ledger.md -- checkpoint format, running-score formula
thoughts.md Part V -- graceful degradation, operational simplicity

Acceptance Criteria (Phase Level)

tidal/tests/m7_uat.rs with separate #[test] functions per UAT step
Crash recovery tests: write items + signals; simulate crash at WAL-write, checkpoint, and with M6 state; verify recovery produces correct state; verify hard negatives (hidden items, blocked creators) never leak after any crash scenario
Session cleanup test: create session with 30s TTL; wait 35s; verify sweeper auto-closed the session; verify auto_closed: true in summary
Degradation test: simulate concurrent load above threshold; verify degradation_level in response matches expected level; verify all queries still return results
Rate limiting test: configure 10 signals/sec rate limit; write 50 signals in 1 second; first 10 succeed; remaining return TidalError::RateLimited; other sessions unaffected
QueryStats test: execute RETRIEVE and SEARCH; verify stats field populated with non-zero candidates_considered, scoring_time_us, total_time_us
Metrics test: verify Prometheus text output contains expected metric names (tidaldb_signal_hot_entries, tidaldb_wal_lag_bytes, etc.)
Export + aggregation test: write session signals; close session; export_signals() returns expected events; user_session_summary() returns correct counts
All prior UAT suites pass (m2_uat, m3_uat, m4_uat, m5_uat, m6_uat) -- no regressions
No individual test exceeds 60 seconds

Task Execution Order

task-01 (Crash Recovery UAT)  ──┐
                                ├──  task-04 (Regression Gate)
task-02 (Degradation + Rate   ──┤
         Limiting + Cleanup)    │
                                │
task-03 (Observability +       ──┘
         Export UAT)

Tasks 01, 02, 03 can be implemented in parallel (they are independent test functions in the same file). Task 04 runs last -- it verifies nothing regressed across all prior UAT suites.

Tasks

#	Task	Delivers	Complexity
01	Crash recovery UAT tests	3 tests: crash at WAL-write, crash at checkpoint, crash with M6 state; verify correct recovery and hard negative invariant	L
02	Degradation + rate limiting + session cleanup UAT tests	3 tests: degradation progression, rate limiting isolation, session auto-cleanup	L
03	Observability + export UAT tests	3 tests: QueryStats populated, metrics content, RLHF export + session aggregation	M
04	Regression gate	Verify all prior UAT suites pass (m2_uat through m6_uat)	S

Notes

Tests use small datasets (100-500 items, not 1M) to keep individual test runtime under 60 seconds. Scale performance is already validated by m7p3 benchmarks.
Crash recovery tests use TempTidalHome for on-disk durability and #[cfg(feature = "test-utils")] for crash injection hooks.
The session cleanup test is the only test that involves a wall-clock wait (35s). All other tests are compute-bound and complete in under 5 seconds.
Rate limiting tests use a low token count (10/sec) so they do not need to actually wait -- they exhaust the bucket immediately.
The regression gate (task-04) is a shell-level check documented in the task file; it does not add Rust code.

Done When

cargo test --manifest-path tidal/Cargo.toml --test m7_uat passes with all tests green.
Each individual test completes in under 60 seconds.
cargo test --manifest-path tidal/Cargo.toml --test m2_uat --test m3_uat --test m4_uat --test m5_uat --test m6_uat all pass -- no regressions.
cargo clippy --manifest-path tidal/Cargo.toml -- -D warnings passes.

4.8 KiB Raw Blame History