Task 05: tidalctl diagnostics Command
Delivers
A diagnostics subcommand for tidalctl that reads the database's metrics state and persistent storage to print a human-readable health summary. Operators use this to triage production issues without attaching a debugger or parsing Prometheus output.
Complexity: M
Dependencies
- task-02 complete (signal + WAL metrics must be wired)
- task-03 complete (index health metrics must be wired)
- task-04 complete (session + cohort + degradation metrics must be wired)
- Existing tidalctl binary with status and paths subcommands (m0p2)
- tidal/src/db/metrics.rs -- MetricsState with all m7p4 metrics
Technical Design
1. Add diagnostics subcommand to tidalctl
In the tidalctl binary (manual arg parsing), add a new match arm:
"diagnostics" => {
let path = parse_path_flag(&args)?;
run_diagnostics(&path, pretty)?;
}
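The match arm above relies on a path-parsing helper whose body is not shown here. A minimal sketch of how such manual flag parsing might look, assuming the arguments arrive as a `Vec<String>` and that errors are plain strings (both are assumptions; `parse_pretty_flag` is a hypothetical name, and the real `parse_path_flag` signature may differ):

```rust
use std::path::PathBuf;

/// Hypothetical sketch: returns the value that follows `--path`.
/// The String error type is an assumption, not the binary's real error type.
fn parse_path_flag(args: &[String]) -> Result<PathBuf, String> {
    let idx = args
        .iter()
        .position(|a| a == "--path")
        .ok_or_else(|| "missing required flag: --path <dir>".to_string())?;
    args.get(idx + 1)
        .map(PathBuf::from)
        .ok_or_else(|| "--path requires a directory argument".to_string())
}

/// Hypothetical sketch: true when `--pretty` appears anywhere in the arguments.
fn parse_pretty_flag(args: &[String]) -> bool {
    args.iter().any(|a| a == "--pretty")
}
```

Keeping the two helpers separate lets the diagnostics arm reuse whatever flag handling the status and paths subcommands already have.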
2. Diagnostics data collection
The diagnostics command opens the database in read-only inspection mode. It does NOT start a full TidalDb instance. Instead, it reads:
- Config: from {data_dir}/config.json (existing tidalctl status path)
- WAL state: scan {wal_dir}/ for segment files, compute total size and count
- Checkpoint age: read {wal_dir}/checkpoint file, parse CheckpointMeta, compute age from checkpoint_time_ns
- Signal ledger size: read the checkpoint file size (approximate; each entity-signal entry is ~983 bytes from m1p4 format)
- Tantivy index: if {data_dir}/text_index/ exists, open read-only, count segments and docs
- USearch index: if {data_dir}/vectors/ exists, report directory size
- Session count: count entries in session journal ({wal_dir}/session_journal.bin)
- Collection count: scan {data_dir}/items/ for Tag::Collection keys
- Cohort count: scan {data_dir}/items/ for cohort-related keys
For the last five items (Tantivy index, USearch index, session count, collection count, cohort count), if the directory or file does not exist, report "not available" rather than erroring.
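The directory scans and the "not available" rule can be sketched with the standard library alone. This is an illustration under assumptions, not the planned implementation: `DirStats` and `scan_dir` are hypothetical names, every regular file in the directory is counted (the real segment-file naming filter is not specified here), and the scan is not recursive.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Count and combined size of the regular files directly inside a directory.
#[derive(Debug, PartialEq)]
struct DirStats {
    file_count: u64,
    total_bytes: u64,
}

/// Returns `Ok(None)` when the directory does not exist, so the caller can
/// print "not available" instead of failing. Other I/O errors are propagated.
fn scan_dir(dir: &Path) -> io::Result<Option<DirStats>> {
    let entries = match fs::read_dir(dir) {
        Ok(entries) => entries,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(None),
        Err(e) => return Err(e),
    };
    let mut stats = DirStats { file_count: 0, total_bytes: 0 };
    for entry in entries {
        let meta = entry?.metadata()?;
        // Assumption: any regular file counts; subdirectories are skipped.
        if meta.is_file() {
            stats.file_count += 1;
            stats.total_bytes += meta.len();
        }
    }
    Ok(Some(stats))
}
```

Mapping a missing directory to `None` rather than an error keeps the optional subsystems from aborting the whole report, while a genuine read failure still surfaces to the caller.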
3. Diagnostics output format
tidalDB Diagnostics
===================
Version: 0.7.0 (build: abc123)
Data dir: /var/lib/tidaldb/data
Storage mode: durable
WAL
---
Segments: 12
Total size: 48.3 MB
Lag (uncompacted): 12.1 MB
Checkpoint
----------
Last checkpoint: 2026-02-23 14:30:12 UTC (47s ago)
WAL sequence: 148293
Signal Ledger
-------------
Estimated entries: ~152,000
Text Index (Tantivy)
--------------------
Segments: 4
Indexed docs: 98,412
Vector Index (USearch)
---------------------
Directory size: 256.7 MB
Sessions
--------
Active: 3
Closed (total): 1,247
Auto-closed: 12
Degradation
-----------
Level: 0 (healthy)
Collections: 8
Cohorts: 3
When --pretty is NOT set, output machine-readable JSON:
{
  "version": "0.7.0",
  "build_hash": "abc123",
  "wal_segments": 12,
  "wal_total_bytes": 50659328,
  "wal_lag_bytes": 12689408,
  "checkpoint_age_seconds": 47,
  "checkpoint_wal_sequence": 148293,
  "signal_estimated_entries": 152000,
  "tantivy_segments": 4,
  "tantivy_indexed_docs": 98412,
  "usearch_directory_bytes": 269156352,
  "sessions_active": 3,
  "sessions_closed_total": 1247,
  "sessions_auto_closed_total": 12,
  "degradation_level": 0,
  "collection_count": 8,
  "cohort_count": 3
}
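The human-readable values correspond to the raw JSON numbers: 50659328 bytes renders as 48.3 MB, which implies that "MB" in the sample means 2^20 bytes. A small sketch of formatting helpers consistent with the two samples follows. The function names are hypothetical, and treating `checkpoint_time_ns` as unsigned nanoseconds since the Unix epoch is an assumption.

```rust
/// Formats a byte count with one decimal, using 2^20 bytes per "MB"
/// (inferred from the sample output, not stated in the spec).
fn format_mb(bytes: u64) -> String {
    format!("{:.1} MB", bytes as f64 / (1024.0 * 1024.0))
}

/// Inserts thousands separators, e.g. 98412 -> "98,412".
fn group_thousands(n: u64) -> String {
    let digits = n.to_string();
    let mut out = String::new();
    for (i, ch) in digits.chars().enumerate() {
        // Add a comma whenever the remaining digit count is a multiple of 3.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(',');
        }
        out.push(ch);
    }
    out
}

/// Whole seconds elapsed since the checkpoint; clamps to 0 if the
/// checkpoint timestamp is ahead of `now_ns` (clock skew).
fn checkpoint_age_seconds(now_ns: u64, checkpoint_time_ns: u64) -> u64 {
    now_ns.saturating_sub(checkpoint_time_ns) / 1_000_000_000
}
```

Using `saturating_sub` avoids an underflow panic if the system clock moved backwards after the checkpoint was written.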
4. Exit codes
| Code | Meaning |
|---|---|
| 0 | Diagnostics completed successfully |
| 1 | Data directory does not exist or is not readable |
| 2 | WAL directory missing or corrupt (partial output still printed) |
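One way to keep the table and the code in agreement is to model the outcomes as an enum and convert it to a process exit code in one place. This is a sketch only; `DiagnosticsOutcome` and its variants are hypothetical names, not part of the existing binary.

```rust
use std::process::ExitCode;

/// Hypothetical outcome type mirroring the exit-code table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DiagnosticsOutcome {
    /// Diagnostics completed successfully.
    Success,
    /// Data directory does not exist or is not readable.
    DataDirUnreadable,
    /// WAL directory missing or corrupt; partial output was still printed.
    WalProblem,
}

impl DiagnosticsOutcome {
    /// Numeric exit code for this outcome.
    fn code(self) -> u8 {
        match self {
            DiagnosticsOutcome::Success => 0,
            DiagnosticsOutcome::DataDirUnreadable => 1,
            DiagnosticsOutcome::WalProblem => 2,
        }
    }
}

impl From<DiagnosticsOutcome> for ExitCode {
    fn from(outcome: DiagnosticsOutcome) -> Self {
        ExitCode::from(outcome.code())
    }
}
```

With this shape, `main` can return `ExitCode` and the partial output for code 2 is printed before the conversion happens.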
5. No TidalDb instance required
The diagnostics command reads files directly. It does NOT call TidalDb::builder().open(). This means it can run against a database that is currently open by another process (read-only file access) or against a database that failed to start (helping debug startup failures).
The one exception: if a running TidalDb has the metrics HTTP server enabled, tidalctl diagnostics could alternatively fetch /metrics and format the output. Implement the file-based approach as the primary path; the HTTP-based approach is a future enhancement.
Acceptance Criteria
- tidalctl diagnostics --path <dir> --pretty prints human-readable health summary
- tidalctl diagnostics --path <dir> (without --pretty) prints machine-readable JSON
- Output includes: WAL segment count, WAL total size, WAL lag, checkpoint age, checkpoint sequence, estimated signal entries, Tantivy segment count, Tantivy indexed docs, USearch directory size, active sessions, closed sessions, auto-closed sessions, degradation level, collection count, cohort count
- Missing subsystems (no text index, no vectors) show "not available" rather than error
- Works against a database currently open by another process (read-only access)
- Exit code 0 on success, 1 on missing data dir, 2 on WAL issues
- cargo clippy -- -D warnings and cargo fmt --check pass
Test Strategy
// CLI integration tests (each runs the binary as a subprocess)
use std::process::Command;

#[test]
fn diagnostics_json_output_valid() {
    // Create a database, then close it so only on-disk state remains.
    let db = make_test_db_with_items(10);
    let data_dir = db.paths().data_dir().to_path_buf();
    db.close().unwrap();
    // Without --pretty the command must emit JSON.
    let output = Command::new(tidalctl_binary_path())
        .args(["diagnostics", "--path", data_dir.to_str().unwrap()])
        .output()
        .unwrap();
    assert!(output.status.success());
    let json: serde_json::Value = serde_json::from_slice(&output.stdout).unwrap();
    assert!(json["version"].is_string());
    assert!(json["wal_segments"].is_number());
    assert!(json["checkpoint_age_seconds"].is_number());
}

#[test]
fn diagnostics_pretty_output_readable() {
    let db = make_test_db_with_items(10);
    let data_dir = db.paths().data_dir().to_path_buf();
    db.close().unwrap();
    // With --pretty the command must emit the human-readable summary.
    let output = Command::new(tidalctl_binary_path())
        .args(["diagnostics", "--path", data_dir.to_str().unwrap(), "--pretty"])
        .output()
        .unwrap();
    assert!(output.status.success());
    let stdout = String::from_utf8_lossy(&output.stdout);
    assert!(stdout.contains("tidalDB Diagnostics"));
    assert!(stdout.contains("WAL"));
    assert!(stdout.contains("Checkpoint"));
}

#[test]
fn diagnostics_missing_dir_exits_1() {
    // A nonexistent data directory must map to exit code 1.
    let output = Command::new(tidalctl_binary_path())
        .args(["diagnostics", "--path", "/nonexistent/path"])
        .output()
        .unwrap();
    assert_eq!(output.status.code(), Some(1));
}