tidaldb/docs/research/tidaldb_tooling_and_diagnostics.md

# Research: CLI Framework and Embedded HTTP for m0p2 Tooling & Diagnostics

## Question

What is the minimum-viable set of dependencies and design patterns for:
1. A `tidalctl` CLI binary (2 subcommands, 1 required flag, 1 optional flag, JSON output)
2. An optional embedded HTTP endpoint (`/healthz` JSON, `/metrics` Prometheus text format)
3. Prometheus text format output for 5-10 counters/gauges
4. Config serialization for CLI-to-library communication

## TidalDB Context

tidalDB is an embeddable, single-node-first Rust database. The dependency philosophy from CODING_GUIDELINES.md is explicit: "Every dependency must justify its existence against 'could we write this in 200 lines?'" The library crate has `#![forbid(unsafe_code)]` at crate level. MSRV is 1.91 (Rust 2024 edition).

**m0p2 scope is narrow:**
- `tidalctl status --path <dir>` and `tidalctl paths --path <dir>` -- two subcommands, one required flag (`--path`), one optional flag (`--pretty`), JSON output
- `/healthz` returning JSON health status
- `/metrics` returning Prometheus text format with ~5-10 metrics (uptime, WAL sequence, queue depth, build hash)
- The HTTP endpoint is feature-gated (`metrics` feature), disabled by default
- Expected concurrent connections to the metrics endpoint: <10 (dev/ops tooling only)

**Existing dependency context (from Cargo.lock):** `criterion` (dev-dependency) already pulls in `clap 4.5.60`, `serde 1.0.228`, `serde_json 1.0.149`, and `serde_derive 1.0.228`. These are compiled in every `cargo test` and `cargo bench` invocation today. `serde`/`serde_json` are also listed as approved dependencies in CODING_GUIDELINES.md (line 296).

---

## Question 1: CLI Argument Parsing for `tidalctl`

### Approaches Surveyed

#### Approach 1: `clap` 4.x (derive API)

**How it works:** Declarative derive macros on structs generate a full argument parser with help text, error messages, completions, and subcommand routing. The derive API maps directly from struct fields to CLI flags.

**Used by:** TiKV (`tikv-ctl`), Meilisearch, SurrealDB, Vector, Nushell, ripgrep, bat, fd. The dominant choice in the Rust CLI ecosystem. Criterion (already a tidalDB dev-dep) uses clap 4 internally.

**Evidence:**
- argparse-rosetta-rs benchmarks (2024): 3s full debug build, 392ms incremental. 654 KiB release binary overhead (full features) or 427 KiB (minimal features).
- MSRV: 1.74. Compatible with tidalDB's 1.91.
- Rain's Rust CLI Recommendations: "use clap unless you have a really simple application."

**Strengths:**
- Auto-generated `--help` with subcommand tree, argument descriptions, and defaults.
- Compile-time validation of argument structure via derive macros.
- Shell completions via `clap_complete`.
- Already in Cargo.lock via criterion -- zero additional compile-time cost in dev builds.

**Weaknesses:**
- 654 KiB binary overhead (full) / 427 KiB (minimal) added to the `tidalctl` release binary.
- Proc-macro dependency chain (syn, quote, proc-macro2) -- though these are already compiled for criterion.
- Overkill for 2 subcommands.

#### Approach 2: `argh` 0.1.13 (Google's derive parser)

**How it works:** Derive-based parser optimized for code size, designed for Google Fuchsia's CLI conventions. Similar derive API to clap but with a smaller binary footprint.

**Used by:** Google Fuchsia tooling. Limited adoption outside Google's ecosystem.

**Evidence:**
- argparse-rosetta-rs benchmarks: 3s full debug build (same as clap due to proc-macro overhead), 203ms incremental. 38 KiB binary overhead.
- MSRV: not explicitly declared. Uses 2018 edition. Last release ~12 months ago.
- License: BSD-3-Clause. "This is not an officially supported Google product."

**Strengths:**
- Much smaller binary overhead than clap (38 KiB vs 427-654 KiB).
- Derive-based API similar to clap.

**Weaknesses:**
- Not in Cargo.lock -- adds a new dependency tree.
- Fuchsia-specific conventions (not standard Unix `--flag=value` in all cases).
- Lower community adoption; maintenance uncertain (not officially supported by Google).
- No shell completions.
- 3s initial compile (proc-macro overhead same as clap).

#### Approach 3: `pico-args` 0.5.0

**How it works:** Manual argument extraction via method calls. No derive, no proc-macros, no help generation. Parse arguments by calling `opt_value_from_str("--path")`, `contains("--pretty")`, and `subcommand()`.

**Used by:** RazrFalcon's suite of tools (resvg, usvg, svgcleaner). Popular in the "small tool" Rust ecosystem. 11M+ total downloads on crates.io.

**Evidence:**
- argparse-rosetta-rs benchmarks: 384ms full debug build, 185ms incremental. 23 KiB binary overhead.
- Zero dependencies. Zero proc-macros. 666 lines of code.
- MSRV: 1.32. Compatible with any Rust version.
- License: MIT.
- No unsafe code (`#![forbid(unsafe_code)]`).

**Strengths:**
- Negligible compile-time and binary size impact.
- Zero dependencies -- no transitive risk.
- API is simple enough for 2 subcommands.
- Matches tidalDB's dependency philosophy perfectly.

**Weaknesses:**
- No auto-generated `--help`. Must be hand-written (10-15 lines for this CLI).
- No derive -- argument parsing is imperative code.
- Subcommand routing is manual string matching.
- Error messages are less polished than clap.

#### Approach 4: `lexopt` 0.3.1

**How it works:** Low-level lexer that yields tokens (`Short`, `Long`, `Value`). The application matches on tokens in a loop. One file, zero dependencies, zero macros.

**Used by:** cargo (as `clap_lex` which is derived from lexopt's design), uutils.

**Evidence:**
- argparse-rosetta-rs benchmarks: 385ms full debug build, 184ms incremental. 34 KiB binary overhead.
- Zero dependencies. MSRV 1.31. License: MIT/Apache-2.0.

**Strengths:**
- Handles `OsString` correctly (important for path arguments).
- Slightly more structured than raw `std::env::args()`.

**Weaknesses:**
- More boilerplate than pico-args for the same result.
- No subcommand abstraction -- everything is a token loop.
- Slightly larger binary overhead than pico-args for less ergonomic API.

#### Approach 5: Manual (`std::env::args()`)

**How it works:** Read `std::env::args()` into a `Vec<String>`, match on the first positional argument for the subcommand, iterate remaining args for flags.

**Used by:** Many internal tools. SQLite's CLI is hand-rolled in C (not using getopt). DuckDB's CLI is based on SQLite's hand-rolled parser.

**Evidence:**
- Zero dependencies, zero binary overhead, zero compile time addition.
- For 2 subcommands + 2 flags, this is approximately 50-80 lines of Rust.

**Strengths:**
- Absolute minimum footprint.
- No dependency to maintain, audit, or version-pin.
- Complete control over error messages.

**Weaknesses:**
- Must handle edge cases manually: `--path=<dir>` vs `--path <dir>`, `--` separator, unknown flags.
- No help generation.
- More code to maintain than pico-args for equivalent behavior.
- Easy to introduce subtle parsing bugs (e.g., `--path` at end of args without value).

### Comparison

| Criterion | clap 4.x | argh 0.1.13 | pico-args 0.5.0 | lexopt 0.3.1 | Manual |
|---|---|---|---|---|---|
| Full debug build | 3s | 3s | 384ms | 385ms | 0ms |
| Incremental build | 392ms | 203ms | 185ms | 184ms | 0ms |
| Binary overhead (release) | 427-654 KiB | 38 KiB | 23 KiB | 34 KiB | 0 KiB |
| Dependencies | ~10 transitive | ~3 (proc-macro) | 0 | 0 | 0 |
| Auto `--help` | Yes | Yes | No | No | No |
| Subcommand support | Native | Native | Manual matching | Manual matching | Manual matching |
| Proc-macros | Yes (derive) | Yes (derive) | No | No | No |
| `#![forbid(unsafe_code)]` | No (clap uses unsafe) | Unknown | Yes | Yes | Yes |
| MSRV | 1.74 | Undeclared (2018 ed.) | 1.32 | 1.31 | N/A |
| Already in Cargo.lock | Yes (via criterion) | No | No | No | N/A |
| License | MIT/Apache-2.0 | BSD-3-Clause | MIT | MIT/Apache-2.0 | N/A |
| Lines of code (user-side) | ~25 (derive struct) | ~25 (derive struct) | ~40 (imperative) | ~50 (token loop) | ~60-80 |

### Recommendation: Manual `std::env::args()` for `tidalctl`

**The case is clear when you look at the actual scope.** `tidalctl` has 2 subcommands, 1 required flag, and 1 optional flag. This is a 60-line match statement, not a parser configuration problem.

The key arguments:

1. **The CODING_GUIDELINES.md test:** "Could we write this in 200 lines?" -- Yes, in about 60 lines, including help text and error messages. No dependency passes this bar for this scope.

2. **`tidalctl` is a separate binary crate, not the library.** It will have its own `Cargo.toml`. Even though clap is in the workspace Cargo.lock via criterion, `tidalctl`'s release build would need to compile clap into the binary, adding 427+ KiB. The CLI binary should be small -- the `status` command reads a config file and prints JSON; it should not be a 1+ MiB binary.

3. **The "escape hatch" argument favors manual.** If `tidalctl` grows to 5+ subcommands (e.g., `tidalctl compact`, `tidalctl backup`, `tidalctl schema`), switching from manual to pico-args or clap is a straightforward refactor. The reverse migration (clap to manual) is harder because derive macros become load-bearing.

4. **Production precedent:** SQLite and DuckDB both use hand-rolled CLI parsers. For embedded database tooling with few commands, this is the norm, not the exception.

**If the team prefers a library:** pico-args 0.5.0 is the right choice. Zero dependencies, 23 KiB overhead, `#![forbid(unsafe_code)]`, and the API is natural for this use case. Pin to `pico-args = "0.5"`.

**Do not use clap for `tidalctl` at this scope.** It is the right tool for a CLI with 10+ subcommands and complex argument validation. It is overkill for 2 subcommands and would add 427 KiB to a binary that should be 100-200 KiB total.
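To make the size of the manual approach concrete, here is a minimal sketch of the parser under the m0p2 scope described above. The names `Cli`, `Command`, `USAGE`, and `parse_args` are illustrative assumptions, not existing tidalDB APIs. It takes `String` arguments for brevity, so a real binary that must accept non-UTF-8 paths would iterate `std::env::args_os()` instead.

```rust
use std::path::PathBuf;

/// The two subcommands in m0p2 scope.
#[derive(Debug, PartialEq)]
pub enum Command {
    Status,
    Paths,
}

/// Parsed invocation of `tidalctl` (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct Cli {
    pub command: Command,
    pub path: PathBuf,
    pub pretty: bool,
}

pub const USAGE: &str = "usage: tidalctl <status|paths> --path <dir> [--pretty]";

/// Attach the usage line to an error message.
fn usage_error(msg: &str) -> String {
    format!("{}\n{}", msg, USAGE)
}

/// Parse arguments (program name already skipped) into a `Cli`.
/// Accepts both `--path <dir>` and `--path=<dir>`.
pub fn parse_args<I>(args: I) -> Result<Cli, String>
where
    I: IntoIterator<Item = String>,
{
    let mut args = args.into_iter();

    // The first argument selects the subcommand.
    let command = match args.next().as_deref() {
        Some("status") => Command::Status,
        Some("paths") => Command::Paths,
        Some(other) => return Err(usage_error(&format!("unknown subcommand '{other}'"))),
        None => return Err(usage_error("missing subcommand")),
    };

    let mut path: Option<PathBuf> = None;
    let mut pretty = false;

    while let Some(arg) = args.next() {
        if arg == "--pretty" {
            pretty = true;
        } else if arg == "--path" {
            // `--path` as the final argument has no value to consume.
            let value = args
                .next()
                .ok_or_else(|| usage_error("--path requires a value"))?;
            path = Some(PathBuf::from(value));
        } else if let Some(value) = arg.strip_prefix("--path=") {
            path = Some(PathBuf::from(value));
        } else {
            return Err(usage_error(&format!("unknown argument '{arg}'")));
        }
    }

    let path = path.ok_or_else(|| usage_error("missing required flag --path"))?;
    Ok(Cli { command, path, pretty })
}
```

A `main` function would call `parse_args(std::env::args().skip(1))`, print the error to stderr, and exit non-zero on `Err`. The sketch covers the edge cases listed under the manual approach's weaknesses (both `--path` spellings, a trailing `--path`, unknown flags) but deliberately does not handle the `--` separator.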

---

## Question 2: Sync Embedded HTTP for Metrics Endpoint

### Design Tension

The m0p2 task document says: "Endpoint can run on the same Tokio runtime as host service (returns `Future` implementor)." But the research question notes: "Needs to work without Tokio as a hard dependency." These are in tension.

**Resolution:** The metrics endpoint should be designed as a synchronous server running on a background `std::thread`. The start function should return a handle that owns a `std::thread::JoinHandle<()>`, not a `Future`. Because starting the server only spawns a thread and returns immediately, a host application running Tokio can call it directly from async code; the server never needs to run on the host's runtime. This is simpler, avoids a Tokio dependency, and is compatible with both async and sync host applications.

A future `metrics-tokio` feature flag could add a `Future`-returning wrapper, but m0p2 does not need it.

### Approaches Surveyed

#### Approach 1: `tiny_http` 0.12.0

**How it works:** Synchronous HTTP server using `std::net::TcpListener` internally with a thread pool. Handles HTTP/1.1 parsing, keep-alive, chunked transfer, content encoding. You call `server.recv()` in a loop and respond synchronously.

**Used by:** devserver, nickel (legacy), numerous internal tools. 1.1K GitHub stars, 395 downstream crates.

**Evidence:**
- Version 0.12.0, released October 2022. Edition 2018. MSRV 1.57.
- Core dependencies: `ascii`, `chunked_transfer`, `httpdate` -- minimal tree (~5 crates without TLS).
- Size: 120 KB crate, ~2.5K source lines.
- License: MIT/Apache-2.0.
- No TLS needed for localhost metrics (disable all `ssl-*` features).
- Uses some `unsafe` internally (HTTP parsing optimizations).

**Strengths:**
- Fully synchronous -- no Tokio dependency.
- Handles HTTP edge cases (keep-alive, chunked, pipelining) correctly.
- Mature, battle-tested for low-traffic use cases.
- Simple API: `server.recv()` -> `Request` -> `request.respond(Response)`.

**Weaknesses:**
- Last release October 2022 -- 3+ years old. Active maintenance is uncertain.
- Internal thread pool adds complexity tidalDB does not need for 2 endpoints.
- Pulls in `ascii` and `chunked_transfer` crates -- small but nonzero dependency surface.
- Uses `unsafe` internally, which cannot be audited as easily as a hand-rolled solution.
- MSRV 1.57 is fine, but edition 2018 is dated.

#### Approach 2: `rouille` 0.6.2

**How it works:** Macro-based synchronous web framework built on top of `tiny_http`. Adds routing macros, form parsing, and session handling.

**Used by:** Small Rust web projects. 1.1K GitHub stars.

**Evidence:**
- Built on `tiny_http` -- inherits its HTTP handling.
- Adds significant API surface (routing macros, sessions, forms) that tidalDB does not need.
- Last commit activity has slowed.
- License: MIT/Apache-2.0.

**Strengths:**
- Routing macros reduce boilerplate for multi-endpoint servers.

**Weaknesses:**
- Wrapper around `tiny_http` -- adds dependency on top of dependency.
- Routing macros are unnecessary for 2 endpoints.
- Maintenance status unclear.
- Fails the "200 lines" test -- we are adding a framework when we need 2 `if` branches.

#### Approach 3: Hand-rolled (`std::net::TcpListener`)

**How it works:** Bind a `TcpListener`, accept connections in a loop on a background thread, parse the HTTP request line (just the method and path), write a raw HTTP response. For 2 endpoints with static-ish content, this is ~80-120 lines.

**Used by:** The Rust Book's web server tutorial uses this exact pattern. Prometheus client libraries in other languages often use minimal HTTP for the `/metrics` endpoint. SQLite does not embed an HTTP server, but the pattern is standard for database diagnostics (e.g., RocksDB statistics are often exposed via a hand-rolled HTTP endpoint in embedding applications).

**Evidence:**
- Zero dependencies. Zero binary overhead.
- The Rust standard library's `TcpListener` + `BufReader` handles everything needed for HTTP/1.1 request parsing at this scale.
- For `/healthz` and `/metrics` with <10 concurrent connections, HTTP keep-alive and chunked transfer are unnecessary -- `Connection: close` on every response is acceptable.

**Strengths:**
- Zero dependencies -- maximally embeddable.
- Audit surface is 80-120 lines of code that the team wrote and understands.
- No `unsafe` (stays within `#![forbid(unsafe_code)]`).
- Thread model is explicit: one `std::thread::spawn` with a loop, one `TcpListener`.
- Trivially testable: connect with `std::net::TcpStream` in integration tests.

**Weaknesses:**
- Must handle HTTP parsing manually. But for this scope: read the first line, split on spaces, match path. Malformed requests get a 400 response. This is ~20 lines.
- No keep-alive, no chunked transfer, no content encoding. Acceptable for dev/ops metrics endpoint at <10 connections.
- If requirements grow (TLS, WebSocket, many endpoints), must migrate to a real server. But m0p2 has 2 endpoints.

#### Approach 4: `axum` + Tokio (async)

**How it works:** Full async web framework built on `hyper` and `tokio`. Tower middleware ecosystem, type-safe extractors, Router-based routing.

**Used by:** Most production Rust web services. The ecosystem standard for async HTTP.

**Evidence:**
- Pulls in `tokio`, `hyper`, `tower`, `http`, and dozens of transitive dependencies.
- Binary size impact: 1-3 MiB.
- Compile time: 10-20s for a clean build.

**Strengths:**
- Production-grade HTTP handling.
- Seamless integration if the host application already runs Tokio.

**Weaknesses:**
- **Fundamentally incompatible with tidalDB's embeddable philosophy.** As an unconditional dependency, Tokio would be linked by every embedder, even those who never use metrics. Feature-gating limits that cost to embedders who enable `metrics`, but the feature would still pull in the entire async runtime for 2 endpoints.
- Massive dependency tree for 2 endpoints.
- Does not pass the "200 lines" test by orders of magnitude.

#### Approach 5: `warp` (async, Tokio-based)

Same category as axum. Pulls Tokio. Same disqualification for the same reasons.

### Comparison

| Criterion | tiny_http 0.12 | rouille 0.6 | Hand-rolled | axum + Tokio |
|---|---|---|---|---|
| Async? | No (sync) | No (sync) | No (sync) | Yes |
| Dependencies | ~5 crates | ~8 crates (via tiny_http) | 0 | ~50+ crates |
| Binary size impact | ~50-80 KiB | ~80-120 KiB | 0 KiB | 1-3 MiB |
| Compile time impact | ~1-2s | ~2-3s | 0s | 10-20s |
| HTTP correctness | Full HTTP/1.1 | Full HTTP/1.1 | Minimal (sufficient) | Full HTTP/1.1 + HTTP/2 |
| `#![forbid(unsafe_code)]` | No (internal unsafe) | No | Yes | No |
| MSRV | 1.57 | Unknown | N/A (std only) | ~1.70+ |
| Maintenance | Last release Oct 2022 | Uncertain | N/A (owned code) | Active |
| License | MIT/Apache-2.0 | MIT/Apache-2.0 | N/A | MIT |
| Shutdown coordination | `server.unblock()` | `server.unblock()` | `AtomicBool` flag | `tokio::sync::oneshot` |
| Concurrent connections | Thread pool | Thread pool | Sequential (acceptable) | Async (unlimited) |

### Recommendation: Hand-rolled `std::net::TcpListener`

**For 2 endpoints serving <10 concurrent connections in a dev/ops context, a hand-rolled HTTP listener is the correct choice.**

The arguments:

1. **The "200 lines" test is decisive.** The entire metrics HTTP server -- binding, accept loop, request parsing, routing, response formatting, graceful shutdown -- fits in ~100-120 lines of safe Rust. No dependency justifies its existence here.

2. **Zero dependency cost.** The `metrics` feature flag should add only tidalDB's own code, not a third-party HTTP server. An embedder who enables `metrics` should not be surprised by new transitive dependencies.

3. **`#![forbid(unsafe_code)]` compatibility.** tiny_http uses unsafe internally. A hand-rolled solution stays within tidalDB's safety guarantees.

4. **Shutdown is trivial with an `AtomicBool`.** The background thread checks `shutdown.load(Ordering::Acquire)` on each accept iteration, pairing with the `Release` store in `Drop`. Because `accept()` blocks and `std::net::TcpListener` exposes no accept timeout, the loop needs a way to wake up: either `TcpListener::set_nonblocking(true)` with a 200ms poll interval, or a connect-to-self after the flag is set to unblock the pending `accept()`.

5. **The "escape hatch" works both directions.** If m0p2 grows beyond 2 endpoints or needs TLS, migrating to tiny_http or axum is straightforward -- the endpoint handler functions remain the same, only the server harness changes.

**API design:**

```rust
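use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};

/// Start the metrics HTTP server on a background thread.
///
/// Returns a handle that stops the server when dropped.
/// (Signature only; the accept loop body is elided in this sketch.)
pub fn start_metrics_server(addr: std::net::SocketAddr, db: Arc<TidalDb>) -> MetricsHandle;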

pub struct MetricsHandle {
    shutdown: Arc<AtomicBool>,
    thread: Option<std::thread::JoinHandle<()>>,
}

impl Drop for MetricsHandle {
    fn drop(&mut self) {
        self.shutdown.store(true, Ordering::Release);
        if let Some(handle) = self.thread.take() {
            let _ = handle.join();
        }
    }
}
```
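A fuller sketch of the server harness, assuming the non-blocking accept loop with a 200ms poll. To stay self-contained it deviates from the signature above in three labeled ways: it takes a handler closure (request line in, response bytes out) instead of `Arc<TidalDb>`, it returns `io::Result` so bind failures surface, and it adds a hypothetical `local_addr()` accessor for tests that bind to port 0. Treat it as an illustration of the thread and shutdown model rather than the final implementation.

```rust
use std::io::{self, BufRead, BufReader, Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Poll interval for the non-blocking accept loop (an assumed value).
const POLL_INTERVAL: Duration = Duration::from_millis(200);

pub struct MetricsHandle {
    shutdown: Arc<AtomicBool>,
    thread: Option<JoinHandle<()>>,
    local_addr: SocketAddr,
}

impl MetricsHandle {
    /// Address actually bound; useful when binding to port 0 in tests.
    pub fn local_addr(&self) -> SocketAddr {
        self.local_addr
    }
}

impl Drop for MetricsHandle {
    fn drop(&mut self) {
        // Signal the loop, then wait (at most about one poll interval) for it to exit.
        self.shutdown.store(true, Ordering::Release);
        if let Some(handle) = self.thread.take() {
            let _ = handle.join();
        }
    }
}

/// Hypothetical variant of `start_metrics_server` that takes a handler closure.
pub fn start_metrics_server<H>(addr: SocketAddr, handler: H) -> io::Result<MetricsHandle>
where
    H: Fn(&str) -> Vec<u8> + Send + 'static,
{
    let listener = TcpListener::bind(addr)?;
    listener.set_nonblocking(true)?;
    let local_addr = listener.local_addr()?;
    let shutdown = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&shutdown);

    let join_handle = thread::spawn(move || {
        while !flag.load(Ordering::Acquire) {
            match listener.accept() {
                // An error on one connection must not stop the server.
                Ok((stream, _peer)) => {
                    let _ = serve_one(stream, &handler);
                }
                // `WouldBlock` means no pending connection; other errors are
                // treated as transient. Either way, sleep and re-check the flag.
                Err(_) => thread::sleep(POLL_INTERVAL),
            }
        }
    });

    Ok(MetricsHandle { shutdown, thread: Some(join_handle), local_addr })
}

/// Handle one connection sequentially: read the request, write one response, close.
fn serve_one<H: Fn(&str) -> Vec<u8>>(mut stream: TcpStream, handler: &H) -> io::Result<()> {
    // Accepted sockets can inherit non-blocking mode on some platforms.
    stream.set_nonblocking(false)?;
    stream.set_read_timeout(Some(Duration::from_secs(2)))?;
    // Cap the request at 8 KiB so a client cannot grow memory without bound.
    let mut reader = BufReader::new(stream.try_clone()?.take(8 * 1024));
    let mut request_line = String::new();
    reader.read_line(&mut request_line)?;
    // Drain headers up to the blank line so the request is fully consumed.
    let mut header = String::new();
    loop {
        header.clear();
        let n = reader.read_line(&mut header)?;
        if n == 0 || header == "\r\n" || header == "\n" {
            break;
        }
    }
    stream.write_all(&handler(&request_line))?;
    stream.flush()
}
```

Connections are served one at a time on the server thread, which matches the "sequential (acceptable)" entry in the comparison table. An integration test can bind to `127.0.0.1:0`, read `local_addr()`, and talk to the endpoint over a plain `TcpStream`.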

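**Tokio compatibility:** `start_metrics_server` only spawns a thread and returns, so an embedder running Tokio can call it directly from async code and keep the returned `MetricsHandle` alive for as long as the endpoint should run; no `spawn_blocking` is needed to start it. Dropping the handle joins the server thread, which can block briefly, so an async embedder may prefer to drop it inside `tokio::task::spawn_blocking`. No tidalDB code needs to know about Tokio.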

---

## Question 3: Prometheus Text Format

### Format Specification

The Prometheus text exposition format (version 0.0.4) is line-oriented, UTF-8 encoded, with `\n` line endings:

```
# HELP <metric_name> <docstring>
# TYPE <metric_name> <counter|gauge|histogram|summary|untyped>
<metric_name>{<label_name>="<label_value>",...} <value> [<timestamp>]
```

Rules:
- `# HELP` and `# TYPE` must appear before the first sample for a metric.
- Only one `# HELP` and one `# TYPE` per metric name.
- If `# TYPE` is omitted, metric defaults to `untyped`.
- Label values must escape backslash as `\\`, double quote as `\"`, and line feed as `\n` (a backslash followed by the letter `n`).
- Values are Go `ParseFloat` format: integers, floats, `NaN`, `+Inf`, `-Inf`.
- Timestamp is optional (milliseconds since epoch). Prometheus will use scrape time if omitted.
- Content-Type: `text/plain; version=0.0.4; charset=utf-8`.
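The escaping rules above are small enough to sketch directly. The helper names `escape_label_value` and `escape_help` are assumptions for illustration. The second helper reflects that `# HELP` text escapes only backslash and line feed, not double quotes.

```rust
/// Escape a label value for the Prometheus text exposition format:
/// backslash, double quote, and line feed get a backslash escape.
pub fn escape_label_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        match ch {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}

/// Escape `# HELP` text: only backslash and line feed are escaped.
pub fn escape_help(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}
```

For labels that are compile-time constants these helpers are no-ops, but they become necessary as soon as a label carries runtime data such as a filesystem path.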

### Example for tidalDB's metrics

```
# HELP tidaldb_uptime_seconds Seconds since the database was opened.
# TYPE tidaldb_uptime_seconds gauge
tidaldb_uptime_seconds{partition_id="0"} 3723.5

# HELP tidaldb_wal_sequence Current WAL sequence number.
# TYPE tidaldb_wal_sequence counter
tidaldb_wal_sequence{partition_id="0"} 148293

# HELP tidaldb_wal_queue_depth Number of WAL entries pending flush.
# TYPE tidaldb_wal_queue_depth gauge
tidaldb_wal_queue_depth{partition_id="0"} 12

# HELP tidaldb_build_info Build metadata. Value is always 1.
# TYPE tidaldb_build_info gauge
tidaldb_build_info{version="0.1.0",build_hash="abc123",partition_id="0"} 1

# HELP tidaldb_open_segments Number of open WAL segments.
# TYPE tidaldb_open_segments gauge
tidaldb_open_segments{partition_id="0"} 3
```

### Approaches Surveyed

#### Approach 1: `prometheus` crate (tikv/rust-prometheus) 0.13.x

**How it works:** Registry-based. Create `Counter`, `Gauge`, `Histogram` objects, register them with a `Registry`, call `TextEncoder::encode()` to produce the exposition format.

**Used by:** TiKV, Linkerd, numerous Rust services. The de facto standard.

**Evidence:**
- Well-maintained (tikv organization). License: Apache-2.0.
- Pulls in `protobuf` (for optional protobuf format), `lazy_static`, `parking_lot`, `memchr`.
- Forces string allocations during metric collection (Collector trait limitation).
- Binary size: ~100-200 KiB.
- MSRV: 1.56.

**Strengths:**
- Battle-tested encoding. Guaranteed format correctness.
- Histogram and summary support built-in.

**Weaknesses:**
- Significant dependency tree for 5 counters/gauges.
- `protobuf` dependency is unnecessary for text-only exposition.
- Allocation-heavy collector API (documented ~40% slower than prometheus-client).
- Overkill: we need `writeln!` for 5 metrics, not a registry system.

#### Approach 2: `prometheus-client` crate 0.22.x

**How it works:** OpenMetrics-compatible. Type-safe labels via Rust type system (not string pairs). Visitor-based encoding (no allocations).

**Used by:** Maintained under the Prometheus organization as the official Rust client. Generally recommended for new projects.

**Evidence:**
- Prometheus organization maintained. License: Apache-2.0.
- No unsafe code.
- ~40% faster encoding than tikv/rust-prometheus due to visitor pattern.
- Smaller dependency footprint than tikv version.

**Strengths:**
- Type-safe labels catch errors at compile time.
- No allocation during encoding.
- Official Prometheus project.

**Weaknesses:**
- Still a registry-based abstraction layer for 5 metrics.
- Adds dependency tree that is not justified for the scope.

#### Approach 3: Hand-written format

**How it works:** Use `write!` / `writeln!` to a `String` or `Vec<u8>`, following the format spec directly. For 5 counters/gauges with static names and 1-2 labels, this is a function that reads metric values and formats them.

**Evidence:**
- The format is trivially simple for counters and gauges. The complete formatting logic for 5 metrics is ~30-40 lines.
- No histograms or summaries needed at m0p2 scope.
- Validation: the output must match `# HELP`, `# TYPE`, then metric lines. A unit test can assert the format parses correctly (or simply check line structure).

**Strengths:**
- Zero dependencies.
- Complete control over output format.
- Trivially auditable -- the format spec is 1 page.
- No registry overhead, no trait objects, no allocations beyond the output buffer.

**Weaknesses:**
- Must follow the spec precisely. If a label value contains `"` or `\n`, it must be escaped. For tidalDB's labels (`partition_id="0"`, `version="0.1.0"`), these are compile-time string literals -- no escaping needed.
- If tidalDB grows to 50+ metrics with histograms, a library becomes justified. But at 5-10 counters/gauges, it is not.

### Comparison

| Criterion | prometheus (tikv) | prometheus-client | Hand-written |
|---|---|---|---|
| Dependencies | ~8 (incl. protobuf) | ~3 | 0 |
| Binary size | ~100-200 KiB | ~50-100 KiB | 0 KiB |
| Histogram support | Yes | Yes | No (not needed) |
| Allocation during encode | Yes (Collector trait) | No (visitor pattern) | No (write! to buffer) |
| Format correctness | Guaranteed | Guaranteed | Unit-tested |
| Lines of code (user-side) | ~30 (register + encode) | ~30 (register + encode) | ~40 (format directly) |
| `#![forbid(unsafe_code)]` | Unknown | Yes | Yes |

### Recommendation: Hand-written Prometheus text format

For 5-10 counters and gauges with known-safe label values, hand-writing the exposition format is the clear choice. The implementation is approximately 40 lines:

```rust
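use std::fmt::Write;

/// Values to render. The field set is illustrative; the real snapshot
/// type is still an open question.
pub struct MetricsSnapshot {
    pub uptime_secs: f64,
    pub wal_sequence: u64,
}

pub fn render_prometheus_metrics(metrics: &MetricsSnapshot) -> String {
    let mut out = String::with_capacity(1024);

    write_gauge(&mut out, "tidaldb_uptime_seconds",
        "Seconds since the database was opened.",
        &[("partition_id", "0")], metrics.uptime_secs);

    // Prometheus values are floats; a u64 above 2^53 loses precision here.
    write_counter(&mut out, "tidaldb_wal_sequence",
        "Current WAL sequence number.",
        &[("partition_id", "0")], metrics.wal_sequence as f64);

    // ... more metrics
    out
}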

fn write_gauge(out: &mut String, name: &str, help: &str,
               labels: &[(&str, &str)], value: f64) {
    let _ = writeln!(out, "# HELP {name} {help}");
    let _ = writeln!(out, "# TYPE {name} gauge");
    write_sample(out, name, labels, value);
}

fn write_counter(out: &mut String, name: &str, help: &str,
                 labels: &[(&str, &str)], value: f64) {
    let _ = writeln!(out, "# HELP {name} {help}");
    let _ = writeln!(out, "# TYPE {name} counter");
    write_sample(out, name, labels, value);
}

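/// Writes one sample line. Label values are written unescaped, so callers
/// must pass values free of `\`, `"`, and line feeds.
fn write_sample(out: &mut String, name: &str,
                labels: &[(&str, &str)], value: f64) {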
    let _ = write!(out, "{name}{{");
    for (i, (k, v)) in labels.iter().enumerate() {
        if i > 0 { let _ = write!(out, ","); }
        let _ = write!(out, "{k}=\"{v}\"");
    }
    let _ = writeln!(out, "}} {value}");
}
```

**When to migrate:** If tidalDB needs histograms (e.g., query latency distributions) or 50+ metrics, adopt `prometheus-client` (the official Prometheus crate, not tikv's). Pin to `prometheus-client = "0.22"`. But that is a post-m0p2 decision.
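The unit-test validation mentioned under the hand-written approach can be a simple structural check rather than a full parser. The sketch below uses a hypothetical `check_exposition` helper and assumes counters and gauges only, with no timestamps on sample lines. It verifies that each sample is preceded by exactly one `# HELP` and one `# TYPE` line for its metric name and that the value parses as a float.

```rust
use std::collections::HashSet;

/// Structural check for counter/gauge exposition output (no timestamps assumed).
pub fn check_exposition(text: &str) -> Result<(), String> {
    let mut helped: HashSet<&str> = HashSet::new();
    let mut typed: HashSet<&str> = HashSet::new();

    for (idx, line) in text.lines().enumerate() {
        let n = idx + 1;
        if let Some(rest) = line.strip_prefix("# HELP ") {
            let name = rest.split(' ').next().unwrap_or("");
            if !helped.insert(name) {
                return Err(format!("line {n}: duplicate HELP for '{name}'"));
            }
        } else if let Some(rest) = line.strip_prefix("# TYPE ") {
            let mut parts = rest.split(' ');
            let name = parts.next().unwrap_or("");
            let kind = parts.next().unwrap_or("");
            if !matches!(kind, "counter" | "gauge" | "histogram" | "summary" | "untyped") {
                return Err(format!("line {n}: unknown type '{kind}'"));
            }
            if !typed.insert(name) {
                return Err(format!("line {n}: duplicate TYPE for '{name}'"));
            }
        } else if line.is_empty() || line.starts_with('#') {
            // Blank lines and other comments carry no structure to check.
            continue;
        } else {
            // Sample line: the metric name ends at the first '{' or space.
            let end = line
                .find(|c: char| c == '{' || c == ' ')
                .unwrap_or(line.len());
            let name = &line[..end];
            if !helped.contains(name) || !typed.contains(name) {
                return Err(format!("line {n}: sample '{name}' lacks a preceding HELP or TYPE"));
            }
            // Without timestamps, the value is the last space-separated token.
            let value = line.rsplit(' ').next().unwrap_or("");
            if value.parse::<f64>().is_err() {
                return Err(format!("line {n}: value '{value}' is not a float"));
            }
        }
    }
    Ok(())
}
```

A unit test can run this over the output of `render_prometheus_metrics` and additionally assert on specific expected lines. It would need extending before histograms are added, since their samples use suffixed names such as `_bucket`.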

---

## Question 4: Serde for Config Serialization

### Current State

`Config` is a 4-field struct (`mode: StorageMode`, `data_dir: Option<PathBuf>`, `wal_dir: Option<PathBuf>`, `cache_dir: Option<PathBuf>`). It currently has no serialization support. The CLI needs to read a serialized config snapshot from disk.

### Approaches Surveyed

#### Approach 1: `serde` + `serde_json` (feature-gated on library crate)

**How it works:** Add `#[derive(Serialize, Deserialize)]` to `Config` and `StorageMode` behind a `serde` feature flag. The CLI binary depends on the library with the `serde` feature enabled. `serde_json` handles the JSON encoding.

**Evidence:**
- `serde` (1.0.228) and `serde_json` (1.0.149) are already in `Cargo.lock` via criterion.
- CODING_GUIDELINES.md line 296 explicitly approves serde/serde_json: "serialization (at API boundaries only, not in hot paths)."
- Best practice from Rust API Guidelines and community consensus: library crates should feature-gate serde behind an optional `serde` feature.
- Binary size: serde_json adds ~70-100 KiB to release binaries. serde_derive's proc-macro adds ~5-10s to initial compile, but is already compiled for criterion.
- fjall (tidalDB's storage engine) does not use serde -- adding it to tidalDB does not create a circular dependency or conflict.

**Strengths:**
- Industry standard. Every Rust developer knows serde.
- Already approved in CODING_GUIDELINES.md.
- Already compiled in dev builds (via criterion).
- Feature-gated: embedders who do not need serialization pay zero cost.
- Config is at an API boundary (CLI reads library's config), exactly where serde belongs.

**Weaknesses:**
- serde_derive adds proc-macro compile time. Mitigated by: already compiled for criterion.
- Monomorphization can bloat binary. Mitigated by: Config is a small struct with 4 fields; the generated code is minimal.

#### Approach 2: `miniserde`

**How it works:** Lightweight alternative to serde that uses trait objects instead of monomorphization. ~12x less code than serde + serde_derive + serde_json combined.

**Evidence:**
- JSON-only. No format plugins.
- No error messages on deserialization failure.
- Does not support enums with data (only C-style enums). `StorageMode` is C-style, so this works.
- Does not support `#[serde(rename)]` or most serde attributes.
- Limited type support (no tuple structs, no enums with variant data).

**Strengths:**
- Smaller binary size than serde.
- Faster compile time (no proc-macro overhead comparable to serde_derive).

**Weaknesses:**
- serde is already compiled in the workspace. miniserde adds a *new* dependency tree rather than reusing what exists.
- No error messages -- if the CLI reads a corrupt config file, it gets an opaque `Err(miniserde::Error)` with no indication of what went wrong.
- Would become a migration tax later when tidalDB needs serde for other types (e.g., schema definitions, ranking profiles).

#### Approach 3: Hand-written JSON serialization

**How it works:** Implement `Display` for `Config` that writes JSON manually, and a `from_json_str` function that parses it. For a 4-field struct, this is ~50-80 lines.

**Evidence:**
- Zero dependencies.
- But: manual JSON parsing is error-prone. Escaping, nested objects, null handling, and whitespace tolerance all need implementation.
- tidalDB will need JSON serialization in multiple places beyond Config (API responses, query results, schema export). Implementing a JSON parser from scratch to avoid an already-approved dependency is false economy.

**Strengths:**
- Zero dependency cost.

**Weaknesses:**
- JSON parsing is not a 200-line problem if done correctly. Escaping, unicode, nested structures, error reporting -- this is exactly what serde_json solves.
- Creates maintenance burden that serde eliminates.
- CODING_GUIDELINES.md already approved serde for this exact use case.

### Comparison

| Criterion | serde + serde_json | miniserde | Hand-written |
|---|---|---|---|
| Already in Cargo.lock | Yes (via criterion) | No | N/A |
| Approved in CODING_GUIDELINES | Yes (explicitly) | No | N/A |
| Error messages on parse failure | Yes (detailed) | None | Custom |
| Enum support | Full | C-style only | Custom |
| Future reuse in tidalDB | High (schema, API, query results) | Low | Low |
| Binary size overhead | ~70-100 KiB | ~30-50 KiB | 0 KiB |
| Compile time overhead | 0s (already compiled) | New compilation | 0s |
| Correctness risk | None (battle-tested) | Low | Medium (hand-rolled parser) |

### Recommendation: `serde` + `serde_json`, feature-gated

**This is the one dependency question where the answer is unambiguously "use the library."**

1. **Already approved.** CODING_GUIDELINES.md says: "serde / serde_json -- serialization (at API boundaries only, not in hot paths)." Config serialization for CLI communication is the textbook API boundary use case.

2. **Already compiled.** Both crates are in Cargo.lock via criterion. Adding them as optional dependencies of the main crate adds zero compile time for developers who are already running tests and benchmarks.

3. **Future-proof.** tidalDB will need JSON serialization for: config export, schema definitions, query result formatting, API responses, ranking profile serialization. Every one of these will use serde. Starting with Config establishes the pattern.

4. **Feature-gate it.** The library crate adds:

```toml
[dependencies]
serde = { version = "1", features = ["derive"], optional = true }
serde_json = { version = "1", optional = true }

[features]
serde = ["dep:serde", "dep:serde_json"]
```

And on the struct:

```rust
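use std::path::PathBuf;

// `StorageMode` needs the same `cfg_attr` derive for this to compile
// with the `serde` feature enabled.
#[derive(Debug, Clone)]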
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct Config {
    pub mode: StorageMode,
    pub data_dir: Option<PathBuf>,
    pub wal_dir: Option<PathBuf>,
    pub cache_dir: Option<PathBuf>,
}
```

Embedders who do not need serialization pay nothing. The `tidalctl` binary crate depends on `tidaldb = { path = "../tidal", features = ["serde"] }`.

---

## Open Questions

1. **Config file format and location.** m0p2 task-01 says the CLI reads a "Config dump." Where does the running database write this? Likely `{data_dir}/config.json` written atomically during `TidalDb::open()`. The exact path should be a `Paths` method (e.g., `paths.config_file()`). This is an implementation decision for the engineer, not a research question.

2. **Metrics collection mechanism.** The hand-rolled metrics HTTP server needs to read metrics from the database. What is the interface? Options: (a) `TidalDb` exposes a `pub fn metrics_snapshot(&self) -> MetricsSnapshot` method; (b) a shared `Arc<AtomicU64>` counter registry. Option (a) is simpler and keeps the metrics code behind the public API. The engineer should decide based on what metrics are available at m0p2 (uptime and build info are trivial; WAL sequence requires WAL to be wired up).

3. **Graceful shutdown of the HTTP listener.** `std::net::TcpListener::accept()` blocks, so the listener thread needs a way to be woken for shutdown. There are three options: (a) `set_nonblocking(true)` with a polling loop (simple, slight CPU waste); (b) connect-to-self to unblock accept (clever, no CPU waste); (c) use `SO_REUSEADDR` + `shutdown` on a cloned socket. Option (a) with a 200ms sleep is the simplest and sufficient for a diagnostics endpoint: the thread wakes about five times per second, and shutdown completes within roughly one sleep interval. Benchmark the CPU overhead if concerned -- it will be negligible for a 200ms poll.

4. **When to add `clap`.** If `tidalctl` grows beyond 5 subcommands or needs dynamic completions, switch to `clap`. The migration from manual to clap is a single-commit refactor: define a derive struct matching the existing `match` arms. Document this as the escape hatch in the `tidalctl` crate README.

5. **When to add `prometheus-client`.** If tidalDB needs histograms (query latency distributions, signal write latency distributions) or exceeds 20 metrics, adopt `prometheus-client = "0.22"`. The hand-written format functions become metric registrations on its `Registry` (with `Family` for labeled metrics). Document the threshold.

6. **Integration testing the HTTP endpoint.** The test should call `start_metrics_server` on an ephemeral port (bind to port 0 and read back the assigned address), send `GET /metrics` over a `std::net::TcpStream`, and assert the response contains the expected metric lines. This is straightforward with the hand-rolled approach and does not require an HTTP client library -- raw TCP + string matching is sufficient.
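
Questions 3 and 6 can be sketched together. The following is a minimal illustration, not a settled design: `MetricsServer`, the signature of `start_metrics_server`, `http_get`, and the fixed response body are all assumptions made for the example. It uses option (a), a non-blocking accept loop polled every 200ms against a stop flag, and a raw-TCP client of the kind an integration test could use.

```rust
use std::io::{ErrorKind, Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical handle; `addr` is the address actually bound.
pub struct MetricsServer {
    pub addr: SocketAddr,
    stop: Arc<AtomicBool>,
    thread: Option<JoinHandle<()>>,
}

impl MetricsServer {
    /// Sets the stop flag and waits for the accept loop to observe it.
    pub fn shutdown(mut self) {
        self.stop.store(true, Ordering::Relaxed);
        if let Some(t) = self.thread.take() {
            let _ = t.join();
        }
    }
}

/// Binds `bind_addr` (port 0 picks an ephemeral port) and serves a fixed
/// `/metrics` body. A real server would render the body per request.
pub fn start_metrics_server(bind_addr: &str, body: String) -> std::io::Result<MetricsServer> {
    let listener = TcpListener::bind(bind_addr)?;
    listener.set_nonblocking(true)?;
    let addr = listener.local_addr()?;
    let stop = Arc::new(AtomicBool::new(false));
    let stop_flag = Arc::clone(&stop);
    let thread = thread::spawn(move || {
        while !stop_flag.load(Ordering::Relaxed) {
            match listener.accept() {
                Ok((stream, _)) => {
                    let _ = handle(stream, &body);
                }
                // No pending connection: sleep, then re-check the stop flag.
                Err(e) if e.kind() == ErrorKind::WouldBlock => {
                    thread::sleep(Duration::from_millis(200))
                }
                Err(_) => break,
            }
        }
    });
    Ok(MetricsServer { addr, stop, thread: Some(thread) })
}

fn handle(mut stream: TcpStream, body: &str) -> std::io::Result<()> {
    // Accepted sockets may inherit non-blocking mode on some platforms.
    stream.set_nonblocking(false)?;
    stream.set_read_timeout(Some(Duration::from_secs(2)))?;
    // Read until the end of the request headers (or a small size cap).
    let mut buf = Vec::new();
    let mut chunk = [0u8; 512];
    while !buf.windows(4).any(|w| w == b"\r\n\r\n") && buf.len() < 8192 {
        let n = stream.read(&mut chunk)?;
        if n == 0 {
            break;
        }
        buf.extend_from_slice(&chunk[..n]);
    }
    let request = String::from_utf8_lossy(&buf);
    let (status, payload) = if request.starts_with("GET /metrics ") {
        ("200 OK", body)
    } else {
        ("404 Not Found", "not found\n")
    };
    let response = format!(
        "HTTP/1.1 {}\r\nContent-Type: text/plain; version=0.0.4\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        status,
        payload.len(),
        payload
    );
    stream.write_all(response.as_bytes())
}

/// Raw-TCP GET: returns the full response (status line, headers, body).
pub fn http_get(addr: SocketAddr, path: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    let request = format!("GET {} HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n", path);
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

A test would start the server on `127.0.0.1:0`, call `http_get(server.addr, "/metrics")`, match on the returned string, and finish with `server.shutdown()`.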

---

## Summary of Recommendations

| Component | Recommendation | Justification |
|---|---|---|
| CLI argument parsing | Manual `std::env::args()` | 2 subcommands, 60 lines. "200 lines" test passes. Upgrade path to pico-args/clap exists. |
| HTTP metrics server | Hand-rolled `std::net::TcpListener` | 2 endpoints, <10 connections. ~100 lines of safe Rust. Zero dependencies. |
| Prometheus text format | Hand-written `write!` formatting | 5-10 counters/gauges. ~40 lines. Format spec is trivial for this scope. |
| Config serialization | `serde` + `serde_json`, feature-gated | Already approved, already compiled, future-proof. Feature-gate as `serde`. |

**Total new dependencies for m0p2:** One optional dependency pair (`serde` + `serde_json`) that is already in Cargo.lock and already approved. Everything else is standard library code.
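
To make the "standard library code" claim concrete for the CLI row, here is a rough sketch of manual parsing for the two subcommands. The `Cli` and `Command` types, the `parse_args` name, and the error strings are assumptions for illustration; the `--path=<dir>` spelling is deliberately not handled.

```rust
use std::path::PathBuf;

#[derive(Debug, PartialEq)]
pub enum Command {
    Status,
    Paths,
}

#[derive(Debug, PartialEq)]
pub struct Cli {
    pub command: Command,
    pub path: PathBuf,
    pub pretty: bool,
}

/// Parses everything after the program name, e.g. `std::env::args().skip(1)`.
pub fn parse_args<I: Iterator<Item = String>>(mut args: I) -> Result<Cli, String> {
    // The first positional argument selects the subcommand.
    let command = match args.next().as_deref() {
        Some("status") => Command::Status,
        Some("paths") => Command::Paths,
        Some(other) => return Err(format!("unknown subcommand: {}", other)),
        None => {
            return Err("usage: tidalctl <status|paths> --path <dir> [--pretty]".to_string())
        }
    };

    let mut path = None;
    let mut pretty = false;
    while let Some(arg) = args.next() {
        match arg.as_str() {
            "--path" => {
                // `--path` consumes the following argument as its value.
                let value = args.next().ok_or("--path requires a value")?;
                path = Some(PathBuf::from(value));
            }
            "--pretty" => pretty = true,
            other => return Err(format!("unknown argument: {}", other)),
        }
    }

    let path = path.ok_or("missing required flag: --path")?;
    Ok(Cli { command, path, pretty })
}
```

Each `match` arm corresponds to one field or variant, which is what would keep a later move to a derive-based parser mechanical.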

**Estimated code footprint for m0p2 tooling:**
- `tidalctl` binary: ~150-200 lines (arg parsing + config reading + JSON output)
- Metrics HTTP server: ~100-120 lines (listener + routing + response)
- Prometheus formatter: ~40-50 lines (metric rendering)
- Config serde derives: ~5 lines (derive attributes + feature gate)
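
As a rough check on the formatter estimate, the sketch below renders a snapshot into the Prometheus text format. The `MetricsSnapshot` type, its fields, the `tidaldb_*` metric names, and the choice to expose every value as a gauge are assumptions for illustration; the escaping follows the text format's rule for label values (backslash, double quote, and line feed).

```rust
use std::fmt::Write;

/// Hypothetical snapshot; fields mirror the metrics listed for m0p2.
pub struct MetricsSnapshot {
    pub uptime_seconds: u64,
    pub wal_sequence: u64,
    pub queue_depth: u64,
    pub build_hash: String,
}

/// Appends the HELP line, TYPE line, and one unlabeled sample.
fn write_gauge(out: &mut String, name: &str, help: &str, value: u64) {
    // Writing to a String cannot fail, so the fmt::Result is ignored.
    let _ = writeln!(out, "# HELP {} {}", name, help);
    let _ = writeln!(out, "# TYPE {} gauge", name);
    let _ = writeln!(out, "{} {}", name, value);
}

/// Escapes backslash, double quote, and line feed in a label value.
fn escape_label_value(raw: &str) -> String {
    raw.replace('\\', "\\\\").replace('"', "\\\"").replace('\n', "\\n")
}

pub fn render_prometheus(m: &MetricsSnapshot) -> String {
    let mut out = String::new();
    write_gauge(
        &mut out,
        "tidaldb_uptime_seconds",
        "Seconds since the database was opened.",
        m.uptime_seconds,
    );
    write_gauge(
        &mut out,
        "tidaldb_wal_sequence",
        "Last WAL sequence number.",
        m.wal_sequence,
    );
    write_gauge(
        &mut out,
        "tidaldb_queue_depth",
        "Current queue depth.",
        m.queue_depth,
    );
    // Build info is exposed as a constant 1 with the hash as a label.
    let _ = writeln!(out, "# HELP tidaldb_build_info Build metadata.");
    let _ = writeln!(out, "# TYPE tidaldb_build_info gauge");
    let _ = writeln!(
        out,
        "tidaldb_build_info{{hash=\"{}\"}} 1",
        escape_label_value(&m.build_hash)
    );
    out
}
```

The output of `render_prometheus` is what a `/metrics` handler would send as the response body.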

---

## Sources

### CLI Argument Parsing
- [Rain's Rust CLI Recommendations: Picking an Argument Parser](https://rust-cli-recommendations.sunshowers.io/cli-parser.html)
- [argparse-rosetta-rs: Benchmark data for Rust argument parsers](https://github.com/rosetta-rs/argparse-rosetta-rs) -- compile time, binary size, parse time comparisons
- [pico-args: Ultra simple CLI arguments parser](https://github.com/RazrFalcon/pico-args) -- 666 lines, zero deps, `#![forbid(unsafe_code)]`
- [lexopt: Minimalist pedantic command line parser](https://github.com/blyxxyz/lexopt) -- MSRV 1.31, zero deps
- [clap: Full featured CLI parser](https://docs.rs/clap/latest/clap/) -- MSRV 1.74, derive API
- [argh: Google's derive-based parser](https://github.com/google/argh) -- BSD-3-Clause, Fuchsia conventions
- [Rust CLI argument parsing libraries comparison (jpab.uk)](https://www.jpab.uk/blog/review-rust-cli-flag-parsers/)

### HTTP Servers
- [tiny-http: Low level HTTP server library in Rust](https://github.com/tiny-http/tiny-http) -- v0.12.0, MSRV 1.57, 1.1K stars
- [Rust Book: Building a Multithreaded Web Server](https://doc.rust-lang.org/book/ch21-02-multithreaded.html) -- std::net::TcpListener pattern
- [rouille: Synchronous micro-framework on crates.io](https://crates.io/crates/rouille)
- [Is there any popular synchronous HTTP crate? (Rust Forum)](https://users.rust-lang.org/t/is-there-any-popular-synchronous-http-crate/108111)

### Prometheus Text Format
- [Prometheus Exposition Formats (official specification)](https://prometheus.io/docs/instrumenting/exposition_formats/) -- format version 0.0.4
- [tikv/rust-prometheus: Instrumentation library](https://github.com/tikv/rust-prometheus) -- Collector trait, string allocation issue
- [prometheus/client_rust: Official Prometheus Rust client](https://github.com/prometheus/client_rust) -- visitor pattern, no unsafe, ~40% faster encoding
- [OpenMetrics specification](https://prometheus.io/docs/specs/om/open_metrics_spec/)

### Serialization
- [Serde: Serialization framework for Rust](https://serde.rs/) -- feature flags documentation
- [Serde use within a library -- best practices (Rust Forum)](https://users.rust-lang.org/t/serde-use-within-a-library-best-practices/111059) -- feature-gating consensus
- [miniserde: Data structure serialization library](https://docs.rs/miniserde) -- 12x less code than serde, JSON-only, limited type support
- [Rust serialization benchmarks](https://github.com/djkoloski/rust_serialization_benchmark)
- [Rust API Guidelines: Feature flag naming for serde](https://github.com/rust-lang/api-guidelines/discussions/180)