## M0p1 — Embeddable Runtime Skeleton (329 tests)
- TidalDb with builder(), health_check(), close(), and Drop-based cleanup
- TidalDbBuilder fluent API: ephemeral(), with_data_dir(), wal_dir(), cache_dir()
- Config, StorageMode, ConfigError types; Config(ConfigError) variant on LumenError
- Paths: single source of truth for directory layout (wal, items, users, creators, cache)
- TempTidalHome: test isolation helper gated behind #[cfg(test)] / test-utils feature
- 8 integration tests: tests/sandboxed_storage.rs
## M0p2 — Tooling & Diagnostics (349 tests)
- Workspace root Cargo.toml (members: ["tidal", "tidalctl"])
- tidal/build.rs: BUILD_HASH from GIT_HASH with option_env!() fallback to "dev"
- MetricsState: always-compiled Arc-shared atomics (uptime, health_ok)
- MetricsHandle (metrics feature): hand-rolled TcpListener HTTP, zero new deps
- GET /healthz → {"status":"ok","uptime_secs":N}
- GET /metrics → Prometheus text (tidaldb_uptime_seconds, health_ok, info)
- TidalDbBuilder.enable_metrics(addr) starts background metrics thread
- tidalctl binary: status + paths commands, manual std::env::args() parsing
- 7 metrics integration tests, 9 tidalctl CLI tests
## m1p4 Signal Ledger (in-progress)
- SignalLedger: DashMap<(EntityId, SignalTypeId), EntitySignalEntry>, WAL-first writes
- HotSignalState: #[repr(C, align(64))], lock-free CAS decay, out-of-order handling
- BucketedCounter: 60 per-minute + 168 per-hour circular buffers, trigger-based rotation
- CheckpointMeta + serialize/restore: 983-byte fixed records, atomic WriteBatch
- Property tests: running score matches analytical to 1e-6, decay monotonic, non-negative
- Proptest regression: signals/warm.txt
## Documentation and planning
- ROADMAP: m0p1 COMPLETE (329), m0p2 COMPLETE (349), product track milestones
- PRODUCT_ROADMAP: P0-P4 product milestone track (personal briefing beachhead)
- Milestone planning docs: milestone-0 (phases 1-3), milestone-p (phases 1-5)
- docs/research/tidaldb_tooling_and_diagnostics.md
- ARCHITECTURE.md, CLAUDE.md, VISION.md updates
## Site
- Blog: every-platform-builds-the-same-6-systems.mdx (new)
- Blog: why-tidaldb.mdx (updated)
- next.config.ts, layout.tsx, blog/page.tsx updates
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Milestone 0, Phase 2: Tooling & Diagnostics -- Scoping Decisions
Date: 2026-02-20 Author: @tidal-visionary (Spencer Kimball) Status: APPROVED -- ready for implementation
Context
m0p1 (Embeddable Runtime Skeleton) is complete. m1p1-p3 (Type System, WAL, Storage Engine) are also complete. The codebase has:
- `TidalDb` as a thin handle: holds `Config`, has `health_check()`, `close()`, `Drop`
- A full WAL implementation (`WalHandle`, `SegmentWriter`, `CheckpointManager`) that writes segment files (`wal-{seq:020}.seg`) and checkpoint metadata (`checkpoint.meta`) to disk
- No `db.signal()` yet in the public API (deferred to m1p5)
- No WAL writes from the `TidalDb` public API -- the WAL is implemented but not wired to the `TidalDb` facade
- `Config` has no serde derive -- it is a plain struct with no serialization
- Single crate `tidal/`, no workspace
The task documents in phase-2/ were written before m1p2 and m1p3 shipped. They assumed WAL writes would be accessible from the public API. They are not. This scoping document corrects the task definitions to match reality.
1. tidalctl Scope at M0
What tidalctl Can Do
tidalctl is a cold inspector. There is no live process to connect to. The CLI reads files from disk and reports what it finds. This is the correct model for an embeddable database -- there is no server process listening on a port. The inspector reads the same files the embedded library writes.
Commands
tidalctl status --path <dir>
Reads the tidalDB home directory and prints a JSON report:
{
  "version": "0.1.0",
  "build_hash": "29400d4",
  "status": "ok",
  "storage_mode": "persistent",
  "wal": {
    "segments": 3,
    "first_seq": 1,
    "last_segment_seq": 201,
    "checkpoint_seq": 150,
    "checkpoint_ts": "2026-02-20T14:30:00Z",
    "wal_dir_bytes": 49152
  },
  "dirs": {
    "base": "/var/lib/tidaldb",
    "wal": "/var/lib/tidaldb/wal",
    "items": "/var/lib/tidaldb/items",
    "users": "/var/lib/tidaldb/users",
    "creators": "/var/lib/tidaldb/creators",
    "cache": "/var/lib/tidaldb/cache"
  }
}
How each field is computed:
| Field | Source | Notes |
|---|---|---|
| `version` | Compiled into binary via `env!("CARGO_PKG_VERSION")` | Always available |
| `build_hash` | Compiled via `option_env!("GIT_HASH")` or build script | Falls back to `"dev"` |
| `status` | `"ok"` if dir exists, has wal subdir, and at least one segment | `"empty"` if no WAL segments, `"error"` if dir missing |
| `storage_mode` | Inferred: if WAL dir exists with segments, `"persistent"` | No way to know ephemeral from disk -- ephemeral leaves no trace |
| `wal.segments` | `segment::list_segments(&wal_dir)?.len()` | Already implemented in `tidal/src/wal/segment.rs` |
| `wal.first_seq` | First element of `list_segments()` result | 0 if empty |
| `wal.last_segment_seq` | Last element of `list_segments()` result | 0 if empty |
| `wal.checkpoint_seq` | `CheckpointManager::read(&wal_dir)?` | `null` if no checkpoint file |
| `wal.checkpoint_ts` | Same -- the `ts` field, formatted as ISO 8601 | `null` if no checkpoint |
| `wal.wal_dir_bytes` | Sum of file sizes in WAL dir | Filesystem stat |
| `dirs.*` | `Paths::new(base)` expanded | Existence checked per dir |
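The status rules in the table can be sketched with the standard library alone. This is a minimal illustration, not the tidalctl implementation: `WalSummary`, `parse_segment_seq`, and `inspect_home` are hypothetical names, and the directory scan stands in for the real `segment::list_segments()`. Only the `wal-{seq:020}.seg` naming and the ok/empty/error rules come from this document.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical summary of a cold inspection of the WAL directory.
#[derive(Debug, PartialEq)]
pub struct WalSummary {
    pub status: &'static str,
    pub segments: usize,
    pub first_seq: u64,
    pub last_segment_seq: u64,
    pub wal_dir_bytes: u64,
}

/// Parse a `wal-{seq:020}.seg` file name into its sequence number.
fn parse_segment_seq(file_name: &str) -> Option<u64> {
    let digits = file_name.strip_prefix("wal-")?.strip_suffix(".seg")?;
    if digits.len() != 20 {
        return None;
    }
    digits.parse().ok()
}

/// Stand-in for the status computation: "error" if the home is missing,
/// "empty" if there are no segments, "ok" otherwise.
pub fn inspect_home(base: &Path) -> io::Result<WalSummary> {
    let mut summary = WalSummary {
        status: "error",
        segments: 0,
        first_seq: 0,
        last_segment_seq: 0,
        wal_dir_bytes: 0,
    };
    if !base.is_dir() {
        return Ok(summary);
    }
    summary.status = "empty";
    let wal_dir = base.join("wal");
    if !wal_dir.is_dir() {
        return Ok(summary);
    }
    let mut seqs = Vec::new();
    for entry in fs::read_dir(&wal_dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if !meta.is_file() {
            continue;
        }
        // Every regular file counts toward the byte total, segment or not.
        summary.wal_dir_bytes += meta.len();
        if let Some(seq) = entry.file_name().to_str().and_then(parse_segment_seq) {
            seqs.push(seq);
        }
    }
    seqs.sort_unstable();
    summary.segments = seqs.len();
    if let (Some(first), Some(last)) = (seqs.first(), seqs.last()) {
        summary.status = "ok";
        summary.first_seq = *first;
        summary.last_segment_seq = *last;
    }
    Ok(summary)
}
```

Returning `Ok` with `status: "error"` for a missing home keeps the three states in one value. The CLI layer would still map that state to a non-zero exit code.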
No config file is written. tidalctl does not need TidalDb::open() to write a .tidaldb.json config snapshot. The CLI reports what it can observe on the filesystem. The config is a runtime concept -- it exists in memory while the process runs and is not persisted. This is correct for M0. If future milestones need a config file for operational tooling, that is a separate decision.
No live process query. tidalctl reads disk. It does not connect to a running process. No Unix socket, no HTTP, no PID file. This is the right model for an embeddable library.
tidalctl paths --path <dir>
Prints the resolved directory layout:
{
  "base": "/var/lib/tidaldb",
  "wal": "/var/lib/tidaldb/wal",
  "items": "/var/lib/tidaldb/items",
  "users": "/var/lib/tidaldb/users",
  "creators": "/var/lib/tidaldb/creators",
  "cache": "/var/lib/tidaldb/cache",
  "exists": {
    "base": true,
    "wal": true,
    "items": true,
    "users": false,
    "creators": false,
    "cache": false
  }
}
This uses Paths::new(dir) -- the same path helper from m0p1. No duplication.
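As a rough illustration of what the `paths` command has to gather, the sketch below resolves the six directories and checks each for existence. `HomeLayout` is a hypothetical stand-in for the real `Paths` helper, whose actual methods are not shown in this document. Only the subdirectory names (wal, items, users, creators, cache) are taken from the layout above.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical mirror of the directory layout helper (not the real `Paths`).
pub struct HomeLayout {
    base: PathBuf,
}

impl HomeLayout {
    /// Subdirectories in the order they appear in the JSON report.
    pub const SUBDIRS: [&'static str; 5] = ["wal", "items", "users", "creators", "cache"];

    pub fn new(base: impl Into<PathBuf>) -> Self {
        Self { base: base.into() }
    }

    pub fn base(&self) -> &Path {
        &self.base
    }

    pub fn dir(&self, name: &str) -> PathBuf {
        self.base.join(name)
    }

    /// One `(name, path, exists)` row for the base plus each subdirectory.
    pub fn report(&self) -> Vec<(&'static str, PathBuf, bool)> {
        let mut rows = vec![("base", self.base.clone(), self.base.is_dir())];
        for name in Self::SUBDIRS {
            let path = self.dir(name);
            let exists = path.is_dir();
            rows.push((name, path, exists));
        }
        rows
    }
}
```

Keeping the subdirectory names in one constant is what lets the library and the CLI share a single definition of the layout.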
Common Flags
- `--path <dir>` (required): the tidalDB home directory
- `--pretty` (optional): pretty-print JSON output (default: compact)
- `--format json|text` (optional, default `json`): `text` prints human-friendly tabular output
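The flag surface is small enough to show in full. The sketch below parses it with the standard library only, so that it stays self-contained; this document specifies `clap` for tidalctl, so treat this as an illustration of the grammar rather than the planned parser. `Cli`, `Command`, `Format`, and `parse_args` are hypothetical names.

```rust
use std::path::PathBuf;

#[derive(Debug, PartialEq)]
pub enum Command {
    Status,
    Paths,
}

#[derive(Debug, PartialEq)]
pub enum Format {
    Json,
    Text,
}

#[derive(Debug, PartialEq)]
pub struct Cli {
    pub command: Command,
    pub path: PathBuf,
    pub pretty: bool,
    pub format: Format,
}

/// Parse `<command> --path <dir> [--pretty] [--format json|text]`.
/// The caller is expected to have skipped the program name already.
pub fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Cli, String> {
    let mut args = args.into_iter();
    let command = match args.next().as_deref() {
        Some("status") => Command::Status,
        Some("paths") => Command::Paths,
        Some(other) => return Err(format!("unknown command: {other}")),
        None => return Err("missing command".to_string()),
    };
    let (mut path, mut pretty, mut format) = (None, false, Format::Json);
    while let Some(arg) = args.next() {
        match arg.as_str() {
            "--path" => {
                path = Some(PathBuf::from(args.next().ok_or("--path requires a value")?));
            }
            "--pretty" => pretty = true,
            "--format" => {
                format = match args.next().as_deref() {
                    Some("json") => Format::Json,
                    Some("text") => Format::Text,
                    _ => return Err("--format expects json or text".to_string()),
                };
            }
            other => return Err(format!("unknown flag: {other}")),
        }
    }
    // --path is the only required flag.
    let path = path.ok_or("--path is required")?;
    Ok(Cli { command, path, pretty, format })
}
```

A binary would call it as `parse_args(std::env::args().skip(1))` and print the `Err` string to stderr before exiting non-zero.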
What tidalctl Does NOT Do at M0
- No `tidalctl init` (creating a fresh tidalDB home) -- the library creates dirs on open
- No `tidalctl repair` (WAL repair) -- crash recovery is automatic in `WalHandle::open()`
- No `tidalctl compact` (storage compaction) -- no compaction exists yet
- No `tidalctl dump` (WAL event dump) -- useful but not needed for the m0p2 UAT
- No live process communication of any kind
2. Metrics Scope at M0
The Problem with the Original Task
The original task says: "Integration test hits /metrics and asserts counters increment when WAL appends."
At M0, the TidalDb public API has no WAL write path. WalHandle::append() exists but is not wired to TidalDb. There are no signal writes from the public API. A test that asserts "counters increment when WAL appends" cannot be written without either (a) using WAL internals directly or (b) waiting for m1p5.
Corrected Scope
The metrics surface at M0 serves one purpose: prove the plumbing works so later milestones can add counters without redesigning the metrics layer. The counters themselves are scaffolding. The architecture is the deliverable.
Endpoints
GET /healthz
{
  "status": "ok",
  "uptime_seconds": 127.3,
  "version": "0.1.0",
  "build_hash": "29400d4"
}
GET /metrics
Prometheus text exposition format:
# HELP tidaldb_uptime_seconds Seconds since database opened.
# TYPE tidaldb_uptime_seconds gauge
tidaldb_uptime_seconds{partition_id="0"} 127.3
# HELP tidaldb_health_ok Whether the database is healthy. 1 = ok, 0 = degraded.
# TYPE tidaldb_health_ok gauge
tidaldb_health_ok{partition_id="0"} 1
# HELP tidaldb_info Build and version information.
# TYPE tidaldb_info gauge
tidaldb_info{version="0.1.0",build_hash="29400d4",partition_id="0"} 1
Exact Counters at M0
| Counter | Type | Source | Note |
|---|---|---|---|
| `tidaldb_uptime_seconds` | Gauge | `Instant::now() - opened_at` | Computed on read |
| `tidaldb_health_ok` | Gauge | `health_check().is_ok() as u8` | 1 or 0 |
| `tidaldb_info` | Gauge (info-pattern) | Build constants | Static, always 1 |
That is the complete set. Three metrics. No WAL counters, no signal counters, no storage counters. Those arrive in m1p5 when the WAL is wired to the public API.
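With only three gauges, the Prometheus renderer can be a single function. The sketch below reproduces the exposition text shown above. `render_metrics` and its parameters are hypothetical, and the one-decimal uptime format is an assumption chosen to match the sample output.

```rust
use std::fmt::Write;

/// Render the three M0 gauges in Prometheus text exposition format.
/// Hypothetical helper; the label set mirrors the sample in this document.
pub fn render_metrics(uptime_secs: f64, health_ok: bool, version: &str, build_hash: &str) -> String {
    let mut out = String::new();
    let health = u8::from(health_ok);
    // Writing into a String cannot fail, so the results are ignored.
    let _ = writeln!(out, "# HELP tidaldb_uptime_seconds Seconds since database opened.");
    let _ = writeln!(out, "# TYPE tidaldb_uptime_seconds gauge");
    let _ = writeln!(out, "tidaldb_uptime_seconds{{partition_id=\"0\"}} {uptime_secs:.1}");
    let _ = writeln!(
        out,
        "# HELP tidaldb_health_ok Whether the database is healthy. 1 = ok, 0 = degraded."
    );
    let _ = writeln!(out, "# TYPE tidaldb_health_ok gauge");
    let _ = writeln!(out, "tidaldb_health_ok{{partition_id=\"0\"}} {health}");
    let _ = writeln!(out, "# HELP tidaldb_info Build and version information.");
    let _ = writeln!(out, "# TYPE tidaldb_info gauge");
    let _ = writeln!(
        out,
        "tidaldb_info{{version=\"{version}\",build_hash=\"{build_hash}\",partition_id=\"0\"}} 1"
    );
    out
}
```

Under this shape, adding a counter in a later milestone means one more parameter (or struct field) and three more `writeln!` calls.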
What the Integration Test Verifies
The integration test at M0 verifies:
1. `TidalDb::builder().ephemeral().enable_metrics("127.0.0.1:0").open()` succeeds (port 0 = OS assigns)
2. `GET /healthz` returns 200 with `status: "ok"` and `uptime_seconds > 0`
3. `GET /metrics` returns 200 with valid Prometheus text format
4. `tidaldb_uptime_seconds` increases between two reads separated by a sleep
5. `tidaldb_health_ok` is 1
6. `db.close()` stops the metrics server cleanly (no leaked threads, no port still bound)
No WAL assertions. No signal assertions. The test proves the HTTP server starts, serves correct responses, and shuts down cleanly.
What Is Deferred
| Counter | Deferred To | Why |
|---|---|---|
| `tidaldb_wal_seq` | m1p5 | WAL not wired to public API yet |
| `tidaldb_wal_segments` | m1p5 | Same |
| `tidaldb_wal_bytes_total` | m1p5 | Same |
| `tidaldb_signal_writes_total` | m1p5 | `db.signal()` does not exist yet |
| `tidaldb_signal_read_latency` | m1p5 | Signal reads do not exist yet |
| `tidaldb_query_latency` | m2p5 | Query executor does not exist yet |
| `tidaldb_query_count` | m2p5 | Same |
3. HTTP Approach: Sync (Option A)
Chosen: (a) Sync HTTP via tiny_http in a background thread.
Rationale:
1. Minimal deps is an explicit tidalDB requirement. Tokio is 200+ transitive dependencies. `tiny_http` is 5. For an embeddable library, dependency weight matters -- every dep is a compile-time cost and an audit surface for every user.
2. The metrics endpoint does ~2 requests per scrape interval. This is not a high-throughput server. A single-threaded sync HTTP listener on a background thread handles thousands of req/s. Prometheus scrapes every 15-30s. `tiny_http` handles this with zero contention.
3. No Tokio runtime conflict. If the host application uses Tokio (likely for an Axum/Actix service), embedding a second Tokio runtime inside tidalDB creates footguns: nested `block_on`, unexpected thread pools, panic behavior. A background `std::thread` with sync HTTP avoids all of this.
4. The "Future implementor" spec is wrong for M0. The original task assumed tidalDB would share the host's async runtime. That is a leaky abstraction. An embeddable library should not assume or require any particular async runtime. A background thread with sync HTTP is the correct primitive.
5. Feature flag is premature. Option (c) with feature flags adds compile-time complexity for a surface that serves 3 metrics. Ship sync now. If M7 (Production Hardening) needs async HTTP for high-frequency scraping, add it then. The internal `MetricsRegistry` / counter abstraction is the same either way -- only the HTTP transport changes.
Implementation Shape
// Builder API
let db = TidalDb::builder()
.ephemeral()
.enable_metrics("127.0.0.1:9090") // Starts background thread
.open()?;
// Internal: spawns std::thread with tiny_http::Server
// Thread reads from Arc<MetricsState> (uptime, health_ok, build_info)
// Thread exits cleanly when TidalDb::close() sets a shutdown flag
Dependency Addition
# In tidal/Cargo.toml, behind a feature flag:
[features]
metrics = ["dep:tiny_http"]
[dependencies]
tiny_http = { version = "0.12", optional = true }
The metrics feature is opt-in. Users who do not need the HTTP endpoint pay zero compile cost. The MetricsState struct (atomic counters) exists unconditionally -- only the HTTP server is gated.
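The background-thread lifecycle described under Implementation Shape (bind, serve, stop on a shutdown flag) can be sketched as follows. To keep the example dependency-free, `std::net::TcpListener` stands in for `tiny_http`, so this is not the transport chosen above. `MetricsServer`, `start`, `stop`, and `handle` are hypothetical names, and the `/healthz` body omits `version` and `build_hash` for brevity.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Hypothetical handle for the background metrics thread.
pub struct MetricsServer {
    addr: SocketAddr,
    shutdown: Arc<AtomicBool>,
    thread: Option<JoinHandle<()>>,
}

impl MetricsServer {
    /// Bind `addr` (port 0 lets the OS pick) and serve on a background thread.
    pub fn start(addr: &str) -> std::io::Result<Self> {
        let listener = TcpListener::bind(addr)?;
        // Non-blocking accept lets the loop poll the shutdown flag.
        listener.set_nonblocking(true)?;
        let local = listener.local_addr()?;
        let shutdown = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&shutdown);
        let opened_at = Instant::now();
        let thread = thread::spawn(move || {
            while !flag.load(Ordering::Acquire) {
                match listener.accept() {
                    Ok((stream, _)) => {
                        let _ = handle(stream, opened_at);
                    }
                    // No pending connection (or a transient error): back off briefly.
                    Err(_) => thread::sleep(Duration::from_millis(5)),
                }
            }
            // The listener is dropped here, which releases the port.
        });
        Ok(Self { addr: local, shutdown, thread: Some(thread) })
    }

    pub fn addr(&self) -> SocketAddr {
        self.addr
    }

    /// Set the shutdown flag and wait for the thread to exit.
    pub fn stop(&mut self) {
        self.shutdown.store(true, Ordering::Release);
        if let Some(thread) = self.thread.take() {
            let _ = thread.join();
        }
    }
}

fn handle(mut stream: TcpStream, opened_at: Instant) -> std::io::Result<()> {
    // Accepted sockets may inherit non-blocking mode on some platforms.
    stream.set_nonblocking(false)?;
    stream.set_read_timeout(Some(Duration::from_secs(1)))?;
    let mut buf = [0u8; 1024];
    let n = stream.read(&mut buf)?;
    let request = String::from_utf8_lossy(&buf[..n]);
    let (status, body) = if request.starts_with("GET /healthz ") {
        let uptime = opened_at.elapsed().as_secs_f64();
        ("200 OK", format!("{{\"status\":\"ok\",\"uptime_seconds\":{uptime:.3}}}"))
    } else {
        ("404 Not Found", String::from("{\"error\":\"not found\"}"))
    };
    write!(
        stream,
        "HTTP/1.1 {status}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{body}",
        body.len()
    )
}
```

Joining the thread inside `stop()` is what makes the "connection refused after close" check deterministic: by the time `stop()` returns, the listener has been dropped.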
4. Workspace Structure: Workspace with Separate Binary Crate
Confirmed: workspace layout.
Structure
tidalDB/
  Cargo.toml          # [workspace] members = ["tidal", "tidalctl"]
  tidal/
    Cargo.toml        # [package] name = "tidaldb" (the library)
    src/
  tidalctl/
    Cargo.toml        # [package] name = "tidalctl" (the binary)
    src/
      main.rs
Why Workspace, Not [[bin]]
1. Separate dependency trees. tidalctl needs `clap` for argument parsing. The tidaldb library should not carry `clap` as a dependency -- embeddable libraries do not parse CLI arguments. A `[[bin]]` inside `tidal/` would either make `clap` unconditional or require a feature flag, both of which pollute the library.
2. Independent versioning path. tidalctl may version independently from tidaldb. The CLI is a companion tool, not part of the library API surface.
3. `cargo install tidalctl` works naturally. Users install the CLI separately from embedding the library. A workspace member with `[[bin]]` in its own crate gives `cargo install --path tidalctl` the right behavior.
4. Shared dependencies via workspace. `tidalctl` depends on `tidaldb` (for `Paths`, `WalConfig`, segment parsing, checkpoint reading). The workspace ensures they share the same compiled artifacts.
tidalctl Dependencies
[package]
name = "tidalctl"
version = "0.1.0"
edition = "2024"
[dependencies]
tidaldb = { path = "../tidal" }
clap = { version = "4", features = ["derive"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
What This Means for Pre-Commit Hooks and CI
The root Cargo.toml becomes the workspace root. All cargo commands (fmt, clippy, test) need to run from the workspace root or with --workspace. The pre-commit hook currently uses --manifest-path tidal/Cargo.toml -- this must be updated to use the workspace root.
5. Deferred Items
Explicitly NOT in m0p2
| Item | Why Deferred | Arrives In |
|---|---|---|
| Config serialization to disk (`.tidaldb.json`) | tidalctl inspects filesystem artifacts, not config files. Config is a runtime concept. | Revisit in M7 if operational tooling needs it |
| `tidalctl init` command | Library creates dirs on open. A separate init command is redundant. | Possibly never |
| `tidalctl repair` command | Crash recovery is automatic in `WalHandle::open()`. Manual repair is a production concern. | M7 |
| `tidalctl dump` (WAL event dump) | Useful for debugging but not required for m0p2 UAT | M1 or M2 when developers need to debug signal event streams |
| WAL counters in metrics | WAL not wired to public API yet | m1p5 |
| Signal counters in metrics | `db.signal()` does not exist yet | m1p5 |
| Query counters in metrics | Query executor does not exist yet | m2p5 |
| Async HTTP for metrics | Sync HTTP is sufficient for Prometheus scraping | M7 if needed |
| tidalctl connecting to live process | Embeddable library has no server process | Possibly never |
| Serde on `Config` | tidalctl does not read a config file. Config serde is needed only if we write a config file, which is deferred. | When needed |
6. Acceptance Criteria
Task 1: tidalctl CLI
- AC-1: `tidalctl status --path <dir>` against a directory with WAL segments and checkpoint outputs valid JSON containing `version`, `wal.segments`, `wal.checkpoint_seq`, and `dirs.base`
- AC-2: `tidalctl status --path <dir>` against an empty directory (no WAL, no segments) outputs JSON with `status: "empty"` and `wal.segments: 0`
- AC-3: `tidalctl status --path /nonexistent` exits with non-zero status and prints a JSON error object to stderr
- AC-4: `tidalctl paths --path <dir>` outputs JSON with all six directory paths and existence flags matching actual filesystem state
- AC-5: `--pretty` flag produces indented JSON; absence produces compact JSON
- AC-6: `cargo test -p tidalctl` passes with tests for: valid home, empty home, missing home, pretty flag, paths command
Task 2: Metrics Surface
- AC-7: `TidalDb::builder().ephemeral().enable_metrics("127.0.0.1:0").open()` starts a background HTTP thread bound to an OS-assigned port
- AC-8: `GET /healthz` returns HTTP 200 with JSON containing `status: "ok"` and `uptime_seconds > 0`
- AC-9: `GET /metrics` returns HTTP 200 with valid Prometheus text format containing `tidaldb_uptime_seconds`, `tidaldb_health_ok`, and `tidaldb_info`
- AC-10: `tidaldb_uptime_seconds` increases monotonically between reads (verified by sleeping 100ms between two fetches)
- AC-11: `TidalDb::close()` stops the metrics HTTP thread; subsequent connection attempts to the port are refused
- AC-12: Building `tidaldb` without the `metrics` feature flag compiles successfully with no `tiny_http` dependency; `enable_metrics()` method is absent or returns a compile error guiding the user to enable the feature
7. UAT Scenario
Given
A developer has:
- Built the workspace: `cargo build --workspace`
- Created a persistent tidalDB instance that wrote WAL segments:
let home = TempTidalHome::new()?;
let paths = home.paths();
paths.ensure_all()?;
let wal_config = WalConfig { dir: home.path().to_path_buf(), ..Default::default() };
let (wal, _) = WalHandle::open(wal_config)?;
wal.append(event_1)?;
wal.append(event_2)?;
wal.checkpoint(2)?;
wal.shutdown()?;
- Opened a TidalDb with metrics enabled:
let db = TidalDb::builder()
    .ephemeral()
    .enable_metrics("127.0.0.1:0")
    .open()?;
When
1. Run: tidalctl status --path <home.path()>
2. Run: tidalctl paths --path <home.path()>
3. HTTP GET /healthz on the metrics port
4. HTTP GET /metrics on the metrics port
5. Sleep 200ms
6. HTTP GET /metrics again
7. db.close()
8. Attempt HTTP GET /healthz on the metrics port
Then
Step 1: JSON output with wal.segments >= 1, wal.checkpoint_seq == 2,
status == "ok", version matches Cargo.toml
Step 2: JSON output with dirs.wal == "<home>/wal", exists.wal == true
Step 3: HTTP 200, body contains "status":"ok", uptime_seconds > 0
Step 4: HTTP 200, body contains tidaldb_uptime_seconds,
tidaldb_health_ok 1, tidaldb_info{version="0.1.0"...} 1
Step 5: (sleep)
Step 6: tidaldb_uptime_seconds > value from step 4
Step 7: close() returns Ok(())
Step 8: Connection refused (metrics server stopped)
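Step 6 compares two scraped values, so the test needs to pull a gauge out of the exposition text. A minimal sketch of such a helper follows. `parse_gauge` is a hypothetical name, and it assumes samples carry no trailing timestamp, which holds for the output format shown in this document.

```rust
/// Extract the value of the first sample for `name` from Prometheus text.
/// Hypothetical test helper; assumes `name{labels} value` or `name value` lines.
pub fn parse_gauge(exposition: &str, name: &str) -> Option<f64> {
    exposition
        .lines()
        .filter(|line| !line.starts_with('#')) // skip HELP/TYPE comments
        .find_map(|line| {
            let rest = line.strip_prefix(name)?;
            // The name must be followed by a label set or a space, so that
            // `tidaldb_uptime` does not match `tidaldb_uptime_seconds`.
            if !(rest.starts_with('{') || rest.starts_with(' ')) {
                return None;
            }
            rest.rsplit(' ').next()?.parse::<f64>().ok()
        })
}
```

The uptime check then reduces to asserting that the value parsed from the second scrape is greater than the value parsed from the first.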
Pass/Fail Gate
m0p2 is done when:
- `cargo test -p tidalctl` passes
- `cargo test -p tidaldb --features metrics` passes (metrics integration tests)
- `cargo build --workspace` succeeds with no warnings under `clippy -D warnings`
- All 12 acceptance criteria above are verified by automated tests
- tidalctl uses `Paths` from the tidaldb crate (no duplicated layout logic)
Implementation Notes
Build Hash
Use a build script (tidal/build.rs) or option_env!("GIT_HASH") set by CI. For local builds, fall back to "dev". Both tidalctl and the metrics endpoint use the same constant.
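One way the fallback could look, assuming `GIT_HASH` reaches the compiler as an environment variable (set by CI or emitted by the build script): `BUILD_HASH` and `build_hash()` are hypothetical names for the shared constant.

```rust
/// Short git hash baked in at compile time, or "dev" when `GIT_HASH`
/// was not set while compiling (for example, a plain local build).
pub const BUILD_HASH: &str = match option_env!("GIT_HASH") {
    Some(hash) => hash,
    None => "dev",
};

/// Accessor that both the CLI and the metrics endpoint could share.
pub fn build_hash() -> &'static str {
    BUILD_HASH
}
```

Because `option_env!` is evaluated at compile time, the value is fixed per build and costs nothing at runtime.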
Metrics State Sharing
pub(crate) struct MetricsState {
    opened_at: Instant,
    health_ok: AtomicBool,
    // Future milestones add: wal_seq: AtomicU64, signal_writes: AtomicU64, etc.
}
This struct is Arc-shared between TidalDb and the metrics HTTP thread. Adding new counters in future milestones is a one-line addition to this struct plus a one-line addition to the Prometheus renderer. The plumbing is paid for once in m0p2.
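A sketch of how the sharing could work in practice: the database handle keeps one `Arc` and flips the health flag, while the HTTP thread holds a clone and only reads. The fields match the struct above. The constructor, the accessor names, and the memory orderings are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::time::Instant;

pub struct MetricsState {
    opened_at: Instant,
    health_ok: AtomicBool,
}

impl MetricsState {
    /// Create the state at open time; callers clone the Arc for each reader.
    pub fn new() -> Arc<Self> {
        Arc::new(Self {
            opened_at: Instant::now(),
            health_ok: AtomicBool::new(true),
        })
    }

    /// Uptime is computed on read, so no thread has to tick a counter.
    pub fn uptime_seconds(&self) -> f64 {
        self.opened_at.elapsed().as_secs_f64()
    }

    /// Called by the database side when a health check changes outcome.
    pub fn set_health(&self, ok: bool) {
        self.health_ok.store(ok, Ordering::Release);
    }

    /// Called by the metrics thread when rendering a response.
    pub fn health_ok(&self) -> bool {
        self.health_ok.load(Ordering::Acquire)
    }
}
```

Since every field is either immutable after construction or atomic, no lock is needed between the writer and the reader thread.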
tidalctl WAL Inspection
tidalctl depends on tidaldb as a library. It calls:
- `tidaldb::db::Paths::new(dir)` for path resolution
- `tidaldb::wal::segment::list_segments(&wal_dir)` for segment enumeration
- `tidaldb::wal::checkpoint::CheckpointManager::read(&wal_dir)` for checkpoint state
These are all pub functions already. No new internal APIs need to be exposed. The WAL module's public surface is sufficient.
Complexity Estimates
| Task | Complexity | Rationale |
|---|---|---|
| Workspace setup (root Cargo.toml, pre-commit hook update) | S | Mechanical, no design decisions |
| tidalctl CLI (clap, status, paths) | M | Two commands, JSON output, error handling, tests |
| Metrics surface (tiny_http, feature flag, MetricsState, endpoints) | M | Background thread lifecycle, Prometheus format, integration test |
| Build hash plumbing | S | Build script or env var, shared constant |
Total phase complexity: M (two M tasks + two S tasks, all independent after workspace setup)