jordan 4f076c927d feat: M0p1 runtime skeleton, M0p2 tooling & diagnostics, m1p4 signal ledger

## M0p1 — Embeddable Runtime Skeleton (329 tests)
- TidalDb with builder(), health_check(), close(), and Drop-based cleanup
- TidalDbBuilder fluent API: ephemeral(), with_data_dir(), wal_dir(), cache_dir()
- Config, StorageMode, ConfigError types; Config(ConfigError) variant on LumenError
- Paths: single source of truth for directory layout (wal, items, users, creators, cache)
- TempTidalHome: test isolation helper gated behind #[cfg(test)] / test-utils feature
- 8 integration tests: tests/sandboxed_storage.rs

## M0p2 — Tooling & Diagnostics (349 tests)
- Workspace root Cargo.toml (members: ["tidal", "tidalctl"])
- tidal/build.rs: BUILD_HASH from GIT_HASH with option_env!() fallback to "dev"
- MetricsState: always-compiled Arc-shared atomics (uptime, health_ok)
- MetricsHandle (metrics feature): hand-rolled TcpListener HTTP, zero new deps
  - GET /healthz → {"status":"ok","uptime_secs":N}
  - GET /metrics → Prometheus text (tidaldb_uptime_seconds, health_ok, info)
- TidalDbBuilder.enable_metrics(addr) starts background metrics thread
- tidalctl binary: status + paths commands, manual std::env::args() parsing
- 7 metrics integration tests, 9 tidalctl CLI tests

## m1p4 Signal Ledger (in-progress)
- SignalLedger: DashMap<(EntityId, SignalTypeId), EntitySignalEntry>, WAL-first writes
- HotSignalState: #[repr(C, align(64))], lock-free CAS decay, out-of-order handling
- BucketedCounter: 60 per-minute + 168 per-hour circular buffers, trigger-based rotation
- CheckpointMeta + serialize/restore: 983-byte fixed records, atomic WriteBatch
- Property tests: running score matches analytical to 1e-6, decay monotonic, non-negative
- Proptest regression: signals/warm.txt

## Documentation and planning
- ROADMAP: m0p1 COMPLETE (329), m0p2 COMPLETE (349), product track milestones
- PRODUCT_ROADMAP: P0-P4 product milestone track (personal briefing beachhead)
- Milestone planning docs: milestone-0 (phases 1-3), milestone-p (phases 1-5)
- docs/research/tidaldb_tooling_and_diagnostics.md
- ARCHITECTURE.md, CLAUDE.md, VISION.md updates

## Site
- Blog: every-platform-builds-the-same-6-systems.mdx (new)
- Blog: why-tidaldb.mdx (updated)
- next.config.ts, layout.tsx, blog/page.tsx updates

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-20 20:32:00 -07:00


Content Strategy

Blog posts mapped to the tidalDB roadmap. Each entry identifies the moment worth writing about, the thesis that makes it shareable, and the type of post it demands.

The audience is engineers who have built or are currently maintaining recommendation and discovery systems -- the people running the 6-system stack this database replaces. They know what Kafka lag feels like at 3am. They know why cache invalidation bugs in the ranking pipeline are the ones that never get root-caused. They will smell marketing language from the first sentence. Respect that.

Publishing Principles

Write when something is true, not when something is scheduled. A blog post published the day a milestone passes its UAT is credible. A blog post published before the code works is fiction.

One insight per post. The reader should leave with a single idea they did not have before. If the post contains two insights, it is two posts.

Code proves claims. Every technical assertion is backed by a code example or a benchmark number from the actual codebase. Not a prototype. Not a plan. The shipped code.

The title is the thesis. If the title does not work as a standalone sentence that makes an engineer stop scrolling, the post is not ready.

Content Calendar

Pre-Implementation (Now)

These posts can be written before the engine is feature-complete. They draw on the vision, architecture research, and the problem space -- not on shipped code.

Post 1: "Every content platform builds the same 6 systems from scratch"

Type: Vision / Problem Statement
Thesis: The Elasticsearch + Redis + Kafka + feature store + vector DB + ranking service stack is not an architecture. It is scar tissue. The seams between these systems are where correctness dies.
Source material: VISION.md, thoughts.md (Part VI)
When to publish: Any time. This post defines the problem and does not depend on implementation progress.
Why it matters: This is the foundational narrative. Every subsequent post assumes the reader understands this problem. It also serves as the litmus test for whether the audience cares -- if this post does not resonate, the subsequent ones will not either.
Structure: Problem statement. The 6 systems named and indicted. The seams enumerated (stale signals, ETL lag, cache invalidation, operational burden). The thesis: ranking is not a feature, it is a primitive. End with the one-query vision, not with a product pitch.

Milestone 1: Signal Engine

M1 proves that temporal signals with O(1) decay, velocity, and windowed aggregation work as a database primitive. This is the most technically interesting milestone for blog content because the math is elegant and the performance numbers are dramatic.

Post 2: "Running decay scores are O(1) -- here is the math"

Type: Technical Deep Dive
Roadmap phase: m1p4 (Signal Ledger) completion
Thesis: The forward-decay formula S(t) = S(t_prev) * exp(-lambda * dt) + weight eliminates raw-event scanning at query time. Three exp() calls on write, one on read. 15 nanoseconds per entity. Every platform computing trending_score = views / (age + 2)^1.8 in application code is doing O(N) work that should be O(1).
Source material: docs/research/tidaldb_signal_ledger.md, ARCHITECTURE.md (Signal System section), m1p4 task docs
When to publish: After m1p4 passes UAT with benchmark numbers in hand.
Code to include: The EntitySignalState struct. The forward-decay write path. The out-of-order event correction. Benchmark output showing 200-entity scoring pass under 5 microseconds.
Why it matters: This is the post that demonstrates tidalDB is not vaporware. The math is verifiable. The benchmarks are reproducible. Engineers who have implemented trending scores in Redis will immediately understand the value.
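
A minimal sketch of the write and read paths the post could open with, assuming one decay rate per signal and timestamps in seconds. RunningScore, record, read, and analytical are hypothetical names, not the shipped EntitySignalState, and the sketch does not try to reproduce the three-exp() write path or the benchmark numbers.

```rust
/// Running decay score for one (entity, signal type) pair.
/// Hypothetical layout for illustration only.
#[derive(Debug, Clone, Copy)]
pub struct RunningScore {
    score: f64,   // S(t_prev): score as of `last_ts`
    last_ts: f64, // t_prev, in seconds
    lambda: f64,  // decay rate per second
}

impl RunningScore {
    pub fn new(lambda: f64) -> Self {
        Self { score: 0.0, last_ts: 0.0, lambda }
    }

    /// Write path: S(t) = S(t_prev) * exp(-lambda * dt) + weight.
    pub fn record(&mut self, ts: f64, weight: f64) {
        if ts >= self.last_ts {
            let dt = ts - self.last_ts;
            self.score = self.score * (-self.lambda * dt).exp() + weight;
            self.last_ts = ts;
        } else {
            // Out-of-order event: decay the late weight forward to `last_ts`
            // instead of rewinding the running score.
            let age = self.last_ts - ts;
            self.score += weight * (-self.lambda * age).exp();
        }
    }

    /// Read path: decay the stored score to `now` without scanning raw events.
    pub fn read(&self, now: f64) -> f64 {
        let dt = (now - self.last_ts).max(0.0);
        self.score * (-self.lambda * dt).exp()
    }
}

/// O(N) reference over raw (timestamp, weight) events, used only to check
/// the O(1) path.
pub fn analytical(events: &[(f64, f64)], lambda: f64, now: f64) -> f64 {
    events
        .iter()
        .map(|&(ts, w)| w * (-lambda * (now - ts)).exp())
        .sum()
}
```

Feeding the same events to record (in any arrival order) and to analytical should agree to within floating-point error, which is the property the post's property tests would demonstrate.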

Post 3: "What three databases taught us before we wrote a line of code"

Type: Architecture Decision Record
Roadmap phase: m1p1-m1p3 completion (the foundation phases)
Thesis: We studied Engram (cognitive memory), Citadel (append-only logging), and StemeDB (knowledge graph) -- three purpose-built databases in the same codebase -- and stole their best patterns. WAL-first durability from Citadel. Cache-line aligned hot structs from Engram. Subject-prefix key encoding from StemeDB. Background materialization from StemeDB. Here is what converged and what the gaps taught us.
Source material: thoughts.md (all six parts), CODING_GUIDELINES.md
When to publish: After m1p3 (Storage Engine) is complete. The patterns referenced are already implemented.
Code to include: Key encoding format. Cache-line aligned struct. Group commit writer. Side-by-side comparison of the pattern in the source database and in tidalDB.
Why it matters: Engineers respect builders who study prior art. This post establishes technical credibility and shows the architectural foundation is grounded in real patterns, not invented from scratch.
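
One of the patterns named above, the cache-line aligned hot struct, could be illustrated along these lines. This is a sketch under stated assumptions: a 64-byte cache line, and hypothetical field and type names (HotState, score_bits, last_ts_ms, event_count) that are not taken from Engram or tidalDB.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hot per-entity state padded to one 64-byte cache line, so two entities
/// never share a line and writer threads do not false-share.
#[repr(C, align(64))]
pub struct HotState {
    score_bits: AtomicU64,  // f64 score stored as raw bits
    last_ts_ms: AtomicU64,  // timestamp of the last update, milliseconds
    event_count: AtomicU64, // number of events applied
}

// Compile-time check that the struct occupies exactly one cache line.
const _: () = assert!(std::mem::size_of::<HotState>() == 64);

impl HotState {
    pub fn new() -> Self {
        Self {
            score_bits: AtomicU64::new(0.0_f64.to_bits()),
            last_ts_ms: AtomicU64::new(0),
            event_count: AtomicU64::new(0),
        }
    }

    /// Lock-free add: retry the compare-exchange until this update wins.
    pub fn add(&self, weight: f64, ts_ms: u64) {
        let mut current = self.score_bits.load(Ordering::Relaxed);
        loop {
            let next = (f64::from_bits(current) + weight).to_bits();
            match self.score_bits.compare_exchange_weak(
                current,
                next,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => break,
                Err(observed) => current = observed,
            }
        }
        // Keep the newest timestamp seen, regardless of arrival order.
        self.last_ts_ms.fetch_max(ts_ms, Ordering::Relaxed);
        self.event_count.fetch_add(1, Ordering::Relaxed);
    }

    pub fn score(&self) -> f64 {
        f64::from_bits(self.score_bits.load(Ordering::Acquire))
    }

    pub fn events(&self) -> u64 {
        self.event_count.load(Ordering::Relaxed)
    }
}
```

The three fields are updated independently here, so a reader can observe a new score with an old count; a real implementation would need to decide whether that is acceptable.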

Post 4: "Signals written 100ms ago. The query sees them now."

Type: Devlog / Milestone Announcement
Roadmap phase: m1p5 (Entity CRUD and Signal Write API) -- M1 complete
Thesis: Milestone 1 is done. A developer can open a tidalDB instance, define signal types with decay rates and windows, write 10,000 engagement events, and read back decay-correct scores that match analytical computation to 6 decimal places. Including after a crash. The UAT scenario passes.
Source material: The m1p5 integration test, benchmark results, git log for the M1 period
When to publish: The day M1 UAT passes.
Code to include: The full UAT scenario (or a clean excerpt). TidalDB::open() with schema. Signal write. Decay score read. Before/after crash recovery.
Why it matters: This is the first "it works" post. It converts skeptics from "interesting idea" to "this is real." The UAT code is the proof.

Milestone 2: Ranked Retrieval

M2 proves that a single query can retrieve, filter, score, and enforce diversity over live signals. This is where tidalDB stops being a signal engine and starts being a database.

Post 5: "One query. Six systems. Under 50 milliseconds."

Type: Technical Deep Dive / Announcement
Roadmap phase: m2p5 (RETRIEVE Query Executor) -- M2 complete
Thesis: RETRIEVE items USING PROFILE trending DIVERSITY max_per_creator:1 LIMIT 25 executes in under 50ms on 10K items. It retrieves candidates via ANN, filters by metadata, scores using live decay signals and velocity, enforces diversity, and returns a ranked list. That is what Elasticsearch + Redis + a ranking service produce. It is one query here.
Source material: m2p5 integration test, benchmark results, the dependency DAG showing how all M2 phases compose
When to publish: After M2 UAT passes.
Code to include: The RETRIEVE query. The ranked result with signal snapshots. The trending profile definition. A before/after signal burst showing the ranking change.
Why it matters: This is the money post. The one-query thesis is no longer a vision document -- it is a benchmark. Engineers who operate the 6-system stack will immediately understand what this eliminates.

Post 6: "Diversity enforcement in 3 microseconds"

Type: Technical Deep Dive
Roadmap phase: m2p4 (Diversity Enforcement)
Thesis: "No more than 2 items per creator" does not belong in your API layer. It belongs in the query. tidalDB enforces diversity as a post-scoring reordering pass -- it does not reduce result count. The greedy selection algorithm runs in under 3 microseconds for 200 candidates.
Source material: m2p4 task docs, VISION.md (diversity section), benchmark results
When to publish: After m2p4 is complete.
Code to include: The DiversitySpec. The greedy selector. A concrete example showing reordering (creator A dominates pre-diversity, balanced post-diversity). Benchmark numbers.
Why it matters: Every team building a feed implements diversity in the API layer. Showing that it belongs in the database -- and costs 3 microseconds -- is a strong differentiator. This is the kind of post that gets shared in Slack channels.
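
A sketch of what a greedy selector for this post might look like. It assumes "does not reduce result count" means over-cap items are demoted to the tail rather than dropped; Candidate and enforce_diversity are hypothetical names, not the shipped DiversitySpec.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub item_id: u64,
    pub creator_id: u64,
    pub score: f64,
}

/// Greedy diversity pass over candidates already sorted by descending score.
/// Items that would exceed `max_per_creator` are deferred and appended in
/// their original order, so the output has the same length as the input.
pub fn enforce_diversity(ranked: &[Candidate], max_per_creator: usize) -> Vec<Candidate> {
    let mut per_creator: HashMap<u64, usize> = HashMap::new();
    let mut selected = Vec::with_capacity(ranked.len());
    let mut deferred = Vec::new();

    for c in ranked {
        let seen = per_creator.entry(c.creator_id).or_insert(0);
        if *seen < max_per_creator {
            *seen += 1;
            selected.push(c.clone());
        } else {
            deferred.push(c.clone());
        }
    }

    selected.extend(deferred);
    selected
}
```

The pass is a single scan with one hash lookup per candidate, which is why a microsecond-scale budget for 200 candidates is plausible; the actual number has to come from the benchmark.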

Post 7: "Ranking profiles are data, not code"

Type: Architecture Decision Record
Roadmap phase: m2p3 (Ranking Profile Engine)
Thesis: Changing how content is ranked should not require a code change, a deployment, or a restart. tidalDB treats ranking profiles as versioned schema declarations. Define a profile. Name it. Swap it at query time. A/B test two profiles by name. The database executes the entire pipeline.
Source material: m2p3 task docs, API.md (ranking profiles section), VISION.md
When to publish: After m2p3 is complete.
Code to include: A trending profile definition. A for_you profile definition. The same RETRIEVE query with two different profile names producing different orderings. The profile versioning API.
Why it matters: This reframes ranking as a database concern. Engineers who maintain ranking services as separate microservices will recognize the operational simplification.
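
To make "profiles are data" concrete, a sketch in which a profile is a named record of weights and the same candidates are ranked under two names. Everything here is an assumption for illustration: the Signals fields, the linear scoring form, and the ProfileRegistry type are not the API described in API.md.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy)]
pub struct Signals {
    pub decay_score: f64,
    pub velocity: f64,
    pub similarity: f64,
}

/// A ranking profile is plain data: a version and a set of weights.
#[derive(Debug, Clone)]
pub struct RankingProfile {
    pub version: u32,
    pub w_decay: f64,
    pub w_velocity: f64,
    pub w_similarity: f64,
}

impl RankingProfile {
    pub fn score(&self, s: &Signals) -> f64 {
        self.w_decay * s.decay_score + self.w_velocity * s.velocity + self.w_similarity * s.similarity
    }
}

#[derive(Default)]
pub struct ProfileRegistry {
    profiles: HashMap<String, RankingProfile>,
}

impl ProfileRegistry {
    /// Defining or replacing a profile is a data write, not a deployment.
    pub fn define(&mut self, name: &str, profile: RankingProfile) {
        self.profiles.insert(name.to_string(), profile);
    }

    /// Rank items under the named profile; `None` if the name is unknown.
    pub fn rank(&self, name: &str, items: &[(u64, Signals)]) -> Option<Vec<u64>> {
        let profile = self.profiles.get(name)?;
        let mut scored: Vec<(u64, f64)> = items
            .iter()
            .map(|(id, s)| (*id, profile.score(s)))
            .collect();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        Some(scored.into_iter().map(|(id, _)| id).collect())
    }
}
```

An A/B test in this model is two calls to rank with different names over the same candidate set.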

Milestone 3: Personalized Ranking

M3 is where the feedback loop closes. Signal writes update the user's preference vector, the creator's interaction weight, and the item's signal ledger -- atomically, in one write. The "For You" query works.

Post 8: "The feedback loop that closes in one write"

Type: Technical Deep Dive
Roadmap phase: m3p2 (Feedback Loop) completion
Thesis: When a user likes an item, the database atomically updates: the item's like count, the user-to-creator interaction weight, and the user's preference vector (shifted toward the item's embedding). One db.signal("like", ...) call. No Kafka consumer to lag. No feature store to sync. No cache to invalidate. The next ranking query -- even 100ms later -- reflects the change.
Source material: m3p2 task docs, ARCHITECTURE.md (Write Path section), SEQUENCE.md
When to publish: After m3p2 passes UAT.
Code to include: The signal write. The 10-step atomic update path. A before/after query showing the preference shift. The property test that proves hidden items and blocked creators never surface.
Why it matters: The closed feedback loop is the core architectural thesis of tidalDB. This post proves it works. It is the strongest argument against the 6-system stack, because the stack's primary failure mode is feedback lag.
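
The three updates named in the thesis can be sketched as one method call. This is a single-threaded illustration, not the 10-step path: exclusive access through &mut self stands in for atomicity, the preference shift is assumed to be a linear interpolation with a hypothetical rate alpha, and FeedbackState is an invented name.

```rust
use std::collections::HashMap;

#[derive(Default)]
pub struct FeedbackState {
    like_counts: HashMap<u64, u64>,            // item_id -> likes
    creator_weights: HashMap<(u64, u64), f32>, // (user_id, creator_id) -> weight
    preferences: HashMap<u64, Vec<f32>>,       // user_id -> preference vector
}

impl FeedbackState {
    pub fn set_preference(&mut self, user_id: u64, pref: Vec<f32>) {
        self.preferences.insert(user_id, pref);
    }

    /// One write updates the item, the user-to-creator edge, and the user.
    /// `&mut self` guarantees no reader observes a partial update.
    pub fn like(&mut self, user_id: u64, item_id: u64, creator_id: u64, embedding: &[f32], alpha: f32) {
        *self.like_counts.entry(item_id).or_insert(0) += 1;
        *self.creator_weights.entry((user_id, creator_id)).or_insert(0.0) += 1.0;

        // Shift the preference vector toward the item's embedding:
        // pref = (1 - alpha) * pref + alpha * embedding.
        let pref = self
            .preferences
            .entry(user_id)
            .or_insert_with(|| vec![0.0; embedding.len()]);
        for (p, x) in pref.iter_mut().zip(embedding) {
            *p = (1.0 - alpha) * *p + alpha * *x;
        }
    }

    pub fn like_count(&self, item_id: u64) -> u64 {
        self.like_counts.get(&item_id).copied().unwrap_or(0)
    }

    pub fn creator_weight(&self, user_id: u64, creator_id: u64) -> f32 {
        self.creator_weights.get(&(user_id, creator_id)).copied().unwrap_or(0.0)
    }

    pub fn preference(&self, user_id: u64) -> Option<&[f32]> {
        self.preferences.get(&user_id).map(|v| v.as_slice())
    }
}
```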

Post 9: "Negative signals are equal citizens"

Type: Architecture Decision Record
Roadmap phase: m3p2 (Feedback Loop)
Thesis: A skip is not the absence of a like. It is data. tidalDB treats negative signals -- skips, hides, blocks, "not interested" -- with the same precision and immediacy as positive signals. A skip within 3 seconds is a strong quality signal. A hide creates a permanent exclusion. A block removes all of a creator's content from all future queries. These are not afterthoughts. They are first-class signal types with their own decay rates, velocity, and ranking weight.
Source material: VISION.md (negative signals section), USE_CASES.md (UC-01 feedback), m3p2 task docs
When to publish: After m3p2 is complete. Can be bundled with or separated from Post 8.
Code to include: Signal type definitions for skip, hide, block. The penalty clause in a ranking profile. The property test: 10,000 random signal sequences never produce a result where a hidden item or blocked creator appears.
Why it matters: Most recommendation systems handle negative feedback as an afterthought -- a manual "not interested" button that writes to a separate blocklist. tidalDB's approach is architecturally different and engineers building these systems will recognize the improvement immediately.
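
The exclusion half of this post (hides and blocks) reduces to a filter that every query applies. A sketch for a single user's exclusions, with hypothetical names; skip handling, decay rates, and the penalty clause are out of scope here.

```rust
use std::collections::HashSet;

/// One user's permanent exclusions.
#[derive(Default)]
pub struct Exclusions {
    hidden_items: HashSet<u64>,
    blocked_creators: HashSet<u64>,
}

impl Exclusions {
    /// A hide excludes one item from all future results.
    pub fn hide(&mut self, item_id: u64) {
        self.hidden_items.insert(item_id);
    }

    /// A block excludes every item by that creator.
    pub fn block(&mut self, creator_id: u64) {
        self.blocked_creators.insert(creator_id);
    }

    pub fn allows(&self, item_id: u64, creator_id: u64) -> bool {
        !self.hidden_items.contains(&item_id) && !self.blocked_creators.contains(&creator_id)
    }

    /// Filter (item_id, creator_id) candidates, preserving their order.
    pub fn filter(&self, candidates: &[(u64, u64)]) -> Vec<(u64, u64)> {
        candidates
            .iter()
            .copied()
            .filter(|&(item, creator)| self.allows(item, creator))
            .collect()
    }
}
```

The property test described above would generate random sequences of hide, block, and query operations and assert that no returned pair fails allows.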

Post 10: "Cold start without application logic"

Type: Technical Deep Dive
Roadmap phase: m3p3 (Personalized Ranking Profiles)
Thesis: New items with no signals get an exploration budget. New users with no history get a sensible default from population-level signals. The application does not manage either. The exploration rate decays as signals accumulate. This is declared per ranking profile, not implemented in application code.
Source material: m3p3 task docs, VISION.md (cold start section)
When to publish: After m3p3 is complete.
Code to include: The exploration budget in a profile definition. A new item appearing in a for_you feed despite having zero signals. The decay of exploration as signals arrive.
Why it matters: Cold start is the problem everyone hacks around and no one solves cleanly. Showing a database-native solution is a strong differentiator.
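
One way the decaying exploration budget could be expressed, as a sketch only. The halving schedule, the additive combination with the base score, and the parameter names budget and half_life_signals are all assumptions; the source states only that exploration decays as signals accumulate.

```rust
/// Exploration bonus that halves every `half_life_signals` observed signals.
/// A brand-new item (zero signals) receives the full budget.
pub fn exploration_bonus(budget: f64, signal_count: u64, half_life_signals: f64) -> f64 {
    budget * 0.5_f64.powf(signal_count as f64 / half_life_signals)
}

/// Final score: the profile's base score plus whatever exploration remains.
pub fn final_score(base_score: f64, budget: f64, signal_count: u64, half_life_signals: f64) -> f64 {
    base_score + exploration_bonus(budget, signal_count, half_life_signals)
}
```

With a budget of 1.0 and a half-life of 50 signals, an item with zero signals scores 1.0 from exploration alone, which is how it can appear in a for_you feed before any engagement exists.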

Milestone 4: Hybrid Search

M4 merges full-text search with semantic similarity and signal-ranked results. Search and retrieval become the same system.

Post 11: "Search and ranking are the same system"

Type: Technical Deep Dive / Announcement
Roadmap phase: m4p3 (SEARCH Query Executor) -- M4 complete
Thesis: SEARCH items QUERY "jazz piano" VECTOR [embedding] FOR USER @user_42 USING PROFILE search LIMIT 20 combines BM25 text relevance, semantic vector similarity, and user personalization in one ranked list. The fusion uses Reciprocal Rank Fusion. Personalization re-ranks within the relevant set -- an irrelevant result never surfaces because the user likes the creator. This is one query. It replaces Elasticsearch + a vector DB + a ranking service.
Source material: m4p3 integration test, docs/research/tantivy.md, ARCHITECTURE.md (Text Search, Hybrid Fusion)
When to publish: After M4 UAT passes.
Code to include: The SEARCH query. The RRF formula. A comparison: the same query with BM25 only, ANN only, and fused. The personalization overlay changing result order for two different users.
Why it matters: Search is the most complex surface and the one engineers know best. Showing that text search, semantic search, and ranking collapse into one query is the most concrete demonstration of the 6-to-1 thesis.
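
Reciprocal Rank Fusion scores each document as the sum over rankers of 1 / (k + rank), with rank starting at 1. A minimal sketch, assuming each ranker contributes an ordered list of document ids; the function name rrf is hypothetical, and k = 60 in the usage below is the conventional default rather than a value taken from tidalDB.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document ids with Reciprocal Rank Fusion.
/// Returns (doc_id, fused_score) sorted by descending score, ties broken
/// by ascending id so the output is deterministic.
pub fn rrf(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (i, &doc) in ranking.iter().enumerate() {
            let rank = (i + 1) as f64; // ranks are 1-based
            *scores.entry(doc).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

Because RRF uses ranks rather than raw scores, BM25 and vector similarity do not need to be normalized onto a common scale before fusion.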

Post 12: "Tantivy as a derived index, not a source of truth"

Type: Architecture Decision Record
Roadmap phase: m4p1 (Tantivy Integration)
Thesis: The entity store is the source of truth. Tantivy is a materialized view. If the index is corrupted, it can be rebuilt from the entity store. Crash recovery replays from a stored sequence number. Consistency is DB-primary, not two-phase commit. This is simpler, deterministic, and the right model for an embedded database.
Source material: docs/research/tantivy.md, m4p1 task docs, ARCHITECTURE.md
When to publish: After m4p1 is complete.
Code to include: The outbox pattern. The crash recovery sequence number. The background indexer. The consistency model.
Why it matters: This is a useful architectural pattern beyond tidalDB. Engineers building systems with derived indexes will find this directly applicable.
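
The consistency model can be sketched without Tantivy at all: an ordered change log is the source of truth, and the index records the last sequence number it applied. Names here (EntityStore, TextIndex, catch_up) are hypothetical, and an in-memory term map stands in for the real index.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Source of truth: an ordered change log keyed by sequence number.
#[derive(Default)]
pub struct EntityStore {
    next_seq: u64,
    log: BTreeMap<u64, (u64, String)>, // seq -> (entity_id, text)
}

impl EntityStore {
    pub fn put(&mut self, entity_id: u64, text: &str) -> u64 {
        self.next_seq += 1;
        self.log.insert(self.next_seq, (entity_id, text.to_string()));
        self.next_seq
    }
}

/// Derived index: term -> entity ids, plus the last sequence number applied.
#[derive(Default, Debug, PartialEq)]
pub struct TextIndex {
    applied_seq: u64,
    postings: BTreeMap<String, BTreeSet<u64>>,
}

impl TextIndex {
    /// Apply every change after `applied_seq`. After a crash the index
    /// resumes from its stored sequence number; a fresh index replays all.
    pub fn catch_up(&mut self, store: &EntityStore) {
        for (&seq, (entity_id, text)) in store.log.range(self.applied_seq + 1..) {
            // Drop stale postings for this entity before re-indexing it.
            for ids in self.postings.values_mut() {
                ids.remove(entity_id);
            }
            for term in text.split_whitespace() {
                self.postings
                    .entry(term.to_lowercase())
                    .or_default()
                    .insert(*entity_id);
            }
            self.applied_seq = seq;
        }
        self.postings.retain(|_, ids| !ids.is_empty());
    }

    pub fn search(&self, term: &str) -> Vec<u64> {
        self.postings
            .get(&term.to_lowercase())
            .map(|ids| ids.iter().copied().collect())
            .unwrap_or_default()
    }
}
```

The property worth showing in the post is that an index which caught up incrementally is equal to one rebuilt from sequence zero.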

Milestone 5: Full Surface Coverage

M5 completes all 14 use cases. The content here shifts from "how does the engine work" to "what can you build with it."

Post 13: "14 use cases, one query engine"

Type: Devlog / Announcement
Roadmap phase: M5 complete
Thesis: For You feeds, trending, search, following, related content, notifications, hidden gems, controversial, live content, creator discovery, user library, cohort-scoped trending -- every surface a content platform needs, driven by the same query primitives. The application specifies profiles, filters, and context. The database executes ranking.
Source material: USE_CASES.md, M5 UAT results
When to publish: After M5 UAT passes.
Code to include: A curated selection of 4-5 queries spanning different surfaces (for_you, trending, search, hidden_gems, cohort_trending). Each with a brief setup and result.
Why it matters: This is the completeness post. It demonstrates that the database is not a toy or a prototype -- it handles the full surface area of a real content platform.

Post 14: "Cohort-scoped trending"

Type: Technical Deep Dive
Roadmap phase: M5, likely Phase 3 (Social Graph and Collaborative Filtering)
Thesis: "What's trending" means different things to different audiences. A 22-year-old in Tokyo and a 45-year-old in Texas see different trending pages -- not because of personalization (individual preference), but because different content is genuinely trending within their respective audience segments. tidalDB maintains per-cohort signal aggregation using RoaringBitmaps for O(1) membership testing and sparse fan-out for storage efficiency.
Source material: USE_CASES.md (UC-15), ARCHITECTURE.md (Cohort-scoped aggregation), API.md (Cohort Definitions)
When to publish: After cohort-scoped trending passes integration tests.
Code to include: Cohort definition. Three-layer query (global trending, cohort trending, search within cohort trending). The fan-out write path. Storage cost analysis.
Why it matters: Cohort-scoped trending is a differentiator. Most systems compute trending globally. Slicing by audience segment is a product feature that usually requires a separate analytics pipeline. tidalDB does it natively.
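
A sketch of the fan-out write path, under loud assumptions: a std HashSet stands in for the RoaringBitmap membership test, raw view counts stand in for decayed signals, and CohortTrending with its methods is a hypothetical API. Counters are sparse in the sense that a (cohort, item) entry exists only once a member of that cohort has touched the item.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Default)]
pub struct CohortTrending {
    members: HashMap<String, HashSet<u64>>,         // cohort -> user ids
    global: HashMap<u64, u64>,                      // item -> views
    per_cohort: HashMap<String, HashMap<u64, u64>>, // cohort -> item -> views
}

impl CohortTrending {
    pub fn add_member(&mut self, cohort: &str, user_id: u64) {
        self.members.entry(cohort.to_string()).or_default().insert(user_id);
    }

    /// Write path: bump the global counter, then fan out to every cohort
    /// the user belongs to.
    pub fn record_view(&mut self, user_id: u64, item_id: u64) {
        *self.global.entry(item_id).or_insert(0) += 1;
        for (cohort, users) in &self.members {
            if users.contains(&user_id) {
                *self
                    .per_cohort
                    .entry(cohort.clone())
                    .or_default()
                    .entry(item_id)
                    .or_insert(0) += 1;
            }
        }
    }

    /// Top items globally (`None`) or within one cohort (`Some(name)`).
    pub fn trending(&self, cohort: Option<&str>, limit: usize) -> Vec<u64> {
        let counts = match cohort {
            None => Some(&self.global),
            Some(name) => self.per_cohort.get(name),
        };
        let mut items = counts
            .map(|c| c.iter().map(|(&i, &n)| (i, n)).collect::<Vec<(u64, u64)>>())
            .unwrap_or_default();
        items.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
        items.into_iter().take(limit).map(|(i, _)| i).collect()
    }
}
```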

Milestone 6: Production Hardening

M6 is about trust. The content shifts from "what it does" to "why you can trust it."

Post 15: "Kill it at any point. It comes back correct."

Type: Technical Deep Dive
Roadmap phase: m6p1 (Crash Recovery Hardening)
Thesis: We injected faults at every write-path stage. Recovery time is under 30 seconds at 1M items. WAL replay produces state identical to pre-crash. No phantom items, no lost signals, no inconsistent aggregates. The WAL is the source of truth. Everything else is derived state that can be rebuilt.
Source material: m6p1 test results, fault injection methodology
When to publish: After m6p1 passes.
Code to include: The crash simulation test. Recovery time measurements. The WAL checkpoint and replay sequence.
Why it matters: Trust is the precondition for adoption. Engineers will not embed a database they cannot crash-test. This post is the trust credential.
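
The "kill it at any point" claim can be demonstrated in miniature with a log that tolerates a torn tail. This sketch assumes a hypothetical record layout of length, checksum, payload, and uses FNV-1a as a stand-in for whatever checksum the real WAL uses; it is not tidalDB's on-disk format.

```rust
/// FNV-1a over the payload. A simple stand-in for a real CRC.
fn checksum(bytes: &[u8]) -> u32 {
    let mut h: u32 = 0x811c_9dc5;
    for &b in bytes {
        h ^= b as u32;
        h = h.wrapping_mul(0x0100_0193);
    }
    h
}

fn read_u32(b: &[u8]) -> u32 {
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

/// Append one record: [payload_len: u32 LE][checksum: u32 LE][payload].
pub fn append(wal: &mut Vec<u8>, payload: &[u8]) {
    wal.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    wal.extend_from_slice(&checksum(payload).to_le_bytes());
    wal.extend_from_slice(payload);
}

/// Replay every complete, checksum-valid record in order. Stops at the
/// first torn or corrupt record, so a crash mid-write loses at most the
/// record that was being written.
pub fn replay(wal: &[u8]) -> Vec<Vec<u8>> {
    let mut records = Vec::new();
    let mut pos = 0;
    while pos + 8 <= wal.len() {
        let len = read_u32(&wal[pos..pos + 4]) as usize;
        let sum = read_u32(&wal[pos + 4..pos + 8]);
        let start = pos + 8;
        let end = match start.checked_add(len) {
            Some(e) if e <= wal.len() => e,
            _ => break, // payload truncated by the crash
        };
        let payload = &wal[start..end];
        if checksum(payload) != sum {
            break; // corrupt record
        }
        records.push(payload.to_vec());
        pos = end;
    }
    records
}
```

A crash simulation then amounts to truncating the log at every byte offset and asserting that replay always returns a prefix of the records that were written.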

Post 16: "Graceful degradation: less precise, never wrong"

Type: Architecture Decision Record
Roadmap phase: m6p2 (Graceful Degradation)
Thesis: Under 3x overload, tidalDB does not return errors. It reduces candidate set size, uses coarser aggregates, skips diversity enforcement, and serves from materialized cache -- in that order. Results are less precise but never incorrect. The degradation order is documented and configurable.
Source material: m6p2 task docs, ARCHITECTURE.md (Graceful degradation)
When to publish: After m6p2 is complete.
Code to include: The degradation cascade. Load test results at 1x, 2x, 3x. Latency distribution at each level.
Why it matters: This is how production systems should behave. Engineers who have been paged for "ranking service returned 500" will appreciate a system that degrades gracefully instead.
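
The cascade could be modeled as an ordered enum where each level keeps every earlier concession. The order follows the thesis; the load thresholds, the candidate limits (200 and 50), and the names Level, Plan, level_for, and plan_for are hypothetical placeholders, since the source says only that the order is documented and configurable.

```rust
/// Degradation levels in the order they are applied as load increases.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Full,
    ReducedCandidates,
    CoarseAggregates,
    NoDiversity,
    CachedOnly,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Plan {
    pub candidate_limit: usize,
    pub coarse_aggregates: bool,
    pub enforce_diversity: bool,
    pub serve_from_cache: bool,
}

/// Map a load factor (1.0 = rated capacity) to a level.
/// Thresholds are illustrative only.
pub fn level_for(load: f64) -> Level {
    if load <= 1.0 {
        Level::Full
    } else if load <= 1.5 {
        Level::ReducedCandidates
    } else if load <= 2.0 {
        Level::CoarseAggregates
    } else if load <= 2.5 {
        Level::NoDiversity
    } else {
        Level::CachedOnly
    }
}

/// Each level includes all concessions of the levels before it.
pub fn plan_for(level: Level) -> Plan {
    Plan {
        candidate_limit: if level >= Level::ReducedCandidates { 50 } else { 200 },
        coarse_aggregates: level >= Level::CoarseAggregates,
        enforce_diversity: level < Level::NoDiversity,
        serve_from_cache: level >= Level::CachedOnly,
    }
}
```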

Ongoing / Anytime

These posts are not tied to specific milestones. They can be written whenever the insight is clear.

"Why not SQL"

Type: Architecture Decision Record
Thesis: The custom query language exists because SQL cannot express ranking semantics without losing optimization opportunities. FOR USER means "load this user's preference vector and relationship graph." USING PROFILE means "apply this named scoring function." DIVERSITY means "enforce post-ranking constraints." These are not WHERE clauses.
Source material: thoughts.md (Part II.4), VISION.md (query examples), API.md
When to publish: Any time after M1. Best paired with M2 when the RETRIEVE query is functional.

"Why we chose fjall over RocksDB (for now)"

Type: Architecture Decision Record
Thesis: Pure Rust, #![forbid(unsafe_code)], fast compile times, trait-abstracted for swap. fjall is not the fastest LSM-tree. It is the right one for an embeddable database built by a small team that values correctness over raw throughput, with a trait boundary that makes the decision reversible.
Source material: thoughts.md (Part V.9), m1p3 task docs, CODING_GUIDELINES.md
When to publish: After m1p3 is complete (already shipped). This post is ready now.
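
The "trait boundary that makes the decision reversible" might be shown with a sketch like the following. The KvStore trait and MemStore backend are hypothetical and deliberately avoid fjall's or RocksDB's actual APIs; the point is only that callers depend on the trait, not on the engine.

```rust
use std::collections::BTreeMap;

/// Minimal storage boundary. Engine code is written against this trait,
/// so the backing LSM-tree can be swapped without touching callers.
pub trait KvStore {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    /// All (key, value) pairs whose key starts with `prefix`, in key order.
    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
}

/// In-memory backend, useful for tests and as a reference implementation.
#[derive(Default)]
pub struct MemStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KvStore for MemStore {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }

    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        self.map
            .range(prefix.to_vec()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}

/// Example caller that is generic over the backend.
pub fn count_under_prefix<S: KvStore>(store: &S, prefix: &[u8]) -> usize {
    store.scan_prefix(prefix).len()
}
```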

"USearch, not from scratch"

Type: Architecture Decision Record
Thesis: Correct, high-performance, concurrent HNSW with SIMD distance computation is 6-12 months of dedicated work. We are not a vector database company. USearch runs in ScyllaDB, ClickHouse, and DuckDB. The FFI boundary is thin. Build what differentiates you. Borrow what does not.
Source material: docs/research/ann_for_tidaldb.md, m2p1 task docs, ARCHITECTURE.md (Vector Index)
When to publish: After m2p1 (USearch integration) is complete.

Post Cadence

Milestone	Posts	Approximate Pace
Pre-implementation	1	Publish when ready
M1 (Signal Engine)	2-3	One per phase completion
M2 (Ranked Retrieval)	3	One per major phase
M3 (Personalized Ranking)	2-3	One per key insight
M4 (Hybrid Search)	2	One per major phase
M5 (Full Coverage)	2	At milestone boundaries
M6 (Production Hardening)	2	At milestone boundaries
Ongoing / ADRs	2-3	When the decision is fresh

Target: 16-20 posts across the full roadmap. Not more. Each one earns its place.

What Not to Write

Progress updates that are changelogs. ("We merged 47 PRs this month.") Nobody cares.
Posts that announce intent without shipped code. ("We plan to build...") Ship first.
Posts with titles that are labels. ("Q1 Update," "Phase 3 Complete.") The title is the thesis.
Posts that explain what a concept is without showing why the reader should care. ("Windowed aggregation is...") Start with the problem.
Posts that use "we're excited to announce." You are not excited. You are precise.

Reference: Roadmap to Post Mapping

Roadmap Phase	Post #	Title (Working)
Pre-implementation	1	Every content platform builds the same 6 systems from scratch
m1p1-m1p3 (Foundation)	3	What three databases taught us before we wrote a line of code
m1p4 (Signal Ledger)	2	Running decay scores are O(1) -- here is the math
m1p5 (M1 Complete)	4	Signals written 100ms ago. The query sees them now.
m2p3 (Ranking Profiles)	7	Ranking profiles are data, not code
m2p4 (Diversity)	6	Diversity enforcement in 3 microseconds
m2p5 (M2 Complete)	5	One query. Six systems. Under 50 milliseconds.
m3p2 (Feedback Loop)	8, 9	The feedback loop that closes in one write / Negative signals are equal citizens
m3p3 (Personalized Profiles)	10	Cold start without application logic
m4p1 (Tantivy)	12	Tantivy as a derived index, not a source of truth
m4p3 (M4 Complete)	11	Search and ranking are the same system
M5 Complete	13, 14	14 use cases, one query engine / Cohort-scoped trending
m6p1 (Crash Recovery)	15	Kill it at any point. It comes back correct.
m6p2 (Graceful Degradation)	16	Graceful degradation: less precise, never wrong
Any time	--	Why not SQL / Why fjall / USearch, not from scratch

Immediate Next Actions

Write Post 1 ("Every content platform builds the same 6 systems from scratch") -- this can be published now. It establishes the problem and the audience. It does not depend on shipped code.
Write Post 3 ("What three databases taught us") -- m1p1 through m1p3 are complete. The source material (thoughts.md) is rich. The code exists.
Prepare Post 2 outline ("Running decay scores are O(1)") -- the research doc exists, the math is decided, but the implementation is not yet shipped (m1p4 is next). Write the outline. Wait for the benchmarks.
