# 00 -- Architecture Overview

**Status:** Draft
**Author:** tidalDB Engineering
**Date:** 2026-02-20
**Purpose:** Show how the 14 specs connect. The forest before the trees.

---

## 1. Core Insight

The WAL is the single event stream. Everything else is a materialized view.

The signal ledger is a materialized view over signal events. The user preference vector is a materialized view over signal events weighted by item embeddings. The relationship weight between a user and a creator is a materialized view over interaction signals. The cohort-scoped trending counter is a materialized view over signal events filtered by user attributes.

This is not a metaphor. The WAL (spec 01) records every mutation: signal events, entity writes, relationship writes, schema changes. After a record is durable in the WAL, downstream materializers consume it and update their derived state. If any materializer's state is lost, it is rebuilt by restoring the last checkpoint and replaying the WAL from that point. The WAL is truth. Everything else is cache.

The existing specs already embody this pattern: spec 03 Section 3 says "immutable events, mutable aggregates," spec 10 Section 2 shows a single signal event updating six subsystems, and spec 01 says "the WAL is the source of truth; everything else is derived state." This overview names the pattern explicitly and shows how the 14 specs are instances of it.

---

## 2. System Diagram

```
                   APPLICATION
                        |
  db.signal() / db.write_item() / db.retrieve()
                        |
            +-----------+-----------+
            |                       |
       WRITE PATH               READ PATH
            |                       |
            v                       v
   +------------------+   +-------------------+
   |       WAL        |   |   QUERY ENGINE    |
   | (append-only log)|   |    (spec 08)      |
   |     spec 01      |   |                   |
   +--------+---------+   +----+---------+----+
            |                  |         |
            v                  |    reads from
+------------------------+     |         |
| MATERIALIZER REGISTRY  |     | +-------+-------+-------+-------+
| fans out each event to |     | |       |       |       |       |
| all registered         |     | |       |       |       |       |
| materializers          |     | v       v       v       v       v
+--+----+----+----+------+     | Signal  Entity  Rel.    User    Cohort
   |    |    |    |            | Ledger  Store   Graph   State   Counters
   v    v    v    v            | (hot/   (redb)  (redb)  (redb)  (fjall)
 +----+----+----+------+       | warm)
 | G  | U  | R  | C    |       |
 | l  | s  | e  | o    |       +--reads from--+
 | o  | e  | l  | h    |                      |
 | b  | r  | a  | o    |            +---------+---------+---------+
 | a  | P  | t  | r    |            |         |         |         |
 | l  | r  | i  | t    |            v         v         v         v
 |    | e  | o  |      |        +-------+ +-------+ +--------+ +-------+
 | S  | f  | n  | S    |        |Tantivy| |USearch| |Roaring | |Cohort |
 | i  |    | s  | i    |        | Text  | |Vector | |Bitmap  | |Rollup |
 | g  | V  | h  | g    |        | Index | | Index | |Filters | |Tables |
 | n  | e  | i  | n    |        |spec 06| |spec 07| |spec 08 | |spec 05|
 | a  | c  | p  | a    |        +-------+ +-------+ +--------+ +-------+
 | l  | t  |    | l    |
 |    | o  | W  |      |
 | M  | r  | e  | M    |
 | a  |    | i  | a    |
 | t  | M  | g  | t    |
 | .  | a  | h  | .    |
 |    | t  | t  |      |
 |    | .  |    |      |
 |    |    | M  |      |
 |    |    | a  |      |
 |    |    | t  |      |
 |    |    | .  |      |
 +----+----+----+------+
```

Write path: event arrives, WAL appends, materializer registry fans out to all registered materializers. Each materializer updates its scoped state.

Read path: query engine reads from materialized state (signal ledger for scores, entity store for metadata, indexes for retrieval, cohort counters for scoped trending). No materializer is invoked on the read path. Reads never touch the WAL.

---

## 3. Materializer Trait

The materializer is the core abstraction boundary between the event stream and derived state. Every piece of state that a query reads -- signal scores, preference vectors, relationship weights, cohort counters, user-item state -- is produced by a materializer.

```rust
use std::io::{Read, Write};

// `CohortId`, `UserId`, `EntityId`, `WalEvent`, and the `Result` alias are
// assumed to be defined elsewhere in the crate.

/// The scope at which a materializer operates.
/// Determines what subset of events it processes and what key space it writes to.
pub enum Scope {
    /// All events. Global signal counters, global trending.
    Global,
    /// Events from users in a specific cohort. Cohort-scoped trending.
    Cohort(CohortId),
    /// Events involving a specific user. Preference vectors, user-item state.
    User(UserId),
    /// Events between two entities. Interaction weights, engagement affinity.
    Relationship(EntityId, EntityId),
}

/// A materializer consumes WAL events and produces derived state.
///
/// Implementations:
///   GlobalSignalMaterializer       -- hot-tier decay scores, windowed counters (M1)
///   UserPreferenceMaterializer     -- preference vector shifts (M3)
///   RelationshipWeightMaterializer -- interaction weights, engagement affinity (M3)
///   CohortSignalMaterializer       -- dimensional rollup counters (M4)
///   UserStateMaterializer          -- seen/liked/saved/hidden bitmaps (M3)
pub trait Materializer: Send + Sync {
    /// Process a single WAL event. Called by the registry for every event
    /// after WAL durability is confirmed.
    ///
    /// Implementations must be idempotent: replaying the same event twice
    /// must produce the same state as processing it once.
    fn on_event(&self, event: &WalEvent) -> Result<()>;

    /// Write current state to a checkpoint. Called periodically by the
    /// background checkpoint task. After a successful checkpoint, the WAL
    /// segments before the checkpoint sequence number are eligible for cleanup.
    fn checkpoint(&self, writer: &mut dyn Write) -> Result<()>;

    /// Restore state from a checkpoint. Called during crash recovery
    /// before WAL replay begins. After restore, the materializer's state
    /// matches the checkpoint. WAL events after the checkpoint sequence
    /// number are then replayed via on_event().
    fn restore(&self, reader: &mut dyn Read) -> Result<()>;
}

/// The registry holds all active materializers and fans out events.
pub struct MaterializerRegistry {
    materializers: Vec<Box<dyn Materializer>>,
}

impl MaterializerRegistry {
    /// Fan out a single event to all registered materializers.
    /// Called after WAL append confirms durability.
    pub fn on_event(&self, event: &WalEvent) -> Result<()> {
        for m in &self.materializers {
            m.on_event(event)?;
        }
        Ok(())
    }
}
```

The trait is small by design. Three methods. Each materializer owns its scope, its storage, and its invariants. The registry is a fan-out mechanism, nothing more.
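
To make the idempotence and checkpoint/restore contract concrete, here is a minimal, self-contained sketch of a toy materializer. Everything in it is an assumption made for illustration, not tidalDB's actual code: the pared-down `WalEvent`, the `io::Result` alias, the `CountMaterializer` type, and the checkpoint byte layout are all hypothetical. It also assumes events reach a materializer in sequence order and that sequence numbers start at 1, so a single "highest applied seqno" watermark is enough to make replay a no-op.

```rust
use std::collections::HashMap;
use std::io::{self, Read, Write};
use std::sync::Mutex;

type Result<T> = io::Result<T>; // stand-in for the crate's Result alias

/// Hypothetical, pared-down WAL record: only the fields this sketch needs.
pub struct WalEvent {
    pub seqno: u64,
    pub item_id: u64,
}

pub trait Materializer: Send + Sync {
    fn on_event(&self, event: &WalEvent) -> Result<()>;
    fn checkpoint(&self, writer: &mut dyn Write) -> Result<()>;
    fn restore(&self, reader: &mut dyn Read) -> Result<()>;
}

#[derive(Default)]
struct State {
    last_seqno: u64,           // highest seqno already applied (0 = nothing yet)
    counts: HashMap<u64, u64>, // item_id -> number of events seen
}

/// Toy materializer (hypothetical): counts events per item.
#[derive(Default)]
pub struct CountMaterializer {
    state: Mutex<State>,
}

impl CountMaterializer {
    pub fn count(&self, item_id: u64) -> u64 {
        self.state.lock().unwrap().counts.get(&item_id).copied().unwrap_or(0)
    }
}

fn read_u64(reader: &mut dyn Read) -> Result<u64> {
    let mut buf = [0u8; 8];
    reader.read_exact(&mut buf)?;
    Ok(u64::from_le_bytes(buf))
}

impl Materializer for CountMaterializer {
    fn on_event(&self, event: &WalEvent) -> Result<()> {
        let mut s = self.state.lock().unwrap();
        // Idempotence guard: anything at or below the watermark was already applied.
        if event.seqno <= s.last_seqno {
            return Ok(());
        }
        *s.counts.entry(event.item_id).or_insert(0) += 1;
        s.last_seqno = event.seqno;
        Ok(())
    }

    fn checkpoint(&self, writer: &mut dyn Write) -> Result<()> {
        // Assumed layout: watermark, entry count, then (item_id, count) pairs.
        let s = self.state.lock().unwrap();
        writer.write_all(&s.last_seqno.to_le_bytes())?;
        writer.write_all(&(s.counts.len() as u64).to_le_bytes())?;
        for (item, count) in &s.counts {
            writer.write_all(&item.to_le_bytes())?;
            writer.write_all(&count.to_le_bytes())?;
        }
        Ok(())
    }

    fn restore(&self, reader: &mut dyn Read) -> Result<()> {
        let last_seqno = read_u64(reader)?;
        let entries = read_u64(reader)?;
        let mut counts = HashMap::new();
        for _ in 0..entries {
            let item = read_u64(reader)?;
            counts.insert(item, read_u64(reader)?);
        }
        *self.state.lock().unwrap() = State { last_seqno, counts };
        Ok(())
    }
}

fn main() -> Result<()> {
    let live = CountMaterializer::default();
    live.on_event(&WalEvent { seqno: 1, item_id: 7 })?;
    live.on_event(&WalEvent { seqno: 2, item_id: 7 })?;
    let mut snapshot = Vec::new();
    live.checkpoint(&mut snapshot)?;

    // Simulated crash recovery: restore the checkpoint, then replay the WAL tail.
    let recovered = CountMaterializer::default();
    recovered.restore(&mut snapshot.as_slice())?;
    recovered.on_event(&WalEvent { seqno: 2, item_id: 7 })?; // already applied: ignored
    recovered.on_event(&WalEvent { seqno: 3, item_id: 7 })?;
    println!("count = {}", recovered.count(7)); // prints "count = 3"
    Ok(())
}
```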
This is an S-complexity addition in M1 that prevents an M-complexity refactor later.

The `GlobalSignalMaterializer` is the first implementation. `UserPreferenceMaterializer` and `RelationshipWeightMaterializer` arrive in M3. `CohortSignalMaterializer` arrives in M4. The trait boundary means each can be developed and tested in isolation.

---

## 4. Spec Map

Every spec has a role in the data flow. Some define what goes into the event stream. Some define materializers that consume the stream. Some define how the query engine reads materialized state. Some are cross-cutting.

| Spec | Name | Role in Data Flow | Category |
|------|------|-------------------|----------|
| 01 | Storage Engine | WAL format, segment lifecycle, crash recovery, dual-backend (fjall + redb) | **Event Stream** |
| 02 | Entity Model | Entity write events in WAL, entity store as materialized state in redb | **Event Stream + Materialized View** |
| 03 | Signal System | Signal events in WAL, three-tier signal ledger as materialized view, cohort dimensional rollups as materialized views | **Materialized View** (primary) |
| 04 | Relationships | Relationship write events in WAL, edge store as materialized state, implicit edges updated by signal materializers | **Event Stream + Materialized View** |
| 05 | Cohorts | Cohort definitions, membership resolution, scoped signal counters as materialized views | **Materialized View** |
| 06 | Text Retrieval | Tantivy index as materialized view over entity text fields, queried at read time | **Query-Time Index** |
| 07 | Vector Retrieval | USearch HNSW index as materialized view over entity embeddings, queried at read time | **Query-Time Index** |
| 08 | Query Engine | Orchestrator that reads from all materialized state, never writes | **Query-Time Reader** |
| 09 | Ranking/Scoring | Scoring pipeline, profiles, diversity -- reads signals, relationships, vectors at query time | **Query-Time Reader** |
| 10 | Feedback Loop | Defines the semantic mapping from signal events to materializer updates (which signal shifts the preference vector in which direction, which signal increments which relationship weight) | **Materializer Orchestration** |
| 11 | Schema | Definitions for entities, signals, profiles, cohorts -- the contract that all materializers and the query engine validate against | **Cross-Cutting** |
| 12 | Cold Start | Exploration budgets, proxy scoring, cohort priors -- query-time logic for entities with no signal history | **Query-Time Reader** |
| 13 | Concurrency | Lock-free hot path, group commit, thread model, memory ordering -- the mechanism that makes concurrent materialization and querying safe | **Cross-Cutting** |
| 14 | Scale Architecture | Partition keys, capacity model, single-node ceiling -- design constraints that influence WAL format, key encoding, and materializer scope | **Cross-Cutting** |

The pattern: specs 01-05 define the write side (event stream + materialized views). Specs 06-07 define query-time indexes (also materialized views, but read-only from the query engine's perspective). Specs 08, 09, and 12 define the read side. Spec 10 is the bridge between write and read. Specs 11, 13, and 14 are cross-cutting concerns.

---

## 5. Signal Write Walkthrough

Trace one event through the system: **user U likes item I** (where item I was created by creator C).

```
Application calls: db.signal(Signal { kind: "like", item: "item_I", user: "user_U" })

Step 1: DEDUPLICATION CHECK                         ~100 ns
  BLAKE3(like, item_I, user_U, timestamp_trunc_1s) -> hash
  Check bloom filter -> PASS (not a duplicate)

Step 2: WAL APPEND                                  ~50 us
  Serialize to WAL record:
    type: 0x01 (SignalEvent)
    payload: { kind: "like", item_id: I, user_id: U, weight: 1.0, ts: now }
  Write to current WAL segment, fsync (batched)
  Assign sequence number: seqno 47291

  *** DURABILITY BOUNDARY ***
  Event is now durable. All subsequent updates are derived state.

Step 3: MATERIALIZER REGISTRY FAN-OUT
  registry.on_event(WalEvent { seqno: 47291, type: SignalEvent, ... })
  Invokes each registered materializer:

  3a: GlobalSignalMaterializer                      ~40 ns
    Read item I's HotSignalState for signal "like"
    CAS update: decay_score += weight * exp(-lambda * dt)
    Atomic increment: warm tier minute bucket counter
    Atomic increment: all_time_count
    Result: item I's like score, velocity, windowed counts updated

  3b: UserPreferenceMaterializer                    ~10 us
    Load user U's preference vector (1536D)
    Load item I's content embedding (1536D)
    Signal polarity: positive (like)
    Shift: pref_new = normalize(pref_old + lr * item_embedding)
    Write back updated preference vector
    Result: user U's taste profile reflects this like

  3c: RelationshipWeightMaterializer                ~5 us
    Resolve item I -> creator C
    Load interaction_weight(U, C), apply time decay, add delta (+0.15)
    Clamp to [0.0, 1.0], write back
    Load engagement_affinity(U, I), update similarly
    Result: U's affinity for creator C increased

  3d: CohortSignalMaterializer                      ~20 us
    Load user U's cached cohort memberships: {region:US, age:18-24, lang:en}
    Increment global counter for item I / like / current_hour
    Increment region:US counter for item I / like / current_hour
    Increment age:18-24 counter for item I / like / current_hour
    Increment lang:en counter for item I / like / current_hour
    Check behavioral segments: U is in "jazz_fans" -> increment that counter
    Result: cohort-scoped trending reflects this engagement

  3e: UserStateMaterializer                         ~5 us
    Set bitmap: user_U has "liked" item_I
    Result: future queries with FILTER liked include this pair

RETURN Ok(())                                       Total: < 100 us p50
```

One API call. One WAL append. Five materializer updates. The next ranking query -- even 1ms later -- sees all of this. No ETL. No Kafka. No stale data.

---

## 6. Query Walkthrough

Trace a composed query through the system:

```
RETRIEVE items
  FOR USER @u1
  USING PROFILE for_you
  FILTER unseen
  WITHIN TRENDING COHORT locale:US, age:18-24
  DIVERSITY max_per_creator:2
  LIMIT 50
```

This is a three-layer query: personalized ranking within cohort-scoped trending.

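
Two clauses of the query above, `FILTER unseen` and `DIVERSITY max_per_creator:2` with `LIMIT`, are mechanical enough to sketch in a few lines. The sketch below is illustrative only and rests on assumptions: `Candidate`, `filter_unseen`, and `enforce_diversity` are hypothetical names, a `HashSet` stands in for the RoaringBitmap of seen items, and scores are taken as already computed by the ranking profile.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical scored candidate; the engine's real types are not shown in this overview.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub item: u64,
    pub creator: u64,
    pub score: f64,
}

/// FILTER unseen: drop every candidate the user has already seen.
/// A HashSet stands in for the per-user bitmap of seen item IDs.
pub fn filter_unseen(candidates: Vec<Candidate>, seen: &HashSet<u64>) -> Vec<Candidate> {
    candidates.into_iter().filter(|c| !seen.contains(&c.item)).collect()
}

/// DIVERSITY max_per_creator + LIMIT: sort by score descending, then scan greedily.
/// An item whose creator already holds `max_per_creator` slots is demoted to the
/// end of the list rather than dropped; the list is then cut to `limit`.
pub fn enforce_diversity(
    mut candidates: Vec<Candidate>,
    max_per_creator: usize,
    limit: usize,
) -> Vec<Candidate> {
    candidates.sort_by(|a, b| b.score.total_cmp(&a.score));
    let mut slots: HashMap<u64, usize> = HashMap::new();
    let mut kept = Vec::new();
    let mut demoted = Vec::new();
    for c in candidates {
        let used = slots.entry(c.creator).or_insert(0);
        if *used < max_per_creator {
            *used += 1;
            kept.push(c);
        } else {
            demoted.push(c);
        }
    }
    kept.extend(demoted);
    kept.truncate(limit);
    kept
}

fn main() {
    let c = |item, creator, score| Candidate { item, creator, score };
    // Creator 10 has three strong items; creator 20 has two weaker ones.
    let candidates = vec![
        c(1, 10, 0.9),
        c(2, 10, 0.8),
        c(3, 10, 0.7),
        c(4, 20, 0.6),
        c(5, 20, 0.5),
    ];
    let seen = HashSet::from([5u64]); // the user has already seen item 5

    let unseen = filter_unseen(candidates, &seen);
    let page = enforce_diversity(unseen, 2, 3);
    let ids: Vec<u64> = page.iter().map(|c| c.item).collect();
    println!("{:?}", ids); // prints [1, 2, 4]
}
```

Item 3 outscores item 4 but is demoted because creator 10 already holds two slots, so item 4 takes the third position. The step-by-step trace that follows places these two stages in the full pipeline.
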
```
Step 1: PARSE AND VALIDATE                          ~1 us
  Resolve profile "for_you" from schema -> ProfileDef v3
  Resolve cohort predicates: locale:US AND age:18-24
  Validate user @u1 exists
  Validate all filter fields exist in schema

Step 2: COHORT RESOLUTION                           ~2 ms
  Resolve cohort "locale:US AND age:18-24" to a CohortId
  This is a Level 3 (composite) cohort: intersection of
    Level 1 dimension region:US (dimension_id=1, cohort_value=0x0001)
    Level 1 dimension age_group:18-24 (dimension_id=3, cohort_value=0x0002)
  No pre-computed counters for the composite.
  Plan: fetch Level 1 counters for both dimensions, estimate intersection
        using independence assumption:
        count(US AND 18-24) ~ count(US) * count(18-24) / count(global)

Step 3: CANDIDATE GENERATION FROM COHORT TRENDING   ~15 ms
  Read cohort_signals CF for dimension region:US, signal "view",
    window: last 24 hours (24 hour-buckets)
  Read cohort_signals CF for dimension age_group:18-24, signal "view",
    window: last 24 hours
  For each item: compute estimated cohort velocity using independence assumption
  Sort by estimated velocity, take top 500 candidates
  This is the "what is trending for US users aged 18-24" candidate set

Step 4: FILTER APPLICATION                          ~3 ms
  Load RoaringBitmap for user @u1's "seen" items
  Remove seen items from candidate set
  Apply any metadata filters (none beyond "unseen" in this query)
  Surviving candidates: ~400

Step 5: SIGNAL LOADING                              ~2 ms
  For each surviving candidate, load from hot tier:
    like.decay_score, view.velocity(24h), share.decay_score
  For user @u1, load:
    preference_vector (1536D)
    interaction_weight(u1, candidate.creator) for each candidate's creator
  All reads are lock-free atomic loads from memory-resident state

Step 6: SCORING VIA RANKING PROFILE                 ~5 ms
  Profile "for_you" scoring pipeline (9 stages):
    1. Base score: cohort velocity (from step 3)
    2. Personalization boost: cosine_sim(u1.preference_vector, item.embedding)
    3. Relationship boost: interaction_weight(u1, item.creator)
    4. Signal boosts: like.decay_score, share.decay_score
    5. Recency curve: time_decay(item.created_at)
    6. Penalties: low completion rate, flagged content
    7. Quality gates: minimum signal thresholds
    8. Cold start: exploration budget injection (10% of slots)
    9. Final score composition: weighted sum with normalization

Step 7: DIVERSITY ENFORCEMENT                       ~1 ms
  Sort by score descending
  Enforce max_per_creator:2
  Greedy scan: for each item, if creator already has 2 items in result,
    demote to end of list
  Take top 50 after diversity enforcement

Step 8: RESULT ASSEMBLY                             ~1 ms
  Load entity metadata for 50 items from redb
  Build cursor for pagination (encodes last item's score + id)
  Return Results { items, cursor, total_estimate }

TOTAL LATENCY: ~30 ms (within 50 ms budget)
```

---

## 7. Three-Layer Trending

Global trending, cohort-scoped trending, and search-within-cohort-trending are not three different systems. They are three scopes applied to the same materializer architecture, using the same math.

**The math:**

Velocity is the rate of change of a windowed signal count. For a 24-hour window:

```
velocity(item, signal, window) = count(item, signal, window) / window_duration
```

Acceleration (rising detection) is the rate of change of velocity:

```
acceleration = velocity(current_window) - velocity(previous_window)
```

This formula is identical at every scope. The only thing that changes is which counter you read.

**Layer 1: Global trending**

```
RETRIEVE items
  USING PROFILE trending
  WINDOW 24h
  LIMIT 25
```

Reads from: `GlobalSignalMaterializer` counters. Level 0 in the dimensional hierarchy. One counter per item per signal per hour bucket. Sum the last 24 buckets, divide by 24h. Sort by velocity. Done.

**Layer 2: Cohort-scoped trending**

```
RETRIEVE items
  USING PROFILE trending
  COHORT locale:US, age:18-24
  WINDOW 24h
  LIMIT 25
```

Reads from: `CohortSignalMaterializer` counters. Level 1 dimensions region:US and age_group:18-24.
For a composite cohort (Level 3), estimate the intersection using the independence assumption. Same velocity formula, different counters. The math does not change. The scope does.

**Layer 3: Search within cohort-scoped trending**

```
SEARCH items
  QUERY "piano tutorial"
  WITHIN TRENDING COHORT locale:US, age:18-24
  WINDOW 24h
  LIMIT 20
```

Step 1: Generate the cohort-trending candidate set (Layer 2).
Step 2: Run text search (Tantivy BM25) restricted to that candidate set.
Step 3: Fuse cohort velocity score with BM25 relevance score.

Same materializer output, filtered by a text query. The architecture makes this composable because each layer reads from the same materialized state. The query planner recognizes `WITHIN TRENDING COHORT ...` as "generate candidates from cohort velocity, then filter by text match." No special-case code. No separate trending service. One materializer hierarchy, three query shapes.

---

## 8. Code Module Map

```
tidal/src/
  lib.rs                    # TidalDB struct, public API, lifecycle

  wal/                      # Spec 01: Write-ahead log
    mod.rs                  # WAL reader/writer, segment management
    record.rs               # WalEvent enum, serialization
    segment.rs              # Segment file lifecycle, preallocate, seal
    recovery.rs             # Crash recovery: scan, validate, replay

  materializer/             # Architecture overview: core abstraction
    mod.rs                  # Materializer trait, Scope enum
    registry.rs             # MaterializerRegistry, fan-out, checkpoint coordination

  storage/                  # Spec 01: Dual-backend storage
    mod.rs                  # StorageEngine trait
    fjall.rs                # fjall backend: WAL, cold-tier signals, cohort counters
    redb.rs                 # redb backend: entities, relationships, user state
    keys.rs                 # Key encoding (partition-ready prefixes)

  entity/                   # Spec 02: Items, Users, Creators
    mod.rs                  # Entity trait, EntityKind enum
    item.rs                 # Item struct, metadata fields, lifecycle
    user.rs                 # User struct, attributes, computed fields
    creator.rs              # Creator struct, catalog embedding

  signal/                   # Spec 03: Signal system
    mod.rs                  # SignalDef, Decay enum, Window enum
    hot.rs                  # HotSignalState (cache-line aligned, atomic)
    warm.rs                 # WarmSignalState (per-minute buckets, SWAG)
    cold.rs                 # Cold-tier event storage, hourly/daily rollups
    velocity.rs             # Velocity and acceleration computation
    decay.rs                # Exponential/linear decay formulas
    global_mat.rs           # GlobalSignalMaterializer (impl Materializer)
    cohort_mat.rs           # CohortSignalMaterializer (impl Materializer)
    user_pref_mat.rs        # UserPreferenceMaterializer (impl Materializer)
    user_state_mat.rs       # UserStateMaterializer (impl Materializer)

  relationship/             # Spec 04: Edges between entities
    mod.rs                  # Edge types, directionality, storage
    weight.rs               # Weight update mechanics, decay
    traversal.rs            # Fan-out queries (following feed, collab filtering)
    rel_mat.rs              # RelationshipWeightMaterializer (impl Materializer)

  cohort/                   # Spec 05: Dynamic population segments
    mod.rs                  # CohortDef, CohortId, predicate evaluation
    membership.rs           # Bitmap-based membership resolution
    rollup.rs               # Dimensional hierarchy (Level 0/1/2/3)

  index/                    # Specs 06-07: Secondary indexes
    mod.rs                  # Index trait bounds
    text.rs                 # TextIndex trait + Tantivy implementation (spec 06)
    vector.rs               # VectorIndex trait + USearch implementation (spec 07)
    bitmap.rs               # RoaringBitmap filter indexes (spec 08)

  query/                    # Spec 08: Query engine
    mod.rs                  # retrieve(), search(), suggest() entry points
    parser.rs               # Input validation, schema resolution, AST construction
    planner.rs              # Cost-based plan selection, selectivity estimation
    executor.rs             # Pipeline execution, subsystem coordination
    cursor.rs               # Pagination cursor encoding/decoding
    composition.rs          # WITHIN clause, cohort-scoped candidate generation

  ranking/                  # Specs 09 + 12: Scoring and cold start
    mod.rs                  # ProfileDef, scoring pipeline (9 stages)
    boosts.rs               # Signal, personalization, relationship, recency boosts
    penalties.rs            # Low-quality, flagged content, repetition penalties
    gates.rs                # Quality gates, minimum thresholds
    diversity.rs            # max_per_creator, format_mix, greedy enforcement
    cold_start.rs           # Exploration budget, proxy scoring, cohort priors
    sort_modes.rs           # 20+ built-in sort modes (trending, hot, rising, etc.)

  schema/                   # Spec 11: Schema system
    mod.rs                  # define_entity, define_signal, define_profile, etc.
    validation.rs           # Schema validation rules, breaking change detection
    migration.rs            # Migration planner, dry-run, execute
    version.rs              # Version tracking, introspection

  api/                      # Public Rust API surface
    mod.rs                  # Re-exports, builder patterns, error types
```

The materializer implementations live inside their domain modules (`signal/`, `relationship/`), not in `materializer/`. The `materializer/` module owns the trait and the registry. Each domain module owns its materializer implementation. This keeps domain logic co-located with its materializer.

---

## 9. Spec Dependency Graph

```
                     +----------+
                     | 11 Schema|  (cross-cutting: all specs validate against schema)
                     +----+-----+
                          |
                     +----v-----+
                     |01 Storage|  (foundation: WAL, dual-backend, crash recovery)
                     +----+-----+
                          |
               +----------+----------+
               |                     |
         +-----v------+        +-----v------+
         | 02 Entity  |        | 03 Signal  |
         |   Model    |        |   System   |
         +-----+------+        +--+----+----+
              |                   |    |
    +---------+--------+      +---+    +--------+
    |         |        |      |                 |
+---v---+  +--v---+ +--v---+  |           +-----v-----+
|06 Text|  |07 Vec| |04 Rel|  |           | 05 Cohort |
|Retriev|  |Retri.| |ation.|  |           |           |
+---+---+  +--+---+ +--+---+  |           +-----+-----+
    |         |        |      |                 |
    +---------+--------+------+--------+--------+
              |                        |
        +-----v------+           +-----v------+
        | 08 Query   |           | 10 Feedback|
        |  Engine    |           |   Loop     |
        +-----+------+           +------------+
             |
       +-----v------+
       | 09 Ranking |
       | + 12 Cold  |
       +------------+

Cross-cutting (not shown as edges -- they constrain everything):
  11 Schema      -- all definitions validated against schema
  13 Concurrency -- lock-free patterns for all hot-path state
  14 Scale       -- partition-ready key encoding, aggregation scopes
```

Read the graph top-down for implementation order (foundations first). Read it bottom-up for dependency chains (each spec depends on the specs above it).

**Critical path:** 01 -> 03 -> 05 -> 08 -> 09. This is the longest dependency chain and the path that enables the full three-layer trending query.
Every milestone must make progress along this chain.

**Parallel tracks after 01:** Entity model (02), signal system (03), and schema (11) can proceed in parallel once the storage engine exists. Text (06) and vector (07) retrieval can proceed in parallel once the entity model exists. Relationships (04) and cohorts (05) can proceed in parallel once signals exist.

---

## 10. Cross-Cutting Principles

**WAL is truth.** Every mutation is durable in the WAL before it is visible anywhere. Materialized state can be lost and rebuilt. The WAL cannot. This is not a design preference -- it is the correctness foundation. Spec 01 Invariant 2: "A signal event acknowledged to the caller survives any single crash."

**Materializers are the abstraction boundary.** The write path does not know what derived state exists. It appends to the WAL and calls `registry.on_event()`. Adding a new kind of derived state means implementing `Materializer` and registering it. No changes to the write path. No changes to existing materializers.

**Same math at every scope.** Velocity is `count / duration`. Decay is `score * exp(-lambda * dt)`. These formulas do not change when you switch from global to cohort to user-local scope. What changes is which counter you read. Global velocity reads Level 0 counters. Cohort velocity reads Level 1/2 counters and estimates Level 3 intersections. The ranking profile does not know the difference -- it sees a velocity number. This uniformity is what makes three-layer trending a query parameter, not a feature.

**Scale is a design constraint from day one.** The WAL record format includes a partition key field (spec 14). Key encoding in the storage layer uses big-endian prefixes that sort correctly under range partitioning. `SignalDef` carries an `aggregation_scope` field. The `Materializer` trait's `Scope` enum maps directly to partition boundaries. None of this requires a distributed runtime to exist. All of it is required so that when the distributed runtime arrives, it does not require a storage engine rewrite. CockroachDB, TiDB, and Elasticsearch learned this lesson. tidalDB builds on it.

**Single-node-first but partition-ready.** A single tidalDB process is a complete, self-contained shard. It runs the full WAL, all materializers, all indexes, and the full query engine. Distribution, when it comes, is the coordination of many such shards -- not a redesign of what a shard does. The atoms are right from day one. The orchestration comes later.

**Readers never block writers. Writers never block readers.** The concurrency model (spec 13) enforces this structurally, not by convention. Hot-tier signal state uses atomic CAS. Warm-tier counters use atomic increments. Entity reads use epoch-based reclamation. The WAL writer is channel-serialized (one writer, many producers). No ranking query ever acquires a lock on the scoring path.

**The query engine is stateless.** It holds no data. It reads from materialized state produced by materializers and from secondary indexes (Tantivy, USearch, RoaringBitmaps). If the query engine crashes, no data is lost, no recovery is needed. It restarts and reads from the same materialized state.

**Schema encodes behavior, not just shape.** A signal's half-life, a ranking profile's scoring weights, a cohort's predicate, a diversity constraint -- these are schema declarations, not application code. The database enforces them. The query optimizer reasons about them. Behavior changes are schema mutations, not redeployments. This is the Stage 3 insight from thoughts.md.
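
As a closing illustration of the "same math at every scope" principle above, the sketch below writes the velocity, acceleration, decay, and independence-estimate formulas as plain functions over counters. It is a minimal sketch under assumptions: the function names are hypothetical, windows are represented as slices of hourly bucket counts, and the example numbers are made up. The point is that nothing in these functions knows whether its inputs came from Level 0 (global) or Level 1 (cohort) counters.

```rust
/// velocity = count / window_duration, with the window given as hourly buckets.
/// The result is events per hour.
pub fn velocity(hour_buckets: &[u64]) -> f64 {
    if hour_buckets.is_empty() {
        return 0.0;
    }
    let total: u64 = hour_buckets.iter().sum();
    total as f64 / hour_buckets.len() as f64
}

/// acceleration = velocity(current_window) - velocity(previous_window)
pub fn acceleration(current: &[u64], previous: &[u64]) -> f64 {
    velocity(current) - velocity(previous)
}

/// decay = score * exp(-lambda * dt)
pub fn decayed(score: f64, lambda: f64, dt: f64) -> f64 {
    score * (-lambda * dt).exp()
}

/// Independence assumption for a composite cohort:
/// count(A AND B) ~ count(A) * count(B) / count(global)
pub fn estimate_intersection(count_a: f64, count_b: f64, count_global: f64) -> f64 {
    if count_global <= 0.0 {
        0.0
    } else {
        count_a * count_b / count_global
    }
}

fn main() {
    // Made-up hourly "view" counters for one item over the last 24 hours.
    let global = [40u64; 24]; // Level 0
    let region_us = [24u64; 24]; // Level 1, region:US
    let age_18_24 = [10u64; 24]; // Level 1, age_group:18-24

    // Same function, different counters.
    println!("global velocity: {}", velocity(&global));
    println!("US velocity:     {}", velocity(&region_us));

    // Level 3 composite: estimate the window count, then apply the same formula.
    let sum = |b: &[u64]| b.iter().sum::<u64>() as f64;
    let est_count = estimate_intersection(sum(&region_us), sum(&age_18_24), sum(&global));
    println!("US AND 18-24 velocity (estimated): {}", est_count / 24.0);
}
```

The independence estimate is only as good as the assumption behind it: if US users are disproportionately aged 18-24 for a given item, the true intersection count will differ from the estimate.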