## M0p1 — Embeddable Runtime Skeleton (329 tests)
- TidalDb with builder(), health_check(), close(), and Drop-based cleanup
- TidalDbBuilder fluent API: ephemeral(), with_data_dir(), wal_dir(), cache_dir()
- Config, StorageMode, ConfigError types; Config(ConfigError) variant on LumenError
- Paths: single source of truth for directory layout (wal, items, users, creators, cache)
- TempTidalHome: test isolation helper gated behind #[cfg(test)] / test-utils feature
- 8 integration tests: tests/sandboxed_storage.rs
## M0p2 — Tooling & Diagnostics (349 tests)
- Workspace root Cargo.toml (members: ["tidal", "tidalctl"])
- tidal/build.rs: BUILD_HASH from GIT_HASH with option_env!() fallback to "dev"
- MetricsState: always-compiled Arc-shared atomics (uptime, health_ok)
- MetricsHandle (metrics feature): hand-rolled TcpListener HTTP, zero new deps
- GET /healthz → {"status":"ok","uptime_secs":N}
- GET /metrics → Prometheus text (tidaldb_uptime_seconds, health_ok, info)
- TidalDbBuilder.enable_metrics(addr) starts background metrics thread
- tidalctl binary: status + paths commands, manual std::env::args() parsing
- 7 metrics integration tests, 9 tidalctl CLI tests
## m1p4 Signal Ledger (in-progress)
- SignalLedger: DashMap<(EntityId, SignalTypeId), EntitySignalEntry>, WAL-first writes
- HotSignalState: #[repr(C, align(64))], lock-free CAS decay, out-of-order handling
- BucketedCounter: 60 per-minute + 168 per-hour circular buffers, trigger-based rotation
- CheckpointMeta + serialize/restore: 983-byte fixed records, atomic WriteBatch
- Property tests: running score matches analytical to 1e-6, decay monotonic, non-negative
- Proptest regression: signals/warm.txt
## Documentation and planning
- ROADMAP: m0p1 COMPLETE (329), m0p2 COMPLETE (349), product track milestones
- PRODUCT_ROADMAP: P0-P4 product milestone track (personal briefing beachhead)
- Milestone planning docs: milestone-0 (phases 1-3), milestone-p (phases 1-5)
- docs/research/tidaldb_tooling_and_diagnostics.md
- ARCHITECTURE.md, CLAUDE.md, VISION.md updates
## Site
- Blog: every-platform-builds-the-same-6-systems.mdx (new)
- Blog: why-tidaldb.mdx (updated)
- next.config.ts, layout.tsx, blog/page.tsx updates
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TidalDB Roadmap
Vision Statement
When tidalDB is complete, an engineering team building any content platform -- a media library, a social feed, a marketplace, a discovery surface -- can embed a single Rust database and replace the Elasticsearch + Redis + Kafka + feature store + vector database + ranking service stack. One process, one query interface, one operational model. The query RETRIEVE items FOR USER @user_id USING PROFILE for_you FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50 executes in under 50ms, reflects signals written 100ms ago, enforces diversity without application logic, handles cold-start items without application intervention, and returns results a user would describe as "it knows what I want."
Thesis
A single embeddable database can replace the 6-system content ranking stack by treating signals, ranking profiles, and diversity constraints as database primitives rather than application logic.
Milestone Summary
| # | Name | Proves | Enables |
|---|---|---|---|
| M0 | Embeddable Runtime | tidalDB can run in-process with zero-config defaults and tooling | Cuts proof-of-concept friction, enables internal dogfooding |
| M1 | Signal Engine | Signals are a database primitive with O(1) decay, not application math | UC-03 (partial), UC-06 (partial), UC-14 (partial) |
| M2 | Ranked Retrieval | A single query retrieves, scores, and ranks content using live signals | UC-03, UC-04, UC-06, UC-08, UC-13, UC-14 |
| M3 | Personalized Ranking | User context shapes retrieval and ranking -- the "For You" query works | UC-01, UC-05, UC-07, UC-09 (partial) |
| M4 | Agent Memory | Agents can create sessions, write signals, and enforce policy inside tidalDB | Agent-mediated personalization, RLHF loops, conversational memory |
| M5 | Hybrid Search | Text + semantic + signal-ranked search in one query | UC-02, UC-10, UC-11 |
| M6 | Full Surface Coverage | Every use case, every sort mode, every filter, every feedback loop | UC-01 through UC-14 complete |
| M7 | Production Hardening | Crash safety, graceful degradation, operational readiness | All UCs at production quality |
Product Milestone Summary (New)
The roadmap now has two tracks:
- Engine Track (M0-M7): proves tidalDB capabilities.
- Product Track (P0-P4): proves end-user value for the beachhead product.
| # | Name | Proves | Depends On |
|---|---|---|---|
| P0 | Beachhead Validation | Knowledge workers and consumers care about a personal briefing feed enough to use it repeatedly | M0 (embedding/runtime), partial M1 |
| P1 | Concierge Alpha | Daily "Today Brief" with explicit feedback controls creates Day-2 retention in a small cohort | M1 complete, partial M2 |
| PG1 | Personalization Core Done (Blocking Gate) | Personalization loop is correct, immediate, and measurably better than baseline | P1 + M1/M2/M3 core slices |
| P2 | Productized Beta | Self-serve onboarding + real-time adaptation + explanation UX works without manual curation | M2 complete, partial M3 |
| P3 | Public Launch | The product is reliable, useful, and trusted at real user volume | M3 + M5 core, M6 partial |
| P4 | Scale + Revenue Fit | Sustainable retention and monetization without quality collapse | M6 + M7 |
Current Status
| Phase | Status | Tests |
|---|---|---|
| m0p1: Embeddable Runtime Skeleton | COMPLETE | 329 passing (293 unit + 36 integration + 3 doc) |
| m0p2: Tooling & Diagnostics | COMPLETE | 349 passing (+7 metrics unit + 7 metrics integration + 9 tidalctl CLI) |
| m0p3: Samples & Docs | NOT STARTED | -- |
| m1p1: Core Type System and Schema | COMPLETE | 77 passing |
| m1p2: Write-Ahead Log | COMPLETE | passing (unit + integration) |
| m1p3: Storage Engine Trait and fjall Backend | COMPLETE | 140 passing (128 unit + 12 integration) |
| m1p4: Signal Ledger | NOT STARTED | -- |
| m1p5: Entity CRUD and Signal Write API | NOT STARTED | -- |
| P0: Beachhead Validation | NOT STARTED | -- |
| P1: Concierge Alpha | NOT STARTED | -- |
| PG1: Personalization Core Done gate | NOT STARTED | -- |
| P2: Productized Beta | NOT STARTED | -- |
| P3: Public Launch | NOT STARTED | -- |
| P4: Scale + Revenue Fit | NOT STARTED | -- |
Current phase: m0p3 (Samples & Docs) or m1p4 (Signal Ledger). m0p2 unblocks m0p3; m1p2 and m1p3 unblock m1p4.
Lessons learned:
- m1p3 keyspaces are organized per `EntityKind` ("items", "users", "creators"), not by data category. The `Tag` enum in key encoding provides the data-category namespace within each entity-kind keyspace.
- The `LumenError` name is a legacy artifact from a predecessor project. It will be renamed when convenient but does not block progress.
- MSRV was bumped to 1.91 for fjall 3 compatibility.
Product Track: Personal Briefing Feed (Knowledge Workers + Consumers)
This track defines the milestones for the actual product experience (not only the database engine).
Use case reference: docs/personal-briefing-beachhead.md.
Dedicated roadmap: docs/planning/PRODUCT_ROADMAP.md.
P0: Beachhead Validation -- "Do users care enough to return?"
Milestone Thesis
Validate that a personal briefing feed solves a painful daily job for users and drives repeat use.
Acceptance Criteria
- Recruit 20-50 target users (knowledge workers + high-intent consumers).
- Run daily briefing prototype (can include manual source QA).
- At least one meaningful feedback action per session for the median user (`more`, `less`, `hide`, `mute`, `save`).
- User interviews confirm value vs baseline feeds ("less noise", "more useful", "saves time").
- D2 retention reaches agreed threshold for target segment.
P1: Concierge Alpha -- "High-value daily brief for a narrow cohort"
Milestone Thesis
Deliver a reliable daily Today Brief experience with immediate visible adaptation after user feedback.
Acceptance Criteria
- App surface: ranked brief, reason labels, source links, save/feedback controls.
- Feedback loop: next refresh reflects `less`/`hide`/`mute` actions immediately.
- Time-budget mode (`5/10/20` min) is available and used.
- Diversity constraints prevent source/topic domination in top results.
- Weekly active usage demonstrates repeated utility.
P2: Productized Beta -- "Self-serve and repeatable without handholding"
Milestone Thesis
Turn the alpha into a self-serve product with stable onboarding, trust UX, and measurable quality.
Acceptance Criteria
- Self-serve onboarding completed in under 3 minutes.
- "Why this" explanations are present and understandable on every briefing card.
- Cohort layer available ("trending for people like you").
- Trust controls available (source transparency, mute/hide persistence).
- D7 retention and "useful item rate" exceed baseline comparison feed.
- PG1 Personalization Core Done gate has passed.
P3: Public Launch -- "Trusted at real volume"
Milestone Thesis
Launch publicly with reliability, quality, and trust guardrails suitable for broad use.
Acceptance Criteria
- Reliability and latency SLOs defined and met for briefing generation.
- Quality floor enforced (freshness, source quality, duplicate suppression).
- Notification cadence controls prevent spam.
- Core support and incident process in place for user-facing regressions.
P4: Scale + Revenue Fit -- "Sustainable business without degrading quality"
Milestone Thesis
Prove the product can grow and monetize while preserving user trust and briefing quality.
Acceptance Criteria
- Monetization model validated (subscription, team plan, or equivalent).
- Revenue metrics tracked alongside quality metrics (no quality-revenue trade-off regressions).
- Retention and engagement remain stable as volume increases.
- Product roadmap for next segment expansion is data-backed.
PG1: Personalization Core Done (Blocking Gate)
Milestone Thesis
Before product breadth expansion, the core personalization loop must be provably correct and immediately responsive.
Acceptance Criteria
- Hard negatives (`hide`/`mute`/`block`) never leak after write, restart, or replay.
- Explicit feedback (`more`/`less`/`skip`/`save`) changes next-refresh ranking within target latency.
- User personalization state rebuilds deterministically from checkpoint + WAL replay.
- Useful-item rate and repeated-unwanted-item rate outperform a non-personalized baseline.
- Diversity guardrails hold while maintaining personalization quality.
Milestone 0: Embeddable Runtime -- "Runs in your process in minutes"
Milestone Thesis
Before we prove any ranking math, developers must be able to embed tidalDB inside an existing service with zero operational prep. M0 delivers the runtime glue — an ergonomic builder API, deterministic storage layout, a tiny admin CLI, and living examples — so the very first experience is cargo add tidaldb, TidalDb::builder().in_memory().open(), and a passing smoke test.
Phases
Phase 1: Embeddable Runtime Skeleton
Delivers: A cohesive Config/Builder API for single-process use, with in-memory and filesystem-backed defaults, sandboxed data directories, and graceful shutdown hooks developers can call from tests or application drop handlers.
- Builder exposes `ephemeral()`/`single_process()` shortcuts and eagerly validates directories.
- Shutdown hooks drain WAL writer threads and surface errors.
- Temp-directory helper guarantees deterministic cleanup (used in doctests).
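To make "eagerly validates directories" concrete, here is a minimal sketch of a directory-layout helper in the spirit of the `Paths` type named in the M0p1 notes. The method names and the on-disk subdirectory names are assumptions for illustration (they mirror the labels wal, items, users, creators, cache); the real type may differ.

```rust
use std::io;
use std::path::PathBuf;

/// Hypothetical layout helper: one place that knows every directory name.
#[derive(Debug, Clone)]
pub struct Paths {
    root: PathBuf,
}

impl Paths {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Paths { root: root.into() }
    }

    // Subdirectory names are assumed to match the labels in the commit notes.
    pub fn wal(&self) -> PathBuf { self.root.join("wal") }
    pub fn items(&self) -> PathBuf { self.root.join("items") }
    pub fn users(&self) -> PathBuf { self.root.join("users") }
    pub fn creators(&self) -> PathBuf { self.root.join("creators") }
    pub fn cache(&self) -> PathBuf { self.root.join("cache") }

    /// Creates every directory up front so a bad path fails at open time,
    /// not on the first write.
    pub fn ensure(&self) -> io::Result<()> {
        let dirs = [self.wal(), self.items(), self.users(), self.creators(), self.cache()];
        for dir in dirs.iter() {
            std::fs::create_dir_all(dir)?;
        }
        Ok(())
    }
}
```

A builder could call `Paths::new(dir).ensure()?` inside `open()` and hand the same `Paths` value to the CLI, which is one way to keep both sides agreeing on the layout.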
Phase 2: Tooling & Diagnostics
Delivers: tidalctl (a minimal CLI) for inspecting embedded instances, plus a lightweight metrics surface (Prometheus text or JSON) tagged with the same IDs future distributed deployments will use.
- `tidalctl status --path <dir>` returns JSON with WAL seq, config hash, uptime.
- Metrics endpoint optional (disabled by default) exposes `/metrics` and `/healthz`.
- Tooling reuses the same path helpers from Phase 1.
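The two endpoint bodies can be sketched as plain string renderers. The `/healthz` shape follows the JSON quoted in the M0p2 notes; the function names, the `"degraded"` status string, and the full metric name `tidaldb_health_ok` are assumptions, not confirmed tidalDB identifiers.

```rust
use std::time::Duration;

/// Body for `GET /healthz`, in the JSON shape quoted in the M0p2 notes.
/// The "degraded" string is an assumption; only "ok" appears in the source.
pub fn render_healthz(healthy: bool, uptime: Duration) -> String {
    let status = if healthy { "ok" } else { "degraded" };
    format!(r#"{{"status":"{}","uptime_secs":{}}}"#, status, uptime.as_secs())
}

/// Body for `GET /metrics` in Prometheus text exposition format.
/// `tidaldb_health_ok` is an assumed full name for the `health_ok` gauge.
pub fn render_metrics(healthy: bool, uptime: Duration) -> String {
    let mut out = String::new();
    out.push_str("# TYPE tidaldb_uptime_seconds gauge\n");
    out.push_str(&format!("tidaldb_uptime_seconds {}\n", uptime.as_secs()));
    out.push_str("# TYPE tidaldb_health_ok gauge\n");
    out.push_str(&format!("tidaldb_health_ok {}\n", healthy as u8));
    out
}
```

Keeping the renderers free of any socket code means they can be unit-tested without binding a port; the hand-rolled listener only has to write the returned string.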
Phase 3: Samples & Docs
Delivers: Quick-start samples (For You POC + integration tests) compiled as doctests, and reference snippets for embedding tidalDB inside Axum/Actix or a CLI app. Keeps DX in lockstep with the runtime.
- Quickstart example + doctest run under CI (`cargo test --doc` plus `cargo test --examples`; Cargo does not accept `--doc` together with other target selectors).
- Axum/Actix embedding examples include graceful shutdown + metrics wiring.
- CONTRIBUTING updated with “run samples” checklist.
UAT Scenario
Given:
// in tests/lib.rs
let db = TidalDb::builder()
    .ephemeral()
    .with_temp_dir()
    .open()
    .unwrap();
When:
db.health_check(); // ok
tidalctl status --path <dir> // prints WAL, storage, signal counts
cargo test --doc // quick-start snippet compiles & runs
Then:
- Builder defaults require zero manual config
- CLI connects to the same files used by the embedded process
- Samples stay in sync (failing doctest fails CI)
Milestone 1: Signal Engine -- "Signals are a database primitive"
Milestone Thesis
A developer can open a tidalDB instance, define signal types with decay rates, write engagement events, and read back decay-correct scores and windowed aggregates -- all without computing any temporal math in application code. This proves that the hardest primitive (temporal signals with O(1) decay, velocity, and windowed aggregation) works correctly and meets the performance budget.
UAT Scenario
Given:
A tidalDB instance is opened with a schema defining:
- Entity type: Item with metadata fields (title, category, created_at)
- Signal type: "view" with exponential decay, half_life=7d, windows=[1h, 24h, 7d]
- Signal type: "like" with exponential decay, half_life=14d, windows=[24h, 7d, all_time]
- Signal type: "skip" with exponential decay, half_life=1d, windows=[1h, 24h]
When:
1. Write 100 items with metadata
2. Write 10,000 signal events across the items (views, likes, skips)
with timestamps spanning the last 7 days
3. Read the decay score for item #42, signal "view", at current time
4. Read the windowed count for item #42, signal "view", window=24h
5. Read the velocity for item #42, signal "view", window=1h
6. Write a new "view" event for item #42
7. Immediately re-read the decay score, windowed count, and velocity
8. Close and reopen the tidalDB instance
9. Re-read all values for item #42
Then:
- Step 3: Decay score matches S(t) = sum(w_i * exp(-lambda * (t - t_i)))
computed analytically from raw events, to 6 decimal places
- Step 4: Windowed count equals the exact count of "view" events
within the last 24h window
- Step 5: Velocity equals windowed_count / window_duration
- Step 7: All values reflect the new event immediately
(decay score increased, count incremented, velocity updated)
- Step 9: All values match step 7 (crash recovery preserves state)
- Performance: decay score read < 100ns per entity,
signal write < 100us including WAL fsync (amortized),
200-entity scoring pass < 5us
Phases
Phase 1: Core Type System and Schema -- COMPLETE
Delivers: The foundational type system -- entity IDs, signal type definitions, decay rate declarations, window specifications, and the error types that every subsequent module depends on. The schema module that validates and stores signal/entity definitions.
Acceptance Criteria:
- `EntityId` is a u64 newtype with `Display`, `Hash`, `Eq`, `Ord`, `to_be_bytes()` (big-endian, preserves numeric ordering)
- `EntityKind` enum: `Item`, `User`, `Creator`
- `SignalTypeDef` captures: name, target `EntityKind`, `DecayModel` (exponential with pre-computed lambda / linear / permanent), `WindowSet`, velocity enabled flag
- `DecayModel::Exponential` stores pre-computed `lambda = ln(2) / half_life.as_secs_f64()` -- no division on hot path
- `Window` enum: `OneHour`, `TwentyFourHours`, `SevenDays`, `ThirtyDays`, `AllTime` with `duration()`, `label()`, `duration_secs_f64()`
- `WindowSet` deduplicates and sorts windows; `empty()` for permanent signals
- `LumenError` enum covers Storage, NotFound, Schema, Durability, Query, Internal variants with `From` impls for each sub-error
- `SchemaError` enum validates: duplicate signal names, invalid identifiers, zero half-life/lifetime, empty windows for non-permanent signals, velocity without windows
- Schema validation via `SchemaBuilder` rejects invalid configurations at construction time
- Property tests: lambda correctness across half-life range, byte ordering preservation
- `cargo fmt` clean, `cargo clippy -- -D warnings` clean, all 77 tests pass
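The pre-computed lambda can be illustrated with a small stand-in for `DecayModel`. This is a sketch under assumptions: the constructor and `factor` method names are hypothetical, and the linear variant is left out because its parameters are not described here.

```rust
use std::time::Duration;

/// Hypothetical mirror of the `DecayModel` described above (linear omitted).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DecayModel {
    /// Stores lambda rather than the half-life, so reads never divide.
    Exponential { lambda: f64 },
    Permanent,
}

impl DecayModel {
    /// Returns `None` for a zero half-life, which the schema rules reject.
    pub fn exponential(half_life: Duration) -> Option<Self> {
        if half_life.is_zero() {
            return None;
        }
        let lambda = std::f64::consts::LN_2 / half_life.as_secs_f64();
        Some(DecayModel::Exponential { lambda })
    }

    /// Multiplicative decay factor after `elapsed_secs` seconds.
    pub fn factor(&self, elapsed_secs: f64) -> f64 {
        match self {
            DecayModel::Exponential { lambda } => (-lambda * elapsed_secs).exp(),
            DecayModel::Permanent => 1.0,
        }
    }
}
```

With a 7-day half-life, `factor(7.0 * 86_400.0)` comes out at one half (up to floating-point rounding), which is the property the lambda-correctness tests are meant to pin down.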
Depends On: None
Complexity: M
Research Reference: docs/research/tidaldb_signal_ledger.md (decay formula, EntityState struct)
Phase 2: Write-Ahead Log -- COMPLETE
Delivers: A durable, append-only log for signal events. Every signal write is fsync'd before acknowledgment. Group commit amortizes fsync cost. Content-addressed events via BLAKE3 for deduplication. The WAL is the source of truth -- all other state is derived.
Acceptance Criteria:
- WAL entries are length-prefixed with BLAKE3 checksums
- Group commit batches up to 100 events or 10ms, whichever comes first
- Duplicate events (same BLAKE3 hash) are silently deduplicated
- WAL replay from any checkpoint produces identical state to uninterrupted execution (property test with 10,000+ random event sequences)
- `fsync` is called per batch, not per event
- WAL can be truncated after a checkpoint without losing committed state
- Crash simulation (kill at random WAL positions) never produces corrupt state -- either the event is committed or it is not
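The "100 events or 10ms, whichever comes first" rule can be sketched with a standard-library channel. This is only an illustration of the batching policy; `collect_batch` is a hypothetical name and the real WAL writer may be structured differently.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Collects one group-commit batch: blocks for the first event, then keeps
/// accepting events until `max_events` are buffered or `max_wait` has elapsed.
/// Returns `None` once the channel is closed and fully drained.
pub fn collect_batch<T>(
    rx: &Receiver<T>,
    max_events: usize,
    max_wait: Duration,
) -> Option<Vec<T>> {
    let first = rx.recv().ok()?;
    let deadline = Instant::now() + max_wait;
    let mut batch = vec![first];
    while batch.len() < max_events {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(event) => batch.push(event),
            // Deadline hit or all senders gone: commit what we have.
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    Some(batch)
}
```

A writer thread would loop on `collect_batch(&rx, 100, Duration::from_millis(10))`, append the whole batch to the log file, call `File::sync_data` once, and only then acknowledge every event in the batch.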
Depends On: Phase 1
Complexity: L
Research Reference: docs/research/tidaldb_wal.md (wire format, group commit, crash detection, deduplication), thoughts.md Part II.1 (WAL convergence), Part V.5-6 (quarantine-first, group commit)
Phase 3: Storage Engine Trait and fjall Backend -- COMPLETE
Delivers: The StorageEngine trait abstraction and two implementations: FjallBackend (fjall 3 LSM-tree) for production and InMemoryBackend (BTreeMap + RwLock) for deterministic testing. Key encoding follows the subject-prefix pattern with a Tag discriminant. FjallStorage coordinates three keyspaces, one per entity kind. FjallAtomicBatch provides cross-keyspace atomic writes.
Acceptance Criteria:
- `StorageEngine` trait with `get`, `put`, `delete`, `scan_prefix`, `write_batch`, `flush` operations
- Key encoding: `[entity_id: 8 bytes BE][0x00][Tag: 1 byte][suffix...]` with `Tag` enum (`Evt=0x01`, `Sig=0x02`, `Meta=0x03`, `Rel=0x04`, `Mv=0x05`, `Idx=0x06`)
- `encode_key`, `parse_key` roundtrip correctly for all tag variants and arbitrary suffixes
- `entity_prefix` (9 bytes) and `entity_tag_prefix` (10 bytes) for scoped prefix scans
- Byte-lexicographic key ordering matches numeric entity ID ordering (property tested)
- `FjallBackend` wraps a single fjall `Keyspace`, implements `StorageEngine`
- `FjallStorage` owns a fjall `Database` with three keyspaces: "items", "users", "creators" (one per `EntityKind`)
- `FjallStorage::backend(EntityKind)` routes to the correct keyspace backend
- Entity kind isolation: same key written to different entity kinds does not collide
- `FjallAtomicBatch` provides cross-keyspace atomic writes via `fjall::OwnedWriteBatch`
- Data persists across close and reopen (`flush_all` + reopen test)
- `InMemoryBackend` uses `BTreeMap` + `RwLock` for deterministic, sorted, concurrent testing
- `WriteBatch` and `BatchOp` types for atomic multi-operation writes
- `PrefixIterator` type alias for boxed prefix scan iterators
- Property tests with proptest: encode/parse roundtrip, prefix ordering, prefix containment
- Criterion benchmarks passing
- `cargo fmt` clean, `cargo clippy -- -D warnings` clean, all 140 tests pass (128 unit + 12 integration)
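The key layout above is small enough to sketch directly. The byte layout and tag values are taken from the criteria; the exact function signatures are assumptions, since only the names `encode_key`, `parse_key`, and `entity_prefix` are given.

```rust
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tag {
    Evt = 0x01,
    Sig = 0x02,
    Meta = 0x03,
    Rel = 0x04,
    Mv = 0x05,
    Idx = 0x06,
}

impl Tag {
    fn from_byte(b: u8) -> Option<Tag> {
        Some(match b {
            0x01 => Tag::Evt,
            0x02 => Tag::Sig,
            0x03 => Tag::Meta,
            0x04 => Tag::Rel,
            0x05 => Tag::Mv,
            0x06 => Tag::Idx,
            _ => return None,
        })
    }
}

/// Layout: [entity_id: 8 bytes BE][0x00][Tag: 1 byte][suffix...]
pub fn encode_key(entity_id: u64, tag: Tag, suffix: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(10 + suffix.len());
    key.extend_from_slice(&entity_id.to_be_bytes());
    key.push(0x00);
    key.push(tag as u8);
    key.extend_from_slice(suffix);
    key
}

/// Inverse of `encode_key`; returns `None` for short or malformed keys.
pub fn parse_key(key: &[u8]) -> Option<(u64, Tag, &[u8])> {
    if key.len() < 10 || key[8] != 0x00 {
        return None;
    }
    let mut id_bytes = [0u8; 8];
    id_bytes.copy_from_slice(&key[..8]);
    let tag = Tag::from_byte(key[9])?;
    Some((u64::from_be_bytes(id_bytes), tag, &key[10..]))
}

/// 9-byte prefix that matches every key belonging to one entity.
pub fn entity_prefix(entity_id: u64) -> [u8; 9] {
    let mut prefix = [0u8; 9];
    prefix[..8].copy_from_slice(&entity_id.to_be_bytes());
    prefix
}
```

Big-endian IDs are what make byte-lexicographic order agree with numeric order: the key for entity 255 sorts before the key for entity 256, which would not hold with little-endian bytes.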
Depends On: Phase 1
Complexity: L
Research Reference: thoughts.md Part V.9 (hybrid storage), Part V.12 (subject-prefix keys), CODING_GUIDELINES.md section 2
Phase 4: Signal Ledger -- Decay Scores and Windowed Aggregation
Delivers: The in-memory per-entity signal state with running decay scores (O(1) update, O(1) read) and bucketed windowed counters. Signal writes update the running scores atomically. Signal reads return decay-correct values without scanning raw events. State is checkpointed to storage for crash recovery.
Acceptance Criteria:
- `EntitySignalState` is `#[repr(C, align(64))]` -- one L1 cache line per hot-path struct
- Running decay formula: `S(t) = S(t_prev) * exp(-lambda * dt) + weight` -- mathematically exact, verified against analytical brute-force computation to 6 decimal places across 10,000 random event sequences (property test)
- Out-of-order events handled correctly: when `t_event < last_update`, weight is pre-decayed: `score += weight * exp(-lambda * (last_update - t_event))`
- Windowed counts use per-minute bucketed counters (BucketedCounter) supporting 1h/24h/7d windows
- Velocity = windowed_count / window_duration_seconds
- Signal write latency < 100 microseconds including WAL write (amortized), benchmarked with criterion
- Decay score read latency < 100ns per entity per lambda, benchmarked with criterion
- 200-entity scoring pass < 5 microseconds, benchmarked with criterion
- State checkpointed to storage every 30 seconds; crash recovery reconstructs from checkpoint + WAL replay
- DashMap or sharded map for concurrent entity state access; signal counters use AtomicU64 with Relaxed ordering
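The running-score update and the out-of-order rule can be checked against the brute-force sum in a few lines. This is a single-threaded sketch with hypothetical names (`RunningScore`, `record`, `read`, `analytical`); it ignores the cache-line layout and the lock-free CAS mentioned elsewhere, and it assumes timestamps are non-negative seconds.

```rust
/// Hypothetical per-signal state: one score plus the time it refers to.
#[derive(Debug, Clone, Copy, Default)]
pub struct RunningScore {
    score: f64,
    last_update: f64, // seconds
}

impl RunningScore {
    /// O(1) update with an event of `weight` at time `t_event`.
    pub fn record(&mut self, lambda: f64, t_event: f64, weight: f64) {
        if t_event >= self.last_update {
            // In-order: decay the old score forward, then add the new weight.
            let dt = t_event - self.last_update;
            self.score = self.score * (-lambda * dt).exp() + weight;
            self.last_update = t_event;
        } else {
            // Late event: pre-decay its weight up to `last_update`.
            self.score += weight * (-lambda * (self.last_update - t_event)).exp();
        }
    }

    /// O(1) read: decay the stored score from `last_update` to `now`.
    pub fn read(&self, lambda: f64, now: f64) -> f64 {
        self.score * (-lambda * (now - self.last_update)).exp()
    }
}

/// Brute-force reference: sum of w_i * exp(-lambda * (now - t_i)).
pub fn analytical(lambda: f64, events: &[(f64, f64)], now: f64) -> f64 {
    events
        .iter()
        .map(|&(t, w)| w * (-lambda * (now - t)).exp())
        .sum()
}
```

Both branches keep the invariant that `score` equals the analytical sum evaluated at `last_update`, so feeding events in any order and then calling `read` should agree with `analytical` up to floating-point rounding.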
Depends On: Phase 2, Phase 3
Complexity: XL
Research Reference: docs/research/tidaldb_signal_ledger.md (running-score formula, SWAG, BucketedCounter, EntityState struct, three-tier architecture)
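For the windowed counts, a per-minute ring buffer covering one hour shows the idea behind a bucketed counter. `MinuteRing` is a hypothetical, simplified stand-in: it covers only the 1h tier, rotates lazily on each call, and counts at minute resolution, so it approximates rather than reproduces an exact event count at window edges.

```rust
/// Hypothetical one-hour ring of per-minute counts.
pub struct MinuteRing {
    buckets: [u32; 60],
    head_minute: u64, // absolute minute index of the newest bucket
}

impl MinuteRing {
    pub fn new() -> Self {
        MinuteRing { buckets: [0; 60], head_minute: 0 }
    }

    /// Advances the ring to `minute`, zeroing buckets that fell out of range.
    fn rotate_to(&mut self, minute: u64) {
        if minute <= self.head_minute {
            return;
        }
        let gap = (minute - self.head_minute).min(60);
        for step in 1..=gap {
            let idx = ((self.head_minute + step) % 60) as usize;
            self.buckets[idx] = 0;
        }
        self.head_minute = minute;
    }

    /// Records one event at `ts_secs`; events older than the ring are dropped.
    pub fn record(&mut self, ts_secs: u64) {
        let minute = ts_secs / 60;
        self.rotate_to(minute);
        if self.head_minute - minute < 60 {
            self.buckets[(minute % 60) as usize] += 1;
        }
    }

    /// Count over the last `window_minutes` (capped at 60) ending at `now_secs`.
    pub fn count(&mut self, now_secs: u64, window_minutes: u64) -> u64 {
        self.rotate_to(now_secs / 60);
        let head = self.head_minute;
        (0..window_minutes.min(60))
            .filter(|back| *back <= head)
            .map(|back| self.buckets[((head - back) % 60) as usize] as u64)
            .sum()
    }

    /// Velocity = windowed count / window duration in seconds.
    pub fn velocity(&mut self, now_secs: u64, window_minutes: u64) -> f64 {
        let span = window_minutes.min(60).max(1);
        self.count(now_secs, span) as f64 / (span * 60) as f64
    }
}
```

Reads cost at most 60 additions regardless of how many events were written, which is the property that lets counts stay cheap on the hot path; longer windows would add a coarser per-hour ring alongside this one.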
Phase 5: Entity CRUD and Signal Write API
Delivers: The public API surface for Milestone 1. TidalDB::open(), TidalDB::shutdown(), entity write/read, signal write/read. This is the interface the UAT scenario tests against. Includes the signal() method that atomically writes to WAL, updates in-memory state, and returns immediately.
Acceptance Criteria:
- `TidalDB::open(config)` opens storage, restores in-memory state from checkpoint + WAL replay, returns `Result<TidalDB>`
- `TidalDB::shutdown()` checkpoints all in-memory state, syncs WAL, closes storage cleanly
- `db.write_item(id, metadata)` stores entity metadata
- `db.signal(signal_type, entity_id, weight, timestamp)` atomically: appends to WAL, updates decay scores, updates windowed counters
- `db.read_decay_score(entity_id, signal_type, lambda_index)` returns current decayed score
- `db.read_windowed_count(entity_id, signal_type, window)` returns count within window
- `db.read_velocity(entity_id, signal_type, window)` returns count / window_duration
- Full UAT scenario passes as an integration test
- `TidalDB` is `Send + Sync` -- safe to share across threads behind `Arc`
Depends On: Phase 4
Complexity: M
Research Reference: CODING_GUIDELINES.md section 9 (public API surface)
Deferred to Later Milestones
- User entities and preference vectors -- deferred to M3 because M1 proves the signal primitive without needing user context
- Creator entities and relationship edges -- deferred to M2/M3 because M1 only needs items to prove signal correctness
- Vector index (USearch) -- deferred to M2 because M1 does not need ANN retrieval
- Text index (Tantivy) -- deferred to M5 (Hybrid Search) because M1 does not need full-text search
- Ranking profiles -- deferred to M2 because M1 proves signals work; M2 proves ranking over signals works
- Query parser -- deferred to M2; M1 uses the Rust API directly
- Diversity enforcement -- deferred to M2 because M1 does not produce ranked result sets
- Signal rollups (hourly/daily materialization) -- deferred to M5 because the bucketed counter approach serves the performance budget through M4; rollups become necessary only at scale for 30d+ windows
- RocksDB backend -- deferred indefinitely; fjall is the primary backend, RocksDB is the trait-abstracted fallback if benchmarks demand it
Integration Test
#[test]
fn milestone_1_uat() {
    // Open tidalDB with signal schema
    let db = TidalDB::open(Config {
        data_dir: temp_dir(),
        schema: Schema::builder()
            .entity_type("item", &["title", "category", "created_at"])
            .signal("view", Decay::exponential(Duration::days(7)),
                &[Window::Hours(1), Window::Hours(24), Window::Days(7)])
            .signal("like", Decay::exponential(Duration::days(14)),
                &[Window::Hours(24), Window::Days(7), Window::AllTime])
            .signal("skip", Decay::exponential(Duration::days(1)),
                &[Window::Hours(1), Window::Hours(24)])
            .build(),
    }).unwrap();
    // Write 100 items
    for i in 0..100 {
        db.write_item(EntityId(i), metadata(i)).unwrap();
    }
    // Write 10,000 signal events spanning 7 days
    let events = generate_events(10_000, Duration::days(7));
    for e in &events {
        db.signal(e.signal_type, e.entity_id, e.weight, e.timestamp).unwrap();
    }
    // Read and verify item #42
    let now = Timestamp::now();
    let analytical_score = compute_analytical_decay(&events, EntityId(42), "view", now);
    let actual_score = db.read_decay_score(EntityId(42), "view", 0).unwrap();
    assert!((actual_score - analytical_score).abs() < 1e-6);
    let analytical_count = count_events_in_window(&events, EntityId(42), "view", now, Duration::hours(24));
    let actual_count = db.read_windowed_count(EntityId(42), "view", Window::Hours(24)).unwrap();
    assert_eq!(actual_count, analytical_count);
    // Write new event and verify immediate visibility
    db.signal("view", EntityId(42), 1.0, now).unwrap();
    let new_score = db.read_decay_score(EntityId(42), "view", 0).unwrap();
    assert!(new_score > actual_score);
    // Close, reopen, verify persistence
    db.shutdown().unwrap();
    let db2 = TidalDB::open(same_config()).unwrap();
    let recovered_score = db2.read_decay_score(EntityId(42), "view", 0).unwrap();
    assert!((recovered_score - new_score).abs() < 1e-6);
}
Done When
A developer can embed tidalDB as a Rust dependency, define signal types with decay rates and windows in schema, write thousands of signal events, and read back decay-correct scores, windowed counts, and velocity values that match analytical computation to 6 decimal places -- including after a crash and restart. Performance benchmarks pass: signal write < 100us amortized, decay read < 100ns per entity, 200-entity scoring < 5us.
Milestone 2: Ranked Retrieval -- "A single query retrieves, scores, and ranks content"
Milestone Thesis
A developer can write items with metadata and embeddings, write signal events, and execute a RETRIEVE query that returns items ranked by a named profile using live signal scores -- with metadata filters and diversity constraints applied by the database, not the application. This proves that ranking is a database operation, not application logic.
UAT Scenario
Given:
A tidalDB instance with:
- 10,000 items with metadata (title, category, format, duration, created_at)
and 1536-dim embeddings
- Signal types: view (7d decay), like (14d decay), skip (1d decay),
share (3d decay), completion (30d decay)
- 100,000 signal events spanning 7 days across the items
- Ranking profiles defined:
* "trending" -- share_velocity(6h) primary, view_velocity(6h) secondary,
engagement_ratio gate > 0.03
* "hot" -- score / (age_hours + 2)^1.8
* "new" -- created_at DESC
* "top_week" -- quality_score within 7d window
* "hidden_gems" -- high completion_rate, inverse view_count
* "controversial" -- max(likes * dislikes)
When:
1. RETRIEVE items USING PROFILE trending DIVERSITY max_per_creator:1 LIMIT 25
2. RETRIEVE items FILTER category:jazz USING PROFILE hot LIMIT 20
3. RETRIEVE items USING PROFILE new LIMIT 20
4. RETRIEVE items USING PROFILE top_week LIMIT 20
5. RETRIEVE items USING PROFILE hidden_gems FILTER min_completion_rate:0.7 LIMIT 10
6. RETRIEVE items USING PROFILE controversial LIMIT 10
7. Write a burst of 100 "share" signals for item #500
8. Re-execute the trending query
Then:
- Step 1: Items ordered by share velocity, max 1 per creator, items with
engagement_ratio < 0.03 excluded
- Step 2: Only jazz items returned, ordered by hot formula
- Step 3: Items ordered by created_at descending, no signal computation
- Step 4: Items ordered by quality score computed from 7d-windowed signals
- Step 5: Items with high completion but low views, sorted by quality/reach ratio
- Step 6: Items with highest product of positive and negative signals
- Step 7: ok
- Step 8: Item #500 appears higher in trending results (signal written 100ms ago
is reflected)
- Performance: end-to-end RETRIEVE < 50ms for 10K items
Phases
Phase 1: Vector Index Integration (USearch)
Delivers: USearch wrapped behind a trait, with mmap persistence, f16 quantization, and the adaptive filtered search planner. Items can be inserted with embeddings and retrieved by ANN similarity.
Acceptance Criteria:
- `VectorIndex` trait with `insert(key, vector)`, `remove(key)`, `search(query, k)`, `filtered_search(query, k, predicate)`, `save()`, `load()`, `view()`
- USearch backend implements the trait with f16 quantization (default), mmap persistence
- Vectors normalized at insertion time (L2 distance equivalent to cosine for unit vectors)
- Adaptive query planner: selectivity < 2% triggers pre-filter + brute-force; 2-100% uses `filtered_search` with predicate callback
- ANN retrieval at 10K vectors returns top-100 with recall@10 > 0.95
- ANN retrieval latency < 10ms at 10K vectors (benchmarked)
- Persistence: save on checkpoint, view() on restart for immediate read serving
- `#![forbid(unsafe_code)]` relaxed only in the USearch FFI boundary module with SAFETY comments
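Two of the criteria above are easy to illustrate without USearch itself: the 2% selectivity cutoff, and why normalizing at insertion lets L2 distance stand in for cosine similarity. The names `SearchStrategy`, `plan`, `normalize`, `dot`, and `squared_l2` are hypothetical; this sketch does not call any USearch API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SearchStrategy {
    /// Few matches: materialize them and scan exhaustively.
    PreFilterBruteForce,
    /// Many matches: walk the ANN graph with a predicate callback.
    FilteredGraphSearch,
}

/// `selectivity` is the fraction of items matching the filter, in 0.0..=1.0.
pub fn plan(selectivity: f64) -> SearchStrategy {
    if selectivity < 0.02 {
        SearchStrategy::PreFilterBruteForce
    } else {
        SearchStrategy::FilteredGraphSearch
    }
}

/// Scales `v` to unit length in place; zero vectors are left untouched.
pub fn normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Dot product; equals cosine similarity when both inputs are unit vectors.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

pub fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}
```

For unit vectors, `squared_l2(a, b)` equals `2 - 2 * dot(a, b)`, so sorting by ascending L2 distance yields the same order as sorting by descending cosine similarity.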
Depends On: m1p3 (storage traits)
Complexity: L
Research Reference: docs/research/ann_for_tidaldb.md (USearch architecture, filtered search, f16, mmap)
Phase 2: Metadata Indexes and Filter Engine
Delivers: Roaring bitmap indexes for categorical metadata, B-tree indexes for range attributes, and a composable filter engine that evaluates arbitrary filter combinations. The filter engine produces either a bitmap (for pre-filtering ANN) or a predicate closure (for in-graph filtering).
Acceptance Criteria:
- Roaring bitmap per high-cardinality metadata value: category, format, creator_id
- B-tree index for range attributes: created_at, duration
- Filter expressions are composable: AND across dimensions, OR within a dimension
- `filter.selectivity()` estimates the fraction of items matching (for query planner)
- `filter.to_bitmap()` returns a RoaringBitmap for pre-filtering
- `filter.to_predicate()` returns a `Fn(EntityId) -> bool` for in-graph filtering
- Filters tested: category:jazz, format:video, duration_min:5m, created_within:7d, and arbitrary combinations
- Filter evaluation < 1 microsecond per candidate (benchmarked)
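The "OR within a dimension, AND across dimensions" rule can be sketched with ordinary sets. To stay dependency-free, this uses `BTreeSet<u64>` as a stand-in for a roaring bitmap; `CategoricalIndex`, `any_of`, `all_of`, and `selectivity` are hypothetical names rather than the planned filter API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One posting set per metadata value (stand-in for one bitmap per value).
#[derive(Default)]
pub struct CategoricalIndex {
    postings: BTreeMap<String, BTreeSet<u64>>,
}

impl CategoricalIndex {
    pub fn insert(&mut self, value: &str, id: u64) {
        self.postings.entry(value.to_string()).or_default().insert(id);
    }

    /// OR within a dimension: union of the posting sets for `values`.
    pub fn any_of(&self, values: &[&str]) -> BTreeSet<u64> {
        values
            .iter()
            .filter_map(|v| self.postings.get(*v))
            .flatten()
            .copied()
            .collect()
    }
}

/// AND across dimensions: intersection of the per-dimension results.
/// An empty `dims` slice yields an empty set in this sketch.
pub fn all_of(dims: &[BTreeSet<u64>]) -> BTreeSet<u64> {
    let mut iter = dims.iter();
    let first = match iter.next() {
        Some(set) => set.clone(),
        None => return BTreeSet::new(),
    };
    iter.fold(first, |acc, set| acc.intersection(set).copied().collect())
}

/// Fraction of the corpus that matches, for the query planner.
pub fn selectivity(matching: &BTreeSet<u64>, total_items: usize) -> f64 {
    if total_items == 0 {
        0.0
    } else {
        matching.len() as f64 / total_items as f64
    }
}
```

A predicate for in-graph filtering falls out of the same result: `move |id: u64| matching.contains(&id)` turns the materialized set into a closure.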
Depends On: m1p3 (storage engine)
Complexity: M
Research Reference: docs/research/ann_for_tidaldb.md (metadata indexes, selectivity estimation, roaring bitmaps)
Phase 3: Ranking Profile Engine
Delivers: Named ranking profiles declared as data (not compiled code), parsed, validated, stored, and executed by the database. Profiles reference signal scores, windowed aggregates, velocity, metadata fields, and define quality gates. Profiles are versioned and swappable at query time.
Acceptance Criteria:
- Profile declaration syntax supports: primary signal, secondary signals with weights, BOOST, GATE (minimum threshold), PENALIZE, EXCLUDE
- Profiles stored in schema, versioned, retrievable by name
- Profile execution: given a candidate set and a profile, produce a scored and sorted result list
- Built-in profiles implemented: `trending`, `hot`, `new`, `top_week`, `top_month`, `top_all_time`, `hidden_gems`, `controversial`, `most_viewed`, `most_liked`, `shuffle`
- `hot` formula: `score / (age_hours + 2)^gravity` with configurable gravity
- `controversial` formula: `max(positive_signals * negative_signals)`
- `hidden_gems` formula: `quality_score * (1 / log(1 + view_count))`
- Profile change does not require recompile -- profiles are runtime data
- 200-candidate scoring pass with a profile < 10 microseconds (benchmarked)
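The three formulas listed above translate into small pure functions. This is a sketch under assumptions: the log base for `hidden_gems` is not stated, so natural log is assumed, and the zero-view guard is an illustrative choice to avoid dividing by zero. `max(...)` in the `controversial` formula is read as "rank by the largest product".

```rust
/// hot = score / (age_hours + 2)^gravity
pub fn hot(score: f64, age_hours: f64, gravity: f64) -> f64 {
    score / (age_hours + 2.0).powf(gravity)
}

/// controversial: product of positive and negative signal totals.
/// Items are then ranked by the largest product.
pub fn controversial(positive_signals: f64, negative_signals: f64) -> f64 {
    positive_signals * negative_signals
}

/// hidden_gems = quality_score * (1 / log(1 + view_count)), natural log assumed.
/// With zero views the denominator is 0, so this sketch returns the raw
/// quality score instead (an assumption, not a documented rule).
pub fn hidden_gems(quality_score: f64, view_count: u64) -> f64 {
    let denom = (view_count as f64).ln_1p();
    if denom == 0.0 {
        quality_score
    } else {
        quality_score / denom
    }
}
```

With gravity 1.8, an item with the same score ranks lower as it ages, and a 50/50 split of 100 reactions scores 2500 under `controversial` while a 99/1 split scores 99, which is why evenly divided items rise to the top.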
Depends On: m1p4 (signal ledger)
Complexity: L
Research Reference: VISION.md (ranking profile declarations), ai-lookup/services/ranking-profiles.md, USE_CASES.md Appendix B (sort mode formulas)
Phase 4: Diversity Enforcement
Delivers: Post-scoring diversity pass that reorders results to satisfy constraints (max_per_creator, format_mix) without reducing result count. Implemented as a greedy selection pass over the scored candidate list.
Acceptance Criteria:
- `max_per_creator:N` enforced: no more than N items from any single creator in the result set
- `format_mix:true` enforced: no more than 60% of results from any single format
- Diversity pass does not reduce result count -- it selects the next-best candidate that satisfies constraints
- Diversity pass adds < 1ms for 200 candidates (benchmarked)
- When diversity constraints cannot be fully satisfied (too few creators), results are returned with a warning flag, not an error
- Property test: diversity constraints hold for 10,000 random candidate sets
Depends On: Phase 3 (ranking profiles produce scored lists)
Complexity: M
Research Reference: VISION.md (diversity as query constraint), thoughts.md Part V.14 (MMR post-scoring)
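The greedy selection pass can be sketched as follows for the max_per_creator constraint only (format_mix is omitted). The `Scored` and `DiverseResult` types and the backfill strategy are assumptions made for illustration. The input is expected to be sorted by descending score already, and when too few creators exist the sketch relaxes the cap and sets a warning flag instead of returning fewer results.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone)]
struct Scored {
    id: u64,
    creator_id: u64,
    score: f64,
}

struct DiverseResult {
    items: Vec<Scored>,
    /// True when the cap had to be relaxed to reach the requested limit.
    constraint_relaxed: bool,
}

/// Greedy pass over candidates that are already sorted by descending score.
fn enforce_max_per_creator(
    sorted: &[Scored],
    max_per_creator: usize,
    limit: usize,
) -> DiverseResult {
    let mut per_creator: HashMap<u64, usize> = HashMap::new();
    let mut items = Vec::with_capacity(limit);
    let mut skipped = Vec::new();

    for c in sorted {
        if items.len() == limit {
            break;
        }
        let count = per_creator.entry(c.creator_id).or_insert(0);
        if *count < max_per_creator {
            // Next-best candidate that still satisfies the cap.
            *count += 1;
            items.push(c.clone());
        } else {
            skipped.push(c.clone());
        }
    }

    // Too few creators: backfill with the best skipped items and flag it.
    let constraint_relaxed = items.len() < limit && !skipped.is_empty();
    if constraint_relaxed {
        let missing = limit - items.len();
        items.extend(skipped.into_iter().take(missing));
        items.sort_by(|a, b| b.score.total_cmp(&a.score));
    }
    DiverseResult { items, constraint_relaxed }
}
```

The pass is a single scan plus one hash lookup per candidate, so at 200 candidates it should sit well inside the 1ms budget, though only a benchmark can confirm that.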
Phase 5: Query Parser and RETRIEVE Executor
Delivers: The query parser for the RETRIEVE operation and the executor that orchestrates candidate retrieval, filtering, scoring, diversity, and result assembly. This is the "one query" entry point. For M2, the RETRIEVE query does not require FOR USER (no personalization yet) -- it operates on the full item corpus with filters and profiles.
Acceptance Criteria:
- Parser handles: RETRIEVE items, USING PROFILE <name>, FILTER <conditions>, DIVERSITY <constraints>, LIMIT <n>, EXCLUDE [ids]
- Parser produces a typed AST; parse errors include position and helpful message
- Executor pipeline: candidate retrieval (ANN or full scan based on profile) -> filter -> score -> diversity -> limit -> return
- When profile uses velocity/decay signals, executor uses ANN retrieval over embeddings then scores with signal state
- When profile is new or alphabetical, executor skips ANN and uses metadata index directly
- End-to-end RETRIEVE latency < 50ms at 10K items (benchmarked)
- Results include: entity_id, score, and a signal snapshot (key signal values used in scoring) for debugging/transparency
- SIGNAL write command also parsed and routed to signal write path from M1
- Full M2 UAT scenario passes as an integration test
Depends On: Phase 1, Phase 2, Phase 3, Phase 4
Complexity: L
Research Reference: ai-lookup/features/query-language.md, SEQUENCE.md (all sequence diagrams)
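To illustrate the "typed AST plus positioned parse errors" criterion, here is a deliberately small sketch that handles only `RETRIEVE <entity> [USING PROFILE <name>] [LIMIT <n>]`. The `RetrieveQuery` and `ParseError` types, the whitespace tokenizer, and case-sensitive keywords are all assumptions; FILTER, DIVERSITY, and EXCLUDE are left out.

```rust
#[derive(Debug, PartialEq)]
struct RetrieveQuery {
    entity: String,
    profile: Option<String>,
    limit: Option<usize>,
}

#[derive(Debug, PartialEq)]
struct ParseError {
    /// Byte offset of the offending token in the input.
    position: usize,
    message: String,
}

/// Split on whitespace, keeping each token's byte offset for error reporting.
fn tokenize(input: &str) -> Vec<(usize, &str)> {
    let mut tokens = Vec::new();
    let mut start = None;
    for (i, ch) in input.char_indices() {
        match (ch.is_whitespace(), start) {
            (false, None) => start = Some(i),
            (true, Some(s)) => {
                tokens.push((s, &input[s..i]));
                start = None;
            }
            _ => {}
        }
    }
    if let Some(s) = start {
        tokens.push((s, &input[s..]));
    }
    tokens
}

fn parse_retrieve(input: &str) -> Result<RetrieveQuery, ParseError> {
    let end = input.len();
    let err = |position: usize, message: &str| ParseError {
        position,
        message: message.to_string(),
    };
    let mut iter = tokenize(input).into_iter();

    match iter.next() {
        Some((_, "RETRIEVE")) => {}
        Some((pos, _)) => return Err(err(pos, "expected RETRIEVE")),
        None => return Err(err(0, "empty query")),
    }
    let entity = match iter.next() {
        Some((_, name)) => name.to_string(),
        None => return Err(err(end, "expected entity type after RETRIEVE")),
    };
    let mut query = RetrieveQuery { entity, profile: None, limit: None };

    while let Some((pos, tok)) = iter.next() {
        match tok {
            "USING" => {
                match iter.next() {
                    Some((_, "PROFILE")) => {}
                    Some((p, _)) => return Err(err(p, "expected PROFILE after USING")),
                    None => return Err(err(end, "expected PROFILE after USING")),
                }
                match iter.next() {
                    Some((_, name)) => query.profile = Some(name.to_string()),
                    None => return Err(err(end, "expected profile name")),
                }
            }
            "LIMIT" => match iter.next() {
                Some((p, n)) => {
                    let n = n
                        .parse::<usize>()
                        .map_err(|_| err(p, "LIMIT expects a non-negative integer"))?;
                    query.limit = Some(n);
                }
                None => return Err(err(end, "expected a number after LIMIT")),
            },
            _ => return Err(err(pos, "unsupported clause in this sketch")),
        }
    }
    Ok(query)
}
```

For example, `parse_retrieve("RETRIEVE items LIMIT many")` fails with a `ParseError` whose position points at the token `many`, which is the kind of feedback the criterion describes.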
Deferred to Later Milestones
- FOR USER clause and user preference vectors -- deferred to M3; M2 proves ranking works without personalization
- SIMILAR TO clause (related content) -- deferred to M3; requires user context for personalization layer
- Relationship graph (follows, blocks) -- deferred to M3; M2 filters on metadata, not relationships
- SEARCH query (text + semantic) -- deferred to M5; M2 proves RETRIEVE ranking
- Full-text index (Tantivy) -- deferred to M5
- Exploration budget / cold start -- deferred to M3; requires user context to be meaningful
- User state filters (unseen, saved, liked) -- deferred to M3; requires user entities
- Engagement threshold filters (min_views, min_likes) -- partially implemented via signal reads; full composable filter syntax deferred to M6
Integration Test
#[test]
fn milestone_2_uat() {
    let db = open_with_full_schema();

    // Write 10K items with embeddings
    for i in 0..10_000 {
        db.write_item(EntityId(i), metadata(i), Some(embedding(i))).unwrap();
    }

    // Write 100K signal events
    for e in generate_events(100_000, Duration::days(7)) {
        db.signal(e.signal_type, e.entity_id, e.weight, e.timestamp).unwrap();
    }

    // Trending query with diversity
    let results = db.retrieve(
        "RETRIEVE items USING PROFILE trending DIVERSITY max_per_creator:1 LIMIT 25"
    ).unwrap();
    assert_eq!(results.len(), 25);
    assert!(results.windows(2).all(|w| w[0].score >= w[1].score));
    assert!(creator_counts(&results).values().all(|&c| c <= 1));

    // Category filter with hot sort
    let jazz = db.retrieve(
        "RETRIEVE items FILTER category:jazz USING PROFILE hot LIMIT 20"
    ).unwrap();
    assert!(jazz.iter().all(|r| r.metadata["category"] == "jazz"));

    // Signal freshness: write burst, verify ranking change
    let pre_burst = db.retrieve(
        "RETRIEVE items USING PROFILE trending LIMIT 10"
    ).unwrap();
    for _ in 0..100 {
        db.signal("share", EntityId(500), 1.0, Timestamp::now()).unwrap();
    }
    let post_burst = db.retrieve(
        "RETRIEVE items USING PROFILE trending LIMIT 10"
    ).unwrap();
    let pre_rank = pre_burst.iter().position(|r| r.id == EntityId(500));
    let post_rank = post_burst.iter().position(|r| r.id == EntityId(500));
    assert!(post_rank.unwrap() < pre_rank.unwrap_or(25));
}
Done When
A developer can write items with embeddings and metadata, write signal events, and execute RETRIEVE queries with any of the 11+ built-in sort modes, metadata filters, and diversity constraints. Results are correctly ranked by the named profile. Signal events written 100ms ago are reflected in the next query. End-to-end latency < 50ms at 10K items. Diversity constraints hold in every result set.
Milestone 3: Personalized Ranking -- "The For You query works"
Milestone Thesis
A developer can write user entities with preference vectors, write relationship edges (follows, blocks), write engagement signals that update user profiles and relationship weights automatically, and execute RETRIEVE items FOR USER @user_id USING PROFILE for_you -- getting results shaped by the user's history, relationships, and implicit preferences. This proves that the feedback loop closes inside the database.
UAT Scenario
Given:
A tidalDB instance with:
- 10,000 items across 200 creators, with embeddings
- 500 users with initial preference embeddings
- Relationship edges: follows, blocks
- Signals: view, like, skip, hide, completion, share
- 500,000 historical signal events establishing user preferences
- Profiles: for_you, following, related, notification
When:
1. RETRIEVE items FOR USER @user_42 USING PROFILE for_you
FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50
2. RETRIEVE items FOR USER @user_42 FILTER relationship:follows
USING PROFILE following LIMIT 50
3. RETRIEVE items SIMILAR TO @item_abc FOR USER @user_42
USING PROFILE related FILTER unseen LIMIT 10
4. SIGNAL like item:@item_xyz user:@user_42
5. Re-execute the for_you query
6. SIGNAL hide item:@item_999 user:@user_42
7. SIGNAL block user:@user_42 target_creator:@creator_77
8. Re-execute the for_you query
Then:
- Step 1: Results personalized -- items matching user_42's preference vector
rank higher; items from blocked creators excluded; items already seen excluded;
max 2 per creator; 10% exploration budget (items from unfollowed creators)
- Step 2: Only items from followed creators, chronological order
- Step 3: Items semantically similar to @item_abc, re-ranked by user_42's
preference match, already-seen excluded
- Step 4: Signal write atomically updates: item like count, user->creator
interaction weight, user preference vector shifted toward item embedding
- Step 5: Results shift -- items similar to @item_xyz's topic rank higher;
creator of @item_xyz appears more frequently
- Step 6: @item_999 never appears in any future query for user_42
- Step 7: All items by creator_77 excluded from all queries for user_42
- Step 8: No items from creator_77; no item_999; shift from like reflected
Phases
Phase 1: User and Creator Entities with Relationships
Delivers: User and creator entity types with preference vectors and a relationship graph. Relationship edges are weighted, directional, and queryable. Follows, blocks, interaction weights are first-class.
Acceptance Criteria:
- User entities store: user_id, preference embedding (mutable, updated on signals), metadata
- Creator entities store: creator_id, catalog embedding (aggregated from items), metadata
- Relationship edges: (from_entity, to_entity, type, weight, timestamp) with types: follows, blocks, interaction_weight, hide, mute
- follows filter: efficiently enumerate all items by creators a user follows (roaring bitmap of creator's item set, intersected with follows set)
- blocked filter: efficiently exclude all items by blocked creators
- unseen filter: roaring bitmap of user's seen item set, inverted
- Relationship write/read latency < 50 microseconds
Depends On: m1p3 (storage), m2p2 (bitmap indexes)
Complexity: L
Phase 2: Feedback Loop -- Signal Writes Update User State
Delivers: When a signal event is written (like, skip, hide, completion), the database atomically updates the item's signal ledger, the user-to-item relationship, the user-to-creator interaction weight, and the user's preference vector. One write, multiple state updates, no application logic.
Acceptance Criteria:
- db.signal("like", item_id, user_id, weight, timestamp) atomically:
  - Appends event to WAL
  - Updates item signal ledger (decay scores, windowed counts)
  - Increments user->creator interaction_weight
  - Shifts user preference vector toward item embedding (configurable learning rate)
- db.signal("skip", ...) atomically: updates item skip count, decays user->creator weight, shifts preference vector away from item embedding
- db.signal("hide", ...) sets permanent hard-negative on user->item relationship; item excluded from all future queries for this user
- db.signal("block", user, creator) sets permanent block; all items by creator excluded from all queries for this user
- Preference vector update uses exponential moving average: pref = alpha * item_embedding + (1 - alpha) * pref (positive) or pref = pref - alpha * item_embedding (negative), normalized after update
- All updates visible to the next query (no eventual consistency lag within the process)
- Property test: 10,000 random signal sequences never produce a state where a hidden item or blocked creator appears in query results
Depends On: Phase 1, m1p4 (signal ledger)
Complexity: XL
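The preference vector update in the criteria above can be sketched directly from its two formulas. The function names and the choice of L2 normalization are assumptions for illustration; `alpha` plays the role of the configurable learning rate.

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// L2-normalize in place; a zero vector is left unchanged.
fn normalize(v: &mut [f32]) {
    let norm = dot(v, v).sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Positive signal (e.g. like): pref = alpha * item + (1 - alpha) * pref.
fn shift_toward(pref: &mut [f32], item: &[f32], alpha: f32) {
    assert_eq!(pref.len(), item.len(), "embedding dimensions must match");
    for (p, e) in pref.iter_mut().zip(item) {
        *p = alpha * e + (1.0 - alpha) * *p;
    }
    normalize(pref);
}

/// Negative signal (e.g. skip): pref = pref - alpha * item.
fn shift_away(pref: &mut [f32], item: &[f32], alpha: f32) {
    assert_eq!(pref.len(), item.len(), "embedding dimensions must match");
    for (p, e) in pref.iter_mut().zip(item) {
        *p -= alpha * e;
    }
    normalize(pref);
}
```

Because the vector is renormalized after every update, `dot(pref, item)` can be read as cosine similarity when item embeddings are unit length too, so a like raises the user's similarity to the liked item and a skip lowers it.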
Phase 3: Personalized Ranking Profiles
Delivers: Ranking profiles that incorporate user context: preference match (embedding similarity between user and item), user-creator interaction weight, social proof (engagement from user's follows), and user-specific exclusions. The for_you, following, related, and notification profiles.
Acceptance Criteria:
- for_you profile: ANN retrieval using user preference vector, scoring = preference_match * engagement_velocity * recency_decay * social_proof, gates on completion_rate, penalizes skip count, 10% exploration budget
- following profile: candidate set restricted to followed creators' items, sorted by created_at DESC, tiebreaker on completion_rate
- related profile: ANN retrieval using source item's embedding, collaborative filtering boost (items co-engaged with source), personalization re-rank by user preference
- notification profile: candidates from followed creators' recent items, scored by relationship_strength * item_quality
- Exploration budget: 10% of for_you results are from creators the user does not follow, to prevent filter bubbles
- Cold start: new users with no signal history get results ranked by population-level signals (trending, quality)
- Cold start: new items with no signals get an exploration window (appear in a small % of for_you feeds)
- FOR USER @user_id clause parsed and user state loaded into query context
Depends On: Phase 2, m2p3 (ranking engine), m2p5 (query parser)
Complexity: L
Phase 4: User State Filters
Delivers: Filters that depend on user state: unseen, in_progress, saved, liked, in_collection. These require per-user bitmaps or sets maintained by the signal system.
Acceptance Criteria:
- unseen filter: excludes items the user has viewed (maintained as roaring bitmap per user, updated on view signal)
- unblocked filter: excludes items from blocked creators and hidden items
- saved filter: returns only items the user has saved
- liked filter: returns only items the user has liked
- in_progress filter: returns items with partial completion signal
- User state filters compose with all metadata filters from M2
- Per-user seen bitmap memory: ~125KB per user at 1M items (roaring bitmap; 1M bits), manageable for 10K users in memory (roughly 1.25GB in the worst case)
Depends On: Phase 1, Phase 2
Complexity: M
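The memory estimate and the unseen filter can be checked with a small sketch. It uses a plain dense bitset as a stand-in for the roaring bitmap named in the plan, so it shows the uncompressed size rather than what a roaring implementation would actually use. `SeenBitmap` and its methods are hypothetical names.

```rust
/// Dense bitmap stand-in for the per-user seen set.
/// One bit per item, indexed by a dense item number.
struct SeenBitmap {
    words: Vec<u64>,
}

impl SeenBitmap {
    fn with_capacity(item_count: usize) -> Self {
        SeenBitmap { words: vec![0; (item_count + 63) / 64] }
    }

    /// Called on a view signal. Panics if `item` is beyond the capacity.
    fn mark_seen(&mut self, item: usize) {
        self.words[item / 64] |= 1u64 << (item % 64);
    }

    fn is_seen(&self, item: usize) -> bool {
        (self.words[item / 64] & (1u64 << (item % 64))) != 0
    }

    /// Heap bytes used by the bit storage.
    fn size_bytes(&self) -> usize {
        self.words.len() * std::mem::size_of::<u64>()
    }

    /// The `unseen` filter: keep only candidates the user has not viewed.
    fn unseen(&self, candidates: &[usize]) -> Vec<usize> {
        candidates.iter().copied().filter(|&id| !self.is_seen(id)).collect()
    }
}
```

`SeenBitmap::with_capacity(1_000_000).size_bytes()` is 125,000, which matches the ~125KB per user figure; a compressed bitmap should need less for users who have seen only a small part of the corpus.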
Deferred to Later Milestones
- SEARCH query with personalization -- deferred to M5; M3 proves personalized RETRIEVE
- Tantivy integration -- deferred to M5
- People/creator search (UC-10) -- deferred to M5
- Social graph traversal for trending ("trending among my follows") -- deferred to M6; requires graph query capabilities beyond simple follows filter
- Collaborative filtering -- basic co-engagement signals used in related profile; full matrix-factorization-style CF deferred to M6
- User-created collections/boards (UC-09.4) -- deferred to M6
- Live content status tracking (UC-12) -- deferred to M6
Integration Test
#[test]
fn milestone_3_uat() {
    let db = open_with_users_and_relationships();

    // User 42 likes jazz, follows creators 1-10, blocked creator 77.
    // `user_42_seen` is assumed to be a fixture-provided set of item IDs
    // that user 42 has already viewed.
    let feed = db.retrieve(
        "RETRIEVE items FOR USER @42 USING PROFILE for_you \
         FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50"
    ).unwrap();
    assert_eq!(feed.len(), 50);
    assert!(feed.iter().all(|r| !user_42_seen.contains(&r.id)));
    assert!(feed.iter().all(|r| r.creator_id != CreatorId(77)));
    assert!(creator_counts(&feed).values().all(|&c| c <= 2));

    // Like an item, verify preference shift
    db.signal("like", EntityId(500), UserId(42), 1.0, now()).unwrap();
    let feed2 = db.retrieve(same_for_you_query()).unwrap();

    // Items topically similar to item 500 should rank higher
    let topic_500 = db.read_item(EntityId(500)).unwrap().category;
    let topic_match_before = feed.iter().filter(|r| r.category == topic_500).count();
    let topic_match_after = feed2.iter().filter(|r| r.category == topic_500).count();
    assert!(topic_match_after >= topic_match_before);

    // Hide and block, verify exclusion
    db.signal("hide", EntityId(999), UserId(42), 1.0, now()).unwrap();
    db.signal("block", UserId(42), CreatorId(77), 1.0, now()).unwrap();
    let feed3 = db.retrieve(same_for_you_query()).unwrap();
    assert!(feed3.iter().all(|r| r.id != EntityId(999)));
    assert!(feed3.iter().all(|r| r.creator_id != CreatorId(77)));
}
Done When
The full "For You" query works: RETRIEVE items FOR USER @user_id USING PROFILE for_you FILTER unseen, unblocked DIVERSITY max_per_creator:2 LIMIT 50 returns personalized, diversity-constrained results that reflect the user's engagement history, exclude hidden items and blocked creators, include an exploration budget, handle cold-start users and items, and update in response to new signal events within 100ms. The following, related, and notification profiles also work correctly.
Milestone 4: Agent Memory -- "Agents own the personalization substrate"
Milestone Thesis
Agents mediate the user interaction: they ground LLM responses, collect preferences, and emit feedback. This milestone proves a developer can embed tidalDB alongside an agent runtime, create sessions, append structured feedback signals (reward, tool usage, critiques), enforce per-agent policy, and query session memory in milliseconds.
Phases
Phase 1: Session Schema & Lifecycle
Delivers: SessionId, AgentId, and AgentPolicy types in schema plus builder flags (with_sessions(true)). APIs to start_session, append_session_metadata, close_session. WAL entries tagged with agent metadata and CLI output listing active sessions.
Phase 2: Session Materializers & Short-Lived Aggregates
Delivers: SessionMaterializer (minute-scale decay buckets for reward/pref hints, tool usage counters) registered via the existing materializer trait. Query APIs session_view(session_id) and session_velocity(session_id, signal_type) with <5µs read latency. Integration tests proving hot path throughput at 50k updates/sec.
Phase 3: Policy & Safety Layer
Delivers: Declarative schema-bound policies (allowed signal types, max QPS, storage TTL). Enforcement in the signal write path (reject or queue). Audit log per agent (accessible via CLI/metrics) plus rate-limiters to isolate noisy agents.
Phase 4: Agent-Facing APIs & Explanations
Delivers: retrieve_for_session / search_for_session endpoints returning ranked items plus a session_snapshot (top signals, reasons, reward velocity). Agent-friendly error codes, documentation, and samples (user → agent → tidalDB). Session data plumbed into ranking profiles via new SessionContext.
UAT Scenario
Given:
An agent opens session S for user @u123 with metadata {tool:"planner"}
Policy allows signals preference_hint and reward; forbids raw_log
When:
1. Agent writes preference_hint ("more jazz today")
2. Agent writes reward(+0.8) after delivering an answer
3. Agent executes RETRIEVE ... FOR USER @u123 FOR SESSION @S USING PROFILE for_you LIMIT 10
4. Agent receives ranked items and session_snapshot (reward_velocity, last_tool)
5. Agent attempts to write raw_log → rejected with policy violation
6. Session closes; CLI shows duration, writes, rejections
Then:
- Session aggregates reflect preference/reward immediately
- Policy enforcement blocks disallowed write with audit trail
- After closure, querying session S returns archived snapshot with final signals
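Steps 1, 2, and 5 of this scenario hinge on policy enforcement in the signal write path. A minimal sketch, assuming hypothetical `AgentPolicy`, `Session`, and `WriteOutcome` types (none of these are confirmed tidalDB APIs), could look like this:

```rust
use std::collections::HashSet;

/// Hypothetical policy: the set of signal types an agent may write.
struct AgentPolicy {
    allowed_signals: HashSet<String>,
}

#[derive(Debug, PartialEq)]
enum WriteOutcome {
    Accepted,
    Rejected { reason: String },
}

/// Hypothetical session state with the counters the CLI would report.
struct Session {
    policy: AgentPolicy,
    writes: u64,
    rejections: u64,
    audit_log: Vec<String>,
}

impl Session {
    fn new(allowed: &[&str]) -> Self {
        Session {
            policy: AgentPolicy {
                allowed_signals: allowed.iter().map(|s| s.to_string()).collect(),
            },
            writes: 0,
            rejections: 0,
            audit_log: Vec::new(),
        }
    }

    /// Check the policy before the write; rejected writes leave an audit entry.
    fn write_signal(&mut self, signal_type: &str) -> WriteOutcome {
        if self.policy.allowed_signals.contains(signal_type) {
            self.writes += 1;
            WriteOutcome::Accepted
        } else {
            self.rejections += 1;
            let reason = format!(
                "policy violation: signal type `{}` is not allowed",
                signal_type
            );
            self.audit_log.push(reason.clone());
            WriteOutcome::Rejected { reason }
        }
    }
}
```

With a session created as `Session::new(&["preference_hint", "reward"])`, a `reward` write is accepted and a `raw_log` write is rejected with an audit entry, mirroring steps 2 and 5.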
Milestone 5: Hybrid Search -- "Text + semantic + signals in one query"
Milestone Thesis
A developer can execute SEARCH items QUERY "rust tutorial beginner" VECTOR query_vector FOR USER @user_id USING PROFILE search LIMIT 20 and get results that combine BM25 text relevance, semantic similarity, and user personalization in a single ranked list. This proves that search and retrieval are the same system.
UAT Scenario
Given:
A tidalDB instance with:
- 10,000 items with text fields (title, description, tags) indexed for full-text search
- All items have embeddings
- 500 users with engagement history
- Search profile defined: text relevance as floor, semantic similarity,
personalization adjustment
When:
1. SEARCH items QUERY "rust tutorial beginner" VECTOR [query_embedding]
FOR USER @user_42 USING PROFILE search DIVERSITY max_per_creator:2 LIMIT 20
2. SEARCH items QUERY "jazz piano" FOR USER @user_42
USING PROFILE search FILTER duration:short, format:video LIMIT 20
3. SEARCH items QUERY "\"exact phrase match\"" USING PROFILE search LIMIT 10
4. SEARCH items QUERY "jazz -beginner" USING PROFILE search LIMIT 10
5. SEARCH creators QUERY "jazz" LIMIT 10
6. User clicks result #3, record SIGNAL search_click
7. User searches same query again
Then:
- Step 1: Results combine BM25 + semantic similarity via RRF;
personalization re-ranks within relevant set; user_42 (a beginner)
sees beginner content elevated
- Step 2: Text-only search (no vector), filtered by duration and format
- Step 3: Exact phrase match -- only items containing "exact phrase match"
- Step 4: Boolean exclusion -- no items matching "beginner"
- Step 5: Creator search by name/topic
- Step 6: Signal recorded with query context and rank position
- Step 7: Clicked result may rank higher due to search_click signal
- Performance: SEARCH < 50ms at 10K items
Phases
Phase 1: Tantivy Integration
Delivers: Tantivy embedded as a derived index for full-text search. DB-primary consistency pattern: entity store is source of truth, Tantivy is a materialized view updated via outbox. BM25 scoring exposed via custom Collector and Weight/Scorer seek pattern.
Acceptance Criteria:
- Tantivy index created from schema text field definitions (title, description, tags)
- Background indexer reads entity store outbox and feeds Tantivy writer
- Tantivy commit stores last-processed sequence number in payload for crash recovery
- Custom AllScoresCollector returns all matching doc IDs with BM25 scores
- Weight::scorer + DocSet::seek pattern scores specific candidate IDs (for re-ranking ANN results)
- External entity ID -> DocAddress mapping maintained and updated on segment merge
- Boolean queries supported: AND, OR, NOT, exact phrase, field-scoped
- Commit interval: every 1-5 seconds or every N thousand documents
- Index rebuild from entity store completes in < 10 minutes at 10K items
- BM25 query latency < 10ms at 10K documents (benchmarked)
Depends On: m1p3 (storage engine), m1p5 (entity API)
Complexity: L
Research Reference: docs/research/tantivy.md (Collector API, consistency pattern, seek scoring, commit model)
Phase 2: Hybrid Fusion (RRF)
Delivers: Reciprocal Rank Fusion combining BM25 ranked lists with ANN ranked lists into a single scored result set. The starting point is RRF with k=60; the architecture supports upgrading to tuned linear combination when relevance labels exist.
Acceptance Criteria:
- RRF(d) = 1/(60 + rank_bm25(d)) + 1/(60 + rank_ann(d)) implemented
- Documents appearing in only one list contribute only their single-list term
- RRF results are re-rankable by personalization (user preference overlay)
- When only text query is provided (no vector), pure BM25 ranking used
- When only vector is provided (no text), pure ANN ranking used
- Fusion adds < 1ms to query time (benchmarked)
- k parameter configurable (default 60)
Depends On: Phase 1 (BM25 scores), m2p1 (ANN scores)
Complexity: S
Research Reference: docs/research/tantivy.md (RRF section, Cormack et al.)
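A short sketch of the fusion step, assuming each input is a list of document IDs ordered best-first and that ranks are 1-based. The function name `rrf_fuse` and the tie-break by ID are illustrative choices, not part of the plan.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over two ranked ID lists (best first).
/// RRF(d) = sum over lists containing d of 1 / (k + rank(d)), rank starting at 1.
fn rrf_fuse(bm25_ranked: &[u64], ann_ranked: &[u64], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for list in [bm25_ranked, ann_ranked] {
        for (i, &doc) in list.iter().enumerate() {
            let rank = (i + 1) as f64;
            // A document present in only one list gets only that list's term.
            *scores.entry(doc).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by ID for deterministic output.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

Passing an empty slice for one side reduces the result to the other list's own order, which lines up with the text-only and vector-only cases in the criteria.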
Phase 3: SEARCH Query Parser and Executor
Delivers: The SEARCH query parser and executor that orchestrates text retrieval, semantic retrieval, fusion, personalization, filtering, diversity, and result assembly.
Acceptance Criteria:
- Parser handles: SEARCH items/creators, QUERY "text", VECTOR [embedding], FOR USER, USING PROFILE, FILTER, DIVERSITY, LIMIT
- Query text parsing: exact phrase ("..."), boolean operators (AND/OR/NOT/-), field-scoped (title:...), wildcard (term*)
- Executor pipeline: text retrieval -> ANN retrieval -> fusion -> personalization -> filter -> diversity -> return
- When both QUERY and VECTOR provided, hybrid fusion (RRF)
- When only QUERY, BM25-only retrieval
- When only VECTOR, ANN-only retrieval
- Search results include: entity_id, combined_score, bm25_score, semantic_score, rank
- search_click signal writes include query context and rank position
- End-to-end SEARCH < 50ms at 10K items (benchmarked)
Depends On: Phase 1, Phase 2, m2p5 (query parser infrastructure)
Complexity: M
Phase 4: Creator and People Search
Delivers: Search over creator entities by name, topic, and attributes. "Creators like X" via creator embedding similarity. Enables UC-10.
Acceptance Criteria:
- Creator entities indexed in Tantivy (name, handle, bio, topics)
- Creator embeddings searchable via ANN (aggregated from catalog)
- SEARCH creators QUERY "jazz" LIMIT 10 returns creators matching topic
- SEARCH creators SIMILAR TO @creator_id LIMIT 10 returns similar creators by embedding
- Creator filters: verified, min_followers, language, followed_by_user
- Creator sort modes: follower_count, engagement_rate, posting_frequency
Depends On: Phase 1, m3p1 (creator entities)
Complexity: M
Deferred to Later Milestones
- Autocomplete and search suggestions (UC-02.3) -- deferred to M6; requires prefix indexes and trending query tracking
- Saved searches and alerts (UC-02.4) -- deferred to M6; requires persistent query storage and push notification
- Visual search / image search (UC-11) -- deferred to M6; requires multi-modal embedding support
- "Did you mean" typo correction -- deferred to M6; requires edit-distance computation on term dictionary
- Tuned linear combination (replacing RRF) -- deferred to M6; requires relevance labels for alpha tuning
Done When
A developer can execute SEARCH queries that combine full-text BM25 relevance with semantic vector similarity and user personalization in a single ranked result set. Boolean queries, phrase matching, field-scoped search, and creator search all work. Results reflect engagement signals. End-to-end SEARCH latency < 50ms at 10K items.
Milestone 6: Full Surface Coverage -- "Every use case works"
Milestone Thesis
Every one of the 14 use cases works end-to-end. Every sort mode, every filter dimension, every discovery surface described in USE_CASES.md is operational. The query RETRIEVE items FOR USER @user_id CONTEXT feed USING PROFILE for_you FILTER unseen, unblocked, format:video, duration:short DIVERSITY max_per_creator:2, format_mix:true LIMIT 50 is the complete, production-quality end state query.
UAT Scenario
Given:
A tidalDB instance loaded with:
- 100,000 items across 1,000 creators
- 10,000 users with engagement histories
- All 14 use case scenarios configured
- All sort modes and filter dimensions exercised
When:
All 14 use cases are executed as described in USE_CASES.md:
UC-01: For You Feed with full diversity and exploration
UC-02: Search with all filter dimensions, autocomplete, saved searches
UC-03: Trending (global, category, social-graph scoped)
UC-04: Following feed (chronological, algorithmic modes)
UC-05: Related/Up Next with collaborative filtering
UC-06: Browse with all sort modes, faceted filters, mood filters
UC-07: Notification prioritization with frequency capping
UC-08: Creator profile (Top, New, Hot, For You modes)
UC-09: User library (history, saved, liked, collections, continue watching)
UC-10: People search with "creators like X"
UC-11: Visual/semantic search with image embeddings
UC-12: Live content with real-time viewer count
UC-13: Hidden gems with breakout detection
UC-14: Controversial and Hot with dual-signal ranking
Then:
Every query returns correct results per use case specification.
All 25+ sort modes produce correctly ordered results.
All filter dimensions compose correctly.
Performance: < 50ms for all queries at 100K items.
Phases
(Phases for M6 are provisional -- detailed decomposition happens after M5 ships, informed by what was learned.)
Phase 1: Complete Sort Mode Coverage
Delivers: All 25+ sort modes from Appendix B operational. Windowed top sorts (hour, today, week, month, year, all_time), shuffle, alphabetical, shortest/longest, live_viewer_count, date_saved, creator_engagement_rate.
Depends On: M5 complete
Complexity: L
Phase 2: Complete Filter Coverage
Delivers: All filter dimensions from Appendix A operational and composable. Geographic filters, accessibility filters, community signal filters, availability filters, engagement threshold filters.
Depends On: Phase 1
Complexity: L
Phase 3: Social Graph Queries and Collaborative Filtering
Delivers: Social graph traversal for trending-among-follows, collaborative filtering for related/up-next, "creators followed by people I follow." The graph query capabilities needed for UC-03 (social trending), UC-05 (collaborative filtering), UC-10 (social creator discovery).
Depends On: Phase 1
Complexity: L
Phase 4: User Library, Collections, and Continue Watching
Delivers: UC-09 complete: watch history, saved items, liked items, user-created collections, continue watching (resume position), download state. Collections as rankable entities.
Depends On: Phase 2
Complexity: M
Phase 5: Advanced Search Features
Delivers: Autocomplete, search suggestions, trending searches, saved searches, "did you mean" typo correction, related query suggestions. UC-02.3 and UC-02.4.
Depends On: Phase 1
Complexity: L
Phase 6: Live Content and Notification Systems
Delivers: UC-12 (live content with real-time viewer count, scheduled content, reminders) and UC-07 (notification prioritization with frequency capping, per-creator limits). Real-time signal types for viewer count and schedule awareness.
Depends On: Phase 1
Complexity: M
Deferred to Later Milestones
- Signal rollups (hourly/daily materialization) -- built if 100K-item benchmarks show bucketed counters exceeding the latency budget for 30d+ windows
- Multi-vector user interest clustering (PinnerSage) -- deferred to M7 or beyond; single preference vector serves through M6
- ACORN-1 two-hop expansion for very selective filters -- deferred to M7; USearch predicate callback sufficient through M6
Done When
All 14 use cases pass their UAT scenarios as defined in USE_CASES.md. All 25+ sort modes work. All filter dimensions compose. Every sequence diagram in SEQUENCE.md can be executed. Performance: < 50ms for all queries at 100K items.
Milestone 7: Production Hardening -- "Ready for real workloads"
Milestone Thesis
tidalDB can be embedded in a production application and operated with confidence. Crash recovery is correct and fast. Graceful degradation works under load. Operational visibility exists. Performance meets targets at 1M+ items. The database is trustworthy.
UAT Scenario
Given:
A tidalDB instance with:
- 1,000,000 items, 100,000 users, 10,000 creators
- Sustained write load: 10,000 signal events/second
- Concurrent read load: 1,000 RETRIEVE queries/second
When:
1. Run full workload for 1 hour
2. Kill the process at a random point
3. Restart and measure recovery time
4. Verify no data loss and no inconsistency
5. Run workload at 3x expected load
6. Verify graceful degradation (reduced precision, not errors)
Then:
- Step 1: All queries < 50ms p99, all signal writes < 100us amortized
- Step 3: Recovery time < 30 seconds
- Step 4: WAL replay produces state identical to pre-crash;
no phantom items, no lost signals, no inconsistent aggregates
- Step 5: Under overload, tidalDB reduces candidate set size, uses coarser
aggregates, skips diversity -- but never returns errors for well-formed queries
- Step 6: Degradation follows the documented order:
1. Reduce candidate set (500 -> 200)
2. Use coarser aggregates
3. Skip diversity
4. Return from materialized cache
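The documented degradation order can be expressed as a cumulative mapping from a pressure level to a query plan. The numeric `level` (0 = normal, 4 = most degraded) and the `QueryPlan` struct are assumptions made for this sketch; how load is measured and mapped to a level is left open.

```rust
/// Hypothetical knobs the executor would read before running a query.
#[derive(Debug, PartialEq)]
struct QueryPlan {
    candidate_limit: usize,
    coarse_aggregates: bool,
    apply_diversity: bool,
    serve_from_cache: bool,
}

/// Each level keeps every reduction from the levels below it:
/// 1. reduce candidate set (500 -> 200)
/// 2. use coarser aggregates
/// 3. skip diversity
/// 4. return from materialized cache
fn plan_for(level: u8) -> QueryPlan {
    QueryPlan {
        candidate_limit: if level >= 1 { 200 } else { 500 },
        coarse_aggregates: level >= 2,
        apply_diversity: level < 3,
        serve_from_cache: level >= 4,
    }
}
```

Making the steps cumulative means a query at level 3 still benefits from the smaller candidate set and coarser aggregates, and no level maps to an error, in line with the "never returns errors for well-formed queries" requirement.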
Phases
(Phases for M7 are provisional -- detailed decomposition happens after M6 ships.)
Phase 1: Crash Recovery Hardening
Delivers: Comprehensive crash recovery testing and hardening. Fault injection at every write-path stage. Recovery time targets. WAL compaction and checkpoint optimization.
Depends On: M6 complete
Complexity: XL
Phase 2: Graceful Degradation Under Load
Delivers: Automatic quality reduction under load pressure. Configurable degradation order. Backpressure on write path. Never errors for well-formed queries.
Depends On: Phase 1
Complexity: L
Phase 3: Performance at Scale
Delivers: Benchmarks and optimization at 1M items, 100K users. USearch performance tuning (M, ef_search, quantization). Tantivy segment management. Signal state memory optimization. Hot/warm/cold tiering for signal state if memory budget requires it.
Depends On: Phase 1
Complexity: XL
Phase 4: Operational Visibility
Delivers: Metrics, diagnostics, and observability. Query execution stats (candidates considered, filters applied, scoring time, diversity adjustments). Signal system health (WAL lag, checkpoint age, memory usage). Index health (segment count, tombstone ratio). Error reporting with context.
Depends On: Phase 1
Complexity: M
Deferred (Post-M7 / Future)
- Horizontal distribution -- the single-node architecture scales vertically first; distribution is a separate product decision
- Multi-tenancy -- per-tenant isolation within a single tidalDB instance
- Streaming query results -- cursor-based streaming for very large result sets
- A/B testing infrastructure -- comparing two profile versions within the database
- Signal rollup to external cold storage -- S3/GCS archival for compliance
- Client libraries -- language-specific wrappers beyond Rust embedding
Done When
tidalDB operates correctly at 1M items under sustained concurrent read/write load. Crash recovery completes in < 30 seconds with zero data loss. Graceful degradation works under 3x overload without returning errors. All performance targets met at p99. A developer can embed tidalDB in a production application and operate it with confidence.
Use Case Coverage Progression
| UC | Description | M1 | M2 | M3 | M4 | M5 | M6 | M7 |
|---|---|---|---|---|---|---|---|---|
| UC-01 | For You Feed | - | - | Full | Full | Full | Full | Full |
| UC-02 | Search | - | - | - | - | Core | Full | Full |
| UC-03 | Trending/Rising | Signals | Full | Full | Full | Full | Full | Full |
| UC-04 | Following Feed | - | Partial | Full | Full | Full | Full | Full |
| UC-05 | Related/Up Next | - | - | Core | Core | Core | Full | Full |
| UC-06 | Browse/Category | Signals | Core | Core | Core | Core | Full | Full |
| UC-07 | Notifications | - | - | Core | Core | Core | Full | Full |
| UC-08 | Creator Profile | - | Core | Core | Core | Core | Full | Full |
| UC-09 | User Library | - | - | Partial | Partial | Partial | Full | Full |
| UC-10 | People Search | - | - | - | - | Core | Full | Full |
| UC-11 | Visual/Semantic | - | - | - | - | Partial | Full | Full |
| UC-12 | Live Content | - | - | - | - | - | Full | Full |
| UC-13 | Hidden Gems | - | Full | Full | Full | Full | Full | Full |
| UC-14 | Controversial/Hot | Signals | Full | Full | Full | Full | Full | Full |
Legend:
- "-" = Not addressed
- Signals = Signal primitives exist but no query surface
- Partial = Some functionality, not all modes
- Core = Primary query path works, some modes/filters missing
- Full = All modes, filters, and feedback loops per USE_CASES.md specification
Dependency DAG
m1p1 (Types/Schema) ✓
|
+---> m1p2 (WAL) ✓
| |
+---> m1p3 (Storage/fjall) ✓ ---+
| | |
| +---> m1p4 (Signal Ledger)
| |
| +---> m1p5 (Entity + Signal API) = M1 COMPLETE
| |
| +---> m2p3 (Ranking Profiles)
| |
+---> m2p1 (USearch) ---+
| |
+---> m2p2 (Filters) ---+---> m2p4 (Diversity)
| |
+-------+---> m2p5 (RETRIEVE Query) = M2 COMPLETE
|
+---> m3p1 (Users/Creators/Relationships)
| |
| +---> m3p2 (Feedback Loop)
| | |
| | +---> m3p3 (Personalized Profiles)
| |
| +---> m3p4 (User State Filters)
|
| m3p3 + m3p4 = M3 COMPLETE
|
+---> m4p1 (Tantivy)
|
+---> m4p2 (RRF Fusion)
| |
| +---> m4p3 (SEARCH Query)
|
+---> m4p4 (Creator Search)
m4p3 + m4p4 = M4 COMPLETE
M5 phases (provisional) depend on M4
M6 phases (provisional) depend on M5
Parallelization opportunities:
- m1p2 (WAL) and m1p3 (Storage) are parallel after m1p1 (both now complete: m1p3 was completed first, m1p2 followed)
- m2p1 (USearch) and m2p2 (Filters) can be built in parallel after m1p3
- m3p1 (Entities) and m4p1 (Tantivy) can start in parallel with later M2 phases
- m3p4 (User State Filters) can be built in parallel with m3p3 (Profiles)
- m4p2 (RRF) and m4p4 (Creator Search) can be built in parallel
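
The parallelization opportunities above amount to grouping phases into "waves", where every phase in a wave depends only on phases in earlier waves. The following is a minimal sketch of that grouping, not part of tidalDB: `parallel_waves` is a hypothetical helper, and the edge list in the usage note is one reading of the M1 portion of the DAG (in particular, the `m1p2 -> m1p4` edge is an assumption).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Group phases into waves using Kahn-style topological layering.
/// Each edge is (prerequisite, dependent). Phases inside one wave have no
/// dependencies on each other, so they can be built in parallel.
/// Phases that sit on a dependency cycle never become ready and are omitted.
pub fn parallel_waves<'a>(edges: &[(&'a str, &'a str)]) -> Vec<Vec<String>> {
    // Count unmet prerequisites per phase; BTreeMap keeps output deterministic.
    let mut indegree: BTreeMap<&'a str, usize> = BTreeMap::new();
    let mut children: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for &(from, to) in edges {
        indegree.entry(from).or_insert(0);
        *indegree.entry(to).or_insert(0) += 1;
        children.entry(from).or_default().push(to);
    }

    // The first wave is every phase with no prerequisites.
    let mut ready: BTreeSet<&'a str> = indegree
        .iter()
        .filter(|(_, d)| **d == 0)
        .map(|(&name, _)| name)
        .collect();

    let mut waves: Vec<Vec<String>> = Vec::new();
    while !ready.is_empty() {
        let mut next: BTreeSet<&'a str> = BTreeSet::new();
        for &node in &ready {
            if let Some(kids) = children.get(node) {
                for &child in kids {
                    // Completing `node` satisfies one prerequisite of `child`.
                    if let Some(d) = indegree.get_mut(child) {
                        *d -= 1;
                        if *d == 0 {
                            next.insert(child);
                        }
                    }
                }
            }
        }
        waves.push(ready.iter().map(|s| s.to_string()).collect());
        ready = next;
    }
    waves
}
```

Feeding it the assumed M1 edges `m1p1 -> m1p2`, `m1p1 -> m1p3`, `m1p2 -> m1p4`, `m1p3 -> m1p4`, and `m1p4 -> m1p5` yields four waves, with `m1p2` and `m1p3` sharing the second wave, which matches the first bullet above.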
Architectural Decisions Locked In
These decisions are made. They are not revisited unless benchmarks prove them wrong.
| Decision | Chosen | Alternative | Rationale |
|---|---|---|---|
| Storage engine | fjall (pure Rust) | RocksDB | Pure Rust, #![forbid(unsafe_code)], fast compile, trait-abstracted for swap |
| Vector index | USearch (C++ FFI) | hnsw_rs | 10-100x QPS, predicate callbacks, mmap, f16 quantization |
| Text search | Tantivy (embedded) | Custom BM25 | 40K lines of battle-tested code; Collector/Scorer API provides exact hooks needed |
| Decay formula | Running `S(t) = S(prev) * exp(-lambda * dt) + w` | Raw event scan | O(1) vs O(N), proven exact, 20-60x faster at 50+ events/entity |
| Windowed aggregation | Bucketed counters (Scotty pattern) | SWAG two-stacks | Simpler, serves multiple window sizes from one set of buckets |
| Hybrid fusion | RRF (k=60) | Tuned linear combination | Zero-config, robust; linear combo is the upgrade path with relevance labels |
| Consistency model | DB-primary, Tantivy as derived index | Two-phase commit | Simpler, deterministic recovery, source of truth is always the entity store |
| WAL checksums | BLAKE3 | CRC32C | Content-addressing enables deduplication; BLAKE3 is fast enough |
| Key encoding | Subject-prefix `[entity_id][0x00][TAG:suffix]` | Separate key namespaces | Co-locates entity data, natural shard boundary, single prefix scan |
| Embedding format | f16 quantization (default) | float32 | Half memory, < 1% recall loss at 1536D |
| Query language | Custom (RETRIEVE/SEARCH/SIGNAL) | SQL | Domain semantics cannot be expressed in SQL without losing optimization opportunities |
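
To illustrate the decay formula decision in the table above, here is a minimal sketch comparing the running update with a raw event scan. The names `Event`, `RunningScore`, and `raw_scan` are hypothetical and are not tidalDB's types. The out-of-order branch shows one possible way to stay exact and is an assumption, not a description of how the Signal Ledger handles it.

```rust
/// One weighted signal event at time `t` (seconds). Hypothetical type.
#[derive(Clone, Copy)]
pub struct Event {
    pub t: f64,
    pub weight: f64,
}

/// O(1) state per entity: the decayed score as of `last_t`.
#[derive(Clone, Copy, Default)]
pub struct RunningScore {
    pub score: f64,
    pub last_t: f64,
}

impl RunningScore {
    /// In-order events apply S(t) = S(prev) * exp(-lambda * dt) + w.
    /// A late event (t < last_t) is instead decayed forward to `last_t`
    /// and added, which keeps the result equal to a full rescan.
    pub fn observe(&mut self, ev: Event, lambda: f64) {
        if ev.t >= self.last_t {
            let dt = ev.t - self.last_t;
            self.score = self.score * (-lambda * dt).exp() + ev.weight;
            self.last_t = ev.t;
        } else {
            let age = self.last_t - ev.t;
            self.score += ev.weight * (-lambda * age).exp();
        }
    }

    /// Score decayed forward to `now` without recording a new event.
    pub fn value_at(&self, now: f64, lambda: f64) -> f64 {
        let dt = (now - self.last_t).max(0.0);
        self.score * (-lambda * dt).exp()
    }
}

/// The O(N) alternative: sum every event's individually decayed weight.
pub fn raw_scan(events: &[Event], now: f64, lambda: f64) -> f64 {
    events
        .iter()
        .filter(|e| e.t <= now)
        .map(|e| e.weight * (-lambda * (now - e.t)).exp())
        .sum()
}
```

Both paths produce the same score up to floating-point rounding because exponential decay composes multiplicatively: decaying by `dt1` and then by `dt2` equals decaying once by `dt1 + dt2`. The running form therefore only needs the previous score and timestamp, never the event history.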
What This Roadmap Does NOT Cover
These are explicitly out of scope for the foreseeable future:
- Embedding generation -- tidalDB retrieves and ranks over vectors. It does not generate them. Bring your own model.
- Horizontal distribution -- Single-node first. Scale vertically. Distribution is a separate product.
- ACID transactions across entities -- Signal writes are atomic within an entity's state. Cross-entity transactions are not needed for the ranking problem.
- SQL compatibility -- The custom query language exists because SQL cannot express ranking semantics. No SQL layer.
- Multi-tenancy -- One tidalDB instance serves one application. Tenant isolation is the application's concern.
- Content moderation, authentication, payments, CDN -- tidalDB solves one problem: ranking. Everything else is someone else's job.