M2: RETRIEVE query pipeline with 5-stage execution (candidate → filter → score → diversify → limit),
usearch HNSW vector index, bitmap/range/universe filters, ranking profiles with signal scoring,
MMR diversity enforcement, and m2_uat integration tests.
M3: Entity system with typed metadata, relationship graph (follows/blocks/interactions),
creator entities, session tracking, and m3_uat integration tests.
M4: Advanced ranking with builtin functions (freshness, trending, controversy, wilson),
ranking executor with explain mode, query executor integration, benchmarks for
query/ranking/vector/filters/diversity, and m4_uat integration tests.
Includes: 9 new blog posts, marketing site updates, updated roadmap, and updated vision doc.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Milestone 3, Phase 2: Feedback Loop -- Signal Writes Update User State
Phase Deliverable
When a signal event is written (view, like, skip, hide, block, completion, share), the database atomically updates multiple state targets: the item's signal ledger, the user's preference vector, the user-to-creator interaction weight, and the user-state bitmap indexes. One db.signal() call, multiple state updates, zero application logic.
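The fan-out from one signal to several state targets can be pictured as a small lookup, sketched below. All names here (`SignalKind`, `StateUpdate`, `plan_updates`) are hypothetical and do not come from the tidal crate. The mapping follows the acceptance criteria in this document; `completion` and `share` are left out because their targets are not listed here.

```rust
/// Hypothetical signal kinds; only the ones whose targets this document spells out.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SignalKind {
    View,
    Like,
    Skip,
    Hide,
    Block,
}

/// Hypothetical labels for the state targets a single signal write touches.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StateUpdate {
    ItemLedger,
    MarkSeen,
    PreferenceToward,
    PreferenceAway,
    InteractionIncrement,
    InteractionDecay,
    HideEdge,
    HiddenItemsSet,
    BlockEdge,
    BlockedCreatorsSet,
}

/// Returns the updates that one signal write would have to apply as a unit.
/// A real dispatcher would stage these and commit them together; this sketch
/// only shows which targets are involved.
fn plan_updates(kind: SignalKind) -> &'static [StateUpdate] {
    use StateUpdate::*;
    match kind {
        SignalKind::View => &[ItemLedger, MarkSeen, InteractionIncrement],
        SignalKind::Like => &[ItemLedger, PreferenceToward, InteractionIncrement],
        SignalKind::Skip => &[ItemLedger, PreferenceAway, InteractionDecay],
        SignalKind::Hide => &[HideEdge, HiddenItemsSet],
        SignalKind::Block => &[BlockEdge, BlockedCreatorsSet],
    }
}
```

Calling `plan_updates(SignalKind::Like)` yields the item ledger update, the preference shift toward the item, and the interaction increment, which is the set the caller would otherwise have to coordinate in application code.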
This phase closes the feedback loop inside the database. Before m3p2, signals only updated item-level aggregates. After m3p2, signals also update user-level state: the preference embedding shifts toward items the user engages with, interaction weights strengthen toward creators the user prefers, and hard negatives (hide/block) permanently exclude items and creators from future queries.
The phase delivers four components: (1) user preference vector EMA update with configurable learning rate, (2) interaction weight ledger using the existing decay infrastructure, (3) hard negative storage with WAL-backed durability, and (4) an atomic signal dispatch that wires all state updates into a single transactional signal write.
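For component (2), a minimal sketch of a decayed user-to-creator weight follows. It is not the m1p4 `DecayModel::Exponential` API, whose signature this document does not show: the `InteractionWeight` type, the half-life parameterisation, and the additive `amount` are assumptions made for illustration. The starting value of 1.0 follows the recommendation recorded under Open Questions.

```rust
/// Hypothetical stand-in for one user->creator weight.
#[derive(Debug, Clone, Copy)]
struct InteractionWeight {
    value: f64,   // weight as of `last_ts`
    last_ts: f64, // timestamp of the last update, in seconds
}

impl InteractionWeight {
    /// First interaction with a creator starts at full strength.
    fn first(ts: f64) -> Self {
        Self { value: 1.0, last_ts: ts }
    }

    /// Weight as seen at `now`, halving every `half_life` seconds
    /// (assumed parameterisation of exponential decay).
    fn at(&self, now: f64, half_life: f64) -> f64 {
        let elapsed = (now - self.last_ts).max(0.0);
        self.value * 0.5_f64.powf(elapsed / half_life)
    }

    /// Engagement signal: decay the stored weight to `now`, then add `amount`.
    fn increment(&mut self, now: f64, half_life: f64, amount: f64) {
        self.value = self.at(now, half_life) + amount;
        self.last_ts = self.last_ts.max(now);
    }
}
```

Storing the value together with its timestamp means reads can apply decay lazily, so an idle user-creator pair costs nothing until it is queried or updated again.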
Acceptance Criteria
- db.signal("view", item_id, 1.0, ts) with user context atomically: updates item signal ledger, marks item as seen in UserStateIndex, increments user->creator interaction weight
- db.signal("like", item_id, 1.0, ts) with user context atomically: updates item signal ledger, shifts user preference vector toward item embedding (EMA), increments user->creator interaction weight
- db.signal("skip", item_id, 1.0, ts) with user context atomically: updates item signal ledger, shifts user preference vector away from item embedding, decays user->creator interaction weight
- db.signal("hide", item_id, 1.0, ts) with user context atomically: writes permanent hide edge, adds item to UserBlockedSet.hidden_items, excludes from all future queries
- db.signal("block", user_id, creator_id, ...) atomically: writes permanent block edge, adds creator to UserBlockedSet.blocked_creators, excludes all creator items from all future queries
- Preference vector EMA: pref_new = normalize(alpha * item_embedding + (1 - alpha) * pref_old) with configurable alpha (default 0.1)
- Interaction weights use the same DecayModel::Exponential infrastructure from m1p4
- Hard negatives (hide/block) are WAL-backed and survive crash + replay
- Property test: for any sequence of hide/block/signal events, a RETRIEVE query NEVER returns a hidden item or blocked creator's items
- All updates visible to the next query (no eventual consistency lag within the process)
- Signal dispatch overhead < 50 microseconds beyond the base item signal write
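The preference vector criterion can be sketched as below. `ema_toward` implements the EMA formula stated above; `shift_away` uses the subtraction-plus-normalization variant recommended under Open Questions for skips. The function names and the plain `f32` slice representation are assumptions for illustration, not the planned `PreferenceVector` type.

```rust
/// Scale `v` to unit L2 length in place; a zero vector is left unchanged.
fn normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Engagement (e.g. like):
/// pref_new = normalize(alpha * item + (1 - alpha) * pref_old)
fn ema_toward(pref: &mut [f32], item: &[f32], alpha: f32) {
    assert_eq!(pref.len(), item.len(), "dimension mismatch");
    for (p, x) in pref.iter_mut().zip(item) {
        *p = alpha * *x + (1.0 - alpha) * *p;
    }
    normalize(pref);
}

/// Skip: pref_new = normalize(pref_old - alpha * item)
fn shift_away(pref: &mut [f32], item: &[f32], alpha: f32) {
    assert_eq!(pref.len(), item.len(), "dimension mismatch");
    for (p, x) in pref.iter_mut().zip(item) {
        *p -= alpha * *x;
    }
    normalize(pref);
}
```

With the default alpha of 0.1, a single like moves a unit preference vector only slightly toward the item embedding, and the result is re-normalized so later similarity reads see a unit-length vector.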
Dependencies
- Requires: m3p1 (user/creator entities, relationship graph, UserStateIndex, CreatorItemsBitmap), m1p4 (signal ledger, decay infrastructure), m1p5 (signal write API), m2p1 (vector index for embedding reads)
- Blocks: m3p3 (Personalized Profiles need updated preference vectors and interaction weights), m3p4 (User State Filters need populated seen/blocked bitmaps)
Research References
- docs/research/tidaldb_signal_ledger.md -- Three-tier storage, signal dispatch
- docs/research/ann_for_tidaldb.md -- User preference vector management
- thoughts.md -- Part V.16 (user preference vector as database-managed embedding)
Task Index
| # | Task | Delivers | Depends On | Complexity |
|---|---|---|---|---|
| 01 | User Preference Vector | EMA update, normalization, learning rate config, cold-start initialization, storage codec | None | L |
| 02 | Interaction Weight Ledger | User-to-creator weights using decay infrastructure, update on engagement signals, read API | Task 01 | M |
| 03 | Hard Negatives | Hide/block permanent storage, WAL-backed durability, crash-safe replay, bitmap integration | None | L |
| 04 | Atomic Signal Dispatch | UserContext wiring, multi-target signal dispatch, property tests for correctness invariants | Tasks 01, 02, 03 | L |
Task Dependency DAG
Task 01: User Preference Vector      Task 03: Hard Negatives
        |                                    |
        v                                    |
Task 02: Interaction Weight Ledger           |
        |                                    |
        +------------------------------------+
        |
        v
Task 04: Atomic Signal Dispatch
Tasks 01 and 03 can be built in parallel. Task 02 depends on Task 01 (needs entity lookup patterns). Task 04 depends on all three (wires everything together).
File Layout
tidal/src/
  entities/
    preference.rs       -- PreferenceVector, EMA update, normalization, storage (Task 01)
    interaction.rs      -- InteractionWeightLedger, decay-based weights (Task 02)
    hard_neg.rs         -- HardNegativeStore, WAL event types, replay (Task 03)
  db/
    signal_dispatch.rs  -- UserSignalContext, atomic multi-target dispatch (Task 04)
    mod.rs              -- Extended signal() API with user context
tidal/tests/
  m3p2_feedback_loop.rs -- Phase integration tests
Open Questions
- Preference vector learning rate scheduling: Should alpha decay over time (later updates cause smaller shifts as the model converges), or remain constant? Recommendation: constant alpha for M3. Adaptive alpha is an M6 refinement that requires tracking update count per user.
- Negative preference update formula: When a user skips an item, should the preference vector move directly away (pref - alpha * item_embedding) or toward the orthogonal complement? Recommendation: simple subtraction + normalization for M3. The orthogonal complement approach is mathematically cleaner but adds complexity without proven benefit at this scale.
- Interaction weight initial value: When a user first interacts with a creator they have no history with, what is the initial interaction weight? Recommendation: 1.0 as the initial weight, with decay applied from the first interaction timestamp. This means new interactions start at full strength and decay naturally.
- WAL event format for hide/block: Should hide/block use the same WAL event format as signal writes, or a dedicated relationship-change event type? Recommendation: extend the WAL event format with a RelationshipChange variant. This keeps the WAL as the single source of truth for all durable state changes and makes replay straightforward.
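To make the last recommendation concrete, here is a rough sketch of a WAL event enum with a `RelationshipChange` variant and a replay pass that rebuilds hard-negative state from the log. The field names, the `HardNegatives` type, and the use of `HashSet` in place of the bitmap indexes are all assumptions for illustration; the actual WAL format is not specified in this document.

```rust
use std::collections::HashSet;

/// Hypothetical relationship kinds carried by the new variant.
#[derive(Debug, Clone, PartialEq)]
enum RelationshipKind {
    Hide,  // target_id is an item
    Block, // target_id is a creator
}

/// Hypothetical WAL event: existing signal writes plus the proposed variant.
#[derive(Debug, Clone, PartialEq)]
enum WalEvent {
    Signal { item_id: u64, value: f32, ts: u64 },
    RelationshipChange { user_id: u64, target_id: u64, kind: RelationshipKind },
}

/// In-memory hard negatives; HashSet stands in for the bitmap indexes.
#[derive(Debug, Default)]
struct HardNegatives {
    hidden_items: HashSet<(u64, u64)>,     // (user_id, item_id)
    blocked_creators: HashSet<(u64, u64)>, // (user_id, creator_id)
}

impl HardNegatives {
    /// Apply one event; signal writes do not affect hard negatives.
    fn apply(&mut self, event: &WalEvent) {
        if let WalEvent::RelationshipChange { user_id, target_id, kind } = event {
            match kind {
                RelationshipKind::Hide => {
                    self.hidden_items.insert((*user_id, *target_id));
                }
                RelationshipKind::Block => {
                    self.blocked_creators.insert((*user_id, *target_id));
                }
            }
        }
    }

    /// Rebuild state after a crash by replaying the log from the start.
    fn replay(log: &[WalEvent]) -> Self {
        let mut state = Self::default();
        for event in log {
            state.apply(event);
        }
        state
    }

    /// True if the item must never be returned to this user.
    fn is_excluded(&self, user_id: u64, item_id: u64, creator_id: u64) -> bool {
        self.hidden_items.contains(&(user_id, item_id))
            || self.blocked_creators.contains(&(user_id, creator_id))
    }
}
```

Because inserts into a set are idempotent, replaying the same log twice produces the same state, which is the property crash recovery relies on in this sketch.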