tidaldb/docs/specs/10-feedback-loop.md

Feedback Loop Specification

Status: Draft
Authors: tidalDB Engineering
Date: 2026-02-20
Depends on: Signal System, Entity Model, Relationships, Storage Engine
References: VISION.md, SEQUENCE.md, thoughts.md, API.md


Table of Contents

  1. Design Principles
  2. Signal Ingestion Pipeline
  3. Preference Vector Management
  4. Atomic Multi-Update Semantics
  5. Implicit Signals
  6. Negative Signal Handling
  7. Signal Context
  8. Signal Ordering and Consistency
  9. Feedback Loop Correctness Properties
  10. Performance Targets
  11. Integration Points
  12. Property Tests

1. Design Principles

The feedback loop is what makes tidalDB a Stage 4 closed-loop database. In a Stage 3 system, queries read and writes write -- they are separate paths stitched together by ETL, Kafka consumers, and feature store syncs. In tidalDB, a single signal write atomically updates six subsystems, and the next query -- even 100ms later -- reflects the new state.

The Write Path and the Read Path Are One System

Engagement events and ranking queries share a storage model and a signal ledger. There is no ETL between them. A like signal writes to the WAL, updates the item's decay score, shifts the user's preference vector, increments the user-creator interaction weight, marks the item as liked for that user, and attributes to the user's cohort counters. All of this happens in the time between db.signal() being called and Ok(()) being returned.

Every Engagement Event Updates the Ranking State

There is no concept of "recording an event now, processing it later." The WAL append is the durability guarantee. The derived state updates are the ranking guarantee. Both complete within the signal write path.

No ETL. No Kafka. No Feature Store Sync.

The database IS the feature store. The user's preference vector, the item's engagement velocity, the user-creator interaction weight, the cohort-level trending signals -- all of these are database-managed derived state. The application writes db.signal(Signal { kind: "like", ... }). The database maintains everything else.

Negative Signals Are Equal Citizens

A skip, a hide, a block, a "not interested" -- these update the system with the same immediacy and precision as a like or a completion. They are not the absence of positive engagement. They are data. They carry explicit weight in the decay score, the preference vector, the relationship weight, and the cohort counters.

The Next Query Reflects the Updated State

After a signal write returns Ok(()), every derived state update has completed. A ranking query issued 1ms later sees the updated decay score, the shifted preference vector, the incremented interaction weight, and the updated cohort counters. The staleness bound is zero during normal operation. On crash recovery, staleness is bounded by WAL replay time (typically less than 30 seconds).


2. Signal Ingestion Pipeline

The complete pipeline from API call to durable state. Each step is described with its inputs, outputs, failure modes, and performance budget.

Pipeline Data Flow Diagram

Application calls db.signal(Signal { kind: "like", item: "item_abc", user: "user_123", ... })
     |
     v
[Step 1: DEDUPLICATION CHECK] ──────────────────────────────── ~100 ns (bloom miss)
     |  Input:  signal event
     |  Action: BLAKE3(signal_type, item_id, user_id, timestamp_trunc_1s)
     |  Check:  in-memory bloom filter -> if hit, check on-disk hash set
     |  Output: PASS (new event) or SKIP (duplicate)
     |  Fail:   bloom filter false positive -> on-disk lookup (~50 us), never data loss
     v
[Step 2: WAL APPEND] ──────────────────────────────────────── ~50 us (batched fsync)
     |  Input:  validated signal event
     |  Action: serialize to WAL format (33 + context_len + 8 bytes)
     |  Sync:   per-signal-type durability (Immediate | Batched | Eventual)
     |  Output: durable event with WAL sequence number
     |  Fail:   fsync failure -> return Err to caller, event NOT committed
     |
     |  *** CONSISTENCY BOUNDARY ***
     |  After this point, the event is durable. All subsequent steps
     |  produce derived state that can be reconstructed from the WAL.
     v
[Step 3: SIGNAL LEDGER UPDATE] ────────────────────────────── ~40 ns
     |  Input:  event weight, timestamp, signal type definition (lambdas)
     |  Action: CAS update on HotSignalState.decay_scores[0..3]
     |          atomic increment on WarmSignalState.minute_bucket
     |          atomic increment on WarmSignalState.all_time_count
     |  Output: updated decay scores and windowed counters
     |  Fail:   CAS retry loop -> bounded by concurrent writer count, never fails
     v
[Step 4: USER PREFERENCE VECTOR SHIFT] ────────────────────── ~10 us
     |  Input:  user's current preference vector (1536D)
     |          item's content embedding (1536D)
     |          signal polarity (positive/negative)
     |          signal-specific weight
     |          user's adaptive learning rate
     |  Action: vector arithmetic -> normalize -> write back
     |  Output: updated user preference vector
     |  Fail:   entity not found -> skip (user may have been deleted)
     v
[Step 5: RELATIONSHIP WEIGHT UPDATE] ──────────────────────── ~5 us
     |  Input:  user_id, creator_id (resolved from item's creator_id)
     |          signal-specific delta (from signal_weight_map)
     |          current interaction_weight + timestamp
     |  Action: decay current weight by dt, add delta, clamp to [0.0, 1.0]
     |          update engagement_affinity(user, item) similarly
     |  Output: updated interaction_weight and engagement_affinity edges
     |  Fail:   edge not found -> create with initial weight
     v
[Step 6: COHORT ATTRIBUTION] ──────────────────────────────── ~20 us
     |  Input:  user's cached UserCohortMemberships (22 bytes)
     |          item's cohort tracking activation status
     |  Action: if cohort tracking active for this item:
     |            increment global counter (always)
     |            increment region, language, age_group counters
     |            increment behavioral segment counters (per bitmap)
     |          else:
     |            increment global counter only
     |            check activation threshold
     |  Output: updated cohort dimensional counters
     |  Fail:   stale cohort memberships -> bounded error per refresh interval
     v
[Step 7: USER STATE UPDATE] ───────────────────────────────── ~5 us
     |  Input:  user_id, item_id, signal_kind
     |  Action: update user-item state bitmap:
     |            "view"       -> mark item as "seen"
     |            "like"       -> mark item as "liked"
     |            "completion" -> mark item as "seen", update progress
     |            "save"       -> mark item as "saved"
     |            "hide"       -> mark item as "hidden" (permanent exclusion)
     |  Output: updated user-item state
     |  Fail:   N/A (idempotent bitmap set)
     v
[RETURN Ok(())] ───────────────────────────────────── Total: < 100 us p50

Step-by-Step Detail

Step 1: Deduplication

Signal events are deduplicated using BLAKE3 content-addressed hashing, as specified in Signal System Section 8.

fn signal_content_hash(signal: &Signal) -> [u8; 32] {
    let mut hasher = blake3::Hasher::new();
    hasher.update(signal.kind.as_bytes());
    hasher.update(&signal.item_id.to_bytes());
    hasher.update(&signal.user_id.to_bytes());
    // Truncate timestamp to second granularity: sub-second retries
    // of the same logical event are treated as duplicates.
    let ts_secs = signal.timestamp.timestamp();
    hasher.update(&ts_secs.to_le_bytes());
    *hasher.finalize().as_bytes()
}

Two-level dedup structure:

| Level | Structure | Cost | False Positives |
|---|---|---|---|
| L1 | In-memory bloom filter (~240 MB for 100M events at 0.01% FPR, about 19 bits per event) | ~100 ns | 0.01% |
| L2 | On-disk hash set (consulted only on L1 hit) | ~50 us | 0% |

On L1 miss (99.99% of events): the event is new. Proceed to Step 2. On L1 hit: consult L2. If L2 confirms duplicate, return Ok(()) silently -- the event was already processed. If L2 does not contain the hash (false positive from L1), proceed to Step 2.

Bloom filter maintenance: The bloom filter covers the most recent 100M events. Older events fall out of the filter but remain in the on-disk hash set. The filter is rebuilt from the on-disk set on startup. This bounds memory usage while providing fast-path dedup for the common case (recent retries).
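The two-level check can be sketched as follows. This is a minimal illustration, not the tidalDB implementation: `TwoLevelDedup`, `DedupOutcome`, and `check_and_record` are hypothetical names, an in-memory `HashSet` stands in for the on-disk hash set, and the bloom probes use the standard library hasher rather than BLAKE3.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Outcome of the Step 1 check.
#[derive(Debug, PartialEq)]
pub enum DedupOutcome {
    New,
    Duplicate,
}

pub struct TwoLevelDedup {
    bits: Vec<u64>,          // L1: bloom filter bit array
    num_hashes: u64,         // probes per key
    seen: HashSet<[u8; 32]>, // L2: stand-in for the on-disk hash set
}

impl TwoLevelDedup {
    pub fn new(num_bits: usize, num_hashes: u64) -> Self {
        // Always allocate at least one 64-bit word.
        let words = (num_bits.max(64) + 63) / 64;
        Self { bits: vec![0; words], num_hashes, seen: HashSet::new() }
    }

    fn bit_index(&self, key: &[u8; 32], seed: u64) -> usize {
        let mut h = DefaultHasher::new();
        seed.hash(&mut h);
        key.hash(&mut h);
        (h.finish() % (self.bits.len() as u64 * 64)) as usize
    }

    fn bloom_contains(&self, key: &[u8; 32]) -> bool {
        (0..self.num_hashes).all(|seed| {
            let i = self.bit_index(key, seed);
            self.bits[i / 64] & (1u64 << (i % 64)) != 0
        })
    }

    fn bloom_insert(&mut self, key: &[u8; 32]) {
        for seed in 0..self.num_hashes {
            let i = self.bit_index(key, seed);
            self.bits[i / 64] |= 1u64 << (i % 64);
        }
    }

    /// L1 miss: new event. L1 hit: confirm against L2 before skipping.
    pub fn check_and_record(&mut self, key: [u8; 32]) -> DedupOutcome {
        if self.bloom_contains(&key) && self.seen.contains(&key) {
            return DedupOutcome::Duplicate;
        }
        self.bloom_insert(&key);
        self.seen.insert(key);
        DedupOutcome::New
    }
}
```

A bloom false positive only costs the L2 lookup: the event is still classified as new because the exact set does not contain it.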

Step 2: WAL Append

The WAL append is the consistency boundary. After this step, the event is durable and will survive any crash. All subsequent steps produce derived state that can be reconstructed by replaying the WAL.

The WAL format and durability levels are specified in Signal System Section 8 and Storage Engine Section 3. The relevant parameters:

| Signal Category | Durability Level | Effective Latency |
|---|---|---|
| Financial/purchase signals | Immediate (fsync per write) | ~1 ms |
| Engagement signals (view, like, share, completion) | Batched { max_batch: 100, max_delay_ms: 10 } | ~50 us (amortized) |
| Impressions, telemetry | Eventual (OS-scheduled fsync) | ~1 us |

The group commit queue accumulates signal events and issues a single fsync per batch. Writers block on a per-batch condition variable until their batch is synced. This follows the PostgreSQL commit delay pattern, validated in production by Citadel's GroupCommitQueue.

If the WAL append fails (disk full, I/O error), the signal write returns Err(SignalError::DurabilityFailure) to the caller. No derived state is updated. The event is not committed. The caller must retry or propagate the error.
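The per-signal-type durability selection might look like the sketch below. The variant names follow the table above; the function `durability_for`, the `"purchase"` and `"telemetry"` kind strings, and the choice of `Batched` as the fallback for unrecognized kinds are assumptions made for illustration.

```rust
/// WAL durability levels named in the table above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Durability {
    /// fsync on every write.
    Immediate,
    /// Group commit: one fsync per batch.
    Batched { max_batch: u32, max_delay_ms: u32 },
    /// Leave flushing to the OS.
    Eventual,
}

/// Hypothetical mapping from signal kind to durability level.
pub fn durability_for(kind: &str) -> Durability {
    match kind {
        "purchase" => Durability::Immediate,
        "impression" | "telemetry" => Durability::Eventual,
        // Engagement signals, and (by assumption) anything unrecognized,
        // go through the group commit queue.
        _ => Durability::Batched { max_batch: 100, max_delay_ms: 10 },
    }
}
```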

Step 3: Signal Ledger Update

Updates the per-item signal aggregation state in the hot tier and warm tier. This step is lock-free -- it uses atomic CAS operations on cache-line-aligned HotSignalState structs, as specified in Signal System Section 3.

Hot tier update (decay scores):

// For each configured decay rate (up to 3):
//   1. Load current score (Acquire)
//   2. Decay by dt: score * exp(-lambda * dt)
//   3. Add new weight: score + weight
//   4. CAS store (AcqRel)
//   5. Update last_update_ns if event is newer (Release)
//
// Cost: 3 exp() calls = ~36 ns on modern hardware (12 ns per exp())
// See Signal System Section 4 for the running score formula and proof of exactness.

Warm tier update (windowed counters):

// 1. Atomic increment on current minute bucket (Relaxed -- counter, not synchronization)
// 2. Atomic increment on all_time counter (Relaxed)
// Cost: 2 atomic adds = ~4 ns

Out-of-order events: If event_time < last_update_ns, the weight is pre-decayed before addition. The timestamp is not advanced. See Signal System Section 4, "Out-of-Order Events."
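The in-order and out-of-order cases can be illustrated with a single-threaded sketch. It deliberately omits the CAS loop and atomics, uses seconds as `f64` instead of `last_update_ns`, and the names `DecaySlot` and `apply_event` are hypothetical. It is meant only to show why applying the same events in either order yields the same score.

```rust
/// One decay slot: running score plus the time of the newest event folded in.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DecaySlot {
    pub score: f64,
    pub last_update_secs: f64,
}

pub fn apply_event(slot: DecaySlot, event_secs: f64, weight: f64, lambda: f64) -> DecaySlot {
    if event_secs >= slot.last_update_secs {
        // In-order: decay the running score forward to the event time, then add.
        let dt = event_secs - slot.last_update_secs;
        DecaySlot {
            score: slot.score * (-lambda * dt).exp() + weight,
            last_update_secs: event_secs,
        }
    } else {
        // Out-of-order: pre-decay the weight by its age; do not advance the timestamp.
        let age = slot.last_update_secs - event_secs;
        DecaySlot {
            score: slot.score + weight * (-lambda * age).exp(),
            last_update_secs: slot.last_update_secs,
        }
    }
}
```

Feeding events at t=10 and t=20 in either order leaves the slot with the same score as of t=20, which is the property the crash-recovery replay relies on.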

Step 4: User Preference Vector Shift

Moves the user's preference embedding toward or away from the item's content embedding. This is the mechanism by which tidalDB learns the user's taste from their behavior. Full details in Section 3.

What it reads:

  • User's current preference vector from user entity store (1536 dimensions, f32)
  • Item's content embedding from item entity store (1536 dimensions, f32)
  • Signal-specific weight from the preference weight table
  • User's adaptive learning rate (derived from signal count)

What it writes:

  • Updated user preference vector to user entity store
  • Updated user preference vector to HNSW index (incremental re-insertion)

If the user or item entity does not exist (deleted between signal write and this step), the preference update is skipped. The WAL still records the event. On the next query, the skip is harmless -- the user or item is gone.

Step 5: Relationship Weight Update

Updates two implicit relationship edges as a side-effect of the signal, as specified in Relationships Section 8.

Interaction weight (user -> creator):

current = load_edge(user, interaction_weight, creator)
decayed = current.weight * exp(-lambda_iw * dt)
new_weight = clamp(decayed + signal_delta, 0.0, 1.0)
store_edge(user, interaction_weight, creator, new_weight, now)

Where signal_delta comes from the signal weight map in Relationships Section 8:

| Signal | Delta | Rationale |
|---|---|---|
| view | +0.01 | Weak positive. Viewing is passive. |
| completion | +0.03 * ratio | Moderate positive, scaled by completion ratio. |
| like | +0.05 | Strong positive. Explicit approval. |
| share | +0.07 | Very strong positive. Social endorsement. |
| comment | +0.04 | Strong positive. Active engagement. |
| save | +0.03 | Moderate positive. Intent to return. |
| skip | -0.02 | Weak negative. Single skip is noisy. |
| hide | -0.10 | Strong negative. Explicit rejection. |
| not_interested | -0.08 | Strong negative. Topic-level rejection. |
| block | -> 0.0 | Zeroes weight entirely. Triggers cascade. |

Engagement affinity (user -> item):

Created on the first signal event for the (user, item) pair. Updated on subsequent signals. Decays with a 7-day half-life. See Relationships Section 8 for the full formula.

If no edge exists: Create one with the signal's initial delta as the weight. This is common for first-time interactions.
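A sketch of the interaction weight update, under stated assumptions: the deltas are taken from the table above, while `WeightEffect`, `effect_for`, and `update_weight` are hypothetical names, and `lambda` is supplied by the caller because this section does not state the value of lambda_iw.

```rust
/// Effect of one signal on the user -> creator interaction weight.
#[derive(Debug, Clone, Copy)]
pub enum WeightEffect {
    /// Add a signed delta after decaying the current weight.
    Add(f64),
    /// Zero the weight outright (block).
    Zero,
}

/// Deltas from the signal weight map; `completion_ratio` is used only for "completion".
pub fn effect_for(kind: &str, completion_ratio: f64) -> Option<WeightEffect> {
    Some(match kind {
        "view" => WeightEffect::Add(0.01),
        "completion" => WeightEffect::Add(0.03 * completion_ratio),
        "like" => WeightEffect::Add(0.05),
        "share" => WeightEffect::Add(0.07),
        "comment" => WeightEffect::Add(0.04),
        "save" => WeightEffect::Add(0.03),
        "skip" => WeightEffect::Add(-0.02),
        "hide" => WeightEffect::Add(-0.10),
        "not_interested" => WeightEffect::Add(-0.08),
        "block" => WeightEffect::Zero,
        _ => return None,
    })
}

/// Decay the current weight by dt, apply the effect, clamp to [0.0, 1.0].
pub fn update_weight(current: f64, lambda: f64, dt_secs: f64, effect: WeightEffect) -> f64 {
    match effect {
        WeightEffect::Zero => 0.0,
        WeightEffect::Add(delta) => {
            let decayed = current * (-lambda * dt_secs).exp();
            (decayed + delta).clamp(0.0, 1.0)
        }
    }
}
```

With a half-life expressed as `lambda = ln(2) / half_life`, a weight of 0.5 left untouched for one half-life and then liked becomes 0.25 + 0.05 = 0.30.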

Step 6: Cohort Attribution

Resolves the user's cohort memberships and increments dimensional counters on the target item. This is the mechanism that enables cohort-scoped queries like "what is trending among US users aged 18-24 who like jazz."

Full architecture is specified in Signal System Section 7. The key design decision: cohort tracking is threshold-gated. Items with fewer than 100 events/hour for a signal type only receive global counter increments. Items above the threshold receive full dimensional decomposition.

What it reads:

  • UserCohortMemberships (22 bytes, cached in user's hot-tier state):
    struct UserCohortMemberships {
        region: CohortValueId,      // 2 bytes
        language: CohortValueId,    // 2 bytes
        age_group: CohortValueId,   // 2 bytes
        segments: BitSet128,        // 16 bytes (one bit per behavioral segment)
    }
    
  • Item's cohort tracking activation flag

What it writes (below threshold):

  • 1 global counter increment

What it writes (above threshold, user in 8 segments):

  • 1 global + 3 demographic + 8 segment = 12 counter increments

Average write amplification: about 1.11x across all events (0.99 * 1 + 0.01 * 12, assuming 1% of events target cohort-tracked items and each of those produces the 12 increments shown above).
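The increment count per event can be sketched directly from the membership struct. Here `u16` and `u128` stand in for `CohortValueId` and `BitSet128`, and `counter_increments` is a hypothetical helper, not part of the specified API.

```rust
/// Stand-in for the 22-byte membership record described above.
pub struct UserCohortMemberships {
    pub region: u16,
    pub language: u16,
    pub age_group: u16,
    pub segments: u128, // one bit per behavioral segment
}

/// Number of counters touched by one signal on one item.
pub fn counter_increments(m: &UserCohortMemberships, tracking_active: bool) -> u32 {
    if !tracking_active {
        return 1; // global counter only
    }
    // 1 global + 3 demographic (region, language, age_group) + one per set segment bit.
    1 + 3 + m.segments.count_ones()
}
```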

Step 7: User State Update

Marks the item's state in the user's engagement history. This powers Filter::unseen(), Filter::user_state("liked"), Filter::user_state("saved"), and the permanent exclusion behavior of hide.

State transitions by signal type:

| Signal | State Written | Filter Affected |
|---|---|---|
| view | seen | Filter::unseen() excludes this item |
| like | liked | Filter::user_state("liked") includes this item |
| completion | seen, progress updated | Filter::user_state("in_progress") if partial |
| save | saved | Filter::user_state("saved") includes this item |
| hide | hidden (permanent) | Item excluded from ALL future queries |
| skip | seen | Filter::unseen() excludes this item |
| download | downloaded | Filter::user_state("downloaded") includes this item |

The user-item state is stored as a compact bitmap in the user's relationship edge set. The hidden flag is a permanent, irrevocable exclusion -- see Section 6 for full cascade behavior.
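One way to picture the bitmap is the sketch below. The bit layout, the constant names, and the rule that a full completion clears the in-progress bit are all assumptions made for illustration; only the signal-to-state mapping comes from the table above.

```rust
// Hypothetical bit assignments for the user-item state bitmap.
pub const SEEN: u8 = 1 << 0;
pub const LIKED: u8 = 1 << 1;
pub const SAVED: u8 = 1 << 2;
pub const HIDDEN: u8 = 1 << 3;
pub const DOWNLOADED: u8 = 1 << 4;
pub const IN_PROGRESS: u8 = 1 << 5;

/// Apply one signal to the state bitmap. Setting an already-set bit is a no-op,
/// which is what makes replay of this step idempotent.
pub fn apply_signal(state: u8, kind: &str, completion_ratio: f32) -> u8 {
    match kind {
        "view" | "skip" => state | SEEN,
        "like" => state | LIKED,
        "save" => state | SAVED,
        "hide" => state | HIDDEN,
        "download" => state | DOWNLOADED,
        "completion" if completion_ratio < 1.0 => state | SEEN | IN_PROGRESS,
        // Assumption: a full completion clears the in-progress bit.
        "completion" => (state | SEEN) & !IN_PROGRESS,
        _ => state,
    }
}

/// No signal in this sketch clears HIDDEN, so the exclusion is permanent.
pub fn is_excluded(state: u8) -> bool {
    state & HIDDEN != 0
}
```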


3. Preference Vector Management

The user's preference vector is a database-managed embedding that evolves with every signal. It is the primary mechanism by which tidalDB personalizes ranking queries. The vector is declared in the Entity Model as EmbeddingSource::DatabaseManaged on the preference slot of the User entity.

Update Formula

Positive signal (view, like, share, completion, save, comment, search_click):

pref_new = normalize(pref + lr * weight * (item_embedding - pref))

Negative signal (skip, hide, not_interested, block, dislike, downvote):

pref_new = normalize(pref - lr * weight * (item_embedding - pref))

Where:

  • pref is the user's current preference vector (1536 dimensions, unit length)
  • item_embedding is the item's content embedding (1536 dimensions, unit length)
  • lr is the adaptive learning rate (see below)
  • weight is the signal-specific weight (see below)
  • normalize() projects the result back to unit length
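The two formulas differ only in sign, so a single function covers both. The sketch below uses plain `Vec<f32>` and hypothetical function names; it says nothing about how tidalDB stores or vectorizes the 1536-dimensional arithmetic.

```rust
/// Project a vector back to unit length (returned unchanged if it is all zeros).
pub fn normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// pref_new = normalize(pref +/- lr * weight * (item_embedding - pref))
pub fn shift_preference(
    pref: &[f32],
    item_embedding: &[f32],
    lr: f32,
    weight: f32,
    positive: bool,
) -> Vec<f32> {
    assert_eq!(pref.len(), item_embedding.len(), "dimension mismatch");
    let sign = if positive { 1.0 } else { -1.0 };
    let moved: Vec<f32> = pref
        .iter()
        .zip(item_embedding)
        .map(|(p, i)| p + sign * lr * weight * (i - p))
        .collect();
    normalize(&moved)
}
```

For example, with `pref = [1, 0]` and an orthogonal item `[0, 1]`, a positive signal tilts the result toward the item's axis and a negative signal tilts it away, while the result stays unit length in both cases.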

Signal-Specific Weights

| Signal | Weight | Direction | Rationale |
|---|---|---|---|
| view | 0.3 | Positive | Passive engagement. Weak but frequent signal. |
| like | 1.0 | Positive | Explicit approval. Strong intent signal. |
| completion(ratio) | ratio | Positive | Proportional to consumption depth. Full completion = strong positive. |
| share | 1.5 | Positive | Social endorsement. Strongest positive signal. |
| save | 1.0 | Positive | Return intent. Comparable to like. |
| comment | 0.8 | Positive | Active engagement. |
| search_click | 0.5 | Positive | Moderate intent from search context. |
| skip | 0.3 | Negative | Weak negative. Single skip is noisy. |
| dislike | 0.8 | Negative | Explicit negative. |
| hide | 1.0 | Negative | Strong explicit rejection. |
| not_interested | 1.5 | Negative | Strongest explicit negative. Topic-level rejection. |
| block | 2.0 | Negative | Nuclear option. Full aversion toward creator's catalog. |

Adaptive Learning Rate

The learning rate decays as the user accumulates more signal events. Early signals have a large effect on the preference vector (rapid adaptation during cold start). Later signals have a smaller effect (stability after the preference vector has converged).

lr = max(lr_max * exp(-decay_k * signal_count), lr_min)

| Parameter | Value | Rationale |
|---|---|---|
| lr_max | 0.10 | Initial learning rate for cold-start users. A single like moves the vector ~10% toward the item. |
| lr_min | 0.01 | Floor learning rate for mature users. A single like moves the vector ~1%. |
| decay_k | 0.003 | The exponential term falls to lr_min after ~770 signals (ln(10) / 0.003); beyond that point lr is held at the lr_min floor. |

Rationale for these values: At lr_max = 0.10 and signal weight 1.0 (like), the preference vector moves by approximately 0.10 * ||item - pref|| / ||pref|| per signal. For orthogonal vectors (worst case), this is a ~10% shift. For nearby vectors, much less. After 20 signals, the vector is meaningfully personalized (no longer population centroid). After 100 signals, the vector reflects clear user preferences. After 1000+ signals, individual events barely move it -- stability is achieved.

Learning Rate by Signal Count

| Signal Count | lr | Behavior |
|---|---|---|
| 0 (cold start) | 0.100 | Large jumps. 5 likes in the same category establish a clear preference. |
| 20 | 0.094 | Still adapting rapidly. Exploration phase. |
| 100 | 0.074 | User has clear preferences. Still responsive to new interests. |
| 500 | 0.022 | Preferences well established. Gradual evolution. |
| 1000 | 0.010 | At floor. New interests require sustained engagement. |
| 2000+ | 0.010 | At floor. Maximum stability. |

Momentum (EWMA Smoothing)

Raw preference updates can oscillate when the user engages with diverse content in rapid succession (e.g., watching a jazz tutorial then a cooking video then a gaming stream). EWMA smoothing prevents thrashing:

pref_smoothed = alpha * pref_raw + (1 - alpha) * pref_prev
pref_new = normalize(pref_smoothed)

| Parameter | Value | Rationale |
|---|---|---|
| alpha | 0.7 | New direction gets 70% weight, previous direction gets 30%. Responsive but not twitchy. |

The smoothing is applied after the direction computation but before normalization. It ensures that a single anomalous signal does not jerk the preference vector far from its established trajectory.
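The smoothing step can be sketched as a blend followed by a single normalization. The sketch assumes `pref_raw` is the updated vector before normalization, which is one reading of the sentence above, and `ewma_smooth` is a hypothetical name.

```rust
/// pref_new = normalize(alpha * pref_raw + (1 - alpha) * pref_prev)
pub fn ewma_smooth(pref_raw: &[f32], pref_prev: &[f32], alpha: f32) -> Vec<f32> {
    assert_eq!(pref_raw.len(), pref_prev.len(), "dimension mismatch");
    let blended: Vec<f32> = pref_raw
        .iter()
        .zip(pref_prev)
        .map(|(r, p)| alpha * r + (1.0 - alpha) * p)
        .collect();
    // Normalize once, after blending.
    let norm = blended.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return blended;
    }
    blended.iter().map(|x| x / norm).collect()
}
```

Under that reading, where `pref_raw = pref_prev + step`, the blend equals `pref_prev + alpha * step`, so alpha = 0.7 behaves like a 30% reduction of each individual step.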

Cold Start Initialization

When a new user is created with no signal history, the preference vector must be initialized to something meaningful.

Strategy hierarchy (first applicable wins):

  1. Explicit interests provided: If explicit_interests are set on the user entity at creation (e.g., ["jazz", "piano", "cooking"]), compute the centroid of the interest embeddings:

    pref_initial = normalize(mean([embed("jazz"), embed("piano"), embed("cooking")]))
    

    Where embed(interest) looks up the pre-computed interest centroid from the schema's interest vocabulary.

  2. Demographic cohort available: If the user has region, age_range, or other demographic fields, use the cohort centroid:

    pref_initial = cohort_centroid(region, age_range)
    

    Cohort centroids are computed daily by the background materializer as the mean preference vector of all users in that cohort.

  3. Population centroid: Fall back to the global population centroid:

    pref_initial = population_centroid
    

    Computed daily as the mean preference vector of all users with 100+ signals.
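The strategy hierarchy reduces to a first-match fallback chain. In this sketch the interest embeddings and centroids are passed in already resolved; `initial_preference` and `centroid` are hypothetical helpers, and the lookup of `embed(interest)` and the cohort centroid is assumed to happen elsewhere.

```rust
fn normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// Component-wise mean; None when there are no vectors.
fn centroid(vectors: &[Vec<f32>]) -> Option<Vec<f32>> {
    let first = vectors.first()?;
    let mut sum = vec![0.0f32; first.len()];
    for v in vectors {
        for (s, x) in sum.iter_mut().zip(v) {
            *s += x;
        }
    }
    let n = vectors.len() as f32;
    Some(sum.into_iter().map(|s| s / n).collect())
}

/// First applicable strategy wins: explicit interests, then cohort, then population.
pub fn initial_preference(
    interest_embeddings: &[Vec<f32>],
    cohort_centroid: Option<Vec<f32>>,
    population_centroid: Vec<f32>,
) -> Vec<f32> {
    if let Some(c) = centroid(interest_embeddings) {
        return normalize(&c);
    }
    cohort_centroid.unwrap_or(population_centroid)
}
```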

Convergence Guarantee

With consistent engagement patterns, the preference vector converges. It does not oscillate.

Proof sketch: The update rule pref += lr * w * (item - pref) is a weighted average that pulls the preference vector toward the engagement-weighted centroid of the user's consumed items. The adaptive learning rate ensures that the step size decreases with experience. The EWMA smoothing dampens high-frequency noise. Note that the classical Robbins-Monro conditions (sum(lr_i) = infinity and sum(lr_i^2) < infinity) are not strictly satisfied: because lr never drops below lr_min > 0, sum(lr_i^2) diverges. With a constant floor step size, the sequence instead converges to a neighborhood of the engagement-weighted centroid whose radius shrinks with lr_min. This is the intended trade-off: the vector stays stable under consistent engagement while remaining able to track genuine interest shifts.

Worked Example

A new user signs up with explicit_interests: ["jazz"]. Their initial preference vector points toward the jazz centroid: pref_0 = normalize(embed("jazz")).

Signal 1: Views a jazz piano tutorial (item_A)

lr = 0.10 (0 previous signals)
weight = 0.3 (view signal)
direction = item_A_embedding - pref_0
pref_1 = normalize(pref_0 + 0.10 * 0.3 * direction)
       = normalize(pref_0 + 0.03 * direction)

The vector shifts slightly toward item_A's specific position in the jazz space. Movement: ~3% of the distance between pref and item_A.

Signal 2: Likes the jazz piano tutorial (item_A)

lr = 0.0997 (1 previous signal)
weight = 1.0 (like signal)
direction = item_A_embedding - pref_1
pref_2 = normalize(pref_1 + 0.0997 * 1.0 * direction)
       = normalize(pref_1 + 0.0997 * direction)

Larger movement: ~10% of the remaining distance toward item_A. After a view + like, the preference vector is distinctly oriented toward this specific content.

Signal 3: Skips a cooking video (item_B)

lr = 0.0994 (2 previous signals)
weight = 0.3 (skip signal)
direction = item_B_embedding - pref_2
pref_3 = normalize(pref_2 - 0.0994 * 0.3 * direction)
       = normalize(pref_2 - 0.0298 * direction)

The vector shifts slightly away from cooking content. Movement: ~3% away from item_B. This is a mild signal -- a single skip does not create a strong aversion.

After 100 signals (80 jazz-related positive, 20 mixed):

The preference vector is firmly oriented in the jazz/music region of the embedding space. lr has decayed to ~0.074. Individual signals produce shifts of about 2.2% for a view (weight 0.3) up to 7.4% for a like (weight 1.0), which are small enough to maintain stability but large enough to track genuine interest shifts.
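The step sizes quoted in this example can be reproduced with a few lines of arithmetic. The sketch assumes the learning rate is the exponential term lr_max * exp(-decay_k * signal_count) held at a floor of lr_min, which is the reading that matches the 0.0997, 0.0994, and ~0.074 figures above; `learning_rate` and `step_fraction` are hypothetical names.

```rust
const LR_MAX: f64 = 0.10;
const LR_MIN: f64 = 0.01;
const DECAY_K: f64 = 0.003;

/// Adaptive learning rate (assumed form: decaying exponential with a floor).
pub fn learning_rate(signal_count: u32) -> f64 {
    (LR_MAX * (-DECAY_K * signal_count as f64).exp()).max(LR_MIN)
}

/// Fraction of (item_embedding - pref) applied by one signal.
pub fn step_fraction(signal_count: u32, weight: f64) -> f64 {
    learning_rate(signal_count) * weight
}
```

Signal 1 (view, 0 prior signals) gives 0.10 * 0.3 = 0.03, Signal 2 (like, 1 prior) gives about 0.0997, and Signal 3 (skip, 2 prior) gives about 0.0298, matching the figures in the example.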

Preference Vector Storage

The preference vector is stored in two places:

  1. Entity store: Under [user_id][0x00][EMB:preference] -- the durable copy, updated on every signal write.
  2. HNSW index: USearch index for the User entity's preference slot -- used for ANN retrieval queries like Candidate::Ann { query_vector: VectorSource::UserPreference }.

The HNSW index is updated incrementally on each preference shift. Full HNSW rebuild occurs on startup or when the incremental insertion quality degrades beyond a threshold (measured by recall@10 spot-checks during background maintenance).

Background Full Recomputation

To correct for incremental drift (accumulated floating-point error from thousands of small updates), the background materializer performs a daily full recomputation:

For each user with 100+ signals:
    Load all signal events from the last 90 days (or all events if fewer)
    Sort by timestamp ascending
    Start from cold-start initialization
    Replay all events through the preference update formula
    Compare with current preference vector
    If cosine_distance(recomputed, current) > 0.01:
        Replace current with recomputed
        Re-index in HNSW

In practice, the incremental and fully-recomputed vectors diverge by less than 0.005 cosine distance after 10,000 signals, so replacements are rare.
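The replacement test in the loop above is a cosine-distance comparison against the 0.01 threshold. A minimal sketch, with hypothetical function names and no handling for zero-length vectors (which would yield NaN and compare as "no replacement"):

```rust
/// Drift threshold from the recomputation procedure above.
pub const DRIFT_THRESHOLD: f32 = 0.01;

/// 1 - cos(a, b). Inputs are assumed non-zero and of equal dimension.
pub fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}

/// True when the recomputed vector should replace the incremental one.
pub fn needs_replacement(recomputed: &[f32], current: &[f32]) -> bool {
    cosine_distance(recomputed, current) > DRIFT_THRESHOLD
}
```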


4. Atomic Multi-Update Semantics

The signal write pipeline (Steps 3-7) is NOT wrapped in a traditional ACID transaction across all subsystems. This is a deliberate architectural choice.

Why Not a Transaction

A cross-subsystem transaction would require one of:

  1. A global mutex -- blocking all concurrent signal writes and ranking queries. This violates the lock-free hot-path requirement from Signal System Section 3.
  2. Two-phase commit -- coordinating the signal ledger, preference vector, relationship store, cohort counters, and user state into a single distributed commit. The overhead would exceed the entire performance budget.
  3. MVCC across heterogeneous stores -- maintaining read snapshots across the hot-tier atomics, the entity store, and the relationship store. The complexity is unjustifiable for the guarantees it provides.

The WAL Is the Transaction

The WAL append (Step 2) IS the durability guarantee. It is the single point of truth. All subsequent updates are derived state. The correctness argument is:

  1. If the process does not crash: Steps 3-7 complete inline, producing consistent derived state. The next query sees all updates.

  2. If the process crashes after Step 2 but before completing Steps 3-7: On recovery, the WAL is replayed from the last checkpoint. Each WAL event is re-processed through Steps 3-7. Because:

    • Decay score updates are commutative (the CAS loop produces the same result regardless of application order for events with the same timestamp; for different timestamps, the running score formula is mathematically exact)
    • Preference vector updates are idempotent per event (the BLAKE3 dedup prevents double-application)
    • Relationship weight updates are idempotent per event (same dedup mechanism)
    • Cohort counter increments are idempotent per event (same dedup mechanism)
    • User state bitmap sets are idempotent (setting a bit that is already set is a no-op)
  3. If the process crashes during Step 2: The event was not committed to the WAL. The caller received neither Ok(()) nor Err, because the process died before responding, and will retry or time out. No derived state was updated. No inconsistency.

Consistency Model

| Property | Guarantee |
|---|---|
| Durability | After Ok(()) is returned, the event survives any single crash. |
| Visibility (normal operation) | All derived state is updated before Ok(()) returns. Zero staleness. |
| Visibility (crash recovery) | Derived state is at most [WAL replay time] behind the WAL. Typically < 30 seconds. |
| Ordering | Within a single signal write, all derived state updates are consistent with each other. |
| Concurrent visibility | A concurrent ranking query may see the pre-update or post-update state for each individual atomic field, but never a torn state (partial f64, corrupted bitmap, etc.). |

Staleness Bound

During normal operation, there is no staleness -- derived state is updated inline. After a crash, staleness is bounded by:

max_staleness = checkpoint_age + replay_time
              = (time since last hot-tier checkpoint) + (WAL replay duration)
              <= 60s + 30s = 90s (worst case: crash just before the next checkpoint)
              ~= 0s + 0s = 0s (best case: crash immediately after a checkpoint, empty WAL tail)

The hot-tier checkpoint (stored in the entity_signal_state column family, per Signal System Section 9) captures the current state of all decay scores and windowed counters. On recovery, the checkpoint is loaded first (providing immediate query capability with slightly stale data), then the WAL tail is replayed to bring derived state fully current.


5. Implicit Signals

Some signals are generated by the database itself, not by explicit API calls from the application.

Impression Tracking

When a RETRIEVE or SEARCH query returns items, those items were shown to the user. This is an implicit impression signal.

Design decision: opt-in via query parameter.

let results = db.retrieve(Retrieve {
    profile: "for_you",
    for_user: Some("user_123"),
    track_impressions: true,  // opt-in
    ..Default::default()
})?;

Rationale for opt-in (not automatic):

  1. Performance: Tracking impressions turns every read into a batch of writes. A feed query that returns 50 items generates 50 impression signals. At 1000 queries/sec, this is 50,000 additional signal writes/sec -- 50 writes per query, a cost that must be budgeted explicitly.

  2. Semantic correctness: Not every query is an impression. A background prefetch, a cache warmup, a debug query -- these are not "the user saw these items." The application knows which queries represent real user impressions.

  3. Configurability: The application may want impressions tracked on the For You feed but not on search results. track_impressions is a per-query toggle.

When track_impressions: true:

For each item in the result set, the database generates:

Signal {
    kind: "impression",
    item: item_id,
    user: query.for_user,
    timestamp: query_time,
    weight: 1.0,
    context: Some(json!({
        "surface": query.context,
        "position": result_position,
        "profile": query.profile,
    })),
}

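These impression signals enter the same pipeline as explicit signals, except that the preference vector shift (Step 4) and the relationship weight update (Step 5) are skipped, as listed in the table below. They use Durability::Eventual to minimize I/O impact. This means impressions may be lost on power failure, which is acceptable for this low-value telemetry signal.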

Impression signal properties:

| Property | Value |
|---|---|
| Decay | Exponential, 1-day half-life |
| Windows | 1h, 24h |
| Velocity | No |
| Preference vector update | No (impressions are too noisy for preference learning) |
| Relationship update | No |
| User state update | Yes (marks item as "seen" for Filter::unseen()) |
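Generating the per-item impression signals from a result set might look like the following. The struct is a flattened stand-in for the `Signal` shown above (context fields are inlined instead of JSON), `impressions_for` is a hypothetical helper, and the 0-based `position` is an assumption since the text does not say whether positions start at 0 or 1.

```rust
/// Flattened stand-in for an impression Signal and its context.
#[derive(Debug, Clone, PartialEq)]
pub struct ImpressionSignal {
    pub kind: &'static str,
    pub item: String,
    pub user: String,
    pub timestamp_secs: i64,
    pub weight: f32,
    pub surface: String,
    pub position: usize, // assumed 0-based
    pub profile: String,
}

/// One impression per returned item, in result order.
pub fn impressions_for(
    result_items: &[&str],
    user: &str,
    surface: &str,
    profile: &str,
    query_time_secs: i64,
) -> Vec<ImpressionSignal> {
    result_items
        .iter()
        .enumerate()
        .map(|(position, item)| ImpressionSignal {
            kind: "impression",
            item: item.to_string(),
            user: user.to_string(),
            timestamp_secs: query_time_secs,
            weight: 1.0,
            surface: surface.to_string(),
            position,
            profile: profile.to_string(),
        })
        .collect()
}
```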

Session Signals

Derived from patterns of explicit signals, computed by the background materializer (not the real-time write path).

Binge session: 5+ completions with ratio > 0.8 in sequence within a 2-hour window.

Effect:
    - Update user's session_pattern field to "binge" (if not already)
    - Boost user's content_format_preference toward long-form
    - Temporarily increase exploration budget (the user is in a consumption mood)

Browse session: 10+ views with fewer than 2 completions in a 30-minute window.

Effect:
    - Update user's session_pattern field to "browsing"
    - Temporarily relax completion_rate quality gates (the user is sampling, not committing)
    - Increase diversity enforcement (the user is exploring)

Search-heavy session: 3+ searches within a 10-minute window.

Effect:
    - Update user's session_pattern field to "searching"
    - Prioritize text relevance over personalization in subsequent queries
    - Record search queries for saved-search suggestions

Session signals are written to the user entity's computed fields on a 5-minute evaluation cadence. They are not generated as individual signal events -- they are state transitions on the user entity.
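The three session rules can be sketched as a single classifier over recent events. This is a simplified illustration: it counts events inside each window without checking that binge completions are strictly "in sequence", and the precedence order (binge, then searching, then browsing) is an assumption. All type and function names are hypothetical.

```rust
#[derive(Debug, Clone, Copy)]
pub enum Event {
    Completion { at_secs: u64, ratio: f64 },
    View { at_secs: u64 },
    Search { at_secs: u64 },
}

#[derive(Debug, PartialEq)]
pub enum SessionPattern {
    Binge,
    Browsing,
    Searching,
}

pub fn classify(events: &[Event], now_secs: u64) -> Option<SessionPattern> {
    // True when `at` falls within the last `window` seconds.
    let within = |at: u64, window: u64| at <= now_secs && now_secs - at <= window;
    let (mut deep, mut views, mut completions, mut searches) = (0u32, 0u32, 0u32, 0u32);
    for e in events {
        match *e {
            Event::Completion { at_secs, ratio } => {
                if ratio > 0.8 && within(at_secs, 2 * 3600) {
                    deep += 1; // counts toward binge (2-hour window)
                }
                if within(at_secs, 30 * 60) {
                    completions += 1; // counts against browse (30-minute window)
                }
            }
            Event::View { at_secs } => {
                if within(at_secs, 30 * 60) {
                    views += 1;
                }
            }
            Event::Search { at_secs } => {
                if within(at_secs, 10 * 60) {
                    searches += 1;
                }
            }
        }
    }
    if deep >= 5 {
        Some(SessionPattern::Binge)
    } else if searches >= 3 {
        Some(SessionPattern::Searching)
    } else if views >= 10 && completions < 2 {
        Some(SessionPattern::Browsing)
    } else {
        None
    }
}
```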


6. Negative Signal Handling

Negative signals are equal citizens. Each negative signal type has a defined cascade of effects across all subsystems. This section specifies the complete cascade for each type.

Cascade Summary Table

| Signal | Preference Vector | Interaction Weight | Engagement Affinity | Item Exclusion | Creator Exclusion | User State |
|---|---|---|---|---|---|---|
| skip (< 3s) | Mild shift away | -0.02 | -0.15 | No | No | seen |
| dislike | Moderate shift away | -0.05 | -0.20 | No | No | disliked |
| hide ("not interested") | Strong shift away | -0.10 | -> 0.0 (permanent) | Permanent exclusion | No | hidden |
| not_interested (topic) | Strong shift away | -0.08 | -0.20 | No (score reduced) | No | -- |
| block (creator) | Maximum shift away | -> 0.0 | -> 0.0 (all items) | All items excluded | Permanent exclusion | blocked |
| mute | None | None | None | Feed exclusion | Feed exclusion | muted |

Skip (Dwell < 3 Seconds)

The mildest negative signal. A skip is noisy -- the user may have accidentally tapped, may have already seen the content, or may simply not be in the mood. The database treats it as weak evidence of disinterest.

Cascade:

  1. Signal ledger: Increment item's skip counter. Decay: exponential, 1-day half-life. The fast decay ensures that a few skips do not permanently damage an item's ranking.

  2. Preference vector: Shift away from item embedding.

    weight = 0.3 (mild)
    pref_new = normalize(pref - lr * 0.3 * (item_embedding - pref))
    
  3. Interaction weight (user -> creator): Decrement by 0.02.

    interaction_weight = clamp(decayed_weight - 0.02, 0.0, 1.0)
    
  4. Engagement affinity (user -> item): Decrement by 0.15.

  5. Item exclusion: None. The item is NOT excluded from future queries. It receives a lower score due to the skip signal's contribution to the ranking function, but it may still appear if other signals are strong enough.

  6. User state: Item marked as seen. Filter::unseen() will exclude it in future queries.
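The negative preference shift in step 2 can be written out as a short function. This is a minimal sketch of the formula pref_new = normalize(pref - lr * weight * (item_embedding - pref)); the names `shift_away` and `cosine` are hypothetical, and the learning rate is passed in because its value is defined elsewhere in the spec.

```rust
/// Shifts a preference vector away from an item embedding and re-normalizes.
/// `weight` is the per-signal magnitude (0.3 for skip, 0.8 for dislike, ...).
pub fn shift_away(pref: &[f32], item: &[f32], lr: f32, weight: f32) -> Vec<f32> {
    assert_eq!(pref.len(), item.len(), "embedding dimensions must match");
    let moved: Vec<f32> = pref
        .iter()
        .zip(item)
        .map(|(p, e)| p - lr * weight * (e - p))
        .collect();
    let norm = moved.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        // Degenerate case: keep the previous vector rather than divide by zero.
        return pref.to_vec();
    }
    moved.iter().map(|x| x / norm).collect()
}

/// Cosine similarity, used here only to observe the effect of a shift.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}
```

Calling `shift_away` with a larger weight moves the vector further from the item, which is how the same formula serves skip, dislike, hide, not_interested, and block.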

Dislike (Explicit Negative Vote)

Stronger than a skip. The user explicitly indicated dissatisfaction.

Cascade:

  1. Signal ledger: Increment item's dislike counter. Decay: exponential, 7-day half-life.

  2. Preference vector: Shift away from item embedding.

    weight = 0.8 (moderate)
    pref_new = normalize(pref - lr * 0.8 * (item_embedding - pref))
    
  3. Interaction weight (user -> creator): Decrement by 0.05.

  4. Engagement affinity (user -> item): Decrement by 0.20.

  5. Item exclusion: None. The item receives a penalty in ranking but is not permanently excluded. This respects the user's right to change their mind.

  6. User state: Item marked as disliked.

Hide ("Not Interested" on a Specific Item)

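A permanent hard-negative on the user-item relationship. The user has explicitly said "I never want to see this item again." No subsequent signal can reverse it; only the explicit administrative db.unhide(user_id, item_id) operation (Section 9) clears the flag.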

Cascade:

  1. Signal ledger: Set item's hide flag for this user. Decay: permanent. No windows.

  2. Preference vector: Strong shift away from item embedding.

    weight = 1.0 (strong)
    pref_new = normalize(pref - lr * 1.0 * (item_embedding - pref))
    
  3. Interaction weight (user -> creator): Decrement by 0.10.

    interaction_weight = clamp(decayed_weight - 0.10, 0.0, 1.0)
    
  4. Engagement affinity (user -> item): Set to 0.0. Create a permanent exclusion edge.

    store_edge(user, engagement_affinity, item, weight=0.0, timestamp=now)
    // The zero-weight edge serves as a permanent exclusion marker.
    // It is never pruned, unlike organic edges, which are pruned once their weight decays to the 0.001 threshold.
    
  5. Item exclusion: Permanent. The item is excluded from ALL future queries for this user, including:

    • For You feed
    • Following feed
    • Trending
    • Browse
    • Search results
    • Related content
    • Notifications

    Enforcement: The hidden flag is checked during the pre-filter phase of query execution (before scoring). It is stored in the user's exclusion bitmap, which is loaded at query start alongside the blocked set.

  6. User state: Item marked as hidden.

Correctness invariant (INV-FL-1): A hidden item NEVER reappears for that user. This is formally stated in Section 9.

Not Interested (Topic-Level Rejection)

Weaker than hide (it does not permanently exclude the specific item) but broader (it pushes the preference vector further away from the topic the item represents).

Cascade:

  1. Signal ledger: Increment item's not_interested counter. Decay: permanent.

  2. Preference vector: Strong shift away from item embedding.

    weight = 1.5 (very strong -- topic-level rejection)
    pref_new = normalize(pref - lr * 1.5 * (item_embedding - pref))
    

    The higher weight (1.5 vs. 1.0 for hide) reflects that this is a topic-level signal. The preference vector should move further from this region of the embedding space.

  3. Interaction weight (user -> creator): Decrement by 0.08.

  4. Engagement affinity (user -> item): Decrement by 0.20.

  5. Item exclusion: The specific item is NOT permanently excluded. But its score is heavily penalized by the not_interested signal in the ranking function.

  6. Topic weight decay: The item's primary category receives a temporary negative weight for this user. Items in the same category will be ranked lower for a decay period.

Block (Creator-Level Nuclear Option)

The strongest negative signal. A blocked creator's content is permanently excluded from every query for this user. This cascades through the entire relationship graph.

Cascade:

  1. Relationship creation: Create a blocked edge from user to creator.

    write_edge(user, blocked, creator, weight=1.0, now)
    
  2. Follows removal: Delete the follows edge if it exists.

    delete_edge(follows, user, creator)
    
  3. Interaction weight zeroing: Set interaction_weight to 0.0.

    store_edge(user, interaction_weight, creator, weight=0.0, now)
    
  4. Engagement affinity zeroing: For every item by this creator where the user has an engagement_affinity edge, set the weight to 0.0.

    for item in items_by_creator_with_user_affinity(user, creator):
        store_edge(user, engagement_affinity, item, weight=0.0, now)
    

    This cascade is bounded by the number of items the user has engaged with from this creator, which is typically O(tens), not O(creator_catalog_size).

  5. Preference vector: Maximum shift away from the creator's catalog embedding (not individual item embeddings -- this is a creator-level rejection).

    catalog_embedding = load_embedding(creator, "catalog")
    weight = 2.0 (maximum negative weight)
    pref_new = normalize(pref - lr * 2.0 * (catalog_embedding - pref))
    

    Using the catalog embedding (centroid of the creator's items) rather than any individual item ensures the preference vector moves away from the creator's general content area.

  6. Item exclusion: ALL items by this creator are excluded from EVERY query for this user. This is enforced at query start by loading the blocked creator set into a Roaring bitmap and excluding all items with matching creator_id during the pre-filter phase.

    This includes:

    • For You feed
    • Following feed
    • Trending
    • Browse
    • Search results (unlike mute, block excludes from search too)
    • Related content
    • Notifications

Correctness invariant (INV-FL-2): A blocked creator's items are excluded from every query, including search. This is formally stated in Section 9.

Block cascade performance budget: < 5 ms (per Relationships Section 13). The cascade visits at most O(user_engagements_with_creator) items, which is typically < 100.
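The pre-filter that enforces both hide and block exclusions can be sketched as follows. This is an illustration under stated assumptions: `Candidate`, `UserExclusions`, and `pre_filter` are hypothetical names, and `HashSet<u64>` stands in for the Roaring bitmaps the spec describes.

```rust
use std::collections::HashSet;

/// A candidate item before scoring (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Candidate {
    pub item_id: u64,
    pub creator_id: u64,
}

/// Per-user exclusion state loaded at query start.
/// HashSet is a stand-in for the durable Roaring bitmaps.
pub struct UserExclusions {
    pub hidden_items: HashSet<u64>,
    pub blocked_creators: HashSet<u64>,
}

/// Drops hidden items and every item by a blocked creator, before scoring.
pub fn pre_filter(candidates: &[Candidate], ex: &UserExclusions) -> Vec<Candidate> {
    candidates
        .iter()
        .copied()
        .filter(|c| {
            !ex.hidden_items.contains(&c.item_id)
                && !ex.blocked_creators.contains(&c.creator_id)
        })
        .collect()
}
```

Because the filter runs before scoring and is independent of the ranking profile, the same check covers feed, search, and related-content queries.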

Mute

The gentlest negative relationship. Muting a creator excludes them from algorithmic surfaces but preserves intentional access.

Cascade:

  1. Relationship creation: Create a muted edge from user to creator.

    write_edge(user, muted, creator, weight=1.0, now)
    
  2. No other cascades. Muting does NOT:

    • Remove the follows relationship
    • Change the interaction weight
    • Change the engagement affinity
    • Shift the preference vector
    • Affect cohort counters
  3. Feed exclusion: The muted creator's items are excluded from:

    • For You feed
    • Trending
    • Browse (algorithmic)
    • Notifications
    • Related/Up Next recommendations
  4. Still visible in:

    • Search results (the user may deliberately search for this creator)
    • Following feed (if the user also follows this creator -- they chose to follow, muting only suppresses algorithmic promotion)
    • Direct navigation (profile page, item page via direct URL)
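The difference between block and mute across query surfaces can be captured in one predicate. This is a sketch only: `Surface`, `CreatorRelation`, and `creator_visible` are hypothetical names, and direct navigation is left out because it is not a ranked query surface.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Surface {
    ForYou,
    Following,
    Trending,
    Browse,
    Search,
    Related,
    Notifications,
}

/// The user's relationship state toward one creator (hypothetical shape).
#[derive(Debug, Clone, Copy, Default)]
pub struct CreatorRelation {
    pub blocked: bool,
    pub muted: bool,
    pub follows: bool,
}

/// Returns true if the creator's items may pass the exclusion check on `surface`.
pub fn creator_visible(surface: Surface, rel: CreatorRelation) -> bool {
    if rel.blocked {
        return false; // block is total, including search
    }
    if !rel.muted {
        return true;
    }
    match surface {
        Surface::Search => true,           // deliberate lookup stays possible
        Surface::Following => rel.follows, // follow intent outranks mute
        _ => false,                        // algorithmic surfaces are suppressed
    }
}
```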

7. Signal Context

Every signal event carries an optional context field that enriches the feedback with attribution and analysis data. Context is stored with the raw signal event in the WAL and cold tier but is NOT used in the real-time hot-path updates.

Context Fields

| Field | Type | Signals | Purpose |
|---|---|---|---|
| source_surface | string | All | Which surface generated this engagement: "feed", "search", "related", "notification", "browse", "profile" |
| query_context | string | search_click | The search query that led to this click |
| rank_at_click | u32 | search_click, view (from feed) | Position in the result list at the time of engagement |
| dwell_ms | u64 | skip, view, completion | Milliseconds the user spent before the next action |
| referrer_item | string | view (from related/up-next) | The item that led to this engagement (for related/up-next attribution) |
| total_duration_ms | u64 | completion | Total duration of the content in milliseconds |
| completed_duration_ms | u64 | completion | How much of the content was consumed |
| platform | string | share | Where the content was shared: "twitter", "sms", "clipboard" |
| share_type | string | share | How it was shared: "link", "embed", "repost" |
| session_id | string | All | Application-provided session identifier for session analysis |

Context Storage and Retrieval

Context is stored as raw bytes (MessagePack-encoded) in the WAL record's variable-length context field. It is never parsed on the hot path. It is consumed only by:

  1. Background materializer: For offline learning (e.g., training a rank_at_click -> relevance model).
  2. Analytics queries: For understanding user behavior patterns (e.g., "what percentage of search clicks are on result #1?").
  3. Debugging: For investigating why a specific item was ranked where it was.

// Context is opaque on the hot path
pub struct SignalEvent {
    // ... fixed fields ...
    context: Option<Vec<u8>>,  // raw bytes, never parsed during signal write Steps 3-7
}

// Context is parsed only when explicitly accessed
impl SignalEvent {
    pub fn parse_context(&self) -> Result<serde_json::Value, ContextError> {
        match &self.context {
            Some(bytes) => rmp_serde::from_slice(bytes).map_err(ContextError::Decode),
            None => Ok(serde_json::Value::Null),
        }
    }
}

Why Context Is Not Hot-Path

Parsing JSON or MessagePack on every signal write would add ~500 ns - 2 us per event. With 50,000 events/sec, this is 25-100 ms of CPU per second wasted on parsing data that no real-time query ever reads. The context is write-once-read-rarely data that belongs in the cold tier, not the hot path.


8. Signal Ordering and Consistency

Timestamps

Signal events carry timestamps from the application. These are the "event time" -- when the engagement actually occurred -- not the "processing time" -- when the database received it.

pub struct Signal {
    // ...
    /// Event timestamp. If None, uses server time.
    pub timestamp: Option<DateTime<Utc>>,
}

If timestamp is None, the database uses the current server time. If provided, the database uses the application's timestamp. This allows for:

  1. Client-side timestamping: Mobile apps that buffer events and flush them in batches.
  2. Event replay: Backfilling historical events from another system.
  3. Testing: Deterministic timestamp control in integration tests.

Out-of-Order Events

Signals may arrive out of order due to network delays, client retries, batch uploads, or system migration. The database handles this correctly at every level:

Decay scores: The running score formula handles out-of-order events by pre-decaying the weight. If an event arrives with t_event < last_update_ns:

adjusted_weight = weight * exp(-lambda * (last_update_ns - t_event))
score_new = score_current + adjusted_weight
// last_update_ns is NOT updated (it already reflects a more recent time)

This is mathematically equivalent to having received the event in order. See Signal System Section 4.
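The equivalence can be checked with a small sketch. The in-order branch below assumes the usual running form score = score * exp(-lambda * dt) + weight, which this section does not restate (it lives in the Signal System spec); the `DecayScore` type is hypothetical and uses seconds as f64 rather than nanoseconds for readability.

```rust
/// Running exponentially decayed score (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct DecayScore {
    pub score: f64,
    pub last_update: f64, // seconds
    pub lambda: f64,      // decay rate per second
}

impl DecayScore {
    pub fn new(lambda: f64) -> Self {
        Self { score: 0.0, last_update: 0.0, lambda }
    }

    pub fn on_signal(&mut self, weight: f64, t_event: f64) {
        if t_event >= self.last_update {
            // In-order: decay the running score forward, then add the new weight.
            let dt = t_event - self.last_update;
            self.score = self.score * (-self.lambda * dt).exp() + weight;
            self.last_update = t_event;
        } else {
            // Out-of-order: pre-decay the weight to the current reference time.
            // last_update is deliberately left unchanged.
            let dt = self.last_update - t_event;
            self.score += weight * (-self.lambda * dt).exp();
        }
    }
}
```

Feeding the events (t=0, w=1), (t=10, w=1), (t=5, w=2) in arrival order yields the same score as feeding them sorted by time, up to floating-point rounding.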

Windowed counters: Out-of-order events that fall within the current window are attributed to the correct time bucket. Events that fall outside the current window (older than the oldest bucket) are recorded in the cold tier only -- they are no longer relevant for real-time windowed aggregation.

Preference vector: Each signal event triggers a preference update based on the current preference vector, regardless of the event's timestamp. This means that late-arriving events apply their preference shift to the vector's current state, not its historical state. This is a deliberate approximation: reconstructing the exact historical preference trajectory for every late event would require storing the full history of preference snapshots. The error from this approximation is bounded by lr * weight * late_event_count, which is negligible for typical late-arrival rates (< 1% of events).

Relationship weights: Same treatment as preference vectors. The weight update uses the current weight state, not a historical state.

Idempotency

The BLAKE3 content-addressed dedup (Section 2, Step 1) ensures that replayed or duplicated signals do not double-count. The content hash includes the signal type, item ID, user ID, and timestamp truncated to 1-second granularity. This means:

  • Exact retries (same event, same timestamp): deduplicated.
  • Client retries within the same second: deduplicated.
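  • Genuine distinct events more than 1 second apart: treated as separate events (correct).
  • Distinct events of the same type by the same user on the same item that fall into the same 1-second bucket: collapsed into one, a direct consequence of the 1-second truncation.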
  • Two different users engaging with the same item at the same second: different hashes (user_id is included). Not deduplicated (correct).
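The dedup key and its behavior in the cases above can be sketched without the hashing layer. In this illustration a derived `Hash`/`Eq` key stored in a `HashSet` stands in for the BLAKE3 hash plus bloom filter; `DedupKey` and `Deduper` are hypothetical names.

```rust
use std::collections::HashSet;

/// Fields that make up the content hash, with the timestamp truncated to seconds.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct DedupKey {
    signal_type: String,
    item_id: u64,
    user_id: u64,
    ts_secs: i64,
}

impl DedupKey {
    pub fn new(signal_type: &str, item_id: u64, user_id: u64, ts_millis: i64) -> Self {
        Self {
            signal_type: signal_type.to_string(),
            item_id,
            user_id,
            // Truncate to 1-second granularity (floor, also for negative values).
            ts_secs: ts_millis.div_euclid(1000),
        }
    }
}

/// Stand-in for the BLAKE3 + bloom filter dedup stage.
#[derive(Default)]
pub struct Deduper {
    seen: HashSet<DedupKey>,
}

impl Deduper {
    /// Returns true if the event is new and should proceed to the WAL append.
    pub fn admit(&mut self, key: DedupKey) -> bool {
        self.seen.insert(key)
    }
}
```

Note that two events 200 ms apart are deduplicated only if they land in the same truncated second; 900 ms and 1100 ms straddle a boundary and are treated as distinct.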

Causal Ordering

Within a single user session, signals should be applied in the order they occurred. The database does not enforce global causal ordering across users (that would require a distributed clock), but it does respect the following:

  1. Per-user sequential signals: If user U sends view then like for the same item, the view must be processed before the like. This is guaranteed if the application sends signals sequentially (which it should -- these are user actions that occurred in sequence). If the application sends them concurrently, the database processes them in arrival order, which may differ from event order. The running score formula handles this correctly (addition is commutative). The preference vector shift order matters slightly but the error is negligible.

  2. Cross-user independence: User A's like and User B's view on the same item have no causal relationship. They may be processed in any order.


9. Feedback Loop Correctness Properties

These are formal properties that the feedback loop must maintain. They are encoded as property tests, assertions, and crash recovery tests. Violations of these properties are bugs, not acceptable degradation.

INV-FL-1: Monotonic Negative (Hidden Item)

A hidden item NEVER reappears for that user.

Formally: If signal(hide, item_I, user_U) returns Ok(()) at time t, then for every query Q issued by user U at any time t' > t, item I does not appear in the result set of Q, unless db.unhide(user_U, item_I) has been called in the interim.

This holds across:

  • Process restarts (the hidden flag is durable in the WAL and the user state store)
  • Schema changes (hiding is orthogonal to ranking profiles)
  • Profile switches (every profile checks the exclusion bitmap)
  • Search queries (hidden items are excluded even from explicit search)

Enforcement mechanism: The hidden flag is stored in a durable per-user exclusion bitmap. The bitmap is loaded at query start (alongside blocked set) and applied as a pre-filter before candidate scoring. The flag is permanent and cannot be cleared by any signal or API call except db.unhide(user_id, item_id), which is an explicit administrative operation.

INV-FL-2: Block Totality

A blocked creator's items are excluded from every query, including search.

Formally: If signal(block, user_U, creator_C) returns Ok(()) at time t, then for every query Q issued by user U at any time t' > t, no item I where I.creator_id == C appears in the result set of Q.

This is stronger than mute (which allows search visibility). Block is a total exclusion.

Enforcement mechanism: The blocked creator set is a durable Roaring bitmap loaded at query start. All items are checked against the creator's blocked status during pre-filtering. The block cascade also zeroes all historical relationship state (interaction_weight, engagement_affinity), so even if the pre-filter were somehow bypassed, the item would receive zero ranking signal from the blocked creator relationship.

INV-FL-3: Signal Conservation

Every WAL-committed signal eventually appears in all derived state.

Formally: If signal(s) returns Ok(()) at time t, then for all t' > t + max_replay_time:

  • The item's decay score reflects s
  • The item's windowed counters include s
  • The user's preference vector has been shifted by s
  • The user-creator interaction weight has been updated by s
  • The user-item state reflects s
  • The cohort counters (if applicable) include s

max_replay_time is bounded by the WAL tail size and replay throughput:

max_replay_time = wal_tail_events / replay_throughput
                = (checkpoint_interval_sec * events_per_sec) / replay_events_per_sec
                = (60s * 10,000/s) / 100,000/s
                = 600,000 / 100,000
                = 6 seconds (typical)
                <= 60 seconds (worst case, per Signal System Section 12)

INV-FL-4: Preference Convergence

With consistent engagement patterns, the preference vector converges (does not oscillate).

Formally: If user U engages exclusively with items in embedding region R for N consecutive signals where N > 1000, then:

||pref_vector(t_N) - centroid(R)|| < epsilon

Where epsilon is bounded by lr_min * max_weight = 0.01 * 2.0 = 0.02 (the maximum single-step movement at minimum learning rate).

The convergence guarantee does NOT hold if the user has genuinely diverse interests (e.g., 50% jazz, 50% cooking). In that case, the preference vector stabilizes near the centroid of their diverse interests, which is correct behavior -- the ANN retrieval from that centroid captures both interests.
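A toy simulation illustrates the convergence claim. It assumes the positive update mirrors the negative one with the sign flipped, pref_new = normalize(pref + lr * weight * (item - pref)), which is defined in Section 3 rather than here, and it collapses region R to a single target embedding. All function names are hypothetical.

```rust
/// One positive preference update toward `item`, followed by re-normalization.
pub fn shift_toward(pref: &[f32], item: &[f32], lr: f32, weight: f32) -> Vec<f32> {
    let moved: Vec<f32> = pref
        .iter()
        .zip(item)
        .map(|(p, e)| p + lr * weight * (e - p))
        .collect();
    let norm = moved.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return pref.to_vec();
    }
    moved.iter().map(|x| x / norm).collect()
}

/// Euclidean distance between two vectors of equal length.
pub fn distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Applies `steps` consecutive positive signals on the same target embedding.
pub fn simulate(start: &[f32], target: &[f32], lr: f32, steps: usize) -> Vec<f32> {
    let mut pref = start.to_vec();
    for _ in 0..steps {
        pref = shift_toward(&pref, target, lr, 1.0);
    }
    pref
}
```

Starting orthogonal to the target with lr = 0.01, the distance after 1,001 steps falls well under the 0.02 bound, and it shrinks monotonically rather than oscillating.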

INV-FL-5: Staleness Bound

Derived state is at most [checkpoint_interval + replay_time] behind the WAL.

Formally: For any WAL event e committed at time t:

  • During normal operation: derived state reflects e at time t (zero staleness)
  • After crash recovery: derived state reflects e by time t + checkpoint_interval + replay_time

With default configuration:

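max_staleness_after_crash = 60s + 30s = 90s (worst case)
typical_staleness_after_crash = 30s + 6s = 36s (typical)

The 30 s worst-case replay term is what the INV-FL-3 formula yields for a 60 s WAL tail written at the 50,000 events/sec single-writer target (3,000,000 / 100,000). The 6 s typical term reuses the INV-FL-3 estimate, which assumes a 60 s tail at 10,000 events/sec; with a 30 s checkpoint interval the same formula gives 3 s, so 36 s is a conservative typical figure. INV-FL-3 also cites a looser 60 s worst-case replay bound from Signal System Section 12; under that bound the worst case here would be 120 s.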
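The replay-time formula from INV-FL-3 and the staleness bound from INV-FL-5 can be combined into two small helpers for checking a configuration. This is a sketch; the function names are hypothetical, and the inputs are the figures quoted in this section.

```rust
/// INV-FL-3: time to replay the WAL tail accumulated since the last checkpoint.
pub fn replay_time_secs(
    checkpoint_interval_secs: f64,
    events_per_sec: f64,
    replay_events_per_sec: f64,
) -> f64 {
    (checkpoint_interval_secs * events_per_sec) / replay_events_per_sec
}

/// INV-FL-5: post-crash staleness is bounded by checkpoint interval plus replay time.
pub fn staleness_after_crash_secs(
    checkpoint_interval_secs: f64,
    events_per_sec: f64,
    replay_events_per_sec: f64,
) -> f64 {
    checkpoint_interval_secs
        + replay_time_secs(checkpoint_interval_secs, events_per_sec, replay_events_per_sec)
}
```

With a 60 s interval, 10,000 events/sec, and 100,000 events/sec replay, `replay_time_secs` gives 6 s; at 50,000 events/sec it gives 30 s, for a 90 s staleness bound.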

INV-FL-6: Deduplication Idempotency

Writing the same signal event twice produces the same state as writing it once.

Formally: state(write(s) ; write(s)) == state(write(s)) for all signal events s.

This is guaranteed by the BLAKE3 content-addressed dedup mechanism. The second write is detected as a duplicate and silently returns Ok(()) without updating any derived state.

INV-FL-7: Weight Bounds

All relationship weights are in [0.0, 1.0] after every update.

Formally: For all entities A, B and all relationship kinds K:

0.0 <= weight(A, K, B) <= 1.0

This holds regardless of the signal sequence. The clamp in the weight update formula (clamp(decayed + delta, 0.0, 1.0)) ensures that no sequence of positive signals can push a weight above 1.0 and no sequence of negative signals can push it below 0.0.
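A minimal sketch of the clamped update, using the interaction-weight deltas from the cascade table in Section 6. The `NegativeSignal` enum and `update_weight` function are hypothetical names; the decay applied before the delta is assumed to have happened already and is passed in as `decayed`.

```rust
/// Negative signals that decrement the user -> creator interaction weight.
#[derive(Debug, Clone, Copy)]
pub enum NegativeSignal {
    Skip,
    Dislike,
    Hide,
    NotInterested,
}

impl NegativeSignal {
    /// Interaction-weight delta per the Section 6 cascade table.
    pub fn delta(self) -> f32 {
        match self {
            NegativeSignal::Skip => -0.02,
            NegativeSignal::Dislike => -0.05,
            NegativeSignal::Hide => -0.10,
            NegativeSignal::NotInterested => -0.08,
        }
    }
}

/// clamp(decayed + delta, 0.0, 1.0): the result always stays in [0.0, 1.0].
pub fn update_weight(decayed: f32, delta: f32) -> f32 {
    (decayed + delta).clamp(0.0, 1.0)
}
```

Repeated application cannot escape the interval: a long run of skips bottoms out at 0.0, and a long run of positive deltas tops out at 1.0.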

INV-FL-8: Mute Visibility Semantics

A muted creator's items are excluded from algorithmic feeds but visible in search and Following feed.

Formally: If mute(user_U, creator_C) is active:

  • RETRIEVE with profile "for_you", "trending", "browse", "related": items by C are excluded
  • RETRIEVE with profile "following" where user follows C: items by C are included
  • SEARCH: items by C are included in search results

10. Performance Targets

These are the latency and throughput targets for the complete feedback loop pipeline. Regressions against these numbers are treated as bugs.

Signal Write Latency (End-to-End)

| Metric | Target | Notes |
|---|---|---|
| p50 | < 100 us | Dominated by batched fsync amortization |
| p99 | < 500 us | Occasional fsync flush or cohort attribution for tracked items |
| p999 | < 2 ms | Block cascade (rare) |

Per-Step Performance Budget

Total budget: 100 us (p50)

Step 1: Deduplication         5 us    (BLAKE3 hash + bloom filter lookup)
Step 2: WAL append           50 us    (batched fsync amortized cost)
Step 3: Signal ledger update  1 us    (3 CAS + 2 atomic add)
Step 4: Preference vector    10 us    (1536D vector arithmetic)
Step 5: Relationship update   5 us    (2 point reads + 2 point writes)
Step 6: Cohort attribution   20 us    (bitmap lookups + counter increments)
Step 7: User state update     5 us    (bitmap set)
Overhead (bookkeeping)        4 us
                            ------
Total                       100 us
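The budget can be kept honest in code by encoding it as data and checking measurements against it. This is a sketch; the constant and function names are hypothetical, and the numbers are copied from the breakdown above.

```rust
/// Per-step p50 budget in microseconds, copied from the breakdown above.
pub const STEP_BUDGET_US: [(&str, u32); 8] = [
    ("dedup", 5),
    ("wal_append", 50),
    ("signal_ledger", 1),
    ("preference_vector", 10),
    ("relationship", 5),
    ("cohort_attribution", 20),
    ("user_state", 5),
    ("overhead", 4),
];

pub const TOTAL_BUDGET_US: u32 = 100;

/// Sum of all per-step budgets.
pub fn budget_sum_us() -> u32 {
    STEP_BUDGET_US.iter().map(|(_, us)| us).sum()
}

/// Returns the names of measured steps that exceed their budget.
pub fn over_budget(measured: &[(&'static str, u32)]) -> Vec<&'static str> {
    measured
        .iter()
        .filter(|(name, us)| STEP_BUDGET_US.iter().any(|(n, b)| n == name && us > b))
        .map(|(name, _)| *name)
        .collect()
}
```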

Per-Step Detailed Targets

| Step | Operation | Target | Measurement |
|---|---|---|---|
| 1 | BLAKE3 hash (32 bytes input) | < 100 ns | Single hash computation |
| 1 | Bloom filter check (miss) | < 100 ns | Single bit probe |
| 1 | Bloom filter check (hit) + disk lookup | < 50 us | Hash set point read |
| 2 | WAL append (batched fsync) | < 50 us p50 | Batch flush amortized |
| 2 | WAL append (immediate fsync) | < 1 ms | Single fsync |
| 3 | Decay score CAS (per lambda) | < 15 ns | 1 exp() + 1 CAS |
| 3 | Decay score update (3 lambdas) | < 50 ns | 3 CAS operations |
| 3 | Minute bucket increment | < 5 ns | 1 atomic add |
| 4 | Preference vector shift (1536D) | < 10 us | Vector sub + scale + add + normalize |
| 4 | HNSW incremental re-insertion | < 100 us | Amortized, batched in background |
| 5 | Interaction weight update | < 5 us | 1 read + 1 write |
| 5 | Engagement affinity update | < 5 us | 1 read + 1 write |
| 6 | Cohort membership lookup | < 100 ns | Cached in user's hot-tier state |
| 6 | Cohort counter increments (12 counters) | < 20 us | 12 atomic adds |
| 7 | User state bitmap set | < 5 us | 1 bitmap operation |

Throughput Targets

| Metric | Target | Configuration |
|---|---|---|
| Sustained signal write throughput (single writer) | > 50,000 events/sec | Batched durability |
| Sustained signal write throughput (4 writers) | > 150,000 events/sec | Batched durability |
| WAL replay throughput | > 100,000 events/sec | Sequential replay |
| Block cascade throughput | > 200 cascades/sec | 20 engaged items per cascade |

Benchmark Suite

These targets must be validated with criterion benchmarks from the first implementation:

// benches/feedback_loop.rs

// End-to-end signal write benchmarks
bench_signal_write_like()                           // target: < 100 us p50
bench_signal_write_view()                           // target: < 100 us p50
bench_signal_write_completion()                     // target: < 100 us p50
bench_signal_write_skip()                           // target: < 100 us p50
bench_signal_write_hide_cascade()                   // target: < 500 us p50
bench_signal_write_block_cascade(20_items)           // target: < 2 ms p50

// Per-step benchmarks
bench_dedup_blake3_hash()                           // target: < 100 ns
bench_dedup_bloom_filter_miss()                     // target: < 100 ns
bench_wal_append_batched()                          // target: < 50 us p50
bench_decay_score_update_3_lambdas()                // target: < 50 ns
bench_preference_vector_shift_1536d()               // target: < 10 us
bench_relationship_weight_update()                  // target: < 5 us
bench_cohort_attribution_12_counters()              // target: < 20 us
bench_user_state_bitmap_set()                       // target: < 5 us

// Throughput benchmarks
bench_sustained_signal_throughput_1_writer()         // target: > 50K/sec
bench_sustained_signal_throughput_4_writers()         // target: > 150K/sec
bench_wal_replay_throughput()                        // target: > 100K/sec

// Feedback loop latency benchmark (write + immediate read)
bench_signal_then_query_latency()                   // target: < 200 us total

11. Integration Points

Integration with Signal System (Spec 03)

The feedback loop is the write-side consumer of the signal system. Every signal event flows through the signal ingestion pipeline (Section 2), which invokes the signal system for:

  • Step 3: HotSignalState::on_signal() and warm-tier bucket increments (Signal System Sections 3, 4)
  • Step 6: Cohort-scoped counter increments (Signal System Section 7)

The feedback loop also triggers the signal system's background materializer (Signal System Section 9) by producing events that need to be:

  • Rolled up into hourly and daily aggregates
  • Evaluated for cohort tracking activation thresholds
  • Checkpointed to the entity_signal_state column family

Feedback Loop (real-time)          Signal System (background)
         |                                    |
    db.signal()                      Materializer thread
         |                                    |
    Steps 1-7                        Bucket rotation (1 min)
         |                           Rollup generation (1 hr)
    WAL event ─────────────────────> WAL replay on crash
         |                           Checkpoint (30-60s)
    Hot/warm tier updates            Hot tier -> cold tier eviction

Integration with Entity Model (Spec 02)

The feedback loop reads and writes to entity state at multiple points:

| Step | Entity Read | Entity Write |
|---|---|---|
| Step 4 | User preference vector, Item content embedding | User preference vector (updated) |
| Step 5 | Item's creator_id (to resolve user -> creator edge) | -- |
| Step 6 | User's cached cohort memberships | -- |
| Step 7 | -- | User-item state bitmap |

The feedback loop also triggers updates to database-computed fields on the User entity:

| Computed Field | Update Trigger | Latency |
|---|---|---|
| platform_tenure_days | Every signal write (trivial: now - first_signal_at) | < 1 us |
| inferred_interests | Incremental update on positive signals | < 100 us |
| followed_creator_count | On follow/unfollow (not signal write) | < 1 us |

Other computed fields (engagement_level, session_pattern, content_format_preference) are updated by the background materializer on a scheduled cadence, not inline during signal writes.

Integration with Relationships (Spec 04)

The feedback loop is the primary source of implicit relationship updates:

| Relationship | Created By | Updated By |
|---|---|---|
| interaction_weight (user -> creator) | First signal involving user + creator's item | Every subsequent signal (Step 5) |
| engagement_affinity (user -> item) | First signal involving user + item | Every subsequent signal (Step 5) |
| blocked (user -> creator) | Block signal cascade (Section 6) | Never (permanent) |
| hidden (user -> item state) | Hide signal (Step 7) | Never (permanent) |

The feedback loop also triggers the block cascade defined in Relationships Section 8, which is the most expensive operation in the entire write path (up to 5 ms).

Integration with Query Engine

The feedback loop's output is consumed by the query engine at every stage:

Query Execution Pipeline
========================

1. Parse query
2. Load user state (reads feedback loop output)
   - blocked set       <-- from Step 5 (block cascade)
   - muted set         <-- from explicit mute relationship write
   - follows set       <-- from explicit follows relationship write
   - hidden set        <-- from Step 7 (hide signal)
   - preference vector <-- from Step 4 (preference shift)

3. Generate candidates
   - ANN retrieval uses preference vector <-- Step 4 output
   - Following feed uses follows set

4. Pre-filter candidates
   - Remove blocked creators    <-- Step 5 output
   - Remove muted creators      <-- explicit relationship (algorithmic surfaces only; search and Following keep them, see INV-FL-8)
   - Remove hidden items        <-- Step 7 output
   - Apply unseen filter        <-- Step 7 output (seen bitmap)

5. Score candidates
   - Decay scores               <-- Step 3 output
   - Windowed aggregates        <-- Step 3 output
   - Interaction weight boost   <-- Step 5 output
   - Cohort velocity            <-- Step 6 output

6. Diversity pass
7. Return results
   - If track_impressions: true, generate implicit impression signals
     (feeds back into the feedback loop)

Feedback Loop Diagram (Complete Cycle)

                            ┌─────────────────────────────────────────┐
                            │           FEEDBACK LOOP                 │
                            │                                         │
    User sees item          │  ┌───────────┐                         │
    in feed ────────────────┼──│  QUERY    │  (reads all derived     │
         │                  │  │  ENGINE   │   state from the loop)  │
         │                  │  └───────────┘                         │
         │                  │        ▲                                │
         ▼                  │        │ reads                         │
    User engages            │        │                                │
    (view/like/skip/        │  ┌─────┴─────────────────────────┐     │
     hide/block)            │  │     DERIVED STATE              │     │
         │                  │  │                                │     │
         ▼                  │  │  Decay scores (Hot tier)       │     │
    db.signal() ────────────┼──│  Windowed counters (Warm tier) │     │
         │                  │  │  Preference vector (Entity)    │     │
         ├── Step 1: Dedup  │  │  Interaction weights (Rel)     │     │
         ├── Step 2: WAL    │  │  Cohort counters (Cohort CF)   │     │
         ├── Step 3: Ledger─┼──│  User state (State bitmap)     │     │
         ├── Step 4: Pref ──┼──│                                │     │
         ├── Step 5: Rel ───┼──│  All updated atomically        │     │
         ├── Step 6: Cohort─┼──│  within the signal write       │     │
         └── Step 7: State──┼──│                                │     │
                            │  └────────────────────────────────┘     │
                            │                                         │
                            │  Next query (even 100ms later)          │
                            │  reflects ALL updated state             │
                            └─────────────────────────────────────────┘

12. Property Tests

The following properties must be verified with proptest. These cover the feedback loop's correctness invariants across arbitrary signal sequences.

P1: Hidden Items Never Reappear

proptest! {
    #[test]
    fn hidden_item_never_in_results(
        signals in prop::collection::vec(arb_signal_event(), 1..500),
        hide_index in 0usize..500,
    ) {
        let db = setup_test_db();
        let user = create_test_user(&db);

        // Write some signals, hide an item at some point in the sequence
        let mut hidden_item = None;
        for (i, signal) in signals.iter().enumerate() {
            db.signal(signal)?;
            if i == hide_index.min(signals.len() - 1) {
                let item = signal.item;
                db.signal(Signal { kind: "hide", item, user, ..Default::default() })?;
                hidden_item = Some(item);
            }
        }

        // Query with every profile -- hidden item must never appear
        if let Some(hidden) = hidden_item {
            for profile in ["for_you", "trending", "following", "search", "related"] {
                let results = db.retrieve(Retrieve {
                    for_user: Some(user),
                    profile,
                    ..Default::default()
                })?;
                prop_assert!(
                    !results.results.iter().any(|r| r.id == hidden),
                    "Hidden item {} appeared in {} results", hidden, profile
                );
            }
        }
    }
}

P2: Block Cascade Completeness

proptest! {
    #[test]
    fn block_excludes_all_creator_items(
        item_count in 1usize..50,
        signals_per_item in 1usize..10,
    ) {
        let db = setup_test_db();
        let user = create_test_user(&db);
        let creator = create_test_creator(&db);
        let items: Vec<_> = (0..item_count)
            .map(|i| create_test_item(&db, creator, i))
            .collect();

        // Engage with all items
        for item in &items {
            for _ in 0..signals_per_item {
                db.signal(Signal { kind: "view", item: *item, user, ..Default::default() })?;
            }
        }

        // Block the creator
        db.signal(Signal { kind: "block", user, target_creator: creator, ..Default::default() })?;

        // No item by this creator should appear under any profile
        for profile in ["for_you", "trending", "following", "search", "related"] {
            let results = db.retrieve(Retrieve {
                for_user: Some(user),
                profile,
                limit: 1000,
                ..Default::default()
            })?;

            for item in &items {
                prop_assert!(
                    !results.results.iter().any(|r| r.id == *item),
                    "Blocked creator's item {} appeared in {} results", item, profile
                );
            }
        }

        // Interaction weight should be zero
        let weight = db.get_relationship_weight(user, "interaction_weight", creator)?;
        prop_assert_eq!(weight, Some(0.0));
    }
}

P3: Preference Vector Remains Unit Length

proptest! {
    #[test]
    fn preference_vector_stays_normalized(
        signals in prop::collection::vec(arb_signal_with_polarity(), 1..1000),
    ) {
        let db = setup_test_db();
        let user = create_test_user(&db);

        for signal in &signals {
            db.signal(signal)?;
        }

        let pref = db.get_embedding(user, "preference")?;
        let norm: f32 = pref.iter().map(|x| x * x).sum::<f32>().sqrt();

        // Unit length within floating-point tolerance
        prop_assert!(
            (norm - 1.0).abs() < 1e-5,
            "Preference vector norm = {}, expected ~1.0", norm
        );
    }
}
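
As a minimal sketch of why P3 can hold, assume each update nudges the preference vector toward or away from the item embedding and then rescales it to unit L2 norm. The names `normalize` and `update_preference`, the `polarity` and `learning_rate` parameters, and the update rule itself are illustrative assumptions, not the database's actual implementation:

```rust
/// Scale `v` to unit L2 norm in place; an all-zero vector is left untouched.
fn normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// One hypothetical update step: move `pref` toward the item embedding
/// (polarity = +1.0) or away from it (polarity = -1.0), then renormalize.
fn update_preference(pref: &mut [f32], item: &[f32], polarity: f32, learning_rate: f32) {
    for (p, i) in pref.iter_mut().zip(item) {
        let delta = polarity * learning_rate * (*i - *p);
        *p += delta;
    }
    // Renormalizing after every step is what keeps the invariant local:
    // no sequence of updates can accumulate drift in the norm.
    normalize(pref);
}
```

Because normalization is the last operation of every step, the property does not depend on how many signals were applied or on their polarity; the only degenerate case is an update that lands exactly on the zero vector.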

P4: Relationship Weights Stay Bounded

proptest! {
    #[test]
    fn relationship_weights_in_bounds(
        signals in prop::collection::vec(arb_signal_event(), 1..1000),
    ) {
        let db = setup_test_db();
        let user = create_test_user(&db);

        for signal in &signals {
            db.signal(signal)?;
        }

        // Check all interaction_weight edges
        let edges = db.scan_relationships(user, "interaction_weight")?;
        for edge in &edges {
            prop_assert!(
                edge.weight >= 0.0 && edge.weight <= 1.0,
                "interaction_weight {} out of bounds [0, 1]", edge.weight
            );
        }

        // Check all engagement_affinity edges
        let edges = db.scan_relationships(user, "engagement_affinity")?;
        for edge in &edges {
            prop_assert!(
                edge.weight >= 0.0 && edge.weight <= 1.0,
                "engagement_affinity {} out of bounds [0, 1]", edge.weight
            );
        }
    }
}
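
One way to make P4 hold by construction is to use update rules whose output range is already [0, 1]. The sketch below is illustrative only: `reinforce`, `attenuate`, and the `rate` parameter are hypothetical names, and the real weight update mechanics are defined in the Relationships Specification.

```rust
/// Hypothetical saturating reinforcement: each positive interaction closes a
/// fraction `rate` (assumed to be in [0, 1]) of the remaining gap to 1.0.
fn reinforce(weight: f32, rate: f32) -> f32 {
    // The clamp guards against floating-point overshoot at the boundary.
    (weight + rate * (1.0 - weight)).clamp(0.0, 1.0)
}

/// Hypothetical multiplicative attenuation for a negative interaction:
/// the weight shrinks toward 0.0 but never crosses it.
fn attenuate(weight: f32, rate: f32) -> f32 {
    (weight * (1.0 - rate)).clamp(0.0, 1.0)
}
```

With rules of this shape, no interleaving of positive and negative signals can push a weight outside the interval, which is the behavior the property test checks from the outside.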

P5: WAL Replay Produces Identical Derived State

proptest! {
    #[test]
    fn wal_replay_consistency(
        signals in prop::collection::vec(arb_signal_event(), 1..500),
        crash_point in 0usize..500,
    ) {
        // Execute all signals without crash
        let db1 = setup_test_db();
        for signal in &signals {
            db1.signal(signal)?;
        }
        let expected_state = snapshot_all_derived_state(&db1);

        // Execute up to crash_point, then "crash" and replay
        let db2 = setup_test_db();
        for signal in signals.iter().take(crash_point.min(signals.len())) {
            db2.signal(signal)?;
        }
        simulate_crash(&db2);
        let db2_recovered = recover_from_wal(&db2);

        // Replay remaining signals
        for signal in signals.iter().skip(crash_point.min(signals.len())) {
            db2_recovered.signal(signal)?;
        }
        let recovered_state = snapshot_all_derived_state(&db2_recovered);

        // States must match
        assert_derived_state_equal(&expected_state, &recovered_state);
    }
}
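
P5 follows naturally if derived state is a deterministic fold over the WAL. The following sketch illustrates that idea with a toy model; `Event`, `DerivedState`, and `replay` are hypothetical stand-ins, and a plain per-item sum takes the place of the real derived state (decay scores, preference vectors, relationship weights):

```rust
use std::collections::HashMap;

/// A toy WAL record: which item was touched and by how much.
#[derive(Clone, Debug, PartialEq)]
struct Event {
    item: u64,
    weight: f64,
}

/// Toy derived state: one running score per item.
#[derive(Debug, Default, PartialEq)]
struct DerivedState {
    scores: HashMap<u64, f64>,
}

impl DerivedState {
    /// Apply one event. The result depends only on the current state and
    /// the event, never on wall-clock time or other external input.
    fn apply(&mut self, e: &Event) {
        *self.scores.entry(e.item).or_insert(0.0) += e.weight;
    }
}

/// Rebuild derived state from scratch by folding over the log in order.
fn replay(wal: &[Event]) -> DerivedState {
    let mut state = DerivedState::default();
    for e in wal {
        state.apply(e);
    }
    state
}
```

Since `apply` is deterministic and events are applied in log order, recovering from any prefix of the log and then applying the remaining events yields the same state as an uninterrupted run, including bit-identical floating-point sums.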

P6: Dedup Prevents Double-Counting

proptest! {
    #[test]
    fn duplicate_signal_idempotent(
        signal in arb_signal_event(),
        repeat_count in 2usize..10,
    ) {
        let db = setup_test_db();

        // Write the signal once
        db.signal(&signal)?;
        let state_once = snapshot_entity_signal_state(&db, signal.item);

        // Write the same signal multiple times
        for _ in 1..repeat_count {
            db.signal(&signal)?;
        }
        let state_many = snapshot_entity_signal_state(&db, signal.item);

        // States must be identical
        prop_assert_eq!(state_once, state_many,
            "Signal written {} times produced different state than written once",
            repeat_count
        );
    }
}
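
The idempotence that P6 checks can be sketched as a seen-set guard in front of the state update. This is an assumption-laden illustration: `Ledger`, the `event_id` dedup key, and the per-item counter are hypothetical, and the actual dedup key and retention window are defined elsewhere in the spec.

```rust
use std::collections::{HashMap, HashSet};

/// Toy signal ledger with duplicate suppression.
struct Ledger {
    seen: HashSet<u64>,        // event IDs already applied
    counts: HashMap<u64, u64>, // item -> number of distinct signals
}

impl Ledger {
    fn new() -> Self {
        Ledger { seen: HashSet::new(), counts: HashMap::new() }
    }

    /// Apply a signal once. Returns `true` if it changed state and `false`
    /// if it was dropped as a duplicate of an earlier event.
    fn signal(&mut self, event_id: u64, item: u64) -> bool {
        // `HashSet::insert` returns false when the value was already present.
        if !self.seen.insert(event_id) {
            return false;
        }
        *self.counts.entry(item).or_insert(0) += 1;
        true
    }
}
```

Writing the same event any number of times leaves `counts` exactly as it was after the first write, which is the state equality the property test asserts.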

P7: Signal Conservation After Crash

proptest! {
    #[test]
    fn all_committed_signals_survive_crash(
        signals in prop::collection::vec(arb_signal_event(), 1..200),
    ) {
        let db = setup_test_db();
        let mut committed = Vec::new();

        for signal in &signals {
            if db.signal(signal).is_ok() {
                committed.push(signal.clone());
            }
        }

        simulate_crash(&db);
        let recovered = recover_from_wal(&db);

        // Every committed signal must be reflected in the recovered state
        for signal in &committed {
            let decay_score = recovered.get_decay_score(signal.item, signal.kind)?;
            prop_assert!(
                decay_score > 0.0,
                "Committed signal {:?} not reflected in recovered state", signal
            );
        }
    }
}

Appendix A: Glossary

| Term | Definition |
|------|------------|
| Feedback Loop | The closed cycle where engagement events update ranking state, which influences what users see next, which generates new engagement events |
| Signal Ingestion Pipeline | The 7-step process from API call to durable derived state |
| Preference Vector | A database-managed embedding per user that evolves with every signal, representing the user's taste profile |
| Learning Rate | The magnitude of preference vector updates; decays as the user matures |
| Momentum (EWMA) | Exponentially weighted smoothing applied to preference vector updates to prevent oscillation |
| Cascade | The set of derived state updates triggered by a signal, particularly for negative signals like block and hide |
| Consistency Boundary | The WAL append step; after this point, the event is durable and all derived state can be reconstructed |
| Staleness Bound | The maximum time between a WAL-committed event and its appearance in all derived state |
| Implicit Signal | A signal generated by the database itself (e.g., impressions from query results) rather than by explicit API call |
| Cohort Attribution | The process of resolving which cohorts a user belongs to and incrementing dimensional counters |
| Block Cascade | The full set of relationship mutations triggered by blocking a creator: follows deletion, weight zeroing, engagement affinity zeroing |
| Cold Start | The state of a new user or item with no signal history; handled by population/cohort centroids and exploration budgets |

Appendix B: References

  1. Robbins, H., Monro, S. "A Stochastic Approximation Method." Annals of Mathematical Statistics, 1951. (Convergence conditions for the preference vector update rule)
  2. Cormode, G., et al. "Forward Decay: A Practical Time Decay Model for Streaming Systems." ICDE 2009. (Running decay score exactness proof)
  3. VISION.md, Section "The Feedback Loop" and "Design Principles" (Architectural requirements)
  4. thoughts.md, Part IV "Stage 4: Closed-Loop Systems" (Theoretical motivation)
  5. Signal System Specification, Section 8 "Signal Write Path" (Pipeline foundation)
  6. Relationships Specification, Section 8 "Weight Update Mechanics" (Cascade definitions)
  7. Entity Model Specification, Section "Embedding Management" (Preference vector storage)