---
title: "Ranking profiles are data, not code"
date: "2026-02-21"
author: "Jordan Washburn"
description: "Changing how content is ranked should not require a code change, a deployment, or a restart. tidalDB treats ranking profiles as versioned schema declarations. Define a profile. Name it. Swap it at query time. A/B test two profiles by name. The database executes the entire pipeline."
tags: ["ranking", "architecture", "profiles", "query"]
---

There is a microservice in your stack called something like `ranking-service` or `feed-ranker` or `content-scorer`. It has 4,000 lines of Python. It imports numpy. It calls three external systems. It contains the function that determines what every user on your platform sees.

Changing that function requires a pull request, a code review, a CI pipeline, a deployment, a canary rollout, and a prayer. Testing two ranking strategies simultaneously requires a feature flag system, a traffic splitter, a metrics pipeline that can segment by experiment cohort, and an engineer who understands all four of those systems well enough to wire them together without introducing a bug that serves the wrong content to the wrong users for three hours on a Saturday.

This is the state of the art.

This post is about a different model: ranking profiles as versioned, named, declarative data structures that live in the database, not in application code.

## The problem

Ranking logic has a deployment problem. Specifically: it is coupled to the deployment lifecycle of whatever service executes it.

When your ranking formula is a function in a microservice, every change to that formula -- adjusting a weight, adding a signal, changing the gravity constant in your hot sort -- follows the same path as shipping a new feature. Write the code. Test it locally against a mock dataset that does not resemble production. Deploy it. Hope.

This coupling creates three failure modes that every team building ranked content encounters.

**The iteration tax.** Your product team wants to test whether boosting share velocity by 1.5x instead of 2x improves user retention. This is a one-line change to a weight constant. It requires a deployment. The deployment blocks on CI, which blocks on integration tests, which block on mocking out Elasticsearch and Redis. The one-line change ships in two days. The product team wanted an answer in two hours.

**The A/B problem.** You want to run two ranking strategies simultaneously and measure which one produces better engagement. Strategy A is the current production formula. Strategy B doubles the weight on completion signals and adds a recency decay. Both strategies live in the same function with a branching conditional on experiment cohort. The function now has two code paths, both critical, neither cleanly testable in isolation. When you ship Strategy C a month later, the function has three code paths. By Q3 it has six. Nobody refactors it because nobody is confident they understand all six.

**The consistency gap.** The ranking function in your feed service is not the same as the ranking function in your notification service. Or your search results ranker. Or your "related content" sidebar. Each service has its own copy of the ranking logic, diverged over time, maintained by different teams, tested to different standards. The user sees inconsistent ranking across surfaces. Nobody knows which version is authoritative.

All three failures share a root cause: ranking logic is deployed as code in a service, when it should be declared as data in the database.

## What we considered

Three approaches.

**Hardcoded logic.** The current industry default. The ranking formula is a function in a service. It reads signals from Redis, metadata from Elasticsearch, preferences from a feature store, and computes a score. Changing the formula means changing code. This is what produces the three failure modes above. It is also the only approach that lets you write arbitrary logic -- and arbitrary logic is what teams reach for first, even when they do not need it.

**Configuration files.** Move the weights and thresholds into a YAML or JSON config. The service reads the config at startup or watches it for changes. This decouples weight tuning from code deployments. But it does not solve the A/B problem (which config is active for which users?), it does not solve the consistency gap (each service has its own config), and it introduces a new failure mode: the config schema diverges from the code that interprets it. The config says `boost_share_velocity: 2.0`. The code reads `share_velocity_boost`. Nobody notices until the weight is silently ignored in production.

**Schema-level declarations.** The ranking profile is a named, versioned, validated data structure that lives in the database alongside the data it operates on. The database understands the structure -- it knows what a boost is, what a gate is, what a sort mode is. Registration validates the profile against the schema. Execution is handled by the database's query pipeline, not by application code. The application says "use this profile." The database does the rest.

We chose the third option.

## What we chose

A `RankingProfile` in tidalDB is a struct. It has a name, a version, and a set of declarative fields that control every stage of the scoring pipeline:

```rust
// From tidal/src/ranking/profile.rs

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct RankingProfile {
    pub name: String,
    pub version: u32,
    pub candidate_strategy: CandidateStrategy,
    pub boosts: Vec<Boost>,
    pub decay: Option<ProfileDecay>,
    pub gates: Vec<Gate>,
    pub penalties: Vec<Penalty>,
    pub excludes: Vec<Exclude>,
    pub diversity: DiversitySpec,
    pub exploration: f64,
    pub sort: Option<Sort>,
    pub is_builtin: bool,
}
```

Each field controls a specific stage of the ranking pipeline. `candidate_strategy` determines how candidates are sourced -- scan the entire universe, query a vector index, walk the relationship graph. `sort` determines the primary scoring formula -- trending velocity, hot decay, chronological, controversial. `boosts` add weighted signal values to the base score. `gates` set minimum thresholds that filter candidates out. `penalties` subtract signal values. `diversity` sets per-creator and format-mix constraints on the result set. `exploration` injects a fraction of random candidates for discovery.

The profile is `Serialize + Deserialize`. It can be stored, transmitted, versioned, and diffed. It is data.

## What a profile looks like

Here is the built-in `trending` profile. It ranks content by signal velocity -- how fast views and shares are accumulating over the last 24 hours:

```rust
// From tidal/src/ranking/builtins.rs

fn trending() -> RankingProfile {
    let mut p = skeleton("trending");
    p.sort = Some(Sort::Trending);
    p.boosts = vec![
        Boost {
            signal: "share".into(),
            agg: SignalAgg::Velocity,
            window: Window::TwentyFourHours,
            weight: TRENDING_SHARE_WEIGHT, // 2.0
        },
        Boost {
            signal: "view".into(),
            agg: SignalAgg::Velocity,
            window: Window::TwentyFourHours,
            weight: 1.0,
        },
    ];
    p.diversity = DiversitySpec {
        max_per_creator: Some(TRENDING_MAX_PER_CREATOR), // 1
        ..DiversitySpec::default()
    };
    p
}
```

No personalization. No user context. Pure velocity. Shares are weighted 2x because sharing is a stronger signal of "this is worth seeing" than a passive view. Diversity caps results to one item per creator -- trending should show what is happening across the platform, not five videos from the same creator who had a viral day.

Now here is the built-in `for_you` profile. Same struct, entirely different behavior:

```rust
// From tidal/src/ranking/builtins.rs

fn for_you() -> RankingProfile {
    let mut p = skeleton("for_you");
    p.sort = Some(Sort::Hot { gravity: 1.5 });
    p.boosts = vec![
        Boost {
            signal: "view".into(),
            agg: SignalAgg::DecayScore,
            window: Window::AllTime,
            weight: 1.0,
        },
        Boost {
            signal: "like".into(),
            agg: SignalAgg::DecayScore,
            window: Window::AllTime,
            weight: 2.0,
        },
        Boost {
            signal: "share".into(),
            agg: SignalAgg::Velocity,
            window: Window::TwentyFourHours,
            weight: 1.5,
        },
    ];
    p.diversity = DiversitySpec {
        max_per_creator: Some(FOR_YOU_MAX_PER_CREATOR), // 2
        format_mix_max_fraction: Some(0.4),
    };
    p.exploration = FOR_YOU_EXPLORATION; // 0.1
    p
}
```

Different sort mode -- `Hot` with a lower gravity constant (1.5 vs 1.8), so content ages more slowly in a personalized feed than in a global hot page. Different signal aggregation -- `DecayScore` instead of `Velocity`, because personalized ranking cares about accumulated engagement history, not just the last 24 hours of acceleration. Likes weighted 2x. A format-mix constraint that prevents the feed from becoming all short-form video. A 10% exploration budget that injects content from outside the user's history to prevent filter bubbles.

Two profiles. Same struct. Same pipeline. Entirely different ranking behavior. Neither required a line of scoring logic in the application.
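The per-creator cap that both profiles set can be pictured as a single pass over the ranked list. The sketch below is not tidalDB's diversity code, which this post does not show. `Scored` and `cap_per_creator` are hypothetical names, and the function assumes its input is already sorted by descending score.

```rust
use std::collections::HashMap;

/// Hypothetical candidate shape for this sketch: an item, its creator, and a score.
#[derive(Debug, Clone, PartialEq)]
struct Scored {
    item_id: u64,
    creator_id: u64,
    score: f64,
}

/// Walk the ranked list in order and keep an item only while its creator
/// has fewer than `max_per_creator` items in the output. Stops at `limit`.
/// Assumes `ranked` is already sorted by descending score.
fn cap_per_creator(ranked: &[Scored], max_per_creator: usize, limit: usize) -> Vec<Scored> {
    let mut kept_by_creator: HashMap<u64, usize> = HashMap::new();
    let mut out = Vec::new();
    for candidate in ranked {
        if out.len() == limit {
            break;
        }
        let kept = kept_by_creator.entry(candidate.creator_id).or_insert(0);
        if *kept < max_per_creator {
            *kept += 1;
            out.push(candidate.clone());
        }
    }
    out
}
```

With a cap of 1, a creator's second-best item is skipped even when it outscores everything from other creators, which is the behavior the `trending` profile asks for. With a cap of 2, the same item is kept.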

## The registry

Profiles are registered in a `ProfileRegistry`, which enforces constraints at registration time:

```rust
// From tidal/src/ranking/registry.rs

pub fn register(&mut self, profile: RankingProfile) -> Result<(), ProfileError> {
    if !is_valid_name(&profile.name) {
        return Err(ProfileError::InvalidName(profile.name));
    }

    if !(0.0..=0.5).contains(&profile.exploration) {
        return Err(ProfileError::ExplorationOutOfRange(profile.exploration));
    }

    // Gate thresholds: DecayScore/Ratio must be in [0.0, 1.0],
    // Value/Velocity must be >= 0.0.
    for gate in &profile.gates {
        if matches!(gate.agg, SignalAgg::DecayScore | SignalAgg::Ratio) {
            if !(0.0..=1.0).contains(&gate.min_threshold) {
                return Err(ProfileError::GateThresholdOutOfRange(gate.min_threshold));
            }
        } else if gate.min_threshold < 0.0 {
            return Err(ProfileError::GateThresholdOutOfRange(gate.min_threshold));
        }
    }

    // Version must be strictly greater than the latest registered version.
    let versions = self.profiles.entry(profile.name.clone()).or_default();
    if let Some((&max_version, _)) = versions.last_key_value()
        && profile.version <= max_version
    {
        return Err(ProfileError::VersionConflict {
            name: profile.name,
            existing: max_version,
            new: profile.version,
        });
    }

    versions.insert(profile.version, profile);
    Ok(())
}
```

Names must match `[a-z0-9_]{1,64}`. Exploration cannot exceed 50%. Gate thresholds are validated against the aggregation type -- a `DecayScore` gate must be in [0.0, 1.0] because decay scores are normalized. Versions must increase monotonically. You cannot accidentally overwrite a production profile with a lower version number.
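The body of `is_valid_name` is not part of the excerpt above. A minimal sketch of a check equivalent to the `[a-z0-9_]{1,64}` pattern, written without a regex dependency, might look like the following. The implementation is an assumption, and `is_valid_exploration` is a hypothetical helper that only restates the range check from `register`.

```rust
/// Sketch of a name check equivalent to the pattern `[a-z0-9_]{1,64}`.
/// The real `is_valid_name` is not shown in this post; this is an assumed shape.
fn is_valid_name(name: &str) -> bool {
    // Byte length equals character count here because only ASCII passes.
    (1..=64).contains(&name.len())
        && name
            .bytes()
            .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || b == b'_')
}

/// Hypothetical helper: exploration must fall in the closed range [0.0, 0.5].
/// NaN is rejected because range containment is false for NaN.
fn is_valid_exploration(exploration: f64) -> bool {
    (0.0..=0.5).contains(&exploration)
}
```

Rejecting uppercase letters and hyphens keeps profile names safe to use as map keys, query parameters, and metric labels without escaping.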

The registry stores every version of every profile. A query can ask for the latest version (the default) or a specific version:

```rust
// From tidal/src/ranking/registry.rs

// Get latest version
let profile = registry.get("trending")?;

// Get specific version
let profile_v1 = registry.get_version("trending", 1)?;
```

This is what makes A/B testing clean. Register `trending` version 1 and version 2. Route cohort A to version 1, cohort B to version 2. Both are valid, both are registered, both execute through the same pipeline. The only difference is the data in the profile struct.
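Cohort routing itself stays in the application, and it can be small. The sketch below is one way to do it, not something tidalDB ships: `version_for_user` is a hypothetical helper that hashes the user ID together with an experiment name and maps the result to a control or treatment version. It uses FNV-1a rather than the standard library's default hasher, because the FNV-1a output is fixed by the algorithm and a user therefore stays in the same cohort across restarts.

```rust
/// FNV-1a 64-bit hash. The output depends only on the input bytes,
/// so cohort assignment is stable across processes and restarts.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical helper: pick a profile version for a user.
/// Roughly `treatment_percent` of users get `treatment`; the rest get `control`.
/// Mixing in the experiment name keeps cohorts independent across experiments.
fn version_for_user(
    user_id: u64,
    experiment: &str,
    control: u32,
    treatment: u32,
    treatment_percent: u64,
) -> u32 {
    let mut key = experiment.as_bytes().to_vec();
    key.extend_from_slice(&user_id.to_le_bytes());
    if fnv1a(&key) % 100 < treatment_percent {
        treatment
    } else {
        control
    }
}
```

The returned version is what the application would pass to `registry.get_version("trending", version)`. How a pinned version reaches the `Retrieve` builder is not shown in this post.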

## Same query, different profiles

The query does not change. The profile name does.

```rust
// Trending: what is gaining velocity right now
let query = Retrieve::builder()
    .profile("trending")
    .diversity(DiversityConstraints::new().max_per_creator(1))
    .limit(25)
    .build()?;

let trending_results = db.retrieve(&query)?;
```

```rust
// For You: personalized ranking with exploration
let query = Retrieve::builder()
    .profile("for_you")
    .diversity(DiversityConstraints::new()
        .max_per_creator(2)
        .format_mix(0.4))
    .limit(25)
    .build()?;

let for_you_results = db.retrieve(&query)?;
```

Same candidates. Same signal ledger. Same query pipeline. Different profile name produces different scoring formula, different signal weights, different diversity constraints, different exploration behavior. The application did not implement any ranking logic. It chose a name.

## What the executor does with a profile

The profile is not interpreted by a generic rules engine. It maps directly to the five-stage pipeline the executor runs for every query:

```rust
// From tidal/src/ranking/executor.rs

fn compute_raw_score(
    &self,
    entity_id: EntityId,
    profile: &RankingProfile,
    now: Timestamp,
) -> f64 {
    let base = self.score_by_sort(entity_id, profile.sort.as_ref(), now);

    // Apply boosts.
    let boost_sum: f64 = profile
        .boosts
        .iter()
        .map(|b| {
            let val = read_agg(entity_id, &b.signal, &b.agg, b.window, self.ledger);
            b.weight * val
        })
        .sum();

    base + boost_sum
}
```

The sort mode determines the base score. The boosts add weighted signal values on top. The result is a single `f64` per candidate. The executor does not know or care whether the profile was built-in or custom, version 1 or version 12. It reads the struct fields and computes.
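To make the arithmetic concrete, here is the same sum with the ledger stripped away. `BoostTerm` and `raw_score` are hypothetical names for this sketch, and `value` stands in for whatever `read_agg` would return for a signal.

```rust
/// One boost term, reduced to the two numbers the formula needs.
/// `value` stands in for the aggregate that `read_agg` would return.
struct BoostTerm {
    weight: f64,
    value: f64,
}

/// raw score = base score from the sort mode + sum of (weight * value) over all boosts.
fn raw_score(base: f64, boosts: &[BoostTerm]) -> f64 {
    let boost_sum: f64 = boosts.iter().map(|b| b.weight * b.value).sum();
    base + boost_sum
}
```

Take a candidate with a base score of 3.0, a share velocity of 1.5, and a view velocity of 4.0 (all made-up numbers). Under the `trending` weights of 2.0 and 1.0, the raw score is 3.0 + 2.0 × 1.5 + 1.0 × 4.0 = 10.0. Dropping the share weight to 1.5 changes it to 9.25, and nothing but the profile data changed.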

The sort modes themselves are concrete:

```rust
// From tidal/src/ranking/executor.rs

fn score_by_sort(&self, entity_id: EntityId, sort: Option<&Sort>, now: Timestamp) -> f64 {
    match sort {
        Some(Sort::Hot { gravity }) => self.score_hot(entity_id, *gravity, now),
        Some(Sort::Trending) => self.score_trending(entity_id),
        Some(Sort::Controversial) => self.score_controversial(entity_id),
        Some(Sort::HiddenGems) => self.score_hidden_gems(entity_id),
        Some(Sort::Shuffle) => shuffle_score(entity_id.as_u64()),
        Some(Sort::New) => { /* entity recency */ },
        Some(Sort::TopWindow { window }) => self.score_top_window(entity_id, *window),
        // ...
        None => 0.0,
    }
}
```

`Hot` applies a gravity decay by age: `log10(max(views, 1)) / (age_hours + 2)^gravity`. `Trending` reads view and share velocity over 24 hours. `Controversial` computes `(positive * negative) / (positive + negative)^2` -- content that splits opinion scores highest. `HiddenGems` divides quality (completion rate) by the log of reach (`log10(view_count + 10)`) -- the logarithmic denominator means popular content is only mildly penalized, not excluded, while high-quality content that few people have seen surfaces first. Each formula reads from the same signal ledger. Each produces a different ordering of the same candidates.
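Three of those formulas are simple enough to write as free functions. This is a sketch that takes plain numbers instead of reading the signal ledger. The function names mirror the executor methods, but the signatures are assumptions, and so is the zero-vote guard in `score_controversial`.

```rust
/// Hot: log10(max(views, 1)) / (age_hours + 2)^gravity
fn score_hot(views: u64, age_hours: f64, gravity: f64) -> f64 {
    (views.max(1) as f64).log10() / (age_hours + 2.0).powf(gravity)
}

/// Controversial: (positive * negative) / (positive + negative)^2
/// Peaks at 0.25 when votes split evenly. Returning 0.0 for an item with
/// no votes is an assumption made here to avoid dividing by zero.
fn score_controversial(positive: f64, negative: f64) -> f64 {
    let total = positive + negative;
    if total == 0.0 {
        0.0
    } else {
        (positive * negative) / (total * total)
    }
}

/// HiddenGems: completion_rate / log10(view_count + 10)
/// The +10 keeps the denominator at 1.0 or above, even for zero views.
fn score_hidden_gems(completion_rate: f64, view_count: u64) -> f64 {
    completion_rate / (view_count as f64 + 10.0).log10()
}
```

An item with 50 positive and 50 negative reactions scores 0.25 on `Controversial`, while one with 100 positive and none negative scores 0.0. For `Hot`, the same item at the same age scores higher under gravity 1.5 than under 1.8, which is why the `for_you` feed ages content more slowly.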

## Fifteen profiles ship by default

tidalDB registers 15 built-in profiles at startup:

```rust
// From tidal/src/ranking/builtins.rs

pub fn register_builtins(registry: &mut ProfileRegistry) -> Result<(), ProfileError> {
    // Population-level profiles.
    registry.register(trending())?;
    registry.register(hot())?;
    registry.register(new())?;
    registry.register(top_week())?;
    registry.register(top_month())?;
    registry.register(top_all_time())?;
    registry.register(hidden_gems())?;
    registry.register(controversial())?;
    registry.register(most_viewed())?;
    registry.register(most_liked())?;
    registry.register(shuffle())?;
    // Personalized profiles.
    registry.register(for_you())?;
    registry.register(following())?;
    registry.register(related())?;
    registry.register(notification())?;
    Ok(())
}
```

Each is a standard `RankingProfile` struct with `is_builtin: true`. They are not special-cased in the executor. They follow the same pipeline as any custom profile you register. The difference is that they ship with sensible defaults: the `hot` profile uses a gravity of 1.8, the `shuffle` profile injects 50% exploration, the `trending` profile caps results to one item per creator.

You can override any built-in by registering a new version with the same name. Register `trending` version 2 with a different share weight. The registry keeps both versions. Queries using `"trending"` get version 2 by default. Queries that explicitly request version 1 still get it. No data is lost. No behavior is silently changed.
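The override behavior falls out of the storage layout. The `register` excerpt earlier uses `entry(...).or_default()` and `last_key_value()`, which suggests a map from name to an ordered map of versions. The sketch below rebuilds that shape with stand-in types: `Profile` and `Registry` are hypothetical, errors are plain strings, and the lookups return `Option`, which may differ from the real return types.

```rust
use std::collections::BTreeMap;

/// Stand-in for `RankingProfile`, reduced to the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct Profile {
    name: String,
    version: u32,
    share_weight: f64,
}

/// Every version of every profile, keyed by name and then by version.
#[derive(Default)]
struct Registry {
    profiles: BTreeMap<String, BTreeMap<u32, Profile>>,
}

impl Registry {
    /// Reject any version that is not strictly greater than the latest one.
    fn register(&mut self, profile: Profile) -> Result<(), String> {
        let versions = self.profiles.entry(profile.name.clone()).or_default();
        if let Some((&max_version, _)) = versions.last_key_value() {
            if profile.version <= max_version {
                return Err(format!(
                    "{} v{} does not exceed existing v{}",
                    profile.name, profile.version, max_version
                ));
            }
        }
        versions.insert(profile.version, profile);
        Ok(())
    }

    /// Latest version: the entry with the largest key in the inner map.
    fn get(&self, name: &str) -> Option<&Profile> {
        self.profiles.get(name)?.last_key_value().map(|(_, p)| p)
    }

    /// A specific version, if it was ever registered.
    fn get_version(&self, name: &str, version: u32) -> Option<&Profile> {
        self.profiles.get(name)?.get(&version)
    }
}
```

Because the inner map is ordered by version, "latest" is a lookup of the last key rather than a separate pointer that could drift out of sync, and older versions stay addressable for as long as the registry holds them.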

## What to watch for

Declarative profiles cover the common ranking patterns -- weighted signals, gated thresholds, diversity constraints, exploration budgets. tidalDB ships 15 profiles because 15 profiles cover the surfaces most content platforms need.

But declarative does not mean unlimited.

You cannot express arbitrary computation in a profile. If your ranking requires calling an external ML model per candidate, or joining against a table in PostgreSQL, or applying business logic that depends on the current user's subscription tier, that logic lives in your application. The profile handles the signal-based scoring. The application handles the rest.

The boundary is deliberate. A profile that could express arbitrary computation would be a scripting language embedded in a database. That is a complexity trap. The value of a declarative profile is that it is constrained: the database knows the full set of operations it might perform, which means it can validate, optimize, and version the profile with confidence. A profile that passes registration will execute correctly. A profile that contains a Lua callback might do anything.

The escape hatch is the `CandidateStrategy`. A custom strategy can pre-filter or pre-score candidates before the profile's pipeline begins. The profile handles scoring, gating, boosting, and diversity. The strategy handles candidate sourcing. The boundary between them is where application-specific logic belongs.

## Why this matters

Ranking is the most consequential code in a content platform. It determines what every user sees. It runs on every request. It is also, in most organizations, the code that is hardest to change, hardest to test, and hardest to understand.

Making ranking a database primitive -- declarative, versioned, validated, swappable at query time -- does not make ranking simple. The problem is inherently complex. What it does is make ranking *operationally manageable*. Changing a weight is a data change, not a code change. Testing two strategies is a query parameter, not a feature flag pipeline. Every surface uses the same profiles through the same pipeline, so ranking behavior is consistent by construction.

The `RankingProfile` struct has 12 fields. It controls the entire scoring pipeline. It serializes to JSON. It fits in a diff. A product manager can read it. An engineer can review it in minutes, not hours. And when it is wrong, you register a new version. The old one is still there.

---

*The ranking profile type system is at [tidal/src/ranking/profile.rs](https://github.com/orchard9/tidalDB/blob/main/tidal/src/ranking/profile.rs). The 15 built-in profiles are at [tidal/src/ranking/builtins.rs](https://github.com/orchard9/tidalDB/blob/main/tidal/src/ranking/builtins.rs). The registry is at [tidal/src/ranking/registry.rs](https://github.com/orchard9/tidalDB/blob/main/tidal/src/ranking/registry.rs). Follow the build on [GitHub](https://github.com/orchard9/tidalDB).*