# Task 02: Built-in Profiles

## Context

**Milestone:** 2 -- Ranked Retrieval
**Phase:** m2p3 -- Ranking Profile Engine
**Depends On:** Task 01 (RankingProfile, ScoringRule, Sort, CandidateStrategy, ProfileRegistry types)
**Blocks:** Task 03 (Profile Executor + Benchmarks)
**Complexity:** M

## Objective

Deliver all 11 built-in ranking profiles as `RankingProfile` instances, registered into the `ProfileRegistry` at schema build time. Built-in profiles are not special-cased in the executor -- they go through the same execution pipeline as application-defined profiles. They are standard `RankingProfile` structs constructed from the types defined in Task 01.

Each built-in profile maps directly to a profile preset from Spec 09 Section 13. The profiles define which signals they require (e.g., `trending` requires `share` and `view` with velocity), and the registration logic validates signal availability against the schema. When a required signal is not present in the schema, the profile degrades gracefully: missing boosts/penalties contribute 0.0, and a `tracing::warn!` is emitted listing the missing signals. The profile is still registered and usable.

This task also delivers the signal dependency validation logic that connects profiles to the schema's signal definitions, closing the loop on INV-PROF-3 (signal reference validity).

## Requirements

- 11 built-in profiles defined as `RankingProfile` instances:
  - `trending` -- Spec 09 Section 13.2
  - `hot` -- Spec 09 Section 13.10
  - `new` -- pure `created_at DESC`
  - `top_week` -- quality score within 7d window
  - `top_month` -- quality score within 30d window
  - `top_all_time` -- all-time signal score
  - `hidden_gems` -- Spec 09 Section 13.7
  - `controversial` -- Spec 09 Section 13.12
  - `most_viewed` -- windowed view count DESC
  - `most_liked` -- windowed like count DESC
  - `shuffle` -- quality-weighted random ordering
- Each built-in profile specifies its signal dependencies
- Signal dependency validation against the schema
- Graceful degradation for missing signals (skip, warn, not error)
- Built-in profiles registered with `is_builtin: true`
- Application profiles can override built-ins by registering with the same name
- `register_builtins()` function that populates a `ProfileRegistry`
- No `unsafe` code

## Technical Design

### Module Structure

```
tidal/src/ranking/
  registry.rs    -- ProfileRegistry (Task 01), register_builtins(), SignalDependency,
                    validate_signal_dependencies()
```

### Public API

```rust
// === ranking/registry.rs (additions to Task 01 registry) ===

use std::collections::HashMap;
use super::profile::*;
use crate::schema::SignalTypeDef;

/// Signal dependency for a profile. Describes what the profile needs
/// from the schema's signal definitions.
#[derive(Debug, Clone)]
pub struct SignalDependency {
    /// Signal name (e.g., "view", "share", "like").
    pub signal_name: String,
    /// Whether velocity is required for this signal.
    pub requires_velocity: bool,
    /// Which windows are required.
    pub required_windows: Vec<Window>,
}

/// Result of validating a profile's signal dependencies against the schema.
#[derive(Debug, Clone)]
pub struct DependencyValidation {
    /// Signal names that are present in the schema and fully satisfy the profile.
    pub satisfied: Vec<String>,
    /// Signal names referenced by the profile but not found in the schema.
    pub missing: Vec<String>,
    /// Signal names present but lacking required velocity configuration.
    pub missing_velocity: Vec<String>,
    /// Signal names present but lacking required windows.
    pub missing_windows: Vec<(String, Vec<Window>)>,
}

impl DependencyValidation {
    /// True if all signal dependencies are fully satisfied.
    pub fn is_fully_satisfied(&self) -> bool {
        self.missing.is_empty()
            && self.missing_velocity.is_empty()
            && self.missing_windows.is_empty()
    }

    /// True if at least one signal dependency is satisfied.
    /// The profile can operate in degraded mode.
    pub fn is_partially_satisfied(&self) -> bool {
        !self.satisfied.is_empty()
    }
}

/// Validate a profile's signal dependencies against the schema's signal definitions.
///
/// Returns a `DependencyValidation` describing which signals are available,
/// missing, or partially available.
pub fn validate_signal_dependencies(
    profile: &RankingProfile,
    signal_defs: &[SignalTypeDef],
) -> DependencyValidation;

/// Register all built-in profiles into the registry.
///
/// Each built-in is validated against the provided signal definitions.
/// Profiles with missing signals are still registered but emit warnings.
/// Applications can override any built-in by removing it with
/// `registry.remove()` and then calling `registry.register()` with a
/// profile of the same name.
///
/// # Arguments
///
/// * `registry` -- The profile registry to populate.
/// * `signal_defs` -- Signal type definitions from the schema.
///
/// # Returns
///
/// A map of profile name to `DependencyValidation` for observability.
pub fn register_builtins(
    registry: &mut ProfileRegistry,
    signal_defs: &[SignalTypeDef],
) -> HashMap<String, DependencyValidation>;
```

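The classification that `validate_signal_dependencies()` is expected to perform can be sketched as follows. This is a minimal illustration, not the Task 01 implementation: `Window`, `SignalDef`, `Dependency`, `Validation`, and `validate` are simplified, hypothetical stand-ins, and the sketch assumes the dependencies have already been extracted from the profile rather than taking a `RankingProfile` directly.

```rust
// Hypothetical, simplified stand-ins for Window, SignalTypeDef,
// SignalDependency, and DependencyValidation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Window { OneHour, TwentyFourHours, SevenDays, ThirtyDays, AllTime }

pub struct SignalDef {
    pub name: String,
    pub velocity_enabled: bool,
    pub windows: Vec<Window>,
}

pub struct Dependency {
    pub signal_name: String,
    pub requires_velocity: bool,
    pub required_windows: Vec<Window>,
}

#[derive(Debug, Default)]
pub struct Validation {
    pub satisfied: Vec<String>,
    pub missing: Vec<String>,
    pub missing_velocity: Vec<String>,
    pub missing_windows: Vec<(String, Vec<Window>)>,
}

/// Classify each dependency as satisfied, missing, missing velocity,
/// or missing windows. A present signal may land in both partial lists.
pub fn validate(deps: &[Dependency], defs: &[SignalDef]) -> Validation {
    let mut out = Validation::default();
    for dep in deps {
        // Signal not defined in the schema at all.
        let Some(def) = defs.iter().find(|d| d.name == dep.signal_name) else {
            out.missing.push(dep.signal_name.clone());
            continue;
        };
        let velocity_ok = !dep.requires_velocity || def.velocity_enabled;
        // Required windows that the schema definition does not provide.
        let absent: Vec<Window> = dep
            .required_windows
            .iter()
            .copied()
            .filter(|w| !def.windows.contains(w))
            .collect();
        let windows_ok = absent.is_empty();
        if !velocity_ok {
            out.missing_velocity.push(dep.signal_name.clone());
        }
        if !windows_ok {
            out.missing_windows.push((dep.signal_name.clone(), absent));
        }
        if velocity_ok && windows_ok {
            out.satisfied.push(dep.signal_name.clone());
        }
    }
    out
}
```

Under these assumptions, a `view` signal defined without velocity and with only a 1h window would be reported in both `missing_velocity` and `missing_windows` for a dependency that needs velocity plus 1h and 24h, while an undefined `share` would appear only in `missing`.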
### Built-in Profile Definitions

Each built-in is a function returning a `RankingProfile`:

```rust
/// trending: pure velocity, no personalization. Spec 09 Section 13.2.
///
/// Requires: share (velocity), view (velocity, unique_ratio)
/// Gate: engagement_ratio >= 0.03
fn builtin_trending() -> RankingProfile {
    let mut p = RankingProfile::new("trending", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_boost(Boost::new("share", Window::OneHour, SignalAgg::Velocity, 0.5))
    .with_boost(Boost::new("view", Window::OneHour, SignalAgg::Velocity, 0.3))
    // TODO(M3): upgrade to SignalAgg::UniqueRatio when per-user dedup is available
    .with_boost(Boost::new("view", Window::TwentyFourHours, SignalAgg::Value, 0.2))
    .with_gate(Gate::min_ratio("engagement_ratio", 0.03))
    .with_diversity(DiversitySpec {
        max_per_creator: Some(1),
        ..Default::default()
    })
    .set_builtin(true);
    p
}

/// hot: score / (age_hours + 2)^gravity. Spec 09 Section 13.10.
///
/// Requires: like, dislike (for positive/negative computation)
/// Sort formula replaces boost/penalty pipeline.
fn builtin_hot() -> RankingProfile {
    let mut p = RankingProfile::new("hot", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_diversity(DiversitySpec {
        max_per_creator: Some(2),
        ..Default::default()
    })
    .with_sort(Sort::Hot { gravity: 1.8 })
    .set_builtin(true);
    p
}

/// new: created_at DESC. Pure chronological, no scoring.
///
/// Requires: no signals (metadata sort only).
fn builtin_new() -> RankingProfile {
    let mut p = RankingProfile::new("new", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_sort(Sort::New)
    .set_builtin(true);
    p
}

/// top_week: quality score within 7d window. Spec 09 Section 11.7.
///
/// Requires: view, like, share, completion (windowed counts)
fn builtin_top_week() -> RankingProfile {
    let mut p = RankingProfile::new("top_week", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_sort(Sort::TopWindow { window: Window::SevenDays })
    .with_diversity(DiversitySpec {
        max_per_creator: Some(2),
        ..Default::default()
    })
    .set_builtin(true);
    p
}

/// top_month: quality score within 30d window.
///
/// Requires: view, like, share, completion (windowed counts)
fn builtin_top_month() -> RankingProfile {
    let mut p = RankingProfile::new("top_month", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_sort(Sort::TopWindow { window: Window::ThirtyDays })
    .with_diversity(DiversitySpec {
        max_per_creator: Some(2),
        ..Default::default()
    })
    .set_builtin(true);
    p
}

/// top_all_time: all-time signal score.
///
/// Requires: view, like, share, completion (all-time counts)
fn builtin_top_all_time() -> RankingProfile {
    let mut p = RankingProfile::new("top_all_time", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_sort(Sort::TopWindow { window: Window::AllTime })
    .with_diversity(DiversitySpec {
        max_per_creator: Some(2),
        ..Default::default()
    })
    .set_builtin(true);
    p
}

/// hidden_gems: quality * inverse_reach. Spec 09 Section 13.7.
///
/// Requires: completion (all_time), like (all_time), view (all_time count)
/// Gate: completion_rate >= 0.5, view count >= 50
fn builtin_hidden_gems() -> RankingProfile {
    let mut p = RankingProfile::new("hidden_gems", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_gate(Gate::min_signal("completion", Window::AllTime, 0.5))
    .with_gate(Gate::min_count("view", Window::AllTime, 50))
    .with_diversity(DiversitySpec {
        max_per_creator: Some(1),
        format_mix: true,
        topic_diversity: Some(0.5),
        ..Default::default()
    })
    .with_sort(Sort::HiddenGems)
    .set_builtin(true);
    p
}

/// controversial: maximize positive * negative product. Spec 09 Section 13.12.
///
/// Requires: like (all_time count), dislike (all_time count)
/// Gate: like count >= 50 AND dislike count >= 50
fn builtin_controversial() -> RankingProfile {
    let mut p = RankingProfile::new("controversial", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_gate(Gate::min_count("like", Window::AllTime, 50))
    .with_gate(Gate::min_count("dislike", Window::AllTime, 50))
    .with_diversity(DiversitySpec {
        max_per_creator: Some(2),
        ..Default::default()
    })
    .with_sort(Sort::Controversial)
    .set_builtin(true);
    p
}

/// most_viewed: view count DESC within 7d window.
///
/// Requires: view (7d windowed count)
fn builtin_most_viewed() -> RankingProfile {
    let mut p = RankingProfile::new("most_viewed", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_sort(Sort::MostViewed { window: Window::SevenDays })
    .set_builtin(true);
    p
}

/// most_liked: like count DESC within all-time window.
///
/// Requires: like (all-time count)
fn builtin_most_liked() -> RankingProfile {
    let mut p = RankingProfile::new("most_liked", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_sort(Sort::MostLiked { window: Window::AllTime })
    .set_builtin(true);
    p
}

/// shuffle: quality-weighted random ordering. Spec 09 Section 11.6.
///
/// Requires: completion (all_time), like (all_time), view (all_time)
/// for quality_weight computation. Falls back to uniform random if
/// quality signals are unavailable.
fn builtin_shuffle() -> RankingProfile {
    let mut p = RankingProfile::new("shuffle", 1);
    p.with_candidate_strategy(CandidateStrategy::Scan {
        entity: EntityKind::Item,
    })
    .with_sort(Sort::Shuffle)
    .set_builtin(true);
    p
}
```

### Signal Dependency Table

| Profile | Required Signals | Required Windows | Requires Velocity |
|---------|------------------|------------------|-------------------|
| `trending` | share, view | 1h, 24h | Yes (share, view) |
| `hot` | like, dislike | all_time | No |
| `new` | (none) | (none) | No |
| `top_week` | view, like, share, completion | 7d | No |
| `top_month` | view, like, share, completion | 30d | No |
| `top_all_time` | view, like, share, completion | all_time | No |
| `hidden_gems` | completion, like, view | all_time | No |
| `controversial` | like, dislike | all_time | No |
| `most_viewed` | view | 7d | No |
| `most_liked` | like | all_time | No |
| `shuffle` | completion, like, view | all_time | No (quality weight fallback) |

### Degradation Strategy

When `register_builtins()` validates a profile against the schema's signal definitions:

1. **All signals present:** Profile registered as-is. No warnings.

2. **Some signals missing:** Profile registered with missing signals noted. `tracing::warn!("built-in profile '{}' missing signals: {:?}. These scoring rules will contribute 0.0", name, missing)`. The profile's `RankingProfile` struct is not modified -- the executor (Task 03) checks signal availability at scoring time and skips missing signals.

3. **All signals missing:** Profile still registered (it may use a sort formula that does not require signals, like `new` or `shuffle`). Warning emitted. If the sort formula also requires signals (like `hot` requires like/dislike counts), the executor returns 0.0 for all candidates, which produces an arbitrary but stable ordering.

4. **Signal present but missing velocity:** Warning emitted for boosts that use `SignalAgg::Velocity` on a signal without `velocity_enabled: true`. The executor falls back to `SignalAgg::Value` for that boost.

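The scoring-time behavior in cases 2 and 4 above could look roughly like the sketch below. It is an illustration only: `Agg`, `Boost`, `Reading`, and `score` are hypothetical stand-ins, and the real executor belongs to Task 03. A `velocity` of `None` is assumed here to mean the signal was defined without velocity enabled.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `SignalAgg`, reduced to the two variants used here.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Agg {
    Value,
    Velocity,
}

/// Hypothetical stand-in for a boost rule.
pub struct Boost {
    pub signal: &'static str,
    pub agg: Agg,
    pub weight: f64,
}

/// Hypothetical per-item reading for one signal.
/// `velocity` is `None` when velocity tracking is not enabled.
pub struct Reading {
    pub value: f64,
    pub velocity: Option<f64>,
}

/// Sum weighted boosts for one item.
/// - A boost whose signal has no reading contributes 0.0.
/// - A velocity boost on a signal without velocity falls back to the value.
pub fn score(boosts: &[Boost], readings: &HashMap<&str, Reading>) -> f64 {
    boosts
        .iter()
        .map(|b| {
            let Some(r) = readings.get(b.signal) else {
                return 0.0; // missing signal: skip this rule
            };
            let raw = match b.agg {
                Agg::Velocity => r.velocity.unwrap_or(r.value),
                Agg::Value => r.value,
            };
            b.weight * raw
        })
        .sum()
}
```

Keeping the fallback in the scoring path, rather than rewriting the profile at registration, matches the statement above that the `RankingProfile` struct is not modified.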
### Error Handling

- `register_builtins()` never fails. All built-in profiles are guaranteed to have valid names, versions, and structure. Signal dependency warnings are advisory, not errors.
- If a built-in profile name conflicts with an already-registered application profile, the application profile takes precedence. The built-in is skipped with a `tracing::info!` log.

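The precedence rule for name conflicts can be sketched as below. `Profile`, `Registry`, and `register_builtin_names` are hypothetical stand-ins for `RankingProfile`, `ProfileRegistry`, and `register_builtins()`; the sketch returns the skipped names instead of logging so that it stays free of the `tracing` dependency.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `RankingProfile`: only the fields needed here.
#[derive(Debug, Clone, PartialEq)]
pub struct Profile {
    pub name: String,
    pub is_builtin: bool,
}

/// Hypothetical stand-in for `ProfileRegistry`.
#[derive(Debug, Default)]
pub struct Registry {
    profiles: HashMap<String, Profile>,
}

impl Registry {
    pub fn contains(&self, name: &str) -> bool {
        self.profiles.contains_key(name)
    }

    pub fn get(&self, name: &str) -> Option<&Profile> {
        self.profiles.get(name)
    }

    pub fn insert(&mut self, profile: Profile) {
        self.profiles.insert(profile.name.clone(), profile);
    }
}

/// Register built-ins by name, skipping any name that is already taken
/// so that an application profile keeps precedence.
/// Returns the skipped names; the real function would log them instead.
pub fn register_builtin_names(registry: &mut Registry, names: &[&str]) -> Vec<String> {
    let mut skipped = Vec::new();
    for &name in names {
        if registry.contains(name) {
            skipped.push(name.to_string());
            continue;
        }
        registry.insert(Profile {
            name: name.to_string(),
            is_builtin: true,
        });
    }
    skipped
}
```

With an application-defined `trending` already present, registering `trending` and `hot` under this sketch leaves the application's `trending` untouched and adds only `hot` as a built-in.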
## Test Strategy

### Unit Tests

```rust
#[test]
fn all_11_builtins_registered() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);
    assert_eq!(registry.len(), 11);

    let expected_names = [
        "trending", "hot", "new", "top_week", "top_month", "top_all_time",
        "hidden_gems", "controversial", "most_viewed", "most_liked", "shuffle",
    ];
    for name in &expected_names {
        assert!(registry.contains(name), "missing built-in profile: {}", name);
    }
}

#[test]
fn builtins_are_flagged_builtin() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);

    for name in registry.list_names() {
        let profile = registry.get(name).unwrap();
        assert!(profile.is_builtin(),
            "built-in profile '{}' should have is_builtin=true", name);
    }
}

#[test]
fn builtins_have_version_1() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);

    for name in registry.list_names() {
        let profile = registry.get(name).unwrap();
        assert_eq!(profile.version(), 1,
            "built-in profile '{}' should have version 1", name);
    }
}

#[test]
fn hot_profile_has_correct_gravity() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);

    let hot = registry.get("hot").unwrap();
    match hot.sort() {
        Some(Sort::Hot { gravity }) => {
            assert!((gravity - 1.8).abs() < f64::EPSILON,
                "hot gravity should be 1.8, got {}", gravity);
        }
        other => panic!("hot profile should have Sort::Hot, got {:?}", other),
    }
}

#[test]
fn trending_profile_has_velocity_boosts() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);

    let trending = registry.get("trending").unwrap();
    assert!(!trending.boosts().is_empty(), "trending should have boosts");

    let share_boost = trending.boosts().iter()
        .find(|b| b.signal == "share")
        .expect("trending should boost share");
    assert_eq!(share_boost.aggregation, SignalAgg::Velocity);
    assert!((share_boost.weight - 0.5).abs() < f64::EPSILON);
}

#[test]
fn new_profile_has_no_boosts_or_signals() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);

    let new = registry.get("new").unwrap();
    assert!(new.boosts().is_empty());
    assert!(new.penalties().is_empty());
    assert!(new.gates().is_empty());
    assert!(matches!(new.sort(), Some(Sort::New)));
}

#[test]
fn hidden_gems_has_quality_gates() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);

    let hg = registry.get("hidden_gems").unwrap();
    assert_eq!(hg.gates().len(), 2, "hidden_gems should have 2 gates");

    let has_completion_gate = hg.gates().iter().any(|g| {
        matches!(g, Gate::MinSignal { signal, threshold, .. }
            if signal == "completion" && (*threshold - 0.5).abs() < f64::EPSILON)
    });
    assert!(has_completion_gate, "hidden_gems should gate on completion >= 0.5");

    let has_view_gate = hg.gates().iter().any(|g| {
        matches!(g, Gate::MinCount { signal, count, .. }
            if signal == "view" && *count == 50)
    });
    assert!(has_view_gate, "hidden_gems should gate on view count >= 50");
}

#[test]
fn controversial_gates_on_like_and_dislike() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);

    let c = registry.get("controversial").unwrap();
    assert_eq!(c.gates().len(), 2);

    let has_like_gate = c.gates().iter().any(|g| {
        matches!(g, Gate::MinCount { signal, count, .. }
            if signal == "like" && *count == 50)
    });
    assert!(has_like_gate, "controversial should gate on like count >= 50");

    let has_dislike_gate = c.gates().iter().any(|g| {
        matches!(g, Gate::MinCount { signal, count, .. }
            if signal == "dislike" && *count == 50)
    });
    assert!(has_dislike_gate, "controversial should gate on dislike count >= 50");
}

#[test]
fn all_scan_profiles_use_scan_strategy() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);

    // All M2 built-ins use Scan (ANN, Hybrid, Relationship are M3+)
    for name in registry.list_names() {
        let profile = registry.get(name).unwrap();
        assert!(
            matches!(profile.candidate_strategy(), CandidateStrategy::Scan { .. }),
            "built-in '{}' should use Scan strategy for M2", name
        );
    }
}

#[test]
fn dependency_validation_all_satisfied() {
    let signal_defs = vec![
        make_signal_def("view", true, &[Window::OneHour, Window::TwentyFourHours, Window::SevenDays, Window::AllTime]),
        make_signal_def("like", false, &[Window::AllTime]),
        make_signal_def("share", true, &[Window::OneHour]),
        make_signal_def("completion", false, &[Window::AllTime]),
        make_signal_def("dislike", false, &[Window::AllTime]),
    ];

    let mut registry = ProfileRegistry::new();
    let validations = register_builtins(&mut registry, &signal_defs);

    let trending_v = &validations["trending"];
    // share (1h, velocity) and view (1h, 24h, velocity) are all available
    assert!(trending_v.is_fully_satisfied());
    assert!(trending_v.satisfied.contains(&"share".to_string()));
    assert!(trending_v.satisfied.contains(&"view".to_string()));
}

#[test]
fn dependency_validation_missing_signals() {
    // Schema only has "view" -- "share" is missing for trending
    let signal_defs = vec![
        make_signal_def("view", true, &[Window::OneHour, Window::TwentyFourHours, Window::SevenDays, Window::AllTime]),
    ];

    let mut registry = ProfileRegistry::new();
    let validations = register_builtins(&mut registry, &signal_defs);

    let trending_v = &validations["trending"];
    assert!(trending_v.is_partially_satisfied());
    assert!(trending_v.missing.contains(&"share".to_string()));

    // Profile should still be registered
    assert!(registry.contains("trending"));
}

#[test]
fn dependency_validation_no_signals_at_all() {
    let mut registry = ProfileRegistry::new();
    let validations = register_builtins(&mut registry, &[]);

    // All profiles still registered
    assert_eq!(registry.len(), 11);

    // "new" should have no missing signals (it uses no signals)
    let new_v = &validations["new"];
    assert!(new_v.missing.is_empty());

    // "trending" should have all signals missing
    let trending_v = &validations["trending"];
    assert!(!trending_v.missing.is_empty());
}

#[test]
fn application_override_replaces_builtin() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);

    // Override "trending" with a custom profile
    let mut custom = RankingProfile::new("trending", 1);
    custom.with_boost(Boost::new("view", Window::OneHour, SignalAgg::Velocity, 1.0));

    // Remove the built-in first, then register custom
    registry.remove("trending");
    registry.register(custom).unwrap();

    let trending = registry.get("trending").unwrap();
    assert!(!trending.is_builtin(), "overridden profile should not be builtin");
    assert_eq!(trending.boosts().len(), 1);
    assert!((trending.boosts()[0].weight - 1.0).abs() < f64::EPSILON);
}

#[test]
fn builtin_serde_roundtrip() {
    let mut registry = ProfileRegistry::new();
    register_builtins(&mut registry, &[]);

    for name in registry.list_names() {
        let profile = registry.get(name).unwrap();
        let json = serde_json::to_string(profile).unwrap();
        let restored: RankingProfile = serde_json::from_str(&json).unwrap();
        assert_eq!(restored.name(), profile.name());
        assert_eq!(restored.version(), profile.version());
        assert_eq!(restored.boosts().len(), profile.boosts().len());
        assert_eq!(restored.gates().len(), profile.gates().len());
    }
}

// Test helper
fn make_signal_def(name: &str, velocity: bool, windows: &[Window]) -> SignalTypeDef {
    use std::time::Duration;
    use crate::schema::{DecayModel, WindowSet};
    SignalTypeDef::new(
        name.into(),
        EntityKind::Item,
        DecayModel::exponential(Duration::from_secs(604_800)),
        WindowSet::new(windows),
        velocity,
    )
}
```

### Property Tests

```rust
use proptest::prelude::*;

// P1: All built-in profiles pass validation when registered.
#[test]
fn all_builtins_pass_validation() {
    let mut registry = ProfileRegistry::new();
    // register_builtins should never panic or return errors
    let _validations = register_builtins(&mut registry, &[]);

    // Verify every registered profile has a valid name
    for name in registry.list_names() {
        let profile = registry.get(name).unwrap();
        assert!(!profile.name().is_empty());
        assert!(profile.name().chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_'));
        assert!(profile.name().chars().next().unwrap().is_ascii_lowercase());
    }
}

// P2: Degradation is consistent -- adding signals never reduces
// the set of registered profiles.
proptest! {
    #[test]
    fn more_signals_never_fewer_profiles(
        // Inclusive upper bound so the all-five-signals case is exercised.
        num_signals in 0usize..=5,
    ) {
        let all_signals = ["view", "like", "share", "completion", "dislike"];
        let signal_defs: Vec<_> = all_signals[..num_signals.min(all_signals.len())]
            .iter()
            .map(|name| make_signal_def(name, true,
                &[Window::OneHour, Window::TwentyFourHours, Window::SevenDays, Window::AllTime]))
            .collect();

        let mut registry = ProfileRegistry::new();
        register_builtins(&mut registry, &signal_defs);

        // All 11 profiles should be registered regardless of signal availability
        prop_assert_eq!(registry.len(), 11,
            "expected 11 profiles with {} signals, got {}",
            num_signals, registry.len());
    }
}
```

## Acceptance Criteria

- [ ] 11 built-in profiles registered: `trending`, `hot`, `new`, `top_week`, `top_month`, `top_all_time`, `hidden_gems`, `controversial`, `most_viewed`, `most_liked`, `shuffle`
- [ ] All built-in profiles have `is_builtin: true` and version 1
- [ ] All built-in profiles use `CandidateStrategy::Scan` for M2
- [ ] `trending` has `share` velocity boost (0.5), `view` velocity boost (0.3), `engagement_ratio` gate >= 0.03, `max_per_creator: 1`
- [ ] `hot` has `Sort::Hot { gravity: 1.8 }`, `max_per_creator: 2`, no boosts/penalties
- [ ] `new` has `Sort::New`, no boosts/penalties/gates
- [ ] `top_week` has `Sort::TopWindow { window: SevenDays }`, `max_per_creator: 2`
- [ ] `top_month` has `Sort::TopWindow { window: ThirtyDays }`, `max_per_creator: 2`
- [ ] `top_all_time` has `Sort::TopWindow { window: AllTime }`, `max_per_creator: 2`
- [ ] `hidden_gems` has `Sort::HiddenGems`, gates on completion >= 0.5 and view count >= 50, `max_per_creator: 1`, `format_mix: true`, `topic_diversity: 0.5`
- [ ] `controversial` has `Sort::Controversial`, gates on like count >= 50 and dislike count >= 50, `max_per_creator: 2`
- [ ] `most_viewed` has `Sort::MostViewed { window: SevenDays }`
- [ ] `most_liked` has `Sort::MostLiked { window: AllTime }`
- [ ] `shuffle` has `Sort::Shuffle`
- [ ] `validate_signal_dependencies()` correctly classifies signals as satisfied, missing, missing_velocity, or missing_windows
- [ ] `register_builtins()` registers all 11 profiles even when zero signal definitions are provided
- [ ] Missing signals produce `tracing::warn!` at registration time, not errors
- [ ] Application profiles can override built-ins by removing and re-registering with the same name
- [ ] All built-in profiles survive serde JSON roundtrip
- [ ] `register_builtins()` never panics regardless of input signal definitions
- [ ] No `unsafe` code
- [ ] `cargo clippy -- -D warnings` passes
- [ ] All unit tests and property tests pass

## Research References

- [docs/research/tidaldb_signal_ledger.md](../../../research/tidaldb_signal_ledger.md) -- Signal type definitions that profiles reference

## Spec References

- [docs/specs/09-ranking-scoring.md](../../../specs/09-ranking-scoring.md) -- Section 11 (Built-in sort modes: Hot formula Section 11.1, Trending Section 11.2, Rising Section 11.3, Controversial Section 11.4, HiddenGems Section 11.5, Shuffle Section 11.6, Top windowed Section 11.7, simple field sorts Section 11.8), Section 13 (Profile presets: all 12 presets with exact field definitions), Section 16 (INV-PROF-3: signal reference validity)

## Implementation Notes

- `register_builtins()` should be called from `SchemaBuilder::build()` or from `TidalDb::open()` after the schema is loaded. The exact call site depends on how the schema-to-registry wiring evolves. For M2, call it from a new method on `TidalDb` or from a test helper.
- The `trending` profile in Spec 09 Section 13.2 uses `UniqueRatio` aggregation for `view` in the 24h window. `UniqueRatio` requires per-user deduplication in the signal system, which is not implemented until M3. For M2, substitute `SignalAgg::Value` for the third boost. Comment the substitution with `// TODO(M3): upgrade to SignalAgg::UniqueRatio when per-user dedup is available`.
- The `Rising` sort mode (Spec 09 Section 11.3) requires a per-creator baseline velocity, which is not available until M3 (creator entities). The `rising` profile is NOT included in the 11 M2 built-ins. It is deferred to M3.
- `dislike` signal may not be in every schema. The `hot` and `controversial` profiles reference it. When `dislike` is missing, `hot` uses only `like` count (positive = likes, negative = 0), which degrades to a simpler formula. `controversial` degrades to 0.0 for all candidates (no controversy without a negative signal).
- The `make_signal_def` test helper constructs `SignalTypeDef` instances for testing. It uses `pub(crate)` constructors from m1p1. If `SignalTypeDef::new()` is not accessible from tests (it is `pub(crate)`), add a `#[cfg(test)]` helper or use the `SchemaBuilder` to construct test schemas.
- Add `tracing` to dependencies if not already present, for `tracing::warn!` on missing signals.
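
The `hot` degradation described in the notes above can be illustrated with a small sketch. It uses the formula from the `builtin_hot` doc comment, `score / (age_hours + 2)^gravity`, and assumes that `score` is `positive - negative` (likes minus dislikes); that definition is an assumption made here for illustration, and Spec 09 remains the authority. `hot_score` is a hypothetical helper name.

```rust
/// Hot ranking score: (positive - negative) / (age_hours + 2)^gravity.
///
/// `dislikes` is `None` when the schema defines no `dislike` signal,
/// in which case the negative term is treated as 0.
/// Assumption: score = likes - dislikes; see Spec 09 for the exact formula.
pub fn hot_score(likes: u64, dislikes: Option<u64>, age_hours: f64, gravity: f64) -> f64 {
    let positive = likes as f64;
    let negative = dislikes.unwrap_or(0) as f64;
    (positive - negative) / (age_hours + 2.0).powf(gravity)
}
```

With the built-in gravity of 1.8, an item's score under this sketch falls as `age_hours` grows, and a schema without `dislike` simply ranks by like count discounted by age.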