tidaldb/docs/profiling/social-scale.md
2026-02-23 22:41:16 -07:00

# Social Scale Performance Analysis
## CoEngagementIndex Eviction
### Correctness Tests
All tests in `tidal/tests/m7p3_social_scale.rs` pass:
| Test | Invariant Verified |
|------|--------------------|
| `eviction_correctness_at_2x_capacity` | `edge_count <= capacity` after every insert at 2× load |
| `high_weight_edge_survival_under_eviction` | High-weight edges (score=100) survive low-weight flood |
| `co_engagement_memory_bounded_at_2x_insertions` | 10 users × 40 items stays within capacity=200 |
| `no_self_loops_after_eviction` | No `(a, a)` edges created under eviction pressure |
| `top_candidates_consistent_after_eviction` | `top_candidates()` works correctly after eviction |
### Eviction Benchmarks
| Capacity | Mean Eviction Latency | Notes |
|----------|-----------------------|-------|
| 10K edges | ~2ms | O(N log N) sort over 10K entries |
| 50K edges | ~12ms | O(N log N) sort over 50K entries |
| 100K edges | ~26ms | O(N log N) sort over 100K entries |

_Benchmark: `cargo bench --manifest-path tidal/Cargo.toml --bench social -- co_engagement_eviction`_

**Note:** Eviction cost is amortised. With `USER_RECENT_CAPACITY=50`, each `record_positive` call adds at
most 50 new edges. At the default capacity of 50K edges, eviction therefore fires at most once every
~50K / 50 = 1000 positive engagements, approximately once per active user session.
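
The sort-based eviction measured above can be sketched as follows. This is a minimal illustration, not the actual `CoEngagementIndex`: the `EdgeStore` type, its method names, the undirected key normalisation, and the choice to keep the heaviest half of the edges on eviction are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical capacity-bounded edge store (not the real CoEngagementIndex).
pub struct EdgeStore {
    capacity: usize,
    edges: HashMap<(u64, u64), f32>,
}

impl EdgeStore {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, edges: HashMap::new() }
    }

    /// Assumption: edges are undirected, so (a, b) and (b, a) share one key.
    fn key(a: u64, b: u64) -> (u64, u64) {
        if a < b { (a, b) } else { (b, a) }
    }

    /// Add `weight` to edge (a, b). Self-loops are never stored.
    pub fn add_edge(&mut self, a: u64, b: u64, weight: f32) {
        if a == b {
            return;
        }
        *self.edges.entry(Self::key(a, b)).or_insert(0.0) += weight;
        if self.edges.len() > self.capacity {
            self.evict();
        }
    }

    /// O(N log N) eviction: sort all edges by weight (descending) and keep
    /// the heaviest `capacity / 2`. The retained fraction is an assumption.
    fn evict(&mut self) {
        let keep = self.capacity / 2;
        let mut all: Vec<((u64, u64), f32)> = self.edges.drain().collect();
        all.sort_by(|x, y| y.1.total_cmp(&x.1));
        all.truncate(keep);
        self.edges.extend(all);
    }

    pub fn edge_count(&self) -> usize {
        self.edges.len()
    }

    pub fn weight(&self, a: u64, b: u64) -> Option<f32> {
        self.edges.get(&Self::key(a, b)).copied()
    }
}
```

Because each eviction frees a large block of slots at once, the sort cost is paid only after many inserts, which is what makes the per-insert cost amortised. A policy that evicted a single edge per insert would pay the sort on every call once the store is full.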
## Social Graph Filter at 1M Items
### Setup
- 100 followed creators
- 500 followers per creator
- 10K items per creator = **1M items total**
### Benchmark Results
| Depth | Mean Latency | p99 Latency | Notes |
|-------|-------------|-------------|-------|
| Depth-1 (followed creator items) | ~3ms | ~8ms | Direct bitmap union over 100 creators |
| Depth-2 (co-followers' seen items) | ~25ms | ~45ms | Fan-out: 100 creators × 500 followers |

**Target: p99 < 50ms (✅ achieved)**

_Run: `cargo bench --manifest-path tidal/Cargo.toml --bench social -- social_graph_1m`_
### Fan-out Cap
The social graph filter uses `DEPTH2_FAN_OUT_CAP` to bound the number of co-followers
processed at depth 2. Without the cap, the benchmark setup above requires 500 followers ×
100 creators = 50K co-follower lookups. With the cap, the number of lookups stays bounded
regardless of follower counts, which prevents p99 blowup.
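
A capped depth-2 traversal might look like the sketch below. It is illustrative only: the function name, the `HashMap`/`HashSet` stand-ins (the notes above indicate the real filter uses bitmaps), and the cap value of 1,000 are assumptions, since the actual value of `DEPTH2_FAN_OUT_CAP` is not stated here.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical value; the real DEPTH2_FAN_OUT_CAP is not given in this document.
pub const DEPTH2_FAN_OUT_CAP: usize = 1_000;

/// Union the items seen by co-followers of the followed creators,
/// visiting at most `cap` distinct co-followers.
/// Returns the candidate items and the number of co-followers visited.
pub fn depth2_candidates(
    followed: &[u64],
    followers_of: &HashMap<u64, Vec<u64>>,
    seen_by: &HashMap<u64, HashSet<u64>>,
    cap: usize,
) -> (HashSet<u64>, usize) {
    let mut visited: HashSet<u64> = HashSet::new();
    let mut items: HashSet<u64> = HashSet::new();

    'outer: for creator in followed {
        if let Some(followers) = followers_of.get(creator) {
            for &follower in followers {
                // Stop the whole traversal once the cap is reached.
                if visited.len() >= cap {
                    break 'outer;
                }
                // Skip co-followers already processed via another creator.
                if !visited.insert(follower) {
                    continue;
                }
                if let Some(seen) = seen_by.get(&follower) {
                    items.extend(seen.iter().copied());
                }
            }
        }
    }
    (items, visited.len())
}
```

With the 100 creator × 500 follower setup, an uncapped call visits all 50K co-followers, while a call with `cap = 1_000` stops after 1,000. The trade-off is recall: co-followers beyond the cap contribute no candidates.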
## Cross-Session Preference Merge
### Algorithm
`PreferenceVectors::update_with_custom_rate(user_id, &interaction, lr)`:
```
pref = (1 - lr) * pref + lr * interaction
pref = L2_normalize(pref)
```
For a 128-dimensional vector, this is 128 multiply-adds for the blend plus an L2 normalization: roughly 256 FP ops and one sqrt.
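
The update rule above can be written as a small standalone function. This is a sketch under stated assumptions: the free function `ema_update`, the `f32` slice representation, and the zero-norm guard are choices made for the example and are not taken from `PreferenceVectors`.

```rust
/// Blend `interaction` into `pref` with learning rate `lr`,
/// then L2-normalize `pref` in place.
pub fn ema_update(pref: &mut [f32], interaction: &[f32], lr: f32) {
    assert_eq!(pref.len(), interaction.len(), "dimension mismatch");

    // pref = (1 - lr) * pref + lr * interaction
    for (p, &x) in pref.iter_mut().zip(interaction) {
        *p = (1.0 - lr) * *p + lr * x;
    }

    // pref = L2_normalize(pref)
    let norm = pref.iter().map(|v| v * v).sum::<f32>().sqrt();
    // Assumption: an all-zero vector is left unchanged to avoid dividing by zero.
    if norm > 0.0 {
        for p in pref.iter_mut() {
            *p /= norm;
        }
    }
}
```

Starting from a unit vector along axis 0 and blending in a unit vector along axis 1 with `lr = 0.1` gives components 0.9 and 0.1 before normalization; after normalization the vector has unit length again and still points mostly along axis 0.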
### Benchmark Results
| Operation | Mean Latency | Target |
|-----------|-------------|--------|
| Single 128D EMA (100K users) | ~3µs | < 1ms |
| 10-session batch (100K users) | ~28µs | < 10ms |

**Both targets are met by more than two orders of magnitude.** The bottleneck is the DashMap lookup (~2µs overhead),
not the EMA computation itself (~1µs for 128D).

_Run: `cargo bench --manifest-path tidal/Cargo.toml --bench social -- preference_merge`_
### Adaptive Learning Rate
The `update()` method (used in production) uses an adaptive learning rate that decays as:
```
alpha = base_alpha / (1 + ln(update_count + 1))
```
Users with many updates therefore adapt more slowly, which keeps their preferences stable.
First-session users (`update_count = 0`) get `alpha = base_alpha = 0.1`; users with 100+ sessions get `alpha ≈ 0.02`.
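
The decay schedule can be checked numerically with a short function. The name `adaptive_alpha` and the `f64`/`u64` types are assumptions for illustration; only the formula comes from the text above.

```rust
/// alpha = base_alpha / (1 + ln(update_count + 1))
pub fn adaptive_alpha(base_alpha: f64, update_count: u64) -> f64 {
    base_alpha / (1.0 + ((update_count + 1) as f64).ln())
}
```

With `base_alpha = 0.1`, `update_count = 0` gives exactly 0.1 because `ln(1) = 0`. At `update_count = 100`, `ln(101) ≈ 4.615`, so `alpha ≈ 0.1 / 5.615 ≈ 0.018`. The logarithm makes the decay slow: the rate never reaches zero, so long-lived users still adapt.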