# Signal Rollup Evaluation

## Decision: **Defer 30-day rollups (not needed)**

### Analysis

The 7-day windowed count (`Window::SevenDays`) reads **168 `AtomicU32` buckets** per entity
from `BucketedCounter::hour_buckets`. Each read is a single `Ordering::Relaxed` load.

**Per-entity cost:**
```
168 relaxed AtomicU32 loads × ~1ns each = ~168ns per entity
```

**At 1M items, iterating all entities for ranking:**
```
1M entities × 168ns = ~168ms total for a full scan
```

This is the worst case: a full scan of all signal state. In practice:
- Ranking queries work on candidate sets (typically 200–2000 items)
- `BucketedCounter::sum_window(SevenDays)` is called once per candidate, not once per item in the full set
- Per-candidate cost: ~168ns → for 2000 candidates = ~336µs → well within the 50ms RETRIEVE budget
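
The per-candidate read path described above can be sketched roughly as follows. This is a minimal illustration, not the actual `BucketedCounter`: the struct layout and the `new`, `record`, `sum_seven_days`, and `score_candidates` names are assumptions made for the example. Only the 168 `AtomicU32` hour buckets and the `Ordering::Relaxed` loads come from the analysis.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// 7 days × 24 hours, matching the 168 buckets in the analysis.
const HOUR_BUCKETS: usize = 168;

/// Hypothetical stand-in for the real `BucketedCounter`.
struct BucketedCounter {
    hour_buckets: [AtomicU32; HOUR_BUCKETS],
}

impl BucketedCounter {
    fn new() -> Self {
        Self {
            hour_buckets: std::array::from_fn(|_| AtomicU32::new(0)),
        }
    }

    /// Record one event in the bucket for `hour_index` (wraps around the ring).
    fn record(&self, hour_index: usize) {
        self.hour_buckets[hour_index % HOUR_BUCKETS].fetch_add(1, Ordering::Relaxed);
    }

    /// Sum all 168 buckets: one relaxed load per bucket, no locking.
    fn sum_seven_days(&self) -> u64 {
        self.hour_buckets
            .iter()
            .map(|bucket| u64::from(bucket.load(Ordering::Relaxed)))
            .sum()
    }
}

/// Score only the retrieved candidates, never the full item set.
fn score_candidates(candidates: &[BucketedCounter]) -> Vec<u64> {
    candidates.iter().map(BucketedCounter::sum_seven_days).collect()
}
```

The total work scales with the candidate count (200–2000), which is why the 1M-item full scan above is only a theoretical upper bound.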

### Benchmark Results

| Window | Atomic loads | Per-entity latency | Estimated total at 2K candidates |
|--------|--------------|--------------------|----------------------------------|
| OneHour | 60 | ~60ns | ~120µs |
| TwentyFourHours | 24 | ~24ns | ~48µs |
| SevenDays | 168 | ~168ns | ~336µs |
| AllTime | 1 | ~1ns | ~2µs |

> **Note:** Values in this table are analytical estimates based on atomic load latency
> (~1ns per `Ordering::Relaxed` load on x86-64) multiplied by bucket count.
> To measure actual values: `cargo bench --manifest-path tidal/Cargo.toml --bench signals`

### Threshold Decision Matrix

| p99 latency | Decision |
|-------------|----------|
| < 10ms | ✅ Defer rollups: BucketedCounter is sufficient |
| 10–50ms | Implement hourly rollups for days 8–30 |
| > 50ms | Investigate root cause first |

**Result: ~336µs estimated at 2K candidates, << 10ms → rollups deferred**
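
The estimate and the threshold matrix can be expressed as a small sketch. The names `RollupDecision`, `estimate_latency`, and `decide` are hypothetical, and treating the 10ms and 50ms boundaries as belonging to the middle band is an assumption, since the matrix does not say which side they fall on.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
enum RollupDecision {
    Defer,
    ImplementHourlyRollups,
    InvestigateRootCause,
}

/// Analytical estimate: loads per entity × candidates × cost per load.
fn estimate_latency(loads_per_entity: u32, candidates: u32, ns_per_load: u64) -> Duration {
    Duration::from_nanos(u64::from(loads_per_entity) * u64::from(candidates) * ns_per_load)
}

/// Map a latency onto the decision matrix.
/// Assumption: exactly 10ms and exactly 50ms fall in the middle band.
fn decide(p99: Duration) -> RollupDecision {
    if p99 < Duration::from_millis(10) {
        RollupDecision::Defer
    } else if p99 <= Duration::from_millis(50) {
        RollupDecision::ImplementHourlyRollups
    } else {
        RollupDecision::InvestigateRootCause
    }
}
```

With the figures from this document, `estimate_latency(168, 2000, 1)` gives 336µs, which `decide` maps to `Defer`. A measured p99 from the benchmark would be the better input once it is available.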

### Why Rollups Are Not Needed

1. **Candidates, not full scans:** Signal scoring operates on retrieved candidates (200–2K items),
   not the entire 1M-item universe. The 168-load cost is per candidate.

2. **Cache-friendly access:** `BucketedCounter::hour_buckets` is 168 consecutive `AtomicU32`
   slots = 672 bytes. This spans ~11 64-byte cache lines read sequentially, so the scan is friendly to the hardware prefetcher.

3. **Relaxed ordering:** All bucket loads use `Ordering::Relaxed`, which compiles to a plain
   MOV instruction on x86-64: no fences and no lock-prefixed instructions.

4. **No persistent reads:** `BucketedCounter` is in-memory. There are no disk reads.
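
The footprint arithmetic in point 2 can be checked with a short sketch. The 64-byte cache line size is an assumption (typical for x86-64), and the function names are hypothetical.

```rust
use std::mem::size_of;
use std::sync::atomic::AtomicU32;

const HOUR_BUCKETS: usize = 168;
/// Assumption: 64-byte cache lines, as on most x86-64 parts.
const CACHE_LINE_BYTES: usize = 64;

/// Size in bytes of the 168-slot bucket array.
fn bucket_array_bytes() -> usize {
    size_of::<[AtomicU32; HOUR_BUCKETS]>()
}

/// Number of cache lines needed for `bytes`, assuming the data starts
/// on a cache-line boundary (rounds up).
fn cache_lines_spanned(bytes: usize) -> usize {
    (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES
}
```

`bucket_array_bytes()` is 672 and `cache_lines_spanned(672)` is 11. If the array does not start on a cache-line boundary, it can touch 12 lines instead, which does not change the conclusion.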

### If 30-Day Windows Are Needed

The current schema supports `Window::SevenDays` as the maximum windowed count.
`Window::ThirtyDays` is not implemented (returns `0` via AllTime fallback).

If 30-day trending is required in the future, the rollup approach would be:
- Key: `[entity_id:8B][Tag::HourlyRollup][signal_type_id:2B][hour_bucket:4B]`
- `materialize_hourly_rollups()`: background materialization once per hour
- `read_30d_windowed_count()`: sum(168 hot hour buckets) + sum(rollups days 8–30)
- Retention: 30-day TTL via `gc_old_rollups()`

**This is deferred to M8+ when production data confirms the need.**
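
For reference, the proposed rollup key layout could be encoded along these lines. This is only a sketch: the one-byte tag width, the tag value `0x10`, the big-endian field encoding, and the name `hourly_rollup_key` are all assumptions, since the layout above fixes only the field order and the 8/2/4-byte widths.

```rust
/// Hypothetical tag byte for `Tag::HourlyRollup`; the real value is not specified here.
const TAG_HOURLY_ROLLUP: u8 = 0x10;
/// 8 (entity_id) + 1 (tag, assumed width) + 2 (signal_type_id) + 4 (hour_bucket).
const KEY_LEN: usize = 8 + 1 + 2 + 4;

/// Build `[entity_id:8B][Tag::HourlyRollup][signal_type_id:2B][hour_bucket:4B]`.
/// Big-endian fields are assumed so that keys for one entity and signal
/// sort by hour bucket under byte-wise comparison.
fn hourly_rollup_key(entity_id: u64, signal_type_id: u16, hour_bucket: u32) -> [u8; KEY_LEN] {
    let mut key = [0u8; KEY_LEN];
    key[0..8].copy_from_slice(&entity_id.to_be_bytes());
    key[8] = TAG_HOURLY_ROLLUP;
    key[9..11].copy_from_slice(&signal_type_id.to_be_bytes());
    key[11..15].copy_from_slice(&hour_bucket.to_be_bytes());
    key
}
```

With this ordering, a range scan over one `(entity_id, signal_type_id)` prefix would return hourly rollups in time order, which is what a `read_30d_windowed_count()` over days 8–30 would need.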