tidaldb/docs/planning/milestone-3/phase-1/task-02-relationship-graph.md
jordan 39ada28c6e feat: complete Milestones 2–4 — RETRIEVE query, vector index, ranking profiles, diversity, entity system, sessions
M2: RETRIEVE query pipeline with 5-stage execution (candidate → filter → score → diversify → limit),
    usearch HNSW vector index, bitmap/range/universe filters, ranking profiles with signal scoring,
    MMR diversity enforcement, and m2_uat integration tests.

M3: Entity system with typed metadata, relationship graph (follows/blocks/interactions),
    creator entities, session tracking, and m3_uat integration tests.

M4: Advanced ranking with builtin functions (freshness, trending, controversy, wilson),
    ranking executor with explain mode, query executor integration, benchmarks for
    query/ranking/vector/filters/diversity, and m4_uat integration tests.

Includes: 9 new blog posts, marketing site updates, updated roadmap, and updated vision doc.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 16:24:48 -07:00

# Task 02: Relationship Graph
## Context
**Milestone:** 3 -- Personalized Ranking
**Phase:** m3p1 -- User and Creator Entities with Relationships
**Depends On:** Task 01 (User + Creator entity types, `StorageBox` with users/creators engines)
**Blocks:** Task 03 (User-State Bitmap Indexes), m3p2 (Feedback Loop needs interaction_weight edges), m3p3 (Personalized Profiles need follows/blocks)
**Complexity:** L
## Objective
Deliver the relationship graph: typed, weighted, directional edges between entities stored in the users keyspace under `Tag::Rel`. The graph supports five relationship types (`follows`, `blocks`, `interaction_weight`, `hide`, `mute`) with CRUD operations and prefix-scanned enumeration. Edges are encoded so that all relationships from a single user can be scanned with one prefix, and all relationships of a given type from a user can be scanned with a narrower prefix.
The relationship graph is the foundation for:
- **`follows` filter**: enumerate items from followed creators
- **`blocked` filter**: exclude items from blocked creators
- **`hide` filter**: exclude specific hidden items
- **interaction_weight**: user-to-creator affinity used in personalized scoring
- **mute**: soft filter (suppresses but does not hard-exclude)
Key encoding follows the subject-prefix pattern established in m1p3:
```
[user_id: 8 bytes BE][0x00][REL: 0x04][type_byte: 1 byte][to_entity_id: 8 bytes BE]
```
This makes a specific edge a single point lookup (one exact key, no scan), and makes all edges of a type from a user a prefix scan that is O(n) in the number of matching edges.
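A minimal, self-contained sketch of the layout above, assuming plain `u64` ids and a `BTreeMap` as a stand-in for the sorted users keyspace. The names `edge_key`, `scan_type`, and `sample_store` are hypothetical and not part of the TidalDb API.

```rust
use std::collections::BTreeMap;

/// Tag byte for relationship keys (0x04 in the layout above).
const TAG_REL: u8 = 0x04;

/// Build the edge key: from (8 BE) + NUL + tag + type byte + to (8 BE) = 19 bytes.
fn edge_key(from: u64, type_byte: u8, to: u64) -> [u8; 19] {
    let mut key = [0u8; 19];
    key[0..8].copy_from_slice(&from.to_be_bytes());
    key[8] = 0x00;
    key[9] = TAG_REL;
    key[10] = type_byte;
    key[11..19].copy_from_slice(&to.to_be_bytes());
    key
}

/// Collect the `to` ids of every edge whose key starts with
/// [from][0x00][REL][type_byte], in key order.
fn scan_type(store: &BTreeMap<Vec<u8>, Vec<u8>>, from: u64, type_byte: u8) -> Vec<u64> {
    let full = edge_key(from, type_byte, 0);
    let prefix = &full[..11];
    store
        .range(prefix.to_vec()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, _)| {
            let mut to = [0u8; 8];
            to.copy_from_slice(&k[11..19]);
            u64::from_be_bytes(to)
        })
        .collect()
}

/// Fixture: user 42 follows creators 300, 7, 256 and blocks 99; user 43 follows 1.
fn sample_store() -> BTreeMap<Vec<u8>, Vec<u8>> {
    let mut store = BTreeMap::new();
    for to in [300u64, 7, 256] {
        store.insert(edge_key(42, 0x01, to).to_vec(), Vec::new());
    }
    store.insert(edge_key(42, 0x02, 99).to_vec(), Vec::new());
    store.insert(edge_key(43, 0x01, 1).to_vec(), Vec::new());
    store
}
```

Because ids are big-endian, byte order matches numeric order, so a type-prefix scan returns targets sorted by id and stops at the first key of a different type or user.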
## Requirements
- `RelationshipType` enum: `Follows`, `Blocks`, `InteractionWeight`, `Hide`, `Mute` with `as_byte()` / `from_byte()` discriminants
- `RelationshipEdge` struct: `from: EntityId`, `to: EntityId`, `rel_type: RelationshipType`, `weight: f64`, `timestamp_nanos: u64`
- `db.write_relationship(from, to, rel_type, weight, timestamp)` stores edge in users keyspace
- `db.read_relationship(from, to, rel_type)` returns `Option<RelationshipEdge>`
- `db.delete_relationship(from, to, rel_type)` removes edge
- `db.list_relationships(from, rel_type)` returns `Vec<RelationshipEdge>` via prefix scan
- `db.list_all_relationships(from)` returns all edge types from a user
- Key encoding: `[from_id][0x00][0x04][type_byte][to_id_bytes]`
- Value encoding: `[weight: 8 bytes f64 LE][timestamp_nanos: 8 bytes u64 LE]`
- Relationship write latency < 50 microseconds
- Edges persist across shutdown and restart
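The 50 microsecond write budget needs a measurement to check against. A rough sketch of one way to take it with the standard library only; `mean_latency` is a hypothetical helper, not an existing TidalDb function, and a mean hides tail latency, so a proper benchmark harness would be preferable for the final number.

```rust
use std::time::{Duration, Instant};

/// Run `op` `iters` times and return the mean wall-clock time per call.
/// The loop index is passed to `op` so each call can target a distinct edge.
fn mean_latency<F: FnMut(u64)>(iters: u32, mut op: F) -> Duration {
    assert!(iters > 0, "need at least one iteration");
    let start = Instant::now();
    for i in 0..iters {
        op(u64::from(i));
    }
    start.elapsed() / iters
}

// Intended use against the API in this task (not compiled here):
//
// let mean = mean_latency(10_000, |i| {
//     db.write_relationship(
//         EntityId::new(1), EntityId::new(i),
//         RelationshipType::Follows, 1.0, Timestamp::now(),
//     ).unwrap();
// });
// assert!(mean < Duration::from_micros(50));
```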
## Technical Design
### Module Structure
```
tidal/src/
  entities/
    relationship.rs  -- RelationshipType, RelationshipEdge, encode/decode, CRUD
```
### Types
```rust
// === entities/relationship.rs ===
use crate::schema::{EntityId, Timestamp};

/// Relationship type discriminant.
///
/// Each variant maps to a single byte for key encoding.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(u8)]
pub enum RelationshipType {
    /// User follows a creator. Permanent until unfollowed.
    Follows = 0x01,
    /// User blocks a creator. Permanent. Hard filter in all queries.
    Blocks = 0x02,
    /// User-to-creator interaction weight. Updated on every engagement signal.
    /// Decays over time using the same decay infrastructure as signal scores.
    InteractionWeight = 0x03,
    /// User hides a specific item. Permanent. Hard negative.
    Hide = 0x04,
    /// User mutes a creator. Permanent. Soft filter (suppresses, not excludes).
    Mute = 0x05,
}

impl RelationshipType {
    pub const fn as_byte(self) -> u8 {
        self as u8
    }

    pub const fn from_byte(b: u8) -> Option<Self> {
        match b {
            0x01 => Some(Self::Follows),
            0x02 => Some(Self::Blocks),
            0x03 => Some(Self::InteractionWeight),
            0x04 => Some(Self::Hide),
            0x05 => Some(Self::Mute),
            _ => None,
        }
    }

    /// Human-readable name for display and query parsing.
    pub const fn name(self) -> &'static str {
        match self {
            Self::Follows => "follows",
            Self::Blocks => "blocks",
            Self::InteractionWeight => "interaction_weight",
            Self::Hide => "hide",
            Self::Mute => "mute",
        }
    }

    /// Parse from a string name. Used by the query parser.
    pub fn from_name(name: &str) -> Option<Self> {
        match name {
            "follows" => Some(Self::Follows),
            "blocks" => Some(Self::Blocks),
            "interaction_weight" => Some(Self::InteractionWeight),
            "hide" => Some(Self::Hide),
            "mute" => Some(Self::Mute),
            _ => None,
        }
    }
}

/// A directional, weighted relationship edge.
#[derive(Debug, Clone, PartialEq)]
pub struct RelationshipEdge {
    pub from: EntityId,
    pub to: EntityId,
    pub rel_type: RelationshipType,
    pub weight: f64,
    pub timestamp_nanos: u64,
}
```
### Key and Value Encoding
```rust
/// Encode a relationship key.
///
/// Format: [from_id: 8 BE][0x00][REL: 0x04][type_byte: 1][to_id: 8 BE]
///
/// Total: 19 bytes (8 + 1 + 1 + 1 + 8; fixed size, no variable-length components).
pub fn encode_relationship_key(
    from: EntityId,
    rel_type: RelationshipType,
    to: EntityId,
) -> [u8; 19] {
    let mut key = [0u8; 19];
    key[0..8].copy_from_slice(&from.to_be_bytes());
    key[8] = 0x00; // NUL separator
    key[9] = Tag::Rel.as_byte(); // 0x04
    key[10] = rel_type.as_byte();
    key[11..19].copy_from_slice(&to.to_be_bytes());
    key
}

/// Build the prefix for scanning all relationships of a type from a user.
///
/// Format: [from_id: 8 BE][0x00][REL: 0x04][type_byte: 1]
///
/// Total: 11 bytes.
pub fn relationship_type_prefix(
    from: EntityId,
    rel_type: RelationshipType,
) -> [u8; 11] {
    let mut prefix = [0u8; 11];
    prefix[0..8].copy_from_slice(&from.to_be_bytes());
    prefix[8] = 0x00;
    prefix[9] = Tag::Rel.as_byte();
    prefix[10] = rel_type.as_byte();
    prefix
}

/// Build the prefix for scanning all relationships from a user (any type).
///
/// Format: [from_id: 8 BE][0x00][REL: 0x04]
///
/// Total: 10 bytes (same as entity_tag_prefix with Tag::Rel).
pub fn relationship_prefix(from: EntityId) -> [u8; 10] {
    crate::storage::keys::entity_tag_prefix(from, Tag::Rel)
}

/// Encode relationship edge value.
///
/// Format: [weight: 8 bytes f64 LE][timestamp_nanos: 8 bytes u64 LE]
pub fn encode_relationship_value(weight: f64, timestamp_nanos: u64) -> [u8; 16] {
    let mut buf = [0u8; 16];
    buf[0..8].copy_from_slice(&weight.to_le_bytes());
    buf[8..16].copy_from_slice(&timestamp_nanos.to_le_bytes());
    buf
}

/// Decode a relationship edge from key + value bytes.
///
/// Returns `None` if the key is shorter than 19 bytes, the value is shorter
/// than 16 bytes, or the type byte is not a known `RelationshipType`.
pub fn decode_relationship(key: &[u8], value: &[u8]) -> Option<RelationshipEdge> {
    // The key slice below reads up to index 19, so 19 bytes is the minimum.
    if key.len() < 19 || value.len() < 16 {
        return None;
    }
    let from = EntityId::new(u64::from_be_bytes(key[0..8].try_into().ok()?));
    let rel_type = RelationshipType::from_byte(key[10])?;
    let to = EntityId::new(u64::from_be_bytes(key[11..19].try_into().ok()?));
    let weight = f64::from_le_bytes(value[0..8].try_into().ok()?);
    let timestamp_nanos = u64::from_le_bytes(value[8..16].try_into().ok()?);
    Some(RelationshipEdge { from, to, rel_type, weight, timestamp_nanos })
}
```
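The value layout is a plain byte copy of the `f64` and `u64`, so a roundtrip should preserve the weight bit for bit. A small standalone sketch of that property, using hypothetical `encode_value` / `decode_value` helpers that mirror the 16-byte format but are not the functions defined in this task:

```rust
/// Encode [weight: f64 LE][timestamp_nanos: u64 LE] into 16 bytes.
fn encode_value(weight: f64, timestamp_nanos: u64) -> [u8; 16] {
    let mut buf = [0u8; 16];
    buf[0..8].copy_from_slice(&weight.to_le_bytes());
    buf[8..16].copy_from_slice(&timestamp_nanos.to_le_bytes());
    buf
}

/// Decode a 16-byte value; any other length is rejected.
/// (Stricter than a minimum-length check; chosen here for illustration.)
fn decode_value(value: &[u8]) -> Option<(f64, u64)> {
    if value.len() != 16 {
        return None;
    }
    let mut w = [0u8; 8];
    let mut t = [0u8; 8];
    w.copy_from_slice(&value[0..8]);
    t.copy_from_slice(&value[8..16]);
    Some((f64::from_le_bytes(w), u64::from_le_bytes(t)))
}
```

Since the encoding is lossless, roundtrip tests could compare `weight.to_bits()` for exact equality instead of an epsilon; that form also holds for weights an epsilon comparison cannot handle, such as NaN.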
### TidalDb API Extensions
```rust
impl TidalDb {
    /// Write a relationship edge. Overwrites if the edge already exists.
    pub fn write_relationship(
        &self,
        from: EntityId,
        to: EntityId,
        rel_type: RelationshipType,
        weight: f64,
        timestamp: Timestamp,
    ) -> crate::Result<()> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let key = encode_relationship_key(from, rel_type, to);
        let value = encode_relationship_value(weight, timestamp.as_nanos());
        storage.users_engine().put(&key, &value).map_err(LumenError::from)
    }

    /// Read a specific relationship edge.
    pub fn read_relationship(
        &self,
        from: EntityId,
        to: EntityId,
        rel_type: RelationshipType,
    ) -> crate::Result<Option<RelationshipEdge>> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let key = encode_relationship_key(from, rel_type, to);
        match storage.users_engine().get(&key)? {
            Some(value) => Ok(decode_relationship(&key, &value)),
            None => Ok(None),
        }
    }

    /// Delete a relationship edge.
    pub fn delete_relationship(
        &self,
        from: EntityId,
        to: EntityId,
        rel_type: RelationshipType,
    ) -> crate::Result<()> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let key = encode_relationship_key(from, rel_type, to);
        storage.users_engine().delete(&key).map_err(LumenError::from)
    }

    /// List all relationship edges of a given type from a user.
    pub fn list_relationships(
        &self,
        from: EntityId,
        rel_type: RelationshipType,
    ) -> crate::Result<Vec<RelationshipEdge>> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let prefix = relationship_type_prefix(from, rel_type);
        let mut edges = Vec::new();
        for (key, value) in storage.users_engine().scan_prefix(&prefix) {
            if let Some(edge) = decode_relationship(&key, &value) {
                edges.push(edge);
            }
        }
        Ok(edges)
    }
}
```
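The Requirements list `db.list_all_relationships(from)`, which the API block does not spell out. It would presumably reuse the 10-byte `relationship_prefix` scan. The following is only a sketch of that scan under stated assumptions: a `BTreeMap` stands in for the users engine, ids are plain `u64`, and `Store`, `edge_key`, and `list_all` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for the users engine: byte keys in sorted order.
type Store = BTreeMap<Vec<u8>, Vec<u8>>;

/// Tag byte for relationship keys, per the key layout in this task.
const TAG_REL: u8 = 0x04;

/// 19-byte edge key: [from: 8 BE][0x00][REL][type_byte][to: 8 BE].
fn edge_key(from: u64, type_byte: u8, to: u64) -> Vec<u8> {
    let mut key = Vec::with_capacity(19);
    key.extend_from_slice(&from.to_be_bytes());
    key.extend_from_slice(&[0x00, TAG_REL, type_byte]);
    key.extend_from_slice(&to.to_be_bytes());
    key
}

/// Every edge from `from`, any type, as (type_byte, to) pairs in key order.
fn list_all(store: &Store, from: u64) -> Vec<(u8, u64)> {
    // 10-byte prefix: [from: 8 BE][0x00][REL].
    let mut prefix = from.to_be_bytes().to_vec();
    prefix.extend_from_slice(&[0x00, TAG_REL]);
    store
        .range(prefix.clone()..)
        .take_while(|(k, _)| k.starts_with(&prefix))
        // Skip anything under the prefix that is not a well-formed edge key.
        .filter(|(k, _)| k.len() == 19)
        .map(|(k, _)| {
            let mut to = [0u8; 8];
            to.copy_from_slice(&k[11..19]);
            (k[10], u64::from_be_bytes(to))
        })
        .collect()
}
```

Edges come back grouped by type byte and then by target id, and keys for the same user under other tags fall outside the prefix, so they are never visited.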
## Test Strategy
### Unit Tests
```rust
#[test]
fn relationship_type_byte_roundtrip() {
    let types = [
        RelationshipType::Follows,
        RelationshipType::Blocks,
        RelationshipType::InteractionWeight,
        RelationshipType::Hide,
        RelationshipType::Mute,
    ];
    for rt in types {
        let byte = rt.as_byte();
        assert_eq!(RelationshipType::from_byte(byte), Some(rt));
    }
}

#[test]
fn relationship_type_name_roundtrip() {
    let types = [
        RelationshipType::Follows,
        RelationshipType::Blocks,
        RelationshipType::InteractionWeight,
        RelationshipType::Hide,
        RelationshipType::Mute,
    ];
    for rt in types {
        let name = rt.name();
        assert_eq!(RelationshipType::from_name(name), Some(rt));
    }
}

#[test]
fn encode_decode_relationship_roundtrip() {
    let from = EntityId::new(42);
    let to = EntityId::new(7);
    let rt = RelationshipType::Follows;
    let weight = 1.0;
    let ts = 1_000_000_000u64;
    let key = encode_relationship_key(from, rt, to);
    let value = encode_relationship_value(weight, ts);
    let edge = decode_relationship(&key, &value).unwrap();
    assert_eq!(edge.from, from);
    assert_eq!(edge.to, to);
    assert_eq!(edge.rel_type, rt);
    assert!((edge.weight - weight).abs() < f64::EPSILON);
    assert_eq!(edge.timestamp_nanos, ts);
}

#[test]
fn write_read_relationship_ephemeral() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Follows, 1.0, Timestamp::now(),
    ).unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::Follows,
    ).unwrap();
    assert!(edge.is_some());
    assert_eq!(edge.unwrap().to, EntityId::new(10));
}

#[test]
fn read_nonexistent_relationship_returns_none() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(99), RelationshipType::Follows,
    ).unwrap();
    assert!(edge.is_none());
}

#[test]
fn delete_relationship_removes_edge() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Follows, 1.0, Timestamp::now(),
    ).unwrap();
    db.delete_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::Follows,
    ).unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::Follows,
    ).unwrap();
    assert!(edge.is_none());
}

#[test]
fn list_relationships_returns_all_of_type() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    for creator in 1..=5u64 {
        db.write_relationship(
            EntityId::new(42), EntityId::new(creator),
            RelationshipType::Follows, 1.0, Timestamp::now(),
        ).unwrap();
    }
    db.write_relationship(
        EntityId::new(42), EntityId::new(99),
        RelationshipType::Blocks, 1.0, Timestamp::now(),
    ).unwrap();
    let follows = db.list_relationships(EntityId::new(42), RelationshipType::Follows).unwrap();
    assert_eq!(follows.len(), 5);
    assert!(follows.iter().all(|e| e.rel_type == RelationshipType::Follows));
    let blocks = db.list_relationships(EntityId::new(42), RelationshipType::Blocks).unwrap();
    assert_eq!(blocks.len(), 1);
}

#[test]
fn relationship_write_overwrites_weight() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::InteractionWeight, 0.5, Timestamp::now(),
    ).unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::InteractionWeight, 0.9, Timestamp::now(),
    ).unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::InteractionWeight,
    ).unwrap().unwrap();
    assert!((edge.weight - 0.9).abs() < f64::EPSILON);
}

#[test]
fn different_relationship_types_do_not_collide() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Follows, 1.0, Timestamp::now(),
    ).unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Blocks, 1.0, Timestamp::now(),
    ).unwrap();
    let follows = db.read_relationship(EntityId::new(1), EntityId::new(10), RelationshipType::Follows).unwrap();
    let blocks = db.read_relationship(EntityId::new(1), EntityId::new(10), RelationshipType::Blocks).unwrap();
    assert!(follows.is_some());
    assert!(blocks.is_some());
}
```
### Property Tests
```rust
use proptest::prelude::*;

proptest! {
    #[test]
    fn relationship_key_encode_decode_roundtrip(
        from_id in 1u64..100000,
        to_id in 1u64..100000,
        type_byte in 1u8..=5u8,
        weight in -10.0f64..10.0,
        ts in 0u64..u64::MAX,
    ) {
        let from = EntityId::new(from_id);
        let to = EntityId::new(to_id);
        let rt = RelationshipType::from_byte(type_byte).unwrap();
        let key = encode_relationship_key(from, rt, to);
        let value = encode_relationship_value(weight, ts);
        let edge = decode_relationship(&key, &value);
        prop_assert!(edge.is_some());
        let edge = edge.unwrap();
        prop_assert_eq!(edge.from, from);
        prop_assert_eq!(edge.to, to);
        prop_assert_eq!(edge.rel_type, rt);
        prop_assert!((edge.weight - weight).abs() < f64::EPSILON);
        prop_assert_eq!(edge.timestamp_nanos, ts);
    }

    #[test]
    fn relationship_type_prefix_contains_all_keys(
        from_id in 1u64..10000,
        to_ids in proptest::collection::vec(1u64..10000, 1..20),
        type_byte in 1u8..=5u8,
    ) {
        let from = EntityId::new(from_id);
        let rt = RelationshipType::from_byte(type_byte).unwrap();
        let prefix = relationship_type_prefix(from, rt);
        for &to_id in &to_ids {
            let key = encode_relationship_key(from, rt, EntityId::new(to_id));
            prop_assert!(key.starts_with(&prefix),
                "key for to={} should start with type prefix", to_id);
        }
    }
}
```
## Acceptance Criteria
- [ ] `RelationshipType` enum with 5 variants, `as_byte()`/`from_byte()`/`name()`/`from_name()` roundtrip
- [ ] `RelationshipEdge` struct with `Debug`, `Clone`, `PartialEq`
- [ ] `encode_relationship_key` produces 19-byte fixed-size keys
- [ ] `encode_relationship_value` produces 16-byte fixed-size values
- [ ] `decode_relationship` roundtrips correctly with encode functions
- [ ] `db.write_relationship()` stores in users keyspace under `Tag::Rel`
- [ ] `db.read_relationship()` retrieves specific edge, returns `None` for missing
- [ ] `db.delete_relationship()` removes edge
- [ ] `db.list_relationships()` enumerates all edges of a type from a user via prefix scan
- [ ] Write overwrites existing edge (same from, to, type)
- [ ] Different relationship types for the same (from, to) pair do not collide
- [ ] Relationship key prefix scan returns only the requested type
- [ ] Write/read latency < 50 microseconds (benchmarked or measured in test)
- [ ] Property tests pass: encode/decode roundtrip, prefix containment
- [ ] `cargo clippy -- -D warnings` passes
- [ ] All tests pass
## Research References
- [thoughts.md](../../../../thoughts.md) -- Part V.12 (subject-prefix key encoding)
- [VISION.md](../../../../VISION.md) -- Relationships are first-class edges between entities
## Implementation Notes
- The key length is fixed at 19 bytes: 8 (from_id) + 1 (NUL) + 1 (Tag::Rel) + 1 (type_byte) + 8 (to_id). Key buffers must therefore be `[u8; 19]`, and length checks on decoded keys must use 19, not 18.
- Relationship edges are stored in the **users keyspace** because the query pattern is always "given a user, find their relationships." Storing in the users keyspace means all of a user's data (metadata, relationships, preference vector) is co-located and scannable with one prefix.
- For M3, only forward traversal is needed (user -> creators). Reverse indexes (creator -> followers) are deferred to M6 when social graph traversal queries are implemented.
- The `interaction_weight` relationship type is updated atomically in m3p2 when engagement signals are written. The weight is a running value, not a sum -- each update replaces the previous weight.
- `Hide` edges point from a user to an **item** (not a creator). The `to` field is an item_id. This is a user-to-item relationship, unlike follows/blocks which are user-to-creator.
- Do NOT implement WAL-backed relationship writes in this task. Relationships are stored directly in the storage engine (fjall). WAL-backed relationship writes for crash safety of in-flight relationship changes are addressed in m3p2 Task 03 (Hard Negatives) where it matters most.
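To illustrate the distinction between hard filters (`blocks`, `hide`) and the soft `mute` filter, and the fact that `Hide` targets items while the others target creators, here is a rough sketch. Everything in it is an assumption for illustration: the `Candidate` shape, the `apply_relationship_filters` function, and especially `MUTE_PENALTY`, since this task does not define how much muting suppresses a score.

```rust
use std::collections::HashSet;

/// Hypothetical ranking candidate: an item, its creator, and a score.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    item_id: u64,
    creator_id: u64,
    score: f64,
}

/// Hypothetical suppression factor for muted creators (not specified by this task).
const MUTE_PENALTY: f64 = 0.1;

/// Drop candidates from blocked creators and hidden items (hard filters),
/// then scale down, but keep, candidates from muted creators (soft filter).
fn apply_relationship_filters(
    candidates: Vec<Candidate>,
    blocked_creators: &HashSet<u64>,
    hidden_items: &HashSet<u64>,
    muted_creators: &HashSet<u64>,
) -> Vec<Candidate> {
    candidates
        .into_iter()
        // `blocks` edges point at creators.
        .filter(|c| !blocked_creators.contains(&c.creator_id))
        // `hide` edges point at items, not creators.
        .filter(|c| !hidden_items.contains(&c.item_id))
        .map(|mut c| {
            if muted_creators.contains(&c.creator_id) {
                c.score *= MUTE_PENALTY;
            }
            c
        })
        .collect()
}
```

In this sketch the three sets would be built from `list_relationships` results for `Blocks`, `Hide`, and `Mute` respectively, taking the `to` field of each edge.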