# Task 02: Relationship Graph

## Context

**Milestone:** 3 -- Personalized Ranking
**Phase:** m3p1 -- User and Creator Entities with Relationships
**Depends On:** Task 01 (User + Creator entity types, `StorageBox` with users/creators engines)
**Blocks:** Task 03 (User-State Bitmap Indexes), m3p2 (Feedback Loop needs interaction_weight edges), m3p3 (Personalized Profiles need follows/blocks)
**Complexity:** L

## Objective

Deliver the relationship graph: typed, weighted, directional edges between entities, stored in the users keyspace under `Tag::Rel`. The graph supports five relationship types (`follows`, `blocks`, `interaction_weight`, `hide`, `mute`) with CRUD operations and prefix-scanned enumeration. Edges are encoded so that all relationships from a single user can be scanned with one prefix, and all relationships of a given type from a user can be scanned with a narrower prefix.

The relationship graph is the foundation for:
- **`follows` filter**: enumerate items from followed creators
- **`blocked` filter**: exclude items from blocked creators
- **`hide` filter**: exclude specific hidden items
- **`interaction_weight`**: user-to-creator affinity used in personalized scoring
- **`mute`**: soft filter (suppresses but does not hard-exclude)

Key encoding follows the subject-prefix pattern established in m1p3:
```
[user_id: 8 bytes BE][0x00][REL: 0x04][type_byte: 1 byte][to_entity_id: 8 bytes BE]
```

Every key is exactly 19 bytes (8 + 1 + 1 + 1 + 8). A specific edge is fetched with a single point lookup on its full key, and all edges of one type from a user are enumerated with a prefix scan whose cost is linear in the number of matching edges.
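
To make the prefix behaviour concrete, here is a minimal sketch that uses a `BTreeMap` as a stand-in for the users keyspace. The names `Keyspace`, `edge_key`, `type_prefix`, and `scan_targets` are hypothetical and exist only for this illustration; the real storage engine and helper functions are the ones specified later in this task.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the users keyspace: any byte-ordered map behaves the same way.
type Keyspace = BTreeMap<Vec<u8>, Vec<u8>>;

const REL_TAG: u8 = 0x04; // Tag::Rel, per the layout above

// Build the 19-byte edge key: from (8 BE) | 0x00 | REL | type | to (8 BE).
fn edge_key(from: u64, type_byte: u8, to: u64) -> [u8; 19] {
    let mut key = [0u8; 19];
    key[0..8].copy_from_slice(&from.to_be_bytes());
    key[8] = 0x00;
    key[9] = REL_TAG;
    key[10] = type_byte;
    key[11..19].copy_from_slice(&to.to_be_bytes());
    key
}

// The 11-byte prefix shared by all edges of one type from one user.
// Its first 10 bytes are the prefix shared by all of that user's edges.
fn type_prefix(from: u64, type_byte: u8) -> [u8; 11] {
    let mut prefix = [0u8; 11];
    prefix.copy_from_slice(&edge_key(from, type_byte, 0)[..11]);
    prefix
}

// Collect the target ids of every 19-byte key that starts with `prefix`.
// The scan seeks to the prefix and stops at the first key that no longer matches.
fn scan_targets(ks: &Keyspace, prefix: &[u8]) -> Vec<u64> {
    ks.range(prefix.to_vec()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .filter(|(k, _)| k.len() == 19)
        .map(|(k, _)| u64::from_be_bytes(k[11..19].try_into().unwrap()))
        .collect()
}
```

With edges from user 42 inserted in any order, scanning `type_prefix(42, 0x01)` returns only the `follows` targets, in ascending id order, while scanning the first 10 bytes of that prefix returns the user's edges of every type, grouped by type byte.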

## Requirements

- `RelationshipType` enum: `Follows`, `Blocks`, `InteractionWeight`, `Hide`, `Mute` with `as_byte()` / `from_byte()` discriminants
- `RelationshipEdge` struct: `from: EntityId`, `to: EntityId`, `rel_type: RelationshipType`, `weight: f64`, `timestamp_nanos: u64`
- `db.write_relationship(from, to, rel_type, weight, timestamp)` stores the edge in the users keyspace
- `db.read_relationship(from, to, rel_type)` returns `Option<RelationshipEdge>`
- `db.delete_relationship(from, to, rel_type)` removes the edge
- `db.list_relationships(from, rel_type)` returns `Vec<RelationshipEdge>` via prefix scan
- `db.list_all_relationships(from)` returns edges of all types from a user
- Key encoding: `[from_id][0x00][0x04][type_byte][to_id_bytes]` (19 bytes, fixed size)
- Value encoding: `[weight: 8 bytes f64 LE][timestamp_nanos: 8 bytes u64 LE]` (16 bytes, fixed size)
- Relationship write latency < 50 microseconds
- Edges persist across shutdown and restart

## Technical Design

### Module Structure

```
tidal/src/
  entities/
    relationship.rs   -- RelationshipType, RelationshipEdge, encode/decode, CRUD
```

### Types

```rust
// === entities/relationship.rs ===

use crate::schema::{EntityId, Timestamp};

/// Relationship type discriminant.
///
/// Each variant maps to a single byte for key encoding.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(u8)]
pub enum RelationshipType {
    /// User follows a creator. Permanent until unfollowed.
    Follows = 0x01,
    /// User blocks a creator. Permanent. Hard filter in all queries.
    Blocks = 0x02,
    /// User-to-creator interaction weight. Updated on every engagement signal.
    /// Decays over time using the same decay infrastructure as signal scores.
    InteractionWeight = 0x03,
    /// User hides a specific item. Permanent. Hard negative.
    Hide = 0x04,
    /// User mutes a creator. Permanent. Soft filter (suppresses, not excludes).
    Mute = 0x05,
}

impl RelationshipType {
    pub const fn as_byte(self) -> u8 {
        self as u8
    }

    pub const fn from_byte(b: u8) -> Option<Self> {
        match b {
            0x01 => Some(Self::Follows),
            0x02 => Some(Self::Blocks),
            0x03 => Some(Self::InteractionWeight),
            0x04 => Some(Self::Hide),
            0x05 => Some(Self::Mute),
            _ => None,
        }
    }

    /// Human-readable name for display and query parsing.
    pub const fn name(self) -> &'static str {
        match self {
            Self::Follows => "follows",
            Self::Blocks => "blocks",
            Self::InteractionWeight => "interaction_weight",
            Self::Hide => "hide",
            Self::Mute => "mute",
        }
    }

    /// Parse from a string name. Used by the query parser.
    pub fn from_name(name: &str) -> Option<Self> {
        match name {
            "follows" => Some(Self::Follows),
            "blocks" => Some(Self::Blocks),
            "interaction_weight" => Some(Self::InteractionWeight),
            "hide" => Some(Self::Hide),
            "mute" => Some(Self::Mute),
            _ => None,
        }
    }
}

/// A directional, weighted relationship edge.
#[derive(Debug, Clone, PartialEq)]
pub struct RelationshipEdge {
    pub from: EntityId,
    pub to: EntityId,
    pub rel_type: RelationshipType,
    pub weight: f64,
    pub timestamp_nanos: u64,
}
```

### Key and Value Encoding

```rust
// Assumes `Tag` (the m1p3 key-tag enum, where `Tag::Rel` is 0x04) is in scope.

/// Encode a relationship key.
///
/// Format: [from_id: 8 BE][0x00][REL: 0x04][type_byte: 1][to_id: 8 BE]
///
/// Total: 19 bytes (fixed size, no variable-length components).
pub fn encode_relationship_key(
    from: EntityId,
    rel_type: RelationshipType,
    to: EntityId,
) -> [u8; 19] {
    let mut key = [0u8; 19];
    key[0..8].copy_from_slice(&from.to_be_bytes());
    key[8] = 0x00; // NUL separator
    key[9] = Tag::Rel.as_byte(); // 0x04
    key[10] = rel_type.as_byte();
    key[11..19].copy_from_slice(&to.to_be_bytes());
    key
}

/// Build the prefix for scanning all relationships of a type from a user.
///
/// Format: [from_id: 8 BE][0x00][REL: 0x04][type_byte: 1]
///
/// Total: 11 bytes.
pub fn relationship_type_prefix(
    from: EntityId,
    rel_type: RelationshipType,
) -> [u8; 11] {
    let mut prefix = [0u8; 11];
    prefix[0..8].copy_from_slice(&from.to_be_bytes());
    prefix[8] = 0x00;
    prefix[9] = Tag::Rel.as_byte();
    prefix[10] = rel_type.as_byte();
    prefix
}

/// Build the prefix for scanning all relationships from a user (any type).
///
/// Format: [from_id: 8 BE][0x00][REL: 0x04]
///
/// Total: 10 bytes (same as entity_tag_prefix with Tag::Rel).
pub fn relationship_prefix(from: EntityId) -> [u8; 10] {
    crate::storage::keys::entity_tag_prefix(from, Tag::Rel)
}

/// Encode relationship edge value.
///
/// Format: [weight: 8 bytes f64 LE][timestamp_nanos: 8 bytes u64 LE]
pub fn encode_relationship_value(weight: f64, timestamp_nanos: u64) -> [u8; 16] {
    let mut buf = [0u8; 16];
    buf[0..8].copy_from_slice(&weight.to_le_bytes());
    buf[8..16].copy_from_slice(&timestamp_nanos.to_le_bytes());
    buf
}

/// Decode a relationship edge from key + value bytes.
///
/// Returns `None` if the key is shorter than 19 bytes, the value is shorter
/// than 16 bytes, or the type byte is not a known `RelationshipType`.
pub fn decode_relationship(key: &[u8], value: &[u8]) -> Option<RelationshipEdge> {
    if key.len() < 19 || value.len() < 16 {
        return None;
    }
    let from = EntityId::new(u64::from_be_bytes(key[0..8].try_into().ok()?));
    let rel_type = RelationshipType::from_byte(key[10])?;
    let to = EntityId::new(u64::from_be_bytes(key[11..19].try_into().ok()?));
    let weight = f64::from_le_bytes(value[0..8].try_into().ok()?);
    let timestamp_nanos = u64::from_le_bytes(value[8..16].try_into().ok()?);
    Some(RelationshipEdge { from, to, rel_type, weight, timestamp_nanos })
}
```
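
The value layout above stores the weight as raw little-endian `f64` bytes, so a decode should return the same bit pattern that was encoded for ordinary values, including the sign of zero. The sketch below illustrates this with standalone stand-ins (`encode_value` and `decode_value` are hypothetical names that mirror the value half of the codec, and the stand-in decoder is stricter about length than the design's); it suggests that roundtrip tests could compare `to_bits()` directly rather than rely on an epsilon.

```rust
// Stand-in for the value codec: weight (f64 LE) followed by timestamp (u64 LE).
fn encode_value(weight: f64, timestamp_nanos: u64) -> [u8; 16] {
    let mut buf = [0u8; 16];
    buf[0..8].copy_from_slice(&weight.to_le_bytes());
    buf[8..16].copy_from_slice(&timestamp_nanos.to_le_bytes());
    buf
}

// Decode a 16-byte value; any other length is rejected in this sketch.
fn decode_value(value: &[u8]) -> Option<(f64, u64)> {
    if value.len() != 16 {
        return None;
    }
    let weight = f64::from_le_bytes(value[0..8].try_into().ok()?);
    let timestamp_nanos = u64::from_le_bytes(value[8..16].try_into().ok()?);
    Some((weight, timestamp_nanos))
}
```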

### TidalDb API Extensions

```rust
impl TidalDb {
    /// Write a relationship edge. Overwrites if the edge already exists.
    pub fn write_relationship(
        &self,
        from: EntityId,
        to: EntityId,
        rel_type: RelationshipType,
        weight: f64,
        timestamp: Timestamp,
    ) -> crate::Result<()> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let key = encode_relationship_key(from, rel_type, to);
        let value = encode_relationship_value(weight, timestamp.as_nanos());
        storage.users_engine().put(&key, &value).map_err(LumenError::from)
    }

    /// Read a specific relationship edge.
    pub fn read_relationship(
        &self,
        from: EntityId,
        to: EntityId,
        rel_type: RelationshipType,
    ) -> crate::Result<Option<RelationshipEdge>> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let key = encode_relationship_key(from, rel_type, to);
        match storage.users_engine().get(&key)? {
            Some(value) => Ok(decode_relationship(&key, &value)),
            None => Ok(None),
        }
    }

    /// Delete a relationship edge.
    pub fn delete_relationship(
        &self,
        from: EntityId,
        to: EntityId,
        rel_type: RelationshipType,
    ) -> crate::Result<()> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let key = encode_relationship_key(from, rel_type, to);
        storage.users_engine().delete(&key).map_err(LumenError::from)
    }

    /// List all relationship edges of a given type from a user.
    pub fn list_relationships(
        &self,
        from: EntityId,
        rel_type: RelationshipType,
    ) -> crate::Result<Vec<RelationshipEdge>> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let prefix = relationship_type_prefix(from, rel_type);
        let mut edges = Vec::new();
        for (key, value) in storage.users_engine().scan_prefix(&prefix) {
            if let Some(edge) = decode_relationship(&key, &value) {
                edges.push(edge);
            }
        }
        Ok(edges)
    }

    /// List all relationship edges from a user, across every type.
    /// Same scan as `list_relationships`, using the 10-byte user prefix.
    pub fn list_all_relationships(
        &self,
        from: EntityId,
    ) -> crate::Result<Vec<RelationshipEdge>> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let prefix = relationship_prefix(from);
        let mut edges = Vec::new();
        for (key, value) in storage.users_engine().scan_prefix(&prefix) {
            if let Some(edge) = decode_relationship(&key, &value) {
                edges.push(edge);
            }
        }
        Ok(edges)
    }
}
```

## Test Strategy

### Unit Tests

```rust
#[test]
fn relationship_type_byte_roundtrip() {
    let types = [
        RelationshipType::Follows,
        RelationshipType::Blocks,
        RelationshipType::InteractionWeight,
        RelationshipType::Hide,
        RelationshipType::Mute,
    ];
    for rt in types {
        let byte = rt.as_byte();
        assert_eq!(RelationshipType::from_byte(byte), Some(rt));
    }
}

#[test]
fn relationship_type_name_roundtrip() {
    let types = [
        RelationshipType::Follows,
        RelationshipType::Blocks,
        RelationshipType::InteractionWeight,
        RelationshipType::Hide,
        RelationshipType::Mute,
    ];
    for rt in types {
        let name = rt.name();
        assert_eq!(RelationshipType::from_name(name), Some(rt));
    }
}

#[test]
fn encode_decode_relationship_roundtrip() {
    let from = EntityId::new(42);
    let to = EntityId::new(7);
    let rt = RelationshipType::Follows;
    let weight = 1.0;
    let ts = 1_000_000_000u64;

    let key = encode_relationship_key(from, rt, to);
    let value = encode_relationship_value(weight, ts);
    let edge = decode_relationship(&key, &value).unwrap();

    assert_eq!(edge.from, from);
    assert_eq!(edge.to, to);
    assert_eq!(edge.rel_type, rt);
    assert!((edge.weight - weight).abs() < f64::EPSILON);
    assert_eq!(edge.timestamp_nanos, ts);
}

#[test]
fn write_read_relationship_ephemeral() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Follows, 1.0, Timestamp::now(),
    ).unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::Follows,
    ).unwrap();
    assert!(edge.is_some());
    assert_eq!(edge.unwrap().to, EntityId::new(10));
}

#[test]
fn read_nonexistent_relationship_returns_none() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(99), RelationshipType::Follows,
    ).unwrap();
    assert!(edge.is_none());
}

#[test]
fn delete_relationship_removes_edge() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Follows, 1.0, Timestamp::now(),
    ).unwrap();
    db.delete_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::Follows,
    ).unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::Follows,
    ).unwrap();
    assert!(edge.is_none());
}

#[test]
fn list_relationships_returns_all_of_type() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    for creator in 1..=5u64 {
        db.write_relationship(
            EntityId::new(42), EntityId::new(creator),
            RelationshipType::Follows, 1.0, Timestamp::now(),
        ).unwrap();
    }
    db.write_relationship(
        EntityId::new(42), EntityId::new(99),
        RelationshipType::Blocks, 1.0, Timestamp::now(),
    ).unwrap();

    let follows = db.list_relationships(EntityId::new(42), RelationshipType::Follows).unwrap();
    assert_eq!(follows.len(), 5);
    assert!(follows.iter().all(|e| e.rel_type == RelationshipType::Follows));

    let blocks = db.list_relationships(EntityId::new(42), RelationshipType::Blocks).unwrap();
    assert_eq!(blocks.len(), 1);
}

#[test]
fn relationship_write_overwrites_weight() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::InteractionWeight, 0.5, Timestamp::now(),
    ).unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::InteractionWeight, 0.9, Timestamp::now(),
    ).unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::InteractionWeight,
    ).unwrap().unwrap();
    assert!((edge.weight - 0.9).abs() < f64::EPSILON);
}

#[test]
fn different_relationship_types_do_not_collide() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Follows, 1.0, Timestamp::now(),
    ).unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Blocks, 1.0, Timestamp::now(),
    ).unwrap();

    let follows = db.read_relationship(EntityId::new(1), EntityId::new(10), RelationshipType::Follows).unwrap();
    let blocks = db.read_relationship(EntityId::new(1), EntityId::new(10), RelationshipType::Blocks).unwrap();
    assert!(follows.is_some());
    assert!(blocks.is_some());
}
```

### Property Tests

```rust
use proptest::prelude::*;

proptest! {
    #[test]
    fn relationship_key_encode_decode_roundtrip(
        from_id in 1u64..100000,
        to_id in 1u64..100000,
        type_byte in 1u8..=5u8,
        weight in -10.0f64..10.0,
        ts in 0u64..u64::MAX,
    ) {
        let from = EntityId::new(from_id);
        let to = EntityId::new(to_id);
        let rt = RelationshipType::from_byte(type_byte).unwrap();

        let key = encode_relationship_key(from, rt, to);
        let value = encode_relationship_value(weight, ts);
        let edge = decode_relationship(&key, &value);

        prop_assert!(edge.is_some());
        let edge = edge.unwrap();
        prop_assert_eq!(edge.from, from);
        prop_assert_eq!(edge.to, to);
        prop_assert_eq!(edge.rel_type, rt);
        prop_assert!((edge.weight - weight).abs() < f64::EPSILON);
        prop_assert_eq!(edge.timestamp_nanos, ts);
    }

    #[test]
    fn relationship_type_prefix_contains_all_keys(
        from_id in 1u64..10000,
        to_ids in proptest::collection::vec(1u64..10000, 1..20),
        type_byte in 1u8..=5u8,
    ) {
        let from = EntityId::new(from_id);
        let rt = RelationshipType::from_byte(type_byte).unwrap();
        let prefix = relationship_type_prefix(from, rt);

        for &to_id in &to_ids {
            let key = encode_relationship_key(from, rt, EntityId::new(to_id));
            prop_assert!(key.starts_with(&prefix),
                "key for to={} should start with type prefix", to_id);
        }
    }
}
```
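
The property tests above cover roundtripping and prefix containment. A further property that follows from the big-endian id encoding, and that a prefix scan implicitly relies on for a stable result order, is that byte-wise key order agrees with numeric order of the target ids. The following is a std-only sketch of that check; `edge_key` is a hypothetical stand-in for `encode_relationship_key` operating on raw `u64` ids, and the same predicate could be wrapped in a `proptest!` case.

```rust
// Stand-in for `encode_relationship_key`: from (8 BE) | 0x00 | 0x04 | type | to (8 BE).
fn edge_key(from: u64, type_byte: u8, to: u64) -> [u8; 19] {
    let mut key = [0u8; 19];
    key[0..8].copy_from_slice(&from.to_be_bytes());
    key[8] = 0x00;
    key[9] = 0x04; // Tag::Rel
    key[10] = type_byte;
    key[11..19].copy_from_slice(&to.to_be_bytes());
    key
}

// True when lexicographic key order matches numeric order of the two target ids.
// Arrays compare lexicographically, which is how a byte-ordered keyspace sorts keys.
fn key_order_matches_id_order(from: u64, type_byte: u8, a: u64, b: u64) -> bool {
    let key_a = edge_key(from, type_byte, a);
    let key_b = edge_key(from, type_byte, b);
    key_a.cmp(&key_b) == a.cmp(&b)
}
```

A little-endian id encoding would not have this property: the bytes of 255 sort after the bytes of 256 in little-endian form, because the least significant byte is compared first.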

## Acceptance Criteria

- [ ] `RelationshipType` enum with 5 variants, `as_byte()`/`from_byte()`/`name()`/`from_name()` roundtrip
- [ ] `RelationshipEdge` struct with `Debug`, `Clone`, `PartialEq`
- [ ] `encode_relationship_key` produces 19-byte fixed-size keys
- [ ] `encode_relationship_value` produces 16-byte fixed-size values
- [ ] `decode_relationship` roundtrips correctly with encode functions
- [ ] `db.write_relationship()` stores in users keyspace under `Tag::Rel`
- [ ] `db.read_relationship()` retrieves specific edge, returns `None` for missing
- [ ] `db.delete_relationship()` removes edge
- [ ] `db.list_relationships()` enumerates all edges of a type from a user via prefix scan
- [ ] Write overwrites existing edge (same from, to, type)
- [ ] Different relationship types for the same (from, to) pair do not collide
- [ ] Relationship key prefix scan returns only the requested type
- [ ] Write/read latency < 50 microseconds (benchmarked or measured in test)
- [ ] Property tests pass: encode/decode roundtrip, prefix containment
- [ ] `cargo clippy -- -D warnings` passes
- [ ] All tests pass

## Research References

- [thoughts.md](../../../../thoughts.md) -- Part V.12 (subject-prefix key encoding)
- [VISION.md](../../../../VISION.md) -- Relationships are first-class edges between entities

## Implementation Notes

- The key length is fixed at 19 bytes: 8 (from_id) + 1 (NUL) + 1 (Tag::Rel) + 1 (type_byte) + 8 (to_id). Key arrays are therefore `[u8; 19]`, with `to_id` occupying bytes `11..19`, and `decode_relationship` must reject keys shorter than 19 bytes.
- Relationship edges are stored in the **users keyspace** because the query pattern is always "given a user, find their relationships." Storing them there means all of a user's data (metadata, relationships, preference vector) is co-located and scannable with one prefix.
- For M3, only forward traversal is needed (user -> creators). Reverse indexes (creator -> followers) are deferred to M6, when social graph traversal queries are implemented.
- The `interaction_weight` relationship type is updated atomically in m3p2 when engagement signals are written. The weight is a running value, not a sum: each update replaces the previous weight.
- `Hide` edges point from a user to an **item** (not a creator). The `to` field is an item_id. This is a user-to-item relationship, unlike follows/blocks, which are user-to-creator.
- Do NOT implement WAL-backed relationship writes in this task. Relationships are stored directly in the storage engine (fjall). WAL-backed writes, which provide crash safety for in-flight relationship changes, are addressed in m3p2 Task 03 (Hard Negatives), where it matters most.
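
One way to keep the key-length arithmetic from the first note from drifting is to derive every offset from named constants and check the total at compile time. This is only a sketch: the constant names and `split_key` are hypothetical and not part of this task's required API.

```rust
// Hypothetical layout constants for the relationship key.
const ID_LEN: usize = 8;
const SEP_OFFSET: usize = ID_LEN; // byte 8: NUL separator
const TAG_OFFSET: usize = SEP_OFFSET + 1; // byte 9: Tag::Rel
const TYPE_OFFSET: usize = TAG_OFFSET + 1; // byte 10: relationship type
const TO_OFFSET: usize = TYPE_OFFSET + 1; // bytes 11..19: target id

const REL_PREFIX_LEN: usize = TYPE_OFFSET; // 10: all edges from a user
const REL_TYPE_PREFIX_LEN: usize = TO_OFFSET; // 11: edges of one type
const REL_KEY_LEN: usize = TO_OFFSET + ID_LEN; // 19: full key

// Compile-time guard: the build fails if the layout no longer sums to 19 bytes.
const _: () = assert!(REL_KEY_LEN == 19);

// Split a full key into (from, type_byte, to) using the named offsets.
fn split_key(key: &[u8; REL_KEY_LEN]) -> (u64, u8, u64) {
    let from = u64::from_be_bytes(key[..ID_LEN].try_into().unwrap());
    let to = u64::from_be_bytes(key[TO_OFFSET..].try_into().unwrap());
    (from, key[TYPE_OFFSET], to)
}
```

Because the array length is part of the type, passing an 18-byte array to `split_key` would be a compile error rather than a runtime slice panic.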