tidaldb/docs/planning/milestone-3/phase-1/task-02-relationship-graph.md
jordan 39ada28c6e feat: complete Milestones 2–4 — RETRIEVE query, vector index, ranking profiles, diversity, entity system, sessions
2026-02-21 16:24:48 -07:00


Task 02: Relationship Graph

Context

  • Milestone: 3 -- Personalized Ranking
  • Phase: m3p1 -- User and Creator Entities with Relationships
  • Depends On: Task 01 (User + Creator entity types, StorageBox with users/creators engines)
  • Blocks: Task 03 (User-State Bitmap Indexes), m3p2 (Feedback Loop needs interaction_weight edges), m3p3 (Personalized Profiles need follows/blocks)
  • Complexity: L

Objective

Deliver the relationship graph: typed, weighted, directional edges between entities stored in the users keyspace under Tag::Rel. The graph supports five relationship types (follows, blocks, interaction_weight, hide, mute) with CRUD operations and prefix-scanned enumeration. Edges are encoded so that all relationships from a single user can be scanned with one prefix, and all relationships of a given type from a user can be scanned with a narrower prefix.

The relationship graph is the foundation for:

  • follows filter: enumerate items from followed creators
  • blocked filter: exclude items from blocked creators
  • hide filter: exclude specific hidden items
  • interaction_weight: user-to-creator affinity used in personalized scoring
  • mute: soft filter (suppresses but does not hard-exclude)
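The difference between the hard filters (blocks, hide) and the soft filter (mute) can be sketched as follows. This is only an illustration of the intended semantics: the Candidate and UserEdges types, the apply_relationship_filters function, and the MUTE_PENALTY factor are hypothetical names and values, not part of the design, and how mute actually suppresses an item is left to later tasks.

```rust
use std::collections::HashSet;

/// A scored candidate item (hypothetical shape, for illustration only).
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    item_id: u64,
    creator_id: u64,
    score: f64,
}

/// One user's outgoing edges, grouped by relationship type (hypothetical).
struct UserEdges {
    blocked_creators: HashSet<u64>,
    hidden_items: HashSet<u64>,
    muted_creators: HashSet<u64>,
}

/// Assumed down-weighting factor for muted creators; not part of the design.
const MUTE_PENALTY: f64 = 0.1;

/// Drop blocked creators and hidden items; down-weight muted creators.
fn apply_relationship_filters(candidates: Vec<Candidate>, edges: &UserEdges) -> Vec<Candidate> {
    candidates
        .into_iter()
        // Hard filters: the candidate is removed entirely.
        .filter(|c| !edges.blocked_creators.contains(&c.creator_id))
        .filter(|c| !edges.hidden_items.contains(&c.item_id))
        // Soft filter: the candidate stays but its score is suppressed.
        .map(|mut c| {
            if edges.muted_creators.contains(&c.creator_id) {
                c.score *= MUTE_PENALTY;
            }
            c
        })
        .collect()
}

fn main() {
    let edges = UserEdges {
        blocked_creators: HashSet::from([7]),
        hidden_items: HashSet::from([102]),
        muted_creators: HashSet::from([9]),
    };
    let candidates = vec![
        Candidate { item_id: 100, creator_id: 7, score: 0.9 }, // blocked creator
        Candidate { item_id: 101, creator_id: 8, score: 0.8 }, // unaffected
        Candidate { item_id: 102, creator_id: 8, score: 0.7 }, // hidden item
        Candidate { item_id: 103, creator_id: 9, score: 0.6 }, // muted creator
    ];
    let kept = apply_relationship_filters(candidates, &edges);
    let ids: Vec<u64> = kept.iter().map(|c| c.item_id).collect();
    assert_eq!(ids, vec![101, 103]);
    assert!(kept[1].score < 0.6); // muted: still present, but scored lower
}
```

Note that hide is keyed on the item while blocks and mute are keyed on the creator, which is why the sketch keeps separate sets.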

Key encoding follows the subject-prefix pattern established in m1p3:

[user_id: 8 bytes BE][0x00][REL: 0x04][type_byte: 1 byte][to_entity_id: 8 bytes BE]

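Every key is a fixed 19 bytes, so a specific edge is read with a single point lookup on its exact key, and all edges of one type from a user are read with a prefix scan whose cost grows linearly with the number of matching edges.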
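A minimal, self-contained sketch of why this layout scans well, using raw u64 ids instead of EntityId. The helper names edge_key and type_prefix are stand-ins for illustration; the design's own functions appear under Key and Value Encoding. Because ids are big-endian, byte-wise key order matches numeric order, so a user's edges group by type and then sort by target id.

```rust
/// Tag byte for relationship keys (0x04 in the layout above).
const REL_TAG: u8 = 0x04;

/// Build the full 19-byte edge key: [from BE][0x00][REL][type][to BE].
fn edge_key(from: u64, type_byte: u8, to: u64) -> [u8; 19] {
    let mut key = [0u8; 19];
    key[0..8].copy_from_slice(&from.to_be_bytes());
    key[8] = 0x00;
    key[9] = REL_TAG;
    key[10] = type_byte;
    key[11..19].copy_from_slice(&to.to_be_bytes());
    key
}

/// Build the 11-byte per-type scan prefix: [from BE][0x00][REL][type].
fn type_prefix(from: u64, type_byte: u8) -> [u8; 11] {
    let mut prefix = [0u8; 11];
    prefix.copy_from_slice(&edge_key(from, type_byte, 0)[..11]);
    prefix
}

fn main() {
    // Edges from user 42: two follows (0x01) and one block (0x02).
    let mut keys = vec![
        edge_key(42, 0x02, 5),
        edge_key(42, 0x01, 300),
        edge_key(42, 0x01, 7),
    ];
    // Byte-wise ordering is what a sorted key-value store uses.
    keys.sort();

    // Edges group by type byte, then sort numerically by target id.
    assert_eq!(keys[0], edge_key(42, 0x01, 7));
    assert_eq!(keys[1], edge_key(42, 0x01, 300));
    assert_eq!(keys[2], edge_key(42, 0x02, 5));

    // The per-type prefix matches only edges of that type.
    let follows = type_prefix(42, 0x01);
    let matched = keys.iter().filter(|k| k.starts_with(&follows)).count();
    assert_eq!(matched, 2);
}
```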

Requirements

  • RelationshipType enum: Follows, Blocks, InteractionWeight, Hide, Mute with as_byte() / from_byte() discriminants
  • RelationshipEdge struct: from: EntityId, to: EntityId, rel_type: RelationshipType, weight: f64, timestamp_nanos: u64
  • db.write_relationship(from, to, rel_type, weight, timestamp) stores edge in users keyspace
  • db.read_relationship(from, to, rel_type) returns Option<RelationshipEdge>
  • db.delete_relationship(from, to, rel_type) removes edge
  • db.list_relationships(from, rel_type) returns Vec<RelationshipEdge> via prefix scan
  • db.list_all_relationships(from) returns all edge types from a user
  • Key encoding: [from_id][0x00][0x04][type_byte][to_id_bytes]
  • Value encoding: [weight: 8 bytes f64 LE][timestamp_nanos: 8 bytes u64 LE]
  • Relationship write latency < 50 microseconds
  • Edges persist across shutdown and restart

Technical Design

Module Structure

tidal/src/
  entities/
    relationship.rs -- RelationshipType, RelationshipEdge, encode/decode, CRUD

Types

// === entities/relationship.rs ===

use crate::schema::{EntityId, Timestamp};
use crate::storage::keys::Tag; // assumed path, next to entity_tag_prefix

/// Relationship type discriminant.
///
/// Each variant maps to a single byte for key encoding.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(u8)]
pub enum RelationshipType {
    /// User follows a creator. Permanent until unfollowed.
    Follows = 0x01,
    /// User blocks a creator. Permanent. Hard filter in all queries.
    Blocks = 0x02,
    /// User-to-creator interaction weight. Updated on every engagement signal.
    /// Decays over time using the same decay infrastructure as signal scores.
    InteractionWeight = 0x03,
    /// User hides a specific item. Permanent. Hard negative.
    Hide = 0x04,
    /// User mutes a creator. Permanent. Soft filter (suppresses, not excludes).
    Mute = 0x05,
}

impl RelationshipType {
    pub const fn as_byte(self) -> u8 {
        self as u8
    }

    pub const fn from_byte(b: u8) -> Option<Self> {
        match b {
            0x01 => Some(Self::Follows),
            0x02 => Some(Self::Blocks),
            0x03 => Some(Self::InteractionWeight),
            0x04 => Some(Self::Hide),
            0x05 => Some(Self::Mute),
            _ => None,
        }
    }

    /// Human-readable name for display and query parsing.
    pub const fn name(self) -> &'static str {
        match self {
            Self::Follows => "follows",
            Self::Blocks => "blocks",
            Self::InteractionWeight => "interaction_weight",
            Self::Hide => "hide",
            Self::Mute => "mute",
        }
    }

    /// Parse from a string name. Used by the query parser.
    pub fn from_name(name: &str) -> Option<Self> {
        match name {
            "follows" => Some(Self::Follows),
            "blocks" => Some(Self::Blocks),
            "interaction_weight" => Some(Self::InteractionWeight),
            "hide" => Some(Self::Hide),
            "mute" => Some(Self::Mute),
            _ => None,
        }
    }
}

/// A directional, weighted relationship edge.
#[derive(Debug, Clone, PartialEq)]
pub struct RelationshipEdge {
    pub from: EntityId,
    pub to: EntityId,
    pub rel_type: RelationshipType,
    pub weight: f64,
    pub timestamp_nanos: u64,
}

Key and Value Encoding

/// Encode a relationship key.
///
/// Format: [from_id: 8 BE][0x00][REL: 0x04][type_byte: 1][to_id: 8 BE]
///
/// Total: 19 bytes (8 + 1 + 1 + 1 + 8; fixed size, no variable-length components).
pub fn encode_relationship_key(
    from: EntityId,
    rel_type: RelationshipType,
    to: EntityId,
) -> [u8; 19] {
    let mut key = [0u8; 19];
    key[0..8].copy_from_slice(&from.to_be_bytes());
    key[8] = 0x00; // NUL separator
    key[9] = Tag::Rel.as_byte(); // 0x04
    key[10] = rel_type.as_byte();
    key[11..19].copy_from_slice(&to.to_be_bytes());
    key
}

/// Build the prefix for scanning all relationships of a type from a user.
///
/// Format: [from_id: 8 BE][0x00][REL: 0x04][type_byte: 1]
///
/// Total: 11 bytes.
pub fn relationship_type_prefix(
    from: EntityId,
    rel_type: RelationshipType,
) -> [u8; 11] {
    let mut prefix = [0u8; 11];
    prefix[0..8].copy_from_slice(&from.to_be_bytes());
    prefix[8] = 0x00;
    prefix[9] = Tag::Rel.as_byte();
    prefix[10] = rel_type.as_byte();
    prefix
}

/// Build the prefix for scanning all relationships from a user (any type).
///
/// Format: [from_id: 8 BE][0x00][REL: 0x04]
///
/// Total: 10 bytes (same as entity_tag_prefix with Tag::Rel).
pub fn relationship_prefix(from: EntityId) -> [u8; 10] {
    crate::storage::keys::entity_tag_prefix(from, Tag::Rel)
}

/// Encode relationship edge value.
///
/// Format: [weight: 8 bytes f64 LE][timestamp_nanos: 8 bytes u64 LE]
pub fn encode_relationship_value(weight: f64, timestamp_nanos: u64) -> [u8; 16] {
    let mut buf = [0u8; 16];
    buf[0..8].copy_from_slice(&weight.to_le_bytes());
    buf[8..16].copy_from_slice(&timestamp_nanos.to_le_bytes());
    buf
}

/// Decode a relationship edge from key + value bytes.
pub fn decode_relationship(key: &[u8], value: &[u8]) -> Option<RelationshipEdge> {
    // The key is indexed up to byte 18 below, so it must be at least 19 bytes.
    if key.len() < 19 || value.len() < 16 {
        return None;
    }
    let from = EntityId::new(u64::from_be_bytes(key[0..8].try_into().ok()?));
    let rel_type = RelationshipType::from_byte(key[10])?;
    let to = EntityId::new(u64::from_be_bytes(key[11..19].try_into().ok()?));
    let weight = f64::from_le_bytes(value[0..8].try_into().ok()?);
    let timestamp_nanos = u64::from_le_bytes(value[8..16].try_into().ok()?);
    Some(RelationshipEdge { from, to, rel_type, weight, timestamp_nanos })
}
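The decoder above trusts that any sufficiently long key is a relationship key. If decoding ever runs over keys from a broader scan, a stricter variant could also check the separator and tag bytes. The following is a sketch under that assumption, using raw integers in place of EntityId and RelationshipType; decode_strict, RawEdge, and take8 are hypothetical names.

```rust
/// Assumed byte value of Tag::Rel, taken from the key layout in this document.
const REL_TAG: u8 = 0x04;
const KEY_LEN: usize = 19; // 8 + 1 + 1 + 1 + 8
const VALUE_LEN: usize = 16; // 8 (weight) + 8 (timestamp)

/// Simplified edge using raw integers instead of EntityId/RelationshipType.
#[derive(Debug, PartialEq)]
struct RawEdge {
    from: u64,
    to: u64,
    type_byte: u8,
    weight: f64,
    timestamp_nanos: u64,
}

/// Copy exactly 8 bytes out of a slice (panics if the slice is not 8 long).
fn take8(bytes: &[u8]) -> [u8; 8] {
    let mut out = [0u8; 8];
    out.copy_from_slice(bytes);
    out
}

/// Decode an edge, rejecting anything that is not a well-formed REL key.
fn decode_strict(key: &[u8], value: &[u8]) -> Option<RawEdge> {
    // Keys and values are fixed-width, so require exact lengths.
    if key.len() != KEY_LEN || value.len() != VALUE_LEN {
        return None;
    }
    // Byte 8 is the NUL separator and byte 9 is the tag.
    if key[8] != 0x00 || key[9] != REL_TAG {
        return None;
    }
    // Only discriminants 0x01..=0x05 are defined.
    let type_byte = key[10];
    if !(0x01..=0x05).contains(&type_byte) {
        return None;
    }
    Some(RawEdge {
        from: u64::from_be_bytes(take8(&key[0..8])),
        to: u64::from_be_bytes(take8(&key[11..19])),
        type_byte,
        weight: f64::from_le_bytes(take8(&value[0..8])),
        timestamp_nanos: u64::from_le_bytes(take8(&value[8..16])),
    })
}

fn main() {
    let mut key = [0u8; 19];
    key[0..8].copy_from_slice(&42u64.to_be_bytes());
    key[9] = REL_TAG;
    key[10] = 0x01; // Follows
    key[11..19].copy_from_slice(&7u64.to_be_bytes());

    let mut value = [0u8; 16];
    value[0..8].copy_from_slice(&1.0f64.to_le_bytes());
    value[8..16].copy_from_slice(&1_000u64.to_le_bytes());

    let edge = decode_strict(&key, &value).expect("well-formed key");
    assert_eq!((edge.from, edge.to, edge.type_byte), (42, 7, 0x01));
    assert_eq!(edge.weight, 1.0);
    assert_eq!(edge.timestamp_nanos, 1_000);

    // A key under a different tag is rejected rather than misread as an edge.
    key[9] = 0x03;
    assert!(decode_strict(&key, &value).is_none());
}
```

Whether the extra checks are worth their cost depends on whether callers can guarantee that only REL-prefixed keys reach the decoder.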

TidalDb API Extensions

impl TidalDb {
    /// Write a relationship edge. Overwrites if the edge already exists.
    pub fn write_relationship(
        &self,
        from: EntityId,
        to: EntityId,
        rel_type: RelationshipType,
        weight: f64,
        timestamp: Timestamp,
    ) -> crate::Result<()> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let key = encode_relationship_key(from, rel_type, to);
        let value = encode_relationship_value(weight, timestamp.as_nanos());
        storage.users_engine().put(&key, &value).map_err(LumenError::from)
    }

    /// Read a specific relationship edge.
    pub fn read_relationship(
        &self,
        from: EntityId,
        to: EntityId,
        rel_type: RelationshipType,
    ) -> crate::Result<Option<RelationshipEdge>> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let key = encode_relationship_key(from, rel_type, to);
        match storage.users_engine().get(&key)? {
            Some(value) => Ok(decode_relationship(&key, &value)),
            None => Ok(None),
        }
    }

    /// Delete a relationship edge.
    pub fn delete_relationship(
        &self,
        from: EntityId,
        to: EntityId,
        rel_type: RelationshipType,
    ) -> crate::Result<()> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let key = encode_relationship_key(from, rel_type, to);
        storage.users_engine().delete(&key).map_err(LumenError::from)
    }

    /// List all relationship edges of a given type from a user.
    pub fn list_relationships(
        &self,
        from: EntityId,
        rel_type: RelationshipType,
    ) -> crate::Result<Vec<RelationshipEdge>> {
        let storage = self.storage.as_ref()
            .ok_or_else(|| LumenError::Internal("no storage".into()))?;
        let prefix = relationship_type_prefix(from, rel_type);
        let mut edges = Vec::new();
        for (key, value) in storage.users_engine().scan_prefix(&prefix) {
            if let Some(edge) = decode_relationship(&key, &value) {
                edges.push(edge);
            }
        }
        Ok(edges)
    }
}
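The requirements also list db.list_all_relationships(from), which the block above does not show. One possible shape, sketched against an in-memory BTreeMap rather than the real users engine: scan the 10-byte [from_id][0x00][REL] prefix and decode every key under it. Engine, rel_prefix, put_edge, and list_all are hypothetical names for this sketch; the real method would presumably call relationship_prefix and scan_prefix instead.

```rust
use std::collections::BTreeMap;

/// Assumed byte value of Tag::Rel, taken from the key layout in this document.
const REL_TAG: u8 = 0x04;

/// In-memory stand-in for the users engine (hypothetical, illustration only).
type Engine = BTreeMap<Vec<u8>, Vec<u8>>;

/// Build the 10-byte prefix [from BE][0x00][REL] shared by all of a user's edges.
fn rel_prefix(from: u64) -> Vec<u8> {
    let mut prefix = from.to_be_bytes().to_vec();
    prefix.push(0x00);
    prefix.push(REL_TAG);
    prefix
}

/// Store one edge: key = prefix + [type][to BE], value = [weight LE][ts LE].
fn put_edge(engine: &mut Engine, from: u64, type_byte: u8, to: u64, weight: f64, ts: u64) {
    let mut key = rel_prefix(from);
    key.push(type_byte);
    key.extend_from_slice(&to.to_be_bytes());
    let mut value = weight.to_le_bytes().to_vec();
    value.extend_from_slice(&ts.to_le_bytes());
    engine.insert(key, value);
}

/// Return (type_byte, to, weight) for every edge from `from`, across all types.
fn list_all(engine: &Engine, from: u64) -> Vec<(u8, u64, f64)> {
    let prefix = rel_prefix(from);
    engine
        .range(prefix.clone()..)
        // Keys are sorted, so the first non-matching key ends the scan.
        .take_while(|(k, _)| k.starts_with(&prefix))
        .filter_map(|(k, v)| {
            if k.len() != 19 || v.len() != 16 {
                return None; // skip malformed entries
            }
            let mut to = [0u8; 8];
            to.copy_from_slice(&k[11..19]);
            let mut weight = [0u8; 8];
            weight.copy_from_slice(&v[0..8]);
            Some((k[10], u64::from_be_bytes(to), f64::from_le_bytes(weight)))
        })
        .collect()
}

fn main() {
    let mut engine = Engine::new();
    put_edge(&mut engine, 1, 0x02, 20, 1.0, 0); // user 1 blocks 20
    put_edge(&mut engine, 1, 0x01, 10, 1.0, 0); // user 1 follows 10
    put_edge(&mut engine, 2, 0x01, 10, 1.0, 0); // another user's edge

    // Results come back grouped by type byte, then by target id.
    let edges = list_all(&engine, 1);
    assert_eq!(edges, vec![(0x01, 10, 1.0), (0x02, 20, 1.0)]);
}
```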

Test Strategy

Unit Tests

#[test]
fn relationship_type_byte_roundtrip() {
    let types = [
        RelationshipType::Follows,
        RelationshipType::Blocks,
        RelationshipType::InteractionWeight,
        RelationshipType::Hide,
        RelationshipType::Mute,
    ];
    for rt in types {
        let byte = rt.as_byte();
        assert_eq!(RelationshipType::from_byte(byte), Some(rt));
    }
}

#[test]
fn relationship_type_name_roundtrip() {
    let types = [
        RelationshipType::Follows,
        RelationshipType::Blocks,
        RelationshipType::InteractionWeight,
        RelationshipType::Hide,
        RelationshipType::Mute,
    ];
    for rt in types {
        let name = rt.name();
        assert_eq!(RelationshipType::from_name(name), Some(rt));
    }
}

#[test]
fn encode_decode_relationship_roundtrip() {
    let from = EntityId::new(42);
    let to = EntityId::new(7);
    let rt = RelationshipType::Follows;
    let weight = 1.0;
    let ts = 1_000_000_000u64;

    let key = encode_relationship_key(from, rt, to);
    let value = encode_relationship_value(weight, ts);
    let edge = decode_relationship(&key, &value).unwrap();

    assert_eq!(edge.from, from);
    assert_eq!(edge.to, to);
    assert_eq!(edge.rel_type, rt);
    assert!((edge.weight - weight).abs() < f64::EPSILON);
    assert_eq!(edge.timestamp_nanos, ts);
}

#[test]
fn write_read_relationship_ephemeral() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Follows, 1.0, Timestamp::now(),
    ).unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::Follows,
    ).unwrap();
    assert!(edge.is_some());
    assert_eq!(edge.unwrap().to, EntityId::new(10));
}

#[test]
fn read_nonexistent_relationship_returns_none() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(99), RelationshipType::Follows,
    ).unwrap();
    assert!(edge.is_none());
}

#[test]
fn delete_relationship_removes_edge() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Follows, 1.0, Timestamp::now(),
    ).unwrap();
    db.delete_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::Follows,
    ).unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::Follows,
    ).unwrap();
    assert!(edge.is_none());
}

#[test]
fn list_relationships_returns_all_of_type() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    for creator in 1..=5u64 {
        db.write_relationship(
            EntityId::new(42), EntityId::new(creator),
            RelationshipType::Follows, 1.0, Timestamp::now(),
        ).unwrap();
    }
    db.write_relationship(
        EntityId::new(42), EntityId::new(99),
        RelationshipType::Blocks, 1.0, Timestamp::now(),
    ).unwrap();

    let follows = db.list_relationships(EntityId::new(42), RelationshipType::Follows).unwrap();
    assert_eq!(follows.len(), 5);
    assert!(follows.iter().all(|e| e.rel_type == RelationshipType::Follows));

    let blocks = db.list_relationships(EntityId::new(42), RelationshipType::Blocks).unwrap();
    assert_eq!(blocks.len(), 1);
}

#[test]
fn relationship_write_overwrites_weight() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::InteractionWeight, 0.5, Timestamp::now(),
    ).unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::InteractionWeight, 0.9, Timestamp::now(),
    ).unwrap();
    let edge = db.read_relationship(
        EntityId::new(1), EntityId::new(10), RelationshipType::InteractionWeight,
    ).unwrap().unwrap();
    assert!((edge.weight - 0.9).abs() < f64::EPSILON);
}

#[test]
fn different_relationship_types_do_not_collide() {
    let db = TidalDb::builder().ephemeral().with_schema(test_schema()).open().unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Follows, 1.0, Timestamp::now(),
    ).unwrap();
    db.write_relationship(
        EntityId::new(1), EntityId::new(10),
        RelationshipType::Blocks, 1.0, Timestamp::now(),
    ).unwrap();

    let follows = db.read_relationship(EntityId::new(1), EntityId::new(10), RelationshipType::Follows).unwrap();
    let blocks = db.read_relationship(EntityId::new(1), EntityId::new(10), RelationshipType::Blocks).unwrap();
    assert!(follows.is_some());
    assert!(blocks.is_some());
}

Property Tests

use proptest::prelude::*;

proptest! {
    #[test]
    fn relationship_key_encode_decode_roundtrip(
        from_id in 1u64..100000,
        to_id in 1u64..100000,
        type_byte in 1u8..=5u8,
        weight in -10.0f64..10.0,
        ts in 0u64..u64::MAX,
    ) {
        let from = EntityId::new(from_id);
        let to = EntityId::new(to_id);
        let rt = RelationshipType::from_byte(type_byte).unwrap();

        let key = encode_relationship_key(from, rt, to);
        let value = encode_relationship_value(weight, ts);
        let edge = decode_relationship(&key, &value);

        prop_assert!(edge.is_some());
        let edge = edge.unwrap();
        prop_assert_eq!(edge.from, from);
        prop_assert_eq!(edge.to, to);
        prop_assert_eq!(edge.rel_type, rt);
        prop_assert!((edge.weight - weight).abs() < f64::EPSILON);
        prop_assert_eq!(edge.timestamp_nanos, ts);
    }

    #[test]
    fn relationship_type_prefix_contains_all_keys(
        from_id in 1u64..10000,
        to_ids in proptest::collection::vec(1u64..10000, 1..20),
        type_byte in 1u8..=5u8,
    ) {
        let from = EntityId::new(from_id);
        let rt = RelationshipType::from_byte(type_byte).unwrap();
        let prefix = relationship_type_prefix(from, rt);

        for &to_id in &to_ids {
            let key = encode_relationship_key(from, rt, EntityId::new(to_id));
            prop_assert!(key.starts_with(&prefix),
                "key for to={} should start with type prefix", to_id);
        }
    }
}

Acceptance Criteria

  • RelationshipType enum with 5 variants, as_byte()/from_byte()/name()/from_name() roundtrip
  • RelationshipEdge struct with Debug, Clone, PartialEq
  • encode_relationship_key produces 19-byte fixed-size keys
  • encode_relationship_value produces 16-byte fixed-size values
  • decode_relationship roundtrips correctly with encode functions
  • db.write_relationship() stores in users keyspace under Tag::Rel
  • db.read_relationship() retrieves specific edge, returns None for missing
  • db.delete_relationship() removes edge
  • db.list_relationships() enumerates all edges of a type from a user via prefix scan
  • Write overwrites existing edge (same from, to, type)
  • Different relationship types for the same (from, to) pair do not collide
  • Relationship key prefix scan returns only the requested type
  • Write/read latency < 50 microseconds (benchmarked or measured in test)
  • Property tests pass: encode/decode roundtrip, prefix containment
  • cargo clippy -- -D warnings passes
  • All tests pass

Research References

  • thoughts.md -- Part V.12 (subject-prefix key encoding)
  • VISION.md -- Relationships are first-class edges between entities

Implementation Notes

  • The key length is fixed at 19 bytes: 8 (from_id) + 1 (NUL) + 1 (Tag::Rel) + 1 (type_byte) + 8 (to_id). Every key array and length check must therefore use 19 ([u8; 19]), not 18.
  • Relationship edges are stored in the users keyspace because the query pattern is always "given a user, find their relationships." Storing in the users keyspace means all of a user's data (metadata, relationships, preference vector) is co-located and scannable with one prefix.
  • For M3, only forward traversal is needed (user -> creators). Reverse indexes (creator -> followers) are deferred to M6 when social graph traversal queries are implemented.
  • The interaction_weight relationship type is updated atomically in m3p2 when engagement signals are written. The weight is a running value, not a sum -- each update replaces the previous weight.
  • Hide edges point from a user to an item (not a creator). The to field is an item_id. This is a user-to-item relationship, unlike follows/blocks which are user-to-creator.
  • Do NOT implement WAL-backed relationship writes in this task. Relationships are stored directly in the storage engine (fjall). WAL-backed relationship writes for crash safety of in-flight relationship changes are addressed in m3p2 Task 03 (Hard Negatives) where it matters most.