# API Reference

How developers interact with tidalDB. This document covers initialization, schema definition, write operations, queries, and the feedback loop.

tidalDB is an embeddable Rust library. You link it into your process. There is no separate server, no network protocol, no client SDK. The API is Rust types and method calls.

---

## Table of Contents

- [Initialization](#initialization)
- [Schema Definition](#schema-definition)
  - [Entity Types](#entity-types)
  - [Signal Definitions](#signal-definitions)
  - [Ranking Profiles](#ranking-profiles)
- [Write Path](#write-path)
  - [Ingesting Entities](#ingesting-entities)
  - [Updating Entities](#updating-entities)
  - [Writing Relationships](#writing-relationships)
  - [Writing Signals](#writing-signals)
- [Query Language](#query-language)
  - [RETRIEVE — Feeds, Browse, Related](#retrieve--feeds-browse-related)
  - [SEARCH — Text + Semantic Retrieval](#search--text--semantic-retrieval)
  - [SUGGEST — Autocomplete and Suggestions](#suggest--autocomplete-and-suggestions)
- [Filters](#filters)
- [Sort Modes](#sort-modes)
- [Diversity Constraints](#diversity-constraints)
- [Pagination](#pagination)
- [Response Format](#response-format)
- [Lifecycle and Operations](#lifecycle-and-operations)

---

## Initialization

Open a database, providing a path and configuration.

```rust
use tidaldb::{Config, Durability, TidalDB};

let db = TidalDB::open(Config {
    path: "./data/my_app",

    // Memory budget for in-memory signal state and hot caches.
    // Higher = faster ranking queries. 10M entities need ~1-2 GB.
    memory_budget: 2 * 1024 * 1024 * 1024, // 2 GB

    // WAL fsync strategy for signal writes.
    signal_durability: Durability::Batched { max_batch: 100, max_delay_ms: 10 },

    // Number of threads for background materializer and segment merging.
    background_threads: 4,
})?;
```

**`Durability`** controls fsync behavior for signal writes:

| Level | Behavior | Use Case |
|---|---|---|
| `Immediate` | fsync every write | Financial, purchase events |
| `Batched { max_batch, max_delay_ms }` | fsync per batch | Default for engagement signals |
| `Eventual` | fsync on OS schedule | Impressions, low-value telemetry |

The database is `Send + Sync`. Share it across threads with `Arc`.

---

## Schema Definition

Schema is defined before writing data. It declares entity types, signal types, and ranking profiles. Schema is versioned: old profiles remain queryable by name and version.

### Entity Types

Entities are the nodes of the system. Three built-in types: **Item**, **User**, **Creator**.

```rust
use tidaldb::schema::*;

db.define_entity(EntityDef {
    kind: EntityKind::Item,
    metadata_fields: vec![
        Field::text("title"),             // full-text indexed
        Field::text("description"),       // full-text indexed
        Field::keyword("category"),       // exact match, filterable
        Field::keywords("tags"),          // multi-value, filterable
        Field::keyword("format"),         // video, short, podcast, article, etc.
        Field::keyword("language"),       // ISO code
        Field::keyword("content_rating"), // G, PG, PG-13, R
        Field::keyword("status"),         // published, live, scheduled, archived
        Field::keyword("availability"),   // free, premium, subscriber_only
        Field::duration("duration"),      // filterable, sortable
        Field::timestamp("created_at"),   // filterable, sortable
        Field::bool("has_subtitles"),
        Field::bool("downloadable"),
    ],
    // Embedding slot: you provide the vector, tidalDB indexes it.
    embedding_dimensions: 1536,
})?;

db.define_entity(EntityDef {
    kind: EntityKind::User,
    metadata_fields: vec![
        // Demographic (application-set)
        Field::keyword("locale"),    // en-US, ja-JP, etc.
        Field::keyword("region"),    // country or region code
        Field::keyword("timezone"),  // IANA timezone
        Field::keyword("age_range"), // 18-24, 25-34, 35-44, etc.
        Field::keyword("gender"),    // optional demographic

        // Interests (mixed: app-set + DB-computed)
        Field::keywords("explicit_interests"), // stated interests
        Field::keywords("inferred_interests"), // DB-computed from engagement
        Field::keywords("primary_categories"), // top categories by engagement

        // Behavioral (DB-computed)
        Field::keyword("engagement_level"),  // power, regular, casual, dormant
        Field::keyword("format_preference"), // short, long, mixed
        Field::keyword("session_pattern"),   // binge, browse, search
        Field::i64("platform_tenure_days"),  // days since first signal
    ],
    // User preference vector, managed by the database.
    // Updated automatically on every signal write.
    embedding_dimensions: 1536,
})?;

db.define_entity(EntityDef {
    kind: EntityKind::Creator,
    metadata_fields: vec![
        Field::text("name"),
        Field::keyword("handle"),
        Field::keyword("language"),
        Field::keyword("region"),
        Field::bool("verified"),
    ],
    // Creator embedding, aggregated from their item catalog.
    embedding_dimensions: 1536,
})?;
```

**Field types:**

| Type | Behavior |
|---|---|
| `text` | Full-text indexed (BM25), searchable with tokenization |
| `keyword` | Exact match, filterable, facetable |
| `keywords` | Multi-value keyword (tags, categories) |
| `i64` / `f64` | Numeric, range-filterable, sortable |
| `bool` | Boolean filter |
| `timestamp` | Time value, range-filterable |
| `duration` | Duration value, range-filterable, sortable |

### Signal Definitions

Signals are typed, timestamped event streams. Decay, velocity, and windowed aggregation are declared in schema, not computed in application code.
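To give a feel for the arithmetic that schema-declared decay takes off the application's hands, here is a minimal standalone sketch of half-life and linear decay weights. The function names are hypothetical and this is not tidalDB's internal implementation; it only illustrates the semantics of the `Exponential` and `Linear` decay types described in this section, assuming `age` and the half-life or lifetime are expressed in the same unit.

```rust
/// Exponential decay: the weight halves every `half_life`.
/// `age` and `half_life` must use the same unit (seconds, days, ...).
fn exponential_weight(age: f64, half_life: f64) -> f64 {
    0.5_f64.powf(age / half_life)
}

/// Linear decay: the weight falls from 1.0 to 0.0 over `lifetime`,
/// and stays at 0.0 afterwards.
fn linear_weight(age: f64, lifetime: f64) -> f64 {
    (1.0 - age / lifetime).clamp(0.0, 1.0)
}

/// Decayed sum over `(age, weight)` events, using exponential decay.
fn decayed_sum(events: &[(f64, f64)], half_life: f64) -> f64 {
    events
        .iter()
        .map(|&(age, weight)| weight * exponential_weight(age, half_life))
        .sum()
}
```

With a 7-day half-life, a view recorded 7 days ago counts half as much as one recorded now, and one recorded 14 days ago counts a quarter as much.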
```rust
db.define_signal(SignalDef {
    name: "view",
    target: EntityKind::Item,
    decay: Decay::Exponential { half_life: Duration::days(7) },
    windows: vec![
        Window::hours(1),
        Window::hours(24),
        Window::days(7),
        Window::days(30),
        Window::all_time(),
    ],
    velocity: true, // compute rate-of-change per window
})?;

db.define_signal(SignalDef {
    name: "like",
    target: EntityKind::Item,
    decay: Decay::Exponential { half_life: Duration::days(7) },
    windows: vec![Window::hours(1), Window::hours(24), Window::days(7), Window::all_time()],
    velocity: true,
})?;

db.define_signal(SignalDef {
    name: "skip",
    target: EntityKind::Item,
    decay: Decay::Exponential { half_life: Duration::hours(24) },
    windows: vec![Window::hours(1), Window::hours(24)],
    velocity: false,
})?;

db.define_signal(SignalDef {
    name: "hide",
    target: EntityKind::Item,
    decay: Decay::Permanent, // never decays
    windows: vec![],         // no aggregation, binary flag
    velocity: false,
})?;

db.define_signal(SignalDef {
    name: "completion",
    target: EntityKind::Item,
    decay: Decay::Exponential { half_life: Duration::days(30) },
    windows: vec![Window::all_time()],
    velocity: false,
})?;
```

**Decay types:**

| Decay | Behavior |
|---|---|
| `Exponential { half_life }` | Signal weight halves every `half_life` duration |
| `Linear { lifetime }` | Signal weight drops linearly to zero over `lifetime` |
| `Permanent` | Never decays (hides, blocks, follows) |

The full signal reference is in [USE_CASES.md Appendix C](USE_CASES.md#appendix-c--signal-reference).

### Ranking Profiles

Named, versioned scoring functions. The application says `USING PROFILE for_you`. The database executes the entire pipeline.
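The `search` profile in this section fuses text and vector candidates with `Fusion::Rrf { k: 60 }`. For readers unfamiliar with reciprocal rank fusion, the following is a minimal standalone sketch of the standard technique: each ranked list contributes `1 / (k + rank)` to an item's fused score, with ranks starting at 1. The `rrf_fuse` function is hypothetical and not part of the tidalDB API, and it ignores `text_weight` and `vector_weight`, since this document does not specify how tidalDB applies them.

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion over several ranked lists of item ids.
/// Each list adds 1 / (k + rank) to an item's score, where rank is 1-based.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (index, id) in ranking.iter().enumerate() {
            let rank = index as f64 + 1.0;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }

    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

An item that appears near the top of both the text list and the vector list outranks an item that tops only one of them, which is the property that makes rank-based fusion useful when the two scores are not on a comparable scale.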
```rust
db.define_profile(ProfileDef {
    name: "for_you",
    version: 1,
    candidate: Candidate::Ann {
        query_vector: VectorSource::UserPreference,
        index: EntityKind::Item,
        top_k: 500,
    },
    boosts: vec![
        Boost::signal("view", Window::hours(24), Velocity, 0.3),
        Boost::relationship("interaction_weight", 0.2),
        Boost::social_proof(0.15),
    ],
    decay: ProfileDecay {
        field: "created_at",
        half_life: Duration::hours(48),
    },
    gates: vec![
        Gate::min("completion", Window::all_time(), 0.3),
    ],
    penalties: vec![
        Penalty::signal("skip", Window::hours(24), -0.5),
    ],
    excludes: vec![
        Exclude::signal("hide"),
        Exclude::relationship("blocked"),
    ],
    diversity: DiversitySpec {
        max_per_creator: Some(2),
        format_mix: true,
        topic_diversity: None,
    },
    exploration: 0.10, // 10% of results from creators the user doesn't follow
    // Remaining fields (such as `sort`) keep their defaults.
    ..ProfileDef::default()
})?;

db.define_profile(ProfileDef {
    name: "trending",
    version: 1,
    candidate: Candidate::Scan { entity: EntityKind::Item },
    boosts: vec![
        Boost::signal("share", Window::hours(6), Velocity, 0.5),
        Boost::signal("view", Window::hours(6), Velocity, 0.3),
        Boost::signal("view", Window::hours(24), UniqueRatio, 0.2), // new-user reach
    ],
    gates: vec![
        Gate::min_ratio("engagement_ratio", 0.03),
    ],
    // No personalization. No user context. Pure velocity.
    ..ProfileDef::default()
})?;

db.define_profile(ProfileDef {
    name: "search",
    version: 1,
    candidate: Candidate::Hybrid {
        text_weight: 0.6,
        vector_weight: 0.4,
        fusion: Fusion::Rrf { k: 60 },
    },
    boosts: vec![
        Boost::signal("completion", Window::all_time(), Value, 0.15),
        Boost::signal("like", Window::all_time(), Ratio, 0.10),
    ],
    decay: ProfileDecay {
        field: "created_at",
        half_life: Duration::days(90), // slow decay for evergreen content
    },
    diversity: DiversitySpec {
        max_per_creator: Some(2),
        ..Default::default()
    },
    ..ProfileDef::default()
})?;

db.define_profile(ProfileDef {
    name: "following",
    version: 1,
    candidate: Candidate::Relationship { edge: "follows" },
    boosts: vec![],
    // Pure reverse chronological. Minimal algorithmic intervention.
    sort: Sort::Field("created_at", Desc),
    ..ProfileDef::default()
})?;
```

See [ai-lookup/services/ranking-profiles.md](ai-lookup/services/ranking-profiles.md) for the full list of built-in profiles.

### Cohort Definitions

Cohorts are named predicates over user attributes. They define audience segments for scoped signal aggregation and trending.

```rust
db.define_cohort(CohortDef {
    name: "us_young_music",
    predicate: Predicate::and(vec![
        Predicate::eq("locale", "en-US"),
        Predicate::any("age_range", &["18-24", "25-34"]),
        Predicate::any("primary_categories", &["music", "concerts"]),
    ]),
    // Signal aggregation level: controls storage cost vs. query flexibility.
    aggregation: CohortAggregation::Materialized,
})?;

db.define_cohort(CohortDef {
    name: "jp_casual",
    predicate: Predicate::and(vec![
        Predicate::eq("region", "JP"),
        Predicate::eq("engagement_level", "casual"),
    ]),
    aggregation: CohortAggregation::Materialized,
})?;

// Ad-hoc cohorts can be queried without pre-definition,
// but use query-time aggregation (slower).
```

**`CohortAggregation`:**

| Level | Behavior | Use Case |
|---|---|---|
| `Materialized` | Pre-computed per-cohort signal aggregates, updated at signal write time | High-traffic cohorts, production trending pages |
| `OnDemand` | Computed at query time by filtering signal events | Ad-hoc analysis, rare cohort combinations |

---

## Write Path

### Ingesting Entities

Items enter the system with metadata and an embedding. The application provides the embedding; tidalDB does not generate vectors.

```rust
db.write_item(WriteItem {
    id: "item_abc",
    creator_id: "creator_xyz",
    metadata: metadata!
    {
        "title" => "Introduction to Jazz Piano",
        "description" => "A beginner's guide to jazz piano...",
        "category" => "music",
        "tags" => ["jazz", "piano", "tutorial", "beginner"],
        "format" => "video",
        "language" => "en",
        "duration" => Duration::minutes(22),
        "content_rating" => "G",
        "status" => "published",
        "availability" => "free",
        "has_subtitles" => true,
        "downloadable" => false,
        "created_at" => Utc::now(),
    },
    embedding: &content_vector, // [f32; 1536], computed externally
})?;
```

On commit, tidalDB:

1. Stores the item in the entity store
2. Indexes its text fields in the inverted index (BM25)
3. Inserts its embedding into the ANN index (HNSW)
4. Initializes its signal ledger (all zeros)
5. Links it to its creator entity
6. Gives it a cold-start exploration budget
7. Makes it **immediately queryable**

```rust
db.write_creator(WriteCreator {
    id: "creator_xyz",
    metadata: metadata! {
        "name" => "Jazz Academy",
        "handle" => "jazzacademy",
        "language" => "en",
        "verified" => true,
    },
    embedding: &creator_vector, // aggregated from catalog, computed externally
})?;

db.write_user(WriteUser {
    id: "user_123",
    metadata: metadata! {
        "locale" => "en-US",
        "region" => "US",
        "timezone" => "America/New_York",
        "age_range" => "18-24",
        "explicit_interests" => ["jazz", "piano", "music production"],
    },
    // Initial preference vector. If None, uses cohort-level or population-level default.
    embedding: None,
})?;
```

### Updating Entities

Update metadata or embeddings on existing entities.

```rust
db.update_item("item_abc", UpdateItem {
    metadata: Some(metadata! {
        "status" => "archived",
    }),
    embedding: None, // unchanged
})?;
```

### Writing Relationships

Relationships are first-class edges between entities. Weighted, directional, traversable in queries.

```rust
db.write_relationship(Relationship {
    kind: "follows",
    from: ("user", "user_123"),
    to: ("creator", "creator_xyz"),
    weight: 1.0,
    timestamp: Utc::now(),
})?;

// Block a creator: permanent, hard filter in all future queries.
db.write_relationship(Relationship {
    kind: "blocked",
    from: ("user", "user_123"),
    to: ("creator", "creator_bad"),
    weight: 1.0,
    timestamp: Utc::now(),
})?;
```

### Writing Signals

Signals are how the feedback loop closes. A single signal write atomically updates:

1. The item's signal ledger (windowed aggregates, velocity, decay score)
2. The user's preference vector (shifted toward or away from the item's embedding)
3. The user-to-creator relationship weight
4. The user-to-item relationship (seen, liked, hidden, etc.)

```rust
// User viewed an item
db.signal(Signal {
    kind: "view",
    item: "item_abc",
    user: "user_123",
    timestamp: Utc::now(),
    weight: 1.0,
    context: None,
})?;

// User completed 94% of the video
db.signal(Signal {
    kind: "completion",
    item: "item_abc",
    user: "user_123",
    timestamp: Utc::now(),
    weight: 0.94, // completion ratio
    context: None,
})?;

// User liked an item
db.signal(Signal {
    kind: "like",
    item: "item_abc",
    user: "user_123",
    timestamp: Utc::now(),
    weight: 1.0,
    context: None,
})?;

// User skipped after 3 seconds (strong negative)
db.signal(Signal {
    kind: "skip",
    item: "item_xyz",
    user: "user_123",
    timestamp: Utc::now(),
    weight: 1.0,
    context: Some(json!({ "dwell_ms": 3200, "source": "autoplay" })),
})?;

// User tapped "Not interested" (permanent negative on this item)
db.signal(Signal {
    kind: "hide",
    item: "item_xyz",
    user: "user_123",
    timestamp: Utc::now(),
    weight: 1.0,
    context: None,
})?;

// Search click with rank context (trains query relevance)
db.signal(Signal {
    kind: "search_click",
    item: "item_abc",
    user: "user_123",
    timestamp: Utc::now(),
    weight: 1.0,
    context: Some(json!({ "query": "jazz piano tutorial", "rank_at_click": 3 })),
})?;
```

The next ranking query, even 100ms later, reflects the updated state.

---

## Query Language

Three operations: **RETRIEVE** (feed generation, browse, related), **SEARCH** (text + semantic retrieval), **SUGGEST** (autocomplete).

All queries return ranked results with scores.
The application renders; it never re-ranks.

### RETRIEVE — Feeds, Browse, Related

RETRIEVE generates ranked content lists. It handles personalized feeds, category browse, trending, following, related content, and every other surface described in [USE_CASES.md](USE_CASES.md).

```rust
// Personalized For You feed
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    for_user: Some("user_123"),
    context: Some("feed"),
    profile: "for_you",
    filters: vec![
        Filter::unseen(),
        Filter::not_blocked(),
        Filter::eq("format", "video"),
        Filter::preset("duration", "short"), // under 4 minutes
    ],
    diversity: Some(DiversitySpec {
        max_per_creator: Some(2),
        format_mix: true,
        topic_diversity: None,
    }),
    exclude_ids: vec![], // previously returned items
    limit: 50,
    cursor: None,
    ..Default::default()
})?;
```

```rust
// Trending globally
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    for_user: None, // no personalization
    profile: "trending",
    filters: vec![],
    diversity: Some(DiversitySpec {
        max_per_creator: Some(1),
        ..Default::default()
    }),
    limit: 25,
    ..Default::default()
})?;
```

```rust
// Trending in a category
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    profile: "trending",
    filters: vec![
        Filter::eq("category", "jazz"),
    ],
    diversity: Some(DiversitySpec {
        max_per_creator: Some(1),
        ..Default::default()
    }),
    limit: 25,
    ..Default::default()
})?;
```

```rust
// Trending among people I follow (social graph scoped)
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    for_user: Some("user_123"),
    profile: "trending",
    filters: vec![
        Filter::social_graph("user_123", Depth::Two),
    ],
    limit: 25,
    ..Default::default()
})?;
```

```rust
// Trending within a cohort: what's hot among US young music fans
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    profile: "trending",
    cohort: Some("us_young_music"), // scopes signal aggregation
    filters: vec![],
    diversity: Some(DiversitySpec {
        max_per_creator: Some(1),
        ..Default::default()
    }),
    limit: 25,
    ..Default::default()
})?;
```

```rust
// Ad-hoc cohort (not pre-defined): slower but flexible
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    profile: "trending",
    cohort_predicate: Some(Predicate::and(vec![
        Predicate::eq("region", "DE"),
        Predicate::any("primary_categories", &["cooking", "food"]),
    ])),
    limit: 25,
    ..Default::default()
})?;
```

```rust
// Following feed: pure chronological
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    for_user: Some("user_123"),
    profile: "following",
    filters: vec![
        Filter::relationship("follows"),
        Filter::unseen(),
    ],
    limit: 50,
    cursor: Some(cursor_from_last_batch),
    ..Default::default()
})?;
```

```rust
// Related content / Up Next: anchored to a specific item
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    for_user: Some("user_123"),
    profile: "related",
    similar_to: Some("item_abc"), // anchor item
    filters: vec![
        Filter::unseen(),
    ],
    diversity: Some(DiversitySpec {
        max_per_creator: Some(1), // at most one item per creator
        ..Default::default()
    }),
    limit: 10,
    ..Default::default()
})?;
```

```rust
// Browse category with explicit sort mode
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    profile: "browse",
    sort: Some(Sort::TopWeek), // override profile's default sort
    filters: vec![
        Filter::eq("category", "jazz"),
        Filter::preset("duration", "short"),
        Filter::eq("has_subtitles", true),
    ],
    limit: 20,
    ..Default::default()
})?;
```

```rust
// Hidden gems: high quality, low reach
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    profile: "hidden_gems",
    filters: vec![
        Filter::created_within(Duration::days(30)),
    ],
    limit: 20,
    ..Default::default()
})?;
```

```rust
// User's saved items (personal library), most recently saved first
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    for_user: Some("user_123"),
    sort: Some(Sort::DateSaved),
    filters: vec![
        Filter::user_state("saved"),
    ],
    limit: 20,
    ..Default::default()
})?;
```

```rust
// Creator discovery: find creators like another creator
let results = db.retrieve(Retrieve {
    entity: EntityKind::Creator,
    for_user: Some("user_123"),
    similar_to: Some("creator_xyz"),
    sort: Some(Sort::CreatorEngagementRate),
    limit: 10,
    ..Default::default()
})?;
```

```rust
// Live content: what's live right now that this user cares about
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    for_user: Some("user_123"),
    profile: "live",
    filters: vec![
        Filter::eq("status", "live"),
    ],
    sort: Some(Sort::LiveViewerCount),
    limit: 20,
    ..Default::default()
})?;
```

```rust
// Notification prioritization
let results = db.retrieve(Retrieve {
    entity: EntityKind::Item,
    for_user: Some("user_123"),
    profile: "notification",
    filters: vec![
        Filter::since(last_seen_timestamp),
    ],
    limit: 20,
    ..Default::default()
})?;
```

### SEARCH — Text + Semantic Retrieval

Search combines full-text BM25 relevance with semantic similarity. Text relevance is the floor: an irrelevant result never surfaces just because the user likes the creator.

```rust
// Basic keyword search, personalized for this user
let results = db.search(Search {
    query: "rust tutorial beginner",
    vector: Some(&query_embedding), // embed the query externally, pass it in
    for_user: Some("user_123"),
    profile: "search",
    filters: vec![],
    diversity: Some(DiversitySpec {
        max_per_creator: Some(2),
        ..Default::default()
    }),
    limit: 20,
    ..Default::default()
})?;
```

**Query syntax:** users can type these directly. The database parses them.

| Syntax | Meaning |
|---|---|
| `jazz piano tutorial` | Match any of these terms (OR), rank by relevance |
| `"jazz piano"` | Exact phrase match |
| `jazz AND piano NOT beginner` | Boolean operators |
| `jazz -beginner` | Exclude term |
| `jazz pian*` | Wildcard prefix |
| `title:jazz` | Field-scoped search |
| `tag:tutorial` | Search within specific field |
| `creator:jazzacademy` | Search by creator handle |
| `#jazz` | Hashtag match |

```rust
// Exact phrase with date filter
let results = db.search(Search {
    query: "\"machine learning\" fundamentals",
    filters: vec![
        Filter::created_within(Duration::days(30)),
        Filter::eq("format", "video"),
        Filter::preset("duration", "long"),
    ],
    limit: 20,
    ..Default::default()
})?;
```

```rust
// Semantic-only search (no text query, just a vector)
let results = db.search(Search {
    query: "",
    vector: Some(&image_embedding), // visual similarity search
    for_user: Some("user_123"),
    profile: "search",
    limit: 20,
    ..Default::default()
})?;
```

```rust
// People/creator search
let results = db.search(Search {
    query: "jazz piano",
    entity: EntityKind::Creator,
    filters: vec![
        Filter::eq("verified", true),
        Filter::min("follower_count", 1000),
    ],
    sort: Some(Sort::CreatorEngagementRate),
    limit: 10,
    ..Default::default()
})?;
```

### Query Composition — SEARCH within Scoped Results

SEARCH can be composed with RETRIEVE scopes. This enables searching within trending, within a cohort, or within any candidate set.
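Conceptually, a scoped search intersects the ranked search hits with the scope's candidate set. The sketch below illustrates that idea in standalone Rust, under the assumption that hits arrive already ranked and the scope is a plain set of item ids. `search_within` is a hypothetical helper, not a tidalDB function, and the database may well apply the scope before or during retrieval rather than after it.

```rust
use std::collections::HashSet;

/// Keep only the hits whose id belongs to the scope, preserving rank order,
/// and return at most `limit` of them.
fn search_within(
    ranked_hits: &[(String, f64)],
    scope: &HashSet<String>,
    limit: usize,
) -> Vec<(String, f64)> {
    ranked_hits
        .iter()
        .filter(|(id, _)| scope.contains(id))
        .take(limit)
        .cloned()
        .collect()
}
```

Because the filter runs before `take`, the limit counts in-scope results only, so a scoped query can still fill a page when many top hits fall outside the scope.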

```rust
// Search within globally trending items
let results = db.search(Search {
    query: "jazz piano",
    vector: Some(&query_embedding),
    within: Some(WithinScope::Trending {
        window: Duration::hours(24),
    }),
    profile: "search",
    limit: 20,
    ..Default::default()
})?;
```

```rust
// Search within cohort-scoped trending: the full three-layer query
let results = db.search(Search {
    query: "jazz piano",
    vector: Some(&query_embedding),
    within: Some(WithinScope::CohortTrending {
        cohort: "us_young_music",
        window: Duration::hours(24),
    }),
    profile: "search",
    limit: 20,
    ..Default::default()
})?;
```

```rust
// Search within a user's following feed
let results = db.search(Search {
    query: "jazz piano",
    for_user: Some("user_123"),
    within: Some(WithinScope::Following),
    profile: "search",
    limit: 20,
    ..Default::default()
})?;
```

**`WithinScope`:**

| Scope | Candidate Set |
|---|---|
| `Trending { window }` | Items with high global velocity in window |
| `CohortTrending { cohort, window }` | Items with high velocity among cohort members |
| `Following` | Items from followed creators (requires `for_user`) |
| `Category { name }` | Items in a category |
| `Collection { id }` | Items in a collection |

### SUGGEST — Autocomplete and Suggestions

```rust
// Autocomplete on partial query
let suggestions = db.suggest(Suggest {
    prefix: "jazz pia",
    for_user: Some("user_123"),
    limit: 5,
})?;
// Returns: ["jazz piano", "jazz piano tutorial", "jazz piano chords", ...]

// Trending searches (empty prefix)
let trending = db.suggest(Suggest {
    prefix: "",
    for_user: None,
    limit: 10,
})?;
```

---

## Filters

All filters are composable. Any combination of filters produces a valid, efficiently-executed query. Multi-select within a dimension uses OR logic. Cross-dimension uses AND logic.
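
The OR-within, AND-across rule can be illustrated with a small standalone sketch. The `Attributes` alias, the `DimensionFilter` struct, and `passes_filters` are hypothetical names chosen for this example, not tidalDB types; the sketch assumes each item exposes its attributes as a map from dimension name to one or more values.

```rust
use std::collections::HashMap;

/// An item's attributes: dimension name -> values (multi-value for tags).
type Attributes = HashMap<&'static str, Vec<&'static str>>;

/// One filter: a dimension plus the set of accepted values for it.
struct DimensionFilter {
    dimension: &'static str,
    any_of: Vec<&'static str>,
}

/// An item passes when EVERY filter is satisfied (AND across dimensions),
/// and a filter is satisfied when ANY of the item's values for that
/// dimension is in the accepted set (OR within a dimension).
fn passes_filters(item: &Attributes, filters: &[DimensionFilter]) -> bool {
    filters.iter().all(|filter| {
        item.get(filter.dimension)
            .map_or(false, |values| values.iter().any(|v| filter.any_of.contains(v)))
    })
}
```

Under this rule, selecting the categories `jazz` and `blues` together with the format `video` matches jazz videos and blues videos, but not jazz podcasts.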

### Content Attribute Filters

```rust
Filter::eq("category", "jazz")              // exact match
Filter::any("category", &["jazz", "blues"]) // OR within dimension
Filter::eq("format", "video")
Filter::eq("language", "en")
Filter::eq("content_rating", "PG")
Filter::eq("status", "published")
Filter::eq("availability", "free")
Filter::eq("has_subtitles", true)
Filter::eq("downloadable", true)
Filter::eq("original_only", true)           // exclude crossposts
```

### Duration Filters

```rust
Filter::preset("duration", "short")            // under 4 minutes
Filter::preset("duration", "medium")           // 4-20 minutes
Filter::preset("duration", "long")             // over 20 minutes
Filter::range("duration", (5 * 60)..(15 * 60)) // custom range (seconds)
```

### Date / Time Filters

```rust
Filter::created_within(Duration::days(7))
Filter::created_preset("today")               // today, week, month, year
Filter::created_after("2025-01-01T00:00:00Z")
Filter::created_before("2025-06-01T00:00:00Z")
Filter::since(timestamp)                      // for notifications
```

### Creator Filters

```rust
Filter::eq("creator", "creator_xyz")
Filter::exclude_creator("creator_bad")
Filter::eq("creator_verified", true)
Filter::creator_followed_by_user() // requires for_user
Filter::creator_new_to_user()      // never seen this creator
Filter::min("creator_min_followers", 1000)
Filter::max("creator_max_followers", 50000)
```

### Engagement Threshold Filters

```rust
Filter::min("min_views", 10000)
Filter::max("max_views", 5000) // for hidden gems
Filter::min("min_likes", 100)
Filter::min("min_like_ratio", 0.85)
Filter::min("min_completion_rate", 0.5)
Filter::min("min_comments", 50)
```

### User State Filters

```rust
Filter::unseen()                      // not yet viewed by this user
Filter::user_state("seen")            // already viewed
Filter::user_state("in_progress")     // partially watched
Filter::user_state("saved")           // bookmarked / watch later
Filter::user_state("liked")           // user has liked
Filter::user_state("downloaded")      // available offline
Filter::in_collection("playlist_abc") // in specific collection
Filter::not_blocked()                 // exclude blocked creators
```

### Geographic Filters

```rust
Filter::eq("content_region", "US")
Filter::eq("trending_in_region", "US")
Filter::near_location(lat, lng, radius_km)
```

### Cohort Filters

```rust
Filter::cohort("us_young_music")              // pre-defined cohort
Filter::cohort_predicate(Predicate::and(vec![ // ad-hoc cohort
    Predicate::eq("locale", "en-US"),
    Predicate::eq("age_range", "18-24"),
]))
```

### Social Graph Filters

```rust
Filter::relationship("follows")              // from followed creators only
Filter::social_graph("user_123", Depth::Two) // engaged by user's follows
```

See [USE_CASES.md Appendix A](USE_CASES.md#appendix-a--filter-reference) for the complete filter reference.

---

## Sort Modes

Sort modes are built-in. The application names a mode. The database executes it. No application-side sorting logic.

Sort can be specified as an override on any RETRIEVE query using the `sort` field, or it can be embedded in a ranking profile.

```rust
Sort::Relevance        // text + semantic match (search only)
Sort::Personalized     // user preference match
Sort::New              // created_at DESC
Sort::Old              // created_at ASC
Sort::Hot              // score / (age + 2)^gravity, the Reddit model
Sort::Trending         // pure engagement velocity
Sort::Rising           // velocity relative to baseline, age-boosted
Sort::Controversial    // max(positive * negative signals)
Sort::HiddenGems       // high quality, low reach
Sort::TopAllTime       // cumulative quality, no decay
Sort::TopHour
Sort::TopToday
Sort::TopWeek
Sort::TopMonth
Sort::TopYear
Sort::MostViewed
Sort::MostLiked
Sort::MostCommented
Sort::MostShared
Sort::Shortest         // duration ASC
Sort::Longest          // duration DESC
Sort::AlphabeticalAsc  // title A-Z
Sort::AlphabeticalDesc // title Z-A
Sort::Shuffle          // random, quality-weighted
Sort::LiveViewerCount  // current viewer count DESC
Sort::DateSaved        // when user bookmarked DESC
Sort::CreatorEngagementRate
```

See [USE_CASES.md Appendix B](USE_CASES.md#appendix-b--sort-mode-reference) for the complete sort mode
reference.

---

## Diversity Constraints

Diversity is a post-scoring pass. After candidates are scored, diversity constraints reorder the result set to enforce variety without reducing the result count.

```rust
DiversitySpec {
    // No more than N items from the same creator in the result set.
    max_per_creator: Some(2),

    // Ensure a mix of content formats (video, short, article, etc.)
    format_mix: true,

    // Topic diversity: 0.0 (no enforcement) to 1.0 (maximize diversity).
    // Uses maximal marginal relevance (MMR).
    topic_diversity: Some(0.7),
}
```

Diversity is specified per query or per ranking profile. Query-level diversity overrides the profile default.

---

## Pagination

Cursor-based pagination for stable result sets across pages.

```rust
// First page
let page1 = db.retrieve(Retrieve {
    profile: "for_you",
    for_user: Some("user_123"),
    limit: 50,
    cursor: None,
    ..Default::default()
})?;

// Next page: pass the cursor from the previous response
let page2 = db.retrieve(Retrieve {
    profile: "for_you",
    for_user: Some("user_123"),
    limit: 50,
    cursor: page1.next_cursor,
    ..Default::default()
})?;
```

Alternatively, use `exclude_ids` to exclude previously returned items:

```rust
let page2 = db.retrieve(Retrieve {
    profile: "for_you",
    for_user: Some("user_123"),
    exclude_ids: page1.results.iter().map(|r| r.id.clone()).collect(),
    limit: 50,
    ..Default::default()
})?;
```

---

## Response Format

All queries return a `Results` struct.

```rust
pub struct Results {
    /// Ranked items with scores.
    pub results: Vec<RankedItem>,

    /// Cursor for fetching the next page.
    // The generic parameter was lost in this copy; `Cursor` is an assumed type name.
    pub next_cursor: Option<Cursor>,

    /// Total candidate count before diversity/limit (useful for faceted UI).
    pub total_candidates: u64,
}

pub struct RankedItem {
    /// Entity ID.
    pub id: String,

    /// Final composite score after ranking profile, boosts, penalties, diversity.
    pub score: f64,

    /// Signal snapshot at query time. Lets the application display counts.
    pub signals: SignalSnapshot,
}

pub struct SignalSnapshot {
    /// Raw signal values at query time.
    /// e.g., { "view": { "24h": 12450, "7d": 89200, "all_time": 1230000 } }
    // Generic parameters were lost in this copy; signal name -> window -> value
    // follows the example above, and the `f64` value type is an assumption.
    pub values: HashMap<String, HashMap<String, f64>>,
}
```

The application uses `results` to render the UI. It uses `signals` to display engagement counts (views, likes, etc.). It never re-ranks: the order from tidalDB is the final order.

---

## Lifecycle and Operations

### Shutdown

```rust
// Graceful shutdown: flushes WAL, finalizes materializer, persists ANN index.
db.shutdown()?;
```

### Profile Management

```rust
// List all profiles
let profiles = db.list_profiles()?;

// Get a specific profile (with version)
let profile = db.get_profile("for_you", None)?;    // latest
let profile = db.get_profile("for_you", Some(1))?; // specific version

// A/B test by running the same query against two profiles
let control = db.retrieve(Retrieve {
    profile: "for_you", // uses latest version
    ..query.clone()
})?;
let variant = db.retrieve(Retrieve {
    profile: "for_you_v2", // experimental profile
    ..query.clone()
})?;
```

### Schema Inspection

```rust
// List defined signals
let signals = db.list_signals()?;

// List entity types with their fields
let entities = db.list_entities()?;
```

### Saved Searches

```rust
// Save a search as a persistent feed: the user gets new results over time.
db.save_search(SavedSearch {
    user: "user_123",
    name: "Jazz tutorials",
    query: "jazz tutorial",
    filters: vec![
        Filter::eq("format", "video"),
        Filter::eq("language", "en"),
    ],
})?;

// Query a saved search for new results since last check.
let results = db.retrieve_saved_search("user_123", "Jazz tutorials", since)?;
```

### Collections

```rust
// Create a user collection (playlist, board, etc.)
db.create_collection(Collection {
    id: "playlist_abc",
    owner: "user_123",
    name: "Jazz Favorites",
    visibility: Visibility::Private, // Private, Shared, Public
})?;

// Add an item to a collection
db.add_to_collection("playlist_abc", "item_abc")?;

// Items can belong to multiple collections.
// Public collections are themselves rankable: they appear in browse and search.
```

---

## Summary

| Operation | What the Application Does | What tidalDB Does |
|---|---|---|
| **Ingest content** | Compute embedding, call `write_item` | Index text, insert vector, initialize signals, apply cold start |
| **Record engagement** | Call `signal` with event type | Atomically update signal ledger, user preferences, relationship weights |
| **Serve a feed** | Call `retrieve` with a profile name | Candidate retrieval, scoring, diversity enforcement, pagination |
| **Search** | Embed query, call `search` | BM25 + ANN + fusion + personalization + diversity |
| **Define ranking** | Declare a `ProfileDef` | The database executes the entire ranking pipeline |
| **Handle cold start** | Nothing | Exploration budget and population priors, applied automatically |
| **Handle negative signals** | Call `signal` with skip/hide, or write a `blocked` relationship | Permanent exclusion, preference decay, relationship zeroing |
| **Scope trending by cohort** | Specify cohort name or predicate in query | Cohort-scoped signal aggregation, same ranking profile |
| **Search within scope** | Specify `within` on search query | Intersects text/vector retrieval with scoped candidate set |

One process. One query interface. One operational model.