# API Reference

> **Quick API Reference:** The examples below reflect the current implementation API. Use `cargo doc --manifest-path tidal/Cargo.toml --open` for full documentation.

How developers interact with tidalDB. This document covers initialization, schema definition, write operations, queries, and the feedback loop.

tidalDB is an embeddable Rust library. You link it into your process. There is no separate server, no network protocol, no client SDK. The API is Rust types and method calls.

---

## Table of Contents

- [Initialization](#initialization)
- [Schema Definition](#schema-definition)
  - [Entity Types](#entity-types)
  - [Signal Definitions](#signal-definitions)
  - [Ranking Profiles](#ranking-profiles)
- [Write Path](#write-path)
  - [Ingesting Entities](#ingesting-entities)
  - [Writing Embeddings](#writing-embeddings)
  - [Writing Relationships](#writing-relationships)
  - [Writing Signals](#writing-signals)
- [Query Language](#query-language)
  - [RETRIEVE -- Feeds, Browse, Related](#retrieve--feeds-browse-related)
  - [SEARCH -- Text + Semantic Retrieval](#search--text--semantic-retrieval)
  - [SUGGEST -- Autocomplete and Suggestions](#suggest--autocomplete-and-suggestions)
- [Filters](#filters)
- [Sort Modes](#sort-modes)
- [Diversity Constraints](#diversity-constraints)
- [Pagination](#pagination)
- [Response Format](#response-format)
- [Lifecycle and Operations](#lifecycle-and-operations)

---

## Initialization

Open a database using the builder pattern. Define the schema first, then pass it to the builder.

```rust
use tidaldb::TidalDb;
use tidaldb::schema::{SchemaBuilder, EntityKind, DecaySpec, Window};
use std::time::Duration;

// 1. Define the schema (signal types, text fields, embedding slots).
let mut schema = SchemaBuilder::new();
let _ = schema.signal("view", EntityKind::Item, DecaySpec::Exponential {
    half_life: Duration::from_secs(7 * 24 * 3600),
})
    .windows(&[Window::OneHour, Window::TwentyFourHours, Window::AllTime])
    .velocity(true)
    .add();
let schema = schema.build().expect("valid schema");

// 2a. Ephemeral (in-memory) -- no filesystem access, ideal for testing.
let db = TidalDb::builder()
    .ephemeral()
    .with_schema(schema.clone())
    .open()?;

// 2b. Persistent -- durable storage at the given path.
let db = TidalDb::builder()
    .with_data_dir("/var/lib/tidaldb/my_app")
    .with_schema(schema)
    .open()?;
```

The database is `Send + Sync`. Share it across threads with `Arc<TidalDb>`.

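
The sharing pattern can be sketched with the standard library alone. The example below is only an illustration: `Db` and its `signal` method are hypothetical stand-ins for `TidalDb`, not part of the tidalDB API. Any `Send + Sync` handle is cloned and moved into threads the same way.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `TidalDb`; it only counts writes.
struct Db {
    writes: AtomicU64,
}

impl Db {
    /// Takes `&self`, so one shared handle serves every thread.
    fn signal(&self) {
        self.writes.fetch_add(1, Ordering::Relaxed);
    }
}

/// Spawns `threads` workers that share one `Db` through `Arc` and
/// returns the total number of writes observed after all have joined.
fn write_from_threads(threads: u64, writes_per_thread: u64) -> u64 {
    let db = Arc::new(Db { writes: AtomicU64::new(0) });
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each worker gets its own reference-counted pointer to the same Db.
            let db = Arc::clone(&db);
            thread::spawn(move || {
                for _ in 0..writes_per_thread {
                    db.signal();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    db.writes.load(Ordering::Relaxed)
}
```

With the real library, the same shape would presumably apply: wrap the opened database in `Arc::new(db)` once and hand an `Arc::clone` to each worker.
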

---

## Schema Definition

Schema is defined before opening the database using `SchemaBuilder`. It declares signal types, text fields for full-text search, and embedding slots for vector search.

### Entity Types

Entities are the nodes of the system. Three built-in types: **Item**, **User**, **Creator**. Entity metadata is stored as `HashMap<String, String>` key-value pairs.

### Signal Definitions

Signals are typed, timestamped event streams. Decay, velocity, and windowed aggregation are declared in schema -- not computed in application code.

```rust
use tidaldb::schema::{SchemaBuilder, EntityKind, DecaySpec, Window, TextFieldType};
use std::time::Duration;

let mut schema = SchemaBuilder::new();

// View signal: exponential decay, 7-day half-life, four windows + velocity.
let _ = schema.signal("view", EntityKind::Item, DecaySpec::Exponential {
    half_life: Duration::from_secs(7 * 24 * 3600),
})
    .windows(&[Window::OneHour, Window::TwentyFourHours, Window::SevenDays, Window::AllTime])
    .velocity(true)
    .add();

// Like signal: slower decay (14 days).
let _ = schema.signal("like", EntityKind::Item, DecaySpec::Exponential {
    half_life: Duration::from_secs(14 * 24 * 3600),
})
    .windows(&[Window::TwentyFourHours, Window::SevenDays, Window::AllTime])
    .velocity(true)
    .add();

// Skip signal: fast decay (1 day), no velocity.
let _ = schema.signal("skip", EntityKind::Item, DecaySpec::Exponential {
    half_life: Duration::from_secs(24 * 3600),
})
    .windows(&[Window::OneHour, Window::TwentyFourHours])
    .velocity(false)
    .add();

// Hide signal: permanent (never decays), no windows.
let _ = schema.signal("hide", EntityKind::Item, DecaySpec::Permanent).add();

// Share signal: for trending and social features.
let _ = schema.signal("share", EntityKind::Item, DecaySpec::Exponential {
    half_life: Duration::from_secs(3 * 24 * 3600),
})
    .windows(&[Window::OneHour, Window::TwentyFourHours, Window::AllTime])
    .velocity(true)
    .add();

// Completion signal: long-lived quality metric.
let _ = schema.signal("completion", EntityKind::Item, DecaySpec::Exponential {
    half_life: Duration::from_secs(30 * 24 * 3600),
})
    .windows(&[Window::AllTime])
    .velocity(false)
    .add();

// Text fields for BM25 full-text search.
schema.text_field("title", TextFieldType::Text);
schema.text_field("description", TextFieldType::Text);
schema.text_field("category", TextFieldType::Keyword);
schema.text_field("tags", TextFieldType::Keyword);

// Creator text fields for creator search.
schema.creator_text_field("name", TextFieldType::Text);
schema.creator_text_field("handle", TextFieldType::Keyword);

// Embedding slots for vector search (you provide the vectors).
schema.embedding_slot("content", EntityKind::Item, 128);
schema.embedding_slot("content", EntityKind::Creator, 128);

let schema = schema.build()?;
```

**Decay types:**

| Decay | Behavior |
|---|---|
| `Exponential { half_life }` | Signal weight halves every `half_life` duration |
| `Linear { lifetime }` | Signal weight drops linearly to zero over `lifetime` |
| `Permanent` | Never decays -- hides, blocks, follows |

The full signal reference is in [USE_CASES.md Appendix C](USE_CASES.md#appendix-c--signal-reference).

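
The decay behaviors in the table can be written out as weight multipliers. This is a minimal sketch of the arithmetic only, not tidalDB's internal code; the function names `exponential_weight` and `linear_weight` are hypothetical.

```rust
use std::time::Duration;

/// Multiplier for an event of the given age under exponential decay:
/// 1.0 at age zero, halving once per `half_life`.
fn exponential_weight(age: Duration, half_life: Duration) -> f64 {
    let halvings = age.as_secs_f64() / half_life.as_secs_f64();
    0.5_f64.powf(halvings)
}

/// Multiplier under linear decay: falls from 1.0 at age zero to 0.0
/// at `lifetime`, and stays at 0.0 afterwards.
fn linear_weight(age: Duration, lifetime: Duration) -> f64 {
    let fraction = age.as_secs_f64() / lifetime.as_secs_f64();
    (1.0 - fraction).max(0.0)
}
```

For example, with the 7-day half-life used by the `view` signal above, an event that is 14 days old would count for a quarter of its original weight. `Permanent` corresponds to a constant multiplier of 1.0.
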
### Ranking Profiles

tidalDB ships 25 built-in ranking profiles. The application says `profile("trending")`. The database executes the entire pipeline.

The built-in profiles are: `trending`, `hot`, `new`, `for_you`, `following`, `related`, `notification`, `search`, `top_week`, `top_month`, `top_all_time`, `hidden_gems`, `controversial`, `most_viewed`, `most_liked`, `shuffle`, `cohort_trending`, `live`, `alphabetical_asc`, `alphabetical_desc`, `shortest`, `longest`, `most_commented`, `most_shared`, `date_saved`.

See [ai-lookup/services/ranking-profiles.md](ai-lookup/services/ranking-profiles.md) for the full list of built-in profiles.

### Cohort Definitions

Cohorts are named predicates over user attributes. They define audience segments for scoped signal aggregation and trending.

```rust
use tidaldb::schema::EntityId;

// Define a cohort via db.define_cohort() after opening.
// Cohort signal aggregation happens at signal write time.
// Use RetrieveBuilder::cohort("us_young_music") to scope queries.
```

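
To make "named predicate over user attributes" concrete, here is a standalone sketch. It does not use the tidalDB API, and the arguments of `db.define_cohort()` are not shown in this document. The attribute keys (`country`, `age`, `interests`) and the 18 to 24 age range are assumptions chosen to match the cohort name `us_young_music`.

```rust
use std::collections::HashMap;

/// User attributes as string key-value pairs (assumed shape).
type Attrs = HashMap<String, String>;

/// Hypothetical membership test for a cohort like "us_young_music":
/// US-based, aged 18 to 24, with "music" among comma-separated interests.
fn in_us_young_music(user: &Attrs) -> bool {
    let country_ok = user.get("country").map(String::as_str) == Some("US");
    let age_ok = user
        .get("age")
        .and_then(|age| age.parse::<u32>().ok())
        .map_or(false, |age| (18..=24).contains(&age));
    let interest_ok = user
        .get("interests")
        .map_or(false, |list| list.split(',').any(|tag| tag.trim() == "music"));
    country_ok && age_ok && interest_ok
}
```

A user missing any of the three attributes is simply outside the cohort in this sketch.
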

---

## Write Path

### Ingesting Entities

Items enter the system with metadata as `HashMap<String, String>` key-value pairs. The application provides the embedding -- tidalDB does not generate vectors.

```rust
use std::collections::HashMap;
use tidaldb::schema::EntityId;

let mut metadata = HashMap::new();
metadata.insert("title".to_string(), "Introduction to Jazz Piano".to_string());
metadata.insert("description".to_string(), "A beginner's guide...".to_string());
metadata.insert("category".to_string(), "music".to_string());
metadata.insert("tags".to_string(), "jazz,piano,tutorial,beginner".to_string());
metadata.insert("format".to_string(), "video".to_string());
metadata.insert("language".to_string(), "en".to_string());
metadata.insert("duration".to_string(), "1320".to_string()); // seconds
metadata.insert("creator_id".to_string(), "100".to_string());

db.write_item_with_metadata(EntityId::new(1), &metadata)?;
```

On commit, the item is:

1. Stored in the entity store
2. Indexed by its text fields in the inverted index (BM25)
3. Inserted into bitmap and range indexes for filtering
4. Added to the universe bitmap for RETRIEVE queries
5. **Immediately queryable**

### Writing Embeddings

Embeddings are written separately from metadata. tidalDB L2-normalizes and indexes them into the HNSW vector index.

```rust
use tidaldb::schema::EntityId;

// Item embedding (you compute this externally).
let embedding: Vec<f32> = compute_embedding("Introduction to Jazz Piano");
db.write_item_embedding(EntityId::new(1), &embedding)?;

// Creator embedding.
let creator_embedding: Vec<f32> = compute_creator_embedding("Jazz Academy");
db.write_creator_embedding(EntityId::new(100), &creator_embedding)?;
```

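
For readers unfamiliar with the term, L2 normalization scales a vector to unit Euclidean length. The sketch below shows the arithmetic under the assumption that a zero vector is left unchanged. How tidalDB treats that edge case is not stated here, and `l2_normalize` is a hypothetical helper, not a library function.

```rust
/// Scales `v` so that its Euclidean (L2) norm is 1.0.
/// A zero vector has no direction, so it is returned unchanged.
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}
```

Because the database normalizes on write, the application does not need to call anything like this itself; the sketch only shows what happens to the vector before indexing.
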
### Writing Relationships

Relationships are directional edges between entities (follows, blocks). Used for the `following` profile and blocked-creator filtering.

```rust
use tidaldb::schema::EntityId;
use tidaldb::schema::Timestamp;

// User follows a creator.
db.write_relationship(
    EntityId::new(123), // user
    EntityId::new(100), // creator
    "follows",
    1.0,                // weight
    Timestamp::now(),
)?;
```

### Writing Signals

Signals are how the feedback loop closes. A single signal write atomically updates:

1. The item's signal ledger (windowed aggregates, velocity, decay score)
2. The WAL (write-ahead log) for durability

```rust
use tidaldb::schema::{EntityId, Timestamp};

// User viewed an item.
db.signal("view", EntityId::new(1), 1.0, Timestamp::now())?;

// User completed 94% of the video.
db.signal("completion", EntityId::new(1), 0.94, Timestamp::now())?;

// User liked an item.
db.signal("like", EntityId::new(1), 1.0, Timestamp::now())?;

// User skipped after 3 seconds (strong negative).
db.signal("skip", EntityId::new(2), 1.0, Timestamp::now())?;

// User tapped "Not interested" (permanent negative on this item).
db.signal("hide", EntityId::new(2), 1.0, Timestamp::now())?;
```

For signals with user context (updates preference vectors, seen state, interaction weights):

```rust
use tidaldb::schema::{EntityId, Timestamp};

db.signal_with_context(
    "view",
    EntityId::new(1), // item
    1.0,              // weight
    Timestamp::now(),
    Some(123),        // for_user
    Some(100),        // creator_id
)?;
```

The next ranking query -- even 100ms later -- reflects the updated state.

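
One common way to keep a decay score current without rescanning past events is to store a single value plus the time it was last touched, and decay it forward on each write or read. The sketch below illustrates that idea for exponential decay. `DecayedScore` is a hypothetical type, timestamps are plain seconds, and nothing here is claimed to match tidalDB's ledger layout.

```rust
/// Hypothetical accumulator: one decayed score and its last update time.
struct DecayedScore {
    score: f64,
    last_update_secs: f64,
}

impl DecayedScore {
    fn new() -> Self {
        DecayedScore { score: 0.0, last_update_secs: 0.0 }
    }

    /// Returns the score decayed forward to `now_secs` without mutating state.
    /// Timestamps earlier than the last update are treated as zero elapsed time.
    fn read(&self, now_secs: f64, half_life_secs: f64) -> f64 {
        let elapsed = (now_secs - self.last_update_secs).max(0.0);
        self.score * 0.5_f64.powf(elapsed / half_life_secs)
    }

    /// Decays the stored score to `now_secs`, then adds the new event's weight.
    fn record(&mut self, weight: f64, now_secs: f64, half_life_secs: f64) {
        self.score = self.read(now_secs, half_life_secs) + weight;
        self.last_update_secs = now_secs;
    }
}
```

Each write costs constant time regardless of how many events came before, which is one reason a query issued shortly after a write can see the updated score.
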

---

## Query Language

Three operations: **RETRIEVE** (feed generation, browse, related), **SEARCH** (text + semantic retrieval), **SUGGEST** (autocomplete).

All queries return ranked results with scores. The application renders -- it never re-ranks.

### RETRIEVE -- Feeds, Browse, Related

RETRIEVE generates ranked content lists. It handles personalized feeds, category browse, trending, following, related content, and every other surface described in [USE_CASES.md](USE_CASES.md).

```rust
use tidaldb::query::retrieve::Retrieve;
use tidaldb::schema::EntityId;

// Personalized For You feed.
let query = Retrieve::builder()
    .for_user(123)
    .profile("for_you")
    .limit(50)
    .build()?;
let results = db.retrieve(&query)?;
```

```rust
// Trending globally.
let query = Retrieve::builder()
    .profile("trending")
    .limit(25)
    .build()?;
let results = db.retrieve(&query)?;
```

```rust
use tidaldb::storage::indexes::filter::FilterExpr;

// Trending in a category.
let query = Retrieve::builder()
    .profile("trending")
    .filter(FilterExpr::eq("category", "jazz"))
    .limit(25)
    .build()?;
let results = db.retrieve(&query)?;
```

```rust
// Trending within a cohort -- what's hot among US young music fans.
let query = Retrieve::builder()
    .profile("cohort_trending")
    .cohort("us_young_music")
    .limit(25)
    .build()?;
let results = db.retrieve(&query)?;
```

```rust
// Following feed -- content from followed creators.
let query = Retrieve::builder()
    .for_user(123)
    .profile("following")
    .limit(50)
    .build()?;
let results = db.retrieve(&query)?;
```

```rust
use tidaldb::ranking::diversity::DiversityConstraints;

// Related content / Up Next -- anchored to a specific item.
let query = Retrieve::builder()
    .for_user(123)
    .profile("related")
    .similar_to(EntityId::new(1))
    .diversity(DiversityConstraints::new().max_per_creator(1))
    .limit(10)
    .build()?;
let results = db.retrieve(&query)?;
```

```rust
// Browse category with explicit sort mode.
let query = Retrieve::builder()
    .profile("top_week")
    .filter(FilterExpr::eq("category", "jazz"))
    .limit(20)
    .build()?;
let results = db.retrieve(&query)?;
```

```rust
// Hidden gems -- high quality, low reach.
let query = Retrieve::builder()
    .profile("hidden_gems")
    .limit(20)
    .build()?;
let results = db.retrieve(&query)?;
```

```rust
// Exclude previously seen items.
let query = Retrieve::builder()
    .for_user(123)
    .profile("for_you")
    .exclude(vec![EntityId::new(1), EntityId::new(2)])
    .limit(50)
    .build()?;
let results = db.retrieve(&query)?;
```

```rust
// Creator profile -- items from a specific creator.
let query = Retrieve::builder()
    .profile("new")
    .for_creator(EntityId::new(100))
    .limit(20)
    .build()?;
let results = db.retrieve(&query)?;
```

```rust
// Notification prioritization.
let query = Retrieve::builder()
    .for_user(123)
    .profile("notification")
    .limit(20)
    .build()?;
let results = db.retrieve(&query)?;
```

### SEARCH -- Text + Semantic Retrieval

Search combines full-text BM25 relevance with semantic similarity via RRF (Reciprocal Rank Fusion). Text relevance is the floor -- an irrelevant result never surfaces just because the user likes the creator.

```rust
use tidaldb::query::search::Search;

// Basic keyword search, personalized for this user.
let query = Search::builder()
    .query("rust tutorial beginner")
    .for_user(123)
    .limit(20)
    .build()?;
let results = db.search(&query)?;
```

```rust
// Hybrid search: text + vector.
let query_embedding: Vec<f32> = embed("rust tutorial beginner");
let query = Search::builder()
    .query("rust tutorial beginner")
    .vector(query_embedding)
    .for_user(123)
    .limit(20)
    .build()?;
let results = db.search(&query)?;
```

```rust
// Creator search.
use tidaldb::schema::EntityKind;

let query = Search::builder()
    .query("jazz piano")
    .entity_kind(EntityKind::Creator)
    .limit(10)
    .build()?;
let results = db.search(&query)?;
```

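
Reciprocal Rank Fusion, named at the start of this section, merges ranked lists using only rank positions: each list contributes `1 / (k + rank)` to an item's fused score. The sketch below shows the standard formula in isolation. The function name `rrf_fuse`, the use of plain `u64` IDs, and the constant `k = 60` in the usage note are assumptions; tidalDB's actual constant and its relevance floor and personalization steps are not modeled.

```rust
use std::collections::HashMap;

/// Fuses ranked ID lists with Reciprocal Rank Fusion.
/// Ranks are 1-based; an item absent from a list contributes nothing for it.
fn rrf_fuse(lists: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for list in lists {
        for (index, id) in list.iter().enumerate() {
            let rank = index as f64 + 1.0;
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by ID so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

Fusing a BM25 ranking `[1, 2, 3]` with a vector ranking `[2, 3, 4]` at `k = 60.0` puts item 2 first, since it ranks near the top of both lists, followed by 3, 1, and 4.
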
### Query Composition -- SEARCH within Scoped Results

SEARCH can be composed with scope constraints. This enables searching within trending, within a cohort, or within any candidate set.

```rust
use tidaldb::query::search::{Search, WithinScope};

// Search within globally trending items.
let query = Search::builder()
    .query("jazz piano")
    .within(WithinScope::Trending { window_hours: 24 })
    .limit(20)
    .build()?;
let results = db.search(&query)?;
```

```rust
// Search within cohort-scoped trending.
let query = Search::builder()
    .query("jazz piano")
    .within(WithinScope::CohortTrending {
        cohort: "us_young_music".into(),
        window_hours: 24,
    })
    .limit(20)
    .build()?;
let results = db.search(&query)?;
```

```rust
// Search within a user's following feed.
let query = Search::builder()
    .query("jazz piano")
    .for_user(123)
    .within(WithinScope::Following)
    .limit(20)
    .build()?;
let results = db.search(&query)?;
```

**`WithinScope`:**

| Scope | Candidate Set |
|---|---|
| `Trending { window_hours }` | Items with high global velocity in window |
| `CohortTrending { cohort, window_hours }` | Items with high velocity among cohort members |
| `Following` | Items from followed creators (requires `for_user`) |
| `Category { name }` | Items in a category |
| `Collection { id }` | Items in a collection |

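
Conceptually, a scoped search keeps only those search hits that also belong to the scope's candidate set. The sketch below shows that intersection with a `HashSet`. The function `search_within` and the tuple representation of hits are illustrative assumptions, and a real engine may well apply the scope before scoring rather than after.

```rust
use std::collections::HashSet;

/// Keeps the hits whose ID is in `scope`, preserving rank order,
/// and returns at most `limit` of them.
fn search_within(
    ranked_hits: &[(u64, f64)],
    scope: &HashSet<u64>,
    limit: usize,
) -> Vec<(u64, f64)> {
    ranked_hits
        .iter()
        .filter(|(id, _)| scope.contains(id))
        .take(limit)
        .copied()
        .collect()
}
```

The relative order of the surviving hits is unchanged, so text and vector relevance still decide the ranking inside the scope.
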
### SUGGEST -- Autocomplete and Suggestions

```rust
use tidaldb::query::suggest::Suggest;

// Autocomplete on partial query.
let req = Suggest { prefix: "jazz pia".into(), for_user: None, limit: 5 };
let suggestions = db.suggest(&req)?;
// Returns Vec<Suggestion> with text and frequency.

// Trending searches (empty prefix).
let req = Suggest { prefix: "".into(), for_user: None, limit: 10 };
let trending = db.suggest(&req)?;
```

---

## Filters

All filters are composable. Any combination of filters produces a valid, efficiently-executed query. Filters use the `FilterExpr` type from `tidaldb::storage::indexes::filter::FilterExpr`.

### Content Attribute Filters

```rust
use tidaldb::storage::indexes::filter::FilterExpr;

FilterExpr::eq("category", "jazz")  // exact match on category
FilterExpr::eq("format", "video")   // exact match on format
FilterExpr::eq("tags", "tutorial")  // tag match
```

### Engagement Threshold Filters

```rust
FilterExpr::MinSignal { signal: "view".into(), threshold: 10000.0 }
FilterExpr::MaxSignal { signal: "view".into(), threshold: 5000.0 }
```

### Geographic Filters

```rust
FilterExpr::NearLocation { lat: 40.7128, lng: -74.0060, radius_km: 50.0 }
```

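
A radius filter like `NearLocation` needs a distance between two latitude/longitude pairs. The haversine formula is a common choice, sketched below. Whether tidalDB uses haversine, a spatial index, or some other approximation is not stated in this document, and the names `haversine_km` and `within_radius` are hypothetical.

```rust
/// Mean Earth radius in kilometers (spherical approximation).
const EARTH_RADIUS_KM: f64 = 6371.0;

/// Great-circle distance in kilometers between two points given in degrees.
fn haversine_km(lat1: f64, lng1: f64, lat2: f64, lng2: f64) -> f64 {
    let (phi1, phi2) = (lat1.to_radians(), lat2.to_radians());
    let d_phi = (lat2 - lat1).to_radians();
    let d_lambda = (lng2 - lng1).to_radians();
    let a = (d_phi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (d_lambda / 2.0).sin().powi(2);
    // Clamp guards against rounding pushing the value just above 1.0.
    2.0 * EARTH_RADIUS_KM * a.sqrt().min(1.0).asin()
}

/// True when the item lies within `radius_km` of the query center.
fn within_radius(center: (f64, f64), item: (f64, f64), radius_km: f64) -> bool {
    haversine_km(center.0, center.1, item.0, item.1) <= radius_km
}
```

With the center from the example above (40.7128, -74.0060) and a 50 km radius, an item located in Philadelphia (roughly 130 km away) would fall outside the filter.
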
### Collection Filters

```rust
use tidaldb::entities::CollectionId;

FilterExpr::InCollection(CollectionId::new(42))
```

See [USE_CASES.md Appendix A](USE_CASES.md#appendix-a--filter-reference) for the complete filter reference.

---

## Sort Modes

Sort modes are embedded in ranking profiles. The application names a profile. The database executes the ranking pipeline. 25 built-in profiles cover the most common sort needs.

| Profile | Sort Mode |
|---|---|
| `new` | `created_at` DESC |
| `trending` | Engagement velocity |
| `hot` | Score / (age + 2)^gravity |
| `top_week` / `top_month` / `top_all_time` | Cumulative quality by window |
| `most_viewed` / `most_liked` | Signal count by window |
| `most_commented` / `most_shared` | Signal count (AllTime) |
| `hidden_gems` | High quality, low reach |
| `controversial` | max(positive * negative signals) |
| `shuffle` | Random, quality-weighted |
| `live` | Live viewer count DESC |
| `date_saved` | When user bookmarked DESC |
| `alphabetical_asc` / `alphabetical_desc` | Title A-Z / Z-A |
| `shortest` / `longest` | Duration ASC / DESC |

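
The `hot` row gives a formula that is easy to evaluate by hand. The sketch below writes it out, assuming age is measured in hours and that `gravity` is a tunable exponent. Neither the unit nor the default gravity is specified in this document, and `hot_score` is a hypothetical name.

```rust
/// Score / (age + 2)^gravity, with age in hours (assumed unit).
/// A higher gravity makes older items fall faster.
fn hot_score(score: f64, age_hours: f64, gravity: f64) -> f64 {
    score / (age_hours + 2.0).powf(gravity)
}
```

The `+ 2` offset keeps brand-new items from dividing by a value near zero, so two items with the same raw score are ordered by age rather than dominated by the newest one.
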

See [USE_CASES.md Appendix B](USE_CASES.md#appendix-b--sort-mode-reference) for the complete sort mode reference.

---

## Diversity Constraints

Diversity is a post-scoring pass. After candidates are scored, diversity constraints reorder the result set to enforce variety -- without reducing the result count.

```rust
use tidaldb::query::retrieve::Retrieve;
use tidaldb::ranking::diversity::DiversityConstraints;

let diversity = DiversityConstraints::new()
    .max_per_creator(2)         // No more than 2 items per creator
    .max_format_fraction(0.4);  // No format > 40% of results

let query = Retrieve::builder()
    .profile("for_you")
    .for_user(123)
    .diversity(diversity)
    .limit(50)
    .build()?;
```

Diversity is specified per query or per ranking profile. Query-level diversity overrides the profile default.

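
One simple way to reorder for variety without dropping results is a greedy pass that defers items over the per-creator cap to the tail of the list. The sketch below shows that idea only. `Scored` and `cap_per_creator` are hypothetical, and tidalDB's actual diversity algorithm (including how it handles `max_format_fraction`) may differ.

```rust
use std::collections::HashMap;

/// Hypothetical scored candidate, already sorted best-first by the caller.
#[derive(Debug, Clone, PartialEq)]
struct Scored {
    item: u64,
    creator: u64,
    score: f64,
}

/// Keeps at most `max_per_creator` items per creator in rank order and
/// moves the overflow to the end, so the result count never shrinks.
fn cap_per_creator(ranked: Vec<Scored>, max_per_creator: usize) -> Vec<Scored> {
    let mut seen: HashMap<u64, usize> = HashMap::new();
    let mut kept = Vec::with_capacity(ranked.len());
    let mut deferred = Vec::new();
    for entry in ranked {
        let count = seen.entry(entry.creator).or_insert(0);
        if *count < max_per_creator {
            *count += 1;
            kept.push(entry);
        } else {
            deferred.push(entry);
        }
    }
    kept.extend(deferred);
    kept
}
```

When the candidate pool is too small to satisfy the cap within the requested limit, deferred items reappear near the end of the list. A caller could report that case, much as the `constraints_satisfied` flag in the response does.
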

---

## Pagination

Cursor-based pagination for stable result sets across pages.

```rust
use tidaldb::query::retrieve::Retrieve;

// First page.
let query = Retrieve::builder()
    .for_user(123)
    .profile("for_you")
    .limit(50)
    .build()?;
let page1 = db.retrieve(&query)?;

// Next page -- pass the cursor from the previous response.
if let Some(cursor) = page1.next_cursor {
    let query = Retrieve::builder()
        .for_user(123)
        .profile("for_you")
        .cursor(cursor)
        .limit(50)
        .build()?;
    let page2 = db.retrieve(&query)?;
}
```

Alternatively, pass the IDs of previously returned items to `exclude`:

```rust
let seen_ids: Vec<_> = page1.items.iter().map(|r| r.entity_id).collect();
let query = Retrieve::builder()
    .for_user(123)
    .profile("for_you")
    .exclude(seen_ids)
    .limit(50)
    .build()?;
let page2 = db.retrieve(&query)?;
```

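
The contents of tidalDB's `Cursor` are not described in this document, so the following is only a generic illustration of how cursor (keyset) pagination tends to work: the cursor records the sort key of the last item returned, and the next page starts strictly after it. The `Cursor` fields and the `page` function here are hypothetical.

```rust
/// Hypothetical cursor: the sort key of the last item on the previous page.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Cursor {
    score: f64,
    id: u64,
}

/// Returns up to `limit` items that come strictly after `after`.
/// `ranked` must be sorted by score descending, then ID ascending.
fn page(
    ranked: &[(u64, f64)],
    after: Option<Cursor>,
    limit: usize,
) -> (Vec<(u64, f64)>, Option<Cursor>) {
    let items: Vec<(u64, f64)> = ranked
        .iter()
        .copied()
        .filter(|&(id, score)| match after {
            None => true,
            Some(c) => score < c.score || (score == c.score && id > c.id),
        })
        .take(limit)
        .collect();
    // A full page may have more results behind it; a short page is the last one.
    let next = if items.len() == limit {
        items.last().map(|&(id, score)| Cursor { score, id })
    } else {
        None
    };
    (items, next)
}
```

Including the ID in the key breaks ties between equal scores, so no item is skipped or repeated at a page boundary. Unlike the `exclude` approach, the request size does not grow with the number of pages already fetched.
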

---

## Response Format

### RETRIEVE Response

```rust
pub struct Results {
    /// Ranked items with scores.
    pub items: Vec<RetrieveResult>,
    /// Cursor for fetching the next page.
    pub next_cursor: Option<Cursor>,
    /// Total candidate count before diversity/limit.
    pub total_candidates: usize,
    /// Whether all diversity constraints were satisfied.
    pub constraints_satisfied: bool,
    /// Warnings generated during query execution.
    pub warnings: Vec<String>,
    /// The degradation level under which this query was executed.
    pub degradation_level: DegradationLevel,
}

pub struct RetrieveResult {
    /// Entity ID.
    pub entity_id: EntityId,
    /// Normalized score in [0.0, 1.0].
    pub score: f64,
    /// 1-based rank.
    pub rank: usize,
    /// Signal values that contributed to this score.
    pub signals: Vec<Signal>,
}
```

### SEARCH Response

```rust
pub struct SearchResults {
    pub items: Vec<SearchResultItem>,
    pub next_cursor: Option<Cursor>,
    pub total_candidates: usize,
    pub constraints_satisfied: bool,
    pub warnings: Vec<String>,
    pub degradation_level: DegradationLevel,
}

pub struct SearchResultItem {
    pub entity_id: EntityId,
    pub score: f64,
    pub rank: usize,
    pub bm25_score: Option<f32>,
    pub semantic_score: Option<f32>,
    pub signals: Vec<Signal>,
    pub metadata: Option<HashMap<String, String>>,
}
```

The application uses `items` to render the UI. It uses `signals` to display engagement counts (views, likes, etc.). It never re-ranks -- the order from tidalDB is the final order.

---

## Lifecycle and Operations

### Shutdown

```rust
// Graceful shutdown -- flushes WAL, checkpoints signal state, persists indexes.
db.close()?;
// Or equivalently:
db.shutdown()?;
```

### Health Check

```rust
db.health_check()?; // Returns Ok(()) if operational.
```

### Item Count

```rust
let count: u64 = db.item_count(); // Number of items in the universe bitmap.
```

### Reading Signal State

```rust
use tidaldb::schema::{EntityId, Window};

// Read decay score (applies lazy decay to current time).
let score: Option<f64> = db.read_decay_score(EntityId::new(1), "view", 0)?;

// Read windowed event count.
let count: u64 = db.read_windowed_count(EntityId::new(1), "view", Window::OneHour)?;

// Read velocity (events per second).
let velocity: f64 = db.read_velocity(EntityId::new(1), "view", Window::OneHour)?;
```

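
To show how a windowed count and a velocity in events per second relate, here is a small sliding-window counter. It is a sketch under stated assumptions: `WindowCounter` is a hypothetical type, timestamps are whole seconds recorded in non-decreasing order, and velocity is taken as the windowed count divided by the window length. tidalDB's own bucketing and velocity definition may differ.

```rust
use std::collections::VecDeque;

/// Hypothetical sliding-window counter over event timestamps in seconds.
struct WindowCounter {
    window_secs: u64,
    events: VecDeque<u64>,
}

impl WindowCounter {
    fn new(window_secs: u64) -> Self {
        WindowCounter { window_secs, events: VecDeque::new() }
    }

    /// Records one event; timestamps are assumed to arrive in order.
    fn record(&mut self, ts_secs: u64) {
        self.events.push_back(ts_secs);
    }

    /// Counts events in the half-open window (now - window, now].
    fn count(&mut self, now_secs: u64) -> u64 {
        while let Some(&oldest) = self.events.front() {
            if oldest + self.window_secs <= now_secs {
                self.events.pop_front();
            } else {
                break;
            }
        }
        self.events.len() as u64
    }

    /// Average events per second over the window.
    fn velocity(&mut self, now_secs: u64) -> f64 {
        self.count(now_secs) as f64 / self.window_secs as f64
    }
}
```

Under this definition, 3,600 views inside a one-hour window correspond to a velocity of 1.0 events per second.
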
### Saved Searches

```rust
use tidaldb::schema::{EntityId, Timestamp};

// Save a search as a persistent feed.
db.save_search(EntityId::new(123), "Jazz tutorials", "jazz tutorial", None)?;

// Query a saved search for new results since a timestamp.
// `since` is a timestamp the application recorded earlier (not defined in this snippet).
let results = db.retrieve_saved_search(EntityId::new(123), "Jazz tutorials", Some(since))?;

// List all saved searches for a user.
let searches = db.list_saved_searches(EntityId::new(123))?;

// Delete a saved search.
db.delete_saved_search(EntityId::new(123), "Jazz tutorials")?;
```

### Collections

```rust
use tidaldb::schema::EntityId;
use tidaldb::entities::collection::Visibility;

// Create a user collection (playlist, board, etc.)
let collection_id = db.create_collection(EntityId::new(123), "Jazz Favorites", Visibility::Private)?;

// Add an item to a collection.
db.add_to_collection(collection_id, EntityId::new(1))?;

// Remove an item from a collection.
db.remove_from_collection(collection_id, EntityId::new(1))?;

// List collections for a user.
let collections = db.list_collections(EntityId::new(123))?;
```

### Text Index Management

```rust
// Force a synchronous commit and reload of the text index.
// Useful in tests after writing items to make them immediately searchable.
db.flush_text_index()?;
db.flush_creator_text_index()?;

// Manual reload (for ephemeral mode).
db.reload_text_index()?;
```

---

## Summary

| Operation | What the Application Does | What tidalDB Does |
|---|---|---|
| **Ingest content** | Compute embedding, call `write_item_with_metadata` + `write_item_embedding` | Index text, insert vector, initialize signals, apply cold start |
| **Record engagement** | Call `signal` with event type | Update signal ledger, WAL-backed durability |
| **Record engagement with context** | Call `signal_with_context` with user/creator IDs | Update ledger + user preferences + interaction weights + cohort attribution |
| **Serve a feed** | Call `retrieve` with a profile name | Candidate retrieval, scoring, diversity enforcement, pagination |
| **Search** | Embed query, call `search` | BM25 + ANN + RRF fusion + personalization + diversity |
| **Handle cold start** | Nothing | Exploration budget, population priors -- automatic |
| **Handle negative signals** | Call `signal` with skip/hide | Preference decay, exclusion in future queries |
| **Scope trending by cohort** | Specify cohort name in retrieve query | Cohort-scoped signal aggregation, same ranking profile |
| **Search within scope** | Specify `within` on search query | Intersects text/vector retrieval with scoped candidate set |

One process. One query interface. One operational model.