API Reference
Quick API Reference: The examples below reflect the current implementation API. Use
cargo doc --manifest-path tidal/Cargo.toml --open for full documentation.
How developers interact with tidalDB. This document covers initialization, schema definition, write operations, queries, and the feedback loop.
tidalDB is an embeddable Rust library. You link it into your process. There is no separate server, no network protocol, no client SDK. The API is Rust types and method calls.
Table of Contents
- Initialization
- Schema Definition
- Write Path
- Query Language
- Filters
- Sort Modes
- Diversity Constraints
- Pagination
- Response Format
- Lifecycle and Operations
Initialization
Open a database using the builder pattern. Define the schema first, then pass it to the builder.
use tidaldb::TidalDb;
use tidaldb::schema::{SchemaBuilder, EntityKind, DecaySpec, Window};
use std::time::Duration;
// 1. Define the schema (signal types, text fields, embedding slots).
let mut schema = SchemaBuilder::new();
let _ = schema.signal("view", EntityKind::Item, DecaySpec::Exponential {
half_life: Duration::from_secs(7 * 24 * 3600),
})
.windows(&[Window::OneHour, Window::TwentyFourHours, Window::AllTime])
.velocity(true)
.add();
let schema = schema.build().expect("valid schema");
// 2a. Ephemeral (in-memory) -- no filesystem access, ideal for testing.
let db = TidalDb::builder()
.ephemeral()
.with_schema(schema.clone())
.open()?;
// 2b. Persistent -- durable storage at the given path.
let db = TidalDb::builder()
.with_data_dir("/var/lib/tidaldb/my_app")
.with_schema(schema)
.open()?;
The database is Send + Sync. Share it across threads with Arc<TidalDb>.
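Because the handle is Send + Sync, the usual Rust pattern of cloning an Arc into each worker thread applies. The sketch below illustrates that pattern with a hypothetical stand-in type `Db` (not the real TidalDb, whose internals are not shown here) so that it compiles on its own. With the real crate, the `Arc::clone` and `thread::spawn` lines would presumably look the same.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a Send + Sync database handle.
struct Db {
    writes: AtomicU64,
}

impl Db {
    fn new() -> Self {
        Db { writes: AtomicU64::new(0) }
    }

    /// Stand-in for a write such as `db.signal(...)`; takes `&self`, so no lock is needed.
    fn record(&self) {
        self.writes.fetch_add(1, Ordering::Relaxed);
    }
}

/// Spawns `workers` threads that each perform `per_worker` writes through a shared handle.
fn run_workers(workers: usize, per_worker: u64) -> u64 {
    let db = Arc::new(Db::new());
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let db = Arc::clone(&db); // one cheap reference-count bump per thread
            thread::spawn(move || {
                for _ in 0..per_worker {
                    db.record();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    db.writes.load(Ordering::Relaxed)
}
```

Calling `run_workers(4, 1000)` performs 4000 writes in total without wrapping the handle in a Mutex.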
Schema Definition
Schema is defined before opening the database using SchemaBuilder. It declares signal types, text fields for full-text search, and embedding slots for vector search.
Entity Types
Entities are the nodes of the system. Three built-in types: Item, User, Creator. Entity metadata is stored as HashMap<String, String> key-value pairs.
Signal Definitions
Signals are typed, timestamped event streams. Decay, velocity, and windowed aggregation are declared in schema -- not computed in application code.
use tidaldb::schema::{SchemaBuilder, EntityKind, DecaySpec, Window, TextFieldType};
use std::time::Duration;
let mut schema = SchemaBuilder::new();
// View signal: exponential decay, 7-day half-life, three windows + velocity.
let _ = schema.signal("view", EntityKind::Item, DecaySpec::Exponential {
half_life: Duration::from_secs(7 * 24 * 3600),
})
.windows(&[Window::OneHour, Window::TwentyFourHours, Window::SevenDays, Window::AllTime])
.velocity(true)
.add();
// Like signal: slower decay (14 days).
let _ = schema.signal("like", EntityKind::Item, DecaySpec::Exponential {
half_life: Duration::from_secs(14 * 24 * 3600),
})
.windows(&[Window::TwentyFourHours, Window::SevenDays, Window::AllTime])
.velocity(true)
.add();
// Skip signal: fast decay (1 day), no velocity.
let _ = schema.signal("skip", EntityKind::Item, DecaySpec::Exponential {
half_life: Duration::from_secs(24 * 3600),
})
.windows(&[Window::OneHour, Window::TwentyFourHours])
.velocity(false)
.add();
// Hide signal: permanent (never decays), no windows.
let _ = schema.signal("hide", EntityKind::Item, DecaySpec::Permanent).add();
// Share signal: for trending and social features.
let _ = schema.signal("share", EntityKind::Item, DecaySpec::Exponential {
half_life: Duration::from_secs(3 * 24 * 3600),
})
.windows(&[Window::OneHour, Window::TwentyFourHours, Window::AllTime])
.velocity(true)
.add();
// Completion signal: long-lived quality metric.
let _ = schema.signal("completion", EntityKind::Item, DecaySpec::Exponential {
half_life: Duration::from_secs(30 * 24 * 3600),
})
.windows(&[Window::AllTime])
.velocity(false)
.add();
// Text fields for BM25 full-text search.
schema.text_field("title", TextFieldType::Text);
schema.text_field("description", TextFieldType::Text);
schema.text_field("category", TextFieldType::Keyword);
schema.text_field("tags", TextFieldType::Keyword);
// Creator text fields for creator search.
schema.creator_text_field("name", TextFieldType::Text);
schema.creator_text_field("handle", TextFieldType::Keyword);
// Embedding slots for vector search (you provide the vectors).
schema.embedding_slot("content", EntityKind::Item, 128);
schema.embedding_slot("content", EntityKind::Creator, 128);
let schema = schema.build()?;
Decay types:
| Decay | Behavior |
|---|---|
| Exponential { half_life } | Signal weight halves every half_life duration |
| Linear { lifetime } | Signal weight drops linearly to zero over lifetime |
| Permanent | Never decays -- hides, blocks, follows |
The full signal reference is in USE_CASES.md Appendix C.
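To make the three decay behaviors concrete, here is a minimal sketch of the weight multiplier each one implies for an event of a given age. The `Decay` enum and `decay_factor` function are illustrative names, not tidalDB types, and the formulas are simply the textbook reading of "halves every half_life" and "drops linearly to zero over lifetime". Non-zero durations are assumed.

```rust
use std::time::Duration;

/// Illustrative mirror of the decay kinds described above (not the crate's DecaySpec).
#[derive(Debug, Clone, Copy)]
enum Decay {
    Exponential { half_life: Duration },
    Linear { lifetime: Duration },
    Permanent,
}

/// Fraction of the original weight that remains after `age` has elapsed.
fn decay_factor(decay: Decay, age: Duration) -> f64 {
    let age = age.as_secs_f64();
    match decay {
        // Weight halves once per half-life: 0.5^(age / half_life).
        Decay::Exponential { half_life } => 0.5f64.powf(age / half_life.as_secs_f64()),
        // Straight line from 1.0 at age zero down to 0.0 at `lifetime`, then flat.
        Decay::Linear { lifetime } => (1.0 - age / lifetime.as_secs_f64()).max(0.0),
        // Never decays.
        Decay::Permanent => 1.0,
    }
}
```

Under these assumptions, a view with a 7-day half-life keeps half its weight after one week and a quarter after two, while a hide keeps its full weight indefinitely.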
Ranking Profiles
tidalDB ships 25 built-in ranking profiles. The application says profile("trending"). The database executes the entire pipeline.
Built-in profiles include: trending, hot, new, for_you, following, related, notification, search, top_week, top_month, top_all_time, hidden_gems, controversial, most_viewed, most_liked, shuffle, cohort_trending, live, alphabetical_asc, alphabetical_desc, shortest, longest, most_commented, most_shared, date_saved.
See ai-lookup/services/ranking-profiles.md for the full list of built-in profiles.
Cohort Definitions
Cohorts are named predicates over user attributes. They define audience segments for scoped signal aggregation and trending.
use tidaldb::schema::EntityId;
// Define a cohort via db.define_cohort() after opening.
// Cohort signal aggregation happens at signal write time.
// Use RetrieveBuilder::cohort("us_young_music") to scope queries.
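The idea of a cohort as a "named predicate over user attributes" can be sketched independently of the database. Everything in the block below is hypothetical: the `Cohort` struct, the attribute keys (`country`, `age`, `interest`), and the age range are assumptions chosen for illustration and say nothing about the actual signature of db.define_cohort().

```rust
use std::collections::HashMap;

type Attrs = HashMap<String, String>;

/// Hypothetical cohort: a name plus a predicate over user attributes.
struct Cohort {
    name: &'static str,
    predicate: fn(&Attrs) -> bool,
}

/// Example predicate; keys and thresholds are made up for this sketch.
fn us_young_music(attrs: &Attrs) -> bool {
    let in_us = attrs.get("country").map(String::as_str) == Some("US");
    let young = attrs
        .get("age")
        .and_then(|a| a.parse::<u32>().ok())
        .map_or(false, |a| (18..=24).contains(&a));
    let likes_music = attrs.get("interest").map(String::as_str) == Some("music");
    in_us && young && likes_music
}

/// Names of every cohort whose predicate accepts this user.
fn matching_cohorts(cohorts: &[Cohort], attrs: &Attrs) -> Vec<&'static str> {
    cohorts
        .iter()
        .filter(|c| (c.predicate)(attrs))
        .map(|c| c.name)
        .collect()
}

/// Small helper to build an attribute map from string pairs.
fn attrs(pairs: &[(&str, &str)]) -> Attrs {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}
```

Evaluating membership at signal write time, as the comments above describe, would then amount to running `matching_cohorts` for the acting user and attributing the signal to each returned cohort.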
Write Path
Ingesting Entities
Items enter the system with metadata as HashMap<String, String> key-value pairs. The application provides the embedding -- tidalDB does not generate vectors.
use std::collections::HashMap;
use tidaldb::schema::EntityId;
let mut metadata = HashMap::new();
metadata.insert("title".to_string(), "Introduction to Jazz Piano".to_string());
metadata.insert("description".to_string(), "A beginner's guide...".to_string());
metadata.insert("category".to_string(), "music".to_string());
metadata.insert("tags".to_string(), "jazz,piano,tutorial,beginner".to_string());
metadata.insert("format".to_string(), "video".to_string());
metadata.insert("language".to_string(), "en".to_string());
metadata.insert("duration".to_string(), "1320".to_string()); // seconds
metadata.insert("creator_id".to_string(), "100".to_string());
db.write_item_with_metadata(EntityId::new(1), &metadata)?;
On commit, the item is:
- Stored in the entity store
- Text fields indexed in the inverted index (BM25)
- Inserted into bitmap and range indexes for filtering
- Added to the universe bitmap for RETRIEVE queries
- Immediately queryable
Writing Embeddings
Embeddings are written separately from metadata. tidalDB L2-normalizes and indexes them into the HNSW vector index.
use tidaldb::schema::EntityId;
// Item embedding (you compute this externally).
let embedding: Vec<f32> = compute_embedding("Introduction to Jazz Piano");
db.write_item_embedding(EntityId::new(1), &embedding)?;
// Creator embedding.
let creator_embedding: Vec<f32> = compute_creator_embedding("Jazz Academy");
db.write_creator_embedding(EntityId::new(100), &creator_embedding)?;
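For readers unfamiliar with the term, L2 normalization scales a vector to unit length so that a dot product between two normalized vectors equals their cosine similarity. The following is a minimal standalone sketch of that operation, not tidalDB's internal code; the zero-vector handling in particular is an assumption.

```rust
/// Scales `v` to unit Euclidean length. A zero vector is returned unchanged
/// (an assumption made here to avoid dividing by zero).
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

/// Dot product; for unit-length inputs this is the cosine similarity.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Since the database normalizes on write, the application can presumably pass raw model outputs without normalizing them first.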
Writing Relationships
Relationships are directional edges between entities (follows, blocks). Used for the following profile and blocked-creator filtering.
use tidaldb::schema::EntityId;
use tidaldb::schema::Timestamp;
// User follows a creator.
db.write_relationship(
EntityId::new(123), // user
EntityId::new(100), // creator
"follows",
1.0, // weight
Timestamp::now(),
)?;
Writing Signals
Signals are how the feedback loop closes. A single signal write atomically updates:
- The item's signal ledger (windowed aggregates, velocity, decay score)
- The WAL (write-ahead log) for durability
use tidaldb::schema::{EntityId, Timestamp};
// User viewed an item.
db.signal("view", EntityId::new(1), 1.0, Timestamp::now())?;
// User completed 94% of the video.
db.signal("completion", EntityId::new(1), 0.94, Timestamp::now())?;
// User liked an item.
db.signal("like", EntityId::new(1), 1.0, Timestamp::now())?;
// User skipped after 3 seconds (strong negative).
db.signal("skip", EntityId::new(2), 1.0, Timestamp::now())?;
// User tapped "Not interested" (permanent negative on this item).
db.signal("hide", EntityId::new(2), 1.0, Timestamp::now())?;
For signals with user context (updates preference vectors, seen state, interaction weights):
use tidaldb::schema::{EntityId, Timestamp};
db.signal_with_context(
"view",
EntityId::new(1), // item
1.0, // weight
Timestamp::now(),
Some(123), // for_user
Some(100), // creator_id
)?;
The next ranking query -- even 100ms later -- reflects the updated state.
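The windowed aggregates and velocity kept in the signal ledger can be pictured with a small standalone sketch. It recomputes both values from a raw list of event timestamps, which is almost certainly not how tidalDB maintains them incrementally. The function names and the "events per second over the window" definition of velocity are assumptions for illustration.

```rust
/// Number of events with a timestamp in the half-open interval (now - window, now].
/// All values are in seconds.
fn windowed_count(event_secs: &[u64], now: u64, window: u64) -> usize {
    event_secs
        .iter()
        .filter(|&&t| t <= now && t + window > now)
        .count()
}

/// Average events per second over the window; 0.0 for an empty window length.
fn velocity(event_secs: &[u64], now: u64, window: u64) -> f64 {
    if window == 0 {
        return 0.0;
    }
    windowed_count(event_secs, now, window) as f64 / window as f64
}
```

With events at seconds 10, 3500, 3590, and 3600, a 600-second window ending at 3600 contains three events, giving a velocity of 0.005 events per second.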
Query Language
Three operations: RETRIEVE (feed generation, browse, related), SEARCH (text + semantic retrieval), SUGGEST (autocomplete).
All queries return ranked results with scores. The application renders -- it never re-ranks.
RETRIEVE -- Feeds, Browse, Related
RETRIEVE generates ranked content lists. It handles personalized feeds, category browse, trending, following, related content, and every other surface described in USE_CASES.md.
use tidaldb::query::retrieve::Retrieve;
use tidaldb::schema::EntityId;
// Personalized For You feed.
let query = Retrieve::builder()
.for_user(123)
.profile("for_you")
.limit(50)
.build()?;
let results = db.retrieve(&query)?;
// Trending globally.
let query = Retrieve::builder()
.profile("trending")
.limit(25)
.build()?;
let results = db.retrieve(&query)?;
use tidaldb::storage::indexes::filter::FilterExpr;
// Trending in a category.
let query = Retrieve::builder()
.profile("trending")
.filter(FilterExpr::eq("category", "jazz"))
.limit(25)
.build()?;
let results = db.retrieve(&query)?;
// Trending within a cohort -- what's hot among US young music fans.
let query = Retrieve::builder()
.profile("cohort_trending")
.cohort("us_young_music")
.limit(25)
.build()?;
let results = db.retrieve(&query)?;
// Following feed -- content from followed creators.
let query = Retrieve::builder()
.for_user(123)
.profile("following")
.limit(50)
.build()?;
let results = db.retrieve(&query)?;
use tidaldb::ranking::diversity::DiversityConstraints;
// Related content / Up Next -- anchored to a specific item.
let query = Retrieve::builder()
.for_user(123)
.profile("related")
.similar_to(EntityId::new(1))
.diversity(DiversityConstraints::new().max_per_creator(1))
.limit(10)
.build()?;
let results = db.retrieve(&query)?;
// Browse category with explicit sort mode.
let query = Retrieve::builder()
.profile("top_week")
.filter(FilterExpr::eq("category", "jazz"))
.limit(20)
.build()?;
let results = db.retrieve(&query)?;
// Hidden gems -- high quality, low reach.
let query = Retrieve::builder()
.profile("hidden_gems")
.limit(20)
.build()?;
let results = db.retrieve(&query)?;
// Exclude previously seen items.
let query = Retrieve::builder()
.for_user(123)
.profile("for_you")
.exclude(vec![EntityId::new(1), EntityId::new(2)])
.limit(50)
.build()?;
let results = db.retrieve(&query)?;
// Creator profile -- items from a specific creator.
let query = Retrieve::builder()
.profile("new")
.for_creator(EntityId::new(100))
.limit(20)
.build()?;
let results = db.retrieve(&query)?;
// Notification prioritization.
let query = Retrieve::builder()
.for_user(123)
.profile("notification")
.limit(20)
.build()?;
let results = db.retrieve(&query)?;
SEARCH -- Text + Semantic Retrieval
Search combines full-text BM25 relevance with semantic similarity via RRF (Reciprocal Rank Fusion). Text relevance is the floor -- an irrelevant result never surfaces just because the user likes the creator.
use tidaldb::query::search::Search;
// Basic keyword search, personalized for this user.
let query = Search::builder()
.query("rust tutorial beginner")
.for_user(123)
.limit(20)
.build()?;
let results = db.search(&query)?;
// Hybrid search: text + vector.
let query_embedding: Vec<f32> = embed("rust tutorial beginner");
let query = Search::builder()
.query("rust tutorial beginner")
.vector(query_embedding)
.for_user(123)
.limit(20)
.build()?;
let results = db.search(&query)?;
// Creator search.
use tidaldb::schema::EntityKind;
let query = Search::builder()
.query("jazz piano")
.entity_kind(EntityKind::Creator)
.limit(10)
.build()?;
let results = db.search(&query)?;
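Reciprocal Rank Fusion itself is simple enough to show in a few lines. The sketch below is the plain textbook form, where each list contributes 1 / (k + rank) per document and k = 60 is the commonly used constant. It is not tidalDB's implementation, and it does not model the text-relevance floor or personalization described above. The function name `rrf_fuse` is hypothetical.

```rust
use std::collections::HashMap;

/// Fuses several rankings (each a list of ids, best first) into one list of
/// (id, fused_score) pairs sorted by score descending, ties broken by id ascending.
fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (i, &id) in ranking.iter().enumerate() {
            let rank = i as f64 + 1.0; // ranks are 1-based
            *scores.entry(id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Given a BM25 ranking of [1, 2, 3] and a vector ranking of [3, 1, 4], item 1 comes out on top because it ranks highly in both lists, followed by 3, 2, and 4.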
Query Composition -- SEARCH within Scoped Results
SEARCH can be composed with scope constraints. This enables searching within trending, within a cohort, or within any candidate set.
use tidaldb::query::search::{Search, WithinScope};
// Search within globally trending items.
let query = Search::builder()
.query("jazz piano")
.within(WithinScope::Trending { window_hours: 24 })
.limit(20)
.build()?;
let results = db.search(&query)?;
// Search within cohort-scoped trending.
let query = Search::builder()
.query("jazz piano")
.within(WithinScope::CohortTrending {
cohort: "us_young_music".into(),
window_hours: 24,
})
.limit(20)
.build()?;
let results = db.search(&query)?;
// Search within a user's following feed.
let query = Search::builder()
.query("jazz piano")
.for_user(123)
.within(WithinScope::Following)
.limit(20)
.build()?;
let results = db.search(&query)?;
WithinScope:
| Scope | Candidate Set |
|---|---|
| Trending { window_hours } | Items with high global velocity in window |
| CohortTrending { cohort, window_hours } | Items with high velocity among cohort members |
| Following | Items from followed creators (requires for_user) |
| Category { name } | Items in a category |
| Collection { id } | Items in a collection |
SUGGEST -- Autocomplete and Suggestions
use tidaldb::query::suggest::Suggest;
// Autocomplete on partial query.
let req = Suggest { prefix: "jazz pia".into(), for_user: None, limit: 5 };
let suggestions = db.suggest(&req)?;
// Returns Vec<Suggestion> with text and frequency.
// Trending searches (empty prefix).
let req = Suggest { prefix: "".into(), for_user: None, limit: 10 };
let trending = db.suggest(&req)?;
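As a mental model for what SUGGEST returns, the sketch below ranks logged queries that start with the prefix by how often they were issued. This is a deliberately naive linear scan under assumed names (`Suggestion`, `suggest`); the real index structure and any personalization are not described in this document.

```rust
/// Illustrative result type with the two fields mentioned above.
#[derive(Debug, Clone, PartialEq)]
struct Suggestion {
    text: String,
    frequency: u64,
}

/// Returns up to `limit` logged queries starting with `prefix`,
/// most frequent first (ties broken alphabetically).
fn suggest(query_log: &[(&str, u64)], prefix: &str, limit: usize) -> Vec<Suggestion> {
    let mut hits: Vec<Suggestion> = query_log
        .iter()
        .filter(|(q, _)| q.starts_with(prefix))
        .map(|&(q, f)| Suggestion { text: q.to_string(), frequency: f })
        .collect();
    hits.sort_by(|a, b| b.frequency.cmp(&a.frequency).then_with(|| a.text.cmp(&b.text)));
    hits.truncate(limit);
    hits
}
```

An empty prefix matches every logged query, which is why the same call shape can serve as a "trending searches" list.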
Filters
All filters are composable. Any combination of filters produces a valid, efficiently-executed query. Filters use the FilterExpr type from tidaldb::storage::indexes::filter::FilterExpr.
Content Attribute Filters
use tidaldb::storage::indexes::filter::FilterExpr;
FilterExpr::eq("category", "jazz") // exact match on category
FilterExpr::eq("format", "video") // exact match on format
FilterExpr::eq("tags", "tutorial") // tag match
Engagement Threshold Filters
FilterExpr::MinSignal { signal: "view".into(), threshold: 10000.0 }
FilterExpr::MaxSignal { signal: "view".into(), threshold: 5000.0 }
Geographic Filters
FilterExpr::NearLocation { lat: 40.7128, lng: -74.0060, radius_km: 50.0 }
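A radius filter like the one above needs a great-circle distance between two coordinates. One common way to compute it is the haversine formula, sketched below. Whether tidalDB uses haversine, a geohash prefilter, or something else is not stated here, so treat the function names and the 6371 km Earth radius as assumptions.

```rust
const EARTH_RADIUS_KM: f64 = 6371.0; // mean Earth radius

/// Great-circle distance in kilometres between two (lat, lng) points given in degrees.
fn haversine_km(lat1: f64, lng1: f64, lat2: f64, lng2: f64) -> f64 {
    let (phi1, phi2) = (lat1.to_radians(), lat2.to_radians());
    let d_phi = (lat2 - lat1).to_radians();
    let d_lambda = (lng2 - lng1).to_radians();
    let a = (d_phi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (d_lambda / 2.0).sin().powi(2);
    // Clamp guards against rounding pushing sqrt(a) slightly above 1.0.
    2.0 * EARTH_RADIUS_KM * a.sqrt().min(1.0).asin()
}

/// True if the item's location lies within `radius_km` of the query point.
fn within_radius(item: (f64, f64), center: (f64, f64), radius_km: f64) -> bool {
    haversine_km(item.0, item.1, center.0, center.1) <= radius_km
}
```

With the New York coordinates from the filter above and a 50 km radius, an item located in Newark (roughly 14 km away) passes, while one in Philadelphia (roughly 130 km away) does not.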
Collection Filters
use tidaldb::entities::CollectionId;
FilterExpr::InCollection(CollectionId::new(42))
See USE_CASES.md Appendix A for the complete filter reference.
Sort Modes
Sort modes are embedded in ranking profiles. The application names a profile. The database executes the ranking pipeline. 25 built-in profiles cover the most common sort needs.
| Profile | Sort Mode |
|---|---|
| new | created_at DESC |
| trending | Engagement velocity |
| hot | Score / (age + 2)^gravity |
| top_week / top_month / top_all_time | Cumulative quality by window |
| most_viewed / most_liked | Signal count by window |
| most_commented / most_shared | Signal count (AllTime) |
| hidden_gems | High quality, low reach |
| controversial | max(positive * negative signals) |
| shuffle | Random, quality-weighted |
| live | Live viewer count DESC |
| date_saved | When user bookmarked DESC |
| alphabetical_asc / alphabetical_desc | Title A-Z / Z-A |
| shortest / longest | Duration ASC / DESC |
See USE_CASES.md Appendix B for the complete sort mode reference.
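The hot formula in the table, Score / (age + 2)^gravity, is easy to evaluate by hand. The sketch below does so under two assumptions the table does not state: age is measured in hours, and gravity is a caller-supplied exponent. The function name is hypothetical.

```rust
/// Time-discounted score: older items need a higher raw score to rank equally.
/// `age_hours` is assumed to be in hours; `gravity` controls how fast items sink.
fn hot_score(score: f64, age_hours: f64, gravity: f64) -> f64 {
    score / (age_hours + 2.0).powf(gravity)
}
```

For example, with gravity 2.0 a brand-new item with raw score 100 gets 100 / 2^2 = 25, while the same raw score at 8 hours old gets 100 / 10^2 = 1. A larger gravity makes older items sink faster.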
Diversity Constraints
Diversity is a post-scoring pass. After candidates are scored, diversity constraints reorder the result set to enforce variety -- without reducing the result count.
use tidaldb::ranking::diversity::DiversityConstraints;
let diversity = DiversityConstraints::new()
.max_per_creator(2) // No more than 2 items per creator
.max_format_fraction(0.4); // No format > 40% of results
let query = Retrieve::builder()
.profile("for_you")
.for_user(123)
.diversity(diversity)
.limit(50)
.build()?;
Diversity is specified per query or per ranking profile. Query-level diversity overrides the profile default.
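One way to reorder for variety without shrinking the result set is a greedy pass that demotes over-represented items to the tail instead of dropping them. The sketch below shows that idea for a per-creator cap. It is an illustration under assumed names, not tidalDB's algorithm, and it does not cover max_format_fraction.

```rust
use std::collections::HashMap;

/// `ranked` holds (item_id, creator_id) pairs, best score first.
/// Items beyond `cap` for a creator are moved to the end, preserving relative order,
/// so the output always has the same length as the input.
fn enforce_max_per_creator(ranked: &[(u64, u64)], cap: usize) -> Vec<(u64, u64)> {
    let mut per_creator: HashMap<u64, usize> = HashMap::new();
    let mut kept = Vec::with_capacity(ranked.len());
    let mut demoted = Vec::new();
    for &(item, creator) in ranked {
        let seen = per_creator.entry(creator).or_insert(0);
        if *seen < cap {
            *seen += 1;
            kept.push((item, creator));
        } else {
            demoted.push((item, creator));
        }
    }
    kept.extend(demoted);
    kept
}
```

In this sketch, truncating the output to the query limit yields a page that respects the cap whenever enough distinct creators exist among the candidates. When they do not, the demoted items fill the remaining slots.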
Pagination
Cursor-based pagination for stable result sets across pages.
use tidaldb::query::retrieve::Retrieve;
// First page.
let query = Retrieve::builder()
.for_user(123)
.profile("for_you")
.limit(50)
.build()?;
let page1 = db.retrieve(&query)?;
// Next page -- pass the cursor from the previous response.
if let Some(cursor) = page1.next_cursor {
let query = Retrieve::builder()
.for_user(123)
.profile("for_you")
.cursor(cursor)
.limit(50)
.build()?;
let page2 = db.retrieve(&query)?;
}
Alternatively, pass the IDs of previously returned items to exclude:
let seen_ids: Vec<_> = page1.items.iter().map(|r| r.entity_id).collect();
let query = Retrieve::builder()
.for_user(123)
.profile("for_you")
.exclude(seen_ids)
.limit(50)
.build()?;
let page2 = db.retrieve(&query)?;
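To see why a cursor gives stable pages where an offset would not, consider a keyset-style cursor that records the last (score, id) position served. The sketch below is a generic illustration over an in-memory list that is already sorted by score descending, then id ascending. tidalDB's Cursor type is opaque in this document, so the `Cursor` struct and `page` function here are hypothetical.

```rust
/// Hypothetical cursor: the sort position of the last item on the previous page.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Cursor {
    score: f64,
    id: u64,
}

/// Returns up to `limit` items strictly after `after`, plus a cursor if more remain.
/// `sorted` holds (id, score) pairs ordered by score descending, then id ascending.
fn page(
    sorted: &[(u64, f64)],
    after: Option<Cursor>,
    limit: usize,
) -> (Vec<(u64, f64)>, Option<Cursor>) {
    let mut rest = sorted.iter().copied().filter(|&(id, score)| match after {
        None => true,
        // Keep only items that sort strictly after the cursor position.
        Some(c) => score < c.score || (score == c.score && id > c.id),
    });
    let items: Vec<(u64, f64)> = rest.by_ref().take(limit).collect();
    let has_more = rest.next().is_some();
    let next = if has_more {
        items.last().map(|&(id, score)| Cursor { score, id })
    } else {
        None
    };
    (items, next)
}
```

Because the cursor encodes a position in the ordering rather than a count of rows, items inserted ahead of that position between requests do not shift or duplicate entries on the next page.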
Response Format
RETRIEVE Response
pub struct Results {
/// Ranked items with scores.
pub items: Vec<RetrieveResult>,
/// Cursor for fetching the next page.
pub next_cursor: Option<Cursor>,
/// Total candidate count before diversity/limit.
pub total_candidates: usize,
/// Whether all diversity constraints were satisfied.
pub constraints_satisfied: bool,
/// Warnings generated during query execution.
pub warnings: Vec<String>,
/// The degradation level under which this query was executed.
pub degradation_level: DegradationLevel,
}
pub struct RetrieveResult {
/// Entity ID.
pub entity_id: EntityId,
/// Normalized score in [0.0, 1.0].
pub score: f64,
/// 1-based rank.
pub rank: usize,
/// Signal values that contributed to this score.
pub signals: Vec<Signal>,
}
SEARCH Response
pub struct SearchResults {
pub items: Vec<SearchResultItem>,
pub next_cursor: Option<Cursor>,
pub total_candidates: usize,
pub constraints_satisfied: bool,
pub warnings: Vec<String>,
pub degradation_level: DegradationLevel,
}
pub struct SearchResultItem {
pub entity_id: EntityId,
pub score: f64,
pub rank: usize,
pub bm25_score: Option<f32>,
pub semantic_score: Option<f32>,
pub signals: Vec<Signal>,
pub metadata: Option<HashMap<String, String>>,
}
The application uses items to render the UI. It uses signals to display engagement counts (views, likes, etc.). It never re-ranks -- the order from tidalDB is the final order.
Lifecycle and Operations
Shutdown
// Graceful shutdown -- flushes WAL, checkpoints signal state, persists indexes.
db.close()?;
// Or equivalently:
db.shutdown()?;
Health Check
db.health_check()?; // Returns Ok(()) if operational.
Item Count
let count: u64 = db.item_count(); // Number of items in the universe bitmap.
Reading Signal State
use tidaldb::schema::{EntityId, Window};
// Read decay score (applies lazy decay to current time).
let score: Option<f64> = db.read_decay_score(EntityId::new(1), "view", 0)?;
// Read windowed event count.
let count: u64 = db.read_windowed_count(EntityId::new(1), "view", Window::OneHour)?;
// Read velocity (events per second).
let velocity: f64 = db.read_velocity(EntityId::new(1), "view", Window::OneHour)?;
Saved Searches
use tidaldb::schema::{EntityId, Timestamp};
// Save a search as a persistent feed.
db.save_search(EntityId::new(123), "Jazz tutorials", "jazz tutorial", None)?;
// Query a saved search for new results since a timestamp
// (`since` is assumed to be a Timestamp the application recorded earlier).
let results = db.retrieve_saved_search(EntityId::new(123), "Jazz tutorials", Some(since))?;
// List all saved searches for a user.
let searches = db.list_saved_searches(EntityId::new(123))?;
// Delete a saved search.
db.delete_saved_search(EntityId::new(123), "Jazz tutorials")?;
Collections
use tidaldb::schema::EntityId;
use tidaldb::entities::collection::Visibility;
// Create a user collection (playlist, board, etc.)
let collection_id = db.create_collection(EntityId::new(123), "Jazz Favorites", Visibility::Private)?;
// Add an item to a collection.
db.add_to_collection(collection_id, EntityId::new(1))?;
// Remove an item from a collection.
db.remove_from_collection(collection_id, EntityId::new(1))?;
// List collections for a user.
let collections = db.list_collections(EntityId::new(123))?;
Text Index Management
// Force a synchronous commit and reload of the text index.
// Useful in tests after writing items to make them immediately searchable.
db.flush_text_index()?;
db.flush_creator_text_index()?;
// Manual reload (for ephemeral mode).
db.reload_text_index()?;
Summary
| Operation | What the Application Does | What tidalDB Does |
|---|---|---|
| Ingest content | Compute embedding, call write_item_with_metadata + write_item_embedding | Index text, insert vector, initialize signals, apply cold start |
| Record engagement | Call signal with event type | Update signal ledger, WAL-backed durability |
| Record engagement with context | Call signal_with_context with user/creator IDs | Update ledger + user preferences + interaction weights + cohort attribution |
| Serve a feed | Call retrieve with a profile name | Candidate retrieval, scoring, diversity enforcement, pagination |
| Search | Embed query, call search | BM25 + ANN + RRF fusion + personalization + diversity |
| Handle cold start | Nothing | Exploration budget, population priors -- automatic |
| Handle negative signals | Call signal with skip/hide | Preference decay, exclusion in future queries |
| Scope trending by cohort | Specify cohort name in retrieve query | Cohort-scoped signal aggregation, same ranking profile |
| Search within scope | Specify within on search query | Intersects text/vector retrieval with scoped candidate set |
One process. One query interface. One operational model.