- m0p3: CONTRIBUTING.md with run-samples checklist, all 4 examples (quickstart, cli_embedding, axum_embedding, actix_embedding), doc-test coverage for every public API surface - m1p5: TidalDb public API — write_item, signal, read_decay_score, read_windowed_count, read_velocity; StorageBox enum routing memory vs fjall; WalSender/WalHandleWriter bridge; WAL replay on open - Periodic checkpoint: 30s background thread for persistent+schema mode; FjallBackend::Clone (O(1), fjall::Keyspace is ref-counted); graceful shutdown via Arc<AtomicBool> + join before final checkpoint - ROADMAP.md: M0 and M1 fully marked COMPLETE (341 tests passing) - Milestone 2 planning scaffolding added under docs/planning/milestone-2/ Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
33 KiB
Task 01: RETRIEVE AST + Parser
Context
Milestone: 2 -- Ranked Retrieval Phase: m2p5 -- Query Parser and RETRIEVE Executor Depends On: None (uses types from m2p2, m2p3, m2p4 but no m2p5 tasks) Blocks: Task 02 (RETRIEVE Executor Pipeline), Task 03 (M2 UAT Integration Test) Complexity: M
Objective
Deliver the typed AST for the RETRIEVE query operation and a Rust builder API for constructing queries ergonomically. For M2, there is no text grammar parser -- the "parser" is the RetrieveBuilder which validates and constructs a Retrieve struct. The text syntax parser (RETRIEVE items USING PROFILE trending LIMIT 25) is deferred to M5.
This task also defines the response types (Results, RetrieveResult), the pagination cursor, the Signal write command struct (wired to the existing M1 signal write path), and the QueryError enum. These types are consumed by Task 02's executor and returned to the caller.
The types defined here map directly to the spec's input/output types (Spec 08 Section 3) but are scoped to M2: no for_user, no similar_to, no for_cohort, no window, no context, no for_session. These fields exist on the struct as Option types for forward compatibility but are validated as unsupported in M2 if set.
Requirements
Retrievestruct: the complete RETRIEVE query requestRetrieveBuilder: ergonomic builder pattern for constructingRetrievequeriesProfileRef: profile name + optional version referenceCursor: opaque offset-based pagination cursor with base64 encodingResults: the query response (items, cursor, total_scored, constraints_satisfied)RetrieveResult: one result item (entity_id, score, rank, signal_snapshot)Signal: write command struct wired toTidalDb::signal()QueryError: error enum for query validation and execution failures- Validation: limit range, profile reference format, filter compatibility
- No
unsafecode
Technical Design
Module Structure
tidal/src/
query/
mod.rs -- pub mod retrieve; re-exports
retrieve.rs -- all types from this task
Public API
// === query/retrieve.rs ===
use crate::schema::{EntityId, EntityKind, Timestamp};
use crate::ranking::diversity::DiversityConstraints;
use crate::storage::indexes::filter::FilterExpr;
/// Reference to a ranking profile by name, optionally pinned to a version.
///
/// The executor resolves this against the `ProfileRegistry` at query time.
/// If `version` is `None`, the latest version is used.
#[derive(Debug, Clone)]
pub struct ProfileRef {
/// Profile name. Must match a registered profile in the registry.
pub name: String,
/// Optional version pin. `None` = latest version.
pub version: Option<u32>,
}
impl ProfileRef {
pub fn new(name: impl Into<String>) -> Self {
Self {
name: name.into(),
version: None,
}
}
pub fn versioned(name: impl Into<String>, version: u32) -> Self {
Self {
name: name.into(),
version: Some(version),
}
}
}
/// A RETRIEVE query. Declarative: specifies what, not how.
///
/// For M2, this struct is constructed via `RetrieveBuilder` (Rust API).
/// A text syntax parser is deferred to M5.
///
/// The profile determines the candidate generation strategy and scoring
/// formula. The caller never specifies how candidates are found -- only
/// which profile to use and which filters to apply.
///
/// Spec reference: docs/specs/08-query-engine.md Section 3.1
#[derive(Debug, Clone)]
pub struct Retrieve {
/// Target entity type. For M2, only `EntityKind::Item` is supported.
pub entity_kind: EntityKind,
/// Named ranking profile. Determines candidate strategy and scoring.
pub profile: ProfileRef,
/// Metadata and signal filters. Combined as AND.
/// Uses `FilterExpr` from m2p2 for composable filter evaluation.
pub filters: Vec<FilterExpr>,
/// Diversity constraints. Applied as a post-scoring pass.
/// If `None`, no diversity enforcement is applied.
pub diversity: Option<DiversityConstraints>,
/// Maximum results to return. Default: 50. Range: [1, 500].
pub limit: usize,
/// Explicit item exclusions. Removed from candidate set before scoring.
pub exclude: Vec<EntityId>,
/// Pagination cursor from a previous result set.
/// If `None`, returns the first page.
pub cursor: Option<Cursor>,
// --- Fields present for forward compatibility (M3+), validated as unsupported in M2 ---
/// User context for personalization. M3+.
pub for_user: Option<u64>,
/// Anchor item for related/similar queries. M3+.
pub similar_to: Option<EntityId>,
/// Surface context for the feedback loop. M3+.
pub context: Option<String>,
}
/// Builder for constructing `Retrieve` queries ergonomically.
///
/// # Example
///
/// ```ignore
/// let query = Retrieve::builder()
/// .entity(EntityKind::Item)
/// .profile("trending")
/// .filter(FilterExpr::eq("category", "jazz"))
/// .diversity(DiversityConstraints::new().max_per_creator(2))
/// .limit(25)
/// .build()?;
/// ```
pub struct RetrieveBuilder {
entity_kind: Option<EntityKind>,
profile: Option<ProfileRef>,
filters: Vec<FilterExpr>,
diversity: Option<DiversityConstraints>,
limit: usize,
exclude: Vec<EntityId>,
cursor: Option<Cursor>,
for_user: Option<u64>,
similar_to: Option<EntityId>,
context: Option<String>,
}
impl RetrieveBuilder {
pub fn new() -> Self {
Self {
entity_kind: None,
profile: None,
filters: Vec::new(),
diversity: None,
limit: 50,
exclude: Vec::new(),
cursor: None,
for_user: None,
similar_to: None,
context: None,
}
}
/// Set the target entity kind.
pub fn entity(mut self, kind: EntityKind) -> Self {
self.entity_kind = Some(kind);
self
}
/// Set the ranking profile by name.
pub fn profile(mut self, name: impl Into<String>) -> Self {
self.profile = Some(ProfileRef::new(name));
self
}
/// Set the ranking profile by name and version.
pub fn profile_versioned(mut self, name: impl Into<String>, version: u32) -> Self {
self.profile = Some(ProfileRef::versioned(name, version));
self
}
/// Add a filter expression. Multiple filters are ANDed together.
pub fn filter(mut self, expr: FilterExpr) -> Self {
self.filters.push(expr);
self
}
/// Set diversity constraints.
pub fn diversity(mut self, constraints: DiversityConstraints) -> Self {
self.diversity = Some(constraints);
self
}
/// Set the maximum number of results. Range: [1, 500]. Default: 50.
pub fn limit(mut self, limit: usize) -> Self {
self.limit = limit;
self
}
/// Add an entity ID to the exclusion list.
pub fn exclude(mut self, id: EntityId) -> Self {
self.exclude.push(id);
self
}
/// Add multiple entity IDs to the exclusion list.
pub fn exclude_ids(mut self, ids: impl IntoIterator<Item = EntityId>) -> Self {
self.exclude.extend(ids);
self
}
/// Set the pagination cursor.
pub fn cursor(mut self, cursor: Cursor) -> Self {
self.cursor = Some(cursor);
self
}
/// Validate and build the `Retrieve` query.
///
/// Returns `QueryError::InvalidLimit` if limit is 0 or > 500.
/// Returns `QueryError::ProfileNotFound` if no profile is set.
/// Returns `QueryError::InvalidFilter` if `for_user` or `similar_to` are set (M2).
pub fn build(self) -> Result<Retrieve, QueryError> {
let entity_kind = self.entity_kind.unwrap_or(EntityKind::Item);
let profile = self.profile.ok_or_else(|| {
QueryError::ProfileNotFound("no profile specified".to_string())
})?;
if self.limit == 0 || self.limit > 500 {
return Err(QueryError::InvalidLimit {
requested: self.limit,
min: 1,
max: 500,
});
}
// M2: reject unsupported features
if self.for_user.is_some() {
return Err(QueryError::InvalidFilter {
field: "for_user".to_string(),
reason: "FOR USER clause is not supported until M3".to_string(),
});
}
if self.similar_to.is_some() {
return Err(QueryError::InvalidFilter {
field: "similar_to".to_string(),
reason: "SIMILAR TO clause is not supported until M3".to_string(),
});
}
Ok(Retrieve {
entity_kind,
profile,
filters: self.filters,
diversity: self.diversity,
limit: self.limit,
exclude: self.exclude,
cursor: self.cursor,
for_user: self.for_user,
similar_to: self.similar_to,
context: self.context,
})
}
}
impl Default for RetrieveBuilder {
fn default() -> Self {
Self::new()
}
}
impl Retrieve {
/// Start building a RETRIEVE query.
pub fn builder() -> RetrieveBuilder {
RetrieveBuilder::new()
}
}
/// The combined filter expression for the query.
///
/// Multiple filters are ANDed together. This helper constructs the
/// combined filter from the `Retrieve` query's filter list.
impl Retrieve {
/// Combine all filters into a single AND expression.
/// Returns `None` if no filters are specified.
pub fn combined_filter(&self) -> Option<FilterExpr> {
match self.filters.len() {
0 => None,
1 => Some(self.filters[0].clone()),
_ => Some(FilterExpr::And(self.filters.clone())),
}
}
}
// ============================================================
// Response Types
// ============================================================
/// The complete response from a RETRIEVE query.
///
/// Spec reference: docs/specs/08-query-engine.md Section 5.7
#[derive(Debug, Clone)]
pub struct Results {
/// The ranked result items for this page.
pub items: Vec<RetrieveResult>,
/// Cursor for the next page. `None` if this is the last page.
pub next_cursor: Option<Cursor>,
/// How many candidates were scored by the profile executor.
/// This is the count after filtering but before diversity and limit.
pub total_scored: usize,
/// Whether all diversity constraints were fully satisfied.
/// `false` if constraints were relaxed (see `DiversityResult::violations`).
pub constraints_satisfied: bool,
/// Non-fatal warnings from query execution.
///
/// Warnings are surfaced when the executor degrades gracefully:
/// - Metadata enrichment fails for a candidate (the item is treated as a
/// unique creator for diversity purposes; results are still returned)
/// - A filter references a field with no index (predicate fallback used)
///
/// An empty `warnings` vec means clean execution with no degradation.
pub warnings: Vec<String>,
}
impl Results {
/// Number of items in this page.
pub fn len(&self) -> usize {
self.items.len()
}
/// Whether this page is empty.
pub fn is_empty(&self) -> bool {
self.items.is_empty()
}
}
/// A single result item from a RETRIEVE query.
///
/// Includes the entity ID, composite score, rank position, and a
/// signal snapshot for debugging and transparency.
///
/// Spec reference: docs/specs/08-query-engine.md Section 5, Stage 10
#[derive(Debug, Clone)]
pub struct RetrieveResult {
/// The entity ID of the result.
pub entity_id: EntityId,
/// The composite score from the ranking profile, normalized to [0.0, 1.0].
pub score: f64,
/// The 1-based rank position in the result set.
pub rank: usize,
/// Key signal values used in scoring. For debugging and transparency.
/// Contains (signal_name, value) pairs for signals referenced by the
/// profile's scoring rules. Capped at 10 entries.
pub signal_snapshot: Vec<(String, f64)>,
}
// ============================================================
// Pagination Cursor
// ============================================================
/// Opaque pagination cursor for RETRIEVE queries.
///
/// For M2, this is a simple offset-based cursor encoded as a base64 string.
/// True keyset-based pagination (score + entity_id tiebreaker, Spec 08
/// Section 8.2) is deferred to M5.
///
/// # Limitation: Not Stable Under Concurrent Writes
///
/// Offset-based cursors are not stable when the underlying ranked list
/// changes between page requests (e.g., due to concurrent signal writes).
/// Items may appear on multiple pages or be skipped if the ranking shifts.
/// This is documented and acceptable for M2; the spec says to prefer
/// keyset cursors for production use. Do not use cursor-based pagination
/// in write-heavy workloads until M5.
///
/// The cursor is opaque to the caller -- they receive it as a string and
/// pass it back on the next request. The internal representation is an
/// implementation detail.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Cursor {
/// The offset into the full result set for the next page.
offset: usize,
}
impl Cursor {
/// Create a cursor from an offset.
pub(crate) fn from_offset(offset: usize) -> Self {
Self { offset }
}
/// Get the offset this cursor represents.
pub(crate) fn offset(&self) -> usize {
self.offset
}
/// Encode the cursor as an opaque base64 string.
pub fn encode(&self) -> String {
use base64::Engine as _;
let bytes = self.offset.to_le_bytes();
base64::engine::general_purpose::URL_SAFE_NO_PAD.encode(bytes)
}
/// Decode a cursor from an opaque base64 string.
pub fn decode(encoded: &str) -> Result<Self, QueryError> {
use base64::Engine as _;
let bytes = base64::engine::general_purpose::URL_SAFE_NO_PAD
.decode(encoded)
.map_err(|e| QueryError::InvalidCursor(format!("invalid base64: {e}")))?;
if bytes.len() != std::mem::size_of::<usize>() {
return Err(QueryError::InvalidCursor(format!(
"expected {} bytes, got {}",
std::mem::size_of::<usize>(),
bytes.len()
)));
}
let offset = usize::from_le_bytes(
bytes
.try_into()
.map_err(|_| QueryError::InvalidCursor("byte conversion failed".to_string()))?,
);
Ok(Self { offset })
}
}
// ============================================================
// Signal Write Command
// ============================================================
/// A signal write command.
///
/// For M2, this is a thin wrapper that routes to the existing
/// `TidalDb::signal()` method from M1. The struct form enables
/// future batching and the query language parser (M5) to produce
/// signal writes from parsed text.
#[derive(Debug, Clone)]
pub struct Signal {
/// The signal type name (e.g., "view", "like", "share").
pub signal_type: String,
/// The target entity ID.
pub entity_id: EntityId,
/// The signal weight. Typically 1.0 for count-based signals.
pub weight: f64,
/// The timestamp of the event.
pub timestamp: Timestamp,
}
impl Signal {
/// Create a new signal write command.
pub fn new(
signal_type: impl Into<String>,
entity_id: EntityId,
weight: f64,
timestamp: Timestamp,
) -> Self {
Self {
signal_type: signal_type.into(),
entity_id,
weight,
timestamp,
}
}
}
// ============================================================
// Query Error
// ============================================================
/// Errors returned by the query engine.
///
/// Spec reference: docs/specs/08-query-engine.md Section 3.4
#[derive(Debug, Clone)]
pub enum QueryError {
/// The named profile does not exist in the profile registry.
ProfileNotFound(String),
/// A filter references a field or uses a condition that is invalid.
InvalidFilter {
field: String,
reason: String,
},
/// The requested limit is out of the valid range [1, 500].
InvalidLimit {
requested: usize,
min: usize,
max: usize,
},
/// A required index (vector, bitmap, range) is not available.
IndexNotAvailable(String),
/// A storage engine error occurred during query execution.
///
/// Preserves the original error type so callers can match on specific
/// storage failure modes (e.g., `StorageError::Corruption`). Use
/// `From<crate::storage::StorageError>` for automatic conversion.
StorageError(crate::storage::StorageError),
/// The pagination cursor is invalid or could not be decoded.
InvalidCursor(String),
/// The profile's candidate strategy is not supported in this milestone.
/// M2 supports: Ann, Scan, SignalRanked. Others require M3+ infrastructure.
UnsupportedStrategy(String),
}
impl std::fmt::Display for QueryError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
QueryError::ProfileNotFound(name) => {
write!(f, "ranking profile '{name}' not found in registry")
}
QueryError::InvalidFilter { field, reason } => {
write!(f, "invalid filter on field '{field}': {reason}")
}
QueryError::InvalidLimit {
requested,
min,
max,
} => {
write!(f, "limit {requested} is out of range [{min}, {max}]")
}
QueryError::IndexNotAvailable(name) => {
write!(f, "required index not available: {name}")
}
QueryError::StorageError(e) => {
write!(f, "storage error during query: {e}")
}
QueryError::InvalidCursor(msg) => {
write!(f, "invalid pagination cursor: {msg}")
}
QueryError::UnsupportedStrategy(msg) => {
write!(f, "unsupported candidate strategy: {msg}")
}
}
}
}
impl std::error::Error for QueryError {}
impl From<crate::storage::StorageError> for QueryError {
fn from(e: crate::storage::StorageError) -> Self {
QueryError::StorageError(e)
}
}
Validation Logic
impl Retrieve {
/// Validate the query against a ProfileRegistry and Schema.
///
/// Called by the executor before pipeline execution. Separated from
/// `build()` because profile existence requires the registry, which
/// the builder does not have access to.
pub(crate) fn validate(
&self,
registry: &ProfileRegistry,
) -> Result<(), QueryError> {
// 1. Profile existence
let profile_name = &self.profile.name;
let profile = match self.profile.version {
Some(v) => registry.get_versioned(profile_name, v),
None => registry.get(profile_name),
};
if profile.is_none() {
return Err(QueryError::ProfileNotFound(profile_name.clone()));
}
// 2. Limit range (already validated in builder, but defense in depth)
if self.limit == 0 || self.limit > 500 {
return Err(QueryError::InvalidLimit {
requested: self.limit,
min: 1,
max: 500,
});
}
// 3. M2: unsupported features
if self.for_user.is_some() {
return Err(QueryError::InvalidFilter {
field: "for_user".to_string(),
reason: "FOR USER clause requires M3".to_string(),
});
}
if self.similar_to.is_some() {
return Err(QueryError::InvalidFilter {
field: "similar_to".to_string(),
reason: "SIMILAR TO clause requires M3".to_string(),
});
}
// 4. Candidate strategy support check
let resolved_profile = profile.unwrap();
match resolved_profile.candidate_strategy() {
CandidateStrategy::Ann { .. }
| CandidateStrategy::Scan { .. }
| CandidateStrategy::SignalRanked { .. } => {}
other => {
return Err(QueryError::UnsupportedStrategy(
format!("{other:?} requires M3+ infrastructure"),
));
}
}
Ok(())
}
}
Test Strategy
Unit Tests
// === RetrieveBuilder tests ===
#[test]
fn builder_default_limit_50() {
let query = Retrieve::builder()
.profile("trending")
.build()
.unwrap();
assert_eq!(query.limit, 50);
assert_eq!(query.entity_kind, EntityKind::Item);
assert!(query.filters.is_empty());
assert!(query.diversity.is_none());
assert!(query.exclude.is_empty());
assert!(query.cursor.is_none());
}
#[test]
fn builder_with_all_fields() {
let query = Retrieve::builder()
.entity(EntityKind::Item)
.profile("hot")
.filter(FilterExpr::eq("category", "jazz"))
.filter(FilterExpr::eq("format", "video"))
.diversity(DiversityConstraints::new().max_per_creator(2))
.limit(25)
.exclude(EntityId::new(999))
.build()
.unwrap();
assert_eq!(query.entity_kind, EntityKind::Item);
assert_eq!(query.profile.name, "hot");
assert_eq!(query.filters.len(), 2);
assert!(query.diversity.is_some());
assert_eq!(query.limit, 25);
assert_eq!(query.exclude.len(), 1);
}
#[test]
fn builder_rejects_zero_limit() {
let result = Retrieve::builder()
.profile("trending")
.limit(0)
.build();
assert!(matches!(result, Err(QueryError::InvalidLimit { .. })));
}
#[test]
fn builder_rejects_limit_over_500() {
let result = Retrieve::builder()
.profile("trending")
.limit(501)
.build();
assert!(matches!(result, Err(QueryError::InvalidLimit { .. })));
}
#[test]
fn builder_rejects_missing_profile() {
let result = Retrieve::builder()
.entity(EntityKind::Item)
.limit(25)
.build();
assert!(matches!(result, Err(QueryError::ProfileNotFound(_))));
}
#[test]
fn builder_limit_boundary_values() {
// Min valid
let r1 = Retrieve::builder().profile("a").limit(1).build();
assert!(r1.is_ok());
assert_eq!(r1.unwrap().limit, 1);
// Max valid
let r2 = Retrieve::builder().profile("a").limit(500).build();
assert!(r2.is_ok());
assert_eq!(r2.unwrap().limit, 500);
}
#[test]
fn builder_profile_versioned() {
let query = Retrieve::builder()
.profile_versioned("trending", 3)
.build()
.unwrap();
assert_eq!(query.profile.name, "trending");
assert_eq!(query.profile.version, Some(3));
}
#[test]
fn builder_multiple_excludes() {
let query = Retrieve::builder()
.profile("new")
.exclude(EntityId::new(1))
.exclude(EntityId::new(2))
.exclude_ids(vec![EntityId::new(3), EntityId::new(4)])
.build()
.unwrap();
assert_eq!(query.exclude.len(), 4);
}
#[test]
fn combined_filter_none_when_empty() {
let query = Retrieve::builder().profile("new").build().unwrap();
assert!(query.combined_filter().is_none());
}
#[test]
fn combined_filter_single() {
let query = Retrieve::builder()
.profile("new")
.filter(FilterExpr::eq("category", "jazz"))
.build()
.unwrap();
let combined = query.combined_filter().unwrap();
assert!(matches!(combined, FilterExpr::Eq { .. }));
}
#[test]
fn combined_filter_multiple_becomes_and() {
let query = Retrieve::builder()
.profile("new")
.filter(FilterExpr::eq("category", "jazz"))
.filter(FilterExpr::eq("format", "video"))
.build()
.unwrap();
let combined = query.combined_filter().unwrap();
assert!(matches!(combined, FilterExpr::And(_)));
}
// === Cursor tests ===
#[test]
fn cursor_encode_decode_roundtrip() {
let cursor = Cursor::from_offset(42);
let encoded = cursor.encode();
let decoded = Cursor::decode(&encoded).unwrap();
assert_eq!(cursor, decoded);
}
#[test]
fn cursor_encode_decode_zero() {
let cursor = Cursor::from_offset(0);
let encoded = cursor.encode();
let decoded = Cursor::decode(&encoded).unwrap();
assert_eq!(cursor, decoded);
}
#[test]
fn cursor_encode_decode_large_offset() {
let cursor = Cursor::from_offset(100_000);
let encoded = cursor.encode();
let decoded = Cursor::decode(&encoded).unwrap();
assert_eq!(cursor, decoded);
}
#[test]
fn cursor_decode_invalid_base64() {
let result = Cursor::decode("!!!not-base64!!!");
assert!(matches!(result, Err(QueryError::InvalidCursor(_))));
}
#[test]
fn cursor_decode_wrong_length() {
use base64::Engine as _;
let encoded = base64::engine::general_purpose::URL_SAFE_NO_PAD.encode(&[1u8, 2, 3]);
let result = Cursor::decode(&encoded);
assert!(matches!(result, Err(QueryError::InvalidCursor(_))));
}
// === QueryError display tests ===
#[test]
fn query_error_display_messages() {
let e1 = QueryError::ProfileNotFound("trending".to_string());
assert!(e1.to_string().contains("trending"));
let e2 = QueryError::InvalidLimit {
requested: 0,
min: 1,
max: 500,
};
assert!(e2.to_string().contains("0"));
assert!(e2.to_string().contains("500"));
let e3 = QueryError::InvalidFilter {
field: "category".to_string(),
reason: "unknown field".to_string(),
};
assert!(e3.to_string().contains("category"));
let e4 = QueryError::InvalidCursor("bad cursor".to_string());
assert!(e4.to_string().contains("bad cursor"));
}
// === RetrieveResult tests ===
#[test]
fn results_len_and_is_empty() {
let empty = Results {
items: vec![],
next_cursor: None,
total_scored: 0,
constraints_satisfied: true,
warnings: vec![],
};
assert_eq!(empty.len(), 0);
assert!(empty.is_empty());
let one = Results {
items: vec![RetrieveResult {
entity_id: EntityId::new(1),
score: 0.5,
rank: 1,
signal_snapshot: vec![],
}],
next_cursor: None,
total_scored: 1,
constraints_satisfied: true,
warnings: vec![],
};
assert_eq!(one.len(), 1);
assert!(!one.is_empty());
}
// === Signal struct tests ===
#[test]
fn signal_new() {
let sig = Signal::new(
"view",
EntityId::new(42),
1.0,
Timestamp::from_nanos(1_000_000),
);
assert_eq!(sig.signal_type, "view");
assert_eq!(sig.entity_id, EntityId::new(42));
assert!((sig.weight - 1.0).abs() < f64::EPSILON);
}
Property Tests
use proptest::prelude::*;
// P1: Cursor encode/decode is lossless for all valid offsets.
proptest! {
#[test]
fn cursor_roundtrip(offset in 0usize..1_000_000) {
let cursor = Cursor::from_offset(offset);
let encoded = cursor.encode();
let decoded = Cursor::decode(&encoded).unwrap();
prop_assert_eq!(cursor, decoded);
}
}
// P2: Builder always produces valid Retrieve when required fields are set.
proptest! {
#[test]
fn builder_valid_with_limit(limit in 1usize..=500) {
let result = Retrieve::builder()
.profile("test_profile")
.limit(limit)
.build();
prop_assert!(result.is_ok());
prop_assert_eq!(result.unwrap().limit, limit);
}
}
// P3: Builder rejects invalid limits.
proptest! {
#[test]
fn builder_rejects_invalid_limit(limit in 501usize..10_000) {
let result = Retrieve::builder()
.profile("test_profile")
.limit(limit)
.build();
prop_assert!(matches!(result, Err(QueryError::InvalidLimit { .. })));
}
}
Acceptance Criteria
Retrievestruct with all fields:entity_kind,profile,filters,diversity,limit,exclude,cursor,for_user,similar_to,contextRetrieveBuilderwith methods:entity(),profile(),profile_versioned(),filter(),diversity(),limit(),exclude(),exclude_ids(),cursor(),build()RetrieveBuilder::build()validates: limit in [1, 500], profile present,for_userandsimilar_torejected in M2Retrieve::combined_filter()returnsNonefor empty filters, single filter as-is, AND for multipleProfileRefwithnew()(latest version) andversioned()(pinned version)Resultsstruct withitems,next_cursor,total_scored,constraints_satisfied,warnings,len(),is_empty()RetrieveResultstruct withentity_id,score,rank,signal_snapshotCursorwithfrom_offset(),offset(),encode(),decode()-- base64 roundtrip is losslessCursor::decode()returnsQueryError::InvalidCursorfor invalid inputSignalstruct withnew()constructor and all fieldsQueryErrorenum withProfileNotFound,InvalidFilter,InvalidLimit,IndexNotAvailable,StorageError,InvalidCursor,UnsupportedStrategyQueryErrorimplementsDisplayandErrorRetrieve::validate()checks profile existence againstProfileRegistry, rejects unsupported candidate strategies- Property test: cursor roundtrip for all valid offsets
- Property test: builder accepts valid limits [1, 500], rejects [501, ...]
- No
unsafecode cargo clippy -- -D warningspasses- All unit tests and property tests pass
Research References
- docs/specs/08-query-engine.md -- Section 3.1 (
Retrievestruct fields), Section 3.4 (QueryErrorenum variants and validation rules), Section 8.2 (Cursorstructure and encoding)
Spec References
- docs/specs/08-query-engine.md -- Section 2.1 (RETRIEVE operation overview), Section 3.1 (Retrieve input struct), Section 3.4 (QueryError enum), Section 8 (Pagination: cursor design, encoding, semantics)
Implementation Notes
- Add
base64 = "0.22"to[dependencies]intidal/Cargo.toml. Thebase64crate is small (no transitive deps beyondstd) and provides theURL_SAFE_NO_PADengine for cursor encoding. - The
query/mod.rsfile should be created withpub mod retrieve;and re-exports of all public types. Thepub mod executor;line is added in Task 02 when the executor module is created. lib.rsshould addpub mod query;-- this is the first time the query module exists.FilterExpris imported fromcrate::storage::indexes::filter(m2p2). If the exact import path differs from the m2p2 implementation, adapt accordingly.DiversityConstraintsis imported fromcrate::ranking::diversity(m2p4). TheDiversityConstraints::new()and.max_per_creator()API must match the m2p4 implementation.- The
for_user: Option<u64>field usesu64rather thanUserIdbecauseUserId(a newtype overu64) is not defined until M3 when user entities are introduced. In M3, this field will be changed toOption<UserId>. - Do NOT add
serdederives to the query types for M2. Serialization of query types is an M5+ concern when the text parser and network protocol are built.
Migration from M1 QueryError Stub
This task must remove the M1 stub QueryError before adding the new one. A stub QueryError struct exists in tidal/src/schema/error.rs (simple { message: String } struct) and is re-exported through schema/mod.rs and wrapped by LumenError::Query. The M2 QueryError is a rich enum that replaces it.
Migration steps (in order):
- Remove the stub
QueryErrorstruct fromtidal/src/schema/error.rs - Add
pub use crate::query::retrieve::QueryError;intidal/src/schema/mod.rs(or update theFrom<QueryError>impl inLumenErrorto use the new path) - Update
LumenError::Queryvariant to holdcrate::query::retrieve::QueryError - Update the
From<QueryError> for LumenErrorimpl to use the new enum - Fix any existing tests in
schema/error.rsthat construct the oldQueryError { message: "..." }struct
Do NOT create query/retrieve.rs before completing steps 1-4 -- the name collision will cause confusing compilation errors.
Intentional Spec Deviations
The following fields differ from Spec 08 Section 3.1's Retrieve struct definition. These are intentional improvements:
| This task | Spec 08 | Reason |
|---|---|---|
entity_kind: EntityKind |
entity: EntityKind |
More explicit — avoids confusion with entity_id |
exclude: Vec<EntityId> |
exclude_ids: Vec<EntityId> |
Shorter; builder method .exclude() reads naturally |
profile: ProfileRef |
profile: String |
Richer type; supports version pinning for A/B testing |
diversity: Option<DiversityConstraints> |
diversity: Option<DiversitySpec> |
Matches m2p4's concrete type name |
These deviations are safe: the spec defines behavior, not names. Future language parsing (M5) maps text tokens to these struct fields.