tidaldb/CHANGELOG.md
2026-02-23 22:41:16 -07:00

# Changelog

All notable changes to tidalDB will be documented in this file.

## [Unreleased]

## [0.1.0] - 2026-02-23

### Added

**Core Database Engine**
- `TidalDb` embeddable database with `ephemeral()` and `with_data_dir()` open modes
- `SchemaBuilder` for defining signal types, decay parameters, and ranking profiles
- `TidalDbBuilder` fluent builder with schema, data directory, metrics, and rate limiter configuration

**Signal System**
- Typed signal recording with exponential decay scoring
- Hot-tier (DashMap) and warm-tier (BucketedCounter) signal storage
- Windowed aggregation: `OneHour`, `TwentyFourHours`, `SevenDays`, `AllTime`
- Signal velocity tracking
- WAL-backed signal durability with crash recovery
- Periodic signal checkpointing to fjall (every 30s)
- WAL compaction after each checkpoint
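
As a rough illustration of the exponential decay scoring listed above, the sketch below weights each signal by its age using a half-life. The `SignalEvent` type, the `decayed_score` function, the half-life parameterization, and the nanosecond timestamps are assumptions made for this example and are not taken from tidalDB's API.

```rust
/// One recorded signal: a weight and the time it was recorded, in nanoseconds.
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy)]
pub struct SignalEvent {
    pub weight: f64,
    pub recorded_at_ns: u64,
}

/// Sum of signal weights, each multiplied by 0.5^(age / half_life).
/// Signals timestamped in the future are treated as having zero age.
pub fn decayed_score(events: &[SignalEvent], now_ns: u64, half_life_ns: u64) -> f64 {
    assert!(half_life_ns > 0, "half-life must be positive");
    events
        .iter()
        .map(|e| {
            let age_ns = now_ns.saturating_sub(e.recorded_at_ns) as f64;
            e.weight * 0.5_f64.powf(age_ns / half_life_ns as f64)
        })
        .sum()
}
```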

**Retrieval (RETRIEVE query)**
- 5-stage pipeline: universe, filter, score, diversify, return
- Filter expressions: `Eq`, `In`, `Gt`, `Lt`, `And`, `Or`, `Not`, `InCollection`, `InProgress`, `MinSignal`, `MaxSignal`, `NearLocation`
- Built-in ranking profiles: `trending`, `for_you`, `new`, `popular`, `recent`, and 20+ more
- Custom ranking profiles via `SchemaBuilder`
- Diversity enforcement (max N per category/creator)
- Sort modes: `Relevance`, `Trending`, `Newest`, `MostLiked`, `MostViewed`, `MostFollowed`, `AlphabeticalAsc/Desc`, `Shortest/Longest`, `LiveViewerCount`, `DateSaved`, and more
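
The diversity stage ("max N per category/creator") can be pictured as a single pass over the ranked candidates. This is a minimal sketch under assumed types: `diversify` and the `(id, key)` tuple representation are hypothetical and do not reflect how tidalDB implements the stage.

```rust
use std::collections::HashMap;

/// Walk candidates in rank order and keep at most `max_per_key` entries
/// for each key (for example a category or creator name).
pub fn diversify<'a>(ranked: &[(u64, &'a str)], max_per_key: usize) -> Vec<(u64, &'a str)> {
    let mut seen: HashMap<&'a str, usize> = HashMap::new();
    let mut kept = Vec::new();
    for &(id, key) in ranked {
        let count = seen.entry(key).or_insert(0);
        if *count < max_per_key {
            *count += 1;
            kept.push((id, key));
        }
    }
    kept
}
```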

**Search (SEARCH query)**
- BM25 full-text search via Tantivy
- Approximate nearest-neighbor (ANN) semantic search via USearch HNSW
- Reciprocal Rank Fusion (RRF) combining the BM25 and ANN rankings
- Creator search with `entity_kind(EntityKind::Creator)`
- `similar_to(EntityId)` for content-based recommendations
- Scope pre-filters: `Trending`, `CohortTrending`, `Following`, `Category`, `Collection`
- Autocomplete suggestions via `db.suggest()`
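
For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the standard formula: each ranked list contributes `1 / (k + rank)` per document, and the contributions are summed. The `rrf_fuse` function and the `u64` document IDs are hypothetical. `k = 60` is a commonly used constant, not a value confirmed for tidalDB.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document IDs (best first) with RRF.
/// Ranks start at 1, so the top entry of a list contributes 1 / (k + 1).
pub fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, &doc) in ranking.iter().enumerate() {
            *scores.entry(doc).or_insert(0.0) += 1.0 / (k + (idx + 1) as f64);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by ID so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```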

**Entity Model**
- Three built-in entity types: `Item`, `User`, `Creator`
- Metadata storage as `HashMap<String, String>`
- Embedding slots (up to 4 per entity type) via USearch
- Relationships: `Follows`, `Blocks`, `Hide`, `Mute`, `InteractionWeight`

**Sessions**
- Session lifecycle: `open_session`, `close_session`
- Cross-session preference vector updates (EMA blend)
- Session snapshots with signal state and preference vectors
- Session serialization format version `0x03`, with backward compatibility
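
The "EMA blend" used for cross-session preference updates can be sketched as an exponential moving average over vector components. The `ema_blend` function, the `f32` element type, and the `alpha` parameter are assumptions made for illustration. The blend factor tidalDB actually uses is not stated here.

```rust
/// Blend a session vector into a long-term preference vector in place:
/// preference[i] = (1 - alpha) * preference[i] + alpha * session[i].
/// A larger `alpha` gives the most recent session more influence.
pub fn ema_blend(preference: &mut [f32], session: &[f32], alpha: f32) {
    assert_eq!(preference.len(), session.len(), "vectors must have the same dimension");
    assert!((0.0..=1.0).contains(&alpha), "alpha must be in [0, 1]");
    for (p, s) in preference.iter_mut().zip(session) {
        *p = (1.0 - alpha) * *p + alpha * *s;
    }
}
```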

**Social Graph**
- Creator follower/following indexes
- Cohort membership (user segments)
- CoEngagementIndex for co-viewing patterns with LRU eviction
- Social graph filter for "followed creator" content scoping

**Collections**
- Named collections with `Private`, `Shared`, `Public` visibility
- `create_collection`, `add_to_collection`, `remove_from_collection`, `list_collections`
- `FilterExpr::InCollection` for collection-scoped retrieval
- Saved searches with `save_search`, `list_saved_searches`, `retrieve_saved_search`

**Observability**
- `enable_metrics(addr)` -- Prometheus-format `/metrics` endpoint + `/healthz` JSON
- 15+ metrics: signal writes, WAL lag, checkpoint age, degradation level, index health
- `tidaldb_checkpoint_failures_total` counter for checkpoint monitoring
- `TidalDb::diagnostics()` -- structured health snapshot
- WAL diagnostics and recovery tools

**Safety**
- Signal weight NaN/Inf validation (returns `TidalError::InvalidInput`)
- Metadata size bounds: at most 64 keys, 8 KB per value, and 64 KB in total
- Export request limit: 500K signals max per request
- `FilterExpr` complexity limit: 256 nodes max
- Data directory lock (`tidaldb.lock`) prevents dual-process corruption
- Schema fingerprint persistence detects decay parameter changes on reopen
- Bounded `closed_sessions` cache (10K max, LRU eviction)
- Metrics server non-loopback bind warning
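
The weight and metadata checks above amount to simple bounds validation, sketched below. Several details are assumptions: `ValidationError` is a stand-in for `TidalError::InvalidInput`, 1 KB is taken as 1024 bytes, and the total is counted as key bytes plus value bytes. None of these are confirmed by this changelog.

```rust
use std::collections::HashMap;

const MAX_KEYS: usize = 64;
const MAX_VALUE_BYTES: usize = 8 * 1024; // assumes 1 KB = 1024 bytes
const MAX_TOTAL_BYTES: usize = 64 * 1024;

/// Hypothetical error type standing in for `TidalError::InvalidInput`.
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    NonFiniteWeight,
    TooManyKeys,
    ValueTooLarge,
    TotalTooLarge,
}

/// Reject NaN and positive or negative infinity.
pub fn validate_weight(weight: f64) -> Result<f64, ValidationError> {
    if weight.is_finite() {
        Ok(weight)
    } else {
        Err(ValidationError::NonFiniteWeight)
    }
}

/// Enforce the key-count, per-value, and total-size limits.
pub fn validate_metadata(meta: &HashMap<String, String>) -> Result<(), ValidationError> {
    if meta.len() > MAX_KEYS {
        return Err(ValidationError::TooManyKeys);
    }
    let mut total = 0usize;
    for (key, value) in meta {
        if value.len() > MAX_VALUE_BYTES {
            return Err(ValidationError::ValueTooLarge);
        }
        total += key.len() + value.len(); // assumption: keys count toward the total
    }
    if total > MAX_TOTAL_BYTES {
        return Err(ValidationError::TotalTooLarge);
    }
    Ok(())
}
```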

**CLI (`tidalctl`)**
- `tidalctl` binary for database inspection and diagnostics

**RLHF / ML Export**
- `db.export_signals(ExportRequest)` -- WAL-based signal export for training data
- `db.user_session_summary(user_id, since_ns)` -- aggregated session statistics

### Stability
tidalDB `0.1.0` is pre-1.0. **No API or data format stability guarantees** are made for `0.x` releases. Upgrade guides will be provided for each minor version bump. Do not upgrade `0.x` to `0.y` on a live data directory without reading the release notes.

---
*Format based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)*