Implements the foundation of tidalDB's data pipeline:

**Phase 1 – Schema primitives**
- EntityId newtype (u64, big-endian ordering)
- SignalTypeDefinition with pre-computed decay λ, deduped/sorted windows
- SchemaBuilder with full constraint validation (duplicates, identifiers, half-life, windows, velocity)
- LumenError wrapping all subsystems with required From impls

**Phase 2 – Write-Ahead Log**
- Length-prefixed, BLAKE3-protected entry format
- Group-commit writer (batch up to 100 events / 10 ms)
- Double-buffered content-hash deduplication
- Checkpoint, truncation, and crash-recovery with full replay
- Integration, property, and UAT tests (incl. 5,500-event deterministic UAT)
- Proptest coverage scaled to 10 000 events/run (was ≤500) to meet acceptance criterion; cases reduced 100→10 to keep runtime comparable

**Phase 3 – Storage engine**
- StorageEngine trait (get/put/delete/scan/batch/flush)
- Key encoding: [EntityId][0x00][Tag][suffix] with ordering/prefix helpers
- InMemoryBackend (BTreeMap + RwLock)
- FjallStorage with three isolated keyspaces and atomic batch helper
- Property tests for key ordering and round-trip correctness

Also adds planning docs for phases 4-5, research docs, architecture overview, and roadmap updates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Jon Gjengset: I don't ship what I wouldn't trust at 3am during a production incident. Pay attention to what the user says and follow it. Do not make them repeat themselves.
tidalDB
A single-node-first, embeddable Rust database for the personalized content ranking problem. Replaces the 6-system stack (Elasticsearch + Redis + Kafka + feature store + vector DB + ranking service) with a single process, single query interface, and single operational model.
Status: Vision and specification phase. No implementation yet.
Find Your Guide
| If you need to... | Read this |
|---|---|
| Understand the vision | VISION.md |
| See use cases and surfaces | USE_CASES.md |
| See sequence diagrams | SEQUENCE.md |
| Understand the system architecture | ARCHITECTURE.md |
| Look up domain concepts | ai-lookup/index.md |
| Follow coding standards | CODING_GUIDELINES.md |
| See the API spec | API.md |
| Read architectural lessons | thoughts.md |
| Read technical research | docs/research/ |
Agents
| Agent | Identity | Use when |
|---|---|---|
| @tidal-engineer | Jon Gjengset | Implementing features, designing storage internals, building the signal system, debugging correctness issues |
| @tidal-visionary | Spencer Kimball | Planning roadmaps, defining milestones, scoping phases, making build-vs-defer decisions |
| @tidal-researcher | Andy Pavlo | Investigating best practices, surveying prior art, evaluating libraries, producing research documents |
| @tidal-storyteller | — | Building the marketing site, writing blog posts, crafting public-facing copy |
Skills
Phase Lifecycle
| Step | Skill | Use when |
|---|---|---|
| 1. Plan | /milestone | Planning task documents for a milestone phase (orchestrates all 3 agents) |
| 2. Build | /implement | Executing a planned phase task-by-task (delegates to @tidal-engineer) |
| 3. Review | /review | Reviewing completed phase against spec and coding standards (delegates to @tidal-engineer) |
| 4. Accept | /uat | User acceptance testing a reviewed phase (delegates to @tidal-engineer) |
Other Skills
| Skill | Use when |
|---|---|
| /tidal-deliver-task | End-to-end feature delivery orchestrating all 4 agents (scope -> research -> build -> review -> accept) |
| /tidal-verify-completion-to-spec | Joint spec verification from all 3 agent lenses in parallel (product fit, research grounding, implementation correctness); use any time, not just after /implement |
| /develop | Quick implementation work outside the milestone lifecycle |
| /research [topic] | Investigating best practices, evaluating approaches (delegates to @tidal-researcher) |
| /roadmap | Building or updating the milestone roadmap (delegates to @tidal-visionary) |
| /build-site | Creating or iterating on the marketing site |
| /write-blog | Writing blog posts about progress or architecture |
Core Domain Model
- Entities: Items (content), Users, Creators — each with metadata, embedding slot, signal ledger
- Signals: Typed, timestamped event streams with native decay, velocity, and windowed aggregation
- Relationships: Weighted, directional edges between entities (follows, blocks, interactions)
- Ranking Profiles: Named, versioned scoring functions declared in schema
- Query: Single operation combining retrieval, filtering, ranking, and diversity enforcement
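To make "native decay" concrete, here is a minimal sketch of half-life-based exponential decay, assuming the decay constant is derived as λ = ln(2) / half-life and each event contributes weight · exp(-λ · age). The type and method names (`DecaySpec`, `from_half_life_secs`, `weight`, `score`) are hypothetical and chosen for illustration; they are not taken from tidalDB's API.

```rust
/// Hypothetical decay specification; names are illustrative, not tidalDB's API.
#[derive(Debug, Clone, Copy)]
pub struct DecaySpec {
    /// Decay constant λ = ln(2) / half_life, computed once at construction.
    lambda: f64,
}

impl DecaySpec {
    /// Builds a spec from a half-life in seconds.
    /// Returns `None` for a non-positive or non-finite half-life.
    pub fn from_half_life_secs(half_life_secs: f64) -> Option<Self> {
        if half_life_secs.is_finite() && half_life_secs > 0.0 {
            Some(Self {
                lambda: std::f64::consts::LN_2 / half_life_secs,
            })
        } else {
            None
        }
    }

    /// Weight of one event observed `age_secs` ago: exp(-λ · age).
    /// Negative ages (events "from the future") are clamped to zero.
    pub fn weight(&self, age_secs: f64) -> f64 {
        (-self.lambda * age_secs.max(0.0)).exp()
    }

    /// Decayed sum over `(timestamp_secs, weight)` events, evaluated at `now_secs`.
    pub fn score(&self, events: &[(f64, f64)], now_secs: f64) -> f64 {
        events
            .iter()
            .map(|&(ts, w)| w * self.weight(now_secs - ts))
            .sum()
    }
}
```

With a one-hour half-life, an event that is one hour old counts half as much as a fresh one, and an event two hours old counts a quarter as much. Computing λ once at construction avoids repeating the division on every scoring call.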
Ports
Dev servers use port range 59520–59529 (e.g. site/ on 59520).
Critical Rules
- Scope: This is NOT a general-purpose database. Every decision serves one question: "given a user and a context, what content should they see, in what order?"
- Embeddings: The database retrieves and ranks over vectors. It does NOT generate them.
- Signals are primitives: Decay, velocity, and windowed aggregation are native — not application logic.
- Single-node first: Embeddable. Scales vertically before horizontally.
- Language: Rust.
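As a rough illustration of what treating windowed aggregation and velocity as primitives could mean, the sketch below counts events inside a trailing window and derives a rate from that count. It assumes velocity means "events per second over the window"; that definition, along with the function names `window_count` and `velocity_per_sec`, is an assumption made for this example and not something the specification states.

```rust
/// Hypothetical helper: counts events with `now - window < t <= now`.
/// Timestamps are seconds; events later than `now_secs` are ignored.
pub fn window_count(timestamps_secs: &[u64], now_secs: u64, window_secs: u64) -> usize {
    timestamps_secs
        .iter()
        // `t <= now_secs` is checked first, so the subtraction cannot underflow.
        .filter(|&&t| t <= now_secs && now_secs - t < window_secs)
        .count()
}

/// Hypothetical velocity: events per second over the trailing window.
/// Returns `None` for a zero-length window, where a rate is undefined.
pub fn velocity_per_sec(timestamps_secs: &[u64], now_secs: u64, window_secs: u64) -> Option<f64> {
    if window_secs == 0 {
        return None;
    }
    let count = window_count(timestamps_secs, now_secs, window_secs);
    Some(count as f64 / window_secs as f64)
}
```

A linear scan is enough for a sketch; an engine that keeps these aggregates natively would more plausibly maintain them incrementally on write instead of rescanning the ledger per query.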
Repository Structure
. # Top-level docs and configuration
├── CLAUDE.md # This file — project instructions
├── VISION.md # Product vision and thesis
├── USE_CASES.md # 14 use cases, all discovery surfaces
├── SEQUENCE.md # Data flow sequence diagrams
├── CODING_GUIDELINES.md # Engineering standards
├── API.md # API specification
├── thoughts.md # Architectural lessons from sister projects
├── ai-lookup/ # Domain concept reference
├── docs/ # Research and documentation
│ └── research/ # Deep technical research docs
├── .claude/ # Claude Code configuration
│ ├── agents/ # Agent definitions
│ └── skills/ # Skill definitions
├── tidal/ # Rust database engine
│ ├── Cargo.toml
│ ├── src/
│ │ ├── storage/ # Entity store, signal ledger, inverted index, HNSW
│ │ ├── query/ # Query parser, planner, executor
│ │ ├── ranking/ # Profile engine, signal scoring, diversity enforcement
│ │ ├── signals/ # Signal types, decay, velocity, windowed aggregation
│ │ └── schema/ # Schema definition, validation, migrations
│ ├── benches/ # Performance benchmarks
│ └── tests/ # Integration and property tests
└── site/ # Public marketing site (Next.js)
Pre-commit Hooks
The pre-commit hook runs automatically on staged files:
- tidal/ (Rust): cargo fmt (auto-fix + re-stage), cargo clippy -- -D warnings, cargo test --lib
- site/ (Next.js): eslint (if node_modules installed)
All cargo commands use --manifest-path tidal/Cargo.toml since the Rust project is not at repo root.