tidaldb/docs/planning/milestone-1/phase-1/OVERVIEW.md
jordan 413b712c0a chore: initialize tidalDB repository with schema foundation and standards
- Schema phase 1 (tasks 01-02): EntityId, EntityKind, Timestamp, Score, SignalTypeDef, DecayModel, Window, WindowSet — all with property tests and benchmarks scaffolding
- Stub modules for storage, signals, query, ranking
- Full documentation suite: VISION, USE_CASES, SEQUENCE, API, CODING_GUIDELINES, ai-lookup, research docs, specs, roadmap, planning docs
- Marketing site (Next.js) with blog infrastructure
- .claude/ agents and skills for the tidalDB development workflow
- Foundation standards enforced: thiserror + tracing declared as dependencies, clippy::unwrap_used = deny added to lint config
- .gitignore hardened: .next/, node_modules/, .env, secrets, logs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 12:52:20 -07:00


Milestone 1 Phase 1.1: Core Type System and Schema

Phase Deliverable

This phase delivers two things. First, the foundational type system that every subsequent module depends on: entity IDs, signal type definitions, decay rate declarations, window specifications, and error types. Second, the schema module that validates and stores signal and entity definitions.

Acceptance Criteria

  • EntityId is a u64 newtype with Display, Hash, Eq, Ord
  • SignalTypeDef declaration captures: name, decay model (exponential/linear/permanent), half-life duration, enabled windows (1h/24h/7d/30d/all_time), velocity enabled flag
  • DecayModel::Exponential stores pre-computed lambda derived from half-life: lambda = ln(2) / half_life_seconds
  • LumenError enum covers Storage, NotFound, Schema, Durability, Query, Internal variants per CODING_GUIDELINES.md
  • Schema validation rejects: duplicate signal names, zero/negative half-life, empty window list on non-permanent signals, velocity without windows
  • All hot-path numeric types use the precision specified in research (f64 for decay scores, u64 for timestamps in nanoseconds)
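The first three criteria can be illustrated with a minimal sketch. It is not the project's actual code: the constructor `DecayModel::exponential`, the `weight` method, and the parameterization of the `Linear` variant (weight reaches 0.5 at the half-life and 0 at twice the half-life) are assumptions made for illustration only.

```rust
use std::fmt;
use std::time::Duration;

/// Internal entity identifier: a u64 newtype with Display, Hash, Eq, Ord.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct EntityId(pub u64);

impl fmt::Display for EntityId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Decay model. The exponential variant stores the pre-computed rate
/// so the hot path never has to divide by the half-life.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DecayModel {
    Exponential { lambda: f64 },
    /// Assumption: linear decay parameterized by its half-life in seconds.
    Linear { half_life_secs: f64 },
    Permanent,
}

impl DecayModel {
    /// Hypothetical constructor: lambda = ln(2) / half_life_seconds.
    /// Returns None for a zero half-life instead of producing an infinite rate.
    pub fn exponential(half_life: Duration) -> Option<Self> {
        let secs = half_life.as_secs_f64();
        if secs <= 0.0 {
            return None;
        }
        Some(DecayModel::Exponential {
            lambda: std::f64::consts::LN_2 / secs,
        })
    }

    /// Fraction of the original signal weight left after `elapsed_secs`.
    pub fn weight(&self, elapsed_secs: f64) -> f64 {
        match *self {
            DecayModel::Exponential { lambda } => (-lambda * elapsed_secs).exp(),
            DecayModel::Linear { half_life_secs } => {
                (1.0 - elapsed_secs / (2.0 * half_life_secs)).max(0.0)
            }
            DecayModel::Permanent => 1.0,
        }
    }
}
```

With a one-hour half-life, `weight(3600.0)` comes out at roughly 0.5, which is a quick sanity check that the stored lambda matches the formula in the criteria.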

Dependencies

  • Requires: Nothing -- this is the root of the dependency DAG
  • Blocks: Phase 1.2 (WAL), Phase 1.3 (Storage/fjall), and transitively all subsequent phases

Research References

Spec References

Task Index

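| #  | Task                              | Delivers                                                | Depends On       | Complexity |
|----|-----------------------------------|---------------------------------------------------------|------------------|------------|
| 01 | Core Identity and Temporal Types  | EntityId, EntityKind, Timestamp, Score                  | None             | S          |
| 02 | Signal Type Definitions           | SignalTypeDef, DecayModel, DecaySpec, Window, WindowSet | Task 01          | S          |
| 03 | Error Types and Schema Validation | LumenError, SchemaError, Schema, SchemaBuilder          | Task 01, Task 02 | S          |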
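The four rejection rules that Task 03 must enforce (listed in the acceptance criteria) can be sketched as a single validation pass. Everything here is a simplified stand-in: `SignalDef`, the `SchemaError` variant names, the `sig` helper, and the `validate` function are hypothetical, and the real `SignalTypeDef` and `SchemaBuilder` may be shaped quite differently.

```rust
use std::collections::HashSet;
use std::fmt;

/// Hypothetical, simplified stand-in for SignalTypeDef.
#[derive(Debug, Clone)]
pub struct SignalDef {
    pub name: String,
    /// None models a permanent (non-decaying) signal.
    pub half_life_secs: Option<f64>,
    /// Enabled window labels, e.g. "1h", "24h".
    pub windows: Vec<&'static str>,
    pub velocity: bool,
}

/// Hypothetical error variants, one per rejection rule.
#[derive(Debug, PartialEq)]
pub enum SchemaError {
    DuplicateSignal(String),
    InvalidHalfLife(String),
    NoWindows(String),
    VelocityWithoutWindows(String),
}

impl fmt::Display for SchemaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SchemaError::DuplicateSignal(n) => write!(f, "duplicate signal name: {}", n),
            SchemaError::InvalidHalfLife(n) => write!(f, "half-life must be positive: {}", n),
            SchemaError::NoWindows(n) => write!(f, "decaying signal has no windows: {}", n),
            SchemaError::VelocityWithoutWindows(n) => write!(f, "velocity needs windows: {}", n),
        }
    }
}

impl std::error::Error for SchemaError {}

/// Small helper to build a definition (illustration only).
pub fn sig(name: &str, half_life_secs: Option<f64>, windows: &[&'static str], velocity: bool) -> SignalDef {
    SignalDef { name: name.to_string(), half_life_secs, windows: windows.to_vec(), velocity }
}

/// Returns the first rule violation found, or Ok(()) if the set is valid.
pub fn validate(signals: &[SignalDef]) -> Result<(), SchemaError> {
    let mut seen = HashSet::new();
    for s in signals {
        // Rule 1: signal names are globally unique.
        if !seen.insert(s.name.as_str()) {
            return Err(SchemaError::DuplicateSignal(s.name.clone()));
        }
        // Rule 2: a declared half-life must be finite and strictly positive.
        if let Some(h) = s.half_life_secs {
            if !h.is_finite() || h <= 0.0 {
                return Err(SchemaError::InvalidHalfLife(s.name.clone()));
            }
        }
        if s.windows.is_empty() {
            // Rule 4: velocity is computed over windows, so it needs at least one.
            if s.velocity {
                return Err(SchemaError::VelocityWithoutWindows(s.name.clone()));
            }
            // Rule 3: only permanent signals may omit windows.
            if s.half_life_secs.is_some() {
                return Err(SchemaError::NoWindows(s.name.clone()));
            }
        }
    }
    Ok(())
}
```

Returning the first violation keeps the sketch short; a builder that collects every violation before failing would be an equally reasonable design.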

Task Dependency DAG

Task 01: Core Identity Types
    |
    v
Task 02: Signal Type Definitions  (uses EntityKind from Task 01)
    |
    v
Task 03: Error Types + Schema Validation  (uses EntityId, SignalTypeDef, DecayModel, Window)

Tasks 01 and 02 are technically parallelizable if EntityKind is extracted first, but at complexity S each, sequential execution is fine.

File Layout

tidal/src/
  lib.rs              -- pub mod declarations, Result<T> alias, re-exports
  schema/
    mod.rs            -- pub use re-exports from submodules
    entity.rs         -- Task 01: EntityId, EntityKind
    timestamp.rs      -- Task 01: Timestamp newtype
    score.rs          -- Task 01: Score newtype (finite f64 with Ord)
    signal.rs         -- Task 02: SignalTypeDef, DecayModel, Window, WindowSet
    error.rs          -- Task 03: LumenError, SchemaError, sub-error stubs
    validation.rs     -- Task 03: Schema, SchemaBuilder, DecaySpec, SignalBuilder
  signals/mod.rs      -- empty (Phase 1.4)
  storage/mod.rs      -- empty (Phase 1.3)
  query/mod.rs        -- empty (Milestone 2)
  ranking/mod.rs      -- empty (Milestone 2)
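The layout describes `score.rs` as a "finite f64 with Ord". One way to get a total order on a float is to reject non-finite values at construction, as in the sketch below. The constructor name `new` and the accessor `get` are assumptions, not the project's confirmed API.

```rust
use std::cmp::Ordering;

/// A score that is guaranteed finite, which makes a total order well defined.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Score(f64);

impl Score {
    /// Rejects NaN and infinities so every stored value is comparable.
    pub fn new(value: f64) -> Option<Self> {
        if value.is_finite() {
            Some(Score(value))
        } else {
            None
        }
    }

    pub fn get(self) -> f64 {
        self.0
    }
}

// Sound because NaN, the only f64 value that is not equal to itself,
// can never be stored.
impl Eq for Score {}

impl PartialOrd for Score {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for Score {
    fn cmp(&self, other: &Self) -> Ordering {
        // partial_cmp only returns None when a NaN is involved, which the
        // constructor rules out; the fallback exists to avoid unwrap().
        self.0.partial_cmp(&other.0).unwrap_or(Ordering::Equal)
    }
}
```

Because `Score` implements `Ord`, it can be used directly with `sort`, `max`, and as a key in ordered collections such as `BTreeMap`, none of which accept a bare `f64`.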

Open Questions

  1. String vs u64 entity IDs in public API -- API.md uses string IDs ("item_abc"), internal types use u64. Resolution: EntityId is u64 internally. String-to-u64 mapping is a Phase 1.5 concern when the public Lumen API is built. Phase 1.1 defines only the internal type.

  2. EntityId uniqueness scope -- globally unique or per-EntityKind? Resolution: entity IDs are scoped per-EntityKind by storage namespace, and separate column families isolate those namespaces. Signal names, by contrast, are globally unique (no item.view vs user.view).

  3. Custom windows -- Window::Custom(Duration) is deferred. The five fixed variants cover every sort mode and ranking profile in the spec, and adding custom windows would require dynamic bucket allocation. Revisit if M5 benchmarks demand it.
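To show why five fixed variants keep things simple, here is a sketch of `Window` as a fieldless enum and `WindowSet` as a one-byte bitset. The variant names, the bitset representation, and all method names are assumptions; the source only states that the windows are 1h, 24h, 7d, 30d, and all_time.

```rust
use std::time::Duration;

/// The five fixed windows; variant names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Window {
    OneHour,
    OneDay,
    SevenDays,
    ThirtyDays,
    AllTime,
}

impl Window {
    pub const ALL: &'static [Window] = &[
        Window::OneHour,
        Window::OneDay,
        Window::SevenDays,
        Window::ThirtyDays,
        Window::AllTime,
    ];

    /// Length of the window; None for the unbounded all-time window.
    pub fn duration(self) -> Option<Duration> {
        const HOUR: u64 = 3600;
        match self {
            Window::OneHour => Some(Duration::from_secs(HOUR)),
            Window::OneDay => Some(Duration::from_secs(24 * HOUR)),
            Window::SevenDays => Some(Duration::from_secs(7 * 24 * HOUR)),
            Window::ThirtyDays => Some(Duration::from_secs(30 * 24 * HOUR)),
            Window::AllTime => None,
        }
    }

    /// One bit per variant, derived from the enum discriminant.
    fn bit(self) -> u8 {
        1 << (self as u8)
    }
}

/// A set of enabled windows packed into a single byte.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct WindowSet(u8);

impl WindowSet {
    pub fn empty() -> Self {
        WindowSet(0)
    }

    /// Returns a copy of the set with `w` enabled.
    pub fn with(self, w: Window) -> Self {
        WindowSet(self.0 | w.bit())
    }

    pub fn contains(self, w: Window) -> bool {
        self.0 & w.bit() != 0
    }

    pub fn is_empty(self) -> bool {
        self.0 == 0
    }

    pub fn len(self) -> u32 {
        self.0.count_ones()
    }

    /// Iterates enabled windows in declaration order.
    pub fn iter(self) -> impl Iterator<Item = Window> {
        Window::ALL.iter().copied().filter(move |w| self.contains(*w))
    }
}
```

A `Window::Custom(Duration)` variant would break this representation, since a payload-carrying variant cannot be mapped to a fixed bit position. That is one concrete form of the dynamic allocation cost mentioned above.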