# Milestone 1, Phase 1: Core Type System and Schema

## Phase Deliverable

The foundational type system -- entity IDs, signal type definitions, decay rate declarations, window specifications, and the error types that every subsequent module depends on. The schema module that validates and stores signal/entity definitions.

## Acceptance Criteria

- [ ] `EntityId` is a u64 newtype with `Display`, `Hash`, `Eq`, `Ord`
- [ ] `SignalTypeDef` declaration captures: name, decay model (exponential/linear/permanent), half-life duration, enabled windows (1h/24h/7d/30d/all_time), velocity enabled flag
- [ ] `DecayModel::Exponential` stores pre-computed lambda derived from half-life: `lambda = ln(2) / half_life_seconds`
- [ ] `LumenError` enum covers Storage, NotFound, Schema, Durability, Query, Internal variants per CODING_GUIDELINES.md
- [ ] Schema validation rejects: duplicate signal names, zero/negative half-life, empty window list on non-permanent signals, velocity without windows
- [ ] All hot-path numeric types use the precision specified in research (f64 for decay scores, u64 for timestamps in nanoseconds)

## Dependencies

- **Requires:** Nothing -- this is the root of the dependency DAG
- **Blocks:** m1p2 (WAL), m1p3 (Storage/fjall), and transitively all subsequent phases

## Research References

- [docs/research/tidaldb_signal_ledger.md](../../../research/tidaldb_signal_ledger.md) -- decay formula, EntityState struct, running-score approach
- [docs/research/phase1_1_type_system.md](../../../research/phase1_1_type_system.md) -- newtype patterns, Duration handling, error hierarchy, schema validation, f64 precision analysis, Window enum design
- [CODING_GUIDELINES.md](../../../../CODING_GUIDELINES.md) -- error handling (section 7), module boundaries (section 9), dependencies (section 10)
- [thoughts.md](../../../../thoughts.md) -- Part V.12 (subject-prefix keys), Part II.1 (WAL convergence)

## Spec References

- [docs/specs/03-signal-system.md](../../../specs/03-signal-system.md) -- signal type declaration, decay types and lambda precomputation, window definitions, signal ledger architecture
- [docs/specs/11-schema.md](../../../specs/11-schema.md) -- schema definition API, type system, validation rules, schema versioning
- [docs/specs/02-entity-model.md](../../../specs/02-entity-model.md) -- EntityKind (Item/User/Creator), entity ID encoding, storage representation
- [docs/specs/01-storage-engine.md](../../../specs/01-storage-engine.md) -- key encoding scheme using big-endian EntityId and Timestamp
- [docs/specs/00-architecture-overview.md](../../../specs/00-architecture-overview.md) -- system architecture, code module map showing schema/ layout

## Task Index

| # | Task | Delivers | Depends On | Complexity |
|---|------|----------|------------|------------|
| 01 | Core Identity and Temporal Types | `EntityId`, `EntityKind`, `Timestamp`, `Score` | None | S |
| 02 | Signal Type Definitions | `SignalTypeDef`, `DecayModel`, `DecaySpec`, `Window`, `WindowSet` | Task 01 | S |
| 03 | Error Types and Schema Validation | `LumenError`, `SchemaError`, `Schema`, `SchemaBuilder` | Task 01, Task 02 | S |

## Task Dependency DAG

```
Task 01: Core Identity Types
    |
    v
Task 02: Signal Type Definitions (uses EntityKind from Task 01)
    |
    v
Task 03: Error Types + Schema Validation (uses EntityId, SignalTypeDef, DecayModel, Window)
```

Tasks 01 and 02 are technically parallelizable if `EntityKind` is extracted first, but at complexity S each, sequential execution is fine.
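To make the Task 01 and Task 02 deliverables concrete, here is a minimal sketch of the `EntityId` newtype and the lambda precomputation from the acceptance criteria. It is an illustration under assumptions, not the project's actual code: the constructor name `exponential`, the helper `factor_after`, the `to_be_bytes` wrapper, and the payload-free `Linear` variant are all hypothetical. The `exp(-lambda * t)` form is the standard half-life decay, used here only to check the precomputed lambda; the authoritative formula lives in the research doc.

```rust
use std::fmt;
use std::time::Duration;

/// Internal entity identifier: a `u64` newtype (per the acceptance criteria).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct EntityId(pub u64);

impl fmt::Display for EntityId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl EntityId {
    /// Big-endian bytes, so byte-wise key order matches numeric order.
    /// (Hypothetical helper; the storage spec only says keys are big-endian.)
    pub fn to_be_bytes(self) -> [u8; 8] {
        self.0.to_be_bytes()
    }
}

/// Decay model. Only the `Exponential` payload is stated in this document;
/// `Linear` is left without a payload here as a placeholder assumption.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DecayModel {
    Exponential { lambda: f64 },
    Linear,
    Permanent,
}

impl DecayModel {
    /// Pre-computes `lambda = ln(2) / half_life_seconds`.
    /// Returns `None` for a zero half-life, which schema validation rejects.
    pub fn exponential(half_life: Duration) -> Option<Self> {
        let secs = half_life.as_secs_f64();
        if secs > 0.0 {
            Some(DecayModel::Exponential {
                lambda: std::f64::consts::LN_2 / secs,
            })
        } else {
            None
        }
    }
}

/// Standard exponential decay multiplier after `elapsed_secs` (assumed form).
pub fn factor_after(lambda: f64, elapsed_secs: f64) -> f64 {
    (-lambda * elapsed_secs).exp()
}
```

With a one-hour half-life, `factor_after(lambda, 3600.0)` comes out at one half (up to floating-point rounding), which is a quick sanity check that lambda was derived from the half-life and not from some other time constant.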
## File Layout

```
tidal/src/
  lib.rs             -- pub mod declarations, Result alias, re-exports
  schema/
    mod.rs           -- pub use re-exports from submodules
    entity.rs        -- Task 01: EntityId, EntityKind
    timestamp.rs     -- Task 01: Timestamp newtype
    score.rs         -- Task 01: Score newtype (finite f64 with Ord)
    signal.rs        -- Task 02: SignalTypeDef, DecayModel, Window, WindowSet
    error.rs         -- Task 03: LumenError, SchemaError, sub-error stubs
    validation.rs    -- Task 03: Schema, SchemaBuilder, DecaySpec, SignalBuilder
  signals/mod.rs     -- empty (m1p4)
  storage/mod.rs     -- empty (m1p3)
  query/mod.rs       -- empty (Milestone 2)
  ranking/mod.rs     -- empty (Milestone 2)
```

## Open Questions

1. **String vs u64 entity IDs in public API** -- API.md uses string IDs (`"item_abc"`), internal types use `u64`. Resolution: `EntityId` is `u64` internally. String-to-u64 mapping is a m1p5 concern when the public `Lumen` API is built. m1p1 defines only the internal type.
2. **EntityId uniqueness scope** -- globally unique or per-EntityKind? Resolution: signal names are globally unique (no `item.view` vs `user.view`). Entity IDs are scoped per-EntityKind by storage namespace. Different column families isolate the namespaces.
3. **Custom windows** -- `Window::Custom(Duration)` deferred. The five fixed variants cover every sort mode and ranking profile in the spec. Adding custom windows would require dynamic bucket allocation. Revisit if M5 benchmarks demand it.
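The five fixed windows and the schema validation rules from the acceptance criteria can be sketched together as follows. This is a rough illustration only: `SignalSpec` and `validate` are hypothetical stand-ins for `SignalTypeDef` and the `SchemaBuilder` logic, the `Window` variant names and the `SchemaError` variants are assumptions (this document names the types but not their variants), and modelling a permanent signal as `half_life_secs: None` is a simplification.

```rust
use std::collections::HashSet;

/// The five fixed windows (1h/24h/7d/30d/all_time); variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Window {
    OneHour,
    OneDay,
    SevenDays,
    ThirtyDays,
    AllTime,
}

impl Window {
    /// Window width in seconds; `None` for the unbounded all-time window.
    pub fn secs(self) -> Option<u64> {
        match self {
            Window::OneHour => Some(3_600),
            Window::OneDay => Some(86_400),
            Window::SevenDays => Some(7 * 86_400),
            Window::ThirtyDays => Some(30 * 86_400),
            Window::AllTime => None,
        }
    }
}

/// Hypothetical, simplified signal declaration.
/// `half_life_secs == None` stands in for a permanent (non-decaying) signal.
#[derive(Debug, Clone)]
pub struct SignalSpec {
    pub name: String,
    pub half_life_secs: Option<f64>,
    pub windows: Vec<Window>,
    pub velocity: bool,
}

/// Assumed error variants, one per rejection rule.
#[derive(Debug, PartialEq)]
pub enum SchemaError {
    DuplicateSignal(String),
    InvalidHalfLife(String),
    EmptyWindows(String),
    VelocityWithoutWindows(String),
}

/// Applies the four rejection rules and returns the first violation found.
pub fn validate(signals: &[SignalSpec]) -> Result<(), SchemaError> {
    let mut seen = HashSet::new();
    for s in signals {
        // Rule 1: signal names are globally unique.
        if !seen.insert(s.name.as_str()) {
            return Err(SchemaError::DuplicateSignal(s.name.clone()));
        }
        if let Some(h) = s.half_life_secs {
            // Rule 2: half-life must be positive (NaN and infinity also rejected).
            if !(h.is_finite() && h > 0.0) {
                return Err(SchemaError::InvalidHalfLife(s.name.clone()));
            }
            // Rule 3: non-permanent signals need at least one window.
            if s.windows.is_empty() {
                return Err(SchemaError::EmptyWindows(s.name.clone()));
            }
        }
        // Rule 4: velocity is computed over windows, so it needs one.
        if s.velocity && s.windows.is_empty() {
            return Err(SchemaError::VelocityWithoutWindows(s.name.clone()));
        }
    }
    Ok(())
}
```

Because `Window` is a closed enum, every window width is known at compile time. A `Window::Custom(Duration)` variant would remove that property, which is the bucket-allocation concern behind deferring it.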