# Episteme (StemeDB) Roadmap > **Goal:** Build the "Git for Truth" substrate for autonomous AI research. > **Current Phase:** Phase 0 (Planning) --- ## 📅 High-Level Timeline | Phase | Codename | Focus | Key Deliverable | | :--- | :--- | :--- | :--- | | **1** | **The Spine** | Storage & Safety | Append-only WAL + KV Store | | **2** | **The Lattice** | Indexing & Query | Lens Engine + HTTP API | | **3** | **The Cortex** | Branching & Vectors | Semantic Search + Forking | | **4** | **The Hive** | Trust & Consensus | TrustRank + Replication | --- ## 🛠 Detailed Milestones ### Phase 1: The Spine (Foundation) *Goal: Securely ingest assertions and persist them without data loss.* - [ ] **Project Scaffold**: Initialize Rust workspace, set up linting/CI (clippy, fmt). - [ ] **Assertion Schema**: Define the `Assertion` struct with `rkyv` serialization. - [ ] **WAL Integration**: Implement `quarantine-journal` pattern for write-ahead logging. - [ ] **Storage Engine**: Implement the `Store` trait using `sled` (embedded KV). - [ ] **Basic Ingestor**: Background worker that tails WAL and writes to KV. - [ ] **Verification**: Write tests proving crash recovery (write -> crash -> restart -> read). ### Phase 2: The Lattice (Connectivity) *Goal: Query data by Subject/Predicate and resolve simple conflicts.* - [ ] **Indexing**: Implement `Subject -> List` and `S:P -> List` indexes. - [ ] **Lens Architecture**: Define the `Lens` trait for read-time resolution. - [ ] **Lens: Recency**: Implement "Last Writer Wins" logic (Baseline). - [ ] **Lens: Consensus**: Implement simple "Vote Count" logic. - [ ] **API Surface**: Build `axum` HTTP server (`POST /assert`, `GET /query`). - [ ] **CLI**: Basic CLI tool for interacting with the DB manually. ### Phase 3: The Cortex (Reasoning) *Goal: Enable semantic search and "What If" scenarios.* - [ ] **Branching Core**: Implement Overlay Graph logic for "Forking Reality." - [ ] **Vector Storage**: Integrate `hnsw-rs` or similar for embedding storage. - [ ] **Semantic Search**: Implement k-NN query support in the API. - [ ] **Lens: Skeptic**: Implement variance analysis (finding high-conflict nodes). - [ ] **Session Context**: Allow queries to pass a `BranchID` to read from a fork. ### Phase 4: The Hive (Trust & Scale) *Goal: Implement reputation systems and distributed consensus.* - [ ] **TrustRank Engine**: Background "Gardener" process to calculate Agent reputation. - [ ] **Lens: Authority**: Filter results by Agent Reputation score. - [ ] **Replication**: Basic leader-follower replication for high availability. - [ ] **Garbage Collection**: Pruning logic for low-confidence/spam assertions. --- ## 🚦 Tracking ### Active Tasks * [ ] Initialize `stemedb` cargo workspace. * [ ] Define `Assertion` data structure in `stemedb-core`. ### Blockers * None. ### Decisions Pending * **Vector Engine**: `hnsw-rs` vs `lance`? (Leaning `lance` for disk-based scale, but `hnsw-rs` is simpler for MVP). * **KV Store**: `sled` vs `rocksdb`? (`sled` is pure Rust, `rocksdb` is battle-tested. Start with `sled` for dev speed, abstract via Trait).