stemedb/roadmap.md
jordan a776744889 Initial project setup with Claude Code monorepo structure
- Rust workspace with stemedb-core crate
- Full .claude/ configuration (agents, skills, commands, guides)
- ai-lookup/ for token-efficient fact storage
- Quality gates: clippy, fmt, jscpd duplication detection
- Pre-commit hook with 5-phase quality checks
- CLAUDE.md router and CODING_GUIDELINES.md standards

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 10:56:26 -07:00

# Episteme (StemeDB) Roadmap
> **Goal:** Build the "Git for Truth" substrate for autonomous AI research.
> **Current Phase:** Phase 0 (Planning)
---
## 📅 High-Level Timeline
| Phase | Codename | Focus | Key Deliverable |
| :--- | :--- | :--- | :--- |
| **1** | **The Spine** | Storage & Safety | Append-only WAL + KV Store |
| **2** | **The Lattice** | Indexing & Query | Lens Engine + HTTP API |
| **3** | **The Cortex** | Branching & Vectors | Semantic Search + Forking |
| **4** | **The Hive** | Trust & Consensus | TrustRank + Replication |
---
## 🛠 Detailed Milestones
### Phase 1: The Spine (Foundation)
*Goal: Securely ingest assertions and persist them without data loss.*
- [ ] **Project Scaffold**: Initialize Rust workspace, set up linting/CI (clippy, fmt).
- [ ] **Assertion Schema**: Define the `Assertion` struct with `rkyv` serialization.
- [ ] **WAL Integration**: Implement `quarantine-journal` pattern for write-ahead logging.
- [ ] **Storage Engine**: Implement the `Store` trait using `sled` (embedded KV).
- [ ] **Basic Ingestor**: Background worker that tails WAL and writes to KV.
- [ ] **Verification**: Write tests proving crash recovery (write -> crash -> restart -> read).
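
The Phase 1 write path and its crash-recovery check can be sketched with the standard library alone. Everything here is an assumption made for illustration: the three-field `Assertion`, the `put`/`get` shape of `Store`, the `MemStore` stand-in for `sled`, the length-prefixed record framing, and the `wal_append`/`wal_replay` helpers. A `Vec<u8>` stands in for the WAL file, and plain text framing stands in for `rkyv`.

```rust
use std::collections::HashMap;

/// Hypothetical assertion shape; the real schema is still to be defined.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    pub subject: String,
    pub predicate: String,
    pub object: String,
}

/// Hypothetical storage abstraction that `sled` or another KV engine could implement.
pub trait Store {
    fn put(&mut self, key: String, value: Assertion);
    fn get(&self, key: &str) -> Option<&Assertion>;
}

/// In-memory stand-in for the embedded KV store.
#[derive(Default)]
pub struct MemStore {
    map: HashMap<String, Assertion>,
}

impl Store for MemStore {
    fn put(&mut self, key: String, value: Assertion) {
        self.map.insert(key, value);
    }
    fn get(&self, key: &str) -> Option<&Assertion> {
        self.map.get(key)
    }
}

/// Appends one record: a u32 little-endian length, then the three fields
/// joined by '\n'. Assumes no field contains a newline.
pub fn wal_append(log: &mut Vec<u8>, a: &Assertion) {
    let body = format!("{}\n{}\n{}", a.subject, a.predicate, a.object);
    log.extend_from_slice(&(body.len() as u32).to_le_bytes());
    log.extend_from_slice(body.as_bytes());
}

/// Replays every complete record into `store` and returns how many were applied.
/// Stops at the first truncated or malformed record (a torn write from a crash).
pub fn wal_replay<S: Store>(log: &[u8], store: &mut S) -> usize {
    let mut pos = 0;
    let mut applied = 0;
    while pos + 4 <= log.len() {
        let mut len_bytes = [0u8; 4];
        len_bytes.copy_from_slice(&log[pos..pos + 4]);
        let end = pos + 4 + u32::from_le_bytes(len_bytes) as usize;
        if end > log.len() {
            break; // record body was only partially written
        }
        let body = match std::str::from_utf8(&log[pos + 4..end]) {
            Ok(text) => text,
            Err(_) => break,
        };
        let fields: Vec<&str> = body.splitn(3, '\n').collect();
        if fields.len() != 3 {
            break;
        }
        let assertion = Assertion {
            subject: fields[0].to_string(),
            predicate: fields[1].to_string(),
            object: fields[2].to_string(),
        };
        store.put(format!("{}:{}", fields[0], fields[1]), assertion);
        applied += 1;
        pos = end;
    }
    applied
}
```

Truncating the log by a few bytes before calling `wal_replay` simulates the write -> crash -> restart -> read sequence: complete records are recovered and the torn tail is ignored.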
### Phase 2: The Lattice (Connectivity)
*Goal: Query data by Subject/Predicate and resolve simple conflicts.*
- [ ] **Indexing**: Implement `Subject -> List<Hash>` and `Subject:Predicate -> List<Hash>` indexes.
- [ ] **Lens Architecture**: Define the `Lens` trait for read-time resolution.
- [ ] **Lens: Recency**: Implement "Last Writer Wins" logic (Baseline).
- [ ] **Lens: Consensus**: Implement simple "Vote Count" logic.
- [ ] **API Surface**: Build `axum` HTTP server (`POST /assert`, `GET /query`).
- [ ] **CLI**: Basic CLI tool for interacting with the DB manually.
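
One possible shape for read-time resolution is sketched below. The `Claim` struct, the `resolve` signature, and the tie-breaking rule are assumptions for illustration, not the project's settled design. `Recency` returns the value with the highest timestamp, and `Consensus` counts one vote per claim.

```rust
use std::collections::HashMap;

/// Hypothetical candidate value for a single subject/predicate pair.
#[derive(Debug, Clone, PartialEq)]
pub struct Claim {
    pub value: String,
    pub agent: String,
    pub timestamp: u64,
}

/// Hypothetical lens trait: picks one winning value from competing claims.
pub trait Lens {
    fn resolve<'a>(&self, claims: &'a [Claim]) -> Option<&'a str>;
}

/// "Last Writer Wins": the claim with the highest timestamp.
pub struct Recency;

impl Lens for Recency {
    fn resolve<'a>(&self, claims: &'a [Claim]) -> Option<&'a str> {
        claims
            .iter()
            .max_by_key(|c| c.timestamp)
            .map(|c| c.value.as_str())
    }
}

/// "Vote Count": the value asserted by the most claims.
/// Ties go to the lexicographically smallest value so results are deterministic.
pub struct Consensus;

impl Lens for Consensus {
    fn resolve<'a>(&self, claims: &'a [Claim]) -> Option<&'a str> {
        let mut votes: HashMap<&'a str, usize> = HashMap::new();
        for c in claims {
            *votes.entry(c.value.as_str()).or_insert(0) += 1;
        }
        votes
            .into_iter()
            .max_by(|a, b| a.1.cmp(&b.1).then_with(|| b.0.cmp(a.0)))
            .map(|(value, _)| value)
    }
}
```

Because both lenses implement the same trait, a query handler could accept a `&dyn Lens` and let the caller choose the resolution strategy per request.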
### Phase 3: The Cortex (Reasoning)
*Goal: Enable semantic search and "What If" scenarios.*
- [ ] **Branching Core**: Implement Overlay Graph logic for "Forking Reality."
- [ ] **Vector Storage**: Integrate `hnsw-rs` or similar for embedding storage.
- [ ] **Semantic Search**: Implement k-NN query support in the API.
- [ ] **Lens: Skeptic**: Implement variance analysis (finding high-conflict nodes).
- [ ] **Session Context**: Allow queries to pass a `BranchID` to read from a fork.
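
The branching and session-context items could work roughly like the overlay below: a fork stores only its own changes, and a read checks the fork before falling back to the shared base. This is a minimal sketch under assumed names (`OverlayStore`, `BranchId` for the roadmap's `BranchID`, string keys and values); the actual Overlay Graph design is not specified here.

```rust
use std::collections::HashMap;

/// Hypothetical branch identifier, corresponding to the roadmap's `BranchID`.
pub type BranchId = u32;

/// Base data plus per-branch overrides. A `None` override marks a key
/// as retracted on that branch without touching the base.
#[derive(Default)]
pub struct OverlayStore {
    base: HashMap<String, String>,
    branches: HashMap<BranchId, HashMap<String, Option<String>>>,
}

impl OverlayStore {
    /// Writes to the shared base, visible to every branch that has no override.
    pub fn put_base(&mut self, key: &str, value: &str) {
        self.base.insert(key.to_string(), value.to_string());
    }

    /// Writes to one branch only.
    pub fn put_branch(&mut self, branch: BranchId, key: &str, value: &str) {
        self.branches
            .entry(branch)
            .or_default()
            .insert(key.to_string(), Some(value.to_string()));
    }

    /// Hides a key on one branch only.
    pub fn retract_branch(&mut self, branch: BranchId, key: &str) {
        self.branches
            .entry(branch)
            .or_default()
            .insert(key.to_string(), None);
    }

    /// Reads through the branch overlay first, then falls back to the base.
    /// Passing `None` reads the base directly.
    pub fn get(&self, branch: Option<BranchId>, key: &str) -> Option<&str> {
        if let Some(id) = branch {
            if let Some(entry) = self.branches.get(&id).and_then(|b| b.get(key)) {
                return entry.as_deref();
            }
        }
        self.base.get(key).map(String::as_str)
    }
}
```

Creating a fork costs nothing until it is written to, which suits short-lived "What If" sessions; the trade-off is that every branch read pays for one extra lookup.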
### Phase 4: The Hive (Trust & Scale)
*Goal: Implement reputation systems and distributed consensus.*
- [ ] **TrustRank Engine**: Background "Gardener" process to calculate Agent reputation.
- [ ] **Lens: Authority**: Filter results by Agent Reputation score.
- [ ] **Replication**: Basic leader-follower replication for high availability.
- [ ] **Garbage Collection**: Pruning logic for low-confidence/spam assertions.
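
As a rough illustration of the Authority lens, the sketch below keeps only claims whose agent meets a minimum reputation score. The `Claim` struct, the `filter_by_authority` function, the `f64` score scale, and the choice to treat unknown agents as score `0.0` are all assumptions; how the "Gardener" process computes the scores is out of scope here.

```rust
use std::collections::HashMap;

/// Hypothetical claim carrying the agent that asserted it.
#[derive(Debug, Clone, PartialEq)]
pub struct Claim {
    pub value: String,
    pub agent: String,
}

/// Keeps claims whose agent has a reputation of at least `min_score`.
/// Agents with no recorded reputation are treated as 0.0 (an assumption).
pub fn filter_by_authority<'a>(
    claims: &'a [Claim],
    reputation: &HashMap<String, f64>,
    min_score: f64,
) -> Vec<&'a Claim> {
    claims
        .iter()
        .filter(|c| reputation.get(&c.agent).copied().unwrap_or(0.0) >= min_score)
        .collect()
}
```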
---
## 🚦 Tracking
### Active Tasks
* [ ] Initialize `stemedb` cargo workspace.
* [ ] Define `Assertion` data structure in `stemedb-core`.
### Blockers
* None.
### Decisions Pending
* **Vector Engine**: `hnsw-rs` vs `lance`? (Leaning `lance` for disk-based scale, but `hnsw-rs` is simpler for MVP).
* **KV Store**: `sled` vs `rocksdb`? (`sled` is pure Rust; `rocksdb` is battle-tested. Start with `sled` for dev speed and abstract it behind the `Store` trait so the backend can be swapped later.)