stemedb/roadmap.md
jordan a776744889 Initial project setup with Claude Code monorepo structure
- Rust workspace with stemedb-core crate
- Full .claude/ configuration (agents, skills, commands, guides)
- ai-lookup/ for token-efficient fact storage
- Quality gates: clippy, fmt, jscpd duplication detection
- Pre-commit hook with 5-phase quality checks
- CLAUDE.md router and CODING_GUIDELINES.md standards

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 10:56:26 -07:00


Episteme (StemeDB) Roadmap

Goal: Build the "Git for Truth" substrate for autonomous AI research.

Current Phase: Phase 0 (Planning)


📅 High-Level Timeline

| Phase | Codename    | Focus               | Key Deliverable            |
|-------|-------------|---------------------|----------------------------|
| 1     | The Spine   | Storage & Safety    | Append-only WAL + KV Store |
| 2     | The Lattice | Indexing & Query    | Lens Engine + HTTP API     |
| 3     | The Cortex  | Branching & Vectors | Semantic Search + Forking  |
| 4     | The Hive    | Trust & Consensus   | TrustRank + Replication    |

🛠 Detailed Milestones

Phase 1: The Spine (Foundation)

Goal: Securely ingest assertions and persist them without data loss.

  • Project Scaffold: Initialize Rust workspace, set up linting/CI (clippy, fmt).
  • Assertion Schema: Define the Assertion struct with rkyv serialization.
  • WAL Integration: Implement quarantine-journal pattern for write-ahead logging.
  • Storage Engine: Implement the Store trait using sled (embedded KV).
  • Basic Ingestor: Background worker that tails WAL and writes to KV.
  • Verification: Write tests proving crash recovery (write -> crash -> restart -> read).
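The write -> crash -> restart -> read cycle from the Verification item can be sketched with the standard library alone. This is a minimal illustration, not the planned implementation: the `Assertion` fields, the length-prefixed record framing, the tab-separated payload, and the `Vec<u8>` standing in for the log file are all assumptions. The real schema is meant to use rkyv, and the quarantine-journal pattern is not modeled here.

```rust
use std::collections::BTreeMap;

/// Hypothetical assertion shape; the real schema is still to be defined.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    pub subject: String,
    pub predicate: String,
    pub object: String,
}

/// Append one record: a u32 little-endian length followed by the payload bytes.
pub fn wal_append(log: &mut Vec<u8>, a: &Assertion) {
    // Tab-separated payload is for illustration only (fields must not contain tabs).
    let payload = format!("{}\t{}\t{}", a.subject, a.predicate, a.object);
    log.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    log.extend_from_slice(payload.as_bytes());
}

/// Replay every complete record. A torn tail (a write cut off by a crash) is ignored.
pub fn wal_replay(log: &[u8]) -> Vec<Assertion> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos + 4 <= log.len() {
        let len =
            u32::from_le_bytes([log[pos], log[pos + 1], log[pos + 2], log[pos + 3]]) as usize;
        let start = pos + 4;
        let end = start + len;
        if end > log.len() {
            break; // incomplete record at the end of the log
        }
        let text = match std::str::from_utf8(&log[start..end]) {
            Ok(t) => t,
            Err(_) => break,
        };
        let mut parts = text.splitn(3, '\t');
        match (parts.next(), parts.next(), parts.next()) {
            (Some(s), Some(p), Some(o)) => out.push(Assertion {
                subject: s.to_string(),
                predicate: p.to_string(),
                object: o.to_string(),
            }),
            _ => break,
        }
        pos = end;
    }
    out
}

/// Rebuild KV state (keyed by "subject:predicate") from the log after a restart.
pub fn recover(log: &[u8]) -> BTreeMap<String, String> {
    let mut kv = BTreeMap::new();
    for a in wal_replay(log) {
        kv.insert(format!("{}:{}", a.subject, a.predicate), a.object);
    }
    kv
}
```

A crash-recovery test would append a few records, truncate the log partway through the last one, call `recover`, and check that only the fully written records come back.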

Phase 2: The Lattice (Connectivity)

Goal: Query data by Subject/Predicate and resolve simple conflicts.

  • Indexing: Implement Subject -> List<Hash> and S:P -> List<Hash> indexes.
  • Lens Architecture: Define the Lens trait for read-time resolution.
  • Lens: Recency: Implement "Last Writer Wins" logic (Baseline).
  • Lens: Consensus: Implement simple "Vote Count" logic.
  • API Surface: Build axum HTTP server (POST /assert, GET /query).
  • CLI: Basic CLI tool for interacting with the DB manually.
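One possible shape for the Lens trait and its two baseline implementations is sketched below. The `Candidate` struct, its `timestamp` field, the `resolve` signature, and the tie-breaking rule are assumptions made for illustration; the roadmap only specifies "Last Writer Wins" and "Vote Count" behavior.

```rust
use std::collections::HashMap;

/// Hypothetical candidate value competing for one Subject:Predicate key.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub value: String,
    pub timestamp: u64, // assumed write time (logical or wall clock)
}

/// Read-time resolution: pick one value out of conflicting candidates.
pub trait Lens {
    fn resolve(&self, candidates: &[Candidate]) -> Option<String>;
}

/// "Last Writer Wins": the candidate with the highest timestamp is returned.
pub struct Recency;

impl Lens for Recency {
    fn resolve(&self, candidates: &[Candidate]) -> Option<String> {
        candidates
            .iter()
            .max_by_key(|c| c.timestamp)
            .map(|c| c.value.clone())
    }
}

/// "Vote Count": the value asserted most often is returned.
pub struct Consensus;

impl Lens for Consensus {
    fn resolve(&self, candidates: &[Candidate]) -> Option<String> {
        let mut votes: HashMap<&str, usize> = HashMap::new();
        for c in candidates {
            *votes.entry(c.value.as_str()).or_insert(0) += 1;
        }
        // On equal counts the lexicographically smallest value wins,
        // so the result does not depend on HashMap iteration order.
        votes
            .into_iter()
            .max_by(|a, b| a.1.cmp(&b.1).then_with(|| b.0.cmp(a.0)))
            .map(|(value, _)| value.to_string())
    }
}
```

Because resolution happens at read time, the same stored candidates can yield different answers depending on which lens a query selects, and storage never has to discard a conflicting assertion.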

Phase 3: The Cortex (Reasoning)

Goal: Enable semantic search and "What If" scenarios.

  • Branching Core: Implement Overlay Graph logic for "Forking Reality."
  • Vector Storage: Integrate hnsw-rs or similar for embedding storage.
  • Semantic Search: Implement k-NN query support in the API.
  • Lens: Skeptic: Implement variance analysis (finding high-conflict nodes).
  • Session Context: Allow queries to pass a BranchID to read from a fork.
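The Branching Core and Session Context items could fit together roughly as follows. This is a sketch under stated assumptions: `BranchId`, `OverlayStore`, and all method names are hypothetical, forks are single-level (every fork overlays the base directly), and deletions inside a branch are not modeled.

```rust
use std::collections::HashMap;

/// Hypothetical branch identifier; the roadmap only names a BranchID.
pub type BranchId = u32;

/// Base data plus one copy-on-write overlay per fork.
#[derive(Default)]
pub struct OverlayStore {
    base: HashMap<String, String>,
    overlays: HashMap<BranchId, HashMap<String, String>>,
    next_id: BranchId,
}

impl OverlayStore {
    /// Write to the shared base that every branch can see.
    pub fn put_base(&mut self, key: &str, value: &str) {
        self.base.insert(key.to_string(), value.to_string());
    }

    /// Create an empty overlay. Nothing is copied from the base.
    pub fn fork(&mut self) -> BranchId {
        let id = self.next_id;
        self.next_id += 1;
        self.overlays.insert(id, HashMap::new());
        id
    }

    /// Write only into the branch overlay. Returns false for an unknown branch.
    pub fn put_branch(&mut self, branch: BranchId, key: &str, value: &str) -> bool {
        match self.overlays.get_mut(&branch) {
            Some(overlay) => {
                overlay.insert(key.to_string(), value.to_string());
                true
            }
            None => false,
        }
    }

    /// Read through the overlay first, then fall back to the base.
    /// Passing `None` reads the base directly.
    pub fn get(&self, branch: Option<BranchId>, key: &str) -> Option<&str> {
        if let Some(id) = branch {
            if let Some(v) = self.overlays.get(&id).and_then(|o| o.get(key)) {
                return Some(v.as_str());
            }
        }
        self.base.get(key).map(String::as_str)
    }
}
```

A "What If" query then amounts to forking, writing the hypothetical assertion into the fork, and passing the fork's id on reads, while queries without an id keep seeing the unmodified base.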

Phase 4: The Hive (Trust & Scale)

Goal: Implement reputation systems and distributed consensus.

  • TrustRank Engine: Background "Gardener" process to calculate Agent reputation.
  • Lens: Authority: Filter results by Agent Reputation score.
  • Replication: Basic leader-follower replication for high availability.
  • Garbage Collection: Pruning logic for low-confidence/spam assertions.
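The Authority lens could reduce to a threshold filter over precomputed reputation scores, as in the sketch below. The `AgentAssertion` type, the `f64` score, and the rule that unknown agents score 0.0 are assumptions; how the Gardener actually computes TrustRank is not specified here and is not modeled.

```rust
use std::collections::HashMap;

/// Hypothetical pairing of an assertion value with the agent that made it.
#[derive(Debug, Clone, PartialEq)]
pub struct AgentAssertion {
    pub agent: String,
    pub value: String,
}

/// Keep only assertions whose agent has a reputation of at least `min_score`.
/// Agents missing from the reputation table are treated as having score 0.0.
pub fn filter_by_authority<'a>(
    assertions: &'a [AgentAssertion],
    reputation: &HashMap<String, f64>,
    min_score: f64,
) -> Vec<&'a AgentAssertion> {
    assertions
        .iter()
        .filter(|a| reputation.get(&a.agent).copied().unwrap_or(0.0) >= min_score)
        .collect()
}
```

The same score table could also drive the Garbage Collection item, with pruning applied to assertions that fall below a lower threshold.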

🚦 Tracking

Active Tasks

  • Initialize stemedb cargo workspace.
  • Define Assertion data structure in stemedb-core.

Blockers

  • None.

Decisions Pending

  • Vector Engine: hnsw-rs vs lance? (Leaning lance for disk-based scale, but hnsw-rs is simpler for MVP).
  • KV Store: sled vs rocksdb? (sled is pure Rust, rocksdb is battle-tested. Start with sled for dev speed, abstract via Trait).
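Abstracting the KV backend behind a trait might look like the sketch below, which is what would let sled be swapped for rocksdb later. The method names, the byte-slice signatures, and the `MemStore` test backend are assumptions. Real backends are fallible, so a production version of the trait would likely return `Result` rather than bare values.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal storage interface; callers never name a concrete backend.
pub trait Store {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}

/// In-memory backend, useful for tests and as a stand-in until a real engine is chosen.
#[derive(Default)]
pub struct MemStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl Store for MemStore {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }
}

/// Example caller written only against the trait, using a "subject:predicate" key.
pub fn record<S: Store>(store: &mut S, subject: &str, predicate: &str, object: &str) {
    let key = format!("{}:{}", subject, predicate);
    store.put(key.as_bytes(), object.as_bytes());
}
```

With this split, choosing between sled and rocksdb becomes a matter of adding another `impl Store`, and the ingestor and query code stay unchanged.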