stemedb/roadmap-archive.md
jordan 157dbbb9eb feat: Complete Aphoria Phase 8-9 + UAT suite (90/90 tests passing)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 22:50:55 -07:00

Episteme (StemeDB) Roadmap Archive

Purpose: Historical record of completed phases. For current work, see roadmap.md.

Last Updated: 2026-02-05


Completed Phases Summary

| Phase | Codename | Status | Completion |
| --- | --- | --- | --- |
| 1 | The Spine | Complete | Storage & Safety: WAL + KV Store |
| 2 | The Lattice | Complete | Indexing & Async: MVs + Ballot Box |
| 2.5 | Hardening | Complete | MV staleness, epoch behavior, lens cleanup |
| 3 | The Pilot | Complete | Vertical Integration: Pharma Ingestion |
| 4 | The Hive | Complete | Trust & Learning: TrustRank, metadata indexing |
| 5 | The Forge | Complete | Foundation Hardening: redb/fjall, WAL, indices |
| 6 | The Mesh | Complete | Distributed Writes: CRDT, Raft, clustering |
| 7 | The Shield | Complete | Trust at Scale: EigenTrust, PoW, quarantine |
| 8A | Chaos | Complete | Partition testing, Jepsen-style verification |
| MVP | Consumer Health | Complete | Real FDA data → conflicts detected → demo |

Phase 1: The Spine (Foundation)

Goal: Securely ingest assertions and persist them without data loss.

  • Project Scaffold: Initialize Rust workspace, set up linting/CI (clippy, fmt).
  • Assertion Schema: Define the Assertion struct with rkyv serialization.
    • Add dependencies: rkyv, blake3, ed25519-dalek, image_hasher.
    • Define Assertion struct (Subject, Predicate, Object, Confidence, SourceHash).
    • Multi-Sig Expansion: Implement SignatureEntry struct and signatures: Vec<SignatureEntry> field.
    • Visual Expansion: Add visual_hash: Option<pHash> field for image provenance.
    • Test serialization round-trips.
  • Ballot Schema: Define the Vote struct for multi-agent consensus.
    • Add Vote struct: assertion_hash, agent_id, weight, signature.
    • Test serialization round-trips.
  • Paradigm Schema (Epochs): Define the Epoch and SupersessionType structs.
    • Add epoch: Option<EpochId> to Assertion.
    • Implement Epoch struct with supersedes and SupersessionType.
    • Test serialization round-trips.
  • WAL Integration: Implement the Quarantine Pattern for write-ahead logging.
    • Create stemedb-wal crate.
    • Port FsyncGuard and Record logic from established durability patterns.
    • Implement Record format with BLAKE3 checksums and Headers.
    • Verify fsync behavior with tests.
  • Storage Engine: Implement the KVStore trait using sled (embedded KV).
    • Add sled dependency.
    • Define KVStore trait (put, get, delete, scan_prefix, flush).
    • Implement SledStore wrapper.
  • Basic Ingestor: Background worker that tails WAL and writes to KV.
    • Implement async loop reading from WAL.
    • Write deserialized assertions, votes, and epochs to sled.
    • Ed25519 signature verification during ingestion.
    • Maintains S: and SP: indexes on ingest.
    • Persistent cursor/checkpoint (resumes from __CURSOR__:ingest in KV store).
  • Verification: Crash recovery tests (write -> crash -> restart -> read).
    • Single and multi-record crash recovery.
    • Multiple crash cycles tested.
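
The KVStore surface listed above (put, get, delete, scan_prefix, flush) and the persistent ingest cursor can be sketched with an in-memory map. This is an illustration only, not the SledStore wrapper: the MemStore type, the byte-slice signatures, the infallible return types, and the big-endian cursor encoding are all assumptions. Only the method names and the __CURSOR__:ingest key come from the roadmap.

```rust
use std::collections::BTreeMap;

/// Method names follow the roadmap; the signatures are assumed for illustration.
pub trait KVStore {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn delete(&mut self, key: &[u8]);
    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
    fn flush(&mut self);
}

/// Hypothetical in-memory stand-in for a sled-backed store.
#[derive(Default)]
pub struct MemStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KVStore for MemStore {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }
    fn delete(&mut self, key: &[u8]) {
        self.map.remove(key);
    }
    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        // Keys are ordered, so start at the prefix and stop at the first non-match.
        self.map
            .range(prefix.to_vec()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
    fn flush(&mut self) {
        // Nothing to sync for an in-memory map; a real store would fsync here.
    }
}

/// Cursor key named in the roadmap.
pub const INGEST_CURSOR_KEY: &[u8] = b"__CURSOR__:ingest";

/// Persist the WAL offset the ingestor has processed (encoding is an assumption).
pub fn save_cursor<S: KVStore>(store: &mut S, offset: u64) {
    store.put(INGEST_CURSOR_KEY, &offset.to_be_bytes());
    store.flush();
}

/// Load the saved offset, defaulting to 0 when no valid cursor exists.
pub fn load_cursor<S: KVStore>(store: &S) -> u64 {
    match store.get(INGEST_CURSOR_KEY) {
        Some(v) if v.len() == 8 => {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(&v);
            u64::from_be_bytes(buf)
        }
        _ => 0,
    }
}
```

On restart, an ingestor built this way would call load_cursor and resume tailing the WAL from that offset.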

Phase 2: The Lattice (Connectivity)

Goal: Query data with sub-millisecond latency using Materialized Views.

  • Lifecycle Schema: Add LifecycleStage to Assertion.
    • Define enum: Proposed, UnderReview, Approved, Deprecated, Rejected.
    • Update Assertion struct and serialization tests.
  • The Ballot Box: Implement high-velocity vote ingestion.
    • VoteStore trait and implementation.
    • VoteAwareConsensusLens for real vote-based resolution.
  • Index Infrastructure: Compound indexes for O(1) queries.
    • IndexStore trait with S: and SP: indexes.
    • QueryEngine smart routing (SP -> S -> scan).
  • Materializer: Background worker for O(1) Read Performance.
    • MaterializedView type in stemedb-core.
    • Materializer worker in stemedb-query with step() and run().
    • Aggregates Votes via VoteAwareConsensusLens (or any AsyncLens).
    • Updates MV:{Subject}:{Predicate} with the winning Assertion + metadata.
    • Event-driven mode via run_notified() with tokio::sync::Notify.
    • Fast-path MV lookup in QueryEngine::try_fast_path().
  • The Meter: Implement Economic Throttling (TAN).
    • QuotaStore trait and GenericQuotaStore implementation.
    • Token Bucket algorithm with per-agent per-hour quotas.
    • MeterLayer tower middleware for request cost tracking.
    • Cost model: Assert=10, Vote=1, Query=5+lens, +1/KB payload.
    • GET /v1/meter/quota endpoint to check remaining quota.
    • POST /v1/meter/quota/limit admin endpoint to set custom limits.
  • API Surface: axum HTTP server with OpenAPI (utoipa).
    • POST /v1/assert -> Accepts JSON, writes to WAL.
    • POST /v1/vote -> High-throughput vote endpoint.
    • POST /v1/epoch -> Create epoch with optional supersession.
    • GET /v1/query -> Subject/Predicate/Lens/Lifecycle/Epoch filtering.
    • GET /v1/health -> Health check with assertion count.
    • GET /swagger-ui -> Interactive API docs.
    • 5 lens types available: Recency, Consensus, Authority, VoteAwareConsensus, TrustAwareAuthority.
  • Query Audit: Log every read with provenance.
    • Define QueryAudit struct: query_id, agent_id, timestamp, params, result_hash, contributing_assertions.
    • Storage at AUD:{query_id} with agent index at AUDA:{agent_id}:{timestamp}:{query_id}.
    • GET /v1/audit/queries -> Returns history of agent decisions.
    • GET /v1/audit/query/{id} -> Full reasoning trace for a single query.
    • Auto-logging on every query via X-Agent-Id header.
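
The Meter's cost model and token bucket can be sketched as follows. The base costs (Assert=10, Vote=1, Query=5+lens, +1/KB) come from the list above. Everything else is an assumption made for illustration: the Op and TokenBucket names, charging the payload surcharge per started KiB, passing the lens surcharge in from the caller, and refilling the hourly quota continuously rather than at hour boundaries.

```rust
/// Operation kinds priced by the meter (hypothetical type).
#[derive(Clone, Copy)]
pub enum Op {
    Assert,
    Vote,
    /// The per-lens surcharge is not given in the roadmap, so the caller supplies it.
    Query { lens_cost: u64 },
}

/// Cost of one request: base cost plus 1 unit per started KiB of payload (assumed rounding).
pub fn request_cost(op: Op, payload_bytes: u64) -> u64 {
    let base = match op {
        Op::Assert => 10,
        Op::Vote => 1,
        Op::Query { lens_cost } => 5 + lens_cost,
    };
    base + (payload_bytes + 1023) / 1024
}

/// Per-agent token bucket sized to an hourly quota.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    last_refill_secs: f64,
}

impl TokenBucket {
    /// Starts full; `now_secs` is any monotonic clock reading in seconds.
    pub fn new(hourly_quota: u64, now_secs: f64) -> Self {
        let capacity = hourly_quota as f64;
        TokenBucket { capacity, tokens: capacity, last_refill_secs: now_secs }
    }

    /// Refills in proportion to elapsed time, then spends `cost` if enough tokens remain.
    pub fn try_spend(&mut self, cost: u64, now_secs: f64) -> bool {
        let elapsed = (now_secs - self.last_refill_secs).max(0.0);
        self.tokens = (self.tokens + elapsed * self.capacity / 3600.0).min(self.capacity);
        self.last_refill_secs = now_secs;
        let cost = cost as f64;
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}
```

A middleware in the style of MeterLayer would compute request_cost per call and reject the request when try_spend returns false.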

Phase 2.5: Hardening

Goal: Close the gaps between "built" and "works right."

  • 2.1 MV Staleness Detection: max_stale parameter on queries.
  • 2.2 AuthorityLens -> ConfidenceLens Rename: Eliminated misleading name.
  • 2.3 EpochAwareLens: Epoch supersession runtime behavior with cycle detection.
  • 2.4 Visual Hash Query Support: Hamming distance queries on visual_hash.
  • 2.5 Vector Field: vector: Option<Vec<f32>> stored on assertions.
  • 2.6 E2E Integration Test: Full pipeline validation (Write -> Materialize -> Read).
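
The Hamming-distance query from item 2.4 reduces to counting differing bits between two hashes. The sketch below assumes the visual hash fits in a u64, which the roadmap does not state; the function names are hypothetical.

```rust
/// Number of bit positions in which two 64-bit perceptual hashes differ.
pub fn hamming_distance(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

/// Linear scan returning every candidate within `max_distance` bits of `query`.
pub fn within_distance(candidates: &[u64], query: u64, max_distance: u32) -> Vec<u64> {
    candidates
        .iter()
        .copied()
        .filter(|&c| hamming_distance(c, query) <= max_distance)
        .collect()
}
```

A linear scan like this is the simplest correct baseline; Phase 3E.2 replaces it with a BK-tree index.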

Phase 3: The Pilot (BioTech/Pharma)

Goal: Prove value in the "High-Liability" beachhead.

3A. Schema Expansion

  • 3A.1 Source-Class Field: 6-tier SourceClass enum (Regulatory → Anecdotal).
  • 3A.2 Conflict Score on Resolution: Normalized variance-based conflict metric.
  • 3A.3 Rich Source Metadata: source_metadata: Option<Vec<u8>> for JSON provenance.

3B. Time & Decay

  • 3B.1 Time-Travel Engine: as_of parameter for historical queries.
  • 3B.2 Semantic Decay: Confidence half-life with tier-specific rates.
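
Half-life decay as described in 3B.2 can be written in one line. This sketch assumes plain exponential decay and takes the half-life as a parameter, because the roadmap does not list the tier-specific rates; the function name is hypothetical.

```rust
/// Effective confidence after `age_days`, halving every `half_life_days`.
/// The tier-specific half-life is supplied by the caller (values are not in the roadmap).
pub fn effective_confidence(confidence: f64, age_days: f64, half_life_days: f64) -> f64 {
    confidence * 0.5_f64.powf(age_days / half_life_days)
}
```

With a short half-life for low tiers and a long one for regulatory sources, an old anecdotal claim ends up below a fresh regulatory one even when both start at the same stored confidence.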

3C. New Lenses

  • 3C.1 Skeptic Lens: Surface disagreement via Shannon entropy conflict scoring.
  • 3C.2 Layered Consensus Lens: Per-source-class consensus with tier visibility.
  • 3C.3 Constraints Lens: Pre-flight check for must_use/forbidden/prefer.
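
One way to turn Shannon entropy into a conflict score, as the Skeptic Lens item describes, is to normalize the entropy of the competing values by its maximum. The roadmap does not give the actual formula, so the normalization, the unweighted counting, and the function name below are assumptions.

```rust
use std::collections::HashMap;

/// Normalized Shannon entropy over the asserted values for one subject/predicate.
/// Returns 0.0 when all sources agree and 1.0 when the values are evenly split.
pub fn conflict_score(values: &[&str]) -> f64 {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for v in values {
        *counts.entry(*v).or_insert(0) += 1;
    }
    if counts.len() < 2 {
        return 0.0; // zero or one distinct value: no disagreement
    }
    let n = values.len() as f64;
    let entropy: f64 = counts
        .values()
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum();
    // Divide by the maximum possible entropy for this many distinct values.
    entropy / (counts.len() as f64).log2()
}
```

A production lens would likely weight each value by confidence or source tier rather than counting assertions equally.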

3D. Epoch Enhancement

  • 3D.1 Epoch Cascade Logic: O(1) supersession lookup via pre-computed markers.
  • 3E.1 Vector Search: HNSW-based semantic k-NN queries.
  • 3E.2 Visual Hash Index: BK-tree for O(log N) visual similarity.
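
A BK-tree for visual similarity, as named in 3E.2, can be sketched over 64-bit hashes with Hamming distance as the metric. The u64 hash width and all type and method names are assumptions for illustration. The search uses the triangle inequality to skip subtrees whose edge distance falls outside [d - max, d + max].

```rust
use std::collections::HashMap;

fn hamming(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

/// One node: a stored hash plus children keyed by their distance to it.
pub struct BkNode {
    hash: u64,
    children: HashMap<u32, BkNode>,
}

impl BkNode {
    fn new(hash: u64) -> Self {
        BkNode { hash, children: HashMap::new() }
    }

    fn insert(&mut self, hash: u64) {
        let d = hamming(self.hash, hash);
        if d == 0 {
            return; // already present
        }
        match self.children.get_mut(&d) {
            Some(child) => child.insert(hash),
            None => {
                self.children.insert(d, BkNode::new(hash));
            }
        }
    }

    fn search(&self, query: u64, max: u32, out: &mut Vec<u64>) {
        let d = hamming(self.hash, query);
        if d <= max {
            out.push(self.hash);
        }
        // Only subtrees whose edge lies within [d - max, d + max] can hold matches.
        let lo = d.saturating_sub(max);
        let hi = d + max;
        for (edge, child) in &self.children {
            if *edge >= lo && *edge <= hi {
                child.search(query, max, out);
            }
        }
    }
}

/// Hypothetical index wrapper around the root node.
#[derive(Default)]
pub struct BkTree {
    root: Option<BkNode>,
}

impl BkTree {
    pub fn insert(&mut self, hash: u64) {
        match self.root.as_mut() {
            Some(root) => root.insert(hash),
            None => self.root = Some(BkNode::new(hash)),
        }
    }

    /// All stored hashes within `max` bits of `query`, sorted ascending.
    pub fn search(&self, query: u64, max: u32) -> Vec<u64> {
        let mut out = Vec::new();
        if let Some(root) = &self.root {
            root.search(query, max, &mut out);
        }
        out.sort_unstable();
        out
    }
}
```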

3F. Provenance

  • 3F.1 Source Document Storage: Content-addressed source storage with GET /v1/provenance/{hash}.

3G. API Cleanup

  • 3G.1 Document epoch supersession: Updated docs for POST /v1/epoch with supersedes field.

Phase 4: The Hive (Trust & Scale)

Goal: Change tracking, metadata indexing, and training pipeline primitives.

  • TrustRank Engine: Per-agent reputation with decay and learning loop.
  • 4.1 "Since" Parameter: MV changelog at MVC: keys with changes_since in responses.
  • 4.2 Source Metadata Indexing: Indexed fields (journal, doi, platform, study_design) at SMV:.
  • 4.3 Batch TrustRank Decay API: POST /v1/admin/decay-trust-ranks.
  • 4.4 Vote Provenance Witness: source_url and observed_context on votes.
  • 4.5 Conflict Score Filtering: min_conflict_score/max_conflict_score on queries.
  • 4.6 Escalation Triggers: EscalationPolicy fires events on high-conflict assertions.
  • 4.7 Gold Standard Verification: Admin-verified assertions for agent testing.

Phase 5: The Forge (Foundation Hardening)

Goal: Replace abandoned dependencies, fix WAL gaps, persist indices.

5A. Storage Engine Replacement

  • 5A.1 Replace sled with redb + fjall: HybridStore with prefix-based routing.
  • 5A.2 Key Layout Redesign: Subject-prefix keys for range sharding readiness.

5B. WAL Hardening

  • 5B.1 CRC32C Checksums: Hardware-accelerated torn write detection.
  • 5B.2 Crash Recovery Implementation: Sequential scan with truncation.
  • 5B.3 Group Commit: Batch fsync for throughput.
  • 5B.4 Log Rotation: Segment management with safe deletion.
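
Torn-write detection (5B.1) and sequential-scan recovery with truncation (5B.2) fit together roughly as below. The record layout (4-byte little-endian length, 4-byte CRC32C, payload) and the function names are assumptions; the real stemedb-wal format is not described in this archive. The checksum here is a plain bitwise CRC32C (Castagnoli polynomial, reflected form 0x82F63B78), not the hardware-accelerated path the roadmap mentions.

```rust
/// Bitwise CRC32C (Castagnoli). Slow but dependency-free; for illustration only.
pub fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
        }
    }
    !crc
}

/// Append one record using the assumed layout: [len u32 LE][crc32c u32 LE][payload].
pub fn append_record(log: &mut Vec<u8>, payload: &[u8]) {
    log.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    log.extend_from_slice(&crc32c(payload).to_le_bytes());
    log.extend_from_slice(payload);
}

/// Scan from the start and stop at the first incomplete or corrupt record.
/// Returns the valid payloads and the byte length of the valid prefix.
pub fn recover(log: &[u8]) -> (Vec<Vec<u8>>, usize) {
    let mut records = Vec::new();
    let mut pos = 0usize;
    while log.len() - pos >= 8 {
        let len = u32::from_le_bytes([log[pos], log[pos + 1], log[pos + 2], log[pos + 3]]) as usize;
        let stored = u32::from_le_bytes([log[pos + 4], log[pos + 5], log[pos + 6], log[pos + 7]]);
        let start = pos + 8;
        let end = start + len;
        if end > log.len() || crc32c(&log[start..end]) != stored {
            break; // torn or corrupt tail
        }
        records.push(log[start..end].to_vec());
        pos = end;
    }
    (records, pos)
}
```

After recover returns, the caller truncates the log to the reported prefix length so later appends start on a clean record boundary.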

5C. Index Persistence

  • 5C.1 Persistent Vector Index: Hot/cold HNSW with checkpoint files.
  • 5C.2 Persistent Visual Index: BK-tree snapshots with CRC32C verification.

5D. Concept Hierarchy

  • 5D.1 ConceptPath Type: Scheme-qualified subject identifiers.
  • 5D.2 Source Scheme Registry: Scheme → default source tier mapping.
  • 5D.3 Alias Store: Cross-scheme entity resolution with cycle detection.
  • 5D.4 Hierarchical Query: Prefix-based subject queries.
  • 5D.5 Alias Resolution in Queries: GET /v1/concepts/resolve?path=....
  • 5D.6 Source Class Inference: Tier inference from scheme.
  • 5D.7 Concept API Endpoints: Full CRUD for aliases and hierarchy.
  • 5D.8 Battery Tests: 15 tests across Battery 8 and 9.
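
Alias resolution with cycle detection (5D.3) can be sketched as a map from alias to target, with a walk before each insert that rejects edges closing a loop. The AliasStore layout, the method names, and the string-typed paths are assumptions; the real store works on ConceptPath values.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical alias table: each alias points at exactly one target path.
#[derive(Default)]
pub struct AliasStore {
    aliases: HashMap<String, String>,
}

impl AliasStore {
    /// Registers `alias -> target`, refusing any edge that would create a cycle.
    pub fn add_alias(&mut self, alias: &str, target: &str) -> Result<(), String> {
        // Follow the chain starting at `target`; reaching `alias` means a loop.
        let mut current = target;
        let mut seen = HashSet::new();
        loop {
            if current == alias {
                return Err(format!("alias cycle through {}", current));
            }
            if !seen.insert(current) {
                break; // defensive guard; cannot trigger while the table stays acyclic
            }
            match self.aliases.get(current) {
                Some(next) => current = next.as_str(),
                None => break,
            }
        }
        self.aliases.insert(alias.to_string(), target.to_string());
        Ok(())
    }

    /// Follows aliases until reaching a path that is not itself an alias.
    pub fn resolve(&self, path: &str) -> String {
        let mut current = path;
        while let Some(next) = self.aliases.get(current) {
            current = next.as_str();
        }
        current.to_string()
    }
}
```

Because add_alias keeps the table acyclic, resolve always terminates without needing its own visited set.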

Phase 6: The Mesh (Distributed Writes)

Goal: Multi-node cluster with CRDT replication and Raft coordination.

6A. CRDT Foundation

  • 6A.1 Integrate CRDT Crate: G-Set for assertions, G-Counter for votes.
  • 6A.2 Hybrid Logical Clocks: HLC timestamps for causal ordering.
  • 6A.3 Merkle Tree Over Assertions: BLAKE3-based diff detection.
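
The two CRDTs named in 6A.1 have simple merge rules: a G-Set merges by set union, and a G-Counter merges by taking the per-node maximum. The sketch below uses hypothetical types with string identifiers; the roadmap does not say which CRDT crate was integrated or how its types look.

```rust
use std::collections::{BTreeSet, HashMap};

/// Grow-only set, e.g. of assertion hashes. Elements are never removed.
#[derive(Clone, Default, Debug, PartialEq)]
pub struct GSet {
    items: BTreeSet<String>,
}

impl GSet {
    pub fn insert(&mut self, item: &str) {
        self.items.insert(item.to_string());
    }
    pub fn len(&self) -> usize {
        self.items.len()
    }
    /// Merge is set union: commutative, associative, idempotent.
    pub fn merge(&mut self, other: &GSet) {
        self.items.extend(other.items.iter().cloned());
    }
}

/// Grow-only counter: one monotonically increasing slot per node.
#[derive(Clone, Default, Debug, PartialEq)]
pub struct GCounter {
    per_node: HashMap<String, u64>,
}

impl GCounter {
    /// A node only ever increments its own slot.
    pub fn increment(&mut self, node: &str, by: u64) {
        *self.per_node.entry(node.to_string()).or_insert(0) += by;
    }
    /// Merge takes the maximum seen for each node's slot.
    pub fn merge(&mut self, other: &GCounter) {
        for (node, &count) in &other.per_node {
            let slot = self.per_node.entry(node.clone()).or_insert(0);
            *slot = (*slot).max(count);
        }
    }
    /// The counter's value is the sum over all slots.
    pub fn value(&self) -> u64 {
        self.per_node.values().sum()
    }
}
```

Because both merges are commutative and idempotent, replicas converge regardless of the order or number of times sync messages arrive.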

6B. Two-Node Replication (PoC)

  • 6B.1 RPC Layer: tonic gRPC with SyncClient and SyncServiceHandler.
  • 6B.2 Gossip Broadcast: Configurable fanout with rate limiting.
  • 6B.3 Merkle Anti-Entropy Sync: Background convergence worker.
  • 6B.4 Integration Test: 8 tests validating replication primitives.

6C. Multi-Node Cluster

  • 6C.1 Cluster Membership (SWIM Gossip): Node discovery and failure detection.
  • 6C.2 Subject-Prefix Range Sharding: BLAKE3 + jump hash routing.
  • 6C.4 Gateway: Stateless request routing with health and status endpoints.
  • 6C.5 Integration Tests: 82 tests covering membership, sharding, gateway.
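
Shard routing in the style of 6C.2 can be sketched with the jump consistent hash algorithm (Lamping and Veach). Two assumptions apply: std's DefaultHasher stands in for BLAKE3 so the example needs no external crate, and the whole subject string is hashed because the roadmap does not define where the subject prefix ends. The function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Jump consistent hash: maps a 64-bit key to a bucket in [0, num_buckets).
pub fn jump_hash(mut key: u64, num_buckets: u32) -> u32 {
    assert!(num_buckets > 0, "need at least one bucket");
    let mut b: i64 = -1;
    let mut j: i64 = 0;
    while j < num_buckets as i64 {
        b = j;
        key = key.wrapping_mul(2862933555777941757).wrapping_add(1);
        j = ((b + 1) as f64 * ((1u64 << 31) as f64 / ((key >> 33) + 1) as f64)) as i64;
    }
    b as u32
}

/// Route a subject to a shard. DefaultHasher is a stand-in for BLAKE3 here and
/// is not guaranteed stable across Rust releases, so do not persist its output.
pub fn shard_for_subject(subject: &str, num_shards: u32) -> u32 {
    let mut hasher = DefaultHasher::new();
    subject.hash(&mut hasher);
    jump_hash(hasher.finish(), num_shards)
}
```

The useful property is minimal movement: growing the cluster from n to n+1 shards either leaves a key where it was or moves it to the new shard, never between existing shards.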

Consistency Guarantees

| Property | Guarantee | Mechanism |
| --- | --- | --- |
| Convergence | Eventually consistent | G-Set merge (CRDT) |
| Causality | Supersessions ordered | HLC timestamps |
| Partition Tolerance | Writes never blocked | Any node accepts via CRDT |
| Availability | Reads/writes always succeed | Every node is master for CRDTs |
| Durability | WAL + fsync per node | Existing WAL infra |
| Conflict Resolution | Deterministic | Lens algorithms |

Phase 7: The Shield (Trust at Scale)

Goal: Defend against spam, Sybil attacks, and knowledge poisoning.

7A. Admission Control

  • 7A.1 Proof-of-Work Admission: BLAKE3 hashcash with graduated difficulty.
  • 7A.2 Graduated Trust Tiers: 5 tiers (Untrusted → Authority) with quota multipliers.

7B. EigenTrust

  • 7B.1 Trust Graph Store: Direct trust relationships at TG: keys.
  • 7B.2 EigenTrust Computation: Power iteration with Sybil resistance.
  • 7B.3 Domain-Specific Trust: Per-predicate-namespace reputation.
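
The power iteration in 7B.2 can be sketched with the standard EigenTrust update t' = (1 - alpha) * C^T * t + alpha * p, where C is the row-normalized local trust matrix and p is the pre-trusted distribution. The dense matrix representation, the alpha parameter, the fixed iteration count, and the function name are assumptions; the roadmap does not describe how the trust graph at TG: keys is loaded or how convergence is checked.

```rust
/// Global trust via power iteration.
/// `local[i][j]` is how much agent i trusts agent j (negative values are clamped to 0).
/// `pretrusted` must sum to 1; agents with no outgoing trust defer to it.
pub fn eigentrust(
    local: &[Vec<f64>],
    pretrusted: &[f64],
    alpha: f64,
    iterations: usize,
) -> Vec<f64> {
    let n = local.len();
    // Row-normalize so each agent distributes exactly one unit of trust.
    let c: Vec<Vec<f64>> = local
        .iter()
        .map(|row| {
            let sum: f64 = row.iter().map(|&v| v.max(0.0)).sum();
            if sum > 0.0 {
                row.iter().map(|&v| v.max(0.0) / sum).collect::<Vec<f64>>()
            } else {
                pretrusted.to_vec()
            }
        })
        .collect();

    let mut t = pretrusted.to_vec();
    for _ in 0..iterations {
        let mut next = vec![0.0; n];
        for i in 0..n {
            for j in 0..n {
                next[j] += c[i][j] * t[i]; // accumulate C^T * t
            }
        }
        for j in 0..n {
            next[j] = (1.0 - alpha) * next[j] + alpha * pretrusted[j];
        }
        t = next;
    }
    t
}
```

Seeding from pre-trusted agents is what gives the Sybil resistance in this sketch: a cluster of agents that only vouch for each other receives no trust unless someone reachable from the seed set trusts them.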

7C. Content Defense

  • 7C.1 MinHash Deduplication: LSH bucketing with 0.9 Jaccard threshold.
  • 7C.2 Content Quality Scoring: Entropy, length, structure heuristics.
  • 7C.3 Quarantine Store: Time-ordered suspicious assertions with admin review.

7D. Circuit Breakers

  • 7D.1 Per-Agent Circuit Breakers: Closed → Open → HalfOpen state machine.
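
The Closed → Open → HalfOpen state machine can be sketched as below. The failure threshold, the cooldown, the explicit clock parameter, and all names are assumptions chosen for illustration; the roadmap only names the three states.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum State {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical per-agent breaker. Time is passed in as seconds so the logic is testable.
pub struct CircuitBreaker {
    state: State,
    failures: u32,
    failure_threshold: u32,
    cooldown_secs: u64,
    opened_at: u64,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown_secs: u64) -> Self {
        CircuitBreaker {
            state: State::Closed,
            failures: 0,
            failure_threshold,
            cooldown_secs,
            opened_at: 0,
        }
    }

    pub fn state(&self) -> State {
        self.state
    }

    /// Returns whether a request may proceed. After the cooldown, an Open
    /// breaker moves to HalfOpen and lets a probe request through.
    pub fn allow(&mut self, now: u64) -> bool {
        if self.state == State::Open
            && now.saturating_sub(self.opened_at) >= self.cooldown_secs
        {
            self.state = State::HalfOpen;
        }
        self.state != State::Open
    }

    /// Any success closes the breaker and clears the failure count.
    pub fn record_success(&mut self) {
        self.failures = 0;
        self.state = State::Closed;
    }

    /// A failed probe, or reaching the threshold while Closed, opens the breaker.
    pub fn record_failure(&mut self, now: u64) {
        self.failures += 1;
        if self.state == State::HalfOpen || self.failures >= self.failure_threshold {
            self.state = State::Open;
            self.opened_at = now;
        }
    }
}
```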

Phase 8A: Chaos Testing

  • 8A.1 Partition Testing: 5-node cluster, network partitions, cascading failures.
  • 8A.2 Jepsen-Style Consistency Testing: CRDT properties, clock skew, concurrent writes.

Consumer Health MVP

"Can Episteme demonstrate value that's impossible with Postgres?"

Definition of Done (All Complete)

| Checkpoint | Description |
| --- | --- |
| Real Data Flows | FDA drug labels for 3+ GLP-1 drugs ingested as signed assertions |
| Conflicts Detected | SkepticLens shows conflict_score > 0.5 when sources disagree |
| Source Hierarchy Works | Tier 0 (FDA) outweighs 100x Tier 5 (anecdotal) volume |
| Time Travel Works | as_of=2024-01-01 returns historical snapshot |
| Decay Works | 6-month-old Reddit claim has lower effective confidence than fresh FDA |
| UAT Passes | Consumer Health scenarios documented and verified |
| Self-Serve Demo | CLI tool lets anyone explore without code |
| Documentation | "Adding a Domain" guide enables new verticals |

MVP Workstream (Weeks 1-6)

  • Week 1: Domain definitions, SubjectBuilder, pharma schema
  • Week 2: FDA extractor, claim-to-assertion signing
  • Week 3: Ingest FDA claims, mock conflicts, SkepticLens demo
  • Week 4: UAT scenarios documented and verified
  • Week 5: steme-pharma CLI for self-serve exploration
  • Week 6: Polish, reusable patterns, documentation

Key Architectural Decisions (Historical)

  • sled → redb/fjall: sled was abandoned upstream; its replacement, HybridStore, routes to redb or fjall by key prefix.
  • Raft log = WAL: Eliminated duplicate WAL following TiKV v5.4 pattern.
  • CRDT for data, Raft for coordination: Assertions are G-Set CRDT.
  • Subject-prefix ranges: Co-locate all data for a subject on one shard.
  • HLC over TrueTime: Works on commodity hardware.
  • AP model: Writes never blocked during partitions.

Research Documents


Crates (as of archive date)

| Crate | Purpose |
| --- | --- |
| stemedb-core | Assertion, LifecycleStage, MaterializedView, types, signing utilities |
| stemedb-wal | Write-ahead log with crash recovery |
| stemedb-storage | KVStore, VoteStore, IndexStore, TrustRankStore, QuarantineStore, SimilarityIndex |
| stemedb-ingest | Ingestion pipeline, signature verification, ContentDefenseLayer |
| stemedb-query | Query engine, Materializer for O(1) MV reads |
| stemedb-lens | Lenses (Recency, Consensus, Authority, Skeptic, Layered, etc.) |
| stemedb-api | HTTP API with axum + utoipa OpenAPI docs |
| stemedb-sim | Simulation for testing the pipeline |
| stemedb-merkle | BLAKE3 Merkle tree for diff detection |
| stemedb-rpc | gRPC services for node-to-node communication |
| stemedb-sync | Merkle sync, gossip broadcast, anti-entropy |
| stemedb-cluster | Cluster membership (SWIM), sharding, gateway |
| stemedb-ontology | Domain definitions (Pharma), subject builders, medical extractors |
| stemedb-chaos | Chaos testing infrastructure |