# Episteme (StemeDB) Roadmap Archive

> **Purpose:** Historical record of completed phases. For current work, see [roadmap.md](./roadmap.md).
> **Last Updated:** 2026-02-05

---

## Completed Phases Summary

| Phase | Codename | Status | Scope |
|-------|----------|--------|-------|
| **1** | The Spine | ✅ Complete | Storage & Safety: WAL + KV Store |
| **2** | The Lattice | ✅ Complete | Indexing & Async: MVs + Ballot Box |
| **2.5** | Hardening | ✅ Complete | MV staleness, epoch behavior, lens cleanup |
| **3** | The Pilot | ✅ Complete | Vertical Integration: Pharma Ingestion |
| **4** | The Hive | ✅ Complete | Trust & Learning: TrustRank, metadata indexing |
| **5** | The Forge | ✅ Complete | Foundation Hardening: redb/fjall, WAL, indices |
| **6** | The Mesh | ✅ Complete | Distributed Writes: CRDT, Raft, clustering |
| **7** | The Shield | ✅ Complete | Trust at Scale: EigenTrust, PoW, quarantine |
| **8A** | Chaos | ✅ Complete | Partition testing, Jepsen-style verification |
| **MVP** | Consumer Health | ✅ Complete | Real FDA data → conflicts detected → demo |

---

## Phase 1: The Spine (Foundation) ✅

*Goal: Securely ingest assertions and persist them without data loss.*

- [x] **Project Scaffold**: Initialize Rust workspace, set up linting/CI (clippy, fmt).
- [x] **Assertion Schema**: Define the `Assertion` struct with `rkyv` serialization.
  - [x] Add dependencies: `rkyv`, `blake3`, `ed25519-dalek`, `image_hasher`.
  - [x] Define `Assertion` struct (Subject, Predicate, Object, Confidence, SourceHash).
  - [x] **Multi-Sig Expansion**: Implement `SignatureEntry` struct and `signatures: Vec<SignatureEntry>` field.
  - [x] **Visual Expansion**: Add `visual_hash: Option<pHash>` field for image provenance.
  - [x] Test serialization round-trips.
- [x] **Ballot Schema**: Define the `Vote` struct for multi-agent consensus.
  - [x] Add `Vote` struct: `assertion_hash`, `agent_id`, `weight`, `signature`.
  - [x] Test serialization round-trips.
- [x] **Paradigm Schema (Epochs)**: Define the `Epoch` and `SupersessionType` structs.
  - [x] Add `epoch: Option<EpochId>` to `Assertion`.
  - [x] Implement `Epoch` struct with `supersedes` and `SupersessionType`.
  - [x] Test serialization round-trips.
- [x] **WAL Integration**: Implement the Quarantine Pattern for write-ahead logging.
  - [x] Create `stemedb-wal` crate.
  - [x] Port `FsyncGuard` and `Record` logic from established durability patterns.
  - [x] Implement Record format with BLAKE3 checksums and Headers.
  - [x] Verify `fsync` behavior with tests.
- [x] **Storage Engine**: Implement the `Store` trait using `sled` (embedded KV).
  - [x] Add `sled` dependency.
  - [x] Define `KVStore` trait (put, get, delete, scan_prefix, flush).
  - [x] Implement `SledStore` wrapper.
- [x] **Basic Ingestor**: Background worker that tails WAL and writes to KV.
  - [x] Implement async loop reading from WAL.
  - [x] Write deserialized assertions, votes, and epochs to `sled`.
  - [x] Ed25519 signature verification during ingestion.
  - [x] Maintains S: and SP: indexes on ingest.
  - [x] Persistent cursor/checkpoint (resumes from `__CURSOR__:ingest` in KV store).
- [x] **Verification**: Crash recovery tests (write -> crash -> restart -> read).
  - [x] Single and multi-record crash recovery.
  - [x] Multiple crash cycles tested.

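The Storage Engine item names a `KVStore` trait with `put`, `get`, `delete`, `scan_prefix`, and `flush`. A minimal sketch of what such a trait could look like follows, backed by an in-memory `BTreeMap` rather than `sled`. The method signatures and the `MemStore` type are assumptions for illustration; the archive records only the method names. With keys shaped like `S:aspirin:1` (a hypothetical layout for the `S:` index), `scan_prefix(b"S:aspirin:")` would return every entry for that subject.

```rust
use std::collections::BTreeMap;

/// Hypothetical shape of the `KVStore` trait. The method names come from the
/// roadmap; the signatures are assumptions.
pub trait KVStore {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn delete(&mut self, key: &[u8]);
    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
    fn flush(&mut self);
}

/// In-memory stand-in for a real embedded store (illustration only).
#[derive(Default)]
pub struct MemStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KVStore for MemStore {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }

    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }

    fn delete(&mut self, key: &[u8]) {
        self.map.remove(key);
    }

    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        // Keys are sorted, so start at the prefix and stop at the first
        // key that no longer begins with it.
        self.map
            .range(prefix.to_vec()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }

    fn flush(&mut self) {
        // Nothing to persist for an in-memory map.
    }
}
```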
---

## Phase 2: The Lattice (Connectivity) ✅

*Goal: Query data with sub-millisecond latency using Materialized Views.*

- [x] **Lifecycle Schema**: Add `LifecycleStage` to Assertion.
  - [x] Define enum: `Proposed`, `UnderReview`, `Approved`, `Deprecated`, `Rejected`.
  - [x] Update `Assertion` struct and serialization tests.
- [x] **The Ballot Box**: Implement high-velocity vote ingestion.
  - [x] `VoteStore` trait and implementation.
  - [x] `VoteAwareConsensusLens` for real vote-based resolution.
- [x] **Index Infrastructure**: Compound indexes for O(1) queries.
  - [x] `IndexStore` trait with S: and SP: indexes.
  - [x] `QueryEngine` smart routing (SP -> S -> scan).
- [x] **Materializer**: Background worker for O(1) Read Performance.
  - [x] `MaterializedView` type in `stemedb-core`.
  - [x] `Materializer` worker in `stemedb-query` with `step()` and `run()`.
  - [x] Aggregates Votes via `VoteAwareConsensusLens` (or any `AsyncLens`).
  - [x] Updates `MV:{Subject}:{Predicate}` with the winning Assertion + metadata.
  - [x] Event-driven mode via `run_notified()` with `tokio::sync::Notify`.
  - [x] Fast-path MV lookup in `QueryEngine::try_fast_path()`.
- [x] **The Meter**: Implement Economic Throttling (TAN).
  - [x] `QuotaStore` trait and `GenericQuotaStore` implementation.
  - [x] Token Bucket algorithm with per-agent per-hour quotas.
  - [x] `MeterLayer` tower middleware for request cost tracking.
  - [x] Cost model: Assert=10, Vote=1, Query=5+lens, +1/KB payload.
  - [x] `GET /v1/meter/quota` endpoint to check remaining quota.
  - [x] `POST /v1/meter/quota/limit` admin endpoint to set custom limits.
- [x] **API Surface**: `axum` HTTP server with OpenAPI (utoipa).
  - [x] `POST /v1/assert` -> Accepts JSON, writes to WAL.
  - [x] `POST /v1/vote` -> High-throughput vote endpoint.
  - [x] `POST /v1/epoch` -> Create epoch with optional supersession.
  - [x] `GET /v1/query` -> Subject/Predicate/Lens/Lifecycle/Epoch filtering.
  - [x] `GET /v1/health` -> Health check with assertion count.
  - [x] `GET /swagger-ui` -> Interactive API docs.
  - [x] 5 lens types available: Recency, Consensus, Authority, VoteAwareConsensus, TrustAwareAuthority.
- [x] **Query Audit**: Log every read with provenance.
  - [x] Define `QueryAudit` struct: query_id, agent_id, timestamp, params, result_hash, contributing_assertions.
  - [x] Storage at `AUD:{query_id}` with agent index at `AUDA:{agent_id}:{timestamp}:{query_id}`.
  - [x] `GET /v1/audit/queries` -> Returns history of agent decisions.
  - [x] `GET /v1/audit/query/{id}` -> Full reasoning trace for a single query.
  - [x] Auto-logging on every query via `X-Agent-Id` header.

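The Meter combines a per-request cost model (Assert=10, Vote=1, Query=5+lens, +1/KB payload) with a token bucket holding a per-agent hourly quota. The sketch below shows one way the two could fit together. The `Op` enum, `request_cost`, and `TokenBucket` names are hypothetical, the lens surcharge is passed in because the archive does not list per-lens costs, and charging each started kilobyte (rounding up) is an assumption. Under these assumptions a 2 KiB vote costs 3 tokens, and a bucket created with `TokenBucket::per_hour(20, 0)` admits two asserts before rejecting further requests until it refills.

```rust
/// Hypothetical request kinds; `lens_cost` stands in for the per-lens
/// surcharge, whose actual values are not recorded in the archive.
pub enum Op {
    Assert,
    Vote,
    Query { lens_cost: u64 },
}

/// Cost model from the roadmap: Assert=10, Vote=1, Query=5+lens, +1/KB.
/// Assumption: each started kilobyte of payload is charged.
pub fn request_cost(op: &Op, payload_bytes: u64) -> u64 {
    let base = match op {
        Op::Assert => 10,
        Op::Vote => 1,
        Op::Query { lens_cost } => 5 + lens_cost,
    };
    base + (payload_bytes + 1023) / 1024
}

/// Token bucket that refills continuously at `capacity` tokens per hour.
pub struct TokenBucket {
    capacity: u64,
    tokens: u64,
    last_refill_secs: u64,
}

impl TokenBucket {
    pub fn per_hour(quota: u64, now_secs: u64) -> Self {
        TokenBucket { capacity: quota, tokens: quota, last_refill_secs: now_secs }
    }

    /// Refill for the elapsed time, then spend `cost` tokens if available.
    pub fn try_spend(&mut self, cost: u64, now_secs: u64) -> bool {
        let elapsed = now_secs.saturating_sub(self.last_refill_secs);
        let refill = elapsed.saturating_mul(self.capacity) / 3600;
        if refill > 0 {
            // Integer math drops fractional tokens; acceptable for a sketch.
            self.tokens = self.tokens.saturating_add(refill).min(self.capacity);
            self.last_refill_secs = now_secs;
        }
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}
```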
---

## Phase 2.5: Hardening ✅

*Goal: Close the gaps between "built" and "works right."*

- [x] **2.1 MV Staleness Detection**: `max_stale` parameter on queries.
- [x] **2.2 AuthorityLens -> ConfidenceLens Rename**: Eliminated misleading name.
- [x] **2.3 EpochAwareLens**: Epoch supersession runtime behavior with cycle detection.
- [x] **2.4 Visual Hash Query Support**: Hamming distance queries on `visual_hash`.
- [x] **2.5 Vector Field**: `vector: Option<Vec<f32>>` stored on assertions.
- [x] **2.6 E2E Integration Test**: Full pipeline validation (Write -> Materialize -> Read).

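Item 2.4 refers to Hamming distance queries on `visual_hash`. As a rough illustration, assuming the perceptual hash is stored as a 64-bit value (the archive does not state the width), the distance is the number of differing bits, and a brute-force query filters candidates by a maximum distance. The function names `hamming` and `within_distance` are hypothetical.

```rust
/// Number of differing bits between two 64-bit perceptual hashes.
pub fn hamming(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

/// Linear scan returning every candidate within `max_distance` bits of
/// `query`, in input order.
pub fn within_distance(candidates: &[u64], query: u64, max_distance: u32) -> Vec<u64> {
    candidates
        .iter()
        .copied()
        .filter(|&c| hamming(c, query) <= max_distance)
        .collect()
}
```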
---

## Phase 3: The Pilot (BioTech/Pharma) ✅

*Goal: Prove value in the "High-Liability" beachhead.*

### 3A. Schema Expansion
- [x] **3A.1 Source-Class Field**: 6-tier `SourceClass` enum (Regulatory → Anecdotal).
- [x] **3A.2 Conflict Score on Resolution**: Normalized variance-based conflict metric.
- [x] **3A.3 Rich Source Metadata**: `source_metadata: Option<Vec<u8>>` for JSON provenance.

### 3B. Time & Decay
- [x] **3B.1 Time-Travel Engine**: `as_of` parameter for historical queries.
- [x] **3B.2 Semantic Decay**: Confidence half-life with tier-specific rates.

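Item 3B.2 describes confidence decaying with a half-life that depends on the source tier. A common way to express this is exponential decay, `effective = confidence * 0.5^(age / half_life)`. The sketch below assumes that form; the function name and the idea of passing the half-life in days are illustrative, and the archive does not record the actual per-tier half-lives. With a hypothetical 90-day half-life, a claim with confidence 0.9 drops to 0.225 after 180 days.

```rust
/// Exponential half-life decay: the confidence halves every
/// `half_life_days`. Both the formula and the day-based units are
/// assumptions for illustration.
pub fn effective_confidence(confidence: f64, age_days: f64, half_life_days: f64) -> f64 {
    assert!(half_life_days > 0.0, "half-life must be positive");
    confidence * 0.5_f64.powf(age_days / half_life_days)
}
```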
### 3C. New Lenses
- [x] **3C.1 Skeptic Lens**: Surface disagreement via Shannon entropy conflict scoring.
- [x] **3C.2 Layered Consensus Lens**: Per-source-class consensus with tier visibility.
- [x] **3C.3 Constraints Lens**: Pre-flight check for must_use/forbidden/prefer.

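The Skeptic Lens is described as scoring disagreement with Shannon entropy. One plausible reading, sketched here, treats the total weight behind each competing object value as a probability distribution and normalizes its entropy by the maximum possible for that many values, giving a score between 0 (full agreement) and 1 (an even split). The `conflict_score` function and the normalization choice are assumptions; the archive does not specify how the lens weights or normalizes its inputs.

```rust
/// Normalized Shannon entropy over the weight behind each competing value.
/// Returns 0.0 when fewer than two values carry weight.
pub fn conflict_score(weights: &[f64]) -> f64 {
    let positive: Vec<f64> = weights.iter().copied().filter(|w| *w > 0.0).collect();
    if positive.len() < 2 {
        return 0.0;
    }
    let total: f64 = positive.iter().sum();
    let entropy: f64 = positive
        .iter()
        .map(|w| {
            let p = w / total;
            -p * p.log2()
        })
        .sum();
    // Maximum entropy for k outcomes is log2(k), so this lands in [0, 1].
    entropy / (positive.len() as f64).log2()
}
```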
### 3D. Epoch Enhancement
- [x] **3D.1 Epoch Cascade Logic**: O(1) supersession lookup via pre-computed markers.

### 3E. Similarity Search
- [x] **3E.1 Vector Search**: HNSW-based semantic k-NN queries.
- [x] **3E.2 Visual Hash Index**: BK-tree for O(log N) visual similarity.

### 3F. Provenance
- [x] **3F.1 Source Document Storage**: Content-addressed source storage with `GET /v1/provenance/{hash}`.

### 3G. API Cleanup
- [x] **3G.1 Document epoch supersession**: Updated docs for `POST /v1/epoch` with `supersedes` field.

---

## Phase 4: The Hive (Trust & Scale) ✅

*Goal: Change tracking, metadata indexing, and training pipeline primitives.*

- [x] **TrustRank Engine**: Per-agent reputation with decay and learning loop.
- [x] **4.1 "Since" Parameter**: MV changelog at `MVC:` keys with `changes_since` in responses.
- [x] **4.2 Source Metadata Indexing**: Indexed fields (journal, doi, platform, study_design) at `SMV:`.
- [x] **4.3 Batch TrustRank Decay API**: `POST /v1/admin/decay-trust-ranks`.
- [x] **4.4 Vote Provenance Witness**: `source_url` and `observed_context` on votes.
- [x] **4.5 Conflict Score Filtering**: `min_conflict_score`/`max_conflict_score` on queries.
- [x] **4.6 Escalation Triggers**: `EscalationPolicy` fires events on high-conflict assertions.
- [x] **4.7 Gold Standard Verification**: Admin-verified assertions for agent testing.

---

## Phase 5: The Forge (Foundation Hardening) ✅

*Goal: Replace abandoned dependencies, fix WAL gaps, persist indices.*

### 5A. Storage Engine Replacement
- [x] **5A.1 Replace sled with redb + fjall**: HybridStore with prefix-based routing.
- [x] **5A.2 Key Layout Redesign**: Subject-prefix keys for range sharding readiness.

### 5B. WAL Hardening
- [x] **5B.1 CRC32C Checksums**: Hardware-accelerated torn write detection.
- [x] **5B.2 Crash Recovery Implementation**: Sequential scan with truncation.
- [x] **5B.3 Group Commit**: Batch fsync for throughput.
- [x] **5B.4 Log Rotation**: Segment management with safe deletion.

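Items 5B.1 and 5B.2 rely on a per-record checksum so that a sequential recovery scan can stop at the first torn or corrupted record. The sketch below illustrates the idea with a plain software CRC32C (Castagnoli polynomial, bit-by-bit) rather than the hardware-accelerated path the roadmap mentions. The record layout (`length`, `crc`, then payload, both little-endian `u32`) and the function names are assumptions, not the actual `stemedb-wal` format. A recovery loop could call `decode_record` repeatedly and truncate the log at the first `None`.

```rust
/// Bitwise CRC32C (reflected Castagnoli polynomial 0x82F63B78).
/// Real implementations typically use table lookups or CPU instructions.
pub fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Hypothetical framing: [len: u32 LE][crc32c(payload): u32 LE][payload].
pub fn encode_record(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32c(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Returns the payload if the record is complete and its checksum matches;
/// `None` signals a torn or corrupted write.
pub fn decode_record(buf: &[u8]) -> Option<&[u8]> {
    if buf.len() < 8 {
        return None;
    }
    let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let crc = u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]);
    let end = 8usize.checked_add(len)?;
    let payload = buf.get(8..end)?;
    if crc32c(payload) == crc {
        Some(payload)
    } else {
        None
    }
}
```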
### 5C. Index Persistence
- [x] **5C.1 Persistent Vector Index**: Hot/cold HNSW with checkpoint files.
- [x] **5C.2 Persistent Visual Index**: BK-tree snapshots with CRC32C verification.

### 5D. Concept Hierarchy
- [x] **5D.1 ConceptPath Type**: Scheme-qualified subject identifiers.
- [x] **5D.2 Source Scheme Registry**: Scheme → default source tier mapping.
- [x] **5D.3 Alias Store**: Cross-scheme entity resolution with cycle detection.
- [x] **5D.4 Hierarchical Query**: Prefix-based subject queries.
- [x] **5D.5 Alias Resolution in Queries**: `GET /v1/concepts/resolve?path=...`.
- [x] **5D.6 Source Class Inference**: Tier inference from scheme.
- [x] **5D.7 Concept API Endpoints**: Full CRUD for aliases and hierarchy.
- [x] **5D.8 Battery Tests**: 15 tests across Battery 8 and 9.

---

## Phase 6: The Mesh (Distributed Writes) ✅

*Goal: Multi-node cluster with CRDT replication and Raft coordination.*

### 6A. CRDT Foundation
- [x] **6A.1 Integrate CRDT Crate**: G-Set for assertions, G-Counter for votes.
- [x] **6A.2 Hybrid Logical Clocks**: HLC timestamps for causal ordering.
- [x] **6A.3 Merkle Tree Over Assertions**: BLAKE3-based diff detection.

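Item 6A.1 pairs a G-Set (grow-only set) for assertions with a G-Counter (grow-only counter) for votes. Both converge because their merge operations are commutative, associative, and idempotent: set union for the G-Set, and a per-node maximum for the G-Counter. The hand-rolled types below are a sketch of those two merges only; the roadmap integrates an existing CRDT crate, whose API is not shown here, and the use of `String` identifiers is an assumption.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Grow-only set of assertion identifiers (here plain strings).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct GSet {
    items: BTreeSet<String>,
}

impl GSet {
    pub fn insert(&mut self, id: &str) {
        self.items.insert(id.to_string());
    }

    pub fn contains(&self, id: &str) -> bool {
        self.items.contains(id)
    }

    /// Merge is set union, so replicas converge regardless of order.
    pub fn merge(&mut self, other: &GSet) {
        self.items.extend(other.items.iter().cloned());
    }
}

/// Grow-only counter: each node only ever increases its own slot.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct GCounter {
    per_node: BTreeMap<String, u64>,
}

impl GCounter {
    pub fn increment(&mut self, node: &str, by: u64) {
        *self.per_node.entry(node.to_string()).or_insert(0) += by;
    }

    /// Merge takes the per-node maximum, which makes it idempotent.
    pub fn merge(&mut self, other: &GCounter) {
        for (node, &count) in &other.per_node {
            let slot = self.per_node.entry(node.clone()).or_insert(0);
            *slot = (*slot).max(count);
        }
    }

    pub fn value(&self) -> u64 {
        self.per_node.values().sum()
    }
}
```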
### 6B. Two-Node Replication (PoC)
- [x] **6B.1 RPC Layer**: tonic gRPC with SyncClient and SyncServiceHandler.
- [x] **6B.2 Gossip Broadcast**: Configurable fanout with rate limiting.
- [x] **6B.3 Merkle Anti-Entropy Sync**: Background convergence worker.
- [x] **6B.4 Integration Test**: 8 tests validating replication primitives.

### 6C. Multi-Node Cluster
- [x] **6C.1 Cluster Membership (SWIM Gossip)**: Node discovery and failure detection.
- [x] **6C.2 Subject-Prefix Range Sharding**: BLAKE3 + jump hash routing.
- [x] **6C.4 Gateway**: Stateless request routing with health and status endpoints.
- [x] **6C.5 Integration Tests**: 82 tests covering membership, sharding, gateway.

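Item 6C.2 routes subjects to shards with BLAKE3 plus jump hash. Jump consistent hashing maps a 64-bit key to one of `n` buckets so that growing from `n` to `n + 1` buckets moves a key only if it lands in the new bucket. The sketch below implements the published jump hash loop; to stay dependency-free it derives the 64-bit key with the standard library's `DefaultHasher` as a stand-in for BLAKE3, which is an assumption and not what the project uses. The `shard_for` name and the example subject prefix are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Jump consistent hash: maps `key` to a bucket in `0..num_buckets`.
pub fn jump_hash(mut key: u64, num_buckets: u32) -> u32 {
    assert!(num_buckets > 0, "need at least one bucket");
    let mut b: i64 = -1;
    let mut j: i64 = 0;
    while j < i64::from(num_buckets) {
        b = j;
        // Linear congruential step drives the pseudo-random jumps.
        key = key.wrapping_mul(2862933555777941757).wrapping_add(1);
        j = ((b + 1) as f64 * ((1u64 << 31) as f64 / ((key >> 33) + 1) as f64)) as i64;
    }
    b as u32
}

/// Route a subject prefix to a shard. `DefaultHasher` is a stand-in for
/// the BLAKE3 digest named in the roadmap (assumption).
pub fn shard_for(subject_prefix: &str, num_shards: u32) -> u32 {
    let mut hasher = DefaultHasher::new();
    subject_prefix.hash(&mut hasher);
    jump_hash(hasher.finish(), num_shards)
}
```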
### Consistency Guarantees
| Property | Guarantee | Mechanism |
|----------|-----------|-----------|
| **Convergence** | Eventually consistent | G-Set merge (CRDT) |
| **Causality** | Supersessions ordered | HLC timestamps |
| **Partition Tolerance** | Writes never blocked | Any node accepts via CRDT |
| **Availability** | Reads/writes always succeed | Every node is master for CRDTs |
| **Durability** | WAL + fsync per node | Existing WAL infra |
| **Conflict Resolution** | Deterministic | Lens algorithms |

---

## Phase 7: The Shield (Trust at Scale) ✅

*Goal: Defend against spam, Sybil attacks, and knowledge poisoning.*

### 7A. Admission Control
- [x] **7A.1 Proof-of-Work Admission**: BLAKE3 hashcash with graduated difficulty.
- [x] **7A.2 Graduated Trust Tiers**: 5 tiers (Untrusted → Authority) with quota multipliers.

### 7B. EigenTrust
- [x] **7B.1 Trust Graph Store**: Direct trust relationships at `TG:` keys.
- [x] **7B.2 EigenTrust Computation**: Power iteration with Sybil resistance.
- [x] **7B.3 Domain-Specific Trust**: Per-predicate-namespace reputation.

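Item 7B.2 computes global trust by power iteration. In the textbook form of EigenTrust, each peer's direct trust row is normalized to sum to 1, and the global vector is repeatedly updated as `t = (1 - alpha) * C^T * t + alpha * p`, where `p` is a distribution over pre-trusted peers. Mixing in `p` is what limits Sybil clusters: peers that only vouch for each other gain no trust unless someone reachable from the pre-trusted set vouches for them. The sketch below follows that textbook form; the function signature, the dense matrix representation, and the `alpha` parameter are assumptions, not the project's implementation.

```rust
/// Textbook EigenTrust power iteration over a dense `n x n` matrix of
/// non-negative direct trust scores. `pretrusted` must sum to 1.
pub fn eigentrust(local: &[Vec<f64>], pretrusted: &[f64], alpha: f64, iters: usize) -> Vec<f64> {
    let n = local.len();
    // Row-normalize; peers that trust nobody defer to the pre-trusted set.
    let c: Vec<Vec<f64>> = local
        .iter()
        .map(|row| {
            let sum: f64 = row.iter().sum();
            if sum > 0.0 {
                row.iter().map(|v| v / sum).collect()
            } else {
                pretrusted.to_vec()
            }
        })
        .collect();

    let mut t = pretrusted.to_vec();
    for _ in 0..iters {
        let mut next = vec![0.0; n];
        for i in 0..n {
            for j in 0..n {
                // Peer i passes its trust to j in proportion to c[i][j].
                next[j] += c[i][j] * t[i];
            }
        }
        for j in 0..n {
            next[j] = (1.0 - alpha) * next[j] + alpha * pretrusted[j];
        }
        t = next;
    }
    t
}
```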
### 7C. Content Defense
- [x] **7C.1 MinHash Deduplication**: LSH bucketing with 0.9 Jaccard threshold.
- [x] **7C.2 Content Quality Scoring**: Entropy, length, structure heuristics.
- [x] **7C.3 Quarantine Store**: Time-ordered suspicious assertions with admin review.

### 7D. Circuit Breakers
- [x] **7D.1 Per-Agent Circuit Breakers**: Closed → Open → HalfOpen state machine.

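Item 7D.1 names a Closed → Open → HalfOpen state machine per agent. A minimal sketch of such a breaker follows: consecutive failures trip it open, a cooldown moves it to half-open, and the next result either closes it again or re-opens it. The `CircuitBreaker` type, the failure threshold, and the cooldown in seconds are hypothetical parameters; the archive records only the three states.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum State {
    Closed,
    Open,
    HalfOpen,
}

pub struct CircuitBreaker {
    state: State,
    failures: u32,
    failure_threshold: u32,
    cooldown_secs: u64,
    opened_at_secs: u64,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown_secs: u64) -> Self {
        CircuitBreaker {
            state: State::Closed,
            failures: 0,
            failure_threshold,
            cooldown_secs,
            opened_at_secs: 0,
        }
    }

    pub fn state(&self) -> State {
        self.state
    }

    /// Whether a request may proceed. An open breaker becomes half-open
    /// once the cooldown has elapsed, letting a probe request through.
    pub fn allow(&mut self, now_secs: u64) -> bool {
        if self.state == State::Open
            && now_secs.saturating_sub(self.opened_at_secs) >= self.cooldown_secs
        {
            self.state = State::HalfOpen;
        }
        self.state != State::Open
    }

    pub fn record_success(&mut self) {
        self.failures = 0;
        self.state = State::Closed;
    }

    /// A failed probe in half-open, or too many failures while closed,
    /// opens the breaker and restarts the cooldown.
    pub fn record_failure(&mut self, now_secs: u64) {
        self.failures += 1;
        if self.state == State::HalfOpen || self.failures >= self.failure_threshold {
            self.state = State::Open;
            self.opened_at_secs = now_secs;
        }
    }
}
```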
---

## Phase 8A: Chaos Testing ✅

- [x] **8A.1 Partition Testing**: 5-node cluster, network partitions, cascading failures.
- [x] **8A.2 Jepsen-Style Consistency Testing**: CRDT properties, clock skew, concurrent writes.

---

## Consumer Health MVP ✅

*"Can Episteme demonstrate value that's impossible with Postgres?"*

### Definition of Done (All Complete)
| Checkpoint | Description |
|------------|-------------|
| **Real Data Flows** | FDA drug labels for 3+ GLP-1 drugs ingested as signed assertions |
| **Conflicts Detected** | SkepticLens shows `conflict_score > 0.5` when sources disagree |
| **Source Hierarchy Works** | Tier 0 (FDA) outweighs 100x Tier 5 (anecdotal) volume |
| **Time Travel Works** | `as_of=2024-01-01` returns historical snapshot |
| **Decay Works** | 6-month-old Reddit claim has lower effective confidence than fresh FDA |
| **UAT Passes** | Consumer Health scenarios documented and verified |
| **Self-Serve Demo** | CLI tool lets anyone explore without code |
| **Documentation** | "Adding a Domain" guide enables new verticals |

### MVP Workstream (Weeks 1-6)
- Week 1: Domain definitions, SubjectBuilder, pharma schema
- Week 2: FDA extractor, claim-to-assertion signing
- Week 3: Ingest FDA claims, mock conflicts, SkepticLens demo
- Week 4: UAT scenarios documented and verified
- Week 5: `steme-pharma` CLI for self-serve exploration
- Week 6: Polish, reusable patterns, documentation

---

## Key Architectural Decisions (Historical)

- **sled → redb/fjall**: sled abandoned. HybridStore routes by key prefix.
- **Raft log = WAL**: Eliminated duplicate WAL following TiKV v5.4 pattern.
- **CRDT for data, Raft for coordination**: Assertions are G-Set CRDT.
- **Subject-prefix ranges**: Co-locate all data for a subject on one shard.
- **HLC over TrueTime**: Works on commodity hardware.
- **AP model**: Writes never blocked during partitions.

---

## Research Documents

- [docs/research/wal-crash-recovery-research.md](docs/research/wal-crash-recovery-research.md): WAL patterns from CockroachDB, TiKV, FoundationDB, SQLite.
- [docs/research/distributed-write-path.md](docs/research/distributed-write-path.md): Spanner/CockroachDB-style distributed writes adapted for append-only model.

---

## Crates (as of archive date)

| Crate | Purpose |
|-------|---------|
| `stemedb-core` | Assertion, LifecycleStage, MaterializedView, types, signing utilities |
| `stemedb-wal` | Write-ahead log with crash recovery |
| `stemedb-storage` | KVStore, VoteStore, IndexStore, TrustRankStore, QuarantineStore, SimilarityIndex |
| `stemedb-ingest` | Ingestion pipeline, signature verification, ContentDefenseLayer |
| `stemedb-query` | Query engine, Materializer for O(1) MV reads |
| `stemedb-lens` | Lenses (Recency, Consensus, Authority, Skeptic, Layered, etc.) |
| `stemedb-api` | HTTP API with axum + utoipa OpenAPI docs |
| `stemedb-sim` | Simulation for testing the pipeline |
| `stemedb-merkle` | BLAKE3 Merkle tree for diff detection |
| `stemedb-rpc` | gRPC services for node-to-node communication |
| `stemedb-sync` | Merkle sync, gossip broadcast, anti-entropy |
| `stemedb-cluster` | Cluster membership (SWIM), sharding, gateway |
| `stemedb-ontology` | Domain definitions (Pharma), subject builders, medical extractors |
| `stemedb-chaos` | Chaos testing infrastructure |