# Episteme (StemeDB) Roadmap

> **Goal:** Build the "Git for Truth" substrate for autonomous AI research.
> **Current Focus:** A5.3 Claim Suggester validation + Pilot 5 Operational Readiness
> **Target Vertical:** BioTech/Pharma ("The Living Review") + Code Truth (Aphoria)
> **Endgame:** Distributed multi-writer cluster for millions of concurrent agents
>
> **Infrastructure Status:** Phases 1-7 complete | Phase 8A (Chaos) complete | Pilot 1-4 complete
> **Aphoria Status:** A1-A4 complete (observations/claims/verify/corpus) | A5 flywheel 3/4 done
>
> **Archive:** For completed phases 1-8A + Pilot 1-3, see [roadmap-archive.md](./roadmap-archive.md)

---

## Current Status

| Phase | Status | Summary |
|-------|--------|---------|
| **1-7, 8A** | ✅ Complete | Core infra, cluster, trust, chaos testing |
| **MVP, Pilot 1-4** | ✅ Complete | Consumer Health demo, dashboard, API auth, metrics |
| **Aphoria A1-A4** | ✅ Complete | Observations/claims/verify/corpus/authority lens |
| **Aphoria A5** | 🎯 In Progress | Flywheel: 3/4 done, A5.3 suggest skill needs validation |
| **Pilot 5** | Planned | Operational readiness: runbooks, ref arch, demo validation |
| **8B-C** | Planned | Distributed observability, geo-distribution |
| **9** | Planned | Disaster recovery, compliance, storage management |

---

## 🎯 Aphoria: From Scanner to Knowledge Graph Client (CURRENT)

> **Goal:** Transform Aphoria from "grep with Episteme vocabulary" into a real knowledge graph client that authors, stores, and audits claims with provenance and lineage.
> **Vision Document:** [applications/aphoria/docs/vision-gaps.md](./applications/aphoria/docs/vision-gaps.md)
> **Validation:** Maxwell scan (67 observations, 0 noise) + hand-written [claims-explained.md](./claims-explained.md)

### Completed Phases (A1-A4 + P4; see [roadmap-archive.md](./roadmap-archive.md) for details)

| Phase | What It Delivered |
|-------|-------------------|
| **A1** | `Observation` vs `AuthoredClaim` types, bridge tier mapping, `.aphoria/claims.toml` format |
| **A2** | `aphoria claims create/list/explain/update/supersede/deprecate`, `aphoria-claims` skill |
| **A3** | `verify.rs` engine (Pass/Conflict/Missing/Unclaimed), `aphoria verify run/map`, pre-commit hook, self-audit |
| **A4** | RFC/OWASP as Episteme assertions, `AphoriaAuthorityLens`, Trust Pack export/install |
| **P4** | API auth (3 roles), backup/restore scripts, Prometheus metrics + Grafana dashboard |

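> **Sketch:** The four A3 outcomes (Pass/Conflict/Missing/Unclaimed) can be illustrated with a toy classifier. This is not the code in `verify.rs`. The struct fields, the `Outcome` enum name, the `verify` function, and the exact meaning of each outcome (Pass = values agree, Conflict = values differ, Missing = claim with no observation, Unclaimed = observation with no claim) are all assumptions made for illustration, and the comparison is plain string equality rather than the real comparison modes.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an extractor observation (fields are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct Observation {
    pub predicate: String,
    pub value: String,
}

/// Hypothetical stand-in for an authored claim (fields are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct AuthoredClaim {
    pub predicate: String,
    pub expected: String,
}

/// The four outcomes named in the A3 row; semantics here are assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum Outcome {
    Pass,
    Conflict,
    Missing,
    Unclaimed,
}

/// Classify every predicate that appears in the claims or the observations.
/// Claims are reported first (in input order), then unclaimed observations.
pub fn verify(claims: &[AuthoredClaim], observations: &[Observation]) -> Vec<(String, Outcome)> {
    // Index observations by predicate for constant-time lookup.
    let observed: HashMap<&str, &str> = observations
        .iter()
        .map(|o| (o.predicate.as_str(), o.value.as_str()))
        .collect();

    let mut results = Vec::new();
    for claim in claims {
        let outcome = match observed.get(claim.predicate.as_str()) {
            Some(value) if *value == claim.expected => Outcome::Pass,
            Some(_) => Outcome::Conflict,
            None => Outcome::Missing,
        };
        results.push((claim.predicate.clone(), outcome));
    }

    // Any observation whose predicate no claim mentions is unclaimed.
    for obs in observations {
        if !claims.iter().any(|c| c.predicate == obs.predicate) {
            results.push((obs.predicate.clone(), Outcome::Unclaimed));
        }
    }
    results
}
```

> **Usage:** Calling `verify(&claims, &observations)` returns one `(predicate, Outcome)` pair per claim plus one per unclaimed observation, which is roughly the shape a table or JSON formatter would consume.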
### Phase A5: The Flywheel

> **Goal:** The system gets smarter with use. Each claim makes the next claim easier.
> **Details:** [vision-gaps.md, §5](./applications/aphoria/docs/vision-gaps.md#5-the-claims-explainedmd-pattern-should-be-the-product) (claims-explained.md as the product)
> **Research:** [a5-flywheel-skill-design.md](./research-requests/a5-flywheel-skill-design.md): validates "skill calls CLI" hypothesis
> **Key Insight:** LLM reasoning over CLI JSON output replaces ML training. The flywheel is prompt engineering, not machine learning.

- [x] **A5.1 Claim Coverage Metrics**: Per-module claim density and gap reporting
  - [x] `coverage.rs`: `CoverageReport`, `ModuleCoverage`, `CoverageSummary` types
  - [x] `compute_coverage()` uses `verify_claims()` as source of truth for claim-observation matching
  - [x] Per-module: observation count, claim count, claimed/unclaimed, missing claims, density
  - [x] `aphoria coverage` CLI: table, JSON, markdown formats, `--sort-by` (name/density/unclaimed/observations)
  - [x] Coverage gaps section: modules with observations but no claims
  - [x] 8 unit tests including deprecated claim exclusion
- [x] **A5.2 Auto-Generated Documentation**: `aphoria docs generate` + `aphoria claims explain`
  - [x] `aphoria docs generate` CLI command with `--output` and `--format` (markdown/json)
  - [x] `claims_explain.rs`: groups by category, includes provenance/invariant/consequence/evidence per claim
  - [x] `explain.rs`: reads `.aphoria/claims.toml`, renders via `render_claims_markdown()`
  - [x] Provenance chains preserved (supersedes references)
- [ ] **A5.3 Claim Suggester Skill**: LLM-powered pattern recognition via "skill calls CLI"
  - [x] New skill: `.claude/skills/aphoria-suggest/SKILL.md` (3 modes: cold start / foundation / flywheel)
  - [x] Workflow defined: `claims list` → `verify run --show-unclaimed` → reason by analogy → suggest
  - [x] Few-shot learning: existing claims as gold-standard examples for style matching
  - [x] Chain-of-thought: reasoning template before each suggestion
  - [x] Cold start bootstrap: reads README/CLAUDE.md/tests/ADRs when 0 claims
  - [x] Context tiers: local → semantic → summary → global (subagent)
  - [x] Quality gates: non-trivial, not type-enforced, has consequence, not duplicate
  - [x] **VG-022 CLOSED**: `verifiable_predicates()` on Extractor trait; 10 extractors declare predicates; `verify map` shows extractor→claim coverage
  - [x] **Dogfood claims**: 10 total claims in `.aphoria/claims.toml` (3 arch + 7 security) covering all ComparisonModes
  - [ ] **Validate**: Run skill against Aphoria's own codebase (dogfood)
  - [ ] **Validate**: Run skill against an external project (cold start test)
  - [ ] **Iterate**: Refine prompt based on suggestion quality from validation
- [x] **A5.4 Onboarding Mode**: `aphoria explain` for new team members
  - [x] `explain.rs`: `generate_explanation()` reads claims, renders narrative
  - [x] `aphoria explain` CLI with `--output` and `--format` (markdown/json)
  - [x] Shows claim inventory grouped by category with provenance
  - [x] Empty project handling: directs to `aphoria claims create`

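> **Sketch:** The per-module numbers listed under A5.1 (observation count, claimed/unclaimed, density, coverage gaps) can be illustrated with a small tally. This is not the code in `coverage.rs`. `ModuleTally`, `tally_by_module`, and `coverage_gaps` are hypothetical names, and "density" is assumed here to mean claimed observations divided by total observations, a definition the roadmap does not state.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-module tally; the real `ModuleCoverage` fields are not shown in this roadmap.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ModuleTally {
    pub observations: usize,
    pub claimed: usize,
}

impl ModuleTally {
    /// Observations that no claim covers.
    pub fn unclaimed(&self) -> usize {
        self.observations - self.claimed
    }

    /// Assumed definition of density: fraction of observations covered by a claim.
    pub fn density(&self) -> f64 {
        if self.observations == 0 {
            0.0
        } else {
            self.claimed as f64 / self.observations as f64
        }
    }
}

/// Group `(module, is_claimed)` pairs into per-module tallies, sorted by module name.
pub fn tally_by_module(items: &[(&str, bool)]) -> BTreeMap<String, ModuleTally> {
    let mut tallies: BTreeMap<String, ModuleTally> = BTreeMap::new();
    for (module, is_claimed) in items {
        let tally = tallies.entry((*module).to_string()).or_default();
        tally.observations += 1;
        if *is_claimed {
            tally.claimed += 1;
        }
    }
    tallies
}

/// Coverage gaps: modules that have observations but none of them claimed.
pub fn coverage_gaps(tallies: &BTreeMap<String, ModuleTally>) -> Vec<&str> {
    tallies
        .iter()
        .filter(|(_, t)| t.observations > 0 && t.claimed == 0)
        .map(|(module, _)| module.as_str())
        .collect()
}
```

> **Usage:** Feeding the tallies into a sort on `density()` or `unclaimed()` mirrors the `--sort-by` options listed above, under the same assumed definitions.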
---

## Pilot 5: Operational Readiness

> **Goal:** Complete production readiness for enterprise pilot demo.
> **Context:** Pilot 1-4 complete (see [archive](./roadmap-archive.md)).

- [ ] **P5.1 Operational Runbooks**: Common procedures documented
  - [ ] "Server won't start" troubleshooting
  - [ ] "High query latency" investigation
  - [ ] "Quarantine queue overflow" handling
  - [ ] "Circuit breaker stuck open" resolution
  - [ ] "Restore from backup" step-by-step

- [ ] **P5.2 Reference Architecture**: Deployment guide
  - [ ] Single-node pilot deployment diagram
  - [ ] Network requirements (ports, firewall rules)
  - [ ] Reverse proxy configuration (nginx/envoy with TLS)
  - [ ] Resource sizing guide (CPU, memory, disk)

- [ ] **P5.3 Pilot Success Criteria Document**: Definition of done
  - [ ] Sub-second query latency at 10K assertions: measured
  - [ ] Successful conflict detection on known contradictory studies: demonstrated
  - [ ] Complete audit trail export for mock regulatory review: tested
  - [ ] Source retraction workflow: exercised

- [ ] **P5.4 Executive Demo Script Validation**: End-to-end rehearsal
  - [ ] Run through `amazement-demo-2.md` with real dashboard
  - [ ] Time each segment (target: 20 minutes total)
  - [ ] Record demo video for async sharing
  - [ ] All 5 Aha Moments demonstrable with real data

---

## Phase 8B-C: Production Observability (Planned)

> **Blocked by:** Pilot Prep (need real production deployment first)

### 8B. Observability

- [ ] **8B.1 Distributed Metrics**: Per-node, per-range, per-agent metrics.
- [ ] **8B.2 Admin Dashboard**: Cluster health visibility.

### 8C. Production Hardening

- [ ] **8C.1 Snapshot/Restore**: Fast replica bootstrap.
- [ ] **8C.2 Backpressure**: Don't overwhelm slow nodes.
- [ ] **8C.3 Geo-Distribution**: Multi-region deployment.

---

## Phase 9: The Bunker (Disaster Planning)

> **Goal:** Survive the worst. Backup, restore, recover from corruption, comply with regulations.

### 9A. Backup & Cold Storage

- [ ] **9A.1 Full Cluster Backup**: Point-in-time snapshot to S3/GCS.
- [ ] **9A.2 Point-in-Time Recovery (PITR)**: Restore to any HLC timestamp.
- [ ] **9A.3 Backup Verification**: Weekly automated restore tests.

### 9B. Data Corruption & Rollback

- [ ] **9B.1 Corruption Detection**: Deep validation before accepting gossip.
- [ ] **9B.2 Assertion Tombstones**: "Delete" in an append-only world.
- [ ] **9B.3 Cluster Rollback**: Batch tombstone generation for time ranges.
- [ ] **9B.4 Fork Recovery**: Heal split-brain after extended partition.

### 9C. Compliance & Legal

- [ ] **9C.1 GDPR Right to Erasure**: Cryptographic erasure via per-agent keys.
- [ ] **9C.2 Data Retention Policies**: Per-subject/predicate retention rules.
- [ ] **9C.3 Audit Trail for Compliance**: Immutable admin action log.
- [ ] **9C.4 SOC 2 Type II Certification**: External audit and certification.

### 9D. Storage Management

- [ ] **9D.1 Compaction**: Reclaim space from tombstoned data.
- [ ] **9D.2 Tiered Storage**: Hot/warm/cold based on access patterns.
- [ ] **9D.3 Storage Quotas**: Per-agent and cluster-wide limits.

### 9E. Incident Response

- [ ] **9E.1 Alerting & Escalation**: PagerDuty/Slack integration.
- [ ] **9E.2 Operational Runbooks**: Documented procedures for common failures.
- [ ] **9E.3 Chaos Engineering**: Monthly "game days" with controlled failures.

### 9F. Security Hardening

- [ ] **9F.1 TLS Everywhere**: mTLS for node-to-node traffic.
- [ ] **9F.2 Encryption at Rest**: WAL and KV store encryption.
- [ ] **9F.3 Node Authentication**: Ed25519 keypair identity, signed cluster join.

---

## Architecture Overview

```
Write Path (Spine):          Read Path (Cortex):
[Agent] -> [Ingestion]       [Agent] <- [Lens Engine]
                |                             |
                v                             |
           [WAL/Fsync]                  [Index Lookup]
                |                             |
                v                             |
           [KV Store] <-----------------------+
```

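> **Sketch:** The two paths in the diagram can be mimicked with an in-memory toy: writes go to a log first and are then applied to a key-value map, while reads go through an index lookup and a lens-style resolution rule. Everything here is an assumption for illustration: `ToyAssertion`, `ToyStore`, their fields, and the "latest timestamp wins" rule (loosely modeled on the Recency lens listed under Crates). The `Vec` standing in for the WAL does no fsync and offers no durability.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified assertion; the real type lives in `stemedb-core`.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyAssertion {
    pub id: u64,
    pub subject: String,
    pub value: String,
    pub timestamp: u64,
}

/// In-memory stand-ins for the boxes in the diagram.
#[derive(Default)]
pub struct ToyStore {
    wal: Vec<ToyAssertion>,           // stand-in for [WAL/Fsync]
    kv: HashMap<u64, ToyAssertion>,   // stand-in for [KV Store]
    index: HashMap<String, Vec<u64>>, // stand-in for [Index Lookup]
}

impl ToyStore {
    /// Write path: append to the log first, then apply to the index and KV map.
    pub fn ingest(&mut self, assertion: ToyAssertion) {
        self.wal.push(assertion.clone());
        self.index
            .entry(assertion.subject.clone())
            .or_default()
            .push(assertion.id);
        self.kv.insert(assertion.id, assertion);
    }

    /// Read path: index lookup, fetch from the KV map, then pick the assertion
    /// with the highest timestamp (an assumed recency rule).
    pub fn read_latest(&self, subject: &str) -> Option<&ToyAssertion> {
        self.index
            .get(subject)?
            .iter()
            .filter_map(|id| self.kv.get(id))
            .max_by_key(|a| a.timestamp)
    }

    /// Number of entries appended to the log so far.
    pub fn wal_len(&self) -> usize {
        self.wal.len()
    }
}
```

> **Design note:** Nothing is ever overwritten in this toy. Conflicting assertions about the same subject coexist in the store, and the choice between them is made only at read time, which is the role the diagram gives to the Lens Engine.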
## Port Scheme (181XX)

| Offset | Service | Default | Env Var |
|--------|---------|---------|---------|
| +0 | HTTP API | 18180 | `STEMEDB_BIND_ADDR` |
| +1 | Cluster Gateway | 18181 | `STEMEDB_NODE_API_ADDR` |
| +2 | Cluster RPC | 18182 | `STEMEDB_NODE_RPC_ADDR` |
| +3 | SWIM Gossip | 18183 | via `SwimConfig` |
| +4 | Metrics | 18184 | (reserved) |
| +5 | Admin | 18185 | (reserved) |
| +6 | Latent Signal | 18186 | - |
| +7 | Community App | 18187 | - |
| +8 | Admin Dashboard | 18188 | - |

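> **Sketch:** Each default in the table is the base port 18180 plus the service's offset. The arithmetic can be written down as follows. `BASE_PORT`, `Service`, and `port_for` are hypothetical names for illustration and are not taken from the StemeDB codebase.

```rust
/// Base of the 181XX scheme (the HTTP API default from the table).
pub const BASE_PORT: u16 = 18180;

/// Services from the table; each discriminant is the offset in the first column.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Service {
    HttpApi = 0,
    ClusterGateway = 1,
    ClusterRpc = 2,
    SwimGossip = 3,
    Metrics = 4,
    Admin = 5,
    LatentSignal = 6,
    CommunityApp = 7,
    AdminDashboard = 8,
}

/// Port for a service given a base: base + offset.
pub const fn port_for(base: u16, service: Service) -> u16 {
    base + service as u16
}
```

> **Usage:** `port_for(BASE_PORT, Service::SwimGossip)` yields the SWIM Gossip default from the table. Passing a different base shifts the whole block of ports together, which could be convenient when several nodes share one host.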
## Crates

| Crate | Purpose | Status |
|-------|---------|--------|
| `stemedb-core` | Assertion, LifecycleStage, MaterializedView, types, signing | ✅ |
| `stemedb-wal` | Write-ahead log with crash recovery | ✅ |
| `stemedb-storage` | KVStore, VoteStore, IndexStore, TrustRankStore, QuarantineStore | ✅ |
| `stemedb-ingest` | Ingestion pipeline, signature verification, ContentDefenseLayer | ✅ |
| `stemedb-query` | Query engine, Materializer for O(1) MV reads | ✅ |
| `stemedb-lens` | Lenses (Recency, Consensus, Authority, Skeptic, Layered, etc.) | ✅ |
| `stemedb-api` | HTTP API with axum + utoipa OpenAPI docs | ✅ |
| `stemedb-sim` | Simulation for testing the pipeline | ✅ |
| `stemedb-merkle` | BLAKE3 Merkle tree for diff detection | ✅ |
| `stemedb-rpc` | gRPC services for node-to-node communication | ✅ |
| `stemedb-sync` | Merkle sync, gossip broadcast, anti-entropy | ✅ |
| `stemedb-cluster` | Cluster membership (SWIM), sharding, gateway | ✅ |
| `stemedb-ontology` | Domain definitions (Pharma), subject builders, medical extractors | ✅ |
| `stemedb-chaos` | Chaos testing infrastructure | ✅ |
| `stemedb-dashboard` | Admin dashboard (React/Next.js) | ✅ (7 panels) |

## Applications

| App | Purpose | Status |
|-----|---------|--------|
| `aphoria` | Code-level truth linter: 42 extractors, claims, verify, coverage | 🎯 A5 flywheel |
| `disputed` | Controversy explorer | Planned |

## SDKs

| SDK | Purpose | Status |
|-----|---------|--------|
| `sdk/go/steme` | Go HTTP client with Ed25519 signing and fluent builders | ✅ |
| `sdk/go/adk` | ADK-Go tools and callbacks for AI agents | ✅ |

---

## Quick Reference

```bash
# Build
cargo build --workspace

# Test
cargo test --workspace

# Lint (must pass before commit)
cargo clippy --workspace -- -D warnings
cargo fmt --check

# Run API server
cargo run --bin stemedb-api

# Run Aphoria scan
cargo run --bin aphoria -- scan /path/to/project --show-observations

# Run demo script
./scripts/demo-consumer-health.sh
```

---

## Related Documents

- [CLAUDE.md](./CLAUDE.md): AI assistant instructions and project rules
- [roadmap-archive.md](./roadmap-archive.md): Completed phases 1-8A + Pilot 1-3
- [applications/aphoria/docs/vision-gaps.md](./applications/aphoria/docs/vision-gaps.md): Aphoria vision gap analysis
- [claims-explained.md](./claims-explained.md): Hand-written Maxwell claims (the gold standard)
- [docs/demo/pilot/amazement-demo.md](./docs/demo/pilot/amazement-demo.md): Technical demo script
- [docs/demo/pilot/amazement-demo-2.md](./docs/demo/pilot/amazement-demo-2.md): Executive demo script
- [uat/production-readiness/README.md](./uat/production-readiness/README.md): Production verification checklist