Episteme (StemeDB) Roadmap
Goal: Build the "Git for Truth" substrate for autonomous AI research.
Current Focus: A5.3 Claim Suggester validation + Pilot 5 Operational Readiness
Target Vertical: BioTech/Pharma ("The Living Review") + Code Truth (Aphoria)
Endgame: Distributed multi-writer cluster for millions of concurrent agents

Infrastructure Status: Phases 1-7 complete | Phase 8A (Chaos) complete | Pilot 1-4 complete
Aphoria Status: A1-A4 complete (observations/claims/verify/corpus) | A5 flywheel 3/4 done
Archive: For completed phases 1-8A + Pilot 1-3, see roadmap-archive.md
Current Status
| Phase | Status | Summary |
|---|---|---|
| 1-7, 8A | ✅ Complete | Core infra, cluster, trust, chaos testing |
| MVP, Pilot 1-4 | ✅ Complete | Consumer Health demo, dashboard, API auth, metrics |
| Aphoria A1-A4 | ✅ Complete | Observations/claims/verify/corpus/authority lens |
| Aphoria A5 | 🎯 In Progress | Flywheel: 3/4 done, A5.3 suggest skill needs validation |
| Pilot 5 | Planned | Operational readiness: runbooks, ref arch, demo validation |
| 8B-C | Planned | Distributed observability, geo-distribution |
| 9 | Planned | Disaster recovery, compliance, storage management |
🎯 Aphoria: From Scanner to Knowledge Graph Client (CURRENT)
Goal: Transform Aphoria from "grep with Episteme vocabulary" into a real knowledge graph client that authors, stores, and audits claims with provenance and lineage.
Vision Document: applications/aphoria/docs/vision-gaps.md
Validation: Maxwell scan (67 observations, 0 noise) + hand-written claims-explained.md
Completed Phases (A1-A4 + P4 — see roadmap-archive.md for details)
| Phase | What It Delivered |
|---|---|
| A1 | Observation vs AuthoredClaim types, bridge tier mapping, .aphoria/claims.toml format |
| A2 | aphoria claims create/list/explain/update/supersede/deprecate, aphoria-claims skill |
| A3 | verify.rs engine (Pass/Conflict/Missing/Unclaimed), aphoria verify run/map, pre-commit hook, self-audit |
| A4 | RFC/OWASP as Episteme assertions, AphoriaAuthorityLens, Trust Pack export/install |
| P4 | API auth (3 roles), backup/restore scripts, Prometheus metrics + Grafana dashboard |
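The four A3 verification outcomes (Pass/Conflict/Missing/Unclaimed) can be illustrated with a minimal sketch. Everything here is hypothetical and simplified: the struct fields, the `verify_sketch` function, and the exact-string comparison are assumptions for illustration, not the contents of verify.rs, which supports several comparison modes.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for an extractor-produced observation.
#[derive(Debug, Clone, PartialEq)]
pub struct Observation {
    pub predicate: String,
    pub value: String,
}

/// Hypothetical, simplified stand-in for a human-authored claim.
#[derive(Debug, Clone, PartialEq)]
pub struct AuthoredClaim {
    pub id: String,
    pub predicate: String,
    pub expected: String,
}

/// The four outcomes named in the A3 row above.
#[derive(Debug, Clone, PartialEq)]
pub enum Outcome {
    Pass { claim_id: String },
    Conflict { claim_id: String, observed: String },
    Missing { claim_id: String },
    Unclaimed { predicate: String },
}

/// Compare observations against claims (assumption: exact-match comparison only).
pub fn verify_sketch(claims: &[AuthoredClaim], observations: &[Observation]) -> Vec<Outcome> {
    // Index observations by predicate; if a predicate repeats, the last one wins.
    let by_predicate: HashMap<&str, &Observation> = observations
        .iter()
        .map(|o| (o.predicate.as_str(), o))
        .collect();

    let mut outcomes = Vec::new();
    for claim in claims {
        let outcome = match by_predicate.get(claim.predicate.as_str()) {
            Some(obs) if obs.value == claim.expected => Outcome::Pass {
                claim_id: claim.id.clone(),
            },
            Some(obs) => Outcome::Conflict {
                claim_id: claim.id.clone(),
                observed: obs.value.clone(),
            },
            None => Outcome::Missing {
                claim_id: claim.id.clone(),
            },
        };
        outcomes.push(outcome);
    }

    // Observations that no claim covers are reported as Unclaimed.
    for obs in observations {
        if !claims.iter().any(|c| c.predicate == obs.predicate) {
            outcomes.push(Outcome::Unclaimed {
                predicate: obs.predicate.clone(),
            });
        }
    }
    outcomes
}
```

Claim outcomes come first in claim order, followed by unclaimed observations in observation order, so a report built from this list stays stable between runs.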
Phase A5: The Flywheel
Goal: The system gets smarter with use. Each claim makes the next claim easier.
Details: vision-gaps.md, §5 (claims-explained.md as the product)
Research: a5-flywheel-skill-design.md, which validates the "skill calls CLI" hypothesis
Key Insight: LLM reasoning over CLI JSON output replaces ML training. The flywheel is prompt engineering, not machine learning.
- A5.1 Claim Coverage Metrics: Per-module claim density and gap reporting
  - coverage.rs: CoverageReport, ModuleCoverage, CoverageSummary types
  - compute_coverage() uses verify_claims() as source of truth for claim-observation matching
  - Per-module: observation count, claim count, claimed/unclaimed, missing claims, density
  - aphoria coverage CLI: table, JSON, markdown formats, --sort-by (name/density/unclaimed/observations)
  - Coverage gaps section: modules with observations but no claims
  - 8 unit tests including deprecated claim exclusion
- A5.2 Auto-Generated Documentation: aphoria docs generate + aphoria claims explain
  - aphoria docs generate CLI command with --output and --format (markdown/json)
  - claims_explain.rs: groups by category, includes provenance/invariant/consequence/evidence per claim
  - explain.rs: reads .aphoria/claims.toml, renders via render_claims_markdown()
  - Provenance chains preserved (supersedes references)
- A5.3 Claim Suggester Skill: LLM-powered pattern recognition via "skill calls CLI"
  - New skill: .claude/skills/aphoria-suggest/SKILL.md (3 modes: cold start / foundation / flywheel)
  - Workflow defined: claims list → verify run --show-unclaimed → reason by analogy → suggest
  - Few-shot learning: existing claims as gold-standard examples for style matching
  - Chain-of-thought: reasoning template before each suggestion
  - Cold start bootstrap: reads README/CLAUDE.md/tests/ADRs when 0 claims
  - Context tiers: local → semantic → summary → global (subagent)
  - Quality gates: non-trivial, not type-enforced, has consequence, not duplicate
  - VG-022 CLOSED: verifiable_predicates() on Extractor trait; 10 extractors declare predicates; verify map shows extractor→claim coverage
  - Dogfood claims: 10 total claims in .aphoria/claims.toml (3 arch + 7 security) covering all ComparisonModes
  - Validate: Run skill against Aphoria's own codebase (dogfood)
  - Validate: Run skill against an external project (cold start test)
  - Iterate: Refine prompt based on suggestion quality from validation
- A5.4 Onboarding Mode: aphoria explain for new team members
  - explain.rs: generate_explanation() reads claims, renders narrative
  - aphoria explain CLI with --output and --format (markdown/json)
  - Shows claim inventory grouped by category with provenance
  - Empty project handling: directs to aphoria claims create
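The per-module coverage numbers described under A5.1 can be sketched as a simple aggregation. This is an illustration under stated assumptions, not the code in coverage.rs: `ObservationRow`, `coverage_sketch`, and `coverage_gaps` are hypothetical names, density is assumed to mean claimed observations divided by total observations, and a gap is approximated as a module with observations of which none is claimed.

```rust
use std::collections::BTreeMap;

/// Hypothetical input row: one observation, flagged by whether a claim matched it.
pub struct ObservationRow {
    pub module: String,
    pub claimed: bool,
}

/// Simplified per-module counters (the real ModuleCoverage tracks more fields).
#[derive(Debug, Default, PartialEq)]
pub struct ModuleCoverage {
    pub observations: usize,
    pub claimed: usize,
    pub unclaimed: usize,
}

impl ModuleCoverage {
    /// Assumed definition: share of observations covered by a claim.
    pub fn density(&self) -> f64 {
        if self.observations == 0 {
            0.0
        } else {
            self.claimed as f64 / self.observations as f64
        }
    }
}

/// Aggregate observation rows into per-module counters, sorted by module name.
pub fn coverage_sketch(rows: &[ObservationRow]) -> BTreeMap<String, ModuleCoverage> {
    let mut report: BTreeMap<String, ModuleCoverage> = BTreeMap::new();
    for row in rows {
        let entry = report.entry(row.module.clone()).or_default();
        entry.observations += 1;
        if row.claimed {
            entry.claimed += 1;
        } else {
            entry.unclaimed += 1;
        }
    }
    report
}

/// Modules that have observations but no claimed ones (the "coverage gaps").
pub fn coverage_gaps(report: &BTreeMap<String, ModuleCoverage>) -> Vec<&str> {
    report
        .iter()
        .filter(|(_, m)| m.observations > 0 && m.claimed == 0)
        .map(|(name, _)| name.as_str())
        .collect()
}
```

A `BTreeMap` keeps modules in name order, which corresponds to sorting by name; sorting by density or unclaimed count would be a separate pass over the values.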
Pilot 5: Operational Readiness
Goal: Complete production readiness for enterprise pilot demo. Context: Pilot 1-4 complete (see archive).
- P5.1 Operational Runbooks: Common procedures documented
  - "Server won't start" troubleshooting
  - "High query latency" investigation
  - "Quarantine queue overflow" handling
  - "Circuit breaker stuck open" resolution
  - "Restore from backup" step-by-step
- P5.2 Reference Architecture: Deployment guide
  - Single-node pilot deployment diagram
  - Network requirements (ports, firewall rules)
  - Reverse proxy configuration (nginx/envoy with TLS)
  - Resource sizing guide (CPU, memory, disk)
- P5.3 Pilot Success Criteria Document: Definition of done
  - Sub-second query latency at 10K assertions: measured
  - Successful conflict detection on known contradictory studies: demonstrated
  - Complete audit trail export for mock regulatory review: tested
  - Source retraction workflow: exercised
- P5.4 Executive Demo Script Validation: End-to-end rehearsal
  - Run through amazement-demo-2.md with real dashboard
  - Time each segment (target: 20 minutes total)
  - Record demo video for async sharing
  - All 5 Aha Moments demonstrable with real data
Phase 8B-C: Production Observability (Planned)
Blocked by: Pilot Prep (need real production deployment first)
8B. Observability
- 8B.1 Distributed Metrics: Per-node, per-range, per-agent metrics.
- 8B.2 Admin Dashboard: Cluster health visibility.
8C. Production Hardening
- 8C.1 Snapshot/Restore: Fast replica bootstrap.
- 8C.2 Backpressure: Don't overwhelm slow nodes.
- 8C.3 Geo-Distribution: Multi-region deployment.
Phase 9: The Bunker (Disaster Planning)
Goal: Survive the worst. Backup, restore, recover from corruption, comply with regulations.
9A. Backup & Cold Storage
- 9A.1 Full Cluster Backup: Point-in-time snapshot to S3/GCS.
- 9A.2 Point-in-Time Recovery (PITR): Restore to any HLC timestamp.
- 9A.3 Backup Verification: Weekly automated restore tests.
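Point-in-time recovery to an HLC timestamp (9A.2) amounts to replaying only the log entries at or before the target. The sketch below assumes a hypothetical `Hlc` layout (wall-clock milliseconds plus a logical counter for ties) and a simplified key/value `LogEntry`; neither is taken from the StemeDB crates.

```rust
use std::collections::BTreeMap;

/// Hypothetical HLC timestamp. The derived ordering compares wall_ms first,
/// then the logical counter, which breaks ties within the same millisecond.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Hlc {
    pub wall_ms: u64,
    pub logical: u32,
}

/// Simplified log entry; real assertions carry far more than a key and value.
#[derive(Debug, Clone, PartialEq)]
pub struct LogEntry {
    pub ts: Hlc,
    pub key: String,
    pub value: String,
}

/// Rebuild state as of `target` by replaying entries with ts <= target in HLC order.
pub fn restore_to(log: &[LogEntry], target: Hlc) -> BTreeMap<String, String> {
    let mut entries: Vec<&LogEntry> = log.iter().filter(|e| e.ts <= target).collect();
    // Stable sort: entries with identical timestamps keep their log order.
    entries.sort_by_key(|e| e.ts);

    let mut state = BTreeMap::new();
    for entry in entries {
        state.insert(entry.key.clone(), entry.value.clone());
    }
    state
}
```

Because the filter is inclusive, restoring to the exact timestamp of a write includes that write; an exclusive bound would be the alternative if the target marks the first bad entry.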
9B. Data Corruption & Rollback
- 9B.1 Corruption Detection: Deep validation before accepting gossip.
- 9B.2 Assertion Tombstones: "Delete" in an append-only world.
- 9B.3 Cluster Rollback: Batch tombstone generation for time ranges.
- 9B.4 Fork Recovery: Heal split-brain after extended partition.
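One way to read "delete in an append-only world" (9B.2) is that a tombstone is itself an appended record, and readers filter out whatever it targets. The sketch below shows that idea only; `Record`, `AppendOnlyLog`, and their fields are hypothetical and do not describe StemeDB's planned design.

```rust
use std::collections::HashSet;

/// Hypothetical record kinds in an append-only log.
#[derive(Debug, Clone, PartialEq)]
pub enum Record {
    Assertion { id: u64, body: String },
    /// Marks an earlier assertion as deleted without rewriting history.
    Tombstone { target: u64, reason: String },
}

#[derive(Default)]
pub struct AppendOnlyLog {
    records: Vec<Record>,
}

impl AppendOnlyLog {
    /// Records are only ever appended, never mutated or removed.
    pub fn append(&mut self, record: Record) {
        self.records.push(record);
    }

    /// Physical size of the log, tombstones included.
    pub fn total_records(&self) -> usize {
        self.records.len()
    }

    /// Logical view: assertions that no tombstone targets.
    pub fn visible(&self) -> Vec<&Record> {
        let dead: HashSet<u64> = self
            .records
            .iter()
            .filter_map(|r| match r {
                Record::Tombstone { target, .. } => Some(*target),
                _ => None,
            })
            .collect();

        self.records
            .iter()
            .filter(|r| matches!(r, Record::Assertion { id, .. } if !dead.contains(id)))
            .collect()
    }
}
```

In this model the log keeps growing after a delete, which is why a later compaction step (9D.1) would be needed to reclaim the space held by tombstoned data.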
9C. Compliance & Legal
- 9C.1 GDPR Right to Erasure: Cryptographic erasure via per-agent keys.
- 9C.2 Data Retention Policies: Per-subject/predicate retention rules.
- 9C.3 Audit Trail for Compliance: Immutable admin action log.
- 9C.4 SOC 2 Type II Certification: External audit and certification.
9D. Storage Management
- 9D.1 Compaction: Reclaim space from tombstoned data.
- 9D.2 Tiered Storage: Hot/warm/cold based on access patterns.
- 9D.3 Storage Quotas: Per-agent and cluster-wide limits.
9E. Incident Response
- 9E.1 Alerting & Escalation: PagerDuty/Slack integration.
- 9E.2 Operational Runbooks: Documented procedures for common failures.
- 9E.3 Chaos Engineering: Monthly "game days" with controlled failures.
9F. Security Hardening
- 9F.1 TLS Everywhere: mTLS for node-to-node traffic.
- 9F.2 Encryption at Rest: WAL and KV store encryption.
- 9F.3 Node Authentication: Ed25519 keypair identity, signed cluster join.
Architecture Overview
    Write Path (Spine):              Read Path (Cortex):

    [Agent] -> [Ingestion]           [Agent] <- [Lens Engine]
                   |                                 |
                   v                                 |
             [WAL/Fsync]                      [Index Lookup]
                   |                                 |
                   v                                 |
              [KV Store] <---------------------------+
Port Scheme (181XX)
| Offset | Service | Default | Env Var |
|---|---|---|---|
| +0 | HTTP API | 18180 | STEMEDB_BIND_ADDR |
| +1 | Cluster Gateway | 18181 | STEMEDB_NODE_API_ADDR |
| +2 | Cluster RPC | 18182 | STEMEDB_NODE_RPC_ADDR |
| +3 | SWIM Gossip | 18183 | via SwimConfig |
| +4 | Metrics | 18184 | (reserved) |
| +5 | Admin | 18185 | (reserved) |
| +6 | Latent Signal | 18186 | — |
| +7 | Community App | 18187 | — |
| +8 | Admin Dashboard | 18188 | — |
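The table above follows a base-plus-offset pattern: every default port is 18180 plus the service's offset. A minimal sketch of that arithmetic, assuming a hypothetical `Service` enum and `BASE_PORT` constant that are not part of the StemeDB codebase:

```rust
/// Base of the 181XX port scheme (the HTTP API default from the table).
pub const BASE_PORT: u16 = 18180;

/// Hypothetical enum mirroring the table rows; discriminants are the offsets.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Service {
    HttpApi = 0,
    ClusterGateway = 1,
    ClusterRpc = 2,
    SwimGossip = 3,
    Metrics = 4,
    Admin = 5,
    LatentSignal = 6,
    CommunityApp = 7,
    AdminDashboard = 8,
}

impl Service {
    /// Offset from the base port, as listed in the table.
    pub fn offset(self) -> u16 {
        self as u16
    }

    /// Port for this service given a base; None if the sum would overflow u16.
    pub fn port(self, base: u16) -> Option<u16> {
        base.checked_add(self.offset())
    }
}
```

With the default base, `Service::SwimGossip.port(BASE_PORT)` yields `Some(18183)`. Running a second local node on a different base, for example 18280, would shift every service by the same amount.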
Crates
| Crate | Purpose | Status |
|---|---|---|
| stemedb-core | Assertion, LifecycleStage, MaterializedView, types, signing | ✅ |
| stemedb-wal | Write-ahead log with crash recovery | ✅ |
| stemedb-storage | KVStore, VoteStore, IndexStore, TrustRankStore, QuarantineStore | ✅ |
| stemedb-ingest | Ingestion pipeline, signature verification, ContentDefenseLayer | ✅ |
| stemedb-query | Query engine, Materializer for O(1) MV reads | ✅ |
| stemedb-lens | Lenses (Recency, Consensus, Authority, Skeptic, Layered, etc.) | ✅ |
| stemedb-api | HTTP API with axum + utoipa OpenAPI docs | ✅ |
| stemedb-sim | Simulation for testing the pipeline | ✅ |
| stemedb-merkle | BLAKE3 Merkle tree for diff detection | ✅ |
| stemedb-rpc | gRPC services for node-to-node communication | ✅ |
| stemedb-sync | Merkle sync, gossip broadcast, anti-entropy | ✅ |
| stemedb-cluster | Cluster membership (SWIM), sharding, gateway | ✅ |
| stemedb-ontology | Domain definitions (Pharma), subject builders, medical extractors | ✅ |
| stemedb-chaos | Chaos testing infrastructure | ✅ |
| stemedb-dashboard | Admin dashboard (React/Next.js) | ✅ (7 panels) |
Applications
| App | Purpose | Status |
|---|---|---|
| aphoria | Code-level truth linter: 42 extractors, claims, verify, coverage | 🎯 A5 flywheel |
| disputed | Controversy explorer | Planned |
SDKs
| SDK | Purpose | Status |
|---|---|---|
| sdk/go/steme | Go HTTP client with Ed25519 signing and fluent builders | ✅ |
| sdk/go/adk | ADK-Go tools and callbacks for AI agents | ✅ |
Quick Reference
# Build
cargo build --workspace
# Test
cargo test --workspace
# Lint (must pass before commit)
cargo clippy --workspace -- -D warnings
cargo fmt --check
# Run API server
cargo run --bin stemedb-api
# Run Aphoria scan
cargo run --bin aphoria -- scan /path/to/project --show-observations
# Run demo script
./scripts/demo-consumer-health.sh
Related Documents
- CLAUDE.md — AI assistant instructions and project rules
- roadmap-archive.md — Completed phases 1-8A + Pilot 1-3
- applications/aphoria/docs/vision-gaps.md — Aphoria vision gap analysis
- claims-explained.md — Hand-written Maxwell claims (the gold standard)
- docs/demo/pilot/amazement-demo.md — Technical demo script
- docs/demo/pilot/amazement-demo-2.md — Executive demo script
- uat/production-readiness/README.md — Production verification checklist