jml 9bfa626203 docs: reorganize documentation structure for clarity

Major documentation restructure to improve discoverability and reduce duplication.

## Changes

**Deleted (Archived/Consolidated)**:
- Removed duplicate getting started guides
- Archived outdated planning documents
- Consolidated corpus and configuration docs
- Removed obsolete vision/spec files (superseded by vision.md)
- Cleaned up scrapyard and old PDFs

**New Structure**:
- docs/about/ - Project overview and introduction
- docs/guides/ - User guides (moved from root)
- docs/specs/ - Technical specifications
- docs/sdk/ - SDK documentation (Go)
- docs/references/ - API references
- docs/archive/ - Archived historical docs
- applications/aphoria/docs/advanced/ - Advanced topics
- applications/aphoria/docs/reference/ - CLI reference
- applications/aphoria/docs/archive/ - Archived aphoria docs

**Updated**:
- README.md - New root README with clear navigation
- CONTRIBUTING.md - Contribution guidelines
- CLAUDE.md - Updated paths to new structure
- roadmap.md - Added recent completions

## Files Changed
- 57 files changed
- 1,977 insertions(+)
- 961 deletions(-)

**Net change**: +1,016 lines (added CONTRIBUTING.md, README.md, reorganized content)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-11 07:33:40 +00:00


Episteme (StemeDB) Roadmap

Goal: Build the "Git for Truth" substrate for autonomous AI research.
Current Focus: A5.3 Claim Suggester validation + Pilot 5 Operational Readiness
Target Vertical: BioTech/Pharma ("The Living Review") + Code Truth (Aphoria)
Endgame: Distributed multi-writer cluster for millions of concurrent agents

Infrastructure Status: Phases 1-7 complete | Phase 8A (Chaos) complete | Pilot 1-4 complete
Aphoria Status: A1-A4 complete (observations/claims/verify/corpus) | A5 flywheel 3/4 done

Archive: For completed phases 1-8A + Pilot 1-3, see roadmap-archive.md

Current Status

Phase	Status	Summary
1-7, 8A	✅ Complete	Core infra, cluster, trust, chaos testing
MVP, Pilot 1-4	✅ Complete	Consumer Health demo, dashboard, API auth, metrics
Aphoria A1-A4	✅ Complete	Observations/claims/verify/corpus/authority lens
Aphoria A5	🎯 In Progress	Flywheel: 3/4 done, A5.3 suggest skill needs validation
Pilot 5	Planned	Operational readiness: runbooks, ref arch, demo validation
8B-C	Planned	Distributed observability, geo-distribution
9	Planned	Disaster recovery, compliance, storage management

🎯 Aphoria: From Scanner to Knowledge Graph Client (CURRENT)

Goal: Transform Aphoria from "grep with Episteme vocabulary" into a real knowledge graph client that authors, stores, and audits claims with provenance and lineage.
Vision Document: applications/aphoria/docs/vision-gaps.md
Validation: Maxwell scan (67 observations, 0 noise) + hand-written claims-explained.md

Completed Phases (A1-A4 + P4 — see roadmap-archive.md for details)

Phase	What It Delivered
A1	`Observation` vs `AuthoredClaim` types, bridge tier mapping, `.aphoria/claims.toml` format
A2	`aphoria claims create/list/explain/update/supersede/deprecate`, `aphoria-claims` skill
A3	`verify.rs` engine (Pass/Conflict/Missing/Unclaimed), `aphoria verify run/map`, pre-commit hook, self-audit
A4	RFC/OWASP as Episteme assertions, `AphoriaAuthorityLens`, Trust Pack export/install
P4	API auth (3 roles), backup/restore scripts, Prometheus metrics + Grafana dashboard
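The four outcomes listed for A3's `verify.rs` (Pass/Conflict/Missing/Unclaimed) can be illustrated with a small sketch. Only the outcome names come from this roadmap. The string keys, the equality comparison, and the `classify` function are assumptions made for illustration and are not Aphoria's actual types or comparison modes.

```rust
use std::collections::BTreeMap;

/// Outcome labels taken from the roadmap; everything else here is hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    /// A claim exists and the observed value matches it.
    Pass,
    /// A claim exists but the observed value differs.
    Conflict,
    /// A claim exists but nothing was observed for it.
    Missing,
    /// Something was observed that no claim covers.
    Unclaimed,
}

/// Classify every key that appears in either map.
/// Keys and values are plain strings purely for illustration.
pub fn classify(
    claims: &BTreeMap<String, String>,
    observations: &BTreeMap<String, String>,
) -> BTreeMap<String, Outcome> {
    let mut out = BTreeMap::new();
    // First pass: every claim is Pass, Conflict, or Missing.
    for (key, expected) in claims {
        let outcome = match observations.get(key) {
            Some(actual) if actual == expected => Outcome::Pass,
            Some(_) => Outcome::Conflict,
            None => Outcome::Missing,
        };
        out.insert(key.clone(), outcome);
    }
    // Second pass: observations with no claim are Unclaimed.
    for key in observations.keys() {
        out.entry(key.clone()).or_insert(Outcome::Unclaimed);
    }
    out
}
```

In this sketch an unclaimed observation never overwrites a claim's outcome, so each key gets exactly one label.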

Phase A5: The Flywheel

Goal: The system gets smarter with use. Each claim makes the next claim easier.
Details: vision-gaps.md, §5 (claims-explained.md as the product)
Research: a5-flywheel-skill-design.md, which validates the "skill calls CLI" hypothesis
Key Insight: LLM reasoning over CLI JSON output replaces ML training. The flywheel is prompt engineering, not machine learning.

A5.1 Claim Coverage Metrics: Per-module claim density and gap reporting
- coverage.rs: CoverageReport, ModuleCoverage, CoverageSummary types
- compute_coverage() uses verify_claims() as source of truth for claim-observation matching
- Per-module: observation count, claim count, claimed/unclaimed, missing claims, density
- aphoria coverage CLI: table, JSON, markdown formats, --sort-by (name/density/unclaimed/observations)
- Coverage gaps section: modules with observations but no claims
- 8 unit tests including deprecated claim exclusion
A5.2 Auto-Generated Documentation: aphoria docs generate + aphoria claims explain
- aphoria docs generate CLI command with --output and --format (markdown/json)
- claims_explain.rs: groups by category, includes provenance/invariant/consequence/evidence per claim
- explain.rs: reads .aphoria/claims.toml, renders via render_claims_markdown()
- Provenance chains preserved (supersedes references)
A5.3 Claim Suggester Skill: LLM-powered pattern recognition via "skill calls CLI"
- New skill: .claude/skills/aphoria-suggest/SKILL.md (3 modes: cold start / foundation / flywheel)
- Workflow defined: claims list → verify run --show-unclaimed → reason by analogy → suggest
- Few-shot learning: existing claims as gold-standard examples for style matching
- Chain-of-thought: reasoning template before each suggestion
- Cold start bootstrap: reads README/CLAUDE.md/tests/ADRs when 0 claims
- Context tiers: local → semantic → summary → global (subagent)
- Quality gates: non-trivial, not type-enforced, has consequence, not duplicate
- VG-022 CLOSED: verifiable_predicates() on Extractor trait; 10 extractors declare predicates; verify map shows extractor→claim coverage
- Dogfood claims: 10 total claims in .aphoria/claims.toml (3 arch + 7 security) covering all ComparisonModes
- Validate: Run skill against Aphoria's own codebase (dogfood)
- Validate: Run skill against an external project (cold start test)
- Iterate: Refine prompt based on suggestion quality from validation
A5.4 Onboarding Mode: aphoria explain for new team members
- explain.rs: generate_explanation() reads claims, renders narrative
- aphoria explain CLI with --output and --format (markdown/json)
- Shows claim inventory grouped by category with provenance
- Empty project handling: directs to aphoria claims create
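The per-module density and gap reporting described under A5.1 could be sketched roughly as follows. The type name `ModuleCoverage` appears in this roadmap, but its fields, the `coverage_gaps` helper, and the definition of density as claimed observations divided by total observations are assumptions; the real `coverage.rs` may define them differently.

```rust
/// Per-module counts. The type name comes from the roadmap; the fields are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct ModuleCoverage {
    pub module: String,
    /// Total observations extracted from this module.
    pub observations: usize,
    /// Observations that are matched by at least one claim.
    pub claimed: usize,
}

impl ModuleCoverage {
    /// Observations not covered by any claim.
    pub fn unclaimed(&self) -> usize {
        self.observations.saturating_sub(self.claimed)
    }

    /// Assumed definition: claimed / observations, or 0.0 when nothing was observed.
    pub fn density(&self) -> f64 {
        if self.observations == 0 {
            0.0
        } else {
            self.claimed as f64 / self.observations as f64
        }
    }
}

/// Modules with observations but no claims, sorted by name.
pub fn coverage_gaps(modules: &[ModuleCoverage]) -> Vec<&str> {
    let mut gaps: Vec<&str> = modules
        .iter()
        .filter(|m| m.observations > 0 && m.claimed == 0)
        .map(|m| m.module.as_str())
        .collect();
    gaps.sort();
    gaps
}
```

A module with zero observations is deliberately not reported as a gap here, since there is nothing to claim about it.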

Pilot 5: Operational Readiness

Goal: Complete production readiness for enterprise pilot demo.
Context: Pilot 1-4 complete (see archive).

P5.1 Operational Runbooks: Common procedures documented
- "Server won't start" troubleshooting
- "High query latency" investigation
- "Quarantine queue overflow" handling
- "Circuit breaker stuck open" resolution
- "Restore from backup" step-by-step
P5.2 Reference Architecture: Deployment guide
- Single-node pilot deployment diagram
- Network requirements (ports, firewall rules)
- Reverse proxy configuration (nginx/envoy with TLS)
- Resource sizing guide (CPU, memory, disk)
P5.3 Pilot Success Criteria Document: Definition of done
- Sub-second query latency at 10K assertions: measured
- Successful conflict detection on known contradictory studies: demonstrated
- Complete audit trail export for mock regulatory review: tested
- Source retraction workflow: exercised
P5.4 Executive Demo Script Validation: End-to-end rehearsal
- Run through amazement-demo-2.md with real dashboard
- Time each segment (target: 20 minutes total)
- Record demo video for async sharing
- All 5 Aha Moments demonstrable with real data

Phase 8B-C: Production Observability (Planned)

Blocked by: Pilot Prep (need real production deployment first)

8B. Observability

8B.1 Distributed Metrics: Per-node, per-range, per-agent metrics.
8B.2 Admin Dashboard: Cluster health visibility.

8C. Production Hardening

8C.1 Snapshot/Restore: Fast replica bootstrap.
8C.2 Backpressure: Don't overwhelm slow nodes.
8C.3 Geo-Distribution: Multi-region deployment.

Phase 9: The Bunker (Disaster Planning)

Goal: Survive the worst. Backup, restore, recover from corruption, comply with regulations.

9A. Backup & Cold Storage

9A.1 Full Cluster Backup: Point-in-time snapshot to S3/GCS.
9A.2 Point-in-Time Recovery (PITR): Restore to any HLC timestamp.
9A.3 Backup Verification: Weekly automated restore tests.
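Item 9A.2 relies on HLC (hybrid logical clock) timestamps. As background, the textbook HLC update rules and a point-in-time filter can be sketched as below. This is the generic algorithm, not StemeDB's implementation; the `Hlc` struct, its field names, and the `visible_at` helper are hypothetical, and the physical time is passed in as a plain `u64` so the example stays deterministic.

```rust
/// A hybrid logical clock timestamp: wall-clock component plus a logical counter.
/// Derived ordering compares `wall` first, then `logical`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Hlc {
    pub wall: u64,
    pub logical: u32,
}

impl Hlc {
    /// Advance the clock for a local or send event, given the current physical time.
    pub fn tick(&mut self, now: u64) -> Hlc {
        if now > self.wall {
            self.wall = now;
            self.logical = 0;
        } else {
            // Physical time did not advance, so the counter breaks the tie.
            self.logical += 1;
        }
        *self
    }

    /// Merge a timestamp received from another node.
    pub fn observe(&mut self, remote: Hlc, now: u64) -> Hlc {
        let prev = self.wall;
        let wall = prev.max(remote.wall).max(now);
        self.logical = if wall == prev && wall == remote.wall {
            self.logical.max(remote.logical) + 1
        } else if wall == prev {
            self.logical + 1
        } else if wall == remote.wall {
            remote.logical + 1
        } else {
            0
        };
        self.wall = wall;
        *self
    }
}

/// Point-in-time view: keep entries stamped at or before `target`.
pub fn visible_at<'a>(entries: &[(Hlc, &'a str)], target: Hlc) -> Vec<&'a str> {
    entries
        .iter()
        .filter(|(ts, _)| *ts <= target)
        .map(|(_, value)| *value)
        .collect()
}
```

Because timestamps are totally ordered, "restore to any HLC timestamp" reduces, in this simplified view, to replaying only the entries that `visible_at` would return.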

9B. Data Corruption & Rollback

9B.1 Corruption Detection: Deep validation before accepting gossip.
9B.2 Assertion Tombstones: "Delete" in an append-only world.
9B.3 Cluster Rollback: Batch tombstone generation for time ranges.
9B.4 Fork Recovery: Heal split-brain after extended partition.
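To make 9B.2 concrete, here is one common way to express "delete" in an append-only log: the original record stays in place and a later tombstone record hides it from reads. Phase 9 is still planned, so this is only an illustrative sketch; `Record`, `AppendOnlyLog`, and `live` are hypothetical names, not part of StemeDB.

```rust
use std::collections::HashSet;

/// Records are only ever appended; nothing is mutated or removed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Record {
    Assertion { id: u64, body: String },
    /// Marks the assertion with the given id as deleted.
    Tombstone { target: u64 },
}

#[derive(Debug, Default)]
pub struct AppendOnlyLog {
    records: Vec<Record>,
}

impl AppendOnlyLog {
    pub fn append(&mut self, record: Record) {
        self.records.push(record);
    }

    /// Total number of records, including tombstones and tombstoned assertions.
    pub fn record_count(&self) -> usize {
        self.records.len()
    }

    /// Assertions that have no tombstone anywhere in the log.
    pub fn live(&self) -> Vec<&Record> {
        let dead: HashSet<u64> = self
            .records
            .iter()
            .filter_map(|r| match r {
                Record::Tombstone { target } => Some(*target),
                Record::Assertion { .. } => None,
            })
            .collect();
        self.records
            .iter()
            .filter(|r| match r {
                Record::Assertion { id, .. } => !dead.contains(id),
                Record::Tombstone { .. } => false,
            })
            .collect()
    }
}
```

The log only grows in this model, which is why reclaiming space needs a separate compaction step such as the one listed under 9D.1.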

9C. Compliance & Legal

9C.1 GDPR Right to Erasure: Cryptographic erasure via per-agent keys.
9C.2 Data Retention Policies: Per-subject/predicate retention rules.
9C.3 Audit Trail for Compliance: Immutable admin action log.
9C.4 SOC 2 Type II Certification: External audit and certification.

9D. Storage Management

9D.1 Compaction: Reclaim space from tombstoned data.
9D.2 Tiered Storage: Hot/warm/cold based on access patterns.
9D.3 Storage Quotas: Per-agent and cluster-wide limits.

9E. Incident Response

9E.1 Alerting & Escalation: PagerDuty/Slack integration.
9E.2 Operational Runbooks: Documented procedures for common failures.
9E.3 Chaos Engineering: Monthly "game days" with controlled failures.

9F. Security Hardening

9F.1 TLS Everywhere: mTLS for node-to-node traffic.
9F.2 Encryption at Rest: WAL and KV store encryption.
9F.3 Node Authentication: Ed25519 keypair identity, signed cluster join.

Architecture Overview

Write Path (Spine):           Read Path (Cortex):
[Agent] -> [Ingestion]        [Agent] <- [Lens Engine]
              |                              |
              v                              |
         [WAL/Fsync]                  [Index Lookup]
              |                              |
              v                              |
         [KV Store] <--------------------+
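The write path in the diagram (ingestion, then WAL with fsync, then KV store) can be sketched with the standard library alone. This is a simplified, synchronous illustration under stated assumptions: `Spine`, `ingest`, and the tab-separated line format are hypothetical, an in-memory `HashMap` stands in for the KV store, and the real pipeline is described elsewhere in this roadmap as an async WAL → KV flow.

```rust
use std::collections::HashMap;
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical write path: a WAL file plus an in-memory map standing in for the KV store.
pub struct Spine {
    wal: File,
    kv: HashMap<String, String>,
}

impl Spine {
    /// Open (or create) the WAL file in append mode.
    pub fn open(wal_path: &Path) -> io::Result<Self> {
        let wal = OpenOptions::new().create(true).append(true).open(wal_path)?;
        Ok(Self { wal, kv: HashMap::new() })
    }

    /// Durability first: append to the WAL and fsync before the KV store is updated.
    /// The line format is illustrative and assumes keys and values contain no tabs or newlines.
    pub fn ingest(&mut self, key: &str, value: &str) -> io::Result<()> {
        writeln!(self.wal, "{}\t{}", key, value)?;
        self.wal.sync_all()?;
        self.kv.insert(key.to_owned(), value.to_owned());
        Ok(())
    }

    /// Read back from the KV side.
    pub fn get(&self, key: &str) -> Option<&str> {
        self.kv.get(key).map(String::as_str)
    }
}
```

Ordering the fsync before the KV update means a crash after `ingest` returns can be recovered by replaying the WAL, which is the property the diagram's `[WAL/Fsync]` step is there to provide.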

Port Scheme (181XX)

Offset	Service	Default	Env Var
+0	HTTP API	18180	`STEMEDB_BIND_ADDR`
+1	Cluster Gateway	18181	`STEMEDB_NODE_API_ADDR`
+2	Cluster RPC	18182	`STEMEDB_NODE_RPC_ADDR`
+3	SWIM Gossip	18183	via `SwimConfig`
+4	Metrics	18184	(reserved)
+5	Admin	18185	(reserved)
+6	Latent Signal	18186	—
+7	Community App	18187	—
+8	Admin Dashboard	18188	—
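The table above follows a base-plus-offset pattern, which could be encoded as below. The base port 18180 and the offsets are taken from the table; the `Service` enum, `BASE_PORT` constant, and method names are hypothetical and are not claimed to exist in the codebase.

```rust
/// Base of the 181XX port scheme (HTTP API default from the table above).
pub const BASE_PORT: u16 = 18180;

/// Services in table order; the enum itself is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Service {
    HttpApi,
    ClusterGateway,
    ClusterRpc,
    SwimGossip,
    Metrics,
    Admin,
    LatentSignal,
    CommunityApp,
    AdminDashboard,
}

impl Service {
    /// Offset from the base port, matching the "Offset" column.
    pub const fn offset(self) -> u16 {
        match self {
            Service::HttpApi => 0,
            Service::ClusterGateway => 1,
            Service::ClusterRpc => 2,
            Service::SwimGossip => 3,
            Service::Metrics => 4,
            Service::Admin => 5,
            Service::LatentSignal => 6,
            Service::CommunityApp => 7,
            Service::AdminDashboard => 8,
        }
    }

    /// Default port, matching the "Default" column.
    pub const fn default_port(self) -> u16 {
        BASE_PORT + self.offset()
    }
}
```

Keeping the offsets in one place makes it straightforward to run a second node on a different base (for example 18280) without port collisions, assuming each service accepts an override.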

Crates

Crate	Purpose	Status
`stemedb-core`	Assertion, LifecycleStage, MaterializedView, types, signing	✅
`stemedb-wal`	Write-ahead log with crash recovery	✅
`stemedb-storage`	KVStore, VoteStore, IndexStore, TrustRankStore, QuarantineStore	✅
`stemedb-ingest`	Ingestion pipeline, signature verification, ContentDefenseLayer	✅
`stemedb-query`	Query engine, Materializer for O(1) MV reads	✅
`stemedb-lens`	Lenses (Recency, Consensus, Authority, Skeptic, Layered, etc.)	✅
`stemedb-api`	HTTP API with axum + utoipa OpenAPI docs	✅
`stemedb-sim`	Simulation for testing the pipeline	✅
`stemedb-merkle`	BLAKE3 Merkle tree for diff detection	✅
`stemedb-rpc`	gRPC services for node-to-node communication	✅
`stemedb-sync`	Merkle sync, gossip broadcast, anti-entropy	✅
`stemedb-cluster`	Cluster membership (SWIM), sharding, gateway	✅
`stemedb-ontology`	Domain definitions (Pharma), subject builders, medical extractors	✅
`stemedb-chaos`	Chaos testing infrastructure	✅
`stemedb-dashboard`	Admin dashboard (React/Next.js)	✅ (7 panels)

Applications

App	Purpose	Status
`aphoria`	Code-level truth linter — 42 extractors, claims, verify, coverage	🎯 A5 flywheel
`disputed`	Controversy explorer	Planned

SDKs

SDK	Purpose	Status
`sdk/go/steme`	Go HTTP client with Ed25519 signing and fluent builders	✅
`sdk/go/adk`	ADK-Go tools and callbacks for AI agents	✅

Quick Reference

# Build
cargo build --workspace

# Test
cargo test --workspace

# Lint (must pass before commit)
cargo clippy --workspace -- -D warnings
cargo fmt --check

# Run API server
cargo run --bin stemedb-api

# Run Aphoria scan
cargo run --bin aphoria -- scan /path/to/project --show-observations

# Run demo script
./scripts/demo-consumer-health.sh

Arena: Simulation Roadmap

Goal: Incrementally evolve the simulator from Spine validation to a full Agent-Based Modeling environment.
Philosophy: Make it run. Then add. Verify at every step.
Alignment: Tracks main roadmap phases; exercises features as they land.

Current State

The simulator (stemedb-sim) validates the full system through Arena 0-4:

Completed Arenas:

✅ Arena 0: Test infrastructure with assertions and CI integration
✅ Arena 1: Query path via QueryEngine, Recency lens, lifecycle filtering, query audit
✅ Arena 2: Voting & VoteAwareConsensus, troll resistance
✅ Arena 2.5: Hardening (race conditions, API tests, crash recovery, input validation)
✅ Arena 3: Materialized Views, fast-path verification, MV freshness
✅ Arena 4: Agent personas (Scientist, Troll, Believer with differentiated strategies)

What's Tested:

WAL durability, rkyv serialization, Ed25519 signatures
Ingestor pipeline (WAL → KV async flow)
QueryEngine with multiple lenses
Lifecycle filtering, voting, consensus
Query audit trail, materialized views
Strategy-driven agent behaviors

What's Not Yet Tested:

❌ TrustRank (Arena 5)
❌ Concurrent agents at scale (Arena 6)
❌ Time-travel queries (Arena 7)
❌ Skeptic lens & conflict scores (Arena 8)

Upcoming Arena Phases

Arena 5: TrustRank Integration (Next)

Initialize TrustRank for agents
Reputation adjustment after votes
TrustAwareAuthorityLens verification
Troll reputation decay over time

Arena 6: Concurrent Agents

Tokio task per agent
Scale to 100 agents, then 1000
Contention metrics and bottleneck identification

Arena 7: Time-Travel & Epochs

Time-travel query verification
Epoch creation and supersession
Epoch cascade validation

Arena 8: Skeptic & Conflict

High/low conflict scenarios
Skeptic lens surfacing outliers
Conflict score accuracy

Arena 9: Full Gameplay Loop

Ground truth injection
Complete 5-tick scenario
Extended 1000-tick run
Emergence validation

Alignment with Use Cases

Use Case	Arena Phase
Agile Agent Team
Lifecycle filtering	Arena 1.3
Query audit trail	Arena 1.4
Time-travel debugging	Arena 7.1
Expert weighting	Arena 5.3
Financial Due Diligence
Conflict detection	Arena 8.1, 8.3
Epoch cascades	Arena 7.2, 7.3

Run command: cargo run --bin stemedb-sim
Test suite: cargo test -p stemedb-sim

CLAUDE.md — AI assistant instructions and project rules
roadmap-archive.md — Completed phases 1-8A + Pilot 1-3
applications/aphoria/docs/vision-gaps.md — Aphoria vision gap analysis
claims-explained.md — Hand-written Maxwell claims (the gold standard)
docs/demo/pilot/amazement-demo.md — Technical demo script
docs/demo/pilot/amazement-demo-2.md — Executive demo script
uat/production-readiness/README.md — Production verification checklist
