# Episteme (StemeDB) Roadmap Archive
> **Purpose:** Historical record of completed phases. For current work, see [roadmap.md](./roadmap.md).
> **Last Updated:** 2026-02-08
---
## Completed Phases Summary
| Phase | Codename | Status | Scope |
|-------|----------|--------|------------|
| **1** | The Spine | ✅ Complete | Storage & Safety — WAL + KV Store |
| **2** | The Lattice | ✅ Complete | Indexing & Async — MVs + Ballot Box |
| **2.5** | Hardening | ✅ Complete | MV staleness, epoch behavior, lens cleanup |
| **3** | The Pilot | ✅ Complete | Vertical Integration — Pharma Ingestion |
| **4** | The Hive | ✅ Complete | Trust & Learning — TrustRank, metadata indexing |
| **5** | The Forge | ✅ Complete | Foundation Hardening — redb/fjall, WAL, indices |
| **6** | The Mesh | ✅ Complete | Distributed Writes — CRDT, Raft, clustering |
| **7** | The Shield | ✅ Complete | Trust at Scale — EigenTrust, PoW, quarantine |
| **8A** | Chaos | ✅ Complete | Partition testing, Jepsen-style verification |
| **MVP** | Consumer Health | ✅ Complete | Real FDA data → conflicts detected → demo |
| **Pilot 1-3** | Pilot Prep (Partial) | ✅ Complete | Dashboard, demo data, impact analysis, load testing |
| **Pilot 4** | Production Hardening | ✅ Complete | API auth, backup/restore, Prometheus metrics |
| **Aphoria A1** | Observations vs Claims | ✅ Complete | Type system: Observation + AuthoredClaim, bridge tiers |
| **Aphoria A2** | Authoring Workflow | ✅ Complete | claims create/list/explain/update/supersede/deprecate |
| **Aphoria A3** | Verification Engine | ✅ Complete | verify.rs, verify run/map, pre-commit hook, self-audit |
| **Aphoria A4** | Corpus as Assertions | ✅ Complete | RFC/OWASP assertions, authority lens, trust packs |
---
## Phase 1: The Spine (Foundation) ✅
*Goal: Securely ingest assertions and persist them without data loss.*
- [x] **Project Scaffold**: Initialize Rust workspace, set up linting/CI (clippy, fmt).
- [x] **Assertion Schema**: Define the `Assertion` struct with `rkyv` serialization.
  - [x] Add dependencies: `rkyv`, `blake3`, `ed25519-dalek`, `image_hasher`.
  - [x] Define `Assertion` struct (Subject, Predicate, Object, Confidence, SourceHash).
  - [x] **Multi-Sig Expansion**: Implement `SignatureEntry` struct and `signatures: Vec<SignatureEntry>` field.
  - [x] **Visual Expansion**: Add `visual_hash: Option<pHash>` field for image provenance.
  - [x] Test serialization round-trips.
- [x] **Ballot Schema**: Define the `Vote` struct for multi-agent consensus.
  - [x] Add `Vote` struct: `assertion_hash`, `agent_id`, `weight`, `signature`.
  - [x] Test serialization round-trips.
- [x] **Paradigm Schema (Epochs)**: Define the `Epoch` and `SupersessionType` structs.
  - [x] Add `epoch: Option<EpochId>` to `Assertion`.
  - [x] Implement `Epoch` struct with `supersedes` and `SupersessionType`.
  - [x] Test serialization round-trips.
- [x] **WAL Integration**: Implement the Quarantine Pattern for write-ahead logging.
  - [x] Create `stemedb-wal` crate.
  - [x] Port `FsyncGuard` and `Record` logic from established durability patterns.
  - [x] Implement Record format with BLAKE3 checksums and Headers.
  - [x] Verify `fsync` behavior with tests.
- [x] **Storage Engine**: Implement the `KVStore` trait using `sled` (embedded KV).
  - [x] Add `sled` dependency.
  - [x] Define `KVStore` trait (put, get, delete, scan_prefix, flush).
  - [x] Implement `SledStore` wrapper.
- [x] **Basic Ingestor**: Background worker that tails WAL and writes to KV.
  - [x] Implement async loop reading from WAL.
  - [x] Write deserialized assertions, votes, and epochs to `sled`.
  - [x] Verify Ed25519 signatures during ingestion.
  - [x] Maintain `S:` and `SP:` indexes on ingest.
  - [x] Persistent cursor/checkpoint (resumes from `__CURSOR__:ingest` in KV store).
- [x] **Verification**: Crash recovery tests (write -> crash -> restart -> read).
  - [x] Single and multi-record crash recovery.
  - [x] Multiple crash cycles tested.
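
The `KVStore` operations and the persistent ingest cursor listed above can be sketched as follows. This is a minimal illustration, not the `stemedb-storage` code: only the method names (`put`, `get`, `delete`, `scan_prefix`, `flush`) and the `__CURSOR__:ingest` key come from this archive. The signatures, the in-memory `MemStore` stand-in for sled, and the big-endian cursor encoding are assumptions.

```rust
use std::collections::BTreeMap;

/// Key-value interface with the five operations named in the roadmap.
/// Signatures are assumed for illustration.
pub trait KVStore {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn delete(&mut self, key: &[u8]);
    /// Returns all entries whose key starts with `prefix`, in key order.
    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
    /// Persists buffered writes; a no-op for the in-memory stand-in.
    fn flush(&mut self);
}

/// Hypothetical in-memory stand-in for the sled-backed store.
#[derive(Default)]
pub struct MemStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KVStore for MemStore {
    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }
    fn delete(&mut self, key: &[u8]) {
        self.map.remove(key);
    }
    fn scan_prefix(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        self.map
            .range(prefix.to_vec()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
    fn flush(&mut self) {}
}

/// Stores the WAL offset the ingestor has processed (encoding is assumed).
pub fn save_cursor<S: KVStore>(store: &mut S, wal_offset: u64) {
    store.put(b"__CURSOR__:ingest", &wal_offset.to_be_bytes());
    store.flush();
}

/// Loads the saved WAL offset, or 0 when no cursor has been written yet.
pub fn load_cursor<S: KVStore>(store: &S) -> u64 {
    match store.get(b"__CURSOR__:ingest") {
        Some(v) if v.len() == 8 => {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(&v);
            u64::from_be_bytes(buf)
        }
        _ => 0,
    }
}
```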
---
## Phase 2: The Lattice (Connectivity) ✅
*Goal: Query data with sub-millisecond latency using Materialized Views.*
- [x] **Lifecycle Schema**: Add `LifecycleStage` to Assertion.
  - [x] Define enum: `Proposed`, `UnderReview`, `Approved`, `Deprecated`, `Rejected`.
  - [x] Update `Assertion` struct and serialization tests.
- [x] **The Ballot Box**: Implement high-velocity vote ingestion.
  - [x] `VoteStore` trait and implementation.
  - [x] `VoteAwareConsensusLens` for real vote-based resolution.
- [x] **Index Infrastructure**: Compound indexes for O(1) queries.
  - [x] `IndexStore` trait with S: and SP: indexes.
  - [x] `QueryEngine` smart routing (SP -> S -> scan).
- [x] **Materializer**: Background worker for O(1) Read Performance.
  - [x] `MaterializedView` type in `stemedb-core`.
  - [x] `Materializer` worker in `stemedb-query` with `step()` and `run()`.
  - [x] Aggregates Votes via `VoteAwareConsensusLens` (or any `AsyncLens`).
  - [x] Updates `MV:{Subject}:{Predicate}` with the winning Assertion + metadata.
  - [x] Event-driven mode via `run_notified()` with `tokio::sync::Notify`.
  - [x] Fast-path MV lookup in `QueryEngine::try_fast_path()`.
- [x] **The Meter**: Implement Economic Throttling (TAN).
  - [x] `QuotaStore` trait and `GenericQuotaStore` implementation.
  - [x] Token Bucket algorithm with per-agent per-hour quotas.
  - [x] `MeterLayer` tower middleware for request cost tracking.
  - [x] Cost model: Assert=10, Vote=1, Query=5+lens, +1/KB payload.
  - [x] `GET /v1/meter/quota` endpoint to check remaining quota.
  - [x] `POST /v1/meter/quota/limit` admin endpoint to set custom limits.
- [x] **API Surface**: `axum` HTTP server with OpenAPI (utoipa).
  - [x] `POST /v1/assert` -> Accepts JSON, writes to WAL.
  - [x] `POST /v1/vote` -> High-throughput vote endpoint.
  - [x] `POST /v1/epoch` -> Create epoch with optional supersession.
  - [x] `GET /v1/query` -> Subject/Predicate/Lens/Lifecycle/Epoch filtering.
  - [x] `GET /v1/health` -> Health check with assertion count.
  - [x] `GET /swagger-ui` -> Interactive API docs.
  - [x] 5 lens types available: Recency, Consensus, Authority, VoteAwareConsensus, TrustAwareAuthority.
- [x] **Query Audit**: Log every read with provenance.
  - [x] Define `QueryAudit` struct: query_id, agent_id, timestamp, params, result_hash, contributing_assertions.
  - [x] Storage at `AUD:{query_id}` with agent index at `AUDA:{agent_id}:{timestamp}:{query_id}`.
  - [x] `GET /v1/audit/queries` -> Returns history of agent decisions.
  - [x] `GET /v1/audit/query/{id}` -> Full reasoning trace for a single query.
  - [x] Auto-logging on every query via `X-Agent-Id` header.
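
The Meter's cost model and per-hour token bucket can be illustrated with a small sketch. The base costs (Assert=10, Vote=1, Query=5+lens, +1 per KB) are taken from the list above; everything else is assumed for illustration: the type and function names, rounding partial kilobytes up, the continuous refill rate, and passing the lens surcharge as a parameter (the archive does not list per-lens values).

```rust
/// Request kinds priced by the cost model.
pub enum Request {
    Assert,
    Vote,
    /// `lens_cost` is the lens surcharge; per-lens values are not given in the archive.
    Query { lens_cost: u64 },
}

/// Base cost plus 1 unit per KB of payload (assumption: partial KBs round up).
pub fn request_cost(req: &Request, payload_bytes: u64) -> u64 {
    let base = match req {
        Request::Assert => 10,
        Request::Vote => 1,
        Request::Query { lens_cost } => 5 + lens_cost,
    };
    base + (payload_bytes + 1023) / 1024
}

/// Token bucket holding one agent's hourly quota.
pub struct TokenBucket {
    capacity: u64,
    tokens: f64,
    last_refill_secs: u64,
}

impl TokenBucket {
    /// Starts with a full bucket at time `now_secs`.
    pub fn new(capacity: u64, now_secs: u64) -> Self {
        TokenBucket { capacity, tokens: capacity as f64, last_refill_secs: now_secs }
    }

    /// Refills at `capacity` tokens per hour, then spends `cost` if affordable.
    pub fn try_spend(&mut self, cost: u64, now_secs: u64) -> bool {
        let elapsed = now_secs.saturating_sub(self.last_refill_secs) as f64;
        let refill = self.capacity as f64 * elapsed / 3600.0;
        self.tokens = (self.tokens + refill).min(self.capacity as f64);
        self.last_refill_secs = now_secs;
        if self.tokens >= cost as f64 {
            self.tokens -= cost as f64;
            true
        } else {
            false
        }
    }
}
```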
---
## Phase 2.5: Hardening ✅
*Goal: Close the gaps between "built" and "works right."*
- [x] **2.1 MV Staleness Detection**: `max_stale` parameter on queries.
- [x] **2.2 AuthorityLens -> ConfidenceLens Rename**: Eliminated misleading name.
- [x] **2.3 EpochAwareLens**: Epoch supersession runtime behavior with cycle detection.
- [x] **2.4 Visual Hash Query Support**: Hamming distance queries on `visual_hash`.
- [x] **2.5 Vector Field**: `vector: Option<Vec<f32>>` stored on assertions.
- [x] **2.6 E2E Integration Test**: Full pipeline validation (Write -> Materialize -> Read).
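
As a rough illustration of item 2.4, a Hamming-distance query over perceptual hashes reduces to an XOR and a population count. The sketch below assumes a 64-bit hash and a linear scan; the function names are hypothetical, and the hash width actually used by `visual_hash` is not stated in this archive.

```rust
/// Number of differing bits between two 64-bit perceptual hashes.
pub fn hamming_distance(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

/// Linear scan returning the stored hashes within `max_distance` bits of `query`.
pub fn similar_hashes(stored: &[u64], query: u64, max_distance: u32) -> Vec<u64> {
    stored
        .iter()
        .copied()
        .filter(|&h| hamming_distance(h, query) <= max_distance)
        .collect()
}
```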
---
## Phase 3: The Pilot (BioTech/Pharma) ✅
*Goal: Prove value in the "High-Liability" beachhead.*
### 3A. Schema Expansion
- [x] **3A.1 Source-Class Field**: 6-tier `SourceClass` enum (Regulatory → Anecdotal).
- [x] **3A.2 Conflict Score on Resolution**: Normalized variance-based conflict metric.
- [x] **3A.3 Rich Source Metadata**: `source_metadata: Option<Vec<u8>>` for JSON provenance.
### 3B. Time & Decay
- [x] **3B.1 Time-Travel Engine**: `as_of` parameter for historical queries.
- [x] **3B.2 Semantic Decay**: Confidence half-life with tier-specific rates.
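
The half-life decay in 3B.2 can be written as `effective = confidence * 0.5^(age / half_life)`. The sketch below assumes that exact exponential form and takes the half-life as a parameter, because the tier-specific rates are not recorded in this archive; the function name is hypothetical.

```rust
/// Confidence after `age_days`, halving once every `half_life_days`.
/// Negative ages are treated as zero (no decay).
pub fn decayed_confidence(confidence: f64, age_days: f64, half_life_days: f64) -> f64 {
    assert!(half_life_days > 0.0, "half-life must be positive");
    confidence * 0.5_f64.powf(age_days.max(0.0) / half_life_days)
}
```

With an illustrative 90-day half-life, a claim asserted at confidence 0.8 drops to 0.4 after 90 days and to 0.2 after 180 days. A long half-life for high tiers and a short one for low tiers would produce the behaviour described as "tier-specific rates".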
### 3C. New Lenses
- [x] **3C.1 Skeptic Lens**: Surface disagreement via Shannon entropy conflict scoring.
- [x] **3C.2 Layered Consensus Lens**: Per-source-class consensus with tier visibility.
- [x] **3C.3 Constraints Lens**: Pre-flight check for must_use/forbidden/prefer.
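
One way to turn Shannon entropy into a conflict score, as in 3C.1, is to compute the entropy of the distribution of asserted values and normalize it by the maximum entropy for that number of distinct values. This is a sketch under that assumption; the archive does not say how the Skeptic Lens normalizes or weights its score, and `conflict_score` here is a hypothetical free function.

```rust
use std::collections::BTreeMap;

/// Normalized Shannon entropy of the asserted values, in [0, 1].
/// 0.0 means every source agrees; 1.0 means an even split across all values.
pub fn conflict_score(values: &[&str]) -> f64 {
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    for v in values {
        *counts.entry(*v).or_insert(0) += 1;
    }
    if counts.len() < 2 {
        return 0.0; // zero or one distinct value: nothing to disagree about
    }
    let total = values.len() as f64;
    let entropy: f64 = counts
        .values()
        .map(|&c| {
            let p = c as f64 / total;
            -p * p.log2()
        })
        .sum();
    entropy / (counts.len() as f64).log2()
}
```

Under this definition, three sources that agree score 0.0, and a three-to-one split between two values scores about 0.81.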
### 3D. Epoch Enhancement
- [x] **3D.1 Epoch Cascade Logic**: O(1) supersession lookup via pre-computed markers.
### 3E. Similarity Search
- [x] **3E.1 Vector Search**: HNSW-based semantic k-NN queries.
- [x] **3E.2 Visual Hash Index**: BK-tree for O(log N) visual similarity.
### 3F. Provenance
- [x] **3F.1 Source Document Storage**: Content-addressed source storage with `GET /v1/provenance/{hash}`.
### 3G. API Cleanup
- [x] **3G.1 Document epoch supersession**: Updated docs for `POST /v1/epoch` with `supersedes` field.
---
## Phase 4: The Hive (Trust & Scale) ✅
*Goal: Change tracking, metadata indexing, and training pipeline primitives.*
- [x] **TrustRank Engine**: Per-agent reputation with decay and learning loop.
- [x] **4.1 "Since" Parameter**: MV changelog at `MVC:` keys with `changes_since` in responses.
- [x] **4.2 Source Metadata Indexing**: Indexed fields (journal, doi, platform, study_design) at `SMV:`.
- [x] **4.3 Batch TrustRank Decay API**: `POST /v1/admin/decay-trust-ranks`.
- [x] **4.4 Vote Provenance Witness**: `source_url` and `observed_context` on votes.
- [x] **4.5 Conflict Score Filtering**: `min_conflict_score`/`max_conflict_score` on queries.
- [x] **4.6 Escalation Triggers**: `EscalationPolicy` fires events on high-conflict assertions.
- [x] **4.7 Gold Standard Verification**: Admin-verified assertions for agent testing.
---
## Phase 5: The Forge (Foundation Hardening) ✅
*Goal: Replace abandoned dependencies, fix WAL gaps, persist indices.*
### 5A. Storage Engine Replacement
- [x] **5A.1 Replace sled with redb + fjall**: HybridStore with prefix-based routing.
- [x] **5A.2 Key Layout Redesign**: Subject-prefix keys for range sharding readiness.
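
Prefix-based routing in a hybrid store can be sketched as a longest-prefix lookup from key to backend. The archive does not record which key prefixes go to redb and which to fjall, so the rule table is left to the caller and any assignment is purely illustrative; `Backend` and `Router` are hypothetical names.

```rust
/// The two storage engines named in 5A.1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Redb,
    Fjall,
}

/// Routes a key to a backend by its longest matching prefix.
pub struct Router {
    pub rules: Vec<(&'static str, Backend)>,
    pub default: Backend,
}

impl Router {
    pub fn route(&self, key: &str) -> Backend {
        self.rules
            .iter()
            .filter(|(prefix, _)| key.starts_with(*prefix))
            .max_by_key(|(prefix, _)| prefix.len())
            .map(|(_, backend)| *backend)
            .unwrap_or(self.default)
    }
}
```

Choosing the longest match keeps overlapping prefixes unambiguous: a rule for `MVC:` wins over a broader rule for `MV` when both match.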
### 5B. WAL Hardening
- [x] **5B.1 CRC32C Checksums**: Hardware-accelerated torn write detection.
- [x] **5B.2 Crash Recovery Implementation**: Sequential scan with truncation.
- [x] **5B.3 Group Commit**: Batch fsync for throughput.
- [x] **5B.4 Log Rotation**: Segment management with safe deletion.
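
Items 5B.1 and 5B.2 combine naturally: each record carries a CRC32C of its payload, and recovery scans from the start until the first record that is truncated or fails its checksum, then truncates there. The sketch below uses a bitwise software CRC32C rather than the hardware-accelerated path, and the `[len: u32 LE][crc: u32 LE][payload]` framing is an assumed layout, not the `stemedb-wal` record format.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
pub fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

fn read_u32_le(b: &[u8]) -> u32 {
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

/// Frames a payload as length, checksum, payload (assumed layout).
pub fn encode_record(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32c(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Sequentially scans `log`, returning the intact payloads and the byte
/// offset at which the log should be truncated.
pub fn recover(log: &[u8]) -> (Vec<Vec<u8>>, usize) {
    let mut records = Vec::new();
    let mut pos = 0;
    while log.len() - pos >= 8 {
        let len = read_u32_le(&log[pos..pos + 4]) as usize;
        let crc = read_u32_le(&log[pos + 4..pos + 8]);
        let end = pos + 8 + len;
        if end > log.len() {
            break; // torn write: the payload was cut short
        }
        let payload = &log[pos + 8..end];
        if crc32c(payload) != crc {
            break; // corrupted record
        }
        records.push(payload.to_vec());
        pos = end;
    }
    (records, pos)
}
```

Everything after the returned offset is discarded on restart, so a half-written tail record never reaches the ingestor.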
### 5C. Index Persistence
- [x] **5C.1 Persistent Vector Index**: Hot/cold HNSW with checkpoint files.
- [x] **5C.2 Persistent Visual Index**: BK-tree snapshots with CRC32C verification.
### 5D. Concept Hierarchy
- [x] **5D.1 ConceptPath Type**: Scheme-qualified subject identifiers.
- [x] **5D.2 Source Scheme Registry**: Scheme → default source tier mapping.
- [x] **5D.3 Alias Store**: Cross-scheme entity resolution with cycle detection.
- [x] **5D.4 Hierarchical Query**: Prefix-based subject queries.
- [x] **5D.5 Alias Resolution in Queries**: `GET /v1/concepts/resolve?path=...`.
- [x] **5D.6 Source Class Inference**: Tier inference from scheme.
- [x] **5D.7 Concept API Endpoints**: Full CRUD for aliases and hierarchy.
- [x] **5D.8 Battery Tests**: 15 tests across Battery 8 and 9.
---
## Phase 6: The Mesh (Distributed Writes) ✅
*Goal: Multi-node cluster with CRDT replication and Raft coordination.*
### 6A. CRDT Foundation
- [x] **6A.1 Integrate CRDT Crate**: G-Set for assertions, G-Counter for votes.
- [x] **6A.2 Hybrid Logical Clocks**: HLC timestamps for causal ordering.
- [x] **6A.3 Merkle Tree Over Assertions**: BLAKE3-based diff detection.
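
The two CRDTs named in 6A.1 have very small merge functions, which is what makes replication order-independent. The following is a sketch of the general technique, not the integrated crate: a G-Set merges by set union and a G-Counter merges by taking the per-node maximum. Identifying assertions and nodes by `String` is an assumption made to keep the example short.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Grow-only set of assertion identifiers; merge is set union.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct GSet {
    items: BTreeSet<String>,
}

impl GSet {
    pub fn insert(&mut self, id: &str) {
        self.items.insert(id.to_string());
    }
    pub fn len(&self) -> usize {
        self.items.len()
    }
    pub fn merge(&mut self, other: &GSet) {
        self.items.extend(other.items.iter().cloned());
    }
}

/// Grow-only counter; each node only increments its own slot.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct GCounter {
    per_node: BTreeMap<String, u64>,
}

impl GCounter {
    pub fn increment(&mut self, node: &str, by: u64) {
        *self.per_node.entry(node.to_string()).or_insert(0) += by;
    }
    /// Total across all nodes.
    pub fn value(&self) -> u64 {
        self.per_node.values().sum()
    }
    /// Per-node maximum, so merging the same state twice changes nothing.
    pub fn merge(&mut self, other: &GCounter) {
        for (node, &count) in &other.per_node {
            let slot = self.per_node.entry(node.clone()).or_insert(0);
            *slot = (*slot).max(count);
        }
    }
}
```

Both merges are commutative, associative, and idempotent, so two replicas converge regardless of the order or number of times they exchange state.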
### 6B. Two-Node Replication (PoC)
- [x] **6B.1 RPC Layer**: tonic gRPC with SyncClient and SyncServiceHandler.
- [x] **6B.2 Gossip Broadcast**: Configurable fanout with rate limiting.
- [x] **6B.3 Merkle Anti-Entropy Sync**: Background convergence worker.
- [x] **6B.4 Integration Test**: 8 tests validating replication primitives.
### 6C. Multi-Node Cluster
- [x] **6C.1 Cluster Membership (SWIM Gossip)**: Node discovery and failure detection.
- [x] **6C.2 Subject-Prefix Range Sharding**: BLAKE3 + jump hash routing.
- [x] **6C.4 Gateway**: Stateless request routing with health and status endpoints.
- [x] **6C.5 Integration Tests**: 82 tests covering membership, sharding, gateway.
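
Item 6C.2 pairs a hash of the subject prefix with jump consistent hashing to pick a shard. The sketch below shows the standard jump hash loop. Because BLAKE3 is not in the standard library, it substitutes 64-bit FNV-1a as a clearly labeled stand-in for the digest, and `shard_for` is a hypothetical helper that hashes the whole subject string rather than whatever prefix rule the cluster crate applies.

```rust
/// Jump consistent hash: maps a 64-bit key to a bucket in [0, num_buckets).
pub fn jump_hash(mut key: u64, num_buckets: u32) -> u32 {
    assert!(num_buckets > 0, "need at least one bucket");
    let mut b: i64 = -1;
    let mut j: i64 = 0;
    while j < num_buckets as i64 {
        b = j;
        key = key.wrapping_mul(2862933555777941757).wrapping_add(1);
        let r = ((key >> 33) + 1) as f64;
        j = (((b + 1) as f64) * ((1u64 << 31) as f64 / r)) as i64;
    }
    b as u32
}

/// Stand-in for the BLAKE3 digest: 64-bit FNV-1a (assumption, std-only).
fn fnv1a64(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &byte in data {
        h ^= byte as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Hypothetical helper: picks the shard for a subject.
pub fn shard_for(subject: &str, num_shards: u32) -> u32 {
    jump_hash(fnv1a64(subject.as_bytes()), num_shards)
}
```

The useful property for resharding is that growing from `n` to `n + 1` shards only ever moves a key to the new shard, never between existing ones.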
### Consistency Guarantees
| Property | Guarantee | Mechanism |
|----------|-----------|-----------|
| **Convergence** | Eventually consistent | G-Set merge (CRDT) |
| **Causality** | Supersessions ordered | HLC timestamps |
| **Partition Tolerance** | Writes never blocked | Any node accepts via CRDT |
| **Availability** | Reads/writes always succeed | Every node is master for CRDTs |
| **Durability** | Per-node crash durability | WAL + fsync (existing WAL infrastructure) |
| **Conflict Resolution** | Deterministic | Lens algorithms |
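
The causality row relies on Hybrid Logical Clocks. A minimal sketch of the usual HLC update rules follows: a timestamp is a (wall time, logical counter) pair, local events bump the counter when the physical clock has not advanced, and receiving a remote timestamp moves the local clock past it. Field names and millisecond wall time are assumptions, and a node-id tie-breaker is omitted for brevity.

```rust
/// Hybrid logical clock timestamp; derived ordering compares `wall`, then `logical`.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, PartialOrd, Ord)]
pub struct Hlc {
    pub wall: u64,
    pub logical: u32,
}

impl Hlc {
    /// Timestamp for a local event or an outgoing message.
    pub fn tick(&mut self, now: u64) -> Hlc {
        if now > self.wall {
            self.wall = now;
            self.logical = 0;
        } else {
            // Physical clock stalled or went backwards: advance logically.
            self.logical += 1;
        }
        *self
    }

    /// Merges a timestamp received from another node.
    pub fn observe(&mut self, remote: Hlc, now: u64) -> Hlc {
        let wall = self.wall.max(remote.wall).max(now);
        self.logical = if wall == self.wall && wall == remote.wall {
            self.logical.max(remote.logical) + 1
        } else if wall == self.wall {
            self.logical + 1
        } else if wall == remote.wall {
            remote.logical + 1
        } else {
            0
        };
        self.wall = wall;
        *self
    }
}
```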
---
## Phase 7: The Shield (Trust at Scale) ✅
*Goal: Defend against spam, Sybil attacks, and knowledge poisoning.*
### 7A. Admission Control
- [x] **7A.1 Proof-of-Work Admission**: BLAKE3 hashcash with graduated difficulty.
- [x] **7A.2 Graduated Trust Tiers**: 5 tiers (Untrusted → Authority) with quota multipliers.
### 7B. EigenTrust
- [x] **7B.1 Trust Graph Store**: Direct trust relationships at `TG:` keys.
- [x] **7B.2 EigenTrust Computation**: Power iteration with Sybil resistance.
- [x] **7B.3 Domain-Specific Trust**: Per-predicate-namespace reputation.
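
The power iteration in 7B.2 can be sketched with the textbook EigenTrust update, `t' = (1 - alpha) * C^T * t + alpha * p`, where `C` is the row-normalized local trust matrix and `p` is the pre-trusted distribution. This is an illustration of the general algorithm under assumed parameter names; the damping value and the way the store feeds `TG:` edges into the matrix are not documented here.

```rust
/// Global trust via power iteration.
/// `local[i][j]` is how much agent `i` directly trusts agent `j`;
/// `pretrusted` must sum to 1.0.
pub fn eigentrust(local: &[Vec<f64>], pretrusted: &[f64], alpha: f64, iters: usize) -> Vec<f64> {
    let n = local.len();
    assert_eq!(pretrusted.len(), n, "one pre-trust weight per agent");
    // Row-normalize; agents that trust nobody defer to the pre-trusted set.
    let c: Vec<Vec<f64>> = local
        .iter()
        .map(|row| {
            assert_eq!(row.len(), n, "trust matrix must be square");
            let sum: f64 = row.iter().map(|v| v.max(0.0)).sum();
            if sum > 0.0 {
                row.iter().map(|v| v.max(0.0) / sum).collect()
            } else {
                pretrusted.to_vec()
            }
        })
        .collect();
    let mut t = pretrusted.to_vec();
    for _ in 0..iters {
        let mut next = vec![0.0; n];
        for i in 0..n {
            for j in 0..n {
                next[j] += c[i][j] * t[i];
            }
        }
        for j in 0..n {
            next[j] = (1.0 - alpha) * next[j] + alpha * pretrusted[j];
        }
        t = next;
    }
    t
}
```

The Sybil resistance comes from the pre-trusted term: an agent that only vouches for itself, and that no trusted agent vouches for, never accumulates global trust.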
### 7C. Content Defense
- [x] **7C.1 MinHash Deduplication**: LSH bucketing with 0.9 Jaccard threshold.
- [x] **7C.2 Content Quality Scoring**: Entropy, length, structure heuristics.
- [x] **7C.3 Quarantine Store**: Time-ordered suspicious assertions with admin review.
### 7D. Circuit Breakers
- [x] **7D.1 Per-Agent Circuit Breakers**: Closed → Open → HalfOpen state machine.
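
The Closed → Open → HalfOpen state machine in 7D.1 can be sketched as below. The three states come from the roadmap; the failure threshold, the cooldown, and the rule that a single failed probe reopens the breaker are assumptions, and `Breaker` is a hypothetical name.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Closed,
    Open,
    HalfOpen,
}

/// Per-agent breaker; time is passed in as seconds to keep it testable.
pub struct Breaker {
    state: State,
    failures: u32,
    opened_at: u64,
    failure_threshold: u32,
    cooldown_secs: u64,
}

impl Breaker {
    pub fn new(failure_threshold: u32, cooldown_secs: u64) -> Self {
        Breaker { state: State::Closed, failures: 0, opened_at: 0, failure_threshold, cooldown_secs }
    }

    pub fn state(&self) -> State {
        self.state
    }

    /// Whether a request may proceed; an Open breaker becomes HalfOpen after the cooldown.
    pub fn allow(&mut self, now: u64) -> bool {
        if self.state == State::Open && now.saturating_sub(self.opened_at) >= self.cooldown_secs {
            self.state = State::HalfOpen;
        }
        self.state != State::Open
    }

    /// Records the outcome of a request that was allowed through.
    pub fn record(&mut self, success: bool, now: u64) {
        match (self.state, success) {
            (State::HalfOpen, true) => {
                self.state = State::Closed;
                self.failures = 0;
            }
            (State::HalfOpen, false) => self.trip(now),
            (State::Closed, true) => self.failures = 0,
            (State::Closed, false) => {
                self.failures += 1;
                if self.failures >= self.failure_threshold {
                    self.trip(now);
                }
            }
            (State::Open, _) => {}
        }
    }

    fn trip(&mut self, now: u64) {
        self.state = State::Open;
        self.opened_at = now;
        self.failures = 0;
    }
}
```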
---
## Phase 8A: Chaos Testing ✅
- [x] **8A.1 Partition Testing**: 5-node cluster, network partitions, cascading failures.
- [x] **8A.2 Jepsen-Style Consistency Testing**: CRDT properties, clock skew, concurrent writes.
---
## Consumer Health MVP ✅
*"Can Episteme demonstrate value that's impossible with Postgres?"*
### Definition of Done (All Complete)
| Checkpoint | Description |
|------------|-------------|
| **Real Data Flows** | FDA drug labels for 3+ GLP-1 drugs ingested as signed assertions |
| **Conflicts Detected** | SkepticLens shows `conflict_score > 0.5` when sources disagree |
| **Source Hierarchy Works** | A Tier 0 (FDA) source outweighs 100x its volume in Tier 5 (anecdotal) claims |
| **Time Travel Works** | `as_of=2024-01-01` returns historical snapshot |
| **Decay Works** | A 6-month-old Reddit claim has lower effective confidence than a fresh FDA claim |
| **UAT Passes** | Consumer Health scenarios documented and verified |
| **Self-Serve Demo** | CLI tool lets anyone explore without code |
| **Documentation** | "Adding a Domain" guide enables new verticals |
### MVP Workstream (Weeks 1-6)
- Week 1: Domain definitions, SubjectBuilder, pharma schema
- Week 2: FDA extractor, claim-to-assertion signing
- Week 3: Ingest FDA claims, mock conflicts, SkepticLens demo
- Week 4: UAT scenarios documented and verified
- Week 5: `steme-pharma` CLI for self-serve exploration
- Week 6: Polish, reusable patterns, documentation
---
## Enterprise Pilot Preparation (Partial) ✅
*Completed at the time of this entry: Pilot-1, Pilot-2, Pilot-3, P4.1. P4.2-P4.4 were completed later (see "Pilot-4: Production Hardening" below). Remaining: P5.1-P5.4 (still in roadmap.md).*
### Pilot-1: Demo Dashboard (Complete)
> **Deliverable:** React admin dashboard that makes the API visual
- [x] **P1.1 Dashboard Scaffold**: Next.js + shadcn/ui project setup (`applications/stemedb-dashboard/`)
- [x] **P1.2 Skeptic Query Visualization**: Contradictions with conflict scores, tier badges, expandable claims
- [x] **P1.3 Layered Consensus View**: Per-tier breakdown with cross-tier conflict visualization
- [x] **P1.4 Quarantine Admin Panel**: Pending queue, approve/reject, filter by reason, metrics
- [x] **P1.5 Circuit Breaker Status**: Blocked agents, state badges (OPEN/HALF_OPEN/CLOSED), manual reset
- [x] **P1.6 Audit Trail Browser**: Recent queries, drilldown, filter by agent/time, export JSON/CSV
### Pilot-2: Demo Data Seeder (Complete)
> **Deliverable:** Pre-signed realistic demo data using Go SDK
- [x] **P2.1 Demo Keypair Management**: 5 demo agents (FDA, PubMed, ClinicalTrials, Reddit, Internal) with deterministic keys
- [x] **P2.2 Conflict Scenarios**: 3 drugs (semaglutide, tirzepatide, liraglutide), 150+ assertions, real FDA content
- [x] **P2.3 Retractable Sources**: CARDIOVASC_MEGA_TRIAL with 110 cascade assertions across 5 agents
- [x] **P2.4 Historical Data**: Lifecycle evolution (Proposed → Approved → Deprecated), 17 historical assertions
### Pilot-3: Impact Analysis (Complete)
> **Deliverable:** Automatic cascade when source is retracted
- [x] **P3.1 Impact Analysis Endpoint**: `GET /v1/sources/{hash}/impact`, quarantine with preview, restore, 17 tests
- [x] **P3.2 Cascade Flagging**: Query-time source status enrichment, `exclude_quarantined_sources` filter, CSV/JSON export
- [x] **P3.3 Impact Dashboard Widget**: Sources page, quarantine dialog with impact preview, impact ripple animation
### Pilot-4: Production Hardening (Partial)
- [x] **P4.1 Load Testing**: Go-based load tester, 10K assertions, 1K writes/sec, 100 concurrent readers, markdown reports
### 5 Amazement Moments (Status at Archive)
| # | Moment | Status |
|---|--------|--------|
| 1 | Contradictions visible with confidence scores | ✅ Complete |
| 2 | Cascade invalidation when source retracted | ✅ Complete |
| 3 | Full FDA-ready audit trail | ✅ Complete |
| 4 | Point-in-time queries + decay | ✅ API ready (no timeline UI) |
| 5 | Malicious agent blocked by circuit breaker | ✅ Complete |
---
## Key Architectural Decisions (Historical)
- **sled → redb/fjall**: sled abandoned. HybridStore routes by key prefix.
- **Raft log = WAL**: Eliminated duplicate WAL following TiKV v5.4 pattern.
- **CRDT for data, Raft for coordination**: Assertions are G-Set CRDT.
- **Subject-prefix ranges**: Co-locate all data for a subject on one shard.
- **HLC over TrueTime**: Works on commodity hardware.
- **AP model**: Writes never blocked during partitions.
---
## Research Documents
- [docs/research/wal-crash-recovery-research.md](docs/research/wal-crash-recovery-research.md) — WAL patterns from CockroachDB, TiKV, FoundationDB, SQLite.
- [docs/research/distributed-write-path.md](docs/research/distributed-write-path.md) — Spanner/CockroachDB-style distributed writes adapted for append-only model.
---
## Crates (as of archive date)
| Crate | Purpose |
|-------|---------|
| `stemedb-core` | Assertion, LifecycleStage, MaterializedView, types, signing utilities |
| `stemedb-wal` | Write-ahead log with crash recovery |
| `stemedb-storage` | KVStore, VoteStore, IndexStore, TrustRankStore, QuarantineStore, SimilarityIndex |
| `stemedb-ingest` | Ingestion pipeline, signature verification, ContentDefenseLayer |
| `stemedb-query` | Query engine, Materializer for O(1) MV reads |
| `stemedb-lens` | Lenses (Recency, Consensus, Authority, Skeptic, Layered, etc.) |
| `stemedb-api` | HTTP API with axum + utoipa OpenAPI docs |
| `stemedb-sim` | Simulation for testing the pipeline |
| `stemedb-merkle` | BLAKE3 Merkle tree for diff detection |
| `stemedb-rpc` | gRPC services for node-to-node communication |
| `stemedb-sync` | Merkle sync, gossip broadcast, anti-entropy |
| `stemedb-cluster` | Cluster membership (SWIM), sharding, gateway |
| `stemedb-ontology` | Domain definitions (Pharma), subject builders, medical extractors |
| `stemedb-chaos` | Chaos testing infrastructure |
---
## Pilot-4: Production Hardening ✅
- [x] **P4.2 API Authentication**: API key middleware (`X-API-Key`), BLAKE3-hashed keys, 3 roles (admin/write/read), 5 CRUD endpoints, bootstrap via `STEMEDB_ROOT_API_KEY`
- [x] **P4.3 Backup/Restore**: `scripts/backup-stemedb.sh` + `scripts/restore-stemedb.sh`, WAL magic verify, rename-not-delete safety
- [x] **P4.4 Prometheus Metrics**: `/metrics` endpoint, `assertions_total`, `queries_total`, `query_latency_seconds`, `quarantine_pending`, Grafana dashboard template
---
## Aphoria A1: Distinguish Observations from Claims ✅
*Goal: Type system reflects the real difference. No more pretending grep results are claims.*
- [x] **A1.1 Rename ExtractedClaim to Observation**: Updated across all 42 extractors, bridge, scanner, CLI
- [ ] **A1.2 Create Claim Type**: `AuthoredClaim` in `types/authored_claim.rs` with provenance/invariant/consequence/authority/evidence/status/supersedes. `ClaimStore` trait + `TomlClaimStore`. `ClaimsFile` TOML persistence in `.aphoria/claims.toml` *(Partial: `AuthoredClaim` type and `ClaimsFile` persistence complete. `ClaimStore` trait defined but never implemented — `TODO(A4)` in `claim_store.rs`. All operations bypass it via `ClaimsFile` directly.)*
- [x] **A1.3 Update Bridge Tier Mapping**: Observations → Tier 4 (Community), authored claims get tier from `authority_tier` field via `authored_claim_to_assertion()`
- [x] **A1.4 Claim File Format**: `.aphoria/claims.toml` with `[[claim]]` TOML arrays, human-readable, version-controllable
## Aphoria A2: Build the Authoring Workflow ✅
*Goal: The skill — not the scanner — is the primary interface for creating claims.*
- [x] **A2.1 Claim Authoring Command**: `aphoria claims create` with all fields, authority tier validation
- [x] **A2.2 Claim Listing**: `aphoria claims list` with `--category`, `--status`, `--format json`
- [x] **A2.3 Claims Explained Generator**: `aphoria claims explain` groups by category with provenance/invariant/consequence
- [x] **A2.4 Enhance Aphoria Skill**: `.claude/skills/aphoria-claims/SKILL.md` for diff review, pattern table, authority tier guide
- [x] **A2.5 Claim Lifecycle**: `update`, `supersede` (with parent pointer), `deprecate` (with reason)
## Aphoria A3: Pair Extractors with Claims ✅
*Goal: Extractors verify claims, not generate them. The audit finds real conflicts.*
- [x] **A3.1 Verification Engine**: `ComparisonMode` (Equals/NotEquals/Present/Absent), `verify.rs` with tail-path matching, 4 verdicts (Pass/Conflict/Missing/Unclaimed), 11 unit tests
- [x] **A3.2 Verify Command**: `aphoria verify run|map`, `--exit-code` (0=pass, 1=missing, 2=conflicts, 3=error), `--claim` and `--category` filters
- [x] **A3.3 Verify Report Formatters**: `verify_table.rs` + `verify_json.rs`
- [x] **A3.4 Pre-Commit Hook**: `aphoria verify run --changed-only --exit-code` using `walk_staged_files()`
- [x] **A3.5 Self-Audit Extractors**: `self_audit.rs` (unwrap count, bridge tier, parent_hash, lifecycle), opt-in, 5+3 tests
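
To make the A3.1 and A3.2 vocabulary concrete, here is a sketch of how a comparison mode and an observed value could yield a verdict, and how a set of verdicts could map to the documented exit codes. The enum variants and exit-code values are from the items above; the semantics of each mode, the precedence of conflicts over missing observations, and the function names are assumptions. Tail-path matching and the `Unclaimed` assignment (an observation with no matching claim) happen outside this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ComparisonMode {
    Equals,
    NotEquals,
    Present,
    Absent,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Conflict,
    Missing,
    Unclaimed,
}

/// Compares a claim's expected value with what an extractor observed (if anything).
pub fn check_claim(mode: ComparisonMode, expected: &str, observed: Option<&str>) -> Verdict {
    match (mode, observed) {
        (ComparisonMode::Absent, None) => Verdict::Pass,
        (ComparisonMode::Absent, Some(_)) => Verdict::Conflict,
        (_, None) => Verdict::Missing,
        (ComparisonMode::Present, Some(_)) => Verdict::Pass,
        (ComparisonMode::Equals, Some(v)) if v == expected => Verdict::Pass,
        (ComparisonMode::NotEquals, Some(v)) if v != expected => Verdict::Pass,
        (_, Some(_)) => Verdict::Conflict,
    }
}

/// Maps verdicts to the `--exit-code` values: 0=pass, 1=missing, 2=conflicts.
/// Exit code 3 (error) is reserved for failures of the run itself.
pub fn exit_code(verdicts: &[Verdict]) -> i32 {
    if verdicts.contains(&Verdict::Conflict) {
        2
    } else if verdicts.contains(&Verdict::Missing) {
        1
    } else {
        0
    }
}
```

In this sketch `Unclaimed` results do not affect the exit code, which keeps a pre-commit hook from failing on code that simply has no claims yet.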
## Aphoria A4: Make the Corpus First-Class ✅
*Goal: RFC/OWASP knowledge lives in Episteme as real assertions, not hardcoded data.*
- [x] **A4.1 Import RFC Corpus**: Tier 0/1 assertions with section references, source hash = content hash, `create_authoritative_assertion_with_metadata()` helper
- [x] **A4.2 Import OWASP Corpus**: OWASP → Tier 0/1 assertions with CWE references as metadata
- [x] **A4.3 Lens-Based Conflict Resolution**: `AphoriaAuthorityLens` implementing `stemedb_lens::Lens`, `TierBreakdown` in conflict results
- [x] **A4.4 Trust Packs as Claim Bundles**: `aphoria corpus export-pack`, `trust-pack list/install`, `export_claims_as_policy()` bridges claims → Trust Packs