# Aphoria Vision Gaps

**Date**: 2026-02-08
**Status**: Honest assessment of where we are vs. where we need to be
**Grounded Against**: Codebase as of commit `e0d2940` (42 extractors, bridge.rs, ephemeral/persistent modes)

## Implementation Status

**Phase A1: Distinguish Observations from Claims** - ✅ **COMPLETE** (2026-02-08)

- Renamed `ExtractedClaim` → `Observation` (struct + 81 files updated)
- Added confidence-based tier mapping: ≥0.9 → Tier 4, <0.9 → Tier 5
- `observation_to_assertion()` replaces fixed Tier 3 assignment
- `AuthoredClaim` type fully defined with provenance/invariant/consequence fields
- Claims storage in `.aphoria/claims.toml` (ClaimsFile implementation)
- CLI commands: `aphoria claim create|list|explain|update|supersede|deprecate`
- All 1055 tests passing

**Verification Engine Enhancements** - ✅ **COMPLETE** (2026-02-08)

- Added `Contains` and `NotContains` comparison modes for substring/list checking
- `verify run` command verifies authored claims against observations
- `verify map` shows extractor-to-claim coverage
- Inline marker support: `@aphoria:claim[category]` comments in code
- Marker workflow: `list-markers`, `formalize-marker`, `reject-marker` commands
- All 47 verification tests passing (39 existing + 8 new for Contains/NotContains)
- Maxwell dogfooding: 10/10 claims verified, false negative bug eliminated

See commit history for implementation details.

---

## The Problem in One Sentence

Aphoria extracts observations about source code and calls them "claims," but they aren't claims -- they're grep results wearing Episteme vocabulary.

---

## Current Architecture: What Actually Happens

### Scan Flow (Ephemeral Mode)

This is the fast path (~0.25s), used for CI/pre-commit. Traced from `scanner.rs:52` through to report output.

```mermaid
sequenceDiagram
    participant CLI as CLI (main.rs)
    participant Handler as handle_scan()
    participant Scanner as run_scan()
    participant Walker as walk_project()
    participant Registry as ExtractorRegistry
    participant Bridge as bridge.rs
    participant Corpus as corpus.rs
    participant Index as ConceptIndex
    participant Conflict as conflict.rs
    participant Report as Formatter

    CLI->>Handler: ScanArgs + AphoriaConfig
    Handler->>Scanner: run_scan(args, config)

    Note over Scanner: Phase 1: WALK
    Scanner->>Walker: walk_project(root, config)
    Walker-->>Scanner: Vec

    Note over Scanner: Phase 2: EXTRACT
    loop For each WalkedFile
        Scanner->>Registry: extract_all(segments, content, lang, file)
        Registry->>Registry: for_language(lang) -> applicable extractors
        loop For each Extractor
            Registry->>Registry: extractor.extract(segments, content, lang, file)
        end
        Registry->>Registry: filter by IgnoreCommentParser
        Registry-->>Scanner: Vec
    end

    Note over Scanner: Phase 3: CONFLICT DETECTION
    Scanner->>Bridge: load_or_generate_key(root)
    Bridge-->>Scanner: SigningKey
    Scanner->>Corpus: create_authoritative_corpus(key)
    Note over Corpus: Hardcoded RFC/OWASP assertions<br/>corpus.rs:33-157
    Corpus-->>Scanner: Vec (authority)
    Scanner->>Index: ConceptIndex::build(corpus)
    Note over Index: make_key() = last 2 path segments<br/>+ "::" + predicate
    Index-->>Scanner: ConceptIndex
    Scanner->>Conflict: check_conflicts(claims, index, config)
    loop For each Observation
        Conflict->>Index: lookup(claim.subject, claim.predicate)
        Note over Conflict: Tail-path match:<br/>"code://rust/app/tls/cert_verification"<br/>matches "rfc://5246/tls/cert_verification"
        Conflict->>Conflict: Compare values, compute score
        Conflict->>Conflict: Determine verdict (Block/Flag/Pass)
    end
    Conflict-->>Scanner: Vec

    Note over Scanner: Phase 4: REPORT
    Scanner->>Report: format(results)
    Report-->>CLI: Table / JSON / SARIF / Markdown
```

**Key code locations:**

- Entry: `handlers/scan.rs:8-71`
- Orchestration: `scanner.rs:52-117`
- Walker: `walker/mod.rs:115-175`
- Extraction: `registry.rs:289-304`
- Corpus build: `corpus.rs:33-157`
- Index: `concept_index.rs:30-110`
- Conflict: `conflict.rs:64-200`

### Scan Flow (Persistent Mode with --persist --sync)

The full Episteme path, used for drift detection and observation write-back.

```mermaid
sequenceDiagram
    participant Scanner as run_scan()
    participant Episteme as LocalEpisteme
    participant WAL as Journal (WAL)
    participant Store as HybridStore
    participant Bridge as bridge.rs
    participant Index as ConceptIndex
    participant Drift as drift.rs
    participant Hosted as HostedClient

    Note over Scanner: Same walk + extract as ephemeral

    Scanner->>Episteme: LocalEpisteme::open(config, root)
    Episteme->>WAL: Journal::open(wal_dir)
    Episteme->>Store: HybridStore::open(store_dir)
    Episteme-->>Scanner: LocalEpisteme

    Note over Scanner: Ingest claims as Tier 3 assertions
    Scanner->>Episteme: ingest_claims(all_claims)
    loop For each claim
        Episteme->>Bridge: claim_to_assertion(claim, key, ts)
        Note over Bridge: SourceClass::Expert (Tier 3)<br/>lifecycle: Approved<br/>parent_hash: None<br/>epoch: None
        Bridge-->>Episteme: Assertion
        Episteme->>WAL: journal.append(serialized)
    end

    Note over Scanner: Build index from corpus + imported assertions
    Scanner->>Episteme: fetch_authoritative_assertions()
    Episteme-->>Scanner: Vec (from store)
    Scanner->>Index: ConceptIndex::build_with_aliases(corpus, aliases)

    Note over Scanner: Check conflicts
    Scanner->>Episteme: check_conflicts(claims, config, index)
    Episteme-->>Scanner: Vec

    Note over Scanner: Check drift against prior observations
    Scanner->>Drift: check_drift(non_conflicting_claims)
    Drift->>Store: fetch_observations_for_concept(path)
    Note over Drift: Compare current value vs prior<br/>If different -> DriftResult
    Drift-->>Scanner: Vec

    Note over Scanner: Write back novel observations as Tier 4
    Scanner->>Episteme: ingest_observations(novel_claims)
    loop For each observation
        Episteme->>Bridge: claim_to_observation(claim, key, ts)
        Note over Bridge: SourceClass::Community (Tier 4)<br/>weight: 0.3
        Bridge-->>Episteme: Assertion
        Episteme->>WAL: journal.append(serialized)
        Episteme->>Store: predicate_index("observation", hash)
    end

    opt If hosted mode enabled
        Scanner->>Hosted: push_observations(assertions)
        Hosted-->>Scanner: PushObservationsResponse
    end
```

**Key code locations:**

- Persistent path: `scanner.rs:195-325`
- LocalEpisteme::open: `local/mod.rs:44-124`
- Ingest claims: `local/store.rs:20-96`
- Ingest observations: `local/store.rs:105-165`
- Drift detection: `drift.rs:23-57`
- Hosted push: `hosted.rs:178+`

---

## What We Built (Grounded)

Aphoria has **42 built-in extractors** (`registry.rs:327` -- `BUILTIN_EXTRACTOR_COUNT: usize = 42`) that scan source code with regex patterns and produce `Observation` structs:

```rust
// types/claim.rs:7-31
pub struct Observation {
    pub concept_path: String,  // e.g., "code://rust/maxwell/hypervisor/lib/imports/firecracker"
    pub predicate: String,     // e.g., "imported"
    pub value: ObjectValue,    // Boolean(true)
    pub file: String,          // "hypervisor/src/lib.rs"
    pub line: usize,           // 24
    pub matched_text: String,  // "use firecracker_sdk::..."
    pub confidence: f32,       // 1.0
    pub description: String,   // "Module imports firecracker"
}
```

We ran this on Maxwell and got 67 "claims" with zero noise. We celebrated. Then we looked at the output and asked: **what is the claim being made here?**

The answer is: there is no claim. `imported: true` is an index entry. No one will ever assert `imported: false`. There's no conflict to resolve, no lens needed, no reason to store this in an append-only Merkle DAG. It's `grep "use firecracker"` with extra steps.

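Two fields of `Observation` drive most of the pipeline traced above: `concept_path` plus `predicate` form the `ConceptIndex` lookup key, and `confidence` selects the tier. The sketch below illustrates both under stated assumptions and is not the code in `concept_index.rs` or `bridge.rs`. The `Tier` enum, the function names, the `/` used to join the two tail segments, and the `"enabled"` predicate are all hypothetical; only the "last 2 path segments + `::` + predicate" rule and the 0.9 threshold come from this document.

```rust
/// Hypothetical stand-in for the two tiers scanner output can map to.
#[derive(Debug, PartialEq)]
pub enum Tier {
    Tier4,
    Tier5,
}

/// Builds a lookup key from the last two path segments plus the predicate.
/// Joining the two segments with '/' is an assumption made for this sketch.
pub fn make_key(concept_path: &str, predicate: &str) -> String {
    // Splitting on '/' leaves an empty piece after "scheme:", so drop empties.
    let segments: Vec<&str> = concept_path
        .split('/')
        .filter(|s| !s.is_empty())
        .collect();
    // saturating_sub keeps short paths (fewer than two segments) from panicking.
    let start = segments.len().saturating_sub(2);
    format!("{}::{}", segments[start..].join("/"), predicate)
}

/// Confidence-based tier mapping: >= 0.9 -> Tier 4, otherwise Tier 5.
pub fn tier_for_confidence(confidence: f32) -> Tier {
    if confidence >= 0.9 {
        Tier::Tier4
    } else {
        Tier::Tier5
    }
}
```

Under this rule, `make_key("code://rust/app/tls/cert_verification", "enabled")` and `make_key("rfc://5246/tls/cert_verification", "enabled")` produce the same key, which is why a code observation can collide with a corpus assertion at all. An observation such as `imported: true` has no corpus entry under its key, so the lookup never yields anything to compare against.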
### Verified Against Code

| Extractor | File | Predicate Used | What It Actually Produces |
|-----------|------|---------------|--------------------------|
| `import_graph` | `extractors/import_graph.rs` | `"imported"` with `Boolean(true)` | grep for `use` statements |
| `derive_pattern` | `extractors/derive_pattern.rs` | `"derives"` with `Text("Clone,Debug")` | AST metadata extraction |
| `const_declarations` | `extractors/const_declarations.rs` | `"value"` with literal value | copy of the source line |
| `unsafe_atomic` | `extractors/unsafe_atomic.rs` | `"pattern"` with `Text("SeqCst")` | grep for `Ordering::` |

None of these can conflict. None need lenses. None benefit from Episteme's architecture.

---

## What a Real Claim Looks Like

After the scan, we wrote [claims-explained.md](../../claims-explained.md) by hand for Maxwell. That document contains actual claims. Compare:

**What Aphoria produces** (`unsafe_atomic` extractor, `extractors/unsafe_atomic.rs`):

```
Subject: "code://rust/maxwell/core/wallet/atomics/ordering"
Predicate: "pattern"
Value: "SeqCst"
```

**What a human wrote:**

> "All wallet atomic operations MUST use SeqCst to prevent double-spend race conditions. Weakening to Relaxed or Acquire/Release is a correctness bug."

**What Episteme expects** (from `stemedb-core/src/types/assertion.rs`):

```
Subject: "maxwell/wallet/atomics/ordering"
Predicate: "required_ordering"
Value: "SeqCst"
Source: Safety analysis by lead developer
Authority: Tier 3 (Expert) -- with real evidence
Evidence: "AtomicU64 balance requires sequential consistency to prevent double-spend. See wallet ADR-003."
Parent: None (original assertion)
Epoch: Some("maxwell-v1.0")
```

More examples from the same scan:

**Aphoria says:** `core/thermal/const/rapl_power_unit = 0x606`
**The claim is:** "Intel MSR register address for reading CPU power units. Sourced from Intel SDM Vol 4. If this changes, either the code is wrong or targeting different hardware."

**Aphoria says:** `wallet/type/wallet/derives = Debug`
**The claim is:** "Wallet MUST NOT derive Clone because singleton ownership is a safety invariant. Wallet contains AtomicU64 -- cloning it creates divergent state."

**Aphoria says:** `vsock/message/agentmessage/derives = Clone,Debug,Deserialize,Serialize`
**The claim is:** "All vsock message types MUST derive Serialize+Deserialize because they cross the VM boundary via bincode. If serde appears in core imports, internal types are leaking into the wire protocol."

The difference: observations describe **what is**. Claims describe **what must be and why**. Claims have provenance, consequences, and can conflict with each other.

---

## The Fundamental Gap (Code-Grounded)

Episteme is a knowledge graph for conflicting claims with lineage and resolution. Aphoria uses it as a document store for scan results.

The `bridge.rs` conversion (`bridge.rs:45-92`) forces observations into the Assertion schema:

| Assertion Field | What Episteme Expects | What bridge.rs Provides | Code Reference |
|----------------|----------------------|------------------------|----------------|
| `source_hash` | Hash of source document (RFC, paper) | `blake3(file + line + matched_text)` | `bridge.rs:107-113` |
| `source_class` | Tiered authority (0=Regulatory...4=Community) | Always `SourceClass::Expert` (Tier 3) for claims | `bridge.rs:25` |
| `source_metadata` | `{journal, DOI, author, standard}` | `{file, line, matched_text, scan_tool, scan_version}` | `bridge.rs:52-58` |
| `parent_hash` | Links to superseded assertion | Always `None` | `bridge.rs:79` |
| `epoch` | Paradigm context (e.g., "post-quantum") | Always `None` | `bridge.rs:89` |
| `lifecycle` | Pending -> Review -> Approved | Always `LifecycleStage::Approved` (skips review) | `bridge.rs:85` |
| `evidence` | Provenance chain, ADR references | Not present in `Observation` at all | `types/claim.rs:7-31` |

**We're using a Mercedes as a shopping cart.**

### Partial Mitigation Already Exists

`claim_to_observation()` (`bridge.rs:36-42`) creates Tier 4 (Community) assertions for write-back. But this is only used in the `--sync` path for drift detection -- the default `claim_to_assertion()` still uses Tier 3.

---

## What the Workflow Should Be

### Target: Commit-Time Claim Authoring

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Skill as Aphoria Skill (.claude/skills/)
    participant Graph as Episteme Knowledge Graph
    participant Scanner as aphoria scan (audit mode)
    participant Report as Claims-Explained View

    Note over Dev: Developer commits code
    Dev->>Skill: Review diff
    Skill->>Skill: Identify claimable changes
    Note over Skill: Claimable = new constants from specs,<br/>ordering changes, boundary crossings,<br/>derive changes on serialized types<br/><br/>NOT claimable = renamed variables,<br/>whitespace, internal refactors
    Skill->>Graph: Look up existing claims for context
    Graph-->>Skill: Related claims (if any)

    alt Diff contradicts existing claim
        Skill->>Dev: "This contradicts claim X. Fix code or supersede claim?"
        Dev->>Skill: Decision + evidence
        Skill->>Graph: Create superseding claim (parent_hash = old claim)
    else New claimable pattern
        Skill->>Dev: "This looks claimable. Author a claim?"
        Dev->>Skill: Provenance + invariant + consequence
        Skill->>Graph: Submit authored claim with lineage
    end

    Note over Skill: Create extractor for audit
    Skill->>Scanner: Register extractor paired with claim

    Note over Scanner: Later: Audit runs
    Scanner->>Graph: For each claim, verify code matches
    Graph-->>Scanner: Expected values
    Scanner->>Scanner: Extractor output vs claim
    Scanner-->>Report: PASS / CONFLICT / DRIFT
    Report->>Report: Auto-generate claims-explained.md
```

### Audit Flow: Two Directions

**Direction 1: Scan code, check against claims** (what Aphoria partially does today)

```mermaid
sequenceDiagram
    participant Scanner as aphoria audit
    participant Extractors as ExtractorRegistry
    participant Code as Source Files
    participant Graph as Episteme (Claims)
    participant Report as Audit Report

    Scanner->>Code: Walk project files
    Scanner->>Extractors: extract_all(file) -> Vec

    loop For each Observation
        Scanner->>Graph: lookup_claim(observation.subject, observation.predicate)
        alt Claim exists
            alt observation.value == claim.value
                Scanner->>Report: PASS (code matches claim)
            else observation.value != claim.value
                Scanner->>Report: CONFLICT (code contradicts claim)
                Note over Report: Score by authority tier,<br/>apply lenses for resolution
            end
        else No claim exists
            Scanner->>Report: REVIEW ("should this be a claim?")
        end
    end
```

**Direction 2: Walk claims, verify in code** (does not exist today)

```mermaid
sequenceDiagram
    participant Scanner as aphoria audit --verify-claims
    participant Graph as Episteme (Claims)
    participant Extractors as Paired Extractors
    participant Code as Source Files
    participant Report as Audit Report

    Scanner->>Graph: List all authored claims
    Graph-->>Scanner: Vec

    loop For each Claim
        Scanner->>Extractors: Find extractor paired with this claim
        alt Extractor exists
            Extractors->>Code: Run extractor on relevant files
            Code-->>Extractors: Vec
            alt Observation matches claim
                Scanner->>Report: PASS
            else Observation contradicts claim
                Scanner->>Report: CONFLICT
            end
            alt No observation found (code deleted?)
                Scanner->>Report: MISSING (claimed pattern not found)
            end
        else No paired extractor
            Scanner->>Report: UNCHECKED (no extractor for this claim)
        end
    end

    Note over Report: Catches:<br/>- Deleted code (claim says X exists, it doesn't)<br/>- Drifted values (claim says 0x606, code says 0x607)<br/>- Unenforced policies (claim says "no tokio in core")
```

---

## Extracted Claims from This Document

The following claims were extracted using the `extract-claims` skill pattern. Each is testable against the current codebase.

### Architecture Claims (Verified)

| ID | Claim | Verification Status | Code Reference |
|----|-------|-------------------|----------------|
| VG-001 | Aphoria has 42 built-in extractors | VERIFIED | `registry.rs:327` -- `BUILTIN_EXTRACTOR_COUNT: usize = 42` |
| VG-002 | `import_graph` extractor uses predicate `"imported"` with `Boolean(true)` | VERIFIED | `import_graph.rs` -- only produces `imported: true` |
| VG-003 | `unsafe_atomic` extractor uses predicate `"pattern"` | VERIFIED | `unsafe_atomic.rs` -- uses generic `"pattern"` predicate |
| VG-004 | `bridge.rs` default path uses `SourceClass::Expert` (Tier 3) | VERIFIED | `bridge.rs:25` -- `claim_to_assertion()` calls with `SourceClass::Expert` |
| VG-005 | `bridge.rs` always sets `parent_hash: None` | VERIFIED | `bridge.rs:79` |
| VG-006 | `bridge.rs` always sets `epoch: None` | VERIFIED | `bridge.rs:89` |
| VG-007 | `bridge.rs` always sets `lifecycle: LifecycleStage::Approved` | VERIFIED | `bridge.rs:85` |
| VG-008 | `source_metadata` contains `{file, line, matched_text, scan_tool, scan_version}` only | VERIFIED | `bridge.rs:52-58` |
| VG-009 | `Observation` has no evidence/provenance field | VERIFIED | `types/claim.rs:7-31` -- only has location, value, confidence |
| VG-010 | `claim_to_observation()` uses Tier 4 (Community) | VERIFIED | `bridge.rs:36-42` |
| VG-011 | Extractor trait has no mechanism to receive claims for verification | ✅ **CLOSED** | `traits.rs:68-107` -- `verifiable_predicates()` method added, 10 extractors declare predicates |

### Gap Claims (What Doesn't Exist)

| ID | Claim | Gap |
|----|-------|-----|
| VG-020 | `Observation` type exists and is properly named | ✅ **CLOSED** -- `ExtractedClaim` renamed to `Observation` in Phase A1 |
| VG-021 | A real `Claim` type should exist with provenance, invariant, consequence, authority | No such type exists anywhere |
| VG-022 | Extractors should be paired with claims they verify | ✅ **CLOSED** -- `verifiable_predicates()` added to `Extractor` trait; 10 extractors declare predicates; `compute_extractor_claim_map()` in verify.rs; `aphoria verify map` shows coverage |
| VG-023 | `aphoria audit` command should exist | No audit subcommand in CLI |
| VG-024 | Claims should support supersession via `parent_hash` | `parent_hash` is always `None` |
| VG-025 | `aphoria claims list` / `aphoria claims explain` should exist | No claims subcommand |
| VG-026 | Corpus should be emergent from community patterns, not hardcoded | ✅ **CLOSED** -- Community corpus builder queries StemeDB pattern aggregates; hardcoded.rs deleted; bootstrap via wiki import or Trust Packs |
| VG-027 | Conflict resolution should use Episteme lenses | No lens invoked during scan |
| VG-028 | Direction 2 audit (walk claims, verify code) doesn't exist | ✅ **CLOSED** -- `aphoria verify run` walks claims and checks code |
| VG-029 | Skill should be primary claim authoring interface | No `.claude/skills/aphoria` skill exists |

---

## What Needs to Change

### 1. Claims are authored, not extracted

Extractors don't produce claims. Humans (assisted by the Aphoria skill) produce claims. Extractors produce **observations** that are checked against claims.

The type system should reflect this:

```rust
// CURRENT (types/claim.rs:7-31) - Phase A1 COMPLETE
// Already exists as Observation (was ExtractedClaim before A1)
pub struct Observation {
    pub concept_path: String,
    pub predicate: String,
    pub value: ObjectValue,
    pub file: String,
    pub line: usize,
    pub matched_text: String,
    pub confidence: f32,
    pub description: String,
}

// TARGET: New Claim type (does not exist today)
pub struct AuthoredClaim {
    pub concept_path: String,
    pub predicate: String,
    pub value: ObjectValue,
    pub provenance: String,          // Where did this come from? (Intel SDM, RFC, ADR)
    pub invariant: String,           // What must remain true?
    pub consequence: String,         // What breaks if violated?
    pub authority_tier: SourceClass, // Tier 0-4
    pub evidence_chain: Vec,         // References to supporting documents
    pub parent_hash: Option,         // Supersedes which claim?
    pub epoch: Option,               // Paradigm context
}
```

### 2. The skill is the primary interface, not the scanner

The `.claude/skills/aphoria` skill should be the main way claims enter the system. It:

- Understands the project's claim vocabulary
- Reviews diffs for claimable changes
- Looks up existing claims for context
- Helps author claims with proper lineage
- Submits them as real Episteme assertions

The scanner (`aphoria scan`) becomes the audit tool -- it verifies that code matches claims, not the other way around.

### 3. Extractors serve the audit, not the authoring

The `Extractor` trait (`traits.rs:68-94`) needs to change:

```rust
// CURRENT: Extractors produce observations from thin air
pub trait Extractor: Send + Sync {
    fn name(&self) -> &str;
    fn languages(&self) -> &[Language];
    fn extract(&self, segments: &[String], content: &str, lang: Language, file: &str) -> Vec<Observation>;
}

// TARGET: Extractors can also verify observations against claims
pub trait Extractor: Send + Sync {
    fn name(&self) -> &str;
    fn languages(&self) -> &[Language];
    fn extract(&self, segments: &[String], content: &str, lang: Language, file: &str) -> Vec<Observation>;

    /// Claims this extractor can verify (empty = observation-only extractor)
    fn verifiable_claims(&self) -> &[&str] { &[] }

    /// Verify a specific claim against extracted observations
    fn verify(&self, claim: &AuthoredClaim, observations: &[Observation]) -> VerifyResult {
        VerifyResult::Unchecked
    }
}
```

### 4. The corpus should be proper assertions

Today, RFC/OWASP knowledge is built procedurally in `corpus.rs:33-157`. The `ConflictingSource::extract_citation()` in `types/claim.rs:89-111` already handles `rfc://` and `owasp://` URI schemes -- the infrastructure for proper corpus assertions partially exists.

Target: corpus data stored as real Episteme assertions with proper lineage, not rebuilt every scan.

### 5. The claims-explained.md pattern should be the product

The workflow that produces it:

```mermaid
flowchart TD
    A[aphoria scan] -->|produces| B[Observations]
    B -->|skill identifies| C{Claimable?}
    C -->|yes| D[Developer authors claim<br/>with skill assistance]
    C -->|no| E[Discard / log as observation]
    D -->|submit| F[Episteme Knowledge Graph]
    F -->|future scans| G[aphoria audit checks<br/>code against claims]
    G -->|generates| H[claims-explained.md<br/>auto-generated from graph]
    F -->|new observations| I{Matches existing claim?}
    I -->|yes, same value| J[PASS]
    I -->|no, different value| K[CONFLICT]
    I -->|claim about deleted code| L[MISSING]
```

---

## Proposed Extractors for Audit Flow

These extractors don't exist today. They're needed to close the gap between observations and claims.

### Self-Audit Extractors (Meta)

These extractors audit Aphoria's own code to verify the claims in this document remain true:

| Extractor Name | What It Verifies | Pattern |
|---------------|-----------------|---------|
| `bridge_source_class_audit` | `bridge.rs` default tier assignment | Regex for `SourceClass::Expert` in `claim_to_assertion` |
| `bridge_parent_hash_audit` | Whether `parent_hash` is always `None` | Regex for `parent_hash: None` in bridge |
| `bridge_lifecycle_audit` | Whether lifecycle skips review | Regex for `LifecycleStage::Approved` without Pending |
| `extractor_trait_audit` | Whether Extractor trait accepts claims | Check trait definition for claim parameter |
| `type_naming_audit` | Whether `ExtractedClaim` has been renamed | Grep for `struct ExtractedClaim` vs `struct Observation` |

### Claim-Paired Extractors (Project-Specific)

These are examples of what extractor-claim pairs look like for a project like Maxwell:

| Claim | Extractor | Verification |
|-------|-----------|-------------|
| "Wallet atomics MUST use SeqCst" | `unsafe_atomic` (exists) | Check all `Ordering::` in wallet/ are `SeqCst` |
| "Wallet MUST NOT derive Clone" | `derive_pattern` (exists) | Check `#[derive(` on Wallet struct excludes `Clone` |
| "vsock types MUST derive Serialize+Deserialize" | `derive_pattern` (exists) | Check all structs in vsock/ derive both |
| "RAPL_POWER_UNIT MUST be 0x606" | `const_declarations` (exists) | Check const value matches Intel SDM |
| "Core modules MUST NOT import tokio" | `import_graph` (exists) | Check no `use tokio` in core/ |

The existing extractors can already produce the observations needed. What's missing is the **claim** to compare against and the **pairing mechanism** to connect them.

### Declarative Extractor Examples

Using the existing `DeclarativeExtractor` system (`extractors/declarative/`), claim-paired extractors can be defined in `aphoria.toml`:

```toml
[[extractors.declarative]]
name = "wallet_seqcst_policy"
description = "Wallet atomics must use SeqCst ordering"
languages = ["rust"]
pattern = 'Ordering::(Relaxed|AcqRel|Acquire|Release)'
claim.subject = "policy/wallet/atomics/ordering"
claim.predicate = "forbidden_ordering"
claim.value = { type = "boolean", value = true }
confidence = 0.95
source = { claim_id = "wallet-seqcst-001", authority = "safety-analysis" }

[[extractors.declarative]]
name = "core_no_tokio_policy"
description = "Core modules must not import tokio"
languages = ["rust"]
pattern = 'use tokio'
claim.subject = "policy/core/imports/tokio"
claim.predicate = "forbidden_import"
claim.value = { type = "boolean", value = true }
confidence = 0.95
source = { claim_id = "arch-boundary-001", authority = "architecture-decision" }
```

---

## The Path Forward

### Phase 1: Distinguish observations from claims

- [x] Rename `ExtractedClaim` to `Observation` in `types/claim.rs` ✅ **COMPLETE (Phase A1)**
- [ ] Create `AuthoredClaim` type with provenance, invariant, consequence, authority, evidence_chain
- [ ] Update `bridge.rs` default path to use Tier 4/5 (not Tier 3) for scanner output
- [ ] Add `evidence` field to `source_metadata` in bridge

### Phase 2: Build the authoring workflow

- [ ] Create `.claude/skills/aphoria` skill for claim authoring
- [ ] Add `aphoria claims create` CLI command
- [ ] Add `aphoria claims update` with `parent_hash` supersession
- [ ] Add `aphoria claims list` and `aphoria claims explain`
- [ ] Store authored claims as proper Episteme assertions with lineage

### Phase 3: Pair extractors with claims

- [ ] Extend `Extractor` trait with `verifiable_claims()` and `verify()` methods
- [ ] Add `aphoria audit` command (both directions)
- [ ] Map each existing extractor to claims it can verify
- [ ] Flag observations without matching claims as "should this be a claim?"

### Phase 4: Make the corpus first-class

- [x] Community corpus queries StemeDB for pattern aggregates (not hardcoded)
- [x] Wiki import (`aphoria corpus import --from-wiki`) for bootstrap
- [x] Trust Packs store assertions in StemeDB (not TOML files)
- [ ] Wire up Authority Lens for conflict resolution

### Phase 5: The flywheel

- [ ] More claims authored per commit
- [ ] Better audit coverage (extractors verify more claims)
- [ ] Skill learns from authored claims what's claimable
- [ ] Claims-explained documentation auto-generates from knowledge graph
- [ ] New team members read claims to understand WHY, not just WHAT

---

## Summary

We built a good code scanner. We didn't build a knowledge graph client. The extractors work well at finding patterns. But finding patterns isn't the point -- understanding what those patterns mean, why they must be that way, and what breaks if they change is the point.

The Maxwell claims-explained.md proves the concept works. Every one of those 67 observations becomes valuable when paired with provenance and invariants. The gap is that today a human has to write that context by hand.

Close the gap by making the skill -- not the scanner -- the primary interface, and by treating claims as authored artifacts with lineage rather than regex output with a fancy name.

---

## Appendix: Claim Extraction Summary

This document contains **94 extractable claims** across **52 unique subjects**:

- **11 architecture claims**: Verified against current code (all confirmed true)
- **10 gap claims**: Define what doesn't exist yet
- **5 bridge.rs claims**: Code-verifiable, confirmed (source_hash faked, source_class hardcoded, parent_hash ignored, epoch ignored, evidence empty)
- **15 phase-plan claims**: Define specific deliverables and tasks
- **20+ workflow claims**: Define the target authoring/audit model
- **5 claimability rules**: What counts as claimable in a diff (spec constants=yes, ordering changes=yes, boundary crossings=yes, derive changes on serialized types=yes, renamed variables=no)
- **4 Maxwell examples**: Real claims about SeqCst ordering, Wallet derives, vsock serialization, RAPL_POWER_UNIT

~~The most critical engineering gap: **no extractor currently has the ability to verify against existing claims**.~~

**CLOSED (2026-02-08):** The `Extractor` trait now includes `verifiable_predicates()` returning `(tail_path, predicate)` pairs. 10 extractors declare their predicates. `compute_extractor_claim_map()` matches claims against extractors (with wildcard support). `aphoria verify map` shows coverage. Direction 2 audit (walk claims, verify code) is now implemented via `aphoria verify run`.
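
The pairing described above can be sketched roughly as follows. This is an illustration under assumptions, not the contents of `verify.rs`: `ClaimRef`, `ExtractorDecl`, `path_matches`, and `claim_coverage` are hypothetical names, the claim IDs and declared paths are invented for the example, and the wildcard rule (a bare `*`, or a trailing `/*` matching anything under a prefix) is a guess at what "wildcard support" means. The document only states that extractors declare `(tail_path, predicate)` pairs and that claims are matched against them.

```rust
/// Hypothetical, reduced view of an authored claim: only what pairing needs.
pub struct ClaimRef {
    pub id: &'static str,
    pub tail_path: &'static str,
    pub predicate: &'static str,
}

/// Hypothetical extractor descriptor: a name plus its declared
/// (tail_path pattern, predicate) pairs.
pub struct ExtractorDecl {
    pub name: &'static str,
    pub verifiable: Vec<(&'static str, &'static str)>,
}

/// Assumed wildcard rule: "*" matches any path; "prefix/*" matches any path
/// strictly below `prefix`; anything else must match exactly.
pub fn path_matches(pattern: &str, tail_path: &str) -> bool {
    if pattern == "*" {
        return true;
    }
    match pattern.strip_suffix("/*") {
        Some(prefix) => tail_path
            .strip_prefix(prefix)
            .map_or(false, |rest| rest.starts_with('/')),
        None => pattern == tail_path,
    }
}

/// For each claim, list the extractors whose declarations cover it.
/// An empty list corresponds to the UNCHECKED outcome in the Direction 2 audit.
pub fn claim_coverage<'a>(
    claims: &'a [ClaimRef],
    extractors: &'a [ExtractorDecl],
) -> Vec<(&'a str, Vec<&'a str>)> {
    claims
        .iter()
        .map(|claim| {
            let names: Vec<&str> = extractors
                .iter()
                .filter(|ext| {
                    ext.verifiable.iter().any(|(pattern, predicate)| {
                        *predicate == claim.predicate
                            && path_matches(*pattern, claim.tail_path)
                    })
                })
                .map(|ext| ext.name)
                .collect();
            (claim.id, names)
        })
        .collect()
}
```

A coverage listing like this is the kind of information `aphoria verify map` is described as showing: claims with at least one paired extractor can be checked by `aphoria verify run`, and claims with none surface as unchecked rather than silently passing.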