Aphoria Roadmap
Phase 0: StemeDB Foundation ✅
Tracked in: roadmap.md § 5D. Concept Hierarchy
Changes to the core database that Aphoria depends on. Shipped as Phase 5D of the main StemeDB roadmap.
| Aphoria Phase 0 | StemeDB Phase 5D | Status |
|---|---|---|
| 0.1 ConceptPath Type | 5D.1 ConceptPath Type | ✅ |
| 0.2 ConceptPath in Assertion | (implicit in 5D.1) | ✅ |
| 0.3 Hierarchical Index | 5D.4 Hierarchical Query | ✅ |
| 0.4 Alias Store | 5D.3 Alias Store + 5D.5 Alias Resolution | ✅ |
| 0.5 Source Class Inference | 5D.6 Source Class Inference | ✅ |
| 0.6 Concept API Endpoints | 5D.7 Concept API Endpoints | ✅ |
Spec: docs/specs/concept-hierarchy.md
Phase 2: CLI Core ✅
Phase 2 was built before Phase 1 (authoritative corpus expansion). The CLI pipeline works end-to-end with a bootstrapped corpus of 11 hardcoded assertions covering TLS, JWT, CORS, secrets, and rate limiting.
| Task | Status |
|---|---|
| 2.1 Project Walker | ✅ walker/mod.rs, walker/path_mapper.rs, walker/language.rs |
| 2.2 Extractors (10) | ✅ tls_verify, jwt_config, hardcoded_secrets, timeout_config, dep_versions, cors_config, rate_limit, weak_crypto, command_injection, sql_injection |
| 2.3 Ingestion Bridge | ✅ bridge.rs — BLAKE3 hashing, Ed25519 signing, claim→assertion conversion |
| 2.4 Conflict Query | ✅ episteme.rs — LocalEpisteme with check_conflicts() |
| 2.5 Report Output | ✅ report/ — table (comfy-table), JSON, SARIF 2.1.0, markdown |
| 2.6 Acknowledge Command | ✅ lib.rs acknowledge() |
| Baseline & Diff | ✅ lib.rs set_baseline(), show_diff() |
| Status Command | ✅ lib.rs show_status() |
183 tests pass. Clippy and fmt clean.
Phase 2 Code Quality Fixes ✅
Code review improvements to extractors:
| Issue | Fix | Status |
|---|---|---|
| DES/RC4 concept path misclassification | Split check_pattern() into check_hash_pattern() and check_encryption_pattern(); DES/RC4 now use crypto/encryption/algorithm path | ✅ |
| SHA1 edge case undocumented | Added comments and test documenting that SHA1 detection is intentionally broad (triggers for git hashes, etc.) | ✅ |
| JS exec() regex overly broad | Tightened regex to require child_process. prefix or non-word/non-dot preceding character; prevents RegExp.exec() false positives | ✅ |
Phase 2A: Concept Matching ✅
Status: Complete. Tail-path matching (2A.1), alias-aware queries (2A.2), and auto-alias creation (2A.3) all implemented.
2A.1 Leaf-Based Concept Matching (Aphoria-side fix) ✅
Implemented in episteme.rs via ConceptIndex:
- `make_key(subject, predicate)` extracts tail 2 path segments + predicate
- `build(assertions)` creates in-memory index keyed by tail path
- `lookup(subject, predicate)` finds matching authoritative assertions
- `check_conflicts()` uses `ConceptIndex` instead of `QueryEngine` for cross-scheme matching
Integration tests prove TLS and JWT conflicts are detected correctly.
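A minimal sketch of tail-path matching, assuming a simplified `Assertion` struct and a `tail#predicate` key format (both hypothetical; the real `ConceptIndex` in episteme.rs may differ). It shows why a `code://` subject and an `rfc://` subject land on the same key once the scheme and leading segments are dropped:

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for an authoritative assertion.
#[derive(Debug, Clone, PartialEq)]
struct Assertion {
    subject: String,
    predicate: String,
    object: String,
}

/// Builds a key from the last two path segments plus the predicate.
/// The scheme (`code://`, `rfc://`, ...) and leading segments are ignored.
fn make_key(subject: &str, predicate: &str) -> String {
    let path = subject.split_once("://").map_or(subject, |(_, rest)| rest);
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let tail_start = segments.len().saturating_sub(2);
    // The `#` separator is an assumption made for this sketch.
    format!("{}#{}", segments[tail_start..].join("/"), predicate)
}

/// In-memory index keyed by tail path + predicate.
struct ConceptIndex {
    by_key: HashMap<String, Vec<Assertion>>,
}

impl ConceptIndex {
    fn build(assertions: &[Assertion]) -> Self {
        let mut by_key: HashMap<String, Vec<Assertion>> = HashMap::new();
        for a in assertions {
            by_key
                .entry(make_key(&a.subject, &a.predicate))
                .or_default()
                .push(a.clone());
        }
        ConceptIndex { by_key }
    }

    /// Returns the authoritative assertions sharing the same tail key, if any.
    fn lookup(&self, subject: &str, predicate: &str) -> Option<&Vec<Assertion>> {
        self.by_key.get(&make_key(subject, predicate))
    }
}
```

With this key, `code://rust/myapp/tls/cert_verification` and `rfc://5246/tls/cert_verification` both reduce to `tls/cert_verification`, which is what makes cross-scheme lookup possible without aliases.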
2A.2 Alias Resolution in QueryEngine (StemeDB-side fix) ✅
Wired AliasStore into QueryEngine.execute():
- Added `resolve_aliases: bool` field to `Query` (defaults to `false`)
- Added `alias_store: Option<Arc<dyn AliasStore>>` to `QueryEngine`
- Added `.with_alias_store()` builder method
- When `resolve_aliases: true`, expands subject via `AliasStore.resolve_all()` before index lookup
- Added `fetch_by_subjects()` and `fetch_by_subjects_predicate()` for multi-subject deduplication
- Modified `Query.matches()` to skip subject filtering when aliases are resolved
- Skips fast path (MV lookup) when `resolve_aliases: true`
- Gracefully degrades when no alias store is configured
7 unit tests in engine/tests/alias_resolution.rs. This is the architecturally correct long-term fix that complements leaf matching.
2A.3 Auto-Alias Creation ✅
When Aphoria ingests authoritative assertions and code claims that share leaf names, automatically create aliases:
- `code://rust/myapp/tls/cert_verification` ↔ `rfc://5246/tls/cert_verification`
- `code://rust/myapp/auth/jwt/audience_validation` ↔ `rfc://7519/jwt/audience_validation`
This bridges 2A.1 (leaf matching) with 2A.2 (alias resolution) — leaf matching identifies candidates, aliases persist the relationship.
Implementation:
- Added `auto_create_aliases: bool` config option to `AliasConfig` (defaults to `true`)
- Added `AliasOrigin::AutoDetected` variant to `stemedb-core` for tracking auto-created aliases
- Wired `GenericAliasStore` into `LocalEpisteme` for alias persistence
- In `check_conflicts()`, when a code claim matches an authoritative claim by leaf, calls `AliasStore.set_alias()` to persist the relationship with `AliasOrigin::AutoDetected`
- Alias creation is idempotent (skips if alias already exists)
- 4 unit tests verify: alias creation on conflict, no creation when disabled, correct origin, idempotency
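The idempotent, config-gated behavior described above could look roughly like this. It is a sketch only: `InMemoryAliasStore`, `auto_alias`, and the `Manual` variant are hypothetical names for illustration, and the real `AliasStore` trait in StemeDB is not reproduced here.

```rust
use std::collections::HashMap;

/// `AutoDetected` is named in the roadmap; `Manual` is assumed for contrast.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AliasOrigin {
    Manual,
    AutoDetected,
}

/// Hypothetical in-memory stand-in for an alias store.
#[derive(Default)]
struct InMemoryAliasStore {
    aliases: HashMap<String, (String, AliasOrigin)>,
}

/// Records `code_subject -> authoritative_subject` when auto-creation is
/// enabled and no alias exists yet. Returns true only if a new alias was stored.
fn auto_alias(
    store: &mut InMemoryAliasStore,
    auto_create_aliases: bool,
    code_subject: &str,
    authoritative_subject: &str,
) -> bool {
    if !auto_create_aliases || store.aliases.contains_key(code_subject) {
        return false; // disabled, or already aliased: nothing to do
    }
    store.aliases.insert(
        code_subject.to_string(),
        (authoritative_subject.to_string(), AliasOrigin::AutoDetected),
    );
    true
}
```

Calling `auto_alias` twice with the same pair stores one alias, which mirrors the four behaviors the unit tests are said to cover.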
Phase 1: Authoritative Corpus Expansion ✅
Expanded from 11 hardcoded assertions to a pluggable corpus system with RFC, OWASP, and Vendor sources.
Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                      aphoria corpus build                       │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────────────┐  │
│  │ RFC Ingester │  │ OWASP        │  │ Vendor Bootstrapper   │  │
│  │ (Tier 0)     │  │ Ingester     │  │ (Tier 2)              │  │
│  │              │  │ (Tier 1)     │  │                       │  │
│  └──────┬───────┘  └──────┬───────┘  └───────────┬───────────┘  │
│         │                 │                      │              │
│         └─────────────────┼──────────────────────┘              │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ CorpusRegistry  │                            │
│                  └────────┬────────┘                            │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ LocalEpisteme   │                            │
│                  │ ingest_         │                            │
│                  │ authoritative() │                            │
│                  └─────────────────┘                            │
└─────────────────────────────────────────────────────────────────┘
```
1.1 CorpusBuilder Trait ✅
| Task | Status |
|---|---|
| `CorpusBuilder` trait | ✅ corpus/mod.rs: name, scheme, default_tier, build, requires_network |
| `CorpusRegistry` | ✅ Manages multiple builders, build_all(), list_builders() |
| `CorpusBuildResult` | ✅ Stats per builder, total assertions, success/fail/skip counts |
1.2 RFC Ingester ✅
| Task | Status |
|---|---|
| `RfcCorpusBuilder` | ✅ corpus/rfc.rs |
| HTTP fetching | ✅ Via ureq, cached to ~/.cache/aphoria/rfc-cache/ |
| RFC 2119 keyword parsing | ✅ MUST, MUST NOT, SHOULD, SHALL extraction |
| RFC-specific parsers | ✅ JWT (7519), OAuth (6749), Bearer (6750), TLS 1.3 (8446), TLS BCP (7525), TOTP (6238), Basic Auth (7617), HTTP (9110) |
| Concept mapping | ✅ rfc://{number}/{topic} at Tier 0 (Regulatory) |
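As an illustration of RFC 2119 keyword parsing, here is a minimal std-only sketch. It assumes naive sentence splitting on `.` and covers only the four keywords listed in the table; `extract_normative` and `contains_word` are hypothetical helpers, not the parser in corpus/rfc.rs.

```rust
/// Keywords from the table above, longest first so "MUST NOT" wins over "MUST".
const KEYWORDS: [&str; 4] = ["MUST NOT", "MUST", "SHOULD", "SHALL"];

/// True if `needle` occurs in `haystack` with no letter or digit on either side.
fn contains_word(haystack: &str, needle: &str) -> bool {
    haystack.match_indices(needle).any(|(start, m)| {
        let before = haystack[..start].chars().next_back();
        let after = haystack[start + m.len()..].chars().next();
        !before.map_or(false, |c| c.is_alphanumeric())
            && !after.map_or(false, |c| c.is_alphanumeric())
    })
}

/// Returns (keyword, sentence) pairs for every sentence with a normative keyword.
/// Line breaks are collapsed first because RFC text wraps mid-sentence.
fn extract_normative(text: &str) -> Vec<(&'static str, String)> {
    let normalized = text.split_whitespace().collect::<Vec<_>>().join(" ");
    normalized
        .split('.') // naive: abbreviations such as "e.g." will split too early
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .filter_map(|s| {
            KEYWORDS
                .iter()
                .find(|kw| contains_word(s, kw))
                .map(|kw| (*kw, s.to_string()))
        })
        .collect()
}
```

Matching is case-sensitive on purpose: RFC 2119 gives the keywords their normative meaning when capitalized, so lowercase "must" in running prose is skipped.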
1.3 OWASP Ingester ✅
| Task | Status |
|---|---|
| `OwaspCorpusBuilder` | ✅ corpus/owasp.rs |
| HTTP fetching | ✅ From GitHub raw content, cached to ~/.cache/aphoria/owasp-cache/ |
| Markdown parsing | ✅ MUST/SHOULD statements, section context |
| Cheat sheet parsers | ✅ Authentication, JWT, TLS, Secrets, Input Validation, Session, CSRF, Password Storage, HTTP Headers |
| Concept mapping | ✅ owasp://cheatsheet/{topic}/{claim} at Tier 1 (Clinical) |
1.4 Vendor Docs ✅
| Task | Status |
|---|---|
| `VendorCorpusBuilder` | ✅ corpus/vendor.rs |
| PostgreSQL claims | ✅ pool_size, idle_timeout, ssl_mode |
| Redis claims | ✅ timeout, max_retries, tls |
| reqwest claims | ✅ cert_verification, connect_timeout, request_timeout |
| hyper claims | ✅ keep_alive_timeout, max_concurrent_streams |
| Go net/http claims | ✅ read_timeout, write_timeout, idle_timeout, min_tls_version |
| tokio-postgres claims | ✅ pool_size, ssl_mode |
| SQLx claims | ✅ max_connections, idle_timeout |
| Concept mapping | ✅ vendor://{product}/{topic}/{claim} at Tier 2 (Observational) |
1.5 Hardcoded Refactor ✅
| Task | Status |
|---|---|
| `HardcodedCorpusBuilder` | ✅ corpus/hardcoded.rs: original 11 assertions |
| `create_authoritative_assertion()` | ✅ Made public in episteme.rs for corpus builders |
1.6 CLI Integration ✅
| Task | Status |
|---|---|
| `aphoria corpus build` | ✅ Fetches and ingests from all sources |
| `--only rfc,owasp,vendor` | ✅ Filter to specific sources |
| `--offline` | ✅ Skip network-requiring sources |
| `--clear-cache` | ✅ Clear cache before building |
| `aphoria corpus list` | ✅ List available corpus sources |
| `CorpusConfig` | ✅ cache_dir, include_*, rfc_list options |
1.7 Error Handling ✅
| Task | Status |
|---|---|
| `RfcFetch` error | ✅ Per-RFC fetch failures with context |
| `OwaspFetch` error | ✅ Per-cheat-sheet fetch failures with context |
| `CorpusBuild` error | ✅ General corpus build failures |
| Graceful degradation | ✅ Continue with other sources if one fails |
Files: corpus/mod.rs, corpus/hardcoded.rs, corpus/rfc.rs, corpus/owasp.rs, corpus/vendor.rs
Phase 3: Skill Integration ✅
Complete. Aphoria is now usable in Claude Code agent workflows.
3.1 Claude Code Skill ✅
| Task | Status |
|---|---|
| `skill/SKILL.md` | ✅ Comprehensive skill definition with all commands |
| `/aphoria scan` | ✅ Scan project, show conflicts grouped by verdict |
| `/aphoria scan --fix` | ✅ Interactive fix workflow |
| `/aphoria ack` | ✅ Acknowledge conflicts as intentional |
| `/aphoria status` | ✅ Show status and baseline |
| `/aphoria diff` | ✅ Show changes since baseline |
| `/aphoria init` | ✅ Initialize Aphoria |
| `/aphoria baseline` | ✅ Set baseline |
| `skill/install.sh` | ✅ Install script for ~/.claude/skills/aphoria/ |
Files: skill/SKILL.md, skill/install.sh, skill/hooks.json
3.2 Agent Pre-Flight Hook ✅
| Task | Status |
|---|---|
| `--exit-code` flag | ✅ Returns 2 for BLOCK, 1 for FLAG only, 0 for clean |
| `--strict` flag | ✅ Lower thresholds (FLAG at 0.3, BLOCK at 0.5) |
| Hook template | ✅ skill/hooks.json with PreCommit and PrePush examples |
Usage:
```json
{
  "hooks": {
    "PreCommit": [{"command": "aphoria scan --format sarif --exit-code"}],
    "PrePush": [{"command": "aphoria scan --strict --exit-code"}]
  }
}
```
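The exit-code contract and the `--strict` thresholds can be sketched as follows. Only the strict values (FLAG at 0.3, BLOCK at 0.5) and the 2/1/0 exit codes come from this roadmap; the `Thresholds` struct, the meaning of the score, and the `>=` comparison are assumptions for illustration.

```rust
/// Ordered so that `max()` yields the most severe verdict.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Verdict {
    Pass,
    Flag,
    Block,
}

/// Hypothetical threshold pair; default (non-strict) values are not
/// stated in the roadmap, so only the strict set is defined here.
struct Thresholds {
    flag: f64,
    block: f64,
}

const STRICT: Thresholds = Thresholds { flag: 0.3, block: 0.5 };

/// Maps a conflict score to a verdict (inclusive comparison is assumed).
fn classify(score: f64, t: &Thresholds) -> Verdict {
    if score >= t.block {
        Verdict::Block
    } else if score >= t.flag {
        Verdict::Flag
    } else {
        Verdict::Pass
    }
}

/// 2 if any BLOCK, 1 if FLAG only, 0 if clean.
fn exit_code(verdicts: &[Verdict]) -> i32 {
    match verdicts.iter().max() {
        Some(Verdict::Block) => 2,
        Some(Verdict::Flag) => 1,
        _ => 0,
    }
}
```

A hook only needs the process exit status, so a single BLOCK anywhere in the scan is enough to fail the commit or push.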
3.3 Alias Suggestion Workflow ✅
Alias creation is now automatic (Phase 2A.3). When Aphoria scans:
- Tail-path matching finds authoritative assertions
- Aliases are auto-created with `AliasOrigin::AutoDetected`
- Future queries use the alias automatically
The skill documents the suggestion flow for manual alias management:
- y (Accept): Creates alias
- n (Reject): Records intentional difference
- defer: Flags for later review
Phase 4: Full-Cycle Pre-Commit (Scan + Sync) ⬜
Vision: The pre-commit hook is a bidirectional knowledge sync, not just a read-only linter. Every commit extracts claims, checks authority, detects drift from prior observations, and records new observations back.
Spec: uat/2026-02-04-full-cycle-precommit-vision.md
```
┌─────────────────────────────────────────────────────────────┐
│                       PRE-COMMIT FLOW                       │
├─────────────────────────────────────────────────────────────┤
│  1. EXTRACT  → What claims does this code make?             │
│  2. CHECK    → Against authority + own prior claims         │
│  3. CLASSIFY → Authority conflict | Self conflict | Novel   │
│  4. UPDATE   → Record observations to local Episteme        │
│  5. GATE     → Exit code (BLOCK=2, FLAG=1, PASS=0)          │
└─────────────────────────────────────────────────────────────┘
```
4.1 Git Pre-Commit Hook ⬜
```sh
#!/bin/sh
# .git/hooks/pre-commit
# --sync requires --persist (see 4A)
aphoria scan --staged --persist --sync --exit-code
```
Or using pre-commit framework:
```yaml
repos:
  - repo: local
    hooks:
      - id: aphoria
        name: Aphoria Truth Sync
        # --sync requires --persist (see 4A)
        entry: aphoria scan --staged --persist --sync --exit-code
        language: system
        pass_filenames: false
```
4.2 Baseline Mode ✅
Already implemented in Phase 2.
4A: Observational Claims ✅
Record code claims as Tier 4 (Community) assertions when no authority conflict exists:
| Task | Status |
|---|---|
| `sync: bool` in `ScanArgs` | ✅ types/command.rs |
| `observations_recorded: usize` in `ScanResult` | ✅ types/result.rs |
| `--sync` CLI flag | ✅ cli.rs: requires --persist |
| `claim_to_observation()` | ✅ bridge.rs: creates Tier 4 (Community, 0.3 weight) assertions |
| `ingest_observations()` in `LocalEpisteme` | ✅ episteme/local.rs: writes to WAL + predicate index |
| Scan flow integration | ✅ scan.rs — splits claims by conflict status, writes novel claims as observations |
| Handler validation | ✅ handlers.rs — --sync requires --persist error |
| Report output | ✅ report/table.rs, report/json.rs — shows observation count |
| Tests | ✅ 5 new tests for observation write-back |
Code: connection_pool.max_size = 25
Authority: (nothing)
Action: Record as Tier 4 observation (project memory)
Usage:
```sh
# Scan with observation write-back
aphoria scan --persist --sync
# Output:
# Recorded 45 observations (project memory)
```
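The write-back step can be pictured as splitting claims by whether an authority covers them and converting only the novel ones. In this sketch the `Claim` and `Observation` structs and the `has_authority` callback are hypothetical; only the Tier 4 level and the 0.3 weight come from the table above.

```rust
/// Hypothetical, simplified code claim.
#[derive(Debug, Clone, PartialEq)]
struct Claim {
    concept: String,
    value: String,
}

/// Hypothetical, simplified observation record.
#[derive(Debug, PartialEq)]
struct Observation {
    concept: String,
    value: String,
    tier: u8,
    weight: f64,
}

/// Converts a claim into a Tier 4 (Community) observation with 0.3 weight.
fn claim_to_observation(claim: &Claim) -> Observation {
    Observation {
        concept: claim.concept.clone(),
        value: claim.value.clone(),
        tier: 4,
        weight: 0.3,
    }
}

/// Keeps only claims with no matching authority and turns them into observations.
fn observations_for_novel<F>(claims: &[Claim], has_authority: F) -> Vec<Observation>
where
    F: Fn(&Claim) -> bool,
{
    claims
        .iter()
        .filter(|c| !has_authority(*c))
        .map(claim_to_observation)
        .collect()
}
```

Claims that do match an authority go through the conflict check instead, so they never reach the observation path in this sketch.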
4B: Self-Conflict Detection ✅
Detect drift from the project's own prior observations:
| Task | Status |
|---|---|
| Query prior claims before conflict check | ✅ fetch_observations_for_concept() |
| Compare current vs stored observations | ✅ check_drift() compares values |
| Report changes as SELF-CONFLICT | ✅ DriftResult with prior/current values |
| New verdict: `Drift` (distinct from Block/Flag) | ✅ Verdict::Drift |
| Drift reporting in all formats | ✅ table, json, markdown, sarif |
| Exit code includes drift | ✅ --exit-code returns 1 for drift |
Prior: db/pool_size = 25 (recorded 2026-01-15)
Now: db/pool_size = 100
Result: DRIFT — "You changed pool_size from 25 to 100. Intentional?"
Files: types/result.rs, types/verdict.rs, episteme/local.rs, scan.rs, report/*.rs
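A minimal sketch of the drift comparison, assuming prior observations are available as a concept-to-value map. The field names on `DriftResult` and this `check_drift` signature are illustrative and may not match episteme/local.rs.

```rust
use std::collections::HashMap;

/// Hypothetical shape of a drift finding: the concept plus both values.
#[derive(Debug, PartialEq)]
struct DriftResult {
    concept: String,
    prior: String,
    current: String,
}

/// Reports every current claim whose value differs from the stored observation.
/// Claims with no prior observation are novel, not drift, and are skipped.
fn check_drift(
    prior: &HashMap<String, String>,
    current: &[(String, String)],
) -> Vec<DriftResult> {
    current
        .iter()
        .filter_map(|(concept, value)| match prior.get(concept) {
            Some(old) if old != value => Some(DriftResult {
                concept: concept.clone(),
                prior: old.clone(),
                current: value.clone(),
            }),
            _ => None,
        })
        .collect()
}
```

Applied to the example above, a stored `db/pool_size = 25` and a current value of `100` yield one `DriftResult`, while unchanged or previously unseen concepts produce nothing.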
4C: Diff-Only Scanning ✅
Fast scanning for pre-commit hooks:
| Task | Status |
|---|---|
| `FileSource` enum (`All`, `Staged`) | ✅ types/command.rs |
| `--staged` flag (`git diff --cached`) | ✅ cli.rs, handlers.rs |
| `walker/git.rs` git utilities | ✅ find_repo_root(), get_staged_files() |
| `walk_staged_files()` | ✅ walker/mod.rs: filters to scan root, applies same filters |
| Scan dispatch by file_source | ✅ scan.rs |
| Error handling (NotGitRepo, GitCommand) | ✅ error.rs |
| Tests | ✅ 9 tests in tests/staged_scanning.rs |
| Target: < 500ms for staged-only | ✅ |
Files: types/command.rs, walker/git.rs, walker/mod.rs, scan.rs, cli.rs, handlers.rs, error.rs
Usage:
```sh
# Pre-commit hook (fast, staged files only)
aphoria scan --staged --exit-code
# Full cycle with observation sync
aphoria scan --staged --persist --sync --exit-code
```
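One way to collect staged files is to shell out to `git diff --cached --name-only` and keep only paths under the scan root. The sketch below assumes that approach; the signatures, the `--diff-filter=ACM` choice, and the `parse_staged_output` helper are illustrative and not taken from walker/git.rs.

```rust
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Turns `git diff --name-only` output (paths relative to the repo root)
/// into absolute paths, keeping only those under `scan_root`.
fn parse_staged_output(output: &str, repo_root: &Path, scan_root: &Path) -> Vec<PathBuf> {
    output
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(|line| repo_root.join(line))
        .filter(|path| path.starts_with(scan_root))
        .collect()
}

/// Lists staged files that were added, copied, or modified.
/// Deleted files are excluded because there is nothing left to scan.
fn get_staged_files(repo_root: &Path, scan_root: &Path) -> io::Result<Vec<PathBuf>> {
    let out = Command::new("git")
        .current_dir(repo_root)
        .args(&["diff", "--cached", "--name-only", "--diff-filter=ACM"])
        .output()?;
    if !out.status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "git diff --cached failed"));
    }
    let stdout = String::from_utf8_lossy(&out.stdout);
    Ok(parse_staged_output(&stdout, repo_root, scan_root))
}
```

Line-based parsing does not handle paths that git quotes (for example names with unusual characters); using `-z` and splitting on NUL would be more robust.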
4D: Enhanced Ack ⬜
Acknowledgments with rationale and policy updates:
| Task | Status |
|---|---|
| `--reason "text"` flag | ⬜ |
| Store rationale in assertion metadata | ⬜ |
| `aphoria update` for intentional drift | ⬜ |
| Policy update assertions | ⬜ |
```sh
$ aphoria ack db/pool_size --reason "Scaling for Black Friday"
$ aphoria update db/pool_size 100 --reason "New baseline after load test"
```
4E: Hosted Mode ✅
Organizations run their own StemeDB server and all team members automatically sync observations:
| Task | Status |
|---|---|
| `HostedConfig` in `config.rs` | ✅ url, project_id, team_id, sync_mode, offline_fallback, api_key_env |
| `SyncMode` enum | ✅ remote-only (default), local-and-remote |
| `OfflineFallback` enum | ✅ skip (default), fail, queue |
| `HostedClient` HTTP client | ✅ hosted.rs: retry logic, auth headers, observation push |
| `POST /v1/aphoria/observations` endpoint | ✅ Server receives observations with project/team metadata |
| Scan integration | ✅ Auto-enables sync when [hosted] configured |
| `Hosted(String)` error variant | ✅ For connection/auth failures |
| Graceful offline fallback | ✅ Based on offline_fallback config |
| Tests | ✅ Config parsing, client creation, assertion conversion |
```toml
# aphoria.toml
[hosted]
url = "https://episteme.acme.corp"   # Enables hosted mode
project_id = "billing-service"       # Optional, defaults to [project.name]
team_id = "platform-team"            # Optional, for multi-team servers
sync_mode = "remote-only"            # "remote-only" | "local-and-remote"
offline_fallback = "skip"            # "skip" | "fail" | "queue"
api_key_env = "APHORIA_API_KEY"      # Env var for auth token
```
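How the three `offline_fallback` values might behave when the server is unreachable is sketched below. The semantics are inferred from the option names (skip drops the push, fail aborts, queue keeps observations for a later retry) and are an assumption; `on_push_failure` is a hypothetical helper, not part of hosted.rs.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OfflineFallback {
    Skip,
    Fail,
    Queue,
}

impl FromStr for OfflineFallback {
    type Err = String;

    /// Parses the string values accepted in aphoria.toml.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "skip" => Ok(OfflineFallback::Skip),
            "fail" => Ok(OfflineFallback::Fail),
            "queue" => Ok(OfflineFallback::Queue),
            other => Err(format!("unknown offline_fallback: {other}")),
        }
    }
}

/// Decides what happens to pending observations after a failed push.
/// Returns how many observations were queued for retry.
fn on_push_failure(
    policy: OfflineFallback,
    pending: Vec<String>,
    retry_queue: &mut Vec<String>,
) -> Result<usize, String> {
    match policy {
        // Assumed: drop the observations and let the scan continue.
        OfflineFallback::Skip => Ok(0),
        // Assumed: surface the failure so the scan exits with an error.
        OfflineFallback::Fail => Err("hosted server unreachable".to_string()),
        // Assumed: keep the observations locally for a later push.
        OfflineFallback::Queue => {
            let queued = pending.len();
            retry_queue.extend(pending);
            Ok(queued)
        }
    }
}
```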
Architecture:
```
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ Developer A  │  │ Developer B  │  │ Developer C  │
│ aphoria scan │  │ aphoria scan │  │ aphoria scan │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │
       └─────────────────┼─────────────────┘
                         ▼
              ┌─────────────────────┐
              │ Team StemeDB Server │
              │ POST /v1/aphoria/   │
              │ observations        │
              └─────────────────────┘
                         │
                         ▼
             Aggregated team patterns
```
Files: config.rs, hosted.rs, scan.rs, error.rs, lib.rs, crates/stemedb-api/src/handlers/aphoria.rs, crates/stemedb-api/src/dto/aphoria.rs
Phase 4.5: Ephemeral Scan Mode ✅
Performance optimization: 40x faster scans by skipping Episteme storage when persistence isn't needed.
Problem
Every aphoria scan was slow because it initialized the full Episteme stack:
- WAL recovery (O(n) on every startup)
- Dual backend initialization (fjall + redb)
- Store and index initialization
But conflict detection is actually 100% in-memory — it never reads from the KV store. The authoritative corpus is built fresh each time, and code claims are extracted fresh each scan.
Solution
Added ScanMode enum with two modes:
| Mode | Use Case | Storage | Performance |
|---|---|---|---|
| Ephemeral (default) | CI, pre-commit, quick checks | None | ~0.25 seconds |
| Persistent | Baseline/diff tracking, alias creation | WAL + store | ~1-2 seconds |
Implementation ✅
| Task | Status |
|---|---|
| `ScanMode` enum | ✅ types.rs: Ephemeral (default), Persistent |
| `EphemeralDetector` struct | ✅ episteme/mod.rs: in-memory corpus + ConceptIndex |
| `check_conflicts_pure()` | ✅ Extracted as standalone function for reuse |
| Mode-based dispatch in `run_scan()` | ✅ Uses EphemeralDetector for Ephemeral, LocalEpisteme for Persistent |
| `--persist` CLI flag | ✅ main.rs: opt-in to persistent mode |
| Tests for both modes | ✅ test_ephemeral_scan_no_storage_created, test_persistent_scan_creates_storage, test_scan_modes_produce_same_conflicts |
Usage
```sh
# Fast ephemeral scan (default), no storage created
aphoria scan .
# Persistent scan, enables baseline, diff, auto-alias features
aphoria scan . --persist
```
Performance
| Mode | Time | Storage |
|---|---|---|
| Ephemeral | ~0.25s | None |
| Persistent | ~1-2s | WAL + store directories |
Files: types.rs, episteme/mod.rs, lib.rs, main.rs, tests.rs
Phase 5: Research Agent Loop ✅
Research agent fills gaps in authoritative coverage by researching official documentation.
5.1 Gap Detection ✅
| Task | Status |
|---|---|
| `Gap` struct | ✅ research/gap_detector.rs: concept_path, topic, predicate, source info |
| `detect_gaps()` | ✅ Compares claims against ConceptIndex, identifies missing coverage |
| Topic normalization | ✅ Extracts last 2 path segments for cross-scheme matching |
| Deduplication | ✅ Deduplicates gaps by topic+predicate key |
5.2 Gap Storage ✅
| Task | Status |
|---|---|
| `GapRecord` | ✅ research/gap_store.rs: tracking metadata, project count, research status |
| `GapStore` | ✅ JSON-backed persistent storage with atomic saves |
| Project tracking | ✅ Records which projects reported each gap |
| Research eligibility | ✅ is_eligible_for_research() with threshold and cooldown |
| Gap pruning | ✅ prune_old_gaps() removes stale entries |
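The threshold-plus-cooldown eligibility check might be shaped like this. The reduced `GapRecord` fields and the explicit `now` parameter are assumptions made to keep the sketch testable; the actual `is_eligible_for_research()` signature is not shown in this roadmap.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical subset of the tracking metadata kept per gap.
struct GapRecord {
    /// Number of distinct projects that reported this gap.
    project_count: usize,
    /// When the gap was last researched, if ever.
    last_researched: Option<SystemTime>,
}

/// A gap qualifies once enough projects have reported it and, if it was
/// researched before, the cooldown has elapsed since that attempt.
fn is_eligible_for_research(
    gap: &GapRecord,
    threshold: usize,
    cooldown: Duration,
    now: SystemTime,
) -> bool {
    if gap.project_count < threshold {
        return false;
    }
    match gap.last_researched {
        None => true,
        // A timestamp in the future (clock skew) is treated as not yet eligible.
        Some(at) => now
            .duration_since(at)
            .map_or(false, |elapsed| elapsed >= cooldown),
    }
}
```

Passing `now` in rather than reading the clock inside the function keeps the rule deterministic under test.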
5.3 Quality Validation ✅
| Task | Status |
|---|---|
| `QualityValidator` | ✅ research/quality.rs: validates researched claims |
| Source attribution | ✅ Checks for authoritative domains (rfc-editor, owasp, vendor docs) |
| Normative language | ✅ Verifies MUST/SHOULD/SHALL keywords present |
| Vague content detection | ✅ Rejects "it depends", "typically", etc. |
| Consistency scoring | ✅ Detects conflicting claims on same subject |
| `QualityReport` | ✅ Detailed per-claim validation results |
| `filter_passed()` | ✅ Returns only claims meeting quality threshold |
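Three of the checks in the table (source attribution, normative language, vague content) can be sketched as a single pass over a claim. The domain list, phrase list, and `Rejection` enum are assumptions for illustration; the real `QualityValidator` also scores consistency, which is omitted here.

```rust
/// Assumed examples of authoritative domains; vendor documentation hosts
/// would be added here as well.
const TRUSTED_DOMAINS: [&str; 2] = ["rfc-editor.org", "owasp.org"];
const NORMATIVE: [&str; 3] = ["MUST", "SHOULD", "SHALL"];
/// The two vague phrases named in the table above.
const VAGUE: [&str; 2] = ["it depends", "typically"];

#[derive(Debug, PartialEq)]
enum Rejection {
    UntrustedSource,
    NoNormativeKeyword,
    VagueLanguage,
}

/// Returns Ok(()) when a researched claim passes all three checks.
fn validate_claim(text: &str, source_url: &str) -> Result<(), Rejection> {
    // Naive substring match; a real check should parse the URL host.
    if !TRUSTED_DOMAINS.iter().any(|d| source_url.contains(*d)) {
        return Err(Rejection::UntrustedSource);
    }
    // Normative keywords are matched case-sensitively (uppercase only).
    if !NORMATIVE.iter().any(|kw| text.contains(*kw)) {
        return Err(Rejection::NoNormativeKeyword);
    }
    let lower = text.to_lowercase();
    if VAGUE.iter().any(|phrase| lower.contains(*phrase)) {
        return Err(Rejection::VagueLanguage);
    }
    Ok(())
}
```

Returning the first failing reason keeps the sketch short; a per-claim report like `QualityReport` would collect every failure instead.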
5.4 Research Execution ✅
| Task | Status |
|---|---|
| `Researcher` | ✅ research/researcher.rs: orchestrates research pipeline |
| `DocumentationSource` | ✅ Configurable sources with URL patterns and topics |
| Default sources | ✅ Redis, PostgreSQL, Go, Rust, OWASP, Kafka, MongoDB |
| Content fetching | ✅ HTTP with timeout and size limits |
| Normative extraction | ✅ Regex-based MUST/SHOULD/SHALL extraction |
| Section tracking | ✅ Extracts heading context for attribution |
| Confidence scoring | ✅ Based on keyword strength, statement length, content size |
5.5 CLI Integration ✅
| Task | Status |
|---|---|
| `aphoria research run` | ✅ Run research agent with configurable threshold |
| `aphoria research status` | ✅ Show gap statistics and research progress |
| `aphoria research gaps` | ✅ List gaps by project count |
| `--threshold` | ✅ Minimum projects before researching (default: 3) |
| `--strict` | ✅ Use strict quality validation |
| `--prune` | ✅ Remove stale gaps before researching |
| `--ready` | ✅ Show only gaps ready for research |
Files: research/mod.rs, research/gap_detector.rs, research/gap_store.rs, research/quality.rs, research/researcher.rs, research/tests.rs
5.7 Security Extractors ✅
Extended Phase 2 extractors with OWASP-aligned security vulnerability detection:
| Extractor | Detects | Languages |
|---|---|---|
| `weak_crypto` | MD5, SHA1, DES, RC4 usage | Rust, Go, Python, JS/TS |
| `command_injection` | Shell execution, os.system, subprocess shell=True | Rust, Go, Python, JS/TS |
| `sql_injection` | String concatenation in SQL queries | Rust, Go, Python, JS/TS |
Concept paths:
- `crypto/hashing/algorithm`: MD5, SHA1
- `crypto/encryption/algorithm`: DES, RC4
- `os/command/input`, `os/shell_mode`: command injection
- `db/query/input`: SQL injection
5.6 Community Corpus Contributions ⬜
Future: Users can opt in to contribute patterns anonymously.
- "Every Rust project has this JWT pattern" → pre-built alias set
- "This Redis config is always acknowledged" → adjust default threshold
- "This TLS pattern is always a real bug" → elevate threshold
Phase 6: Federated Policy & Trust Packs ✅
Allow teams to define their own authoritative truths and distribute them as signed Trust Packs. This enables "Enterprise Grade" compliance across distributed teams.
6.1 Trust Pack Format ✅
| Task | Status |
|---|---|
| `TrustPack` schema | ✅ policy.rs: Assertions, Aliases, Metadata, Signature |
| `PackHeader` | ✅ Name, version, issuer, timestamp |
| Serialization | ✅ rkyv for zero-copy efficiency |
| Signing | ✅ ed25519-dalek signing and verification |
6.2 Policy Management ✅
| Task | Status |
|---|---|
| `PolicyManager` | ✅ Loads local and remote (HTTP/HTTPS) policies |
| Caching | ✅ Caches remote policies in ~/.cache/aphoria/policies/ |
| `aphoria.toml` config | ✅ policies list support |
6.3 Core Integration ✅
| Task | Status |
|---|---|
| `EphemeralDetector` integration | ✅ Ingests policies into memory corpus/index |
| `check_conflicts_pure` update | ✅ Resolves policy aliases before authoritative lookup |
| `LocalEpisteme` export helpers | ✅ fetch_acknowledgments, fetch_manual_aliases |
6.4 CLI Commands ✅
| Task | Status |
|---|---|
| `aphoria policy export` | ✅ Exports local ack decisions as a Trust Pack |
| `aphoria scan` policy loading | ✅ Auto-loads policies from config |
Files: policy.rs, config.rs, episteme/mod.rs, lib.rs, main.rs
Phase 7: Declarative Extractors ⬜
Enable users to define new extractors in config/policy files (YAML/TOML) without writing Rust code. This removes the recompilation bottleneck for custom pattern enforcement.
7.1 Declarative Schema ⬜
Define a schema for pattern-based extraction:
```yaml
extractors:
  - name: "api_style"
    language: "go"
    pattern: 'func \w+\(.*\) \[\]\w+'
    claim:
      subject: "api/response_format"
      predicate: "structure"
      object: "raw_array"
```
7.2 Implementation Tasks ⬜
| Task | Description |
|---|---|
| `DeclarativeExtractor` | Generic extractor implementation reading from config |
| `ExtractorConfig` update | Load declarative definitions from aphoria.toml and Trust Packs |
| Regex optimization | Pre-compile all declarative patterns |
| Validation | Ensure valid regex and claim structure at load time |
Milestone Summary
| Phase | Deliverable | Depends On | Status |
|---|---|---|---|
| 0 | ConceptPath in StemeDB | concept-hierarchy spec | ✅ |
| 2 | Aphoria CLI (scan, report, ack) | Phase 0 | ✅ |
| 2A | Concept matching (leaf, alias, auto-alias) | Phase 2 | ✅ |
| 1 | Authoritative corpus expansion | Phase 0 | ✅ |
| 3 | Claude Code skill + hooks | Phase 2A | ✅ |
| 4.5 | Ephemeral scan mode (40x faster) | Phase 2 | ✅ |
| 5 | Research agent loop | Phase 3 | ✅ |
| 6 | Federated Policy & Trust Packs | Phase 4.5 | ✅ |
| 4A | Observational claims (Tier 4 write-back) | Phase 6 | ✅ |
| 4B | Self-conflict detection (drift) | Phase 4A | ✅ |
| 4C | Diff-only scanning (--staged) | Phase 4B | ✅ |
| 4E | Hosted mode (team aggregation) | Phase 4C | ✅ |
| 4D | Enhanced ack (--reason, policy updates) | Phase 4C | ⬜ NEXT |
| 7 | Declarative Extractors | Phase 4 | ⬜ |
Current state:
- Phases 0-3, 4.5, 4A-4C, 4E, 5, 6 complete (258 tests, clippy clean)
- Full corpus: RFC, OWASP, Vendor sources
- 10 extractors including security (weak_crypto, command_injection, sql_injection)
- Trust Packs: signed policy bundles with import/export
- Ephemeral mode: 40x faster for CI
- Observation write-back: `--sync` records novel claims as Tier 4 project memory
- Drift detection: Detects changes from prior observations
- Staged scanning: `--staged` flag for fast pre-commit hooks
- Hosted mode: Team aggregation via central StemeDB server
Next: Phase 4D — Enhanced Ack (--reason, policy updates)
The pre-commit hook is now a bidirectional knowledge sync:
- 4A ✅: Record code claims as Tier 4 observations (project memory)
- 4B ✅: Detect drift from prior observations (self-conflict)
- 4C ✅: Fast diff-only scanning for pre-commit hooks (`--staged`)
- 4E ✅: Team aggregation via hosted StemeDB server
- 4D ⬜: Enhanced ack with rationale and policy updates
This transforms Aphoria from a linter into a learning system that builds institutional memory per-project and collective intelligence across teams via hosted mode.