# Aphoria Roadmap Archive

> Completed phases moved from `roadmap.md`. Full implementation details preserved in git history.

---

## Phase 0: StemeDB Foundation ✅

ConceptPath type, hierarchical index, alias store, source class inference, concept API endpoints. All shipped as Phase 5D of the main StemeDB roadmap.

**Spec:** [docs/specs/concept-hierarchy.md](../../docs/specs/concept-hierarchy.md)

---

## Phase 2: CLI Core ✅

End-to-end CLI pipeline with 10 extractors and bootstrapped corpus of 11 hardcoded assertions.

| Task | Status |
|------|--------|
| 2.1 Project Walker | ✅ `walker/mod.rs`, `walker/path_mapper.rs`, `walker/language.rs` |
| 2.2 Extractors (10) | ✅ `tls_verify`, `jwt_config`, `hardcoded_secrets`, `timeout_config`, `dep_versions`, `cors_config`, `rate_limit`, `weak_crypto`, `command_injection`, `sql_injection` |
| 2.3 Ingestion Bridge | ✅ `bridge.rs` — BLAKE3 hashing, Ed25519 signing, claim→assertion conversion |
| 2.4 Conflict Query | ✅ `episteme.rs` — LocalEpisteme with check_conflicts() |
| 2.5 Report Output | ✅ `report/` — table (comfy-table), JSON, SARIF 2.1.0, markdown |
| 2.6 Acknowledge Command | ✅ `lib.rs` acknowledge() |
| Baseline & Diff | ✅ `lib.rs` set_baseline(), show_diff() |
| Status Command | ✅ `lib.rs` show_status() |

### Phase 2 Code Quality Fixes ✅

- DES/RC4 concept path misclassification: Split into `check_hash_pattern()` and `check_encryption_pattern()`
- SHA1 edge case: Documented as intentionally broad
- JS exec() regex: Tightened to require `child_process.` prefix

---

## Phase 2A: Concept Matching ✅

- **2A.1 Leaf-Based Matching**: `ConceptIndex` with tail-path matching (last 2 segments + predicate)
- **2A.2 Alias Resolution**: Wired `AliasStore` into `QueryEngine.execute()` with `resolve_aliases: bool`
- **2A.3 Auto-Alias Creation**: Auto-creates aliases when code and authority share leaf names

---

## Phase 1: Authoritative Corpus Expansion ✅

Expanded from 11 hardcoded assertions to pluggable corpus system.

- **1.1 CorpusBuilder Trait** ✅ — name, scheme, default_tier, build, requires_network
- **1.2 RFC Ingester** ✅ — HTTP fetching, RFC 2119 keyword parsing, 8 RFC-specific parsers
- **1.3 OWASP Ingester** ✅ — GitHub raw content, 9 cheat sheet parsers
- **1.4 Vendor Docs** ✅ — PostgreSQL, Redis, reqwest, hyper, Go net/http, tokio-postgres, SQLx
- **1.5 Hardcoded Refactor** ✅ — Original 11 assertions as `HardcodedCorpusBuilder`
- **1.6 CLI Integration** ✅ — `aphoria corpus build/list`, `--only`, `--offline`, `--clear-cache`
- **1.7 Error Handling** ✅ — Per-source graceful degradation

**Files:** `corpus/mod.rs`, `corpus/hardcoded.rs`, `corpus/rfc.rs`, `corpus/owasp.rs`, `corpus/vendor.rs`

---

## Phase 3: Skill Integration ✅

- **3.1 Claude Code Skill** ✅ — `/aphoria scan`, `scan --fix`, `ack`, `status`, `diff`, `init`, `baseline`
- **3.2 Agent Pre-Flight Hook** ✅ — `--exit-code` (2=BLOCK, 1=FLAG, 0=clean), `--strict`
- **3.3 Alias Suggestion** ✅ — Auto-alias creation from Phase 2A.3

---

## Phase 4: Full-Cycle Pre-Commit (Scan + Sync) ✅

Bidirectional knowledge sync: extract → check → classify → update → gate.

- **4A Observational Claims** ✅ — `--sync` records novel claims as Tier 4 observations
- **4B Self-Conflict Detection** ✅ — Drift detection with `Verdict::Drift`
- **4C Diff-Only Scanning** ✅ — `--staged` for fast pre-commit hooks
- **4D Enhanced Ack** ✅ — `--reason`, `aphoria update` for policy changes
- **4E Hosted Mode** ✅ — Team aggregation via central StemeDB server, `HostedClient`

---

## Phase 4.5: Ephemeral Scan Mode ✅

40x faster scans by skipping Episteme storage. Default mode ~0.25s, persistent ~1-2s.

- `ScanMode` enum (Ephemeral default, Persistent opt-in with `--persist`)
- `EphemeralDetector` — in-memory corpus + ConceptIndex
- `check_conflicts_pure()` extracted as standalone function

---

## Phase 5: Research Agent Loop ✅

- **5.1 Gap Detection** ✅ — `detect_gaps()` compares claims against ConceptIndex
- **5.2 Gap Storage** ✅ — JSON-backed persistent storage with eligibility tracking
- **5.3 Quality Validation** ✅ — Source attribution, normative language, vague content detection
- **5.4 Research Execution** ✅ — HTTP fetching, normative extraction, confidence scoring
- **5.5 CLI Integration** ✅ — `aphoria research run/status/gaps`
- **5.6 Community Corpus** ✅ — Opt-in anonymous pattern sharing with privacy-preserving anonymization
- **5.7 Security Extractors** ✅ — weak_crypto, command_injection, sql_injection

---

## Phase 6: Federated Policy & Trust Packs ✅

- **6.1 Trust Pack Format** ✅ — rkyv serialization, Ed25519 signing
- **6.2 Policy Management** ✅ — Local and remote loading with caching
- **6.3 Core Integration** ✅ — EphemeralDetector + LocalEpisteme policy ingestion
- **6.4 CLI Commands** ✅ — `aphoria policy export`, auto-loading

---

## Phase 6.5: Trust Pack Extensions ✅

- **6.5.1 Predicate Aliases** ✅ — `enabled` ↔ `required` ↔ `mandatory` ↔ `enforced`
- **6.5.2 Pack Signing Key Rotation** ✅ — `aphoria policy resign`, signature chain audit trail

---

## Phase 7: Declarative Extractors ✅

TOML-defined custom extractors without Rust code.

- **7.1 Core Types** ✅ — `DeclarativeExtractorDef`, `DeclarativeExtractor`
- **7.2 Configuration** ✅ — `[[extractors.declarative]]` in aphoria.toml
- **7.3 Validation** ✅ — ReDoS protection, confidence validation
- **7.4 Registry Integration** ✅ — Enable/disable, Trust Pack integration
- **7.5 Error Handling** ✅
- **7.6 Tests** ✅ — 22 unit + 7 integration tests

---

## Phase 7.5: LLM-in-the-Loop Extraction ✅

Gemini-powered semantic extraction for high-value files.

- **7.5.1 LLM Extractor** ✅ — `GeminiClient`, structured JSON output
- **7.5.2 Selective Triggering** ✅ — `is_high_value_file()`, token budget
- **7.5.3 Cost Controls** ✅ — BLAKE3 caching, budget enforcement
- **7.5.4 Configuration** ✅ — `[llm]` section in aphoria.toml

---

## Phase 7.6: Pattern Learning Store ✅

Remember patterns LLM finds for promotion to declarative extractors.

- **7.6.1 Schema** ✅ — `LearnedPattern`, `ClaimTemplate`, `ValueType`
- **7.6.2 PatternStore** ✅ — JSON-backed, RwLock thread safety, Levenshtein dedup
- **7.6.3 Normalization** ✅ — Version/boolean/number/string placeholder replacement
- **7.6.4 Configuration** ✅ — `[learning]` section
- **7.6.5 Scan Integration** ✅ — Project hash, record/update patterns

---

## Phase 7.7: Pattern → Extractor Promotion ✅

Learned patterns become declarative extractors via LLM regex generation.

- **7.7.1 Pipeline** ✅ — `PromotionPipeline`, `RegexGenerator`, `ExtractorValidator`, `YamlWriter`
- **7.7.2 Regex Generation** ✅ — Multi-example prompt, ReDoS safety
- **7.7.3 Validation** ✅ — Positive tests, timing validation
- **7.7.4 Human Review** ✅ — `aphoria extractors review/stats/candidates/promote`
- **7.7.5 Extractor Output** ✅ — YAML files in `.aphoria/extractors/learned/`

---

## Phase 7.8: LLM Prompt Evaluation ✅

Golden fixtures with precision/recall metrics and regression detection.

- **7.8.1 Fixture Format** ✅ — TOML-based with `must_contain`/`must_not_contain`
- **7.8.2 Claim Matching** ✅ — Tail-path matching, type coercion
- **7.8.3 Metrics** ✅ — Precision/Recall/F1, per-category breakdown
- **7.8.4 Harness** ✅ — Live/Cached/Mock modes, regression detection
- **7.8.5 Reports** ✅ — Table, JSON, Markdown
- **7.8.6 CLI** ✅ — `aphoria eval run/baseline/update-baseline/list-fixtures/validate-fixtures`
- **7.8.7 Seed Fixtures** ✅ — 10 fixtures across tls, jwt, secrets, auth, negative, edge

---

## Phase 8: Enterprise Extractor Improvements ✅

42 extractors total.
Enterprise-grade detection for production codebases.

- **8.1 High-Entropy Secrets** ✅ — Shannon entropy, known prefixes (AWS/Stripe/GitHub/GitLab/Slack)
- **8.2 Framework Extractors (10)** ✅ — Spring, Django, Express, Rails, ASP.NET, Laravel, FastAPI, Next.js, Flask, NestJS
- **8.3 Config Deep Parsing** ✅ — YAML/JSON/TOML tree walking, 11 security rules
- **8.4 Semantic TLS Version** ✅ — Cross-language const detection, Terraform, Kubernetes
- **8.5 ORM SQL Injection** ✅ — Django/SQLAlchemy/GORM/ActiveRecord/Prisma/Sequelize
- **8.6 Path Traversal** ✅
- **8.7 Unvalidated Redirects** ✅
- **8.8 Weak Password** ✅
- **8.9 Security Headers** ✅
- **8.10 Insecure Deserialization** ✅
- **8.11 SSRF** ✅

---

## Phase 9: Autonomous Extractor Generation ✅

Fully self-improving extraction system.

- **9.1 Autonomous Promotion** ✅ — >0.95 confidence, >10 projects, full audit trail
- **9.2 Shadow Mode Testing** ✅ — Isolated metrics, graduation gate, FP tracking
- **9.3 Auto-Rollback** ✅ — FP rate >15% triggers automatic rollback
- **9.4 Cross-Project Learning** ✅ — Privacy-preserving pattern sync, community extractors
- **9.5 Extractor Versioning** ✅ — Changelogs, rollback, A/B comparison

---

## Phase 10.1: Acknowledgment Expiry ✅

Time-limited exceptions with `--expires` flag.

- `--expires 90d` or `--expires 2026-12-31` (ISO 8601)
- Expired acks resurface as BLOCK
- Preserved for audit trail per patent claim 25
- All report formatters show expiry info

**Files:** `src/expiry.rs`, `cli.rs`, `report/*.rs`

---

## Phase 11: Evidence-Based Authority ✅

Evidence levels (ProductSpec > Standard > Research > Commit-only) with evidence-aware graduation.

- **11.1 Types** ✅ — `EvidenceLevel`, `PatternEvidence` with ADR/spec/RFC references
- **11.2 Detection** ✅ — Commit message parsing, ADR/spec file detection
- **11.3 Graduation** ✅ — Thresholds vary by evidence (ProductSpec: 1 usage, Commit-only: 10)
- **11.4 Display** ✅ — Evidence chain in output, `--evidence` filter

**Files:** `src/evidence/mod.rs`, `evidence/types.rs`, `evidence/detection.rs`

---

## Phase 12: Knowledge Scope Hierarchy ✅

Organization → Team → Project scope levels with inheritance.

- **12.1 Scope Types** ✅ — `ScopeLevel` enum, `ScopeConfig`
- **12.2 Inheritance** ✅ — Security: no opt-out, Conventions: override with justification
- **12.3 Override Workflow** ✅ — Justification + evidence required
- **12.4 Cross-Scope Queries** ✅ — `--scope org/team/project`, `--exclude-inherited`

**Files:** `src/scope/mod.rs`, `scope/config.rs`, `scope/resolver.rs`, `scope/override_record.rs`, `scope/store.rs`

---

## Phase 13: Knowledge Lifecycle Management ✅

Active → Deprecated → Superseded → Archived lifecycle for patterns.

- **13.1 Status Types** ✅ — `KnowledgeStatus` enum with history tracking
- **13.2 Deprecation** ✅ — `aphoria deprecate` with `--reason`, `--superseded-by`, `--sunset-date`
- **13.3 Migration Guidance** ✅ — Warnings in scan output, links to replacements
- **13.4 Migration Dashboard** ✅ — `aphoria migrations status`, progress tracking, export

**Files:** `src/lifecycle/mod.rs`, `lifecycle/store.rs`, `lifecycle/migration.rs`

---

## Phase 16: Ignore & Exclusion System ✅

Clean scans by excluding test fixtures and intentional patterns.

- **16.1 Glob Patterns** ✅ — `globset` with `**`, `*`, `?` support
- **16.2 `.aphoriaignore`** ✅ — Gitignore-style patterns, merged with aphoria.toml
- **16.3 Inline Comments** ✅ — `// aphoria:ignore`, `ignore-next-line`, `ignore-block`
- **16.4 Ack Export/Import** ✅ — `.aphoria/acks.toml`, version-controllable

---

## The Self-Learning Vision (Complete)

```
Phase 7: Declarative Extractors ✅
    ↓
Phase 7.5: LLM-in-the-Loop (Gemini semantic extraction) ✅
    ↓
Phase 7.6: Pattern Learning (remember what LLM finds) ✅
    ↓
Phase 7.7: Pattern Promotion (patterns → extractors) ✅
    ↓
Phase 7.8: LLM Prompt Evaluation (measure & improve) ✅
    ↓
Phase 8: Enterprise Extractors (42 total) ✅
    ↓
Phase 9: Autonomous Generation (fully self-improving) ✅
```

## Milestone Summary (Completed)

| Phase | Deliverable | Status |
|-------|-------------|--------|
| 0 | ConceptPath in StemeDB | ✅ |
| 2 | Aphoria CLI (scan, report, ack) | ✅ |
| 2A | Concept matching (leaf, alias, auto-alias) | ✅ |
| 1 | Authoritative corpus expansion | ✅ |
| 3 | Claude Code skill + hooks | ✅ |
| 4 | Full-cycle pre-commit (sync, drift, staged, hosted) | ✅ |
| 4.5 | Ephemeral scan mode (40x faster) | ✅ |
| 5 | Research agent loop + community corpus | ✅ |
| 6 | Federated Policy & Trust Packs | ✅ |
| 6.5 | Trust Pack Extensions | ✅ |
| 7 | Declarative Extractors | ✅ |
| 7.5 | LLM-in-the-Loop Extraction | ✅ |
| 7.6 | Pattern Learning Store | ✅ |
| 7.7 | Pattern → Extractor Promotion | ✅ |
| 7.8 | LLM Prompt Evaluation | ✅ |
| 8 | Enterprise Extractors (42 total) | ✅ |
| 9 | Autonomous Extractor Generation | ✅ |
| 10.1 | Acknowledgment Expiry | ✅ |
| 11 | Evidence-Based Authority | ✅ |
| 12 | Knowledge Scope Hierarchy | ✅ |
| 13 | Knowledge Lifecycle Management | ✅ |
| 16 | Ignore & Exclusion System | ✅ |