This commit implements Phase 17 of the Aphoria roadmap, adding:

**Inline Claim Markers (@aphoria:claim):**
- New extractor for detecting inline markers in comments
- Pending markers tracked in .aphoria/pending_markers.toml
- CLI commands: list-markers, formalize-marker, reject-marker
- Support for all major comment styles (Rust, Python, SQL, etc.)
- Auto-sync during scan (configurable)

**Claim Enrichment:**
- ClaimEnrichment type with source attribution (inline, extractor, manual)
- EnrichedClaimInfo with full enrichment metadata
- Extended AuthoredClaim with optional enrichment field
- API endpoints for enriched claim queries
- Dashboard UI components (enrichment badge, verdict badge)

**Enhanced Extractor Trait:**
- verifiable_predicates() method for declaring (tail_path, predicate) pairs
- 10 security extractors now implement verifiable_predicates
- Enables claim suggester skill to find unclaimed patterns

**Documentation:**
- Phase 17 summary with complete implementation details
- Gap fixes summary documenting 8 closed vision gaps
- Updated CLI reference with new commands
- New aphoria-docs skill for documentation maintenance
- Updated roadmap with Phase 17 completion

**Integration:**
- ClaimsFile support for claim enrichment persistence
- Pattern aggregate store support for enrichment queries
- Dashboard filters and display for enrichment metadata
- API handlers for list-markers and enrichment queries

**Tests:**
- New gap_fixes_integration test suite
- Corpus enricher module with best practices ingestion

Closes: VG-005, VG-017, VG-018, VG-019, VG-020, VG-021, VG-022, VG-023

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
269 lines
9.3 KiB
Markdown
# Aphoria Roadmap

> Completed phases archived in [`roadmap-archive.md`](./roadmap-archive.md)

---

## Status Overview

| Phase | Deliverable | Status |
|-------|-------------|--------|
| 0–9, 11–13, 16–17 | Core CLI, Extractors (42), LLM, Learning, Enterprise, Lifecycle, Pattern Enrichment | ✅ Archived |
| 10 | UX & Enterprise Polish | 🔄 Partial (10.1 ✅, 10.2–10.3 ⬜) |
| 14 | Governance Workflows | 🎯 Current |
| 15 | Evidence Source Integration | ⬜ Future |
| A6 | AST-Aware Observation & Claim Verification | ⬜ Future |

### Current State

- 42 built-in extractors + declarative custom extractors
- Full corpus: RFC, OWASP, Vendor sources
- Ephemeral mode (~0.25s), persistent mode with drift detection
- Observation/claim distinction (A1–A5 complete, see main `roadmap.md`)
- `aphoria verify run|map` for claim verification
- 10 claims dogfooded in `.aphoria/claims.toml`
- Self-improving: LLM extraction → pattern learning → autonomous promotion → shadow testing → auto-rollback

---

## Phase 10: UX & Enterprise Polish (Partial)

> 10.1 Acknowledgment Expiry ✅ (archived)

### 10.2 Human-Readable Signer Names ⬜

**Impact:** MEDIUM | **Effort:** MEDIUM | **Priority:** P2

Map issuer hex IDs to human-readable team names in output.

| Task | Status |
|------|--------|
| Add `signer_name: Option<String>` to `PackHeader` | ⬜ |
| Add `contact: Option<String>` to `PackHeader` (Slack channel, email) | ⬜ |
| Update `policy export/import` to preserve new fields | ⬜ |
| Show "Signed by Platform Security Team" instead of hex in output | ⬜ |
| Backward-compat: gracefully handle packs without new fields | ⬜ |

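The fallback behaviour these tasks describe can be sketched as follows. This is a minimal illustration, not the real type: the trimmed-down `PackHeader`, its `issuer_hex` field, and the `signed_by` helper are assumptions made for the example. Only `signer_name` and `contact` come from the task list above.

```rust
/// Hypothetical, trimmed-down `PackHeader`; the real struct has more fields.
#[derive(Debug, Clone, Default)]
pub struct PackHeader {
    /// Issuer ID as a hex string (assumed representation).
    pub issuer_hex: String,
    /// Human-readable team name; `None` for packs written before the field existed.
    pub signer_name: Option<String>,
    /// Slack channel or email; optional for the same backward-compat reason.
    pub contact: Option<String>,
}

/// Builds the "Signed by ..." line, falling back to the hex ID for old packs.
pub fn signed_by(header: &PackHeader) -> String {
    let who = header
        .signer_name
        .as_deref()
        .unwrap_or(header.issuer_hex.as_str());
    match header.contact.as_deref() {
        Some(contact) => format!("Signed by {} ({})", who, contact),
        None => format!("Signed by {}", who),
    }
}
```

Keeping both new fields as `Option` means a pack without them still produces output, just with the hex ID in place of the team name.
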
### 10.3 Speed Benchmarks ⬜

**Impact:** LOW | **Effort:** LOW | **Priority:** P3

| Task | Status |
|------|--------|
| Create `benchmarks/` directory with test corpora | ⬜ |
| Add `aphoria scan --benchmark` flag for self-test | ⬜ |
| Document test conditions in benchmark results | ⬜ |

---

## Phase 14: Governance Workflows 🎯

> **Vision:** Clear approval paths for pattern promotion with audit trails.

### 14.1 Approval Workflow Definition ⬜

| Task | Status |
|------|--------|
| Create `src/governance/mod.rs` module | ⬜ |
| Define `ApprovalWorkflow` struct | ⬜ |
| Define `ApprovalStage` with required approvers | ⬜ |
| Support evidence-based auto-approve thresholds | ⬜ |
| Config: define workflows in `.aphoria.toml` | ⬜ |

### 14.2 Approval State Machine ⬜

| Task | Status |
|------|--------|
| Implement state transitions (pending → approved/rejected) | ⬜ |
| Multi-stage approval support | ⬜ |
| Timeout and escalation policies | ⬜ |
| Store approval history with timestamps | ⬜ |

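One possible shape for this state machine is sketched below. Every name here (`ApprovalState`, `Decision`, `HistoryEntry`, `Approval`) is hypothetical, and timeout and escalation are left out to keep the example short. It covers the pending → approved/rejected transitions, multi-stage approval, and timestamped history.

```rust
/// Hypothetical approval states; `Pending` carries the current stage index.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalState {
    Pending { stage: usize },
    Approved,
    Rejected,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Approve,
    Reject,
}

/// One entry in the approval history.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HistoryEntry {
    pub approver: String,
    pub decision: Decision,
    pub unix_ts: u64,
    pub resulting_state: ApprovalState,
}

pub struct Approval {
    pub state: ApprovalState,
    pub total_stages: usize,
    pub history: Vec<HistoryEntry>,
}

impl Approval {
    pub fn new(total_stages: usize) -> Self {
        Approval {
            state: ApprovalState::Pending { stage: 0 },
            total_stages,
            history: Vec::new(),
        }
    }

    /// Applies a decision. Approved and Rejected are terminal, so any
    /// further decision returns an error and leaves the history untouched.
    pub fn decide(
        &mut self,
        approver: &str,
        decision: Decision,
        unix_ts: u64,
    ) -> Result<ApprovalState, String> {
        let next = match (self.state, decision) {
            (ApprovalState::Pending { stage }, Decision::Approve) => {
                if stage + 1 >= self.total_stages {
                    ApprovalState::Approved
                } else {
                    ApprovalState::Pending { stage: stage + 1 }
                }
            }
            (ApprovalState::Pending { .. }, Decision::Reject) => ApprovalState::Rejected,
            (state, _) => return Err(format!("no transition out of {:?}", state)),
        };
        self.history.push(HistoryEntry {
            approver: approver.to_string(),
            decision,
            unix_ts,
            resulting_state: next,
        });
        self.state = next;
        Ok(next)
    }
}
```

Recording the resulting state in each history entry lets an audit view replay the timeline without re-running the transition logic.
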
### 14.3 Approval CLI ⬜

| Task | Status |
|------|--------|
| `aphoria governance pending`: list pending approvals | ⬜ |
| `aphoria governance approve <id> --comment "..."` | ⬜ |
| `aphoria governance reject <id> --reason "..."` | ⬜ |
| `aphoria governance escalate <id>` | ⬜ |
| Show approval status in pattern list | ⬜ |

### 14.4 SOC 2 Audit Trail ⬜

| Task | Status |
|------|--------|
| Full audit log for all governance actions | ⬜ |
| `aphoria audit trail --pattern <id>`: show timeline | ⬜ |
| Export governance history for auditors | ⬜ |
| Include approver identity and timestamp | ⬜ |

---

## Phase 15: Evidence Source Integration ⬜

> **Vision:** ADRs, specs, and standards automatically link to patterns.

### 15.1 ADR Auto-Detection ⬜

| Task | Status |
|------|--------|
| Create `src/evidence/adr.rs` | ⬜ |
| Detect ADR-XXX patterns in commit messages | ⬜ |
| Scan for ADR files in standard locations | ⬜ |
| Parse ADR content for related patterns | ⬜ |
| Link ADR to patterns automatically | ⬜ |

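Detecting `ADR-XXX` references in a commit message could look roughly like the sketch below. It assumes the `XXX` placeholder stands for a run of ASCII digits. The `find_ids` helper is hypothetical and uses only the standard library. The same helper would work for other prefixed IDs such as `REQ-XXX`.

```rust
/// Collects IDs of the form `<prefix>-<digits>` (for example `ADR-012`)
/// from free text, in order of first appearance and without duplicates.
pub fn find_ids(text: &str, prefix: &str) -> Vec<String> {
    let needle = format!("{}-", prefix);
    let mut found: Vec<String> = Vec::new();
    let mut offset = 0;
    while let Some(pos) = text[offset..].find(needle.as_str()) {
        let start = offset + pos;
        let digits_start = start + needle.len();
        // Count the ASCII digits that directly follow the prefix.
        let digit_count = text[digits_start..]
            .bytes()
            .take_while(|b| b.is_ascii_digit())
            .count();
        // Skip matches embedded in a longer word, such as `BADR-1`.
        let preceded_by_word = text[..start]
            .chars()
            .next_back()
            .map_or(false, |c| c.is_alphanumeric());
        if digit_count > 0 && !preceded_by_word {
            let id = text[start..digits_start + digit_count].to_string();
            if !found.contains(&id) {
                found.push(id);
            }
        }
        offset = digits_start;
    }
    found
}
```

For a message like `"Implement retry policy per ADR-012; supersedes ADR-7"`, the helper yields the two IDs in the order they appear.
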
### 15.2 Spec File Detection ⬜

| Task | Status |
|------|--------|
| Create `src/evidence/spec.rs` | ⬜ |
| Detect spec files (`specs/*.md`, `*.spec.md`) | ⬜ |
| Parse requirement IDs (REQ-XXX) | ⬜ |
| Link requirements to patterns | ⬜ |
| Show requirement coverage in reports | ⬜ |

### 15.3 Standard Reference Extraction ⬜

| Task | Status |
|------|--------|
| Parse RFC references (RFC 7519) | ⬜ |
| Parse OWASP references (OWASP A03:2021) | ⬜ |
| Parse NIST references (NIST SP 800-53) | ⬜ |
| Auto-link to authoritative corpus | ⬜ |

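As a rough sketch, the three reference shapes listed above can be recognised with whitespace tokenisation alone. The `StandardRef` enum and `extract_refs` function are hypothetical names for this example. The parser handles only the exact forms shown in the table (`RFC <number>`, `OWASP <category>:<year>`, `NIST SP <id>`).

```rust
/// Hypothetical parsed form of a standards reference.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StandardRef {
    Rfc(u32),
    Owasp { category: String, year: u16 },
    NistSp(String),
}

/// Strips surrounding punctuation such as `(`, `)`, `.` or `,` from a token.
fn trim_punct(token: &str) -> &str {
    token.trim_matches(|c: char| !c.is_ascii_alphanumeric())
}

/// Scans text token by token and collects recognised references in order.
pub fn extract_refs(text: &str) -> Vec<StandardRef> {
    let tokens: Vec<&str> = text.split_whitespace().map(trim_punct).collect();
    let mut refs = Vec::new();
    for (i, tok) in tokens.iter().enumerate() {
        match *tok {
            // `RFC 7519`: keyword followed by a number.
            "RFC" => {
                if let Some(n) = tokens.get(i + 1).and_then(|t| t.parse::<u32>().ok()) {
                    refs.push(StandardRef::Rfc(n));
                }
            }
            // `OWASP A03:2021`: keyword followed by `category:year`.
            "OWASP" => {
                if let Some((cat, year)) = tokens.get(i + 1).and_then(|t| t.split_once(':')) {
                    if let Ok(year) = year.parse::<u16>() {
                        refs.push(StandardRef::Owasp { category: cat.to_string(), year });
                    }
                }
            }
            // `NIST SP 800-53`: keyword, `SP`, then an ID starting with a digit.
            "NIST" => {
                if tokens.get(i + 1) == Some(&"SP") {
                    if let Some(id) = tokens.get(i + 2) {
                        if id.starts_with(|c: char| c.is_ascii_digit()) {
                            refs.push(StandardRef::NistSp(id.to_string()));
                        }
                    }
                }
            }
            _ => {}
        }
    }
    refs
}
```

A typed result like this gives the auto-linking step a stable key to look up in the corpus, instead of matching on raw strings.
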
### 15.4 Evidence Display ⬜

| Task | Status |
|------|--------|
| Show full evidence chain in pattern output | ⬜ |
| `aphoria patterns --by-evidence` grouping | ⬜ |

---

## Phase A6: AST-Aware Observation & Claim Verification ⬜

> Evolved from the "Scout & Judge" proposal (2026-02-05). The original focused on LLM cost reduction via AST snippet extraction. Reframed through the observations/claims distinction: the **Scout** produces structurally richer observations that regex can't, and the **Judge** verifies authored claims against code rather than classifying security issues.

### Why This Matters

The 42 regex extractors work well for direct pattern matching (~0.25s). But they can't follow indirection:

```python
# Regex sees `requests.get(url, verify=should_verify)`: no match
# AST sees `should_verify = False` in scope: match
should_verify = False
requests.get(url, verify=should_verify)
```

And they can't verify authored claims. When a claim says "Wallet MUST NOT derive Clone", regex can find `#[derive(` but can't determine scope or negation semantics. An AST-aware scout + LLM judge can.

### A6.1 Tree-sitter Infrastructure ⬜

| Task | Status |
|------|--------|
| Add `tree-sitter` + language grammars to `Cargo.toml` | ⬜ |
| Create `src/scout/mod.rs` module | ⬜ |
| `src/scout/engine.rs`: parse files, run SCM queries | ⬜ |
| `CandidateSnippet` type with structural context | ⬜ |
| `src/scout/queries/`: `.scm` query files per category/language | ⬜ |
| Language support: Python, Go, Rust, JavaScript/TypeScript | ⬜ |

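```rust
use std::collections::HashMap;

// `Language` is the scout's language enum, defined elsewhere in the module.
pub struct CandidateSnippet {
    pub file_path: String,
    pub language: Language,
    // Line range of the snippet within `file_path`.
    pub start_line: usize,
    pub end_line: usize,
    pub code: String,
    // In-scope variable definitions relevant to the snippet.
    pub context_variables: HashMap<String, String>,
    // ID of the SCM query that produced this candidate.
    pub query_id: String,
}
```
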
### A6.2 Scout as Observation Producer ⬜

AST-aware ROI detection for patterns regex can't follow.

| Task | Status |
|------|--------|
| Variable indirection tracking (assign → use across lines) | ⬜ |
| Context expansion: function scope, variable defs, comments | ⬜ |
| Deduplication with existing regex extractors | ⬜ |
| SCM queries for TLS, secrets, auth, crypto categories | ⬜ |
| Integration: run scout after regex, drop overlaps, combine | ⬜ |

**Key design:** Scout runs alongside (not instead of) regex extractors. Regex handles 90% at zero cost; scout handles the indirection cases regex misses.

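The "run scout after regex, drop overlaps, combine" step might be sketched like this. The `Observation` and `Source` types and the overlap rule (same file, same category, intersecting line ranges) are assumptions for illustration. The actual deduplication criteria are not specified in this roadmap.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Regex,
    Scout,
}

/// Hypothetical observation record shared by both producers.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Observation {
    pub file_path: String,
    pub start_line: usize,
    pub end_line: usize,
    pub category: String,
    pub source: Source,
}

/// Assumed overlap rule: same file, same category, intersecting line ranges.
fn overlaps(a: &Observation, b: &Observation) -> bool {
    a.file_path == b.file_path
        && a.category == b.category
        && a.start_line <= b.end_line
        && b.start_line <= a.end_line
}

/// Keeps every regex observation and appends only the scout observations
/// that no regex observation already covers.
pub fn combine(regex: Vec<Observation>, scout: Vec<Observation>) -> Vec<Observation> {
    let mut combined = regex;
    let regex_count = combined.len();
    for candidate in scout {
        let duplicate = combined[..regex_count]
            .iter()
            .any(|r| overlaps(r, &candidate));
        if !duplicate {
            combined.push(candidate);
        }
    }
    combined
}
```

Giving regex results priority on overlap matches the key design above: the cheap path stays authoritative, and the scout only adds what regex missed.
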
### A6.3 Judge as Claim Verifier ⬜

LLM receives focused snippet + authored claim → structured verdict.

| Task | Status |
|------|--------|
| Refactor `LlmExtractor` to accept `CandidateSnippet` + `AuthoredClaim` | ⬜ |
| Verification prompt: "Does this code satisfy this claim?" | ⬜ |
| Structured output: `{ verdict: PASS\|FAIL\|UNCERTAIN, evidence: "..." }` | ⬜ |
| Wire into `aphoria verify` Direction 2 (walk claims, verify in code) | ⬜ |
| Maps to `Extractor::verify()` from vision-gaps | ⬜ |

**Token efficiency:** Snippet (~100 tokens) vs whole file (~2000 tokens) = 95% cost reduction per verification.

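A minimal sketch of the structured verdict and the token arithmetic follows. The `Verdict` and `Judgement` types and the rule that unknown labels map to `Uncertain` are assumptions made for the example. The cost formula is `1 - snippet_tokens / file_tokens`, which gives `1 - 100/2000 = 0.95` for the figures above.

```rust
/// Hypothetical typed form of the judge's `verdict` field.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail,
    Uncertain,
}

/// Hypothetical typed form of the full structured output.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Judgement {
    pub verdict: Verdict,
    pub evidence: String,
}

impl Verdict {
    /// Parses the label case-insensitively. Anything unrecognised is treated
    /// as `Uncertain` (an assumed policy) rather than guessed.
    pub fn parse(label: &str) -> Verdict {
        match label.trim().to_ascii_uppercase().as_str() {
            "PASS" => Verdict::Pass,
            "FAIL" => Verdict::Fail,
            _ => Verdict::Uncertain,
        }
    }
}

/// Fraction of tokens saved by sending a snippet instead of the whole file.
pub fn cost_reduction(snippet_tokens: f64, file_tokens: f64) -> f64 {
    1.0 - snippet_tokens / file_tokens
}
```

Mapping malformed model output to `Uncertain` keeps a parsing problem from being reported as a pass or a failure.
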
### A6.4 Scout for Claim Suggestion ⬜

Scout identifies ROIs without matching authored claims, feeds context to `aphoria-suggest`.

| Task | Status |
|------|--------|
| Identify ROIs with no matching claim in `.aphoria/claims.toml` | ⬜ |
| Enrich context for skill: snippet + function name + surrounding comments | ⬜ |
| Feed to `aphoria-suggest` skill for claim drafting | ⬜ |

### A6.5 Evaluation ⬜

| Task | Status |
|------|--------|
| Scout recall: "Did scout find the vulnerable line in fixture?" | ⬜ |
| Judge precision: "Given snippet + claim, did LLM classify correctly?" | ⬜ |
| Cost metric: `tokens_per_verification` vs monolithic approach | ⬜ |
| Parallel run: shadow mode alongside regex for tuning | ⬜ |

### Phase A6 Priority

Lower priority than A5 flywheel completion and Phase 14 governance. Build when:
1. Regex extractors hit limits on specific indirection patterns
2. `aphoria verify` Direction 2 needs LLM-backed verification
3. `aphoria-suggest` needs richer context than regex observations provide

---

## Enterprise Pilot Success Metrics

### 90-Day Pilot Targets

| Metric | Target | Measurement |
|--------|--------|-------------|
| Patterns captured | 100+ observations | Count in knowledge graph |
| Patterns promoted | 10+ conventions | Count with status=Active |
| Cross-team adoption | 2+ teams connected | Unique team_ids |
| New hire guidance events | 5+ accepted suggestions | Accept rate tracking |
| False positive rate | <10% | FP feedback / total flags |
| Evidence-backed patterns | >50% | Patterns with Research+ evidence |

### 180-Day Production Targets

| Metric | Target | Measurement |
|--------|--------|-------------|
| Knowledge retention | 0 lost patterns on departures | Audit log |
| Onboarding velocity | 50% faster ramp | Time to first PR |
| Convention adoption | 80% across org | Compliance rate |
| SOC 2 evidence | Audit pass | External validation |
| Deprecated pattern migration | 90% complete by sunset | Migration tracking |

---

## Enterprise Simulation UAT

See: `uat/enterprise-simulation-uat.md`