# Aphoria Roadmap

> Completed phases archived in [`roadmap-archive.md`](./roadmap-archive.md)

---

## Status Overview

| Phase | Deliverable | Status |
|-------|-------------|--------|
| 0–9, 11–13, 16–17 | Core CLI, Extractors (42), LLM, Learning, Enterprise, Lifecycle, Pattern Enrichment | ✅ Archived |
| CC | Corpus Infrastructure (Community Corpus, Wiki Import, Pattern Aggregation, **Async Default**) | ✅ Complete |
| 10 | UX & Enterprise Polish | 🔄 Partial (10.1 ✅, 10.2–10.3 ⬜) |
| 14 | Governance Workflows | 🎯 Current |
| **DF-1** | **Dogfood: Database Connection Pool** | 🎯 **ACTIVE** |
| 15 | Evidence Source Integration | ⬜ Future |
| A6 | AST-Aware Observation & Claim Verification | ⬜ Future |

### Current State

- 42 built-in extractors + declarative custom extractors
- **Emergent corpus**: RFC, OWASP, Vendor sources + **community-driven patterns (CC.6 ✅)**
- **Community corpus enabled by default** (CC.7 ✅): `use_community: true`, proper async, no runtime hacks
- **Pattern aggregation active**: Observations auto-feed pattern aggregates after each scan
- **No hardcoded assertions**: Bootstrap via wiki import or Trust Packs
- Ephemeral mode (~0.25s), persistent mode with drift detection
- Observation/claim distinction (A1–A5 complete)
- `aphoria verify run|map` for claim verification
- 10 claims dogfooded in `.aphoria/claims.toml`
- Self-improving: LLM extraction → pattern learning → autonomous promotion → shadow testing → auto-rollback

### Recently Completed: Corpus Infrastructure (Phase CC ✅)

**Phase CC.1-CC.3: Removed hardcoded corpus, built emergent system** (Feb 6-7)
- Deleted `hardcoded.rs` (369 lines, 19 assertions)
- Pattern aggregates stored in StemeDB: `community://pattern/{BLAKE3(SPV)}`
- Multi-tier promotion: 95%+ (Regulatory), 80%+ (Clinical), 50%+ (Emerging, review required)
- Wiki import: `aphoria corpus import wiki ~/docs` parses MUST/SHOULD patterns

**Phase CC.6: Pattern Aggregation (Emergent Learning)** (Feb 8) ✅
- Observations now automatically feed back into pattern aggregates
- Every scan with `--persist --sync` contributes to community learning
- Config: `aggregation_enabled: true` (default)
- Tracks project_count and observation_count per pattern
- Privacy-preserving: wildcarded subjects, project deduplication

**Phase CC.7: Make Community Corpus Default** (Feb 8) ✅
- Created `AsyncCorpusBuilder` trait for async-native corpus builders
- Refactored `CommunityCorpusBuilder` to implement `AsyncCorpusBuilder`
- **Removed `rt.block_on()` hack** that caused "runtime within runtime" errors
- Made entire corpus building chain properly async (16 functions updated)
- Enabled `use_community: true` by default in `CorpusConfig`
- All 1189 tests pass, no clippy warnings, no runtime errors

**Philosophy:** The corpus isn't written by experts. It's discovered by the community and validated by authorities.

---

## Phase 10: UX & Enterprise Polish (Partial)

> 10.1 Acknowledgment Expiry ✅ — archived

### 10.2 Human-Readable Signer Names ⬜

**Impact:** MEDIUM | **Effort:** MEDIUM | **Priority:** P2

Map issuer hex IDs to human-readable team names in output.

| Task | Status |
|------|--------|
| Add `signer_name: Option<String>` to `PackHeader` | ⬜ |
| Add `contact: Option<String>` to `PackHeader` (Slack channel, email) | ⬜ |
| Update `policy export/import` to preserve new fields | ⬜ |
| Show "Signed by Platform Security Team" instead of hex in output | ⬜ |
| Backward-compat: gracefully handle packs without new fields | ⬜ |
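
The fallback behaviour in the tasks above could look roughly like the sketch below. It is illustrative only: the `issuer_hex` field name and the `signed_by` helper are assumptions, and the real `PackHeader` has fields not shown here.

```rust
/// Hypothetical subset of `PackHeader`; only the fields relevant to 10.2.
#[derive(Debug, Clone, Default)]
pub struct PackHeader {
    /// Issuer ID rendered as hex (field name assumed for this sketch).
    pub issuer_hex: String,
    /// New optional field: human-readable team name.
    pub signer_name: Option<String>,
    /// New optional field: Slack channel or email.
    pub contact: Option<String>,
}

impl PackHeader {
    /// Prefer the human-readable name; fall back to the hex ID for
    /// packs written before the new fields existed.
    pub fn signed_by(&self) -> String {
        let who = self
            .signer_name
            .as_deref()
            .map(str::trim)
            .filter(|name| !name.is_empty())
            .unwrap_or(self.issuer_hex.as_str());
        format!("Signed by {who}")
    }
}
```

Keeping both new fields as `Option<String>` means older packs deserialize to `None` and still render with the hex ID.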

### 10.3 Speed Benchmarks ⬜

**Impact:** LOW | **Effort:** LOW | **Priority:** P3

| Task | Status |
|------|--------|
| Create `benchmarks/` directory with test corpora | ⬜ |
| Add `aphoria scan --benchmark` flag for self-test | ⬜ |
| Document test conditions in benchmark results | ⬜ |

---

## Phase CC: Corpus Infrastructure (Community Corpus) ✅

> **Completed:** 2026-02-08 | Removed hardcoded corpus, built emergent community-driven system

### Philosophy

The corpus isn't written by experts. It's discovered by the community and validated by authorities. 95% adoption = "This is what the community does" = Authoritative.

### CC.1 Delete Hardcoded Corpus ✅

| Task | Status |
|------|--------|
| Remove `applications/aphoria/src/corpus/hardcoded.rs` (369 lines) | ✅ |
| Remove `include_hardcoded` from `CorpusConfig` | ✅ |
| Remove from `CorpusRegistry::with_defaults()` | ✅ |
| Update tests to use community corpus | ✅ |
| Fix 5 pre-existing clippy errors in stemedb-api | ✅ |

**Implemented:** Destructive pre-release approach - no deprecation warnings, just deleted.

### CC.2 Community Corpus Builder ✅

| Task | Status |
|------|--------|
| Create `applications/aphoria/src/corpus/community.rs` (393 lines) | ✅ |
| Create `applications/aphoria/src/corpus/thresholds.rs` (230 lines) | ✅ |
| Create `applications/aphoria/src/corpus/resolver.rs` (220 lines) | ✅ |
| Create `applications/aphoria/src/community/pattern_store.rs` (332 lines) | ✅ |
| Implement `PatternAggregateStore` trait with StemeDB backend | ✅ |
| Multi-tier promotion: 95% (Regulatory), 80% (Clinical), 50% (Emerging) | ✅ |
| Content-addressed storage: `community://pattern/{BLAKE3(SPV)}` | ✅ |
| Config integration: `use_community` flag (opt-in at the time; default since CC.7) | ✅ |
| Full scan flow integration | ✅ |

**Storage Architecture:**
- Pattern aggregates stored as StemeDB assertions (no TOML files)
- Predicate: `pattern_aggregate` with JSON metadata
- Deduplication via content-addressed subjects
- Privacy-preserving: wildcarded subjects, k-anonymity
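
As a rough illustration of the multi-tier promotion thresholds listed in the task table, the sketch below maps an adoption ratio to a tier. The `PromotionTier` enum and `classify_adoption` function are hypothetical names, and how the adoption ratio itself is computed is not specified in this roadmap, so it is taken as an input.

```rust
/// Promotion tiers named in the roadmap, ordered weakest to strongest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PromotionTier {
    /// At least 50% adoption; review required before use.
    Emerging,
    /// At least 80% adoption.
    Clinical,
    /// At least 95% adoption.
    Regulatory,
}

/// Hypothetical helper: map an adoption ratio in [0.0, 1.0] to a tier.
/// Returns `None` below the lowest threshold or for out-of-range input
/// (including NaN).
pub fn classify_adoption(adoption: f64) -> Option<PromotionTier> {
    if !(0.0..=1.0).contains(&adoption) {
        return None;
    }
    if adoption >= 0.95 {
        Some(PromotionTier::Regulatory)
    } else if adoption >= 0.80 {
        Some(PromotionTier::Clinical)
    } else if adoption >= 0.50 {
        Some(PromotionTier::Emerging)
    } else {
        None
    }
}
```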

### CC.3 Wiki Import Bootstrap ✅

| Task | Status |
|------|--------|
| Create `applications/aphoria/src/corpus/wiki_importer.rs` (332 lines) | ✅ |
| Regex extraction of MUST/SHOULD patterns from markdown | ✅ |
| Authority source parsing (RFC, OWASP, CWE references) | ✅ |
| Smart subject normalization (TLS → tls/cert_verification) | ✅ |
| CLI command: `aphoria corpus import wiki <path>` | ✅ |
| PatternAggregator write path (stores to StemeDB) | ✅ |
| Integration tests with fixtures | ✅ (6 tests) |
| Documentation: `docs/bootstrap-corpus.md` | ✅ |

**Usage:**
```bash
# Create wiki with best practices
mkdir -p .aphoria/wiki
echo "TLS cert verification MUST be enabled. Authority: RFC 5246" > .aphoria/wiki/tls.md

# Import patterns
aphoria corpus import wiki .aphoria/wiki
# → Patterns now in StemeDB, available for conflict detection
```
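
To make the MUST/SHOULD extraction concrete, here is a minimal sketch of parsing one wiki line. It is not the code in `wiki_importer.rs`: it uses plain string splitting instead of regex, skips subject normalization, and the `WikiPattern` type and `parse_wiki_line` function are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Requirement {
    Must,
    Should,
}

/// Hypothetical result of parsing one wiki line.
#[derive(Debug, PartialEq, Eq)]
pub struct WikiPattern {
    pub subject: String,
    pub requirement: Requirement,
    pub rule: String,
    pub authority: Option<String>,
}

/// Parse lines shaped like "<subject> MUST <rule>. Authority: <source>".
/// Lines without a MUST/SHOULD keyword yield `None`.
pub fn parse_wiki_line(line: &str) -> Option<WikiPattern> {
    // Split off the optional trailing authority reference.
    let (statement, authority) = match line.split_once("Authority:") {
        Some((s, a)) => (s, Some(a.trim().to_string())),
        None => (line, None),
    };
    let statement = statement.trim().trim_end_matches('.');

    // MUST is checked first; "MUST NOT" stays part of the rule text.
    for (keyword, requirement) in [(" MUST ", Requirement::Must), (" SHOULD ", Requirement::Should)] {
        if let Some((subject, rule)) = statement.split_once(keyword) {
            return Some(WikiPattern {
                subject: subject.trim().to_string(),
                requirement,
                rule: rule.trim().to_string(),
                authority: authority.filter(|a| !a.is_empty()),
            });
        }
    }
    None
}
```

Applied to the line from the usage example, this yields subject `TLS cert verification`, requirement `Must`, rule `be enabled`, and authority `RFC 5246`.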

### CC.4 Trust Pack Bootstrap ⬜

| Task | Status |
|------|--------|
| Extend Trust Packs to include pattern aggregates | ⬜ Future |
| `aphoria trust-pack install <name>` writes patterns to StemeDB | ⬜ Future |
| Create `rfc-owasp-baseline.toml` with ~20 common patterns | ⬜ Future |

**Status:** Infrastructure exists, implementation deferred. Wiki import covers bootstrap needs.

### CC.5 Skill-Driven Cold Start ⬜

| Task | Status |
|------|--------|
| Enhance `aphoria-suggest` skill with bootstrap mode | ⬜ Future |
| Detect empty corpus during scan | ⬜ Future |
| Analyze project structure (Cargo.toml, package.json) | ⬜ Future |
| Suggest 3-5 baseline patterns based on detected stack | ⬜ Future |

**Status:** Skill exists, bootstrap mode not implemented. Manual wiki creation works well.

### CC.6 Pattern Aggregation (Emergent Learning) ✅

> **Completed:** 2026-02-08 | Observations now feed back into pattern aggregates automatically

| Task | Status |
|------|--------|
| Add `aggregation_enabled` config field (default: `true`) | ✅ |
| Implement `aggregate_observations_to_patterns()` in scanner | ✅ |
| Add `StemeDBPatternStore::get_pattern_by_spv()` for lookup | ✅ |
| Add `StemeDBPatternStore::update_pattern()` for updates | ✅ |
| Add `compute_project_hash()` for deduplication | ✅ |
| Hook into scan flow after observation recording | ✅ |
| Group observations by (subject, predicate, value) | ✅ |
| Wildcard project paths for anonymization | ✅ |
| Create or update PatternAggregate records | ✅ |
| Track project_count and observation_count | ✅ |

**Implementation:**
```rust
// scanner.rs:344-357
if config.corpus.aggregation_enabled && should_persist_locally {
    let project_hash = compute_project_hash(project_root);
    aggregate_observations_to_patterns(&novel_claims, &episteme, &project_hash).await?;
}
```

**Flow:**
1. Scan extracts observations → recorded as Tier 4 assertions
2. Observations aggregated by (wildcarded_subject, predicate, value)
3. For each unique pattern:
   - If it exists: increment observation_count; if the project hash is new for this pattern, also increment project_count
   - If new: create PatternAggregate with initial counts
4. Stored as assertions with predicate `"pattern_aggregate"`
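
The counting logic in steps 2 and 3 can be sketched with an in-memory map. This is a simplified stand-in, not the StemeDB-backed store: `PatternKey`, `InMemoryPatternStore`, and `record` are hypothetical names, and subjects are assumed to be wildcarded before they reach this code.

```rust
use std::collections::{HashMap, HashSet};

/// Grouping key: (wildcarded subject, predicate, value).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PatternKey {
    pub subject: String,
    pub predicate: String,
    pub value: String,
}

#[derive(Debug, Default)]
pub struct PatternAggregate {
    pub observation_count: u64,
    pub project_count: u64,
    /// Project hashes already counted, used for deduplication.
    seen_projects: HashSet<String>,
}

/// Hypothetical in-memory stand-in for the pattern store.
#[derive(Debug, Default)]
pub struct InMemoryPatternStore {
    patterns: HashMap<PatternKey, PatternAggregate>,
}

impl InMemoryPatternStore {
    /// Record one observation: create the aggregate if it is new, always
    /// bump the observation count, and bump the project count only the
    /// first time this project hash is seen for the pattern.
    pub fn record(&mut self, key: PatternKey, project_hash: &str) {
        let aggregate = self.patterns.entry(key).or_default();
        aggregate.observation_count += 1;
        if aggregate.seen_projects.insert(project_hash.to_string()) {
            aggregate.project_count += 1;
        }
    }

    pub fn get(&self, key: &PatternKey) -> Option<&PatternAggregate> {
        self.patterns.get(key)
    }
}
```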

**Result:** The corpus is now **emergent**. Every scan with `--persist --sync` feeds the learning loop.

---

### What Remains (Future Enhancement)

**CC.4 Trust Pack Bootstrap ⬜**
_(Deferred; tasks listed under CC.4 above)_

**CC.5 Skill-Driven Cold Start ⬜**
_(Deferred; tasks listed under CC.5 above)_

---

### CC.7 Make Community Corpus Default ✅

> **Completed:** 2026-02-08 | Community corpus now enabled by default, async runtime issue resolved

| Task | Status |
|------|--------|
| Create `AsyncCorpusBuilder` trait for async corpus builders | ✅ |
| Implement dual registry (sync + async builders) | ✅ |
| Refactor `CommunityCorpusBuilder` to implement `AsyncCorpusBuilder` | ✅ |
| Remove `rt.block_on()` hack, use proper `.await` | ✅ |
| Make `build_corpus_with_stores()` async | ✅ |
| Make `create_authoritative_corpus()` async | ✅ |
| Make `EphemeralDetector::new()` async | ✅ |
| Make `extract_claims_from_files()` async | ✅ |
| Update all 16 function callers to use `.await` | ✅ |
| Change `use_community: false` → `true` in defaults | ✅ |
| Verify tests pass with community corpus enabled | ✅ (1189 tests) |

**Architecture Improvement:**
- **Before**: Sync `CorpusBuilder` trait forced async operations to use `rt.block_on()`, causing runtime errors in async contexts
- **After**: Dual-trait approach (`CorpusBuilder` + `AsyncCorpusBuilder`) allows sync builders (RFC, OWASP, Vendor) to stay simple while community builder uses proper async
- **Result**: No `block_on()` hacks anywhere, proper async/await throughout
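
A minimal sketch of the dual-trait idea, using only the standard library. The trait names come from this roadmap, but the method signatures, the registry fields, and the placeholder assertion strings are assumptions. The `run_ready` helper exists only so the sketch runs without an async runtime; it busy-polls and is suitable only for futures that never wait on I/O.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + 'a>>;

/// Sync builders return assertions directly (signature assumed).
pub trait CorpusBuilder {
    fn build(&self) -> Vec<String>;
}

/// Async builders return a boxed future so the trait can be used as `dyn`.
pub trait AsyncCorpusBuilder {
    fn build(&self) -> BoxFuture<'_, Vec<String>>;
}

pub struct RfcBuilder;
impl CorpusBuilder for RfcBuilder {
    fn build(&self) -> Vec<String> {
        vec!["rfc-assertion".to_string()] // placeholder value
    }
}

pub struct CommunityBuilder;
impl AsyncCorpusBuilder for CommunityBuilder {
    fn build(&self) -> BoxFuture<'_, Vec<String>> {
        // A real builder would await a store query here.
        Box::pin(async { vec!["community-pattern".to_string()] })
    }
}

/// Dual registry: sync and async builders are kept in separate lists.
#[derive(Default)]
pub struct CorpusRegistry {
    pub sync_builders: Vec<Box<dyn CorpusBuilder>>,
    pub async_builders: Vec<Box<dyn AsyncCorpusBuilder>>,
}

impl CorpusRegistry {
    /// One async entry point; sync builders are called inline and async
    /// builders are awaited, so no nested runtime is needed.
    pub async fn build_all(&self) -> Vec<String> {
        let mut corpus = Vec::new();
        for builder in &self.sync_builders {
            corpus.extend(builder.build());
        }
        for builder in &self.async_builders {
            corpus.extend(builder.build().await);
        }
        corpus
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Demo-only driver: polls in a loop until the future completes.
pub fn run_ready<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

In the real scan path the caller would already be inside an async context and would simply `.await` the registry call.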

**Verification:**
```bash
RUST_LOG=aphoria=debug aphoria scan --persist --sync .
# Logs show:
# ✅ "Registered community corpus builder (async)"
# ✅ "Building corpus (async)" for Community builder
# ✅ "Querying popular patterns from StemeDB"
# ✅ No "Cannot start a runtime from within a runtime" errors
```

---

### CC.4 Trust Pack System (Bootstrap Option 2) ⬜

| Task | Status |
|------|--------|
| `aphoria trust-pack export --source community` | ⬜ |
| `aphoria trust-pack install <name>` | ⬜ |
| Create `rfc-owasp-bootstrap` Trust Pack from old hardcoded corpus | ⬜ |
| Trust Pack validation and signing | ⬜ |
| Trust Pack registry/sharing mechanism | ⬜ |

**Usage:**
```bash
aphoria trust-pack install rfc-owasp-bootstrap
# Installs 19 baseline assertions for new projects
```

### CC.5 Corpus Management CLI ⬜

| Task | Status |
|------|--------|
| `aphoria corpus build` - Build community corpus | ⬜ |
| `aphoria corpus list` - Show loaded corpus assertions | ⬜ |
| `aphoria corpus candidates --min-adoption 0.50` - List promotion candidates | ⬜ |
| `aphoria corpus promote <pattern-id>` - Manual promotion | ⬜ |
| Update `aphoria-corpus-curator` skill for manual review | ⬜ |

### CC.6 Multi-Layer Corpus Resolver ⬜

| Task | Status |
|------|--------|
| Create `applications/aphoria/src/corpus/resolver.rs` | ⬜ |
| Priority layers: Manual overrides > Trust Packs > Community > (deprecated hardcoded) | ⬜ |
| Conflict resolution: higher priority overwrites lower | ⬜ |
| Config: `use_community = true` default | ⬜ |
| Config: `include_hardcoded = false` default (post-migration) | ⬜ |

---

## Phase 14: Governance Workflows 🎯

> **Vision:** Clear approval paths for pattern promotion with audit trails.

### 14.1 Approval Workflow Definition ⬜

| Task | Status |
|------|--------|
| Create `src/governance/mod.rs` module | ⬜ |
| Define `ApprovalWorkflow` struct | ⬜ |
| Define `ApprovalStage` with required approvers | ⬜ |
| Support evidence-based auto-approve thresholds | ⬜ |
| Config: define workflows in `.aphoria.toml` | ⬜ |

### 14.2 Approval State Machine ⬜

| Task | Status |
|------|--------|
| Implement state transitions (pending → approved/rejected) | ⬜ |
| Multi-stage approval support | ⬜ |
| Timeout and escalation policies | ⬜ |
| Store approval history with timestamps | ⬜ |
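
Since none of this is implemented yet, the following is only one possible shape for the state transitions and multi-stage support. Every type and method name is hypothetical, and timeouts, escalation, and history storage are left out.

```rust
/// Hypothetical states for one pattern-promotion request.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ApprovalState {
    /// Waiting on the stage with this zero-based index.
    Pending { stage: usize },
    Approved,
    Rejected { reason: String },
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Decision {
    Approve,
    Reject { reason: String },
}

#[derive(Debug, PartialEq, Eq)]
pub enum TransitionError {
    /// The request already reached a terminal state.
    AlreadyDecided,
}

impl ApprovalState {
    pub fn start() -> Self {
        ApprovalState::Pending { stage: 0 }
    }

    /// Apply one decision. Approving the last stage yields `Approved`;
    /// approving an earlier stage advances to the next one; a rejection
    /// at any stage is terminal.
    pub fn apply(self, decision: Decision, total_stages: usize) -> Result<Self, TransitionError> {
        match (self, decision) {
            (ApprovalState::Pending { stage }, Decision::Approve) => {
                if stage + 1 >= total_stages {
                    Ok(ApprovalState::Approved)
                } else {
                    Ok(ApprovalState::Pending { stage: stage + 1 })
                }
            }
            (ApprovalState::Pending { .. }, Decision::Reject { reason }) => {
                Ok(ApprovalState::Rejected { reason })
            }
            (ApprovalState::Approved, _) | (ApprovalState::Rejected { .. }, _) => {
                Err(TransitionError::AlreadyDecided)
            }
        }
    }
}
```

Taking `self` by value makes each transition consume the old state, so a stale state cannot be reused by accident.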

### 14.3 Approval CLI ⬜

| Task | Status |
|------|--------|
| `aphoria governance pending` — list pending approvals | ⬜ |
| `aphoria governance approve <id> --comment "..."` | ⬜ |
| `aphoria governance reject <id> --reason "..."` | ⬜ |
| `aphoria governance escalate <id>` | ⬜ |
| Show approval status in pattern list | ⬜ |

### 14.4 SOC 2 Audit Trail ⬜

| Task | Status |
|------|--------|
| Full audit log for all governance actions | ⬜ |
| `aphoria audit trail --pattern <id>` — show timeline | ⬜ |
| Export governance history for auditors | ⬜ |
| Include approver identity and timestamp | ⬜ |

---

## Phase DF-1: Dogfood Project - Database Connection Pool 🎯

> **Status:** ACTIVE | **Start:** 2026-02-09 | **Target:** 2026-02-14 (5 days)
>
> **Vision:** Build a production-ready database connection pool with intentional violations, use Aphoria to detect and guide remediation. Demonstrates real-world value in preventing production incidents.

### Overview

**Product:** `dbpool` - Safe, opinionated PostgreSQL connection pool for Rust

**Why This Matters:**
- Connection pool misconfigurations cause real P0 incidents
- Clear authority sources (HikariCP, PostgreSQL docs)
- Demonstrates Aphoria preventing actual production problems
- "Aphoria caught this before deployment" is compelling ROI

**Key Metrics:**
- Claims to extract: 25-30
- Intentional violations: 7-8
- Expected detection rate: 100%
- Final state: 0 conflicts, production-ready

### DF-1.1 Preparation & Corpus Building (Day 1) 🔄

**Goal:** Extract claims from authority sources and populate corpus database

| Task | Status |
|------|--------|
| Create project structure at `applications/aphoria/dogfood/dbpool/` | ✅ |
| Write comprehensive plan in `dogfood/dbpool/plan.md` | ✅ |
| Fetch HikariCP configuration documentation | ⏳ |
| Fetch PostgreSQL connection pooling guide | ⏳ |
| Extract OWASP A07 credential guidance | ⏳ |
| Create 25-30 claims via CLI (`aphoria corpus create`) | ⏳ |
| Verify all claims queryable via API | ⏳ |
| Document claim templates for future dogfoods | ⏳ |

**Deliverables:**
- `docs/sources/hikaricp-config.md`
- `docs/sources/postgresql-pooling.md`
- `docs/sources/owasp-credentials.md`
- 25-30 claims in corpus database
- Verification report

### DF-1.2 Initial Implementation with Violations (Day 2) ⏳

**Goal:** Write working code that compiles but violates best practices

| Task | Status |
|------|--------|
| Create Rust project with Cargo.toml | ⏳ |
| Implement PoolConfig with 5 violations | ⏳ |
| Implement ConnectionPool with 2 violations | ⏳ |
| Add basic tests (that pass despite violations) | ⏳ |
| Verify compilation successful | ⏳ |

**Intentional Violations:**
1. ❌ Unbounded max_connections (CRITICAL)
2. ❌ Plaintext password in connection string (CRITICAL)
3. ❌ Missing max_lifetime (CRITICAL)
4. ❌ Excessive connection_timeout (ERROR)
5. ❌ Zero min_connections (ERROR)
6. ❌ Missing connection validation (ERROR)
7. ⚠️ No metrics exposed (WARNING)
8. ⚠️ Missing leak detection (WARNING)

### DF-1.3 First Scan & Verification (Day 3) ✅

**Goal:** Run Aphoria scan and verify all violations detected

| Task | Status |
|------|--------|
| Create `.aphoria/config.toml` | ✅ |
| Run initial scan, save results JSON | ✅ |
| Verify 7-8 violations detected (100% accuracy) | ⚠️ Gap identified |
| Generate markdown report | ✅ |
| Take screenshots for demo | ⏳ |
| Verify 0 false positives | ✅ |

**Actual Results:**
- 0/7 violations detected (expected - documented in planning as Scenario 1)
- Built-in extractors cover security patterns, not library API patterns
- All 7 claims authored successfully via A2 system
- The verify system works correctly (all claims returned a "missing" verdict)
- **Key Finding:** Extractor coverage gap identified and documented

**Discovered Limitation:**
Aphoria's 42 built-in extractors excel at **security/infrastructure patterns** (TLS, JWT, CORS, SQL injection, rate limits) but don't cover **library API design validation** (struct field types, missing fields, numeric constraints, function call patterns).

**Why This Matters:**
- This is the **expected outcome** documented in STATE-2026-02-10.md (Scenario 1)
- Validates Aphoria's architecture (claims, verify, scanning all work correctly)
- Identifies product gap: custom extractors for this class of pattern require Rust code; TOML configuration alone cannot express them
- Confirms LLM automation requirement for flywheel (needs `/aphoria-custom-extractor-creator` skill)

See: `dogfood/dbpool/DAY3-FINDINGS.md` for complete analysis

### DF-1.4 Remediation & Re-verification (Day 4) ⏳

**Goal:** Fix violations incrementally, re-scan after each fix

| Task | Status |
|------|--------|
| Fix unbounded max_connections → re-scan | ⏳ |
| Fix plaintext password → re-scan | ⏳ |
| Fix missing max_lifetime → re-scan | ⏳ |
| Fix excessive timeouts → re-scan | ⏳ |
| Fix zero min_connections → re-scan | ⏳ |
| Add connection validation → re-scan | ⏳ |
| Add metrics exposure → re-scan | ⏳ |
| Add leak detection → re-scan | ⏳ |
| Final verification: 0 conflicts | ⏳ |

**Deliverables:**
- Progressive scan results (v1 through v6)
- Git tags for each fix milestone
- Final clean scan report

### DF-1.5 Documentation & Demo Preparation (Day 5) ⏳

**Goal:** Create compelling documentation and demo materials

| Task | Status |
|------|--------|
| Write success story document | ⏳ |
| Create demo script for live presentation | ⏳ |
| Record performance metrics | ⏳ |
| Create before/after visual comparison | ⏳ |
| Document prevented incidents with cost estimates | ⏳ |
| Update this roadmap with completion status | ⏳ |

**Deliverables:**
- `docs/SUCCESS-STORY.md` - Comprehensive case study
- `demo.sh` - Automated demo script
- Screenshots and visuals
- Metrics report (accuracy, performance)

### Success Metrics

| Metric | Target | Actual |
|--------|--------|--------|
| Claims Extracted | 25-30 | TBD |
| Violations Detected | 7-8 | 0/7 (Day 3 first scan) |
| Detection Accuracy | 100% | TBD |
| False Positives | 0 | TBD |
| Scan Performance | ≤0.3s | TBD |
| Final Conflicts | 0 | TBD |

### Lessons Learned

**From Day 3 (2026-02-10):**

1. **Extractor Coverage Gap Validated**
   - Built-in extractors (42 total) cover security patterns excellently
   - Library API design patterns (struct fields, type constraints) need custom extractors
   - Custom extractors require Rust code (~10-20 hours), not TOML configuration
   - This was documented in planning (Scenario 1 vs 2) and validated through execution

2. **Authored Claims System Works**
   - A2 system successfully created 7 claims with full provenance/invariant/consequence
   - Claims loaded correctly, verify system working as designed
   - All claims returned "missing" verdict (correct - no matching observations)
   - Demonstrates claim authoring workflow even without detection

3. **Flywheel Automation is Critical**
   - Manual TOML configuration cannot address the gap
   - Requires LLM-driven extractor generation (`/aphoria-custom-extractor-creator` skill)
   - Confirms vision.md's emphasis on LLM automation as core, not optional
   - Manual CLI is debug interface, not primary workflow

4. **Dogfooding Reveals Product Gaps**
   - Time investment: Day 3 took 8 hours (3x planned) due to troubleshooting
   - Found fundamental limitation, not implementation bug
   - "Failure" to detect is actually success at identifying product needs
   - The documentation produced (CUSTOM-EXTRACTOR-GUIDE.md) is valuable even though the approach did not work

5. **Next Priority Clear**
   - Implement `/aphoria-custom-extractor-creator` skill (Priority 1)
   - LLM reads violation examples → generates Rust extractor code
   - Re-run dogfood to validate end-to-end automation
   - Expand built-in extractor library with common API patterns

### Next Dogfoods

Potential follow-up dogfooding projects:
- Health check service (`healthd`)
- Rate limiter middleware (`ratelimit-rs`)
- Secrets manager client (`secrets-rs`)

**Full Plan:** See [`applications/aphoria/dogfood/dbpool/plan.md`](dogfood/dbpool/plan.md)

---

## Phase 15: Evidence Source Integration ⬜

> **Vision:** ADRs, specs, and standards automatically link to patterns.

### 15.1 ADR Auto-Detection ⬜

| Task | Status |
|------|--------|
| Create `src/evidence/adr.rs` | ⬜ |
| Detect ADR-XXX patterns in commit messages | ⬜ |
| Scan for ADR files in standard locations | ⬜ |
| Parse ADR content for related patterns | ⬜ |
| Link ADR to patterns automatically | ⬜ |

### 15.2 Spec File Detection ⬜

| Task | Status |
|------|--------|
| Create `src/evidence/spec.rs` | ⬜ |
| Detect spec files (specs/*.md, *.spec.md) | ⬜ |
| Parse requirement IDs (REQ-XXX) | ⬜ |
| Link requirements to patterns | ⬜ |
| Show requirement coverage in reports | ⬜ |

### 15.3 Standard Reference Extraction ⬜

| Task | Status |
|------|--------|
| Parse RFC references (RFC 7519) | ⬜ |
| Parse OWASP references (OWASP A03:2021) | ⬜ |
| Parse NIST references (NIST SP 800-53) | ⬜ |
| Auto-link to authoritative corpus | ⬜ |

### 15.4 Evidence Display ⬜

| Task | Status |
|------|--------|
| Show full evidence chain in pattern output | ⬜ |
| `aphoria patterns --by-evidence` grouping | ⬜ |

---

## Phase A6: AST-Aware Observation & Claim Verification ⬜

> Evolved from the "Scout & Judge" proposal (2026-02-05). The original focused on LLM cost reduction via AST snippet extraction. Reframed through the observations/claims distinction: the **Scout** produces structurally richer observations that regex can't, and the **Judge** verifies authored claims against code rather than classifying security issues.

### Why This Matters

The 42 regex extractors work well for direct pattern matching (~0.25s). But they can't follow indirection:

```python
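import requests

url = "https://example.com"  # placeholder endpoint for illustration

# Regex sees `requests.get(url, verify=should_verify)`: no literal `verify=False`, so no match
# AST sees `should_verify = False` in scope: match
should_verify = False
requests.get(url, verify=should_verify)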
```

And they can't verify authored claims. When a claim says "Wallet MUST NOT derive Clone", regex can find `#[derive(` but can't determine scope or negation semantics. An AST-aware scout + LLM judge can.

### A6.1 Tree-sitter Infrastructure ⬜

| Task | Status |
|------|--------|
| Add `tree-sitter` + language grammars to `Cargo.toml` | ⬜ |
| Create `src/scout/mod.rs` module | ⬜ |
| `src/scout/engine.rs` — parse files, run SCM queries | ⬜ |
| `CandidateSnippet` type with structural context | ⬜ |
| `src/scout/queries/` — `.scm` query files per category/language | ⬜ |
| Language support: Python, Go, Rust, JavaScript/TypeScript | ⬜ |

```rust
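use std::collections::HashMap;

// `Language` is not defined in this roadmap; it identifies the snippet's source language.
pub struct CandidateSnippet {
    pub file_path: String,
    pub language: Language,
    pub start_line: usize,
    pub end_line: usize,
    pub code: String,
    pub context_variables: HashMap<String, String>,
    pub query_id: String,
}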
```

### A6.2 Scout as Observation Producer ⬜

AST-aware region-of-interest (ROI) detection for patterns regex can't follow.

| Task | Status |
|------|--------|
| Variable indirection tracking (assign → use across lines) | ⬜ |
| Context expansion: function scope, variable defs, comments | ⬜ |
| Deduplication with existing regex extractors | ⬜ |
| SCM queries for TLS, secrets, auth, crypto categories | ⬜ |
| Integration: run scout after regex, drop overlaps, combine | ⬜ |

**Key design:** Scout runs alongside (not instead of) regex extractors. Regex handles 90% at zero cost; scout handles the indirection cases regex misses.

### A6.3 Judge as Claim Verifier ⬜

LLM receives focused snippet + authored claim → structured verdict.

| Task | Status |
|------|--------|
| Refactor `LlmExtractor` to accept `CandidateSnippet` + `AuthoredClaim` | ⬜ |
| Verification prompt: "Does this code satisfy this claim?" | ⬜ |
| Structured output: `{ verdict: PASS\|FAIL\|UNCERTAIN, evidence: "..." }` | ⬜ |
| Wire into `aphoria verify` Direction 2 (walk claims, verify in code) | ⬜ |
| Maps to `Extractor::verify()` from vision-gaps | ⬜ |

**Token efficiency:** Snippet (~100 tokens) vs whole file (~2000 tokens) = 95% cost reduction per verification.
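
The structured verdict and the token arithmetic can be sketched as follows. The `Verdict`, `JudgeOutput`, and `token_savings` names are hypothetical, and how the LLM response is actually decoded (JSON or otherwise) is left open; only the three verdict values come from the task table.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail,
    Uncertain,
}

impl FromStr for Verdict {
    type Err = String;

    /// Accepts the three verdict strings, ignoring case and surrounding
    /// whitespace; anything else is an error rather than a silent default.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_uppercase().as_str() {
            "PASS" => Ok(Verdict::Pass),
            "FAIL" => Ok(Verdict::Fail),
            "UNCERTAIN" => Ok(Verdict::Uncertain),
            other => Err(format!("unknown verdict: {other}")),
        }
    }
}

/// Hypothetical shape of the judge's structured output.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct JudgeOutput {
    pub verdict: Verdict,
    pub evidence: String,
}

/// Fraction of tokens saved by sending a snippet instead of the whole
/// file. Returns `None` when the inputs make no sense.
pub fn token_savings(snippet_tokens: u32, file_tokens: u32) -> Option<f64> {
    if file_tokens == 0 || snippet_tokens > file_tokens {
        return None;
    }
    Some(1.0 - f64::from(snippet_tokens) / f64::from(file_tokens))
}
```

With the figures quoted above, `token_savings(100, 2000)` works out to 1 - 100/2000 = 0.95, which is the 95% reduction.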

### A6.4 Scout for Claim Suggestion ⬜

Scout identifies ROIs without matching authored claims, feeds context to `aphoria-suggest`.

| Task | Status |
|------|--------|
| Identify ROIs with no matching claim in `.aphoria/claims.toml` | ⬜ |
| Enrich context for skill: snippet + function name + surrounding comments | ⬜ |
| Feed to `aphoria-suggest` skill for claim drafting | ⬜ |

### A6.5 Evaluation ⬜

| Task | Status |
|------|--------|
| Scout recall: "Did scout find the vulnerable line in fixture?" | ⬜ |
| Judge precision: "Given snippet + claim, did LLM classify correctly?" | ⬜ |
| Cost metric: `tokens_per_verification` vs monolithic approach | ⬜ |
| Parallel run: shadow mode alongside regex for tuning | ⬜ |

### Phase A6 Priority

Lower priority than A5 flywheel completion and Phase 14 governance. Build when:
1. Regex extractors hit limits on specific indirection patterns
2. `aphoria verify` Direction 2 needs LLM-backed verification
3. `aphoria-suggest` needs richer context than regex observations provide

---

## Enterprise Pilot Success Metrics

### 90-Day Pilot Targets

| Metric | Target | Measurement |
|--------|--------|-------------|
| Patterns captured | 100+ observations | Count in knowledge graph |
| Patterns promoted | 10+ conventions | Count with status=Active |
| Cross-team adoption | 2+ teams connected | Unique team_ids |
| New hire guidance events | 5+ accepted suggestions | Accept rate tracking |
| False positive rate | <10% | FP feedback / total flags |
| Evidence-backed patterns | >50% | Patterns with Research+ evidence |

### 180-Day Production Targets

| Metric | Target | Measurement |
|--------|--------|-------------|
| Knowledge retention | 0 lost patterns on departures | Audit log |
| Onboarding velocity | 50% faster ramp | Time to first PR |
| Convention adoption | 80% across org | Compliance rate |
| SOC 2 evidence | Audit pass | External validation |
| Deprecated pattern migration | 90% complete by sunset | Migration tracking |

---

## Enterprise Simulation UAT

See: `uat/enterprise-simulation-uat.md`