Major documentation restructure to improve discoverability and reduce duplication.

## Changes

**Deleted (Archived/Consolidated)**:
- Removed duplicate getting started guides
- Archived outdated planning documents
- Consolidated corpus and configuration docs
- Removed obsolete vision/spec files (superseded by vision.md)
- Cleaned up scrapyard and old PDFs

**New Structure**:
- docs/about/ - Project overview and introduction
- docs/guides/ - User guides (moved from root)
- docs/specs/ - Technical specifications
- docs/sdk/ - SDK documentation (Go)
- docs/references/ - API references
- docs/archive/ - Archived historical docs
- applications/aphoria/docs/advanced/ - Advanced topics
- applications/aphoria/docs/reference/ - CLI reference
- applications/aphoria/docs/archive/ - Archived aphoria docs

**Updated**:
- README.md - New root README with clear navigation
- CONTRIBUTING.md - Contribution guidelines
- CLAUDE.md - Updated paths to new structure
- roadmap.md - Added recent completions

## Files Changed

- 57 files changed
- 1,977 insertions(+)
- 961 deletions(-)

**Net change**: +1,016 lines (added CONTRIBUTING.md, README.md, reorganized content)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

# Aphoria Roadmap

> Completed phases archived in [`roadmap-archive.md`](./roadmap-archive.md)

---

## Status Overview

| Phase | Deliverable | Status |
|-------|-------------|--------|
| 0–9, 11–13, 16–17 | Core CLI, Extractors (42), LLM, Learning, Enterprise, Lifecycle, Pattern Enrichment | ✅ Archived |
| CC | Corpus Infrastructure (Community Corpus, Wiki Import, Pattern Aggregation, **Async Default**) | ✅ Complete |
| 10 | UX & Enterprise Polish | 🔄 Partial (10.1 ✅, 10.2–10.3 ⬜) |
| 14 | Governance Workflows | 🎯 Current |
| **DF-1** | **Dogfood: Database Connection Pool** | 🎯 **ACTIVE** |
| 15 | Evidence Source Integration | ⬜ Future |
| A6 | AST-Aware Observation & Claim Verification | ⬜ Future |

### Current State

- 42 built-in extractors + declarative custom extractors
- **Emergent corpus**: RFC, OWASP, Vendor sources + **community-driven patterns (CC.6 ✅)**
- **Community corpus enabled by default** (CC.7 ✅): `use_community: true`, proper async, no runtime hacks
- **Pattern aggregation active**: Observations auto-feed pattern aggregates after each scan
- **No hardcoded assertions**: Bootstrap via wiki import or Trust Packs
- Ephemeral mode (~0.25s), persistent mode with drift detection
- Observation/claim distinction (A1–A5 complete)
- `aphoria verify run|map` for claim verification
- 10 claims dogfooded in `.aphoria/claims.toml`
- Self-improving: LLM extraction → pattern learning → autonomous promotion → shadow testing → auto-rollback

### Recently Completed: Corpus Infrastructure (Phase CC ✅)

**Phase CC.1-CC.3: Removed hardcoded corpus, built emergent system** (Feb 6-7)
- Deleted `hardcoded.rs` (369 lines, 19 assertions)
- Pattern aggregates stored in StemeDB: `community://pattern/{BLAKE3(SPV)}`
- Multi-tier promotion: 95%+ (Regulatory), 80%+ (Clinical), 50%+ (Emerging, review required)
- Wiki import: `aphoria corpus import wiki ~/docs` parses MUST/SHOULD patterns

**Phase CC.6: Pattern Aggregation (Emergent Learning)** (Feb 8) ✅
- Observations now automatically feed back into pattern aggregates
- Every scan with `--persist --sync` contributes to community learning
- Config: `aggregation_enabled: true` (default)
- Tracks project_count and observation_count per pattern
- Privacy-preserving: wildcarded subjects, project deduplication

**Phase CC.7: Make Community Corpus Default** (Feb 8) ✅
- Created `AsyncCorpusBuilder` trait for async-native corpus builders
- Refactored `CommunityCorpusBuilder` to implement `AsyncCorpusBuilder`
- **Removed `rt.block_on()` hack** that caused "runtime within runtime" errors
- Made entire corpus building chain properly async (16 functions updated)
- Enabled `use_community: true` by default in `CorpusConfig`
- All 1189 tests pass, no clippy warnings, no runtime errors

**Philosophy:** The corpus isn't written by experts. It's discovered by the community and validated by authorities.

---

## Phase 10: UX & Enterprise Polish (Partial)

> 10.1 Acknowledgment Expiry ✅ — archived

### 10.2 Human-Readable Signer Names ⬜

**Impact:** MEDIUM | **Effort:** MEDIUM | **Priority:** P2

Map issuer hex IDs to human-readable team names in output.

| Task | Status |
|------|--------|
| Add `signer_name: Option<String>` to `PackHeader` | ⬜ |
| Add `contact: Option<String>` to `PackHeader` (Slack channel, email) | ⬜ |
| Update `policy export/import` to preserve new fields | ⬜ |
| Show "Signed by Platform Security Team" instead of hex in output | ⬜ |
| Backward-compat: gracefully handle packs without new fields | ⬜ |

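
The backward-compatible display planned in 10.2 could look roughly like the sketch below. It is not the real `PackHeader`: only the two proposed fields are taken from the task list, while `issuer_id` and the `signed_by` helper are hypothetical names introduced for illustration.

```rust
/// Hypothetical subset of `PackHeader`. `signer_name` and `contact` are the
/// fields proposed in 10.2; `issuer_id` is an assumed name for the hex ID.
pub struct PackHeader {
    pub issuer_id: String,
    pub signer_name: Option<String>,
    pub contact: Option<String>,
}

/// Builds the "Signed by ..." line. Packs that predate the new fields fall
/// back to the hex issuer ID instead of failing.
pub fn signed_by(header: &PackHeader) -> String {
    let who = header
        .signer_name
        .as_deref()
        .unwrap_or(header.issuer_id.as_str());
    match header.contact.as_deref() {
        Some(contact) => format!("Signed by {who} ({contact})"),
        None => format!("Signed by {who}"),
    }
}
```

Modeling both fields as `Option` keeps older packs loadable without a migration step, which is the backward-compatibility requirement in the last task.
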
### 10.3 Speed Benchmarks ⬜

**Impact:** LOW | **Effort:** LOW | **Priority:** P3

| Task | Status |
|------|--------|
| Create `benchmarks/` directory with test corpora | ⬜ |
| Add `aphoria scan --benchmark` flag for self-test | ⬜ |
| Document test conditions in benchmark results | ⬜ |

---

## Phase CC: Corpus Infrastructure (Community Corpus) ✅

> **Completed:** 2026-02-08 | Removed hardcoded corpus, built emergent community-driven system

### Philosophy

The corpus isn't written by experts. It's discovered by the community and validated by authorities. 95% adoption = "This is what the community does" = Authoritative.

### CC.1 Delete Hardcoded Corpus ✅

| Task | Status |
|------|--------|
| Remove `applications/aphoria/src/corpus/hardcoded.rs` (369 lines) | ✅ |
| Remove `include_hardcoded` from `CorpusConfig` | ✅ |
| Remove from `CorpusRegistry::with_defaults()` | ✅ |
| Update tests to use community corpus | ✅ |
| Fix 5 pre-existing clippy errors in stemedb-api | ✅ |

**Implemented:** Destructive pre-release approach - no deprecation warnings, just deleted.

### CC.2 Community Corpus Builder ✅

| Task | Status |
|------|--------|
| Create `applications/aphoria/src/corpus/community.rs` (393 lines) | ✅ |
| Create `applications/aphoria/src/corpus/thresholds.rs` (230 lines) | ✅ |
| Create `applications/aphoria/src/corpus/resolver.rs` (220 lines) | ✅ |
| Create `applications/aphoria/src/community/pattern_store.rs` (332 lines) | ✅ |
| Implement `PatternAggregateStore` trait with StemeDB backend | ✅ |
| Multi-tier promotion: 95% (Regulatory), 80% (Clinical), 50% (Emerging) | ✅ |
| Content-addressed storage: `community://pattern/{BLAKE3(SPV)}` | ✅ |
| Config integration: `use_community` flag (opt-in at this stage; enabled by default in CC.7) | ✅ |
| Full scan flow integration | ✅ |

**Storage Architecture:**
- Pattern aggregates stored as StemeDB assertions (no TOML files)
- Predicate: `pattern_aggregate` with JSON metadata
- Deduplication via content-addressed subjects
- Privacy-preserving: wildcarded subjects, k-anonymity

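
The multi-tier promotion thresholds listed in CC.2 can be illustrated with a small sketch. This is not the code in `thresholds.rs`: the `Tier` enum and `promotion_tier` function are hypothetical names, and it assumes adoption is measured as adopting projects over total projects.

```rust
/// Promotion tiers named in the roadmap. The enum itself is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Regulatory, // 95%+ adoption
    Clinical,   // 80%+ adoption
    Emerging,   // 50%+ adoption, review required
}

/// Maps an adoption ratio to a tier, or `None` below the 50% floor.
/// Assumption: adoption = adopting projects / total projects.
pub fn promotion_tier(adopting_projects: u64, total_projects: u64) -> Option<Tier> {
    if total_projects == 0 {
        return None;
    }
    // Integer percentage (rounded down) avoids float comparisons at the
    // tier boundaries: 94.9% stays Clinical.
    let pct = adopting_projects * 100 / total_projects;
    if pct >= 95 {
        Some(Tier::Regulatory)
    } else if pct >= 80 {
        Some(Tier::Clinical)
    } else if pct >= 50 {
        Some(Tier::Emerging)
    } else {
        None
    }
}
```

For example, `promotion_tier(94, 100)` yields `Some(Tier::Clinical)`, while anything under 50% is not promoted at all.
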
### CC.3 Wiki Import Bootstrap ✅

| Task | Status |
|------|--------|
| Create `applications/aphoria/src/corpus/wiki_importer.rs` (332 lines) | ✅ |
| Regex extraction of MUST/SHOULD patterns from markdown | ✅ |
| Authority source parsing (RFC, OWASP, CWE references) | ✅ |
| Smart subject normalization (TLS → tls/cert_verification) | ✅ |
| CLI command: `aphoria corpus import wiki <path>` | ✅ |
| PatternAggregator write path (stores to StemeDB) | ✅ |
| Integration tests with fixtures | ✅ (6 tests) |
| Documentation: `docs/bootstrap-corpus.md` | ✅ |

**Usage:**
```bash
# Create wiki with best practices
mkdir -p .aphoria/wiki
echo "TLS cert verification MUST be enabled. Authority: RFC 5246" > .aphoria/wiki/tls.md

# Import patterns
aphoria corpus import wiki .aphoria/wiki
# → Patterns now in StemeDB, available for conflict detection
```

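
To make the MUST/SHOULD extraction concrete, here is a dependency-free sketch of parsing one wiki line such as the one in the usage example. The real importer is described as regex-based; this version uses plain string search instead, and `Level`, `WikiPattern`, and `parse_wiki_line` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Level {
    Must,
    Should,
}

#[derive(Debug, PartialEq, Eq)]
pub struct WikiPattern {
    pub level: Level,
    pub statement: String,
    pub authority: Option<String>,
}

/// Parses a single wiki line. Returns `None` when the line carries no
/// MUST/SHOULD keyword. Note that "MUST NOT" is classified as `Must` here;
/// negation handling is out of scope for this sketch.
pub fn parse_wiki_line(line: &str) -> Option<WikiPattern> {
    // Split off an optional trailing "Authority: ..." clause.
    let (body, authority) = match line.split_once("Authority:") {
        Some((body, auth)) => (body, Some(auth.trim().to_string())),
        None => (line, None),
    };
    let level = if body.contains(" MUST ") {
        Level::Must
    } else if body.contains(" SHOULD ") {
        Level::Should
    } else {
        return None;
    };
    Some(WikiPattern {
        level,
        statement: body.trim().trim_end_matches('.').to_string(),
        authority,
    })
}
```

Subject normalization (for example TLS → tls/cert_verification) would be a separate step applied to `statement` and is not shown.
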
### CC.4 Trust Pack Bootstrap ⬜

| Task | Status |
|------|--------|
| Extend Trust Packs to include pattern aggregates | ⬜ Future |
| `aphoria trust-pack install <name>` writes patterns to StemeDB | ⬜ Future |
| Create `rfc-owasp-baseline.toml` with ~20 common patterns | ⬜ Future |

**Status:** Infrastructure exists, implementation deferred. Wiki import covers bootstrap needs.

### CC.5 Skill-Driven Cold Start ⬜

| Task | Status |
|------|--------|
| Enhance `aphoria-suggest` skill with bootstrap mode | ⬜ Future |
| Detect empty corpus during scan | ⬜ Future |
| Analyze project structure (Cargo.toml, package.json) | ⬜ Future |
| Suggest 3-5 baseline patterns based on detected stack | ⬜ Future |

**Status:** Skill exists, bootstrap mode not implemented. Manual wiki creation works well.

### CC.6 Pattern Aggregation (Emergent Learning) ✅

> **Completed:** 2026-02-08 | Observations now feed back into pattern aggregates automatically

| Task | Status |
|------|--------|
| Add `aggregation_enabled` config field (default: `true`) | ✅ |
| Implement `aggregate_observations_to_patterns()` in scanner | ✅ |
| Add `StemeDBPatternStore::get_pattern_by_spv()` for lookup | ✅ |
| Add `StemeDBPatternStore::update_pattern()` for updates | ✅ |
| Add `compute_project_hash()` for deduplication | ✅ |
| Hook into scan flow after observation recording | ✅ |
| Group observations by (subject, predicate, value) | ✅ |
| Wildcard project paths for anonymization | ✅ |
| Create or update PatternAggregate records | ✅ |
| Track project_count and observation_count | ✅ |

**Implementation:**
```rust
// scanner.rs:344-357
if config.corpus.aggregation_enabled && should_persist_locally {
    let project_hash = compute_project_hash(project_root);
    aggregate_observations_to_patterns(&novel_claims, &episteme, &project_hash).await?;
}
```

**Flow:**
1. Scan extracts observations → recorded as Tier 4 assertions
2. Observations aggregated by (wildcarded_subject, predicate, value)
3. For each unique pattern:
   - If exists: increment observation_count, check new project → increment project_count
   - If new: create PatternAggregate with initial counts
4. Stored as assertions with predicate `"pattern_aggregate"`

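
The flow above can be sketched in memory as follows. This is an illustration under stated assumptions, not the scanner's implementation: a `HashMap` stands in for the StemeDB-backed store, subjects are assumed to be wildcarded already, and `PatternKey`, `PatternAggregate`, and `aggregate` are hypothetical names. Keeping a set of project hashes per pattern is one way to make `project_count` increase only when a new project contributes.

```rust
use std::collections::{HashMap, HashSet};

/// (wildcarded_subject, predicate, value)
pub type PatternKey = (String, String, String);

#[derive(Debug, Default)]
pub struct PatternAggregate {
    pub observation_count: u64,
    /// Hashes of projects already counted, used for deduplication.
    projects: HashSet<String>,
}

impl PatternAggregate {
    pub fn project_count(&self) -> usize {
        self.projects.len()
    }
}

/// Folds one scan's observations into the running aggregates.
pub fn aggregate(
    store: &mut HashMap<PatternKey, PatternAggregate>,
    project_hash: &str,
    observations: &[PatternKey],
) {
    for key in observations {
        // Create the aggregate on first sight, otherwise update it.
        let agg = store.entry(key.clone()).or_default();
        agg.observation_count += 1;
        // Inserting an already-known hash is a no-op, so a project that
        // scans repeatedly is counted once.
        agg.projects.insert(project_hash.to_string());
    }
}
```
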
**Result:** The corpus is now **emergent**. Every scan with `--persist --sync` feeds the learning loop.

---

### What Remains (Future Enhancement)

**CC.4 Trust Pack Bootstrap ⬜**
_(Unchanged - Future enhancement)_

**CC.5 Skill-Driven Cold Start ⬜**
_(Unchanged - Future enhancement)_

---

### CC.7 Make Community Corpus Default ✅

> **Completed:** 2026-02-08 | Community corpus now enabled by default, async runtime issue resolved

| Task | Status |
|------|--------|
| Create `AsyncCorpusBuilder` trait for async corpus builders | ✅ |
| Implement dual registry (sync + async builders) | ✅ |
| Refactor `CommunityCorpusBuilder` to implement `AsyncCorpusBuilder` | ✅ |
| Remove `rt.block_on()` hack, use proper `.await` | ✅ |
| Make `build_corpus_with_stores()` async | ✅ |
| Make `create_authoritative_corpus()` async | ✅ |
| Make `EphemeralDetector::new()` async | ✅ |
| Make `extract_claims_from_files()` async | ✅ |
| Update all 16 function callers to use `.await` | ✅ |
| Change `use_community: false` → `true` in defaults | ✅ |
| Verify tests pass with community corpus enabled | ✅ (1189 tests) |

**Architecture Improvement:**
- **Before**: Sync `CorpusBuilder` trait forced async operations to use `rt.block_on()`, causing runtime errors in async contexts
- **After**: Dual-trait approach (`CorpusBuilder` + `AsyncCorpusBuilder`) allows sync builders (RFC, OWASP, Vendor) to stay simple while community builder uses proper async
- **Result**: No `block_on()` hacks anywhere, proper async/await throughout

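
A rough shape for the dual-trait approach is sketched below, using only the standard library. The trait names come from the roadmap, but every signature here is an assumption: the real traits may use `async-trait` or a different return type, and `BoxFuture`, `register_sync`, `register_async`, and `build_all` are hypothetical. The point it illustrates is that the registry's entry point is itself `async`, so the community builder is awaited rather than driven through `block_on()`.

```rust
use std::future::Future;
use std::pin::Pin;

/// Boxed future alias so the async trait stays usable as `dyn`.
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + 'a>>;

/// Sync builders (RFC, OWASP, Vendor in the roadmap) need no runtime.
pub trait CorpusBuilder {
    fn build(&self) -> Vec<String>;
}

/// Async builders hand back a future for the caller to await.
pub trait AsyncCorpusBuilder {
    fn build<'a>(&'a self) -> BoxFuture<'a, Vec<String>>;
}

/// Dual registry: one list per builder kind.
#[derive(Default)]
pub struct CorpusRegistry {
    sync_builders: Vec<Box<dyn CorpusBuilder>>,
    async_builders: Vec<Box<dyn AsyncCorpusBuilder>>,
}

impl CorpusRegistry {
    pub fn register_sync(&mut self, builder: Box<dyn CorpusBuilder>) {
        self.sync_builders.push(builder);
    }

    pub fn register_async(&mut self, builder: Box<dyn AsyncCorpusBuilder>) {
        self.async_builders.push(builder);
    }

    /// Async entry point: sync builders run inline, async builders are
    /// awaited. Nothing in here starts or blocks on a runtime.
    pub async fn build_all(&self) -> Vec<String> {
        let mut assertions = Vec::new();
        for builder in &self.sync_builders {
            assertions.extend(builder.build());
        }
        for builder in &self.async_builders {
            assertions.extend(builder.build().await);
        }
        assertions
    }
}
```
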
**Verification:**
```bash
RUST_LOG=aphoria=debug aphoria scan --persist --sync .
# Logs show:
# ✅ "Registered community corpus builder (async)"
# ✅ "Building corpus (async)" for Community builder
# ✅ "Querying popular patterns from StemeDB"
# ✅ No "Cannot start a runtime from within a runtime" errors
```

---

### CC.4 Trust Pack System (Bootstrap Option 2) ⬜

| Task | Status |
|------|--------|
| `aphoria trust-pack export --source community` | ⬜ |
| `aphoria trust-pack install <name>` | ⬜ |
| Create `rfc-owasp-bootstrap` Trust Pack from old hardcoded corpus | ⬜ |
| Trust Pack validation and signing | ⬜ |
| Trust Pack registry/sharing mechanism | ⬜ |

**Usage:**
```bash
aphoria trust-pack install rfc-owasp-bootstrap
# Installs 19 baseline assertions for new projects
```

### CC.5 Corpus Management CLI ⬜

| Task | Status |
|------|--------|
| `aphoria corpus build` - Build community corpus | ⬜ |
| `aphoria corpus list` - Show loaded corpus assertions | ⬜ |
| `aphoria corpus candidates --min-adoption 0.50` - List promotion candidates | ⬜ |
| `aphoria corpus promote <pattern-id>` - Manual promotion | ⬜ |
| Update `aphoria-corpus-curator` skill for manual review | ⬜ |

### CC.6 Multi-Layer Corpus Resolver ⬜

| Task | Status |
|------|--------|
| Create `applications/aphoria/src/corpus/resolver.rs` | ⬜ |
| Priority layers: Manual overrides > Trust Packs > Community > (deprecated hardcoded) | ⬜ |
| Conflict resolution: higher priority overwrites lower | ⬜ |
| Config: `use_community = true` default | ⬜ |
| Config: `include_hardcoded = false` default (post-migration) | ⬜ |

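
The layered conflict resolution described in the resolver tasks (higher priority overwrites lower) might be sketched as below. `Layer`, `Entry`, and `resolve` are hypothetical names, the deprecated hardcoded layer is omitted, and resolving per subject string is an assumption made for brevity. Ties within a layer keep the first entry seen.

```rust
use std::collections::HashMap;

/// Declared from lowest to highest priority so the derived `Ord`
/// gives Community < TrustPack < ManualOverride.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Layer {
    Community,
    TrustPack,
    ManualOverride,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    pub layer: Layer,
    pub subject: String,
    pub value: String,
}

/// Keeps, for each subject, the entry from the highest-priority layer.
pub fn resolve(entries: &[Entry]) -> HashMap<String, Entry> {
    let mut resolved: HashMap<String, Entry> = HashMap::new();
    for entry in entries {
        // Replace only when the incoming layer strictly outranks the
        // one already stored for this subject.
        let replace = resolved
            .get(&entry.subject)
            .map_or(true, |current| entry.layer > current.layer);
        if replace {
            resolved.insert(entry.subject.clone(), entry.clone());
        }
    }
    resolved
}
```
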
---

## Phase 14: Governance Workflows 🎯

> **Vision:** Clear approval paths for pattern promotion with audit trails.

### 14.1 Approval Workflow Definition ⬜

| Task | Status |
|------|--------|
| Create `src/governance/mod.rs` module | ⬜ |
| Define `ApprovalWorkflow` struct | ⬜ |
| Define `ApprovalStage` with required approvers | ⬜ |
| Support evidence-based auto-approve thresholds | ⬜ |
| Config: define workflows in `.aphoria.toml` | ⬜ |

### 14.2 Approval State Machine ⬜

| Task | Status |
|------|--------|
| Implement state transitions (pending → approved/rejected) | ⬜ |
| Multi-stage approval support | ⬜ |
| Timeout and escalation policies | ⬜ |
| Store approval history with timestamps | ⬜ |

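
Since 14.2 is still unimplemented, the following is only a sketch of what the pending → approved/rejected transitions could look like for a single stage. `ApprovalState`, `Action`, and `transition` are hypothetical names; multi-stage approval, timeouts, escalation, and history storage are deliberately left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalState {
    Pending,
    Approved,
    Rejected,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Approve,
    Reject,
}

/// Applies one action to the current state. Only `Pending` accepts
/// actions; `Approved` and `Rejected` are treated as terminal.
pub fn transition(state: ApprovalState, action: Action) -> Result<ApprovalState, String> {
    match (state, action) {
        (ApprovalState::Pending, Action::Approve) => Ok(ApprovalState::Approved),
        (ApprovalState::Pending, Action::Reject) => Ok(ApprovalState::Rejected),
        (terminal, _) => Err(format!("{terminal:?} is terminal; no further transitions")),
    }
}
```

Returning a `Result` instead of silently ignoring invalid actions gives the audit trail in 14.4 something explicit to record when someone tries to act on a closed approval.
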
### 14.3 Approval CLI ⬜

| Task | Status |
|------|--------|
| `aphoria governance pending` — list pending approvals | ⬜ |
| `aphoria governance approve <id> --comment "..."` | ⬜ |
| `aphoria governance reject <id> --reason "..."` | ⬜ |
| `aphoria governance escalate <id>` | ⬜ |
| Show approval status in pattern list | ⬜ |

### 14.4 SOC 2 Audit Trail ⬜

| Task | Status |
|------|--------|
| Full audit log for all governance actions | ⬜ |
| `aphoria audit trail --pattern <id>` — show timeline | ⬜ |
| Export governance history for auditors | ⬜ |
| Include approver identity and timestamp | ⬜ |

---

## Phase DF-1: Dogfood Project - Database Connection Pool 🎯

> **Status:** ACTIVE | **Start:** 2026-02-09 | **Target:** 2026-02-14 (5 days)
>
> **Vision:** Build a production-ready database connection pool with intentional violations, use Aphoria to detect and guide remediation. Demonstrates real-world value in preventing production incidents.

### Overview

**Product:** `dbpool` - Safe, opinionated PostgreSQL connection pool for Rust

**Why This Matters:**
- Connection pool misconfigurations cause real P0 incidents
- Clear authority sources (HikariCP, PostgreSQL docs)
- Demonstrates Aphoria preventing actual production problems
- "Aphoria caught this before deployment" is compelling ROI

**Key Metrics:**
- Claims to extract: 25-30
- Intentional violations: 7-8
- Expected detection rate: 100%
- Final state: 0 conflicts, production-ready

### DF-1.1 Preparation & Corpus Building (Day 1) 🔄

**Goal:** Extract claims from authority sources and populate corpus database

| Task | Status |
|------|--------|
| Create project structure at `applications/aphoria/dogfood/dbpool/` | ✅ |
| Write comprehensive plan in `dogfood/dbpool/plan.md` | ✅ |
| Fetch HikariCP configuration documentation | ⏳ |
| Fetch PostgreSQL connection pooling guide | ⏳ |
| Extract OWASP A07 credential guidance | ⏳ |
| Create 25-30 claims via CLI (`aphoria corpus create`) | ⏳ |
| Verify all claims queryable via API | ⏳ |
| Document claim templates for future dogfoods | ⏳ |

**Deliverables:**
- `docs/sources/hikaricp-config.md`
- `docs/sources/postgresql-pooling.md`
- `docs/sources/owasp-credentials.md`
- 25-30 claims in corpus database
- Verification report

### DF-1.2 Initial Implementation with Violations (Day 2) ⏳

**Goal:** Write working code that compiles but violates best practices

| Task | Status |
|------|--------|
| Create Rust project with Cargo.toml | ⏳ |
| Implement PoolConfig with 5 violations | ⏳ |
| Implement ConnectionPool with 2 violations | ⏳ |
| Add basic tests (that pass despite violations) | ⏳ |
| Verify compilation successful | ⏳ |

**Intentional Violations:**
1. ❌ Unbounded max_connections (CRITICAL)
2. ❌ Plaintext password in connection string (CRITICAL)
3. ❌ Missing max_lifetime (CRITICAL)
4. ❌ Excessive connection_timeout (ERROR)
5. ❌ Zero min_connections (ERROR)
6. ❌ Missing connection validation (ERROR)
7. ⚠️ No metrics exposed (WARNING)
8. ⚠️ Missing leak detection (WARNING)

### DF-1.3 First Scan & Verification (Day 3) ✅

**Goal:** Run Aphoria scan and verify all violations detected

| Task | Status |
|------|--------|
| Create `.aphoria/config.toml` | ✅ |
| Run initial scan, save results JSON | ✅ |
| Verify 7-8 violations detected (100% accuracy) | ⚠️ Gap identified |
| Generate markdown report | ✅ |
| Take screenshots for demo | ⏳ |
| Verify 0 false positives | ✅ |

**Actual Results:**
- 0/7 violations detected (expected - documented in planning as Scenario 1)
- Built-in extractors cover security patterns, not library API patterns
- All 7 claims authored successfully via A2 system
- Verify system working correctly (all claims returned "missing" verdict)
- **Key Finding:** Extractor coverage gap identified and documented

**Discovered Limitation:**
Aphoria's 42 built-in extractors excel at **security/infrastructure patterns** (TLS, JWT, CORS, SQL injection, rate limits) but don't cover **library API design validation** (struct field types, missing fields, numeric constraints, function call patterns).

**Why This Matters:**
- This is the **expected outcome** documented in STATE-2026-02-10.md (Scenario 1)
- Validates Aphoria's architecture (claims, verify, scanning all work correctly)
- Identifies product gap: custom extractors require Rust code, not TOML
- Confirms LLM automation requirement for flywheel (needs `/aphoria-custom-extractor-creator` skill)

See: `dogfood/dbpool/DAY3-FINDINGS.md` for complete analysis

### DF-1.4 Remediation & Re-verification (Day 4) ⏳

**Goal:** Fix violations incrementally, re-scan after each fix

| Task | Status |
|------|--------|
| Fix unbounded max_connections → re-scan | ⏳ |
| Fix plaintext password → re-scan | ⏳ |
| Fix missing max_lifetime → re-scan | ⏳ |
| Fix excessive timeouts → re-scan | ⏳ |
| Fix zero min_connections → re-scan | ⏳ |
| Add connection validation → re-scan | ⏳ |
| Add metrics exposure → re-scan | ⏳ |
| Add leak detection → re-scan | ⏳ |
| Final verification: 0 conflicts | ⏳ |

**Deliverables:**
- Progressive scan results (v1 through v6)
- Git tags for each fix milestone
- Final clean scan report

### DF-1.5 Documentation & Demo Preparation (Day 5) ⏳

**Goal:** Create compelling documentation and demo materials

| Task | Status |
|------|--------|
| Write success story document | ⏳ |
| Create demo script for live presentation | ⏳ |
| Record performance metrics | ⏳ |
| Create before/after visual comparison | ⏳ |
| Document prevented incidents with cost estimates | ⏳ |
| Update this roadmap with completion status | ⏳ |

**Deliverables:**
- `docs/SUCCESS-STORY.md` - Comprehensive case study
- `demo.sh` - Automated demo script
- Screenshots and visuals
- Metrics report (accuracy, performance)

### Success Metrics

| Metric | Target | Actual |
|--------|--------|--------|
| Claims Extracted | 25-30 | TBD |
| Violations Detected | 7-8 | TBD |
| Detection Accuracy | 100% | TBD |
| False Positives | 0 | TBD |
| Scan Performance | ≤0.3s | TBD |
| Final Conflicts | 0 | TBD |

### Lessons Learned

**From Day 3 (2026-02-10):**

1. **Extractor Coverage Gap Validated**
   - Built-in extractors (42 total) cover security patterns excellently
   - Library API design patterns (struct fields, type constraints) need custom extractors
   - Custom extractors require Rust code (~10-20 hours), not TOML configuration
   - This was documented in planning (Scenario 1 vs 2) and validated through execution

2. **Authored Claims System Works**
   - A2 system successfully created 7 claims with full provenance/invariant/consequence
   - Claims loaded correctly, verify system working as designed
   - All claims returned "missing" verdict (correct - no matching observations)
   - Demonstrates claim authoring workflow even without detection

3. **Flywheel Automation is Critical**
   - Manual TOML configuration cannot address the gap
   - Requires LLM-driven extractor generation (`/aphoria-custom-extractor-creator` skill)
   - Confirms vision.md's emphasis on LLM automation as core, not optional
   - Manual CLI is a debug interface, not the primary workflow

4. **Dogfooding Reveals Product Gaps**
   - Time investment: Day 3 took 8 hours (3x planned) due to troubleshooting
   - Found fundamental limitation, not implementation bug
   - "Failure" to detect is actually success at identifying product needs
   - Documentation produced (CUSTOM-EXTRACTOR-GUIDE.md) is valuable even though the approach did not work

5. **Next Priority Clear**
   - Implement `/aphoria-custom-extractor-creator` skill (Priority 1)
   - LLM reads violation examples → generates Rust extractor code
   - Re-run dogfood to validate end-to-end automation
   - Expand built-in extractor library with common API patterns

### Next Dogfoods

Potential follow-up dogfooding projects:
- Health check service (`healthd`)
- Rate limiter middleware (`ratelimit-rs`)
- Secrets manager client (`secrets-rs`)

**Full Plan:** See [`applications/aphoria/dogfood/dbpool/plan.md`](dogfood/dbpool/plan.md)

---

## Phase 15: Evidence Source Integration ⬜

> **Vision:** ADRs, specs, and standards automatically link to patterns.

### 15.1 ADR Auto-Detection ⬜

| Task | Status |
|------|--------|
| Create `src/evidence/adr.rs` | ⬜ |
| Detect ADR-XXX patterns in commit messages | ⬜ |
| Scan for ADR files in standard locations | ⬜ |
| Parse ADR content for related patterns | ⬜ |
| Link ADR to patterns automatically | ⬜ |

### 15.2 Spec File Detection ⬜

| Task | Status |
|------|--------|
| Create `src/evidence/spec.rs` | ⬜ |
| Detect spec files (`specs/*.md`, `*.spec.md`) | ⬜ |
| Parse requirement IDs (REQ-XXX) | ⬜ |
| Link requirements to patterns | ⬜ |
| Show requirement coverage in reports | ⬜ |

### 15.3 Standard Reference Extraction ⬜

| Task | Status |
|------|--------|
| Parse RFC references (RFC 7519) | ⬜ |
| Parse OWASP references (OWASP A03:2021) | ⬜ |
| Parse NIST references (NIST SP 800-53) | ⬜ |
| Auto-link to authoritative corpus | ⬜ |

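
As an illustration of the reference parsing planned in 15.3, the sketch below pulls the three reference styles named in the table out of free text. It is a token-based approximation with hypothetical names (`StandardRef`, `extract_refs`); a real implementation would likely use stricter patterns and handle more formats.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum StandardRef {
    Rfc(u32),
    Owasp(String),
    Nist(String),
}

/// Scans whitespace-separated tokens for "RFC <n>", "OWASP A..", and
/// "NIST SP <id>" references, in order of appearance.
pub fn extract_refs(text: &str) -> Vec<StandardRef> {
    // Strip surrounding punctuation such as "(", ")", "." from each token;
    // inner characters like ":" and "-" are kept.
    let words: Vec<&str> = text
        .split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()))
        .collect();

    let mut refs = Vec::new();
    for (i, word) in words.iter().enumerate() {
        match (*word, words.get(i + 1), words.get(i + 2)) {
            ("RFC", Some(n), _) => {
                if let Ok(num) = n.parse::<u32>() {
                    refs.push(StandardRef::Rfc(num));
                }
            }
            ("OWASP", Some(id), _) if id.starts_with('A') => {
                refs.push(StandardRef::Owasp(id.to_string()));
            }
            ("NIST", Some(&"SP"), Some(id)) => {
                refs.push(StandardRef::Nist(format!("SP {id}")));
            }
            _ => {}
        }
    }
    refs
}
```

Each extracted reference could then serve as the lookup key for the auto-link step in the last task.
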
### 15.4 Evidence Display ⬜

| Task | Status |
|------|--------|
| Show full evidence chain in pattern output | ⬜ |
| `aphoria patterns --by-evidence` grouping | ⬜ |

---

## Phase A6: AST-Aware Observation & Claim Verification ⬜

> Evolved from the "Scout & Judge" proposal (2026-02-05). The original focused on LLM cost reduction via AST snippet extraction. Reframed through the observations/claims distinction: the **Scout** produces structurally richer observations that regex can't, and the **Judge** verifies authored claims against code rather than classifying security issues.

### Why This Matters

The 42 regex extractors work well for direct pattern matching (~0.25s). But they can't follow indirection:

```python
# Regex sees `requests.get(url, verify=should_verify)` — no match
# AST sees `should_verify = False` in scope — match
should_verify = False
requests.get(url, verify=should_verify)
```

And they can't verify authored claims. When a claim says "Wallet MUST NOT derive Clone", regex can find `#[derive(` but can't determine scope or negation semantics. An AST-aware scout + LLM judge can.

### A6.1 Tree-sitter Infrastructure ⬜

| Task | Status |
|------|--------|
| Add `tree-sitter` + language grammars to `Cargo.toml` | ⬜ |
| Create `src/scout/mod.rs` module | ⬜ |
| `src/scout/engine.rs` — parse files, run SCM queries | ⬜ |
| `CandidateSnippet` type with structural context | ⬜ |
| `src/scout/queries/` — `.scm` query files per category/language | ⬜ |
| Language support: Python, Go, Rust, JavaScript/TypeScript | ⬜ |

```rust
pub struct CandidateSnippet {
    pub file_path: String,
    pub language: Language,
    pub start_line: usize,
    pub end_line: usize,
    pub code: String,
    pub context_variables: HashMap<String, String>,
    pub query_id: String,
}
```

### A6.2 Scout as Observation Producer ⬜

AST-aware ROI detection for patterns regex can't follow.

| Task | Status |
|------|--------|
| Variable indirection tracking (assign → use across lines) | ⬜ |
| Context expansion: function scope, variable defs, comments | ⬜ |
| Deduplication with existing regex extractors | ⬜ |
| SCM queries for TLS, secrets, auth, crypto categories | ⬜ |
| Integration: run scout after regex, drop overlaps, combine | ⬜ |

**Key design:** Scout runs alongside (not instead of) regex extractors. Regex handles 90% at zero cost; scout handles the indirection cases regex misses.

### A6.3 Judge as Claim Verifier ⬜

LLM receives focused snippet + authored claim → structured verdict.

| Task | Status |
|------|--------|
| Refactor `LlmExtractor` to accept `CandidateSnippet` + `AuthoredClaim` | ⬜ |
| Verification prompt: "Does this code satisfy this claim?" | ⬜ |
| Structured output: `{ verdict: PASS\|FAIL\|UNCERTAIN, evidence: "..." }` | ⬜ |
| Wire into `aphoria verify` Direction 2 (walk claims, verify in code) | ⬜ |
| Maps to `Extractor::verify()` concept (historical: vision-gaps-2026-02-08) | ⬜ |

**Token efficiency:** Snippet (~100 tokens) vs whole file (~2000 tokens) = 95% cost reduction per verification.

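
The structured verdict and the token-efficiency estimate above can be made concrete with a short sketch. `Verdict`, `parse_verdict`, and `token_savings_pct` are hypothetical names, and the token counts are the roadmap's own rough figures rather than measurements.

```rust
/// The three verdict values from the structured-output task.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail,
    Uncertain,
}

/// Parses the `verdict` field of the judge's reply. Anything other than
/// the three expected values is rejected rather than guessed at.
pub fn parse_verdict(raw: &str) -> Option<Verdict> {
    match raw.trim() {
        "PASS" => Some(Verdict::Pass),
        "FAIL" => Some(Verdict::Fail),
        "UNCERTAIN" => Some(Verdict::Uncertain),
        _ => None,
    }
}

/// Percentage of tokens saved by sending a snippet instead of the file.
pub fn token_savings_pct(snippet_tokens: u32, file_tokens: u32) -> u32 {
    if file_tokens == 0 {
        return 0;
    }
    // Clamp so a snippet larger than the file reports 0% rather than
    // underflowing.
    100 - (snippet_tokens.min(file_tokens) * 100 / file_tokens)
}
```

With the figures quoted above, `token_savings_pct(100, 2000)` evaluates to 95, which matches the stated reduction: the snippet is 5% of the file's size.
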
### A6.4 Scout for Claim Suggestion ⬜

Scout identifies ROIs without matching authored claims, feeds context to `aphoria-suggest`.

| Task | Status |
|------|--------|
| Identify ROIs with no matching claim in `.aphoria/claims.toml` | ⬜ |
| Enrich context for skill: snippet + function name + surrounding comments | ⬜ |
| Feed to `aphoria-suggest` skill for claim drafting | ⬜ |

### A6.5 Evaluation ⬜

| Task | Status |
|------|--------|
| Scout recall: "Did scout find the vulnerable line in fixture?" | ⬜ |
| Judge precision: "Given snippet + claim, did LLM classify correctly?" | ⬜ |
| Cost metric: `tokens_per_verification` vs monolithic approach | ⬜ |
| Parallel run: shadow mode alongside regex for tuning | ⬜ |

### Phase A6 Priority

Lower priority than A5 flywheel completion and Phase 14 governance. Build when:
1. Regex extractors hit limits on specific indirection patterns
2. `aphoria verify` Direction 2 needs LLM-backed verification
3. `aphoria-suggest` needs richer context than regex observations provide

---

## Enterprise Pilot Success Metrics

### 90-Day Pilot Targets

| Metric | Target | Measurement |
|--------|--------|-------------|
| Patterns captured | 100+ observations | Count in knowledge graph |
| Patterns promoted | 10+ conventions | Count with status=Active |
| Cross-team adoption | 2+ teams connected | Unique team_ids |
| New hire guidance events | 5+ accepted suggestions | Accept rate tracking |
| False positive rate | <10% | FP feedback / total flags |
| Evidence-backed patterns | >50% | Patterns with Research+ evidence |

### 180-Day Production Targets

| Metric | Target | Measurement |
|--------|--------|-------------|
| Knowledge retention | 0 lost patterns on departures | Audit log |
| Onboarding velocity | 50% faster ramp | Time to first PR |
| Convention adoption | 80% across org | Compliance rate |
| SOC 2 evidence | Audit pass | External validation |
| Deprecated pattern migration | 90% complete by sunset | Migration tracking |

---

## Enterprise Simulation UAT

See: `uat/enterprise-simulation-uat.md`