## Vision Update
- Shift from "code-level truth linter" to "self-learning institutional knowledge"
- Evidence-based authority model: merit over titles
- ProductSpec → 0.95 authority, 1 usage to graduate
- Standard (RFC) → 0.85 authority, 3 usages
- Research (ADR) → 0.70 authority, 5 usages
- Commit only → 0.40 authority, 10 usages
- Three-tier knowledge: Policies → Conventions → Observations
- Knowledge compounds with every commit
## Gap Analysis
- Documented missing features for enterprise pilot
- Phases 11-15 spec with implementation details
- Evidence detection, scope hierarchy, lifecycle management
## Roadmap Additions
- Phase 11: Evidence-Based Authority (🎯 current)
- Phase 12: Knowledge Scope Hierarchy
- Phase 13: Knowledge Lifecycle Management
- Phase 14: Governance Workflows
- Phase 15: Evidence Source Integration
## Enterprise Simulation UAT
- 6-month simulation: 3 teams, 19 contributors
- Month-by-month scenarios with expected outcomes
- Success metrics for 90-day and 180-day milestones
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Aphoria Roadmap
Phase 0: StemeDB Foundation ✅
Tracked in: roadmap.md § 5D. Concept Hierarchy
Changes to the core database that Aphoria depends on. Shipped as Phase 5D of the main StemeDB roadmap.
| Aphoria Phase 0 | StemeDB Phase 5D | Status |
|---|---|---|
| 0.1 ConceptPath Type | 5D.1 ConceptPath Type | ✅ |
| 0.2 ConceptPath in Assertion | (implicit in 5D.1) | ✅ |
| 0.3 Hierarchical Index | 5D.4 Hierarchical Query | ✅ |
| 0.4 Alias Store | 5D.3 Alias Store + 5D.5 Alias Resolution | ✅ |
| 0.5 Source Class Inference | 5D.6 Source Class Inference | ✅ |
| 0.6 Concept API Endpoints | 5D.7 Concept API Endpoints | ✅ |
Spec: docs/specs/concept-hierarchy.md
Phase 2: CLI Core ✅
Phase 2 was built before Phase 1 (authoritative corpus expansion). The CLI pipeline works end-to-end with a bootstrapped corpus of 11 hardcoded assertions covering TLS, JWT, CORS, secrets, and rate limiting.
| Task | Status |
|---|---|
| 2.1 Project Walker | ✅ walker/mod.rs, walker/path_mapper.rs, walker/language.rs |
| 2.2 Extractors (10) | ✅ tls_verify, jwt_config, hardcoded_secrets, timeout_config, dep_versions, cors_config, rate_limit, weak_crypto, command_injection, sql_injection |
| 2.3 Ingestion Bridge | ✅ bridge.rs — BLAKE3 hashing, Ed25519 signing, claim→assertion conversion |
| 2.4 Conflict Query | ✅ episteme.rs — LocalEpisteme with check_conflicts() |
| 2.5 Report Output | ✅ report/ — table (comfy-table), JSON, SARIF 2.1.0, markdown |
| 2.6 Acknowledge Command | ✅ lib.rs acknowledge() |
| Baseline & Diff | ✅ lib.rs set_baseline(), show_diff() |
| Status Command | ✅ lib.rs show_status() |
183 tests pass. Clippy and fmt clean.
Phase 2 Code Quality Fixes ✅
Code review improvements to extractors:
| Issue | Fix | Status |
|---|---|---|
| DES/RC4 concept path misclassification | Split check_pattern() into check_hash_pattern() and check_encryption_pattern(); DES/RC4 now use crypto/encryption/algorithm path | ✅ |
| SHA1 edge case undocumented | Added comments and test documenting that SHA1 detection is intentionally broad (triggers for git hashes, etc.) | ✅ |
| JS exec() regex overly broad | Tightened regex to require child_process. prefix or non-word/non-dot preceding character; prevents RegExp.exec() false positives | ✅ |
Phase 2A: Concept Matching ✅
Status: Complete. Tail-path matching (2A.1), alias-aware queries (2A.2), and auto-alias creation (2A.3) all implemented.
2A.1 Leaf-Based Concept Matching (Aphoria-side fix) ✅
Implemented in episteme.rs via ConceptIndex:
- `make_key(subject, predicate)` extracts tail 2 path segments + predicate
- `build(assertions)` creates in-memory index keyed by tail path
- `lookup(subject, predicate)` finds matching authoritative assertions
- `check_conflicts()` uses `ConceptIndex` instead of `QueryEngine` for cross-scheme matching
Integration tests prove TLS and JWT conflicts are detected correctly.
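
The tail-path idea can be illustrated with a minimal sketch. This is not the code in episteme.rs: the `#` separator, the scheme stripping, and the exact signature are assumptions made for illustration only.

```rust
/// Hypothetical re-creation of the idea behind `ConceptIndex::make_key`:
/// keep only the last two path segments of the subject, then append the predicate.
fn make_key(subject: &str, predicate: &str) -> String {
    // Drop the scheme ("code://", "rfc://", ...) if one is present.
    let path = subject.split_once("://").map_or(subject, |(_, rest)| rest);
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    // Keep at most the last two segments; shorter paths are kept whole.
    let start = segments.len().saturating_sub(2);
    // The '#' separator is an arbitrary choice for this sketch.
    format!("{}#{}", segments[start..].join("/"), predicate)
}
```

Under these assumptions, `code://rust/myapp/tls/cert_verification` and `rfc://5246/tls/cert_verification` produce the same key for a given predicate, which is what lets a code claim find its authoritative counterpart across schemes.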
2A.2 Alias Resolution in QueryEngine (StemeDB-side fix) ✅
Wired AliasStore into QueryEngine.execute():
- Added `resolve_aliases: bool` field to `Query` (defaults to `false`)
- Added `alias_store: Option<Arc<dyn AliasStore>>` to `QueryEngine`
- Added `.with_alias_store()` builder method
- When `resolve_aliases: true`, expands subject via `AliasStore.resolve_all()` before index lookup
- Added `fetch_by_subjects()` and `fetch_by_subjects_predicate()` for multi-subject deduplication
- Modified `Query.matches()` to skip subject filtering when aliases are resolved
- Skips fast path (MV lookup) when `resolve_aliases: true`
- Gracefully degrades when no alias store is configured
7 unit tests in engine/tests/alias_resolution.rs. This is the architecturally correct long-term fix that complements leaf matching.
2A.3 Auto-Alias Creation ✅
When Aphoria ingests authoritative assertions and code claims that share leaf names, automatically create aliases:
- `code://rust/myapp/tls/cert_verification` ↔ `rfc://5246/tls/cert_verification`
- `code://rust/myapp/auth/jwt/audience_validation` ↔ `rfc://7519/jwt/audience_validation`
This bridges 2A.1 (leaf matching) with 2A.2 (alias resolution) — leaf matching identifies candidates, aliases persist the relationship.
Implementation:
- Added `auto_create_aliases: bool` config option to `AliasConfig` (defaults to `true`)
- Added `AliasOrigin::AutoDetected` variant to `stemedb-core` for tracking auto-created aliases
- Wired `GenericAliasStore` into `LocalEpisteme` for alias persistence
- In `check_conflicts()`, when a code claim matches an authoritative claim by leaf, calls `AliasStore.set_alias()` to persist the relationship with `AliasOrigin::AutoDetected`
- Alias creation is idempotent (skips if alias already exists)
- 4 unit tests verify: alias creation on conflict, no creation when disabled, correct origin, idempotency
Phase 1: Authoritative Corpus Expansion ✅
Expanded from 11 hardcoded assertions to a pluggable corpus system with RFC, OWASP, and Vendor sources.
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ aphoria corpus build │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌───────────────────────┐ │
│ │ RFC Ingester │ │ OWASP │ │ Vendor Bootstrapper │ │
│ │ (Tier 0) │ │ Ingester │ │ (Tier 2) │ │
│ │ │ │ (Tier 1) │ │ │ │
│ └──────┬───────┘ └──────┬───────┘ └───────────┬───────────┘ │
│ │ │ │ │
│ └─────────────────┼──────────────────────┘ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ CorpusRegistry │ │
│ └────────┬────────┘ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ LocalEpisteme │ │
│ │ ingest_ │ │
│ │ authoritative() │ │
│ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
1.1 CorpusBuilder Trait ✅
| Task | Status |
|---|---|
| CorpusBuilder trait | ✅ corpus/mod.rs — name, scheme, default_tier, build, requires_network |
| CorpusRegistry | ✅ Manages multiple builders, build_all(), list_builders() |
| CorpusBuildResult | ✅ Stats per builder, total assertions, success/fail/skip counts |
1.2 RFC Ingester ✅
| Task | Status |
|---|---|
| RfcCorpusBuilder | ✅ corpus/rfc.rs |
| HTTP fetching | ✅ Via ureq, cached to ~/.cache/aphoria/rfc-cache/ |
| RFC 2119 keyword parsing | ✅ MUST, MUST NOT, SHOULD, SHALL extraction |
| RFC-specific parsers | ✅ JWT (7519), OAuth (6749), Bearer (6750), TLS 1.3 (8446), TLS BCP (7525), TOTP (6238), Basic Auth (7617), HTTP (9110) |
| Concept mapping | ✅ rfc://{number}/{topic} at Tier 0 (Regulatory) |
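
As an illustration of the RFC 2119 keyword parsing row, here is a minimal std-only sketch of classifying one sentence. The `Keyword` enum and both function names are hypothetical, and the real parser in corpus/rfc.rs may work quite differently.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Keyword {
    MustNot,
    Must,
    Shall,
    Should,
}

/// Return the strongest keyword present, in table order (not sentence order).
/// RFC 2119 keywords are uppercase, so matching is case-sensitive.
fn find_keyword(sentence: &str) -> Option<Keyword> {
    // "MUST NOT" is listed before "MUST" so the two-word form wins.
    const TABLE: [(&str, Keyword); 4] = [
        ("MUST NOT", Keyword::MustNot),
        ("MUST", Keyword::Must),
        ("SHALL", Keyword::Shall),
        ("SHOULD", Keyword::Should),
    ];
    TABLE
        .iter()
        .find(|(text, _)| contains_word(sentence, text))
        .map(|(_, keyword)| *keyword)
}

/// True if `needle` occurs in `haystack` with no letter or digit touching it.
fn contains_word(haystack: &str, needle: &str) -> bool {
    haystack.match_indices(needle).any(|(i, _)| {
        let before = haystack[..i].chars().next_back();
        let after = haystack[i + needle.len()..].chars().next();
        !before.map_or(false, |c| c.is_ascii_alphanumeric())
            && !after.map_or(false, |c| c.is_ascii_alphanumeric())
    })
}
```

The word-boundary check is there so that a token such as `MUSTARD` is not misread as a requirement.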
1.3 OWASP Ingester ✅
| Task | Status |
|---|---|
| OwaspCorpusBuilder | ✅ corpus/owasp.rs |
| HTTP fetching | ✅ From GitHub raw content, cached to ~/.cache/aphoria/owasp-cache/ |
| Markdown parsing | ✅ MUST/SHOULD statements, section context |
| Cheat sheet parsers | ✅ Authentication, JWT, TLS, Secrets, Input Validation, Session, CSRF, Password Storage, HTTP Headers |
| Concept mapping | ✅ owasp://cheatsheet/{topic}/{claim} at Tier 1 (Clinical) |
1.4 Vendor Docs ✅
| Task | Status |
|---|---|
| VendorCorpusBuilder | ✅ corpus/vendor.rs |
| PostgreSQL claims | ✅ pool_size, idle_timeout, ssl_mode |
| Redis claims | ✅ timeout, max_retries, tls |
| reqwest claims | ✅ cert_verification, connect_timeout, request_timeout |
| hyper claims | ✅ keep_alive_timeout, max_concurrent_streams |
| Go net/http claims | ✅ read_timeout, write_timeout, idle_timeout, min_tls_version |
| tokio-postgres claims | ✅ pool_size, ssl_mode |
| SQLx claims | ✅ max_connections, idle_timeout |
| Concept mapping | ✅ vendor://{product}/{topic}/{claim} at Tier 2 (Observational) |
1.5 Hardcoded Refactor ✅
| Task | Status |
|---|---|
| HardcodedCorpusBuilder | ✅ corpus/hardcoded.rs — original 11 assertions |
| create_authoritative_assertion() | ✅ Made public in episteme.rs for corpus builders |
1.6 CLI Integration ✅
| Task | Status |
|---|---|
| aphoria corpus build | ✅ Fetches and ingests from all sources |
| --only rfc,owasp,vendor | ✅ Filter to specific sources |
| --offline | ✅ Skip network-requiring sources |
| --clear-cache | ✅ Clear cache before building |
| aphoria corpus list | ✅ List available corpus sources |
| CorpusConfig | ✅ cache_dir, include_*, rfc_list options |
1.7 Error Handling ✅
| Task | Status |
|---|---|
| RfcFetch error | ✅ Per-RFC fetch failures with context |
| OwaspFetch error | ✅ Per-cheat-sheet fetch failures with context |
| CorpusBuild error | ✅ General corpus build failures |
| Graceful degradation | ✅ Continue with other sources if one fails |
Files: corpus/mod.rs, corpus/hardcoded.rs, corpus/rfc.rs, corpus/owasp.rs, corpus/vendor.rs
Phase 3: Skill Integration ✅
Complete. Aphoria is now usable in Claude Code agent workflows.
3.1 Claude Code Skill ✅
| Task | Status |
|---|---|
| skill/SKILL.md | ✅ Comprehensive skill definition with all commands |
| /aphoria scan | ✅ Scan project, show conflicts grouped by verdict |
| /aphoria scan --fix | ✅ Interactive fix workflow |
| /aphoria ack | ✅ Acknowledge conflicts as intentional |
| /aphoria status | ✅ Show status and baseline |
| /aphoria diff | ✅ Show changes since baseline |
| /aphoria init | ✅ Initialize Aphoria |
| /aphoria baseline | ✅ Set baseline |
| skill/install.sh | ✅ Install script for ~/.claude/skills/aphoria/ |
Files: skill/SKILL.md, skill/install.sh, skill/hooks.json
3.2 Agent Pre-Flight Hook ✅
| Task | Status |
|---|---|
| --exit-code flag | ✅ Returns 2 for BLOCK, 1 for FLAG only, 0 for clean |
| --strict flag | ✅ Lower thresholds (FLAG at 0.3, BLOCK at 0.5) |
| Hook template | ✅ skill/hooks.json with PreCommit and PrePush examples |
Usage:
{
"hooks": {
"PreCommit": [{"command": "aphoria scan --format sarif --exit-code"}],
"PrePush": [{"command": "aphoria scan --strict --exit-code"}]
}
}
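
The exit-code contract listed for `--exit-code` can be sketched as a small mapping from verdicts to a process exit status. The `Verdict` enum and `exit_code` function here are hypothetical stand-ins, not Aphoria's actual types.

```rust
/// Hypothetical verdict type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Verdict {
    Pass,
    Flag,
    Block,
}

/// The worst verdict wins: 2 for any BLOCK, 1 for FLAG only, 0 for clean.
fn exit_code(verdicts: &[Verdict]) -> i32 {
    verdicts
        .iter()
        .map(|verdict| match verdict {
            Verdict::Block => 2,
            Verdict::Flag => 1,
            Verdict::Pass => 0,
        })
        .max()
        .unwrap_or(0) // an empty scan result counts as clean
}
```

Taking the maximum means a single BLOCK fails the hook even when every other finding is only a FLAG.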
3.3 Alias Suggestion Workflow ✅
Alias creation is now automatic (Phase 2A.3). When Aphoria scans:
- Tail-path matching finds authoritative assertions
- Aliases are auto-created with `AliasOrigin::AutoDetected`
- Future queries use the alias automatically
The skill documents the suggestion flow for manual alias management:
- y (Accept): Creates alias
- n (Reject): Records intentional difference
- defer: Flags for later review
Phase 4: Full-Cycle Pre-Commit (Scan + Sync) ✅
Vision: The pre-commit hook is a bidirectional knowledge sync, not just a read-only linter. Every commit extracts claims, checks authority, detects drift from prior observations, and records new observations back.
Spec: uat/2026-02-04-full-cycle-precommit-vision.md
┌─────────────────────────────────────────────────────────────┐
│ PRE-COMMIT FLOW │
├─────────────────────────────────────────────────────────────┤
│ 1. EXTRACT → What claims does this code make? │
│ 2. CHECK → Against authority + own prior claims │
│ 3. CLASSIFY → Authority conflict | Self conflict | Novel │
│ 4. UPDATE → Record observations to local Episteme │
│ 5. GATE → Exit code (BLOCK=2, FLAG=1, PASS=0) │
└─────────────────────────────────────────────────────────────┘
4.1 Git Pre-Commit Hook ✅
All flags needed for pre-commit integration are implemented:
#!/bin/sh
# .git/hooks/pre-commit
aphoria scan --staged --persist --sync --exit-code
Or using pre-commit framework:
repos:
  - repo: local
    hooks:
      - id: aphoria
        name: Aphoria Truth Sync
        entry: aphoria scan --staged --persist --sync --exit-code
        language: system
        pass_filenames: false
4.2 Baseline Mode ✅
Already implemented in Phase 2.
4A: Observational Claims ✅
Record code claims as Tier 4 (Community) assertions when no authority conflict exists:
| Task | Status |
|---|---|
| sync: bool in ScanArgs | ✅ types/command.rs |
| observations_recorded: usize in ScanResult | ✅ types/result.rs |
| --sync CLI flag | ✅ cli.rs — requires --persist |
| claim_to_observation() | ✅ bridge.rs — creates Tier 4 (Community, 0.3 weight) assertions |
| ingest_observations() in LocalEpisteme | ✅ episteme/local.rs — writes to WAL + predicate index |
| Scan flow integration | ✅ scan.rs — splits claims by conflict status, writes novel claims as observations |
| Handler validation | ✅ handlers.rs — --sync requires --persist error |
| Report output | ✅ report/table.rs, report/json.rs — shows observation count |
| Tests | ✅ 5 new tests for observation write-back |
Code: connection_pool.max_size = 25
Authority: (nothing)
Action: Record as Tier 4 observation (project memory)
Usage:
# Scan with observation write-back
aphoria scan --persist --sync
# Output:
# Recorded 45 observations (project memory)
4B: Self-Conflict Detection ✅
Detect drift from the project's own prior observations:
| Task | Status |
|---|---|
| Query prior claims before conflict check | ✅ fetch_observations_for_concept() |
| Compare current vs stored observations | ✅ check_drift() compares values |
| Report changes as SELF-CONFLICT | ✅ DriftResult with prior/current values |
| New verdict: Drift (distinct from Block/Flag) | ✅ Verdict::Drift |
| Drift reporting in all formats | ✅ table, json, markdown, sarif |
| Exit code includes drift | ✅ --exit-code returns 1 for drift |
Prior: db/pool_size = 25 (recorded 2026-01-15)
Now: db/pool_size = 100
Result: DRIFT — "You changed pool_size from 25 to 100. Intentional?"
Files: types/result.rs, types/verdict.rs, episteme/local.rs, scan.rs, report/*.rs
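
A minimal sketch of the drift comparison, assuming observations are reduced to plain string values keyed by concept path. The `DriftResult` fields and the `check_drift` signature are illustrative guesses, not the definitions in episteme/local.rs.

```rust
use std::collections::HashMap;

/// Hypothetical shape of a drift finding: the same concept with two different values.
#[derive(Debug, PartialEq)]
struct DriftResult {
    concept: String,
    prior: String,
    current: String,
}

/// Report every concept whose current value differs from the stored observation.
/// Concepts with no prior observation are novel, not drift, and are skipped.
fn check_drift(
    prior: &HashMap<String, String>,
    current: &HashMap<String, String>,
) -> Vec<DriftResult> {
    let mut drifts: Vec<DriftResult> = current
        .iter()
        .filter_map(|(concept, now)| {
            let before = prior.get(concept)?;
            (before != now).then(|| DriftResult {
                concept: concept.clone(),
                prior: before.clone(),
                current: now.clone(),
            })
        })
        .collect();
    // Sort for stable report output; HashMap iteration order is unspecified.
    drifts.sort_by(|a, b| a.concept.cmp(&b.concept));
    drifts
}
```

With a stored `db/pool_size` of `25` and a current value of `100`, this returns one `DriftResult`; a newly seen concept returns nothing and would be recorded as a fresh observation instead.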
4C: Diff-Only Scanning ✅
Fast scanning for pre-commit hooks:
| Task | Status |
|---|---|
| FileSource enum (All, Staged) | ✅ types/command.rs |
| --staged flag (git diff --cached) | ✅ cli.rs, handlers.rs |
| walker/git.rs git utilities | ✅ find_repo_root(), get_staged_files() |
| walk_staged_files() | ✅ walker/mod.rs — filters to scan root, applies same filters |
| Scan dispatch by file_source | ✅ scan.rs |
| Error handling (NotGitRepo, GitCommand) | ✅ error.rs |
| Tests | ✅ 9 tests in tests/staged_scanning.rs |
| Target: < 500ms for staged-only | ✅ |
Files: types/command.rs, walker/git.rs, walker/mod.rs, scan.rs, cli.rs, handlers.rs, error.rs
Usage:
# Pre-commit hook (fast, staged files only)
aphoria scan --staged --exit-code
# Full cycle with observation sync
aphoria scan --staged --persist --sync --exit-code
4D: Enhanced Ack ✅
Acknowledgments with rationale and policy updates:
| Task | Status |
|---|---|
| --reason "text" flag | ✅ cli.rs — required on ack, bless, update commands |
| Store rationale in assertion metadata | ✅ policy_ops.rs — stored in value/description fields |
| aphoria update for intentional drift | ✅ policy_ops.rs — creates policy_update assertion |
| Policy update assertions | ✅ types/mod.rs — predicates::POLICY_UPDATE |
Files: cli.rs, handlers.rs, policy_ops.rs, types/command.rs, types/mod.rs
$ aphoria ack db/pool_size --reason "Scaling for Black Friday"
$ aphoria update db/pool_size 100 --reason "New baseline after load test"
4E: Hosted Mode ✅
Organizations run their own StemeDB server and all team members automatically sync observations:
| Task | Status |
|---|---|
| HostedConfig in config.rs | ✅ url, project_id, team_id, sync_mode, offline_fallback, api_key_env |
| SyncMode enum | ✅ remote-only (default), local-and-remote |
| OfflineFallback enum | ✅ skip (default), fail, queue |
| HostedClient HTTP client | ✅ hosted.rs — retry logic, auth headers, observation push |
| POST /v1/aphoria/observations endpoint | ✅ Server receives observations with project/team metadata |
| Scan integration | ✅ Auto-enables sync when [hosted] configured |
| Hosted(String) error variant | ✅ For connection/auth failures |
| Graceful offline fallback | ✅ Based on offline_fallback config |
| Tests | ✅ Config parsing, client creation, assertion conversion |
# aphoria.toml
[hosted]
url = "https://episteme.acme.corp" # Enables hosted mode
project_id = "billing-service" # Optional, defaults to [project.name]
team_id = "platform-team" # Optional, for multi-team servers
sync_mode = "remote-only" # "remote-only" | "local-and-remote"
offline_fallback = "skip" # "skip" | "fail" | "queue"
api_key_env = "APHORIA_API_KEY" # Env var for auth token
Architecture:
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Developer A │ │ Developer B │ │ Developer C │
│ aphoria scan │ │ aphoria scan │ │ aphoria scan │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
└─────────────────┼─────────────────┘
▼
┌─────────────────────┐
│ Team StemeDB Server │
│ POST /v1/aphoria/ │
│ observations │
└─────────────────────┘
│
▼
Aggregated team patterns
Files: config.rs, hosted.rs, scan.rs, error.rs, lib.rs, crates/stemedb-api/src/handlers/aphoria.rs, crates/stemedb-api/src/dto/aphoria.rs
Phase 4.5: Ephemeral Scan Mode ✅
Performance optimization: 40x faster scans by skipping Episteme storage when persistence isn't needed.
Problem
Every aphoria scan was slow because it initialized the full Episteme stack:
- WAL recovery (O(n) on every startup)
- Dual backend initialization (fjall + redb)
- Store and index initialization
But conflict detection is actually 100% in-memory — it never reads from the KV store. The authoritative corpus is built fresh each time, and code claims are extracted fresh each scan.
Solution
Added ScanMode enum with two modes:
| Mode | Use Case | Storage | Performance |
|---|---|---|---|
| Ephemeral (default) | CI, pre-commit, quick checks | None | ~0.25 seconds |
| Persistent | Baseline/diff tracking, alias creation | WAL + store | ~1-2 seconds |
Implementation ✅
| Task | Status |
|---|---|
| ScanMode enum | ✅ types.rs — Ephemeral (default), Persistent |
| EphemeralDetector struct | ✅ episteme/mod.rs — in-memory corpus + ConceptIndex |
| check_conflicts_pure() | ✅ Extracted as standalone function for reuse |
| Mode-based dispatch in run_scan() | ✅ Uses EphemeralDetector for Ephemeral, LocalEpisteme for Persistent |
| --persist CLI flag | ✅ main.rs — opt-in to persistent mode |
| Tests for both modes | ✅ test_ephemeral_scan_no_storage_created, test_persistent_scan_creates_storage, test_scan_modes_produce_same_conflicts |
Usage
# Fast ephemeral scan (default) — no storage created
aphoria scan .
# Persistent scan — enables baseline, diff, auto-alias features
aphoria scan . --persist
Performance
| Mode | Time | Storage |
|---|---|---|
| Ephemeral | ~0.25s | None |
| Persistent | ~1-2s | WAL + store directories |
Files: types.rs, episteme/mod.rs, lib.rs, main.rs, tests.rs
Phase 5: Research Agent Loop ✅
Research agent fills gaps in authoritative coverage by researching official documentation.
5.1 Gap Detection ✅
| Task | Status |
|---|---|
| Gap struct | ✅ research/gap_detector.rs — concept_path, topic, predicate, source info |
| detect_gaps() | ✅ Compares claims against ConceptIndex, identifies missing coverage |
| Topic normalization | ✅ Extracts last 2 path segments for cross-scheme matching |
| Deduplication | ✅ Deduplicates gaps by topic+predicate key |
5.2 Gap Storage ✅
| Task | Status |
|---|---|
| GapRecord | ✅ research/gap_store.rs — tracking metadata, project count, research status |
| GapStore | ✅ JSON-backed persistent storage with atomic saves |
| Project tracking | ✅ Records which projects reported each gap |
| Research eligibility | ✅ is_eligible_for_research() with threshold and cooldown |
| Gap pruning | ✅ prune_old_gaps() removes stale entries |
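
The eligibility rule (threshold plus cooldown) can be sketched as follows. The two `GapRecord` fields shown and the explicit `now` parameter are assumptions; the roadmap does not specify the cooldown length, so the caller passes it in.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical, trimmed-down gap record; the real one tracks more metadata.
struct GapRecord {
    project_count: usize,
    last_researched: Option<SystemTime>,
}

/// A gap qualifies once enough projects have reported it and, if it was
/// researched before, the cooldown has fully elapsed.
fn is_eligible_for_research(
    gap: &GapRecord,
    threshold: usize,
    cooldown: Duration,
    now: SystemTime,
) -> bool {
    if gap.project_count < threshold {
        return false;
    }
    match gap.last_researched {
        None => true,
        // A clock that went backwards yields Err; treat that as "not yet".
        Some(last) => now
            .duration_since(last)
            .map_or(false, |elapsed| elapsed >= cooldown),
    }
}
```

Passing `now` explicitly keeps the check deterministic in tests; a caller would normally supply `SystemTime::now()` and the `--threshold` value (default 3).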
5.3 Quality Validation ✅
| Task | Status |
|---|---|
| QualityValidator | ✅ research/quality.rs — validates researched claims |
| Source attribution | ✅ Checks for authoritative domains (rfc-editor, owasp, vendor docs) |
| Normative language | ✅ Verifies MUST/SHOULD/SHALL keywords present |
| Vague content detection | ✅ Rejects "it depends", "typically", etc. |
| Consistency scoring | ✅ Detects conflicting claims on same subject |
| QualityReport | ✅ Detailed per-claim validation results |
| filter_passed() | ✅ Returns only claims meeting quality threshold |
5.4 Research Execution ✅
| Task | Status |
|---|---|
| Researcher | ✅ research/researcher.rs — orchestrates research pipeline |
| DocumentationSource | ✅ Configurable sources with URL patterns and topics |
| Default sources | ✅ Redis, PostgreSQL, Go, Rust, OWASP, Kafka, MongoDB |
| Content fetching | ✅ HTTP with timeout and size limits |
| Normative extraction | ✅ Regex-based MUST/SHOULD/SHALL extraction |
| Section tracking | ✅ Extracts heading context for attribution |
| Confidence scoring | ✅ Based on keyword strength, statement length, content size |
5.5 CLI Integration ✅
| Task | Status |
|---|---|
| aphoria research run | ✅ Run research agent with configurable threshold |
| aphoria research status | ✅ Show gap statistics and research progress |
| aphoria research gaps | ✅ List gaps by project count |
| --threshold | ✅ Minimum projects before researching (default: 3) |
| --strict | ✅ Use strict quality validation |
| --prune | ✅ Remove stale gaps before researching |
| --ready | ✅ Show only gaps ready for research |
Files: research/mod.rs, research/gap_detector.rs, research/gap_store.rs, research/quality.rs, research/researcher.rs, research/tests.rs
5.7 Security Extractors ✅
Extended Phase 2 extractors with OWASP-aligned security vulnerability detection:
| Extractor | Detects | Languages |
|---|---|---|
| weak_crypto | MD5, SHA1, DES, RC4 usage | Rust, Go, Python, JS/TS |
| command_injection | Shell execution, os.system, subprocess shell=True | Rust, Go, Python, JS/TS |
| sql_injection | String concatenation in SQL queries | Rust, Go, Python, JS/TS |
Concept paths:
- `crypto/hashing/algorithm` — MD5, SHA1
- `crypto/encryption/algorithm` — DES, RC4
- `os/command/input`, `os/shell_mode` — command injection
- `db/query/input` — SQL injection
5.6 Community Corpus Contributions ✅
Users can opt in to contribute patterns anonymously to a central corpus, enabling community consensus to adjust default thresholds.
| Task | Status |
|---|---|
| CommunityConfig | ✅ config/mod.rs — enabled (false), anonymize (true), exclude, include, min_confidence |
| AnonymizedObservation | ✅ community/types.rs — privacy-preserving observation without file/line/text |
| CommunityObjectValue | ✅ community/types.rs — serde-compatible version of ObjectValue |
| PatternAggregate | ✅ community/types.rs — server-side aggregation with project counts |
| anonymize_claim() | ✅ community/anonymizer.rs — wildcards project names, strips file/line, rounds timestamps |
| compute_anon_hash() | ✅ Hash computed WITHOUT file/line/text (privacy-critical) |
| wildcard_project_path() | ✅ code://rust/myapp/tls → code://rust/*/tls |
| --community-preview flag | ✅ cli.rs — dry-run showing what WOULD be shared |
| PatternAggregateStore | ✅ stemedb-storage — server-side pattern aggregation |
| Project deduplication | ✅ Uses project_hash to prevent double-counting |
| POST /v1/aphoria/community/observations | ✅ Push anonymized observations |
| GET /v1/aphoria/patterns | ✅ Retrieve high-confidence community patterns |
Privacy Model:
- Project names wildcarded: `myapp` → `*`
- File paths, line numbers, matched text NEVER shared
- Timestamps rounded to hour (k-anonymity)
- Server receives `project_hash`, not raw project names
- `enabled` defaults to `false` (explicit opt-in required)
- `anonymize` defaults to `true` (privacy-preserving by default)
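
Two of the anonymization steps above lend themselves to a short sketch. It assumes code subjects follow the `code://{language}/{project}/...` layout seen in the examples and that timestamps are Unix seconds; `round_to_hour` is a hypothetical helper name, and the real community/anonymizer.rs may differ.

```rust
/// Replace the project segment of a `code://` subject with `*`.
/// Subjects from other schemes are returned unchanged.
fn wildcard_project_path(subject: &str) -> String {
    let (scheme, path) = match subject.split_once("://") {
        Some(parts) => parts,
        None => return subject.to_string(),
    };
    let mut segments: Vec<&str> = path.split('/').collect();
    // Assumed layout: segment 0 is the language, segment 1 is the project name.
    if scheme == "code" && segments.len() >= 2 {
        segments[1] = "*";
    }
    format!("{}://{}", scheme, segments.join("/"))
}

/// Round a Unix timestamp (in seconds) down to the start of its hour.
fn round_to_hour(unix_secs: u64) -> u64 {
    unix_secs - unix_secs % 3600
}
```

Rounding down rather than to the nearest hour is a choice made for this sketch; the roadmap only states that timestamps are rounded to the hour.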
Usage:
# Preview what would be shared (no network)
aphoria scan --community-preview
# Enable in aphoria.toml:
[community]
enabled = true
anonymize = true
min_confidence = 0.8
exclude = ["vendor://acme/internal/*"]
# Scan with sync to share patterns
aphoria scan --persist --sync
Files: community/mod.rs, community/types.rs, community/anonymizer.rs, config/mod.rs, cli.rs, handlers.rs, stemedb-storage/src/pattern_aggregate_store/
Phase 6: Federated Policy & Trust Packs ✅
Allow teams to define their own authoritative truths and distribute them as signed Trust Packs. This enables "Enterprise Grade" compliance across distributed teams.
6.1 Trust Pack Format ✅
| Task | Status |
|---|---|
| TrustPack schema | ✅ policy.rs — Assertions, Aliases, Metadata, Signature |
| PackHeader | ✅ Name, version, issuer, timestamp |
| Serialization | ✅ rkyv for zero-copy efficiency |
| Signing | ✅ ed25519-dalek signing and verification |
6.2 Policy Management ✅
| Task | Status |
|---|---|
| PolicyManager | ✅ Loads local and remote (HTTP/HTTPS) policies |
| Caching | ✅ Caches remote policies in ~/.cache/aphoria/policies/ |
| aphoria.toml config | ✅ policies list support |
6.3 Core Integration ✅
| Task | Status |
|---|---|
| EphemeralDetector integration | ✅ Ingests policies into memory corpus/index |
| check_conflicts_pure update | ✅ Resolves policy aliases before authoritative lookup |
| LocalEpisteme export helpers | ✅ fetch_acknowledgments, fetch_manual_aliases |
6.4 CLI Commands ✅
| Task | Status |
|---|---|
| aphoria policy export | ✅ Exports local ack decisions as a Trust Pack |
| aphoria scan policy loading | ✅ Auto-loads policies from config |
Files: policy.rs, config.rs, episteme/mod.rs, lib.rs, main.rs
Phase 6.5: Trust Pack Extensions ✅
Enhancements to Trust Packs for semantic predicate matching and key management.
6.5.1 Predicate Aliases ✅
Status: Complete. Implemented: 2026-02-06.
User Story:
As a security architect, when my policy uses `required=true` but the extractor emits `enabled=true`, I need them to match semantically.
Problem:
- Policy blesses: `code://standard/tls/cert_verification` with predicate `required`, value `true`
- Extractor emits: `code://config/tls/cert_verification` with predicate `enabled`, value `false`
- Tail-path matching finds the concept (`tls/cert_verification`) ✓
- But predicates differ: `required` vs `enabled` — no conflict detected ✗
Solution:
| Task | Description |
|---|---|
| predicate_aliases field | Add to Trust Pack schema |
| Default aliases | enabled ↔ required ↔ mandatory ↔ enforced |
| ConceptIndex update | Check aliases during lookup |
| Pack-defined aliases | Allow packs to specify custom alias sets |
Trust Pack Schema Extension:
# In Trust Pack
[predicate_aliases]
security_enabled = ["enabled", "required", "mandatory", "enforced", "active"]
version_minimum = ["min_version", "minimum_version", "tls_min_version"]
Implementation Plan:
- Add `predicate_aliases: HashMap<String, Vec<String>>` to `TrustPack`
- Store aliases alongside assertions during import
- Update `ConceptIndex.make_key()` to normalize predicates via aliases
- Match during conflict detection: if `predicate_a` aliases to `predicate_b`, treat as same concept
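
The normalization step in the plan above can be sketched with the same `HashMap<String, Vec<String>>` shape. The function name `normalize_predicate` and the convention that each map key is the canonical name are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Map a predicate to the canonical name of its alias group, if it has one.
/// Assumes alias groups are disjoint; if a predicate appeared in two groups,
/// the result would depend on HashMap iteration order.
fn normalize_predicate(predicate: &str, aliases: &HashMap<String, Vec<String>>) -> String {
    for (canonical, members) in aliases {
        if canonical == predicate || members.iter().any(|member| member == predicate) {
            return canonical.clone();
        }
    }
    // Unknown predicates pass through unchanged.
    predicate.to_string()
}
```

With the `security_enabled` group from the schema example, `required` and `enabled` both normalize to `security_enabled`, so a policy that blesses `required=true` can conflict with an extracted `enabled=false`.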
6.5.2 Pack Signing Key Rotation ✅
Status: Complete. Implemented: 2026-02-06.
User Story:
As a security admin, when our signing key is rotated, I need to re-sign all packs without losing policy content.
Problem:
- Trust Packs are signed with Ed25519 keys
- When keys are rotated (security best practice), existing packs become unverifiable
- Need to re-sign packs with new key while preserving content hash
Solution:
| Task | Description |
|---|---|
| aphoria policy resign | CLI command to re-sign pack with new key |
| Content hash preservation | Keep content_hash unchanged, only update signature |
| Key rotation audit | Log key rotation events |
| Old signature archival | Optionally keep old signature for audit trail |
CLI:
# Re-sign pack with new key
aphoria policy resign my-standards.pack --key-file new-private-key.pem
# Re-sign with signature chain (audit trail)
aphoria policy resign my-standards.pack --key-file new-key.pem --chain-signatures
Trust Pack Schema Extension:
pub struct TrustPack {
    // Existing fields...
    pub signature: Signature,
    // New field for key rotation audit
    pub signature_chain: Option<Vec<SignatureRecord>>,
}

pub struct SignatureRecord {
    pub issuer_public_key: [u8; 32],
    pub signature: Signature,
    pub signed_at: DateTime<Utc>,
    pub reason: Option<String>, // "Key rotation", "Security incident", etc.
}
6.5.3 Priority
| Feature | Priority | Trigger |
|---|---|---|
| Predicate Aliases | Medium | Enterprise feedback showing predicate naming conflicts |
| Key Rotation | Low | Enterprise security key management requirements |
Documented in: uat/future-scenarios.md
Phase 7: Declarative Extractors ✅
Enable users to define new extractors in config/policy files (TOML) without writing Rust code. This removes the recompilation bottleneck for custom pattern enforcement.
User Outcome: "I added a custom extractor to my aphoria.toml that detects our company's deprecated API patterns. Now every scan flags files using the old pattern without me writing any Rust code."
7.1 Core Types ✅
| Task | Status |
|---|---|
| DeclarativeExtractorDef | ✅ extractors/declarative.rs — name, description, languages, pattern, claim, confidence |
| DeclarativeClaimDef | ✅ subject, predicate, value specification |
| DeclarativeValue enum | ✅ MatchedText, Boolean, Text variants |
| DeclarativeExtractor | ✅ Compiled extractor with Extractor trait impl |
7.2 Configuration ✅
| Task | Status |
|---|---|
| ExtractorConfig.declarative | ✅ config/mod.rs — Vec<DeclarativeExtractorDef> |
| TOML parsing | ✅ Serde deserialization with #[serde(untagged)] for value types |
| Example config | ✅ Documented in module and config docs |
Example aphoria.toml:
[[extractors.declarative]]
name = "deprecated_api_v1"
description = "Detects usage of deprecated v1 API endpoints"
languages = ["go", "rust", "python"]
pattern = '/api/v1/\w+'
claim.subject = "api/deprecated_endpoint"
claim.predicate = "version"
claim.value = "v1"
confidence = 1.0
[[extractors.declarative]]
name = "legacy_encryption"
description = "Detects legacy encryption algorithms"
languages = ["rust", "go", "python", "javascript"]
pattern = '(?i)blowfish|twofish|cast5'
claim.subject = "crypto/encryption/algorithm"
claim.predicate = "algorithm"
claim.value_from_match = true
confidence = 0.9
7.3 Validation & Security ✅
| Task | Status |
|---|---|
| Name validation | ✅ Non-empty required |
| Subject/predicate validation | ✅ Non-empty required |
| Confidence validation | ✅ Must be 0.0-1.0 |
| Regex validation | ✅ Compiled at load time, not scan time |
| ReDoS protection | ✅ RegexBuilder with 10MB size limits |
| Language parsing | ✅ Language::from_str() with FromStr trait |
| Graceful failure | ✅ Invalid extractors logged as warnings, don't block others |
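
The non-regex checks in this table, together with the graceful-failure behaviour, can be sketched as below. `ExtractorDef`, `ValidationError`, and `partition_valid` are hypothetical names; regex compilation and language parsing are omitted because they depend on types outside the standard library.

```rust
/// Hypothetical, trimmed-down stand-in for `DeclarativeExtractorDef`.
struct ExtractorDef {
    name: String,
    subject: String,
    predicate: String,
    confidence: f64,
}

#[derive(Debug, PartialEq)]
enum ValidationError {
    EmptyName,
    EmptySubject,
    EmptyPredicate,
    ConfidenceOutOfRange,
}

fn validate(def: &ExtractorDef) -> Result<(), ValidationError> {
    if def.name.is_empty() {
        return Err(ValidationError::EmptyName);
    }
    if def.subject.is_empty() {
        return Err(ValidationError::EmptySubject);
    }
    if def.predicate.is_empty() {
        return Err(ValidationError::EmptyPredicate);
    }
    // NaN is rejected too, since it is not contained in any range.
    if !(0.0..=1.0).contains(&def.confidence) {
        return Err(ValidationError::ConfidenceOutOfRange);
    }
    Ok(())
}

/// Keep valid definitions and collect the failures, so that one bad
/// extractor does not block the others.
fn partition_valid(defs: Vec<ExtractorDef>) -> (Vec<ExtractorDef>, Vec<(String, ValidationError)>) {
    let mut valid = Vec::new();
    let mut rejected = Vec::new();
    for def in defs {
        match validate(&def) {
            Ok(()) => valid.push(def),
            Err(error) => rejected.push((def.name.clone(), error)),
        }
    }
    (valid, rejected)
}
```

In this sketch a caller would log each rejected entry as a warning and register only the valid definitions.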
7.4 Registry Integration ✅
| Task | Status |
|---|---|
| Module export | ✅ extractors/mod.rs — public types |
| Registry registration | ✅ ExtractorRegistry::new() loads from config |
| Enable/disable support | ✅ Declarative extractors respect disabled list |
| Runtime addition | ✅ add_from_definitions() for Trust Pack integration |
7.5 Error Handling ✅
| Task | Status |
|---|---|
| DeclarativeExtractor error variant | ✅ error.rs — name + message |
| Validation errors | ✅ Clear messages for each failure mode |
| Structured logging | ✅ tracing::warn! for compilation failures |
7.6 Tests ✅
| Task | Status |
|---|---|
| Unit tests | ✅ 22 tests in declarative.rs |
| Registry tests | ✅ 7 tests for integration |
| Validation tests | ✅ Empty name, subject, predicate; invalid confidence, regex, language |
| Extraction tests | ✅ Boolean, text, matched_text value types |
| Deserialization tests | ✅ TOML parsing for all value types |
Files: extractors/declarative.rs, extractors/mod.rs, config/mod.rs, types/language.rs, error.rs
Phase 7.5: LLM-in-the-Loop Extraction ✅
Use an LLM (Gemini) to extract claims semantically during persistent scans. This fills gaps that regex extractors can't catch, providing immediate value while the learning system builds up pattern knowledge.
Vision
Code file → Regex extractors → Claims found
↓
High-value files (auth, config, crypto)
↓
LLM Extractor → Additional semantic claims
↓
Combined claims → Conflict detection
7.5.1 LLM Extractor Implementation ✅
| Task | Status |
|---|---|
| GeminiClient struct | ✅ llm/client.rs — Gemini API client using ureq |
| LlmExtractor struct | ✅ llm/extractor.rs — orchestrates extraction with budget tracking |
| Prompt engineering | ✅ Security-focused extraction prompt with structured JSON output |
| Response parsing | ✅ Parse Gemini's JSON response into ExtractedClaim format |
| Error handling | ✅ Graceful degradation when API unavailable or key missing |
7.5.2 Selective Triggering ✅
| Task | Status |
|---|---|
| is_high_value_file() | ✅ llm/extractor.rs — auth/, config/, crypto/, security/, secrets/, certs/, ssl/, tls/, keys/, credentials/ directories |
| High-value file names | ✅ secret, password, credential, token, auth, login, session, jwt, tls, ssl, cert, key, config, settings, security, crypto, encrypt, decrypt, oauth, saml, ldap, api_key, apikey, access_key, private |
| Token budget | ✅ max_tokens_per_scan (default 50k), max_tokens_per_file (default 4k) |
| Skip conditions | ✅ Only runs when regex extractors found nothing AND file is high-value |
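A minimal sketch of the directory and file-name checks described above, assuming a plain substring test on the lowercased file name. The body is illustrative rather than the contents of llm/extractor.rs, the keyword list is only a subset of the table, and both constant names are hypothetical.

```rust
use std::path::Path;

/// Directory names treated as high-value (taken from the table above).
const HIGH_VALUE_DIRS: [&str; 10] = [
    "auth", "config", "crypto", "security", "secrets",
    "certs", "ssl", "tls", "keys", "credentials",
];

/// A subset of the file-name keywords from the table above.
const HIGH_VALUE_KEYWORDS: [&str; 8] = [
    "secret", "password", "credential", "token", "jwt", "cert", "oauth", "api_key",
];

/// A file is high-value if any parent directory is on the list,
/// or if its lowercased file name contains one of the keywords.
pub fn is_high_value_file(path: &Path) -> bool {
    let in_high_value_dir = path
        .parent()
        .into_iter()
        .flat_map(|p| p.components())
        .filter_map(|c| c.as_os_str().to_str())
        .any(|dir| HIGH_VALUE_DIRS.iter().any(|d| d.eq_ignore_ascii_case(dir)));

    let name = path
        .file_name()
        .and_then(|n| n.to_str())
        .map(|n| n.to_ascii_lowercase())
        .unwrap_or_default();
    let name_matches = HIGH_VALUE_KEYWORDS.iter().any(|k| name.contains(*k));

    in_high_value_dir || name_matches
}
```

With these lists, src/auth/handler.rs qualifies through its directory and src/util/jwt_utils.rs through its file name, while src/util/strings.rs does not qualify.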
7.5.3 Cost Controls ✅
| Task | Status |
|---|---|
| Token tracking | ✅ Arc<AtomicUsize> for thread-safe budget tracking across files |
| BLAKE3 caching | ✅ llm/cache.rs — content hash + model + prompt version for cache key |
| Cache location | ✅ ~/.cache/aphoria/llm-cache/ |
| Budget enforcement | ✅ within_budget() check before each LLM call |
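The budget check can be sketched with a shared atomic counter. `TokenBudget` and its `record` and `used` methods are hypothetical names; only `within_budget()` and the `Arc<AtomicUsize>` counter come from the table above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Shared, thread-safe token budget for one scan (hypothetical type name).
#[derive(Clone)]
pub struct TokenBudget {
    used: Arc<AtomicUsize>,
    max_tokens_per_scan: usize,
}

impl TokenBudget {
    pub fn new(max_tokens_per_scan: usize) -> Self {
        Self { used: Arc::new(AtomicUsize::new(0)), max_tokens_per_scan }
    }

    /// True if a call expected to cost `estimated` tokens still fits in the scan budget.
    pub fn within_budget(&self, estimated: usize) -> bool {
        self.used.load(Ordering::Relaxed).saturating_add(estimated) <= self.max_tokens_per_scan
    }

    /// Records tokens actually consumed by a completed call.
    pub fn record(&self, tokens: usize) {
        self.used.fetch_add(tokens, Ordering::Relaxed);
    }

    /// Total tokens recorded so far across all clones of this budget.
    pub fn used(&self) -> usize {
        self.used.load(Ordering::Relaxed)
    }
}
```

The check and the record are separate steps, so concurrent workers can overshoot the limit by roughly one call each. That trade-off keeps the hot path lock-free; a stricter variant could reserve tokens with a compare-and-swap loop.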
7.5.4 Configuration ✅
# aphoria.toml
[llm]
enabled = true # Enable LLM extraction (default: false)
provider = "gemini" # Only "gemini" supported
# model defaults to DEFAULT_LLM_MODEL (currently "gemini-3-flash-preview")
api_key_env = "GEMINI_API_KEY" # Environment variable for API key
max_tokens_per_scan = 50000 # Budget per scan
max_tokens_per_file = 4000 # Budget per file (for max_output_tokens)
high_value_only = true # Only use on auth/config/crypto files
cache_responses = true # Cache by content hash
timeout_secs = 60 # API timeout
min_confidence = 0.7 # Filter claims below this confidence
Files: llm/mod.rs, llm/client.rs, llm/extractor.rs, llm/cache.rs, config/mod.rs, scan.rs, error.rs
Phase 7.6: Pattern Learning Store ✅
When the LLM extracts a claim that the regex extractors missed, record the pattern. Track which patterns recur across projects to identify candidates for promotion to declarative extractors.
Vision
LLM extracts claim from code
↓
Pattern not in learned store?
↓
Store: { example_code, claim, project_hash }
↓
Same pattern seen in 5+ projects?
↓
Flag for promotion to declarative extractor
7.6.1 LearnedPattern Schema ✅
| Task | Status |
|---|---|
| ValueType enum | ✅ learning/types.rs — Text, Number, Boolean |
| ClaimTemplate struct | ✅ learning/types.rs — subject_template, predicate, value_type, description |
| LearnedPattern struct | ✅ learning/types.rs — full schema with timestamps, project hashes, confidence tracking |
| Serde serialization | ✅ JSON serialization with chrono timestamps |
| Tests | ✅ 5 unit tests for types |
7.6.2 PatternStore Implementation ✅
| Task | Status |
|---|---|
| PatternStore trait | ✅ learning/store.rs — abstract storage interface |
| LocalPatternStore | ✅ JSON-backed local storage at ~/.aphoria/learning/patterns.json |
| RwLock thread safety | ✅ Write-through cache with in-memory HashMap |
| Deduplication | ✅ find_similar() with Levenshtein similarity threshold 0.8 |
| Pruning | ✅ prune_stale() removes patterns not seen in N days |
| Tests | ✅ 8 unit tests for store operations |
7.6.3 Pattern Normalization ✅
| Task | Status |
|---|---|
| normalize_pattern() | ✅ learning/normalizer.rs — replaces literals with placeholders |
| Version detection | ✅ "1.0", "TLSv1.2" → <string:version> |
| Boolean detection | ✅ true/false → <boolean> |
| Number detection | ✅ Standalone numbers → <number> |
| String detection | ✅ Remaining quoted strings → <string> |
| pattern_similarity() | ✅ Levenshtein distance normalized to 0.0-1.0 |
| Tests | ✅ 17 unit tests for normalization |
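The normalized Levenshtein score can be sketched as follows. This is an illustrative two-row implementation, not the code in learning/normalizer.rs; `pattern_similarity` matches the name in the table, while the `levenshtein` helper and the choice to divide by the longer string's length are assumptions.

```rust
/// Levenshtein edit distance over Unicode scalar values (two-row dynamic programming).
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            curr[j + 1] = substitution.min(prev[j + 1] + 1).min(curr[j] + 1);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Similarity in 0.0-1.0: 1.0 for identical strings, 0.0 when every position differs.
pub fn pattern_similarity(a: &str, b: &str) -> f32 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0; // two empty patterns are identical
    }
    1.0 - levenshtein(a, b) as f32 / longest as f32
}
```

Under this definition, two normalized patterns that differ only in an identifier suffix (for example TLS_MIN versus TLS_MINIMUM) score above the 0.8 deduplication threshold.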
7.6.4 Configuration ✅
# aphoria.toml
[learning]
enabled = true # Enable pattern learning (default: false)
store = "local" # "local" | "hosted"
min_confidence = 0.7 # Minimum LLM confidence to learn
prune_after_days = 90 # Remove patterns not seen in N days
[learning.promotion]
min_projects = 5 # Projects needed before promotion
min_confidence = 0.8 # Average confidence needed
auto_promote = false # Require human approval (Phase 7.7)
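How the promotion thresholds above might be applied is sketched below. `PatternStats`, `PromotionThresholds`, `observe`, and `is_promotion_candidate` are hypothetical names, and treating both thresholds as inclusive minimums is an assumption.

```rust
use std::collections::HashSet;

/// Thresholds mirroring the [learning.promotion] keys above.
pub struct PromotionThresholds {
    pub min_projects: usize,
    pub min_confidence: f32,
}

/// Hypothetical, trimmed-down pattern record.
pub struct PatternStats {
    pub project_hashes: HashSet<String>,
    pub occurrences: u32,
    pub avg_confidence: f32,
    pub promoted: bool,
}

impl PatternStats {
    /// Folds one more observation into the running average and the project set.
    pub fn observe(&mut self, project_hash: &str, confidence: f32) {
        let n = self.occurrences as f32;
        self.avg_confidence = (self.avg_confidence * n + confidence) / (n + 1.0);
        self.occurrences += 1;
        // A HashSet counts each project once, however often the pattern recurs in it.
        self.project_hashes.insert(project_hash.to_string());
    }

    /// A pattern qualifies once enough distinct projects and enough confidence are reached.
    pub fn is_promotion_candidate(&self, t: &PromotionThresholds) -> bool {
        !self.promoted
            && self.project_hashes.len() >= t.min_projects
            && self.avg_confidence >= t.min_confidence
    }
}
```

Counting distinct project hashes rather than raw occurrences means a pattern repeated many times in one repository does not qualify on its own.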
7.6.5 Scan Integration ✅
| Task | Status |
|---|---|
| Initialize pattern store | ✅ scan.rs — only in persistent mode with learning enabled |
| Project hash computation | ✅ BLAKE3 hash for privacy-preserving project identification |
| Record LLM-extracted claims | ✅ After LLM extraction, record patterns meeting min_confidence |
| Update existing patterns | ✅ Merge observations when similar pattern found |
| Logging | ✅ Reports patterns_recorded count on scan completion |
7.6.6 Error Handling ✅
| Task | Status |
|---|---|
| LearningStore error variant | ✅ error.rs — for storage/cache failures |
| Graceful degradation | ✅ Store failures logged, don't block scan |
Files: learning/mod.rs, learning/types.rs, learning/normalizer.rs, learning/store.rs, config/mod.rs, scan.rs, error.rs, lib.rs
Tests: 30 tests covering types, normalization, and store operations.
Phase 7.6 (Legacy Documentation)
Note: The following is the original spec for reference. See above for implemented status.
Original Schema (Reference)
/// A pattern learned from LLM extraction that could become a declarative extractor.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LearnedPattern {
/// Unique identifier
pub id: Uuid,
/// Example code that triggered this pattern
pub example_code: String,
/// Normalized pattern (variables replaced with placeholders)
/// e.g., "const TLS_MIN_VERSION = \"1.0\"" → "const TLS_MIN_VERSION = <version>"
pub normalized_pattern: String,
/// The claim this pattern produces
pub claim_template: ClaimTemplate,
/// Language this pattern applies to
pub language: Language,
/// When first seen
pub first_seen: DateTime<Utc>,
/// When last seen
pub last_seen: DateTime<Utc>,
/// Projects that have this pattern (hashed for privacy)
pub project_hashes: HashSet<String>,
/// Total occurrences across all projects
pub occurrences: u32,
/// Average LLM confidence when extracting this
pub avg_confidence: f32,
/// Has this been promoted to a declarative extractor?
pub promoted: bool,
/// If promoted, the extractor ID
pub promoted_to: Option<String>,
}
/// Template for generating claims from a learned pattern.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ClaimTemplate {
pub subject_template: String, // "tls/min_version"
pub predicate: String, // "version"
pub value_type: ValueType, // String, Boolean, Number
pub description_template: String,
}
Original PatternStore Trait (Reference)
pub trait PatternStore: Send + Sync {
/// Record a pattern learned from LLM extraction
fn record_pattern(&self, pattern: &LearnedPattern) -> Result<()>;
/// Find existing pattern matching this example
fn find_similar(&self, normalized: &str, language: Language, threshold: f32) -> Option<LearnedPattern>;
/// Get patterns ready for promotion (threshold met)
fn get_promotion_candidates(&self, min_projects: usize, min_confidence: f32) -> Vec<LearnedPattern>;
/// Mark pattern as promoted
fn mark_promoted(&self, id: &Uuid, extractor_name: &str) -> Result<()>;
/// Prune old patterns
fn prune_stale(&self, max_age_days: u32) -> Result<usize>;
}
7.6.3 Pattern Normalization ⬜
| Task | Description |
|---|---|
| Variable extraction | Identify literals that vary (versions, names, values) |
| Placeholder insertion | Replace literals with typed placeholders |
| Similarity scoring | Compare normalized patterns for dedup |
fn normalize_pattern(code: &str, claim: &ExtractedClaim) -> String {
// "const TLS_MIN = \"1.0\"" → "const TLS_MIN = <string:version>"
// "pool_size: 25" → "pool_size: <number>"
// "verify_ssl: false" → "verify_ssl: <boolean>"
}
fn similarity_score(a: &str, b: &str) -> f32 {
// Levenshtein distance normalized to 0.0-1.0
// Patterns with score > 0.8 are considered duplicates
}
7.6.4 Integration with Scan ⬜
// In scan.rs, after LLM extraction
for claim in llm_claims {
// Check if this is a new pattern
if let Some(existing) = pattern_store.find_similar(&claim.matched_text, language).await {
// Update existing pattern
pattern_store.increment_occurrence(&existing.id, project_hash).await?;
} else {
// Record new pattern
let pattern = LearnedPattern::from_claim(&claim, &code_context, project_hash);
pattern_store.record_pattern(&pattern).await?;
}
}
7.6.5 Configuration ⬜
# aphoria.toml
[learning]
enabled = true # Enable pattern learning
store = "local" # "local" | "hosted"
min_confidence = 0.7 # Minimum LLM confidence to learn
prune_after_days = 90 # Remove patterns not seen in N days
[learning.promotion]
min_projects = 5 # Projects needed before promotion
min_confidence = 0.8 # Average confidence needed
auto_promote = false # Require human approval (Phase 7.7)
Files: learning/mod.rs, learning/pattern.rs, learning/store.rs, learning/normalize.rs
Phase 7.7: Pattern → Extractor Promotion ✅
High-frequency learned patterns get promoted to declarative extractors. This closes the learning loop: patterns discovered by LLM become permanent, fast regex extractors.
Vision
LearnedPattern (5+ projects, >0.8 confidence)
↓
LLM (Gemini): "Generate regex for this pattern"
↓
Candidate declarative extractor
↓
Validate against stored examples
↓
Human review (optional) → Approve/Reject
↓
Merge to project's .aphoria/extractors/
7.7.1 Promotion Pipeline ✅
| Task | Status |
|---|---|
| PromotionPipeline | ✅ promotion/pipeline.rs — orchestrates full promotion flow |
| RegexGenerator | ✅ promotion/regex_gen.rs — Gemini LLM integration |
| ExtractorValidator | ✅ promotion/validator.rs — ReDoS detection, timing validation |
| YamlWriter | ✅ promotion/writer.rs — outputs to .aphoria/extractors/learned/ |
| InteractiveReviewer | ✅ promotion/review.rs — CLI review workflow |
| PromotionCandidate | ✅ promotion/types.rs |
| ValidationResult | ✅ promotion/types.rs |
pub struct PromotionPipeline {
    pattern_store: Arc<dyn PatternStore>,
    llm_client: GeminiClient,
    validator: ExtractorValidator,
}
impl PromotionPipeline {
    /// Get patterns ready for promotion
    pub async fn get_candidates(&self) -> Result<Vec<PromotionCandidate>> {
        // Thresholds match [learning.promotion]: min_projects = 5, min_confidence = 0.8
        let patterns = self.pattern_store.get_promotion_candidates(5, 0.8);
        let mut candidates = Vec::with_capacity(patterns.len());
        for pattern in patterns {
            // Await each candidate in turn; mapping an async fn would only collect futures
            candidates.push(self.generate_candidate(pattern).await?);
        }
        Ok(candidates)
    }
    /// Generate declarative extractor from pattern
    async fn generate_candidate(&self, pattern: LearnedPattern) -> Result<PromotionCandidate> {
        // Ask the LLM (Gemini) to generate a regex
        let regex = self.llm_client.generate_regex(&pattern).await?;
        // Build declarative extractor
        let extractor = DeclarativeExtractor {
            name: pattern.id.to_string(),
            language: pattern.language,
            pattern: regex,
            claim: pattern.claim_template.clone(),
            source: ExtractorSource::Learned {
                pattern_id: pattern.id,
                projects: pattern.project_hashes.len(),
            },
        };
        // Validate against examples
        let validation = self.validator.validate(&extractor, &pattern).await;
        Ok(PromotionCandidate { pattern, extractor, validation })
    }
}
7.7.2 Regex Generation ✅
| Task | Status |
|---|---|
| Multi-example prompt | ✅ Includes all examples in generation prompt |
| Regex safety | ✅ ReDoS detection prevents catastrophic backtracking |
| Test coverage | ✅ Validates against stored examples |
async fn generate_regex(
    client: &GeminiClient,
    examples: &[String],
    claim: &ClaimTemplate,
) -> Result<String> {
    let prompt = format!(
        "Generate a regex pattern that matches all these code examples:\n\n{}\n\n\
        The regex should extract the value for claim: {}\n\
        Requirements:\n\
        - Must match ALL examples\n\
        - Use named capture groups for extracted values\n\
        - Avoid catastrophic backtracking (no nested quantifiers)\n\
        - Return ONLY the regex, no explanation",
        examples.join("\n---\n"),
        claim.subject_template
    );
    // Send the prompt through the LLM client passed in by the caller
    let response = client.message(&prompt).await?;
    // Trim surrounding whitespace before the safety check
    let regex = response.trim().to_string();
    validate_regex_safety(&regex)?;
    Ok(regex)
}
7.7.3 Validation Suite ✅
| Task | Status |
|---|---|
| Positive tests | ✅ Must match all stored examples |
| ReDoS detection | ✅ Detects catastrophic backtracking patterns |
| Performance test | ✅ Timing validation with configurable threshold |
| False positive check | ⬜ Deferred to Phase 9 (sample codebase FP testing) |
pub struct ExtractorValidator {
sample_codebases: Vec<PathBuf>, // Known-good projects for FP testing
}
impl ExtractorValidator {
pub async fn validate(
&self,
extractor: &DeclarativeExtractor,
pattern: &LearnedPattern
) -> ValidationResult {
let mut result = ValidationResult::default();
// Must match all positive examples
for example in &pattern.examples {
if !extractor.matches(example) {
result.positive_failures.push(example.clone());
}
}
// Must not have excessive false positives
for codebase in &self.sample_codebases {
let fps = self.count_false_positives(extractor, codebase).await;
if fps > 10 {
result.false_positive_warning = true;
}
}
// Must be fast
let duration = self.benchmark(extractor);
if duration > Duration::from_millis(100) {
result.performance_warning = true;
}
result
}
}
7.7.4 Human Review Gate ✅
| Task | Status |
|---|---|
| aphoria extractors review | ✅ CLI to review pending promotions |
| aphoria extractors stats | ✅ Show pattern store statistics |
| aphoria extractors candidates | ✅ List promotion candidates |
| aphoria extractors promote | ✅ Promote pattern to extractor |
| Approval workflow | ✅ Approve, reject, or skip via InteractiveReviewer |
| Rejection tracking | ⬜ Deferred to Phase 9 (rejection reason persistence) |
| Auto-approve mode | ⬜ Deferred to Phase 9 (>0.95 confidence auto-promote) |
$ aphoria extractors review
Pending promotions: 3
[1/3] Pattern: tls_min_version_const
Examples: 47 (across 8 projects)
Confidence: 0.91
Generated regex: (?i)(tls|ssl)_?(min|minimum)_?version\s*[:=]\s*["']?(1\.[01])["']?
Sample matches:
const TLS_MIN_VERSION = "1.0" ✓ matches
TLS_MINIMUM_VERSION: "1.1" ✓ matches
ssl_min_version = "1.2" ✗ no match (TLS 1.2 is safe, correctly skipped)
[a]pprove [r]eject [e]dit [s]kip [q]uit: _
7.7.5 Extractor Output ✅
Promoted patterns become declarative extractors in .aphoria/extractors/learned/:
# .aphoria/extractors/learned/tls_min_version_const.yaml
# Auto-generated from learned pattern. DO NOT EDIT.
# Pattern ID: 550e8400-e29b-41d4-a716-446655440000
# Learned from: 8 projects, 47 occurrences
# Confidence: 0.91
# Promoted: 2026-02-10
name: "tls_min_version_const"
language: ["rust", "go", "python", "javascript", "typescript"]
pattern: '(?i)(tls|ssl)_?(min|minimum)_?version\s*[:=]\s*["'']?(1\.[01])["'']?'
claim:
  subject: "tls/min_version"
  predicate: "version"
  value_capture: 3 # Capture group for version (groups 1 and 2 match the tls|ssl and min|minimum prefixes)
  description: "TLS minimum version set to deprecated {value}"
metadata:
  source: "learned"
  pattern_id: "550e8400-e29b-41d4-a716-446655440000"
  projects: 8
  occurrences: 47
  confidence: 0.91
7.7.6 Configuration ✅
# aphoria.toml
[promotion]
enabled = true # Enable promotion pipeline
auto_promote = false # Require human approval
output_dir = ".aphoria/extractors/learned"
min_confidence = 0.8 # Minimum to consider
min_projects = 5 # Projects needed before promotion
require_validation = true # Must pass validation suite
Files: promotion/mod.rs, promotion/pipeline.rs, promotion/regex_gen.rs, promotion/validator.rs, promotion/review.rs, promotion/writer.rs, promotion/types.rs, handlers/extractors.rs
Tests: 43 tests covering pipeline, validation, regex generation, and YAML output.
Phase 9: Autonomous Extractor Generation ✅
The system generates, tests, and deploys extractors without human approval for high-confidence patterns. This is the endgame: a fully self-improving extraction system.
Vision
Learned pattern exceeds autonomous threshold (>0.95 confidence, >10 projects)
↓
Auto-generate extractor
↓
Validate against comprehensive test suite
↓
A/B test: run new extractor in shadow mode
↓
If FP rate < 5%: auto-deploy
↓
If FP rate spikes: auto-rollback
Phase 7.8: LLM Prompt Evaluation ✅
Measure and improve LLM extraction quality through golden fixtures and regression detection. This makes it possible to change prompts without silently degrading existing extraction quality.
Vision
Golden Fixtures (TOML)               Evaluation Harness
├── tls-001: verify=False            ├── Load fixtures
├── jwt-001: algorithm=none     -->  ├── Run extraction (live/cached/mock)
└── secrets-001: hardcoded key       ├── Match against expectations
                                     ├── Compute precision/recall/F1
                                     └── Compare to baseline (regression detection)
7.8.1 Fixture Format ✅
| Task | Status |
|---|---|
| Fixture type | ✅ eval/fixture.rs — TOML-based test cases |
| ExpectedClaim | ✅ Subject/predicate/value expectations |
| must_contain | ✅ Claims that MUST be extracted (recall) |
| must_not_contain | ✅ Claims that MUST NOT appear (precision) |
| FixtureLoader | ✅ Load fixtures from directory tree |
| CorpusManifest | ✅ Corpus metadata + baseline metrics |
| Validation | ✅ Duplicate ID, empty content, missing expectations |
# tests/llm_fixtures/tls/tls-001-disabled-verification.toml
[metadata]
id = "tls-001"
name = "TLS verification disabled in Python requests"
category = "tls"
language = "python"
[input]
filename = "api_client.py"
content = """
response = requests.get(url, verify=False)
"""
[expected]
must_contain = [
{ subject = "tls/cert_verification", predicate = "enabled", value = false }
]
must_not_contain = [
{ subject = "tls/cert_verification", predicate = "enabled", value = true }
]
7.8.2 Claim Matching ✅
| Task | Status |
|---|---|
| ClaimMatcher | ✅ eval/matcher.rs — Flexible claim comparison |
| Tail-path matching | ✅ Last 2 segments for subject comparison |
| Type coercion | ✅ Boolean↔string ("true"/"yes"), number↔string |
| Confidence thresholds | ✅ Optional min_confidence per expectation |
| count_false_positives() | ✅ Detect unexpected claims |
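Tail-path matching and boolean coercion from the table above can be sketched as below. This is an illustration, not the contents of eval/matcher.rs: the function names are hypothetical, and the accepted false spellings ("false"/"no") are an assumption, since the table only lists "true"/"yes".

```rust
/// Returns the last `n` non-empty segments of a slash-separated concept path.
fn tail_segments(path: &str, n: usize) -> Vec<&str> {
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let start = segments.len().saturating_sub(n);
    segments[start..].to_vec()
}

/// Subjects match when their last two segments agree, so a longer
/// extracted path can still satisfy a short expectation.
pub fn subjects_match(expected: &str, actual: &str) -> bool {
    let e = tail_segments(expected, 2);
    let a = tail_segments(actual, 2);
    !e.is_empty() && e == a
}

/// Coerces common string spellings to a boolean; unknown spellings yield None.
pub fn coerce_bool(value: &str) -> Option<bool> {
    match value.trim().to_ascii_lowercase().as_str() {
        "true" | "yes" => Some(true),
        "false" | "no" => Some(false),
        _ => None,
    }
}
```

Comparing only the tail lets a fixture expect tls/cert_verification without knowing which path prefix the extractor prepends.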
7.8.3 Metrics Computation ✅
| Task | Status |
|---|---|
| Metrics | ✅ eval/metrics.rs — Aggregate evaluation metrics |
| Precision/Recall/F1 | ✅ Standard information retrieval metrics |
| Per-category breakdown | ✅ Metrics by fixture category |
| Cost estimation | ✅ Token-based cost tracking |
| BaselineComparison | ✅ Compare current run to stored baseline |
| Regression detection | ✅ Flag if F1/precision/recall drop > threshold |
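The metrics and the regression check can be sketched as follows. `Counts`, `Scores`, `compute_scores`, and `is_regression` are hypothetical names. Two choices are assumptions: a zero denominator yields 0.0, and the threshold is an absolute drop (0.05 means five percentage points).

```rust
/// Counts accumulated over all fixtures in a run.
#[derive(Clone, Copy)]
pub struct Counts {
    pub true_positives: u32,
    pub false_positives: u32,
    pub false_negatives: u32,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Scores {
    pub precision: f64,
    pub recall: f64,
    pub f1: f64,
}

fn ratio(num: u32, den: u32) -> f64 {
    if den == 0 { 0.0 } else { f64::from(num) / f64::from(den) }
}

/// precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = harmonic mean of both.
pub fn compute_scores(c: Counts) -> Scores {
    let precision = ratio(c.true_positives, c.true_positives + c.false_positives);
    let recall = ratio(c.true_positives, c.true_positives + c.false_negatives);
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    Scores { precision, recall, f1 }
}

/// True if any metric dropped by more than `threshold` versus the baseline.
pub fn is_regression(baseline: Scores, current: Scores, threshold: f64) -> bool {
    baseline.precision - current.precision > threshold
        || baseline.recall - current.recall > threshold
        || baseline.f1 - current.f1 > threshold
}
```

Checking each metric separately catches a prompt change that trades recall for precision while leaving F1 almost unchanged.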
7.8.4 Evaluation Harness ✅
| Task | Status |
|---|---|
| EvalHarness | ✅ eval/harness.rs — Orchestrates evaluation runs |
| EvalMode::Live | ✅ Real LLM API calls |
| EvalMode::Cached | ✅ Use cached responses (deterministic CI) |
| EvalMode::Mock | ✅ No LLM, tests harness itself |
| EvalVerdict | ✅ Pass, Regression, Review, Error |
| update_baseline() | ✅ Save current metrics as new baseline |
7.8.5 Report Generation ✅
| Task | Status |
|---|---|
| Report | ✅ eval/report.rs — Multi-format output |
| Table format | ✅ Terminal tables with color-coded results |
| JSON format | ✅ Machine-readable for CI/CD integration |
| Markdown format | ✅ Documentation and PR comments |
| Failed fixture details | ✅ Shows unmatched expectations with rationale |
7.8.6 CLI Commands ✅
| Task | Status |
|---|---|
| aphoria eval run | ✅ Run evaluation against fixtures |
| aphoria eval baseline | ✅ Show current baseline metrics |
| aphoria eval update-baseline | ✅ Update baseline (--force required) |
| aphoria eval list-fixtures | ✅ List available fixtures by category |
| aphoria eval validate-fixtures | ✅ Validate fixture format |
| --fail-on-regression | ✅ Exit code 1 if regression detected |
| --threshold | ✅ Configurable regression threshold (default 5%) |
| --mode | ✅ live, cached, or mock |
# Run evaluation in mock mode
aphoria eval run --fixtures tests/llm_fixtures --mode mock
# CI: fail on regression
aphoria eval run --mode cached --fail-on-regression --threshold 0.05
# Update baseline after prompt improvements
aphoria eval update-baseline --fixtures tests/llm_fixtures --force
# List fixtures by category
aphoria eval list-fixtures --category tls
7.8.7 Seed Fixtures ✅
| Category | Fixture | Description |
|---|---|---|
| tls | tls-001 | Python requests verify=False |
| tls | tls-002 | Node.js TLSv1 deprecated protocol |
| jwt | jwt-001 | Algorithm 'none' allowed |
| jwt | jwt-002 | Go WithoutClaimsValidation |
| secrets | secrets-001 | Hardcoded API key |
| secrets | secrets-002 | High-entropy JWT in config |
| auth | auth-001 | Debug authentication bypass |
| negative | negative-001 | Safe TLS config (no findings expected) |
| negative | negative-002 | Env-loaded secrets (no findings expected) |
| edge | edge-001 | Empty file edge case |
Files: eval/mod.rs, eval/fixture.rs, eval/matcher.rs, eval/metrics.rs, eval/harness.rs, eval/report.rs, handlers/eval.rs, cli.rs, tests/llm_fixtures/
Documentation: docs/llm-optimization/ — Full optimization playbook with decision trees, research templates, and baseline tracking.
9.1 Autonomous Promotion ✅
| Task | Description | Status |
|---|---|---|
| AutonomousConfig | Configuration with kill switch (enabled: false default) | ✅ |
| High-confidence threshold | Skip human review for >0.95 confidence | ✅ |
| Project threshold | Require >10 projects for autonomous | ✅ |
| Validation strictness | Zero failures, zero warnings required | ✅ |
| should_auto_promote() | Decision logic on PromotionCandidate | ✅ |
| auto_promotion_blockers() | Explains why pattern can't be auto-promoted | ✅ |
| AutonomousAuditLog | JSONL audit trail for all decisions | ✅ |
| smart_auto_promote_all() | Pipeline integration with audit logging | ✅ |
| YAML header enhancement | "AUTO-PROMOTED" + "Approved by: autonomous" | ✅ |
| CLI command | aphoria extractors auto-promote [--dry-run] | ✅ |
Safety Features:
- Kill switch: enabled: false by default (opt-in only)
- Auditability: All decisions logged to ~/.aphoria/audit/autonomous-decisions.jsonl
- Reversibility: Can delete YAML + reset pattern.promoted
- Blast radius: One pattern = one YAML file
- Traceability: YAML header shows approval source
Files: config/types/autonomous.rs, promotion/audit.rs, promotion/types.rs, promotion/pipeline.rs, promotion/writer.rs, handlers/extractors.rs
Configuration:
[autonomous]
enabled = true # Master switch (default: false)
min_confidence = 0.95 # Stricter than standard 0.8
min_projects = 10 # Stricter than standard 5
require_zero_failures = true
require_zero_warnings = true
audit_log = true
audit_dir = "~/.aphoria/audit/"
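The decision logic can be sketched as a blocker list, where an empty list means the candidate may skip review. The function names follow the task table and the config fields follow the keys above, but the bodies are illustrative. `CandidateSummary` is a hypothetical stand-in for PromotionCandidate, and both thresholds are treated as inclusive minimums, which is an assumption.

```rust
/// Mirrors the [autonomous] keys above.
pub struct AutonomousConfig {
    pub enabled: bool,
    pub min_confidence: f32,
    pub min_projects: usize,
    pub require_zero_failures: bool,
    pub require_zero_warnings: bool,
}

/// Hypothetical summary of a promotion candidate after validation.
pub struct CandidateSummary {
    pub avg_confidence: f32,
    pub project_count: usize,
    pub validation_failures: usize,
    pub validation_warnings: usize,
}

/// Lists every reason the candidate cannot skip human review.
pub fn auto_promotion_blockers(cfg: &AutonomousConfig, c: &CandidateSummary) -> Vec<String> {
    let mut blockers = Vec::new();
    if !cfg.enabled {
        blockers.push("autonomous mode is disabled".to_string());
    }
    if c.avg_confidence < cfg.min_confidence {
        blockers.push(format!("confidence {:.2} < {:.2}", c.avg_confidence, cfg.min_confidence));
    }
    if c.project_count < cfg.min_projects {
        blockers.push(format!("{} projects < {}", c.project_count, cfg.min_projects));
    }
    if cfg.require_zero_failures && c.validation_failures > 0 {
        blockers.push(format!("{} validation failure(s)", c.validation_failures));
    }
    if cfg.require_zero_warnings && c.validation_warnings > 0 {
        blockers.push(format!("{} validation warning(s)", c.validation_warnings));
    }
    blockers
}

/// A candidate is auto-promoted only when nothing blocks it.
pub fn should_auto_promote(cfg: &AutonomousConfig, c: &CandidateSummary) -> bool {
    auto_promotion_blockers(cfg, c).is_empty()
}
```

Deriving the boolean from the blocker list keeps the decision and its explanation in sync, which helps when each decision is also written to an audit log.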
CLI Usage:
# Preview what would be auto-promoted
aphoria extractors auto-promote --dry-run
# Run autonomous promotion
aphoria extractors auto-promote
# Override thresholds
aphoria extractors auto-promote --min-confidence 0.97 --min-projects 15
9.2 Shadow Mode Testing ✅
| Task | Description | Status |
|---|---|---|
| ShadowConfig | Configuration for shadow mode (min_scans, max_fp_rate, rollback_threshold) | ✅ |
| ShadowTest, ShadowStatus, ShadowMetrics | Core types for tracking shadow extractors | ✅ |
| ShadowStore | JSONL persistence for tests, matches, and decisions | ✅ |
| ShadowExtractorRegistry | Loads shadow extractors from learned/ directory | ✅ |
| ShadowExecutor | Runs shadow extractors during scans, stores matches separately | ✅ |
| FeedbackCollector | TP/FP feedback collection and metrics update | ✅ |
| GraduationManager | Shadow → production promotion and rollback logic | ✅ |
| CLI commands | shadow-status, feedback, graduate, rollback | ✅ |
Safety Features:
- Shadow isolation: Matches stored separately, not in production output
- Metrics transparency: FP rate visible via shadow-status
- Graduation gate: Must meet min_scans (100) + max_fp_rate (5%) + feedback exists
- Manual control: rollback command for immediate removal
- Audit trail: All decisions logged to decisions.jsonl
Files: shadow/mod.rs, shadow/types.rs, shadow/store.rs, shadow/registry.rs, shadow/executor.rs, shadow/feedback.rs, shadow/graduation.rs, handlers/shadow.rs, config/types/shadow.rs
Configuration:
[shadow]
enabled = true # Shadow mode on by default
min_scans = 100 # Scans before graduation eligible
max_fp_rate = 0.05 # Maximum FP rate for graduation
rollback_threshold = 0.15 # FP rate that triggers rollback
retention_days = 30 # Days to retain shadow data
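A minimal sketch of the graduation gate, assuming the FP rate is computed over reviewed matches only. `ShadowConfig` and `ShadowMetrics` are type names from the task table, but the fields shown here and the `fp_rate` and `can_graduate` functions are hypothetical.

```rust
/// Subset of the [shadow] keys above that the graduation gate needs.
pub struct ShadowConfig {
    pub min_scans: u32,
    pub max_fp_rate: f64,
}

/// Hypothetical counters kept per shadow extractor.
pub struct ShadowMetrics {
    pub scans: u32,
    pub true_positives: u32,
    pub false_positives: u32,
}

impl ShadowMetrics {
    /// FP rate over reviewed matches; None until some feedback exists.
    pub fn fp_rate(&self) -> Option<f64> {
        let reviewed = self.true_positives + self.false_positives;
        if reviewed == 0 {
            None
        } else {
            Some(f64::from(self.false_positives) / f64::from(reviewed))
        }
    }
}

/// Graduation needs enough scans, at least one piece of feedback, and a low FP rate.
pub fn can_graduate(cfg: &ShadowConfig, m: &ShadowMetrics) -> bool {
    m.scans >= cfg.min_scans && m.fp_rate().map_or(false, |rate| rate <= cfg.max_fp_rate)
}
```

Returning `None` when nothing has been reviewed stops an extractor with zero feedback from graduating on a vacuous 0% FP rate.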
CLI Usage:
# View shadow test status
aphoria extractors shadow-status [-v]
# Provide TP/FP feedback on matches
aphoria extractors feedback <test-name> [--limit 10]
# Graduate shadow test to production
aphoria extractors graduate <test-name> [--force]
# Rollback a shadow test
aphoria extractors rollback <test-name> --reason "too many FPs"
Tests: 44 tests covering types, store, registry, executor, feedback, graduation, and auto-rollback.
9.3 Auto-Rollback ✅
| Task | Description | Status |
|---|---|---|
| auto_rollback_enabled config | Toggle to enable/disable auto-rollback (default: true) | ✅ |
| Feedback-time check | Auto-rollback triggered immediately after FP feedback | ✅ |
| FeedbackWithRollback return | record_feedback() returns rollback info | ✅ |
| AutoRollbackResult | Track checked count, rolled back names, errors | ✅ |
| CLI command | aphoria extractors auto-check for manual batch checking | ✅ |
| Audit trail | Decision logged as ShadowDecisionKind::AutoRollback | ✅ |
| YAML deletion | Extractor file deleted from learned/ on rollback | ✅ |
Safety Features:
- Toggle: auto_rollback_enabled can disable feature for testing or manual-only workflows
- Threshold configurable: rollback_threshold in config (default: 15%)
- Minimum reviews: Requires 10+ reviewed matches before auto-rollback triggers
- Audit trail: All auto-rollback decisions logged to decisions.jsonl
- CLI fallback: auto-check command for manual verification
Files: shadow/feedback.rs, shadow/graduation.rs, config/types/shadow.rs, handlers/shadow.rs, cli.rs
Configuration:
[shadow]
enabled = true
auto_rollback_enabled = true # NEW: Enable automatic rollback (default: true)
rollback_threshold = 0.15 # FP rate that triggers auto-rollback
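The feedback-time rollback check can be sketched as below. `RollbackConfig` and `should_auto_rollback` are hypothetical names, and `min_reviewed` is a hypothetical field standing in for the "10+ reviewed matches" rule, which the source does not name as a config key. Treating the threshold as strict (the rate must exceed it) is also an assumption.

```rust
/// Settings relevant to auto-rollback; `min_reviewed` is a hypothetical field.
pub struct RollbackConfig {
    pub auto_rollback_enabled: bool,
    pub rollback_threshold: f64,
    pub min_reviewed: u32,
}

/// Decides whether a shadow extractor should be rolled back after new feedback.
pub fn should_auto_rollback(cfg: &RollbackConfig, true_positives: u32, false_positives: u32) -> bool {
    if !cfg.auto_rollback_enabled {
        return false;
    }
    let reviewed = true_positives + false_positives;
    // Too little feedback: one early false positive should not remove an extractor.
    if reviewed < cfg.min_reviewed {
        return false;
    }
    f64::from(false_positives) / f64::from(reviewed) > cfg.rollback_threshold
}
```

With a 0.15 threshold and a 10-review minimum, 2 false positives out of 10 reviews (20%) trigger a rollback, while 2 out of 5 do not because too few matches have been reviewed.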
CLI Usage:
# Automatic: Rollback happens immediately when feedback pushes FP rate over threshold
aphoria extractors feedback <test-name> --limit 10
# If FP rate exceeds 15%, you'll see:
# ⚠️ AUTO-ROLLBACK TRIGGERED: <extractor-name>
# Manual batch check: Scan all active tests and rollback any over threshold
aphoria extractors auto-check
# Output: "⚠️ Auto-rolled back 1 of 5 shadow test(s): ..."
Tests: 3 new tests covering auto-rollback triggering, disabled toggle, and threshold boundary.
9.4 Cross-Project Learning ✅
| Task | Description | Status |
|---|---|---|
| Hosted pattern sync | Patterns from all projects aggregate on server | ✅ |
| Global promotion | Promote patterns seen across many orgs | ✅ |
| Privacy preservation | Only normalized patterns shared, no code | ✅ |
| Opt-in distribution | Orgs can opt-in to receive community extractors | ✅ |
Org A: Pattern seen in 3 projects → shared to hosted
Org B: Same pattern in 5 projects → shared to hosted
Org C: Same pattern in 4 projects → shared to hosted
↓
Hosted aggregates: 12 projects total
↓
Promotes to community extractor
↓
All orgs receive new extractor (if opted in)
Implementation:
- CrossProjectConfig with opt-in flags (contribute_patterns, receive_community)
- PatternSyncer for uploading anonymized patterns to hosted server
- CommunityExtractorLoader for pulling community extractors as YAML files
- BLAKE3 hashing for pattern deduplication and org anonymization
- Privacy guarantees: normalized_pattern shared, but NOT example_code or project_hashes
- CLI commands: aphoria patterns sync, aphoria patterns status, aphoria patterns pull-community
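The privacy boundary and the hosted aggregation can be sketched together. All type and function names here are hypothetical, and the sketch makes two assumptions: each upload carries a project count alongside the normalized pattern, and the community promotion threshold is a caller-supplied parameter (the source does not state a value). The BLAKE3 hashing step is omitted because it needs an external crate.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical local record; only some fields are safe to share.
pub struct LocalPattern {
    pub normalized_pattern: String,
    pub example_code: String,
    pub project_hashes: HashSet<String>,
}

/// What leaves the machine: the normalized pattern and a project count only.
pub struct SharedPattern {
    pub normalized_pattern: String,
    pub project_count: usize,
}

/// Drops example_code and project_hashes before upload.
pub fn anonymize(p: &LocalPattern) -> SharedPattern {
    SharedPattern {
        normalized_pattern: p.normalized_pattern.clone(),
        project_count: p.project_hashes.len(),
    }
}

/// Hosted side: sums project counts per normalized pattern across all uploads.
pub fn aggregate(uploads: &[SharedPattern]) -> HashMap<String, usize> {
    let mut totals = HashMap::new();
    for upload in uploads {
        *totals.entry(upload.normalized_pattern.clone()).or_insert(0) += upload.project_count;
    }
    totals
}

/// Patterns whose aggregated project count reaches the threshold, sorted for stable output.
pub fn community_candidates(totals: &HashMap<String, usize>, min_projects: usize) -> Vec<String> {
    let mut out: Vec<String> = totals
        .iter()
        .filter(|(_, count)| **count >= min_projects)
        .map(|(pattern, _)| pattern.clone())
        .collect();
    out.sort();
    out
}
```

Three organizations reporting the same pattern in 3, 5, and 4 projects aggregate to 12 projects, as in the diagram above.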
Files: config/types/cross_project.rs, community/pattern_syncer.rs, community/extractor_loader.rs, handlers/patterns.rs
Tests: 7 new tests covering pattern hashing, subject exclusion, anonymization, and extractor loading.
9.5 Extractor Versioning ✅
| Task | Description | Status |
|---|---|---|
| Version tracking | Track which version caught which issues | ✅ ExtractorVersion + VersionStore |
| Changelog | Record changes between versions | ✅ ExtractorChangelog + ChangelogEntry |
| Rollback support | Revert to previous version | ✅ aphoria extractors rollback-version |
| A/B metrics | Compare versions side-by-side | ✅ aphoria extractors compare + compute_metrics_delta() |
| CLI commands | versions, compare, rollback-version | ✅ Full CLI implementation |
| Tests | Unit tests for all components | ✅ 15+ version/changelog tests |
Files:
- promotion/version.rs - Core types (ExtractorVersion, ChangelogEntry, MetricsDelta, ExtractorChangelog, VersionStore)
- promotion/writer.rs - Versioned YAML output (write_versioned())
- promotion/types.rs - Version field in PromotionMetadata
- handlers/extractors.rs - CLI handlers (handle_versions, handle_compare, handle_rollback_version)
- cli.rs - CLI commands (Versions, Compare, RollbackVersion)
CLI Usage:
# List versions
aphoria extractors versions learned_tls_min_version
# Version History: learned_tls_min_version
# Version Date Changes
# ------------------------------------------------------------
# 2 2026-03-15 Added support for YAML configs
# 1 2026-02-01 Initial promotion from learned pattern
# Compare versions
aphoria extractors compare learned_tls_min_version -a 1 -b 2
# Comparison: learned_tls_min_version v1 vs v2
# Matches +15%
# False Positives -3%
# Rollback
aphoria extractors rollback-version learned_tls_min_version --version 1 --reason "v2 edge case bug"
# Rolled back learned_tls_min_version to v1
YAML Output:
# Generated from learned pattern. Review before editing.
# Pattern ID: a1b2c3d4-e5f6-7890-abcd-ef1234567890
# Version: 2 (previous: 1)
# Promoted: 2026-03-15 14:30:00 UTC
name: learned_tls_min_version
description: TLS minimum version set to deprecated value
version: 2
previous_version: 1
languages:
- rust
- go
pattern: '(?i)tls_?min_?(version)?\s*[:=]\s*["'']?(?P<value>1\.[01])["'']?'
claim:
subject: tls/min_version
predicate: version
value_from_match: true
confidence: 0.97
metadata:
source: learned
pattern_id: a1b2c3d4-e5f6-7890-abcd-ef1234567890
version: 2
changelog:
- version: 2
date: 2026-03-15
changes: "Added support for YAML configs"
metrics:
matches: "+15%"
false_positives: "-3%"
- version: 1
date: 2026-02-01
changes: "Initial promotion from learned pattern"
9.6 Configuration ⬜
# aphoria.toml
[autonomous]
enabled = false # Opt-in to autonomous mode
min_confidence = 0.95 # Higher threshold for auto
min_projects = 10 # More evidence required
shadow_scans = 100 # Scans before promotion
max_fp_rate = 0.05 # Maximum FP rate for auto-deploy (graduation gate)
[autonomous.distribution]
receive_community = true # Receive community extractors
contribute_patterns = true # Share patterns to community
Files: autonomous/mod.rs, autonomous/shadow.rs, autonomous/rollback.rs, autonomous/distribution.rs
Milestone Summary
| Phase | Deliverable | Depends On | Status |
|---|---|---|---|
| 0 | ConceptPath in StemeDB | concept-hierarchy spec | ✅ |
| 2 | Aphoria CLI (scan, report, ack) | Phase 0 | ✅ |
| 2A | Concept matching (leaf, alias, auto-alias) | Phase 2 | ✅ |
| 1 | Authoritative corpus expansion | Phase 0 | ✅ |
| 3 | Claude Code skill + hooks | Phase 2A | ✅ |
| 4.5 | Ephemeral scan mode (40x faster) | Phase 2 | ✅ |
| 5 | Research agent loop | Phase 3 | ✅ |
| 6 | Federated Policy & Trust Packs | Phase 4.5 | ✅ |
| 6.5 | Trust Pack Extensions (Predicate Aliases, Key Rotation) | Phase 6 | ✅ |
| 4A | Observational claims (Tier 4 write-back) | Phase 6 | ✅ |
| 4B | Self-conflict detection (drift) | Phase 4A | ✅ |
| 4C | Diff-only scanning (--staged) | Phase 4B | ✅ |
| 4E | Hosted mode (team aggregation) | Phase 4C | ✅ |
| 4D | Enhanced ack (--reason, policy updates) | Phase 4C | ✅ |
| 5.6 | Community Corpus Contributions | Phase 4E | ✅ |
| 7 | Declarative Extractors | Phase 6 | ✅ |
| 7.5 | LLM-in-the-Loop Extraction (Gemini) | Phase 7 | ✅ |
| 7.6 | Pattern Learning Store | Phase 7.5 | ✅ |
| 7.7 | Pattern → Extractor Promotion | Phase 7.6 | ✅ |
| 7.8 | LLM Prompt Evaluation | Phase 7.5 | ✅ |
| 8 | Enterprise Extractors (8.1-8.11) | Phase 7.5 | ✅ |
| 8.2 | Framework-Specific Extractors (10 frameworks) | Phase 8 | ✅ |
| 9.1 | Autonomous Promotion | Phase 8 | ✅ |
| 9.2 | Shadow Mode Testing | Phase 9.1 | ✅ |
| 9.3 | Auto-Rollback | Phase 9.2 | ✅ |
| 9.4 | Cross-Project Learning | Phase 9.1 | ✅ |
| 9.5 | Extractor Versioning | Phase 9.4 | ✅ |
Current state:
- Phases 0-3, 4.5, 4A-4E, 5, 5.6, 6, 7, 7.5, 7.6, 7.7, 7.8, 8, 9.1, 9.2, 9.3, 9.4, 9.5 complete (clippy clean)
- Full corpus: RFC, OWASP, Vendor sources
- 36 extractors including:
  - Security: weak_crypto, command_injection, sql_injection, high_entropy_secrets, auth_bypass, insecure_cookies, path_traversal, unvalidated_redirects, weak_password, security_headers, insecure_deserialization, ssrf, orm_injection, xxe
  - Framework-specific: django, express, flask, fastapi, nestjs, nextjs, spring, laravel, rails, aspnet
- Trust Packs: signed policy bundles with import/export
- Ephemeral mode: 40x faster for CI
- Observation write-back: --sync records novel claims as Tier 4 project memory
- Autonomous promotion: High-confidence patterns (>0.95, 10+ projects) can skip human review with full audit trail
- Shadow mode testing: Auto-promoted extractors run in shadow mode to measure FP rate before graduation
- Auto-rollback: Shadow extractors exceeding FP threshold (15%) are automatically rolled back
- Drift detection: Detects changes from prior observations
- Staged scanning: --staged flag for fast pre-commit hooks
- Hosted mode: Team aggregation via central StemeDB server
- Enhanced ack: --reason flag, aphoria update for policy changes
- Community Corpus: Opt-in anonymous pattern sharing with privacy-preserving anonymization
- Declarative Extractors: TOML-defined custom extractors without Rust code
- LLM Extraction: Gemini-powered semantic claim extraction for high-value files
- Pattern Learning: LLM-extracted claims recorded for promotion to declarative extractors
- Pattern Promotion: CLI workflow to promote learned patterns to declarative extractors with Gemini regex generation and validation
- LLM Prompt Evaluation: Golden fixtures with precision/recall metrics, baseline comparison, and regression detection for prompt engineering
- Cross-Project Learning: Privacy-preserving pattern sync to hosted server, community extractor pull, BLAKE3-based deduplication, opt-in sharing with CrossProjectConfig
- Extractor Versioning: Version tracking with changelogs, safe rollback to previous versions, A/B metrics comparison between versions via VersionStore
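The promotion and rollback thresholds quoted in the list above (confidence above 0.95, 10 or more projects, 15% false-positive ceiling) can be read as two small gate functions. The sketch below is illustrative only: the struct names, field names, and function names are hypothetical and are not taken from the autonomous/ modules.

```rust
/// Thresholds quoted in the list above.
pub const MIN_CONFIDENCE: f64 = 0.95; // auto-promotion requires confidence above this
pub const MIN_PROJECTS: u32 = 10; // and sightings in at least this many projects
pub const MAX_FP_RATE: f64 = 0.15; // shadow extractors above this rate are rolled back

/// Hypothetical summary of a learned pattern.
#[derive(Debug, Clone, Copy)]
pub struct PatternStats {
    pub confidence: f64,
    pub project_count: u32,
}

/// Hypothetical tally collected while an extractor runs in shadow mode.
#[derive(Debug, Clone, Copy)]
pub struct ShadowStats {
    pub findings: u32,
    pub false_positives: u32,
}

/// True when a pattern may skip human review.
pub fn can_auto_promote(p: PatternStats) -> bool {
    p.confidence > MIN_CONFIDENCE && p.project_count >= MIN_PROJECTS
}

/// True when the observed false-positive rate exceeds the rollback threshold.
/// With no findings yet there is no rate to judge, so nothing is rolled back.
pub fn should_roll_back(s: ShadowStats) -> bool {
    if s.findings == 0 {
        return false;
    }
    f64::from(s.false_positives) / f64::from(s.findings) > MAX_FP_RATE
}
```

Keeping the thresholds as named constants makes it easy to see which numbers a deployment would need to tune; whether the real pipeline treats the bounds as strict or inclusive is not stated in this roadmap.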
Phase 9 Complete! Autonomous Generation pipeline is fully self-improving.
The Self-Learning Vision
Phase 7: Declarative Extractors (foundation) ✅ COMPLETE
↓
Phase 7.5: LLM-in-the-Loop (Gemini semantic extraction) ✅ COMPLETE
↓
Phase 7.6: Pattern Learning (remember what LLM finds) ✅ COMPLETE
↓
Phase 7.7: Pattern Promotion (patterns → extractors) ✅ COMPLETE
↓
Phase 7.8: LLM Prompt Evaluation (measure & improve) ✅ COMPLETE
↓
Phase 8: Enterprise Extractors (36 total) ✅ COMPLETE
├── 8.1: High-entropy secrets ✅
├── 8.2: Framework extractors (10 frameworks) ✅
├── 8.3: Config deep parsing ✅
└── 8.4-8.14: Security patterns ✅
↓
Phase 9: Autonomous Generation (fully self-improving) ✅ COMPLETE
├── 9.1: Autonomous Promotion ✅ COMPLETE
├── 9.2: Shadow Mode Testing ✅ COMPLETE
├── 9.3: Auto-Rollback ✅ COMPLETE
├── 9.4: Cross-Project Learning ✅ COMPLETE
└── 9.5: Extractor Versioning ✅ COMPLETE
The endgame: Every PR teaches Aphoria. After a month, it knows your security patterns better than your team does.
Bidirectional Knowledge Sync (Complete)
The pre-commit hook is now a bidirectional knowledge sync:
- 4A ✅: Record code claims as Tier 4 observations (project memory)
- 4B ✅: Detect drift from prior observations (self-conflict)
- 4C ✅: Fast diff-only scanning for pre-commit hooks (--staged)
- 4E ✅: Team aggregation via hosted StemeDB server
- 4D ✅: Enhanced ack with rationale and policy updates
This transforms Aphoria from a linter into a learning system that builds institutional memory per-project and collective intelligence across teams via hosted mode.
Phase 8: Enterprise Extractor Improvements ✅
Goal: Transform extractors from "toy examples" to enterprise-grade detection that catches real violations in production codebases.
Current State Audit
| Extractor | Languages | Strengths | Weaknesses |
|---|---|---|---|
| tls_verify | 8 | Multi-lang, configs | Misses custom wrappers |
| tls_version | 8 | API patterns | Misses semantic (const = "1.0") |
| hardcoded_secrets | 8 | Placeholders, test files | No entropy detection |
| weak_crypto | 5 | MD5/SHA1/DES/RC4 | SHA1 false positives, misses bcrypt cost |
| sql_injection | 5 | Interpolation patterns | Misses ORM unsafe methods |
| jwt_config | 8 | alg:none, skip sig | Library-specific gaps |
| cors_config | 8 | Wildcard + credentials | Misses dynamic origin reflection |
| rate_limit | 8 | Basic patterns | Limited depth |
| timeout_config | 8 | Basic patterns | Limited depth |
| command_injection | 5 | exec/system calls | Indirect injection |
| dep_versions | 3 | Version parsing | No CVE correlation |
Enterprise Reality: Current extractors catch ~30% of real-world security misconfigurations. Config files are highest value (patterns consistent), code is lowest (semantic understanding required).
8.1 High-Entropy Secret Detection ✅
Impact: HIGH | Effort: MEDIUM | Status: Complete
| Task | Status |
|---|---|
| HighEntropySecretsExtractor | ✅ extractors/high_entropy_secrets.rs |
| Shannon entropy algorithm | ✅ shannon_entropy() with 4.5 threshold |
| Charset variety check | ✅ 0.4 minimum variety ratio |
| Known secret prefixes | ✅ AWS (AKIA), Stripe (sk_live_, sk_test_), GitHub (ghp_, gho_), GitLab (glpat-), Slack (xox[baprs]-) |
| High-entropy context patterns | ✅ api_key, secret, token, credential, auth_key contexts |
| False positive exclusions | ✅ UUIDs, git SHAs (40-char hex), file hashes (64-char hex) |
| Test file confidence reduction | ✅ 0.6 confidence for test files |
| Tests | ✅ 10+ tests covering all patterns |
Configuration:
# aphoria.toml
[extractors.entropy]
min_entropy = 4.5 # Shannon entropy threshold
min_charset_variety = 0.4 # Unique chars / length ratio
min_length = 20 # Minimum string length
max_length = 200 # Maximum string length
Languages: Rust, Go, Python, JavaScript, TypeScript, YAML, TOML, JSON, Dotenv
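To make the four configuration keys concrete, here is a minimal sketch of how an entropy gate of this kind could work. The name shannon_entropy comes from the task table; everything else (EntropyConfig, charset_variety, looks_like_secret, and the choice of inclusive bounds) is an assumption for illustration, not the contents of extractors/high_entropy_secrets.rs.

```rust
use std::collections::{HashMap, HashSet};

/// Mirrors the [extractors.entropy] keys shown above.
#[derive(Debug, Clone)]
pub struct EntropyConfig {
    pub min_entropy: f64,
    pub min_charset_variety: f64,
    pub min_length: usize,
    pub max_length: usize,
}

impl Default for EntropyConfig {
    fn default() -> Self {
        Self { min_entropy: 4.5, min_charset_variety: 0.4, min_length: 20, max_length: 200 }
    }
}

/// Shannon entropy in bits per character: H = -sum(p_i * log2(p_i)).
pub fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
    }
    let total = s.chars().count() as f64;
    if total == 0.0 {
        return 0.0;
    }
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / total;
            -p * p.log2()
        })
        .sum()
}

/// Unique characters divided by total characters.
pub fn charset_variety(s: &str) -> f64 {
    let total = s.chars().count();
    if total == 0 {
        return 0.0;
    }
    let unique: HashSet<char> = s.chars().collect();
    unique.len() as f64 / total as f64
}

/// Assumption: a candidate must pass all four checks, with inclusive bounds.
pub fn looks_like_secret(s: &str, cfg: &EntropyConfig) -> bool {
    let len = s.chars().count();
    len >= cfg.min_length
        && len <= cfg.max_length
        && shannon_entropy(s) >= cfg.min_entropy
        && charset_variety(s) >= cfg.min_charset_variety
}
```

One consequence of the formula: entropy cannot exceed log2 of the number of distinct characters, so a 4.5-bit threshold can only be met by strings with at least 23 distinct characters. Shorter or more repetitive tokens would have to be caught by the known-prefix and context patterns instead.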
8.2 Framework-Specific Extractors ✅
Impact: HIGH | Effort: HIGH | Status: Complete
Research Document: docs/architecture/framework-security-extractors.md
All 10 framework-specific extractors implemented and tested:
| Framework | Extractor | Languages | Tests |
|---|---|---|---|
| Spring Boot | spring_security | Java, YAML, Properties | 7 |
| Django | django_security | Python | 7 |
| Express.js | express_security | JavaScript, TypeScript | 5 |
| Rails | rails_security | Ruby, YAML | 6 |
| ASP.NET Core | aspnet_security | C# (via regex), JSON | 6 |
| Laravel | laravel_security | PHP (via regex) | 5 |
| FastAPI | fastapi_security | Python | 5 |
| Next.js | nextjs_security | JavaScript, TypeScript | 5 |
| Flask | flask_security | Python | 6 |
| NestJS | nestjs_security | TypeScript | 5 |
Total: 10 extractors, 57+ tests, 100+ patterns
Files: extractors/{django,express,flask,fastapi,nestjs,nextjs,spring,laravel,rails,aspnet}_security.rs
8.2.1 Spring Boot Security
# application.yml misconfigs
security:
  basic:
    enabled: false  # Auth disabled
  csrf:
    enabled: false  # CSRF disabled
  headers:
    frame-options: DISABLE  # Clickjacking
// Java code patterns
@EnableWebSecurity
public class Config extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http.csrf().disable();  // CSRF disabled
        http.authorizeRequests().antMatchers("/**").permitAll();  // Auth bypass
    }
}
8.2.2 Django Security
# settings.py misconfigs
DEBUG = True # Debug in production
ALLOWED_HOSTS = ['*'] # All hosts
CSRF_COOKIE_SECURE = False # Insecure cookies
SESSION_COOKIE_SECURE = False
8.2.3 Express.js Security
// Express.js checks
app.use(helmet());  // Expected: flagged when this middleware is missing
app.use(cors({ origin: '*', credentials: true }));  // Flagged: wildcard origin with credentials
app.disable('x-powered-by');  // Expected: flagged when the header is left enabled
8.2.4 Rails Security
# config/environments/production.rb
config.force_ssl = false # Should be true
config.action_dispatch.cookies_same_site_protection = :none
8.3 Config File Deep Parsing ✅
Impact: HIGH | Effort: MEDIUM | Status: Complete
| Task | Status |
|---|---|
| ConfigValue enum | ✅ extractors/config_parser.rs |
| YAML/JSON/TOML parsers | ✅ Using serde_yaml, serde_json, toml |
| Tree walker with path tracking | ✅ walk_config() with dot-path |
| ConfigSecurityExtractor | ✅ extractors/config_security.rs |
| Security rules (11 rules) | ✅ TLS, CSRF, debug, password, cookies, CORS, rate limit |
| Dev file exclusion | ✅ Skip debug warnings in dev/test configs |
| Tests | ✅ 26 tests for parsing + security rules |
Patterns now caught (nested to any depth):
- *.tls.verify: false (TLS verification disabled)
- *.insecure_skip_verify: true (skip verification enabled)
- *.security.enabled: false (security disabled)
- *.csrf.enabled: false (CSRF protection disabled)
- debug: true (debug mode; only flagged in production files)
- *.password.min_length < 8 (weak password policy)
- *.cookie.secure: false (cookie secure flag disabled)
- *.cookie.httpOnly: false (cookie httpOnly disabled)
- *.cors.allow_origin: "*" (CORS allows all origins)
- *.rate_limit.enabled: false (rate limiting disabled)
Languages: YAML, JSON, TOML
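The "nested to any depth" behaviour follows from walking the parsed tree while tracking a dot-path. The sketch below shows the idea with a hand-rolled tree type so that it stays dependency-free. ConfigValue and walk_config are names from the task table, but their shapes here are assumptions, and path_matches and find_violations are hypothetical helpers covering only two of the listed rules.

```rust
use std::collections::BTreeMap;

/// Hypothetical parsed-config tree; the real ConfigValue enum may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum ConfigValue {
    Bool(bool),
    Int(i64),
    Str(String),
    List(Vec<ConfigValue>),
    Map(BTreeMap<String, ConfigValue>),
}

/// Depth-first walk that calls `visit` with the dot-path of every leaf value.
pub fn walk_config(value: &ConfigValue, path: &str, visit: &mut dyn FnMut(&str, &ConfigValue)) {
    match value {
        ConfigValue::Map(entries) => {
            for (key, child) in entries {
                let child_path = if path.is_empty() {
                    key.clone()
                } else {
                    format!("{}.{}", path, key)
                };
                walk_config(child, &child_path, visit);
            }
        }
        ConfigValue::List(items) => {
            for (index, child) in items.iter().enumerate() {
                walk_config(child, &format!("{}[{}]", path, index), visit);
            }
        }
        leaf => visit(path, leaf),
    }
}

/// True when `path` is `rule` itself or ends with `.rule` (the `*.` prefix in the list above).
fn path_matches(path: &str, rule: &str) -> bool {
    path == rule || path.ends_with(&format!(".{}", rule))
}

/// Two of the listed rules, applied at any nesting depth.
pub fn find_violations(root: &ConfigValue) -> Vec<String> {
    let mut hits = Vec::new();
    walk_config(root, "", &mut |path: &str, value: &ConfigValue| {
        let is_false = matches!(value, ConfigValue::Bool(false));
        if is_false && (path_matches(path, "tls.verify") || path_matches(path, "csrf.enabled")) {
            hits.push(path.to_string());
        }
    });
    hits
}
```

Matching on the path suffix rather than the full path is what lets one rule cover server.tls.verify, upstream.client.tls.verify, and a top-level tls.verify alike.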
8.4 Semantic TLS Version Detection ✅
Impact: MEDIUM | Effort: MEDIUM | Status: Complete
| Task | Status |
|---|---|
| Add Language::Terraform variant | ✅ types/language.rs |
| Semantic pattern (cross-language) | ✅ Catches TLS_MIN_VERSION = "1.0" with type annotations |
| Environment variable pattern | ✅ .env files with TLS_MIN_VERSION=1.0 |
| Terraform HCL pattern | ✅ min_tls_version = "TLS1_0" |
| Kubernetes camelCase pattern | ✅ minTLSVersion: VersionTLS10 |
| False positive prevention | ✅ TLS 1.2/1.3 not flagged |
| Tests | ✅ 16 new tests (27 total for TLS extractor) |
Patterns now caught:
- const TLS_MIN_VERSION: &str = "1.0"; (Rust with type annotation)
- let sslVersion = "TLSv1"; (JavaScript camelCase)
- TLS_MINIMUM_VERSION = "1.1" (Python assignment)
- TLS_MIN_VERSION=1.0 (dotenv)
- export SSL_VERSION=TLSv1 (shell export)
- min_tls_version = "TLS1_0" (Terraform)
- minTLSVersion: VersionTLS10 (Kubernetes YAML)
Languages: Rust, Go, Python, TypeScript, JavaScript, YAML, TOML, JSON, Terraform, Dotenv
8.5 ORM SQL Injection Detection ✅
Impact: MEDIUM | Effort: MEDIUM | Status: Complete
| Task | Status |
|---|---|
| OrmInjectionExtractor | ✅ extractors/orm_injection.rs |
| Django .raw() with interpolation | ✅ f"SELECT...", .format() patterns |
| Django .extra() with interpolation | ✅ where=["...{}...".format()] |
| SQLAlchemy text() with interpolation | ✅ text(f"SELECT...") |
| SQLAlchemy execute() with f-string | ✅ execute(f"...") |
| Sequelize raw query | ✅ sequelize.query(`...${...}`) |
| TypeORM where() | ✅ .where(`...${...}`) |
| GORM Raw() with Sprintf | ✅ .Raw(fmt.Sprintf(...)) |
| Prisma $queryRawUnsafe | ✅ $queryRawUnsafe(`...${...}`) |
| Tests | ✅ 8+ tests covering all patterns |
Languages: Python, JavaScript, TypeScript, Go
The base sql_injection extractor catches raw string interpolation but misses ORM escape hatches such as:
# SQLAlchemy
db.execute(text(f"SELECT * FROM users WHERE id = {user_id}"))
User.query.filter(text("name = '" + name + "'"))
# Django
User.objects.raw("SELECT * FROM users WHERE id = %s" % user_id)
User.objects.extra(where=["name = '%s'" % name])
// Sequelize
sequelize.query(`SELECT * FROM users WHERE id = ${userId}`);
Model.findAll({ where: sequelize.literal(`id = ${id}`) });
// Prisma
prisma.$queryRawUnsafe(`SELECT * FROM users WHERE id = ${id}`);
# ActiveRecord
User.where("name = '#{name}'")
User.find_by_sql("SELECT * FROM users WHERE id = #{id}")
8.6 Authentication Bypass Patterns ✅
Impact: HIGH | Effort: MEDIUM | Status: Complete
| Task | Status |
|---|---|
| AuthBypassExtractor | ✅ extractors/auth_bypass.rs |
| Hardcoded admin credentials | ✅ username == "admin" && password == "..." patterns |
| Debug auth headers | ✅ X-Debug-Auth, X-Internal-Auth, X-Admin-Auth |
| Skip auth env vars | ✅ SKIP_AUTH, BYPASS_AUTH, NO_AUTH, DEBUG_AUTH |
| Backdoor patterns | ✅ if username == "backdoor", if user == "test" |
| Default credentials | ✅ admin/admin, root/root, test/test, guest/guest |
| Test file confidence reduction | ✅ 0.5 confidence for test files |
| Tests | ✅ 11+ tests covering all patterns |
Detected patterns:
# Hardcoded credentials
if username == "admin" and password == "admin":
# Debug auth headers
if request.headers.get("X-Debug-Auth") == "secret":
# Skip auth env vars
if os.environ.get("SKIP_AUTH") == "true":
Languages: Python, JavaScript, TypeScript, Go, Rust
8.7 Insecure Deserialization ✅
Impact: HIGH | Effort: MEDIUM | Status: Complete
| Task | Status |
|---|---|
| InsecureDeserializationExtractor | ✅ extractors/insecure_deserialization.rs |
| Python pickle (critical) | ✅ pickle.load(), pickle.loads(), Unpickler() |
| Python yaml.load without SafeLoader | ✅ Detects missing SafeLoader |
| Python marshal | ✅ marshal.load(), marshal.loads() |
| Python eval/exec with user input | ✅ eval(request...), exec(user...) |
| JavaScript node-serialize | ✅ require('node-serialize'), .unserialize() |
| Go gob decoder | ✅ gob.NewDecoder(), gob.Decode() |
| Java ObjectInputStream (polyglot) | ✅ ObjectInputStream, readObject() |
| Tests | ✅ 10+ tests covering all patterns |
Languages: Python, JavaScript, TypeScript, Go
Unsafe deserialization of untrusted data:
# Python
pickle.loads(user_input)
yaml.load(user_input) # Without Loader=SafeLoader
eval(user_input)
exec(user_input)
// Java
ObjectInputStream ois = new ObjectInputStream(userInput);
ois.readObject(); // Dangerous!
# Ruby
Marshal.load(user_input)
YAML.load(user_input) # Should use safe_load
8.8 Path Traversal Patterns ✅
Impact: MEDIUM | Effort: LOW | Status: Complete
| Task | Status |
|---|---|
| PathTraversalExtractor | ✅ extractors/path_traversal.rs |
| Python open/read/write with user input | ✅ open(request...), read(params...) |
| Python os.path.join with user input | ✅ os.path.join(base, request...) |
| JavaScript fs operations | ✅ fs.readFile(req...), fs.writeFile(params...) |
| JavaScript path.join/resolve | ✅ path.join(base, req.query...) |
| JavaScript res.sendFile | ✅ res.sendFile(req.params...) |
| Go filepath operations | ✅ filepath.Join(base, r...), os.Open(req...) |
| Rust path operations | ✅ Path::new(request...), std::fs::read(user...) |
| Traversal literals | ✅ ../, %2e%2e URL-encoded patterns |
| Tests | ✅ 8+ tests covering all patterns |
Languages: Python, JavaScript, TypeScript, Go, Rust
File operations with user input:
# Python
open(user_input)
os.path.join(base, user_input) # Doesn't prevent ../
shutil.copy(user_input, dest)
// JavaScript
fs.readFile(userInput)
path.join(base, userInput) // Doesn't prevent ../
res.sendFile(userInput)
8.9 SSRF Patterns ✅
Impact: HIGH | Effort: MEDIUM | Status: Complete
| Task | Status |
|---|---|
| SsrfExtractor | ✅ extractors/ssrf.rs |
| Python requests library | ✅ requests.get(url), requests.post(target) |
| Python urllib | ✅ urllib.request.urlopen(url) |
| Python httpx | ✅ httpx.get(url), AsyncClient |
| JavaScript fetch | ✅ fetch(url), fetch(req.query...) |
| JavaScript axios | ✅ axios.get(url), axios.post(target) |
| JavaScript got | ✅ got(url) |
| Go http.Get/Post | ✅ http.Get(url), http.NewRequest(...) |
| Rust reqwest | ✅ reqwest::get(url), reqwest::Client |
| URL sink patterns | ✅ proxy_url, webhook_url, callback_url from request |
| Tests | ✅ 10+ tests covering all patterns |
Languages: Python, JavaScript, TypeScript, Go, Rust
HTTP requests with user-controlled URLs:
# Python
requests.get(user_url)
urllib.request.urlopen(user_input)
// JavaScript
fetch(userUrl)
axios.get(userUrl)
http.get(userUrl)
// Go
http.Get(userURL)
client.Do(req) // Where req.URL is user-controlled
8.10 Missing Security Headers ✅
Impact: MEDIUM | Effort: LOW | Status: Complete
| Task | Status |
|---|---|
| SecurityHeadersExtractor | ✅ extractors/security_headers.rs |
| X-Frame-Options disabled | ✅ X-Frame-Options: none, ALLOWALL |
| X-Content-Type-Options disabled | ✅ X-Content-Type-Options: disabled |
| X-XSS-Protection disabled | ✅ X-XSS-Protection: false |
| Django SECURE_* settings | ✅ SECURE_BROWSER_XSS_FILTER = False, etc. |
| YAML headers disabled | ✅ x_frame_options: false, hsts: no |
| CSP disabled or unsafe | ✅ unsafe-inline, unsafe-eval directives |
| HSTS disabled | ✅ Strict-Transport-Security: none, hsts_seconds = 0 |
| Tests | ✅ 7+ tests covering all patterns |
Languages: Python, JavaScript, TypeScript, Go, YAML, JSON, TOML
Detect when security headers are explicitly removed or not set:
# Response headers missing
response.headers.pop('X-Content-Type-Options')
response.headers['X-Frame-Options'] = 'ALLOWALL'
// Express without helmet
app.use(cors()); // CORS without other security
// No app.use(helmet()) found
8.11 Insecure Cookie Flags ✅
Impact: MEDIUM | Effort: LOW | Status: Complete
| Task | Status |
|---|---|
| InsecureCookiesExtractor | ✅ extractors/insecure_cookies.rs |
| Missing Secure flag | ✅ secure=False, secure: false |
| Missing HttpOnly flag | ✅ httponly=False, httpOnly: false |
| SameSite=None without Secure | ✅ sameSite: 'none', SameSite=None |
| Django settings | ✅ SESSION_COOKIE_SECURE, CSRF_COOKIE_SECURE = False |
| Go cookie patterns | ✅ Secure: false, HttpOnly: false |
| Rust actix-web patterns | ✅ .secure(false), .http_only(false) |
| Test file confidence reduction | ✅ 0.5 confidence for test files |
| Tests | ✅ 8+ tests covering all patterns |
Detected patterns:
# Python/Flask/Django
response.set_cookie('session', value, secure=False)
SESSION_COOKIE_SECURE = False
// JavaScript/Express
res.cookie('session', value, { httpOnly: false });
res.cookie('auth', value, { sameSite: 'none' });
Languages: Python, JavaScript, TypeScript, Go, Rust, Ruby, YAML
8.12 Unvalidated Redirects ✅
Impact: MEDIUM | Effort: LOW | Status: Complete
| Task | Status |
|---|---|
| UnvalidatedRedirectsExtractor | ✅ extractors/unvalidated_redirects.rs |
| Python redirect with user input | ✅ redirect(request.GET['next']), HttpResponseRedirect(url) |
| Python Flask redirect | ✅ redirect(request.args.get(...)) |
| JavaScript res.redirect | ✅ res.redirect(req.query.next) |
| JavaScript window.location | ✅ window.location = url, location.href = params... |
| Go http.Redirect | ✅ http.Redirect(w, r, r.Query...) |
| URL parameter patterns | ✅ redirect_url, return_url, next, goto from request |
| Tests | ✅ 7+ tests covering all patterns |
Languages: Python, JavaScript, TypeScript, Go
Open redirect vulnerabilities:
# Python
return redirect(request.args.get('next'))
return redirect(request.GET['url'])
// JavaScript
res.redirect(req.query.redirect);
window.location = userInput;
window.location.href = params.url;
8.13 XXE (XML External Entity) ✅
Impact: HIGH | Effort: MEDIUM | Status: Complete
| Task | Status |
|---|---|
| XxeExtractor | ✅ extractors/xxe.rs |
| Python lxml/etree | ✅ etree.parse(), lxml.fromstring() |
| Python xml.etree.ElementTree | ✅ ET.parse(), ET.fromstring() |
| Python xml.dom.minidom | ✅ minidom.parse(), minidom.parseString() |
| Python xml.sax | ✅ xml.sax.parse(), xml.sax.make_parser() |
| JavaScript xml2js | ✅ xml2js.parseString(), xml2js.Parser() |
| JavaScript libxmljs | ✅ libxmljs.parseXml() |
| Go encoding/xml | ✅ xml.Unmarshal(), xml.NewDecoder() |
| Java patterns (polyglot) | ✅ DocumentBuilderFactory, SAXParser, XMLReader |
| DTD entity declarations | ✅ <!ENTITY ... SYSTEM>, <!ENTITY ... PUBLIC> |
| defusedxml detection | ✅ Lower confidence when defusedxml is imported |
| Tests | ✅ 9+ tests covering all patterns |
Languages: Python, JavaScript, TypeScript, Go
Unsafe XML parsing:
# Python
etree.parse(user_input) # Without disabling entities
xml.etree.ElementTree.parse(user_input)
// Java
DocumentBuilderFactory.newInstance() // Without setFeature to disable XXE
SAXParserFactory.newInstance() // Without secure processing
8.14 Weak Password Requirements ✅
Impact: MEDIUM | Effort: LOW | Status: Complete
| Task | Status |
|---|---|
| WeakPasswordExtractor | ✅ extractors/weak_password.rs |
| Minimum length < 8 | ✅ password_min_length: 6, minLength: 4 |
| Bcrypt cost < 10 | ✅ bcrypt_cost = 8, hash_rounds = 5 |
| Simple length checks | ✅ len(password) >= 6 in code |
| Complexity disabled | ✅ require_special_chars: false, require_uppercase = false |
| Number requirement disabled | ✅ require_numbers: no, require_digit = 0 |
| Tests | ✅ 7+ tests covering all patterns |
Languages: Python, JavaScript, TypeScript, Go, Rust, YAML, JSON, TOML
Password validation that's too weak:
# Python
if len(password) >= 4: # Too short
if len(password) >= 6: # Still weak
MIN_PASSWORD_LENGTH = 6 # Config too low
// JavaScript
if (password.length >= 4)
const MIN_LENGTH = 6;
/^.{4,}$/ // Regex allows 4+ chars
8.15 LLM-Assisted Extraction ✅ (delivered as Phase 7.5)
Impact: VERY HIGH | Effort: VERY HIGH
Use Claude to understand code semantically:
// Pseudo-implementation: claude_api, ExtractedClaim, and
// parse_claims_from_llm_response are placeholders, not real APIs.
async fn extract_with_llm(
    code: &str,
    file: &str,
) -> Result<Vec<ExtractedClaim>, Box<dyn std::error::Error>> {
    let prompt = format!(
        "Analyze this code for security issues. Return JSON with:\n\
         - concept_path: security concept (e.g., 'tls/cert_verification')\n\
         - predicate: what aspect (e.g., 'enabled')\n\
         - value: the value found\n\
         - confidence: 0.0-1.0\n\
         - description: why this is an issue\n\n\
         File: {}\n\
         Code:\n```\n{}\n```",
        file, code
    );
    // `?` propagates API failures, so the function must return a Result.
    let response = claude_api.message(&prompt).await?;
    Ok(parse_claims_from_llm_response(&response))
}
When to use:
- High-value files (auth, crypto, config)
- After regex extractors find nothing
- For code review mode (not CI)
Considerations:
- Cost per scan
- Latency
- Rate limits
- Privacy (code leaves machine)
Implementation Priority
| Phase | Extractors | Impact | Effort | Enterprise Value | Status |
|---|---|---|---|---|---|
| 8.1 | High-entropy secrets | HIGH | MEDIUM | Catches real leaked secrets | ✅ |
| 8.2 | Framework-specific | HIGH | HIGH | Spring/Django/Express coverage | ✅ |
| 8.3 | Config deep parsing | HIGH | MEDIUM | Nested YAML/JSON understanding | ✅ |
| 8.4 | Semantic TLS | MEDIUM | MEDIUM | Catches const TLS_MIN = "1.0" | ✅ |
| 8.5 | ORM SQL injection | MEDIUM | MEDIUM | SQLAlchemy, Django, Sequelize | ✅ |
| 8.6 | Auth bypass | HIGH | MEDIUM | Backdoors, hardcoded creds | ✅ |
| 8.7 | Deserialization | HIGH | MEDIUM | pickle, Marshal, eval | ✅ |
| 8.8 | Path traversal | MEDIUM | LOW | ../../../etc/passwd | ✅ |
| 8.9 | SSRF | HIGH | MEDIUM | Internal network access | ✅ |
| 8.10 | Security headers | MEDIUM | LOW | Missing helmet(), CSP | ✅ |
| 8.11 | Cookie flags | MEDIUM | LOW | httpOnly, secure, sameSite | ✅ |
| 8.12 | Open redirects | MEDIUM | LOW | Phishing via redirect | ✅ |
| 8.13 | XXE | HIGH | MEDIUM | XML entity injection | ✅ |
| 8.14 | Weak passwords | MEDIUM | LOW | MIN_LENGTH = 4 | ✅ |
| 8.15 | LLM extraction | VERY HIGH | VERY HIGH | Semantic understanding | ✅ (Phase 7.5) |
Phase 8 Complete (8.1-8.14): All extractors implemented including 10 framework-specific extractors (Spring, Django, Express, Rails, ASP.NET, Laravel, FastAPI, Next.js, Flask, NestJS).
Success Metrics
| Metric | Current | Target | How to Measure |
|---|---|---|---|
| Detection rate (known vulns) | ~30% | >70% | Run against OWASP benchmark |
| False positive rate | Unknown | <10% | Manual review of 100 findings |
| Config file coverage | Regex only | Full parse | Structure-aware extraction |
| Framework coverage | 0 | 4 major | Spring, Django, Express, Rails |
| Enterprise pilot feedback | N/A | >4/5 | Post-pilot survey |
Phase 10: UX & Enterprise Polish ⬜
Goal: Address enterprise buyer feedback from pilot demos. Close gaps between pitch claims and actual functionality. Source: Skeptical buyer review of applications/aphoria-pitch/materials.
10.1 Acknowledgment Expiry ✅
Impact: HIGH | Effort: MEDIUM | Priority: P1
Add --expires flag to aphoria ack command for time-limited exceptions.
| Task | Status |
|---|---|
| Add expires_at: Option<String> to AcknowledgmentInfo struct (ISO 8601 format) | ✅ |
| Add --expires CLI flag to Commands::Ack in cli.rs | ✅ |
| Parse durations: --expires 90d, --expires 2026-12-31 (ISO 8601 date only) | ✅ |
| Filter expired acks in check_conflicts() | ✅ |
| Show "Ack expired, resurfaces as BLOCK" in output | ✅ |
| Add expiry to JSON export for audit trail | ✅ |
| Tests for expiry parsing and behavior | ✅ |
Implementation Notes:
- Created src/expiry.rs module with parse_expiry(), is_expired(), and format_expiry() functions
- Ack payloads stored as JSON with {reason, expires_at} for backwards compatibility
- Legacy plain-text acks treated as permanent (no expiry)
- Expired acks preserved for audit trail per patent claim 25
- Updated all report formatters (table, JSON, markdown) to show expiry info
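For readers who want a feel for the two accepted --expires forms, the following is a dependency-free sketch of parsing and expiry checking. It reuses the function names parse_expiry and is_expired from the notes above, but the signatures, the Expiry enum, the helper functions, and the "valid through the expiry date" rule are all assumptions; the actual src/expiry.rs may be structured quite differently.

```rust
/// Hypothetical parsed form of the --expires argument.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Expiry {
    /// Relative duration such as "90d".
    InDays(u32),
    /// Absolute date as (year, month, day), e.g. "2026-12-31".
    OnDate(i32, u32, u32),
}

/// Days in a month of the Gregorian calendar; 0 for an invalid month.
fn days_in_month(year: i32, month: u32) -> u32 {
    let leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 if leap => 29,
        2 => 28,
        _ => 0,
    }
}

/// Parses a fixed-width, all-digit field such as "2026" or "07".
fn parse_digits(field: &str, width: usize) -> Option<u32> {
    if field.len() == width && field.bytes().all(|b| b.is_ascii_digit()) {
        field.parse().ok()
    } else {
        None
    }
}

/// Accepts "<N>d" or "YYYY-MM-DD" and rejects everything else.
pub fn parse_expiry(input: &str) -> Result<Expiry, String> {
    let s = input.trim();
    if let Some(days) = s.strip_suffix('d') {
        return days
            .parse::<u32>()
            .map(Expiry::InDays)
            .map_err(|_| format!("invalid duration: {}", s));
    }
    let fields: Vec<&str> = s.split('-').collect();
    if fields.len() != 3 {
        return Err(format!("expected <N>d or YYYY-MM-DD, got: {}", s));
    }
    let parsed = (parse_digits(fields[0], 4), parse_digits(fields[1], 2), parse_digits(fields[2], 2));
    match parsed {
        (Some(y), Some(m), Some(d)) if d >= 1 && d <= days_in_month(y as i32, m) => {
            Ok(Expiry::OnDate(y as i32, m, d))
        }
        _ => Err(format!("invalid date: {}", s)),
    }
}

/// Assumption: an ack stays valid through its expiry date and lapses the day after.
pub fn is_expired(expires_on: (i32, u32, u32), today: (i32, u32, u32)) -> bool {
    today > expires_on
}
```

Comparing (year, month, day) tuples works because Rust orders tuples lexicographically, which matches calendar order. Resolving a relative "90d" into an absolute date needs a clock and date arithmetic, which this sketch leaves out.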
CLI changes (cli.rs):
Ack {
    concept_path: String,
    #[arg(short, long)]
    reason: String,
    /// Optional expiry (e.g., "90d", "2026-12-31")
    #[arg(long)]
    expires: Option<String>,
},
Usage:
# Expire after 90 days
aphoria ack code://go/auth/tls/cert_verification \
--reason "Integration test environment" \
--expires 90d
# Expire on specific date (ISO 8601)
aphoria ack code://go/auth/tls/cert_verification \
--reason "Legacy migration - ends Q2" \
--expires 2026-12-31
Output after expiry:
BLOCK code://go/auth/tls/cert_verification
Your code: TLS certificate verification is disabled (main.go:12)
Note: Previous acknowledgment expired 2026-12-31
Action: Re-acknowledge or fix the issue
Enterprise Value: "Exceptions don't become permanent." SOC 2 auditors love time-limited exceptions because they force periodic review.
10.2 Human-Readable Signer Names ⬜
Impact: MEDIUM | Effort: MEDIUM | Priority: P2
Map issuer hex IDs to human-readable team names in output.
| Task | Status |
|---|---|
| Add signer_name: Option<String> to PackHeader | ⬜ |
| Add contact: Option<String> to PackHeader (Slack channel, email) | ⬜ |
| Update policy export/import to preserve new fields | ⬜ |
| Show "Signed by Platform Security Team" instead of hex in output | ⬜ |
| Show contact info in conflict output | ⬜ |
| Backward-compat: gracefully handle packs without new fields | ⬜ |
Output with signer name:
BLOCK code://go/auth/tls/cert_verification
Your code: TLS certificate verification is disabled (main.go:12)
Source: Acme Security Standard v3.2 (Platform Security Team)
Contact: #security-policy
Action: Fix or acknowledge with: aphoria ack <path> --reason "..."
Enterprise Value: Developers know who to contact. Auditors see clear attribution.
10.3 Speed Benchmarks ⬜
Impact: LOW | Effort: LOW | Priority: P3
Document and automate speed benchmark testing.
| Task | Status |
|---|---|
| Create benchmarks/ directory with test corpora | ⬜ |
| Automate time aphoria scan on standard corpus | ⬜ |
| Document test conditions in benchmark results | ⬜ |
| Add aphoria scan --benchmark flag for self-test | ⬜ |
| Include benchmarks in CI (optional, non-blocking) | ⬜ |
Usage:
# Run benchmark on current directory
aphoria scan --benchmark
# Output includes timing breakdown
Benchmark Results:
Files scanned: 767
Lines of code: 187,918
Claims extracted: 722
Conflicts found: 186
Total time: 652ms
- File discovery: 45ms
- Extraction: 487ms
- Conflict query: 120ms
Enterprise Value: "Show me the benchmark on a 100K-line codebase" → aphoria scan --benchmark
Phase 10 Completion Criteria
| Metric | Target |
|---|---|
| Ack expiry working with 90d default | ✓ |
| Demo output matches pitch slides exactly | ✓ |
| Buyer can see who signed a policy (name, not hex) | ✓ |
| Buyer can see how to contact policy owner | ✓ |
| Speed benchmarks documented and reproducible | ✓ |
Phase 11: Evidence-Based Authority 🎯
Vision: Authority comes from evidence, not titles. Merit over tenure.
Problem: All patterns treated equally. A random commit carries the same weight as a pattern backed by RFC research and product specs.
Principle: The system rewards documentation, not tenure.
Evidence Levels
| Level | Example | Authority Weight | Graduation Threshold |
|---|---|---|---|
| ProductSpec | specs/api-design.md → REQ-API-001 | 0.95 | 1 usage |
| Standard | RFC 7519, OWASP A03:2021 | 0.85 | 3 usages |
| Research | ADR-042, docs/decision-log.md | 0.70 | 5 usages |
| Commit | Just code, no context | 0.40 | 10 usages |
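The table above maps directly onto an enum with two lookup methods. The sketch below takes the EvidenceLevel variants and the authority_weight() name from task 11.1; graduation_threshold() and is_promotion_candidate() are hypothetical additions used to show how the usage thresholds could be attached to the same type.

```rust
/// Evidence levels, ordered from weakest to strongest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EvidenceLevel {
    Commit,
    Research,
    Standard,
    ProductSpec,
}

impl EvidenceLevel {
    /// Authority weight from the Evidence Levels table.
    pub fn authority_weight(self) -> f64 {
        match self {
            EvidenceLevel::Commit => 0.40,
            EvidenceLevel::Research => 0.70,
            EvidenceLevel::Standard => 0.85,
            EvidenceLevel::ProductSpec => 0.95,
        }
    }

    /// Usages required before a pattern becomes a promotion candidate.
    pub fn graduation_threshold(self) -> u32 {
        match self {
            EvidenceLevel::Commit => 10,
            EvidenceLevel::Research => 5,
            EvidenceLevel::Standard => 3,
            EvidenceLevel::ProductSpec => 1,
        }
    }

    /// Hypothetical helper: has this pattern been seen often enough to graduate?
    pub fn is_promotion_candidate(self, usages: u32) -> bool {
        usages >= self.graduation_threshold()
    }
}
```

Declaring the variants from weakest to strongest lets the derived ordering double as an evidence ranking, so EvidenceLevel::Standard > EvidenceLevel::Research holds without extra code.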
11.1 Evidence Level Types ⬜
| Task | Status |
|---|---|
| Create src/evidence/mod.rs module | ⬜ |
| Define EvidenceLevel enum (Commit, Research, Standard, ProductSpec) | ⬜ |
| Implement authority_weight() method | ⬜ |
| Add evidence level to LearnedPattern struct | ⬜ |
| Update pattern display to show evidence level | ⬜ |
11.2 Evidence Source Detection ⬜
| Task | Status |
|---|---|
| Create EvidenceSource enum | ⬜ |
| Implement commit message parsing for RFC/standard references | ⬜ |
| Implement ADR file detection (docs/adr/*.md patterns) | ⬜ |
| Implement spec file detection (specs/*.md, *.spec.md) | ⬜ |
| Add PatternEvidence::detect() auto-detection | ⬜ |
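One way the detection tasks above might fit together is a single function that inspects a commit message and the paths touched by the commit, then returns the strongest evidence it can find. This is a sketch under stated assumptions: the EvidenceSource variants, the free function detect, and the idea of using touched paths as evidence are all hypothetical, and only the path patterns (specs/*.md, *.spec.md, docs/adr/*.md) and the RFC reference come from the task list.

```rust
/// Hypothetical evidence sources, strongest first.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EvidenceSource {
    SpecFile(String),
    StandardRef(String),
    AdrFile(String),
    CommitOnly,
}

/// Matches specs/*.md and *.spec.md.
fn is_spec_path(path: &str) -> bool {
    path.ends_with(".spec.md") || (path.starts_with("specs/") && path.ends_with(".md"))
}

/// Matches docs/adr/*.md.
fn is_adr_path(path: &str) -> bool {
    path.starts_with("docs/adr/") && path.ends_with(".md")
}

/// Finds the first "RFC <number>" reference in a commit message.
fn find_rfc_reference(message: &str) -> Option<String> {
    let words: Vec<&str> = message.split_whitespace().collect();
    for pair in words.windows(2) {
        if pair[0].eq_ignore_ascii_case("rfc") {
            let digits: String = pair[1].chars().take_while(|c| c.is_ascii_digit()).collect();
            if !digits.is_empty() {
                return Some(format!("RFC {}", digits));
            }
        }
    }
    None
}

/// Returns the strongest evidence found: spec, then standard, then ADR, then commit only.
pub fn detect(commit_message: &str, touched_paths: &[&str]) -> EvidenceSource {
    if let Some(path) = touched_paths.iter().copied().find(|p| is_spec_path(p)) {
        return EvidenceSource::SpecFile(path.to_string());
    }
    if let Some(rfc) = find_rfc_reference(commit_message) {
        return EvidenceSource::StandardRef(rfc);
    }
    if let Some(path) = touched_paths.iter().copied().find(|p| is_adr_path(p)) {
        return EvidenceSource::AdrFile(path.to_string());
    }
    EvidenceSource::CommitOnly
}
```

The check order follows the authority weights in the Evidence Levels table, so a commit that both touches a spec and cites an RFC is credited with the spec. Only the spaced "RFC 7519" form is recognized here; OWASP identifiers and other reference styles would need their own matchers.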
11.3 Evidence-Aware Graduation ⬜
| Task | Status |
|---|---|
| Update GraduationManager thresholds based on evidence | ⬜ |
| ProductSpec: 1 usage → promotion candidate | ⬜ |
| Standard: 3 usages → promotion candidate | ⬜ |
| Research: 5 usages → promotion candidate | ⬜ |
| Commit-only: 10 usages → promotion candidate | ⬜ |
| Add evidence boost to shadow mode evaluation | ⬜ |
11.4 Evidence Display ⬜
| Task | Status |
|---|---|
| Update aphoria patterns show to display evidence chain | ⬜ |
| Show evidence level badge in table/JSON output | ⬜ |
| Show linked sources (ADR, spec, RFC) in conflict output | ⬜ |
| Add --evidence flag to filter patterns by evidence level | ⬜ |
Phase 11 Completion Criteria
| Metric | Target |
|---|---|
| Evidence detection working for 4 source types | ✓ |
| Graduation thresholds vary by evidence level | ✓ |
| Pattern display shows evidence chain | ✓ |
| ProductSpec-backed patterns graduate with 1 usage | ✓ |
Phase 12: Knowledge Scope Hierarchy ⬜
Vision: Knowledge applies at the right level - org, team, or project.
Problem: All knowledge exists at one flat level. No way to say "this applies org-wide" vs "this is just our team's preference."
Scope Levels
Organization Level (applies to all teams)
├── Security policies (TLS, auth, secrets) - NO opt-out
├── Compliance requirements (GDPR, SOC 2)
└── Architecture decisions (API gateway, event bus)
Team Level (applies to team's projects)
├── Coding conventions (naming, error handling)
├── Technology choices (frameworks, libraries)
└── Domain patterns (payment flows, user lifecycle)
Project Level (applies to single project)
├── Local overrides (justified exceptions)
├── Experimental patterns (not yet proven)
└── Context-specific decisions
12.1 Scope Level Types ⬜
| Task | Status |
|---|---|
| Create src/scope/mod.rs module | ⬜ |
| Define ScopeLevel enum (Organization, Team, Project) | ⬜ |
| Add scope_level and scope_id to LearnedPattern | ⬜ |
| Add ScopeConfig to .aphoria.toml | ⬜ |
| Implement --scope flag for CLI commands | ⬜ |
12.2 Scope Inheritance ⬜
| Task | Status |
|---|---|
| Implement inheritance resolution (project → team → org) | ⬜ |
| Security policies: auto-apply, no opt-out | ⬜ |
| Conventions: auto-apply, teams can override with justification | ⬜ |
| Observations: never inherited, team-specific only | ⬜ |
| Add ScopedKnowledge struct with inherited_from chain | ⬜ |
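The inheritance rules in this table can be sketched as a lookup that answers "which entry applies to this project for a given key?". ScopeLevel is named in task 12.1; KnowledgeKind, ScopedEntry, and resolve are hypothetical, and the sketch deliberately ignores the justification and evidence requirements for overrides, which belong to 12.3.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScopeLevel {
    Organization,
    Team,
    Project,
}

/// Hypothetical three-tier classification (Policies, Conventions, Observations).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KnowledgeKind {
    Policy,
    Convention,
    Observation,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ScopedEntry {
    pub key: String,
    pub value: String,
    pub kind: KnowledgeKind,
    pub level: ScopeLevel,
}

impl ScopedEntry {
    pub fn new(key: &str, value: &str, kind: KnowledgeKind, level: ScopeLevel) -> Self {
        Self { key: key.to_string(), value: value.to_string(), kind, level }
    }
}

/// Resolves the entry that applies to a project for `key`.
pub fn resolve<'a>(entries: &'a [ScopedEntry], key: &str) -> Option<&'a ScopedEntry> {
    let candidates: Vec<&'a ScopedEntry> = entries.iter().filter(|e| e.key == key).collect();

    // Rule 1: an organization-level policy always wins (no opt-out).
    let org_policy = candidates.iter().copied().find(|e| {
        e.kind == KnowledgeKind::Policy && e.level == ScopeLevel::Organization
    });
    if org_policy.is_some() {
        return org_policy;
    }

    // Rule 2: otherwise the most specific scope wins: project, then team, then org.
    for level in [ScopeLevel::Project, ScopeLevel::Team, ScopeLevel::Organization] {
        let hit = candidates.iter().copied().find(|e| {
            // Rule 3: observations are never inherited from a broader scope.
            let inheritable = e.kind != KnowledgeKind::Observation || e.level == ScopeLevel::Project;
            e.level == level && inheritable
        });
        if hit.is_some() {
            return hit;
        }
    }
    None
}
```

Checking the organization policy before walking project, team, and org is what enforces "no opt-out": a project-level entry can shadow a convention but never a policy.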
12.3 Scope Override Workflow ⬜
| Task | Status |
|---|---|
| Implement aphoria scope override command | ⬜ |
| Require justification for overrides | ⬜ |
| Require evidence link (spec, ADR, ticket) for overrides | ⬜ |
| Store override audit trail | ⬜ |
| Show overrides in SOC 2 reports | ⬜ |
12.4 Cross-Scope Queries ⬜
| Task | Status |
|---|---|
| aphoria patterns --scope org (org-level only) | ⬜ |
| aphoria patterns --scope team --exclude-inherited | ⬜ |
| aphoria patterns --scope project --only-local | ⬜ |
| Show scope in pattern list output | ⬜ |
Phase 12 Completion Criteria
| Metric | Target |
|---|---|
| 3 scope levels working (org/team/project) | ✓ |
| Inheritance resolution correct | ✓ |
| Overrides require justification + evidence | ✓ |
| Cross-scope queries functional | ✓ |
Phase 13: Knowledge Lifecycle Management ⬜
Vision: Knowledge ages. Patterns can be deprecated and superseded.
Problem: Knowledge exists forever. No way to deprecate patterns or track evolution.
Knowledge Status
Active → Pattern is current, enforced
Deprecated → Pattern is being phased out, migration guidance provided
Superseded → Pattern replaced by another, link to replacement
Archived → Pattern removed from active use, historical only
13.1 Knowledge Status Types ⬜
| Task | Status |
|---|---|
| Create src/lifecycle/mod.rs module | ⬜ |
| Define KnowledgeStatus enum | ⬜ |
| Add Deprecated variant with reason, superseded_by, sunset_date | ⬜ |
| Add KnowledgeLifecycle struct with status history | ⬜ |
| Store lifecycle in pattern metadata | ⬜ |
13.2 Deprecation Command ⬜
| Task | Status |
|---|---|
| Implement `aphoria deprecate <pattern-id>` command | ⬜ |
| Require `--reason` flag | ⬜ |
| Optional `--superseded-by <new-pattern>` | ⬜ |
| Optional `--sunset-date <ISO-8601>` | ⬜ |
| Notify connected teams on deprecation | ⬜ |
13.3 Migration Guidance ⬜
| Task | Status |
|---|---|
| Show deprecation warning in scan output | ⬜ |
| Link to superseding pattern when available | ⬜ |
| Show migration guide/ADR when linked | ⬜ |
| FLAG (not BLOCK) deprecated pattern usage | ⬜ |
| Track migration progress across projects | ⬜ |
13.4 Migration Tracking Dashboard ⬜
| Task | Status |
|---|---|
| Implement `aphoria migrations status` command | ⬜ |
| Show progress by team (X/Y endpoints migrated) | ⬜ |
| Show days remaining until sunset | ⬜ |
| Show blockers (acknowledged exceptions) | ⬜ |
| Export migration status for reporting | ⬜ |
Phase 13 Completion Criteria
| Metric | Target |
|---|---|
| Deprecation command working | ✓ |
| Deprecated patterns show warning in scan | ✓ |
| Migration tracking across projects | ✓ |
| SOC 2 report includes migration status | ✓ |
Phase 14: Governance Workflows ⬜
Vision: Clear approval paths for pattern promotion with audit trails.
Problem: Governance is binary: manual review or >0.95 auto-promote. No structured approval workflows.
14.1 Approval Workflow Definition ⬜
| Task | Status |
|---|---|
| Create `src/governance/mod.rs` module | ⬜ |
| Define `ApprovalWorkflow` struct | ⬜ |
| Define `ApprovalStage` with required approvers | ⬜ |
| Support evidence-based auto-approve thresholds | ⬜ |
| Config: define workflows in `.aphoria.toml` | ⬜ |
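
The evidence-based auto-approve thresholds could reuse the authority model from the vision update (ProductSpec 0.95 / 1 usage, Standard 0.85 / 3, Research 0.70 / 5, commit only 0.40 / 10). The sketch below assumes auto-approval simply means "observed usages have reached the graduation count for that evidence level"; `EvidenceLevel` and `can_auto_approve` are hypothetical names.

```rust
/// Evidence levels, declared weakest first so derived ordering
/// supports comparisons such as "Research or stronger".
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EvidenceLevel {
    CommitOnly,
    Research,
    Standard,
    ProductSpec,
}

impl EvidenceLevel {
    /// Authority score from the roadmap's authority model.
    pub fn authority(self) -> f64 {
        match self {
            EvidenceLevel::ProductSpec => 0.95,
            EvidenceLevel::Standard => 0.85,
            EvidenceLevel::Research => 0.70,
            EvidenceLevel::CommitOnly => 0.40,
        }
    }

    /// Usages required before a pattern graduates.
    pub fn usages_to_graduate(self) -> u32 {
        match self {
            EvidenceLevel::ProductSpec => 1,
            EvidenceLevel::Standard => 3,
            EvidenceLevel::Research => 5,
            EvidenceLevel::CommitOnly => 10,
        }
    }
}

/// Assumed rule: auto-approve once usages reach the graduation count.
pub fn can_auto_approve(level: EvidenceLevel, observed_usages: u32) -> bool {
    observed_usages >= level.usages_to_graduate()
}
```

With this shape, a spec-backed pattern clears on its first usage while a commit-only pattern waits for ten, and anything below its threshold would fall through to the manual approval stages.
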
14.2 Approval State Machine ⬜
| Task | Status |
|---|---|
| Implement state transitions (pending → approved/rejected) | ⬜ |
| Multi-stage approval support | ⬜ |
| Timeout and escalation policies | ⬜ |
| Store approval history with timestamps | ⬜ |
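
A multi-stage approval state machine along the lines of 14.2 might be sketched as follows. It assumes each stage needs a fixed number of distinct approvers and that timestamps are supplied by the caller as Unix seconds. Timeout and escalation are left out, and every name other than `ApprovalWorkflow` and `ApprovalStage` is illustrative.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ApprovalState {
    Pending { stage: usize },
    Approved,
    Rejected { reason: String },
}

#[derive(Debug, Clone)]
pub struct ApprovalStage {
    pub name: String,
    pub required_approvers: usize,
}

/// One audit record: who did what, and when.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HistoryEntry {
    pub unix_ts: u64,
    pub actor: String,
    pub action: String,
}

pub struct ApprovalWorkflow {
    pub stages: Vec<ApprovalStage>,
    pub state: ApprovalState,
    pub history: Vec<HistoryEntry>,
    stage_approvers: Vec<String>,
}

impl ApprovalWorkflow {
    pub fn new(stages: Vec<ApprovalStage>) -> Self {
        // A workflow with no stages has nothing to wait for.
        let state = if stages.is_empty() {
            ApprovalState::Approved
        } else {
            ApprovalState::Pending { stage: 0 }
        };
        Self { stages, state, history: Vec::new(), stage_approvers: Vec::new() }
    }

    fn pending_stage(&self) -> Result<usize, String> {
        match &self.state {
            ApprovalState::Pending { stage } => Ok(*stage),
            _ => Err("workflow is already closed".to_string()),
        }
    }

    fn log(&mut self, unix_ts: u64, actor: &str, action: &str) {
        self.history.push(HistoryEntry {
            unix_ts,
            actor: actor.to_string(),
            action: action.to_string(),
        });
    }

    /// Record one approval; advance when the stage has enough approvers.
    pub fn approve(&mut self, actor: &str, unix_ts: u64) -> Result<(), String> {
        let stage = self.pending_stage()?;
        if self.stage_approvers.iter().any(|a| a == actor) {
            return Err(format!("{} already approved this stage", actor));
        }
        self.stage_approvers.push(actor.to_string());
        self.log(unix_ts, actor, "approved");
        if self.stage_approvers.len() >= self.stages[stage].required_approvers {
            self.stage_approvers.clear();
            self.state = if stage + 1 < self.stages.len() {
                ApprovalState::Pending { stage: stage + 1 }
            } else {
                ApprovalState::Approved
            };
        }
        Ok(())
    }

    /// Reject from any pending stage; the workflow closes immediately.
    pub fn reject(&mut self, actor: &str, reason: &str, unix_ts: u64) -> Result<(), String> {
        self.pending_stage()?;
        self.log(unix_ts, actor, "rejected");
        self.state = ApprovalState::Rejected { reason: reason.to_string() };
        Ok(())
    }
}
```

Passing the timestamp in, rather than reading the clock inside `approve`, keeps the transitions deterministic and easy to test.
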
14.3 Approval CLI ⬜
| Task | Status |
|---|---|
| `aphoria governance pending` - list pending approvals | ⬜ |
| `aphoria governance approve <id> --comment "..."` | ⬜ |
| `aphoria governance reject <id> --reason "..."` | ⬜ |
| `aphoria governance escalate <id>` | ⬜ |
| Show approval status in pattern list | ⬜ |
14.4 SOC 2 Audit Trail ⬜
| Task | Status |
|---|---|
| Full audit log for all governance actions | ⬜ |
| `aphoria audit trail --pattern <id>` - show timeline | ⬜ |
| Export governance history for auditors | ⬜ |
| Include approver identity and timestamp | ⬜ |
Phase 14 Completion Criteria
| Metric | Target |
|---|---|
| Multi-stage approval working | ✓ |
| Approval/reject with comments | ✓ |
| Full audit trail exportable | ✓ |
| SOC 2 evidence includes approval chain | ✓ |
Phase 15: Evidence Source Integration ⬜
Vision: ADRs, specs, and standards automatically link to patterns.
Problem: Evidence sources aren't automatically detected. Developers must manually reference them.
15.1 ADR Auto-Detection ⬜
| Task | Status |
|---|---|
| Create `src/evidence/adr.rs` | ⬜ |
| Detect ADR-XXX patterns in commit messages | ⬜ |
| Scan for ADR files in standard locations | ⬜ |
| Parse ADR content for related patterns | ⬜ |
| Link ADR to patterns automatically | ⬜ |
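
Detecting ADR references in commit messages could start as simply as the scanner below. It assumes ADR ids are `ADR-` followed by one or more digits, matches case-insensitively, and uses only the standard library; `find_adr_refs` is a hypothetical helper name.

```rust
/// Return the distinct ADR ids mentioned in `message`, in order of first
/// appearance, normalised to upper case (for example "ADR-012").
pub fn find_adr_refs(message: &str) -> Vec<String> {
    let bytes = message.as_bytes();
    let mut refs: Vec<String> = Vec::new();
    let mut i = 0;
    while i + 4 <= bytes.len() {
        // Require a word boundary so "QUADR-9" is not read as an ADR id.
        let at_boundary = i == 0 || !bytes[i - 1].is_ascii_alphanumeric();
        if at_boundary && bytes[i..i + 4].eq_ignore_ascii_case(b"ADR-") {
            let start = i + 4;
            let mut end = start;
            while end < bytes.len() && bytes[end].is_ascii_digit() {
                end += 1;
            }
            // Only count the match if at least one digit follows "ADR-".
            if end > start {
                let id = format!("ADR-{}", &message[start..end]);
                if !refs.contains(&id) {
                    refs.push(id);
                }
                i = end;
                continue;
            }
        }
        i += 1;
    }
    refs
}
```

For a message such as `"Fix token expiry per ADR-012; see also adr-7"` this yields `ADR-012` and `ADR-7`. Leading zeros are preserved, so matching ids to ADR files would still need a normalisation rule.
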
15.2 Spec File Detection ⬜
| Task | Status |
|---|---|
| Create `src/evidence/spec.rs` | ⬜ |
| Detect spec files (`specs/*.md`, `*.spec.md`) | ⬜ |
| Parse requirement IDs (REQ-XXX) | ⬜ |
| Link requirements to patterns | ⬜ |
| Show requirement coverage in reports | ⬜ |
15.3 Standard Reference Extraction ⬜
| Task | Status |
|---|---|
| Create `src/evidence/standards.rs` | ⬜ |
| Parse RFC references (RFC 7519) | ⬜ |
| Parse OWASP references (OWASP A03:2021) | ⬜ |
| Parse NIST references (NIST SP 800-53) | ⬜ |
| Auto-link to authoritative corpus | ⬜ |
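
The three reference styles listed in 15.3 could be extracted with a small token scanner. The sketch below assumes references appear exactly in the forms shown above (`RFC 7519`, `OWASP A03:2021`, `NIST SP 800-53`) and avoids a regex dependency; `StandardRef` and `extract_standard_refs` are illustrative names.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StandardRef {
    Rfc(u32),
    Owasp(String),
    NistSp(String),
}

/// OWASP Top 10 style id: 'A', two digits, ':', four-digit year.
fn is_owasp_id(t: &str) -> bool {
    let b = t.as_bytes();
    b.len() == 8
        && b[0] == b'A'
        && b[1..3].iter().all(u8::is_ascii_digit)
        && b[3] == b':'
        && b[4..].iter().all(u8::is_ascii_digit)
}

/// NIST SP number such as "800-53": starts with a digit, then
/// alphanumerics or hyphens.
fn is_sp_number(t: &str) -> bool {
    let b = t.as_bytes();
    !b.is_empty()
        && b[0].is_ascii_digit()
        && b.iter().all(|c| c.is_ascii_alphanumeric() || *c == b'-')
}

pub fn extract_standard_refs(text: &str) -> Vec<StandardRef> {
    // Strip surrounding punctuation so "(RFC" and "800-53)." still match.
    let tokens: Vec<&str> = text
        .split_whitespace()
        .map(|t| t.trim_matches(|c: char| !c.is_ascii_alphanumeric()))
        .collect();
    let mut refs = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        let next = tokens.get(i + 1).copied();
        match tokens[i] {
            "RFC" => {
                if let Some(n) = next.and_then(|t| t.parse::<u32>().ok()) {
                    refs.push(StandardRef::Rfc(n));
                    i += 1;
                }
            }
            "OWASP" => {
                if let Some(id) = next.filter(|t| is_owasp_id(t)) {
                    refs.push(StandardRef::Owasp(id.to_string()));
                    i += 1;
                }
            }
            "NIST" if next == Some("SP") => {
                if let Some(num) = tokens.get(i + 2).copied().filter(|t| is_sp_number(t)) {
                    refs.push(StandardRef::NistSp(num.to_string()));
                    i += 2;
                }
            }
            _ => {}
        }
        i += 1;
    }
    refs
}
```

Returning typed references instead of raw strings would make the later auto-link step a plain lookup against the authoritative corpus.
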
15.4 Evidence Display ⬜
| Task | Status |
|---|---|
| Show full evidence chain in pattern output | ⬜ |
| Link to source files (ADR, spec) | ⬜ |
| Show external standard references | ⬜ |
| `aphoria patterns --by-evidence` grouping | ⬜ |
Phase 15 Completion Criteria
| Metric | Target |
|---|---|
| ADR auto-detection working | ✓ |
| Spec file linking working | ✓ |
| Standard references extracted | ✓ |
| Evidence chain visible in output | ✓ |
Enterprise Pilot Success Metrics
90-Day Pilot Targets
| Metric | Target | Measurement |
|---|---|---|
| Patterns captured | 100+ observations | Count in knowledge graph |
| Patterns promoted | 10+ conventions | Count with status=Active |
| Cross-team adoption | 2+ teams connected | Unique team_ids |
| New hire guidance events | 5+ accepted suggestions | Accept rate tracking |
| False positive rate | <10% | FP feedback / total flags |
| Evidence-backed patterns | >50% | Patterns with Research+ evidence |
180-Day Production Targets
| Metric | Target | Measurement |
|---|---|---|
| Knowledge retention | 0 lost patterns on departures | Audit log |
| Onboarding velocity | 50% faster ramp | Time to first PR |
| Convention adoption | 80% across org | Compliance rate |
| SOC 2 evidence | Audit pass | External validation |
| Deprecated pattern migration | 90% complete by sunset | Migration tracking |
Enterprise Simulation UAT
See: uat/enterprise-simulation-uat.md
6-month simulation covering:
- Month 1: Platform team adopts, baseline patterns captured
- Month 2: Payments team joins, cross-team patterns emerge
- Month 3: New hire guided by existing patterns
- Month 4: Mobile team joins, org-level promotion
- Month 5: API versioning deprecated, migration tracked
- Month 6: SOC 2 audit evidence generated