stemedb/applications/aphoria/roadmap.md
jordan c65066fd1c feat(aphoria): implement ignore & exclusion system (Phase 16)
Reduces scan noise by 96% through proper exclusion of test fixtures,
demo apps, and intentional vulnerabilities.

Phase 16.1 - Glob Pattern Matching:
- Replace starts_with() with globset for ** and * patterns
- Backwards compatible with legacy prefix patterns
- Add walker/mod.rs tests for glob exclusions

Phase 16.2 - .aphoriaignore File:
- Create walker/ignore_file.rs for gitignore-style parsing
- Merge with aphoria.toml excludes
- Support # comments and whitespace trimming

Phase 16.3 - Inline Ignore Comments:
- Create extractors/ignore_comments.rs parser
- Support // aphoria:ignore, // aphoria:ignore-next-line
- Support // aphoria:ignore-block / // aphoria:end-ignore
- Multiple comment styles: //, #, /*, --, <!--
- Integrate with ExtractorRegistry.extract_all()

Phase 16.4 - Ack Export/Import:
- Create ack_file.rs for TOML serialization
- Add 'aphoria ack add' subcommand
- Add 'aphoria ack export' to .aphoria/acks.toml
- Add 'aphoria ack import' from .aphoria/acks.toml
- Preserve expiry and reason fields

Also configures stemedb with:
- aphoria.toml with glob excludes for vulnbank, extractors, fixtures
- .aphoriaignore for dashboard, community, latent, SDK examples

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 17:28:50 -07:00


Aphoria Roadmap


Phase 0: StemeDB Foundation

Tracked in: roadmap.md § 5D. Concept Hierarchy

Changes to the core database that Aphoria depends on. Shipped as Phase 5D of the main StemeDB roadmap.

| Aphoria Phase 0 | StemeDB Phase 5D | Status |
|---|---|---|
| 0.1 ConceptPath Type | 5D.1 ConceptPath Type | |
| 0.2 ConceptPath in Assertion | (implicit in 5D.1) | |
| 0.3 Hierarchical Index | 5D.4 Hierarchical Query | |
| 0.4 Alias Store | 5D.3 Alias Store + 5D.5 Alias Resolution | |
| 0.5 Source Class Inference | 5D.6 Source Class Inference | |
| 0.6 Concept API Endpoints | 5D.7 Concept API Endpoints | |

Spec: docs/specs/concept-hierarchy.md


Phase 2: CLI Core

Phase 2 was built before Phase 1 (authoritative corpus expansion). The CLI pipeline works end-to-end with a bootstrapped corpus of 11 hardcoded assertions covering TLS, JWT, CORS, secrets, and rate limiting.

Task Status
2.1 Project Walker walker/mod.rs, walker/path_mapper.rs, walker/language.rs
2.2 Extractors (10) tls_verify, jwt_config, hardcoded_secrets, timeout_config, dep_versions, cors_config, rate_limit, weak_crypto, command_injection, sql_injection
2.3 Ingestion Bridge bridge.rs — BLAKE3 hashing, Ed25519 signing, claim→assertion conversion
2.4 Conflict Query episteme.rs — LocalEpisteme with check_conflicts()
2.5 Report Output report/ — table (comfy-table), JSON, SARIF 2.1.0, markdown
2.6 Acknowledge Command lib.rs acknowledge()
Baseline & Diff lib.rs set_baseline(), show_diff()
Status Command lib.rs show_status()

183 tests pass. Clippy and fmt clean.

Phase 2 Code Quality Fixes

Code review improvements to extractors:

Issue Fix Status
DES/RC4 concept path misclassification Split check_pattern() into check_hash_pattern() and check_encryption_pattern(); DES/RC4 now use crypto/encryption/algorithm path
SHA1 edge case undocumented Added comments and test documenting that SHA1 detection is intentionally broad (triggers for git hashes, etc.)
JS exec() regex overly broad Tightened regex to require child_process. prefix or non-word/non-dot preceding character; prevents RegExp.exec() false positives

Phase 2A: Concept Matching

Status: Complete. Tail-path matching (2A.1), alias-aware queries (2A.2), and auto-alias creation (2A.3) all implemented.

2A.1 Leaf-Based Concept Matching (Aphoria-side fix)

Implemented in episteme.rs via ConceptIndex:

  • make_key(subject, predicate) extracts tail 2 path segments + predicate
  • build(assertions) creates in-memory index keyed by tail path
  • lookup(subject, predicate) finds matching authoritative assertions
  • check_conflicts() uses ConceptIndex instead of QueryEngine for cross-scheme matching

Integration tests prove TLS and JWT conflicts are detected correctly.
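
The tail-path matching described above can be illustrated with a minimal sketch. It is not the code in episteme.rs: the `Assertion` struct, the `#` separator in the key, and the exact handling of the scheme prefix are assumptions made for illustration. Only the names `ConceptIndex`, `make_key`, `build`, and `lookup` come from the list above.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified assertion used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    pub subject: String,
    pub predicate: String,
    pub value: String,
}

/// Builds a key from the last two path segments of `subject` plus the predicate.
/// The scheme (e.g. "code://", "rfc://") is dropped so paths match across schemes.
pub fn make_key(subject: &str, predicate: &str) -> String {
    let path = subject.split_once("://").map_or(subject, |(_, rest)| rest);
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let start = segments.len().saturating_sub(2);
    format!("{}#{}", segments[start..].join("/"), predicate)
}

/// In-memory index of authoritative assertions keyed by tail path + predicate.
#[derive(Default)]
pub struct ConceptIndex {
    by_key: HashMap<String, Vec<Assertion>>,
}

impl ConceptIndex {
    /// Indexes every assertion under its tail-path key.
    pub fn build(assertions: &[Assertion]) -> Self {
        let mut by_key: HashMap<String, Vec<Assertion>> = HashMap::new();
        for a in assertions {
            by_key
                .entry(make_key(&a.subject, &a.predicate))
                .or_default()
                .push(a.clone());
        }
        Self { by_key }
    }

    /// Returns the authoritative assertions sharing the same tail path and predicate.
    pub fn lookup(&self, subject: &str, predicate: &str) -> &[Assertion] {
        match self.by_key.get(&make_key(subject, predicate)) {
            Some(found) => found.as_slice(),
            None => &[],
        }
    }
}
```

Under these assumptions, code://rust/myapp/tls/cert_verification and rfc://5246/tls/cert_verification both reduce to the tail tls/cert_verification, so a code claim can find the RFC assertion even though the schemes and leading segments differ.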

2A.2 Alias Resolution in QueryEngine (StemeDB-side fix)

Wired AliasStore into QueryEngine.execute():

  • Added resolve_aliases: bool field to Query (defaults to false)
  • Added alias_store: Option<Arc<dyn AliasStore>> to QueryEngine
  • Added .with_alias_store() builder method
  • When resolve_aliases: true, expands subject via AliasStore.resolve_all() before index lookup
  • Added fetch_by_subjects() and fetch_by_subjects_predicate() for multi-subject deduplication
  • Modified Query.matches() to skip subject filtering when aliases are resolved
  • Skips fast path (MV lookup) when resolve_aliases: true
  • Gracefully degrades when no alias store is configured

7 unit tests in engine/tests/alias_resolution.rs. This is the architecturally correct long-term fix that complements leaf matching.

2A.3 Auto-Alias Creation

When Aphoria ingests authoritative assertions and code claims that share leaf names, automatically create aliases:

  • code://rust/myapp/tls/cert_verification → rfc://5246/tls/cert_verification
  • code://rust/myapp/auth/jwt/audience_validation → rfc://7519/jwt/audience_validation

This bridges 2A.1 (leaf matching) with 2A.2 (alias resolution) — leaf matching identifies candidates, aliases persist the relationship.

Implementation:

  • Added auto_create_aliases: bool config option to AliasConfig (defaults to true)
  • Added AliasOrigin::AutoDetected variant to stemedb-core for tracking auto-created aliases
  • Wired GenericAliasStore into LocalEpisteme for alias persistence
  • In check_conflicts(), when a code claim matches an authoritative claim by leaf, calls AliasStore.set_alias() to persist the relationship with AliasOrigin::AutoDetected
  • Alias creation is idempotent (skips if alias already exists)
  • 4 unit tests verify: alias creation on conflict, no creation when disabled, correct origin, idempotency

Phase 1: Authoritative Corpus Expansion

Expanded from 11 hardcoded assertions to a pluggable corpus system with RFC, OWASP, and Vendor sources.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     aphoria corpus build                         │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────────────┐  │
│  │ RFC Ingester │  │ OWASP        │  │ Vendor Bootstrapper   │  │
│  │ (Tier 0)     │  │ Ingester     │  │ (Tier 2)              │  │
│  │              │  │ (Tier 1)     │  │                       │  │
│  └──────┬───────┘  └──────┬───────┘  └───────────┬───────────┘  │
│         │                 │                      │              │
│         └─────────────────┼──────────────────────┘              │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ CorpusRegistry  │                            │
│                  └────────┬────────┘                            │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ LocalEpisteme   │                            │
│                  │ ingest_         │                            │
│                  │ authoritative() │                            │
│                  └─────────────────┘                            │
└─────────────────────────────────────────────────────────────────┘

1.1 CorpusBuilder Trait

Task Status
CorpusBuilder trait corpus/mod.rs — name, scheme, default_tier, build, requires_network
CorpusRegistry Manages multiple builders, build_all(), list_builders()
CorpusBuildResult Stats per builder, total assertions, success/fail/skip counts

1.2 RFC Ingester

Task Status
RfcCorpusBuilder corpus/rfc.rs
HTTP fetching Via ureq, cached to ~/.cache/aphoria/rfc-cache/
RFC 2119 keyword parsing MUST, MUST NOT, SHOULD, SHALL extraction
RFC-specific parsers JWT (7519), OAuth (6749), Bearer (6750), TLS 1.3 (8446), TLS BCP (7525), TOTP (6238), Basic Auth (7617), HTTP (9110)
Concept mapping rfc://{number}/{topic} at Tier 0 (Regulatory)

1.3 OWASP Ingester

Task Status
OwaspCorpusBuilder corpus/owasp.rs
HTTP fetching From GitHub raw content, cached to ~/.cache/aphoria/owasp-cache/
Markdown parsing MUST/SHOULD statements, section context
Cheat sheet parsers Authentication, JWT, TLS, Secrets, Input Validation, Session, CSRF, Password Storage, HTTP Headers
Concept mapping owasp://cheatsheet/{topic}/{claim} at Tier 1 (Clinical)

1.4 Vendor Docs

Task Status
VendorCorpusBuilder corpus/vendor.rs
PostgreSQL claims pool_size, idle_timeout, ssl_mode
Redis claims timeout, max_retries, tls
reqwest claims cert_verification, connect_timeout, request_timeout
hyper claims keep_alive_timeout, max_concurrent_streams
Go net/http claims read_timeout, write_timeout, idle_timeout, min_tls_version
tokio-postgres claims pool_size, ssl_mode
SQLx claims max_connections, idle_timeout
Concept mapping vendor://{product}/{topic}/{claim} at Tier 2 (Observational)

1.5 Hardcoded Refactor

Task Status
HardcodedCorpusBuilder corpus/hardcoded.rs — original 11 assertions
create_authoritative_assertion() Made public in episteme.rs for corpus builders

1.6 CLI Integration

Task Status
aphoria corpus build Fetches and ingests from all sources
--only rfc,owasp,vendor Filter to specific sources
--offline Skip network-requiring sources
--clear-cache Clear cache before building
aphoria corpus list List available corpus sources
CorpusConfig cache_dir, include_*, rfc_list options

1.7 Error Handling

Task Status
RfcFetch error Per-RFC fetch failures with context
OwaspFetch error Per-cheat-sheet fetch failures with context
CorpusBuild error General corpus build failures
Graceful degradation Continue with other sources if one fails

Files: corpus/mod.rs, corpus/hardcoded.rs, corpus/rfc.rs, corpus/owasp.rs, corpus/vendor.rs


Phase 3: Skill Integration

Complete. Aphoria is now usable in Claude Code agent workflows.

3.1 Claude Code Skill

Task Status
skill/SKILL.md Comprehensive skill definition with all commands
/aphoria scan Scan project, show conflicts grouped by verdict
/aphoria scan --fix Interactive fix workflow
/aphoria ack Acknowledge conflicts as intentional
/aphoria status Show status and baseline
/aphoria diff Show changes since baseline
/aphoria init Initialize Aphoria
/aphoria baseline Set baseline
skill/install.sh Install script for ~/.claude/skills/aphoria/

Files: skill/SKILL.md, skill/install.sh, skill/hooks.json

3.2 Agent Pre-Flight Hook

Task Status
--exit-code flag Returns 2 for BLOCK, 1 for FLAG only, 0 for clean
--strict flag Lower thresholds (FLAG at 0.3, BLOCK at 0.5)
Hook template skill/hooks.json with PreCommit and PrePush examples

Usage:

{
  "hooks": {
    "PreCommit": [{"command": "aphoria scan --format sarif --exit-code"}],
    "PrePush": [{"command": "aphoria scan --strict --exit-code"}]
  }
}

3.3 Alias Suggestion Workflow

Alias creation is now automatic (Phase 2A.3). When Aphoria scans:

  1. Tail-path matching finds authoritative assertions
  2. Aliases are auto-created with AliasOrigin::AutoDetected
  3. Future queries use the alias automatically

The skill documents the suggestion flow for manual alias management:

  • y (Accept): Creates alias
  • n (Reject): Records intentional difference
  • defer: Flags for later review

Phase 4: Full-Cycle Pre-Commit (Scan + Sync)

Vision: The pre-commit hook is a bidirectional knowledge sync, not just a read-only linter. Every commit extracts claims, checks authority, detects drift from prior observations, and records new observations back.

Spec: uat/2026-02-04-full-cycle-precommit-vision.md

┌─────────────────────────────────────────────────────────────┐
│                     PRE-COMMIT FLOW                          │
├─────────────────────────────────────────────────────────────┤
│  1. EXTRACT     → What claims does this code make?           │
│  2. CHECK       → Against authority + own prior claims       │
│  3. CLASSIFY    → Authority conflict | Self conflict | Novel │
│  4. UPDATE      → Record observations to local Episteme      │
│  5. GATE        → Exit code (BLOCK=2, FLAG=1, PASS=0)        │
└─────────────────────────────────────────────────────────────┘

4.1 Git Pre-Commit Hook

All flags needed for pre-commit integration are implemented:

#!/bin/sh
# .git/hooks/pre-commit
aphoria scan --staged --persist --sync --exit-code

Or using pre-commit framework:

repos:
  - repo: local
    hooks:
      - id: aphoria
        name: Aphoria Truth Sync
        entry: aphoria scan --staged --persist --sync --exit-code
        language: system
        pass_filenames: false

4.2 Baseline Mode

Already implemented in Phase 2.

4A: Observational Claims

Record code claims as Tier 4 (Community) assertions when no authority conflict exists:

Task Status
sync: bool in ScanArgs types/command.rs
observations_recorded: usize in ScanResult types/result.rs
--sync CLI flag cli.rs — requires --persist
claim_to_observation() bridge.rs — creates Tier 4 (Community, 0.3 weight) assertions
ingest_observations() in LocalEpisteme episteme/local.rs — writes to WAL + predicate index
Scan flow integration scan.rs — splits claims by conflict status, writes novel claims as observations
Handler validation handlers.rs: --sync requires --persist error
Report output report/table.rs, report/json.rs — shows observation count
Tests 5 new tests for observation write-back

Code: connection_pool.max_size = 25
Authority: (nothing)
Action: Record as Tier 4 observation (project memory)

Usage:

# Scan with observation write-back
aphoria scan --persist --sync

# Output:
# Recorded 45 observations (project memory)

4B: Self-Conflict Detection

Detect drift from the project's own prior observations:

Task Status
Query prior claims before conflict check fetch_observations_for_concept()
Compare current vs stored observations check_drift() compares values
Report changes as SELF-CONFLICT DriftResult with prior/current values
New verdict: Drift (distinct from Block/Flag) Verdict::Drift
Drift reporting in all formats table, json, markdown, sarif
Exit code includes drift --exit-code returns 1 for drift

Prior: db/pool_size = 25 (recorded 2026-01-15)
Now:   db/pool_size = 100
Result: DRIFT — "You changed pool_size from 25 to 100. Intentional?"

Files: types/result.rs, types/verdict.rs, episteme/local.rs, scan.rs, report/*.rs
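
The drift check and its effect on the exit code can be sketched as follows. This is a simplified illustration, not the code in scan.rs: values are plain strings, the `Pass` variant and the field names of `DriftResult` are assumptions, and only the exit-code mapping (BLOCK=2, FLAG or drift=1, clean=0) is taken from the text above.

```rust
/// Verdicts named in this roadmap; `Pass` is an assumed name for the clean case.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Flag,
    Drift,
    Block,
}

/// Prior and current values for a concept whose observed value changed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DriftResult {
    pub concept: String,
    pub prior: String,
    pub current: String,
}

/// Reports drift when a stored observation exists and differs from the current value.
/// A concept with no prior observation is novel, not drifted.
pub fn check_drift(concept: &str, prior: Option<&str>, current: &str) -> Option<DriftResult> {
    match prior {
        Some(p) if p != current => Some(DriftResult {
            concept: concept.to_string(),
            prior: p.to_string(),
            current: current.to_string(),
        }),
        _ => None,
    }
}

/// Maps scan verdicts to the process exit code used with --exit-code.
pub fn exit_code(verdicts: &[Verdict]) -> i32 {
    if verdicts.contains(&Verdict::Block) {
        2
    } else if verdicts
        .iter()
        .any(|v| matches!(v, Verdict::Flag | Verdict::Drift))
    {
        1
    } else {
        0
    }
}
```

With the example above, a stored db/pool_size of 25 and a current value of 100 yields a `DriftResult`, and a scan whose worst verdict is `Drift` exits with code 1.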

4C: Diff-Only Scanning

Fast scanning for pre-commit hooks:

Task Status
FileSource enum (All, Staged) types/command.rs
--staged flag (git diff --cached) cli.rs, handlers.rs
walker/git.rs git utilities find_repo_root(), get_staged_files()
walk_staged_files() walker/mod.rs — filters to scan root, applies same filters
Scan dispatch by file_source scan.rs
Error handling (NotGitRepo, GitCommand) error.rs
Tests 9 tests in tests/staged_scanning.rs
Target: < 500ms for staged-only

Files: types/command.rs, walker/git.rs, walker/mod.rs, scan.rs, cli.rs, handlers.rs, error.rs

Usage:

# Pre-commit hook (fast, staged files only)
aphoria scan --staged --exit-code

# Full cycle with observation sync
aphoria scan --staged --persist --sync --exit-code

4D: Enhanced Ack

Acknowledgments with rationale and policy updates:

Task Status
--reason "text" flag cli.rs — required on ack, bless, update commands
Store rationale in assertion metadata policy_ops.rs — stored in value/description fields
aphoria update for intentional drift policy_ops.rs — creates policy_update assertion
Policy update assertions types/mod.rs: predicates::POLICY_UPDATE

Files: cli.rs, handlers.rs, policy_ops.rs, types/command.rs, types/mod.rs

$ aphoria ack db/pool_size --reason "Scaling for Black Friday"
$ aphoria update db/pool_size 100 --reason "New baseline after load test"

4E: Hosted Mode

Organizations run their own StemeDB server and all team members automatically sync observations:

Task Status
HostedConfig in config.rs url, project_id, team_id, sync_mode, offline_fallback, api_key_env
SyncMode enum remote-only (default), local-and-remote
OfflineFallback enum skip (default), fail, queue
HostedClient HTTP client hosted.rs — retry logic, auth headers, observation push
POST /v1/aphoria/observations endpoint Server receives observations with project/team metadata
Scan integration Auto-enables sync when [hosted] configured
Hosted(String) error variant For connection/auth failures
Graceful offline fallback Based on offline_fallback config
Tests Config parsing, client creation, assertion conversion

# aphoria.toml
[hosted]
url = "https://episteme.acme.corp"    # Enables hosted mode
project_id = "billing-service"         # Optional, defaults to [project.name]
team_id = "platform-team"              # Optional, for multi-team servers
sync_mode = "remote-only"              # "remote-only" | "local-and-remote"
offline_fallback = "skip"              # "skip" | "fail" | "queue"
api_key_env = "APHORIA_API_KEY"        # Env var for auth token

Architecture:

┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ Developer A  │  │ Developer B  │  │ Developer C  │
│ aphoria scan │  │ aphoria scan │  │ aphoria scan │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │
       └─────────────────┼─────────────────┘
                         ▼
              ┌─────────────────────┐
              │ Team StemeDB Server │
              │ POST /v1/aphoria/   │
              │      observations   │
              └─────────────────────┘
                         │
                         ▼
              Aggregated team patterns

Files: config.rs, hosted.rs, scan.rs, error.rs, lib.rs, crates/stemedb-api/src/handlers/aphoria.rs, crates/stemedb-api/src/dto/aphoria.rs


Phase 4.5: Ephemeral Scan Mode

Performance optimization: 40x faster scans by skipping Episteme storage when persistence isn't needed.

Problem

Every aphoria scan was slow because it initialized the full Episteme stack:

  • WAL recovery (O(n) on every startup)
  • Dual backend initialization (fjall + redb)
  • Store and index initialization

But conflict detection is actually 100% in-memory — it never reads from the KV store. The authoritative corpus is built fresh each time, and code claims are extracted fresh each scan.

Solution

Added ScanMode enum with two modes:

| Mode | Use Case | Storage | Performance |
|---|---|---|---|
| Ephemeral (default) | CI, pre-commit, quick checks | None | ~0.25 seconds |
| Persistent | Baseline/diff tracking, alias creation | WAL + store | ~1-2 seconds |

Implementation

Task Status
ScanMode enum types.rs — Ephemeral (default), Persistent
EphemeralDetector struct episteme/mod.rs — in-memory corpus + ConceptIndex
check_conflicts_pure() Extracted as standalone function for reuse
Mode-based dispatch in run_scan() Uses EphemeralDetector for Ephemeral, LocalEpisteme for Persistent
--persist CLI flag main.rs — opt-in to persistent mode
Tests for both modes test_ephemeral_scan_no_storage_created, test_persistent_scan_creates_storage, test_scan_modes_produce_same_conflicts

Usage

# Fast ephemeral scan (default) — no storage created
aphoria scan .

# Persistent scan — enables baseline, diff, auto-alias features
aphoria scan . --persist

Performance

| Mode | Time | Storage |
|---|---|---|
| Ephemeral | ~0.25s | None |
| Persistent | ~1-2s | WAL + store directories |

Files: types.rs, episteme/mod.rs, lib.rs, main.rs, tests.rs
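
A minimal sketch of how the CLI flags might resolve to a scan mode, assuming only what this roadmap states: Ephemeral is the default, --persist opts in to Persistent, and --sync is rejected without --persist (Phase 4A). The function name `resolve_mode` and the use of a plain `String` error are hypothetical.

```rust
/// Scan modes from the table above; Ephemeral is the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum ScanMode {
    #[default]
    Ephemeral,
    Persistent,
}

/// Resolves CLI flags to a scan mode.
/// Observation write-back needs storage, so --sync without --persist is an error.
pub fn resolve_mode(persist: bool, sync: bool) -> Result<ScanMode, String> {
    if sync && !persist {
        return Err("--sync requires --persist".to_string());
    }
    Ok(if persist {
        ScanMode::Persistent
    } else {
        ScanMode::Ephemeral
    })
}
```

Keeping the default on the cheap path means CI and pre-commit runs pay for storage initialization only when a feature that needs it is requested.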


Phase 5: Research Agent Loop

Research agent fills gaps in authoritative coverage by researching official documentation.

5.1 Gap Detection

Task Status
Gap struct research/gap_detector.rs — concept_path, topic, predicate, source info
detect_gaps() Compares claims against ConceptIndex, identifies missing coverage
Topic normalization Extracts last 2 path segments for cross-scheme matching
Deduplication Deduplicates gaps by topic+predicate key

5.2 Gap Storage

Task Status
GapRecord research/gap_store.rs — tracking metadata, project count, research status
GapStore JSON-backed persistent storage with atomic saves
Project tracking Records which projects reported each gap
Research eligibility is_eligible_for_research() with threshold and cooldown
Gap pruning prune_old_gaps() removes stale entries

5.3 Quality Validation

Task Status
QualityValidator research/quality.rs — validates researched claims
Source attribution Checks for authoritative domains (rfc-editor, owasp, vendor docs)
Normative language Verifies MUST/SHOULD/SHALL keywords present
Vague content detection Rejects "it depends", "typically", etc.
Consistency scoring Detects conflicting claims on same subject
QualityReport Detailed per-claim validation results
filter_passed() Returns only claims meeting quality threshold

5.4 Research Execution

Task Status
Researcher research/researcher.rs — orchestrates research pipeline
DocumentationSource Configurable sources with URL patterns and topics
Default sources Redis, PostgreSQL, Go, Rust, OWASP, Kafka, MongoDB
Content fetching HTTP with timeout and size limits
Normative extraction Regex-based MUST/SHOULD/SHALL extraction
Section tracking Extracts heading context for attribution
Confidence scoring Based on keyword strength, statement length, content size

5.5 CLI Integration

Task Status
aphoria research run Run research agent with configurable threshold
aphoria research status Show gap statistics and research progress
aphoria research gaps List gaps by project count
--threshold Minimum projects before researching (default: 3)
--strict Use strict quality validation
--prune Remove stale gaps before researching
--ready Show only gaps ready for research

Files: research/mod.rs, research/gap_detector.rs, research/gap_store.rs, research/quality.rs, research/researcher.rs, research/tests.rs

5.7 Security Extractors

Extended Phase 2 extractors with OWASP-aligned security vulnerability detection:

| Extractor | Detects | Languages |
|---|---|---|
| weak_crypto | MD5, SHA1, DES, RC4 usage | Rust, Go, Python, JS/TS |
| command_injection | Shell execution, os.system, subprocess shell=True | Rust, Go, Python, JS/TS |
| sql_injection | String concatenation in SQL queries | Rust, Go, Python, JS/TS |

Concept paths:

  • crypto/hashing/algorithm — MD5, SHA1
  • crypto/encryption/algorithm — DES, RC4
  • os/command/input, os/shell_mode — command injection
  • db/query/input — SQL injection

5.6 Community Corpus Contributions

Users can opt in to contribute patterns anonymously to a central corpus, enabling community consensus to adjust default thresholds.

Task Status
CommunityConfig config/mod.rs — enabled (false), anonymize (true), exclude, include, min_confidence
AnonymizedObservation community/types.rs — privacy-preserving observation without file/line/text
CommunityObjectValue community/types.rs — serde-compatible version of ObjectValue
PatternAggregate community/types.rs — server-side aggregation with project counts
anonymize_claim() community/anonymizer.rs — wildcards project names, strips file/line, rounds timestamps
compute_anon_hash() Hash computed WITHOUT file/line/text (privacy-critical)
wildcard_project_path() code://rust/myapp/tls → code://rust/*/tls
--community-preview flag cli.rs — dry-run showing what WOULD be shared
PatternAggregateStore stemedb-storage — server-side pattern aggregation
Project deduplication Uses project_hash to prevent double-counting
POST /v1/aphoria/community/observations Push anonymized observations
GET /v1/aphoria/patterns Retrieve high-confidence community patterns

Privacy Model:

  • Project names wildcarded: myapp → *
  • File paths, line numbers, matched text NEVER shared
  • Timestamps rounded to hour (k-anonymity)
  • Server receives project_hash, not raw project names
  • enabled defaults to false (explicit opt-in required)
  • anonymize defaults to true (privacy-preserving by default)
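
Two of the anonymization steps above are simple enough to sketch directly. The version below is illustrative only: it assumes concept paths have the shape scheme://language/project/..., that timestamps are Unix seconds, and that "rounded to hour" means truncating down to the start of the hour. `round_to_hour` is a hypothetical helper name; `wildcard_project_path` is named in the table above, but this body is not the one in community/anonymizer.rs.

```rust
/// Replaces the project segment of a concept path with "*".
/// Assumes the layout scheme://language/project/rest; other inputs are returned unchanged.
pub fn wildcard_project_path(path: &str) -> String {
    let Some((scheme, rest)) = path.split_once("://") else {
        return path.to_string();
    };
    let mut segments: Vec<&str> = rest.split('/').collect();
    if segments.len() >= 2 {
        // Index 0 is the language, index 1 is the project name.
        segments[1] = "*";
    }
    format!("{scheme}://{}", segments.join("/"))
}

/// Truncates a Unix timestamp (seconds) to the start of its hour.
pub fn round_to_hour(unix_secs: u64) -> u64 {
    unix_secs - unix_secs % 3600
}
```

Coarsening both the path and the timestamp means observations from different projects that made the same claim in the same hour look identical to the server.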

Usage:

# Preview what would be shared (no network)
aphoria scan --community-preview

# Enable in aphoria.toml:
[community]
enabled = true
anonymize = true
min_confidence = 0.8
exclude = ["vendor://acme/internal/*"]

# Scan with sync to share patterns
aphoria scan --persist --sync

Files: community/mod.rs, community/types.rs, community/anonymizer.rs, config/mod.rs, cli.rs, handlers.rs, stemedb-storage/src/pattern_aggregate_store/


Phase 6: Federated Policy & Trust Packs

Allow teams to define their own authoritative truths and distribute them as signed Trust Packs. This enables "Enterprise Grade" compliance across distributed teams.

6.1 Trust Pack Format

Task Status
TrustPack schema policy.rs — Assertions, Aliases, Metadata, Signature
PackHeader Name, version, issuer, timestamp
Serialization rkyv for zero-copy efficiency
Signing ed25519-dalek signing and verification

6.2 Policy Management

Task Status
PolicyManager Loads local and remote (HTTP/HTTPS) policies
Caching Caches remote policies in ~/.cache/aphoria/policies/
aphoria.toml config policies list support

6.3 Core Integration

Task Status
EphemeralDetector integration Ingests policies into memory corpus/index
check_conflicts_pure update Resolves policy aliases before authoritative lookup
LocalEpisteme export helpers fetch_acknowledgments, fetch_manual_aliases

6.4 CLI Commands

Task Status
aphoria policy export Exports local ack decisions as a Trust Pack
aphoria scan policy loading Auto-loads policies from config

Files: policy.rs, config.rs, episteme/mod.rs, lib.rs, main.rs


Phase 6.5: Trust Pack Extensions

Enhancements to Trust Packs for semantic predicate matching and key management.

6.5.1 Predicate Aliases

Status: Complete. Implemented: 2026-02-06.

User Story:

As a security architect, when my policy uses required=true but the extractor emits enabled=true, I need them to match semantically.

Problem:

  • Policy blesses: code://standard/tls/cert_verification with predicate required, value true
  • Extractor emits: code://config/tls/cert_verification with predicate enabled, value false
  • Tail-path matching finds the concept (tls/cert_verification) ✓
  • But predicates differ: required vs enabled — no conflict detected ✗

Solution:

Task Description
predicate_aliases field Add to Trust Pack schema
Default aliases enabled ≡ required ≡ mandatory ≡ enforced
ConceptIndex update Check aliases during lookup
Pack-defined aliases Allow packs to specify custom alias sets

Trust Pack Schema Extension:

# In Trust Pack
[predicate_aliases]
security_enabled = ["enabled", "required", "mandatory", "enforced", "active"]
version_minimum = ["min_version", "minimum_version", "tls_min_version"]

Implementation Plan:

  1. Add predicate_aliases: HashMap<String, Vec<String>> to TrustPack
  2. Store aliases alongside assertions during import
  3. Update ConceptIndex.make_key() to normalize predicates via aliases
  4. Match during conflict detection: if predicate_a aliases to predicate_b, treat as same concept
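
The normalization step in the plan above could look roughly like this. It is a sketch under assumptions: the `PredicateAliases` type and its methods are hypothetical, and the group name (for example security_enabled) is used as the canonical predicate, which the plan does not specify.

```rust
use std::collections::HashMap;

/// Reverse lookup from each alias to the canonical name of its group.
pub struct PredicateAliases {
    canonical: HashMap<String, String>,
}

impl PredicateAliases {
    /// Builds the lookup from pack-style groups: canonical name -> member predicates.
    pub fn from_groups(groups: &HashMap<String, Vec<String>>) -> Self {
        let mut canonical = HashMap::new();
        for (name, members) in groups {
            for member in members {
                canonical.insert(member.clone(), name.clone());
            }
        }
        Self { canonical }
    }

    /// Returns the canonical name for `predicate`, or the predicate itself
    /// when it belongs to no alias group.
    pub fn normalize<'a>(&'a self, predicate: &'a str) -> &'a str {
        self.canonical
            .get(predicate)
            .map_or(predicate, String::as_str)
    }
}
```

If the index key is built from the normalized predicate, a policy using required and an extractor emitting enabled land on the same key, so the conflict in the problem statement is detected.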

6.5.2 Pack Signing Key Rotation

Status: Complete. Implemented: 2026-02-06.

User Story:

As a security admin, when our signing key is rotated, I need to re-sign all packs without losing policy content.

Problem:

  • Trust Packs are signed with Ed25519 keys
  • When keys are rotated (security best practice), existing packs become unverifiable
  • Need to re-sign packs with new key while preserving content hash

Solution:

Task Description
aphoria policy resign CLI command to re-sign pack with new key
Content hash preservation Keep content_hash unchanged, only update signature
Key rotation audit Log key rotation events
Old signature archival Optionally keep old signature for audit trail

CLI:

# Re-sign pack with new key
aphoria policy resign my-standards.pack --key-file new-private-key.pem

# Re-sign with signature chain (audit trail)
aphoria policy resign my-standards.pack --key-file new-key.pem --chain-signatures

Trust Pack Schema Extension:

pub struct TrustPack {
    // Existing fields...
    pub signature: Signature,

    // New field for key rotation audit
    pub signature_chain: Option<Vec<SignatureRecord>>,
}

pub struct SignatureRecord {
    pub issuer_public_key: [u8; 32],
    pub signature: Signature,
    pub signed_at: DateTime<Utc>,
    pub reason: Option<String>,  // "Key rotation", "Security incident", etc.
}
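
To make the rotation flow concrete, here is a self-contained sketch of what re-signing with an optional signature chain might do. It deliberately avoids any real cryptography: `Signature` is a plain byte array, timestamps are Unix seconds instead of `DateTime<Utc>`, the signing step is a caller-supplied closure, and the `content_hash`, `issuer_public_key`, and `signed_at` fields on the pack are assumed for illustration. The actual implementation would sign with ed25519-dalek.

```rust
/// Stand-in for an Ed25519 signature (64 bytes); no real crypto in this sketch.
pub type Signature = [u8; 64];

#[derive(Debug, Clone, PartialEq)]
pub struct SignatureRecord {
    pub issuer_public_key: [u8; 32],
    pub signature: Signature,
    pub signed_at: u64, // Unix seconds (assumption; the schema uses DateTime<Utc>)
    pub reason: Option<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct TrustPack {
    pub content_hash: [u8; 32],
    pub issuer_public_key: [u8; 32],
    pub signature: Signature,
    pub signed_at: u64,
    pub signature_chain: Option<Vec<SignatureRecord>>,
}

/// Re-signs `pack` with a new key. The content hash is never modified;
/// when `chain_signatures` is set, the old signature is archived first.
pub fn resign(
    pack: &mut TrustPack,
    new_public_key: [u8; 32],
    sign: impl Fn(&[u8; 32]) -> Signature,
    now: u64,
    chain_signatures: bool,
) {
    if chain_signatures {
        let previous = SignatureRecord {
            issuer_public_key: pack.issuer_public_key,
            signature: pack.signature,
            signed_at: pack.signed_at,
            reason: Some("Key rotation".to_string()),
        };
        pack.signature_chain
            .get_or_insert_with(Vec::new)
            .push(previous);
    }
    pack.issuer_public_key = new_public_key;
    pack.signature = sign(&pack.content_hash);
    pack.signed_at = now;
}
```

Because only the signature fields change, anything that references the pack by content hash keeps working after rotation.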

6.5.3 Priority

| Feature | Priority | Trigger |
|---|---|---|
| Predicate Aliases | Medium | Enterprise feedback showing predicate naming conflicts |
| Key Rotation | Low | Enterprise security key management requirements |

Documented in: uat/future-scenarios.md


Phase 7: Declarative Extractors

Enable users to define new extractors in config/policy files (TOML) without writing Rust code. This removes the recompilation bottleneck for custom pattern enforcement.

User Outcome: "I added a custom extractor to my aphoria.toml that detects our company's deprecated API patterns. Now every scan flags files using the old pattern without me writing any Rust code."

7.1 Core Types

Task Status
DeclarativeExtractorDef extractors/declarative.rs — name, description, languages, pattern, claim, confidence
DeclarativeClaimDef subject, predicate, value specification
DeclarativeValue enum MatchedText, Boolean, Text variants
DeclarativeExtractor Compiled extractor with Extractor trait impl

7.2 Configuration

Task Status
ExtractorConfig.declarative config/mod.rs: Vec<DeclarativeExtractorDef>
TOML parsing Serde deserialization with #[serde(untagged)] for value types
Example config Documented in module and config docs

Example aphoria.toml:

[[extractors.declarative]]
name = "deprecated_api_v1"
description = "Detects usage of deprecated v1 API endpoints"
languages = ["go", "rust", "python"]
pattern = '/api/v1/\w+'
claim.subject = "api/deprecated_endpoint"
claim.predicate = "version"
claim.value = "v1"
confidence = 1.0

[[extractors.declarative]]
name = "legacy_encryption"
description = "Detects legacy encryption algorithms"
languages = ["rust", "go", "python", "javascript"]
pattern = '(?i)blowfish|twofish|cast5'
claim.subject = "crypto/encryption/algorithm"
claim.predicate = "algorithm"
claim.value_from_match = true
confidence = 0.9

7.3 Validation & Security

Task Status
Name validation Non-empty required
Subject/predicate validation Non-empty required
Confidence validation Must be 0.0-1.0
Regex validation Compiled at load time, not scan time
ReDoS protection RegexBuilder with 10MB size limits
Language parsing Language::from_str() with FromStr trait
Graceful failure Invalid extractors logged as warnings, don't block others
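
The non-regex validation rules in the table can be sketched with the standard library alone. The struct below is a flattened, hypothetical stand-in for `DeclarativeExtractorDef` (the real type nests the claim and also carries languages and a compiled pattern), and `load_valid` is an assumed helper showing the graceful-failure behaviour; regex compilation is omitted because it needs the regex crate.

```rust
/// Flattened stand-in for a declarative extractor definition (sketch only).
#[derive(Debug, Clone)]
pub struct DeclarativeExtractorDef {
    pub name: String,
    pub subject: String,
    pub predicate: String,
    pub confidence: f64,
}

/// Checks the rules from the table; returns every violation, not just the first.
pub fn validate(def: &DeclarativeExtractorDef) -> Result<(), Vec<String>> {
    let mut errors = Vec::new();
    if def.name.trim().is_empty() {
        errors.push("name must not be empty".to_string());
    }
    if def.subject.trim().is_empty() {
        errors.push("claim.subject must not be empty".to_string());
    }
    if def.predicate.trim().is_empty() {
        errors.push("claim.predicate must not be empty".to_string());
    }
    // NaN fails this range check as well, which is the desired outcome.
    if !(0.0..=1.0).contains(&def.confidence) {
        errors.push(format!(
            "confidence must be within 0.0-1.0, got {}",
            def.confidence
        ));
    }
    if errors.is_empty() {
        Ok(())
    } else {
        Err(errors)
    }
}

/// Keeps valid definitions and warns about invalid ones without blocking the rest.
pub fn load_valid(defs: Vec<DeclarativeExtractorDef>) -> Vec<DeclarativeExtractorDef> {
    defs.into_iter()
        .filter(|def| match validate(def) {
            Ok(()) => true,
            Err(errors) => {
                eprintln!("skipping extractor {:?}: {}", def.name, errors.join("; "));
                false
            }
        })
        .collect()
}
```

Collecting all violations at load time lets a user fix a broken definition in one pass, while the remaining extractors still run.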

7.4 Registry Integration

Task Status
Module export extractors/mod.rs — public types
Registry registration ExtractorRegistry::new() loads from config
Enable/disable support Declarative extractors respect disabled list
Runtime addition add_from_definitions() for Trust Pack integration

7.5 Error Handling

Task Status
DeclarativeExtractor error variant error.rs — name + message
Validation errors Clear messages for each failure mode
Structured logging tracing::warn! for compilation failures

7.6 Tests

Task Status
Unit tests 22 tests in declarative.rs
Registry tests 7 tests for integration
Validation tests Empty name, subject, predicate; invalid confidence, regex, language
Extraction tests Boolean, text, matched_text value types
Deserialization tests TOML parsing for all value types

Files: extractors/declarative.rs, extractors/mod.rs, config/mod.rs, types/language.rs, error.rs


Phase 7.5: LLM-in-the-Loop Extraction

Use an LLM (Gemini) to extract claims semantically during persistent scans. This fills gaps that regex extractors can't catch, providing immediate value while the learning system builds up pattern knowledge.

Vision

Code file → Regex extractors → Claims found
                ↓
         High-value files (auth, config, crypto)
                ↓
         LLM Extractor → Additional semantic claims
                ↓
         Combined claims → Conflict detection

7.5.1 LLM Extractor Implementation

Task Status
GeminiClient struct llm/client.rs — Gemini API client using ureq
LlmExtractor struct llm/extractor.rs — orchestrates extraction with budget tracking
Prompt engineering Security-focused extraction prompt with structured JSON output
Response parsing Parse Gemini's JSON response into ExtractedClaim format
Error handling Graceful degradation when API unavailable or key missing

7.5.2 Selective Triggering

Task Status
is_high_value_file() llm/extractor.rs — auth/, config/, crypto/, security/, secrets/, certs/, ssl/, tls/, keys/, credentials/ directories
High-value file names secret, password, credential, token, auth, login, session, jwt, tls, ssl, cert, key, config, settings, security, crypto, encrypt, decrypt, oauth, saml, ldap, api_key, apikey, access_key, private
Token budget max_tokens_per_scan (default 50k), max_tokens_per_file (default 4k)
Skip conditions Only runs when regex extractors found nothing AND file is high-value
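The triggering rules above can be sketched as a path check plus a skip condition. This is a minimal sketch, not the code in llm/extractor.rs: the directory and keyword lists are copied from the table, while the matching semantics (any parent directory, case-insensitive substring match on the file stem) and the helper `should_run_llm` are assumptions.

```rust
use std::path::Path;

// Directory names and file-name keywords copied from the table above.
const HIGH_VALUE_DIRS: &[&str] = &[
    "auth", "config", "crypto", "security", "secrets",
    "certs", "ssl", "tls", "keys", "credentials",
];

const HIGH_VALUE_KEYWORDS: &[&str] = &[
    "secret", "password", "credential", "token", "auth", "login", "session",
    "jwt", "tls", "ssl", "cert", "key", "config", "settings", "security",
    "crypto", "encrypt", "decrypt", "oauth", "saml", "ldap", "api_key",
    "apikey", "access_key", "private",
];

/// Assumed semantics: a file is high-value when any parent directory is in
/// HIGH_VALUE_DIRS or its file stem contains a keyword (case-insensitive).
pub fn is_high_value_file(path: &Path) -> bool {
    let in_high_value_dir = path
        .parent()
        .into_iter()
        .flat_map(|parent| parent.components())
        .filter_map(|component| component.as_os_str().to_str())
        .any(|dir| HIGH_VALUE_DIRS.iter().any(|d| d.eq_ignore_ascii_case(dir)));

    let stem = path
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("")
        .to_ascii_lowercase();
    let name_matches = HIGH_VALUE_KEYWORDS.iter().any(|kw| stem.contains(*kw));

    in_high_value_dir || name_matches
}

/// Hypothetical helper for the skip condition: the LLM runs only when the
/// regex extractors found nothing AND the file is high-value.
pub fn should_run_llm(path: &Path, regex_claim_count: usize) -> bool {
    regex_claim_count == 0 && is_high_value_file(path)
}
```

Substring matching is deliberately broad (a stem such as `monkey` contains `key`), which trades some extra LLM calls for fewer missed files.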

7.5.3 Cost Controls

Task Status
Token tracking Arc<AtomicUsize> for thread-safe budget tracking across files
BLAKE3 caching llm/cache.rs — content hash + model + prompt version for cache key
Cache location ~/.cache/aphoria/llm-cache/
Budget enforcement within_budget() check before each LLM call
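The budget tracking described above can be illustrated with a shared atomic counter. This is a sketch under assumptions: the `TokenBudget` type and `record_usage` method are hypothetical names, and `within_budget()` is assumed to mean "tokens used so far are below `max_tokens_per_scan`".

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical budget tracker; clones share one counter, so it can be
/// handed to worker threads that process files in parallel.
#[derive(Clone)]
pub struct TokenBudget {
    used: Arc<AtomicUsize>,
    max_tokens_per_scan: usize,
}

impl TokenBudget {
    pub fn new(max_tokens_per_scan: usize) -> Self {
        Self {
            used: Arc::new(AtomicUsize::new(0)),
            max_tokens_per_scan,
        }
    }

    /// Checked before each LLM call (assumed semantics: strictly below the cap).
    pub fn within_budget(&self) -> bool {
        self.used.load(Ordering::Relaxed) < self.max_tokens_per_scan
    }

    /// Called after each LLM response with the tokens it consumed.
    pub fn record_usage(&self, tokens: usize) {
        self.used.fetch_add(tokens, Ordering::Relaxed);
    }

    pub fn used(&self) -> usize {
        self.used.load(Ordering::Relaxed)
    }
}
```

Because the check and the update are separate operations, concurrent callers can overshoot the cap by up to one in-flight request each; that is usually acceptable for a soft cost limit.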

7.5.4 Configuration

# aphoria.toml
[llm]
enabled = true                    # Enable LLM extraction (default: false)
provider = "gemini"               # Only "gemini" supported
# model defaults to DEFAULT_LLM_MODEL (currently "gemini-3-flash-preview")
api_key_env = "GEMINI_API_KEY"    # Environment variable for API key
max_tokens_per_scan = 50000       # Budget per scan
max_tokens_per_file = 4000        # Budget per file (for max_output_tokens)
high_value_only = true            # Only use on auth/config/crypto files
cache_responses = true            # Cache by content hash
timeout_secs = 60                 # API timeout
min_confidence = 0.7              # Filter claims below this confidence

Files: llm/mod.rs, llm/client.rs, llm/extractor.rs, llm/cache.rs, config/mod.rs, scan.rs, error.rs


Phase 7.6: Pattern Learning Store

When the LLM extracts a claim that the regex extractors missed, record the pattern. Track which patterns recur across projects to identify candidates for promotion to declarative extractors.

Vision

LLM extracts claim from code
        ↓
Pattern not in learned store?
        ↓
Store: { example_code, claim, project_hash }
        ↓
Same pattern seen in 5+ projects?
        ↓
Flag for promotion to declarative extractor

7.6.1 LearnedPattern Schema

Task Status
ValueType enum learning/types.rs — Text, Number, Boolean
ClaimTemplate struct learning/types.rs — subject_template, predicate, value_type, description
LearnedPattern struct learning/types.rs — full schema with timestamps, project hashes, confidence tracking
Serde serialization JSON serialization with chrono timestamps
Tests 5 unit tests for types

7.6.2 PatternStore Implementation

Task Status
PatternStore trait learning/store.rs — abstract storage interface
LocalPatternStore JSON-backed local storage at ~/.aphoria/learning/patterns.json
RwLock thread safety Write-through cache with in-memory HashMap
Deduplication find_similar() with Levenshtein similarity threshold 0.8
Pruning prune_stale() removes patterns not seen in N days
Tests 8 unit tests for store operations
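The deduplication step can be sketched with a plain Levenshtein distance normalized to 0.0-1.0. This is an illustration, not the code in learning/store.rs: the free-function `find_similar` over a slice of strings is a simplification of the trait method, and treating scores strictly above the threshold as duplicates is an assumption.

```rust
/// Two-row Levenshtein edit distance over Unicode scalar values.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            curr[j + 1] = (prev[j] + cost).min(prev[j + 1] + 1).min(curr[j] + 1);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Similarity in 0.0-1.0: 1.0 means identical, 0.0 means nothing in common.
pub fn pattern_similarity(a: &str, b: &str) -> f32 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f32 / max_len as f32
}

/// Returns the stored pattern most similar to `candidate`, if any scores
/// above `threshold` (0.8 in the table above).
pub fn find_similar<'a>(stored: &'a [String], candidate: &str, threshold: f32) -> Option<&'a str> {
    stored
        .iter()
        .map(|p| (p.as_str(), pattern_similarity(p, candidate)))
        .filter(|(_, score)| *score > threshold)
        .max_by(|x, y| x.1.total_cmp(&y.1))
        .map(|(p, _)| p)
}
```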

7.6.3 Pattern Normalization

Task Status
normalize_pattern() learning/normalizer.rs — replaces literals with placeholders
Version detection "1.0", "TLSv1.2" → <string:version>
Boolean detection true/false → <boolean>
Number detection Standalone numbers → <number>
String detection Remaining quoted strings → <string>
pattern_similarity() Levenshtein distance normalized to 0.0-1.0
Tests 17 unit tests for normalization
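The placeholder rules above can be sketched with a small hand-written scanner. This is a sketch under assumptions, not the code in learning/normalizer.rs: it takes only the code string (the original signature also takes the claim), the version heuristic in `looks_like_version` is hypothetical, and decimal numbers and apostrophes outside string literals are not handled.

```rust
/// Assumed heuristic: optional letter prefix, then digits separated by dots,
/// e.g. "1.0" or "TLSv1.2".
fn looks_like_version(s: &str) -> bool {
    let rest = s.trim_start_matches(|c: char| c.is_ascii_alphabetic());
    rest.contains('.')
        && rest.starts_with(|c: char| c.is_ascii_digit())
        && rest.ends_with(|c: char| c.is_ascii_digit())
        && rest.chars().all(|c| c.is_ascii_digit() || c == '.')
}

/// Replaces literals with typed placeholders so that similar code lines
/// normalize to the same pattern.
pub fn normalize_pattern(code: &str) -> String {
    let chars: Vec<char> = code.chars().collect();
    let mut out = String::with_capacity(code.len());
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == '"' || c == '\'' {
            // Quoted literal: look for the matching closing quote.
            if let Some(len) = chars[i + 1..].iter().position(|&x| x == c) {
                let inner: String = chars[i + 1..i + 1 + len].iter().collect();
                out.push_str(if looks_like_version(&inner) {
                    "<string:version>"
                } else {
                    "<string>"
                });
                i += len + 2;
                continue;
            }
            // Unterminated quote: keep it verbatim.
            out.push(c);
            i += 1;
        } else if c.is_alphanumeric() || c == '_' {
            // Identifier, keyword, or integer: consume the whole token.
            let start = i;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let token: String = chars[start..i].iter().collect();
            if token == "true" || token == "false" {
                out.push_str("<boolean>");
            } else if token.chars().all(|ch| ch.is_ascii_digit()) {
                out.push_str("<number>");
            } else {
                out.push_str(&token);
            }
        } else {
            out.push(c);
            i += 1;
        }
    }
    out
}
```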

7.6.4 Configuration

# aphoria.toml
[learning]
enabled = true                    # Enable pattern learning (default: false)
store = "local"                   # "local" | "hosted"
min_confidence = 0.7              # Minimum LLM confidence to learn
prune_after_days = 90             # Remove patterns not seen in N days

[learning.promotion]
min_projects = 5                  # Projects needed before promotion
min_confidence = 0.8              # Average confidence needed
auto_promote = false              # Require human approval (Phase 7.7)

7.6.5 Scan Integration

Task Status
Initialize pattern store scan.rs — only in persistent mode with learning enabled
Project hash computation BLAKE3 hash for privacy-preserving project identification
Record LLM-extracted claims After LLM extraction, record patterns meeting min_confidence
Update existing patterns Merge observations when similar pattern found
Logging Reports patterns_recorded count on scan completion

7.6.6 Error Handling

Task Status
LearningStore error variant error.rs — for storage/cache failures
Graceful degradation Store failures logged, don't block scan

Files: learning/mod.rs, learning/types.rs, learning/normalizer.rs, learning/store.rs, config/mod.rs, scan.rs, error.rs, lib.rs

Tests: 30 tests covering types, normalization, and store operations.


Phase 7.6 (Legacy Documentation)

Note: The following is the original spec for reference. See above for implemented status.

Original Schema (Reference)

/// A pattern learned from LLM extraction that could become a declarative extractor.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LearnedPattern {
    /// Unique identifier
    pub id: Uuid,

    /// Example code that triggered this pattern
    pub example_code: String,

    /// Normalized pattern (variables replaced with placeholders)
    /// e.g., "const TLS_MIN_VERSION = \"1.0\"" → "const TLS_MIN_VERSION = <version>"
    pub normalized_pattern: String,

    /// The claim this pattern produces
    pub claim_template: ClaimTemplate,

    /// Language this pattern applies to
    pub language: Language,

    /// When first seen
    pub first_seen: DateTime<Utc>,

    /// When last seen
    pub last_seen: DateTime<Utc>,

    /// Projects that have this pattern (hashed for privacy)
    pub project_hashes: HashSet<String>,

    /// Total occurrences across all projects
    pub occurrences: u32,

    /// Average LLM confidence when extracting this
    pub avg_confidence: f32,

    /// Has this been promoted to a declarative extractor?
    pub promoted: bool,

    /// If promoted, the extractor ID
    pub promoted_to: Option<String>,
}

/// Template for generating claims from a learned pattern.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ClaimTemplate {
    pub subject_template: String,  // "tls/min_version"
    pub predicate: String,         // "version"
    pub value_type: ValueType,     // String, Boolean, Number
    pub description_template: String,
}

Original PatternStore Trait (Reference)

pub trait PatternStore: Send + Sync {
    /// Record a pattern learned from LLM extraction
    fn record_pattern(&self, pattern: &LearnedPattern) -> Result<()>;

    /// Find existing pattern matching this example
    fn find_similar(&self, normalized: &str, language: Language, threshold: f32) -> Option<LearnedPattern>;

    /// Get patterns ready for promotion (threshold met)
    fn get_promotion_candidates(&self, min_projects: usize, min_confidence: f32) -> Vec<LearnedPattern>;

    /// Mark pattern as promoted
    fn mark_promoted(&self, id: &Uuid, extractor_name: &str) -> Result<()>;

    /// Prune old patterns
    async fn prune_stale(&self, max_age_days: u32) -> Result<usize>;
}

7.6.3 Pattern Normalization

Task Description
Variable extraction Identify literals that vary (versions, names, values)
Placeholder insertion Replace literals with typed placeholders
Similarity scoring Compare normalized patterns for dedup

fn normalize_pattern(code: &str, claim: &ExtractedClaim) -> String {
    // "const TLS_MIN = \"1.0\"" → "const TLS_MIN = <string:version>"
    // "pool_size: 25" → "pool_size: <number>"
    // "verify_ssl: false" → "verify_ssl: <boolean>"
}

fn similarity_score(a: &str, b: &str) -> f32 {
    // Levenshtein distance normalized to 0.0-1.0
    // Patterns with score > 0.8 are considered duplicates
}

7.6.4 Integration with Scan

// In scan.rs, after LLM extraction
for claim in llm_claims {
    // Check if this is a new pattern
    if let Some(existing) = pattern_store.find_similar(&claim.matched_text, language).await {
        // Update existing pattern
        pattern_store.increment_occurrence(&existing.id, project_hash).await?;
    } else {
        // Record new pattern
        let pattern = LearnedPattern::from_claim(&claim, &code_context, project_hash);
        pattern_store.record_pattern(&pattern).await?;
    }
}

Files: learning/mod.rs, learning/pattern.rs, learning/store.rs, learning/normalize.rs


Phase 7.7: Pattern → Extractor Promotion

High-frequency learned patterns get promoted to declarative extractors. This closes the learning loop: patterns discovered by LLM become permanent, fast regex extractors.

Vision

LearnedPattern (5+ projects, >0.8 confidence)
        ↓
Gemini: "Generate regex for this pattern"
        ↓
Candidate declarative extractor
        ↓
Validate against stored examples
        ↓
Human review (optional) → Approve/Reject
        ↓
Merge to project's .aphoria/extractors/

7.7.1 Promotion Pipeline

Task Status
PromotionPipeline promotion/pipeline.rs — orchestrates full promotion flow
RegexGenerator promotion/regex_gen.rs — Gemini LLM integration
ExtractorValidator promotion/validator.rs — ReDoS detection, timing validation
YamlWriter promotion/writer.rs — outputs to .aphoria/extractors/learned/
InteractiveReviewer promotion/review.rs — CLI review workflow
PromotionCandidate promotion/types.rs
ValidationResult promotion/types.rs

pub struct PromotionPipeline {
    pattern_store: Arc<dyn PatternStore>,
    llm_client: GeminiClient,
    validator: ExtractorValidator,
}

impl PromotionPipeline {
    /// Get patterns ready for promotion
    pub async fn get_candidates(&self) -> Result<Vec<PromotionCandidate>> {
        let patterns = self.pattern_store
            .get_promotion_candidates(5, 0.8)
            .await?;

        // generate_candidate is async and fallible, so await each call in
        // turn instead of collecting un-awaited futures.
        let mut candidates = Vec::with_capacity(patterns.len());
        for pattern in patterns {
            candidates.push(self.generate_candidate(pattern).await?);
        }
        Ok(candidates)
    }

    /// Generate declarative extractor from pattern
    async fn generate_candidate(&self, pattern: LearnedPattern) -> Result<PromotionCandidate> {
        // Ask the LLM to generate a regex
        let regex = self.llm_client.generate_regex(&pattern).await?;

        // Build declarative extractor
        let extractor = DeclarativeExtractor {
            name: pattern.id.to_string(),
            language: pattern.language,
            pattern: regex,
            claim: pattern.claim_template.clone(),
            source: ExtractorSource::Learned {
                pattern_id: pattern.id,
                projects: pattern.project_hashes.len(),
            },
        };

        // Validate against examples
        let validation = self.validator.validate(&extractor, &pattern).await;

        Ok(PromotionCandidate { pattern, extractor, validation })
    }
}

7.7.2 Regex Generation

Task Status
Multi-example prompt Includes all examples in generation prompt
Regex safety ReDoS detection prevents catastrophic backtracking
Test coverage Validates against stored examples

async fn generate_regex(
    llm: &GeminiClient,
    examples: &[String],
    claim: &ClaimTemplate,
) -> Result<String> {
    let prompt = format!(
        "Generate a regex pattern that matches all these code examples:\n\n{}\n\n\
         The regex should extract the value for claim: {}\n\
         Requirements:\n\
         - Must match ALL examples\n\
         - Use named capture groups for extracted values\n\
         - Avoid catastrophic backtracking (no nested quantifiers)\n\
         - Return ONLY the regex, no explanation",
        examples.join("\n---\n"),
        claim.subject_template
    );

    // Send the prompt, then reject unsafe patterns before returning.
    let response = llm.message(&prompt).await?;
    validate_regex_safety(&response)?;
    Ok(response)
}
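The "no nested quantifiers" requirement can be checked with a simple structural scan. The sketch below is a conservative heuristic, not the actual `validate_regex_safety` implementation, whose body is not shown in this roadmap: `has_nested_quantifier` is a hypothetical helper that flags any group containing `+` or `*` that is itself repeated with `+` or `*`. Counted repetitions such as `{2,}` are ignored.

```rust
/// Returns true when a group that contains an unbounded quantifier is itself
/// followed by an unbounded quantifier, e.g. `(a+)+` or `((\w+)\s*)*`.
pub fn has_nested_quantifier(pattern: &str) -> bool {
    let chars: Vec<char> = pattern.chars().collect();
    // One entry per open group: does it contain `+` or `*` so far?
    let mut open_groups: Vec<bool> = Vec::new();
    let mut in_class = false;
    let mut i = 0;
    while i < chars.len() {
        match chars[i] {
            // Skip the escaped character entirely.
            '\\' => {
                i += 2;
                continue;
            }
            '[' if !in_class => in_class = true,
            ']' if in_class => in_class = false,
            // Everything inside a character class is a literal.
            _ if in_class => {}
            '(' => open_groups.push(false),
            ')' => {
                let inner = open_groups.pop().unwrap_or(false);
                let repeated = matches!(chars.get(i + 1).copied(), Some('+') | Some('*'));
                if inner && repeated {
                    return true;
                }
                // An unbounded inner group makes its parent unbounded too.
                if let Some(parent) = open_groups.last_mut() {
                    *parent |= inner;
                }
            }
            '+' | '*' => {
                if let Some(current) = open_groups.last_mut() {
                    *current = true;
                }
            }
            _ => {}
        }
        i += 1;
    }
    false
}
```

The heuristic errs toward rejecting: it also flags patterns such as `(a+b)+` that may be harmless, which suits a gate on machine-generated regexes.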

7.7.3 Validation Suite

Task Status
Positive tests Must match all stored examples
ReDoS detection Detects catastrophic backtracking patterns
Performance test Timing validation with configurable threshold
False positive check Deferred to Phase 9 (sample codebase FP testing)

pub struct ExtractorValidator {
    sample_codebases: Vec<PathBuf>,  // Known-good projects for FP testing
}

impl ExtractorValidator {
    pub async fn validate(
        &self,
        extractor: &DeclarativeExtractor,
        pattern: &LearnedPattern
    ) -> ValidationResult {
        let mut result = ValidationResult::default();

        // Must match all positive examples
        for example in &pattern.examples {
            if !extractor.matches(example) {
                result.positive_failures.push(example.clone());
            }
        }

        // Must not have excessive false positives
        for codebase in &self.sample_codebases {
            let fps = self.count_false_positives(extractor, codebase).await;
            if fps > 10 {
                result.false_positive_warning = true;
            }
        }

        // Must be fast
        let duration = self.benchmark(extractor);
        if duration > Duration::from_millis(100) {
            result.performance_warning = true;
        }

        result
    }
}

7.7.4 Human Review Gate

Task Status
aphoria extractors review CLI to review pending promotions
aphoria extractors stats Show pattern store statistics
aphoria extractors candidates List promotion candidates
aphoria extractors promote Promote pattern to extractor
Approval workflow Approve, reject, or skip via InteractiveReviewer
Rejection tracking Deferred to Phase 9 (rejection reason persistence)
Auto-approve mode Deferred to Phase 9 (>0.95 confidence auto-promote)

$ aphoria extractors review

Pending promotions: 3

[1/3] Pattern: tls_min_version_const
      Examples: 47 (across 8 projects)
      Confidence: 0.91

      Generated regex: (?i)(tls|ssl)_?(min|minimum)_?version\s*[:=]\s*["']?(1\.[01])["']?

      Sample matches:
        const TLS_MIN_VERSION = "1.0"     ✓ matches
        TLS_MINIMUM_VERSION: "1.1"        ✓ matches
        ssl_min_version = "1.2"           ✗ no match (1.2 is not a deprecated version)

      [a]pprove  [r]eject  [e]dit  [s]kip  [q]uit: _

7.7.5 Extractor Output

Promoted patterns become declarative extractors in .aphoria/extractors/learned/:

# .aphoria/extractors/learned/tls_min_version_const.yaml
# Auto-generated from learned pattern. DO NOT EDIT.
# Pattern ID: 550e8400-e29b-41d4-a716-446655440000
# Learned from: 8 projects, 47 occurrences
# Confidence: 0.91
# Promoted: 2026-02-10

name: "tls_min_version_const"
language: ["rust", "go", "python", "javascript", "typescript"]
pattern: '(?i)(tls|ssl)_?(min|minimum)_?version\s*[:=]\s*["'']?(1\.[01])["'']?'
claim:
  subject: "tls/min_version"
  predicate: "version"
  value_capture: 3  # Capture group 3 holds the version
  description: "TLS minimum version set to deprecated {value}"
metadata:
  source: "learned"
  pattern_id: "550e8400-e29b-41d4-a716-446655440000"
  projects: 8
  occurrences: 47
  confidence: 0.91

7.7.6 Configuration

# aphoria.toml
[promotion]
enabled = true                    # Enable promotion pipeline
auto_promote = false              # Require human approval
output_dir = ".aphoria/extractors/learned"
min_confidence = 0.8              # Minimum to consider
min_projects = 5                  # Projects needed before promotion
require_validation = true         # Must pass validation suite

Files: promotion/mod.rs, promotion/pipeline.rs, promotion/regex_gen.rs, promotion/validator.rs, promotion/review.rs, promotion/writer.rs, promotion/types.rs, handlers/extractors.rs

Tests: 43 tests covering pipeline, validation, regex generation, and YAML output.


Phase 9: Autonomous Extractor Generation

The system generates, tests, and deploys extractors without human approval for high-confidence patterns. This is the endgame: a fully self-improving extraction system.

Vision

Learned pattern exceeds autonomous threshold (>0.95 confidence, >10 projects)
        ↓
Auto-generate extractor
        ↓
Validate against comprehensive test suite
        ↓
A/B test: run new extractor in shadow mode
        ↓
If FP rate < 5%: auto-deploy
        ↓
If FP rate spikes: auto-rollback

Phase 7.8: LLM Prompt Evaluation

Measure and improve LLM extraction quality through golden fixtures and regression detection. This makes it possible to iterate on prompts without regressing existing extraction quality.

Vision

Golden Fixtures (TOML)                 Evaluation Harness
   ├── tls-001: verify=False            ├── Load fixtures
   ├── jwt-001: algorithm=none    -->   ├── Run extraction (live/cached/mock)
   └── secrets-001: hardcoded key       ├── Match against expectations
                                        ├── Compute precision/recall/F1
                                        └── Compare to baseline (regression detection)

7.8.1 Fixture Format

Task Status
Fixture type eval/fixture.rs — TOML-based test cases
ExpectedClaim Subject/predicate/value expectations
must_contain Claims that MUST be extracted (recall)
must_not_contain Claims that MUST NOT appear (precision)
FixtureLoader Load fixtures from directory tree
CorpusManifest Corpus metadata + baseline metrics
Validation Duplicate ID, empty content, missing expectations

# tests/llm_fixtures/tls/tls-001-disabled-verification.toml
[metadata]
id = "tls-001"
name = "TLS verification disabled in Python requests"
category = "tls"
language = "python"

[input]
filename = "api_client.py"
content = """
response = requests.get(url, verify=False)
"""

[expected]
must_contain = [
    { subject = "tls/cert_verification", predicate = "enabled", value = false }
]
must_not_contain = [
    { subject = "tls/cert_verification", predicate = "enabled", value = true }
]

7.8.2 Claim Matching

Task Status
ClaimMatcher eval/matcher.rs — Flexible claim comparison
Tail-path matching Last 2 segments for subject comparison
Type coercion Boolean↔string ("true"/"yes"), number↔string
Confidence thresholds Optional min_confidence per expectation
count_false_positives() Detect unexpected claims
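Tail-path matching and type coercion can be sketched as follows. This is an illustration of the two rules in the table, not the code in eval/matcher.rs: the `Value` enum and the function names are hypothetical, and accepting "no" as false alongside "yes" as true is an assumption.

```rust
/// Hypothetical claim value type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bool(bool),
    Number(f64),
    Text(String),
}

/// Last `n` non-empty path segments of a subject.
fn tail(subject: &str, n: usize) -> Vec<&str> {
    let segments: Vec<&str> = subject.split('/').filter(|s| !s.is_empty()).collect();
    let start = segments.len().saturating_sub(n);
    segments[start..].to_vec()
}

/// Subjects match when their last two segments agree.
pub fn subjects_match(expected: &str, actual: &str) -> bool {
    tail(expected, 2) == tail(actual, 2)
}

fn as_bool(v: &Value) -> Option<bool> {
    match v {
        Value::Bool(b) => Some(*b),
        Value::Text(s) => match s.to_ascii_lowercase().as_str() {
            "true" | "yes" => Some(true),
            "false" | "no" => Some(false),
            _ => None,
        },
        Value::Number(_) => None,
    }
}

fn as_number(v: &Value) -> Option<f64> {
    match v {
        Value::Number(n) => Some(*n),
        Value::Text(s) => s.trim().parse().ok(),
        Value::Bool(_) => None,
    }
}

/// Values match when equal, or equal after boolean or numeric coercion.
pub fn values_match(expected: &Value, actual: &Value) -> bool {
    if expected == actual {
        return true;
    }
    if let (Some(a), Some(b)) = (as_bool(expected), as_bool(actual)) {
        return a == b;
    }
    if let (Some(a), Some(b)) = (as_number(expected), as_number(actual)) {
        return a == b;
    }
    false
}
```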

7.8.3 Metrics Computation

Task Status
Metrics eval/metrics.rs — Aggregate evaluation metrics
Precision/Recall/F1 Standard information retrieval metrics
Per-category breakdown Metrics by fixture category
Cost estimation Token-based cost tracking
BaselineComparison Compare current run to stored baseline
Regression detection Flag if F1/precision/recall drop > threshold
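The metrics and the regression check can be written out directly. This is a sketch, not the code in eval/metrics.rs: the `Metrics` fields and method names here are assumptions, and the threshold is treated as an absolute drop (0.05 means five points of precision, recall, or F1).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Metrics {
    pub precision: f64,
    pub recall: f64,
    pub f1: f64,
}

impl Metrics {
    /// tp: expected claims found; fp: unexpected claims; fn_: expected claims missed.
    pub fn from_counts(tp: u32, fp: u32, fn_: u32) -> Self {
        // Define the ratio as 0.0 when there is nothing to divide by.
        let ratio = |num: u32, den: u32| {
            if den == 0 {
                0.0
            } else {
                f64::from(num) / f64::from(den)
            }
        };
        let precision = ratio(tp, tp + fp);
        let recall = ratio(tp, tp + fn_);
        let f1 = if precision + recall == 0.0 {
            0.0
        } else {
            2.0 * precision * recall / (precision + recall)
        };
        Self { precision, recall, f1 }
    }

    /// True when any metric dropped by more than `threshold` versus the baseline.
    pub fn regressed_from(&self, baseline: &Metrics, threshold: f64) -> bool {
        baseline.f1 - self.f1 > threshold
            || baseline.precision - self.precision > threshold
            || baseline.recall - self.recall > threshold
    }
}
```

Improvements never count as regressions, because only positive differences between baseline and current values are compared with the threshold.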

7.8.4 Evaluation Harness

Task Status
EvalHarness eval/harness.rs — Orchestrates evaluation runs
EvalMode::Live Real LLM API calls
EvalMode::Cached Use cached responses (deterministic CI)
EvalMode::Mock No LLM, tests harness itself
EvalVerdict Pass, Regression, Review, Error
update_baseline() Save current metrics as new baseline

7.8.5 Report Generation

Task Status
Report eval/report.rs — Multi-format output
Table format Terminal tables with color-coded results
JSON format Machine-readable for CI/CD integration
Markdown format Documentation and PR comments
Failed fixture details Shows unmatched expectations with rationale

7.8.6 CLI Commands

Task Status
aphoria eval run Run evaluation against fixtures
aphoria eval baseline Show current baseline metrics
aphoria eval update-baseline Update baseline (--force required)
aphoria eval list-fixtures List available fixtures by category
aphoria eval validate-fixtures Validate fixture format
--fail-on-regression Exit code 1 if regression detected
--threshold Configurable regression threshold (default 5%)
--mode live, cached, or mock

# Run evaluation in mock mode
aphoria eval run --fixtures tests/llm_fixtures --mode mock

# CI: fail on regression
aphoria eval run --mode cached --fail-on-regression --threshold 0.05

# Update baseline after prompt improvements
aphoria eval update-baseline --fixtures tests/llm_fixtures --force

# List fixtures by category
aphoria eval list-fixtures --category tls

7.8.7 Seed Fixtures

| Category | Fixture | Description |
|----------|---------|-------------|
| tls | tls-001 | Python requests verify=False |
| tls | tls-002 | Node.js TLSv1 deprecated protocol |
| jwt | jwt-001 | Algorithm 'none' allowed |
| jwt | jwt-002 | Go WithoutClaimsValidation |
| secrets | secrets-001 | Hardcoded API key |
| secrets | secrets-002 | High-entropy JWT in config |
| auth | auth-001 | Debug authentication bypass |
| negative | negative-001 | Safe TLS config (no findings expected) |
| negative | negative-002 | Env-loaded secrets (no findings expected) |
| edge | edge-001 | Empty file edge case |

Files: eval/mod.rs, eval/fixture.rs, eval/matcher.rs, eval/metrics.rs, eval/harness.rs, eval/report.rs, handlers/eval.rs, cli.rs, tests/llm_fixtures/

Documentation: docs/llm-optimization/ — Full optimization playbook with decision trees, research templates, and baseline tracking.


9.1 Autonomous Promotion

Task Description Status
AutonomousConfig Configuration with kill switch (enabled: false default)
High-confidence threshold Skip human review for >0.95 confidence
Project threshold Require >10 projects for autonomous
Validation strictness Zero failures, zero warnings required
should_auto_promote() Decision logic on PromotionCandidate
auto_promotion_blockers() Explains why pattern can't be auto-promoted
AutonomousAuditLog JSONL audit trail for all decisions
smart_auto_promote_all() Pipeline integration with audit logging
YAML header enhancement "AUTO-PROMOTED" + "Approved by: autonomous"
CLI command aphoria extractors auto-promote [--dry-run]
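The decision logic in the table above can be sketched as a list of blockers, where an empty list means the pattern may be auto-promoted. The function names come from the table, but the signatures, the `Candidate` struct, and the use of inclusive thresholds (`>=`) are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy)]
pub struct AutonomousConfig {
    pub enabled: bool,
    pub min_confidence: f32,
    pub min_projects: usize,
    pub require_zero_failures: bool,
    pub require_zero_warnings: bool,
}

impl Default for AutonomousConfig {
    fn default() -> Self {
        Self {
            enabled: false, // kill switch: opt-in only
            min_confidence: 0.95,
            min_projects: 10,
            require_zero_failures: true,
            require_zero_warnings: true,
        }
    }
}

/// Hypothetical, flattened view of a promotion candidate.
#[derive(Debug, Clone, Copy)]
pub struct Candidate {
    pub avg_confidence: f32,
    pub project_count: usize,
    pub validation_failures: usize,
    pub validation_warnings: usize,
}

/// Explains why a pattern cannot be auto-promoted; empty means it can.
pub fn auto_promotion_blockers(c: &Candidate, cfg: &AutonomousConfig) -> Vec<String> {
    let mut blockers = Vec::new();
    if !cfg.enabled {
        blockers.push("autonomous mode is disabled".to_string());
    }
    if c.avg_confidence < cfg.min_confidence {
        blockers.push(format!("confidence {:.2} below {:.2}", c.avg_confidence, cfg.min_confidence));
    }
    if c.project_count < cfg.min_projects {
        blockers.push(format!("{} projects, need {}", c.project_count, cfg.min_projects));
    }
    if cfg.require_zero_failures && c.validation_failures > 0 {
        blockers.push(format!("{} validation failure(s)", c.validation_failures));
    }
    if cfg.require_zero_warnings && c.validation_warnings > 0 {
        blockers.push(format!("{} validation warning(s)", c.validation_warnings));
    }
    blockers
}

pub fn should_auto_promote(c: &Candidate, cfg: &AutonomousConfig) -> bool {
    auto_promotion_blockers(c, cfg).is_empty()
}
```

Returning the blockers rather than a bare boolean lets the same check feed both the `--dry-run` output and the audit log.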

Safety Features:

  • Kill switch: enabled: false by default (opt-in only)
  • Auditability: All decisions logged to ~/.aphoria/audit/autonomous-decisions.jsonl
  • Reversibility: Can delete YAML + reset pattern.promoted
  • Blast radius: One pattern = one YAML file
  • Traceability: YAML header shows approval source

Files: config/types/autonomous.rs, promotion/audit.rs, promotion/types.rs, promotion/pipeline.rs, promotion/writer.rs, handlers/extractors.rs

Configuration:

[autonomous]
enabled = true            # Master switch (default: false)
min_confidence = 0.95     # Stricter than standard 0.8
min_projects = 10         # Stricter than standard 5
require_zero_failures = true
require_zero_warnings = true
audit_log = true
audit_dir = "~/.aphoria/audit/"

CLI Usage:

# Preview what would be auto-promoted
aphoria extractors auto-promote --dry-run

# Run autonomous promotion
aphoria extractors auto-promote

# Override thresholds
aphoria extractors auto-promote --min-confidence 0.97 --min-projects 15

9.2 Shadow Mode Testing

Task Description Status
ShadowConfig Configuration for shadow mode (min_scans, max_fp_rate, rollback_threshold)
ShadowTest, ShadowStatus, ShadowMetrics Core types for tracking shadow extractors
ShadowStore JSONL persistence for tests, matches, and decisions
ShadowExtractorRegistry Loads shadow extractors from learned/ directory
ShadowExecutor Runs shadow extractors during scans, stores matches separately
FeedbackCollector TP/FP feedback collection and metrics update
GraduationManager Shadow → production promotion and rollback logic
CLI commands shadow-status, feedback, graduate, rollback

Safety Features:

  • Shadow isolation: Matches stored separately, not in production output
  • Metrics transparency: FP rate visible via shadow-status
  • Graduation gate: Must meet min_scans (100) + max_fp_rate (5%) + feedback exists
  • Manual control: rollback command for immediate removal
  • Audit trail: All decisions logged to decisions.jsonl
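The graduation gate above combines three conditions, sketched below. The struct fields and `can_graduate` are hypothetical names; the FP rate is assumed to be false positives divided by reviewed matches, and the gate is assumed to pass when the rate is at or below `max_fp_rate`.

```rust
#[derive(Debug, Clone, Copy)]
pub struct ShadowConfig {
    pub min_scans: u32,
    pub max_fp_rate: f64,
}

impl Default for ShadowConfig {
    fn default() -> Self {
        Self { min_scans: 100, max_fp_rate: 0.05 }
    }
}

#[derive(Debug, Clone, Copy, Default)]
pub struct ShadowMetrics {
    pub scans: u32,
    pub true_positives: u32,
    pub false_positives: u32,
}

impl ShadowMetrics {
    /// Matches that received TP/FP feedback.
    pub fn reviewed(&self) -> u32 {
        self.true_positives + self.false_positives
    }

    /// None until at least one match has been reviewed.
    pub fn fp_rate(&self) -> Option<f64> {
        match self.reviewed() {
            0 => None,
            n => Some(f64::from(self.false_positives) / f64::from(n)),
        }
    }
}

/// Enough scans, feedback exists, and the FP rate is within the limit.
pub fn can_graduate(m: &ShadowMetrics, cfg: &ShadowConfig) -> bool {
    m.scans >= cfg.min_scans && m.fp_rate().map_or(false, |rate| rate <= cfg.max_fp_rate)
}
```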

Files: shadow/mod.rs, shadow/types.rs, shadow/store.rs, shadow/registry.rs, shadow/executor.rs, shadow/feedback.rs, shadow/graduation.rs, handlers/shadow.rs, config/types/shadow.rs

Configuration:

[shadow]
enabled = true            # Shadow mode on by default
min_scans = 100           # Scans before graduation eligible
max_fp_rate = 0.05        # Maximum FP rate for graduation
rollback_threshold = 0.15 # FP rate that triggers rollback
retention_days = 30       # Days to retain shadow data

CLI Usage:

# View shadow test status
aphoria extractors shadow-status [-v]

# Provide TP/FP feedback on matches
aphoria extractors feedback <test-name> [--limit 10]

# Graduate shadow test to production
aphoria extractors graduate <test-name> [--force]

# Rollback a shadow test
aphoria extractors rollback <test-name> --reason "too many FPs"

Tests: 44 tests covering types, store, registry, executor, feedback, graduation, and auto-rollback.

9.3 Auto-Rollback

Task Description Status
auto_rollback_enabled config Toggle to enable/disable auto-rollback (default: true)
Feedback-time check Auto-rollback triggered immediately after FP feedback
FeedbackWithRollback return record_feedback() returns rollback info
AutoRollbackResult Track checked count, rolled back names, errors
CLI command aphoria extractors auto-check for manual batch checking
Audit trail Decision logged as ShadowDecisionKind::AutoRollback
YAML deletion Extractor file deleted from learned/ on rollback

Safety Features:

  • Toggle: auto_rollback_enabled can disable feature for testing or manual-only workflows
  • Threshold configurable: rollback_threshold in config (default: 15%)
  • Minimum reviews: Requires 10+ reviewed matches before auto-rollback triggers
  • Audit trail: All auto-rollback decisions logged to decisions.jsonl
  • CLI fallback: auto-check command for manual verification
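The auto-rollback trigger described above reduces to three checks. This is a sketch with hypothetical type and field names (only `auto_rollback_enabled` and `rollback_threshold` appear in the config); it assumes rollback fires when the FP rate strictly exceeds the threshold.

```rust
#[derive(Debug, Clone, Copy)]
pub struct RollbackConfig {
    pub auto_rollback_enabled: bool,
    pub rollback_threshold: f64,
    /// Minimum reviewed matches before auto-rollback may trigger.
    pub min_reviews: u32,
}

impl Default for RollbackConfig {
    fn default() -> Self {
        Self {
            auto_rollback_enabled: true,
            rollback_threshold: 0.15,
            min_reviews: 10,
        }
    }
}

/// Decide after each piece of feedback whether to roll the extractor back.
pub fn should_auto_rollback(true_positives: u32, false_positives: u32, cfg: &RollbackConfig) -> bool {
    let reviewed = true_positives + false_positives;
    if !cfg.auto_rollback_enabled || reviewed < cfg.min_reviews {
        return false;
    }
    let fp_rate = f64::from(false_positives) / f64::from(reviewed);
    fp_rate > cfg.rollback_threshold
}
```

The minimum-review guard keeps a single early false positive (for example 1 of 2 reviewed matches) from removing an extractor before there is enough evidence.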

Files: shadow/feedback.rs, shadow/graduation.rs, config/types/shadow.rs, handlers/shadow.rs, cli.rs

Configuration:

[shadow]
enabled = true
auto_rollback_enabled = true  # NEW: Enable automatic rollback (default: true)
rollback_threshold = 0.15     # FP rate that triggers auto-rollback

CLI Usage:

# Automatic: Rollback happens immediately when feedback pushes FP rate over threshold
aphoria extractors feedback <test-name> --limit 10
# If FP rate exceeds 15%, you'll see:
# ⚠️  AUTO-ROLLBACK TRIGGERED: <extractor-name>

# Manual batch check: Scan all active tests and rollback any over threshold
aphoria extractors auto-check
# Output: "⚠️  Auto-rolled back 1 of 5 shadow test(s): ..."

Tests: 3 new tests covering auto-rollback triggering, disabled toggle, and threshold boundary.

9.4 Cross-Project Learning

Task Description Status
Hosted pattern sync Patterns from all projects aggregate on server
Global promotion Promote patterns seen across many orgs
Privacy preservation Only normalized patterns shared, no code
Opt-in distribution Orgs can opt-in to receive community extractors

Org A: Pattern seen in 3 projects → shared to hosted
Org B: Same pattern in 5 projects → shared to hosted
Org C: Same pattern in 4 projects → shared to hosted
        ↓
Hosted aggregates: 12 projects total
        ↓
Promotes to community extractor
        ↓
All orgs receive new extractor (if opted in)
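The aggregation step in the diagram can be sketched as a sum of per-org project counts keyed by pattern hash. All names here are hypothetical (`OrgReport`, `community_candidates`), and the community promotion threshold is passed in as a parameter because the roadmap does not state one.

```rust
use std::collections::HashMap;

/// One org's anonymized report for one pattern: hashes only, no code.
pub struct OrgReport {
    pub org_hash: String,
    pub pattern_hash: String,
    pub project_count: usize,
}

/// Sums project counts per pattern across orgs and returns the patterns
/// that reach `min_projects`, sorted by pattern hash.
pub fn community_candidates(reports: &[OrgReport], min_projects: usize) -> Vec<(String, usize)> {
    let mut totals: HashMap<&str, usize> = HashMap::new();
    for report in reports {
        *totals.entry(report.pattern_hash.as_str()).or_insert(0) += report.project_count;
    }
    let mut candidates: Vec<(String, usize)> = totals
        .into_iter()
        .filter(|(_, total)| *total >= min_projects)
        .map(|(hash, total)| (hash.to_string(), total))
        .collect();
    candidates.sort();
    candidates
}
```

With the counts from the diagram (3, 5, and 4 projects), the shared pattern totals 12 projects and would pass an assumed threshold of 10.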

Implementation:

  • CrossProjectConfig with opt-in flags (contribute_patterns, receive_community)
  • PatternSyncer for uploading anonymized patterns to hosted server
  • CommunityExtractorLoader for pulling community extractors as YAML files
  • BLAKE3 hashing for pattern deduplication and org anonymization
  • Privacy guarantees: normalized_pattern shared, but NOT example_code or project_hashes
  • CLI commands: aphoria patterns sync, aphoria patterns status, aphoria patterns pull-community

Files: config/types/cross_project.rs, community/pattern_syncer.rs, community/extractor_loader.rs, handlers/patterns.rs

Tests: 7 new tests covering pattern hashing, subject exclusion, anonymization, and extractor loading.

9.5 Extractor Versioning

| Task | Description | Status |
|------|-------------|--------|
| Version tracking | Track which version caught which issues | ExtractorVersion + VersionStore |
| Changelog | Record changes between versions | ExtractorChangelog + ChangelogEntry |
| Rollback support | Revert to previous version | aphoria extractors rollback-version |
| A/B metrics | Compare versions side-by-side | aphoria extractors compare + compute_metrics_delta() |
| CLI commands | versions, compare, rollback-version | Full CLI implementation |
| Tests | Unit tests for all components | 15+ version/changelog tests |

Files:

  • promotion/version.rs - Core types (ExtractorVersion, ChangelogEntry, MetricsDelta, ExtractorChangelog, VersionStore)
  • promotion/writer.rs - Versioned YAML output (write_versioned())
  • promotion/types.rs - Version field in PromotionMetadata
  • handlers/extractors.rs - CLI handlers (handle_versions, handle_compare, handle_rollback_version)
  • cli.rs - CLI commands (Versions, Compare, RollbackVersion)

CLI Usage:

# List versions
aphoria extractors versions learned_tls_min_version
# Version History: learned_tls_min_version
# Version  Date         Changes
# ------------------------------------------------------------
# 2        2026-03-15   Added support for YAML configs
# 1        2026-02-01   Initial promotion from learned pattern

# Compare versions
aphoria extractors compare learned_tls_min_version -a 1 -b 2
# Comparison: learned_tls_min_version v1 vs v2
# Matches              +15%
# False Positives      -3%

# Rollback
aphoria extractors rollback-version learned_tls_min_version --version 1 --reason "v2 edge case bug"
# Rolled back learned_tls_min_version to v1

YAML Output:

# Generated from learned pattern. Review before editing.
# Pattern ID: a1b2c3d4-e5f6-7890-abcd-ef1234567890
# Version: 2 (previous: 1)
# Promoted: 2026-03-15 14:30:00 UTC

name: learned_tls_min_version
description: TLS minimum version set to deprecated value
version: 2
previous_version: 1
languages:
  - rust
  - go
pattern: '(?i)tls_?min_?(version)?\s*[:=]\s*["'']?(?P<value>1\.[01])["'']?'
claim:
  subject: tls/min_version
  predicate: version
  value_from_match: true
confidence: 0.97
metadata:
  source: learned
  pattern_id: a1b2c3d4-e5f6-7890-abcd-ef1234567890
  version: 2
changelog:
  - version: 2
    date: 2026-03-15
    changes: "Added support for YAML configs"
    metrics:
      matches: "+15%"
      false_positives: "-3%"
  - version: 1
    date: 2026-02-01
    changes: "Initial promotion from learned pattern"

9.6 Configuration

# aphoria.toml
[autonomous]
enabled = false                   # Opt-in to autonomous mode
min_confidence = 0.95             # Higher threshold for auto
min_projects = 10                 # More evidence required
shadow_scans = 100                # Scans before promotion
max_fp_rate = 0.05                # Auto-rollback threshold

[autonomous.distribution]
receive_community = true          # Receive community extractors
contribute_patterns = true        # Share patterns to community

Files: autonomous/mod.rs, autonomous/shadow.rs, autonomous/rollback.rs, autonomous/distribution.rs


Milestone Summary

| Phase | Deliverable | Depends On | Status |
|-------|-------------|------------|--------|
| 0 | ConceptPath in StemeDB | concept-hierarchy spec | |
| 2 | Aphoria CLI (scan, report, ack) | Phase 0 | |
| 2A | Concept matching (leaf, alias, auto-alias) | Phase 2 | |
| 1 | Authoritative corpus expansion | Phase 0 | |
| 3 | Claude Code skill + hooks | Phase 2A | |
| 4.5 | Ephemeral scan mode (40x faster) | Phase 2 | |
| 5 | Research agent loop | Phase 3 | |
| 6 | Federated Policy & Trust Packs | Phase 4.5 | |
| 6.5 | Trust Pack Extensions (Predicate Aliases, Key Rotation) | Phase 6 | |
| 4A | Observational claims (Tier 4 write-back) | Phase 6 | |
| 4B | Self-conflict detection (drift) | Phase 4A | |
| 4C | Diff-only scanning (--staged) | Phase 4B | |
| 4E | Hosted mode (team aggregation) | Phase 4C | |
| 4D | Enhanced ack (--reason, policy updates) | Phase 4C | |
| 5.6 | Community Corpus Contributions | Phase 4E | |
| 7 | Declarative Extractors | Phase 6 | |
| 7.5 | LLM-in-the-Loop Extraction (Gemini) | Phase 7 | |
| 7.6 | Pattern Learning Store | Phase 7.5 | |
| 7.7 | Pattern → Extractor Promotion | Phase 7.6 | |
| 7.8 | LLM Prompt Evaluation | Phase 7.5 | |
| 8 | Enterprise Extractors (8.1-8.11) | Phase 7.5 | |
| 8.2 | Framework-Specific Extractors (10 frameworks) | Phase 8 | |
| 9.1 | Autonomous Promotion | Phase 8 | |
| 9.2 | Shadow Mode Testing | Phase 9.1 | |
| 9.3 | Auto-Rollback | Phase 9.2 | |
| 9.4 | Cross-Project Learning | Phase 9.1 | |
| 9.5 | Extractor Versioning | Phase 9.4 | |

Current state:

  • Phases 0-3, 4.5, 4A-4E, 5, 5.6, 6, 7, 7.5, 7.6, 7.7, 7.8, 8, 9.1, 9.2, 9.3, 9.4, 9.5 complete (clippy clean)
  • Full corpus: RFC, OWASP, Vendor sources
  • 36 extractors including:
    • Security: weak_crypto, command_injection, sql_injection, high_entropy_secrets, auth_bypass, insecure_cookies, path_traversal, unvalidated_redirects, weak_password, security_headers, insecure_deserialization, ssrf, orm_injection, xxe
    • Framework-specific: django, express, flask, fastapi, nestjs, nextjs, spring, laravel, rails, aspnet
  • Trust Packs: signed policy bundles with import/export
  • Ephemeral mode: 40x faster for CI
  • Observation write-back: --sync records novel claims as Tier 4 project memory
  • Autonomous promotion: High-confidence patterns (>0.95, 10+ projects) can skip human review with full audit trail
  • Shadow mode testing: Auto-promoted extractors run in shadow mode to measure FP rate before graduation
  • Auto-rollback: Shadow extractors exceeding FP threshold (15%) are automatically rolled back
  • Drift detection: Detects changes from prior observations
  • Staged scanning: --staged flag for fast pre-commit hooks
  • Hosted mode: Team aggregation via central StemeDB server
  • Enhanced ack: --reason flag, aphoria update for policy changes
  • Community Corpus: Opt-in anonymous pattern sharing with privacy-preserving anonymization
  • Declarative Extractors: TOML-defined custom extractors without Rust code
  • LLM Extraction: Gemini-powered semantic claim extraction for high-value files
  • Pattern Learning: LLM-extracted claims recorded for promotion to declarative extractors
  • Pattern Promotion: CLI workflow to promote learned patterns to declarative extractors with Gemini regex generation and validation
  • LLM Prompt Evaluation: Golden fixtures with precision/recall metrics, baseline comparison, and regression detection for prompt engineering
  • Cross-Project Learning: Privacy-preserving pattern sync to hosted server, community extractor pull, BLAKE3-based deduplication, opt-in sharing with CrossProjectConfig
  • Extractor Versioning: Version tracking with changelogs, safe rollback to previous versions, A/B metrics comparison between versions via VersionStore

Phase 9 Complete! Autonomous Generation pipeline is fully self-improving.

The Self-Learning Vision

Phase 7: Declarative Extractors (foundation)           ✅ COMPLETE
    ↓
Phase 7.5: LLM-in-the-Loop (Gemini semantic extraction) ✅ COMPLETE
    ↓
Phase 7.6: Pattern Learning (remember what LLM finds)   ✅ COMPLETE
    ↓
Phase 7.7: Pattern Promotion (patterns → extractors)    ✅ COMPLETE
    ↓
Phase 7.8: LLM Prompt Evaluation (measure & improve)    ✅ COMPLETE
    ↓
Phase 8: Enterprise Extractors (36 total)              ✅ COMPLETE
    ├── 8.1: High-entropy secrets                      ✅
    ├── 8.2: Framework extractors (10 frameworks)      ✅
    ├── 8.3: Config deep parsing                       ✅
    ├── 8.4-8.14: Security patterns                    ✅
    ↓
Phase 9: Autonomous Generation (fully self-improving)   ✅ COMPLETE
    ├── 9.1: Autonomous Promotion                        ✅ COMPLETE
    ├── 9.2: Shadow Mode Testing                         ✅ COMPLETE
    ├── 9.3: Auto-Rollback                               ✅ COMPLETE
    ├── 9.4: Cross-Project Learning                      ✅ COMPLETE
    └── 9.5: Extractor Versioning                        ✅ COMPLETE

The endgame: Every PR teaches Aphoria. After a month, it knows your security patterns better than your team does.

Bidirectional Knowledge Sync (Complete)

The pre-commit hook is now a bidirectional knowledge sync:

  1. 4A : Record code claims as Tier 4 observations (project memory)
  2. 4B : Detect drift from prior observations (self-conflict)
  3. 4C : Fast diff-only scanning for pre-commit hooks (--staged)
  4. 4E : Team aggregation via hosted StemeDB server
  5. 4D : Enhanced ack with rationale and policy updates

This transforms Aphoria from a linter into a learning system that builds institutional memory per-project and collective intelligence across teams via hosted mode.


Phase 8: Enterprise Extractor Improvements

Goal: Transform extractors from "toy examples" to enterprise-grade detection that catches real violations in production codebases.

Current State Audit

| Extractor | Languages | Strengths | Weaknesses |
|-----------|-----------|-----------|------------|
| tls_verify | 8 | Multi-lang, configs | Misses custom wrappers |
| tls_version | 8 | API patterns | Misses semantic (const = "1.0") |
| hardcoded_secrets | 8 | Placeholders, test files | No entropy detection |
| weak_crypto | 5 | MD5/SHA1/DES/RC4 | SHA1 false positives, misses bcrypt cost |
| sql_injection | 5 | Interpolation patterns | Misses ORM unsafe methods |
| jwt_config | 8 | alg:none, skip sig | Library-specific gaps |
| cors_config | 8 | Wildcard + credentials | Misses dynamic origin reflection |
| rate_limit | 8 | Basic patterns | Limited depth |
| timeout_config | 8 | Basic patterns | Limited depth |
| command_injection | 5 | exec/system calls | Indirect injection |
| dep_versions | 3 | Version parsing | No CVE correlation |

Enterprise Reality: Current extractors catch ~30% of real-world security misconfigurations. Config files offer the highest value because their patterns are consistent; code offers the lowest because detection requires semantic understanding.


8.1 High-Entropy Secret Detection

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task Status
HighEntropySecretsExtractor extractors/high_entropy_secrets.rs
Shannon entropy algorithm shannon_entropy() with 4.5 threshold
Charset variety check 0.4 minimum variety ratio
Known secret prefixes AWS (AKIA), Stripe (sk_live_, sk_test_), GitHub (ghp_, gho_), GitLab (glpat-), Slack (xox[baprs]-)
High-entropy context patterns api_key, secret, token, credential, auth_key contexts
False positive exclusions UUIDs, git SHAs (40-char hex), file hashes (64-char hex)
Test file confidence reduction 0.6 confidence for test files
Tests 10+ tests covering all patterns

Configuration:

# aphoria.toml
[extractors.entropy]
min_entropy = 4.5          # Shannon entropy threshold
min_charset_variety = 0.4  # Unique chars / length ratio
min_length = 20            # Minimum string length
max_length = 200           # Maximum string length

Languages: Rust, Go, Python, JavaScript, TypeScript, YAML, TOML, JSON, Dotenv
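
The thresholds above can be illustrated with a small sketch. This is not the code in extractors/high_entropy_secrets.rs. It assumes entropy is measured in bits per character and that "charset variety" means unique characters divided by length, as the config comments suggest. The helper names charset_variety and looks_like_secret are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

// Thresholds copied from the [extractors.entropy] defaults shown above.
const MIN_ENTROPY: f64 = 4.5;
const MIN_CHARSET_VARIETY: f64 = 0.4;
const MIN_LENGTH: usize = 20;
const MAX_LENGTH: usize = 200;

/// Shannon entropy of `s` in bits per character.
fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
    }
    let total = s.chars().count() as f64;
    if total == 0.0 {
        return 0.0;
    }
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / total;
            -p * p.log2()
        })
        .sum::<f64>()
}

/// Hypothetical helper: ratio of unique characters to total length.
fn charset_variety(s: &str) -> f64 {
    let total = s.chars().count();
    if total == 0 {
        return 0.0;
    }
    let unique: HashSet<char> = s.chars().collect();
    unique.len() as f64 / total as f64
}

/// Hypothetical helper: applies the length, entropy, and variety checks together.
fn looks_like_secret(s: &str) -> bool {
    let len = s.chars().count();
    (MIN_LENGTH..=MAX_LENGTH).contains(&len)
        && shannon_entropy(s) >= MIN_ENTROPY
        && charset_variety(s) >= MIN_CHARSET_VARIETY
}
```

Under this definition the entropy of a string with k distinct characters is at most log2(k), so a 4.5-bit threshold can only be met by strings with at least 23 distinct characters.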


8.2 Framework-Specific Extractors

Impact: HIGH | Effort: HIGH | Status: Complete

Research Document: docs/architecture/framework-security-extractors.md

All 10 framework-specific extractors implemented and tested:

| Framework | Extractor | Languages | Tests |
|-----------|-----------|-----------|-------|
| Spring Boot | spring_security | Java, YAML, Properties | 7 |
| Django | django_security | Python | 7 |
| Express.js | express_security | JavaScript, TypeScript | 5 |
| Rails | rails_security | Ruby, YAML | 6 |
| ASP.NET Core | aspnet_security | C# (via regex), JSON | 6 |
| Laravel | laravel_security | PHP (via regex) | 5 |
| FastAPI | fastapi_security | Python | 5 |
| Next.js | nextjs_security | JavaScript, TypeScript | 5 |
| Flask | flask_security | Python | 6 |
| NestJS | nestjs_security | TypeScript | 5 |

Total: 10 extractors, 57+ tests, 100+ patterns

Files: extractors/{django,express,flask,fastapi,nestjs,nextjs,spring,laravel,rails,aspnet}_security.rs

8.2.1 Spring Boot Security

# application.yml misconfigs
security:
  basic:
    enabled: false      # Auth disabled
  csrf:
    enabled: false      # CSRF disabled
  headers:
    frame-options: DISABLE  # Clickjacking

// Java code patterns
@EnableWebSecurity
public class Config extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http.csrf().disable();  // CSRF disabled
        http.authorizeRequests().antMatchers("/**").permitAll();  // Auth bypass
    }
}

8.2.2 Django Security

# settings.py misconfigs
DEBUG = True  # Debug in production
ALLOWED_HOSTS = ['*']  # All hosts
CSRF_COOKIE_SECURE = False  # Insecure cookies
SESSION_COOKIE_SECURE = False

8.2.3 Express.js Security

// Security middleware checks
app.use(helmet());  // Expected: flagged when this call is missing
app.use(cors({ origin: '*', credentials: true }));  // Flagged: wildcard origin with credentials
app.disable('x-powered-by');  // Expected: flagged when the header is left enabled

8.2.4 Rails Security

# config/environments/production.rb
config.force_ssl = false  # Should be true
config.action_dispatch.cookies_same_site_protection = :none

8.3 Config File Deep Parsing

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task Status
ConfigValue enum extractors/config_parser.rs
YAML/JSON/TOML parsers Using serde_yaml, serde_json, toml
Tree walker with path tracking walk_config() with dot-path
ConfigSecurityExtractor extractors/config_security.rs
Security rules (11 rules) TLS, CSRF, debug, password, cookies, CORS, rate limit
Dev file exclusion Skip debug warnings in dev/test configs
Tests 26 tests for parsing + security rules

Patterns now caught (nested to any depth):

  • *.tls.verify: false — TLS verification disabled
  • *.insecure_skip_verify: true — Skip verification enabled
  • *.security.enabled: false — Security disabled
  • *.csrf.enabled: false — CSRF protection disabled
  • debug: true — Debug mode (only in production files)
  • *.password.min_length < 8 — Weak password policy
  • *.cookie.secure: false — Cookie secure flag disabled
  • *.cookie.httpOnly: false — Cookie httpOnly disabled
  • *.cors.allow_origin: "*" — CORS allows all origins
  • *.rate_limit.enabled: false — Rate limiting disabled

Languages: YAML, JSON, TOML
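
The "nested to any depth" behaviour comes from walking the parsed tree while tracking a dot-path. A minimal sketch of that idea follows. ConfigValue and walk_config() are named in the task list, but the variant set, the function signature, and the tls_verify_disabled helper here are assumptions for illustration, not the contents of extractors/config_parser.rs.

```rust
/// Assumed shape of a parsed config tree (the real enum may differ).
#[derive(Debug, Clone, PartialEq)]
enum ConfigValue {
    Bool(bool),
    Str(String),
    Map(Vec<(String, ConfigValue)>),
}

/// Depth-first walk that records every leaf together with its dot-path.
fn walk_config(value: &ConfigValue, path: &str, out: &mut Vec<(String, ConfigValue)>) {
    match value {
        ConfigValue::Map(entries) => {
            for (key, child) in entries {
                let child_path = if path.is_empty() {
                    key.clone()
                } else {
                    format!("{path}.{key}")
                };
                walk_config(child, &child_path, out);
            }
        }
        leaf => out.push((path.to_string(), leaf.clone())),
    }
}

/// Hypothetical rule: report every path matching `*.tls.verify: false`.
fn tls_verify_disabled(root: &ConfigValue) -> Vec<String> {
    let mut leaves = Vec::new();
    walk_config(root, "", &mut leaves);
    leaves
        .into_iter()
        .filter(|(p, v)| {
            (p.as_str() == "tls.verify" || p.ends_with(".tls.verify"))
                && *v == ConfigValue::Bool(false)
        })
        .map(|(p, _)| p)
        .collect()
}
```

Because rules match on the path suffix, the same check fires for server.tls.verify and for services.db.tls.verify without a separate regex per nesting level.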


8.4 Semantic TLS Version Detection

Impact: MEDIUM | Effort: MEDIUM | Status: Complete

Task Status
Add Language::Terraform variant types/language.rs
Semantic pattern (cross-language) Catches TLS_MIN_VERSION = "1.0" with type annotations
Environment variable pattern .env files with TLS_MIN_VERSION=1.0
Terraform HCL pattern min_tls_version = "TLS1_0"
Kubernetes camelCase pattern minTLSVersion: VersionTLS10
False positive prevention TLS 1.2/1.3 not flagged
Tests 16 new tests (27 total for TLS extractor)

Patterns now caught:

  • const TLS_MIN_VERSION: &str = "1.0"; (Rust with type annotation)
  • let sslVersion = "TLSv1"; (JavaScript camelCase)
  • TLS_MINIMUM_VERSION = "1.1" (Python assignment)
  • TLS_MIN_VERSION=1.0 (dotenv)
  • export SSL_VERSION=TLSv1 (shell export)
  • min_tls_version = "TLS1_0" (Terraform)
  • minTLSVersion: VersionTLS10 (Kubernetes YAML)

Languages: Rust, Go, Python, TypeScript, JavaScript, Yaml, Toml, Json, Terraform, Dotenv
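
Once a pattern has captured the assigned value, the remaining step is deciding whether that value names a deprecated protocol version. The sketch below shows one way to normalize the spellings listed above. The function name is hypothetical, and the TLS 1.1 spellings other than "1.1" are assumed by analogy with the TLS 1.0 ones.

```rust
/// Hypothetical helper: true when `raw` names TLS 1.0 or TLS 1.1.
fn is_deprecated_tls(raw: &str) -> bool {
    // Strip whitespace and surrounding quotes, then compare case-insensitively.
    let normalized = raw
        .trim()
        .trim_matches(|c: char| c == '"' || c == '\'')
        .to_ascii_lowercase();
    matches!(
        normalized.as_str(),
        "1.0" | "1.1"
            | "tlsv1" | "tlsv1.0" | "tlsv1.1"   // OpenSSL-style names
            | "tls1_0" | "tls1_1"               // Terraform-style names
            | "versiontls10" | "versiontls11"   // Kubernetes / Go-style names
    )
}
```

TLS 1.2 and 1.3 spellings fall through to false, which mirrors the false positive prevention row in the task list.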


8.5 ORM SQL Injection Detection

Impact: MEDIUM | Effort: MEDIUM | Status: Complete

Task Status
OrmInjectionExtractor extractors/orm_injection.rs
Django .raw() with interpolation f"SELECT...", .format() patterns
Django .extra() with interpolation where=["...{}...".format()]
SQLAlchemy text() with interpolation text(f"SELECT...")
SQLAlchemy execute() with f-string execute(f"...")
Sequelize raw query sequelize.query(`...${...}`)
TypeORM where() .where(`...${...}`)
GORM Raw() with Sprintf .Raw(fmt.Sprintf(...))
Prisma $queryRawUnsafe $queryRawUnsafe(`...${...}`)
Tests 8+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go

Current sql_injection catches raw string interpolation but misses ORM escape hatches:

# SQLAlchemy
db.execute(text(f"SELECT * FROM users WHERE id = {user_id}"))
User.query.filter(text("name = '" + name + "'"))

# Django
User.objects.raw("SELECT * FROM users WHERE id = %s" % user_id)
User.objects.extra(where=["name = '%s'" % name])

// Sequelize
sequelize.query(`SELECT * FROM users WHERE id = ${userId}`);
Model.findAll({ where: sequelize.literal(`id = ${id}`) });

// Prisma
prisma.$queryRawUnsafe(`SELECT * FROM users WHERE id = ${id}`);

# ActiveRecord
User.where("name = '#{name}'")
User.find_by_sql("SELECT * FROM users WHERE id = #{id}")

8.6 Authentication Bypass Patterns

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task Status
AuthBypassExtractor extractors/auth_bypass.rs
Hardcoded admin credentials username == "admin" && password == "..." patterns
Debug auth headers X-Debug-Auth, X-Internal-Auth, X-Admin-Auth
Skip auth env vars SKIP_AUTH, BYPASS_AUTH, NO_AUTH, DEBUG_AUTH
Backdoor patterns if username == "backdoor", if user == "test"
Default credentials admin/admin, root/root, test/test, guest/guest
Test file confidence reduction 0.5 confidence for test files
Tests 11+ tests covering all patterns

Detected patterns:

# Hardcoded credentials
if username == "admin" and password == "admin":

# Debug auth headers
if request.headers.get("X-Debug-Auth") == "secret":

# Skip auth env vars
if os.environ.get("SKIP_AUTH") == "true":

Languages: Python, JavaScript, TypeScript, Go, Rust


8.7 Insecure Deserialization

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task Status
InsecureDeserializationExtractor extractors/insecure_deserialization.rs
Python pickle (critical) pickle.load(), pickle.loads(), Unpickler()
Python yaml.load without SafeLoader Detects missing SafeLoader
Python marshal marshal.load(), marshal.loads()
Python eval/exec with user input eval(request...), exec(user...)
JavaScript node-serialize require('node-serialize'), .unserialize()
Go gob decoder gob.NewDecoder(), gob.Decode()
Java ObjectInputStream (polyglot) ObjectInputStream, readObject()
Tests 10+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go

Unsafe deserialization of untrusted data:

# Python
pickle.loads(user_input)
yaml.load(user_input)  # Without Loader=SafeLoader
eval(user_input)
exec(user_input)

// Java
ObjectInputStream ois = new ObjectInputStream(userInput);
ois.readObject();  // Dangerous!

# Ruby
Marshal.load(user_input)
YAML.load(user_input)  # Should use safe_load

8.8 Path Traversal Patterns

Impact: MEDIUM | Effort: LOW | Status: Complete

Task Status
PathTraversalExtractor extractors/path_traversal.rs
Python open/read/write with user input open(request...), read(params...)
Python os.path.join with user input os.path.join(base, request...)
JavaScript fs operations fs.readFile(req...), fs.writeFile(params...)
JavaScript path.join/resolve path.join(base, req.query...)
JavaScript res.sendFile res.sendFile(req.params...)
Go filepath operations filepath.Join(base, r...), os.Open(req...)
Rust path operations Path::new(request...), std::fs::read(user...)
Traversal literals ../, %2e%2e URL-encoded patterns
Tests 8+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go, Rust

File operations with user input:

# Python
open(user_input)
os.path.join(base, user_input)  # Doesn't prevent ../
shutil.copy(user_input, dest)

// JavaScript
fs.readFile(userInput)
path.join(base, userInput)  // Doesn't prevent ../
res.sendFile(userInput)

8.9 SSRF Patterns

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task Status
SsrfExtractor extractors/ssrf.rs
Python requests library requests.get(url), requests.post(target)
Python urllib urllib.request.urlopen(url)
Python httpx httpx.get(url), AsyncClient
JavaScript fetch fetch(url), fetch(req.query...)
JavaScript axios axios.get(url), axios.post(target)
JavaScript got got(url)
Go http.Get/Post http.Get(url), http.NewRequest(...)
Rust reqwest reqwest::get(url), reqwest::Client
URL sink patterns proxy_url, webhook_url, callback_url from request
Tests 10+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go, Rust

HTTP requests with user-controlled URLs:

# Python
requests.get(user_url)
urllib.request.urlopen(user_input)

// JavaScript
fetch(userUrl)
axios.get(userUrl)
http.get(userUrl)

// Go
http.Get(userURL)
client.Do(req)  // Where req.URL is user-controlled

8.10 Missing Security Headers

Impact: MEDIUM | Effort: LOW | Status: Complete

Task Status
SecurityHeadersExtractor extractors/security_headers.rs
X-Frame-Options disabled X-Frame-Options: none, ALLOWALL
X-Content-Type-Options disabled X-Content-Type-Options: disabled
X-XSS-Protection disabled X-XSS-Protection: false
Django SECURE_* settings SECURE_BROWSER_XSS_FILTER = False, etc.
YAML headers disabled x_frame_options: false, hsts: no
CSP disabled or unsafe unsafe-inline, unsafe-eval directives
HSTS disabled Strict-Transport-Security: none, hsts_seconds = 0
Tests 7+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go, YAML, JSON, TOML

Detect when security headers are explicitly removed or not set:

# Response headers missing
response.headers.pop('X-Content-Type-Options')
response.headers['X-Frame-Options'] = 'ALLOWALL'

// Express without helmet
app.use(cors());  // CORS without other security
// No app.use(helmet()) found

8.11 Cookie Flags

Impact: MEDIUM | Effort: LOW | Status: Complete

Task Status
InsecureCookiesExtractor extractors/insecure_cookies.rs
Missing Secure flag secure=False, secure: false
Missing HttpOnly flag httponly=False, httpOnly: false
SameSite=None without Secure sameSite: 'none', SameSite=None
Django settings SESSION_COOKIE_SECURE, CSRF_COOKIE_SECURE = False
Go cookie patterns Secure: false, HttpOnly: false
Rust actix-web patterns .secure(false), .http_only(false)
Test file confidence reduction 0.5 confidence for test files
Tests 8+ tests covering all patterns

Detected patterns:

# Python/Flask/Django
response.set_cookie('session', value, secure=False)
SESSION_COOKIE_SECURE = False

// JavaScript/Express
res.cookie('session', value, { httpOnly: false });
res.cookie('auth', value, { sameSite: 'none' });

Languages: Python, JavaScript, TypeScript, Go, Rust, Ruby, YAML


8.12 Unvalidated Redirects

Impact: MEDIUM | Effort: LOW | Status: Complete

Task Status
UnvalidatedRedirectsExtractor extractors/unvalidated_redirects.rs
Python redirect with user input redirect(request.GET['next']), HttpResponseRedirect(url)
Python Flask redirect redirect(request.args.get(...))
JavaScript res.redirect res.redirect(req.query.next)
JavaScript window.location window.location = url, location.href = params...
Go http.Redirect http.Redirect(w, r, r.Query...)
URL parameter patterns redirect_url, return_url, next, goto from request
Tests 7+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go

Open redirect vulnerabilities:

# Python
return redirect(request.args.get('next'))
return redirect(request.GET['url'])

// JavaScript
res.redirect(req.query.redirect);
window.location = userInput;
window.location.href = params.url;

8.13 XXE (XML External Entity)

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task Status
XxeExtractor extractors/xxe.rs
Python lxml/etree etree.parse(), lxml.fromstring()
Python xml.etree.ElementTree ET.parse(), ET.fromstring()
Python xml.dom.minidom minidom.parse(), minidom.parseString()
Python xml.sax xml.sax.parse(), xml.sax.make_parser()
JavaScript xml2js xml2js.parseString(), xml2js.Parser()
JavaScript libxmljs libxmljs.parseXml()
Go encoding/xml xml.Unmarshal(), xml.NewDecoder()
Java patterns (polyglot) DocumentBuilderFactory, SAXParser, XMLReader
DTD entity declarations <!ENTITY ... SYSTEM>, <!ENTITY ... PUBLIC>
defusedxml detection Lower confidence when defusedxml is imported
Tests 9+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go

Unsafe XML parsing:

# Python
etree.parse(user_input)  # Without disabling entities
xml.etree.ElementTree.parse(user_input)

// Java
DocumentBuilderFactory.newInstance()  // Without setFeature to disable XXE
SAXParserFactory.newInstance()  // Without secure processing

8.14 Weak Password Requirements

Impact: MEDIUM | Effort: LOW | Status: Complete

Task Status
WeakPasswordExtractor extractors/weak_password.rs
Minimum length < 8 password_min_length: 6, minLength: 4
Bcrypt cost < 10 bcrypt_cost = 8, hash_rounds = 5
Simple length checks len(password) >= 6 in code
Complexity disabled require_special_chars: false, require_uppercase = false
Number requirement disabled require_numbers: no, require_digit = 0
Tests 7+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go, Rust, YAML, JSON, TOML

Password validation that's too weak:

# Python
if len(password) >= 4:  # Too short
if len(password) >= 6:  # Still weak
MIN_PASSWORD_LENGTH = 6  # Config too low

// JavaScript
if (password.length >= 4)
const MIN_LENGTH = 6;
/^.{4,}$/  // Regex allows 4+ chars

8.15 LLM-Assisted Extraction (Future)

Impact: VERY HIGH | Effort: VERY HIGH

Use an LLM to understand code semantically. The sketch names Claude; the shipped Phase 7.5 extraction uses Gemini:

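// Pseudo-implementation: claude_api, ExtractedClaim, LlmError, and
// parse_claims_from_llm_response are placeholders.
// Returns a Result so the `?` on the API call is valid; the parser is
// assumed to return the same Result type.
async fn extract_with_llm(code: &str, _file: &str) -> Result<Vec<ExtractedClaim>, LlmError> {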
    let prompt = format!(
        "Analyze this code for security issues. Return JSON with:\n\
         - concept_path: security concept (e.g., 'tls/cert_verification')\n\
         - predicate: what aspect (e.g., 'enabled')\n\
         - value: the value found\n\
         - confidence: 0.0-1.0\n\
         - description: why this is an issue\n\n\
         Code:\n```\n{}\n```",
        code
    );
    
    let response = claude_api.message(&prompt).await?;
    parse_claims_from_llm_response(&response)
}

When to use:

  • High-value files (auth, crypto, config)
  • After regex extractors find nothing
  • For code review mode (not CI)

Considerations:

  • Cost per scan
  • Latency
  • Rate limits
  • Privacy (code leaves machine)

Implementation Priority

Phase Extractors Impact Effort Enterprise Value Status
8.1 High-entropy secrets HIGH MEDIUM Catches real leaked secrets
8.2 Framework-specific HIGH HIGH Spring/Django/Express coverage
8.3 Config deep parsing HIGH MEDIUM Nested YAML/JSON understanding
8.4 Semantic TLS MEDIUM MEDIUM Catches const TLS_MIN = "1.0"
8.5 ORM SQL injection MEDIUM MEDIUM SQLAlchemy, Django, Sequelize
8.6 Auth bypass HIGH MEDIUM Backdoors, hardcoded creds
8.7 Deserialization HIGH MEDIUM pickle, Marshal, eval
8.8 Path traversal MEDIUM LOW ../../../etc/passwd
8.9 SSRF HIGH MEDIUM Internal network access
8.10 Security headers MEDIUM LOW Missing helmet(), CSP
8.11 Cookie flags MEDIUM LOW httpOnly, secure, sameSite
8.12 Open redirects MEDIUM LOW Phishing via redirect
8.13 XXE HIGH MEDIUM XML entity injection
8.14 Weak passwords MEDIUM LOW MIN_LENGTH = 4
8.15 LLM extraction VERY HIGH VERY HIGH Semantic understanding (Phase 7.5)

Phase 8 Complete (8.1-8.14): All extractors implemented including 10 framework-specific extractors (Spring, Django, Express, Rails, ASP.NET, Laravel, FastAPI, Next.js, Flask, NestJS).


Success Metrics

| Metric | Current | Target | How to Measure |
|--------|---------|--------|----------------|
| Detection rate (known vulns) | ~30% | >70% | Run against OWASP benchmark |
| False positive rate | Unknown | <10% | Manual review of 100 findings |
| Config file coverage | Regex only | Full parse | Structure-aware extraction |
| Framework coverage | 0 | 4 major | Spring, Django, Express, Rails |
| Enterprise pilot feedback | N/A | >4/5 | Post-pilot survey |

Phase 10: UX & Enterprise Polish

Goal: Address enterprise buyer feedback from pilot demos. Close gaps between pitch claims and actual functionality. Source: Skeptical buyer review of applications/aphoria-pitch/ materials.

10.1 Acknowledgment Expiry

Impact: HIGH | Effort: MEDIUM | Priority: P1

Add --expires flag to aphoria ack command for time-limited exceptions.

Task Status
Add expires_at: Option<String> to AcknowledgmentInfo struct (ISO 8601 format)
Add --expires CLI flag to Commands::Ack in cli.rs
Parse durations: --expires 90d, --expires 2026-12-31 (ISO 8601 date only)
Filter expired acks in check_conflicts()
Show "Ack expired, resurfaces as BLOCK" in output
Add expiry to JSON export for audit trail
Tests for expiry parsing and behavior

Implementation Notes:

  • Created src/expiry.rs module with parse_expiry(), is_expired(), and format_expiry() functions
  • Ack payloads stored as JSON with {reason, expires_at} for backwards compatibility
  • Legacy plain-text acks treated as permanent (no expiry)
  • Expired acks preserved for audit trail per patent claim 25
  • Updated all report formatters (table, JSON, markdown) to show expiry info
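
The two accepted --expires forms (a day count such as 90d, or an ISO 8601 date) could be parsed along these lines. This is a standalone sketch, not the contents of src/expiry.rs: the Expiry enum and its variants are hypothetical, and the real parse_expiry() may have a different signature and may resolve durations to a concrete date.

```rust
/// Hypothetical parsed form of an --expires argument.
#[derive(Debug, PartialEq)]
enum Expiry {
    InDays(u32),
    OnDate { year: u32, month: u32, day: u32 },
}

/// Accepts "<N>d" (for example "90d") or "YYYY-MM-DD".
fn parse_expiry(input: &str) -> Option<Expiry> {
    let s = input.trim();

    // Relative form: a positive day count followed by 'd'.
    if let Some(days) = s.strip_suffix('d') {
        return days.parse::<u32>().ok().filter(|&d| d > 0).map(Expiry::InDays);
    }

    // Absolute form: date only, no time component.
    let parts: Vec<&str> = s.split('-').collect();
    if parts.len() != 3 || parts[0].len() != 4 || parts[1].len() != 2 || parts[2].len() != 2 {
        return None;
    }
    let year = parts[0].parse::<u32>().ok()?;
    let month = parts[1].parse::<u32>().ok()?;
    let day = parts[2].parse::<u32>().ok()?;

    // Range check only; this sketch does not validate days per month.
    if !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    Some(Expiry::OnDate { year, month, day })
}
```

Returning None for anything else lets the CLI reject malformed input before an acknowledgment is stored.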

CLI changes (cli.rs):

Ack {
    concept_path: String,
    #[arg(short, long)]
    reason: String,
    /// Optional expiry (e.g., "90d", "2026-12-31")
    #[arg(long)]
    expires: Option<String>,
},

Usage:

# Expire after 90 days
aphoria ack code://go/auth/tls/cert_verification \
  --reason "Integration test environment" \
  --expires 90d

# Expire on specific date (ISO 8601)
aphoria ack code://go/auth/tls/cert_verification \
  --reason "Legacy migration - ends Q2" \
  --expires 2026-12-31

Output after expiry:

BLOCK  code://go/auth/tls/cert_verification
       Your code:  TLS certificate verification is disabled (main.go:12)
       Note:       Previous acknowledgment expired 2026-12-31
       Action:     Re-acknowledge or fix the issue

Enterprise Value: "Exceptions don't become permanent." SOC 2 auditors love time-limited exceptions because they force periodic review.


10.2 Human-Readable Signer Names

Impact: MEDIUM | Effort: MEDIUM | Priority: P2

Map issuer hex IDs to human-readable team names in output.

Task Status
Add signer_name: Option<String> to PackHeader
Add contact: Option<String> to PackHeader (Slack channel, email)
Update policy export/import to preserve new fields
Show "Signed by Platform Security Team" instead of hex in output
Show contact info in conflict output
Backward-compat: gracefully handle packs without new fields

Output with signer name:

BLOCK  code://go/auth/tls/cert_verification
       Your code:  TLS certificate verification is disabled (main.go:12)
       Source:     Acme Security Standard v3.2 (Platform Security Team)
       Contact:    #security-policy
       Action:     Fix or acknowledge with: aphoria ack <path> --reason "..."

Enterprise Value: Developers know who to contact. Auditors see clear attribution.


10.3 Speed Benchmarks

Impact: LOW | Effort: LOW | Priority: P3

Document and automate speed benchmark testing.

Task Status
Create benchmarks/ directory with test corpora
Automate time aphoria scan on standard corpus
Document test conditions in benchmark results
Add aphoria scan --benchmark flag for self-test
Include benchmarks in CI (optional, non-blocking)

Usage:

# Run benchmark on current directory
aphoria scan --benchmark

# Output includes timing breakdown
Benchmark Results:
  Files scanned:     767
  Lines of code:     187,918
  Claims extracted:  722
  Conflicts found:   186
  Total time:        652ms
    - File discovery:  45ms
    - Extraction:      487ms
    - Conflict query:  120ms

Enterprise Value: "Show me the benchmark on a 100K-line codebase" → aphoria scan --benchmark


Phase 10 Completion Criteria

Metric Target
Ack expiry working with 90d default
Demo output matches pitch slides exactly
Buyer can see who signed a policy (name, not hex)
Buyer can see how to contact policy owner
Speed benchmarks documented and reproducible

Phase 11: Evidence-Based Authority

Vision: Authority comes from evidence, not titles. Merit over tenure.

Problem: All patterns treated equally. A random commit carries the same weight as a pattern backed by RFC research and product specs.

Principle: The system rewards documentation, not tenure.

Evidence Levels

| Level | Example | Authority Weight | Graduation Threshold |
|-------|---------|------------------|----------------------|
| ProductSpec | specs/api-design.md → REQ-API-001 | 0.95 | 1 usage |
| Standard | RFC 7519, OWASP A03:2021 | 0.85 | 3 usages |
| Research | ADR-042, docs/decision-log.md | 0.70 | 5 usages |
| Commit | Just code, no context | 0.40 | 10 usages |

11.1 Evidence Level Types

Task Status
Create src/evidence/mod.rs module
Define EvidenceLevel enum (Commit, Research, Standard, ProductSpec)
Implement authority_weight() method
Add evidence level to LearnedPattern struct
Update pattern display to show evidence level

11.2 Evidence Source Detection

Task Status
Create EvidenceSource enum
Implement commit message parsing for RFC/standard references
Implement ADR file detection (docs/adr/*.md patterns)
Implement spec file detection (specs/*.md, *.spec.md)
Add PatternEvidence::detect() auto-detection
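
Commit message parsing for standard references can be sketched without a regex engine. The function below is illustrative only: the name rfc_references is hypothetical, it handles just the "RFC 7519" and "rfc7519" spellings, and the real EvidenceDetector is described as regex-based.

```rust
/// Hypothetical helper: collect RFC numbers mentioned in a commit message.
fn rfc_references(message: &str) -> Vec<u32> {
    // Split on anything that is not an ASCII letter or digit, so every
    // token is pure ASCII and safe to slice by byte offset.
    let tokens: Vec<&str> = message
        .split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|t| !t.is_empty())
        .collect();

    let mut found = Vec::new();
    for (i, &tok) in tokens.iter().enumerate() {
        if tok.eq_ignore_ascii_case("rfc") {
            // "RFC 7519" or "RFC-7519": the number is the next token.
            if let Some(n) = tokens.get(i + 1).and_then(|t| t.parse::<u32>().ok()) {
                found.push(n);
            }
        } else if tok.len() > 3 && tok[..3].eq_ignore_ascii_case("rfc") {
            // "rfc7519": the number is fused to the prefix.
            if let Ok(n) = tok[3..].parse::<u32>() {
                found.push(n);
            }
        }
    }
    found
}
```

A non-empty result would be one signal that a pattern carries Standard-level evidence rather than Commit-level evidence.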

11.3 Evidence-Aware Graduation

Task Status
Update GraduationManager thresholds based on evidence
ProductSpec: 1 usage → promotion candidate
Standard: 3 usages → promotion candidate
Research: 5 usages → promotion candidate
Commit-only: 10 usages → promotion candidate
Add evidence boost to shadow mode evaluation
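
The weights and thresholds from the Evidence Levels table map naturally onto an enum. The sketch below uses the values from that table; the method name graduation_threshold and the free function is_promotion_candidate are hypothetical and are not taken from src/evidence/types.rs or GraduationManager.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EvidenceLevel {
    Commit,
    Research,
    Standard,
    ProductSpec,
}

impl EvidenceLevel {
    /// Authority weight from the Evidence Levels table.
    fn authority_weight(self) -> f64 {
        match self {
            EvidenceLevel::ProductSpec => 0.95,
            EvidenceLevel::Standard => 0.85,
            EvidenceLevel::Research => 0.70,
            EvidenceLevel::Commit => 0.40,
        }
    }

    /// Usages required before a pattern becomes a promotion candidate.
    fn graduation_threshold(self) -> u32 {
        match self {
            EvidenceLevel::ProductSpec => 1,
            EvidenceLevel::Standard => 3,
            EvidenceLevel::Research => 5,
            EvidenceLevel::Commit => 10,
        }
    }
}

/// Hypothetical check: stronger evidence needs fewer observed usages.
fn is_promotion_candidate(level: EvidenceLevel, usages: u32) -> bool {
    usages >= level.graduation_threshold()
}
```

With these values a commit-only pattern seen nine times is still waiting, while a spec-backed pattern qualifies on its first usage.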

11.4 Evidence Display

Task Status
Update aphoria patterns show to display evidence chain
Show evidence level badge in table/JSON output
Show linked sources (ADR, spec, RFC) in conflict output
Add --evidence flag to filter patterns by evidence level

Phase 11 Completion Criteria

Metric Target
Evidence detection working for 4 source types
Graduation thresholds vary by evidence level
Pattern display shows evidence chain
ProductSpec-backed patterns graduate with 1 usage

Implementation Notes

Files Created:

  • src/evidence/mod.rs - Module exports with flow documentation
  • src/evidence/types.rs - EvidenceLevel, EvidenceSource, PatternEvidence types
  • src/evidence/detection.rs - EvidenceDetector with regex-based parsing

Files Modified:

  • src/learning/types.rs - Added evidence field to LearnedPattern
  • src/learning/store.rs - Added get_all_patterns(), get_pattern_by_id()
  • src/shadow/types.rs - Added evidence_level, evidence_sources to ShadowTest
  • src/shadow/graduation.rs - Added effective_min_scans(), meets_evidence_aware_criteria()
  • src/cli.rs - Added Show variant to PatternCommands
  • src/handlers/patterns.rs - Implemented handle_pattern_show()

Tests: 29 evidence tests + 15 graduation tests passing (817 total)


Phase 12: Knowledge Scope Hierarchy

Vision: Knowledge applies at the right level - org, team, or project.

Problem: All knowledge exists at one flat level. No way to say "this applies org-wide" vs "this is just our team's preference."

Scope Levels

Organization Level (applies to all teams)
├── Security policies (TLS, auth, secrets) - NO opt-out
├── Compliance requirements (GDPR, SOC 2)
└── Architecture decisions (API gateway, event bus)

Team Level (applies to team's projects)
├── Coding conventions (naming, error handling)
├── Technology choices (frameworks, libraries)
└── Domain patterns (payment flows, user lifecycle)

Project Level (applies to single project)
├── Local overrides (justified exceptions)
├── Experimental patterns (not yet proven)
└── Context-specific decisions

12.1 Scope Level Types

Task Status
Create src/scope/mod.rs module
Define ScopeLevel enum (Organization, Team, Project)
Add scope_level and scope_id to LearnedPattern
Add ScopeConfig to aphoria.toml
Implement --scope flag for CLI commands

12.2 Scope Inheritance

Task Status
Implement inheritance resolution (project → team → org)
Security policies: auto-apply, no opt-out
Conventions: auto-apply, teams can override with justification
Observations: never inherited, team-specific only
Add ScopedKnowledge struct with inherited_from chain
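
The inheritance rules above can be read as a resolution order seen from a single project. The sketch below is one interpretation, not the logic in src/scope/resolver.rs: ScopedRule, KnowledgeKind, rule, and resolve are hypothetical names, and the justification requirement for overrides is not modeled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ScopeLevel {
    Organization,
    Team,
    Project,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KnowledgeKind {
    SecurityPolicy,
    Convention,
    Observation,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct ScopedRule {
    key: String,
    level: ScopeLevel,
    kind: KnowledgeKind,
    value: String,
}

/// Convenience constructor for the examples.
fn rule(key: &str, level: ScopeLevel, kind: KnowledgeKind, value: &str) -> ScopedRule {
    ScopedRule { key: key.to_string(), level, kind, value: value.to_string() }
}

/// Resolve the effective rule for `key` as seen from a project.
fn resolve<'a>(rules: &'a [ScopedRule], key: &str) -> Option<&'a ScopedRule> {
    let candidates: Vec<&ScopedRule> = rules
        .iter()
        .filter(|r| r.key == key)
        // Observations are never inherited: only project-local ones apply.
        .filter(|r| r.kind != KnowledgeKind::Observation || r.level == ScopeLevel::Project)
        .collect();

    // Organization security policies apply with no opt-out.
    if let Some(policy) = candidates.iter().find(|r| {
        r.kind == KnowledgeKind::SecurityPolicy && r.level == ScopeLevel::Organization
    }) {
        return Some(*policy);
    }

    // Otherwise the most specific scope wins: project, then team, then org.
    for level in [ScopeLevel::Project, ScopeLevel::Team, ScopeLevel::Organization] {
        if let Some(found) = candidates.iter().find(|r| r.level == level) {
            return Some(*found);
        }
    }
    None
}
```

Checking the organization security policy before the specificity loop is what keeps a project-level entry from silently weakening it.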

12.3 Scope Override Workflow

Task Status
Implement aphoria scope override command
Require justification for overrides
Require evidence link (spec, ADR, ticket) for overrides
Store override audit trail
Show overrides in SOC 2 reports

12.4 Cross-Scope Queries

Task Status
aphoria patterns --scope org (org-level only)
aphoria patterns --scope team --exclude-inherited
aphoria patterns --scope project --only-local
Show scope in pattern list output

Phase 12 Completion Criteria

Metric Target
3 scope levels working (org/team/project)
Inheritance resolution correct
Overrides require justification + evidence
Cross-scope queries functional

Implementation Notes:

  • src/scope/mod.rs - ScopeLevel, ScopeId, ScopeContext with inheritance chain
  • src/scope/config.rs - ScopeConfig for aphoria.toml
  • src/scope/resolver.rs - ScopeResolver with Replace/Merge/NoInherit policies
  • src/scope/override_record.rs - ScopeOverride with OverrideValue, expiration
  • src/scope/store.rs - OverrideStore with persistence to ~/.aphoria/scope/
  • src/handlers/scope.rs - CLI command handlers (status, override, list, remove)

Tests: 884 tests passing, all scope tests passing


Phase 13: Knowledge Lifecycle Management

Vision: Knowledge ages. Patterns can be deprecated and superseded.

Problem: Knowledge exists forever. No way to deprecate patterns or track evolution.

Knowledge Status

Active       → Pattern is current, enforced
Deprecated   → Pattern is being phased out, migration guidance provided
Superseded   → Pattern replaced by another, link to replacement
Archived     → Pattern removed from active use, historical only

13.1 Knowledge Status Types

Task Status
Create src/lifecycle/mod.rs module
Define KnowledgeStatus enum
Add Deprecated variant with reason, superseded_by, sunset_date
Add KnowledgeLifecycle struct with status history
Store lifecycle in pattern metadata

13.2 Deprecation Command

Task Status
Implement aphoria deprecate <pattern-id> command
Require --reason flag
Optional --superseded-by <new-pattern>
Optional --sunset-date <ISO-8601>
Notify connected teams on deprecation

13.3 Migration Guidance

Task Status
Show deprecation warning in scan output
Link to superseding pattern when available
Show migration guide/ADR when linked
FLAG (not BLOCK) deprecated pattern usage
Track migration progress across projects

13.4 Migration Tracking Dashboard

Task Status
Implement aphoria migrations status command
Show progress by team (X/Y endpoints migrated)
Show days remaining until sunset
Show blockers (acknowledged exceptions)
Export migration status for reporting
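
The per-team "X/Y endpoints migrated" figure could be computed along these lines. This is a sketch only: the field names, the summary format, and treating an empty migration as 100% complete are assumptions, not the behavior of src/lifecycle/migration.rs.

```rust
/// Migration progress for one team (hypothetical field names).
pub struct MigrationProgress {
    pub team: String,
    pub migrated: u32,
    pub total: u32,
}

impl MigrationProgress {
    /// Whole-number percentage; an empty migration counts as complete.
    pub fn percent(&self) -> u32 {
        if self.total == 0 {
            100
        } else {
            self.migrated * 100 / self.total
        }
    }

    /// One status line per team, in an assumed "X/Y" format.
    pub fn summary(&self) -> String {
        format!(
            "{}: {}/{} endpoints migrated ({}%)",
            self.team,
            self.migrated,
            self.total,
            self.percent()
        )
    }
}

/// Aggregate progress across all teams.
pub fn overall_percent(rows: &[MigrationProgress]) -> u32 {
    let migrated: u32 = rows.iter().map(|r| r.migrated).sum();
    let total: u32 = rows.iter().map(|r| r.total).sum();
    if total == 0 {
        100
    } else {
        migrated * 100 / total
    }
}
```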

Phase 13 Completion Criteria

Metric Target
Deprecation command working
Deprecated patterns show warning in scan
Migration tracking across projects
SOC 2 report includes migration status

Implementation Notes:

  • src/lifecycle/mod.rs - KnowledgeStatus, KnowledgeLifecycle, StatusTransition
  • src/lifecycle/store.rs - LifecycleStore for persistence
  • src/lifecycle/migration.rs - MigrationStore, MigrationProgress tracking
  • src/handlers/lifecycle.rs - CLI handlers for deprecate, archive, reactivate, history, list
  • src/handlers/lifecycle.rs - Migration handlers for status, export, blockers
  • KnowledgeLifecycle added to LearnedPattern for pattern-level lifecycle tracking

Tests: 884 tests passing (35 lifecycle-specific tests)


Phase 14: Governance Workflows 🎯

Vision: Clear approval paths for pattern promotion with audit trails.

Problem: Governance is binary: either manual review or auto-promotion above 0.95. There are no structured approval workflows.

14.1 Approval Workflow Definition

Task Status
Create src/governance/mod.rs module
Define ApprovalWorkflow struct
Define ApprovalStage with required approvers
Support evidence-based auto-approve thresholds
Config: define workflows in aphoria.toml

14.2 Approval State Machine

Task Status
Implement state transitions (pending → approved/rejected)
Multi-stage approval support
Timeout and escalation policies
Store approval history with timestamps
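
A multi-stage approval state machine might look like the sketch below. Only the pending → approved/rejected transitions and the stored history come from the task list; `ApprovalRecord`, its fields, and the string timestamps are hypothetical, and timeouts and escalation are left out.

```rust
/// Where an approval request currently stands.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ApprovalState {
    /// Waiting on the approver for the given zero-based stage.
    Pending { stage: usize },
    Approved,
    Rejected { reason: String },
}

/// One audit entry: who did what, and when.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ApprovalEvent {
    pub approver: String,
    pub action: String,
    pub at: String,
}

pub struct ApprovalRecord {
    stages: usize,
    pub state: ApprovalState,
    pub history: Vec<ApprovalEvent>,
}

impl ApprovalRecord {
    /// A workflow always has at least one stage.
    pub fn new(stages: usize) -> Self {
        Self {
            stages: stages.max(1),
            state: ApprovalState::Pending { stage: 0 },
            history: Vec::new(),
        }
    }

    /// Advance one stage; the last stage moves the record to Approved.
    pub fn approve(&mut self, approver: &str, at: &str) -> Result<(), String> {
        let next = match &self.state {
            ApprovalState::Pending { stage } if stage + 1 >= self.stages => ApprovalState::Approved,
            ApprovalState::Pending { stage } => ApprovalState::Pending { stage: stage + 1 },
            other => return Err(format!("cannot approve from {:?}", other)),
        };
        self.state = next;
        self.log(approver, "approve", at);
        Ok(())
    }

    /// Reject from any pending stage; terminal states cannot be rejected.
    pub fn reject(&mut self, approver: &str, reason: &str, at: &str) -> Result<(), String> {
        if !matches!(self.state, ApprovalState::Pending { .. }) {
            return Err(format!("cannot reject from {:?}", self.state));
        }
        self.state = ApprovalState::Rejected { reason: reason.to_string() };
        self.log(approver, "reject", at);
        Ok(())
    }

    fn log(&mut self, approver: &str, action: &str, at: &str) {
        self.history.push(ApprovalEvent {
            approver: approver.to_string(),
            action: action.to_string(),
            at: at.to_string(),
        });
    }
}
```

Because every successful transition appends an event with the approver and timestamp, the same record can feed the audit trail described in 14.4.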

14.3 Approval CLI

Task Status
aphoria governance pending - list pending approvals
aphoria governance approve <id> --comment "..."
aphoria governance reject <id> --reason "..."
aphoria governance escalate <id>
Show approval status in pattern list

14.4 SOC 2 Audit Trail

Task Status
Full audit log for all governance actions
aphoria audit trail --pattern <id> - show timeline
Export governance history for auditors
Include approver identity and timestamp

Phase 14 Completion Criteria

Metric Target
Multi-stage approval working
Approval/reject with comments
Full audit trail exportable
SOC 2 evidence includes approval chain

Phase 15: Evidence Source Integration

Vision: ADRs, specs, and standards automatically link to patterns.

Problem: Evidence sources aren't automatically detected. Developers must manually reference them.

15.1 ADR Auto-Detection

Task Status
Create src/evidence/adr.rs
Detect ADR-XXX patterns in commit messages
Scan for ADR files in standard locations
Parse ADR content for related patterns
Link ADR to patterns automatically
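
Detecting ADR references in a commit message could be sketched with the standard library alone, as below. The sketch assumes the XXX in ADR-XXX is a run of ASCII digits and that references are case-sensitive; `find_adr_refs` is a hypothetical name, not the API of src/evidence/adr.rs.

```rust
/// Return the distinct `ADR-<digits>` references in `message`,
/// in order of first appearance.
pub fn find_adr_refs(message: &str) -> Vec<String> {
    let bytes = message.as_bytes();
    let mut refs: Vec<String> = Vec::new();
    let mut i = 0;
    while i + 4 <= bytes.len() {
        if &bytes[i..i + 4] == b"ADR-" {
            // Skip matches glued to a preceding word, such as "BADR-1".
            let glued = i > 0 && bytes[i - 1].is_ascii_alphanumeric();
            let start = i + 4;
            let mut end = start;
            while end < bytes.len() && bytes[end].is_ascii_digit() {
                end += 1;
            }
            if end > start && !glued {
                let id = format!("ADR-{}", &message[start..end]);
                if !refs.contains(&id) {
                    refs.push(id);
                }
            }
            // Continue scanning after the digits (or after "ADR-").
            i = end;
        } else {
            i += 1;
        }
    }
    refs
}
```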

15.2 Spec File Detection

Task Status
Create src/evidence/spec.rs
Detect spec files (specs/*.md, *.spec.md)
Parse requirement IDs (REQ-XXX)
Link requirements to patterns
Show requirement coverage in reports

15.3 Standard Reference Extraction

Task Status
Create src/evidence/standards.rs
Parse RFC references (RFC 7519)
Parse OWASP references (OWASP A03:2021)
Parse NIST references (NIST SP 800-53)
Auto-link to authoritative corpus
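
The three reference shapes listed above could be extracted with a small token scanner. This is a sketch under assumptions: `StandardRef` and `extract_standards` are hypothetical names, and only the exact forms shown in the task list ("RFC 7519", "OWASP A03:2021", "NIST SP 800-53") are recognized.

```rust
/// A reference to an external standard found in free text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StandardRef {
    Rfc(u32),
    Owasp(String),
    NistSp(String),
}

/// True for identifiers shaped like "A03:2021".
fn is_owasp_id(tok: &str) -> bool {
    let b = tok.as_bytes();
    b.len() == 8
        && b[0] == b'A'
        && b[1..3].iter().all(u8::is_ascii_digit)
        && b[3] == b':'
        && b[4..].iter().all(u8::is_ascii_digit)
}

/// Scan whitespace-separated tokens for known reference shapes.
pub fn extract_standards(text: &str) -> Vec<StandardRef> {
    // Strip surrounding punctuation such as "(", ")", "," and ".".
    let toks: Vec<&str> = text
        .split_whitespace()
        .map(|t| t.trim_matches(|c: char| !c.is_ascii_alphanumeric()))
        .collect();
    let mut out = Vec::new();
    for i in 0..toks.len() {
        let next = toks.get(i + 1).copied().unwrap_or("");
        match toks[i] {
            "RFC" => {
                if let Ok(n) = next.parse::<u32>() {
                    out.push(StandardRef::Rfc(n));
                }
            }
            "OWASP" if is_owasp_id(next) => out.push(StandardRef::Owasp(next.to_string())),
            "NIST" if next == "SP" => {
                let id = toks.get(i + 2).copied().unwrap_or("");
                if id.starts_with(|c: char| c.is_ascii_digit()) {
                    out.push(StandardRef::NistSp(id.to_string()));
                }
            }
            _ => {}
        }
    }
    out
}
```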

15.4 Evidence Display

Task Status
Show full evidence chain in pattern output
Link to source files (ADR, spec)
Show external standard references
aphoria patterns --by-evidence grouping

Phase 15 Completion Criteria

Metric Target
ADR auto-detection working
Spec file linking working
Standard references extracted
Evidence chain visible in output

Phase 16: Ignore & Exclusion System

Vision: Clean scans by properly excluding test fixtures and intentional vulnerabilities.

Problem: Scans report 210 conflicts, but about 102 of them come from test fixtures and demos. The current exclude setting supports only prefix matching, and there is no .aphoriaignore file, no inline ignore comments, and no ack export.

16.1 Glob Pattern Matching

Task Status
Replace starts_with() with globset in walker/mod.rs
Support ** recursive, * wildcard, ? single char
Document glob syntax in module docs
Add tests for pattern matching edge cases
Backwards compatibility with prefix patterns
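
The intended matching semantics, including the fallback for legacy prefix patterns, can be illustrated with a small standard-library matcher. This is not how globset works internally and is not the code in walker/mod.rs; the rule "a pattern without glob metacharacters is treated as a prefix" is an assumption about how backwards compatibility is achieved, and `is_excluded` is a hypothetical name.

```rust
/// Match `p` against `s` with `**`, `*`, and `?` wildcards.
fn glob_match(p: &[u8], s: &[u8]) -> bool {
    match p.first() {
        None => s.is_empty(),
        Some(b'*') if p.get(1) == Some(&b'*') => {
            if p.get(2) == Some(&b'/') {
                // `**/` matches zero or more whole directories.
                (0..=s.len())
                    .filter(|&i| i == 0 || s[i - 1] == b'/')
                    .any(|i| glob_match(&p[3..], &s[i..]))
            } else {
                // A bare `**` matches anything, including '/'.
                (0..=s.len()).any(|i| glob_match(&p[2..], &s[i..]))
            }
        }
        Some(b'*') => {
            // `*` matches any run of characters inside one path segment.
            let mut i = 0;
            loop {
                if glob_match(&p[1..], &s[i..]) {
                    return true;
                }
                if i == s.len() || s[i] == b'/' {
                    return false;
                }
                i += 1;
            }
        }
        // `?` matches exactly one character that is not '/'.
        Some(b'?') => !s.is_empty() && s[0] != b'/' && glob_match(&p[1..], &s[1..]),
        Some(&c) => !s.is_empty() && s[0] == c && glob_match(&p[1..], &s[1..]),
    }
}

/// Patterns without glob metacharacters keep the legacy prefix behavior.
pub fn is_excluded(pattern: &str, path: &str) -> bool {
    if pattern.contains(|c: char| c == '*' || c == '?' || c == '[') {
        glob_match(pattern.as_bytes(), path.as_bytes())
    } else {
        path.starts_with(pattern)
    }
}
```

The recursive matcher is exponential in the worst case, so it is only suitable for illustrating the semantics.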

16.2 .aphoriaignore File

Task Status
Create walker/ignore_file.rs module
Load .aphoriaignore from project root
Parse gitignore-style patterns with comments
Merge with aphoria.toml excludes
Support all comment styles (#, //, etc.)
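
Parsing and merging could be as simple as the sketch below. It handles only full-line `#` comments, blank lines, and whitespace trimming, in the gitignore style; the other comment styles mentioned in the task list are left out, and both function names are hypothetical.

```rust
/// Parse gitignore-style contents into a list of patterns.
/// Blank lines and lines starting with `#` are skipped.
pub fn parse_ignore_file(contents: &str) -> Vec<String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}

/// Combine aphoria.toml excludes with .aphoriaignore patterns,
/// keeping the original order and dropping exact duplicates.
pub fn merge_excludes(toml_excludes: &[String], ignore_patterns: &[String]) -> Vec<String> {
    let mut merged = toml_excludes.to_vec();
    for pattern in ignore_patterns {
        if !merged.contains(pattern) {
            merged.push(pattern.clone());
        }
    }
    merged
}
```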

16.3 Inline Ignore Comments

Task Status
Create extractors/ignore_comments.rs module
// aphoria:ignore same-line suppression
// aphoria:ignore-next-line next-line suppression
// aphoria:ignore-block / // aphoria:end-ignore block suppression
Support multiple comment styles (Rust, Python, C, SQL)
Integrate with ExtractorRegistry.extract_all()
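
How the three directives translate into suppressed line numbers can be sketched as follows. The directive names come from the task list; the function names, the 1-based line numbering, and the choice to also suppress the closing `aphoria:end-ignore` line are assumptions, and the integration with ExtractorRegistry is not shown.

```rust
use std::collections::BTreeSet;

/// Comment openers that may precede a directive.
const COMMENT_PREFIXES: [&str; 5] = ["//", "#", "/*", "--", "<!--"];

/// Return the directive name after `aphoria:` if it follows a comment opener.
fn directive(line: &str) -> Option<&str> {
    let idx = line.find("aphoria:")?;
    let before = line[..idx].trim_end();
    if !COMMENT_PREFIXES.iter().any(|p| before.ends_with(p)) {
        return None;
    }
    let rest = &line[idx + "aphoria:".len()..];
    let end = rest
        .find(|c: char| !(c.is_ascii_lowercase() || c == '-'))
        .unwrap_or(rest.len());
    Some(&rest[..end])
}

/// Compute the 1-based line numbers whose findings should be suppressed.
pub fn suppressed_lines(source: &str) -> BTreeSet<usize> {
    let mut out = BTreeSet::new();
    let mut in_block = false;
    let mut skip_next = false;
    for (i, line) in source.lines().enumerate() {
        let n = i + 1;
        if in_block || skip_next {
            out.insert(n);
        }
        skip_next = false;
        match directive(line) {
            Some("ignore") => {
                out.insert(n);
            }
            Some("ignore-next-line") => skip_next = true,
            Some("ignore-block") => in_block = true,
            Some("end-ignore") => in_block = false,
            _ => {}
        }
    }
    out
}
```

An extractor could then drop any finding whose line number is in the returned set.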

16.4 Acknowledgment Export/Import

Task Status
Create ack_file.rs module
aphoria ack export — export to .aphoria/acks.toml
aphoria ack import — import from .aphoria/acks.toml
Preserve expiry and reason fields
Skip duplicates on import
Version-controllable TOML format
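
The "skip duplicates on import" rule might be implemented along these lines. TOML parsing and writing are omitted; the `Ack` fields and the choice of `finding_id` as the duplicate key are assumptions rather than the actual schema of .aphoria/acks.toml.

```rust
/// One acknowledgment entry (hypothetical field names).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ack {
    pub finding_id: String,
    pub reason: String,
    pub expires: Option<String>,
}

/// Merge imported acks into the local store, skipping entries whose
/// `finding_id` is already present. Returns how many were added.
pub fn import_acks(existing: &mut Vec<Ack>, incoming: Vec<Ack>) -> usize {
    let mut added = 0;
    for ack in incoming {
        if existing.iter().any(|a| a.finding_id == ack.finding_id) {
            continue;
        }
        // Reason and expiry travel with the entry unchanged.
        existing.push(ack);
        added += 1;
    }
    added
}
```

Keeping the existing entry when ids collide means a local acknowledgment is never silently overwritten by an imported one.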

Phase 16 Completion Criteria

Metric Target
Glob patterns working in exclude
.aphoriaignore respected
Inline comments suppress findings
Acks exportable to version control
CLI commands for ack export/import

Enterprise Pilot Success Metrics

90-Day Pilot Targets

| Metric | Target | Measurement |
|--------|--------|-------------|
| Patterns captured | 100+ observations | Count in knowledge graph |
| Patterns promoted | 10+ conventions | Count with status=Active |
| Cross-team adoption | 2+ teams connected | Unique team_ids |
| New hire guidance events | 5+ accepted suggestions | Accept rate tracking |
| False positive rate | <10% | FP feedback / total flags |
| Evidence-backed patterns | >50% | Patterns with Research+ evidence |

180-Day Production Targets

| Metric | Target | Measurement |
|--------|--------|-------------|
| Knowledge retention | 0 lost patterns on departures | Audit log |
| Onboarding velocity | 50% faster ramp | Time to first PR |
| Convention adoption | 80% across org | Compliance rate |
| SOC 2 evidence | Audit pass | External validation |
| Deprecated pattern migration | 90% complete by sunset | Migration tracking |

Enterprise Simulation UAT

See: uat/enterprise-simulation-uat.md

6-month simulation covering:

  • Month 1: Platform team adopts, baseline patterns captured
  • Month 2: Payments team joins, cross-team patterns emerge
  • Month 3: New hire guided by existing patterns
  • Month 4: Mobile team joins, org-level promotion
  • Month 5: API versioning deprecated, migration tracked
  • Month 6: SOC 2 audit evidence generated