jordan bbeee18b68 feat: Institutional knowledge vision + roadmap phases 11-15

## Vision Update
- Shift from "code-level truth linter" to "self-learning institutional knowledge"
- Evidence-based authority model: merit over titles
  - ProductSpec → 0.95 authority, 1 usage to graduate
  - Standard (RFC) → 0.85 authority, 3 usages
  - Research (ADR) → 0.70 authority, 5 usages
  - Commit only → 0.40 authority, 10 usages
- Three-tier knowledge: Policies → Conventions → Observations
- Knowledge compounds with every commit

## Gap Analysis
- Documented missing features for enterprise pilot
- Phases 11-15 spec with implementation details
- Evidence detection, scope hierarchy, lifecycle management

## Roadmap Additions
- Phase 11: Evidence-Based Authority (🎯 current)
- Phase 12: Knowledge Scope Hierarchy
- Phase 13: Knowledge Lifecycle Management
- Phase 14: Governance Workflows
- Phase 15: Evidence Source Integration

## Enterprise Simulation UAT
- 6-month simulation: 3 teams, 19 contributors
- Month-by-month scenarios with expected outcomes
- Success metrics for 90-day and 180-day milestones

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-06 23:35:41 -07:00


Aphoria Roadmap

Phase 0: StemeDB Foundation ✅

Tracked in: roadmap.md § 5D. Concept Hierarchy

Changes to the core database that Aphoria depends on. Shipped as Phase 5D of the main StemeDB roadmap.

Aphoria Phase 0	StemeDB Phase 5D	Status
0.1 ConceptPath Type	5D.1 ConceptPath Type	✅
0.2 ConceptPath in Assertion	(implicit in 5D.1)	✅
0.3 Hierarchical Index	5D.4 Hierarchical Query	✅
0.4 Alias Store	5D.3 Alias Store + 5D.5 Alias Resolution	✅
0.5 Source Class Inference	5D.6 Source Class Inference	✅
0.6 Concept API Endpoints	5D.7 Concept API Endpoints	✅

Spec: docs/specs/concept-hierarchy.md

Phase 2: CLI Core ✅

Phase 2 was built before Phase 1 (authoritative corpus expansion). The CLI pipeline works end-to-end with a bootstrapped corpus of 11 hardcoded assertions covering TLS, JWT, CORS, secrets, and rate limiting.

Task	Status
2.1 Project Walker	✅ `walker/mod.rs`, `walker/path_mapper.rs`, `walker/language.rs`
2.2 Extractors (10)	✅ `tls_verify`, `jwt_config`, `hardcoded_secrets`, `timeout_config`, `dep_versions`, `cors_config`, `rate_limit`, `weak_crypto`, `command_injection`, `sql_injection`
2.3 Ingestion Bridge	✅ `bridge.rs` — BLAKE3 hashing, Ed25519 signing, claim→assertion conversion
2.4 Conflict Query	✅ `episteme.rs` — LocalEpisteme with check_conflicts()
2.5 Report Output	✅ `report/` — table (comfy-table), JSON, SARIF 2.1.0, markdown
2.6 Acknowledge Command	✅ `lib.rs` acknowledge()
Baseline & Diff	✅ `lib.rs` set_baseline(), show_diff()
Status Command	✅ `lib.rs` show_status()

183 tests pass. Clippy and fmt clean.

Phase 2 Code Quality Fixes ✅

Code review improvements to extractors:

Issue	Fix	Status
DES/RC4 concept path misclassification	Split `check_pattern()` into `check_hash_pattern()` and `check_encryption_pattern()`; DES/RC4 now use `crypto/encryption/algorithm` path	✅
SHA1 edge case undocumented	Added comments and test documenting that SHA1 detection is intentionally broad (triggers for git hashes, etc.)	✅
JS exec() regex overly broad	Tightened regex to require `child_process.` prefix or non-word/non-dot preceding character; prevents `RegExp.exec()` false positives	✅

Phase 2A: Concept Matching ✅

Status: Complete. Tail-path matching (2A.1), alias-aware queries (2A.2), and auto-alias creation (2A.3) all implemented.

2A.1 Leaf-Based Concept Matching (Aphoria-side fix) ✅

Implemented in episteme.rs via ConceptIndex:

- make_key(subject, predicate) extracts the last two path segments plus the predicate
- build(assertions) creates an in-memory index keyed by tail path
- lookup(subject, predicate) finds matching authoritative assertions
- check_conflicts() uses ConceptIndex instead of QueryEngine for cross-scheme matching

Integration tests prove TLS and JWT conflicts are detected correctly.
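
The tail-path keying described above can be sketched roughly as follows. This is an illustration, not the code in episteme.rs: the `Assertion` struct is a simplified stand-in, and the `#` separator in the key is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for an authoritative assertion.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    pub subject: String,
    pub predicate: String,
    pub value: String,
}

/// Builds a lookup key from the last two path segments plus the predicate.
/// The `#` separator is an assumption made for this sketch.
pub fn make_key(subject: &str, predicate: &str) -> String {
    // Drop the scheme ("code://", "rfc://") if one is present.
    let path = subject
        .split_once("://")
        .map(|(_, rest)| rest)
        .unwrap_or(subject);
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let tail_start = segments.len().saturating_sub(2);
    format!("{}#{}", segments[tail_start..].join("/"), predicate)
}

/// In-memory index keyed by tail path, so subjects from different
/// schemes land in the same bucket.
#[derive(Default)]
pub struct ConceptIndex {
    by_tail: HashMap<String, Vec<Assertion>>,
}

impl ConceptIndex {
    pub fn build(assertions: &[Assertion]) -> Self {
        let mut index = ConceptIndex::default();
        for a in assertions {
            index
                .by_tail
                .entry(make_key(&a.subject, &a.predicate))
                .or_default()
                .push(a.clone());
        }
        index
    }

    /// Returns the authoritative assertions sharing the tail path and predicate.
    pub fn lookup(&self, subject: &str, predicate: &str) -> &[Assertion] {
        self.by_tail
            .get(&make_key(subject, predicate))
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

With this keying, `code://rust/myapp/tls/cert_verification` and `rfc://5246/tls/cert_verification` both map to the `tls/cert_verification` bucket, which is what makes cross-scheme matching possible.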

2A.2 Alias Resolution in QueryEngine (StemeDB-side fix) ✅

Wired AliasStore into QueryEngine.execute():

- Added resolve_aliases: bool field to Query (defaults to false)
- Added alias_store: Option<Arc<dyn AliasStore>> to QueryEngine
- Added .with_alias_store() builder method
- When resolve_aliases: true, expands subject via AliasStore.resolve_all() before index lookup
- Added fetch_by_subjects() and fetch_by_subjects_predicate() for multi-subject deduplication
- Modified Query.matches() to skip subject filtering when aliases are resolved
- Skips fast path (MV lookup) when resolve_aliases: true
- Gracefully degrades when no alias store is configured

7 unit tests in engine/tests/alias_resolution.rs. This is the architecturally correct long-term fix that complements leaf matching.
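
A minimal sketch of the subject expansion step, assuming a much smaller `AliasStore` trait than the real one. `MapAliasStore` and `subjects_for()` are hypothetical names introduced only for this example; the field and builder names follow the list above.

```rust
use std::collections::{BTreeSet, HashMap};
use std::sync::Arc;

/// Assumed minimal shape of the alias store; the real trait likely has more methods.
pub trait AliasStore: Send + Sync {
    /// Returns every subject known to be an alias of `subject`.
    fn resolve_all(&self, subject: &str) -> Vec<String>;
}

/// Hypothetical in-memory store used only for this illustration.
pub struct MapAliasStore(pub HashMap<String, Vec<String>>);

impl AliasStore for MapAliasStore {
    fn resolve_all(&self, subject: &str) -> Vec<String> {
        self.0.get(subject).cloned().unwrap_or_default()
    }
}

pub struct Query {
    pub subject: String,
    /// Defaults to false according to the roadmap.
    pub resolve_aliases: bool,
}

#[derive(Default)]
pub struct QueryEngine {
    alias_store: Option<Arc<dyn AliasStore>>,
}

impl QueryEngine {
    pub fn with_alias_store(mut self, store: Arc<dyn AliasStore>) -> Self {
        self.alias_store = Some(store);
        self
    }

    /// Expands the query subject into the deduplicated, sorted set of
    /// subjects that the index lookup should cover.
    pub fn subjects_for(&self, query: &Query) -> Vec<String> {
        let mut subjects = BTreeSet::new();
        subjects.insert(query.subject.clone());
        if query.resolve_aliases {
            // Graceful degradation: with no store configured, only the
            // original subject is looked up.
            if let Some(store) = &self.alias_store {
                subjects.extend(store.resolve_all(&query.subject));
            }
        }
        subjects.into_iter().collect()
    }
}
```

Collecting into a `BTreeSet` is one simple way to get the multi-subject deduplication mentioned above; the real engine may deduplicate at the assertion level instead.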

2A.3 Auto-Alias Creation ✅

When Aphoria ingests authoritative assertions and code claims that share leaf names, it automatically creates aliases between them:

code://rust/myapp/tls/cert_verification ↔ rfc://5246/tls/cert_verification
code://rust/myapp/auth/jwt/audience_validation ↔ rfc://7519/jwt/audience_validation

This bridges 2A.1 (leaf matching) with 2A.2 (alias resolution) — leaf matching identifies candidates, aliases persist the relationship.

Implementation:

- Added auto_create_aliases: bool config option to AliasConfig (defaults to true)
- Added AliasOrigin::AutoDetected variant to stemedb-core for tracking auto-created aliases
- Wired GenericAliasStore into LocalEpisteme for alias persistence
- In check_conflicts(), when a code claim matches an authoritative claim by leaf, calls AliasStore.set_alias() to persist the relationship with AliasOrigin::AutoDetected
- Alias creation is idempotent (skips if alias already exists)
- 4 unit tests verify: alias creation on conflict, no creation when disabled, correct origin, idempotency

Phase 1: Authoritative Corpus Expansion ✅

Expanded from 11 hardcoded assertions to a pluggable corpus system with RFC, OWASP, and Vendor sources.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     aphoria corpus build                         │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────────────┐  │
│  │ RFC Ingester │  │ OWASP        │  │ Vendor Bootstrapper   │  │
│  │ (Tier 0)     │  │ Ingester     │  │ (Tier 2)              │  │
│  │              │  │ (Tier 1)     │  │                       │  │
│  └──────┬───────┘  └──────┬───────┘  └───────────┬───────────┘  │
│         │                 │                      │              │
│         └─────────────────┼──────────────────────┘              │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ CorpusRegistry  │                            │
│                  └────────┬────────┘                            │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ LocalEpisteme   │                            │
│                  │ ingest_         │                            │
│                  │ authoritative() │                            │
│                  └─────────────────┘                            │
└─────────────────────────────────────────────────────────────────┘

1.1 CorpusBuilder Trait ✅

Task	Status
`CorpusBuilder` trait	✅ `corpus/mod.rs` — name, scheme, default_tier, build, requires_network
`CorpusRegistry`	✅ Manages multiple builders, build_all(), list_builders()
`CorpusBuildResult`	✅ Stats per builder, total assertions, success/fail/skip counts
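
The trait and registry can be pictured along these lines. Method names follow the table above, but every signature is an assumption, assertions are reduced to plain strings, and `StaticBuilder` is a hypothetical builder that exists only to exercise the registry.

```rust
/// Method names follow the roadmap table; the signatures are assumptions.
pub trait CorpusBuilder {
    fn name(&self) -> &str;
    fn scheme(&self) -> &str;
    fn default_tier(&self) -> u8;
    fn requires_network(&self) -> bool;
    /// Assertions are plain strings here; the real type is richer.
    fn build(&self) -> Result<Vec<String>, String>;
}

#[derive(Debug, Default, PartialEq)]
pub struct CorpusBuildResult {
    pub total_assertions: usize,
    pub succeeded: usize,
    pub failed: usize,
    pub skipped: usize,
}

#[derive(Default)]
pub struct CorpusRegistry {
    builders: Vec<Box<dyn CorpusBuilder>>,
}

impl CorpusRegistry {
    pub fn register(&mut self, builder: Box<dyn CorpusBuilder>) {
        self.builders.push(builder);
    }

    pub fn list_builders(&self) -> Vec<&str> {
        self.builders.iter().map(|b| b.name()).collect()
    }

    /// Runs every builder. Network-bound builders are skipped when `offline`
    /// is set, and one failing builder does not stop the others.
    pub fn build_all(&self, offline: bool) -> CorpusBuildResult {
        let mut result = CorpusBuildResult::default();
        for builder in &self.builders {
            if offline && builder.requires_network() {
                result.skipped += 1;
                continue;
            }
            match builder.build() {
                Ok(assertions) => {
                    result.succeeded += 1;
                    result.total_assertions += assertions.len();
                }
                Err(_) => result.failed += 1,
            }
        }
        result
    }
}

/// Hypothetical builder with a canned outcome, for illustration only.
pub struct StaticBuilder {
    pub name: &'static str,
    pub network: bool,
    pub outcome: Result<Vec<String>, String>,
}

impl CorpusBuilder for StaticBuilder {
    fn name(&self) -> &str {
        self.name
    }
    fn scheme(&self) -> &str {
        "example" // placeholder scheme
    }
    fn default_tier(&self) -> u8 {
        2 // placeholder tier
    }
    fn requires_network(&self) -> bool {
        self.network
    }
    fn build(&self) -> Result<Vec<String>, String> {
        self.outcome.clone()
    }
}
```

Counting failures instead of returning early is what lets a build continue with the remaining sources when one of them fails.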

1.2 RFC Ingester ✅

Task	Status
`RfcCorpusBuilder`	✅ `corpus/rfc.rs`
HTTP fetching	✅ Via `ureq`, cached to `~/.cache/aphoria/rfc-cache/`
RFC 2119 keyword parsing	✅ MUST, MUST NOT, SHOULD, SHALL extraction
RFC-specific parsers	✅ JWT (7519), OAuth (6749), Bearer (6750), TLS 1.3 (8446), TLS BCP (7525), TOTP (6238), Basic Auth (7617), HTTP (9110)
Concept mapping	✅ `rfc://{number}/{topic}` at Tier 0 (Regulatory)
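
As a rough illustration of RFC 2119 keyword parsing, the sketch below splits text into sentences and keeps those containing one of the four keywords listed above. It uses only the standard library and naive sentence splitting on `.`; the real parser in corpus/rfc.rs may work quite differently, and all names here are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Keyword {
    Must,
    MustNot,
    Should,
    Shall,
}

/// Finds the first RFC 2119 keyword in a sentence. Matching is
/// case-sensitive and whole-word, since the RFC keywords are upper-case.
/// Other forms (for example "SHOULD NOT") are not distinguished here.
pub fn find_keyword(sentence: &str) -> Option<Keyword> {
    let words: Vec<&str> = sentence
        .split(|c: char| !c.is_ascii_alphabetic())
        .filter(|w| !w.is_empty())
        .collect();
    for (i, word) in words.iter().enumerate() {
        match *word {
            // Check the two-word form first so "MUST NOT" is not read as "MUST".
            "MUST" if words.get(i + 1) == Some(&"NOT") => return Some(Keyword::MustNot),
            "MUST" => return Some(Keyword::Must),
            "SHOULD" => return Some(Keyword::Should),
            "SHALL" => return Some(Keyword::Shall),
            _ => {}
        }
    }
    None
}

/// Returns each sentence that contains a normative keyword, paired with it.
pub fn extract_normative(text: &str) -> Vec<(Keyword, String)> {
    text.split('.')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .filter_map(|s| find_keyword(s).map(|k| (k, s.to_string())))
        .collect()
}
```

Sentences without a keyword are dropped, so informative text never becomes an assertion.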

1.3 OWASP Ingester ✅

Task	Status
`OwaspCorpusBuilder`	✅ `corpus/owasp.rs`
HTTP fetching	✅ From GitHub raw content, cached to `~/.cache/aphoria/owasp-cache/`
Markdown parsing	✅ MUST/SHOULD statements, section context
Cheat sheet parsers	✅ Authentication, JWT, TLS, Secrets, Input Validation, Session, CSRF, Password Storage, HTTP Headers
Concept mapping	✅ `owasp://cheatsheet/{topic}/{claim}` at Tier 1 (Clinical)

1.4 Vendor Docs ✅

Task	Status
`VendorCorpusBuilder`	✅ `corpus/vendor.rs`
PostgreSQL claims	✅ pool_size, idle_timeout, ssl_mode
Redis claims	✅ timeout, max_retries, tls
reqwest claims	✅ cert_verification, connect_timeout, request_timeout
hyper claims	✅ keep_alive_timeout, max_concurrent_streams
Go net/http claims	✅ read_timeout, write_timeout, idle_timeout, min_tls_version
tokio-postgres claims	✅ pool_size, ssl_mode
SQLx claims	✅ max_connections, idle_timeout
Concept mapping	✅ `vendor://{product}/{topic}/{claim}` at Tier 2 (Observational)

1.5 Hardcoded Refactor ✅

Task	Status
`HardcodedCorpusBuilder`	✅ `corpus/hardcoded.rs` — original 11 assertions
`create_authoritative_assertion()`	✅ Made public in `episteme.rs` for corpus builders

1.6 CLI Integration ✅

Task	Status
`aphoria corpus build`	✅ Fetches and ingests from all sources
`--only rfc,owasp,vendor`	✅ Filter to specific sources
`--offline`	✅ Skip network-requiring sources
`--clear-cache`	✅ Clear cache before building
`aphoria corpus list`	✅ List available corpus sources
`CorpusConfig`	✅ cache_dir, include_*, rfc_list options

1.7 Error Handling ✅

Task	Status
`RfcFetch` error	✅ Per-RFC fetch failures with context
`OwaspFetch` error	✅ Per-cheat-sheet fetch failures with context
`CorpusBuild` error	✅ General corpus build failures
Graceful degradation	✅ Continue with other sources if one fails

Files: corpus/mod.rs, corpus/hardcoded.rs, corpus/rfc.rs, corpus/owasp.rs, corpus/vendor.rs

Phase 3: Skill Integration ✅

Complete. Aphoria is now usable in Claude Code agent workflows.

3.1 Claude Code Skill ✅

Task	Status
`skill/SKILL.md`	✅ Comprehensive skill definition with all commands
`/aphoria scan`	✅ Scan project, show conflicts grouped by verdict
`/aphoria scan --fix`	✅ Interactive fix workflow
`/aphoria ack`	✅ Acknowledge conflicts as intentional
`/aphoria status`	✅ Show status and baseline
`/aphoria diff`	✅ Show changes since baseline
`/aphoria init`	✅ Initialize Aphoria
`/aphoria baseline`	✅ Set baseline
`skill/install.sh`	✅ Install script for `~/.claude/skills/aphoria/`

Files: skill/SKILL.md, skill/install.sh, skill/hooks.json

3.2 Agent Pre-Flight Hook ✅

Task	Status
`--exit-code` flag	✅ Returns 2 for BLOCK, 1 for FLAG only, 0 for clean
`--strict` flag	✅ Lower thresholds (FLAG at 0.3, BLOCK at 0.5)
Hook template	✅ `skill/hooks.json` with PreCommit and PrePush examples

Usage:

{
  "hooks": {
    "PreCommit": [{"command": "aphoria scan --format sarif --exit-code"}],
    "PrePush": [{"command": "aphoria scan --strict --exit-code"}]
  }
}
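
The gate behind `--exit-code` can be sketched as a two-step mapping: a score is classified against thresholds, and the worst verdict decides the exit code. The exit codes and the `--strict` thresholds come from the table above. The notion of a single numeric score per conflict and the inclusive comparisons are assumptions, as are the type names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Flag,
    Block,
}

pub struct Thresholds {
    pub flag: f64,
    pub block: f64,
}

impl Thresholds {
    /// Thresholds documented for `--strict` (FLAG at 0.3, BLOCK at 0.5).
    pub fn strict() -> Self {
        Thresholds { flag: 0.3, block: 0.5 }
    }
}

/// Classifies a conflict score. Treating the thresholds as inclusive
/// lower bounds is an assumption of this sketch.
pub fn classify(score: f64, thresholds: &Thresholds) -> Verdict {
    if score >= thresholds.block {
        Verdict::Block
    } else if score >= thresholds.flag {
        Verdict::Flag
    } else {
        Verdict::Pass
    }
}

/// Maps the verdicts of a scan to a process exit code:
/// 2 if anything blocks, 1 if anything is only flagged, 0 otherwise.
pub fn exit_code(verdicts: &[Verdict]) -> i32 {
    if verdicts.contains(&Verdict::Block) {
        2
    } else if verdicts.contains(&Verdict::Flag) {
        1
    } else {
        0
    }
}
```

Because hook runners treat any non-zero exit as failure, both FLAG and BLOCK stop a commit unless the hook script inspects the code itself.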

3.3 Alias Suggestion Workflow ✅

Alias creation is now automatic (Phase 2A.3). When Aphoria scans:

1. Tail-path matching finds authoritative assertions
2. Aliases are auto-created with AliasOrigin::AutoDetected
3. Future queries use the alias automatically

The skill documents the suggestion flow for manual alias management:

- y (Accept): Creates alias
- n (Reject): Records intentional difference
- defer: Flags for later review

Phase 4: Full-Cycle Pre-Commit (Scan + Sync) ✅

Vision: The pre-commit hook is a bidirectional knowledge sync, not just a read-only linter. Every commit extracts claims, checks authority, detects drift from prior observations, and records new observations back.

Spec: uat/2026-02-04-full-cycle-precommit-vision.md

┌─────────────────────────────────────────────────────────────┐
│                     PRE-COMMIT FLOW                          │
├─────────────────────────────────────────────────────────────┤
│  1. EXTRACT     → What claims does this code make?           │
│  2. CHECK       → Against authority + own prior claims       │
│  3. CLASSIFY    → Authority conflict | Self conflict | Novel │
│  4. UPDATE      → Record observations to local Episteme      │
│  5. GATE        → Exit code (BLOCK=2, FLAG=1, PASS=0)        │
└─────────────────────────────────────────────────────────────┘

4.1 Git Pre-Commit Hook ✅

All flags needed for pre-commit integration are implemented:

#!/bin/sh
# .git/hooks/pre-commit
aphoria scan --staged --persist --sync --exit-code

Or using pre-commit framework:

repos:
  - repo: local
    hooks:
      - id: aphoria
        name: Aphoria Truth Sync
        entry: aphoria scan --staged --persist --sync --exit-code
        language: system
        pass_filenames: false

4.2 Baseline Mode ✅

Already implemented in Phase 2.

4A: Observational Claims ✅

Record code claims as Tier 4 (Community) assertions when no authority conflict exists:

Task	Status
`sync: bool` in ScanArgs	✅ `types/command.rs`
`observations_recorded: usize` in ScanResult	✅ `types/result.rs`
`--sync` CLI flag	✅ `cli.rs` — requires `--persist`
`claim_to_observation()`	✅ `bridge.rs` — creates Tier 4 (Community, 0.3 weight) assertions
`ingest_observations()` in LocalEpisteme	✅ `episteme/local.rs` — writes to WAL + predicate index
Scan flow integration	✅ `scan.rs` — splits claims by conflict status, writes novel claims as observations
Handler validation	✅ `handlers.rs` — `--sync requires --persist` error
Report output	✅ `report/table.rs`, `report/json.rs` — shows observation count
Tests	✅ 5 new tests for observation write-back

Code: connection_pool.max_size = 25
Authority: (nothing)
Action: Record as Tier 4 observation (project memory)

Usage:

# Scan with observation write-back
aphoria scan --persist --sync

# Output:
# Recorded 45 observations (project memory)
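
The write-back step can be sketched as a filter plus a conversion: claims that conflict with authority are held back, and the rest become Tier 4 observations with weight 0.3. The `Claim` and `Observation` structs and the `novel_observations()` helper are simplified, hypothetical stand-ins for the types in bridge.rs and scan.rs.

```rust
use std::collections::HashSet;

/// Simplified, hypothetical claim extracted from code.
#[derive(Debug, Clone, PartialEq)]
pub struct Claim {
    pub concept: String,
    pub value: String,
}

/// Simplified, hypothetical observation written back to the Episteme.
#[derive(Debug, Clone, PartialEq)]
pub struct Observation {
    pub concept: String,
    pub value: String,
    pub tier: u8,
    pub weight: f64,
}

/// Converts a claim into a Tier 4 (Community) observation with weight 0.3.
pub fn claim_to_observation(claim: &Claim) -> Observation {
    Observation {
        concept: claim.concept.clone(),
        value: claim.value.clone(),
        tier: 4,
        weight: 0.3,
    }
}

/// Splits claims by conflict status and returns observations only for the
/// claims that do not conflict with authority.
pub fn novel_observations(claims: &[Claim], conflicting: &HashSet<String>) -> Vec<Observation> {
    claims
        .iter()
        .filter(|c| !conflicting.contains(&c.concept))
        .map(claim_to_observation)
        .collect()
}
```

A claim such as `connection_pool/max_size = 25` with no matching authority passes the filter and is recorded as project memory.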

4B: Self-Conflict Detection ✅

Detect drift from the project's own prior observations:

Task	Status
Query prior claims before conflict check	✅ `fetch_observations_for_concept()`
Compare current vs stored observations	✅ `check_drift()` compares values
Report changes as SELF-CONFLICT	✅ DriftResult with prior/current values
New verdict: `Drift` (distinct from Block/Flag)	✅ `Verdict::Drift`
Drift reporting in all formats	✅ table, json, markdown, sarif
Exit code includes drift	✅ `--exit-code` returns 1 for drift

Prior: db/pool_size = 25 (recorded 2026-01-15)
Now:   db/pool_size = 100
Result: DRIFT — "You changed pool_size from 25 to 100. Intentional?"
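
A minimal sketch of the drift comparison, assuming values are compared as plain strings. The signature of `check_drift()` and the fields of `DriftResult` are guesses based on the table above, not the actual definitions.

```rust
/// Assumed shape of a drift report: the concept plus prior and current values.
#[derive(Debug, PartialEq)]
pub struct DriftResult {
    pub concept: String,
    pub prior: String,
    pub current: String,
}

/// Compares the current value with the stored observation, if any.
/// No prior observation means the claim is novel, not drift.
pub fn check_drift(concept: &str, prior: Option<&str>, current: &str) -> Option<DriftResult> {
    match prior {
        Some(p) if p != current => Some(DriftResult {
            concept: concept.to_string(),
            prior: p.to_string(),
            current: current.to_string(),
        }),
        _ => None,
    }
}
```

For the example above, `check_drift("db/pool_size", Some("25"), "100")` yields a `DriftResult`, while an unchanged value or a missing prior observation yields `None`.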

Files: types/result.rs, types/verdict.rs, episteme/local.rs, scan.rs, report/*.rs

4C: Diff-Only Scanning ✅

Fast scanning for pre-commit hooks:

Task	Status
`FileSource` enum (All, Staged)	✅ `types/command.rs`
`--staged` flag (git diff --cached)	✅ `cli.rs`, `handlers.rs`
`walker/git.rs` git utilities	✅ `find_repo_root()`, `get_staged_files()`
`walk_staged_files()`	✅ `walker/mod.rs` — filters to scan root, applies same filters
Scan dispatch by file_source	✅ `scan.rs`
Error handling (NotGitRepo, GitCommand)	✅ `error.rs`
Tests	✅ 9 tests in `tests/staged_scanning.rs`
Target: < 500ms for staged-only	✅

Files: types/command.rs, walker/git.rs, walker/mod.rs, scan.rs, cli.rs, handlers.rs, error.rs

Usage:

# Pre-commit hook (fast, staged files only)
aphoria scan --staged --exit-code

# Full cycle with observation sync
aphoria scan --staged --persist --sync --exit-code
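
Staged-file discovery might look roughly like the sketch below, which shells out to `git diff --cached --name-only` and keeps only paths under the scan root. The function signatures and the `--diff-filter=ACMR` choice (skip deleted files) are assumptions; the real `get_staged_files()` in walker/git.rs may differ.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Parses `git diff --cached --name-only` output (one repo-relative path
/// per line) and keeps only the paths under `scan_root`.
pub fn parse_staged_output(stdout: &str, scan_root: &Path) -> Vec<PathBuf> {
    stdout
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(PathBuf::from)
        .filter(|path| path.starts_with(scan_root))
        .collect()
}

/// Lists staged files by invoking git. `scan_root` is expected to be
/// relative to `repo_root`, matching the paths git prints.
pub fn get_staged_files(repo_root: &Path, scan_root: &Path) -> std::io::Result<Vec<PathBuf>> {
    let output = Command::new("git")
        .arg("-C")
        .arg(repo_root)
        // ACMR: added, copied, modified, renamed. Deleted files are skipped.
        .args(["diff", "--cached", "--name-only", "--diff-filter=ACMR"])
        .output()?;
    if !output.status.success() {
        // Surface git's stderr (for example when not inside a repository).
        return Err(std::io::Error::new(
            std::io::ErrorKind::Other,
            String::from_utf8_lossy(&output.stderr).into_owned(),
        ));
    }
    Ok(parse_staged_output(
        &String::from_utf8_lossy(&output.stdout),
        scan_root,
    ))
}
```

Keeping the parsing separate from the process call makes the filtering logic testable without a git repository.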

4D: Enhanced Ack ✅

Acknowledgments with rationale and policy updates:

Task	Status
`--reason "text"` flag	✅ `cli.rs` — required on `ack`, `bless`, `update` commands
Store rationale in assertion metadata	✅ `policy_ops.rs` — stored in value/description fields
`aphoria update` for intentional drift	✅ `policy_ops.rs` — creates `policy_update` assertion
Policy update assertions	✅ `types/mod.rs` — `predicates::POLICY_UPDATE`

Files: cli.rs, handlers.rs, policy_ops.rs, types/command.rs, types/mod.rs

$ aphoria ack db/pool_size --reason "Scaling for Black Friday"
$ aphoria update db/pool_size 100 --reason "New baseline after load test"

4E: Hosted Mode ✅

Organizations can run their own StemeDB server so that all team members automatically sync observations to it:

Task	Status
`HostedConfig` in config.rs	✅ `url`, `project_id`, `team_id`, `sync_mode`, `offline_fallback`, `api_key_env`
`SyncMode` enum	✅ `remote-only` (default), `local-and-remote`
`OfflineFallback` enum	✅ `skip` (default), `fail`, `queue`
`HostedClient` HTTP client	✅ `hosted.rs` — retry logic, auth headers, observation push
`POST /v1/aphoria/observations` endpoint	✅ Server receives observations with project/team metadata
Scan integration	✅ Auto-enables sync when `[hosted]` configured
`Hosted(String)` error variant	✅ For connection/auth failures
Graceful offline fallback	✅ Based on `offline_fallback` config
Tests	✅ Config parsing, client creation, assertion conversion

# aphoria.toml
[hosted]
url = "https://episteme.acme.corp"    # Enables hosted mode
project_id = "billing-service"         # Optional, defaults to [project.name]
team_id = "platform-team"              # Optional, for multi-team servers
sync_mode = "remote-only"              # "remote-only" | "local-and-remote"
offline_fallback = "skip"              # "skip" | "fail" | "queue"
api_key_env = "APHORIA_API_KEY"        # Env var for auth token

Architecture:

┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ Developer A  │  │ Developer B  │  │ Developer C  │
│ aphoria scan │  │ aphoria scan │  │ aphoria scan │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │
       └─────────────────┼─────────────────┘
                         ▼
              ┌─────────────────────┐
              │ Team StemeDB Server │
              │ POST /v1/aphoria/   │
              │      observations   │
              └─────────────────────┘
                         │
                         ▼
              Aggregated team patterns

Files: config.rs, hosted.rs, scan.rs, error.rs, lib.rs, crates/stemedb-api/src/handlers/aphoria.rs, crates/stemedb-api/src/dto/aphoria.rs
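
The `offline_fallback` setting could be handled along these lines. The three variants and the `skip` default come from the table above; `SyncOutcome`, `sync_observations()`, and the use of a plain `Vec<String>` as the queue are hypothetical simplifications.

```rust
/// Mirrors the documented `offline_fallback` values; `skip` is the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum OfflineFallback {
    #[default]
    Skip,
    Fail,
    Queue,
}

impl OfflineFallback {
    /// Parses the string form used in aphoria.toml.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "skip" => Some(Self::Skip),
            "fail" => Some(Self::Fail),
            "queue" => Some(Self::Queue),
            _ => None,
        }
    }
}

/// Hypothetical result of one sync attempt.
#[derive(Debug, PartialEq)]
pub enum SyncOutcome {
    Pushed(usize),
    Skipped,
    /// Total number of observations now waiting in the queue.
    Queued(usize),
}

/// Attempts a push and applies the fallback policy if the push fails.
/// `push` stands in for the HTTP call made by the hosted client.
pub fn sync_observations(
    push: impl Fn(&[String]) -> Result<(), String>,
    observations: &[String],
    fallback: OfflineFallback,
    queue: &mut Vec<String>,
) -> Result<SyncOutcome, String> {
    match push(observations) {
        Ok(()) => Ok(SyncOutcome::Pushed(observations.len())),
        Err(err) => match fallback {
            OfflineFallback::Skip => Ok(SyncOutcome::Skipped),
            OfflineFallback::Fail => Err(err),
            OfflineFallback::Queue => {
                queue.extend_from_slice(observations);
                Ok(SyncOutcome::Queued(queue.len()))
            }
        },
    }
}
```

Passing the push operation as a closure keeps the policy logic independent of the network layer, so each fallback can be tested offline.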

Phase 4.5: Ephemeral Scan Mode ✅

Performance optimization: 40x faster scans by skipping Episteme storage when persistence isn't needed.

Problem

Every aphoria scan was slow because it initialized the full Episteme stack:

- WAL recovery (O(n) on every startup)
- Dual backend initialization (fjall + redb)
- Store and index initialization

Conflict detection, however, runs entirely in memory and never reads from the KV store: the authoritative corpus is rebuilt and code claims are re-extracted on every scan.

Solution

Added ScanMode enum with two modes:

Mode	Use Case	Storage	Performance
Ephemeral (default)	CI, pre-commit, quick checks	None	~0.25 seconds
Persistent	Baseline/diff tracking, alias creation	WAL + store	~1-2 seconds

Implementation ✅

Task	Status
`ScanMode` enum	✅ `types.rs` — Ephemeral (default), Persistent
`EphemeralDetector` struct	✅ `episteme/mod.rs` — in-memory corpus + ConceptIndex
`check_conflicts_pure()`	✅ Extracted as standalone function for reuse
Mode-based dispatch in `run_scan()`	✅ Uses `EphemeralDetector` for Ephemeral, `LocalEpisteme` for Persistent
`--persist` CLI flag	✅ `main.rs` — opt-in to persistent mode
Tests for both modes	✅ `test_ephemeral_scan_no_storage_created`, `test_persistent_scan_creates_storage`, `test_scan_modes_produce_same_conflicts`

Usage

# Fast ephemeral scan (default) — no storage created
aphoria scan .

# Persistent scan — enables baseline, diff, auto-alias features
aphoria scan . --persist

Performance

Mode	Time	Storage
Ephemeral	~0.25s	None
Persistent	~1-2s	WAL + store directories

Files: types.rs, episteme/mod.rs, lib.rs, main.rs, tests.rs
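
The mode selection can be sketched as below. `ScanMode` and its default follow the table above, and the `--sync requires --persist` rule comes from the Phase 4A handler validation; `from_persist_flag()` and `validate_flags()` are hypothetical helper names.

```rust
/// Ephemeral is the default; `--persist` opts in to Persistent.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum ScanMode {
    #[default]
    Ephemeral,
    Persistent,
}

impl ScanMode {
    pub fn from_persist_flag(persist: bool) -> Self {
        if persist {
            ScanMode::Persistent
        } else {
            ScanMode::Ephemeral
        }
    }
}

/// Validates the flag combination and picks the scan mode.
/// Observation write-back needs storage, so `--sync` without
/// `--persist` is rejected.
pub fn validate_flags(persist: bool, sync: bool) -> Result<ScanMode, String> {
    if sync && !persist {
        return Err("--sync requires --persist".to_string());
    }
    Ok(ScanMode::from_persist_flag(persist))
}
```

Making Ephemeral the `Default` means callers that never mention persistence get the fast path without extra configuration.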

Phase 5: Research Agent Loop ✅

Research agent fills gaps in authoritative coverage by researching official documentation.

5.1 Gap Detection ✅

Task	Status
`Gap` struct	✅ `research/gap_detector.rs` — concept_path, topic, predicate, source info
`detect_gaps()`	✅ Compares claims against ConceptIndex, identifies missing coverage
Topic normalization	✅ Extracts last 2 path segments for cross-scheme matching
Deduplication	✅ Deduplicates gaps by topic+predicate key

5.2 Gap Storage ✅

Task	Status
`GapRecord`	✅ `research/gap_store.rs` — tracking metadata, project count, research status
`GapStore`	✅ JSON-backed persistent storage with atomic saves
Project tracking	✅ Records which projects reported each gap
Research eligibility	✅ `is_eligible_for_research()` with threshold and cooldown
Gap pruning	✅ `prune_old_gaps()` removes stale entries
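
Research eligibility with a threshold and a cooldown might be expressed as follows. The `GapRecord` fields shown are a guess at the relevant subset, and passing `now` explicitly is a choice made here for testability rather than something the roadmap specifies.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical subset of the gap tracking metadata.
pub struct GapRecord {
    pub topic: String,
    /// Number of distinct projects that reported this gap.
    pub project_count: usize,
    /// When the gap was last researched, if ever.
    pub last_researched: Option<SystemTime>,
}

impl GapRecord {
    /// A gap is eligible once enough projects reported it and the
    /// cooldown since the last research attempt has elapsed.
    pub fn is_eligible_for_research(
        &self,
        threshold: usize,
        cooldown: Duration,
        now: SystemTime,
    ) -> bool {
        if self.project_count < threshold {
            return false;
        }
        match self.last_researched {
            None => true,
            // A clock that moved backwards is treated as "still cooling down".
            Some(last) => now
                .duration_since(last)
                .map(|elapsed| elapsed >= cooldown)
                .unwrap_or(false),
        }
    }
}
```

With the documented default threshold of 3, a gap reported by two projects stays ineligible no matter how old it is.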

5.3 Quality Validation ✅

Task	Status
`QualityValidator`	✅ `research/quality.rs` — validates researched claims
Source attribution	✅ Checks for authoritative domains (rfc-editor, owasp, vendor docs)
Normative language	✅ Verifies MUST/SHOULD/SHALL keywords present
Vague content detection	✅ Rejects "it depends", "typically", etc.
Consistency scoring	✅ Detects conflicting claims on same subject
`QualityReport`	✅ Detailed per-claim validation results
`filter_passed()`	✅ Returns only claims meeting quality threshold

5.4 Research Execution ✅

Task	Status
`Researcher`	✅ `research/researcher.rs` — orchestrates research pipeline
`DocumentationSource`	✅ Configurable sources with URL patterns and topics
Default sources	✅ Redis, PostgreSQL, Go, Rust, OWASP, Kafka, MongoDB
Content fetching	✅ HTTP with timeout and size limits
Normative extraction	✅ Regex-based MUST/SHOULD/SHALL extraction
Section tracking	✅ Extracts heading context for attribution
Confidence scoring	✅ Based on keyword strength, statement length, content size

5.5 CLI Integration ✅

Task	Status
`aphoria research run`	✅ Run research agent with configurable threshold
`aphoria research status`	✅ Show gap statistics and research progress
`aphoria research gaps`	✅ List gaps by project count
`--threshold`	✅ Minimum projects before researching (default: 3)
`--strict`	✅ Use strict quality validation
`--prune`	✅ Remove stale gaps before researching
`--ready`	✅ Show only gaps ready for research

Files: research/mod.rs, research/gap_detector.rs, research/gap_store.rs, research/quality.rs, research/researcher.rs, research/tests.rs

5.7 Security Extractors ✅

Extended Phase 2 extractors with OWASP-aligned security vulnerability detection:

Extractor	Detects	Languages
`weak_crypto`	MD5, SHA1, DES, RC4 usage	Rust, Go, Python, JS/TS
`command_injection`	Shell execution, os.system, subprocess shell=True	Rust, Go, Python, JS/TS
`sql_injection`	String concatenation in SQL queries	Rust, Go, Python, JS/TS

Concept paths:

- crypto/hashing/algorithm: MD5, SHA1
- crypto/encryption/algorithm: DES, RC4
- os/command/input, os/shell_mode: command injection
- db/query/input: SQL injection

5.6 Community Corpus Contributions ✅

Users can opt in to contribute patterns anonymously to a central corpus, enabling community consensus to adjust default thresholds.

Task	Status
`CommunityConfig`	✅ `config/mod.rs` — enabled (false), anonymize (true), exclude, include, min_confidence
`AnonymizedObservation`	✅ `community/types.rs` — privacy-preserving observation without file/line/text
`CommunityObjectValue`	✅ `community/types.rs` — serde-compatible version of ObjectValue
`PatternAggregate`	✅ `community/types.rs` — server-side aggregation with project counts
`anonymize_claim()`	✅ `community/anonymizer.rs` — wildcards project names, strips file/line, rounds timestamps
`compute_anon_hash()`	✅ Hash computed WITHOUT file/line/text (privacy-critical)
`wildcard_project_path()`	✅ `code://rust/myapp/tls` → `code://rust/*/tls`
`--community-preview` flag	✅ `cli.rs` — dry-run showing what WOULD be shared
`PatternAggregateStore`	✅ `stemedb-storage` — server-side pattern aggregation
Project deduplication	✅ Uses project_hash to prevent double-counting
`POST /v1/aphoria/community/observations`	✅ Push anonymized observations
`GET /v1/aphoria/patterns`	✅ Retrieve high-confidence community patterns

Privacy Model:

- Project names wildcarded: myapp → *
- File paths, line numbers, matched text NEVER shared
- Timestamps rounded to the hour (coarsening that supports k-anonymity)
- Server receives project_hash, not raw project names
- enabled defaults to false (explicit opt-in required)
- anonymize defaults to true (privacy-preserving by default)

Usage:

# Preview what would be shared (no network)
aphoria scan --community-preview

# Enable in aphoria.toml:
[community]
enabled = true
anonymize = true
min_confidence = 0.8
exclude = ["vendor://acme/internal/*"]

# Scan with sync to share patterns
aphoria scan --persist --sync

Files: community/mod.rs, community/types.rs, community/anonymizer.rs, config/mod.rs, cli.rs, handlers.rs, stemedb-storage/src/pattern_aggregate_store/
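
Two of the anonymization steps from the privacy model are simple enough to sketch directly. The example assumes that code subjects have the shape `code://<language>/<project>/...` and that timestamps are Unix seconds; `round_to_hour()` is a hypothetical helper name, and the real anonymizer may handle more cases.

```rust
/// Replaces the project segment of a `code://` subject with `*`.
/// Assumes the layout `code://<language>/<project>/<rest>`; subjects
/// with other schemes are returned unchanged.
pub fn wildcard_project_path(path: &str) -> String {
    match path.split_once("://") {
        Some(("code", rest)) => {
            let mut segments: Vec<&str> = rest.split('/').collect();
            if segments.len() >= 2 {
                segments[1] = "*";
            }
            format!("code://{}", segments.join("/"))
        }
        _ => path.to_string(),
    }
}

/// Rounds a Unix timestamp (in seconds) down to the start of its hour.
pub fn round_to_hour(unix_seconds: u64) -> u64 {
    unix_seconds - unix_seconds % 3_600
}
```

Both functions discard information irreversibly, which is the point: the server can aggregate patterns across projects without learning which project or which minute an observation came from.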

Phase 6: Federated Policy & Trust Packs ✅

Allow teams to define their own authoritative truths and distribute them as signed Trust Packs. This enables "Enterprise Grade" compliance across distributed teams.

6.1 Trust Pack Format ✅

Task	Status
`TrustPack` schema	✅ `policy.rs` — Assertions, Aliases, Metadata, Signature
`PackHeader`	✅ Name, version, issuer, timestamp
Serialization	✅ `rkyv` for zero-copy efficiency
Signing	✅ `ed25519-dalek` signing and verification

6.2 Policy Management ✅

Task	Status
`PolicyManager`	✅ Loads local and remote (HTTP/HTTPS) policies
Caching	✅ Caches remote policies in `~/.cache/aphoria/policies/`
`aphoria.toml` config	✅ `policies` list support

6.3 Core Integration ✅

Task	Status
`EphemeralDetector` integration	✅ Ingests policies into memory corpus/index
`check_conflicts_pure` update	✅ Resolves policy aliases before authoritative lookup
`LocalEpisteme` export helpers	✅ `fetch_acknowledgments`, `fetch_manual_aliases`

6.4 CLI Commands ✅

Task	Status
`aphoria policy export`	✅ Exports local `ack` decisions as a Trust Pack
`aphoria scan` policy loading	✅ Auto-loads policies from config

Files: policy.rs, config.rs, episteme/mod.rs, lib.rs, main.rs

Phase 6.5: Trust Pack Extensions ✅

Enhancements to Trust Packs for semantic predicate matching and key management.

6.5.1 Predicate Aliases ✅

Status: Complete. Implemented: 2026-02-06

User Story:

As a security architect, when my policy uses required=true but the extractor emits enabled=true, I need them to match semantically.

Problem:

- Policy blesses: code://standard/tls/cert_verification with predicate required, value true
- Extractor emits: code://config/tls/cert_verification with predicate enabled, value false
- Tail-path matching finds the concept (tls/cert_verification) ✓
- But predicates differ: required vs enabled, so no conflict is detected ✗

Solution:

Task	Description
`predicate_aliases` field	Add to Trust Pack schema
Default aliases	`enabled` ↔ `required` ↔ `mandatory` ↔ `enforced`
ConceptIndex update	Check aliases during lookup
Pack-defined aliases	Allow packs to specify custom alias sets

Trust Pack Schema Extension:

# In Trust Pack
[predicate_aliases]
security_enabled = ["enabled", "required", "mandatory", "enforced", "active"]
version_minimum = ["min_version", "minimum_version", "tls_min_version"]

Implementation Plan:

1. Add predicate_aliases: HashMap<String, Vec<String>> to TrustPack
2. Store aliases alongside assertions during import
3. Update ConceptIndex.make_key() to normalize predicates via aliases
4. Match during conflict detection: if predicate_a aliases to predicate_b, treat as same concept
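
Predicate normalization can be sketched as a reverse lookup from each alias to its group name. The `HashMap<String, Vec<String>>` input matches the planned `predicate_aliases` field; `PredicateAliases`, `from_groups()`, `normalize()`, and `same_concept()` are hypothetical names for this example.

```rust
use std::collections::HashMap;

/// Reverse index from each alias to its canonical group name.
pub struct PredicateAliases {
    canonical: HashMap<String, String>,
}

impl PredicateAliases {
    /// Builds the reverse index from alias groups such as
    /// `security_enabled = ["enabled", "required", ...]`.
    /// If an alias appears in several groups, which group wins is
    /// unspecified in this sketch.
    pub fn from_groups(groups: &HashMap<String, Vec<String>>) -> Self {
        let mut canonical = HashMap::new();
        for (name, members) in groups {
            canonical.insert(name.clone(), name.clone());
            for member in members {
                canonical.insert(member.clone(), name.clone());
            }
        }
        Self { canonical }
    }

    /// Maps a predicate to its group name, or returns it unchanged
    /// when no alias group mentions it.
    pub fn normalize<'a>(&'a self, predicate: &'a str) -> &'a str {
        self.canonical
            .get(predicate)
            .map(String::as_str)
            .unwrap_or(predicate)
    }

    /// Two predicates match when they normalize to the same name.
    pub fn same_concept(&self, a: &str, b: &str) -> bool {
        self.normalize(a) == self.normalize(b)
    }
}
```

Normalizing before key construction means `required` and `enabled` end up in the same index bucket, so the policy and the extractor output from the problem statement would be compared.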

6.5.2 Pack Signing Key Rotation ✅

Status: Complete. Implemented: 2026-02-06

User Story:

As a security admin, when our signing key is rotated, I need to re-sign all packs without losing policy content.

Problem:

- Trust Packs are signed with Ed25519 keys
- When keys are rotated (security best practice), existing packs become unverifiable
- Need to re-sign packs with new key while preserving content hash

Solution:

Task	Description
`aphoria policy resign`	CLI command to re-sign pack with new key
Content hash preservation	Keep `content_hash` unchanged, only update signature
Key rotation audit	Log key rotation events
Old signature archival	Optionally keep old signature for audit trail

CLI:

# Re-sign pack with new key
aphoria policy resign my-standards.pack --key-file new-private-key.pem

# Re-sign with signature chain (audit trail)
aphoria policy resign my-standards.pack --key-file new-key.pem --chain-signatures

Trust Pack Schema Extension:

pub struct TrustPack {
    // Existing fields...
    pub signature: Signature,

    // New field for key rotation audit
    pub signature_chain: Option<Vec<SignatureRecord>>,
}

pub struct SignatureRecord {
    pub issuer_public_key: [u8; 32],
    pub signature: Signature,
    pub signed_at: DateTime<Utc>,
    pub reason: Option<String>,  // "Key rotation", "Security incident", etc.
}

6.5.3 Priority

Feature	Priority	Trigger
Predicate Aliases	Medium	Enterprise feedback showing predicate naming conflicts
Key Rotation	Low	Enterprise security key management requirements

Documented in: uat/future-scenarios.md

Phase 7: Declarative Extractors ✅

Enable users to define new extractors in config/policy files (TOML) without writing Rust code. This removes the recompilation bottleneck for custom pattern enforcement.

User Outcome: "I added a custom extractor to my aphoria.toml that detects our company's deprecated API patterns. Now every scan flags files using the old pattern without me writing any Rust code."

7.1 Core Types ✅

Task	Status
`DeclarativeExtractorDef`	✅ `extractors/declarative.rs` — name, description, languages, pattern, claim, confidence
`DeclarativeClaimDef`	✅ subject, predicate, value specification
`DeclarativeValue` enum	✅ MatchedText, Boolean, Text variants
`DeclarativeExtractor`	✅ Compiled extractor with `Extractor` trait impl

7.2 Configuration ✅

Task	Status
`ExtractorConfig.declarative`	✅ `config/mod.rs` — `Vec<DeclarativeExtractorDef>`
TOML parsing	✅ Serde deserialization with `#[serde(untagged)]` for value types
Example config	✅ Documented in module and config docs

Example aphoria.toml:

[[extractors.declarative]]
name = "deprecated_api_v1"
description = "Detects usage of deprecated v1 API endpoints"
languages = ["go", "rust", "python"]
pattern = '/api/v1/\w+'
claim.subject = "api/deprecated_endpoint"
claim.predicate = "version"
claim.value = "v1"
confidence = 1.0

[[extractors.declarative]]
name = "legacy_encryption"
description = "Detects legacy encryption algorithms"
languages = ["rust", "go", "python", "javascript"]
pattern = '(?i)blowfish|twofish|cast5'
claim.subject = "crypto/encryption/algorithm"
claim.predicate = "algorithm"
claim.value_from_match = true
confidence = 0.9

7.3 Validation & Security ✅

Task	Status
Name validation	✅ Non-empty required
Subject/predicate validation	✅ Non-empty required
Confidence validation	✅ Must be 0.0-1.0
Regex validation	✅ Compiled at load time, not scan time
ReDoS protection	✅ `RegexBuilder` with 10MB size limits
Language parsing	✅ `Language::from_str()` with `FromStr` trait
Graceful failure	✅ Invalid extractors logged as warnings, don't block others
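
The non-regex validation rules and the graceful-failure behaviour can be sketched as below. The struct is a reduced, hypothetical version of `DeclarativeExtractorDef` with flattened fields, regex compilation is left out because it needs the external `regex` crate, and warnings are returned as strings rather than logged.

```rust
/// Reduced, hypothetical version of the extractor definition.
#[derive(Debug, Clone)]
pub struct DeclarativeExtractorDef {
    pub name: String,
    pub subject: String,
    pub predicate: String,
    pub confidence: f64,
}

/// Applies the validation rules listed above, except regex compilation.
pub fn validate(def: &DeclarativeExtractorDef) -> Result<(), String> {
    if def.name.trim().is_empty() {
        return Err("extractor name must not be empty".to_string());
    }
    if def.subject.trim().is_empty() {
        return Err(format!("{}: claim subject must not be empty", def.name));
    }
    if def.predicate.trim().is_empty() {
        return Err(format!("{}: claim predicate must not be empty", def.name));
    }
    // `contains` is false for NaN, so NaN is rejected as well.
    if !(0.0..=1.0).contains(&def.confidence) {
        return Err(format!("{}: confidence must be within 0.0-1.0", def.name));
    }
    Ok(())
}

/// Keeps the valid definitions and collects one warning per invalid one,
/// so a bad extractor does not block the others.
pub fn load_valid(
    defs: Vec<DeclarativeExtractorDef>,
) -> (Vec<DeclarativeExtractorDef>, Vec<String>) {
    let mut kept = Vec::new();
    let mut warnings = Vec::new();
    for def in defs {
        match validate(&def) {
            Ok(()) => kept.push(def),
            Err(message) => warnings.push(message),
        }
    }
    (kept, warnings)
}
```

Validating at load time rather than scan time means a misconfigured extractor is reported once, before any files are walked.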

7.4 Registry Integration ✅

Task	Status
Module export	✅ `extractors/mod.rs` — public types
Registry registration	✅ `ExtractorRegistry::new()` loads from config
Enable/disable support	✅ Declarative extractors respect `disabled` list
Runtime addition	✅ `add_from_definitions()` for Trust Pack integration

7.5 Error Handling ✅

Task	Status
`DeclarativeExtractor` error variant	✅ `error.rs` — name + message
Validation errors	✅ Clear messages for each failure mode
Structured logging	✅ `tracing::warn!` for compilation failures

7.6 Tests ✅

Task	Status
Unit tests	✅ 22 tests in `declarative.rs`
Registry tests	✅ 7 tests for integration
Validation tests	✅ Empty name, subject, predicate; invalid confidence, regex, language
Extraction tests	✅ Boolean, text, matched_text value types
Deserialization tests	✅ TOML parsing for all value types

Files: extractors/declarative.rs, extractors/mod.rs, config/mod.rs, types/language.rs, error.rs

Phase 7.5: LLM-in-the-Loop Extraction ✅

Use an LLM (Gemini) to extract claims semantically during persistent scans. This fills gaps that regex extractors can't catch, providing immediate value while the learning system builds up pattern knowledge.

Vision

Code file → Regex extractors → Claims found
                ↓
         High-value files (auth, config, crypto)
                ↓
         LLM Extractor → Additional semantic claims
                ↓
         Combined claims → Conflict detection

7.5.1 LLM Extractor Implementation ✅

Task	Status
`GeminiClient` struct	✅ `llm/client.rs` — Gemini API client using ureq
`LlmExtractor` struct	✅ `llm/extractor.rs` — orchestrates extraction with budget tracking
Prompt engineering	✅ Security-focused extraction prompt with structured JSON output
Response parsing	✅ Parse Gemini's JSON response into `ExtractedClaim` format
Error handling	✅ Graceful degradation when API unavailable or key missing

7.5.2 Selective Triggering ✅

Task	Status
`is_high_value_file()`	✅ `llm/extractor.rs` — auth/, config/, crypto/, security/, secrets/, certs/, ssl/, tls/, keys/, credentials/ directories
High-value file names	✅ secret, password, credential, token, auth, login, session, jwt, tls, ssl, cert, key, config, settings, security, crypto, encrypt, decrypt, oauth, saml, ldap, api_key, apikey, access_key, private
Token budget	✅ `max_tokens_per_scan` (default 50k), `max_tokens_per_file` (default 4k)
Skip conditions	✅ Only runs when regex extractors found nothing AND file is high-value
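
A rough sketch of the selective trigger described above. The directory list is taken from the table; the keyword list is only a subset of the file names listed there, and the matching rules (case-insensitive directory match, substring match on the file name) are assumptions rather than the actual behavior of `llm/extractor.rs`.

```rust
use std::path::Path;

/// Directory names treated as high-value (from the table above).
const HIGH_VALUE_DIRS: &[&str] = &[
    "auth", "config", "crypto", "security", "secrets",
    "certs", "ssl", "tls", "keys", "credentials",
];

/// A subset of the file-name keywords listed above.
const HIGH_VALUE_KEYWORDS: &[&str] = &[
    "secret", "password", "credential", "token", "auth", "jwt", "tls", "ssl", "api_key",
];

/// Returns true when a parent directory or the file name looks security-relevant.
pub fn is_high_value_file(path: &Path) -> bool {
    // Any parent directory component equal to a high-value directory name.
    let in_high_value_dir = path
        .parent()
        .into_iter()
        .flat_map(|dir| dir.components())
        .filter_map(|component| component.as_os_str().to_str())
        .any(|name| HIGH_VALUE_DIRS.iter().any(|dir| dir.eq_ignore_ascii_case(name)));

    // File name containing any keyword, compared in lowercase.
    let name_matches = path
        .file_name()
        .and_then(|name| name.to_str())
        .map(|name| name.to_ascii_lowercase())
        .is_some_and(|name| HIGH_VALUE_KEYWORDS.iter().any(|kw| name.contains(*kw)));

    in_high_value_dir || name_matches
}
```

Under these assumptions `src/auth/handlers.rs` qualifies through its directory and `src/util/jwt_helpers.rs` through its file name, while `src/util/strings.rs` is skipped.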

7.5.3 Cost Controls ✅

Task	Status
Token tracking	✅ `Arc<AtomicUsize>` for thread-safe budget tracking across files
BLAKE3 caching	✅ `llm/cache.rs` — content hash + model + prompt version for cache key
Cache location	✅ `~/.cache/aphoria/llm-cache/`
Budget enforcement	✅ `within_budget()` check before each LLM call
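
The budget tracking above can be illustrated with a small sketch built on `Arc<AtomicUsize>`. The `TokenBudget` type and the `try_reserve` method are hypothetical names for this example; only `within_budget()` and the 50k default appear in this roadmap.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Shared per-scan token budget; cloning shares the same counter across worker threads.
#[derive(Clone)]
pub struct TokenBudget {
    used: Arc<AtomicUsize>,
    max_tokens_per_scan: usize,
}

impl TokenBudget {
    pub fn new(max_tokens_per_scan: usize) -> Self {
        Self { used: Arc::new(AtomicUsize::new(0)), max_tokens_per_scan }
    }

    /// Cheap pre-check before an LLM call.
    pub fn within_budget(&self) -> bool {
        self.used.load(Ordering::Relaxed) < self.max_tokens_per_scan
    }

    /// Atomically reserves `tokens`; reserves nothing and returns false if the budget would be exceeded.
    pub fn try_reserve(&self, tokens: usize) -> bool {
        self.used
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |used| {
                used.checked_add(tokens)
                    .filter(|total| *total <= self.max_tokens_per_scan)
            })
            .is_ok()
    }

    /// Tokens consumed so far in this scan.
    pub fn used(&self) -> usize {
        self.used.load(Ordering::Relaxed)
    }
}
```

Using `fetch_update` rather than a separate load and store keeps the check and the increment in one atomic step, so two files processed in parallel cannot both claim the last tokens.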

7.5.4 Configuration ✅

# aphoria.toml
[llm]
enabled = true                    # Enable LLM extraction (default: false)
provider = "gemini"               # Only "gemini" supported
# model defaults to DEFAULT_LLM_MODEL (currently "gemini-3-flash-preview")
api_key_env = "GEMINI_API_KEY"    # Environment variable for API key
max_tokens_per_scan = 50000       # Budget per scan
max_tokens_per_file = 4000        # Budget per file (for max_output_tokens)
high_value_only = true            # Only use on auth/config/crypto files
cache_responses = true            # Cache by content hash
timeout_secs = 60                 # API timeout
min_confidence = 0.7              # Filter claims below this confidence

Files: llm/mod.rs, llm/client.rs, llm/extractor.rs, llm/cache.rs, config/mod.rs, scan.rs, error.rs

Phase 7.6: Pattern Learning Store ✅

When the LLM extracts a claim that regex extractors missed, the pattern is recorded. The store tracks which patterns recur across projects to identify candidates for promotion to declarative extractors.

Vision

LLM extracts claim from code
        ↓
Pattern not in learned store?
        ↓
Store: { example_code, claim, project_hash }
        ↓
Same pattern seen in 5+ projects?
        ↓
Flag for promotion to declarative extractor

7.6.1 LearnedPattern Schema ✅

Task	Status
`ValueType` enum	✅ `learning/types.rs` — Text, Number, Boolean
`ClaimTemplate` struct	✅ `learning/types.rs` — subject_template, predicate, value_type, description
`LearnedPattern` struct	✅ `learning/types.rs` — full schema with timestamps, project hashes, confidence tracking
Serde serialization	✅ JSON serialization with chrono timestamps
Tests	✅ 5 unit tests for types

7.6.2 PatternStore Implementation ✅

Task	Status
`PatternStore` trait	✅ `learning/store.rs` — abstract storage interface
`LocalPatternStore`	✅ JSON-backed local storage at `~/.aphoria/learning/patterns.json`
`RwLock` thread safety	✅ Write-through cache with in-memory HashMap
Deduplication	✅ `find_similar()` with Levenshtein similarity threshold 0.8
Pruning	✅ `prune_stale()` removes patterns not seen in N days
Tests	✅ 8 unit tests for store operations

7.6.3 Pattern Normalization ✅

Task	Status
`normalize_pattern()`	✅ `learning/normalizer.rs` — replaces literals with placeholders
Version detection	✅ `"1.0"`, `"TLSv1.2"` → `<string:version>`
Boolean detection	✅ `true`/`false` → `<boolean>`
Number detection	✅ Standalone numbers → `<number>`
String detection	✅ Remaining quoted strings → `<string>`
`pattern_similarity()`	✅ Levenshtein distance normalized to 0.0-1.0
Tests	✅ 17 unit tests for normalization
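
For reference, a normalized Levenshtein similarity of the kind `pattern_similarity()` is described as computing might look like the sketch below. It is an independent illustration, not the code in `learning/normalizer.rs`; normalizing by the longer string's length is an assumption.

```rust
/// Levenshtein edit distance over Unicode scalar values, using two rolling rows.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Similarity in 0.0..=1.0: 1.0 for identical strings, 0.0 when the distance equals the longer length.
pub fn pattern_similarity(a: &str, b: &str) -> f32 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0; // two empty patterns are identical
    }
    1.0 - levenshtein(a, b) as f32 / longest as f32
}
```

With this definition, `const TLS_MIN = <string:version>` and `const TLS_MAX = <string:version>` differ by two substitutions over 32 characters, giving 0.9375, which is above the 0.8 deduplication threshold.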

7.6.4 Configuration ✅

# aphoria.toml
[learning]
enabled = true                    # Enable pattern learning (default: false)
store = "local"                   # "local" | "hosted"
min_confidence = 0.7              # Minimum LLM confidence to learn
prune_after_days = 90             # Remove patterns not seen in N days

[learning.promotion]
min_projects = 5                  # Projects needed before promotion
min_confidence = 0.8              # Average confidence needed
auto_promote = false              # Require human approval (Phase 7.7)
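
As an illustration of how the `[learning.promotion]` thresholds could gate promotion, here is a minimal sketch. `PatternStats`, `PromotionThresholds`, and `is_promotion_candidate` are hypothetical names, and treating both thresholds as inclusive minimums is an assumption.

```rust
use std::collections::HashSet;

/// Minimal stand-in for a learned pattern (only the fields this check needs).
pub struct PatternStats {
    pub project_hashes: HashSet<String>,
    pub avg_confidence: f32,
    pub promoted: bool,
}

/// Mirrors the `[learning.promotion]` keys.
pub struct PromotionThresholds {
    pub min_projects: usize,
    pub min_confidence: f32,
}

impl Default for PromotionThresholds {
    fn default() -> Self {
        Self { min_projects: 5, min_confidence: 0.8 }
    }
}

/// A pattern qualifies once it has enough distinct projects and high enough average confidence.
pub fn is_promotion_candidate(pattern: &PatternStats, thresholds: &PromotionThresholds) -> bool {
    !pattern.promoted
        && pattern.project_hashes.len() >= thresholds.min_projects
        && pattern.avg_confidence >= thresholds.min_confidence
}
```

Counting distinct project hashes rather than raw occurrences means a pattern repeated many times inside one repository does not qualify on its own.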

7.6.5 Scan Integration ✅

Task	Status
Initialize pattern store	✅ `scan.rs` — only in persistent mode with learning enabled
Project hash computation	✅ BLAKE3 hash for privacy-preserving project identification
Record LLM-extracted claims	✅ After LLM extraction, record patterns meeting min_confidence
Update existing patterns	✅ Merge observations when similar pattern found
Logging	✅ Reports patterns_recorded count on scan completion

7.6.6 Error Handling ✅

Task	Status
`LearningStore` error variant	✅ `error.rs` — for storage/cache failures
Graceful degradation	✅ Store failures logged, don't block scan

Files: learning/mod.rs, learning/types.rs, learning/normalizer.rs, learning/store.rs, config/mod.rs, scan.rs, error.rs, lib.rs

Tests: 30 tests covering types, normalization, and store operations.

Phase 7.6 (Legacy Documentation)

Note: The following is the original spec for reference. See above for implemented status.

Original Schema (Reference)

/// A pattern learned from LLM extraction that could become a declarative extractor.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LearnedPattern {
    /// Unique identifier
    pub id: Uuid,

    /// Example code that triggered this pattern
    pub example_code: String,

    /// Normalized pattern (variables replaced with placeholders)
    /// e.g., "const TLS_MIN_VERSION = \"1.0\"" → "const TLS_MIN_VERSION = <version>"
    pub normalized_pattern: String,

    /// The claim this pattern produces
    pub claim_template: ClaimTemplate,

    /// Language this pattern applies to
    pub language: Language,

    /// When first seen
    pub first_seen: DateTime<Utc>,

    /// When last seen
    pub last_seen: DateTime<Utc>,

    /// Projects that have this pattern (hashed for privacy)
    pub project_hashes: HashSet<String>,

    /// Total occurrences across all projects
    pub occurrences: u32,

    /// Average LLM confidence when extracting this
    pub avg_confidence: f32,

    /// Has this been promoted to a declarative extractor?
    pub promoted: bool,

    /// If promoted, the extractor ID
    pub promoted_to: Option<String>,
}

/// Template for generating claims from a learned pattern.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ClaimTemplate {
    pub subject_template: String,  // "tls/min_version"
    pub predicate: String,         // "version"
    pub value_type: ValueType,     // String, Boolean, Number
    pub description_template: String,
}

Original PatternStore Trait (Reference)

pub trait PatternStore: Send + Sync {
    /// Record a pattern learned from LLM extraction
    fn record_pattern(&self, pattern: &LearnedPattern) -> Result<()>;

    /// Find existing pattern matching this example
    fn find_similar(&self, normalized: &str, language: Language, threshold: f32) -> Option<LearnedPattern>;

    /// Get patterns ready for promotion (threshold met)
    fn get_promotion_candidates(&self, min_projects: usize, min_confidence: f32) -> Vec<LearnedPattern>;

    /// Mark pattern as promoted
    fn mark_promoted(&self, id: &Uuid, extractor_name: &str) -> Result<()>;

    /// Prune old patterns
    fn prune_stale(&self, max_age_days: u32) -> Result<usize>;
}

7.6.3 Pattern Normalization ⬜

Task	Description
Variable extraction	Identify literals that vary (versions, names, values)
Placeholder insertion	Replace literals with typed placeholders
Similarity scoring	Compare normalized patterns for dedup

fn normalize_pattern(code: &str, claim: &ExtractedClaim) -> String {
    // "const TLS_MIN = \"1.0\"" → "const TLS_MIN = <string:version>"
    // "pool_size: 25" → "pool_size: <number>"
    // "verify_ssl: false" → "verify_ssl: <boolean>"
}

fn similarity_score(a: &str, b: &str) -> f32 {
    // Levenshtein distance normalized to 0.0-1.0
    // Patterns with score > 0.8 are considered duplicates
}

7.6.4 Integration with Scan ⬜

// In scan.rs, after LLM extraction
for claim in llm_claims {
    // Compare normalized forms; 0.8 is the duplicate threshold
    let normalized = normalize_pattern(&claim.matched_text, &claim);
    if let Some(existing) = pattern_store.find_similar(&normalized, language, 0.8) {
        // Update existing pattern
        pattern_store.increment_occurrence(&existing.id, project_hash)?;
    } else {
        // Record new pattern
        let pattern = LearnedPattern::from_claim(&claim, &code_context, project_hash);
        pattern_store.record_pattern(&pattern)?;
    }
}

7.6.5 Configuration ⬜

# aphoria.toml
[learning]
enabled = true                    # Enable pattern learning
store = "local"                   # "local" | "hosted"
min_confidence = 0.7              # Minimum LLM confidence to learn
prune_after_days = 90             # Remove patterns not seen in N days

[learning.promotion]
min_projects = 5                  # Projects needed before promotion
min_confidence = 0.8              # Average confidence needed
auto_promote = false              # Require human approval (Phase 7.7)

Files: learning/mod.rs, learning/pattern.rs, learning/store.rs, learning/normalize.rs

Phase 7.7: Pattern → Extractor Promotion ✅

High-frequency learned patterns are promoted to declarative extractors. This closes the learning loop: patterns discovered by the LLM become permanent, fast regex extractors.

Vision

LearnedPattern (5+ projects, >0.8 confidence)
        ↓
Claude: "Generate regex for this pattern"
        ↓
Candidate declarative extractor
        ↓
Validate against stored examples
        ↓
Human review (optional) → Approve/Reject
        ↓
Merge to project's .aphoria/extractors/

7.7.1 Promotion Pipeline ✅

Task	Status
`PromotionPipeline`	✅ `promotion/pipeline.rs` — orchestrates full promotion flow
`RegexGenerator`	✅ `promotion/regex_gen.rs` — Gemini LLM integration
`ExtractorValidator`	✅ `promotion/validator.rs` — ReDoS detection, timing validation
`YamlWriter`	✅ `promotion/writer.rs` — outputs to `.aphoria/extractors/learned/`
`InteractiveReviewer`	✅ `promotion/review.rs` — CLI review workflow
`PromotionCandidate`	✅ `promotion/types.rs`
`ValidationResult`	✅ `promotion/types.rs`

pub struct PromotionPipeline {
    pattern_store: Arc<dyn PatternStore>,
    llm_client: ClaudeClient,
    validator: ExtractorValidator,
}

impl PromotionPipeline {
    /// Get patterns ready for promotion
    pub async fn get_candidates(&self) -> Result<Vec<PromotionCandidate>> {
        // Thresholds mirror min_projects / min_confidence in the config
        let patterns = self.pattern_store.get_promotion_candidates(5, 0.8);

        // Generate candidates one at a time so each future is awaited
        let mut candidates = Vec::with_capacity(patterns.len());
        for pattern in patterns {
            candidates.push(self.generate_candidate(pattern).await?);
        }
        Ok(candidates)
    }

    /// Generate declarative extractor from pattern
    async fn generate_candidate(&self, pattern: LearnedPattern) -> Result<PromotionCandidate> {
        // Ask Claude to generate regex
        let regex = self.llm_client.generate_regex(&pattern).await?;

        // Build declarative extractor
        let extractor = DeclarativeExtractor {
            name: pattern.id.to_string(),
            language: pattern.language,
            pattern: regex,
            claim: pattern.claim_template.clone(),
            source: ExtractorSource::Learned {
                pattern_id: pattern.id,
                projects: pattern.project_hashes.len(),
            },
        };

        // Validate against examples
        let validation = self.validator.validate(&extractor, &pattern).await;

        Ok(PromotionCandidate { pattern, extractor, validation })
    }
}

7.7.2 Regex Generation ✅

Task	Status
Multi-example prompt	✅ Includes all examples in generation prompt
Regex safety	✅ ReDoS detection prevents catastrophic backtracking
Test coverage	✅ Validates against stored examples

async fn generate_regex(
    client: &ClaudeClient,
    examples: &[String],
    claim: &ClaimTemplate,
) -> Result<String> {
    let prompt = format!(
        "Generate a regex pattern that matches all these code examples:\n\n{}\n\n\
         The regex should extract the value for claim: {}\n\
         Requirements:\n\
         - Must match ALL examples\n\
         - Use named capture groups for extracted values\n\
         - Avoid catastrophic backtracking (no nested quantifiers)\n\
         - Return ONLY the regex, no explanation",
        examples.join("\n---\n"),
        claim.subject_template
    );

    // The client is passed in explicitly; trim whitespace around the model's answer
    let response = client.message(&prompt).await?;
    let regex = response.trim().to_string();
    validate_regex_safety(&regex)?;
    Ok(regex)
}
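
One way a safety check like `validate_regex_safety` could screen for the nested quantifiers mentioned in the prompt is sketched below. This is a heuristic written for illustration only: `has_nested_quantifier` is a hypothetical helper, it is not a full regex parser, and it is deliberately conservative (it also flags bounded repeats such as `(a+){2}`).

```rust
/// Heuristic ReDoS screen: flags a group that contains `*` or `+` and is itself repeated,
/// for example `(a+)+`. Escapes and character classes are skipped.
pub fn has_nested_quantifier(pattern: &str) -> bool {
    let chars: Vec<char> = pattern.chars().collect();
    // One entry per open group: does it contain a repetition so far?
    let mut groups: Vec<bool> = Vec::new();
    let mut in_class = false;
    let mut i = 0;
    while i < chars.len() {
        match chars[i] {
            '\\' => i += 1, // skip the escaped character
            '[' if !in_class => in_class = true,
            ']' if in_class => in_class = false,
            _ if in_class => {} // everything inside [...] is literal
            '(' => groups.push(false),
            ')' => {
                let inner = groups.pop().unwrap_or(false);
                let repeated = matches!(chars.get(i + 1).copied(), Some('*' | '+' | '{'));
                if inner && repeated {
                    return true;
                }
                // A repeated or repetition-containing group counts for its parent.
                if let Some(parent) = groups.last_mut() {
                    *parent |= inner || repeated;
                }
            }
            '*' | '+' => {
                if let Some(group) = groups.last_mut() {
                    *group = true;
                }
            }
            _ => {}
        }
        i += 1;
    }
    false
}
```

A screen like this is only a first line of defense; the timing validation listed for `ExtractorValidator` would still be needed to catch slow patterns that the heuristic misses.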

7.7.3 Validation Suite ✅

Task	Status
Positive tests	✅ Must match all stored examples
ReDoS detection	✅ Detects catastrophic backtracking patterns
Performance test	✅ Timing validation with configurable threshold
False positive check	⬜ Deferred to Phase 9 (sample codebase FP testing)

pub struct ExtractorValidator {
    sample_codebases: Vec<PathBuf>,  // Known-good projects for FP testing
}

impl ExtractorValidator {
    pub async fn validate(
        &self,
        extractor: &DeclarativeExtractor,
        pattern: &LearnedPattern
    ) -> ValidationResult {
        let mut result = ValidationResult::default();

        // Must match all positive examples
        for example in &pattern.examples {
            if !extractor.matches(example) {
                result.positive_failures.push(example.clone());
            }
        }

        // Must not have excessive false positives
        for codebase in &self.sample_codebases {
            let fps = self.count_false_positives(extractor, codebase).await;
            if fps > 10 {
                result.false_positive_warning = true;
            }
        }

        // Must be fast
        let duration = self.benchmark(extractor);
        if duration > Duration::from_millis(100) {
            result.performance_warning = true;
        }

        result
    }
}

7.7.4 Human Review Gate ✅

Task	Status
`aphoria extractors review`	✅ CLI to review pending promotions
`aphoria extractors stats`	✅ Show pattern store statistics
`aphoria extractors candidates`	✅ List promotion candidates
`aphoria extractors promote`	✅ Promote pattern to extractor
Approval workflow	✅ Approve, reject, or skip via InteractiveReviewer
Rejection tracking	⬜ Deferred to Phase 9 (rejection reason persistence)
Auto-approve mode	⬜ Deferred to Phase 9 (>0.95 confidence auto-promote)

$ aphoria extractors review

Pending promotions: 3

[1/3] Pattern: tls_min_version_const
      Examples: 47 (across 8 projects)
      Confidence: 0.91

      Generated regex: (?i)(tls|ssl)_?(min|minimum)_?version\s*[:=]\s*["']?(1\.[01])["']?

      Sample matches:
        const TLS_MIN_VERSION = "1.0"     ✓ matches
        TLS_MINIMUM_VERSION: "1.1"        ✓ matches
        ssl_min_version = "1.2"           ✗ no match (TLS 1.2 is safe)

      [a]pprove  [r]eject  [e]dit  [s]kip  [q]uit: _

7.7.5 Extractor Output ✅

Promoted patterns become declarative extractors in .aphoria/extractors/learned/:

# .aphoria/extractors/learned/tls_min_version_const.yaml
# Auto-generated from learned pattern. DO NOT EDIT.
# Pattern ID: 550e8400-e29b-41d4-a716-446655440000
# Learned from: 8 projects, 47 occurrences
# Confidence: 0.91
# Promoted: 2026-02-10

name: "tls_min_version_const"
language: ["rust", "go", "python", "javascript", "typescript"]
pattern: '(?i)(tls|ssl)_?(min|minimum)_?version\s*[:=]\s*["'']?(1\.[01])["'']?'
claim:
  subject: "tls/min_version"
  predicate: "version"
  value_capture: 3  # Capture group for version (groups 1 and 2 are the tls/ssl and min/minimum alternations)
  description: "TLS minimum version set to deprecated {value}"
metadata:
  source: "learned"
  pattern_id: "550e8400-e29b-41d4-a716-446655440000"
  projects: 8
  occurrences: 47
  confidence: 0.91

7.7.6 Configuration ✅

# aphoria.toml
[promotion]
enabled = true                    # Enable promotion pipeline
auto_promote = false              # Require human approval
output_dir = ".aphoria/extractors/learned"
min_confidence = 0.8              # Minimum to consider
min_projects = 5                  # Projects needed before promotion
require_validation = true         # Must pass validation suite

Files: promotion/mod.rs, promotion/pipeline.rs, promotion/regex_gen.rs, promotion/validator.rs, promotion/review.rs, promotion/writer.rs, promotion/types.rs, handlers/extractors.rs

Tests: 43 tests covering pipeline, validation, regex generation, and YAML output.

Phase 9: Autonomous Extractor Generation ✅

The system generates, tests, and deploys extractors without human approval for high-confidence patterns. This is the endgame: a fully self-improving extraction system.

Vision

Learned pattern exceeds autonomous threshold (>0.95 confidence, >10 projects)
        ↓
Auto-generate extractor
        ↓
Validate against comprehensive test suite
        ↓
A/B test: run new extractor in shadow mode
        ↓
If FP rate < 5%: auto-deploy
        ↓
If FP rate exceeds the rollback threshold (15%): auto-rollback

Phase 7.8: LLM Prompt Evaluation ✅

Measure and improve LLM extraction quality through golden fixtures and regression detection. Essential for prompt engineering without breaking existing quality.

Vision

Golden Fixtures (TOML)                 Evaluation Harness
   ├── tls-001: verify=False            ├── Load fixtures
   ├── jwt-001: algorithm=none    -->   ├── Run extraction (live/cached/mock)
   └── secrets-001: hardcoded key       ├── Match against expectations
                                        ├── Compute precision/recall/F1
                                        └── Compare to baseline (regression detection)

7.8.1 Fixture Format ✅

Task	Status
`Fixture` type	✅ `eval/fixture.rs` — TOML-based test cases
`ExpectedClaim`	✅ Subject/predicate/value expectations
`must_contain`	✅ Claims that MUST be extracted (recall)
`must_not_contain`	✅ Claims that MUST NOT appear (precision)
`FixtureLoader`	✅ Load fixtures from directory tree
`CorpusManifest`	✅ Corpus metadata + baseline metrics
Validation	✅ Duplicate ID, empty content, missing expectations

# tests/llm_fixtures/tls/tls-001-disabled-verification.toml
[metadata]
id = "tls-001"
name = "TLS verification disabled in Python requests"
category = "tls"
language = "python"

[input]
filename = "api_client.py"
content = """
response = requests.get(url, verify=False)
"""

[expected]
must_contain = [
    { subject = "tls/cert_verification", predicate = "enabled", value = false }
]
must_not_contain = [
    { subject = "tls/cert_verification", predicate = "enabled", value = true }
]

7.8.2 Claim Matching ✅

Task	Status
`ClaimMatcher`	✅ `eval/matcher.rs` — Flexible claim comparison
Tail-path matching	✅ Last 2 segments for subject comparison
Type coercion	✅ Boolean↔string ("true"/"yes"), number↔string
Confidence thresholds	✅ Optional min_confidence per expectation
`count_false_positives()`	✅ Detect unexpected claims
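
The tail-path and type-coercion rules above might be sketched as follows. The function names are hypothetical, values are compared as strings for simplicity, and treating "no" as false is an assumption (the table only names "true"/"yes").

```rust
/// Returns the last `n` `/`-separated segments of a concept path.
fn tail(path: &str, n: usize) -> Vec<&str> {
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let start = segments.len().saturating_sub(n);
    segments[start..].to_vec()
}

/// Subjects match when their last two segments agree, regardless of any prefix.
pub fn subjects_match(expected: &str, actual: &str) -> bool {
    tail(expected, 2) == tail(actual, 2)
}

/// Maps boolean-like strings to "true"/"false"; other values are only trimmed.
fn canonical(value: &str) -> String {
    let trimmed = value.trim();
    match trimmed.to_ascii_lowercase().as_str() {
        "true" | "yes" => "true".to_string(),
        "false" | "no" => "false".to_string(),
        _ => trimmed.to_string(),
    }
}

/// Values match after coercion, so `false` and "No" compare equal.
pub fn values_match(expected: &str, actual: &str) -> bool {
    canonical(expected) == canonical(actual)
}
```

Matching on the tail lets a fixture expect `tls/cert_verification` while the extractor reports a longer path such as `code/python/tls/cert_verification`.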

7.8.3 Metrics Computation ✅

Task	Status
`Metrics`	✅ `eval/metrics.rs` — Aggregate evaluation metrics
Precision/Recall/F1	✅ Standard information retrieval metrics
Per-category breakdown	✅ Metrics by fixture category
Cost estimation	✅ Token-based cost tracking
`BaselineComparison`	✅ Compare current run to stored baseline
Regression detection	✅ Flag if F1/precision/recall drop > threshold
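
A small sketch of the metric and regression arithmetic described above. The type and function names are illustrative, not those in `eval/metrics.rs`, and the threshold is interpreted here as an absolute drop (0.05 = five points); the real comparison may be relative.

```rust
/// Confusion counts aggregated over all fixtures.
#[derive(Debug, Clone, Copy, Default)]
pub struct Counts {
    pub true_positives: u32,
    pub false_positives: u32,
    pub false_negatives: u32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Scores {
    pub precision: f64,
    pub recall: f64,
    pub f1: f64,
}

/// Division that yields 0.0 instead of NaN when the denominator is zero.
fn ratio(numerator: u32, denominator: u32) -> f64 {
    if denominator == 0 { 0.0 } else { f64::from(numerator) / f64::from(denominator) }
}

/// Precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = harmonic mean of the two.
pub fn scores(c: Counts) -> Scores {
    let precision = ratio(c.true_positives, c.true_positives + c.false_positives);
    let recall = ratio(c.true_positives, c.true_positives + c.false_negatives);
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    Scores { precision, recall, f1 }
}

/// True when any metric dropped by more than `threshold` compared to the baseline.
pub fn is_regression(baseline: Scores, current: Scores, threshold: f64) -> bool {
    baseline.precision - current.precision > threshold
        || baseline.recall - current.recall > threshold
        || baseline.f1 - current.f1 > threshold
}
```

For example, 8 true positives with 2 false positives and 2 false negatives give precision, recall, and F1 of about 0.8, which counts as a regression against a 0.9 baseline at the default threshold.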

7.8.4 Evaluation Harness ✅

Task	Status
`EvalHarness`	✅ `eval/harness.rs` — Orchestrates evaluation runs
`EvalMode::Live`	✅ Real LLM API calls
`EvalMode::Cached`	✅ Use cached responses (deterministic CI)
`EvalMode::Mock`	✅ No LLM, tests harness itself
`EvalVerdict`	✅ Pass, Regression, Review, Error
`update_baseline()`	✅ Save current metrics as new baseline

7.8.5 Report Generation ✅

Task	Status
`Report`	✅ `eval/report.rs` — Multi-format output
Table format	✅ Terminal tables with color-coded results
JSON format	✅ Machine-readable for CI/CD integration
Markdown format	✅ Documentation and PR comments
Failed fixture details	✅ Shows unmatched expectations with rationale

7.8.6 CLI Commands ✅

Task	Status
`aphoria eval run`	✅ Run evaluation against fixtures
`aphoria eval baseline`	✅ Show current baseline metrics
`aphoria eval update-baseline`	✅ Update baseline (--force required)
`aphoria eval list-fixtures`	✅ List available fixtures by category
`aphoria eval validate-fixtures`	✅ Validate fixture format
`--fail-on-regression`	✅ Exit code 1 if regression detected
`--threshold`	✅ Configurable regression threshold (default 5%)
`--mode`	✅ live, cached, or mock

# Run evaluation in mock mode
aphoria eval run --fixtures tests/llm_fixtures --mode mock

# CI: fail on regression
aphoria eval run --mode cached --fail-on-regression --threshold 0.05

# Update baseline after prompt improvements
aphoria eval update-baseline --fixtures tests/llm_fixtures --force

# List fixtures by category
aphoria eval list-fixtures --category tls

7.8.7 Seed Fixtures ✅

Category	Fixture	Description
tls	tls-001	Python requests verify=False
tls	tls-002	Node.js TLSv1 deprecated protocol
jwt	jwt-001	Algorithm 'none' allowed
jwt	jwt-002	Go WithoutClaimsValidation
secrets	secrets-001	Hardcoded API key
secrets	secrets-002	High-entropy JWT in config
auth	auth-001	Debug authentication bypass
negative	negative-001	Safe TLS config (no findings expected)
negative	negative-002	Env-loaded secrets (no findings expected)
edge	edge-001	Empty file edge case

Files: eval/mod.rs, eval/fixture.rs, eval/matcher.rs, eval/metrics.rs, eval/harness.rs, eval/report.rs, handlers/eval.rs, cli.rs, tests/llm_fixtures/

Documentation: docs/llm-optimization/ — Full optimization playbook with decision trees, research templates, and baseline tracking.

9.1 Autonomous Promotion ✅

Task	Description	Status
`AutonomousConfig`	Configuration with kill switch (enabled: false default)	✅
High-confidence threshold	Skip human review for >0.95 confidence	✅
Project threshold	Require >10 projects for autonomous	✅
Validation strictness	Zero failures, zero warnings required	✅
`should_auto_promote()`	Decision logic on `PromotionCandidate`	✅
`auto_promotion_blockers()`	Explains why pattern can't be auto-promoted	✅
`AutonomousAuditLog`	JSONL audit trail for all decisions	✅
`smart_auto_promote_all()`	Pipeline integration with audit logging	✅
YAML header enhancement	"AUTO-PROMOTED" + "Approved by: autonomous"	✅
CLI command	`aphoria extractors auto-promote [--dry-run]`	✅

Safety Features:

Kill switch: enabled: false by default (opt-in only)
Auditability: All decisions logged to ~/.aphoria/audit/autonomous-decisions.jsonl
Reversibility: Can delete YAML + reset pattern.promoted
Blast radius: One pattern = one YAML file
Traceability: YAML header shows approval source

Files: config/types/autonomous.rs, promotion/audit.rs, promotion/types.rs, promotion/pipeline.rs, promotion/writer.rs, handlers/extractors.rs

Configuration:

[autonomous]
enabled = true            # Master switch (default: false)
min_confidence = 0.95     # Stricter than standard 0.8
min_projects = 10         # Stricter than standard 5
require_zero_failures = true
require_zero_warnings = true
audit_log = true
audit_dir = "~/.aphoria/audit/"
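
The decision logic behind `should_auto_promote()` and `auto_promotion_blockers()` could look roughly like this sketch. The struct shapes are assumptions based on the `[autonomous]` keys above, and the thresholds are treated as inclusive minimums, which may differ from the real comparison.

```rust
/// Mirrors the `[autonomous]` keys shown above.
#[derive(Clone, Copy)]
pub struct AutonomousConfig {
    pub enabled: bool,
    pub min_confidence: f32,
    pub min_projects: usize,
    pub require_zero_failures: bool,
    pub require_zero_warnings: bool,
}

/// Hypothetical summary of a promotion candidate.
pub struct CandidateSummary {
    pub avg_confidence: f32,
    pub projects: usize,
    pub validation_failures: usize,
    pub validation_warnings: usize,
}

/// Lists every reason a candidate cannot be auto-promoted; empty means eligible.
pub fn auto_promotion_blockers(c: &CandidateSummary, cfg: &AutonomousConfig) -> Vec<String> {
    let mut blockers = Vec::new();
    if !cfg.enabled {
        blockers.push("autonomous mode is disabled (kill switch)".to_string());
    }
    if c.avg_confidence < cfg.min_confidence {
        blockers.push(format!("confidence {:.2} is below {:.2}", c.avg_confidence, cfg.min_confidence));
    }
    if c.projects < cfg.min_projects {
        blockers.push(format!("seen in {} projects, need {}", c.projects, cfg.min_projects));
    }
    if cfg.require_zero_failures && c.validation_failures > 0 {
        blockers.push(format!("{} validation failure(s)", c.validation_failures));
    }
    if cfg.require_zero_warnings && c.validation_warnings > 0 {
        blockers.push(format!("{} validation warning(s)", c.validation_warnings));
    }
    blockers
}

/// Auto-promotion is allowed only when nothing blocks it.
pub fn should_auto_promote(c: &CandidateSummary, cfg: &AutonomousConfig) -> bool {
    auto_promotion_blockers(c, cfg).is_empty()
}
```

Deriving the yes/no decision from the list of blockers keeps the two functions from drifting apart, and the same list can be written to the audit log.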

CLI Usage:

# Preview what would be auto-promoted
aphoria extractors auto-promote --dry-run

# Run autonomous promotion
aphoria extractors auto-promote

# Override thresholds
aphoria extractors auto-promote --min-confidence 0.97 --min-projects 15

9.2 Shadow Mode Testing ✅

Task	Description	Status
`ShadowConfig`	Configuration for shadow mode (min_scans, max_fp_rate, rollback_threshold)	✅
`ShadowTest`, `ShadowStatus`, `ShadowMetrics`	Core types for tracking shadow extractors	✅
`ShadowStore`	JSONL persistence for tests, matches, and decisions	✅
`ShadowExtractorRegistry`	Loads shadow extractors from learned/ directory	✅
`ShadowExecutor`	Runs shadow extractors during scans, stores matches separately	✅
`FeedbackCollector`	TP/FP feedback collection and metrics update	✅
`GraduationManager`	Shadow → production promotion and rollback logic	✅
CLI commands	`shadow-status`, `feedback`, `graduate`, `rollback`	✅

Safety Features:

Shadow isolation: Matches stored separately, not in production output
Metrics transparency: FP rate visible via shadow-status
Graduation gate: Must meet min_scans (100) + max_fp_rate (5%) + feedback exists
Manual control: rollback command for immediate removal
Audit trail: All decisions logged to decisions.jsonl

Files: shadow/mod.rs, shadow/types.rs, shadow/store.rs, shadow/registry.rs, shadow/executor.rs, shadow/feedback.rs, shadow/graduation.rs, handlers/shadow.rs, config/types/shadow.rs

Configuration:

[shadow]
enabled = true            # Shadow mode on by default
min_scans = 100           # Scans before graduation eligible
max_fp_rate = 0.05        # Maximum FP rate for graduation
rollback_threshold = 0.15 # FP rate that triggers rollback
retention_days = 30       # Days to retain shadow data
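
The graduation gate (enough scans, FP rate at or below the maximum, and at least some feedback) can be sketched as below. The `ShadowMetrics` fields and the `can_graduate` function are assumptions for illustration; the real type in `shadow/types.rs` may track more.

```rust
/// Hypothetical shadow-test counters.
pub struct ShadowMetrics {
    pub scans: u32,
    pub true_positives: u32,
    pub false_positives: u32,
}

impl ShadowMetrics {
    /// Number of matches that received TP/FP feedback.
    pub fn reviewed(&self) -> u32 {
        self.true_positives + self.false_positives
    }

    /// FP rate over reviewed matches; `None` until some feedback exists.
    pub fn fp_rate(&self) -> Option<f64> {
        match self.reviewed() {
            0 => None,
            reviewed => Some(f64::from(self.false_positives) / f64::from(reviewed)),
        }
    }
}

/// Graduation requires enough scans and a known FP rate at or below the maximum.
pub fn can_graduate(metrics: &ShadowMetrics, min_scans: u32, max_fp_rate: f64) -> bool {
    metrics.scans >= min_scans && metrics.fp_rate().is_some_and(|rate| rate <= max_fp_rate)
}
```

Returning `None` for the FP rate when nothing has been reviewed makes the "feedback exists" requirement fall out of the type, instead of treating an unreviewed extractor as having a 0% FP rate.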

CLI Usage:

# View shadow test status
aphoria extractors shadow-status [-v]

# Provide TP/FP feedback on matches
aphoria extractors feedback <test-name> [--limit 10]

# Graduate shadow test to production
aphoria extractors graduate <test-name> [--force]

# Rollback a shadow test
aphoria extractors rollback <test-name> --reason "too many FPs"

Tests: 44 tests covering types, store, registry, executor, feedback, graduation, and auto-rollback.

9.3 Auto-Rollback ✅

Task	Description	Status
`auto_rollback_enabled` config	Toggle to enable/disable auto-rollback (default: true)	✅
Feedback-time check	Auto-rollback triggered immediately after FP feedback	✅
`FeedbackWithRollback` return	`record_feedback()` returns rollback info	✅
`AutoRollbackResult`	Track checked count, rolled back names, errors	✅
CLI command	`aphoria extractors auto-check` for manual batch checking	✅
Audit trail	Decision logged as `ShadowDecisionKind::AutoRollback`	✅
YAML deletion	Extractor file deleted from learned/ on rollback	✅

Safety Features:

Toggle: auto_rollback_enabled can disable feature for testing or manual-only workflows
Threshold configurable: rollback_threshold in config (default: 15%)
Minimum reviews: Requires 10+ reviewed matches before auto-rollback triggers
Audit trail: All auto-rollback decisions logged to decisions.jsonl
CLI fallback: auto-check command for manual verification

Files: shadow/feedback.rs, shadow/graduation.rs, config/types/shadow.rs, handlers/shadow.rs, cli.rs

Configuration:

[shadow]
enabled = true
auto_rollback_enabled = true  # NEW: Enable automatic rollback (default: true)
rollback_threshold = 0.15     # FP rate that triggers auto-rollback
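
A minimal sketch of the auto-rollback decision, assuming the rule is "FP rate strictly above `rollback_threshold`, with at least 10 reviewed matches". `RollbackPolicy` and `should_auto_rollback` are hypothetical names; the real check lives in `shadow/feedback.rs` and `shadow/graduation.rs`.

```rust
/// Hypothetical policy combining the config keys with the minimum-review rule.
#[derive(Clone, Copy)]
pub struct RollbackPolicy {
    pub auto_rollback_enabled: bool,
    pub rollback_threshold: f64,
    pub min_reviews: u32,
}

/// Decides whether feedback so far should trigger an automatic rollback.
pub fn should_auto_rollback(true_positives: u32, false_positives: u32, policy: &RollbackPolicy) -> bool {
    let reviewed = true_positives + false_positives;
    // Too little feedback (or the toggle being off) never triggers a rollback.
    if !policy.auto_rollback_enabled || reviewed < policy.min_reviews {
        return false;
    }
    let fp_rate = f64::from(false_positives) / f64::from(reviewed);
    // Strictly greater: sitting exactly on the threshold does not roll back.
    fp_rate > policy.rollback_threshold
}
```

The minimum-review guard prevents a single early false positive (a 100% FP rate over one match) from removing an extractor before there is enough evidence.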

CLI Usage:

# Automatic: Rollback happens immediately when feedback pushes FP rate over threshold
aphoria extractors feedback <test-name> --limit 10
# If FP rate exceeds 15%, you'll see:
# ⚠️  AUTO-ROLLBACK TRIGGERED: <extractor-name>

# Manual batch check: Scan all active tests and rollback any over threshold
aphoria extractors auto-check
# Output: "⚠️  Auto-rolled back 1 of 5 shadow test(s): ..."

Tests: 3 new tests covering auto-rollback triggering, disabled toggle, and threshold boundary.

9.4 Cross-Project Learning ✅

Task	Description	Status
Hosted pattern sync	Patterns from all projects aggregate on server	✅
Global promotion	Promote patterns seen across many orgs	✅
Privacy preservation	Only normalized patterns shared, no code	✅
Opt-in distribution	Orgs can opt-in to receive community extractors	✅

Org A: Pattern seen in 3 projects → shared to hosted
Org B: Same pattern in 5 projects → shared to hosted
Org C: Same pattern in 4 projects → shared to hosted
        ↓
Hosted aggregates: 12 projects total
        ↓
Promotes to community extractor
        ↓
All orgs receive new extractor (if opted in)
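
The aggregation in the diagram above can be sketched as a sum of per-org project counts keyed by pattern hash. The `OrgReport` alias, the `community_candidates` function, and the threshold parameter are assumptions for illustration; the hosted server's actual promotion rule is not specified here.

```rust
use std::collections::HashMap;

/// One org's report: pattern hash mapped to the number of its projects containing the pattern.
pub type OrgReport = HashMap<String, usize>;

/// Sums project counts across orgs and returns the hashes that reach `min_projects`, sorted.
pub fn community_candidates(reports: &[OrgReport], min_projects: usize) -> Vec<String> {
    let mut totals: HashMap<&str, usize> = HashMap::new();
    for report in reports {
        for (hash, projects) in report {
            *totals.entry(hash.as_str()).or_insert(0) += *projects;
        }
    }
    let mut promoted: Vec<String> = totals
        .into_iter()
        .filter(|(_, total)| *total >= min_projects)
        .map(|(hash, _)| hash.to_string())
        .collect();
    promoted.sort(); // deterministic order for logs and tests
    promoted
}
```

With the three orgs from the diagram (3, 5, and 4 projects), the pattern totals 12 projects and would be promoted under a threshold of 10, even though no single org reaches that number alone.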

Implementation:

CrossProjectConfig with opt-in flags (contribute_patterns, receive_community)
PatternSyncer for uploading anonymized patterns to hosted server
CommunityExtractorLoader for pulling community extractors as YAML files
BLAKE3 hashing for pattern deduplication and org anonymization
Privacy guarantees: normalized_pattern shared, but NOT example_code or project_hashes
CLI commands: aphoria patterns sync, aphoria patterns status, aphoria patterns pull-community

Files: config/types/cross_project.rs, community/pattern_syncer.rs, community/extractor_loader.rs, handlers/patterns.rs

Tests: 7 new tests covering pattern hashing, subject exclusion, anonymization, and extractor loading.

9.5 Extractor Versioning ✅

Task	Description	Status
Version tracking	Track which version caught which issues	✅ `ExtractorVersion` + `VersionStore`
Changelog	Record changes between versions	✅ `ExtractorChangelog` + `ChangelogEntry`
Rollback support	Revert to previous version	✅ `aphoria extractors rollback-version`
A/B metrics	Compare versions side-by-side	✅ `aphoria extractors compare` + `compute_metrics_delta()`
CLI commands	versions, compare, rollback-version	✅ Full CLI implementation
Tests	Unit tests for all components	✅ 15+ version/changelog tests

Files:

promotion/version.rs - Core types (ExtractorVersion, ChangelogEntry, MetricsDelta, ExtractorChangelog, VersionStore)
promotion/writer.rs - Versioned YAML output (write_versioned())
promotion/types.rs - Version field in PromotionMetadata
handlers/extractors.rs - CLI handlers (handle_versions, handle_compare, handle_rollback_version)
cli.rs - CLI commands (Versions, Compare, RollbackVersion)

CLI Usage:

# List versions
aphoria extractors versions learned_tls_min_version
# Version History: learned_tls_min_version
# Version  Date         Changes
# ------------------------------------------------------------
# 2        2026-03-15   Added support for YAML configs
# 1        2026-02-01   Initial promotion from learned pattern

# Compare versions
aphoria extractors compare learned_tls_min_version -a 1 -b 2
# Comparison: learned_tls_min_version v1 vs v2
# Matches              +15%
# False Positives      -3%

# Rollback
aphoria extractors rollback-version learned_tls_min_version --version 1 --reason "v2 edge case bug"
# Rolled back learned_tls_min_version to v1

YAML Output:

# Generated from learned pattern. Review before editing.
# Pattern ID: a1b2c3d4-e5f6-7890-abcd-ef1234567890
# Version: 2 (previous: 1)
# Promoted: 2026-03-15 14:30:00 UTC

name: learned_tls_min_version
description: TLS minimum version set to deprecated value
version: 2
previous_version: 1
languages:
  - rust
  - go
pattern: '(?i)tls_?min_?(version)?\s*[:=]\s*["'']?(?P<value>1\.[01])["'']?'
claim:
  subject: tls/min_version
  predicate: version
  value_from_match: true
confidence: 0.97
metadata:
  source: learned
  pattern_id: a1b2c3d4-e5f6-7890-abcd-ef1234567890
  version: 2
changelog:
  - version: 2
    date: 2026-03-15
    changes: "Added support for YAML configs"
    metrics:
      matches: "+15%"
      false_positives: "-3%"
  - version: 1
    date: 2026-02-01
    changes: "Initial promotion from learned pattern"

9.6 Configuration ⬜

# aphoria.toml
[autonomous]
enabled = false                   # Opt-in to autonomous mode
min_confidence = 0.95             # Higher threshold for auto
min_projects = 10                 # More evidence required
shadow_scans = 100                # Scans before promotion
max_fp_rate = 0.05                # Maximum FP rate for graduation

[autonomous.distribution]
receive_community = true          # Receive community extractors
contribute_patterns = true        # Share patterns to community

Files: autonomous/mod.rs, autonomous/shadow.rs, autonomous/rollback.rs, autonomous/distribution.rs

Milestone Summary

Phase	Deliverable	Depends On	Status
0	ConceptPath in StemeDB	concept-hierarchy spec	✅
2	Aphoria CLI (scan, report, ack)	Phase 0	✅
2A	Concept matching (leaf, alias, auto-alias)	Phase 2	✅
1	Authoritative corpus expansion	Phase 0	✅
3	Claude Code skill + hooks	Phase 2A	✅
4.5	Ephemeral scan mode (40x faster)	Phase 2	✅
5	Research agent loop	Phase 3	✅
6	Federated Policy & Trust Packs	Phase 4.5	✅
6.5	Trust Pack Extensions (Predicate Aliases, Key Rotation)	Phase 6	✅
4A	Observational claims (Tier 4 write-back)	Phase 6	✅
4B	Self-conflict detection (drift)	Phase 4A	✅
4C	Diff-only scanning (--staged)	Phase 4B	✅
4E	Hosted mode (team aggregation)	Phase 4C	✅
4D	Enhanced ack (--reason, policy updates)	Phase 4C	✅
5.6	Community Corpus Contributions	Phase 4E	✅
7	Declarative Extractors	Phase 6	✅
7.5	LLM-in-the-Loop Extraction (Gemini)	Phase 7	✅
7.6	Pattern Learning Store	Phase 7.5	✅
7.7	Pattern → Extractor Promotion	Phase 7.6	✅
7.8	LLM Prompt Evaluation	Phase 7.5	✅
8	Enterprise Extractors (8.1-8.14)	Phase 7.5	✅
8.2	Framework-Specific Extractors (10 frameworks)	Phase 8	✅
9.1	Autonomous Promotion	Phase 8	✅
9.2	Shadow Mode Testing	Phase 9.1	✅
9.3	Auto-Rollback	Phase 9.2	✅
9.4	Cross-Project Learning	Phase 9.1	✅
9.5	Extractor Versioning	Phase 9.4	✅

Current state:

Phases 0-3, 4.5, 4A-4E, 5, 5.6, 6, 7, 7.5, 7.6, 7.7, 7.8, 8, 9.1, 9.2, 9.3, 9.4, 9.5 complete (clippy clean)
Full corpus: RFC, OWASP, Vendor sources
36 extractors including:
- Security: weak_crypto, command_injection, sql_injection, high_entropy_secrets, auth_bypass, insecure_cookies, path_traversal, unvalidated_redirects, weak_password, security_headers, insecure_deserialization, ssrf, orm_injection, xxe
- Framework-specific: django, express, flask, fastapi, nestjs, nextjs, spring, laravel, rails, aspnet
Trust Packs: signed policy bundles with import/export
Ephemeral mode: 40x faster for CI
Observation write-back: --sync records novel claims as Tier 4 project memory
Autonomous promotion: High-confidence patterns (>0.95, 10+ projects) can skip human review with full audit trail
Shadow mode testing: Auto-promoted extractors run in shadow mode to measure FP rate before graduation
Auto-rollback: Shadow extractors exceeding FP threshold (15%) are automatically rolled back
Drift detection: Detects changes from prior observations
Staged scanning: --staged flag for fast pre-commit hooks
Hosted mode: Team aggregation via central StemeDB server
Enhanced ack: --reason flag, aphoria update for policy changes
Community Corpus: Opt-in anonymous pattern sharing with privacy-preserving anonymization
Declarative Extractors: TOML-defined custom extractors without Rust code
LLM Extraction: Gemini-powered semantic claim extraction for high-value files
Pattern Learning: LLM-extracted claims recorded for promotion to declarative extractors
Pattern Promotion: CLI workflow to promote learned patterns to declarative extractors with Gemini regex generation and validation
LLM Prompt Evaluation: Golden fixtures with precision/recall metrics, baseline comparison, and regression detection for prompt engineering
Cross-Project Learning: Privacy-preserving pattern sync to hosted server, community extractor pull, BLAKE3-based deduplication, opt-in sharing with CrossProjectConfig
Extractor Versioning: Version tracking with changelogs, safe rollback to previous versions, A/B metrics comparison between versions via VersionStore

Phase 9 Complete! Autonomous Generation pipeline is fully self-improving.

The Self-Learning Vision

Phase 7: Declarative Extractors (foundation)           ✅ COMPLETE
    ↓
Phase 7.5: LLM-in-the-Loop (Gemini semantic extraction) ✅ COMPLETE
    ↓
Phase 7.6: Pattern Learning (remember what LLM finds)   ✅ COMPLETE
    ↓
Phase 7.7: Pattern Promotion (patterns → extractors)    ✅ COMPLETE
    ↓
Phase 7.8: LLM Prompt Evaluation (measure & improve)    ✅ COMPLETE
    ↓
Phase 8: Enterprise Extractors (36 total)              ✅ COMPLETE
    ├── 8.1: High-entropy secrets                      ✅
    ├── 8.2: Framework extractors (10 frameworks)      ✅
    ├── 8.3: Config deep parsing                       ✅
    └── 8.4-8.14: Security patterns                    ✅
    ↓
Phase 9: Autonomous Generation (fully self-improving)   ✅ COMPLETE
    ├── 9.1: Autonomous Promotion                        ✅ COMPLETE
    ├── 9.2: Shadow Mode Testing                         ✅ COMPLETE
    ├── 9.3: Auto-Rollback                               ✅ COMPLETE
    ├── 9.4: Cross-Project Learning                      ✅ COMPLETE
    └── 9.5: Extractor Versioning                        ✅ COMPLETE

The endgame: Every PR teaches Aphoria. After a month, it knows your security patterns better than your team does.

Bidirectional Knowledge Sync (Complete)

The pre-commit hook is now a bidirectional knowledge sync:

4A ✅: Record code claims as Tier 4 observations (project memory)
4B ✅: Detect drift from prior observations (self-conflict)
4C ✅: Fast diff-only scanning for pre-commit hooks (--staged)
4E ✅: Team aggregation via hosted StemeDB server
4D ✅: Enhanced ack with rationale and policy updates

This transforms Aphoria from a linter into a learning system that builds institutional memory per-project and collective intelligence across teams via hosted mode.

Phase 8: Enterprise Extractor Improvements ✅

Goal: Transform extractors from "toy examples" to enterprise-grade detection that catches real violations in production codebases.

Current State Audit

Extractor	Languages	Strengths	Weaknesses
`tls_verify`	8	Multi-lang, configs	Misses custom wrappers
`tls_version`	8	API patterns	Misses semantic (const = "1.0")
`hardcoded_secrets`	8	Placeholders, test files	No entropy detection
`weak_crypto`	5	MD5/SHA1/DES/RC4	SHA1 false positives, misses bcrypt cost
`sql_injection`	5	Interpolation patterns	Misses ORM unsafe methods
`jwt_config`	8	alg:none, skip sig	Library-specific gaps
`cors_config`	8	Wildcard + credentials	Misses dynamic origin reflection
`rate_limit`	8	Basic patterns	Limited depth
`timeout_config`	8	Basic patterns	Limited depth
`command_injection`	5	exec/system calls	Indirect injection
`dep_versions`	3	Version parsing	No CVE correlation

Enterprise Reality: Current extractors catch ~30% of real-world security misconfigurations. Config files are highest value (patterns consistent), code is lowest (semantic understanding required).

8.1 High-Entropy Secret Detection ✅

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task	Status
`HighEntropySecretsExtractor`	✅ `extractors/high_entropy_secrets.rs`
Shannon entropy algorithm	✅ `shannon_entropy()` with 4.5 threshold
Charset variety check	✅ 0.4 minimum variety ratio
Known secret prefixes	✅ AWS (AKIA), Stripe (sk_live_, sk_test_), GitHub (ghp_, gho_), GitLab (glpat-), Slack (xox[baprs]-)
High-entropy context patterns	✅ api_key, secret, token, credential, auth_key contexts
False positive exclusions	✅ UUIDs, git SHAs (40-char hex), file hashes (64-char hex)
Test file confidence reduction	✅ 0.6 confidence for test files
Tests	✅ 10+ tests covering all patterns

Configuration:

# aphoria.toml
[extractors.entropy]
min_entropy = 4.5          # Shannon entropy threshold
min_charset_variety = 0.4  # Unique chars / length ratio
min_length = 20            # Minimum string length
max_length = 200           # Maximum string length

Languages: Rust, Go, Python, JavaScript, TypeScript, YAML, TOML, JSON, Dotenv
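
As a rough illustration of how the entropy and charset-variety checks could combine, here is a minimal sketch. `shannon_entropy()` is named in the task table, but its signature here is an assumption, and `EntropyConfig` and `is_high_entropy` are hypothetical names. Known-prefix matching and the UUID/hash exclusions are left out.

```rust
use std::collections::{HashMap, HashSet};

/// Mirrors `[extractors.entropy]`; the struct itself is hypothetical.
pub struct EntropyConfig {
    pub min_entropy: f64,
    pub min_charset_variety: f64,
    pub min_length: usize,
    pub max_length: usize,
}

impl Default for EntropyConfig {
    fn default() -> Self {
        Self { min_entropy: 4.5, min_charset_variety: 0.4, min_length: 20, max_length: 200 }
    }
}

/// Shannon entropy in bits per character over the string's own
/// character frequencies.
pub fn shannon_entropy(s: &str) -> f64 {
    let total = s.chars().count();
    if total == 0 {
        return 0.0;
    }
    let mut counts: HashMap<char, usize> = HashMap::new();
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
    }
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

/// True when the candidate passes the length, variety, and entropy gates.
pub fn is_high_entropy(s: &str, cfg: &EntropyConfig) -> bool {
    let len = s.chars().count();
    if len < cfg.min_length || len > cfg.max_length {
        return false;
    }
    let unique: HashSet<char> = s.chars().collect();
    let variety = unique.len() as f64 / len as f64;
    variety >= cfg.min_charset_variety && shannon_entropy(s) >= cfg.min_entropy
}
```

One property worth keeping in mind: a string with k distinct characters has entropy of at most log2(k) bits, so a 4.5 threshold can only be reached by strings with at least 23 distinct characters. Hex digests (16 symbols, at most 4 bits) therefore fall below the threshold without any special-casing in this sketch.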

8.2 Framework-Specific Extractors ✅

Impact: HIGH | Effort: HIGH | Status: Complete

Research Document: docs/architecture/framework-security-extractors.md

All 10 framework-specific extractors implemented and tested:

Framework	Extractor	Languages	Tests
Spring Boot	`spring_security`	Java, YAML, Properties	7
Django	`django_security`	Python	7
Express.js	`express_security`	JavaScript, TypeScript	5
Rails	`rails_security`	Ruby, YAML	6
ASP.NET Core	`aspnet_security`	C# (via regex), JSON	6
Laravel	`laravel_security`	PHP (via regex)	5
FastAPI	`fastapi_security`	Python	5
Next.js	`nextjs_security`	JavaScript, TypeScript	5
Flask	`flask_security`	Python	6
NestJS	`nestjs_security`	TypeScript	5

Total: 10 extractors, 57+ tests, 100+ patterns

Files: extractors/{django,express,flask,fastapi,nestjs,nextjs,spring,laravel,rails,aspnet}_security.rs

8.2.1 Spring Boot Security

# application.yml misconfigs
security:
  basic:
    enabled: false      # Auth disabled
  csrf:
    enabled: false      # CSRF disabled
  headers:
    frame-options: DISABLE  # Clickjacking

// Java code patterns
@EnableWebSecurity
public class Config extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http.csrf().disable();  // CSRF disabled
        http.authorizeRequests().antMatchers("/**").permitAll();  // Auth bypass
    }
}

8.2.2 Django Security

# settings.py misconfigs
DEBUG = True  # Debug in production
ALLOWED_HOSTS = ['*']  # All hosts
CSRF_COOKIE_SECURE = False  # Insecure cookies
SESSION_COOKIE_SECURE = False

8.2.3 Express.js Security

// Patterns checked in Express apps
app.use(helmet());  // Expected: flagged when this call is missing
app.use(cors({ origin: '*', credentials: true }));  // Flagged: wildcard origin with credentials
app.disable('x-powered-by');  // Expected: flagged when this call is missing

8.2.4 Rails Security

# config/environments/production.rb
config.force_ssl = false  # Should be true
config.action_dispatch.cookies_same_site_protection = :none

8.3 Config File Deep Parsing ✅

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task	Status
`ConfigValue` enum	✅ `extractors/config_parser.rs`
YAML/JSON/TOML parsers	✅ Using `serde_yaml`, `serde_json`, `toml`
Tree walker with path tracking	✅ `walk_config()` with dot-path
`ConfigSecurityExtractor`	✅ `extractors/config_security.rs`
Security rules (11 rules)	✅ TLS, CSRF, debug, password, cookies, CORS, rate limit
Dev file exclusion	✅ Skip debug warnings in dev/test configs
Tests	✅ 26 tests for parsing + security rules

Patterns now caught (nested to any depth):

*.tls.verify: false — TLS verification disabled
*.insecure_skip_verify: true — Skip verification enabled
*.security.enabled: false — Security disabled
*.csrf.enabled: false — CSRF protection disabled
debug: true — Debug mode (only in production files)
*.password.min_length < 8 — Weak password policy
*.cookie.secure: false — Cookie secure flag disabled
*.cookie.httpOnly: false — Cookie httpOnly disabled
*.cors.allow_origin: "*" — CORS allows all origins
*.rate_limit.enabled: false — Rate limiting disabled

Languages: YAML, JSON, TOML
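
The tree walk with dot-path tracking can be sketched as follows. `ConfigValue` and `walk_config()` are named in the task table, but the variants, signature, and the three rules shown here are assumptions for illustration. A real implementation would build the tree from `serde_yaml`, `serde_json`, or `toml` values rather than by hand.

```rust
/// Format-independent config tree (variants are an assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum ConfigValue {
    Bool(bool),
    Int(i64),
    Str(String),
    List(Vec<ConfigValue>),
    Map(Vec<(String, ConfigValue)>),
}

/// Depth-first walk that records every leaf with its dot-path,
/// e.g. `services.api.tls.verify` or `servers[0].tls.verify`.
pub fn walk_config<'a>(
    value: &'a ConfigValue,
    path: &str,
    out: &mut Vec<(String, &'a ConfigValue)>,
) {
    match value {
        ConfigValue::Map(entries) => {
            for (key, child) in entries {
                let child_path = if path.is_empty() {
                    key.clone()
                } else {
                    format!("{path}.{key}")
                };
                walk_config(child, &child_path, out);
            }
        }
        ConfigValue::List(items) => {
            for (index, child) in items.iter().enumerate() {
                walk_config(child, &format!("{path}[{index}]"), out);
            }
        }
        leaf => out.push((path.to_string(), leaf)),
    }
}

/// Applies suffix rules such as `*.tls.verify: false` at any nesting depth.
pub fn find_violations(root: &ConfigValue) -> Vec<String> {
    // (path suffix, offending value, message): a small illustrative subset.
    const RULES: [(&str, bool, &str); 3] = [
        ("tls.verify", false, "TLS verification disabled"),
        ("insecure_skip_verify", true, "Skip verification enabled"),
        ("csrf.enabled", false, "CSRF protection disabled"),
    ];
    let mut leaves = Vec::new();
    walk_config(root, "", &mut leaves);

    let mut findings = Vec::new();
    for (path, value) in &leaves {
        for (suffix, bad, message) in RULES {
            let hit = path.as_str() == suffix
                || path.ends_with(format!(".{suffix}").as_str());
            if hit && **value == ConfigValue::Bool(bad) {
                findings.push(format!("{path}: {message}"));
            }
        }
    }
    findings
}
```

Matching on the path suffix rather than the full path is what lets a rule fire regardless of how deeply the key is nested.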

8.4 Semantic TLS Version Detection ✅

Impact: MEDIUM | Effort: MEDIUM | Status: Complete

Task	Status
Add `Language::Terraform` variant	✅ `types/language.rs`
Semantic pattern (cross-language)	✅ Catches `TLS_MIN_VERSION = "1.0"` with type annotations
Environment variable pattern	✅ `.env` files with `TLS_MIN_VERSION=1.0`
Terraform HCL pattern	✅ `min_tls_version = "TLS1_0"`
Kubernetes camelCase pattern	✅ `minTLSVersion: VersionTLS10`
False positive prevention	✅ TLS 1.2/1.3 not flagged
Tests	✅ 16 new tests (27 total for TLS extractor)

Patterns now caught:

const TLS_MIN_VERSION: &str = "1.0"; (Rust with type annotation)
let sslVersion = "TLSv1"; (JavaScript camelCase)
TLS_MINIMUM_VERSION = "1.1" (Python assignment)
TLS_MIN_VERSION=1.0 (dotenv)
export SSL_VERSION=TLSv1 (shell export)
min_tls_version = "TLS1_0" (Terraform)
minTLSVersion: VersionTLS10 (Kubernetes YAML)

Languages: Rust, Go, Python, TypeScript, JavaScript, Yaml, Toml, Json, Terraform, Dotenv
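
The patterns above spell the same version in several ways (`"1.0"`, `TLSv1`, `TLS1_0`, `VersionTLS10`). One way to handle that, sketched here under assumptions, is to normalize the captured value before comparing it. Both function names are hypothetical, and the real extractor may well do this inside its regexes instead.

```rust
/// Normalizes a captured TLS version string to (major, minor).
/// Handles the spellings listed above; anything else yields `None`.
pub fn normalize_tls_version(raw: &str) -> Option<(u8, u8)> {
    let lower = raw
        .trim()
        .trim_matches(|c: char| c == '"' || c == '\'')
        .to_ascii_lowercase();
    // Strip the textual prefixes: "versiontls10" -> "10", "tlsv1" -> "1".
    let stripped = lower
        .trim_start_matches("version")
        .trim_start_matches("tls")
        .trim_start_matches('v');
    let parts: Vec<&str> = stripped.split(|c: char| c == '.' || c == '_').collect();
    match parts.as_slice() {
        // "1" means 1.0
        [major] if major.len() == 1 => Some((major.parse().ok()?, 0)),
        // Compact form: "10" means 1.0, "12" means 1.2
        [compact] if compact.len() == 2 && compact.is_ascii() => {
            let (major, minor) = compact.split_at(1);
            Some((major.parse().ok()?, minor.parse().ok()?))
        }
        [major, minor] => Some((major.parse().ok()?, minor.parse().ok()?)),
        _ => None,
    }
}

/// TLS 1.0 and 1.1 are flagged; 1.2 and 1.3 are not.
pub fn is_deprecated_tls(raw: &str) -> bool {
    matches!(normalize_tls_version(raw), Some((1, 0)) | Some((1, 1)))
}
```

Unrecognized values return `None` and are not flagged, which errs on the side of avoiding false positives.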

8.5 ORM SQL Injection Detection ✅

Impact: MEDIUM | Effort: MEDIUM | Status: Complete

Task	Status
`OrmInjectionExtractor`	✅ `extractors/orm_injection.rs`
Django .raw() with interpolation	✅ `f"SELECT..."`, `.format()` patterns
Django .extra() with interpolation	✅ `where=["...{}...".format()]`
SQLAlchemy text() with interpolation	✅ `text(f"SELECT...")`
SQLAlchemy execute() with f-string	✅ `execute(f"...")`
Sequelize raw query	✅ sequelize.query(`...${...}`)
TypeORM where()	✅ .where(`...${...}`)
GORM Raw() with Sprintf	✅ `.Raw(fmt.Sprintf(...))`
Prisma $queryRawUnsafe	✅ $queryRawUnsafe(`...${...}`)
Tests	✅ 8+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go

The base sql_injection extractor catches raw string interpolation but misses ORM escape hatches such as:

# SQLAlchemy
db.execute(text(f"SELECT * FROM users WHERE id = {user_id}"))
User.query.filter(text("name = '" + name + "'"))

# Django
User.objects.raw("SELECT * FROM users WHERE id = %s" % user_id)
User.objects.extra(where=["name = '%s'" % name])

// Sequelize
sequelize.query(`SELECT * FROM users WHERE id = ${userId}`);
Model.findAll({ where: sequelize.literal(`id = ${id}`) });

// Prisma
prisma.$queryRawUnsafe(`SELECT * FROM users WHERE id = ${id}`);

# ActiveRecord
User.where("name = '#{name}'")
User.find_by_sql("SELECT * FROM users WHERE id = #{id}")

8.6 Authentication Bypass Patterns ✅

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task	Status
`AuthBypassExtractor`	✅ `extractors/auth_bypass.rs`
Hardcoded admin credentials	✅ `username == "admin" && password == "..."` patterns
Debug auth headers	✅ X-Debug-Auth, X-Internal-Auth, X-Admin-Auth
Skip auth env vars	✅ SKIP_AUTH, BYPASS_AUTH, NO_AUTH, DEBUG_AUTH
Backdoor patterns	✅ `if username == "backdoor"`, `if user == "test"`
Default credentials	✅ admin/admin, root/root, test/test, guest/guest
Test file confidence reduction	✅ 0.5 confidence for test files
Tests	✅ 11+ tests covering all patterns

Detected patterns:

# Hardcoded credentials
if username == "admin" and password == "admin":

# Debug auth headers
if request.headers.get("X-Debug-Auth") == "secret":

# Skip auth env vars
if os.environ.get("SKIP_AUTH") == "true":

Languages: Python, JavaScript, TypeScript, Go, Rust
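
The test-file confidence reduction could look roughly like this. The path heuristics and both function names are assumptions, and the sketch caps confidence at the test-file value rather than overwriting it, which is one possible reading of "0.5 confidence for test files".

```rust
/// Heuristic test-file detection (the exact rules are an assumption).
pub fn is_test_file(path: &str) -> bool {
    let normalized = path.to_ascii_lowercase().replace('\\', "/");
    let file_name = normalized.rsplit('/').next().unwrap_or("");
    normalized
        .split('/')
        .any(|segment| segment == "tests" || segment == "test" || segment == "__tests__")
        || file_name.starts_with("test_")
        || file_name.contains("_test.")
        || file_name.contains(".test.")
        || file_name.contains(".spec.")
}

/// Caps confidence for findings in test files; other files are unchanged.
pub fn adjusted_confidence(base: f64, path: &str, test_file_cap: f64) -> f64 {
    if is_test_file(path) {
        base.min(test_file_cap)
    } else {
        base
    }
}
```

Hardcoded credentials are common and mostly harmless in fixtures, so lowering confidence instead of suppressing the finding keeps it visible without blocking a commit.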

8.7 Insecure Deserialization ✅

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task	Status
`InsecureDeserializationExtractor`	✅ `extractors/insecure_deserialization.rs`
Python pickle (critical)	✅ `pickle.load()`, `pickle.loads()`, `Unpickler()`
Python yaml.load without SafeLoader	✅ Detects missing SafeLoader
Python marshal	✅ `marshal.load()`, `marshal.loads()`
Python eval/exec with user input	✅ `eval(request...)`, `exec(user...)`
JavaScript node-serialize	✅ `require('node-serialize')`, `.unserialize()`
Go gob decoder	✅ `gob.NewDecoder()`, `gob.Decode()`
Java ObjectInputStream (polyglot)	✅ `ObjectInputStream`, `readObject()`
Tests	✅ 10+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go

Unsafe deserialization of untrusted data:

# Python
pickle.loads(user_input)
yaml.load(user_input)  # Without Loader=SafeLoader
eval(user_input)
exec(user_input)

// Java
ObjectInputStream ois = new ObjectInputStream(userInput);
ois.readObject();  // Dangerous!

# Ruby
Marshal.load(user_input)
YAML.load(user_input)  # Should use safe_load

8.8 Path Traversal Patterns ✅

Impact: MEDIUM | Effort: LOW | Status: Complete

Task	Status
`PathTraversalExtractor`	✅ `extractors/path_traversal.rs`
Python open/read/write with user input	✅ `open(request...)`, `read(params...)`
Python os.path.join with user input	✅ `os.path.join(base, request...)`
JavaScript fs operations	✅ `fs.readFile(req...)`, `fs.writeFile(params...)`
JavaScript path.join/resolve	✅ `path.join(base, req.query...)`
JavaScript res.sendFile	✅ `res.sendFile(req.params...)`
Go filepath operations	✅ `filepath.Join(base, r...)`, `os.Open(req...)`
Rust path operations	✅ `Path::new(request...)`, `std::fs::read(user...)`
Traversal literals	✅ `../`, `%2e%2e` URL-encoded patterns
Tests	✅ 8+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go, Rust

File operations with user input:

# Python
open(user_input)
os.path.join(base, user_input)  # Doesn't prevent ../
shutil.copy(user_input, dest)

// JavaScript
fs.readFile(userInput)
path.join(base, userInput)  // Doesn't prevent ../
res.sendFile(userInput)
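
For the traversal-literal row (`../` and URL-encoded `%2e%2e`), a minimal sketch is to percent-decode once and then look for a `..` path segment. The function names are hypothetical, and double-encoded input such as `%252e` is deliberately out of scope here.

```rust
fn hex_value(byte: u8) -> Option<u8> {
    (byte as char).to_digit(16).map(|digit| digit as u8)
}

/// Decodes `%XX` escapes in a single pass; malformed escapes are kept as-is.
pub fn percent_decode(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_value(bytes[i + 1]), hex_value(bytes[i + 2])) {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}

/// True when the (decoded) input contains a `..` path segment.
pub fn contains_traversal_literal(input: &str) -> bool {
    percent_decode(input)
        .split(|c: char| c == '/' || c == '\\')
        .any(|segment| segment == "..")
}
```

Checking whole segments instead of the substring `..` avoids flagging file names like `report..final.pdf`.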

8.9 SSRF Patterns ✅

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task	Status
`SsrfExtractor`	✅ `extractors/ssrf.rs`
Python requests library	✅ `requests.get(url)`, `requests.post(target)`
Python urllib	✅ `urllib.request.urlopen(url)`
Python httpx	✅ `httpx.get(url)`, `AsyncClient`
JavaScript fetch	✅ `fetch(url)`, `fetch(req.query...)`
JavaScript axios	✅ `axios.get(url)`, `axios.post(target)`
JavaScript got	✅ `got(url)`
Go http.Get/Post	✅ `http.Get(url)`, `http.NewRequest(...)`
Rust reqwest	✅ `reqwest::get(url)`, `reqwest::Client`
URL sink patterns	✅ `proxy_url`, `webhook_url`, `callback_url` from request
Tests	✅ 10+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go, Rust

HTTP requests with user-controlled URLs:

# Python
requests.get(user_url)
urllib.request.urlopen(user_input)

// JavaScript
fetch(userUrl)
axios.get(userUrl)
http.get(userUrl)

// Go
http.Get(userURL)
client.Do(req)  // Where req.URL is user-controlled

8.10 Missing Security Headers ✅

Impact: MEDIUM | Effort: LOW | Status: Complete

Task	Status
`SecurityHeadersExtractor`	✅ `extractors/security_headers.rs`
X-Frame-Options disabled	✅ `X-Frame-Options: none`, `ALLOWALL`
X-Content-Type-Options disabled	✅ `X-Content-Type-Options: disabled`
X-XSS-Protection disabled	✅ `X-XSS-Protection: false`
Django SECURE_* settings	✅ `SECURE_BROWSER_XSS_FILTER = False`, etc.
YAML headers disabled	✅ `x_frame_options: false`, `hsts: no`
CSP disabled or unsafe	✅ `unsafe-inline`, `unsafe-eval` directives
HSTS disabled	✅ `Strict-Transport-Security: none`, `hsts_seconds = 0`
Tests	✅ 7+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go, YAML, JSON, TOML

Detect when security headers are explicitly removed or not set:

# Response headers missing
response.headers.pop('X-Content-Type-Options')
response.headers['X-Frame-Options'] = 'ALLOWALL'

// Express without helmet
app.use(cors());  // CORS without other security
// No app.use(helmet()) found

8.11 Insecure Cookie Flags ✅

Impact: MEDIUM | Effort: LOW | Status: Complete

Task	Status
`InsecureCookiesExtractor`	✅ `extractors/insecure_cookies.rs`
Missing Secure flag	✅ `secure=False`, `secure: false`
Missing HttpOnly flag	✅ `httponly=False`, `httpOnly: false`
SameSite=None without Secure	✅ `sameSite: 'none'`, `SameSite=None`
Django settings	✅ SESSION_COOKIE_SECURE, CSRF_COOKIE_SECURE = False
Go cookie patterns	✅ `Secure: false`, `HttpOnly: false`
Rust actix-web patterns	✅ `.secure(false)`, `.http_only(false)`
Test file confidence reduction	✅ 0.5 confidence for test files
Tests	✅ 8+ tests covering all patterns

Detected patterns:

# Python/Flask/Django
response.set_cookie('session', value, secure=False)
SESSION_COOKIE_SECURE = False

// JavaScript/Express
res.cookie('session', value, { httpOnly: false });
res.cookie('auth', value, { sameSite: 'none' });

Languages: Python, JavaScript, TypeScript, Go, Rust, Ruby, YAML

8.12 Unvalidated Redirects ✅

Impact: MEDIUM | Effort: LOW | Status: Complete

Task	Status
`UnvalidatedRedirectsExtractor`	✅ `extractors/unvalidated_redirects.rs`
Python redirect with user input	✅ `redirect(request.GET['next'])`, `HttpResponseRedirect(url)`
Python Flask redirect	✅ `redirect(request.args.get(...))`
JavaScript res.redirect	✅ `res.redirect(req.query.next)`
JavaScript window.location	✅ `window.location = url`, `location.href = params...`
Go http.Redirect	✅ `http.Redirect(w, r, r.Query...)`
URL parameter patterns	✅ `redirect_url`, `return_url`, `next`, `goto` from request
Tests	✅ 7+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go

Open redirect vulnerabilities:

# Python
return redirect(request.args.get('next'))
return redirect(request.GET['url'])

// JavaScript
res.redirect(req.query.redirect);
window.location = userInput;
window.location.href = params.url;

8.13 XXE (XML External Entity) ✅

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task	Status
`XxeExtractor`	✅ `extractors/xxe.rs`
Python lxml/etree	✅ `etree.parse()`, `lxml.fromstring()`
Python xml.etree.ElementTree	✅ `ET.parse()`, `ET.fromstring()`
Python xml.dom.minidom	✅ `minidom.parse()`, `minidom.parseString()`
Python xml.sax	✅ `xml.sax.parse()`, `xml.sax.make_parser()`
JavaScript xml2js	✅ `xml2js.parseString()`, `xml2js.Parser()`
JavaScript libxmljs	✅ `libxmljs.parseXml()`
Go encoding/xml	✅ `xml.Unmarshal()`, `xml.NewDecoder()`
Java patterns (polyglot)	✅ `DocumentBuilderFactory`, `SAXParser`, `XMLReader`
DTD entity declarations	✅ `<!ENTITY ... SYSTEM>`, `<!ENTITY ... PUBLIC>`
defusedxml detection	✅ Lower confidence when defusedxml is imported
Tests	✅ 9+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go

Unsafe XML parsing:

# Python
etree.parse(user_input)  # Without disabling entities
xml.etree.ElementTree.parse(user_input)

// Java
DocumentBuilderFactory.newInstance()  // Without setFeature to disable XXE
SAXParserFactory.newInstance()  // Without secure processing

8.14 Weak Password Requirements ✅

Impact: MEDIUM | Effort: LOW | Status: Complete

Task	Status
`WeakPasswordExtractor`	✅ `extractors/weak_password.rs`
Minimum length < 8	✅ `password_min_length: 6`, `minLength: 4`
Bcrypt cost < 10	✅ `bcrypt_cost = 8`, `hash_rounds = 5`
Simple length checks	✅ `len(password) >= 6` in code
Complexity disabled	✅ `require_special_chars: false`, `require_uppercase = false`
Number requirement disabled	✅ `require_numbers: no`, `require_digit = 0`
Tests	✅ 7+ tests covering all patterns

Languages: Python, JavaScript, TypeScript, Go, Rust, YAML, JSON, TOML

Password validation that's too weak:

# Python
if len(password) >= 4:  # Too short
if len(password) >= 6:  # Still weak
MIN_PASSWORD_LENGTH = 6  # Config too low

// JavaScript
if (password.length >= 4)
const MIN_LENGTH = 6;
/^.{4,}$/  // Regex allows 4+ chars
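
Once a key and numeric value have been extracted, the two numeric rules in the table (minimum length below 8, bcrypt cost below 10) reduce to a small check. This is a sketch under assumptions: the key-matching heuristic, `PasswordFinding`, and `check_password_setting` are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PasswordFinding {
    MinLengthTooShort(u32),
    BcryptCostTooLow(u32),
}

// Thresholds taken from the task table above.
const MIN_PASSWORD_LENGTH: u32 = 8;
const MIN_BCRYPT_COST: u32 = 10;

/// `password_min_length`, `minLength`, and `MIN_PASSWORD_LENGTH` all
/// normalize to lowercase alphanumerics so one rule covers every spelling.
fn normalize_key(key: &str) -> String {
    key.chars()
        .filter(|c| c.is_ascii_alphanumeric())
        .map(|c| c.to_ascii_lowercase())
        .collect()
}

pub fn check_password_setting(key: &str, value: u32) -> Option<PasswordFinding> {
    let k = normalize_key(key);
    let is_length_key = k.contains("min") && k.contains("length");
    let is_cost_key = k == "bcryptcost" || k == "hashrounds";
    if is_length_key && value < MIN_PASSWORD_LENGTH {
        Some(PasswordFinding::MinLengthTooShort(value))
    } else if is_cost_key && value < MIN_BCRYPT_COST {
        Some(PasswordFinding::BcryptCostTooLow(value))
    } else {
        None
    }
}
```

The length heuristic is intentionally loose (it also matches a bare `minLength`), so a real extractor would likely combine it with surrounding context to limit false positives.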

8.15 LLM-Assisted Extraction ✅ (delivered in Phase 7.5)

Impact: VERY HIGH | Effort: VERY HIGH

Use Claude to understand code semantically:

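// Pseudo-implementation: `claude_api`, `ExtractedClaim`, and
// `parse_claims_from_llm_response` are placeholders, not real APIs.
async fn extract_with_llm(
    code: &str,
    file: &str,
) -> Result<Vec<ExtractedClaim>, Box<dyn std::error::Error>> {
    let prompt = format!(
        "Analyze this code for security issues. Return JSON with:\n\
         - concept_path: security concept (e.g., 'tls/cert_verification')\n\
         - predicate: what aspect (e.g., 'enabled')\n\
         - value: the value found\n\
         - confidence: 0.0-1.0\n\
         - description: why this is an issue\n\n\
         File: {}\n\
         Code:\n```\n{}\n```",
        file, code
    );

    // `?` needs a Result return type, so API errors propagate to the caller.
    let response = claude_api.message(&prompt).await?;
    // Assumed to return the same Result type as this function.
    parse_claims_from_llm_response(&response)
}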

When to use:

High-value files (auth, crypto, config)
After regex extractors find nothing
For code review mode (not CI)

Considerations:

Cost per scan
Latency
Rate limits
Privacy (code leaves machine)

Implementation Priority

Phase	Extractors	Impact	Effort	Enterprise Value	Status
8.1	High-entropy secrets	HIGH	MEDIUM	Catches real leaked secrets	✅
8.2	Framework-specific	HIGH	HIGH	Spring/Django/Express coverage	✅
8.3	Config deep parsing	HIGH	MEDIUM	Nested YAML/JSON understanding	✅
8.4	Semantic TLS	MEDIUM	MEDIUM	Catches const TLS_MIN = "1.0"	✅
8.5	ORM SQL injection	MEDIUM	MEDIUM	SQLAlchemy, Django, Sequelize	✅
8.6	Auth bypass	HIGH	MEDIUM	Backdoors, hardcoded creds	✅
8.7	Deserialization	HIGH	MEDIUM	pickle, Marshal, eval	✅
8.8	Path traversal	MEDIUM	LOW	../../../etc/passwd	✅
8.9	SSRF	HIGH	MEDIUM	Internal network access	✅
8.10	Security headers	MEDIUM	LOW	Missing helmet(), CSP	✅
8.11	Cookie flags	MEDIUM	LOW	httpOnly, secure, sameSite	✅
8.12	Open redirects	MEDIUM	LOW	Phishing via redirect	✅
8.13	XXE	HIGH	MEDIUM	XML entity injection	✅
8.14	Weak passwords	MEDIUM	LOW	MIN_LENGTH = 4	✅
8.15	LLM extraction	VERY HIGH	VERY HIGH	Semantic understanding	✅ (Phase 7.5)

Phase 8 Complete (8.1-8.14): All extractors implemented including 10 framework-specific extractors (Spring, Django, Express, Rails, ASP.NET, Laravel, FastAPI, Next.js, Flask, NestJS).

Success Metrics

Metric	Current	Target	How to Measure
Detection rate (known vulns)	~30%	>70%	Run against OWASP benchmark
False positive rate	Unknown	<10%	Manual review of 100 findings
Config file coverage	Regex only	Full parse	Structure-aware extraction
Framework coverage	0	4 major	Spring, Django, Express, Rails
Enterprise pilot feedback	N/A	>4/5	Post-pilot survey

Phase 10: UX & Enterprise Polish ⬜

Goal: Address enterprise buyer feedback from pilot demos. Close gaps between pitch claims and actual functionality. Source: Skeptical buyer review of applications/aphoria-pitch/ materials.

10.1 Acknowledgment Expiry ✅

Impact: HIGH | Effort: MEDIUM | Priority: P1

Add --expires flag to aphoria ack command for time-limited exceptions.

Task	Status
Add `expires_at: Option<String>` to `AcknowledgmentInfo` struct (ISO 8601 format)	✅
Add `--expires` CLI flag to `Commands::Ack` in `cli.rs`	✅
Parse durations: `--expires 90d`, `--expires 2026-12-31` (ISO 8601 date only)	✅
Filter expired acks in `check_conflicts()`	✅
Show "Ack expired, resurfaces as BLOCK" in output	✅
Add expiry to JSON export for audit trail	✅
Tests for expiry parsing and behavior	✅

Implementation Notes:

Created src/expiry.rs module with parse_expiry(), is_expired(), and format_expiry() functions
Ack payloads stored as JSON with {reason, expires_at} for backwards compatibility
Legacy plain-text acks treated as permanent (no expiry)
Expired acks preserved for audit trail per patent claim 25
Updated all report formatters (table, JSON, markdown) to show expiry info
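
To make the expiry rules concrete, here is a minimal std-only sketch. `parse_expiry()` and `is_expired()` are named above, but the signatures, the `Expiry` enum, and the choice that an ack is still valid on its expiry date are assumptions. The real module may use a date library instead of the hand-rolled day count.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Expiry {
    /// Relative duration such as `90d`, counted from the ack's creation date.
    Days(i64),
    /// Absolute ISO 8601 date such as `2026-12-31`.
    Date { year: i64, month: i64, day: i64 },
}

fn is_leap_year(year: i64) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

fn days_in_month(year: i64, month: i64) -> i64 {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 if is_leap_year(year) => 29,
        2 => 28,
        _ => 0,
    }
}

/// Days since 1970-01-01 in the proleptic Gregorian calendar.
pub fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400);
    let shifted_month = (month + 9) % 12; // March = 0
    let day_of_year = (153 * shifted_month + 2) / 5 + day - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    era * 146_097 + day_of_era - 719_468
}

pub fn parse_expiry(input: &str) -> Result<Expiry, String> {
    let s = input.trim();
    if let Some(number) = s.strip_suffix('d') {
        let days: i64 = number.parse().map_err(|_| format!("invalid duration: {s}"))?;
        if days <= 0 {
            return Err(format!("duration must be positive: {s}"));
        }
        return Ok(Expiry::Days(days));
    }
    let parts: Vec<&str> = s.split('-').collect();
    if parts.len() != 3 || parts[0].len() != 4 || parts[1].len() != 2 || parts[2].len() != 2 {
        return Err(format!("expected <N>d or YYYY-MM-DD: {s}"));
    }
    let year: i64 = parts[0].parse().map_err(|_| format!("invalid year: {s}"))?;
    let month: i64 = parts[1].parse().map_err(|_| format!("invalid month: {s}"))?;
    let day: i64 = parts[2].parse().map_err(|_| format!("invalid day: {s}"))?;
    if !(1..=12).contains(&month) || day < 1 || day > days_in_month(year, month) {
        return Err(format!("date out of range: {s}"));
    }
    Ok(Expiry::Date { year, month, day })
}

/// `created` and `today` are (year, month, day). Assumption: the ack is
/// still valid on its expiry date and expires the day after.
pub fn is_expired(expiry: Expiry, created: (i64, i64, i64), today: (i64, i64, i64)) -> bool {
    let deadline = match expiry {
        Expiry::Days(days) => days_from_civil(created.0, created.1, created.2) + days,
        Expiry::Date { year, month, day } => days_from_civil(year, month, day),
    };
    days_from_civil(today.0, today.1, today.2) > deadline
}
```

Passing `today` in explicitly keeps the check deterministic, which makes the expiry behavior straightforward to cover in tests.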

CLI changes (cli.rs):

Ack {
    concept_path: String,
    #[arg(short, long)]
    reason: String,
    /// Optional expiry (e.g., "90d", "2026-12-31")
    #[arg(long)]
    expires: Option<String>,
},

Usage:

# Expire after 90 days
aphoria ack code://go/auth/tls/cert_verification \
  --reason "Integration test environment" \
  --expires 90d

# Expire on specific date (ISO 8601)
aphoria ack code://go/auth/tls/cert_verification \
  --reason "Legacy migration - ends Q4" \
  --expires 2026-12-31

Output after expiry:

BLOCK  code://go/auth/tls/cert_verification
       Your code:  TLS certificate verification is disabled (main.go:12)
       Note:       Previous acknowledgment expired 2026-12-31
       Action:     Re-acknowledge or fix the issue

Enterprise Value: "Exceptions don't become permanent." SOC 2 auditors love time-limited exceptions because they force periodic review.

10.2 Human-Readable Signer Names ⬜

Impact: MEDIUM | Effort: MEDIUM | Priority: P2

Map issuer hex IDs to human-readable team names in output.

Task	Status
Add `signer_name: Option<String>` to `PackHeader`	⬜
Add `contact: Option<String>` to `PackHeader` (Slack channel, email)	⬜
Update `policy export/import` to preserve new fields	⬜
Show "Signed by Platform Security Team" instead of hex in output	⬜
Show contact info in conflict output	⬜
Backward-compat: gracefully handle packs without new fields	⬜

Output with signer name:

BLOCK  code://go/auth/tls/cert_verification
       Your code:  TLS certificate verification is disabled (main.go:12)
       Source:     Acme Security Standard v3.2 (Platform Security Team)
       Contact:    #security-policy
       Action:     Fix or acknowledge with: aphoria ack <path> --reason "..."

Enterprise Value: Developers know who to contact. Auditors see clear attribution.

10.3 Speed Benchmarks ⬜

Impact: LOW | Effort: LOW | Priority: P3

Document and automate speed benchmark testing.

Task	Status
Create `benchmarks/` directory with test corpora	⬜
Automate `time aphoria scan` on standard corpus	⬜
Document test conditions in benchmark results	⬜
Add `aphoria scan --benchmark` flag for self-test	⬜
Include benchmarks in CI (optional, non-blocking)	⬜

Usage:

# Run benchmark on current directory
aphoria scan --benchmark

# Output includes timing breakdown
Benchmark Results:
  Files scanned:     767
  Lines of code:     187,918
  Claims extracted:  722
  Conflicts found:   186
  Total time:        652ms
    - File discovery:  45ms
    - Extraction:      487ms
    - Conflict query:  120ms

Enterprise Value: "Show me the benchmark on a 100K-line codebase" → aphoria scan --benchmark

Phase 10 Completion Criteria

Metric	Target
Ack expiry working with 90d default	✓
Demo output matches pitch slides exactly	✓
Buyer can see who signed a policy (name, not hex)	✓
Buyer can see how to contact policy owner	✓
Speed benchmarks documented and reproducible	✓

Phase 11: Evidence-Based Authority 🎯

Vision: Authority comes from evidence, not titles. Merit over tenure.

Problem: All patterns treated equally. A random commit carries the same weight as a pattern backed by RFC research and product specs.

Principle: The system rewards documentation, not tenure.

Evidence Levels

Level	Example	Authority Weight	Graduation Threshold
ProductSpec	`specs/api-design.md → REQ-API-001`	0.95	1 usage
Standard	RFC 7519, OWASP A03:2021	0.85	3 usages
Research	ADR-042, docs/decision-log.md	0.70	5 usages
Commit	Just code, no context	0.40	10 usages

11.1 Evidence Level Types ⬜

Task	Status
Create `src/evidence/mod.rs` module	⬜
Define `EvidenceLevel` enum (Commit, Research, Standard, ProductSpec)	⬜
Implement `authority_weight()` method	⬜
Add evidence level to `LearnedPattern` struct	⬜
Update pattern display to show evidence level	⬜

11.2 Evidence Source Detection ⬜

Task	Status
Create `EvidenceSource` enum	⬜
Implement commit message parsing for RFC/standard references	⬜
Implement ADR file detection (docs/adr/*.md patterns)	⬜
Implement spec file detection (specs/*.md, *.spec.md)	⬜
Add `PatternEvidence::detect()` auto-detection	⬜
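
The detection tasks above could be approximated with plain string checks, as in this sketch. The `EvidenceSource` variants, `detect_evidence`, and the strongest-source-wins ordering (spec, then standard, then ADR) are assumptions. Only RFC references are recognized here; OWASP identifiers and other standards are omitted.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EvidenceSource {
    Spec(String),
    Standard(String),
    Adr(String),
    CommitOnly,
}

/// Finds the first `RFC <digits>` / `RFC-<digits>` reference in a commit message.
fn find_rfc_reference(message: &str) -> Option<String> {
    let upper = message.to_ascii_uppercase();
    let mut rest = upper.as_str();
    while let Some(pos) = rest.find("RFC") {
        let after = rest[pos + 3..].trim_start_matches(|c: char| c == ' ' || c == '-');
        let digits: String = after.chars().take_while(|c| c.is_ascii_digit()).collect();
        if !digits.is_empty() {
            return Some(format!("RFC {digits}"));
        }
        rest = &rest[pos + 3..];
    }
    None
}

/// Classifies a touched file as a spec or ADR based on its path.
fn classify_path(path: &str) -> Option<EvidenceSource> {
    let p = path.replace('\\', "/").to_ascii_lowercase();
    if !p.ends_with(".md") {
        return None;
    }
    if p.ends_with(".spec.md") || p.starts_with("specs/") || p.contains("/specs/") {
        return Some(EvidenceSource::Spec(path.to_string()));
    }
    if p.starts_with("docs/adr/") || p.contains("/docs/adr/") {
        return Some(EvidenceSource::Adr(path.to_string()));
    }
    None
}

/// Returns the strongest evidence found for a commit.
pub fn detect_evidence(commit_message: &str, touched_paths: &[&str]) -> EvidenceSource {
    let mut adr: Option<EvidenceSource> = None;
    for path in touched_paths {
        match classify_path(path) {
            Some(EvidenceSource::Spec(p)) => return EvidenceSource::Spec(p),
            Some(EvidenceSource::Adr(p)) => {
                adr.get_or_insert(EvidenceSource::Adr(p));
            }
            _ => {}
        }
    }
    if let Some(rfc) = find_rfc_reference(commit_message) {
        return EvidenceSource::Standard(rfc);
    }
    adr.unwrap_or(EvidenceSource::CommitOnly)
}
```

Falling back to `CommitOnly` when nothing is found keeps undocumented patterns at the lowest authority level by default.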

11.3 Evidence-Aware Graduation ⬜

Task	Status
Update `GraduationManager` thresholds based on evidence	⬜
ProductSpec: 1 usage → promotion candidate	⬜
Standard: 3 usages → promotion candidate	⬜
Research: 5 usages → promotion candidate	⬜
Commit-only: 10 usages → promotion candidate	⬜
Add evidence boost to shadow mode evaluation	⬜
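
The Evidence Levels table maps directly onto an enum. This sketch uses the weights and thresholds from that table; `EvidenceLevel` and `authority_weight()` are named in 11.1, while `graduation_threshold()` and `is_promotion_candidate()` are hypothetical helper names.

```rust
/// Ordered from weakest to strongest so that derived `Ord` ranks by authority.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EvidenceLevel {
    Commit,
    Research,
    Standard,
    ProductSpec,
}

impl EvidenceLevel {
    /// Authority weight from the Evidence Levels table.
    pub fn authority_weight(self) -> f64 {
        match self {
            EvidenceLevel::ProductSpec => 0.95,
            EvidenceLevel::Standard => 0.85,
            EvidenceLevel::Research => 0.70,
            EvidenceLevel::Commit => 0.40,
        }
    }

    /// Usages required before a pattern becomes a promotion candidate.
    pub fn graduation_threshold(self) -> u32 {
        match self {
            EvidenceLevel::ProductSpec => 1,
            EvidenceLevel::Standard => 3,
            EvidenceLevel::Research => 5,
            EvidenceLevel::Commit => 10,
        }
    }

    pub fn is_promotion_candidate(self, usages: u32) -> bool {
        usages >= self.graduation_threshold()
    }
}
```

Keeping both numbers on the enum means a `GraduationManager` only needs a pattern's evidence level and usage count to decide candidacy.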

11.4 Evidence Display ⬜

Task	Status
Update `aphoria patterns show` to display evidence chain	⬜
Show evidence level badge in table/JSON output	⬜
Show linked sources (ADR, spec, RFC) in conflict output	⬜
Add `--evidence` flag to filter patterns by evidence level	⬜

Phase 11 Completion Criteria

Metric	Target
Evidence detection working for 4 source types	✓
Graduation thresholds vary by evidence level	✓
Pattern display shows evidence chain	✓
ProductSpec-backed patterns graduate with 1 usage	✓

Phase 12: Knowledge Scope Hierarchy ⬜

Vision: Knowledge applies at the right level - org, team, or project.

Problem: All knowledge exists at one flat level. No way to say "this applies org-wide" vs "this is just our team's preference."

Scope Levels

Organization Level (applies to all teams)
├── Security policies (TLS, auth, secrets) - NO opt-out
├── Compliance requirements (GDPR, SOC 2)
└── Architecture decisions (API gateway, event bus)

Team Level (applies to team's projects)
├── Coding conventions (naming, error handling)
├── Technology choices (frameworks, libraries)
└── Domain patterns (payment flows, user lifecycle)

Project Level (applies to single project)
├── Local overrides (justified exceptions)
├── Experimental patterns (not yet proven)
└── Context-specific decisions

12.1 Scope Level Types ⬜

Task	Status
Create `src/scope/mod.rs` module	⬜
Define `ScopeLevel` enum (Organization, Team, Project)	⬜
Add `scope_level` and `scope_id` to `LearnedPattern`	⬜
Add `ScopeConfig` to `.aphoria.toml`	⬜
Implement `--scope` flag for CLI commands	⬜

12.2 Scope Inheritance ⬜

Task	Status
Implement inheritance resolution (project → team → org)	⬜
Security policies: auto-apply, no opt-out	⬜
Conventions: auto-apply, teams can override with justification	⬜
Observations: never inherited, team-specific only	⬜
Add `ScopedKnowledge` struct with `inherited_from` chain	⬜
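
One possible reading of the inheritance rules, resolved from a project's point of view, is sketched below. `ScopeLevel` and `ScopedKnowledge` are named above, but the fields, the `KnowledgeTier` enum, and `resolve` are assumptions. Override justification and the `inherited_from` chain are omitted.

```rust
/// Ordered from narrowest to broadest scope.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ScopeLevel {
    Project,
    Team,
    Organization,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KnowledgeTier {
    Policy,
    Convention,
    Observation,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ScopedKnowledge {
    pub key: String,
    pub value: String,
    pub tier: KnowledgeTier,
    pub scope: ScopeLevel,
}

impl ScopedKnowledge {
    pub fn new(key: &str, value: &str, tier: KnowledgeTier, scope: ScopeLevel) -> Self {
        Self { key: key.to_string(), value: value.to_string(), tier, scope }
    }
}

/// Resolves `key` as seen from a project:
/// - policies cannot be opted out of, so the broadest policy wins;
/// - otherwise the narrowest visible entry wins (project, then team, then org);
/// - observations are never inherited, so only project-level ones are visible.
pub fn resolve<'a>(key: &str, entries: &'a [ScopedKnowledge]) -> Option<&'a ScopedKnowledge> {
    let visible: Vec<&ScopedKnowledge> = entries
        .iter()
        .filter(|e| e.key == key)
        .filter(|e| e.tier != KnowledgeTier::Observation || e.scope == ScopeLevel::Project)
        .collect();
    let policy = visible
        .iter()
        .copied()
        .filter(|e| e.tier == KnowledgeTier::Policy)
        .max_by_key(|e| e.scope);
    policy.or_else(|| visible.iter().copied().min_by_key(|e| e.scope))
}
```

In this reading a project-level convention can shadow a team or org convention but never an org policy, which is the "no opt-out" behavior described for security policies.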

12.3 Scope Override Workflow ⬜

Task	Status
Implement `aphoria scope override` command	⬜
Require justification for overrides	⬜
Require evidence link (spec, ADR, ticket) for overrides	⬜
Store override audit trail	⬜
Show overrides in SOC 2 reports	⬜

12.4 Cross-Scope Queries ⬜

Task	Status
`aphoria patterns --scope org` (org-level only)	⬜
`aphoria patterns --scope team --exclude-inherited`	⬜
`aphoria patterns --scope project --only-local`	⬜
Show scope in pattern list output	⬜

Phase 12 Completion Criteria

Metric	Target
3 scope levels working (org/team/project)	✓
Inheritance resolution correct	✓
Overrides require justification + evidence	✓
Cross-scope queries functional	✓

Phase 13: Knowledge Lifecycle Management ⬜

Vision: Knowledge ages. Patterns can be deprecated and superseded.

Problem: Knowledge exists forever. No way to deprecate patterns or track evolution.

Knowledge Status

Active       → Pattern is current, enforced
Deprecated   → Pattern is being phased out, migration guidance provided
Superseded   → Pattern replaced by another, link to replacement
Archived     → Pattern removed from active use, historical only

13.1 Knowledge Status Types ⬜

Task	Status
Create `src/lifecycle/mod.rs` module	⬜
Define `KnowledgeStatus` enum	⬜
Add `Deprecated` variant with reason, superseded_by, sunset_date	⬜
Add `KnowledgeLifecycle` struct with status history	⬜
Store lifecycle in pattern metadata	⬜
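
One possible shape for these types is sketched below. The four variants mirror the Knowledge Status list, and the `Deprecated` fields come from the task table. Everything else is an assumption for illustration: the `replacement` field name, keeping the sunset date as ISO-8601 text, and modelling status history as an append-only vector.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum KnowledgeStatus {
    /// Pattern is current and enforced.
    Active,
    /// Pattern is being phased out.
    Deprecated {
        reason: String,
        superseded_by: Option<String>,
        /// ISO-8601 date kept as text in this sketch.
        sunset_date: Option<String>,
    },
    /// Pattern replaced by another; `replacement` is a hypothetical field name.
    Superseded { replacement: String },
    /// Historical only.
    Archived,
}

/// Status history for one pattern; the last entry is the current status.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KnowledgeLifecycle {
    history: Vec<KnowledgeStatus>,
}

impl KnowledgeLifecycle {
    /// Every pattern starts out as `Active`.
    pub fn new() -> Self {
        Self { history: vec![KnowledgeStatus::Active] }
    }

    pub fn current(&self) -> &KnowledgeStatus {
        self.history.last().expect("history is never empty")
    }

    /// Appends a new status without discarding earlier ones.
    pub fn transition(&mut self, next: KnowledgeStatus) {
        self.history.push(next);
    }

    pub fn history(&self) -> &[KnowledgeStatus] {
        &self.history
    }
}
```

Keeping the full history rather than a single status field is what would let later phases report how a pattern evolved.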

13.2 Deprecation Command ⬜

Task	Status
Implement `aphoria deprecate <pattern-id>` command	⬜
Require `--reason` flag	⬜
Optional `--superseded-by <new-pattern>`	⬜
Optional `--sunset-date <ISO-8601>`	⬜
Notify connected teams on deprecation	⬜

13.3 Migration Guidance ⬜

Task	Status
Show deprecation warning in scan output	⬜
Link to superseding pattern when available	⬜
Show migration guide/ADR when linked	⬜
FLAG (not BLOCK) deprecated pattern usage	⬜
Track migration progress across projects	⬜

13.4 Migration Tracking Dashboard ⬜

Task	Status
Implement `aphoria migrations status` command	⬜
Show progress by team (X/Y endpoints migrated)	⬜
Show days remaining until sunset	⬜
Show blockers (acknowledged exceptions)	⬜
Export migration status for reporting	⬜

Phase 13 Completion Criteria

Metric	Target
Deprecation command working	✓
Deprecated patterns show warning in scan	✓
Migration tracking across projects	✓
SOC 2 report includes migration status	✓

Phase 14: Governance Workflows ⬜

Vision: Clear approval paths for pattern promotion with audit trails.

Problem: Governance is binary: a pattern either goes through manual review or auto-promotes above the 0.95 threshold. There are no structured approval workflows.

14.1 Approval Workflow Definition ⬜

Task	Status
Create `src/governance/mod.rs` module	⬜
Define `ApprovalWorkflow` struct	⬜
Define `ApprovalStage` with required approvers	⬜
Support evidence-based auto-approve thresholds	⬜
Config: define workflows in `.aphoria.toml`	⬜

14.2 Approval State Machine ⬜

Task	Status
Implement state transitions (pending → approved/rejected)	⬜
Multi-stage approval support	⬜
Timeout and escalation policies	⬜
Store approval history with timestamps	⬜
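
The pending → approved/rejected transitions with multiple stages could be modelled as a pure function over an enum, as in the sketch below. All names (`ApprovalState`, `Decision`, `step`) are illustrative assumptions, and timeouts, escalation, and timestamped history are left out.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ApprovalState {
    /// Waiting on the approvers of the zero-based `stage`.
    Pending { stage: usize },
    Approved,
    Rejected { reason: String },
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Decision {
    Approve,
    Reject(String),
}

/// Applies one decision to the current state.
/// `total_stages` is the number of stages in the workflow (assumed >= 1).
/// Returns an error when the state is already terminal.
pub fn step(
    state: ApprovalState,
    decision: Decision,
    total_stages: usize,
) -> Result<ApprovalState, String> {
    match (state, decision) {
        (ApprovalState::Pending { stage }, Decision::Approve) => {
            if stage + 1 >= total_stages {
                // The last stage approved: the whole workflow is approved.
                Ok(ApprovalState::Approved)
            } else {
                Ok(ApprovalState::Pending { stage: stage + 1 })
            }
        }
        // A rejection at any stage ends the workflow.
        (ApprovalState::Pending { .. }, Decision::Reject(reason)) => {
            Ok(ApprovalState::Rejected { reason })
        }
        (terminal, _) => Err(format!("no transition from terminal state {:?}", terminal)),
    }
}
```

Because `step` takes and returns values, each call can be logged with the approver identity and a timestamp to build the audit trail.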

14.3 Approval CLI ⬜

Task	Status
`aphoria governance pending` - list pending approvals	⬜
`aphoria governance approve <id> --comment "..."`	⬜
`aphoria governance reject <id> --reason "..."`	⬜
`aphoria governance escalate <id>`	⬜
Show approval status in pattern list	⬜

14.4 SOC 2 Audit Trail ⬜

Task	Status
Full audit log for all governance actions	⬜
`aphoria audit trail --pattern <id>` - show timeline	⬜
Export governance history for auditors	⬜
Include approver identity and timestamp	⬜

Phase 14 Completion Criteria

Metric	Target
Multi-stage approval working	✓
Approval/reject with comments	✓
Full audit trail exportable	✓
SOC 2 evidence includes approval chain	✓

Phase 15: Evidence Source Integration ⬜

Vision: ADRs, specs, and standards automatically link to patterns.

Problem: Evidence sources aren't automatically detected. Developers must manually reference them.

15.1 ADR Auto-Detection ⬜

Task	Status
Create `src/evidence/adr.rs`	⬜
Detect ADR-XXX patterns in commit messages	⬜
Scan for ADR files in standard locations	⬜
Parse ADR content for related patterns	⬜
Link ADR to patterns automatically	⬜
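
Detecting ADR references in a commit message can be done with the standard library alone. The sketch below assumes that "ADR-XXX" means the literal prefix `ADR-` followed by one or more ASCII digits; `find_adr_refs` is a hypothetical helper name, and the scan does not check word boundaries (so `MADR-12` would also match).

```rust
/// Returns ADR identifiers such as "ADR-012" found in `message`,
/// in order of first appearance and without duplicates.
pub fn find_adr_refs(message: &str) -> Vec<String> {
    const PREFIX: &str = "ADR-";
    let mut refs: Vec<String> = Vec::new();
    let mut rest = message;

    while let Some(pos) = rest.find(PREFIX) {
        // PREFIX is ASCII, so this slice always falls on a char boundary.
        let after = &rest[pos + PREFIX.len()..];
        let digits: String = after.chars().take_while(|c| c.is_ascii_digit()).collect();
        if !digits.is_empty() {
            let id = format!("{}{}", PREFIX, digits);
            if !refs.contains(&id) {
                refs.push(id);
            }
        }
        // Continue scanning after the prefix that was just handled.
        rest = after;
    }
    refs
}
```

The returned identifiers could then be matched against ADR files found in the standard locations.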

15.2 Spec File Detection ⬜

Task	Status
Create `src/evidence/spec.rs`	⬜
Detect spec files (`specs/*.md`, `*.spec.md`)	⬜
Parse requirement IDs (REQ-XXX)	⬜
Link requirements to patterns	⬜
Show requirement coverage in reports	⬜

15.3 Standard Reference Extraction ⬜

Task	Status
Create `src/evidence/standards.rs`	⬜
Parse RFC references (RFC 7519)	⬜
Parse OWASP references (OWASP A03:2021)	⬜
Parse NIST references (NIST SP 800-53)	⬜
Auto-link to authoritative corpus	⬜
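
The three reference formats in the table (RFC 7519, OWASP A03:2021, NIST SP 800-53) can be recognized with a simple token scan. This is a sketch under stated assumptions: the `StandardRef` type and `extract_standard_refs` function are illustrative names, and the scan only handles the exact spellings shown in the examples, with the keyword and identifier separated by whitespace.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StandardRef {
    Rfc(u32),
    /// For example "A03:2021".
    Owasp(String),
    /// For example "800-53".
    NistSp(String),
}

/// Strips surrounding punctuation such as "(", ")", ";" or ".".
fn trim_punct(token: &str) -> &str {
    token.trim_matches(|c: char| !c.is_ascii_alphanumeric())
}

/// Extracts standard references from free text, in order of appearance.
pub fn extract_standard_refs(text: &str) -> Vec<StandardRef> {
    let tokens: Vec<&str> = text.split_whitespace().map(trim_punct).collect();
    let mut refs = Vec::new();

    for (i, tok) in tokens.iter().enumerate() {
        let next = tokens.get(i + 1).copied();
        match (*tok, next) {
            ("RFC", Some(n)) => {
                if let Ok(num) = n.parse::<u32>() {
                    refs.push(StandardRef::Rfc(num));
                }
            }
            ("OWASP", Some(id)) if id.starts_with('A') && id.contains(':') => {
                refs.push(StandardRef::Owasp(id.to_string()));
            }
            ("NIST", Some("SP")) => {
                if let Some(id) = tokens.get(i + 2).copied() {
                    if id.starts_with(|c: char| c.is_ascii_digit()) {
                        refs.push(StandardRef::NistSp(id.to_string()));
                    }
                }
            }
            _ => {}
        }
    }
    refs
}
```

Each extracted reference could serve as a lookup key when auto-linking to the authoritative corpus.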

15.4 Evidence Display ⬜

Task	Status
Show full evidence chain in pattern output	⬜
Link to source files (ADR, spec)	⬜
Show external standard references	⬜
`aphoria patterns --by-evidence` grouping	⬜

Phase 15 Completion Criteria

Metric	Target
ADR auto-detection working	✓
Spec file linking working	✓
Standard references extracted	✓
Evidence chain visible in output	✓

Enterprise Pilot Success Metrics

90-Day Pilot Targets

Metric	Target	Measurement
Patterns captured	100+ observations	Count in knowledge graph
Patterns promoted	10+ conventions	Count with status=Active
Cross-team adoption	2+ teams connected	Unique team_ids
New hire guidance events	5+ accepted suggestions	Accept rate tracking
False positive rate	<10%	FP feedback / total flags
Evidence-backed patterns	>50%	Patterns with Research+ evidence

180-Day Production Targets

Metric	Target	Measurement
Knowledge retention	0 lost patterns on departures	Audit log
Onboarding velocity	50% faster ramp	Time to first PR
Convention adoption	80% across org	Compliance rate
SOC 2 evidence	Audit pass	External validation
Deprecated pattern migration	90% complete by sunset	Migration tracking

Enterprise Simulation UAT

See: uat/enterprise-simulation-uat.md

6-month simulation covering:

Month 1: Platform team adopts, baseline patterns captured
Month 2: Payments team joins, cross-team patterns emerge
Month 3: New hire guided by existing patterns
Month 4: Mobile team joins, org-level promotion
Month 5: API versioning deprecated, migration tracked
Month 6: SOC 2 audit evidence generated
