stemedb/applications/aphoria/roadmap.md

Aphoria Roadmap


Phase 0: StemeDB Foundation

Tracked in: roadmap.md § 5D. Concept Hierarchy

Changes to the core database that Aphoria depends on. Shipped as Phase 5D of the main StemeDB roadmap.

| Aphoria Phase 0 | StemeDB Phase 5D | Status |
|---|---|---|
| 0.1 ConceptPath Type | 5D.1 ConceptPath Type | |
| 0.2 ConceptPath in Assertion | (implicit in 5D.1) | |
| 0.3 Hierarchical Index | 5D.4 Hierarchical Query | |
| 0.4 Alias Store | 5D.3 Alias Store + 5D.5 Alias Resolution | |
| 0.5 Source Class Inference | 5D.6 Source Class Inference | |
| 0.6 Concept API Endpoints | 5D.7 Concept API Endpoints | |

Spec: docs/specs/concept-hierarchy.md


Phase 2: CLI Core

Phase 2 was built before Phase 1 (authoritative corpus expansion). The CLI pipeline works end-to-end with a bootstrapped corpus of 11 hardcoded assertions covering TLS, JWT, CORS, secrets, and rate limiting.

| Task | Status |
|---|---|
| 2.1 Project Walker | walker/mod.rs, walker/path_mapper.rs, walker/language.rs |
| 2.2 Extractors (10) | tls_verify, jwt_config, hardcoded_secrets, timeout_config, dep_versions, cors_config, rate_limit, weak_crypto, command_injection, sql_injection |
| 2.3 Ingestion Bridge | bridge.rs: BLAKE3 hashing, Ed25519 signing, claim→assertion conversion |
| 2.4 Conflict Query | episteme.rs: LocalEpisteme with check_conflicts() |
| 2.5 Report Output | report/: table (comfy-table), JSON, SARIF 2.1.0, markdown |
| 2.6 Acknowledge Command | lib.rs acknowledge() |
| Baseline & Diff | lib.rs set_baseline(), show_diff() |
| Status Command | lib.rs show_status() |

183 tests pass. Clippy and fmt clean.

Phase 2 Code Quality Fixes

Code review improvements to extractors:

| Issue | Fix | Status |
|---|---|---|
| DES/RC4 concept path misclassification | Split check_pattern() into check_hash_pattern() and check_encryption_pattern(); DES/RC4 now use crypto/encryption/algorithm path | |
| SHA1 edge case undocumented | Added comments and test documenting that SHA1 detection is intentionally broad (triggers for git hashes, etc.) | |
| JS exec() regex overly broad | Tightened regex to require child_process. prefix or non-word/non-dot preceding character; prevents RegExp.exec() false positives | |

Phase 2A: Concept Matching

Status: Complete. Tail-path matching (2A.1), alias-aware queries (2A.2), and auto-alias creation (2A.3) all implemented.

2A.1 Leaf-Based Concept Matching (Aphoria-side fix)

Implemented in episteme.rs via ConceptIndex:

  • make_key(subject, predicate) extracts tail 2 path segments + predicate
  • build(assertions) creates in-memory index keyed by tail path
  • lookup(subject, predicate) finds matching authoritative assertions
  • check_conflicts() uses ConceptIndex instead of QueryEngine for cross-scheme matching

Integration tests prove TLS and JWT conflicts are detected correctly.
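
A minimal sketch of how a tail-path key could be built, assuming subjects look like `scheme://a/b/c` and that the key joins the last two segments with the predicate. The `#` separator and the exact signature are assumptions for illustration; the real `ConceptIndex::make_key` may differ.

```rust
/// Hypothetical tail-path key: last two path segments plus the predicate.
/// Not the actual ConceptIndex code; the "#" separator is an assumption.
pub fn make_key(subject: &str, predicate: &str) -> String {
    // Drop the scheme ("code://", "rfc://") when present.
    let path = subject.split_once("://").map_or(subject, |(_, rest)| rest);
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    // Keep at most the last two segments so different schemes can meet.
    let start = segments.len().saturating_sub(2);
    format!("{}#{}", segments[start..].join("/"), predicate)
}
```

With this shape, a code claim and an RFC assertion that share the `tls/cert_verification` tail map to the same key, which is what allows cross-scheme matching.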

2A.2 Alias Resolution in QueryEngine (StemeDB-side fix)

Wired AliasStore into QueryEngine.execute():

  • Added resolve_aliases: bool field to Query (defaults to false)
  • Added alias_store: Option<Arc<dyn AliasStore>> to QueryEngine
  • Added .with_alias_store() builder method
  • When resolve_aliases: true, expands subject via AliasStore.resolve_all() before index lookup
  • Added fetch_by_subjects() and fetch_by_subjects_predicate() for multi-subject deduplication
  • Modified Query.matches() to skip subject filtering when aliases are resolved
  • Skips fast path (MV lookup) when resolve_aliases: true
  • Gracefully degrades when no alias store is configured

7 unit tests in engine/tests/alias_resolution.rs. This is the architecturally correct long-term fix that complements leaf matching.

2A.3 Auto-Alias Creation

When Aphoria ingests authoritative assertions and code claims that share leaf names, automatically create aliases:

  • code://rust/myapp/tls/cert_verification → rfc://5246/tls/cert_verification
  • code://rust/myapp/auth/jwt/audience_validation → rfc://7519/jwt/audience_validation

This bridges 2A.1 (leaf matching) with 2A.2 (alias resolution) — leaf matching identifies candidates, aliases persist the relationship.

Implementation:

  • Added auto_create_aliases: bool config option to AliasConfig (defaults to true)
  • Added AliasOrigin::AutoDetected variant to stemedb-core for tracking auto-created aliases
  • Wired GenericAliasStore into LocalEpisteme for alias persistence
  • In check_conflicts(), when a code claim matches an authoritative claim by leaf, calls AliasStore.set_alias() to persist the relationship with AliasOrigin::AutoDetected
  • Alias creation is idempotent (skips if alias already exists)
  • 4 unit tests verify: alias creation on conflict, no creation when disabled, correct origin, idempotency
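
The idempotent alias creation and alias expansion described above can be pictured with a small in-memory store. This is a sketch under assumptions: `InMemoryAliasStore` and the `Manual` variant are hypothetical, and the method signatures only borrow the names `set_alias` and `resolve_all` from the text.

```rust
use std::collections::{BTreeSet, HashMap};

/// `AutoDetected` comes from the roadmap; `Manual` is a hypothetical counterpart.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AliasOrigin {
    Manual,
    AutoDetected,
}

/// Hypothetical in-memory stand-in for an alias store.
#[derive(Default)]
pub struct InMemoryAliasStore {
    aliases: HashMap<String, (String, AliasOrigin)>,
}

impl InMemoryAliasStore {
    /// Stores the alias unless one already exists; returns true when stored.
    pub fn set_alias(&mut self, from: &str, to: &str, origin: AliasOrigin) -> bool {
        if self.aliases.contains_key(from) {
            return false; // idempotent: a repeat call changes nothing
        }
        self.aliases.insert(from.to_string(), (to.to_string(), origin));
        true
    }

    /// Returns the subject itself plus its alias target, if any.
    pub fn resolve_all(&self, subject: &str) -> BTreeSet<String> {
        let mut out = BTreeSet::new();
        out.insert(subject.to_string());
        if let Some((target, _)) = self.aliases.get(subject) {
            out.insert(target.clone());
        }
        out
    }
}
```

A query with alias resolution enabled would then look up every subject returned by `resolve_all` instead of only the original one.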

Phase 1: Authoritative Corpus Expansion

Expanded from 11 hardcoded assertions to a pluggable corpus system with RFC, OWASP, and Vendor sources.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     aphoria corpus build                         │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────────────┐  │
│  │ RFC Ingester │  │ OWASP        │  │ Vendor Bootstrapper   │  │
│  │ (Tier 0)     │  │ Ingester     │  │ (Tier 2)              │  │
│  │              │  │ (Tier 1)     │  │                       │  │
│  └──────┬───────┘  └──────┬───────┘  └───────────┬───────────┘  │
│         │                 │                      │              │
│         └─────────────────┼──────────────────────┘              │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ CorpusRegistry  │                            │
│                  └────────┬────────┘                            │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ LocalEpisteme   │                            │
│                  │ ingest_         │                            │
│                  │ authoritative() │                            │
│                  └─────────────────┘                            │
└─────────────────────────────────────────────────────────────────┘

1.1 CorpusBuilder Trait

| Task | Status |
|---|---|
| CorpusBuilder trait | corpus/mod.rs: name, scheme, default_tier, build, requires_network |
| CorpusRegistry | Manages multiple builders, build_all(), list_builders() |
| CorpusBuildResult | Stats per builder, total assertions, success/fail/skip counts |

1.2 RFC Ingester

| Task | Status |
|---|---|
| RfcCorpusBuilder | corpus/rfc.rs |
| HTTP fetching | Via ureq, cached to ~/.cache/aphoria/rfc-cache/ |
| RFC 2119 keyword parsing | MUST, MUST NOT, SHOULD, SHALL extraction |
| RFC-specific parsers | JWT (7519), OAuth (6749), Bearer (6750), TLS 1.3 (8446), TLS BCP (7525), TOTP (6238), Basic Auth (7617), HTTP (9110) |
| Concept mapping | rfc://{number}/{topic} at Tier 0 (Regulatory) |
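
As an illustration of the RFC 2119 keyword parsing step, the sketch below finds the first normative keyword in a sentence using only the standard library. The function names are hypothetical, and the real ingester may use a different approach (for example regular expressions).

```rust
/// Hypothetical helper: returns the RFC 2119 keyword found in a sentence.
/// Keywords are matched in upper case only, as RFC 2119 uses them.
pub fn normative_keyword(sentence: &str) -> Option<&'static str> {
    // "MUST NOT" is checked first so it is not reported as plain "MUST".
    const KEYWORDS: [&str; 4] = ["MUST NOT", "MUST", "SHALL", "SHOULD"];
    KEYWORDS
        .iter()
        .copied()
        .find(|kw| contains_word(sentence, kw))
}

/// True when `phrase` occurs in `text` with no letter or digit on either side.
fn contains_word(text: &str, phrase: &str) -> bool {
    text.match_indices(phrase).any(|(i, _)| {
        let before = text[..i].chars().next_back();
        let after = text[i + phrase.len()..].chars().next();
        !before.map_or(false, |c| c.is_alphanumeric())
            && !after.map_or(false, |c| c.is_alphanumeric())
    })
}
```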

1.3 OWASP Ingester

| Task | Status |
|---|---|
| OwaspCorpusBuilder | corpus/owasp.rs |
| HTTP fetching | From GitHub raw content, cached to ~/.cache/aphoria/owasp-cache/ |
| Markdown parsing | MUST/SHOULD statements, section context |
| Cheat sheet parsers | Authentication, JWT, TLS, Secrets, Input Validation, Session, CSRF, Password Storage, HTTP Headers |
| Concept mapping | owasp://cheatsheet/{topic}/{claim} at Tier 1 (Clinical) |

1.4 Vendor Docs

| Task | Status |
|---|---|
| VendorCorpusBuilder | corpus/vendor.rs |
| PostgreSQL claims | pool_size, idle_timeout, ssl_mode |
| Redis claims | timeout, max_retries, tls |
| reqwest claims | cert_verification, connect_timeout, request_timeout |
| hyper claims | keep_alive_timeout, max_concurrent_streams |
| Go net/http claims | read_timeout, write_timeout, idle_timeout, min_tls_version |
| tokio-postgres claims | pool_size, ssl_mode |
| SQLx claims | max_connections, idle_timeout |
| Concept mapping | vendor://{product}/{topic}/{claim} at Tier 2 (Observational) |

1.5 Hardcoded Refactor

| Task | Status |
|---|---|
| HardcodedCorpusBuilder | corpus/hardcoded.rs: original 11 assertions |
| create_authoritative_assertion() | Made public in episteme.rs for corpus builders |

1.6 CLI Integration

| Task | Status |
|---|---|
| aphoria corpus build | Fetches and ingests from all sources |
| --only rfc,owasp,vendor | Filter to specific sources |
| --offline | Skip network-requiring sources |
| --clear-cache | Clear cache before building |
| aphoria corpus list | List available corpus sources |
| CorpusConfig | cache_dir, include_*, rfc_list options |

1.7 Error Handling

| Task | Status |
|---|---|
| RfcFetch error | Per-RFC fetch failures with context |
| OwaspFetch error | Per-cheat-sheet fetch failures with context |
| CorpusBuild error | General corpus build failures |
| Graceful degradation | Continue with other sources if one fails |

Files: corpus/mod.rs, corpus/hardcoded.rs, corpus/rfc.rs, corpus/owasp.rs, corpus/vendor.rs


Phase 3: Skill Integration

Complete. Aphoria is now usable in Claude Code agent workflows.

3.1 Claude Code Skill

| Task | Status |
|---|---|
| skill/SKILL.md | Comprehensive skill definition with all commands |
| /aphoria scan | Scan project, show conflicts grouped by verdict |
| /aphoria scan --fix | Interactive fix workflow |
| /aphoria ack | Acknowledge conflicts as intentional |
| /aphoria status | Show status and baseline |
| /aphoria diff | Show changes since baseline |
| /aphoria init | Initialize Aphoria |
| /aphoria baseline | Set baseline |
| skill/install.sh | Install script for ~/.claude/skills/aphoria/ |

Files: skill/SKILL.md, skill/install.sh, skill/hooks.json

3.2 Agent Pre-Flight Hook

| Task | Status |
|---|---|
| --exit-code flag | Returns 2 for BLOCK, 1 for FLAG only, 0 for clean |
| --strict flag | Lower thresholds (FLAG at 0.3, BLOCK at 0.5) |
| Hook template | skill/hooks.json with PreCommit and PrePush examples |

Usage:

{
  "hooks": {
    "PreCommit": [{"command": "aphoria scan --format sarif --exit-code"}],
    "PrePush": [{"command": "aphoria scan --strict --exit-code"}]
  }
}
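
The exit-code contract can be sketched as follows. This assumes each conflict carries a numeric score that is compared against FLAG and BLOCK thresholds, with higher scores being more severe; the `Thresholds` type and `classify` function are hypothetical, and only the strict values (0.3 and 0.5) and the exit codes come from the table above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Flag,
    Block,
}

/// Hypothetical threshold pair; a score at or above a threshold triggers it.
pub struct Thresholds {
    pub flag: f64,
    pub block: f64,
}

impl Thresholds {
    /// The --strict values quoted in the table above.
    pub const STRICT: Thresholds = Thresholds { flag: 0.3, block: 0.5 };
}

/// Maps a conflict score to a verdict (assumed scoring model).
pub fn classify(score: f64, t: &Thresholds) -> Verdict {
    if score >= t.block {
        Verdict::Block
    } else if score >= t.flag {
        Verdict::Flag
    } else {
        Verdict::Pass
    }
}

/// 2 if any BLOCK, 1 if only FLAGs, 0 when clean.
pub fn exit_code(verdicts: &[Verdict]) -> i32 {
    if verdicts.contains(&Verdict::Block) {
        2
    } else if verdicts.contains(&Verdict::Flag) {
        1
    } else {
        0
    }
}
```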

3.3 Alias Suggestion Workflow

Alias creation is now automatic (Phase 2A.3). During a scan:

  1. Tail-path matching finds authoritative assertions
  2. Aliases are auto-created with AliasOrigin::AutoDetected
  3. Future queries use the alias automatically

The skill documents the suggestion flow for manual alias management:

  • y (Accept): Creates alias
  • n (Reject): Records intentional difference
  • defer: Flags for later review

Phase 4: Full-Cycle Pre-Commit (Scan + Sync)

Vision: The pre-commit hook is a bidirectional knowledge sync, not just a read-only linter. Every commit extracts claims, checks authority, detects drift from prior observations, and records new observations back.

Spec: uat/2026-02-04-full-cycle-precommit-vision.md

┌─────────────────────────────────────────────────────────────┐
│                     PRE-COMMIT FLOW                          │
├─────────────────────────────────────────────────────────────┤
│  1. EXTRACT     → What claims does this code make?           │
│  2. CHECK       → Against authority + own prior claims       │
│  3. CLASSIFY    → Authority conflict | Self conflict | Novel │
│  4. UPDATE      → Record observations to local Episteme      │
│  5. GATE        → Exit code (BLOCK=2, FLAG=1, PASS=0)        │
└─────────────────────────────────────────────────────────────┘

4.1 Git Pre-Commit Hook

All flags needed for pre-commit integration are implemented:

#!/bin/sh
# .git/hooks/pre-commit
aphoria scan --staged --persist --sync --exit-code

Or using pre-commit framework:

repos:
  - repo: local
    hooks:
      - id: aphoria
        name: Aphoria Truth Sync
        entry: aphoria scan --staged --persist --sync --exit-code
        language: system
        pass_filenames: false

4.2 Baseline Mode

Already implemented in Phase 2.

4A: Observational Claims

Record code claims as Tier 4 (Community) assertions when no authority conflict exists:

| Task | Status |
|---|---|
| sync: bool in ScanArgs | types/command.rs |
| observations_recorded: usize in ScanResult | types/result.rs |
| --sync CLI flag | cli.rs: requires --persist |
| claim_to_observation() | bridge.rs: creates Tier 4 (Community, 0.3 weight) assertions |
| ingest_observations() in LocalEpisteme | episteme/local.rs: writes to WAL + predicate index |
| Scan flow integration | scan.rs: splits claims by conflict status, writes novel claims as observations |
| Handler validation | handlers.rs: --sync requires --persist error |
| Report output | report/table.rs, report/json.rs: shows observation count |
| Tests | 5 new tests for observation write-back |

Code: connection_pool.max_size = 25
Authority: (nothing)
Action: Record as Tier 4 observation (project memory)

Usage:

# Scan with observation write-back
aphoria scan --persist --sync

# Output:
# Recorded 45 observations (project memory)

4B: Self-Conflict Detection

Detect drift from the project's own prior observations:

| Task | Status |
|---|---|
| Query prior claims before conflict check | fetch_observations_for_concept() |
| Compare current vs stored observations | check_drift() compares values |
| Report changes as SELF-CONFLICT | DriftResult with prior/current values |
| New verdict: Drift (distinct from Block/Flag) | Verdict::Drift |
| Drift reporting in all formats | table, json, markdown, sarif |
| Exit code includes drift | --exit-code returns 1 for drift |

Prior: db/pool_size = 25 (recorded 2026-01-15)
Now:   db/pool_size = 100
Result: DRIFT — "You changed pool_size from 25 to 100. Intentional?"
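
A minimal sketch of the drift comparison, assuming prior observations are available as a simple concept-to-value map. The `DriftResult` fields and the `check_drift` signature here are illustrative guesses, not the actual types in types/result.rs.

```rust
use std::collections::HashMap;

/// Hypothetical shape of a drift report: prior and current values.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DriftResult {
    pub concept: String,
    pub prior: String,
    pub current: String,
}

/// Returns Some when the stored observation differs from the current value.
/// A concept with no prior observation is novel, not drift, so it yields None.
pub fn check_drift(
    prior: &HashMap<String, String>,
    concept: &str,
    current: &str,
) -> Option<DriftResult> {
    let stored = prior.get(concept)?;
    if stored.as_str() == current {
        return None;
    }
    Some(DriftResult {
        concept: concept.to_string(),
        prior: stored.clone(),
        current: current.to_string(),
    })
}
```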

Files: types/result.rs, types/verdict.rs, episteme/local.rs, scan.rs, report/*.rs

4C: Diff-Only Scanning

Fast scanning for pre-commit hooks:

| Task | Status |
|---|---|
| FileSource enum (All, Staged) | types/command.rs |
| --staged flag (git diff --cached) | cli.rs, handlers.rs |
| walker/git.rs git utilities | find_repo_root(), get_staged_files() |
| walk_staged_files() | walker/mod.rs: filters to scan root, applies same filters |
| Scan dispatch by file_source | scan.rs |
| Error handling (NotGitRepo, GitCommand) | error.rs |
| Tests | 9 tests in tests/staged_scanning.rs |

Target: < 500ms for staged-only

Files: types/command.rs, walker/git.rs, walker/mod.rs, scan.rs, cli.rs, handlers.rs, error.rs
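
The "filters to scan root" step might look like the sketch below. It assumes the staged file list arrives as the newline-separated output of `git diff --cached --name-only`, with paths relative to the repository root; the function name is hypothetical and the real `walk_staged_files()` also applies the walker's other filters.

```rust
/// Hypothetical filter over `git diff --cached --name-only` output.
/// Keeps only paths that live under `scan_root` (relative to the repo root).
pub fn staged_files_under(git_output: &str, scan_root: &str) -> Vec<String> {
    let root = scan_root.trim_end_matches('/');
    git_output
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .filter(|path| {
            // An empty root or "." means the whole repository is in scope.
            root.is_empty()
                || root == "."
                || path
                    .strip_prefix(root)
                    .map_or(false, |rest| rest.starts_with('/'))
        })
        .map(String::from)
        .collect()
}
```

Requiring a `/` right after the prefix keeps a sibling directory such as `srcs/` from matching a scan root of `src`.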

Usage:

# Pre-commit hook (fast, staged files only)
aphoria scan --staged --exit-code

# Full cycle with observation sync
aphoria scan --staged --persist --sync --exit-code

4D: Enhanced Ack

Acknowledgments with rationale and policy updates:

| Task | Status |
|---|---|
| --reason "text" flag | cli.rs: required on ack, bless, update commands |
| Store rationale in assertion metadata | policy_ops.rs: stored in value/description fields |
| aphoria update for intentional drift | policy_ops.rs: creates policy_update assertion |
| Policy update assertions | types/mod.rs: predicates::POLICY_UPDATE |

Files: cli.rs, handlers.rs, policy_ops.rs, types/command.rs, types/mod.rs

$ aphoria ack db/pool_size --reason "Scaling for Black Friday"
$ aphoria update db/pool_size 100 --reason "New baseline after load test"

4E: Hosted Mode

Organizations run their own StemeDB server and all team members automatically sync observations:

| Task | Status |
|---|---|
| HostedConfig in config.rs | url, project_id, team_id, sync_mode, offline_fallback, api_key_env |
| SyncMode enum | remote-only (default), local-and-remote |
| OfflineFallback enum | skip (default), fail, queue |
| HostedClient HTTP client | hosted.rs: retry logic, auth headers, observation push |
| POST /v1/aphoria/observations endpoint | Server receives observations with project/team metadata |
| Scan integration | Auto-enables sync when [hosted] configured |
| Hosted(String) error variant | For connection/auth failures |
| Graceful offline fallback | Based on offline_fallback config |
| Tests | Config parsing, client creation, assertion conversion |

# aphoria.toml
[hosted]
url = "https://episteme.acme.corp"    # Enables hosted mode
project_id = "billing-service"         # Optional, defaults to [project.name]
team_id = "platform-team"              # Optional, for multi-team servers
sync_mode = "remote-only"              # "remote-only" | "local-and-remote"
offline_fallback = "skip"              # "skip" | "fail" | "queue"
api_key_env = "APHORIA_API_KEY"        # Env var for auth token
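
As a sketch of how the `offline_fallback` setting could be modeled, the enum below parses the three documented strings and defaults to `skip`. The `FromStr` implementation and error type are assumptions; the actual config most likely deserializes through serde instead.

```rust
use std::str::FromStr;

/// Mirrors the documented values: "skip" (default), "fail", "queue".
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum OfflineFallback {
    #[default]
    Skip,
    Fail,
    Queue,
}

impl FromStr for OfflineFallback {
    type Err = String;

    /// Hypothetical parser; unknown values are rejected with a message.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "skip" => Ok(OfflineFallback::Skip),
            "fail" => Ok(OfflineFallback::Fail),
            "queue" => Ok(OfflineFallback::Queue),
            other => Err(format!("unknown offline_fallback value: {other}")),
        }
    }
}
```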

Architecture:

┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│ Developer A  │  │ Developer B  │  │ Developer C  │
│ aphoria scan │  │ aphoria scan │  │ aphoria scan │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │
       └─────────────────┼─────────────────┘
                         ▼
              ┌─────────────────────┐
              │ Team StemeDB Server │
              │ POST /v1/aphoria/   │
              │      observations   │
              └─────────────────────┘
                         │
                         ▼
              Aggregated team patterns

Files: config.rs, hosted.rs, scan.rs, error.rs, lib.rs, crates/stemedb-api/src/handlers/aphoria.rs, crates/stemedb-api/src/dto/aphoria.rs


Phase 4.5: Ephemeral Scan Mode

Performance optimization: 40x faster scans by skipping Episteme storage when persistence isn't needed.

Problem

Every aphoria scan was slow because it initialized the full Episteme stack:

  • WAL recovery (O(n) on every startup)
  • Dual backend initialization (fjall + redb)
  • Store and index initialization

But conflict detection is actually 100% in-memory — it never reads from the KV store. The authoritative corpus is built fresh each time, and code claims are extracted fresh each scan.

Solution

Added ScanMode enum with two modes:

| Mode | Use Case | Storage | Performance |
|---|---|---|---|
| Ephemeral (default) | CI, pre-commit, quick checks | None | ~0.25 seconds |
| Persistent | Baseline/diff tracking, alias creation | WAL + store | ~1-2 seconds |

Implementation

| Task | Status |
|---|---|
| ScanMode enum | types.rs: Ephemeral (default), Persistent |
| EphemeralDetector struct | episteme/mod.rs: in-memory corpus + ConceptIndex |
| check_conflicts_pure() | Extracted as standalone function for reuse |
| Mode-based dispatch in run_scan() | Uses EphemeralDetector for Ephemeral, LocalEpisteme for Persistent |
| --persist CLI flag | main.rs: opt-in to persistent mode |
| Tests for both modes | test_ephemeral_scan_no_storage_created, test_persistent_scan_creates_storage, test_scan_modes_produce_same_conflicts |

Usage

# Fast ephemeral scan (default) — no storage created
aphoria scan .

# Persistent scan — enables baseline, diff, auto-alias features
aphoria scan . --persist

Performance

| Mode | Time | Storage |
|---|---|---|
| Ephemeral | ~0.25s | None |
| Persistent | ~1-2s | WAL + store directories |

Files: types.rs, episteme/mod.rs, lib.rs, main.rs, tests.rs


Phase 5: Research Agent Loop

Research agent fills gaps in authoritative coverage by researching official documentation.

5.1 Gap Detection

| Task | Status |
|---|---|
| Gap struct | research/gap_detector.rs: concept_path, topic, predicate, source info |
| detect_gaps() | Compares claims against ConceptIndex, identifies missing coverage |
| Topic normalization | Extracts last 2 path segments for cross-scheme matching |
| Deduplication | Deduplicates gaps by topic+predicate key |

5.2 Gap Storage

| Task | Status |
|---|---|
| GapRecord | research/gap_store.rs: tracking metadata, project count, research status |
| GapStore | JSON-backed persistent storage with atomic saves |
| Project tracking | Records which projects reported each gap |
| Research eligibility | is_eligible_for_research() with threshold and cooldown |
| Gap pruning | prune_old_gaps() removes stale entries |
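
The eligibility check combines a project-count threshold with a cooldown. The sketch below is one way to express that rule; the `GapRecord` fields shown are hypothetical simplifications of the real tracking metadata.

```rust
use std::time::Duration;

/// Hypothetical, trimmed-down gap record.
pub struct GapRecord {
    /// Number of distinct projects that reported this gap.
    pub project_count: usize,
    /// Time since the gap was last researched, or None if never researched.
    pub since_last_research: Option<Duration>,
}

impl GapRecord {
    /// Eligible when enough projects reported the gap and the cooldown has passed.
    pub fn is_eligible_for_research(&self, threshold: usize, cooldown: Duration) -> bool {
        self.project_count >= threshold
            && self
                .since_last_research
                .map_or(true, |elapsed| elapsed >= cooldown)
    }
}
```

A gap that has never been researched skips the cooldown check, so only the threshold applies to it.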

5.3 Quality Validation

| Task | Status |
|---|---|
| QualityValidator | research/quality.rs: validates researched claims |
| Source attribution | Checks for authoritative domains (rfc-editor, owasp, vendor docs) |
| Normative language | Verifies MUST/SHOULD/SHALL keywords present |
| Vague content detection | Rejects "it depends", "typically", etc. |
| Consistency scoring | Detects conflicting claims on same subject |
| QualityReport | Detailed per-claim validation results |
| filter_passed() | Returns only claims meeting quality threshold |

5.4 Research Execution

| Task | Status |
|---|---|
| Researcher | research/researcher.rs: orchestrates research pipeline |
| DocumentationSource | Configurable sources with URL patterns and topics |
| Default sources | Redis, PostgreSQL, Go, Rust, OWASP, Kafka, MongoDB |
| Content fetching | HTTP with timeout and size limits |
| Normative extraction | Regex-based MUST/SHOULD/SHALL extraction |
| Section tracking | Extracts heading context for attribution |
| Confidence scoring | Based on keyword strength, statement length, content size |

5.5 CLI Integration

| Task | Status |
|---|---|
| aphoria research run | Run research agent with configurable threshold |
| aphoria research status | Show gap statistics and research progress |
| aphoria research gaps | List gaps by project count |
| --threshold | Minimum projects before researching (default: 3) |
| --strict | Use strict quality validation |
| --prune | Remove stale gaps before researching |
| --ready | Show only gaps ready for research |

Files: research/mod.rs, research/gap_detector.rs, research/gap_store.rs, research/quality.rs, research/researcher.rs, research/tests.rs

5.7 Security Extractors

Extended Phase 2 extractors with OWASP-aligned security vulnerability detection:

| Extractor | Detects | Languages |
|---|---|---|
| weak_crypto | MD5, SHA1, DES, RC4 usage | Rust, Go, Python, JS/TS |
| command_injection | Shell execution, os.system, subprocess shell=True | Rust, Go, Python, JS/TS |
| sql_injection | String concatenation in SQL queries | Rust, Go, Python, JS/TS |

Concept paths:

  • crypto/hashing/algorithm — MD5, SHA1
  • crypto/encryption/algorithm — DES, RC4
  • os/command/input, os/shell_mode — command injection
  • db/query/input — SQL injection

5.6 Community Corpus Contributions

Users can opt in to contribute patterns anonymously to a central corpus, enabling community consensus to adjust default thresholds.

| Task | Status |
|---|---|
| CommunityConfig | config/mod.rs: enabled (false), anonymize (true), exclude, include, min_confidence |
| AnonymizedObservation | community/types.rs: privacy-preserving observation without file/line/text |
| CommunityObjectValue | community/types.rs: serde-compatible version of ObjectValue |
| PatternAggregate | community/types.rs: server-side aggregation with project counts |
| anonymize_claim() | community/anonymizer.rs: wildcards project names, strips file/line, rounds timestamps |
| compute_anon_hash() | Hash computed WITHOUT file/line/text (privacy-critical) |
| wildcard_project_path() | code://rust/myapp/tls → code://rust/*/tls |
| --community-preview flag | cli.rs: dry-run showing what WOULD be shared |
| PatternAggregateStore | stemedb-storage: server-side pattern aggregation |
| Project deduplication | Uses project_hash to prevent double-counting |
| POST /v1/aphoria/community/observations | Push anonymized observations |
| GET /v1/aphoria/patterns | Retrieve high-confidence community patterns |

Privacy Model:

  • Project names wildcarded: myapp → *
  • File paths, line numbers, matched text NEVER shared
  • Timestamps rounded to hour (k-anonymity)
  • Server receives project_hash, not raw project names
  • enabled defaults to false (explicit opt-in required)
  • anonymize defaults to true (privacy-preserving by default)
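
Two of the anonymization steps in the privacy model can be sketched with the standard library alone. This assumes code subjects follow `code://{language}/{project}/...` and that timestamps are Unix seconds; truncating down to the hour is one possible reading of "rounded", and the real anonymizer may behave differently.

```rust
/// Replaces the project segment of a code:// subject with "*".
/// Assumes the layout code://{language}/{project}/{rest}; other schemes pass through.
pub fn wildcard_project_path(subject: &str) -> String {
    let Some(rest) = subject.strip_prefix("code://") else {
        return subject.to_string();
    };
    let mut parts: Vec<&str> = rest.split('/').collect();
    if parts.len() >= 2 {
        parts[1] = "*"; // second segment is assumed to be the project name
    }
    format!("code://{}", parts.join("/"))
}

/// Truncates a Unix timestamp (seconds) down to the start of its hour.
pub fn round_to_hour(unix_secs: u64) -> u64 {
    unix_secs - unix_secs % 3600
}
```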

Usage:

# Preview what would be shared (no network)
aphoria scan --community-preview

# Enable in aphoria.toml:
[community]
enabled = true
anonymize = true
min_confidence = 0.8
exclude = ["vendor://acme/internal/*"]

# Scan with sync to share patterns
aphoria scan --persist --sync

Files: community/mod.rs, community/types.rs, community/anonymizer.rs, config/mod.rs, cli.rs, handlers.rs, stemedb-storage/src/pattern_aggregate_store/


Phase 6: Federated Policy & Trust Packs

Allow teams to define their own authoritative truths and distribute them as signed Trust Packs. This enables "Enterprise Grade" compliance across distributed teams.

6.1 Trust Pack Format

| Task | Status |
|---|---|
| TrustPack schema | policy.rs: Assertions, Aliases, Metadata, Signature |
| PackHeader | Name, version, issuer, timestamp |
| Serialization | rkyv for zero-copy efficiency |
| Signing | ed25519-dalek signing and verification |

6.2 Policy Management

| Task | Status |
|---|---|
| PolicyManager | Loads local and remote (HTTP/HTTPS) policies |
| Caching | Caches remote policies in ~/.cache/aphoria/policies/ |
| aphoria.toml config | policies list support |

6.3 Core Integration

| Task | Status |
|---|---|
| EphemeralDetector integration | Ingests policies into memory corpus/index |
| check_conflicts_pure update | Resolves policy aliases before authoritative lookup |
| LocalEpisteme export helpers | fetch_acknowledgments, fetch_manual_aliases |

6.4 CLI Commands

| Task | Status |
|---|---|
| aphoria policy export | Exports local ack decisions as a Trust Pack |
| aphoria scan policy loading | Auto-loads policies from config |

Files: policy.rs, config.rs, episteme/mod.rs, lib.rs, main.rs


Phase 6.5: Trust Pack Extensions

Enhancements to Trust Packs based on enterprise pilot feedback. Deferred until real-world usage patterns emerge.

6.5.1 Predicate Aliases

Status: Deferred pending enterprise feedback
Trigger: When enterprises report predicate naming conflicts between policy and extractors

User Story:

As a security architect, when my policy uses required=true but the extractor emits enabled=true, I need them to match semantically.

Problem:

  • Policy blesses: code://standard/tls/cert_verification with predicate required, value true
  • Extractor emits: code://config/tls/cert_verification with predicate enabled, value false
  • Tail-path matching finds the concept (tls/cert_verification) ✓
  • But predicates differ: required vs enabled — no conflict detected ✗

Solution:

| Task | Description |
|---|---|
| predicate_aliases field | Add to Trust Pack schema |
| Default aliases | enabled ↔ required ↔ mandatory ↔ enforced |
| ConceptIndex update | Check aliases during lookup |
| Pack-defined aliases | Allow packs to specify custom alias sets |

Trust Pack Schema Extension:

# In Trust Pack
[predicate_aliases]
security_enabled = ["enabled", "required", "mandatory", "enforced", "active"]
version_minimum = ["min_version", "minimum_version", "tls_min_version"]

Implementation Plan:

  1. Add predicate_aliases: HashMap<String, Vec<String>> to TrustPack
  2. Store aliases alongside assertions during import
  3. Update ConceptIndex.make_key() to normalize predicates via aliases
  4. Match during conflict detection: if predicate_a aliases to predicate_b, treat as same concept
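
Step 3 of the plan, normalizing predicates through the alias map, could look roughly like this. The function is a hypothetical sketch that assumes each predicate appears in at most one alias set and that the map key is the canonical name.

```rust
use std::collections::HashMap;

/// Maps a predicate to the canonical name of its alias set, if it has one.
/// Predicates that appear in no set are returned unchanged.
pub fn normalize_predicate<'a>(
    predicate: &'a str,
    aliases: &'a HashMap<String, Vec<String>>,
) -> &'a str {
    aliases
        .iter()
        .find(|(_, members)| members.iter().any(|m| m.as_str() == predicate))
        .map_or(predicate, |(canonical, _)| canonical.as_str())
}
```

With the `security_enabled` set from the schema extension above, `required` and `enabled` both normalize to the same canonical predicate, so the policy and the extractor output from the problem statement would be compared with each other.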

6.5.2 Pack Signing Key Rotation

Status: Deferred pending security key management requirements
Trigger: Enterprise security requirements for key rotation

User Story:

As a security admin, when our signing key is rotated, I need to re-sign all packs without losing policy content.

Problem:

  • Trust Packs are signed with Ed25519 keys
  • When keys are rotated (security best practice), existing packs become unverifiable
  • Need to re-sign packs with new key while preserving content hash

Solution:

| Task | Description |
|---|---|
| aphoria policy resign | CLI command to re-sign pack with new key |
| Content hash preservation | Keep content_hash unchanged, only update signature |
| Key rotation audit | Log key rotation events |
| Old signature archival | Optionally keep old signature for audit trail |

CLI:

# Re-sign pack with new key
aphoria policy resign my-standards.pack --key-file new-private-key.pem

# Re-sign with signature chain (audit trail)
aphoria policy resign my-standards.pack --key-file new-key.pem --chain-signatures

Trust Pack Schema Extension:

pub struct TrustPack {
    // Existing fields...
    pub signature: Signature,

    // New field for key rotation audit
    pub signature_chain: Option<Vec<SignatureRecord>>,
}

pub struct SignatureRecord {
    pub issuer_public_key: [u8; 32],
    pub signature: Signature,
    pub signed_at: DateTime<Utc>,
    pub reason: Option<String>,  // "Key rotation", "Security incident", etc.
}

6.5.3 Priority

| Feature | Priority | Trigger |
|---|---|---|
| Predicate Aliases | Medium | Enterprise feedback showing predicate naming conflicts |
| Key Rotation | Low | Enterprise security key management requirements |

Documented in: uat/future-scenarios.md


Phase 7: Declarative Extractors

Enable users to define new extractors in config/policy files (TOML) without writing Rust code. This removes the recompilation bottleneck for custom pattern enforcement.

User Outcome: "I added a custom extractor to my aphoria.toml that detects our company's deprecated API patterns. Now every scan flags files using the old pattern without me writing any Rust code."

7.1 Core Types

| Task | Status |
|---|---|
| DeclarativeExtractorDef | extractors/declarative.rs: name, description, languages, pattern, claim, confidence |
| DeclarativeClaimDef | subject, predicate, value specification |
| DeclarativeValue enum | MatchedText, Boolean, Text variants |
| DeclarativeExtractor | Compiled extractor with Extractor trait impl |

7.2 Configuration

| Task | Status |
|---|---|
| ExtractorConfig.declarative | config/mod.rs: Vec<DeclarativeExtractorDef> |
| TOML parsing | Serde deserialization with #[serde(untagged)] for value types |
| Example config | Documented in module and config docs |

Example aphoria.toml:

[[extractors.declarative]]
name = "deprecated_api_v1"
description = "Detects usage of deprecated v1 API endpoints"
languages = ["go", "rust", "python"]
pattern = '/api/v1/\w+'
claim.subject = "api/deprecated_endpoint"
claim.predicate = "version"
claim.value = "v1"
confidence = 1.0

[[extractors.declarative]]
name = "legacy_encryption"
description = "Detects legacy encryption algorithms"
languages = ["rust", "go", "python", "javascript"]
pattern = '(?i)blowfish|twofish|cast5'
claim.subject = "crypto/encryption/algorithm"
claim.predicate = "algorithm"
claim.value_from_match = true
confidence = 0.9

7.3 Validation & Security

| Task | Status |
|---|---|
| Name validation | Non-empty required |
| Subject/predicate validation | Non-empty required |
| Confidence validation | Must be 0.0-1.0 |
| Regex validation | Compiled at load time, not scan time |
| ReDoS protection | RegexBuilder with 10MB size limits |
| Language parsing | Language::from_str() with FromStr trait |
| Graceful failure | Invalid extractors logged as warnings, don't block others |
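
The non-regex validation rules and the graceful-failure behaviour can be illustrated as below. `ExtractorDef` is a simplified, hypothetical stand-in for `DeclarativeExtractorDef` (the claim is flattened and the pattern is omitted because regex compilation needs an external crate), and the returned warning strings stand in for the `tracing::warn!` calls.

```rust
/// Simplified, hypothetical stand-in for DeclarativeExtractorDef.
#[derive(Debug, Clone)]
pub struct ExtractorDef {
    pub name: String,
    pub subject: String,
    pub predicate: String,
    pub confidence: f64,
}

/// Applies the non-empty and confidence-range checks from the table.
pub fn validate(def: &ExtractorDef) -> Result<(), String> {
    if def.name.trim().is_empty() {
        return Err("name must not be empty".into());
    }
    if def.subject.trim().is_empty() {
        return Err(format!("{}: subject must not be empty", def.name));
    }
    if def.predicate.trim().is_empty() {
        return Err(format!("{}: predicate must not be empty", def.name));
    }
    // NaN fails this check as well, since it is not inside any range.
    if !(0.0..=1.0).contains(&def.confidence) {
        return Err(format!("{}: confidence must be within 0.0-1.0", def.name));
    }
    Ok(())
}

/// Keeps valid definitions and collects one warning per invalid definition,
/// so a single bad extractor does not block the others.
pub fn load_valid(defs: Vec<ExtractorDef>) -> (Vec<ExtractorDef>, Vec<String>) {
    let mut valid = Vec::new();
    let mut warnings = Vec::new();
    for def in defs {
        match validate(&def) {
            Ok(()) => valid.push(def),
            Err(msg) => warnings.push(msg),
        }
    }
    (valid, warnings)
}
```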

7.4 Registry Integration

| Task | Status |
|---|---|
| Module export | extractors/mod.rs: public types |
| Registry registration | ExtractorRegistry::new() loads from config |
| Enable/disable support | Declarative extractors respect disabled list |
| Runtime addition | add_from_definitions() for Trust Pack integration |

7.5 Error Handling

| Task | Status |
|---|---|
| DeclarativeExtractor error variant | error.rs: name + message |
| Validation errors | Clear messages for each failure mode |
| Structured logging | tracing::warn! for compilation failures |

7.6 Tests

| Task | Status |
|---|---|
| Unit tests | 22 tests in declarative.rs |
| Registry tests | 7 tests for integration |
| Validation tests | Empty name, subject, predicate; invalid confidence, regex, language |
| Extraction tests | Boolean, text, matched_text value types |
| Deserialization tests | TOML parsing for all value types |

Files: extractors/declarative.rs, extractors/mod.rs, config/mod.rs, types/language.rs, error.rs


Phase 7.5: LLM-in-the-Loop Extraction

Use an LLM (Gemini) to extract claims semantically during persistent scans. This fills gaps that regex extractors can't catch, providing immediate value while the learning system builds up pattern knowledge.

Vision

Code file → Regex extractors → Claims found
                ↓
         High-value files (auth, config, crypto)
                ↓
         LLM Extractor → Additional semantic claims
                ↓
         Combined claims → Conflict detection

7.5.1 LLM Extractor Implementation

| Task | Status |
|---|---|
| GeminiClient struct | llm/client.rs: Gemini API client using ureq |
| LlmExtractor struct | llm/extractor.rs: orchestrates extraction with budget tracking |
| Prompt engineering | Security-focused extraction prompt with structured JSON output |
| Response parsing | Parse Gemini's JSON response into ExtractedClaim format |
| Error handling | Graceful degradation when API unavailable or key missing |

7.5.2 Selective Triggering

Task Status
is_high_value_file() llm/extractor.rs — auth/, config/, crypto/, security/, secrets/, certs/, ssl/, tls/, keys/, credentials/ directories
High-value file names secret, password, credential, token, auth, login, session, jwt, tls, ssl, cert, key, config, settings, security, crypto, encrypt, decrypt, oauth, saml, ldap, api_key, apikey, access_key, private
Token budget max_tokens_per_scan (default 50k), max_tokens_per_file (default 4k)
Skip conditions Only runs when regex extractors found nothing AND file is high-value
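
The triggering rules above can be sketched as follows. This is a minimal, standard-library-only illustration, not the contents of llm/extractor.rs: the constant names, the substring matching on the file stem, and the `should_run_llm` helper are assumptions, and the keyword list is a subset of the table.

```rust
use std::path::Path;

// Directory names taken from the 7.5.2 table.
const HIGH_VALUE_DIRS: &[&str] = &[
    "auth", "config", "crypto", "security", "secrets",
    "certs", "ssl", "tls", "keys", "credentials",
];

// Subset of the file-name keywords from the table (assumed substring match).
const HIGH_VALUE_KEYWORDS: &[&str] = &[
    "secret", "password", "credential", "token", "auth", "login",
    "session", "jwt", "tls", "ssl", "cert", "key", "config",
    "settings", "security", "crypto", "oauth", "api_key", "private",
];

/// Hypothetical re-implementation: true if any parent directory is
/// high-value or the file stem contains a high-value keyword.
pub fn is_high_value_file(path: &Path) -> bool {
    let in_high_value_dir = path
        .parent()
        .into_iter()
        .flat_map(|p| p.components())
        .filter_map(|c| c.as_os_str().to_str())
        .any(|dir| {
            let dir = dir.to_ascii_lowercase();
            HIGH_VALUE_DIRS.iter().any(|d| *d == dir)
        });

    let stem = path
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("")
        .to_ascii_lowercase();
    let name_matches = HIGH_VALUE_KEYWORDS.iter().any(|kw| stem.contains(*kw));

    in_high_value_dir || name_matches
}

/// Skip condition from the table: the LLM runs only when the regex
/// extractors found nothing AND the file is high-value.
pub fn should_run_llm(path: &Path, regex_claim_count: usize) -> bool {
    regex_claim_count == 0 && is_high_value_file(path)
}
```

Substring matching is deliberately permissive (a stem such as `keyboard` would match `key`); a stricter variant could split the stem on `_` and `-` and compare whole words.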

7.5.3 Cost Controls

Task Status
Token tracking Arc<AtomicUsize> for thread-safe budget tracking across files
BLAKE3 caching llm/cache.rs — content hash + model + prompt version for cache key
Cache location ~/.cache/aphoria/llm-cache/
Budget enforcement within_budget() check before each LLM call
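
A minimal sketch of the shared token budget, assuming a small wrapper type around the `Arc<AtomicUsize>` named in the table. `TokenBudget`, `try_reserve`, and `used` are hypothetical names; only `within_budget()` appears in the roadmap.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical budget tracker; clones share the same counter, so it
/// can be handed to every file-processing thread.
#[derive(Clone)]
pub struct TokenBudget {
    used: Arc<AtomicUsize>,
    max_per_scan: usize,
}

impl TokenBudget {
    pub fn new(max_per_scan: usize) -> Self {
        Self { used: Arc::new(AtomicUsize::new(0)), max_per_scan }
    }

    /// True while at least one token of the scan budget remains.
    pub fn within_budget(&self) -> bool {
        self.used.load(Ordering::Relaxed) < self.max_per_scan
    }

    /// Atomically reserve `tokens`. Returns false, reserving nothing,
    /// if the reservation would exceed the scan budget.
    pub fn try_reserve(&self, tokens: usize) -> bool {
        self.used
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |current| {
                let next = current.checked_add(tokens)?;
                (next <= self.max_per_scan).then_some(next)
            })
            .is_ok()
    }

    /// Tokens consumed so far in this scan.
    pub fn used(&self) -> usize {
        self.used.load(Ordering::Relaxed)
    }
}
```

Reserving with `fetch_update` rather than a separate check followed by an add avoids two threads both passing the check and jointly overshooting the budget.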

7.5.4 Configuration

# aphoria.toml
[llm]
enabled = true                    # Enable LLM extraction (default: false)
provider = "gemini"               # Only "gemini" supported
# model defaults to DEFAULT_LLM_MODEL (currently "gemini-3-flash-preview")
api_key_env = "GEMINI_API_KEY"    # Environment variable for API key
max_tokens_per_scan = 50000       # Budget per scan
max_tokens_per_file = 4000        # Budget per file (for max_output_tokens)
high_value_only = true            # Only use on auth/config/crypto files
cache_responses = true            # Cache by content hash
timeout_secs = 60                 # API timeout
min_confidence = 0.7              # Filter claims below this confidence

Files: llm/mod.rs, llm/client.rs, llm/extractor.rs, llm/cache.rs, config/mod.rs, scan.rs, error.rs


Phase 7.6: Pattern Learning Store

When the LLM extracts something that regex extractors missed, remember the pattern. Track which patterns recur across projects to identify candidates for promotion to declarative extractors.

Vision

LLM extracts claim from code
        ↓
Pattern not in learned store?
        ↓
Store: { example_code, claim, project_hash }
        ↓
Same pattern seen in 5+ projects?
        ↓
Flag for promotion to declarative extractor

7.6.1 LearnedPattern Schema

Task Status
ValueType enum learning/types.rs — Text, Number, Boolean
ClaimTemplate struct learning/types.rs — subject_template, predicate, value_type, description
LearnedPattern struct learning/types.rs — full schema with timestamps, project hashes, confidence tracking
Serde serialization JSON serialization with chrono timestamps
Tests 5 unit tests for types

7.6.2 PatternStore Implementation

Task Status
PatternStore trait learning/store.rs — abstract storage interface
LocalPatternStore JSON-backed local storage at ~/.aphoria/learning/patterns.json
RwLock thread safety Write-through cache with in-memory HashMap
Deduplication find_similar() with Levenshtein similarity threshold 0.8
Pruning prune_stale() removes patterns not seen in N days
Tests 8 unit tests for store operations

7.6.3 Pattern Normalization

Task Status
normalize_pattern() learning/normalizer.rs — replaces literals with placeholders
Version detection "1.0", "TLSv1.2" → <string:version>
Boolean detection true/false → <boolean>
Number detection Standalone numbers → <number>
String detection Remaining quoted strings → <string>
pattern_similarity() Levenshtein distance normalized to 0.0-1.0
Tests 17 unit tests for normalization
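
The detection rules in the table can be illustrated with a small hand-written scanner. This is a sketch under stated assumptions, not the code in learning/normalizer.rs: it takes only the source text, uses a simple heuristic for versions (`looks_like_version` is a hypothetical helper), and does not handle escape sequences inside quotes.

```rust
/// Heuristic: does a quoted literal look like a version ("1.0", "TLSv1.2")?
fn looks_like_version(s: &str) -> bool {
    let rest = s.trim_start_matches(|c: char| c.is_ascii_alphabetic());
    rest.starts_with(|c: char| c.is_ascii_digit())
        && rest.chars().all(|c| c.is_ascii_digit() || c == '.')
}

/// Replace literals with typed placeholders.
pub fn normalize_pattern(code: &str) -> String {
    let chars: Vec<char> = code.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == '"' || c == '\'' {
            // Quoted literal: look for the matching closing quote.
            match chars[i + 1..].iter().position(|&x| x == c) {
                Some(len) => {
                    let inner: String = chars[i + 1..i + 1 + len].iter().collect();
                    out.push_str(if looks_like_version(&inner) {
                        "<string:version>"
                    } else {
                        "<string>"
                    });
                    i += len + 2; // skip both quotes and the contents
                }
                None => {
                    out.push(c); // unterminated quote: copy through
                    i += 1;
                }
            }
        } else if c.is_alphanumeric() || c == '_' {
            // Bare token: identifier, keyword, or number.
            let start = i;
            while i < chars.len()
                && (chars[i].is_alphanumeric() || chars[i] == '_' || chars[i] == '.')
            {
                i += 1;
            }
            let token: String = chars[start..i].iter().collect();
            if token == "true" || token == "false" {
                out.push_str("<boolean>");
            } else if c.is_ascii_digit() && token.parse::<f64>().is_ok() {
                out.push_str("<number>");
            } else {
                out.push_str(&token);
            }
        } else {
            // Whitespace and punctuation are copied unchanged.
            out.push(c);
            i += 1;
        }
    }
    out
}
```

Quoted literals are checked for a version shape before falling back to `<string>`, matching the order of the table rows.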

7.6.4 Configuration

# aphoria.toml
[learning]
enabled = true                    # Enable pattern learning (default: false)
store = "local"                   # "local" | "hosted"
min_confidence = 0.7              # Minimum LLM confidence to learn
prune_after_days = 90             # Remove patterns not seen in N days

[learning.promotion]
min_projects = 5                  # Projects needed before promotion
min_confidence = 0.8              # Average confidence needed
auto_promote = false              # Require human approval (Phase 7.7)

7.6.5 Scan Integration

Task Status
Initialize pattern store scan.rs — only in persistent mode with learning enabled
Project hash computation BLAKE3 hash for privacy-preserving project identification
Record LLM-extracted claims After LLM extraction, record patterns meeting min_confidence
Update existing patterns Merge observations when similar pattern found
Logging Reports patterns_recorded count on scan completion
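
Merging an observation into an existing pattern and checking the promotion thresholds might look like the sketch below. `PatternStats` is a hypothetical, trimmed-down stand-in for `LearnedPattern`, and the incremental-mean update is an assumption about how average confidence is tracked.

```rust
use std::collections::HashSet;

/// Hypothetical subset of the learned-pattern fields needed for promotion.
#[derive(Default)]
pub struct PatternStats {
    pub project_hashes: HashSet<String>,
    pub occurrences: u32,
    pub avg_confidence: f32,
}

impl PatternStats {
    /// Merge one more LLM observation into the running statistics.
    pub fn record(&mut self, project_hash: &str, confidence: f32) {
        // The set deduplicates repeat sightings from the same project.
        self.project_hashes.insert(project_hash.to_owned());
        self.occurrences += 1;
        // Incremental mean: avg += (x - avg) / n
        self.avg_confidence +=
            (confidence - self.avg_confidence) / self.occurrences as f32;
    }

    /// Thresholds correspond to [learning.promotion] in aphoria.toml.
    pub fn is_promotion_candidate(&self, min_projects: usize, min_confidence: f32) -> bool {
        self.project_hashes.len() >= min_projects
            && self.avg_confidence >= min_confidence
    }
}
```

With the defaults `min_projects = 5` and `min_confidence = 0.8`, a pattern seen many times in a single project never qualifies, because only distinct project hashes count.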

7.6.6 Error Handling

Task Status
LearningStore error variant error.rs — for storage/cache failures
Graceful degradation Store failures logged, don't block scan

Files: learning/mod.rs, learning/types.rs, learning/normalizer.rs, learning/store.rs, config/mod.rs, scan.rs, error.rs, lib.rs

Tests: 30 tests covering types, normalization, and store operations.


Phase 7.6 (Legacy Documentation)

Note: The following is the original spec for reference. See above for implemented status.

Original Schema (Reference)

/// A pattern learned from LLM extraction that could become a declarative extractor.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct LearnedPattern {
    /// Unique identifier
    pub id: Uuid,

    /// Example code that triggered this pattern
    pub example_code: String,

    /// Normalized pattern (variables replaced with placeholders)
    /// e.g., "const TLS_MIN_VERSION = \"1.0\"" → "const TLS_MIN_VERSION = <version>"
    pub normalized_pattern: String,

    /// The claim this pattern produces
    pub claim_template: ClaimTemplate,

    /// Language this pattern applies to
    pub language: Language,

    /// When first seen
    pub first_seen: DateTime<Utc>,

    /// When last seen
    pub last_seen: DateTime<Utc>,

    /// Projects that have this pattern (hashed for privacy)
    pub project_hashes: HashSet<String>,

    /// Total occurrences across all projects
    pub occurrences: u32,

    /// Average LLM confidence when extracting this
    pub avg_confidence: f32,

    /// Has this been promoted to a declarative extractor?
    pub promoted: bool,

    /// If promoted, the extractor ID
    pub promoted_to: Option<String>,
}

/// Template for generating claims from a learned pattern.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ClaimTemplate {
    pub subject_template: String,  // "tls/min_version"
    pub predicate: String,         // "version"
    pub value_type: ValueType,     // String, Boolean, Number
    pub description_template: String,
}

Original PatternStore Trait (Reference)

pub trait PatternStore: Send + Sync {
    /// Record a pattern learned from LLM extraction
    fn record_pattern(&self, pattern: &LearnedPattern) -> Result<()>;

    /// Find existing pattern matching this example
    fn find_similar(&self, normalized: &str, language: Language, threshold: f32) -> Option<LearnedPattern>;

    /// Get patterns ready for promotion (threshold met)
    fn get_promotion_candidates(&self, min_projects: usize, min_confidence: f32) -> Vec<LearnedPattern>;

    /// Mark pattern as promoted
    fn mark_promoted(&self, id: &Uuid, extractor_name: &str) -> Result<()>;

    /// Prune old patterns
    fn prune_stale(&self, max_age_days: u32) -> Result<usize>;
}

7.6.3 Pattern Normalization

Task Description
Variable extraction Identify literals that vary (versions, names, values)
Placeholder insertion Replace literals with typed placeholders
Similarity scoring Compare normalized patterns for dedup
fn normalize_pattern(code: &str, claim: &ExtractedClaim) -> String {
    // "const TLS_MIN = \"1.0\"" → "const TLS_MIN = <string:version>"
    // "pool_size: 25" → "pool_size: <number>"
    // "verify_ssl: false" → "verify_ssl: <boolean>"
}

fn similarity_score(a: &str, b: &str) -> f32 {
    // Levenshtein distance normalized to 0.0-1.0
    // Patterns with score > 0.8 are considered duplicates
}
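
The `similarity_score` stub above can be filled in with a standard two-row Levenshtein distance. This is a sketch of the technique the comments describe, not the shipped `pattern_similarity()`; treating two empty strings as identical (score 1.0) is an assumption.

```rust
/// Two-row dynamic-programming Levenshtein distance over chars.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the processed prefix of `a` and b[..j]
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            curr[j + 1] = (prev[j] + cost) // substitution
                .min(prev[j + 1] + 1)      // deletion
                .min(curr[j] + 1);         // insertion
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Distance normalized to 0.0-1.0, where 1.0 means identical.
pub fn similarity_score(a: &str, b: &str) -> f32 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0; // assumption: two empty patterns count as identical
    }
    1.0 - levenshtein(a, b) as f32 / max_len as f32
}
```

For example, `const TLS_MIN = <string:version>` and `const TLS_MINIMUM = <string:version>` differ by four insertions over 36 characters, which scores about 0.89 and so clears the 0.8 duplicate threshold.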

7.6.4 Integration with Scan

// In scan.rs, after LLM extraction
for claim in llm_claims {
    // Normalize first so lookups compare placeholder forms, not raw code
    let normalized = normalize_pattern(&claim.matched_text, &claim);
    // Check if this is a new pattern (0.8 = duplicate threshold)
    if let Some(existing) = pattern_store.find_similar(&normalized, language, 0.8) {
        // Update existing pattern
        pattern_store.increment_occurrence(&existing.id, project_hash)?;
    } else {
        // Record new pattern
        let pattern = LearnedPattern::from_claim(&claim, &code_context, project_hash);
        pattern_store.record_pattern(&pattern)?;
    }
}

7.6.5 Configuration

# aphoria.toml
[learning]
enabled = true                    # Enable pattern learning
store = "local"                   # "local" | "hosted"
min_confidence = 0.7              # Minimum LLM confidence to learn
prune_after_days = 90             # Remove patterns not seen in N days

[learning.promotion]
min_projects = 5                  # Projects needed before promotion
min_confidence = 0.8              # Average confidence needed
auto_promote = false              # Require human approval (Phase 7.7)

Files: learning/mod.rs, learning/pattern.rs, learning/store.rs, learning/normalize.rs


Phase 7.7: Pattern → Extractor Promotion

High-frequency learned patterns get promoted to declarative extractors. This closes the learning loop: patterns discovered by LLM become permanent, fast regex extractors.

Vision

LearnedPattern (5+ projects, >0.8 confidence)
        ↓
Claude: "Generate regex for this pattern"
        ↓
Candidate declarative extractor
        ↓
Validate against stored examples
        ↓
Human review (optional) → Approve/Reject
        ↓
Merge to project's .aphoria/extractors/

7.7.1 Promotion Pipeline

Task Description
Candidate selection Query patterns meeting threshold
Regex generation LLM generates regex from examples
YAML generation Convert to declarative extractor format
Validation Test against all stored examples
Review queue Present candidates for human approval
pub struct PromotionPipeline {
    pattern_store: Arc<dyn PatternStore>,
    llm_client: ClaudeClient,
    validator: ExtractorValidator,
}

impl PromotionPipeline {
    /// Get patterns ready for promotion
    pub async fn get_candidates(&self) -> Result<Vec<PromotionCandidate>> {
        // Thresholds mirror [learning.promotion]: min_projects = 5, min_confidence = 0.8
        let patterns = self.pattern_store.get_promotion_candidates(5, 0.8);

        // Generate candidates sequentially; `?` stops at the first LLM failure
        let mut candidates = Vec::with_capacity(patterns.len());
        for pattern in patterns {
            candidates.push(self.generate_candidate(pattern).await?);
        }
        Ok(candidates)
    }

    /// Generate declarative extractor from pattern
    async fn generate_candidate(&self, pattern: LearnedPattern) -> Result<PromotionCandidate> {
        // Ask Claude to generate regex
        let regex = self.llm_client.generate_regex(&pattern).await?;

        // Build declarative extractor
        let extractor = DeclarativeExtractor {
            name: pattern.id.to_string(),
            language: pattern.language,
            pattern: regex,
            claim: pattern.claim_template.clone(),
            source: ExtractorSource::Learned {
                pattern_id: pattern.id,
                projects: pattern.project_hashes.len(),
            },
        };

        // Validate against examples
        let validation = self.validator.validate(&extractor, &pattern).await;

        Ok(PromotionCandidate { pattern, extractor, validation })
    }
}

7.7.2 Regex Generation

Task Description
Multi-example prompt Include all examples in generation prompt
Regex safety Prevent catastrophic backtracking
Test coverage Generate test cases alongside regex
async fn generate_regex(
    client: &ClaudeClient,
    examples: &[String],
    claim: &ClaimTemplate,
) -> Result<String> {
    let prompt = format!(
        "Generate a regex pattern that matches all these code examples:\n\n{}\n\n\
         The regex should extract the value for claim: {}\n\
         Requirements:\n\
         - Must match ALL examples\n\
         - Use named capture groups for extracted values\n\
         - Avoid catastrophic backtracking (no nested quantifiers)\n\
         - Return ONLY the regex, no explanation",
        examples.join("\n---\n"),
        claim.subject_template
    );

    let response = client.message(&prompt).await?;
    // Trim surrounding whitespace before validating and returning the regex
    let regex = response.trim().to_string();
    validate_regex_safety(&regex)?;
    Ok(regex)
}

7.7.3 Validation Suite

Task Description
Positive tests Must match all stored examples
Negative tests Must NOT match known-safe code
Performance test Must complete in < 100ms
False positive check Run against sample codebase
pub struct ExtractorValidator {
    sample_codebases: Vec<PathBuf>,  // Known-good projects for FP testing
}

impl ExtractorValidator {
    pub async fn validate(
        &self,
        extractor: &DeclarativeExtractor,
        pattern: &LearnedPattern
    ) -> ValidationResult {
        let mut result = ValidationResult::default();

        // Must match all positive examples
        for example in &pattern.examples {
            if !extractor.matches(example) {
                result.positive_failures.push(example.clone());
            }
        }

        // Must not have excessive false positives
        for codebase in &self.sample_codebases {
            let fps = self.count_false_positives(extractor, codebase).await;
            if fps > 10 {
                result.false_positive_warning = true;
            }
        }

        // Must be fast
        let duration = self.benchmark(extractor);
        if duration > Duration::from_millis(100) {
            result.performance_warning = true;
        }

        result
    }
}

7.7.4 Human Review Gate

Task Description
aphoria extractors review CLI to review pending promotions
Approval workflow Approve, reject, or request changes
Rejection tracking Record why patterns were rejected
Auto-approve mode Skip review for >0.95 confidence (Phase 9)
$ aphoria extractors review

Pending promotions: 3

[1/3] Pattern: tls_min_version_const
      Examples: 47 (across 8 projects)
      Confidence: 0.91

      Generated regex: (?i)(tls|ssl)_?(min|minimum)_?version\s*[:=]\s*["']?(1\.[01])["']?

      Sample matches:
        const TLS_MIN_VERSION = "1.0"     ✓ matches
        TLS_MINIMUM_VERSION: "1.1"        ✓ matches
        ssl_min_version = "1.2"           ✗ no match (TLS 1.2 is not deprecated)

      [a]pprove  [r]eject  [e]dit  [s]kip  [q]uit: _

7.7.5 Extractor Output

Promoted patterns become declarative extractors in .aphoria/extractors/:

# .aphoria/extractors/learned/tls_min_version_const.yaml
# Auto-generated from learned pattern. DO NOT EDIT.
# Pattern ID: 550e8400-e29b-41d4-a716-446655440000
# Learned from: 8 projects, 47 occurrences
# Confidence: 0.91
# Promoted: 2026-02-10

name: "tls_min_version_const"
language: ["rust", "go", "python", "javascript", "typescript"]
pattern: '(?i)(tls|ssl)_?(min|minimum)_?version\s*[:=]\s*["'']?(1\.[01])["'']?'
claim:
  subject: "tls/min_version"
  predicate: "version"
  value_capture: 3  # Capture group holding the version (groups 1-2 match the name)
  description: "TLS minimum version set to deprecated {value}"
metadata:
  source: "learned"
  pattern_id: "550e8400-e29b-41d4-a716-446655440000"
  projects: 8
  occurrences: 47
  confidence: 0.91

7.7.6 Configuration

# aphoria.toml
[promotion]
enabled = true                    # Enable promotion pipeline
auto_promote = false              # Require human approval
output_dir = ".aphoria/extractors/learned"
min_confidence = 0.8              # Minimum to consider
require_validation = true         # Must pass validation suite

[promotion.review]
notify = "slack://webhook/..."    # Notify when candidates ready
batch_size = 10                   # Max candidates per review session

Files: promotion/mod.rs, promotion/pipeline.rs, promotion/regex_gen.rs, promotion/validator.rs, promotion/review.rs


Phase 9: Autonomous Extractor Generation

The system generates, tests, and deploys extractors without human approval for high-confidence patterns. This is the endgame: a fully self-improving extraction system.

Vision

Learned pattern exceeds autonomous threshold (>0.95 confidence, >10 projects)
        ↓
Auto-generate extractor
        ↓
Validate against comprehensive test suite
        ↓
A/B test: run new extractor in shadow mode
        ↓
If FP rate < 5%: auto-deploy
        ↓
If FP rate spikes: auto-rollback

9.1 Autonomous Promotion

Task Description
High-confidence threshold Skip human review for >0.95 confidence
Project threshold Require >10 projects for autonomous
Validation strictness Stricter validation for autonomous
fn should_auto_promote(pattern: &LearnedPattern, validation: &ValidationResult) -> bool {
    pattern.avg_confidence > 0.95 &&
    pattern.project_hashes.len() > 10 &&
    validation.positive_failures.is_empty() &&
    !validation.false_positive_warning &&
    !validation.performance_warning
}

9.2 Shadow Mode Testing

Task Description
Shadow execution Run new extractor alongside existing
Metrics collection Track matches, FP rate, performance
Comparison report Compare shadow vs production results
Promotion criteria Promote if metrics meet threshold
pub struct ShadowTest {
    extractor: DeclarativeExtractor,
    start_time: DateTime<Utc>,
    scans_completed: u32,
    matches: u32,
    confirmed_true_positives: u32,
    confirmed_false_positives: u32,
}

impl ShadowTest {
    fn false_positive_rate(&self) -> f32 {
        self.confirmed_false_positives as f32 / self.matches as f32
    }

    fn should_promote(&self) -> bool {
        self.scans_completed >= 100 &&
        self.false_positive_rate() < 0.05
    }
}

9.3 Auto-Rollback

Task Description
Anomaly detection Detect FP rate spikes
Rollback trigger Auto-disable if FP > 10%
Notification Alert on rollback
Quarantine Move extractor to review queue
async fn check_extractor_health(extractor_id: &str, metrics: &Metrics) -> Action {
    let recent_fp_rate = metrics.false_positive_rate_last_24h(extractor_id);
    let baseline_fp_rate = metrics.false_positive_rate_baseline(extractor_id);

    if recent_fp_rate > 0.10 {
        Action::Rollback { reason: "FP rate exceeded 10%" }
    } else if recent_fp_rate > baseline_fp_rate * 2.0 {
        Action::Rollback { reason: "FP rate doubled from baseline" }
    } else {
        Action::Continue
    }
}

9.4 Cross-Project Learning

Task Description
Hosted pattern sync Patterns from all projects aggregate on server
Global promotion Promote patterns seen across many orgs
Privacy preservation Only normalized patterns shared, no code
Opt-in distribution Orgs can opt-in to receive community extractors
Org A: Pattern seen in 3 projects → shared to hosted
Org B: Same pattern in 5 projects → shared to hosted
Org C: Same pattern in 4 projects → shared to hosted
        ↓
Hosted aggregates: 12 projects total
        ↓
Promotes to community extractor
        ↓
All orgs receive new extractor (if opted in)
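
The aggregation step in the flow above could be sketched as below. The `OrgReport` alias, both function names, and the promotion threshold are assumptions for illustration; the roadmap does not state how the hosted server counts projects or what the community threshold is.

```rust
use std::collections::HashMap;

/// Hypothetical per-org report: normalized pattern -> projects where it was seen.
pub type OrgReport = HashMap<String, usize>;

/// Sum project counts per normalized pattern across all reporting orgs.
pub fn aggregate(reports: &[OrgReport]) -> HashMap<String, usize> {
    let mut totals: HashMap<String, usize> = HashMap::new();
    for report in reports {
        for (pattern, count) in report {
            *totals.entry(pattern.clone()).or_insert(0) += *count;
        }
    }
    totals
}

/// Patterns whose combined project count meets the (assumed) threshold,
/// sorted for deterministic output.
pub fn community_candidates(totals: &HashMap<String, usize>, min_projects: usize) -> Vec<&str> {
    let mut out: Vec<&str> = totals
        .iter()
        .filter(|(_, n)| **n >= min_projects)
        .map(|(pattern, _)| pattern.as_str())
        .collect();
    out.sort();
    out
}
```

Only normalized patterns and counts cross the org boundary in this sketch, in line with the privacy-preservation row in the table.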

9.5 Extractor Versioning

Task Description
Version tracking Track which version caught which issues
Changelog Record changes between versions
Rollback support Revert to previous version
A/B metrics Compare versions side-by-side
# .aphoria/extractors/learned/tls_min_version_const.yaml
version: 2
previous_version: 1
changelog:
  - version: 2
    date: 2026-03-15
    changes: "Added support for YAML configs"
    metrics:
      matches: +15%
      false_positives: -3%
  - version: 1
    date: 2026-02-10
    changes: "Initial auto-generated version"

9.6 Configuration

# aphoria.toml
[autonomous]
enabled = false                   # Opt-in to autonomous mode
min_confidence = 0.95             # Higher threshold for auto
min_projects = 10                 # More evidence required
shadow_scans = 100                # Scans before promotion
max_fp_rate = 0.05                # Max shadow-mode FP rate for auto-deploy

[autonomous.distribution]
receive_community = true          # Receive community extractors
contribute_patterns = true        # Share patterns to community

Files: autonomous/mod.rs, autonomous/shadow.rs, autonomous/rollback.rs, autonomous/distribution.rs


Milestone Summary

Phase Deliverable Depends On Status
0 ConceptPath in StemeDB concept-hierarchy spec
2 Aphoria CLI (scan, report, ack) Phase 0
2A Concept matching (leaf, alias, auto-alias) Phase 2
1 Authoritative corpus expansion Phase 0
3 Claude Code skill + hooks Phase 2A
4.5 Ephemeral scan mode (40x faster) Phase 2
5 Research agent loop Phase 3
6 Federated Policy & Trust Packs Phase 4.5
6.5 Trust Pack Extensions (Predicate Aliases, Key Rotation) Phase 6
4A Observational claims (Tier 4 write-back) Phase 6
4B Self-conflict detection (drift) Phase 4A
4C Diff-only scanning (--staged) Phase 4B
4E Hosted mode (team aggregation) Phase 4C
4D Enhanced ack (--reason, policy updates) Phase 4C
5.6 Community Corpus Contributions Phase 4E
7 Declarative Extractors Phase 6
7.5 LLM-in-the-Loop Extraction (Gemini) Phase 7
7.6 Pattern Learning Store Phase 7.5
7.7 Pattern → Extractor Promotion Phase 7.6
8 Enterprise Extractors (MVP: 8.1, 8.6, 8.11) Phase 7.5
9 Autonomous Extractor Generation Phase 8

Current state:

  • Phases 0-3, 4.5, 4A-4E, 5, 5.6, 6, 7, 7.5, 7.6, 8 (MVP) complete (clippy clean)
  • Full corpus: RFC, OWASP, Vendor sources
  • 17 extractors including security (weak_crypto, command_injection, sql_injection, high_entropy_secrets, auth_bypass, insecure_cookies)
  • Trust Packs: signed policy bundles with import/export
  • Ephemeral mode: 40x faster for CI
  • Observation write-back: --sync records novel claims as Tier 4 project memory
  • Drift detection: Detects changes from prior observations
  • Staged scanning: --staged flag for fast pre-commit hooks
  • Hosted mode: Team aggregation via central StemeDB server
  • Enhanced ack: --reason flag, aphoria update for policy changes
  • Community Corpus: Opt-in anonymous pattern sharing with privacy-preserving anonymization
  • Declarative Extractors: TOML-defined custom extractors without Rust code
  • LLM Extraction: Gemini-powered semantic claim extraction for high-value files
  • Enterprise Extractors MVP: High-entropy secrets (Shannon entropy), auth bypass patterns, insecure cookie flags
  • Pattern Learning: LLM-extracted claims recorded for promotion to declarative extractors

Next: Phase 7.7 → 8 (full) → 9 (Self-Learning Extraction System)

The Self-Learning Vision

Phase 7: Declarative Extractors (foundation)           ✅ COMPLETE
    ↓
Phase 7.5: LLM-in-the-Loop (Gemini semantic extraction) ✅ COMPLETE
    ↓
Phase 7.6: Pattern Learning (remember what LLM finds)   ✅ COMPLETE
    ↓
Phase 7.7: Pattern Promotion (patterns → extractors)    ⬜ NEXT
    ↓
Phase 8: Enterprise Extractors (generated + curated)    ✅ MVP (8.1, 8.6, 8.11)
    ↓
Phase 9: Autonomous Generation (fully self-improving)   ⬜

The endgame: Every PR teaches Aphoria. After a month, it knows your security patterns better than your team does.

Bidirectional Knowledge Sync (Complete)

The pre-commit hook is now a bidirectional knowledge sync:

  1. 4A : Record code claims as Tier 4 observations (project memory)
  2. 4B : Detect drift from prior observations (self-conflict)
  3. 4C : Fast diff-only scanning for pre-commit hooks (--staged)
  4. 4E : Team aggregation via hosted StemeDB server
  5. 4D : Enhanced ack with rationale and policy updates

This transforms Aphoria from a linter into a learning system that builds institutional memory per-project and collective intelligence across teams via hosted mode.


Phase 8: Enterprise Extractor Improvements

Goal: Transform extractors from "toy examples" to enterprise-grade detection that catches real violations in production codebases.

Current State Audit

| Extractor | Languages | Strengths | Weaknesses |
|-----------|-----------|-----------|------------|
| tls_verify | 8 | Multi-lang, configs | Misses custom wrappers |
| tls_version | 8 | API patterns | Misses semantic (const = "1.0") |
| hardcoded_secrets | 8 | Placeholders, test files | No entropy detection |
| weak_crypto | 5 | MD5/SHA1/DES/RC4 | SHA1 false positives, misses bcrypt cost |
| sql_injection | 5 | Interpolation patterns | Misses ORM unsafe methods |
| jwt_config | 8 | alg:none, skip sig | Library-specific gaps |
| cors_config | 8 | Wildcard + credentials | Misses dynamic origin reflection |
| rate_limit | 8 | Basic patterns | Limited depth |
| timeout_config | 8 | Basic patterns | Limited depth |
| command_injection | 5 | exec/system calls | Indirect injection |
| dep_versions | 3 | Version parsing | No CVE correlation |

Enterprise Reality: Current extractors catch an estimated ~30% of real-world security misconfigurations. Config files offer the highest value because their patterns are consistent; application code offers the lowest because detection requires semantic understanding.


8.1 High-Entropy Secret Detection

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task Status
HighEntropySecretsExtractor extractors/high_entropy_secrets.rs
Shannon entropy algorithm shannon_entropy() with 4.5 threshold
Charset variety check 0.4 minimum variety ratio
Known secret prefixes AWS (AKIA), Stripe (sk_live_, sk_test_), GitHub (ghp_, gho_), GitLab (glpat-), Slack (xox[baprs]-)
High-entropy context patterns api_key, secret, token, credential, auth_key contexts
False positive exclusions UUIDs, git SHAs (40-char hex), file hashes (64-char hex)
Test file confidence reduction 0.6 confidence for test files
Tests 10+ tests covering all patterns

Configuration:

# aphoria.toml
[extractors.entropy]
min_entropy = 4.5          # Shannon entropy threshold
min_charset_variety = 0.4  # Unique chars / length ratio
min_length = 20            # Minimum string length
max_length = 200           # Maximum string length

Languages: Rust, Go, Python, JavaScript, TypeScript, YAML, TOML, JSON, Dotenv
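
The entropy and charset checks can be sketched with the standard library alone. This is an illustration of the technique, not the code in extractors/high_entropy_secrets.rs: `charset_variety` and `passes_entropy_checks` are hypothetical names, and the known-prefix, context, and exclusion rules from the table are left out.

```rust
use std::collections::{HashMap, HashSet};

/// Shannon entropy in bits per character.
pub fn shannon_entropy(s: &str) -> f64 {
    let len = s.chars().count();
    if len == 0 {
        return 0.0;
    }
    let mut counts: HashMap<char, usize> = HashMap::new();
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
    }
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / len as f64;
            -p * p.log2()
        })
        .sum()
}

/// Unique characters divided by length (0.0 for an empty string).
pub fn charset_variety(s: &str) -> f64 {
    let len = s.chars().count();
    if len == 0 {
        return 0.0;
    }
    s.chars().collect::<HashSet<char>>().len() as f64 / len as f64
}

/// Apply the [extractors.entropy] defaults: length 20..=200,
/// entropy >= 4.5, variety >= 0.4.
pub fn passes_entropy_checks(s: &str) -> bool {
    let len = s.chars().count();
    (20..=200).contains(&len)
        && shannon_entropy(s) >= 4.5
        && charset_variety(s) >= 0.4
}
```

Because entropy is bounded by log2 of the number of distinct characters, a string needs at least 23 distinct characters to reach 4.5 bits, and hex-only strings (16 symbols, at most 4 bits) cannot pass this check on entropy alone.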


8.2 Framework-Specific Extractors

Impact: HIGH | Effort: HIGH

Generic patterns miss framework-specific misconfigurations. Enterprise codebases use frameworks.

8.2.1 Spring Boot Security

# application.yml misconfigs
security:
  basic:
    enabled: false      # Auth disabled
  csrf:
    enabled: false      # CSRF disabled
  headers:
    frame-options: DISABLE  # Clickjacking
// Java code patterns
@EnableWebSecurity
public class Config extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http.csrf().disable();  // CSRF disabled
        http.authorizeRequests().antMatchers("/**").permitAll();  // Auth bypass
    }
}

8.2.2 Django Security

# settings.py misconfigs
DEBUG = True  # Debug in production
ALLOWED_HOSTS = ['*']  # All hosts
CSRF_COOKIE_SECURE = False  # Insecure cookies
SESSION_COOKIE_SECURE = False

8.2.3 Express.js Security

// Security middleware checks
app.use(helmet());  // Expected; flag when this call is missing
app.use(cors({ origin: '*', credentials: true }));  // Wildcard origin with credentials
app.disable('x-powered-by');  // Expected; flag when this call is missing

8.2.4 Rails Security

# config/environments/production.rb
config.force_ssl = false  # Should be true
config.action_dispatch.cookies_same_site_protection = :none

8.3 Config File Deep Parsing

Impact: HIGH | Effort: MEDIUM

Current extractors use regex on config files. This misses:

  • Nested structures
  • Environment-specific overrides
  • Comments that disable security

Implementation:

// Parse YAML/JSON/TOML into structured form
enum ConfigValue {
    String(String),
    Number(f64),
    Bool(bool),
    Array(Vec<ConfigValue>),
    Object(HashMap<String, ConfigValue>),
}

// Then extract with path awareness
fn extract_config_claims(config: &ConfigValue, path: &[String]) -> Vec<ExtractedClaim> {
    // Recursively walk structure
    // Track full path: "server.tls.min_version"
    // Apply semantic rules based on path
}

Patterns to catch:

  • tls.verify: false anywhere in hierarchy
  • security.enabled: false in production configs
  • debug: true or DEBUG: true in non-dev files
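
A path-aware walk over the parsed structure might look like the sketch below. It redeclares `ConfigValue` to stay self-contained and uses a hypothetical `Finding` type in place of `ExtractedClaim`; the production-versus-dev file distinction from the list above is out of scope here.

```rust
use std::collections::HashMap;

pub enum ConfigValue {
    String(String),
    Number(f64),
    Bool(bool),
    Array(Vec<ConfigValue>),
    Object(HashMap<String, ConfigValue>),
}

/// Hypothetical result type standing in for ExtractedClaim.
#[derive(Debug, PartialEq)]
pub struct Finding {
    pub path: String,
    pub issue: &'static str,
}

/// Recursively walk the config, tracking the full dotted path.
pub fn walk(value: &ConfigValue, path: &mut Vec<String>, out: &mut Vec<Finding>) {
    match value {
        ConfigValue::Object(map) => {
            for (key, child) in map {
                path.push(key.clone());
                walk(child, path, out);
                path.pop();
            }
        }
        ConfigValue::Array(items) => {
            for (i, item) in items.iter().enumerate() {
                path.push(i.to_string());
                walk(item, path, out);
                path.pop();
            }
        }
        ConfigValue::Bool(b) => {
            // Semantic rules keyed on the last one or two path segments.
            let leaf = path.last().map(|s| s.to_ascii_lowercase());
            let parent = path
                .len()
                .checked_sub(2)
                .and_then(|i| path.get(i))
                .map(|s| s.to_ascii_lowercase());
            let issue = match (parent.as_deref(), leaf.as_deref(), *b) {
                (Some("tls"), Some("verify"), false) => Some("TLS verification disabled"),
                (Some("security"), Some("enabled"), false) => Some("security disabled"),
                (_, Some("debug"), true) => Some("debug enabled"),
                _ => None,
            };
            if let Some(issue) = issue {
                out.push(Finding { path: path.join("."), issue });
            }
        }
        // Strings and numbers carry no rules in this sketch.
        _ => {}
    }
}
```

Matching on path segments rather than on raw text is what lets `tls.verify: false` be caught at any nesting depth, for example under `server.tls.verify`.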

8.4 Semantic TLS Version Detection

Impact: MEDIUM | Effort: MEDIUM

Current tls_version misses:

const TLS_MIN_VERSION: &str = "1.0";  // Not caught!
const MIN_TLS: &str = "TLSv1";        // Not caught!

Implementation:

// Semantic pattern: variable name suggests TLS + value is deprecated
let semantic_tls = Regex::new(
    r#"(?i)(tls|ssl)_?(min|minimum|version)[^=]*[:=]\s*["']?(1\.[01]|TLSv?1(?:\.[01])?|SSL)"#
).unwrap();

Also catch:

  • Environment variables: TLS_MIN_VERSION=1.0
  • Terraform: min_tls_version = "TLS1_0"
  • Kubernetes: minTLSVersion: VersionTLS10

8.5 ORM SQL Injection Detection

Impact: MEDIUM | Effort: MEDIUM

Current sql_injection catches raw string interpolation but misses ORM escape hatches:

# SQLAlchemy
db.execute(text(f"SELECT * FROM users WHERE id = {user_id}"))
User.query.filter(text("name = '" + name + "'"))

# Django
User.objects.raw("SELECT * FROM users WHERE id = %s" % user_id)
User.objects.extra(where=["name = '%s'" % name])
// Sequelize
sequelize.query(`SELECT * FROM users WHERE id = ${userId}`);
Model.findAll({ where: sequelize.literal(`id = ${id}`) });

// Prisma
prisma.$queryRawUnsafe(`SELECT * FROM users WHERE id = ${id}`);
# ActiveRecord
User.where("name = '#{name}'")
User.find_by_sql("SELECT * FROM users WHERE id = #{id}")

8.6 Authentication Bypass Patterns

Impact: HIGH | Effort: MEDIUM | Status: Complete

Task Status
AuthBypassExtractor extractors/auth_bypass.rs
Hardcoded admin credentials username == "admin" && password == "..." patterns
Debug auth headers X-Debug-Auth, X-Internal-Auth, X-Admin-Auth
Skip auth env vars SKIP_AUTH, BYPASS_AUTH, NO_AUTH, DEBUG_AUTH
Backdoor patterns if username == "backdoor", if user == "test"
Default credentials admin/admin, root/root, test/test, guest/guest
Test file confidence reduction 0.5 confidence for test files
Tests 11+ tests covering all patterns

Detected patterns:

# Hardcoded credentials
if username == "admin" and password == "admin":

# Debug auth headers
if request.headers.get("X-Debug-Auth") == "secret":

# Skip auth env vars
if os.environ.get("SKIP_AUTH") == "true":

Languages: Python, JavaScript, TypeScript, Go, Rust


8.7 Insecure Deserialization

Impact: HIGH | Effort: MEDIUM

Unsafe deserialization of untrusted data:

# Python
pickle.loads(user_input)
yaml.load(user_input)  # Without Loader=SafeLoader
eval(user_input)
exec(user_input)
// Java
ObjectInputStream ois = new ObjectInputStream(userInput);
ois.readObject();  // Dangerous!
# Ruby
Marshal.load(user_input)
YAML.load(user_input)  # Should use safe_load

8.8 Path Traversal Patterns

Impact: MEDIUM | Effort: LOW

File operations with user input:

# Python
open(user_input)
os.path.join(base, user_input)  # Doesn't prevent ../
shutil.copy(user_input, dest)
// JavaScript
fs.readFile(userInput)
path.join(base, userInput)  // Doesn't prevent ../
res.sendFile(userInput)

8.9 SSRF Patterns

Impact: HIGH | Effort: MEDIUM

HTTP requests with user-controlled URLs:

# Python
requests.get(user_url)
urllib.request.urlopen(user_input)
// JavaScript
fetch(userUrl)
axios.get(userUrl)
http.get(userUrl)
// Go
http.Get(userURL)
client.Do(req)  // Where req.URL is user-controlled

8.10 Missing Security Headers

Impact: MEDIUM | Effort: LOW

Detect when security headers are explicitly removed or not set:

# Response headers missing
response.headers.pop('X-Content-Type-Options')
response.headers['X-Frame-Options'] = 'ALLOWALL'
// Express without helmet
app.use(cors());  // CORS without other security
// No app.use(helmet()) found

8.11 Insecure Cookies

Impact: MEDIUM | Effort: LOW | Status: Complete

Task Status
InsecureCookiesExtractor extractors/insecure_cookies.rs
Missing Secure flag secure=False, secure: false
Missing HttpOnly flag httponly=False, httpOnly: false
SameSite=None without Secure sameSite: 'none', SameSite=None
Django settings SESSION_COOKIE_SECURE, CSRF_COOKIE_SECURE = False
Go cookie patterns Secure: false, HttpOnly: false
Rust actix-web patterns .secure(false), .http_only(false)
Test file confidence reduction 0.5 confidence for test files
Tests 8+ tests covering all patterns

Detected patterns:

# Python/Flask/Django
response.set_cookie('session', value, secure=False)
SESSION_COOKIE_SECURE = False
// JavaScript/Express
res.cookie('session', value, { httpOnly: false });
res.cookie('auth', value, { sameSite: 'none' });

Languages: Python, JavaScript, TypeScript, Go, Rust, Ruby, YAML


8.12 Unvalidated Redirects

Impact: MEDIUM | Effort: LOW

Open redirect vulnerabilities:

# Python
return redirect(request.args.get('next'))
return redirect(request.GET['url'])
// JavaScript
res.redirect(req.query.redirect);
window.location = userInput;
window.location.href = params.url;

8.13 XXE (XML External Entity)

Impact: HIGH | Effort: MEDIUM

Unsafe XML parsing:

# Python
etree.parse(user_input)  # Without disabling entities
xml.etree.ElementTree.parse(user_input)
// Java
DocumentBuilderFactory.newInstance()  // Without setFeature to disable XXE
SAXParserFactory.newInstance()  // Without secure processing
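
The Java cases are "present without" checks: a factory is created and no hardening call appears nearby. One way to approximate that at file level is sketched below. The function name is hypothetical, and treating either marker string as proof of hardening is a deliberate simplification.

```python
# Factory calls that create an XML parser with default (permissive) settings.
FACTORY_CALLS = (
    "DocumentBuilderFactory.newInstance()",
    "SAXParserFactory.newInstance()",
)

# Strings whose presence suggests the parser was hardened: the Xerces
# disallow-doctype-decl feature URI and XMLConstants.FEATURE_SECURE_PROCESSING.
HARDENING_MARKERS = (
    "disallow-doctype-decl",
    "FEATURE_SECURE_PROCESSING",
)


def java_xxe_candidate(source: str) -> bool:
    """True when a file creates an XML factory and shows no hardening marker."""
    creates_factory = any(call in source for call in FACTORY_CALLS)
    hardened = any(marker in source for marker in HARDENING_MARKERS)
    return creates_factory and not hardened
```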

8.14 Weak Password Requirements

Impact: MEDIUM | Effort: LOW

Password validation that's too weak:

# Python
if len(password) >= 4:  # Too short
if len(password) >= 6:  # Still weak
MIN_PASSWORD_LENGTH = 6  # Config too low
// JavaScript
if (password.length >= 4)
const MIN_LENGTH = 6;
/^.{4,}$/  // Regex allows 4+ chars
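
Unlike the earlier extractors, this one has to read a number out of the match and compare it to a threshold. A hedged sketch, assuming a minimum acceptable length of 8 characters (a configurable assumption, not a value taken from this roadmap):

```python
import re

MIN_ACCEPTABLE_LENGTH = 8  # assumed threshold; adjust to the policy in force

# Each pattern captures the numeric minimum length it enforces.
LENGTH_CHECKS = [
    re.compile(r"len\(\s*password\s*\)\s*>=\s*(\d+)"),   # Python comparison
    re.compile(r"password\.length\s*>=\s*(\d+)"),        # JavaScript comparison
    re.compile(r"MIN(?:_PASSWORD)?_LENGTH\s*=\s*(\d+)"), # config constants
    re.compile(r"\.\{(\d+),\}"),                         # regex quantifier .{N,}
]


def weak_length_checks(source: str) -> list[tuple[int, int]]:
    """Return (line_number, enforced_minimum) where the minimum is too low."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern in LENGTH_CHECKS:
            for match in pattern.finditer(line):
                minimum = int(match.group(1))
                if minimum < MIN_ACCEPTABLE_LENGTH:
                    findings.append((lineno, minimum))
    return findings
```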

8.15 LLM-Assisted Extraction (Future)

Impact: VERY HIGH | Effort: VERY HIGH

Use Claude to understand code semantically:

// Pseudo-implementation: `claude_api`, `LlmError`, and
// `parse_claims_from_llm_response` are placeholders, not existing APIs.
async fn extract_with_llm(code: &str, file: &str) -> Result<Vec<ExtractedClaim>, LlmError> {
    let prompt = format!(
        "Analyze this code for security issues. Return JSON with:\n\
         - concept_path: security concept (e.g., 'tls/cert_verification')\n\
         - predicate: what aspect (e.g., 'enabled')\n\
         - value: the value found\n\
         - confidence: 0.0-1.0\n\
         - description: why this is an issue\n\n\
         File: {}\n\
         Code:\n```\n{}\n```",
        file, code
    );

    // `?` needs a Result return type; the parser is assumed to return the same Result.
    let response = claude_api.message(&prompt).await?;
    parse_claims_from_llm_response(&response)
}

When to use:

  • High-value files (auth, crypto, config)
  • After regex extractors find nothing
  • For code review mode (not CI)
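
The list above does not say how the three conditions combine. One possible reading, sketched here as an assumption, is that CI runs never call the LLM, while review runs call it for high-value files or when the regex extractors found nothing. The function name, the `mode` strings, and the path keywords are all hypothetical.

```python
# Path fragments treated as "high-value" in this sketch (assumed list).
HIGH_VALUE_KEYWORDS = ("auth", "crypto", "config")


def should_use_llm(path: str, regex_finding_count: int, mode: str) -> bool:
    """Decide whether a file is worth the cost of an LLM extraction pass."""
    if mode == "ci":
        # CI runs skip the LLM entirely to avoid its cost and latency.
        return False
    high_value = any(keyword in path.lower() for keyword in HIGH_VALUE_KEYWORDS)
    return high_value or regex_finding_count == 0
```

Gating on cheap signals first keeps the expensive call off the common path and limits how much code leaves the machine.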

Considerations:

  • Cost per scan
  • Latency
  • Rate limits
  • Privacy (code leaves machine)

Implementation Priority

| Phase | Extractors | Impact | Effort | Enterprise Value | Status |
|-------|------------|--------|--------|------------------|--------|
| 8.1 | High-entropy secrets | HIGH | MEDIUM | Catches real leaked secrets | Complete |
| 8.2 | Framework-specific | HIGH | HIGH | Spring/Django/Express coverage | |
| 8.3 | Config deep parsing | HIGH | MEDIUM | Nested YAML/JSON understanding | |
| 8.4 | Semantic TLS | MEDIUM | MEDIUM | Catches const TLS_MIN = "1.0" | |
| 8.5 | ORM SQL injection | MEDIUM | MEDIUM | SQLAlchemy, Django, Sequelize | |
| 8.6 | Auth bypass | HIGH | MEDIUM | Backdoors, hardcoded creds | Complete |
| 8.7 | Deserialization | HIGH | MEDIUM | pickle, Marshal, eval | |
| 8.8 | Path traversal | MEDIUM | LOW | ../../../etc/passwd | |
| 8.9 | SSRF | HIGH | MEDIUM | Internal network access | |
| 8.10 | Security headers | MEDIUM | LOW | Missing helmet(), CSP | |
| 8.11 | Cookie flags | MEDIUM | LOW | httpOnly, secure, sameSite | Complete |
| 8.12 | Open redirects | MEDIUM | LOW | Phishing via redirect | |
| 8.13 | XXE | HIGH | MEDIUM | XML entity injection | |
| 8.14 | Weak passwords | MEDIUM | LOW | MIN_LENGTH = 4 | |
| 8.15 | LLM extraction | VERY HIGH | VERY HIGH | Semantic understanding (Phase 7.5) | |

MVP Complete (8.1, 8.6, 8.11): High-impact extractors for enterprise pilots.

Recommended order for remaining extractors:

  1. 8.3 Config deep parsing (foundational for 8.2)
  2. 8.2 Framework-specific (customer-driven)
  3. 8.5 ORM SQL injection (common in enterprise apps)
  4. 8.7 Deserialization (critical vulnerabilities)

Success Metrics

| Metric | Current | Target | How to Measure |
|--------|---------|--------|----------------|
| Detection rate (known vulns) | ~30% | >70% | Run against OWASP benchmark |
| False positive rate | Unknown | <10% | Manual review of 100 findings |
| Config file coverage | Regex only | Full parse | Structure-aware extraction |
| Framework coverage | 0 | 4 major | Spring, Django, Express, Rails |
| Enterprise pilot feedback | N/A | >4/5 | Post-pilot survey |
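
The first two metrics reduce to simple ratios. The sketch below assumes detection rate means detected known vulnerabilities over all known vulnerabilities in the benchmark, and false positive rate means the share of manually reviewed findings judged incorrect; both definitions and the function names are assumptions, since the table only states the targets.

```python
def detection_rate(detected: int, known_vulns: int) -> float:
    """Fraction of known benchmark vulnerabilities that were reported."""
    return detected / known_vulns


def false_positive_rate(false_positives: int, reviewed: int) -> float:
    """Fraction of manually reviewed findings judged to be incorrect."""
    return false_positives / reviewed


def meets_targets(detection: float, false_positive: float) -> bool:
    """Check both rates against the targets in the table (>70%, <10%)."""
    return detection > 0.70 and false_positive < 0.10
```

For example, 30 detections out of 100 known vulnerabilities gives the current ~30% figure, which falls short of the target regardless of the false positive rate.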