jordan 1cc453c97b feat: Aphoria policy source tracking + claim extraction pipeline

- Add PolicySourceStore for tracking where policies come from
- Implement claim extraction skill and API endpoints
- Add community UI text selection extractor component
- Create Go SDK aphoria client for policy operations
- Document patent specifications and legal disclosures
- Add guides: golden path loop, policy audit trails, pre-flight checks
- Expand Unreal Engine config extractor with source tracking
- Add UAT reports for policy source tracking validation
- Refactor tests.rs into modular test files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-04 02:35:02 -07:00

6.8 KiB


Aphoria: Technical Overview

Status: Beta (0.1.0)
Type: CLI Static Analysis Tool & Policy Engine

What It Actually Does

Aphoria is a command-line tool that scans source code for configuration patterns that contradict authoritative technical standards (RFCs, OWASP guidelines, vendor documentation).

Unlike standard linters (which check for syntax errors or style) or SAST tools (which check for known vulnerability patterns), Aphoria validates intent against authority.

Example: If you write verify=False in a Python requests call, a standard linter sees valid Python code. A SAST tool might flag it as "Generic Security Risk."

Aphoria does something specific:

1. Extracts the claim: "This code asserts that TLS verification is disabled."
2. Queries its internal knowledge graph: "What do authoritative sources say about TLS verification?"
3. Finds RFC 5246 (Tier 0 Regulatory): "TLS verification MUST be enabled."
4. Calculates a conflict score (0.92) based on the authority difference.
5. Reports a BLOCK verdict with the specific RFC citation.
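
The five steps above can be sketched as a toy pipeline in Python. This is an illustration under stated assumptions, not Aphoria's implementation: the regex, the in-memory AUTHORITY table, and the names extract_claim, check_line, and Finding are hypothetical. The scoring step is left out because this document does not give the formula.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical in-memory stand-in for the knowledge graph, keyed by concept.
# The single entry mirrors the RFC 5246 example in the text.
AUTHORITY = {
    "tls/cert_verification": ("RFC 5246", "Regulatory", "true"),
}

@dataclass
class Finding:
    verdict: str
    concept: str
    code_value: str
    citation: str

def extract_claim(line: str) -> Optional[Tuple[str, str]]:
    """Step 1: turn a source line into a (concept, value) claim."""
    if re.search(r"\bverify\s*=\s*False\b", line):
        return ("tls/cert_verification", "false")
    return None

def check_line(line: str) -> Optional[Finding]:
    claim = extract_claim(line)
    if claim is None:
        return None  # nothing was extracted from this line
    concept, value = claim
    rule = AUTHORITY.get(concept)  # Steps 2-3: look up the authoritative assertion
    if rule is None or rule[2] == value:
        return None  # no rule, or no contradiction: stay silent
    source, tier, _ = rule
    # Step 4 (conflict score) is omitted: the formula is not given in this document.
    verdict = "BLOCK" if tier in ("Regulatory", "Clinical") else "FLAG"  # Step 5
    return Finding(verdict, concept, value, source)

finding = check_line("requests.get(url, verify=False)")
print(f"{finding.verdict} {finding.concept} ({finding.citation})")
# prints: BLOCK tls/cert_verification (RFC 5246)
```

A line with no matching rule produces no finding at all, which is the "stays silent" behavior described later under Performance & Precision.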

Architecture

Aphoria runs entirely locally on your machine. It embeds StemeDB, a specialized probabilistic database, to store authoritative assertions and run conflict detection against extracted claims.

[ Codebase ] ──▶ [ Walk & Extract ] ──▶ [ Claims ]
                                            │
[ RFCs/Docs ] ──▶ [ Corpus Build ] ──▶ [ StemeDB (Local) ]
                                            │
                                            ▼
                                    [ Conflict Detection ]
                                            │
                                    [ Report / Exit Code ]

1. Extraction (Regex & AST)

Aphoria uses language-specific Extractors to find configuration patterns. It currently supports Rust, Go, Python, JavaScript/TypeScript, and configuration files (YAML, TOML, INI).

It normalizes these patterns into Concept Paths:

Python: verify=False → code://python/tls/cert_verification = false
Go: InsecureSkipVerify: true → code://go/tls/cert_verification = false
Rust: danger_accept_invalid_certs(true) → code://rust/tls/cert_verification = false
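
A minimal sketch of how such normalization could work, assuming a simple regex-per-language table. The PATTERNS table and the extract function are hypothetical names for illustration. The three source patterns and the resulting concept paths are taken from the examples above, but the regexes themselves are guesses and do not come from Aphoria's extractors.

```python
import re
from typing import List, Tuple

# Hypothetical pattern table: (language, regex, concept suffix, normalized value).
# All three patterns disable certificate verification, so they share one concept.
PATTERNS = [
    ("python", re.compile(r"\bverify\s*=\s*False\b"),
     "tls/cert_verification", "false"),
    ("go", re.compile(r"\bInsecureSkipVerify\s*:\s*true\b"),
     "tls/cert_verification", "false"),
    ("rust", re.compile(r"\bdanger_accept_invalid_certs\(\s*true\s*\)"),
     "tls/cert_verification", "false"),
]

def extract(language: str, source: str) -> List[Tuple[str, str]]:
    """Return (concept_path, value) pairs for every pattern that matches."""
    claims = []
    for lang, pattern, concept, value in PATTERNS:
        if lang == language and pattern.search(source):
            claims.append((f"code://{lang}/{concept}", value))
    return claims

print(extract("go", "tls.Config{InsecureSkipVerify: true}"))
# prints: [('code://go/tls/cert_verification', 'false')]
```

Because every language maps to the same concept suffix, a single authoritative assertion about tls/cert_verification can be compared against claims from all three.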

2. The Knowledge Graph (StemeDB)

Aphoria maintains a local graph of Authoritative Assertions. These are structured facts derived from technical documents:

rfc://5246/tls/cert_verification = true (Source: IETF, Tier: Regulatory)
owasp://secrets/api_key = secure_storage (Source: OWASP, Tier: Clinical)
vendor://redis/timeout = 5000 (Source: Redis Docs, Tier: Observational)
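
To make the shape of these facts concrete, here is a small sketch that parses the textual form shown above into a structured record. The Assertion class and parse_assertion function are hypothetical. StemeDB's actual storage format is not described in this document, so this models only the one-line notation.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Assertion:
    scheme: str   # "rfc", "owasp", or "vendor"
    path: str     # e.g. "5246/tls/cert_verification"
    value: str    # the asserted value, kept as a string
    source: str   # publishing body, e.g. "IETF"
    tier: str     # source class, e.g. "Regulatory"

# Matches "<scheme>://<path> = <value> (Source: <source>, Tier: <tier>)".
LINE = re.compile(
    r"^(?P<scheme>\w+)://(?P<path>\S+)\s*=\s*(?P<value>\S+)\s*"
    r"\(Source:\s*(?P<source>[^,]+),\s*Tier:\s*(?P<tier>[^)]+)\)$"
)

def parse_assertion(text: str) -> Assertion:
    match = LINE.match(text.strip())
    if match is None:
        raise ValueError(f"not an assertion line: {text!r}")
    fields = {key: val.strip() for key, val in match.groupdict().items()}
    return Assertion(**fields)

a = parse_assertion(
    "rfc://5246/tls/cert_verification = true (Source: IETF, Tier: Regulatory)"
)
print(a.scheme, a.path, a.value, a.tier)
# prints: rfc 5246/tls/cert_verification true Regulatory
```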

3. Conflict Resolution

When a Code Claim ("verify=false") addresses the same concept as an Authoritative Assertion ("verify=true") but asserts a different value, Aphoria calculates a Conflict Score.

The score depends on the Source Class:

Tier 0 (Regulatory): RFCs, Laws. Infinite authority. Conflict = BLOCK.
Tier 1 (Clinical): OWASP, NIST. High authority. Conflict = BLOCK.
Tier 2 (Observational): Vendor docs. Recommendations. Conflict = FLAG.
Tier 3 (Expert): Your internal policies. Can override lower tiers.
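
The tier-to-verdict mapping for Tiers 0 through 2 can be sketched as a lookup table. The Tier enum and the verdict function are hypothetical names. Tier 3 is not mapped, because the list above says only that internal policies can override other tiers and does not say what verdict a Tier 3 conflict produces.

```python
from enum import IntEnum
from typing import Optional

class Tier(IntEnum):
    REGULATORY = 0
    CLINICAL = 1
    OBSERVATIONAL = 2
    EXPERT = 3

# Verdicts for Tiers 0-2 as listed above. Tier 3 is intentionally absent.
VERDICTS = {
    Tier.REGULATORY: "BLOCK",
    Tier.CLINICAL: "BLOCK",
    Tier.OBSERVATIONAL: "FLAG",
}

def verdict(code_value: str, authority_value: str, tier: Tier) -> Optional[str]:
    """Return the verdict for a claim, or None when it agrees with the authority."""
    if code_value == authority_value:
        return None
    if tier not in VERDICTS:
        # Override semantics for internal policies are not specified here.
        raise NotImplementedError(f"no verdict defined for {tier.name}")
    return VERDICTS[tier]

print(verdict("false", "true", Tier.REGULATORY))   # prints: BLOCK
print(verdict("1000", "5000", Tier.OBSERVATIONAL))  # prints: FLAG
```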

Key Features

1. Federated Policy (Trust Packs)

Organizations often have internal rules that override or extend public standards (e.g., "We allow MD5 for file hashing, just not for passwords").

Aphoria allows you to export these decisions as Trust Packs (.pack files).

Format: Binary, zero-copy (rkyv), cryptographically signed (Ed25519).
Workflow: A Security Engineer runs aphoria ack to acknowledge specific exceptions in a "Golden Repo." They export this as a Trust Pack.
Enforcement: Other teams add policies = ["https://internal/security.pack"] to their config. Aphoria downloads the pack and uses those signed assertions to resolve conflicts.
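
As a rough sketch of the enforcement step, the code below applies a list of acknowledgements to a set of findings. It assumes the pack has already been downloaded, decoded, and signature-checked, and none of that is shown. The Ack class and apply_acks function are hypothetical, and the OPEN / ACKNOWLEDGED labels are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class Ack:
    concept: str  # concept path that was acknowledged
    reason: str   # justification recorded with the acknowledgement

def apply_acks(findings: List[str], acks: List[Ack]) -> Dict[str, str]:
    """Map each finding's concept path to "ACKNOWLEDGED: <reason>" or "OPEN"."""
    reasons = {ack.concept: ack.reason for ack in acks}
    return {
        concept: f"ACKNOWLEDGED: {reasons[concept]}" if concept in reasons else "OPEN"
        for concept in findings
    }

# Stand-in for the decoded contents of a verified Trust Pack.
pack = [Ack("code://rust/auth/jwt/audience_validation", "Internal-only service")]

status = apply_acks(
    ["code://rust/auth/jwt/audience_validation", "code://python/tls/cert_verification"],
    pack,
)
for concept, state in status.items():
    print(concept, "->", state)
```

Only findings whose concept path appears in the pack are treated as resolved. Everything else remains an open conflict.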

2. Domain-Specific Audits

Aphoria is not limited to web security. It includes specialized corpora for different domains:

Unreal Engine: Detects synchronous loading on the game thread (performance), hardcoded asset paths (architecture), and exposed console commands (security).
Cloud Infrastructure: Detects disabled AWS S3 Public Access Block settings and overly permissive IAM policies.

3. CI/CD Integration

Aphoria is designed to run in pipelines:

Fast: Scans typical projects in < 0.2 seconds.
Structured: Emits JSON or SARIF output for dashboard integration.
Blocking: --exit-code ensures builds fail if Regulatory/Clinical conflicts exist.
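
For pipelines that post-process the JSON output themselves, a consumer might look like the sketch below. The report schema (a top-level "findings" list whose items carry a "verdict" field) is an assumption for illustration only, since this document does not specify the JSON layout. The summarize function and the second sample concept path are likewise hypothetical.

```python
import json
from collections import Counter
from typing import Tuple

def summarize(report_json: str) -> Tuple[Counter, int]:
    """Count findings per verdict and derive a CI exit code (1 if any BLOCK)."""
    report = json.loads(report_json)
    counts = Counter(item["verdict"] for item in report.get("findings", []))
    exit_code = 1 if counts["BLOCK"] > 0 else 0
    return counts, exit_code

# Hypothetical report in the assumed schema.
sample = json.dumps({
    "findings": [
        {"verdict": "BLOCK", "concept": "code://rust/auth/jwt/audience_validation"},
        {"verdict": "FLAG", "concept": "code://python/redis/timeout"},
    ]
})

counts, code = summarize(sample)
print(dict(counts), code)
# prints: {'BLOCK': 1, 'FLAG': 1} 1
```

In a real pipeline the script would end with sys.exit(code), so that a BLOCK finding fails the build while FLAG findings only appear in the summary.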

Performance & Precision

We benchmarked Aphoria against VulnBank, an intentionally vulnerable polyglot codebase.

| Metric | Aphoria Result | Context |
|---|---|---|
| Findings | 63 | Covered TLS, JWT, Injection, Secrets, Configs |
| Precision | 100% | Every finding was a real vulnerability backed by an RFC/OWASP citation. |
| Speed | ~0.1s | 21 files, 5 languages. Optimized Rust implementation. |

Why 100% Precision? Most tools search for "suspicious patterns" (heuristics). Aphoria searches for contradictions to specific rules. If there isn't a specific RFC or Policy saying "Don't do X," Aphoria stays silent. This eliminates the "noise" typical of security tools.

Usage Example

# 1. Initialize the local knowledge base
$ aphoria init

# 2. Scan a project
$ aphoria scan ./my-app

  BLOCK    code://rust/auth/jwt/audience_validation
         Your code:  validate_aud = false (src/auth.rs:24)
         RFC 7519:   Audience validation MUST be enabled.
         Conflict:   0.92

# 3. Fix or Acknowledge
# If you fix it in code -> Conflict disappears.
# If you acknowledge it (with a valid reason):
$ aphoria ack "code://rust/auth/jwt/audience_validation" --reason "Internal-only service"

Comparison

| Tool | How it works | Best for... |
|---|---|---|
| Snyk / SonarQube | Data flow analysis & CVE db | Finding known exploits in dependencies or complex logic flows. |
| Semgrep | Syntactic pattern matching | Custom linting rules and finding generic "bad code" patterns. |
| Aphoria | Epistemic conflict detection | Enforcing architectural decisions, configuration compliance, and "Golden Path" alignment. |

Aphoria is effectively "Semantic Semgrep": instead of writing rules yourself, you get rules derived from published technical sources (RFCs, vendor docs) and your organization's signed policies.
