stemedb/applications/aphoria/product.md
jordan 1cc453c97b feat: Aphoria policy source tracking + claim extraction pipeline
2026-02-04 02:35:02 -07:00


# Aphoria: Technical Overview
- **Status:** Beta (0.1.0)
- **Type:** CLI Static Analysis Tool & Policy Engine
---
## What It Actually Does
Aphoria is a command-line tool that scans source code for configuration patterns that contradict authoritative technical standards (RFCs, OWASP guidelines, vendor documentation).
Unlike standard linters (which check for syntax errors or style) or SAST tools (which check for known vulnerability patterns), Aphoria validates **intent against authority**.
**Example:**
If you write `verify=False` in a Python `requests` call, a standard linter sees valid Python code. A SAST tool might flag it as "Generic Security Risk."
Aphoria does something specific:
1. **Extracts** the claim: "This code asserts that TLS verification is disabled."
2. **Queries** its internal knowledge graph: "What do authoritative sources say about TLS verification?"
3. **Finds** `RFC 5246` (Tier 0 Regulatory): "TLS verification MUST be enabled."
4. **Calculates** a conflict score (0.92) based on the authority difference.
5. **Reports** a BLOCK verdict with the specific RFC citation.
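
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not Aphoria's implementation: the `Claim` and `Assertion` types, the `KNOWLEDGE` dictionary, and both function names are hypothetical, and the scoring step is left out because the source does not give the formula.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    concept: str   # normalized concept, e.g. "tls/cert_verification"
    value: str     # value the code asserts
    location: str  # file:line where the pattern was found


@dataclass(frozen=True)
class Assertion:
    concept: str
    value: str
    source: str    # citation reported to the user
    tier: int      # 0 = Regulatory


# Hypothetical one-entry knowledge base mirroring the example above.
KNOWLEDGE = {
    "tls/cert_verification": Assertion("tls/cert_verification", "true", "RFC 5246", 0),
}


def extract(line: str, location: str) -> Optional[Claim]:
    """Step 1: turn a source line into a claim (one pattern only)."""
    if "verify=False" in line.replace(" ", ""):
        return Claim("tls/cert_verification", "false", location)
    return None


def check(claim: Claim) -> Optional[str]:
    """Steps 2, 3 and 5: look up the authority and report a contradiction."""
    authority = KNOWLEDGE.get(claim.concept)
    if authority is None or authority.value == claim.value:
        return None  # no rule, or the code agrees with it: stay silent
    # The only entry is Tier 0, so the verdict here is always BLOCK.
    return f"BLOCK {claim.concept} at {claim.location} contradicts {authority.source}"


claim = extract("requests.get(url, verify=False)", "app.py:10")
print(check(claim))
```

A line without the pattern produces no claim, and a claim with no matching assertion produces no finding.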
---
## Architecture
Aphoria runs entirely locally on your machine. It embeds **StemeDB**, a specialized probabilistic database, to store authoritative assertions and detect conflicts against them.
```
[ Codebase ]  ──▶ [ Walk & Extract ] ──▶ [ Claims ]
                                              │
                                              ▼
[ RFCs/Docs ] ──▶ [ Corpus Build ]   ──▶ [ StemeDB (Local) ]
                                              │
                                              ▼
                                         [ Conflict Detection ]
                                              │
                                              ▼
                                         [ Report / Exit Code ]
```
### 1. Extraction (Regex & AST)
Aphoria uses language-specific **Extractors** to find configuration patterns. It currently supports Rust, Go, Python, JavaScript/TypeScript, and configuration files (YAML, TOML, INI).
It normalizes these patterns into **Concept Paths**:
- Python: `verify=False` → `code://python/tls/cert_verification = false`
- Go: `InsecureSkipVerify: true` → `code://go/tls/cert_verification = false`
- Rust: `danger_accept_invalid_certs(true)` → `code://rust/tls/cert_verification = false`
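
As a rough sketch of the regex side of this normalization, the three mappings above can be written as a small rule table. The `RULES` table and the `normalize` function are hypothetical names for illustration. Real extractors would also need AST-level context that a single-line regex cannot see.

```python
import re

# (language, pattern, concept, value): patterns mirror the three examples above.
RULES = [
    ("python", re.compile(r"\bverify\s*=\s*False\b"),
     "tls/cert_verification", "false"),
    ("go", re.compile(r"\bInsecureSkipVerify\s*:\s*true\b"),
     "tls/cert_verification", "false"),
    ("rust", re.compile(r"\bdanger_accept_invalid_certs\(\s*true\s*\)"),
     "tls/cert_verification", "false"),
]


def normalize(language: str, line: str) -> list[str]:
    """Return the concept paths asserted by one line of source code."""
    paths = []
    for lang, pattern, concept, value in RULES:
        if lang == language and pattern.search(line):
            paths.append(f"code://{lang}/{concept} = {value}")
    return paths


print(normalize("go", "cfg := &tls.Config{InsecureSkipVerify: true}"))
print(normalize("rust", "builder.danger_accept_invalid_certs(true)"))
```

Because every language maps to the same concept suffix (`tls/cert_verification`), one authoritative assertion can cover all three.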
### 2. The Knowledge Graph (StemeDB)
Aphoria maintains a local graph of **Authoritative Assertions**. These are structured facts derived from technical documents:
- `rfc://5246/tls/cert_verification` = `true` (Source: IETF, Tier: Regulatory)
- `owasp://secrets/api_key` = `secure_storage` (Source: OWASP, Tier: Clinical)
- `vendor://redis/timeout` = `5000` (Source: Redis Docs, Tier: Observational)
### 3. Conflict Resolution
When a code claim (for example, `cert_verification = false`) contradicts an authoritative assertion about the same concept (`cert_verification = true`), Aphoria calculates a **Conflict Score**.
The score depends on the **Source Class**:
- **Tier 0 (Regulatory):** RFCs, Laws. Infinite authority. Conflict = BLOCK.
- **Tier 1 (Clinical):** OWASP, NIST. High authority. Conflict = BLOCK.
- **Tier 2 (Observational):** Vendor docs. Recommendations. Conflict = FLAG.
- **Tier 3 (Expert):** Your internal policies. Can override lower tiers.
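
The tier-to-verdict mapping can be illustrated with a small lookup. This sketch makes several assumptions that the source does not confirm: when multiple assertions conflict, the strictest verdict wins; a claim with no conflicts yields a hypothetical `PASS`; and Tier 3 overrides are left out because their exact semantics are not specified here. The `resolve` function assumes its input assertions have already been matched to the claim's concept.

```python
from enum import IntEnum


class Tier(IntEnum):
    REGULATORY = 0
    CLINICAL = 1
    OBSERVATIONAL = 2
    EXPERT = 3


# Verdicts for Tiers 0-2 as listed above; Tier 3 is intentionally absent.
VERDICT_BY_TIER = {
    Tier.REGULATORY: "BLOCK",
    Tier.CLINICAL: "BLOCK",
    Tier.OBSERVATIONAL: "FLAG",
}

# The three example assertions from the knowledge graph section.
ASSERTIONS = [
    ("rfc://5246/tls/cert_verification", "true", Tier.REGULATORY),
    ("owasp://secrets/api_key", "secure_storage", Tier.CLINICAL),
    ("vendor://redis/timeout", "5000", Tier.OBSERVATIONAL),
]


def resolve(claimed_value: str, matched: list[tuple[str, str, Tier]]) -> str:
    """Return the strictest verdict among assertions the claim contradicts."""
    conflicts = [tier for _path, value, tier in matched if value != claimed_value]
    verdicts = {VERDICT_BY_TIER[t] for t in conflicts if t in VERDICT_BY_TIER}
    if "BLOCK" in verdicts:
        return "BLOCK"
    if "FLAG" in verdicts:
        return "FLAG"
    return "PASS"  # hypothetical label for "nothing to report"


print(resolve("false", [ASSERTIONS[0]]))  # contradicts a Tier 0 assertion
print(resolve("3000", [ASSERTIONS[2]]))   # contradicts a Tier 2 assertion
```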
---
## Key Features
### 1. Federated Policy (Trust Packs)
Organizations often have internal rules that override or extend public standards (e.g., "We allow MD5 for file hashing, just not for passwords").
Aphoria allows you to export these decisions as **Trust Packs** (`.pack` files).
- **Format:** Binary, zero-copy (rkyv), cryptographically signed (Ed25519).
- **Workflow:** A Security Engineer runs `aphoria ack` to acknowledge specific exceptions in a "Golden Repo." They export this as a Trust Pack.
- **Enforcement:** Other teams add `policies = ["https://internal/security.pack"]` to their config. Aphoria downloads the pack and uses those signed assertions to resolve conflicts.
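
To make the enforcement step concrete, here is a minimal sketch of how acknowledged exceptions from a pack might suppress matching findings. It models a pack as an in-memory list of hypothetical `Ack` records. The real `.pack` format is binary and signed, so parsing and Ed25519 verification would have to succeed before anything like this runs; both are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ack:
    concept_path: str  # the path passed to `aphoria ack`
    reason: str        # the justification recorded with it


def apply_pack(findings: list[str], pack: list[Ack]) -> list[str]:
    """Drop findings whose concept path has been acknowledged in the pack."""
    acknowledged = {ack.concept_path for ack in pack}
    return [path for path in findings if path not in acknowledged]


pack = [Ack("code://rust/auth/jwt/audience_validation", "Internal-only service")]
findings = [
    "code://rust/auth/jwt/audience_validation",
    "code://rust/tls/cert_verification",
]
print(apply_pack(findings, pack))
```

Only the acknowledged path is suppressed. The TLS finding still surfaces because nothing in the pack covers it.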
### 2. Domain-Specific Audits
Aphoria is not limited to web security. It includes specialized corpora for different domains:
- **Unreal Engine:** Detects synchronous loading on the game thread (performance), hardcoded asset paths (architecture), and exposed console commands (security).
- **Cloud Infrastructure:** Detects disabled AWS S3 public access blocks and overly permissive IAM policies.
### 3. CI/CD Integration
Aphoria is designed to run in pipelines:
- **Fast:** Scans typical projects in < 0.2 seconds.
- **Structured:** Emits JSON or SARIF output for dashboard integration.
- **Blocking:** `--exit-code` ensures builds fail if Regulatory/Clinical conflicts exist.
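
A pipeline step that consumes the JSON output could decide the build result along these lines. The report shape used here (a top-level `findings` list whose items have a `verdict` field) is an assumption made for illustration, not Aphoria's documented schema, and `exit_code` is a hypothetical helper.

```python
import json


def exit_code(report_json: str) -> int:
    """Return 1 if any finding carries a blocking verdict, else 0."""
    findings = json.loads(report_json).get("findings", [])
    return 1 if any(f.get("verdict") == "BLOCK" for f in findings) else 0


# Hypothetical report with one FLAG and one BLOCK finding.
sample = json.dumps({"findings": [
    {"verdict": "FLAG", "concept": "code://python/redis/timeout"},
    {"verdict": "BLOCK", "concept": "code://rust/auth/jwt/audience_validation"},
]})
print(exit_code(sample))
```

In a CI job the return value would be passed to `sys.exit`, so FLAG-only reports pass and any BLOCK fails the build.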
---
## Performance & Precision
We benchmarked Aphoria against **VulnBank**, an intentionally vulnerable polyglot codebase.
| Metric | Aphoria Result | Context |
| :------------ | :------------- | :---------------------------------------------------------------------- |
| **Findings** | 63 | Covered TLS, JWT, Injection, Secrets, Configs |
| **Precision** | 100% | Every finding was a real vulnerability backed by an RFC/OWASP citation. |
| **Speed** | ~0.1s | 21 files, 5 languages. Optimized Rust implementation. |
**Why 100% Precision?**
Most tools search for "suspicious patterns" (heuristics). Aphoria searches for **contradictions to specific rules**. If there isn't a specific RFC or Policy saying "Don't do X," Aphoria stays silent. This eliminates the "noise" typical of security tools.
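
For reference, the precision figure in the table follows the standard definition, true positives divided by all reported findings. The short check below plugs in the benchmark numbers: 63 findings, none of them false.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of reported findings that are real issues."""
    return true_positives / (true_positives + false_positives)


print(f"{precision(63, 0):.0%}")  # 100%
```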
---
## Usage Example
```bash
# 1. Initialize the local knowledge base
$ aphoria init
# 2. Scan a project
$ aphoria scan ./my-app
BLOCK code://rust/auth/jwt/audience_validation
Your code: validate_aud = false (src/auth.rs:24)
RFC 7519: Audience validation MUST be enabled.
Conflict: 0.92
# 3. Fix or Acknowledge
# If you fix it in code -> Conflict disappears.
# If you acknowledge it (with a valid reason):
$ aphoria ack "code://rust/auth/jwt/audience_validation" --reason "Internal-only service"
```
---
## Comparison
| Tool | How it works | Best for... |
| :------------------- | :------------------------------- | :---------------------------------------------------------------------------------------- |
| **Snyk / SonarQube** | Data flow analysis & CVE db | Finding known exploits in dependencies or complex logic flows. |
| **Semgrep** | Syntactic pattern matching | Custom linting rules and finding generic "bad code" patterns. |
| **Aphoria** | **Epistemic conflict detection** | Enforcing architectural decisions, configuration compliance, and "Golden Path" alignment. |
Aphoria is effectively **"Semantic Semgrep"**: instead of writing rules yourself, you get rules derived from the world's technical knowledge (RFCs/Docs) and your organization's signed policies.