# Aphoria
**An autonomous knowledge compounding system powered by Episteme.**
Aphoria is a **continuous learning flywheel** that runs on every commit, using LLM workflows to scan code, fix violations, dynamically evaluate patterns, author claims, and create extractors—constantly learning from your organization's decisions.
## The Autonomous Loop
```
Developer commits code
1. SCAN: LLM-driven extractors → observations
2. FIX: Violations detected → developer fixes
3. EVALUATE: LLM analyzes patterns → suggests new claims
4. CREATE: LLM generates extractors for custom patterns
(Loop repeats on next commit)
```
**Knowledge compounds** with every commit. Each scan benefits from all previous commits' learning—not through ML training, but through accumulated structured decisions.
## LLM-Driven Workflows
Aphoria's autonomous operation **requires LLM integration**:
- **Claude Code skills** - `/aphoria-claims`, `/aphoria-suggest`, `/aphoria-custom-extractor-creator`
- **Go ADK agents** - Custom tool-use agents for autonomous claim authoring
- **Any LLM with tool use** - Build your own integration via the CLI interface
**The CLI is a debug/fallback interface**, not the primary workflow. Manual operation doesn't scale—LLMs enforce naming conventions, reason about consequences, and drive the autonomous flywheel.
## Quick Example (Via LLM Workflow)
```bash
# Developer commits code with TLS misconfiguration
$ git commit -m "Add API client"
# LLM skill analyzes diff, finds violation
/aphoria-claims
BLOCK code://python/requests/tls/cert_verification
Your code: verify=False (api/client.py:42)
RFC 5246: TLS certificate verification MUST be enabled
Conflict: 0.92
# LLM suggests fix
> Fix detected: Enable TLS verification
# LLM creates claim for project-specific pattern
> Claim authored: api-client-tls-001
```
---
## Getting Started
**New to Aphoria?** Start with LLM-driven workflows:
1. **[Load the skill](../../.claude/skills/aphoria-claims/)** - `/aphoria-claims` for commit-time claim authoring
2. **[Learn It (20 min)](dogfood/dbpool/)** - Complete worked example with database connection pool
3. **[Build an agent](../../sdk/go/adk/)** - ADK-Go integration for autonomous operation
**Fallback (No LLM Access):**
- **[CLI Quick Start (2 min)](docs/getting-started/solo-developer-quick-start.md)** - Manual scan workflow (debug interface)
See [Getting Started Hub](docs/getting-started/) for all paths.
---
## CLI Reference (Debug/Fallback Interface)
**⚠️ The CLI is for debugging, testing, and environments without LLM access. For production workflows, use [LLM-driven skills](#llm-driven-workflows).**
### Install
```bash
# From source
cd applications/aphoria
cargo install --path .
# Verify
aphoria --version
```
### Initialize
```bash
aphoria init
```
This sets up your local database. The corpus (RFCs, OWASP guidelines, community patterns) is built dynamically during scans.
**Bootstrap corpus (optional):**
```bash
# Import patterns from wiki documentation (LLM skill recommended)
aphoria corpus import wiki ~/docs/security-best-practices/
```
### Scan (Manual Mode)
```bash
# Quick scan (ephemeral, fast)
aphoria scan .
# With persistence (enables diff/baseline, required for flywheel)
aphoria scan --persist
# With sync (enables community learning, required for flywheel)
aphoria scan --persist --sync
# CI mode (exit code 1 on BLOCK)
aphoria scan --exit-code
# Pre-commit (staged files only)
aphoria scan --staged --exit-code
```
**⚠️ Manual scanning alone does NOT activate the flywheel.** The flywheel requires LLM workflows to evaluate patterns, suggest claims, and create extractors autonomously.
### Handle Conflicts
**Fix the code:**
```python
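import requests

url = "https://example.com/api"  # placeholder URL for illustration
# Before (flagged): requests.get(url, verify=False)
# After: certificate verification on, which is also the requests default
response = requests.get(url, verify=True)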
```
**Or acknowledge intentionally:**
```bash
aphoria ack "code://python/requests/tls/cert_verification" \
  --reason "Local dev environment with self-signed certs"
```
---
## Key Concepts: Observations vs Claims
Aphoria distinguishes between two types of extracted information:
| Type | What it is | Who creates it | Example |
|------|-----------|----------------|---------|
| **Observation** | Pattern match: "this code does X" | Extractors (automated) | `imports/tokio: true` |
| **Claim** | Rule: "code MUST do X because Y" | Humans (you!) | "Core MUST NOT import tokio because it creates runtime coupling" |
**Observations** are what extractors find - they're grep results with confidence scores. They have no opinion about whether something is good or bad.
**Claims** are human-authored rules with:
- **Provenance** - Where the rule came from (RFC, security review, architecture decision)
- **Invariant** - What must stay true ("Wallet MUST NOT derive Clone")
- **Consequence** - What breaks if violated ("Multiple wallet instances → double-spend")
- **Authority tier** - How much weight this rule carries
- **Evidence** - Supporting artifacts (ADRs, test cases, etc.)
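
As a rough illustration, the fields listed above can be modelled as a small record. This is a hypothetical Python sketch, not Aphoria's internal type: the class name `Claim`, the field names, and the choice of a list of strings for `evidence` are all assumptions made to mirror the list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Claim:
    """Hypothetical model of an authored claim; not Aphoria's actual schema."""

    claim_id: str     # e.g. "wallet-no-clone-001"
    provenance: str   # where the rule came from
    invariant: str    # what must stay true
    consequence: str  # what breaks if violated
    tier: str         # authority tier, e.g. "expert"
    evidence: list[str] = field(default_factory=list)  # supporting artifacts


claim = Claim(
    claim_id="wallet-no-clone-001",
    provenance="Wallet is singleton with atomic state",
    invariant="Wallet type MUST NOT derive Clone",
    consequence="Clone allows multiple instances, breaking single-balance invariant",
    tier="expert",
)
```

An observation, by contrast, would carry only the matched pattern and a confidence score, with no invariant or consequence attached.
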
When you run `aphoria scan`, it compares observations against:
1. **Authoritative corpus** - RFC/OWASP standards + community patterns (emergent from real usage)
2. **Your authored claims** - Project-specific rules in `.aphoria/claims.toml`
The corpus is **emergent**: patterns with 95%+ adoption across projects auto-promote to authoritative status.
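
The promotion rule can be pictured as a simple threshold check. The sketch below is illustrative only: how Aphoria actually counts adoption is not described here, so the function name `promotable_patterns`, the per-project counting, and the sample numbers are assumptions.

```python
def promotable_patterns(adoption: dict[str, int], total_projects: int,
                        threshold: float = 0.95) -> list[str]:
    """Return patterns whose adoption ratio meets the threshold, sorted by name."""
    if total_projects <= 0:
        return []
    return sorted(
        pattern
        for pattern, projects_using in adoption.items()
        if projects_using / total_projects >= threshold
    )


# Made-up counts: number of projects (out of 200) where each pattern was observed.
counts = {"tls/cert_verification=true": 196, "http/timeout=set": 150}
promoted = promotable_patterns(counts, total_projects=200)
```
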
See [Claims-Based Verification](#claims-based-verification) below for creating your own claims.
---
## Output Formats
```bash
aphoria scan --format table # Human-readable (default)
aphoria scan --format json # Machine-readable
aphoria scan --format sarif # GitHub Security tab
aphoria scan --format markdown # Documentation
```
---
## Pre-commit Integration
```yaml
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: aphoria
        name: Aphoria truth check
        entry: aphoria scan --staged --exit-code
        language: system
        pass_filenames: false
```
---
## CI Integration (GitHub Actions)
```yaml
- name: Install Aphoria
  run: cargo install --path applications/aphoria
- name: Run Aphoria Scan
  run: aphoria scan --exit-code --format sarif > results.sarif
- name: Upload SARIF
  # Run even when the scan step fails on BLOCK, so findings still reach the Security tab
  if: always()
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif
```
---
## Key Commands
### Scanning
| Command | Description |
|---------|-------------|
| `aphoria scan` | Scan for conflicts with authoritative sources |
| `aphoria ack` | Acknowledge a conflict as intentional |
| `aphoria bless` | Define a pattern as your authoritative standard |
### Claims Management
| Command | Description |
|---------|-------------|
| `aphoria claims create` | Author a new claim with provenance and consequences |
| `aphoria claims list` | List all authored claims |
| `aphoria claims explain` | Generate detailed claim explanations |
| `aphoria claims update` | Update an existing claim |
| `aphoria claims supersede` | Mark claim as superseded by newer claim |
| `aphoria claims deprecate` | Deprecate a claim with reason |
### Inline Markers
| Command | Description |
|---------|-------------|
| `aphoria claims list-markers` | List pending inline claim markers |
| `aphoria claims formalize-marker` | Convert marker to full claim |
| `aphoria claims reject-marker` | Reject an inline marker |
### Verification
| Command | Description |
|---------|-------------|
| `aphoria verify run` | Verify authored claims against codebase |
| `aphoria verify map` | Show extractor-to-claim coverage map |
### Policy & Governance
| Command | Description |
|---------|-------------|
| `aphoria policy export` | Export standards as a Trust Pack |
| `aphoria policy import` | Import a Trust Pack from your security team |
| `aphoria governance pending` | List approval requests (Phase 14) |
| `aphoria audit export` | Export audit trail for SOC 2 compliance |
See [CLI Reference](docs/cli-reference.md) for complete command documentation.
---
## Claims-Based Verification
Beyond scanning for RFC/OWASP conflicts, Aphoria supports **human-authored claims** that encode your project's architectural decisions and safety invariants.
### Quick Example
```bash
# Author a claim
aphoria claims create \
  --id wallet-no-clone-001 \
  --concept-path maxwell/core/wallet/type/wallet/derives \
  --predicate traits \
  --value Clone \
  --comparison not_contains \
  --provenance "Wallet is singleton with atomic state" \
  --invariant "Wallet type MUST NOT derive Clone" \
  --consequence "Clone allows multiple instances, breaking single-balance invariant" \
  --tier expert \
  --category safety \
  --by jml
# Verify claim against codebase
aphoria verify run
# Output:
# PASS wallet-no-clone-001 | maxwell/core/wallet/type/wallet/derives/traits
# Clone not found (as expected)
```
### Comparison Modes
Claims support six comparison modes for different verification patterns:
- `equals` - Value must be exactly X
- `not_equals` - Value must NOT be X
- `present` - Something must exist at this path
- `absent` - Nothing should exist at this path
- `contains` - Value must contain substring/list element (e.g., "Serialize" in "Clone,Debug,Serialize")
- `not_contains` - Value must NOT contain substring/list element (e.g., "Clone" NOT in derives)
See [Comparison Modes Guide](docs/comparison-modes.md) for detailed examples and decision tree.
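
To make the six modes concrete, here is a minimal Python sketch of how each one might be evaluated. It is not Aphoria's implementation: the function name `check` is hypothetical, a missing value is represented as `None`, and `contains`/`not_contains` are assumed to match whole elements of a comma-separated list (the real tool may also match substrings).

```python
from typing import Optional


def check(comparison: str, observed: Optional[str], expected: str = "") -> bool:
    """Evaluate one comparison mode; `observed` is None when nothing exists at the path."""
    elements = [] if observed is None else [part.strip() for part in observed.split(",")]
    if comparison == "equals":
        return observed == expected
    if comparison == "not_equals":
        return observed != expected
    if comparison == "present":
        return observed is not None
    if comparison == "absent":
        return observed is None
    if comparison == "contains":
        return expected in elements
    if comparison == "not_contains":
        return expected not in elements
    raise ValueError(f"unknown comparison mode: {comparison}")


derives = "Clone,Debug,Serialize"
serialize_found = check("contains", derives, "Serialize")  # Serialize is in the list
clone_absent = check("not_contains", "Debug", "Clone")     # Clone is not derived
```
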
### Inline Markers
Mark claims directly in code with special comments:
```rust
// @aphoria:claim[safety] Wallet MUST NOT derive Clone
#[derive(Debug)]
pub struct Wallet { ... }
```
Then formalize them:
```bash
aphoria claims list-markers
aphoria claims formalize-marker marker-001 --id wallet-no-clone-001 --by jml
```
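
For intuition, marker discovery can be approximated with a line-by-line regular expression. The sketch below is an assumption-laden illustration, not the logic behind `aphoria claims list-markers`: the grammar is inferred from the single example above, and `MARKER_RE`, `Marker`, and `find_markers` are hypothetical names.

```python
import re
from typing import NamedTuple

# Assumed grammar, inferred from the one example above:
#   // @aphoria:claim[<category>] <free-text invariant>
MARKER_RE = re.compile(r"//\s*@aphoria:claim\[(?P<category>[a-z_-]+)\]\s*(?P<text>.+)")


class Marker(NamedTuple):
    line_no: int
    category: str
    text: str


def find_markers(source: str) -> list[Marker]:
    """Return every inline claim marker found in `source`, with 1-based line numbers."""
    markers = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        match = MARKER_RE.search(line)
        if match:
            markers.append(Marker(line_no, match["category"], match["text"].strip()))
    return markers


rust_source = """\
// @aphoria:claim[safety] Wallet MUST NOT derive Clone
#[derive(Debug)]
pub struct Wallet;
"""
markers = find_markers(rust_source)
```
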
### Git Commit Tracking
Aphoria automatically captures the git commit hash when claims and observations are ingested. This provides:
- **Temporal context** - Know exactly which code version a claim was authored against
- **Audit trail** - Trace architectural decisions through git history
- **Graceful degradation** - Works seamlessly in non-git environments
The commit hash is stored in assertion metadata and captured at ingestion time (not when TOML files are edited), avoiding the "double-commit problem."
```json
{
  "authored": true,
  "git_commit": "de7af7c1b9e...",
  "claim_id": "wallet-no-clone-001",
  "provenance": "Wallet is singleton with atomic state"
}
```
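
The capture-at-ingestion behaviour, including graceful degradation outside git, can be sketched as follows. This is a hypothetical Python illustration rather than Aphoria's code: the function names are invented, and omitting the `git_commit` key when no repository is available is an assumption about what "graceful" means here. The only external call is the standard `git rev-parse HEAD`.

```python
import subprocess
from typing import Optional


def current_git_commit(repo_dir: str = ".") -> Optional[str]:
    """Return the HEAD commit hash, or None when git or a repository is unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or repo_dir is not inside a repository.
        return None
    return result.stdout.strip() or None


def assertion_metadata(claim_id: str, provenance: str) -> dict:
    """Build metadata at ingestion time; the commit key is omitted outside git."""
    metadata = {"authored": True, "claim_id": claim_id, "provenance": provenance}
    commit = current_git_commit()
    if commit is not None:
        metadata["git_commit"] = commit
    return metadata
```

Because the hash is read when the claim is ingested, editing the TOML file and committing it does not require a second commit to record the hash.
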
### Bulk Import Claims
Import claims in bulk from TOML files instead of creating them one by one via the CLI.
**Quick start:**
```bash
# Generate template
aphoria claims import --template > my-claims.toml
# Validate format
aphoria claims import my-claims.toml --validate-only
# Preview changes
aphoria claims import my-claims.toml --dry-run
# Import for real
aphoria claims import my-claims.toml
```
**Benefits:**
- **Faster:** Import 22 claims in <1 second (vs. 15 minutes for shell scripts)
- **Safer:** Pre-import validation catches all errors before any writes
- **Clearer:** TOML format is more readable than 340 lines of bash
- **Atomic:** All claims imported or none (no partial writes on error)
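
The "validate everything, then write all or nothing" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the importer itself: the function names, the `REQUIRED_FIELDS` subset, the kebab-case pattern, and the in-memory `store` dictionary are all hypothetical stand-ins.

```python
import re
from collections import Counter

KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
REQUIRED_FIELDS = ("id", "invariant", "consequence")  # assumed subset, for illustration


def validate_claims(claims: list[dict]) -> list[str]:
    """Check every claim and return all error messages; an empty list means valid."""
    errors = []
    id_counts = Counter(claim.get("id", "") for claim in claims)
    for index, claim in enumerate(claims):
        claim_id = claim.get("id", "")
        for name in REQUIRED_FIELDS:
            if not str(claim.get(name, "")).strip():
                errors.append(f"claim[{index}] {claim_id!r}: field '{name}' is empty")
        if claim_id and not KEBAB_CASE.match(claim_id):
            errors.append(f"claim[{index}] {claim_id!r}: id is not kebab-case")
        if claim_id and id_counts[claim_id] > 1:
            errors.append(f"claim[{index}] {claim_id!r}: duplicate id in import file")
    return errors


def import_claims(claims: list[dict], store: dict) -> list[str]:
    """Write to `store` only when validation finds no errors (all or nothing)."""
    errors = validate_claims(claims)
    if not errors:
        store.update({claim["id"]: claim for claim in claims})
    return errors
```

Collecting every error in one pass, instead of stopping at the first, is what lets a single run report all problems in the file before anything is written.
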
**Merge strategies:**
- `--merge skip_existing` (default) - Skip claims with duplicate IDs
- `--merge overwrite` - Replace existing claims with same ID
- `--merge fail_on_duplicate` - Exit with error if any ID exists
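
The three strategies differ only in what happens when an incoming ID already exists. A minimal Python sketch of that decision, assuming claims are dictionaries keyed by `id` (the function name `merge_claims` and the data shapes are hypothetical):

```python
def merge_claims(existing: dict[str, dict], incoming: list[dict],
                 strategy: str = "skip_existing") -> dict[str, dict]:
    """Return a new mapping of claim id -> claim after applying a merge strategy."""
    if strategy not in {"skip_existing", "overwrite", "fail_on_duplicate"}:
        raise ValueError(f"unknown merge strategy: {strategy}")
    duplicates = [claim["id"] for claim in incoming if claim["id"] in existing]
    if strategy == "fail_on_duplicate" and duplicates:
        # Fail before touching anything, so no partial result is produced.
        raise ValueError(f"claims already exist: {', '.join(duplicates)}")
    merged = dict(existing)
    for claim in incoming:
        if claim["id"] in existing and strategy == "skip_existing":
            continue  # keep the stored claim
        merged[claim["id"]] = claim  # new claim, or overwrite of an existing one
    return merged
```

Returning a new mapping instead of mutating `existing` keeps the sketch consistent with the atomic behaviour described above: a failure leaves the stored claims untouched.
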
**Output formats:**
- `--format table` (default) - Human-readable with symbols (✓, ⊗, ↻)
- `--format json` - Machine-readable for tooling integration
See [Bulk Import Guide](docs/guides/bulk-claim-import.md) for complete documentation and examples.
---
## Conflict Verdicts
| Verdict | Description | CI Behavior |
|---------|-------------|-------------|
| **BLOCK** | High-confidence conflict with RFC/OWASP | Fails with `--exit-code` |
| **FLAG** | Moderate-confidence conflict | Passes, visible in report |
| **ACK** | Acknowledged conflict | Passes, tracked for audit |
| **PASS** | No conflict | - |
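
The CI behaviour in the table reduces to one rule: only BLOCK fails the run, and only when `--exit-code` is passed. A small Python sketch of that rule (the function name `scan_exit_code` is hypothetical and this is not the CLI's actual code):

```python
def scan_exit_code(verdicts: list[str], exit_code_flag: bool) -> int:
    """Return 1 when --exit-code is set and any verdict is BLOCK, otherwise 0."""
    if exit_code_flag and "BLOCK" in verdicts:
        return 1
    return 0


# FLAG and ACK are reported but never fail the run in this sketch.
ci_result = scan_exit_code(["PASS", "FLAG", "ACK"], exit_code_flag=True)
blocked_result = scan_exit_code(["PASS", "BLOCK"], exit_code_flag=True)
```
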
---
## Web Dashboard
Aphoria includes a web-based dashboard for visualizing scan results, managing claims, and exploring the authoritative corpus. See [`applications/aphoria-dashboard/`](../aphoria-dashboard/) for setup instructions.
Features:
- Real-time scan visualization
- Claims management interface
- Corpus exploration and search
- Policy governance workflows
---
## Documentation
### Guides
| Guide | Audience | Time |
|-------|----------|------|
| [Solo Developer Guide](docs/guides/solo-developer-guide.md) | Individual developers, side projects | 2 min |
| [Enterprise Pilot Guide](docs/guides/enterprise-pilot-guide.md) | Security teams running pilots | 4 weeks |
| [Enterprise Quick Start](docs/guides/enterprise-quick-start.md) | Platform engineering | 5 min |
| [The First Scan](docs/guides/the-first-scan.md) | Everyone | 10 min |
### Reference
| Document | Description |
|----------|-------------|
| [CLI Reference](docs/cli-reference.md) | Complete command documentation |
| [Comparison Modes](docs/comparison-modes.md) | Guide to claim comparison modes |
| [Vision & Gaps](docs/vision-gaps.md) | Architecture and implementation status |
---
## Research & Reference
### Vision & Architecture
| Document | Description |
|----------|-------------|
| [Vision](vision.md) | Product vision and aspirational architecture |
| [Protocol Vision](protocol_vision.md) | Protocol-level design philosophy |
| [Vision & Gaps](docs/vision-gaps.md) | Honest assessment of current state vs. vision |
| [Architecture Docs](docs/architecture/README.md) | System design, concept matching, extension points |
### Testing & Validation
| Document | Description |
|----------|-------------|
| [UAT Reports](../../uat/README.md) | User acceptance testing results |
| [Phase 6 UAT](../../uat/phase6-uat.md) | Detailed validation of policy workflows |
| [Real-World Policy Source UAT](../../uat/2026-02-04-uat-real-world-policy-source.md) | Trust Pack workflow validation |
### Gap Analysis & Research
| Document | Description |
|----------|-------------|
| [Gap Analysis: Institutional Knowledge](docs/gap-analysis-institutional-knowledge.md) | Analysis of knowledge capture gaps |
| [Gap Fixes Summary](docs/gap-fixes-summary.md) | Summary of addressed gaps |
---
## What Aphoria Is Not
- **Not a linter.** Linters check syntax. Aphoria checks decisions against authoritative sources.
- **Not SAST.** SAST finds vulnerability patterns. Aphoria finds contradictions to specific standards.
- **Not AI autocomplete.** Copilot suggests code from the internet. Aphoria surfaces *your org's* decisions at the moment you contradict them.
---
## License
See [LICENSE](../../LICENSE) for details.