stemedb/applications/aphoria
jml 65065f3d8f feat(aphoria): implement community corpus with wiki import and pattern aggregation
Implements Phase 4 (A4) - Community corpus as first-class citizens:

- **Community Corpus Builder** - Queries StemeDB pattern aggregates
- **Wiki Import** - Bootstrap corpus from markdown docs (aphoria corpus import wiki)
- **Pattern Aggregation** - Automatic learning from local scans (--sync flag)
- **Storage Layer** - StemeDBPatternStore with content-addressed deduplication
- **Promotion Logic** - Multi-tier thresholds (95%/80%/50% adoption rates)
- **Corpus Build** - Unified registry for RFC/OWASP/Vendor/Community sources
- **Trust Packs** - Export corpus as signed, distributable artifacts
- **Documentation** - bootstrap-corpus.md guide + CLI reference updates

Technical details:
- Pattern aggregates stored as assertions with predicate "pattern_aggregate"
- Content-addressed subjects via BLAKE3(subject:predicate:value)
- PatternAggregator handles write path (observations → patterns)
- StemeDBPatternStore handles read path (pattern queries)
- Integration tests + fixtures in tests/wiki_import_test.rs

Deleted hardcoded.rs (368 lines) - corpus now fully emergent from StemeDB.
Deleted enriched-corpus-patterns.md (677 lines) - feature shipped.

Closes VG-026 (community corpus), part of A4 milestone.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-09 00:12:31 +00:00

Aphoria

A code-level truth linter powered by Episteme.

Aphoria scans your codebase for configuration patterns that contradict authoritative technical standards (RFCs, OWASP, vendor docs). Unlike linters that check syntax or SAST tools that find vulnerability patterns, Aphoria validates intent against authority.

$ aphoria scan .

BLOCK  code://python/requests/tls/cert_verification
       Your code:  verify=False (api/client.py:42)
       RFC 5246:   TLS certificate verification MUST be enabled
       Conflict:   0.92

1 conflict found (1 BLOCK).

Quick Start

Install

# From source
cd applications/aphoria
cargo install --path .

# Verify
aphoria --version

Initialize

aphoria init

This sets up your local database. The corpus (RFCs, OWASP guidelines, community patterns) is built dynamically during scans.

Bootstrap corpus (optional):

# Import patterns from wiki documentation
aphoria corpus import wiki ~/docs/security-best-practices/

Scan

# Quick scan (ephemeral, fast)
aphoria scan .

# With persistence (enables diff/baseline)
aphoria scan --persist

# With sync (enables community learning)
aphoria scan --persist --sync

# CI mode (exit code 1 on BLOCK)
aphoria scan --exit-code

# Pre-commit (staged files only)
aphoria scan --staged --exit-code

Community Learning: When you run aphoria scan --persist --sync, observations from your scan are aggregated into community pattern records. Patterns that reach 95%+ adoption across projects and are backed by an authoritative source auto-promote to the corpus, creating an emergent, self-improving knowledge base.
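As a rough illustration of the promotion rule described above, the sketch below decides whether a pattern qualifies. Only the 95% threshold and the authority-backing requirement come from this README; the class, field, and function names are hypothetical and are not Aphoria's actual API.

```python
from dataclasses import dataclass

PROMOTION_THRESHOLD = 0.95  # "95%+ adoption" from the text above


@dataclass
class PatternAggregate:
    """Hypothetical aggregate record; field names are illustrative only."""
    concept_path: str
    projects_seen: int       # projects in which the concept was observed
    projects_adopting: int   # projects that use this particular value
    authority_backed: bool   # True if an RFC/OWASP/vendor source agrees

    @property
    def adoption_rate(self) -> float:
        if self.projects_seen == 0:
            return 0.0
        return self.projects_adopting / self.projects_seen


def should_promote(pattern: PatternAggregate) -> bool:
    """Promote only when adoption is at least 95% and an authority backs the pattern."""
    return pattern.authority_backed and pattern.adoption_rate >= PROMOTION_THRESHOLD


tls = PatternAggregate("code://python/requests/tls/cert_verification", 200, 196, True)
print(should_promote(tls))  # 196/200 = 0.98 -> True
```

Requiring both conditions means a widely adopted but unbacked pattern stays out of the corpus, which matches the "adoption + authority backing" wording.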

Handle Conflicts

Fix the code:

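import requests

# Before: requests.get(url, verify=False)
# After (verify=True is also the default in requests, so the argument can be omitted):
requests.get(url, verify=True)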

Or acknowledge intentionally:

aphoria ack "code://python/requests/tls/cert_verification" \
  --reason "Local dev environment with self-signed certs"

Key Concepts: Observations vs Claims

Aphoria distinguishes between two types of extracted information:

| Type | What it is | Who creates it | Example |
|------|------------|----------------|---------|
| Observation | Pattern match: "this code does X" | Extractors (automated) | imports/tokio: true |
| Claim | Rule: "code MUST do X because Y" | Humans (you!) | "Core MUST NOT import tokio because it creates runtime coupling" |

Observations are what extractors find - they're grep results with confidence scores. They have no opinion about whether something is good or bad.

Claims are human-authored rules with:

  • Provenance - Where the rule came from (RFC, security review, architecture decision)
  • Invariant - What must stay true ("Wallet MUST NOT derive Clone")
  • Consequence - What breaks if violated ("Multiple wallet instances → double-spend")
  • Authority tier - How much weight this rule carries
  • Evidence - Supporting artifacts (ADRs, test cases, etc.)
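To make the distinction concrete, here is a minimal sketch of the two record types as plain data structures. The attribute list mirrors the bullets above; the class layout and field names are assumptions for illustration, not Aphoria's internal schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Observation:
    """What an extractor found: a fact with a confidence score, no opinion."""
    concept_path: str
    predicate: str
    value: str
    confidence: float


@dataclass(frozen=True)
class Claim:
    """A human-authored rule carrying the attributes listed above."""
    claim_id: str
    concept_path: str
    provenance: str    # where the rule came from
    invariant: str     # what must stay true
    consequence: str   # what breaks if violated
    tier: str          # how much weight the rule carries
    evidence: Tuple[str, ...] = ()  # supporting artifacts (ADRs, tests, ...)


observation = Observation("imports/tokio", "present", "true", 0.9)  # illustrative values
claim = Claim(
    claim_id="wallet-no-clone-001",
    concept_path="maxwell/core/wallet/type/wallet/derives",
    provenance="Wallet is singleton with atomic state",
    invariant="Wallet type MUST NOT derive Clone",
    consequence="Clone allows multiple instances, breaking single-balance invariant",
    tier="expert",
)
print(claim.invariant)
```

Note that the observation carries no judgement at all; the "MUST NOT" lives only in the claim.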

When you run aphoria scan, it compares observations against:

  1. Authoritative corpus - RFC/OWASP standards + community patterns (emergent from real usage)
  2. Your authored claims - Project-specific rules in .aphoria/claims.toml

The corpus is emergent: patterns with 95%+ adoption across projects and backing from an authoritative source auto-promote to authoritative status.

See Claims-Based Verification below for creating your own claims.


Output Formats

aphoria scan --format table     # Human-readable (default)
aphoria scan --format json      # Machine-readable
aphoria scan --format sarif     # GitHub Security tab
aphoria scan --format markdown  # Documentation
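If you want to post-process the SARIF output yourself, a small script along these lines may help. It relies only on the standard SARIF 2.1.0 layout (runs, results, level); how Aphoria maps its verdicts onto SARIF levels is not documented here, so the sample document below is hand-written and hypothetical.

```python
import json
from collections import Counter


def summarize_sarif(sarif_text: str) -> Counter:
    """Count results per SARIF level across all runs."""
    document = json.loads(sarif_text)
    counts = Counter()
    for run in document.get("runs", []):
        for result in run.get("results", []):
            # Simplification: treat a missing level as "warning", the SARIF default.
            counts[result.get("level", "warning")] += 1
    return counts


# Hand-written sample, not real Aphoria output.
SAMPLE = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "aphoria"}},
        "results": [{
            "ruleId": "code://python/requests/tls/cert_verification",
            "level": "error",
            "message": {"text": "verify=False contradicts RFC 5246"},
        }],
    }],
})

print(summarize_sarif(SAMPLE))  # Counter({'error': 1})
```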

Pre-commit Integration

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: aphoria
        name: Aphoria truth check
        entry: aphoria scan --staged --exit-code
        language: system
        pass_filenames: false

CI Integration (GitHub Actions)

- name: Install Aphoria
  run: cargo install --path applications/aphoria

- name: Run Aphoria Scan
  run: aphoria scan --exit-code --format sarif > results.sarif

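- name: Upload SARIF
  # Run even when the scan step fails on a BLOCK, so findings still reach the Security tab
  if: always()
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif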

Key Commands

Scanning

| Command | Description |
|---------|-------------|
| aphoria scan | Scan for conflicts with authoritative sources |
| aphoria ack | Acknowledge a conflict as intentional |
| aphoria bless | Define a pattern as your authoritative standard |

Claims Management

| Command | Description |
|---------|-------------|
| aphoria claims create | Author a new claim with provenance and consequences |
| aphoria claims list | List all authored claims |
| aphoria claims explain | Generate detailed claim explanations |
| aphoria claims update | Update an existing claim |
| aphoria claims supersede | Mark a claim as superseded by a newer claim |
| aphoria claims deprecate | Deprecate a claim with a reason |

Inline Markers

| Command | Description |
|---------|-------------|
| aphoria claims list-markers | List pending inline claim markers |
| aphoria claims formalize-marker | Convert a marker to a full claim |
| aphoria claims reject-marker | Reject an inline marker |

Verification

| Command | Description |
|---------|-------------|
| aphoria verify run | Verify authored claims against the codebase |
| aphoria verify map | Show extractor-to-claim coverage map |

Policy & Governance

| Command | Description |
|---------|-------------|
| aphoria policy export | Export standards as a Trust Pack |
| aphoria policy import | Import a Trust Pack from your security team |
| aphoria governance pending | List approval requests (Phase 14) |
| aphoria audit export | Export audit trail for SOC 2 compliance |

See CLI Reference for complete command documentation.


Claims-Based Verification

Beyond scanning for RFC/OWASP conflicts, Aphoria supports human-authored claims that encode your project's architectural decisions and safety invariants.

Quick Example

# Author a claim
aphoria claims create \
  --id wallet-no-clone-001 \
  --concept-path maxwell/core/wallet/type/wallet/derives \
  --predicate traits \
  --value Clone \
  --comparison not_contains \
  --provenance "Wallet is singleton with atomic state" \
  --invariant "Wallet type MUST NOT derive Clone" \
  --consequence "Clone allows multiple instances, breaking single-balance invariant" \
  --tier expert \
  --category safety \
  --by jml

# Verify claim against codebase
aphoria verify run

# Output:
# PASS  wallet-no-clone-001 | maxwell/core/wallet/type/wallet/derives/traits
#       Clone not found (as expected)

Comparison Modes

Claims support six comparison modes for different verification patterns:

  • equals - Value must be exactly X
  • not_equals - Value must NOT be X
  • present - Something must exist at this path
  • absent - Nothing should exist at this path
  • contains - Value must contain substring/list element (e.g., "Serialize" in "Clone,Debug,Serialize")
  • not_contains - Value must NOT contain substring/list element (e.g., "Clone" NOT in derives)
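The six modes can be read as a small decision function. The sketch below is one plausible interpretation, not Aphoria's implementation: it assumes the observed value is a string (or None when nothing exists at the path) and treats contains as a plain substring test, which also covers the comma-separated list case shown above.

```python
from typing import Optional

MODES = ("equals", "not_equals", "present", "absent", "contains", "not_contains")


def compare(mode: str, observed: Optional[str], expected: Optional[str] = None) -> bool:
    """Return True when the observed value satisfies the claim under `mode`."""
    if mode not in MODES:
        raise ValueError(f"unknown comparison mode: {mode}")
    if mode == "present":
        return observed is not None
    if mode == "absent":
        return observed is None
    if mode == "equals":
        return observed == expected
    if mode == "not_equals":
        return observed != expected
    # contains / not_contains: substring test; a missing value contains nothing.
    found = observed is not None and expected is not None and expected in observed
    return found if mode == "contains" else not found


print(compare("contains", "Clone,Debug,Serialize", "Serialize"))  # True
print(compare("not_contains", "Debug", "Clone"))                  # True
```

A substring test is the loosest reading: it would also match "Clone" inside a longer identifier, so a stricter variant could split on commas and compare list elements exactly.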

See Comparison Modes Guide for detailed examples and decision tree.

Inline Markers

Mark claims directly in code with special comments:

// @aphoria:claim[safety] Wallet MUST NOT derive Clone
#[derive(Debug)]
pub struct Wallet { ... }

Then formalize them:

aphoria claims list-markers
aphoria claims formalize-marker marker-001 --id wallet-no-clone-001 --by jml
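For intuition about what marker discovery involves, the following sketch extracts markers of the form shown above with a regular expression. The comment syntax is taken from the example; the `Marker` type and `find_markers` function are hypothetical, and the way Aphoria assigns IDs such as marker-001 is not covered here.

```python
import re
from typing import List, NamedTuple

# Matches: // @aphoria:claim[<category>] <text>
MARKER_RE = re.compile(r"//\s*@aphoria:claim\[(?P<category>[^\]]+)\]\s*(?P<text>.+)")


class Marker(NamedTuple):
    line: int
    category: str
    text: str


def find_markers(source: str) -> List[Marker]:
    """Return every inline claim marker with its 1-based line number."""
    markers = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER_RE.search(line)
        if match:
            markers.append(Marker(number, match["category"], match["text"].strip()))
    return markers


SAMPLE = """\
// @aphoria:claim[safety] Wallet MUST NOT derive Clone
#[derive(Debug)]
pub struct Wallet { ... }
"""

print(find_markers(SAMPLE))
# [Marker(line=1, category='safety', text='Wallet MUST NOT derive Clone')]
```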

Git Commit Tracking

Aphoria automatically captures the git commit hash when claims and observations are ingested. This provides:

  • Temporal context - Know exactly which code version a claim was authored against
  • Audit trail - Trace architectural decisions through git history
  • Graceful degradation - Works seamlessly in non-git environments

The commit hash is stored in assertion metadata and captured at ingestion time (not when TOML files are edited), avoiding the "double-commit problem."

{
  "authored": true,
  "git_commit": "de7af7c1b9e...",
  "claim_id": "wallet-no-clone-001",
  "provenance": "Wallet is singleton with atomic state"
}
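A minimal sketch of capture-at-ingestion with graceful degradation might look like this. It shells out to `git rev-parse HEAD`, a standard Git command; the function names are hypothetical, and omitting the `git_commit` key when no repository is available is an assumption about how degradation could work.

```python
import subprocess
from typing import Optional


def current_git_commit(repo_dir: str = ".") -> Optional[str]:
    """Return the HEAD commit hash, or None outside a git repository."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, the directory is missing, or it is not a repository
        return None
    return result.stdout.strip() or None


def build_metadata(claim_id: str, provenance: str, repo_dir: str = ".") -> dict:
    """Assemble assertion metadata at ingestion time."""
    metadata = {"authored": True, "claim_id": claim_id, "provenance": provenance}
    commit = current_git_commit(repo_dir)
    if commit is not None:
        metadata["git_commit"] = commit  # assumption: key is simply omitted otherwise
    return metadata


print(build_metadata("wallet-no-clone-001", "Wallet is singleton with atomic state"))
```

Because the hash is read when the claim is ingested rather than when the TOML file is edited, the metadata never needs to be committed alongside the change it describes.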

Conflict Verdicts

| Verdict | Description | CI Behavior |
|---------|-------------|-------------|
| BLOCK | High-confidence conflict with RFC/OWASP | Fails with --exit-code |
| FLAG | Moderate-confidence conflict | Passes, visible in report |
| ACK | Acknowledged conflict | Passes, tracked for audit |
| PASS | No conflict | - |
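The CI behavior described above reduces to a very small rule, sketched here. The verdict names and the "exit code 1 on BLOCK" behavior come from this README; the function itself is hypothetical, and returning 0 when --exit-code is not passed is an assumption.

```python
from enum import Enum
from typing import Iterable


class Verdict(Enum):
    BLOCK = "BLOCK"
    FLAG = "FLAG"
    ACK = "ACK"
    PASS = "PASS"


def ci_exit_code(verdicts: Iterable[Verdict], exit_code_flag: bool) -> int:
    """Return 1 only when --exit-code is set and at least one BLOCK is present."""
    if exit_code_flag and any(v is Verdict.BLOCK for v in verdicts):
        return 1
    return 0


print(ci_exit_code([Verdict.FLAG, Verdict.ACK], exit_code_flag=True))    # 0
print(ci_exit_code([Verdict.BLOCK, Verdict.PASS], exit_code_flag=True))  # 1
```

Acknowledging a conflict turns a BLOCK into an ACK, which is why `aphoria ack` unblocks a pipeline while keeping the decision on record.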

Web Dashboard

Aphoria includes a web-based dashboard for visualizing scan results, managing claims, and exploring the authoritative corpus. See applications/aphoria-dashboard/ for setup instructions.

Features:

  • Real-time scan visualization
  • Claims management interface
  • Corpus exploration and search
  • Policy governance workflows

Documentation

Guides

| Guide | Audience | Time |
|-------|----------|------|
| Solo Developer Guide | Individual developers, side projects | 2 min |
| Enterprise Pilot Guide | Security teams running pilots | 4 weeks |
| Enterprise Quick Start | Platform engineering | 5 min |
| The First Scan | Everyone | 10 min |

Reference

| Document | Description |
|----------|-------------|
| CLI Reference | Complete command documentation |
| Comparison Modes | Guide to claim comparison modes |
| Vision & Gaps | Architecture and implementation status |

Research & Reference

Vision & Architecture

| Document | Description |
|----------|-------------|
| Vision | Product vision and aspirational architecture |
| Protocol Vision | Protocol-level design philosophy |
| Vision & Gaps | Honest assessment of current state vs. vision |
| Architecture Docs | System design, concept matching, extension points |

Testing & Validation

| Document | Description |
|----------|-------------|
| UAT Reports | User acceptance testing results |
| Phase 6 UAT | Detailed validation of policy workflows |
| Real-World Policy Source UAT | Trust Pack workflow validation |

Gap Analysis & Research

| Document | Description |
|----------|-------------|
| Gap Analysis: Institutional Knowledge | Analysis of knowledge capture gaps |
| Gap Fixes Summary | Summary of addressed gaps |

What Aphoria Is Not

  • Not a linter. Linters check syntax. Aphoria checks decisions against authoritative sources.
  • Not SAST. SAST finds vulnerability patterns. Aphoria finds contradictions to specific standards.
  • Not AI autocomplete. Copilot suggests code from the internet. Aphoria surfaces your org's decisions at the moment you contradict them.

License

See LICENSE for details.