stemedb/ai-lookup/features/aphoria-flywheel.md
jordan 422e2d4416 feat(aphoria): wire claims through StemeDB — Gap Closure Phase 1
Claims now flow through StemeDB's append-only knowledge graph instead of
mutable TOML files. This resolves all 6 critical claim-bypass code paths:

- Bridge: lossless AuthoredClaim ↔ Assertion round-trip (comparison, status, lifecycle mapping)
- LocalEpisteme: ingest_authored_claim() and fetch_authored_claims() with AUTHORED_CLAIM predicate index
- EpistemeClaimStore: ClaimStore trait backed by StemeDB (append-only delete via deprecation)
- CLI handlers: all claim commands read/write through StemeDB
- Scanner: loads claims from StemeDB with auto-migration fallback to TOML
- Export: new `aphoria claims export` serializes StemeDB claims to TOML/JSON

Also cleans up dead code (EpistemeConfig.url), renames ingest_claims→ingest_observations,
fixes ClaimFilter.authority_tier type, adds Draft variant to ClaimStatus, and fixes
pre-existing clippy warnings (too_many_arguments, filter_next→rfind).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 02:02:51 -07:00


Aphoria Flywheel

Last Updated: 2026-02-10
Confidence: High

Practical Truth

This is an AUTONOMOUS flywheel. LLMs drive it, not humans.

Without the LLM layer: you create claims by hand with aphoria corpus create, get the naming wrong, the scan finds 0 violations, and you lose 6 hours debugging. The manual workflow doesn't scale.

With the LLM layer: the LLM analyzes diffs, suggests claims with correct naming, and enforces consistency, so the scan finds violations and the flywheel spins autonomously.

LLM implementations:

  • Claude Code skills (/aphoria-claims, /aphoria-suggest) - Interactive agent workflow
  • Go ADK agents - Programmatic tool use, automated claim authoring
  • Any LLM with tool use - As long as it can call aphoria claims create with enforced naming

The autonomous loop: LLM analyzes code → suggests claims → enforces naming → scan aggregates patterns → better corpus → LLM has better context → better suggestions → loop.

What It Actually Is

  1. Scan code → Extractors find observations (e.g., max_connections = Option<T>)
  2. Check claims → Tail-path match against corpus claims (e.g., dbpool/max_connections must be required)
  3. Find gaps → Identify claims without extractors (uncovered claims)
  4. Create extractors → Dynamically generate extractors for uncovered existing claims
  5. Suggest claims → LLM identifies new patterns not yet in corpus
  6. Create more extractors → Generate extractors for new claims
  7. Aggregate patterns → High-adoption patterns auto-promote to the community corpus (LOCAL ONLY: no community aggregation server exists, so promotion happens on the local machine)
  8. Better corpus → Next scan catches more violations
  9. Loop
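
Step 3 (find gaps) amounts to a set difference between the corpus and what the extractors cover. The following is a minimal Rust sketch of that idea, not Aphoria's implementation: the `Claim` struct, the `uncovered_claims` function, and the idea of a precomputed set of covered subjects are all hypothetical names introduced for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical minimal claim record; only the subject path matters here.
struct Claim {
    subject: String,
}

/// Returns the subjects of claims that no extractor covers, in corpus order.
/// `covered_subjects` is assumed to hold every claim subject that at least
/// one extractor already targets.
fn uncovered_claims<'a>(
    claims: &'a [Claim],
    covered_subjects: &HashSet<&str>,
) -> Vec<&'a str> {
    claims
        .iter()
        .map(|claim| claim.subject.as_str())
        // Keep only the claims that are missing from the covered set.
        .filter(|subject| !covered_subjects.contains(*subject))
        .collect()
}
```

The returned list is what steps 4 and 6 would consume: each uncovered subject is a candidate for a generated extractor.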

Critical: Tail-path matching is case-sensitive and uses the last 2 path segments. dbpool/max_connections matches; dbpool/MaxConnections doesn't. Inconsistent naming breaks the entire flywheel.

Why LLM Layer Is Required

| Workflow | Time | Naming Consistency | Autonomy | Result |
|---|---|---|---|---|
| Manual CLI (human) | 4-6 hours for 27 claims | Inconsistent (camelCase, snake_case mix) | None | Scan finds 0 violations (tail-path mismatch) |
| Claude skills (LLM) | 1-2 hours for 27 claims | Enforced (lowercase, slash-separated) | Interactive | Scan finds 7 violations ✓ |
| Go ADK agent (LLM) | Minutes for 27 claims | Enforced | Fully autonomous | Scan finds 7 violations ✓ |

LLM layer auto-enforces:

  • Lowercase with underscores: max_connections not MaxConnections
  • Slash-separated paths: dbpool/config/max_connections
  • Hierarchical structure: {domain}/{component}/{property}
  • Consequence reasoning: "If X is Option, then Y breaks" (not just pattern matching)
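
The first three rules are mechanical enough to check before a claim is created. The sketch below is one possible validator in Rust, not Aphoria's own code: the function names are hypothetical, and the two-segment minimum is an assumption based on tail-path matching using the last 2 segments.

```rust
/// Checks one path segment: it must start with a lowercase ASCII letter and
/// contain only lowercase ASCII letters, digits, and underscores.
fn is_valid_segment(segment: &str) -> bool {
    let mut chars = segment.chars();
    match chars.next() {
        Some(first) if first.is_ascii_lowercase() => {
            chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
        }
        // Empty segment, or a first character that is not a lowercase letter.
        _ => false,
    }
}

/// Validates a slash-separated claim path such as
/// "dbpool/config/max_connections". Returns a readable reason on failure.
fn validate_claim_path(path: &str) -> Result<(), String> {
    let segments: Vec<&str> = path.split('/').collect();
    // Assumption: at least two segments, since tail-path matching needs two.
    if segments.len() < 2 {
        return Err(format!(
            "expected at least 2 segments, found {}",
            segments.len()
        ));
    }
    for segment in &segments {
        if !is_valid_segment(segment) {
            return Err(format!(
                "segment {:?} is not lowercase snake_case",
                segment
            ));
        }
    }
    Ok(())
}
```

Under these assumptions, `validate_claim_path("dbpool/config/MaxConnections")` returns an error naming the offending segment, so the mismatch surfaces at authoring time instead of as a scan with 0 violations.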

Without LLM: Manual naming errors → tail-path mismatch → 0 violations detected → "Aphoria is broken"

With LLM: Autonomous reasoning over code → enforced naming → pattern aggregation → self-improving corpus

How the Flywheel Works

LLM workflows drive the autonomous loop. The implementation can be:

Claude Code Skills (Interactive Agent)

# Load skill in your development environment
/aphoria-claims

# Skill analyzes diff for claimable patterns
"Review this diff for claims"

# LLM enforces naming, suggests claims, you approve

Go ADK Agent (Fully Autonomous)

// Agent with aphoria_claims tool
// LLM calls: aphoria_claims_create(subject, predicate, value, explanation)
// Runs in CI/CD pipeline, no human in loop

Custom LLM Integration (Any Tool-Use LLM)

  • Give your LLM access to aphoria claims create CLI
  • Provide naming convention rules in system prompt
  • Let LLM analyze diffs and author claims programmatically
  • Examples: Cursor, Windsurf, custom agent frameworks

Scanning (Required for All Workflows)

# Scan with persistent mode (required for flywheel)
aphoria scan --persist --sync

# Observations saved → contribute to pattern aggregation → community corpus grows

Critical Requirements:

  • LLM workflow (skills, agents, or custom) for claim authoring
  • Persistent mode (--persist) for flywheel activation
  • Sync mode (--sync) for community learning
  • DON'T create claims manually (naming errors break tail-path matching)
  • DON'T use ephemeral mode (flywheel disabled)
  • DON'T mix naming conventions (case-sensitive matching)

Technical Detail (If You Care)

Tail-path matching:

// Corpus claim: "vendor://dbpool/config/max_connections"
// → tail_path = "config/max_connections" (last 2 segments)

// Observation: "dbpool/config/max_connections"
// → tail_path = "config/max_connections"
// MATCH ✓

// Observation: "dbpool/config/MaxConnections"
// → tail_path = "config/MaxConnections"
// NO MATCH ✗ (case-sensitive)

File Pointer: applications/aphoria/src/concept_index.rs:45-120 (tail-path extraction)
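
For illustration, the matching behaviour described in this section can be reproduced with a small Rust sketch. This is not the code at the file pointer: `tail_path` and `tail_paths_match` are hypothetical names, and stripping a `scheme://` prefix is an assumption inferred from the `vendor://` example.

```rust
/// Returns the last two `/`-separated segments of a claim subject or
/// observation path, after dropping an optional `scheme://` prefix.
fn tail_path(path: &str) -> String {
    // Assumption: a prefix such as "vendor://" is not part of the path.
    let without_scheme = match path.split_once("://") {
        Some((_, rest)) => rest,
        None => path,
    };
    // Skip empty segments so a stray leading or trailing slash is harmless.
    let segments: Vec<&str> = without_scheme
        .split('/')
        .filter(|s| !s.is_empty())
        .collect();
    // Keep at most the final two segments.
    let start = segments.len().saturating_sub(2);
    segments[start..].join("/")
}

/// Case-sensitive comparison of the two tail paths.
fn tail_paths_match(claim_subject: &str, observation_path: &str) -> bool {
    tail_path(claim_subject) == tail_path(observation_path)
}
```

With these definitions, `tail_paths_match("vendor://dbpool/config/max_connections", "dbpool/config/MaxConnections")` returns `false`, because the comparison is a plain string equality with no case folding.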