# Aphoria

> **Product Vision:** This document describes Aphoria's product vision as a knowledge compounding system that learns from your organization's decisions. For the protocol-level vision (EAP standard), see [Protocol Vision](docs/advanced/eap-protocol.md).

**Self-learning institutional knowledge that compounds with every commit.**

Aphoria transforms your organization's implicit decisions into explicit, auditable, shareable knowledge. Every commit teaches the system. Every new hire benefits from what came before. Knowledge compounds instead of walking out the door.

---

## The Problem

Every organization has institutional knowledge. It lives in:

- The senior engineer's head ("we always validate JWT audience")
- The config file nobody reads ("verify=false was a hotfix in 2019")
- The Confluence page with 3 views ("our timeout policy is 30s max")
- The Stack Overflow answer the intern copied ("this worked for someone")

This knowledge is **invisible, inconsistent, and fragile**:

- When Sarah leaves, her context leaves with her
- New hires copy patterns from 2019 code that predates current standards
- Team A's conventions contradict Team B's, and neither team knows
- The same mistakes repeat across 50 projects because nobody connected them

AI agents make this worse. An agent deploying code doesn't read the RFC. It picks the most confident-sounding answer from training data and acts. The agent doesn't know it's contradicting your team's decision because there's no structured way to check.

---

## The Solution

Aphoria is a **knowledge compounding system** that learns from your organization's decisions as they happen.

### The Autonomous Learning Loop

**Aphoria runs on EVERY commit** via LLM-driven workflows:

```
Developer commits code
    ↓
1. SCAN: Extractors → observations
    ↓
2. CHECK: Compare observations against claims → violations
    ↓
3. FIX: Developer fixes violations
    ↓
4. GET REMAINING CLAIMS: Identify claims without extractors
    ↓
5. CREATE EXTRACTORS: Dynamically generate extractors for uncovered claims
    ↓
6. SUGGEST NEW CLAIMS: LLM analyzes patterns → suggests new claims
    ↓
7. CREATE NEW EXTRACTORS: Generate extractors for new claims
    ↓
(Loop repeats, knowledge compounds)
```
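Steps 2 and 4 of the loop above can be sketched in a few lines. This is a minimal illustration, not Aphoria's implementation: the `Claim` and `Observation` types, their fields, and the dotted key names are all hypothetical, and a claim is reduced to "this key must have this value".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """A rule the organization has stated (hypothetical shape)."""
    id: str
    key: str        # which observation key this claim constrains
    expected: str   # value the claim requires


@dataclass(frozen=True)
class Observation:
    """A fact an extractor pulled out of the code (hypothetical shape)."""
    key: str
    value: str
    location: str


def check(observations, claims):
    """Step 2: return (claim, observation) pairs whose values disagree."""
    by_key = {c.key: c for c in claims}
    return [
        (by_key[o.key], o)
        for o in observations
        if o.key in by_key and o.value != by_key[o.key].expected
    ]


def uncovered(observations, claims):
    """Step 4: claims for which no extractor produced an observation."""
    seen = {o.key for o in observations}
    return [c for c in claims if c.key not in seen]


claims = [
    Claim("tls-verify", "http.tls_verify", "true"),
    Claim("jwt-aud", "jwt.validate_audience", "true"),
]
observations = [Observation("http.tls_verify", "false", "client.py:12")]

violations = check(observations, claims)
remaining = uncovered(observations, claims)
print([c.id for c, _ in violations])  # ['tls-verify']
print([c.id for c in remaining])      # ['jwt-aud']
```

The `remaining` list is what would feed step 5: each claim in it has no extractor yet, so nothing in the codebase can currently confirm or violate it.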

**LLM workflows are the core mechanism:**
- **Claude Code skills** - Interactive agent workflows (`/aphoria-claims`, `/aphoria-suggest`)
- **Go ADK agents** - Fully autonomous tool-use agents for CI/CD
- **Custom integrations** - Any LLM with tool-use capability

**The CLI is a debug/fallback interface.** Manual operation doesn't scale. LLMs enforce naming conventions, reason about consequences, and drive continuous learning.

### The Three Tiers of Knowledge

> **Status: PLANNED** -- The tier system (Policies / Conventions / Observations) with automatic graduation is the target architecture. Today, `authority_tier` exists as a field on `AuthoredClaim` but nothing classifies claims into tiers at scan time, and no code reads `authority_tier` for Lens resolution or enforcement behavior (BLOCK vs FLAG vs silent). Tier-aware resolution is planned for Phase 1 of [gap closure](../../tmp/aphoria-stemedb-gap-closure.md).

```
┌─────────────────────────────────────────────────────────────────┐
│  TIER 1: POLICIES (Explicit, Authoritative)                     │
│  ─────────────────────────────────────────────────────────────  │
│  • RFC 7519: JWT audience validation required                   │
│  • OWASP A03:2021: No SQL string concatenation                  │
│  • Internal Policy: TLS 1.3 minimum (signed: CISO, 2024-01)     │
│  → BLOCK on violation, clear remediation path                   │
└─────────────────────────────────────────────────────────────────┘
                         ▲
                         │ Explicit promotion (governance approval)
                         │
┌─────────────────────────────────────────────────────────────────┐
│  TIER 2: CONVENTIONS (Emergent, Team-Approved)                  │
│  ─────────────────────────────────────────────────────────────  │
│  • API versioning: /api/v{major}/{resource}                     │
│  • Error format: {"error": {"code": X, "message": Y}}           │
│  • Retry pattern: exponential backoff with jitter               │
│  → FLAG on deviation, suggest alignment, explain context        │
└─────────────────────────────────────────────────────────────────┘
                         ▲
                         │ Automatic graduation (frequency + authority)
                         │
┌─────────────────────────────────────────────────────────────────┐
│  TIER 3: OBSERVATIONS (Learning, Not Enforced)                  │
│  ─────────────────────────────────────────────────────────────  │
│  • @alex's logging format (3 usages)                            │
│  • @jordan's config pattern (1 usage)                           │
│  → Silent capture, potential future convention                  │
└─────────────────────────────────────────────────────────────────┘
```
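The planned tier-to-enforcement mapping in the diagram could look roughly like the sketch below. It is an assumption-laden illustration of the target behavior, not existing code: the `Tier` and `Action` enums and the `commit_allowed` helper are hypothetical names.

```python
from enum import Enum


class Tier(Enum):
    POLICY = 1       # Tier 1: explicit, authoritative
    CONVENTION = 2   # Tier 2: emergent, team-approved
    OBSERVATION = 3  # Tier 3: learning, not enforced


class Action(Enum):
    BLOCK = "block"
    FLAG = "flag"
    SILENT = "silent"


# Mapping taken from the arrows in the diagram above.
ENFORCEMENT = {
    Tier.POLICY: Action.BLOCK,
    Tier.CONVENTION: Action.FLAG,
    Tier.OBSERVATION: Action.SILENT,
}


def commit_allowed(violated_tiers):
    """A commit passes unless at least one violated claim maps to BLOCK."""
    return all(ENFORCEMENT[t] is not Action.BLOCK for t in violated_tiers)


print(commit_allowed([Tier.CONVENTION, Tier.OBSERVATION]))  # True
print(commit_allowed([Tier.POLICY, Tier.CONVENTION]))       # False
```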

### The Workflow

**Note:** These workflows are **LLM-driven** via Claude Code skills, Go ADK agents, or custom integrations. The CLI examples shown here represent the autonomous behavior, not manual commands.

**Day 1: Install Aphoria**

> **Status: PLANNED** -- `--org` and `--team` flags, org-level knowledge graph connection, and pre-loaded policies/conventions require the central StemeDB server (Phase 3 of [gap closure](../../tmp/aphoria-stemedb-gap-closure.md)). Today, `aphoria init` creates a local project config only.

```bash
$ aphoria init --org acme --team platform
Connected to Acme Engineering knowledge graph
Loaded: 12 policies, 47 conventions, 156 observations
```

**Every Commit: LLM-Driven Learning**

```bash
$ git commit -m "Add payment processing endpoint"

Aphoria scan:
  ✓ TLS verification enabled (Policy: RFC 8446)
  ✓ JWT audience validated (Policy: RFC 7519)

  + Captured: API versioning /api/v1/payments
    → This is your 5th endpoint using this pattern
    → Graduating to team convention (Platform Team)
```

**New Developer Joins:**

```bash
$ git commit -m "Add user profile endpoint"

Aphoria guidance:
  ⚠ API Versioning: Your team uses /api/v{major}/{resource}
    └ Established by @alex (Senior, Platform Team) - 12 usages
    └ Your code: /user/profile → Suggest: /api/v1/user/profile

  Accept suggestion? [y/n/explain]
  > explain

  This pattern was established during the microservices migration.
  Consistent versioning enables API gateway routing.
  See: ADR-042 (linked), @alex's original commit (linked)
```

**Knowledge Compounds:**

```
Acme Engineering (6 months)
├── 12 Policies (explicit, CISO-signed)
├── 47 Conventions (promoted from 340 observations)
│   └── 89% adoption rate, 0 regressions this quarter
├── 156 Observations (learning, not enforced)
└── 23 Deprecated patterns ("we used to do this, don't anymore")

New hire onboarding:
  • Day 1: Guided by 47 conventions, not 500 raw observations
  • Week 1: 3 suggested alignments accepted, 1 explained deviation
  • Month 1: Contributed 2 observations, 1 promoted to convention
```

---

## How It Works

### 1. Capture Decisions Where They Happen

Aphoria runs in your commit flow - the moment decisions become code:

```bash
# Pre-commit hook or CI integration
aphoria scan --persist --sync
```

Every scan:

- **Detects** security patterns (TLS, JWT, SQL injection, XSS)
- **Extracts** configuration decisions (timeouts, pool sizes, retry policies)
- **Captures** new patterns as observations
- **Checks** against existing policies and conventions
- **Syncs** to org knowledge graph
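
As a rough illustration of the detect and extract steps, an extractor can be thought of as a function from source text to observations. The sketch below uses two regular expressions; the `Observation` tuple, the key names, and the patterns themselves are hypothetical and far simpler than a real extractor would need to be.

```python
import re
from typing import NamedTuple


class Observation(NamedTuple):
    key: str    # hypothetical dotted name for the decision
    value: str  # what the code actually does
    line: int   # 1-based line number in the scanned source


# Illustrative patterns only; real code would need proper parsing.
TLS_VERIFY = re.compile(r"verify\s*=\s*(True|False)")
TIMEOUT = re.compile(r"timeout\s*=\s*(\d+(?:\.\d+)?)")


def extract(source):
    """Turn source text into a list of observations."""
    found = []
    for number, text in enumerate(source.splitlines(), start=1):
        tls = TLS_VERIFY.search(text)
        if tls:
            found.append(Observation("http.tls_verify", tls.group(1).lower(), number))
        timeout = TIMEOUT.search(text)
        if timeout:
            found.append(Observation("http.timeout_s", timeout.group(1), number))
    return found


sample = "import httpx\nresp = httpx.get(url, verify=False, timeout=45)\n"
for obs in extract(sample):
    print(obs)
```

Each observation carries enough context (key, value, line) to be compared against a claim or stored as a Tier 3 observation.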

> **Status: PARTIALLY IMPLEMENTED** -- Hosted mode syncs observations and patterns to a remote StemeDB instance. Claims and extractors remain in local TOML files and are NOT synced. Full claim/extractor sync is planned for Phase 3 of [gap closure](../../tmp/aphoria-stemedb-gap-closure.md).

### 2. Graduate Patterns Through Governance

Not every observation becomes a convention. Graduation requires:

| Criterion    | Threshold                         | Why                       |
| ------------ | --------------------------------- | ------------------------- |
| Frequency    | 5+ usages                         | Not a one-off hack        |
| Consistency  | Same pattern                      | Not random variation      |
| Authority    | Senior contributor                | Not a junior's experiment |
| Time         | 30+ days                          | Not a temporary fix       |
| No conflicts | No false positives in shadow mode | Actually works            |

Patterns that meet the first four criteria enter **shadow mode**: they run alongside production rules and collect feedback before promotion. A pattern that produces false positives in shadow mode is not promoted.
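
The graduation criteria in the table can be expressed as two checks. This is a sketch under stated assumptions: `PatternStats` and its fields are hypothetical, and it assumes the frequency, consistency, authority, and time criteria gate entry into shadow mode while the false-positive criterion gates promotion.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class PatternStats:
    usages: int
    consistent: bool             # every usage follows the same pattern
    senior_contributor: bool     # authority criterion from the table
    first_seen: date
    shadow_false_positives: Optional[int] = None  # None: not in shadow mode yet


def ready_for_shadow(p, today):
    """Frequency, consistency, authority, and time thresholds from the table."""
    return (
        p.usages >= 5
        and p.consistent
        and p.senior_contributor
        and today - p.first_seen >= timedelta(days=30)
    )


def ready_for_promotion(p, today):
    """Promotion additionally requires a clean shadow-mode run."""
    return ready_for_shadow(p, today) and p.shadow_false_positives == 0


today = date(2024, 6, 1)
pattern = PatternStats(usages=5, consistent=True, senior_contributor=True,
                       first_seen=date(2024, 4, 1))
print(ready_for_shadow(pattern, today))     # True
print(ready_for_promotion(pattern, today))  # False: no shadow-mode result yet
```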

### 3. Scope Knowledge Appropriately

Not all knowledge applies everywhere:

```
Organization Level (applies to all teams)
├── Security policies (TLS, auth, secrets)
├── Compliance requirements (GDPR, SOC 2)
└── Architecture decisions (API gateway, event bus)

Team Level (applies to team's projects)
├── Coding conventions (naming, error handling)
├── Technology choices (frameworks, libraries)
└── Domain patterns (payment flows, user lifecycle)

Project Level (applies to single project)
├── Local overrides (justified exceptions)
├── Experimental patterns (not yet proven)
└── Context-specific decisions
```
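
One way to read the three scopes is as a layered lookup in which the narrowest scope that defines a rule wins. The sketch below assumes exactly that precedence (project over team over organization); the rule keys and values are made up, and a real system would presumably restrict which organization-level policies a project may override.

```python
# Hypothetical rule sets, keyed by made-up dotted names.
ORG = {"tls.min_version": "1.3", "secrets.in_repo": "forbidden"}
TEAM = {"api.versioning": "/api/v{major}/{resource}", "http.timeout_s": "30"}
PROJECT = {"http.timeout_s": "60"}  # a justified local exception

# Narrowest scope first: the first scope that defines a key wins.
SCOPES = [("project", PROJECT), ("team", TEAM), ("org", ORG)]


def resolve(key, scopes=SCOPES):
    """Return (scope_name, value) for the narrowest scope defining key."""
    for scope_name, rules in scopes:
        if key in rules:
            return scope_name, rules[key]
    return None


print(resolve("http.timeout_s"))   # ('project', '60')
print(resolve("tls.min_version"))  # ('org', '1.3')
print(resolve("unknown.key"))      # None
```

Returning the scope name along with the value keeps the answer explainable: guidance can say which level a rule came from.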

### 4. Authority from Evidence

> **Status: PLANNED** -- The 4-level authority ladder is the target design. Today, `authority_tier` is stored as a string field on `AuthoredClaim` but no code reads it for Lens resolution or claim prioritization. Authority-weighted resolution is planned for Phase 1 of [gap closure](../../tmp/aphoria-stemedb-gap-closure.md), where `authority_tier` will map to `SourceClass` and feed into StemeDB's Authority Lens.

Authority isn't title-based - it's **merit-based**. The weight of a pattern comes from the evidence supporting it:

```
Authority Ladder (highest at top):

┌─────────────────────────────────────────────────────────────────┐
│  LEVEL 4: Product Spec / Task File                              │
│  ─────────────────────────────────────────────────────────────  │
│  Pattern linked to explicit product requirement or task.        │
│  "This is what we decided to build."                            │
│  Authority: 0.95                                                │
└─────────────────────────────────────────────────────────────────┘
                              ↑
┌─────────────────────────────────────────────────────────────────┐
│  LEVEL 3: RFC / Standard Reference                              │
│  ─────────────────────────────────────────────────────────────  │
│  Pattern references authoritative external source.              │
│  "This is what the spec says."                                  │
│  Authority: 0.85                                                │
└─────────────────────────────────────────────────────────────────┘
                              ↑
┌─────────────────────────────────────────────────────────────────┐
│  LEVEL 2: Research / ADR / Documentation                        │
│  ─────────────────────────────────────────────────────────────  │
│  Pattern accompanied by investigation or reasoning.             │
│  "Here's why we chose this."                                    │
│  Authority: 0.70                                                │
└─────────────────────────────────────────────────────────────────┘
                              ↑
┌─────────────────────────────────────────────────────────────────┐
│  LEVEL 1: Just a Commit                                         │
│  ─────────────────────────────────────────────────────────────  │
│  Pattern observed in code, no supporting evidence.              │
│  "Someone did this."                                            │
│  Authority: 0.40                                                │
└─────────────────────────────────────────────────────────────────┘
```

A pattern with RFC backing outweighs 100 undocumented commits. The system rewards evidence, not tenure.
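
The claim that one RFC-backed pattern outweighs 100 undocumented commits implies that authority is not summed across occurrences. A minimal sketch of that idea follows, assuming each pattern is scored by its single strongest piece of evidence; the weights come from the ladder above, while `Evidence`, the source labels, and `strongest_pattern` are hypothetical names.

```python
from dataclasses import dataclass

# Weights copied from the authority ladder; the labels are illustrative.
AUTHORITY = {
    "commit": 0.40,        # Level 1: just a commit
    "research": 0.70,      # Level 2: research / ADR / documentation
    "standard": 0.85,      # Level 3: RFC / standard reference
    "product_spec": 0.95,  # Level 4: product spec / task file
}


@dataclass(frozen=True)
class Evidence:
    pattern: str
    source: str  # one of the AUTHORITY keys


def strongest_pattern(evidence):
    """Score each pattern by its best evidence, then pick the top pattern."""
    best = {}
    for item in evidence:
        weight = AUTHORITY[item.source]
        if weight > best.get(item.pattern, 0.0):
            best[item.pattern] = weight
    return max(best, key=best.get)


evidence = [Evidence("verify=False", "commit")] * 100
evidence.append(Evidence("verify=True", "standard"))
print(strongest_pattern(evidence))  # verify=True
```

Because the score is a maximum rather than a sum, repeating an undocumented pattern never raises its authority above 0.40.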

### 5. Deprecate and Evolve

Knowledge ages. Aphoria tracks lifecycle:

```
Pattern: "Use requests.get() for HTTP calls"
Status: DEPRECATED (2024-06-01)
Reason: "Migrated to httpx for async support"
Superseded by: "Use httpx.AsyncClient for HTTP calls"
Migration: See ADR-089
```

New code using deprecated patterns gets guidance toward the replacement.
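
A lifecycle record like the one above can drive that guidance directly. The sketch below is illustrative only: the `Deprecation` type and `guidance` function are hypothetical, the deprecated call is assumed to be the `requests` library's `requests.get`, and a naive substring match stands in for a real extractor.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Deprecation:
    marker: str          # text that identifies the deprecated pattern
    deprecated_on: date
    reason: str
    superseded_by: str
    migration: str


DEPRECATIONS = [
    Deprecation(
        marker="requests.get(",
        deprecated_on=date(2024, 6, 1),
        reason="Migrated to httpx for async support",
        superseded_by="httpx.AsyncClient",
        migration="ADR-089",
    )
]


def guidance(source_line):
    """Return one message per deprecated pattern found in the line."""
    return [
        f"Deprecated since {d.deprecated_on}: {d.reason}. "
        f"Use {d.superseded_by} instead (see {d.migration})."
        for d in DEPRECATIONS
        if d.marker in source_line
    ]


for message in guidance("resp = requests.get(url)"):
    print(message)
```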

---

## The Enterprise Value

### For Engineering Leaders

**Knowledge retention**: When senior engineers leave, their patterns stay. The system captures context, not just code.

**Faster onboarding**: New hires get contextual guidance from day 1. "Your team does X" instead of "read the wiki."

**Reduced rework**: Patterns proven across 50 projects stop developers from reinventing (broken) wheels.

### For Security Teams

**Continuous compliance**: Every commit is checked. SOC 2 auditors get evidence, not promises.

**Policy enforcement**: Security policies aren't suggestions - they're enforced in the commit flow.

**Drift detection**: When code deviates from blessed patterns, you know immediately.

### For Platform Teams

**Convention adoption**: Define patterns once, enforce everywhere. Measure adoption rates.

**Cross-team consistency**: "This is how we do it at Acme" becomes queryable fact.

**Technical debt visibility**: See which deprecated patterns are still in use, prioritize migration.

---

## Integration Points

### Claude Code Skill

```
/aphoria scan                    # Run scan on current project
/aphoria explain <conflict>      # Explain why this is flagged
/aphoria bless <pattern>         # Promote pattern to convention
/aphoria ack <conflict> 90d      # Acknowledge with expiry
```
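
The `90d` argument to `ack` suggests that an acknowledgment silences a conflict only until it expires. A small sketch of that expiry logic, assuming durations are written as a whole number of days followed by `d` (other units, and the function names here, are not taken from Aphoria):

```python
import re
from datetime import date, timedelta


def ack_expiry(duration, today):
    """Convert a duration such as '90d' into the date the ack expires."""
    match = re.fullmatch(r"(\d+)d", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    return today + timedelta(days=int(match.group(1)))


def ack_active(expiry, today):
    """An acknowledgment suppresses the conflict until its expiry date."""
    return today < expiry


expiry = ack_expiry("90d", date(2024, 1, 1))
print(expiry)                                 # 2024-03-31
print(ack_active(expiry, date(2024, 3, 30)))  # True
print(ack_active(expiry, date(2024, 3, 31)))  # False
```

Once the acknowledgment expires, the conflict would surface again on the next scan, which keeps temporary exceptions from becoming permanent.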

### Pre-commit Hook

```yaml
# .pre-commit-config.yaml
repos:
    - repo: local
      hooks:
          - id: aphoria
            name: Aphoria knowledge check
            entry: aphoria scan --exit-code
            language: system
```

### CI/CD Pipeline

```yaml
# GitHub Actions
- name: Aphoria Scan
  run: aphoria scan --format sarif --output results.sarif

- name: Upload SARIF
  uses: github/codeql-action/upload-sarif@v3
  with:
      sarif_file: results.sarif
```

### Central Knowledge Server

> **Status: PLANNED** -- No `aphoria server` binary exists yet. Org-wide knowledge aggregation is planned for Phase 3 of [gap closure](../../tmp/aphoria-stemedb-gap-closure.md). Today, hosted mode uses a StemeDB API instance for observation sync only.

```bash
# Deploy org-wide knowledge graph
aphoria server --org acme --port 18187

# Teams connect
aphoria init --server https://aphoria.acme.internal
```

---

## What This Is Not

**Not a linter.** Linters check syntax rules you define. Aphoria discovers patterns from behavior and checks against authoritative sources you didn't know existed.

**Not SAST.** SAST finds vulnerability patterns in code. Aphoria finds where code decisions contradict standards - security is just one domain.

**Not a policy wiki.** Wikis are written once and forgotten. Aphoria captures decisions as they happen and surfaces them at the moment they're relevant.

**Not AI autocomplete.** Copilot suggests code from the internet. Aphoria surfaces _your org's_ decisions at the moment you're about to contradict them.

---

## The Flywheel

```
More commits → More observations captured
     ↓
More observations → Better pattern recognition
     ↓
Better patterns → More accurate guidance
     ↓
More accurate guidance → Higher developer trust
     ↓
Higher trust → More commits with Aphoria
     ↓
More usage → More institutional knowledge
     ↓
More knowledge → Less ramp-up time, fewer mistakes
     ↓
Fewer mistakes → More confidence in AI agents
     ↓
More AI usage → More commits...
```

The more projects Aphoria scans, the smarter it gets - not through ML magic, but through accumulated structured decisions. Every commit is a vote. Every acknowledgment is context. Every promotion is governance.

---

## The Bottom Line

Your organization makes thousands of decisions every day. Most are invisible. When something breaks, you discover the decision existed only by finding the bug.

Aphoria makes those decisions visible, auditable, and compounding. Install it on day 1. Let it learn as you build. Watch new hires ramp faster. Watch senior knowledge persist after they leave. Watch cross-team consistency emerge naturally.

**Your codebase becomes a knowledge graph. Your commits become institutional memory. Your organization gets smarter with every push.**

> **Status: PARTIALLY IMPLEMENTED** -- Today, observations flow into a local embedded StemeDB instance at `.aphoria/db/`, forming a per-project knowledge graph. Claims remain in a flat TOML file (`claims.toml`) and are not yet stored in StemeDB. Wiring claims through StemeDB is Phase 1 of [gap closure](../../tmp/aphoria-stemedb-gap-closure.md).