stemedb/applications/aphoria/roadmap.md

Aphoria Roadmap


Phase 0: StemeDB Foundation

Tracked in: roadmap.md § 5D. Concept Hierarchy

Changes to the core database that Aphoria depends on. Shipped as Phase 5D of the main StemeDB roadmap.

| Aphoria Phase 0 | StemeDB Phase 5D | Status |
|---|---|---|
| 0.1 ConceptPath Type | 5D.1 ConceptPath Type | |
| 0.2 ConceptPath in Assertion | (implicit in 5D.1) | |
| 0.3 Hierarchical Index | 5D.4 Hierarchical Query | |
| 0.4 Alias Store | 5D.3 Alias Store + 5D.5 Alias Resolution | |
| 0.5 Source Class Inference | 5D.6 Source Class Inference | |
| 0.6 Concept API Endpoints | 5D.7 Concept API Endpoints | |

Spec: docs/specs/concept-hierarchy.md


Phase 2: CLI Core

Phase 2 was built before Phase 1 (authoritative corpus expansion). The CLI pipeline works end-to-end with a bootstrapped corpus of 11 hardcoded assertions covering TLS, JWT, CORS, secrets, and rate limiting.

| Task | Status |
|---|---|
| 2.1 Project Walker | walker/mod.rs, walker/path_mapper.rs, walker/language.rs |
| 2.2 Extractors (7) | tls_verify, jwt_config, hardcoded_secrets, timeout_config, dep_versions, cors_config, rate_limit |
| 2.3 Ingestion Bridge | bridge.rs: BLAKE3 hashing, Ed25519 signing, claim→assertion conversion |
| 2.4 Conflict Query | episteme.rs: LocalEpisteme with check_conflicts() |
| 2.5 Report Output | report/: table (comfy-table), JSON, SARIF 2.1.0, markdown |
| 2.6 Acknowledge Command | lib.rs acknowledge() |
| Baseline & Diff | lib.rs set_baseline(), show_diff() |
| Status Command | lib.rs show_status() |

118 tests pass. Clippy and fmt clean.


Phase 2A: Concept Matching

Status: Complete. Tail-path matching (2A.1), alias-aware queries (2A.2), and auto-alias creation (2A.3) all implemented.

2A.1 Leaf-Based Concept Matching (Aphoria-side fix)

Implemented in episteme.rs via ConceptIndex:

  • make_key(subject, predicate) extracts tail 2 path segments + predicate
  • build(assertions) creates in-memory index keyed by tail path
  • lookup(subject, predicate) finds matching authoritative assertions
  • check_conflicts() uses ConceptIndex instead of QueryEngine for cross-scheme matching
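
The tail-path idea can be illustrated with a minimal sketch. It is not the code in episteme.rs: the key format (a `#` separator), the field names, and the `enabled` predicate used in the test values are assumptions made for illustration; only the names make_key, build, and lookup come from the list above.

```rust
use std::collections::HashMap;

/// Build a lookup key from the last two path segments of a subject plus the
/// predicate. The "tail#predicate" layout is an assumption for this sketch.
fn make_key(subject: &str, predicate: &str) -> String {
    // Drop the scheme prefix ("code://", "rfc://", ...) if there is one.
    let path = subject.split_once("://").map_or(subject, |(_, rest)| rest);
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    // Keep at most the final two segments.
    let start = segments.len().saturating_sub(2);
    format!("{}#{}", segments[start..].join("/"), predicate)
}

/// In-memory index from tail key to the authoritative subjects sharing it.
#[derive(Default)]
struct ConceptIndex {
    by_tail: HashMap<String, Vec<String>>,
}

impl ConceptIndex {
    /// Index a list of (subject, predicate) pairs.
    fn build(assertions: &[(&str, &str)]) -> Self {
        let mut index = ConceptIndex::default();
        for (subject, predicate) in assertions {
            index
                .by_tail
                .entry(make_key(subject, predicate))
                .or_default()
                .push((*subject).to_string());
        }
        index
    }

    /// Return authoritative subjects whose tail key matches, regardless of scheme.
    fn lookup(&self, subject: &str, predicate: &str) -> &[String] {
        self.by_tail
            .get(&make_key(subject, predicate))
            .map(|v| v.as_slice())
            .unwrap_or(&[])
    }
}
```

Because the scheme and the leading segments are discarded, a code:// subject and an rfc:// subject that end in the same two segments map to the same key, which is what allows cross-scheme matching.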

Integration tests prove TLS and JWT conflicts are detected correctly.

2A.2 Alias Resolution in QueryEngine (StemeDB-side fix)

Wired AliasStore into QueryEngine.execute():

  • Added resolve_aliases: bool field to Query (defaults to false)
  • Added alias_store: Option<Arc<dyn AliasStore>> to QueryEngine
  • Added .with_alias_store() builder method
  • When resolve_aliases: true, expands subject via AliasStore.resolve_all() before index lookup
  • Added fetch_by_subjects() and fetch_by_subjects_predicate() for multi-subject deduplication
  • Modified Query.matches() to skip subject filtering when aliases are resolved
  • Skips fast path (MV lookup) when resolve_aliases: true
  • Gracefully degrades when no alias store is configured

7 unit tests in engine/tests/alias_resolution.rs. This is the architecturally correct long-term fix that complements leaf matching.
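
A rough sketch of the subject-expansion step follows. The trait here is a simplified stand-in: the real AliasStore and QueryEngine signatures are not shown in this roadmap, so `MapAliasStore`, `expand_subjects`, and the return types are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// Simplified stand-in for the alias store; the real trait may differ.
trait AliasStore {
    fn resolve_all(&self, subject: &str) -> Vec<String>;
}

/// Toy store backed by a map, used only for this sketch.
struct MapAliasStore {
    aliases: HashMap<String, Vec<String>>,
}

impl AliasStore for MapAliasStore {
    fn resolve_all(&self, subject: &str) -> Vec<String> {
        self.aliases.get(subject).cloned().unwrap_or_default()
    }
}

/// Expand one subject into the deduplicated set of subjects to look up.
fn expand_subjects(
    subject: &str,
    resolve_aliases: bool,
    store: Option<&dyn AliasStore>,
) -> Vec<String> {
    // A set removes duplicates when an alias resolves back to the subject.
    let mut subjects = BTreeSet::new();
    subjects.insert(subject.to_string());
    if resolve_aliases {
        // With no store configured, degrade to the original subject only.
        if let Some(store) = store {
            subjects.extend(store.resolve_all(subject));
        }
    }
    subjects.into_iter().collect()
}
```

The three cases worth checking are: aliases resolved with a store (the subject list grows), `resolve_aliases` left at false (one subject), and no store configured (one subject, no error).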

2A.3 Auto-Alias Creation

When Aphoria ingests authoritative assertions and code claims that share leaf names, automatically create aliases:

  • code://rust/myapp/tls/cert_verification → rfc://5246/tls/cert_verification
  • code://rust/myapp/auth/jwt/audience_validation → rfc://7519/jwt/audience_validation

This bridges 2A.1 (leaf matching) with 2A.2 (alias resolution) — leaf matching identifies candidates, aliases persist the relationship.

Implementation:

  • Added auto_create_aliases: bool config option to AliasConfig (defaults to true)
  • Added AliasOrigin::AutoDetected variant to stemedb-core for tracking auto-created aliases
  • Wired GenericAliasStore into LocalEpisteme for alias persistence
  • In check_conflicts(), when a code claim matches an authoritative claim by leaf, calls AliasStore.set_alias() to persist the relationship with AliasOrigin::AutoDetected
  • Alias creation is idempotent (skips if alias already exists)
  • 4 unit tests verify: alias creation on conflict, no creation when disabled, correct origin, idempotency
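
The idempotency and opt-out behaviour described above might look roughly like this. `InMemoryAliases` and `auto_alias` are hypothetical names, and the `Manual` variant is assumed only to give the sketch a second origin; the roadmap itself names only AliasOrigin::AutoDetected.

```rust
use std::collections::HashMap;

/// Where an alias came from. `Manual` is an assumed variant for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AliasOrigin {
    Manual,
    AutoDetected,
}

/// Toy alias store: code subject -> (authoritative subject, origin).
#[derive(Default)]
struct InMemoryAliases {
    entries: HashMap<String, (String, AliasOrigin)>,
}

impl InMemoryAliases {
    /// Record an auto-detected alias. Returns true if one was created,
    /// false if the feature is disabled or an alias already exists.
    fn auto_alias(&mut self, enabled: bool, code_subject: &str, authoritative: &str) -> bool {
        if !enabled || self.entries.contains_key(code_subject) {
            return false;
        }
        self.entries.insert(
            code_subject.to_string(),
            (authoritative.to_string(), AliasOrigin::AutoDetected),
        );
        true
    }
}
```

Skipping when an entry already exists means a repeated scan never overwrites an earlier alias, including one that was set by hand.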

Phase 1: Authoritative Corpus Expansion

Expanded from 11 hardcoded assertions to a pluggable corpus system with RFC, OWASP, and Vendor sources.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     aphoria corpus build                         │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────────────┐  │
│  │ RFC Ingester │  │ OWASP        │  │ Vendor Bootstrapper   │  │
│  │ (Tier 0)     │  │ Ingester     │  │ (Tier 2)              │  │
│  │              │  │ (Tier 1)     │  │                       │  │
│  └──────┬───────┘  └──────┬───────┘  └───────────┬───────────┘  │
│         │                 │                      │              │
│         └─────────────────┼──────────────────────┘              │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ CorpusRegistry  │                            │
│                  └────────┬────────┘                            │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ LocalEpisteme   │                            │
│                  │ ingest_         │                            │
│                  │ authoritative() │                            │
│                  └─────────────────┘                            │
└─────────────────────────────────────────────────────────────────┘

1.1 CorpusBuilder Trait

| Task | Status |
|---|---|
| CorpusBuilder trait | corpus/mod.rs: name, scheme, default_tier, build, requires_network |
| CorpusRegistry | Manages multiple builders, build_all(), list_builders() |
| CorpusBuildResult | Stats per builder, total assertions, success/fail/skip counts |
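
As an illustration of how these three pieces could fit together, here is a minimal sketch. The method names follow the table above, but every signature, the `offline` parameter, the result field names, and the `FixedBuilder` helper are assumptions rather than the contents of corpus/mod.rs.

```rust
/// Assumed shape of the builder trait; real signatures may differ.
trait CorpusBuilder {
    fn name(&self) -> &str;
    fn scheme(&self) -> &str;
    fn default_tier(&self) -> u8;
    fn requires_network(&self) -> bool;
    /// Returns the subjects of the assertions produced, or an error message.
    fn build(&self) -> Result<Vec<String>, String>;
}

/// Aggregate counts across all builders.
#[derive(Debug, Default, PartialEq, Eq)]
struct CorpusBuildResult {
    total_assertions: usize,
    succeeded: usize,
    failed: usize,
    skipped: usize,
}

#[derive(Default)]
struct CorpusRegistry {
    builders: Vec<Box<dyn CorpusBuilder>>,
}

impl CorpusRegistry {
    fn register(&mut self, builder: Box<dyn CorpusBuilder>) {
        self.builders.push(builder);
    }

    fn list_builders(&self) -> Vec<&str> {
        self.builders.iter().map(|b| b.name()).collect()
    }

    /// Run every builder; a failure in one does not stop the others.
    fn build_all(&self, offline: bool) -> CorpusBuildResult {
        let mut result = CorpusBuildResult::default();
        for builder in &self.builders {
            if offline && builder.requires_network() {
                result.skipped += 1;
                continue;
            }
            match builder.build() {
                Ok(assertions) => {
                    result.succeeded += 1;
                    result.total_assertions += assertions.len();
                }
                Err(_) => result.failed += 1,
            }
        }
        result
    }
}

/// Toy builder with a fixed outcome, used only to exercise the registry.
struct FixedBuilder {
    name: &'static str,
    network: bool,
    outcome: Result<Vec<String>, String>,
}

impl CorpusBuilder for FixedBuilder {
    fn name(&self) -> &str { self.name }
    fn scheme(&self) -> &str { "example" }
    fn default_tier(&self) -> u8 { 2 }
    fn requires_network(&self) -> bool { self.network }
    fn build(&self) -> Result<Vec<String>, String> { self.outcome.clone() }
}
```

Counting failures instead of returning early is one simple way to keep a single unreachable source from blocking the rest of the build.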

1.2 RFC Ingester

| Task | Status |
|---|---|
| RfcCorpusBuilder | corpus/rfc.rs |
| HTTP fetching | Via ureq, cached to ~/.cache/aphoria/rfc-cache/ |
| RFC 2119 keyword parsing | MUST, MUST NOT, SHOULD, SHALL extraction |
| RFC-specific parsers | JWT (7519), OAuth (6749), Bearer (6750), TLS 1.3 (8446), TLS BCP (7525), TOTP (6238), Basic Auth (7617), HTTP (9110) |
| Concept mapping | rfc://{number}/{topic} at Tier 0 (Regulatory) |
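
Keyword extraction of the kind listed above can be sketched as follows. This is not the parser in corpus/rfc.rs: `Keyword` and `find_keyword` are hypothetical names, and the sketch recognizes only the four keywords named in the table, matching uppercase forms only.

```rust
/// The four requirement keywords named in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Keyword {
    Must,
    MustNot,
    Should,
    Shall,
}

/// Return the first requirement keyword in a sentence, if any.
/// Only uppercase forms count, so ordinary "must" in prose is ignored.
/// Limitation: "SHOULD NOT" and "SHALL NOT" are reported as Should/Shall,
/// so a caller that needs the negated forms must extend the match.
fn find_keyword(sentence: &str) -> Option<Keyword> {
    // Split into words and strip surrounding punctuation and quotes.
    let words: Vec<&str> = sentence
        .split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_ascii_alphabetic()))
        .collect();
    for (i, word) in words.iter().enumerate() {
        match *word {
            // Check the two-word form first so "MUST NOT" is not read as "MUST".
            "MUST" if words.get(i + 1) == Some(&"NOT") => return Some(Keyword::MustNot),
            "MUST" => return Some(Keyword::Must),
            "SHOULD" => return Some(Keyword::Should),
            "SHALL" => return Some(Keyword::Shall),
            _ => {}
        }
    }
    None
}
```

Matching whole words rather than substrings avoids false hits on words that merely contain a keyword.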

1.3 OWASP Ingester

| Task | Status |
|---|---|
| OwaspCorpusBuilder | corpus/owasp.rs |
| HTTP fetching | From GitHub raw content, cached to ~/.cache/aphoria/owasp-cache/ |
| Markdown parsing | MUST/SHOULD statements, section context |
| Cheat sheet parsers | Authentication, JWT, TLS, Secrets, Input Validation, Session, CSRF, Password Storage, HTTP Headers |
| Concept mapping | owasp://cheatsheet/{topic}/{claim} at Tier 1 (Clinical) |

1.4 Vendor Docs

| Task | Status |
|---|---|
| VendorCorpusBuilder | corpus/vendor.rs |
| PostgreSQL claims | pool_size, idle_timeout, ssl_mode |
| Redis claims | timeout, max_retries, tls |
| reqwest claims | cert_verification, connect_timeout, request_timeout |
| hyper claims | keep_alive_timeout, max_concurrent_streams |
| Go net/http claims | read_timeout, write_timeout, idle_timeout, min_tls_version |
| tokio-postgres claims | pool_size, ssl_mode |
| SQLx claims | max_connections, idle_timeout |
| Concept mapping | vendor://{product}/{topic}/{claim} at Tier 2 (Observational) |

1.5 Hardcoded Refactor

| Task | Status |
|---|---|
| HardcodedCorpusBuilder | corpus/hardcoded.rs: original 11 assertions |
| create_authoritative_assertion() | Made public in episteme.rs for corpus builders |

1.6 CLI Integration

| Task | Status |
|---|---|
| aphoria corpus build | Fetches and ingests from all sources |
| --only rfc,owasp,vendor | Filter to specific sources |
| --offline | Skip network-requiring sources |
| --clear-cache | Clear cache before building |
| aphoria corpus list | List available corpus sources |
| CorpusConfig | cache_dir, include_*, rfc_list options |

1.7 Error Handling

| Task | Status |
|---|---|
| RfcFetch error | Per-RFC fetch failures with context |
| OwaspFetch error | Per-cheat-sheet fetch failures with context |
| CorpusBuild error | General corpus build failures |
| Graceful degradation | Continue with other sources if one fails |

Files: corpus/mod.rs, corpus/hardcoded.rs, corpus/rfc.rs, corpus/owasp.rs, corpus/vendor.rs

Tests: 118 tests pass. Clippy and fmt clean.


Phase 3: Skill Integration

Complete. Aphoria is now usable in Claude Code agent workflows.

3.1 Claude Code Skill

| Task | Status |
|---|---|
| skill/SKILL.md | Comprehensive skill definition with all commands |
| /aphoria scan | Scan project, show conflicts grouped by verdict |
| /aphoria scan --fix | Interactive fix workflow |
| /aphoria ack | Acknowledge conflicts as intentional |
| /aphoria status | Show status and baseline |
| /aphoria diff | Show changes since baseline |
| /aphoria init | Initialize Aphoria |
| /aphoria baseline | Set baseline |
| skill/install.sh | Install script for ~/.claude/skills/aphoria/ |

Files: skill/SKILL.md, skill/install.sh, skill/hooks.json

3.2 Agent Pre-Flight Hook

| Task | Status |
|---|---|
| --exit-code flag | Returns 2 for BLOCK, 1 for FLAG only, 0 for clean |
| --strict flag | Lower thresholds (FLAG at 0.3, BLOCK at 0.5) |
| Hook template | skill/hooks.json with PreCommit and PrePush examples |

Usage:

{
  "hooks": {
    "PreCommit": [{"command": "aphoria scan --format sarif --exit-code"}],
    "PrePush": [{"command": "aphoria scan --strict --exit-code"}]
  }
}
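
The mapping from scan results to the exit codes in the table above might be sketched like this. The strict thresholds (0.3 and 0.5) and the codes 2/1/0 come from the table; that each conflict carries a numeric score, that a score at or above a threshold triggers the verdict, and all type and function names are assumptions. The default (non-strict) thresholds are not stated in this roadmap, so none are defined here.

```rust
/// Scan verdicts, ordered from least to most severe.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Verdict {
    Clean,
    Flag,
    Block,
}

/// Score thresholds at or above which a conflict is flagged or blocked.
struct Thresholds {
    flag: f64,
    block: f64,
}

/// Thresholds used with --strict, as listed in the table above.
const STRICT: Thresholds = Thresholds { flag: 0.3, block: 0.5 };

/// Classify one conflict score (assumed semantics: higher is worse).
fn classify(score: f64, t: &Thresholds) -> Verdict {
    if score >= t.block {
        Verdict::Block
    } else if score >= t.flag {
        Verdict::Flag
    } else {
        Verdict::Clean
    }
}

/// Exit code for a whole scan: the most severe verdict wins.
fn exit_code(scores: &[f64], t: &Thresholds) -> i32 {
    let worst = scores
        .iter()
        .map(|s| classify(*s, t))
        .max()
        .unwrap_or(Verdict::Clean);
    match worst {
        Verdict::Block => 2,
        Verdict::Flag => 1,
        Verdict::Clean => 0,
    }
}
```

Keeping FLAG and BLOCK on distinct codes lets a hook choose whether a flag alone should stop the commit.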

3.3 Alias Suggestion Workflow

Alias creation is now automatic (Phase 2A.3). When Aphoria scans:

  1. Tail-path matching finds authoritative assertions
  2. Aliases are auto-created with AliasOrigin::AutoDetected
  3. Future queries use the alias automatically

The skill documents the suggestion flow for manual alias management:

  • y (Accept): Creates alias
  • n (Reject): Records intentional difference
  • defer: Flags for later review

Phase 4: Pre-Commit Integration

Depends on Phase 3 (skill validates the UX before hook automation).

4.1 Git Pre-Commit Hook

A git pre-commit hook that runs Aphoria before every commit:

#!/bin/sh
# .git/hooks/pre-commit

aphoria scan --exit-code --format table

if [ $? -eq 2 ]; then
    echo "BLOCKED: Fix conflicts before committing"
    exit 1
fi

Or using pre-commit framework (.pre-commit-config.yaml):

repos:
  - repo: local
    hooks:
      - id: aphoria
        name: Aphoria Security Lint
        entry: aphoria scan --exit-code
        language: system
        pass_filenames: false

4.2 Baseline Mode

Already implemented in Phase 2. For existing projects with many conflicts:

$ aphoria baseline
Baseline recorded: 12 existing conflicts frozen.
Future scans will only report new conflicts.
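
Conceptually, baseline mode is a set difference between the frozen conflicts and the current scan. A minimal sketch, assuming each conflict can be reduced to a stable string identifier (the identifier format and the function name `new_conflicts` are hypothetical):

```rust
use std::collections::HashSet;

/// Return the conflicts in `current` that are not part of the frozen baseline,
/// preserving the order in which the scan reported them.
fn new_conflicts<'a>(baseline: &HashSet<String>, current: &'a [String]) -> Vec<&'a str> {
    current
        .iter()
        .filter(|id| !baseline.contains(*id))
        .map(String::as_str)
        .collect()
}
```

This only works if identifiers stay stable across scans; an identifier that embeds a line number, for example, would make an old conflict look new after unrelated edits.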

4.3 Diff-Only Scanning

Scan only changed files instead of the whole project:

# Scan only staged files
aphoria scan --staged

# Scan only files changed since baseline
aphoria scan --since-baseline

This makes pre-commit hooks fast even in large projects.


Phase 5: Research Agent Loop

Depends on gap data accumulating from project scans.

5.1 Gap Detection

When Aphoria extracts a claim and no authoritative source exists for that concept, log it as a gap:

GAP: code://rust/citadeldb/cache/redis/max_memory_policy
     No authoritative source found for redis/max_memory_policy
     Seen in 3 projects

5.2 Research Agent Trigger

When a gap is seen across N projects (configurable, default 3), dispatch a research agent:

  1. Agent searches for authoritative documentation on redis max_memory_policy
  2. Finds Redis official docs
  3. Extracts normative claims: "default is noeviction, recommended allkeys-lru for cache use cases"
  4. Ingests as vendor://redis/cache/max_memory_policy at Tier 2
  5. Future Aphoria scans now have something to conflict against
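
The gap counting and trigger described in 5.1 and 5.2 could be sketched as below. `GapTracker` and `record` are hypothetical names, and counting distinct projects (not repeated scans of the same project) is an assumption about how "seen across N projects" is measured.

```rust
use std::collections::{HashMap, HashSet};

/// Tracks which projects have reported a gap for each concept.
struct GapTracker {
    /// Number of distinct projects required before dispatching research.
    threshold: usize,
    seen: HashMap<String, HashSet<String>>,
}

impl GapTracker {
    fn new(threshold: usize) -> Self {
        GapTracker { threshold, seen: HashMap::new() }
    }

    /// Record that `project` hit a gap for `concept`. Returns true exactly
    /// once: when the distinct-project count first reaches the threshold.
    fn record(&mut self, concept: &str, project: &str) -> bool {
        let projects = self.seen.entry(concept.to_string()).or_default();
        // `insert` returns false for a project that was already counted.
        let newly_seen = projects.insert(project.to_string());
        newly_seen && projects.len() == self.threshold
    }
}
```

Returning true only on the transition to the threshold keeps a popular gap from dispatching a new research agent on every later scan.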

5.3 Community Corpus Contributions

Users who run Aphoria can opt in to contribute their alias mappings and acknowledgment patterns (anonymized) to a shared corpus. Common patterns propagate:

  • "Every Rust project has this JWT pattern" → pre-built alias set for Rust JWT libraries
  • "This Redis config is always flagged and always acknowledged" → relax the default threshold for that concept so it is flagged less readily
  • "This TLS pattern is always a real bug" → tighten the default threshold for that concept so it is flagged or blocked more readily

Milestone Summary

| Phase | Deliverable | Depends On | Status |
|---|---|---|---|
| 0 | ConceptPath in StemeDB | concept-hierarchy spec | |
| 2 | Aphoria CLI (scan, report, ack) | Phase 0 | |
| 2A.1 | Leaf-based concept matching | Phase 2 | |
| 2A.2 | Alias resolution in QueryEngine | Phase 2 | |
| 2A.3 | Auto-alias creation | Phase 2A.2 | |
| 1 | Authoritative corpus expansion | Phase 0 | |
| 3 | Claude Code skill + hooks | Phase 2A | |
| 4 | Pre-commit integration (git hooks, diff scanning) | Phase 3 | NEXT |
| 5 | Research agent loop | Phase 4 (gap data) | |

Current state:

  • Phase 1 is complete: RFC, OWASP, and Vendor corpus builders with aphoria corpus build CLI
  • Phase 2A is complete: conflict detection via tail-path matching, alias-aware QueryEngine, and auto-alias creation
  • Phase 3 is complete: /aphoria skill installed to ~/.claude/skills/aphoria/, hook templates ready

Next: Phase 4 — Pre-commit integration (git hooks, diff-only scanning).