jordan d3a88585fe feat: Phase 6 UAT - Admission control, HLC recency, cluster coordination

This commit includes comprehensive work on Phase 6 features:

## Admission Control (Phase 6 admission middleware)
- AdmissionStore implementation backed by TrustRankStore
- PoW verification with tier-based difficulty computation
- Trust tier progression (Newcomer → Established → Trusted → Authority)
- API integration with admission status endpoints

## HLC Recency Lens (Phase 6C)
- HlcRecencyLens for distributed system ordering
- Hybrid logical clock integration with causality preservation

## Cluster Coordination (Phase 6C)
- Multi-node cluster tests (availability, partition tolerance)
- CRDT convergence tests for anti-entropy sync
- Gateway handler improvements

## Aphoria Code Linter (Phase 2A)
- RFC/OWASP corpus builders with network fetching and caching
- Concept hierarchy with auto-alias creation on conflict detection
- Multiple security extractors (TLS, JWT, CORS, secrets, rate limiting)

## Code Organization
- Split large files into modules to comply with 500-line limit
- Improved test organization with separate test modules
- Fixed rkyv serialization for EigenTrustState (AgentScore struct)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-03 00:43:37 -07:00


Aphoria Roadmap

Phase 0: StemeDB Foundation ✅

Tracked in: roadmap.md § 5D. Concept Hierarchy

Changes to the core database that Aphoria depends on, shipped as Phase 5D of the main StemeDB roadmap.

Aphoria Phase 0	StemeDB Phase 5D	Status
0.1 ConceptPath Type	5D.1 ConceptPath Type	✅
0.2 ConceptPath in Assertion	(implicit in 5D.1)	✅
0.3 Hierarchical Index	5D.4 Hierarchical Query	✅
0.4 Alias Store	5D.3 Alias Store + 5D.5 Alias Resolution	✅
0.5 Source Class Inference	5D.6 Source Class Inference	✅
0.6 Concept API Endpoints	5D.7 Concept API Endpoints	✅

Spec: docs/specs/concept-hierarchy.md

Phase 2: CLI Core ✅

Phase 2 was built before Phase 1 (authoritative corpus expansion). The CLI pipeline works end-to-end with a bootstrapped corpus of 11 hardcoded assertions covering TLS, JWT, CORS, secrets, and rate limiting.

Task	Status
2.1 Project Walker	✅ `walker/mod.rs`, `walker/path_mapper.rs`, `walker/language.rs`
2.2 Extractors (7)	✅ `tls_verify`, `jwt_config`, `hardcoded_secrets`, `timeout_config`, `dep_versions`, `cors_config`, `rate_limit`
2.3 Ingestion Bridge	✅ `bridge.rs` — BLAKE3 hashing, Ed25519 signing, claim→assertion conversion
2.4 Conflict Query	✅ `episteme.rs` — LocalEpisteme with check_conflicts()
2.5 Report Output	✅ `report/` — table (comfy-table), JSON, SARIF 2.1.0, markdown
2.6 Acknowledge Command	✅ `lib.rs` acknowledge()
Baseline & Diff	✅ `lib.rs` set_baseline(), show_diff()
Status Command	✅ `lib.rs` show_status()

118 tests pass. Clippy and fmt clean.

Phase 2A: Concept Matching ✅

Status: Complete. Tail-path matching (2A.1), alias-aware queries (2A.2), and auto-alias creation (2A.3) are all implemented.

2A.1 Leaf-Based Concept Matching (Aphoria-side fix) ✅

Implemented in episteme.rs via ConceptIndex:

make_key(subject, predicate) extracts the last two path segments plus the predicate
build(assertions) creates an in-memory index keyed by that tail path
lookup(subject, predicate) finds matching authoritative assertions
check_conflicts() uses ConceptIndex instead of QueryEngine for cross-scheme matching
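
The tail-path keying described above can be sketched roughly as follows. This is a minimal illustration, not the code in episteme.rs: the `Assertion` struct, the `#` separator in the key, and the exact signatures are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an authoritative assertion.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    pub subject: String,
    pub predicate: String,
    pub value: String,
}

/// Builds a scheme-independent key from the last two path segments plus the predicate.
/// The "#" separator is an assumption made for this sketch.
pub fn make_key(subject: &str, predicate: &str) -> String {
    // Drop the "scheme://" prefix if present, then split the remaining path.
    let path = subject.split_once("://").map_or(subject, |(_, rest)| rest);
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let start = segments.len().saturating_sub(2);
    format!("{}#{}", segments[start..].join("/"), predicate)
}

/// In-memory index keyed by tail path, so subjects from different schemes can match.
#[derive(Default)]
pub struct ConceptIndex {
    by_tail: HashMap<String, Vec<Assertion>>,
}

impl ConceptIndex {
    pub fn build(assertions: &[Assertion]) -> Self {
        let mut index = ConceptIndex::default();
        for a in assertions {
            index
                .by_tail
                .entry(make_key(&a.subject, &a.predicate))
                .or_default()
                .push(a.clone());
        }
        index
    }

    /// Returns the authoritative assertions that share the tail key (empty if none).
    pub fn lookup(&self, subject: &str, predicate: &str) -> &[Assertion] {
        self.by_tail
            .get(&make_key(subject, predicate))
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

With this keying, code://rust/myapp/tls/cert_verification and rfc://5246/tls/cert_verification both reduce to the tail tls/cert_verification, which is what lets a code claim find its authoritative counterpart across schemes.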

Integration tests prove TLS and JWT conflicts are detected correctly.

2A.2 Alias Resolution in QueryEngine (StemeDB-side fix) ✅

Wired AliasStore into QueryEngine.execute():

Added resolve_aliases: bool field to Query (defaults to false)
Added alias_store: Option<Arc<dyn AliasStore>> to QueryEngine
Added .with_alias_store() builder method
When resolve_aliases: true, expands subject via AliasStore.resolve_all() before index lookup
Added fetch_by_subjects() and fetch_by_subjects_predicate() for multi-subject deduplication
Modified Query.matches() to skip subject filtering when aliases are resolved
Skips fast path (MV lookup) when resolve_aliases: true
Gracefully degrades when no alias store is configured
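
A simplified sketch of the expansion step listed above, assuming a reduced `AliasStore` trait with only `resolve_all()`. The `subjects_for` helper and `MapAliasStore` are hypothetical names for illustration; the real QueryEngine.execute() does considerably more.

```rust
use std::collections::{BTreeSet, HashMap};
use std::sync::Arc;

/// Hypothetical, reduced alias store trait; the real one likely has more methods.
pub trait AliasStore: Send + Sync {
    /// Returns every subject recorded as equivalent to `subject`.
    fn resolve_all(&self, subject: &str) -> Vec<String>;
}

pub struct Query {
    pub subject: String,
    pub resolve_aliases: bool,
}

pub struct QueryEngine {
    alias_store: Option<Arc<dyn AliasStore>>,
}

impl QueryEngine {
    pub fn new() -> Self {
        QueryEngine { alias_store: None }
    }

    pub fn with_alias_store(mut self, store: Arc<dyn AliasStore>) -> Self {
        self.alias_store = Some(store);
        self
    }

    /// Expands the query subject into the deduplicated set of subjects to look up.
    /// Without a configured store, or with resolve_aliases off, only the original
    /// subject is returned (graceful degradation).
    pub fn subjects_for(&self, query: &Query) -> Vec<String> {
        let mut subjects = BTreeSet::new();
        subjects.insert(query.subject.clone());
        if query.resolve_aliases {
            if let Some(store) = &self.alias_store {
                subjects.extend(store.resolve_all(&query.subject));
            }
        }
        subjects.into_iter().collect()
    }
}

/// In-memory store used only for this illustration.
pub struct MapAliasStore(pub HashMap<String, Vec<String>>);

impl AliasStore for MapAliasStore {
    fn resolve_all(&self, subject: &str) -> Vec<String> {
        self.0.get(subject).cloned().unwrap_or_default()
    }
}
```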

7 unit tests in engine/tests/alias_resolution.rs. This is the architecturally correct long-term fix that complements leaf matching.

2A.3 Auto-Alias Creation ✅

When Aphoria ingests authoritative assertions and code claims that share leaf names, automatically create aliases:

code://rust/myapp/tls/cert_verification ↔ rfc://5246/tls/cert_verification
code://rust/myapp/auth/jwt/audience_validation ↔ rfc://7519/jwt/audience_validation

This bridges 2A.1 (leaf matching) with 2A.2 (alias resolution) — leaf matching identifies candidates, aliases persist the relationship.

Implementation:

Added auto_create_aliases: bool config option to AliasConfig (defaults to true)
Added AliasOrigin::AutoDetected variant to stemedb-core for tracking auto-created aliases
Wired GenericAliasStore into LocalEpisteme for alias persistence
In check_conflicts(), when a code claim matches an authoritative claim by leaf, calls AliasStore.set_alias() to persist the relationship with AliasOrigin::AutoDetected
Alias creation is idempotent (skips if alias already exists)
4 unit tests verify: alias creation on conflict, no creation when disabled, correct origin, idempotency
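
The idempotent creation path might look like the sketch below. It assumes an in-memory store and a `set_alias()` that reports whether anything was written; `InMemoryAliasStore`, `maybe_auto_alias`, the `Manual` variant, and the return type are all illustrative guesses, not the GenericAliasStore API.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AliasOrigin {
    Manual, // hypothetical variant, included only for contrast
    AutoDetected,
}

#[derive(Default)]
pub struct InMemoryAliasStore {
    aliases: HashMap<(String, String), AliasOrigin>,
}

impl InMemoryAliasStore {
    /// Stores the alias and returns true, or returns false if it already exists.
    pub fn set_alias(&mut self, a: &str, b: &str, origin: AliasOrigin) -> bool {
        // Normalize the pair so (a, b) and (b, a) are treated as the same alias.
        let key = if a <= b {
            (a.to_string(), b.to_string())
        } else {
            (b.to_string(), a.to_string())
        };
        if self.aliases.contains_key(&key) {
            return false;
        }
        self.aliases.insert(key, origin);
        true
    }

    pub fn len(&self) -> usize {
        self.aliases.len()
    }
}

/// Last path segment of a concept subject.
pub fn leaf(subject: &str) -> &str {
    subject.rsplit('/').next().unwrap_or(subject)
}

/// Creates an AutoDetected alias when the option is enabled and the leaves match.
pub fn maybe_auto_alias(
    store: &mut InMemoryAliasStore,
    auto_create_aliases: bool,
    code_subject: &str,
    auth_subject: &str,
) -> bool {
    if !auto_create_aliases || leaf(code_subject) != leaf(auth_subject) {
        return false;
    }
    store.set_alias(code_subject, auth_subject, AliasOrigin::AutoDetected)
}
```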

Phase 1: Authoritative Corpus Expansion ✅

Expanded from 11 hardcoded assertions to a pluggable corpus system with RFC, OWASP, and Vendor sources.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     aphoria corpus build                         │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────────────┐  │
│  │ RFC Ingester │  │ OWASP        │  │ Vendor Bootstrapper   │  │
│  │ (Tier 0)     │  │ Ingester     │  │ (Tier 2)              │  │
│  │              │  │ (Tier 1)     │  │                       │  │
│  └──────┬───────┘  └──────┬───────┘  └───────────┬───────────┘  │
│         │                 │                      │              │
│         └─────────────────┼──────────────────────┘              │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ CorpusRegistry  │                            │
│                  └────────┬────────┘                            │
│                           ▼                                     │
│                  ┌─────────────────┐                            │
│                  │ LocalEpisteme   │                            │
│                  │ ingest_         │                            │
│                  │ authoritative() │                            │
│                  └─────────────────┘                            │
└─────────────────────────────────────────────────────────────────┘

1.1 CorpusBuilder Trait ✅

Task	Status
`CorpusBuilder` trait	✅ `corpus/mod.rs` — name, scheme, default_tier, build, requires_network
`CorpusRegistry`	✅ Manages multiple builders, build_all(), list_builders()
`CorpusBuildResult`	✅ Stats per builder, total assertions, success/fail/skip counts
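
As a rough sketch of how these three pieces could fit together: the method names come from the table above, but the signatures, the `String` error type, the `offline` parameter, and the `StaticBuilder` helper are assumptions for illustration. It also shows the skip-when-offline and continue-on-failure behavior described elsewhere in this phase.

```rust
/// Hypothetical simplification: an assertion is just a concept path string here.
pub type Assertion = String;

pub trait CorpusBuilder {
    fn name(&self) -> &str;
    fn scheme(&self) -> &str;
    fn default_tier(&self) -> u8;
    fn requires_network(&self) -> bool;
    fn build(&self) -> Result<Vec<Assertion>, String>;
}

#[derive(Debug, Default, PartialEq)]
pub struct CorpusBuildResult {
    pub total_assertions: usize,
    pub succeeded: usize,
    pub failed: usize,
    pub skipped: usize,
}

#[derive(Default)]
pub struct CorpusRegistry {
    builders: Vec<Box<dyn CorpusBuilder>>,
}

impl CorpusRegistry {
    pub fn register(&mut self, builder: Box<dyn CorpusBuilder>) {
        self.builders.push(builder);
    }

    pub fn list_builders(&self) -> Vec<&str> {
        self.builders.iter().map(|b| b.name()).collect()
    }

    /// Runs every builder; a failing builder is counted and the rest still run.
    pub fn build_all(&self, offline: bool) -> CorpusBuildResult {
        let mut result = CorpusBuildResult::default();
        for builder in &self.builders {
            if offline && builder.requires_network() {
                result.skipped += 1;
                continue;
            }
            match builder.build() {
                Ok(assertions) => {
                    result.succeeded += 1;
                    result.total_assertions += assertions.len();
                }
                Err(err) => {
                    eprintln!("corpus builder {} failed: {}", builder.name(), err);
                    result.failed += 1;
                }
            }
        }
        result
    }
}

/// Illustrative builder that returns a fixed result.
pub struct StaticBuilder {
    pub name: &'static str,
    pub network: bool,
    pub result: Result<Vec<Assertion>, String>,
}

impl CorpusBuilder for StaticBuilder {
    fn name(&self) -> &str { self.name }
    fn scheme(&self) -> &str { "vendor" }
    fn default_tier(&self) -> u8 { 2 }
    fn requires_network(&self) -> bool { self.network }
    fn build(&self) -> Result<Vec<Assertion>, String> { self.result.clone() }
}
```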

1.2 RFC Ingester ✅

Task	Status
`RfcCorpusBuilder`	✅ `corpus/rfc.rs`
HTTP fetching	✅ Via `ureq`, cached to `~/.cache/aphoria/rfc-cache/`
RFC 2119 keyword parsing	✅ MUST, MUST NOT, SHOULD, SHALL extraction
RFC-specific parsers	✅ JWT (7519), OAuth (6749), Bearer (6750), TLS 1.3 (8446), TLS BCP (7525), TOTP (6238), Basic Auth (7617), HTTP (9110)
Concept mapping	✅ `rfc://{number}/{topic}` at Tier 0 (Regulatory)
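
The RFC 2119 keyword step could be approximated as below. This is a deliberately naive sketch, not the parser in corpus/rfc.rs: it splits sentences on periods (which breaks on section numbers such as "4.1.3"), matches only the four upper-case keywords listed in the table, and uses hypothetical names throughout.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Keyword {
    Must,
    MustNot,
    Should,
    Shall,
}

/// Returns the first requirement keyword in a sentence, if any.
/// Only upper-case forms count, following the RFC 2119 convention.
/// "SHOULD NOT" and "SHALL NOT" are reported as Should and Shall in this sketch.
pub fn find_keyword(sentence: &str) -> Option<Keyword> {
    // Strip surrounding punctuation so `MUST,` or `"MUST"` still match.
    let words: Vec<&str> = sentence
        .split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_ascii_alphanumeric()))
        .collect();
    for (i, word) in words.iter().enumerate() {
        match *word {
            // Check the two-word form first so "MUST NOT" is not reported as "MUST".
            "MUST" if words.get(i + 1) == Some(&"NOT") => return Some(Keyword::MustNot),
            "MUST" => return Some(Keyword::Must),
            "SHOULD" => return Some(Keyword::Should),
            "SHALL" => return Some(Keyword::Shall),
            _ => {}
        }
    }
    None
}

/// Splits text into sentences and keeps those that carry a requirement keyword.
pub fn extract_requirements(text: &str) -> Vec<(Keyword, String)> {
    text.split('.')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .filter_map(|s| find_keyword(s).map(|k| (k, s.to_string())))
        .collect()
}
```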

1.3 OWASP Ingester ✅

Task	Status
`OwaspCorpusBuilder`	✅ `corpus/owasp.rs`
HTTP fetching	✅ From GitHub raw content, cached to `~/.cache/aphoria/owasp-cache/`
Markdown parsing	✅ MUST/SHOULD statements, section context
Cheat sheet parsers	✅ Authentication, JWT, TLS, Secrets, Input Validation, Session, CSRF, Password Storage, HTTP Headers
Concept mapping	✅ `owasp://cheatsheet/{topic}/{claim}` at Tier 1 (Clinical)

1.4 Vendor Docs ✅

Task	Status
`VendorCorpusBuilder`	✅ `corpus/vendor.rs`
PostgreSQL claims	✅ pool_size, idle_timeout, ssl_mode
Redis claims	✅ timeout, max_retries, tls
reqwest claims	✅ cert_verification, connect_timeout, request_timeout
hyper claims	✅ keep_alive_timeout, max_concurrent_streams
Go net/http claims	✅ read_timeout, write_timeout, idle_timeout, min_tls_version
tokio-postgres claims	✅ pool_size, ssl_mode
SQLx claims	✅ max_connections, idle_timeout
Concept mapping	✅ `vendor://{product}/{topic}/{claim}` at Tier 2 (Observational)

1.5 Hardcoded Refactor ✅

Task	Status
`HardcodedCorpusBuilder`	✅ `corpus/hardcoded.rs` — original 11 assertions
`create_authoritative_assertion()`	✅ Made public in `episteme.rs` for corpus builders

1.6 CLI Integration ✅

Task	Status
`aphoria corpus build`	✅ Fetches and ingests from all sources
`--only rfc,owasp,vendor`	✅ Filter to specific sources
`--offline`	✅ Skip network-requiring sources
`--clear-cache`	✅ Clear cache before building
`aphoria corpus list`	✅ List available corpus sources
`CorpusConfig`	✅ cache_dir, include_*, rfc_list options

1.7 Error Handling ✅

Task	Status
`RfcFetch` error	✅ Per-RFC fetch failures with context
`OwaspFetch` error	✅ Per-cheat-sheet fetch failures with context
`CorpusBuild` error	✅ General corpus build failures
Graceful degradation	✅ Continue with other sources if one fails

Files: corpus/mod.rs, corpus/hardcoded.rs, corpus/rfc.rs, corpus/owasp.rs, corpus/vendor.rs

Tests: 118 tests pass. Clippy and fmt clean.

Phase 3: Skill Integration ✅

Complete. Aphoria is now usable in Claude Code agent workflows.

3.1 Claude Code Skill ✅

Task	Status
`skill/SKILL.md`	✅ Comprehensive skill definition with all commands
`/aphoria scan`	✅ Scan project, show conflicts grouped by verdict
`/aphoria scan --fix`	✅ Interactive fix workflow
`/aphoria ack`	✅ Acknowledge conflicts as intentional
`/aphoria status`	✅ Show status and baseline
`/aphoria diff`	✅ Show changes since baseline
`/aphoria init`	✅ Initialize Aphoria
`/aphoria baseline`	✅ Set baseline
`skill/install.sh`	✅ Install script for `~/.claude/skills/aphoria/`

Files: skill/SKILL.md, skill/install.sh, skill/hooks.json

3.2 Agent Pre-Flight Hook ✅

Task	Status
`--exit-code` flag	✅ Returns 2 for BLOCK, 1 for FLAG only, 0 for clean
`--strict` flag	✅ Lower thresholds (FLAG at 0.3, BLOCK at 0.5)
Hook template	✅ `skill/hooks.json` with PreCommit and PrePush examples

Usage:

{
  "hooks": {
    "PreCommit": [{"command": "aphoria scan --format sarif --exit-code"}],
    "PrePush": [{"command": "aphoria scan --strict --exit-code"}]
  }
}
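
One way the exit-code contract could be derived from per-conflict scores is sketched below. The strict thresholds (0.3 and 0.5) and the exit codes come from the table above; the notion of a numeric conflict score, the `>=` comparison, and all type and function names are assumptions. The default (non-strict) thresholds are not stated in this roadmap, so the sketch takes them as a parameter rather than guessing.

```rust
/// Verdicts ordered from least to most severe, so `max()` yields the worst one.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Verdict {
    Clean,
    Flag,
    Block,
}

pub struct Thresholds {
    pub flag: f64,
    pub block: f64,
}

impl Thresholds {
    /// Values documented for the --strict flag.
    pub const STRICT: Thresholds = Thresholds { flag: 0.3, block: 0.5 };
}

/// Maps a conflict score to a verdict (the inclusive comparison is an assumption).
pub fn verdict(score: f64, thresholds: &Thresholds) -> Verdict {
    if score >= thresholds.block {
        Verdict::Block
    } else if score >= thresholds.flag {
        Verdict::Flag
    } else {
        Verdict::Clean
    }
}

/// Exit code for a whole scan: 2 if anything blocks, 1 if only flags, 0 if clean.
pub fn exit_code(scores: &[f64], thresholds: &Thresholds) -> i32 {
    let worst = scores
        .iter()
        .map(|s| verdict(*s, thresholds))
        .max()
        .unwrap_or(Verdict::Clean);
    match worst {
        Verdict::Block => 2,
        Verdict::Flag => 1,
        Verdict::Clean => 0,
    }
}
```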

3.3 Alias Suggestion Workflow ✅

Alias creation is now automatic (Phase 2A.3). When Aphoria scans:

Tail-path matching finds authoritative assertions
Aliases are auto-created with AliasOrigin::AutoDetected
Future queries use the alias automatically

The skill documents the suggestion flow for manual alias management:

y (Accept): Creates alias
n (Reject): Records intentional difference
defer: Flags for later review

Phase 4: Pre-Commit Integration ⬜

Depends on Phase 3 (skill validates the UX before hook automation).

4.1 Git Pre-Commit Hook ⬜

A git pre-commit hook that runs Aphoria before every commit:

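#!/bin/sh
# .git/hooks/pre-commit

aphoria scan --exit-code --format table
status=$?

# Exit code 2 means at least one BLOCK verdict.
# FLAG-only results (exit code 1) do not stop the commit.
if [ "$status" -eq 2 ]; then
    echo "BLOCKED: Fix conflicts before committing" >&2
    exit 1
fi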

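Or, using the pre-commit framework (.pre-commit-config.yaml). Because pre-commit treats any non-zero exit status as a failure, this entry also stops the commit on FLAG-only results (exit code 1):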

repos:
  - repo: local
    hooks:
      - id: aphoria
        name: Aphoria Security Lint
        entry: aphoria scan --exit-code
        language: system
        pass_filenames: false

4.2 Baseline Mode ✅

Already implemented in Phase 2. For existing projects with many conflicts:

$ aphoria baseline
Baseline recorded: 12 existing conflicts frozen.
Future scans will only report new conflicts.
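
Conceptually, baseline mode freezes the current set of conflicts and later reports only the difference. A minimal sketch, assuming conflicts can be identified by subject and predicate (the `ConflictId` and `Baseline` types are hypothetical, not the structures used by set_baseline() and show_diff()):

```rust
use std::collections::HashSet;

/// Hypothetical identity of a conflict, used to compare scans.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ConflictId {
    pub subject: String,
    pub predicate: String,
}

impl ConflictId {
    pub fn new(subject: &str, predicate: &str) -> Self {
        ConflictId { subject: subject.to_string(), predicate: predicate.to_string() }
    }
}

pub struct Baseline {
    frozen: HashSet<ConflictId>,
}

impl Baseline {
    /// Freezes the conflicts that exist at the time the baseline is recorded.
    pub fn record(conflicts: &[ConflictId]) -> Self {
        Baseline { frozen: conflicts.iter().cloned().collect() }
    }

    pub fn frozen_count(&self) -> usize {
        self.frozen.len()
    }

    /// Returns only the conflicts that were not present in the baseline.
    pub fn new_conflicts<'a>(&self, current: &'a [ConflictId]) -> Vec<&'a ConflictId> {
        current.iter().filter(|c| !self.frozen.contains(*c)).collect()
    }
}
```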

4.3 Diff-Only Scanning ⬜

Scan only changed files instead of the whole project:

# Scan only staged files
aphoria scan --staged

# Scan only files changed since baseline
aphoria scan --since-baseline

This makes pre-commit hooks fast even in large projects.
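
Since `--staged` is not implemented yet, the following is only one possible approach: ask git for the staged paths and hand that list to the walker. The git flags used (`diff --cached --name-only --diff-filter=ACMR`) are standard; the function names are hypothetical.

```rust
use std::io;
use std::path::PathBuf;
use std::process::Command;

/// Parses the newline-separated output of `git diff --name-only`.
/// Note: without `-z`, git quotes paths containing unusual characters;
/// this sketch does not unquote them.
pub fn parse_name_only(output: &str) -> Vec<PathBuf> {
    output
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(PathBuf::from)
        .collect()
}

/// Lists files staged for commit (added, copied, modified, or renamed).
/// Must be run inside a git working tree.
pub fn staged_files() -> io::Result<Vec<PathBuf>> {
    let out = Command::new("git")
        .args(["diff", "--cached", "--name-only", "--diff-filter=ACMR"])
        .output()?;
    if !out.status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "git diff failed"));
    }
    Ok(parse_name_only(&String::from_utf8_lossy(&out.stdout)))
}
```

Deleted files are excluded by the filter because there is nothing left to scan in them.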

Phase 5: Research Agent Loop ⬜

Depends on gap data accumulating from project scans.

5.1 Gap Detection

When Aphoria extracts a claim and no authoritative source exists for that concept, log it as a gap:

GAP: code://rust/citadeldb/cache/redis/max_memory_policy
     No authoritative source found for redis/max_memory_policy
     Seen in 3 projects

5.2 Research Agent Trigger

When a gap is seen across N projects (configurable, default 3), dispatch a research agent:

Agent searches for authoritative documentation on redis max_memory_policy
Finds Redis official docs
Extracts normative claims: "default is noeviction, recommended allkeys-lru for cache use cases"
Ingests as vendor://redis/cache/max_memory_policy at Tier 2
Future Aphoria scans now have something to conflict against
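
The trigger condition in 5.1 and 5.2 amounts to counting distinct projects per missing concept and firing once when the count reaches N. A sketch under that reading, with a hypothetical `GapTracker` type (nothing here exists in Aphoria yet):

```rust
use std::collections::{HashMap, HashSet};

pub struct GapTracker {
    threshold: usize,
    /// Concept tail (e.g. "redis/max_memory_policy") mapped to the projects it was seen in.
    seen_in: HashMap<String, HashSet<String>>,
    /// Concepts for which a research agent has already been dispatched.
    dispatched: HashSet<String>,
}

impl GapTracker {
    pub fn new(threshold: usize) -> Self {
        GapTracker { threshold, seen_in: HashMap::new(), dispatched: HashSet::new() }
    }

    /// Records a gap sighting. Returns true exactly once per concept: the first
    /// time it has been seen in at least `threshold` distinct projects.
    pub fn record(&mut self, concept: &str, project: &str) -> bool {
        let projects = self.seen_in.entry(concept.to_string()).or_default();
        projects.insert(project.to_string());
        projects.len() >= self.threshold && self.dispatched.insert(concept.to_string())
    }

    pub fn project_count(&self, concept: &str) -> usize {
        self.seen_in.get(concept).map_or(0, HashSet::len)
    }
}

impl Default for GapTracker {
    /// Uses the default of 3 projects mentioned in the roadmap.
    fn default() -> Self {
        GapTracker::new(3)
    }
}
```

Counting distinct projects rather than raw sightings keeps one noisy repository from triggering research on its own.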

5.3 Community Corpus Contributions

Users who run Aphoria can opt in to contribute their alias mappings and acknowledgment patterns (anonymized) to a shared corpus. Common patterns propagate:

"Every Rust project has this JWT pattern" → pre-built alias set for Rust JWT libraries
"This Redis config is always flagged and always acknowledged" → raise the default threshold for that concept so it is flagged less often
"This TLS pattern is always a real bug" → lower the default threshold so it is caught sooner

Milestone Summary

Phase	Deliverable	Depends On	Status
0	ConceptPath in StemeDB	concept-hierarchy spec	✅
2	Aphoria CLI (scan, report, ack)	Phase 0	✅
2A.1	Leaf-based concept matching	Phase 2	✅
2A.2	Alias resolution in QueryEngine	Phase 2	✅
2A.3	Auto-alias creation	Phase 2A.2	✅
1	Authoritative corpus expansion	Phase 0	✅
3	Claude Code skill + hooks	Phase 2A	✅
4	Pre-commit integration (git hooks, diff scanning)	Phase 3	⬜ NEXT
5	Research agent loop	Phase 4 (gap data)	⬜

Current state:

Phase 1 is complete: RFC, OWASP, and Vendor corpus builders with aphoria corpus build CLI
Phase 2A is complete: conflict detection via tail-path matching, alias-aware QueryEngine, and auto-alias creation
Phase 3 is complete: /aphoria skill installed to ~/.claude/skills/aphoria/, hook templates ready

Next: Phase 4 — Pre-commit integration (git hooks, diff-only scanning).
