# Full-Cycle Pre-Commit Vision
**Date:** 2026-02-04
**Status:** Vision / Gap Analysis
## Executive Summary
The pre-commit hook should be a **bidirectional knowledge sync**, not just a read-only linter. Every commit extracts claims from code, checks them against authority, and records observations back into the local store, building project memory and (optionally) contributing to community intelligence.
## The Vision: Scan + Sync
```
┌─────────────────────────────────────────────────────────────┐
│                       PRE-COMMIT FLOW                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. EXTRACT    What claims does this code make?             │
│                (TLS settings, timeouts, crypto, etc.)       │
│                                                             │
│  2. CHECK      Against authoritative corpus (Tier 0-2)      │
│                Against project's own prior claims           │
│                                                             │
│  3. CLASSIFY                                                │
│  ┌────────────────────┬──────────────────────────────┐      │
│  │ Scenario           │ Result                       │      │
│  ├────────────────────┼──────────────────────────────┤      │
│  │ Authority conflict │ FIX code or ACK deviation    │      │
│  │ Self conflict      │ Intentional change? Ack it   │      │
│  │ Novel claim        │ Record as observation        │      │
│  │ Unchanged claim    │ Update timestamp (heartbeat) │      │
│  └────────────────────┴──────────────────────────────┘      │
│                                                             │
│  4. UPDATE     Store observations to local Episteme         │
│                - New claims → Tier 4 assertions             │
│                - Changed claims → new version               │
│                - Acks → explicit policy decisions           │
│                                                             │
│  5. GATE       Exit codes for git hook                      │
│                - 2 = BLOCK (authority conflict)             │
│                - 1 = FLAG (self conflict, review)           │
│                - 0 = PASS                                   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
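The CLASSIFY and GATE steps can be sketched as two small pure functions. This is a minimal Python sketch for illustration, not Aphoria's implementation: the `Scenario` enum and the `classify` and `exit_code` functions are hypothetical names, the authority check is reduced to a boolean supplied by the caller, and severity levels within authority conflicts are ignored.

```python
from enum import Enum
from typing import Iterable, Optional


class Scenario(Enum):
    AUTHORITY_CONFLICT = "authority_conflict"
    SELF_CONFLICT = "self_conflict"
    NOVEL = "novel"
    UNCHANGED = "unchanged"


def classify(value: str, prior: Optional[str], violates_authority: bool) -> Scenario:
    """Classify one extracted claim (hypothetical model).

    prior is the value this project last recorded for the same subject,
    or None if the subject has never been seen.
    """
    if violates_authority:
        return Scenario.AUTHORITY_CONFLICT
    if prior is None:
        return Scenario.NOVEL
    if value != prior:
        return Scenario.SELF_CONFLICT
    return Scenario.UNCHANGED


def exit_code(scenarios: Iterable[Scenario]) -> int:
    """Return the worst code across all claims: 2 = BLOCK, 1 = FLAG, 0 = PASS."""
    codes = {Scenario.AUTHORITY_CONFLICT: 2, Scenario.SELF_CONFLICT: 1}
    return max((codes.get(s, 0) for s in scenarios), default=0)
```

Note that git aborts a commit on any non-zero exit status from a pre-commit hook, so a hook wrapper would still need to decide whether exit code 1 should stop the commit or only print a warning.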
## Key Concepts
### Observational Claims (Tier 4)
When code makes a claim with no authoritative coverage:
```
Code:      connection_pool.max_size = 25
Authority: (nothing from RFC/OWASP/vendor)
Action:    Record as Tier 4 (Observational) assertion
             subject:      code://rust/myapp/db/connection_pool/max_size
             predicate:    configured_as
             object:       "25"
             source_class: Observational
```
This is the project's own belief — not authoritative, but tracked.
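As a rough illustration of the record shape above, the following Python sketch builds the same assertion from its parts. The `Observation` dataclass and the `concept_path` and `observe` helpers are hypothetical names chosen for this example; only the field names and the `code://{lang}/{project}/{path}` layout come from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    subject: str
    predicate: str
    object: str
    source_class: str = "Observational"  # Tier 4 in this document's tiering


def concept_path(lang: str, project: str, path: str) -> str:
    """Build a code://{lang}/{project}/{path} subject string."""
    return f"code://{lang}/{project}/{path.strip('/')}"


def observe(lang: str, project: str, path: str, value: object) -> Observation:
    """Wrap a configured value as an observational claim (values stored as strings)."""
    return Observation(
        subject=concept_path(lang, project, path),
        predicate="configured_as",
        object=str(value),
    )


if __name__ == "__main__":
    print(observe("rust", "myapp", "db/connection_pool/max_size", 25))
```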
### Self-Conflict Detection
On subsequent commits, detect drift from prior observations:
```
Prior:  connection_pool.max_size = 25  (recorded 2026-01-15)
Now:    connection_pool.max_size = 10
Result: SELF-CONFLICT
        "You changed max_size from 25 to 10"
        "Was this intentional? [ack/revert/explain]"
```
This catches accidental changes to established patterns.
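The comparison itself is simple once prior observations are available. Below is a minimal Python sketch, assuming both the stored observations and the current extraction can be represented as `subject -> value` dictionaries; `find_self_conflicts` is a hypothetical helper, not an existing Aphoria function.

```python
from typing import Dict, List, Tuple


def find_self_conflicts(
    prior: Dict[str, str], current: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (subject, old_value, new_value) for every subject whose value changed.

    Subjects that appear only in `current` are novel claims, not conflicts,
    and subjects that appear only in `prior` are ignored here.
    """
    return [
        (subject, prior[subject], value)
        for subject, value in sorted(current.items())
        if subject in prior and prior[subject] != value
    ]


if __name__ == "__main__":
    prior = {"connection_pool.max_size": "25"}
    current = {"connection_pool.max_size": "10"}
    for subject, old, new in find_self_conflicts(prior, current):
        print(f"SELF-CONFLICT: {subject} changed from {old} to {new}")
```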
### The Ack Decision Tree
```
           Conflict detected
          ┌──────────────────┐
          │ Source of truth? │
          └────────┬─────────┘
        ┌──────────┴──────────┐
        │                     │
    Authority               Self
        │                     │
        ▼                     ▼
    ┌───────┐           ┌────────────┐
    │Fix or │           │Intentional │
    │comply │           │change?     │
    └───┬───┘           └─────┬──────┘
        │                     │
        ▼                     ▼
  ┌───────────┐      ┌─────────────────┐
  │ack:       │      │ack:             │
  │deviation  │      │policy_update    │
  │from_rfc   │      │old=25, new=10   │
  └───────────┘      └─────────────────┘
```
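One way to read the tree is as a mapping from the conflict's source of truth to the kind of acknowledgment recorded. The Python sketch below is illustrative only: the `Ack` dataclass, the `ack_for` function, and the `authority_ref` parameter are hypothetical, and the `kind` strings simply mirror the labels in the diagram.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ack:
    kind: str     # "deviation" or "policy_update", as in the diagram
    subject: str
    detail: str


def ack_for(
    source: str,
    subject: str,
    old: Optional[str] = None,
    new: Optional[str] = None,
    authority_ref: str = "",
) -> Ack:
    """Choose the acknowledgment kind based on which source of truth was contradicted."""
    if source == "authority":
        # The code keeps deviating from an external authority on purpose.
        return Ack("deviation", subject, f"from {authority_ref}")
    if source == "self":
        # The project intentionally changed its own prior value.
        return Ack("policy_update", subject, f"old={old}, new={new}")
    raise ValueError(f"unknown conflict source: {source!r}")
```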
### Community Contribution (Opt-In)
If configured, observations can be anonymously contributed:
```toml
# aphoria.toml
[community]
contribute = true
anonymize = true # Strip project-specific paths
```
Aggregated patterns become community intelligence:
- "90% of Rust projects use pool_size 20-50"
- "This TLS pattern is always acknowledged → lower severity"
- "This JWT pattern is always a real bug → raise severity"
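To make the anonymization and aggregation ideas concrete, here is a small Python sketch. It assumes subjects follow the `code://{lang}/{project}/{path}` layout used elsewhere in this document and that "stripping project-specific paths" means replacing the project segment with a placeholder; `anonymize_subject` and `share_in_range` are hypothetical helpers, and a real design would need a more careful privacy review.

```python
from typing import Iterable

PREFIX = "code://"


def anonymize_subject(subject: str) -> str:
    """Replace the project segment of a code:// subject with a placeholder."""
    if not subject.startswith(PREFIX):
        raise ValueError(f"not a code:// subject: {subject!r}")
    parts = subject[len(PREFIX):].split("/")
    if len(parts) < 2:
        raise ValueError(f"subject has no project segment: {subject!r}")
    lang, _project, *rest = parts
    return PREFIX + "/".join([lang, "_", *rest])


def share_in_range(values: Iterable[float], low: float, high: float) -> float:
    """Fraction of contributed values that fall within [low, high]."""
    values = list(values)
    if not values:
        return 0.0
    return sum(low <= v <= high for v in values) / len(values)


if __name__ == "__main__":
    print(anonymize_subject("code://rust/myapp/db/pool_size"))
    print(share_in_range([25, 30, 100, 20], 20, 50))
```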
## End-to-End Example
### First Commit (Project Init)
```bash
$ git commit -m "Initial API server"
aphoria: Scanning staged files...
aphoria: Extracted 47 claims from 12 files

AUTHORITY CONFLICTS (2):
  BLOCK: tls/min_version = TLS_1_1
         RFC 8996 deprecates TLS 1.1; TLS_1_2 is the minimum
  FLAG:  jwt/expiry = 7d
         OWASP recommends <= 24h for access tokens

NOVEL OBSERVATIONS (45):
  Recorded 45 observational claims (no authority coverage)
  Examples:
    - db/pool_size = 25
    - api/timeout = 30s
    - cache/ttl = 3600s

Action required: Fix 1 BLOCK before committing
```
### Later Commit (Drift Detection)
```bash
$ git commit -m "Tune database settings"
aphoria: Scanning staged files...
aphoria: Extracted 3 changed claims

SELF-CONFLICTS (1):
  FLAG: db/pool_size changed: 25 → 100
        Prior value recorded 2026-01-15
        Is this intentional?

Options:
  [a]ck     - Yes, this is intentional (records policy update)
  [r]eset   - No, revert to prior value
  [e]xplain - Add rationale for the change
```
### Acknowledgment with Rationale
```bash
$ aphoria ack db/pool_size --reason "Scaling for Black Friday traffic"
Recorded policy update:
  subject:   code://rust/myapp/db/pool_size
  old_value: 25
  new_value: 100
  rationale: "Scaling for Black Friday traffic"
  timestamp: 2026-02-04T10:30:00Z
```
## Required Capabilities
### Currently Implemented ✅
| Capability | Implementation |
|------------|----------------|
| Extract claims from code | Walker + 10 extractors |
| Check against authority | ConceptIndex + corpus |
| Report conflicts | SARIF, JSON, table, markdown |
| Acknowledge conflicts | `aphoria ack` command |
| Baseline mode | `aphoria baseline` |
| Diff detection | `aphoria diff` |
| Exit codes | `--exit-code` flag |
| Trust Packs | Phase 6 complete |
### Gaps ⬜
| Capability | Status | Notes |
|------------|--------|-------|
| **Record observational claims** | ⬜ | Write Tier 4 assertions for code claims |
| **Self-conflict detection** | ⬜ | Query prior claims on same subject |
| **Claim versioning** | ⬜ | Track value changes over time |
| **Diff-only scanning** | ⬜ | `--staged`, `--since-baseline` flags |
| **Ack with rationale** | ⬜ | `--reason` flag for ack command |
| **Policy update assertions** | ⬜ | Record intentional changes as assertions |
| **Community contribution** | ⬜ | Anonymous pattern telemetry |
| **Heartbeat timestamps** | ⬜ | Update last-seen on unchanged claims |
## Implementation Plan
### Phase 4A: Observational Claims
1. Add `ingest_observations()` to LocalEpisteme
2. Store code claims as Tier 4 (Observational) assertions
3. Key by `code://{lang}/{project}/{path}` concept paths
4. Add `--sync` flag to `aphoria scan` to enable write-back
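The write-back step might look roughly like the following in-memory Python sketch. It is an assumption-laden illustration, not the planned LocalEpisteme API: the `ObservationStore` and `Record` classes and the signature of `ingest_observations` are hypothetical, and persistence is left out entirely. It also shows how claim versioning and heartbeat timestamps could fall out of the same call.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Tuple


@dataclass
class Record:
    versions: List[Tuple[str, datetime]]  # (value, first recorded), oldest first
    last_seen: datetime                   # heartbeat: last scan that saw the subject


class ObservationStore:
    """Hypothetical in-memory stand-in for the local observation store."""

    def __init__(self) -> None:
        self.records: Dict[str, Record] = {}

    def ingest_observations(
        self, claims: Dict[str, str], now: datetime
    ) -> Dict[str, List[str]]:
        """Store claims keyed by subject and report what happened to each one."""
        summary: Dict[str, List[str]] = {"novel": [], "changed": [], "unchanged": []}
        for subject, value in claims.items():
            record = self.records.get(subject)
            if record is None:
                self.records[subject] = Record([(value, now)], now)
                summary["novel"].append(subject)
            elif record.versions[-1][0] != value:
                record.versions.append((value, now))  # new version, history kept
                record.last_seen = now
                summary["changed"].append(subject)
            else:
                record.last_seen = now  # unchanged claim: heartbeat only
                summary["unchanged"].append(subject)
        return summary


if __name__ == "__main__":
    store = ObservationStore()
    now = datetime.now(timezone.utc)
    print(store.ingest_observations({"code://rust/myapp/db/pool_size": "25"}, now))
```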
### Phase 4B: Self-Conflict Detection
1. Before conflict check, query own prior claims
2. Compare current extraction to stored observations
3. Report changes as SELF-CONFLICT with diff
4. New verdict: `Drift` (distinct from `Block`/`Flag`)
### Phase 4C: Diff-Only Scanning
1. `--staged` flag: only scan `git diff --cached` files
2. `--since-baseline` flag: only scan files changed since baseline
3. Incremental extraction for fast pre-commit hooks
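For the `--staged` case, the file list can be obtained directly from git. The Python sketch below is one possible approach, not the planned implementation: `staged_files` and `parse_name_only` are hypothetical helper names, while the `git diff --cached --name-only -z --diff-filter=ACMR` invocation uses standard git options (NUL-separated output avoids quoting problems with unusual file names, and the filter skips deleted files).

```python
import subprocess
from typing import List


def parse_name_only(output: str) -> List[str]:
    """Split NUL-separated `git diff --name-only -z` output into paths."""
    return [path for path in output.split("\0") if path]


def staged_files(repo_dir: str = ".") -> List[str]:
    """List staged files that were added, copied, modified, or renamed."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "-z", "--diff-filter=ACMR"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails (e.g. not a repository)
    )
    return parse_name_only(result.stdout)


if __name__ == "__main__":
    for path in staged_files():
        print(path)
```

Only the returned paths would then be handed to the extractors, which keeps the scan proportional to the size of the commit instead of the size of the repository.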
### Phase 4D: Enhanced Ack
1. `--reason "text"` flag for acknowledgments
2. Store rationale in assertion metadata
3. `ack` for authority conflicts vs `update` for self-conflicts
4. Policy update assertions for intentional drift
### Phase 4E: Community Contribution (Optional)
1. Anonymous aggregation of observation patterns
2. Opt-in telemetry endpoint
3. Privacy-preserving path normalization
4. Community corpus fed by aggregate patterns
## Success Criteria
| Criterion | Metric |
|-----------|--------|
| Pre-commit is fast | < 500ms for staged-only scan |
| Drift is caught | Self-conflicts detected on value changes |
| Memory persists | Observations survive across commits |
| Rationale is preserved | Ack reasons queryable in reports |
| Opt-in works | Community contribution respects config |
## Open Questions
1. **Storage location**: `.aphoria/` in project root vs `~/.local/share/aphoria/`?
2. **Observation expiry**: Should old observations be pruned if not seen in N commits?
3. **Merge conflicts**: How to handle observation conflicts during git merge?
4. **CI mode**: Should CI record observations, or only local dev?