# Full-Cycle Pre-Commit Vision

**Date:** 2026-02-04
**Status:** Vision / Gap Analysis

## Executive Summary

The pre-commit hook should be a **bidirectional knowledge sync**, not just a read-only linter. Every commit extracts claims from code, checks them against authority, and records observations back, building project memory and (optionally) contributing to community intelligence.

## The Vision: Scan + Sync

```
┌─────────────────────────────────────────────────────────────┐
│                       PRE-COMMIT FLOW                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. EXTRACT    What claims does this code make?             │
│                (TLS settings, timeouts, crypto, etc.)       │
│                                                             │
│  2. CHECK      Against authoritative corpus (Tier 0-2)      │
│                Against project's own prior claims           │
│                                                             │
│  3. CLASSIFY                                                │
│     ┌────────────────────┬──────────────────────────────┐   │
│     │ Scenario           │ Result                       │   │
│     ├────────────────────┼──────────────────────────────┤   │
│     │ Authority conflict │ FIX code or ACK deviation    │   │
│     │ Self conflict      │ Intentional change? Ack it   │   │
│     │ Novel claim        │ Record as observation        │   │
│     │ Unchanged claim    │ Update timestamp (heartbeat) │   │
│     └────────────────────┴──────────────────────────────┘   │
│                                                             │
│  4. UPDATE     Store observations to local Episteme         │
│                - New claims → Tier 4 assertions             │
│                - Changed claims → new version               │
│                - Acks → explicit policy decisions           │
│                                                             │
│  5. GATE       Exit codes for git hook                      │
│                - 2 = BLOCK (authority conflict)             │
│                - 1 = FLAG (self conflict, review)           │
│                - 0 = PASS                                   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
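
The CLASSIFY and GATE steps can be sketched in a few lines. This is an illustrative Python sketch, not the actual implementation: the names `Verdict`, `classify`, and `gate` are hypothetical, and the authority check is reduced to plain equality, whereas a real check would evaluate constraints such as "at least TLS 1.2".

```python
from enum import IntEnum
from typing import Optional


class Verdict(IntEnum):
    """Exit codes from the GATE step; higher values are more severe."""
    PASS = 0
    FLAG = 1
    BLOCK = 2


def classify(current: str, authority: Optional[str], prior: Optional[str]) -> Verdict:
    """Classify one extracted claim (step 3).

    `authority` is the value the corpus requires, or None if nothing covers
    this subject. `prior` is the project's previously recorded value, if any.
    Assumption: an authority check is modelled as simple equality.
    """
    if authority is not None and current != authority:
        return Verdict.BLOCK  # authority conflict: fix the code or ack the deviation
    if prior is not None and current != prior:
        return Verdict.FLAG   # self conflict: needs review
    return Verdict.PASS       # novel or unchanged claim


def gate(verdicts: list[Verdict]) -> int:
    """Return the hook's exit code: the most severe verdict seen."""
    return int(max(verdicts, default=Verdict.PASS))
```

Taking the maximum means a single authority conflict blocks the commit no matter how many other claims pass.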

## Key Concepts

### Observational Claims (Tier 4)

When code makes a claim with no authoritative coverage:

```
Code:      connection_pool.max_size = 25
Authority: (nothing from RFC/OWASP/vendor)
Action:    Record as Tier 4 (Observational) assertion
           subject: code://rust/myapp/db/connection_pool/max_size
           predicate: configured_as
           object: "25"
           source_class: Observational
```

This is the project's own belief: not authoritative, but tracked.
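
As a rough illustration of the record above, the following Python sketch builds such an assertion from a dotted configuration key. The `Assertion` type and the `observational_assertion` helper are hypothetical names, and deriving the subject path by replacing dots with slashes is an assumption about how keys map to concept paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    object: str
    source_class: str


def observational_assertion(lang: str, project: str, key: str, value: object) -> Assertion:
    """Build a Tier 4 assertion from a dotted config key and its value."""
    # Assumed mapping: "db.connection_pool.max_size" -> "db/connection_pool/max_size"
    path = "/".join(part for part in key.split(".") if part)
    return Assertion(
        subject=f"code://{lang}/{project}/{path}",
        predicate="configured_as",
        object=str(value),  # stored as a string, matching the example record
        source_class="Observational",
    )
```

Calling `observational_assertion("rust", "myapp", "db.connection_pool.max_size", 25)` yields the subject, predicate, and object shown in the example.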

### Self-Conflict Detection

On subsequent commits, detect drift from prior observations:

```
Prior:  connection_pool.max_size = 25  (recorded 2026-01-15)
Now:    connection_pool.max_size = 10

Result: SELF-CONFLICT
        "You changed max_size from 25 to 10"
        "Was this intentional? [ack/revert/explain]"
```

This catches accidental changes to established patterns.
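
The comparison itself is a lookup of each current claim against the stored observation for the same subject. A minimal Python sketch, assuming observations are available as a plain dictionary (the `Drift` type and both function names are hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drift:
    subject: str
    old: str
    new: str
    recorded: str  # ISO date of the prior observation


def detect_self_conflicts(
    prior: dict[str, tuple[str, str]], current: dict[str, str]
) -> list[Drift]:
    """Compare current claims with stored observations.

    `prior` maps subject -> (value, recorded_date); `current` maps subject -> value.
    Subjects with no prior observation are novel claims, not conflicts.
    """
    drifts = []
    for subject, new_value in sorted(current.items()):
        if subject not in prior:
            continue
        old_value, recorded = prior[subject]
        if old_value != new_value:
            drifts.append(Drift(subject, old_value, new_value, recorded))
    return drifts


def describe(d: Drift) -> str:
    """Render the message shown to the developer."""
    name = d.subject.rsplit("/", 1)[-1]
    return f"You changed {name} from {d.old} to {d.new} (prior value recorded {d.recorded})"
```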

### The Ack Decision Tree

```
  Conflict detected
          │
          ▼
 ┌──────────────────┐
 │ Source of truth? │
 └────────┬─────────┘
          │
     ┌────┴──────┐
     │           │
 Authority      Self
     │           │
     ▼           ▼
 ┌───────┐ ┌────────────┐
 │Fix or │ │Intentional │
 │comply │ │change?     │
 └───┬───┘ └─────┬──────┘
     │           │
     ▼           ▼
┌───────────┐ ┌─────────────────┐
│ack:       │ │ack:             │
│deviation  │ │policy_update    │
│from_rfc   │ │old=25, new=10   │
└───────────┘ └─────────────────┘
```

### Community Contribution (Opt-In)

If configured, observations can be anonymously contributed:

```toml
# aphoria.toml
[community]
contribute = true
anonymize = true  # Strip project-specific paths
```

Aggregated patterns become community intelligence:
- "90% of Rust projects use pool_size 20-50"
- "This TLS pattern is always acknowledged → lower severity"
- "This JWT pattern is always a real bug → raise severity"
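
One way the last two patterns could be derived is from the share of reports in which a conflict was acknowledged rather than fixed. The Python sketch below is only an illustration: the function name, the thresholds, and the minimum sample size are all assumptions, and reading "never acknowledged" as "always a real bug" is an interpretation.

```python
def severity_adjustment(
    acked: int,
    total: int,
    min_reports: int = 20,
    high: float = 0.95,
    low: float = 0.05,
) -> str:
    """Suggest a severity change from how often a pattern is acknowledged.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    if total < min_reports:
        return "keep"   # too few reports to draw a conclusion
    rate = acked / total
    if rate >= high:
        return "lower"  # almost always acknowledged: likely noise
    if rate <= low:
        return "raise"  # almost never acknowledged: likely a real bug
    return "keep"
```

The minimum sample size keeps a handful of early reports from moving severity in either direction.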

## End-to-End Example

### First Commit (Project Init)

```bash
$ git commit -m "Initial API server"

aphoria: Scanning staged files...
aphoria: Extracted 47 claims from 12 files

AUTHORITY CONFLICTS (2):
  BLOCK: tls/min_version = TLS_1_1
         RFC 8996 deprecates TLS 1.0 and 1.1; TLS_1_2 is the minimum

  FLAG:  jwt/expiry = 7d
         OWASP recommends <= 24h for access tokens

NOVEL OBSERVATIONS (45):
  Recorded 45 observational claims (no authority coverage)
  Examples:
    - db/pool_size = 25
    - api/timeout = 30s
    - cache/ttl = 3600s

Action required: Fix 1 BLOCK before committing
```

### Later Commit (Drift Detection)

```bash
$ git commit -m "Tune database settings"

aphoria: Scanning staged files...
aphoria: Extracted 3 changed claims

SELF-CONFLICTS (1):
  FLAG: db/pool_size changed: 25 → 100
        Prior value recorded 2026-01-15
        Is this intentional?

Options:
  [a]ck     - Yes, this is intentional (records policy update)
  [r]evert  - No, revert to prior value
  [e]xplain - Add rationale for the change
```

### Acknowledgment with Rationale

```bash
$ aphoria ack db/pool_size --reason "Scaling for Black Friday traffic"

Recorded policy update:
  subject: code://rust/myapp/db/pool_size
  old_value: 25
  new_value: 100
  rationale: "Scaling for Black Friday traffic"
  timestamp: 2026-02-04T10:30:00Z
```

## Required Capabilities

### Currently Implemented ✅

| Capability | Implementation |
|------------|----------------|
| Extract claims from code | Walker + 10 extractors |
| Check against authority | ConceptIndex + corpus |
| Report conflicts | SARIF, JSON, table, markdown |
| Acknowledge conflicts | `aphoria ack` command |
| Baseline mode | `aphoria baseline` |
| Diff detection | `aphoria diff` |
| Exit codes | `--exit-code` flag |
| Trust Packs | Phase 6 complete |

### Gaps ⬜

| Capability | Status | Notes |
|------------|--------|-------|
| **Record observational claims** | ⬜ | Write Tier 4 assertions for code claims |
| **Self-conflict detection** | ⬜ | Query prior claims on same subject |
| **Claim versioning** | ⬜ | Track value changes over time |
| **Diff-only scanning** | ⬜ | `--staged`, `--since-baseline` flags |
| **Ack with rationale** | ⬜ | `--reason` flag for ack command |
| **Policy update assertions** | ⬜ | Record intentional changes as assertions |
| **Community contribution** | ⬜ | Anonymous pattern telemetry |
| **Heartbeat timestamps** | ⬜ | Update last-seen on unchanged claims |

## Implementation Plan

### Phase 4A: Observational Claims

1. Add `ingest_observations()` to LocalEpisteme
2. Store code claims as Tier 4 (Observational) assertions
3. Key by `code://{lang}/{project}/{path}` concept paths
4. Add `--sync` flag to `aphoria scan` to enable write-back
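
The write-back in this phase has three cases: a new claim, a changed claim, and an unchanged claim that only needs its last-seen timestamp refreshed. The Python sketch below shows those cases with an in-memory stand-in. `LocalStore` and `Observation` are hypothetical types, not the real LocalEpisteme API; only the method name `ingest_observations` comes from the plan above.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    versions: list[tuple[str, str]]  # (value, first_seen), oldest first
    last_seen: str

    @property
    def value(self) -> str:
        return self.versions[-1][0]


@dataclass
class LocalStore:
    """In-memory stand-in for the local store (hypothetical)."""
    observations: dict[str, Observation] = field(default_factory=dict)

    def ingest_observations(self, claims: dict[str, str], now: str) -> dict[str, int]:
        """Record claims keyed by subject and count what happened to each."""
        counts = {"new": 0, "changed": 0, "unchanged": 0}
        for subject, value in claims.items():
            obs = self.observations.get(subject)
            if obs is None:
                self.observations[subject] = Observation([(value, now)], now)
                counts["new"] += 1
            elif obs.value != value:
                obs.versions.append((value, now))  # keep history instead of overwriting
                obs.last_seen = now
                counts["changed"] += 1
            else:
                obs.last_seen = now                # heartbeat only
                counts["unchanged"] += 1
        return counts
```

Appending a version rather than overwriting keeps the prior value available for the self-conflict check and for later reports.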

### Phase 4B: Self-Conflict Detection

1. Before conflict check, query own prior claims
2. Compare current extraction to stored observations
3. Report changes as SELF-CONFLICT with diff
4. New verdict: `Drift` (distinct from `Block`/`Flag`)

### Phase 4C: Diff-Only Scanning

1. `--staged` flag: only scan `git diff --cached` files
2. `--since-baseline` flag: only scan files changed since baseline
3. Incremental extraction for fast pre-commit hooks
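
Listing the staged files is a thin wrapper around `git diff --cached`. The Python sketch below uses standard git options (`--name-only`, `--diff-filter`, `-z`); the function names are hypothetical, and skipping deleted files is an assumption about what a scanner would want.

```python
import subprocess


def parse_name_list(raw: bytes) -> list[str]:
    """Split NUL-separated `git diff --name-only -z` output into paths."""
    return [p.decode("utf-8", errors="surrogateescape") for p in raw.split(b"\0") if p]


def staged_files(repo_dir: str = ".") -> list[str]:
    """Return paths staged for commit, excluding deletions."""
    result = subprocess.run(
        # A/C/M/R = added, copied, modified, renamed; deleted files have nothing to scan.
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACMR", "-z"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
    )
    return parse_name_list(result.stdout)
```

The `-z` option separates paths with NUL bytes, so file names containing spaces or newlines are parsed correctly.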

### Phase 4D: Enhanced Ack

1. `--reason "text"` flag for acknowledgments
2. Store rationale in assertion metadata
3. `ack` for authority conflicts vs `update` for self-conflicts
4. Policy update assertions for intentional drift
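
A possible shape for the stored record, sketched in Python. The `AckAssertion` type, the `acknowledge` function, and the metadata keys are hypothetical; the two kinds follow the decision tree earlier in this document (`deviation` for authority conflicts, `policy_update` for self-conflicts), and the rationale is treated as optional.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class AckAssertion:
    kind: str     # "deviation" (authority conflict) or "policy_update" (self-conflict)
    subject: str
    metadata: dict = field(default_factory=dict)


def acknowledge(
    subject: str,
    conflict_source: str,
    reason: Optional[str],
    timestamp: str,
    old_value: Optional[str] = None,
    new_value: Optional[str] = None,
) -> AckAssertion:
    """Build the assertion recorded when a developer acknowledges a conflict."""
    metadata = {"timestamp": timestamp}
    if reason:
        metadata["rationale"] = reason  # kept so reports can query it later
    if conflict_source == "authority":
        return AckAssertion("deviation", subject, metadata)
    if conflict_source == "self":
        metadata.update(old_value=old_value, new_value=new_value)
        return AckAssertion("policy_update", subject, metadata)
    raise ValueError(f"unknown conflict source: {conflict_source!r}")
```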

### Phase 4E: Community Contribution (Optional)

1. Anonymous aggregation of observation patterns
2. Opt-in telemetry endpoint
3. Privacy-preserving path normalization
4. Community corpus fed by aggregate patterns
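
Path normalization could, for example, replace the project segment of a subject with a salted hash, so that contributions from one project stay grouped without revealing its name. This Python sketch is one assumed approach, not a decided design: the function name, the per-installation salt, and the 12-character digest length are all hypothetical.

```python
import hashlib


def anonymize_subject(subject: str, salt: str) -> str:
    """Replace the project name in code://{lang}/{project}/{path} with a salted hash."""
    prefix = "code://"
    if not subject.startswith(prefix):
        raise ValueError(f"not a code:// subject: {subject!r}")
    lang, project, *path = subject[len(prefix):].split("/")
    # Assumption: a truncated SHA-256 of salt + project serves as a stable pseudonym.
    digest = hashlib.sha256((salt + project).encode("utf-8")).hexdigest()[:12]
    return f"{prefix}{lang}/{digest}/{'/'.join(path)}"
```

The language and the trailing path are kept because the aggregate patterns ("pool_size in Rust projects") depend on them; whether path segments also need scrubbing is left open here.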

## Success Criteria

| Criterion | Metric |
|-----------|--------|
| Pre-commit is fast | < 500ms for staged-only scan |
| Drift is caught | Self-conflicts detected on value changes |
| Memory persists | Observations survive across commits |
| Rationale is preserved | Ack reasons queryable in reports |
| Opt-in works | Community contribution respects config |

## Open Questions

1. **Storage location**: `.aphoria/` in project root vs `~/.local/share/aphoria/`?
2. **Observation expiry**: Should old observations be pruned if not seen in N commits?
3. **Merge conflicts**: How to handle observation conflicts during git merge?
4. **CI mode**: Should CI record observations, or only local dev?