# Full-Cycle Pre-Commit Vision

**Date:** 2026-02-04

**Status:** Vision / Gap Analysis

## Executive Summary

The pre-commit hook should be a **bidirectional knowledge sync**, not just a read-only linter. Every commit extracts claims from code, checks them against authority, and records observations back, building project memory and (optionally) contributing to community intelligence.

## The Vision: Scan + Sync

```
┌─────────────────────────────────────────────────────────────┐
│                       PRE-COMMIT FLOW                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. EXTRACT    What claims does this code make?             │
│                (TLS settings, timeouts, crypto, etc.)       │
│                                                             │
│  2. CHECK      Against authoritative corpus (Tier 0-2)      │
│                Against project's own prior claims           │
│                                                             │
│  3. CLASSIFY                                                │
│  ┌────────────────────┬──────────────────────────────┐      │
│  │ Scenario           │ Result                       │      │
│  ├────────────────────┼──────────────────────────────┤      │
│  │ Authority conflict │ FIX code or ACK deviation    │      │
│  │ Self conflict      │ Intentional change? Ack it   │      │
│  │ Novel claim        │ Record as observation        │      │
│  │ Unchanged claim    │ Update timestamp (heartbeat) │      │
│  └────────────────────┴──────────────────────────────┘      │
│                                                             │
│  4. UPDATE     Store observations to local Episteme         │
│                - New claims → Tier 4 assertions             │
│                - Changed claims → new version               │
│                - Acks → explicit policy decisions           │
│                                                             │
│  5. GATE       Exit codes for git hook                      │
│                - 2 = BLOCK (authority conflict)             │
│                - 1 = FLAG (self conflict, review)           │
│                - 0 = PASS                                   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

## Key Concepts

### Observational Claims (Tier 4)

When code makes a claim with no authoritative coverage:

```
Code:      connection_pool.max_size = 25
Authority: (nothing from RFC/OWASP/vendor)
Action:    Record as Tier 4 (Observational) assertion

  subject:      code://rust/myapp/db/connection_pool/max_size
  predicate:    configured_as
  object:       "25"
  source_class: Observational
```

This is the project's own belief: not authoritative, but tracked.

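
A minimal sketch of how an uncovered code claim might be wrapped as a Tier 4 record. The field names (`subject`, `predicate`, `object`, `source_class`) and the `code://` subject format come from the example above; the `Assertion` dataclass, `concept_path`, `make_observation`, and the `tier` and `recorded_at` fields are hypothetical names chosen for illustration, not the actual Episteme schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Assertion:
    """Hypothetical record shape mirroring the fields in the example above."""
    subject: str
    predicate: str
    object: str
    source_class: str
    tier: int          # assumed field: numeric tier of the source class
    recorded_at: str   # assumed field: ISO 8601 timestamp of the observation


def concept_path(lang: str, project: str, path: str) -> str:
    """Build a code:// subject from language, project, and setting path."""
    return f"code://{lang}/{project}/{path.strip('/')}"


def make_observation(lang, project, path, value, now=None):
    """Wrap a claim with no authority coverage as a Tier 4 assertion."""
    now = now or datetime.now(timezone.utc)
    return Assertion(
        subject=concept_path(lang, project, path),
        predicate="configured_as",
        object=str(value),  # values are stored as strings, as in the example
        source_class="Observational",
        tier=4,
        recorded_at=now.isoformat(),
    )


obs = make_observation("rust", "myapp", "db/connection_pool/max_size", 25)
print(obs.subject)
```

Keeping the record immutable (`frozen=True`) fits the idea that a changed value produces a new version rather than overwriting the old one.
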
### Self-Conflict Detection

On subsequent commits, detect drift from prior observations:

```
Prior: connection_pool.max_size = 25 (recorded 2026-01-15)
Now:   connection_pool.max_size = 10

Result: SELF-CONFLICT
  "You changed max_size from 25 to 10"
  "Was this intentional? [ack/revert/explain]"
```

This catches accidental changes to established patterns.

### The Ack Decision Tree

```
     Conflict detected
             │
             ▼
    ┌──────────────────┐
    │ Source of truth? │
    └────────┬─────────┘
             │
        ┌────┴──────┐
        │           │
    Authority     Self
        │           │
        ▼           ▼
    ┌───────┐ ┌────────────┐
    │Fix or │ │Intentional │
    │comply │ │change?     │
    └───┬───┘ └─────┬──────┘
        │           │
        ▼           ▼
  ┌───────────┐ ┌─────────────────┐
  │ack:       │ │ack:             │
  │deviation  │ │policy_update    │
  │from_rfc   │ │old=25, new=10   │
  └───────────┘ └─────────────────┘
```

### Community Contribution (Opt-In)

If configured, observations can be anonymously contributed:

```toml
# aphoria.toml
[community]
contribute = true
anonymize = true  # Strip project-specific paths
```

Aggregated patterns become community intelligence:

- "90% of Rust projects use pool_size 20-50"
- "This TLS pattern is always acknowledged → lower severity"
- "This JWT pattern is always a real bug → raise severity"

## End-to-End Example

### First Commit (Project Init)

```bash
$ git commit -m "Initial API server"

aphoria: Scanning staged files...
aphoria: Extracted 47 claims from 12 files

AUTHORITY CONFLICTS (2):
  BLOCK: tls/min_version = TLS_1_1
         RFC 8996 deprecates TLS 1.1; TLS_1_2 is the minimum
  FLAG:  jwt/expiry = 7d
         OWASP recommends <= 24h for access tokens

NOVEL OBSERVATIONS (45):
  Recorded 45 observational claims (no authority coverage)
  Examples:
    - db/pool_size = 25
    - api/timeout = 30s
    - cache/ttl = 3600s

Action required: Fix 1 BLOCK before committing
```

### Later Commit (Drift Detection)

```bash
$ git commit -m "Tune database settings"

aphoria: Scanning staged files...
aphoria: Extracted 3 changed claims

SELF-CONFLICTS (1):
  FLAG: db/pool_size changed: 25 → 100
        Prior value recorded 2026-01-15

        Is this intentional?

  Options:
    [a]ck     - Yes, this is intentional (records policy update)
    [r]evert  - No, revert to prior value
    [e]xplain - Add rationale for the change
```

### Acknowledgment with Rationale

```bash
$ aphoria ack db/pool_size --reason "Scaling for Black Friday traffic"

Recorded policy update:
  subject:   code://rust/myapp/db/pool_size
  old_value: 25
  new_value: 100
  rationale: "Scaling for Black Friday traffic"
  timestamp: 2026-02-04T10:30:00Z
```

## Required Capabilities

### Currently Implemented ✅

| Capability | Implementation |
|------------|----------------|
| Extract claims from code | Walker + 10 extractors |
| Check against authority | ConceptIndex + corpus |
| Report conflicts | SARIF, JSON, table, markdown |
| Acknowledge conflicts | `aphoria ack` command |
| Baseline mode | `aphoria baseline` |
| Diff detection | `aphoria diff` |
| Exit codes | `--exit-code` flag |
| Trust Packs | Phase 6 complete |

### Gaps ⬜

| Capability | Status | Notes |
|------------|--------|-------|
| **Record observational claims** | ⬜ | Write Tier 4 assertions for code claims |
| **Self-conflict detection** | ⬜ | Query prior claims on same subject |
| **Claim versioning** | ⬜ | Track value changes over time |
| **Diff-only scanning** | ⬜ | `--staged`, `--since-baseline` flags |
| **Ack with rationale** | ⬜ | `--reason` flag for ack command |
| **Policy update assertions** | ⬜ | Record intentional changes as assertions |
| **Community contribution** | ⬜ | Anonymous pattern telemetry |
| **Heartbeat timestamps** | ⬜ | Update last-seen on unchanged claims |

## Implementation Plan

### Phase 4A: Observational Claims

1. Add `ingest_observations()` to LocalEpisteme
2. Store code claims as Tier 4 (Observational) assertions
3. Key by `code://{lang}/{project}/{path}` concept paths
4. Add `--sync` flag to `aphoria scan` to enable write-back

### Phase 4B: Self-Conflict Detection

1. Before conflict check, query own prior claims
2. Compare current extraction to stored observations
3. Report changes as SELF-CONFLICT with diff
4. New verdict: `Drift` (distinct from `Block`/`Flag`)

### Phase 4C: Diff-Only Scanning

1. `--staged` flag: only scan `git diff --cached` files
2. `--since-baseline` flag: only scan files changed since baseline
3. Incremental extraction for fast pre-commit hooks

### Phase 4D: Enhanced Ack

1. `--reason "text"` flag for acknowledgments
2. Store rationale in assertion metadata
3. `ack` for authority conflicts vs `update` for self-conflicts
4. Policy update assertions for intentional drift

### Phase 4E: Community Contribution (Optional)

1. Anonymous aggregation of observation patterns
2. Opt-in telemetry endpoint
3. Privacy-preserving path normalization
4. Community corpus fed by aggregate patterns

## Success Criteria

| Criterion | Metric |
|-----------|--------|
| Pre-commit is fast | < 500ms for staged-only scan |
| Drift is caught | Self-conflicts detected on value changes |
| Memory persists | Observations survive across commits |
| Rationale is preserved | Ack reasons queryable in reports |
| Opt-in works | Community contribution respects config |

## Open Questions

1. **Storage location**: `.aphoria/` in project root vs `~/.local/share/aphoria/`?
2. **Observation expiry**: Should old observations be pruned if not seen in N commits?
3. **Merge conflicts**: How to handle observation conflicts during git merge?
4. **CI mode**: Should CI record observations, or only local dev?
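
The CLASSIFY and GATE steps of the pre-commit flow, together with the Phase 4B comparison against stored observations, can be sketched as a pair of pure functions. This is an illustration under stated assumptions: the `Scenario` enum, `classify`, `gate`, and the dictionary shapes for `authority` and `prior` are hypothetical, and the sketch ignores per-finding severity (so it cannot express an authority conflict that only flags).

```python
from enum import Enum


class Scenario(Enum):
    AUTHORITY_CONFLICT = "authority conflict"
    SELF_CONFLICT = "self conflict"
    NOVEL = "novel claim"
    UNCHANGED = "unchanged claim"


# Exit codes as listed in the GATE step of the flow diagram.
EXIT_CODES = {
    Scenario.AUTHORITY_CONFLICT: 2,  # BLOCK
    Scenario.SELF_CONFLICT: 1,       # FLAG
    Scenario.NOVEL: 0,               # PASS
    Scenario.UNCHANGED: 0,           # PASS
}


def classify(subject, value, authority, prior):
    """Classify one extracted claim.

    authority: dict of subject -> set of allowed values (assumed shape)
    prior:     dict of subject -> previously observed value (assumed shape)
    """
    allowed = authority.get(subject)
    if allowed is not None and value not in allowed:
        return Scenario.AUTHORITY_CONFLICT
    if subject not in prior:
        return Scenario.NOVEL
    if prior[subject] != value:
        return Scenario.SELF_CONFLICT
    return Scenario.UNCHANGED


def gate(scenarios):
    """Return the most severe exit code across all classified claims."""
    return max((EXIT_CODES[s] for s in scenarios), default=0)


authority = {"tls/min_version": {"TLS_1_2", "TLS_1_3"}}
prior = {"db/pool_size": 25, "api/timeout": "30s"}
claims = {
    "tls/min_version": "TLS_1_1",
    "db/pool_size": 100,
    "api/timeout": "30s",
    "cache/ttl": "3600s",
}

results = {s: classify(s, v, authority, prior) for s, v in claims.items()}
exit_code = gate(results.values())
print(exit_code)
```

Git aborts a commit on any non-zero pre-commit exit status, so a hook wrapper built on this would have to decide whether exit code 1 should stop the commit or only print a warning.
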