# Guide: Pre-Flight Checks for Autonomous Coding Agents

**Target Audience:** AI Engineers, Agent Framework Builders

**Context:** AI Safety & Reliability

---
## The Problem: Confident Hallucinations
AI agents are excellent at writing code. They are terrible at understanding **Constraints**.
If you ask an agent: *"Deploy a secure Redis instance,"* it might write:
```conf
# redis.conf
# "To make sure it connects easily!"
protected-mode no
```
The agent isn't malicious. It just prioritized "connectivity" over "security" because it saw a thousand Stack Overflow posts doing the same thing.

**Traditional approach:** A human reviews the PR.

**Problem:** Humans get tired. Agents generate code faster than humans can review it.

---
## The Solution: The Automated Conscience
Aphoria acts as the agent's conscience. It provides a structured, authoritative check *before* the code leaves the agent's hands.
### 1. The Workflow
```mermaid
graph LR
User[User Request] --> Agent[Coding Agent]
Agent --> Code[Generate Code]
Code --> Aphoria[Aphoria Scan]
Aphoria -- PASS --> PR[Open PR]
Aphoria -- BLOCK --> Agent
Agent --> Retry[Self-Correct]
```
### 2. Implementing the Loop
If you are building an agent loop (with LangChain, AutoGPT, or a custom framework), inject this step after code generation:
```python
import json
import subprocess


def run_preflight_check(project_dir):
    """Scan project_dir with Aphoria and return a status dict for the agent loop."""
    result = subprocess.run(
        ["aphoria", "scan", project_dir, "--format", "json"],
        capture_output=True,
        text=True,
    )
    scan_data = json.loads(result.stdout)

    if scan_data["has_blocks"]:
        return {
            "status": "FAILED",
            "feedback": generate_feedback(scan_data["conflicts"]),
        }
    return {"status": "PASSED"}


def generate_feedback(conflicts):
    """Turn each conflict into a location, the violated authority, and the required value."""
    feedback = "Your code failed safety checks:\n"
    for conflict in conflicts:
        claim = conflict["claim"]
        violated = conflict["conflicts"][0]  # report the first conflicting authority
        feedback += f"- {claim['file']}: {claim['description']}\n"
        feedback += f"  VIOLATION: {violated['rfc_citation']}\n"
        feedback += f"  REQUIRED: {violated['value']}\n"
    return feedback
```
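
One way to wire this check into the retry cycle from the workflow diagram is sketched below. It is a minimal sketch, not part of Aphoria: `generate_code` and `check` are hypothetical callables standing in for your agent and for a pre-flight function that returns the `status`/`feedback` dict shown above, and `max_attempts` is an assumed safety cap so the loop cannot spin forever.

```python
from typing import Callable, Optional


def agent_loop(
    task: str,
    generate_code: Callable[[str, Optional[str]], None],  # hypothetical: your agent
    check: Callable[[], dict],  # hypothetical: e.g. wraps a pre-flight scan
    max_attempts: int = 3,  # assumed cap, tune for your agent
) -> dict:
    """Generate, scan, and retry with feedback until the scan passes."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        # The agent writes code to disk; on retries it also sees the feedback.
        generate_code(task, feedback)
        result = check()
        if result["status"] == "PASSED":
            return {"status": "PASSED", "attempts": attempt}
        feedback = result["feedback"]
    # Still blocked after the last attempt: stop and escalate to a human.
    return {"status": "FAILED", "attempts": max_attempts, "feedback": feedback}
```

Capping the attempts is a design choice: an agent that cannot satisfy a constraint after a few rounds usually needs a human to look at the task, not more retries.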
### 3. Why This Works
Agents are remarkably good at fixing bugs **if you tell them exactly what constraint they violated.**
* **Bad Feedback:** "This code isn't secure." (Agent guesses randomly)
* **Aphoria Feedback:** "File `redis.conf`, line 12: `protected-mode no` violates OWASP A05:2021. Authority requires `yes`."
The agent receives:
1. **Location:** `redis.conf:12`
2. **Constraint:** OWASP A05:2021
3. **Target Value:** `yes`
It will almost always self-correct to `protected-mode yes` on the next attempt.
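
The three fields above map naturally onto a small record. The sketch below is illustrative only: the `Violation` class and its field names are hypothetical, not Aphoria's output schema. It shows how location, constraint, and target value combine into a single message the agent can act on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    # Hypothetical field names chosen for this example.
    file: str
    line: int
    found: str      # the offending setting as written
    authority: str  # the constraint that was violated
    required: str   # the value the authority requires

    def to_feedback(self) -> str:
        """Render location, constraint, and target value in one sentence."""
        return (
            f"File `{self.file}`, line {self.line}: `{self.found}` "
            f"violates {self.authority}. Authority requires `{self.required}`."
        )


violation = Violation("redis.conf", 12, "protected-mode no", "OWASP A05:2021", "yes")
print(violation.to_feedback())
```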
### 4. The "Strict Mode" for Agents
Humans can be trusted to interpret warnings. Agents should be held to a higher standard.
Always run agent checks with:
```bash
aphoria scan --strict
```
This treats even minor deviations (FLAGs) as errors. If an agent uses a deprecated dependency or a weird variable name that triggers a FLAG, force it to fix it. We want machine-generated code to be **pristine**.
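
The effect of strict mode on the pass/fail decision can be sketched as follows. This reflects the behavior described above under stated assumptions, not Aphoria's implementation: the `severity` key and the helper name `is_rejected` are hypothetical.

```python
def is_rejected(findings: list[dict], strict: bool) -> bool:
    """Return True if the scan should send the code back to the agent.

    Assumes each finding carries a hypothetical "severity" key that is
    either "BLOCK" or "FLAG".
    """
    # BLOCKs always reject; FLAGs reject only in strict mode.
    rejecting = {"BLOCK", "FLAG"} if strict else {"BLOCK"}
    return any(f["severity"] in rejecting for f in findings)


findings = [{"severity": "FLAG", "rule": "deprecated-dependency"}]
print(is_rejected(findings, strict=False))  # False: a human may accept a warning
print(is_rejected(findings, strict=True))   # True: the agent must fix it
```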
### 5. Example Scenario: The JWT Hallucination

**Agent Task:** "Add JWT auth to the API."

**Attempt 1:**

Code: `jwt.verify(token, secret, { ignoreExpiration: true })`

*Aphoria Scan:*

> BLOCK: code://js/auth/jwt/expiry
>
> RFC 7519: Expiry validation MUST be enabled.

**Feedback Loop:**

Agent receives the error.

*Thought Process:* "Ah, RFC 7519 requires expiration check. I disabled it by mistake."

**Attempt 2:**

Code: `jwt.verify(token, secret)` (defaults to checking expiry)

*Aphoria Scan:* PASS.

**Result:** The PR that reaches the human reviewer is already compliant. The human focuses on logic, not spec compliance.

## Summary
Aphoria is the **Guardrail** that makes autonomous coding safe. It turns "Trust" into "Trust, but Verify."