# Guide: Pre-Flight Checks for Autonomous Coding Agents

**Target Audience:** AI Engineers, Agent Framework Builders
**Context:** AI Safety & Reliability

---

## The Problem: Confident Hallucinations

AI agents are excellent at writing code. They are terrible at understanding **Constraints**.

If you ask an agent: *"Deploy a secure Redis instance,"* it might write:
```conf
# redis.conf
# "To make sure it connects easily!"
protected-mode no
```

The agent isn't malicious. It just prioritized "connectivity" over "security" because it saw a thousand Stack Overflow posts doing the same thing.

**Traditional approach:** A human reviews the PR.
**Problem:** Humans get tired. Agents generate code faster than humans can review it.

---

## The Solution: The Automated Conscience

Aphoria acts as the agent's conscience. It provides a structured, authoritative check *before* the code leaves the agent's hands.

### 1. The Workflow

```mermaid
graph LR
    User[User Request] --> Agent[Coding Agent]
    Agent --> Code[Generate Code]
    Code --> Aphoria[Aphoria Scan]
    Aphoria -- PASS --> PR[Open PR]
    Aphoria -- BLOCK --> Agent
    Agent --> Retry[Self-Correct]
```

### 2. Implementing the Loop

If you are building an agent loop (using LangChain, AutoGPT, or a custom framework), inject this step:

```python
import json
import subprocess


def run_preflight_check(project_dir):
    """Scan the project with Aphoria and report whether the agent may proceed."""
    result = subprocess.run(
        ["aphoria", "scan", project_dir, "--format", "json"],
        capture_output=True,
        text=True,
    )

    # The JSON report is read from stdout.
    scan_data = json.loads(result.stdout)

    if scan_data["has_blocks"]:
        return {
            "status": "FAILED",
            "feedback": generate_feedback(scan_data["conflicts"]),
        }

    return {"status": "PASSED"}


def generate_feedback(conflicts):
    """Turn each conflict into a message the agent can act on."""
    lines = ["Your code failed safety checks:"]
    for conflict in conflicts:
        claim = conflict["claim"]
        # Report the first conflicting authority for each claim.
        authority = conflict["conflicts"][0]
        lines.append(f"- {claim['file']}: {claim['description']}")
        lines.append(f"  VIOLATION: {authority['rfc_citation']}")
        lines.append(f"  REQUIRED: {authority['value']}")
    return "\n".join(lines) + "\n"
```

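The BLOCK edge in the workflow implies a retry loop around this check. A minimal sketch of that loop, assuming a hypothetical `generate_code(task, feedback)` callable supplied by your agent framework and a `check` callable that returns the same shape as `run_preflight_check`. The function name, the hook signature, and the attempt cap are illustrative and not part of Aphoria:

```python
def preflight_loop(task, generate_code, check, max_attempts=3):
    """Regenerate code until the pre-flight check passes or attempts run out.

    generate_code(task, feedback) is a hypothetical hook into your agent; it
    should write code into the project and may use the previous feedback.
    check() should return {"status": "PASSED"} or
    {"status": "FAILED", "feedback": "..."}.
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        generate_code(task, feedback)
        result = check()
        if result["status"] == "PASSED":
            return {"status": "PASSED", "attempts": attempt}
        # Carry the scanner's feedback into the next generation attempt.
        feedback = result["feedback"]
    # Stop and escalate to a human instead of looping forever.
    return {"status": "FAILED", "attempts": max_attempts, "feedback": feedback}
```

In practice `check` could be `lambda: run_preflight_check(project_dir)`. Capping the attempts is a design choice: an agent that cannot satisfy a constraint after a few tries probably needs a human to look at the task.
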
### 3. Why This Works

Agents are remarkably good at fixing bugs **if you tell them exactly what constraint they violated.**

* **Bad Feedback:** "This code isn't secure." (Agent guesses randomly)
* **Aphoria Feedback:** "File `redis.conf`, line 12: `protected-mode no` violates OWASP A05:2021. Authority requires `yes`."

The agent receives:

1. **Location:** `redis.conf:12`
2. **Constraint:** OWASP A05:2021
3. **Target Value:** `yes`

With all three in hand, it will usually self-correct to `protected-mode yes` on the next attempt.

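If your agent framework accepts structured tool results, the same three fields can be passed as data instead of prose. A sketch, assuming the conflict shape used by `generate_feedback` earlier in this guide. The `line` key is a guess: the example message shows a line number, but the report field that carries it is not documented here, so the code treats it as optional:

```python
def to_structured_feedback(conflict):
    """Extract location, constraint, and target value from one conflict entry."""
    claim = conflict["claim"]
    authority = conflict["conflicts"][0]  # first conflicting authority
    location = claim["file"]
    # "line" is an assumed, optional key; fall back to the file name alone.
    if claim.get("line") is not None:
        location = f"{location}:{claim['line']}"
    return {
        "location": location,
        "constraint": authority["rfc_citation"],
        "target_value": authority["value"],
    }


# Sample entry mirroring the Redis example; the values are illustrative.
sample = {
    "claim": {"file": "redis.conf", "line": 12, "description": "protected-mode no"},
    "conflicts": [{"rfc_citation": "OWASP A05:2021", "value": "yes"}],
}
print(to_structured_feedback(sample))
```
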
### 4. The "Strict Mode" for Agents

Humans can be trusted to interpret warnings. Agents should be held to a higher standard.

Always run agent checks with:
```bash
aphoria scan --strict
```

This treats even minor deviations (FLAGs) as errors. If an agent uses a deprecated dependency or an unconventional variable name that triggers a FLAG, force it to fix it. We want machine-generated code to be **pristine**.

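One way to make strict mode the default in an agent harness is to build the scan command in a single place. A sketch that combines `--strict` with the `--format json` invocation from the loop in section 2. That the two flags can be combined, and that the project path goes directly after `scan`, are assumptions carried over from this guide's examples rather than confirmed CLI behavior:

```python
def build_scan_command(project_dir, strict=True):
    """Return the argv list for an Aphoria scan, strict by default for agents."""
    command = ["aphoria", "scan", project_dir, "--format", "json"]
    if strict:
        # Assumed to be combinable with --format json.
        command.append("--strict")
    return command
```

Pass the result to `subprocess.run(...)`. Human-facing runs can opt out with `strict=False`.
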
### 5. Example Scenario: The JWT Hallucination

**Agent Task:** "Add JWT auth to the API."

**Attempt 1:**
Code: `jwt.verify(token, secret, { ignoreExpiration: true })`
*Aphoria Scan:*
> BLOCK: code://js/auth/jwt/expiry
> RFC 7519: Expiry validation MUST be enabled.

**Feedback Loop:**
Agent receives the error.
*Thought Process:* "Ah, RFC 7519 requires an expiration check. I disabled it by mistake."

**Attempt 2:**
Code: `jwt.verify(token, secret)` (defaults to checking expiry)
*Aphoria Scan:* PASS.

**Result:** The PR that reaches the human reviewer is already compliant. The human focuses on logic, not spec compliance.

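To exercise an agent loop against this scenario in unit tests, a toy stand-in for the scanner can be handy. The sketch below is not how Aphoria detects the issue: `fake_scan` and its regex are hypothetical, and the returned dictionary shape is invented for the test. It only reproduces the BLOCK and PASS verdicts from the two attempts above:

```python
import re

# Toy rule: flags jsonwebtoken-style calls that disable expiry validation.
IGNORE_EXPIRY = re.compile(r"ignoreExpiration\s*:\s*true")


def fake_scan(source):
    """Hypothetical stand-in scanner returning a PASS or BLOCK verdict."""
    if IGNORE_EXPIRY.search(source):
        return {
            "verdict": "BLOCK",
            "rule": "code://js/auth/jwt/expiry",
            "message": "RFC 7519: Expiry validation MUST be enabled.",
        }
    return {"verdict": "PASS"}


attempt_1 = "jwt.verify(token, secret, { ignoreExpiration: true })"
attempt_2 = "jwt.verify(token, secret)"

for source in (attempt_1, attempt_2):
    print(fake_scan(source)["verdict"], "<-", source)
```
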
## Summary

Aphoria is the **Guardrail** that makes autonomous coding safe. It turns "Trust" into "Trust, but Verify."