Guide: Pre-Flight Checks for Autonomous Coding Agents
Target Audience: AI Engineers, Agent Framework Builders
Context: AI Safety & Reliability
The Problem: Confident Hallucinations
AI agents are excellent at writing code. They are terrible at understanding Constraints.
If you ask an agent: "Deploy a secure Redis instance," it might write:
# redis.conf
protected-mode no # "To make sure it connects easily!"
The agent isn't malicious. It just prioritized "connectivity" over "security" because it saw a thousand Stack Overflow posts doing the same thing.
Traditional approach: A human reviews the PR.
Problem: Humans get tired. Agents generate code faster than humans can review it.
The Solution: The Automated Conscience
Aphoria acts as the agent's conscience. It provides a structured, authoritative check before the code leaves the agent's hands.
1. The Workflow
graph LR
User[User Request] --> Agent[Coding Agent]
Agent --> Code[Generate Code]
Code --> Aphoria[Aphoria Scan]
Aphoria -- PASS --> PR[Open PR]
Aphoria -- BLOCK --> Agent
Agent --> Retry[Self-Correct]
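The diagram above can be sketched as a small retry loop. This is a minimal illustration, not Aphoria's own API: `run_agent_loop`, `fake_generate`, `fake_scan`, and `max_attempts` are hypothetical names, and the two fake functions stand in for a real agent and a real scan.

```python
from typing import Callable, Optional


def run_agent_loop(
    generate: Callable[[Optional[str]], str],
    scan: Callable[[str], dict],
    max_attempts: int = 3,
) -> dict:
    """Generate code, scan it, and retry with feedback until it passes."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        code = generate(feedback)           # Agent --> Generate Code
        result = scan(code)                 # Code --> Aphoria Scan
        if result["status"] == "PASSED":    # PASS --> Open PR
            return {"status": "PASSED", "attempts": attempt, "code": code}
        feedback = result.get("feedback")   # BLOCK --> Agent (Self-Correct)
    # Give up after max_attempts so the loop cannot run forever.
    return {"status": "FAILED", "attempts": max_attempts, "feedback": feedback}


# Hypothetical stand-ins for a real agent and a real scan.
def fake_generate(feedback: Optional[str]) -> str:
    return "protected-mode yes" if feedback else "protected-mode no"


def fake_scan(code: str) -> dict:
    if "protected-mode no" in code:
        return {"status": "FAILED", "feedback": "protected-mode must be yes"}
    return {"status": "PASSED"}


outcome = run_agent_loop(fake_generate, fake_scan)
```

Capping the number of attempts is a design choice in this sketch: an agent that cannot satisfy the check should escalate to a human rather than retry indefinitely.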
2. Implementing the Loop
If you are building an agent loop (using LangChain, AutoGPT, or custom), inject this step:
import json
import subprocess

def run_preflight_check(project_dir):
    # Scan the project and capture the JSON report from stdout.
    result = subprocess.run(
        ["aphoria", "scan", project_dir, "--format", "json"],
        capture_output=True,
        text=True,
    )
    scan_data = json.loads(result.stdout)

    # Any BLOCK fails the check; return feedback the agent can act on.
    if scan_data["has_blocks"]:
        return {
            "status": "FAILED",
            "feedback": generate_feedback(scan_data["conflicts"]),
        }
    return {"status": "PASSED"}

def generate_feedback(conflicts):
    feedback = "Your code failed safety checks:\n"
    for conflict in conflicts:
        # Report the first conflicting authority for each claim.
        authority = conflict["conflicts"][0]
        feedback += f"- {conflict['claim']['file']}: {conflict['claim']['description']}\n"
        feedback += f"  VIOLATION: {authority['rfc_citation']}\n"
        feedback += f"  REQUIRED: {authority['value']}\n"
    return feedback
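A pre-flight check is only useful if a broken scan cannot be mistaken for a pass. The sketch below is one way to make the step fail closed. It assumes the same command line and `has_blocks` / `conflicts` fields used above; the `scan_fail_closed` name, the `command` parameter, and the 120-second timeout are illustrative assumptions, not part of Aphoria.

```python
import json
import subprocess


def scan_fail_closed(project_dir, command=("aphoria", "scan"), timeout=120):
    """Run the scan; report FAILED whenever the result cannot be trusted."""
    try:
        result = subprocess.run(
            [*command, project_dir, "--format", "json"],
            capture_output=True,
            text=True,
            timeout=timeout,  # assumed limit; tune for your projects
        )
        scan_data = json.loads(result.stdout)
    except (OSError, subprocess.TimeoutExpired, ValueError) as exc:
        # Missing binary, timeout, or output that is not valid JSON.
        return {"status": "FAILED", "feedback": f"Scan could not be completed: {exc}"}

    # Treat an unexpected report shape as a failure, not a pass.
    if not isinstance(scan_data, dict) or "has_blocks" not in scan_data:
        return {"status": "FAILED", "feedback": "Scan output had an unexpected shape."}

    if scan_data["has_blocks"]:
        return {"status": "FAILED", "conflicts": scan_data.get("conflicts", [])}
    return {"status": "PASSED"}
```

With this shape, an agent that somehow breaks the scanner's environment still cannot open a PR.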
3. Why This Works
Agents are remarkably good at fixing bugs if you tell them exactly what constraint they violated.
- Bad Feedback: "This code isn't secure." (Agent guesses randomly)
- Aphoria Feedback: "File redis.conf, line 12: protected-mode no violates OWASP A05:2021. Authority requires yes."
The agent receives:
- Location: redis.conf:12
- Constraint: OWASP A05:2021
- Target Value: yes
It will almost always self-correct to protected-mode yes on the next attempt.
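The three fields above can be carried as a small structured record instead of free text. This is a sketch under assumptions: `Violation` and `violation_from_conflict` are hypothetical names, the dictionary shape mirrors the one used by `generate_feedback` earlier in this guide, and the `line` key is assumed since only `file` appears in that code.

```python
from dataclasses import dataclass


@dataclass
class Violation:
    location: str      # e.g. "redis.conf:12"
    constraint: str    # e.g. "OWASP A05:2021"
    target_value: str  # e.g. "yes"

    def to_prompt(self) -> str:
        """Render the record as text for the agent's next attempt."""
        return (
            f"Location: {self.location}\n"
            f"Constraint: {self.constraint}\n"
            f"Target Value: {self.target_value}"
        )


def violation_from_conflict(conflict: dict) -> Violation:
    claim = conflict["claim"]
    authority = conflict["conflicts"][0]  # first conflicting authority
    line = claim.get("line")              # assumed key; may be absent
    location = f"{claim['file']}:{line}" if line is not None else claim["file"]
    return Violation(location, authority["rfc_citation"], str(authority["value"]))


# Hypothetical sample shaped like the redis.conf example above.
sample = {
    "claim": {"file": "redis.conf", "line": 12, "description": "protected-mode no"},
    "conflicts": [{"rfc_citation": "OWASP A05:2021", "value": "yes"}],
}
print(violation_from_conflict(sample).to_prompt())
```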
4. The "Strict Mode" for Agents
Humans can be trusted to interpret warnings. Agents should be held to a higher standard.
Always run agent checks with:
aphoria scan --strict
This treats even minor deviations (FLAGs) as errors. If an agent uses a deprecated dependency or an unusual variable name that triggers a FLAG, force it to fix the issue. We want machine-generated code to be pristine.
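In an agent loop, strict mode can be made the default so that nobody has to remember the flag. The helper below is a hypothetical sketch: `build_scan_command` is not part of Aphoria, and combining `--strict` with `--format json` in a single invocation is an assumption based on the two commands shown in this guide.

```python
def build_scan_command(project_dir, strict=True):
    """Build the argument list for an agent-side scan, strict by default."""
    command = ["aphoria", "scan", project_dir, "--format", "json"]
    if strict:
        # Assumed to combine with --format json; check your CLI's help output.
        command.append("--strict")
    return command


agent_command = build_scan_command("my-project")
human_command = build_scan_command("my-project", strict=False)
```

Keeping the human path opt-out rather than opt-in reflects the guide's position: agents get the stricter check unless someone deliberately relaxes it.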
5. Example Scenario: The JWT Hallucination
Agent Task: "Add JWT auth to the API."
Attempt 1:
Code: jwt.verify(token, secret, { ignoreExpiration: true })
Aphoria Scan:
BLOCK: code://js/auth/jwt/expiry
RFC 7519: Expiry validation MUST be enabled.
Feedback Loop: Agent receives the error.
Thought Process: "Ah, RFC 7519 requires an expiration check. I disabled it by mistake."
Attempt 2:
Code: jwt.verify(token, secret) (defaults to checking expiry)
Aphoria Scan: PASS.
Result: The PR that reaches the human reviewer is already compliant. The human focuses on logic, not spec compliance.
Summary
Aphoria is the Guardrail that makes autonomous coding safe. It turns "Trust" into "Trust, but Verify."