stemedb/applications/aphoria/docs/guides/pre-flight-checks.md
2026-02-04 02:35:02 -07:00


Guide: Pre-Flight Checks for Autonomous Coding Agents

Target Audience: AI Engineers, Agent Framework Builders
Context: AI Safety & Reliability


The Problem: Confident Hallucinations

AI agents are excellent at writing code. They are terrible at understanding constraints.

If you ask an agent: "Deploy a secure Redis instance," it might write:

# redis.conf
protected-mode no  # "To make sure it connects easily!"

The agent isn't malicious. It just prioritized "connectivity" over "security" because it saw a thousand Stack Overflow posts doing the same thing.

Traditional approach: A human reviews the PR.
Problem: Humans get tired. Agents generate code faster than humans can review it.


The Solution: The Automated Conscience

Aphoria acts as the agent's conscience. It provides a structured, authoritative check before the code leaves the agent's hands.

1. The Workflow

graph LR
    User[User Request] --> Agent[Coding Agent]
    Agent --> Code[Generate Code]
    Code --> Aphoria[Aphoria Scan]
    Aphoria -- PASS --> PR[Open PR]
    Aphoria -- BLOCK --> Agent
    Agent --> Retry[Self-Correct]

2. Implementing the Loop

If you are building an agent loop (using LangChain, AutoGPT, or a custom framework), insert this step after code generation:

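import json
import subprocess

def run_preflight_check(project_dir):
    # Run the scan and capture its JSON report from stdout as text.
    result = subprocess.run(
        ["aphoria", "scan", project_dir, "--format", "json"],
        capture_output=True,
        text=True,
    )

    try:
        scan_data = json.loads(result.stdout)
    except json.JSONDecodeError:
        # No parseable report (for example, the scan itself crashed): fail closed.
        return {
            "status": "FAILED",
            "feedback": f"Scan produced no report: {result.stderr.strip()}",
        }

    if scan_data["has_blocks"]:
        return {
            "status": "FAILED",
            "feedback": generate_feedback(scan_data["conflicts"]),
        }

    return {"status": "PASSED"}

def generate_feedback(conflicts):
    # One entry per violated claim: where it is, what it violates, what is required.
    lines = ["Your code failed safety checks:"]
    for conflict in conflicts:
        claim = conflict["claim"]
        authority = conflict["conflicts"][0]  # first conflicting authority
        lines.append(f"- {claim['file']}: {claim['description']}")
        lines.append(f"  VIOLATION: {authority['rfc_citation']}")
        lines.append(f"  REQUIRED: {authority['value']}")
    return "\n".join(lines) + "\n"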
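
The check only helps if its result is fed back to the agent. Below is a minimal sketch of the surrounding retry loop, matching the PASS/BLOCK arrows in the workflow diagram. The names agent_loop, generate, preflight, and max_attempts are hypothetical and not part of Aphoria: generate stands in for whatever call makes your agent write code (it receives the task plus the last feedback, and returns the project directory), and preflight is any callable with the same return shape as run_preflight_check.

```python
from typing import Callable, Optional


def agent_loop(
    task: str,
    generate: Callable[[str, Optional[str]], str],  # hypothetical agent call
    preflight: Callable[[str], dict],               # e.g. run_preflight_check
    max_attempts: int = 3,                          # assumed retry budget
) -> dict:
    """Generate code, scan it, and retry with scan feedback until it passes."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        # The agent sees the previous scan feedback (None on the first try).
        project_dir = generate(task, feedback)
        result = preflight(project_dir)
        if result["status"] == "PASSED":
            return {"status": "PASSED", "attempts": attempt, "project_dir": project_dir}
        feedback = result["feedback"]
    # Retry budget exhausted: hand the last feedback to a human instead of opening a PR.
    return {"status": "FAILED", "attempts": max_attempts, "feedback": feedback}
```

Capping the number of attempts keeps an agent that cannot satisfy a constraint from looping forever. What happens after the cap (escalate, abandon, open a draft PR) is a design choice for your framework.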

3. Why This Works

Agents are remarkably good at fixing bugs if you tell them exactly what constraint they violated.

  • Bad Feedback: "This code isn't secure." (Agent guesses randomly)
  • Aphoria Feedback: "File redis.conf, line 12: protected-mode no violates OWASP A05:2021. Authority requires yes."

The agent receives:

  1. Location: redis.conf:12
  2. Constraint: OWASP A05:2021
  3. Target Value: yes

It will almost always self-correct to protected-mode yes on the next attempt.
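
One way to keep those three fields explicit is to carry them as a small record rather than free text. This is a sketch only: the Violation class and violation_from_conflict helper are hypothetical, and the dictionary keys are assumed to match the conflict shape used in the feedback code in section 2. That shape exposes a file name but no line number, so the location here is the file alone.

```python
from dataclasses import dataclass


@dataclass
class Violation:
    location: str      # where the problem is, e.g. "redis.conf"
    constraint: str    # which authority was violated, e.g. "OWASP A05:2021"
    target_value: str  # the value the authority requires, e.g. "yes"

    def to_prompt(self) -> str:
        """Render the violation as one precise sentence for the agent."""
        return (
            f"{self.location} violates {self.constraint}. "
            f"Required value: {self.target_value}."
        )


def violation_from_conflict(conflict: dict) -> Violation:
    # Assumed shape: {"claim": {"file": ...}, "conflicts": [{"rfc_citation": ..., "value": ...}]}
    authority = conflict["conflicts"][0]
    return Violation(
        location=conflict["claim"]["file"],
        constraint=authority["rfc_citation"],
        target_value=str(authority["value"]),
    )
```

Keeping the fields separate makes it easy to check that no feedback message reaches the agent without a location, a constraint, and a target value.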

4. The "Strict Mode" for Agents

Humans can be trusted to interpret warnings. Agents should be held to a higher standard.

Always run agent checks with:

aphoria scan --strict

This treats even minor deviations (FLAGs) as errors. If an agent uses a deprecated dependency or an unconventional variable name that triggers a FLAG, force it to fix the issue. Machine-generated code should reach review with no outstanding findings.
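
In an agent loop, strict mode can be made the default when the scan command is assembled. The sketch below assumes that --strict can be combined with the project path and the --format json option used in section 2; this guide does not confirm that combination, so check it against your Aphoria version. build_scan_command is a hypothetical helper.

```python
from typing import List


def build_scan_command(project_dir: str, strict: bool = True) -> List[str]:
    """Assemble the argv for a scan; strict is on by default for agents."""
    cmd = ["aphoria", "scan", project_dir, "--format", "json"]
    if strict:
        # Assumption: --strict composes with the other options shown in this guide.
        cmd.append("--strict")
    return cmd
```

Defaulting strict to True means a human-facing tool has to opt out explicitly, while agent code paths get the stricter behavior without extra configuration.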

5. Example Scenario: The JWT Hallucination

Agent Task: "Add JWT auth to the API."

Attempt 1:
Code: jwt.verify(token, secret, { ignoreExpiration: true })
Aphoria Scan:

BLOCK: code://js/auth/jwt/expiry
RFC 7519: Expiry validation MUST be enabled.

Feedback Loop: Agent receives the error.
Thought Process: "Ah, RFC 7519 requires expiration check. I disabled it by mistake."

Attempt 2:
Code: jwt.verify(token, secret) (defaults to checking expiry)
Aphoria Scan: PASS.

Result: The PR that reaches the human reviewer is already compliant. The human focuses on logic, not spec compliance.

Summary

Aphoria is the Guardrail that makes autonomous coding safe. It turns "Trust" into "Trust, but Verify."