stemedb/applications/aphoria/uat/2026-02-04-uat-plan-unreal.md
jordan b3e8a9a058 feat: Multi-application expansion with chaos testing and community UI
2026-02-04 01:24:14 -07:00


UAT Plan: Unreal Engine Audit (Masq Project)

Goal: Prove Aphoria's value for game development by detecting specific performance, security, and architectural issues in a real-world Unreal Engine project (MasqMain).

Hypothesis: Game developers struggle with invisible "drift" in large C++/Blueprint codebases — hardcoded paths, synchronous loading hitches, and insecure config defaults. Aphoria can surface these instantly using the same knowledge-graph approach that worked for VulnBank.

1. Test Environment

  • Target Codebase: /opt/MasqMain/UE (Masquerade Unreal Client)
  • Aphoria Version: 0.1.0 + Unreal Extractors
  • Configuration:

# aphoria.toml
[scan]
include_tests = false
max_file_size = 1048576 # 1 MiB (1024 * 1024 bytes)

[extractors]
enabled = ["unreal_cpp", "unreal_config", "unreal_performance", "hardcoded_secrets"]

2. Success Criteria

We will consider this UAT a success if Aphoria detects at least 5 distinct issues across these categories with 100% precision (no false positives on engine code).

| Category | Finding | Expected Verdict | Why it matters |
|---|---|---|---|
| Performance | LoadSynchronous() in MasqSubsystem.cpp | BLOCK | Causes frame hitches during gameplay. |
| Architecture | Hardcoded /Game/UI/Foundations/... paths | FLAG | Breaks if assets move; use SoftObjectPtr. |
| Security | UFUNCTION(Exec) on sensitive methods | BLOCK | Allows cheating/exploitation via console. |
| Config | ApiKey=sk_live_... in DefaultMasq.ini | BLOCK | Leaks credentials in shipping builds. |
| Network | MaxClientRate=15000 (too high/low) | FLAG | Affects multiplayer replication quality. |
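
The expected findings in the table can be read as pattern rules. The sketch below is one hypothetical way to express them in Python. The `Rule` type, the regular expressions, and the `scan_text` helper are illustrative assumptions, not Aphoria's actual extractors. The Network rule only surfaces any MaxClientRate setting for review, because this plan does not state the acceptable range.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    category: str
    verdict: str  # "BLOCK" or "FLAG", as in the table
    pattern: re.Pattern


# Hypothetical patterns, one per table row; not Aphoria's real extractors.
RULES = [
    Rule("Performance", "BLOCK", re.compile(r"\bLoadSynchronous\s*\(")),
    Rule("Architecture", "FLAG", re.compile(r'TEXT\(\s*"/Game/')),
    Rule("Security", "BLOCK", re.compile(r"UFUNCTION\s*\([^)]*\bExec\b")),
    Rule("Config", "BLOCK", re.compile(r"^[ \t]*ApiKey\s*=\s*sk_live_", re.MULTILINE)),
    # Assumption: flag every MaxClientRate line; the valid range is not given.
    Rule("Network", "FLAG", re.compile(r"^[ \t]*MaxClientRate\s*=\s*\d+", re.MULTILINE)),
]


def scan_text(text: str) -> list[tuple[str, str, int]]:
    """Return (category, verdict, line_number) for every rule match, sorted by line."""
    findings = []
    for rule in RULES:
        for match in rule.pattern.finditer(text):
            # Line number = newlines before the match start, plus one.
            line_number = text.count("\n", 0, match.start()) + 1
            findings.append((rule.category, rule.verdict, line_number))
    return sorted(findings, key=lambda finding: finding[2])
```

Running `scan_text` over the contents of a single .cpp or .ini file gives a quick, independent cross-check of what the real scan reports for that file.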

3. Execution Plan

Step 1: Baseline Scan

Run Aphoria against the project root to establish the current state of "epistemic drift."

cd /opt/MasqMain/UE
aphoria scan . --format table

Step 2: Verification of Findings

For each finding, verify:

  1. Context: Is it actually code we own? (Engine/ findings would be ignored if we were scanning from outside the project; here the scan runs inside the project root.)
  2. Authority: Does the citation (vendor://unreal/...) make sense?
  3. Accuracy: Is LoadSynchronous actually on the game thread? (Yes, in Initialize()).
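
Once each finding has been checked against the three questions above, the success criteria from Section 2 reduce to simple arithmetic. The following is a minimal sketch. The `VerifiedFinding` type, its fields, and the choice to count distinct issues by (category, location) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerifiedFinding:
    category: str        # e.g. "Performance"
    location: str        # e.g. "MasqSubsystem.cpp"
    true_positive: bool  # outcome of the manual verification step


def uat_passes(findings: list[VerifiedFinding], min_distinct: int = 5) -> bool:
    """Section 2 criteria: at least min_distinct distinct issues and no false positives."""
    if not findings:
        return False
    # Assumption: two findings are "distinct" when category or location differs.
    confirmed = {(f.category, f.location) for f in findings if f.true_positive}
    precision = sum(f.true_positive for f in findings) / len(findings)
    return len(confirmed) >= min_distinct and precision == 1.0
```

A single false positive fails the run under this reading, which matches the 100% precision requirement.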

Step 3: Fix Workflow (Simulated)

Demonstrate how a developer would resolve one issue using the ack workflow vs. a code fix.

  • Scenario A (Fix): Replace LoadSynchronous() with an asynchronous load via FStreamableManager::RequestAsyncLoad().
  • Scenario B (Ack): Acknowledge UFUNCTION(Exec) on a debug cheat function that is stripped in shipping.
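
The difference between the two scenarios can be modeled as follows: a fix makes the finding disappear from the next scan, whereas an ack leaves the finding in place but records why it is accepted. This is only a sketch. The plan does not describe how Aphoria stores acknowledgements, so the `Finding` and `Ack` records, the rule id, and the `unresolved` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str    # hypothetical rule id, e.g. "exec_ufunction"
    path: str
    symbol: str


@dataclass(frozen=True)
class Ack:
    finding: Finding
    reason: str  # justification, e.g. "stripped in shipping builds"


def unresolved(findings: list[Finding], acks: list[Ack]) -> list[Finding]:
    """Return findings from the latest scan that have no justified ack."""
    # Assumption: an ack with an empty reason does not count.
    accepted = {ack.finding for ack in acks if ack.reason.strip()}
    return [finding for finding in findings if finding not in accepted]
```

In Scenario A the fixed finding is simply absent from the next scan's `findings`; in Scenario B it is still present but filtered out by its ack.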

4. Expected Output Artifact

A report titled 2026-02-04-masq-unreal-audit.md in applications/aphoria/uat/ containing:

  • Summary of findings.
  • "Show stopper" issues found (e.g., the Sync Load in Subsystem).
  • Comparison of how long this would take a human reviewer vs. Aphoria (0.5s).

5. Risk Assessment

  • False Positives: TEXT("/Game/...") can be legitimate inside ConstructorHelpers, which run only at startup while the class default object (CDO) is constructed. We need to distinguish runtime usage from CDO initialization.
  • Engine Code: If we scan third-party code under Plugins/, we may find issues we cannot fix. We should focus on Source/Masq/.
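
Both risks can be reduced with cheap filters before a finding is reported. The sketch below assumes paths relative to the project root and treats a /Game/ literal as startup-only when it appears on the same line as ConstructorHelpers. The helper names and that single-line heuristic are assumptions for illustration; a real extractor would likely need to inspect the enclosing constructor.

```python
import re
from pathlib import PurePosixPath

OWNED_ROOT = PurePosixPath("Source/Masq")  # focus area named in the plan
GAME_PATH = re.compile(r'TEXT\(\s*"/Game/')


def is_owned(path: str) -> bool:
    """True when path lies under Source/Masq/, not Engine/ or Plugins/."""
    candidate = PurePosixPath(path)
    # Compare whole path components so "Source/MasqEditor" does not match.
    return candidate == OWNED_ROOT or OWNED_ROOT in candidate.parents


def runtime_hardcoded_paths(path: str, source: str) -> list[int]:
    """Line numbers with /Game/ literals that are not ConstructorHelpers lookups."""
    if not is_owned(path):
        return []
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if GAME_PATH.search(line) and "ConstructorHelpers" not in line
    ]
```

Matching on path components rather than string prefixes avoids accidentally treating a sibling module as owned code.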

Next Step: Execute the scan?