stemedb/applications/aphoria/docs/llm-optimization/baselines/template.md
jordan 157dbbb9eb feat: Complete Aphoria Phase 8-9 + UAT suite (90/90 tests passing)
## Phase 8: Enterprise Extractor Improvements 
- 14 security extractors (TLS, JWT, SQL injection, XSS, etc.)
- 10 framework-specific extractors (Spring, Django, Rails, etc.)
- Config file security detection (YAML, TOML)

## Phase 9: Autonomous Extractor Generation 
- Shadow mode executor with TP/FP tracking
- Graduation pipeline with confidence thresholds
- Auto-rollback on regression detection
- Cross-project pattern syncing

## UAT Suite Complete (14 scripts, 90 tests)
- test-core-detection.sh (6 tests)
- test-declarative-extractors.sh (5 tests)
- test-domain-frameworks.sh (5 tests)
- test-domain-unreal.sh (3 tests)
- test-llm-extraction.sh (6 tests)
- test-eval-harness.sh (5 tests)
- test-cross-language.sh (3 tests)
- test-precommit-performance.sh (4 tests)
- test-output-formats.sh (8 tests)
- test-drift-detection.sh (6 tests)
- test-exit-codes.sh (12 tests)
- plus 3 more scripts

## Other Changes
- Updated roadmap to mark Phase 8-9 complete
- Added .gitignore entries for build artifacts
- Updated pre-commit: 800 line limit, exclude tests/data/cmd

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 22:50:55 -07:00

Baseline: YYYY-MM-DD

Prompt Version: X.Y.Z
Model: gemini-2.0-flash
Fixture Count: N


Overall Metrics

| Metric | Value | Target | Status |
|--------|-------|--------|--------|
| Precision | X.XX | 0.80 | |
| Recall | X.XX | 0.75 | |
| F1 | X.XX | 0.77 | |
| Parse Success | X.XX% | 95% | |
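
To fill in the Value and Status columns, the three scores can be derived from true-positive, false-positive, and false-negative counts. The sketch below is one possible way to do that; the `Counts` class, the `score` and `status` helpers, and the PASS/FAIL labels are hypothetical names chosen for illustration and are not part of the Aphoria eval harness. Only the target values come from the table above.

```python
from dataclasses import dataclass

# Targets copied from the Overall Metrics table in this template.
TARGETS = {"precision": 0.80, "recall": 0.75, "f1": 0.77}


@dataclass
class Counts:
    """Hypothetical container for aggregate detection counts."""
    tp: int  # findings that were expected and reported
    fp: int  # findings that were reported but not expected
    fn: int  # findings that were expected but missed


def score(c: Counts) -> dict[str, float]:
    """Compute precision, recall, and F1, using 0.0 when a denominator is zero."""
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def status(metrics: dict[str, float]) -> dict[str, str]:
    """Label each metric PASS if it meets its target, otherwise FAIL."""
    return {
        name: "PASS" if metrics[name] >= target else "FAIL"
        for name, target in TARGETS.items()
    }


if __name__ == "__main__":
    metrics = score(Counts(tp=8, fp=2, fn=2))
    for name, value in metrics.items():
        print(f"{name}: {value:.2f} ({status(metrics)[name]})")
```

The F1 target of 0.77 is close to the harmonic mean of the precision and recall targets (2 × 0.80 × 0.75 / 1.55 ≈ 0.774), so a run that meets both of those targets exactly will also meet the F1 target.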

Per-Category Breakdown

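| Category | Fixtures | Passed | Failed | Precision | Recall | F1 |
|----------|----------|--------|--------|-----------|--------|----|
| tls | N | N | N | X.XX | X.XX | X.XX |
| jwt | N | N | N | X.XX | X.XX | X.XX |
| secrets | N | N | N | X.XX | X.XX | X.XX |
| auth | N | N | N | X.XX | X.XX | X.XX |
| negative | N | N | N | X.XX | X.XX | X.XX |
| edge | N | N | N | X.XX | X.XX | X.XX |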
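
The per-category rows can be produced by grouping individual fixture results before scoring. The following is a minimal sketch under stated assumptions: the record shape (`category`, `passed`, `tp`, `fp`, `fn`) and the `per_category` function are hypothetical and may not match the harness's actual JSON output.

```python
from collections import defaultdict


def per_category(results: list[dict]) -> dict[str, dict]:
    """Group hypothetical fixture records by category and score each group."""
    totals = defaultdict(
        lambda: {"fixtures": 0, "passed": 0, "tp": 0, "fp": 0, "fn": 0}
    )
    for r in results:
        t = totals[r["category"]]
        t["fixtures"] += 1
        t["passed"] += 1 if r["passed"] else 0
        for key in ("tp", "fp", "fn"):
            t[key] += r[key]

    rows = {}
    for category, t in totals.items():
        # Assumption: report 0.0 when a denominator is zero, which can
        # happen for categories such as "negative" with no expected findings.
        p = t["tp"] / (t["tp"] + t["fp"]) if (t["tp"] + t["fp"]) else 0.0
        r = t["tp"] / (t["tp"] + t["fn"]) if (t["tp"] + t["fn"]) else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        rows[category] = {
            "fixtures": t["fixtures"],
            "passed": t["passed"],
            "failed": t["fixtures"] - t["passed"],
            "precision": round(p, 2),
            "recall": round(r, 2),
            "f1": round(f1, 2),
        }
    return rows


if __name__ == "__main__":
    sample = [
        {"category": "tls", "passed": True, "tp": 1, "fp": 0, "fn": 0},
        {"category": "tls", "passed": False, "tp": 1, "fp": 1, "fn": 1},
        {"category": "jwt", "passed": False, "tp": 0, "fp": 0, "fn": 1},
    ]
    for category, row in per_category(sample).items():
        print(category, row)
```

Counts are summed within each category before scoring (micro-averaging), so a category with many findings per fixture is weighted by findings rather than by fixture count. Averaging per-fixture scores instead would be an equally valid choice, as long as it stays consistent between baselines.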

Failed Fixtures

| ID | Category | Issue | Root Cause |
|----|----------|-------|------------|

Changes Since Last Baseline

  • Change 1
  • Change 2

Known Issues

  • Issue 1
  • Issue 2

Next Optimization Targets

  1. Target 1
  2. Target 2
  3. Target 3

Raw Results

// Paste JSON output here for reference