# LLM Extraction Optimization

> Systematic approach to maximizing Aphoria's LLM extraction quality.

## Quick Links

| Document | When to Use |
|----------|-------------|
| [Quick Start](./quickstart.md) | First time optimizing, want to get started fast |
| [Full Playbook](./playbook.md) | Comprehensive optimization guide with decision trees |
| [Baseline Template](./baselines/template.md) | Recording metrics after each optimization cycle |
| [Research Template](./research/template.md) | Investigating unknown issues or new approaches |

## Current Status

**Latest Baseline:** [2026-02-06](./baselines/2026-02-06.md)

| Metric | Current | Target | Status |
|--------|---------|--------|--------|
| Precision | 0.93 | 0.80 | ✅ Exceeded |
| Recall | 1.00 | 0.75 | ✅ Exceeded |
| F1 | 0.96 | 0.77 | ✅ Exceeded |
| Parse Rate | 100% | 95% | ✅ |
| Fixtures Passing | 10/10 | - | ✅ All pass |

**Verdict:** PASS - All metrics exceed targets.

## Directory Structure

```
docs/llm-optimization/
├── index.md              # This file
├── quickstart.md         # 15-minute getting started
├── playbook.md           # Full optimization guide
├── baselines/            # Historical metrics
│   ├── template.md
│   └── YYYY-MM-DD.md     # One per baseline
└── research/             # Investigation notes
    ├── template.md
    └── [topic].md        # One per research topic
```

## Key Commands

```bash
# Run evaluation
aphoria eval run --fixtures tests/llm_fixtures --mode live

# Check for regressions (CI)
aphoria eval run --mode cached --fail-on-regression

# Update baseline after improvements
aphoria eval update-baseline --force

# List fixtures
aphoria eval list-fixtures

# Validate fixtures
aphoria eval validate-fixtures
```

## Optimization Flow

```
1. Run baseline evaluation
   ↓
2. Identify failure categories
   ↓
3. Apply targeted fixes (one at a time!)
   ↓
4. Validate: did metrics improve?
   ↓
   YES → Save new baseline, continue to next issue
   NO  → Revert, try different approach or research
   ↓
5. Repeat until targets met
   ↓
6. Set up CI to prevent regressions
```

## Fixture Locations

| Category | Path | Count |
|----------|------|-------|
| TLS | `tests/llm_fixtures/tls/` | 2 |
| JWT | `tests/llm_fixtures/jwt/` | 2 |
| Secrets | `tests/llm_fixtures/secrets/` | 2 |
| Auth | `tests/llm_fixtures/auth/` | 1 |
| Negative | `tests/llm_fixtures/negative/` | 2 |
| Edge | `tests/llm_fixtures/edge/` | 1 |
| **Total** | | **10** |

## Related Files

- **Prompt source:** `src/llm/prompts.rs`
- **Extractor:** `src/llm/extractor.rs`
- **Client:** `src/llm/client.rs`
- **Eval harness:** `src/eval/harness.rs`
- **Fixtures:** `tests/llm_fixtures/`

## Contributing Fixtures

See [Fixture Writing Guide](./playbook.md#appendix-b-fixture-writing-guide) in the playbook.

Quick checklist:

- [ ] Create TOML file in appropriate category folder
- [ ] Include both `must_contain` and `must_not_contain`
- [ ] Run `aphoria eval validate-fixtures`
- [ ] Test with `aphoria eval run --max-fixtures 1`
- [ ] Update `manifest.toml` category counts
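
For the last checklist item, it can help to recount the fixtures on disk before editing `manifest.toml` by hand. The following is a minimal Python sketch, not part of Aphoria. The helper name `count_fixtures` is hypothetical, and it assumes each fixture is a single `.toml` file placed directly inside its category folder under `tests/llm_fixtures/`, with `manifest.toml` itself stored outside those folders. It does not read or write `manifest.toml`, whose format is not described here.

```python
from pathlib import Path


def count_fixtures(root: str) -> dict[str, int]:
    """Count .toml files in each category folder directly under root.

    Hypothetical helper: assumes one fixture per .toml file and no
    nested subfolders inside a category.
    """
    counts: dict[str, int] = {}
    for category_dir in sorted(Path(root).iterdir()):
        # Skip loose files at the top level; only folders are categories.
        if category_dir.is_dir():
            counts[category_dir.name] = len(list(category_dir.glob("*.toml")))
    return counts


if __name__ == "__main__":
    root = Path("tests/llm_fixtures")
    if root.is_dir():
        counts = count_fixtures(str(root))
        for category, n in counts.items():
            print(f"{category}: {n}")
        print(f"total: {sum(counts.values())}")
```

Run from the repository root, the printed per-category numbers can be compared against the Fixture Locations table and the counts recorded in `manifest.toml`.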