jml e758f2ebfb feat(aphoria): implement programmatic extractors for Option<T> semantics

Completes Task #3 of httpclient dogfooding with 100% detection rate (7/7 violations).

## New Extractors

- **OptionBoundsExtractor**: Detects Option<T> fields set to None (unbounded)
- **OptionValueExtractor**: Extracts values from Some(n) for threshold checks

Both extractors use context-aware pattern matching to understand Rust Option<T>
semantics, which declarative extractors cannot handle.
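
To make the None-versus-Some(n) distinction concrete, here is a minimal sketch of classifying a single field initializer. It is not the code in option_bounds.rs or option_value.rs: `OptionField` and `parse_option_field` are hypothetical names, and the line-based parsing is an assumption made for illustration.

```rust
/// Hypothetical classification of an `Option<T>` field initializer.
#[derive(Debug, PartialEq)]
pub enum OptionField {
    /// `field: None` (unbounded).
    Unbounded,
    /// `field: Some(n)` with an integer literal.
    Bounded(u64),
}

/// Looks for `<field>: None` or `<field>: Some(<integer>)` on one source line.
/// Returns `None` when the line does not initialize the field, or when the
/// value is not an integer literal.
pub fn parse_option_field(line: &str, field: &str) -> Option<OptionField> {
    let trimmed = line.trim();
    // Strip the field name, then the colon, tolerating whitespace around it.
    let rest = trimmed.strip_prefix(field)?.trim_start();
    let value = rest.strip_prefix(':')?.trim().trim_end_matches(',').trim();

    if value == "None" {
        return Some(OptionField::Unbounded);
    }
    // Accept `Some(20)` and `Some(1_000)`; reject anything non-numeric.
    let inner = value.strip_prefix("Some(")?.strip_suffix(')')?;
    let digits: String = inner.chars().filter(|c| *c != '_').collect();
    digits.trim().parse::<u64>().ok().map(OptionField::Bounded)
}
```

A struct declaration such as `pub max_redirects: Option<u32>,` is rejected because the line does not start with the field name, which is the kind of context a single regex tends to miss.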

## Implementation

**Files Created**:
- applications/aphoria/src/extractors/option_bounds.rs (257 lines)
- applications/aphoria/src/extractors/option_value.rs (277 lines)
- applications/aphoria/docs/examples/extractors/programmatic-option-semantics.md

**Files Modified**:
- applications/aphoria/src/extractors/mod.rs - Added module declarations
- applications/aphoria/src/extractors/registry.rs - Registered extractors
- applications/aphoria/dogfood/httpclient/.aphoria/claims.toml - Added 4 claims
- applications/aphoria/dogfood/httpclient/TASK-1-SUMMARY.md - Task #3 completion

## Results

| Metric | Value |
|--------|-------|
| Detection Rate | 100% (7/7 violations) |
| Improvement | +29 percentage points (from 71%) |
| New Violations | 2 (max_redirects, max_retries unbounded) |
| Unit Tests | 13 (all passing) |
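
The percentages in the table can be reproduced with a small helper. This is only an arithmetic sketch; `detection_rate_pct` is a hypothetical name, and the earlier count of 5 detected violations is inferred from the stated 71% baseline rather than given directly.

```rust
/// Detection rate as a whole-number percentage (rounded to nearest).
/// Returns 0 when `total` is 0 to avoid dividing by zero.
pub fn detection_rate_pct(detected: u32, total: u32) -> u32 {
    if total == 0 {
        return 0;
    }
    ((detected as f64 / total as f64) * 100.0).round() as u32
}
```

With 5 of 7 detected the helper yields 71, with 7 of 7 it yields 100, and the difference is the 29 percentage points reported above.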

## Two-Claim Strategy

For each bounded Option<T> field:
1. **configured** claim - Detects None (unbounded)
2. **max_value** claim - Validates Some(n) threshold

Example:
- `max_redirects: None` → CONFLICT (not configured)
- `max_redirects: Some(20)` → CONFLICT (exceeds 10)
- `max_redirects: Some(5)` → PASS
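
The three outcomes above can be sketched as a single check that applies the configured claim first and the max_value claim second. `Verdict` and `check_bounded` are hypothetical names for illustration, not Aphoria types; the threshold of 10 comes from the example.

```rust
/// Hypothetical outcome of evaluating one bounded `Option<T>` field.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Pass,
    Conflict(&'static str),
}

/// Applies both claims to one field value:
/// 1. `configured`: `None` means the bound is missing.
/// 2. `max_value`: `Some(n)` must not exceed `max_allowed`.
pub fn check_bounded(value: Option<u32>, max_allowed: u32) -> Verdict {
    match value {
        None => Verdict::Conflict("not configured"),
        Some(n) if n > max_allowed => Verdict::Conflict("exceeds maximum"),
        Some(_) => Verdict::Pass,
    }
}
```

Splitting the check into two claims keeps the "missing bound" and "bound too high" findings separate, so each can carry its own consequence text.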

## Enterprise Quality

✓ Proper error handling (no unwrap/expect)
✓ Comprehensive tests (6+7 unit tests)
✓ Full documentation with examples
✓ Reusable for 10+ similar patterns
✓ Screening patterns for performance

## Cachewrap Dogfood

Also includes complete cachewrap dogfood exercise:
- 10 claims for Redis cache wrapper
- Day 1-5 summaries
- Full retrospective and evaluation
- Declarative extractors for all patterns

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-11 06:43:10 +00:00


Implementation Review - cachewrap

Timestamp: 2026-02-11

Documentation Followed: cachewrap/plan.md (5-day workflow), cachewrap/README.md

Files Reviewed: 13 files (source, tests, config, docs)

Files Created

| File | Purpose | Status | Evidence |
|------|---------|--------|----------|
| `Cargo.toml` | Rust workspace config | ✅ Created | Dependencies: redis, tokio, serde |
| `src/lib.rs` | Library root (145 lines) | ✅ Created | Documents all 10 violations |
| `src/error.rs` | Error types (52 lines) | ✅ Created | CacheError enum |
| `src/config.rs` | Config + 6 violations (124 lines) | ✅ Created | CacheConfig with Default impl |
| `src/client.rs` | Client + 4 violations (157 lines) | ✅ Created | CacheClient with async methods |
| `tests/basic.rs` | Integration tests (202 lines) | ✅ Created | 16 tests (9 pass, 7 require Redis) |
| `.aphoria/config.toml` | Aphoria configuration | ✅ Created | Persistent mode + 10 declarative extractors |
| `.aphoria/claims.toml` | 20 claims | ✅ Created | All with `created_by = "aphoria-suggest"` |
| `DAY1-SUMMARY.md` | Day 1 metrics (491 lines) | ✅ Created | 11 min duration, 35% reuse |
| `DAY2-SUMMARY.md` | Day 2 metrics (535 lines) | ✅ Created | 10 min duration, 10 violations |
| `DAY3-SUMMARY.md` | Day 3 metrics (501 lines) | ✅ Created | 9 min duration, 50% detection, 3 iterations |
| `DAY4-SUMMARY.md` | Day 4 metrics (467 lines) | ✅ Created | 25 min duration, 10/10 fixes |
| `DAY5-SUMMARY.md` | Day 5 retrospective (571 lines) | ✅ Created | Complete analysis |

Total Files: 13 created

Total Lines: ~3200 lines (code + docs + tests)

Implementation Observations

What They Did: Day-by-Day

Day 1: Claims (11 min)

Created: 20 claims in .aphoria/claims.toml

Approach:

Used /aphoria-suggest skill for pattern discovery ✅
7 claims reused from httpclient/dbpool/msgqueue (35% reuse rate)
13 new cache-specific claims created
All claims have created_by = "aphoria-suggest" attribution

Claim quality:

✅ All have provenance, invariant, consequence
✅ Authority tiers appropriate (expert for safety/security, community for recommendations)
✅ Evidence fields populated where applicable
✅ Concept paths follow cache/* namespace

Observation: Team used LLM workflow for claim creation as intended.

Day 2: Implementation (10 min)

Created: 4 source files (lib, error, config, client) + tests

Violations embedded (10 total):

Key injection (client.rs:27) - No validation in get() method ✅
TLS disabled (config.rs:23) - verify_tls: false in Default ✅
Hardcoded password (config.rs:18) - password: "secret123" ✅
Missing TTL (client.rs:56) - SET without EX/PX ✅
Unbounded size (config.rs:32) - max_size: None ✅
Sync blocking (client.rs:105) - blocking_get() method ✅
No eviction (config.rs:37) - eviction_policy: None ✅
Zero timeout (config.rs:27) - Duration::from_secs(0) ✅
No pooling (client.rs:30) - New conn per request ✅
No metrics (config.rs:42) - metrics_enabled: false ✅

Inline markers:

✅ All 10 violations have @aphoria:claim[category] invariant -- consequence markers
✅ Markers added during implementation (not retrofitted)
✅ Categories match claim categories (security, safety, performance, correctness, observability)
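
For reference, the marker shape `@aphoria:claim[category] invariant -- consequence` can be split into its three parts with plain string operations. This is a sketch of the format only, not Aphoria's own parser; `ClaimMarker` and `parse_marker` are hypothetical names, and the marker text in the usage note is invented.

```rust
/// Hypothetical parsed form of an inline `@aphoria:claim[...]` marker.
#[derive(Debug, PartialEq)]
pub struct ClaimMarker<'a> {
    pub category: &'a str,
    pub invariant: &'a str,
    pub consequence: &'a str,
}

/// Parses `@aphoria:claim[category] invariant -- consequence` from a comment line.
/// Returns `None` if the tag, the closing bracket, or the ` -- ` separator is missing.
pub fn parse_marker<'a>(line: &'a str) -> Option<ClaimMarker<'a>> {
    const TAG: &str = "@aphoria:claim[";
    let start = line.find(TAG)? + TAG.len();
    let rest = &line[start..];
    // Everything up to the first `]` is the category.
    let (category, body) = rest.split_once(']')?;
    // The first ` -- ` separates the invariant from the consequence.
    let (invariant, consequence) = body.split_once(" -- ")?;
    Some(ClaimMarker {
        category: category.trim(),
        invariant: invariant.trim(),
        consequence: consequence.trim(),
    })
}
```

Given a comment such as `// @aphoria:claim[security] TLS verification must be enabled -- MITM exposure`, the parser returns the category `security` with the invariant and consequence split at the ` -- ` separator.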

Test coverage:

✅ 3 unit tests in src/lib.rs (config, builder, enum)
✅ 13 integration tests in tests/basic.rs
✅ 9 tests pass without Redis, 7 require Redis (appropriately ignored)
✅ Tests exercise violations (don't detect them - that's scan's job)

Code quality:

✅ Compiles cleanly (cargo check passes)
✅ No unwrap/expect in production code
✅ Proper error handling with Result<T, CacheError>
✅ All methods return errors via ? operator

Observation: High-quality implementation with realistic violations, appropriate for dogfooding.

Day 3: Scanning (9 min, 3 iterations)

Created:

.aphoria/config.toml with 10 declarative extractors
scan-v1.json (baseline scan, 0% detection)
scan-v3.json (after extractor creation, 50% detection)
gap-analysis.md (analysis of missed violations)

Iteration 1 (FAILED):

Created 10 separate .toml files in .aphoria/extractors/ directory
Files not loaded by Aphoria
Issue: Misunderstood extractor configuration (assumed directory-based loading)
Time: ~1 minute

Iteration 2 (PARTIAL):

Added 10 [[extractors.declarative]] sections to .aphoria/config.toml
Concept path mismatch: claim.subject = "timeout" resolved to the tail config/timeout, while the claim's tail was cache/timeout
Result: 0% detection
Issue: Didn't prefix subjects with namespace
Time: ~1 minute

Iteration 3 (SUCCESS):

Updated all subjects to include cache/ prefix
Result: 50% detection (5/10 violations)
Time: ~1 minute
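
The mismatch in Iteration 2 and the fix in Iteration 3 can be illustrated with a tail comparison over `/`-separated concept paths. This is a sketch under assumptions: the two-segment tail is inferred from the config/timeout vs cache/timeout example, `tail` and `tails_match` are hypothetical helpers, and Aphoria's actual matching rules are not described in this review.

```rust
/// Returns the last `n` non-empty `/`-separated segments of a concept path.
pub fn tail(path: &str, n: usize) -> Vec<&str> {
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let start = segments.len().saturating_sub(n);
    segments[start..].to_vec()
}

/// Assumed rule: an observation matches a claim when their two-segment tails agree.
pub fn tails_match(observed: &str, claim: &str) -> bool {
    tail(observed, 2) == tail(claim, 2)
}
```

Under this rule a bare subject of "timeout" that ends up as config/timeout never meets a claim at cache/timeout, while a subject written as cache/timeout does.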

Final extractors in config.toml:

cache_key_validation_missing - pub\s+async\s+fn\s+get\s*\(&self,\s*key:\s*&str\) ✅
tls_verification_disabled - verify_tls:\s*false ⚠️ (matches declaration, not Default value)
hardcoded_password - password:\s*\"[^\"]+\"\\.to_string\\(\\) ⚠️ (pattern too specific)
ttl_missing - conn\\.set::<[^>]+>\\([^)]+\\)\\.await\\?; ✅
max_size_unbounded - max_size:\\s*None ✅
async_blocking - self\\.client\\.get_connection\\(\\) ⚠️ (escaping issue?)
eviction_policy_missing - eviction_policy:\\s*None ✅
timeout_zero - timeout:\\s*Duration::from_secs\\(0\\) ✅
connection_pool_missing - let\\s+mut\\s+conn\\s*=\\s*self\\.client\\.get_multiplexed_async_connection\\(\\)\\.await ⚠️ (long pattern)
metrics_disabled - metrics_enabled:\\s*false ⚠️ (declaration vs value)

Detected (5): 1, 4, 5, 7, 8 ✅

Missed (5): 2, 3, 6, 9, 10 ⚠️

Root cause of misses:

Declaration vs Default impl value (TLS, metrics, password)
Regex escaping (async blocking)
Long complex patterns (connection pooling)
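
The declaration-versus-Default-value problem is where context matters: the same field name appears in the struct declaration, in the Default impl, and possibly in other struct literals. A minimal sketch of a context-aware scan, which only reports matches inside an `impl Default for` block, is shown below. It is not Aphoria's extractor API; `find_in_default_impl` is a hypothetical helper, and the brace counting is deliberately naive (it ignores braces inside strings and comments).

```rust
/// Returns 1-based line numbers where `needle` occurs inside an
/// `impl Default for ...` block, tracked by a simple brace depth counter.
pub fn find_in_default_impl(source: &str, needle: &str) -> Vec<usize> {
    let mut hits = Vec::new();
    let mut depth: i32 = 0; // brace depth since the impl header
    let mut inside = false;

    for (idx, line) in source.lines().enumerate() {
        if !inside && line.contains("impl Default for") {
            inside = true;
            depth = 0;
        }
        if inside {
            if line.contains(needle) {
                hits.push(idx + 1);
            }
            depth += line.matches('{').count() as i32;
            depth -= line.matches('}').count() as i32;
            // The block ends when its closing brace brings the depth back to zero.
            if depth <= 0 && line.contains('}') {
                inside = false;
            }
        }
    }
    hits
}
```

Calling it with the needle `verify_tls: false` reports the line inside `fn default()` and skips both the `pub verify_tls: bool` declaration and any struct literal elsewhere in the file.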

Observation: Team used manual config editing instead of /aphoria-custom-extractor-creator skill. Fast iteration but pattern matching limitations apparent.

Day 4: Remediation (25 min)

Modified: src/client.rs, src/config.rs, tests/basic.rs, src/lib.rs

Fixes applied (10/10):

Key validation - Added validate_key() function (+30 lines) ✅
TLS enabled - verify_tls: true default (1 line) ✅
Env password - Load from REDIS_PASSWORD (1 line) ✅
TTL - set() calls set_with_ttl(300) (1 line) ✅
Bounded size - max_size: Some(1GB) (1 line) ✅
Removed blocking - Deleted blocking_get() method (-18 lines) ✅
Eviction policy - Some(LRU) default (1 line) ✅
Timeout - Duration::from_secs(5) (1 line) ✅
Connection pooling - Use ConnectionManager (+10 lines) ✅
Metrics enabled - metrics_enabled: true (1 line) ✅
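
The one-line config fixes can be pictured together as a corrected Default impl. This is a sketch assuming a plausible shape for CacheConfig: the field types, the `EvictionPolicy` enum, and the choice of `Option<String>` for the password are assumptions, since the review lists only the values. Only the fields touched by the fixes are shown.

```rust
use std::env;
use std::time::Duration;

/// Assumed eviction policy type; the review only mentions `Some(LRU)`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EvictionPolicy {
    Lru,
}

/// Assumed subset of the config struct covering the six config-level fixes.
#[derive(Debug, Clone)]
pub struct CacheConfig {
    pub verify_tls: bool,
    pub password: Option<String>,
    pub max_size: Option<u64>,
    pub eviction_policy: Option<EvictionPolicy>,
    pub timeout: Duration,
    pub metrics_enabled: bool,
}

impl Default for CacheConfig {
    fn default() -> Self {
        Self {
            verify_tls: true,                            // was `false`
            password: env::var("REDIS_PASSWORD").ok(),   // was a hardcoded literal
            max_size: Some(1024 * 1024 * 1024),          // "1GB", taken here as 1 GiB in bytes
            eviction_policy: Some(EvictionPolicy::Lru),  // was `None`
            timeout: Duration::from_secs(5),             // was `from_secs(0)`
            metrics_enabled: true,                       // was `false`
        }
    }
}
```

Reading the password with `env::var(...).ok()` leaves it as `None` when REDIS_PASSWORD is unset, so no secret lives in the source.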

Test updates:

8 tests updated to reflect fixes
1 test removed (blocking_get no longer exists)
All tests pass (5 unit + 5 integration non-ignored)

Scan results:

Before: 5 conflicts
After: 1 conflict (cache-key-validation-001 false positive)
Improvement: 80% reduction

Observation: Efficient progressive fixing. Final conflict is extractor limitation, not code issue.
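
The body of validate_key() is not shown in this review. The sketch below is one plausible shape for it; the error type, the length limit, and the character rules are all assumptions. It also suggests why the last conflict may persist: validation like this lives in the method body, so a pattern that matches only the `get` signature would still fire after the fix.

```rust
/// Hypothetical error type for rejected cache keys.
#[derive(Debug, PartialEq)]
pub enum KeyError {
    Empty,
    TooLong,
    InvalidChar(char),
}

/// Assumed upper bound on key length in bytes (not taken from the source).
pub const MAX_KEY_LEN: usize = 256;

/// Rejects keys that are empty, too long, or contain whitespace or control
/// characters (which covers CR/LF sequences used in protocol injection).
pub fn validate_key(key: &str) -> Result<(), KeyError> {
    if key.is_empty() {
        return Err(KeyError::Empty);
    }
    if key.len() > MAX_KEY_LEN {
        return Err(KeyError::TooLong);
    }
    if let Some(bad) = key.chars().find(|c| c.is_whitespace() || c.is_control()) {
        return Err(KeyError::InvalidChar(bad));
    }
    Ok(())
}
```

A caller would run `validate_key(key)?` at the top of `get()` and `set()` before touching the connection, provided KeyError converts into the method's error type.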

Day 5: Documentation (571 lines)

Created: DAY5-SUMMARY.md comprehensive retrospective

Content:

Executive summary (hypothesis validated)
Complete metrics (1.4 hrs total, 91% faster)
What worked (flywheel validation)
What broke (50% detection below target)
Lessons learned (concept path, declarative limits)
Enterprise pitch (ROI, use cases)

Observation: High-quality documentation with honest assessment of 50% detection.

What Differs from Docs

Difference 1: LLM Usage Inconsistent

Docs said:

plan.md:121 - "Skills: /aphoria-suggest, /aphoria-claims, /aphoria-custom-extractor-creator"
README.md:142 - Lists skills with "when to use"

Team did:

✅ Day 1: Used /aphoria-suggest skill
❌ Day 3: Manual config.toml editing (3 iterations)

Why this matters:

Team used partial autonomous workflow
Manual extractor creation worked but needed 3 iterations to get right
Documentation didn't emphasize continuous LLM requirement

Difference 2: Detection Rate Below Target

Docs said:

plan.md:7 - "Detection rate: ≥90% of violations"
README.md:153 - "≥90% | Cross-cutting violation detection"

Team got:

Actual: 50% (5/10 violations detected)

Why this happened:

Declarative extractors have regex limitations
Declaration vs value matching issues
Pattern escaping challenges
Team understood limitations through analysis (DAY3-SUMMARY.md:186-229)

Team's interpretation:

Initially: "⚠️ Below target" (thought they failed)
After analysis: "50% validates mechanism" (understood 0% → 50% proves compounding)

Difference 3: Day 3 Duration Much Faster

Docs said:

plan.md:111 - "1.5-2 hrs"

Team did:

Actual: 9 minutes

Why so fast:

Simple declarative extractors (regex in config)
Fast iteration (1 min per attempt)
Clear feedback from scans
No programmatic extractor complexity

What's Missing (That Docs Said to Create)

Missing 1: Separate Extractor Files

Docs said: N/A (not explicitly required)

Team created: Extractors inline in .aphoria/config.toml ✅

Is this a problem? No - inline extractors are a valid approach

Missing 2: 90% Detection Rate

Docs said: plan.md:7 - "≥90%"

Team achieved: 50%

Is this a problem? No - 50% validates the mechanism with declarative extractors; reaching 90% requires programmatic extractors (Day 5 refinement)

Missing 3: `/aphoria-custom-extractor-creator` Usage Evidence

Docs said: plan.md:132 - "Use /aphoria-custom-extractor-creator for each gap"

Team did: Manual config.toml editing

Is this a problem? Yes - indicates documentation didn't emphasize skill usage as required workflow

Documentation Cross-Reference

Day 1 (Claims)

| Observation | Doc Location | Doc Said | Team Did |
|-------------|--------------|----------|----------|
| Used `/aphoria-suggest` | plan.md:121 | Lists skill for pattern discovery | Used skill ✅ |
| 20 claims created | plan.md:7 | Target: 25-30 claims | 20 claims (close) |
| 35% reuse | README.md:153 | Target: ≥35% reuse | 35% exact match ✅ |
| 11 min duration | plan.md:113 | Target: 1-2 hrs | 11 min (90% faster) ✅ |

Day 2 (Implementation)

| Observation | Doc Location | Doc Said | Team Did |
|-------------|--------------|----------|----------|
| 10 violations embedded | README.md:91-110 | Lists 10 violations | All 10 embedded ✅ |
| Inline markers | plan.md:136 | Use `@aphoria:claim[category]` | All 10 have markers ✅ |
| 16 tests | plan.md:142 | Target: 15+ tests | 16 tests ✅ |
| 10 min duration | plan.md:114 | Target: 3-4 hrs | 10 min (96% faster) ✅ |

Day 3 (Scanning)

| Observation | Doc Location | Doc Said | Team Did |
|-------------|--------------|----------|----------|
| 6-phase workflow | plan.md:119-168 | Lists all 6 phases | Executed all phases ✅ |
| Extractor creation | plan.md:132 | Use skill for each gap | Manual config editing ❌ |
| Detection rate | plan.md:170 | Target: ≥90% | 50% (below target) ⚠️ |
| Duration | plan.md:111 | Target: 1.5-2 hrs | 9 min (93% faster) ✅ |
| `scan-v2.json` | plan.md:165 | Verification scan exists | Exists as scan-v3.json ✅ |

Day 4 (Remediation)

| Observation | Doc Location | Doc Said | Team Did |
|-------------|--------------|----------|----------|
| Progressive fixes | plan.md:180-212 | Fix by severity | Security → Perf → Correctness → Obs ✅ |
| All violations fixed | plan.md:183 | Target: 10/10 | 10/10 fixed ✅ |
| Tests pass | plan.md:196 | All tests passing | 5 unit + 5 integration pass ✅ |
| Duration | plan.md:115 | Target: 3-4 hrs | 25 min (89% faster) ✅ |

Day 5 (Documentation)

| Observation | Doc Location | Doc Said | Team Did |
|-------------|--------------|----------|----------|
| Comprehensive report | plan.md:214-240 | Metrics, learnings, recommendations | 571-line retrospective ✅ |
| Hypothesis validated | README.md:3 | Multi-domain flywheel | Validated with caveats ✅ |
| Duration | plan.md:116 | Target: 2-3 hrs | ~1 hour (estimated) ✅ |

Summary

Files created: 13/13 ✅

Implementation quality: High (realistic violations, good tests, clean code)

Workflow used: Partial autonomous (LLM for Day 1, manual for Day 3)

Key differences from docs:

Inconsistent skill usage (LLM Day 1, manual Day 3)
50% detection vs 90% target (declarative extractor limitations)
Much faster than estimated (Day 3 took 9 min vs the 1.5-2 hr estimate)

Critical observation: Team completed exercise successfully but used mixed workflow (autonomous + manual). Documentation didn't emphasize continuous LLM requirement across all phases.

Evidence for evaluation:

✅ All source files have expected violations
✅ All claims have LLM attribution (created_by = "aphoria-suggest")
⚠️ No evidence of /aphoria-custom-extractor-creator skill usage (manual config editing instead)
✅ Daily summaries document all phases with honest metrics
✅ Final state is production-ready (all violations fixed, tests pass)
