jml e758f2ebfb feat(aphoria): implement programmatic extractors for Option<T> semantics

Completes Task #3 of httpclient dogfooding with 100% detection rate (7/7 violations).

## New Extractors

- **OptionBoundsExtractor**: Detects Option<T> fields set to None (unbounded)
- **OptionValueExtractor**: Extracts values from Some(n) for threshold checks

Both extractors use context-aware pattern matching to understand Rust Option<T>
semantics, which declarative extractors cannot handle.
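As a rough illustration of the kind of matching involved, the sketch below scans a single line of source text for a field assigned `None` or `Some(<integer>)`. It is not the code in `option_bounds.rs` or `option_value.rs`; the names `OptionFinding` and `extract_option_field` are hypothetical, and a real extractor would likely need to handle multi-line expressions and non-literal values.

```rust
/// What a `field: <Option expr>` assignment looks like in source text.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub enum OptionFinding {
    /// `field: None`, meaning the limit is unbounded.
    Unbounded,
    /// `field: Some(n)`, meaning the limit is set to `n`.
    Bounded(u64),
}

/// Scan one line of Rust source for `<field>: None` or `<field>: Some(<int>)`.
/// Returns `None` when the line does not assign exactly this field, or when
/// the value is not a plain integer literal.
pub fn extract_option_field(line: &str, field: &str) -> Option<OptionFinding> {
    // Cheap screening check before doing any parsing work.
    if !line.contains(field) {
        return None;
    }
    // The field name must be followed (after optional spaces) by a colon,
    // so `max_redirects_total` does not match `max_redirects`.
    let rest = line.trim().strip_prefix(field)?.trim_start();
    let value = rest.strip_prefix(':')?.trim().trim_end_matches(',').trim();
    if value == "None" {
        return Some(OptionFinding::Unbounded);
    }
    let inner = value.strip_prefix("Some(")?.strip_suffix(')')?;
    // Allow digit separators such as `1_000`.
    inner
        .trim()
        .replace('_', "")
        .parse::<u64>()
        .ok()
        .map(OptionFinding::Bounded)
}
```

The early `contains` check stands in for the screening patterns mentioned under Enterprise Quality: most lines are rejected before any parsing happens.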

## Implementation

**Files Created**:
- applications/aphoria/src/extractors/option_bounds.rs (257 lines)
- applications/aphoria/src/extractors/option_value.rs (277 lines)
- applications/aphoria/docs/examples/extractors/programmatic-option-semantics.md

**Files Modified**:
- applications/aphoria/src/extractors/mod.rs - Added module declarations
- applications/aphoria/src/extractors/registry.rs - Registered extractors
- applications/aphoria/dogfood/httpclient/.aphoria/claims.toml - Added 4 claims
- applications/aphoria/dogfood/httpclient/TASK-1-SUMMARY.md - Task #3 completion

## Results

| Metric | Value |
|--------|-------|
| Detection Rate | 100% (7/7 violations) |
| Improvement | +29 percentage points (from 71%) |
| New Violations | 2 (max_redirects, max_retries unbounded) |
| Unit Tests | 13 (all passing) |

## Two-Claim Strategy

For each bounded Option<T> field:
1. **configured** claim - Detects None (unbounded)
2. **max_value** claim - Validates Some(n) threshold

Example:
- `max_redirects: None` → CONFLICT (not configured)
- `max_redirects: Some(20)` → CONFLICT (exceeds 10)
- `max_redirects: Some(5)` → PASS
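
A minimal sketch of how the two claims combine for one field follows. The `Verdict` type and `check_bounded_option` function are hypothetical names used only for illustration, and the sketch assumes the threshold is inclusive (a value equal to the maximum passes), which the examples above do not state.

```rust
/// Outcome of checking one field against both claims.
/// (Hypothetical type, not part of aphoria.)
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Pass,
    Conflict(String),
}

/// Apply the `configured` claim first, then the `max_value` claim.
/// Assumption: `max_allowed` is an inclusive upper bound.
pub fn check_bounded_option(field: &str, value: Option<u32>, max_allowed: u32) -> Verdict {
    match value {
        // `configured` claim: None means the limit is unbounded.
        None => Verdict::Conflict(format!("{}: not configured", field)),
        // `max_value` claim: Some(n) must not exceed the threshold.
        Some(n) if n > max_allowed => {
            Verdict::Conflict(format!("{}: {} exceeds {}", field, n, max_allowed))
        }
        Some(_) => Verdict::Pass,
    }
}
```

With a threshold of 10, `check_bounded_option("max_redirects", Some(20), 10)` yields a conflict and `Some(5)` passes, matching the three cases listed above.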

## Enterprise Quality

✓ Proper error handling (no unwrap/expect)
✓ Comprehensive tests (6+7 unit tests)
✓ Full documentation with examples
✓ Reusable for 10+ similar patterns
✓ Screening patterns for performance

## Cachewrap Dogfood

Also includes the complete cachewrap dogfood exercise:
- 10 claims for Redis cache wrapper
- Day 1-5 summaries
- Full retrospective and evaluation
- Declarative extractors for all patterns

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-11 06:43:10 +00:00

Dogfood: Distributed Cache Client Library (cachewrap)

Hypothesis: Connection patterns + resource limits + TTL semantics from 3 corpora (httpclient, dbpool, msgqueue) transfer to cache clients with 35-40% pattern reuse, demonstrating multi-domain flywheel strength and cross-cutting concern detection.

Corpus Overlap: httpclient + dbpool + msgqueue → ~35-40% pattern reuse expected

Target Metrics:

Time savings: ≥60% vs manual
Pattern reuse: ≥35% of claims (7+/20)
Detection rate: ≥90% of violations (9+/10)
Naming errors: <2

Why This Domain? (Difficulty: ★★★★☆)

Cache clients test whether patterns from 3 different domains (HTTP, DB, messaging) transfer to a fourth domain with cross-cutting violations:

✅ Connection patterns from httpclient (timeout, TLS, async, retry)
✅ Resource limits from dbpool (max connections, lifecycle, cleanup)
✅ Semantic patterns from msgqueue (backpressure, metrics)
✅ New patterns unique to caching (TTL, eviction, sharding, consistency)

What Makes This Harder:

Lower corpus overlap (35-40% vs msgqueue's 50%)
Cross-cutting violations (security + performance + correctness)
Stateful semantics (cache invalidation, TTL expiry, consistency)
Subtle bugs (key injection, unbounded growth, race conditions)

This validates multi-domain flywheel adaptability - knowledge compounds across domains.

Quick Start

Read the plan: plan.md (detailed 5-day workflow)
Start Day 1: Use /aphoria-suggest --corpus httpclient,dbpool,msgqueue to discover reusable patterns
Follow the workflow: Track metrics daily, write summaries
Reference examples: See dogfood/httpclient/ for complete example

Status

Day 1: Claims extraction (11 min) - ✅ 20 claims (7 reused = 35%)
Day 2: Implementation (10 min) - ✅ 10 violations embedded, 16 tests pass
Day 3: Scanning (9 min) - ⚠️ 5/10 violations detected (50%)
Day 4: Remediation (25 min) - ✅ All 10 violations fixed
Day 5: Documentation (in progress) - Comprehensive report + retrospective

Total Time: 56 minutes (Days 1-4) - 89% faster than 12-16 hour target

Final Status: ✅ Production-ready with secure defaults

Expected Pattern Reuse (7/20 = 35%)

From httpclient Corpus (4 patterns):

timeout → cache/timeout
tls/certificate_validation → tls/certificate_validation
retry/max_attempts → retry/max_attempts
async/runtime → async/runtime

From dbpool Corpus (2 patterns):

max_connections → connection/max_connections
connection_lifecycle → connection/lifecycle

From msgqueue Corpus (1 pattern):

metrics/enabled → metrics/enabled

New for Cache Client (13 patterns):

cache/ttl (Time To Live)
cache/eviction_policy
cache/max_size
cache/key_prefix
cache/serialization
cache/compression
cache/consistency_mode
cache/sharding_strategy
cache/read_through
cache/write_through
cache/stampede_prevention
cache/key_validation
cache/circuit_breaker

Total: 20 claims (7 reused = 35% reuse rate)

Violations to Embed (Day 2) - Cross-Cutting

Security Violations (3):

❌ Key injection vulnerability - No key validation → Data breach
❌ verify_tls = false - No TLS verification → MITM attacks
❌ Plaintext credential storage - Hardcoded password → Credential exposure

Performance Violations (3):

❌ Missing TTL - No expiration → Memory leak (unbounded growth)
❌ Unbounded cache size - No max_size → OOM under load
❌ Synchronous blocking - No async I/O → Throughput collapse

Correctness Violations (3):

❌ No eviction policy - Missing LRU/LFU → Unpredictable behavior
❌ timeout = 0 - Indefinite blocking → Hung threads
❌ No connection pooling - New conn per request → Resource exhaustion

Observability Violation (1):

⚠️ No metrics - Missing hit/miss tracking → Debugging impossible
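
To make a few of these violations concrete, here is a hedged sketch of a configuration check. The `CacheConfig` struct and its field names are assumptions made for illustration and are not taken from the cachewrap source; only the claim names (`tls/certificate_validation`, `cache/timeout`, `cache/ttl`, `cache/max_size`, `metrics/enabled`) come from this README. It covers only the violations that are visible in configuration values.

```rust
use std::time::Duration;

/// Hypothetical cache client configuration (field names are assumptions).
pub struct CacheConfig {
    pub verify_tls: bool,
    pub timeout: Duration,
    pub default_ttl: Option<Duration>,
    pub max_size: Option<usize>,
    pub metrics_enabled: bool,
}

/// Return the claim names that this configuration violates.
pub fn find_violations(cfg: &CacheConfig) -> Vec<&'static str> {
    let mut violations = Vec::new();
    if !cfg.verify_tls {
        violations.push("tls/certificate_validation"); // MITM risk
    }
    if cfg.timeout.is_zero() {
        violations.push("cache/timeout"); // timeout = 0 blocks indefinitely
    }
    if cfg.default_ttl.is_none() {
        violations.push("cache/ttl"); // entries never expire
    }
    if cfg.max_size.is_none() {
        violations.push("cache/max_size"); // unbounded growth
    }
    if !cfg.metrics_enabled {
        violations.push("metrics/enabled"); // no hit/miss tracking
    }
    violations
}
```

Violations such as key injection, hardcoded credentials, or synchronous blocking live in code paths rather than configuration values, so a check of this shape would not catch them.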

Files

cachewrap/
├── README.md                    # This file
├── plan.md                      # Detailed 5-day workflow
├── .aphoria/
│   ├── config.toml              # Persistent mode, corpus enabled
│   └── claims.toml              # (empty, fill on Day 1)
├── docs/
│   └── sources/                 # Authority sources
│       ├── redis-spec.md        # Redis protocol (Tier 1)
│       ├── aws-elasticache.md   # AWS best practices (Tier 2)
│       └── redis-rs-lib.md      # Rust library patterns (Tier 3)
├── src/                         # (create on Day 2)
│   └── .gitkeep
├── claims-template.sh           # Batch claim import (20 claims)
└── DAY1-SUMMARY.md              # (create after Day 1)

References

Plan: plan.md (start here)
Authority sources: docs/sources/ (use for provenance)
Complete example: dogfood/httpclient/ (gold standard)
Similar domains: dogfood/dbpool/, dogfood/msgqueue/
Skills:
- /aphoria-suggest - Day 1 pattern discovery
- /aphoria-claims - Day 1 claim authoring
- /aphoria-custom-extractor-creator - Day 3 extractor generation

Success Criteria

| Metric | Target | Validates |
|--------|--------|-----------|
| Pattern reuse | ≥35% | Multi-domain flywheel works |
| Time savings | ≥60% | Automation value at lower reuse rate |
| Detection rate | ≥90% | Cross-cutting violation detection |
| Naming errors | <2 | 3-corpus consistency |
| Total time | 12-16 hrs | Difficulty calibration |
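
The percentage-based criteria can be checked mechanically. The sketch below is one way to do it; the `Metrics` struct and function names are hypothetical, and the time-based rows are left out because they depend on a manual baseline that this README does not record.

```rust
/// Raw counts collected during the exercise (hypothetical struct).
pub struct Metrics {
    pub reused_claims: u32,
    pub total_claims: u32,
    pub detected: u32,
    pub embedded: u32,
    pub naming_errors: u32,
}

/// Percentage of `part` in `whole`; returns 0.0 for an empty denominator.
pub fn percent(part: u32, whole: u32) -> f64 {
    if whole == 0 {
        0.0
    } else {
        100.0 * f64::from(part) / f64::from(whole)
    }
}

/// Compare the counts against the targets in the Success Criteria table.
/// Each entry is (metric name, whether the target is met).
pub fn meets_targets(m: &Metrics) -> Vec<(&'static str, bool)> {
    vec![
        ("Pattern reuse", percent(m.reused_claims, m.total_claims) >= 35.0),
        ("Detection rate", percent(m.detected, m.embedded) >= 90.0),
        ("Naming errors", m.naming_errors < 2),
    ]
}
```

With the figures from the Status section, 7 of 20 claims reused meets the reuse target, while 5 of 10 violations detected on Day 3 falls short of the 90% detection target.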

What This Tests (vs Previous Exercises)

| Exercise | Corpus Sources | Reuse % | Difficulty | What It Tests |
|----------|----------------|---------|------------|---------------|
| httpclient | None (baseline) | 0% | ★★☆☆☆ | Async patterns, HTTP |
| dbpool | httpclient | 30% | ★★★☆☆ | Connection lifecycle |
| msgqueue | httpclient + dbpool | 50% | ★★★☆☆ | Cross-domain transfer (2→1) |
| cachewrap | httpclient + dbpool + msgqueue | 35% | ★★★★☆ | Multi-domain (3→1), cross-cutting |

Progressive Challenge:

msgqueue: 2 corpora → 50% reuse (easier)
cachewrap: 3 corpora → 35% reuse (harder, more discovery)

Ready to start Day 1! Follow plan.md and track metrics daily.
