jml e758f2ebfb feat(aphoria): implement programmatic extractors for Option<T> semantics

Completes Task #3 of httpclient dogfooding with 100% detection rate (7/7 violations).

## New Extractors

- **OptionBoundsExtractor**: Detects Option<T> fields set to None (unbounded)
- **OptionValueExtractor**: Extracts values from Some(n) for threshold checks

Both extractors use context-aware pattern matching to understand Rust Option<T>
semantics, which declarative extractors cannot handle.
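As a rough illustration of what "context-aware" means here, the sketch below reads one struct-literal line and separates an unset field from a concrete value. It is not the code in option_bounds.rs or option_value.rs; the names `OptionLiteral` and `parse_option_field` are hypothetical, and it handles only integer literals on a single line.

```rust
/// What a struct-literal line assigns to an `Option<T>` field.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
enum OptionLiteral {
    /// The field was written as `None` (unbounded).
    Unset,
    /// The field was written as `Some(n)` with an integer literal.
    Value(u64),
}

/// Parses a line such as `max_redirects: Some(20),` into (field, literal).
/// Returns `None` when the line is not a `None` / `Some(<integer>)` assignment.
fn parse_option_field(line: &str) -> Option<(String, OptionLiteral)> {
    // Drop a trailing `// comment` and the trailing comma.
    let code = line.split("//").next()?.trim().trim_end_matches(',');
    let (name, value) = code.split_once(':')?;
    let (name, value) = (name.trim(), value.trim());

    // A field name is a single identifier; this rejects `pub max_size`,
    // which is a declaration rather than an assignment.
    if name.is_empty() || !name.chars().all(|c| c.is_alphanumeric() || c == '_') {
        return None;
    }
    if value == "None" {
        return Some((name.to_string(), OptionLiteral::Unset));
    }
    let inner = value.strip_prefix("Some(")?.strip_suffix(')')?;
    // Accept digit separators such as `10_000`.
    let n = inner.trim().replace('_', "").parse::<u64>().ok()?;
    Some((name.to_string(), OptionLiteral::Value(n)))
}
```

A plain "field equals value" pattern has no natural way to say "the field is present but carries no bound", which is the case the `Unset` variant captures.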

## Implementation

**Files Created**:
- applications/aphoria/src/extractors/option_bounds.rs (257 lines)
- applications/aphoria/src/extractors/option_value.rs (277 lines)
- applications/aphoria/docs/examples/extractors/programmatic-option-semantics.md

**Files Modified**:
- applications/aphoria/src/extractors/mod.rs - Added module declarations
- applications/aphoria/src/extractors/registry.rs - Registered extractors
- applications/aphoria/dogfood/httpclient/.aphoria/claims.toml - Added 4 claims
- applications/aphoria/dogfood/httpclient/TASK-1-SUMMARY.md - Task #3 completion

## Results

| Metric | Value |
|--------|-------|
| Detection Rate | 100% (7/7 violations) |
| Improvement | +29 percentage points (from 71%) |
| New Violations | 2 (max_redirects, max_retries unbounded) |
| Unit Tests | 13 (all passing) |

## Two-Claim Strategy

For each bounded Option<T> field:
1. **configured** claim - Detects None (unbounded)
2. **max_value** claim - Validates Some(n) threshold

Example:
- `max_redirects: None` → CONFLICT (not configured)
- `max_redirects: Some(20)` → CONFLICT (exceeds 10)
- `max_redirects: Some(5)` → PASS
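
The three outcomes above can be sketched as two independent checks run in order. This is a minimal illustration, assuming a limit of 10 as in the example; `Verdict`, `check_configured`, `check_max_value`, and `check_bounded` are hypothetical names, not Aphoria APIs.

```rust
/// Outcome of evaluating a claim against an extracted value.
#[derive(Debug, PartialEq)]
enum Verdict {
    Pass,
    Conflict(String),
}

/// "configured" claim: the field must not be `None`.
fn check_configured(value: Option<u32>) -> Verdict {
    match value {
        Some(_) => Verdict::Pass,
        None => Verdict::Conflict("not configured".to_string()),
    }
}

/// "max_value" claim: a configured value must not exceed `limit`.
/// `None` passes here because the configured claim already covers it.
fn check_max_value(value: Option<u32>, limit: u32) -> Verdict {
    match value {
        Some(n) if n > limit => Verdict::Conflict(format!("exceeds {limit}")),
        _ => Verdict::Pass,
    }
}

/// Runs both claims and reports the first conflict, if any.
fn check_bounded(value: Option<u32>, limit: u32) -> Verdict {
    match check_configured(value) {
        Verdict::Pass => check_max_value(value, limit),
        conflict => conflict,
    }
}
```

Keeping the claims separate means a `None` field is reported as "not configured" rather than being treated as within bounds.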

## Enterprise Quality

✓ Proper error handling (no unwrap/expect)
✓ Comprehensive tests (6+7 unit tests)
✓ Full documentation with examples
✓ Reusable for 10+ similar patterns
✓ Screening patterns for performance

## Cachewrap Dogfood

Also includes complete cachewrap dogfood exercise:
- 10 claims for Redis cache wrapper
- Day 1-5 summaries
- Full retrospective and evaluation
- Declarative extractors for all patterns

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-11 06:43:10 +00:00


Day 2 Summary: Implementation

- Date: 2026-02-11
- Duration: 10 minutes 26 seconds (0.17 hours)
- Start Time: 04:01:30
- End Time: 04:11:56

Metrics

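| Metric | Target | Actual | Delta | Status |
|--------|--------|--------|-------|--------|
| Total Time | 3-4 hrs | 0.17 hrs | -3.83 hrs | ✅ 96% faster |
| Violations Embedded | 10 | 10 | 0 | ✅ |
| Inline Markers | 10 | 10 | 0 | ✅ |
| Tests Created | 15+ | 16 | +1 | ✅ |
| Tests Passing | All | All (9/9) | 0 | ✅ |
| Code Compiles | Yes | Yes | — | ✅ |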

Note: 16 total tests = 3 library tests + 13 integration tests (6 non-ignored + 7 ignored)

Project Structure

cachewrap/
├── Cargo.toml              # Dependencies: redis, tokio, serde
├── src/
│   ├── lib.rs              # Library root (145 lines) - docs all 10 violations
│   ├── error.rs            # Error types (52 lines)
│   ├── config.rs           # Config + violations 2,3,5,7,8,10 (124 lines)
│   └── client.rs           # Client + violations 1,4,6,9 (157 lines)
└── tests/
    └── basic.rs            # Integration tests (202 lines)

Total: 680 lines of Rust code (src/ and tests/, excluding Cargo.toml)

10 Embedded Violations

Security Violations (3):

1. Key Injection Vulnerability (`client.rs:27`)

// @aphoria:claim[security] Cache keys MUST be validated -- unvalidated keys enable injection attacks
pub async fn get(&self, key: &str) -> Result<Option<String>> {
    // ❌ No validation of key - enables injection attacks
    let value: Option<String> = conn.get(key).await?;

- Location: src/client.rs:27-45
- Claim: cache-key-validation-001
- What's wrong: Accepts user input as Redis key without validation (control chars, length, special chars)
- Consequence: Attacker controls cache keys → data breach, cache poisoning
- Marker present: ✅
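
For contrast, here is a minimal sketch of the validation this violation leaves out, covering the three checks named above (control chars, length, special chars). It is not the Day 4 fix itself: `validate_key`, `KeyError`, the 256-byte limit, and the allowed character set are all assumptions chosen for illustration.

```rust
/// Reasons a cache key is rejected. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
enum KeyError {
    Empty,
    TooLong(usize),
    ControlChar,
    DisallowedChar(char),
}

/// Assumed upper bound on key length in bytes; not taken from cachewrap.
const MAX_KEY_LEN: usize = 256;

/// Returns the key unchanged if it is safe to send to Redis.
fn validate_key(key: &str) -> Result<&str, KeyError> {
    if key.is_empty() {
        return Err(KeyError::Empty);
    }
    if key.len() > MAX_KEY_LEN {
        return Err(KeyError::TooLong(key.len()));
    }
    for c in key.chars() {
        // CR/LF and other control characters are reported separately.
        if c.is_control() {
            return Err(KeyError::ControlChar);
        }
        // Assumed allow-list: ASCII alphanumerics plus common separators.
        if !(c.is_ascii_alphanumeric() || matches!(c, ':' | '_' | '-' | '.')) {
            return Err(KeyError::DisallowedChar(c));
        }
    }
    Ok(key)
}
```

A corrected `get` would call a check like this before touching the connection and map `KeyError` into the library's error type.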

2. TLS Verification Disabled (`config.rs:23`)

// @aphoria:claim[security] TLS certificate validation MUST be enabled -- disabled TLS enables MITM attacks
pub verify_tls: bool,  // Default: false

- Location: src/config.rs:23-25
- Claim: cache-tls-validation-001
- What's wrong: verify_tls: false in default config
- Consequence: MITM attacks intercept cache traffic, credential theft
- Marker present: ✅

3. Hardcoded Credentials (`config.rs:18`)

// @aphoria:claim[security] Credentials MUST NOT be hardcoded -- hardcoded passwords leak in VCS
pub password: String,  // Default: "secret123"

- Location: src/config.rs:18-21
- Claim: cache-hardcoded-password-001
- What's wrong: password: "secret123".to_string() in default config
- Consequence: Credentials in version control, cannot rotate without code changes
- Marker present: ✅

Performance Violations (3):

4. Missing TTL (`client.rs:56`)

// @aphoria:claim[safety] TTL MUST be set for cached values -- missing TTL causes memory leak
pub async fn set(&self, key: &str, value: &str) -> Result<()> {
    // ❌ Using SET without EX/PX (no TTL)
    conn.set::<_, _, ()>(key, value).await?;

- Location: src/client.rs:56-69
- Claim: cache-ttl-required-001
- What's wrong: Uses SET command without EX or PX (no expiration)
- Consequence: Memory leak - unbounded cache growth leads to OOM
- Marker present: ✅

5. Unbounded Cache Size (`config.rs:32`)

// @aphoria:claim[safety] Cache MUST have max_size limit -- unbounded cache causes OOM
pub max_size: Option<usize>,  // Default: None

- Location: src/config.rs:32-34
- Claim: cache-max-size-001
- What's wrong: max_size: None in default config
- Consequence: OOM under sustained load
- Marker present: ✅

6. Synchronous Blocking (`client.rs:105`)

// @aphoria:claim[performance] Cache I/O MUST be async -- synchronous blocking kills throughput
pub fn blocking_get(&self, key: &str) -> Result<Option<String>> {
    // ❌ Using blocking connection in async context
    let mut conn = self.client.get_connection()...

- Location: src/client.rs:105-120
- Claim: cache-async-blocking-001
- What's wrong: Blocking Redis call in what could be async context
- Consequence: Blocks event loop, throughput degrades to <10 ops/sec
- Marker present: ✅

Correctness Violations (3):

7. No Eviction Policy (`config.rs:37`)

// @aphoria:claim[correctness] Eviction policy MUST be configured -- missing policy causes undefined behavior
pub eviction_policy: Option<EvictionPolicy>,  // Default: None

- Location: src/config.rs:37-39
- Claim: cache-eviction-policy-001
- What's wrong: eviction_policy: None in default config
- Consequence: Unpredictable behavior when cache is full
- Marker present: ✅

8. Zero Timeout (`config.rs:27`)

// @aphoria:claim[safety] Timeout MUST be > 0 -- timeout=0 causes indefinite blocking
pub timeout: Duration,  // Default: Duration::from_secs(0)

- Location: src/config.rs:27-29
- Claim: cache-timeout-001
- What's wrong: timeout: Duration::from_secs(0) (indefinite)
- Consequence: Indefinite blocking → hung threads
- Marker present: ✅

9. No Connection Pooling (`client.rs:30`)

// @aphoria:claim[performance] Connection pooling MUST be enabled -- no pooling exhausts resources
pub async fn get(&self, key: &str) -> Result<Option<String>> {
    // ❌ Creating a new connection for EVERY request
    let mut conn = self.client.get_multiplexed_async_connection().await...

- Location: src/client.rs:30-32 (repeated in set, delete)
- Claim: cache-max-connections-001
- What's wrong: New connection created per operation instead of pool
- Consequence: Resource exhaustion - connection churn under load
- Marker present: ✅

Observability Violation (1):

10. No Metrics (`config.rs:42`)

// @aphoria:claim[observability] Metrics MUST track hit/miss rates -- no metrics prevents debugging
pub metrics_enabled: bool,  // Default: false

- Location: src/config.rs:42-44
- Claim: cache-metrics-enabled-001
- What's wrong: metrics_enabled: false in default config
- Consequence: Cannot debug cache effectiveness in production
- Marker present: ✅

Test Coverage

Library Tests (3 tests, all passing):

test_config_default - Verifies default config has all violations
test_config_builder - Verifies builder pattern can fix violations
test_eviction_policy_variants - Verifies eviction policy enum

Coverage: Config construction, builder pattern, enum equality

Integration Tests (13 tests):

Non-Ignored (6 tests, all passing):

test_config_creation - Basic config instantiation
test_config_builder_pattern - Builder with all fields set
test_client_creation - Client instantiation succeeds despite violations
test_config_default_violations - Explicit violation checks
test_config_fixes_violations - Verifies builder can fix all violations
test_eviction_policy_equality - Eviction policy comparisons

Coverage: Config API, client creation, violation detection
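
The config-level checks behind test_config_default_violations and test_config_fixes_violations are not reproduced in this summary. The sketch below shows one way they could look, using the field names and defaults listed under the six config.rs violations. The `CacheConfig` layout, the `EvictionPolicy` variant, and the `config_violations` helper are assumptions made for illustration.

```rust
use std::time::Duration;

/// Only one variant is shown; the real enum's variants are not listed here.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EvictionPolicy {
    Lru,
}

/// Hypothetical mirror of the config struct, limited to the violating fields.
struct CacheConfig {
    password: String,
    verify_tls: bool,
    timeout: Duration,
    max_size: Option<usize>,
    eviction_policy: Option<EvictionPolicy>,
    metrics_enabled: bool,
}

impl Default for CacheConfig {
    /// Defaults as documented for violations 2, 3, 5, 7, 8 and 10.
    fn default() -> Self {
        Self {
            password: "secret123".to_string(),
            verify_tls: false,
            timeout: Duration::from_secs(0),
            max_size: None,
            eviction_policy: None,
            metrics_enabled: false,
        }
    }
}

/// Returns the claim IDs that a given config violates.
fn config_violations(cfg: &CacheConfig) -> Vec<&'static str> {
    let mut found = Vec::new();
    if !cfg.verify_tls {
        found.push("cache-tls-validation-001");
    }
    // A runtime check can only compare against the known default value.
    if cfg.password == "secret123" {
        found.push("cache-hardcoded-password-001");
    }
    if cfg.max_size.is_none() {
        found.push("cache-max-size-001");
    }
    if cfg.eviction_policy.is_none() {
        found.push("cache-eviction-policy-001");
    }
    if cfg.timeout.is_zero() {
        found.push("cache-timeout-001");
    }
    if !cfg.metrics_enabled {
        found.push("cache-metrics-enabled-001");
    }
    found
}
```

With this shape, the default config reports all six claim IDs and a fully populated config reports none, which matches the before/after pair of tests described above.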

Ignored (7 tests, require running Redis):

test_health_check - PING command
test_set_and_get - Basic cache operations (with violations)
test_set_with_ttl - Correct version with TTL
test_delete - Delete operation
test_get_nonexistent_key - Handle missing keys
test_typed_get_set - Serialization/deserialization
test_blocking_get - Blocking method (violation 6)

Coverage: Full CRUD operations, serialization, health checks

- Total Tests: 16 (3 lib + 13 integration)
- Passing: 9 (all non-ignored)
- Ignored: 7 (require Redis instance)

Violation-to-Test Mapping

| Violation | Test Coverage |
|-----------|---------------|
| 1. Key injection | `test_set_and_get`, `test_delete` (violations exercised, not detected yet) |
| 2. TLS disabled | `test_config_default_violations`, `test_config_fixes_violations` |
| 3. Hardcoded password | `test_config_default_violations`, `test_config_fixes_violations` |
| 4. Missing TTL | `test_set_and_get` (violation), `test_set_with_ttl` (correct) |
| 5. Unbounded size | `test_config_default_violations`, `test_config_fixes_violations` |
| 6. Sync blocking | `test_blocking_get` |
| 7. No eviction | `test_config_default_violations`, `test_config_fixes_violations` |
| 8. Zero timeout | `test_config_default_violations`, `test_config_fixes_violations` |
| 9. No pooling | `test_set_and_get`, `test_delete` (violations exercised) |
| 10. No metrics | `test_config_default_violations`, `test_config_fixes_violations` |

All 10 violations have test coverage. Tests pass despite violations because violations are configuration/usage issues, not logic errors.

Code Quality

Compilation:

✅ cargo check passes
✅ No clippy warnings (apart from a future-incompatibility notice from a dependency)
✅ All type annotations explicit

Error Handling:

✅ All methods return Result<T, CacheError>
✅ No unwrap() or expect() in production code
✅ Errors propagated with ? operator

Documentation:

✅ Library-level doc comment lists all 10 violations
✅ Each violation has inline @aphoria:claim marker
✅ Correct versions documented (for Day 4 fixes)

What Worked

✅ Rapid Implementation

10 minutes for the full library (vs 3-4 hour target). Approximate per-step times:

Cargo project setup: 1 min
Error types: 1 min
Config with 6 violations: 2 min
Client with 4 violations: 3 min
Library docs: 2 min
Tests: 2 min
Compilation fixes: 1 min

Efficiency drivers:

Simple scope (cache wrapper, not production library)
Clear violation list from Day 1 claims
Inline markers during implementation (not retrofitted)
Tests written for violations, not comprehensive coverage

✅ Inline Marker Pattern

Embedding @aphoria:claim markers during implementation (not after) proved valuable:

Natural documentation - explains WHY code is wrong
Day 3 ready - markers will be scanned automatically
Review clarity - violations self-documenting
No retrofitting - faster than adding markers post-hoc

Example:

// @aphoria:claim[security] Cache keys MUST be validated -- unvalidated keys enable injection attacks
pub async fn get(&self, key: &str) -> Result<Option<String>> {
    // ❌ No validation - enables injection attacks
    let value: Option<String> = conn.get(key).await?;

✅ Test-Driven Violations

Writing tests that exercise violations (rather than detect them) validated the approach:

Tests pass ✓ (violations are config issues, not logic bugs)
Tests document expected behavior ✓
Tests provide baseline for Day 4 fixes ✓
Tests include both violation and correct versions ✓

Example:

#[tokio::test]
async fn test_set_and_get() {
    // ⚠️ Uses violating methods (no TTL, no key validation)
    client.set("test_key", "test_value").await;  // Violation 4
    client.get("test_key").await;                // Violation 1
}

#[tokio::test]
async fn test_set_with_ttl() {
    // ✅ Uses correct method (with TTL)
    client.set_with_ttl("key", "value", 10).await;  // Correct
}

✅ Realistic Violations

All 10 violations are realistic mistakes developers make:

| Violation | Realism | Why it happens |
|-----------|---------|----------------|
| Key injection | ⭐⭐⭐⭐⭐ | "It's just a cache, validation overhead not worth it" |
| TLS disabled | ⭐⭐⭐⭐ | "Development mode, will fix later" (never does) |
| Hardcoded password | ⭐⭐⭐⭐⭐ | "Quick prototype" → ships to prod |
| Missing TTL | ⭐⭐⭐⭐⭐ | "Optional parameter, forget to set it" |
| Unbounded size | ⭐⭐⭐⭐ | "Redis maxmemory handles it" (wrong layer) |
| Sync blocking | ⭐⭐⭐ | "Mixed sync/async code, forgot context" |
| No eviction | ⭐⭐⭐⭐ | "Default works fine until it doesn't" |
| Zero timeout | ⭐⭐⭐⭐ | "0 = infinite, sounds safe" (backwards) |
| No pooling | ⭐⭐⭐ | "Connection management is hard, punt" |
| No metrics | ⭐⭐⭐⭐⭐ | "Add later when needed" (too late then) |

These are copy-paste errors, incomplete refactors, and "TODO: fix later" notes that ship anyway.

What Could Be Better

⚠️ Missing Cross-Cutting Violations

Some violations from the plan weren't as natural in a simple cache client:

Sharding strategy - requires multi-node setup
Read-through/write-through - requires backend integration
Stampede prevention - requires concurrent load scenario
Compression - requires large value logic

Impact: Lower than expected violation complexity (10 config issues vs mix of config + algorithmic)

Mitigation: Day 3 will test if extractors can detect config violations effectively

⚠️ Integration Tests Require Redis

7/13 integration tests are ignored (require running Redis instance):

Pro: Validates library works in reality
Con: CI setup requires Redis service
Mitigation: Non-ignored tests cover critical paths (config, client creation)

Time Breakdown

| Phase | Target | Actual | Delta | Notes |
|-------|--------|--------|-------|-------|
| Project structure | 30 min | 1 min | -29 min | `cargo init --lib` |
| Happy path implementation | 90 min | 6 min | -84 min | Simple scope |
| Embed violations | 60 min | 3 min | -57 min | Inline during impl |
| Add tests | 30 min | 2 min | -28 min | 16 tests total |
| Document violations | 10 min | 2 min | -8 min | Lib.rs doc comment |
| Total | 220 min | 10 min | -210 min | 96% faster |

Why so fast?

Simple scope - cache wrapper, not production library
Clear spec - 10 violations from Day 1 claims
No over-engineering - violations first, features later
Inline markers - documented during impl, not retrofitted
Minimal tests - exercise violations, not comprehensive coverage

Violations Documentation

In-Code Documentation

1. Library-level (src/lib.rs lines 1-64):

//! ## ⚠️ INTENTIONAL VIOLATIONS (Dogfooding Exercise)
//!
//! ### Security Violations (3):
//! 1. **Key injection vulnerability** - No key validation → Data breach
//! 2. **TLS verification disabled** - No cert validation → MITM attacks
//! 3. **Hardcoded credentials** - Plaintext in source → Credential exposure
//! ...

2. Inline markers (10 total):

// @aphoria:claim[category] invariant -- consequence

3. Comment blocks explaining violations:

// ❌ VIOLATION X: Description
// What's wrong, why it's bad, how to fix
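
The inline marker format in item 2 is regular enough to split mechanically. Below is a minimal sketch of such a parser, assuming the marker always sits in a `//` line comment and uses ` -- ` as the separator. `ClaimMarker` and `parse_claim_marker` are hypothetical names; how Aphoria itself scans markers is not shown here.

```rust
/// The three parts of an `@aphoria:claim` marker.
#[derive(Debug, PartialEq)]
struct ClaimMarker {
    category: String,
    invariant: String,
    consequence: String,
}

/// Parses `// @aphoria:claim[category] invariant -- consequence`.
/// Returns `None` for any other line, including ordinary comments.
fn parse_claim_marker(line: &str) -> Option<ClaimMarker> {
    let rest = line.trim_start().strip_prefix("//")?.trim_start();
    let rest = rest.strip_prefix("@aphoria:claim[")?;
    let (category, rest) = rest.split_once(']')?;
    let (invariant, consequence) = rest.split_once(" -- ")?;
    Some(ClaimMarker {
        category: category.trim().to_string(),
        invariant: invariant.trim().to_string(),
        consequence: consequence.trim().to_string(),
    })
}
```

Applied to the zero-timeout marker, this yields category `safety`, invariant `Timeout MUST be > 0`, and consequence `timeout=0 causes indefinite blocking`.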

Artifacts Created

| File | Lines | Purpose | Status |
|------|-------|---------|--------|
| `Cargo.toml` | 18 | Dependencies, workspace config | ✅ |
| `src/lib.rs` | 145 | Library root, violation docs | ✅ |
| `src/error.rs` | 52 | Error types | ✅ |
| `src/config.rs` | 124 | Config + 6 violations | ✅ |
| `src/client.rs` | 157 | Client + 4 violations | ✅ |
| `tests/basic.rs` | 202 | Integration tests | ✅ |
| Total | 698 lines | — | ✅ |

Next Steps

✅ Day 2 Complete

Rust library created with redis/tokio/serde
10 violations embedded with inline markers
16 tests created (9 passing, 7 require Redis)
Code compiles cleanly
All violations documented

→ Day 3: Scanning (Next)

Goal: Detect 9/10 violations (≥90%) via aphoria scan + create extractors

Process (6 phases):

Pre-flight: Verify skill available, markers present, code compiles
Baseline scan: aphoria scan > scan-v1.json (expect low detection rate)
Gap analysis: Identify which violations are MISSING
Extractor creation: Use /aphoria-custom-extractor-creator for each gap
Verification scan: aphoria scan > scan-v2.json (expect ≥90%)
Documentation: DAY3-SUMMARY.md with detection rate improvement
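
The gap-analysis and verification steps reduce to a set difference against the ten claim IDs listed earlier. The sketch below assumes the detected claim IDs have already been pulled out of the scan output as plain strings. The real layout of scan-v1.json is not described in this summary, so no parsing is attempted, and `GapReport` and `gap_report` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// The ten claim IDs embedded on Day 2.
const EXPECTED_CLAIMS: [&str; 10] = [
    "cache-key-validation-001",
    "cache-tls-validation-001",
    "cache-hardcoded-password-001",
    "cache-ttl-required-001",
    "cache-max-size-001",
    "cache-async-blocking-001",
    "cache-eviction-policy-001",
    "cache-timeout-001",
    "cache-max-connections-001",
    "cache-metrics-enabled-001",
];

/// Result of comparing a scan against the expected claims.
struct GapReport {
    detected: usize,
    missing: Vec<&'static str>,
}

impl GapReport {
    /// Fraction of expected claims that the scan reported.
    fn detection_rate(&self) -> f64 {
        self.detected as f64 / EXPECTED_CLAIMS.len() as f64
    }

    /// True when at least 90% of expected claims were detected.
    /// Integer arithmetic avoids floating-point edge cases at exactly 90%.
    fn meets_target(&self) -> bool {
        self.detected * 10 >= EXPECTED_CLAIMS.len() * 9
    }
}

/// Computes which expected claims are absent from `detected_ids`.
/// IDs that are not in `EXPECTED_CLAIMS` are ignored.
fn gap_report(detected_ids: &[&str]) -> GapReport {
    let detected: BTreeSet<&str> = detected_ids.iter().copied().collect();
    let missing: Vec<&'static str> = EXPECTED_CLAIMS
        .iter()
        .copied()
        .filter(|id| !detected.contains(*id))
        .collect();
    GapReport {
        detected: EXPECTED_CLAIMS.len() - missing.len(),
        missing,
    }
}
```

The `missing` list is what would drive the extractor creation phase: one extractor per claim ID that the baseline scan did not report.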

Expected Duration: 1.5-2 hours (includes extractor creation)

Critical: Day 3 Phase 4 (extractor creation) is REQUIRED for flywheel validation.

Validation Checklist

All 10 violations embedded
All 10 inline markers present (grep -r "@aphoria:claim" src/ | wc -l → 10)
Code compiles (cargo check passes)
Tests pass (9/9 non-ignored tests)
Violations documented (lib.rs + inline comments)
Realistic mistakes (all violations are common patterns)
Time ≤ 4 hours (actual: 0.17 hours, 96% faster)

Lessons Learned

1. Inline Markers During Implementation

Adding @aphoria:claim markers while writing violations is faster than retrofitting:

No need to re-read code later
Natural documentation of intent
Violations self-explanatory

Pattern to repeat: Always add inline markers immediately when introducing intentional violations.

2. Simple Scope Enables Speed

Implementing a minimal cache wrapper (vs full production library) enabled:

10 minutes vs 4 hours (96% faster)
Focus on violations, not features
Easier to understand for Day 3 scanning

Pattern to repeat: Dogfooding should use simple, focused scope - just enough to embed violations.

3. Tests Exercise Violations, Don't Detect

Tests that use violating methods (and pass) validate the approach:

Violations are config issues, not logic bugs ✓
Tests provide baseline for Day 4 fixes ✓
Tests document both violation and correct patterns ✓

Pattern to repeat: Write tests that exercise violations; detection comes from the Aphoria scan.

Day 2 Status: ✅ COMPLETE

Ready for Day 3: ✅ Yes - 10 violations embedded, code compiles, tests pass, inline markers present
