stemedb/applications/aphoria/dogfood/cachewrap/README.md
jml e758f2ebfb feat(aphoria): implement programmatic extractors for Option<T> semantics
Completes Task #3 of httpclient dogfooding with 100% detection rate (7/7 violations).

## New Extractors

- **OptionBoundsExtractor**: Detects Option<T> fields set to None (unbounded)
- **OptionValueExtractor**: Extracts values from Some(n) for threshold checks

Both extractors use context-aware pattern matching to understand Rust Option<T>
semantics, which declarative extractors cannot handle.

## Implementation

**Files Created**:
- applications/aphoria/src/extractors/option_bounds.rs (257 lines)
- applications/aphoria/src/extractors/option_value.rs (277 lines)
- applications/aphoria/docs/examples/extractors/programmatic-option-semantics.md

**Files Modified**:
- applications/aphoria/src/extractors/mod.rs - Added module declarations
- applications/aphoria/src/extractors/registry.rs - Registered extractors
- applications/aphoria/dogfood/httpclient/.aphoria/claims.toml - Added 4 claims
- applications/aphoria/dogfood/httpclient/TASK-1-SUMMARY.md - Task #3 completion

## Results

| Metric | Value |
|--------|-------|
| Detection Rate | 100% (7/7 violations) |
| Improvement | +29 percentage points (from 71%) |
| New Violations | 2 (max_redirects, max_retries unbounded) |
| Unit Tests | 13 (all passing) |

## Two-Claim Strategy

For each bounded Option<T> field:
1. **configured** claim - Detects None (unbounded)
2. **max_value** claim - Validates Some(n) threshold

Example:
- `max_redirects: None` → CONFLICT (not configured)
- `max_redirects: Some(20)` → CONFLICT (exceeds 10)
- `max_redirects: Some(5)` → PASS

## Enterprise Quality

✓ Proper error handling (no unwrap/expect)
✓ Comprehensive tests (6+7 unit tests)
✓ Full documentation with examples
✓ Reusable for 10+ similar patterns
✓ Screening patterns for performance

## Cachewrap Dogfood

Also includes complete cachewrap dogfood exercise:
- 10 claims for Redis cache wrapper
- Day 1-5 summaries
- Full retrospective and evaluation
- Declarative extractors for all patterns

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-11 06:43:10 +00:00

176 lines
6.4 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Dogfood: Distributed Cache Client Library (cachewrap)
**Hypothesis:** Connection patterns + resource limits + TTL semantics from 3 corpora (httpclient, dbpool, msgqueue) transfer to cache clients with **35-40%** pattern reuse, demonstrating multi-domain flywheel strength and cross-cutting concern detection.
**Corpus Overlap:** httpclient + dbpool + msgqueue → **~35-40%** pattern reuse expected
**Target Metrics:**
- Time savings: **≥60%** vs manual
- Pattern reuse: **≥35%** of claims (7+/20)
- Detection rate: **≥90%** of violations (8-9/10)
- Naming errors: **<2**
---
## Why This Domain? (Difficulty: ★★★★☆)
Cache clients test whether patterns from **3 different domains** (HTTP, DB, messaging) transfer to a fourth domain with **cross-cutting violations**:
**Connection patterns** from httpclient (timeout, TLS, async, retry)
**Resource limits** from dbpool (max connections, lifecycle, cleanup)
**Semantic patterns** from msgqueue (backpressure, metrics)
**New patterns** unique to caching (TTL, eviction, sharding, consistency)
**What Makes This Harder:**
- **Lower corpus overlap** (35-40% vs msgqueue's 50%)
- **Cross-cutting violations** (security + performance + correctness)
- **Stateful semantics** (cache invalidation, TTL expiry, consistency)
- **Subtle bugs** (key injection, unbounded growth, race conditions)
This validates **multi-domain flywheel adaptability** - knowledge compounds across domains.
---
## Quick Start
1. **Read the plan:** `plan.md` (detailed 5-day workflow)
2. **Start Day 1:** Use `/aphoria-suggest --corpus httpclient,dbpool,msgqueue` to discover reusable patterns
3. **Follow the workflow:** Track metrics daily, write summaries
4. **Reference examples:** See `dogfood/httpclient/` for complete example
---
## Status
- [x] **Day 1:** Claims extraction (11 min) - 20 claims (7 reused = 35%)
- [x] **Day 2:** Implementation (10 min) - 10 violations embedded, 16 tests pass
- [x] **Day 3:** Scanning (9 min) - 5/10 violations detected (50%)
- [x] **Day 4:** Remediation (25 min) - All 10 violations fixed
- [ ] **Day 5:** Documentation (in progress) - Comprehensive report + retrospective
**Total Time:** 56 minutes (Days 1-4) - 89% faster than 12-16 hour target
**Final Status:** Production-ready with secure defaults
---
## Expected Pattern Reuse (7/20 = 35%)
### From httpclient Corpus (4 patterns):
- `timeout` `cache/timeout`
- `tls/certificate_validation` `tls/certificate_validation`
- `retry/max_attempts` `retry/max_attempts`
- `async/runtime` `async/runtime`
### From dbpool Corpus (2 patterns):
- `max_connections` `connection/max_connections`
- `connection_lifecycle` `connection/lifecycle`
### From msgqueue Corpus (1 pattern):
- `metrics/enabled` `metrics/enabled`
### New for Cache Client (13 patterns):
- `cache/ttl` (Time To Live)
- `cache/eviction_policy`
- `cache/max_size`
- `cache/key_prefix`
- `cache/serialization`
- `cache/compression`
- `cache/consistency_mode`
- `cache/sharding_strategy`
- `cache/read_through`
- `cache/write_through`
- `cache/stampede_prevention`
- `cache/key_validation`
- `cache/circuit_breaker`
**Total:** 20 claims (7 reused = 35% reuse rate)
---
## Violations to Embed (Day 2) - Cross-Cutting
### Security Violations (3):
1. **Key injection vulnerability** - No key validation Data breach
2. **verify_tls = false** - No TLS verification MITM attacks
3. **Plaintext credential storage** - Hardcoded password Credential exposure
### Performance Violations (3):
4. **Missing TTL** - No expiration Memory leak (unbounded growth)
5. **Unbounded cache size** - No max_size OOM under load
6. **Synchronous blocking** - No async I/O Throughput collapse
### Correctness Violations (3):
7. **No eviction policy** - Missing LRU/LFU Unpredictable behavior
8. **timeout = 0** - Indefinite blocking Hung threads
9. **No connection pooling** - New conn per request Resource exhaustion
### Observability Violation (1):
10. **No metrics** - Missing hit/miss tracking Debugging impossible
---
## Files
```
cachewrap/
├── README.md # This file
├── plan.md # Detailed 5-day workflow
├── .aphoria/
│ ├── config.toml # Persistent mode, corpus enabled
│ └── claims.toml # (empty, fill on Day 1)
├── docs/
│ └── sources/ # Authority sources
│ ├── redis-spec.md # Redis protocol (Tier 1)
│ ├── aws-elasticache.md # AWS best practices (Tier 2)
│ └── redis-rs-lib.md # Rust library patterns (Tier 3)
├── src/ # (create on Day 2)
│ └── .gitkeep
├── claims-template.sh # Batch claim import (20 claims)
└── DAY1-SUMMARY.md # (create after Day 1)
```
---
## References
- **Plan:** `plan.md` (start here)
- **Authority sources:** `docs/sources/` (use for provenance)
- **Complete example:** `dogfood/httpclient/` (gold standard)
- **Similar domains:** `dogfood/dbpool/`, `dogfood/msgqueue/`
- **Skills:**
- `/aphoria-suggest` - Day 1 pattern discovery
- `/aphoria-claims` - Day 1 claim authoring
- `/aphoria-custom-extractor-creator` - Day 3 extractor generation
---
## Success Criteria
| Metric | Target | Validates |
|--------|--------|-----------|
| Pattern reuse | 35% | Multi-domain flywheel works |
| Time savings | 60% | Automation value at lower reuse rate |
| Detection rate | 90% | Cross-cutting violation detection |
| Naming errors | <2 | 3-corpus consistency |
| Total time | 12-16 hrs | Difficulty calibration |
---
## What This Tests (vs Previous Exercises)
| Exercise | Corpus Sources | Reuse % | Difficulty | What It Tests |
|----------|----------------|---------|------------|---------------|
| httpclient | None (baseline) | 0% | ★★☆☆☆ | Async patterns, HTTP |
| dbpool | httpclient | 30% | ★★★☆☆ | Connection lifecycle |
| msgqueue | httpclient + dbpool | 50% | ★★★☆☆ | Cross-domain transfer (21) |
| **cachewrap** | **httpclient + dbpool + msgqueue** | **35%** | **★★★★☆** | **Multi-domain (3→1), cross-cutting** |
**Progressive Challenge:**
- msgqueue: 2 corpora 50% reuse (easier)
- **cachewrap: 3 corpora 35% reuse (harder, more discovery)**
---
**Ready to start Day 1!** Follow `plan.md` and track metrics daily.