stemedb/ai-lookup/features/phase7-uat.md
jordan a734be3a0d feat: Phase 7 Content Defense + code structure refactoring
Content Defense (Phase 7):
- Add SimilarityIndex with MinHash/LSH for near-duplicate detection
- Add QuarantineStore for flagged assertions awaiting admin review
- Add CircuitBreakerStore for per-agent circuit breaker state
- Add ContentDefenseLayer for ingestion pipeline integration
- Add API endpoints for quarantine and circuit breaker management
- Add research module with gap detection and documentation fetching

Code Structure Improvements:
- Extract research CLI commands to research_commands.rs
- Extract API routers to routers.rs module
- Extract key_codec extraction functions to separate module
- Extract test modules to separate files across multiple crates
- All files now under 500 line limit per pre-commit hook

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 12:44:05 -07:00

202 lines
6.8 KiB
Markdown

# Phase 7 UAT: The Shield
**Status:** Ready for Testing
**Target Date:** 2026-02-03
**Confidence:** High (7A, 7B complete; 7C core complete)
## Summary
Phase 7 (The Shield) defends against spam, Sybil attacks, and knowledge poisoning. This UAT validates the trust-at-scale infrastructure for opening Episteme to millions of agents.
**Scope:**
- 7A Admission Control: PoW-based spam protection, trust tiers, graduated quotas
- 7B EigenTrust: Sybil-resistant global trust propagation
- 7C Content Defense: Quality scoring, quarantine store, admin API (partial - MinHash/LSH pending)
- 7D Circuit Breakers: NOT included (pending implementation)
## Test Coverage (Verified)
| Area | Tests | Status |
|------|-------|--------|
| Trust Graph Store | 23 | PASS |
| Trust Rank Store | 22 | PASS |
| Domain Trust Store | 18 | PASS |
| Admission Store | 16 | PASS |
| PoW types | 19 | PASS |
| Content Defense (quality) | 13 | PASS |
| Quarantine Store | 9 | PASS |
| Trust Tier types | 8 | PASS |
| API Admission integration | 6 | PASS |
| Content Defense Layer | 5 | PASS |
| **Total Phase 7** | **139** | **ALL PASS** |
## Realistic Usage Scenarios
### Scenario 1: New Agent Onboarding
**Goal:** Verify graduated difficulty protects against spam bots while not blocking legitimate agents.
```bash
# 1. New agent with no history should require PoW
curl -X GET http://localhost:3000/v1/admission/status \
-H "X-Agent-Id: 0000000000000000000000000000000000000000000000000000000000000001"
# Expected: 200 with pow_required: true, difficulty: 16
# 2. Submit first assertions with PoW proof
# Agent must solve: BLAKE3(nonce || agent_id || timestamp) has 16 leading zero bits
# This takes ~16 seconds on average
# 3. After 10 assertions, difficulty drops to 1 bit (trivial)
# 4. After 50 assertions OR trust > 0.6, PoW exempt
```
**Acceptance Criteria:**
- [ ] New agents see `pow_required: true`, `difficulty: 16`
- [ ] HTTP 428 returned when PoW missing/invalid
- [ ] Difficulty graduates: 16 bits (1-10) → 1 bit (11-50) → 0 (51+)
- [ ] Trusted agents (>0.6) are exempt regardless of assertion count
### Scenario 2: Trust Tier Quotas
**Goal:** Verify rate limiting scales with trust level.
| Tier | Trust Range | Quota Multiplier | Hourly Limit |
|------|-------------|------------------|--------------|
| Untrusted | 0.0-0.3 | 0.1x | 1,000/hr |
| Limited | 0.3-0.5 | 0.5x | 5,000/hr |
| Verified | 0.5-0.7 | 1.0x | 10,000/hr |
| Trusted | 0.7-0.9 | 2.0x | 20,000/hr |
| Authority | 0.9-1.0 | 10.0x | 100,000/hr |
**Acceptance Criteria:**
- [ ] Quota headers present in responses (`X-RateLimit-*`)
- [ ] Untrusted agents limited to 0.1x base quota
- [ ] Authority agents get 10x quota
- [ ] HTTP 429 returned when quota exceeded
### Scenario 3: EigenTrust Sybil Resistance
**Goal:** Verify isolated trust rings get near-zero global trust.
```
Legitimate Network: Sybil Ring:
Seed ─────> A X ──> Y
│ │ │ │
v v v v
B ──────> C Z <── W
```
**Acceptance Criteria:**
- [ ] Seed-connected agents (A, B, C) accumulate positive global trust
- [ ] Isolated ring (X, Y, Z, W) converges to near-zero trust
- [ ] Power iteration converges in <100 iterations (ε = 1e-4)
- [ ] Domain-specific trust factors applied correctly
### Scenario 4: Content Quality Filtering
**Goal:** Verify spam/noise detection without blocking legitimate content.
| Content Type | Expected Quality | Should Quarantine? |
|--------------|------------------|-------------------|
| Normal assertion: "Aspirin:treats:Headache" | >0.6 | No |
| Low entropy: "aaaa:bbbb:cccc" | <0.4 | Yes |
| Structured data with JSON | >0.7 (bonus) | No |
| Untrusted agent + high confidence | <0.5 (penalty) | Yes |
**Acceptance Criteria:**
- [ ] Shannon entropy check flags random noise (< 1.5 bits/char)
- [ ] Minimum subject/predicate length enforced (default 3 chars)
- [ ] Structured data (JSON, URLs, dates) gets +0.1 bonus
- [ ] Untrusted + high confidence gets -0.5 penalty
- [ ] Quality < 0.4 triggers quarantine
### Scenario 5: Quarantine Admin Workflow
**Goal:** Verify suspicious content can be reviewed and processed.
```bash
# 1. List pending quarantine events
curl http://localhost:3000/v1/admin/quarantine?limit=20
# 2. Review specific event
curl http://localhost:3000/v1/admin/quarantine/{hash}
# 3. Approve or reject
curl -X POST http://localhost:3000/v1/admin/quarantine/{hash}/approve
curl -X POST http://localhost:3000/v1/admin/quarantine/{hash}/reject
```
**Acceptance Criteria:**
- [ ] `GET /v1/admin/quarantine` lists pending events with reasons
- [ ] `GET /v1/admin/quarantine/{hash}` returns full assertion bytes
- [ ] `POST .../approve` moves assertion to main index
- [ ] `POST .../reject` marks as reviewed but keeps quarantined
- [ ] Quarantine reasons clearly indicate why flagged
## Integration Points to Verify
1. **Ingestion Pipeline Integration**
- Content defense layer called before indexing
- Quarantine bypasses normal index path
- Bloom filter restored on restart
2. **Trust Store Interplay**
- EigenTrust feeds into TrustTier calculation
- Domain trust factors into Authority lens weights
- Trust decay applies to computed scores
3. **API Middleware Chain**
- AdmissionLayer checks PoW before rate limiting
- MeterLayer applies tier-based quotas
- Headers reflect current trust state
## Known Limitations
1. **7C Incomplete:** MinHash/LSH bucketing not implemented
- Duplicate detection uses Bloom filter only (no near-duplicate)
- Jaccard similarity threshold (0.9) not yet enforced
2. **7D Not Started:** Circuit breakers pending
- No automatic agent banning
- No half-open recovery states
3. **Performance Untested:**
- EigenTrust computation on large graphs (>10k agents)
- Bloom filter memory at scale
- Quarantine store scan performance
## Commands to Run
```bash
# Full test suite
cargo test --workspace
# Phase 7 specific crates
cargo test -p stemedb-storage -- trust_graph
cargo test -p stemedb-storage -- domain_trust
cargo test -p stemedb-storage -- admission
cargo test -p stemedb-storage -- quarantine
cargo test -p stemedb-storage -- content_defense
cargo test -p stemedb-ingest -- content_defense
cargo test -p stemedb-api --test admission_integration
cargo test -p stemedb-core -- trust_tier
cargo test -p stemedb-core -- pow
# Clippy must pass
cargo clippy --workspace -- -D warnings
# Go SDK examples
cd sdk/go && go test ./...
```
## Success Criteria
**Phase 7 UAT passes when:**
1. All ~139 Phase 7 tests pass
2. All 5 usage scenarios verified manually
3. Clippy clean with no warnings
4. Go SDK examples pass
5. API endpoints return correct responses
6. Quarantine workflow complete end-to-end
## Related Documentation
- [Admission Control API](./admission-control.md)
- [Phase 6 UAT](./phase6-uat.md)
- [Roadmap Phase 7](../../roadmap.md#phase-7-the-shield-trust-at-scale)