stemedb/ai-lookup/features/phase7-uat.md
jordan a734be3a0d feat: Phase 7 Content Defense + code structure refactoring
2026-02-03 12:44:05 -07:00


Phase 7 UAT: The Shield

Status: Ready for Testing
Target Date: 2026-02-03
Confidence: High (7A, 7B complete; 7C core complete)

Summary

Phase 7 (The Shield) defends against spam, Sybil attacks, and knowledge poisoning. This UAT validates the trust-at-scale infrastructure for opening Episteme to millions of agents.

Scope:

  • 7A Admission Control: PoW-based spam protection, trust tiers, graduated quotas
  • 7B EigenTrust: Sybil-resistant global trust propagation
  • 7C Content Defense: Quality scoring, quarantine store, admin API (partial - MinHash/LSH pending)
  • 7D Circuit Breakers: NOT included (pending implementation)

Test Coverage (Verified)

| Area | Tests | Status |
|---|---|---|
| Trust Graph Store | 23 | PASS |
| Trust Rank Store | 22 | PASS |
| Domain Trust Store | 18 | PASS |
| Admission Store | 16 | PASS |
| PoW types | 19 | PASS |
| Content Defense (quality) | 13 | PASS |
| Quarantine Store | 9 | PASS |
| Trust Tier types | 8 | PASS |
| API Admission integration | 6 | PASS |
| Content Defense Layer | 5 | PASS |
| Total Phase 7 | 139 | ALL PASS |

Realistic Usage Scenarios

Scenario 1: New Agent Onboarding

Goal: Verify graduated difficulty protects against spam bots while not blocking legitimate agents.

# 1. New agent with no history should require PoW
curl -X GET http://localhost:3000/v1/admission/status \
  -H "X-Agent-Id: 0000000000000000000000000000000000000000000000000000000000000001"
# Expected: 200 with pow_required: true, difficulty: 16

# 2. Submit first assertions with PoW proof
# Agent must find a nonce such that BLAKE3(nonce || agent_id || timestamp) has 16 leading zero bits
# Expected work is 2^16 = 65,536 hash attempts; estimated at ~16 seconds on average

# 3. After 10 assertions, difficulty drops to 1 bit (trivial)
# 4. After 50 assertions OR trust > 0.6, PoW exempt

Acceptance Criteria:

  • New agents see pow_required: true, difficulty: 16
  • HTTP 428 returned when PoW missing/invalid
  • Difficulty graduates: 16 bits (1-10) → 1 bit (11-50) → 0 (51+)
  • Agents with a trust score above 0.6 are exempt regardless of assertion count
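
The graduated schedule above can be sketched in a few lines of Rust. This is a minimal illustration rather than the project's actual code: the function names (`required_difficulty`, `leading_zero_bits`, `meets_difficulty`) are hypothetical, and the BLAKE3 hashing step is left out, so the digest is taken as already-computed bytes.

```rust
/// Hypothetical helper: PoW difficulty, in leading zero bits, for the next
/// assertion. `prior_assertions` is how many assertions the agent has already
/// submitted, so assertions 1-10 map to prior counts 0-9.
fn required_difficulty(prior_assertions: u32, trust: f64) -> u8 {
    if trust > 0.6 || prior_assertions >= 50 {
        0 // exempt: trusted agent, or assertion 51 and later
    } else if prior_assertions >= 10 {
        1 // assertions 11-50
    } else {
        16 // assertions 1-10
    }
}

/// Counts the leading zero bits of a digest, most significant byte first.
fn leading_zero_bits(digest: &[u8]) -> u32 {
    let mut bits = 0;
    for &byte in digest {
        if byte == 0 {
            bits += 8;
        } else {
            bits += byte.leading_zeros();
            break;
        }
    }
    bits
}

/// True when the digest satisfies the requested difficulty.
fn meets_difficulty(digest: &[u8], difficulty: u8) -> bool {
    leading_zero_bits(digest) >= u32::from(difficulty)
}
```

A solver would loop over nonces, hash `nonce || agent_id || timestamp`, and stop at the first digest for which `meets_difficulty` returns true.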

Scenario 2: Trust Tier Quotas

Goal: Verify rate limiting scales with trust level.

| Tier | Trust Range | Quota Multiplier | Hourly Limit |
|---|---|---|---|
| Untrusted | 0.0-0.3 | 0.1x | 1,000/hr |
| Limited | 0.3-0.5 | 0.5x | 5,000/hr |
| Verified | 0.5-0.7 | 1.0x | 10,000/hr |
| Trusted | 0.7-0.9 | 2.0x | 20,000/hr |
| Authority | 0.9-1.0 | 10.0x | 100,000/hr |

Acceptance Criteria:

  • Quota headers present in responses (X-RateLimit-*)
  • Untrusted agents limited to 0.1x base quota
  • Authority agents get 10x quota
  • HTTP 429 returned when quota exceeded
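
As a rough sketch of how the tier table could map a trust score to an hourly limit, consider the Rust below. Two things are assumptions, not taken from the codebase: the base quota of 10,000/hr (implied by the Verified row at 1.0x) and the treatment of each range's lower bound as inclusive, since adjacent ranges in the table share a boundary value. The `TrustTier` definition here is illustrative only.

```rust
/// Illustrative tier type; the real `TrustTier` may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TrustTier {
    Untrusted,
    Limited,
    Verified,
    Trusted,
    Authority,
}

/// Assumed base quota, implied by the Verified row (1.0x = 10,000/hr).
const BASE_HOURLY_QUOTA: u64 = 10_000;

impl TrustTier {
    /// Maps a trust score in [0.0, 1.0] to a tier.
    /// Assumption: lower bounds are inclusive (0.3 is Limited, not Untrusted).
    fn from_score(score: f64) -> TrustTier {
        if score >= 0.9 {
            TrustTier::Authority
        } else if score >= 0.7 {
            TrustTier::Trusted
        } else if score >= 0.5 {
            TrustTier::Verified
        } else if score >= 0.3 {
            TrustTier::Limited
        } else {
            TrustTier::Untrusted
        }
    }

    /// Hourly limit for the tier. The multiplier is kept in tenths so the
    /// arithmetic stays in integers (0.1x is 1 tenth, 10.0x is 100 tenths).
    fn hourly_limit(self) -> u64 {
        let multiplier_tenths = match self {
            TrustTier::Untrusted => 1,
            TrustTier::Limited => 5,
            TrustTier::Verified => 10,
            TrustTier::Trusted => 20,
            TrustTier::Authority => 100,
        };
        BASE_HOURLY_QUOTA * multiplier_tenths / 10
    }
}
```

Keeping the multiplier in tenths avoids floating-point rounding when computing the 0.1x tier.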

Scenario 3: EigenTrust Sybil Resistance

Goal: Verify isolated trust rings get near-zero global trust.

Legitimate Network:          Sybil Ring:
   Seed ─────> A              X ──> Y
     │         │              │     │
     v         v              v     v
     B ──────> C              Z <── W

Acceptance Criteria:

  • Seed-connected agents (A, B, C) accumulate positive global trust
  • Isolated ring (X, Y, Z, W) converges to near-zero trust
  • Power iteration converges in <100 iterations (ε = 1e-4)
  • Domain-specific trust factors applied correctly
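
To see why the isolated ring ends up with no global trust, here is a small power-iteration sketch in the style of EigenTrust. It is not the project's implementation: the function name `eigentrust`, the uniform edge weights, the damping factor `alpha`, and the rule that agents with no outgoing edges hand their trust back to the seed distribution are all assumptions made for illustration. Domain-specific trust factors are not modeled.

```rust
/// Power iteration over a directed trust graph.
/// Returns the trust vector and the number of iterations used.
/// Assumptions: each agent splits its trust evenly across its out-edges,
/// and agents without out-edges return their trust to the seed distribution.
fn eigentrust(
    n: usize,
    edges: &[(usize, usize)],
    seeds: &[usize],
    alpha: f64,
    epsilon: f64,
    max_iters: usize,
) -> (Vec<f64>, usize) {
    let mut out_degree = vec![0usize; n];
    for &(from, _) in edges {
        out_degree[from] += 1;
    }

    // Pre-trusted distribution: uniform over the seed agents.
    let mut p = vec![0.0; n];
    for &s in seeds {
        p[s] = 1.0 / seeds.len() as f64;
    }

    let mut t = p.clone();
    for iter in 1..=max_iters {
        let mut next = vec![0.0; n];
        for &(from, to) in edges {
            next[to] += t[from] / out_degree[from] as f64;
        }
        // Trust held by agents with no out-edges flows back to the seeds.
        let dangling: f64 = (0..n)
            .filter(|&i| out_degree[i] == 0)
            .map(|i| t[i])
            .sum();
        for i in 0..n {
            next[i] = (1.0 - alpha) * (next[i] + dangling * p[i]) + alpha * p[i];
        }
        // L1 distance between successive iterates is the convergence measure.
        let delta: f64 = next.iter().zip(&t).map(|(a, b)| (a - b).abs()).sum();
        t = next;
        if delta < epsilon {
            return (t, iter);
        }
    }
    (t, max_iters)
}
```

With agents indexed Seed=0, A=1, B=2, C=3, X=4, Y=5, Z=6, W=7 and the edges from the diagram, calling `eigentrust(8, &edges, &[0], 0.15, 1e-4, 100)` leaves X, Y, Z, and W at zero, because no edge carries trust from the seed-connected component into the ring.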

Scenario 4: Content Quality Filtering

Goal: Verify spam/noise detection without blocking legitimate content.

| Content Type | Expected Quality | Should Quarantine? |
|---|---|---|
| Normal assertion: "Aspirin:treats:Headache" | >0.6 | No |
| Low entropy: "aaaa:bbbb:cccc" | <0.4 | Yes |
| Structured data with JSON | >0.7 (bonus) | No |
| Untrusted agent + high confidence | <0.5 (penalty) | Yes |

Acceptance Criteria:

  • Shannon entropy check flags low-entropy, repetitive content (< 1.5 bits/char)
  • Minimum subject/predicate length enforced (default 3 chars)
  • Structured data (JSON, URLs, dates) gets +0.1 bonus
  • Untrusted + high confidence gets -0.5 penalty
  • Quality < 0.4 triggers quarantine
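
The entropy and length checks can be illustrated with a short Rust sketch. One assumption matters here: the check is applied to each field (subject, predicate, object) separately. Taken as a whole, the string "aaaa:bbbb:cccc" has about 1.95 bits/char, which is above the 1.5 threshold, while each of its fields has zero entropy. The helper names are hypothetical, and the bonus and penalty terms are not modeled because the base score is not stated in this document.

```rust
use std::collections::HashMap;

/// Thresholds taken from the acceptance criteria above.
const MIN_ENTROPY_BITS_PER_CHAR: f64 = 1.5;
const MIN_FIELD_LEN: usize = 3;

/// Shannon entropy of the character distribution, in bits per character.
fn shannon_entropy(text: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    let mut total = 0usize;
    for ch in text.chars() {
        *counts.entry(ch).or_insert(0) += 1;
        total += 1;
    }
    if total == 0 {
        return 0.0;
    }
    counts
        .values()
        .map(|&c| {
            let p = c as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

/// Assumption: the entropy check runs per field, not on the joined string.
fn is_low_entropy(field: &str) -> bool {
    shannon_entropy(field) < MIN_ENTROPY_BITS_PER_CHAR
}

/// True when a field is shorter than the default minimum length.
fn is_too_short(field: &str) -> bool {
    field.chars().count() < MIN_FIELD_LEN
}
```

Under these assumptions, each field of "Aspirin:treats:Headache" clears the threshold, while "aaaa", "bbbb", and "cccc" are all flagged.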

Scenario 5: Quarantine Admin Workflow

Goal: Verify suspicious content can be reviewed and processed.

# 1. List pending quarantine events
curl "http://localhost:3000/v1/admin/quarantine?limit=20"

# 2. Review specific event
curl http://localhost:3000/v1/admin/quarantine/{hash}

# 3. Approve or reject
curl -X POST http://localhost:3000/v1/admin/quarantine/{hash}/approve
curl -X POST http://localhost:3000/v1/admin/quarantine/{hash}/reject

Acceptance Criteria:

  • GET /v1/admin/quarantine lists pending events with reasons
  • GET /v1/admin/quarantine/{hash} returns full assertion bytes
  • POST .../approve moves assertion to main index
  • POST .../reject marks as reviewed but keeps quarantined
  • Quarantine reasons clearly indicate why flagged
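
The review workflow can be read as a small state machine. The Rust sketch below is one possible model, not the `QuarantineStore` API: the type and function names are hypothetical, and treating an already-reviewed event as an error on a second review is an assumption this document does not confirm.

```rust
/// Hypothetical review state for a quarantined assertion.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReviewState {
    Pending,
    Approved,
    Rejected,
}

/// Hypothetical admin action, matching the approve and reject endpoints.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReviewAction {
    Approve,
    Reject,
}

/// Applies an admin action to a quarantine event.
/// Assumption: only pending events can be reviewed.
fn apply_review(state: ReviewState, action: ReviewAction) -> Result<ReviewState, String> {
    match (state, action) {
        (ReviewState::Pending, ReviewAction::Approve) => Ok(ReviewState::Approved),
        (ReviewState::Pending, ReviewAction::Reject) => Ok(ReviewState::Rejected),
        (done, _) => Err(format!("event already reviewed: {:?}", done)),
    }
}

/// Approved assertions move to the main index; pending and rejected ones
/// stay in quarantine.
fn stays_quarantined(state: ReviewState) -> bool {
    state != ReviewState::Approved
}

/// Only unreviewed events appear in the pending list.
fn is_listed_as_pending(state: ReviewState) -> bool {
    state == ReviewState::Pending
}
```

The distinction between `stays_quarantined` and `is_listed_as_pending` mirrors the criterion that a rejected event is marked as reviewed yet remains quarantined.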

Integration Points to Verify

  1. Ingestion Pipeline Integration

    • Content defense layer called before indexing
    • Quarantine bypasses normal index path
    • Bloom filter restored on restart
  2. Trust Store Interplay

    • EigenTrust feeds into TrustTier calculation
    • Domain trust factors into Authority lens weights
    • Trust decay applies to computed scores
  3. API Middleware Chain

    • AdmissionLayer checks PoW before rate limiting
    • MeterLayer applies tier-based quotas
    • Headers reflect current trust state
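
The ordering in the middleware chain determines which status code a request sees when it fails both checks. A minimal sketch, assuming a hypothetical `gate_request` function that stands in for the AdmissionLayer and MeterLayer decisions:

```rust
/// Returns the HTTP status a request would receive.
/// The PoW check runs first, so a request that is both missing PoW and
/// over quota gets 428 rather than 429.
fn gate_request(
    pow_required: bool,
    pow_valid: bool,
    used_this_hour: u64,
    hourly_limit: u64,
) -> u16 {
    if pow_required && !pow_valid {
        return 428; // admission: PoW missing or invalid
    }
    if used_this_hour >= hourly_limit {
        return 429; // metering: tier quota exhausted
    }
    200
}
```

When verifying the chain manually, a request from a new agent that is already over quota and carries no PoW proof should therefore return 428.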

Known Limitations

  1. 7C Incomplete: MinHash/LSH bucketing not implemented

    • Duplicate detection uses Bloom filter only (no near-duplicate)
    • Jaccard similarity threshold (0.9) not yet enforced
  2. 7D Not Started: Circuit breakers pending

    • No automatic agent banning
    • No half-open recovery states
  3. Performance Untested:

    • EigenTrust computation on large graphs (>10k agents)
    • Bloom filter memory at scale
    • Quarantine store scan performance
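
For context on the first limitation, the Jaccard threshold refers to set overlap between two assertions. The Rust sketch below computes exact Jaccard similarity over character shingles. It is not MinHash/LSH, which approximates this value at scale, and the shingle size of 3 and all function names are assumptions chosen for illustration.

```rust
use std::collections::HashSet;

/// Threshold from the limitation above.
const NEAR_DUPLICATE_THRESHOLD: f64 = 0.9;

/// Splits text into overlapping character k-grams ("shingles").
/// Text shorter than k yields an empty set.
fn shingles(text: &str, k: usize) -> HashSet<String> {
    let chars: Vec<char> = text.chars().collect();
    let k = k.max(1); // slice::windows panics on a window size of 0
    chars
        .windows(k)
        .map(|w| w.iter().collect::<String>())
        .collect()
}

/// Exact Jaccard similarity: |A ∩ B| / |A ∪ B|.
/// Two empty sets are scored 0.0 so that very short texts are never
/// treated as duplicates of each other (a choice made for this sketch).
fn jaccard(a: &HashSet<String>, b: &HashSet<String>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 0.0;
    }
    a.intersection(b).count() as f64 / union as f64
}

/// Near-duplicate test using 3-character shingles (assumed size).
fn is_near_duplicate(a: &str, b: &str) -> bool {
    jaccard(&shingles(a, 3), &shingles(b, 3)) >= NEAR_DUPLICATE_THRESHOLD
}
```

A Bloom filter only answers whether an identical item was seen before, which is why near-duplicates with small edits pass through until a similarity check of this kind is enforced.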

Commands to Run

# Full test suite
cargo test --workspace

# Phase 7-specific tests, filtered by crate and test name
cargo test -p stemedb-storage -- trust_graph
cargo test -p stemedb-storage -- domain_trust
cargo test -p stemedb-storage -- admission
cargo test -p stemedb-storage -- quarantine
cargo test -p stemedb-storage -- content_defense
cargo test -p stemedb-ingest -- content_defense
cargo test -p stemedb-api --test admission_integration
cargo test -p stemedb-core -- trust_tier
cargo test -p stemedb-core -- pow

# Clippy must pass
cargo clippy --workspace -- -D warnings

# Go SDK examples
cd sdk/go && go test ./...

Success Criteria

Phase 7 UAT passes when:

  1. All 139 Phase 7 tests pass
  2. All 5 usage scenarios verified manually
  3. Clippy clean with no warnings
  4. Go SDK examples pass
  5. API endpoints return correct responses
  6. Quarantine workflow complete end-to-end