Week 4 UAT Execution Plan - Consumer Health
Date: 2026-02-05
Milestone: stemedb-ontology Week 4 - UAT scenarios documented and verified
Status: Infrastructure Ready
Objective
Validate the four critical Consumer Health UAT scenarios programmatically:
- GLP-1 Muscle Loss Contradiction (Skeptic Lens)
- Gastroparesis Multi-Source (Source Hierarchy)
- Layered Consensus (Per-Tier Positions)
- Time Travel Query (as_of Snapshot)
Infrastructure Created
Integration Test Suite
Location: /Users/jordanwashburn/Workspace/orchard9/stemedb/crates/stemedb-ontology/tests/consumer_health_uat.rs
Purpose: Programmatic validation of UAT scenarios against a running StemeDB API instance.
Features:
- HTTP client for API calls
- DTO structures matching API contracts
- Assertion helpers for validation
- Structured test output with pass/fail/skip status
- Environment-aware API URL configuration
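The environment-aware URL configuration and the pass/fail/skip output listed above could look roughly like the following. This is a minimal sketch rather than the contents of consumer_health_uat.rs: `resolve_api_url`, `api_url_from_env`, `Outcome`, and `format_result` are hypothetical names, and using http://localhost:18180 as the fallback is an assumption based on the commands in this plan.

```rust
use std::env;

/// Port taken from the commands in this plan; using it as the fallback is an assumption.
const DEFAULT_API_URL: &str = "http://localhost:18180";

/// Hypothetical helper: prefer an explicit override, otherwise use the default.
/// A trailing slash is trimmed so endpoint paths can be appended with a single '/'.
fn resolve_api_url(override_url: Option<String>) -> String {
    override_url
        .filter(|u| !u.trim().is_empty())
        .unwrap_or_else(|| DEFAULT_API_URL.to_string())
        .trim_end_matches('/')
        .to_string()
}

/// Reads STEMEDB_API_URL from the environment, as the test commands do.
fn api_url_from_env() -> String {
    resolve_api_url(env::var("STEMEDB_API_URL").ok())
}

/// Hypothetical three-state result matching the pass/fail/skip output described above.
#[derive(Debug, PartialEq)]
enum Outcome {
    Pass,
    Fail(String),
    Skip(String),
}

/// Renders one line of structured test output.
fn format_result(name: &str, outcome: &Outcome) -> String {
    match outcome {
        Outcome::Pass => format!("[PASS] {name}"),
        Outcome::Fail(why) => format!("[FAIL] {name}: {why}"),
        Outcome::Skip(why) => format!("[SKIP] {name}: {why}"),
    }
}
```

Keeping the override as a parameter means the resolution logic can be checked without mutating process-wide environment variables.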
Test Execution
# Start StemeDB API
cargo run -p stemedb-api &
# Wait for startup
sleep 2
# Run individual scenarios
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_glp1_muscle_loss_contradiction -- --ignored --nocapture
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_gastroparesis_multi_source -- --ignored --nocapture
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_layered_consensus -- --ignored --nocapture
# Run all scenarios
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat run_all_uat_scenarios -- --ignored --nocapture
Scenario Details
1. GLP-1 Muscle Loss Contradiction
UAT File: glp1-muscle-loss-contradiction.md
Test Function: uat_glp1_muscle_loss_contradiction()
Status: ✅ Ready to Execute
What it tests:
- Two peer-reviewed studies with opposing conclusions coexist
- Skeptic Lens surfaces both claims without averaging
- Conflict score >= 0.5 for binary disagreement
- Status = "Contested"
- Both Boolean values present in claims array
API Endpoints:
- POST /v1/assert - Create Study A and Study B assertions
- GET /v1/skeptic?subject=Semaglutide:MuscleMass&predicate=muscle_sparing_effect
Expected Outcome:
{
  "status": "Contested",
  "conflict_score": >= 0.5,
  "claims": [
    {"value": {"Boolean": false}, "weight_share": ~0.51},
    {"value": {"Boolean": true}, "weight_share": ~0.49}
  ],
  "candidates_count": 2
}
Validation Checks:
- ✅ 2 candidates returned
- ✅ 2 distinct claims
- ✅ Conflict score >= 0.5
- ✅ Status = "Contested"
- ✅ Both true and false values present
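The five checks above can be expressed as a single validation function that collects every failure instead of stopping at the first. The sketch below is illustrative only: `Claim`, `SkepticResponse`, and `validate_contested` are hypothetical names, and the structs model only the fields the checks need, not the real API DTOs.

```rust
/// Hypothetical, simplified claim: `value` stands in for {"Boolean": ...}.
#[derive(Debug, Clone, PartialEq)]
struct Claim {
    value: bool,
}

/// Hypothetical mirror of the fields of the /v1/skeptic response used by the checks.
#[derive(Debug, Clone)]
struct SkepticResponse {
    status: String,
    conflict_score: f64,
    claims: Vec<Claim>,
    candidates_count: usize,
}

/// Returns one message per failed check; an empty vector means all five passed.
fn validate_contested(resp: &SkepticResponse) -> Vec<String> {
    let mut failures = Vec::new();
    if resp.candidates_count != 2 {
        failures.push(format!("expected 2 candidates, got {}", resp.candidates_count));
    }
    if resp.claims.len() != 2 {
        failures.push(format!("expected 2 distinct claims, got {}", resp.claims.len()));
    }
    if resp.conflict_score < 0.5 {
        failures.push(format!("conflict score {} is below 0.5", resp.conflict_score));
    }
    if resp.status != "Contested" {
        failures.push(format!("expected status Contested, got {}", resp.status));
    }
    let has_true = resp.claims.iter().any(|c| c.value);
    let has_false = resp.claims.iter().any(|c| !c.value);
    if !(has_true && has_false) {
        failures.push("expected both true and false values in claims".to_string());
    }
    failures
}
```

Collecting all failures in one pass keeps the captured test output useful when several checks break at once.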
2. Gastroparesis Multi-Source
UAT File: gastroparesis-multi-source.md
Test Function: uat_gastroparesis_multi_source()
Status: ✅ Ready to Execute
What it tests:
- Regulatory source (Tier 0) dominates despite 100x volume of anecdotal (Tier 5)
- Source hierarchy uses tier priority, not just weighted voting
- Layered view shows per-tier breakdown
API Endpoints:
- POST /v1/assert - Create 1 FDA + 100 Reddit assertions
- GET /v1/layered?subject=Semaglutide&predicate=gastroparesis_risk
Expected Outcome:
{
  "tiers": [
    {"tier": 0, "source_class": "Regulatory", "candidates_count": 1, ...},
    {"tier": 5, "source_class": "Anecdotal", "candidates_count": 100, ...}
  ],
  "overall_winner": {...}, // From Tier 0
  "total_candidates": 101
}
Validation Checks:
- ✅ 101 total candidates
- ✅ Tier 0 present with 1 candidate
- ✅ Tier 5 present with 100 candidates
- ✅ Overall winner from Tier 0
- ✅ Tier structure correct
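To make the difference between tier priority and plain volume-based voting concrete, here is a minimal sketch under stated assumptions. `TierSummary`, `winner_by_tier_priority`, and `winner_by_volume` are hypothetical names; the plan does not show how StemeDB selects the overall winner internally, only that it should come from Tier 0 in this scenario.

```rust
/// Hypothetical per-tier summary, loosely modeled on the "tiers" array above.
#[derive(Debug, Clone, PartialEq)]
struct TierSummary {
    tier: u8, // lower number = higher authority (0 = Regulatory, 5 = Anecdotal)
    source_class: String,
    candidates_count: usize,
}

/// Tier priority: the populated tier with the lowest number wins,
/// no matter how many candidates the other tiers hold.
fn winner_by_tier_priority(tiers: &[TierSummary]) -> Option<&TierSummary> {
    tiers
        .iter()
        .filter(|t| t.candidates_count > 0)
        .min_by_key(|t| t.tier)
}

/// For contrast only: a naive volume vote, which this scenario says must not decide the outcome.
fn winner_by_volume(tiers: &[TierSummary]) -> Option<&TierSummary> {
    tiers.iter().max_by_key(|t| t.candidates_count)
}

/// Sum of candidates across tiers, matching the total_candidates field.
fn total_candidates(tiers: &[TierSummary]) -> usize {
    tiers.iter().map(|t| t.candidates_count).sum()
}
```

With 1 Regulatory and 100 Anecdotal candidates, the two selection rules disagree, which is exactly the situation this scenario is designed to exercise.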
3. Layered Consensus
UAT File: layered-consensus.md
Test Function: uat_layered_consensus()
Status: ✅ Ready to Execute
What it tests:
- Per-tier breakdown shows all populated tiers
- Within-tier conflict calculated (Tier 1 contested, Tier 5 unanimous)
- Cross-tier conflict calculated
- Overall winner from highest authority tier
API Endpoints:
- POST /v1/assert - Create 2 Clinical (conflicting) + 50 Anecdotal (unanimous)
- GET /v1/layered?subject=Semaglutide:BodyComposition&predicate=lean_mass_preserved
Expected Outcome:
{
  "tiers": [
    {
      "tier": 1,
      "source_class": "Clinical",
      "candidates_count": 2,
      "conflict_score": > 0.5 // Contested within tier
    },
    {
      "tier": 5,
      "source_class": "Anecdotal",
      "candidates_count": 50,
      "conflict_score": < 0.1 // Unanimous within tier
    }
  ],
  "total_candidates": 52
}
Validation Checks:
- ✅ 52 total candidates
- ✅ Tier 1 conflict > 0.5
- ✅ Tier 5 conflict < 0.1
- ✅ Both tiers present
- ✅ Overall winner from Tier 1
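This plan does not state how StemeDB computes the within-tier conflict score. Purely to illustrate why an evenly split tier can score above 0.5 while a unanimous tier scores below 0.1, the sketch below uses a swapped-in metric: Shannon entropy of the per-value weight shares, normalized by ln(n). The function name `conflict_score` and the formula are assumptions, not the engine's implementation.

```rust
/// Hypothetical conflict metric (NOT confirmed as StemeDB's formula).
/// `weights` holds the total weight behind each distinct claimed value.
/// An even n-way split scores 1.0; a single unanimous value scores 0.0.
fn conflict_score(weights: &[f64]) -> f64 {
    let total: f64 = weights.iter().sum();
    let n = weights.iter().filter(|w| **w > 0.0).count();
    if n < 2 || total <= 0.0 {
        return 0.0; // zero or one supported value: nothing to disagree about
    }
    let entropy: f64 = weights
        .iter()
        .filter(|w| **w > 0.0)
        .map(|w| {
            let p = w / total; // weight share of this value
            -p * p.ln()
        })
        .sum();
    entropy / (n as f64).ln()
}
```

Under this assumed metric, two equally weighted conflicting Clinical studies give a score of 1.0, and 50 Anecdotal reports that all agree collapse to a single value with a score of 0.0, consistent with the thresholds checked above.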
4. Time Travel Query
UAT File: time-travel-query.md
Test Function: uat_time_travel_query()
Status: ⊘ Not Yet Implemented
What it tests:
- Query knowledge graph as it existed at a specific timestamp
- Historical snapshot returns only assertions before as_of date
- Audit trail and debugging capabilities
API Endpoints:
GET /v1/query?subject=...&predicate=...&as_of=<timestamp>
Blocked By:
- Implementation of as_of parameter in query handlers
- Timestamp filtering in query engine
Next Steps:
- Add as_of: Option<u64> to query parameters
- Filter assertions by timestamp in query engine
- Update UAT test to use actual API
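The first two steps could be sketched as follows. This is an illustration under assumptions, not the query engine's code: `Assertion`, `QueryParams`, and `visible_as_of` are hypothetical names, the timestamp unit is assumed to be Unix seconds, and the inclusive bound is an assumption because the plan only says "before".

```rust
/// Hypothetical stored assertion; only the fields needed for filtering are modeled.
#[derive(Debug, Clone, PartialEq)]
struct Assertion {
    id: u32,
    timestamp: u64, // assumed to be Unix seconds
}

/// Hypothetical query parameters carrying the proposed optional field.
#[derive(Debug, Default)]
struct QueryParams {
    as_of: Option<u64>,
}

/// Keeps only assertions visible at `as_of`.
/// `None` means "latest state", so nothing is filtered out.
/// The inclusive comparison (<=) is an assumption; a strict bound may be intended.
fn visible_as_of<'a>(assertions: &'a [Assertion], params: &QueryParams) -> Vec<&'a Assertion> {
    assertions
        .iter()
        .filter(|a| params.as_of.map_or(true, |cutoff| a.timestamp <= cutoff))
        .collect()
}
```

Making `as_of` an `Option` keeps existing queries unchanged: callers that omit the parameter see the same results as before.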
Execution Checklist
Pre-Execution
- Integration test suite created
- Test compilation verified
- API endpoints confirmed to exist
- Data structures validated against API DTOs
- StemeDB API server running
- Database initialized
- Ingest worker running (for assertion processing)
Execution
- Run GLP-1 Muscle Loss Contradiction test
- Capture test output
- Update glp1-muscle-loss-contradiction.md with results
- Run Gastroparesis Multi-Source test
- Capture test output
- Update gastroparesis-multi-source.md with results
- Run Layered Consensus test
- Capture test output
- Update layered-consensus.md with results
- Document Time Travel Query as blocked
- Create issue for Time Travel Query implementation
Post-Execution
- All passing tests have markdown files updated with actual results
- Failing tests have issues created
- Week 4 sign-off in roadmap
- Update stemedb-ontology README with UAT status
Known Issues / Risks
1. Signature Requirement
Issue: API requires valid Ed25519 signatures for all assertions.
Current Approach: Using dummy signatures (0000... for agent_id and signature).
Risk: If signature verification is enforced, tests will fail.
Mitigation: Either:
- Add a test-mode flag that disables signature verification
- Generate valid signatures in test helper
- Use a test agent keypair
2. Assertion Ingestion Delay
Issue: Assertions go through WAL → Ingest Worker → Index Store.
Current Approach: A fixed sleep of 2-3 seconds after ingestion.
Risk: Race conditions if ingestion is slower than expected.
Mitigation:
- Increase sleep duration
- Add polling for assertion availability
- Use synchronous ingestion for tests
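Of these mitigations, polling is the one that removes the fixed delay without changing the server. A minimal sketch, assuming the caller supplies a closure that queries the API and reports whether the expected assertions are visible; `poll_until` is a hypothetical helper, not part of the existing suite.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Calls `check` repeatedly until it returns true or `timeout` elapses.
/// Returns Ok(attempts) on success and Err(attempts) on timeout.
fn poll_until<F>(mut check: F, timeout: Duration, interval: Duration) -> Result<u32, u32>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    let mut attempts = 0;
    loop {
        attempts += 1;
        if check() {
            return Ok(attempts);
        }
        if Instant::now() >= deadline {
            return Err(attempts);
        }
        thread::sleep(interval);
    }
}
```

In a UAT test, the closure would issue the scenario's GET request and compare the returned candidate count with the number of assertions just posted; the test then proceeds as soon as ingestion catches up and fails with a clear timeout otherwise.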
3. Time Travel Query Not Implemented
Issue: as_of parameter not yet implemented in query handlers.
Impact: Scenario 4 will be skipped in Week 4.
Plan: Document as future work, implement in Phase 6.
4. Database State Isolation
Issue: Tests write to same database, may interfere with each other.
Current Approach: Use unique subject/predicate combinations per test.
Risk: If tests are re-run, old data may pollute results.
Mitigation:
- Use unique identifiers per test run
- Add database cleanup between tests
- Use temporary databases per test
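The first mitigation, unique identifiers per test run, needs no server changes. A minimal sketch, assuming subjects are free-form strings that tolerate a suffix; `run_id`, `scoped_subject`, and the `_` separator are hypothetical choices, not conventions confirmed by this plan.

```rust
use std::process;
use std::time::{SystemTime, UNIX_EPOCH};

/// Builds a run-scoped identifier from the wall clock and the process id.
fn run_id() -> String {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0); // clock before the epoch: fall back to 0
    format!("{:x}-{:x}", nanos, process::id())
}

/// Appends the run id to a base subject so a rerun never reads assertions
/// left behind by an earlier run. The '_' separator is an assumption; ':' is
/// avoided because subjects such as Semaglutide:MuscleMass already use it.
fn scoped_subject(base: &str, run_id: &str) -> String {
    format!("{base}_{run_id}")
}
```

Each test would compute the run id once, then use the scoped subject for both the POST /v1/assert calls and the follow-up query.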
Success Criteria
Week 4 is considered complete when:
- ✅ Integration test suite compiles without errors
- ✅ All API endpoints referenced in scenarios exist
- ✅ DTOs match API contracts
- ⏳ 3 out of 4 scenarios execute successfully (Time Travel deferred)
- ⏳ UAT markdown files updated with actual results
- ⏳ All assertion checks pass (conflict scores, counts, status)
- ⏳ Test output captured and documented
Next Steps After Week 4
- Week 5: Implement Time Travel Query (as_of parameter)
- Week 6: Add more complex scenarios (multi-tier disagreement, vote integration)
- Week 7: Performance testing (1000s of assertions per tier)
- Week 8: End-to-end workflows (extract → ingest → query → visualize)
Running the Tests
Quick Start
# Terminal 1: Start StemeDB API
cd /Users/jordanwashburn/Workspace/orchard9/stemedb
cargo run -p stemedb-api
# Terminal 2: Run UAT tests (after API is ready)
cd /Users/jordanwashburn/Workspace/orchard9/stemedb
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat run_all_uat_scenarios -- --ignored --nocapture
Individual Scenario Execution
# GLP-1 Muscle Loss
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_glp1 -- --ignored --nocapture
# Gastroparesis
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_gastroparesis -- --ignored --nocapture
# Layered Consensus
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_layered -- --ignored --nocapture
CI/CD Integration
For automated testing in CI:
# .github/workflows/uat.yml
name: Consumer Health UAT
on: [push, pull_request]
jobs:
  uat:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
      - name: Start StemeDB API
        run: |
          cargo build -p stemedb-api
          cargo run -p stemedb-api &
          sleep 5
      - name: Run UAT Tests
        run: |
          STEMEDB_API_URL=http://localhost:18180 \
          cargo test --test consumer_health_uat -- --ignored --nocapture
Documentation Updates Required
After successful execution:
- ai-lookup/index.md: Add link to UAT results
- crates/stemedb-ontology/README.md: Document test suite
- roadmap.md: Mark Week 4 as complete
- uat/consumer-health/README.md: Update with test results
- Each scenario .md file: Fill in "Actual" columns with real data
Prepared by: Claude (Defensive Systems Architect)
Date: 2026-02-05
Status: Infrastructure Complete - Ready for Execution