# Week 4 UAT Execution Plan - Consumer Health

**Date:** 2026-02-05

**Milestone:** stemedb-ontology Week 4 - UAT scenarios documented and verified

**Status:** Infrastructure Ready

## Objective

Validate the four critical Consumer Health UAT scenarios programmatically:

1. GLP-1 Muscle Loss Contradiction (Skeptic Lens)
2. Gastroparesis Multi-Source (Source Hierarchy)
3. Layered Consensus (Per-Tier Positions)
4. Time Travel Query (as_of Snapshot)

## Infrastructure Created

### Integration Test Suite

**Location:** `/Users/jordanwashburn/Workspace/orchard9/stemedb/crates/stemedb-ontology/tests/consumer_health_uat.rs`

**Purpose:** Programmatic validation of UAT scenarios against a running StemeDB API instance.

**Features:**

- HTTP client for API calls
- DTO structures matching API contracts
- Assertion helpers for validation
- Structured test output with pass/fail/skip status
- Environment-aware API URL configuration

### Test Execution

```bash
# Start StemeDB API
cargo run -p stemedb-api &

# Wait for startup
sleep 2

# Run individual scenarios
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_glp1_muscle_loss_contradiction -- --ignored --nocapture
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_gastroparesis_multi_source -- --ignored --nocapture
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_layered_consensus -- --ignored --nocapture

# Run all scenarios
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat run_all_uat_scenarios -- --ignored --nocapture
```

## Scenario Details

### 1. GLP-1 Muscle Loss Contradiction

**UAT File:** `glp1-muscle-loss-contradiction.md`

**Test Function:** `uat_glp1_muscle_loss_contradiction()`

**Status:** ✅ Ready to Execute

**What it tests:**

- Two peer-reviewed studies with opposing conclusions coexist
- Skeptic Lens surfaces both claims without averaging
- Conflict score >= 0.5 for binary disagreement
- Status = "Contested"
- Both Boolean values present in claims array

**API Endpoints:**

- `POST /v1/assert` - Create Study A and Study B assertions
- `GET /v1/skeptic?subject=Semaglutide:MuscleMass&predicate=muscle_sparing_effect`

**Expected Outcome:**

```json
{
  "status": "Contested",
  "conflict_score": >= 0.5,
  "claims": [
    {"value": {"Boolean": false}, "weight_share": ~0.51},
    {"value": {"Boolean": true}, "weight_share": ~0.49}
  ],
  "candidates_count": 2
}
```

**Validation Checks:**

- ✅ 2 candidates returned
- ✅ 2 distinct claims
- ✅ Conflict score >= 0.5
- ✅ Status = "Contested"
- ✅ Both true and false values present

---

### 2. Gastroparesis Multi-Source

**UAT File:** `gastroparesis-multi-source.md`

**Test Function:** `uat_gastroparesis_multi_source()`

**Status:** ✅ Ready to Execute

**What it tests:**

- Regulatory source (Tier 0) dominates despite 100x volume of anecdotal (Tier 5)
- Source hierarchy uses tier priority, not just weighted voting
- Layered view shows per-tier breakdown

**API Endpoints:**

- `POST /v1/assert` - Create 1 FDA + 100 Reddit assertions
- `GET /v1/layered?subject=Semaglutide&predicate=gastroparesis_risk`

**Expected Outcome:**

```json
{
  "tiers": [
    {"tier": 0, "source_class": "Regulatory", "candidates_count": 1, ...},
    {"tier": 5, "source_class": "Anecdotal", "candidates_count": 100, ...}
  ],
  "overall_winner": {...}, // From Tier 0
  "total_candidates": 101
}
```

**Validation Checks:**

- ✅ 101 total candidates
- ✅ Tier 0 present with 1 candidate
- ✅ Tier 5 present with 100 candidates
- ✅ Overall winner from Tier 0
- ✅ Tier structure correct

---

### 3. Layered Consensus

**UAT File:** `layered-consensus.md`

**Test Function:** `uat_layered_consensus()`

**Status:** ✅ Ready to Execute

**What it tests:**

- Per-tier breakdown shows all populated tiers
- Within-tier conflict calculated (Tier 1 contested, Tier 5 unanimous)
- Cross-tier conflict calculated
- Overall winner from highest authority tier

**API Endpoints:**

- `POST /v1/assert` - Create 2 Clinical (conflicting) + 50 Anecdotal (unanimous)
- `GET /v1/layered?subject=Semaglutide:BodyComposition&predicate=lean_mass_preserved`

**Expected Outcome:**

```json
{
  "tiers": [
    {
      "tier": 1,
      "source_class": "Clinical",
      "candidates_count": 2,
      "conflict_score": > 0.5 // Contested within tier
    },
    {
      "tier": 5,
      "source_class": "Anecdotal",
      "candidates_count": 50,
      "conflict_score": < 0.1 // Unanimous within tier
    }
  ],
  "total_candidates": 52
}
```

**Validation Checks:**

- ✅ 52 total candidates
- ✅ Tier 1 conflict > 0.5
- ✅ Tier 5 conflict < 0.1
- ✅ Both tiers present
- ✅ Overall winner from Tier 1

---

### 4. Time Travel Query

**UAT File:** `time-travel-query.md`

**Test Function:** `uat_time_travel_query()`

**Status:** ⊘ Not Yet Implemented

**What it tests:**

- Query knowledge graph as it existed at a specific timestamp
- Historical snapshot returns only assertions before `as_of` date
- Audit trail and debugging capabilities

**API Endpoints:**

- `GET /v1/query?subject=...&predicate=...&as_of=`

**Blocked By:**

- Implementation of `as_of` parameter in query handlers
- Timestamp filtering in query engine

**Next Steps:**

1. Add `as_of: Option` to query parameters
2. Filter assertions by timestamp in query engine
3. Update UAT test to use actual API

---

## Execution Checklist

### Pre-Execution

- [x] Integration test suite created
- [x] Test compilation verified
- [x] API endpoints confirmed to exist
- [x] Data structures validated against API DTOs
- [ ] StemeDB API server running
- [ ] Database initialized
- [ ] Ingest worker running (for assertion processing)

### Execution

- [ ] Run GLP-1 Muscle Loss Contradiction test
- [ ] Capture test output
- [ ] Update `glp1-muscle-loss-contradiction.md` with results
- [ ] Run Gastroparesis Multi-Source test
- [ ] Capture test output
- [ ] Update `gastroparesis-multi-source.md` with results
- [ ] Run Layered Consensus test
- [ ] Capture test output
- [ ] Update `layered-consensus.md` with results
- [ ] Document Time Travel Query as blocked
- [ ] Create issue for Time Travel Query implementation

### Post-Execution

- [ ] All passing tests have markdown files updated with actual results
- [ ] Failing tests have issues created
- [ ] Week 4 sign-off in roadmap
- [ ] Update stemedb-ontology README with UAT status

## Known Issues / Risks

### 1. Signature Requirement

**Issue:** API requires valid Ed25519 signatures for all assertions.

**Current Approach:** Using dummy signatures (`0000...` for agent_id and signature).

**Risk:** If signature verification is enforced, tests will fail.

**Mitigation:** Either:

- Add a test-mode flag that disables signature verification
- Generate valid signatures in test helper
- Use a test agent keypair

### 2. Assertion Ingestion Delay

**Issue:** Assertions go through WAL → Ingest Worker → Index Store.

**Current Approach:** `sleep(2-3 seconds)` after ingestion.

**Risk:** Race conditions if ingestion is slower than expected.

**Mitigation:**

- Increase sleep duration
- Add polling for assertion availability
- Use synchronous ingestion for tests

### 3. Time Travel Query Not Implemented

**Issue:** `as_of` parameter not yet implemented in query handlers.

**Impact:** Scenario 4 will be skipped in Week 4.

**Plan:** Document as future work, implement in Phase 6.

### 4. Database State Isolation

**Issue:** Tests write to same database, may interfere with each other.

**Current Approach:** Use unique subject/predicate combinations per test.

**Risk:** If tests are re-run, old data may pollute results.

**Mitigation:**

- Use unique identifiers per test run
- Add database cleanup between tests
- Use temporary databases per test

## Success Criteria

Week 4 is considered **complete** when:

1. ✅ Integration test suite compiles without errors
2. ✅ All API endpoints referenced in scenarios exist
3. ✅ DTOs match API contracts
4. ⏳ 3 out of 4 scenarios execute successfully (Time Travel deferred)
5. ⏳ UAT markdown files updated with actual results
6. ⏳ All assertion checks pass (conflict scores, counts, status)
7. ⏳ Test output captured and documented

## Next Steps After Week 4

1. **Week 5:** Implement Time Travel Query (`as_of` parameter)
2. **Week 6:** Add more complex scenarios (multi-tier disagreement, vote integration)
3. **Week 7:** Performance testing (1000s of assertions per tier)
4. **Week 8:** End-to-end workflows (extract → ingest → query → visualize)

## Running the Tests

### Quick Start

```bash
# Terminal 1: Start StemeDB API
cd /Users/jordanwashburn/Workspace/orchard9/stemedb
cargo run -p stemedb-api

# Terminal 2: Run UAT tests (after API is ready)
cd /Users/jordanwashburn/Workspace/orchard9/stemedb
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat run_all_uat_scenarios -- --ignored --nocapture
```

### Individual Scenario Execution

```bash
# GLP-1 Muscle Loss
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_glp1 -- --ignored --nocapture

# Gastroparesis
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_gastroparesis -- --ignored --nocapture

# Layered Consensus
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_layered -- --ignored --nocapture
```

### CI/CD Integration

For automated testing in CI:

```yaml
# .github/workflows/uat.yml
name: Consumer Health UAT
on: [push, pull_request]
jobs:
  uat:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
      - name: Start StemeDB API
        run: |
          cargo build -p stemedb-api
          cargo run -p stemedb-api &
          sleep 5
      - name: Run UAT Tests
        run: |
          STEMEDB_API_URL=http://localhost:18180 \
            cargo test --test consumer_health_uat -- --ignored --nocapture
```

## Documentation Updates Required

After successful execution:

1. **ai-lookup/index.md:** Add link to UAT results
2. **crates/stemedb-ontology/README.md:** Document test suite
3. **roadmap.md:** Mark Week 4 as complete
4. **uat/consumer-health/README.md:** Update with test results
5. **Each scenario .md file:** Fill in "Actual" columns with real data

---

**Prepared by:** Claude (Defensive Systems Architect)

**Date:** 2026-02-05

**Status:** Infrastructure Complete - Ready for Execution
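One of the mitigations listed under "Assertion Ingestion Delay" is to poll for assertion availability instead of sleeping for a fixed 2-3 seconds. The following is a minimal sketch of such a helper, using only the Rust standard library. The name `poll_until` and its signature are hypothetical and are not taken from `consumer_health_uat.rs`. The probe closure stands in for whatever HTTP call the suite actually makes.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: calls `probe` repeatedly until it yields `Some(value)`
/// or until `timeout` has elapsed. Returns `None` if the deadline passes.
fn poll_until<T>(
    timeout: Duration,
    interval: Duration,
    mut probe: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        // Probe first, so a ready result returns without any sleep.
        if let Some(value) = probe() {
            return Some(value);
        }
        // Give up once the deadline has passed.
        if Instant::now() >= deadline {
            return None;
        }
        // Back off before the next attempt.
        sleep(interval);
    }
}
```

In a UAT test, the probe would presumably issue the scenario's `GET` request and return `Some(response)` only once the response reports the expected candidate count (for example, 101 for the gastroparesis scenario). A `None` result would then be reported as an ingestion timeout rather than as an assertion failure.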