# Week 4 UAT Execution Plan - Consumer Health
**Date:** 2026-02-05
**Milestone:** stemedb-ontology Week 4 - UAT scenarios documented and verified
**Status:** Infrastructure Ready
## Objective
Validate the four critical Consumer Health UAT scenarios programmatically:
1. GLP-1 Muscle Loss Contradiction (Skeptic Lens)
2. Gastroparesis Multi-Source (Source Hierarchy)
3. Layered Consensus (Per-Tier Positions)
4. Time Travel Query (as_of Snapshot)
## Infrastructure Created
### Integration Test Suite
**Location:** `crates/stemedb-ontology/tests/consumer_health_uat.rs` (relative to the `stemedb` repository root)
**Purpose:** Programmatic validation of UAT scenarios against a running StemeDB API instance.
**Features:**
- HTTP client for API calls
- DTO structures matching API contracts
- Assertion helpers for validation
- Structured test output with pass/fail/skip status
- Environment-aware API URL configuration
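
As a rough illustration of the environment-aware URL configuration listed above, the sketch below reads `STEMEDB_API_URL` and falls back to a default. The helper names (`resolve_api_url`, `api_url`, `endpoint`) and the fallback behaviour are assumptions made for this example; the actual test suite may resolve the URL differently.

```rust
use std::env;

/// Assumed default, taken from the port used in the commands in this plan.
const DEFAULT_API_URL: &str = "http://localhost:18180";

/// Hypothetical pure helper: use the configured value if it is non-empty,
/// otherwise fall back to the default. A trailing slash is trimmed so that
/// paths can be appended uniformly.
fn resolve_api_url(configured: Option<String>) -> String {
    let raw = configured
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .unwrap_or_else(|| DEFAULT_API_URL.to_string());
    raw.trim_end_matches('/').to_string()
}

/// Reads `STEMEDB_API_URL` from the environment.
fn api_url() -> String {
    resolve_api_url(env::var("STEMEDB_API_URL").ok())
}

/// Joins a base URL and an endpoint path with exactly one slash.
fn endpoint(base: &str, path: &str) -> String {
    format!("{}/{}", base, path.trim_start_matches('/'))
}
```

Keeping the resolution logic in a pure function makes it checkable without mutating the process environment, which matters when tests run in parallel.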
### Test Execution
```bash
# Start StemeDB API
cargo run -p stemedb-api &
# Wait for startup
sleep 2
# Run individual scenarios
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_glp1_muscle_loss_contradiction -- --ignored --nocapture
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_gastroparesis_multi_source -- --ignored --nocapture
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_layered_consensus -- --ignored --nocapture
# Run all scenarios
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat run_all_uat_scenarios -- --ignored --nocapture
```
## Scenario Details
### 1. GLP-1 Muscle Loss Contradiction
**UAT File:** `glp1-muscle-loss-contradiction.md`
**Test Function:** `uat_glp1_muscle_loss_contradiction()`
**Status:** ✅ Ready to Execute
**What it tests:**
- Two peer-reviewed studies with opposing conclusions coexist
- Skeptic Lens surfaces both claims without averaging
- Conflict score >= 0.5 for binary disagreement
- Status = "Contested"
- Both Boolean values present in claims array
**API Endpoints:**
- `POST /v1/assert` - Create Study A and Study B assertions
- `GET /v1/skeptic?subject=Semaglutide:MuscleMass&predicate=muscle_sparing_effect`
**Expected Outcome:**
```json
{
  "status": "Contested",
  "conflict_score": >= 0.5,
  "claims": [
    {"value": {"Boolean": false}, "weight_share": ~0.51},
    {"value": {"Boolean": true}, "weight_share": ~0.49}
  ],
  "candidates_count": 2
}
```
**Validation Checks:**
- ✅ 2 candidates returned
- ✅ 2 distinct claims
- ✅ Conflict score >= 0.5
- ✅ Status = "Contested"
- ✅ Both true and false values present
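
The validation checks above could be expressed as a single helper that collects every failed check instead of stopping at the first one. This is a minimal sketch: `SkepticResponse` is a simplified, hypothetical stand-in for the real API DTO (claims are reduced to their Boolean values), and `check_contested` is a name chosen for this example.

```rust
/// Hypothetical, simplified view of the Skeptic Lens response.
/// The real DTO carries more fields; only those checked here are modelled.
#[derive(Debug)]
struct SkepticResponse {
    status: String,
    conflict_score: f64,
    claims: Vec<bool>,
    candidates_count: usize,
}

/// Runs the scenario 1 checks and returns a description of each failure.
/// An empty vector means every check passed.
fn check_contested(resp: &SkepticResponse) -> Vec<String> {
    let mut failures = Vec::new();
    if resp.candidates_count != 2 {
        failures.push(format!("expected 2 candidates, got {}", resp.candidates_count));
    }
    if resp.claims.len() != 2 {
        failures.push(format!("expected 2 claims, got {}", resp.claims.len()));
    }
    if resp.conflict_score < 0.5 {
        failures.push(format!("conflict score {} is below 0.5", resp.conflict_score));
    }
    if resp.status != "Contested" {
        failures.push(format!("status is {:?}, expected \"Contested\"", resp.status));
    }
    if !(resp.claims.contains(&true) && resp.claims.contains(&false)) {
        failures.push("claims do not include both true and false".to_string());
    }
    failures
}
```

Returning all failures at once gives a fuller picture in the captured test output than a chain of `assert!` calls that aborts on the first mismatch.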
---
### 2. Gastroparesis Multi-Source
**UAT File:** `gastroparesis-multi-source.md`
**Test Function:** `uat_gastroparesis_multi_source()`
**Status:** ✅ Ready to Execute
**What it tests:**
- A single regulatory source (Tier 0) dominates despite a 100x larger volume of anecdotal (Tier 5) assertions
- Source hierarchy uses tier priority, not just weighted voting
- Layered view shows per-tier breakdown
**API Endpoints:**
- `POST /v1/assert` - Create 1 FDA + 100 Reddit assertions
- `GET /v1/layered?subject=Semaglutide&predicate=gastroparesis_risk`
**Expected Outcome:**
```json
{
  "tiers": [
    {"tier": 0, "source_class": "Regulatory", "candidates_count": 1, ...},
    {"tier": 5, "source_class": "Anecdotal", "candidates_count": 100, ...}
  ],
  "overall_winner": {...}, // From Tier 0
  "total_candidates": 101
}
```
**Validation Checks:**
- ✅ 101 total candidates
- ✅ Tier 0 present with 1 candidate
- ✅ Tier 5 present with 100 candidates
- ✅ Overall winner from Tier 0
- ✅ Tier structure correct
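
To make the difference between tier priority and volume-based voting concrete, the sketch below selects a winning tier both ways. It is an illustration only: `TierSummary`, `winning_tier`, and `largest_tier` are hypothetical names, and the rule "lowest tier number among populated tiers wins" is an assumption inferred from the expected outcome above, not a description of StemeDB's resolver.

```rust
/// Hypothetical per-tier summary, loosely following the layered response shape.
#[derive(Debug, Clone, PartialEq)]
struct TierSummary {
    tier: u8,
    source_class: &'static str,
    candidates_count: usize,
}

/// Tier priority: the populated tier with the lowest number (highest
/// authority) wins, regardless of how many candidates other tiers hold.
fn winning_tier(tiers: &[TierSummary]) -> Option<&TierSummary> {
    tiers
        .iter()
        .filter(|t| t.candidates_count > 0)
        .min_by_key(|t| t.tier)
}

/// For contrast: what a purely volume-based vote would pick.
fn largest_tier(tiers: &[TierSummary]) -> Option<&TierSummary> {
    tiers.iter().max_by_key(|t| t.candidates_count)
}

/// Total candidates across all tiers, for the `total_candidates` check.
fn total_candidates(tiers: &[TierSummary]) -> usize {
    tiers.iter().map(|t| t.candidates_count).sum()
}
```

With one Tier 0 candidate and 100 Tier 5 candidates, `winning_tier` picks Tier 0 while `largest_tier` picks Tier 5, which is the distinction this scenario is meant to verify.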
---
### 3. Layered Consensus
**UAT File:** `layered-consensus.md`
**Test Function:** `uat_layered_consensus()`
**Status:** ✅ Ready to Execute
**What it tests:**
- Per-tier breakdown shows all populated tiers
- Within-tier conflict calculated (Tier 1 contested, Tier 5 unanimous)
- Cross-tier conflict calculated
- Overall winner from highest authority tier
**API Endpoints:**
- `POST /v1/assert` - Create 2 Clinical (conflicting) + 50 Anecdotal (unanimous)
- `GET /v1/layered?subject=Semaglutide:BodyComposition&predicate=lean_mass_preserved`
**Expected Outcome:**
```json
{
  "tiers": [
    {
      "tier": 1,
      "source_class": "Clinical",
      "candidates_count": 2,
      "conflict_score": > 0.5 // Contested within tier
    },
    {
      "tier": 5,
      "source_class": "Anecdotal",
      "candidates_count": 50,
      "conflict_score": < 0.1 // Unanimous within tier
    }
  ],
  "total_candidates": 52
}
```
**Validation Checks:**
- ✅ 52 total candidates
- ✅ Tier 1 conflict > 0.5
- ✅ Tier 5 conflict < 0.1
- ✅ Both tiers present
- ✅ Overall winner from Tier 1
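
This plan does not state how `conflict_score` is computed. Purely to illustrate why an evenly split tier lands above 0.5 and a unanimous tier lands below 0.1, the sketch below uses one possible measure: one minus the gap between the leading value's share and the runner-up's share. This formula and the name `margin_conflict` are assumptions for the example and should not be read as StemeDB's actual scoring.

```rust
use std::collections::HashMap;

/// Illustrative conflict measure (NOT the StemeDB formula):
/// 1.0 - (share of most common value - share of second most common value).
/// Unanimous input yields 0.0; an even split yields 1.0.
fn margin_conflict(values: &[bool]) -> f64 {
    if values.is_empty() {
        return 0.0;
    }
    // Count how often each value appears.
    let mut counts: HashMap<bool, usize> = HashMap::new();
    for v in values {
        *counts.entry(*v).or_insert(0) += 1;
    }
    // Convert counts to shares and sort them in descending order.
    let total = values.len() as f64;
    let mut shares: Vec<f64> = counts.values().map(|c| *c as f64 / total).collect();
    shares.sort_by(|a, b| b.partial_cmp(a).unwrap());
    let top = shares[0];
    let runner_up = shares.get(1).copied().unwrap_or(0.0);
    1.0 - (top - runner_up)
}
```

Under this measure, two conflicting clinical assertions score 1.0 and fifty identical anecdotal assertions score 0.0, consistent with the thresholds in the checks above. Equal per-assertion weight is assumed here, whereas the real API reports a `weight_share` per claim.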
---
### 4. Time Travel Query
**UAT File:** `time-travel-query.md`
**Test Function:** `uat_time_travel_query()`
**Status:** Not Yet Implemented
**What it tests:**
- Query knowledge graph as it existed at a specific timestamp
- Historical snapshot returns only assertions before `as_of` date
- Audit trail and debugging capabilities
**API Endpoints:**
- `GET /v1/query?subject=...&predicate=...&as_of=<timestamp>`
**Blocked By:**
- Implementation of `as_of` parameter in query handlers
- Timestamp filtering in query engine
**Next Steps:**
1. Add `as_of: Option<u64>` to query parameters
2. Filter assertions by timestamp in query engine
3. Update UAT test to use actual API
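
Steps 1 and 2 above could look roughly like the following. This is a sketch under stated assumptions: the `Assertion` struct and `visible_as_of` function are hypothetical, the timestamp unit is assumed to be seconds since the Unix epoch, and treating the cutoff as inclusive (`<=`) is a design choice the plan does not specify.

```rust
/// Hypothetical, minimal assertion record for illustration.
#[derive(Debug, Clone, PartialEq)]
struct Assertion {
    subject: String,
    predicate: String,
    /// Assumed unit: seconds since the Unix epoch.
    timestamp: u64,
}

/// Returns the assertions visible at `as_of`.
/// `None` means "no time travel": every assertion is visible.
/// The cutoff is treated as inclusive here, which is an assumption.
fn visible_as_of(assertions: &[Assertion], as_of: Option<u64>) -> Vec<&Assertion> {
    assertions
        .iter()
        .filter(|a| as_of.map_or(true, |cutoff| a.timestamp <= cutoff))
        .collect()
}
```

Modelling the parameter as `Option<u64>` keeps existing queries unchanged when `as_of` is omitted, so the filter can be added without affecting the three scenarios that are ready to execute.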
---
## Execution Checklist
### Pre-Execution
- [x] Integration test suite created
- [x] Test compilation verified
- [x] API endpoints confirmed to exist
- [x] Data structures validated against API DTOs
- [ ] StemeDB API server running
- [ ] Database initialized
- [ ] Ingest worker running (for assertion processing)
### Execution
- [ ] Run GLP-1 Muscle Loss Contradiction test
- [ ] Capture test output
- [ ] Update `glp1-muscle-loss-contradiction.md` with results
- [ ] Run Gastroparesis Multi-Source test
- [ ] Capture test output
- [ ] Update `gastroparesis-multi-source.md` with results
- [ ] Run Layered Consensus test
- [ ] Capture test output
- [ ] Update `layered-consensus.md` with results
- [ ] Document Time Travel Query as blocked
- [ ] Create issue for Time Travel Query implementation
### Post-Execution
- [ ] All passing tests have markdown files updated with actual results
- [ ] Failing tests have issues created
- [ ] Week 4 sign-off in roadmap
- [ ] Update stemedb-ontology README with UAT status
## Known Issues / Risks
### 1. Signature Requirement
**Issue:** API requires valid Ed25519 signatures for all assertions.
**Current Approach:** Using dummy signatures (`0000...` for agent_id and signature).
**Risk:** If signature verification is enforced, tests will fail.
**Mitigation:** Either:
- Add a test-mode flag that disables signature verification
- Generate valid signatures in test helper
- Use a test agent keypair
### 2. Assertion Ingestion Delay
**Issue:** Assertions go through WAL → Ingest Worker → Index Store.
**Current Approach:** `sleep(2-3 seconds)` after ingestion.
**Risk:** Race conditions if ingestion is slower than expected.
**Mitigation:**
- Increase sleep duration
- Add polling for assertion availability
- Use synchronous ingestion for tests
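
The polling mitigation could replace the fixed sleep with a bounded retry loop. The helper below is a generic sketch using only the standard library; `poll_until` is a hypothetical name, and the readiness closure would wrap whatever API call the test uses to confirm that the expected number of candidates is visible.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Calls `ready` repeatedly until it returns true or `timeout` elapses.
/// Returns true if the condition was met, false on timeout.
/// The condition is always checked at least once.
fn poll_until<F>(timeout: Duration, interval: Duration, mut ready: F) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if ready() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(interval);
    }
}
```

Compared with a longer sleep, this returns as soon as ingestion completes and fails with a clear timeout when it does not, rather than surfacing as a confusing assertion failure on the candidate count.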
### 3. Time Travel Query Not Implemented
**Issue:** `as_of` parameter not yet implemented in query handlers.
**Impact:** Scenario 4 will be skipped in Week 4.
**Plan:** Document as future work, implement in Phase 6.
### 4. Database State Isolation
**Issue:** Tests write to same database, may interfere with each other.
**Current Approach:** Use unique subject/predicate combinations per test.
**Risk:** If tests are re-run, old data may pollute results.
**Mitigation:**
- Use unique identifiers per test run
- Add database cleanup between tests
- Use temporary databases per test
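
For the first mitigation, a per-run identifier can be appended to each subject so that re-runs never read data from earlier runs. The sketch below is one way to do it with the standard library only. `run_id` and `scoped_subject` are hypothetical helpers, and the `_uat_` suffix format is an assumption; whether StemeDB accepts such subjects unchanged has not been verified here.

```rust
use std::process;
use std::time::{SystemTime, UNIX_EPOCH};

/// Builds an identifier from the process id and the current time in
/// nanoseconds. Intended to be generated once per test run and reused.
fn run_id() -> String {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    format!("{}-{}", process::id(), nanos)
}

/// Appends the run identifier to a base subject (assumed suffix format).
fn scoped_subject(base: &str, run_id: &str) -> String {
    format!("{}_uat_{}", base, run_id)
}
```

The same scoped subject has to be used for both the `POST /v1/assert` calls and the subsequent query, so the identifier should be created once at the start of a scenario and passed through.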
## Success Criteria
Week 4 is considered **complete** when:
1. Integration test suite compiles without errors
2. All API endpoints referenced in scenarios exist
3. DTOs match API contracts
4. 3 out of 4 scenarios execute successfully (Time Travel deferred)
5. UAT markdown files updated with actual results
6. All assertion checks pass (conflict scores, counts, status)
7. Test output captured and documented
## Next Steps After Week 4
1. **Week 5:** Implement Time Travel Query (`as_of` parameter)
2. **Week 6:** Add more complex scenarios (multi-tier disagreement, vote integration)
3. **Week 7:** Performance testing (1000s of assertions per tier)
4. **Week 8:** End-to-end workflows (extract → ingest → query → visualize)
## Running the Tests
### Quick Start
```bash
# Terminal 1: Start StemeDB API
cd /Users/jordanwashburn/Workspace/orchard9/stemedb
cargo run -p stemedb-api
# Terminal 2: Run UAT tests (after API is ready)
cd /Users/jordanwashburn/Workspace/orchard9/stemedb
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat run_all_uat_scenarios -- --ignored --nocapture
```
### Individual Scenario Execution
```bash
# GLP-1 Muscle Loss
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_glp1 -- --ignored --nocapture
# Gastroparesis
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_gastroparesis -- --ignored --nocapture
# Layered Consensus
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_layered -- --ignored --nocapture
```
### CI/CD Integration
For automated testing in CI:
```yaml
# .github/workflows/uat.yml
name: Consumer Health UAT
on: [push, pull_request]
jobs:
  uat:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # actions-rs/toolchain is no longer maintained; this action installs stable Rust
      - uses: dtolnay/rust-toolchain@stable
      - name: Start StemeDB API
        run: |
          cargo build -p stemedb-api
          cargo run -p stemedb-api &
          sleep 5
      - name: Run UAT Tests
        run: |
          STEMEDB_API_URL=http://localhost:18180 \
          cargo test --test consumer_health_uat -- --ignored --nocapture
```
## Documentation Updates Required
After successful execution:
1. **ai-lookup/index.md:** Add link to UAT results
2. **crates/stemedb-ontology/README.md:** Document test suite
3. **roadmap.md:** Mark Week 4 as complete
4. **uat/consumer-health/README.md:** Update with test results
5. **Each scenario .md file:** Fill in "Actual" columns with real data
---
**Prepared by:** Claude (Defensive Systems Architect)
**Date:** 2026-02-05
**Status:** Infrastructure Complete - Ready for Execution