# Week 4 UAT Execution Plan - Consumer Health

**Date:** 2026-02-05
**Milestone:** stemedb-ontology Week 4 - UAT scenarios documented and verified
**Status:** Infrastructure Ready

## Objective

Validate the four critical Consumer Health UAT scenarios programmatically:
1. GLP-1 Muscle Loss Contradiction (Skeptic Lens)
2. Gastroparesis Multi-Source (Source Hierarchy)
3. Layered Consensus (Per-Tier Positions)
4. Time Travel Query (as_of Snapshot)

## Infrastructure Created

### Integration Test Suite

**Location:** `crates/stemedb-ontology/tests/consumer_health_uat.rs` (relative to the repository root)

**Purpose:** Programmatic validation of UAT scenarios against a running StemeDB API instance.

**Features:**
- HTTP client for API calls
- DTO structures matching API contracts
- Assertion helpers for validation
- Structured test output with pass/fail/skip status
- Environment-aware API URL configuration
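
The environment-aware URL configuration can be sketched roughly as follows. This is a minimal illustration, not the code in the test suite: `resolve_api_url` and `api_url_from_env` are hypothetical names, and treating `http://localhost:18180` as the fallback is an assumption based on the port used in the commands in this plan.

```rust
use std::env;

/// Assumed default, taken from the port used in this plan's commands.
const DEFAULT_API_URL: &str = "http://localhost:18180";

/// Picks the API base URL from an optional override, falling back to the default.
/// Trailing slashes are trimmed so paths such as `/v1/assert` can be appended directly.
fn resolve_api_url(override_url: Option<&str>) -> String {
    let raw = match override_url {
        Some(url) if !url.trim().is_empty() => url.trim(),
        _ => DEFAULT_API_URL,
    };
    raw.trim_end_matches('/').to_string()
}

/// Reads `STEMEDB_API_URL` from the environment (hypothetical helper).
fn api_url_from_env() -> String {
    resolve_api_url(env::var("STEMEDB_API_URL").ok().as_deref())
}
```

Keeping the override as a plain argument makes the fallback logic testable without mutating process-wide environment variables.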

### Test Execution

```bash
# Start StemeDB API
cargo run -p stemedb-api &

# Wait for startup
sleep 2

# Run individual scenarios
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_glp1_muscle_loss_contradiction -- --ignored --nocapture

STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_gastroparesis_multi_source -- --ignored --nocapture

STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_layered_consensus -- --ignored --nocapture

# Run all scenarios
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat run_all_uat_scenarios -- --ignored --nocapture
```
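
A fixed `sleep` can be too short on a cold build. One alternative is to poll the API port until it accepts a TCP connection. The sketch below uses only the standard library; `wait_for_port` is a hypothetical helper, and an accepted connection only shows that something is listening, not that the API has finished initializing.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `addr` until a TCP connection succeeds or `timeout` elapses.
/// Returns true as soon as the port accepts a connection.
fn wait_for_port(addr: SocketAddr, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        // Each attempt is bounded so an unresponsive host cannot stall the loop.
        if TcpStream::connect_timeout(&addr, Duration::from_millis(250)).is_ok() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(Duration::from_millis(100));
    }
}
```

A test harness could call `wait_for_port("127.0.0.1:18180".parse().unwrap(), Duration::from_secs(30))` before issuing the first request, assuming the API listens on that address.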

## Scenario Details

### 1. GLP-1 Muscle Loss Contradiction

**UAT File:** `glp1-muscle-loss-contradiction.md`
**Test Function:** `uat_glp1_muscle_loss_contradiction()`
**Status:** ✅ Ready to Execute

**What it tests:**
- Two peer-reviewed studies with opposing conclusions coexist
- Skeptic Lens surfaces both claims without averaging
- Conflict score >= 0.5 for binary disagreement
- Status = "Contested"
- Both Boolean values present in claims array

**API Endpoints:**
- `POST /v1/assert` - Create Study A and Study B assertions
- `GET /v1/skeptic?subject=Semaglutide:MuscleMass&predicate=muscle_sparing_effect`

**Expected Outcome:**
```json
{
  "status": "Contested",
  "conflict_score": >= 0.5,
  "claims": [
    {"value": {"Boolean": false}, "weight_share": ~0.51},
    {"value": {"Boolean": true}, "weight_share": ~0.49}
  ],
  "candidates_count": 2
}
```

**Validation Checks:**
- ✅ 2 candidates returned
- ✅ 2 distinct claims
- ✅ Conflict score >= 0.5
- ✅ Status = "Contested"
- ✅ Both true and false values present
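
The validation checks above can be sketched as a plain function over a decoded response. This is a minimal sketch, not the code in `consumer_health_uat.rs`: `Claim`, `SkepticResponse`, and `validate_contested` are hypothetical names, the field names are borrowed from the expected-outcome JSON, and the `{"Boolean": ...}` value wrapper is flattened to a plain `bool` for brevity.

```rust
/// Hypothetical, trimmed-down view of one claim in a Skeptic Lens response.
#[derive(Debug, Clone, PartialEq)]
struct Claim {
    value: bool,
    weight_share: f64,
}

/// Hypothetical response shape; the real API DTOs may differ.
#[derive(Debug, Clone, PartialEq)]
struct SkepticResponse {
    status: String,
    conflict_score: f64,
    claims: Vec<Claim>,
    candidates_count: usize,
}

/// Returns one message per failed check; an empty vector means every check passed.
fn validate_contested(resp: &SkepticResponse) -> Vec<String> {
    let mut failures = Vec::new();
    if resp.candidates_count != 2 {
        failures.push(format!("expected 2 candidates, got {}", resp.candidates_count));
    }
    if resp.claims.len() != 2 {
        failures.push(format!("expected 2 distinct claims, got {}", resp.claims.len()));
    }
    if resp.conflict_score < 0.5 {
        failures.push(format!("conflict score {} is below 0.5", resp.conflict_score));
    }
    if resp.status != "Contested" {
        failures.push(format!("expected status \"Contested\", got {:?}", resp.status));
    }
    // Both sides of the binary disagreement must be surfaced, not averaged away.
    let has_true = resp.claims.iter().any(|c| c.value);
    let has_false = resp.claims.iter().any(|c| !c.value);
    if !(has_true && has_false) {
        failures.push("expected both true and false claims".to_string());
    }
    failures
}
```

Collecting all failures instead of panicking on the first one gives a complete picture per run, which suits the structured pass/fail output described earlier.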

---

### 2. Gastroparesis Multi-Source

**UAT File:** `gastroparesis-multi-source.md`
**Test Function:** `uat_gastroparesis_multi_source()`
**Status:** ✅ Ready to Execute

**What it tests:**
- Regulatory source (Tier 0) dominates despite 100x volume of anecdotal (Tier 5)
- Source hierarchy uses tier priority, not just weighted voting
- Layered view shows per-tier breakdown

**API Endpoints:**
- `POST /v1/assert` - Create 1 FDA + 100 Reddit assertions
- `GET /v1/layered?subject=Semaglutide&predicate=gastroparesis_risk`

**Expected Outcome:**
```json
{
  "tiers": [
    {"tier": 0, "source_class": "Regulatory", "candidates_count": 1, ...},
    {"tier": 5, "source_class": "Anecdotal", "candidates_count": 100, ...}
  ],
  "overall_winner": {...},  // From Tier 0
  "total_candidates": 101
}
```

**Validation Checks:**
- ✅ 101 total candidates
- ✅ Tier 0 present with 1 candidate
- ✅ Tier 5 present with 100 candidates
- ✅ Overall winner from Tier 0
- ✅ Tier structure correct
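
The difference between tier priority and volume-based voting can be illustrated with a small sketch. The names here (`TierSummary`, `winner_by_tier_priority`, `winner_by_volume`) are hypothetical, and the rule "lowest populated tier number wins" is inferred from this scenario rather than taken from the StemeDB query engine.

```rust
/// Hypothetical per-tier summary; lower tier numbers mean higher authority
/// (Tier 0 = Regulatory, Tier 5 = Anecdotal in this scenario).
#[derive(Debug, Clone, PartialEq)]
struct TierSummary {
    tier: u8,
    source_class: &'static str,
    candidates_count: usize,
}

/// Tier-priority selection: the populated tier with the lowest number wins,
/// regardless of how many candidates other tiers hold.
fn winner_by_tier_priority(tiers: &[TierSummary]) -> Option<&TierSummary> {
    tiers
        .iter()
        .filter(|t| t.candidates_count > 0)
        .min_by_key(|t| t.tier)
}

/// Naive volume-based selection, shown only for contrast.
fn winner_by_volume(tiers: &[TierSummary]) -> Option<&TierSummary> {
    tiers.iter().max_by_key(|t| t.candidates_count)
}
```

With one Tier 0 candidate and 100 Tier 5 candidates, the priority rule picks Tier 0 while the volume rule picks Tier 5, which is the contrast this scenario is meant to exercise.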

---

### 3. Layered Consensus

**UAT File:** `layered-consensus.md`
**Test Function:** `uat_layered_consensus()`
**Status:** ✅ Ready to Execute

**What it tests:**
- Per-tier breakdown shows all populated tiers
- Within-tier conflict calculated (Tier 1 contested, Tier 5 unanimous)
- Cross-tier conflict calculated
- Overall winner from highest authority tier

**API Endpoints:**
- `POST /v1/assert` - Create 2 Clinical (conflicting) + 50 Anecdotal (unanimous)
- `GET /v1/layered?subject=Semaglutide:BodyComposition&predicate=lean_mass_preserved`

**Expected Outcome:**
```json
{
  "tiers": [
    {
      "tier": 1,
      "source_class": "Clinical",
      "candidates_count": 2,
      "conflict_score": > 0.5  // Contested within tier
    },
    {
      "tier": 5,
      "source_class": "Anecdotal",
      "candidates_count": 50,
      "conflict_score": < 0.1  // Unanimous within tier
    }
  ],
  "total_candidates": 52
}
```

**Validation Checks:**
- ✅ 52 total candidates
- ✅ Tier 1 conflict > 0.5
- ✅ Tier 5 conflict < 0.1
- ✅ Both tiers present
- ✅ Overall winner from Tier 1
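
This plan does not state how StemeDB computes the within-tier conflict score. As an illustration only, the sketch below uses one simple margin-based metric (1.0 minus the gap between the two largest weight shares), which happens to satisfy both thresholds above. `conflict_score` is a hypothetical function and should not be read as the engine's actual formula.

```rust
/// Illustrative conflict metric (an assumption, not StemeDB's formula):
/// 1.0 minus the gap between the two largest weight shares.
/// Shares are expected to be non-negative and to sum to roughly 1.0.
fn conflict_score(weight_shares: &[f64]) -> f64 {
    let mut sorted = weight_shares.to_vec();
    sorted.sort_by(|a, b| b.total_cmp(a)); // descending order
    let top = sorted.first().copied().unwrap_or(0.0);
    let runner_up = sorted.get(1).copied().unwrap_or(0.0);
    if top <= 0.0 {
        return 0.0; // no claims, so nothing to disagree about
    }
    (1.0 - (top - runner_up)).clamp(0.0, 1.0)
}
```

Under this metric an evenly split tier scores 1.0 and a unanimous tier scores 0.0, matching the "contested" and "unanimous" labels used for Tier 1 and Tier 5.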

---

### 4. Time Travel Query

**UAT File:** `time-travel-query.md`
**Test Function:** `uat_time_travel_query()`
**Status:** ⊘ Not Yet Implemented

**What it tests:**
- Query knowledge graph as it existed at a specific timestamp
- Historical snapshot returns only assertions before `as_of` date
- Audit trail and debugging capabilities

**API Endpoints:**
- `GET /v1/query?subject=...&predicate=...&as_of=<timestamp>`

**Blocked By:**
- Implementation of `as_of` parameter in query handlers
- Timestamp filtering in query engine

**Next Steps:**
1. Add `as_of: Option<u64>` to query parameters
2. Filter assertions by timestamp in query engine
3. Update UAT test to use actual API
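
The first two steps could look roughly like this in isolation. The sketch is hypothetical: `Assertion` and `snapshot` are illustrative names, the timestamp is assumed to be Unix seconds, and the inclusive comparison (`<=`) is an assumption, since the plan only says "before `as_of`".

```rust
/// Hypothetical minimal assertion record; `timestamp` is assumed to be Unix seconds.
#[derive(Debug, Clone, PartialEq)]
struct Assertion {
    subject: String,
    predicate: String,
    timestamp: u64,
}

/// Returns the assertions visible at `as_of`.
/// `None` means "now", so nothing is filtered out.
/// Assumption: an assertion stamped exactly at `as_of` is included.
fn snapshot(assertions: &[Assertion], as_of: Option<u64>) -> Vec<&Assertion> {
    assertions
        .iter()
        .filter(|a| as_of.map_or(true, |cutoff| a.timestamp <= cutoff))
        .collect()
}
```

Making the parameter an `Option<u64>` keeps existing callers unchanged: omitting `as_of` returns the current view.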

---

## Execution Checklist

### Pre-Execution

- [x] Integration test suite created
- [x] Test compilation verified
- [x] API endpoints confirmed to exist
- [x] Data structures validated against API DTOs
- [ ] StemeDB API server running
- [ ] Database initialized
- [ ] Ingest worker running (for assertion processing)

### Execution

- [ ] Run GLP-1 Muscle Loss Contradiction test
- [ ] Capture test output
- [ ] Update `glp1-muscle-loss-contradiction.md` with results
- [ ] Run Gastroparesis Multi-Source test
- [ ] Capture test output
- [ ] Update `gastroparesis-multi-source.md` with results
- [ ] Run Layered Consensus test
- [ ] Capture test output
- [ ] Update `layered-consensus.md` with results
- [ ] Document Time Travel Query as blocked
- [ ] Create issue for Time Travel Query implementation

### Post-Execution

- [ ] All passing tests have markdown files updated with actual results
- [ ] Failing tests have issues created
- [ ] Week 4 sign-off in roadmap
- [ ] Update stemedb-ontology README with UAT status

## Known Issues / Risks

### 1. Signature Requirement

**Issue:** API requires valid Ed25519 signatures for all assertions.

**Current Approach:** Using dummy signatures (`0000...` for agent_id and signature).

**Risk:** If signature verification is enforced, tests will fail.

**Mitigation:** Either:
- Add a test-mode flag that disables signature verification
- Generate valid signatures in test helper
- Use a test agent keypair

### 2. Assertion Ingestion Delay

**Issue:** Assertions go through WAL → Ingest Worker → Index Store.

**Current Approach:** `sleep(2-3 seconds)` after ingestion.

**Risk:** Race conditions if ingestion is slower than expected.

**Mitigation:**
- Increase sleep duration
- Add polling for assertion availability
- Use synchronous ingestion for tests
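
The polling option could be built on a small generic helper like the one below. It is a sketch using only the standard library; `poll_until` is a hypothetical name, and the probe closure stands in for whatever query the test would issue to check that its assertions are visible.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Calls `probe` every `interval` until it returns `Some(value)` or `timeout` elapses.
/// Returns `None` if the deadline passes without a successful probe.
fn poll_until<T>(
    timeout: Duration,
    interval: Duration,
    mut probe: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(value) = probe() {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        sleep(interval);
    }
}
```

Compared with a longer fixed sleep, polling returns as soon as the data is available and fails with a clear timeout when ingestion stalls.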

### 3. Time Travel Query Not Implemented

**Issue:** `as_of` parameter not yet implemented in query handlers.

**Impact:** Scenario 4 will be skipped in Week 4.

**Plan:** Document as future work, implement in Phase 6.

### 4. Database State Isolation

**Issue:** Tests write to same database, may interfere with each other.

**Current Approach:** Use unique subject/predicate combinations per test.

**Risk:** If tests are re-run, old data may pollute results.

**Mitigation:**
- Use unique identifiers per test run
- Add database cleanup between tests
- Use temporary databases per test
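
For the first mitigation, a run-specific suffix can be appended to each subject. The sketch below is hypothetical: `unique_subject` is an illustrative helper, and whether StemeDB accepts the extra `:uat-...` segment in a subject is an assumption that would need checking against the API.

```rust
use std::process;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Process-wide counter so two calls in the same millisecond still differ.
static COUNTER: AtomicU64 = AtomicU64::new(0);

/// Builds a subject that is unique per call by appending the current time,
/// the process id, and a sequence number to `base`.
fn unique_subject(base: &str) -> String {
    let millis = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis())
        .unwrap_or(0);
    let seq = COUNTER.fetch_add(1, Ordering::Relaxed);
    format!("{}:uat-{}-{}-{}", base, millis, process::id(), seq)
}
```

Each test would then assert and query against the suffixed subject, so data left over from earlier runs never matches.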

## Success Criteria

Week 4 is considered **complete** when:

1. ✅ Integration test suite compiles without errors
2. ✅ All API endpoints referenced in scenarios exist
3. ✅ DTOs match API contracts
4. ⏳ 3 out of 4 scenarios execute successfully (Time Travel deferred)
5. ⏳ UAT markdown files updated with actual results
6. ⏳ All assertion checks pass (conflict scores, counts, status)
7. ⏳ Test output captured and documented

## Next Steps After Week 4

1. **Week 5:** Implement Time Travel Query (`as_of` parameter)
2. **Week 6:** Add more complex scenarios (multi-tier disagreement, vote integration)
3. **Week 7:** Performance testing (1000s of assertions per tier)
4. **Week 8:** End-to-end workflows (extract → ingest → query → visualize)

## Running the Tests

### Quick Start

```bash
# Terminal 1: Start StemeDB API
cd /Users/jordanwashburn/Workspace/orchard9/stemedb  # repository root; adjust to your checkout
cargo run -p stemedb-api

# Terminal 2: Run UAT tests (after API is ready)
cd /Users/jordanwashburn/Workspace/orchard9/stemedb  # repository root; adjust to your checkout
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat run_all_uat_scenarios -- --ignored --nocapture
```

### Individual Scenario Execution

The trailing name is a substring filter, so a prefix such as `uat_glp1` selects `uat_glp1_muscle_loss_contradiction`.

```bash
# GLP-1 Muscle Loss
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_glp1 -- --ignored --nocapture

# Gastroparesis
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_gastroparesis -- --ignored --nocapture

# Layered Consensus
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_layered -- --ignored --nocapture
```

### CI/CD Integration

For automated testing in CI:

```yaml
# .github/workflows/uat.yml
name: Consumer Health UAT

on: [push, pull_request]

jobs:
  uat:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: stable

      - name: Start StemeDB API
        run: |
          cargo build -p stemedb-api
          cargo run -p stemedb-api &
          sleep 5

      - name: Run UAT Tests
        run: |
          STEMEDB_API_URL=http://localhost:18180 \
            cargo test --test consumer_health_uat -- --ignored --nocapture
```

## Documentation Updates Required

After successful execution:

1. **ai-lookup/index.md:** Add link to UAT results
2. **crates/stemedb-ontology/README.md:** Document test suite
3. **roadmap.md:** Mark Week 4 as complete
4. **uat/consumer-health/README.md:** Update with test results
5. **Each scenario .md file:** Fill in "Actual" columns with real data

---

**Prepared by:** Claude (Defensive Systems Architect)
**Date:** 2026-02-05
**Status:** Infrastructure Complete - Ready for Execution