stemedb/uat/consumer-health/disagreement-dashboard.md
jordan 8f6506b70a feat: Aphoria scan modes + stemedb-ontology crate + consumer health UAT
2026-02-04 21:57:33 -07:00


# UAT: Disagreement Dashboard (Resolved/Active/Emerging)
**Date:** YYYY-MM-DD
**Feature:** Conflict Status Classification
**Status:** [ ] PASS / [ ] FAIL / [ ] BLOCKED
## Scenario
A "Living Review" dashboard needs to categorize assertions by conflict status:

1. **Resolved** - Had conflict, now resolved (one claim dominates or epoch superseded)
2. **Active Disagreement** - Ongoing contested claims from authoritative sources
3. **Emerging Signal** - New anecdotal cluster that may indicate unreported effect

This enables triage: researchers focus on Active Disagreement, regulators monitor Emerging Signals.
## Acceptance Criteria
| Criterion | Expected | Met? |
|-----------|----------|------|
| Resolved items identified | Status = "resolved" | [ ] |
| Active disagreement identified | Status = "active_disagreement" | [ ] |
| Emerging signal identified | Status = "emerging_signal" | [ ] |
| Dashboard summary | Counts per category | [ ] |
| Drill-down available | Full claims for each | [ ] |
## Test Matrix
| Step | Action | Expected | Actual | Status |
|------|--------|----------|--------|--------|
| 1 | Create resolved conflict | Consensus reached | | [ ] |
| 2 | Create active disagreement | Clinical studies conflict | | [ ] |
| 3 | Create emerging signal | Anecdotal cluster, no authority | | [ ] |
| 4 | Query dashboard summary | 3 categories populated | | [ ] |
| 5 | Drill into active disagreement | Full claim details | | [ ] |
## Conflict Status Definitions
| Status | Criteria |
|--------|----------|
| **Resolved** | `conflict_score < 0.1` OR single-tier unanimous OR epoch-superseded |
| **Active Disagreement** | `conflict_score >= 0.4` AND Tier 0-2 sources present on both sides |
| **Emerging Signal** | Tier 5 cluster >= 100 AND no Tier 0-2 coverage |
| **Low Priority** | Everything else (minor disagreements in low-tier sources) |
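
The table does not say which rule wins when several match. For example, the Step 3 cluster is single-tier and unanimous, so it would also satisfy the Resolved criteria. The sketch below is one possible reading, not the StemeDB implementation: `classify_conflict` and its six arguments are hypothetical, and the rule order (emerging signal first, then resolved, then active disagreement) is an assumption chosen so that the three scenarios in this UAT land in their intended categories.

```bash
# Hypothetical helper: classify one subject/predicate pair from pre-computed inputs.
# Args (all assumed, not part of the StemeDB API):
#   $1 conflict_score       float in [0, 1]
#   $2 epoch_superseded     1 if a newer epoch replaced the older claim, else 0
#   $3 single_tier_unanim   1 if all claims come from one tier and agree, else 0
#   $4 high_tier_both_sides 1 if Tier 0-2 sources appear on both sides, else 0
#   $5 high_tier_coverage   1 if any Tier 0-2 source covers the claim, else 0
#   $6 tier5_count          number of Tier 5 (anecdotal) reports
classify_conflict() {
  awk -v s="$1" -v sup="$2" -v unan="$3" -v both="$4" -v cov="$5" -v t5="$6" 'BEGIN {
    # Assumed precedence: emerging signal is checked first so that a large
    # unanimous anecdotal cluster is not reported as "resolved".
    if (t5 >= 100 && cov == 0)                  print "emerging_signal"
    else if (sup == 1 || s < 0.1 || unan == 1)  print "resolved"
    else if (s >= 0.4 && both == 1)             print "active_disagreement"
    else                                        print "low_priority"
  }'
}

# Inputs below mirror Steps 1-3; the scores are illustrative values.
classify_conflict 0.9  1 0 1 1 0     # dose history, epoch superseded
classify_conflict 0.88 0 0 1 1 0     # muscle loss debate
classify_conflict 0.0  0 1 0 0 150   # hair loss reports
```

If the real classifier applies the rules in a different order, Steps 1 and 3 are the cases most likely to change category.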
## Setup Commands
```bash
# Start StemeDB
cargo run --bin stemedb-api &
sleep 2
```
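
A fixed `sleep 2` can be too short when `cargo run` still has to compile. As a hedged alternative, the sketch below polls until the server answers. `wait_for_api` is a hypothetical helper, and it assumes the API listens on port 18180 (the port used by the test commands) and that any HTTP response, even an error status, means the server is up.

```bash
# Hypothetical helper: poll a URL until the server accepts connections.
# $1 = URL, $2 = max attempts (default 30), $3 = delay in seconds (default 1)
wait_for_api() {
  local url="$1" attempts="${2:-30}" delay="${3:-1}"
  local i
  for ((i = 1; i <= attempts; i++)); do
    # Without -f, curl exits 0 on any HTTP response, including 404.
    if curl -s -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example (not executed here):
#   wait_for_api "http://localhost:18180/" 60 1 || echo "StemeDB did not start"
```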
## Test Commands
### Step 1: Create Resolved Conflict (Dose Adjustment History)
```bash
# Old studies said 1mg was max
curl -X POST http://localhost:18180/v1/assertions \
  -H "Content-Type: application/json" \
  -d '{
    "subject": "Semaglutide",
    "predicate": "recommended_max_dose",
    "object": {"Number": 1.0},
    "confidence": 0.9,
    "source_class": "Clinical",
    "epoch": "0000000000000000000000000000000000000000000000000000000000000001"
  }'

# New FDA guidance supersedes (2.4mg approved)
curl -X POST http://localhost:18180/v1/assertions \
  -H "Content-Type: application/json" \
  -d '{
    "subject": "Semaglutide",
    "predicate": "recommended_max_dose",
    "object": {"Number": 2.4},
    "confidence": 1.0,
    "source_class": "Regulatory",
    "epoch": "0000000000000000000000000000000000000000000000000000000000000002"
  }'
```
**Expected:** Conflict resolved by epoch supersession
**Actual:**
**Status:** [ ]
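
The two epochs in this step are 64-character, zero-padded strings that differ only in the last digit. The sketch below shows one way a tester could check which assertion should win, assuming that a later epoch is simply the larger value. That ordering rule and the `epoch_supersedes` helper are assumptions for illustration. The source only states that the newer guidance supersedes the older one.

```bash
# Hypothetical helper: succeed if epoch $1 is later than epoch $2.
# Assumes both are fixed-width (64 chars), so string order equals numeric order.
epoch_supersedes() {
  local new="$1" old="$2"
  local LC_ALL=C   # plain byte-wise comparison
  if [ "${#new}" -ne 64 ] || [ "${#old}" -ne 64 ]; then
    return 2       # malformed input
  fi
  [[ "$new" > "$old" ]]
}

OLD_EPOCH=$(printf '%064d' 1)   # epoch of the 1mg clinical claim
NEW_EPOCH=$(printf '%064d' 2)   # epoch of the 2.4mg regulatory claim

if epoch_supersedes "$NEW_EPOCH" "$OLD_EPOCH"; then
  echo "2.4mg claim supersedes 1mg claim"
fi
```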
### Step 2: Create Active Disagreement (Muscle Loss Debate)
```bash
# Study A: Significant muscle loss
curl -X POST http://localhost:18180/v1/assertions \
  -H "Content-Type: application/json" \
  -d '{
    "subject": "GLP1:MuscleEffect",
    "predicate": "lean_mass_impact",
    "object": {"Text": "Significant reduction observed"},
    "confidence": 0.85,
    "source_class": "Clinical"
  }'

# Study B: Minimal muscle impact
curl -X POST http://localhost:18180/v1/assertions \
  -H "Content-Type: application/json" \
  -d '{
    "subject": "GLP1:MuscleEffect",
    "predicate": "lean_mass_impact",
    "object": {"Text": "Minimal reduction with exercise"},
    "confidence": 0.82,
    "source_class": "Clinical"
  }'
```
**Expected:** Active disagreement - two clinical studies conflict
**Actual:**
**Status:** [ ]
### Step 3: Create Emerging Signal (Hair Loss Reports)
```bash
# 150 anecdotal reports, no clinical data
for i in $(seq 1 150); do
  HASH=$(printf '%064d' $((5000 + i)))
  curl -s -X POST http://localhost:18180/v1/assertions \
    -H "Content-Type: application/json" \
    -d "{
      \"subject\": \"Semaglutide\",
      \"predicate\": \"hair_thinning_reported\",
      \"object\": {\"Boolean\": true},
      \"confidence\": 0.70,
      \"source_class\": \"Anecdotal\",
      \"source_hash\": \"$HASH\"
    }" > /dev/null
done
echo "Created 150 hair loss reports"
```
```
**Expected:** Emerging signal - anecdotal cluster without authoritative coverage
**Actual:**
**Status:** [ ]
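
The loop above derives each `source_hash` from the loop index with `printf '%064d'`. The sketch below isolates that step so the values can be inspected before posting. `make_source_hash` is a hypothetical wrapper, and the idea that each report needs a distinct hash to count as a separate source is an assumption, not something the API contract here confirms.

```bash
# Hypothetical wrapper around the hash expression used in the loop above.
# $1 = report index (1..150); prints a 64-character zero-padded string.
make_source_hash() {
  printf '%064d' "$((5000 + $1))"
}

# Count distinct hashes across all 150 reports.
unique=$(for i in $(seq 1 150); do make_source_hash "$i"; echo; done \
  | sort -u | wc -l | tr -d ' ')
echo "distinct source hashes: $unique"
```

If the count is lower than the number of reports posted, the cluster size seen by the dashboard may fall below the 100-report threshold.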
### Step 4: Query Dashboard Summary
```bash
curl "http://localhost:18180/v1/dashboard/conflicts"
```
**Expected:**
```json
{
  "summary": {
    "resolved": 1,
    "active_disagreement": 1,
    "emerging_signal": 1,
    "low_priority": 0
  },
  "items": {
    "resolved": [
      {"subject": "Semaglutide", "predicate": "recommended_max_dose", "resolution": "epoch_superseded"}
    ],
    "active_disagreement": [
      {"subject": "GLP1:MuscleEffect", "predicate": "lean_mass_impact", "conflict_score": 0.88}
    ],
    "emerging_signal": [
      {"subject": "Semaglutide", "predicate": "hair_thinning_reported", "cluster_count": 150}
    ]
  }
}
```
**Actual:**
**Status:** [ ]
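
To fill in the Actual column without reading the JSON by eye, the summary counts can be extracted with standard tools. This is a minimal sketch that assumes the response has the shape shown under Expected. `summary_count` is a hypothetical helper, and the `response` value here is a trimmed sample rather than real server output. Where `jq` is installed it is likely the more robust choice.

```bash
# Hypothetical helper: print the integer count for one summary key.
# $1 = JSON response text, $2 = key such as "active_disagreement".
# Only matches a key followed by digits, so the "items" arrays are skipped.
summary_count() {
  printf '%s' "$1" \
    | grep -o "\"$2\": *[0-9][0-9]*" \
    | head -n 1 \
    | grep -o '[0-9][0-9]*$'
}

# Sample body based on the expected summary (items omitted).
response='{"summary": {"resolved": 1, "active_disagreement": 1, "emerging_signal": 1, "low_priority": 0}}'

for key in resolved active_disagreement emerging_signal low_priority; do
  echo "$key=$(summary_count "$response" "$key")"
done
```

In a real run, `response` would instead hold the output of `curl -s` against the dashboard endpoint from this step.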
### Step 5: Drill Into Active Disagreement
```bash
curl "http://localhost:18180/v1/skeptic?subject=GLP1:MuscleEffect&predicate=lean_mass_impact"
```
**Expected:** Full claim breakdown with:
- Both clinical studies listed
- Supporting evidence for each
- Conflict score >= 0.4
- Status = "Contested"

**Actual:**
**Status:** [ ]
## Sign-Off Checklist
- [ ] Resolved conflicts identified correctly
- [ ] Active disagreements surfaced
- [ ] Emerging signals detected
- [ ] Dashboard provides summary counts
- [ ] Drill-down returns full details
## Notes
*Dashboard categories are computed at query time rather than stored. This keeps them fresh, but dashboard queries may become slower as the dataset grows.*

---
**Tester:**
**Date:**
**Result:**