# UAT: Gastroparesis Multi-Source (Source-Class Hierarchy)

**Date:** YYYY-MM-DD
**Feature:** Tiered Source Authority
**Status:** [ ] PASS / [ ] FAIL / [ ] BLOCKED

## Scenario

Multiple sources report on semaglutide gastroparesis risk:
- **1 FDA report (Tier 0):** Documents known gastroparesis cases
- **100 Reddit posts (Tier 5):** Anecdotal "stomach paralysis" reports

Despite the 100x volume difference, the FDA report should win under authority-based resolution.

## Acceptance Criteria

| Criterion | Expected | Met? |
|-----------|----------|------|
| FDA assertion ingested | Tier 0 | [ ] |
| 100 Reddit assertions ingested | Tier 5 | [ ] |
| Authority lens winner | FDA report | [ ] |
| Volume doesn't override authority | Tier 0 > 100x Tier 5 | [ ] |
| Layered view shows both | Per-tier breakdown | [ ] |

## Test Matrix

| Step | Action | Expected | Actual | Status |
|------|--------|----------|--------|--------|
| 1 | Ingest FDA report | Hash returned | | [ ] |
| 2 | Ingest 100 Reddit posts | 100 hashes returned | | [ ] |
| 3 | Query Authority lens | FDA wins | | [ ] |
| 4 | Query Layered lens | Per-tier breakdown | | [ ] |
| 5 | Verify tier priority | FDA still wins after 500 Tier 5 posts | | [ ] |

## Authority Weight Formula

A purely multiplicative weighting would look like this:

```
effective_weight = base_confidence * tier_multiplier

Tier 0 (Regulatory): multiplier = 1.0
Tier 5 (Anecdotal):  multiplier = 0.1
```

Summed per tier for this scenario:

- 100 Tier 5 posts at 0.8 confidence = 100 * 0.8 * 0.1 = 8.0 effective weight
- 1 Tier 0 report at 0.95 confidence = 1 * 0.95 * 1.0 = 0.95 effective weight

Under this formula alone, volume would win (8.0 > 0.95), so the multiplier cannot be the whole algorithm.

**Correction:** Authority lens uses tier as a categorical priority, not just a multiplier:
- Tier 0 candidates are considered first
- Tier 1 is considered only if no Tier 0 candidate exists
- and so on through the lower tiers

This ensures regulatory sources always win when present.

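The arithmetic above can be reproduced with a small shell sketch. The function names `tier_weight` and `pick_winner` are hypothetical and are not StemeDB's implementation. The multipliers are the ones listed above, and breaking ties within a tier by confidence is an assumption made only for this illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not StemeDB's implementation.

# Summed naive weight of a tier: count * confidence * multiplier.
tier_weight() {
  awk -v n="$1" -v c="$2" -v m="$3" 'BEGIN { printf "%.2f\n", n * c * m }'
}

# Categorical priority: read "tier confidence label" lines on stdin and
# print the label from the lowest-numbered tier. Ties within a tier go to
# the higher confidence (an assumption made for this sketch).
pick_winner() {
  LC_ALL=C sort -k1,1n -k2,2nr | head -n 1 | awk '{ print $3 }'
}

anecdotal=$(tier_weight 100 0.80 0.1)   # 100 Reddit posts
regulatory=$(tier_weight 1 0.95 1.0)    # 1 FDA report
echo "naive weights: anecdotal=$anecdotal regulatory=$regulatory"

winner=$(printf '%s\n' '5 0.80 reddit' '0 0.95 fda' '5 0.95 reddit' | pick_winner)
echo "tier-priority winner: $winner"
```

The naive sums favor the anecdotal tier, while the tier-first ordering picks the FDA entry no matter how many Tier 5 lines are fed in.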
## Setup Commands

```bash
# Start StemeDB
cargo run --bin stemedb-api &
sleep 2
```

## Test Commands

### Step 1: Ingest FDA Report (Tier 0)

```bash
curl -X POST http://localhost:18180/v1/assertions \
  -H "Content-Type: application/json" \
  -d '{
    "subject": "Semaglutide",
    "predicate": "gastroparesis_risk",
    "object": {"Text": "Documented cases reported. Monitor patients."},
    "confidence": 0.95,
    "source_class": "Regulatory",
    "source_hash": "0000000000000000000000000000000000000000000000000000000000000020"
  }'
```

**Expected:** Hash returned
**Actual:**
**Status:** [ ]

### Step 2: Ingest 100 Reddit Posts (Tier 5)

```bash
for i in $(seq 1 100); do
  # Unique 64-character source_hash per post. The leading 1 keeps post 20
  # from reusing the FDA report's source_hash (...0020).
  HASH=$(printf '1%063d' "$i")
  curl -s -X POST http://localhost:18180/v1/assertions \
    -H "Content-Type: application/json" \
    -d "{
      \"subject\": \"Semaglutide\",
      \"predicate\": \"gastroparesis_risk\",
      \"object\": {\"Text\": \"My stomach stopped working after taking Ozempic\"},
      \"confidence\": 0.80,
      \"source_class\": \"Anecdotal\",
      \"source_hash\": \"$HASH\"
    }" > /dev/null
done
echo "Created 100 anecdotal assertions"
```

**Expected:** 100 assertions created
**Actual:**
**Status:** [ ]

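Each loop iteration derives `source_hash` from the loop counter, so it can be worth checking the generated values before ingesting anything. The sketch below is a hypothetical pre-flight helper (`check_hashes` is not part of StemeDB). It assumes that a post sharing the FDA report's `source_hash` is undesirable; how the API actually treats such a duplicate is not stated in this document.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of StemeDB.
# FDA_HASH is the source_hash used for the Tier 0 report in Step 1.
FDA_HASH="0000000000000000000000000000000000000000000000000000000000000020"

# check_hashes FORMAT FIRST LAST
# Generates one hash per counter value with printf FORMAT and reports any
# value that is not 64 characters long or that equals FDA_HASH.
# Returns 0 when every hash passes, 1 otherwise.
check_hashes() {
  local fmt=$1 first=$2 last=$3 i hash status=0
  for i in $(seq "$first" "$last"); do
    hash=$(printf "$fmt" "$i")
    if [ "${#hash}" -ne 64 ]; then
      echo "post $i: hash length ${#hash}, expected 64"
      status=1
    elif [ "$hash" = "$FDA_HASH" ]; then
      echo "post $i: same source_hash as the FDA report"
      status=1
    fi
  done
  return "$status"
}
```

Running `check_hashes '%064d' 1 100` flags post 20, because a plain zero-padded counter reproduces the FDA hash from Step 1. A format with a distinguishing prefix, such as `'1%063d'`, passes for the same range.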
### Step 3: Query with Authority Lens

```bash
curl "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=authority"
```

**Expected:** Winner is FDA report (source_class = Regulatory)
**Actual:**
**Status:** [ ]

### Step 4: Query with Layered Consensus Lens

```bash
curl "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=layered-consensus"
```

**Expected:**
```json
{
  "tiers": [
    {"tier": 0, "source_class": "Regulatory", "candidates_count": 1, "winner": {...}},
    {"tier": 5, "source_class": "Anecdotal", "candidates_count": 100, "winner": {...}}
  ],
  "overall_winner": {...},  // FDA report
  "overall_conflict_score": 0.0,  // Tiers agree on direction
  "total_candidates": 101
}
```
**Actual:**
**Status:** [ ]

### Step 5: Verify Tier Priority (Not Just Weight)

Confirm that the FDA report still wins after more anecdotal posts are added, even when they carry the same confidence (0.95) as the FDA report.

```bash
# Add 400 more Reddit posts (total 500)
for i in $(seq 101 500); do
  # Unique 64-character source_hash per post
  HASH=$(printf '1%063d' "$i")
  curl -s -X POST http://localhost:18180/v1/assertions \
    -H "Content-Type: application/json" \
    -d "{
      \"subject\": \"Semaglutide\",
      \"predicate\": \"gastroparesis_risk\",
      \"object\": {\"Text\": \"Ozempic gave me stomach problems\"},
      \"confidence\": 0.95,
      \"source_class\": \"Anecdotal\",
      \"source_hash\": \"$HASH\"
    }" > /dev/null
done

# Query again
curl "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=authority"
```

**Expected:** FDA report STILL wins despite 500 anecdotal posts
**Actual:**
**Status:** [ ]

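Both ingestion loops send curl's output to `/dev/null`, so a rejected POST would go unnoticed and the assertion counts cannot be confirmed from the loops alone. One option is to have curl print only the HTTP status code (`-o /dev/null -w '%{http_code}\n'`, both standard curl options) and tally the results. The helper name `tally_status` and the `$BODY` variable are hypothetical, and treating any 2xx code as success is an assumption about the API.

```bash
#!/usr/bin/env bash
# Hypothetical helper; not part of StemeDB.

# tally_status EXPECTED
# Reads one HTTP status code per line on stdin, prints "ok=N failed=M", and
# returns 0 only if exactly EXPECTED requests got a 2xx code and none failed.
tally_status() {
  local expected=$1 code ok=0 failed=0
  while read -r code; do
    case "$code" in
      2[0-9][0-9]) ok=$((ok + 1)) ;;
      *)           failed=$((failed + 1)) ;;
    esac
  done
  echo "ok=$ok failed=$failed"
  [ "$ok" -eq "$expected" ] && [ "$failed" -eq 0 ]
}

# Usage inside an ingestion loop (requires a running server; BODY is a
# placeholder for the JSON payload built in the loop):
#   for i in $(seq 1 100); do
#     curl -s -o /dev/null -w '%{http_code}\n' \
#       -X POST http://localhost:18180/v1/assertions \
#       -H "Content-Type: application/json" -d "$BODY"
#   done | tally_status 100
```

curl reports `000` when it cannot connect, which this sketch counts as a failure, so a server that is not yet listening shows up in the tally as well.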
## Sign-Off Checklist

- [ ] Regulatory assertion stored at Tier 0
- [ ] Anecdotal assertions stored at Tier 5
- [ ] Authority lens uses tier priority (not just weight)
- [ ] Volume of low-tier sources doesn't override high-tier
- [ ] Layered view shows per-tier breakdown

## Notes

*Key insight: Authority is categorical (tier priority), not just weighted. Tier 0 always wins when present, regardless of lower-tier volume.*

---

**Tester:**
**Date:**
**Result:**