stemedb/uat/consumer-health/gastroparesis-multi-source.md

# UAT: Gastroparesis Multi-Source (Source-Class Hierarchy)

**Date:** YYYY-MM-DD
**Feature:** Tiered Source Authority
**Status:** [ ] PASS / [ ] FAIL / [ ] BLOCKED

## Scenario

Multiple sources report on semaglutide gastroparesis risk:
- **1 FDA report (Tier 0):** Documents known gastroparesis cases
- **100 Reddit posts (Tier 5):** Anecdotal "stomach paralysis" reports

Despite the 100x volume difference, the FDA report should dominate in authority-weighted resolution.

## Acceptance Criteria

| Criterion | Expected | Met? |
|-----------|----------|------|
| FDA assertion ingested | Tier 0 | [ ] |
| 100 Reddit assertions ingested | Tier 5 | [ ] |
| Authority lens winner | FDA report | [ ] |
| Volume doesn't override authority | Tier 0 > 100x Tier 5 | [ ] |
| Layered view shows both | Per-tier breakdown | [ ] |

## Test Matrix

| Step | Action | Expected | Actual | Status |
|------|--------|----------|--------|--------|
| 1 | Ingest FDA report | Hash returned | | [ ] |
| 2 | Ingest 100 Reddit posts | 100 hashes returned | | [ ] |
| 3 | Query Authority lens | FDA wins | | [ ] |
| 4 | Query Layered lens | Per-tier breakdown | | [ ] |
| 5 | Verify tier priority | FDA still wins after 400 more Tier 5 posts | | [ ] |

## Authority Weight Formula

```
effective_weight = base_confidence * tier_multiplier

Tier 0 (Regulatory): multiplier = 1.0
Tier 5 (Anecdotal):  multiplier = 0.1
```

Summing effective weight per tier under a pure multiplier model gives:

- 100 Tier 5 posts at 0.8 confidence: 100 * 0.8 * 0.1 = 8.0 effective weight
- 1 Tier 0 report at 0.95 confidence: 1 * 0.95 * 1.0 = 0.95 effective weight

Summed this way, volume would win (8.0 > 0.95), so the multiplier alone cannot produce the expected result.

**Correction:** The Authority lens uses tier as a categorical priority, not just a multiplier:
- Tier 0 candidates are considered first
- Tier 1 is considered only if no Tier 0 candidate exists
- and so on for each lower tier

This ensures regulatory sources always win when present.
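
The difference between the two resolution strategies can be sketched in shell. This is an illustration under stated assumptions, not StemeDB's implementation: the helper names are hypothetical, candidates are modelled as plain `<tier> <confidence> <label>` lines, and only the two multipliers given above (Tier 0 = 1.0, Tier 5 = 0.1) are used.

```bash
# Hypothetical helpers. Each input line is "<tier> <confidence> <label>".

# Strategy A: sum effective_weight per tier, report the tier with the largest total.
winner_by_summed_weight() {
  awk '
    BEGIN { mult[0] = 1.0; mult[5] = 0.1 }   # multipliers from this document
    { total[$1] += $2 * mult[$1] }
    END {
      best = ""
      for (t in total) if (best == "" || total[t] > total[best]) best = t
      printf "tier=%s total=%.2f\n", best, total[best]
    }'
}

# Strategy B: take the lowest-numbered tier present, highest confidence first.
winner_by_tier_priority() {
  LC_ALL=C sort -k1,1n -k2,2nr | head -n 1
}

# Scenario from this UAT: 1 Tier 0 report at 0.95, 100 Tier 5 posts at 0.80.
candidates() {
  echo "0 0.95 fda-report"
  for i in $(seq 1 100); do echo "5 0.80 reddit-$i"; done
}

candidates | winner_by_summed_weight   # prints: tier=5 total=8.00
candidates | winner_by_tier_priority   # prints: 0 0.95 fda-report
```

Strategy B ignores how many lower-tier candidates exist, which is the property the acceptance criteria test for.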

## Setup Commands

```bash
# Start StemeDB
cargo run --bin stemedb-api &
sleep 2
```
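
Because `cargo run` may need to compile before the server starts listening, a fixed `sleep 2` can be too short. A minimal sketch of a readiness poll follows, assuming the API answers any HTTP request on port 18180. The helper name `wait_for_url` and its parameters are hypothetical and not part of StemeDB.

```bash
# wait_for_url URL ATTEMPTS DELAY_SECONDS
# Returns 0 as soon as URL yields any HTTP response, 1 if all attempts fail.
wait_for_url() {
  local url=$1 attempts=$2 delay=$3
  local n=0
  while [ "$n" -lt "$attempts" ]; do
    # Without -f, curl exits 0 on any HTTP status (even 404), which is
    # enough to show the server is accepting connections.
    if curl -s -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 1
}

# Example (not run here):
#   wait_for_url http://localhost:18180/ 30 1 || echo "API did not start"
```

Polling the root path is an assumption; if the API exposes a dedicated health endpoint, that URL would be a better target.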

## Test Commands

### Step 1: Ingest FDA Report (Tier 0)

```bash
curl -X POST http://localhost:18180/v1/assertions \
  -H "Content-Type: application/json" \
  -d '{
    "subject": "Semaglutide",
    "predicate": "gastroparesis_risk",
    "object": {"Text": "Documented cases reported. Monitor patients."},
    "confidence": 0.95,
    "source_class": "Regulatory",
    "source_hash": "0000000000000000000000000000000000000000000000000000000000000020"
  }'
```

**Expected:** Hash returned
**Actual:**
**Status:** [ ]

### Step 2: Ingest 100 Reddit Posts (Tier 5)

```bash
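created=0
for i in $(seq 1 100); do
  # Unique 64-character source hash per post. The 1000 offset keeps these
  # hashes from colliding with the FDA report's hash (...0020) at i=20.
  HASH=$(printf '%064d' "$((1000 + i))")
  # -f makes curl fail on HTTP errors so only successful posts are counted
  if curl -sf -X POST http://localhost:18180/v1/assertions \
    -H "Content-Type: application/json" \
    -d "{
      \"subject\": \"Semaglutide\",
      \"predicate\": \"gastroparesis_risk\",
      \"object\": {\"Text\": \"My stomach stopped working after taking Ozempic\"},
      \"confidence\": 0.80,
      \"source_class\": \"Anecdotal\",
      \"source_hash\": \"$HASH\"
    }" > /dev/null; then
    created=$((created + 1))
  fi
done
echo "Created $created of 100 anecdotal assertions"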
```

**Expected:** 100 assertions created
**Actual:**
**Status:** [ ]

### Step 3: Query with Authority Lens

```bash
curl "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=authority"
```

**Expected:** Winner is FDA report (source_class = Regulatory)
**Actual:**
**Status:** [ ]
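
To turn the visual check into a PASS/FAIL line, the response can be piped through a small filter. This is a rough sketch: `check_winner_class` is a hypothetical helper, and it assumes the Authority lens response contains the winner's `source_class` as a JSON string field and does not also list losing candidates. If the actual response shape differs, the pattern needs adjusting.

```bash
# check_winner_class EXPECTED_CLASS
# Reads a JSON response on stdin and prints PASS if it contains
# "source_class": "<EXPECTED_CLASS>", otherwise FAIL.
check_winner_class() {
  if grep -Eq "\"source_class\"[[:space:]]*:[[:space:]]*\"$1\""; then
    echo PASS
  else
    echo FAIL
  fi
}

# Example (not run here):
#   curl -s "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=authority" \
#     | check_winner_class Regulatory
```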

### Step 4: Query with Layered Consensus Lens

```bash
curl "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=layered-consensus"
```

**Expected:**
```json
{
  "tiers": [
    {"tier": 0, "source_class": "Regulatory", "candidates_count": 1, "winner": {...}},
    {"tier": 5, "source_class": "Anecdotal", "candidates_count": 100, "winner": {...}}
  ],
  "overall_winner": {...},  // FDA report
  "overall_conflict_score": 0.0,  // Tiers agree on direction
  "total_candidates": 101
}
```
**Actual:**
**Status:** [ ]

### Step 5: Verify Tier Priority (Not Just Weight)

Confirm that even if we add more anecdotal posts, the FDA report still wins.

```bash
# Add 400 more Reddit posts (total 500)
for i in $(seq 101 500); do
  HASH=$(printf '%064d' $i)
  curl -s -X POST http://localhost:18180/v1/assertions \
    -H "Content-Type: application/json" \
    -d "{
      \"subject\": \"Semaglutide\",
      \"predicate\": \"gastroparesis_risk\",
      \"object\": {\"Text\": \"Ozempic gave me stomach problems\"},
      \"confidence\": 0.95,
      \"source_class\": \"Anecdotal\",
      \"source_hash\": \"$HASH\"
    }" > /dev/null
done

# Query again
curl "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=authority"
```

**Expected:** FDA report STILL wins despite 500 anecdotal posts
**Actual:**
**Status:** [ ]

## Sign-Off Checklist

- [ ] Regulatory assertion stored at Tier 0
- [ ] Anecdotal assertions stored at Tier 5
- [ ] Authority lens uses tier priority (not just weight)
- [ ] Volume of low-tier sources doesn't override high-tier
- [ ] Layered view shows per-tier breakdown

## Notes

*Key insight: Authority is categorical (tier priority), not just weighted. Tier 0 always wins when present, regardless of lower-tier volume.*

---

**Tester:**
**Date:**
**Result:**