jordan 8f6506b70a feat: Aphoria scan modes + stemedb-ontology crate + consumer health UAT

Major additions:
- Staged scanning modes (working tree, staged, committed) with git integration
- Drift detection for baseline vs current state comparisons
- Hosted API handlers for policy CRUD operations via StemeDB API
- stemedb-ontology crate with domain definitions and medical extractors
- Consumer health vertical UAT scenarios (GLP-1, gastroparesis, etc.)
- Aphoria development skill documentation

Code organization:
- Split large files into focused modules to stay under 500-line limit
- Extracted config tests, episteme helpers/drift/aliases, API helpers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-04 21:57:33 -07:00


UAT: Gastroparesis Multi-Source (Source-Class Hierarchy)

Date: YYYY-MM-DD
Feature: Tiered Source Authority
Status: [ ] PASS / [ ] FAIL / [ ] BLOCKED

Scenario

Multiple sources report on semaglutide gastroparesis risk:

1 FDA report (Tier 0): Documents known gastroparesis cases
100 Reddit posts (Tier 5): Anecdotal "stomach paralysis" reports

Despite the 100x volume difference, the FDA report should dominate in authority-weighted resolution.

Acceptance Criteria

Criterion	Expected	Met?
FDA assertion ingested	Tier 0	[ ]
100 Reddit assertions ingested	Tier 5	[ ]
Authority lens winner	FDA report	[ ]
Volume doesn't override authority	Tier 0 > 100x Tier 5	[ ]
Layered view shows both	Per-tier breakdown	[ ]

Test Matrix

Step	Action	Expected	Status
1	Ingest FDA report	Hash returned	[ ]
2	Ingest 100 Reddit posts	100 hashes returned	[ ]
3	Query Authority lens	FDA wins	[ ]
4	Query Layered lens	Per-tier breakdown	[ ]
5	Verify tier priority	FDA still wins after 400 more Tier 5 posts	[ ]

Authority Weight Formula

effective_weight = base_confidence * tier_multiplier

Tier 0 (Regulatory): multiplier = 1.0
Tier 5 (Anecdotal):  multiplier = 0.1

100 Tier 5 posts at 0.8 confidence = 100 * 0.8 * 0.1 = 8.0 effective weight
1 Tier 0 report at 0.95 confidence = 1 * 0.95 * 1.0 = 0.95 effective weight

If effective weights were simply summed per tier, volume would win (8.0 > 0.95), so a multiplier alone cannot produce the expected outcome.

Correction: the Authority lens uses tier as a categorical priority, not just a multiplier:

Tier 0 candidates are considered first
Only if no Tier 0 candidate exists are Tier 1 candidates considered
The same rule applies to each lower tier in turn

This ensures regulatory sources always win when present.
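The difference between the two resolution rules can be checked offline with a few lines of shell. This is a minimal sketch, not StemeDB's implementation: the function names and the "tier weight" line format are illustrative assumptions, and only the multipliers and confidences come from the figures above.

```shell
#!/usr/bin/env bash
# Sketch only: contrasts summed multiplicative weights with categorical tier priority.
# All function names here are hypothetical helpers, not StemeDB APIs.

# effective_weight COUNT CONFIDENCE MULTIPLIER -> summed weight with two decimals
effective_weight() {
  awk -v n="$1" -v c="$2" -v m="$3" 'BEGIN { printf "%.2f\n", n * c * m }'
}

# Reads "tier total_weight" lines; prints the tier with the largest total weight.
winner_by_weight() {
  awk '{ if (NR == 1 || $2 + 0 > best + 0) { best = $2; tier = $1 } } END { print tier }'
}

# Reads the same lines; prints the lowest tier number present (highest authority).
winner_by_tier() {
  awk '{ if (NR == 1 || $1 + 0 < tier + 0) tier = $1 } END { print tier }'
}

reg=$(effective_weight 1 0.95 1.0)     # one Tier 0 report
anec=$(effective_weight 100 0.8 0.1)   # one hundred Tier 5 posts

printf '0 %s\n5 %s\n' "$reg" "$anec" | winner_by_weight   # volume wins: tier 5
printf '0 %s\n5 %s\n' "$reg" "$anec" | winner_by_tier     # authority wins: tier 0
```

Under the summed-weight rule the Tier 5 total dominates, while the tier-priority rule selects Tier 0 regardless of how large the lower-tier total grows.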

Setup Commands

# Start StemeDB
cargo run --bin stemedb-api &
sleep 2
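Because `cargo run` may need to compile before the server starts listening, a fixed two-second sleep can be too short. A minimal sketch of a retry helper follows, assuming only that the API accepts connections on port 18180 as in the test commands; `wait_for` is a hypothetical helper and no health endpoint is assumed.

```shell
#!/usr/bin/env bash
# Sketch only: retry a command until it succeeds or the attempts run out.

# wait_for TRIES DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
wait_for() {
  local tries=$1 delay=$2 attempt
  shift 2
  for attempt in $(seq 1 "$tries"); do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example (not executed here): poll the API port once per second for up to 60 seconds.
# curl exits 0 once any HTTP response is received, whatever the status code.
# wait_for 60 1 curl -s -o /dev/null http://localhost:18180/
```

Polling the port keeps the setup step independent of build time, at the cost of treating any HTTP response as "ready".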

Test Commands

Step 1: Ingest FDA Report (Tier 0)

curl -X POST http://localhost:18180/v1/assertions \
  -H "Content-Type: application/json" \
  -d '{
    "subject": "Semaglutide",
    "predicate": "gastroparesis_risk",
    "object": {"Text": "Documented cases reported. Monitor patients."},
    "confidence": 0.95,
    "source_class": "Regulatory",
    "source_hash": "0000000000000000000000000000000000000000000000000000000000000020"
  }'

Expected: Hash returned
Actual:
Status: [ ]

Step 2: Ingest 100 Reddit Posts (Tier 5)

for i in $(seq 1 100); do
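  # Give each post a distinct 64-character source_hash. The "a" prefix keeps
  # post 20 from reusing the FDA report's source_hash from Step 1.
  HASH=$(printf 'a%063d' "$i")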
  curl -s -X POST http://localhost:18180/v1/assertions \
    -H "Content-Type: application/json" \
    -d "{
      \"subject\": \"Semaglutide\",
      \"predicate\": \"gastroparesis_risk\",
      \"object\": {\"Text\": \"My stomach stopped working after taking Ozempic\"},
      \"confidence\": 0.80,
      \"source_class\": \"Anecdotal\",
      \"source_hash\": \"$HASH\"
    }" > /dev/null
done
echo "Created 100 anecdotal assertions"

Expected: 100 assertions created
Actual:
Status: [ ]
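Before running the loop against a live server, the generated `source_hash` values can be sanity-checked offline. The sketch below assumes each anecdotal post should carry a unique 64-character hash that differs from the Regulatory hash used in Step 1; `hash_report` is a hypothetical helper, and the printf format is passed as an argument so the check applies to whichever padding scheme the loop uses.

```shell
#!/usr/bin/env bash
# Sketch only: offline check of printf-generated source_hash values.

# source_hash of the FDA assertion from Step 1
FDA_HASH="0000000000000000000000000000000000000000000000000000000000000020"

# hash_report FORMAT FIRST LAST [RESERVED]
# Prints: "<generated> <unique> <wrong_length> <clashes_with_reserved>"
hash_report() {
  local fmt=$1 first=$2 last=$3 reserved=${4:-}
  local i
  for i in $(seq "$first" "$last"); do
    printf "$fmt\n" "$i"
  done | awk -v reserved="$reserved" '
    {
      n++
      key = $0 ""                      # force string handling of digit-only lines
      if (!(key in seen)) { seen[key] = 1; unique++ }
      if (length(key) != 64) bad_len++
      if (reserved != "" && key == reserved) clashes++
    }
    END { printf "%d %d %d %d\n", n, unique, bad_len, clashes }'
}

# A plain zero-padded counter reproduces the Step 1 hash at i=20.
hash_report '%064d' 1 100 "$FDA_HASH"    # prints: 100 100 0 1
# A one-character prefix avoids the clash while keeping 64 characters.
hash_report 'a%063d' 1 100 "$FDA_HASH"   # prints: 100 100 0 0
```

Whether a shared `source_hash` across source classes affects resolution depends on how StemeDB interprets that field, so the last column is best read as a warning rather than a failure.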

Step 3: Query with Authority Lens

curl "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=authority"

Expected: Winner is FDA report (source_class = Regulatory)
Actual:
Status: [ ]

Step 4: Query with Layered Consensus Lens

curl "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=layered-consensus"

Expected:

{
  "tiers": [
    {"tier": 0, "source_class": "Regulatory", "candidates_count": 1, "winner": {...}},
    {"tier": 5, "source_class": "Anecdotal", "candidates_count": 100, "winner": {...}}
  ],
  "overall_winner": {...},  // FDA report
  "overall_conflict_score": 0.0,  // Tiers agree on direction
  "total_candidates": 101
}

Actual:
Status: [ ]
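One property of the expected response can be verified mechanically: the per-tier `candidates_count` values should add up to `total_candidates`. The sketch below assumes the response uses exactly the field names shown in the expected output; `sum_candidates` is a hypothetical helper, and the sample payload replaces the elided winner objects with empty placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: compare the sum of per-tier candidates_count with total_candidates.

# Reads a layered-consensus response on stdin.
# Prints: "<sum of candidates_count> <total_candidates>"
sum_candidates() {
  awk '
    {
      line = $0
      # Collect every candidates_count on this line.
      while (match(line, /"candidates_count"[ ]*:[ ]*[0-9]+/)) {
        field = substr(line, RSTART, RLENGTH)
        sub(/^[^:]*:[ ]*/, "", field)
        sum += field
        line = substr(line, RSTART + RLENGTH)
      }
      if (match($0, /"total_candidates"[ ]*:[ ]*[0-9]+/)) {
        field = substr($0, RSTART, RLENGTH)
        sub(/^[^:]*:[ ]*/, "", field)
        total = field
      }
    }
    END { print sum + 0, total + 0 }'
}

# Sample payload: placeholder winners, counts taken from the expected output.
sample='{
  "tiers": [
    {"tier": 0, "source_class": "Regulatory", "candidates_count": 1, "winner": {}},
    {"tier": 5, "source_class": "Anecdotal", "candidates_count": 100, "winner": {}}
  ],
  "total_candidates": 101
}'

printf '%s\n' "$sample" | sum_candidates   # prints: 101 101

# Against a running server (not executed here):
# curl -s "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=layered-consensus" | sum_candidates
```

The pattern matching avoids a dependency on a JSON parser; it is adequate for this flat check but would not handle nested or reordered structures reliably.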

Step 5: Verify Tier Priority (Not Just Weight)

Confirm that even if we add more anecdotal posts, the FDA report still wins.

# Add 400 more Reddit posts (total 500)
for i in $(seq 101 500); do
  HASH=$(printf '%064d' $i)
  curl -s -X POST http://localhost:18180/v1/assertions \
    -H "Content-Type: application/json" \
    -d "{
      \"subject\": \"Semaglutide\",
      \"predicate\": \"gastroparesis_risk\",
      \"object\": {\"Text\": \"Ozempic gave me stomach problems\"},
      \"confidence\": 0.95,
      \"source_class\": \"Anecdotal\",
      \"source_hash\": \"$HASH\"
    }" > /dev/null
done

# Query again
curl "http://localhost:18180/v1/query?subject=Semaglutide&predicate=gastroparesis_risk&lens=authority"

Expected: FDA report STILL wins despite 500 anecdotal posts
Actual:
Status: [ ]

Sign-Off Checklist

[ ] Regulatory assertion stored at Tier 0
[ ] Anecdotal assertions stored at Tier 5
[ ] Authority lens uses tier priority (not just weight)
[ ] Volume of low-tier sources doesn't override high-tier
[ ] Layered view shows per-tier breakdown

Notes

Key insight: Authority is categorical (tier priority), not just weighted. Tier 0 always wins when present, regardless of lower-tier volume.

Tester:
Date:
Result:
