stemedb/uat/consumer-health/glp1-visual-anchoring.md
# UAT: Visual Anchoring (pHash Validation)
**Date:** YYYY-MM-DD
**Feature:** Perceptual Hash Provenance
**Status:** [ ] PASS / [ ] FAIL / [ ] BLOCKED
## Scenario
An OCR-extracted claim from a PDF table needs validation against the original visual. The perceptual hash (pHash) of the source image allows:
1. Detecting if the source has been tampered with
2. Fuzzy-matching similar screenshots
3. Tracing provenance back to the original visual evidence
## Acceptance Criteria
| Criterion | Expected | Met? |
|-----------|----------|------|
| Assertion stored with pHash | visual_hash populated | [ ] |
| Same image = same pHash | Hamming distance = 0 | [ ] |
| Similar image = close pHash | Hamming distance < 10 | [ ] |
| Different image = far pHash | Hamming distance > 20 | [ ] |
| Query by pHash similarity | Returns matching assertions | [ ] |
## Test Matrix
| Step | Action | Expected | Actual | Status |
|------|--------|----------|--------|--------|
| 1 | Ingest assertion with pHash | Hash returned | | [ ] |
| 2 | Query by exact pHash | Assertion returned | | [ ] |
| 3 | Query by similar pHash | Assertion returned (fuzzy) | | [ ] |
| 4 | Query by different pHash | No match | | [ ] |
## pHash Background
Perceptual hashing creates a fingerprint of visual content that:
- Survives JPEG compression
- Survives minor cropping/resizing
- Distinguishes semantically different images
We use an 8-byte (64-bit) pHash, written here as 16 hex digits. Hamming distance, the number of bit positions in which two hashes differ, measures similarity:
- 0 = identical
- < 10 = visually similar
- > 20 = different images
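The distances quoted in this UAT can be checked locally before running the queries. The following is a minimal sketch, not StemeDB's implementation: `hamming_distance` and `classify_distance` are hypothetical helper names, and the "unspecified" label for distances from 10 to 20 is an assumption, since the bands above leave that range undefined.

```bash
# Hamming distance between two 64-bit pHashes given as 16 hex digits.
# Works one hex digit (4 bits) at a time, which sidesteps bash's
# signed 64-bit integer arithmetic.
hamming_distance() {
  local a="$1" b="$2"
  # Number of set bits for each 4-bit value 0..15.
  local -a popcount=(0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4)
  local i x dist=0
  if (( ${#a} != 16 || ${#b} != 16 )); then
    echo "expected two 16-digit hex strings" >&2
    return 1
  fi
  for (( i = 0; i < 16; i++ )); do
    x=$(( 0x${a:i:1} ^ 0x${b:i:1} ))
    dist=$(( dist + popcount[x] ))
  done
  echo "$dist"
}

# Map a distance onto the bands listed above.
# "unspecified" (10..20) is an assumed label for the undefined range.
classify_distance() {
  local d="$1"
  if (( d == 0 )); then echo "identical"
  elif (( d < 10 )); then echo "similar"
  elif (( d > 20 )); then echo "different"
  else echo "unspecified"
  fi
}

hamming_distance a1b2c3d4e5f60718 a1b2c3d4e5f60719   # → 1
hamming_distance a1b2c3d4e5f60718 1234567890abcdef   # → 37
```

The two calls use the hash pairs from Steps 3 and 4, so the results confirm that the probes fall in the "similar" and "different" bands respectively.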
## Setup Commands
```bash
# Start StemeDB
cargo run --bin stemedb-api &
sleep 2
```
## Test Commands
### Step 1: Ingest Assertion with Visual Hash
```bash
# pHash of a hypothetical FDA label table screenshot
# In real usage, this would be computed from the actual image
PHASH_HEX="a1b2c3d4e5f60718"
curl -X POST http://localhost:18180/v1/assertions \
  -H "Content-Type: application/json" \
  -d "{
    \"subject\": \"Semaglutide\",
    \"predicate\": \"adverse_event_rate\",
    \"object\": {\"Number\": 0.043},
    \"confidence\": 0.98,
    \"source_class\": \"Regulatory\",
    \"visual_hash\": \"$PHASH_HEX\"
  }"
```
**Expected:** Hash returned
**Actual:**
**Status:** [ ]
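If the escaped quotes in the request body become hard to maintain, the payload can be generated by a small helper and piped into `curl`. This is only a sketch: `build_assertion_payload` is a hypothetical name, and the fields are copied from the Step 1 request rather than from an API schema.

```bash
# Hypothetical helper: print the Step 1 request body for a given pHash.
# An unquoted here-document keeps the JSON readable while still
# expanding ${phash}.
build_assertion_payload() {
  local phash="$1"
  cat <<EOF
{
  "subject": "Semaglutide",
  "predicate": "adverse_event_rate",
  "object": {"Number": 0.043},
  "confidence": 0.98,
  "source_class": "Regulatory",
  "visual_hash": "${phash}"
}
EOF
}

# Usage against the same endpoint as Step 1 (-d @- reads the body from stdin):
#   build_assertion_payload "$PHASH_HEX" |
#     curl -X POST http://localhost:18180/v1/assertions \
#          -H "Content-Type: application/json" -d @-
```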
### Step 2: Query by Exact pHash
```bash
curl "http://localhost:18180/v1/query?visual_hash=a1b2c3d4e5f60718"
```
**Expected:** Returns the assertion from Step 1
**Actual:**
**Status:** [ ]
### Step 3: Query by Similar pHash (Hamming distance < 10)
```bash
# Slightly different pHash (1 bit flipped: last byte 0x18 -> 0x19)
curl "http://localhost:18180/v1/query?visual_hash=a1b2c3d4e5f60719&phash_threshold=10"
```
**Expected:** Returns the assertion (fuzzy match)
**Actual:**
**Status:** [ ]
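To probe the fuzzy match at other known distances, a query hash can be derived from the stored one by flipping a chosen number of bits; flipping N distinct bits gives a Hamming distance of exactly N. The helper below is a hypothetical sketch (`flip_low_bits` is not part of StemeDB) that flips the N lowest bits of a 16-hex-digit hash.

```bash
# Hypothetical helper: flip the N lowest bits (0 <= N <= 32) of a
# 16-hex-digit pHash. Only the low 32-bit half is touched, so the
# arithmetic stays well inside bash's signed 64-bit integers.
flip_low_bits() {
  local hash="$1" n="$2"
  local high="${hash:0:8}" low="${hash:8:8}"
  local mask=$(( (1 << n) - 1 ))   # N one-bits, e.g. N=2 -> 0b11
  printf '%s%08x\n' "$high" $(( 0x$low ^ mask ))
}

flip_low_bits a1b2c3d4e5f60718 2   # → a1b2c3d4e5f6071b (distance 2)
flip_low_bits a1b2c3d4e5f60718 9   # → a1b2c3d4e5f606e7 (distance 9)
```

Whether a probe at exactly the threshold distance matches depends on whether `phash_threshold` is inclusive, which this UAT does not state; a probe generated with N=10 is one way to find out.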
### Step 4: Query by Different pHash (Hamming distance > 20)
```bash
# Completely different pHash
curl "http://localhost:18180/v1/query?visual_hash=1234567890abcdef&phash_threshold=10"
```
**Expected:** No results (too different)
**Actual:**
**Status:** [ ]
## Sign-Off Checklist
- [ ] visual_hash field stored in assertion
- [ ] Exact pHash match works
- [ ] Fuzzy pHash match within threshold works
- [ ] Different pHash correctly excluded
- [ ] pHash indexed for efficient lookup
## Notes
*pHash computation happens client-side (during extraction). StemeDB stores and indexes the hash but doesn't compute it.*
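
Because the hash arrives from the client, a cheap format check before submission can catch truncated or malformed values. The sketch below assumes the 16-hex-digit encoding used in the commands above; `is_valid_phash` is a hypothetical name, and whether StemeDB accepts uppercase digits is not stated here.

```bash
# Hypothetical client-side check: succeed only for exactly 16 hex digits
# (8 bytes). Uppercase digits are accepted here as an assumption.
is_valid_phash() {
  [[ "$1" =~ ^[0-9a-fA-F]{16}$ ]]
}

# Example: refuse to submit a malformed hash.
PHASH_HEX="a1b2c3d4e5f60718"
if is_valid_phash "$PHASH_HEX"; then
  echo "pHash format OK"
else
  echo "pHash must be 16 hex digits" >&2
fi
```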
---
**Tester:**
**Date:**
**Result:**