stemedb/uat/consumer-health/WEEK4_EXECUTION_PLAN.md

Week 4 UAT Execution Plan - Consumer Health

Date: 2026-02-05
Milestone: stemedb-ontology Week 4 - UAT scenarios documented and verified
Status: Infrastructure Ready

Objective

Validate the four critical Consumer Health UAT scenarios programmatically:

  1. GLP-1 Muscle Loss Contradiction (Skeptic Lens)
  2. Gastroparesis Multi-Source (Source Hierarchy)
  3. Layered Consensus (Per-Tier Positions)
  4. Time Travel Query (as_of Snapshot)

Infrastructure Created

Integration Test Suite

Location: crates/stemedb-ontology/tests/consumer_health_uat.rs (relative to the stemedb repository root)

Purpose: Programmatic validation of UAT scenarios against a running StemeDB API instance.

Features:

  • HTTP client for API calls
  • DTO structures matching API contracts
  • Assertion helpers for validation
  • Structured test output with pass/fail/skip status
  • Environment-aware API URL configuration
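The environment-aware URL configuration could look roughly like the sketch below. It is an illustration, not the suite's actual code: the helper names `resolve_api_url` and `api_url_from_env` are hypothetical, and the fallback to `http://localhost:18180` is an assumption based on the address used in the commands in this plan.

```rust
use std::env;

/// Assumed fallback, taken from the commands in this plan.
const DEFAULT_API_URL: &str = "http://localhost:18180";

/// Hypothetical helper: choose the API base URL from an optional override.
/// Empty or whitespace-only overrides fall back to the default, and a
/// trailing slash is trimmed so paths such as "/v1/assert" can be appended.
fn resolve_api_url(override_url: Option<&str>) -> String {
    let raw = match override_url.map(str::trim) {
        Some(url) if !url.is_empty() => url,
        _ => DEFAULT_API_URL,
    };
    raw.trim_end_matches('/').to_string()
}

/// Reads STEMEDB_API_URL from the process environment.
fn api_url_from_env() -> String {
    resolve_api_url(env::var("STEMEDB_API_URL").ok().as_deref())
}
```

Keeping the resolution logic separate from the environment lookup makes it testable without mutating process-wide state.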

Test Execution

# Start StemeDB API
cargo run -p stemedb-api &

# Wait for startup
sleep 2

# Run individual scenarios
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_glp1_muscle_loss_contradiction -- --ignored --nocapture

STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_gastroparesis_multi_source -- --ignored --nocapture

STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_layered_consensus -- --ignored --nocapture

# Run all scenarios
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat run_all_uat_scenarios -- --ignored --nocapture

Scenario Details

1. GLP-1 Muscle Loss Contradiction

UAT File: glp1-muscle-loss-contradiction.md
Test Function: uat_glp1_muscle_loss_contradiction()
Status: Ready to Execute

What it tests:

  • Two peer-reviewed studies with opposing conclusions coexist
  • Skeptic Lens surfaces both claims without averaging
  • Conflict score >= 0.5 for binary disagreement
  • Status = "Contested"
  • Both Boolean values present in claims array

API Endpoints:

  • POST /v1/assert - Create Study A and Study B assertions
  • GET /v1/skeptic?subject=Semaglutide:MuscleMass&predicate=muscle_sparing_effect

Expected Outcome:

{
  "status": "Contested",
  "conflict_score": >= 0.5,
  "claims": [
    {"value": {"Boolean": false}, "weight_share": ~0.51},
    {"value": {"Boolean": true}, "weight_share": ~0.49}
  ],
  "candidates_count": 2
}

Validation Checks:

  • 2 candidates returned
  • 2 distinct claims
  • Conflict score >= 0.5
  • Status = "Contested"
  • Both true and false values present
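The validation checks above can be sketched as a single function that collects every failed check rather than stopping at the first. The `SkepticView` struct is a hypothetical, reduced stand-in for the real response DTO (which is not shown in this plan), and `check_glp1_contradiction` is an illustrative name.

```rust
/// Hypothetical, simplified view of the Skeptic Lens response.
struct SkepticView {
    status: String,
    conflict_score: f64,
    /// Boolean value carried by each distinct claim.
    claim_values: Vec<bool>,
    candidates_count: usize,
}

/// Returns the list of failed checks; an empty list means the scenario passes.
fn check_glp1_contradiction(view: &SkepticView) -> Vec<String> {
    let mut failures = Vec::new();
    if view.candidates_count != 2 {
        failures.push(format!("expected 2 candidates, got {}", view.candidates_count));
    }
    if view.claim_values.len() != 2 {
        failures.push(format!("expected 2 distinct claims, got {}", view.claim_values.len()));
    }
    // Written as a negation so that a NaN score also counts as a failure.
    if !(view.conflict_score >= 0.5) {
        failures.push(format!("conflict score {} is below 0.5", view.conflict_score));
    }
    if view.status != "Contested" {
        failures.push(format!("expected status Contested, got {}", view.status));
    }
    if !(view.claim_values.contains(&true) && view.claim_values.contains(&false)) {
        failures.push("expected both true and false claim values".to_string());
    }
    failures
}
```

Collecting all failures in one pass gives more useful test output when several expectations break at once.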

2. Gastroparesis Multi-Source

UAT File: gastroparesis-multi-source.md
Test Function: uat_gastroparesis_multi_source()
Status: Ready to Execute

What it tests:

  • Regulatory source (Tier 0) dominates despite 100x volume of anecdotal (Tier 5)
  • Source hierarchy uses tier priority, not just weighted voting
  • Layered view shows per-tier breakdown

API Endpoints:

  • POST /v1/assert - Create 1 FDA + 100 Reddit assertions
  • GET /v1/layered?subject=Semaglutide&predicate=gastroparesis_risk

Expected Outcome:

{
  "tiers": [
    {"tier": 0, "source_class": "Regulatory", "candidates_count": 1, ...},
    {"tier": 5, "source_class": "Anecdotal", "candidates_count": 100, ...}
  ],
  "overall_winner": {...},  // From Tier 0
  "total_candidates": 101
}

Validation Checks:

  • 101 total candidates
  • Tier 0 present with 1 candidate
  • Tier 5 present with 100 candidates
  • Overall winner from Tier 0
  • Tier structure correct
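To illustrate why tier priority differs from volume-based voting, here is a minimal sketch. It assumes that a lower tier number means higher authority (consistent with Tier 0 being Regulatory and Tier 5 being Anecdotal in this scenario). `TierSummary`, `winning_tier`, and `largest_tier` are hypothetical names; the actual resolution logic in StemeDB may differ.

```rust
/// Hypothetical, reduced form of one tier in the layered view.
#[derive(Debug, Clone, PartialEq)]
struct TierSummary {
    tier: u8,
    source_class: String,
    candidates_count: usize,
}

/// Tier priority: the populated tier with the lowest number wins,
/// regardless of how many candidates lower-authority tiers contain.
fn winning_tier(tiers: &[TierSummary]) -> Option<&TierSummary> {
    tiers
        .iter()
        .filter(|t| t.candidates_count > 0)
        .min_by_key(|t| t.tier)
}

/// For contrast only: pure volume voting picks the most populated tier.
fn largest_tier(tiers: &[TierSummary]) -> Option<&TierSummary> {
    tiers.iter().max_by_key(|t| t.candidates_count)
}
```

With one Regulatory assertion and 100 Anecdotal ones, `winning_tier` selects Tier 0 while `largest_tier` selects Tier 5, which is the distinction this scenario is meant to verify.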

3. Layered Consensus

UAT File: layered-consensus.md
Test Function: uat_layered_consensus()
Status: Ready to Execute

What it tests:

  • Per-tier breakdown shows all populated tiers
  • Within-tier conflict calculated (Tier 1 contested, Tier 5 unanimous)
  • Cross-tier conflict calculated
  • Overall winner from highest authority tier

API Endpoints:

  • POST /v1/assert - Create 2 Clinical (conflicting) + 50 Anecdotal (unanimous)
  • GET /v1/layered?subject=Semaglutide:BodyComposition&predicate=lean_mass_preserved

Expected Outcome:

{
  "tiers": [
    {
      "tier": 1,
      "source_class": "Clinical",
      "candidates_count": 2,
      "conflict_score": > 0.5  // Contested within tier
    },
    {
      "tier": 5,
      "source_class": "Anecdotal",
      "candidates_count": 50,
      "conflict_score": < 0.1  // Unanimous within tier
    }
  ],
  "total_candidates": 52
}

Validation Checks:

  • 52 total candidates
  • Tier 1 conflict > 0.5
  • Tier 5 conflict < 0.1
  • Both tiers present
  • Overall winner from Tier 1
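A sketch of how these checks might be expressed follows. `TierResult` is a hypothetical, reduced stand-in for the per-tier DTO, and the `winner_tier` argument assumes the test can determine which tier the overall winner came from; the plan does not show the shape of `overall_winner`, so that part is an assumption.

```rust
/// Hypothetical per-tier result, reduced to the fields checked here.
struct TierResult {
    tier: u8,
    conflict_score: f64,
}

/// Looks up a tier by number, reporting a readable error if it is absent.
fn find_tier(tiers: &[TierResult], tier: u8) -> Result<&TierResult, String> {
    tiers
        .iter()
        .find(|t| t.tier == tier)
        .ok_or_else(|| format!("Tier {tier} missing from response"))
}

/// Applies the Layered Consensus checks; returns the first failure found.
fn check_layered_consensus(
    tiers: &[TierResult],
    total_candidates: usize,
    winner_tier: Option<u8>,
) -> Result<(), String> {
    if total_candidates != 52 {
        return Err(format!("expected 52 total candidates, got {total_candidates}"));
    }
    let clinical = find_tier(tiers, 1)?;
    let anecdotal = find_tier(tiers, 5)?;
    // Negated comparisons so that a NaN score fails the check.
    if !(clinical.conflict_score > 0.5) {
        return Err(format!("Tier 1 conflict {} is not > 0.5", clinical.conflict_score));
    }
    if !(anecdotal.conflict_score < 0.1) {
        return Err(format!("Tier 5 conflict {} is not < 0.1", anecdotal.conflict_score));
    }
    if winner_tier != Some(1) {
        return Err(format!("expected overall winner from Tier 1, got {winner_tier:?}"));
    }
    Ok(())
}
```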

4. Time Travel Query

UAT File: time-travel-query.md
Test Function: uat_time_travel_query()
Status: ⊘ Not Yet Implemented

What it tests:

  • Query knowledge graph as it existed at a specific timestamp
  • Historical snapshot returns only assertions before as_of date
  • Audit trail and debugging capabilities

API Endpoints:

  • GET /v1/query?subject=...&predicate=...&as_of=<timestamp>

Blocked By:

  • Implementation of as_of parameter in query handlers
  • Timestamp filtering in query engine

Next Steps:

  1. Add as_of: Option<u64> to query parameters
  2. Filter assertions by timestamp in query engine
  3. Update UAT test to use actual API
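Steps 1 and 2 can be sketched in isolation as below. This is not the query engine's code: the `Assertion` struct is a hypothetical minimal record, the timestamp unit is assumed to match `as_of`, and treating the cutoff as inclusive (`<=`) is an assumption that should be settled when the feature is specified.

```rust
/// Hypothetical minimal assertion record; only the fields needed here.
#[derive(Debug, Clone, PartialEq)]
struct Assertion {
    subject: String,
    predicate: String,
    /// Recorded time, assumed to use the same unit as `as_of`.
    timestamp: u64,
}

/// Returns the assertions visible at `as_of`.
/// `None` means "current state", so nothing is filtered out.
/// The inclusive bound is an assumption, not confirmed behavior.
fn visible_as_of(assertions: &[Assertion], as_of: Option<u64>) -> Vec<&Assertion> {
    assertions
        .iter()
        .filter(|a| as_of.map_or(true, |cutoff| a.timestamp <= cutoff))
        .collect()
}
```

Using `Option<u64>` keeps existing queries unchanged: callers that omit `as_of` get the same results as before.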

Execution Checklist

Pre-Execution

  • Integration test suite created
  • Test compilation verified
  • API endpoints confirmed to exist
  • Data structures validated against API DTOs
  • StemeDB API server running
  • Database initialized
  • Ingest worker running (for assertion processing)

Execution

  • Run GLP-1 Muscle Loss Contradiction test
  • Capture test output
  • Update glp1-muscle-loss-contradiction.md with results
  • Run Gastroparesis Multi-Source test
  • Capture test output
  • Update gastroparesis-multi-source.md with results
  • Run Layered Consensus test
  • Capture test output
  • Update layered-consensus.md with results
  • Document Time Travel Query as blocked
  • Create issue for Time Travel Query implementation

Post-Execution

  • All passing tests have markdown files updated with actual results
  • Failing tests have issues created
  • Week 4 sign-off in roadmap
  • Update stemedb-ontology README with UAT status

Known Issues / Risks

1. Signature Requirement

Issue: API requires valid Ed25519 signatures for all assertions.

Current Approach: Using dummy signatures (0000... for agent_id and signature).

Risk: If signature verification is enforced, tests will fail.

Mitigation (any one of):

  • Add a test-mode flag that disables signature verification
  • Generate valid signatures in test helper
  • Use a test agent keypair

2. Assertion Ingestion Delay

Issue: Assertions go through WAL → Ingest Worker → Index Store.

Current Approach: Sleep for 2-3 seconds after submitting assertions, before querying.

Risk: Race conditions if ingestion is slower than expected.

Mitigation:

  • Increase sleep duration
  • Add polling for assertion availability
  • Use synchronous ingestion for tests
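The polling option can be sketched with the standard library alone. `poll_until` is a hypothetical helper; in the real suite the closure would issue a query against the API and return true once the expected assertions are visible.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Calls `check` repeatedly until it returns true or `timeout` elapses.
/// Returns true if the condition was met before the deadline.
fn poll_until<F>(timeout: Duration, interval: Duration, mut check: F) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if check() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(interval);
    }
}
```

Compared with a fixed sleep, this returns as soon as ingestion finishes and fails with a clear timeout when it does not.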

3. Time Travel Query Not Implemented

Issue: as_of parameter not yet implemented in query handlers.

Impact: Scenario 4 will be skipped in Week 4.

Plan: Document as future work, implement in Phase 6.

4. Database State Isolation

Issue: Tests write to the same database and may interfere with each other.

Current Approach: Use unique subject/predicate combinations per test.

Risk: If tests are re-run, old data may pollute results.

Mitigation:

  • Use unique identifiers per test run
  • Add database cleanup between tests
  • Use temporary databases per test
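The first option, unique identifiers per test run, might look like the sketch below. Both function names are hypothetical, and it assumes the API accepts an arbitrary suffix on a subject string, which this plan does not confirm.

```rust
use std::process;
use std::time::{SystemTime, UNIX_EPOCH};

/// Builds a run identifier from the wall clock and the process id.
fn new_run_id() -> String {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    format!("{:x}-{}", nanos, process::id())
}

/// Appends the run id to a base subject so repeated runs do not share data.
fn scoped_subject(base: &str, run_id: &str) -> String {
    format!("{base}-run-{run_id}")
}
```

Generating the run id once per test process and passing it to every scenario keeps the assertions and the queries of a single run aligned.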

Success Criteria

Week 4 is considered complete when:

  1. Integration test suite compiles without errors
  2. All API endpoints referenced in scenarios exist
  3. DTOs match API contracts
  4. 3 out of 4 scenarios execute successfully (Time Travel deferred)
  5. UAT markdown files updated with actual results
  6. All assertion checks pass (conflict scores, counts, status)
  7. Test output captured and documented
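Criterion 4 can be expressed as a small tally over scenario outcomes. The `Outcome` enum mirrors the pass/fail/skip statuses mentioned earlier but is a hypothetical type, and treating any failure as disqualifying is an interpretation of the criteria rather than something this plan states explicitly.

```rust
/// Hypothetical outcome of a single scenario.
#[derive(Debug, Clone, PartialEq)]
enum Outcome {
    Pass,
    Fail(String),
    Skip(String),
}

/// True when at least `required` scenarios passed and none failed.
/// Skipped scenarios (such as the deferred Time Travel query) are tolerated.
fn meets_pass_threshold(results: &[(&str, Outcome)], required: usize) -> bool {
    let passed = results
        .iter()
        .filter(|(_, o)| matches!(o, Outcome::Pass))
        .count();
    let any_failed = results.iter().any(|(_, o)| matches!(o, Outcome::Fail(_)));
    passed >= required && !any_failed
}

/// Formats one line of structured output per scenario.
fn summary_line(name: &str, outcome: &Outcome) -> String {
    match outcome {
        Outcome::Pass => format!("[PASS] {name}"),
        Outcome::Fail(reason) => format!("[FAIL] {name}: {reason}"),
        Outcome::Skip(reason) => format!("[SKIP] {name}: {reason}"),
    }
}
```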

Next Steps After Week 4

  1. Week 5: Implement Time Travel Query (as_of parameter)
  2. Week 6: Add more complex scenarios (multi-tier disagreement, vote integration)
  3. Week 7: Performance testing (1000s of assertions per tier)
  4. Week 8: End-to-end workflows (extract → ingest → query → visualize)

Running the Tests

Quick Start

# Terminal 1: Start StemeDB API
cd /Users/jordanwashburn/Workspace/orchard9/stemedb
cargo run -p stemedb-api

# Terminal 2: Run UAT tests (after API is ready)
cd /Users/jordanwashburn/Workspace/orchard9/stemedb
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat run_all_uat_scenarios -- --ignored --nocapture

Individual Scenario Execution

# GLP-1 Muscle Loss
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_glp1 -- --ignored --nocapture

# Gastroparesis
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_gastroparesis -- --ignored --nocapture

# Layered Consensus
STEMEDB_API_URL=http://localhost:18180 cargo test --test consumer_health_uat uat_layered -- --ignored --nocapture

CI/CD Integration

For automated testing in CI:

# .github/workflows/uat.yml
name: Consumer Health UAT

on: [push, pull_request]

jobs:
  uat:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Swapped in for actions-rs/toolchain@v1, which is no longer maintained
      - uses: dtolnay/rust-toolchain@stable

      - name: Start StemeDB API
        run: |
          cargo build -p stemedb-api
          cargo run -p stemedb-api &
          sleep 5

      - name: Run UAT Tests
        run: |
          STEMEDB_API_URL=http://localhost:18180 \
          cargo test --test consumer_health_uat -- --ignored --nocapture

Documentation Updates Required

After successful execution:

  1. ai-lookup/index.md: Add link to UAT results
  2. crates/stemedb-ontology/README.md: Document test suite
  3. roadmap.md: Mark Week 4 as complete
  4. uat/consumer-health/README.md: Update with test results
  5. Each scenario .md file: Fill in "Actual" columns with real data

Prepared by: Claude (Defensive Systems Architect)
Date: 2026-02-05
Status: Infrastructure Complete - Ready for Execution