stemedb/crates/stemedb-ontology
jml e95c978481 feat(aphoria): add inline claim markers and claim enrichment infrastructure
This commit implements Phase 17 of the Aphoria roadmap, adding:

**Inline Claim Markers (@aphoria:claim):**
- New extractor for detecting inline markers in comments
- Pending markers tracked in .aphoria/pending_markers.toml
- CLI commands: list-markers, formalize-marker, reject-marker
- Support for all major comment styles (Rust, Python, SQL, etc.)
- Auto-sync during scan (configurable)

**Claim Enrichment:**
- ClaimEnrichment type with source attribution (inline, extractor, manual)
- EnrichedClaimInfo with full enrichment metadata
- Extended AuthoredClaim with optional enrichment field
- API endpoints for enriched claim queries
- Dashboard UI components (enrichment badge, verdict badge)

**Enhanced Extractor Trait:**
- verifiable_predicates() method for declaring (tail_path, predicate) pairs
- 10 security extractors now implement verifiable_predicates
- Enables claim suggester skill to find unclaimed patterns

**Documentation:**
- Phase 17 summary with complete implementation details
- Gap fixes summary documenting 8 closed vision gaps
- Updated CLI reference with new commands
- New aphoria-docs skill for documentation maintenance
- Updated roadmap with Phase 17 completion

**Integration:**
- ClaimsFile support for claim enrichment persistence
- Pattern aggregate store support for enrichment queries
- Dashboard filters and display for enrichment metadata
- API handlers for list-markers and enrichment queries

**Tests:**
- New gap_fixes_integration test suite
- Corpus enricher module with best practices ingestion

Closes: VG-005, VG-017, VG-018, VG-019, VG-020, VG-021, VG-022, VG-023

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 20:18:20 +00:00
src feat(aphoria): add inline claim markers and claim enrichment infrastructure 2026-02-08 20:18:20 +00:00
tests feat: Aphoria enterprise features + ontology SDK + file length compliance 2026-02-05 12:55:29 -07:00
Cargo.toml feat: Aphoria enterprise features + ontology SDK + file length compliance 2026-02-05 12:55:29 -07:00
README.md feat: Aphoria security extractors + LLM evaluation architecture + ontology docs 2026-02-05 15:22:55 -07:00

stemedb-ontology

Domain Ontology Layer for Episteme - defines how subject strings are structured for each predicate type and domain. Because every source that reports on the same thing produces the same canonical subject, conflicting assertions collide on that subject and can be detected.

Module Overview

| Module | Purpose |
| --- | --- |
| `domain.rs` | Domain, EntityType, PredicateSchema, SourceTier builders |
| `subject.rs` | SubjectBuilder for canonical subject construction |
| `validator.rs` | Validates assertions against domain rules |
| `client.rs` | HTTP client for StemeDB API |
| `dto/` | Request/response DTOs for API communication |
| `pharma/` | Pharmaceutical domain (reference implementation) |

Quick Start

CLI Usage (steme-pharma)

# Build the CLI
cargo build --release -p stemedb-ontology

# Ingest FDA label data
./target/release/steme-pharma ingest semaglutide,tirzepatide

# Ingest with mock conflicts for testing
./target/release/steme-pharma ingest semaglutide --with-conflicts

# Query conflicts (Skeptic lens - default)
./target/release/steme-pharma query "Semaglutide:Type2Diabetes" hba1c_reduction_percent

# Query with source hierarchy (Layered Consensus)
./target/release/steme-pharma query "Semaglutide:Type2Diabetes" weight_loss_percent --mode layered

# Compare two drugs
./target/release/steme-pharma compare \
    "Semaglutide:Type2Diabetes" \
    "Tirzepatide:Type2Diabetes" \
    --predicate hba1c_reduction_percent

# Explore available predicates for a subject
./target/release/steme-pharma explore "Semaglutide:Type2Diabetes"

# Validate a subject/predicate combination
./target/release/steme-pharma validate "Semaglutide:Type2Diabetes" hba1c_reduction_percent

# JSON output (for scripting)
./target/release/steme-pharma --format json query "Semaglutide" nausea_rate

Programmatic Usage

use stemedb_ontology::{pharma, SubjectBuilder, Validator};
use stemedb_ontology::client::StemeClient;
use stemedb_ontology::pharma::extractors::{FdaLabelExtractor, MedicalExtractor, SourceInput};
use ed25519_dalek::SigningKey;
use rand::rngs::OsRng;

// Load the pharma domain definition
let domain = pharma::definition();

// Build a subject using the ontology
let schema = domain.get_schema("efficacy").unwrap();
let mut entities = std::collections::HashMap::new();
entities.insert("Drug".to_string(), "Semaglutide".to_string());
entities.insert("Indication".to_string(), "Type2Diabetes".to_string());
let subject = SubjectBuilder::build(schema, &entities, &domain).unwrap();
assert_eq!(subject, "Semaglutide:Type2Diabetes");

// Validate assertions
let validator = Validator::new(&domain);
let result = validator.validate("hba1c_reduction_percent", &subject, 0.95);
assert!(result.is_ok());

// Extract and ingest claims
// (the `.await?` calls below must run inside an async fn that returns a Result)
let client = StemeClient::new("http://localhost:18180");
let extractor = FdaLabelExtractor::new();
let signing_key = SigningKey::generate(&mut OsRng);
let agent_id = signing_key.verifying_key().to_bytes();
let hlc = uhlc::HLCBuilder::new().build();

let claims = extractor.extract(&SourceInput::DrugName("semaglutide".into())).await?;
for claim in claims {
    let assertion = claim.to_assertion(&signing_key, agent_id, &hlc);
    let hash = client.assert(&assertion).await?;
    println!("Ingested: {}", hash);
}

// Query for conflicts
let skeptic = client.skeptic("Semaglutide:Type2Diabetes", "hba1c_reduction_percent").await?;
println!("Conflict score: {}", skeptic.conflict_score);

Architecture

                  ┌─────────────────────────────────────┐
                  │           Domain Definition         │
                  │  (EntityTypes, Schemas, Hierarchy)  │
                  └──────────────┬──────────────────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
         v                       v                       v
┌─────────────────┐   ┌──────────────────┐   ┌──────────────────┐
│  SubjectBuilder │   │    Validator     │   │  MedicalExtractor│
│                 │   │                  │   │    (trait)       │
│ Build canonical │   │ Validate against │   │ Extract claims   │
│ subject strings │   │ domain rules     │   │ from sources     │
└────────┬────────┘   └────────┬─────────┘   └────────┬─────────┘
         │                     │                      │
         └──────────────┬──────┴──────────────────────┘
                        │
                        v
              ┌───────────────────┐
              │    StemeClient    │
              │                   │
              │ Submit assertions │
              │ Query with lenses │
              └─────────┬─────────┘
                        │
                        v
              ┌───────────────────┐
              │    StemeDB API    │
              │  :18180/v1/*      │
              └───────────────────┘

Subject Patterns

Each predicate category has its own subject pattern, so claims about the same thing resolve to the same subject string and collide as intended:

| Category | Pattern | Example | Use Case |
| --- | --- | --- | --- |
| Efficacy | `{Drug}:{Indication}` | `Semaglutide:Type2Diabetes` | Outcome measures for specific conditions |
| Safety | `{Drug}` | `Semaglutide` | Adverse events (apply across indications) |
| Mechanism | `{Drug}:{Target}` | `Semaglutide:GLP1R` | Pharmacology details |
| Comparison | `{Drug}:{Comparator}:{Indication}` | `Semaglutide:Tirzepatide:Type2Diabetes` | Head-to-head trials |
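
The patterns above can be read as simple templates: each `{Name}` placeholder is replaced by the entity of that type, and the parts are joined with `:`. The sketch below illustrates that reading only. `render_subject` is a hypothetical helper written for this example and is not the crate's `SubjectBuilder`, whose real implementation may validate entities against the domain in ways not shown here.

```rust
use std::collections::HashMap;

/// Fill a subject pattern such as "{Drug}:{Indication}" from an entity map.
/// Returns Err describing the first malformed segment or missing entity.
/// NOTE: hypothetical helper for illustration; not part of stemedb-ontology.
fn render_subject(pattern: &str, entities: &HashMap<&str, &str>) -> Result<String, String> {
    pattern
        .split(':')
        .map(|segment| {
            // Each segment is expected to look like "{Name}".
            let name = segment
                .strip_prefix('{')
                .and_then(|s| s.strip_suffix('}'))
                .ok_or_else(|| format!("malformed segment: {segment}"))?;
            // Look up the entity value for this placeholder.
            entities
                .get(name)
                .map(|value| value.to_string())
                .ok_or_else(|| format!("missing entity: {name}"))
        })
        .collect::<Result<Vec<_>, _>>()
        .map(|parts| parts.join(":"))
}
```

With `Drug = Semaglutide` and `Indication = Type2Diabetes` in the map, the efficacy pattern yields `Semaglutide:Type2Diabetes` and the safety pattern yields `Semaglutide`, so an efficacy claim and a safety claim about the same drug land on different subjects by design.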

Source Hierarchy

Claims are weighted by source authority:

| Tier | Source Class | Weight | Examples |
| --- | --- | --- | --- |
| 0 | Regulatory | 1.0 | FDA Labels, EMA Reports |
| 1 | Clinical | 0.9 | Phase III RCTs, Lancet, NEJM |
| 2 | Observational | 0.7 | Real-World Evidence, FAERS |
| 3 | Expert | 0.5 | Guidelines, ADA Standards |
| 4 | Community | 0.3 | PatientsLikeMe, Moderated Forums |
| 5 | Anecdotal | 0.1 | Reddit, Twitter, Blog Posts |
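
To make the weights concrete, the sketch below maps each tier to its weight from the table and combines reported values as a weighted mean. This is only one plausible way to apply the weights: `Tier` and `weighted_mean` are hypothetical names for this example, they are not the crate's `SourceTier` API, and the actual Layered Consensus logic in StemeDB may combine sources differently.

```rust
/// Source tiers with the weights listed in the table above.
/// NOTE: hypothetical type for illustration; not the crate's `SourceTier`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Tier {
    Regulatory,
    Clinical,
    Observational,
    Expert,
    Community,
    Anecdotal,
}

impl Tier {
    /// Authority weight for this tier.
    fn weight(self) -> f64 {
        match self {
            Tier::Regulatory => 1.0,
            Tier::Clinical => 0.9,
            Tier::Observational => 0.7,
            Tier::Expert => 0.5,
            Tier::Community => 0.3,
            Tier::Anecdotal => 0.1,
        }
    }
}

/// Weighted mean of (value, tier) pairs; returns None for empty input.
/// Assumption: weights are combined as a plain weighted average.
fn weighted_mean(claims: &[(f64, Tier)]) -> Option<f64> {
    let total_weight: f64 = claims.iter().map(|(_, tier)| tier.weight()).sum();
    if total_weight == 0.0 {
        return None;
    }
    let weighted_sum: f64 = claims.iter().map(|(value, tier)| value * tier.weight()).sum();
    Some(weighted_sum / total_weight)
}
```

Under this scheme a regulatory value of 1.5 and an anecdotal value of 2.0 combine to 1.7 / 1.1, which stays close to the regulatory figure because the anecdotal source carries one tenth of its weight.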

Adding a New Domain

See the Adding a Domain Guide for step-by-step instructions on implementing a new domain (e.g., cardiology, finance).

Testing

# Run all ontology tests
cargo test -p stemedb-ontology

# Run with output
cargo test -p stemedb-ontology -- --nocapture

# Consumer Health UAT
cargo test -p stemedb-ontology --test consumer_health_uat