jordan 1ce4004807 feat: Complete Phase 2 (The Cortex) - query, lens, and API layers

This commit adds the read path (Cortex) to complement the write path (Spine):

## Crates
- stemedb-api: HTTP API with axum + utoipa OpenAPI
  - /v1/assert, /v1/query, /v1/epoch, /v1/skeptic, /v1/trace, /v1/audit
  - Metered endpoints with quota enforcement
  - Ed25519 signature verification
- stemedb-lens: Truth resolution lenses
  - RecencyLens, ConsensusLens, ConfidenceLens
  - VoteAwareConsensusLens (Ballot Box pattern)
  - TrustAwareAuthorityLens (The Hive pattern)
  - SkepticLens (conflict analysis)
  - EpochAwareLens (paradigm-safe queries)
- stemedb-query: Query engine with materialized views

## Storage Extensions
- VoteStore: Vote aggregation with cached counts
- TrustRankStore: Agent reputation with decay
- AuditStore: Query audit trail
- IndexStore: SP/P/S index structures
- SupersessionStore: Epoch supersession chains

## SDKs
- sdk/go/steme: Go HTTP client with Ed25519 signing
- sdk/go/adk: ADK-Go tools for AI agents

## Documentation
- Updated CLAUDE.md, architecture.md, roadmap.md
- New ai-lookup entries for all services
- Use case docs for consumer health intelligence
- Arena roadmap for simulation advancement

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-01 13:22:44 -07:00


Application Layer Concepts

Audience: Teams building applications on top of Episteme.
Not covered here: database internals (see architecture.md) or the implementation roadmap (see roadmap.md).

The Boundary Principle

Episteme is a database engine. It stores assertions, resolves conflicts via lenses, and serves queries. Everything else is the application layer.

Episteme Owns

Domain	Responsibilities
Storage	WAL, KV store, indexes, content-addressing via BLAKE3
Types	Assertion, Vote, Epoch, LifecycleStage, MaterializedView
Resolution	Lenses (Recency, Consensus, Authority, Skeptic, Layered, Constraints)
Query	Filtering, audit trail, time-travel (`as_of`), decay at resolution time
Integrity	Ed25519 signature verification, checksums, epoch supersession cascades
API	HTTP endpoints: `POST /assert`, `POST /vote`, `POST /epoch`, `GET /query`

Application Layer Owns

Domain	Responsibilities
Data Acquisition	Crawlers, scrapers, API consumers (PubMed, Reddit, FAERS, etc.)
Data Transformation	NLP extraction, claim identification, confidence assignment
Classification	Source-class tier assignment, study design classification
Enrichment	DOI resolution, journal metadata, engagement metrics
Intelligence	Cluster detection, anomaly detection, signal surfacing
Presentation	Dashboards, summaries, consumer UX, LLM-powered synthesis
Orchestration	Agent swarms, reviewers, curators
Client Libraries	SDKs that wrap the HTTP API (Go, Python, etc.)
Infrastructure	Key management, budget enforcement, job scheduling

The Test

For any feature, ask: "Does this require changes to Episteme's storage, indexing, resolution, or API?"

- If yes → Database feature. It goes in Episteme.
- If no (it only calls the API) → Application layer. It goes in your app.
- If both → Split it. Episteme provides the primitive; your app provides the intelligence.

Examples

Feature	DB Primitive	App Intelligence
Source-class decay	Decay formula applied during lens resolution	Which tier a source belongs to
Cluster escalation	Accept/store/query escalation assertions	Detect clusters, compute thresholds
Disagreement dashboard	Skeptic Lens + conflict_score	UI rendering, LLM summaries
Time-travel	`as_of` query parameter	"What changed since you last looked" UX
Visual anchoring	Store + query visual_hash	Image hashing, screenshot comparison
Pharma data	Ingest via `POST /assert`	Crawl PubMed, extract claims, classify

Core Application Components

These are the building blocks that most Episteme-powered applications need.

1. Ingestion Pipeline

Transforms raw source material into signed assertions.

[Raw Source] → [Crawler/API] → [NLP Extraction] → [Classification] → [Enrichment] → [Signing] → [POST /assert]

Responsibilities:

- Fetch documents from source systems (PubMed API, Reddit API, web scraping)
- Extract claims as subject/predicate/object triples
- Assign confidence based on extraction quality
- Classify source tier (0-6)
- Enrich with structured metadata
- Sign with agent key
- Submit to Episteme

See: Consumer Health Ingestion
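The signing step of the pipeline can be sketched with Go's standard `crypto/ed25519` package. This is a minimal sketch under stated assumptions, not Episteme's wire format: the `Assertion` and `SignedAssertion` field names, the choice to sign the JSON encoding of the assertion, and the hex encoding of key and signature are all hypothetical. Check the API reference for the canonical form the server actually verifies.

```go
package main

import (
	"crypto/ed25519"
	"encoding/hex"
	"encoding/json"
	"errors"
)

// Assertion is a hypothetical payload shape. Field names are assumptions,
// not Episteme's actual wire format.
type Assertion struct {
	Subject     string  `json:"subject"`
	Predicate   string  `json:"predicate"`
	Object      string  `json:"object"`
	Confidence  float32 `json:"confidence"`
	SourceClass int     `json:"source_class"` // tier 0-6
}

// SignedAssertion carries the assertion plus the agent's public key and
// signature, both hex-encoded (an assumption).
type SignedAssertion struct {
	Assertion
	AgentKey  string `json:"agent_key"`
	Signature string `json:"signature"`
}

// KeyFromSeed derives a private key from a 32-byte seed. Real deployments
// would load the seed from their key-management system.
func KeyFromSeed(seed []byte) (ed25519.PrivateKey, error) {
	if len(seed) != ed25519.SeedSize {
		return nil, errors.New("seed must be 32 bytes")
	}
	return ed25519.NewKeyFromSeed(seed), nil
}

// SignAssertion signs the JSON encoding of the assertion. Signing the JSON
// bytes is an assumption made for this sketch.
func SignAssertion(priv ed25519.PrivateKey, a Assertion) (SignedAssertion, error) {
	msg, err := json.Marshal(a)
	if err != nil {
		return SignedAssertion{}, err
	}
	pub := priv.Public().(ed25519.PublicKey)
	return SignedAssertion{
		Assertion: a,
		AgentKey:  hex.EncodeToString(pub),
		Signature: hex.EncodeToString(ed25519.Sign(priv, msg)),
	}, nil
}

// VerifyAssertion re-encodes the assertion and checks the signature against
// the embedded public key. It mirrors what a verifier would do locally.
func VerifyAssertion(s SignedAssertion) bool {
	pub, err := hex.DecodeString(s.AgentKey)
	if err != nil || len(pub) != ed25519.PublicKeySize {
		return false
	}
	sig, err := hex.DecodeString(s.Signature)
	if err != nil {
		return false
	}
	msg, err := json.Marshal(s.Assertion)
	if err != nil {
		return false
	}
	return ed25519.Verify(ed25519.PublicKey(pub), msg, sig)
}
```

Verifying locally before submission is a cheap way to catch encoding mismatches: any change to a signed field, such as the object or the tier, invalidates the signature.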

2. Source-Class Classifier

Determines the authority tier of each source.

Tier	Source Type	Examples
0	Regulatory action	FDA label change, EMA withdrawal
1	Peer-reviewed RCT, meta-analysis	NEJM, Lancet, JAMA
2	Observational study, real-world evidence	Insurance claims, EHR studies
3	Pharmacovigilance	FAERS reports, EudraVigilance
4	Clinician anecdote, case report	Conference presentations, case series
5	Patient community	Reddit, forums, patient registries
6	Media, influencer, commercial	TikTok, news articles, pharma marketing

Implementation: Rule-based classifier or ML model that maps source metadata (URL pattern, DOI prefix, platform) to tier.

Domain-specific: Tier maps vary by vertical. The map for pharma differs from the one for finance, which in turn differs from the one for legal.
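A rule-based classifier of the kind described above can start as a lookup from URL host to tier. The sketch below is illustrative only: the `hostTiers` entries and the `ClassifyURL` function are hypothetical, and the host-to-tier assignments are assumptions chosen to match the example column of the table.

```go
package main

import (
	"net/url"
	"strings"
)

// hostRule maps a host suffix to a source tier (0-6, per the tier table).
type hostRule struct {
	suffix string
	tier   int
}

// hostTiers is an illustrative rule set, not a vetted tier map.
var hostTiers = []hostRule{
	{"nejm.org", 1},
	{"thelancet.com", 1},
	{"jamanetwork.com", 1},
	{"reddit.com", 5},
	{"tiktok.com", 6},
}

// ClassifyURL returns the tier for a source URL and whether any rule matched.
// A host matches a rule if it equals the suffix or is a subdomain of it.
func ClassifyURL(raw string) (int, bool) {
	u, err := url.Parse(raw)
	if err != nil || u.Hostname() == "" {
		return 0, false
	}
	host := strings.ToLower(u.Hostname())
	for _, r := range hostTiers {
		if host == r.suffix || strings.HasSuffix(host, "."+r.suffix) {
			return r.tier, true
		}
	}
	return 0, false
}
```

Returning a separate `bool` keeps "no rule matched" distinct from tier 0, so unmatched sources can be routed to manual review or an ML fallback. Host rules alone are coarse: a single site that publishes sources of different tiers needs path, DOI-prefix, or metadata rules as well.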

3. Background Gardener

Monitors the knowledge graph for signals that warrant attention.

Responsibilities:

- Query assertion counts by subject/predicate/tier
- Detect unusual clustering (e.g., 1,847 Tier-5 assertions in 6 months)
- Generate escalation assertions when thresholds are crossed
- Trigger TrustRank decay on schedule
- Update agent reputations based on outcomes

Episteme primitives it uses:

- `GET /query` with aggregation
- `POST /assert` for escalation assertions
- `POST /admin/decay-trust-ranks` for scheduled decay
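The threshold logic at the heart of a gardener can be sketched as a pure function over assertions that have already been fetched. Everything here is hypothetical: the `Observation`, `Rule`, and `Escalation` types, the use of Unix seconds for timestamps, and the simple count-in-window rule are assumptions for illustration. A real gardener would likely read counts from the query API rather than count client-side.

```go
package main

// Observation is a hypothetical summary of one assertion already fetched
// from Episteme.
type Observation struct {
	Subject   string
	Predicate string
	Tier      int
	SeenAt    int64 // Unix seconds
}

// Rule says: escalate when at least Threshold assertions of the given tier
// share a subject/predicate within the trailing window.
type Rule struct {
	Tier          int
	WindowSeconds int64
	Threshold     int
}

// Escalation describes a cluster that crossed the threshold. The application
// would turn each one into an escalation assertion.
type Escalation struct {
	Subject   string
	Predicate string
	Tier      int
	Count     int
}

type clusterKey struct {
	subject   string
	predicate string
}

// DetectClusters counts matching observations per subject/predicate inside
// the window ending at now and returns those at or above the threshold,
// in order of first appearance.
func DetectClusters(obs []Observation, rule Rule, now int64) []Escalation {
	cutoff := now - rule.WindowSeconds
	counts := make(map[clusterKey]int)
	var order []clusterKey
	for _, o := range obs {
		if o.Tier != rule.Tier || o.SeenAt < cutoff || o.SeenAt > now {
			continue
		}
		k := clusterKey{o.Subject, o.Predicate}
		if counts[k] == 0 {
			order = append(order, k)
		}
		counts[k]++
	}
	var out []Escalation
	for _, k := range order {
		if n := counts[k]; n >= rule.Threshold {
			out = append(out, Escalation{Subject: k.subject, Predicate: k.predicate, Tier: rule.Tier, Count: n})
		}
	}
	return out
}
```

Keeping detection free of I/O makes the thresholds easy to test and tune. Fetching observations and posting the resulting escalation assertions stay in the surrounding job.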

4. Presentation Layer

Renders Episteme query results for end users.

Options:

- Dashboard: React/Vue app that visualizes conflict scores, tier positions, timelines
- API Gateway: REST/GraphQL layer that adds business logic before returning data
- Chat Interface: LLM-powered conversational access to the knowledge graph
- Reports: Scheduled exports (PDF, email digests)

Episteme primitives it uses:

- `GET /query` with various lenses
- Layered Consensus for per-tier breakdown
- Skeptic Lens for disagreement surfacing
- `as_of` for time-travel views
- `since` for change tracking
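A presentation layer mostly composes query URLs from these primitives. The sketch below builds one with `net/url`. The parameter names `as_of` and `since` come from this document; the `subject`, `predicate`, and `lens` query-string keys, the lens value used in the usage note, and the timestamp format are assumptions, so confirm them against the OpenAPI description before relying on them.

```go
package main

import (
	"net/url"
	"strings"
)

// QueryOptions holds the inputs a presentation layer typically varies.
// Empty fields are omitted from the request.
type QueryOptions struct {
	Subject   string
	Predicate string
	Lens      string // lens name as the server expects it (assumed key: "lens")
	AsOf      string // time-travel timestamp; RFC 3339 is an assumption
	Since     string // change-tracking timestamp; RFC 3339 is an assumption
}

// BuildQueryURL appends "/query" to the base URL and encodes the non-empty
// options as query parameters. Any version prefix belongs in the base URL.
func BuildQueryURL(base string, q QueryOptions) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/query"

	v := url.Values{}
	set := func(key, val string) {
		if val != "" {
			v.Set(key, val)
		}
	}
	set("subject", q.Subject)
	set("predicate", q.Predicate)
	set("lens", q.Lens)
	set("as_of", q.AsOf)
	set("since", q.Since)

	u.RawQuery = v.Encode() // Encode sorts keys, so output is deterministic
	return u.String(), nil
}
```

For example, calling `BuildQueryURL("http://localhost:8080", QueryOptions{Subject: "example-drug", Predicate: "causes", Lens: "skeptic", AsOf: "2026-01-01T00:00:00Z"})` yields a URL whose query string carries `as_of`, `lens`, `predicate`, and `subject`. A dashboard can issue the same query with and without `as_of` to render a before-and-after view.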

5. Agent Integration

Connects AI agents to Episteme for reading and writing knowledge.

ADK-Go Tool Example:

// QueryKnowledge is an agent tool that queries the knowledge graph for a
// subject/predicate pair, resolved through the named lens.
func QueryKnowledge(ctx context.Context, subject, predicate, lens string) (string, error) {
    resp, err := episteme.Query(ctx, &QueryParams{
        Subject:   subject,
        Predicate: predicate,
        Lens:      lens,
    })
    if err != nil {
        return "", err
    }
    // Render the response in a form the agent can consume.
    return formatForAgent(resp), nil
}

// AssertFact is an agent tool that submits a new assertion in the
// "Proposed" lifecycle stage.
func AssertFact(ctx context.Context, subject, predicate, object string, confidence float32) error {
    return episteme.Assert(ctx, &AssertionRequest{
        Subject:    subject,
        Predicate:  predicate,
        Object:     ObjectText(object),
        Confidence: confidence,
        Lifecycle:  "Proposed",
    })
}

See: ADK-Go Integration Guide

Vertical-Specific Guides

Vertical	Guide	Key Components
Consumer Health	consumer-health.md	Pharma crawlers, tier classification, disagreement dashboard
Financial Due Diligence	(planned)	SEC filings, analyst extraction, invalidation cascades
Agile Agent Team	(planned)	Constraints lens, lifecycle workflow, audit trail

Quick Reference: What Goes Where

If you need to...	Episteme provides...	You build...
Store conflicting facts	Assertion type, append-only DAG	Nothing — just POST assertions
Resolve conflicts	Lenses (Recency, Consensus, Skeptic, Layered)	Lens selection logic
Query historical state	`as_of` parameter	Time-travel UI
Track changes	`since` parameter + MV changelog	Notification system
Weight by source authority	`source_class` field + source-aware decay	Tier classifier
Detect emerging signals	Skeptic Lens + conflict_score	Gardener (threshold logic)
Show per-tier consensus	Layered Consensus Lens	Dashboard UI
Extract claims from papers	Nothing — pre-assertion transform	NLP pipeline
Sign assertions	Signature verification	Agent wallet / key management
Generate summaries	Structured query responses	LLM summarizer
