stemedb/crates/stemedb-api
jordan a734be3a0d feat: Phase 7 Content Defense + code structure refactoring
Content Defense (Phase 7):
- Add SimilarityIndex with MinHash/LSH for near-duplicate detection
- Add QuarantineStore for flagged assertions awaiting admin review
- Add CircuitBreakerStore for per-agent circuit breaker state
- Add ContentDefenseLayer for ingestion pipeline integration
- Add API endpoints for quarantine and circuit breaker management
- Add research module with gap detection and documentation fetching

Code Structure Improvements:
- Extract research CLI commands to research_commands.rs
- Extract API routers to routers.rs module
- Extract key_codec extraction functions to separate module
- Extract test modules to separate files across multiple crates
- All files now under 500 line limit per pre-commit hook

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 12:44:05 -07:00

stemedb-api

HTTP API for Episteme (StemeDB) - a probabilistic knowledge graph database.

Architecture

The API follows the standard axum pattern:

  • DTOs (dto.rs) - JSON request/response types with hex-encoded binary data
  • Handlers (handlers/) - Thin HTTP handlers that delegate to underlying engines
  • State (state.rs) - Shared application state (Journal, Store)
  • Router (lib.rs) - axum router with OpenAPI support via utoipa

Write Path

POST /v1/assert → DTO → Assertion → serialize → append to WAL → return hash
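
The write path above can be sketched in memory as follows. This is a minimal illustration, not the crate's code: `Assertion`, `Journal`, `serialize`, and `content_id` are hypothetical names, the byte layout is invented for the example, and the standard library's `DefaultHasher` stands in for BLAKE3 so the block needs no external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Simplified assertion (hypothetical; the real type carries more fields).
pub struct Assertion {
    pub subject: String,
    pub predicate: String,
    pub confidence: f64,
}

/// In-memory stand-in for the WAL: records are only ever appended.
#[derive(Default)]
pub struct Journal {
    records: Vec<Vec<u8>>,
}

impl Journal {
    /// Serialize the assertion, derive its content ID, append, return the ID.
    pub fn append(&mut self, assertion: &Assertion) -> String {
        let bytes = serialize(assertion);
        let id = content_id(&bytes);
        self.records.push(bytes);
        id
    }

    /// Number of records appended so far.
    pub fn len(&self) -> usize {
        self.records.len()
    }
}

/// Write a length-prefixed string so field boundaries are unambiguous.
fn put_str(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(&(s.len() as u32).to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

/// Assumed byte layout, chosen only for this sketch.
fn serialize(a: &Assertion) -> Vec<u8> {
    let mut out = Vec::new();
    put_str(&mut out, &a.subject);
    put_str(&mut out, &a.predicate);
    out.extend_from_slice(&a.confidence.to_le_bytes());
    out
}

/// Stand-in content hash: 64-bit SipHash as 16 hex characters, NOT BLAKE3.
fn content_id(bytes: &[u8]) -> String {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    format!("{:016x}", hasher.finish())
}
```

Because the ID is derived from the serialized bytes alone, submitting identical content twice yields the same ID, while any change to a field yields a different one.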

Read Path

GET /v1/query → QueryParams → Query → QueryEngine → Lens (optional) → DTOs
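
To illustrate the optional lens step, here is a rough sketch of what a Recency-style lens might do. It assumes that "Recency" means keeping the assertion with the latest timestamp for each (subject, predicate) pair; the README does not spell out the semantics, and `Row` and `recency_lens` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Simplified query result row (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub subject: String,
    pub predicate: String,
    pub value: f64,
    pub timestamp: u64,
}

/// Assumed Recency semantics: for each (subject, predicate) pair, keep the
/// row with the greatest timestamp. On a tie, the first row seen wins.
pub fn recency_lens(rows: &[Row]) -> Vec<Row> {
    let mut latest: BTreeMap<(&str, &str), &Row> = BTreeMap::new();
    for row in rows {
        let key = (row.subject.as_str(), row.predicate.as_str());
        let replace = match latest.get(&key) {
            Some(current) => row.timestamp > current.timestamp,
            None => true,
        };
        if replace {
            latest.insert(key, row);
        }
    }
    // BTreeMap iteration returns rows ordered by (subject, predicate).
    latest.values().map(|row| (*row).clone()).collect()
}
```

The input rows are left untouched, which matches the append-only model: a lens chooses among conflicting assertions at read time and does not delete the ones it passes over.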

Running the Server

# Start the API server (defaults to http://127.0.0.1:3000)
cargo run --package stemedb-api

# With custom configuration
STEMEDB_WAL_DIR=./my-wal STEMEDB_DB_DIR=./my-db STEMEDB_BIND_ADDR=0.0.0.0:8080 cargo run --package stemedb-api

On startup, the server:

  1. Opens the Journal (WAL) and the HybridStore (KV storage)
  2. Spawns the IngestWorker background task, which tails the WAL
  3. Starts the HTTP server with OpenAPI documentation

API Documentation

Once the server is running on the default bind address, visit:

http://127.0.0.1:3000/swagger-ui

This provides interactive OpenAPI documentation for all endpoints.

Endpoints

POST /v1/assert

Create a new assertion.

Request:

{
  "subject": "Tesla_Inc",
  "predicate": "has_revenue",
  "object": {
    "type": "Number",
    "value": 96.7
  },
  "confidence": 0.95,
  "signatures": [{
    "agent_id": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20",
    "signature": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40",
    "timestamp": 1706745600
  }],
  "source_hash": "0000000000000000000000000000000000000000000000000000000000000000"
}

Response:

{
  "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "status": "created"
}

POST /v1/vote

Create a vote on an existing assertion.

Request:

{
  "assertion_hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "agent_id": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20",
  "weight": 0.8,
  "signature": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40"
}

Response:

{
  "hash": "f3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "status": "created"
}

GET /v1/query

Query assertions, with optional filters and an optional lens.

Query Parameters:

  • subject (optional) - Filter by subject entity
  • predicate (optional) - Filter by predicate/relation
  • lifecycle (optional) - Filter by lifecycle stage (Proposed, UnderReview, Approved, Deprecated, Rejected)
  • epoch (optional) - Filter by epoch (hex-encoded)
  • lens (optional) - Apply lens for conflict resolution (Recency, Consensus, Authority, VoteAwareConsensus, TrustAwareAuthority)
  • limit (optional) - Maximum results (default: 100)

Example:

GET /v1/query?subject=Tesla_Inc&predicate=has_revenue&lifecycle=Approved&lens=Recency
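
A client could assemble such a request path with a small helper like the one below. This is a sketch: `query_path` and `percent_encode` are hypothetical helpers, not part of stemedb-api. Values are percent-encoded using the RFC 3986 unreserved set, on the assumption that the server decodes standard URL-encoded query strings.

```rust
/// Build a `/v1/query` request path from key/value pairs (hypothetical helper).
pub fn query_path(params: &[(&str, &str)]) -> String {
    let mut path = String::from("/v1/query");
    for (i, (key, value)) in params.iter().enumerate() {
        // The first parameter is introduced by '?', later ones by '&'.
        path.push(if i == 0 { '?' } else { '&' });
        path.push_str(&percent_encode(key));
        path.push('=');
        path.push_str(&percent_encode(value));
    }
    path
}

/// Percent-encode every byte outside the RFC 3986 unreserved set.
fn percent_encode(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for b in s.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}
```

Called with the four pairs from the example above, the helper reproduces that path exactly. A subject containing a space, such as `Tesla Inc`, would be sent as `Tesla%20Inc`.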

Response:

{
  "assertions": [{
    "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "subject": "Tesla_Inc",
    "predicate": "has_revenue",
    "object": {
      "type": "Number",
      "value": 96.7
    },
    "confidence": 0.95,
    "lifecycle": "Approved",
    "signatures": [...],
    "timestamp": 1706745600,
    "source_hash": "0000000000000000000000000000000000000000000000000000000000000000"
  }],
  "total_count": 1,
  "has_more": false
}

GET /v1/health

Health check endpoint.

Response:

{
  "status": "healthy",
  "version": "0.1.0",
  "assertions_count": 42
}

Environment Variables

  • STEMEDB_WAL_DIR - Directory for WAL files (default: data/wal)
  • STEMEDB_DB_DIR - Directory for KV store (default: data/db)
  • STEMEDB_BIND_ADDR - HTTP server bind address (default: 127.0.0.1:3000)
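
One way to read these variables with their documented defaults is sketched below. `Config`, `from_lookup`, and `from_env` are hypothetical names, not the crate's API; only the variable names and default values come from the list above. The lookup is passed in as a closure so the logic can be exercised without touching the process environment.

```rust
use std::env;
use std::net::SocketAddr;
use std::path::PathBuf;

/// Resolved server configuration (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Config {
    pub wal_dir: PathBuf,
    pub db_dir: PathBuf,
    pub bind_addr: SocketAddr,
}

impl Config {
    /// Build a config from any key lookup, falling back to the documented defaults.
    pub fn from_lookup<F>(lookup: F) -> Result<Self, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        let wal_dir = lookup("STEMEDB_WAL_DIR").unwrap_or_else(|| "data/wal".to_string());
        let db_dir = lookup("STEMEDB_DB_DIR").unwrap_or_else(|| "data/db".to_string());
        let bind = lookup("STEMEDB_BIND_ADDR").unwrap_or_else(|| "127.0.0.1:3000".to_string());
        // Reject a malformed address up front, with the offending value in the message.
        let bind_addr = bind
            .parse::<SocketAddr>()
            .map_err(|e| format!("invalid STEMEDB_BIND_ADDR {:?}: {}", bind, e))?;
        Ok(Config {
            wal_dir: PathBuf::from(wal_dir),
            db_dir: PathBuf::from(db_dir),
            bind_addr,
        })
    }

    /// Read the configuration from the process environment.
    pub fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| env::var(key).ok())
    }
}
```

Parsing the bind address at startup means a typo in `STEMEDB_BIND_ADDR` fails immediately with a clear message, before any socket is opened.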

Binary Data Encoding

All binary data (hashes, signatures, agent IDs) is hex-encoded in JSON:

  • Assertion hash: 32 bytes (64 hex characters)
  • Agent ID (public key): 32 bytes (64 hex characters)
  • Signature: 64 bytes (128 hex characters)
  • Source hash: 32 bytes (64 hex characters)
  • Visual hash (optional): 8 bytes (16 hex characters)
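
A client can validate these fields before sending a request. The sketch below decodes a hex string into a fixed-size byte array and rejects inputs of the wrong length. `decode_hex`, `encode_hex`, and `HexError` are hypothetical helpers written against the standard library only. Accepting uppercase digits is an assumption; the README does not say whether the API does.

```rust
/// Errors from decoding a hex field (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum HexError {
    /// Length in hex characters does not equal twice the byte count.
    WrongLength { expected: usize, actual: usize },
    /// A byte at this index is not a hex digit.
    InvalidChar(usize),
}

/// Decode a hex string into exactly `N` bytes (32 for hashes and agent IDs,
/// 64 for signatures, 8 for visual hashes).
pub fn decode_hex<const N: usize>(s: &str) -> Result<[u8; N], HexError> {
    let bytes = s.as_bytes();
    if bytes.len() != N * 2 {
        return Err(HexError::WrongLength { expected: N * 2, actual: bytes.len() });
    }
    let mut out = [0u8; N];
    for (i, pair) in bytes.chunks_exact(2).enumerate() {
        let hi = nibble(pair[0]).ok_or(HexError::InvalidChar(i * 2))?;
        let lo = nibble(pair[1]).ok_or(HexError::InvalidChar(i * 2 + 1))?;
        out[i] = (hi << 4) | lo;
    }
    Ok(out)
}

/// Map one ASCII hex digit to its 4-bit value.
fn nibble(c: u8) -> Option<u8> {
    match c {
        b'0'..=b'9' => Some(c - b'0'),
        b'a'..=b'f' => Some(c - b'a' + 10),
        b'A'..=b'F' => Some(c - b'A' + 10),
        _ => None,
    }
}

/// Encode bytes as lowercase hex, matching the examples in this README.
pub fn encode_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

With the const generic, the expected size is part of the type: `decode_hex::<32>` for an agent ID and `decode_hex::<64>` for a signature cannot be mixed up without a compile error at the call site.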

Critical Rules

  • Append-Only: The API never mutates existing assertions. To revise a claim, create a new assertion.
  • Content-Addressed: Assertion ID = BLAKE3 hash of the assertion's content.
  • No Unwrap: All error handling uses ? with context (enforced by clippy).
  • Defensive Writes: All writes go through the WAL and are fsynced.
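
The last two rules can be illustrated together with a small durable-append routine. This is a sketch under stated assumptions, not the crate's WAL: `append_durable` is a hypothetical function, and the length-prefixed record layout is invented for the example (the actual WAL format is not described here). It uses `?` with added context instead of `unwrap`, and calls `sync_all` before reporting success.

```rust
use std::convert::TryFrom;
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append one length-prefixed record and fsync it before returning.
/// The 4-byte little-endian length prefix is an assumption of this sketch.
pub fn append_durable(path: &Path, record: &[u8]) -> io::Result<()> {
    let len = u32::try_from(record.len()).map_err(|_| {
        io::Error::new(io::ErrorKind::InvalidInput, "record exceeds u32::MAX bytes")
    })?;
    // Open in append mode so existing records are never overwritten.
    let mut file = OpenOptions::new()
        .create(true)
        .append(true)
        .open(path)
        .map_err(|e| io::Error::new(e.kind(), format!("open {}: {}", path.display(), e)))?;
    file.write_all(&len.to_le_bytes())?;
    file.write_all(record)?;
    // Flush file data and metadata to stable storage before acknowledging.
    file.sync_all()?;
    Ok(())
}
```

Returning only after `sync_all` succeeds means a caller that receives `Ok(())` can treat the record as having reached stable storage, at the cost of one fsync per call.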