stemedb/crates/stemedb-api
jml bb0c33f8d3 fix(api): enable querying of CLI-created community corpus items
## Problem
CLI-created community corpus items (tier 3) were stored correctly but
were invisible to API queries. Two issues blocked discoverability:

1. **Prefix mismatch**: API hardcoded 'community://pattern/' for
   aggregated patterns, but CLI creates 'community://rust/http/...' URIs
2. **Query parameter parsing**: Axum's default parser doesn't support
   bracket notation (?sources[]=value) used by the dashboard

Result: 0/22 CLI-created items were queryable.

## Solution

### Fix 1: Broaden Community Prefix
- Changed: 'community://pattern/' → 'community://' in corpus handler
- Impact: Now matches both aggregated patterns AND CLI-created items
- Backward compatible: Broader prefix includes narrower results
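
A minimal sketch of why the broader prefix stays backward compatible. The helper name `is_community` and the example URIs are illustrative assumptions; only the two prefix strings come from this commit message.

```rust
/// Prefix matched before the fix (quoted in the commit message).
const OLD_PREFIX: &str = "community://pattern/";
/// Broadened prefix after the fix.
const NEW_PREFIX: &str = "community://";

/// Hypothetical helper: true when `uri` falls under `prefix`.
fn is_community(uri: &str, prefix: &str) -> bool {
    uri.starts_with(prefix)
}

fn main() {
    // Made-up URIs for illustration; real item paths are not shown here.
    let aggregated = "community://pattern/example";
    let cli_created = "community://rust/http/example";

    // The old prefix only matches aggregated patterns.
    assert!(is_community(aggregated, OLD_PREFIX));
    assert!(!is_community(cli_created, OLD_PREFIX));

    // The new prefix matches both, so earlier results are still included.
    assert!(is_community(aggregated, NEW_PREFIX));
    assert!(is_community(cli_created, NEW_PREFIX));
}
```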

### Fix 2: Add QsQuery Extractor
- Added: serde_qs dependency + custom QsQuery extractor
- Supports: Bracket notation for array parameters (?sources[]=a&sources[]=b)
- Compatible: Works with JavaScript URLSearchParams standard
- Tested: 3 new unit tests for extractor behavior
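
To illustrate what bracket notation asks of a parser, here is a std-only sketch that collects repeated `name[]=value` pairs. It is not the QsQuery extractor from this commit (which builds on serde_qs); the function `array_param` is a hypothetical stand-in, and it skips percent-decoding for brevity.

```rust
/// Collect every value of an array parameter written in bracket notation,
/// e.g. `sources[]=community&sources[]=rfc`.
/// Simplification: no percent-decoding, so `%5B%5D` is not recognised.
fn array_param(query: &str, name: &str) -> Vec<String> {
    let key = format!("{name}[]");
    query
        .trim_start_matches('?')
        .split('&')
        // Ignore fragments without an '=' separator.
        .filter_map(|pair| pair.split_once('='))
        // Keep only pairs whose key is exactly `name[]`.
        .filter(|(k, _)| *k == key.as_str())
        .map(|(_, v)| v.to_string())
        .collect()
}

fn main() {
    let sources = array_param("?sources[]=community&sources[]=rfc", "sources");
    assert_eq!(sources, vec!["community", "rfc"]);
    println!("{sources:?}");
}
```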

## Verification
- All 22 CLI-created community items now queryable (was 0)
- Source filtering works: community (22), RFC (2), vendor (5)
- Multi-source queries work: ?sources[]=community&sources[]=rfc → 24
- All 89 API tests pass + 3 new extractor tests
- Clippy clean (0 warnings)
- No regressions in existing functionality

## Files Changed
- crates/stemedb-api/Cargo.toml: Add serde_qs dependency
- crates/stemedb-api/src/extractors.rs: New QsQuery extractor (117 lines)
- crates/stemedb-api/src/handlers/aphoria/corpus.rs: Use QsQuery, broaden prefix
- crates/stemedb-api/src/lib.rs: Export extractors module

Also includes: Scale-adaptive thresholds, wiki corpus extraction,
documentation updates, and dashboard UI improvements from prior work.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-09 15:54:35 +00:00

stemedb-api

HTTP API for Episteme (StemeDB) - a probabilistic knowledge graph database.

Architecture

The API follows the standard axum pattern:

  • DTOs (dto.rs) - JSON request/response types with hex-encoded binary data
  • Handlers (handlers/) - Thin HTTP handlers that delegate to underlying engines
  • State (state.rs) - Shared application state (Journal, Store)
  • Router (lib.rs) - axum router with OpenAPI support via utoipa

Write Path

POST /v1/assert → DTO → Assertion → serialize → append to WAL → return hash

Read Path

GET /v1/query → QueryParams → Query → QueryEngine → Lens (optional) → DTOs

Running the Server

# Start the API server (defaults to http://127.0.0.1:18180)
cargo run --package stemedb-api

# With custom configuration
STEMEDB_WAL_DIR=./my-wal STEMEDB_DB_DIR=./my-db STEMEDB_BIND_ADDR=0.0.0.0:18180 cargo run --package stemedb-api

The server automatically:

  1. Opens Journal (WAL) and HybridStore (KV storage)
  2. Spawns IngestWorker background task to tail WAL
  3. Starts HTTP server with OpenAPI documentation

API Documentation

Once the server is running, visit:

http://127.0.0.1:18180/swagger-ui

This provides interactive OpenAPI documentation for all endpoints.

Endpoints

POST /v1/assert

Create a new assertion.

Request:

{
  "subject": "Tesla_Inc",
  "predicate": "has_revenue",
  "object": {
    "type": "Number",
    "value": 96.7
  },
  "confidence": 0.95,
  "signatures": [{
    "agent_id": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20",
    "signature": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40",
    "timestamp": 1706745600
  }],
  "source_hash": "0000000000000000000000000000000000000000000000000000000000000000"
}

Response:

{
  "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "status": "created"
}

POST /v1/vote

Create a vote on an existing assertion.

Request:

{
  "assertion_hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "agent_id": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20",
  "weight": 0.8,
  "signature": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40"
}

Response:

{
  "hash": "f3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "status": "created"
}

GET /v1/query

Query assertions with optional filters and lens.

Query Parameters:

  • subject (optional) - Filter by subject entity
  • predicate (optional) - Filter by predicate/relation
  • lifecycle (optional) - Filter by lifecycle stage (Proposed, UnderReview, Approved, Deprecated, Rejected)
  • epoch (optional) - Filter by epoch (hex-encoded)
  • lens (optional) - Apply lens for conflict resolution (Recency, Consensus, Authority, VoteAwareConsensus, TrustAwareAuthority)
  • limit (optional) - Maximum results (default: 100)

Example:

GET /v1/query?subject=Tesla_Inc&predicate=has_revenue&lifecycle=Approved&lens=Recency
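
A request path like the one above could be assembled on the client side roughly as follows. The `QueryFilters` struct and `query_path` function are hypothetical helpers, not part of this crate, and the sketch assumes values are already URL-safe (no percent-encoding is applied).

```rust
/// Hypothetical client-side view of the documented filters; all optional.
#[derive(Default)]
struct QueryFilters<'a> {
    subject: Option<&'a str>,
    predicate: Option<&'a str>,
    lifecycle: Option<&'a str>,
    epoch: Option<&'a str>,
    lens: Option<&'a str>,
    limit: Option<u32>,
}

/// Build the request path, skipping filters that are not set.
fn query_path(f: &QueryFilters<'_>) -> String {
    let limit = f.limit.map(|n| n.to_string());
    let fields = [
        ("subject", f.subject),
        ("predicate", f.predicate),
        ("lifecycle", f.lifecycle),
        ("epoch", f.epoch),
        ("lens", f.lens),
        ("limit", limit.as_deref()),
    ];
    let pairs: Vec<String> = fields
        .iter()
        .filter_map(|(name, value)| value.map(|v| format!("{name}={v}")))
        .collect();
    if pairs.is_empty() {
        "/v1/query".to_string()
    } else {
        format!("/v1/query?{}", pairs.join("&"))
    }
}

fn main() {
    let filters = QueryFilters {
        subject: Some("Tesla_Inc"),
        predicate: Some("has_revenue"),
        lifecycle: Some("Approved"),
        lens: Some("Recency"),
        ..Default::default()
    };
    println!("GET {}", query_path(&filters));
}
```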

Response:

{
  "assertions": [{
    "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "subject": "Tesla_Inc",
    "predicate": "has_revenue",
    "object": {
      "type": "Number",
      "value": 96.7
    },
    "confidence": 0.95,
    "lifecycle": "Approved",
    "signatures": [...],
    "timestamp": 1706745600,
    "source_hash": "0000000000000000000000000000000000000000000000000000000000000000"
  }],
  "total_count": 1,
  "has_more": false
}

GET /v1/health

Health check endpoint.

Response:

{
  "status": "healthy",
  "version": "0.1.0",
  "assertions_count": 42
}

Environment Variables

  • STEMEDB_WAL_DIR - Directory for WAL files (default: data/wal)
  • STEMEDB_DB_DIR - Directory for KV store (default: data/db)
  • STEMEDB_BIND_ADDR - HTTP server bind address (default: 127.0.0.1:18180)
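
As a rough sketch of how these variables and their defaults could be resolved: the `Settings` struct and `resolve` function below are illustrative assumptions, not the crate's actual configuration code. Only the variable names and default values are taken from the list above.

```rust
use std::env;

/// Resolved server settings; field names are illustrative.
#[derive(Debug, PartialEq)]
struct Settings {
    wal_dir: String,
    db_dir: String,
    bind_addr: String,
}

/// Resolve settings through a lookup function, falling back to the
/// documented defaults when a variable is not set.
fn resolve(lookup: impl Fn(&str) -> Option<String>) -> Settings {
    let get = |key: &str, default: &str| lookup(key).unwrap_or_else(|| default.to_string());
    Settings {
        wal_dir: get("STEMEDB_WAL_DIR", "data/wal"),
        db_dir: get("STEMEDB_DB_DIR", "data/db"),
        bind_addr: get("STEMEDB_BIND_ADDR", "127.0.0.1:18180"),
    }
}

fn main() {
    // In a real process the lookup reads the environment.
    let settings = resolve(|key| env::var(key).ok());
    println!("{settings:?}");
}
```

Passing the lookup as a closure keeps the defaults testable without touching the process environment.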

Binary Data Encoding

All binary data (hashes, signatures, agent IDs) is hex-encoded in JSON:

  • Assertion hash: 32 bytes (64 hex characters)
  • Agent ID (public key): 32 bytes (64 hex characters)
  • Signature: 64 bytes (128 hex characters)
  • Source hash: 32 bytes (64 hex characters)
  • Visual hash (optional): 8 bytes (16 hex characters)
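
The length rules above can be checked when decoding. The following is a std-only sketch with a hypothetical `decode_hex` helper; the crate may well use a dedicated hex library instead.

```rust
/// Decode a hex string into exactly `N` bytes, rejecting wrong lengths
/// and any character that is not an ASCII hex digit.
fn decode_hex<const N: usize>(s: &str) -> Result<[u8; N], String> {
    if s.len() != N * 2 {
        return Err(format!("expected {} hex characters, got {}", N * 2, s.len()));
    }
    // Checking up front also guarantees the slicing below stays on
    // character boundaries and that a sign such as '+' is rejected.
    if !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("input contains a non-hex character".to_string());
    }
    let mut out = [0u8; N];
    for (i, byte) in out.iter_mut().enumerate() {
        let pair = &s[i * 2..i * 2 + 2];
        *byte = u8::from_str_radix(pair, 16)
            .map_err(|e| format!("invalid hex at byte {i}: {e}"))?;
    }
    Ok(out)
}

fn main() -> Result<(), String> {
    // Agent ID from the request examples: 32 bytes, 64 hex characters.
    let agent_id = decode_hex::<32>(
        "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20",
    )?;
    println!("decoded {} bytes", agent_id.len());
    Ok(())
}
```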

Critical Rules

  • Append-Only: The API never mutates existing assertions. Create new ones.
  • Content-Addressed: Assertion ID = BLAKE3 hash of content.
  • No Unwrap: All error handling uses ? with context (enforced by clippy).
  • Defensive Writes: All writes go through WAL with fsync.
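
The "No Unwrap" rule can be illustrated with a small std-only sketch. `ContextError` and `parse_bind_addr` are hypothetical; this README does not show the crate's actual error type, so treat this as one possible shape rather than the real implementation.

```rust
use std::fmt;
use std::net::SocketAddr;

/// Hypothetical error type that records what was being attempted
/// alongside the underlying cause.
#[derive(Debug)]
struct ContextError {
    context: String,
    cause: String,
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.context, self.cause)
    }
}

/// Parse a bind address; failure is propagated with `?` and context
/// instead of panicking through `unwrap()`.
fn parse_bind_addr(raw: &str) -> Result<SocketAddr, ContextError> {
    let addr = raw.parse::<SocketAddr>().map_err(|e| ContextError {
        context: format!("parsing bind address {raw:?}"),
        cause: e.to_string(),
    })?;
    Ok(addr)
}

fn main() -> Result<(), ContextError> {
    let addr = parse_bind_addr("127.0.0.1:18180")?;
    println!("would bind to {addr}");
    Ok(())
}
```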