stemedb/crates/stemedb-api/README.md
jml 4012791e7e fix(api): enable non-strict mode for URL-encoded bracket notation
## Problem
Dashboard sends URL-encoded query parameters:
  ?sources%5B%5D=rfc&sources%5B%5D=owasp
  (%5B = '[', %5D = ']')

But QsQuery extractor used strict mode, which rejects encoded brackets:
  Error: "Invalid field contains an encoded bracket"

Result: All corpus filters in the dashboard failed silently.

## Solution
Changed QsQuery to use serde_qs non-strict mode:
  Config::new(5, false) // false = non-strict

Now accepts BOTH:
  - Literal brackets: ?sources[]=rfc
  - Encoded brackets: ?sources%5B%5D=rfc (browsers)

## Verification
 URL-encoded query: ?sources%5B%5D=rfc&sources%5B%5D=community
   Returns: 24 items (was: error)
   Logs: sources=Some(["rfc", "community"]) 

 Literal brackets: ?sources[]=rfc (still works)
 All 4 extractor tests pass (added encoded brackets test)
 Clippy clean (0 warnings)

## Files Changed
- crates/stemedb-api/src/extractors.rs: Use non-strict Config
- crates/stemedb-api/README.md: Document QsQuery usage
- .claude/guides/backend/api-endpoints.md: Add best practices
- CLAUDE.md: Reference extractors documentation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-09 16:11:25 +00:00


# stemedb-api
HTTP API for Episteme (StemeDB) - a probabilistic knowledge graph database.
## Architecture
The API follows the standard axum pattern:
- **DTOs** (`dto.rs`) - JSON request/response types with hex-encoded binary data
- **Handlers** (`handlers/`) - Thin HTTP handlers that delegate to underlying engines
- **State** (`state.rs`) - Shared application state (Journal, Store)
- **Router** (`lib.rs`) - axum router with OpenAPI support via utoipa
## Query Parameter Patterns
### When to Use QsQuery vs Query
The API uses two different query parameter extractors depending on whether array parameters are needed:
#### Use `QsQuery` for Array Parameters
**Required when:** Your request DTO contains `Vec<T>` or `Option<Vec<T>>` fields.
```rust
use axum::{extract::State, Json};
use serde::Deserialize;

use crate::extractors::QsQuery;

#[derive(Deserialize)]
struct MyRequest {
    sources: Option<Vec<String>>, // Array parameter
    limit: usize,
}

async fn my_handler(
    State(state): State<AppState>,
    QsQuery(params): QsQuery<MyRequest>, // ✅ Correct
) -> Result<Json<MyResponse>> {
    // Dashboard sends: ?sources[]=rfc&sources[]=community&limit=10
    // (browsers may percent-encode the brackets as %5B%5D)
    // params.sources = Some(vec!["rfc", "community"])
}
```
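**Why:** The StemeDB Dashboard builds its query strings with JavaScript's `URLSearchParams` and uses bracket notation for arrays (`?filters[]=a&filters[]=b`). Browsers typically send those brackets percent-encoded (`?filters%5B%5D=a&filters%5B%5D=b`). Standard `axum::extract::Query` uses `serde_urlencoded`, which doesn't support bracket notation. `QsQuery` uses `serde_qs` in non-strict mode, which accepts both the literal and the percent-encoded form.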
**Warning:** If you use standard `Query` with array parameters, the dashboard filters will **silently fail** (returning all results instead of filtered results).
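To make the literal-versus-encoded distinction concrete, here is a minimal standard-library sketch of the decoding step. It is an illustration only, not how `serde_qs` or `QsQuery` is implemented; `percent_decode` and `parse_array_params` are hypothetical helper names.

```rust
use std::collections::HashMap;

/// Convert one ASCII hex digit to its numeric value.
fn hex_val(b: u8) -> Option<u8> {
    (b as char).to_digit(16).map(|d| d as u8)
}

/// Decode `%XX` escapes and `+` (as space) in a query-string component.
/// Malformed escapes are kept literally; invalid UTF-8 is replaced lossily.
fn percent_decode(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_val(bytes[i + 1]), hex_val(bytes[i + 2])) {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        out.push(if bytes[i] == b'+' { b' ' } else { bytes[i] });
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}

/// Group `key[]=value` pairs into vectors, accepting literal or encoded brackets.
fn parse_array_params(query: &str) -> HashMap<String, Vec<String>> {
    let mut params: HashMap<String, Vec<String>> = HashMap::new();
    for pair in query.trim_start_matches('?').split('&').filter(|p| !p.is_empty()) {
        let (raw_key, raw_value) = pair.split_once('=').unwrap_or((pair, ""));
        // Decode first, so `%5B%5D` and `[]` look identical from here on.
        let key = percent_decode(raw_key);
        let name = key.strip_suffix("[]").unwrap_or(key.as_str()).to_string();
        params.entry(name).or_default().push(percent_decode(raw_value));
    }
    params
}
```

With this sketch, `parse_array_params("?sources%5B%5D=rfc&sources%5B%5D=owasp")` and `parse_array_params("?sources[]=rfc&sources[]=owasp")` produce the same map, which is the behavior the dashboard relies on.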
#### Use Standard `Query` for Scalar Parameters
**Use when:** All query parameters are scalars (no arrays/vectors).
```rust
use axum::{
    extract::{Query, State},
    Json,
};
use serde::Deserialize;

#[derive(Deserialize)]
struct SimpleRequest {
    limit: usize,
    offset: usize,
    category: Option<String>,
}

async fn simple_handler(
    State(state): State<AppState>,
    Query(params): Query<SimpleRequest>, // ✅ Correct
) -> Result<Json<MyResponse>> {
    // Standard URL: ?limit=10&offset=0&category=security
}
```
**When to use alias:** If your handler file also imports `stemedb_query::Query`, use `use axum::extract::Query as AxumQuery` to avoid name collision.
### Quick Reference
| DTO Field Types | Extractor | Example |
|-----------------|-----------|---------|
| All scalars (`String`, `usize`, `Option<bool>`) | `Query` or `AxumQuery` | `handlers/meter.rs:60` |
| Contains `Vec<T>` or `Option<Vec<T>>` | `QsQuery` | `handlers/aphoria/corpus.rs:41` |
See `src/extractors.rs` for detailed documentation and examples.
## Write Path
```
POST /v1/assert → DTO → Assertion → serialize → append to WAL → return hash
```
## Read Path
```
GET /v1/query → QueryParams → Query → QueryEngine → Lens (optional) → DTOs
```
## Running the Server
```bash
# Start the API server (defaults to http://127.0.0.1:18180)
cargo run --package stemedb-api
# With custom configuration
STEMEDB_WAL_DIR=./my-wal STEMEDB_DB_DIR=./my-db STEMEDB_BIND_ADDR=0.0.0.0:18180 cargo run --package stemedb-api
```
The server automatically:
1. Opens Journal (WAL) and HybridStore (KV storage)
2. Spawns IngestWorker background task to tail WAL
3. Starts HTTP server with OpenAPI documentation
## API Documentation
Once the server is running, visit:
```
http://127.0.0.1:18180/swagger-ui
```
This provides interactive OpenAPI documentation for all endpoints.
## Endpoints
### POST /v1/assert
Create a new assertion.
**Request:**
```json
{
  "subject": "Tesla_Inc",
  "predicate": "has_revenue",
  "object": {
    "type": "Number",
    "value": 96.7
  },
  "confidence": 0.95,
  "signatures": [{
    "agent_id": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20",
    "signature": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40",
    "timestamp": 1706745600
  }],
  "source_hash": "0000000000000000000000000000000000000000000000000000000000000000"
}
```
**Response:**
```json
{
  "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "status": "created"
}
```
### POST /v1/vote
Create a vote on an existing assertion.
**Request:**
```json
{
  "assertion_hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "agent_id": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20",
  "weight": 0.8,
  "signature": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40"
}
```
**Response:**
```json
{
  "hash": "f3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "status": "created"
}
```
### GET /v1/query
Query assertions with optional filters and lens.
**Query Parameters:**
- `subject` (optional) - Filter by subject entity
- `predicate` (optional) - Filter by predicate/relation
- `lifecycle` (optional) - Filter by lifecycle stage (Proposed, UnderReview, Approved, Deprecated, Rejected)
- `epoch` (optional) - Filter by epoch (hex-encoded)
- `lens` (optional) - Apply lens for conflict resolution (Recency, Consensus, Authority, VoteAwareConsensus, TrustAwareAuthority)
- `limit` (optional) - Maximum results (default: 100)
**Example:**
```
GET /v1/query?subject=Tesla_Inc&predicate=has_revenue&lifecycle=Approved&lens=Recency
```
**Response:**
```json
{
  "assertions": [{
    "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "subject": "Tesla_Inc",
    "predicate": "has_revenue",
    "object": {
      "type": "Number",
      "value": 96.7
    },
    "confidence": 0.95,
    "lifecycle": "Approved",
    "signatures": [...],
    "timestamp": 1706745600,
    "source_hash": "0000000000000000000000000000000000000000000000000000000000000000"
  }],
  "total_count": 1,
  "has_more": false
}
```
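A client can assemble the request path for this endpoint from the optional filters listed above. The following is a minimal standard-library sketch; `QueryFilters`, `encode_component`, and `query_path` are hypothetical names for illustration and are not part of the crate.

```rust
/// Hypothetical holder for the `GET /v1/query` parameters; every filter is optional.
#[derive(Default)]
struct QueryFilters<'a> {
    subject: Option<&'a str>,
    predicate: Option<&'a str>,
    lifecycle: Option<&'a str>,
    epoch: Option<&'a str>, // hex-encoded
    lens: Option<&'a str>,
    limit: Option<usize>,
}

/// Percent-encode everything except RFC 3986 unreserved characters.
fn encode_component(value: &str) -> String {
    let mut out = String::new();
    for b in value.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            _ => out.push_str(&format!("%{b:02X}")),
        }
    }
    out
}

/// Build the request path, skipping any filter that is not set.
fn query_path(f: &QueryFilters) -> String {
    let text = [
        ("subject", f.subject),
        ("predicate", f.predicate),
        ("lifecycle", f.lifecycle),
        ("epoch", f.epoch),
        ("lens", f.lens),
    ];
    let mut pairs = Vec::new();
    for (name, value) in text {
        if let Some(v) = value {
            pairs.push(format!("{name}={}", encode_component(v)));
        }
    }
    if let Some(limit) = f.limit {
        pairs.push(format!("limit={limit}"));
    }
    if pairs.is_empty() {
        "/v1/query".to_string()
    } else {
        format!("/v1/query?{}", pairs.join("&"))
    }
}
```

Leaving `limit` unset omits it from the URL, so the server-side default of 100 applies.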
### GET /v1/health
Health check endpoint.
**Response:**
```json
{
  "status": "healthy",
  "version": "0.1.0",
  "assertions_count": 42
}
```
## Environment Variables
- `STEMEDB_WAL_DIR` - Directory for WAL files (default: `data/wal`)
- `STEMEDB_DB_DIR` - Directory for KV store (default: `data/db`)
- `STEMEDB_BIND_ADDR` - HTTP server bind address (default: `127.0.0.1:18180`)
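The fallback behavior of these variables can be sketched as follows. This is an assumption about the shape of the logic, not the server's actual startup code; `ServerConfig` and its methods are hypothetical names. The lookup is passed in as a closure so the defaults can be exercised without touching the process environment.

```rust
use std::env;

/// Hypothetical configuration struct mirroring the three documented variables.
#[derive(Debug, PartialEq)]
struct ServerConfig {
    wal_dir: String,
    db_dir: String,
    bind_addr: String,
}

impl ServerConfig {
    /// Build a config from any lookup function; missing keys use the documented defaults.
    fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        Self {
            wal_dir: lookup("STEMEDB_WAL_DIR").unwrap_or_else(|| "data/wal".to_string()),
            db_dir: lookup("STEMEDB_DB_DIR").unwrap_or_else(|| "data/db".to_string()),
            bind_addr: lookup("STEMEDB_BIND_ADDR")
                .unwrap_or_else(|| "127.0.0.1:18180".to_string()),
        }
    }

    /// Read the real process environment.
    fn from_env() -> Self {
        Self::from_lookup(|key| env::var(key).ok())
    }
}
```

Calling `ServerConfig::from_env()` with none of the variables set yields the defaults listed above.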
## Binary Data Encoding
All binary data (hashes, signatures, agent IDs) use hex encoding in JSON:
- Assertion hash: 32 bytes (64 hex characters)
- Agent ID (public key): 32 bytes (64 hex characters)
- Signature: 64 bytes (128 hex characters)
- Source hash: 32 bytes (64 hex characters)
- Visual hash (optional): 8 bytes (16 hex characters)
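The length rules above translate directly into a validation step when decoding request fields. Below is a minimal standard-library sketch; `encode_hex`, `decode_hex`, and the constants are hypothetical names, and the crate may well use a dedicated hex library instead.

```rust
/// Expected byte lengths, taken from the list above.
const HASH_LEN: usize = 32;
const SIGNATURE_LEN: usize = 64;

/// Encode bytes as lowercase hex.
fn encode_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

/// Decode a hex string, rejecting input whose length does not match `expected_len` bytes.
fn decode_hex(input: &str, expected_len: usize) -> Result<Vec<u8>, String> {
    if input.len() != expected_len * 2 {
        return Err(format!(
            "expected {} hex characters, got {}",
            expected_len * 2,
            input.len()
        ));
    }
    input
        .as_bytes()
        .chunks(2) // length is even here, so every chunk has exactly two bytes
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16);
            let lo = (pair[1] as char).to_digit(16);
            match (hi, lo) {
                (Some(hi), Some(lo)) => Ok((hi * 16 + lo) as u8),
                _ => Err(format!("invalid hex pair: {:?}", String::from_utf8_lossy(pair))),
            }
        })
        .collect()
}
```

For example, `decode_hex(agent_id, HASH_LEN)` accepts the 64-character agent IDs shown in the endpoint examples and rejects anything shorter or longer before any further processing happens.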
## Critical Rules
- **Append-Only**: The API never mutates existing assertions. Create new ones.
- **Content-Addressed**: Assertion ID = BLAKE3 hash of content.
- **No Unwrap**: All error handling uses `?` with context (enforced by clippy).
- **Defensive Writes**: All writes go through WAL with fsync.