# stemedb-api

HTTP API for Episteme (StemeDB) - a probabilistic knowledge graph database.

## Architecture

The API follows the standard axum pattern:

- **DTOs** (`dto.rs`) - JSON request/response types with hex-encoded binary data
- **Handlers** (`handlers/`) - Thin HTTP handlers that delegate to underlying engines
- **State** (`state.rs`) - Shared application state (Journal, Store)
- **Router** (`lib.rs`) - axum router with OpenAPI support via utoipa

## Write Path

```
POST /v1/assert → DTO → Assertion → serialize → append to WAL → return hash
```

## Read Path

```
GET /v1/query → QueryParams → Query → QueryEngine → Lens (optional) → DTOs
```

## Running the Server

```bash
# Start the API server (defaults to http://127.0.0.1:3000)
cargo run --package stemedb-api

# With custom configuration
STEMEDB_WAL_DIR=./my-wal STEMEDB_DB_DIR=./my-db STEMEDB_BIND_ADDR=0.0.0.0:8080 cargo run --package stemedb-api
```

The server automatically:
1. Opens Journal (WAL) and HybridStore (KV storage)
2. Spawns IngestWorker background task to tail WAL
3. Starts HTTP server with OpenAPI documentation

## API Documentation

Once the server is running, visit:

```
http://127.0.0.1:3000/swagger-ui
```

This provides interactive OpenAPI documentation for all endpoints.

## Endpoints

### POST /v1/assert

Create a new assertion.

**Request:**
```json
{
  "subject": "Tesla_Inc",
  "predicate": "has_revenue",
  "object": {
    "type": "Number",
    "value": 96.7
  },
  "confidence": 0.95,
  "signatures": [{
    "agent_id": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20",
    "signature": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40",
    "timestamp": 1706745600
  }],
  "source_hash": "0000000000000000000000000000000000000000000000000000000000000000"
}
```

**Response:**
```json
{
  "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "status": "created"
}
```

### POST /v1/vote

Create a vote on an existing assertion.

**Request:**
```json
{
  "assertion_hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "agent_id": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f20",
  "weight": 0.8,
  "signature": "0102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f40"
}
```

**Response:**
```json
{
  "hash": "f3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "status": "created"
}
```

### GET /v1/query

Query assertions with optional filters and lens.

**Query Parameters:**
- `subject` (optional) - Filter by subject entity
- `predicate` (optional) - Filter by predicate/relation
- `lifecycle` (optional) - Filter by lifecycle stage (Proposed, UnderReview, Approved, Deprecated, Rejected)
- `epoch` (optional) - Filter by epoch (hex-encoded)
- `lens` (optional) - Apply lens for conflict resolution (Recency, Consensus, Authority, VoteAwareConsensus, TrustAwareAuthority)
- `limit` (optional) - Maximum results (default: 100)

**Example:**
```
GET /v1/query?subject=Tesla_Inc&predicate=has_revenue&lifecycle=Approved&lens=Recency
```

**Response:**
```json
{
  "assertions": [{
    "hash": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "subject": "Tesla_Inc",
    "predicate": "has_revenue",
    "object": {
      "type": "Number",
      "value": 96.7
    },
    "confidence": 0.95,
    "lifecycle": "Approved",
    "signatures": [...],
    "timestamp": 1706745600,
    "source_hash": "0000000000000000000000000000000000000000000000000000000000000000"
  }],
  "total_count": 1,
  "has_more": false
}
```

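
The query parameters listed for `GET /v1/query` could be modelled roughly as follows. This is a std-only sketch, not the crate's actual `QueryParams` type: the struct fields, `parse_query`, and `one_of` are hypothetical names, the real handler presumably relies on axum's extractors, and percent-decoding is left out for brevity. Only the allowed `lifecycle` and `lens` values and the default `limit` of 100 come from the list above.

```rust
use std::collections::HashMap;

/// Default page size, taken from the `limit` parameter description.
const DEFAULT_LIMIT: usize = 100;
/// Allowed values as listed in the parameter documentation.
const LIFECYCLES: [&str; 5] = ["Proposed", "UnderReview", "Approved", "Deprecated", "Rejected"];
const LENSES: [&str; 5] = [
    "Recency", "Consensus", "Authority", "VoteAwareConsensus", "TrustAwareAuthority",
];

/// Hypothetical, simplified view of the query-string filters.
#[derive(Debug, Default, PartialEq)]
pub struct QueryParams {
    pub subject: Option<String>,
    pub predicate: Option<String>,
    pub lifecycle: Option<String>,
    pub epoch: Option<String>,
    pub lens: Option<String>,
    pub limit: Option<usize>,
}

impl QueryParams {
    /// Fall back to the documented default when `limit` is absent.
    pub fn effective_limit(&self) -> usize {
        self.limit.unwrap_or(DEFAULT_LIMIT)
    }
}

/// Reject a value that is present but not in the allowed set.
fn one_of(name: &str, value: Option<String>, allowed: &[&str]) -> Result<Option<String>, String> {
    match value {
        Some(v) if !allowed.contains(&v.as_str()) => Err(format!("unknown {name}: {v}")),
        other => Ok(other),
    }
}

/// Parse a raw query string such as `subject=Tesla_Inc&lens=Recency`.
/// Assumption: values are already percent-decoded.
pub fn parse_query(raw: &str) -> Result<QueryParams, String> {
    let pairs: HashMap<&str, &str> = raw
        .split('&')
        .filter_map(|kv| kv.split_once('='))
        .collect();
    let get = |key: &str| pairs.get(key).map(|v| v.to_string());

    let limit = match pairs.get("limit") {
        Some(v) => Some(v.parse::<usize>().map_err(|e| format!("invalid limit: {e}"))?),
        None => None,
    };

    Ok(QueryParams {
        subject: get("subject"),
        predicate: get("predicate"),
        lifecycle: one_of("lifecycle", get("lifecycle"), &LIFECYCLES)?,
        epoch: get("epoch"),
        lens: one_of("lens", get("lens"), &LENSES)?,
        limit,
    })
}
```

Keeping every filter as an `Option` mirrors the fact that all parameters are optional, and validating `lifecycle` and `lens` up front lets a handler reject a typo before it reaches the query engine.
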
### GET /v1/health

Health check endpoint.

**Response:**
```json
{
  "status": "healthy",
  "version": "0.1.0",
  "assertions_count": 42
}
```

## Environment Variables

- `STEMEDB_WAL_DIR` - Directory for WAL files (default: `data/wal`)
- `STEMEDB_DB_DIR` - Directory for KV store (default: `data/db`)
- `STEMEDB_BIND_ADDR` - HTTP server bind address (default: `127.0.0.1:3000`)

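
Resolving these three variables with their documented defaults might look like the sketch below. `ServerConfig`, `config_from`, and `config_from_env` are hypothetical names chosen for illustration, and the actual startup code may be organised differently. Only the variable names and default values are taken from the list above.

```rust
use std::env;
use std::net::SocketAddr;
use std::path::PathBuf;

/// Hypothetical container for the three documented settings.
#[derive(Debug, PartialEq)]
pub struct ServerConfig {
    pub wal_dir: PathBuf,
    pub db_dir: PathBuf,
    pub bind_addr: SocketAddr,
}

/// Build a config from any key lookup, so the defaults can be exercised
/// without touching the real process environment.
pub fn config_from(lookup: impl Fn(&str) -> Option<String>) -> Result<ServerConfig, String> {
    let wal_dir = lookup("STEMEDB_WAL_DIR").unwrap_or_else(|| "data/wal".to_string());
    let db_dir = lookup("STEMEDB_DB_DIR").unwrap_or_else(|| "data/db".to_string());
    let bind = lookup("STEMEDB_BIND_ADDR").unwrap_or_else(|| "127.0.0.1:3000".to_string());

    // Fail early on a malformed address instead of at bind time.
    let bind_addr = bind
        .parse::<SocketAddr>()
        .map_err(|e| format!("invalid STEMEDB_BIND_ADDR `{bind}`: {e}"))?;

    Ok(ServerConfig {
        wal_dir: PathBuf::from(wal_dir),
        db_dir: PathBuf::from(db_dir),
        bind_addr,
    })
}

/// Read the settings from the process environment.
pub fn config_from_env() -> Result<ServerConfig, String> {
    config_from(|key| env::var(key).ok())
}
```

Passing the lookup as a closure is a design choice for testability: unit tests can supply a fixed map while the server calls `config_from_env()`.
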
## Binary Data Encoding

All binary data (hashes, signatures, agent IDs) is hex-encoded in JSON:
- Assertion hash: 32 bytes (64 hex characters)
- Agent ID (public key): 32 bytes (64 hex characters)
- Signature: 64 bytes (128 hex characters)
- Source hash: 32 bytes (64 hex characters)
- Visual hash (optional): 8 bytes (16 hex characters)

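
The fixed sizes above lend themselves to length-checked decoding. The following is a minimal std-only sketch. `to_hex`, `from_hex`, and `nibble` are hypothetical helpers, and the crate itself may well use a dedicated hex library instead.

```rust
/// Encode bytes as lowercase hex, matching the examples in this README.
pub fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

/// Map one ASCII hex digit to its 4-bit value.
fn nibble(c: u8) -> Option<u8> {
    match c {
        b'0'..=b'9' => Some(c - b'0'),
        b'a'..=b'f' => Some(c - b'a' + 10),
        b'A'..=b'F' => Some(c - b'A' + 10),
        _ => None,
    }
}

/// Decode a hex string into exactly `N` bytes, rejecting wrong lengths
/// and non-hex characters. Use N = 32 for hashes and agent IDs,
/// N = 64 for signatures, and N = 8 for visual hashes.
pub fn from_hex<const N: usize>(s: &str) -> Result<[u8; N], String> {
    let raw = s.as_bytes();
    if raw.len() != N * 2 {
        return Err(format!("expected {} hex characters, got {}", N * 2, raw.len()));
    }
    let mut out = [0u8; N];
    for (i, pair) in raw.chunks_exact(2).enumerate() {
        let hi = nibble(pair[0]).ok_or_else(|| format!("invalid hex digit at index {}", i * 2))?;
        let lo = nibble(pair[1]).ok_or_else(|| format!("invalid hex digit at index {}", i * 2 + 1))?;
        out[i] = (hi << 4) | lo;
    }
    Ok(out)
}
```

Decoding into a fixed-size array means a handler can reject a 63-character agent ID at the DTO boundary rather than deeper in the engine.
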
## Critical Rules

- **Append-Only**: The API never mutates existing assertions. Create new ones.
- **Content-Addressed**: Assertion ID = BLAKE3 hash of content.
- **No Unwrap**: All error handling uses `?` with context (enforced by clippy).
- **Defensive Writes**: All writes go through WAL with fsync.
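
To illustrate the append-only and defensive-write rules together, here is a rough std-only sketch of a durable append. It is not the Journal's implementation: `append_durable` is a hypothetical function, and the `[len:u32 little-endian][payload]` framing is a simplification assumed for this example, not the crate's actual record format.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append one length-prefixed record and fsync before returning.
/// Returns the byte offset at which the record starts.
/// Assumption: a single writer owns the file.
pub fn append_durable(path: &Path, payload: &[u8]) -> io::Result<u64> {
    let len = u32::try_from(payload.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "payload too large"))?;

    // Open in append mode so existing records are never overwritten.
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    let offset = file.metadata()?.len();

    // Build the whole record first so it is handed to the OS in one write.
    let mut record = Vec::with_capacity(4 + payload.len());
    record.extend_from_slice(&len.to_le_bytes());
    record.extend_from_slice(payload);
    file.write_all(&record)?;

    // fsync: only report success once the data has been flushed to disk.
    file.sync_all()?;
    Ok(offset)
}
```

Every fallible step propagates its error with `?` instead of unwrapping, in line with the "No Unwrap" rule, and the caller only sees an offset after `sync_all` has succeeded.
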