Quick Start
Get StemeDB running and validated in under 5 minutes.
Prerequisites
- Rust 1.75+ (rustup update stable)
- curl (for validation)
1. Validate It Works
# Clone and enter
git clone <repo-url>
cd stemedb
# Run end-to-end validation (builds, starts server, asserts, queries, shuts down)
make validate
Expected output:
==========================================
StemeDB Validation
==========================================
[PASS] Build complete
[PASS] Server is healthy
[PASS] Health check passed
[PASS] Assertion created: abc123...
[PASS] Query returned correct data
[PASS] Lens query (Recency) works
==========================================
All validation checks passed!
==========================================
If you see "All validation checks passed!", StemeDB is working correctly.
2. Start the Server
cargo run --package stemedb-api
The server starts on http://localhost:18180.
3. Explore the API
Open the Swagger UI for interactive documentation:
http://localhost:18180/swagger-ui
Or check health via curl:
curl http://localhost:18180/v1/health
# {"status":"healthy","version":"0.1.0","assertions_count":0}
4. Create Your First Assertion
Using the Go SDK (recommended):
cd sdk/go/examples/basic
go run main.go
Or via curl (requires generating Ed25519 signatures):
# Generate a signed assertion
cargo run --package stemedb-api --example gen_test_assertion > /tmp/assertion.json
# Submit it
curl -X POST http://localhost:18180/v1/assert \
-H "Content-Type: application/json" \
-d @/tmp/assertion.json
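The same submission can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: build_assert_request is a hypothetical helper, and the payload is assumed to be the already-signed JSON written by gen_test_assertion above. The request is built but not sent, so the snippet runs without a server.

```python
import urllib.request

BASE_URL = "http://localhost:18180"  # default address used in this guide


def build_assert_request(signed_json: bytes, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/assert carrying a signed assertion.

    Hypothetical helper: the body is passed through untouched, so it must
    already contain a valid signature.
    """
    return urllib.request.Request(
        url=f"{base_url}/v1/assert",
        data=signed_json,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually submit, with the server running:
#   with open("/tmp/assertion.json", "rb") as f:
#       req = build_assert_request(f.read())
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status, resp.read().decode())
```

Keeping request construction separate from sending makes the helper easy to check in isolation.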
5. Query It Back
# Query by subject and predicate
curl "http://localhost:18180/v1/query?subject=StemeDB_Validation&predicate=test_status"
# Query with a lens (conflict resolution)
curl "http://localhost:18180/v1/query?subject=StemeDB_Validation&predicate=test_status&lens=Recency"
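When subjects or predicates contain spaces or other reserved characters, it is safer to let a library encode the query string. A minimal sketch, assuming only the subject, predicate, and lens parameters shown above; query_url is a hypothetical helper name:

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:18180"  # default address used in this guide


def query_url(subject: str, predicate: str, lens: Optional[str] = None,
              base_url: str = BASE_URL) -> str:
    """Return a /v1/query URL with properly encoded parameters."""
    params = {"subject": subject, "predicate": predicate}
    if lens is not None:
        params["lens"] = lens  # e.g. "Recency", as in the curl example
    return f"{base_url}/v1/query?{urlencode(params)}"


print(query_url("StemeDB_Validation", "test_status"))
print(query_url("StemeDB_Validation", "test_status", lens="Recency"))
```

The two printed URLs match the curl commands above.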
6. See Conflict in Action (The "Git for Truth" Moment)
Episteme stores Claims, not Facts. When multiple agents assert conflicting values, the Skeptic endpoint shows you all competing claims instead of picking a winner.
Create Conflicting Assertions
Using the Go SDK, create assertions with different claims about the same subject:
cd sdk/go/examples/conflict
go run main.go
Query with Skeptic
The Skeptic endpoint reveals disagreement instead of hiding it:
curl "http://localhost:18180/v1/skeptic?subject=GLP1_Agonists&predicate=cardiovascular_benefit"
Response shows all competing claims:
{
"status": "Contested",
"conflict_score": 0.72,
"claims": [
{"value": {"type": "Boolean", "value": true}, "weight_share": 0.48, "assertion_count": 1},
{"value": {"type": "Boolean", "value": false}, "weight_share": 0.52, "assertion_count": 1}
],
"candidates_count": 2
}
Key insight: Instead of silently picking a winner, you see the disagreement. This is critical for health/finance domains where hiding conflict is dangerous.
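A client can turn that response into a decision, for example "do not auto-resolve if contested". The sketch below works on the sample response above; summarize_skeptic is a hypothetical helper, and "Contested" is the only status value this guide shows, so any other status is simply treated as not contested.

```python
import json

# Sample Skeptic response copied from this guide.
SAMPLE = """
{
  "status": "Contested",
  "conflict_score": 0.72,
  "claims": [
    {"value": {"type": "Boolean", "value": true}, "weight_share": 0.48, "assertion_count": 1},
    {"value": {"type": "Boolean", "value": false}, "weight_share": 0.52, "assertion_count": 1}
  ],
  "candidates_count": 2
}
"""


def summarize_skeptic(response: dict) -> dict:
    """Report the heaviest claim and its lead over the runner-up."""
    claims = sorted(response["claims"], key=lambda c: c["weight_share"], reverse=True)
    if not claims:
        return {"contested": False, "leading_value": None, "margin": 0.0}
    runner_up = claims[1]["weight_share"] if len(claims) > 1 else 0.0
    return {
        "contested": response["status"] == "Contested",
        "leading_value": claims[0]["value"]["value"],
        "margin": round(claims[0]["weight_share"] - runner_up, 4),
    }


print(summarize_skeptic(json.loads(SAMPLE)))
```

For the sample, the leading value is false with a margin of only 0.04, which is exactly the kind of near-tie a caller should surface rather than resolve silently.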
7. Authority Tiers (Source-Class Resolution)
Different sources have different authority. A regulatory filing (FDA) outweighs an anecdotal tweet. The Layered endpoint shows per-tier consensus.
Query with Layered Consensus
The conflict example creates assertions with different source_class values (Clinical vs Anecdotal).
The Layered endpoint shows how each tier resolves independently:
curl "http://localhost:18180/v1/layered?subject=GLP1_Agonists&predicate=cardiovascular_benefit"
Response shows tier-by-tier resolution:
{
"tiers": [
{"tier": 1, "source_class": "Clinical", "winner": {"object": {"type": "Boolean", "value": true}}, "conflict_score": 0.0},
{"tier": 5, "source_class": "Anecdotal", "winner": {"object": {"type": "Boolean", "value": false}}, "conflict_score": 0.0}
],
"overall_winner": {"object": {"type": "Boolean", "value": true}},
"overall_conflict_score": 0.85
}
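Key insight: The Clinical tier (peer-reviewed research) wins even though the Anecdotal tier (social media) disagrees. Each tier is internally unanimous (conflict_score 0.0), while the high overall_conflict_score (0.85) tells you the tiers disagree with each other.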
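To see which tiers were overruled, compare each tier's winner with the overall winner. This is a minimal sketch over the sample response above; dissenting_tiers is a hypothetical helper, and it assumes every tier carries a winner object shaped like the ones shown.

```python
import json

# Sample Layered response copied from this guide.
SAMPLE = """
{
  "tiers": [
    {"tier": 1, "source_class": "Clinical", "winner": {"object": {"type": "Boolean", "value": true}}, "conflict_score": 0.0},
    {"tier": 5, "source_class": "Anecdotal", "winner": {"object": {"type": "Boolean", "value": false}}, "conflict_score": 0.0}
  ],
  "overall_winner": {"object": {"type": "Boolean", "value": true}},
  "overall_conflict_score": 0.85
}
"""


def dissenting_tiers(response: dict) -> list:
    """Return the source classes whose tier winner differs from the overall winner."""
    overall = response["overall_winner"]["object"]["value"]
    return sorted(
        tier["source_class"]
        for tier in response["tiers"]
        if tier["winner"]["object"]["value"] != overall
    )


print(dissenting_tiers(json.loads(SAMPLE)))
```

For the sample, only the Anecdotal tier dissents.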
8. Distributed Mode (Cluster Node)
StemeDB supports horizontal scaling across multiple nodes. Each node runs SWIM membership for discovery, range sharding for data distribution, and a Gateway for request routing.
Start a Cluster Node
cargo run --package stemedb-cluster --bin stemedb-node
The node starts on http://localhost:18181 (Gateway API) and 127.0.0.1:18182 (RPC).
Check Cluster Health
curl http://localhost:18181/v1/health
# {"healthy":true,"reachable_nodes":0,"joined":true}
See Cluster Topology
curl http://localhost:18181/v1/cluster/status
Response shows shards and nodes:
{
"node_count": 0,
"shard_count": 4,
"meta_version": 1,
"nodes": []
}
Test Subject Routing
See which shard a subject maps to:
curl "http://localhost:18181/v1/route?subject=Tesla_Inc"
# {"subject":"Tesla_Inc","shard_id":0,"replicas":["abc12345"]}
curl "http://localhost:18181/v1/route?subject=Bitcoin"
# {"subject":"Bitcoin","shard_id":3,"replicas":["abc12345"]}
Subjects are spread across shards for load distribution, so different subjects can map to different shards.
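This guide does not specify how a subject is mapped to a shard, so the following only illustrates the general idea of a deterministic subject-to-shard mapping. It is not StemeDB's algorithm and will not reproduce the shard IDs shown above; shard_for and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


def shard_for(subject: str, num_shards: int = 4) -> int:
    """Map a subject to a shard index in [0, num_shards).

    Illustrative only: hashes the subject with SHA-256 and reduces the
    first 8 bytes modulo the shard count.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(subject.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for subject in ("Tesla_Inc", "Bitcoin"):
    print(subject, "->", shard_for(subject))
```

The property that matters is that every node computes the same shard for the same subject. For the real placement, always ask the /v1/route endpoint.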
Inspect a Shard
curl http://localhost:18181/v1/shards/0
Response shows shard metadata:
{
"shard_id": 0,
"replicas": ["abc12345"],
"size_bytes": 0,
"assertion_count": 0,
"generation": 1
}
Note: The cluster node demonstrates routing topology. Full assertion storage requires running stemedb-api nodes as backends (integration in progress).
What's Next?
| Goal | Resource |
|---|---|
| Understand the vision | vision.md |
| See real use cases | use-cases/README.md |
| Use the Go SDK | sdk/go/steme/README.md |
| Build AI agents | sdk/go/adk/README.md |
| Understand architecture | architecture.md |
| API reference | crates/stemedb-api/README.md |
| Distributed architecture | docs/research/distributed-write-path.md |
Common Issues
Build fails
rustup update stable
cargo clean
cargo build --workspace
Server won't start (port in use)
# Use a different port
STEMEDB_BIND_ADDR=127.0.0.1:18190 cargo run --package stemedb-api
Validation script fails
Check the server log in the temp directory:
cat tmp/validate-*/server.log
Query returns empty results
The ingestion worker runs asynchronously. If you're writing directly to the WAL (not via API), wait ~500ms before querying.
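In a script, retrying is usually more robust than a single fixed sleep. A minimal sketch, assuming a caller-supplied fetch function that returns a list of results (the actual /v1/query response shape is not shown in this guide); poll_until_nonempty is a hypothetical helper:

```python
import time
from typing import Callable


def poll_until_nonempty(fetch: Callable[[], list], attempts: int = 5,
                        delay_s: float = 0.5,
                        sleep: Callable[[float], None] = time.sleep) -> list:
    """Call fetch() until it returns a non-empty list or attempts run out.

    Sleeps delay_s between attempts (0.5s matches the ~500ms hint above).
    Returns an empty list if nothing showed up.
    """
    for attempt in range(attempts):
        results = fetch()
        if results:
            return results
        if attempt < attempts - 1:  # no point sleeping after the last try
            sleep(delay_s)
    return []


# Demo with a fake fetch that becomes non-empty on the third call.
calls = {"n": 0}


def fake_fetch() -> list:
    calls["n"] += 1
    return ["assertion"] if calls["n"] >= 3 else []


print(poll_until_nonempty(fake_fetch, sleep=lambda s: None))
```

Passing sleep as a parameter keeps the helper fast to test; in real use, leave it at the default.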
Environment Variables
Single-Node API (stemedb-api)
| Variable | Default | Description |
|---|---|---|
| STEMEDB_BIND_ADDR | 127.0.0.1:18180 | HTTP server address |
| STEMEDB_WAL_DIR | data/wal | Write-ahead log directory |
| STEMEDB_DB_DIR | data/db | KV store directory |
| STEMEDB_METER_ENABLED | true | Enable economic throttling |
Cluster Node (stemedb-node)
| Variable | Default | Description |
|---|---|---|
| STEMEDB_NODE_API_ADDR | 127.0.0.1:18181 | Gateway HTTP address |
| STEMEDB_NODE_RPC_ADDR | 127.0.0.1:18182 | gRPC sync address |
| STEMEDB_SEED_NODES | (empty) | Comma-separated seed node RPC addresses |
| STEMEDB_NUM_SHARDS | 4 | Number of shards (power of 2 recommended) |
| STEMEDB_REPLICATION_FACTOR | 1 | Replica count per shard |
| STEMEDB_DATACENTER | (empty) | Datacenter/region label |