# Quick Start
Get StemeDB running and validated in under 5 minutes.
## Prerequisites
- Rust 1.75+ (`rustup update stable`)
- curl (for validation)
## 1. Validate It Works
```bash
# Clone and enter
git clone <repo-url>
cd stemedb
# Run end-to-end validation (builds, starts server, asserts, queries, shuts down)
make validate
```
Expected output:
```
==========================================
StemeDB Validation
==========================================
[PASS] Build complete
[PASS] Server is healthy
[PASS] Health check passed
[PASS] Assertion created: abc123...
[PASS] Query returned correct data
[PASS] Lens query (Recency) works
==========================================
All validation checks passed!
==========================================
```
If you see "All validation checks passed!", StemeDB is working correctly.
## 2. Start the Server
```bash
cargo run --package stemedb-api
```
The server starts on `http://localhost:3000`.
## 3. Explore the API
Open the Swagger UI for interactive documentation:
```
http://localhost:3000/swagger-ui
```
Or check health via curl:
```bash
curl http://localhost:3000/v1/health
# {"status":"healthy","version":"0.1.0","assertions_count":0}
```
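
To script this check (for example, before a test run sends requests), the same endpoint can be polled until it reports healthy. The following is a minimal Python sketch, assuming the response shape shown above, where `status` equals `"healthy"`. The helper names `is_healthy` and `wait_for_health` are hypothetical and not part of StemeDB.

```python
import json
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:3000/v1/health"  # endpoint from this guide


def is_healthy(body: str) -> bool:
    """Return True if a /v1/health body reports status "healthy".

    Assumes the JSON shape shown in this guide; any other status string,
    or a body that is not a JSON object, is treated as unhealthy.
    """
    try:
        return json.loads(body).get("status") == "healthy"
    except (json.JSONDecodeError, AttributeError):
        return False


def wait_for_health(url: str = HEALTH_URL, timeout_s: float = 30.0,
                    interval_s: float = 0.5) -> bool:
    """Poll the health endpoint until it is healthy or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if is_healthy(resp.read().decode("utf-8")):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server is not accepting connections yet
        time.sleep(interval_s)
    return False
```

Calling `wait_for_health()` after starting the server returns `True` once the endpoint responds as shown, or `False` if the timeout expires first.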
## 4. Create Your First Assertion
Using the Go SDK (recommended):
```bash
cd sdk/go/examples/basic
go run main.go
```
Or via curl (requires generating Ed25519 signatures):
```bash
# Generate a signed assertion
cargo run --package stemedb-api --example gen_test_assertion > /tmp/assertion.json
# Submit it
curl -X POST http://localhost:3000/v1/assert \
-H "Content-Type: application/json" \
-d @/tmp/assertion.json
```
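
The same submission can be made from a script. The sketch below mirrors the curl command using only the Python standard library. It assumes the file already contains a signed assertion (for example, the output of `gen_test_assertion`) and does not sign anything itself. `build_assert_request` and `submit_assertion` are hypothetical helper names.

```python
import urllib.request

ASSERT_URL = "http://localhost:3000/v1/assert"  # endpoint from this guide


def build_assert_request(payload: bytes,
                         url: str = ASSERT_URL) -> urllib.request.Request:
    """Build (but do not send) the POST that the curl command above issues."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_assertion(path: str, url: str = ASSERT_URL) -> str:
    """Read a pre-signed assertion file, POST it, and return the response body."""
    with open(path, "rb") as f:
        request = build_assert_request(f.read(), url)
    with urllib.request.urlopen(request, timeout=5) as resp:
        return resp.read().decode("utf-8")
```

With the server running, `submit_assertion("/tmp/assertion.json")` sends the file generated in the previous step.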
## 5. Query It Back
```bash
# Query by subject and predicate
curl "http://localhost:3000/v1/query?subject=StemeDB_Validation&predicate=test_status"
# Query with a lens (conflict resolution)
curl "http://localhost:3000/v1/query?subject=StemeDB_Validation&predicate=test_status&lens=Recency"
```
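
When subjects or predicates come from user input, it is safer to build the query string programmatically so that special characters are percent-encoded. A minimal sketch, assuming only the `subject`, `predicate`, and `lens` parameters shown above (`build_query_url` is a hypothetical helper):

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:3000"  # default bind address from this guide


def build_query_url(subject: str, predicate: str,
                    lens: Optional[str] = None,
                    base_url: str = BASE_URL) -> str:
    """Build a /v1/query URL, percent-encoding each parameter value."""
    params = {"subject": subject, "predicate": predicate}
    if lens is not None:
        params["lens"] = lens  # e.g. "Recency", as in the example above
    return f"{base_url}/v1/query?{urlencode(params)}"


print(build_query_url("StemeDB_Validation", "test_status", lens="Recency"))
# http://localhost:3000/v1/query?subject=StemeDB_Validation&predicate=test_status&lens=Recency
```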
## 6. See Conflict in Action (The "Git for Truth" Moment)
Episteme stores **Claims, not Facts**. When multiple agents assert conflicting values,
the Skeptic endpoint shows you all competing claims instead of picking a winner.
### Create Conflicting Assertions
Using the Go SDK, create assertions with different claims about the same subject:
```bash
cd sdk/go/examples/conflict
go run main.go
```
### Query with Skeptic
The Skeptic endpoint reveals disagreement instead of hiding it:
```bash
curl "http://localhost:3000/v1/skeptic?subject=GLP1_Agonists&predicate=cardiovascular_benefit"
```
Response shows all competing claims:
```json
{
"status": "Contested",
"conflict_score": 0.72,
"claims": [
{"value": {"type": "Boolean", "value": true}, "weight_share": 0.48, "assertion_count": 1},
{"value": {"type": "Boolean", "value": false}, "weight_share": 0.52, "assertion_count": 1}
],
"candidates_count": 2
}
```
**Key insight:** Instead of silently picking a winner, the endpoint shows you the disagreement. This matters in domains such as health and finance, where hiding conflict is dangerous.
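
A client can act on this response by ranking the claims and flagging contested results for review. The sketch below parses the sample response above. The `threshold` parameter is a hypothetical application-level cutoff, not a StemeDB setting, and the field names are taken from the sample only.

```python
import json

# Sample response copied from this guide.
SKEPTIC_SAMPLE = """
{
  "status": "Contested",
  "conflict_score": 0.72,
  "claims": [
    {"value": {"type": "Boolean", "value": true}, "weight_share": 0.48, "assertion_count": 1},
    {"value": {"type": "Boolean", "value": false}, "weight_share": 0.52, "assertion_count": 1}
  ],
  "candidates_count": 2
}
"""


def summarize_skeptic(body: str, threshold: float = 0.5):
    """Return (needs_review, claims ranked by weight_share, highest first)."""
    data = json.loads(body)
    ranked = sorted(data["claims"], key=lambda c: c["weight_share"], reverse=True)
    needs_review = (data["status"] == "Contested"
                    and data["conflict_score"] >= threshold)
    return needs_review, [(c["value"]["value"], c["weight_share"]) for c in ranked]


needs_review, ranked = summarize_skeptic(SKEPTIC_SAMPLE)
print(needs_review, ranked)
# True [(False, 0.52), (True, 0.48)]
```

Returning every claim with its weight, rather than only the top one, keeps the disagreement visible to the caller.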
## 7. Authority Tiers (Source-Class Resolution)
Different sources have different authority. A regulatory filing (FDA) outweighs
an anecdotal tweet. The Layered endpoint shows per-tier consensus.
### Query with Layered Consensus
The conflict example creates assertions with different `source_class` values (Clinical vs Anecdotal).
The Layered endpoint shows how each tier resolves independently:
```bash
curl "http://localhost:3000/v1/layered?subject=GLP1_Agonists&predicate=cardiovascular_benefit"
```
Response shows tier-by-tier resolution:
```json
{
"tiers": [
{"tier": 1, "source_class": "Clinical", "winner": {"object": {"type": "Boolean", "value": true}}, "conflict_score": 0.0},
{"tier": 5, "source_class": "Anecdotal", "winner": {"object": {"type": "Boolean", "value": false}}, "conflict_score": 0.0}
],
"overall_winner": {"object": {"type": "Boolean", "value": true}},
"overall_conflict_score": 0.85
}
```
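**Key insight:** The Clinical tier (peer-reviewed research) wins even though the Anecdotal tier (social media) disagrees. Each tier is internally consistent (`conflict_score` of 0.0), while the `overall_conflict_score` (0.85 here) tells you the tiers disagree with each other.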
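
To see which tiers disagree with the overall result, a client can compare each tier's winner against `overall_winner`. A minimal sketch over the sample response above (`dissenting_tiers` is a hypothetical helper; field names are taken from the sample only):

```python
import json

# Sample response copied from this guide.
LAYERED_SAMPLE = """
{
  "tiers": [
    {"tier": 1, "source_class": "Clinical", "winner": {"object": {"type": "Boolean", "value": true}}, "conflict_score": 0.0},
    {"tier": 5, "source_class": "Anecdotal", "winner": {"object": {"type": "Boolean", "value": false}}, "conflict_score": 0.0}
  ],
  "overall_winner": {"object": {"type": "Boolean", "value": true}},
  "overall_conflict_score": 0.85
}
"""


def dissenting_tiers(body: str):
    """Return (tier, source_class) for tiers whose winner differs from the overall winner."""
    data = json.loads(body)
    overall = data["overall_winner"]["object"]
    return [(t["tier"], t["source_class"])
            for t in data["tiers"]
            if t["winner"]["object"] != overall]


print(dissenting_tiers(LAYERED_SAMPLE))
# [(5, 'Anecdotal')]
```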
## 8. Distributed Mode (Cluster Node)
StemeDB supports horizontal scaling across multiple nodes. Each node runs SWIM membership for discovery, range sharding for data distribution, and a Gateway for request routing.
### Start a Cluster Node
```bash
cargo run --package stemedb-cluster --bin stemedb-node
```
The node starts on `http://localhost:4000` (Gateway API) and `127.0.0.1:9090` (RPC).
### Check Cluster Health
```bash
curl http://localhost:4000/v1/health
# {"healthy":true,"reachable_nodes":0,"joined":true}
```
### See Cluster Topology
```bash
curl http://localhost:4000/v1/cluster/status
```
Response shows shards and nodes:
```json
{
"node_count": 0,
"shard_count": 4,
"meta_version": 1,
"nodes": []
}
```
### Test Subject Routing
See which shard a subject maps to:
```bash
curl "http://localhost:4000/v1/route?subject=Tesla_Inc"
# {"subject":"Tesla_Inc","shard_id":0,"replicas":["abc12345"]}
curl "http://localhost:4000/v1/route?subject=Bitcoin"
# {"subject":"Bitcoin","shard_id":3,"replicas":["abc12345"]}
```
Different subjects hash to different shards for load distribution.
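
As an illustration of how hashing spreads subjects across shards, the sketch below maps a subject to a shard by hashing it and reducing the result modulo the shard count. This is a generic technique shown under assumptions: `shard_for` is hypothetical, StemeDB's actual routing function is not documented here, and the shard IDs it produces will not necessarily match the `/v1/route` responses above.

```python
import hashlib
from collections import Counter


def shard_for(subject: str, num_shards: int = 4) -> int:
    """Map a subject to a shard index in [0, num_shards) via SHA-256.

    Illustrative only; not StemeDB's routing function.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(subject.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Spread 1,000 synthetic subjects over 4 shards and count how many land on each.
counts = Counter(shard_for(f"subject_{i}") for i in range(1000))
print(sorted(counts.items()))
```

The mapping is deterministic, so every node that applies the same function to the same subject agrees on the shard without coordination.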
### Inspect a Shard
```bash
curl http://localhost:4000/v1/shards/0
```
Response shows shard metadata:
```json
{
"shard_id": 0,
"replicas": ["abc12345"],
"size_bytes": 0,
"assertion_count": 0,
"generation": 1
}
```
**Note:** The cluster node demonstrates routing topology. Full assertion storage requires running `stemedb-api` nodes as backends (integration in progress).
## What's Next?
| Goal | Resource |
|------|----------|
| Understand the vision | [vision.md](./vision.md) |
| See real use cases | [use-cases/README.md](./use-cases/README.md) |
| Use the Go SDK | [sdk/go/steme/README.md](./sdk/go/steme/README.md) |
| Build AI agents | [sdk/go/adk/README.md](./sdk/go/adk/README.md) |
| Understand architecture | [architecture.md](./architecture.md) |
| API reference | [crates/stemedb-api/README.md](./crates/stemedb-api/README.md) |
| Distributed architecture | [docs/research/distributed-write-path.md](./docs/research/distributed-write-path.md) |
## Common Issues
### Build fails
```bash
rustup update stable
cargo clean
cargo build --workspace
```
### Server won't start (port in use)
```bash
# Use a different port
STEMEDB_BIND_ADDR=127.0.0.1:3001 cargo run --package stemedb-api
```
### Validation script fails
Check the server log in the temp directory:
```bash
cat tmp/validate-*/server.log
```
### Query returns empty results
The ingestion worker runs asynchronously. If you're writing directly to the WAL (not via API), wait ~500ms before querying.
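
Rather than sleeping for a fixed interval, a test or script can poll until the data appears. The sketch below is a generic retry helper. `poll_until` is a hypothetical name, and the demo uses a stand-in for the query call because the query response shape is not shown in this guide.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(fetch: Callable[[], T], is_ready: Callable[[T], bool],
               timeout_s: float = 5.0, interval_s: float = 0.1) -> Optional[T]:
    """Call fetch() until is_ready(result) is true; return None on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)


# Demo with a stand-in for a query call: empty twice, then one result.
responses = iter([[], [], [{"subject": "StemeDB_Validation"}]])
found = poll_until(lambda: next(responses), lambda rows: len(rows) > 0,
                   interval_s=0.01)
print(found)
# [{'subject': 'StemeDB_Validation'}]
```

In real use, `fetch` would issue the query request and `is_ready` would check for a non-empty result.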
## Environment Variables
### Single-Node API (`stemedb-api`)
| Variable | Default | Description |
|----------|---------|-------------|
| `STEMEDB_BIND_ADDR` | `127.0.0.1:3000` | HTTP server address |
| `STEMEDB_WAL_DIR` | `data/wal` | Write-ahead log directory |
| `STEMEDB_DB_DIR` | `data/db` | KV store directory |
| `STEMEDB_METER_ENABLED` | `true` | Enable economic throttling |
### Cluster Node (`stemedb-node`)
| Variable | Default | Description |
|----------|---------|-------------|
| `STEMEDB_NODE_API_ADDR` | `127.0.0.1:4000` | Gateway HTTP address |
| `STEMEDB_NODE_RPC_ADDR` | `127.0.0.1:9090` | gRPC sync address |
| `STEMEDB_SEED_NODES` | (empty) | Comma-separated seed node RPC addresses |
| `STEMEDB_NUM_SHARDS` | `4` | Number of shards (power of 2 recommended) |
| `STEMEDB_REPLICATION_FACTOR` | `1` | Replica count per shard |
| `STEMEDB_DATACENTER` | (empty) | Datacenter/region label |
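
A launcher script may want to read these variables and check them before starting a node. The sketch below mirrors the cluster-node table above with its documented defaults. It is an assumption about how a wrapper could handle the variables, not a description of how `stemedb-node` parses them; `NodeConfig`, `load_node_config`, and `is_power_of_two` are hypothetical names.

```python
import os
from dataclasses import dataclass, field
from typing import List, Mapping


@dataclass
class NodeConfig:
    """Cluster-node settings with the defaults listed in the table above."""
    api_addr: str = "127.0.0.1:4000"
    rpc_addr: str = "127.0.0.1:9090"
    seed_nodes: List[str] = field(default_factory=list)
    num_shards: int = 4
    replication_factor: int = 1
    datacenter: str = ""


def load_node_config(env: Mapping[str, str] = os.environ) -> NodeConfig:
    """Read the cluster-node variables from env, falling back to the defaults."""
    d = NodeConfig()
    raw_seeds = env.get("STEMEDB_SEED_NODES", "")
    return NodeConfig(
        api_addr=env.get("STEMEDB_NODE_API_ADDR", d.api_addr),
        rpc_addr=env.get("STEMEDB_NODE_RPC_ADDR", d.rpc_addr),
        # Comma-separated list; blank entries and surrounding spaces are dropped.
        seed_nodes=[s.strip() for s in raw_seeds.split(",") if s.strip()],
        num_shards=int(env.get("STEMEDB_NUM_SHARDS", d.num_shards)),
        replication_factor=int(env.get("STEMEDB_REPLICATION_FACTOR",
                                       d.replication_factor)),
        datacenter=env.get("STEMEDB_DATACENTER", d.datacenter),
    )


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ...; the table recommends such shard counts."""
    return n > 0 and (n & (n - 1)) == 0


cfg = load_node_config({"STEMEDB_SEED_NODES": "10.0.0.1:9090, 10.0.0.2:9090",
                        "STEMEDB_NUM_SHARDS": "8"})
print(cfg.seed_nodes, cfg.num_shards, is_power_of_two(cfg.num_shards))
# ['10.0.0.1:9090', '10.0.0.2:9090'] 8 True
```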