stemedb/quickstart.md
jordan d3a88585fe feat: Phase 6 UAT - Admission control, HLC recency, cluster coordination

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 00:43:37 -07:00


Quick Start

Get StemeDB running and validated in under 5 minutes.

Prerequisites

  • Rust 1.75+ (rustup update stable)
  • curl (for validation)

1. Validate It Works

# Clone and enter
git clone <repo-url>
cd stemedb

# Run end-to-end validation (builds, starts server, asserts, queries, shuts down)
make validate

Expected output:

==========================================
  StemeDB Validation
==========================================

[PASS] Build complete
[PASS] Server is healthy
[PASS] Health check passed
[PASS] Assertion created: abc123...
[PASS] Query returned correct data
[PASS] Lens query (Recency) works

==========================================
  All validation checks passed!
==========================================

If you see "All validation checks passed!", StemeDB is working correctly.

2. Start the Server

cargo run --package stemedb-api

The server starts on http://localhost:3000.

3. Explore the API

Open the Swagger UI for interactive documentation:

http://localhost:3000/swagger-ui

Or check health via curl:

curl http://localhost:3000/v1/health
# {"status":"healthy","version":"0.1.0","assertions_count":0}
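If you script against the server, it can help to wait for the health endpoint before sending requests. The following is a minimal sketch that assumes only the `status` field shown in the sample response above; the function names, retry count, and delay are illustrative choices, not part of StemeDB.

```python
import json
import time
import urllib.request


def fetch_json(url, timeout=2.0):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def wait_until_healthy(url, attempts=10, delay=0.5, fetch=fetch_json, sleep=time.sleep):
    """Poll the health endpoint until it reports status "healthy".

    Returns the health payload, or raises TimeoutError after `attempts` tries.
    `fetch` and `sleep` are injectable so the loop can be exercised offline.
    """
    for _ in range(attempts):
        try:
            payload = fetch(url)
            if payload.get("status") == "healthy":
                return payload
        except OSError:
            # Covers connection-refused errors while the server is still starting.
            pass
        sleep(delay)
    raise TimeoutError(f"{url} not healthy after {attempts} attempts")
```

Calling `wait_until_healthy("http://localhost:3000/v1/health")` after `cargo run --package stemedb-api` should return once the server is up.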

4. Create Your First Assertion

Using the Go SDK (recommended):

cd sdk/go/examples/basic
go run main.go

Or via curl (requires generating Ed25519 signatures):

# Generate a signed assertion
cargo run --package stemedb-api --example gen_test_assertion > /tmp/assertion.json

# Submit it
curl -X POST http://localhost:3000/v1/assert \
  -H "Content-Type: application/json" \
  -d @/tmp/assertion.json
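The same submission can be sketched in Python using only the standard library. This assumes the file produced by `gen_test_assertion` is already a complete, signed JSON payload; the helper names are hypothetical, and the sketch does not generate or verify signatures.

```python
import json
import urllib.request


def build_assert_request(assertion, base_url="http://localhost:3000"):
    """Build (but do not send) a POST request for /v1/assert."""
    body = json.dumps(assertion).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/assert",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_assertion_file(path, base_url="http://localhost:3000"):
    """Load a signed assertion from `path`, submit it, and return the raw response body."""
    with open(path, "r", encoding="utf-8") as fh:
        assertion = json.load(fh)
    with urllib.request.urlopen(build_assert_request(assertion, base_url)) as resp:
        return resp.read().decode("utf-8")
```

With the server running, `submit_assertion_file("/tmp/assertion.json")` mirrors the curl command above.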

5. Query It Back

# Query by subject and predicate
curl "http://localhost:3000/v1/query?subject=StemeDB_Validation&predicate=test_status"

# Query with a lens (conflict resolution)
curl "http://localhost:3000/v1/query?subject=StemeDB_Validation&predicate=test_status&lens=Recency"
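When building these URLs in code, it is safer to let a library encode the query string than to concatenate it by hand. A small sketch, assuming only the `subject`, `predicate`, and `lens` parameters shown above (the helper name is hypothetical):

```python
from urllib.parse import urlencode


def build_query_url(subject, predicate, lens=None, base_url="http://localhost:3000"):
    """Return a /v1/query URL; `lens` is appended only when given."""
    params = {"subject": subject, "predicate": predicate}
    if lens is not None:
        params["lens"] = lens
    return f"{base_url}/v1/query?{urlencode(params)}"
```

Subjects containing spaces or other reserved characters are percent-encoded automatically.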

6. See Conflict in Action (The "Git for Truth" Moment)

StemeDB stores Claims, not Facts. When multiple agents assert conflicting values, the Skeptic endpoint shows you all competing claims instead of picking a winner.

Create Conflicting Assertions

Using the Go SDK, create assertions with different claims about the same subject:

cd sdk/go/examples/conflict
go run main.go

Query with Skeptic

The Skeptic endpoint reveals disagreement instead of hiding it:

curl "http://localhost:3000/v1/skeptic?subject=GLP1_Agonists&predicate=cardiovascular_benefit"

Response shows all competing claims:

{
  "status": "Contested",
  "conflict_score": 0.72,
  "claims": [
    {"value": {"type": "Boolean", "value": true}, "weight_share": 0.48, "assertion_count": 1},
    {"value": {"type": "Boolean", "value": false}, "weight_share": 0.52, "assertion_count": 1}
  ],
  "candidates_count": 2
}

Key insight: Instead of silently picking a winner, the response exposes the disagreement. This matters most in domains such as health and finance, where hiding conflict is dangerous.
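A client might want to see how narrow the lead is before acting on a contested result. The sketch below works on the sample response shown above and assumes only the fields it contains; `leading_claim` is a hypothetical helper, not an SDK function.

```python
import json

# Sample Skeptic response copied from this guide.
SAMPLE_RESPONSE = json.loads("""
{
  "status": "Contested",
  "conflict_score": 0.72,
  "claims": [
    {"value": {"type": "Boolean", "value": true}, "weight_share": 0.48, "assertion_count": 1},
    {"value": {"type": "Boolean", "value": false}, "weight_share": 0.52, "assertion_count": 1}
  ],
  "candidates_count": 2
}
""")


def leading_claim(response):
    """Return (value, weight_share, margin) for the highest-weighted claim.

    `margin` is the gap in weight_share between the top claim and the runner-up.
    """
    ranked = sorted(response["claims"], key=lambda c: c["weight_share"], reverse=True)
    top = ranked[0]
    runner_up_share = ranked[1]["weight_share"] if len(ranked) > 1 else 0.0
    margin = top["weight_share"] - runner_up_share
    return top["value"]["value"], top["weight_share"], margin
```

For the sample, the leading claim is `false` with a margin of roughly 0.04, which is a reason to treat the result as unsettled rather than as an answer.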

7. Authority Tiers (Source-Class Resolution)

Different sources have different authority. A regulatory filing (FDA) outweighs an anecdotal tweet. The Layered endpoint shows per-tier consensus.

Query with Layered Consensus

The conflict example creates assertions with different source_class values (Clinical vs Anecdotal). The Layered endpoint shows how each tier resolves independently:

curl "http://localhost:3000/v1/layered?subject=GLP1_Agonists&predicate=cardiovascular_benefit"

Response shows tier-by-tier resolution:

{
  "tiers": [
    {"tier": 1, "source_class": "Clinical", "winner": {"object": {"type": "Boolean", "value": true}}, "conflict_score": 0.0},
    {"tier": 5, "source_class": "Anecdotal", "winner": {"object": {"type": "Boolean", "value": false}}, "conflict_score": 0.0}
  ],
  "overall_winner": {"object": {"type": "Boolean", "value": true}},
  "overall_conflict_score": 0.85
}

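Key insight: The Clinical tier (peer-reviewed research) wins even though the Anecdotal tier (social media) disagrees. The high overall_conflict_score (0.85 here) signals that the tiers disagree with each other, even though each tier is internally consistent (conflict_score 0.0).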
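To find which tiers disagree with the overall result, a client can compare each tier's winner against `overall_winner`. This is a sketch over the sample response above, assuming only the fields it shows; `dissenting_tiers` is a hypothetical helper.

```python
# Sample Layered response copied from this guide.
SAMPLE_LAYERED = {
    "tiers": [
        {"tier": 1, "source_class": "Clinical",
         "winner": {"object": {"type": "Boolean", "value": True}}, "conflict_score": 0.0},
        {"tier": 5, "source_class": "Anecdotal",
         "winner": {"object": {"type": "Boolean", "value": False}}, "conflict_score": 0.0},
    ],
    "overall_winner": {"object": {"type": "Boolean", "value": True}},
    "overall_conflict_score": 0.85,
}


def dissenting_tiers(response):
    """Return the source_class of every tier whose winner differs from the overall winner."""
    overall = response["overall_winner"]["object"]
    return [
        tier["source_class"]
        for tier in response["tiers"]
        if tier["winner"]["object"] != overall
    ]
```

For the sample, only the Anecdotal tier dissents.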

8. Distributed Mode (Cluster Node)

StemeDB supports horizontal scaling across multiple nodes. Each node runs SWIM membership for discovery, range sharding for data distribution, and a Gateway for request routing.

Start a Cluster Node

cargo run --package stemedb-cluster --bin stemedb-node

The node starts on http://localhost:4000 (Gateway API) and 127.0.0.1:9090 (RPC).

Check Cluster Health

curl http://localhost:4000/v1/health
# {"healthy":true,"reachable_nodes":0,"joined":true}

See Cluster Topology

curl http://localhost:4000/v1/cluster/status

Response shows shards and nodes:

{
  "node_count": 0,
  "shard_count": 4,
  "meta_version": 1,
  "nodes": []
}

Test Subject Routing

See which shard a subject maps to:

curl "http://localhost:4000/v1/route?subject=Tesla_Inc"
# {"subject":"Tesla_Inc","shard_id":0,"replicas":["abc12345"]}

curl "http://localhost:4000/v1/route?subject=Bitcoin"
# {"subject":"Bitcoin","shard_id":3,"replicas":["abc12345"]}

Different subjects hash to different shards for load distribution.
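To build intuition for how a subject can map to a shard, here is an illustrative sketch. It is not StemeDB's routing function, which this guide does not specify: the sketch assumes SHA-256 and equal-width ranges over a 64-bit hash space, so its shard numbers will not necessarily match the `/v1/route` responses above.

```python
import hashlib


def shard_for(subject, num_shards=4):
    """Map a subject to a shard index in [0, num_shards).

    Illustrative only: hash the subject, take the first 8 bytes as a point in
    [0, 2**64), and pick the equal-width range that the point falls into.
    """
    digest = hashlib.sha256(subject.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big")
    return point * num_shards // 2**64
```

The property that matters is determinism: the same subject always lands on the same shard, while distinct subjects spread across the available ranges.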

Inspect a Shard

curl http://localhost:4000/v1/shards/0

Response shows shard metadata:

{
  "shard_id": 0,
  "replicas": ["abc12345"],
  "size_bytes": 0,
  "assertion_count": 0,
  "generation": 1
}

Note: The cluster node demonstrates routing topology. Full assertion storage requires running stemedb-api nodes as backends (integration in progress).

What's Next?

| Goal | Resource |
|------|----------|
| Understand the vision | vision.md |
| See real use cases | use-cases/README.md |
| Use the Go SDK | sdk/go/steme/README.md |
| Build AI agents | sdk/go/adk/README.md |
| Understand architecture | architecture.md |
| API reference | crates/stemedb-api/README.md |
| Distributed architecture | docs/research/distributed-write-path.md |

Common Issues

Build fails

rustup update stable
cargo clean
cargo build --workspace

Server won't start (port in use)

# Use a different port
STEMEDB_BIND_ADDR=127.0.0.1:3001 cargo run --package stemedb-api

Validation script fails

Check the server log in the temp directory:

cat tmp/validate-*/server.log

Query returns empty results

The ingestion worker runs asynchronously, so a write may not be visible to queries immediately. If you're writing directly to the WAL (not via the API), wait ~500ms before querying.
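Rather than sleeping for a fixed interval, a script can retry the query a few times. A minimal sketch, where `fetch` is any zero-argument callable you supply that performs the query and returns its parsed result; the attempt count and delay are illustrative defaults.

```python
import time


def query_until_nonempty(fetch, attempts=5, delay=0.5, sleep=time.sleep):
    """Call `fetch()` until it returns a non-empty result or attempts run out.

    Returns the last result, which may still be empty if ingestion never caught up.
    """
    result = None
    for attempt in range(attempts):
        result = fetch()
        if result:
            return result
        if attempt < attempts - 1:
            # Give the asynchronous ingestion worker time to catch up.
            sleep(delay)
    return result
```

An empty result after all attempts suggests a real problem (wrong subject or predicate, or a failed write) rather than ingestion lag.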

Environment Variables

Single-Node API (stemedb-api)

| Variable | Default | Description |
|----------|---------|-------------|
| STEMEDB_BIND_ADDR | 127.0.0.1:3000 | HTTP server address |
| STEMEDB_WAL_DIR | data/wal | Write-ahead log directory |
| STEMEDB_DB_DIR | data/db | KV store directory |
| STEMEDB_METER_ENABLED | true | Enable economic throttling |

Cluster Node (stemedb-node)

| Variable | Default | Description |
|----------|---------|-------------|
| STEMEDB_NODE_API_ADDR | 127.0.0.1:4000 | Gateway HTTP address |
| STEMEDB_NODE_RPC_ADDR | 127.0.0.1:9090 | gRPC sync address |
| STEMEDB_SEED_NODES | (empty) | Comma-separated seed node RPC addresses |
| STEMEDB_NUM_SHARDS | 4 | Number of shards (power of 2 recommended) |
| STEMEDB_REPLICATION_FACTOR | 1 | Replica count per shard |
| STEMEDB_DATACENTER | (empty) | Datacenter/region label |
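For launcher or test scripts that need to know which addresses a node will use, the cluster-node variables and defaults listed above can be mirrored in code. This sketch reflects only the documented names and defaults; `NodeConfig` and `load_node_config` are hypothetical, and the parsing rules (such as trimming whitespace around seed addresses) are assumptions rather than the node's actual behaviour.

```python
import os
from dataclasses import dataclass
from typing import List


@dataclass
class NodeConfig:
    api_addr: str
    rpc_addr: str
    seed_nodes: List[str]
    num_shards: int
    replication_factor: int
    datacenter: str


def load_node_config(env=None):
    """Read cluster-node settings from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Split the comma-separated seed list and drop empty entries.
    seeds = [s.strip() for s in env.get("STEMEDB_SEED_NODES", "").split(",") if s.strip()]
    return NodeConfig(
        api_addr=env.get("STEMEDB_NODE_API_ADDR", "127.0.0.1:4000"),
        rpc_addr=env.get("STEMEDB_NODE_RPC_ADDR", "127.0.0.1:9090"),
        seed_nodes=seeds,
        num_shards=int(env.get("STEMEDB_NUM_SHARDS", "4")),
        replication_factor=int(env.get("STEMEDB_REPLICATION_FACTOR", "1")),
        datacenter=env.get("STEMEDB_DATACENTER", ""),
    )
```

Passing a plain dict as `env` makes it easy to check a planned multi-node layout without touching the real environment.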