# Quick Start

Get StemeDB running and validated in under 5 minutes.

## Prerequisites

- Rust 1.75+ (`rustup update stable`)
- curl (for validation)

## 1. Validate It Works

```bash
# Clone and enter
git clone <repo-url>
cd stemedb

# Run end-to-end validation (builds, starts server, asserts, queries, shuts down)
make validate
```

Expected output:
```
==========================================
  StemeDB Validation
==========================================

[PASS] Build complete
[PASS] Server is healthy
[PASS] Health check passed
[PASS] Assertion created: abc123...
[PASS] Query returned correct data
[PASS] Lens query (Recency) works

==========================================
  All validation checks passed!
==========================================
```

If you see "All validation checks passed!", StemeDB is working correctly.

## 2. Start the Server

```bash
cargo run --package stemedb-api
```

The server starts on `http://localhost:3000`.

## 3. Explore the API

Open the Swagger UI for interactive documentation:

```
http://localhost:3000/swagger-ui
```

Or check health via curl:

```bash
curl http://localhost:3000/v1/health
# {"status":"healthy","version":"0.1.0","assertions_count":0}
```
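For scripts that need to confirm the server is up before sending requests, the health response can be checked programmatically. The following is a minimal Python sketch, assuming the response shape shown above. The helper names `is_healthy` and `fetch_health` are illustrative and not part of StemeDB.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if a /v1/health response body reports status "healthy"."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def fetch_health(base_url: str = "http://localhost:3000", timeout: float = 2.0) -> bool:
    """Call the health endpoint; treat any connection error as unhealthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/health", timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8"))
    except OSError:  # urllib.error.URLError is a subclass of OSError
        return False


# Check the sample body from this guide (no server required).
sample = '{"status":"healthy","version":"0.1.0","assertions_count":0}'
print(is_healthy(sample))  # True
```

Call `fetch_health()` once the server from step 2 is running; it returns `False` rather than raising if nothing is listening.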

## 4. Create Your First Assertion

Using the Go SDK (recommended):

```bash
cd sdk/go/examples/basic
go run main.go
```

Or via curl (requires generating Ed25519 signatures):

```bash
# Generate a signed assertion
cargo run --package stemedb-api --example gen_test_assertion > /tmp/assertion.json

# Submit it
curl -X POST http://localhost:3000/v1/assert \
  -H "Content-Type: application/json" \
  -d @/tmp/assertion.json
```
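The same submission can be made from Python with only the standard library. This is a sketch under the assumption that the payload is already signed, for example the `/tmp/assertion.json` file produced by the generator above. The function name `build_assert_request` is hypothetical.

```python
import urllib.request


def build_assert_request(
    payload: bytes, base_url: str = "http://localhost:3000"
) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/assert with a JSON body.

    `payload` must be an already-signed assertion; this sketch does not
    create Ed25519 signatures.
    """
    return urllib.request.Request(
        f"{base_url}/v1/assert",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder bytes stand in for the signed file contents here.
req = build_assert_request(b"{}")
print(req.get_method(), req.full_url)
```

To send it, read the signed file as bytes, build the request, and pass it to `urllib.request.urlopen(req)` while the server is running.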

## 5. Query It Back

```bash
# Query by subject and predicate
curl "http://localhost:3000/v1/query?subject=StemeDB_Validation&predicate=test_status"

# Query with a lens (conflict resolution)
curl "http://localhost:3000/v1/query?subject=StemeDB_Validation&predicate=test_status&lens=Recency"
```
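When subjects or predicates contain spaces or other reserved characters, hand-written query strings break easily. The sketch below builds the same URLs with proper encoding. It assumes only the `subject`, `predicate`, and `lens` parameters shown above; `build_query_url` is an illustrative name.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:3000"  # default address used in this guide


def build_query_url(subject: str, predicate: str, lens: Optional[str] = None) -> str:
    """Build a /v1/query URL, percent-encoding each parameter value."""
    params = {"subject": subject, "predicate": predicate}
    if lens is not None:
        params["lens"] = lens  # only added when a lens is requested
    return f"{BASE_URL}/v1/query?{urlencode(params)}"


print(build_query_url("StemeDB_Validation", "test_status", lens="Recency"))
```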

## 6. See Conflict in Action (The "Git for Truth" Moment)

Episteme stores **Claims, not Facts**. When multiple agents assert conflicting values,
the Skeptic endpoint shows you all competing claims instead of picking a winner.

### Create Conflicting Assertions

Using the Go SDK, create assertions with different claims about the same subject:

```bash
cd sdk/go/examples/conflict
go run main.go
```

### Query with Skeptic

The Skeptic endpoint reveals disagreement instead of hiding it:

```bash
curl "http://localhost:3000/v1/skeptic?subject=GLP1_Agonists&predicate=cardiovascular_benefit"
```

Response shows all competing claims:
```json
{
  "status": "Contested",
  "conflict_score": 0.72,
  "claims": [
    {"value": {"type": "Boolean", "value": true}, "weight_share": 0.48, "assertion_count": 1},
    {"value": {"type": "Boolean", "value": false}, "weight_share": 0.52, "assertion_count": 1}
  ],
  "candidates_count": 2
}
```

**Key insight:** Instead of silently picking a winner, you see the disagreement. This matters most in domains such as health and finance, where hiding conflict is dangerous.
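A client will usually want to reduce the Skeptic response to a short summary before acting on it. The following sketch parses the sample response above and reports the leading claim and its margin over the runner-up. The field names come from the sample; the function name and the idea of a "margin" are assumptions for illustration, not StemeDB concepts.

```python
import json

SAMPLE = """
{
  "status": "Contested",
  "conflict_score": 0.72,
  "claims": [
    {"value": {"type": "Boolean", "value": true}, "weight_share": 0.48, "assertion_count": 1},
    {"value": {"type": "Boolean", "value": false}, "weight_share": 0.52, "assertion_count": 1}
  ],
  "candidates_count": 2
}
"""


def summarize_skeptic(body: str) -> dict:
    """Return the status, the heaviest claim, and its lead over the next claim."""
    data = json.loads(body)
    if not data["claims"]:
        raise ValueError("response contains no claims")
    # Sort claims so the largest weight_share comes first.
    claims = sorted(data["claims"], key=lambda c: c["weight_share"], reverse=True)
    runner_up = claims[1]["weight_share"] if len(claims) > 1 else 0.0
    return {
        "status": data["status"],
        "conflict_score": data["conflict_score"],
        "leading_value": claims[0]["value"]["value"],
        "margin": round(claims[0]["weight_share"] - runner_up, 4),
    }


print(summarize_skeptic(SAMPLE))
```

A small margin combined with a high `conflict_score`, as in the sample, is a signal to surface both claims to the user rather than report only the leading one.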

## 7. Authority Tiers (Source-Class Resolution)

Different sources have different authority. A regulatory filing (FDA) outweighs
an anecdotal tweet. The Layered endpoint shows per-tier consensus.

### Query with Layered Consensus

The conflict example creates assertions with different `source_class` values (Clinical vs Anecdotal).
The Layered endpoint shows how each tier resolves independently:

```bash
curl "http://localhost:3000/v1/layered?subject=GLP1_Agonists&predicate=cardiovascular_benefit"
```

Response shows tier-by-tier resolution:
```json
{
  "tiers": [
    {"tier": 1, "source_class": "Clinical", "winner": {"object": {"type": "Boolean", "value": true}}, "conflict_score": 0.0},
    {"tier": 5, "source_class": "Anecdotal", "winner": {"object": {"type": "Boolean", "value": false}}, "conflict_score": 0.0}
  ],
  "overall_winner": {"object": {"type": "Boolean", "value": true}},
  "overall_conflict_score": 0.85
}
```

**Key insight:** The Clinical tier (peer-reviewed research) wins even though the Anecdotal tier (social media) disagrees. Each tier is internally consistent (`conflict_score` of 0.0), while the high `overall_conflict_score` tells you the tiers disagree with each other.
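The tier-by-tier structure can be checked in a few lines of client code. This sketch parses the sample Layered response and assumes, based only on that sample, that a lower `tier` number means higher authority. `summarize_layered` is an illustrative name.

```python
import json

SAMPLE = """
{
  "tiers": [
    {"tier": 1, "source_class": "Clinical", "winner": {"object": {"type": "Boolean", "value": true}}, "conflict_score": 0.0},
    {"tier": 5, "source_class": "Anecdotal", "winner": {"object": {"type": "Boolean", "value": false}}, "conflict_score": 0.0}
  ],
  "overall_winner": {"object": {"type": "Boolean", "value": true}},
  "overall_conflict_score": 0.85
}
"""


def summarize_layered(body: str) -> dict:
    """Report the top tier, whether it matches the overall winner, and tier disagreement."""
    data = json.loads(body)
    # Assumption: the lowest tier number is the most authoritative.
    tiers = sorted(data["tiers"], key=lambda t: t["tier"])
    top = tiers[0]
    # Serialize winners so values of any JSON type can be compared in a set.
    winners = {json.dumps(t["winner"]["object"], sort_keys=True) for t in tiers}
    return {
        "top_source_class": top["source_class"],
        "top_matches_overall": top["winner"]["object"] == data["overall_winner"]["object"],
        "tiers_disagree": len(winners) > 1,
    }


print(summarize_layered(SAMPLE))
```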

## 8. Distributed Mode (Cluster Node)

StemeDB supports horizontal scaling across multiple nodes. Each node runs SWIM membership for discovery, range sharding for data distribution, and a Gateway for request routing.

### Start a Cluster Node

```bash
cargo run --package stemedb-cluster --bin stemedb-node
```

The node starts on `http://localhost:4000` (Gateway API) and `127.0.0.1:9090` (RPC).

### Check Cluster Health

```bash
curl http://localhost:4000/v1/health
# {"healthy":true,"reachable_nodes":0,"joined":true}
```

### See Cluster Topology

```bash
curl http://localhost:4000/v1/cluster/status
```

Response shows shards and nodes:
```json
{
  "node_count": 0,
  "shard_count": 4,
  "meta_version": 1,
  "nodes": []
}
```

### Test Subject Routing

See which shard a subject maps to:

```bash
curl "http://localhost:4000/v1/route?subject=Tesla_Inc"
# {"subject":"Tesla_Inc","shard_id":0,"replicas":["abc12345"]}

curl "http://localhost:4000/v1/route?subject=Bitcoin"
# {"subject":"Bitcoin","shard_id":3,"replicas":["abc12345"]}
```

Different subjects hash to different shards for load distribution.
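To make the idea concrete, here is a generic sketch of hash-based shard assignment. It is not StemeDB's routing function: the choice of SHA-256 and the modulo step are assumptions for illustration, so the shard IDs it produces will not match the `/v1/route` output above.

```python
import hashlib


def shard_for(subject: str, num_shards: int = 4) -> int:
    """Map a subject to a shard ID in [0, num_shards) using a stable hash.

    Illustrative only: StemeDB's real hash and range layout may differ.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(subject.encode("utf-8")).digest()
    key = int.from_bytes(digest[:8], "big")  # first 64 bits of the digest
    return key % num_shards


for name in ("Tesla_Inc", "Bitcoin"):
    print(name, shard_for(name))
```

The property that matters is determinism: the same subject always lands on the same shard, so any gateway can compute the route without coordination.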

### Inspect a Shard

```bash
curl http://localhost:4000/v1/shards/0
```

Response shows shard metadata:
```json
{
  "shard_id": 0,
  "replicas": ["abc12345"],
  "size_bytes": 0,
  "assertion_count": 0,
  "generation": 1
}
```

**Note:** The cluster node demonstrates routing topology. Full assertion storage requires running `stemedb-api` nodes as backends (integration in progress).

## What's Next?

| Goal | Resource |
|------|----------|
| Understand the vision | [vision.md](./vision.md) |
| See real use cases | [use-cases/README.md](./use-cases/README.md) |
| Use the Go SDK | [sdk/go/steme/README.md](./sdk/go/steme/README.md) |
| Build AI agents | [sdk/go/adk/README.md](./sdk/go/adk/README.md) |
| Understand architecture | [architecture.md](./architecture.md) |
| API reference | [crates/stemedb-api/README.md](./crates/stemedb-api/README.md) |
| Distributed architecture | [docs/research/distributed-write-path.md](./docs/research/distributed-write-path.md) |

## Common Issues

### Build fails

```bash
rustup update stable
cargo clean
cargo build --workspace
```

### Server won't start (port in use)

```bash
# Use a different port
STEMEDB_BIND_ADDR=127.0.0.1:3001 cargo run --package stemedb-api
```

### Validation script fails

Check the server log in the temp directory:
```bash
cat tmp/validate-*/server.log
```

### Query returns empty results

The ingestion worker runs asynchronously. If you're writing directly to the WAL (not via API), wait ~500ms before querying.
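Rather than a fixed sleep, a client can retry the query a few times with a short delay. The helper below is a generic sketch; `poll_until` and its parameters are hypothetical, and the 0.5 second default simply mirrors the ~500ms guidance above.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], Optional[T]],
    attempts: int = 5,
    delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call `fetch` until it returns a non-empty result or attempts run out."""
    for attempt in range(attempts):
        result = fetch()
        if result:  # empty lists, None, and empty dicts count as "not ready"
            return result
        if attempt < attempts - 1:
            sleep(delay_s)  # no sleep after the final attempt
    return None


# Demo with a fake query that returns data on the third call.
responses = iter([[], [], ["assertion"]])
print(poll_until(lambda: next(responses), sleep=lambda s: None))  # ['assertion']
```

In real use, `fetch` would wrap the HTTP query from step 5 and return the parsed results.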

## Environment Variables

### Single-Node API (`stemedb-api`)

| Variable | Default | Description |
|----------|---------|-------------|
| `STEMEDB_BIND_ADDR` | `127.0.0.1:3000` | HTTP server address |
| `STEMEDB_WAL_DIR` | `data/wal` | Write-ahead log directory |
| `STEMEDB_DB_DIR` | `data/db` | KV store directory |
| `STEMEDB_METER_ENABLED` | `true` | Enable economic throttling |

### Cluster Node (`stemedb-node`)

| Variable | Default | Description |
|----------|---------|-------------|
| `STEMEDB_NODE_API_ADDR` | `127.0.0.1:4000` | Gateway HTTP address |
| `STEMEDB_NODE_RPC_ADDR` | `127.0.0.1:9090` | gRPC sync address |
| `STEMEDB_SEED_NODES` | (empty) | Comma-separated seed node RPC addresses |
| `STEMEDB_NUM_SHARDS` | `4` | Number of shards (power of 2 recommended) |
| `STEMEDB_REPLICATION_FACTOR` | `1` | Replica count per shard |
| `STEMEDB_DATACENTER` | (empty) | Datacenter/region label |
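A launcher or test script may want to print the effective cluster-node settings before starting the process. The sketch below resolves the variables from the table above against their documented defaults. It is a client-side convenience, not StemeDB's own configuration parser, and the output key names are illustrative.

```python
import os
from typing import Mapping

# Defaults copied from the cluster node table above.
NODE_DEFAULTS = {
    "STEMEDB_NODE_API_ADDR": "127.0.0.1:4000",
    "STEMEDB_NODE_RPC_ADDR": "127.0.0.1:9090",
    "STEMEDB_SEED_NODES": "",
    "STEMEDB_NUM_SHARDS": "4",
    "STEMEDB_REPLICATION_FACTOR": "1",
    "STEMEDB_DATACENTER": "",
}


def effective_node_config(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve cluster node settings, falling back to documented defaults."""
    raw = {name: env.get(name, default) for name, default in NODE_DEFAULTS.items()}
    num_shards = int(raw["STEMEDB_NUM_SHARDS"])
    return {
        "api_addr": raw["STEMEDB_NODE_API_ADDR"],
        "rpc_addr": raw["STEMEDB_NODE_RPC_ADDR"],
        # Split the comma-separated list and drop empty entries.
        "seed_nodes": [s.strip() for s in raw["STEMEDB_SEED_NODES"].split(",") if s.strip()],
        "num_shards": num_shards,
        "replication_factor": int(raw["STEMEDB_REPLICATION_FACTOR"]),
        "datacenter": raw["STEMEDB_DATACENTER"],
        # The table recommends a power of two for the shard count.
        "shards_power_of_two": num_shards > 0 and (num_shards & (num_shards - 1)) == 0,
    }


print(effective_node_config({}))
```

Passing an explicit mapping, as in the last line, keeps the function easy to test without touching the real environment.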