feat: wire auth bootstrap, cluster gateway, k8s deploy skill, and ops docs
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Wire auth bootstrap (root API key, startup guard, auth-first router) in main.rs
- Add cluster gateway handlers with proper error handling
- Update Dockerfile with optimized multi-stage build and .dockerignore
- Add orchard9-deploy skill for CI/CD pipeline (Gitea/Woodpecker/Kaniko/Zot)
- Add k8s deployment roadmap and provision-project-keys script
- Document production infrastructure in CLAUDE.md
- Update three-node-cluster reference architecture
- Trim hosted.rs doc comments to stay under 800-line limit
parent 7895a68433
commit 1e5ba8b946
@ -3,7 +3,3 @@
|
||||
# Deny warnings in release builds
|
||||
[target.'cfg(all())']
|
||||
rustflags = ["-D", "warnings"]
|
||||
|
||||
# Speed up builds with parallel linking
|
||||
[build]
|
||||
jobs = 8
|
||||
|
||||
423
.claude/skills/orchard9-deploy/SKILL.md
Normal file
@ -0,0 +1,423 @@
|
||||
# Orchard9 Deploy
|
||||
|
||||
---
|
||||
name: orchard9-deploy
|
||||
description: Deploy services through the orchard9 CI/CD pipeline (Gitea + Woodpecker CI + Kaniko + Zot Registry + k3s). Handles pushing code, triggering builds, monitoring pipelines, and verifying deployments.
|
||||
---
|
||||
|
||||
You are an orchard9 deployment operator who executes deployments through the on-prem CI/CD pipeline. You push code to Gitea, trigger and monitor Woodpecker CI builds, verify images land in the Zot registry, and confirm pods are running on the k3s cluster.
|
||||
|
||||
## Environment Variables
|
||||
|
||||
These env vars provide API access to the deployment infrastructure:
|
||||
|
||||
| Variable | Purpose |
|----------|---------|
| `THREE_SIX_GITEA` | Gitea admin API token for `git.threesix.ai` |
| `THREE_SIX_WOODPECKER` | Woodpecker CI API token for `ci.threesix.ai` |
| `THREESIX_CLOUDFLARE_API_TOKEN` | Cloudflare API token for `threesix.ai` DNS |
| `THREESIX_CLOUDFLARE_ZONE_ID` | Cloudflare zone ID for `threesix.ai` |
|
||||
|
||||
Verify they exist before any operation:
|
||||
|
||||
```bash
|
||||
[[ -z "$THREE_SIX_GITEA" ]] && echo "MISSING: THREE_SIX_GITEA" && exit 1
|
||||
[[ -z "$THREE_SIX_WOODPECKER" ]] && echo "MISSING: THREE_SIX_WOODPECKER" && exit 1
|
||||
```
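
The table lists four variables, while the check above covers only the two CI tokens. A minimal sketch of a check that loops over any number of variable names follows; the `require_env` helper is hypothetical and not part of the skill, and it is written in POSIX sh so it also runs outside bash.

```bash
# require_env NAME... : report every unset or empty variable; return 1 if any is missing.
# Hypothetical helper, shown only to illustrate the pre-flight check.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup: read the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "MISSING: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: stop before any operation if a token is absent.
# require_env THREE_SIX_GITEA THREE_SIX_WOODPECKER \
#   THREESIX_CLOUDFLARE_API_TOKEN THREESIX_CLOUDFLARE_ZONE_ID || exit 1
```

Reporting all missing names in one pass, rather than exiting on the first, saves a round trip when several variables are unset.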
|
||||
|
||||
## Service Endpoints
|
||||
|
||||
| Service | Internal (cluster) | External |
|---------|--------------------|----------|
| Gitea | `gitea.threesix.svc.cluster.local:3000` | `https://git.threesix.ai` |
| Woodpecker | `woodpecker-server.threesix.svc.cluster.local:8000` | `https://ci.threesix.ai` |
| Zot Registry | `zot.threesix.svc.cluster.local:5000` | `https://registry.threesix.ai` |
| Traefik LB | — | `208.122.204.172` |
|
||||
|
||||
## Cluster Access
|
||||
|
||||
```bash
|
||||
# ALWAYS set before ANY kubectl command
|
||||
export KUBECONFIG=~/.kube/orchard9-k3sf.yaml
|
||||
```
|
||||
|
||||
Nodes are amd64 (Rocky Linux). Local Mac is arm64. NEVER build Docker images locally.
|
||||
|
||||
## Principles
|
||||
|
||||
### 1. Push, Don't Build
|
||||
Deployments happen by pushing code to Gitea. Kaniko builds natively on the cluster's amd64 nodes. Local Docker builds under QEMU are 100x slower and produce wrong-architecture images.
|
||||
|
||||
### 2. API-First Operations
|
||||
Use Gitea and Woodpecker REST APIs for all operations. The env var tokens provide full access. Do not ask the user to open web UIs.
|
||||
|
||||
### 3. Verify Every Step
|
||||
After each pipeline stage, verify the output before proceeding. Check Woodpecker build status, check Zot for the image, check k8s for the running pod.
|
||||
|
||||
### 4. Commit SHA Tags
|
||||
Tag images with 8-char commit SHA (`${CI_COMMIT_SHA:0:8}`) plus `latest`. Never rely on `latest` alone for production deployments.
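
To illustrate the tagging rule, the sketch below derives both tags from a full commit SHA in plain shell. The `image_tags` function and the sample SHA are made up for the example; inside the pipeline the `${CI_COMMIT_SHA:0:8}` expression is expected to be expanded by the CI system rather than by a helper like this.

```bash
# image_tags FULL_SHA : print the two tags for a build, one per line.
# Hypothetical helper mirroring the "8-char SHA plus latest" convention.
image_tags() {
  # %.8s truncates the argument to its first 8 characters
  printf '%.8s\n' "$1"
  printf 'latest\n'
}

# Sample (fake) 40-character SHA
image_tags 0123456789abcdef0123456789abcdef01234567
# prints:
# 01234567
# latest
```

The SHA tag is the one a Deployment should pin, since `latest` moves with every build.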
|
||||
|
||||
### 5. Namespace Discipline
|
||||
Each service has its own namespace. Set `KUBECONFIG` before every kubectl call. Never assume the default context is correct.
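
One way to make this hard to forget is a small wrapper that always passes the kubeconfig explicitly. This is a sketch only: the `k3sf` function name and the `KUBECTL` and `K3SF_KUBECONFIG` override variables are assumptions introduced here, not part of the skill.

```bash
# k3sf ARGS... : run kubectl against the orchard9 k3s cluster, never the default context.
# KUBECTL and K3SF_KUBECONFIG are hypothetical overrides, useful for dry runs and tests.
k3sf() {
  "${KUBECTL:-kubectl}" \
    --kubeconfig "${K3SF_KUBECONFIG:-$HOME/.kube/orchard9-k3sf.yaml}" \
    "$@"
}

# Usage (namespace is still given on every call):
# k3sf get pods -n <NAMESPACE> -l app=<APP>
```

Setting `KUBECTL=echo` prints the command that would run, which is a cheap way to review it before touching the cluster.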
|
||||
|
||||
## Protocol: Deploy a Service
|
||||
|
||||
### Phase 1: Pre-Flight
|
||||
|
||||
1. Verify env vars exist
|
||||
2. Verify kubeconfig works:
|
||||
```bash
|
||||
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml get nodes
|
||||
```
|
||||
3. Check Gitea is reachable:
|
||||
```bash
|
||||
curl -sf -H "Authorization: token ${THREE_SIX_GITEA}" \
|
||||
"https://git.threesix.ai/api/v1/user" | jq '.login'
|
||||
```
|
||||
4. Check Woodpecker is reachable:
|
||||
```bash
|
||||
curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
|
||||
"https://ci.threesix.ai/api/user" | jq '.login'
|
||||
```
|
||||
|
||||
### Phase 2: Gitea Repository Setup
|
||||
|
||||
**Create repo (if new):**
|
||||
```bash
|
||||
curl -X POST "https://git.threesix.ai/api/v1/user/repos" \
|
||||
-H "Authorization: token ${THREE_SIX_GITEA}" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"name":"<REPO>","private":false,"auto_init":false}'
|
||||
```
|
||||
|
||||
**List existing repos:**
|
||||
```bash
|
||||
curl -sf -H "Authorization: token ${THREE_SIX_GITEA}" \
|
||||
"https://git.threesix.ai/api/v1/user/repos?limit=50" | jq '.[].full_name'
|
||||
```
|
||||
|
||||
**Add or update git remote:**
|
||||
```bash
|
||||
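# Update the gitea remote if it already exists; otherwise add it.
# An explicit if/else avoids the `a && b || c` pitfall, where a failed
# set-url would fall through to `git remote add`.
if git remote get-url gitea >/dev/null 2>&1; then
  git remote set-url gitea "https://jordan:${THREE_SIX_GITEA}@git.threesix.ai/jordan/<REPO>.git"
else
  git remote add gitea "https://jordan:${THREE_SIX_GITEA}@git.threesix.ai/jordan/<REPO>.git"
fi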
|
||||
```
|
||||
|
||||
**Push code to Gitea:**
|
||||
```bash
|
||||
git push gitea main
|
||||
```
|
||||
|
||||
### Phase 3: Woodpecker CI Activation
|
||||
|
||||
**List repos Woodpecker knows about:**
|
||||
```bash
|
||||
curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
|
||||
"https://ci.threesix.ai/api/repos?all=true" | jq '.[].full_name'
|
||||
```
|
||||
|
||||
**Activate repo in Woodpecker (creates webhook on Gitea):**
|
||||
```bash
|
||||
# First, find the Gitea repo ID
|
||||
FORGE_ID=$(curl -sf -H "Authorization: token ${THREE_SIX_GITEA}" \
|
||||
"https://git.threesix.ai/api/v1/repos/jordan/<REPO>" | jq '.id')
|
||||
|
||||
curl -X POST "https://ci.threesix.ai/api/repos" \
|
||||
-H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d "{\"forge_remote_id\":\"${FORGE_ID}\"}"
|
||||
```
|
||||
|
||||
**Trigger a build manually via API:**
|
||||
```bash
|
||||
curl -X POST "https://ci.threesix.ai/api/repos/jordan/<REPO>/pipelines" \
|
||||
-H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"branch":"main"}'
|
||||
```
|
||||
|
||||
### Phase 4: Monitor Build
|
||||
|
||||
**List recent pipelines:**
|
||||
```bash
|
||||
curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
|
||||
"https://ci.threesix.ai/api/repos/jordan/<REPO>/pipelines?page=1&per_page=5" | \
|
||||
jq '.[] | {number, status, event, branch, created_at}'
|
||||
```
|
||||
|
||||
**Get pipeline status:**
|
||||
```bash
|
||||
curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
|
||||
"https://ci.threesix.ai/api/repos/jordan/<REPO>/pipelines/<NUMBER>" | \
|
||||
jq '{number, status, started_at, finished_at, workflows: [.workflows[]? | {name, state, children: [.children[]? | {name, state}]}]}'
|
||||
```
|
||||
|
||||
**Get step logs:**
|
||||
```bash
|
||||
curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
|
||||
"https://ci.threesix.ai/api/repos/jordan/<REPO>/logs/<PIPELINE>/<STEP>" | \
|
||||
jq -r '.[].data'
|
||||
```
|
||||
|
||||
**Poll until complete (use sparingly):**
|
||||
```bash
|
||||
while true; do
  STATUS=$(curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
    "https://ci.threesix.ai/api/repos/jordan/<REPO>/pipelines/<NUMBER>" | jq -r '.status')
  echo "Pipeline status: $STATUS"
  # Stop on any terminal state; a killed or declined pipeline never reaches success
  case "$STATUS" in
    success|failure|error|killed|declined) break ;;
  esac
  sleep 30
done
|
||||
```
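
A polling loop with no upper bound can spin forever if the API stops answering. The sketch below shows one way to bound it. The `poll_until_done` helper, its argument order, and its return codes are assumptions for illustration; the status fetcher is passed in as a command name, so the loop can be exercised without a live Woodpecker server, and the list of terminal statuses is likewise an assumption.

```bash
# poll_until_done FETCH_CMD MAX_ATTEMPTS INTERVAL_SECONDS
# Runs FETCH_CMD (which must print a pipeline status) until a terminal state appears.
# Returns 0 on success, 1 on a failed terminal state, 2 when attempts run out.
poll_until_done() {
  fetch_cmd=$1
  max_attempts=$2
  interval=$3
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # A failed fetch is treated as "no status yet" and retried
    status=$($fetch_cmd) || status=""
    case "$status" in
      success) echo "$status"; return 0 ;;
      failure|error|killed|declined) echo "$status"; return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "timeout"
  return 2
}

# Usage with a fetcher built from the curl call above (hypothetical name):
# fetch_status() { curl -sf ... | jq -r '.status'; }
# poll_until_done fetch_status 40 30   # at most 40 checks, 30 s apart
```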
|
||||
|
||||
### Phase 5: Verify Image in Registry
|
||||
|
||||
```bash
|
||||
# List repos in Zot
|
||||
curl -sf "https://registry.threesix.ai/v2/_catalog" | jq '.repositories'
|
||||
|
||||
# List tags for an image
|
||||
curl -sf "https://registry.threesix.ai/v2/<REPO>/tags/list" | jq '.tags'
|
||||
```
|
||||
|
||||
### Phase 6: Verify Deployment
|
||||
|
||||
```bash
|
||||
export KUBECONFIG=~/.kube/orchard9-k3sf.yaml
|
||||
|
||||
# Check pod status
|
||||
kubectl get pods -n <NAMESPACE> -l app=<APP>
|
||||
|
||||
# Check deployment rollout
|
||||
kubectl rollout status deployment/<APP> -n <NAMESPACE> --timeout=120s
|
||||
|
||||
# Check logs
|
||||
kubectl logs -n <NAMESPACE> -l app=<APP> --tail=50
|
||||
|
||||
# Describe pod (for scheduling/pull errors)
|
||||
kubectl describe pod -n <NAMESPACE> -l app=<APP>
|
||||
```
|
||||
|
||||
### Phase 7: Verify External Access (if ingress exists)
|
||||
|
||||
```bash
|
||||
# Health check
|
||||
curl -sf "https://<APP>.threesix.ai/health" || curl -sf "https://<APP>.threesix.ai/v1/health"
|
||||
|
||||
# Check TLS cert
|
||||
echo | openssl s_client -connect <APP>.threesix.ai:443 -servername <APP>.threesix.ai 2>/dev/null | \
|
||||
openssl x509 -noout -dates -subject
|
||||
```
|
||||
|
||||
## .woodpecker.yml Templates
|
||||
|
||||
### Rust Project (cargo-chef multi-stage)
|
||||
|
||||
```yaml
|
||||
when:
  branch: main
  event: push

steps:
  build:
    image: woodpeckerci/plugin-kaniko
    settings:
      registry: registry.threesix.ai
      repo: registry.threesix.ai/<PROJECT>
      tags:
        - latest
        - ${CI_COMMIT_SHA:0:8}
      context: .
      dockerfile: Dockerfile
      cache: true
      cache_repo: registry.threesix.ai/<PROJECT>/cache
      skip_tls_verify: true
      build_args:
        - CARGO_FEATURES=<optional-features>

  deploy:
    image: bitnami/kubectl:latest
    commands:
      - kubectl set image deployment/<APP> <CONTAINER>=registry.threesix.ai/<PROJECT>:${CI_COMMIT_SHA:0:8} -n <NAMESPACE>
      - kubectl rollout status deployment/<APP> -n <NAMESPACE> --timeout=300s
    depends_on: [build]
|
||||
```
|
||||
|
||||
### Go Project
|
||||
|
||||
```yaml
|
||||
when:
  branch: main
  event: push

steps:
  test:
    image: golang:1.25-alpine
    commands:
      - go test ./...

  build:
    image: woodpeckerci/plugin-kaniko
    settings:
      registry: registry.threesix.ai
      repo: registry.threesix.ai/<PROJECT>
      tags:
        - latest
        - ${CI_COMMIT_SHA:0:8}
      context: .
      dockerfile: Dockerfile
      cache: true
      skip_tls_verify: true
    depends_on: [test]

  deploy:
    image: bitnami/kubectl:latest
    commands:
      - kubectl set image deployment/<APP> <CONTAINER>=registry.threesix.ai/<PROJECT>:${CI_COMMIT_SHA:0:8} -n <NAMESPACE>
      - kubectl rollout status deployment/<APP> -n <NAMESPACE> --timeout=120s
    depends_on: [build]
|
||||
```
|
||||
|
||||
## DNS Management
|
||||
|
||||
**Create A record:**
|
||||
```bash
|
||||
curl -X POST "https://api.cloudflare.com/client/v4/zones/${THREESIX_CLOUDFLARE_ZONE_ID}/dns_records" \
|
||||
-H "Authorization: Bearer ${THREESIX_CLOUDFLARE_API_TOKEN}" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"type":"A","name":"<SUBDOMAIN>","content":"208.122.204.172","ttl":1,"proxied":false}'
|
||||
```
|
||||
|
||||
**List records:**
|
||||
```bash
|
||||
curl -sf "https://api.cloudflare.com/client/v4/zones/${THREESIX_CLOUDFLARE_ZONE_ID}/dns_records" \
|
||||
-H "Authorization: Bearer ${THREESIX_CLOUDFLARE_API_TOKEN}" | \
|
||||
jq '.result[] | {name, type, content}'
|
||||
```
|
||||
|
||||
**Update existing record:**
|
||||
```bash
|
||||
# Get record ID first
|
||||
RECORD_ID=$(curl -sf "https://api.cloudflare.com/client/v4/zones/${THREESIX_CLOUDFLARE_ZONE_ID}/dns_records?name=<SUBDOMAIN>.threesix.ai" \
|
||||
-H "Authorization: Bearer ${THREESIX_CLOUDFLARE_API_TOKEN}" | jq -r '.result[0].id')
|
||||
|
||||
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/${THREESIX_CLOUDFLARE_ZONE_ID}/dns_records/${RECORD_ID}" \
|
||||
-H "Authorization: Bearer ${THREESIX_CLOUDFLARE_API_TOKEN}" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"content":"208.122.204.172"}'
|
||||
```
|
||||
|
||||
## Step Back: Before Deploying
|
||||
|
||||
Before executing a deployment, challenge:
|
||||
|
||||
### 1. Is the Code Ready?
|
||||
> "Has this been tested locally? Does `cargo check` / `go build` pass?"
|
||||
- Pushing broken code wastes CI time (Rust builds take 10-15 min on Kaniko)
|
||||
- Run local checks first, push only compilable code
|
||||
|
||||
### 2. Is This the Right Target?
|
||||
> "Am I deploying to the right namespace, with the right image name?"
|
||||
- Verify the k8s manifest matches the Woodpecker pipeline output
|
||||
- Check the image reference in the Deployment matches what Kaniko pushes
|
||||
|
||||
### 3. Is the Dockerfile Correct?
|
||||
> "Does the Dockerfile produce a working amd64 binary?"
|
||||
- Multi-stage builds must produce a binary that is either statically linked or dynamically linked only against libraries present in the runtime image
|
||||
- Runtime stage must have required system libs (ca-certificates, libssl, etc.)
|
||||
- Rust: use `rust:bookworm` build stage + `debian:bookworm-slim` runtime (not alpine — glibc deps)
|
||||
|
||||
### 4. Will the Deploy Step Have Access?
|
||||
> "Does the Woodpecker agent have RBAC to deploy to the target namespace?"
|
||||
- Default RBAC only covers `threesix` namespace
|
||||
- Other namespaces need explicit RoleBinding for the `woodpecker-agent` ServiceAccount
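
The access question can be checked before the pipeline runs by asking the API server directly. The sketch below only builds and prints the command so it can be reviewed first. The `can_deploy_cmd` helper is hypothetical, and it assumes the ServiceAccount is `woodpecker-agent` in the `threesix` namespace and that `patch` on Deployments is the permission `kubectl set image` needs.

```bash
# can_deploy_cmd NAMESPACE : print a kubectl command that asks whether the
# agent ServiceAccount may patch Deployments in NAMESPACE.
# Assumption: the ServiceAccount is woodpecker-agent in the threesix namespace.
can_deploy_cmd() {
  printf 'kubectl --kubeconfig %s auth can-i patch deployments --as=system:serviceaccount:threesix:woodpecker-agent -n %s\n' \
    "$HOME/.kube/orchard9-k3sf.yaml" "$1"
}

# Print the command for a target namespace; run it by hand once it looks right.
can_deploy_cmd stemedb
```

`kubectl auth can-i` answers `yes` or `no`, so a `no` here points at a missing RoleBinding before any build time is spent.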
|
||||
|
||||
**After step back:** Proceed with deployment if code compiles, targets are correct, and RBAC is in place.
|
||||
|
||||
## Do
|
||||
|
||||
1. Set `KUBECONFIG=~/.kube/orchard9-k3sf.yaml` before every kubectl operation
|
||||
2. Use the Gitea API token from `THREE_SIX_GITEA` env var directly
|
||||
3. Use the Woodpecker API token from `THREE_SIX_WOODPECKER` env var directly
|
||||
4. Verify each phase completes before proceeding to the next
|
||||
5. Use `skip_tls_verify: true` for Kaniko pushing to the internal Zot registry
|
||||
6. Tag images with commit SHA + latest
|
||||
7. Use `git remote add gitea` (not origin) to avoid overwriting GitHub remotes
|
||||
8. Run `cargo check` or `go build` locally before pushing to CI
|
||||
|
||||
## Do Not
|
||||
|
||||
1. Build Docker images locally — QEMU arm64-to-amd64 emulation takes hours
|
||||
2. Use `gcloud` commands — this is k3s on-prem, not GKE
|
||||
3. Assume kubectl context is correct — always set KUBECONFIG explicitly
|
||||
4. Push to GitHub expecting CI to trigger — Woodpecker only watches Gitea
|
||||
5. Hardcode tokens in commands — always reference env vars
|
||||
6. Skip the registry verification step — silent image push failures are common
|
||||
7. Use alpine base images for Rust binaries — glibc linking issues
|
||||
|
||||
## Decision Points
|
||||
|
||||
**Pipeline stuck in "pending"?**
|
||||
Stop. Check: Are Woodpecker agents running?
|
||||
```bash
|
||||
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml get pods -n threesix -l app=woodpecker-agent
|
||||
```
|
||||
|
||||
**Image not appearing in Zot after successful build?**
|
||||
Stop. Check: Did Kaniko push to the right registry path?
|
||||
```bash
|
||||
curl -sf "https://registry.threesix.ai/v2/_catalog" | jq '.repositories'
|
||||
```
|
||||
|
||||
**Pod in ImagePullBackOff?**
|
||||
Stop. Check:
|
||||
- Is the image reference correct? (`registry.threesix.ai/<path>:<tag>`)
|
||||
- Can the node reach the registry? (internal DNS: `zot.threesix.svc.cluster.local:5000`)
|
||||
- Is the image the right architecture? (`docker manifest inspect` or check Kaniko build logs)
|
||||
|
||||
**Deploy step fails with "unauthorized"?**
|
||||
Stop. Check: Woodpecker agent ServiceAccount needs RBAC in the target namespace.
|
||||
```bash
|
||||
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml get rolebinding -n <NAMESPACE> | grep woodpecker
|
||||
```
|
||||
|
||||
## Constraints
|
||||
|
||||
- NEVER build Docker images locally for k3s deployment
|
||||
- NEVER use `gcloud` — this is on-prem k3s, not GKE
|
||||
- NEVER run `kubectl` without `--kubeconfig ~/.kube/orchard9-k3sf.yaml` or `KUBECONFIG` set
|
||||
- NEVER push credentials to git — use env vars for all tokens
|
||||
- ALWAYS verify the image exists in Zot before expecting a pod to start
|
||||
- ALWAYS use `registry.threesix.ai` (external) in Woodpecker pipeline and `zot.threesix.svc.cluster.local:5000` or `registry.threesix.ai` in k8s manifests
|
||||
|
||||
## Recovery
|
||||
|
||||
### Rebuild Without Code Change
|
||||
```bash
|
||||
curl -X POST "https://ci.threesix.ai/api/repos/jordan/<REPO>/pipelines" \
|
||||
-H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"branch":"main"}'
|
||||
```
|
||||
|
||||
### Force Pod Restart
|
||||
```bash
|
||||
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml rollout restart deployment/<APP> -n <NAMESPACE>
|
||||
```
|
||||
|
||||
### Rollback to Previous Image
|
||||
```bash
|
||||
# List available tags
|
||||
curl -sf "https://registry.threesix.ai/v2/<REPO>/tags/list" | jq '.tags'
|
||||
|
||||
# Set specific tag
|
||||
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml set image deployment/<APP> \
|
||||
<CONTAINER>=registry.threesix.ai/<REPO>:<PREVIOUS_SHA> -n <NAMESPACE>
|
||||
```
|
||||
|
||||
### Delete and Reapply (nuclear option — confirm with user first)
|
||||
```bash
|
||||
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml delete deployment/<APP> -n <NAMESPACE>
|
||||
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml apply -f <MANIFEST>
|
||||
```
|
||||
@ -40,6 +40,16 @@ examples/
|
||||
*.log
|
||||
*.tmp
|
||||
.claude/
|
||||
latent/
|
||||
|
||||
# Go SDK — pure Go, not in Rust workspace
|
||||
sdk/
|
||||
|
||||
# Non-Rust applications (only applications/aphoria/ is in the workspace)
|
||||
applications/disputed/
|
||||
applications/stemedb-dashboard/
|
||||
latent/
|
||||
applications/video-renderer/
|
||||
applications/pitch/
|
||||
applications/aphoria-pitch/
|
||||
applications/aphoria-dashboard/
|
||||
applications/findmyhealth/
|
||||
|
||||
67
CLAUDE.md
@ -447,3 +447,70 @@ Python CLI tools for adverse event signal detection. Different rules from Rust c
|
||||
- Use `os.getenv("VAR", "http://localhost:...")` in Python
|
||||
- Use `process.env.VAR || 'http://localhost:...'` in TypeScript
|
||||
- **StemeDB Integration:** New ingestors should use `StemeDBClient` pattern from `adk-agent/`, not write to JSONL files
|
||||
|
||||
## Production Infrastructure
|
||||
|
||||
All production infra is under the `jordan@roamrhino.com` Google account, GCP project `orchard9`.
|
||||
|
||||
### GCP / Google Artifact Registry
|
||||
|
||||
- **Account:** `jordan@roamrhino.com`
|
||||
- **Project:** `orchard9`
|
||||
- **Docker registry:** `us-central1-docker.pkg.dev/orchard9/docker-images/`
|
||||
- **Auth:** `gcloud auth configure-docker us-central1-docker.pkg.dev` (one-time per machine)
|
||||
- **Secret Manager:** all production secrets live here under project `orchard9`
|
||||
- StemeDB root API key secret name: `stemedb-root-api-key`
|
||||
- Per-project keys follow pattern: `stemedb-key-<project-slug>`
|
||||
|
||||
### k3s Cluster
|
||||
|
||||
- **Kubeconfig:** `~/.kube/orchard9-k3sf.yaml` (separate from GKE contexts — use `--kubeconfig` flag)
|
||||
- **Fleet repo:** `/Users/jordanwashburn/Workspace/orchard9/k3s-fleet`
|
||||
- **Nodes:** 3-node cluster (2 servers + 1 agent), architecture: `amd64`
|
||||
- **Docker builds:** Must use `--platform linux/amd64` (Mac is ARM)
|
||||
- **Kustomize base:** `deployments/k8s/base/` — apply with `kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml apply -k deployments/k8s/base/<service>/`
|
||||
- **ClusterSecretStore:** `gcp-secret-manager` (ExternalSecrets Operator, reads from GCP SM above)
|
||||
- **imagePullSecrets:** `gcr-secret` (pre-configured on cluster nodes)
|
||||
- **Storage class:** `longhorn` (Longhorn CSI, RWO volumes)
|
||||
- **Ingress:** Traefik — `ingressClassName: traefik`, entrypoint `websecure`
|
||||
- **TLS:** cert-manager, `ClusterIssuer: letsencrypt-prod`
|
||||
|
||||
### Cloudflare DNS (threesix.ai)
|
||||
|
||||
- **Domain:** `threesix.ai` — all services live at `*.threesix.ai`
|
||||
- **API token env var:** `THREESIX_CLOUDFLARE_API_TOKEN`
|
||||
- **Zone ID env var:** `THREESIX_CLOUDFLARE_ZONE_ID`
|
||||
- **DNS API:** `https://api.cloudflare.com/client/v4/zones/$THREESIX_CLOUDFLARE_ZONE_ID/dns_records`
|
||||
- To add/update a record, POST/PATCH to that endpoint with `Authorization: Bearer $THREESIX_CLOUDFLARE_API_TOKEN`
|
||||
- To find Traefik LB IP: `kubectl get svc -n kube-system` (look for Traefik LoadBalancer EXTERNAL-IP)
|
||||
|
||||
### Service URLs
|
||||
|
||||
| Service | URL |
|---------|-----|
| StemeDB API | `https://stemedb.threesix.ai` |
| StemeDB internal | `http://stemedb-api.stemedb.svc:18180` |
|
||||
|
||||
### Deployment Workflow
|
||||
|
||||
```bash
|
||||
# 1. Build + push image (stemedb repo root)
|
||||
docker build --platform linux/amd64 -t us-central1-docker.pkg.dev/orchard9/docker-images/stemedb-api:latest .
|
||||
docker push us-central1-docker.pkg.dev/orchard9/docker-images/stemedb-api:latest
|
||||
|
||||
# 2. Add/update DNS A record (get Traefik IP first)
|
||||
TRAEFIK_IP=$(kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml get svc -n kube-system traefik -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
|
||||
curl -X POST "https://api.cloudflare.com/client/v4/zones/$THREESIX_CLOUDFLARE_ZONE_ID/dns_records" \
|
||||
-H "Authorization: Bearer $THREESIX_CLOUDFLARE_API_TOKEN" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d "{\"type\":\"A\",\"name\":\"stemedb\",\"content\":\"$TRAEFIK_IP\",\"ttl\":1,\"proxied\":false}"
|
||||
|
||||
# 3. Create GCP secret (first deploy only)
|
||||
echo -n "steme_live_$(openssl rand -hex 24)" | \
|
||||
gcloud secrets create stemedb-root-api-key --project=orchard9 \
|
||||
--replication-policy=automatic --data-file=-
|
||||
|
||||
# 4. Deploy
|
||||
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml apply -k /Users/jordanwashburn/Workspace/orchard9/k3s-fleet/deployments/k8s/base/stemedb/
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml rollout status deployment/stemedb-api -n stemedb --timeout=120s
|
||||
```
|
||||
|
||||
70
Dockerfile
@ -1,53 +1,77 @@
|
||||
# StemeDB API Docker Build
|
||||
#
|
||||
# Multi-stage build for the stemedb-api binary.
|
||||
# Produces a minimal Debian-based image with just the compiled binary.
|
||||
|
||||
# Stage 1: Build the Rust binary
|
||||
# Use latest Rust for compatibility with newer crates
|
||||
FROM rust:bookworm AS builder
|
||||
# Five-stage build using cargo-chef for efficient dependency caching:
|
||||
# chef -> base image with cargo-chef installed
|
||||
# planner -> generate recipe.json (cache key for deps)
|
||||
# cacher -> compile dependencies only (cached until Cargo.lock changes)
|
||||
# builder -> compile service binary using cached deps (FROM cacher)
|
||||
# runtime -> minimal image: stripped binary, non-root user, no dev tools
|
||||
#
|
||||
# Cache behavior:
|
||||
# - Cold build: ~15-20 min (deps + binary)
|
||||
# - Warm build (source-only change): ~2-5 min (deps cached, binary only)
|
||||
# - Dep change: full rebuild of cacher + builder (~15-20 min)
|
||||
|
||||
# Stage 0: Base image with cargo-chef installed
|
||||
# Cached independently — only rebuilds when the chef version pin changes.
|
||||
FROM rust:bookworm AS chef
|
||||
RUN cargo install cargo-chef --locked
|
||||
WORKDIR /app
|
||||
|
||||
# Copy manifests first for better layer caching
|
||||
COPY Cargo.toml Cargo.lock ./
|
||||
# Stage 1: Planner — generate recipe.json from workspace manifests
|
||||
# COPY . . is intentional: cargo chef prepare only reads Cargo.toml files.
|
||||
# BuildKit content-addresses recipe.json, so the cacher layer stays cached
|
||||
# even if this stage rebuilds due to a .rs source change.
|
||||
FROM chef AS planner
|
||||
COPY . .
|
||||
RUN cargo chef prepare --recipe-path recipe.json
|
||||
|
||||
# Copy workspace members
|
||||
COPY crates/ crates/
|
||||
COPY applications/ applications/
|
||||
COPY sdk/ sdk/
|
||||
# Stage 2: Cacher — compile dependencies only
|
||||
# This layer is invalidated only when Cargo.toml or Cargo.lock changes.
|
||||
# protobuf-compiler is required by stemedb-rpc/build.rs (compiles sync.proto).
|
||||
FROM chef AS cacher
|
||||
RUN apt-get update && \
|
||||
apt-get install -y --no-install-recommends protobuf-compiler && \
|
||||
rm -rf /var/lib/apt/lists/*
|
||||
COPY --from=planner /app/recipe.json recipe.json
|
||||
# Proto files must be present for stemedb-rpc/build.rs to run during dep compilation
|
||||
COPY crates/stemedb-rpc/proto/ crates/stemedb-rpc/proto/
|
||||
RUN cargo chef cook --release --recipe-path recipe.json
|
||||
|
||||
# Build release binary (only stemedb-api)
|
||||
# Stage 3: Builder — compile the service binary using cached deps
|
||||
# Inherits compiled deps from cacher; only workspace source is compiled here.
|
||||
FROM cacher AS builder
|
||||
COPY . .
|
||||
RUN cargo build --release -p stemedb-api
|
||||
# Strip debug symbols before copying to runtime image
|
||||
RUN strip target/release/stemedb-api
|
||||
|
||||
# Stage 2: Runtime image
|
||||
FROM debian:bookworm-slim
|
||||
# Stage 4: Runtime — minimal production image
|
||||
FROM debian:bookworm-slim AS runtime
|
||||
|
||||
# Install runtime dependencies
|
||||
RUN apt-get update && \
|
||||
apt-get install -y --no-install-recommends \
|
||||
ca-certificates \
|
||||
curl \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# Copy the binary from builder
|
||||
# Non-root user for security
|
||||
RUN useradd --system --no-create-home --shell /bin/false stemedb
|
||||
|
||||
COPY --from=builder /app/target/release/stemedb-api /usr/local/bin/stemedb-api
|
||||
|
||||
# Create data directories
|
||||
RUN mkdir -p /data/wal /data/db
|
||||
RUN mkdir -p /data/wal /data/db && chown -R stemedb:stemedb /data
|
||||
|
||||
USER stemedb
|
||||
|
||||
# Set environment defaults
|
||||
ENV STEMEDB_WAL_DIR=/data/wal \
|
||||
STEMEDB_DB_DIR=/data/db \
|
||||
STEMEDB_BIND_ADDR=0.0.0.0:18180 \
|
||||
RUST_LOG=stemedb_api=info
|
||||
|
||||
# Expose the API port
|
||||
EXPOSE 18180
|
||||
|
||||
# Health check
|
||||
HEALTHCHECK --interval=5s --timeout=3s --start-period=10s --retries=3 \
|
||||
CMD curl -f http://localhost:18180/v1/health || exit 1
|
||||
|
||||
# Run the API server
|
||||
CMD ["stemedb-api"]
|
||||
|
||||
@ -6,6 +6,7 @@
|
||||
use std::time::Duration;
|
||||
|
||||
use ed25519_dalek::SigningKey;
|
||||
use rand::Rng;
|
||||
use serde::{Deserialize, Serialize};
|
||||
use stemedb_core::types::Assertion;
|
||||
use tracing::{info, instrument, warn};
|
||||
@ -16,105 +17,54 @@ use crate::AphoriaError;
|
||||
|
||||
/// HTTP client for pushing observations to a hosted StemeDB server.
|
||||
pub struct HostedClient {
|
||||
/// Base URL of the server (e.g., "https://episteme.acme.corp").
|
||||
base_url: String,
|
||||
|
||||
/// Project identifier.
|
||||
project_id: String,
|
||||
|
||||
/// Optional team identifier.
|
||||
team_id: Option<String>,
|
||||
|
||||
/// Agent's public key (hex-encoded).
|
||||
agent_id: String,
|
||||
|
||||
/// Optional API key for authentication.
|
||||
api_key: Option<String>,
|
||||
|
||||
/// Maximum retry attempts.
|
||||
max_retries: u32,
|
||||
|
||||
/// Delay between retries in milliseconds.
|
||||
retry_delay_ms: u64,
|
||||
|
||||
/// Behavior when server is unreachable.
|
||||
offline_fallback: OfflineFallback,
|
||||
|
||||
/// Whether to route observations to community endpoint for pattern aggregation.
|
||||
/// When true, observations go to /v1/aphoria/community/observations.
|
||||
/// When false, observations go to /v1/aphoria/observations.
|
||||
community_enabled: bool,
|
||||
}
|
||||
|
||||
/// Request payload for pushing observations (team storage).
|
||||
#[derive(Debug, Clone, Serialize)]
|
||||
pub struct PushObservationsRequest {
|
||||
/// The observations to push.
|
||||
pub observations: Vec<ObservationDto>,
|
||||
|
||||
/// Project identifier.
|
||||
pub project_id: String,
|
||||
|
||||
/// Optional team identifier.
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub team_id: Option<String>,
|
||||
|
||||
/// Client version for debugging.
|
||||
pub client_version: String,
|
||||
}
|
||||
|
||||
/// Request payload for pushing community observations (corpus aggregation).
|
||||
#[derive(Debug, Clone, Serialize)]
|
||||
pub struct PushCommunityObservationsRequest {
|
||||
/// The anonymized observations to share.
|
||||
pub observations: Vec<CommunityObservationDto>,
|
||||
|
||||
/// Hash of the project (for deduplication, NOT the actual project name).
|
||||
/// This is BLAKE3 hash of the project name to prevent name leakage.
|
||||
/// BLAKE3 hash of project name (prevents name leakage).
|
||||
pub project_hash: String,
|
||||
|
||||
/// Client version for debugging.
|
||||
pub client_version: String,
|
||||
}
|
||||
|
||||
/// Community observation response.
|
||||
#[derive(Debug, Clone, Deserialize)]
|
||||
pub struct PushCommunityObservationsResponse {
|
||||
/// Number of observations recorded.
|
||||
pub recorded: usize,
|
||||
|
||||
/// Number of new patterns discovered.
|
||||
pub new_patterns: usize,
|
||||
|
||||
/// Number of existing patterns updated.
|
||||
pub updated_patterns: usize,
|
||||
}
|
||||
|
||||
/// A single observation in the request (team storage).
|
||||
#[derive(Debug, Clone, Serialize)]
|
||||
pub struct ObservationDto {
|
||||
/// The subject (concept path).
|
||||
pub subject: String,
|
||||
|
||||
/// The predicate being claimed.
|
||||
pub predicate: String,
|
||||
|
||||
/// The object value.
|
||||
pub object: ObjectValueDto,
|
||||
|
||||
/// Confidence score (0.0 to 1.0).
|
||||
pub confidence: f32,
|
||||
|
||||
/// Source hash (hex-encoded).
|
||||
pub source_hash: String,
|
||||
|
||||
/// Signatures (hex-encoded).
|
||||
pub signatures: Vec<SignatureDto>,
|
||||
|
||||
/// Timestamp of the observation.
|
||||
pub timestamp: u64,
|
||||
|
||||
/// Source metadata as JSON string.
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub source_metadata: Option<String>,
|
||||
}
|
||||
@ -318,12 +268,16 @@ impl HostedClient {
|
||||
|
||||
let url = format!("{}/v1/aphoria/observations", self.base_url);
|
||||
|
||||
// Retry loop
|
||||
// Retry loop with exponential backoff + jitter
|
||||
let mut delay_ms = self.retry_delay_ms;
|
||||
let mut last_error = None;
|
||||
for attempt in 0..=self.max_retries {
|
||||
if attempt > 0 {
|
||||
info!(attempt, "Retrying push to team server");
|
||||
std::thread::sleep(Duration::from_millis(self.retry_delay_ms));
|
||||
let jitter_pct: u64 = rand::thread_rng().gen_range(75..=125);
|
||||
let sleep_ms = delay_ms * jitter_pct / 100;
|
||||
info!(attempt, sleep_ms, "Retrying push to team server");
|
||||
std::thread::sleep(Duration::from_millis(sleep_ms));
|
||||
delay_ms = (delay_ms * 2).min(30_000);
|
||||
}
|
||||
|
||||
match self.do_push_team(&url, &request) {
|
||||
@ -336,6 +290,10 @@ impl HostedClient {
|
||||
return Ok(response.accepted);
|
||||
}
|
||||
Err(e) => {
|
||||
if !is_retryable_hosted_error(&e) {
|
||||
warn!(attempt, error = %e, "Non-retryable error pushing to team server");
|
||||
return self.handle_push_error(Some(e));
|
||||
}
|
||||
warn!(attempt, error = %e, "Failed to push to team server");
|
||||
last_error = Some(e);
|
||||
}
|
||||
@ -366,12 +324,16 @@ impl HostedClient {

         let url = format!("{}/v1/aphoria/community/observations", self.base_url);

-        // Retry loop
+        // Retry loop with exponential backoff + jitter
+        let mut delay_ms = self.retry_delay_ms;
         let mut last_error = None;
         for attempt in 0..=self.max_retries {
             if attempt > 0 {
-                info!(attempt, "Retrying push to community corpus");
-                std::thread::sleep(Duration::from_millis(self.retry_delay_ms));
+                let jitter_pct: u64 = rand::thread_rng().gen_range(75..=125);
+                let sleep_ms = delay_ms * jitter_pct / 100;
+                info!(attempt, sleep_ms, "Retrying push to community corpus");
+                std::thread::sleep(Duration::from_millis(sleep_ms));
+                delay_ms = (delay_ms * 2).min(30_000);
             }

             match self.do_push_community(&url, &request) {
@ -385,6 +347,10 @@ impl HostedClient {
                     return Ok(response.recorded);
                 }
                 Err(e) => {
+                    if !is_retryable_hosted_error(&e) {
+                        warn!(attempt, error = %e, "Non-retryable error pushing to community corpus");
+                        return self.handle_push_error(Some(e));
+                    }
                     warn!(attempt, error = %e, "Failed to push to community corpus");
                     last_error = Some(e);
                 }
@ -518,12 +484,16 @@ impl HostedClient {

         let url = format!("{}/v1/aphoria/patterns", self.base_url);

-        // Retry loop
+        // Retry loop with exponential backoff + jitter
+        let mut delay_ms = self.retry_delay_ms;
         let mut last_error = None;
         for attempt in 0..=self.max_retries {
             if attempt > 0 {
-                info!(attempt, "Retrying pattern push to hosted server");
-                std::thread::sleep(Duration::from_millis(self.retry_delay_ms));
+                let jitter_pct: u64 = rand::thread_rng().gen_range(75..=125);
+                let sleep_ms = delay_ms * jitter_pct / 100;
+                info!(attempt, sleep_ms, "Retrying pattern push to hosted server");
+                std::thread::sleep(Duration::from_millis(sleep_ms));
+                delay_ms = (delay_ms * 2).min(30_000);
             }

             match self.do_push_patterns(&url, &request) {
@ -537,6 +507,11 @@ impl HostedClient {
                     return Ok(response);
                 }
                 Err(e) => {
+                    if !is_retryable_hosted_error(&e) {
+                        warn!(attempt, error = %e, "Non-retryable error pushing patterns");
+                        last_error = Some(e);
+                        break;
+                    }
                     warn!(attempt, error = %e, "Failed to push patterns to hosted server");
                     last_error = Some(e);
                 }
@ -617,12 +592,16 @@ impl HostedClient {
             url = format!("{}?{}", url, params.join("&"));
         }

-        // Retry loop
+        // Retry loop with exponential backoff + jitter
+        let mut delay_ms = self.retry_delay_ms;
         let mut last_error = None;
         for attempt in 0..=self.max_retries {
             if attempt > 0 {
-                info!(attempt, "Retrying community extractors fetch");
-                std::thread::sleep(Duration::from_millis(self.retry_delay_ms));
+                let jitter_pct: u64 = rand::thread_rng().gen_range(75..=125);
+                let sleep_ms = delay_ms * jitter_pct / 100;
+                info!(attempt, sleep_ms, "Retrying community extractors fetch");
+                std::thread::sleep(Duration::from_millis(sleep_ms));
+                delay_ms = (delay_ms * 2).min(30_000);
             }

             match self.do_get_extractors(&url) {
@ -631,6 +610,11 @@ impl HostedClient {
                     return Ok(extractors);
                 }
                 Err(e) => {
+                    if !is_retryable_hosted_error(&e) {
+                        warn!(attempt, error = %e, "Non-retryable error fetching community extractors");
+                        last_error = Some(e);
+                        break;
+                    }
                     warn!(attempt, error = %e, "Failed to fetch community extractors");
                     last_error = Some(e);
                 }
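Reviewer sketch, not part of the commit: the four retry loops above repeat the same delay arithmetic, so it may help to read it as one pure function. The helper name `backoff_sleep_ms` and its parameters are hypothetical. The sketch assumes the behavior visible in the hunks: the delay starts at `retry_delay_ms`, doubles after each retry, is capped at 30 000 ms, and is scaled by a jitter percentage in `75..=125`. The jitter is passed in as an argument so the function stays deterministic; the commit draws it from `rand::thread_rng()`.

```rust
/// Sleep in milliseconds before retry number `attempt` (1-based).
///
/// Hypothetical helper: mirrors the loop arithmetic in the diff, where the
/// base delay doubles after every retry, is capped at `cap_ms`, and the
/// jitter percentage is applied to the (possibly capped) delay.
fn backoff_sleep_ms(base_ms: u64, attempt: u32, jitter_pct: u64, cap_ms: u64) -> u64 {
    let mut delay = base_ms;
    // One doubling per retry that has already happened.
    for _ in 1..attempt {
        delay = delay.saturating_mul(2).min(cap_ms);
    }
    // Jitter is applied after the cap, as in the commit.
    delay.saturating_mul(jitter_pct) / 100
}
```

Because the jitter is applied after the cap, a capped delay can still sleep for up to 37 500 ms (30 000 × 125 / 100).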
@ -692,6 +676,22 @@ impl HostedClient {
     }
 }

+/// Determines whether a hosted push/fetch error is worth retrying.
+///
+/// Returns `false` for HTTP 4xx client errors (auth failures, bad requests) —
+/// these will not succeed on retry. Returns `true` for 5xx server errors,
+/// connection errors, and timeouts.
+fn is_retryable_hosted_error(error: &AphoriaError) -> bool {
+    let msg = error.to_string();
+    // Non-retryable: client errors (4xx). The message format is
+    // "Server returned status 4XX" from do_push_*/do_get_extractors.
+    if msg.contains("Server returned status 4") {
+        return false;
+    }
+    // All other errors (5xx, connection refused, timeout) are retryable.
+    true
+}
+
 /// Convert an Assertion to an ObservationDto for the API.
 fn assertion_to_dto(assertion: &Assertion) -> ObservationDto {
     use stemedb_core::types::ObjectValue;
@ -781,198 +781,3 @@ fn wildcardize_subject(subject: &str, project_id: &str) -> String {
     // Simple replacement: replace project_id with wildcard
     subject.replace(project_id, "*")
 }
-
-#[cfg(test)]
-mod tests {
-    use super::*;
-    use crate::bridge::generate_signing_key;
-    use crate::config::SyncMode;
-
-    #[test]
-    fn test_client_not_created_without_url() {
-        let config = HostedConfig::default();
-        let community_config = CommunityConfig::default();
-        let key = generate_signing_key();
-        let client = HostedClient::new(&config, &community_config, &key, "test-project")
-            .expect("should not fail");
-        assert!(client.is_none());
-    }
-
-    #[test]
-    fn test_client_created_with_url() {
-        let config = HostedConfig {
-            url: Some("https://episteme.acme.corp".to_string()),
-            project_id: Some("my-project".to_string()),
-            team_id: Some("platform".to_string()),
-            sync_mode: SyncMode::RemoteOnly,
-            offline_fallback: OfflineFallback::Skip,
-            max_retries: 3,
-            retry_delay_ms: 1000,
-            api_key_env: String::new(),
-        };
-        let community_config = CommunityConfig::default();
-        let key = generate_signing_key();
-        let client = HostedClient::new(&config, &community_config, &key, "fallback-project")
-            .expect("should not fail")
-            .unwrap();
-
-        assert_eq!(client.base_url, "https://episteme.acme.corp");
-        assert_eq!(client.project_id, "my-project");
-        assert_eq!(client.team_id, Some("platform".to_string()));
-        assert_eq!(client.agent_id.len(), 64); // 32 bytes hex-encoded
-    }
-
-    #[test]
-    fn test_client_uses_fallback_project_name() {
-        let config = HostedConfig {
-            url: Some("https://episteme.acme.corp".to_string()),
-            project_id: None, // Not set
-            ..Default::default()
-        };
-        let community_config = CommunityConfig::default();
-        let key = generate_signing_key();
-        let client = HostedClient::new(&config, &community_config, &key, "fallback-project")
-            .expect("should not fail")
-            .unwrap();
-
-        assert_eq!(client.project_id, "fallback-project");
-    }
-
-    #[test]
-    fn test_assertion_to_dto() {
-        use stemedb_core::types::{
-            Assertion, HlcTimestamp, LifecycleStage, ObjectValue, SignatureEntry, SourceClass,
-        };
-
-        let assertion = Assertion {
-            subject: "code://rust/myapp/tls".to_string(),
-            predicate: "enabled".to_string(),
-            object: ObjectValue::Boolean(true),
-            parent_hash: None,
-            source_hash: [1u8; 32],
-            source_class: SourceClass::Community,
-            visual_hash: None,
-            epoch: None,
-            source_metadata: Some(b"{\"file\":\"test.rs\"}".to_vec()),
-            narrative: None,
-            lifecycle: LifecycleStage::Approved,
-            signatures: vec![SignatureEntry {
-                agent_id: [2u8; 32],
-                signature: [3u8; 64],
-                timestamp: 12345,
-                version: 1,
-            }],
-            confidence: 0.9,
-            timestamp: 67890,
-            hlc_timestamp: HlcTimestamp::default(),
-            vector: None,
-        };
-
-        let dto = assertion_to_dto(&assertion);
-
-        assert_eq!(dto.subject, "code://rust/myapp/tls");
-        assert_eq!(dto.predicate, "enabled");
-        assert!(matches!(dto.object, ObjectValueDto::Boolean(true)));
-        assert_eq!(dto.confidence, 0.9);
-        assert_eq!(dto.timestamp, 67890);
-        assert_eq!(dto.signatures.len(), 1);
-        assert_eq!(dto.signatures[0].version, 1);
-        assert_eq!(dto.source_metadata, Some("{\"file\":\"test.rs\"}".to_string()));
-    }
-
-    #[test]
-    fn test_compute_org_hash() {
-        let config = HostedConfig {
-            url: Some("https://episteme.acme.corp".to_string()),
-            project_id: Some("my-project".to_string()),
-            team_id: Some("platform".to_string()),
-            ..Default::default()
-        };
-        let community_config = CommunityConfig::default();
-        let key = generate_signing_key();
-        let client = HostedClient::new(&config, &community_config, &key, "fallback-project")
-            .expect("should not fail")
-            .unwrap();
-
-        let hash = client.compute_org_hash();
-
-        // Hash should be 64 hex characters (32 bytes)
-        assert_eq!(hash.len(), 64);
-
-        // Same inputs should produce same hash
-        let hash2 = client.compute_org_hash();
-        assert_eq!(hash, hash2);
-    }
-
-    #[test]
-    fn test_compute_org_hash_without_team() {
-        let config = HostedConfig {
-            url: Some("https://episteme.acme.corp".to_string()),
-            project_id: Some("my-project".to_string()),
-            team_id: None,
-            ..Default::default()
-        };
-        let community_config = CommunityConfig::default();
-        let key = generate_signing_key();
-        let client = HostedClient::new(&config, &community_config, &key, "fallback-project")
-            .expect("should not fail")
-            .unwrap();
-
-        let hash = client.compute_org_hash();
-        assert_eq!(hash.len(), 64);
-
-        // With team should produce different hash
-        let config_with_team = HostedConfig {
-            url: Some("https://episteme.acme.corp".to_string()),
-            project_id: Some("my-project".to_string()),
-            team_id: Some("platform".to_string()),
-            ..Default::default()
-        };
-        let client_with_team =
-            HostedClient::new(&config_with_team, &community_config, &key, "fallback-project")
-                .expect("should not fail")
-                .unwrap();
-        let hash_with_team = client_with_team.compute_org_hash();
-
-        assert_ne!(hash, hash_with_team);
-    }
-
-    #[test]
-    fn test_push_patterns_empty() {
-        let config = HostedConfig {
-            url: Some("https://episteme.acme.corp".to_string()),
-            project_id: Some("my-project".to_string()),
-            ..Default::default()
-        };
-        let community_config = CommunityConfig::default();
-        let key = generate_signing_key();
-        let client = HostedClient::new(&config, &community_config, &key, "fallback-project")
-            .expect("should not fail")
-            .unwrap();
-
-        // Empty patterns should return default response without making HTTP call
-        let result = client.push_patterns(vec![]);
-        assert!(result.is_ok());
-        let response = result.unwrap();
-        assert_eq!(response.accepted, 0);
-        assert_eq!(response.merged, 0);
-        assert_eq!(response.deduplicated, 0);
-    }
-
-    #[test]
-    fn test_accessors() {
-        let config = HostedConfig {
-            url: Some("https://episteme.acme.corp".to_string()),
-            project_id: Some("my-project".to_string()),
-            ..Default::default()
-        };
-        let community_config = CommunityConfig::default();
-        let key = generate_signing_key();
-        let client = HostedClient::new(&config, &community_config, &key, "fallback-project")
-            .expect("should not fail")
-            .unwrap();
-
-        assert_eq!(client.base_url(), "https://episteme.acme.corp");
-        assert_eq!(client.project_id(), "my-project");
-    }
-}
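Reviewer sketch, not part of the commit: `is_retryable_hosted_error` classifies errors by matching on the message text. The first function below mirrors that check on a plain string so its behavior can be exercised in isolation. The second is a hypothetical alternative that classifies by numeric status code. Treating 408 and 429 as retryable is an assumption of this sketch; the commit treats every 4xx as non-retryable. Both function names are invented for illustration.

```rust
/// Mirrors the commit's string check: any message containing
/// "Server returned status 4" counts as a non-retryable client error.
fn is_retryable_message(msg: &str) -> bool {
    !msg.contains("Server returned status 4")
}

/// Hypothetical alternative keyed on the numeric status code.
/// Retrying 408 (timeout) and 429 (rate limited) is this sketch's
/// assumption, not the commit's behavior.
fn is_retryable_status(status: u16) -> bool {
    match status {
        408 | 429 => true,
        400..=499 => false,
        _ => true,
    }
}
```

The string form depends on the exact wording produced by `do_push_*` and `do_get_extractors`, so a change to that message would silently make every error retryable again.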
@ -10,6 +10,7 @@ workspace = true
 [features]
 default = ["aphoria"]
 aphoria = ["dep:aphoria"]
+cluster = ["dep:stemedb-cluster", "dep:stemedb-sync"]

 [dependencies]
 stemedb-core = { path = "../stemedb-core" }
@ -22,6 +23,10 @@ stemedb-lens = { path = "../stemedb-lens" }
 # Optional: Aphoria code-level truth linting
 aphoria = { path = "../../applications/aphoria", optional = true }

+# Optional: Multi-node cluster participation
+stemedb-cluster = { path = "../stemedb-cluster", optional = true }
+stemedb-sync = { path = "../stemedb-sync", optional = true }
+
 axum = { version = "0.7", features = ["json"] }
 axum-server = { version = "0.7", features = ["tls-rustls"] }
 tokio = { version = "1", features = ["full"] }
@ -133,8 +133,8 @@ pub use aphoria::{

 // From stemedb_claims module
 pub use stemedb_claims::{
-    AuthoredClaimDto, AuthoredValueDto, ClaimSearchQuery, ClaimStatsDto,
-    CreateClaimRequest, CreateClaimResponse,
+    AuthoredClaimDto, AuthoredValueDto, ClaimSearchQuery, ClaimStatsDto, CreateClaimRequest,
+    CreateClaimResponse,
 };

 // From subjects module
@ -1,7 +1,7 @@
 //! DTOs for StemeDB claims endpoints.

-use std::collections::HashMap;
 use serde::{Deserialize, Serialize};
+use std::collections::HashMap;
 use utoipa::{IntoParams, ToSchema};

 /// Request to create a claim in StemeDB.
@ -89,11 +89,8 @@ pub use aphoria::{
 };

 pub use stemedb_claims::{
-    create_claim as create_stemedb_claim,
-    delete_claim as delete_stemedb_claim,
-    get_claim as get_stemedb_claim,
-    get_claim_stats as get_stemedb_claim_stats,
-    list_claims as list_stemedb_claims,
-    search_claims as search_stemedb_claims,
+    create_claim as create_stemedb_claim, delete_claim as delete_stemedb_claim,
+    get_claim as get_stemedb_claim, get_claim_stats as get_stemedb_claim_stats,
+    list_claims as list_stemedb_claims, search_claims as search_stemedb_claims,
 };
 pub use subjects::{list_predicates, list_subjects};
@ -18,8 +18,8 @@ use stemedb_storage::{key_codec, KVStore};

 use crate::{
     dto::{
-        AuthoredClaimDto, AuthoredValueDto, ClaimSearchQuery, ClaimStatsDto,
-        CreateClaimRequest, CreateClaimResponse,
+        AuthoredClaimDto, AuthoredValueDto, ClaimSearchQuery, ClaimStatsDto, CreateClaimRequest,
+        CreateClaimResponse,
     },
     error::{ApiError, Result},
     AppState,
@ -566,9 +566,7 @@ pub async fn search_claims(
     }
     if let Some(max_tier) = query.max_tier {
         claims.retain(|c| {
-            tier_string_to_number(&c.authority_tier)
-                .map(|t| t <= max_tier)
-                .unwrap_or(false)
+            tier_string_to_number(&c.authority_tier).map(|t| t <= max_tier).unwrap_or(false)
         });
     }
     if let Some(ref status) = query.status {
@ -635,10 +633,8 @@ pub async fn get_claim_stats(
         *value_counts.entry(value_to_string(&claim.value)).or_insert(0) += 1;
     }

-    let most_common_value = value_counts
-        .into_iter()
-        .max_by_key(|(_, count)| *count)
-        .map(|(val, _)| val);
+    let most_common_value =
+        value_counts.into_iter().max_by_key(|(_, count)| *count).map(|(val, _)| val);

     Ok(Json(ClaimStatsDto {
         concept_path,
@ -9,7 +9,8 @@ use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};
 use axum::Extension;
 use metrics_exporter_prometheus::PrometheusBuilder;
 use stemedb_api::{
-    create_router_config, create_router_with_meter_config, AppState, SecurityConfig,
+    bootstrap, create_router_config, create_router_full_protection_full_config,
+    create_router_with_meter_config, ApiKeyAuthConfig, AppState, SecurityConfig,
 };
 use stemedb_ingest::worker::IngestWorker;
 use stemedb_storage::HybridStore;
@ -158,10 +159,14 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
     let write_journal = Journal::open(&config.wal_dir)?;
     let read_journal = Journal::open(&config.wal_dir)?;
     let store = Arc::new(HybridStore::open(&config.db_dir)?);
-    let corpus_store = config.corpus_db_dir.as_ref().map(|d| {
-        let _ = std::fs::create_dir_all(d);
-        Arc::new(HybridStore::open(d).unwrap())
-    });
+    let corpus_store = config
+        .corpus_db_dir
+        .as_ref()
+        .map(|d| {
+            let _ = std::fs::create_dir_all(d);
+            HybridStore::open(d).map(Arc::new)
+        })
+        .transpose()?;

     let state = AppState::new(write_journal, read_journal, Arc::clone(&store), corpus_store);
     let worker_journal = state.journal.clone();
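Reviewer sketch, not part of the commit: the `corpus_store` change replaces an `unwrap()` inside `map` with `Option::transpose`, which turns an `Option<Result<T, E>>` into a `Result<Option<T>, E>` so that `?` can propagate the error. The `Store` type and `open_optional` function below are stand-ins invented for illustration; `HybridStore` and its real error type are not reproduced here.

```rust
use std::sync::Arc;

/// Stand-in for a store with a fallible `open` (hypothetical type).
#[derive(Debug, PartialEq)]
struct Store {
    dir: String,
}

impl Store {
    fn open(dir: &str) -> Result<Store, String> {
        if dir.is_empty() {
            Err("empty directory".to_string())
        } else {
            Ok(Store { dir: dir.to_string() })
        }
    }
}

/// `None` stays `Ok(None)`; `Some(dir)` becomes `Ok(Some(store))` or `Err(e)`.
fn open_optional(dir: Option<&str>) -> Result<Option<Arc<Store>>, String> {
    dir.map(|d| Store::open(d).map(Arc::new)).transpose()
}
```

With this shape, a missing directory setting is still a valid configuration, while a configured directory that fails to open surfaces as an error instead of a panic.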
@ -184,6 +189,71 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
         }
     });

+    // Bootstrap root API key from env (idempotent: no-op if key already exists).
+    if let Err(e) = bootstrap::bootstrap_root_api_key(&*state.api_key_store).await {
+        error!("Failed to bootstrap root API key: {}", e);
+        std::process::exit(1);
+    }
+
+    // Cluster mode: join SWIM membership when STEMEDB_CLUSTER_MODE=true.
+    // Requires the `cluster` feature to be enabled at compile time.
+    #[cfg(feature = "cluster")]
+    {
+        let cluster_mode = std::env::var("STEMEDB_CLUSTER_MODE")
+            .map(|v| v.to_lowercase() == "true" || v == "1")
+            .unwrap_or(false);
+
+        if cluster_mode {
+            use stemedb_cluster::{stable_node_id, NodeInfo, SwimConfig, SwimMembership};
+
+            let node_id = stable_node_id();
+
+            let rpc_addr: std::net::SocketAddr = std::env::var("STEMEDB_NODE_RPC_ADDR")
+                .unwrap_or_else(|_| "127.0.0.1:18182".to_string())
+                .parse()
+                .unwrap_or_else(|_| std::net::SocketAddr::from(([127, 0, 0, 1], 18182)));
+
+            let api_addr: std::net::SocketAddr = config
+                .bind_addr
+                .parse()
+                .unwrap_or_else(|_| std::net::SocketAddr::from(([127, 0, 0, 1], 18180)));
+
+            let local_info = NodeInfo::new(node_id, rpc_addr, api_addr);
+            let membership = Arc::new(SwimMembership::new(local_info, SwimConfig::default()));
+
+            let seeds: Vec<std::net::SocketAddr> = std::env::var("STEMEDB_CLUSTER_SEEDS")
+                .unwrap_or_default()
+                .split(',')
+                .filter(|s| !s.trim().is_empty())
+                .filter_map(|s| s.trim().parse().ok())
+                .collect();
+
+            if !seeds.is_empty() {
+                if let Err(e) = membership.join(seeds).await {
+                    warn!("Cluster join failed (continuing as solo node): {}", e);
+                }
+            }
+
+            membership.start();
+            info!(
+                node_id = %node_id.short_hex(),
+                rpc_addr = %rpc_addr,
+                "Cluster mode active"
+            );
+        }
+    }
+
+    // Startup guard: unsafe skip + auth enabled is a fatal misconfiguration.
+    if config.unsafe_skip_signatures && bootstrap::is_auth_enabled() {
+        error!(
+            "FATAL: STEMEDB_UNSAFE_SKIP_SIGNATURES=true conflicts with \
+             STEMEDB_AUTH_ENABLED=true. Signature verification must be enabled \
+             when auth is enforced. Unset STEMEDB_UNSAFE_SKIP_SIGNATURES or \
+             disable STEMEDB_AUTH_ENABLED."
+        );
+        std::process::exit(1);
+    }
+
     // Build router (with or without metering) with security config
     let security_config = config.to_security_config();
     info!(
@ -193,7 +263,21 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {
         security_config.http_timeout_secs,
     );

-    let app = if config.meter_enabled {
+    let app = if bootstrap::is_auth_enabled() {
+        info!(
+            require_all = bootstrap::is_auth_require_all(),
+            "Auth enforced (STEMEDB_AUTH_ENABLED=true) — full protection stack active"
+        );
+        create_router_full_protection_full_config(
+            state,
+            ApiKeyAuthConfig {
+                enabled: true,
+                require_for_all: bootstrap::is_auth_require_all(),
+                ..ApiKeyAuthConfig::default()
+            },
+            security_config,
+        )
+    } else if config.meter_enabled {
         info!("The Meter enabled: economic throttling active (10K tokens/agent/hour)");
         create_router_with_meter_config(state, security_config)
     } else {
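Reviewer sketch, not part of the commit: the cluster block in `main.rs` parses `STEMEDB_CLUSTER_MODE` as a boolean flag and `STEMEDB_CLUSTER_SEEDS` as a comma-separated list of socket addresses. The two functions below reproduce that parsing on plain strings so the edge cases are easy to check. The names `parse_flag` and `parse_seeds` are hypothetical.

```rust
use std::net::SocketAddr;

/// Same rule as the diff: "true" in any letter case, or the literal "1".
fn parse_flag(raw: Option<&str>) -> bool {
    raw.map(|v| v.to_lowercase() == "true" || v == "1").unwrap_or(false)
}

/// Splits on commas, trims, skips empty entries, and silently drops
/// anything that does not parse as a literal `ip:port` pair.
fn parse_seeds(raw: &str) -> Vec<SocketAddr> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .filter_map(|s| s.parse().ok())
        .collect()
}
```

One consequence worth checking against the deployment: `SocketAddr`'s parser accepts only IP literals and does no DNS lookup, so a hostname seed such as a Kubernetes service name would be dropped without a log line, and the node would start solo.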
@ -29,6 +29,9 @@ axum = "0.7"
 tower = "0.5"
 tower-http = { version = "0.5", features = ["cors", "trace"] }

+# HTTP client for gateway request forwarding
+reqwest = { version = "0.12", features = ["json"] }
+
 # Serialization
 serde = { version = "1.0", features = ["derive"] }
 serde_json = "1.0"
@ -23,7 +23,7 @@ use tracing::info;
 use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};

 use stemedb_cluster::{
-    Gateway, NodeId, NodeInfo, RangeManager, RangeRouter, ShardingConfig, SwimConfig,
+    stable_node_id, Gateway, NodeInfo, RangeManager, RangeRouter, ShardingConfig, SwimConfig,
     SwimMembership,
 };
@ -82,7 +82,8 @@ async fn main() -> Result<(), Box<dyn std::error::Error>> {

     let config = NodeConfig::from_env();

-    let node_id = NodeId::random();
+    // Use stable NodeId (env var → hostname → random fallback)
+    let node_id = stable_node_id();

     info!(
         node_id = %node_id.short_hex(),
@ -425,7 +425,7 @@ impl ClusterConfigBuilder {
             .ok_or_else(|| crate::ClusterError::Config("api_addr is required".to_string()))?;

         Ok(ClusterConfig {
-            node_id: self.node_id.unwrap_or_else(NodeId::random),
+            node_id: self.node_id.unwrap_or_else(crate::stable_node_id),
             rpc_addr,
             api_addr,
             seed_nodes: self.seed_nodes,
@ -8,41 +8,77 @@ use tracing::instrument;
 use crate::gateway::service::GatewayState;
 use crate::sharding::ShardId;

-use super::types::{
-    ApiError, ClusterStatusResponse, HealthResponse, NodeStatusInfo, QueryParams, QueryResponse,
-};
+use super::types::{ApiError, ClusterStatusResponse, HealthResponse, NodeStatusInfo, QueryParams};

 /// GET /v1/query - Query assertions.
+///
+/// Routes by subject hash to a replica (preferring local) and forwards the
+/// request via HTTP to that node's stemedb-api.
 #[instrument(skip(state), fields(subject = %params.subject))]
 pub async fn handle_query(
     State(state): State<Arc<GatewayState>>,
     Query(params): Query<QueryParams>,
-) -> Result<Json<QueryResponse>, ApiError> {
+) -> Result<Json<serde_json::Value>, ApiError> {
+    state.inc_requests();
+
     // 1. Route by subject hash
     let shard_id = state.router.route_subject(&params.subject).map_err(|e| ApiError {
         code: "UNAVAILABLE".to_string(),
         message: format!("Routing failed: {e}"),
     })?;

-    // 2. Get replicas, preferring local
+    // 2. Get replicas, preferring local node to minimize latency
     let replicas = state.router.get_replicas_prefer_local(shard_id).map_err(|e| ApiError {
         code: "UNAVAILABLE".to_string(),
         message: format!("No replicas for shard {shard_id}: {e}"),
     })?;

-    let replica = replicas.first().ok_or_else(|| ApiError {
+    let replica_id = replicas.first().ok_or_else(|| ApiError {
         code: "UNAVAILABLE".to_string(),
         message: format!("No replicas available for shard {shard_id}"),
     })?;

-    // 3. Forward to replica via RPC (not yet wired)
+    // 3. Look up replica's HTTP API address via membership
+    let replica_info = state.membership.get_member(*replica_id).ok_or_else(|| ApiError {
+        code: "UNAVAILABLE".to_string(),
+        message: format!("Replica {} not found in membership", replica_id.short_hex()),
+    })?;
+
+    // 4. Get or create a pooled HTTP client for this node
+    let http_client = {
+        let entry = state.http_forwarders.entry(*replica_id).or_insert_with(reqwest::Client::new);
+        entry.clone()
+    };
+
+    // 5. Forward to replica's stemedb-api, preserving all query parameters
+    let url = format!("http://{}/v1/query", replica_info.api_addr);
     tracing::info!(
-        shard_id = shard_id,
-        replica = %replica.short_hex(),
-        "Routed query to replica"
+        shard_id,
+        replica = %replica_id.short_hex(),
+        url = %url,
+        "Forwarding query to replica"
     );

-    Ok(Json(QueryResponse { assertions: vec![], shard_id, served_by: replica.short_hex() }))
+    let response = http_client.get(&url).query(&params).send().await.map_err(|e| ApiError {
+        code: "UNAVAILABLE".to_string(),
+        message: format!("Forward to replica failed: {e}"),
+    })?;
+
+    if !response.status().is_success() {
+        let status_code = response.status().as_u16();
+        let body = response.text().await.unwrap_or_default();
+        return Err(ApiError {
+            code: "UNAVAILABLE".to_string(),
+            message: format!("Replica returned {status_code}: {body}"),
+        });
+    }
+
+    let result: serde_json::Value = response.json().await.map_err(|e| ApiError {
+        code: "UNAVAILABLE".to_string(),
+        message: format!("Failed to parse replica response: {e}"),
+    })?;
+
+    Ok(Json(result))
 }

 /// GET /v1/health - Health check.
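Reviewer sketch, not part of the commit: the handler builds the forwarding URL by formatting `replica_info.api_addr` into `http://{}/v1/query`. The helper below assumes `api_addr` is a `std::net::SocketAddr`, as the `NodeInfo::new(node_id, rpc_addr, api_addr)` call in `main.rs` suggests. The function name `forward_url` is hypothetical, and the wildcard check is an addition of this sketch: `main.rs` derives `api_addr` from `config.bind_addr`, so if a node binds to `0.0.0.0`, peers would receive an address they cannot dial. Whether that can happen depends on deployment configuration not shown in this diff.

```rust
use std::net::SocketAddr;

/// Builds the forwarding URL for a node, or `None` when the advertised
/// address is a wildcard (0.0.0.0 or ::) that a remote peer cannot dial.
fn forward_url(api_addr: SocketAddr, path: &str) -> Option<String> {
    if api_addr.ip().is_unspecified() {
        return None;
    }
    // `SocketAddr`'s Display impl brackets IPv6 hosts, so the URL stays valid.
    Some(format!("http://{api_addr}{path}"))
}
```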
@ -26,19 +26,6 @@ pub struct CreateAssertionRequest {
     pub public_key: String,
 }

-/// Response from assertion creation.
-#[derive(Debug, Clone, Serialize, Deserialize)]
-pub struct AssertionResponse {
-    /// ID of the created assertion (content hash).
-    pub assertion_id: String,
-
-    /// Shard the assertion was routed to.
-    pub shard_id: ShardId,
-
-    /// Node that processed the write.
-    pub leader_node: String,
-}
-
 /// Query parameters for assertion lookup.
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct QueryParams {
@ -55,19 +42,6 @@ pub struct QueryParams {
     pub limit: Option<usize>,
 }

-/// Query response with assertions.
-#[derive(Debug, Clone, Serialize, Deserialize)]
-pub struct QueryResponse {
-    /// Matching assertions.
-    pub assertions: Vec<serde_json::Value>,
-
-    /// Shard that served the query.
-    pub shard_id: ShardId,
-
-    /// Node that served the query.
-    pub served_by: String,
-}
-
 /// Vote request.
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct VoteRequest {
@ -7,49 +7,84 @@ use tracing::instrument;

 use crate::gateway::service::GatewayState;

-use super::types::{
-    ApiError, AssertionResponse, CreateAssertionRequest, VoteRequest, VoteResponse,
-};
+use super::types::{ApiError, CreateAssertionRequest, VoteRequest, VoteResponse};

 /// POST /v1/assert - Create a new assertion.
+///
+/// Routes by subject hash to the shard leader and forwards the request via
+/// HTTP to that node's stemedb-api. Returns the response from the leader.
 #[instrument(skip(state, req), fields(subject = %req.subject))]
 pub async fn handle_assert(
     State(state): State<Arc<GatewayState>>,
     Json(req): Json<CreateAssertionRequest>,
-) -> Result<Json<AssertionResponse>, ApiError> {
-    // 1. Route by subject hash
+) -> Result<Json<serde_json::Value>, ApiError> {
+    state.inc_requests();
+
+    // 1. Route by subject hash to determine shard
     let shard_id = state.router.route_subject(&req.subject).map_err(|e| ApiError {
         code: "UNAVAILABLE".to_string(),
         message: format!("Routing failed: {e}"),
     })?;

     // 2. Get leader for this shard
-    let leader = state.router.get_leader(shard_id).map_err(|e| ApiError {
+    let leader_id = state.router.get_leader(shard_id).map_err(|e| ApiError {
         code: "UNAVAILABLE".to_string(),
         message: format!("No leader for shard {shard_id}: {e}"),
     })?;

-    // 3. Forward to leader via RPC (not yet wired)
+    // 3. Look up leader's HTTP API address via membership
+    let leader_info = state.membership.get_member(leader_id).ok_or_else(|| ApiError {
+        code: "UNAVAILABLE".to_string(),
+        message: format!("Leader {} not found in membership", leader_id.short_hex()),
+    })?;
+
+    // 4. Get or create a pooled HTTP client for this node
+    let http_client = {
+        let entry = state.http_forwarders.entry(leader_id).or_insert_with(reqwest::Client::new);
+        entry.clone()
+    };
+
+    // 5. Forward to the leader's stemedb-api
+    let url = format!("http://{}/v1/assert", leader_info.api_addr);
     tracing::info!(
-        shard_id = shard_id,
-        leader = %leader.short_hex(),
-        "Routed assertion to shard leader"
+        shard_id,
+        leader = %leader_id.short_hex(),
+        url = %url,
+        "Forwarding assertion to shard leader"
     );

-    // Return routing result (actual RPC forwarding requires stemedb-rpc integration)
-    Ok(Json(AssertionResponse {
-        assertion_id: format!("pending_{}", req.subject),
-        shard_id,
-        leader_node: leader.short_hex(),
-    }))
+    let response = http_client.post(&url).json(&req).send().await.map_err(|e| ApiError {
+        code: "UNAVAILABLE".to_string(),
+        message: format!("Forward to leader failed: {e}"),
+    })?;
+
+    if !response.status().is_success() {
+        let status_code = response.status().as_u16();
+        let body = response.text().await.unwrap_or_default();
+        return Err(ApiError {
+            code: "UNAVAILABLE".to_string(),
+            message: format!("Leader returned {status_code}: {body}"),
+        });
+    }
+
+    let result: serde_json::Value = response.json().await.map_err(|e| ApiError {
+        code: "UNAVAILABLE".to_string(),
+        message: format!("Failed to parse leader response: {e}"),
+    })?;
+
+    Ok(Json(result))
 }

 /// POST /v1/vote - Submit a vote.
+///
+/// Routes to the shard leader for the assertion's subject and forwards via HTTP.
 #[instrument(skip(state, req), fields(subject = %req.subject))]
 pub async fn handle_vote(
     State(state): State<Arc<GatewayState>>,
     Json(req): Json<VoteRequest>,
 ) -> Result<Json<VoteResponse>, ApiError> {
+    state.inc_requests();
+
     // Route by subject hash
     let shard_id = state.router.route_subject(&req.subject).map_err(|e| ApiError {
         code: "UNAVAILABLE".to_string(),
@ -57,18 +92,45 @@ pub async fn handle_vote(
     })?;

     // Get leader
-    let leader = state.router.get_leader(shard_id).map_err(|e| ApiError {
+    let leader_id = state.router.get_leader(shard_id).map_err(|e| ApiError {
         code: "UNAVAILABLE".to_string(),
         message: format!("No leader for shard {shard_id}: {e}"),
     })?;

-    // Forward to leader via RPC (not yet wired)
+    // Look up leader's API address
+    let leader_info = state.membership.get_member(leader_id).ok_or_else(|| ApiError {
+        code: "UNAVAILABLE".to_string(),
+        message: format!("Leader {} not found in membership", leader_id.short_hex()),
+    })?;
+
+    // Get or create a pooled HTTP client
+    let http_client = {
+        let entry = state.http_forwarders.entry(leader_id).or_insert_with(reqwest::Client::new);
+        entry.clone()
+    };
+
+    // Forward to leader's stemedb-api
+    let url = format!("http://{}/v1/vote", leader_info.api_addr);
     tracing::info!(
-        shard_id = shard_id,
-        leader = %leader.short_hex(),
+        shard_id,
+        leader = %leader_id.short_hex(),
         assertion_id = %req.assertion_id,
-        "Routed vote to shard leader"
+        "Forwarding vote to shard leader"
     );

+    let response = http_client.post(&url).json(&req).send().await.map_err(|e| ApiError {
+        code: "UNAVAILABLE".to_string(),
+        message: format!("Forward to leader failed: {e}"),
+    })?;
+
+    if !response.status().is_success() {
+        let status_code = response.status().as_u16();
+        let body = response.text().await.unwrap_or_default();
+        return Err(ApiError {
+            code: "UNAVAILABLE".to_string(),
+            message: format!("Leader returned {status_code}: {body}"),
+        });
+    }
+
     Ok(Json(VoteResponse { success: true, shard_id }))
 }
@ -31,9 +31,11 @@ pub struct GatewayState {
|
||||
/// Membership for discovering nodes.
|
||||
pub membership: Arc<SwimMembership>,
|
||||
|
||||
/// RPC client pool (node ID -> client).
|
||||
/// In a full implementation, these would be gRPC clients.
|
||||
pub rpc_clients: DashMap<NodeId, ()>,
|
||||
/// HTTP client pool for forwarding requests to each node's stemedb-api.
|
||||
///
|
||||
/// Keyed by NodeId. `reqwest::Client` is cheap to clone (Arc internally)
|
||||
/// and reuses TCP connections via connection pooling.
|
||||
pub http_forwarders: DashMap<NodeId, reqwest::Client>,
|
||||
|
||||
/// Request counter for metrics.
|
||||
pub request_count: AtomicU64,
|
||||
@ -49,7 +51,7 @@ impl GatewayState {
|
||||
Self {
|
||||
router,
|
||||
membership,
|
||||
rpc_clients: DashMap::new(),
|
||||
http_forwarders: DashMap::new(),
|
||||
request_count: AtomicU64::new(0),
|
||||
sync_notifiers: RwLock::new(Vec::new()),
|
||||
}
|
||||
|
||||
@ -71,3 +71,73 @@ pub use error::{ClusterError, Result};
|
||||
pub use gateway::{Gateway, GatewayBuilder};
|
||||
pub use membership::{MembershipEvent, NodeId, NodeInfo, NodeState, SwimMembership};
|
||||
pub use sharding::{MetaRange, RangeDescriptor, RangeManager, RangeRouter, ShardId};
|
||||
|
||||
/// Returns a stable [`NodeId`] for this process.
|
||||
///
|
||||
/// Priority:
|
||||
/// 1. `STEMEDB_NODE_ID` env var — hashed via BLAKE3 (k8s: set in Deployment env)
|
||||
/// 2. `HOSTNAME` env var — hashed via BLAKE3, stable within a pod when hostname = pod name
|
||||
/// 3. Random fallback — development/test only
|
||||
///
|
||||
/// # Example (k8s)
|
||||
/// ```yaml
|
||||
/// env:
|
||||
/// - name: STEMEDB_NODE_ID
|
||||
/// value: "node-a"
|
||||
/// ```
|
||||
pub fn stable_node_id() -> NodeId {
|
||||
fn hash_to_node_id(s: &str) -> NodeId {
|
||||
let hash = blake3::hash(s.as_bytes());
|
||||
let bytes: &[u8; 32] = hash.as_bytes();
|
||||
let mut id_bytes = [0u8; 16];
|
||||
id_bytes.copy_from_slice(&bytes[..16]);
|
||||
NodeId::from_bytes(id_bytes)
|
||||
}
|
||||
|
||||
if let Ok(val) = std::env::var("STEMEDB_NODE_ID") {
|
||||
if !val.is_empty() {
|
||||
return hash_to_node_id(&val);
|
||||
}
|
||||
}
|
||||
|
||||
if let Ok(hostname) = std::env::var("HOSTNAME") {
|
||||
if !hostname.is_empty() {
|
||||
return hash_to_node_id(&hostname);
|
||||
}
|
||||
}
|
||||
|
||||
NodeId::random()
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod stable_node_id_tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn test_stable_node_id_env_var() {
|
||||
// Same env var → same NodeId
|
||||
std::env::set_var("STEMEDB_NODE_ID", "test-node-a");
|
||||
let id1 = stable_node_id();
|
||||
let id2 = stable_node_id();
|
||||
assert_eq!(id1, id2);
|
||||
std::env::remove_var("STEMEDB_NODE_ID");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_stable_node_id_different_values() {
|
||||
// Different values → different NodeIds
|
||||
let id_a = {
|
||||
std::env::set_var("STEMEDB_NODE_ID", "node-a");
|
||||
let id = stable_node_id();
|
||||
std::env::remove_var("STEMEDB_NODE_ID");
|
||||
id
|
||||
};
|
||||
let id_b = {
|
||||
std::env::set_var("STEMEDB_NODE_ID", "node-b");
|
||||
let id = stable_node_id();
|
||||
std::env::remove_var("STEMEDB_NODE_ID");
|
||||
id
|
||||
};
|
||||
assert_ne!(id_a, id_b);
|
||||
}
|
||||
}
|
||||
|
||||
@ -88,17 +88,18 @@ impl SwimMembership {
|
||||
*local = info;
|
||||
}
|
||||
|
||||
/// Joins the cluster by contacting seed nodes.
|
||||
/// Joins the cluster by contacting seed nodes via gRPC ping.
|
||||
///
|
||||
/// # Algorithm
|
||||
///
|
||||
/// 1. Contact each seed node to get their membership list
|
||||
/// 2. Merge received lists into our local view
|
||||
/// 3. Announce ourselves to the cluster
|
||||
/// 1. For each seed, attempt a `Ping` RPC to verify reachability
|
||||
/// 2. If at least one seed is reachable, mark as joined
|
||||
/// 3. If no seeds are reachable, start as an isolated node (not an error —
|
||||
/// gossip and anti-entropy will sync state once the network recovers)
|
||||
///
|
||||
/// # Errors
|
||||
///
|
||||
/// Returns error if no seed nodes are reachable.
|
||||
/// Never returns an error — isolated startup is acceptable.
|
||||
#[instrument(skip(self), fields(seed_count = seeds.len()))]
|
||||
pub async fn join(&self, seeds: Vec<std::net::SocketAddr>) -> Result<()> {
|
||||
if seeds.is_empty() {
|
||||
@ -108,17 +109,52 @@ impl SwimMembership {
|
||||
return Ok(());
|
||||
}
|
||||
|
||||
// Seed contact via RPC is not yet wired. Once stemedb-rpc integration
|
||||
// is complete, this will:
|
||||
// 1. Send JoinRequest to each seed
|
||||
// 2. Receive MembershipList response
|
||||
// 3. Merge into our local state
|
||||
// 4. Broadcast our presence
|
||||
//
|
||||
// For now, use `alive_node()` to manually register discovered peers.
|
||||
info!(seeds = ?seeds, "Joining cluster (seed RPC contact pending integration)");
|
||||
self.joined.store(true, Ordering::SeqCst);
|
||||
info!("Joining cluster via seeds");
|
||||
|
||||
let local_id = self.local_id();
|
||||
let local_rpc_addr = self.local_info().rpc_addr;
|
||||
let mut contacted = 0usize;
|
||||
|
||||
for seed_addr in &seeds {
|
||||
// Skip our own RPC address to avoid self-pinging
|
||||
if *seed_addr == local_rpc_addr {
|
||||
continue;
|
||||
}
|
||||
|
||||
let addr = format!("http://{}", seed_addr);
|
||||
let client = match stemedb_rpc::SyncClient::connect(&addr).await {
|
||||
Ok(c) => c,
|
||||
Err(e) => {
|
||||
warn!(seed = %seed_addr, error = %e, "Cannot connect to seed, skipping");
|
||||
continue;
|
||||
}
|
||||
};
|
||||
|
||||
let ping = stemedb_rpc::proto::PingRequest { node_id: local_id.as_bytes().to_vec() };
|
||||
|
||||
match client.ping(ping).await {
|
||||
Ok(resp) => {
|
||||
let seed_id_hex = hex::encode(&resp.node_id[..resp.node_id.len().min(4)]);
|
||||
info!(
|
||||
seed = %seed_addr,
|
||||
seed_id = %seed_id_hex,
|
||||
"Seed reachable, cluster join successful"
|
||||
);
|
||||
contacted += 1;
|
||||
}
|
||||
Err(e) => {
|
||||
warn!(seed = %seed_addr, error = %e, "Seed ping failed");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if contacted == 0 {
|
||||
warn!("No seeds reachable — starting as isolated node (anti-entropy will sync later)");
|
||||
} else {
|
||||
info!(contacted, "Joined cluster via seeds");
|
||||
}
|
||||
|
||||
self.joined.store(true, Ordering::SeqCst);
|
||||
Ok(())
|
||||
}
|
||||
|
||||
|
||||
@ -1,6 +1,10 @@
|
||||
//! Build script for stemedb-rpc that generates gRPC code from proto files.
|
||||
|
||||
fn main() -> Result<(), Box<dyn std::error::Error>> {
|
||||
// Only re-run when these inputs change; without this, cargo re-runs on every build.
|
||||
println!("cargo:rerun-if-changed=proto/sync.proto");
|
||||
println!("cargo:rerun-if-changed=build.rs");
|
||||
|
||||
tonic_build::configure()
|
||||
.build_server(true)
|
||||
.build_client(true)
|
||||
|
||||
@ -22,10 +22,10 @@ use crate::error::Result;
|
||||
use async_trait::async_trait;
|
||||
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
|
||||
use std::sync::Arc;
|
||||
use std::time::Instant;
|
||||
use std::time::{Duration, Instant};
|
||||
use stemedb_core::types::HlcTimestamp;
|
||||
use stemedb_rpc::proto::GossipRequest;
|
||||
use stemedb_rpc::SyncClient;
|
||||
use stemedb_rpc::{RetryConfig, SyncClient};
|
||||
use tokio::sync::Mutex;
|
||||
use tracing::{debug, info, instrument, warn};
|
||||
|
||||
@ -113,9 +113,19 @@ impl GossipBroadcaster {
|
||||
pub async fn with_fanout(peer_addrs: Vec<String>, fanout: usize) -> Result<Self> {
|
||||
let mut clients = Vec::with_capacity(peer_addrs.len());
|
||||
|
||||
// Gossip-specific retry config: shorter backoff than default.
|
||||
// Gossip is best-effort; 3 retries with 500ms→5s backoff keeps
|
||||
// messages from silently dropping during 30s pod-restart windows.
|
||||
let gossip_retry = RetryConfig {
|
||||
max_retries: 3,
|
||||
initial_backoff: Duration::from_millis(500),
|
||||
max_backoff: Duration::from_secs(5),
|
||||
};
|
||||
|
||||
for addr in &peer_addrs {
|
||||
match SyncClient::connect(addr).await {
|
||||
Ok(client) => {
|
||||
let client = client.with_retry_config(gossip_retry.clone());
|
||||
info!(peer = %addr, "Connected to peer for gossip");
|
||||
clients.push(Arc::new(client));
|
||||
}
|
||||
|
||||
@ -6,6 +6,7 @@
|
||||
|
||||
| Need to... | Go to |
|
||||
|------------|-------|
|
||||
| **Deploy to k3s (100 projects)** | [k3s Deploy Roadmap](./deployment/k8s-deploy-roadmap.md) |
|
||||
| **Deploy for the first time** | [Single-Node Pilot Architecture](./reference-architecture/single-node-pilot.md) |
|
||||
| **Troubleshoot an incident** | [Operational Runbooks](./runbooks/) |
|
||||
| **Scale to production** | [Three-Node Cluster Architecture](./reference-architecture/three-node-cluster.md) |
|
||||
@ -130,4 +131,4 @@ Submit pull requests to keep this guide current and valuable.
|
||||
|
||||
---
|
||||
|
||||
**Last Updated:** 2026-02-11
|
||||
**Last Updated:** 2026-03-02
|
||||
|
||||
711
docs/operations/deployment/k8s-deploy-roadmap.md
Normal file
@ -0,0 +1,711 @@
|
||||
# k3s Deploy Roadmap: StemeDB + Aphoria → 100 Projects
|
||||
|
||||
**Target:** Production deployment on k3s-fleet with Longhorn, cert-manager, External Secrets, Prometheus/Grafana, Traefik.
|
||||
**Timeline:** 3 weeks to ship-ready for 100 projects.
|
||||
|
||||
---
|
||||
|
||||
## Ship Blockers (P0) — Must Fix Before Any Project Onboards
|
||||
|
||||
### ~~1. Auth router not wired in production~~ ✅ RESOLVED (2026-03-02)
|
||||
|
||||
`create_router_full_protection_full_config` is now called when `STEMEDB_AUTH_ENABLED=true`.
|
||||
Router dispatch checks `bootstrap::is_auth_enabled()` first — full protection stack activates
|
||||
in production. Metering-only path still available when auth is disabled (local dev).
|
||||
|
||||
**Resolution:** `crates/stemedb-api/src/main.rs` updated.
|
||||
|
||||
---
|
||||
|
||||
### ~~2. `STEMEDB_UNSAFE_SKIP_SIGNATURES` startup guard missing~~ ✅ RESOLVED (2026-03-02)
|
||||
|
||||
Startup guard added: if `STEMEDB_UNSAFE_SKIP_SIGNATURES=true` and `STEMEDB_AUTH_ENABLED=true`,
the server logs a fatal error and exits with code 1. The misconfiguration is caught at boot instead of being silently accepted.
|
||||
|
||||
**Resolution:** `crates/stemedb-api/src/main.rs` updated.
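
The guard described above amounts to a small predicate over two flags. A minimal sketch, assuming both values are compared against the literal string `true`; the function name `check_startup_flags` is hypothetical and is not taken from `main.rs`:

```rust
/// Hypothetical helper: returns `Err` with a message when the two flags
/// form the forbidden combination, `Ok(())` otherwise.
/// Assumption: only the exact string "true" counts as enabled.
fn check_startup_flags(
    skip_signatures: Option<&str>,
    auth_enabled: Option<&str>,
) -> Result<(), String> {
    let is_true = |value: Option<&str>| value == Some("true");

    if is_true(skip_signatures) && is_true(auth_enabled) {
        return Err(
            "STEMEDB_UNSAFE_SKIP_SIGNATURES=true is not allowed when STEMEDB_AUTH_ENABLED=true"
                .to_string(),
        );
    }
    Ok(())
}
```

In `main`, the two values would presumably come from `std::env::var(...).ok()`, and an `Err` would be logged before calling `std::process::exit(1)`.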
|
||||
|
||||
---
|
||||
|
||||
### ~~3. Bootstrap key not seeded from env on fresh PVC~~ ✅ RESOLVED (2026-03-02)
|
||||
|
||||
`bootstrap::bootstrap_root_api_key()` is now called at startup (after the IngestWorker spawn).
It reads `STEMEDB_ROOT_API_KEY` and is idempotent: a no-op if the key already exists in the store.
Any failure is treated as a fatal error.
|
||||
|
||||
**Resolution:** `crates/stemedb-api/src/main.rs` updated.
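
The idempotent seeding behaviour can be illustrated with an in-memory stand-in. This is only a sketch: `InMemoryKeyStore`, `BootstrapOutcome`, and `seed_root_key` are hypothetical names, the real store presumably persists a hash rather than the raw key, and treating an unset variable as an error is an assumption.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the real API key store (raw keys, in memory only).
#[derive(Default)]
struct InMemoryKeyStore {
    keys: HashSet<String>,
}

#[derive(Debug, PartialEq)]
enum BootstrapOutcome {
    Seeded,
    AlreadyPresent,
}

/// Seeds the root key once; calling it again with the same key is a no-op.
/// `root_key` models the value of `STEMEDB_ROOT_API_KEY` (None = unset).
fn seed_root_key(
    store: &mut InMemoryKeyStore,
    root_key: Option<&str>,
) -> Result<BootstrapOutcome, String> {
    let key = match root_key {
        Some(k) if !k.is_empty() => k,
        // Assumption: a missing or empty key is reported as a failure.
        _ => return Err("STEMEDB_ROOT_API_KEY is not set".to_string()),
    };

    // `HashSet::insert` returns false when the value was already present.
    if store.keys.insert(key.to_string()) {
        Ok(BootstrapOutcome::Seeded)
    } else {
        Ok(BootstrapOutcome::AlreadyPresent)
    }
}
```

Because the second call reports `AlreadyPresent` instead of failing, a pod restart on an existing PVC and a first boot on a fresh PVC can share the same startup path.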
|
||||
|
||||
---
|
||||
|
||||
### ~~4. No k8s manifests — StemeDB cannot be deployed to k3s~~ ✅ RESOLVED (2026-03-02)
|
||||
|
||||
Manifests deployed to `k3s-fleet/deployments/k8s/base/stemedb/` (single `stemedb.yaml` following
|
||||
`tidaldb/` pattern). Includes ExternalSecret, PVC (50Gi Longhorn), Deployment (Recreate, non-root,
|
||||
all probes), ClusterIP Service, Traefik Ingress at `stemedb.threesix.ai`.
|
||||
|
||||
**Remaining manual steps:** Build + push image, create GCP secret, add DNS record (see the Pre-Deploy Checklist below).
|
||||
|
||||
---
|
||||
|
||||
### ~~5. Image registry — k3s cannot pull without a registry~~ ✅ RESOLVED (2026-03-02)
|
||||
|
||||
Registry confirmed: `us-central1-docker.pkg.dev/orchard9/docker-images/` (GAR).
|
||||
`imagePullSecrets: gcr-secret` wired in Deployment. Dockerfile updated with `--features aphoria`.
|
||||
|
||||
**Remaining manual step:** `docker build && docker push` to populate the image.
|
||||
|
||||
---
|
||||
|
||||
## Pre-Deploy Checklist (Manual Steps Before `kubectl apply`)
|
||||
|
||||
```bash
|
||||
# 1. Build and push image (from stemedb repo root)
|
||||
docker build -t us-central1-docker.pkg.dev/orchard9/docker-images/stemedb-api:latest .
|
||||
docker push us-central1-docker.pkg.dev/orchard9/docker-images/stemedb-api:latest
|
||||
|
||||
# 2. Create root API key in GCP Secret Manager
|
||||
ROOT_KEY="steme_live_$(openssl rand -hex 24)"
|
||||
echo "Root key: $ROOT_KEY" # Save this — needed for provision-project-keys.sh
|
||||
echo -n "$ROOT_KEY" | gcloud secrets create stemedb-root-api-key \
|
||||
--project=orchard9 --replication-policy=automatic --data-file=-
|
||||
|
||||
# 3. Add DNS: stemedb.threesix.ai → Traefik LB IP (Cloudflare)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Original Manifest Spec (archived for reference)
|
||||
|
||||
The following was the original spec. Actual implementation is in `k3s-fleet/deployments/k8s/base/stemedb/stemedb.yaml`.
|
||||
|
||||
Create `deployments/k8s/base/stemedb/` with the following files:
|
||||
|
||||
**`namespace.yaml`**
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Namespace
|
||||
metadata:
|
||||
name: stemedb
|
||||
```
|
||||
|
||||
**`pvc.yaml`** — Two PVCs to isolate WAL fsync from LSM compaction I/O
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: PersistentVolumeClaim
|
||||
metadata:
|
||||
name: stemedb-wal
|
||||
namespace: stemedb
|
||||
annotations:
|
||||
volumeType: longhorn
|
||||
spec:
|
||||
accessModes: [ReadWriteOnce]
|
||||
storageClassName: longhorn
|
||||
resources:
|
||||
requests:
|
||||
storage: 20Gi
|
||||
---
|
||||
apiVersion: v1
|
||||
kind: PersistentVolumeClaim
|
||||
metadata:
|
||||
name: stemedb-db
|
||||
namespace: stemedb
|
||||
annotations:
|
||||
volumeType: longhorn
|
||||
spec:
|
||||
accessModes: [ReadWriteOnce]
|
||||
storageClassName: longhorn
|
||||
resources:
|
||||
requests:
|
||||
storage: 50Gi
|
||||
```
|
||||
|
||||
> Set `numberOfReplicas: 2` in Longhorn StorageClass (not default 3) to halve cross-node fsync amplification.
|
||||
|
||||
**`deployment.yaml`** — Critical spec decisions annotated
|
||||
```yaml
|
||||
apiVersion: apps/v1
|
||||
kind: Deployment
|
||||
metadata:
|
||||
name: stemedb-api
|
||||
namespace: stemedb
|
||||
spec:
|
||||
replicas: 1 # Non-negotiable. Embedded KV requires exclusive volume access.
|
||||
strategy:
|
||||
type: Recreate # NOT RollingUpdate. RWO PVC + 2 pods = deadlock.
|
||||
selector:
|
||||
matchLabels:
|
||||
app: stemedb-api
|
||||
template:
|
||||
metadata:
|
||||
labels:
|
||||
app: stemedb-api
|
||||
annotations:
|
||||
prometheus.io/scrape: "true"
|
||||
prometheus.io/port: "18180"
|
||||
prometheus.io/path: "/metrics"
|
||||
spec:
|
||||
securityContext:
|
||||
runAsNonRoot: true
|
||||
runAsUser: 1000
|
||||
fsGroup: 1000
|
||||
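# readOnlyRootFilesystem is a container-level field (containers[].securityContext) and is not valid here; the root FS is writable by default and the WAL writes to /data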
|
||||
terminationGracePeriodSeconds: 30 # Let in-flight WAL writes complete.
|
||||
containers:
|
||||
- name: stemedb-api
|
||||
image: <REGISTRY>/stemedb-api:latest
|
||||
ports:
|
||||
- containerPort: 18180
|
||||
env:
|
||||
- name: STEMEDB_BIND_ADDR
|
||||
value: "0.0.0.0:18180"
|
||||
- name: STEMEDB_WAL_DIR
|
||||
value: /data/wal
|
||||
- name: STEMEDB_DB_DIR
|
||||
value: /data/db
|
||||
- name: STEMEDB_METER_ENABLED
|
||||
value: "true"
|
||||
- name: STEMEDB_ROOT_API_KEY
|
||||
valueFrom:
|
||||
secretKeyRef:
|
||||
name: stemedb-secrets
|
||||
key: root-api-key
|
||||
resources:
|
||||
requests:
|
||||
cpu: "500m"
|
||||
memory: "1Gi"
|
||||
limits:
|
||||
cpu: "2000m"
|
||||
memory: "4Gi"
|
||||
startupProbe: # WAL replay can take 60s after crash — do not skip this.
|
||||
httpGet:
|
||||
path: /v1/health
|
||||
port: 18180
|
||||
periodSeconds: 5
|
||||
failureThreshold: 12 # 60s total window before k8s kills pod
|
||||
livenessProbe:
|
||||
httpGet:
|
||||
path: /v1/health
|
||||
port: 18180
|
||||
periodSeconds: 15
|
||||
failureThreshold: 3
|
||||
readinessProbe:
|
||||
httpGet:
|
||||
path: /v1/health
|
||||
port: 18180
|
||||
periodSeconds: 5
|
||||
failureThreshold: 3
|
||||
volumeMounts:
|
||||
- name: wal
|
||||
mountPath: /data/wal
|
||||
- name: db
|
||||
mountPath: /data/db
|
||||
volumes:
|
||||
- name: wal
|
||||
persistentVolumeClaim:
|
||||
claimName: stemedb-wal
|
||||
- name: db
|
||||
persistentVolumeClaim:
|
||||
claimName: stemedb-db
|
||||
```
|
||||
|
||||
**`service.yaml`**
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Service
|
||||
metadata:
|
||||
name: stemedb-api
|
||||
namespace: stemedb
|
||||
spec:
|
||||
selector:
|
||||
app: stemedb-api
|
||||
ports:
|
||||
- port: 18180
|
||||
targetPort: 18180
|
||||
type: ClusterIP
|
||||
```
|
||||
|
||||
**`ingress.yaml`** — Traefik terminates TLS; do NOT set `STEMEDB_TLS_CERT_PATH`
|
||||
```yaml
|
||||
apiVersion: networking.k8s.io/v1
|
||||
kind: Ingress
|
||||
metadata:
|
||||
name: stemedb-api
|
||||
namespace: stemedb
|
||||
annotations:
|
||||
traefik.ingress.kubernetes.io/router.entrypoints: websecure
|
||||
traefik.ingress.kubernetes.io/router.middlewares: stemedb-ratelimit@kubernetescrd
|
||||
cert-manager.io/cluster-issuer: letsencrypt-prod
|
||||
spec:
|
||||
ingressClassName: traefik
|
||||
rules:
|
||||
- host: stemedb.yourdomain.com
|
||||
http:
|
||||
paths:
|
||||
- path: /
|
||||
pathType: Prefix
|
||||
backend:
|
||||
service:
|
||||
name: stemedb-api
|
||||
port:
|
||||
number: 18180
|
||||
tls:
|
||||
- hosts:
|
||||
- stemedb.yourdomain.com
|
||||
secretName: stemedb-tls
|
||||
```
|
||||
|
||||
**`middleware.yaml`** — Traefik rate limit (global, before app-level limits)
|
||||
```yaml
|
||||
apiVersion: traefik.containo.us/v1alpha1
|
||||
kind: Middleware
|
||||
metadata:
|
||||
name: ratelimit
|
||||
namespace: stemedb
|
||||
spec:
|
||||
rateLimit:
|
||||
average: 500
|
||||
burst: 1000
|
||||
period: 1s
|
||||
```
|
||||
|
||||
**`external-secret.yaml`** — Pull from GCP Secret Manager via External Secrets Operator
|
||||
```yaml
|
||||
apiVersion: external-secrets.io/v1beta1
|
||||
kind: ExternalSecret
|
||||
metadata:
|
||||
name: stemedb-secrets
|
||||
namespace: stemedb
|
||||
spec:
|
||||
refreshInterval: 1h
|
||||
secretStoreRef:
|
||||
name: gcp-secret-manager # adjust to your cluster's SecretStore name
|
||||
kind: ClusterSecretStore
|
||||
target:
|
||||
name: stemedb-secrets
|
||||
data:
|
||||
- secretKey: root-api-key
|
||||
remoteRef:
|
||||
key: stemedb-root-api-key
|
||||
```
|
||||
|
||||
**`kustomization.yaml`**
|
||||
```yaml
|
||||
apiVersion: kustomize.config.k8s.io/v1beta1
|
||||
kind: Kustomization
|
||||
resources:
|
||||
- namespace.yaml
|
||||
- pvc.yaml
|
||||
- deployment.yaml
|
||||
- service.yaml
|
||||
- ingress.yaml
|
||||
- middleware.yaml
|
||||
- external-secret.yaml
|
||||
```
|
||||
|
||||
**Deploy:**
|
||||
```bash
|
||||
kubectl apply -k deployments/k8s/base/stemedb/
|
||||
kubectl rollout status deployment/stemedb-api -n stemedb
|
||||
curl https://stemedb.yourdomain.com/v1/health
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 Checklist (Week 1 — Gate: First Project Can Connect)
|
||||
|
||||
| # | Task | File(s) | Status |
|
||||
|---|------|---------|--------|
|
||||
| 1 | Wire auth router in `main.rs` | `crates/stemedb-api/src/main.rs` | ✅ Done |
|
||||
| 2 | Add `STEMEDB_UNSAFE_SKIP_SIGNATURES` startup guard | `crates/stemedb-api/src/main.rs` | ✅ Done |
|
||||
| 3 | Add bootstrap key seed from `STEMEDB_ROOT_API_KEY` | `crates/stemedb-api/src/main.rs` | ✅ Done |
|
||||
| 4 | Add `--features aphoria` to Dockerfile | `Dockerfile` | ✅ Done |
|
||||
| 5 | Create k8s manifests | `k3s-fleet/.../stemedb/` | ✅ Done |
|
||||
| 6 | Write `scripts/provision-project-keys.sh` | `scripts/` | ✅ Done |
|
||||
| 7 | Build + push Docker image | GAR | ⏳ Manual |
|
||||
| 8 | Store root API key in GCP Secret Manager | GCP Console | ⏳ Manual |
|
||||
| 9 | Add DNS record: `stemedb.threesix.ai` | Cloudflare | ⏳ Manual |
|
||||
| 10 | Deploy to k3s + smoke test | k3s-fleet | ⏳ Pending |
|
||||
|
||||
**Gate test (run after deploy):**
|
||||
```bash
|
||||
# Health check
|
||||
curl https://stemedb.threesix.ai/v1/health
|
||||
|
||||
# Unauthenticated write → 401
|
||||
curl -s -o /dev/null -w "%{http_code}" -X POST \
|
||||
https://stemedb.threesix.ai/v1/assert -H "Content-Type: application/json" -d '{}'
|
||||
|
||||
# Authenticated write → 200/201
|
||||
curl -X POST https://stemedb.threesix.ai/v1/assert \
|
||||
-H "X-API-Key: $ROOT_KEY" -H "Content-Type: application/json" \
|
||||
-d '{"subject":"test/ping","predicate":"alive","value":true,"agent_id":"test"}'
|
||||
|
||||
# Confirm key persists across restart
|
||||
kubectl rollout restart deployment/stemedb-api -n stemedb
|
||||
kubectl rollout status deployment/stemedb-api -n stemedb --timeout=120s
|
||||
curl https://stemedb.threesix.ai/v1/health
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Production Hardening (Week 2 — Gate: 10 Projects)
|
||||
|
||||
### Backup CronJob
|
||||
|
||||
Create `deployments/k8s/base/stemedb/backup-cronjob.yaml`:
|
||||
|
||||
```yaml
|
||||
apiVersion: batch/v1
|
||||
kind: CronJob
|
||||
metadata:
|
||||
name: stemedb-backup
|
||||
namespace: stemedb
|
||||
spec:
|
||||
schedule: "0 */6 * * *" # Every 6 hours
|
||||
concurrencyPolicy: Forbid
|
||||
jobTemplate:
|
||||
spec:
|
||||
template:
|
||||
spec:
|
||||
restartPolicy: OnFailure
|
||||
containers:
|
||||
- name: backup
|
||||
image: rclone/rclone:latest
|
||||
command:
|
||||
- /bin/sh
|
||||
- -c
|
||||
- |
|
||||
# WAL: copy all completed segments (all except the last, which is locked)
|
||||
SEGMENTS=$(ls /data/wal/*.wal 2>/dev/null | sort | head -n -1)
|
||||
if [ -n "$SEGMENTS" ]; then
|
||||
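# Ordered filter rules (first match wins): skip the active segment, then take the other *.wal files
rclone copy /data/wal/ gcs:$BACKUP_BUCKET/wal/ \
--filter "- $(ls /data/wal/*.wal | sort | tail -n 1 | xargs basename)" --filter "+ *.wal" --filter "- *"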
|
||||
fi
|
||||
# DB snapshot
|
||||
rclone copy /data/db/ gcs:$BACKUP_BUCKET/db/$(date -u +%Y%m%dT%H%M%SZ)/
|
||||
echo "Backup complete"
|
||||
env:
|
||||
- name: BACKUP_BUCKET
|
||||
value: stemedb-backups # your GCS bucket name
|
||||
volumeMounts:
|
||||
- name: wal
|
||||
mountPath: /data/wal
|
||||
readOnly: true
|
||||
- name: db
|
||||
mountPath: /data/db
|
||||
readOnly: true
|
||||
- name: rclone-config
|
||||
mountPath: /config/rclone
|
||||
volumes:
|
||||
- name: wal
|
||||
persistentVolumeClaim:
|
||||
claimName: stemedb-wal
|
||||
- name: db
|
||||
persistentVolumeClaim:
|
||||
claimName: stemedb-db
|
||||
- name: rclone-config
|
||||
secret:
|
||||
secretName: rclone-gcs-config
|
||||
```
|
||||
|
||||
**Test backup manually:**
|
||||
```bash
|
||||
kubectl create job --from=cronjob/stemedb-backup backup-test -n stemedb
|
||||
kubectl logs -l job-name=backup-test -n stemedb -f
|
||||
```
|
||||
|
||||
### Monitoring — Wire into Prometheus
|
||||
|
||||
**`service-monitor.yaml`**
|
||||
```yaml
|
||||
apiVersion: monitoring.coreos.com/v1
|
||||
kind: ServiceMonitor
|
||||
metadata:
|
||||
name: stemedb-api
|
||||
namespace: stemedb
|
||||
labels:
|
||||
release: prometheus # must match your Prometheus Operator label selector
|
||||
spec:
|
||||
selector:
|
||||
matchLabels:
|
||||
app: stemedb-api
|
||||
endpoints:
|
||||
- targetPort: 18180 # `port` expects a named Service port; the Service in this spec exposes an unnamed one
|
||||
path: /metrics
|
||||
interval: 15s
|
||||
```
|
||||
|
||||
**`alert-rules.yaml`** — 6 alerts that fire first at 100-project scale
|
||||
```yaml
|
||||
apiVersion: monitoring.coreos.com/v1
|
||||
kind: PrometheusRule
|
||||
metadata:
|
||||
name: stemedb-alerts
|
||||
namespace: stemedb
|
||||
labels:
|
||||
release: prometheus
|
||||
spec:
|
||||
groups:
|
||||
- name: stemedb.rules
|
||||
rules:
|
||||
- alert: StemeDBPodNotRunning
|
||||
expr: absent(up{job="stemedb-api"} == 1)
|
||||
for: 2m
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
summary: "StemeDB pod is not running"
|
||||
|
||||
- alert: StemeDBWALLatencyHigh
|
||||
expr: histogram_quantile(0.99, rate(stemedb_wal_fsync_latency_seconds_bucket[5m])) > 0.05
|
||||
for: 5m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
summary: "WAL fsync p99 > 50ms — Longhorn I/O degradation likely"
|
||||
|
||||
- alert: StemeDBDataVolumeNearlyFull
|
||||
expr: |
|
||||
kubelet_volume_stats_used_bytes{persistentvolumeclaim=~"stemedb-.*"}
|
||||
/ kubelet_volume_stats_capacity_bytes{persistentvolumeclaim=~"stemedb-.*"}
|
||||
> 0.75
|
||||
for: 5m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
summary: "StemeDB PVC usage > 75% — resize requires downtime"
|
||||
|
||||
- alert: StemeDBRateLimitSaturating
|
||||
expr: rate(stemedb_http_requests_total{status="429"}[5m]) > 1
|
||||
for: 5m
|
||||
labels:
|
||||
severity: warning
|
||||
annotations:
|
||||
summary: "429 rate > 1/s — projects hitting rate limits"
|
||||
|
||||
- alert: StemeDBErrorRateHigh
|
||||
expr: |
|
||||
sum(rate(stemedb_http_requests_total{status=~"5.."}[5m]))
/ sum(rate(stemedb_http_requests_total[5m]))
|
||||
> 0.01
|
||||
for: 5m
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
summary: "5xx error rate > 1%"
|
||||
|
||||
- alert: StemeDBOOMKilled
|
||||
expr: |
|
||||
kube_pod_container_status_last_terminated_reason{
|
||||
container="stemedb-api",
|
||||
reason="OOMKilled"
|
||||
} > 0
|
||||
labels:
|
||||
severity: critical
|
||||
annotations:
|
||||
summary: "StemeDB container OOM killed — increase memory limit or find leak"
|
||||
```
|
||||
|
||||
### NetworkPolicy + PDB
|
||||
|
||||
**`network-policy.yaml`**
|
||||
```yaml
|
||||
apiVersion: networking.k8s.io/v1
|
||||
kind: NetworkPolicy
|
||||
metadata:
|
||||
name: stemedb-api
|
||||
namespace: stemedb
|
||||
spec:
|
||||
podSelector:
|
||||
matchLabels:
|
||||
app: stemedb-api
|
||||
policyTypes: [Ingress, Egress]
|
||||
ingress:
|
||||
- from:
|
||||
- namespaceSelector:
|
||||
matchLabels:
|
||||
kubernetes.io/metadata.name: kube-system # Traefik
|
||||
- namespaceSelector:
|
||||
matchLabels:
|
||||
kubernetes.io/metadata.name: monitoring # Prometheus
|
||||
ports:
|
||||
- port: 18180
|
||||
egress:
|
||||
- ports:
|
||||
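- port: 53 # DNS over UDP (protocol defaults to TCP when omitted)
protocol: UDP
- port: 53 # DNS over TCP (large responses)
protocol: TCP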
|
||||
- port: 443 # GCP APIs (backup, secrets)
|
||||
```
|
||||
|
||||
**`pdb.yaml`**
|
||||
```yaml
|
||||
apiVersion: policy/v1
|
||||
kind: PodDisruptionBudget
|
||||
metadata:
|
||||
name: stemedb-api
|
||||
namespace: stemedb
|
||||
spec:
|
||||
maxUnavailable: 0
|
||||
selector:
|
||||
matchLabels:
|
||||
app: stemedb-api
|
||||
```
|
||||
|
||||
### Phase 2 Checklist
|
||||
|
||||
| # | Task | File(s) | Est |
|
||||
|---|------|---------|-----|
|
||||
| 1 | Deploy backup CronJob | `deployments/k8s/base/stemedb/backup-cronjob.yaml` | 2h |
|
||||
| 2 | Create GCS bucket + rclone Secret | GCP Console | 1h |
|
||||
| 3 | Wire ServiceMonitor into Prometheus | `service-monitor.yaml` | 1h |
|
||||
| 4 | Deploy 6 alert rules | `alert-rules.yaml` | 1h |
|
||||
| 5 | Add NetworkPolicy + PDB | `network-policy.yaml`, `pdb.yaml` | 1h |
|
||||
| 6 | Fix Longhorn PVC reclaim policy in DR runbook | `docs/operations/runbooks/disaster-recovery.md` | 30m |
|
||||
|
||||
**Gate test:** Kill pod → `StemeDBPodNotRunning` fires within 2 min. Run backup job manually → GCS has files.
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Scale to 100 Projects (Week 3)
|
||||
|
||||
### Per-project key provisioning script
|
||||
|
||||
Create `scripts/provision-project-keys.sh`:
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
# Usage: ./provision-project-keys.sh projects.txt
|
||||
# projects.txt: one project name per line
|
||||
|
||||
STEMEDB_URL="${STEMEDB_URL:-https://stemedb.yourdomain.com}"
|
||||
ADMIN_KEY="${STEMEDB_ADMIN_KEY:?Set STEMEDB_ADMIN_KEY}"
|
||||
PROJECTS_FILE="${1:?Usage: $0 <projects-file>}"
|
||||
|
||||
while IFS= read -r project; do
|
||||
[[ -z "$project" ]] && continue
|
||||
|
||||
echo "Provisioning key for: $project"
|
||||
|
||||
response=$(curl -sf -X POST "$STEMEDB_URL/v1/admin/api-keys" \
|
||||
-H "X-API-Key: $ADMIN_KEY" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d "{\"label\":\"project-$project\",\"role\":\"write_agent\"}")
|
||||
|
||||
key=$(echo "$response" | jq -r '.key // empty')
[[ -n "$key" ]] || { echo "  No key returned for: $project" >&2; exit 1; }
|
||||
|
||||
# Store in GCP Secret Manager
|
||||
echo -n "$key" | gcloud secrets create "stemedb-key-$project" \
|
||||
--data-file=- \
|
||||
--replication-policy=automatic 2>/dev/null \
|
||||
|| echo -n "$key" | gcloud secrets versions add "stemedb-key-$project" --data-file=-
|
||||
|
||||
echo " Key stored: stemedb-key-$project"
|
||||
done < "$PROJECTS_FILE"
|
||||
|
||||
echo "Done."
|
||||
```
|
||||
|
||||
**Onboarding runbook for each project:**
|
||||
```bash
|
||||
# 1. Retrieve key from Secret Manager
|
||||
gcloud secrets versions access latest --secret="stemedb-key-<project>"
|
||||
|
||||
# 2. Update the project's Aphoria config (.aphoria/config.toml)
|
||||
cat >> .aphoria/config.toml <<EOF
|
||||
[hosted]
|
||||
url = "https://stemedb.yourdomain.com"
|
||||
api_key_env = "STEMEDB_API_KEY"
|
||||
EOF
|
||||
|
||||
# 3. Export key in CI/CD env
|
||||
# STEMEDB_API_KEY=steme_live_<value>
|
||||
```
|
||||
|
||||
### Aphoria retry logic (P1)
|
||||
|
||||
Projects run `aphoria scan --persist` locally and call the remote StemeDB. During StemeDB pod
|
||||
restarts (Recreate strategy = brief downtime), Aphoria should retry rather than fail the commit.
|
||||
|
||||
> This is a change to the `aphoria` binary, not to StemeDB. Add 3-attempt exponential backoff
|
||||
> (2s, 4s, 8s) on HTTP 502/503 responses in the Aphoria HTTP client.
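
One way to read the note above is an initial request followed by up to three retries, waiting 2s, 4s, and 8s. The sketch below shows that policy using only the standard library. `send_with_retry`, `backoff_delay`, and the injected `sleep` callback are illustrative names, not Aphoria's actual client API, and the request itself is modelled as a closure returning an HTTP status code.

```rust
use std::time::Duration;

/// Assumed interpretation: three retries after the first attempt.
const MAX_RETRIES: u32 = 3;

/// Delay before retry number `attempt` (1-based): 2s, 4s, 8s.
fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(1u64 << attempt)
}

/// Only gateway-style failures are retried, as the note suggests.
fn is_retryable(status: u16) -> bool {
    matches!(status, 502 | 503)
}

/// Calls `send` once, then retries while the status is retryable and
/// retries remain. `sleep` is injected so the policy can be exercised
/// without real waiting; production code could pass `std::thread::sleep`.
fn send_with_retry<F, S>(mut send: F, mut sleep: S) -> u16
where
    F: FnMut() -> u16,
    S: FnMut(Duration),
{
    let mut status = send();
    for attempt in 1..=MAX_RETRIES {
        if !is_retryable(status) {
            break;
        }
        sleep(backoff_delay(attempt));
        status = send();
    }
    status
}
```

With these delays the worst case adds 14 seconds before the scan gives up, which is shorter than the 60s startup probe window, so a slow WAL replay could still exhaust the retries.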
|
||||
|
||||
### Phase 3 Checklist
|
||||
|
||||
| # | Task | File(s) | Est |
|
||||
|---|------|---------|-----|
|
||||
| 1 | Run provision script for all 100 projects | `scripts/provision-project-keys.sh` | 2h |
|
||||
| 2 | Write per-project onboarding runbook | `docs/operations/onboarding-project.md` | 1h |
|
||||
| 3 | Add retry logic to `aphoria` HTTP client | `applications/aphoria/` | 2h |
|
||||
| 4 | Split WAL + DB into two PVCs (migration) | `deployments/k8s/base/stemedb/` | 2h |
|
||||
|
||||
**Gate test:** 5 projects scan simultaneously with their own keys → each isolated → one rate-limited → others unaffected.
|
||||
|
||||
---
|
||||
|
||||
## What NOT to Build Yet
|
||||
|
||||
| Item | Why not |
|
||||
|------|---------|
|
||||
| HPA | StemeDB is stateful (embedded KV). Cannot scale horizontally. |
|
||||
| mTLS between pods | Single service. Add when you have a second service. |
|
||||
| WAF | Body limits + Traefik rate limit + circuit breaker is sufficient for 100 known projects. |
|
||||
| Per-tenant namespaces | Multiplies operational surface 100x. API key isolation is the right model. |
|
||||
| Multi-region / clustering | 3-node k3s + Longhorn 2-replica is your HA story. P6 in roadmap. |
|
||||
| PITR with WAL timestamps | 6-hour backup RPO is acceptable for pilot. Improve later. |
|
||||
| Secrets rotation automation | Manual rotation via `/v1/admin/api-keys/:hash/rotate` is fine for 100 projects. |
|
||||
| Distributed tracing | You have one service. WAL fsync histogram covers what you need. |
|
||||
|
||||
---
|
||||
|
||||
## Open Questions (Resolve Week 1)
|
||||
|
||||
1. ~~**Image registry**: Which registry does k3s-fleet already use? Check `get_service_config()` in `deploy-stack.sh`.~~ ✅ RESOLVED: GAR at `us-central1-docker.pkg.dev/orchard9/docker-images/` (see Ship Blocker 5).
|
||||
2. **Bootstrap key API**: Verify exact method signatures on `ApiKeyStore` before writing the seed logic in `main.rs`.
|
||||
3. **Aphoria scan model**: Do projects run `aphoria scan` locally (calling remote StemeDB) or as a k8s Job? Determines where retry logic lives.
|
||||
4. **GCS bucket**: Does one exist for backups, or does it need to be created?
|
||||
5. **CORS**: All router variants in `routers.rs` use `allow_origin(Any)`. Production needs this restricted to Traefik's internal domain. Add `STEMEDB_ALLOWED_ORIGINS` env var support.
|
||||
|
||||
---
|
||||
|
||||
## Risk Register

| Risk | Likelihood | Mitigation |
|------|-----------|-----------|
| Longhorn fsync latency at 100-project burst | Medium | Pin pod + volume to same node (Phase 3), `dataLocality: best-effort`; monitor WAL p99 from day 1 |
| Single-instance downtime during deploys | High (Recreate strategy) | Startup probe + maintenance window policy + Aphoria retry logic |
| Fresh PVC after disaster = 100 project keys lost | Low but catastrophic | Bootstrap key seed in `main.rs` + `provision-project-keys.sh` idempotent re-run |
| Image registry blocker | High if unresolved | Resolve Day 1; entire deployment depends on it |
| CORS vulnerability | Medium | `allow_origin(Any)` in all router variants; fix before public launch |
|
||||
|
||||
---
|
||||
|
||||
## Directory Structure After Phase 1

```
deployments/
└── k8s/
    └── base/
        └── stemedb/
            ├── kustomization.yaml
            ├── namespace.yaml
            ├── pvc.yaml
            ├── deployment.yaml
            ├── service.yaml
            ├── ingress.yaml
            ├── middleware.yaml
            └── external-secret.yaml

scripts/
└── provision-project-keys.sh   (new)
```
|
||||
|
||||
After Phase 2, add to `deployments/k8s/base/stemedb/`:

- `backup-cronjob.yaml`
- `service-monitor.yaml`
- `alert-rules.yaml`
- `network-policy.yaml`
- `pdb.yaml`
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-03-02 — Week 1 code changes complete; 3 manual steps remain before deploy*
|
||||
@ -4,14 +4,64 @@
|
||||
|
||||
**✅ RECOMMENDED FOR PRODUCTION** - Survives single node failure, automatic replication
|
||||
|
||||
> **Implementation status:** The cluster crates (`stemedb-cluster`, `stemedb-sync`, `stemedb-rpc`) are
> implemented. The k3s/Longhorn deployment path is the current production path (see Phase 2 section).
> Bare-metal deployment via config.toml is aspirational and not yet wired to the binary.
|
||||
|
||||
---
|
||||
|
||||
## Architectural Rationale: Why Gossip, Not Raft
|
||||
|
||||
This section exists because the wrong answer here — "just add Raft" — is commonly assumed and actively harmful for StemeDB's workload.
|
||||
|
||||
### The append-only insight
|
||||
|
||||
Most databases need Raft because writes can **conflict**: two nodes update the same row, and a leader must serialize them. StemeDB doesn't have this problem. Every assertion receives a **BLAKE3 content hash** as its ID. If two nodes both write the same assertion independently, they produce the same hash → the same key → identical data. There is nothing to conflict on.
|
||||
|
||||
This means the assertion write path is naturally **CRDT-like**: the system needs every node to eventually receive every assertion, but doesn't need consensus on which assertion "wins." Gossip + Merkle anti-entropy handles this correctly and efficiently. Raft would add leader-election overhead and write latency for zero benefit on the data path.
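
The property described above can be sketched in a few lines. This is a minimal illustration, not StemeDB's implementation: it uses `hashlib.blake2b` from the Python standard library as a stand-in because BLAKE3 is not in the stdlib, and the names `assertion_id`, `write`, and `merge` are hypothetical.

```python
import hashlib
from typing import Dict


def assertion_id(payload: bytes) -> str:
    # Stand-in digest (assumption): the source says BLAKE3; BLAKE2b is used
    # here only because it ships with Python.
    return hashlib.blake2b(payload, digest_size=32).hexdigest()


def write(store: Dict[str, bytes], payload: bytes) -> str:
    """Store a payload under its content hash; re-writing it is a no-op."""
    key = assertion_id(payload)
    store.setdefault(key, payload)
    return key


def merge(a: Dict[str, bytes], b: Dict[str, bytes]) -> Dict[str, bytes]:
    """Union of two stores. Equal keys imply equal payloads, so order is irrelevant."""
    merged = dict(a)
    merged.update(b)
    return merged


# Two nodes receive the same assertion independently; node B also gets another.
node_a: Dict[str, bytes] = {}
node_b: Dict[str, bytes] = {}
id_on_a = write(node_a, b'{"subject":"svc/auth","claim":"uses-tls"}')
id_on_b = write(node_b, b'{"subject":"svc/auth","claim":"uses-tls"}')
write(node_b, b'{"subject":"svc/billing","claim":"uses-tls"}')

converged = merge(node_a, node_b)
```

Because the merge is a set union keyed by content hash, it gives the same result whichever node merges first, which is why no leader is needed to order assertion writes.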
|
||||
|
||||
### What actually needs coordination
|
||||
|
||||
Not everything in StemeDB is append-only. Mutable state requires stronger guarantees:
|
||||
|
||||
| State | Type | Replication strategy | Why |
|-------|------|---------------------|-----|
| Assertions | Append-only (CRDT) | Gossip + Merkle anti-entropy | No conflicts possible by design |
| API keys | Mutable | Synchronous broadcast or coordinator | Revoked key must not be reusable |
| Quota / meter counts | Mutable counter | Coordinator node or bounded staleness | Double-spend if two nodes both allow |
| Circuit breaker state | Mutable | Synchronous broadcast | Trip/reset must propagate atomically |
| Epochs | Append-only, ordered | Gossip is sufficient | Creation order captured in content |
|
||||
|
||||
**Practical implication:** Admin operations (key management, quota changes) should be routed to a designated coordinator and synchronously acknowledged. Assertion writes and reads can go to any node with no coordination.
|
||||
|
||||
### Read scaling
|
||||
|
||||
Because Lens resolution is pure local computation on indexed data, **any node can serve any read with no coordination**. Reads scale horizontally to N nodes without inter-node communication.
|
||||
|
||||
### Write scaling (assertions)
|
||||
|
||||
Because assertion writes are idempotent by content hash, **any node can accept any write**. The cluster gateway (`stemedb-cluster`, port 18181) routes writes by subject hash shard prefix — each node owns a partition of the key space. Merkle anti-entropy ensures all nodes converge. Write throughput scales linearly with node count.
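
A rough sketch of shard-prefix routing follows, using the three shard ranges from the overview diagram (0-84, 85-169, 170-255). It rests on assumptions: that the shard prefix is the first byte of the subject hash, that BLAKE2b can stand in for BLAKE3 for illustration, and that nodes are named `node-a`, `node-b`, `node-c`. `SHARD_OWNERS`, `owner_for_prefix`, and `owner_for` are hypothetical names, not the `stemedb-cluster` API.

```python
import hashlib

# Inclusive first-byte ranges per node, mirroring the three-node diagram.
SHARD_OWNERS = [
    (0, 84, "node-a"),
    (85, 169, "node-b"),
    (170, 255, "node-c"),
]


def shard_prefix(subject: str) -> int:
    # Assumption: prefix = first byte of the subject hash (BLAKE2b stand-in).
    return hashlib.blake2b(subject.encode("utf-8"), digest_size=32).digest()[0]


def owner_for_prefix(prefix: int) -> str:
    for low, high, node in SHARD_OWNERS:
        if low <= prefix <= high:
            return node
    raise ValueError(f"prefix {prefix} is outside 0-255")


def owner_for(subject: str) -> str:
    """Pick the node that owns a subject's shard; deterministic per subject."""
    return owner_for_prefix(shard_prefix(subject))


target = owner_for("svc/auth")
```

Routing depends only on the subject, so every gateway instance picks the same owner without consulting the others.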
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
The three-node cluster provides high availability through automatic replication (factor 2) and CRDT-based eventual consistency. Survives single node failure with <5 minute recovery time.
|
||||
The three-node cluster provides high availability through automatic replication (factor 2) and gossip-based eventual consistency for assertions. Survives single node failure with <5 minute recovery time.
|
||||
|
||||
```
[See: diagrams/three-node.txt for ASCII diagram]
                  ┌──────────────────────────────┐
Internet ──→ LB → │ Cluster Gateway (port 18181) │
                  │ Reads: round-robin any node  │
                  │ Writes: route by shard prefix│
                  └──────┬────────────┬──────────┘
                         │            │
              ┌──────────▼──┐    ┌────▼─────────┐      ┌──────────────┐
              │ Node A      │    │ Node B       │      │ Node C       │
              │ Shard 0-84  │    │ Shard 85-169 │      │ Shard 170-255│
              │ WAL + KV    │    │ WAL + KV     │      │ WAL + KV     │
              └──────┬──────┘    └────┬─────────┘      └───────┬──────┘
                     │    Merkle sync (gossip, port 18183)     │
                     └─────────────────────────────────────────┘
```
|
||||
|
||||
---
|
||||
@ -92,7 +142,49 @@ Each node runs the full stack:
|
||||
|
||||
---
|
||||
|
||||
## Deployment Steps
|
||||
## k3s Deployment Path (Current — Longhorn + StatefulSet)
|
||||
|
||||
> This is the **current production deployment path** for k3s-fleet. The bare-metal steps below
> are for non-k8s environments and use a config.toml interface that is not yet wired to the binary.
|
||||
|
||||
For each cluster node on k3s, deploy a separate StatefulSet with its own Longhorn PVC:
|
||||
|
||||
```
k3s-fleet/deployments/k8s/base/stemedb/
├── stemedb.yaml        # Node A (current single-node, Phase 1)
├── stemedb-b.yaml      # Node B (Phase 2, add when ready to scale reads)
├── stemedb-c.yaml      # Node C (Phase 2)
└── kustomization.yaml
```
|
||||
|
||||
**Critical k3s constraints:**

- Each node needs its own `ReadWriteOnce` Longhorn PVC; embedded KV (fjall) cannot share a volume
- Use `strategy: Recreate` on each Deployment (not RollingUpdate): RWO PVC + 2 pods = deadlock
- Cluster gateway (port 18181) must be exposed as a separate Service for inter-node routing
- Use `topologySpreadConstraints` to ensure nodes land on different k3s worker hosts
|
||||
|
||||
**Phase 2 read-replica k8s addition (when ready):**
|
||||
```yaml
# Add to stemedb-b.yaml; identical to stemedb.yaml except:
#   - Different node ID env var
#   - STEMEDB_CLUSTER_SEEDS pointing to Node A's gateway ClusterIP
#   - Its own PVC claim
env:
  - name: STEMEDB_NODE_ID
    value: "node-b"
  - name: STEMEDB_CLUSTER_SEEDS
    value: "stemedb-api.stemedb.svc:18181"
```
|
||||
|
||||
**See:** [k8s Deploy Roadmap](../deployment/k8s-deploy-roadmap.md) for the phased rollout plan.
|
||||
|
||||
---
|
||||
|
||||
## Bare-Metal Deployment Steps
|
||||
|
||||
> ⚠️ The config.toml cluster configuration shown here is **planned** and not yet wired to the
> `stemedb-api` binary. Current binary configuration uses environment variables only. This section
> documents the intended interface for when cluster config is implemented.
|
||||
|
||||
### Prerequisites
|
||||
|
||||
@ -234,27 +326,29 @@ scrape_configs:
|
||||
|
||||
### Two Nodes Fail (Catastrophic)
|
||||
|
||||
**Impact:** Read-only mode (no writes accepted)
|
||||
**Impact:** Single surviving node continues accepting assertion writes and serving reads. Admin operations (key management) are degraded — single-node has no peer to synchronously acknowledge.
|
||||
|
||||
**Recovery:**
|
||||
1. Manual intervention required
|
||||
2. Restore third node or add new node
|
||||
3. Trigger Merkle sync
|
||||
4. Resume writes when quorum restored
|
||||
1. Manual intervention required to restore cluster
|
||||
2. Restore failed nodes or add new nodes
|
||||
3. Trigger Merkle sync (`/cluster/sync` endpoint) after nodes rejoin
|
||||
4. Admin operations fully restored when cluster membership is repaired
|
||||
|
||||
**RTO:** 30 minutes - 2 hours (manual)
|
||||
**Data loss:** Potential (depends on which nodes failed)
|
||||
**RTO:** 30 minutes - 2 hours (manual restore)
|
||||
**Data loss:** Assertion writes continue on surviving node and merge on recovery. Recent admin operations (key revocations) issued during degraded window may not have propagated — audit after recovery.
|
||||
|
||||
### Network Partition
|
||||
|
||||
**Impact:** Split brain possible (both sides accept writes)
|
||||
**Impact:**
|
||||
- **Assertion writes:** Both partitions accept writes independently. This is safe — same content → same BLAKE3 hash, different content → different hashes that merge cleanly after partition heals.
|
||||
- **Admin operations (API key revocations, quota changes):** A revocation issued to one partition is invisible to the other until partition heals. A revoked key may still be honored by nodes in the other partition during the partition window.
|
||||
|
||||
**Recovery:**
|
||||
- CRDT merge resolves conflicts automatically
|
||||
- Lenses (Recency, Authority) handle conflicts at read time
|
||||
- No manual intervention needed after partition heals
|
||||
- Merkle anti-entropy detects and fills gaps automatically when partition heals
|
||||
- Lenses (Recency, Authority) handle any assertion-level divergence at read time
|
||||
- Admin state re-synchronizes via coordinator broadcast on reconnect
|
||||
|
||||
**Data loss:** None (CRDTs preserve all writes)
|
||||
**Data loss:** None for assertions (all writes from both partitions preserved and merged).
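
How a healed partition converges can be sketched as a digest comparison followed by a two-way fill. This is a deliberately flattened version of Merkle anti-entropy (one level of per-bucket digests instead of a tree), under assumptions: assertion IDs are hex content hashes, buckets are keyed by the first byte of the ID, and BLAKE2b stands in for BLAKE3. `bucket_digests` and `sync` are hypothetical names, not the `stemedb-sync` API.

```python
import hashlib
from typing import Dict, Set


def assertion_id(payload: bytes) -> str:
    # Stand-in for the BLAKE3 content hash (assumption; BLAKE2b is in the stdlib).
    return hashlib.blake2b(payload, digest_size=32).hexdigest()


def bucket_of(key: str) -> int:
    return int(key[:2], 16)  # first byte of the hex ID


def bucket_digests(store: Dict[str, bytes]) -> Dict[int, str]:
    """One digest per bucket, computed over that bucket's sorted keys."""
    hashers = {}
    for key in sorted(store):
        hasher = hashers.setdefault(bucket_of(key), hashlib.blake2b(digest_size=16))
        hasher.update(key.encode("ascii"))
    return {bucket: h.hexdigest() for bucket, h in hashers.items()}


def sync(local: Dict[str, bytes], remote: Dict[str, bytes]) -> Set[int]:
    """Exchange only the buckets whose digests differ; return those buckets."""
    mine, theirs = bucket_digests(local), bucket_digests(remote)
    differing = {b for b in set(mine) | set(theirs) if mine.get(b) != theirs.get(b)}
    for src, dst in ((local, remote), (remote, local)):
        for key, payload in list(src.items()):
            if bucket_of(key) in differing:
                dst.setdefault(key, payload)  # same key means same payload
    return differing


# Each side of the partition accepted one write the other never saw.
left = {assertion_id(p): p for p in (b"shared", b"left-only")}
right = {assertion_id(p): p for p in (b"shared", b"right-only")}
repaired = sync(left, right)
```

After `sync`, both stores hold all three assertions. Mutable admin state such as key revocations is not covered by this kind of merge, which is why it needs the separate coordinator path.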
|
||||
|
||||
### Replication Lag
|
||||
|
||||
@ -284,9 +378,10 @@ scrape_configs:
|
||||
|
||||
**Target:** 1,000 assertions/sec sustained
|
||||
|
||||
- Each node accepts writes
|
||||
- Replication happens asynchronously
|
||||
- No coordination required (CRDTs)
|
||||
- Each node accepts assertion writes (routed by cluster gateway via shard prefix)
|
||||
- Replication happens asynchronously via Merkle gossip
|
||||
- No coordination required for assertions (CRDT-safe by content hash)
|
||||
- Admin writes (API keys, quota changes) route to coordinator and require synchronous acknowledgment — expect higher latency on those operations (~50ms vs ~5ms for assertions)
|
||||
|
||||
### Replication Lag
|
||||
|
||||
@ -384,6 +479,21 @@ Compare to single-node ($87/month): 5x cost for 10x availability
|
||||
|
||||
---
|
||||
|
||||
## Scaling Path Beyond Three Nodes
|
||||
|
||||
Three nodes on k3s handles the 100-project target. For mass traffic beyond that, the scaling path is incremental — not a rearchitecture:
|
||||
|
||||
| Phase | Target | Work type | What changes |
|-------|--------|-----------|-------------|
| **Phase 1** | 1 node, 100 projects | ✅ Done | Single Deployment, Longhorn PVC, auth wired |
| **Phase 2** | 3 nodes, read-scaled | Ops-heavy | Add 2 read replicas as separate Deployments; cluster gateway routes reads round-robin |
| **Phase 3** | 3 nodes, write-sharded | Code-heavy | Gateway enforces shard ownership; each node owns ⅓ of subject hash space; reads still any-node |
| **Phase 4** | N nodes, coordinator | Code-heavy | Designate one node (or small 3-node Raft group) exclusively for mutable admin state; assertion nodes are pure data |
|
||||
|
||||
**What you do NOT need:** Raft on the assertion write path. The append-only, content-addressed design means there are no write conflicts to serialize. Raft belongs only on the mutable admin state path (Phase 4), which is a small fraction of total traffic.
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [Single-Node Pilot](./single-node-pilot.md) - Simpler architecture
|
||||
@ -394,4 +504,4 @@ Compare to single-node ($87/month): 5x cost for 10x availability
|
||||
|
||||
---
|
||||
|
||||
**Last Updated:** 2026-02-11
|
||||
**Last Updated:** 2026-03-02 — Added architectural rationale (gossip vs Raft), k3s deployment path, fixed mutable-state coordination notes, added 4-phase scaling table
|
||||
|
||||
54
scripts/provision-project-keys.sh
Executable file
@ -0,0 +1,54 @@
|
||||
#!/usr/bin/env bash
# provision-project-keys.sh: create per-project API keys and store them in GCP Secret Manager
#
# Usage: STEMEDB_ADMIN_KEY=steme_live_... ./scripts/provision-project-keys.sh projects.txt
#   projects.txt: one project slug per line (e.g. "my-app", "another-project")
#
# Requires: curl, jq, gcloud (authenticated)

set -euo pipefail

STEMEDB_URL="${STEMEDB_URL:-https://stemedb.threesix.ai}"
ADMIN_KEY="${STEMEDB_ADMIN_KEY:?Set STEMEDB_ADMIN_KEY to a root/admin API key}"
PROJECTS_FILE="${1:?Usage: $0 <projects-file>}"
GCP_PROJECT="${GCP_PROJECT:-orchard9}"

echo "Provisioning keys against: $STEMEDB_URL"
echo "GCP project for secrets: $GCP_PROJECT"
echo ""

# The `|| [[ -n "$project" ]]` guard keeps the last slug when the file has no trailing newline.
while IFS= read -r project || [[ -n "$project" ]]; do
  # Skip blank lines and comment lines.
  [[ -z "$project" || "$project" =~ ^# ]] && continue

  echo "→ Provisioning: $project"

  # Create a write_agent key for this project; on failure, report and move on.
  response=$(curl -sf -X POST "$STEMEDB_URL/v1/admin/api-keys" \
    -H "X-API-Key: $ADMIN_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"environment\":\"live\",\"label\":\"project-$project\",\"role\":\"write_agent\"}") \
    || { echo "  ERROR: API call failed for $project"; continue; }

  key=$(echo "$response" | jq -r '.key')

  if [[ -z "$key" || "$key" == "null" ]]; then
    echo "  ERROR: no key returned for $project"
    continue
  fi

  # Store the key as a new version if the secret exists, otherwise create the secret.
  # printf avoids the portability quirks of `echo -n`.
  secret_name="stemedb-key-$project"
  if gcloud secrets describe "$secret_name" --project="$GCP_PROJECT" &>/dev/null; then
    printf '%s' "$key" | gcloud secrets versions add "$secret_name" \
      --project="$GCP_PROJECT" --data-file=-
    echo "  Updated existing secret: $secret_name"
  else
    printf '%s' "$key" | gcloud secrets create "$secret_name" \
      --project="$GCP_PROJECT" \
      --replication-policy=automatic \
      --data-file=-
    echo "  Created new secret: $secret_name"
  fi
done < "$PROJECTS_FILE"

echo ""
echo "Done. Projects retrieve their keys with:"
echo "  gcloud secrets versions access latest --secret=stemedb-key-<project> --project=$GCP_PROJECT"
|
||||