# k3s Deploy Roadmap: StemeDB + Aphoria → 100 Projects

**Target:** Production deployment on k3s-fleet with Longhorn, cert-manager, External Secrets, Prometheus/Grafana, Traefik.
**Timeline:** 3 weeks to ship-ready for 100 projects.

---

## Ship Blockers (P0): Must Fix Before Any Project Onboards

### ~~1. Auth router not wired in production~~ ✅ RESOLVED (2026-03-02)

`create_router_full_protection_full_config` is now called when `STEMEDB_AUTH_ENABLED=true`. Router dispatch checks `bootstrap::is_auth_enabled()` first, so the full protection stack activates in production. The metering-only path is still available when auth is disabled (local dev).

**Resolution:** `crates/stemedb-api/src/main.rs` updated.

---

### ~~2. `STEMEDB_UNSAFE_SKIP_SIGNATURES` startup guard missing~~ ✅ RESOLVED (2026-03-02)

Startup guard added: if `STEMEDB_UNSAFE_SKIP_SIGNATURES=true` and `STEMEDB_AUTH_ENABLED=true`, the server logs a fatal error and exits with code 1. The misconfiguration is caught at boot instead of passing silently.

**Resolution:** `crates/stemedb-api/src/main.rs` updated.

---

### ~~3. Bootstrap key not seeded from env on fresh PVC~~ ✅ RESOLVED (2026-03-02)

`bootstrap::bootstrap_root_api_key()` is now called at startup (after the IngestWorker spawn). It reads `STEMEDB_ROOT_API_KEY` and is idempotent: a no-op if the key already exists in the store. A failure here is fatal.

**Resolution:** `crates/stemedb-api/src/main.rs` updated.

---

### ~~4. No k8s manifests: StemeDB cannot be deployed to k3s~~ ✅ RESOLVED (2026-03-02)

Manifests deployed to `k3s-fleet/deployments/k8s/base/stemedb/` (single `stemedb.yaml` following the `tidaldb/` pattern). Includes ExternalSecret, PVC (50Gi Longhorn), Deployment (Recreate, non-root, all probes), ClusterIP Service, and Traefik Ingress at `stemedb.threesix.ai`.

**Remaining manual steps:** Build + push image, create GCP secret, add DNS record (see the Pre-Deploy Checklist below).

---

### ~~5. Image registry: k3s cannot pull without a registry~~ ✅ RESOLVED (2026-03-02)

Registry confirmed: `us-central1-docker.pkg.dev/orchard9/docker-images/` (GAR). `imagePullSecrets: gcr-secret` wired in Deployment. Dockerfile updated with `--features aphoria`.

**Remaining manual step:** `docker build && docker push` to populate the image.

---

## Pre-Deploy Checklist (Manual Steps Before `kubectl apply`)

```bash
# 1. Build and push image (from stemedb repo root)
docker build -t us-central1-docker.pkg.dev/orchard9/docker-images/stemedb-api:latest .
docker push us-central1-docker.pkg.dev/orchard9/docker-images/stemedb-api:latest

# 2. Create root API key in GCP Secret Manager
ROOT_KEY="steme_live_$(openssl rand -hex 24)"
echo "Root key: $ROOT_KEY"  # Save this; it is needed for provision-project-keys.sh
echo -n "$ROOT_KEY" | gcloud secrets create stemedb-root-api-key \
  --project=orchard9 --replication-policy=automatic --data-file=-

# 3. Add DNS: stemedb.threesix.ai → Traefik LB IP (Cloudflare)
```

---

## Original Manifest Spec (archived for reference)

The following was the original spec. The actual implementation is in `k3s-fleet/deployments/k8s/base/stemedb/stemedb.yaml`.

Create `deployments/k8s/base/stemedb/` with the following files:

**`namespace.yaml`**

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: stemedb
```

**`pvc.yaml`**: Two PVCs to isolate WAL fsync from LSM compaction I/O

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stemedb-wal
  namespace: stemedb
  annotations:
    volumeType: longhorn
spec:
  accessModes: [ReadWriteOnce]
  storageClassName: longhorn
  resources:
    requests:
      storage: 20Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stemedb-db
  namespace: stemedb
  annotations:
    volumeType: longhorn
spec:
  accessModes: [ReadWriteOnce]
  storageClassName: longhorn
  resources:
    requests:
      storage: 50Gi
```

> Set `numberOfReplicas: 2` in the Longhorn StorageClass (not the default 3) to reduce cross-node fsync amplification.

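
One way to apply the replica note above is a dedicated StorageClass, since the parameters of an existing StorageClass cannot be edited in place. The following is a minimal sketch, not the k3s-fleet configuration: the class name `longhorn-2r`, the output file name, and the `staleReplicaTimeout` value are assumptions for illustration, and the PVCs above would need `storageClassName: longhorn-2r` to use it.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Output path is an assumption; override with SC_FILE=... if needed.
SC_FILE="${SC_FILE:-longhorn-2r-storageclass.yaml}"

# Write a Longhorn StorageClass that keeps 2 replicas per volume.
# "longhorn-2r" is a hypothetical name, not something defined in k3s-fleet.
cat > "$SC_FILE" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-2r
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "2"        # Longhorn expects a quoted string here
  staleReplicaTimeout: "2880"  # minutes; assumed value, tune for your cluster
EOF

echo "Wrote $SC_FILE; review it, then run: kubectl apply -f $SC_FILE"
```

The script only writes the manifest and prints the apply command, so it can be reviewed before anything touches the cluster.
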

**`deployment.yaml`**: Critical spec decisions annotated

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stemedb-api
  namespace: stemedb
spec:
  replicas: 1  # Non-negotiable. Embedded KV requires exclusive volume access.
  strategy:
    type: Recreate  # NOT RollingUpdate. RWO PVC + 2 pods = deadlock.
  selector:
    matchLabels:
      app: stemedb-api
  template:
    metadata:
      labels:
        app: stemedb-api
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "18180"
        prometheus.io/path: "/metrics"
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
      terminationGracePeriodSeconds: 30  # Let in-flight WAL writes complete.
      containers:
        - name: stemedb-api
          image: /stemedb-api:latest
          securityContext:
            # Container-level field; it is not valid in the pod-level securityContext.
            readOnlyRootFilesystem: false  # WAL writes to /data
          ports:
            - containerPort: 18180
          env:
            - name: STEMEDB_BIND_ADDR
              value: "0.0.0.0:18180"
            - name: STEMEDB_WAL_DIR
              value: /data/wal
            - name: STEMEDB_DB_DIR
              value: /data/db
            - name: STEMEDB_METER_ENABLED
              value: "true"
            - name: STEMEDB_ROOT_API_KEY
              valueFrom:
                secretKeyRef:
                  name: stemedb-secrets
                  key: root-api-key
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "2000m"
              memory: "4Gi"
          startupProbe:  # WAL replay can take 60s after a crash; do not skip this.
            httpGet:
              path: /v1/health
              port: 18180
            periodSeconds: 5
            failureThreshold: 12  # 60s total window before k8s kills the pod
          livenessProbe:
            httpGet:
              path: /v1/health
              port: 18180
            periodSeconds: 15
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /v1/health
              port: 18180
            periodSeconds: 5
            failureThreshold: 3
          volumeMounts:
            - name: wal
              mountPath: /data/wal
            - name: db
              mountPath: /data/db
      volumes:
        - name: wal
          persistentVolumeClaim:
            claimName: stemedb-wal
        - name: db
          persistentVolumeClaim:
            claimName: stemedb-db
```

**`service.yaml`**

```yaml
apiVersion: v1
kind: Service
metadata:
  name: stemedb-api
  namespace: stemedb
  labels:
    app: stemedb-api  # ServiceMonitor selects Services by label
spec:
  selector:
    app: stemedb-api
  ports:
    - name: http  # ServiceMonitor endpoints reference the port by name
      port: 18180
      targetPort: 18180
  type: ClusterIP
```

**`ingress.yaml`**: Traefik terminates TLS; do NOT set `STEMEDB_TLS_CERT_PATH`

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: stemedb-api
  namespace: stemedb
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    # Format is <namespace>-<middleware name>@kubernetescrd
    traefik.ingress.kubernetes.io/router.middlewares: stemedb-ratelimit@kubernetescrd
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: traefik
  rules:
    - host: stemedb.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: stemedb-api
                port:
                  number: 18180
  tls:
    - hosts:
        - stemedb.yourdomain.com
      secretName: stemedb-tls
```

**`middleware.yaml`**: Traefik rate limit (global, before app-level limits)

```yaml
apiVersion: traefik.containo.us/v1alpha1  # Traefik v3 uses traefik.io/v1alpha1
kind: Middleware
metadata:
  name: ratelimit
  namespace: stemedb
spec:
  rateLimit:
    average: 500
    burst: 1000
    period: 1s
```

**`external-secret.yaml`**: Pull from GCP Secret Manager via External Secrets Operator

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: stemedb-secrets
  namespace: stemedb
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: gcp-secret-manager  # adjust to your cluster's SecretStore name
    kind: ClusterSecretStore
  target:
    name: stemedb-secrets
  data:
    - secretKey: root-api-key
      remoteRef:
        key: stemedb-root-api-key
```

**`kustomization.yaml`**

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - pvc.yaml
  - deployment.yaml
  - service.yaml
  - ingress.yaml
  - middleware.yaml
  - external-secret.yaml
```

**Deploy:**

```bash
kubectl apply -k deployments/k8s/base/stemedb/
kubectl rollout status deployment/stemedb-api -n stemedb
curl https://stemedb.yourdomain.com/v1/health
```

---

## Phase 1 Checklist (Week 1, Gate: First Project Can Connect)

| # | Task | File(s) | Status |
|---|------|---------|--------|
| 1 | Wire auth router in `main.rs` | `crates/stemedb-api/src/main.rs` | ✅ Done |
| 2 | Add `STEMEDB_UNSAFE_SKIP_SIGNATURES` startup guard | `crates/stemedb-api/src/main.rs` | ✅ Done |
| 3 | Add bootstrap key seed from `STEMEDB_ROOT_API_KEY` | `crates/stemedb-api/src/main.rs` | ✅ Done |
| 4 | Add `--features aphoria` to Dockerfile | `Dockerfile` | ✅ Done |
| 5 | Create k8s manifests | `k3s-fleet/.../stemedb/` | ✅ Done |
| 6 | Write `scripts/provision-project-keys.sh` | `scripts/` | ✅ Done |
| 7 | Build + push Docker image | GAR | ⏳ Manual |
| 8 | Store root API key in GCP Secret Manager | GCP Console | ⏳ Manual |
| 9 | Add DNS record: `stemedb.threesix.ai` | Cloudflare | ⏳ Manual |
| 10 | Deploy to k3s + smoke test | k3s-fleet | ⏳ Pending |

**Gate test (run after deploy):**

```bash
# Health check
curl https://stemedb.threesix.ai/v1/health

# Unauthenticated write → 401
curl -s -o /dev/null -w "%{http_code}" -X POST \
  https://stemedb.threesix.ai/v1/assert -H "Content-Type: application/json" -d '{}'

# Authenticated write → 200/201
curl -X POST https://stemedb.threesix.ai/v1/assert \
  -H "X-API-Key: $ROOT_KEY" -H "Content-Type: application/json" \
  -d '{"subject":"test/ping","predicate":"alive","value":true,"agent_id":"test"}'

# Confirm key persists across restart
kubectl rollout restart deployment/stemedb-api -n stemedb
kubectl rollout status deployment/stemedb-api -n stemedb --timeout=120s
curl https://stemedb.threesix.ai/v1/health
```

---

## Phase 2: Production Hardening (Week 2, Gate: 10 Projects)

### Backup CronJob

Create `deployments/k8s/base/stemedb/backup-cronjob.yaml`:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: stemedb-backup
  namespace: stemedb
spec:
  schedule: "0 */6 * * *"  # Every 6 hours
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: rclone/rclone:latest
              command:
                - /bin/sh
                - -c
                - |
                  # WAL: copy all completed segments (all except the last, which is locked)
                  SEGMENTS=$(ls /data/wal/*.wal 2>/dev/null | sort | head -n -1)
                  if [ -n "$SEGMENTS" ]; then
                    ACTIVE=$(basename "$(ls /data/wal/*.wal | sort | tail -n 1)")
                    # Ordered --filter rules; rclone advises against mixing --include and --exclude
                    rclone copy /data/wal/ "gcs:$BACKUP_BUCKET/wal/" \
                      --filter "- $ACTIVE" --filter "+ *.wal" --filter "- *"
                  fi
                  # DB snapshot
                  rclone copy /data/db/ "gcs:$BACKUP_BUCKET/db/$(date -u +%Y%m%dT%H%M%SZ)/"
                  echo "Backup complete"
              env:
                - name: BACKUP_BUCKET
                  value: stemedb-backups  # your GCS bucket name
              volumeMounts:
                - name: wal
                  mountPath: /data/wal
                  readOnly: true
                - name: db
                  mountPath: /data/db
                  readOnly: true
                - name: rclone-config
                  mountPath: /config/rclone
          volumes:
            - name: wal
              persistentVolumeClaim:
                claimName: stemedb-wal
            - name: db
              persistentVolumeClaim:
                claimName: stemedb-db
            - name: rclone-config
              secret:
                secretName: rclone-gcs-config
```

**Test backup manually:**

```bash
kubectl create job --from=cronjob/stemedb-backup backup-test -n stemedb
kubectl logs -l job-name=backup-test -n stemedb -f
```

### Monitoring: Wire into Prometheus

**`service-monitor.yaml`**

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: stemedb-api
  namespace: stemedb
  labels:
    release: prometheus  # must match your Prometheus Operator label selector
spec:
  selector:
    matchLabels:
      app: stemedb-api  # matches the label on the Service, not the pod
  endpoints:
    - port: http  # name of the Service port (18180)
      path: /metrics
      interval: 15s
```

**`alert-rules.yaml`**: 6 alerts that fire first at 100-project scale

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: stemedb-alerts
  namespace: stemedb
  labels:
    release: prometheus
spec:
  groups:
    - name: stemedb.rules
      rules:
        - alert: StemeDBPodNotRunning
          # Fires when the target is missing or present but failing scrapes
          expr: absent(up{job="stemedb-api"} == 1)
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "StemeDB pod is not running"

        - alert: StemeDBWALLatencyHigh
          expr: histogram_quantile(0.99, rate(stemedb_wal_fsync_latency_seconds_bucket[5m])) > 0.05
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "WAL fsync p99 > 50ms; Longhorn I/O degradation likely"

        - alert: StemeDBDataVolumeNearlyFull
          expr: |
            kubelet_volume_stats_used_bytes{persistentvolumeclaim=~"stemedb-.*"}
            / kubelet_volume_stats_capacity_bytes{persistentvolumeclaim=~"stemedb-.*"}
            > 0.75
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "StemeDB PVC usage > 75%; resize requires downtime"

        - alert: StemeDBRateLimitSaturating
          expr: rate(stemedb_http_requests_total{status="429"}[5m]) > 1
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "429 rate > 1/s; projects hitting rate limits"

        - alert: StemeDBErrorRateHigh
          # sum() on both sides so the ratio is 5xx over all requests,
          # not each 5xx series divided by itself
          expr: |
            sum(rate(stemedb_http_requests_total{status=~"5.."}[5m]))
            / sum(rate(stemedb_http_requests_total[5m]))
            > 0.01
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "5xx error rate > 1%"

        - alert: StemeDBOOMKilled
          expr: |
            kube_pod_container_status_last_terminated_reason{
              container="stemedb-api",
              reason="OOMKilled"
            } > 0
          labels:
            severity: critical
          annotations:
            summary: "StemeDB container OOM killed; increase memory limit or find leak"
```

### NetworkPolicy + PDB

**`network-policy.yaml`**

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: stemedb-api
  namespace: stemedb
spec:
  podSelector:
    matchLabels:
      app: stemedb-api
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system  # Traefik
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring  # Prometheus
      ports:
        - port: 18180
  egress:
    - ports:
        - port: 53  # DNS (protocol defaults to TCP, so UDP must be listed explicitly)
          protocol: UDP
        - port: 53
          protocol: TCP
        - port: 443  # GCP APIs (backup, secrets)
```

**`pdb.yaml`**

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: stemedb-api
  namespace: stemedb
spec:
  maxUnavailable: 0  # With replicas: 1 this blocks voluntary evictions, including node drains
  selector:
    matchLabels:
      app: stemedb-api
```

### Phase 2 Checklist

| # | Task | File(s) | Est |
|---|------|---------|-----|
| 1 | Deploy backup CronJob | `deployments/k8s/base/stemedb/backup-cronjob.yaml` | 2h |
| 2 | Create GCS bucket + rclone Secret | GCP Console | 1h |
| 3 | Wire ServiceMonitor into Prometheus | `service-monitor.yaml` | 1h |
| 4 | Deploy 6 alert rules | `alert-rules.yaml` | 1h |
| 5 | Add NetworkPolicy + PDB | `network-policy.yaml`, `pdb.yaml` | 1h |
| 6 | Fix Longhorn PVC reclaim policy in DR runbook | `docs/operations/runbooks/disaster-recovery.md` | 30m |

**Gate test:** Kill pod → `StemeDBPodNotRunning` fires within 2 min. Run backup job manually → GCS has files.

---

## Phase 3: Scale to 100 Projects (Week 3)

### Per-project key provisioning script

Create `scripts/provision-project-keys.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Usage: ./provision-project-keys.sh projects.txt
# projects.txt: one project name per line

STEMEDB_URL="${STEMEDB_URL:-https://stemedb.yourdomain.com}"
ADMIN_KEY="${STEMEDB_ADMIN_KEY:?Set STEMEDB_ADMIN_KEY}"
PROJECTS_FILE="${1:?Usage: $0 projects.txt}"

while IFS= read -r project; do
  [[ -z "$project" ]] && continue
  echo "Provisioning key for: $project"

  response=$(curl -sf -X POST "$STEMEDB_URL/v1/admin/api-keys" \
    -H "X-API-Key: $ADMIN_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"label\":\"project-$project\",\"role\":\"write_agent\"}")

  key=$(echo "$response" | jq -r '.key')
  # Stop before storing an empty or "null" key in Secret Manager
  if [[ -z "$key" || "$key" == "null" ]]; then
    echo "  ERROR: no key in response for $project" >&2
    exit 1
  fi

  # Store in GCP Secret Manager (create, or add a version if the secret exists)
  echo -n "$key" | gcloud secrets create "stemedb-key-$project" \
    --data-file=- \
    --replication-policy=automatic 2>/dev/null \
    || echo -n "$key" | gcloud secrets versions add "stemedb-key-$project" --data-file=-

  echo "  Key stored: stemedb-key-$project"
done < "$PROJECTS_FILE"

echo "Done."
```

**Onboarding runbook for each project:**

```bash
# 1. Retrieve key from Secret Manager
gcloud secrets versions access latest --secret="stemedb-key-<project>"

# 2. Update project's aphoria.toml
cat >> .aphoria/config.toml <
```

### Aphoria retry logic (P1)

Projects run `aphoria scan --persist` locally and call the remote StemeDB. During StemeDB pod restarts (Recreate strategy = brief downtime), Aphoria should retry rather than fail the commit.

> This is a change to the `aphoria` binary, not to StemeDB. Add 3-attempt exponential backoff
> (2s, 4s, 8s) on HTTP 502/503 responses in the Aphoria HTTP client.

### Phase 3 Checklist

| # | Task | File(s) | Est |
|---|------|---------|-----|
| 1 | Run provision script for all 100 projects | `scripts/provision-project-keys.sh` | 2h |
| 2 | Write per-project onboarding runbook | `docs/operations/onboarding-project.md` | 1h |
| 3 | Add retry logic to `aphoria` HTTP client | `applications/aphoria/` | 2h |
| 4 | Split WAL + DB into two PVCs (migration) | `deployments/k8s/base/stemedb/` | 2h |

**Gate test:** 5 projects scan simultaneously with their own keys → each isolated → one rate-limited → others unaffected.

---

## What NOT to Build Yet

| Item | Why not |
|------|---------|
| HPA | StemeDB is stateful (embedded KV). Cannot scale horizontally. |
| mTLS between pods | Single service. Add when you have a second service. |
| WAF | Body limits + Traefik rate limit + circuit breaker are sufficient for 100 known projects. |
| Per-tenant namespaces | Multiplies operational surface 100x. API key isolation is the right model. |
| Multi-region / clustering | 3-node k3s + Longhorn 2-replica is your HA story. P6 in roadmap. |
| PITR with WAL timestamps | 6-hour backup RPO is acceptable for pilot. Improve later. |
| Secrets rotation automation | Manual rotation via `/v1/admin/api-keys/:hash/rotate` is fine for 100 projects. |
| Distributed tracing | You have one service. WAL fsync histogram covers what you need. |

---

## Open Questions (Resolve Week 1)

1. **Image registry**: Which registry does k3s-fleet already use? Check `get_service_config()` in `deploy-stack.sh`.
2. **Bootstrap key API**: Verify exact method signatures on `ApiKeyStore` before writing the seed logic in `main.rs`.
3. **Aphoria scan model**: Do projects run `aphoria scan` locally (calling remote StemeDB) or as a k8s Job? Determines where retry logic lives.
4. **GCS bucket**: Does one exist for backups, or does it need to be created?
5. **CORS**: All router variants in `routers.rs` use `allow_origin(Any)`. Production needs this restricted to Traefik's internal domain. Add `STEMEDB_ALLOWED_ORIGINS` env var support.

---

## Risk Register

| Risk | Likelihood | Mitigation |
|------|-----------|-----------|
| Longhorn fsync latency at 100-project burst | Medium | Pin pod + volume to same node (Phase 3), `dataLocality: best-effort`; monitor WAL p99 from day 1 |
| Single-instance downtime during deploys | High (Recreate strategy) | Startup probe + maintenance window policy + Aphoria retry logic |
| Fresh PVC after disaster = 100 project keys lost | Low but catastrophic | Bootstrap key seed in `main.rs` + `provision-project-keys.sh` idempotent re-run |
| Image registry blocker | High if unresolved | Resolve Day 1; entire deployment depends on it |
| CORS vulnerability | Medium | `allow_origin(Any)` in all router variants; fix before public launch |

---

## Directory Structure After Phase 1

```
deployments/
└── k8s/
    └── base/
        └── stemedb/
            ├── kustomization.yaml
            ├── namespace.yaml
            ├── pvc.yaml
            ├── deployment.yaml
            ├── service.yaml
            ├── ingress.yaml
            ├── middleware.yaml
            └── external-secret.yaml
scripts/
└── provision-project-keys.sh (new)
```

After Phase 2, add to `deployments/k8s/base/stemedb/`:

- `backup-cronjob.yaml`
- `service-monitor.yaml`
- `alert-rules.yaml`
- `network-policy.yaml`
- `pdb.yaml`

---

*Last updated: 2026-03-02. Week 1 code changes complete; 3 manual steps remain before deploy.*
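
---

## Appendix: Retry Policy Sketch (illustrative)

The Aphoria retry behavior described in Phase 3 (exponential backoff of 2s, 4s, 8s on HTTP 502/503) can be sketched in shell for wrapper scripts or smoke tests. This is not the `aphoria` client change itself, which belongs in the binary. The function name `retry_on_unavailable` and the `BACKOFF_BASE` variable are hypothetical, and the sketch reads the three delays as three retries after the initial request.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: runs a command that prints an HTTP status code and
# retries it on 502/503, sleeping 2s, 4s, then 8s between attempts.
# BACKOFF_BASE (assumed knob, default 2) scales the delays.
retry_on_unavailable() {
  local base="${BACKOFF_BASE:-2}" max_retries=3 attempt=0 status delay
  while :; do
    # A command failure (e.g. curl could not connect) is reported as 000.
    status="$("$@")" || status="000"
    case "$status" in
      502|503) ;;                                # retryable: fall through
      *) printf '%s\n' "$status"; return 0 ;;    # anything else: hand back to caller
    esac
    if (( attempt >= max_retries )); then
      printf '%s\n' "$status"                    # still unavailable after all retries
      return 1
    fi
    delay=$(( base * (1 << attempt) ))           # base * 1, 2, 4
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
}

# Example (not executed here): wrap a curl call that prints only the status code.
#   post_assert() {
#     curl -s -o /dev/null -w '%{http_code}' -X POST "$STEMEDB_URL/v1/assert" \
#       -H "X-API-Key: $STEMEDB_API_KEY" -H "Content-Type: application/json" -d "$1"
#   }
#   retry_on_unavailable post_assert '{"subject":"test/ping","predicate":"alive","value":true,"agent_id":"test"}'
```

Only 502 and 503 are retried, matching the note in Phase 3; other statuses (including 401 and 429) are printed and returned to the caller unchanged, so authentication and rate-limit failures surface immediately.
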