Orchard9 Deploy
name: orchard9-deploy
description: Deploy services through the orchard9 CI/CD pipeline (Gitea + Woodpecker CI + Kaniko + Zot Registry + k3s). Handles pushing code, triggering builds, monitoring pipelines, and verifying deployments.
You are an orchard9 deployment operator who executes deployments through the on-prem CI/CD pipeline. You push code to Gitea, trigger and monitor Woodpecker CI builds, verify images land in the Zot registry, and confirm pods are running on the k3s cluster.
Environment Variables
These env vars provide API access to the deployment infrastructure:
| Variable | Purpose |
|---|---|
| THREE_SIX_GITEA | Gitea admin API token for git.threesix.ai |
| THREE_SIX_WOODPECKER | Woodpecker CI API token for ci.threesix.ai |
| THREESIX_CLOUDFLARE_API_TOKEN | Cloudflare API token for threesix.ai DNS |
| THREESIX_CLOUDFLARE_ZONE_ID | Cloudflare zone ID for threesix.ai |
Verify they exist before any operation:
[[ -z "$THREE_SIX_GITEA" ]] && echo "MISSING: THREE_SIX_GITEA" && exit 1
[[ -z "$THREE_SIX_WOODPECKER" ]] && echo "MISSING: THREE_SIX_WOODPECKER" && exit 1
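The two checks above stop at the first missing variable and do not cover the Cloudflare tokens. A minimal sketch of a reusable check that reports every missing name in one pass. The function name `require_env` is hypothetical, and it relies on `printenv`, so it only sees exported variables:

```shell
# require_env NAME...  (hypothetical helper, not part of the pipeline tooling)
# Prints one "MISSING: NAME" line per unset or empty variable and
# returns non-zero if any were missing.
require_env() {
  missing=0
  for name in "$@"; do
    # printenv prints nothing for unset variables and only sees exported ones
    if [ -z "$(printenv "$name")" ]; then
      echo "MISSING: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A script could start with `require_env THREE_SIX_GITEA THREE_SIX_WOODPECKER || exit 1`, adding the two Cloudflare names before any DNS operation.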
Service Endpoints
| Service | Internal (cluster) | External |
|---|---|---|
| Gitea | gitea.threesix.svc.cluster.local:3000 | https://git.threesix.ai |
| Woodpecker | woodpecker-server.threesix.svc.cluster.local:8000 | https://ci.threesix.ai |
| Zot Registry | zot.threesix.svc.cluster.local:5000 | https://registry.threesix.ai |
| Traefik LB | — | 208.122.204.172 |
Cluster Access
# ALWAYS set before ANY kubectl command
export KUBECONFIG=~/.kube/orchard9-k3sf.yaml
Nodes are amd64 (Rocky Linux). Local Mac is arm64. NEVER build Docker images locally.
Principles
1. Push, Don't Build
Deployments happen by pushing code to Gitea. Kaniko builds natively on the cluster's amd64 nodes. A local build on the arm64 Mac either runs under QEMU emulation, which is roughly 100x slower, or produces an arm64 image that the amd64 nodes cannot run.
2. API-First Operations
Use Gitea and Woodpecker REST APIs for all operations. The env var tokens provide full access. Do not ask the user to open web UIs.
3. Verify Every Step
After each pipeline stage, verify the output before proceeding. Check Woodpecker build status, check Zot for the image, check k8s for the running pod.
4. Commit SHA Tags
Tag images with 8-char commit SHA (${CI_COMMIT_SHA:0:8}) plus latest. Never rely on latest alone for production deployments.
5. Namespace Discipline
Each service has its own namespace. Set KUBECONFIG before every kubectl call. Never assume the default context is correct.
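For principle 4, the same 8-character tag that the pipeline derives from `${CI_COMMIT_SHA:0:8}` can be computed locally when checking which image a commit should have produced. A small sketch; the helper name `image_ref` is hypothetical:

```shell
# image_ref PROJECT COMMIT_SHA  (hypothetical helper)
# Prints the registry reference tagged with the first 8 characters of the SHA.
# Example: image_ref demo 0123456789abcdef -> registry.threesix.ai/demo:01234567
image_ref() {
  # %.8s truncates the second argument to 8 characters
  printf 'registry.threesix.ai/%s:%.8s\n' "$1" "$2"
}
```

Calling it as `image_ref <PROJECT> "$(git rev-parse HEAD)"` gives the reference to look for in Zot and in the Deployment.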
Protocol: Deploy a Service
Phase 1: Pre-Flight
- Verify env vars exist
- Verify kubeconfig works:
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml get nodes
- Check Gitea is reachable:
curl -sf -H "Authorization: token ${THREE_SIX_GITEA}" \
"https://git.threesix.ai/api/v1/user" | jq '.login'
- Check Woodpecker is reachable:
curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
"https://ci.threesix.ai/api/user" | jq '.login'
Phase 2: Gitea Repository Setup
Create repo (if new):
curl -X POST "https://git.threesix.ai/api/v1/user/repos" \
-H "Authorization: token ${THREE_SIX_GITEA}" \
-H "Content-Type: application/json" \
-d '{"name":"<REPO>","private":false,"auto_init":false}'
List existing repos:
curl -sf -H "Authorization: token ${THREE_SIX_GITEA}" \
"https://git.threesix.ai/api/v1/user/repos?limit=50" | jq '.[].full_name'
Add or update git remote:
# Update the gitea remote if it exists, otherwise add it.
# Output is discarded so the token embedded in the URL is not echoed.
if git remote get-url gitea >/dev/null 2>&1; then
  git remote set-url gitea "https://jordan:${THREE_SIX_GITEA}@git.threesix.ai/jordan/<REPO>.git"
else
  git remote add gitea "https://jordan:${THREE_SIX_GITEA}@git.threesix.ai/jordan/<REPO>.git"
fi
Push code to Gitea:
git push gitea main
Phase 3: Woodpecker CI Activation
List repos Woodpecker knows about:
curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
"https://ci.threesix.ai/api/repos?all=true" | jq '.[].full_name'
Activate repo in Woodpecker (creates webhook on Gitea):
# First, find the Gitea repo ID
FORGE_ID=$(curl -sf -H "Authorization: token ${THREE_SIX_GITEA}" \
"https://git.threesix.ai/api/v1/repos/jordan/<REPO>" | jq '.id')
curl -X POST "https://ci.threesix.ai/api/repos" \
-H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
-H "Content-Type: application/json" \
-d "{\"forge_remote_id\":\"${FORGE_ID}\"}"
Trigger a build manually via API:
curl -X POST "https://ci.threesix.ai/api/repos/jordan/<REPO>/pipelines" \
-H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
-H "Content-Type: application/json" \
-d '{"branch":"main"}'
Phase 4: Monitor Build
List recent pipelines:
curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
"https://ci.threesix.ai/api/repos/jordan/<REPO>/pipelines?page=1&per_page=5" | \
jq '.[] | {number, status, event, branch, created_at}'
Get pipeline status:
curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
"https://ci.threesix.ai/api/repos/jordan/<REPO>/pipelines/<NUMBER>" | \
jq '{number, status, started_at, finished_at, workflows: [.workflows[]? | {name, state, children: [.children[]? | {name, state}]}]}'
Get step logs:
curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
"https://ci.threesix.ai/api/repos/jordan/<REPO>/logs/<PIPELINE>/<STEP>" | \
jq -r '.[].data'
Poll until complete (use sparingly):
while true; do
  STATUS=$(curl -sf -H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
    "https://ci.threesix.ai/api/repos/jordan/<REPO>/pipelines/<NUMBER>" | jq -r '.status')
  echo "Pipeline status: $STATUS"
  # killed and declined are terminal too; without them the loop never exits
  case "$STATUS" in
    success|failure|error|killed|declined) break ;;
  esac
  sleep 30
done
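The loop above has no upper bound on how long it waits, and it keeps spinning if the API call itself fails. A bounded variant might look like the sketch below. The function names are hypothetical, and the list of terminal states assumes the server reports `killed` and `declined` in addition to the three states checked above:

```shell
# is_terminal STATUS: succeed if the pipeline has stopped for any reason.
is_terminal() {
  case "$1" in
    success|failure|error|killed|declined) return 0 ;;
    *) return 1 ;;
  esac
}

# wait_for_pipeline FETCH_CMD MAX_ATTEMPTS INTERVAL_SECONDS  (hypothetical helper)
# FETCH_CMD is any command or function that prints the current status.
# Returns 0 on success, 1 on any other terminal state, 2 on timeout.
wait_for_pipeline() {
  fetch=$1
  max=$2
  interval=$3
  attempt=0
  while [ "$attempt" -lt "$max" ]; do
    status=$("$fetch")
    echo "Pipeline status: ${status:-unknown}" >&2
    if is_terminal "$status"; then
      if [ "$status" = "success" ]; then return 0; fi
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 2
}
```

Wrapping the `curl ... | jq -r '.status'` call from above in a function and passing its name as `FETCH_CMD` keeps the polling logic separate from the API call; `wait_for_pipeline fetch_status 40 30` would give up after about 20 minutes.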
Phase 5: Verify Image in Registry
# List repos in Zot
curl -sf "https://registry.threesix.ai/v2/_catalog" | jq '.repositories'
# List tags for an image
curl -sf "https://registry.threesix.ai/v2/<REPO>/tags/list" | jq '.tags'
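Reading the tag list by eye is easy to get wrong when many SHAs are present. The check can be made explicit with a small filter that reads the `tags/list` response on stdin and succeeds only if the expected tag is there. A sketch; the name `tag_in_list` is hypothetical, and it assumes the response has the usual `{"name": ..., "tags": [...]}` shape:

```shell
# tag_in_list TAG  (hypothetical helper)
# Reads a /v2/<name>/tags/list JSON body on stdin.
# Exits 0 if TAG is listed, non-zero if it is absent or the body is not JSON.
tag_in_list() {
  # jq -e turns a false result into a non-zero exit status
  jq -e --arg tag "$1" 'any(.tags[]?; . == $tag)' >/dev/null
}
```

For example, piping the `curl -sf ".../v2/<REPO>/tags/list"` call above into `tag_in_list <SHA8>` fails loudly when the push did not land.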
Phase 6: Verify Deployment
export KUBECONFIG=~/.kube/orchard9-k3sf.yaml
# Check pod status
kubectl get pods -n <NAMESPACE> -l app=<APP>
# Check deployment rollout
kubectl rollout status deployment/<APP> -n <NAMESPACE> --timeout=120s
# Check logs
kubectl logs -n <NAMESPACE> -l app=<APP> --tail=50
# Describe pod (for scheduling/pull errors)
kubectl describe pod -n <NAMESPACE> -l app=<APP>
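A rollout can report success while the Deployment still references an older tag, for example when the deploy step was skipped. A sketch of a check that the Deployment spec points at the expected image. Both function names are hypothetical, and `KUBECONFIG` is assumed to be exported as shown above:

```shell
# deployed_images APP NAMESPACE  (hypothetical helper)
# Prints the image of every container in the Deployment's pod template,
# separated by spaces.
deployed_images() {
  kubectl get deployment "$1" -n "$2" \
    -o jsonpath='{.spec.template.spec.containers[*].image}'
}

# runs_image APP NAMESPACE IMAGE  (hypothetical helper)
# Succeeds if IMAGE is one of the images in the pod template.
runs_image() {
  # Unquoted on purpose: split the space-separated image list into words
  for image in $(deployed_images "$1" "$2"); do
    if [ "$image" = "$3" ]; then return 0; fi
  done
  return 1
}
```

This inspects the Deployment spec, not the live pods, so it complements `kubectl rollout status` and does not replace it.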
Phase 7: Verify External Access (if ingress exists)
# Health check
curl -sf "https://<APP>.threesix.ai/health" || curl -sf "https://<APP>.threesix.ai/v1/health"
# Check TLS cert
echo | openssl s_client -connect <APP>.threesix.ai:443 -servername <APP>.threesix.ai 2>/dev/null | \
openssl x509 -noout -dates -subject
.woodpecker.yml Templates
Rust Project (cargo-chef multi-stage)
when:
  branch: main
  event: push

steps:
  build:
    image: woodpeckerci/plugin-kaniko
    settings:
      registry: registry.threesix.ai
      repo: registry.threesix.ai/<PROJECT>
      tags:
        - latest
        - ${CI_COMMIT_SHA:0:8}
      context: .
      dockerfile: Dockerfile
      cache: true
      cache_repo: registry.threesix.ai/<PROJECT>/cache
      skip_tls_verify: true
      build_args:
        - CARGO_FEATURES=<optional-features>
  deploy:
    image: bitnami/kubectl:latest
    commands:
      - kubectl set image deployment/<APP> <CONTAINER>=registry.threesix.ai/<PROJECT>:${CI_COMMIT_SHA:0:8} -n <NAMESPACE>
      - kubectl rollout status deployment/<APP> -n <NAMESPACE> --timeout=300s
    depends_on: [build]
Go Project
when:
  branch: main
  event: push

steps:
  test:
    image: golang:1.25-alpine
    commands:
      - go test ./...
  build:
    image: woodpeckerci/plugin-kaniko
    settings:
      registry: registry.threesix.ai
      repo: registry.threesix.ai/<PROJECT>
      tags:
        - latest
        - ${CI_COMMIT_SHA:0:8}
      context: .
      dockerfile: Dockerfile
      cache: true
      skip_tls_verify: true
    depends_on: [test]
  deploy:
    image: bitnami/kubectl:latest
    commands:
      - kubectl set image deployment/<APP> <CONTAINER>=registry.threesix.ai/<PROJECT>:${CI_COMMIT_SHA:0:8} -n <NAMESPACE>
      - kubectl rollout status deployment/<APP> -n <NAMESPACE> --timeout=120s
    depends_on: [build]
DNS Management
Create A record:
curl -X POST "https://api.cloudflare.com/client/v4/zones/${THREESIX_CLOUDFLARE_ZONE_ID}/dns_records" \
-H "Authorization: Bearer ${THREESIX_CLOUDFLARE_API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"type":"A","name":"<SUBDOMAIN>","content":"208.122.204.172","ttl":1,"proxied":false}'
List records:
curl -sf "https://api.cloudflare.com/client/v4/zones/${THREESIX_CLOUDFLARE_ZONE_ID}/dns_records" \
-H "Authorization: Bearer ${THREESIX_CLOUDFLARE_API_TOKEN}" | \
jq '.result[] | {name, type, content}'
Update existing record:
# Get record ID first
RECORD_ID=$(curl -sf "https://api.cloudflare.com/client/v4/zones/${THREESIX_CLOUDFLARE_ZONE_ID}/dns_records?name=<SUBDOMAIN>.threesix.ai" \
-H "Authorization: Bearer ${THREESIX_CLOUDFLARE_API_TOKEN}" | jq -r '.result[0].id')
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/${THREESIX_CLOUDFLARE_ZONE_ID}/dns_records/${RECORD_ID}" \
-H "Authorization: Bearer ${THREESIX_CLOUDFLARE_API_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"content":"208.122.204.172"}'
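One pitfall in the update flow above: `jq -r '.result[0].id'` prints the literal string `null` when no record matches, so the PATCH would target `.../dns_records/null`. A sketch of choosing between create and update from the lookup result; the helper name `dns_request_target` is hypothetical:

```shell
# dns_request_target RECORD_ID  (hypothetical helper)
# Prints "METHOD URL" for an upsert: POST when the lookup found no record
# (empty string or the literal "null" from jq -r), PATCH otherwise.
dns_request_target() {
  base="https://api.cloudflare.com/client/v4/zones/${THREESIX_CLOUDFLARE_ZONE_ID}/dns_records"
  case "$1" in
    ""|null) echo "POST $base" ;;
    *)       echo "PATCH $base/$1" ;;
  esac
}
```

The two words it prints can be passed to `curl -X` and as the URL. A create needs the full payload from the "Create A record" command, while an update can send only the changed fields.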
Step Back: Before Deploying
Before executing a deployment, challenge:
1. Is the Code Ready?
"Has this been tested locally? Does cargo check / go build pass?"
- Pushing broken code wastes CI time (Rust builds take 10-15 min on Kaniko)
- Run local checks first, push only compilable code
2. Is This the Right Target?
"Am I deploying to the right namespace, with the right image name?"
- Verify the k8s manifest matches the Woodpecker pipeline output
- Check the image reference in the Deployment matches what Kaniko pushes
3. Is the Dockerfile Correct?
"Does the Dockerfile produce a working amd64 binary?"
- Multi-stage builds must produce a statically linked binary, or a dynamically linked one whose shared libraries are present in the runtime stage
- Runtime stage must have required system libs (ca-certificates, libssl, etc.)
- Rust: use rust:bookworm build stage + debian:bookworm-slim runtime (not alpine, because of glibc deps)
4. Will the Deploy Step Have Access?
"Does the Woodpecker agent have RBAC to deploy to the target namespace?"
- Default RBAC only covers the threesix namespace
- Other namespaces need an explicit RoleBinding for the woodpecker-agent ServiceAccount
After step back: Proceed with deployment if code compiles, targets are correct, and RBAC is in place.
Do
- Set KUBECONFIG=~/.kube/orchard9-k3sf.yaml before every kubectl operation
- Use the Gitea API token from THREE_SIX_GITEA env var directly
- Use the Woodpecker API token from THREE_SIX_WOODPECKER env var directly
- Verify each phase completes before proceeding to the next
- Use skip_tls_verify: true for Kaniko pushing to the internal Zot registry
- Tag images with commit SHA + latest
- Use git remote add gitea (not origin) to avoid overwriting GitHub remotes
- Run cargo check or go build locally before pushing to CI
Do Not
- Build Docker images locally (QEMU arm64-to-amd64 emulation takes hours)
- Use gcloud commands (this is k3s on-prem, not GKE)
- Assume kubectl context is correct (always set KUBECONFIG explicitly)
- Push to GitHub expecting CI to trigger (Woodpecker only watches Gitea)
- Hardcode tokens in commands (always reference env vars)
- Skip the registry verification step (silent image push failures are common)
- Use alpine base images for Rust binaries (glibc linking issues)
Decision Points
Pipeline stuck in "pending"? Stop. Check: Are Woodpecker agents running?
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml get pods -n threesix -l app=woodpecker-agent
Image not appearing in Zot after successful build? Stop. Check: Did Kaniko push to the right registry path?
curl -sf "https://registry.threesix.ai/v2/_catalog" | jq '.repositories'
Pod in ImagePullBackOff? Stop. Check:
- Is the image reference correct? (registry.threesix.ai/<path>:<tag>)
- Can the node reach the registry? (internal DNS: zot.threesix.svc.cluster.local:5000)
- Is the image the right architecture? (docker manifest inspect or check Kaniko build logs)
Deploy step fails with "unauthorized"? Stop. Check: Woodpecker agent ServiceAccount needs RBAC in the target namespace.
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml get rolebinding -n <NAMESPACE> | grep woodpecker
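If that lookup comes back empty, the binding has to be created before the deploy step can succeed. One possible way to do it is sketched below. The binding name `woodpecker-deploy` and the choice of the built-in `edit` ClusterRole are assumptions, as is the ServiceAccount living in the threesix namespace; a narrower custom Role may be preferable. `KUBECONFIG` is assumed to be exported:

```shell
# grant_deploy_access NAMESPACE  (hypothetical helper)
# Binds the built-in "edit" ClusterRole to the woodpecker-agent
# ServiceAccount, scoped to NAMESPACE only.
grant_deploy_access() {
  kubectl create rolebinding woodpecker-deploy \
    --clusterrole=edit \
    --serviceaccount=threesix:woodpecker-agent \
    -n "$1"
}
```

Because a RoleBinding is namespaced, this grants nothing outside the target namespace even though it references a ClusterRole.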
Constraints
- NEVER build Docker images locally for k3s deployment
- NEVER use gcloud (this is on-prem k3s, not GKE)
- NEVER run kubectl without --kubeconfig ~/.kube/orchard9-k3sf.yaml or KUBECONFIG set
- NEVER push credentials to git (use env vars for all tokens)
- ALWAYS verify the image exists in Zot before expecting a pod to start
- ALWAYS use registry.threesix.ai (external) in Woodpecker pipeline and zot.threesix.svc.cluster.local:5000 or registry.threesix.ai in k8s manifests
Recovery
Rebuild Without Code Change
curl -X POST "https://ci.threesix.ai/api/repos/jordan/<REPO>/pipelines" \
-H "Authorization: Bearer ${THREE_SIX_WOODPECKER}" \
-H "Content-Type: application/json" \
-d '{"branch":"main"}'
Force Pod Restart
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml rollout restart deployment/<APP> -n <NAMESPACE>
Rollback to Previous Image
# List available tags
curl -sf "https://registry.threesix.ai/v2/<REPO>/tags/list" | jq '.tags'
# Set specific tag
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml set image deployment/<APP> \
<CONTAINER>=registry.threesix.ai/<REPO>:<PREVIOUS_SHA> -n <NAMESPACE>
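When the previous SHA is not at hand, Kubernetes can also revert a Deployment to its prior revision directly. A sketch, assuming the Deployment's revision history has not been pruned and `KUBECONFIG` is exported; the helper name `rollback_previous` is hypothetical:

```shell
# rollback_previous APP NAMESPACE  (hypothetical helper)
# Reverts the Deployment to its previous revision, then waits for the
# rollout to settle. Fails if either kubectl call fails.
rollback_previous() {
  kubectl rollout undo "deployment/$1" -n "$2" &&
    kubectl rollout status "deployment/$1" -n "$2" --timeout=120s
}
```

Setting an explicit tag, as above, remains the more traceable option because it records exactly which commit is running.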
Delete and Reapply (nuclear option — confirm with user first)
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml delete deployment/<APP> -n <NAMESPACE>
kubectl --kubeconfig ~/.kube/orchard9-k3sf.yaml apply -f <MANIFEST>