# rdev - Remote Developer

Run Claude Code instances in isolated Kubernetes pods with REST API control. Enables bots, CI/CD systems, and external orchestrators to dispatch agentive development work to isolated environments.

**Platform:** threesix.ai - Agent-driven development at scale with shared worker pools.

## Terminology

| Term | Meaning | Location |
|------|---------|----------|
| **platform** | rdev itself (orchestrator API, handlers, workers) | `cmd/rdev-api/`, `internal/`, `pkg/api/` |
| **skeleton** | Code that ships in generated projects | `internal/adapter/templates/templates/skeleton/` |
| **component templates** | Service/worker/app/cli templates added to skeleton | `templates/components/{service,worker,cli,app-*}/` |

When discussing code: "add to **platform**" = edit rdev; "add to **skeleton**" = edit project templates.

### Database Rule

| Context | Database | Details |
|---------|----------|---------|
| **rdev platform** | PostgreSQL | API keys, audit logs, work queue, credentials (`internal/adapter/postgres/`) |
| **Generated projects (production)** | CockroachDB | Provisioned per-project by rdev (`internal/adapter/cockroach/`) |
| **Generated projects (local dev)** | PostgreSQL | Via docker-compose, wire-compatible with CockroachDB |

Both use `lib/pq` driver. The `type: postgres` component API provisions **CockroachDB** in production; the name is a legacy artifact. Skeleton SQL must be compatible with both PostgreSQL and CockroachDB.

## Find Your Guide

| If you need to... | Read this |
|-------------------|-----------|
| **Set up local dev** | [local/setup.md](.claude/guides/local/setup.md) |
| **Run tests** | [local/testing.md](.claude/guides/local/testing.md) |
| **Write Go code / handlers** | [backend/go-guidelines.md](.claude/guides/backend/go-guidelines.md) |
| **Understand pkg/api** | [packages/api-framework.md](.claude/guides/packages/api-framework.md) |
| **Add a new handler/endpoint** | [backend/adding-handlers.md](.claude/guides/backend/adding-handlers.md) |
| **Understand hexagonal architecture** | [backend/hexagonal.md](.claude/guides/backend/hexagonal.md) |
| **Deploy to k3s** | [ops/deploying.md](.claude/guides/ops/deploying.md) |
| **Release a new version** | [ops/releasing.md](.claude/guides/ops/releasing.md) |
| **Work with Kubernetes adapters** | [services/kubernetes.md](.claude/guides/services/kubernetes.md) |
| **Database / migrations** | [ops/database.md](.claude/guides/ops/database.md) |
| **Manage credentials** | [ops/credentials.md](.claude/guides/ops/credentials.md) |
| **Work queue system** | [services/work-queue.md](.claude/guides/services/work-queue.md) |
| **Worker pool management** | [services/worker-pool.md](.claude/guides/services/worker-pool.md) |
| **Project templates** | [services/templates.md](.claude/guides/services/templates.md) |
| **Composable monorepo templates** | [services/composable-monorepo.md](.claude/guides/services/composable-monorepo.md) |
| **E2E testing strategy** | [services/e2e-testing-strategy.md](.claude/guides/services/e2e-testing-strategy.md) |
| **Cookbook tree system (commands)** | [services/cookbook-trees.md](.claude/guides/services/cookbook-trees.md) |
| **Slackpath reference architectures** | [services/cookbook-trees.md](.claude/guides/services/cookbook-trees.md#slackpath-trees-reference-architectures) |
| **Write cookbook trees** | [cookbook-trees/SKILL.md](.claude/skills/cookbook-trees/SKILL.md) |
| **Build/maintain skeleton packages** | [skeleton-craftsman/SKILL.md](.claude/skills/skeleton-craftsman/SKILL.md) |
| **Build orchestration** | [services/build-orchestration.md](.claude/guides/services/build-orchestration.md) |
| **Build event streaming** | [services/build-streaming.md](.claude/guides/services/build-streaming.md) |
| **Resource provisioning plan** | [services/resource-provisioning-plan.md](.claude/guides/services/resource-provisioning-plan.md) |
| **Database provisioning** | [services/database-provisioning.md](.claude/guides/services/database-provisioning.md) |
| **Cache provisioning** | [services/cache-provisioning.md](.claude/guides/services/cache-provisioning.md) |
| **CockroachDB operations** | [services/cockroachdb.md](.claude/guides/services/cockroachdb.md) |
| **Redis operations** | [services/redis.md](.claude/guides/services/redis.md) |
| **DNS / Cloudflare** | [services/dns-cloudflare.md](.claude/guides/services/dns-cloudflare.md) |
| **Network policies / internal routing** | [ops/networking.md](.claude/guides/ops/networking.md) |
| **Debug external system health** | [ops/external-health-diagnostics.md](.claude/guides/ops/external-health-diagnostics.md) |
| **SDLC orchestration** | [services/sdlc.md](.claude/guides/services/sdlc.md) |
| **Visual verification (Playwright)** | [services/visual-verification.md](.claude/guides/services/visual-verification.md) |
| **Interactive remote development** | [services/interactive-remote-dev.md](.claude/guides/services/interactive-remote-dev.md) |
| **Gitea 1.22 / SDK / webhooks** | [ops/gitea-1.22.md](.claude/guides/ops/gitea-1.22.md) |
| **Go 1.25 features & migration** | [backend/go-1.25.md](.claude/guides/backend/go-1.25.md) |
| **Woodpecker CI v3 pipelines** | [ops/woodpecker-v3.md](.claude/guides/ops/woodpecker-v3.md) |
| **Traefik v3 ingress & middleware** | [ops/traefik-v3.md](.claude/guides/ops/traefik-v3.md) |
| **Zot container registry** | [ops/zot-registry.md](.claude/guides/ops/zot-registry.md) |
| **cert-manager / TLS certificates** | [ops/cert-manager.md](.claude/guides/ops/cert-manager.md) |
| **Notify / email delivery** | [services/notify.md](.claude/guides/services/notify.md) |
| **Structured logging** | `internal/logging/` - field constants, context propagation, redaction |
| **Update the AionUi SDK** | [SDK Update Workflow](#sdk-update-workflow) |

## SDK Update Workflow

When you add/change API endpoints in rdev, update the AionUi hand-written SDK:

```bash
# Step 1: Regenerate openapi.json in rdev
make sdk

# Step 2: Sync to AionUi and typecheck
cd /path/to/AionUi
RDEV_REPO=/path/to/rdev ./scripts/sync-rdev-sdk.sh
```

The sync script shows added/removed endpoints and fails if TypeScript breaks. You must **manually** update `src/sdk/rdev/types.ts` and `src/sdk/rdev/resources/*.ts` for new endpoints.

For CI drift detection: add `make sdk-check` to the Woodpecker pipeline.

## Critical Rules

- **Frustration = systemic fix:** When the user says they're tired of repeating something, stop what you're doing and find or create a systemic fix in `.claude/**/*` or `CLAUDE.md`; don't just apologize and do the same thing again.
- **AI credentials are provisioned:** rdev injects `LAOZHANG_API_KEY` and `GEMINI_API_KEY` as env vars into every deployed component (`component_deploy.go:fetchProjectCredentials`). Skeleton code reads them with `os.Getenv()`. Never treat AI packages as needing external setup.
- **Root cause fixes:** When diagnosing failures in generated projects, NEVER patch the project directly. Find the systemic root cause in: (1) **platform** - rdev handlers/services that create resources, (2) **skeleton** - templates that ship in generated projects, or (3) **cookbook** - test scripts with wrong assumptions. Fix the source, not the symptom. Every project-specific fix is technical debt that will recur.
- **LLM vs rdev:** LLMs generate code; rdev executes deterministic operations (git, lint, deploy). Never rely on LLMs for runbook tasks.
- **Pod git ops:** Git operations run inside pods via `PodGitOperations` (kubectl exec), never locally.
- **No dead code:** Delete unused code immediately. Don't leave "might use later" exports.
- **KUBECONFIG:** ALWAYS set `export KUBECONFIG=~/.kube/orchard9-k3sf.yaml` before kubectl commands
- **Container builds:** NEVER build Docker images locally. ALWAYS use `git push origin main` to trigger Woodpecker CI which builds via in-cluster Kaniko. Local Docker builds produce wrong architecture (arm64 vs amd64). If an image is missing from registry.threesix.ai, push to origin; don't improvise.
- **Hexagonal:** Domain models in `internal/domain/` must have ZERO external dependencies
- **Ports:** All adapters implement interfaces from `internal/port/`
- **Migrations:** NEVER modify committed migrations. Create NEW ones.
- **500-line limit:** Files exceeding 500 lines must be split
- **Tests:** All handlers and services require tests
- **No fallbacks:** NEVER design "try X, fall back to Y" flows. Fix X instead. Fallbacks hide errors and deliver inferior experiences.
- **Multi-step ops:** NEVER log-and-continue after partial failure. Rollback or document partial state.
- **Logging:** Use `logging.FromContext(ctx)` or injected `*slog.Logger`. NEVER `fmt.Println`, `log.Fatal`, `log.Printf`, or bare `slog.Info()`. Error key is ALWAYS `"error"` (not `"err"`). Use field constants from `internal/logging/fields.go` (e.g., `logging.FieldProjectID`, `logging.FieldError`). Log once at boundary (handlers/workers log, services return errors). Sensitive data (passwords, tokens, keys) is auto-redacted.
- **HTTP clients:** NEVER create `&http.Client{}` without a `Timeout` field. All HTTP clients must have explicit timeouts (30s standard, 5s for health checks). A bare client can hang indefinitely.
- **Config:** Use `envutil.GetEnv()` / `GetEnvInt()` / `GetEnvBool()` from `internal/envutil` for all env var reads with defaults. NEVER define local `getEnv` helpers; they duplicate and drift.
  Raw `os.Getenv()` is fine for required values with no default (secrets, passwords).
- **Handler timeouts:** NEVER use inline `time.Duration` in `context.WithTimeout` inside handlers. Use constants from `internal/handlers/timeouts.go`: `TimeoutFastLookup` (5s), `TimeoutLookup` (10s), `TimeoutStandard` (30s), `TimeoutHeavyWrite` (60s), `TimeoutOrchestration` (90s), `TimeoutLongRunning` (10m).
- **Worker timeouts:** NEVER use inline `time.Duration` in `context.WithTimeout` inside worker code. Use constants from `internal/worker/timeouts.go`: `TimeoutQuickOp` (5s), `TimeoutHealthCheck` (10s), `TimeoutMaintenance` (30s), `TimeoutWorkExecution` (10m).
- **Response helpers:** Use `api.WriteUnauthorized`, `api.WriteForbidden`, `api.WriteBadRequest`, `api.WriteNotFound`, `api.WriteInternalError` instead of bare `api.WriteError` with status codes. Only use `api.WriteError` directly for custom error codes (e.g., KEY_REVOKED, IP_NOT_ALLOWED).
- **Auth scopes:** EVERY route in a handler's `Mount()` function MUST use `r.With(auth.RequireScope(...))`. Use `ScopeProjectsRead` for GET endpoints, `ScopeProjectsExecute` for mutation endpoints. Use the appropriate domain scope (e.g., `ScopeQueueRead`, `ScopeBuildWrite`) when available. Admin-only endpoints use `auth.ScopeAdmin` alone. See `internal/handlers/builds.go` for the canonical pattern.
- **JSON decoding:** ALWAYS use `api.DecodeJSON(r, &req)` to decode request bodies. NEVER use raw `json.NewDecoder(r.Body).Decode()`. The helper handles nil body and EOF, and returns typed errors. Decode error message is always `"invalid request body"`.
- **Validation:** Use `validate.New()` accumulator for 2+ field checks in handlers: `v := validate.New(); v.Required(req.Name, "name"); v.Required(req.Type, "type"); if err := v.Error(); err != nil { ... }`. Single-field checks can stay inline. NEVER duplicate validation logic that exists in `internal/validate`.
- **Error wrapping:** ALWAYS use `%w` (not `%v`) when wrapping errors in `fmt.Errorf`.
  Using `%v` stringifies the error and breaks `errors.Is`/`errors.As` chains. For non-error types (structs, slices), create a typed error implementing `error` instead of stringifying with `%v`.
- **Context propagation:** NEVER use `context.Background()` in handlers, services, or adapters that receive a context parameter. Always derive from parent context. Use `context.WithoutCancel(ctx)` for fire-and-forget goroutines that need tracing but independent cancellation.
- **Cookbooks:** Load `.claude/skills/cookbook-trees/SKILL.md` before writing/modifying any cookbook tree.
- **Version alignment:** Skeleton templates MUST use consistent versions across all files: Go 1.25 (go.work, go.mod, Dockerfiles, CI images), Node 20, Alpine 3.19. When updating a version, grep the entire templates/ tree and update ALL occurrences to prevent drift.

## Quick Reference

```bash
# Environment (should already be in ~/.zshrc)
export KUBECONFIG=~/.kube/orchard9-k3sf.yaml
export RDEV_API_URL="https://rdev.masq-ops.orchard9.ai"
export RDEV_API_KEY=""  # Already set in ~/.zshrc

# Verify environment is loaded
echo $RDEV_API_KEY  # Should print a base64 string
# If empty: source ~/.zshrc

# For scripts: use cookbooks/scripts/common.sh library
# Provides: api_call(), wait_for_build(), wait_for_pipeline(), wait_for_site()
# Example: source "$(dirname "$0")/common.sh" && api_call GET "/health"

# Run locally
go run ./cmd/rdev-api

# Run tests
go test ./...

# Automated deploy (push triggers Woodpecker CI)
git push origin main  # Builds and deploys automatically via Woodpecker

# Manual deploy (if Woodpecker unavailable)
kubectl apply -f deployments/k8s/base/rdev-api.yaml
kubectl rollout restart -n rdev deployment/rdev-api
# Images are at registry.threesix.ai/rdev/{api,worker,claudebox}

# Verify pods
kubectl get pods -n rdev

# View logs
./scripts/logs.sh          # Last 100 lines
./scripts/logs.sh -f       # Follow/stream
./scripts/logs.sh -n 500   # Last 500 lines
./scripts/logs.sh -e       # Errors only
./scripts/logs.sh -p       # Previous crashed container

# Shell aliases (after source ~/.zshrc)
rdev-logs    # Last 100 lines
rdev-logs-f  # Follow/stream
rdev-pods    # List pods

# API calls - use cookbook test scripts (they handle auth via common.sh)
./cookbooks/scripts/landing-test.sh run|status|teardown
./cookbooks/scripts/tree-runner.sh run --project-name

# Or direct API calls (requires env vars above)
curl -s -H "X-API-Key: $RDEV_API_KEY" $RDEV_API_URL/health | jq
curl -s -H "X-API-Key: $RDEV_API_KEY" $RDEV_API_URL/projects | jq
```

## Architecture Overview

```
cmd/rdev-api/           # Entry point, DI, OpenAPI spec
cmd/sdlc/               # SDLC CLI binary (runs inside project pods)
internal/
├── sdlc/               # SDLC library (types, classifier, state I/O)
├── domain/             # Pure business models (no deps)
├── port/               # Interface contracts
├── service/            # Business logic orchestration
├── handlers/           # HTTP handlers (REST endpoints)
├── adapter/            # Infrastructure implementations
│   ├── kubernetes/     # K8s client, pod executor
│   ├── postgres/       # Audit, queue, webhooks, credentials
│   ├── cockroach/      # Database provisioning (project DBs)
│   ├── redis/          # Cache provisioning via ACLs
│   ├── gitea/          # Git repository management
│   ├── cloudflare/     # DNS provider
│   └── woodpecker/     # CI provider
├── auth/               # API key auth, scopes
├── middleware/         # Rate limiting
├── worker/             # Background queue processor
└── webhook/            # Event dispatcher
pkg/api/                # HTTP framework (app, responses)
deployments/k8s/        # Kustomize manifests
└── base/templates/     # Project templates
scripts/                # Operational scripts
├── load-credentials.sh # Load secrets to rdev-api
├── release.sh          # Build, tag, push releases
└── logs.sh             # View rdev-api logs
cookbooks/              # End-to-end workflow guides
├── landing-page.md     # Landing page deployment flow
└── scripts/            # Executable cookbook scripts
```

## Key Concepts

- **Projects**: Kubernetes pods with Claude Code, discovered by label `rdev.orchard9.ai/project=true`
- **Workers**: Shared claudebox pods that execute any project's tasks, labeled `rdev.orchard9.ai/role=worker`
- **Work Queue**: Async task queue for build/test/deploy jobs
- **Credentials**: Infrastructure secrets (tokens, keys) stored encrypted in PostgreSQL
- **Commands**: Claude/shell/git commands executed via kubectl exec, streamed via SSE
- **API Keys**: Scoped auth with project restrictions, IP filtering, expiration
- **Webhooks**: Event subscriptions with retry delivery
- **Templates**: Project scaffolding with .woodpecker.yml, .claude/, and stack files

## threesix.ai Platform Status

| Feature | Status | Description |
|---------|--------|-------------|
| Woodpecker Auto-Activation | **Done** | CI enabled on project creation via SDK |
| Project Templates | **Done** | Embedded templates (astro-landing, go-api, default) |
| Work Queue | **Done** | PostgreSQL with atomic dequeue, retry logic |
| Multi-Provider Agents | **Done** | Claude Code + OpenCode via registry |
| Webhooks | **Done** | Event dispatcher with retry delivery |
| Embedded Worker | **Done** | Goroutine in rdev-api, polls queue |
| Multi-Domain Support | **Done** | Auto-slugs, custom subdomains, DNS aliases |
| Build Event Streaming | **Done** | Real-time SSE/WebSocket for build output |
| Database Provisioning | **Done** | CockroachDB adapter with auto-provisioning |
| Cache Provisioning | **Done** | Redis ACL-based adapter with auto-provisioning |
| Build Orchestration | Planned | Structured build specs via API |
| SDLC Orchestration | **Done** | Deterministic feature lifecycle with classifier engine, API, orchestrator, and 15 skeleton commands |
| Composable Monorepo Templates | **Done** | Monorepo skeleton + component templates (service, worker, app-astro, app-react, cli) |
| Visual Verification | Planned | Playwright screenshots/video + AI evaluation for feature completeness |
| Checkout/Checkin | **Done** | Sidecar dev flow: temporary git tokens, branch checkout, review on checkin |
| Interactive Remote Dev | **Done** | Sessions with pod binding, command execution, SSE streaming, ephemeral preview URLs |

**Current Version:** v0.10.25

## Constraints

- **ON-PREM k3s** - not GKE, always set KUBECONFIG
- **Kustomize only** - no ArgoCD
- **chi/v5 router** - no gin, echo, or other frameworks
- **sqlx for DB** - no GORM
- **slog for logging** - no logrus, zap
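
The **Error wrapping** rule under Critical Rules can be illustrated with a small standalone sketch. It uses only the standard library; `ErrProjectNotFound`, `wrapWithW`, `wrapWithV`, and `isNotFound` are hypothetical names invented for this example and are not taken from the rdev codebase.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrProjectNotFound is a hypothetical sentinel error used only in this sketch.
var ErrProjectNotFound = errors.New("project not found")

// wrapWithW wraps err with %w, which keeps the original error in the chain.
func wrapWithW(err error) error {
	return fmt.Errorf("load project: %w", err)
}

// wrapWithV formats err with %v, which flattens it into plain text.
func wrapWithV(err error) error {
	return fmt.Errorf("load project: %v", err)
}

// isNotFound reports whether ErrProjectNotFound appears anywhere in err's chain.
func isNotFound(err error) bool {
	return errors.Is(err, ErrProjectNotFound)
}

func main() {
	fmt.Println(isNotFound(wrapWithW(ErrProjectNotFound))) // true: chain preserved
	fmt.Println(isNotFound(wrapWithV(ErrProjectNotFound))) // false: chain lost
}
```

Both wrapped errors produce the same message text; only the `%w` version can still be matched by callers with `errors.Is` or `errors.As`.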
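
As a minimal sketch of the **HTTP clients** rule under Critical Rules, the helper below always sets `Timeout` when constructing a client. The constant and function names (`standardTimeout`, `healthCheckTimeout`, `newHTTPClient`) are assumptions made for illustration; the 30s and 5s values come from the rule itself.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Hypothetical constants mirroring the durations stated in the rule.
const (
	standardTimeout    = 30 * time.Second
	healthCheckTimeout = 5 * time.Second
)

// newHTTPClient returns a client whose whole request lifecycle
// (connect, redirects, reading the body) is bounded by timeout.
func newHTTPClient(timeout time.Duration) *http.Client {
	return &http.Client{Timeout: timeout}
}

func main() {
	apiClient := newHTTPClient(standardTimeout)
	healthClient := newHTTPClient(healthCheckTimeout)
	fmt.Println(apiClient.Timeout, healthClient.Timeout) // 30s 5s
}
```

Routing construction through one helper makes a bare `&http.Client{}` easy to spot in review, since a zero `Timeout` means no limit at all.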
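
The **Context propagation** rule under Critical Rules mentions `context.WithoutCancel(ctx)` (standard library, Go 1.21+). The sketch below shows the behavior that rule relies on: the detached context keeps the parent's values but ignores its cancellation. The key type, `requestIDKey`, `detach`, and `demo` are hypothetical names for this example only, and `context.Background()` appears here solely as the process root standing in for an incoming request context.

```go
package main

import (
	"context"
	"fmt"
)

// ctxKey is a private key type, as recommended for context values.
type ctxKey string

// requestIDKey is a hypothetical key used only in this sketch.
const requestIDKey ctxKey = "request_id"

// detach returns a context that carries parent's values
// but is not cancelled when parent is cancelled.
func detach(parent context.Context) context.Context {
	return context.WithoutCancel(parent)
}

// demo cancels a parent context and reports what the detached context still sees.
func demo() (parentCancelled bool, detachedAlive bool, requestID any) {
	// Root context stands in for a request context handed to a handler.
	root := context.WithValue(context.Background(), requestIDKey, "req-123")
	parent, cancel := context.WithCancel(root)

	background := detach(parent)
	cancel() // the request finishes; fire-and-forget work should continue

	return parent.Err() != nil, background.Err() == nil, background.Value(requestIDKey)
}

func main() {
	fmt.Println(demo()) // true true req-123
}
```

Because the detached context never cancels on its own, background work started this way would normally still be bounded with its own timeout.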