# rdev - Remote Developer

Run Claude Code instances in isolated Kubernetes pods with REST API control. Enables bots, CI/CD systems, and external orchestrators to dispatch agentive development work to isolated environments.

**Platform:** threesix.ai - Agent-driven development at scale with shared worker pools.

## Terminology

| Term | Meaning | Location |
|------|---------|----------|
| **platform** | rdev itself (orchestrator API, handlers, workers) | `cmd/rdev-api/`, `internal/`, `pkg/api/` |
| **skeleton** | Code that ships in generated projects | `internal/adapter/templates/templates/skeleton/` |
| **component templates** | Service/worker/app/cli templates added to skeleton | `templates/components/{service,worker,cli,app-*}/` |

When discussing code: "add to **platform**" = edit rdev; "add to **skeleton**" = edit project templates.

## Find Your Guide

| If you need to... | Read this |
|-------------------|-----------|
| **Set up local dev** | [local/setup.md](.claude/guides/local/setup.md) |
| **Run tests** | [local/testing.md](.claude/guides/local/testing.md) |
| **Write Go code / handlers** | [backend/go-guidelines.md](.claude/guides/backend/go-guidelines.md) |
| **Understand pkg/api** | [packages/api-framework.md](.claude/guides/packages/api-framework.md) |
| **Add a new handler/endpoint** | [backend/adding-handlers.md](.claude/guides/backend/adding-handlers.md) |
| **Understand hexagonal architecture** | [backend/hexagonal.md](.claude/guides/backend/hexagonal.md) |
| **Deploy to k3s** | [ops/deploying.md](.claude/guides/ops/deploying.md) |
| **Release a new version** | [ops/releasing.md](.claude/guides/ops/releasing.md) |
| **Work with Kubernetes adapters** | [services/kubernetes.md](.claude/guides/services/kubernetes.md) |
| **Database / migrations** | [ops/database.md](.claude/guides/ops/database.md) |
| **Manage credentials** | [ops/credentials.md](.claude/guides/ops/credentials.md) |
| **Work queue system** | [services/work-queue.md](.claude/guides/services/work-queue.md) |
| **Worker pool management** | [services/worker-pool.md](.claude/guides/services/worker-pool.md) |
| **Project templates** | [services/templates.md](.claude/guides/services/templates.md) |
| **Composable monorepo templates** | [services/composable-monorepo.md](.claude/guides/services/composable-monorepo.md) |
| **E2E testing strategy** | [services/e2e-testing-strategy.md](.claude/guides/services/e2e-testing-strategy.md) |
| **Cookbook tree system (commands)** | [services/cookbook-trees.md](.claude/guides/services/cookbook-trees.md) |
| **Slackpath reference architectures** | [services/cookbook-trees.md](.claude/guides/services/cookbook-trees.md#slackpath-trees-reference-architectures) |
| **Write cookbook trees** | [cookbook-trees/SKILL.md](.claude/skills/cookbook-trees/SKILL.md) |
| **Build orchestration** | [services/build-orchestration.md](.claude/guides/services/build-orchestration.md) |
| **Build event streaming** | [services/build-streaming.md](.claude/guides/services/build-streaming.md) |
| **Resource provisioning plan** | [services/resource-provisioning-plan.md](.claude/guides/services/resource-provisioning-plan.md) |
| **Database provisioning** | [services/database-provisioning.md](.claude/guides/services/database-provisioning.md) |
| **Cache provisioning** | [services/cache-provisioning.md](.claude/guides/services/cache-provisioning.md) |
| **CockroachDB operations** | [services/cockroachdb.md](.claude/guides/services/cockroachdb.md) |
| **Redis operations** | [services/redis.md](.claude/guides/services/redis.md) |
| **DNS / Cloudflare** | [services/dns-cloudflare.md](.claude/guides/services/dns-cloudflare.md) |
| **Network policies / internal routing** | [ops/networking.md](.claude/guides/ops/networking.md) |
| **Debug external system health** | [ops/external-health-diagnostics.md](.claude/guides/ops/external-health-diagnostics.md) |
| **SDLC orchestration** | [services/sdlc.md](.claude/guides/services/sdlc.md) |
| **Visual verification (Playwright)** | [services/visual-verification.md](.claude/guides/services/visual-verification.md) |
| **Interactive remote development** | [services/interactive-remote-dev.md](.claude/guides/services/interactive-remote-dev.md) |
| **Structured logging** | `internal/logging/` - field constants, context propagation, redaction |

## Critical Rules

- **Root cause fixes:** When diagnosing failures in generated projects, NEVER patch the project directly. Find the systemic root cause in: (1) **platform** - rdev handlers/services that create resources, (2) **skeleton** - templates that ship in generated projects, or (3) **cookbook** - test scripts with wrong assumptions. Fix the source, not the symptom. Every project-specific fix is technical debt that will recur.
- **LLM vs rdev:** LLMs generate code; rdev executes deterministic operations (git, lint, deploy). Never rely on LLMs for runbook tasks.
- **Pod git ops:** Git operations run inside pods via `PodGitOperations` (kubectl exec), never locally.
- **No dead code:** Delete unused code immediately. Don't leave "might use later" exports.
- **KUBECONFIG:** ALWAYS set `export KUBECONFIG=~/.kube/orchard9-k3sf.yaml` before kubectl commands.
- **Container builds:** NEVER build Docker images locally. ALWAYS use `git push origin main` to trigger Woodpecker CI, which builds via in-cluster Kaniko. Local Docker builds produce the wrong architecture (arm64 vs amd64). If an image is missing from registry.threesix.ai, push to origin; don't improvise.
- **Hexagonal:** Domain models in `internal/domain/` must have ZERO external dependencies.
- **Ports:** All adapters implement interfaces from `internal/port/`.
- **Migrations:** NEVER modify committed migrations. Create NEW ones.
- **500-line limit:** Files exceeding 500 lines must be split.
- **Tests:** All handlers and services require tests.
- **Multi-step ops:** NEVER log-and-continue after partial failure. Rollback or document partial state.
- **Logging:** Use `logging.FromContext(ctx)` or injected `*slog.Logger`. NEVER `fmt.Println`, `log.Fatal`, `log.Printf`, or bare `slog.Info()`. Error key is ALWAYS `"error"` (not `"err"`). Use field constants from `internal/logging/fields.go` (e.g., `logging.FieldProjectID`, `logging.FieldError`). Log once at boundary (handlers/workers log, services return errors). Sensitive data (passwords, tokens, keys) is auto-redacted.
- **HTTP clients:** NEVER create `&http.Client{}` without a `Timeout` field. All HTTP clients must have explicit timeouts (30s standard, 5s for health checks). A bare client can hang indefinitely.
- **Config:** Use `envutil.GetEnv()` / `GetEnvInt()` / `GetEnvBool()` from `internal/envutil` for all env var reads with defaults. NEVER define local `getEnv` helpers; they duplicate and drift. Raw `os.Getenv()` is fine for required values with no default (secrets, passwords).
- **Handler timeouts:** NEVER use inline `time.Duration` in `context.WithTimeout` inside handlers. Use constants from `internal/handlers/timeouts.go`: `TimeoutFastLookup` (5s), `TimeoutLookup` (10s), `TimeoutStandard` (30s), `TimeoutHeavyWrite` (60s), `TimeoutOrchestration` (90s), `TimeoutLongRunning` (10m).
- **Worker timeouts:** NEVER use inline `time.Duration` in `context.WithTimeout` inside worker code. Use constants from `internal/worker/timeouts.go`: `TimeoutQuickOp` (5s), `TimeoutHealthCheck` (10s), `TimeoutMaintenance` (30s), `TimeoutWorkExecution` (10m).
- **Response helpers:** Use `api.WriteUnauthorized`, `api.WriteForbidden`, `api.WriteBadRequest`, `api.WriteNotFound`, `api.WriteInternalError` instead of bare `api.WriteError` with status codes. Only use `api.WriteError` directly for custom error codes (e.g., KEY_REVOKED, IP_NOT_ALLOWED).
- **Auth scopes:** EVERY route in a handler's `Mount()` function MUST use `r.With(auth.RequireScope(...))`. Use `ScopeProjectsRead` for GET endpoints, `ScopeProjectsExecute` for mutation endpoints. Use the appropriate domain scope (e.g., `ScopeQueueRead`, `ScopeBuildWrite`) when available. Admin-only endpoints use `auth.ScopeAdmin` alone. See `internal/handlers/builds.go` for the canonical pattern.
- **JSON decoding:** ALWAYS use `api.DecodeJSON(r, &req)` to decode request bodies. NEVER use raw `json.NewDecoder(r.Body).Decode()`. The helper handles nil body, EOF, and returns typed errors. Decode error message is always `"invalid request body"`.
- **Validation:** Use `validate.New()` accumulator for 2+ field checks in handlers: `v := validate.New(); v.Required(req.Name, "name"); v.Required(req.Type, "type"); if err := v.Error(); err != nil { ... }`. Single-field checks can stay inline. NEVER duplicate validation logic that exists in `internal/validate`.
- **Error wrapping:** ALWAYS use `%w` (not `%v`) when wrapping errors in `fmt.Errorf`. Using `%v` stringifies the error and breaks `errors.Is`/`errors.As` chains. For non-error types (structs, slices), create a typed error implementing `error` instead of stringifying with `%v`.
- **Context propagation:** NEVER use `context.Background()` in handlers, services, or adapters that receive a context parameter. Always derive from parent context. Use `context.WithoutCancel(ctx)` for fire-and-forget goroutines that need tracing but independent cancellation.
- **Cookbooks:** Load `.claude/skills/cookbook-trees/SKILL.md` before writing/modifying any cookbook tree.
- **Version alignment:** Skeleton templates MUST use consistent versions across all files: Go 1.25 (go.work, go.mod, Dockerfiles, CI images), Node 20, Alpine 3.19. When updating a version, grep the entire templates/ tree and update ALL occurrences to prevent drift.

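
The HTTP client and error wrapping rules above can be illustrated with a minimal, standard-library-only sketch. The constant names, `ErrProjectNotFound`, and the helper functions are hypothetical and exist only for this example; they are not taken from the rdev codebase.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
	"net/http"
	"os"
	"time"
)

// Hypothetical constants mirroring the timeouts named in the HTTP client rule.
const (
	standardHTTPTimeout    = 30 * time.Second
	healthCheckHTTPTimeout = 5 * time.Second
)

// ErrProjectNotFound is an illustrative sentinel error, not an rdev identifier.
var ErrProjectNotFound = errors.New("project not found")

// newHTTPClient always sets an explicit Timeout so a request cannot hang forever.
func newHTTPClient(timeout time.Duration) *http.Client {
	return &http.Client{Timeout: timeout}
}

// wrapWithW wraps the sentinel using the w verb, which keeps the error chain intact.
func wrapWithW(id string) error {
	return fmt.Errorf("load project %q: %w", id, ErrProjectNotFound)
}

// wrapWithV shows the anti-pattern: the v verb flattens the error to plain text.
func wrapWithV(id string) error {
	return fmt.Errorf("load project %q: %v", id, ErrProjectNotFound)
}

// isNotFound reports whether err has ErrProjectNotFound anywhere in its chain.
func isNotFound(err error) bool {
	return errors.Is(err, ErrProjectNotFound)
}

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stdout, nil))
	logger.Info("client timeouts",
		"standard", newHTTPClient(standardHTTPTimeout).Timeout,
		"health_check", newHTTPClient(healthCheckHTTPTimeout).Timeout)
	logger.Info("sentinel match",
		"wrapped_with_w", isNotFound(wrapWithW("demo")),
		"wrapped_with_v", isNotFound(wrapWithV("demo")))
}
```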
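
As a rough illustration of the context propagation and timeout-constant rules, the sketch below derives a request context from its parent with a named timeout, then detaches a background context using `context.WithoutCancel` (available since Go 1.21). `timeoutStandard`, `traceKey`, and `simulateRequest` are hypothetical stand-ins, not rdev identifiers.

```go
package main

import (
	"context"
	"log/slog"
	"os"
	"time"
)

// timeoutStandard is a hypothetical stand-in for a named handler timeout constant (30s).
const timeoutStandard = 30 * time.Second

// traceKey is an illustrative context key for a request-scoped trace ID.
type traceKey struct{}

// detachForBackground keeps the parent's values but drops its cancellation and deadline.
func detachForBackground(ctx context.Context) context.Context {
	return context.WithoutCancel(ctx)
}

// simulateRequest models a handler: it derives a timeout context from the parent,
// detaches a context for background work, then ends the request by cancelling.
func simulateRequest(id string) (requestDone, backgroundDone bool, traceID string) {
	parent := context.WithValue(context.Background(), traceKey{}, id)

	reqCtx, cancel := context.WithTimeout(parent, timeoutStandard)
	bgCtx := detachForBackground(reqCtx)
	cancel() // the request finishes here

	traceID, _ = bgCtx.Value(traceKey{}).(string)
	return reqCtx.Err() != nil, bgCtx.Err() != nil, traceID
}

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stdout, nil))
	reqDone, bgDone, traceID := simulateRequest("trace-123")
	logger.Info("context state",
		"request_done", reqDone,
		"background_done", bgDone,
		"trace_id", traceID)
}
```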
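
The logging rule (context-carried logger, `"error"` key, log once at the boundary) might look roughly like the sketch below. It uses only `log/slog`; `fromContext`, `withLogger`, and the `"project_id"` key are hypothetical approximations of what `internal/logging` provides, and the real helpers may differ.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"log/slog"
	"os"
	"strings"
)

// loggerKey is a hypothetical context key; the real internal/logging package may differ.
type loggerKey struct{}

// withLogger stores a logger on the context.
func withLogger(ctx context.Context, l *slog.Logger) context.Context {
	return context.WithValue(ctx, loggerKey{}, l)
}

// fromContext is an illustrative stand-in for logging.FromContext.
func fromContext(ctx context.Context) *slog.Logger {
	if l, ok := ctx.Value(loggerKey{}).(*slog.Logger); ok {
		return l
	}
	return slog.Default()
}

// deployService plays the service layer: it returns the error and does not log.
func deployService(projectID string) error {
	return errors.New("deployment not ready")
}

// handleDeploy plays the boundary (handler or worker): it logs exactly once,
// using the "error" key rather than "err".
func handleDeploy(ctx context.Context, projectID string) {
	if err := deployService(projectID); err != nil {
		fromContext(ctx).Error("deploy failed", "project_id", projectID, "error", err)
	}
}

// captureDeployLog runs the handler with a buffer-backed logger and returns the output.
func captureDeployLog() string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewTextHandler(&buf, nil))
	handleDeploy(withLogger(context.Background(), logger), "demo")
	return buf.String()
}

// usesErrorKey reports whether the log line carries the error under the "error" key.
func usesErrorKey(line string) bool {
	return strings.Contains(line, ` error="deployment not ready"`)
}

func main() {
	os.Stdout.WriteString(captureDeployLog())
}
```

Keeping the log call in `handleDeploy` and leaving `deployService` silent is what keeps each failure to a single log line.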
## Quick Reference

```bash
# Environment (should already be in ~/.zshrc)
export KUBECONFIG=~/.kube/orchard9-k3sf.yaml
export RDEV_API_URL="https://rdev.masq-ops.orchard9.ai"
export RDEV_API_KEY="<your-api-key>"  # Already set in ~/.zshrc

# Verify environment is loaded
echo "$RDEV_API_KEY"  # Should print a base64 string
# If empty: source ~/.zshrc

# For scripts: use cookbooks/scripts/common.sh library
# Provides: api_call(), wait_for_build(), wait_for_pipeline(), wait_for_site()
# Example: source "$(dirname "$0")/common.sh" && api_call GET "/health"

# Run locally
go run ./cmd/rdev-api

# Run tests
go test ./...

# Automated deploy (push triggers Woodpecker CI)
git push origin main  # Builds and deploys automatically via Woodpecker

# Manual deploy (if Woodpecker unavailable)
kubectl apply -f deployments/k8s/base/rdev-api.yaml
kubectl rollout restart -n rdev deployment/rdev-api

# Images are at registry.threesix.ai/rdev/{api,worker,claudebox}

# Verify pods
kubectl get pods -n rdev

# View logs
./scripts/logs.sh          # Last 100 lines
./scripts/logs.sh -f       # Follow/stream
./scripts/logs.sh -n 500   # Last 500 lines
./scripts/logs.sh -e       # Errors only
./scripts/logs.sh -p       # Previous crashed container

# Shell aliases (after source ~/.zshrc)
rdev-logs     # Last 100 lines
rdev-logs-f   # Follow/stream
rdev-pods     # List pods

# API calls - use cookbook test scripts (they handle auth via common.sh)
./cookbooks/scripts/landing-test.sh run|status|teardown <name>
./cookbooks/scripts/tree-runner.sh run <tree-name> --project-name <name>

# Or direct API calls (requires env vars above)
curl -s -H "X-API-Key: $RDEV_API_KEY" "$RDEV_API_URL/health" | jq
curl -s -H "X-API-Key: $RDEV_API_KEY" "$RDEV_API_URL/projects" | jq
```
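
For callers that prefer Go over curl, the direct API calls above could be sketched as follows. This is a minimal example under stated assumptions: it runs against a local stub server so it works without a live rdev instance, and the stub's `stub-key` value and `{"status":"ok"}` body are invented for the demo, not the real `/health` response. `getWithAPIKey` and `demoAgainstStub` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"log/slog"
	"net/http"
	"net/http/httptest"
	"os"
	"time"
)

// getWithAPIKey issues a GET with the X-API-Key header and returns status and body.
func getWithAPIKey(ctx context.Context, client *http.Client, baseURL, apiKey, path string) (int, string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL+path, nil)
	if err != nil {
		return 0, "", fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("X-API-Key", apiKey)

	resp, err := client.Do(req)
	if err != nil {
		return 0, "", fmt.Errorf("call %s: %w", path, err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", fmt.Errorf("read body: %w", err)
	}
	return resp.StatusCode, string(body), nil
}

// demoAgainstStub runs the client against a local stub that only accepts "stub-key".
// The stub's response body is made up for this sketch.
func demoAgainstStub(apiKey string) (int, string, error) {
	stub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-API-Key") != "stub-key" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, `{"status":"ok"}`)
	}))
	defer stub.Close()

	// Explicit client timeout and a derived request context, both 5s for a health check.
	client := &http.Client{Timeout: 5 * time.Second}
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	return getWithAPIKey(ctx, client, stub.URL, apiKey, "/health")
}

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stdout, nil))
	status, body, err := demoAgainstStub("stub-key")
	if err != nil {
		logger.Error("health call failed", "error", err)
		os.Exit(1)
	}
	logger.Info("health call finished", "status", status, "body", body)
}
```

To call a real deployment instead of the stub, pass the values of `RDEV_API_URL` and `RDEV_API_KEY` as `baseURL` and `apiKey`.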
## Architecture Overview

```
cmd/rdev-api/            # Entry point, DI, OpenAPI spec
cmd/sdlc/                # SDLC CLI binary (runs inside project pods)
internal/
├── sdlc/                # SDLC library (types, classifier, state I/O)
├── domain/              # Pure business models (no deps)
├── port/                # Interface contracts
├── service/             # Business logic orchestration
├── handlers/            # HTTP handlers (REST endpoints)
├── adapter/             # Infrastructure implementations
│   ├── kubernetes/      # K8s client, pod executor
│   ├── postgres/        # Audit, queue, webhooks, credentials
│   ├── cockroach/       # Database provisioning (project DBs)
│   ├── redis/           # Cache provisioning via ACLs
│   ├── gitea/           # Git repository management
│   ├── cloudflare/      # DNS provider
│   └── woodpecker/      # CI provider
├── auth/                # API key auth, scopes
├── middleware/          # Rate limiting
├── worker/              # Background queue processor
└── webhook/             # Event dispatcher
pkg/api/                 # HTTP framework (app, responses)
deployments/k8s/         # Kustomize manifests
└── base/templates/      # Project templates
scripts/                 # Operational scripts
├── load-credentials.sh  # Load secrets to rdev-api
├── release.sh           # Build, tag, push releases
└── logs.sh              # View rdev-api logs
cookbooks/               # End-to-end workflow guides
├── landing-page.md      # Landing page deployment flow
└── scripts/             # Executable cookbook scripts
```

## Key Concepts

- **Projects**: Kubernetes pods with Claude Code, discovered by label `rdev.orchard9.ai/project=true`
- **Workers**: Shared claudebox pods that execute any project's tasks, labeled `rdev.orchard9.ai/role=worker`
- **Work Queue**: Async task queue for build/test/deploy jobs
- **Credentials**: Infrastructure secrets (tokens, keys) stored encrypted in PostgreSQL
- **Commands**: Claude/shell/git commands executed via kubectl exec, streamed via SSE
- **API Keys**: Scoped auth with project restrictions, IP filtering, expiration
- **Webhooks**: Event subscriptions with retry delivery
- **Templates**: Project scaffolding with .woodpecker.yml, .claude/, and stack files

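
The label-based discovery of projects and workers described above can be sketched as a small pure function. The two label keys and values come from the list; `classifyPod`, the constant names, and the choice to check the worker label first are assumptions made for illustration. A real implementation would more likely pass the equivalent label selectors to the Kubernetes API.

```go
package main

import (
	"log/slog"
	"os"
)

// Label keys taken from the Key Concepts list; the constant names are hypothetical.
const (
	labelProject = "rdev.orchard9.ai/project" // value "true" marks a project pod
	labelRole    = "rdev.orchard9.ai/role"    // value "worker" marks a shared worker pod
)

// classifyPod returns "worker", "project", or "other" for a pod's label set.
// Checking the worker label first is an arbitrary choice made for this sketch.
func classifyPod(labels map[string]string) string {
	switch {
	case labels[labelRole] == "worker":
		return "worker"
	case labels[labelProject] == "true":
		return "project"
	default:
		return "other"
	}
}

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stdout, nil))
	pods := map[string]map[string]string{
		"demo-project": {labelProject: "true"},
		"claudebox-0":  {labelRole: "worker"},
		"unrelated":    {"app": "nginx"},
	}
	for name, labels := range pods {
		logger.Info("classified pod", "pod", name, "kind", classifyPod(labels))
	}
}
```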
## threesix.ai Platform Status

| Feature | Status | Description |
|---------|--------|-------------|
| Woodpecker Auto-Activation | **Done** | CI enabled on project creation via SDK |
| Project Templates | **Done** | Embedded templates (astro-landing, go-api, default) |
| Work Queue | **Done** | PostgreSQL with atomic dequeue, retry logic |
| Multi-Provider Agents | **Done** | Claude Code + OpenCode via registry |
| Webhooks | **Done** | Event dispatcher with retry delivery |
| Embedded Worker | **Done** | Goroutine in rdev-api, polls queue |
| Multi-Domain Support | **Done** | Auto-slugs, custom subdomains, DNS aliases |
| Build Event Streaming | **Done** | Real-time SSE/WebSocket for build output |
| Database Provisioning | **Done** | CockroachDB adapter with auto-provisioning |
| Cache Provisioning | **Done** | Redis ACL-based adapter with auto-provisioning |
| Build Orchestration | Planned | Structured build specs via API |
| SDLC Orchestration | **Done** | Deterministic feature lifecycle with classifier engine, API, orchestrator, and 15 skeleton commands |
| Composable Monorepo Templates | **Done** | Monorepo skeleton + component templates (service, worker, app-astro, app-react, cli) |
| Visual Verification | Planned | Playwright screenshots/video + AI evaluation for feature completeness |
| Checkout/Checkin | **Done** | Sidecar dev flow: temporary git tokens, branch checkout, review on checkin |
| Interactive Remote Dev | **Done** | Sessions with pod binding, command execution, SSE streaming, ephemeral preview URLs |

**Current Version:** v0.10.25

## Constraints

- **ON-PREM k3s** - not GKE, always set KUBECONFIG
- **Kustomize only** - no ArgoCD
- **chi/v5 router** - no gin, echo, or other frameworks
- **sqlx for DB** - no GORM
- **slog for logging** - no logrus, zap