Major changes:
- Add internal/logging package with field constants, context propagation, sensitive data auto-redaction, and per-component log levels
- Add worker timeout constants (TimeoutQuickOp, TimeoutHealthCheck, etc.)
- Extend SDLC with callback handlers, generate endpoints, and executor
- Add new cookbook trees for aeries and slackpath progression
- Add skeleton templates for queue, realtime, and microservices
- Add worker component template with async job processing
- Refactor services and handlers to use new logging infrastructure
- Split component.go into component_infra.go and component_listing.go

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
rdev - Remote Developer
Run Claude Code instances in isolated Kubernetes pods with REST API control. Enables bots, CI/CD systems, and external orchestrators to dispatch agentive development work to isolated environments.
Platform: threesix.ai - Agent-driven development at scale with shared worker pools.
Find Your Guide
| If you need to... | Read this |
|---|---|
| Set up local dev | local/setup.md |
| Run tests | local/testing.md |
| Write Go code / handlers | backend/go-guidelines.md |
| Understand pkg/api | packages/api-framework.md |
| Add a new handler/endpoint | backend/adding-handlers.md |
| Understand hexagonal architecture | backend/hexagonal.md |
| Deploy to k3s | ops/deploying.md |
| Release a new version | ops/releasing.md |
| Work with Kubernetes adapters | services/kubernetes.md |
| Database / migrations | ops/database.md |
| Manage credentials | ops/credentials.md |
| Work queue system | services/work-queue.md |
| Worker pool management | services/worker-pool.md |
| Project templates | services/templates.md |
| Composable monorepo templates | services/composable-monorepo.md |
| E2E testing strategy | services/e2e-testing-strategy.md |
| Cookbook tree system (commands) | services/cookbook-trees.md |
| Write E2E cookbook scripts | cookbook-scripts/SKILL.md |
| Build orchestration | services/build-orchestration.md |
| Build event streaming | services/build-streaming.md |
| Resource provisioning plan | services/resource-provisioning-plan.md |
| Database provisioning | services/database-provisioning.md |
| Cache provisioning | services/cache-provisioning.md |
| CockroachDB operations | services/cockroachdb.md |
| Redis operations | services/redis.md |
| DNS / Cloudflare | services/dns-cloudflare.md |
| Network policies / internal routing | ops/networking.md |
| Debug external system health | ops/external-health-diagnostics.md |
| SDLC orchestration | services/sdlc.md |
| Visual verification (Playwright) | services/visual-verification.md |
| Structured logging | internal/logging/ - field constants, context propagation, redaction |
Critical Rules
- LLM vs rdev: LLMs generate code; rdev executes deterministic operations (git, lint, deploy). Never rely on LLMs for runbook tasks.
- Pod git ops: Git operations run inside pods via PodGitOperations (kubectl exec), never locally.
- No dead code: Delete unused code immediately. Don't leave "might use later" exports.
- KUBECONFIG: ALWAYS set export KUBECONFIG=~/.kube/orchard9-k3sf.yaml before kubectl commands
- Hexagonal: Domain models in internal/domain/ must have ZERO external dependencies
- Ports: All adapters implement interfaces from internal/port/
- Migrations: NEVER modify committed migrations. Create NEW ones.
- 500-line limit: Files exceeding 500 lines must be split
- Tests: All handlers and services require tests
- Multi-step ops: NEVER log-and-continue after partial failure. Rollback or document partial state.
- Logging: Use logging.FromContext(ctx) or injected *slog.Logger. NEVER fmt.Println, log.Fatal, log.Printf, or bare slog.Info(). Error key is ALWAYS "error" (not "err"). Use field constants from internal/logging/fields.go (e.g., logging.FieldProjectID, logging.FieldError). Log once at boundary (handlers/workers log, services return errors). Sensitive data (passwords, tokens, keys) is auto-redacted.
- HTTP clients: NEVER create &http.Client{} without a Timeout field. All HTTP clients must have explicit timeouts (30s standard, 5s for health checks). A bare client can hang indefinitely.
- Config: Use envutil.GetEnv() / GetEnvInt() / GetEnvBool() from internal/envutil for all env var reads with defaults. NEVER define local getEnv helpers; they duplicate and drift. Raw os.Getenv() is fine for required values with no default (secrets, passwords).
- Handler timeouts: NEVER use inline time.Duration in context.WithTimeout inside handlers. Use constants from internal/handlers/timeouts.go: TimeoutFastLookup (5s), TimeoutLookup (10s), TimeoutStandard (30s), TimeoutHeavyWrite (60s), TimeoutOrchestration (90s), TimeoutLongRunning (10m).
- Worker timeouts: NEVER use inline time.Duration in context.WithTimeout inside worker code. Use constants from internal/worker/timeouts.go: TimeoutQuickOp (5s), TimeoutHealthCheck (10s), TimeoutMaintenance (30s), TimeoutWorkExecution (10m).
- Response helpers: Use api.WriteUnauthorized, api.WriteForbidden, api.WriteBadRequest, api.WriteNotFound, api.WriteInternalError instead of bare api.WriteError with status codes. Only use api.WriteError directly for custom error codes (e.g., KEY_REVOKED, IP_NOT_ALLOWED).
- Auth scopes: EVERY route in a handler's Mount() function MUST use r.With(auth.RequireScope(...)). Use ScopeProjectsRead for GET endpoints, ScopeProjectsExecute for mutation endpoints. Use the appropriate domain scope (e.g., ScopeQueueRead, ScopeBuildWrite) when available. Admin-only endpoints use auth.ScopeAdmin alone. See internal/handlers/builds.go for the canonical pattern.
- JSON decoding: ALWAYS use api.DecodeJSON(r, &req) to decode request bodies. NEVER use raw json.NewDecoder(r.Body).Decode(). The helper handles nil body, EOF, and returns typed errors. Decode error message is always "invalid request body".
- Validation: Use validate.New() accumulator for 2+ field checks in handlers: v := validate.New(); v.Required(req.Name, "name"); v.Required(req.Type, "type"); if err := v.Error(); err != nil { ... }. Single-field checks can stay inline. NEVER duplicate validation logic that exists in internal/validate.
- Error wrapping: ALWAYS use %w (not %v) when wrapping errors in fmt.Errorf. Using %v stringifies the error and breaks errors.Is / errors.As chains. For non-error types (structs, slices), create a typed error implementing error instead of stringifying with %v.
Quick Reference
# Required env vars (add to ~/.zshrc)
export KUBECONFIG=~/.kube/orchard9-k3sf.yaml
export RDEV_API_URL="https://rdev.masq-ops.orchard9.ai"
export RDEV_API_KEY="<from rdev-credentials secret>"
# Infrastructure credentials stored in .secrets (gitignored)
# See: .claude/guides/ops/credentials.md for setup
# Keys: GITEA_TOKEN, CLOUDFLARE_API_TOKEN, CLOUDFLARE_ZONE_ID, WOODPECKER_*
# Run locally
go run ./cmd/rdev-api
# Run tests
go test ./...
# Release + deploy (one command)
./scripts/release.sh v0.10.1 "Description of changes" --deploy
# Release only (no deploy)
./scripts/release.sh v0.10.1 "Description of changes"
# Manual deploy (if needed)
kubectl apply -f deployments/k8s/base/rdev-api.yaml
kubectl rollout restart -n rdev deployment/rdev-api
# Verify pods
kubectl get pods -n rdev
# View logs
./scripts/logs.sh # Last 100 lines
./scripts/logs.sh -f # Follow/stream
./scripts/logs.sh -n 500 # Last 500 lines
./scripts/logs.sh -e # Errors only
./scripts/logs.sh -p # Previous crashed container
# Shell aliases (after source ~/.zshrc)
rdev-logs # Last 100 lines
rdev-logs-f # Follow/stream
rdev-pods # List pods
# API calls (NOTE: $RDEV_API_KEY doesn't expand in curl -H, use the test script instead)
# ./cookbooks/scripts/landing-test.sh run|status|teardown <name>
curl -H "X-API-Key: $RDEV_API_KEY" "$RDEV_API_URL/health"
curl -H "X-API-Key: $RDEV_API_KEY" "$RDEV_API_URL/projects"
curl -H "X-API-Key: $RDEV_API_KEY" "$RDEV_API_URL/work/stats"
Architecture Overview
cmd/rdev-api/ # Entry point, DI, OpenAPI spec
cmd/sdlc/ # SDLC CLI binary (runs inside project pods)
internal/
├── sdlc/ # SDLC library (types, classifier, state I/O)
├── domain/ # Pure business models (no deps)
├── port/ # Interface contracts
├── service/ # Business logic orchestration
├── handlers/ # HTTP handlers (REST endpoints)
├── adapter/ # Infrastructure implementations
│ ├── kubernetes/ # K8s client, pod executor
│ ├── postgres/ # Audit, queue, webhooks, credentials
│ ├── cockroach/ # Database provisioning (project DBs)
│ ├── redis/ # Cache provisioning via ACLs
│ ├── gitea/ # Git repository management
│ ├── cloudflare/ # DNS provider
│ └── woodpecker/ # CI provider
├── auth/ # API key auth, scopes
├── middleware/ # Rate limiting
├── worker/ # Background queue processor
└── webhook/ # Event dispatcher
pkg/api/ # HTTP framework (app, responses)
deployments/k8s/ # Kustomize manifests
└── base/templates/ # Project templates
scripts/ # Operational scripts
├── load-credentials.sh # Load secrets to rdev-api
├── release.sh # Build, tag, push releases
└── logs.sh # View rdev-api logs
cookbooks/ # End-to-end workflow guides
├── landing-page.md # Landing page deployment flow
└── scripts/ # Executable cookbook scripts
Key Concepts
- Projects: Kubernetes pods with Claude Code, discovered by label rdev.orchard9.ai/project=true
- Workers: Shared claudebox pods that execute any project's tasks, labeled rdev.orchard9.ai/role=worker
- Work Queue: Async task queue for build/test/deploy jobs
- Credentials: Infrastructure secrets (tokens, keys) stored encrypted in PostgreSQL
- Commands: Claude/shell/git commands executed via kubectl exec, streamed via SSE
- API Keys: Scoped auth with project restrictions, IP filtering, expiration
- Webhooks: Event subscriptions with retry delivery
- Templates: Project scaffolding with .woodpecker.yml, .claude/, and stack files
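Label-based discovery of projects and workers, as described in the list above, can be sketched with plain maps. The two label keys and values are taken from the list; the Pod struct and the helper functions are hypothetical, and a real implementation would presumably query the Kubernetes API with a label selector instead of filtering in memory.

```go
package main

import "fmt"

// Label keys as given in the Key Concepts list.
const (
	LabelProject = "rdev.orchard9.ai/project"
	LabelRole    = "rdev.orchard9.ai/role"
)

// Pod is a hypothetical, minimal stand-in for a Kubernetes pod.
type Pod struct {
	Name   string
	Labels map[string]string
}

// isProject reports whether the pod carries the project label set to "true".
func isProject(p Pod) bool { return p.Labels[LabelProject] == "true" }

// isWorker reports whether the pod carries the worker role label.
func isWorker(p Pod) bool { return p.Labels[LabelRole] == "worker" }

// names returns the names of pods accepted by keep, in input order.
func names(pods []Pod, keep func(Pod) bool) []string {
	var out []string
	for _, p := range pods {
		if keep(p) {
			out = append(out, p.Name)
		}
	}
	return out
}

func main() {
	pods := []Pod{
		{Name: "alpha", Labels: map[string]string{LabelProject: "true"}},
		{Name: "worker-1", Labels: map[string]string{LabelRole: "worker"}},
		{Name: "unrelated", Labels: map[string]string{}},
	}
	fmt.Println(names(pods, isProject)) // [alpha]
	fmt.Println(names(pods, isWorker))  // [worker-1]
}
```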
threesix.ai Platform Status
| Feature | Status | Description |
|---|---|---|
| Woodpecker Auto-Activation | Done | CI enabled on project creation via SDK |
| Project Templates | Done | Embedded templates (astro-landing, go-api, default) |
| Work Queue | Done | PostgreSQL with atomic dequeue, retry logic |
| Multi-Provider Agents | Done | Claude Code + OpenCode via registry |
| Webhooks | Done | Event dispatcher with retry delivery |
| Embedded Worker | Done | Goroutine in rdev-api, polls queue |
| Multi-Domain Support | Done | Auto-slugs, custom subdomains, DNS aliases |
| Build Event Streaming | Done | Real-time SSE/WebSocket for build output |
| Database Provisioning | Done | CockroachDB adapter with auto-provisioning |
| Cache Provisioning | Done | Redis ACL-based adapter with auto-provisioning |
| Build Orchestration | Planned | Structured build specs via API |
| SDLC Orchestration | Done | Deterministic feature lifecycle with classifier engine, API, orchestrator, and 15 skeleton commands |
| Composable Monorepo Templates | Done | Monorepo skeleton + component templates (service, worker, app-astro, app-react, cli) |
| Visual Verification | Planned | Playwright screenshots/video + AI evaluation for feature completeness |
Current Version: v0.10.25
Constraints
- ON-PREM k3s - not GKE, always set KUBECONFIG
- Kustomize only - no ArgoCD
- chi/v5 router - no gin, echo, or other frameworks
- sqlx for DB - no GORM
- slog for logging - no logrus, zap