rdev/CLAUDE.md
jordan 592b2d5ec0 fix: clarify database types across docs and fix video storage persistence
Two distinct fixes:

1. Database terminology: Make it crystal clear that generated projects use
   CockroachDB in production and PostgreSQL for local dev, while the rdev
   platform itself uses PostgreSQL. Updated 15 files across skeleton agents,
   component templates, cookbook trees, and platform docs.

2. Video storage: VideoHandler was ignoring vid.Data bytes (already downloaded
   by the Gemini adapter with auth) and re-downloading from the provider URL
   with a plain GET — which fails because Gemini URLs require API key auth.
   Now uses vid.Data first, falls back to downloadURL only for public URLs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 23:13:21 -07:00

rdev - Remote Developer

Run Claude Code instances in isolated Kubernetes pods with REST API control. This lets bots, CI/CD systems, and external orchestrators dispatch agentic development work to isolated environments.

Platform: threesix.ai - Agent-driven development at scale with shared worker pools.

Terminology

| Term | Meaning | Location |
|------|---------|----------|
| platform | rdev itself (orchestrator API, handlers, workers) | cmd/rdev-api/, internal/, pkg/api/ |
| skeleton | Code that ships in generated projects | internal/adapter/templates/templates/skeleton/ |
| component templates | Service/worker/app/cli templates added to skeleton | templates/components/{service,worker,cli,app-*}/ |

When discussing code: "add to platform" = edit rdev; "add to skeleton" = edit project templates.

Database Rule

| Context | Database | Details |
|---------|----------|---------|
| rdev platform | PostgreSQL | API keys, audit logs, work queue, credentials (internal/adapter/postgres/) |
| Generated projects (production) | CockroachDB | Provisioned per-project by rdev (internal/adapter/cockroach/) |
| Generated projects (local dev) | PostgreSQL | Via docker-compose, wire-compatible with CockroachDB |

Both use the lib/pq driver. The type: postgres component API provisions CockroachDB in production; the name is a legacy artifact. Skeleton SQL must be compatible with both PostgreSQL and CockroachDB.
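Because both databases speak the PostgreSQL wire protocol, the same connection-URL shape can serve local dev and production. The sketch below builds such a URL with the standard library only. The helper name buildDSN, the host cockroach.internal, the credentials, and the sslmode values are illustrative assumptions, not rdev's actual configuration; the ports are the PostgreSQL (5432) and CockroachDB (26257) defaults.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildDSN is a hypothetical helper (not part of rdev). It returns a
// postgres:// connection URL of the form that lib/pq accepts.
func buildDSN(user, password, host string, port int, dbName, sslMode string) string {
	u := url.URL{
		Scheme:   "postgres",
		User:     url.UserPassword(user, password),
		Host:     fmt.Sprintf("%s:%d", host, port),
		Path:     "/" + dbName,
		RawQuery: url.Values{"sslmode": {sslMode}}.Encode(),
	}
	return u.String()
}

func main() {
	// Local dev: PostgreSQL from docker-compose on its default port.
	fmt.Println(buildDSN("app", "secret", "localhost", 5432, "appdb", "disable"))

	// Production: CockroachDB on its default SQL port. Host is made up.
	fmt.Println(buildDSN("app", "secret", "cockroach.internal", 26257, "appdb", "verify-full"))
}
```

Only the host, port, and TLS mode differ between the two targets, which is why skeleton code can stay database-agnostic as long as its SQL avoids features that exist in only one of the two engines.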

Find Your Guide

| If you need to... | Read this |
|-------------------|-----------|
| Set up local dev | local/setup.md |
| Run tests | local/testing.md |
| Write Go code / handlers | backend/go-guidelines.md |
| Understand pkg/api | packages/api-framework.md |
| Add a new handler/endpoint | backend/adding-handlers.md |
| Understand hexagonal architecture | backend/hexagonal.md |
| Deploy to k3s | ops/deploying.md |
| Release a new version | ops/releasing.md |
| Work with Kubernetes adapters | services/kubernetes.md |
| Database / migrations | ops/database.md |
| Manage credentials | ops/credentials.md |
| Work queue system | services/work-queue.md |
| Worker pool management | services/worker-pool.md |
| Project templates | services/templates.md |
| Composable monorepo templates | services/composable-monorepo.md |
| E2E testing strategy | services/e2e-testing-strategy.md |
| Cookbook tree system (commands) | services/cookbook-trees.md |
| Slackpath reference architectures | services/cookbook-trees.md |
| Write cookbook trees | cookbook-trees/SKILL.md |
| Build/maintain skeleton packages | skeleton-craftsman/SKILL.md |
| Build orchestration | services/build-orchestration.md |
| Build event streaming | services/build-streaming.md |
| Resource provisioning plan | services/resource-provisioning-plan.md |
| Database provisioning | services/database-provisioning.md |
| Cache provisioning | services/cache-provisioning.md |
| CockroachDB operations | services/cockroachdb.md |
| Redis operations | services/redis.md |
| DNS / Cloudflare | services/dns-cloudflare.md |
| Network policies / internal routing | ops/networking.md |
| Debug external system health | ops/external-health-diagnostics.md |
| SDLC orchestration | services/sdlc.md |
| Visual verification (Playwright) | services/visual-verification.md |
| Interactive remote development | services/interactive-remote-dev.md |
| Gitea 1.22 / SDK / webhooks | ops/gitea-1.22.md |
| Go 1.25 features & migration | backend/go-1.25.md |
| Woodpecker CI v3 pipelines | ops/woodpecker-v3.md |
| Traefik v3 ingress & middleware | ops/traefik-v3.md |
| Zot container registry | ops/zot-registry.md |
| cert-manager / TLS certificates | ops/cert-manager.md |
| Structured logging | internal/logging/ - field constants, context propagation, redaction |

Critical Rules

  • Frustration = systemic fix: When the user says they're tired of repeating something, stop what you're doing and find or create a systemic fix in .claude/**/* or CLAUDE.md — don't just apologize and do the same thing again.
  • AI credentials are provisioned: rdev injects LAOZHANG_API_KEY and GEMINI_API_KEY as env vars into every deployed component (component_deploy.go:fetchProjectCredentials). Skeleton code reads them with os.Getenv(). Never treat AI packages as needing external setup.
  • Root cause fixes: When diagnosing failures in generated projects, NEVER patch the project directly. Find the systemic root cause in: (1) platform - rdev handlers/services that create resources, (2) skeleton - templates that ship in generated projects, or (3) cookbook - test scripts with wrong assumptions. Fix the source, not the symptom. Every project-specific fix is technical debt that will recur.
  • LLM vs rdev: LLMs generate code; rdev executes deterministic operations (git, lint, deploy). Never rely on LLMs for runbook tasks.
  • Pod git ops: Git operations run inside pods via PodGitOperations (kubectl exec), never locally.
  • No dead code: Delete unused code immediately. Don't leave "might use later" exports.
  • KUBECONFIG: ALWAYS run export KUBECONFIG=~/.kube/orchard9-k3sf.yaml before running kubectl commands.
  • Container builds: NEVER build Docker images locally. ALWAYS use git push origin main to trigger Woodpecker CI which builds via in-cluster Kaniko. Local Docker builds produce wrong architecture (arm64 vs amd64). If an image is missing from registry.threesix.ai, push to origin — don't improvise.
  • Hexagonal: Domain models in internal/domain/ must have ZERO external dependencies.
  • Ports: All adapters implement interfaces from internal/port/.
  • Migrations: NEVER modify committed migrations. Create NEW ones.
  • 500-line limit: Files exceeding 500 lines must be split.
  • Tests: All handlers and services require tests.
  • No fallbacks: NEVER design "try X, fall back to Y" flows — fix X. Fallbacks hide errors and deliver inferior experiences.
  • Multi-step ops: NEVER log-and-continue after partial failure. Rollback or document partial state.
  • Logging: Use logging.FromContext(ctx) or injected *slog.Logger. NEVER fmt.Println, log.Fatal, log.Printf, or bare slog.Info(). Error key is ALWAYS "error" (not "err"). Use field constants from internal/logging/fields.go (e.g., logging.FieldProjectID, logging.FieldError). Log once at boundary (handlers/workers log, services return errors). Sensitive data (passwords, tokens, keys) is auto-redacted.
  • HTTP clients: NEVER create &http.Client{} without a Timeout field. All HTTP clients must have explicit timeouts (30s standard, 5s for health checks). A bare client can hang indefinitely.
  • Config: Use envutil.GetEnv() / GetEnvInt() / GetEnvBool() from internal/envutil for all env var reads with defaults. NEVER define local getEnv helpers — they duplicate and drift. Raw os.Getenv() is fine for required values with no default (secrets, passwords).
  • Handler timeouts: NEVER use inline time.Duration in context.WithTimeout inside handlers. Use constants from internal/handlers/timeouts.go: TimeoutFastLookup (5s), TimeoutLookup (10s), TimeoutStandard (30s), TimeoutHeavyWrite (60s), TimeoutOrchestration (90s), TimeoutLongRunning (10m).
  • Worker timeouts: NEVER use inline time.Duration in context.WithTimeout inside worker code. Use constants from internal/worker/timeouts.go: TimeoutQuickOp (5s), TimeoutHealthCheck (10s), TimeoutMaintenance (30s), TimeoutWorkExecution (10m).
  • Response helpers: Use api.WriteUnauthorized, api.WriteForbidden, api.WriteBadRequest, api.WriteNotFound, api.WriteInternalError instead of bare api.WriteError with status codes. Only use api.WriteError directly for custom error codes (e.g., KEY_REVOKED, IP_NOT_ALLOWED).
  • Auth scopes: EVERY route in a handler's Mount() function MUST use r.With(auth.RequireScope(...)). Use ScopeProjectsRead for GET endpoints, ScopeProjectsExecute for mutation endpoints. Use the appropriate domain scope (e.g., ScopeQueueRead, ScopeBuildWrite) when available. Admin-only endpoints use auth.ScopeAdmin alone. See internal/handlers/builds.go for the canonical pattern.
  • JSON decoding: ALWAYS use api.DecodeJSON(r, &req) to decode request bodies. NEVER use raw json.NewDecoder(r.Body).Decode(). The helper handles nil body, EOF, and returns typed errors. Decode error message is always "invalid request body".
  • Validation: Use the validate.New() accumulator for 2+ field checks in handlers: v := validate.New(); v.Required(req.Name, "name"); v.Required(req.Type, "type"); if err := v.Error(); err != nil { ... }. Single-field checks can stay inline. NEVER duplicate validation logic that exists in internal/validate.
  • Error wrapping: ALWAYS use %w (not %v) when wrapping errors in fmt.Errorf. Using %v stringifies the error and breaks errors.Is/errors.As chains. For non-error types (structs, slices), create a typed error implementing error instead of stringifying with %v.
  • Context propagation: NEVER use context.Background() in handlers, services, or adapters that receive a context parameter. Always derive from parent context. Use context.WithoutCancel(ctx) for fire-and-forget goroutines that need tracing but independent cancellation.
  • Cookbooks: Load .claude/skills/cookbook-trees/SKILL.md before writing/modifying any cookbook tree.
  • Version alignment: Skeleton templates MUST use consistent versions across all files: Go 1.25 (go.work, go.mod, Dockerfiles, CI images), Node 20, Alpine 3.19. When updating a version, grep the entire templates/ tree and update ALL occurrences to prevent drift.

Quick Reference

# Environment (should already be in ~/.zshrc)
export KUBECONFIG=~/.kube/orchard9-k3sf.yaml
export RDEV_API_URL="https://rdev.masq-ops.orchard9.ai"
export RDEV_API_KEY="<your-api-key>"  # Already set in ~/.zshrc

# Verify environment is loaded
echo $RDEV_API_KEY  # Should print a base64 string
# If empty: source ~/.zshrc

# For scripts: use cookbooks/scripts/common.sh library
# Provides: api_call(), wait_for_build(), wait_for_pipeline(), wait_for_site()
# Example: source "$(dirname "$0")/common.sh" && api_call GET "/health"

# Run locally
go run ./cmd/rdev-api

# Run tests
go test ./...

# Automated deploy (push triggers Woodpecker CI)
git push origin main  # Builds and deploys automatically via Woodpecker

# Manual deploy (if Woodpecker unavailable)
kubectl apply -f deployments/k8s/base/rdev-api.yaml
kubectl rollout restart -n rdev deployment/rdev-api

# Images are at registry.threesix.ai/rdev/{api,worker,claudebox}

# Verify pods
kubectl get pods -n rdev

# View logs
./scripts/logs.sh           # Last 100 lines
./scripts/logs.sh -f        # Follow/stream
./scripts/logs.sh -n 500    # Last 500 lines
./scripts/logs.sh -e        # Errors only
./scripts/logs.sh -p        # Previous crashed container

# Shell aliases (after source ~/.zshrc)
rdev-logs                   # Last 100 lines
rdev-logs-f                 # Follow/stream
rdev-pods                   # List pods

# API calls - use cookbook test scripts (they handle auth via common.sh)
./cookbooks/scripts/landing-test.sh run|status|teardown <name>
./cookbooks/scripts/tree-runner.sh run <tree-name> --project-name <name>

# Or direct API calls (requires env vars above)
curl -s -H "X-API-Key: $RDEV_API_KEY" "$RDEV_API_URL/health" | jq
curl -s -H "X-API-Key: $RDEV_API_KEY" "$RDEV_API_URL/projects" | jq

Architecture Overview

cmd/rdev-api/          # Entry point, DI, OpenAPI spec
cmd/sdlc/              # SDLC CLI binary (runs inside project pods)
internal/
├── sdlc/              # SDLC library (types, classifier, state I/O)
├── domain/            # Pure business models (no deps)
├── port/              # Interface contracts
├── service/           # Business logic orchestration
├── handlers/          # HTTP handlers (REST endpoints)
├── adapter/           # Infrastructure implementations
│   ├── kubernetes/    # K8s client, pod executor
│   ├── postgres/      # Audit, queue, webhooks, credentials
│   ├── cockroach/     # Database provisioning (project DBs)
│   ├── redis/         # Cache provisioning via ACLs
│   ├── gitea/         # Git repository management
│   ├── cloudflare/    # DNS provider
│   └── woodpecker/    # CI provider
├── auth/              # API key auth, scopes
├── middleware/        # Rate limiting
├── worker/            # Background queue processor
└── webhook/           # Event dispatcher
pkg/api/               # HTTP framework (app, responses)
deployments/k8s/       # Kustomize manifests
  └── base/templates/  # Project templates
scripts/               # Operational scripts
  ├── load-credentials.sh  # Load secrets to rdev-api
  ├── release.sh           # Build, tag, push releases
  └── logs.sh              # View rdev-api logs
cookbooks/             # End-to-end workflow guides
  ├── landing-page.md      # Landing page deployment flow
  └── scripts/             # Executable cookbook scripts

Key Concepts

  • Projects: Kubernetes pods with Claude Code, discovered by label rdev.orchard9.ai/project=true
  • Workers: Shared claudebox pods that execute any project's tasks, labeled rdev.orchard9.ai/role=worker
  • Work Queue: Async task queue for build/test/deploy jobs
  • Credentials: Infrastructure secrets (tokens, keys) stored encrypted in PostgreSQL
  • Commands: Claude/shell/git commands executed via kubectl exec, streamed via SSE
  • API Keys: Scoped auth with project restrictions, IP filtering, expiration
  • Webhooks: Event subscriptions with retry delivery
  • Templates: Project scaffolding with .woodpecker.yml, .claude/, and stack files

threesix.ai Platform Status

| Feature | Status | Description |
|---------|--------|-------------|
| Woodpecker Auto-Activation | Done | CI enabled on project creation via SDK |
| Project Templates | Done | Embedded templates (astro-landing, go-api, default) |
| Work Queue | Done | PostgreSQL with atomic dequeue, retry logic |
| Multi-Provider Agents | Done | Claude Code + OpenCode via registry |
| Webhooks | Done | Event dispatcher with retry delivery |
| Embedded Worker | Done | Goroutine in rdev-api, polls queue |
| Multi-Domain Support | Done | Auto-slugs, custom subdomains, DNS aliases |
| Build Event Streaming | Done | Real-time SSE/WebSocket for build output |
| Database Provisioning | Done | CockroachDB adapter with auto-provisioning |
| Cache Provisioning | Done | Redis ACL-based adapter with auto-provisioning |
| Build Orchestration | Planned | Structured build specs via API |
| SDLC Orchestration | Done | Deterministic feature lifecycle with classifier engine, API, orchestrator, and 15 skeleton commands |
| Composable Monorepo Templates | Done | Monorepo skeleton + component templates (service, worker, app-astro, app-react, cli) |
| Visual Verification | Planned | Playwright screenshots/video + AI evaluation for feature completeness |
| Checkout/Checkin | Done | Sidecar dev flow: temporary git tokens, branch checkout, review on checkin |
| Interactive Remote Dev | Done | Sessions with pod binding, command execution, SSE streaming, ephemeral preview URLs |

Current Version: v0.10.25

Constraints

  • ON-PREM k3s - not GKE, always set KUBECONFIG
  • Kustomize only - no ArgoCD
  • chi/v5 router - no gin, echo, or other frameworks
  • sqlx for DB - no GORM
  • slog for logging - no logrus, zap