jordan 592b2d5ec0 fix: clarify database types across docs and fix video storage persistence

Two distinct fixes:

1. Database terminology: Make it crystal clear that generated projects use
   CockroachDB in production and PostgreSQL for local dev, while the rdev
   platform itself uses PostgreSQL. Updated 15 files across skeleton agents,
   component templates, cookbook trees, and platform docs.

2. Video storage: VideoHandler was ignoring vid.Data bytes (already downloaded
   by the Gemini adapter with auth) and re-downloading from the provider URL
   with a plain GET — which fails because Gemini URLs require API key auth.
   Now uses vid.Data first, falls back to downloadURL only for public URLs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-19 23:13:21 -07:00


rdev - Remote Developer

Run Claude Code instances in isolated Kubernetes pods with REST API control. This lets bots, CI/CD systems, and external orchestrators dispatch agentic development work to isolated environments.

Platform: threesix.ai - Agent-driven development at scale with shared worker pools.

Terminology

Term	Meaning	Location
platform	rdev itself (orchestrator API, handlers, workers)	`cmd/rdev-api/`, `internal/`, `pkg/api/`
skeleton	Code that ships in generated projects	`internal/adapter/templates/templates/skeleton/`
component templates	Service/worker/app/cli templates added to skeleton	`templates/components/{service,worker,cli,app-*}/`

When discussing code: "add to platform" = edit rdev; "add to skeleton" = edit project templates.

Database Rule

Context	Database	Details
rdev platform	PostgreSQL	API keys, audit logs, work queue, credentials (`internal/adapter/postgres/`)
Generated projects (production)	CockroachDB	Provisioned per-project by rdev (`internal/adapter/cockroach/`)
Generated projects (local dev)	PostgreSQL	Via docker-compose, wire-compatible with CockroachDB

Both use the lib/pq driver. The type: postgres component API provisions CockroachDB in production; the name is a legacy artifact. Skeleton SQL must be compatible with both PostgreSQL and CockroachDB.
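Because CockroachDB speaks the PostgreSQL wire protocol, the same postgres:// connection URL shape can serve both environments, with typically only the host, port, and TLS mode changing. The following is a minimal sketch under stated assumptions: buildDSN, the host names, user, password, and database name are all hypothetical, and the ports are the upstream defaults (5432 for PostgreSQL, 26257 for CockroachDB), not values confirmed by rdev. The lib/pq import is left out so the snippet stays stdlib-only.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildDSN returns a postgres:// connection URL. The same URL shape is
// understood by lib/pq whether the server is PostgreSQL or CockroachDB.
// buildDSN is a hypothetical helper, not part of rdev or the skeleton.
func buildDSN(user, password, host string, port int, dbName, sslMode string) string {
	u := url.URL{
		Scheme:   "postgres",
		User:     url.UserPassword(user, password),
		Host:     fmt.Sprintf("%s:%d", host, port),
		Path:     "/" + dbName,
		RawQuery: url.Values{"sslmode": {sslMode}}.Encode(),
	}
	return u.String()
}

func main() {
	// Local dev: PostgreSQL from docker-compose (assumed default port 5432).
	fmt.Println(buildDSN("app", "secret", "localhost", 5432, "appdb", "disable"))

	// Production: CockroachDB (assumed default SQL port 26257, hypothetical host).
	fmt.Println(buildDSN("app", "secret", "cockroach.internal", 26257, "appdb", "verify-full"))
}
```

Keeping the URL shape identical is what lets skeleton code stay unaware of which database it is talking to; only SQL compatibility still needs care.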

Find Your Guide

If you need to...	Read this
Set up local dev	local/setup.md
Run tests	local/testing.md
Write Go code / handlers	backend/go-guidelines.md
Understand pkg/api	packages/api-framework.md
Add a new handler/endpoint	backend/adding-handlers.md
Understand hexagonal architecture	backend/hexagonal.md
Deploy to k3s	ops/deploying.md
Release a new version	ops/releasing.md
Work with Kubernetes adapters	services/kubernetes.md
Database / migrations	ops/database.md
Manage credentials	ops/credentials.md
Work queue system	services/work-queue.md
Worker pool management	services/worker-pool.md
Project templates	services/templates.md
Composable monorepo templates	services/composable-monorepo.md
E2E testing strategy	services/e2e-testing-strategy.md
Cookbook tree system (commands)	services/cookbook-trees.md
Slackpath reference architectures	services/cookbook-trees.md
Write cookbook trees	cookbook-trees/SKILL.md
Build/maintain skeleton packages	skeleton-craftsman/SKILL.md
Build orchestration	services/build-orchestration.md
Build event streaming	services/build-streaming.md
Resource provisioning plan	services/resource-provisioning-plan.md
Database provisioning	services/database-provisioning.md
Cache provisioning	services/cache-provisioning.md
CockroachDB operations	services/cockroachdb.md
Redis operations	services/redis.md
DNS / Cloudflare	services/dns-cloudflare.md
Network policies / internal routing	ops/networking.md
Debug external system health	ops/external-health-diagnostics.md
SDLC orchestration	services/sdlc.md
Visual verification (Playwright)	services/visual-verification.md
Interactive remote development	services/interactive-remote-dev.md
Gitea 1.22 / SDK / webhooks	ops/gitea-1.22.md
Go 1.25 features & migration	backend/go-1.25.md
Woodpecker CI v3 pipelines	ops/woodpecker-v3.md
Traefik v3 ingress & middleware	ops/traefik-v3.md
Zot container registry	ops/zot-registry.md
cert-manager / TLS certificates	ops/cert-manager.md
Structured logging	`internal/logging/` - field constants, context propagation, redaction

Critical Rules

Frustration = systemic fix: When the user says they're tired of repeating something, stop what you're doing and find or create a systemic fix in .claude/**/* or CLAUDE.md — don't just apologize and do the same thing again.
AI credentials are provisioned: rdev injects LAOZHANG_API_KEY and GEMINI_API_KEY as env vars into every deployed component (component_deploy.go:fetchProjectCredentials). Skeleton code reads them with os.Getenv(). Never treat AI packages as needing external setup.
Root cause fixes: When diagnosing failures in generated projects, NEVER patch the project directly. Find the systemic root cause in: (1) platform - rdev handlers/services that create resources, (2) skeleton - templates that ship in generated projects, or (3) cookbook - test scripts with wrong assumptions. Fix the source, not the symptom. Every project-specific fix is technical debt that will recur.
LLM vs rdev: LLMs generate code; rdev executes deterministic operations (git, lint, deploy). Never rely on LLMs for runbook tasks.
Pod git ops: Git operations run inside pods via PodGitOperations (kubectl exec), never locally.
No dead code: Delete unused code immediately. Don't leave "might use later" exports.
KUBECONFIG: ALWAYS set export KUBECONFIG=~/.kube/orchard9-k3sf.yaml before kubectl commands
Container builds: NEVER build Docker images locally. ALWAYS use git push origin main to trigger Woodpecker CI which builds via in-cluster Kaniko. Local Docker builds produce wrong architecture (arm64 vs amd64). If an image is missing from registry.threesix.ai, push to origin — don't improvise.
Hexagonal: Domain models in internal/domain/ must have ZERO external dependencies
Ports: All adapters implement interfaces from internal/port/
Migrations: NEVER modify committed migrations. Create NEW ones.
500-line limit: Files exceeding 500 lines must be split
Tests: All handlers and services require tests
No fallbacks: NEVER design "try X, fall back to Y" flows — fix X. Fallbacks hide errors and deliver inferior experiences.
Multi-step ops: NEVER log-and-continue after partial failure. Rollback or document partial state.
Logging: Use logging.FromContext(ctx) or injected *slog.Logger. NEVER fmt.Println, log.Fatal, log.Printf, or bare slog.Info(). Error key is ALWAYS "error" (not "err"). Use field constants from internal/logging/fields.go (e.g., logging.FieldProjectID, logging.FieldError). Log once at boundary (handlers/workers log, services return errors). Sensitive data (passwords, tokens, keys) is auto-redacted.
HTTP clients: NEVER create &http.Client{} without a Timeout field. All HTTP clients must have explicit timeouts (30s standard, 5s for health checks). A bare client can hang indefinitely.
Config: Use envutil.GetEnv() / GetEnvInt() / GetEnvBool() from internal/envutil for all env var reads with defaults. NEVER define local getEnv helpers — they duplicate and drift. Raw os.Getenv() is fine for required values with no default (secrets, passwords).
Handler timeouts: NEVER use inline time.Duration in context.WithTimeout inside handlers. Use constants from internal/handlers/timeouts.go: TimeoutFastLookup (5s), TimeoutLookup (10s), TimeoutStandard (30s), TimeoutHeavyWrite (60s), TimeoutOrchestration (90s), TimeoutLongRunning (10m).
Worker timeouts: NEVER use inline time.Duration in context.WithTimeout inside worker code. Use constants from internal/worker/timeouts.go: TimeoutQuickOp (5s), TimeoutHealthCheck (10s), TimeoutMaintenance (30s), TimeoutWorkExecution (10m).
Response helpers: Use api.WriteUnauthorized, api.WriteForbidden, api.WriteBadRequest, api.WriteNotFound, api.WriteInternalError instead of bare api.WriteError with status codes. Only use api.WriteError directly for custom error codes (e.g., KEY_REVOKED, IP_NOT_ALLOWED).
Auth scopes: EVERY route in a handler's Mount() function MUST use r.With(auth.RequireScope(...)). Use ScopeProjectsRead for GET endpoints, ScopeProjectsExecute for mutation endpoints. Use the appropriate domain scope (e.g., ScopeQueueRead, ScopeBuildWrite) when available. Admin-only endpoints use auth.ScopeAdmin alone. See internal/handlers/builds.go for the canonical pattern.
JSON decoding: ALWAYS use api.DecodeJSON(r, &req) to decode request bodies. NEVER use raw json.NewDecoder(r.Body).Decode(). The helper handles nil body, EOF, and returns typed errors. Decode error message is always "invalid request body".
Validation: Use validate.New() accumulator for 2+ field checks in handlers: v := validate.New(); v.Required(req.Name, "name"); v.Required(req.Type, "type"); if err := v.Error() { ... }. Single-field checks can stay inline. NEVER duplicate validation logic that exists in internal/validate.
Error wrapping: ALWAYS use %w (not %v) when wrapping errors in fmt.Errorf. Using %v stringifies the error and breaks errors.Is/errors.As chains. For non-error types (structs, slices), create a typed error implementing error instead of stringifying with %v.
Context propagation: NEVER use context.Background() in handlers, services, or adapters that receive a context parameter. Always derive from parent context. Use context.WithoutCancel(ctx) for fire-and-forget goroutines that need tracing but independent cancellation.
Cookbooks: Load .claude/skills/cookbook-trees/SKILL.md before writing/modifying any cookbook tree.
Version alignment: Skeleton templates MUST use consistent versions across all files: Go 1.25 (go.work, go.mod, Dockerfiles, CI images), Node 20, Alpine 3.19. When updating a version, grep the entire templates/ tree and update ALL occurrences to prevent drift.

Quick Reference

# Environment (should already be in ~/.zshrc)
export KUBECONFIG=~/.kube/orchard9-k3sf.yaml
export RDEV_API_URL="https://rdev.masq-ops.orchard9.ai"
export RDEV_API_KEY="<your-api-key>"  # Already set in ~/.zshrc

# Verify environment is loaded
echo $RDEV_API_KEY  # Should print a base64 string
# If empty: source ~/.zshrc

# For scripts: use cookbooks/scripts/common.sh library
# Provides: api_call(), wait_for_build(), wait_for_pipeline(), wait_for_site()
# Example: source "$(dirname "$0")/common.sh" && api_call GET "/health"

# Run locally
go run ./cmd/rdev-api

# Run tests
go test ./...

# Automated deploy (push triggers Woodpecker CI)
git push origin main  # Builds and deploys automatically via Woodpecker

# Manual deploy (if Woodpecker unavailable)
kubectl apply -f deployments/k8s/base/rdev-api.yaml
kubectl rollout restart -n rdev deployment/rdev-api

# Images are at registry.threesix.ai/rdev/{api,worker,claudebox}

# Verify pods
kubectl get pods -n rdev

# View logs
./scripts/logs.sh           # Last 100 lines
./scripts/logs.sh -f        # Follow/stream
./scripts/logs.sh -n 500    # Last 500 lines
./scripts/logs.sh -e        # Errors only
./scripts/logs.sh -p        # Previous crashed container

# Shell aliases (after source ~/.zshrc)
rdev-logs                   # Last 100 lines
rdev-logs-f                 # Follow/stream
rdev-pods                   # List pods

# API calls - use cookbook test scripts (they handle auth via common.sh)
./cookbooks/scripts/landing-test.sh run|status|teardown <name>
./cookbooks/scripts/tree-runner.sh run <tree-name> --project-name <name>

# Or direct API calls (requires env vars above)
curl -s -H "X-API-Key: $RDEV_API_KEY" "$RDEV_API_URL/health" | jq
curl -s -H "X-API-Key: $RDEV_API_KEY" "$RDEV_API_URL/projects" | jq

Architecture Overview

cmd/rdev-api/          # Entry point, DI, OpenAPI spec
cmd/sdlc/              # SDLC CLI binary (runs inside project pods)
internal/
├── sdlc/              # SDLC library (types, classifier, state I/O)
├── domain/            # Pure business models (no deps)
├── port/              # Interface contracts
├── service/           # Business logic orchestration
├── handlers/          # HTTP handlers (REST endpoints)
├── adapter/           # Infrastructure implementations
│   ├── kubernetes/    # K8s client, pod executor
│   ├── postgres/      # Audit, queue, webhooks, credentials
│   ├── cockroach/     # Database provisioning (project DBs)
│   ├── redis/         # Cache provisioning via ACLs
│   ├── gitea/         # Git repository management
│   ├── cloudflare/    # DNS provider
│   └── woodpecker/    # CI provider
├── auth/              # API key auth, scopes
├── middleware/        # Rate limiting
├── worker/            # Background queue processor
└── webhook/           # Event dispatcher
pkg/api/               # HTTP framework (app, responses)
deployments/k8s/       # Kustomize manifests
  └── base/templates/  # Project templates
scripts/               # Operational scripts
  ├── load-credentials.sh  # Load secrets to rdev-api
  ├── release.sh           # Build, tag, push releases
  └── logs.sh              # View rdev-api logs
cookbooks/             # End-to-end workflow guides
  ├── landing-page.md      # Landing page deployment flow
  └── scripts/             # Executable cookbook scripts

Key Concepts

Projects: Kubernetes pods with Claude Code, discovered by label rdev.orchard9.ai/project=true
Workers: Shared claudebox pods that execute any project's tasks, labeled rdev.orchard9.ai/role=worker
Work Queue: Async task queue for build/test/deploy jobs
Credentials: Infrastructure secrets (tokens, keys) stored encrypted in PostgreSQL
Commands: Claude/shell/git commands executed via kubectl exec, streamed via SSE
API Keys: Scoped auth with project restrictions, IP filtering, expiration
Webhooks: Event subscriptions with retry delivery
Templates: Project scaffolding with .woodpecker.yml, .claude/, and stack files

threesix.ai Platform Status

Feature	Status	Description
Woodpecker Auto-Activation	Done	CI enabled on project creation via SDK
Project Templates	Done	Embedded templates (astro-landing, go-api, default)
Work Queue	Done	PostgreSQL with atomic dequeue, retry logic
Multi-Provider Agents	Done	Claude Code + OpenCode via registry
Webhooks	Done	Event dispatcher with retry delivery
Embedded Worker	Done	Goroutine in rdev-api, polls queue
Multi-Domain Support	Done	Auto-slugs, custom subdomains, DNS aliases
Build Event Streaming	Done	Real-time SSE/WebSocket for build output
Database Provisioning	Done	CockroachDB adapter with auto-provisioning
Cache Provisioning	Done	Redis ACL-based adapter with auto-provisioning
Build Orchestration	Planned	Structured build specs via API
SDLC Orchestration	Done	Deterministic feature lifecycle with classifier engine, API, orchestrator, and 15 skeleton commands
Composable Monorepo Templates	Done	Monorepo skeleton + component templates (service, worker, app-astro, app-react, cli)
Visual Verification	Planned	Playwright screenshots/video + AI evaluation for feature completeness
Checkout/Checkin	Done	Sidecar dev flow: temporary git tokens, branch checkout, review on checkin
Interactive Remote Dev	Done	Sessions with pod binding, command execution, SSE streaming, ephemeral preview URLs

Current Version: v0.10.25

Constraints

ON-PREM k3s - not GKE, always set KUBECONFIG
Kustomize only - no ArgoCD
chi/v5 router - no gin, echo, or other frameworks
sqlx for DB - no GORM
slog for logging - no logrus, zap
