- Add ListPipelines/GetPipeline to CIProvider port with Woodpecker adapter
- Add DNS alias endpoints: GET/POST/DELETE /projects/{id}/domains
- Implement worker executor daemon, build executor, and git operations
- Add build service, worker service, and build audit tracking
- Add worker registry with PostgreSQL adapter and migration
- Add multi-provider code agent interface (Claude Code + OpenCode)
- Add create-and-build combo endpoint
- Update landing-page cookbook to reflect all gaps closed
- Fix tech debt: unified validation, auth scopes, error wrapping, slog patterns
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Worker Pool
Last Updated: 2026-01-27 Confidence: High
Summary
A shared worker pool that executes build tasks for any project. It currently runs as an embedded WorkExecutor daemon inside rdev-api. Workers register with the worker registry, poll the work queue for tasks, execute Claude Code in cloned repos via GitOperations, and report results with audit trails.
Key Facts:
- Embedded WorkExecutor daemon runs inside rdev-api process
- Workers poll work queue every 5 seconds, heartbeat every 30 seconds
- Stale workers (no heartbeat for 2 minutes) automatically marked offline by QueueMaintenance
- Stale tasks (running >30 min without completion) automatically requeued
- Old tasks (>7 days) automatically cleaned up
- Queue depth and worker counts exported as Prometheus metrics
- Future: external worker binary for separate pod deployment
File Pointers:
- Domain: internal/domain/worker.go (Worker, WorkerStatus)
- Domain: internal/domain/build.go (BuildSpec, BuildResult)
- Port: internal/port/worker_registry.go (WorkerRegistry interface)
- Port: internal/port/build_audit.go (BuildAudit interface)
- Adapter: internal/adapter/postgres/worker_registry.go
- Adapter: internal/adapter/postgres/build_audit.go
- Service: internal/service/worker_service.go
- Service: internal/service/build_service.go
- Executor: internal/worker/work_executor.go (poll loop, heartbeat, task routing)
- Executor: internal/worker/build_executor.go (BuildSpec→AgentRequest)
- Git: internal/worker/git_operations.go (clone, commit, push)
- Maintenance: internal/worker/queue_maintenance.go (stale recovery, cleanup, metrics)
- Handler: internal/handlers/workers.go (REST API for workers)
- Handler: internal/handlers/builds.go (REST API for builds)
- Handler: internal/handlers/create_and_build.go (combined create+build)
- Migration: internal/db/migrations/012_worker_registry.sql
Worker Lifecycle (Embedded)
- rdev-api starts → WorkExecutor registers as worker in registry
- Heartbeat loop: every 30s sends heartbeat via WorkerService
- Poll loop: every 5s dequeues next task from work queue
- BuildExecutor: clones repo, executes CodeAgent, commits/pushes if auto_commit
- Reports completion with BuildResult via WorkerService
- Graceful shutdown: deregisters worker on rdev-api stop
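The lifecycle above can be sketched in Go roughly as follows. This is a minimal illustration, not the actual WorkExecutor: the `Task`, `Registry`, `Queue`, and `Executor` types, the `RunWorker` function, and the in-memory `memBackend` fake are all hypothetical names introduced here. The demo uses millisecond intervals in place of the documented 5s poll and 30s heartbeat so that it finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Hypothetical stand-ins for the real rdev-api types (assumptions, not the actual API).
type Task struct{ ID string }

type Registry interface {
	Register(workerID string) error
	Heartbeat(workerID string) error
	Deregister(workerID string) error
}

type Queue interface {
	Dequeue(ctx context.Context) (*Task, error) // (nil, nil) when empty
}

type Executor func(ctx context.Context, t *Task) error

// RunWorker registers the worker, heartbeats in its own goroutine so a long
// build cannot starve it, polls for tasks, and deregisters on shutdown.
func RunWorker(ctx context.Context, id string, reg Registry, q Queue,
	exec Executor, pollEvery, beatEvery time.Duration) error {
	if err := reg.Register(id); err != nil {
		return fmt.Errorf("register worker %s: %w", id, err)
	}
	defer reg.Deregister(id) // deferred first, so it runs last

	beatDone := make(chan struct{})
	go func() {
		defer close(beatDone)
		t := time.NewTicker(beatEvery)
		defer t.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-t.C:
				_ = reg.Heartbeat(id) // a real worker would log failures
			}
		}
	}()
	defer func() { <-beatDone }() // wait for the heartbeat loop to exit

	poll := time.NewTicker(pollEvery)
	defer poll.Stop()
	for {
		select {
		case <-ctx.Done():
			return nil
		case <-poll.C:
			task, err := q.Dequeue(ctx)
			if err != nil || task == nil {
				continue // empty queue or transient error: try next tick
			}
			_ = exec(ctx, task) // result reporting omitted in this sketch
		}
	}
}

// memBackend is an in-memory fake implementing both Registry and Queue.
// No mutex: in this demo each field is only touched by one goroutine at a time.
type memBackend struct {
	tasks        []Task
	beats        int
	deregistered bool
}

func (m *memBackend) Register(string) error   { return nil }
func (m *memBackend) Heartbeat(string) error  { m.beats++; return nil }
func (m *memBackend) Deregister(string) error { m.deregistered = true; return nil }

func (m *memBackend) Dequeue(context.Context) (*Task, error) {
	if len(m.tasks) == 0 {
		return nil, nil
	}
	t := m.tasks[0]
	m.tasks = m.tasks[1:]
	return &t, nil
}

// runDemo drives RunWorker for 300ms with two queued tasks.
func runDemo() (processed []string, beats int, deregistered bool) {
	m := &memBackend{tasks: []Task{{ID: "build-1"}, {ID: "build-2"}}}
	ctx, cancel := context.WithTimeout(context.Background(), 300*time.Millisecond)
	defer cancel()
	exec := func(_ context.Context, t *Task) error { processed = append(processed, t.ID); return nil }
	_ = RunWorker(ctx, "worker-1", m, m, exec, 5*time.Millisecond, 20*time.Millisecond)
	return processed, m.beats, m.deregistered
}

func main() {
	fmt.Println(runDemo())
}
```

Running the heartbeat on a separate goroutine is a design choice in this sketch: with a single `select` loop, a build that takes longer than the 2-minute heartbeat threshold would cause the worker to be marked offline while it is still working.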
Worker Statuses
- idle: available for new tasks
- busy: currently executing a task
- draining: not accepting new tasks (pre-shutdown)
- offline: missed heartbeat threshold
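One way the four statuses might be modeled in Go is a string-backed type with a validating parser. The `WorkerStatus` type name appears in the file pointers, but the constant names and the `AcceptsTasks` and `ParseWorkerStatus` helpers below are assumptions made for illustration.

```go
package main

import "fmt"

// WorkerStatus mirrors the four documented statuses.
// Constant names are assumptions; the real domain package may differ.
type WorkerStatus string

const (
	StatusIdle     WorkerStatus = "idle"
	StatusBusy     WorkerStatus = "busy"
	StatusDraining WorkerStatus = "draining"
	StatusOffline  WorkerStatus = "offline"
)

// AcceptsTasks reports whether a worker in this status may be handed new
// work. Only idle workers qualify: busy workers are occupied, draining
// workers are shutting down, and offline workers missed their heartbeats.
func (s WorkerStatus) AcceptsTasks() bool { return s == StatusIdle }

// ParseWorkerStatus validates an untrusted string, for example a database
// column or a query parameter.
func ParseWorkerStatus(raw string) (WorkerStatus, error) {
	switch s := WorkerStatus(raw); s {
	case StatusIdle, StatusBusy, StatusDraining, StatusOffline:
		return s, nil
	default:
		return "", fmt.Errorf("unknown worker status %q", raw)
	}
}

func main() {
	for _, raw := range []string{"idle", "draining", "paused"} {
		s, err := ParseWorkerStatus(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s accepts tasks: %t\n", s, s.AcceptsTasks())
	}
}
```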
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /workers | List all workers with status summary |
| GET | /workers/{workerId} | Get worker details |
| POST | /workers/{workerId}/drain | Set worker to draining |
| POST | /projects/{id}/builds | Start build for project |
| GET | /projects/{id}/builds | List builds for project |
| GET | /builds/{taskId} | Get build status |
| POST | /project/create-and-build | Create project + start build |
Queue Maintenance
The QueueMaintenance worker runs inside rdev-api alongside the WorkExecutor:
- Stale task recovery (every 1m): Requeues tasks running >30m without completion
- Stale worker marking (every 1m): Marks workers offline after 2m without heartbeat
- Old task cleanup (every 1m): Removes completed/failed/cancelled tasks >7 days old
- Metrics refresh (every 15s): Updates Prometheus gauges for queue depth and worker counts
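
The thresholds above can be expressed as pure decision functions, which keeps a maintenance pass easy to test. The following is a sketch under assumptions: the constant names, the `TaskStatus` and `Action` types, and the `TaskAction` and `WorkerIsStale` helpers are hypothetical, and only the durations and the completed/failed/cancelled states come from this document.

```go
package main

import (
	"fmt"
	"time"
)

// Thresholds taken from the list above; the identifier names are assumptions.
const (
	StaleTaskAfter   = 30 * time.Minute   // running longer than this: requeue
	StaleWorkerAfter = 2 * time.Minute    // no heartbeat for this long: offline
	TaskRetention    = 7 * 24 * time.Hour // finished longer ago than this: delete
)

// TaskStatus and Action are hypothetical; the real domain types may differ.
type TaskStatus string
type Action string

const (
	TaskRunning   TaskStatus = "running"
	TaskCompleted TaskStatus = "completed"
	TaskFailed    TaskStatus = "failed"
	TaskCancelled TaskStatus = "cancelled"

	Keep    Action = "keep"
	Requeue Action = "requeue"
	Delete  Action = "delete"
)

// TaskAction decides what one maintenance pass does with a task. age is the
// time since the task started (running) or finished (terminal states).
func TaskAction(status TaskStatus, age time.Duration) Action {
	switch status {
	case TaskRunning:
		if age > StaleTaskAfter {
			return Requeue
		}
	case TaskCompleted, TaskFailed, TaskCancelled:
		if age > TaskRetention {
			return Delete
		}
	}
	return Keep
}

// WorkerIsStale reports whether a worker should be marked offline, given
// the time elapsed since its last heartbeat.
func WorkerIsStale(sinceHeartbeat time.Duration) bool {
	return sinceHeartbeat > StaleWorkerAfter
}

func main() {
	now := time.Now()
	lastBeat := now.Add(-3 * time.Minute)

	fmt.Println(TaskAction(TaskRunning, 45*time.Minute))   // requeue
	fmt.Println(TaskAction(TaskCompleted, 8*24*time.Hour)) // delete
	fmt.Println(TaskAction(TaskFailed, time.Hour))         // keep
	fmt.Println(WorkerIsStale(now.Sub(lastBeat)))          // true
}
```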