- Add ListPipelines/GetPipeline to CIProvider port with Woodpecker adapter
- Add DNS alias endpoints: GET/POST/DELETE /projects/{id}/domains
- Implement worker executor daemon, build executor, and git operations
- Add build service, worker service, and build audit tracking
- Add worker registry with PostgreSQL adapter and migration
- Add multi-provider code agent interface (Claude Code + OpenCode)
- Add create-and-build combo endpoint
- Update landing-page cookbook to reflect all gaps closed
- Fix tech debt: unified validation, auth scopes, error wrapping, slog patterns
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
# Worker Executor Implementation Plan

> Close the last gap in the landing page cookbook: automated code generation via the worker pool.

## Context

The work queue, worker registry, build audit, and code agent systems are **all implemented**. The single missing piece is a **work executor**: a background loop that consumes queued tasks and executes them via a code agent. This is analogous to the existing `QueueProcessor` (which processes per-project command queue tasks), but for the generic `WorkQueue` (cross-project worker pool tasks).

### What Already Exists

| Component | File | Status |
|-----------|------|--------|
| Work queue (PostgreSQL) | `internal/adapter/postgres/work_queue.go` | Done |
| Worker registry (PostgreSQL) | `internal/adapter/postgres/worker_registry.go` | Done |
| Build audit (PostgreSQL) | `internal/adapter/postgres/build_audit.go` | Done |
| WorkService (enqueue/dequeue/complete/fail) | `internal/service/work_service.go` | Done |
| WorkerService (claim/complete/health) | `internal/service/worker_service.go` | Done |
| BuildService (start/status/complete) | `internal/service/build_service.go` | Done |
| WorkHandler (REST API) | `internal/handlers/work.go` | Done |
| AgentsHandler (REST API) | `internal/handlers/agents.go` | Done |
| CodeAgent interface | `internal/port/code_agent.go` | Done |
| Domain models (WorkTask, Worker, BuildSpec) | `internal/domain/` | Done |
| Command QueueProcessor (reference pattern) | `internal/worker/queue_processor.go` | Done |

### What's Missing

| Gap | Priority |
|-----|----------|
| Work executor daemon (poll loop) | Critical |
| BuildSpec → AgentRequest translation | Critical |
| Git clone/commit/push in executor | Critical |
| Git credential resolution for cross-project | High |
| Worker management REST endpoints | Medium |
| DNS alias endpoint | Medium |
| Create-and-build endpoint | Medium |
| Woodpecker build status proxy | Low |

---
## Week 1: Work Executor Core

**Goal:** A background loop that claims tasks from the work queue and executes them via a code agent. By end of week, `POST /work/enqueue` → task claimed → agent executes → result recorded.

### Tasks

1. **Create `internal/worker/work_executor.go`**
   - Follow the `QueueProcessor` pattern from `queue_processor.go`
   - Poll loop: calls `WorkerService.ClaimTask(workerID)` on a ticker
   - On task claim: route to appropriate handler based on `task.Type`
   - On completion: call `WorkerService.CompleteTask(workerID, taskID, result)`
   - On failure: call `WorkService.FailTask(taskID, errMsg)` (handles retry logic)
   - Graceful shutdown via context cancellation
   - Self-registers as a worker via `WorkerService.Register()` on start
   - Sends heartbeats via `WorkerService.Heartbeat()` on a 30s ticker

2. **Create `internal/worker/build_executor.go`**
   - Handles `WorkTaskTypeBuild` tasks specifically
   - Extracts `BuildSpec` fields from `WorkTask.Spec` (map[string]any → typed fields)
   - Translates `BuildSpec.Prompt` into `domain.AgentRequest`
   - Calls `CodeAgent.Execute()` with event streaming
   - Collects output, files changed, duration into `domain.BuildResult`
   - Returns `BuildResult` to the work executor

3. **Wire into `cmd/rdev-api/main.go`**
   - Create `WorkExecutor` alongside existing `QueueProcessor`
   - Inject: `WorkerService`, `BuildService`, `CodeAgentRegistry`
   - Start on boot, stop on shutdown
   - Worker ID: hostname or pod name (from `HOSTNAME` env var)

4. **Create `internal/worker/work_executor_test.go`**
   - Test: executor starts and registers as a worker
   - Test: executor claims a task and routes to build handler
   - Test: build handler translates spec and calls code agent
   - Test: results are recorded via CompleteTask
   - Test: failures trigger FailTask with retry
   - Test: graceful shutdown stops the poll loop
   - Use mock implementations of ports

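The claim, route, and record cycle from task 1 can be sketched as follows. This is a minimal illustration, not the planned implementation: the `Queue` interface, the `Task` type, and every signature here are assumptions standing in for `WorkerService`, `WorkService`, and `domain.WorkTask`, whose real shapes are not shown in this plan. Registration and heartbeats are left out to keep the sketch short.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Task is a hypothetical stand-in for domain.WorkTask.
type Task struct{ ID, Type string }

// Queue is an assumed subset of WorkerService and WorkService; real signatures may differ.
type Queue interface {
	ClaimTask(ctx context.Context, workerID string) (*Task, error) // nil task: queue empty
	CompleteTask(ctx context.Context, workerID, taskID, result string) error
	FailTask(ctx context.Context, taskID, errMsg string) error
}

// Handler executes one task and returns its result.
type Handler func(ctx context.Context, t *Task) (string, error)

// pollOnce claims at most one task, routes it by type, and records the outcome.
func pollOnce(ctx context.Context, q Queue, workerID string, handlers map[string]Handler) (bool, error) {
	task, err := q.ClaimTask(ctx, workerID)
	if err != nil || task == nil {
		return false, err
	}
	handle, ok := handlers[task.Type]
	if !ok {
		return true, q.FailTask(ctx, task.ID, "no handler for task type "+task.Type)
	}
	result, err := handle(ctx, task)
	if err != nil {
		return true, q.FailTask(ctx, task.ID, err.Error())
	}
	return true, q.CompleteTask(ctx, workerID, task.ID, result)
}

// run polls on a ticker until ctx is cancelled (graceful shutdown).
func run(ctx context.Context, q Queue, workerID string, handlers map[string]Handler, every time.Duration) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if _, err := pollOnce(ctx, q, workerID, handlers); err != nil {
				fmt.Println("poll error:", err)
			}
		}
	}
}

// fakeQueue is an in-memory Queue used only for this demonstration.
type fakeQueue struct {
	pending           []*Task
	completed, failed []string
}

func (f *fakeQueue) ClaimTask(context.Context, string) (*Task, error) {
	if len(f.pending) == 0 {
		return nil, nil
	}
	t := f.pending[0]
	f.pending = f.pending[1:]
	return t, nil
}

func (f *fakeQueue) CompleteTask(_ context.Context, _, taskID, _ string) error {
	f.completed = append(f.completed, taskID)
	return nil
}

func (f *fakeQueue) FailTask(_ context.Context, taskID, _ string) error {
	f.failed = append(f.failed, taskID)
	return nil
}

// demo drains two tasks: one build succeeds, one has no registered handler.
func demo() *fakeQueue {
	q := &fakeQueue{pending: []*Task{{ID: "t1", Type: "build"}, {ID: "t2", Type: "deploy"}}}
	handlers := map[string]Handler{
		"build": func(_ context.Context, t *Task) (string, error) { return "built " + t.ID, nil },
	}
	for {
		if claimed, err := pollOnce(context.Background(), q, "worker-1", handlers); err != nil || !claimed {
			return q
		}
	}
}

func main() {
	q := demo()
	fmt.Println("completed:", q.completed, "failed:", q.failed)
}
```

Splitting the single-iteration logic (`pollOnce`) from the ticker loop (`run`) is a design choice of this sketch: it lets the tests listed in task 4 drive one iteration at a time against a mock, without waiting on timers.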
### Deliverables

- `POST /work/enqueue` with a build task → executor picks it up → agent runs → result in `GET /work/{taskId}`
- Worker visible in registry during execution
- Build audit entry created with spec and result

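For the spec recorded in the audit entry to be trustworthy, the `map[string]any` → typed fields step from task 2 needs to reject malformed input rather than default silently. The sketch below shows one way to do that. The `BuildSpec` fields and the `"prompt"` key are assumptions; only `auto_commit` and `auto_push` are named elsewhere in this plan, and the real `domain.BuildSpec` may carry more fields.

```go
package main

import (
	"errors"
	"fmt"
)

// BuildSpec is a hypothetical typed view of WorkTask.Spec. The keys
// "auto_commit" and "auto_push" follow the plan; "prompt" is an assumed key.
type BuildSpec struct {
	Prompt     string
	AutoCommit bool
	AutoPush   bool
}

// parseBuildSpec converts the untyped spec map into a BuildSpec, rejecting
// a missing prompt and wrongly typed flags instead of silently defaulting.
func parseBuildSpec(spec map[string]any) (BuildSpec, error) {
	prompt, ok := spec["prompt"].(string)
	if !ok || prompt == "" {
		return BuildSpec{}, errors.New("build spec: prompt must be a non-empty string")
	}
	autoCommit, err := optionalBool(spec, "auto_commit")
	if err != nil {
		return BuildSpec{}, err
	}
	autoPush, err := optionalBool(spec, "auto_push")
	if err != nil {
		return BuildSpec{}, err
	}
	return BuildSpec{Prompt: prompt, AutoCommit: autoCommit, AutoPush: autoPush}, nil
}

// optionalBool returns false when key is absent and an error when the
// value is present but not a bool.
func optionalBool(spec map[string]any, key string) (bool, error) {
	v, present := spec[key]
	if !present {
		return false, nil
	}
	b, ok := v.(bool)
	if !ok {
		return false, fmt.Errorf("build spec: %s must be a boolean, got %T", key, v)
	}
	return b, nil
}

func main() {
	spec := map[string]any{"prompt": "Add a pricing section", "auto_commit": true}
	parsed, err := parseBuildSpec(spec)
	fmt.Println(parsed, err)
}
```

Failing early here means a bad spec becomes a `FailTask` call with a readable message before any agent time is spent.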
### Files Created/Modified

| File | Action |
|------|--------|
| `internal/worker/work_executor.go` | Create |
| `internal/worker/build_executor.go` | Create |
| `internal/worker/work_executor_test.go` | Create |
| `cmd/rdev-api/main.go` | Modify (wire executor) |

---
## Week 2: Git Operations & Cross-Project Execution

**Goal:** The executor can clone any project's repo, run the agent in that directory, and push results back. By end of week, the full build cycle works: enqueue → clone → agent generates code → commit → push → CI triggers.

### Tasks

1. **Create `internal/worker/git_operations.go`**
   - `CloneRepo(ctx, gitURL, dir, token) error`: clone via HTTPS with token auth
   - `CommitAndPush(ctx, dir, message) (commitSHA string, filesChanged []string, err error)`
   - `ConfigureGit(dir, name, email)`: set git user for commits
   - Uses `os/exec` for git commands (same pattern as `kubernetes.Executor` uses for kubectl)
   - Workspace management: creates temp dir per task, cleans up after

2. **Add git credential resolution to `BuildExecutor`**
   - Option A (simplest): Use the Gitea token already in `InfraConfig.GiteaToken`
     - All project repos are in Gitea, so one token covers all repos
     - Pass token via HTTPS clone URL: `https://token@git.threesix.ai/org/repo.git`
   - Option B (per-project): Look up project's git URL from database, resolve credentials
   - **Recommendation:** Option A, since the Gitea token is already loaded and available

3. **Integrate git ops into `BuildExecutor`**
   - Before agent execution: clone the project's repo to a temp directory
   - Look up project git URL from database (add `ProjectStore` port or query directly)
   - After agent execution: if `auto_commit` is true, commit changes
   - After commit: if `auto_push` is true, push to remote
   - Capture `commit_sha` and `files_changed` in `BuildResult`

4. **Add project git URL lookup**
   - The `ProjectInfraService` stores git URLs in the database during `CreateProject`
   - Add a method to retrieve git info by project ID
   - Or: include `git_url` in the `WorkTask.Spec` at enqueue time (simpler, no extra lookup)

5. **Create `internal/worker/git_operations_test.go`**
   - Test: clone with token auth
   - Test: commit and push
   - Test: workspace cleanup on success and failure
   - Test: git URL construction with token

6. **Integration test**
   - Enqueue a build task with a real prompt
   - Verify agent executes in cloned repo
   - Verify commit is created (if auto_commit)
   - Verify push succeeds (if auto_push)
   - Verify BuildResult has correct fields

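Option A's token-in-URL form, and the "git URL construction with token" test in task 5, can be illustrated with `net/url` from the standard library. The helper name `authCloneURL` and the HTTPS-only check are assumptions of this sketch, not part of the plan.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// authCloneURL embeds token as the userinfo of an HTTPS git URL.
// Rejecting non-HTTPS URLs is an assumption of this sketch, meant to keep
// the token off plaintext or unexpected transports.
func authCloneURL(gitURL, token string) (string, error) {
	if token == "" {
		return "", errors.New("git token is empty")
	}
	u, err := url.Parse(gitURL)
	if err != nil {
		return "", fmt.Errorf("parse git URL: %w", err)
	}
	if u.Scheme != "https" || u.Host == "" {
		return "", fmt.Errorf("git URL must be absolute HTTPS, got %q", gitURL)
	}
	u.User = url.User(token) // reserved characters in the token are escaped by String()
	return u.String(), nil
}

func main() {
	cloneURL, err := authCloneURL("https://git.threesix.ai/org/repo.git", "token")
	fmt.Println(cloneURL, err)
}
```

Building the URL through `url.URL` rather than string concatenation avoids breakage if the token ever contains characters such as `@` or `/`. Note that git stores the remote URL, token included, in the clone's `.git/config`, which is one more reason to remove the workspace after each task.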
### Deliverables

- Full build cycle: enqueue → clone → execute → commit → push
- Git credentials resolved from infrastructure config
- Temp workspace created and cleaned per task
- Build audit shows commit SHA and files changed

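The per-task workspace deliverable comes down to pairing `os.MkdirTemp` with a deferred `os.RemoveAll`, so cleanup happens on both the success and failure paths. The following is a sketch under assumptions: `withWorkspace`, `exists`, the `rdev-build-` prefix, and `errAgent` are hypothetical names introduced only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

var errAgent = errors.New("agent failed") // sample error for the demonstration

// withWorkspace creates a temp directory for one task, runs fn inside it,
// and removes the directory whether fn succeeds or fails.
func withWorkspace(taskID string, fn func(dir string) error) error {
	dir, err := os.MkdirTemp("", "rdev-build-"+taskID+"-")
	if err != nil {
		return fmt.Errorf("create workspace: %w", err)
	}
	defer os.RemoveAll(dir)
	return fn(dir)
}

// exists reports whether path is present on disk.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	var used string
	err := withWorkspace("task-1", func(dir string) error {
		used = dir
		return errAgent // simulate a failed agent run
	})
	fmt.Println("error:", err, "workspace still exists:", exists(used))
}
```

Passing the clone, agent, and commit steps as `fn` keeps the cleanup in one place, so a new early return inside the build path cannot leak a directory.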
### Files Created/Modified

| File | Action |
|------|--------|
| `internal/worker/git_operations.go` | Create |
| `internal/worker/git_operations_test.go` | Create |
| `internal/worker/build_executor.go` | Modify (add git integration) |
| `internal/worker/work_executor.go` | Modify (pass git config) |
| `cmd/rdev-api/main.go` | Modify (pass gitea token to executor) |

---
## Week 3: API Enhancements

**Goal:** Add the REST endpoints that complete the platform experience. By end of week, users can create a project, enqueue a build, monitor CI status, and manage DNS, all through rdev-api.

### Tasks

1. **Worker management endpoints: `internal/handlers/workers.go`**
   - `GET /workers`: list all workers with status
   - `GET /workers/{id}`: get worker details
   - `POST /workers/{id}/drain`: drain a worker
   - Wire `WorkerService` into handler
   - Register in `cmd/rdev-api/main.go` and `openapi.go`

2. **Build management endpoints: `internal/handlers/builds.go`**
   - `POST /projects/{id}/builds`: enqueue a build (wraps `BuildService.StartBuild()`)
   - `GET /projects/{id}/builds`: list build history
   - `GET /projects/{id}/builds/{taskId}`: get build status
   - Simpler API than raw `/work/enqueue`: project-scoped, build-specific
   - Register in `cmd/rdev-api/main.go` and `openapi.go`

3. **DNS alias endpoint: `internal/handlers/infrastructure.go`**
   - `POST /projects/{id}/domains`: add DNS alias (A or CNAME record)
   - `GET /projects/{id}/domains`: list domains for project
   - `DELETE /projects/{id}/domains/{domain}`: remove alias
   - Uses existing Cloudflare adapter's `CreateRecord()` and `DeleteRecordByName()`
   - The adapter already supports full CRUD; it just needs a handler

4. **Woodpecker build status proxy: `internal/handlers/ci.go`**
   - `GET /projects/{id}/ci/pipelines`: list recent Woodpecker pipelines
   - `GET /projects/{id}/ci/pipelines/{number}`: get pipeline details
   - Add `ListPipelines()` and `GetPipeline()` to `port.CIProvider`
   - Implement in `internal/adapter/woodpecker/client.go` using Woodpecker SDK
   - Low priority; can defer if time is tight

5. **Create-and-build endpoint: `internal/handlers/project_management.go`**
   - `POST /project/create-and-build`
   - Request: `{ name, description, template, prompt, auto_push }`
   - Calls `ProjectInfraService.CreateProject()` then `BuildService.StartBuild()`
   - Returns project info + task ID
   - Trivial once executor is working

6. **Tests for all new handlers**
   - Follow existing patterns in `handlers/*_test.go`
   - Test request validation, success paths, error handling

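As an illustration of the request validation called for in task 6, the create-and-build request body from task 5 could be decoded and checked as below. The struct mirrors the five fields listed in the plan; which of them are required, and the rejection of unknown fields, are assumptions of this sketch rather than documented behaviour.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// CreateAndBuildRequest mirrors the request fields listed in the plan.
type CreateAndBuildRequest struct {
	Name        string `json:"name"`
	Description string `json:"description"`
	Template    string `json:"template"`
	Prompt      string `json:"prompt"`
	AutoPush    bool   `json:"auto_push"`
}

// decodeCreateAndBuild parses and validates a request body. Treating name,
// template, and prompt as required is an assumption of this sketch.
func decodeCreateAndBuild(body string) (CreateAndBuildRequest, error) {
	var req CreateAndBuildRequest
	dec := json.NewDecoder(strings.NewReader(body))
	dec.DisallowUnknownFields() // surface misspelled field names as errors
	if err := dec.Decode(&req); err != nil {
		return req, fmt.Errorf("decode request: %w", err)
	}
	required := []struct{ name, value string }{
		{"name", req.Name}, {"template", req.Template}, {"prompt", req.Prompt},
	}
	var missing []string
	for _, f := range required {
		if strings.TrimSpace(f.value) == "" {
			missing = append(missing, f.name)
		}
	}
	if len(missing) > 0 {
		return req, errors.New("missing required fields: " + strings.Join(missing, ", "))
	}
	return req, nil
}

func main() {
	body := `{"name":"landing","template":"astro-landing","prompt":"Add a hero section","auto_push":true}`
	req, err := decodeCreateAndBuild(body)
	fmt.Printf("%+v %v\n", req, err)
}
```

Validating before calling `ProjectInfraService.CreateProject()` matters for this endpoint in particular, because a request that fails only at the `StartBuild()` step would leave a created project with no build.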
### Deliverables

- `POST /projects/{id}/builds` as the clean API for code generation
- `GET /workers` for monitoring the worker pool
- `POST /projects/{id}/domains` for DNS aliases
- `POST /project/create-and-build` for the single-call flow
- All endpoints documented in `openapi.go`

### Files Created/Modified

| File | Action |
|------|--------|
| `internal/handlers/workers.go` | Create |
| `internal/handlers/workers_test.go` | Create |
| `internal/handlers/builds.go` | Create |
| `internal/handlers/builds_test.go` | Create |
| `internal/handlers/infrastructure.go` | Modify (add domain endpoints) |
| `internal/handlers/ci.go` | Create (if time) |
| `internal/handlers/project_management.go` | Modify (add create-and-build) |
| `internal/adapter/woodpecker/client.go` | Modify (add pipeline methods, if time) |
| `internal/port/ci.go` or port updates | Modify (add pipeline interface, if time) |
| `cmd/rdev-api/main.go` | Modify (wire new handlers) |
| `cmd/rdev-api/openapi.go` | Modify (add routes to spec) |

---
## Week 4: Polish, Validation & Observability

**Goal:** End-to-end validation of the cookbook flow. Observability for production operation. Documentation updated.

### Tasks

1. **End-to-end cookbook validation**
   - Run the landing page cookbook flow from start to finish
   - `POST /project` with `astro-landing` template
   - `POST /projects/landing/builds` with customization prompt
   - Monitor via `GET /work/{taskId}/status`
   - Verify CI triggers on push
   - Verify site is live at `https://landing.threesix.ai`
   - Fix any issues found during validation

2. **Stale task recovery**
   - Add periodic `RequeueStale()` call to the work executor
   - Requeue tasks where the worker crashed mid-execution
   - Add periodic `CleanupOld()` call to remove ancient completed tasks
   - These methods exist on `WorkQueue` but nothing calls them

3. **Observability additions**
   - Add metrics to work executor: tasks_claimed, tasks_completed, tasks_failed, execution_duration
   - Add metrics to worker service: workers_registered, workers_idle, workers_busy
   - Follow existing pattern in `internal/metrics/metrics.go`
   - Add work executor health to readiness check (`GET /ready`)

4. **Queue maintenance worker**
   - Create `internal/worker/queue_maintenance.go`
   - Runs on a slower ticker (every 5 minutes)
   - Calls `RequeueStale(ctx, 10*time.Minute)`: requeue tasks running > 10min with no heartbeat
   - Calls `CleanupOld(ctx, 7*24*time.Hour)`: prune tasks older than 7 days
   - Wire into main.go

5. **Update documentation**
   - Update `cookbooks/landing-page.md` with final validated flow
   - Update `ai-lookup/features/build-orchestration.md`
   - Update `ai-lookup/services/worker-pool.md`
   - Add `.claude/guides/services/build-orchestration.md` if needed

6. **Update CLAUDE.md roadmap**
   - Mark "Work Queue" as implemented
   - Mark "Worker Pool" as implemented
   - Mark "Build Orchestration" as implemented
   - Update "Bot Communication" status

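The queue maintenance worker from task 4 might look roughly like this. The `WorkQueue` interface below is an assumed shape: the plan gives the argument lists for `RequeueStale` and `CleanupOld`, but the `(int, error)` return values are a guess, and `recordingQueue` is a fake used only for the demonstration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// WorkQueue is an assumed shape for the two maintenance methods named in
// the plan; the (int, error) return values are a guess.
type WorkQueue interface {
	RequeueStale(ctx context.Context, olderThan time.Duration) (int, error)
	CleanupOld(ctx context.Context, olderThan time.Duration) (int, error)
}

const (
	staleAfter = 10 * time.Minute   // running this long with no heartbeat
	retention  = 7 * 24 * time.Hour // finished tasks older than this are pruned
)

// maintainOnce runs both steps; a requeue failure does not skip cleanup.
// It returns the first error encountered.
func maintainOnce(ctx context.Context, q WorkQueue) error {
	_, requeueErr := q.RequeueStale(ctx, staleAfter)
	_, cleanupErr := q.CleanupOld(ctx, retention)
	if requeueErr != nil {
		return fmt.Errorf("requeue stale: %w", requeueErr)
	}
	if cleanupErr != nil {
		return fmt.Errorf("cleanup old: %w", cleanupErr)
	}
	return nil
}

// runMaintenance calls maintainOnce on a ticker until ctx is cancelled.
func runMaintenance(ctx context.Context, q WorkQueue, every time.Duration) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if err := maintainOnce(ctx, q); err != nil {
				fmt.Println("queue maintenance:", err)
			}
		}
	}
}

// recordingQueue is a fake WorkQueue that logs each call for the demo.
type recordingQueue struct{ calls []string }

func (r *recordingQueue) RequeueStale(_ context.Context, d time.Duration) (int, error) {
	r.calls = append(r.calls, "RequeueStale("+d.String()+")")
	return 0, nil
}

func (r *recordingQueue) CleanupOld(_ context.Context, d time.Duration) (int, error) {
	r.calls = append(r.calls, "CleanupOld("+d.String()+")")
	return 0, nil
}

// demo runs a single maintenance pass against the fake queue.
func demo() ([]string, error) {
	q := &recordingQueue{}
	err := maintainOnce(context.Background(), q)
	return q.calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err)
}
```

In production wiring, `runMaintenance` would be started with `5*time.Minute` as the interval, matching the ticker described in task 4.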
### Deliverables

- Cookbook flow works end-to-end without manual intervention (except code generation prompt)
- Stale task recovery running in production
- Metrics visible in `/metrics` endpoint
- All documentation reflects actual capabilities

### Files Created/Modified

| File | Action |
|------|--------|
| `internal/worker/queue_maintenance.go` | Create |
| `internal/metrics/metrics.go` | Modify (add work executor metrics) |
| `internal/handlers/health.go` | Modify (add executor health) |
| `cookbooks/landing-page.md` | Modify (final validation) |
| `ai-lookup/features/build-orchestration.md` | Modify |
| `ai-lookup/services/worker-pool.md` | Modify |
| `CLAUDE.md` | Modify (update roadmap) |
| `cmd/rdev-api/main.go` | Modify (wire maintenance worker) |

---
## Risk & Dependencies

| Risk | Mitigation |
|------|-----------|
| CodeAgent execution in a temp directory (not a K8s pod) may not work the same as in-pod execution | Test early in Week 1; fallback is to kubectl exec into a worker pod |
| Gitea token may lack permissions for new repos created by different users | Test with actual token; all repos should be in the same org |
| Agent execution may take longer than expected (10+ minutes for complex prompts) | Make timeout configurable; increase default |
| Worker process crash loses in-flight task | Stale requeue (Week 4) handles this automatically |
| 500-line file limit may require splitting new files | Plan for split from the start; `work_executor.go` + `build_executor.go` + `git_operations.go` keeps things modular |

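The "make timeout configurable" mitigation in the table above could be as small as the helper below. It is a sketch only: the 15-minute default is a hypothetical value (the plan does not state one), and where the raw string comes from, such as an environment variable or config field, is left open.

```go
package main

import (
	"fmt"
	"time"
)

// defaultAgentTimeout is a hypothetical default; the plan only says the
// default should be raised and made configurable.
const defaultAgentTimeout = 15 * time.Minute

// agentTimeout parses a duration such as "20m" and falls back to the
// default when raw is empty, malformed, or not positive.
func agentTimeout(raw string) time.Duration {
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return defaultAgentTimeout
	}
	return d
}

func main() {
	for _, raw := range []string{"", "20m", "bogus", "-5s"} {
		fmt.Printf("%q -> %s\n", raw, agentTimeout(raw))
	}
}
```

The resulting duration would typically be applied with `context.WithTimeout` around the agent call. Whatever value is chosen should stay consistent with the 10-minute stale threshold planned for Week 4, since a task that legitimately runs longer than that threshold needs heartbeats to avoid being requeued.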
## Architecture Decision: In-Process vs External Worker

The plan above implements the executor **in-process** (running inside the rdev-api binary). This is simpler and matches the existing `QueueProcessor` pattern. The alternative would be a separate worker binary, which would allow independent scaling. The in-process approach is the right starting point; it can be extracted into a separate binary later if scaling requires it.

## Summary

| Week | Focus | Key Deliverable |
|------|-------|----------------|
| 1 | Work executor core | Tasks flow from queue → agent → result |
| 2 | Git operations | Clone → execute → commit → push cycle |
| 3 | API enhancements | Build, worker, DNS, create-and-build endpoints |
| 4 | Polish & validation | E2E cookbook flow, observability, docs |