rdev/ai-lookup/services/worker-pool.md
jordan 39df51defd feat: Add multi-provider code agent interface with Claude Code and OpenCode adapters
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 09:25:51 -07:00


# Worker Pool
**Last Updated:** 2025-01
**Confidence:** High (Planned - see address-the-gaps.md)
## Summary
A shared pool of claudebox workers (3-5 pods), any of which can build any project. Workers register, send heartbeats, and poll for tasks. Capacity scales horizontally by adding workers, independent of the number of projects.
**Key Facts:**
- Workers labeled `rdev.orchard9.ai/role=worker`
- StatefulSet: `claudebox-worker` with 3+ replicas
- Each worker has dedicated PVC for workspace
- Workers poll rdev-api for tasks every 5 seconds
- Health tracked via heartbeat endpoint
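
One way a status could be derived from heartbeats, as a minimal sketch. The 5-second interval comes from the facts above; the three-missed-beats tolerance and the names `HeartbeatInterval`, `MissedBeatsAllowed`, and `StatusFor` are assumptions for illustration and are not taken from rdev.

```go
package main

import (
	"fmt"
	"time"
)

// HeartbeatInterval matches the 5-second cadence described above.
const HeartbeatInterval = 5 * time.Second

// MissedBeatsAllowed is an assumed tolerance, not a documented rdev value.
const MissedBeatsAllowed = 3

// StatusFor derives a status string from the time since the last heartbeat.
// A worker whose heartbeat is too old is "unhealthy" even if it holds a task.
func StatusFor(sinceLastSeen time.Duration, currentTask string) string {
	if sinceLastSeen > MissedBeatsAllowed*HeartbeatInterval {
		return "unhealthy"
	}
	if currentTask != "" {
		return "busy"
	}
	return "idle"
}

func main() {
	fmt.Println(StatusFor(2*time.Second, ""))
	fmt.Println(StatusFor(2*time.Second, "task-1"))
	fmt.Println(StatusFor(time.Minute, "task-1"))
}
```

Keeping the threshold a multiple of the heartbeat interval means a single dropped heartbeat does not flip a worker to unhealthy.
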
**File Pointers:**
- Port: `internal/port/worker_registry.go`
- Adapter: `internal/adapter/postgres/worker_registry.go`
- Handler: `internal/handlers/workers.go`
- K8s manifest: `deployments/k8s/base/claudebox-worker.yaml`
## Port Interface
```go
// WorkerRegistry tracks the workers in the shared pool.
type WorkerRegistry interface {
	Register(ctx context.Context, worker WorkerInfo) error
	Heartbeat(ctx context.Context, workerID string) error
	Deregister(ctx context.Context, workerID string) error
	ListActive(ctx context.Context) ([]WorkerInfo, error)
}

// WorkerInfo describes a single registered worker.
type WorkerInfo struct {
	ID          string
	PodName     string
	Namespace   string
	Status      string // "idle", "busy", "unhealthy"
	LastSeen    time.Time
	CurrentTask string
}
```
## Worker Lifecycle
1. Pod starts → calls `POST /workers` to register
2. Main loop: heartbeat every 5s, poll for tasks
3. Task received → clone repo, run Claude, commit, report
4. Pod shutdown → `DELETE /workers/{id}` to deregister
## Environment Variables
```
WORKER_ID=$(hostname)
RDEV_API_URL=http://rdev-api.rdev.svc:8080
RDEV_API_KEY=<worker service key>
```
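
A worker written in Go might read these variables at startup roughly as follows. This is a sketch: `Config` and `LoadConfig` are hypothetical names, and both the hostname fallback (mirroring `WORKER_ID=$(hostname)`) and treating the URL and key as required are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config holds the worker settings listed above.
type Config struct {
	WorkerID string
	APIURL   string
	APIKey   string
}

// LoadConfig reads the three variables through getenv (normally os.Getenv).
// Injecting getenv keeps the function testable without touching the process
// environment.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		WorkerID: getenv("WORKER_ID"),
		APIURL:   getenv("RDEV_API_URL"),
		APIKey:   getenv("RDEV_API_KEY"),
	}
	if cfg.WorkerID == "" {
		// Assumed fallback: use the hostname when WORKER_ID is unset.
		host, err := os.Hostname()
		if err != nil {
			return Config{}, fmt.Errorf("resolve worker ID: %w", err)
		}
		cfg.WorkerID = host
	}
	if cfg.APIURL == "" {
		return Config{}, errors.New("RDEV_API_URL is required")
	}
	if cfg.APIKey == "" {
		return Config{}, errors.New("RDEV_API_KEY is required")
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Println("worker", cfg.WorkerID, "using", cfg.APIURL)
}
```
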
## Related Topics
- [Work Queue](./work-queue.md)
- [Build Orchestration](../features/build-orchestration.md)