rdev/ai-lookup/services/worker-pool.md
jordan 39df51defd feat: Add multi-provider code agent interface with Claude Code and OpenCode adapters
2026-01-27 09:25:51 -07:00

Worker Pool

Last Updated: 2025-01
Confidence: High (Planned - see address-the-gaps.md)

Summary

A shared pool of claudebox workers (3-5 pods), any of which can build any project. Workers register, send heartbeats, and poll for tasks. The pool scales horizontally by adding workers, independent of the number of projects.

Key Facts:

  • Workers labeled rdev.orchard9.ai/role=worker
  • StatefulSet: claudebox-worker with 3+ replicas
  • Each worker has dedicated PVC for workspace
  • Workers poll rdev-api for tasks every 5 seconds
  • Health tracked via heartbeat endpoint

File Pointers:

  • Port: internal/port/worker_registry.go
  • Adapter: internal/adapter/postgres/worker_registry.go
  • Handler: internal/handlers/workers.go
  • K8s manifest: deployments/k8s/base/claudebox-worker.yaml

Port Interface

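import (
    "context"
    "time"
)

// WorkerRegistry tracks the workers that make up the pool.
type WorkerRegistry interface {
    // Register adds a worker to the registry (called when a pod starts).
    Register(ctx context.Context, worker WorkerInfo) error
    // Heartbeat records that the worker is still alive.
    Heartbeat(ctx context.Context, workerID string) error
    // Deregister removes a worker (called on pod shutdown).
    Deregister(ctx context.Context, workerID string) error
    // ListActive returns the workers currently considered active.
    ListActive(ctx context.Context) ([]WorkerInfo, error)
}

// WorkerInfo describes a single registered worker.
type WorkerInfo struct {
    ID          string
    PodName     string
    Namespace   string
    Status      string // "idle", "busy", "unhealthy"
    LastSeen    time.Time
    CurrentTask string
}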

Worker Lifecycle

  1. Pod starts → calls POST /workers to register
  2. Main loop: heartbeat every 5s, poll for tasks
  3. Task received → clone repo, run Claude, commit, report
  4. Pod shutdown → DELETE /workers/{id} to deregister

Environment Variables

WORKER_ID=$(hostname)
RDEV_API_URL=http://rdev-api.rdev.svc:8080
RDEV_API_KEY=<worker service key>
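A worker process could read these variables at startup and fail fast when one is missing. The following is a minimal sketch; the Config struct and LoadConfig function are hypothetical names, and the trailing-slash trimming is an illustrative choice rather than documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Config holds the three settings listed above (hypothetical type).
type Config struct {
	WorkerID string
	APIURL   string
	APIKey   string
}

// LoadConfig reads the settings through getenv so that tests can supply a
// fake environment. It reports every missing variable in a single error.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		WorkerID: getenv("WORKER_ID"),
		APIURL:   strings.TrimRight(getenv("RDEV_API_URL"), "/"),
		APIKey:   getenv("RDEV_API_KEY"),
	}
	var missing []string
	if cfg.WorkerID == "" {
		missing = append(missing, "WORKER_ID")
	}
	if cfg.APIURL == "" {
		missing = append(missing, "RDEV_API_URL")
	}
	if cfg.APIKey == "" {
		missing = append(missing, "RDEV_API_KEY")
	}
	if len(missing) > 0 {
		return Config{}, fmt.Errorf("missing environment variables: %s", strings.Join(missing, ", "))
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// The API key is deliberately not printed.
	fmt.Printf("worker %s -> %s\n", cfg.WorkerID, cfg.APIURL)
}
```

Passing the lookup function in, instead of calling os.Getenv directly, keeps the validation logic testable without mutating the process environment.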