# Multi-Provider Code Agent Interface
> **Status:** Complete (Weeks 1-5)
>
> **Feature:** Unified interface supporting Claude Code and OpenCode providers
## Overview
This document describes the architecture for supporting multiple code agent providers (Claude Code, OpenCode) through a unified interface. The design enables provider switching at runtime without breaking existing functionality.
## Implementation Progress
| Phase | Status | Description |
|-------|--------|-------------|
| Week 1: Foundation | ✅ Complete | Domain models, port interface, registry |
| Week 2: Claude Code Adapter | ✅ Complete | kubectl exec wrapper, stream-json parser |
| Week 3: OpenCode Adapter | ✅ Complete | HTTP/SSE client, session management |
| Week 4: Service Integration | ✅ Complete | ProjectService integration, event streaming |
| Week 5: Polish | ✅ Complete | Agent HTTP endpoints, health monitoring, metrics, DI wiring |
## Architecture
### Current Flow
```
Handler → ProjectService → CodeAgent (port) → ClaudeCodeAdapter | OpenCodeAdapter
                                                      ↓                  ↓
                                                kubectl exec         HTTP API
                                                  claude -p       opencode serve
```
### Fallback Support
When `CodeAgentRegistry` is not configured, the service falls back to the legacy `CommandExecutor` for backward compatibility.
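The fallback decision can be sketched roughly as follows. This is an illustration only: `service`, `executionPath`, `namedAgent`, and `staticRegistry` are hypothetical names, the interfaces are cut down to what the decision needs, and treating an empty registry like a missing one is an assumption.

```go
package main

// CodeAgent and CodeAgentRegistry are reduced stand-ins for the port
// interfaces; only the methods this sketch needs are kept.
type CodeAgent interface{ Name() string }

type CodeAgentRegistry interface{ Default() CodeAgent }

// service mirrors the idea of ProjectService holding an optional registry.
type service struct {
	agentRegistry CodeAgentRegistry // nil when no registry is configured
}

// executionPath (hypothetical helper) reports which path a request would take.
func (s *service) executionPath() string {
	if s.agentRegistry == nil {
		return "legacy CommandExecutor"
	}
	if agent := s.agentRegistry.Default(); agent != nil {
		return "code agent: " + agent.Name()
	}
	// Assumption: a configured but empty registry also falls back.
	return "legacy CommandExecutor"
}

// namedAgent and staticRegistry are test doubles for the sketch.
type namedAgent string

func (n namedAgent) Name() string { return string(n) }

type staticRegistry struct{ agent CodeAgent }

func (r staticRegistry) Default() CodeAgent { return r.agent }
```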
## Domain Models (✅ Implemented)
### File: `internal/domain/code_agent.go`
```go
// AgentProvider identifies which code agent implementation to use
type AgentProvider string

const (
	AgentProviderClaudeCode AgentProvider = "claudecode"
	AgentProviderOpenCode   AgentProvider = "opencode"
)

// Validation and parsing
func (p AgentProvider) IsValid() bool
func (p AgentProvider) String() string
func ParseAgentProvider(s string) (AgentProvider, error)
func ValidAgentProviders() []AgentProvider

// AgentRequest contains parameters for executing a code agent command
type AgentRequest struct {
	Prompt       string
	ProjectID    ProjectID
	SessionID    string            // For continuation
	AllowedTools []string          // Tool restrictions
	Model        string            // Model override (OpenCode only)
	WorkingDir   string            // Defaults to /workspace
	Timeout      time.Duration     // Execution timeout
	Metadata     map[string]string // Provider-specific options
}

// AgentEventType categorizes events emitted during agent execution
type AgentEventType string

const (
	AgentEventOutput     AgentEventType = "output"
	AgentEventToolUse    AgentEventType = "tool_use"
	AgentEventToolResult AgentEventType = "tool_result"
	AgentEventThinking   AgentEventType = "thinking"
	AgentEventError      AgentEventType = "error"
	AgentEventComplete   AgentEventType = "complete"
)

// AgentEvent represents a single event during agent execution
type AgentEvent struct {
	Type      AgentEventType
	Timestamp time.Time
	Content   string
	Stream    string         // "stdout", "stderr", or empty
	ToolName  string         // For tool_use/tool_result events
	ToolInput map[string]any // Tool invocation arguments
	Metadata  map[string]any
}

// AgentEventHandler is a callback for receiving agent events
type AgentEventHandler func(event AgentEvent)

// AgentResult contains the outcome of agent execution
type AgentResult struct {
	SessionID   string // For continuation
	ExitCode    int
	DurationMs  int64
	Error       error
	TokensUsed  *AgentTokenUsage // If available
	FinalOutput string
}

func (r *AgentResult) Success() bool // ExitCode == 0 && Error == nil

// AgentTokenUsage tracks token consumption
type AgentTokenUsage struct {
	InputTokens  int64
	OutputTokens int64
	TotalTokens  int64
}

// AgentCapabilities describes what a provider supports
type AgentCapabilities struct {
	Provider                    AgentProvider
	SupportsSessionContinuation bool
	SupportsModelSelection      bool
	SupportsToolControl         bool
	SupportedModels             []string
	DefaultModel                string
	MaxPromptLength             int
	SupportsStreaming           bool
}
```
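The listing above gives only signatures for the validation helpers. One plausible way to implement `IsValid` and `ParseAgentProvider` is sketched below; trimming whitespace, lower-casing the input, and wrapping `ErrInvalidAgentProvider` with the offending value are assumptions, not confirmed behavior of the real code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type AgentProvider string

const (
	AgentProviderClaudeCode AgentProvider = "claudecode"
	AgentProviderOpenCode   AgentProvider = "opencode"
)

var ErrInvalidAgentProvider = errors.New("invalid agent provider")

// IsValid reports whether p is one of the known providers.
func (p AgentProvider) IsValid() bool {
	switch p {
	case AgentProviderClaudeCode, AgentProviderOpenCode:
		return true
	}
	return false
}

// ParseAgentProvider normalizes s (assumption: trim + lower-case) and
// returns an error wrapping ErrInvalidAgentProvider for unknown values.
func ParseAgentProvider(s string) (AgentProvider, error) {
	p := AgentProvider(strings.ToLower(strings.TrimSpace(s)))
	if !p.IsValid() {
		return "", fmt.Errorf("%w: %q", ErrInvalidAgentProvider, s)
	}
	return p, nil
}
```

Because the error is wrapped with `%w`, callers can still match it with `errors.Is(err, ErrInvalidAgentProvider)`.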
### Project Extension
```go
// In domain/project.go
type Project struct {
	// ... existing fields ...
	AgentProvider AgentProvider // Which code agent to use
}

// New label for K8s discovery
const LabelAgentProvider = "rdev.orchard9.ai/agent-provider"
```
### Error Handling
```go
// In domain/errors.go
var ErrInvalidAgentProvider = errors.New("invalid agent provider")
```
## Port Interface (✅ Implemented)
### File: `internal/port/code_agent.go`
```go
// CodeAgent defines operations for executing AI coding agent commands
type CodeAgent interface {
	// Name returns a human-readable name for this agent
	Name() string

	// Provider returns the agent provider identifier
	Provider() domain.AgentProvider

	// Execute runs an agent command and streams events to the handler
	Execute(ctx context.Context, req *domain.AgentRequest, handler domain.AgentEventHandler) (*domain.AgentResult, error)

	// Cancel attempts to cancel a running agent session
	Cancel(ctx context.Context, sessionID string) error

	// Capabilities returns what this agent supports
	Capabilities() domain.AgentCapabilities

	// Available returns true if the agent is ready to accept requests
	Available(ctx context.Context) bool
}

// CodeAgentRegistry manages registered code agent implementations
type CodeAgentRegistry interface {
	// Register adds an agent for a provider (overwrites existing)
	Register(agent CodeAgent)

	// Get returns the agent for a specific provider (nil if not found)
	Get(provider domain.AgentProvider) CodeAgent

	// Default returns the default agent (nil if empty)
	Default() CodeAgent

	// SetDefault sets the default provider (error if not registered)
	SetDefault(provider domain.AgentProvider) error

	// Available returns all registered providers
	Available() []domain.AgentProvider

	// AvailableAgents returns agents that are currently available
	AvailableAgents(ctx context.Context) []CodeAgent
}
```
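`Execute` reports progress through the `AgentEventHandler` callback rather than a return value. The sketch below shows how a caller might consume that callback; `collector` and `replay` are hypothetical, `replay` merely stands in for a real `Execute` call, and the event struct is trimmed to three fields.

```go
package main

import "strings"

type AgentEventType string

const (
	AgentEventOutput  AgentEventType = "output"
	AgentEventToolUse AgentEventType = "tool_use"
)

// AgentEvent is trimmed to the fields this sketch reads.
type AgentEvent struct {
	Type     AgentEventType
	Content  string
	ToolName string
}

type AgentEventHandler func(event AgentEvent)

// collector (hypothetical) gathers output text and the names of tools used.
// It is not safe for concurrent use; a real handler may need a mutex.
type collector struct {
	output strings.Builder
	tools  []string
}

func (c *collector) handle(ev AgentEvent) {
	switch ev.Type {
	case AgentEventOutput:
		c.output.WriteString(ev.Content)
	case AgentEventToolUse:
		c.tools = append(c.tools, ev.ToolName)
	}
}

// replay stands in for CodeAgent.Execute: it feeds canned events to the handler.
func replay(events []AgentEvent, handler AgentEventHandler) {
	for _, ev := range events {
		handler(ev)
	}
}
```

The method value `c.handle` can be passed wherever an `AgentEventHandler` is expected, which keeps per-request state out of the adapter.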
## Provider Registry (✅ Implemented)
### File: `internal/adapter/codeagent/registry.go`
```go
// Registry implements port.CodeAgentRegistry with thread-safe agent management
type Registry struct {
	mu       sync.RWMutex
	agents   map[domain.AgentProvider]port.CodeAgent
	defProv  domain.AgentProvider
	hasAgent bool
}

func NewRegistry() *Registry

// Additional methods beyond interface
func (r *Registry) DefaultProvider() domain.AgentProvider
func (r *Registry) Count() int
```
**Thread Safety:**
- `sync.RWMutex` for concurrent access
- Read locks for: `Get`, `Default`, `Available`, `AvailableAgents`, `DefaultProvider`, `Count`
- Write locks for: `Register`, `SetDefault`
- First registered agent becomes default automatically
**Test Coverage:**
- Register/Get operations
- Default selection (first registered)
- SetDefault (success and failure)
- Available providers listing
- AvailableAgents filtering
- Concurrent access (race-tested)
- Re-registration overwrites
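The locking rules and first-registered default described above could look roughly like this. It is a reduced sketch, not the actual `registry.go`: the `CodeAgent` interface is cut down to `Provider()`, the `hasAgent` flag is replaced by a map-length check, and `stubAgent` plus the error text are invented for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

type AgentProvider string

type CodeAgent interface{ Provider() AgentProvider }

type Registry struct {
	mu      sync.RWMutex
	agents  map[AgentProvider]CodeAgent
	defProv AgentProvider
}

func NewRegistry() *Registry {
	return &Registry{agents: make(map[AgentProvider]CodeAgent)}
}

// Register takes the write lock; the first agent becomes the default.
func (r *Registry) Register(agent CodeAgent) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if len(r.agents) == 0 {
		r.defProv = agent.Provider()
	}
	r.agents[agent.Provider()] = agent // re-registration overwrites
}

// Get takes only the read lock and returns nil for unknown providers.
func (r *Registry) Get(p AgentProvider) CodeAgent {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.agents[p]
}

func (r *Registry) Default() CodeAgent {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.agents[r.defProv]
}

// SetDefault fails when the provider has not been registered.
func (r *Registry) SetDefault(p AgentProvider) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.agents[p]; !ok {
		return fmt.Errorf("provider %q is not registered", p)
	}
	r.defProv = p
	return nil
}

// stubAgent is a test double that only knows its provider name.
type stubAgent AgentProvider

func (s stubAgent) Provider() AgentProvider { return AgentProvider(s) }
```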
## Claude Code Adapter (✅ Implemented)
### Package: `internal/adapter/codeagent/claudecode/`
**Files:**
- `adapter.go` - CodeAgent implementation wrapping kubectl exec
- `parser.go` - Stream-JSON NDJSON parser for Claude Code output
- `adapter_test.go` - Comprehensive test coverage
- `parser_test.go` - Parser unit tests
**Key Features:**
- Wraps `kubectl exec` for pod access
- Uses `--output-format stream-json` for structured NDJSON output
- Supports `--resume <session_id>` for conversation continuation
- Maps `AllowedTools` to `--allowedTools` flag
- Uses `--dangerously-skip-permissions` for non-interactive mode
**Command Construction:**
```go
func (a *Adapter) buildCommandArgs(namespace, podName string, req *domain.AgentRequest) []string {
	args := []string{
		"exec", "-n", namespace, podName, "--",
		"claude", "-p", "--output-format", "stream-json", "--dangerously-skip-permissions",
	}
	if req.SessionID != "" {
		args = append(args, "--resume", req.SessionID)
	}
	for _, tool := range req.AllowedTools {
		args = append(args, "--allowedTools", tool)
	}
	if req.WorkingDir != "" && req.WorkingDir != "/workspace" {
		args = append(args, "--add-dir", req.WorkingDir)
	}
	args = append(args, req.Prompt)
	return args
}
```
**Stream JSON Message Types:**
| Type | Description | Mapped Event |
|------|-------------|--------------|
| `init` | Session started | `AgentEventOutput` |
| `message` | Text output from assistant | `AgentEventOutput` |
| `tool_use` | Tool invocation | `AgentEventToolUse` |
| `tool_result` | Tool response | `AgentEventToolResult` |
| `result` | Execution complete | `AgentEventComplete` |
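A line-oriented parser applying the mapping in this table might be sketched as below. The `streamMessage` shape (a `type` and a `content` field), the choice to skip malformed or unknown lines, and the `parseString` helper are all assumptions; the real `parser.go` and the actual stream-json schema carry more detail.

```go
package main

import (
	"bufio"
	"encoding/json"
	"io"
	"strings"
)

type AgentEventType string

const (
	AgentEventOutput     AgentEventType = "output"
	AgentEventToolUse    AgentEventType = "tool_use"
	AgentEventToolResult AgentEventType = "tool_result"
	AgentEventComplete   AgentEventType = "complete"
)

type AgentEvent struct {
	Type    AgentEventType
	Content string
}

// streamMessage is an assumed, minimal view of one NDJSON line.
type streamMessage struct {
	Type    string `json:"type"`
	Content string `json:"content"`
}

var eventTypes = map[string]AgentEventType{
	"init":        AgentEventOutput,
	"message":     AgentEventOutput,
	"tool_use":    AgentEventToolUse,
	"tool_result": AgentEventToolResult,
	"result":      AgentEventComplete,
}

// parseStream reads one JSON object per line and emits mapped events.
// Note: bufio.Scanner rejects lines over 64 KiB unless Buffer is called.
func parseStream(r io.Reader, handler func(AgentEvent)) error {
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		line := scanner.Bytes()
		if len(line) == 0 {
			continue
		}
		var msg streamMessage
		if err := json.Unmarshal(line, &msg); err != nil {
			continue // assumption: skip malformed lines instead of aborting
		}
		if evType, ok := eventTypes[msg.Type]; ok {
			handler(AgentEvent{Type: evType, Content: msg.Content})
		}
	}
	return scanner.Err()
}

// parseString is a convenience wrapper for trying the parser on a literal.
func parseString(s string) ([]AgentEvent, error) {
	var events []AgentEvent
	err := parseStream(strings.NewReader(s), func(ev AgentEvent) {
		events = append(events, ev)
	})
	return events, err
}
```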
**Capabilities:**
```go
func (a *Adapter) Capabilities() domain.AgentCapabilities {
	return domain.AgentCapabilities{
		Provider:                    domain.AgentProviderClaudeCode,
		SupportsSessionContinuation: true,
		SupportsModelSelection:      false, // Claude Code only uses Claude
		SupportsToolControl:         true,
		SupportedModels:             []string{"claude-sonnet-4-20250514", "claude-opus-4-20250514"},
		DefaultModel:                "claude-sonnet-4-20250514",
		MaxPromptLength:             0, // Unlimited
		SupportsStreaming:           true,
	}
}
```
## OpenCode Adapter (✅ Implemented)
### Package: `internal/adapter/codeagent/opencode/`
**Files:**
- `adapter.go` - CodeAgent implementation using HTTP/SSE
- `client.go` - HTTP client with SSE subscription support
- `adapter_test.go` - Mock server tests for all operations
**HTTP Client API:**
```go
type Client struct {
	baseURL    string
	httpClient *http.Client
	username   string
	password   string
}

// Health check
func (c *Client) Health(ctx context.Context) (*HealthResponse, error)

// Session management
func (c *Client) CreateSession(ctx context.Context, req *CreateSessionRequest) (*Session, error)
func (c *Client) GetSession(ctx context.Context, sessionID string) (*Session, error)
func (c *Client) AbortSession(ctx context.Context, sessionID string) error

// Message sending
func (c *Client) SendMessage(ctx context.Context, sessionID string, req *SendMessageRequest) (*SendMessageResponse, error)
func (c *Client) SendPromptAsync(ctx context.Context, sessionID string, req *SendMessageRequest) error

// SSE streaming
func (c *Client) SubscribeEvents(ctx context.Context) (<-chan SSEEvent, error)
```
**SSE Event Mapping:**
| SSE Event | Description | Mapped Event |
|-----------|-------------|--------------|
| `server.connected` | Connected to server | `AgentEventOutput` |
| `message.created` | New message | `AgentEventOutput` |
| `message.updated` | Message updated | `AgentEventOutput` |
| `tool.started` | Tool execution started | `AgentEventToolUse` |
| `tool.completed` | Tool execution done | `AgentEventToolResult` |
| `session.completed` | Session finished | `AgentEventComplete` |
| `error` | Error occurred | `AgentEventError` |
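To illustrate the mapping, here is a sketch that reads a raw Server-Sent Events stream and translates event names into agent event types. It assumes the name arrives in the SSE `event:` field and that `SSEEvent` has `Event` and `Data` fields; neither is confirmed by this document, and `readSSE` and `mapStream` are hypothetical helpers.

```go
package main

import (
	"bufio"
	"io"
	"strings"
)

type AgentEventType string

// SSEEvent is a guess at the shape carried on the SubscribeEvents channel.
type SSEEvent struct {
	Event string
	Data  string
}

var sseEventTypes = map[string]AgentEventType{
	"server.connected":  "output",
	"message.created":   "output",
	"message.updated":   "output",
	"tool.started":      "tool_use",
	"tool.completed":    "tool_result",
	"session.completed": "complete",
	"error":             "error",
}

// readSSE emits one SSEEvent per blank-line-terminated block. Comment lines
// (starting with ':') are ignored; an unterminated final block is dropped.
func readSSE(r io.Reader, emit func(SSEEvent)) error {
	scanner := bufio.NewScanner(r)
	var cur SSEEvent
	var data []string
	for scanner.Scan() {
		line := scanner.Text()
		switch {
		case line == "":
			if cur.Event != "" || len(data) > 0 {
				cur.Data = strings.Join(data, "\n")
				emit(cur)
			}
			cur, data = SSEEvent{}, nil
		case strings.HasPrefix(line, "event:"):
			cur.Event = strings.TrimSpace(strings.TrimPrefix(line, "event:"))
		case strings.HasPrefix(line, "data:"):
			data = append(data, strings.TrimPrefix(strings.TrimPrefix(line, "data:"), " "))
		}
	}
	return scanner.Err()
}

// mapStream returns the agent event types for every recognized SSE event.
func mapStream(s string) ([]AgentEventType, error) {
	var types []AgentEventType
	err := readSSE(strings.NewReader(s), func(ev SSEEvent) {
		if t, ok := sseEventTypes[ev.Event]; ok {
			types = append(types, t)
		}
	})
	return types, err
}
```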
**Capabilities:**
```go
func (a *Adapter) Capabilities() domain.AgentCapabilities {
	return domain.AgentCapabilities{
		Provider:                    domain.AgentProviderOpenCode,
		SupportsSessionContinuation: true,
		SupportsModelSelection:      true, // OpenCode supports multiple providers
		SupportsToolControl:         true,
		SupportedModels: []string{
			"claude-sonnet-4-20250514",
			"claude-opus-4-20250514",
			"gpt-4o",
			"gpt-4-turbo",
			"gemini-pro",
		},
		DefaultModel:      "claude-sonnet-4-20250514",
		MaxPromptLength:   0, // Unlimited
		SupportsStreaming: true,
	}
}
```
**Authentication:**
- Basic auth support via `username` and `password` config
- Default username: `opencode`
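Attaching these credentials to outgoing requests could be sketched as follows. The `newRequest` helper, the `/health` path used in the usage note, and the rule of sending credentials only when a password is configured are assumptions; a production client would also thread a `context.Context` through `http.NewRequestWithContext`.

```go
package main

import "net/http"

// Client is reduced to the fields that matter for authentication.
type Client struct {
	baseURL  string
	username string
	password string
}

// newRequest (hypothetical) builds a request and adds HTTP Basic auth.
func (c *Client) newRequest(method, path string) (*http.Request, error) {
	req, err := http.NewRequest(method, c.baseURL+path, nil)
	if err != nil {
		return nil, err
	}
	// Assumption: no password means the server runs without auth.
	if c.password != "" {
		user := c.username
		if user == "" {
			user = "opencode" // documented default username
		}
		req.SetBasicAuth(user, c.password)
	}
	return req, nil
}
```

For example, a client with only `password` set would send `opencode` as the username on a request such as `c.newRequest("GET", "/health")`.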
## Service Integration (✅ Implemented)
### Files Modified/Created
| File | Description |
|------|-------------|
| `project_service.go` | Core service with agent registry support |
| `project_service_agent.go` | Agent execution and resolution methods |
| `project_service_commands.go` | Shell/Git command execution (extracted) |
| `project_service_queue.go` | Queue operations (extracted) |
### ProjectService Changes
**New Fields:**
```go
type ProjectService struct {
	// ... existing fields ...
	agentRegistry port.CodeAgentRegistry // Optional code agent registry
}

func (s *ProjectService) WithCodeAgentRegistry(registry port.CodeAgentRegistry) *ProjectService
```
**Updated Request/Response:**
```go
type ExecuteClaudeRequest struct {
	ProjectID    domain.ProjectID
	Prompt       string
	StreamID     string
	SessionID    string   // Optional: resume a previous session
	Model        string   // Optional: model override (OpenCode only)
	AllowedTools []string // Optional: restrict tool access
	Audit        *AuditContext
}

type ExecuteClaudeResult struct {
	CommandID     domain.CommandID
	StreamURL     string
	SessionID     string               // Session ID for continuation
	AgentProvider domain.AgentProvider // Which provider handled the request
}
```
**Agent Resolution:**
```go
// resolveAgent returns the appropriate CodeAgent for a project.
// Returns nil if no agent registry is configured or no agent is available.
func (s *ProjectService) resolveAgent(project *domain.Project) port.CodeAgent {
	if s.agentRegistry == nil {
		return nil
	}

	// Try project-specific agent first
	if project.AgentProvider != "" {
		if agent := s.agentRegistry.Get(project.AgentProvider); agent != nil {
			return agent
		}
	}

	// Fall back to default
	return s.agentRegistry.Default()
}
```
**Event Streaming:**
Agent events are converted to SSE stream events:
| Agent Event | Stream Event | Data |
|-------------|--------------|------|
| `AgentEventOutput` | `output` | `{line, stream}` |
| `AgentEventToolUse` | `tool_use` | `{tool, input}` |
| `AgentEventToolResult` | `tool_result` | `{output}` |
| `AgentEventError` | `error` | `{error}` |
| `AgentEventComplete` | `agent_complete` | metadata |
| (final) | `complete` | `{exit_code, duration_ms, session_id, provider}` |
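The conversion in this table can be expressed as a small pure function. The `StreamEvent` type and `toStreamEvent` name are hypothetical, and dropping event types that have no row in the table (such as `thinking`) is an assumption.

```go
package main

type AgentEventType string

const (
	AgentEventOutput     AgentEventType = "output"
	AgentEventToolUse    AgentEventType = "tool_use"
	AgentEventToolResult AgentEventType = "tool_result"
	AgentEventError      AgentEventType = "error"
	AgentEventComplete   AgentEventType = "complete"
)

type AgentEvent struct {
	Type      AgentEventType
	Content   string
	Stream    string
	ToolName  string
	ToolInput map[string]any
	Metadata  map[string]any
}

// StreamEvent (hypothetical) is the name/data pair written to the SSE stream.
type StreamEvent struct {
	Name string
	Data map[string]any
}

// toStreamEvent maps an agent event to its stream event; ok is false for
// event types the table does not cover.
func toStreamEvent(ev AgentEvent) (se StreamEvent, ok bool) {
	switch ev.Type {
	case AgentEventOutput:
		return StreamEvent{Name: "output", Data: map[string]any{"line": ev.Content, "stream": ev.Stream}}, true
	case AgentEventToolUse:
		return StreamEvent{Name: "tool_use", Data: map[string]any{"tool": ev.ToolName, "input": ev.ToolInput}}, true
	case AgentEventToolResult:
		return StreamEvent{Name: "tool_result", Data: map[string]any{"output": ev.Content}}, true
	case AgentEventError:
		return StreamEvent{Name: "error", Data: map[string]any{"error": ev.Content}}, true
	case AgentEventComplete:
		return StreamEvent{Name: "agent_complete", Data: ev.Metadata}, true
	}
	return StreamEvent{}, false
}
```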
**Additional Service Methods:**
```go
// Get capabilities for a specific provider
func (s *ProjectService) GetAgentCapabilities(provider domain.AgentProvider) *domain.AgentCapabilities
// List all available providers
func (s *ProjectService) ListAvailableAgents() []domain.AgentProvider
// Get/set default agent
func (s *ProjectService) GetDefaultAgent() domain.AgentProvider
func (s *ProjectService) SetDefaultAgent(provider domain.AgentProvider) error
```
## API Changes (✅ Complete - Week 5)
### Agent Management Endpoints
```http
# List all registered agents
GET /agents

# Response
{
  "data": {
    "agents": [
      {
        "provider": "claudecode",
        "name": "Claude Code",
        "available": true,
        "default": true,
        "supported_models": ["claude-sonnet-4-20250514", "claude-opus-4-20250514"],
        "default_model": "claude-sonnet-4-20250514"
      }
    ],
    "default_agent": "claudecode",
    "total_agents": 1,
    "available_count": 1
  }
}

# Get agent capabilities
GET /agents/{provider}

# Response (values match the adapter's Capabilities(); 0 means unlimited)
{
  "data": {
    "provider": "claudecode",
    "supports_session_continuation": true,
    "supports_model_selection": false,
    "supports_tool_control": true,
    "supports_streaming": true,
    "supported_models": ["claude-sonnet-4-20250514", "claude-opus-4-20250514"],
    "default_model": "claude-sonnet-4-20250514",
    "max_prompt_length": 0
  }
}

# Set default agent
POST /agents/default
Content-Type: application/json

{
  "provider": "opencode"
}

# Agent health status
GET /agents/health

# Response
{
  "data": {
    "agents": [
      {
        "provider": "claudecode",
        "name": "Claude Code",
        "healthy": true,
        "message": "available",
        "latency": "1.234ms",
        "checked_at": "2025-01-27T10:00:00Z"
      }
    ],
    "healthy_count": 1,
    "total_count": 1,
    "default_agent": "claudecode",
    "default_healthy": true
  }
}
```
### Project Response
```json
{
  "id": "proj-123",
  "name": "my-project",
  "agent_provider": "claudecode",
  "agent_capabilities": {
    "supports_session_continuation": true,
    "supports_model_selection": false
  }
}
```
### Update Provider
```http
PATCH /projects/{id}
Content-Type: application/json

{
  "agent_provider": "opencode"
}
```
### Execute with Model (OpenCode only)
```http
POST /projects/{id}/claude
Content-Type: application/json

{
  "prompt": "Fix the bug in main.go",
  "model": "gpt-4o",
  "session_id": "prev-session-123"
}
```
## Configuration
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `CODE_AGENT_DEFAULT` | `claudecode` | Default provider for new projects |
| `OPENCODE_ENABLED` | `false` | Enable OpenCode adapter |
| `OPENCODE_URL` | `http://127.0.0.1:4096` | OpenCode server URL |
| `OPENCODE_USERNAME` | `opencode` | OpenCode basic auth username |
| `OPENCODE_PASSWORD` | (none) | OpenCode basic auth password |
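Loading these variables with their documented defaults might look like the sketch below. `AgentConfig` and `loadAgentConfig` are hypothetical names, and treating an unparsable `OPENCODE_ENABLED` value as `false` is an assumption. The lookup function is injected so the loader can be exercised without touching the process environment.

```go
package main

import "strconv"

// AgentConfig (hypothetical) holds the settings from the table above.
type AgentConfig struct {
	DefaultProvider  string
	OpenCodeEnabled  bool
	OpenCodeURL      string
	OpenCodeUsername string
	OpenCodePassword string
}

// loadAgentConfig reads each variable through getenv and applies defaults.
func loadAgentConfig(getenv func(string) string) AgentConfig {
	orDefault := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	enabled, err := strconv.ParseBool(orDefault("OPENCODE_ENABLED", "false"))
	if err != nil {
		enabled = false // assumption: invalid values disable the adapter
	}
	return AgentConfig{
		DefaultProvider:  orDefault("CODE_AGENT_DEFAULT", "claudecode"),
		OpenCodeEnabled:  enabled,
		OpenCodeURL:      orDefault("OPENCODE_URL", "http://127.0.0.1:4096"),
		OpenCodeUsername: orDefault("OPENCODE_USERNAME", "opencode"),
		OpenCodePassword: getenv("OPENCODE_PASSWORD"),
	}
}
```

In a real binary this would be called as `loadAgentConfig(os.Getenv)`.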
### Project-Level Override
Projects can specify their preferred provider in the database. On provider switch:
1. Active session is cleared (no cross-provider continuation)
2. New provider is validated as available
3. Next command uses new provider
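The switch steps above can be sketched as a single function. `switchProvider`, `ErrProviderUnavailable`, and the `SessionID` field on `Project` are hypothetical. This sketch validates availability before clearing the session so that a rejected switch leaves the project untouched; that ordering is a choice made here, not necessarily the order the service uses.

```go
package main

import "errors"

type AgentProvider string

// Project is reduced to the fields involved in a provider switch.
type Project struct {
	AgentProvider AgentProvider
	SessionID     string // hypothetical: the active session, if any
}

var ErrProviderUnavailable = errors.New("agent provider unavailable")

// switchProvider moves p to next if the provider is available.
func switchProvider(p *Project, next AgentProvider, available func(AgentProvider) bool) error {
	if p.AgentProvider == next {
		return nil // nothing to do; keep the current session
	}
	if !available(next) {
		return ErrProviderUnavailable
	}
	p.SessionID = "" // sessions cannot be resumed across providers
	p.AgentProvider = next
	return nil
}
```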
## File Structure
```
internal/
├── domain/
│   ├── code_agent.go                   ✅ AgentProvider, AgentRequest, AgentEvent, etc.
│   ├── code_agent_test.go              ✅ Domain model tests
│   ├── project.go                      ✅ Added AgentProvider field
│   └── errors.go                       ✅ Added ErrInvalidAgentProvider
├── port/
│   └── code_agent.go                   ✅ CodeAgent, CodeAgentRegistry interfaces
├── adapter/
│   └── codeagent/
│       ├── registry.go                 ✅ Provider registry implementation
│       ├── registry_test.go            ✅ Registry tests (incl. concurrent access)
│       ├── claudecode/                 ✅ Week 2
│       │   ├── adapter.go              ✅ CodeAgent implementation
│       │   ├── parser.go               ✅ Stream-JSON NDJSON parser
│       │   ├── adapter_test.go         ✅ Adapter tests
│       │   └── parser_test.go          ✅ Parser tests
│       └── opencode/                   ✅ Week 3
│           ├── adapter.go              ✅ CodeAgent implementation
│           ├── client.go               ✅ HTTP/SSE client
│           └── adapter_test.go         ✅ Mock server tests
├── handlers/
│   ├── agents.go                       ✅ Week 5: Agent management endpoints
│   └── agents_test.go                  ✅ Week 5: Handler tests
├── service/
│   ├── project_service.go              ✅ Week 4: Agent registry integration
│   ├── project_service_agent.go        ✅ Week 4: Agent execution methods + metrics
│   ├── project_service_commands.go     ✅ Extracted shell/git commands
│   └── project_service_queue.go        ✅ Extracted queue operations
├── metrics/
│   └── metrics.go                      ✅ Week 5: Agent metrics (requests, tool use, availability)
└── worker/
    └── queue_processor.go              ⬜ Future: Use CodeAgent for queue
cmd/
└── rdev-api/
    ├── main.go                         ✅ Week 5: Agent registry DI wiring
    └── openapi.go                      ✅ Week 5: Agent API documentation
```
## Observability (✅ Complete - Week 5)
### Prometheus Metrics
| Metric | Labels | Description |
|--------|--------|-------------|
| `rdev_agent_requests_total` | provider, status | Total code agent requests |
| `rdev_agent_request_duration_seconds` | provider | Execution duration histogram |
| `rdev_agent_tool_use_total` | provider, tool | Tool invocations by agents |
| `rdev_agent_available` | provider | Availability gauge (1=available, 0=unavailable) |
### Health Check
`GET /agents/health` reports per-agent health (`healthy`, `message`, `latency`, `checked_at`) together with `healthy_count`, `total_count`, `default_agent`, and `default_healthy`. The full response example appears under Agent Management Endpoints.
## Risks and Mitigations
| Risk | Impact | Mitigation |
|------|--------|------------|
| OpenCode API changes | Adapter breaks | Pin to specific version, add API versioning |
| Latency difference (subprocess vs HTTP) | User experience varies | Monitor p99 latency, document tradeoffs |
| Session state incompatibility | Can't resume across providers | Clear session on provider switch |
| Container image size increase | Slower deployments | OpenCode sidecar optional, not in base image |
## Design Decisions
1. **Event callback pattern** - Matches existing `OutputHandler`, enables streaming
2. **Registry pattern** - Allows runtime switching, extensible for more providers
3. **OpenCode as sidecar** - Keeps Claude Code as proven default, OpenCode opt-in
4. **Session per provider** - No cross-provider session sharing to avoid state corruption
5. **First-registered default** - Registry automatically uses first agent as default
6. **Backward compatibility** - Falls back to legacy executor when no registry configured
7. **File splitting** - Service files split to comply with 500-line limit