# Orchard Studio: Gap Analysis

This document maps the delta between current `rdev` capabilities and what Orchard Studio requires.

## Current Foundation (What We Have)

| Capability | Status | Location |
|------------|--------|----------|
| SDLC Classifier | ✅ Complete | `internal/sdlc/classifier.go` |
| Feature State Machine | ✅ Complete | `internal/sdlc/` (10 phases, 31 rules) |
| Composable Templates | ✅ Complete | `internal/adapter/templates/` |
| Worker Pod Execution | ✅ Complete | `internal/worker/sdlc_executor.go` |
| Webhook Dispatcher | ✅ Complete | `internal/webhook/dispatcher.go` |
| Project Provisioning | ✅ Complete | K8s namespace, DNS, git repo |
| Database Provisioning | ✅ Complete | CockroachDB adapter |
| Tree Workflows | ✅ Proven | `cookbooks/trees/*.yaml` |

---

## Gap 0: Design Reference Capture & Processing

**Current:** No mechanism for users to provide visual inspiration. Features are described purely in text.

**Required:** Users can provide URLs or screenshots as design references, which inform the Architect's questions and the Blueprint's design system section.

### What's Missing

```
┌─────────────────────────────────────────────────────────────────────────┐
│  CURRENT FLOW                                                           │
│                                                                         │
│  User: "Build a pricing page"                                           │
│  Architect: *asks about data model, endpoints...*                       │
│  (No visual context, design decisions are guesswork)                    │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────┐
│  REQUIRED FLOW                                                          │
│                                                                         │
│  User: "Build a pricing page like this" + [URL or screenshot]           │
│  System: Captures screenshot, stores with Blueprint                     │
│  Architect: "I see a dark theme with 3 tiers..." → asks clarifying Qs   │
│  Blueprint: Populates designSystem section with extracted tokens        │
└─────────────────────────────────────────────────────────────────────────┘
```

### Two Input Types

| Input | Capture Method | Storage |
|-------|----------------|---------|
| **URL** | Playwright screenshots the page automatically | `/references/{blueprintId}/{refId}.png` |
| **Screenshot** | User uploads image (drag/drop, paste, file picker) | Same storage path |

### Implementation Required

1. **Reference Capture Service:**
   - For URLs: Reuse `verify_executor.go` pattern (Playwright pod)
   - For uploads: Standard file upload handling
   - Store thumbnails alongside Blueprint

2. **Chat Endpoint Enhancement:**
   - Accept `references[]` array in request body
   - Process references before LLM call
   - Include reference images in Architect prompt context

3. **Architect Prompt Updates:**
   - Describe what it observes in natural language
   - Ask clarifying questions about design intent
   - Extract structured design tokens into Blueprint

4. **Blueprint Schema:**
   - Add `references.items[]` array
   - Add `sections.designSystem` section
   - Track which references informed which design decisions

5. **Plan Pane Rendering:**
   - Show reference thumbnails in UI
   - Display extracted design tokens
   - Allow user to add annotations

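
The reference handling described above can be sketched in Go as follows. This is only an illustration under stated assumptions: the type `Reference`, its `Kind` values, and the helpers `Validate` and `StoragePath` are hypothetical names, not existing `rdev` code. Only the storage layout `/references/{blueprintId}/{refId}.png` comes from the table above.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// Reference is a hypothetical shape for one entry of the references[] array.
type Reference struct {
	ID   string `json:"id"`
	Kind string `json:"kind"` // "url" or "screenshot" (assumed values)
	URL  string `json:"url,omitempty"`
}

// Validate rejects references that the capture service could not process.
func (r Reference) Validate() error {
	switch r.Kind {
	case "url":
		u, err := url.Parse(r.URL)
		if err != nil || (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
			return fmt.Errorf("reference %q: invalid URL %q", r.ID, r.URL)
		}
		return nil
	case "screenshot":
		return nil // the uploaded image itself is handled by the upload path
	default:
		return errors.New(`reference kind must be "url" or "screenshot"`)
	}
}

// StoragePath builds the documented /references/{blueprintId}/{refId}.png path.
func StoragePath(blueprintID, refID string) string {
	return fmt.Sprintf("/references/%s/%s.png", blueprintID, refID)
}
```

A chat handler could call `Validate` on each entry before starting a Playwright capture, and use `StoragePath` for both captured and uploaded images so that the two input types share one layout.
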
### Complexity: Medium

- URL capture reuses existing Playwright infrastructure
- File upload is a standard pattern
- Main work is Architect prompt engineering for visual understanding
- Requires an LLM with vision support (Claude accepts image input natively)

---

## Gap 1: Blueprint Storage & Chat API

**Current:** Features are created via `POST /sdlc/features` with a complete spec. No iterative refinement.

**Required:** Multi-turn conversation that builds a Blueprint incrementally.

### What's Missing

```
┌─────────────────────────────────────────────────────────────────┐
│  CURRENT FLOW                                                   │
│                                                                 │
│  User writes spec → POST /sdlc/features → Feature created       │
│  (one shot, no iteration)                                       │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  REQUIRED FLOW                                                  │
│                                                                 │
│  User message → Architect responds + updates Blueprint →        │
│  User message → Architect responds + updates Blueprint →        │
│  ...repeat until ready...                                       │
│  User: "build it" → Blueprint → SDLC Feature → Build            │
└─────────────────────────────────────────────────────────────────┘
```

### Implementation Required

1. **Database Tables:**
   - `blueprints` - stores structured Blueprint JSON
   - `blueprint_messages` - conversation history with snapshots

2. **API Endpoints:**
   - `POST /projects/{id}/blueprint/chat` - send message, get reply + updated blueprint
   - `GET /projects/{id}/blueprints` - list blueprints
   - `GET /projects/{id}/blueprints/{blueprintId}` - get specific blueprint
   - `DELETE /projects/{id}/blueprints/{blueprintId}` - discard draft

3. **Service Layer:**
   - `ArchitectService` - manages conversation, calls LLM, updates Blueprint

### Complexity: Medium

- Schema is defined (see app-vision.md)
- Standard CRUD + LLM integration
- Most work is in prompt engineering for Architect

---

## Gap 2: Architect Agent Persona

**Current:** We have coding agents (`/implement-feature`). They write code, not specs.

**Required:** An agent that asks questions, fills in a structured Blueprint, and knows when to stop.

### What's Missing

```
┌─────────────────────────────────────────────────────────────────┐
│  CURRENT AGENTS                                                 │
│                                                                 │
│  User: "Add cat photos"                                         │
│  Agent: *immediately writes code*                               │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  ARCHITECT AGENT                                                │
│                                                                 │
│  User: "Add cat photos"                                         │
│  Architect: "Should photos be public or friends-only?"          │
│  User: "Public"                                                 │
│  Architect: "Got it. Do you want likes, comments, or neither?"  │
│  ...continues until Blueprint is complete...                    │
└─────────────────────────────────────────────────────────────────┘
```

### Implementation Required

1. **System Prompt:**
   - `.claude/agents/architect.md` - detailed persona
   - Structured output format (reply + Blueprint JSON)
   - Question strategy (when to ask vs assume)

2. **Structured Output Parsing:**
   - LLM returns `{reply: string, blueprint: Blueprint}`
   - Validate Blueprint against schema
   - Handle partial updates (delta vs full replacement)

3. **Completeness Logic:**
   - `isReadyToBuild(blueprint)` function
   - Clear rules for when questions are resolved
   - Override mechanism for user to force build

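
As one possible shape for the completeness logic above, the sketch below splits blockers into required fields (which the override cannot bypass) and open blocking questions (which it can). The `Blueprint` and `Question` types, the `Blocking` flag, and the extra `force` parameter are assumptions made for illustration; only the function name `isReadyToBuild` comes from the list above.

```go
package main

// Question is a hypothetical record of something the Architect still needs answered.
type Question struct {
	Text     string
	Resolved bool
	Blocking bool // non-blocking questions may fall back to an assumption
}

// Blueprint carries only the fields this sketch checks.
type Blueprint struct {
	Feature   string
	Summary   string
	Questions []Question
}

// isReadyToBuild reports whether the Blueprint can be built and lists what
// still blocks it. force is the user override: it skips open questions but
// never missing required fields.
func isReadyToBuild(bp Blueprint, force bool) (bool, []string) {
	var missing, open []string
	if bp.Feature == "" {
		missing = append(missing, "feature name is missing")
	}
	if bp.Summary == "" {
		missing = append(missing, "summary is missing")
	}
	for _, q := range bp.Questions {
		if q.Blocking && !q.Resolved {
			open = append(open, "unresolved question: "+q.Text)
		}
	}
	blockers := append(missing, open...)
	if len(missing) > 0 {
		return false, blockers
	}
	return force || len(open) == 0, blockers
}
```

Returning the blocker list alongside the boolean lets the Plan pane show the user why "Build It" is not yet available.
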
### Complexity: Medium-High

- Prompt engineering is iterative
- Structured output from LLMs can be fragile
- Need fallback handling for malformed responses

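
The fallback handling mentioned above could start from a tolerant parser like this sketch. It first tries a strict decode, then retries on the outermost `{...}` span, which covers the common case of a model wrapping JSON in prose or a code fence. `ArchitectOutput` and `ParseArchitectOutput` are hypothetical names; schema validation and the retry policy are left to the caller, which can re-prompt the model when an error is returned.

```go
package main

import (
	"encoding/json"
	"errors"
	"strings"
)

// ArchitectOutput mirrors the {reply, blueprint} structure described above.
type ArchitectOutput struct {
	Reply     string          `json:"reply"`
	Blueprint json.RawMessage `json:"blueprint"`
}

// ParseArchitectOutput decodes the model output, tolerating text around the JSON.
func ParseArchitectOutput(raw string) (ArchitectOutput, error) {
	candidates := []string{strings.TrimSpace(raw)}
	// Fallback candidate: everything from the first "{" to the last "}".
	if i, j := strings.Index(raw, "{"), strings.LastIndex(raw, "}"); i >= 0 && j > i {
		candidates = append(candidates, raw[i:j+1])
	}
	for _, c := range candidates {
		var out ArchitectOutput
		if err := json.Unmarshal([]byte(c), &out); err != nil {
			continue
		}
		// Both fields are required; a JSON null blueprint counts as missing.
		if out.Reply == "" || len(out.Blueprint) == 0 || string(out.Blueprint) == "null" {
			continue
		}
		return out, nil
	}
	return ArchitectOutput{}, errors.New("architect output is not valid {reply, blueprint} JSON")
}
```
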
---

## Gap 3: Operation Tracking (Tree Runner in DB)

**Current:** Tree workflows run via shell script (`tree-runner.sh`). State in local JSON files.

**Required:** Operations tracked in database, queryable via API, streamable to UI.

### What's Missing

```
┌─────────────────────────────────────────────────────────────────┐
│  CURRENT                                                        │
│                                                                 │
│  ./tree-runner.sh slackpath-1.yaml                              │
│    → Runs in terminal                                           │
│    → State in .checkpoints/slackpath-1.json                     │
│    → No API visibility                                          │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  REQUIRED                                                       │
│                                                                 │
│  POST /operations/start {tree: "slackpath-1"}                   │
│    → Returns operation_id                                       │
│    → State in operations table                                  │
│    → GET /operations/{id}/stream returns SSE events             │
└─────────────────────────────────────────────────────────────────┘
```

### Implementation Required

1. **Database Tables:**
   - `operations` - tracks running/completed operations
   - `operation_events` - event log for replay/streaming

2. **Service Layer:**
   - `OrchestratorService` - manages operation lifecycle
   - Port tree-runner logic from bash to Go
   - Event emission during execution

3. **API Endpoints:**
   - `POST /projects/{id}/operations` - start operation
   - `GET /projects/{id}/operations/{operationId}` - get status
   - `GET /projects/{id}/operations/{operationId}/stream` - SSE stream

4. **Worker Integration:**
   - SDLC executor emits events as it progresses
   - Events written to `operation_events` table
   - SSE handler reads from table and streams

### Complexity: High

- Tree runner logic is non-trivial (dependencies, outputs, error handling)
- SSE streaming requires careful connection management
- Need to handle operation cancellation and resumption

---

## Gap 4: Real-Time Progress Streaming

**Current:** Webhooks fire on build complete. No per-step visibility.

**Required:** SSE stream showing "Designing schema... Writing handlers... Running tests..."

### What's Missing

```
┌─────────────────────────────────────────────────────────────────┐
│  CURRENT                                                        │
│                                                                 │
│  Build starts → ... silence ... → Webhook: "build complete"     │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  REQUIRED                                                       │
│                                                                 │
│  Build starts →                                                 │
│    event: {"phase": "spec", "status": "complete"}               │
│    event: {"phase": "design", "status": "in_progress"}          │
│    event: {"phase": "design", "status": "complete"}             │
│    event: {"phase": "implement", "progress": 0.5}               │
│    ...                                                          │
│    event: {"status": "complete", "url": "..."}                  │
└─────────────────────────────────────────────────────────────────┘
```

### Implementation Required

1. **SDLC Executor Changes:**
   - Emit events at phase transitions
   - Emit progress within phases (task completion)
   - Write events to `operation_events` table

2. **SSE Handler:**
   - `GET /operations/{id}/stream`
   - Long-lived connection
   - Read events from DB (or Redis pub/sub)
   - Handle client disconnection gracefully

3. **Event Types:**
   ```go
   type OperationEvent struct {
       Type      string    // "phase", "progress", "artifact", "error", "complete"
       Phase     string    // "spec", "design", "implement", "test", "deploy"
       Status    string    // "in_progress", "complete", "failed"
       Message   string    // Human-readable
       Progress  float64   // 0.0 to 1.0 for granular progress
       Timestamp time.Time
   }
   ```

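
The wire format the SSE handler would emit can be sketched as below. The JSON tags, the use of a 1-based position in the event log as the SSE `id`, and the helper names `FormatSSE` and `ReplayFrom` are assumptions for illustration. The framing itself (`id:`, `event:`, `data:` lines terminated by a blank line) follows the standard `text/event-stream` format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// OperationEvent repeats the struct above with assumed JSON tags.
type OperationEvent struct {
	Type      string    `json:"type"`
	Phase     string    `json:"phase,omitempty"`
	Status    string    `json:"status,omitempty"`
	Message   string    `json:"message,omitempty"`
	Progress  float64   `json:"progress,omitempty"`
	Timestamp time.Time `json:"timestamp"`
}

// FormatSSE renders one event as a text/event-stream frame.
// json.Marshal never emits raw newlines, so the payload fits one data: line.
func FormatSSE(id int64, ev OperationEvent) (string, error) {
	data, err := json.Marshal(ev)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("id: %d\nevent: %s\ndata: %s\n\n", id, ev.Type, data), nil
}

// ReplayFrom returns frames for every logged event after lastID, the value a
// reconnecting client reports in the Last-Event-ID header.
func ReplayFrom(events []OperationEvent, lastID int64) ([]string, error) {
	var frames []string
	for i, ev := range events {
		id := int64(i + 1) // assumed: position in the event log
		if id <= lastID {
			continue
		}
		frame, err := FormatSSE(id, ev)
		if err != nil {
			return nil, err
		}
		frames = append(frames, frame)
	}
	return frames, nil
}
```

In an actual handler these frames would be written to the `http.ResponseWriter` with `Content-Type: text/event-stream`, flushing after each one via `http.Flusher`, and the loop would stop when the request context is cancelled.
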
### Complexity: Medium

- SSE is straightforward in Go
- Main work is instrumenting SDLC executor
- Need to balance granularity vs noise

---

## Gap 5: Blueprint → SDLC Feature Conversion

**Current:** SDLC features are created manually with spec documents.

**Required:** Automated conversion from structured Blueprint to SDLC feature spec.

### What's Missing

```
┌─────────────────────────────────────────────────────────────────┐
│  CURRENT                                                        │
│                                                                 │
│  Human writes: spec.md with prose description                   │
│    → POST /sdlc/features                                        │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  REQUIRED                                                       │
│                                                                 │
│  Blueprint JSON → Template rendering → spec.md                  │
│    → Automated POST /sdlc/features                              │
└─────────────────────────────────────────────────────────────────┘
```

### Implementation Required

1. **Spec Template:**
   ```markdown
   # Feature: {{.Feature}}

   ## Summary
   {{.Summary}}

   ## Data Model
   {{range .Sections.DataModel.Entities}}
   ### {{.Name}}
   | Field | Type |
   |-------|------|
   {{range .Fields}}| {{.Name}} | {{.Type}} |
   {{end}}
   {{end}}

   ## API Endpoints
   {{range .Sections.APIEndpoints.Endpoints}}
   - `{{.Method}} {{.Path}}` - {{.Description}}
   {{end}}

   ## UI Components
   {{range .Sections.UIComponents.Components}}
   - **{{.Name}}**: {{.Purpose}}
   {{end}}

   ## Assumptions
   {{range .Assumptions}}
   - {{.Assumption}}
   {{end}}
   ```

2. **Conversion Service:**
   - Takes Blueprint, renders spec.md
   - Creates SDLC feature via existing API
   - Links Blueprint to created feature (`built_feature_slug`)

### Complexity: Low

- Template rendering is straightforward
- SDLC feature creation already exists
- Main work is template design

---

## Gap 6: Frontend (Next.js Studio)

**Current:** No frontend. All interaction via API/CLI.

**Required:** Three-pane interface (Chat, Plan, Preview).

### What's Missing

Everything. This is a new application.

### Implementation Required

1. **Project Setup:**
   - Next.js 14 with App Router
   - Tailwind CSS for styling
   - Authentication (integrate with rdev auth)

2. **Core Components:**
   ```
   apps/studio/
   ├── app/
   │   ├── page.tsx                    # Template selection
   │   ├── projects/
   │   │   └── [id]/
   │   │       └── page.tsx            # Three-pane workspace
   │   └── api/                        # Proxy to rdev-api
   ├── components/
   │   ├── ChatPane.tsx
   │   ├── PlanPane.tsx
   │   ├── PreviewPane.tsx
   │   ├── ActivityFeed.tsx
   │   └── BuildProgress.tsx
   └── lib/
       ├── api.ts                      # rdev-api client
       └── sse.ts                      # SSE connection manager
   ```

3. **State Management:**
   - Blueprint state (updated on each chat response)
   - Operation state (updated via SSE)
   - UI state (which pane is focused, etc.)

4. **Key Interactions:**
   - Send chat message → receive reply + blueprint
   - Click "Build It" → start operation → show progress
   - Operation complete → refresh preview iframe

### Complexity: Medium

- Standard Next.js app
- SSE client requires careful handling
- Most complexity is in polish and UX

---

## Gap 7: Platform Service Infrastructure

**Current:** Projects manage their own integrations. No shared services, no credential management.

**Required:** A service catalog with provisioning, credential injection, and upgrade paths for existing projects.

### The "Upgrade" Problem

```
┌─────────────────────────────────────────────────────────────────┐
│  CURRENT                                                        │
│                                                                 │
│  Project created 3 months ago                                   │
│    → No centralized logging                                     │
│    → No analytics                                               │
│    → Rolling your own email                                     │
│    → No easy way to add platform services                       │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│  REQUIRED                                                       │
│                                                                 │
│  POST /projects/{id}/services                                   │
│  { "type": "logging", "provider": "loki" }                      │
│                                                                 │
│  → Provision credentials                                        │
│  → Inject into K8s secrets                                      │
│  → Create integration PR with config changes                    │
│  → Project now ships logs to centralized system                 │
└─────────────────────────────────────────────────────────────────┘
```

### Service Rollout Order

Build the infrastructure with the simplest service first, then add complexity:

| Order | Service | Why This Order |
|-------|---------|----------------|
| 1 | **Logging** | Pure infrastructure, no user-facing code changes |
| 2 | **Email** | Simple API calls, clear success/failure |
| 3 | **Stats** | Frontend SDK + backend events |
| 4 | **Auth** | Most complex (middleware, user model, protected routes) |

### Implementation Required

#### 1. Service Catalog

```yaml
# internal/platform/catalog.yaml
services:
  logging:
    description: "Centralized log aggregation"
    providers:
      loki:
        name: "Grafana Loki"
        credentials:
          - LOKI_URL
          - LOKI_TENANT_ID
        integration:
          go:
            config_template: "loki-logger.go.tmpl"
            env_example: ["LOKI_URL", "LOKI_TENANT_ID"]
          node:
            packages: ["pino", "pino-loki"]
            config_template: "pino-loki.ts.tmpl"

  email:
    description: "Transactional email"
    providers:
      resend:
        name: "Resend"
        credentials:
          - RESEND_API_KEY
        integration:
          go:
            packages: ["github.com/resendlabs/resend-go"]
            service_template: "email-service.go.tmpl"
          node:
            packages: ["resend"]
            service_template: "email-client.ts.tmpl"

  stats:
    description: "Product analytics"
    providers:
      posthog:
        name: "PostHog"
        credentials:
          - POSTHOG_API_KEY
          - POSTHOG_HOST
        integration:
          go:
            packages: ["github.com/posthog/posthog-go"]
          node:
            packages: ["posthog-js", "posthog-node"]
            provider_template: "analytics-provider.tsx.tmpl"

  auth:
    description: "User authentication"
    providers:
      clerk:
        name: "Clerk"
        credentials:
          - CLERK_SECRET_KEY
          - NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY
        integration:
          node:
            packages: ["@clerk/nextjs"]
            middleware_template: "clerk-middleware.ts.tmpl"
            provider_template: "clerk-provider.tsx.tmpl"
```

#### 2. Database Schema

```sql
-- Track which services a project uses
CREATE TABLE project_services (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id),
    service_type TEXT NOT NULL,  -- 'logging', 'email', 'stats', 'auth'
    provider TEXT NOT NULL,      -- 'loki', 'resend', 'posthog', 'clerk'
    environment TEXT NOT NULL,   -- 'staging', 'production', 'all'

    -- Encrypted credentials
    credentials_encrypted BYTEA,

    -- Non-sensitive config
    config JSONB NOT NULL DEFAULT '{}',

    -- Status tracking
    status TEXT NOT NULL DEFAULT 'provisioning',
    -- provisioning → active → needs_update → deprovisioned

    -- Integration tracking
    integration_status TEXT DEFAULT 'pending',
    -- pending → pr_created → integrated → needs_update
    integration_pr_url TEXT,
    integration_commit TEXT,

    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),

    UNIQUE(project_id, service_type, environment)
);
```

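
The `status` lifecycle noted in the schema comment could be enforced in Go with a small transition table, sketched below. The forward path (provisioning → active → needs_update → deprovisioned) is taken from the comment. Allowing `needs_update` to return to `active` after a successful update, and allowing deprovisioning from any live state, are assumptions rather than documented rules.

```go
package main

import "fmt"

// ServiceStatus mirrors the values of project_services.status.
type ServiceStatus string

const (
	StatusProvisioning  ServiceStatus = "provisioning"
	StatusActive        ServiceStatus = "active"
	StatusNeedsUpdate   ServiceStatus = "needs_update"
	StatusDeprovisioned ServiceStatus = "deprovisioned"
)

// allowedTransitions lists the legal next states for each status.
// Entries beyond the documented forward path are assumptions (see lead-in).
var allowedTransitions = map[ServiceStatus][]ServiceStatus{
	StatusProvisioning: {StatusActive, StatusDeprovisioned},
	StatusActive:       {StatusNeedsUpdate, StatusDeprovisioned},
	StatusNeedsUpdate:  {StatusActive, StatusDeprovisioned},
	// StatusDeprovisioned is terminal: no outgoing transitions.
}

// Transition returns nil when moving from one status to another is allowed.
func Transition(from, to ServiceStatus) error {
	for _, next := range allowedTransitions[from] {
		if next == to {
			return nil
		}
	}
	return fmt.Errorf("illegal status transition %q -> %q", from, to)
}
```

Checking transitions in the service layer keeps the `TEXT` column flexible while still rejecting updates such as reviving a deprovisioned service.
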
#### 3. Provisioner Interface

```go
// internal/port/platform_provisioner.go
package port

import (
	"context"

	"github.com/google/uuid" // assumed source of uuid.UUID
)

// PlatformProvisioner is implemented once per provider (Loki, Resend, ...).
type PlatformProvisioner interface {
	// Provision creates credentials for a project
	Provision(ctx context.Context, req ProvisionRequest) (*ProvisionResult, error)

	// Verify checks if credentials are still valid
	Verify(ctx context.Context, projectID string, creds map[string]string) error

	// Deprovision cleans up (optional, for account removal)
	Deprovision(ctx context.Context, projectID string) error
}

type ProvisionRequest struct {
	ProjectID   uuid.UUID
	ProjectName string
	Environment string // "staging", "production"
}

type ProvisionResult struct {
	Credentials map[string]string // Encrypted before storage
	Config      map[string]string // Non-sensitive config
}
```

#### 4. Service Addition API

```
POST /projects/{projectId}/services
{
  "serviceType": "logging",
  "provider": "loki"  // Optional, uses platform default
}

Response:
{
  "serviceId": "svc_abc123",
  "status": "provisioning",
  "integrationMethod": "pr",  // or "direct"
  "prUrl": null  // Populated when PR is created
}

GET /projects/{projectId}/services/{serviceId}
{
  "serviceId": "svc_abc123",
  "serviceType": "logging",
  "provider": "loki",
  "status": "active",
  "integrationStatus": "integrated",
  "integrationCommit": "abc123...",
  "credentials": {
    "LOKI_URL": "[redacted]",
    "LOKI_TENANT_ID": "project-xyz"
  }
}
```

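
The `GET` response above shows one credential redacted and one returned in clear. A minimal way to produce that shape is an allowlist of keys that may be shown, as sketched here. The function name `Redact` and the idea of a per-provider allowlist are assumptions. The `[redacted]` placeholder is taken from the example response.

```go
package main

// Redact returns a copy of creds that is safe to place in an API response:
// keys present in visible keep their value, every other value is replaced
// with the "[redacted]" placeholder. The input map is not modified.
func Redact(creds map[string]string, visible map[string]bool) map[string]string {
	out := make(map[string]string, len(creds))
	for k, v := range creds {
		if visible[k] {
			out[k] = v
		} else {
			out[k] = "[redacted]"
		}
	}
	return out
}
```

Using an allowlist rather than a denylist means a newly added credential key is hidden by default until someone explicitly marks it as safe to display.
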
#### 5. Integration Flow

```
POST /projects/{id}/services {type: "logging", provider: "loki"}
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│  1. PROVISION                                                   │
│                                                                 │
│  LokiProvisioner.Provision()                                    │
│    → Create tenant in Loki (or use shared with project prefix)  │
│    → Generate credentials                                       │
│    → Store encrypted in project_services                        │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│  2. INJECT                                                      │
│                                                                 │
│  K8sSecretInjector.Inject()                                     │
│    → Add LOKI_URL, LOKI_TENANT_ID to project's K8s secret       │
│    → Trigger deployment restart to pick up new env vars         │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│  3. INTEGRATE                                                   │
│                                                                 │
│  IntegrationService.CreatePR() or .DirectCommit()               │
│    → Clone project repo                                         │
│    → Apply integration templates:                               │
│        • Update logger config to ship to Loki                   │
│        • Add env vars to .env.example                           │
│        • Update deployment to mount secrets                     │
│    → Create PR (or direct commit for new projects)              │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│  4. VERIFY                                                      │
│                                                                 │
│  After PR merge / deploy:                                       │
│    → Check logs appearing in Loki                               │
│    → Update integration_status to "integrated"                  │
└─────────────────────────────────────────────────────────────────┘
```

### Complexity: High

- Service catalog is straightforward (YAML/DB)
- Each provisioner is unique (Loki vs Resend vs PostHog)
- Credential encryption and management need care
- Integration templates need to handle Go + Node + various frameworks
- PR creation requires git operations

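
For the credential encryption called out above, one common approach is authenticated encryption with AES-256-GCM from the Go standard library, storing `nonce || ciphertext` in the `credentials_encrypted` column. The sketch below assumes that approach and that layout. The function names are hypothetical, and key management (where the 32-byte key comes from, rotation) is deliberately out of scope.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/json"
	"errors"
)

// newGCM builds an AES-GCM AEAD; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// EncryptCredentials serializes creds as JSON and seals them.
// The random nonce is prepended so the blob is self-contained.
func EncryptCredentials(key []byte, creds map[string]string) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	plain, err := json.Marshal(creds)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plain, nil), nil
}

// DecryptCredentials reverses EncryptCredentials and fails if the blob
// was truncated, tampered with, or sealed under a different key.
func DecryptCredentials(key, blob []byte) (map[string]string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(blob) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, sealed := blob[:gcm.NonceSize()], blob[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, sealed, nil)
	if err != nil {
		return nil, err
	}
	var creds map[string]string
	if err := json.Unmarshal(plain, &creds); err != nil {
		return nil, err
	}
	return creds, nil
}
```

Because GCM authenticates the ciphertext, a corrupted or modified row is detected at decryption time instead of yielding garbage credentials.
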
### Starting Point: Logging with Loki

```go
// internal/adapter/loki/provisioner.go
type LokiProvisioner struct {
	lokiURL    string
	adminToken string // For tenant creation if using multi-tenant Loki
}

func (p *LokiProvisioner) Provision(ctx context.Context, req ProvisionRequest) (*ProvisionResult, error) {
	// For single-tenant Loki, just create a unique label prefix
	tenantID := fmt.Sprintf("project-%s", req.ProjectID)

	return &ProvisionResult{
		Credentials: map[string]string{
			"LOKI_URL":       p.lokiURL,
			"LOKI_TENANT_ID": tenantID,
		},
		Config: map[string]string{
			"service_name": req.ProjectName,
		},
	}, nil
}
```

---

## Gap 8: Dual Environment Support

**Current:** Single deployment per project. Main branch = production.

**Required:** Staging + Production environments. Build deploys to staging, "Publish" promotes to production.

### The Environment Model

```
┌─────────────────────────────────────────────────────────────────┐
│  Project: cool-project                                          │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │  STAGING                                                │   │
│   │  staging.cool-project.threesix.ai                       │   │
│   │                                                         │   │
│   │  • Where development happens                            │   │
│   │  • Preview pane shows this                              │   │
│   │  • "Build It" deploys here                              │   │
│   │  • May use test credentials for services                │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                │                                │
│                            [Publish]                            │
│                                │                                │
│                                ▼                                │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │  PRODUCTION                                             │   │
│   │  cool-project.threesix.ai                               │   │
│   │                                                         │   │
│   │  • User-facing, stable                                  │   │
│   │  • Only updated via explicit "Publish"                  │   │
│   │  • Production credentials for services                  │   │
│   │  • Enabled after first publish                          │   │
│   └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
```

### Implementation Required

#### 1. DNS Changes

```go
// On project creation, create both records (prod may be placeholder)
CreateDNSRecord("staging.cool-project.threesix.ai", stagingIP)
CreateDNSRecord("cool-project.threesix.ai", prodIP)  // Or placeholder until first publish
```

#### 2. K8s Deployment Model

```yaml
# Option A: Two deployments in same namespace
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cool-project-staging
  namespace: cool-project
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cool-project-production
  namespace: cool-project

# Option B: Two namespaces (cleaner isolation)
# cool-project-staging namespace
# cool-project-production namespace
```

**Recommendation:** Same namespace, two deployments. Simpler to manage, secrets can be shared or scoped.

#### 3. Database Model

Two options:

**A. Same database, table-name prefixes:**
```sql
-- Staging tables
staging_users, staging_posts, staging_...

-- Production tables
prod_users, prod_posts, prod_...
```

**B. Separate databases (cleaner):**
```
cool-project-staging     (CockroachDB database)
cool-project-production  (CockroachDB database)
```

**Recommendation:** Separate databases. Cleaner isolation, no risk of cross-env data access.

#### 4. Project Schema Updates

```sql
ALTER TABLE projects ADD COLUMN environments JSONB NOT NULL DEFAULT '{
  "staging": {"enabled": true, "deployed_at": null},
  "production": {"enabled": false, "deployed_at": null, "published_at": null}
}';
```

#### 5. Publish API

```
POST /projects/{projectId}/publish
{
  "fromEnvironment": "staging",  // Usually staging
  "toEnvironment": "production"
}

Response:
{
  "operationId": "op_xyz789",
  "status": "publishing",
  "streamUrl": "/operations/{operationId}/stream"
}
```

**Publish Flow:**
1. Validate staging is healthy
2. Provision production credentials for any services that do not have them yet
3. Run migrations on production database
4. Deploy staging image to production deployment
5. Health check production
6. Update DNS if needed
7. Update project.environments.production

### Complexity: Medium

- DNS: Already have CloudflareAdapter, just create two records
- K8s: Straightforward deployment duplication
- Database: CockroachDB adapter supports multiple databases
- Main complexity is the publish flow coordination

### Defer Until After Gap 7

Dual environments can work with platform services, but we can build Gap 7 (services) first:
- Services provision for a single environment initially
- Then extend to environment-aware provisioning
- Then add the publish flow that syncs services to production

---

## Summary: Work Required

| Gap | Effort | Dependencies | Critical Path |
|-----|--------|--------------|---------------|
| 0. Design References | 2-3 days | Gap 1 (storage) | Yes (for design flows) |
| 1. Blueprint Storage | 2-3 days | None | Yes |
| 2. Architect Agent | 3-5 days | Gap 1 | Yes |
| 3. Operation Tracking | 4-6 days | None | Yes |
| 4. Progress Streaming | 2-3 days | Gap 3 | Yes |
| 5. Blueprint → SDLC | 1-2 days | Gap 1 | Yes |
| 6. Frontend | 5-7 days | Gaps 1-5 | Yes |
| 7. Platform Services | 5-8 days | None (can start now) | Parallel track |
| 8. Dual Environments | 3-5 days | Gap 7 | After services work |

**Total Estimate:** 4-5 weeks of focused work (Gaps 7-8 can run in parallel with Gaps 1-6)

**Service Rollout (within Gap 7):**
1. Logging (Loki) - 2 days
2. Email (Resend) - 2 days
3. Stats (PostHog) - 2 days
4. Auth (Clerk) - 3 days

**Note:** Gap 0 (Design References) can be implemented in parallel with Gap 2 (Architect Agent) since both involve Architect prompt engineering. The reference capture infrastructure (Gap 0) builds on Gap 1's storage layer.

### Critical Path

```
                    ┌──► Gap 0 (References) ──┐
                    │                         │
Gap 1 (Blueprint) ──┼──► Gap 2 (Architect) ───┼──► Gap 5 (Conversion)
                    │                         │
                    │                         └──► Gap 6 (Frontend)
                    │                              ▲
Gap 3 (Operations) ─┴──► Gap 4 (Streaming) ────────┘


Parallel Track:

Gap 7 (Services) ──► Logging ──► Email ──► Stats ──► Auth
        │
        └──► Gap 8 (Environments) ──► Publish Flow
```

Gap 7 can start immediately and run parallel to the Studio work.
Gap 8 depends on Gap 7 for service credential handling per environment.

---

## Risk Assessment

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Architect outputs malformed JSON | High | Medium | JSON schema validation, retry logic |
| SSE connections drop | Medium | Low | Client-side reconnection, event replay from DB |
| Blueprint schema too restrictive | Medium | Medium | Start minimal, add sections iteratively |
| LLM latency affects chat UX | Low | High | Stream partial responses, show typing indicator |
| Build failures leave broken state | Low | Medium | SDLC already handles partial state |

---

## What's NOT a Gap

These are already solved by the current rdev foundation:

- **Project provisioning** - K8s, DNS, git all work
- **Template seeding** - Composable monorepo templates
- **SDLC execution** - Classifier + worker + artifact tracking
- **CI/CD** - Woodpecker integration
- **Database provisioning** - CockroachDB adapter
- **Webhooks** - Event dispatcher with retry

The foundation is solid. The gaps are about **exposing** existing capabilities through a conversational UI, not rebuilding core functionality.