Orchard Studio: Gap Analysis
This document maps the delta between current rdev capabilities and what Orchard Studio requires.
Current Foundation (What We Have)
| Capability | Status | Location |
|---|---|---|
| SDLC Classifier | ✅ Complete | internal/sdlc/classifier.go |
| Feature State Machine | ✅ Complete | internal/sdlc/ (10 phases, 31 rules) |
| Composable Templates | ✅ Complete | internal/adapter/templates/ |
| Worker Pod Execution | ✅ Complete | internal/worker/sdlc_executor.go |
| Webhook Dispatcher | ✅ Complete | internal/webhook/dispatcher.go |
| Project Provisioning | ✅ Complete | K8s namespace, DNS, git repo |
| Database Provisioning | ✅ Complete | CockroachDB adapter |
| Tree Workflows | ✅ Proven | cookbooks/trees/*.yaml |
Gap 0: Design Reference Capture & Processing
Current: No mechanism for users to provide visual inspiration. Features are described purely in text.
Required: Users can provide URLs or screenshots as design references, which inform the Architect's questions and the Blueprint's design system section.
What's Missing
┌─────────────────────────────────────────────────────────────────────────┐
│ CURRENT FLOW │
│ │
│ User: "Build a pricing page" │
│ Architect: *asks about data model, endpoints...* │
│ (No visual context, design decisions are guesswork) │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ REQUIRED FLOW │
│ │
│ User: "Build a pricing page like this" + [URL or screenshot] │
│ System: Captures screenshot, stores with Blueprint │
│ Architect: "I see a dark theme with 3 tiers..." → asks clarifying Qs │
│ Blueprint: Populates designSystem section with extracted tokens │
└─────────────────────────────────────────────────────────────────────────┘
Two Input Types
| Input | Capture Method | Storage |
|---|---|---|
| URL | Playwright screenshots the page automatically | /references/{blueprintId}/{refId}.png |
| Screenshot | User uploads image (drag/drop, paste, file picker) | Same storage path |
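Both input types can share one record shape and one storage path. The following is a minimal sketch under that assumption; `Reference`, `ReferenceKind`, `StoragePath`, and `Validate` are hypothetical names, and only the path layout comes from the table above.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// ReferenceKind distinguishes the two input types from the table above.
type ReferenceKind string

const (
	KindURL        ReferenceKind = "url"        // captured by a Playwright screenshot
	KindScreenshot ReferenceKind = "screenshot" // uploaded by the user
)

// Reference is a hypothetical record for one design reference.
type Reference struct {
	ID          string
	BlueprintID string
	Kind        ReferenceKind
	SourceURL   string // only meaningful when Kind == KindURL
}

// StoragePath follows the /references/{blueprintId}/{refId}.png layout.
func (r Reference) StoragePath() string {
	return fmt.Sprintf("/references/%s/%s.png", r.BlueprintID, r.ID)
}

// Validate rejects references that cannot be captured or stored.
func (r Reference) Validate() error {
	if r.ID == "" || r.BlueprintID == "" {
		return errors.New("reference needs an ID and a blueprint ID")
	}
	switch r.Kind {
	case KindURL:
		u, err := url.Parse(r.SourceURL)
		if err != nil || (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
			return fmt.Errorf("invalid source URL %q", r.SourceURL)
		}
	case KindScreenshot:
		// Uploaded image: nothing to fetch, so nothing more to check here.
	default:
		return fmt.Errorf("unknown reference kind %q", r.Kind)
	}
	return nil
}
```

Restricting URL references to http and https before handing them to the capture pod is a design choice of this sketch, not something the current code is known to do.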
Implementation Required
- Reference Capture Service:
  - For URLs: Reuse verify_executor.go pattern (Playwright pod)
  - For uploads: Standard file upload handling
  - Store thumbnails alongside Blueprint
- Chat Endpoint Enhancement:
  - Accept references[] array in request body
  - Process references before LLM call
  - Include reference images in Architect prompt context
- Architect Prompt Updates:
  - Describe what it observes in natural language
  - Ask clarifying questions about design intent
  - Extract structured design tokens into Blueprint
- Blueprint Schema:
  - Add references.items[] array
  - Add sections.designSystem section
  - Track which references informed which design decisions
- Plan Pane Rendering:
  - Show reference thumbnails in UI
  - Display extracted design tokens
  - Allow user to add annotations
Complexity: Medium
- URL capture reuses existing Playwright infrastructure
- File upload is standard pattern
- Main work is Architect prompt engineering for visual understanding
- LLM vision capabilities needed (Claude can see images natively)
Gap 1: Blueprint Storage & Chat API
Current: Features are created via POST /sdlc/features with a complete spec. No iterative refinement.
Required: Multi-turn conversation that builds a Blueprint incrementally.
What's Missing
┌─────────────────────────────────────────────────────────────────┐
│ CURRENT FLOW │
│ │
│ User writes spec → POST /sdlc/features → Feature created │
│ (one shot, no iteration) │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ REQUIRED FLOW │
│ │
│ User message → Architect responds + updates Blueprint → │
│ User message → Architect responds + updates Blueprint → │
│ ...repeat until ready... │
│ User: "build it" → Blueprint → SDLC Feature → Build │
└─────────────────────────────────────────────────────────────────┘
Implementation Required
- Database Tables:
  - blueprints - stores structured Blueprint JSON
  - blueprint_messages - conversation history with snapshots
- API Endpoints:
  - POST /projects/{id}/blueprint/chat - send message, get reply + updated blueprint
  - GET /projects/{id}/blueprints - list blueprints
  - GET /projects/{id}/blueprints/{id} - get specific blueprint
  - DELETE /projects/{id}/blueprints/{id} - discard draft
- Service Layer:
  - ArchitectService - manages conversation, calls LLM, updates Blueprint
Complexity: Medium
- Schema is defined (see app-vision.md)
- Standard CRUD + LLM integration
- Most work is in prompt engineering for Architect
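One chat turn can be sketched as a pure state update: call the Architect, append both messages with a Blueprint snapshot, then replace the current Blueprint. Everything named here (`Conversation`, `Message`, `ArchitectFunc`, the cut-down `Blueprint`) is a hypothetical stand-in; the real schema lives in app-vision.md and the real LLM call sits behind the function value.

```go
package main

import "errors"

// Blueprint is a cut-down stand-in for the schema in app-vision.md.
type Blueprint struct {
	Feature       string
	OpenQuestions []string
}

// Message is one row of conversation history. Snapshot records the
// Blueprint as it stood once this message had been processed.
type Message struct {
	Role     string // "user" or "architect"
	Content  string
	Snapshot Blueprint
}

// ArchitectFunc abstracts the LLM call (hypothetical signature).
type ArchitectFunc func(history []Message, userMsg string, current Blueprint) (reply string, updated Blueprint, err error)

// Conversation is the state a chat request would load and save.
type Conversation struct {
	Blueprint Blueprint
	Messages  []Message
}

// Chat runs one turn. On error the conversation is left untouched,
// so the same message can be retried.
func (c *Conversation) Chat(architect ArchitectFunc, userMsg string) (string, error) {
	if userMsg == "" {
		return "", errors.New("empty message")
	}
	reply, updated, err := architect(c.Messages, userMsg, c.Blueprint)
	if err != nil {
		return "", err
	}
	c.Messages = append(c.Messages,
		Message{Role: "user", Content: userMsg, Snapshot: c.Blueprint},
		Message{Role: "architect", Content: reply, Snapshot: updated},
	)
	c.Blueprint = updated
	return reply, nil
}
```

Snapshots here are shallow copies; a persisted version would serialize the Blueprint JSON into the message row instead of sharing slices in memory.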
Gap 2: Architect Agent Persona
Current: We have coding agents (/implement-feature). They write code, not specs.
Required: An agent that asks questions, fills in a structured Blueprint, knows when to stop.
What's Missing
┌─────────────────────────────────────────────────────────────────┐
│ CURRENT AGENTS │
│ │
│ User: "Add cat photos" │
│ Agent: *immediately writes code* │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ ARCHITECT AGENT │
│ │
│ User: "Add cat photos" │
│ Architect: "Should photos be public or friends-only?" │
│ User: "Public" │
│ Architect: "Got it. Do you want likes, comments, or neither?" │
│ ...continues until Blueprint is complete... │
└─────────────────────────────────────────────────────────────────┘
Implementation Required
- System Prompt:
  - .claude/agents/architect.md - detailed persona
  - Structured output format (reply + Blueprint JSON)
  - Question strategy (when to ask vs assume)
- Structured Output Parsing:
  - LLM returns {reply: string, blueprint: Blueprint}
  - Validate Blueprint against schema
  - Handle partial updates (delta vs full replacement)
- Completeness Logic:
  - isReadyToBuild(blueprint) function
  - Clear rules for when questions are resolved
  - Override mechanism for user to force build
Complexity: Medium-High
- Prompt engineering is iterative
- Structured output from LLMs can be fragile
- Need fallback handling for malformed responses
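The parsing, fallback, and completeness pieces could fit together roughly as follows. This is a sketch, not the planned implementation: the `openQuestions` field, the fallback wording, and the rule "ready when no open questions remain" are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"strings"
)

// Blueprint is a cut-down stand-in; the field names are assumptions.
type Blueprint struct {
	Feature       string   `json:"feature"`
	OpenQuestions []string `json:"openQuestions"`
}

// ArchitectOutput mirrors the {reply, blueprint} shape described above.
type ArchitectOutput struct {
	Reply     string     `json:"reply"`
	Blueprint *Blueprint `json:"blueprint"`
}

// ParseArchitectOutput decodes the outermost {...} span of raw, which
// tolerates prose before or after the JSON object.
func ParseArchitectOutput(raw string) (ArchitectOutput, error) {
	var out ArchitectOutput
	start, end := strings.Index(raw, "{"), strings.LastIndex(raw, "}")
	if start < 0 || end < start {
		return out, errors.New("no JSON object in response")
	}
	if err := json.Unmarshal([]byte(raw[start:end+1]), &out); err != nil {
		return out, err
	}
	if strings.TrimSpace(out.Reply) == "" {
		return out, errors.New("response has no reply")
	}
	if out.Blueprint == nil {
		return out, errors.New("response has no blueprint")
	}
	return out, nil
}

// ApplyTurn keeps the previous Blueprint when the response is malformed,
// so one bad generation cannot corrupt the stored state.
func ApplyTurn(prev Blueprint, raw string) (reply string, next Blueprint, ok bool) {
	out, err := ParseArchitectOutput(raw)
	if err != nil {
		return "Sorry, I lost track of that. Could you rephrase?", prev, false
	}
	return out.Reply, *out.Blueprint, true
}

// IsReadyToBuild reports whether the Blueprint can be built. force is
// the user override; it still requires a named feature.
func IsReadyToBuild(bp Blueprint, force bool) bool {
	if bp.Feature == "" {
		return false
	}
	return force || len(bp.OpenQuestions) == 0
}
```

A caller would typically retry the LLM call once or twice when `ok` is false before showing the fallback reply. This version treats every response as a full replacement; delta updates would need a merge step instead of the plain assignment.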
Gap 3: Operation Tracking (Tree Runner in DB)
Current: Tree workflows run via shell script (tree-runner.sh). State in local JSON files.
Required: Operations tracked in database, queryable via API, streamable to UI.
What's Missing
┌─────────────────────────────────────────────────────────────────┐
│ CURRENT │
│ │
│ ./tree-runner.sh slackpath-1.yaml │
│ → Runs in terminal │
│ → State in .checkpoints/slackpath-1.json │
│ → No API visibility │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ REQUIRED │
│ │
│ POST /operations/start {tree: "slackpath-1"} │
│ → Returns operation_id │
│ → State in operations table │
│ → GET /operations/{id}/stream returns SSE events │
└─────────────────────────────────────────────────────────────────┘
Implementation Required
- Database Tables:
  - operations - tracks running/completed operations
  - operation_events - event log for replay/streaming
- Service Layer:
  - OrchestratorService - manages operation lifecycle
  - Port tree-runner logic from bash to Go
  - Event emission during execution
- API Endpoints:
  - POST /projects/{id}/operations - start operation
  - GET /projects/{id}/operations/{id} - get status
  - GET /projects/{id}/operations/{id}/stream - SSE stream
- Worker Integration:
  - SDLC executor emits events as it progresses
  - Events written to operation_events table
  - SSE handler reads from table and streams
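The replay role of operation_events can be illustrated with an in-memory stand-in: each event gets a per-operation sequence number, and a reader asks for everything after the last number it saw. The `EventLog` type and its methods are hypothetical, and a real version would be backed by the table, not a map.

```go
package main

import "sync"

// Event is a hypothetical row in operation_events.
type Event struct {
	Seq         int // 1-based, increasing per operation
	OperationID string
	Payload     string
}

// EventLog stands in for the operation_events table.
type EventLog struct {
	mu     sync.Mutex
	events map[string][]Event
}

// NewEventLog returns an empty log.
func NewEventLog() *EventLog {
	return &EventLog{events: make(map[string][]Event)}
}

// Append stores an event and assigns the next sequence number.
func (l *EventLog) Append(opID, payload string) Event {
	l.mu.Lock()
	defer l.mu.Unlock()
	e := Event{Seq: len(l.events[opID]) + 1, OperationID: opID, Payload: payload}
	l.events[opID] = append(l.events[opID], e)
	return e
}

// After returns a copy of the events with Seq > lastSeq. Because Seq
// is index+1, those are exactly the elements from index lastSeq on.
func (l *EventLog) After(opID string, lastSeq int) []Event {
	l.mu.Lock()
	defer l.mu.Unlock()
	all := l.events[opID]
	if lastSeq < 0 {
		lastSeq = 0
	}
	if lastSeq >= len(all) {
		return nil
	}
	return append([]Event(nil), all[lastSeq:]...)
}
```

A streaming handler would call `After` with 0 for a fresh client and with the last delivered sequence number for a reconnecting one.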
Complexity: High
- Tree runner logic is non-trivial (dependencies, outputs, error handling)
- SSE streaming requires careful connection management
- Need to handle operation cancellation, resumption
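Much of the dependency handling reduces to ordering nodes so that each runs after everything it depends on. Below is a sketch using Kahn's algorithm; the `Node` shape is an assumption for illustration and is not taken from the tree YAML format.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Node is a hypothetical tree step with named dependencies.
type Node struct {
	Name      string
	DependsOn []string
}

// ExecutionOrder returns a deterministic order in which every node
// appears after all of its dependencies (Kahn's algorithm).
func ExecutionOrder(nodes []Node) ([]string, error) {
	indegree := make(map[string]int, len(nodes))
	dependents := make(map[string][]string)
	for _, n := range nodes {
		if _, dup := indegree[n.Name]; dup {
			return nil, fmt.Errorf("duplicate node %q", n.Name)
		}
		indegree[n.Name] = 0
	}
	for _, n := range nodes {
		for _, dep := range n.DependsOn {
			if _, ok := indegree[dep]; !ok {
				return nil, fmt.Errorf("node %q depends on unknown node %q", n.Name, dep)
			}
			indegree[n.Name]++
			dependents[dep] = append(dependents[dep], n.Name)
		}
	}
	var ready []string
	for name, d := range indegree {
		if d == 0 {
			ready = append(ready, name)
		}
	}
	sort.Strings(ready) // map iteration is random; sort for stable output
	var order []string
	for len(ready) > 0 {
		name := ready[0]
		ready = ready[1:]
		order = append(order, name)
		for _, next := range dependents[name] {
			indegree[next]--
			if indegree[next] == 0 {
				ready = append(ready, next)
			}
		}
		sort.Strings(ready)
	}
	if len(order) != len(nodes) {
		return nil, errors.New("dependency cycle detected")
	}
	return order, nil
}
```

Nodes that are ready at the same time are emitted alphabetically here; an orchestrator could run them concurrently instead.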
Gap 4: Real-Time Progress Streaming
Current: Webhooks fire on build complete. No per-step visibility.
Required: SSE stream showing "Designing schema... Writing handlers... Running tests..."
What's Missing
┌─────────────────────────────────────────────────────────────────┐
│ CURRENT │
│ │
│ Build starts → ... silence ... → Webhook: "build complete" │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ REQUIRED │
│ │
│ Build starts → │
│ event: {"phase": "spec", "status": "complete"} │
│ event: {"phase": "design", "status": "in_progress"} │
│ event: {"phase": "design", "status": "complete"} │
│ event: {"phase": "implement", "progress": 0.5} │
│ ... │
│ event: {"status": "complete", "url": "..."} │
└─────────────────────────────────────────────────────────────────┘
Implementation Required
- SDLC Executor Changes:
  - Emit events at phase transitions
  - Emit progress within phases (task completion)
  - Write events to operation_events table
- SSE Handler:
  - GET /operations/{id}/stream
  - Long-lived connection
  - Read events from DB (or Redis pub/sub)
  - Handle client disconnection gracefully
- Event Types:

    type OperationEvent struct {
        Type      string    // "phase", "progress", "artifact", "error", "complete"
        Phase     string    // "spec", "design", "implement", "test", "deploy"
        Status    string    // "in_progress", "complete", "failed"
        Message   string    // Human-readable
        Progress  float64   // 0.0 to 1.0 for granular progress
        Timestamp time.Time
    }
Complexity: Medium
- SSE is straightforward in Go
- Main work is instrumenting SDLC executor
- Need to balance granularity vs noise
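For a sense of scale, here is a minimal SSE handler using only net/http. It is a sketch: the event struct is a cut-down version of `OperationEvent` with assumed JSON tags and an assumed `Seq` field, and the channel stands in for whatever reads from operation_events or pub/sub.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// OperationEvent is a cut-down version of the struct above. Seq and
// the JSON tags are assumptions made for this sketch.
type OperationEvent struct {
	Seq     int    `json:"-"`
	Type    string `json:"type"`
	Phase   string `json:"phase,omitempty"`
	Status  string `json:"status,omitempty"`
	Message string `json:"message,omitempty"`
}

// FormatSSE renders one event in text/event-stream wire format:
// id, event and data fields followed by a blank line.
func FormatSSE(e OperationEvent) (string, error) {
	data, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("id: %d\nevent: %s\ndata: %s\n\n", e.Seq, e.Type, data), nil
}

// StreamHandler writes events until the channel closes or the client
// disconnects (signalled through the request context).
func StreamHandler(events <-chan OperationEvent) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		for {
			select {
			case <-r.Context().Done():
				return
			case e, open := <-events:
				if !open {
					return
				}
				frame, err := FormatSSE(e)
				if err != nil {
					return
				}
				fmt.Fprint(w, frame)
				flusher.Flush() // push the frame to the client immediately
			}
		}
	}
}
```

Because each frame carries an `event:` name, a browser `EventSource` receives it through `addEventListener("phase", ...)` and not through `onmessage`. The `id:` field is what the browser sends back as `Last-Event-ID` on reconnect.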
Gap 5: Blueprint → SDLC Feature Conversion
Current: SDLC features are created manually with spec documents.
Required: Automated conversion from structured Blueprint to SDLC feature spec.
What's Missing
┌─────────────────────────────────────────────────────────────────┐
│ CURRENT │
│ │
│ Human writes: spec.md with prose description │
│ → POST /sdlc/features │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ REQUIRED │
│ │
│ Blueprint JSON → Template rendering → spec.md │
│ → Automated POST /sdlc/features │
└─────────────────────────────────────────────────────────────────┘
Implementation Required
- Spec Template:

    # Feature: {{.Feature}}

    ## Summary
    {{.Summary}}

    ## Data Model
    {{range .Sections.DataModel.Entities}}
    ### {{.Name}}
    | Field | Type |
    |-------|------|
    {{range .Fields}}| {{.Name}} | {{.Type}} |
    {{end}}
    {{end}}

    ## API Endpoints
    {{range .Sections.APIEndpoints.Endpoints}}
    - `{{.Method}} {{.Path}}` - {{.Description}}
    {{end}}

    ## UI Components
    {{range .Sections.UIComponents.Components}}
    - **{{.Name}}**: {{.Purpose}}
    {{end}}

    ## Assumptions
    {{range .Assumptions}}
    - {{.Assumption}}
    {{end}}

- Conversion Service:
  - Takes Blueprint, renders spec.md
  - Creates SDLC feature via existing API
  - Links Blueprint to created feature (built_feature_slug)
Complexity: Low
- Template rendering is straightforward
- SDLC feature creation already exists
- Main work is template design
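Rendering with text/template might look like the sketch below. It uses a shortened template and a flattened, hypothetical `Blueprint` struct; the real template would walk the nested `.Sections` fields.

```go
package main

import (
	"bytes"
	"text/template"
)

// Endpoint and Blueprint are simplified stand-ins for the real schema.
type Endpoint struct {
	Method, Path, Description string
}

type Blueprint struct {
	Feature   string
	Summary   string
	Endpoints []Endpoint
}

// specTemplate is a shortened version of the spec template. The list
// item sits on the same line as {{range}} so no blank lines are emitted
// between endpoints.
const specTemplate = `# Feature: {{.Feature}}

## Summary
{{.Summary}}

## API Endpoints
{{range .Endpoints}}- {{.Method}} {{.Path}} - {{.Description}}
{{end}}`

// RenderSpec turns a Blueprint into spec.md content.
func RenderSpec(bp Blueprint) (string, error) {
	tmpl, err := template.New("spec").Parse(specTemplate)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, bp); err != nil {
		return "", err
	}
	return buf.String(), nil
}
```

text/template is used in place of html/template because the output is Markdown, where HTML escaping would corrupt characters such as `<` in type names.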
Gap 6: Frontend (Next.js Studio)
Current: No frontend. All interaction via API/CLI.
Required: Three-pane interface (Chat, Plan, Preview).
What's Missing
Everything. This is a new application.
Implementation Required
- Project Setup:
  - Next.js 14 with App Router
  - Tailwind CSS for styling
  - Authentication (integrate with rdev auth)
- Core Components:

    apps/studio/
    ├── app/
    │   ├── page.tsx              # Template selection
    │   ├── projects/
    │   │   └── [id]/
    │   │       └── page.tsx      # Three-pane workspace
    │   └── api/                  # Proxy to rdev-api
    ├── components/
    │   ├── ChatPane.tsx
    │   ├── PlanPane.tsx
    │   ├── PreviewPane.tsx
    │   ├── ActivityFeed.tsx
    │   └── BuildProgress.tsx
    └── lib/
        ├── api.ts                # rdev-api client
        └── sse.ts                # SSE connection manager

- State Management:
  - Blueprint state (updated on each chat response)
  - Operation state (updated via SSE)
  - UI state (which pane is focused, etc.)
- Key Interactions:
  - Send chat message → receive reply + blueprint
  - Click "Build It" → start operation → show progress
  - Operation complete → refresh preview iframe
Complexity: Medium
- Standard Next.js app
- SSE client requires careful handling
- Most complexity is in polish and UX
Gap 7: Platform Service Infrastructure
Current: Projects manage their own integrations. No shared services, no credential management.
Required: A service catalog with provisioning, credential injection, and upgrade paths for existing projects.
The "Upgrade" Problem
┌─────────────────────────────────────────────────────────────────┐
│ CURRENT │
│ │
│ Project created 3 months ago │
│ → No centralized logging │
│ → No analytics │
│ → Rolling your own email │
│ → No easy way to add platform services │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ REQUIRED │
│ │
│ POST /projects/{id}/services │
│ { "type": "logging", "provider": "loki" } │
│ │
│ → Provision credentials │
│ → Inject into K8s secrets │
│ → Create integration PR with config changes │
│ → Project now ships logs to centralized system │
└─────────────────────────────────────────────────────────────────┘
Service Rollout Order
Build the infrastructure with the simplest service first, then add complexity:
| Order | Service | Why This Order |
|---|---|---|
| 1 | Logging | Pure infrastructure, no user-facing code changes |
| 2 | Email | Simple API calls, clear success/failure |
| 3 | Stats | Frontend SDK + backend events |
| 4 | Auth | Most complex (middleware, user model, protected routes) |
Implementation Required
1. Service Catalog
# internal/platform/catalog.yaml
services:
  logging:
    description: "Centralized log aggregation"
    providers:
      loki:
        name: "Grafana Loki"
        credentials:
          - LOKI_URL
          - LOKI_TENANT_ID
        integration:
          go:
            config_template: "loki-logger.go.tmpl"
            env_example: ["LOKI_URL", "LOKI_TENANT_ID"]
          node:
            packages: ["pino", "pino-loki"]
            config_template: "pino-loki.ts.tmpl"
  email:
    description: "Transactional email"
    providers:
      resend:
        name: "Resend"
        credentials:
          - RESEND_API_KEY
        integration:
          go:
            packages: ["github.com/resendlabs/resend-go"]
            service_template: "email-service.go.tmpl"
          node:
            packages: ["resend"]
            service_template: "email-client.ts.tmpl"
  stats:
    description: "Product analytics"
    providers:
      posthog:
        name: "PostHog"
        credentials:
          - POSTHOG_API_KEY
          - POSTHOG_HOST
        integration:
          go:
            packages: ["github.com/posthog/posthog-go"]
          node:
            packages: ["posthog-js", "posthog-node"]
            provider_template: "analytics-provider.tsx.tmpl"
  auth:
    description: "User authentication"
    providers:
      clerk:
        name: "Clerk"
        credentials:
          - CLERK_SECRET_KEY
          - NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY
        integration:
          node:
            packages: ["@clerk/nextjs"]
            middleware_template: "clerk-middleware.ts.tmpl"
            provider_template: "clerk-provider.tsx.tmpl"
2. Database Schema
-- Track which services a project uses
CREATE TABLE project_services (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    project_id UUID NOT NULL REFERENCES projects(id),
    service_type TEXT NOT NULL,  -- 'logging', 'email', 'stats', 'auth'
    provider TEXT NOT NULL,      -- 'loki', 'resend', 'posthog', 'clerk'
    environment TEXT NOT NULL,   -- 'staging', 'production', 'all'

    -- Encrypted credentials
    credentials_encrypted BYTEA,

    -- Non-sensitive config
    config JSONB NOT NULL DEFAULT '{}',

    -- Status tracking
    status TEXT NOT NULL DEFAULT 'provisioning',
    -- provisioning → active → needs_update → deprovisioned

    -- Integration tracking
    integration_status TEXT DEFAULT 'pending',
    -- pending → pr_created → integrated → needs_update
    integration_pr_url TEXT,
    integration_commit TEXT,

    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),

    UNIQUE(project_id, service_type, environment)
);
3. Provisioner Interface
// internal/port/platform_provisioner.go
type PlatformProvisioner interface {
    // Provision creates credentials for a project
    Provision(ctx context.Context, req ProvisionRequest) (*ProvisionResult, error)

    // Verify checks if credentials are still valid
    Verify(ctx context.Context, projectID string, creds map[string]string) error

    // Deprovision cleans up (optional, for account removal)
    Deprovision(ctx context.Context, projectID string) error
}

type ProvisionRequest struct {
    ProjectID   uuid.UUID
    ProjectName string
    Environment string // "staging", "production"
}

type ProvisionResult struct {
    Credentials map[string]string // Encrypted before storage
    Config      map[string]string // Non-sensitive config
}
4. Service Addition API
POST /projects/{projectId}/services
{
  "serviceType": "logging",
  "provider": "loki" // Optional, uses platform default
}

Response:
{
  "serviceId": "svc_abc123",
  "status": "provisioning",
  "integrationMethod": "pr", // or "direct"
  "prUrl": null // Populated when PR is created
}

GET /projects/{projectId}/services/{serviceId}
{
  "serviceId": "svc_abc123",
  "serviceType": "logging",
  "provider": "loki",
  "status": "active",
  "integrationStatus": "integrated",
  "integrationCommit": "abc123...",
  "credentials": {
    "LOKI_URL": "[redacted]",
    "LOKI_TENANT_ID": "project-xyz"
  }
}
5. Integration Flow
POST /projects/{id}/services {type: "logging", provider: "loki"}
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ 1. PROVISION │
│ │
│ LokiProvisioner.Provision() │
│ → Create tenant in Loki (or use shared with project prefix) │
│ → Generate credentials │
│ → Store encrypted in project_services │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ 2. INJECT │
│ │
│ K8sSecretInjector.Inject() │
│ → Add LOKI_URL, LOKI_TENANT_ID to project's K8s secret │
│ → Trigger deployment restart to pick up new env vars │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ 3. INTEGRATE │
│ │
│ IntegrationService.CreatePR() or .DirectCommit() │
│ → Clone project repo │
│ → Apply integration templates: │
│ • Update logger config to ship to Loki │
│ • Add env vars to .env.example │
│ • Update deployment to mount secrets │
│ → Create PR (or direct commit for new projects) │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ 4. VERIFY │
│ │
│ After PR merge / deploy: │
│ → Check logs appearing in Loki │
│ → Update integration_status to "integrated" │
└─────────────────────────────────────────────────────────────────┘
Complexity: High
- Service catalog is straightforward (YAML/DB)
- Each provisioner is unique (Loki vs Resend vs PostHog)
- Credential encryption and management needs care
- Integration templates need to handle Go + Node + various frameworks
- PR creation requires git operations
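For the credential encryption point, one common approach is AES-GCM with a random nonce stored in front of the ciphertext, which fits a single BYTEA column such as credentials_encrypted. The sketch below assumes that approach; the function names are hypothetical, and key management (where the key lives, how it rotates) is out of scope.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/json"
	"errors"
)

// newGCM builds an AES-GCM AEAD. key must be 16, 24, or 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// EncryptCredentials serializes creds to JSON and seals them.
// The result is nonce || ciphertext || tag.
func EncryptCredentials(key []byte, creds map[string]string) ([]byte, error) {
	plaintext, err := json.Marshal(creds)
	if err != nil {
		return nil, err
	}
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst makes Seal append the ciphertext after it.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// DecryptCredentials reverses EncryptCredentials and fails if the
// data was modified or the key is wrong.
func DecryptCredentials(key, sealed []byte) (map[string]string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	plaintext, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return nil, err
	}
	var creds map[string]string
	if err := json.Unmarshal(plaintext, &creds); err != nil {
		return nil, err
	}
	return creds, nil
}
```

GCM authenticates as well as encrypts, so a tampered row is rejected on read instead of yielding garbage credentials.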
Starting Point: Logging with Loki
// internal/adapter/loki/provisioner.go
type LokiProvisioner struct {
    lokiURL    string
    adminToken string // For tenant creation if using multi-tenant Loki
}

func (p *LokiProvisioner) Provision(ctx context.Context, req ProvisionRequest) (*ProvisionResult, error) {
    // For single-tenant Loki, just create a unique label prefix
    tenantID := fmt.Sprintf("project-%s", req.ProjectID)

    return &ProvisionResult{
        Credentials: map[string]string{
            "LOKI_URL":       p.lokiURL,
            "LOKI_TENANT_ID": tenantID,
        },
        Config: map[string]string{
            "service_name": req.ProjectName,
        },
    }, nil
}
Gap 8: Dual Environment Support
Current: Single deployment per project. Main branch = production.
Required: Staging + Production environments. Build deploys to staging, "Publish" promotes to production.
The Environment Model
┌─────────────────────────────────────────────────────────────────┐
│ Project: cool-project │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ STAGING │ │
│ │ staging.cool-project.threesix.ai │ │
│ │ │ │
│ │ • Where development happens │ │
│ │ • Preview pane shows this │ │
│ │ • "Build It" deploys here │ │
│ │ • May use test credentials for services │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ [Publish] │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ PRODUCTION │ │
│ │ cool-project.threesix.ai │ │
│ │ │ │
│ │ • User-facing, stable │ │
│ │ • Only updated via explicit "Publish" │ │
│ │ • Production credentials for services │ │
│ │ • Enabled after first publish │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Implementation Required
1. DNS Changes
// On project creation, create both records (prod may be placeholder)
CreateDNSRecord("staging.cool-project.threesix.ai", stagingIP)
CreateDNSRecord("cool-project.threesix.ai", prodIP) // Or placeholder until first publish
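The two record names follow one pattern, so they can be derived from the project slug. A small sketch, assuming a hypothetical `EnvironmentHosts` helper and assuming project names must already be valid single DNS labels:

```go
package main

import (
	"fmt"
	"regexp"
)

// dnsLabel matches a lowercase DNS label of 1 to 63 characters that
// does not start or end with a hyphen.
var dnsLabel = regexp.MustCompile(`^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$`)

// EnvironmentHosts returns the staging and production hostnames for a
// project under baseDomain.
func EnvironmentHosts(project, baseDomain string) (staging, production string, err error) {
	if !dnsLabel.MatchString(project) {
		return "", "", fmt.Errorf("project name %q is not a valid DNS label", project)
	}
	production = project + "." + baseDomain
	staging = "staging." + production
	return staging, production, nil
}
```

Validating the label up front keeps a malformed project name from reaching the DNS provider.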
2. K8s Deployment Model
# Option A: Two deployments in same namespace
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cool-project-staging
  namespace: cool-project
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cool-project-production
  namespace: cool-project

# Option B: Two namespaces (cleaner isolation)
# cool-project-staging namespace
# cool-project-production namespace
Recommendation: Same namespace, two deployments. Simpler to manage, secrets can be shared or scoped.
3. Database Model
Two options:
A. Same database, table-name prefixes:
-- Staging tables
staging_users, staging_posts, staging_...
-- Production tables
prod_users, prod_posts, prod_...
B. Separate databases (cleaner):
cool-project-staging (CockroachDB database)
cool-project-production (CockroachDB database)
Recommendation: Separate databases. Cleaner isolation, no risk of cross-env data access.
4. Project Schema Updates
ALTER TABLE projects ADD COLUMN environments JSONB NOT NULL DEFAULT '{
"staging": {"enabled": true, "deployed_at": null},
"production": {"enabled": false, "deployed_at": null, "published_at": null}
}';
5. Publish API
POST /projects/{projectId}/publish
{
  "fromEnvironment": "staging", // Usually staging
  "toEnvironment": "production"
}

Response:
{
  "operationId": "op_xyz789",
  "status": "publishing",
  "streamUrl": "/operations/{operationId}/stream"
}
Publish Flow:
- Validate staging is healthy
- Provision production credentials for any services (if not exist)
- Run migrations on production database
- Deploy staging image to production deployment
- Health check production
- Update DNS if needed
- Update project.environments.production
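The coordination itself can be modeled as an ordered list of steps that stops at the first failure and reports how far it got. This is a sketch with hypothetical names (`Step`, `RunPublish`, `ErrUnhealthy`); rollback and cancellation are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnhealthy is a sample error a health-check step might return.
var ErrUnhealthy = errors.New("environment is not healthy")

// Step is one stage of the publish flow.
type Step struct {
	Name string
	Run  func() error
}

// RunPublish executes steps in order and stops at the first failure.
// It returns the names of the steps that completed, which tells the
// caller what may need to be undone.
func RunPublish(steps []Step) (completed []string, err error) {
	for _, s := range steps {
		if err := s.Run(); err != nil {
			return completed, fmt.Errorf("publish step %q failed: %w", s.Name, err)
		}
		completed = append(completed, s.Name)
	}
	return completed, nil
}
```

Wrapping with `%w` keeps the original error available to `errors.Is`, so a caller can distinguish a failed health check from, say, a failed migration.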
Complexity: Medium
- DNS: Already have CloudflareAdapter, just create two records
- K8s: Straightforward deployment duplication
- Database: CockroachDB adapter supports multiple databases
- Main complexity is the publish flow coordination
Defer Until After Gap 7
Dual environments can work with platform services, but we can build Gap 7 (services) first:
- Services provision for a single environment initially
- Then extend to environment-aware provisioning
- Then add the publish flow that syncs services to production
Summary: Work Required
| Gap | Effort | Dependencies | Critical Path |
|---|---|---|---|
| 0. Design References | 2-3 days | Gap 1 (storage) | Yes (for design flows) |
| 1. Blueprint Storage | 2-3 days | None | Yes |
| 2. Architect Agent | 3-5 days | Gap 1 | Yes |
| 3. Operation Tracking | 4-6 days | None | Yes |
| 4. Progress Streaming | 2-3 days | Gap 3 | Yes |
| 5. Blueprint → SDLC | 1-2 days | Gap 1 | Yes |
| 6. Frontend | 5-7 days | Gaps 1-5 | Yes |
| 7. Platform Services | 5-8 days | None (can start now) | Parallel track |
| 8. Dual Environments | 3-5 days | Gap 7 | After services work |
Total Estimate: 4-5 weeks of focused work (Gaps 7-8 can run in parallel with Gaps 1-6)
Service Rollout (within Gap 7):
- Logging (Loki) - 2 days
- Email (Resend) - 2 days
- Stats (PostHog) - 2 days
- Auth (Clerk) - 3 days
Note: Gap 0 (Design References) can be implemented in parallel with Gap 2 (Architect Agent) since both involve Architect prompt engineering. The reference capture infrastructure (Gap 0) builds on Gap 1's storage layer.
Critical Path
                    ┌──► Gap 0 (References) ──┐
                    │                         │
Gap 1 (Blueprint) ──┼──► Gap 2 (Architect) ───┼──► Gap 5 (Conversion)
                    │                         │
                    │                         └──► Gap 6 (Frontend)
                    │                              ▲
Gap 3 (Operations) ─┴──► Gap 4 (Streaming) ────────┘

Parallel Track:
Gap 7 (Services) ──► Logging ──► Email ──► Stats ──► Auth
        │
        └──► Gap 8 (Environments) ──► Publish Flow
Gap 7 can start immediately and run parallel to the Studio work. Gap 8 depends on Gap 7 for service credential handling per environment.
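As a cross-check on the schedule, the effort ranges and the Dependencies column of the summary table can be turned into a longest-path calculation. The sketch below assumes each gap starts as soon as its listed dependencies finish and takes "Gaps 1-5" literally as gaps 1 through 5; the function names are hypothetical.

```go
package main

// Gap holds the effort range and dependencies from the summary table.
type Gap struct {
	Min, Max  int   // effort in days
	DependsOn []int // gap numbers that must finish first
}

var gaps = map[int]Gap{
	0: {2, 3, []int{1}},
	1: {2, 3, nil},
	2: {3, 5, []int{1}},
	3: {4, 6, nil},
	4: {2, 3, []int{3}},
	5: {1, 2, []int{1}},
	6: {5, 7, []int{1, 2, 3, 4, 5}},
	7: {5, 8, nil},
	8: {3, 5, []int{7}},
}

// MinEffort and MaxEffort select one end of the effort range.
func MinEffort(g Gap) int { return g.Min }
func MaxEffort(g Gap) int { return g.Max }

// finish returns the earliest day on which gap id can be done.
func finish(id int, effort func(Gap) int, memo map[int]int) int {
	if v, ok := memo[id]; ok {
		return v
	}
	g := gaps[id]
	start := 0
	for _, dep := range g.DependsOn {
		if f := finish(dep, effort, memo); f > start {
			start = f
		}
	}
	memo[id] = start + effort(g)
	return memo[id]
}

// CriticalPath returns the length in days of the longest dependency chain.
func CriticalPath(effort func(Gap) int) int {
	memo := map[int]int{}
	longest := 0
	for id := range gaps {
		if f := finish(id, effort, memo); f > longest {
			longest = f
		}
	}
	return longest
}

// Sequential sums the effort of the given gaps, as if one person
// worked through them one after another.
func Sequential(effort func(Gap) int, ids ...int) int {
	total := 0
	for _, id := range ids {
		total += effort(gaps[id])
	}
	return total
}
```

Under these assumptions the longest chain is Gap 3 → Gap 4 → Gap 6 at 11 to 16 days, while working through Gaps 0-6 strictly in sequence sums to 19 to 29 days, which is roughly in line with the 4-5 week estimate.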
Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Architect outputs malformed JSON | High | Medium | JSON schema validation, retry logic |
| SSE connections drop | Medium | Low | Client-side reconnection, event replay from DB |
| Blueprint schema too restrictive | Medium | Medium | Start minimal, add sections iteratively |
| LLM latency affects chat UX | Low | High | Stream partial responses, show typing indicator |
| Build failures leave broken state | Low | Medium | SDLC already handles partial state |
What's NOT a Gap
These are already solved by the current rdev foundation:
- Project provisioning - K8s, DNS, git all work
- Template seeding - Composable monorepo templates
- SDLC execution - Classifier + worker + artifact tracking
- CI/CD - Woodpecker integration
- Database provisioning - CockroachDB adapter
- Webhooks - Event dispatcher with retry
The foundation is solid. The gaps are about exposing existing capabilities through a conversational UI, not rebuilding core functionality.