jordan/rdev

Author	SHA1	Message	Date
jordan	3979ef2d08	feat: wire mixed-heritage through Stage 4 and fix pronoun support All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details - specgen: extend dnaLLMResponse with heritage fields; conditionally extend Stage 4 prompt for EthnicityMixed to ask LLM for primary_heritage, secondary_heritage, and mix_percentage; populate IdentityDNA fields from response so mixed personas get a real heritage breakdown - imagegen: buildIdentitySection() produces "East Asian and Latina/Hispanic heritage" description for mixed personas instead of generic "mixed-race" - videogen: add genderPronouns() helper; replace hardcoded she/her with pronoun set across all 4 video prompts; generateVideo() returns raw bytes so caller can upload to storage - service: GenerateVideo() uploads video to storage and sets VideoSpec.URL; anchor ordering ensures position 1 is generated first; emit persona_video_failed SSE event on non-fatal video failures; replace manual fold helpers with strings.ToLower + strings.Contains - worker/main: register persona_generate handler when both AI managers ready - docs: add persona_video_failed to SSE events reference in personagen.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 01:21:59 -07:00
jordan	002c32aedb	feat: add album generation system to skeleton Adds anchor-based image album generation across docs, skeleton, and rendered full-monorepo. One subject description + one anchor image + N directed shots, covering personas, products, characters, and brand assets out of the box. ## What ships Skeleton packages: - pkg/album/types.go — Album, Shot, ShotStatus, ShotTemplate, AlbumUpdater - pkg/album/templates.go — PortraitSession, ProductShoot, CharacterSheet built-ins - pkg/album/handler.go — AnchorHandler + ShotHandler queue job handlers - packages/realtime/src/useAlbumGeneration.ts — SSE hook owning all album state - packages/ui/src/components/AlbumGrid.tsx — responsive shot grid with shimmer - packages/ui/src/components/ShotCard.tsx — pending/generating/complete/failed states - packages/ui/src/components/AnchorPreview.tsx — anchor CTA + image with controls Component service template: - internal/port/album.go — AlbumRepository interface - internal/adapter/memory/album.go — in-memory repo for standalone dev - internal/service/album.go — create, list, get, generateAnchor, generateAllShots - internal/api/handlers/album.go — HTTP handlers (CRUD + 202 generation endpoints) - Routes: GET/POST /albums, GET/DELETE /albums/{id}, POST /albums/{id}/anchor, POST/DELETE /albums/{id}/shots, POST /albums/{id}/shots/{index} Documentation: - .claude/guides/album.md — full guide with API, SSE events, frontend usage Key architecture decisions: - Anchor bytes never stored in queue payload — workers fetch AnchorURL at runtime - Generation order enforced: POST /shots returns 422 if no anchor exists - All album SSE events on existing user:<userId> channel (no new channel) - AlbumUpdater interface lets job handlers update repo from inside queue workers Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 23:57:21 -07:00
jordan	4603402b84	feat: OTP supports unified register+login flow All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details Previously SendOTP silently dropped requests for unknown emails, so new users had no passwordless path in. Now: - SendOTP: if REGISTRATION_ENABLED and email unknown, generates and sends the code anyway (UserID nil until verify) - VerifyOTP: if email unknown after valid code, auto-registers the user (emailVerified=true — OTP delivery proves ownership, name defaults to email local-part) then creates a session REGISTRATION_ENABLED=false continues to block unknown emails at SendOTP, preserving invite-only / closed-beta behaviour. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 11:17:42 -07:00
jordan	5ac9af018a	fix: always log OTP codes to stdout in standalone dev mode All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details In-memory auth codes are ephemeral — they're wiped on server restart. Previously, codes were only visible via email delivery. If the server restarted between OTP send and OTP verify, the code would be lost. Now memory.AuthCodeRepository.Create() always logs the code to stdout with a [DEV] prefix. This gives developers a reliable fallback regardless of whether NOTIFY_URL is set. Updated CLAUDE.md to document this behavior and the DEV_USER_EMAIL env var. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 00:13:12 -07:00
jordan	5f66eb0e7b	fix: seed dev user from DEV_USER_EMAIL env var so auth survives restarts All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details In standalone mode (no DATABASE_URL), the in-memory user store only had hardcoded demo accounts. Any real email the developer used was lost on every server restart, causing OTP requests to silently fail with "unknown email". NewUserRepository now accepts devEmail + devPassword. If DEV_USER_EMAIL is set, that account is seeded on every startup alongside the demo users. The developer's email is always registered, OTPs route to notify (or log to console), and re-renders/restarts no longer break the auth flow. New config fields: DevUserEmail (DEV_USER_EMAIL) / DevUserPassword (DEV_USER_PASSWORD, default: "DevPassword1"). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-21 23:46:12 -07:00
jordan	27e6cfd42b	feat: add HTML email template system to skeleton service component All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details Every project generated from the skeleton now ships with styled, production-ready transactional emails out of the box. New pkg/email package: - Renderer: loads templates from caller-provided embed.FS, inlines CSS via douceur at startup, derives plain text via goquery for multipart delivery - DevHandler: live browser preview at GET /dev/emails and /dev/emails/{purpose} (development only, never mounted in production) - CSSInlineErr field on RenderedEmail so callers can log degraded renders New service component templates: - internal/email/embed.go.tmpl — embeds template FS (uses all: prefix for _*.html) - internal/email/renderer_test.go.tmpl — 9 tests covering all purposes + brand injection - internal/email/templates/ — 5 HTML email types (login_otp, email_verify, magic_link, password_reset, welcome) + 5 shared partials (_layout, _header, _footer, _button, _code_box) Updated service component templates: - config.go.tmpl — brand fields: AppName, AppURL, SupportEmail, LogoURL, BrandColor - main.go.tmpl — wires renderer at startup, logs template count - routes.go.tmpl — mounts /dev/emails in development; EmailRenderer in Dependencies - notify.go.tmpl — renders HTML before sending; warns on CSS inlining failure - go.mod.tmpl — adds douceur, goquery, gorilla/css, andybalholm/cascadia Deleted: internal/adapter/email/helpers.go.tmpl (replaced by meta.yaml + renderer) Fix: template directory named email_verify (matching domain.PurposeEmailVerify) rather than verify_email — the mismatch caused all verification emails to fail with "unknown email purpose" at send time while tests passed (tests called Render directly with the wrong name). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-21 22:44:59 -07:00
jordan	4f01015132	feat: implement project access enforcement and management API All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details - Fix no-op RequireProjectAccess middleware to enforce project_ids - Apply project access middleware to all project-scoped routes - Filter GET /projects by allowed project IDs for restricted keys - Add GET /me endpoint with key identity, scopes, and project access info - Add PATCH /keys/{id} for partial key updates (name, scopes, project_ids, allowed_ips, expires_in) - Add GET/POST/DELETE /projects/{id}/access for project-centric access management - Auto-grant creating key access when using POST /project/create-and-build - Accept grant_to_key_ids in create-and-build to grant multiple keys on project creation - Move newProvisionerWithDeps test helper from production code to test file Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-21 15:38:37 -07:00
jordan	0f25bd8dbe	feat: hook in notify service for per-project email delivery All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details - Add NotifyProvisioner (port + adapter) using real notify admin API - Create notify account + send key + host grant per project - Inject NOTIFY_API_KEY/HOST/FROM into component deployments - Store NOTIFY_URL, NOTIFY_ADMIN_KEY, RESEND_API_KEY in credential store - Add setup-notify.sh for one-time host/provider/domain setup - Add NOTIFY_ADMIN_KEY constant to domain/credential.go - Wire provisioner in main.go with connection test guard - Add .claude/guides/services/notify.md and CLAUDE.md entry Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-21 00:30:32 -07:00
jordan	bc77504b35	fix: add 'use client' directive to MediaLibrary and MediaUploader components These components use useState/useRef hooks but lacked the Next.js 'use client' directive, causing the Next.js app build to fail with Server Component errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 00:32:24 -07:00
jordan	592b2d5ec0	fix: clarify database types across docs and fix video storage persistence Two distinct fixes: 1. Database terminology: Make it crystal clear that generated projects use CockroachDB in production and PostgreSQL for local dev, while the rdev platform itself uses PostgreSQL. Updated 15 files across skeleton agents, component templates, cookbook trees, and platform docs. 2. Video storage: VideoHandler was ignoring vid.Data bytes (already downloaded by the Gemini adapter with auth) and re-downloading from the provider URL with a plain GET — which fails because Gemini URLs require API key auth. Now uses vid.Data first, falls back to downloadURL only for public URLs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 23:13:21 -07:00
jordan	a8c8a0a14d	feat: add GCS-based persistent media storage, AI generation pipeline, and composable skeleton packages All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details Adds complete media storage pipeline with GCS presigned uploads, AI image/video/text generation via queue-based workers, realtime SSE event streaming, and comprehensive skeleton packages (storage, mediagen, textgen, generation, realtime, persona, routing, ai-client). Includes security fixes for media delete authorization, nil pointer guards in handlers, video persistence via download-then-upload, consistent signed URLs, and Image→ImageIcon rename to avoid DOM collision. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 21:29:09 -07:00
jordan	7249575dea	feat(sessions): add command execution endpoint and activity tracking All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details - Add POST /sessions/:id/exec endpoint for executing commands in sessions - Add session activity tracking (last_activity_at timestamp) - Add database migration 024 for session activity column - Add comprehensive tests for session handlers and service layer - Add wildcard TLS certificate for preview.threesix.ai subdomain - Add infrastructure mocks for testing preview service - Refactor preview cleanup logic to remove unused methods - Add AIOS core documentation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-13 08:41:05 -07:00
jordan	84af398d85	refactor: add timeout constants for agent execution tiers All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details Add TimeoutAgentExecution (22m) to handlers for synchronous SDLC execution, and TimeoutAgent{Default,Medium,Heavy} (12/22/47m) to workers for tiered agent task execution. Aligns with SDLC action complexity tiers and prevents inline duration literals. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-11 10:48:24 -07:00
jordan	542bc722ab	fix(architect): handle missing projects in repo, add cookbook hooks/validation All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details The architect API returned "failed to start conversation" because projectRepo.Get() failed — the in-memory K8s repo watches the rdev namespace but projects deploy to the projects namespace. Made project lookup non-fatal with fallback to default pod. Added error logging to all architect handler methods (were silently swallowing errors). Also adds setup-hooks, commit-after-qa, and pre-merge-validate steps to the foundary cookbook tree for git hooks and code quality gates. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 02:25:40 -07:00
jordan	c68fadbccd	fix(architect): add pod_name to agent requests, rewrite foundary cookbook All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details The architect service was missing pod_name/namespace in AgentRequest metadata, causing Claude Code adapter to reject all requests. Added ArchitectServiceConfig with pod resolution (project PodName → default claudebox-0). Removed silent JSON fallback in extractSpecFromMessages that masked errors. Rewrote foundary cookbook from 90-step SDLC flow to focused 25-step cookbook using natural language build prompts instead of /slash-commands that claudebox cannot execute. Added "no fallbacks" rule to CLAUDE.md. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 01:24:34 -07:00
jordan	a9ad3d8304	chore: accumulated platform hardening and CI fixes All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details CI / Woodpecker: - Add explicit depends_on to all .woodpecker.yml steps (rdev + templates) - Fix skip_tls_verify -> skip-tls-verify (correct Kaniko flag name) - Add replicasets get/list to deployer RBAC for rollout status - Skeleton template: add failure:ignore on docs steps, Traefik TLS annotations on ingress, depends_on on verify step Component templates: - Fix container name in deploy steps (PROJECT_NAME-COMPONENT_NAME) - Replace kubectl scale with kubectl patch for replicas - Add post-deploy image verification and rollout status checks - Applied consistently across all 5 component templates Adapters: - gitea: Add HTTP client timeout (30s), context cancellation checks, handle 404 on GetRepo/DeleteRepo - zot: Add retry with exponential backoff (doWithRetry), limit response body reads to 10MB - cockroach: Use net.JoinHostPort for IPv6-safe DSN construction - woodpecker: Fix error wrapping (%v -> %w) - redis: Fix error wrapping (%v -> %w) - deployer: Add context cancellation checks Services: - apikey_service: Fix error wrapping (%v -> %w) - component_deploy: Fix error wrapping (%v -> %w) - project_infra: Fix error wrapping (%v -> %w) - webhook/dispatcher: Fix error wrapping (%v -> %w) Other: - CLAUDE.md: Add guide links for Gitea, Go 1.25, Woodpecker v3, Traefik v3, Zot registry - circuitbreaker: Add test for error wrapping - docs: Update deployment, troubleshooting, and runbook docs - health: Fix error wrapping (%v -> %w) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 23:16:56 -07:00
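Note on the `%v -> %w` and `net.JoinHostPort` items above: the following is a minimal sketch of why both changes matter, not the repository's actual code. The sentinel `errNotFound`, the helper names, and port `26257` are illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// errNotFound is a hypothetical sentinel error used only for this illustration.
var errNotFound = errors.New("not found")

// wrapV formats the cause with %v, which flattens it to text and drops the error chain.
func wrapV(err error) error { return fmt.Errorf("get repo: %v", err) }

// wrapW formats the cause with %w, which keeps it reachable via errors.Is / errors.As.
func wrapW(err error) error { return fmt.Errorf("get repo: %w", err) }

// matchesNotFound reports whether errNotFound is anywhere in err's chain.
func matchesNotFound(err error) bool { return errors.Is(err, errNotFound) }

// hostPort brackets IPv6 literals so the result is safe to embed in a DSN or URL.
func hostPort(host, port string) string { return net.JoinHostPort(host, port) }

func main() {
	fmt.Println(matchesNotFound(wrapV(errNotFound))) // false
	fmt.Println(matchesNotFound(wrapW(errNotFound))) // true
	fmt.Println(hostPort("::1", "26257"))            // [::1]:26257
	fmt.Println(hostPort("db.internal", "26257"))    // db.internal:26257
}
```

Naive `host + ":" + port` concatenation would yield `::1:26257`, which cannot be parsed back unambiguously.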
jordan	3c9876a678	fix(worker): increase SSE scanner buffer to 1MB in claudebox client Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details The HTTP claudebox client's ExecuteStream method used a bare bufio.NewScanner with the default 64KB max token size. When Claude Code produces tool results > 64KB (e.g., reading large files), the SSE event exceeds the scanner limit and fails with "token too long". Every other scanner in the codebase (claudecode adapter, claudebox executor, kubernetes executor) already uses scanner.Buffer(buf, 1MB). This was the only one missed. Fixes: "agent execution failed: read stream: bufio.Scanner: token too long" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 23:14:20 -07:00
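Note on the scanner fix above: a small sketch of the failure mode and the remedy, assuming a newline-delimited stream. `scanLines` and `bigEvent` are hypothetical helpers, not names from the claudebox client.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// bigEvent builds one SSE-style line whose payload is size bytes long.
func bigEvent(size int) string {
	return "data: " + strings.Repeat("x", size) + "\n"
}

// scanLines counts the lines in input. A maxToken of 0 keeps the
// bufio.Scanner default limit (bufio.MaxScanTokenSize, 64 KiB).
func scanLines(input string, maxToken int) (int, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	if maxToken > 0 {
		// Start with a 64 KiB buffer and allow it to grow up to maxToken.
		sc.Buffer(make([]byte, 0, 64*1024), maxToken)
	}
	n := 0
	for sc.Scan() {
		n++
	}
	return n, sc.Err()
}

func main() {
	big := bigEvent(100 * 1024) // larger than the default limit

	_, err := scanLines(big, 0)
	fmt.Println(err) // bufio.Scanner: token too long

	n, err := scanLines(big, 1024*1024)
	fmt.Println(n, err) // 1 <nil>
}
```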
jordan	b6e778d5ab	fix(git): harden git flow for concurrent SDLC stress test failures Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details 5 fixes from stress test analysis: 1. CRITICAL: Add pull-before-push to claudebox GitOperations.CommitAndPush, matching the fix already in PodGitOperations (prevents push rejections when concurrent builds advance the remote). 2. HIGH: Extract ResetToMain into PodGitOperations as a shared public method. Wire into BuildExecutor after CloneRepo and update SDLCTaskExecutor to use the shared method. Prevents builds from running on wrong branch when worker pods are reused across tasks. 3. HIGH: Make branch create push failure fatal with retry+rollback in cmd/sdlc/cmd_branch.go. Prevents orphaned .sdlc/ state that causes merge failures after completing all 10 SDLC phases. 4. MEDIUM: Shell-escape token in credential helpers (both PodGitOperations and claudebox GitOperations) to prevent shell injection via tokens containing special characters. 5. MEDIUM: Add GitResetToMain to claudebox sidecar (git.go implementation, server.go endpoint, client.go HTTP method) and wire into HTTPSDLCTaskExecutor for the HTTP sidecar path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 20:57:27 -07:00
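Note on fix 4 above (shell-escaping the token): one common way to do this for POSIX shells is single-quote wrapping, sketched below. This is an assumption about the approach; the commit does not say which escaping technique was used, and `shellQuote` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// shellQuote wraps s in single quotes for a POSIX shell. Inside single quotes
// nothing is expanded, so only the single quote itself needs handling: close
// the quoted string, emit an escaped quote, and reopen it.
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

func main() {
	fmt.Println(shellQuote("abc123"))      // 'abc123'
	fmt.Println(shellQuote("$(rm -rf /)")) // '$(rm -rf /)'
	fmt.Println(shellQuote("a'b"))         // 'a'\''b'
}
```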
jordan	cefc15aa7d	fix(worker): include stdout in error messages when Claude command fails Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details Auth errors like "OAuth token has expired" were lost because Claude writes them to stdout, not stderr. The error message only showed kubectl's generic "command terminated with exit code 1". Now includes both stdout and stderr in the error, making failures immediately diagnosable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 17:55:46 -07:00
jordan	b7d0e84946	fix(deploy): create component deployments with 0 replicas to prevent ImagePullBackOff All checks were successful ci/woodpecker/push/woodpecker Pipeline was successful Details Components are scaffolded before CI builds their images. Previously deployments started with 1 replica, causing ImagePullBackOff until the first build completed. Now deployments start at 0 replicas; CI deploy steps scale to 1 after verifying the image exists in the registry. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 10:16:14 -07:00
jordan	9f957d6e75	fix(templates): harden component CI steps and compile regexes Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details - Add --connect-timeout 10 and --max-time 15 to all verify step curl calls to prevent hanging on registry health checks - Fix cli template: depends_on [deps] -> [preflight] for consistency - Add cross-reference comment to service template about verify logic being replicated across all 5 component templates - Document component CI step rules in composable-monorepo.md - Compile regexes at package level instead of per-call in component_updates.go - Add component_updates_test.go Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 19:36:23 -07:00
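Note on the regex item above: a sketch of package-level compilation. The pattern and the names `componentNameRe` and `validComponentName` are illustrative assumptions; the actual expressions in component_updates.go are not shown in the log.

```go
package main

import (
	"fmt"
	"regexp"
)

// componentNameRe is compiled once when the package is initialised.
// MustCompile panics at startup on an invalid pattern, so a bad
// expression fails fast instead of on the first request.
var componentNameRe = regexp.MustCompile(`^[a-z][a-z0-9-]*$`)

// validComponentName reuses the compiled expression on every call
// instead of paying the compilation cost each time.
func validComponentName(name string) bool {
	return componentNameRe.MatchString(name)
}

func main() {
	fmt.Println(validComponentName("api-server")) // true
	fmt.Println(validComponentName("Api_Server")) // false
}
```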
jordan	9226454b85	feat: label-based undeploy, GC reconciliation, checkout/sessions, pool status Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details - Add UndeployAll() using label selectors to clean up monorepo components on project deletion (replaces name-based Undeploy in DeleteProject and the direct undeploy handler) - Add ResourceGC background worker that periodically finds K8s resources whose project label has no matching DB record, deletes after 1h safety window - Widen deployer client type from *kubernetes.Clientset to kubernetes.Interface for testability - UndeployAll accumulates errors via errors.Join instead of failing fast - Add checkout/checkin sidecar dev flow: temporary git tokens, branch checkout, review on checkin with cleanup workers - Add interactive sessions: pod binding, command execution, SSE streaming, ephemeral preview URLs with session cleanup workers - Add GET /workers/pool endpoint for aggregate capacity and queue depth - Add sessions:read and sessions:execute auth scopes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 19:11:28 -07:00
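Note on the `errors.Join` item above: a minimal sketch of accumulating failures instead of failing fast. `deleteAll`, `errBoom`, and the resource names are hypothetical; the real UndeployAll works against Kubernetes resources.

```go
package main

import (
	"errors"
	"fmt"
)

// errBoom is a hypothetical failure used only for this illustration.
var errBoom = errors.New("boom")

// deleteAll calls del for every name and keeps going after a failure.
// It returns all failures joined into one error, or nil if none failed.
func deleteAll(names []string, del func(string) error) error {
	var errs []error
	for _, name := range names {
		if err := del(name); err != nil {
			errs = append(errs, fmt.Errorf("delete %s: %w", name, err))
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	err := deleteAll([]string{"deployment", "service", "ingress"}, func(name string) error {
		if name == "service" {
			return nil
		}
		return errBoom
	})
	fmt.Println(err)                     // one line per failed resource
	fmt.Println(errors.Is(err, errBoom)) // true
}
```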
jordan	2a2f2fa370	fix(logging): implement http.Flusher on responseWriter for SSE streaming Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details The logging middleware's responseWriter wrapped http.ResponseWriter but only implemented WriteHeader, Write, and Unwrap. The missing Flush() method caused w.(http.Flusher) type assertions to fail in the claudebox sidecar's streaming endpoint, returning 500 "streaming not supported". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 13:23:42 -07:00
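Note on the Flusher fix above: embedding the `http.ResponseWriter` interface only promotes `Header`, `Write`, and `WriteHeader`, so a wrapper has to forward `Flush` explicitly. The sketch below shows this with hypothetical type names; it is not the middleware's actual code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// plainWriter wraps a ResponseWriter without forwarding Flush.
type plainWriter struct {
	http.ResponseWriter
	status int
}

func (w *plainWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

// flushingWriter is the same wrapper with Flush forwarded.
type flushingWriter struct {
	http.ResponseWriter
	status int
}

func (w *flushingWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

// Flush delegates to the wrapped writer when it supports flushing.
func (w *flushingWriter) Flush() {
	if f, ok := w.ResponseWriter.(http.Flusher); ok {
		f.Flush()
	}
}

// canStream mirrors the type assertion a streaming handler performs.
func canStream(w http.ResponseWriter) bool {
	_, ok := w.(http.Flusher)
	return ok
}

// demo reports whether each wrapper passes the assertion.
func demo() (plain, flushing bool) {
	rec := httptest.NewRecorder() // the recorder itself implements http.Flusher
	return canStream(&plainWriter{ResponseWriter: rec}),
		canStream(&flushingWriter{ResponseWriter: rec})
}

func main() {
	fmt.Println(demo()) // false true
}
```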
jordan	6ec2a4fea3	fix(sdlc): persist branch metadata on main before feature branch creation Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details The `sdlc merge` command reads the Branch field from the feature manifest on main, but `sdlc branch create` was only committing that state to the feature branch (via the executor's CommitAndPush). This caused merge to fail with "feature has no branch". Two changes: 1. cmd/sdlc/cmd_branch.go: commit .sdlc/ state to main before `git checkout -b`, ensuring Branch metadata is on main where merge reads it. 2. internal/worker/sdlc_executor.go: reset workspace to main (`git fetch && git checkout main && git reset --hard origin/main`) before each SDLC task, preventing cross-task branch contamination from commands that switch branches. Also updates foundary cookbook with architect fallback pattern and on_error: continue for steps that may fail during early lifecycle. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 08:36:10 -07:00
jordan	a69eb7e587	feat(foundary): implement complete backend for conversational project design Implements all 5 phases of Foundary Studio backend: Phase 1: Chat Persistence (8 API endpoints) - Conversations and messages with proper cascading deletes - PostgreSQL schema with auto-update triggers - Full CRUD operations with structured logging Phase 2: Blueprint Entity (5 API endpoints) - JSONB spec storage with GIN indexes - Flexible structured data for project specifications - Version-controlled blueprint management Phase 3: Architect Service (3 API endpoints) - Conversational AI orchestration with Claude - Multi-turn dialogue with context building - Blueprint spec extraction from conversations Phase 4: Work Queue Integration - Verified existing endpoint compatibility Phase 5: Structured Questions (6 API endpoints) - Four question types: text, choice, multichoice, yesno - Answer validation with proper constraints - Conversation-linked Q&A flow Architecture: - Textbook hexagonal architecture (domain → port → adapter → service → handler) - Zero external dependencies in domain layer - Consistent error handling with proper wrapping - Auth scopes on all routes (projects:read, projects:execute) - Structured logging with operation context and duration tracking - NULL-safe DTO converters throughout Database: - 3 new migrations (019, 020, 021) - UUIDs for all primary keys - Proper foreign key constraints with ON DELETE CASCADE - Optimized indexes including partial index for unanswered questions - Auto-update triggers for timestamps OpenAPI Documentation: - Complete API documentation under 'Foundary' tag - 22 new endpoints documented with examples - Request/response schemas for all operations Logging Improvements: - Added operation field to all service logs - Added duration_ms tracking for performance monitoring - Log response_length instead of full response content - Consistent use of logging field constants - Execute-then-log pattern for delete operations Files: 32 changed, 2800+ lines added - 7 domain models - 3 database migrations - 3 port interfaces - 3 postgres adapters - 4 services (conversation, blueprint, question, architect) - 4 handlers with DTOs - OpenAPI documentation - Integration in main.go 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-09 00:50:46 -07:00
jordan	adcea2fc1f	fix(templates): upgrade Go to 1.25 and fix Woodpecker syntax Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details ## Template Version Alignment - Go: 1.23 → 1.25 across all templates (go.work, go.mod, Dockerfiles, CI) - Alpine: latest → 3.19 (explicit version pinning) - Woodpecker: failure:retry → failure:ignore (invalid syntax fix) ## SDLC Tree Fixes (slackpath-5-full-lifecycle) Fixed merge failures by correcting lifecycle flow: 1. Branch Creation: Added missing create-branch step (planned → ready) - Bug: Merge command requires feature.Branch field to be set - Fix: POST /projects/{id}/sdlc/features/{slug}/branch 2. Artifact Status: Changed approval to pass for execution artifacts - Bug: Review/audit/QA need status="passed" not "approved" - Fix: /artifacts/{type}/approve → /artifacts/{type}/pass - Added: pass-qa step after wait-qa 3. Phase Transition Order: Reordered merge phase transition - Bug: Merge command checks if phase == "merge" first - Fix: transition-to-merge BEFORE merge-feature (not after) ## GCS Provisioner Fix - Replaced deprecated option.WithCredentialsFile with env var approach - Now uses GOOGLE_APPLICATION_CREDENTIALS for ADC (Application Default Credentials) - Avoids security risk from deprecated credential options - Fixed test: Added ComponentTypeGCS to ValidComponentTypes test ## Critical Rules Added - Version alignment: All template versions must stay in sync - When updating versions, grep entire templates/ tree ## Files Changed - 27 template files: Go version + Woodpecker syntax - 1 tree file: SDLC lifecycle flow corrections - 1 CLAUDE.md: Version alignment rule - 1 GCS provisioner: Deprecated API fix - 1 test file: Added missing component type Root cause: Skeleton templates lagged behind Go 1.25 release and had invalid Woodpecker syntax. SDLC tree skipped required branch creation and used wrong artifact approval endpoints. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 23:57:38 -07:00
jordan	a419c53592	fix(sdlc): make phase transitions idempotent Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details Allow transitioning to the current phase (no-op success) instead of rejecting it as a "backward" transition. This fixes issues where external systems retry transition commands. Before: draft -> draft returned error After: draft -> draft returns nil (already there) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 14:21:05 -07:00
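Note on the idempotency fix above: a sketch of a transition check that treats "already in the target phase" as success. The phase list and its order are assumptions made for illustration; only `draft`, `planned`, `ready`, and `merge` are mentioned in this log, and the real state machine may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// phaseOrder is a hypothetical ordering of lifecycle phases.
var phaseOrder = map[string]int{"draft": 0, "planned": 1, "ready": 2, "merge": 3}

var (
	errBackward     = errors.New("backward transition")
	errUnknownPhase = errors.New("unknown phase")
)

// transition returns the phase after attempting to move from current to target.
func transition(current, target string) (string, error) {
	from, okFrom := phaseOrder[current]
	to, okTo := phaseOrder[target]
	if !okFrom || !okTo {
		return current, errUnknownPhase
	}
	if to == from {
		return current, nil // already there: no-op success, safe to retry
	}
	if to < from {
		return current, errBackward
	}
	return target, nil
}

func main() {
	fmt.Println(transition("draft", "draft"))   // draft <nil>
	fmt.Println(transition("draft", "planned")) // planned <nil>
	fmt.Println(transition("ready", "draft"))   // ready backward transition
}
```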
jordan	00f55f7f6f	fix(sdlc): route conflict with SDLCGenerateHandler shadowing SDLC routes Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details SDLCGenerateHandler was using r.Route() to create a sub-router at /projects/{id}/sdlc/features/{slug}, which shadowed SDLCHandler's nested routes like /features/{slug}/artifacts/{type}/approve. Changed to direct route registration to avoid chi route conflicts. This fixes 404 errors on SDLC feature and artifact endpoints. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 11:27:41 -07:00
jordan	4486042155	fix(registry): delete container images on project teardown Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details Root cause of DIGEST_INVALID errors was registry disk exhaustion. Project teardown wasn't cleaning up container images, causing the registry PVC to fill up over time. Changes: - Add RegistryProvider port interface for registry operations - Extend zot.Client with DeleteProjectRepositories method - Wire registry provider into ProjectInfraService - Delete images during DeleteProject cleanup (step 4) The zot client uses the OCI distribution API: - Lists all repos, filters by project prefix - Gets manifest digests via HEAD request - Deletes manifests by digest to trigger GC Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 02:56:18 -07:00
jordan	f20fc6c51c	feat(saga): implement enterprise-grade resilience architecture Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details Fixes issues from code review of resilience implementation: - Wire saga system in main.go (SagaRepository, SagaExecutor, SagaHandler) - Fix CompletedSteps() to include skipped steps for dependency resolution - Fix reverse loop bug in saga compensation (use standard swap pattern) - Add circuit breaker state change callbacks for Prometheus metrics Phase 1 (Build Resilience): - Add failure:retry to all component Kaniko build steps - Add preflight registry health check before builds - Add services-deployed sync point to decouple docs from critical path Phase 2 (API Resilience): - Add pipeline retry endpoint (POST /projects/{id}/pipelines/{number}/retry) - Wire circuit breakers with metrics callbacks - Add /health/circuits endpoint for circuit breaker status Phase 3 (Saga Engine): - Full domain model (Saga, SagaStep, RetryPolicy, BackoffType) - PostgreSQL saga repository with CRUD and step management - Saga executor with retry, compensation, skip step support - Saga API handlers with CRUD and control operations Phase 4 (Observability): - Add saga metrics (total, step_duration, retry, circuit_breaker_state) - Add logging fields (saga_id, saga_name, step_name) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-08 01:58:02 -07:00
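Note on the reverse-loop fix above: the "standard swap pattern" presumably refers to the two-index in-place reversal sketched below, used here to compute the order in which completed steps are compensated. The function and step names are hypothetical.

```go
package main

import "fmt"

// compensationOrder returns the completed steps in reverse order, so the
// most recently completed step is compensated first. It reverses a copy
// and leaves the input slice untouched.
func compensationOrder(completed []string) []string {
	out := make([]string, len(completed))
	copy(out, completed)
	// Two indexes walk inward and stop when they meet, so each pair is
	// swapped exactly once. Letting i run over the whole slice would swap
	// every pair twice and restore the original order.
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i], out[j] = out[j], out[i]
	}
	return out
}

func main() {
	steps := []string{"create-repo", "create-db", "deploy"}
	fmt.Println(compensationOrder(steps)) // [deploy create-db create-repo]
}
```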
jordan	9085965864	fix(skeleton): enforce chi {param} URL syntax in agent guidance Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details Agents were generating `:id` (Echo/Gin style) instead of `{id}` (chi style), causing routes to not match. Updated api-designer, go-specialist agents and skeleton CLAUDE.md with explicit CRITICAL notes about brace syntax. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 20:44:52 -07:00
jordan	863dfd3214	fix: skip root deployment for empty template (defaults to skeleton) Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details When req.Template is empty, it defaults to 'skeleton' but the check in createInitialDeployment only matched 'skeleton' explicitly, not empty string. This caused a broken deployment to be created for monorepo projects with a non-existent image. Root cause: slackpath-5 creates project with empty template, which defaults to skeleton, but createInitialDeployment was still creating a root deployment that references registry.threesix.ai/{project}:latest which never gets built (skeleton has no root Dockerfile). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 19:32:19 -07:00
jordan	bcf9f28bb9	fix: add failure:ignore to docs build steps Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details When docs infrastructure doesn't exist, the docs build steps should gracefully skip without failing the entire pipeline. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 18:26:00 -07:00
jordan	2a25a161cb	fix: use plugin-kaniko for docs image build Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details The raw gcr.io/kaniko-project/executor with commands: doesn't work properly in Woodpecker. Switch to woodpeckerci/plugin-kaniko with settings: to match other component builds. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 18:08:31 -07:00
jordan	bed72961fe	fix: add --insecure flag to kaniko for docs image build Some checks failed ci/woodpecker/push/woodpecker Pipeline failed Details The registry.threesix.ai uses a self-signed certificate. Service builds use plugin-kaniko with skip-tls-verify, but docs build used raw kaniko executor without TLS bypass, causing exit 128. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 17:50:38 -07:00
jordan	be80fd2d4a	fix: correct kaniko dockerfile path for docs image build When --context=docs is set, the --dockerfile path should be relative to the context directory. Changed from docs/Dockerfile.nginx to Dockerfile.nginx since kaniko already looks in the docs/ directory. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 17:35:54 -07:00
jordan	caf0990ceb	fix: downgrade rouge to 3.x for middleman-syntax compatibility middleman-syntax ~> 3.2 requires rouge ~> 3.2, but Gemfile had rouge ~> 4.0 causing bundle install to fail with version resolution error. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 16:48:49 -07:00
jordan	b41e0dfbf9	fix: use raw JSON responses in claudebox server The claudebox sidecar was using api.WriteJSON which wraps responses in {data: ..., meta: ...} format. The claudebox HTTP client expects raw JSON responses without wrapping. This caused git clone to appear to fail - the HTTP request succeeded and returned {data: {success: true, cloned: true}, meta: {...}}, but the client decoded success=false because it couldn't find the fields at the top level. Added writeRawJSON helper and replaced all api.WriteJSON calls with it for actual responses. Error responses still use api.WriteBadRequest which returns proper error format. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 16:41:21 -07:00
jordan	af91bad0ff	feat: add Slate documentation templates to skeleton Adds complete Slate documentation infrastructure to generated projects: - docs/ directory with Gemfile, config.rb, and source templates - Dockerfile for building docs site - Dockerfile.nginx for serving static docs - generate-docs.sh script for CI integration - Claude command for AI-assisted docs generation - OpenAPI → Slate markdown conversion via widdershins Also includes: - --export-openapi flag for service binaries - DNS provisioning for docs.{domain} subdomain - Updated project_infra for docs DNS records Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 16:06:36 -07:00
jordan	f64377116a	fix: add build-complete sync point for docs pipeline ordering The export-openapi step was running in parallel with component builds because it had no explicit dependency. This could cause docs generation to run before component services were fully built. Changes: - Add build-complete step with NO depends_on (waits for ALL prior steps) - Make export-openapi depend on build-complete - Complete docs pipeline: export-openapi → generate-docs → build-docs → build-docs-image → deploy-docs - Update verify step label selector to use project= instead of app= Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 16:02:17 -07:00
jordan	59aa173384	fix: clear stale error when dequeuing work tasks When a task is retried (dequeued again after failure), the previous error message was persisting in the work_queue table. This caused the API to return confusing responses with status="running" but also containing an error message from the previous attempt. Now clears error and completed_at when claiming a task, matching the fix already applied to build_audit.UpdateStatus. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 08:51:34 -07:00
jordan	9833725f31	fix: preserve work on build retry, clear stale audit data Two critical fixes for build retry behavior: 1. pod_git_operations.go: Normalize remote URL before comparison - Clone stores URL with token (https://token:x@host/...) - Subsequent retry compares against URL without token - Without normalization, URLs never match, so workspace is always cleared and re-cloned, losing all code from previous attempt 2. build_audit.go: Clear stale result data when task transitions to running - When a failed task is retried, UpdateStatus only updated status/worker_id - Result and completed_at from previous failure remained, causing API to return stale failure data even while retry was running - Now clears result, completed_at and resets started_at when status is set to "running" Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 08:40:36 -07:00
jordan	9cca5cc41b	fix: add proper instrumentation to git clone for debugging - Log clone request with work_dir, URL, and token presence - Log workspace state (is_git_repo, existing remote) - Log all decision points (pull vs clone, clear workspace) - Detect and clear non-empty non-git directories before clone - Capture both stdout and stderr for clone failures - Include exit code in error messages	2026-02-07 07:59:53 -07:00
jordan	e58d679e67	fix: add go mod download to component Dockerfiles Empty go.sum files were causing Docker builds to fail because Go couldn't verify dependencies. Added go mod download steps for both pkg and component directories before building. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-06 23:35:02 -07:00
jordan	d74efb75ff	fix: wire workService to WorkersHandler and add /work/tasks endpoint Critical fix: WorkersHandler was missing workService dependency, causing 500 errors when workers tried to fail tasks. This caused tasks to get stuck in "running" state permanently. Also adds: - /work/tasks endpoint for debugging all tasks across projects - List method to WorkQueue interface for admin views - HTTP client tests for api_client.go and claudebox/client.go (48 tests) - Split work.go DTOs into work_dto.go to stay under 500 lines Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-06 10:35:39 -07:00
jordan	d7a6f37593	fix: worker graceful shutdown and RWO PVC compatibility - Add WaitGroup for graceful shutdown of in-flight tasks - Change replicas to 1 with Recreate strategy (RWO PVC limitation) - Optimize Dockerfile: combine RUN commands for smaller layers - Add compiled binaries to .gitignore Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-06 00:35:00 -07:00
jordan	f6a2b61b16	fix: add skeleton settings.local.json (was globally gitignored) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 22:55:17 -07:00
jordan	3b35900a2d	feat: enterprise worker pool with HTTP sidecar pattern Implements horizontally-scalable worker pool architecture: - claudebox-sidecar: HTTP server for Claude Code, git, and SDLC ops - rdev-worker: standalone worker binary polling rdev-api for tasks - HTTP client adapter for sidecar communication - HPA with custom Prometheus metrics for autoscaling - ServiceMonitor for metrics scraping Code review fixes applied: - URL-encode query parameters in GitStatus (Critical #1) - Remove unused shellQuote function (Critical #2) - Use stdlib strings.Split/TrimSpace (Critical #3) - Add version injection via ldflags (Warning #4) - Add debug logging for swallowed git/sdlc errors (Warning #5, #6) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 16:21:11 -07:00
jordan	3b0779fbe8	fix: slackpath trees use batch endpoint for atomic multi-component adds Updates slackpath-2 and slackpath-4 to use POST /projects/{id}/components/batch for adding multiple Go components atomically in a single git commit. This prevents the go.work race condition where individual commits reference modules that don't exist yet. Also adds on_error: continue for infrastructure provisioning steps that may already exist from skeleton (redis, postgres). Verified: - slackpath-1: ✅ Complete (wait_build polled 5 times, detected success) - slackpath-2: ✅ Complete (wait_build polled 111 times, detected success) - slackpath-3: ✅ Infrastructure passed (worker capacity limited testing) - slackpath-4: ✅ Infrastructure passed (worker capacity limited testing) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 14:44:53 -07:00
jordan	853ec4cf81	fix: go.work race condition with batch components and idempotent provisioning Three coordinated fixes for CI pipeline race conditions: 1. Woodpecker step dependencies: Added depends_on: [deps] to all 6 component templates (service, worker, cli, app-astro, app-react, app-nextjs) so build steps wait for go work sync to complete. 2. Idempotent resource provisioning: Modified provisionResources() to check for existing database/cache before creating, preventing "already exists" errors on component re-adds. 3. Batch component endpoint: POST /projects/{id}/components/batch enables atomic multi-component additions in a single git commit. Validates all components upfront, provisions infra sequentially, commits code components atomically. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 12:31:40 -07:00


102 Commits