Every project generated from the skeleton now ships with styled,
production-ready transactional emails out of the box.
New pkg/email package:
- Renderer: loads templates from caller-provided embed.FS, inlines CSS via
douceur at startup, derives plain text via goquery for multipart delivery
- DevHandler: live browser preview at GET /dev/emails and /dev/emails/{purpose}
(development only, never mounted in production)
- CSSInlineErr field on RenderedEmail so callers can log degraded renders
New service component templates:
- internal/email/embed.go.tmpl — embeds template FS (uses all: prefix for _*.html)
- internal/email/renderer_test.go.tmpl — 9 tests covering all purposes + brand injection
- internal/email/templates/ — 5 HTML email types (login_otp, email_verify,
magic_link, password_reset, welcome) + 5 shared partials (_layout, _header,
_footer, _button, _code_box)
Updated service component templates:
- config.go.tmpl — brand fields: AppName, AppURL, SupportEmail, LogoURL, BrandColor
- main.go.tmpl — wires renderer at startup, logs template count
- routes.go.tmpl — mounts /dev/emails in development; EmailRenderer in Dependencies
- notify.go.tmpl — renders HTML before sending; warns on CSS inlining failure
- go.mod.tmpl — adds douceur, goquery, gorilla/css, andybalholm/cascadia
Deleted: internal/adapter/email/helpers.go.tmpl (replaced by meta.yaml + renderer)
Fix: template directory named email_verify (matching domain.PurposeEmailVerify)
rather than verify_email — the mismatch caused all verification emails to fail
with "unknown email purpose" at send time while tests passed (tests called
Render directly with the wrong name).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix no-op RequireProjectAccess middleware to enforce project_ids
- Apply project access middleware to all project-scoped routes
- Filter GET /projects by allowed project IDs for restricted keys
- Add GET /me endpoint with key identity, scopes, and project access info
- Add PATCH /keys/{id} for partial key updates (name, scopes, project_ids, allowed_ips, expires_in)
- Add GET/POST/DELETE /projects/{id}/access for project-centric access management
- Auto-grant creating key access when using POST /project/create-and-build
- Accept grant_to_key_ids in create-and-build to grant multiple keys on project creation
- Move newProvisionerWithDeps test helper from production code to test file
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add NotifyProvisioner (port + adapter) using real notify admin API
- Create notify account + send key + host grant per project
- Inject NOTIFY_API_KEY/HOST/FROM into component deployments
- Store NOTIFY_URL, NOTIFY_ADMIN_KEY, RESEND_API_KEY in credential store
- Add setup-notify.sh for one-time host/provider/domain setup
- Add NOTIFY_ADMIN_KEY constant to domain/credential.go
- Wire provisioner in main.go with connection test guard
- Add .claude/guides/services/notify.md and CLAUDE.md entry
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
These components use useState/useRef hooks but lacked the Next.js 'use client'
directive, causing the Next.js app build to fail with Server Component errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two distinct fixes:
1. Database terminology: Make it crystal clear that generated projects use
CockroachDB in production and PostgreSQL for local dev, while the rdev
platform itself uses PostgreSQL. Updated 15 files across skeleton agents,
component templates, cookbook trees, and platform docs.
2. Video storage: VideoHandler was ignoring vid.Data bytes (already downloaded
by the Gemini adapter with auth) and re-downloading from the provider URL
with a plain GET — which fails because Gemini URLs require API key auth.
Now uses vid.Data first, falls back to downloadURL only for public URLs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds complete media storage pipeline with GCS presigned uploads, AI image/video/text generation
via queue-based workers, realtime SSE event streaming, and comprehensive skeleton packages
(storage, mediagen, textgen, generation, realtime, persona, routing, ai-client). Includes
security fixes for media delete authorization, nil pointer guards in handlers, video persistence
via download-then-upload, consistent signed URLs, and Image→ImageIcon rename to avoid DOM collision.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add TimeoutAgentExecution (22m) to handlers for synchronous SDLC
execution, and TimeoutAgent{Default,Medium,Heavy} (12/22/47m) to
workers for tiered agent task execution. Aligns with SDLC action
complexity tiers and prevents inline duration literals.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The architect API returned "failed to start conversation" because
projectRepo.Get() failed — the in-memory K8s repo watches the rdev
namespace but projects deploy to the projects namespace. Made project
lookup non-fatal with fallback to default pod. Added error logging to
all architect handler methods (were silently swallowing errors).
Also adds setup-hooks, commit-after-qa, and pre-merge-validate steps
to the foundary cookbook tree for git hooks and code quality gates.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The HTTP claudebox client's ExecuteStream method used a bare
bufio.NewScanner with the default 64KB max token size. When Claude Code
produces tool results > 64KB (e.g., reading large files), the SSE event
exceeds the scanner limit and fails with "token too long".
Every other scanner in the codebase (claudecode adapter, claudebox
executor, kubernetes executor) already uses scanner.Buffer(buf, 1MB).
This was the only one missed.
Fixes: "agent execution failed: read stream: bufio.Scanner: token too long"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
5 fixes from stress test analysis:
1. CRITICAL: Add pull-before-push to claudebox GitOperations.CommitAndPush,
matching the fix already in PodGitOperations (prevents push rejections
when concurrent builds advance the remote).
2. HIGH: Extract ResetToMain into PodGitOperations as a shared public method.
Wire into BuildExecutor after CloneRepo and update SDLCTaskExecutor to
use the shared method. Prevents builds from running on wrong branch when
worker pods are reused across tasks.
3. HIGH: Make branch create push failure fatal with retry+rollback in
cmd/sdlc/cmd_branch.go. Prevents orphaned .sdlc/ state that causes
merge failures after completing all 10 SDLC phases.
4. MEDIUM: Shell-escape token in credential helpers (both PodGitOperations
and claudebox GitOperations) to prevent shell injection via tokens
containing special characters.
5. MEDIUM: Add GitResetToMain to claudebox sidecar (git.go implementation,
server.go endpoint, client.go HTTP method) and wire into
HTTPSDLCTaskExecutor for the HTTP sidecar path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Auth errors like "OAuth token has expired" were lost because Claude writes
them to stdout, not stderr. The error message only showed kubectl's generic
"command terminated with exit code 1". Now includes both stdout and stderr
in the error, making failures immediately diagnosable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Components are scaffolded before CI builds their images. Previously deployments
started with 1 replica, causing ImagePullBackOff until the first build completed.
Now deployments start at 0 replicas; CI deploy steps scale to 1 after verifying
the image exists in the registry.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add --connect-timeout 10 and --max-time 15 to all verify step curl
calls to prevent hanging on registry health checks
- Fix cli template: depends_on [deps] -> [preflight] for consistency
- Add cross-reference comment to service template about verify logic
being replicated across all 5 component templates
- Document component CI step rules in composable-monorepo.md
- Compile regexes at package level instead of per-call in
component_updates.go
- Add component_updates_test.go
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add UndeployAll() using label selectors to clean up monorepo components
on project deletion (replaces name-based Undeploy in DeleteProject and
the direct undeploy handler)
- Add ResourceGC background worker that periodically finds K8s resources
whose project label has no matching DB record, deletes after 1h safety
window
- Widen deployer client type from *kubernetes.Clientset to
kubernetes.Interface for testability
- UndeployAll accumulates errors via errors.Join instead of failing fast
- Add checkout/checkin sidecar dev flow: temporary git tokens, branch
checkout, review on checkin with cleanup workers
- Add interactive sessions: pod binding, command execution, SSE streaming,
ephemeral preview URLs with session cleanup workers
- Add GET /workers/pool endpoint for aggregate capacity and queue depth
- Add sessions:read and sessions:execute auth scopes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements all 5 phases of Foundary Studio backend:
Phase 1: Chat Persistence (8 API endpoints)
- Conversations and messages with proper cascading deletes
- PostgreSQL schema with auto-update triggers
- Full CRUD operations with structured logging
Phase 2: Blueprint Entity (5 API endpoints)
- JSONB spec storage with GIN indexes
- Flexible structured data for project specifications
- Version-controlled blueprint management
Phase 3: Architect Service (3 API endpoints)
- Conversational AI orchestration with Claude
- Multi-turn dialogue with context building
- Blueprint spec extraction from conversations
Phase 4: Work Queue Integration
- Verified existing endpoint compatibility
Phase 5: Structured Questions (6 API endpoints)
- Four question types: text, choice, multichoice, yesno
- Answer validation with proper constraints
- Conversation-linked Q&A flow
Architecture:
- Textbook hexagonal architecture (domain → port → adapter → service → handler)
- Zero external dependencies in domain layer
- Consistent error handling with proper wrapping
- Auth scopes on all routes (projects:read, projects:execute)
- Structured logging with operation context and duration tracking
- NULL-safe DTO converters throughout
Database:
- 3 new migrations (019, 020, 021)
- UUIDs for all primary keys
- Proper foreign key constraints with ON DELETE CASCADE
- Optimized indexes including partial index for unanswered questions
- Auto-update triggers for timestamps
OpenAPI Documentation:
- Complete API documentation under 'Foundary' tag
- 22 new endpoints documented with examples
- Request/response schemas for all operations
Logging Improvements:
- Added operation field to all service logs
- Added duration_ms tracking for performance monitoring
- Log response_length instead of full response content
- Consistent use of logging field constants
- Execute-then-log pattern for delete operations
Files: 32 changed, 2800+ lines added
- 7 domain models
- 3 database migrations
- 3 port interfaces
- 3 postgres adapters
- 4 services (conversation, blueprint, question, architect)
- 4 handlers with DTOs
- OpenAPI documentation
- Integration in main.go
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
Root cause of DIGEST_INVALID errors was registry disk exhaustion.
Project teardown wasn't cleaning up container images, causing the
registry PVC to fill up over time.
Changes:
- Add RegistryProvider port interface for registry operations
- Extend zot.Client with DeleteProjectRepositories method
- Wire registry provider into ProjectInfraService
- Delete images during DeleteProject cleanup (step 4)
The zot client uses the OCI distribution API:
- Lists all repos, filters by project prefix
- Gets manifest digests via HEAD request
- Deletes manifests by digest to trigger GC
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Agents were generating `:id` (Echo/Gin style) instead of `{id}` (chi style),
causing routes to not match. Updated api-designer, go-specialist agents and
skeleton CLAUDE.md with explicit CRITICAL notes about brace syntax.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When req.Template is empty, it defaults to 'skeleton' but the check
in createInitialDeployment only matched 'skeleton' explicitly, not
empty string. This caused a broken deployment to be created for
monorepo projects with a non-existent image.
Root cause: slackpath-5 creates project with empty template, which
defaults to skeleton, but createInitialDeployment was still creating
a root deployment that references registry.threesix.ai/{project}:latest
which never gets built (skeleton has no root Dockerfile).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When docs infrastructure doesn't exist, the docs build steps should
gracefully skip without failing the entire pipeline.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The raw gcr.io/kaniko-project/executor with commands: doesn't work
properly in Woodpecker. Switch to woodpeckerci/plugin-kaniko with
settings: to match other component builds.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The registry.threesix.ai uses a self-signed certificate.
Service builds use plugin-kaniko with skip-tls-verify, but docs
build used raw kaniko executor without TLS bypass, causing exit 128.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When --context=docs is set, the --dockerfile path should be relative
to the context directory. Changed from docs/Dockerfile.nginx to
Dockerfile.nginx since kaniko already looks in the docs/ directory.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
middleman-syntax ~> 3.2 requires rouge ~> 3.2, but Gemfile had rouge ~> 4.0
causing bundle install to fail with version resolution error.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds complete Slate documentation infrastructure to generated projects:
- docs/ directory with Gemfile, config.rb, and source templates
- Dockerfile for building docs site
- Dockerfile.nginx for serving static docs
- generate-docs.sh script for CI integration
- Claude command for AI-assisted docs generation
- OpenAPI → Slate markdown conversion via widdershins
Also includes:
- --export-openapi flag for service binaries
- DNS provisioning for docs.{domain} subdomain
- Updated project_infra for docs DNS records
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The export-openapi step was running in parallel with component builds
because it had no explicit dependency. This could cause docs generation
to run before component services were fully built.
Changes:
- Add build-complete step with NO depends_on (waits for ALL prior steps)
- Make export-openapi depend on build-complete
- Complete docs pipeline: export-openapi → generate-docs → build-docs →
build-docs-image → deploy-docs
- Update verify step label selector to use project= instead of app=
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a task is retried (dequeued again after failure), the previous
error message was persisting in the work_queue table. This caused the
API to return confusing responses with status="running" but also
containing an error message from the previous attempt.
Now clears error and completed_at when claiming a task, matching the
fix already applied to build_audit.UpdateStatus.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two critical fixes for build retry behavior:
1. pod_git_operations.go: Normalize remote URL before comparison
- Clone stores URL with token (https://token:x@host/...)
- Subsequent retry compares against URL without token
- Without normalization, URLs never match, so workspace is always
cleared and re-cloned, losing all code from previous attempt
2. build_audit.go: Clear stale result data when task transitions to running
- When a failed task is retried, UpdateStatus only updated status/worker_id
- Result and completed_at from previous failure remained, causing
API to return stale failure data even while retry was running
- Now clears result, completed_at and resets started_at when
status is set to "running"
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Empty go.sum files were causing Docker builds to fail because
Go couldn't verify dependencies. Added go mod download steps
for both pkg and component directories before building.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Critical fix: WorkersHandler was missing workService dependency, causing
500 errors when workers tried to fail tasks. This caused tasks to get
stuck in "running" state permanently.
Also adds:
- /work/tasks endpoint for debugging all tasks across projects
- List method to WorkQueue interface for admin views
- HTTP client tests for api_client.go and claudebox/client.go (48 tests)
- Split work.go DTOs into work_dto.go to stay under 500 lines
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add WaitGroup for graceful shutdown of in-flight tasks
- Change replicas to 1 with Recreate strategy (RWO PVC limitation)
- Optimize Dockerfile: combine RUN commands for smaller layers
- Add compiled binaries to .gitignore
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements horizontally-scalable worker pool architecture:
- claudebox-sidecar: HTTP server for Claude Code, git, and SDLC ops
- rdev-worker: standalone worker binary polling rdev-api for tasks
- HTTP client adapter for sidecar communication
- HPA with custom Prometheus metrics for autoscaling
- ServiceMonitor for metrics scraping
Code review fixes applied:
- URL-encode query parameters in GitStatus (Critical #1)
- Remove unused shellQuote function (Critical #2)
- Use stdlib strings.Split/TrimSpace (Critical #3)
- Add version injection via ldflags (Warning #4)
- Add debug logging for swallowed git/sdlc errors (Warning #5, #6)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updates slackpath-2 and slackpath-4 to use POST /projects/{id}/components/batch
for adding multiple Go components atomically in a single git commit. This
prevents the go.work race condition where individual commits reference modules
that don't exist yet.
Also adds on_error: continue for infrastructure provisioning steps that may
already exist from skeleton (redis, postgres).
Verified:
- slackpath-1: ✅ Complete (wait_build polled 5 times, detected success)
- slackpath-2: ✅ Complete (wait_build polled 111 times, detected success)
- slackpath-3: ✅ Infrastructure passed (worker capacity limited testing)
- slackpath-4: ✅ Infrastructure passed (worker capacity limited testing)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Three coordinated fixes for CI pipeline race conditions:
1. Woodpecker step dependencies: Added depends_on: [deps] to all 6 component
templates (service, worker, cli, app-astro, app-react, app-nextjs) so build
steps wait for go work sync to complete.
2. Idempotent resource provisioning: Modified provisionResources() to check
for existing database/cache before creating, preventing "already exists"
errors on component re-adds.
3. Batch component endpoint: POST /projects/{id}/components/batch enables
atomic multi-component additions in a single git commit. Validates all
components upfront, provisions infra sequentially, commits code components
atomically.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Worker template fixes:
- Replace panic() with logger.Error() + os.Exit(1) for config errors
- Remove double-timeout application (context + middleware)
- Add error message truncation to prevent log bloat
- Use named constants for shutdown grace period and stale check interval
Skeleton pkg/auth fixes:
- Fix error wrapping to use %w consistently in jwt.go
- Add GetUserOrError() as safe alternative to MustGetUser() panic
Skeleton pkg/queue fixes:
- Check RowsAffected() errors instead of ignoring them
- Add input validation to EnqueueWithOptions (require job type, cap retries)
- Add log truncation for error messages
- Fix inaccurate doc comment claiming exponential backoff
Worker timeout consolidation:
- Add internal/worker/timeouts.go with named constants
- Migrate all workers to use timeout constants
Cleanup:
- Remove obsolete slack-preparation-thoughts.md files
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Major changes:
- Add internal/logging package with field constants, context propagation,
sensitive data auto-redaction, and per-component log levels
- Add worker timeout constants (TimeoutQuickOp, TimeoutHealthCheck, etc.)
- Extend SDLC with callback handlers, generate endpoints, and executor
- Add new cookbook trees for aeries and slackpath progression
- Add skeleton templates for queue, realtime, and microservices
- Add worker component template with async job processing
- Refactor services and handlers to use new logging infrastructure
- Split component.go into component_infra.go and component_listing.go
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds AddIngressPath and RemoveIngressPath to the Deployer interface
for managing per-component ingress rules in monorepo projects.
- Implement conflict retry logic for concurrent ingress updates
- Add K8s client interface for testability
- Add comprehensive unit tests for ingress path operations
- Add component deployment and teardown methods to ComponentService
- Update service templates with OpenAPI spec improvements
- Add evolving-app cookbook tree for reference
- Split resources.go into resources_ingress.go for path-based routing
- Split component.go into component_deploy.go for deployment helpers
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use glob pattern go.work.su[m] instead of go.work.sum to allow
the COPY to succeed even when go.work.sum doesn't exist yet.
This happens on fresh monorepos before dependencies are synced.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add /diagnostics endpoint for system health overview
- Add external health worker for monitoring Gitea, Woodpecker, Registry
- Add health check methods to Gitea and Woodpecker clients
- Remove hardcoded fallback projects (pantheon, aeries)
- Add diagnostics domain types and service layer
- Add comprehensive tests for diagnostics handler and service
- Fix tests to use registered test project instead of hardcoded one
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Composable monorepo CI fixes:
- Add empty go.sum.tmpl files for pkg, service, worker, and cli components
- Fix Dockerfile.tmpl glob patterns (COPY go.work.sum* is invalid in Kaniko)
- Add deps step to CI that runs go work sync and go mod tidy before builds
- Fix scalar-go dependency version (v0.1.2 doesn't exist, use v0.13.0)
Health endpoint improvements:
- Add registry health check (zot OCI /v2/ endpoint)
- Add health metrics for CI, registry, and Git
- Add /health/ci endpoint for Woodpecker health
Visual verification scaffolding:
- Add Playwright pod and scripts ConfigMap
- Add vision.md and implementation breakdown plan
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add WorkErrorCode type with RATE_LIMITED, AUTH_FAILED, TIMEOUT, STALE_WORKER, AGENT_ERROR, INVALID_SPEC
- Add ClassifyAgentError function to detect error patterns from stderr
- Add error_code column to work_queue table (migration 016)
- Add FailWithCode method to WorkQueue interface and implementations
- Update RequeueStaleWithIDs to mark permanently failed tasks with STALE_WORKER
- Add ErrorCode to BuildResult for API responses
- Update work executor to classify errors before failing tasks
This enables users to see actual failure reasons (e.g., "RATE_LIMITED") instead of
builds stuck in "running" state forever when Claude hits rate limits.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add pass/fail/needs-fix CLI commands to cmd/sdlc/cmd_artifact.go
- Add 3 new methods to SDLCExecutor interface in internal/port
- Implement methods in kubernetes adapter
- Add service methods to SDLCService
- Add HTTP handlers for POST .../artifacts/{type}/pass|fail|needs-fix
- Update 6 skeleton commands to evaluate and set artifact status
- Update test mocks
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add branch lifecycle commands (branch, merge, archive) to the SDLC CLI.
Introduce orchestrator handler and service for multi-step SDLC workflows.
Expand skeleton template with 15 Claude commands covering the full feature
lifecycle. Extend classifier rules, error types, and executor port for
branch operations. Split rules.go and classifier_test.go to stay within
500-line limit.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>