jordan/rdev

Author	SHA1	Message	Date
jordan	32d50a6952	feat: make infra provisioning idempotent + aeries-daeya public discovery feed [CI: ci/woodpecker/push/woodpecker passed] - Make postgres and redis provisioning idempotent: return success when already provisioned with credentials stored, allowing cookbook trees to safely include explicit add-db/add-redis steps alongside auto-provisioned project creation - Update tests to reflect new idempotent behavior - Consolidate docs CI into single multi-stage Docker build (remove separate build-docs step; Dockerfile.nginx now builds Slate then serves with nginx) - Delete redundant skeleton docs/Dockerfile (replaced by multi-stage nginx image) - Add watch verb to woodpecker-deployer RBAC (required by kubectl rollout status) - Aeries Daeya cookbook: add public discovery feed (/) + character profiles (/c/:handle), characters.published/handle/tagline fields, dark pink design system, /studio/* routes, verify-public-discovery + verify-otp-endpoint smoke test steps - Fix Input.tsx: remove non-existent --border-hover CSS variable hover effect	2026-02-28 17:32:21 -07:00
jordan	e42c18a9a3	feat: add session web UI mode + aeries-daeya cookbook tree Session WebUI: - Add `web_ui` flag to session create — launches claude-code-ui in pod on port 3001 - Install @siteboon/claude-code-ui in claudebox Dockerfile, expose port 3001 - Migration 027: add web_ui column to sessions table - startWebUI/stopWebUI fire-and-forget helpers in SessionsHandler - Service selects preview port 3001 (web UI) vs 8080 (sidecar) based on flag Aeries Daeya cookbook: - Add cookbooks/trees/aeries-daeya.yaml: privacy-first avatar social platform (infra → avatar data model → AI generation pipeline → studio UI) - Add cookbooks/scripts/aeries-daeya-test.sh: run/status/diagnose/teardown harness - Fix race condition in common.sh wait_for_pipeline: detect already-running pipelines at startup and track directly instead of waiting for a newer one Docs/tooling: - Add SDK Update Workflow section to CLAUDE.md - Add `make sdk` and `make sdk-check` targets for OpenAPI spec management Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-26 23:14:08 -07:00
jordan	ddcfe52b5c	feat: implement shared notify host model for platform email delivery [CI: ci/woodpecker/push/woodpecker failed] Replace per-project notify host provisioning (7-9 API calls + DNS + async Resend verification) with a shared platform host for all .threesix.ai projects. Under the new model: - CreateProjectNotify: 3 calls only (account + send key + host grant) - No per-project Resend domain, DNS records, or async verification - All .threesix.ai projects share `threesix.ai` as the platform host - Custom domains still get a dedicated host via ReprovisionNotifyHost Changes: - domain/notify.go: slim NotifyCredentials (no Host/From/ResendDomainID); add NotifyHostCredentials for reprovision return path - port/notify_provisioner.go: update interface signatures and docs - adapter/notify/provisioner.go: rewrite CreateProjectNotify (3 steps); rewrite DeleteProjectNotify (account-only vs full cleanup) - adapter/notify/provisioner_reprovision.go: return *NotifyHostCredentials - adapter/notify/provisioner_test.go: update tests for new model - service/project_infra_crud.go: store only NOTIFY_API_KEY on provision - domain/credential.go: add CredKeyNotifySharedHost/CredKeyNotifySharedFrom - cmd/rdev-api/config.go: add NotifySharedHost/NotifySharedFrom to InfraConfig - service/component.go: add notifySharedHost/notifySharedFrom + WithNotifyDefaults - service/component_deploy.go: inject shared host defaults when no custom host stored - handlers/notify.go: handle shared-host projects in Reprovision guard; add WithSharedNotifyHost builder - cmd/rdev-api/main.go: wire SharedHost to provisioner, component service, and notify handler Bootstrap: NOTIFY_SHARED_HOST=threesix.ai and NOTIFY_SHARED_FROM=noreply@threesix.ai stored in credential store (host id=1 already provisioned with Resend provider). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 17:04:11 -07:00
jordan	17240f4efd	fix(rc-5): add Redis ACL persistence + cache reprovision endpoint ## Changes ### port.Deployer interface - Add PatchProjectSecrets(ctx, projectName, patch) to merge key-value pairs into all K8s secrets labeled project={projectName} - Add RestartAll(ctx, projectName) to trigger rolling restart of all deployments for a project, picking up fresh secrets without waiting for CI ### deployer adapter - Implement PatchProjectSecrets: lists secrets by label, merges patch into Data, writes each secret back - Implement RestartAll: lists deployments by label, sets restartedAt annotation ### domain/credential.go - Add CredentialCategoryCache = "cache" constant - Use constant in component_infra.go (was raw string "cache") ### handlers/cache.go (new) - POST /projects/{projectID}/cache/reprovision - Calls CreateProjectCache (which handles delete+recreate with new password) - Updates credential store (REDIS_URL, REDIS_URL_STAGING, REDIS_PREFIX) - Patches all K8s secrets for the project immediately - Triggers RestartAll so pods pick up new credentials without waiting for deploy Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 20:22:31 -07:00
jordan	62a9bbb237	fix: resolve 7 root causes causing cookbook deployment failures [CI: ci/woodpecker/push/woodpecker passed] RC-1: Gitea org fallback already removed (no-op, confirmed) RC-3: Push/pull now explicitly target origin main (HEAD:main) in both pod_git_operations.go and claudebox/git.go — fixes Woodpecker webhook trigger by ensuring pushes always land on the main branch RC-4: wait_for_pipeline records baseline pipeline number before polling; only returns success when a NEWER pipeline completes — prevents false positive when a prior pipeline was already success RC-5: Redis WRONGPASS fixed on live persona-community-5 instance; platform gap noted (no reprovision endpoint for Redis ACL drift) RC-6: Removed on_error:continue from all infra provisioning steps (add-db, add-redis) across persona-community, slackpath-2/3/4/5 trees — infra failures now fail the tree instead of silently continuing to a crash RC-7: Added .pnpm-store/ to skeleton .gitignore — prevents thousands of cache files being committed by agents after pnpm install RC-2: Updated all 12 cookbook trees — git_clone_url jordan/ → threesix/ (24 occurrences across all slackpath, aeries, full-stack, genkit trees) Also: strings.Cut and strings.SplitSeq lint fixes in pod_git_operations.go and claudebox/git.go Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 18:49:09 -07:00
jordan	3dbde72966	feat: add claude_id tracking and session improvements for interactive dev [CI: ci/woodpecker/push/woodpecker passed] - Add claude_id field to sessions (migration 026) for tracking Claude process IDs across pod restarts - Extend session repository with UpdateClaudeID and session lookup methods - Improve kubernetes executor with better error handling and exec streaming - Add claudebox client/server improvements for session lifecycle - Expand sessions handler with exec streaming endpoint - Add comprehensive tests for sessions and kubernetes executor Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 00:20:32 -07:00
jordan	96219a647f	feat: add POST /projects/{id}/notify/reprovision to migrate notify host [CI: ci/woodpecker/push/woodpecker passed] Implements ReprovisionNotifyHost to migrate a project's email sending from an old notify host to a new one (e.g., from project-name-based to slug-based host). Preserves the project's notify account and send key. - Adds ReprovisionNotifyHost to port.NotifyProvisioner interface - Implements revokeHostAccess on notifyAdminAPI + adminClient - Implements Provisioner.ReprovisionNotifyHost (12-step migration) in provisioner_reprovision.go (split to keep provisioner.go < 500 lines) - Adds NotifyHandler.Reprovision handler (POST /notify/reprovision) - Updates OpenAPI spec with reprovision endpoint Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 21:28:59 -07:00
jordan	ee1c214b7e	fix: correct Resend DNS record type, name, and MX priority [CI: ci/woodpecker/push/woodpecker passed] Three bugs in the notify provisioner DNS record upsert: 1. rec.Record ("DKIM"/"SPF") was used as the DNS record type — Cloudflare doesn't know those labels. Fix: use rec.DNSType ("TXT"/"MX") from the resendDNSRecord.type JSON field, which is the actual DNS record type. 2. rec.Name from Resend is already relative to the zone apex (e.g., "resend._domainkey.mail.project-name"), not relative to the registered domain. Code was doing rec.Name + "." + host which produced a doubled subdomain. Fix: pass rec.Name directly — Cloudflare's normalizeName appends ".baseDomain" to build the correct FQDN. 3. MX records have priority 10 in Resend's response but DNSRecord had no Priority field and Cloudflare CreateRecord/UpdateRecord didn't send it. Fix: add Priority int to domain.DNSRecord and include it in the body for both Create and Update when non-zero. These bugs caused DKIM/SPF DNS records to never be created for any project. Re-provision affected projects using POST /projects/{id}/notify/provision after clearing NOTIFY_RESEND_DOMAIN_ID from the credential store. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 19:52:11 -07:00
jordan	c54664b751	feat: add POST /projects/{id}/notify/provision to repair Resend domain [CI: ci/woodpecker/push/woodpecker passed] Repairs projects where notify account was created but Resend domain provisioning (steps 7-9) failed — e.g., RESEND_API_KEY not yet configured at project creation time. - ProvisionNotifyDomain in port.NotifyProvisioner interface - Provisioner.ProvisionNotifyDomain: creates Resend domain for existing notify host, upserts DKIM/SPF DNS records in Cloudflare, kicks off async verification via verifyWithRetry - POST /projects/{id}/notify/provision handler: - reads NOTIFY_HOST from credential store (fails if not set) - rejects if NOTIFY_RESEND_DOMAIN_ID already present (use /verify) - stores returned domain ID as NOTIFY_RESEND_DOMAIN_ID credential Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 17:52:03 -07:00
jordan	fa0d030def	feat: improve notify domain verification reliability and add status endpoints [CI: ci/woodpecker/push/woodpecker passed] - Add verifyWithRetry to provisioner: 60s initial DNS propagation delay, 5 retries with 30s backoff before marking verification as failed - Add GetNotifyDomainStatus: polls Resend API for domain verification status, returns "not_configured" when Resend not set up - Add VerifyProjectNotify: synchronous re-verification for handler use - Add getDomainStatus to resendAPI interface + resendClient implementation - Add NotifyDomainStatus domain struct (host, resend_domain_id, status) - Guard NOTIFY_RESEND_DOMAIN_ID storage against empty string writes - New handler: GET /projects/{id}/notify/status (returns verification state) - New handler: POST /projects/{id}/notify/verify (triggers re-verification) - Add verify-notify-domain cookbook step to persona-community, slackpath-1, and slackpath-4 trees (polls status for up to 6 min) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 16:25:55 -07:00
jordan	2612de8446	fix: inject SERVICE_NAME/SERVICE_PORT for app components in batch path [CI: ci/woodpecker/push/woodpecker passed] AddComponentBatch was missing the SERVICE_NAME injection that AddComponent has. When an app-react component (e.g. creator-ui) was rendered via the batch endpoint alongside a service component, {{SERVICE_NAME}} in App.tsx.tmpl was never substituted — rendering the literal string into the repo. Fix: scan the batch's own code requests for a service component first (since the service isn't in the DB yet during batch processing), then fall back to findFirstServiceComponent from DB. This is the same AddComponent vs AddComponentBatch parity gap that caused the JWT_SECRET issue (RC-2). The auth API URL in every app-react project was broken when deployed via the batch endpoint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 11:52:03 -07:00
jordan	a843fd7ff4	fix: make NOTIFY_API_KEY optional — fall back to log-only email mode [CI: ci/woodpecker/push/woodpecker passed] NOTIFY_URL is a global platform credential; NOTIFY_API_KEY is project-scoped and may not be provisioned if notify setup failed or the notifyProvisioner wasn't configured. Previously the service would crash on startup with "invalid configuration: API key is required" when NOTIFY_URL was set but NOTIFY_API_KEY was missing. Now the condition checks both: only initialize the notify client when both NOTIFY_URL and NOTIFY_API_KEY are set. When either is absent, fall back to log-only mode with a warning (instead of os.Exit(1)). This is the correct behavior: email not delivered is survivable, but a service crash on startup breaks the entire application. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 05:52:58 -07:00
jordan	d91bfc50fa	fix: handle missing Redis credentials and redis.Nil in provisioner [CI: ci/woodpecker/push/woodpecker passed] Two bugs fixed: 1. redis.Nil not handled in GetProjectCache: When ACL GETUSER returns nil (user doesn't exist), go-redis represents this as redis.Nil error. The provisioner only checked for err.Contains("User") which didn't match, causing spurious "get ACL user: redis: nil" errors on re-provision. 2. provisionRedis returns 409 even when REDIS_URL not in credential store: If the Redis ACL user exists but REDIS_URL was never stored (e.g., due to a failed previous run or lost state), the service would permanently refuse to provision, leaving the project without usable Redis credentials. Now checks the credential store: if REDIS_URL exists → true 409 duplicate; if REDIS_URL missing → re-provision (CreateProjectCache resets the password). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 05:19:00 -07:00
jordan	a593605caa	fix: call ensureProjectJWTSecret in AddComponentBatch [CI: ci/woodpecker/push/woodpecker failed] AddComponent (single-component path) already calls ensureProjectJWTSecret, but AddComponentBatch has its own parallel implementation that bypassed it. Components added via the /batch endpoint never had JWT_SECRET provisioned, causing CrashLoopBackOff on startup ("JWT_SECRET must be set"). Add the call before the createInitialComponentDeployment loop so the secret is stored in the credential store before K8s Secrets are created. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 05:12:26 -07:00
jordan	ad1f19739d	fix: call config.MustInit() before config.Load() so Viper reads env vars [CI: ci/woodpecker/push/woodpecker passed] Without MustInit(), viper.AutomaticEnv() is never called, so Viper cannot read environment variables injected by K8s secrets (envFrom). This caused DATABASE_URL to always appear empty in deployed services, forcing them into standalone/in-memory mode even when a database was provisioned. os.Getenv() calls like JWT_SECRET worked fine (direct syscall). Viper-backed reads like DATABASE_URL did not (require AutomaticEnv). Added pkgconfig.MustInit() call at the top of main() in both the service component template and the full-monorepo example-api. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 04:25:38 -07:00
jordan	4d9203eddc	fix: commit usePersonaGeneration.ts skeleton template [CI: ci/woodpecker/push/woodpecker passed] The usePersonaGeneration hook was created on disk but never committed to git, so rendered projects had a broken import in index.ts causing TypeScript build failures in CI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 03:59:31 -07:00
jordan	3247ce3ca0	fix: worker deployments and JWT_SECRET auto-provisioning [CI: ci/woodpecker/push/woodpecker passed] RC-1: Workers now get a Kubernetes Deployment on component creation. NeedsPort() (port assignment) was incorrectly used to gate Deployment creation - workers have no HTTP port but still need a Deployment so CI `kubectl set image` can succeed. Added NeedsDeployment() returning true for service/worker/app-react/app-astro/app-nextjs. AddIngressPath is now guarded by port > 0 so workers don't attempt HTTP routing. RC-2: JWT_SECRET is now auto-provisioned per-project when the first code component is added. The skeleton service template fatally requires JWT_SECRET at startup; previously fetchProjectCredentials() never fetched it. ensureProjectJWTSecret() generates a cryptographically random 32-byte secret, stores it as "{projectID}:JWT_SECRET", and JWT_SECRET is now included in projectScopedKeys so it's injected into every deployment. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 03:42:53 -07:00
jordan	9be5c7d81b	fix: address code review issues in album and generation skeleton packages - Add ErrAnchorRequired sentinel to pkg/album — replaces fragile string equality check used for 422 detection; callers now use errors.Is() - Extract parseShotIndex() helper in album handler — replaces fmt.Sscanf which silently accepted partial parses like "12abc"; strconv.Atoi requires the full string to be numeric - Restructure caption saves in album/handler — captions now written outside the len(img.Data) > 0 gate, so URL-only providers (no bytes returned) still get captions - Add storage.FetchURL() shared utility — removes fetchBytes/downloadURL duplication across album and generation packages; callers control timeout via their http.Client - Add video captions to VideoHandler — same caption sidecar pattern applied to videos - Add persona generation event types to realtime package — persona_spec_*, persona_image_*, persona_video_* events added to EventType union and usePersonaGeneration hook exported Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 03:01:37 -07:00
jordan	062a828a00	feat: save prompt caption alongside every generated image After each successful image upload to storage, a sidecar `.caption` file is uploaded at the same path with `.caption` extension containing the exact prompt used to generate the image. Coverage: - generation/handlers.go: ImageHandler → media/{userID}/images/{jobID}_{i}.caption - album/handler.go: AnchorHandler → albums/{userID}/{albumID}/anchor.caption - album/handler.go: ShotHandler → albums/{userID}/{albumID}/shots/{shotIndex}.caption - personagen/service.go: generatePosition → personas/{specID}/images/{pos:02d}.caption Caption failures are logged at warn level and never abort the job. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 02:38:08 -07:00
jordan	3979ef2d08	feat: wire mixed-heritage through Stage 4 and fix pronoun support [CI: ci/woodpecker/push/woodpecker passed] - specgen: extend dnaLLMResponse with heritage fields; conditionally extend Stage 4 prompt for EthnicityMixed to ask LLM for primary_heritage, secondary_heritage, and mix_percentage; populate IdentityDNA fields from response so mixed personas get a real heritage breakdown - imagegen: buildIdentitySection() produces "East Asian and Latina/Hispanic heritage" description for mixed personas instead of generic "mixed-race" - videogen: add genderPronouns() helper; replace hardcoded she/her with pronoun set across all 4 video prompts; generateVideo() returns raw bytes so caller can upload to storage - service: GenerateVideo() uploads video to storage and sets VideoSpec.URL; anchor ordering ensures position 1 is generated first; emit persona_video_failed SSE event on non-fatal video failures; replace manual fold helpers with strings.ToLower + strings.Contains - worker/main: register persona_generate handler when both AI managers ready - docs: add persona_video_failed to SSE events reference in personagen.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 01:21:59 -07:00
jordan	002c32aedb	feat: add album generation system to skeleton Adds anchor-based image album generation across docs, skeleton, and rendered full-monorepo. One subject description + one anchor image + N directed shots, covering personas, products, characters, and brand assets out of the box. ## What ships Skeleton packages: - pkg/album/types.go — Album, Shot, ShotStatus, ShotTemplate, AlbumUpdater - pkg/album/templates.go — PortraitSession, ProductShoot, CharacterSheet built-ins - pkg/album/handler.go — AnchorHandler + ShotHandler queue job handlers - packages/realtime/src/useAlbumGeneration.ts — SSE hook owning all album state - packages/ui/src/components/AlbumGrid.tsx — responsive shot grid with shimmer - packages/ui/src/components/ShotCard.tsx — pending/generating/complete/failed states - packages/ui/src/components/AnchorPreview.tsx — anchor CTA + image with controls Component service template: - internal/port/album.go — AlbumRepository interface - internal/adapter/memory/album.go — in-memory repo for standalone dev - internal/service/album.go — create, list, get, generateAnchor, generateAllShots - internal/api/handlers/album.go — HTTP handlers (CRUD + 202 generation endpoints) - Routes: GET/POST /albums, GET/DELETE /albums/{id}, POST /albums/{id}/anchor, POST/DELETE /albums/{id}/shots, POST /albums/{id}/shots/{index} Documentation: - .claude/guides/album.md — full guide with API, SSE events, frontend usage Key architecture decisions: - Anchor bytes never stored in queue payload — workers fetch AnchorURL at runtime - Generation order enforced: POST /shots returns 422 if no anchor exists - All album SSE events on existing user:<userId> channel (no new channel) - AlbumUpdater interface lets job handlers update repo from inside queue workers Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 23:57:21 -07:00
jordan	4603402b84	feat: OTP supports unified register+login flow [CI: ci/woodpecker/push/woodpecker passed] Previously SendOTP silently dropped requests for unknown emails, so new users had no passwordless path in. Now: - SendOTP: if REGISTRATION_ENABLED and email unknown, generates and sends the code anyway (UserID nil until verify) - VerifyOTP: if email unknown after valid code, auto-registers the user (emailVerified=true — OTP delivery proves ownership, name defaults to email local-part) then creates a session REGISTRATION_ENABLED=false continues to block unknown emails at SendOTP, preserving invite-only / closed-beta behaviour. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 11:17:42 -07:00
jordan	5ac9af018a	fix: always log OTP codes to stdout in standalone dev mode [CI: ci/woodpecker/push/woodpecker passed] In-memory auth codes are ephemeral — they're wiped on server restart. Previously, codes were only visible via email delivery. If the server restarted between OTP send and OTP verify, the code would be lost. Now memory.AuthCodeRepository.Create() always logs the code to stdout with a [DEV] prefix. This gives developers a reliable fallback regardless of whether NOTIFY_URL is set. Updated CLAUDE.md to document this behavior and the DEV_USER_EMAIL env var. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 00:13:12 -07:00
jordan	5f66eb0e7b	fix: seed dev user from DEV_USER_EMAIL env var so auth survives restarts [CI: ci/woodpecker/push/woodpecker passed] In standalone mode (no DATABASE_URL), the in-memory user store only had hardcoded demo accounts. Any real email the developer used was lost on every server restart, causing OTP requests to silently fail with "unknown email". NewUserRepository now accepts devEmail + devPassword. If DEV_USER_EMAIL is set, that account is seeded on every startup alongside the demo users. The developer's email is always registered, OTPs route to notify (or log to console), and re-renders/restarts no longer break the auth flow. New config fields: DevUserEmail (DEV_USER_EMAIL) / DevUserPassword (DEV_USER_PASSWORD, default: "DevPassword1"). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-21 23:46:12 -07:00
jordan	27e6cfd42b	feat: add HTML email template system to skeleton service component [CI: ci/woodpecker/push/woodpecker passed] Every project generated from the skeleton now ships with styled, production-ready transactional emails out of the box. New pkg/email package: - Renderer: loads templates from caller-provided embed.FS, inlines CSS via douceur at startup, derives plain text via goquery for multipart delivery - DevHandler: live browser preview at GET /dev/emails and /dev/emails/{purpose} (development only, never mounted in production) - CSSInlineErr field on RenderedEmail so callers can log degraded renders New service component templates: - internal/email/embed.go.tmpl — embeds template FS (uses all: prefix for _*.html) - internal/email/renderer_test.go.tmpl — 9 tests covering all purposes + brand injection - internal/email/templates/ — 5 HTML email types (login_otp, email_verify, magic_link, password_reset, welcome) + 5 shared partials (_layout, _header, _footer, _button, _code_box) Updated service component templates: - config.go.tmpl — brand fields: AppName, AppURL, SupportEmail, LogoURL, BrandColor - main.go.tmpl — wires renderer at startup, logs template count - routes.go.tmpl — mounts /dev/emails in development; EmailRenderer in Dependencies - notify.go.tmpl — renders HTML before sending; warns on CSS inlining failure - go.mod.tmpl — adds douceur, goquery, gorilla/css, andybalholm/cascadia Deleted: internal/adapter/email/helpers.go.tmpl (replaced by meta.yaml + renderer) Fix: template directory named email_verify (matching domain.PurposeEmailVerify) rather than verify_email — the mismatch caused all verification emails to fail with "unknown email purpose" at send time while tests passed (tests called Render directly with the wrong name). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-21 22:44:59 -07:00
jordan	4f01015132	feat: implement project access enforcement and management API [CI: ci/woodpecker/push/woodpecker passed] - Fix no-op RequireProjectAccess middleware to enforce project_ids - Apply project access middleware to all project-scoped routes - Filter GET /projects by allowed project IDs for restricted keys - Add GET /me endpoint with key identity, scopes, and project access info - Add PATCH /keys/{id} for partial key updates (name, scopes, project_ids, allowed_ips, expires_in) - Add GET/POST/DELETE /projects/{id}/access for project-centric access management - Auto-grant creating key access when using POST /project/create-and-build - Accept grant_to_key_ids in create-and-build to grant multiple keys on project creation - Move newProvisionerWithDeps test helper from production code to test file Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-21 15:38:37 -07:00
jordan	0f25bd8dbe	feat: hook in notify service for per-project email delivery [CI: ci/woodpecker/push/woodpecker passed] - Add NotifyProvisioner (port + adapter) using real notify admin API - Create notify account + send key + host grant per project - Inject NOTIFY_API_KEY/HOST/FROM into component deployments - Store NOTIFY_URL, NOTIFY_ADMIN_KEY, RESEND_API_KEY in credential store - Add setup-notify.sh for one-time host/provider/domain setup - Add NOTIFY_ADMIN_KEY constant to domain/credential.go - Wire provisioner in main.go with connection test guard - Add .claude/guides/services/notify.md and CLAUDE.md entry Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-21 00:30:32 -07:00
jordan	bc77504b35	fix: add 'use client' directive to MediaLibrary and MediaUploader components These components use useState/useRef hooks but lacked the Next.js 'use client' directive, causing the Next.js app build to fail with Server Component errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 00:32:24 -07:00
jordan	592b2d5ec0	fix: clarify database types across docs and fix video storage persistence Two distinct fixes: 1. Database terminology: Make it crystal clear that generated projects use CockroachDB in production and PostgreSQL for local dev, while the rdev platform itself uses PostgreSQL. Updated 15 files across skeleton agents, component templates, cookbook trees, and platform docs. 2. Video storage: VideoHandler was ignoring vid.Data bytes (already downloaded by the Gemini adapter with auth) and re-downloading from the provider URL with a plain GET — which fails because Gemini URLs require API key auth. Now uses vid.Data first, falls back to downloadURL only for public URLs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 23:13:21 -07:00
jordan	a8c8a0a14d	feat: add GCS-based persistent media storage, AI generation pipeline, and composable skeleton packages [CI: ci/woodpecker/push/woodpecker passed] Adds complete media storage pipeline with GCS presigned uploads, AI image/video/text generation via queue-based workers, realtime SSE event streaming, and comprehensive skeleton packages (storage, mediagen, textgen, generation, realtime, persona, routing, ai-client). Includes security fixes for media delete authorization, nil pointer guards in handlers, video persistence via download-then-upload, consistent signed URLs, and Image→ImageIcon rename to avoid DOM collision. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 21:29:09 -07:00
jordan	7249575dea	feat(sessions): add command execution endpoint and activity tracking [CI: ci/woodpecker/push/woodpecker passed] - Add POST /sessions/:id/exec endpoint for executing commands in sessions - Add session activity tracking (last_activity_at timestamp) - Add database migration 024 for session activity column - Add comprehensive tests for session handlers and service layer - Add wildcard TLS certificate for preview.threesix.ai subdomain - Add infrastructure mocks for testing preview service - Refactor preview cleanup logic to remove unused methods - Add AIOS core documentation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-13 08:41:05 -07:00
jordan	84af398d85	refactor: add timeout constants for agent execution tiers [CI: ci/woodpecker/push/woodpecker passed] Add TimeoutAgentExecution (22m) to handlers for synchronous SDLC execution, and TimeoutAgent{Default,Medium,Heavy} (12/22/47m) to workers for tiered agent task execution. Aligns with SDLC action complexity tiers and prevents inline duration literals. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-11 10:48:24 -07:00
jordan	542bc722ab	fix(architect): handle missing projects in repo, add cookbook hooks/validation [CI: ci/woodpecker/push/woodpecker passed] The architect API returned "failed to start conversation" because projectRepo.Get() failed — the in-memory K8s repo watches the rdev namespace but projects deploy to the projects namespace. Made project lookup non-fatal with fallback to default pod. Added error logging to all architect handler methods (were silently swallowing errors). Also adds setup-hooks, commit-after-qa, and pre-merge-validate steps to the foundary cookbook tree for git hooks and code quality gates. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 02:25:40 -07:00
jordan	c68fadbccd	fix(architect): add pod_name to agent requests, rewrite foundary cookbook [ci/woodpecker/push/woodpecker: pipeline successful] The architect service was missing pod_name/namespace in AgentRequest metadata, causing Claude Code adapter to reject all requests. Added ArchitectServiceConfig with pod resolution (project PodName → default claudebox-0). Removed silent JSON fallback in extractSpecFromMessages that masked errors. Rewrote foundary cookbook from 90-step SDLC flow to focused 25-step cookbook using natural language build prompts instead of /slash-commands that claudebox cannot execute. Added "no fallbacks" rule to CLAUDE.md. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 01:24:34 -07:00
jordan	a9ad3d8304	chore: accumulated platform hardening and CI fixes [ci/woodpecker/push/woodpecker: pipeline successful] CI / Woodpecker: - Add explicit depends_on to all .woodpecker.yml steps (rdev + templates) - Fix skip_tls_verify -> skip-tls-verify (correct Kaniko flag name) - Add replicasets get/list to deployer RBAC for rollout status - Skeleton template: add failure:ignore on docs steps, Traefik TLS annotations on ingress, depends_on on verify step Component templates: - Fix container name in deploy steps (PROJECT_NAME-COMPONENT_NAME) - Replace kubectl scale with kubectl patch for replicas - Add post-deploy image verification and rollout status checks - Applied consistently across all 5 component templates Adapters: - gitea: Add HTTP client timeout (30s), context cancellation checks, handle 404 on GetRepo/DeleteRepo - zot: Add retry with exponential backoff (doWithRetry), limit response body reads to 10MB - cockroach: Use net.JoinHostPort for IPv6-safe DSN construction - woodpecker: Fix error wrapping (%v -> %w) - redis: Fix error wrapping (%v -> %w) - deployer: Add context cancellation checks Services: - apikey_service: Fix error wrapping (%v -> %w) - component_deploy: Fix error wrapping (%v -> %w) - project_infra: Fix error wrapping (%v -> %w) - webhook/dispatcher: Fix error wrapping (%v -> %w) Other: - CLAUDE.md: Add guide links for Gitea, Go 1.25, Woodpecker v3, Traefik v3, Zot registry - circuitbreaker: Add test for error wrapping - docs: Update deployment, troubleshooting, and runbook docs - health: Fix error wrapping (%v -> %w) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 23:16:56 -07:00
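A minimal sketch of two of the adapter fixes named in a9ad3d8304: wrapping with `%w` instead of `%v`, and building a host:port pair with `net.JoinHostPort`. The sentinel `errNotFound`, the helper names, and the port number are hypothetical and are not taken from the repository.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// errNotFound is a hypothetical sentinel error used only for this sketch.
var errNotFound = errors.New("repo not found")

// wrapV formats the cause with %v, which flattens it to plain text.
func wrapV(err error) error { return fmt.Errorf("get repo: %v", err) }

// wrapW formats the cause with %w, which keeps it in the error chain.
func wrapW(err error) error { return fmt.Errorf("get repo: %w", err) }

// matchesNotFound reports whether errors.Is can still find the sentinel.
func matchesNotFound(err error) bool { return errors.Is(err, errNotFound) }

// hostPort joins host and port; IPv6 literals are bracketed automatically.
func hostPort(host, port string) string { return net.JoinHostPort(host, port) }

func main() {
	fmt.Println(matchesNotFound(wrapV(errNotFound))) // false
	fmt.Println(matchesNotFound(wrapW(errNotFound))) // true
	fmt.Println(hostPort("::1", "26257"))            // [::1]:26257
	fmt.Println(hostPort("db.internal", "26257"))    // db.internal:26257
}
```

With `%v` the caller can only compare message strings; with `%w` checks such as `errors.Is` and `errors.As` keep working through the added context.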
jordan	3c9876a678	fix(worker): increase SSE scanner buffer to 1MB in claudebox client [ci/woodpecker/push/woodpecker: pipeline failed] The HTTP claudebox client's ExecuteStream method used a bare bufio.NewScanner with the default 64KB max token size. When Claude Code produces tool results > 64KB (e.g., reading large files), the SSE event exceeds the scanner limit and fails with "token too long". Every other scanner in the codebase (claudecode adapter, claudebox executor, kubernetes executor) already uses scanner.Buffer(buf, 1MB). This was the only one missed. Fixes: "agent execution failed: read stream: bufio.Scanner: token too long" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 23:14:20 -07:00
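The failure described in 3c9876a678 can be reproduced with the standard library alone. The sketch below is not the repository's ExecuteStream code; `countLines`, `scanString`, and `bigEvent` are hypothetical helpers, and the 100KB payload is an arbitrary size above the 64KB default.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// maxSSETokenSize is the 1MB ceiling mentioned in the commit message.
const maxSSETokenSize = 1024 * 1024

// countLines scans r line by line. When enlarge is true the scanner may
// grow its buffer up to maxSSETokenSize; otherwise it keeps the bufio
// default limit (bufio.MaxScanTokenSize, 64KB).
func countLines(r io.Reader, enlarge bool) (int, error) {
	scanner := bufio.NewScanner(r)
	if enlarge {
		scanner.Buffer(make([]byte, 0, 64*1024), maxSSETokenSize)
	}
	n := 0
	for scanner.Scan() {
		n++
	}
	return n, scanner.Err()
}

// scanString is a convenience wrapper around countLines for string input.
func scanString(s string, enlarge bool) (int, error) {
	return countLines(strings.NewReader(s), enlarge)
}

// bigEvent builds a hypothetical single-line SSE "data:" event whose
// payload is size bytes long.
func bigEvent(size int) string {
	return "data: " + strings.Repeat("x", size) + "\n"
}

func main() {
	event := bigEvent(100 * 1024) // larger than the 64KB default

	_, err := scanString(event, false)
	fmt.Println("default buffer:", err) // bufio.Scanner: token too long

	n, err := scanString(event, true)
	fmt.Println("1MB buffer:", n, err) // 1 <nil>
}
```

The second argument to `scanner.Buffer` is the maximum token size, so events larger than 1MB would still fail in the same way.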
jordan	b6e778d5ab	fix(git): harden git flow for concurrent SDLC stress test failures [ci/woodpecker/push/woodpecker: pipeline failed] 5 fixes from stress test analysis: 1. CRITICAL: Add pull-before-push to claudebox GitOperations.CommitAndPush, matching the fix already in PodGitOperations (prevents push rejections when concurrent builds advance the remote). 2. HIGH: Extract ResetToMain into PodGitOperations as a shared public method. Wire into BuildExecutor after CloneRepo and update SDLCTaskExecutor to use the shared method. Prevents builds from running on wrong branch when worker pods are reused across tasks. 3. HIGH: Make branch create push failure fatal with retry+rollback in cmd/sdlc/cmd_branch.go. Prevents orphaned .sdlc/ state that causes merge failures after completing all 10 SDLC phases. 4. MEDIUM: Shell-escape token in credential helpers (both PodGitOperations and claudebox GitOperations) to prevent shell injection via tokens containing special characters. 5. MEDIUM: Add GitResetToMain to claudebox sidecar (git.go implementation, server.go endpoint, client.go HTTP method) and wire into HTTPSDLCTaskExecutor for the HTTP sidecar path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 20:57:27 -07:00
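Fix 4 in b6e778d5ab shell-escapes the token before it is interpolated into a credential helper. The commit message does not say which escaping scheme was used; the sketch below assumes POSIX single-quote quoting, and `shellQuote` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// shellQuote wraps s in single quotes so a POSIX shell treats it as one
// literal word. An embedded single quote is written as '\'' (close the
// quote, emit an escaped quote, reopen the quote).
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

func main() {
	fmt.Println(shellQuote("abc123"))       // 'abc123'
	fmt.Println(shellQuote("a'b"))          // 'a'\''b'
	fmt.Println(shellQuote("tok; rm -rf /")) // 'tok; rm -rf /'
}
```

Inside single quotes the shell performs no expansion, so characters such as `;`, `$`, and backticks in a token lose their special meaning.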
jordan	cefc15aa7d	fix(worker): include stdout in error messages when Claude command fails [ci/woodpecker/push/woodpecker: pipeline failed] Auth errors like "OAuth token has expired" were lost because Claude writes them to stdout, not stderr. The error message only showed kubectl's generic "command terminated with exit code 1". Now includes both stdout and stderr in the error, making failures immediately diagnosable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 17:55:46 -07:00
jordan	b7d0e84946	fix(deploy): create component deployments with 0 replicas to prevent ImagePullBackOff [ci/woodpecker/push/woodpecker: pipeline successful] Components are scaffolded before CI builds their images. Previously deployments started with 1 replica, causing ImagePullBackOff until the first build completed. Now deployments start at 0 replicas; CI deploy steps scale to 1 after verifying the image exists in the registry. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-10 10:16:14 -07:00
jordan	9f957d6e75	fix(templates): harden component CI steps and compile regexes [ci/woodpecker/push/woodpecker: pipeline failed] - Add --connect-timeout 10 and --max-time 15 to all verify step curl calls to prevent hanging on registry health checks - Fix cli template: depends_on [deps] -> [preflight] for consistency - Add cross-reference comment to service template about verify logic being replicated across all 5 component templates - Document component CI step rules in composable-monorepo.md - Compile regexes at package level instead of per-call in component_updates.go - Add component_updates_test.go Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 19:36:23 -07:00
jordan	9226454b85	feat: label-based undeploy, GC reconciliation, checkout/sessions, pool status [ci/woodpecker/push/woodpecker: pipeline failed] - Add UndeployAll() using label selectors to clean up monorepo components on project deletion (replaces name-based Undeploy in DeleteProject and the direct undeploy handler) - Add ResourceGC background worker that periodically finds K8s resources whose project label has no matching DB record, deletes after 1h safety window - Widen deployer client type from *kubernetes.Clientset to kubernetes.Interface for testability - UndeployAll accumulates errors via errors.Join instead of failing fast - Add checkout/checkin sidecar dev flow: temporary git tokens, branch checkout, review on checkin with cleanup workers - Add interactive sessions: pod binding, command execution, SSE streaming, ephemeral preview URLs with session cleanup workers - Add GET /workers/pool endpoint for aggregate capacity and queue depth - Add sessions:read and sessions:execute auth scopes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 19:11:28 -07:00
jordan	2a2f2fa370	fix(logging): implement http.Flusher on responseWriter for SSE streaming [ci/woodpecker/push/woodpecker: pipeline failed] The logging middleware's responseWriter wrapped http.ResponseWriter but only implemented WriteHeader, Write, and Unwrap. The missing Flush() method caused w.(http.Flusher) type assertions to fail in the claudebox sidecar's streaming endpoint, returning 500 "streaming not supported". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 13:23:42 -07:00
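The type-assertion failure in 2a2f2fa370 follows from how Go embeds interfaces: a struct that embeds `http.ResponseWriter` only gains the methods of that interface, so `Flush` has to be forwarded explicitly. The sketch below is an illustration under that assumption; `bareWriter`, `loggingWriter`, and `streamOnce` are hypothetical names, not the repository's types.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// bareWriter wraps a ResponseWriter without forwarding Flush.
type bareWriter struct{ http.ResponseWriter }

// loggingWriter is a hypothetical stand-in for the middleware's wrapper.
type loggingWriter struct {
	http.ResponseWriter
	status int
}

// WriteHeader records the status code before delegating.
func (w *loggingWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

// Flush forwards to the wrapped writer when it supports flushing.
func (w *loggingWriter) Flush() {
	if f, ok := w.ResponseWriter.(http.Flusher); ok {
		f.Flush()
	}
}

// Unwrap lets http.ResponseController reach the underlying writer.
func (w *loggingWriter) Unwrap() http.ResponseWriter { return w.ResponseWriter }

// streamOnce mimics an SSE handler: it needs http.Flusher and answers
// 500 "streaming not supported" when the assertion fails.
func streamOnce(w http.ResponseWriter) bool {
	f, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming not supported", http.StatusInternalServerError)
		return false
	}
	fmt.Fprint(w, "data: hello\n\n")
	f.Flush()
	return true
}

// bareStreams reports whether streaming works through bareWriter.
func bareStreams() bool {
	return streamOnce(bareWriter{httptest.NewRecorder()})
}

// wrappedStreams reports whether streaming works through loggingWriter
// and whether the flush reached the underlying recorder.
func wrappedStreams() (ok bool, flushed bool) {
	rec := httptest.NewRecorder()
	ok = streamOnce(&loggingWriter{ResponseWriter: rec, status: http.StatusOK})
	return ok, rec.Flushed
}

func main() {
	fmt.Println(bareStreams())    // false
	fmt.Println(wrappedStreams()) // true true
}
```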
jordan	6ec2a4fea3	fix(sdlc): persist branch metadata on main before feature branch creation [ci/woodpecker/push/woodpecker: pipeline failed] The `sdlc merge` command reads the Branch field from the feature manifest on main, but `sdlc branch create` was only committing that state to the feature branch (via the executor's CommitAndPush). This caused merge to fail with "feature has no branch". Two changes: 1. cmd/sdlc/cmd_branch.go: commit .sdlc/ state to main before `git checkout -b`, ensuring Branch metadata is on main where merge reads it. 2. internal/worker/sdlc_executor.go: reset workspace to main (`git fetch && git checkout main && git reset --hard origin/main`) before each SDLC task, preventing cross-task branch contamination from commands that switch branches. Also updates foundary cookbook with architect fallback pattern and on_error: continue for steps that may fail during early lifecycle. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 08:36:10 -07:00
jordan	a69eb7e587	feat(foundary): implement complete backend for conversational project design Implements all 5 phases of Foundary Studio backend: Phase 1: Chat Persistence (8 API endpoints) - Conversations and messages with proper cascading deletes - PostgreSQL schema with auto-update triggers - Full CRUD operations with structured logging Phase 2: Blueprint Entity (5 API endpoints) - JSONB spec storage with GIN indexes - Flexible structured data for project specifications - Version-controlled blueprint management Phase 3: Architect Service (3 API endpoints) - Conversational AI orchestration with Claude - Multi-turn dialogue with context building - Blueprint spec extraction from conversations Phase 4: Work Queue Integration - Verified existing endpoint compatibility Phase 5: Structured Questions (6 API endpoints) - Four question types: text, choice, multichoice, yesno - Answer validation with proper constraints - Conversation-linked Q&A flow Architecture: - Textbook hexagonal architecture (domain → port → adapter → service → handler) - Zero external dependencies in domain layer - Consistent error handling with proper wrapping - Auth scopes on all routes (projects:read, projects:execute) - Structured logging with operation context and duration tracking - NULL-safe DTO converters throughout Database: - 3 new migrations (019, 020, 021) - UUIDs for all primary keys - Proper foreign key constraints with ON DELETE CASCADE - Optimized indexes including partial index for unanswered questions - Auto-update triggers for timestamps OpenAPI Documentation: - Complete API documentation under 'Foundary' tag - 22 new endpoints documented with examples - Request/response schemas for all operations Logging Improvements: - Added operation field to all service logs - Added duration_ms tracking for performance monitoring - Log response_length instead of full response content - Consistent use of logging field constants - Execute-then-log pattern for delete operations Files: 32 changed, 2800+ lines added - 7 domain models - 3 database migrations - 3 port interfaces - 3 postgres adapters - 4 services (conversation, blueprint, question, architect) - 4 handlers with DTOs - OpenAPI documentation - Integration in main.go 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>	2026-02-09 00:50:46 -07:00
jordan	adcea2fc1f	fix(templates): upgrade Go to 1.25 and fix Woodpecker syntax [ci/woodpecker/push/woodpecker: pipeline failed] ## Template Version Alignment - Go: 1.23 → 1.25 across all templates (go.work, go.mod, Dockerfiles, CI) - Alpine: latest → 3.19 (explicit version pinning) - Woodpecker: failure:retry → failure:ignore (invalid syntax fix) ## SDLC Tree Fixes (slackpath-5-full-lifecycle) Fixed merge failures by correcting lifecycle flow: 1. Branch Creation: Added missing create-branch step (planned → ready) - Bug: Merge command requires feature.Branch field to be set - Fix: POST /projects/{id}/sdlc/features/{slug}/branch 2. Artifact Status: Changed approval to pass for execution artifacts - Bug: Review/audit/QA need status="passed" not "approved" - Fix: /artifacts/{type}/approve → /artifacts/{type}/pass - Added: pass-qa step after wait-qa 3. Phase Transition Order: Reordered merge phase transition - Bug: Merge command checks if phase == "merge" first - Fix: transition-to-merge BEFORE merge-feature (not after) ## GCS Provisioner Fix - Replaced deprecated option.WithCredentialsFile with env var approach - Now uses GOOGLE_APPLICATION_CREDENTIALS for ADC (Application Default Credentials) - Avoids security risk from deprecated credential options - Fixed test: Added ComponentTypeGCS to ValidComponentTypes test ## Critical Rules Added - Version alignment: All template versions must stay in sync - When updating versions, grep entire templates/ tree ## Files Changed - 27 template files: Go version + Woodpecker syntax - 1 tree file: SDLC lifecycle flow corrections - 1 CLAUDE.md: Version alignment rule - 1 GCS provisioner: Deprecated API fix - 1 test file: Added missing component type Root cause: Skeleton templates lagged behind Go 1.25 release and had invalid Woodpecker syntax. SDLC tree skipped required branch creation and used wrong artifact approval endpoints. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 23:57:38 -07:00
jordan	a419c53592	fix(sdlc): make phase transitions idempotent [ci/woodpecker/push/woodpecker: pipeline failed] Allow transitioning to the current phase (no-op success) instead of rejecting it as a "backward" transition. This fixes issues where external systems retry transition commands. Before: draft -> draft returned error After: draft -> draft returns nil (already there) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 14:21:05 -07:00
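The before/after behaviour in a419c53592 can be sketched as a small state check. The phase list below is hypothetical (the commit message names only "draft"), and `transition` is an illustrative function, not the repository's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// phases is a hypothetical ordering used only for this sketch.
var phases = []string{"draft", "planned", "ready", "merge"}

var (
	errUnknownPhase = errors.New("unknown phase")
	errBackward     = errors.New("backward transition")
)

// phaseIndex returns the position of p in phases, or -1 if absent.
func phaseIndex(p string) int {
	for i, name := range phases {
		if name == p {
			return i
		}
	}
	return -1
}

// transition returns the phase after attempting to move from current to
// target. Moving to the current phase is a no-op success, so a retried
// command does not fail; moving to an earlier phase is still rejected.
func transition(current, target string) (string, error) {
	ci, ti := phaseIndex(current), phaseIndex(target)
	if ci < 0 || ti < 0 {
		return current, errUnknownPhase
	}
	if ti == ci {
		return current, nil // already there
	}
	if ti < ci {
		return current, errBackward
	}
	return target, nil
}

func main() {
	fmt.Println(transition("draft", "draft"))   // draft <nil>
	fmt.Println(transition("draft", "planned")) // planned <nil>
	fmt.Println(transition("ready", "draft"))   // ready backward transition
}
```

Checking equality before the backward test is what makes the retry safe: the same request can be delivered twice and the second delivery returns nil.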
jordan	00f55f7f6f	fix(sdlc): route conflict with SDLCGenerateHandler shadowing SDLC routes [ci/woodpecker/push/woodpecker: pipeline failed] SDLCGenerateHandler was using r.Route() to create a sub-router at /projects/{id}/sdlc/features/{slug}, which shadowed SDLCHandler's nested routes like /features/{slug}/artifacts/{type}/approve. Changed to direct route registration to avoid chi route conflicts. This fixes 404 errors on SDLC feature and artifact endpoints. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 11:27:41 -07:00
jordan	4486042155	fix(registry): delete container images on project teardown [ci/woodpecker/push/woodpecker: pipeline failed] Root cause of DIGEST_INVALID errors was registry disk exhaustion. Project teardown wasn't cleaning up container images, causing the registry PVC to fill up over time. Changes: - Add RegistryProvider port interface for registry operations - Extend zot.Client with DeleteProjectRepositories method - Wire registry provider into ProjectInfraService - Delete images during DeleteProject cleanup (step 4) The zot client uses the OCI distribution API: - Lists all repos, filters by project prefix - Gets manifest digests via HEAD request - Deletes manifests by digest to trigger GC Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 02:56:18 -07:00
jordan	f20fc6c51c	feat(saga): implement enterprise-grade resilience architecture [ci/woodpecker/push/woodpecker: pipeline failed] Fixes issues from code review of resilience implementation: - Wire saga system in main.go (SagaRepository, SagaExecutor, SagaHandler) - Fix CompletedSteps() to include skipped steps for dependency resolution - Fix reverse loop bug in saga compensation (use standard swap pattern) - Add circuit breaker state change callbacks for Prometheus metrics Phase 1 (Build Resilience): - Add failure:retry to all component Kaniko build steps - Add preflight registry health check before builds - Add services-deployed sync point to decouple docs from critical path Phase 2 (API Resilience): - Add pipeline retry endpoint (POST /projects/{id}/pipelines/{number}/retry) - Wire circuit breakers with metrics callbacks - Add /health/circuits endpoint for circuit breaker status Phase 3 (Saga Engine): - Full domain model (Saga, SagaStep, RetryPolicy, BackoffType) - PostgreSQL saga repository with CRUD and step management - Saga executor with retry, compensation, skip step support - Saga API handlers with CRUD and control operations Phase 4 (Observability): - Add saga metrics (total, step_duration, retry, circuit_breaker_state) - Add logging fields (saga_id, saga_name, step_name) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-08 01:58:02 -07:00
jordan	9085965864	fix(skeleton): enforce chi {param} URL syntax in agent guidance [ci/woodpecker/push/woodpecker: pipeline failed] Agents were generating `:id` (Echo/Gin style) instead of `{id}` (chi style), causing routes to not match. Updated api-designer, go-specialist agents and skeleton CLAUDE.md with explicit CRITICAL notes about brace syntax. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 20:44:52 -07:00

121 Commits