60 Commits

Author SHA1 Message Date
jordan
32d50a6952 feat: make infra provisioning idempotent + aeries-daeya public discovery feed
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Make postgres and redis provisioning idempotent: return success when already
  provisioned with credentials stored, allowing cookbook trees to safely include
  explicit add-db/add-redis steps alongside auto-provisioned project creation
- Update tests to reflect new idempotent behavior
- Consolidate docs CI into single multi-stage Docker build (remove separate
  build-docs step; Dockerfile.nginx now builds Slate then serves with nginx)
- Delete redundant skeleton docs/Dockerfile (replaced by multi-stage nginx image)
- Add watch verb to woodpecker-deployer RBAC (required by kubectl rollout status)
- Aeries Daeya cookbook: add public discovery feed (/) + character profiles (/c/:handle),
  characters.published/handle/tagline fields, dark pink design system, /studio/* routes,
  verify-public-discovery + verify-otp-endpoint smoke test steps
- Fix Input.tsx: remove non-existent --border-hover CSS variable hover effect
2026-02-28 17:32:21 -07:00
jordan
62a9bbb237 fix: resolve 7 root causes causing cookbook deployment failures
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
RC-1: Gitea org fallback already removed (no-op, confirmed)
RC-3: Push/pull now explicitly target origin main (HEAD:main) in both
  pod_git_operations.go and claudebox/git.go — fixes Woodpecker webhook
  trigger by ensuring pushes always land on the main branch
RC-4: wait_for_pipeline records baseline pipeline number before polling;
  only returns success when a NEWER pipeline completes — prevents false
  positive when a prior pipeline was already success
RC-5: Redis WRONGPASS fixed on live persona-community-5 instance; platform
  gap noted (no reprovision endpoint for Redis ACL drift)
RC-6: Removed on_error:continue from all infra provisioning steps (add-db,
  add-redis) across persona-community, slackpath-2/3/4/5 trees — infra
  failures now fail the tree instead of silently continuing to a crash
RC-7: Added .pnpm-store/ to skeleton .gitignore — prevents thousands of
  cache files being committed by agents after pnpm install
RC-2: Updated all 12 cookbook trees — git_clone_url jordan/ → threesix/
  (24 occurrences across all slackpath, aeries, full-stack, genkit trees)
Also: strings.Cut and strings.SplitSeq lint fixes in pod_git_operations.go
  and claudebox/git.go
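
The RC-4 baseline check can be sketched as below. This is a minimal illustration, not the actual wait_for_pipeline code: the `pipeline` type, `waitForNewer`, and the injected `fetch` function are hypothetical names, and a real poller would sleep between attempts.

```go
package main

import "fmt"

// pipeline is a hypothetical, trimmed-down view of a CI pipeline.
type pipeline struct {
	Number int
	Status string
}

// waitForNewer polls fetch up to maxPolls times and returns true only when a
// pipeline numbered above baseline has finished successfully. A success on the
// baseline pipeline itself is ignored, which avoids the false positive.
func waitForNewer(baseline int, fetch func() pipeline, maxPolls int) bool {
	for i := 0; i < maxPolls; i++ {
		p := fetch()
		if p.Number > baseline && p.Status == "success" {
			return true
		}
	}
	return false
}

func main() {
	// Pipeline 41 succeeded before the push; 42 is the run the push triggered.
	history := []pipeline{{41, "success"}, {42, "running"}, {42, "success"}}
	i := 0
	fetch := func() pipeline {
		p := history[i]
		if i < len(history)-1 {
			i++
		}
		return p
	}
	fmt.Println(waitForNewer(41, fetch, 5))
}
```

Recording the baseline before the push is what makes the comparison meaningful; recording it afterwards could miss a pipeline that started quickly.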

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 18:49:09 -07:00
jordan
a843fd7ff4 fix: make NOTIFY_API_KEY optional — fall back to log-only email mode
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
NOTIFY_URL is a global platform credential; NOTIFY_API_KEY is project-
scoped and may not be provisioned if notify setup failed or the
notifyProvisioner wasn't configured. Previously the service would crash
on startup with "invalid configuration: API key is required" when
NOTIFY_URL was set but NOTIFY_API_KEY was missing.

Now the condition checks both: only initialize the notify client when
both NOTIFY_URL and NOTIFY_API_KEY are set. When either is absent, fall
back to log-only mode with a warning (instead of os.Exit(1)).

This is the correct behavior: email not delivered is survivable, but a
service crash on startup breaks the entire application.
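
A minimal sketch of the startup condition described above, assuming a hypothetical `notifyMode` helper (the service's real function names are not shown in this commit):

```go
package main

import (
	"fmt"
	"os"
)

// notifyMode is a hypothetical helper: it selects the real notify client only
// when both settings are present, and log-only mode otherwise.
func notifyMode(url, apiKey string) string {
	if url != "" && apiKey != "" {
		return "notify"
	}
	return "log-only"
}

func main() {
	mode := notifyMode(os.Getenv("NOTIFY_URL"), os.Getenv("NOTIFY_API_KEY"))
	if mode == "log-only" {
		// Warn and keep running instead of calling os.Exit(1).
		fmt.Fprintln(os.Stderr, "warning: notify not fully configured; emails will only be logged")
	}
	fmt.Println("email mode:", mode)
}
```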

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 05:52:58 -07:00
jordan
ad1f19739d fix: call config.MustInit() before config.Load() so Viper reads env vars
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Without MustInit(), viper.AutomaticEnv() is never called, so Viper
cannot read environment variables injected by K8s secrets (envFrom).
This caused DATABASE_URL to always appear empty in deployed services,
forcing them into standalone/in-memory mode even when a database was
provisioned.

os.Getenv() calls like JWT_SECRET worked fine (direct syscall).
Viper-backed reads like DATABASE_URL did not (require AutomaticEnv).

Added pkgconfig.MustInit() call at the top of main() in both the
service component template and the full-monorepo example-api.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 04:25:38 -07:00
jordan
4d9203eddc fix: commit usePersonaGeneration.ts skeleton template
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
The usePersonaGeneration hook was created on disk but never committed
to git, so rendered projects had a broken import in index.ts causing
TypeScript build failures in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 03:59:31 -07:00
jordan
9be5c7d81b fix: address code review issues in album and generation skeleton packages
- Add ErrAnchorRequired sentinel to pkg/album — replaces fragile string equality check
  used for 422 detection; callers now use errors.Is()
- Extract parseShotIndex() helper in album handler — replaces fmt.Sscanf which silently
  accepted partial parses like "12abc"; strconv.Atoi requires the full string to be numeric
- Restructure caption saves in album/handler — captions now written outside the
  len(img.Data) > 0 gate, so URL-only providers (no bytes returned) still get captions
- Add storage.FetchURL() shared utility — removes fetchBytes/downloadURL duplication
  across album and generation packages; callers control timeout via their http.Client
- Add video captions to VideoHandler — same caption sidecar pattern applied to videos
- Add persona generation event types to realtime package — persona_spec_*, persona_image_*,
  persona_video_* events added to EventType union and usePersonaGeneration hook exported
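
The parseShotIndex change above can be illustrated with a short sketch. The function body here is an assumption based on the description, not the committed code; it only shows why strconv.Atoi rejects input that fmt.Sscanf partially accepts.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseShotIndex is a sketch of the helper named in the commit. strconv.Atoi
// fails unless the whole string is a valid integer.
func parseShotIndex(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("invalid shot index %q: %w", s, err)
	}
	return n, nil
}

func main() {
	// Sscanf stops at the first non-digit and reports no error.
	var partial int
	_, scanErr := fmt.Sscanf("12abc", "%d", &partial)
	fmt.Println("Sscanf:", partial, scanErr)

	// Atoi-based parsing rejects the same input.
	_, err := parseShotIndex("12abc")
	fmt.Println("parseShotIndex:", err)
}
```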

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 03:01:37 -07:00
jordan
062a828a00 feat: save prompt caption alongside every generated image
After each successful image upload to storage, a sidecar `.caption`
file is uploaded at the same path with `.caption` extension containing
the exact prompt used to generate the image.

Coverage:
- generation/handlers.go: ImageHandler → media/{userID}/images/{jobID}_{i}.caption
- album/handler.go: AnchorHandler → albums/{userID}/{albumID}/anchor.caption
- album/handler.go: ShotHandler → albums/{userID}/{albumID}/shots/{shotIndex}.caption
- personagen/service.go: generatePosition → personas/{specID}/images/{pos:02d}.caption

Caption failures are logged at warn level and never abort the job.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 02:38:08 -07:00
jordan
3979ef2d08 feat: wire mixed-heritage through Stage 4 and fix pronoun support
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- specgen: extend dnaLLMResponse with heritage fields; conditionally extend
  Stage 4 prompt for EthnicityMixed to ask LLM for primary_heritage,
  secondary_heritage, and mix_percentage; populate IdentityDNA fields from
  response so mixed personas get a real heritage breakdown
- imagegen: buildIdentitySection() produces "East Asian and Latina/Hispanic
  heritage" description for mixed personas instead of generic "mixed-race"
- videogen: add genderPronouns() helper; replace hardcoded she/her with
  pronoun set across all 4 video prompts; generateVideo() returns raw bytes
  so caller can upload to storage
- service: GenerateVideo() uploads video to storage and sets VideoSpec.URL;
  anchor ordering ensures position 1 is generated first; emit
  persona_video_failed SSE event on non-fatal video failures; replace manual
  fold helpers with strings.ToLower + strings.Contains
- worker/main: register persona_generate handler when both AI managers ready
- docs: add persona_video_failed to SSE events reference in personagen.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 01:21:59 -07:00
jordan
002c32aedb feat: add album generation system to skeleton
Adds anchor-based image album generation across docs, skeleton, and rendered
full-monorepo. One subject description + one anchor image + N directed shots,
covering personas, products, characters, and brand assets out of the box.

## What ships

**Skeleton packages:**
- pkg/album/types.go — Album, Shot, ShotStatus, ShotTemplate, AlbumUpdater
- pkg/album/templates.go — PortraitSession, ProductShoot, CharacterSheet built-ins
- pkg/album/handler.go — AnchorHandler + ShotHandler queue job handlers
- packages/realtime/src/useAlbumGeneration.ts — SSE hook owning all album state
- packages/ui/src/components/AlbumGrid.tsx — responsive shot grid with shimmer
- packages/ui/src/components/ShotCard.tsx — pending/generating/complete/failed states
- packages/ui/src/components/AnchorPreview.tsx — anchor CTA + image with controls

**Component service template:**
- internal/port/album.go — AlbumRepository interface
- internal/adapter/memory/album.go — in-memory repo for standalone dev
- internal/service/album.go — create, list, get, generateAnchor, generateAllShots
- internal/api/handlers/album.go — HTTP handlers (CRUD + 202 generation endpoints)
- Routes: GET/POST /albums, GET/DELETE /albums/{id}, POST /albums/{id}/anchor,
  POST/DELETE /albums/{id}/shots, POST /albums/{id}/shots/{index}

**Documentation:**
- .claude/guides/album.md — full guide with API, SSE events, frontend usage

**Key architecture decisions:**
- Anchor bytes never stored in queue payload — workers fetch AnchorURL at runtime
- Generation order enforced: POST /shots returns 422 if no anchor exists
- All album SSE events on existing user:<userId> channel (no new channel)
- AlbumUpdater interface lets job handlers update repo from inside queue workers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 23:57:21 -07:00
jordan
4603402b84 feat: OTP supports unified register+login flow
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Previously SendOTP silently dropped requests for unknown emails, so new
users had no passwordless path in. Now:

- SendOTP: if REGISTRATION_ENABLED and email unknown, generates and
  sends the code anyway (UserID nil until verify)
- VerifyOTP: if email unknown after valid code, auto-registers the user
  (emailVerified=true — OTP delivery proves ownership, name defaults to
  email local-part) then creates a session

REGISTRATION_ENABLED=false continues to block unknown emails at SendOTP,
preserving invite-only / closed-beta behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 11:17:42 -07:00
jordan
5ac9af018a fix: always log OTP codes to stdout in standalone dev mode
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
In-memory auth codes are ephemeral — they're wiped on server restart.
Previously, codes were only visible via email delivery. If the server
restarted between OTP send and OTP verify, the code would be lost.

Now memory.AuthCodeRepository.Create() always logs the code to stdout
with a [DEV] prefix. This gives developers a reliable fallback regardless
of whether NOTIFY_URL is set. Updated CLAUDE.md to document this behavior
and the DEV_USER_EMAIL env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 00:13:12 -07:00
jordan
5f66eb0e7b fix: seed dev user from DEV_USER_EMAIL env var so auth survives restarts
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
In standalone mode (no DATABASE_URL), the in-memory user store only had
hardcoded demo accounts. Any real email the developer used was lost on every
server restart, causing OTP requests to silently fail with "unknown email".

NewUserRepository now accepts devEmail + devPassword. If DEV_USER_EMAIL is
set, that account is seeded on every startup alongside the demo users. The
developer's email is always registered, OTPs route to notify (or log to
console), and re-renders/restarts no longer break the auth flow.

New config fields: DevUserEmail (DEV_USER_EMAIL) / DevUserPassword
(DEV_USER_PASSWORD, default: "DevPassword1").

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 23:46:12 -07:00
jordan
27e6cfd42b feat: add HTML email template system to skeleton service component
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Every project generated from the skeleton now ships with styled,
production-ready transactional emails out of the box.

New pkg/email package:
- Renderer: loads templates from caller-provided embed.FS, inlines CSS via
  douceur at startup, derives plain text via goquery for multipart delivery
- DevHandler: live browser preview at GET /dev/emails and /dev/emails/{purpose}
  (development only, never mounted in production)
- CSSInlineErr field on RenderedEmail so callers can log degraded renders

New service component templates:
- internal/email/embed.go.tmpl — embeds template FS (uses all: prefix for _*.html)
- internal/email/renderer_test.go.tmpl — 9 tests covering all purposes + brand injection
- internal/email/templates/ — 5 HTML email types (login_otp, email_verify,
  magic_link, password_reset, welcome) + 5 shared partials (_layout, _header,
  _footer, _button, _code_box)

Updated service component templates:
- config.go.tmpl — brand fields: AppName, AppURL, SupportEmail, LogoURL, BrandColor
- main.go.tmpl — wires renderer at startup, logs template count
- routes.go.tmpl — mounts /dev/emails in development; EmailRenderer in Dependencies
- notify.go.tmpl — renders HTML before sending; warns on CSS inlining failure
- go.mod.tmpl — adds douceur, goquery, gorilla/css, andybalholm/cascadia

Deleted: internal/adapter/email/helpers.go.tmpl (replaced by meta.yaml + renderer)

Fix: template directory named email_verify (matching domain.PurposeEmailVerify)
rather than verify_email — the mismatch caused all verification emails to fail
with "unknown email purpose" at send time while tests passed (tests called
Render directly with the wrong name).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 22:44:59 -07:00
jordan
4f01015132 feat: implement project access enforcement and management API
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Fix no-op RequireProjectAccess middleware to enforce project_ids
- Apply project access middleware to all project-scoped routes
- Filter GET /projects by allowed project IDs for restricted keys
- Add GET /me endpoint with key identity, scopes, and project access info
- Add PATCH /keys/{id} for partial key updates (name, scopes, project_ids, allowed_ips, expires_in)
- Add GET/POST/DELETE /projects/{id}/access for project-centric access management
- Auto-grant creating key access when using POST /project/create-and-build
- Accept grant_to_key_ids in create-and-build to grant multiple keys on project creation
- Move newProvisionerWithDeps test helper from production code to test file

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 15:38:37 -07:00
jordan
bc77504b35 fix: add 'use client' directive to MediaLibrary and MediaUploader components
These components use useState/useRef hooks but lacked the Next.js 'use client'
directive, causing the Next.js app build to fail with Server Component errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 00:32:24 -07:00
jordan
592b2d5ec0 fix: clarify database types across docs and fix video storage persistence
Two distinct fixes:

1. Database terminology: Make it crystal clear that generated projects use
   CockroachDB in production and PostgreSQL for local dev, while the rdev
   platform itself uses PostgreSQL. Updated 15 files across skeleton agents,
   component templates, cookbook trees, and platform docs.

2. Video storage: VideoHandler was ignoring vid.Data bytes (already downloaded
   by the Gemini adapter with auth) and re-downloading from the provider URL
   with a plain GET — which fails because Gemini URLs require API key auth.
   Now uses vid.Data first, falls back to downloadURL only for public URLs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 23:13:21 -07:00
jordan
a8c8a0a14d feat: add GCS-based persistent media storage, AI generation pipeline, and composable skeleton packages
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Adds complete media storage pipeline with GCS presigned uploads, AI image/video/text generation
via queue-based workers, realtime SSE event streaming, and comprehensive skeleton packages
(storage, mediagen, textgen, generation, realtime, persona, routing, ai-client). Includes
security fixes for media delete authorization, nil pointer guards in handlers, video persistence
via download-then-upload, consistent signed URLs, and Image→ImageIcon rename to avoid DOM collision.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 21:29:09 -07:00
jordan
542bc722ab fix(architect): handle missing projects in repo, add cookbook hooks/validation
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
The architect API returned "failed to start conversation" because
projectRepo.Get() failed — the in-memory K8s repo watches the rdev
namespace but projects deploy to the projects namespace. Made project
lookup non-fatal with fallback to default pod. Added error logging to
all architect handler methods (were silently swallowing errors).

Also adds setup-hooks, commit-after-qa, and pre-merge-validate steps
to the foundary cookbook tree for git hooks and code quality gates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 02:25:40 -07:00
jordan
a9ad3d8304 chore: accumulated platform hardening and CI fixes
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
CI / Woodpecker:
- Add explicit depends_on to all .woodpecker.yml steps (rdev + templates)
- Fix skip_tls_verify -> skip-tls-verify (correct Kaniko flag name)
- Add replicasets get/list to deployer RBAC for rollout status
- Skeleton template: add failure:ignore on docs steps, Traefik TLS
  annotations on ingress, depends_on on verify step

Component templates:
- Fix container name in deploy steps (PROJECT_NAME-COMPONENT_NAME)
- Replace kubectl scale with kubectl patch for replicas
- Add post-deploy image verification and rollout status checks
- Applied consistently across all 5 component templates

Adapters:
- gitea: Add HTTP client timeout (30s), context cancellation checks,
  handle 404 on GetRepo/DeleteRepo
- zot: Add retry with exponential backoff (doWithRetry), limit response
  body reads to 10MB
- cockroach: Use net.JoinHostPort for IPv6-safe DSN construction
- woodpecker: Fix error wrapping (%v -> %w)
- redis: Fix error wrapping (%v -> %w)
- deployer: Add context cancellation checks

Services:
- apikey_service: Fix error wrapping (%v -> %w)
- component_deploy: Fix error wrapping (%v -> %w)
- project_infra: Fix error wrapping (%v -> %w)
- webhook/dispatcher: Fix error wrapping (%v -> %w)

Other:
- CLAUDE.md: Add guide links for Gitea, Go 1.25, Woodpecker v3,
  Traefik v3, Zot registry
- circuitbreaker: Add test for error wrapping
- docs: Update deployment, troubleshooting, and runbook docs
- health: Fix error wrapping (%v -> %w)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 23:16:56 -07:00
jordan
b7d0e84946 fix(deploy): create component deployments with 0 replicas to prevent ImagePullBackOff
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Components are scaffolded before CI builds their images. Previously deployments
started with 1 replica, causing ImagePullBackOff until the first build completed.
Now deployments start at 0 replicas; CI deploy steps scale to 1 after verifying
the image exists in the registry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 10:16:14 -07:00
jordan
9f957d6e75 fix(templates): harden component CI steps and compile regexes
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Add --connect-timeout 10 and --max-time 15 to all verify step curl
  calls to prevent hanging on registry health checks
- Fix cli template: depends_on [deps] -> [preflight] for consistency
- Add cross-reference comment to service template about verify logic
  being replicated across all 5 component templates
- Document component CI step rules in composable-monorepo.md
- Compile regexes at package level instead of per-call in
  component_updates.go
- Add component_updates_test.go
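
A sketch of the package-level regex compilation mentioned above. The pattern and the `replaceImage` helper are hypothetical and are not taken from component_updates.go; the point is only that regexp.MustCompile runs once at package initialization rather than on every call.

```go
package main

import (
	"fmt"
	"regexp"
)

// imageLineRe is compiled once when the package is initialized.
// The pattern is an illustrative assumption.
var imageLineRe = regexp.MustCompile(`image:\s*\S+`)

// replaceImage swaps the image reference in a manifest snippet. It reuses the
// precompiled pattern instead of calling regexp.MustCompile per call.
func replaceImage(manifest, image string) string {
	return imageLineRe.ReplaceAllLiteralString(manifest, "image: "+image)
}

func main() {
	fmt.Print(replaceImage("  image: old:1\n", "new:2"))
}
```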

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 19:36:23 -07:00
jordan
9226454b85 feat: label-based undeploy, GC reconciliation, checkout/sessions, pool status
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Add UndeployAll() using label selectors to clean up monorepo components
  on project deletion (replaces name-based Undeploy in DeleteProject and
  the direct undeploy handler)
- Add ResourceGC background worker that periodically finds K8s resources
  whose project label has no matching DB record, deletes after 1h safety
  window
- Widen deployer client type from *kubernetes.Clientset to
  kubernetes.Interface for testability
- UndeployAll accumulates errors via errors.Join instead of failing fast
- Add checkout/checkin sidecar dev flow: temporary git tokens, branch
  checkout, review on checkin with cleanup workers
- Add interactive sessions: pod binding, command execution, SSE streaming,
  ephemeral preview URLs with session cleanup workers
- Add GET /workers/pool endpoint for aggregate capacity and queue depth
- Add sessions:read and sessions:execute auth scopes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 19:11:28 -07:00
jordan
adcea2fc1f fix(templates): upgrade Go to 1.25 and fix Woodpecker syntax
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
## Template Version Alignment
- Go: 1.23 → 1.25 across all templates (go.work, go.mod, Dockerfiles, CI)
- Alpine: latest → 3.19 (explicit version pinning)
- Woodpecker: failure:retry → failure:ignore (invalid syntax fix)

## SDLC Tree Fixes (slackpath-5-full-lifecycle)
Fixed merge failures by correcting lifecycle flow:

1. **Branch Creation**: Added missing create-branch step (planned → ready)
   - Bug: Merge command requires feature.Branch field to be set
   - Fix: POST /projects/{id}/sdlc/features/{slug}/branch

2. **Artifact Status**: Changed approval to pass for execution artifacts
   - Bug: Review/audit/QA need status="passed" not "approved"
   - Fix: /artifacts/{type}/approve → /artifacts/{type}/pass
   - Added: pass-qa step after wait-qa

3. **Phase Transition Order**: Reordered merge phase transition
   - Bug: Merge command checks if phase == "merge" first
   - Fix: transition-to-merge BEFORE merge-feature (not after)

## GCS Provisioner Fix
- Replaced deprecated option.WithCredentialsFile with env var approach
- Now uses GOOGLE_APPLICATION_CREDENTIALS for ADC (Application Default Credentials)
- Avoids security risk from deprecated credential options
- Fixed test: Added ComponentTypeGCS to ValidComponentTypes test

## Critical Rules Added
- Version alignment: All template versions must stay in sync
- When updating versions, grep entire templates/ tree

## Files Changed
- 27 template files: Go version + Woodpecker syntax
- 1 tree file: SDLC lifecycle flow corrections
- 1 CLAUDE.md: Version alignment rule
- 1 GCS provisioner: Deprecated API fix
- 1 test file: Added missing component type

Root cause: Skeleton templates lagged behind Go 1.25 release and had
invalid Woodpecker syntax. SDLC tree skipped required branch creation
and used wrong artifact approval endpoints.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 23:57:38 -07:00
jordan
f20fc6c51c feat(saga): implement enterprise-grade resilience architecture
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Fixes issues from code review of resilience implementation:

- Wire saga system in main.go (SagaRepository, SagaExecutor, SagaHandler)
- Fix CompletedSteps() to include skipped steps for dependency resolution
- Fix reverse loop bug in saga compensation (use standard swap pattern)
- Add circuit breaker state change callbacks for Prometheus metrics
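
The "standard swap pattern" mentioned above is the usual two-index in-place reversal. The sketch below is generic and uses made-up step names; it does not reproduce the saga executor's code.

```go
package main

import "fmt"

// reverse reverses s in place. The two indices walk toward each other and
// stop when they meet, so each pair is swapped exactly once.
func reverse[T any](s []T) {
	for i, j := 0, len(s)-1; i < j; i, j = i+1, j-1 {
		s[i], s[j] = s[j], s[i]
	}
}

func main() {
	// Compensation runs completed steps in reverse order.
	steps := []string{"create-repo", "provision-db", "deploy"}
	reverse(steps)
	fmt.Println(steps)
}
```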

Phase 1 (Build Resilience):
- Add failure:retry to all component Kaniko build steps
- Add preflight registry health check before builds
- Add services-deployed sync point to decouple docs from critical path

Phase 2 (API Resilience):
- Add pipeline retry endpoint (POST /projects/{id}/pipelines/{number}/retry)
- Wire circuit breakers with metrics callbacks
- Add /health/circuits endpoint for circuit breaker status

Phase 3 (Saga Engine):
- Full domain model (Saga, SagaStep, RetryPolicy, BackoffType)
- PostgreSQL saga repository with CRUD and step management
- Saga executor with retry, compensation, skip step support
- Saga API handlers with CRUD and control operations

Phase 4 (Observability):
- Add saga metrics (total, step_duration, retry, circuit_breaker_state)
- Add logging fields (saga_id, saga_name, step_name)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 01:58:02 -07:00
jordan
9085965864 fix(skeleton): enforce chi {param} URL syntax in agent guidance
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Agents were generating `:id` (Echo/Gin style) instead of `{id}` (chi style),
causing routes to not match. Updated api-designer, go-specialist agents and
skeleton CLAUDE.md with explicit CRITICAL notes about brace syntax.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 20:44:52 -07:00
jordan
863dfd3214 fix: skip root deployment for empty template (defaults to skeleton)
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
When req.Template is empty, it defaults to 'skeleton' but the check
in createInitialDeployment only matched 'skeleton' explicitly, not
empty string. This caused a broken deployment to be created for
monorepo projects with a non-existent image.

Root cause: slackpath-5 creates project with empty template, which
defaults to skeleton, but createInitialDeployment was still creating
a root deployment that references registry.threesix.ai/{project}:latest
which never gets built (skeleton has no root Dockerfile).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 19:32:19 -07:00
jordan
bcf9f28bb9 fix: add failure:ignore to docs build steps
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
When docs infrastructure doesn't exist, the docs build steps should
gracefully skip without failing the entire pipeline.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 18:26:00 -07:00
jordan
2a25a161cb fix: use plugin-kaniko for docs image build
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
The raw gcr.io/kaniko-project/executor with commands: doesn't work
properly in Woodpecker. Switch to woodpeckerci/plugin-kaniko with
settings: to match other component builds.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 18:08:31 -07:00
jordan
bed72961fe fix: add --insecure flag to kaniko for docs image build
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
The registry.threesix.ai uses a self-signed certificate.
Service builds use plugin-kaniko with skip-tls-verify, but docs
build used raw kaniko executor without TLS bypass, causing exit 128.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 17:50:38 -07:00
jordan
be80fd2d4a fix: correct kaniko dockerfile path for docs image build
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
When --context=docs is set, the --dockerfile path should be relative
to the context directory. Changed from docs/Dockerfile.nginx to
Dockerfile.nginx since kaniko already looks in the docs/ directory.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 17:35:54 -07:00
jordan
caf0990ceb fix: downgrade rouge to 3.x for middleman-syntax compatibility
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
middleman-syntax ~> 3.2 requires rouge ~> 3.2, but Gemfile had rouge ~> 4.0
causing bundle install to fail with version resolution error.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 16:48:49 -07:00
jordan
af91bad0ff feat: add Slate documentation templates to skeleton
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Adds complete Slate documentation infrastructure to generated projects:
- docs/ directory with Gemfile, config.rb, and source templates
- Dockerfile for building docs site
- Dockerfile.nginx for serving static docs
- generate-docs.sh script for CI integration
- Claude command for AI-assisted docs generation
- OpenAPI → Slate markdown conversion via widdershins

Also includes:
- --export-openapi flag for service binaries
- DNS provisioning for docs.{domain} subdomain
- Updated project_infra for docs DNS records

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 16:06:36 -07:00
jordan
f64377116a fix: add build-complete sync point for docs pipeline ordering
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
The export-openapi step was running in parallel with component builds
because it had no explicit dependency. This could cause docs generation
to run before component services were fully built.

Changes:
- Add build-complete step with NO depends_on (waits for ALL prior steps)
- Make export-openapi depend on build-complete
- Complete docs pipeline: export-openapi → generate-docs → build-docs →
  build-docs-image → deploy-docs
- Update verify step label selector to use project= instead of app=

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 16:02:17 -07:00
jordan
e58d679e67 fix: add go mod download to component Dockerfiles
Empty go.sum files were causing Docker builds to fail because
Go couldn't verify dependencies. Added go mod download steps
for both pkg and component directories before building.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 23:35:02 -07:00
jordan
f6a2b61b16 fix: add skeleton settings.local.json (was globally gitignored)
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 22:55:17 -07:00
jordan
3b0779fbe8 fix: slackpath trees use batch endpoint for atomic multi-component adds
Updates slackpath-2 and slackpath-4 to use POST /projects/{id}/components/batch
for adding multiple Go components atomically in a single git commit. This
prevents the go.work race condition where individual commits reference modules
that don't exist yet.

Also adds on_error: continue for infrastructure provisioning steps that may
already exist from skeleton (redis, postgres).

Verified:
- slackpath-1: Complete (wait_build polled 5 times, detected success)
- slackpath-2: Complete (wait_build polled 111 times, detected success)
- slackpath-3: Infrastructure passed (worker capacity limited testing)
- slackpath-4: Infrastructure passed (worker capacity limited testing)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 14:44:53 -07:00
jordan
853ec4cf81 fix: go.work race condition with batch components and idempotent provisioning
Three coordinated fixes for CI pipeline race conditions:

1. Woodpecker step dependencies: Added depends_on: [deps] to all 6 component
   templates (service, worker, cli, app-astro, app-react, app-nextjs) so build
   steps wait for go work sync to complete.

2. Idempotent resource provisioning: Modified provisionResources() to check
   for existing database/cache before creating, preventing "already exists"
   errors on component re-adds.

3. Batch component endpoint: POST /projects/{id}/components/batch enables
   atomic multi-component additions in a single git commit. Validates all
   components upfront, provisions infra sequentially, commits code components
   atomically.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 12:31:40 -07:00
jordan
53862c773b fix: resolve systemic debt in worker and skeleton templates
Worker template fixes:
- Replace panic() with logger.Error() + os.Exit(1) for config errors
- Remove double-timeout application (context + middleware)
- Add error message truncation to prevent log bloat
- Use named constants for shutdown grace period and stale check interval

Skeleton pkg/auth fixes:
- Fix error wrapping to use %w consistently in jwt.go
- Add GetUserOrError() as safe alternative to MustGetUser() panic

Skeleton pkg/queue fixes:
- Check RowsAffected() errors instead of ignoring them
- Add input validation to EnqueueWithOptions (require job type, cap retries)
- Add log truncation for error messages
- Fix inaccurate doc comment claiming exponential backoff
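
A minimal sketch of the kind of error-message truncation mentioned above, assuming a rune-based limit and a visible marker. The function name truncateMessage and the marker text are hypothetical; the commit does not say how the templates implement it.

```go
package main

import "fmt"

// truncateMessage returns msg unchanged when it has at most maxRunes runes.
// Otherwise it keeps the first maxRunes runes and appends a marker so the
// reader can tell the log line was cut. Counting runes instead of bytes
// avoids splitting a multi-byte character.
func truncateMessage(msg string, maxRunes int) string {
	if maxRunes < 0 {
		maxRunes = 0
	}
	runes := []rune(msg)
	if len(runes) <= maxRunes {
		return msg
	}
	return string(runes[:maxRunes]) + "...[truncated]"
}

func main() {
	fmt.Println(truncateMessage("connection refused", 64)) // short: unchanged
	fmt.Println(truncateMessage("hello world", 5))         // hello...[truncated]
}
```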

Worker timeout consolidation:
- Add internal/worker/timeouts.go with named constants
- Migrate all workers to use timeout constants

Cleanup:
- Remove obsolete slack-preparation-thoughts.md files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 23:44:55 -07:00
jordan
d69da6d627 feat: add structured logging infrastructure and SDLC extensions
Major changes:
- Add internal/logging package with field constants, context propagation,
  sensitive data auto-redaction, and per-component log levels
- Add worker timeout constants (TimeoutQuickOp, TimeoutHealthCheck, etc.)
- Extend SDLC with callback handlers, generate endpoints, and executor
- Add new cookbook trees for aeries and slackpath progression
- Add skeleton templates for queue, realtime, and microservices
- Add worker component template with async job processing
- Refactor services and handlers to use new logging infrastructure
- Split component.go into component_infra.go and component_listing.go

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 22:56:04 -07:00
jordan
1790afd0ee feat: add path-based ingress management for component lifecycle
Adds AddIngressPath and RemoveIngressPath to the Deployer interface
for managing per-component ingress rules in monorepo projects.

- Implement conflict retry logic for concurrent ingress updates
- Add K8s client interface for testability
- Add comprehensive unit tests for ingress path operations
- Add component deployment and teardown methods to ComponentService
- Update service templates with OpenAPI spec improvements
- Add evolving-app cookbook tree for reference
- Split resources.go into resources_ingress.go for path-based routing
- Split component.go into component_deploy.go for deployment helpers
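
The conflict retry idea can be illustrated with a small standard-library sketch. ErrConflict and retryOnConflict are hypothetical names used only here; the commit does not state how the Deployer detects conflicts. For real Kubernetes clients, client-go ships a retry.RetryOnConflict helper in k8s.io/client-go/util/retry, though whether this commit uses it is not stated.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrConflict is a hypothetical sentinel standing in for an
// optimistic-concurrency failure (the object changed between read and write).
var ErrConflict = errors.New("conflict: resource was modified concurrently")

// retryOnConflict runs update until it succeeds, fails with a non-conflict
// error, or has been attempted maxAttempts times. The update function is
// expected to re-read the latest object on every call before modifying it.
func retryOnConflict(maxAttempts int, update func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = update()
		if err == nil {
			return nil
		}
		if !errors.Is(err, ErrConflict) {
			return err // not a conflict: retrying would not help
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := retryOnConflict(5, func() error {
		calls++
		if calls < 3 {
			return ErrConflict // simulate two concurrent writers winning first
		}
		return nil
	})
	fmt.Println(calls, err) // 3 <nil>
}
```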

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 01:31:50 -07:00
jordan
196e3d96e8 fix: make go.work.sum optional in Dockerfiles
Use glob pattern go.work.su[m] instead of go.work.sum to allow
the COPY to succeed even when go.work.sum doesn't exist yet.
This happens on fresh monorepos before dependencies are synced.
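
Docker's documentation describes COPY wildcard matching as following Go's filepath.Match rules, so the bracket trick can be checked with a short Go sketch. The helper name matches is hypothetical. The point is that go.work.su[m] is a pattern rather than a literal path, yet it can only ever match the single name go.work.sum.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matches reports whether name matches pattern under filepath.Match rules.
// A malformed pattern is treated as "no match".
func matches(pattern, name string) bool {
	ok, err := filepath.Match(pattern, name)
	return err == nil && ok
}

func main() {
	// [m] is a one-character class, so only "go.work.sum" can match.
	fmt.Println(matches("go.work.su[m]", "go.work.sum")) // true
	fmt.Println(matches("go.work.su[m]", "go.work"))     // false
	fmt.Println(matches("go.work.su[m]", "go.work.sux")) // false
}
```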

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 19:58:46 -07:00
jordan
9a1309a0c5 feat: fix composable monorepo CI builds + health endpoint improvements
Composable monorepo CI fixes:
- Add empty go.sum.tmpl files for pkg, service, worker, and cli components
- Fix Dockerfile.tmpl glob patterns (COPY go.work.sum* is invalid in Kaniko)
- Add deps step to CI that runs go work sync and go mod tidy before builds
- Fix scalar-go dependency version (v0.1.2 doesn't exist, use v0.13.0)

Health endpoint improvements:
- Add registry health check (zot OCI /v2/ endpoint)
- Add health metrics for CI, registry, and Git
- Add /health/ci endpoint for Woodpecker health

Visual verification scaffolding:
- Add Playwright pod and scripts ConfigMap
- Add vision.md and implementation breakdown plan

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 18:46:51 -07:00
jordan
6e8f5821af feat: add artifact pass/fail/needs-fix lifecycle for SDLC execution phases
- Add pass/fail/needs-fix CLI commands to cmd/sdlc/cmd_artifact.go
- Add 3 new methods to SDLCExecutor interface in internal/port
- Implement methods in kubernetes adapter
- Add service methods to SDLCService
- Add HTTP handlers for POST .../artifacts/{type}/pass|fail|needs-fix
- Update 6 skeleton commands to evaluate and set artifact status
- Update test mocks

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 22:14:53 -07:00
jordan
56e3f83955 feat: add auth scopes, OpenAPI docs, SDLC guides, and code quality improvements
- Add auth.RequireScope() to all handler routes for proper authorization
- Add SDLC OpenAPI endpoint documentation (state, features, tasks, branches, merge, archive, orchestrator)
- Add SDLC documentation guides (getting-started, cli-reference, api-reference, command-catalog)
- Add artifact_test.go for SDLC artifact coverage
- Add CLAUDE.md rules: auth scopes requirement, error wrapping with %w
- Fix error wrapping to use %w instead of %v throughout codebase
- Improve CLI merge command with conflict detection and resolution
- Fix handler tests to include auth middleware for RequireScope
- Add cookbook tree runner scripts for automated testing
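
The difference between %w and %v that motivates the wrapping fix can be shown with a short sketch. ErrNotFound, wrapV, wrapW, and isNotFound are hypothetical names for illustration and are not taken from the codebase.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel error.
var ErrNotFound = errors.New("not found")

// wrapV formats the cause with %v: the text is kept, the error chain is lost.
func wrapV(err error) error { return fmt.Errorf("load user: %v", err) }

// wrapW formats the cause with %w: the text is the same, and the cause stays
// reachable through errors.Is, errors.As, and errors.Unwrap.
func wrapW(err error) error { return fmt.Errorf("load user: %w", err) }

// isNotFound reports whether err wraps ErrNotFound anywhere in its chain.
func isNotFound(err error) bool { return errors.Is(err, ErrNotFound) }

func main() {
	fmt.Println(wrapV(ErrNotFound))             // load user: not found
	fmt.Println(isNotFound(wrapV(ErrNotFound))) // false
	fmt.Println(isNotFound(wrapW(ErrNotFound))) // true
}
```

Both wrappers print the same message, which is why the %v form is easy to miss in review; only callers that inspect the chain notice the difference.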

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 13:55:50 -07:00
jordan
f22b220c6d feat: add SDLC branch management, merge, archive, and orchestrator APIs
Add branch lifecycle commands (branch, merge, archive) to the SDLC CLI.
Introduce orchestrator handler and service for multi-step SDLC workflows.
Expand skeleton template with 15 Claude commands covering the full feature
lifecycle. Extend classifier rules, error types, and executor port for
branch operations. Split rules.go and classifier_test.go to stay within
500-line limit.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 12:30:03 -07:00
jordan
425ef0f806 feat: add SDLC orchestration - library, CLI, and API integration
Implements deterministic feature lifecycle management for agent-driven
development. Agents use the CLI in pods; operators control via REST API.

Library (internal/sdlc/):
- Feature lifecycle with 10 phases (draft → released)
- Classifier engine with priority-ordered rules
- Artifact tracking with approval workflow
- Task management within features
- YAML-based state persistence

CLI (cmd/sdlc/):
- init, state, next, feature, artifact, task, query commands
- --json flag for machine-readable output
- Runs inside project pods

API (21 endpoints under /projects/{id}/sdlc/):
- State: GET /state, GET /next
- Features: CRUD + transition/block/unblock
- Artifacts: approve/reject per type
- Tasks: add/start/complete/block
- Queries: blocked/ready/needs-approval

Architecture:
- Port: SDLCExecutor interface (internal/port/)
- Adapter: kubectl exec into pods (internal/adapter/kubernetes/)
- Service: pod resolution + logging (internal/service/)
- Handlers: 5 files under 500-line limit (internal/handlers/)

Also includes template upgrades (chassis framework, UI components,
OpenAPI helpers, backend/frontend guides) and component improvements.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 09:57:05 -07:00
jordan
62460bf098 feat: complete template upgrade - chassis framework, UI library, auth, app-nextjs, OpenAPI, and cookbook
Weeks 1-7 of the template upgrade plan:
- pkg/api: typed HTTPError with sentinels, Wrap/WrapMiddleware, Bind, health probes, OpenAPI schema/param builders
- skeleton/packages: ui (design tokens, components), layout (DashboardShell), auth (AuthProvider, ProtectedRoute), api-client
- skeleton/pkg: httperror, app/handler, app/bind, app/health, auth (JWT/API key middleware)
- components/app-nextjs: Next.js 14 App Router template with dashboard, server actions, auth
- cookbooks/feature-development.md with test and validation scripts
- Handler tests for components, project management, and woodpecker webhook
- 3 rounds of code review fixes applied

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 00:46:51 -07:00
jordan
c280a92012 feat: add operations audit system and template improvements
Operations Audit (new feature):
- Add Operation domain model with status tracking (pending, running, completed, failed, cancelled)
- Add OperationRepository with PostgreSQL implementation
- Add OperationService for CRUD and lifecycle management
- Add operations handlers (list, get, cancel endpoints)
- Add migration 015_operations.sql for operations table
- Add operation cleanup worker for stale operation handling
- Add ErrOperationNotFound to domain errors

Template Improvements:
- Add CLAUDE.md configuration files to astro-landing, default, and go-api templates
- Fix PORT template variable usage in nginx configs for app templates
- Add replace directives for local pkg module in Go templates
- Simplify Go service/worker Dockerfiles for workspace builds
- Fix TypeScript error in logger template

Other:
- Refactor landing-test.sh cookbook script
- Update CLAUDE.md version reference

Note: Some files exceed 500-line limit (pre-existing debt + new feature)
- component.go: 550 lines (unchanged, pre-existing)
- main.go: 522 lines (added operations wiring)
- operation_repo.go: 569 lines (new, needs splitting)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 19:08:57 -07:00
jordan
b3d47abd7c feat: add curated skills, commands, and agents to skeleton template
Add best-of-best Claude Code configuration from local setup to the
composable monorepo skeleton template, giving new projects a powerful
starting configuration.

Commands added (4):
- do-parallel: Execute tasks in parallel waves with agent selection
- remember: Store learnings as institutional memory
- prepare: Pre-implementation readiness assessment
- root-cause: Root cause analysis with parallel investigation

Skills added (5):
- orchestrated-execution: Task pipelines with implementation → review → fix
- root-cause-analyst: Systematic diagnosis with confidence scoring
- knowledge-librarian: Organize learnings in ai-lookup/ structure
- feature-verifier: Verify features work with evidence matrix
- prepare: Binary outcome readiness assessment (brief or gap list)

Agents added (1):
- quality-engineer: Code quality, test coverage, error handling reviewer

All Citadel-specific references genericized to use skeleton's existing
agents (go-specialist, testing-strategist, security-architect, etc).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 15:33:25 -07:00
jordan
05a64c51e7 release: v0.10.27 - fix: woodpecker step YAML multi-line command syntax
2026-02-01 12:42:18 -07:00