41 Commits

Author SHA1 Message Date
jordan
17240f4efd fix(rc-5): add Redis ACL persistence + cache reprovision endpoint
## Changes

### port.Deployer interface
- Add PatchProjectSecrets(ctx, projectName, patch) to merge key-value pairs
  into all K8s secrets labeled project={projectName}
- Add RestartAll(ctx, projectName) to trigger rolling restart of all deployments
  for a project, picking up fresh secrets without waiting for CI

### deployer adapter
- Implement PatchProjectSecrets: lists secrets by label, merges patch into Data,
  writes each secret back
- Implement RestartAll: lists deployments by label, sets restartedAt annotation
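
The merge step in PatchProjectSecrets can be pictured with a small sketch. It assumes the secret payload is the usual Kubernetes `map[string][]byte` and that the patch arrives as plain strings; `mergeSecretData` is a hypothetical helper name, not a function from this commit.

```go
package main

// mergeSecretData is a hypothetical helper illustrating "merges patch into
// Data": every key in patch overwrites or adds to data, and keys that are
// not in patch are left untouched. A nil data map is allocated on demand.
func mergeSecretData(data map[string][]byte, patch map[string]string) map[string][]byte {
	if data == nil {
		data = make(map[string][]byte, len(patch))
	}
	for k, v := range patch {
		data[k] = []byte(v)
	}
	return data
}
```

With this shape, patching `REDIS_URL` leaves unrelated keys in the same secret as they were.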

### domain/credential.go
- Add CredentialCategoryCache = "cache" constant
- Use constant in component_infra.go (was raw string "cache")

### handlers/cache.go (new)
- POST /projects/{projectID}/cache/reprovision
- Calls CreateProjectCache (which handles delete+recreate with new password)
- Updates credential store (REDIS_URL, REDIS_URL_STAGING, REDIS_PREFIX)
- Patches all K8s secrets for the project immediately
- Triggers RestartAll so pods pick up new credentials without waiting for deploy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 20:22:31 -07:00
jordan
3dbde72966 feat: add claude_id tracking and session improvements for interactive dev
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Add claude_id field to sessions (migration 026) for tracking Claude
  process IDs across pod restarts
- Extend session repository with UpdateClaudeID and session lookup methods
- Improve kubernetes executor with better error handling and exec streaming
- Add claudebox client/server improvements for session lifecycle
- Expand sessions handler with exec streaming endpoint
- Add comprehensive tests for sessions and kubernetes executor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 00:20:32 -07:00
jordan
fa0d030def feat: improve notify domain verification reliability and add status endpoints
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Add verifyWithRetry to provisioner: 60s initial DNS propagation delay,
  5 retries with 30s backoff before marking verification as failed
- Add GetNotifyDomainStatus: polls Resend API for domain verification status,
  returns "not_configured" when Resend not set up
- Add VerifyProjectNotify: synchronous re-verification for handler use
- Add getDomainStatus to resendAPI interface + resendClient implementation
- Add NotifyDomainStatus domain struct (host, resend_domain_id, status)
- Guard NOTIFY_RESEND_DOMAIN_ID storage against empty string writes
- New handler: GET /projects/{id}/notify/status (returns verification state)
- New handler: POST /projects/{id}/notify/verify (triggers re-verification)
- Add verify-notify-domain cookbook step to persona-community,
  slackpath-1, and slackpath-4 trees (polls status for up to 6 min)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 16:25:55 -07:00
jordan
002c32aedb feat: add album generation system to skeleton
Adds anchor-based image album generation across docs, skeleton, and rendered
full-monorepo. One subject description + one anchor image + N directed shots,
covering personas, products, characters, and brand assets out of the box.

## What ships

**Skeleton packages:**
- pkg/album/types.go — Album, Shot, ShotStatus, ShotTemplate, AlbumUpdater
- pkg/album/templates.go — PortraitSession, ProductShoot, CharacterSheet built-ins
- pkg/album/handler.go — AnchorHandler + ShotHandler queue job handlers
- packages/realtime/src/useAlbumGeneration.ts — SSE hook owning all album state
- packages/ui/src/components/AlbumGrid.tsx — responsive shot grid with shimmer
- packages/ui/src/components/ShotCard.tsx — pending/generating/complete/failed states
- packages/ui/src/components/AnchorPreview.tsx — anchor CTA + image with controls

**Component service template:**
- internal/port/album.go — AlbumRepository interface
- internal/adapter/memory/album.go — in-memory repo for standalone dev
- internal/service/album.go — create, list, get, generateAnchor, generateAllShots
- internal/api/handlers/album.go — HTTP handlers (CRUD + 202 generation endpoints)
- Routes: GET/POST /albums, GET/DELETE /albums/{id}, POST /albums/{id}/anchor,
  POST/DELETE /albums/{id}/shots, POST /albums/{id}/shots/{index}

**Documentation:**
- .claude/guides/album.md — full guide with API, SSE events, frontend usage

**Key architecture decisions:**
- Anchor bytes never stored in queue payload — workers fetch AnchorURL at runtime
- Generation order enforced: POST /shots returns 422 if no anchor exists
- All album SSE events on existing user:<userId> channel (no new channel)
- AlbumUpdater interface lets job handlers update repo from inside queue workers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-22 23:57:21 -07:00
jordan
4f01015132 feat: implement project access enforcement and management API
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Fix no-op RequireProjectAccess middleware to enforce project_ids
- Apply project access middleware to all project-scoped routes
- Filter GET /projects by allowed project IDs for restricted keys
- Add GET /me endpoint with key identity, scopes, and project access info
- Add PATCH /keys/{id} for partial key updates (name, scopes, project_ids, allowed_ips, expires_in)
- Add GET/POST/DELETE /projects/{id}/access for project-centric access management
- Auto-grant creating key access when using POST /project/create-and-build
- Accept grant_to_key_ids in create-and-build to grant multiple keys on project creation
- Move newProvisionerWithDeps test helper from production code to test file

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 15:38:37 -07:00
jordan
0f25bd8dbe feat: hook in notify service for per-project email delivery
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Add NotifyProvisioner (port + adapter) using real notify admin API
- Create notify account + send key + host grant per project
- Inject NOTIFY_API_KEY/HOST/FROM into component deployments
- Store NOTIFY_URL, NOTIFY_ADMIN_KEY, RESEND_API_KEY in credential store
- Add setup-notify.sh for one-time host/provider/domain setup
- Add NOTIFY_ADMIN_KEY constant to domain/credential.go
- Wire provisioner in main.go with connection test guard
- Add .claude/guides/services/notify.md and CLAUDE.md entry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-21 00:30:32 -07:00
jordan
a8c8a0a14d feat: add GCS-based persistent media storage, AI generation pipeline, and composable skeleton packages
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Adds complete media storage pipeline with GCS presigned uploads, AI image/video/text generation
via queue-based workers, realtime SSE event streaming, and comprehensive skeleton packages
(storage, mediagen, textgen, generation, realtime, persona, routing, ai-client). Includes
security fixes for media delete authorization, nil pointer guards in handlers, video persistence
via download-then-upload, consistent signed URLs, and Image→ImageIcon rename to avoid DOM collision.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 21:29:09 -07:00
jordan
7249575dea feat(sessions): add command execution endpoint and activity tracking
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
- Add POST /sessions/:id/exec endpoint for executing commands in sessions
- Add session activity tracking (last_activity_at timestamp)
- Add database migration 024 for session activity column
- Add comprehensive tests for session handlers and service layer
- Add wildcard TLS certificate for preview.threesix.ai subdomain
- Add infrastructure mocks for testing preview service
- Refactor preview cleanup logic to remove unused methods
- Add AIOS core documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-13 08:41:05 -07:00
jordan
84af398d85 refactor: add timeout constants for agent execution tiers
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Add TimeoutAgentExecution (22m) to handlers for synchronous SDLC
execution, and TimeoutAgent{Default,Medium,Heavy} (12/22/47m) to
workers for tiered agent task execution. Aligns with SDLC action
complexity tiers and prevents inline duration literals.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-11 10:48:24 -07:00
jordan
c68fadbccd fix(architect): add pod_name to agent requests, rewrite foundary cookbook
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
The architect service was missing pod_name/namespace in AgentRequest
metadata, causing Claude Code adapter to reject all requests. Added
ArchitectServiceConfig with pod resolution (project PodName → default
claudebox-0). Removed silent JSON fallback in extractSpecFromMessages
that masked errors.

Rewrote foundary cookbook from 90-step SDLC flow to focused 25-step
cookbook using natural language build prompts instead of /slash-commands
that claudebox cannot execute. Added "no fallbacks" rule to CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 01:24:34 -07:00
jordan
9226454b85 feat: label-based undeploy, GC reconciliation, checkout/sessions, pool status
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
- Add UndeployAll() using label selectors to clean up monorepo components
  on project deletion (replaces name-based Undeploy in DeleteProject and
  the direct undeploy handler)
- Add ResourceGC background worker that periodically finds K8s resources
  whose project label has no matching DB record, deletes after 1h safety
  window
- Widen deployer client type from *kubernetes.Clientset to
  kubernetes.Interface for testability
- UndeployAll accumulates errors via errors.Join instead of failing fast
- Add checkout/checkin sidecar dev flow: temporary git tokens, branch
  checkout, review on checkin with cleanup workers
- Add interactive sessions: pod binding, command execution, SSE streaming,
  ephemeral preview URLs with session cleanup workers
- Add GET /workers/pool endpoint for aggregate capacity and queue depth
- Add sessions:read and sessions:execute auth scopes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 19:11:28 -07:00
jordan
a69eb7e587 feat(foundary): implement complete backend for conversational project design
Implements all 5 phases of Foundary Studio backend:

Phase 1: Chat Persistence (8 API endpoints)
- Conversations and messages with proper cascading deletes
- PostgreSQL schema with auto-update triggers
- Full CRUD operations with structured logging

Phase 2: Blueprint Entity (5 API endpoints)
- JSONB spec storage with GIN indexes
- Flexible structured data for project specifications
- Version-controlled blueprint management

Phase 3: Architect Service (3 API endpoints)
- Conversational AI orchestration with Claude
- Multi-turn dialogue with context building
- Blueprint spec extraction from conversations

Phase 4: Work Queue Integration
- Verified existing endpoint compatibility

Phase 5: Structured Questions (6 API endpoints)
- Four question types: text, choice, multichoice, yesno
- Answer validation with proper constraints
- Conversation-linked Q&A flow

Architecture:
- Textbook hexagonal architecture (domain → port → adapter → service → handler)
- Zero external dependencies in domain layer
- Consistent error handling with proper wrapping
- Auth scopes on all routes (projects:read, projects:execute)
- Structured logging with operation context and duration tracking
- NULL-safe DTO converters throughout

Database:
- 3 new migrations (019, 020, 021)
- UUIDs for all primary keys
- Proper foreign key constraints with ON DELETE CASCADE
- Optimized indexes including partial index for unanswered questions
- Auto-update triggers for timestamps

OpenAPI Documentation:
- Complete API documentation under 'Foundary' tag
- 22 new endpoints documented with examples
- Request/response schemas for all operations

Logging Improvements:
- Added operation field to all service logs
- Added duration_ms tracking for performance monitoring
- Log response_length instead of full response content
- Consistent use of logging field constants
- Execute-then-log pattern for delete operations

Files: 32 changed, 2800+ lines added
- 7 domain models
- 3 database migrations
- 3 port interfaces
- 3 postgres adapters
- 4 services (conversation, blueprint, question, architect)
- 4 handlers with DTOs
- OpenAPI documentation
- Integration in main.go

🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-09 00:50:46 -07:00
jordan
adcea2fc1f fix(templates): upgrade Go to 1.25 and fix Woodpecker syntax
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
## Template Version Alignment
- Go: 1.23 → 1.25 across all templates (go.work, go.mod, Dockerfiles, CI)
- Alpine: latest → 3.19 (explicit version pinning)
- Woodpecker: failure:retry → failure:ignore (invalid syntax fix)

## SDLC Tree Fixes (slackpath-5-full-lifecycle)
Fixed merge failures by correcting lifecycle flow:

1. **Branch Creation**: Added missing create-branch step (planned → ready)
   - Bug: Merge command requires feature.Branch field to be set
   - Fix: POST /projects/{id}/sdlc/features/{slug}/branch

2. **Artifact Status**: Changed approval to pass for execution artifacts
   - Bug: Review/audit/QA need status="passed" not "approved"
   - Fix: /artifacts/{type}/approve → /artifacts/{type}/pass
   - Added: pass-qa step after wait-qa

3. **Phase Transition Order**: Reordered merge phase transition
   - Bug: Merge command checks if phase == "merge" first
   - Fix: transition-to-merge BEFORE merge-feature (not after)

## GCS Provisioner Fix
- Replaced deprecated option.WithCredentialsFile with env var approach
- Now uses GOOGLE_APPLICATION_CREDENTIALS for ADC (Application Default Credentials)
- Avoids security risk from deprecated credential options
- Fixed test: Added ComponentTypeGCS to ValidComponentTypes test

## Critical Rules Added
- Version alignment: All template versions must stay in sync
- When updating versions, grep entire templates/ tree

## Files Changed
- 27 template files: Go version + Woodpecker syntax
- 1 tree file: SDLC lifecycle flow corrections
- 1 CLAUDE.md: Version alignment rule
- 1 GCS provisioner: Deprecated API fix
- 1 test file: Added missing component type

Root cause: Skeleton templates lagged behind Go 1.25 release and had
invalid Woodpecker syntax. SDLC tree skipped required branch creation
and used wrong artifact approval endpoints.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 23:57:38 -07:00
jordan
4486042155 fix(registry): delete container images on project teardown
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Root cause of DIGEST_INVALID errors was registry disk exhaustion.
Project teardown wasn't cleaning up container images, causing the
registry PVC to fill up over time.

Changes:
- Add RegistryProvider port interface for registry operations
- Extend zot.Client with DeleteProjectRepositories method
- Wire registry provider into ProjectInfraService
- Delete images during DeleteProject cleanup (step 4)

The zot client uses the OCI distribution API:
- Lists all repos, filters by project prefix
- Gets manifest digests via HEAD request
- Deletes manifests by digest to trigger GC
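
A small sketch of the first two steps, assuming repositories are named `<project>/<component>` (the commit only says "project prefix"). The manifest path follows the OCI distribution layout `/v2/<name>/manifests/<reference>`; `projectRepos` and `manifestPath` are hypothetical helper names.

```go
package main

import "strings"

// projectRepos keeps only repositories under "<project>/". Matching on the
// trailing slash avoids catching a different project that merely shares a
// name prefix (for example "shopping" when the project is "shop").
func projectRepos(all []string, project string) []string {
	prefix := project + "/"
	var out []string
	for _, r := range all {
		if strings.HasPrefix(r, prefix) {
			out = append(out, r)
		}
	}
	return out
}

// manifestPath builds the registry path used for HEAD (with a tag) and
// DELETE (with a digest) requests against a manifest.
func manifestPath(repo, reference string) string {
	return "/v2/" + repo + "/manifests/" + reference
}
```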

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 02:56:18 -07:00
jordan
f20fc6c51c feat(saga): implement enterprise-grade resilience architecture
Some checks failed
ci/woodpecker/push/woodpecker Pipeline failed
Fixes issues from code review of resilience implementation:

- Wire saga system in main.go (SagaRepository, SagaExecutor, SagaHandler)
- Fix CompletedSteps() to include skipped steps for dependency resolution
- Fix reverse loop bug in saga compensation (use standard swap pattern)
- Add circuit breaker state change callbacks for Prometheus metrics
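
The commit does not show the faulty loop, so the following is only the general idiom that "standard swap pattern" usually refers to: reversing a slice with two indices that meet in the middle. The `reverseSteps` name and the use of step names as strings are assumptions for illustration.

```go
package main

// reverseSteps returns a reversed copy of steps, leaving the input intact.
// Compensation typically walks completed steps newest-first, so the copy
// can be iterated front to back.
func reverseSteps(steps []string) []string {
	out := make([]string, len(steps))
	copy(out, steps)
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i], out[j] = out[j], out[i]
	}
	return out
}
```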

Phase 1 (Build Resilience):
- Add failure:retry to all component Kaniko build steps
- Add preflight registry health check before builds
- Add services-deployed sync point to decouple docs from critical path

Phase 2 (API Resilience):
- Add pipeline retry endpoint (POST /projects/{id}/pipelines/{number}/retry)
- Wire circuit breakers with metrics callbacks
- Add /health/circuits endpoint for circuit breaker status

Phase 3 (Saga Engine):
- Full domain model (Saga, SagaStep, RetryPolicy, BackoffType)
- PostgreSQL saga repository with CRUD and step management
- Saga executor with retry, compensation, skip step support
- Saga API handlers with CRUD and control operations

Phase 4 (Observability):
- Add saga metrics (total, step_duration, retry, circuit_breaker_state)
- Add logging fields (saga_id, saga_name, step_name)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 01:58:02 -07:00
jordan
d74efb75ff fix: wire workService to WorkersHandler and add /work/tasks endpoint
Critical fix: WorkersHandler was missing workService dependency, causing
500 errors when workers tried to fail tasks. This caused tasks to get
stuck in "running" state permanently.

Also adds:
- /work/tasks endpoint for debugging all tasks across projects
- List method to WorkQueue interface for admin views
- HTTP client tests for api_client.go and claudebox/client.go (48 tests)
- Split work.go DTOs into work_dto.go to stay under 500 lines

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 10:35:39 -07:00
jordan
53862c773b fix: resolve systemic debt in worker and skeleton templates
Worker template fixes:
- Replace panic() with logger.Error() + os.Exit(1) for config errors
- Remove double-timeout application (context + middleware)
- Add error message truncation to prevent log bloat
- Use named constants for shutdown grace period and stale check interval

Skeleton pkg/auth fixes:
- Fix error wrapping to use %w consistently in jwt.go
- Add GetUserOrError() as safe alternative to MustGetUser() panic

Skeleton pkg/queue fixes:
- Check RowsAffected() errors instead of ignoring them
- Add input validation to EnqueueWithOptions (require job type, cap retries)
- Add log truncation for error messages
- Fix inaccurate doc comment claiming exponential backoff

Worker timeout consolidation:
- Add internal/worker/timeouts.go with named constants
- Migrate all workers to use timeout constants

Cleanup:
- Remove obsolete slack-preparation-thoughts.md files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 23:44:55 -07:00
jordan
d69da6d627 feat: add structured logging infrastructure and SDLC extensions
Major changes:
- Add internal/logging package with field constants, context propagation,
  sensitive data auto-redaction, and per-component log levels
- Add worker timeout constants (TimeoutQuickOp, TimeoutHealthCheck, etc.)
- Extend SDLC with callback handlers, generate endpoints, and executor
- Add new cookbook trees for aeries and slackpath progression
- Add skeleton templates for queue, realtime, and microservices
- Add worker component template with async job processing
- Refactor services and handlers to use new logging infrastructure
- Split component.go into component_infra.go and component_listing.go

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 22:56:04 -07:00
jordan
b093a4b26d feat: implement Visual Verification API layer (Week 2)
Add REST API endpoints for submitting visual verification tasks,
tracking progress via SSE, and retrieving screenshot/video artifacts.

Changes:
- Add ScopeVerifyRead/ScopeVerifyWrite auth scopes
- Create VerifyService for task submission and lifecycle management
- Create VerifyHandler with POST/GET/DELETE/SSE endpoints:
  - POST /verify - Submit capture task
  - GET /verify/{taskId} - Get task status and artifacts
  - GET /verify/{taskId}/stream - SSE progress stream
  - DELETE /verify/{taskId} - Cancel pending task
  - GET /projects/{id}/verify - List verify tasks
- Wire VerifyExecutor in main.go for Playwright pod execution
- Fix work.go validation to include "verify" task type
- Add comprehensive handler tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 19:29:40 -07:00
jordan
210064d490 feat: add diagnostics endpoint and external health monitoring
- Add /diagnostics endpoint for system health overview
- Add external health worker for monitoring Gitea, Woodpecker, Registry
- Add health check methods to Gitea and Woodpecker clients
- Remove hardcoded fallback projects (pantheon, aeries)
- Add diagnostics domain types and service layer
- Add comprehensive tests for diagnostics handler and service
- Fix tests to use registered test project instead of hardcoded one

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 19:10:56 -07:00
jordan
9a1309a0c5 feat: fix composable monorepo CI builds + health endpoint improvements
Composable monorepo CI fixes:
- Add empty go.sum.tmpl files for pkg, service, worker, and cli components
- Fix Dockerfile.tmpl glob patterns (COPY go.work.sum* is invalid in Kaniko)
- Add deps step to CI that runs go work sync and go mod tidy before builds
- Fix scalar-go dependency version (v0.1.2 doesn't exist, use v0.13.0)

Health endpoint improvements:
- Add registry health check (zot OCI /v2/ endpoint)
- Add health metrics for CI, registry, and Git
- Add /health/ci endpoint for Woodpecker health

Visual verification scaffolding:
- Add Playwright pod and scripts ConfigMap
- Add vision.md and implementation breakdown plan

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 18:46:51 -07:00
jordan
b5fdf35f1b feat: add WorkerService.FailTask for audit updates + visual verification scaffolding
- Add FailTask to WorkerService to update build_audit on failure path
  (fixes bug where audit showed "running" when task actually failed)
- Add WorkServiceFailer interface to avoid circular dependency
- Add VerifyExecutor with Playwright-based visual verification
- Add verify domain types (VerifySpec, VerifyResult, screenshot capture)
- Wire VerifyExecutor placeholder into WorkExecutor (impl in Week 2)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 00:09:16 -07:00
jordan
aaf66764fb feat: add worker pool infrastructure for composable projects
- Add POST /workers/register and POST /workers/{workerId}/heartbeat endpoints
- Start worker health checker goroutine in main.go
- Fix network policy to allow K8s API server access (includes real endpoint IPs)
- Add rdev.orchard9.ai/role: worker label to claudebox StatefulSet

This enables the embedded WorkExecutor to reach claudebox-0 for executing
builds on composable projects that don't have dedicated pods.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 19:55:37 -07:00
jordan
f22b220c6d feat: add SDLC branch management, merge, archive, and orchestrator APIs
Add branch lifecycle commands (branch, merge, archive) to the SDLC CLI.
Introduce orchestrator handler and service for multi-step SDLC workflows.
Expand skeleton template with 15 Claude commands covering the full feature
lifecycle. Extend classifier rules, error types, and executor port for
branch operations. Split rules.go and classifier_test.go to stay within
500-line limit.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 12:30:03 -07:00
jordan
425ef0f806 feat: add SDLC orchestration - library, CLI, and API integration
Implements deterministic feature lifecycle management for agent-driven
development. Agents use the CLI in pods; operators control via REST API.

Library (internal/sdlc/):
- Feature lifecycle with 10 phases (draft → released)
- Classifier engine with priority-ordered rules
- Artifact tracking with approval workflow
- Task management within features
- YAML-based state persistence

CLI (cmd/sdlc/):
- init, state, next, feature, artifact, task, query commands
- --json flag for machine-readable output
- Runs inside project pods

API (21 endpoints under /projects/{id}/sdlc/):
- State: GET /state, GET /next
- Features: CRUD + transition/block/unblock
- Artifacts: approve/reject per type
- Tasks: add/start/complete/block
- Queries: blocked/ready/needs-approval

Architecture:
- Port: SDLCExecutor interface (internal/port/)
- Adapter: kubectl exec into pods (internal/adapter/kubernetes/)
- Service: pod resolution + logging (internal/service/)
- Handlers: 5 files under 500-line limit (internal/handlers/)

Also includes template upgrades (chassis framework, UI components,
OpenAPI helpers, backend/frontend guides) and component improvements.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 09:57:05 -07:00
jordan
62460bf098 feat: complete template upgrade - chassis framework, UI library, auth, app-nextjs, OpenAPI, and cookbook
Weeks 1-7 of the template upgrade plan:
- pkg/api: typed HTTPError with sentinels, Wrap/WrapMiddleware, Bind, health probes, OpenAPI schema/param builders
- skeleton/packages: ui (design tokens, components), layout (DashboardShell), auth (AuthProvider, ProtectedRoute), api-client
- skeleton/pkg: httperror, app/handler, app/bind, app/health, auth (JWT/API key middleware)
- components/app-nextjs: Next.js 14 App Router template with dashboard, server actions, auth
- cookbooks/feature-development.md with test and validation scripts
- Handler tests for components, project management, and woodpecker webhook
- 3 rounds of code review fixes applied

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 00:46:51 -07:00
jordan
c280a92012 feat: add operations audit system and template improvements
Operations Audit (new feature):
- Add Operation domain model with status tracking (pending, running, completed, failed, cancelled)
- Add OperationRepository with PostgreSQL implementation
- Add OperationService for CRUD and lifecycle management
- Add operations handlers (list, get, cancel endpoints)
- Add migration 015_operations.sql for operations table
- Add operation cleanup worker for stale operation handling
- Add ErrOperationNotFound to domain errors

Template Improvements:
- Add CLAUDE.md configuration files to astro-landing, default, and go-api templates
- Fix PORT template variable usage in nginx configs for app templates
- Add replace directives for local pkg module in Go templates
- Simplify Go service/worker Dockerfiles for workspace builds
- Fix TypeScript error in logger template

Other:
- Refactor landing-test.sh cookbook script
- Update CLAUDE.md version reference

Note: Some files exceed 500-line limit (pre-existing debt + new feature)
- component.go: 550 lines (unchanged, pre-existing)
- main.go: 522 lines (added operations wiring)
- operation_repo.go: 569 lines (new, needs splitting)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 19:08:57 -07:00
jordan
05a64c51e7 release: v0.10.27 - fix: woodpecker step YAML multi-line command syntax
2026-02-01 12:42:18 -07:00
jordan
8282d60c69 feat: implement composable monorepo template system with component architecture
Adds the composable monorepo template system that generates project skeletons
with pluggable components (service, worker, app-react, app-astro, cli).

Key changes:
- Monorepo skeleton templates with shared pkg/, scripts/, and git hooks
- Component templates (service, worker, app-react, app-astro, cli) with
  Dockerfiles, CI steps, and component.yaml manifests
- Component domain model with validation and dependency resolution
- Component handler endpoints for CRUD and composition
- Template provider extended with BuildComposableProject and component assembly
- Deployer extended with composable project deployment support
- Handler timeout constants (TimeoutFastLookup through TimeoutLongRunning)
- envutil package for centralized env var reads with defaults
- api.DecodeJSON helper for standardized request body decoding
- Standardized response helpers (WriteBadRequest, WriteNotFound, etc.)
- Replaced fullstack-app cookbook with composable-app cookbook
- Hardened handler timeouts, logging, and error responses across all handlers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 19:11:42 -07:00
jordan
910bcb62e1 fix: Sync build audit with work queue when stale tasks are requeued
When a worker dies mid-build, queue maintenance now updates both
work_queue and build_audit tables when requeuing stale tasks.
This prevents builds from showing "running" forever in the API.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 02:07:52 -07:00
jordan
9c15976f86 feat: Complete Claude endpoint and update cookbook
- Add session_id, model, allowed_tools to Claude request handler
- Update OpenAPI spec for Claude endpoint
- Fix BuildExecutor constructor call sites
- Rewrite landing-test.sh for agent-driven flow
- Fix cookbook documentation for correct API format

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 21:25:29 -07:00
jordan
34e72687e6 feat: Complete automation gaps for repeatable project deployments
- Initial K8s deployment auto-creation during project creation
- DNS record upsert support (create or update existing records)
- Ingress host management for domain aliases (AddIngressHost/RemoveIngressHost)
- Woodpecker deployer RBAC manifest for CI deploy steps
- Single-commit template seeding via Gitea bulk file API

Closes automation gaps exposed during www.threesix.ai launch:
- Projects now auto-create K8s Deployment/Service/Ingress on creation
- Domain aliases automatically update both DNS and K8s ingress
- CI deploy steps work without manual RBAC setup
- Template seeding triggers only one CI pipeline (not per-file)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 15:18:31 -07:00
jordan
c86516c53a feat: Add multi-domain support with auto-generated slugs for landing page cookbook
Landing page cookbook implementation (Weeks 1-4):

Domain Infrastructure:
- Add project_domains table with migration (013_project_domains.sql)
- Add ProjectDomain model with domain types (primary_auto, primary_custom, alias)
- Add SlugGenerator and ProjectDomainRepository interfaces
- Implement postgres adapters for domain and slug management

Service Layer:
- Add domain CRUD methods to ProjectInfraService
- Generate 8-char random slugs for auto-domains
- Support custom subdomains during project creation
- Add site_live health check to project status
- Trigger CI build after template seeding
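
One plausible way to generate the 8-character slugs, sketched under assumptions: a lowercase alphanumeric alphabet and `crypto/rand` as the source. The commit does not state either; `newSlug` and `slugAlphabet` are illustrative names.

```go
package main

import "crypto/rand"

// slugAlphabet is an assumed character set: lowercase letters and digits,
// which are safe in DNS labels.
const slugAlphabet = "abcdefghijklmnopqrstuvwxyz0123456789"

// newSlug returns n random characters from slugAlphabet. Mapping bytes with
// a modulo introduces a slight bias (256 is not a multiple of 36), which is
// acceptable for a non-secret identifier but not for tokens.
func newSlug(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	for i, b := range buf {
		buf[i] = slugAlphabet[int(b)%len(slugAlphabet)]
	}
	return string(buf), nil
}
```

A real slug generator would also need a uniqueness check against existing domains, which this sketch leaves out.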

Handler Updates:
- Add DomainService interface and adapter pattern
- Rewrite domain handlers to use database-backed service
- Add proper error handling for duplicate/missing domains

CI Integration:
- Add TriggerBuild to CIProvider interface
- Implement TriggerBuild in Woodpecker adapter
- Manually trigger initial build after template seed

Cookbook & Scripts:
- Add landing-test.sh script for E2E testing
- Add release.sh for version releases
- Add logs.sh for quick log access

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 12:55:59 -07:00
jordan
bc47e426b0 feat: Add CI pipeline proxy, DNS alias management, and worker executor system
- Add ListPipelines/GetPipeline to CIProvider port with Woodpecker adapter
- Add DNS alias endpoints: GET/POST/DELETE /projects/{id}/domains
- Implement worker executor daemon, build executor, and git operations
- Add build service, worker service, and build audit tracking
- Add worker registry with PostgreSQL adapter and migration
- Add multi-provider code agent interface (Claude Code + OpenCode)
- Add create-and-build combo endpoint
- Update landing-page cookbook to reflect all gaps closed
- Fix tech debt: unified validation, auth scopes, error wrapping, slog patterns

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 21:05:28 -07:00
jordan
39df51defd feat: Add multi-provider code agent interface with Claude Code and OpenCode adapters
Implements weeks 1-4 of the multi-provider architecture:

Week 1 - Foundation:
- Add domain models (AgentProvider, AgentRequest, AgentEvent, AgentResult)
- Define CodeAgent port interface with Execute, Cancel, Capabilities
- Create thread-safe provider registry that defaults to the first registered provider
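
A registry with a first-registered default might look like the sketch below. The `CodeAgent` interface here is a pared-down stand-in with only an assumed `Name` method (the real port has Execute, Cancel, and Capabilities), and `Registry`, `NewRegistry`, and `stubAgent` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// CodeAgent is a simplified stand-in for the real port interface.
type CodeAgent interface {
	Name() string
}

// Registry maps provider names to agents and remembers which came first.
type Registry struct {
	mu          sync.RWMutex
	agents      map[string]CodeAgent
	defaultName string
}

func NewRegistry() *Registry {
	return &Registry{agents: make(map[string]CodeAgent)}
}

// Register adds an agent; the first successful registration becomes the default.
func (r *Registry) Register(a CodeAgent) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	name := a.Name()
	if _, exists := r.agents[name]; exists {
		return fmt.Errorf("provider %q already registered", name)
	}
	r.agents[name] = a
	if r.defaultName == "" {
		r.defaultName = name
	}
	return nil
}

// Default returns the first-registered agent.
func (r *Registry) Default() (CodeAgent, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if r.defaultName == "" {
		return nil, errors.New("no providers registered")
	}
	return r.agents[r.defaultName], nil
}

// stubAgent is a test double used only for this example.
type stubAgent string

func (s stubAgent) Name() string { return string(s) }

func main() {
	reg := NewRegistry()
	_ = reg.Register(stubAgent("claude-code"))
	_ = reg.Register(stubAgent("opencode"))
	def, _ := reg.Default()
	fmt.Println(def.Name()) // prints "claude-code"
}
```

The read-write mutex lets concurrent lookups proceed while registrations, which normally happen once at startup, take the exclusive lock.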

Week 2 - Claude Code Adapter:
- Extract kubectl exec logic into CodeAgent implementation
- Parse stream-json output format (init, message, tool_use, result)
- Support session continuation via --resume flag

Week 3 - OpenCode Adapter:
- HTTP/SSE client for opencode serve API
- Session management (create, send message, abort)
- Event streaming with documented buffer rationale

Week 4 - Quality & Polish:
- Fix race condition in OpenCode Cancel method
- Add AgentRequest.Validate() with ErrPromptRequired, ErrInvalidTimeout
- Document DefaultAvailabilityTimeout constants
- Add HTTP error context for debugging
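
The validation added in Week 4 could be sketched like this. Only the names `AgentRequest`, `Validate`, `ErrPromptRequired`, and `ErrInvalidTimeout` come from the commit message; the struct fields, the error texts, and the rule that a negative timeout is invalid are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

var (
	ErrPromptRequired = errors.New("prompt is required")
	ErrInvalidTimeout = errors.New("timeout must not be negative")
)

// AgentRequest carries only the two fields this sketch validates (assumed shape).
type AgentRequest struct {
	Prompt  string
	Timeout time.Duration // zero is treated here as "use the default"
}

// Validate returns a sentinel error so callers can branch with errors.Is.
func (r AgentRequest) Validate() error {
	if strings.TrimSpace(r.Prompt) == "" {
		return ErrPromptRequired
	}
	if r.Timeout < 0 {
		return ErrInvalidTimeout
	}
	return nil
}

func main() {
	fmt.Println(AgentRequest{}.Validate())
	fmt.Println(AgentRequest{Prompt: "fix the build", Timeout: -time.Second}.Validate())
	fmt.Println(AgentRequest{Prompt: "fix the build", Timeout: time.Minute}.Validate())
}
```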

Also includes:
- Work queue system with PostgreSQL adapter
- Credential store for infrastructure secrets
- Project templates with Woodpecker CI integration
- Comprehensive test coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 09:25:51 -07:00
jordan
812b8341be refactor: Split large files to comply with 500-line limit
- cmd/rdev-api/main.go: Extract OpenAPI spec to openapi.go (1073→386 lines)
- internal/adapter/deployer/deployer.go: Extract K8s resources to resources.go (502→264 lines)
- internal/handlers/infrastructure.go: Extract deploy handlers to infrastructure_deploy.go (592→342 lines)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 23:02:31 -07:00
jordan
0fd4e32073 feat: Add infrastructure adapters for threesix.ai
Add Gitea, Cloudflare DNS, and Kubernetes deployer adapters following
hexagonal architecture. These enable automated project provisioning:
- Git repository creation/management via Gitea
- DNS record management via Cloudflare
- Container deployment to Kubernetes

Includes domain models, ports, handlers, and Woodpecker CI webhook
integration for automated deployments on push.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 22:49:58 -07:00
jordan
72d16929ca feat: Implement hexagonal architecture with services, webhooks, queue, and telemetry
Major refactoring to hexagonal (ports & adapters) architecture:

- Add service layer (apikey_service, project_service) for business logic
- Add webhook system with dispatcher and delivery tracking
- Add command queue with priority-based processing
- Add rate limiting with sliding window algorithm
- Add audit logging for command execution
- Add OpenTelemetry integration (traces, metrics, spans)
- Add circuit breaker for fault tolerance
- Add cached repository wrapper for performance
- Add comprehensive validation package
- Add Kubernetes client integration for pod management
- Add database migrations (allowed_ips, audit_log, rate_limiting, queue, webhooks)
- Add network policy and PodDisruptionBudget for k8s
- Remove legacy executor and projects/registry packages
- Untrack secrets.yaml (now managed via envault)
- Add coverage.out to .gitignore
- Add e2e test infrastructure with docker-compose
- Add comprehensive documentation (API, architecture, operations, plans)
- Add golangci-lint config and pre-commit hook
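
For the sliding-window rate limiting listed above, one common variant is the sliding-window log, sketched below. Whether the commit uses a log or a counter-based window is not stated, so treat the choice as an assumption; `SlidingWindow`, `AllowAt`, and the in-memory map are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// SlidingWindow allows at most limit requests per key within any window-long span.
type SlidingWindow struct {
	mu     sync.Mutex
	limit  int
	window time.Duration
	hits   map[string][]time.Time
}

func NewSlidingWindow(limit int, window time.Duration) *SlidingWindow {
	return &SlidingWindow{limit: limit, window: window, hits: make(map[string][]time.Time)}
}

// AllowAt drops timestamps older than the window, then admits the request
// only if fewer than limit remain. Rejected requests are not recorded.
func (s *SlidingWindow) AllowAt(key string, now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	cutoff := now.Add(-s.window)
	kept := s.hits[key][:0] // filter in place, reusing the backing array
	for _, t := range s.hits[key] {
		if t.After(cutoff) {
			kept = append(kept, t)
		}
	}
	if len(kept) >= s.limit {
		s.hits[key] = kept
		return false
	}
	s.hits[key] = append(kept, now)
	return true
}

// Allow is the convenience form that uses the wall clock.
func (s *SlidingWindow) Allow(key string) bool {
	return s.AllowAt(key, time.Now())
}

func main() {
	lim := NewSlidingWindow(2, time.Minute)
	start := time.Now()
	fmt.Println(lim.AllowAt("key-a", start))                     // true
	fmt.Println(lim.AllowAt("key-a", start.Add(10*time.Second))) // true
	fmt.Println(lim.AllowAt("key-a", start.Add(20*time.Second))) // false
	fmt.Println(lim.AllowAt("key-a", start.Add(61*time.Second))) // true
}
```

Passing the timestamp explicitly keeps the limiter deterministic in tests. A multi-replica deployment would need shared storage rather than a per-process map.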

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 19:57:46 -07:00
jordan
538ea57ed4 feat: Add claude-config API, security hardening, and testing infrastructure
Claude Config API (v0.6):
- Add CRUD endpoints for commands, skills, and agents
- Commands/skills/agents stored in /workspace/.claude/ (per-project, in git)
- Credentials stored on a PVC mounted at /root/.claude/ (shared across pods)
- Use base64 encoding for file writes (prevents shell injection)
- Add content size limits (1MB max)
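
The idea behind the base64 file writes is that the encoded payload contains only `A-Z`, `a-z`, `0-9`, `+`, `/`, and `=`, so file content can never be interpreted by the shell. A minimal sketch, assuming the write goes through `sh -c` inside the pod; `writeFileCommand`, the exact script, and reading "1MB" as 1 MiB are assumptions:

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

// maxContentBytes assumes the "1MB max" limit means 1 MiB.
const maxContentBytes = 1 << 20

var ErrContentTooLarge = errors.New("content exceeds size limit")

// writeFileCommand (hypothetical helper) builds an argv that decodes the
// payload inside the container. The payload and path are passed as
// positional parameters, so neither is spliced into the script text.
func writeFileCommand(path string, content []byte) ([]string, error) {
	if len(content) > maxContentBytes {
		return nil, ErrContentTooLarge
	}
	encoded := base64.StdEncoding.EncodeToString(content)
	script := `printf %s "$1" | base64 -d > "$2"`
	// The fourth element becomes $0; encoded is $1 and path is $2.
	return []string{"sh", "-c", script, "sh", encoded, path}, nil
}

func main() {
	cmd, err := writeFileCommand("/workspace/.claude/commands/review.md", []byte("$(rm -rf /)"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", cmd)
}
```

Operating systems cap the length of a single argument, so payloads near the size limit would likely need to be streamed over stdin instead.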

Security Hardening:
- Add sanitize package for command/prompt validation
- Add rate limiting middleware (token bucket algorithm)
- Add concurrent command limiting
- Add input sanitization to all command handlers
- Gitignore secrets.yaml and credentials.yaml
- Add *.example templates for secrets

Testing Infrastructure:
- Add testutil package with mocks and fixtures
- Add unit tests for auth package (63% coverage)
- Add unit tests for executor (47% coverage)
- Add handler integration tests (40% coverage)
- Add 100% coverage for sanitize, cmdlimit packages
- Add 96% coverage for ratelimit package

Infrastructure:
- Shared Claude credentials PVC (ReadWriteMany)
- Reduced workspace PVC size from 20Gi to 5Gi
- Add init container cleanup before git clone
- Document Longhorn RWX requirements

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 01:29:13 -07:00
jordan
d2de49a591 feat: Add API key authentication with auto-migrations
Implements API key authentication for all rdev endpoints:

## Database (internal/db)
- Auto-migrating postgres connection
- Embedded SQL migrations via go:embed
- api_keys table with scopes, expiration, project restrictions

## Auth Package (internal/auth)
- Key generation: rdev_sk_<prefix>_<random> format
- Scopes: projects:read, projects:execute, keys:read, keys:write, admin
- SHA-256 key hashing (plaintext keys never stored)
- Expiration options: 30d, 60d, 90d, 1y, never
- Middleware skips /health, /ready, /docs, /openapi.json
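
Key generation and hashing along these lines can be sketched as follows. The `rdev_sk_<prefix>_<random>` format and SHA-256 come from the list above; the segment lengths, hex encoding, and the `generateKey`/`hashKey` names are assumptions.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// randomHex returns 2*n hex characters from the system CSPRNG.
func randomHex(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// hashKey is what would be persisted and compared on each request.
func hashKey(plaintext string) string {
	sum := sha256.Sum256([]byte(plaintext))
	return hex.EncodeToString(sum[:])
}

// generateKey returns the plaintext key (shown to the user once) and its hash.
// The 4-byte prefix and 24-byte secret sizes are illustrative choices.
func generateKey() (string, string, error) {
	prefix, err := randomHex(4)
	if err != nil {
		return "", "", err
	}
	secret, err := randomHex(24)
	if err != nil {
		return "", "", err
	}
	plaintext := "rdev_sk_" + prefix + "_" + secret
	return plaintext, hashKey(plaintext), nil
}

func main() {
	plaintext, hash, err := generateKey()
	if err != nil {
		panic(err)
	}
	fmt.Println("give to user:", plaintext)
	fmt.Println("store in db: ", hash)
}
```

An unsalted SHA-256 is reasonable here because the key itself is high-entropy random data, unlike a user-chosen password.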

## Key Management API
- GET /keys - List keys (keys:read)
- POST /keys - Create key (keys:write)
- GET /keys/{id} - Get key details (keys:read)
- DELETE /keys/{id} - Revoke key (keys:write)

## Environment Variables
- DB_HOST, DB_PORT, DB_USER, DB_PASSWORD, DB_NAME
- RDEV_ADMIN_KEY - Super admin key for bootstrapping

Version bumped to 0.5.0.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 21:26:26 -07:00
jordan
4a042a8b71 feat: Add rdev-api Go server with OpenAPI docs
Implements a fully documented API server following the aeries chassis pattern:

- pkg/api: Simplified chassis with App, Response helpers, and OpenAPI builder
- cmd/rdev-api: Entry point with full OpenAPI spec for all v0.4 endpoints
- internal/handlers: Stubbed project handlers (list, get, claude, shell, git, events)

Endpoints:
- GET  /health, /ready     - Health checks
- GET  /docs, /openapi.json - Scalar API docs
- GET  /projects           - List projects
- GET  /projects/{id}      - Get project
- POST /projects/{id}/{claude,shell,git} - Run commands
- GET  /projects/{id}/events - SSE streaming

Uses Scalar for dark-mode API documentation at /docs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 20:56:27 -07:00