Commit Graph

98 Commits

Author SHA1 Message Date
jordan
05a64c51e7 release: v0.10.27 - fix: woodpecker step YAML multi-line command syntax 2026-02-01 12:42:18 -07:00
jordan
35dc4d26a4 release: v0.10.25 - feat: add pipeline steps API for debugging diagnostics 2026-02-01 12:41:04 -07:00
jordan
ccc3f13ced release: v0.10.26 - fix: sanitize component path for K8s labels 2026-02-01 12:28:08 -07:00
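The commit subject above does not say how the component path is sanitized. As a rough sketch only: Kubernetes label values are at most 63 characters, may contain alphanumerics, `-`, `_` and `.`, and must start and end with an alphanumeric. The function name `SanitizeLabelValue` and the choice to replace invalid characters with `-` are assumptions, not the project's code.

```go
package main

import (
	"fmt"
	"strings"
)

// maxLabelLen is the Kubernetes limit for a label value.
const maxLabelLen = 63

func isAlnum(r rune) bool {
	return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') || (r >= '0' && r <= '9')
}

// SanitizeLabelValue (hypothetical name) maps a component path such as
// "apps/web" to a string that is valid as a Kubernetes label value.
func SanitizeLabelValue(path string) string {
	var b strings.Builder
	for _, r := range path {
		if isAlnum(r) || r == '-' || r == '_' || r == '.' {
			b.WriteRune(r)
		} else {
			b.WriteByte('-') // slashes, spaces, non-ASCII, etc.
		}
	}
	s := b.String() // ASCII only at this point, so byte slicing is safe
	if len(s) > maxLabelLen {
		s = s[:maxLabelLen]
	}
	// Label values must begin and end with an alphanumeric character.
	return strings.TrimFunc(s, func(r rune) bool { return !isAlnum(r) })
}

func main() {
	fmt.Println(SanitizeLabelValue("apps/web"))
	fmt.Println(SanitizeLabelValue("./packages/api/"))
}
```

Note that a mapping like this is lossy: `a/b` and `a-b` produce the same label, so a real implementation may need a hash suffix if uniqueness matters.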
jordan
c9414832d3 release: v0.10.25 - fix: component deployment creation and pnpm workspace Docker builds 2026-02-01 11:12:55 -07:00
jordan
96a81fb395 release: v0.10.24 - fix: woodpecker YAML marker format 2026-02-01 01:24:29 -07:00
jordan
91c87836a7 release: v0.10.23 - feat: composable monorepo component endpoints 2026-02-01 00:26:36 -07:00
jordan
f6ced22e06 fix: Use FQDN for k8s service hostnames and remove broken commonLabels
Short-form DNS names (e.g. postgres.databases.svc) fail to resolve in
new pods due to k8s DNS search domain limitations. Switch all service
hostnames to FQDNs (*.svc.cluster.local).

Remove commonLabels from kustomization.yaml — it injected labels into
all selectors including NetworkPolicy egress rules (blocking DNS to
CoreDNS) and Deployment selectors (causing immutability errors).

Add OTEL_EXPORTER_OTLP_ENDPOINT env var to deployment YAML so the
telemetry collector endpoint uses the FQDN without requiring a binary
rebuild.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 20:46:04 -07:00
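The FQDN switch described in the commit above can be illustrated with a small helper. This is a minimal sketch: `ServiceFQDN` is a hypothetical name, and `cluster.local` is the default cluster domain, which a cluster may override.

```go
package main

import "fmt"

// clusterDomain is the Kubernetes default; treat it as an assumption,
// since a cluster can be configured with a different domain.
const clusterDomain = "cluster.local"

// ServiceFQDN (hypothetical helper) builds the fully qualified DNS name of a
// Service, so resolution does not depend on the pod's DNS search domains.
func ServiceFQDN(service, namespace string) string {
	return fmt.Sprintf("%s.%s.svc.%s", service, namespace, clusterDomain)
}

func main() {
	// Short form from the commit message: postgres.databases.svc
	fmt.Println(ServiceFQDN("postgres", "databases"))
	// prints "postgres.databases.svc.cluster.local"
}
```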
jordan
e1b8ccd6a4 release: v0.10.22 - fix: Reduce CI activation retry from 15 to 5 attempts to stay under proxy timeout 2026-01-31 10:53:22 -07:00
jordan
137814ae7e release: v0.10.21 - fix: Sync build audit with work queue when stale tasks are requeued 2026-01-31 02:06:10 -07:00
jordan
8db06a32ec chore: Remove obsolete dedicated claudebox pods
The shared worker pool (claudebox-0) now handles all project builds
with dynamic git cloning. The dedicated per-project pods were stuck
in Init state and are no longer needed.

Removed:
- claudebox-aeries StatefulSet and PVC
- claudebox-pantheon StatefulSet and PVC
- Associated secrets and configmaps (deleted from cluster)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 01:15:17 -07:00
jordan
b9aa64f284 release: v0.10.20 - fix: Verify git remote matches before pulling in shared workspace 2026-01-31 00:48:09 -07:00
jordan
6405acb66a release: v0.10.19 - fix: Clear non-git workspace before cloning repository 2026-01-31 00:34:39 -07:00
jordan
823cae51c0 release: v0.10.18 - fix: Clone git repo before build execution to enable post-build git operations 2026-01-31 00:21:06 -07:00
jordan
072348451c release: v0.10.17 - feat: Programmatic post-build git operations via kubectl exec 2026-01-30 23:52:49 -07:00
jordan
b0fbeb4190 release: v0.10.16 - fix: Handle existing git repos during project creation 2026-01-30 23:28:18 -07:00
jordan
ece73d2b01 release: v0.10.15 - fix: Parse Claude stream-json subtype field instead of status for result messages 2026-01-29 23:46:41 -07:00
jordan
df77ec8c5c release: v0.10.14 - fix: Move prompt before flags in Claude Code CLI invocation 2026-01-29 23:34:00 -07:00
jordan
2d5136224a release: v0.10.13 - fix: Replace --dangerously-skip-permissions with --allowedTools for root compatibility 2026-01-29 23:27:24 -07:00
jordan
e9984ebc07 fix: Include stderr and troubleshooting help in Claude Code errors
When Claude fails to execute, error messages now include:
- Captured stderr output from the failed command
- Troubleshooting commands to exec into pod and run `claude login`

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 23:12:01 -07:00
jordan
4354f96351 release: v0.10.11 - fix: Persist build audit status when worker claims task 2026-01-29 21:25:50 -07:00
jordan
6b666914bc release: v0.10.10 - feat: Bulk file seeding for single-commit template creation 2026-01-29 17:04:08 -07:00
jordan
34e72687e6 feat: Complete automation gaps for repeatable project deployments
- Initial K8s deployment auto-creation during project creation
- DNS record upsert support (create or update existing records)
- Ingress host management for domain aliases (AddIngressHost/RemoveIngressHost)
- Woodpecker deployer RBAC manifest for CI deploy steps
- Single-commit template seeding via Gitea bulk file API

Closes automation gaps exposed during www.threesix.ai launch:
- Projects now auto-create K8s Deployment/Service/Ingress on creation
- Domain aliases automatically update both DNS and K8s ingress
- CI deploy steps work without manual RBAC setup
- Template seeding triggers only one CI pipeline (not per-file)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 15:18:31 -07:00
jordan
79b32ffa6c release: v0.10.9 - Fix TLS: use cluster-issuer for project deploys 2026-01-29 01:29:58 -07:00
jordan
aa6fa4ebdf release: v0.10.8 - Fix Kaniko plugin: use repo/tags format instead of destinations 2026-01-29 01:08:02 -07:00
jordan
ee2c0d6482 fix: Use repo/tags format for Kaniko plugin (not destinations)
The destinations format caused Kaniko to push images with the full
registry URL as part of the repo path (registry.threesix.ai/name
instead of just name). Using registry + repo + tags format pushes
to the correct path.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 01:07:49 -07:00
jordan
e1d84f3398 release: v0.10.7 - Fix registry hostname: use registry.threesix.ai instead of nonexistent zot.orchard9.ai 2026-01-29 00:01:58 -07:00
jordan
5a7b9342c6 fix: Use registry.threesix.ai instead of nonexistent zot.orchard9.ai
The templates referenced zot.orchard9.ai which has no DNS record.
The actual zot registry is at registry.threesix.ai. Also updated
static templates to use Kaniko plugin instead of docker:24-dind.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 00:01:48 -07:00
jordan
173d461027 release: v0.10.6 - Fix ensureNamespace RBAC failure, add namespace/secrets permissions to deployer ClusterRole 2026-01-28 21:34:53 -07:00
jordan
043cc8c63b fix: ensureNamespace uses Get-then-Create to avoid RBAC failures
The deployer was blindly calling Namespaces().Create() which triggered
cluster-scope RBAC checks even when the namespace already existed.
Now checks with Get() first and only creates if NotFound.

Also adds namespace get/create and secrets create/update/patch
permissions to the rdev-api-deployer ClusterRole.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 21:34:32 -07:00
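The Get-then-Create pattern from the commit above, sketched against a hypothetical `NamespaceClient` interface with an in-memory fake. The real deployer presumably uses client-go's `Namespaces().Get()` and `Namespaces().Create()` and checks for a NotFound API error; the interface, the `ErrNotFound` sentinel, and the fake below are stand-ins for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel standing in for a NotFound API error.
var ErrNotFound = errors.New("namespace not found")

// NamespaceClient is a hypothetical, reduced view of a namespace API.
type NamespaceClient interface {
	Get(ctx context.Context, name string) error
	Create(ctx context.Context, name string) error
}

// EnsureNamespace only calls Create when Get reports that the namespace is
// missing, so existing namespaces never trigger a create permission check.
func EnsureNamespace(ctx context.Context, c NamespaceClient, name string) error {
	err := c.Get(ctx, name)
	if err == nil {
		return nil // already exists, nothing to do
	}
	if !errors.Is(err, ErrNotFound) {
		return fmt.Errorf("get namespace %q: %w", name, err)
	}
	if err := c.Create(ctx, name); err != nil {
		return fmt.Errorf("create namespace %q: %w", name, err)
	}
	return nil
}

// fakeClient is an in-memory stand-in used only for this example.
type fakeClient struct {
	existing map[string]bool
	creates  int
}

func (f *fakeClient) Get(_ context.Context, name string) error {
	if f.existing[name] {
		return nil
	}
	return ErrNotFound
}

func (f *fakeClient) Create(_ context.Context, name string) error {
	f.creates++
	f.existing[name] = true
	return nil
}

func main() {
	c := &fakeClient{existing: map[string]bool{"demo": true}}
	ctx := context.Background()
	fmt.Println(EnsureNamespace(ctx, c, "demo"), EnsureNamespace(ctx, c, "new"))
	fmt.Println("create calls:", c.creates)
}
```

Get followed by Create is not atomic. If another actor creates the namespace between the two calls, Create can still fail, so a production version may also want to tolerate an "already exists" error.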
jordan
1adffbd50e release: v0.10.5 - Use Woodpecker Kaniko plugin with destinations format 2026-01-28 21:23:37 -07:00
jordan
fb994269c9 release: v0.10.4 - Simplify Kaniko templates for anonymous zot registry 2026-01-28 18:47:39 -07:00
jordan
a14606e9c9 release: v0.10.3 - Update templates to use Kaniko for rootless builds (no privileged mode) 2026-01-28 18:44:31 -07:00
jordan
9e3c1c3806 release: v0.10.2 - Fix: Expose pipeline errors in API response (privileged mode trust issue) 2026-01-28 18:36:31 -07:00
jordan
823d45f22c release: v0.10.1 - Expose Woodpecker pipeline errors in API response 2026-01-28 16:16:52 -07:00
jordan
d040e7b97f release: v0.10.0 - Add multi-domain support with auto-generated slugs for landing page cookbook 2026-01-28 12:56:36 -07:00
jordan
89b832ce0d release: v0.9.9 - Upgrade to Woodpecker SDK v3 for API compatibility 2026-01-28 09:48:20 -07:00
jordan
f82c5f50a7 release: v0.9.8 - Fix Woodpecker: use RepoListOpts(true) to find inactive repos 2026-01-28 09:27:14 -07:00
jordan
f0f1b03ec0 release: v0.9.7 - Fix Woodpecker SDK bug: nil out targetRepo on RepoLookup error 2026-01-28 00:18:28 -07:00
jordan
b91f6d6921 release: v0.9.6 - Increase Woodpecker sync retry to 45s (15 attempts * 3s) 2026-01-27 23:34:46 -07:00
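The retry budget in the subject above is simple arithmetic: 15 attempts at a 3 s interval gives roughly 45 s of waiting. A later release (v0.10.22) reduces the attempt count to 5 to stay under a proxy timeout. A fixed-interval retry loop of that shape might look like the sketch below; `Retry` and `errNotReady` are hypothetical names, not the project's API.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error for "forge metadata not yet synced".
var errNotReady = errors.New("repository not yet synced")

// Retry calls fn up to attempts times, sleeping interval between failed
// attempts. It returns nil on the first success.
func Retry(attempts int, interval time.Duration, fn func() error) error {
	if attempts < 1 {
		return errors.New("attempts must be at least 1")
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(interval) // no sleep after the final attempt
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	// Upper bound quoted in the commit subject: 15 attempts * 3s.
	fmt.Println("budget:", 15*3*time.Second)

	calls := 0
	err := Retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Println(calls, err)
}
```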
jordan
8e1d90b9f6 release: v0.9.5 - Fix Woodpecker CI: retry when forge metadata not yet synced 2026-01-27 23:32:45 -07:00
jordan
e81055d27b release: v0.9.4 - Fix project creation: empty repo seeding and Woodpecker sync retry 2026-01-27 23:30:37 -07:00
jordan
39df51defd feat: Add multi-provider code agent interface with Claude Code and OpenCode adapters
Implements weeks 1-4 of the multi-provider architecture:

Week 1 - Foundation:
- Add domain models (AgentProvider, AgentRequest, AgentEvent, AgentResult)
- Define CodeAgent port interface with Execute, Cancel, Capabilities
- Create thread-safe provider registry with first-registered default

Week 2 - Claude Code Adapter:
- Extract kubectl exec logic into CodeAgent implementation
- Parse stream-json output format (init, message, tool_use, result)
- Support session continuation via --resume flag

Week 3 - OpenCode Adapter:
- HTTP/SSE client for opencode serve API
- Session management (create, send message, abort)
- Event streaming with documented buffer rationale

Week 4 - Quality & Polish:
- Fix race condition in OpenCode Cancel method
- Add AgentRequest.Validate() with ErrPromptRequired, ErrInvalidTimeout
- Document DefaultAvailabilityTimeout constants
- Add HTTP error context for debugging

Also includes:
- Work queue system with PostgreSQL adapter
- Credential store for infrastructure secrets
- Project templates with Woodpecker CI integration
- Comprehensive test coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 09:25:51 -07:00
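A "thread-safe provider registry with first-registered default", as listed under Week 1 above, could be sketched as follows. The commit names the port's methods (Execute, Cancel, Capabilities) but not their signatures, so the `CodeAgent` interface here is reduced to a single hypothetical method, and `Registry`, `stubAgent`, and the error values are illustrative names only.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// CodeAgent is a reduced, hypothetical version of the port interface.
type CodeAgent interface {
	Capabilities() []string
}

var (
	ErrEmptyName         = errors.New("provider name required")
	ErrAlreadyRegistered = errors.New("provider already registered")
	ErrUnknownProvider   = errors.New("unknown provider")
)

// Registry maps provider names to agents and is safe for concurrent use.
type Registry struct {
	mu          sync.RWMutex
	agents      map[string]CodeAgent
	defaultName string
}

func NewRegistry() *Registry {
	return &Registry{agents: make(map[string]CodeAgent)}
}

// Register adds an agent. The first registered provider becomes the default.
func (r *Registry) Register(name string, a CodeAgent) error {
	if name == "" {
		return ErrEmptyName
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.agents[name]; ok {
		return fmt.Errorf("%w: %s", ErrAlreadyRegistered, name)
	}
	r.agents[name] = a
	if r.defaultName == "" {
		r.defaultName = name
	}
	return nil
}

// Get returns the named agent, or the default when name is empty.
func (r *Registry) Get(name string) (CodeAgent, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if name == "" {
		name = r.defaultName
	}
	a, ok := r.agents[name]
	if !ok {
		return nil, fmt.Errorf("%w: %q", ErrUnknownProvider, name)
	}
	return a, nil
}

// stubAgent is a placeholder implementation for the example.
type stubAgent struct{ caps []string }

func (s stubAgent) Capabilities() []string { return s.caps }

func main() {
	r := NewRegistry()
	_ = r.Register("claude-code", stubAgent{caps: []string{"resume"}})
	_ = r.Register("opencode", stubAgent{caps: []string{"sse"}})
	a, err := r.Get("") // empty name selects the first-registered provider
	fmt.Println(a.Capabilities(), err)
}
```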
jordan
72d16929ca feat: Implement hexagonal architecture with services, webhooks, queue, and telemetry
Major refactoring to hexagonal (ports & adapters) architecture:

- Add service layer (apikey_service, project_service) for business logic
- Add webhook system with dispatcher and delivery tracking
- Add command queue with priority-based processing
- Add rate limiting with sliding window algorithm
- Add audit logging for command execution
- Add OpenTelemetry integration (traces, metrics, spans)
- Add circuit breaker for fault tolerance
- Add cached repository wrapper for performance
- Add comprehensive validation package
- Add Kubernetes client integration for pod management
- Add database migrations (allowed_ips, audit_log, rate_limiting, queue, webhooks)
- Add network policy and PodDisruptionBudget for k8s
- Remove legacy executor and projects/registry packages
- Untrack secrets.yaml (now managed via envault)
- Add coverage.out to .gitignore
- Add e2e test infrastructure with docker-compose
- Add comprehensive documentation (API, architecture, operations, plans)
- Add golangci-lint config and pre-commit hook

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 19:57:46 -07:00
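The commit above mentions rate limiting with a sliding window algorithm without further detail. One common variant is a sliding-window log, which keeps recent request timestamps per key and rejects a request once the window already holds `limit` entries. The sketch below shows that variant under assumed names (`SlidingWindow`, `Allow`, `AllowAt`); the project's actual implementation and storage are not shown in the log.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// SlidingWindow is a sliding-window-log limiter: at most limit requests per
// key within any trailing window of the given length.
type SlidingWindow struct {
	mu     sync.Mutex
	limit  int
	window time.Duration
	hits   map[string][]time.Time
}

func NewSlidingWindow(limit int, window time.Duration) *SlidingWindow {
	return &SlidingWindow{limit: limit, window: window, hits: make(map[string][]time.Time)}
}

// AllowAt reports whether key may make a request at time now and records the
// request if it is allowed. Taking now as a parameter keeps it testable.
func (s *SlidingWindow) AllowAt(key string, now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	cutoff := now.Add(-s.window)
	kept := s.hits[key][:0] // filter in place, dropping expired timestamps
	for _, t := range s.hits[key] {
		if t.After(cutoff) {
			kept = append(kept, t)
		}
	}
	if len(kept) >= s.limit {
		s.hits[key] = kept
		return false
	}
	s.hits[key] = append(kept, now)
	return true
}

// Allow is AllowAt using the current wall-clock time.
func (s *SlidingWindow) Allow(key string) bool {
	return s.AllowAt(key, time.Now())
}

func main() {
	lim := NewSlidingWindow(2, time.Minute)
	for i := 0; i < 3; i++ {
		fmt.Println(lim.Allow("api-key-1")) // third call exceeds the limit
	}
}
```

The log variant is exact but stores one timestamp per request; a counter-based approximation trades accuracy for memory.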
jordan
538ea57ed4 feat: Add claude-config API, security hardening, and testing infrastructure
Claude Config API (v0.6):
- Add CRUD endpoints for commands, skills, and agents
- Commands/skills/agents stored in /workspace/.claude/ (per-project, in git)
- Credentials shared via PVC at /root/.claude/ (shared across pods)
- Use base64 encoding for file writes (prevents shell injection)
- Add content size limits (1MB max)

Security Hardening:
- Add sanitize package for command/prompt validation
- Add rate limiting middleware (token bucket algorithm)
- Add concurrent command limiting
- Add input sanitization to all command handlers
- Gitignore secrets.yaml and credentials.yaml
- Add *.example templates for secrets

Testing Infrastructure:
- Add testutil package with mocks and fixtures
- Add unit tests for auth package (63% coverage)
- Add unit tests for executor (47% coverage)
- Add handler integration tests (40% coverage)
- Add 100% coverage for sanitize, cmdlimit packages
- Add 96% coverage for ratelimit package

Infrastructure:
- Shared Claude credentials PVC (ReadWriteMany)
- Reduced workspace PVC size from 20Gi to 5Gi
- Add init container cleanup before git clone
- Document Longhorn RWX requirements

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 01:29:13 -07:00
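The "base64 encoding for file writes" item above can be illustrated as follows: the content is encoded before it is placed in a shell command, so it only ever appears as characters from the base64 alphabet and cannot be interpreted by the shell. This is a sketch under assumptions: `WriteFileCommand` and `shellQuote` are hypothetical helpers, the exact command the project runs is not shown in the log, `base64 -d` is the GNU/BusyBox spelling, and the 1MB limit is interpreted here as 1 MiB.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// maxContentBytes interprets the commit's "1MB max" as 1 MiB (assumption).
const maxContentBytes = 1 << 20

var ErrTooLarge = errors.New("content exceeds size limit")

// shellQuote wraps s in single quotes, escaping embedded single quotes, so a
// POSIX shell treats it as one literal word.
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// WriteFileCommand (hypothetical) returns a shell command that writes content
// to path. The content is base64-encoded, so shell metacharacters in it are
// never seen by the shell; the path is quoted separately.
func WriteFileCommand(path string, content []byte) (string, error) {
	if len(content) > maxContentBytes {
		return "", ErrTooLarge
	}
	encoded := base64.StdEncoding.EncodeToString(content)
	return fmt.Sprintf("printf %%s '%s' | base64 -d > %s", encoded, shellQuote(path)), nil
}

func main() {
	cmd, err := WriteFileCommand("/workspace/.claude/commands/deploy.md", []byte("$(rm -rf /)"))
	fmt.Println(cmd, err)
}
```

The path still needs its own quoting or validation, since only the content is protected by the encoding.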
jordan
fa66a69120 fix: Defer health endpoints to Run() for proper middleware ordering
Chi requires middleware to be defined before routes. Moved
setupHealthEndpoints() from New() to Run() to allow callers to
add middleware before routes are registered.

Also:
- Updated rdev-api.yaml with DB env vars, RBAC, ServiceAccount
- Added Dockerfile.api.simple for pre-built binary deployment

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 23:28:54 -07:00
jordan
0960b17eb2 feat: Implement v0.2-v0.4 (workspaces, git, API)
v0.2 - Real Workspaces:
- Project-specific claudebox StatefulSets (pantheon, aeries)
- Init containers for git clone via SSH
- Deploy key secrets template
- Project ConfigMaps for CLAUDE.md

v0.3 - Git Integration:
- Dockerfile with rdev-bot git identity
- openssh-client for SSH operations
- Image version bump to v0.3.0

v0.4 - API Server:
- Go REST API with chi router
- Endpoints: /projects, /claude, /shell, /git, /events
- SSE streaming for real-time output
- OpenAPI docs via Scalar at /docs
- Kubernetes RBAC for pod exec
- Executor and project registry packages

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 21:07:00 -07:00
jordan
d4eb41589f fix: Use ghcr.io and build for amd64
- Switch from GCP Artifact Registry to GitHub Container Registry
- Build images for linux/amd64 (k3s node architecture)
- Use PVC for Claude config instead of secret (auth persists across restarts)
- Remove credential secret dependency

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 20:04:34 -07:00
jordan
17aeb1c25b Initial commit: rdev v0.1 base case
- Dockerfile for claudebox with Claude Code CLI
- Kustomize manifests for k3s deployment
- Scripts for credentials, deploy, and verify
- README with quick start guide

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 19:24:07 -07:00