jordan
f20fc6c51c
feat(saga): implement enterprise-grade resilience architecture
...
ci/woodpecker/push/woodpecker Pipeline failed
Fixes issues from code review of resilience implementation:
- Wire saga system in main.go (SagaRepository, SagaExecutor, SagaHandler)
- Fix CompletedSteps() to include skipped steps for dependency resolution
- Fix reverse loop bug in saga compensation (use standard swap pattern)
- Add circuit breaker state change callbacks for Prometheus metrics
Phase 1 (Build Resilience):
- Add failure:retry to all component Kaniko build steps
- Add preflight registry health check before builds
- Add services-deployed sync point to decouple docs from critical path
Phase 2 (API Resilience):
- Add pipeline retry endpoint (POST /projects/{id}/pipelines/{number}/retry)
- Wire circuit breakers with metrics callbacks
- Add /health/circuits endpoint for circuit breaker status
Phase 3 (Saga Engine):
- Full domain model (Saga, SagaStep, RetryPolicy, BackoffType)
- PostgreSQL saga repository with CRUD and step management
- Saga executor with retry, compensation, skip step support
- Saga API handlers with CRUD and control operations
Phase 4 (Observability):
- Add saga metrics (total, step_duration, retry, circuit_breaker_state)
- Add logging fields (saga_id, saga_name, step_name)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 01:58:02 -07:00
jordan
210064d490
feat: add diagnostics endpoint and external health monitoring
...
- Add /diagnostics endpoint for system health overview
- Add external health worker for monitoring Gitea, Woodpecker, Registry
- Add health check methods to Gitea and Woodpecker clients
- Remove hardcoded fallback projects (pantheon, aeries)
- Add diagnostics domain types and service layer
- Add comprehensive tests for diagnostics handler and service
- Fix tests to use registered test project instead of hardcoded one
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 19:10:56 -07:00
jordan
9a1309a0c5
feat: fix composable monorepo CI builds + health endpoint improvements
...
Composable monorepo CI fixes:
- Add empty go.sum.tmpl files for pkg, service, worker, and cli components
- Fix Dockerfile.tmpl glob patterns (COPY go.work.sum* is invalid in Kaniko)
- Add deps step to CI that runs go work sync and go mod tidy before builds
- Fix scalar-go dependency version (v0.1.2 doesn't exist, use v0.13.0)
Health endpoint improvements:
- Add registry health check (zot OCI /v2/ endpoint)
- Add health metrics for CI, registry, and Git
- Add /health/ci endpoint for Woodpecker health
Visual verification scaffolding:
- Add Playwright pod and scripts ConfigMap
- Add vision.md and implementation breakdown plan
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 18:46:51 -07:00
jordan
c59d348040
chore: prepare for composable monorepo template implementation
...
This commit captures the current state before implementing the composable
monorepo template system. Key changes included:
Infrastructure:
- Add CockroachDB provisioner adapter for database provisioning
- Add Redis provisioner adapter for cache provisioning
- Add build events system with PostgreSQL storage
- Add WebSocket endpoint for real-time build progress
Code agent improvements:
- Fix Claude Code adapter to use default allowed tools instead of dangerously-skip-permissions
- Add context-aware stream closing for cancellation support
- Improve parser tests for edge cases
Build system:
- Add build event constants and metrics
- Remove deprecated git_operations.go (replaced by pod_git_operations.go)
- Add rollback logic for multi-step provisioning operations
Documentation:
- Add composable-monorepo feature documentation
- Add DNS/Cloudflare service documentation
- Update deployment and troubleshooting guides
Cookbooks:
- Add fullstack-app cookbook
- Refactor landing-test with shared library
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 11:39:28 -07:00
jordan
bc47e426b0
feat: Add CI pipeline proxy, DNS alias management, and worker executor system
...
- Add ListPipelines/GetPipeline to CIProvider port with Woodpecker adapter
- Add DNS alias endpoints: GET/POST/DELETE /projects/{id}/domains
- Implement worker executor daemon, build executor, and git operations
- Add build service, worker service, and build audit tracking
- Add worker registry with PostgreSQL adapter and migration
- Add multi-provider code agent interface (Claude Code + OpenCode)
- Add create-and-build combo endpoint
- Update landing-page cookbook to reflect all gaps closed
- Fix tech debt: unified validation, auth scopes, error wrapping, slog patterns
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 21:05:28 -07:00
jordan
72d16929ca
feat: Implement hexagonal architecture with services, webhooks, queue, and telemetry
...
Major refactoring to hexagonal (ports & adapters) architecture:
- Add service layer (apikey_service, project_service) for business logic
- Add webhook system with dispatcher and delivery tracking
- Add command queue with priority-based processing
- Add rate limiting with sliding window algorithm
- Add audit logging for command execution
- Add OpenTelemetry integration (traces, metrics, spans)
- Add circuit breaker for fault tolerance
- Add cached repository wrapper for performance
- Add comprehensive validation package
- Add Kubernetes client integration for pod management
- Add database migrations (allowed_ips, audit_log, rate_limiting, queue, webhooks)
- Add network policy and PodDisruptionBudget for k8s
- Remove legacy executor and projects/registry packages
- Untrack secrets.yaml (now managed via envault)
- Add coverage.out to .gitignore
- Add e2e test infrastructure with docker-compose
- Add comprehensive documentation (API, architecture, operations, plans)
- Add golangci-lint config and pre-commit hook
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 19:57:46 -07:00