rdev/internal/port/saga.go
jordan f20fc6c51c
feat(saga): implement enterprise-grade resilience architecture
Fixes issues from code review of resilience implementation:

- Wire saga system in main.go (SagaRepository, SagaExecutor, SagaHandler)
- Fix CompletedSteps() to include skipped steps for dependency resolution
- Fix reverse loop bug in saga compensation (use standard swap pattern)
- Add circuit breaker state change callbacks for Prometheus metrics

Phase 1 (Build Resilience):
- Add failure:retry to all component Kaniko build steps
- Add preflight registry health check before builds
- Add services-deployed sync point to decouple docs from critical path

Phase 2 (API Resilience):
- Add pipeline retry endpoint (POST /projects/{id}/pipelines/{number}/retry)
- Wire circuit breakers with metrics callbacks
- Add /health/circuits endpoint for circuit breaker status

Phase 3 (Saga Engine):
- Full domain model (Saga, SagaStep, RetryPolicy, BackoffType)
- PostgreSQL saga repository with CRUD and step management
- Saga executor with retry, compensation, skip step support
- Saga API handlers with CRUD and control operations

Phase 4 (Observability):
- Add saga metrics (total, step_duration, retry, circuit_breaker_state)
- Add logging fields (saga_id, saga_name, step_name)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 01:58:02 -07:00


// Package port defines interfaces (ports) for external dependencies.
package port

import (
	"context"

	"github.com/orchard9/rdev/internal/domain"
)

// SagaRepository manages saga persistence.
type SagaRepository interface {
	// Create creates a new saga with its steps.
	Create(ctx context.Context, saga *domain.Saga) error

	// Get returns a saga by ID, including all steps.
	Get(ctx context.Context, id string) (*domain.Saga, error)

	// Update updates a saga's status and metadata (not steps).
	Update(ctx context.Context, saga *domain.Saga) error

	// UpdateStep updates a single step's status and output.
	UpdateStep(ctx context.Context, step *domain.SagaStep) error

	// List returns sagas matching the given filters.
	List(ctx context.Context, filters domain.SagaFilters) ([]*domain.Saga, error)

	// Delete removes a saga and its steps.
	Delete(ctx context.Context, id string) error

	// GetPendingSteps returns steps ready to execute (no unmet dependencies).
	GetPendingSteps(ctx context.Context, sagaID string) ([]domain.SagaStep, error)
}

// SagaExecutor executes saga workflows.
type SagaExecutor interface {
	// Execute runs a saga from the beginning.
	Execute(ctx context.Context, saga *domain.Saga) error

	// Resume continues execution of a paused or failed saga.
	Resume(ctx context.Context, sagaID string) error

	// Compensate runs compensation steps for a failed saga.
	Compensate(ctx context.Context, sagaID string) error

	// RetryStep retries a specific failed step.
	RetryStep(ctx context.Context, sagaID, stepName string) error

	// SkipStep skips a step and continues execution.
	SkipStep(ctx context.Context, sagaID, stepName string) error
}