External Health Checker
Last Updated: 2026-02-03 Confidence: High
Summary
Background worker that continuously monitors external systems (registry, CI, git) and surfaces issues proactively via metrics, logs, and the /ready endpoint. Runs every 30s, caches results for instant lookups, and logs state transitions.
Key Facts:
- Monitors: registry (zot), ci (woodpecker), git (gitea)
- Check interval: 30 seconds (configurable)
- Caches results for the /ready endpoint (no blocking network calls)
- Logs only on state changes (healthy→unhealthy, unhealthy→healthy)
- Preserves the LastHealthy timestamp through unhealthy periods
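The last two facts (log only on transitions, keep LastHealthy while a system is down) can be sketched as a single pure function. This is a minimal illustration, not the code in internal/domain/external_health.go: the Status fields other than LastHealthy and the Next helper are hypothetical names.

```go
package main

import "time"

// Status is a hypothetical stand-in for the domain type; only the
// LastHealthy name comes from the document, the rest is assumed.
type Status struct {
	System      string
	Healthy     bool
	LastChecked time.Time
	LastHealthy time.Time // zero until the first successful check
}

// Next folds a fresh check result into the previous status. It returns
// the updated status and whether the healthy flag flipped, which is the
// only case where a caller would write a log line.
// A zero-value prev counts as unhealthy, so a first successful check
// is reported as a transition.
func Next(prev Status, healthy bool, now time.Time) (Status, bool) {
	next := Status{
		System:      prev.System,
		Healthy:     healthy,
		LastChecked: now,
		LastHealthy: prev.LastHealthy, // carried through unhealthy periods
	}
	if healthy {
		next.LastHealthy = now
	}
	return next, prev.Healthy != healthy
}
```

Keeping the transition decision separate from the logging call makes the "log once per change" rule easy to test without capturing log output.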
File Pointers:
- Domain types: internal/domain/external_health.go
- Worker implementation: internal/worker/external_health.go
- Port interface: internal/port/health.go:ExternalHealthChecker
- Handler integration: internal/handlers/health.go:WithExternalHealthChecker
- Wiring: cmd/rdev-api/main.go:433-455
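A rough sketch of how the port and the handler option named above might fit together. Only the names ExternalHealthChecker and WithExternalHealthChecker come from this document. The method set, the HealthHandler type, StaticStatuses, readyCode, and the choice to answer 503 when any system is unhealthy are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// ExternalHealthChecker mirrors the port name; the method is assumed.
type ExternalHealthChecker interface {
	// Statuses returns cached health per system without doing I/O.
	Statuses() map[string]bool
}

// HealthHandler is a hypothetical handler type.
type HealthHandler struct {
	external ExternalHealthChecker // nil when no checker is wired in
}

// Option configures a HealthHandler.
type Option func(*HealthHandler)

// WithExternalHealthChecker attaches the checker whose cache /ready reads.
func WithExternalHealthChecker(c ExternalHealthChecker) Option {
	return func(h *HealthHandler) { h.external = c }
}

func NewHealthHandler(opts ...Option) *HealthHandler {
	h := &HealthHandler{}
	for _, opt := range opts {
		opt(h)
	}
	return h
}

// readyCode maps cached statuses to an HTTP status code (assumed policy:
// any unhealthy system makes the service not ready).
func readyCode(statuses map[string]bool) int {
	for _, healthy := range statuses {
		if !healthy {
			return http.StatusServiceUnavailable
		}
	}
	return http.StatusOK
}

// Ready serves /ready from cached data only.
func (h *HealthHandler) Ready(w http.ResponseWriter, r *http.Request) {
	statuses := map[string]bool{}
	if h.external != nil {
		statuses = h.external.Statuses()
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(readyCode(statuses))
	_ = json.NewEncoder(w).Encode(statuses)
}

// StaticStatuses is a fixed map that satisfies the port, handy in tests.
type StaticStatuses map[string]bool

func (s StaticStatuses) Statuses() map[string]bool { return s }
```

Because the handler depends on the interface rather than the worker, it can be exercised with a fixed map and never needs a live registry, CI, or git server.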
How It Works
- Background goroutine polls all configured external systems every 30s
- Checks run in parallel with 10s timeout per system
- Results cached in thread-safe map
- /ready reads cached statuses (no network calls)
- Prometheus metrics updated on each check cycle
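The polling loop described above can be sketched with the standard library alone. This is an illustration under assumptions, not the contents of internal/worker/external_health.go: Checker, Result, Worker, CheckOnce, staticChecker, and errDown are hypothetical names, and the interval and timeout are passed in rather than hardcoded.

```go
package main

import (
	"context"
	"errors"
	"sync"
	"time"
)

// Checker is a hypothetical shape for one external system probe.
type Checker interface {
	Name() string
	Check(ctx context.Context) error
}

// Result is the cached outcome of the most recent check for one system.
type Result struct {
	Healthy bool
	Latency time.Duration
	Err     error
}

// Worker polls every checker and keeps the latest results in memory.
type Worker struct {
	checkers []Checker
	interval time.Duration // 30s in the description above
	timeout  time.Duration // 10s per system in the description above

	mu      sync.RWMutex
	results map[string]Result
}

func NewWorker(interval, timeout time.Duration, checkers ...Checker) *Worker {
	return &Worker{
		checkers: checkers,
		interval: interval,
		timeout:  timeout,
		results:  make(map[string]Result),
	}
}

// checkAll runs all checks concurrently, each under its own deadline,
// and returns once every result has been written to the cache.
func (w *Worker) checkAll(ctx context.Context) {
	var wg sync.WaitGroup
	for _, c := range w.checkers {
		wg.Add(1)
		go func(c Checker) {
			defer wg.Done()
			cctx, cancel := context.WithTimeout(ctx, w.timeout)
			defer cancel()
			start := time.Now()
			err := c.Check(cctx)
			res := Result{Healthy: err == nil, Latency: time.Since(start), Err: err}
			w.mu.Lock()
			w.results[c.Name()] = res
			w.mu.Unlock()
		}(c)
	}
	wg.Wait()
}

// CheckOnce runs a single cycle, e.g. to warm the cache at startup.
func (w *Worker) CheckOnce() { w.checkAll(context.Background()) }

// Run checks immediately, then on every tick until ctx is cancelled.
func (w *Worker) Run(ctx context.Context) {
	ticker := time.NewTicker(w.interval)
	defer ticker.Stop()
	for {
		w.checkAll(ctx)
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}

// Snapshot copies the cache under a read lock; it never touches the network.
func (w *Worker) Snapshot() map[string]Result {
	w.mu.RLock()
	defer w.mu.RUnlock()
	out := make(map[string]Result, len(w.results))
	for name, res := range w.results {
		out[name] = res
	}
	return out
}

// staticChecker is a stand-in probe that returns a fixed result.
type staticChecker struct {
	name string
	err  error
}

func (s staticChecker) Name() string                { return s.name }
func (s staticChecker) Check(context.Context) error { return s.err }

var errDown = errors.New("system down")
```

Returning a copy from Snapshot means a reader such as /ready holds no lock while it serializes the response, so a slow client cannot stall the next check cycle.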
Adapter implementations:
- Registry: internal/adapter/zot/client.go:Check() - calls the /v2/ endpoint
- CI: internal/adapter/woodpecker/client.go:Check() - calls the Self() API
- Git: internal/adapter/gitea/client.go:Check() - calls ListMyOrgs()
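As an illustration of the registry case, here is a minimal Check that probes /v2/ with net/http. The RegistryClient type, its fields, and healthyStatus are hypothetical; the real adapter in internal/adapter/zot/client.go may differ. Counting only 2xx as healthy is also an assumption: a registry that requires authentication may answer 401 on /v2/, which this sketch would report as unhealthy.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

// RegistryClient is a hypothetical minimal client for the health probe.
type RegistryClient struct {
	BaseURL string       // registry root URL, without a trailing slash
	HTTP    *http.Client // nil falls back to http.DefaultClient
}

// healthyStatus reports whether an HTTP status code counts as healthy.
func healthyStatus(code int) bool { return code >= 200 && code <= 299 }

// Check issues GET {BaseURL}/v2/ and returns nil only for a 2xx response.
// The caller's context carries the per-system timeout.
func (c *RegistryClient) Check(ctx context.Context) error {
	client := c.HTTP
	if client == nil {
		client = http.DefaultClient
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.BaseURL+"/v2/", nil)
	if err != nil {
		return fmt.Errorf("build request: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("registry unreachable: %w", err)
	}
	defer resp.Body.Close()
	if !healthyStatus(resp.StatusCode) {
		return fmt.Errorf("registry returned %s", resp.Status)
	}
	return nil
}
```

The CI and git checks follow the same pattern: make one cheap authenticated call and translate any error into an unhealthy result.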
Prometheus Metrics
| Metric | Type | Labels | Description |
|---|---|---|---|
| rdev_external_system_healthy | Gauge | system | 1=healthy, 0=unhealthy |
| rdev_external_system_latency_seconds | Gauge | system | Check latency |
| rdev_external_system_last_check_timestamp | Gauge | system | Unix timestamp of last check |
Related Topics
- Work Queue - Uses similar background worker pattern
- CI Provider - Woodpecker adapter details
- Worker Pool - Another background worker example