sp4-otel-1770474553/.claude/skills/microservices/SKILL.md
jordan 280afc64f6
Initialize project from skeleton template
2026-02-07 14:29:14 +00:00


---
name: microservices
description: Inter-service communication patterns using pkg/svc for service discovery and circuit breaker protection. Use when implementing service-to-service calls.
---
# Microservices Communication
## Identity
You are a distributed systems engineer who understands the pitfalls of microservice communication. You prioritize resilience, observability, and graceful degradation over feature velocity.
## Service Discovery
Services discover siblings via environment variables injected automatically by the platform.
### How It Works
When a component is deployed, it receives env vars for all sibling services:
- `AUTH_SVC_URL=http://myproject-auth-svc:8001`
- `CHAT_SVC_URL=http://myproject-chat-svc:8002`
The naming convention is `{COMPONENT_NAME}_URL`, where `COMPONENT_NAME` is the component name in UPPER_SNAKE_CASE: `auth-svc` becomes `AUTH_SVC_URL`.
### Using pkg/svc
```go
import "git.threesix.ai/jordan/sp4-otel-1770474553/pkg/svc"

// Simple lookup: returns "" when the service is not configured
url := svc.ServiceURL("auth-svc")
if url == "" {
	// Service not configured
}

// Check availability
if svc.ServiceConfigured("auth-svc") {
	// Safe to call
}

// For required dependencies (panics if missing)
authURL := svc.MustServiceURL("auth-svc")
```
## Service Client
Use `svc.NewClient()` for a pre-configured HTTP client with circuit breaker protection.
### Basic Usage
```go
import (
	"errors"
	"fmt"

	"git.threesix.ai/jordan/sp4-otel-1770474553/pkg/httpclient"
	"git.threesix.ai/jordan/sp4-otel-1770474553/pkg/svc"
)

// Create client (returns error if service not configured)
authClient, err := svc.NewClient("auth-svc")
if err != nil {
	return fmt.Errorf("auth service unavailable: %w", err)
}

// Make requests
resp, err := authClient.Get(ctx, "/users/123")
if err != nil {
	if errors.Is(err, httpclient.ErrCircuitOpen) {
		// Circuit breaker is open - service is unhealthy
		return ErrAuthServiceDown
	}
	return fmt.Errorf("auth request failed: %w", err)
}
defer resp.Body.Close()

// JSON POST (new variable name: resp and err are already declared above)
validateResp, err := authClient.Post(ctx, "/validate", ValidateRequest{Token: token})
```
### Custom Configuration
```go
client, err := svc.NewClientWithConfig("auth-svc", svc.ClientConfig{
	Timeout:    5 * time.Second, // Shorter timeout for fast-fail
	MaxRetries: 2,               // Fewer retries
	CircuitBreaker: &httpclient.CircuitBreakerConfig{
		FailureThreshold: 3, // Open after 3 failures
		ResetTimeout:     15 * time.Second,
	},
})
```
## Circuit Breaker
The circuit breaker prevents cascading failures by failing fast when a service is unhealthy.
### States
| State | Behavior |
|-------|----------|
| **Closed** | Normal operation, requests pass through |
| **Open** | Blocks all requests, returns `ErrCircuitOpen` immediately |
| **Half-Open** | Allows one test request to check if service recovered |
### Default Thresholds
- Opens after 5 consecutive failures
- Waits 30s before attempting recovery (half-open)
- Closes after one successful request in half-open state
### What Affects Circuit State
The circuit breaker tracks **transient failures** only:
| Response | Affects Circuit? | Reason |
|----------|-----------------|--------|
| HTTP 2xx/3xx | ✅ RecordSuccess | Service is healthy |
| HTTP 5xx | ✅ RecordFailure | Server error - transient |
| HTTP 429 | ✅ RecordFailure | Rate limited - transient |
| HTTP 4xx (except 429) | ❌ No effect | Client error - not service's fault |
| Network error | ✅ RecordFailure | Connection failed |
| Context cancelled | ❌ No effect | User/caller initiated |
| Timeout | ✅ RecordFailure | Service too slow |
**Key insight:** 4xx responses (bad requests, not found, unauthorized) don't trip the circuit because they indicate a problem with the request, not the service. A service returning 400s is still "healthy" from a circuit breaker perspective.
### Handling Circuit Open
```go
resp, err := authClient.Get(ctx, "/users/123")
if errors.Is(err, httpclient.ErrCircuitOpen) {
	// Pick ONE of the strategies below; they are alternatives, not a sequence.

	// Option 1: Return degraded response
	return CachedUserData(userID)

	// Option 2: Propagate as service unavailable
	// return nil, ErrServiceTemporarilyUnavailable

	// Option 3: Use fallback service
	// return fallbackClient.Get(ctx, "/users/123")
}
```
## Patterns
### Initialization Pattern
Initialize service clients at startup, not on-demand:
```go
type Server struct {
	authClient *svc.Client
	chatClient *svc.Client
}

func NewServer() (*Server, error) {
	authClient, err := svc.NewClient("auth-svc")
	if err != nil {
		return nil, fmt.Errorf("auth service required: %w", err)
	}

	// Optional dependency - check but don't fail
	var chatClient *svc.Client
	if svc.ServiceConfigured("chat-svc") {
		chatClient, _ = svc.NewClient("chat-svc")
	}

	return &Server{
		authClient: authClient,
		chatClient: chatClient,
	}, nil
}
```
### Response Decoding
```go
type User struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

resp, err := authClient.Get(ctx, "/users/123")
if err != nil {
	return nil, err
}

user, err := svc.DecodeResponse[User](resp)
if err != nil {
	return nil, fmt.Errorf("decode user: %w", err)
}
```
### Graceful Degradation
```go
func (s *Server) GetUserProfile(ctx context.Context, userID string) (*Profile, error) {
	// Required call
	user, err := s.fetchUser(ctx, userID)
	if err != nil {
		return nil, err
	}

	profile := &Profile{User: user}

	// Optional enrichment - don't fail if chat service is down
	if s.chatClient != nil {
		messages, err := s.fetchRecentMessages(ctx, userID)
		if err != nil {
			s.logger.Warn("failed to fetch messages", "error", err)
			// Continue without messages
		} else {
			profile.RecentMessages = messages
		}
	}

	return profile, nil
}
```
## Anti-Patterns
### Hardcoded URLs
```go
// BAD: Hardcoded URLs break when services move
client := httpclient.New(httpclient.Config{})
resp, err := client.Get(ctx, "http://auth-svc:8001/users")
// GOOD: Use service discovery
authClient, _ := svc.NewClient("auth-svc")
resp, err := authClient.Get(ctx, "/users")
```
### Ignoring Circuit Breaker Errors
```go
// BAD: Retrying forever when circuit is open
for {
	resp, err := authClient.Get(ctx, "/users")
	if err != nil {
		time.Sleep(time.Second)
		continue
	}
}

// GOOD: Detect circuit open and handle gracefully
resp, err := authClient.Get(ctx, "/users")
if errors.Is(err, httpclient.ErrCircuitOpen) {
	return nil, ErrServiceUnavailable
}
```
### On-Demand Client Creation
```go
// BAD: Creating client on every request
func (h *Handler) GetUser(w http.ResponseWriter, r *http.Request) {
	client, _ := svc.NewClient("auth-svc") // Wastes resources
	// ...
}

// GOOD: Reuse client instance
type Handler struct {
	authClient *svc.Client
}

func (h *Handler) GetUser(w http.ResponseWriter, r *http.Request) {
	resp, _ := h.authClient.Get(r.Context(), "/users")
	// ...
}
```
### Silent Failures
```go
// BAD: Swallowing errors
resp, _ := authClient.Get(ctx, "/validate")
if resp != nil && resp.StatusCode == 200 {
	// Assume success
}

// GOOD: Explicit error handling
resp, err := authClient.Get(ctx, "/validate")
if err != nil {
	return fmt.Errorf("auth validation: %w", err)
}
if resp.StatusCode != http.StatusOK {
	return fmt.Errorf("auth validation failed: %d", resp.StatusCode)
}
```
## Checklist
When implementing inter-service calls:
- [ ] Use `svc.NewClient()` instead of raw HTTP clients
- [ ] Handle `ErrCircuitOpen` explicitly
- [ ] Initialize clients at startup, not on-demand
- [ ] Log service call failures with context
- [ ] Consider graceful degradation for optional dependencies
- [ ] Set appropriate timeouts (shorter than HTTP handler timeout)
- [ ] Propagate trace IDs for distributed tracing
## Files
| File | Purpose |
|------|---------|
| `pkg/svc/discovery.go` | Service URL lookup from env vars |
| `pkg/svc/client.go` | Pre-configured service client |
| `pkg/httpclient/circuit.go` | Circuit breaker implementation |
| `pkg/httpclient/client.go` | HTTP client with retries |