---
name: microservices
description: Inter-service communication patterns using pkg/svc for service discovery and circuit breaker protection. Use when implementing service-to-service calls.
---

# Microservices Communication

## Identity

You are a distributed systems engineer who understands the pitfalls of microservice communication. You prioritize resilience, observability, and graceful degradation over feature velocity.

## Service Discovery

Services discover siblings via environment variables injected automatically by the platform.

### How It Works

When a component is deployed, it receives env vars for all sibling services:

- `AUTH_SVC_URL=http://myproject-auth-svc:8001`
- `CHAT_SVC_URL=http://myproject-chat-svc:8002`

The naming convention is `{COMPONENT_NAME}_URL`, where `COMPONENT_NAME` is the component name in UPPER_SNAKE_CASE (for example, `auth-svc` becomes `AUTH_SVC_URL`).

### Using pkg/svc

```go
import "git.threesix.ai/jordan/sp1-verify-1770320281/pkg/svc"

// Simple lookup
url := svc.ServiceURL("auth-svc")
if url == "" {
	// Service not configured
}

// Check availability
if svc.ServiceConfigured("auth-svc") {
	// Safe to call
}

// For required dependencies (panics if missing)
url = svc.MustServiceURL("auth-svc")
```

## Service Client

Use `svc.NewClient()` for a pre-configured HTTP client with circuit breaker protection.

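Client construction depends on the lookup described under Service Discovery. As a rough illustration of that naming convention (not the actual `pkg/svc` implementation), the sketch below maps a component name to its env var and reads it. The helper names `envVarName` and `lookupServiceURL` are hypothetical, and the sketch assumes hyphens are the only characters that need converting.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envVarName converts a component name such as "auth-svc" into the
// UPPER_SNAKE_CASE env var name "AUTH_SVC_URL".
// Hypothetical helper; pkg/svc may handle more cases.
func envVarName(component string) string {
	upper := strings.ToUpper(component)
	return strings.ReplaceAll(upper, "-", "_") + "_URL"
}

// lookupServiceURL returns the configured URL for a component and
// whether the env var was set to a non-empty value.
func lookupServiceURL(component string) (string, bool) {
	url := os.Getenv(envVarName(component))
	return url, url != ""
}

func main() {
	// Simulate what the platform injects at deploy time.
	os.Setenv("AUTH_SVC_URL", "http://myproject-auth-svc:8001")

	if url, ok := lookupServiceURL("auth-svc"); ok {
		fmt.Println(url)
	}
}
```

An empty or missing variable is treated as "not configured", which matches the `url == ""` check shown in the `svc.ServiceURL` example.
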
### Basic Usage

```go
import (
	"errors"
	"fmt"

	"git.threesix.ai/jordan/sp1-verify-1770320281/pkg/httpclient"
	"git.threesix.ai/jordan/sp1-verify-1770320281/pkg/svc"
)

// Create client (returns error if service not configured)
authClient, err := svc.NewClient("auth-svc")
if err != nil {
	return fmt.Errorf("auth service unavailable: %w", err)
}

// Make requests
resp, err := authClient.Get(ctx, "/users/123")
if err != nil {
	if errors.Is(err, httpclient.ErrCircuitOpen) {
		// Circuit breaker is open - service is unhealthy
		return ErrAuthServiceDown
	}
	return fmt.Errorf("auth request failed: %w", err)
}
defer resp.Body.Close()

// JSON POST (reuses resp and err declared above)
resp, err = authClient.Post(ctx, "/validate", ValidateRequest{Token: token})
```

### Custom Configuration

```go
client, err := svc.NewClientWithConfig("auth-svc", svc.ClientConfig{
	Timeout:    5 * time.Second, // Shorter timeout for fast-fail
	MaxRetries: 2,               // Fewer retries
	CircuitBreaker: &httpclient.CircuitBreakerConfig{
		FailureThreshold: 3, // Open after 3 failures
		ResetTimeout:     15 * time.Second,
	},
})
```

## Circuit Breaker

The circuit breaker prevents cascading failures by failing fast when a service is unhealthy.

### States

| State | Behavior |
|-------|----------|
| **Closed** | Normal operation, requests pass through |
| **Open** | Blocks all requests, returns `ErrCircuitOpen` immediately |
| **Half-Open** | Allows one test request to check if service recovered |

### Default Thresholds

- Opens after 5 consecutive failures
- Waits 30s before attempting recovery (half-open)
- Closes after one successful request in half-open state

### What Affects Circuit State

The circuit breaker tracks **transient failures** only:

| Response | Affects Circuit? | Reason |
|----------|-----------------|--------|
| HTTP 2xx/3xx | ✅ RecordSuccess | Service is healthy |
| HTTP 5xx | ✅ RecordFailure | Server error - transient |
| HTTP 429 | ✅ RecordFailure | Rate limited - transient |
| HTTP 4xx (except 429) | ❌ No effect | Client error - not service's fault |
| Network error | ✅ RecordFailure | Connection failed |
| Context cancelled | ❌ No effect | User/caller initiated |
| Timeout | ✅ RecordFailure | Service too slow |

**Key insight:** 4xx responses (bad requests, not found, unauthorized) don't trip the circuit because they indicate a problem with the request, not the service. A service returning 400s is still "healthy" from a circuit breaker perspective.

### Handling Circuit Open

```go
resp, err := authClient.Get(ctx, "/users/123")
if errors.Is(err, httpclient.ErrCircuitOpen) {
	// Pick one strategy; the alternatives are commented out.

	// Option 1: Return degraded response
	return CachedUserData(userID)

	// Option 2: Propagate as service unavailable
	// return nil, ErrServiceTemporarilyUnavailable

	// Option 3: Use fallback service
	// return fallbackClient.Get(ctx, "/users/123")
}
```

## Patterns

### Initialization Pattern

Initialize service clients at startup, not on-demand:

```go
type Server struct {
	authClient *svc.Client
	chatClient *svc.Client
}

func NewServer() (*Server, error) {
	authClient, err := svc.NewClient("auth-svc")
	if err != nil {
		return nil, fmt.Errorf("auth service required: %w", err)
	}

	// Optional dependency - check but don't fail
	var chatClient *svc.Client
	if svc.ServiceConfigured("chat-svc") {
		chatClient, _ = svc.NewClient("chat-svc")
	}

	return &Server{
		authClient: authClient,
		chatClient: chatClient,
	}, nil
}
```

### Response Decoding

```go
type User struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

resp, err := authClient.Get(ctx, "/users/123")
if err != nil {
	return nil, err
}

user, err := svc.DecodeResponse[User](resp)
if err != nil {
	return nil, fmt.Errorf("decode user: %w", err)
}
```

### Graceful Degradation

```go
func (s *Server) GetUserProfile(ctx context.Context, userID string) (*Profile, error) {
	// Required call
	user, err := s.fetchUser(ctx, userID)
	if err != nil {
		return nil, err
	}

	profile := &Profile{User: user}

	// Optional enrichment - don't fail if chat service is down
	if s.chatClient != nil {
		messages, err := s.fetchRecentMessages(ctx, userID)
		if err != nil {
			s.logger.Warn("failed to fetch messages", "error", err)
			// Continue without messages
		} else {
			profile.RecentMessages = messages
		}
	}

	return profile, nil
}
```

## Anti-Patterns

### Hardcoded URLs

```go
// BAD: Hardcoded URLs break when services move
client := httpclient.New(httpclient.Config{})
resp, err := client.Get(ctx, "http://auth-svc:8001/users")

// GOOD: Use service discovery
authClient, _ := svc.NewClient("auth-svc") // error check omitted for brevity
resp, err := authClient.Get(ctx, "/users")
```

### Ignoring Circuit Breaker Errors

```go
// BAD: Retrying forever when circuit is open
for {
	resp, err := authClient.Get(ctx, "/users")
	if err != nil {
		time.Sleep(time.Second)
		continue
	}
}

// GOOD: Detect circuit open and handle gracefully
resp, err := authClient.Get(ctx, "/users")
if errors.Is(err, httpclient.ErrCircuitOpen) {
	return nil, ErrServiceUnavailable
}
```

### On-Demand Client Creation

```go
// BAD: Creating client on every request
func (h *Handler) GetUser(w http.ResponseWriter, r *http.Request) {
	client, _ := svc.NewClient("auth-svc") // Wastes resources
	// ...
}

// GOOD: Reuse client instance
type Handler struct {
	authClient *svc.Client
}

func (h *Handler) GetUser(w http.ResponseWriter, r *http.Request) {
	resp, err := h.authClient.Get(r.Context(), "/users")
	// ... handle err, then use resp
}
```

### Silent Failures

```go
// BAD: Swallowing errors
resp, _ := authClient.Get(ctx, "/validate")
if resp != nil && resp.StatusCode == 200 {
	// Assume success
}

// GOOD: Explicit error handling
resp, err := authClient.Get(ctx, "/validate")
if err != nil {
	return fmt.Errorf("auth validation: %w", err)
}
if resp.StatusCode != http.StatusOK {
	return fmt.Errorf("auth validation failed: %d", resp.StatusCode)
}
```

## Checklist

When implementing inter-service calls:

- [ ] Use `svc.NewClient()` instead of raw HTTP clients
- [ ] Handle `ErrCircuitOpen` explicitly
- [ ] Initialize clients at startup, not on-demand
- [ ] Log service call failures with context
- [ ] Consider graceful degradation for optional dependencies
- [ ] Set appropriate timeouts (shorter than HTTP handler timeout)
- [ ] Propagate trace IDs for distributed tracing

## Files

| File | Purpose |
|------|---------|
| `pkg/svc/discovery.go` | Service URL lookup from env vars |
| `pkg/svc/client.go` | Pre-configured service client |
| `pkg/httpclient/circuit.go` | Circuit breaker implementation |
| `pkg/httpclient/client.go` | HTTP client with retries |