// Package routing provides unified provider routing with fallback execution
// and cooldown management for LLM and media generation requests.
//
// ============================================================================
// MANDATORY USAGE - READ THIS FIRST
// ============================================================================
//
// All provider routing in this codebase MUST go through this package.
// This is the SINGLE SOURCE OF TRUTH for provider orchestration.
//
// DO NOT:
//   - Implement custom fallback loops
//   - Implement custom cooldown tracking
//   - Pre-filter providers based on cooldown state
//   - Skip the terminus provider for any reason
//
// The routing.Execute function enforces critical invariants:
//  1. Terminus provider is ALWAYS attempted (guarantees no "all providers unavailable" failures)
//  2. Tiered cooldowns are applied correctly (rate limit vs transient)
//  3. Exempt providers never enter cooldown
//  4. Thread-safe concurrent access
//
// ============================================================================
// KEY CONCEPTS
// ============================================================================
//
// # Terminus Semantics
//
// When using StrategyFallback, the LAST provider in the chain is the "terminus".
// The terminus is ALWAYS attempted regardless of cooldown state. This ensures
// there's always a fallback of last resort that will be tried.
//
// Example: With providers [Gemini, LaoZhang]:
//   - Gemini: Respects cooldown (may be skipped if rate-limited)
//   - LaoZhang (terminus): ALWAYS tried, even if in cooldown
//
// CRITICAL: Provider order matters! Place your pay-per-use fallback LAST.
//
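// A minimal sketch of the terminus rule, for illustration only. The names
// attemptOrder and inCooldown are hypothetical and are not part of this
// package's API; application code must still go through the managers.
//
//	// attemptOrder returns the providers a fallback pass would try, in order.
//	// Non-terminus providers in cooldown are skipped; the last provider in
//	// the chain is kept unconditionally.
//	func attemptOrder(chain []string, inCooldown func(string) bool) []string {
//		var order []string
//		for i, name := range chain {
//			isTerminus := i == len(chain)-1
//			if isTerminus || !inCooldown(name) {
//				order = append(order, name)
//			}
//		}
//		return order
//	}
//
// With chain ["gemini", "laozhang"] and both providers in cooldown, the
// sketch yields ["laozhang"]: the terminus is still tried.
//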
// # Tiered Cooldowns
//
// Failures trigger different cooldown durations based on error type:
//   - Rate limit (429) / quota errors: 1-hour cooldown (DefaultCooldownPeriod)
//   - Transient server errors (5xx): 30-second cooldown (TransientCooldownPeriod)
//   - Other errors: No cooldown (normal failures)
//
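// The tiers above can be sketched as a single classification step. This is an
// illustration under assumptions, not this package's implementation:
// cooldownFor and its statusCode parameter are hypothetical, quota errors are
// assumed to surface as HTTP 429, and the durations mirror the values
// documented for DefaultCooldownPeriod and TransientCooldownPeriod. The
// sketch relies on the standard net/http and time packages.
//
//	func cooldownFor(statusCode int) time.Duration {
//		switch {
//		case statusCode == http.StatusTooManyRequests: // 429: rate limit or quota
//			return time.Hour
//		case statusCode >= 500 && statusCode <= 599: // transient server error
//			return 30 * time.Second
//		default: // normal failure: no cooldown
//			return 0
//		}
//	}
//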
// # Exempt Providers
//
// Some providers (like LaoZhang) are pay-per-use and never rate-limited.
// These are listed in ExemptProviders and never enter cooldown.
//
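// A hedged sketch of how exemption could gate cooldown entry. The map shape
// and the shouldCoolDown helper are assumptions made for illustration; only
// the ExemptProviders name comes from this package, and its real type may
// differ.
//
//	// exemptProviders is a hypothetical stand-in for ExemptProviders.
//	var exemptProviders = map[string]bool{"laozhang": true}
//
//	// shouldCoolDown reports whether a failure may put the provider into
//	// cooldown: the error must carry a cooldown and the provider must not
//	// be exempt.
//	func shouldCoolDown(provider string, cooldown time.Duration) bool {
//		return cooldown > 0 && !exemptProviders[provider]
//	}
//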
// ============================================================================
// USAGE PATTERNS
// ============================================================================
//
// For image generation (through mediagen.Manager):
//
//	manager, err := mediagen.NewManager(mediagen.ManagerConfig{
//		ImageProviders: []mediagen.ImageGenerator{geminiProvider, laozhangProvider},
//		Strategy:       mediagen.StrategyFallback,
//		// LaoZhang is LAST = terminus = always tried
//	})
//	if err != nil {
//		return err // do not discard the constructor error
//	}
//	resp, err := manager.GenerateImage(ctx, req)
//
// For text generation (through textgen.Manager):
//
//	manager, err := textgen.NewManager(textgen.ManagerConfig{
//		Providers: []textgen.TextGenerator{geminiProvider, laozhangProvider},
//		Strategy:  textgen.StrategyFallback,
//		// LaoZhang is LAST = terminus = always tried
//	})
//	if err != nil {
//		return err // do not discard the constructor error
//	}
//	resp, err := manager.GenerateText(ctx, req)
//
// ============================================================================
// INTEGRATION WITH MEDIAGEN AND TEXTGEN
// ============================================================================
//
// The mediagen.Manager and textgen.Manager types delegate to this package
// for the actual fallback execution logic. They provide domain-specific
// interfaces (ImageGenerator, TextGenerator) while routing handles the
// cross-cutting concerns of provider selection, cooldowns, and fallback.
//
// DO NOT bypass these managers to call routing.Execute directly from
// application code. Always use mediagen.Manager or textgen.Manager.
//
// The package hierarchy is:
//
//	Application Code
//	       |
//	       v
//	mediagen.Manager / textgen.Manager (domain interfaces)
//	       |
//	       v
//	pkg/routing (THIS PACKAGE - execution logic)
//	       |
//	       v
//	Provider Implementations (gemini, laozhang)
//
// ============================================================================
// CODE REVIEW CHECKLIST
// ============================================================================
//
// When reviewing code that touches provider routing, verify:
//
//	[ ] Uses mediagen.Manager or textgen.Manager (not custom loops)
//	[ ] Provider order is correct (terminus last)
//	[ ] Does NOT pre-filter providers based on cooldown
//	[ ] Does NOT skip terminus for any reason
//	[ ] Uses StrategyFallback for production workloads
//	[ ] Has CircuitBreaker for services
package routing