--- name: logging-standards description: Logging infrastructure standards for slate-final-1770511493 - structured logging, trace propagation, error handling, frontend/backend consistency. --- # Logging Standards ## Identity You enforce consistent, actionable logging across all services and apps in slate-final-1770511493. Every log entry is structured, traceable, and tells the story of what happened. ## Core Principles 1. **Log once at the boundary** - handlers/workers log the result; internal functions return errors 2. **Every log has context** - trace_id, request_id, service, and component on every line 3. **Errors are actionable** - include what failed, why, and what to do about it 4. **Structured always** - JSON in production, text in development; never fmt.Println 5. **No sensitive data** - never log passwords, tokens, PII, or full request bodies ## Backend (Go + slog) ### Logger Creation ```go // Services get a logger from pkg/app - it's pre-configured app := app.New("auth-api", app.WithDefaultPort(8001)) logger := app.Logger() // Workers create their own context ctx = logging.WorkerContext(ctx, "email-sender") logger := logging.FromContext(ctx) ``` ### Context Propagation The middleware stack automatically sets up context: ``` RequestID() -> Tracing() -> RequestLogger() -> Recoverer() ``` Every request gets: - `request_id` - unique per request (from X-Request-ID header or generated) - `trace_id` - unique per trace (from X-Trace-ID / X-Cloud-Trace-Context or generated) Retrieve in handlers: ```go logger := logging.FromContext(r.Context()) logger.Info("user created", "user_id", user.ID) ``` ### Error Logging Pattern ```go // GOOD - log at handler boundary with context func (h *Handler) CreateUser(w http.ResponseWriter, r *http.Request) { user, err := h.service.Create(r.Context(), req) if err != nil { logging.FromContext(r.Context()).Error("create user failed", "error", err, "email", req.Email, ) httpresponse.InternalError(w, r, "failed to create user") return } httpresponse.Created(w, r, user) } // GOOD - service returns error, does not log func (s *Service) Create(ctx context.Context, input CreateInput) (*User, error) { user, err := s.repo.Insert(ctx, input) if err != nil { return nil, fmt.Errorf("insert user: %w", err) } return user, nil } // BAD - logging inside service AND returning error (double-logged) func (s *Service) Create(ctx context.Context, input CreateInput) (*User, error) { user, err := s.repo.Insert(ctx, input) if err != nil { s.logger.Error("failed to insert", "error", err) // DON'T DO THIS return nil, err } return user, nil } ``` ### Log Levels | Level | When | |-------|------| | **Error** | Something failed and needs attention (5xx, unrecoverable) | | **Warn** | Something unexpected but handled (4xx, retries, fallbacks) | | **Info** | Normal operations (request completed, job processed, startup) | | **Debug** | Diagnostic details (SQL queries, cache hits, retry attempts) | ### Service-to-Service The `httpclient` package automatically propagates both `X-Request-ID` and `X-Trace-ID` headers on outgoing requests: ```go client := httpclient.New(httpclient.Config{Timeout: 5 * time.Second}) resp, err := client.Do(req) // trace_id and request_id propagated automatically ``` ### Response Envelope Every API response includes trace context in the meta field: ```json { "data": {}, "meta": { "request_id": "abc-123", "trace_id": "def-456", "timestamp": "2024-01-01T00:00:00Z" } } ``` ## Frontend (TypeScript + @slate-final-1770511493/logger) ### Setup ```typescript import { createLogger, installGlobalHandlers } from '@slate-final-1770511493/logger'; export const logger = createLogger({ level: import.meta.env.DEV ? 'debug' : 'info', service: 'dashboard', endpoint: '/api/logs', // optional: send logs to backend }); installGlobalHandlers(logger); ``` ### Usage ```typescript // Simple logging logger.info('page loaded', { route: '/dashboard' }); // Error logging with Error objects try { await fetchData(); } catch (err) { logger.error('fetch failed', err, { endpoint: '/api/data' }); } // Child logger with component context const authLogger = logger.withContext({ component: 'auth' }); authLogger.info('login attempt', { method: 'oauth' }); ``` ### Features - **Batching**: Logs are buffered and sent in batches (default: 20 entries or 5s) - **Offline resilience**: Uses `navigator.sendBeacon` for reliable delivery during page unload - **Global handlers**: Captures uncaught exceptions and unhandled promise rejections - **Zero-crash**: Logging failures never break the app ## Workers & Cron Jobs Workers don't have HTTP context. Use `WorkerContext` to generate trace IDs: ```go func (w *OrderProcessor) Handle(ctx context.Context, job queue.Job) error { ctx = logging.WorkerContext(ctx, "order-processor") logger := logging.FromContext(ctx) logger.Info("processing order", "order_id", job.Payload.OrderID) if err := w.process(ctx, job); err != nil { logger.Error("order processing failed", "error", err, "order_id", job.Payload.OrderID, ) return err } logger.Info("order processed", "order_id", job.Payload.OrderID) return nil } ``` ## Error Wrapping Patterns ### Standard wrap (add context) ```go return fmt.Errorf("insert user: %w", err) ``` ### Sentinel + detail wrap (Go 1.20+) When a handler needs to classify errors for HTTP status mapping, wrap both a sentinel and detail: ```go // Service returns a matchable sentinel WITH the detail error return fmt.Errorf("%w: %w", domain.ErrInvalidCommand, err) // Handler matches sentinel → 400, otherwise → 500 if errors.Is(err, domain.ErrInvalidCommand) { httpresponse.BadRequest(w, r, err.Error()) return } ``` Both errors are matchable via `errors.Is()`. This is NOT an anti-pattern. ### When to use which | Pattern | Use when | |---------|----------| | `fmt.Errorf("context: %w", err)` | Adding operation context | | `fmt.Errorf("%w: %w", sentinel, err)` | Handler needs to classify error type | | `fmt.Errorf("%w: detail string", sentinel)` | Sentinel + static detail (no inner error) | ## Anti-Patterns | Don't | Do Instead | |-------|-----------| | `fmt.Println("error:", err)` | `logger.Error("description", "error", err)` | | `log.Fatal(err)` | `logger.Error(...)` + graceful shutdown | | Log in service AND handler | Log once at boundary, return errors | | `logger.Info("password=" + pw)` | Never log credentials or PII | | `logger.Error(err.Error())` | `logger.Error("what failed", "error", err)` | | Ignore returned errors | Wrap and return: `fmt.Errorf("context: %w", err)` | | `&http.Client{}` (no timeout) | `&http.Client{Timeout: 30 * time.Second}` | | `http.Get(url)` (default client) | Use `httpclient.Get(ctx, url)` from pkg/httpclient | ## HTTP Client Rules Every `http.Client` must have an explicit `Timeout`. A bare `&http.Client{}` can hang indefinitely. ```go // GOOD - explicit timeout client := &http.Client{Timeout: 30 * time.Second} // GOOD - use the shared httpclient package (has retries + trace propagation) client := httpclient.New(httpclient.Config{ Timeout: 10 * time.Second, MaxRetries: 3, }) resp, err := client.Do(req) // propagates trace_id and request_id // BAD - no timeout, can hang forever client := &http.Client{} // BAD - uses http.DefaultClient (no timeout) resp, err := http.Get(url) ``` ## Checklist When reviewing logging in a PR: - [ ] Every handler logs errors before returning error responses - [ ] Services return errors, don't log them - [ ] No sensitive data in log output - [ ] trace_id and request_id propagated on service-to-service calls - [ ] Workers use `logging.WorkerContext` for correlation - [ ] Frontend apps initialize logger and global handlers - [ ] Error logs include enough context to debug (IDs, operation name) - [ ] Log levels appropriate (not everything is Error) - [ ] All `http.Client` instances have explicit `Timeout` set - [ ] Service-to-service calls use `pkg/httpclient` (retries + trace propagation)