264 lines
8.0 KiB
Markdown
264 lines
8.0 KiB
Markdown
---
|
|
name: logging-standards
|
|
description: Logging infrastructure standards for feat-dev-e2e - structured logging, trace propagation, error handling, frontend/backend consistency.
|
|
---
|
|
|
|
# Logging Standards
|
|
|
|
## Identity
|
|
|
|
You enforce consistent, actionable logging across all services and apps in feat-dev-e2e. Every log entry is structured, traceable, and tells the story of what happened.
|
|
|
|
## Core Principles
|
|
|
|
1. **Log once at the boundary** - handlers/workers log the result; internal functions return errors
|
|
2. **Every log has context** - trace_id, request_id, service, and component on every line
|
|
3. **Errors are actionable** - include what failed, why, and what to do about it
|
|
4. **Structured always** - JSON in production, text in development; never fmt.Println
|
|
5. **No sensitive data** - never log passwords, tokens, PII, or full request bodies
|
|
|
|
## Backend (Go + slog)
|
|
|
|
### Logger Creation
|
|
|
|
```go
|
|
// Services get a logger from pkg/app - it's pre-configured
|
|
app := app.New("auth-api", app.WithDefaultPort(8001))
|
|
logger := app.Logger()
|
|
|
|
// Workers create their own context
|
|
ctx = logging.WorkerContext(ctx, "email-sender")
|
|
logger := logging.FromContext(ctx)
|
|
```
|
|
|
|
### Context Propagation
|
|
|
|
The middleware stack automatically sets up context:
|
|
|
|
```
|
|
RequestID() -> Tracing() -> RequestLogger() -> Recoverer()
|
|
```
|
|
|
|
Every request gets:
|
|
- `request_id` - unique per request (from X-Request-ID header or generated)
|
|
- `trace_id` - unique per trace (from X-Trace-ID / X-Cloud-Trace-Context or generated)
|
|
|
|
Retrieve in handlers:
|
|
```go
|
|
logger := logging.FromContext(r.Context())
|
|
logger.Info("user created", "user_id", user.ID)
|
|
```
|
|
|
|
### Error Logging Pattern
|
|
|
|
```go
|
|
// GOOD - log at handler boundary with context
|
|
func (h *Handler) CreateUser(w http.ResponseWriter, r *http.Request) {
|
|
user, err := h.service.Create(r.Context(), req)
|
|
if err != nil {
|
|
logging.FromContext(r.Context()).Error("create user failed",
|
|
"error", err,
|
|
"email", req.Email,
|
|
)
|
|
httpresponse.InternalError(w, r, "failed to create user")
|
|
return
|
|
}
|
|
httpresponse.Created(w, r, user)
|
|
}
|
|
|
|
// GOOD - service returns error, does not log
|
|
func (s *Service) Create(ctx context.Context, input CreateInput) (*User, error) {
|
|
user, err := s.repo.Insert(ctx, input)
|
|
if err != nil {
|
|
return nil, fmt.Errorf("insert user: %w", err)
|
|
}
|
|
return user, nil
|
|
}
|
|
|
|
// BAD - logging inside service AND returning error (double-logged)
|
|
func (s *Service) Create(ctx context.Context, input CreateInput) (*User, error) {
|
|
user, err := s.repo.Insert(ctx, input)
|
|
if err != nil {
|
|
s.logger.Error("failed to insert", "error", err) // DON'T DO THIS
|
|
return nil, err
|
|
}
|
|
return user, nil
|
|
}
|
|
```
|
|
|
|
### Log Levels
|
|
|
|
| Level | When |
|
|
|-------|------|
|
|
| **Error** | Something failed and needs attention (5xx, unrecoverable) |
|
|
| **Warn** | Something unexpected but handled (4xx, retries, fallbacks) |
|
|
| **Info** | Normal operations (request completed, job processed, startup) |
|
|
| **Debug** | Diagnostic details (SQL queries, cache hits, retry attempts) |
|
|
|
|
### Service-to-Service
|
|
|
|
The `httpclient` package automatically propagates both `X-Request-ID` and `X-Trace-ID` headers on outgoing requests:
|
|
|
|
```go
|
|
client := httpclient.New(httpclient.Config{Timeout: 5 * time.Second})
|
|
resp, err := client.Do(req) // trace_id and request_id propagated automatically
|
|
```
|
|
|
|
### Response Envelope
|
|
|
|
Every API response includes trace context in the meta field:
|
|
|
|
```json
|
|
{
|
|
"data": {},
|
|
"meta": {
|
|
"request_id": "abc-123",
|
|
"trace_id": "def-456",
|
|
"timestamp": "2024-01-01T00:00:00Z"
|
|
}
|
|
}
|
|
```
|
|
|
|
## Frontend (TypeScript + @feat-dev-e2e/logger)
|
|
|
|
### Setup
|
|
|
|
```typescript
|
|
import { createLogger, installGlobalHandlers } from '@feat-dev-e2e/logger';
|
|
|
|
export const logger = createLogger({
|
|
level: import.meta.env.DEV ? 'debug' : 'info',
|
|
service: 'dashboard',
|
|
endpoint: '/api/logs', // optional: send logs to backend
|
|
});
|
|
|
|
installGlobalHandlers(logger);
|
|
```
|
|
|
|
### Usage
|
|
|
|
```typescript
|
|
// Simple logging
|
|
logger.info('page loaded', { route: '/dashboard' });
|
|
|
|
// Error logging with Error objects
|
|
try {
|
|
await fetchData();
|
|
} catch (err) {
|
|
logger.error('fetch failed', err, { endpoint: '/api/data' });
|
|
}
|
|
|
|
// Child logger with component context
|
|
const authLogger = logger.withContext({ component: 'auth' });
|
|
authLogger.info('login attempt', { method: 'oauth' });
|
|
```
|
|
|
|
### Features
|
|
|
|
- **Batching**: Logs are buffered and sent in batches (default: 20 entries or 5s)
|
|
- **Offline resilience**: Uses `navigator.sendBeacon` for reliable delivery during page unload
|
|
- **Global handlers**: Captures uncaught exceptions and unhandled promise rejections
|
|
- **Zero-crash**: Logging failures never break the app
|
|
|
|
## Workers & Cron Jobs
|
|
|
|
Workers don't have HTTP context. Use `WorkerContext` to generate trace IDs:
|
|
|
|
```go
|
|
func (w *OrderProcessor) Handle(ctx context.Context, job queue.Job) error {
|
|
ctx = logging.WorkerContext(ctx, "order-processor")
|
|
logger := logging.FromContext(ctx)
|
|
|
|
logger.Info("processing order", "order_id", job.Payload.OrderID)
|
|
|
|
if err := w.process(ctx, job); err != nil {
|
|
logger.Error("order processing failed",
|
|
"error", err,
|
|
"order_id", job.Payload.OrderID,
|
|
)
|
|
return err
|
|
}
|
|
|
|
logger.Info("order processed", "order_id", job.Payload.OrderID)
|
|
return nil
|
|
}
|
|
```
|
|
|
|
## Error Wrapping Patterns
|
|
|
|
### Standard wrap (add context)
|
|
```go
|
|
return fmt.Errorf("insert user: %w", err)
|
|
```
|
|
|
|
### Sentinel + detail wrap (Go 1.20+)
|
|
When a handler needs to classify errors for HTTP status mapping, wrap both a sentinel and detail:
|
|
```go
|
|
// Service returns a matchable sentinel WITH the detail error
|
|
return fmt.Errorf("%w: %w", domain.ErrInvalidCommand, err)
|
|
|
|
// Handler matches sentinel → 400, otherwise → 500
|
|
if errors.Is(err, domain.ErrInvalidCommand) {
|
|
httpresponse.BadRequest(w, r, err.Error())
|
|
return
|
|
}
|
|
```
|
|
Both errors are matchable via `errors.Is()`. This is NOT an anti-pattern.
|
|
|
|
### When to use which
|
|
| Pattern | Use when |
|
|
|---------|----------|
|
|
| `fmt.Errorf("context: %w", err)` | Adding operation context |
|
|
| `fmt.Errorf("%w: %w", sentinel, err)` | Handler needs to classify error type |
|
|
| `fmt.Errorf("%w: detail string", sentinel)` | Sentinel + static detail (no inner error) |
|
|
|
|
## Anti-Patterns
|
|
|
|
| Don't | Do Instead |
|
|
|-------|-----------|
|
|
| `fmt.Println("error:", err)` | `logger.Error("description", "error", err)` |
|
|
| `log.Fatal(err)` | `logger.Error(...)` + graceful shutdown |
|
|
| Log in service AND handler | Log once at boundary, return errors |
|
|
| `logger.Info("password=" + pw)` | Never log credentials or PII |
|
|
| `logger.Error(err.Error())` | `logger.Error("what failed", "error", err)` |
|
|
| Ignore returned errors | Wrap and return: `fmt.Errorf("context: %w", err)` |
|
|
| `&http.Client{}` (no timeout) | `&http.Client{Timeout: 30 * time.Second}` |
|
|
| `http.Get(url)` (default client) | Use `httpclient.Get(ctx, url)` from pkg/httpclient |
|
|
|
|
## HTTP Client Rules
|
|
|
|
Every `http.Client` must have an explicit `Timeout`. A bare `&http.Client{}` can hang indefinitely.
|
|
|
|
```go
|
|
// GOOD - explicit timeout
|
|
client := &http.Client{Timeout: 30 * time.Second}
|
|
|
|
// GOOD - use the shared httpclient package (has retries + trace propagation)
|
|
client := httpclient.New(httpclient.Config{
|
|
Timeout: 10 * time.Second,
|
|
MaxRetries: 3,
|
|
})
|
|
resp, err := client.Do(req) // propagates trace_id and request_id
|
|
|
|
// BAD - no timeout, can hang forever
|
|
client := &http.Client{}
|
|
|
|
// BAD - uses http.DefaultClient (no timeout)
|
|
resp, err := http.Get(url)
|
|
```
|
|
|
|
## Checklist
|
|
|
|
When reviewing logging in a PR:
|
|
|
|
- [ ] Every handler logs errors before returning error responses
|
|
- [ ] Services return errors, don't log them
|
|
- [ ] No sensitive data in log output
|
|
- [ ] trace_id and request_id propagated on service-to-service calls
|
|
- [ ] Workers use `logging.WorkerContext` for correlation
|
|
- [ ] Frontend apps initialize logger and global handlers
|
|
- [ ] Error logs include enough context to debug (IDs, operation name)
|
|
- [ ] Log levels appropriate (not everything is Error)
|
|
- [ ] All `http.Client` instances have explicit `Timeout` set
|
|
- [ ] Service-to-service calls use `pkg/httpclient` (retries + trace propagation)
|