This commit implements comprehensive production hardening across multiple layers to prepare StemeDB for enterprise pilot deployments:

## API Layer
- Add rate limiting middleware with configurable limits per endpoint
- Enhance error handling with detailed context and proper HTTP status codes
- Add security hardening tests for input validation and boundary conditions
- Create store_helpers module for defensive storage access patterns

## Storage & WAL
- Optimize group commit batching for higher throughput
- Add defensive error handling in hybrid backend with proper fallbacks
- Enhance WAL journal durability guarantees with fsync validation
- Improve index store query performance with better caching

## Operations & Deployment
- Add comprehensive operations documentation (deployment, monitoring, DR)
- Create systemd units for backup, WAL archival, and verification
- Add monitoring configs (Prometheus alerts, metrics exporters)
- Implement backup/restore scripts with verification and S3 archival
- Add DR drill automation and runbook procedures
- Create load balancer configs (nginx, envoy) with health checks

## Documentation
- Update CLAUDE.md with operations and troubleshooting guides
- Expand roadmap with production readiness milestones
- Add pilot success criteria and deployment reference architecture
- Document TLS setup, monitoring integration, and incident response

## Configuration
- Add .env.example with all required environment variables
- Document resource sizing for different deployment scales
- Add configuration examples for various deployment topologies

This positions StemeDB for successful enterprise pilots with proper operational discipline, monitoring, backup/DR, and security hardening.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
107 lines
3.9 KiB
Plaintext
# StemeDB API Server Configuration
#
# Copy this file to `.env` and customize for your environment.

# =============================================================================
# Core Configuration
# =============================================================================

# Directory for Write-Ahead Log (WAL) files
STEMEDB_WAL_DIR=data/wal

# Directory for key-value storage
STEMEDB_DB_DIR=data/db

# HTTP server bind address
STEMEDB_BIND_ADDR=127.0.0.1:18180

# Enable economic throttling (The Meter)
# When enabled, enforces per-agent per-hour quotas
STEMEDB_METER_ENABLED=true

# Optional: Separate database for Aphoria corpus
# If not set, corpus queries use the main store
# STEMEDB_CORPUS_DB_DIR=data/corpus

# =============================================================================
# P5.1 Security Hardening (TLS/HTTPS)
# =============================================================================

# TLS certificate path (optional - enables HTTPS)
# When set, server runs in HTTPS mode with TLS 1.3
# Example with Let's Encrypt:
# STEMEDB_TLS_CERT_PATH=/etc/letsencrypt/live/stemedb.example.com/fullchain.pem

# TLS private key path (optional - enables HTTPS)
# Required if STEMEDB_TLS_CERT_PATH is set
# Example with Let's Encrypt:
# STEMEDB_TLS_KEY_PATH=/etc/letsencrypt/live/stemedb.example.com/privkey.pem

# =============================================================================
# P5.1 Security Hardening (Request Limits & Timeouts)
# =============================================================================

# Request body size limits (bytes)
# Write endpoints (POST /v1/assert, /v1/vote, etc.): Default 1MB
STEMEDB_WRITE_BODY_LIMIT=1048576

# Read endpoints (GET /v1/query, etc.): Default 64KB
STEMEDB_READ_BODY_LIMIT=65536

# HTTP request timeout (seconds)
# Entire request/response cycle must complete within this time
# Default: 30 seconds
STEMEDB_HTTP_TIMEOUT_SECS=30

# Store operation timeout (seconds)
# Individual get()/put() operations must complete within this time
# Default: 5 seconds (hardcoded in store_helpers.rs)
# Note: This timeout cannot currently be configured via env var, so the setting below has no effect
# STEMEDB_STORE_TIMEOUT_SECS=5

# Health endpoint rate limit (requests per second per IP)
# Prevents metrics flooding attacks via the /v1/health endpoint
# Default: 1 request per second
STEMEDB_HEALTH_RATE_LIMIT=1

# =============================================================================
# P4.2 Authentication
# =============================================================================

# Root API key (for bootstrapping admin access on first start)
# Generate a secure key:
#   export STEMEDB_ROOT_API_KEY=steme_live_$(openssl rand -hex 24)
#
# This key will be hashed and stored on first start.
# Use it to authenticate to POST /v1/admin/api-keys to create additional keys.
# STEMEDB_ROOT_API_KEY=steme_live_your_secure_key_here

# Enable API key authentication globally
STEMEDB_AUTH_ENABLED=false

# Require authentication for all endpoints (not just /v1/admin/*)
STEMEDB_AUTH_REQUIRE_ALL=false

# =============================================================================
# Logging & Observability
# =============================================================================

# Logging level (via RUST_LOG)
# Examples:
#   RUST_LOG=debug                                # All debug logs
#   RUST_LOG=stemedb_api=debug                    # Only stemedb-api debug logs
#   RUST_LOG=stemedb_api=debug,tower_http=debug   # Multiple modules
#
# Default (if not set): stemedb_api=debug,tower_http=debug

# =============================================================================
# Prometheus Metrics
# =============================================================================

# Metrics are exposed at the /metrics endpoint
# Default port: 18180 (same as HTTP API)
# Scrape config for Prometheus:
#   - job_name: 'stemedb'
#     static_configs:
#       - targets: ['localhost:18180']