This commit implements comprehensive production hardening across multiple layers to prepare StemeDB for enterprise pilot deployments: ## API Layer - Add rate limiting middleware with configurable limits per endpoint - Enhance error handling with detailed context and proper HTTP status codes - Add security hardening tests for input validation and boundary conditions - Create store_helpers module for defensive storage access patterns ## Storage & WAL - Optimize group commit batching for higher throughput - Add defensive error handling in hybrid backend with proper fallbacks - Enhance WAL journal durability guarantees with fsync validation - Improve index store query performance with better caching ## Operations & Deployment - Add comprehensive operations documentation (deployment, monitoring, DR) - Create systemd units for backup, WAL archival, and verification - Add monitoring configs (Prometheus alerts, metrics exporters) - Implement backup/restore scripts with verification and S3 archival - Add DR drill automation and runbook procedures - Create load balancer configs (nginx, envoy) with health checks ## Documentation - Update CLAUDE.md with operations and troubleshooting guides - Expand roadmap with production readiness milestones - Add pilot success criteria and deployment reference architecture - Document TLS setup, monitoring integration, and incident response ## Configuration - Add .env.example with all required environment variables - Document resource sizing for different deployment scales - Add configuration examples for various deployment topologies This positions StemeDB for successful enterprise pilots with proper operational discipline, monitoring, backup/DR, and security hardening. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
47 lines
1.1 KiB
Desktop File
47 lines
1.1 KiB
Desktop File
[Unit]
|
|
Description=StemeDB WAL Archival Service
|
|
Documentation=https://github.com/yourusername/stemedb
|
|
After=network.target
|
|
Wants=network-online.target
|
|
|
|
[Service]
|
|
Type=oneshot
|
|
User=stemedb
|
|
Group=stemedb
|
|
|
|
# Environment file for S3 credentials
|
|
EnvironmentFile=-/etc/default/stemedb-backup
|
|
|
|
# Default environment variables
|
|
Environment="STEMEDB_WAL_DIR=/var/lib/stemedb/wal"
|
|
Environment="STATE_FILE=/var/lib/stemedb/wal-archival-state.json"
|
|
Environment="METRICS_DIR=/var/lib/node_exporter/textfile_collector"
|
|
|
|
# Execute WAL archival
|
|
ExecStart=/usr/local/bin/archive-wal-to-s3.sh
|
|
|
|
# Timeout after 10 minutes
|
|
TimeoutStartSec=600
|
|
|
|
# Restart on failure (network issues, transient errors)
|
|
Restart=on-failure
|
|
RestartSec=2min
|
|
StartLimitBurst=3
|
|
StartLimitIntervalSec=15min
|
|
|
|
# Hardening
|
|
NoNewPrivileges=true
|
|
PrivateTmp=true
|
|
ProtectSystem=strict
|
|
ProtectHome=true
|
|
ReadOnlyPaths=/var/lib/stemedb/wal
|
|
ReadWritePaths=/var/lib/stemedb /var/lib/node_exporter/textfile_collector
|
|
|
|
# Logging
|
|
StandardOutput=journal
|
|
StandardError=journal
|
|
SyslogIdentifier=stemedb-archive-wal
|
|
|
|
[Install]
|
|
WantedBy=multi-user.target
|