stemedb/crates/stemedb-wal/src/lib.rs
jordan 3320c24afa feat: WAL hardening (Phase 5B) - CRC32C, crash recovery, group commit, log rotation
Add CRC32C checksums to WAL record format (v2), implement crash recovery
with automatic truncation of corrupt records, add feature-gated group commit
buffer for batched fsync under concurrent load, and implement log rotation
via segment files with global offset addressing.

Key changes:
- Record format v2: [len:u32][crc32c:u32][blake3:32][payload:N]
- recover_file() scans and truncates corrupt tail records
- GroupCommitBuffer batches fsync via MPSC channel (tokio feature gate)
- SegmentManager with binary search resolution and cursor-based cleanup
- Journal::read() auto-refreshes segments on miss for writer/reader split
- Split recovery.rs and key_codec.rs into directory modules for 500-line max

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 12:36:35 -07:00

//! Write-Ahead Log (WAL) and durability primitives for Episteme.
//!
//! This crate provides the foundational durability layer, ensuring that
//! assertions are safely persisted to disk before being acknowledged.
//!
//! # Record Format (v2)
//!
//! Each record is stored as: `[payload_len:u32_LE][crc32c:u32][blake3:32][payload:N]`
//!
//! - CRC32C provides fast integrity checking to detect torn writes
//! - BLAKE3 provides content-addressed verification
//!
//! # Crash Recovery
//!
//! The WAL provides crash recovery guarantees via immediate fsync. With
//! `DurabilityLevel::Immediate` (the default), a record is fsynced before the
//! append is acknowledged, so an acknowledged record survives process crashes
//! and power failures.
//!
//! On open, the journal scans all records across all segments, verifying
//! CRC32C and BLAKE3. Any corrupt or partial records at the tail are truncated.
//!
//! # Log Rotation
//!
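//! Segment files are named `{base_offset:016x}.wal`: the segment's base offset
//! as 16 zero-padded lowercase hex digits, so sorting file names sorts segments
//! by offset. When the current segment exceeds the configured maximum size, a
//! new segment is created. Old segments can be cleaned up once all consumers
//! have advanced past them.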

/// Durability levels and the fsync guard.
pub mod durability;
/// Error types and Result wrapper for WAL operations.
pub mod error;
/// Binary format for WAL records and headers.
pub mod format;
/// The main Journal API.
pub mod journal;
/// Crash recovery: file scanning, validation, and truncation.
pub mod recovery;
/// Log rotation via segment files.
pub mod segment;
/// Group commit buffer for batching fsync operations.
#[cfg(feature = "group-commit")]
pub mod group_commit;
pub use durability::{DurabilityLevel, FsyncGuard};
pub use error::{QuarantineError, Result};
pub use format::{FileHeader, Record, HEADER_SIZE, RECORD_OVERHEAD};
pub use journal::Journal;
pub use recovery::RecoveryReport;
pub use segment::{Segment, SegmentManager};
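
The record layout and tail-truncation scan described in the crate docs can be illustrated with a small std-only sketch. It is not the crate's implementation. The names `OVERHEAD`, `crc32c`, `encode_record`, and `valid_prefix_len` are hypothetical, and several details are assumptions the source does not confirm: that the CRC covers only the payload, that it is stored little-endian, and that the scan starts directly at the first record (the real file also has a `FileHeader`). BLAKE3 verification is left out because it needs an external crate, so the 32-byte digest is passed in by the caller.

```rust
// Sketch of the v2 record layout from the crate docs:
// [payload_len:u32_LE][crc32c:u32][blake3:32][payload:N]
// Assumptions: CRC over the payload only, CRC stored little-endian,
// digest supplied by the caller, no file header in the scanned buffer.

/// Bytes in front of the payload, derived from the layout above.
const OVERHEAD: usize = 4 + 4 + 32;

/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
/// Slow but dependency-free; a table or hardware version would be used in practice.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 == 1 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
        }
    }
    !crc
}

/// Serializes one record: length, checksum, digest, then the payload.
fn encode_record(payload: &[u8], digest: [u8; 32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(OVERHEAD + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32c(payload).to_le_bytes());
    out.extend_from_slice(&digest);
    out.extend_from_slice(payload);
    out
}

/// Reads a little-endian u32 starting at `at`.
fn read_u32_le(bytes: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([bytes[at], bytes[at + 1], bytes[at + 2], bytes[at + 3]])
}

/// Returns the byte length of the longest prefix made of intact records.
/// Everything past that length is a partial or corrupt tail.
fn valid_prefix_len(log: &[u8]) -> usize {
    let mut pos = 0;
    while log.len() - pos >= OVERHEAD {
        let len = read_u32_le(log, pos) as usize;
        let stored_crc = read_u32_le(log, pos + 4);
        let start = pos + OVERHEAD;
        let end = match start.checked_add(len) {
            Some(end) if end <= log.len() => end,
            _ => break, // payload runs past the end of the buffer: partial record
        };
        if crc32c(&log[start..end]) != stored_crc {
            break; // checksum mismatch: corrupt record
        }
        pos = end;
    }
    pos
}
```

A recovery routine built on this idea could pass the returned length to `std::fs::File::set_len` to drop the damaged tail. Stopping at the first bad record, rather than skipping it, keeps the surviving log a contiguous prefix of what was written.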