stemedb/ai-lookup/patterns/error-handling.md

Error Handling Pattern

Last Updated: 2026-01-31
Confidence: High

Summary

StemeDB uses thiserror for library errors with context chains. No panics in production code. All fallible operations return Result<T, E>.

Key Facts:

  • Library code: thiserror for custom error types — ALL error enums MUST use #[derive(thiserror::Error)]
  • Binary code: anyhow for error chaining
  • Never use unwrap(), expect(), or panic!() in production code
  • Add context with .context("what we were doing")? (provided by anyhow's Context trait)
  • NEVER use manual impl Display + impl Error for error types — use thiserror derives instead
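The "no unwrap()" rule can be sketched with the standard library alone. The helper names below (parse_segment_index, describe) are hypothetical and not taken from the StemeDB codebase; the point is only that the fallible step returns a Result and the caller decides what a failure means.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parses an index such as "42" from untrusted input.
fn parse_segment_index(raw: &str) -> Result<u64, ParseIntError> {
    // `raw.trim().parse::<u64>().unwrap()` would panic on bad input;
    // `?` hands the error back to the caller instead.
    let index = raw.trim().parse::<u64>()?;
    Ok(index)
}

/// The caller chooses how to react; here a failure becomes a message.
fn describe(raw: &str) -> String {
    match parse_segment_index(raw) {
        Ok(index) => format!("segment {}", index),
        Err(err) => format!("invalid segment index {:?}: {}", raw, err),
    }
}
```

In library code the error type would normally be one of the workspace's thiserror enums rather than a bare std error; ParseIntError is used here only to keep the sketch dependency-free.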

Error types in workspace (all use thiserror):

  • stemedb-core/src/serde.rs: SerdeError
  • stemedb-wal/src/error.rs: QuarantineError
  • stemedb-storage/src/error.rs: StorageError
  • stemedb-ingest/src/error.rs: IngestError
  • stemedb-query/src/error.rs: QueryError
  • stemedb-api/src/error.rs: ApiError

The Pattern

use thiserror::Error;

#[derive(Debug, Error)]
pub enum StemeError {
    #[error("assertion not found: {0:?}")]
    NotFound(Hash),

    #[error("invalid signature for agent {agent:?}")]
    InvalidSignature { agent: AgentId },

    #[error("storage error: {0}")]
    Storage(String),

    #[error("serialization error: {0}")]
    Serialization(String),
}

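// Usage: map lower-level failures into StemeError variants, carrying the context in the message.
// (.context() is not used here: it yields anyhow::Error, which has no conversion into StemeError.)
fn load_assertion(&self, hash: &Hash) -> Result<Assertion, StemeError> {
    let bytes = self.store
        .get(hash.as_bytes())
        .map_err(|e| StemeError::Storage(format!("failed to read assertion from store: {e}")))?
        .ok_or(StemeError::NotFound(*hash))?;

    rkyv::from_bytes(&bytes)
        .map_err(|e| StemeError::Serialization(e.to_string()))
}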

Error Categories

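| Type             | Description                | Example           |
|------------------|----------------------------|-------------------|
| NotFound         | Data doesn't exist         | Missing assertion |
| InvalidSignature | Crypto verification failed | Tampered assertion |
| Storage          | Underlying KV error        | Disk full         |
| Serialization    | Encode/decode failed       | Corrupt data      |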
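A caller can branch on these categories with an exhaustive match. The sketch below is a simplified stand-in: Hash and AgentId are placeholder definitions (the real types are not shown here), category() is a hypothetical helper, and the thiserror derive is left out only so the example compiles without external crates. In the workspace the enum would keep #[derive(thiserror::Error)].

```rust
// Placeholder types; the real Hash and AgentId definitions are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Hash(pub [u8; 32]);

#[derive(Debug, Clone, PartialEq)]
pub struct AgentId(pub String);

// Simplified copy of StemeError without the thiserror attributes.
#[derive(Debug)]
pub enum StemeError {
    NotFound(Hash),
    InvalidSignature { agent: AgentId },
    Storage(String),
    Serialization(String),
}

/// Hypothetical helper: returns the category description from the table.
/// There is no wildcard arm, so adding a variant is a compile error
/// until this function is updated.
pub fn category(err: &StemeError) -> &'static str {
    match err {
        StemeError::NotFound(_) => "data doesn't exist",
        StemeError::InvalidSignature { .. } => "crypto verification failed",
        StemeError::Storage(_) => "underlying KV error",
        StemeError::Serialization(_) => "encode/decode failed",
    }
}
```

Leaving out a catch-all `_` arm is a deliberate choice in this sketch: the compiler then flags every match that needs attention when a new error category is introduced.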