stemedb/ai-lookup/patterns/error-handling.md
jordan 3320c24afa feat: WAL hardening (Phase 5B) - CRC32C, crash recovery, group commit, log rotation
Add CRC32C checksums to WAL record format (v2), implement crash recovery
with automatic truncation of corrupt records, add feature-gated group commit
buffer for batched fsync under concurrent load, and implement log rotation
via segment files with global offset addressing.

Key changes:
- Record format v2: [len:u32][crc32c:u32][blake3:32][payload:N]
- recover_file() scans and truncates corrupt tail records
- GroupCommitBuffer batches fsync via MPSC channel (tokio feature gate)
- SegmentManager with binary search resolution and cursor-based cleanup
- Journal::read() auto-refreshes segments on miss for writer/reader split
- Split recovery.rs and key_codec.rs into directory modules for 500-line max

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 12:36:35 -07:00

70 lines
2.1 KiB
Markdown

# Error Handling Pattern
**Last Updated:** 2026-01-31
**Confidence:** High
## Summary
StemeDB uses `thiserror` for library errors with context chains. No panics in production code. All fallible operations return `Result<T, E>`.
**Key Facts:**
- Library code: `thiserror` for custom error types — ALL error enums MUST use `#[derive(thiserror::Error)]`
- Binary code: `anyhow` for error chaining
- Never use `unwrap()`, `expect()`, `panic!()` in production
- Add context with `.context("what we were doing")?`
- NEVER use manual `impl Display` + `impl Error` for error types — use thiserror derives instead
**Error types in workspace (all use thiserror):**
- `stemedb-core/src/serde.rs``SerdeError`
- `stemedb-wal/src/error.rs``QuarantineError`
- `stemedb-storage/src/error.rs``StorageError`
- `stemedb-ingest/src/error.rs``IngestError`
- `stemedb-query/src/error.rs``QueryError`
- `stemedb-api/src/error.rs``ApiError`
## The Pattern
```rust
use thiserror::Error;
#[derive(Debug, Error)]
pub enum StemeError {
#[error("assertion not found: {0:?}")]
NotFound(Hash),
#[error("invalid signature for agent {agent:?}")]
InvalidSignature { agent: AgentId },
#[error("storage error: {0}")]
Storage(String),
#[error("serialization error: {0}")]
Serialization(String),
}
// Usage with context
fn load_assertion(&self, hash: &Hash) -> Result<Assertion, StemeError> {
let bytes = self.store
.get(hash.as_bytes())
.context("failed to read assertion from store")?
.ok_or(StemeError::NotFound(*hash))?;
rkyv::from_bytes(&bytes)
.map_err(|e| StemeError::Serialization(e.to_string()))
}
```
## Error Categories
| Type | Description | Example |
|------|-------------|---------|
| `NotFound` | Data doesn't exist | Missing assertion |
| `InvalidSignature` | Crypto verification failed | Tampered assertion |
| `Storage` | Underlying KV error | Disk full |
| `Serialization` | Encode/decode failed | Corrupt data |
## Related Topics
- [Rust Guidelines](../../.claude/guides/backend/rust-guidelines.md)
- [CODING_GUIDELINES.md](../../../CODING_GUIDELINES.md)