stemedb/ai-lookup/patterns/content-addressing.md
jordan a776744889 Initial project setup with Claude Code monorepo structure
2026-01-31 10:56:26 -07:00

Content Addressing

Last Updated: 2025-01-31 Confidence: High

Summary

Content addressing means the ID of a piece of data is derived from its content via a cryptographic hash, so identical content always yields the same ID. This enables immutability, deduplication, and integrity verification.

Key Facts:

  • Hash algorithm: BLAKE3 (fast, secure)
  • Assertion ID = hash of (subject, predicate, object, agent, timestamp)
  • Enables "append-only" semantics - no mutations, only new versions
  • Automatic deduplication of identical assertions

File Pointer: crates/stemedb-core/src/assertion.rs:hash() (planned)

How It Works

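use blake3::Hasher;

impl Assertion {
    /// Derives the assertion ID from (subject, predicate, object, agent, timestamp).
    pub fn hash(&self) -> Hash {
        let mut hasher = Hasher::new();
        let object = self.object.to_bytes();
        // Length-prefix the variable-length fields so field boundaries are
        // unambiguous: ("ab", "c") must not hash the same as ("a", "bc").
        for field in [self.subject.as_bytes(), self.predicate.as_bytes(), &object[..]] {
            hasher.update(&(field.len() as u64).to_le_bytes());
            hasher.update(field);
        }
        hasher.update(&self.agent.0);  // Ed25519 pubkey (fixed 32 bytes)
        hasher.update(&self.timestamp.to_le_bytes());  // fixed-width integer
        Hash(hasher.finalize().into())
    }
}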
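
A note on field framing: when several variable-length fields are fed into one hasher, the byte stream should make field boundaries unambiguous, otherwise two different assertions can produce identical input bytes and therefore the same ID. The sketch below is illustrative only and uses just the standard library; `encode_naive` and `encode_framed` are hypothetical helper names, not part of stemedb-core.

```rust
/// Naive encoding: fields are concatenated with no boundaries.
fn encode_naive(fields: &[&[u8]]) -> Vec<u8> {
    fields.concat()
}

/// Length-prefixed encoding: each field is preceded by its length as a
/// little-endian u64, so the byte stream can be split back unambiguously.
fn encode_framed(fields: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for field in fields {
        out.extend_from_slice(&(field.len() as u64).to_le_bytes());
        out.extend_from_slice(field);
    }
    out
}
```

With the fields ("ab", "c") and ("a", "bc"), `encode_naive` returns the same bytes `abc` for both, so any hash of them would collide, while `encode_framed` returns different byte strings. Length prefixes are one option; a fixed canonical serialization would serve the same purpose.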

Benefits

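  1. Immutability: Content cannot be modified without changing its hash, so a change produces a new entry
  2. Deduplication: The same claim from the same agent with the same timestamp hashes to one ID and is stored once
  3. Integrity: Recompute the hash to verify data hasn't been tampered with
  4. Caching: A hash always refers to the same content, so it is safe to cache by hash forever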
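
The benefits above can be sketched with a small in-memory store. This is a minimal illustration under stated assumptions, not the planned stemedb-core design: `Store`, `put`, `get_verified`, and `stand_in_hash` are hypothetical names, and FNV-1a is swapped in for BLAKE3 purely so the example needs nothing beyond the standard library. FNV-1a is not collision-resistant and would not be suitable for real content IDs.

```rust
use std::collections::HashMap;

/// Hypothetical 64-bit ID. A BLAKE3-based design would use a 32-byte digest.
type Id = u64;

/// FNV-1a, used ONLY as a std-only stand-in for BLAKE3.
/// It is not collision-resistant and must not be used for real content IDs.
fn stand_in_hash(bytes: &[u8]) -> Id {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
    }
    h
}

/// Hypothetical append-only store keyed by content ID.
#[derive(Default)]
struct Store {
    entries: HashMap<Id, Vec<u8>>,
}

impl Store {
    /// Inserts content and returns its ID. Re-inserting identical bytes
    /// yields the same ID and leaves the store unchanged (deduplication).
    fn put(&mut self, content: &[u8]) -> Id {
        let id = stand_in_hash(content);
        self.entries.entry(id).or_insert_with(|| content.to_vec());
        id
    }

    /// Returns the content only if it still hashes to the requested ID
    /// (integrity check); returns None if it is missing or was altered.
    fn get_verified(&self, id: Id) -> Option<&[u8]> {
        let content = self.entries.get(&id)?;
        (stand_in_hash(content) == id).then_some(content.as_slice())
    }
}
```

Calling `put` twice with the same bytes returns the same ID and keeps a single entry. If the stored bytes are later altered behind the store's back, `get_verified` returns `None` because the recomputed hash no longer matches the ID.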