---
name: stemedb-core
description: Core guidelines for the Episteme database engine. Use when working on storage, DAG, or assertions.
---

# StemeDB Core Guidelines

## Identity
You are building the **Spine** of Episteme. This is the storage engine that persists the Merkle DAG.

## Principles
* **Append-Only**: We never mutate an existing Assertion. We only append new ones.
* **Content-Addressed**: The ID of an assertion is its Hash (BLAKE3).
* **Defensive**: Use `quarantine-journal` patterns (WAL, Fsync).
* **Typed**: Use Strong types (`EntityId`, `RelationId`, `Hash`) not Strings.

## Data Structures

### Assertion (sync with `crates/stemedb-core/src/types.rs`)
```rust
pub struct Assertion {
    // The Fact
    pub subject: EntityId,                 // "Tesla_Inc"
    pub predicate: RelationId,             // "has_revenue"
    pub object: ObjectValue,               // Text/Number/Boolean/Reference

    // The Lineage
    pub parent_hash: Option<Hash>,         // Link to previous version
    pub source_hash: Hash,                 // Evidence pointer
    pub visual_hash: Option<Hash>,         // pHash for image provenance

    // Meta-Cognition
    pub signatures: Vec<SignatureEntry>,   // Multi-sig support
    pub confidence: f32,                   // 0.0 to 1.0
    pub timestamp: u64,                    // Unix epoch
    pub vector: Option<Vec<f32>>,          // Semantic embedding
}

pub struct SignatureEntry {
    pub agent_id: [u8; 32],    // Ed25519 Public Key
    pub signature: [u8; 64],   // Ed25519 Signature
    pub timestamp: u64,        // When signed
}

pub enum ObjectValue {
    Text(String),
    Number(f64),
    Boolean(bool),
    Reference(EntityId),
}
```

## Storage Layout (KV)
* `H:{Hash} -> Assertion` (Main Store)
* `S:{Subject} -> Vec<Hash>` (Index)
* `SP:{Subject}:{Predicate} -> Vec<Hash>` (Index)

## Do
* Use `rkyv` for zero-copy deserialization.
* Use `thiserror` for library errors.
* Validate signatures on Ingest.
* **Instrument public methods** with `#[instrument]` for observability.
* **In stemedb-storage**: Use `crate::serde_helpers::{serialize, deserialize}` for all serialization. This provides unified error mapping to `StorageError::Serialization`.

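
The key scheme in the Storage Layout section can be sketched with plain string keys. This is a minimal illustration, not the engine's actual code: the `Hash`, `EntityId`, and `RelationId` definitions below are simplified stand-ins for the real types in `stemedb-core`, the hex encoding of the hash is an assumption, and `MemIndex` is a hypothetical in-memory substitute for the KV store that assumes both indexes map to a list of assertion hashes.

```rust
use std::collections::HashMap;

/// Stand-in for the 32-byte BLAKE3 digest (assumption: the real `Hash` differs).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Hash(pub [u8; 32]);

/// Simplified stand-ins for the strong ID types named in the Principles.
pub struct EntityId(pub String);
pub struct RelationId(pub String);

impl Hash {
    /// Lowercase hex rendering; the on-disk encoding is an assumption.
    pub fn to_hex(&self) -> String {
        self.0.iter().map(|b| format!("{:02x}", b)).collect()
    }
}

/// `H:{Hash}` key for the main store.
pub fn main_key(hash: &Hash) -> String {
    format!("H:{}", hash.to_hex())
}

/// `S:{Subject}` key for the subject index.
pub fn subject_key(subject: &EntityId) -> String {
    format!("S:{}", subject.0)
}

/// `SP:{Subject}:{Predicate}` key for the subject/predicate index.
pub fn subject_predicate_key(subject: &EntityId, predicate: &RelationId) -> String {
    format!("SP:{}:{}", subject.0, predicate.0)
}

/// Hypothetical in-memory index: each key maps to a growing list of hashes.
#[derive(Default)]
pub struct MemIndex {
    entries: HashMap<String, Vec<Hash>>,
}

impl MemIndex {
    /// Appends `hash` to both indexes; existing entries are never rewritten.
    pub fn record(&mut self, subject: &EntityId, predicate: &RelationId, hash: Hash) {
        self.entries.entry(subject_key(subject)).or_default().push(hash);
        self.entries
            .entry(subject_predicate_key(subject, predicate))
            .or_default()
            .push(hash);
    }

    /// Returns the hashes stored under `key`, or an empty slice if none exist.
    pub fn lookup(&self, key: &str) -> &[Hash] {
        self.entries.get(key).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

Because `record` only ever pushes hashes, storing a new version of a fact leaves earlier entries in place, which is consistent with the Append-Only principle.
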
## Tracing Pattern
All public methods in WAL, storage, and ingestion MUST have tracing spans:

```rust
use tracing::{debug, info, instrument};

#[instrument(skip(self, payload), fields(payload_len = payload.len()))]
pub fn append(&mut self, payload: Vec<u8>) -> Result<u64> {
    // ... implementation ...
    debug!(offset, "Record appended");
    Ok(offset)
}
```

Guidelines:
- Use `skip(self)` to avoid noisy output
- Use `skip(payload)` or `skip(value)` for large data
- Add `fields(key_len = ..., value_len = ...)` for size visibility
- Use `debug!` for routine operations, `info!` for lifecycle events, `warn!` for recoverable issues

## Do Not
* Use `unwrap()` in core logic.
* Store large blobs in the Assertions (store pointers/hashes instead).
* Add new types without updating `ai-lookup/services/` documentation.
* Add public methods without `#[instrument]` in WAL/storage/ingest crates.
* **In stemedb-storage**: Call `stemedb_core::serde::serialize` or `deserialize` directly. Always use `crate::serde_helpers` instead.

## Documentation Sync
When modifying core types:
1. Update this skill's Data Structures section to match actual code
2. Add/update entry in `ai-lookup/services/assertion.md` or `ai-lookup/services/storage.md`
3. Update `ai-lookup/index.md` if adding new concepts
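
Returning to the `unwrap()` rule in the Do Not list: one way to follow it is to return a typed error and propagate it with `?`. The sketch below uses only the standard library so it stays self-contained; in the crates themselves the `Display` and `Error` impls would presumably be derived with `thiserror`. `ValidationError`, `check_confidence`, and `check_draft` are hypothetical names, and the rule that an assertion needs at least one signature is an assumption made for illustration.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical validation error (not a type from the StemeDB crates).
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    ConfidenceOutOfRange(f32),
    NoSignatures,
}

impl fmt::Display for ValidationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::ConfidenceOutOfRange(c) => {
                write!(f, "confidence {} is outside 0.0..=1.0", c)
            }
            Self::NoSignatures => write!(f, "assertion carries no signatures"),
        }
    }
}

impl Error for ValidationError {}

/// Accepts only values in the documented 0.0 to 1.0 range; NaN is rejected
/// because it fails the range check.
pub fn check_confidence(confidence: f32) -> Result<f32, ValidationError> {
    if (0.0..=1.0).contains(&confidence) {
        Ok(confidence)
    } else {
        Err(ValidationError::ConfidenceOutOfRange(confidence))
    }
}

/// Chains the checks with `?` instead of `unwrap()`.
/// Assumption: at least one signature is required.
pub fn check_draft(confidence: f32, signature_count: usize) -> Result<(), ValidationError> {
    check_confidence(confidence)?;
    if signature_count == 0 {
        return Err(ValidationError::NoSignatures);
    }
    Ok(())
}
```

Callers see every failure as a value they must handle, so a malformed assertion surfaces as an error at ingest rather than as a panic inside the storage engine.
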