stemedb/GEMINI.md

# StemeDB (Episteme) Project Context

## Project Overview
**StemeDB (Episteme)** is a probabilistic, log-structured, content-addressed knowledge graph database designed as the "Cortex" for autonomous AI research agents. Unlike traditional databases that enforce a single mutable state, StemeDB preserves an immutable history and resolves conflicting assertions at read time using "Lenses."

It serves as the "Git for Truth," allowing agents to:
*   **Assert** facts with cryptographic signatures and confidence scores.
*   **Vote** on assertions to build consensus without lock contention.
*   **Fork** reality to simulate "what-if" scenarios (Overlay Graphs).
*   **Resolve** truth dynamically via lenses like Consensus, Authority, or Recency.
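
To make read-time resolution concrete, here is a minimal sketch of how a lens might pick a winner among conflicting assertions without mutating any of them. Everything in it is an assumption for illustration: the simplified `Assertion` fields, the `Lens` enum, and the `resolve` function are hypothetical and are not the types defined in `stemedb-core`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified assertion; the real `Assertion` type may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    /// What the assertion is about.
    pub subject: String,
    /// The claimed value.
    pub value: String,
    /// The agent that made the claim.
    pub agent: String,
    /// Self-reported confidence in `[0.0, 1.0]`.
    pub confidence: f64,
    /// Logical or wall-clock time of the claim.
    pub timestamp: u64,
    /// Number of votes received so far.
    pub votes: u32,
}

/// Read-time resolution strategies named in this document (illustrative only).
pub enum Lens {
    /// Prefer the assertion with the most votes.
    Consensus,
    /// Prefer the assertion from the most trusted agent (agent -> trust weight).
    Authority(HashMap<String, u32>),
    /// Prefer the newest assertion.
    Recency,
}

/// Pick one winner among conflicting assertions; the inputs are never modified.
pub fn resolve<'a>(lens: &Lens, candidates: &'a [Assertion]) -> Option<&'a Assertion> {
    match *lens {
        Lens::Consensus => candidates.iter().max_by_key(|a| a.votes),
        Lens::Authority(ref trust) => candidates
            .iter()
            // Agents missing from the trust table get weight 0.
            .max_by_key(|a| trust.get(&a.agent).cloned().unwrap_or(0)),
        Lens::Recency => candidates.iter().max_by_key(|a| a.timestamp),
    }
}
```

Because resolution only borrows the candidate set, two readers can apply different lenses to the same history and get different answers without any write or lock.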

## Tech Stack
*   **Language:** Rust (2024 edition)
*   **Durability:** `stemedb-wal` (Quarantine Pattern with `fs2`, `blake3` checksums)
*   **Storage:** `stemedb-storage` (`sled` embedded KV, abstracted via `KVStore` trait)
*   **Serialization:** `rkyv` (Zero-copy deserialization for high performance)
*   **Ingestion:** `stemedb-ingest` (Async background worker bridging WAL and Store)
*   **Simulation:** `stemedb-sim` (Agent-based modeling to verify system behavior)
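
The storage entry above mentions that `sled` sits behind a `KVStore` trait. A rough sketch of what such an abstraction could look like follows. The method names, the associated error type, and the in-memory `MemStore` backend are all assumptions made for illustration; the actual trait in `stemedb-storage` may have a different shape.

```rust
use std::collections::BTreeMap;

/// Hypothetical storage abstraction; not the real `KVStore` signature.
pub trait KVStore {
    /// Error type of the backing engine.
    type Error;
    /// Store `value` under `key`, replacing any previous value.
    fn put(&mut self, key: &[u8], value: &[u8]) -> Result<(), Self::Error>;
    /// Fetch the value stored under `key`, if any.
    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>, Self::Error>;
    /// Return all pairs whose key starts with `prefix`, in key order.
    fn scan_prefix(&self, prefix: &[u8]) -> Result<Vec<(Vec<u8>, Vec<u8>)>, Self::Error>;
}

/// In-memory backend (an assumption, handy for unit tests).
#[derive(Default)]
pub struct MemStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KVStore for MemStore {
    // An in-memory map cannot fail, so the error type is uninhabited.
    type Error = std::convert::Infallible;

    fn put(&mut self, key: &[u8], value: &[u8]) -> Result<(), Self::Error> {
        self.map.insert(key.to_vec(), value.to_vec());
        Ok(())
    }

    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>, Self::Error> {
        Ok(self.map.get(key).cloned())
    }

    fn scan_prefix(&self, prefix: &[u8]) -> Result<Vec<(Vec<u8>, Vec<u8>)>, Self::Error> {
        Ok(self
            .map
            // Keys are sorted, so start at the prefix and stop at the first mismatch.
            .range(prefix.to_vec()..)
            .take_while(|&(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect())
    }
}
```

Keeping the engine behind a trait like this means the ingestor and tests can run against an in-memory store while production code uses the persistent backend.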

## Architecture
The system follows a "Spine -> Lattice -> Cortex" architecture:

1.  **The Spine (Durability):**
    *   **Write-Ahead Log (WAL):** Append-only log with strict `fsync` guarantees.
    *   **Ingestor:** Background task that tails the WAL and indexes data.
    *   **KV Store:** Persistent storage for assertions and indexes.

2.  **The Lattice (Connectivity) - *In Progress*:**
    *   **Ballot Box:** High-velocity vote stream.
    *   **Materialized Views:** Pre-computed truth states.

3.  **The Cortex (Reasoning) - *Planned*:**
    *   **Lenses:** WASM-based filters for truth resolution.
    *   **SMT:** Sparse Merkle Trees for efficient branching.
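
The Spine's write path (append to the WAL, `fsync`, then replay for ingestion) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the `stemedb-wal` implementation: the record framing, the `append` and `replay` functions, and the FNV-1a checksum are hypothetical. In particular, FNV-1a is only a dependency-free stand-in for the `blake3` checksums named in the Tech Stack.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::path::Path;

/// Stand-in checksum (64-bit FNV-1a). The project itself names `blake3`.
fn checksum(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Append one record as [len: u32 LE][checksum: u64 LE][payload], then fsync.
pub fn append(path: &Path, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "record too large"));
    }
    let mut frame = Vec::with_capacity(12 + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    frame.extend_from_slice(&checksum(payload).to_le_bytes());
    frame.extend_from_slice(payload);

    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(&frame)?;
    // Do not report success until the bytes have been flushed to the device.
    file.sync_all()
}

/// Replay records in order, stopping at the first torn or corrupt record.
pub fn replay(path: &Path) -> io::Result<Vec<Vec<u8>>> {
    let mut bytes = Vec::new();
    File::open(path)?.read_to_end(&mut bytes)?;

    let mut records = Vec::new();
    let mut pos = 0usize;
    while let Some(header) = bytes.get(pos..pos + 12) {
        let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
        let mut sum = [0u8; 8];
        sum.copy_from_slice(&header[4..12]);
        match bytes.get(pos + 12..pos + 12 + len) {
            // Accept the record only if the payload is complete and intact.
            Some(payload) if checksum(payload) == u64::from_le_bytes(sum) => {
                records.push(payload.to_vec());
                pos += 12 + len;
            }
            _ => break,
        }
    }
    Ok(records)
}
```

An ingestor in this sketch would call `replay` (or tail the file incrementally) and write each recovered record into the KV store. Stopping at the first bad record means a crash mid-append loses at most the unacknowledged tail.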

## Key Files & Directories
*   `stemedb/`
    *   `crates/`
        *   `stemedb-core/`: Core data structures (`Assertion`, `Vote`, `Epoch`) and types.
        *   `stemedb-wal/`: Durability primitives (`Journal`, `FsyncGuard`, `Record`).
        *   `stemedb-storage/`: Storage engine abstraction and `sled` implementation.
        *   `stemedb-ingest/`: Async ingestion pipeline logic.
        *   `stemedb-sim/`: "The Arena" simulation for end-to-end verification.
    *   `architecture.md`: Detailed system design and data flow.
    *   `roadmap.md`: Phased implementation plan and status.
    *   `docs/sdk/go-usage-guide.md`: Go SDK usage guide and patterns.
    *   `Makefile`: Build and quality automation.

## Building and Running

The project uses a `Makefile` for common tasks:

*   **Build:** `make build` (Compiles the workspace)
*   **Test:** `make test` (Runs unit tests across all crates)
*   **Quality Check:** `make quality` (Runs fmt, strict clippy linting, duplication checks, and tests)
*   **Run Simulation:** `cargo run -p stemedb-sim` (Executes the spine verification simulation)
*   **Format:** `make fmt` (Auto-formats code)

## Development Conventions
*   **Strict Quality:** `make quality` must pass before committing.
    *   No `unwrap()` or `expect()` in production code (enforced by clippy).
    *   Zero warnings allowed.
    *   Missing documentation is a hard error.
*   **Testing:** Every crate must have unit tests. The `stemedb-sim` crate serves as the integration test suite.
*   **Architecture:** Follow the "Defensive by Default" philosophy. When priorities conflict, durability comes before speed, and speed before features (Durability > Speed > Features).
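
As an illustration of the no-`unwrap()` and documented-items conventions, the sketch below returns a typed error instead of panicking. The `ConfidenceError` type and `parse_confidence` function are hypothetical examples, not code from the repository. `clippy::unwrap_used` and `clippy::expect_used` are real Clippy lints, but whether the project enables exactly these (and where) is an assumption.

```rust
use std::fmt;
use std::num::ParseFloatError;

/// Errors raised while parsing a confidence score.
#[derive(Debug)]
pub enum ConfidenceError {
    /// The input was not a number.
    NotANumber(ParseFloatError),
    /// The number fell outside `[0.0, 1.0]`.
    OutOfRange(f64),
}

impl fmt::Display for ConfidenceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            ConfidenceError::NotANumber(ref e) => write!(f, "confidence is not a number: {}", e),
            ConfidenceError::OutOfRange(v) => write!(f, "confidence {} is outside [0.0, 1.0]", v),
        }
    }
}

impl std::error::Error for ConfidenceError {}

/// Parse a confidence score, propagating failures to the caller instead of panicking.
#[deny(clippy::unwrap_used, clippy::expect_used)]
pub fn parse_confidence(raw: &str) -> Result<f64, ConfidenceError> {
    // `?` replaces `unwrap()`: a bad input becomes an `Err`, not a crash.
    let value: f64 = raw.trim().parse().map_err(ConfidenceError::NotANumber)?;
    if value >= 0.0 && value <= 1.0 {
        Ok(value)
    } else {
        Err(ConfidenceError::OutOfRange(value))
    }
}
```

Every public item carries a doc comment, so a crate-level `#![deny(missing_docs)]` (a standard rustc lint) would accept this code as written.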