Merged 10 upstream commits (MemTable, read-your-writes tests, feed endpoint, security hardening, signed assertions, source registry, dashboard enhancements) and fixed all test failures across the full workspace (2656/2656 passing).

Key fixes:
- fix(cluster): DashMap deadlock in swim.rs suspect_node/fail_node/alive_node
  - DashMap::get_mut RefMut + iter() on same map = non-reentrant write lock deadlock
  - Fix: extract clone in scoped block to drop RefMut before calling update_node_gauges()
  - 6 previously-hanging SWIM tests now pass in <2s
- fix(sim): replace background-task+polling ingestion with synchronous process_pending()
  - smoke_high_volume_simulation was CPU-starved under 2656 parallel tests
  - Removed ingestor.start() + wait_until_ingested() pattern throughout sim
  - All arena functions now call ingestor.process_pending() directly (deterministic)
- fix(test): v2 signature helper used wrong hash (rkyv vs canonical compute_content_hash_v2)
- fix(test): quota test signed "test" but v1 requires "subject:predicate" format
- fix(test): http_validation now accepts 400 for valid-format-but-invalid-crypto hex
- fix(test): scale_adaptive micro tier assertions updated (auto_promote upstream change)
- config: add nextest.toml with slow-timeout for background-task-tests group

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
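
The shape of the cluster deadlock fix can be sketched without the `dashmap` dependency. This is only an illustration under stated assumptions: a `std::sync::RwLock` around a `HashMap` stands in for a DashMap shard, and `NodeState`, `count_suspects`, and `gauge_with_guard_held` are hypothetical names, not the code in swim.rs. The sketch uses `try_read` so that the hazardous path reports `None` where a blocking read would hang.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical node state; the real SWIM types are not shown in the source.
#[derive(Clone, Debug, PartialEq)]
enum NodeState {
    Alive,
    Suspect,
}

/// Stand-in for `update_node_gauges()`: it has to read the whole map.
/// Returns `None` when a write guard is still alive.
fn count_suspects(nodes: &RwLock<HashMap<u32, NodeState>>) -> Option<usize> {
    let guard = nodes.try_read().ok()?;
    Some(guard.values().filter(|s| **s == NodeState::Suspect).count())
}

/// Fixed shape: mutate inside a scoped block, keep only a clone,
/// and read the map after the write guard has been dropped.
fn suspect_node(nodes: &RwLock<HashMap<u32, NodeState>>, id: u32) -> Option<usize> {
    let updated = {
        let mut guard = nodes.write().ok()?;
        let state = guard.get_mut(&id)?;
        *state = NodeState::Suspect;
        state.clone()
    }; // write guard dropped here
    debug_assert_eq!(updated, NodeState::Suspect);
    count_suspects(nodes)
}

/// Hazardous shape: the write guard is still held while the map is read.
fn gauge_with_guard_held(nodes: &RwLock<HashMap<u32, NodeState>>) -> Option<usize> {
    let _guard = nodes.write().ok()?;
    count_suspects(nodes) // the read cannot be granted while `_guard` lives
}
```

With a blocking read in place of `try_read`, the second function is the pattern the commit describes as hanging; the scoped block in `suspect_node` is what makes the later read safe.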
66 lines
3.6 KiB
Markdown
# StemeDB (Episteme) Project Context

## Project Overview
**StemeDB (Episteme)** is a probabilistic, log-structured, content-addressed knowledge graph database designed as the "Cortex" for autonomous AI research agents. Unlike traditional databases that enforce a single mutable state, StemeDB preserves immutable history and resolves conflicting assertions at read-time using "Lenses."

It serves as the "Git for Truth," allowing agents to:
* **Assert** facts with cryptographic signatures and confidence scores.
* **Vote** on assertions to build consensus without lock contention.
* **Fork** reality to simulate "what-if" scenarios (Overlay Graphs).
* **Resolve** truth dynamically via lenses like Consensus, Authority, or Recency.
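
Read-time resolution can be pictured with a small sketch. Everything here is an assumption made for illustration: the `Assertion` fields, the `Lens` enum, and the `resolve` function are hypothetical and far simpler than the real `stemedb-core` and `stemedb-lens` types. The point is only that conflicting assertions stay in the history untouched and a lens picks a winner when the data is read.

```rust
/// Hypothetical, simplified assertion; the real type also carries
/// signatures and confidence scores.
#[derive(Debug, Clone, PartialEq)]
struct Assertion {
    value: String,
    author: String,
    timestamp: u64, // logical write time
    votes: u32,     // tally from the vote stream
}

/// Hypothetical lens selection, named after the lenses listed above.
enum Lens {
    Recency,
    Consensus,
    /// Only assertions from these authors are considered.
    Authority(Vec<String>),
}

/// Picks one assertion from the immutable history without modifying it.
fn resolve<'a>(lens: &Lens, history: &'a [Assertion]) -> Option<&'a Assertion> {
    match lens {
        Lens::Recency => history.iter().max_by_key(|a| a.timestamp),
        Lens::Consensus => history.iter().max_by_key(|a| a.votes),
        Lens::Authority(trusted) => history
            .iter()
            .filter(|a| trusted.contains(&a.author))
            .max_by_key(|a| a.timestamp),
    }
}
```

The same history can yield different answers depending on the lens, which is the difference from a database that stores a single mutable value.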

## Tech Stack
* **Language:** Rust (2024 edition)
* **Durability:** `stemedb-wal` (Quarantine Pattern with `fs2`, `blake3` checksums)
* **Storage:** `stemedb-storage` (Hybrid Store: `fjall` LSM-tree for writes, `redb` B-tree for reads)
* **Serialization:** `rkyv` (Zero-copy deserialization for high performance)
* **Ingestion:** `stemedb-ingest` (Async background worker bridging WAL and Store)
* **Simulation:** `stemedb-sim` (Agent-based modeling to verify system behavior)

## Architecture
The system follows a "Spine -> Lattice -> Cortex" architecture:

1. **The Spine (Durability):**
    * **Write-Ahead Log (WAL):** Append-only log with strict `fsync` guarantees.
    * **Ingestor:** Background task that tails the WAL and indexes data.
    * **KV Store:** Persistent storage for assertions and indexes.

2. **The Lattice (Connectivity) - *Implemented*:**
    * **Ballot Box:** High-velocity vote stream.
    * **Materialized Views:** Pre-computed truth states.

3. **The Cortex (Reasoning) - *Implemented*:**
    * **Lenses:** WASM-based filters for truth resolution (Consensus, Authority, Recency, etc.).
    * **SMT:** Sparse Merkle Trees for efficient branching.
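
The Spine's data flow (append to the WAL, then ingest into the store) can be sketched in memory. This is an assumption-laden toy, not the crate API: `Wal`, `Record`, `Ingestor`, and `checksum` are hypothetical, the standard library's `DefaultHasher` stands in for `blake3`, and there is no `fsync` or on-disk format. The method name `process_pending` is borrowed from the project's ingestor, but its signature here is invented.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Stand-in checksum (the real WAL is described as using `blake3`).
fn checksum(payload: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    payload.hash(&mut hasher);
    hasher.finish()
}

/// One WAL entry: the payload plus the checksum written alongside it.
struct Record {
    payload: Vec<u8>,
    checksum: u64,
}

/// Append-only log held in memory for the sketch.
#[derive(Default)]
struct Wal {
    records: Vec<Record>,
}

impl Wal {
    /// Appends a payload and returns its position in the log.
    fn append(&mut self, payload: &[u8]) -> usize {
        self.records.push(Record { payload: payload.to_vec(), checksum: checksum(payload) });
        self.records.len() - 1
    }
}

/// Tails the WAL from a cursor and indexes records by content checksum.
#[derive(Default)]
struct Ingestor {
    cursor: usize,
    store: BTreeMap<u64, Vec<u8>>,
}

impl Ingestor {
    /// Synchronously ingests everything past the cursor; returns how many
    /// records passed verification. Corrupt records are skipped.
    fn process_pending(&mut self, wal: &Wal) -> usize {
        let mut indexed = 0;
        for record in &wal.records[self.cursor..] {
            if checksum(&record.payload) == record.checksum {
                self.store.insert(record.checksum, record.payload.clone());
                indexed += 1;
            }
        }
        self.cursor = wal.records.len();
        indexed
    }
}
```

Because the store is keyed by the payload's checksum, appending the same payload twice produces two WAL records but one stored entry, which mirrors the content-addressed design described in the overview.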

## Key Files & Directories
* `stemedb/`
    * `crates/`
        * `stemedb-core/`: Core data structures (`Assertion`, `Vote`, `Epoch`) and types.
        * `stemedb-wal/`: Durability primitives (`Journal`, `FsyncGuard`, `Record`).
        * `stemedb-storage/`: Storage engine abstraction and Hybrid Store implementation.
        * `stemedb-ingest/`: Async ingestion pipeline logic.
        * `stemedb-lens/`: Truth Lenses (`Recency`, `Consensus`, `Authority`, `Skeptic`).
        * `stemedb-sim/`: "The Arena" simulation for end-to-end verification.
    * `architecture.md`: Detailed system design and data flow.
    * `roadmap.md`: Phased implementation plan and status.
    * `docs/sdk/go-usage-guide.md`: Go SDK usage guide and patterns.
    * `Makefile`: Build and quality automation.

## Building and Running

The project uses a `Makefile` for common tasks:

* **Build:** `make build` (Compiles the workspace)
* **Test:** `make test` (Runs unit tests across all crates)
* **Quality Check:** `make quality` (Runs fmt, strict clippy linting, duplication checks, and tests)
* **Run Simulation:** `cargo run -p stemedb-sim` (Executes the spine verification simulation)
* **Format:** `make fmt` (Auto-formats code)

## Development Conventions
* **Strict Quality:** `make quality` must pass before committing.
    * No `unwrap()` or `expect()` in production code (enforced by clippy).
    * Zero warnings allowed.
    * Missing documentation is a hard error.
* **Testing:** Every crate must have unit tests. The `stemedb-sim` crate serves as the integration test suite.
* **Architecture:** Follow the "Defensive by Default" philosophy. Durability > Speed > Features.
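
As an illustration of the no-`unwrap()` rule, the sketch below propagates errors with `?` and a typed error. It is a hedged example, not project code: `parse_epoch` and `EpochError` are hypothetical, and the `#[deny(...)]` attribute uses the Clippy lints `unwrap_used` and `expect_used` as one plausible way to enforce the rule; how the project's `Makefile` actually configures Clippy is not shown in this document.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Error returned when an epoch string cannot be parsed (hypothetical type).
#[derive(Debug)]
pub enum EpochError {
    /// The input was empty after trimming whitespace.
    Empty,
    /// The input was not a valid unsigned integer.
    Invalid(ParseIntError),
}

impl fmt::Display for EpochError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EpochError::Empty => write!(f, "epoch string is empty"),
            EpochError::Invalid(e) => write!(f, "invalid epoch: {e}"),
        }
    }
}

impl std::error::Error for EpochError {}

impl From<ParseIntError> for EpochError {
    fn from(e: ParseIntError) -> Self {
        EpochError::Invalid(e)
    }
}

/// Parses an epoch number, returning an error instead of panicking.
#[deny(clippy::unwrap_used, clippy::expect_used)]
pub fn parse_epoch(raw: &str) -> Result<u64, EpochError> {
    let trimmed = raw.trim();
    if trimmed.is_empty() {
        return Err(EpochError::Empty);
    }
    // `?` converts the `ParseIntError` via the `From` impl above.
    Ok(trimmed.parse::<u64>()?)
}
```

Callers decide how to handle the failure, which fits the "Defensive by Default" philosophy better than a panic deep inside a storage path.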