# Episteme (StemeDB)

A probabilistic knowledge graph database that stores Claims, not Facts. Append-only Merkle DAG with read-time resolution via Lenses.

**Core Concept:** "Git for Truth" - conflicting assertions coexist, resolved at query time through Consensus, Recency, Authority, or custom Lenses.

## Find Your Guide

| If you need to... | Read this |
|-------------------|-----------|
| **Get started fast** | [quickstart.md](./quickstart.md) |
| **Understand what Episteme is** | [what-is-episteme.md](./what-is-episteme.md) |
| **Understand the technical vision** | [vision.md](./vision.md) |
| **See use cases** | [use-cases/README.md](./use-cases/README.md) |
| **Understand architecture** | [architecture.md](./architecture.md) |
| **Learn data structures** | [docs/data-structures.md](./docs/data-structures.md) |
| **See the roadmap** | [roadmap.md](./roadmap.md) |
| **Build apps on Episteme** | [docs/app-concepts/index.md](./docs/app-concepts/index.md) |
| **Consumer Health vertical** | [docs/app-concepts/consumer-health.md](./docs/app-concepts/consumer-health.md) |
| **Use Go SDK** | [ai-lookup/services/sdk.md](ai-lookup/services/sdk.md) |
| **Write Rust code** | [.claude/guides/backend/rust-guidelines.md](.claude/guides/backend/rust-guidelines.md) |
| **Set up local dev** | [.claude/guides/local/setup.md](.claude/guides/local/setup.md) |
| **Run tests** | [.claude/guides/local/testing.md](.claude/guides/local/testing.md) |
| **Understand quality checks** | [.claude/guides/local/quality-checks.md](.claude/guides/local/quality-checks.md) |
| **Learn about simulation** | [ai-lookup/features/simulation.md](ai-lookup/features/simulation.md) |
| **Advance the simulator** | [arena-roadmap.md](./arena-roadmap.md) |
| **Work on storage/DAG** | Load skill: `stemedb-core` |
| **Implement a Lens** | Load skill: `stemedb-lens` |
| **Work on domain ontology** | `crates/stemedb-ontology/` |
| **Consumer Health UAT** | [uat/consumer-health/README.md](./uat/consumer-health/README.md) |
| **Plan a milestone** | `/plan-milestone` command |
| **Analyze use case gaps** | `/analyze-gaps` command |
| **Add an API endpoint** | [.claude/guides/backend/api-endpoints.md](.claude/guides/backend/api-endpoints.md) |
| **Integrate with AI tools** | [.claude/guides/integrations/ai-coding-assistant-integration.md](.claude/guides/integrations/ai-coding-assistant-integration.md) |
| **ADK-Go + Episteme** | [.claude/guides/integrations/adk-go-episteme.md](.claude/guides/integrations/adk-go-episteme.md) |
| **Distributed architecture** | [docs/research/distributed-write-path.md](docs/research/distributed-write-path.md) |
| **Write UAT reports** | [.claude/guides/local/uat-reports.md](.claude/guides/local/uat-reports.md) |
| **Phase 6 UAT results** | [ai-lookup/features/phase6-uat.md](ai-lookup/features/phase6-uat.md) |

## Critical Rules

- **Append-Only:** NEVER mutate existing Assertions. Create new ones.
- **Content-Addressed:** Assertion ID = BLAKE3 hash of content.
- **No Unwrap:** NEVER use `unwrap()` or `expect()` in production code. CI enforces via `clippy::unwrap_used` and `clippy::expect_used` at deny level.
- **Defensive Writes:** All writes go through WAL with fsync.
- **Zero-Copy:** Use `rkyv` for serialization. ALWAYS use `stemedb_core::serde::{serialize, deserialize}`; NEVER use raw `AllocSerializer` in production code.
- **Instrument Critical Paths:** Use `#[instrument]` on public methods in WAL, storage, ingestion, and lens code. Include meaningful fields (key_len, payload_len, offset, candidates_count, lens).
- **Structured Logging:** Use `tracing` (info!, warn!, error!) instead of `println!`/`eprintln!`. Clippy enforces via `print_stdout`/`print_stderr` at warn level. CLI binaries (e.g., `stemedb-sim`) may use `#![allow()]` for user-facing output.
- **Document Changes:** Update `ai-lookup/` when adding new types/concepts. Keep skills in sync with code.
- **No Git Operations:** NEVER use git stash, git branch, git checkout, or any git operations unless the user explicitly tells you to.

## Quick Reference

```bash
# Build
cargo build --workspace

# Test
cargo test --workspace

# Lint (must pass before commit)
cargo clippy --workspace -- -D warnings
cargo fmt --check
```

## Port Scheme (181XX)

| Offset | Service | Default | Env Var |
|--------|---------|---------|---------|
| +0 | HTTP API | 18180 | `STEMEDB_BIND_ADDR` |
| +1 | Cluster Gateway | 18181 | `STEMEDB_NODE_API_ADDR` |
| +2 | Cluster RPC | 18182 | `STEMEDB_NODE_RPC_ADDR` |
| +3 | SWIM Gossip | 18183 | via `SwimConfig` |
| +4 | Metrics | 18184 | (reserved) |
| +5 | Admin | 18185 | (reserved) |
| +6 | Latent Signal | 18186 | - |
| +7 | Community App | 18187 | - |

## Specialized Agents

| Domain | Agent | When to use |
|--------|-------|-------------|
| **Product Vision** | `episteme-product-visionary` | Use cases, "why not Postgres?", product-market fit |
| General Rust | `primary-developer` | Feature implementation, refactoring |
| Code Quality | `rust-quality-engineer` | Reviews, test coverage, clippy |
| Storage | `storage-engine-architect` | WAL, LSM, crash recovery |
| Graph Engine | `rust-graph-engine-architect` | Lock-free structures, cache optimization |
| Defensive | `defensive-systems-architect` | Rate limiting, circuit breakers, hostile input |
| Distributed | `distributed-systems-engineer` | CRDT replication, Raft coordination, Merkle sync, clustering |
| Lenses | `stemedb-lens-architect` | Query resolution, ranking algorithms |
| Planning | `stemedb-planner` | Milestone planning, roadmap |

## Architecture Overview

```
Write Path (Spine):       Read Path (Cortex):
[Agent] -> [Ingestion]    [Agent] <- [Lens Engine]
                |                          |
                v                          |
           [WAL/Fsync]              [Index Lookup]
                |                          |
                v                          |
           [KV Store] <--------------------+
```

## Crates

| Crate | Purpose | Status |
|-------|---------|--------|
| `stemedb-core` | Assertion, LifecycleStage, MaterializedView, types | ✅ Implemented |
| `stemedb-wal` | Write-ahead log with crash recovery | ✅ Implemented |
| `stemedb-storage` | KVStore, VoteStore, IndexStore, TrustRankStore, QuarantineStore, SimilarityIndex | ✅ Implemented |
| `stemedb-ingest` | Ingestion pipeline, signature verification, ContentDefenseLayer | ✅ Implemented |
| `stemedb-query` | Query engine, Materializer for O(1) MV: reads | ✅ Implemented |
| `stemedb-lens` | Lenses (Recency, Consensus, Authority, Vote/Trust-aware) | ✅ Implemented |
| `stemedb-api` | HTTP API with axum + utoipa OpenAPI docs | ✅ Implemented |
| `stemedb-sim` | Simulation for testing the pipeline | ✅ Implemented |
| `stemedb-merkle` | BLAKE3 Merkle tree for diff detection | ✅ Implemented |
| `stemedb-rpc` | gRPC services for node-to-node communication | ✅ Implemented |
| `stemedb-sync` | Merkle sync, gossip broadcast, anti-entropy | ✅ Implemented |
| `stemedb-cluster` | Cluster membership (SWIM), sharding, gateway | ✅ Implemented |
| `stemedb-ontology` | Domain definitions (Pharma), subject builders, medical extractors | ✅ Implemented |

## SDKs

| SDK | Purpose | Status |
|-----|---------|--------|
| `sdk/go/steme` | Go HTTP client with Ed25519 signing and fluent builders | ✅ Implemented |
| `sdk/go/adk` | ADK-Go tools and callbacks for AI agents | ✅ Implemented |

## Latent Signal (latent/)

Python CLI tools for adverse event signal detection. Different rules from Rust crates:

**Allowed:**
- `print()` for user-facing CLI output (these are scripts, not libraries)
- `except Exception as e:` for CLI error handling (log and continue)

**Required:**
- **Environment Variables for URLs:** NEVER hardcode `localhost` URLs without env fallback
  - Use `os.getenv("VAR", "http://localhost:...")` in Python
  - Use `process.env.VAR || 'http://localhost:...'` in TypeScript
- **StemeDB Integration:** New ingestors should use `StemeDBClient` pattern from `adk-agent/`, not write to JSONL files
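
The environment-fallback rule above can be sketched in Python roughly as follows. This is a minimal illustration, not code from the repository: the variable name `STEMEDB_URL` and the helper `resolve_base_url` are hypothetical, and the default assumes the HTTP API port 18180 listed in the Port Scheme table.

```python
import os

# Hypothetical default: assumes the HTTP API listens on port 18180 locally.
DEFAULT_STEMEDB_URL = "http://localhost:18180"


def resolve_base_url(env_var: str = "STEMEDB_URL",
                     default: str = DEFAULT_STEMEDB_URL) -> str:
    """Return the base URL from the environment, or the local default."""
    # An unset or empty variable falls back to the default.
    value = os.getenv(env_var, default) or default
    # Trim a trailing slash so callers can append paths consistently.
    return value.rstrip("/")


if __name__ == "__main__":
    # print() is acceptable here because latent/ tools are CLI scripts.
    print(resolve_base_url())
```

Keeping the lookup in one helper means each script gets the same fallback behavior, and a deployment can point every tool at another host by setting a single variable.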