jordan 3cfaa1e1d3 feat: Complete Phase 1 (The Spine) - storage foundation

Phase 1 delivers the complete durability and storage layer:

- WAL with crash recovery: Append-only journal with BLAKE3 checksums,
  fsync guarantees, and proper seek-to-EOF on reopen
- Storage engine: sled-backed KVStore with scan_prefix for range queries
- Content-addressed storage: H:{hash}, V:{hash}, E:{hash} key patterns
- Ingestor: Background worker tailing WAL, writing to KV with 8-byte
  aligned record headers for rkyv zero-copy deserialization
- Comprehensive tests: 31 tests covering crash recovery, round-trips,
  and multi-cycle durability

New crates: stemedb-wal, stemedb-storage, stemedb-ingest

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-31 14:15:34 -07:00


# AI Coding Assistant Integration Guide

Research report on integrating Episteme (StemeDB) with Claude Code, Gemini CLI, OpenAI Codex, and other AI coding assistants.

## Executive Summary

There are four main integration approaches for AI coding assistants, each with different trade-offs:

| Approach | Reliability | Complexity | Cross-Platform | Best For |
|----------|-------------|------------|----------------|----------|
| Skills/Commands + CLI | High | Low | Good | Direct, reliable integration |
| Context Files (CLAUDE.md, AGENTS.md) | High | Very Low | Excellent | Static knowledge, guidelines |
| A2A Protocol | Medium | Medium | Emerging | Agent-to-agent collaboration |
| MCP Servers | Variable | High | Good | Dynamic tools (when working) |

Recommendation: Start with Skills + CLI tools for reliability, use context files for static knowledge, and consider A2A for agent collaboration. MCP is powerful but has reliability concerns.

## Part 1: Integration Approaches Comparison

### Why MCP Can Be Problematic

MCP servers can be unreliable for several reasons:

- Connection management complexity (STDIO process lifecycle, HTTP session state)
- Protocol version mismatches between clients
- Authentication failures with OAuth 2.1 flows
- Tool search latency when many tools are registered
- Context window consumption from tool descriptions

### Recommended: Skills + CLI Integration

Agent Skills (agentskills.io) provide a simpler, more reliable approach:

┌─────────────────────────────────────────────────────┐
│                 AI Coding Assistant                  │
│  (Claude Code, Gemini CLI, Codex, Cursor)           │
└────────────────────────┬────────────────────────────┘
                         │
         ┌───────────────┼───────────────┐
         │               │               │
         ▼               ▼               ▼
   ┌──────────┐   ┌──────────┐   ┌──────────┐
   │  SKILL.md │   │AGENTS.md │   │   CLI    │
   │  /command │   │ Context  │   │   Tool   │
   └──────────┘   └──────────┘   └──────────┘
         │               │               │
         └───────────────┼───────────────┘
                         │
                         ▼
              ┌─────────────────────┐
              │  episteme-cli       │
              │  (Rust binary)      │
              └─────────────────────┘
                         │
                         ▼
              ┌─────────────────────┐
              │  StemeDB            │
              └─────────────────────┘

**Advantages:**

- Skills are just markdown files - no running processes
- CLI tools are standalone binaries - always available
- Context files are version-controlled and deterministic
- No connection management, no protocol negotiation
- Works offline, no authentication complexity

## Part 2: Agent Skills (SKILL.md)

### What Are Agent Skills?

Agent Skills are organized folders of instructions, scripts, and resources that AI assistants discover and load dynamically. They follow an open standard adopted by Claude Code, Codex, and others.

### SKILL.md Format

````yaml
---
name: episteme-query
description: Query the Episteme knowledge graph for assertions about a subject
disable-model-invocation: false
allowed-tools: Bash(episteme *)
---

# Episteme Query

Query the knowledge graph for information about a subject.

## Usage

```bash
episteme query --subject "$ARGUMENTS" --lens recency
```

## What to Look For

- Check for conflicting assertions
- Note the confidence levels
- Trace provenance if the source matters

## Output Format

Returns JSON with assertions matching the query.
````


### Skill Locations

| Location | Path | Scope |
|----------|------|-------|
| Personal | `~/.claude/skills/<name>/SKILL.md` | All your projects |
| Project | `.claude/skills/<name>/SKILL.md` | This project only |
| Codex | `~/.codex/skills/<name>/SKILL.md` | Codex sessions |

### Cross-Platform Compatibility

The Agent Skills specification ([agentskills.io](https://agentskills.io)) works across:
- Claude Code
- OpenAI Codex CLI
- Cursor
- Other compatible tools

**Key insight:** Write a skill once and it works across every tool that supports the specification.

---

## Part 3: Context Files (AGENTS.md / CLAUDE.md / GEMINI.md)

### The Open Standard: AGENTS.md

[AGENTS.md](https://agents.md/) is an open format for guiding coding agents, now stewarded by the Linux Foundation's Agentic AI Foundation. It's adopted by 40,000+ open-source projects.

### File Discovery (Codex)

Discovery order (first match wins):

1. `./AGENTS.md` (current directory)
2. Parent directories up to the repo root
3. Sub-folders the agent is working in
4. `~/.factory/AGENTS.md` (personal override)

Merge: when several files apply, they are concatenated root → leaf, and instructions in files closer to the working directory override earlier ones.
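
The root → leaf ordering can be illustrated with a small path walk. This is a minimal sketch, not how any particular assistant implements discovery: the function names `agents_md_candidates` and `merged_instructions` are hypothetical, the personal override file is left out, and the example assumes Unix-style paths.

```rust
use std::path::{Path, PathBuf};

/// Candidate AGENTS.md locations from the repo root down to `cwd`,
/// ordered root -> leaf so that later entries can override earlier ones.
/// Assumption: `cwd` lies inside `repo_root`; otherwise the walk simply
/// continues up to the filesystem root.
fn agents_md_candidates(repo_root: &Path, cwd: &Path) -> Vec<PathBuf> {
    let mut dirs: Vec<&Path> = Vec::new();
    for dir in cwd.ancestors() {
        dirs.push(dir);
        if dir == repo_root {
            break;
        }
    }
    // ancestors() yields leaf -> root; reverse to get root -> leaf.
    dirs.reverse();
    dirs.into_iter().map(|d| d.join("AGENTS.md")).collect()
}

/// Concatenate the contents of the candidate files that exist, root -> leaf.
fn merged_instructions(candidates: &[PathBuf]) -> String {
    candidates
        .iter()
        .filter_map(|p| std::fs::read_to_string(p).ok())
        .collect::<Vec<_>>()
        .join("\n\n")
}

fn main() {
    let candidates = agents_md_candidates(
        Path::new("/repo"),
        Path::new("/repo/crates/episteme-cli"),
    );
    for p in &candidates {
        println!("{}", p.display());
    }
    // Missing files are skipped, so this is empty outside a real repo.
    println!("{} bytes merged", merged_instructions(&candidates).len());
}
```

Keeping the path computation separate from the file reads makes the ordering easy to check without touching the filesystem.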


### Recommended Sections

```markdown
# AGENTS.md

## Build & Test
Exact commands for compiling and testing.

## Architecture Overview
Short description of major modules.

## Knowledge System
This project uses Episteme for persistent knowledge:
- Query: `episteme query --subject <topic>`
- Store: `episteme assert <subject> <predicate> <object>`
- Lenses: consensus, recency, authority

When you learn something important, store it in Episteme.

## Conventions
Naming, folder layout, code style.
```

### Platform-Specific Files

| Platform | Primary File | Alternative |
|----------|--------------|-------------|
| Claude Code | `CLAUDE.md` | Also reads `AGENTS.md` |
| Gemini CLI | `GEMINI.md` | Also reads `AGENT.md` |
| OpenAI Codex | `AGENTS.md` | - |
| Cursor | `.cursor/rules/` | - |

**Best Practice:** Use `AGENTS.md` as the canonical file and reference it from platform-specific files:

```markdown
# CLAUDE.md

See [AGENTS.md](./AGENTS.md) for project conventions.

## Claude-Specific
Additional Claude Code settings here.
```

## Part 4: CLI Tool Integration

### The episteme-cli Approach

Instead of MCP, build a standalone CLI that skills can invoke:

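```rust
// crates/episteme-cli/src/main.rs
use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command(name = "episteme")]
#[command(about = "Episteme knowledge graph CLI")]
struct Cli {
    #[command(subcommand)]
    command: Commands,
}

#[derive(Subcommand)]
enum Commands {
    /// Query assertions about a subject
    Query {
        #[arg(short, long)]
        subject: String,
        #[arg(short, long, default_value = "recency")]
        lens: String,
        #[arg(short, long)]
        predicate: Option<String>,
    },
    /// Create a new assertion
    Assert {
        subject: String,
        predicate: String,
        object: String,
        #[arg(short, long)]
        source: Option<String>,
        #[arg(short, long, default_value = "1.0")]
        confidence: f64,
    },
    /// List conflicts for a subject
    Conflicts {
        #[arg(short, long)]
        subject: String,
    },
    /// Trace provenance chain
    Trace {
        #[arg(long)]
        assertion_id: String,
    },
}

fn main() {
    let cli = Cli::parse();
    // Placeholder dispatch: each arm only echoes the parsed request.
    // Real handlers would call into StemeDB and print JSON instead.
    match cli.command {
        Commands::Query { subject, lens, predicate } => {
            println!("query subject={subject} lens={lens} predicate={predicate:?}");
        }
        Commands::Assert { subject, predicate, object, source, confidence } => {
            println!("assert {subject} {predicate} {object} source={source:?} confidence={confidence}");
        }
        Commands::Conflicts { subject } => println!("conflicts subject={subject}"),
        Commands::Trace { assertion_id } => println!("trace assertion_id={assertion_id}"),
    }
}
```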

### CLI Usage in Skills

````yaml
---
name: remember
description: Store a learning in the knowledge graph
allowed-tools: Bash(episteme *)
---

Store important learnings about this codebase.

## Usage

```bash
episteme assert "$0" "$1" "$2" --source "claude-session"
```

Where:

- `$0` = subject (e.g., "AuthSystem")
- `$1` = predicate (e.g., "uses")
- `$2` = object (e.g., "JWT with 24h expiration")
````


### Output Format

Design CLI output for AI consumption:

```bash
$ episteme query --subject AuthSystem --lens recency

{
  "assertions": [
    {
      "id": "blake3:abc123...",
      "subject": "AuthSystem",
      "predicate": "uses",
      "object": "JWT",
      "confidence": 0.95,
      "source": "code-review-2024-01",
      "timestamp": "2024-01-15T10:30:00Z"
    }
  ],
  "lens": "recency",
  "conflicts": []
}
```
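
As an illustration of producing that shape, the sketch below renders the sample response using only the standard library. It is not the actual `episteme-cli` serializer: the `Assertion` struct, `json_escape`, and `render_query_result` are hypothetical names, and a real implementation would more likely use a JSON library. The `conflicts` array is always emitted empty here.

```rust
/// One assertion row, mirroring the fields in the sample output.
struct Assertion {
    id: String,
    subject: String,
    predicate: String,
    object: String,
    confidence: f64,
    source: String,
    timestamp: String,
}

/// Escape characters that must not appear raw inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

fn str_field(name: &str, value: &str) -> String {
    format!("\"{}\":\"{}\"", name, json_escape(value))
}

impl Assertion {
    /// Compact JSON object. Assumes `confidence` is finite (NaN is not valid JSON).
    fn to_json(&self) -> String {
        let fields = [
            str_field("id", &self.id),
            str_field("subject", &self.subject),
            str_field("predicate", &self.predicate),
            str_field("object", &self.object),
            format!("\"confidence\":{}", self.confidence),
            str_field("source", &self.source),
            str_field("timestamp", &self.timestamp),
        ];
        format!("{{{}}}", fields.join(","))
    }
}

/// Wrap the rows in the top-level envelope: assertions, lens, conflicts.
fn render_query_result(lens: &str, assertions: &[Assertion]) -> String {
    let rows: Vec<String> = assertions.iter().map(|a| a.to_json()).collect();
    format!(
        "{{\"assertions\":[{}],\"lens\":\"{}\",\"conflicts\":[]}}",
        rows.join(","),
        json_escape(lens)
    )
}

/// The assertion from the sample output, with a shortened id.
fn sample() -> Assertion {
    Assertion {
        id: "blake3:abc123".to_string(),
        subject: "AuthSystem".to_string(),
        predicate: "uses".to_string(),
        object: "JWT".to_string(),
        confidence: 0.95,
        source: "code-review-2024-01".to_string(),
        timestamp: "2024-01-15T10:30:00Z".to_string(),
    }
}

fn main() {
    println!("{}", render_query_result("recency", &[sample()]));
}
```

A stable field order and a single top-level object keep the output easy for an assistant to parse from a shell tool call.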

## Part 5: A2A Protocol (Agent-to-Agent)

### What is A2A?

A2A is an open protocol for agent-to-agent communication, originally developed by Google and now under Linux Foundation governance. It is complementary to MCP:

- MCP: Agent → Tool communication
- A2A: Agent → Agent communication

### When to Use A2A

Use A2A when you want:

- Multiple AI agents collaborating on a task
- Episteme acting as a "memory agent" that other agents consult
- Cross-vendor agent ecosystems (Claude ↔ Gemini ↔ GPT agents)

### A2A Architecture

┌──────────────────┐     A2A      ┌──────────────────┐
│  Claude Agent    │◄────────────►│  Episteme Agent  │
│  (Coding tasks)  │              │  (Knowledge)     │
└──────────────────┘              └──────────────────┘
         │                                 │
         │ A2A                             │
         ▼                                 ▼
┌──────────────────┐              ┌──────────────────┐
│  Gemini Agent    │              │  StemeDB         │
│  (Review tasks)  │              │                  │
└──────────────────┘              └──────────────────┘

### Agent Card (Discovery)

Agents advertise capabilities via JSON "Agent Cards". The card below is a simplified illustration; the A2A specification defines the authoritative schema:

```json
{
  "name": "episteme-memory",
  "description": "Knowledge graph memory for AI agents",
  "version": "0.1.0",
  "capabilities": [
    "store_assertion",
    "query_knowledge",
    "resolve_conflicts"
  ],
  "endpoint": "https://episteme.local/a2a",
  "auth": {
    "type": "bearer"
  }
}
```

### Task Lifecycle

A2A uses task-oriented communication:

1. Client agent discovers Episteme via Agent Card
2. Sends task: "Remember that AuthSystem uses JWT"
3. Episteme agent processes, returns task status
4. Long-running tasks use SSE streaming for updates
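
The lifecycle above can be modeled as a small state machine. The sketch below is a deliberately reduced illustration: the four `TaskState` variants and the `Task` struct are assumptions made for this example, the A2A specification defines a richer set of states, and the `updates` vector only stands in for messages that would be streamed to the client.

```rust
/// Simplified task states for illustration; not the full A2A state set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskState {
    Submitted,
    Working,
    Completed,
    Failed,
}

/// Hypothetical task record held by the Episteme agent.
#[derive(Debug)]
struct Task {
    id: u64,
    request: String,
    state: TaskState,
    /// Status messages that would be streamed to the client.
    updates: Vec<String>,
}

impl Task {
    fn new(id: u64, request: &str) -> Self {
        Task {
            id,
            request: request.to_string(),
            state: TaskState::Submitted,
            updates: Vec::new(),
        }
    }

    /// Move to `next` if the transition is allowed, recording an update.
    fn transition(&mut self, next: TaskState) -> Result<(), String> {
        use TaskState::*;
        let allowed = matches!(
            (self.state, next),
            (Submitted, Working) | (Working, Completed) | (Working, Failed)
        );
        if !allowed {
            return Err(format!("invalid transition {:?} -> {:?}", self.state, next));
        }
        self.state = next;
        self.updates.push(format!("task {} is now {:?}", self.id, next));
        Ok(())
    }
}

fn main() {
    let mut task = Task::new(1, "Remember that AuthSystem uses JWT");
    println!("received: {}", task.request);
    task.transition(TaskState::Working).unwrap();
    task.transition(TaskState::Completed).unwrap();
    for update in &task.updates {
        println!("{update}");
    }
}
```

Rejecting invalid transitions in one place keeps terminal states (`Completed`, `Failed`) final, which is what a client polling or streaming task status would expect.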

### A2A vs MCP

| Aspect | MCP | A2A |
|--------|-----|-----|
| Communication | Tool invocation | Task delegation |
| Statefulness | Stateful sessions | Task-based state |
| Discovery | Client config | Agent Cards |
| Use case | AI → External tools | AI → AI collaboration |
| Opacity | Tools exposed | Internal state hidden |

## Part 6: Recommended Implementation Strategy

### Phase 1: CLI + Skills (Start Here)

1. Build `episteme-cli` as a standalone Rust binary
2. Create skills that wrap CLI commands
3. Add usage instructions to `AGENTS.md`

```bash
# Install
cargo install --path crates/episteme-cli

# Skills location
mkdir -p ~/.claude/skills/episteme-query
mkdir -p ~/.claude/skills/episteme-remember
```

### Phase 2: Context Integration

1. Create `AGENTS.md` with Episteme documentation
2. Symlink or reference it from `CLAUDE.md` and `GEMINI.md`
3. Version control the context files

### Phase 3: A2A Agent (Optional)

1. Implement Agent Card endpoint
2. Add A2A task handlers for knowledge operations
3. Deploy as a service for multi-agent scenarios

### Phase 4: MCP Server (If Needed)

Only if you need:

- Dynamic tool discovery (tools that change at runtime)
- Resource subscriptions (real-time updates)
- Deep IDE integration beyond skills

## Part 7: Episteme Skills Library

### Core Skills

#### `/episteme-query` - Query Knowledge

````yaml
---
name: episteme-query
description: Query the knowledge graph. Use before making changes to understand existing knowledge.
allowed-tools: Bash(episteme *)
---

Query Episteme for existing knowledge about a subject.

## Usage

```bash
episteme query --subject "$ARGUMENTS" --lens recency
```

## Lenses

- `recency` - Most recent assertions win
- `consensus` - Community agreement
- `authority` - Trusted sources weighted higher

## Example

```bash
episteme query --subject "PaymentService" --lens authority
```
````
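
To make the three lenses in the `episteme-query` skill concrete, the sketch below shows one way each could pick a winner among competing assertions. This is an assumption about the semantics, not Episteme's implementation: the `Assertion` fields, the `trust` weight table, the zero weight for unknown sources, and the scoring rules are all hypothetical.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical, simplified assertion: only the fields the lenses need.
#[derive(Debug, Clone)]
struct Assertion {
    object: String,
    source: String,
    timestamp: u64, // seconds since epoch, for easy comparison
}

#[derive(Debug, Clone, Copy)]
enum Lens {
    Recency,
    Consensus,
    Authority,
}

/// Pick the winning object under a lens.
/// `trust` maps a source name to a weight; unknown sources weigh 0.0.
fn resolve<'a>(
    lens: Lens,
    assertions: &'a [Assertion],
    trust: &HashMap<&str, f64>,
) -> Option<&'a str> {
    match lens {
        // Most recent assertion wins.
        Lens::Recency => assertions
            .iter()
            .max_by_key(|a| a.timestamp)
            .map(|a| a.object.as_str()),
        // The object asserted most often wins.
        Lens::Consensus => {
            let mut votes: BTreeMap<&str, usize> = BTreeMap::new();
            for a in assertions {
                *votes.entry(a.object.as_str()).or_insert(0) += 1;
            }
            votes.into_iter().max_by_key(|&(_, n)| n).map(|(obj, _)| obj)
        }
        // The object with the highest summed source trust wins.
        Lens::Authority => {
            let mut score: BTreeMap<&str, f64> = BTreeMap::new();
            for a in assertions {
                let w = trust.get(a.source.as_str()).copied().unwrap_or(0.0);
                *score.entry(a.object.as_str()).or_insert(0.0) += w;
            }
            score
                .into_iter()
                .max_by(|x, y| x.1.total_cmp(&y.1))
                .map(|(obj, _)| obj)
        }
    }
}

/// Made-up data in which each lens selects a different object.
fn sample() -> (Vec<Assertion>, HashMap<&'static str, f64>) {
    let a = |object: &str, source: &str, timestamp: u64| Assertion {
        object: object.to_string(),
        source: source.to_string(),
        timestamp,
    };
    let assertions = vec![
        a("SQLite", "claude-session", 100),
        a("SQLite", "claude-session", 200),
        a("PostgreSQL", "arch-review", 150),
        a("MySQL", "code-analysis", 300),
    ];
    let trust = HashMap::from([
        ("claude-session", 0.2),
        ("arch-review", 0.9),
        ("code-analysis", 0.5),
    ]);
    (assertions, trust)
}

fn main() {
    let (assertions, trust) = sample();
    for lens in [Lens::Recency, Lens::Consensus, Lens::Authority] {
        println!("{:?} -> {:?}", lens, resolve(lens, &assertions, &trust));
    }
}
```

With the sample data each lens returns a different answer, which is why a skill should state the lens it used when reporting a result.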


#### `/episteme-remember` - Store Knowledge

````yaml
---
name: episteme-remember
description: Store a learning in the knowledge graph. Use after discovering something important.
disable-model-invocation: false
allowed-tools: Bash(episteme *)
---

Store important learnings in Episteme.

## Usage

```bash
episteme assert "$0" "$1" "$2" --source "claude-session" --confidence 0.9
```

## Arguments

- `$0`: Subject (what the assertion is about)
- `$1`: Predicate (the relationship)
- `$2`: Object (the value or target)

## Examples

```bash
# Architecture decision
episteme assert "UserService" "database" "PostgreSQL" --source "arch-review"

# Pattern discovery
episteme assert "ErrorHandling" "pattern" "Result<T, AppError>" --source "code-analysis"
```
````


#### `/episteme-conflicts` - Find Conflicts

````yaml
---
name: episteme-conflicts
description: Find conflicting assertions about a subject. Use when information seems contradictory.
allowed-tools: Bash(episteme *)
---

Find conflicting knowledge that needs resolution.

## Usage

```bash
episteme conflicts --subject "$ARGUMENTS"
```

## Output

Returns pairs of conflicting assertions with:

- Both assertion details
- Confidence levels
- Sources
- Suggested resolution strategy
````
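
A conflict report of this kind could be computed as sketched below. The rule used here is an assumption for illustration, not Episteme's definition: two assertions conflict when they share a subject and predicate but name different objects. The `suggest_resolution` heuristic, its 0.2 threshold, and its strategy labels are likewise hypothetical.

```rust
/// Hypothetical assertion shape, following the fields used by `episteme assert`.
#[derive(Debug, Clone, PartialEq)]
struct Assertion {
    subject: String,
    predicate: String,
    object: String,
    confidence: f64,
    source: String,
}

/// Return every pair of assertions about `subject` that share a predicate
/// but disagree on the object.
fn find_conflicts<'a>(
    assertions: &'a [Assertion],
    subject: &str,
) -> Vec<(&'a Assertion, &'a Assertion)> {
    let relevant: Vec<&Assertion> = assertions
        .iter()
        .filter(|a| a.subject == subject)
        .collect();
    let mut pairs = Vec::new();
    for (i, a) in relevant.iter().enumerate() {
        for b in &relevant[i + 1..] {
            if a.predicate == b.predicate && a.object != b.object {
                pairs.push((*a, *b));
            }
        }
    }
    pairs
}

/// Illustrative heuristic: a clear confidence gap favors the stronger
/// assertion; otherwise the pair is flagged for review.
fn suggest_resolution(a: &Assertion, b: &Assertion) -> &'static str {
    if (a.confidence - b.confidence).abs() > 0.2 {
        "prefer-higher-confidence"
    } else {
        "needs-review"
    }
}

/// Made-up data: two disagreeing "database" assertions and one unrelated one.
fn sample() -> Vec<Assertion> {
    let a = |predicate: &str, object: &str, confidence: f64, source: &str| Assertion {
        subject: "UserService".to_string(),
        predicate: predicate.to_string(),
        object: object.to_string(),
        confidence,
        source: source.to_string(),
    };
    vec![
        a("database", "PostgreSQL", 0.9, "arch-review"),
        a("database", "MySQL", 0.5, "claude-session"),
        a("language", "Rust", 1.0, "code-analysis"),
    ]
}

fn main() {
    let assertions = sample();
    for (a, b) in find_conflicts(&assertions, "UserService") {
        println!(
            "{} ({}, {}) vs {} ({}, {}): {}",
            a.object, a.source, a.confidence,
            b.object, b.source, b.confidence,
            suggest_resolution(a, b)
        );
    }
}
```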


---

## Part 8: Cross-Platform Configuration

### Universal Setup

```
project/
├── AGENTS.md              # Open standard, works everywhere
├── CLAUDE.md              # References AGENTS.md + Claude extras
├── .gemini/
│   └── GEMINI.md          # References AGENTS.md + Gemini extras
├── .claude/
│   └── skills/
│       ├── episteme-query/
│       │   └── SKILL.md
│       └── episteme-remember/
│           └── SKILL.md
└── .codex/
    └── skills/            # Symlink to .claude/skills/
```


### AGENTS.md Template

````markdown
# AGENTS.md

## Overview
[Project description]

## Knowledge System

This project uses **Episteme** for persistent AI memory.

### Querying Knowledge
Before making significant changes, query existing knowledge:
```bash
episteme query --subject <topic> --lens recency
```

### Storing Learnings
After discovering important patterns or decisions:
```bash
episteme assert <subject> <predicate> <object>
```

### Resolving Conflicts
When encountering contradictory information:
```bash
episteme conflicts --subject <topic>
```

## Build & Test
[Your build commands]

## Architecture
[Architecture overview]
````


---

## References

### Agent Skills
- [Agent Skills Specification](https://agentskills.io)
- [Claude Code Skills](https://code.claude.com/docs/en/skills)
- [Codex Skills](https://developers.openai.com/codex/skills/)

### Context Files
- [AGENTS.md Specification](https://agents.md/)
- [AGENTS.md on GitHub](https://github.com/openai/codex/blob/main/docs/agents_md.md)
- [Claude Code Memory](https://code.claude.com/docs/en/memory)

### A2A Protocol
- [A2A Protocol Specification](https://a2a-protocol.org/latest/)
- [A2A GitHub](https://github.com/a2aproject/A2A)
- [Linux Foundation Announcement](https://www.linuxfoundation.org/press/linux-foundation-launches-the-agent2agent-protocol-project)

### CLI Integration
- [Using Gemini CLI as Claude Subagent](https://aicodingtools.blog/en/claude-code/gemini-cli-as-subagent-of-claude-code)
- [Claude Code + Gemini CLI Integration](https://gist.github.com/AndrewAltimit/fc5ba068b73e7002cbe4e9721cebb0f5)

### MCP (Reference)
- [MCP Specification](https://modelcontextprotocol.io/specification/2025-11-25)
- [Rust MCP SDK](https://github.com/modelcontextprotocol/rust-sdk)
