jordan 3cfaa1e1d3 feat: Complete Phase 1 (The Spine) - storage foundation

2026-01-31 14:15:34 -07:00

Agile AI Agent Team: Knowledge Coordination

Tier: Production-Ready

Pillars Used: First-Class Contradiction, Invalidation Cascades, Multi-Signature Consensus, Semantic Decay

Postgres Test: FAILED - Lifecycle stages require application-level state machines; time-travel needs temporal tables with complex joins; query audit trails don't exist natively; epoch supersession requires recursive invalidation logic

The Catastrophe (Without Episteme)

I watched a production outage take down auth for 47 minutes because an AI agent deployed the wrong JWT configuration.

Here's what happened: Our team uses AI agents for development—a Lead Orchestrator coordinates specialists for research, implementation, and deployment. The deployment agent queried our knowledge base for "current JWT signing algorithm" and got "ES256."

It deployed with confidence. Tests passed. CI went green.

The auth service expected RS256. Every token validation failed. At 3am, the pager fired.

During the post-mortem, someone asked: "Why did the agent think ES256 was correct?"

Silence.

We dug through the knowledge base. Found an RFC from the security team proposing ES256 migration. Found Slack messages discussing it. Found a doc that said "we should use ES256" in future tense. The knowledge base had no distinction between "proposed" and "approved." The most recent entry was the RFC—a proposal, not a decision.

The agent queried, got the proposal, treated it as truth, deployed.

The failure mode: Traditional databases store information without lifecycle state. Proposals look like decisions. Discussions look like conclusions. When an AI agent queries "what is X?", it gets whatever is most recent—whether that's a decision, a debate, or a rejected idea.

The Scenario

An agile development team uses AI agents to coordinate work across auth migrations, feature flag rollouts, deployment configurations, and research. The agents need a shared knowledge base that can:

Distinguish proposals from approved decisions
Track what agents believed at any point in time (for incident investigation)
Log every query for audit trails
Handle paradigm shifts when standards change (v1 → v2 migrations)
Weight expert review above raw discovery

The team consists of:

Lead Orchestrator: Routes work, needs definitive current-state answers
Implementation Agent: Writes code, needs approved patterns only
Research Agent: Ingests docs, papers, discussions—often conflicting
Human Supervisor: Reviews agent decisions, needs to trace reasoning
On-Call SRE: Investigates incidents, needs time-travel debugging

Feature 1: Lifecycle Stage (Proposed vs. Approved)

The Failure Mode

The Research Agent ingests an RFC proposing ES256 for JWT signing. The Implementation Agent queries "JWT signing algorithm" and gets ES256—even though it was never approved, just proposed.

In Postgres, both "proposed" and "approved" are just text values in a status column. Nothing in the schema forces a reader to filter on that column, so an agent can easily query without it and get no warning.

The Postgres Attempt

-- Track lifecycle with status column
CREATE TABLE assertions (
    id SERIAL PRIMARY KEY,
    subject VARCHAR(100),
    predicate VARCHAR(100),
    value TEXT,
    source_url TEXT,
    status VARCHAR(20),  -- 'proposed', 'under_review', 'approved', 'deprecated'
    created_by VARCHAR(100),
    created_at TIMESTAMPTZ,
    confidence DECIMAL
);

-- Query: "What's the approved JWT algorithm?"
SELECT value, confidence
FROM assertions
WHERE subject = 'auth/jwt'
  AND predicate = 'signing_algorithm'
  AND status = 'approved'
ORDER BY created_at DESC
LIMIT 1;

-- Problem: What if agent forgets AND status = 'approved'?
SELECT value FROM assertions
WHERE subject = 'auth/jwt' AND predicate = 'signing_algorithm'
ORDER BY created_at DESC LIMIT 1;
-- Returns ES256 (the proposal), not RS256 (the approved decision)

Where it breaks:

Status filtering is optional—agents can (and will) forget it
No enforcement that "proposed" and "approved" are mutually exclusive for the same subject
Transition from "proposed" → "approved" requires UPDATE, breaking append-only
No native lens that says "only return approved items"

The Episteme Solution

Lifecycle stage is a first-class field with lens enforcement:

enum LifecycleStage {
    Proposed,
    UnderReview,
    Approved,
    Deprecated,
    Rejected,
}

struct Assertion {
    // ... existing fields
    pub lifecycle: LifecycleStage,
}

# Research Agent ingests RFC as PROPOSED
POST /assert
{
  "subject": "auth/jwt",
  "predicate": "signing_algorithm",
  "object": { "Text": "ES256" },
  "source_hash": "rfc_2024_001...",
  "lifecycle": "Proposed",
  "confidence": 0.75,
  "signatures": [{ "agent_id": "research_agent", ... }]
}

# Later, security lead approves it
POST /assert
{
  "subject": "auth/jwt",
  "predicate": "signing_algorithm",
  "object": { "Text": "ES256" },
  "parent_hash": "rfc_assertion_hash...",  # Links to proposal
  "lifecycle": "Approved",
  "confidence": 0.95,
  "signatures": [{ "agent_id": "security_lead", ... }]
}

Query with lens enforces lifecycle:

# Implementation Agent queries for APPROVED patterns only
GET /query?subject=auth/jwt&predicate=signing_algorithm
    &lens=authority
    &lifecycle=approved

-> Returns RS256 (the old approved decision)
-> Proposal for ES256 is excluded by lifecycle filter

# After approval is recorded:
GET /query?subject=auth/jwt&predicate=signing_algorithm
    &lens=authority
    &lifecycle=approved

-> Returns ES256 (now approved)
-> With provenance showing: proposal → approval chain

Pillar: First-Class Contradiction extended with Lifecycle. Proposals and approvals coexist in the DAG but are distinguished structurally, not by convention.
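
The effect of the lifecycle filter can be sketched in a few lines of Rust. This is a minimal illustration, not the engine's resolver: the `resolve` function, the `sample_log` helper, the trimmed-down `Assertion` fields, and the integer timestamps are all assumptions made for the example.

```rust
/// Lifecycle stages, mirroring the enum shown earlier.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LifecycleStage {
    Proposed,
    UnderReview,
    Approved,
    Deprecated,
    Rejected,
}

/// Only the fields this sketch needs (hypothetical subset).
#[derive(Debug, Clone)]
struct Assertion {
    value: &'static str,
    lifecycle: LifecycleStage,
    timestamp: u64,
}

/// Hypothetical resolver: the newest assertion whose stage matches `filter`.
/// Passing `None` reproduces the unfiltered "most recent wins" behaviour.
fn resolve(log: &[Assertion], filter: Option<LifecycleStage>) -> Option<&Assertion> {
    log.iter()
        .filter(|a| filter.map_or(true, |stage| a.lifecycle == stage))
        .max_by_key(|a| a.timestamp)
}

/// The state of the knowledge base during the incident.
fn sample_log() -> Vec<Assertion> {
    vec![
        Assertion { value: "RS256", lifecycle: LifecycleStage::Approved, timestamp: 100 },
        Assertion { value: "ES256", lifecycle: LifecycleStage::Proposed, timestamp: 200 },
    ]
}

fn main() {
    let log = sample_log();
    let unfiltered = resolve(&log, None).map(|a| a.value);
    let approved = resolve(&log, Some(LifecycleStage::Approved)).map(|a| a.value);
    println!("unfiltered: {:?}", unfiltered);
    println!("approved only: {:?}", approved);
}
```

Without a filter the newer proposal wins; with `Approved` the older decision is returned, which is the behaviour the deployment agent needed.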

Feature 2: Query Audit Trail

The Failure Mode

At 3am, auth is broken. The SRE knows the deployment agent queried for JWT config. But what exactly did it query? What result did it get? What assertions contributed to that result?

In Postgres, queries aren't logged with semantic meaning. You might have access logs, but they show "SELECT * FROM assertions WHERE..." not "deployment-agent asked about auth/jwt at 21:03:47 and got ES256 with 0.87 confidence."

The Postgres Attempt

-- Create query log table
CREATE TABLE query_log (
    id SERIAL PRIMARY KEY,
    agent_id VARCHAR(100),
    query_text TEXT,  -- Raw SQL
    executed_at TIMESTAMPTZ,
    rows_returned INTEGER
);

-- Log every query (trigger? application code?)
-- Problem: What was returned? Need to capture result set

CREATE TABLE query_results (
    query_log_id INTEGER REFERENCES query_log(id),
    result_row JSONB
);

-- Now find what deployment agent queried at 9pm
SELECT ql.*, qr.result_row
FROM query_log ql
JOIN query_results qr ON ql.id = qr.query_log_id
WHERE ql.agent_id = 'deployment-agent'
  AND ql.executed_at BETWEEN '2024-01-15 20:00:00' AND '2024-01-15 22:00:00'
  AND ql.query_text LIKE '%signing_algorithm%';

Where it breaks:

Query logging must be implemented in application code (every client, every language)
Capturing result sets is expensive and complex
No linkage between result rows and contributing source assertions
No confidence score or lens information in the log
"What SQL did they run" is not the same as "what question did they ask"

The Episteme Solution

Query audit is built into the core engine:

struct QueryAudit {
    pub query_id: Hash,
    pub agent_id: AgentId,
    pub timestamp: u64,
    pub subject: EntityId,
    pub predicate: RelationId,
    pub lens: LensType,
    pub lifecycle_filter: Option<LifecycleStage>,
    pub result_hash: Hash,
    pub result_confidence: f32,
    pub contributing_assertions: Vec<ContributingAssertion>,
}

struct ContributingAssertion {
    pub assertion_hash: Hash,
    pub weight: f32,
    pub source_hash: Hash,
}

Every query is automatically logged:

# SRE investigates: what did deployment agent query?
GET /audit/queries?agent=deployment-agent
    &from=2024-01-15T20:00:00Z
    &to=2024-01-15T22:00:00Z

-> Returns:
[
  {
    "query_id": "q_7f3a2b...",
    "timestamp": "2024-01-15T21:03:47Z",
    "subject": "auth/jwt",
    "predicate": "signing_algorithm",
    "lens": "authority",
    "lifecycle_filter": null,  # PROBLEM: agent didn't filter!
    "result": {
      "value": "ES256",
      "confidence": 0.87
    },
    "contributing_assertions": [
      {
        "hash": "rfc_2024_001...",
        "lifecycle": "Proposed",  # Here's the bug
        "weight": 0.9,
        "source": "security-rfc-2024.md"
      },
      {
        "hash": "prod_config_v2...",
        "lifecycle": "Approved",
        "weight": 0.6,
        "source": "production-config.yaml"
      }
    ]
  }
]

The SRE immediately sees:

The agent didn't filter by lifecycle
A proposal outweighed an approved config due to recency
Root cause: missing lifecycle filter + authority lens favoring recent security-team docs

# Trace command for incident investigation
episteme trace --agent deployment-agent \
    --time "6 hours ago" \
    --subject "auth/*"

-> Shows all queries, results, and contributing assertions
-> Sub-500ms response time

Pillar: This extends Multi-Signature Consensus to queries themselves. Every query is a signed, timestamped event with full provenance.
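
One way to picture "built into the core engine" is a store whose only read path also writes the audit record. The sketch below assumes an in-memory `Store`, a simplified `QueryAudit`, and integer timestamps; signing and the weight and confidence fields are left out, and none of these names come from the real engine.

```rust
/// A stored claim; only the fields this sketch needs.
struct Assertion {
    hash: &'static str,
    subject: &'static str,
    predicate: &'static str,
    value: &'static str,
    timestamp: u64,
}

/// Simplified stand-in for the QueryAudit struct shown earlier.
struct QueryAudit {
    agent_id: String,
    timestamp: u64,
    subject: String,
    predicate: String,
    result: Option<String>,
    contributing_assertions: Vec<String>,
}

struct Store {
    assertions: Vec<Assertion>,
    audit_log: Vec<QueryAudit>,
}

impl Store {
    /// Answer a query and append the audit record in the same call,
    /// so a caller cannot read without being logged.
    fn query(&mut self, agent_id: &str, now: u64, subject: &str, predicate: &str) -> Option<&'static str> {
        let matches: Vec<&Assertion> = self
            .assertions
            .iter()
            .filter(|a| a.subject == subject && a.predicate == predicate)
            .collect();
        // Naive "most recent wins" resolution, standing in for a real lens.
        let result = matches.iter().max_by_key(|a| a.timestamp).map(|a| a.value);
        self.audit_log.push(QueryAudit {
            agent_id: agent_id.to_string(),
            timestamp: now,
            subject: subject.to_string(),
            predicate: predicate.to_string(),
            result: result.map(str::to_string),
            contributing_assertions: matches.iter().map(|a| a.hash.to_string()).collect(),
        });
        result
    }

    /// Everything one agent asked within an inclusive time window.
    fn trace(&self, agent_id: &str, from: u64, to: u64) -> Vec<&QueryAudit> {
        self.audit_log
            .iter()
            .filter(|q| q.agent_id == agent_id && (from..=to).contains(&q.timestamp))
            .collect()
    }
}

fn sample_store() -> Store {
    Store {
        assertions: vec![
            Assertion { hash: "prod_config_v2", subject: "auth/jwt", predicate: "signing_algorithm", value: "RS256", timestamp: 100 },
            Assertion { hash: "rfc_2024_001", subject: "auth/jwt", predicate: "signing_algorithm", value: "ES256", timestamp: 200 },
        ],
        audit_log: Vec::new(),
    }
}

fn main() {
    let mut store = sample_store();
    store.query("deployment-agent", 250, "auth/jwt", "signing_algorithm");
    for q in store.trace("deployment-agent", 0, 300) {
        println!("{} asked {}/{} -> {:?} from {:?}",
            q.agent_id, q.subject, q.predicate, q.result, q.contributing_assertions);
    }
}
```

Because logging happens inside `query`, no client library has to remember to do it, which is the gap the Postgres attempt leaves open.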

Feature 3: Time-Travel Queries

The Failure Mode

The SRE has rolled back the bad config. Production is stable. Now they need to understand: what was the state of knowledge at 9pm when the agent made its decision?

The current state is useless—it reflects the correction. They need historical state.

The Postgres Attempt

-- Hand-rolled validity columns (Postgres has no built-in SQL:2011 system-versioned tables)
CREATE TABLE assertions (
    -- ... columns
    valid_from TIMESTAMPTZ DEFAULT NOW(),
    valid_to TIMESTAMPTZ DEFAULT 'infinity'
);

-- Query state at specific time
SELECT * FROM assertions
WHERE subject = 'auth/jwt'
  AND predicate = 'signing_algorithm'
  AND valid_from <= '2024-01-15 21:00:00'
  AND valid_to > '2024-01-15 21:00:00';

Where it breaks:

Temporal tables require schema changes and careful valid_from/valid_to management
Queries become complex (every WHERE needs temporal bounds)
No built-in lens application at historical point—just raw rows
Confidence decay at historical point requires application logic
No native "what-if" branching (what if we had known X?)

The Episteme Solution

The Merkle DAG is inherently temporal. Every assertion has a timestamp. Queries accept an as_of parameter:

# What did we believe at 9pm?
GET /query?subject=auth/jwt&predicate=signing_algorithm
    &lens=authority
    &as_of=2024-01-15T21:00:00Z

-> Returns ES256 (the state at that moment)
-> Shows which assertions existed then
-> Applies lens as it would have been applied then

# Compare to current state
GET /query?subject=auth/jwt&predicate=signing_algorithm
    &lens=authority

-> Returns RS256 (post-correction)

Time-travel is first-class:

# What changed in the last 24 hours?
GET /diff?subject=auth/jwt
    &from=2024-01-14T21:00:00Z
    &to=2024-01-15T21:00:00Z

-> Returns:
{
  "added": [
    { "hash": "rfc_2024_001...", "lifecycle": "Proposed", "value": "ES256" }
  ],
  "superseded": [],
  "confidence_changed": []
}

Pillar: Semantic Decay extended to full temporal queries. The DAG preserves all history; time-travel is O(log n) via hash lookups.
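
Over an append-only log, `as_of` and `/diff` reduce to filtering by timestamp before resolving. The sketch below assumes a flat in-memory log rather than the Merkle DAG, integer timestamps, and the same unfiltered "most recent wins" resolution the agent used during the incident. The `incident_correction` hash is hypothetical.

```rust
/// Only the fields this sketch needs.
struct Assertion {
    hash: &'static str,
    value: &'static str,
    timestamp: u64,
}

/// What would a query have returned at `as_of`?
/// Assertions written later are simply invisible.
fn query_as_of(log: &[Assertion], as_of: u64) -> Option<&Assertion> {
    log.iter()
        .filter(|a| a.timestamp <= as_of)
        .max_by_key(|a| a.timestamp)
}

/// Hashes of assertions added in the half-open window (from, to].
fn added_between(log: &[Assertion], from: u64, to: u64) -> Vec<&'static str> {
    log.iter()
        .filter(|a| a.timestamp > from && a.timestamp <= to)
        .map(|a| a.hash)
        .collect()
}

fn sample_log() -> Vec<Assertion> {
    vec![
        Assertion { hash: "prod_config_v2", value: "RS256", timestamp: 100 },
        Assertion { hash: "rfc_2024_001", value: "ES256", timestamp: 200 },
        // Hypothetical post-incident correction re-asserting RS256.
        Assertion { hash: "incident_correction", value: "RS256", timestamp: 300 },
    ]
}

fn main() {
    let log = sample_log();
    println!("at t=250: {:?}", query_as_of(&log, 250).map(|a| a.value));
    println!("now:      {:?}", query_as_of(&log, u64::MAX).map(|a| a.value));
    println!("added in (150, 250]: {:?}", added_between(&log, 150, 250));
}
```

Nothing is rewritten to answer the historical question; the log is read with a cutoff, which is why the correction does not erase what the agent saw at 9pm.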

Feature 4: Paradigm Shifts (Epochs)

The Failure Mode

The security team decides to migrate from RS256 to ES256. This isn't a simple "update one value"—it affects:

JWT signing configuration
Key management procedures
Token validation logic
Session handling
47 related assertions about auth

In Postgres, you'd need to update 47 rows. Or create 47 new rows and mark the old ones deprecated. Either way, it's O(n) writes and error-prone.

The Postgres Attempt

-- Create epoch tracking
CREATE TABLE epochs (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    description TEXT,
    supersedes_epoch_id INTEGER REFERENCES epochs(id),
    supersession_type VARCHAR(20),  -- 'invalidate', 'temporal', 'requires_review'
    created_at TIMESTAMPTZ
);

ALTER TABLE assertions ADD COLUMN epoch_id INTEGER REFERENCES epochs(id);

-- Migrate to new epoch: mark 47 assertions as superseded
UPDATE assertions
SET epoch_id = (SELECT id FROM epochs WHERE name = 'pre-es256-migration')
WHERE subject LIKE 'auth/%';

-- Insert new assertions under new epoch
INSERT INTO assertions (subject, predicate, value, epoch_id, ...)
SELECT subject, predicate, new_value, new_epoch_id, ...
FROM migration_mapping;

Where it breaks:

UPDATE breaks append-only semantics
No atomic epoch transition (partial failures leave inconsistent state)
Queries must manually filter by epoch (WHERE epoch_id = current_epoch)
"RequiresReview" supersession has no enforcement—it's just a label
Historical queries need complex epoch chain traversal

The Episteme Solution

Epochs are first-class with atomic supersession:

struct Epoch {
    pub id: EpochId,
    pub name: String,
    pub superseded_by: Option<EpochId>,
    pub supersession_type: SupersessionType,
    pub effective_date: u64,
}

enum SupersessionType {
    /// Old assertions are considered incorrect
    Invalidate,
    /// Old assertions were true for their time, new are true now
    Temporal,
    /// Old assertions require human review before use
    RequiresReview,
    /// Old and new coexist (additive extension)
    Additive,
}

# Create new epoch for ES256 migration
POST /epoch
{
  "name": "auth-es256-migration",
  "supersedes": "auth-rs256-era",
  "supersession_type": "Temporal",
  "effective_date": "2024-02-01T00:00:00Z"
}

# New assertions automatically belong to new epoch
POST /assert
{
  "subject": "auth/jwt",
  "predicate": "signing_algorithm",
  "object": { "Text": "ES256" },
  "epoch": "auth-es256-migration",
  "lifecycle": "Approved",
  "confidence": 0.95
}

Queries automatically respect epoch boundaries:

# Default: returns current epoch data
GET /query?subject=auth/jwt&predicate=signing_algorithm&lens=authority
-> Returns ES256 (from auth-es256-migration epoch)

# Historical: what was true in old epoch?
GET /query?subject=auth/jwt&predicate=signing_algorithm
    &lens=authority
    &epoch=auth-rs256-era
-> Returns RS256

# See what would be affected by epoch change
GET /epoch/impact?old=auth-rs256-era&new=auth-es256-migration
-> Returns:
{
  "affected_assertions": 47,
  "by_subject": {
    "auth/jwt": 12,
    "auth/keys": 8,
    "auth/validation": 15,
    "auth/session": 12
  },
  "supersession_type": "Temporal"
}

Pillar: Invalidation Cascades at epoch granularity. O(1) supersession of entire paradigms without O(n) writes.
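
The O(1) claim rests on supersession being a single pointer write on the epoch rather than a write per assertion. A minimal sketch, assuming an in-memory `Store`, string epoch names, and a supersession chain with no cycles (all hypothetical):

```rust
use std::collections::HashMap;

struct Assertion {
    epoch: &'static str,
    value: &'static str,
}

#[derive(Default)]
struct Store {
    /// epoch name -> the epoch that superseded it
    superseded_by: HashMap<&'static str, &'static str>,
    assertions: Vec<Assertion>,
}

impl Store {
    /// One map insert, regardless of how many assertions the old epoch holds.
    fn supersede(&mut self, old: &'static str, new: &'static str) {
        self.superseded_by.insert(old, new);
    }

    /// Follow the chain to the newest epoch (assumes the chain is acyclic).
    fn current_epoch(&self, start: &'static str) -> &'static str {
        let mut epoch = start;
        while let Some(&next) = self.superseded_by.get(epoch) {
            epoch = next;
        }
        epoch
    }

    /// Values asserted within one epoch.
    fn query(&self, epoch: &str) -> Vec<&'static str> {
        self.assertions
            .iter()
            .filter(|a| a.epoch == epoch)
            .map(|a| a.value)
            .collect()
    }
}

fn sample_store() -> Store {
    let mut store = Store::default();
    store.assertions.push(Assertion { epoch: "auth-rs256-era", value: "RS256" });
    store.supersede("auth-rs256-era", "auth-es256-migration");
    store.assertions.push(Assertion { epoch: "auth-es256-migration", value: "ES256" });
    store
}

fn main() {
    let store = sample_store();
    let current = store.current_epoch("auth-rs256-era");
    println!("current epoch: {current}");
    println!("current value: {:?}", store.query(current));
    println!("old epoch:     {:?}", store.query("auth-rs256-era"));
}
```

The old assertions are never modified, so querying the old epoch by name still returns RS256.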

Feature 5: Expert Weighting (Authority Lens)

The Failure Mode

A junior developer discovers a Stack Overflow answer suggesting JWT rotation every 15 minutes. A senior security engineer reviews and adds context: "That's for high-security contexts; our standard is daily rotation."

In Postgres, both are just rows. The engineer's expertise isn't structurally encoded.

The Postgres Attempt

CREATE TABLE agent_reputation (
    agent_id VARCHAR(100) PRIMARY KEY,
    reputation_score DECIMAL,
    domain VARCHAR(50)  -- 'security', 'infrastructure', 'frontend'
);

-- Query with reputation weighting
SELECT a.*,
    a.confidence * ar.reputation_score AS weighted_confidence
FROM assertions a
JOIN agent_reputation ar ON a.created_by = ar.agent_id
WHERE a.subject = 'auth/jwt/rotation'
ORDER BY weighted_confidence DESC;

Where it breaks:

Reputation weighting logic duplicated across every query pattern
No cryptographic proof of who asserted what
Reputation scores are mutable—historical queries return different results
Domain-specific authority (security engineer vs. frontend dev) requires application logic

The Episteme Solution

Multi-signature with reputation is built into the resolution engine:

# Junior dev ingests Stack Overflow finding
POST /assert
{
  "subject": "auth/jwt",
  "predicate": "rotation_interval",
  "object": { "Duration": "15m" },
  "source_hash": "stackoverflow_12345...",
  "confidence": 0.6,
  "signatures": [{
    "agent_id": "junior_dev_pubkey",
    "signature": "ed25519...",
    "timestamp": 1706745600
  }]
}

# Senior security engineer co-signs with context
POST /cosign
{
  "assertion_hash": "junior_dev_assertion...",
  "context": "Valid for high-security; our standard is daily",
  "signatures": [{
    "agent_id": "security_lead_pubkey",
    "signature": "ed25519...",
    "timestamp": 1706832000
  }]
}

# And asserts the standard
POST /assert
{
  "subject": "auth/jwt",
  "predicate": "rotation_interval",
  "object": { "Duration": "24h" },
  "source_hash": "internal_security_policy...",
  "confidence": 0.95,
  "signatures": [{
    "agent_id": "security_lead_pubkey",
    "signature": "ed25519...",
    "timestamp": 1706832100
  }]
}

Authority lens applies domain-weighted reputation:

GET /query?subject=auth/jwt&predicate=rotation_interval
    &lens=authority
    &domain=security

-> Returns: 24h
-> security_lead has reputation 0.95 in security domain
-> junior_dev has reputation 0.4 in security domain
-> Authority lens weights accordingly

GET /query?subject=auth/jwt&predicate=rotation_interval
    &lens=consensus

-> Returns: 15m (more documents mention this)
-> But with context: "Higher-weighted dissent from security_lead"

Pillar: Multi-Signature Consensus with domain-specific authority. Signatures are cryptographic, immutable, and automatically weighted.
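
The difference between the two lenses can be sketched as two scoring rules over the same assertions. The source does not give the weighting formula, so the rule here (stated confidence multiplied by the signer's domain reputation, summed per value) is an assumption, as are the `research_agent` entry and its 0.5 reputation.

```rust
use std::collections::HashMap;

struct Assertion {
    value: &'static str,
    agent: &'static str,
    confidence: f32,
}

/// Hypothetical authority lens: sum of confidence * domain reputation per value.
fn authority_lens(
    assertions: &[Assertion],
    reputation: &HashMap<&str, HashMap<&str, f32>>,
    domain: &str,
) -> Option<&'static str> {
    let mut totals: HashMap<&'static str, f32> = HashMap::new();
    for a in assertions {
        // Unknown agents or domains contribute nothing.
        let rep = reputation
            .get(domain)
            .and_then(|agents| agents.get(a.agent))
            .copied()
            .unwrap_or(0.0);
        *totals.entry(a.value).or_insert(0.0) += a.confidence * rep;
    }
    totals
        .into_iter()
        .max_by(|x, y| x.1.total_cmp(&y.1))
        .map(|(value, _)| value)
}

/// Hypothetical consensus lens: the value with the most assertions behind it.
fn consensus_lens(assertions: &[Assertion]) -> Option<&'static str> {
    let mut counts: HashMap<&'static str, usize> = HashMap::new();
    for a in assertions {
        *counts.entry(a.value).or_insert(0) += 1;
    }
    counts.into_iter().max_by_key(|entry| entry.1).map(|(value, _)| value)
}

fn sample_assertions() -> Vec<Assertion> {
    vec![
        Assertion { value: "15m", agent: "junior_dev", confidence: 0.6 },
        Assertion { value: "15m", agent: "research_agent", confidence: 0.6 },
        Assertion { value: "24h", agent: "security_lead", confidence: 0.95 },
    ]
}

fn sample_reputation() -> HashMap<&'static str, HashMap<&'static str, f32>> {
    HashMap::from([(
        "security",
        HashMap::from([("junior_dev", 0.4), ("research_agent", 0.5), ("security_lead", 0.95)]),
    )])
}

fn main() {
    let assertions = sample_assertions();
    let reputation = sample_reputation();
    println!("authority: {:?}", authority_lens(&assertions, &reputation, "security"));
    println!("consensus: {:?}", consensus_lens(&assertions));
}
```

With these numbers the two 15m assertions total 0.54 against 0.9025 for the single 24h assertion, so authority picks 24h while a plain count picks 15m.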

Feature 6: Persistent Learning (Negative Constraints + The Gardener)

The Failure Mode: The Optimization Conflict

You correct an agent: "No, don't use the requests library, use axios."

The agent says "Got it!" and uses axios... this time. Next week, new session, new context window—the agent uses requests again. You correct it again. Repeat forever.

This is The Optimization Conflict: agents rely on System Prompts (temporary instructions in the context window) that suffer from "context drift" and "catastrophic amnesia." Your correction lives in the sliding context window; once it slides past, the agent reverts to its base weights.

Current agents are also trained on Golden Trajectories—perfect, error-free examples. When they make mistakes during training, that data is discarded. They never learn "don't do X because it leads to failure."

The Postgres Attempt

-- Store corrections as "rules"
CREATE TABLE agent_rules (
    id SERIAL PRIMARY KEY,
    subject VARCHAR(100),
    predicate VARCHAR(100),  -- 'must_use', 'forbidden'
    value TEXT,
    reason TEXT,
    created_by VARCHAR(100),
    created_at TIMESTAMPTZ
);

-- Query rules before acting
SELECT * FROM agent_rules
WHERE subject = 'Project_X_Http_Client'
  AND predicate IN ('must_use', 'forbidden');

Where it breaks:

Agents can (and will) forget to query rules before acting
No enforcement that rules are checked pre-flight
No automatic reputation adjustment when agents violate rules
Rules decay equally whether used or not—stale rules clutter the system
No "forbidden alternative" with reason for contrastive learning

The Episteme Solution

Step 1: The Write Path—Negative Constraint Assertion

When you correct an agent, the system writes a Signed Assertion with the correction AND the forbidden alternative:

POST /assert
{
  "subject": "Project_X_Http_Client",
  "predicate": "must_use_library",
  "object": { "Text": "axios" },
  "meta": {
    "forbidden_alternative": "requests",
    "reason": "User correction: requests library is deprecated for this project"
  },
  "epoch": "Project_X_Standards_2025",
  "confidence": 1.0,  # Maximum - came from Authority (you)
  "lifecycle": "Approved",
  "signatures": [{ "agent_id": "user_supervisor", ... }]
}

Key difference from Feature 1: This isn't just "what is true"—it's "what is true AND what was wrong." The forbidden_alternative gives the agent a contrastive signal: "axios is right BECAUSE requests is wrong."

Step 2: The Learning Path—The Gardener & TrustRank

Simply storing the correction isn't enough. The agent that made the mistake needs to "feel" the error so it hesitates next time.

The Gardener (a background worker) runs Back-Propagation:

struct GardenerJob {
    pub agent_id: AgentId,
    pub topic: String,
    pub prediction: Value,        // What agent said ("requests")
    pub ground_truth: Value,      // What was correct ("axios")
    pub delta: f32,               // How wrong (-0.3)
}

// The Gardener sees:
// - Agent asserted "requests" (Confidence 0.8)
// - User asserted "axios" (Confidence 1.0)
// Result: Agent's TrustRank for topic "http_libraries" drops by 0.15

Next time this agent tries to pick an HTTP library:

Its confidence is mathematically penalized
It's forced to look for external verification (your correction) rather than guessing
Or the system routes to a different agent with higher TrustRank on this topic
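
The penalty and the re-routing can be sketched as follows. The source does not specify the Gardener's update rule, so this one is an assumption: the penalty is `rate * agent_confidence * truth_confidence`, clamped to [0, 1], with a hypothetical default score of 0.5. The rate of 0.1875 is chosen only so that the example above (0.8 against 1.0) yields the 0.15 drop.

```rust
use std::collections::HashMap;

/// (agent, topic) -> trust score in [0, 1]
type TrustRank = HashMap<(String, String), f32>;

/// Hypothetical starting score for an agent with no history on a topic.
const DEFAULT_TRUST: f32 = 0.5;

/// Lower an agent's trust on a topic after a correction and return the new score.
/// The more confident the wrong claim and the correction, the larger the drop.
fn penalize(
    trust: &mut TrustRank,
    agent: &str,
    topic: &str,
    agent_confidence: f32,
    truth_confidence: f32,
    rate: f32,
) -> f32 {
    let score = trust
        .entry((agent.to_string(), topic.to_string()))
        .or_insert(DEFAULT_TRUST);
    let penalty = rate * agent_confidence * truth_confidence;
    *score = (*score - penalty).clamp(0.0, 1.0);
    *score
}

/// Route to whichever agent currently has the highest trust on the topic.
fn best_agent<'a>(trust: &'a TrustRank, topic: &str) -> Option<&'a str> {
    trust
        .iter()
        .filter(|(key, _)| key.1.as_str() == topic)
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(key, _)| key.0.as_str())
}

fn main() {
    let mut trust = TrustRank::new();
    trust.insert(("reviewer_agent".to_string(), "http_libraries".to_string()), 0.9);
    let after = penalize(&mut trust, "impl_agent", "http_libraries", 0.8, 1.0, 0.1875);
    println!("impl_agent trust after correction: {after:.2}");
    println!("route to: {:?}", best_agent(&trust, "http_libraries"));
}
```

Keying the score by topic keeps the penalty local: a mistake about HTTP libraries does not lower the agent's standing on unrelated subjects.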

Step 3: The Read Path—Lens::Constraints (Pre-Flight Check)

One month later. New session. Empty context window. Agent is about to write code.

The problem: How does the agent know to look up "http library constraints"? It hasn't decided to use an HTTP library yet.

The solution: Lens::Constraints—a pre-flight check that runs BEFORE the agent generates output:

# Agent prepares to write code for "data fetching"
# BEFORE generating, it queries constraints on the domain:

GET /query?context=python_http&lens=constraints

-> Returns:
{
  "constraints": [
    {
      "subject": "Project_X_Http_Client",
      "predicate": "must_use_library",
      "value": "axios",
      "forbidden": "requests",
      "reason": "User correction: requests deprecated",
      "confidence": 1.0,
      "last_verified": "2024-01-15T10:30:00Z"
    }
  ]
}

The Lens::Constraints query is cheap and token-efficient:

It doesn't load the whole project history
It returns only must_use / forbidden predicates for the domain
It acts as a "compiler error" for the agent's intent

Step 4: Resurrection (Decay Reversal)

Facts that aren't used decay over time (Semantic Decay). But what happens when a constraint IS used?

Resurrection: When the agent queries the "axios" constraint and uses it successfully, the system updates last_verified to today. The constraint is "resurrected"—it stays at high confidence.

struct ResurrectionEvent {
    pub assertion_hash: Hash,
    pub queried_at: u64,
    pub used_successfully: bool,
}

// If used_successfully:
// - last_verified = NOW()
// - confidence decay resets
// - Assertion stays in "hot path"

// If NOT used in 6 months:
// - confidence decays toward 0
// - Moves to "cold store"
// - Still queryable for audit, but not returned by default lenses

Result: Your correction from 1 month ago is still at 1.0 confidence because it was used last week. Useless facts ("I'm tired today") have decayed to 0.0 and are ignored.
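
Decay and resurrection can be sketched with a single timestamp. The source does not state the decay curve, so exponential decay with a 30-day half-life is an assumption here, as are the `Fact` struct and the function names.

```rust
const SECONDS_PER_DAY: f64 = 86_400.0;

struct Fact {
    /// Confidence at the moment the fact was last verified.
    confidence: f64,
    /// Unix seconds of the last successful use.
    last_verified: u64,
}

/// Hypothetical decay rule: confidence halves every `half_life_days` of disuse.
fn effective_confidence(fact: &Fact, now: u64, half_life_days: f64) -> f64 {
    let idle_days = now.saturating_sub(fact.last_verified) as f64 / SECONDS_PER_DAY;
    fact.confidence * 0.5_f64.powf(idle_days / half_life_days)
}

/// Resurrection: a successful use resets the decay clock.
fn resurrect(fact: &mut Fact, now: u64) {
    fact.last_verified = now;
}

fn main() {
    let day: u64 = 86_400;
    let mut fact = Fact { confidence: 1.0, last_verified: 0 };

    println!("after 30 idle days: {:.3}", effective_confidence(&fact, 30 * day, 30.0));
    resurrect(&mut fact, 30 * day);
    println!("right after use:    {:.3}", effective_confidence(&fact, 30 * day, 30.0));
    println!("180 idle days later: {:.3}", effective_confidence(&fact, 210 * day, 30.0));
}
```

Only `last_verified` changes on use; the stored confidence is untouched, so a fact that keeps being used never starts decaying while an unused one fades on its own.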

The Complete Flow: 1 Month Later

Day 1: You correct the agent
  -> System writes Negative Constraint Assertion (confidence 1.0)
  -> Gardener penalizes agent's TrustRank on http_libraries

Day 30: New session, agent is about to write code

  Agent: "I need to write a data fetch script"

  # PRE-FLIGHT CHECK (automatic, before generating)
  GET /query?context=python_http&lens=constraints

  -> Returns: { must_use: "axios", forbidden: "requests", reason: "..." }

  Agent: "Got it. Using axios."
  -> Generates code with `import axios`

  # RESURRECTION
  -> System records: constraint was queried and used successfully
  -> last_verified = NOW()
  -> Constraint stays at confidence 1.0 for as long as it keeps being used

The Postgres Gap

This pattern has no native equivalent in Postgres; every piece has to be rebuilt in application code:

No Lens::Constraints: Postgres has no native "pre-flight check" concept
No TrustRank back-propagation: Reputation adjustment requires application logic
No forbidden_alternative: Storing "what not to do" is awkward (separate table? JSON blob?)
No Resurrection: Decay and resurrection require application-level timestamp management
No automatic query interception: You can't force agents to check constraints before acting

Summary: Fixing the Optimization Conflict

| Concern | System Prompt Approach | Episteme Approach |
|---|---|---|
| Correction persists | Only until context window slides | Permanent in DAG |
| Forbidden alternative stored | No | Yes, with reason |
| Agent learns from mistake | No (stateless) | Yes (TrustRank penalty) |
| Pre-flight constraint check | Manual, forgettable | Lens::Constraints (enforced) |
| Old corrections decay | Equally, clogging context | Used facts stay fresh (Resurrection) |
| Token efficiency | Dump whole history | Query only relevant constraints |

Pillar: This combines Semantic Decay (Resurrection), Multi-Signature Consensus (TrustRank back-propagation), and a new lens (Constraints) to move the locus of control from "inference-time instruction" to "training-time intuition."

SDK Integration: ADK-Go

This section shows how AI agents actually integrate with Episteme using Google's Agent Development Kit for Go (ADK-Go). Each perspective agent has specific patterns for tool usage and callback integration.

Tool Definitions

Every Episteme operation is exposed as an ADK-Go tool:

package episteme

import (
    "google.golang.org/adk/tool"
    "google.golang.org/adk/tool/functiontool"
)

// === QUERY TOOL ===
// Used by: All agents

type QueryInput struct {
    Subject       string  `json:"subject" jsonschema:"Entity to query (e.g., auth/jwt)"`
    Predicate     string  `json:"predicate" jsonschema:"Relation to query (e.g., signing_algorithm)"`
    Lens          string  `json:"lens,omitempty" jsonschema:"Resolution: consensus, authority, recency, constraints"`
    Lifecycle     string  `json:"lifecycle,omitempty" jsonschema:"Filter: proposed, under_review, approved, deprecated, rejected"`
    MinConfidence float32 `json:"min_confidence,omitempty" jsonschema:"Minimum confidence threshold (0.0-1.0)"`
    AsOf          string  `json:"as_of,omitempty" jsonschema:"Time-travel: ISO8601 timestamp"`
}

type QueryOutput struct {
    Value      interface{} `json:"value"`
    Confidence float32     `json:"confidence"`
    Lifecycle  string      `json:"lifecycle"`
    Sources    []Source    `json:"sources"`
    QueryID    string      `json:"query_id"`        // CRITICAL: for audit trail
    Error      string      `json:"error,omitempty"` // Set when the query fails
}

type Source struct {
    Hash       string  `json:"hash"`
    SourceHash string  `json:"source_hash"`
    Weight     float32 `json:"weight"`
}

func queryEpisteme(ctx tool.Context, input QueryInput) QueryOutput {
    // Call Episteme API
    result, err := epistemeClient.Query(ctx, input)
    if err != nil {
        return QueryOutput{Error: err.Error()}
    }
    return result
}

// === ASSERT TOOL ===
// Used by: Research Agent, Human Supervisor

type AssertInput struct {
    Subject    string      `json:"subject" jsonschema:"Entity being described"`
    Predicate  string      `json:"predicate" jsonschema:"Relation being asserted"`
    Object     interface{} `json:"object" jsonschema:"Value being claimed"`
    SourceHash string      `json:"source_hash" jsonschema:"BLAKE3 hash of evidence"`
    Confidence float32     `json:"confidence" jsonschema:"Certainty level (0.0-1.0)"`
    Lifecycle  string      `json:"lifecycle,omitempty" jsonschema:"proposed, under_review, approved"`
    ParentHash string      `json:"parent_hash,omitempty" jsonschema:"Hash of assertion being updated"`
    Meta       *AssertMeta `json:"meta,omitempty" jsonschema:"Negative constraints and metadata"`
}

type AssertMeta struct {
    ForbiddenAlternative string `json:"forbidden_alternative,omitempty"`
    Reason               string `json:"reason,omitempty"`
}

type AssertOutput struct {
    Hash    string `json:"hash"`
    Success bool   `json:"success"`
    Error   string `json:"error,omitempty"`
}

// === CONSTRAINT CHECK TOOL ===
// Used by: Implementation Agent (pre-flight)

type ConstraintCheckInput struct {
    Context string `json:"context" jsonschema:"Domain context (e.g., python_http, auth_jwt)"`
}

type ConstraintCheckOutput struct {
    Constraints []Constraint `json:"constraints"`
}

type Constraint struct {
    Subject   string `json:"subject"`
    MustUse   string `json:"must_use,omitempty"`
    Forbidden string `json:"forbidden,omitempty"`
    Reason    string `json:"reason"`
}

// === TRACE TOOL ===
// Used by: On-Call SRE, Human Supervisor

type TraceInput struct {
    AgentID string `json:"agent_id" jsonschema:"Agent to trace"`
    From    string `json:"from" jsonschema:"Start time (ISO8601 or relative like -6h)"`
    To      string `json:"to,omitempty" jsonschema:"End time (default: now)"`
    Subject string `json:"subject,omitempty" jsonschema:"Filter by subject pattern (e.g., auth/*)"`
}

type TraceOutput struct {
    Queries []QueryTrace `json:"queries"`
}

type QueryTrace struct {
    QueryID      string   `json:"query_id"`
    Timestamp    string   `json:"timestamp"`
    Subject      string   `json:"subject"`
    Predicate    string   `json:"predicate"`
    Lens         string   `json:"lens"`
    Result       string   `json:"result"`
    Confidence   float32  `json:"confidence"`
    Contributing []string `json:"contributing_assertions"`
}

// === SUPERSEDE TOOL ===
// Used by: Human Supervisor

type SupersedeInput struct {
    Hash   string `json:"hash" jsonschema:"Assertion hash to supersede"`
    Reason string `json:"reason" jsonschema:"Why this is being superseded"`
    Type   string `json:"type" jsonschema:"Invalidate, Temporal, RequiresReview, Additive"`
}

type SupersedeOutput struct {
    NewHash            string   `json:"new_hash"`
    AffectedAssertions []string `json:"affected_assertions"`
}

Callback Integration

Callbacks are critical for enforcing constraints and maintaining audit trails:

import (
    "encoding/json"
    "fmt"
    "time"

    "google.golang.org/adk/agent"
    "google.golang.org/adk/agent/llmagent"
    "google.golang.org/adk/model"
    "google.golang.org/adk/tool"
)

// Implementation Agent with BeforeToolCallback for constraint checking
implementationAgent, err := llmagent.New(llmagent.Config{
    Name:        "implementation_agent",
    Model:       model,
    Description: "Writes code against current approved patterns",
    Instruction: "You write code. Always use approved patterns only.",
    Tools:       []tool.Tool{queryTool, constraintTool},

    // CRITICAL: Check constraints BEFORE any tool that generates code
    BeforeToolCallback: func(ctx agent.CallbackContext, call *tool.Call) (*tool.Call, error) {
        // If agent is about to write code, check constraints first
        if needsConstraintCheck(call) {
            domain := extractDomainContext(call)  // e.g., "python_http"
            constraints, err := checkConstraints(ctx, domain)
            if err != nil {
                return nil, fmt.Errorf("constraint check failed: %w", err)
            }

            for _, c := range constraints {
                if violatesConstraint(call, c) {
                    return nil, fmt.Errorf("blocked: %s is forbidden - %s",
                        c.Forbidden, c.Reason)
                }
            }
        }
        return call, nil
    },

    // Log decisions for audit trail
    AfterModelCallback: func(ctx agent.CallbackContext, resp *model.LLMResponse) (*model.LLMResponse, error) {
        decision := extractDecision(resp)
        logToEpisteme(ctx, AuditEntry{
            AgentID:   ctx.Agent().Name(),
            Timestamp: time.Now(),
            Decision:  decision,
            SessionID: ctx.Session().ID(),
        })
        return resp, nil
    },
})

// Lead Orchestrator with confidence threshold escalation
leadOrchestrator, err := llmagent.New(llmagent.Config{
    Name:        "lead_orchestrator",
    Model:       model,
    Description: "Coordinates agent team, routes work based on knowledge",
    Instruction: `You coordinate the agent team. Query Episteme for current state.
                  If confidence < 0.8, escalate to human supervisor.`,
    Tools:       []tool.Tool{queryTool, delegateTool},
    OutputKey:   "orchestrator_decision",  // Pass to downstream agents

    // Check confidence scores and escalate if too low
    AfterToolCallback: func(ctx agent.CallbackContext, call *tool.Call, result *tool.Result) (*tool.Result, error) {
        if call.Name == "episteme_query" {
            var queryResult QueryOutput
            if err := json.Unmarshal(result.Output, &queryResult); err != nil {
                return nil, fmt.Errorf("decode episteme_query result: %w", err)
            }

            if queryResult.Confidence < 0.8 {
                // Mark for human review
                ctx.Session().State().Set("needs_human_review", true)
                ctx.Session().State().Set("low_confidence_query", queryResult.QueryID)
            }
        }
        return result, nil
    },
})
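
The escalation rule in the `AfterToolCallback` above can be exercised without any ADK types. The following is a minimal sketch, assuming the query result arrives as JSON with `query_id` and `confidence` fields; the JSON field names and the `needsHumanReview` helper are hypothetical and not part of Episteme or ADK-Go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// QueryOutput mirrors only the two fields the callback reads.
// The JSON tags are assumptions made for this sketch.
type QueryOutput struct {
	QueryID    string  `json:"query_id"`
	Confidence float64 `json:"confidence"`
}

// needsHumanReview decodes a raw tool result and reports whether its
// confidence falls below the threshold, along with the query ID to record.
func needsHumanReview(raw []byte, threshold float64) (bool, string, error) {
	var out QueryOutput
	if err := json.Unmarshal(raw, &out); err != nil {
		return false, "", fmt.Errorf("decode query result: %w", err)
	}
	return out.Confidence < threshold, out.QueryID, nil
}

func main() {
	raw := []byte(`{"query_id":"q-1","confidence":0.75}`)
	review, id, err := needsHumanReview(raw, 0.8)
	fmt.Println(review, id, err)
}
```

Returning the decode error instead of discarding it matters here: a malformed result would otherwise decode to a zero confidence and be indistinguishable from a genuinely low-confidence answer.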

Agent-Specific Patterns

Lead Orchestrator

Fast queries with confidence thresholds for routing decisions:

// Query current auth config with confidence threshold
result := queryEpisteme(ctx, QueryInput{
    Subject:       "auth/jwt",
    Predicate:     "signing_algorithm",
    Lens:          "authority",
    MinConfidence: 0.8,
})

if result.Confidence < 0.8 {
    // Escalate to human - can't route confidently
    ctx.Session().State().Set("escalation_reason",
        fmt.Sprintf("Confidence %.2f below threshold for %s/%s",
            result.Confidence, result.Subject, result.Predicate))
    return escalateToHuman(ctx)
}

// Route to implementation agent with high confidence
ctx.Session().State().Set("auth_config", result.Value)
return delegateTo(ctx, "implementation_agent")

Implementation Agent

Queries approved patterns only, with pre-flight constraint checks:

// CRITICAL: Filter to approved lifecycle ONLY
result := queryEpisteme(ctx, QueryInput{
    Subject:   "auth/jwt",
    Predicate: "signing_algorithm",
    Lens:      "authority",
    Lifecycle: "approved",  // Never use proposed patterns!
})

if result.Lifecycle != "approved" {
    return fmt.Errorf("no approved pattern found for %s/%s",
        "auth/jwt", "signing_algorithm")
}

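// Pre-flight: check for forbidden alternatives
constraints, err := checkConstraints(ctx, "auth_jwt")
if err != nil {
    return fmt.Errorf("constraint check failed: %w", err)
}
for _, c := range constraints {
    // Agent now knows: "use X, don't use Y, because Z"
    // This enables contrastive learning
    fmt.Printf("avoid %s: %s\n", c.Forbidden, c.Reason)
}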
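
The snippets above call `violatesConstraint` without defining it. One way it could work is sketched below, assuming a constraint carries the `Forbidden` and `Reason` fields the callback reads. The `firstViolation` helper and its substring matching are hypothetical; a real check would more likely inspect parsed imports than raw text.

```go
package main

import (
	"fmt"
	"strings"
)

// Constraint carries the fields the callback reads (Forbidden, Reason).
type Constraint struct {
	Forbidden string
	Reason    string
}

// firstViolation returns the first constraint whose forbidden name appears
// in the proposed code. Substring matching is deliberately naive.
func firstViolation(code string, constraints []Constraint) (Constraint, bool) {
	for _, c := range constraints {
		if c.Forbidden != "" && strings.Contains(code, c.Forbidden) {
			return c, true
		}
	}
	return Constraint{}, false
}

func main() {
	constraints := []Constraint{{Forbidden: "requests", Reason: "User correction"}}
	if c, bad := firstViolation("import requests", constraints); bad {
		fmt.Printf("blocked: %s is forbidden - %s\n", c.Forbidden, c.Reason)
	}
}
```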

Research Agent

Stores conflicting information with uncertainty:

// Store first source
assertKnowledge(ctx, AssertInput{
    Subject:    "jwt_rotation",
    Predicate:  "best_practice",
    Object:     "rotate_daily",
    SourceHash: hashURL("https://security.io/jwt-best-practices"),
    Confidence: 0.7,  // Express uncertainty
    Lifecycle:  "proposed",  // Not approved yet
})

// Store contradicting source - don't flatten!
assertKnowledge(ctx, AssertInput{
    Subject:    "jwt_rotation",
    Predicate:  "best_practice",
    Object:     "rotate_hourly",
    SourceHash: hashURL("https://owasp.org/jwt-rotation"),
    Confidence: 0.8,
    Lifecycle:  "proposed",
})

// Let Lead Orchestrator resolve via Lens::Consensus or Lens::Authority
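
`hashURL` is used above but not defined. A minimal sketch follows, assuming the source hash is a digest of the URL string itself rather than of the fetched content. SHA-256 is a stand-in chosen because it is in the Go standard library; the hash function Episteme actually expects is not specified in this section.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashURL returns a hex-encoded SHA-256 digest of the URL string.
// Assumption: the source is identified by its URL, not by its content.
func hashURL(url string) string {
	sum := sha256.Sum256([]byte(url))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Two different sources get two different, stable identifiers,
	// so both contradicting assertions keep distinct provenance.
	fmt.Println(hashURL("https://security.io/jwt-best-practices"))
	fmt.Println(hashURL("https://owasp.org/jwt-rotation"))
}
```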

Human Supervisor

Time-travel queries and corrections with impact analysis:

// What was believed during the incident?
result := queryEpisteme(ctx, QueryInput{
    Subject:   "auth/jwt",
    Predicate: "signing_algorithm",
    AsOf:      "2024-01-15T21:00:00Z",  // Time of incident
})

// Trace agent queries during the incident window
traces := traceQueries(ctx, TraceInput{
    AgentID: "deployment-agent",
    From:    "2024-01-15T20:00:00Z",
    To:      "2024-01-15T22:00:00Z",
    Subject: "auth/*",
})

// Found the bug: agent queried without lifecycle filter
// Correct the record
impact := supersede(ctx, SupersedeInput{
    Hash:   badAssertionHash,
    Reason: "Proposal treated as approved - agent didn't filter by lifecycle",
    Type:   "Invalidate",
})

// impact.AffectedAssertions shows downstream effects
fmt.Printf("Corrected. %d downstream assertions affected.\n",
    len(impact.AffectedAssertions))
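
To make `AffectedAssertions` concrete, the sketch below walks a derivation graph breadth-first from the superseded hash and collects every assertion downstream of it. The in-memory `children` map and the `affected` helper are hypothetical; how Episteme stores and traverses these links is not described here.

```go
package main

import "fmt"

// affected returns every assertion reachable from root through the
// "derived from" links in children, in breadth-first order.
// The root itself is not included.
func affected(children map[string][]string, root string) []string {
	seen := map[string]bool{root: true}
	queue := []string{root}
	var out []string
	for len(queue) > 0 {
		h := queue[0]
		queue = queue[1:]
		for _, c := range children[h] {
			if !seen[c] {
				seen[c] = true
				out = append(out, c)
				queue = append(queue, c)
			}
		}
	}
	return out
}

func main() {
	// "deploy-cfg" and "client-sdk" both derive from the bad RFC assertion;
	// "runbook" derives from both and must only be reported once.
	children := map[string][]string{
		"rfc-es256":  {"deploy-cfg", "client-sdk"},
		"deploy-cfg": {"runbook"},
		"client-sdk": {"runbook"},
	}
	fmt.Println(affected(children, "rfc-es256"))
}
```

The `seen` set is what keeps a diamond-shaped dependency from being counted twice and keeps the walk from looping if the graph ever contains a cycle.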

On-Call SRE

Sub-second trace commands for incident investigation:

// It's 3am. Auth is broken. What happened?

// Step 1: What did deployment agent query? (<500ms required)
traces := traceQueries(ctx, TraceInput{
    AgentID: "deployment-agent",
    From:    "-6h",
    Subject: "auth/*",
})

for _, t := range traces.Queries {
    fmt.Printf("[%s] %s/%s via %s -> %s (conf: %.2f)\n",
        t.Timestamp, t.Subject, t.Predicate,
        t.Lens, t.Result, t.Confidence)

    if len(t.Contributing) > 0 {
        fmt.Printf("  Contributing: %v\n", t.Contributing)
    }
}

// Step 2: What changed in the last 24 hours?
diff := queryDiff(ctx, DiffInput{
    Subject: "auth/jwt",
    From:    "-24h",
})

// Step 3: Mark bad assertion (found via trace)
supersede(ctx, SupersedeInput{
    Hash:   badAssertionHash,
    Reason: "RFC proposal incorrectly treated as approved config",
    Type:   "Invalidate",
})

// Step 4: Verify fix
result := queryEpisteme(ctx, QueryInput{
    Subject:   "auth/jwt",
    Predicate: "signing_algorithm",
    Lifecycle: "approved",
})
fmt.Printf("Current approved value: %s\n", result.Value)

Multi-Agent Pipeline Example

Complete example showing agents coordinating through Episteme:

package main

import (
    "context"
    "log"

    "google.golang.org/adk/agent"
    "google.golang.org/adk/agent/llmagent"
    "google.golang.org/adk/agent/sequentialagent"
    "google.golang.org/adk/model/gemini"
    "google.golang.org/adk/tool"
)

func main() {
    ctx := context.Background()
    model, err := gemini.NewModel(ctx, "gemini-3-flash-preview", nil)
    if err != nil {
        log.Fatalf("create model: %v", err)
    }

    // Research Agent: Discovers and stores knowledge
    researchAgent, _ := llmagent.New(llmagent.Config{
        Name:        "research_agent",
        Model:       model,
        Description: "Researches and stores knowledge with source attribution",
        Instruction: `Research the topic. Store findings with confidence scores.
                      Mark conflicting sources - don't resolve them.`,
        Tools:       []tool.Tool{assertTool, searchTool},
        OutputKey:   "research_findings",
    })

    // Lead Orchestrator: Queries and routes
    leadAgent, _ := llmagent.New(llmagent.Config{
        Name:        "lead_orchestrator",
        Model:       model,
        Description: "Coordinates team based on current knowledge state",
        Instruction: `Query Episteme for current approved patterns.
                      Research findings available in {research_findings}.
                      If confidence >= 0.8, route to implementation.
                      If confidence < 0.8, flag for human review.`,
        Tools:       []tool.Tool{queryTool},
        OutputKey:   "routing_decision",

        AfterToolCallback: confidenceEscalationCallback,
    })

    // Implementation Agent: Writes code
    implAgent, _ := llmagent.New(llmagent.Config{
        Name:        "implementation_agent",
        Model:       model,
        Description: "Writes code against approved patterns only",
        Instruction: `Write code for the task.
                      Routing decision in {routing_decision}.
                      ONLY use approved patterns. Check constraints first.`,
        Tools:       []tool.Tool{queryTool, constraintTool, writeCodeTool},
        OutputKey:   "implementation",

        BeforeToolCallback: constraintEnforcementCallback,
    })

    // Sequential pipeline
    pipeline, _ := sequentialagent.New(sequentialagent.Config{
        AgentConfig: agent.Config{
            Name:        "AgileDevPipeline",
            Description: "Research → Route → Implement with Episteme coordination",
            SubAgents:   []agent.Agent{researchAgent, leadAgent, implAgent},
        },
    })

    // Run the pipeline
    runner := agent.NewRunner(pipeline, sessionService)
    for event, err := range runner.Run(ctx, userID, sessionID, userMessage, nil) {
        if err != nil {
            log.Printf("Pipeline error: %v", err)
            continue  // Don't handle an event that came back with an error
        }
        handleEvent(event)
    }
}

Session State vs. Episteme

| Use Case | Mechanism | Why |
| --- | --- | --- |
| Agent-to-agent handoff in pipeline | Session State + `OutputKey` | Ephemeral, same conversation |
| Permanent organizational knowledge | Episteme `Assert` | Persists across sessions/agents |
| "What was decided in this conversation" | Session State | Temporary, conversation-scoped |
| "What should all agents know forever" | Episteme `Assert` + `lifecycle: approved` | Permanent, query-audited |
| Temporary working data | `temp:` prefixed state keys | Auto-cleared after session |
| Agent decision audit trail | Episteme `QueryAudit` | Immutable, time-travel enabled |

The 5-Minute Demo

# Clone and start
git clone https://github.com/orchard9/stemedb
cd stemedb
cargo run --bin stemedb-server

# In another terminal:

# 1. Insert the APPROVED pattern (current production)
curl -X POST http://localhost:8080/assert -d '{
  "subject": "auth/jwt",
  "predicate": "signing_algorithm",
  "object": {"Text": "RS256"},
  "lifecycle": "Approved",
  "source_hash": "prod_config",
  "confidence": 0.9
}'

# 2. Insert a newer PROPOSED pattern (like an RFC)
curl -X POST http://localhost:8080/assert -d '{
  "subject": "auth/jwt",
  "predicate": "signing_algorithm",
  "object": {"Text": "ES256"},
  "lifecycle": "Proposed",
  "source_hash": "rfc_2024",
  "confidence": 0.75
}'
# Save the returned hash as $PROPOSAL_HASH

# 3. Query WITHOUT lifecycle filter (the bug!)
curl "http://localhost:8080/query?subject=auth/jwt&predicate=signing_algorithm&lens=recency"
# Returns ES256 (the proposal, because it's newer)
# THIS IS THE BUG - agent would deploy wrong config

# 4. Query WITH lifecycle filter (the fix!)
curl "http://localhost:8080/query?subject=auth/jwt&predicate=signing_algorithm&lens=recency&lifecycle=approved"
# Returns RS256 (correct - only approved items)

# 5. Later: Approve the proposal
curl -X POST http://localhost:8080/assert -d '{
  "subject": "auth/jwt",
  "predicate": "signing_algorithm",
  "object": {"Text": "ES256"},
  "lifecycle": "Approved",
  "parent_hash": "'$PROPOSAL_HASH'",
  "epoch": "es256-migration",
  "confidence": 0.95
}'

# 6. Now the same query returns ES256
curl "http://localhost:8080/query?subject=auth/jwt&predicate=signing_algorithm&lens=recency&lifecycle=approved"
# Returns ES256 (now approved)

# 7. Check audit trail
curl "http://localhost:8080/audit/queries?subject=auth/jwt&limit=5"
# Shows all queries, results, and contributing assertions

# 8. Time travel: what did we believe before the migration?
# Replace the as_of value with a timestamp taken between steps 2 and 5 of your own run
curl "http://localhost:8080/query?subject=auth/jwt&predicate=signing_algorithm&lens=recency&lifecycle=approved&as_of=2024-01-14T00:00:00Z"
# Returns RS256 (pre-migration state)

# 9. Store a NEGATIVE CONSTRAINT (correction with forbidden alternative)
curl -X POST http://localhost:8080/assert -d '{
  "subject": "Project_X_Http_Client",
  "predicate": "must_use_library",
  "object": {"Text": "axios"},
  "meta": {"forbidden_alternative": "requests", "reason": "User correction"},
  "lifecycle": "Approved",
  "confidence": 1.0
}'

# 10. Pre-flight constraint check (what Lens::Constraints returns)
curl "http://localhost:8080/query?context=python_http&lens=constraints"
# Returns: { must_use: "axios", forbidden: "requests", reason: "..." }
# Agent sees this BEFORE generating code

# 11. Query resurrects the constraint (updates last_verified)
curl "http://localhost:8080/query?subject=Project_X_Http_Client&predicate=must_use_library&lens=authority"
# The constraint is now "resurrected" - stays at high confidence

Time to value: Under 5 minutes to see lifecycle filtering, query audit, time-travel, and negative constraints working.

What Postgres CAN Do

Postgres is sufficient for:

- Storing assertions with status columns
- Basic recency queries
- Simple audit logging (with application code)
- Temporal tables for history (with schema complexity)

Postgres requires significant application code for:

- Enforcing lifecycle filters on every query
- Query audit with result provenance
- Domain-weighted authority calculations
- Epoch chain traversal for supersession

Postgres cannot cleanly handle:

- O(1) epoch supersession (bulk paradigm shifts)
- Lens-based resolution with lifecycle enforcement
- Cryptographic multi-signature with immutable provenance
- Time-travel with lens application at historical point
- Query audit with contributing assertion weights
- Sub-100ms trace command for incident investigation

The Incident Investigation Pattern

When something breaks:

# 1. What did the agent query?
episteme trace --agent deployment-agent --time "6 hours ago" --subject "auth/*"

# 2. What assertions contributed?
episteme show --hash $ASSERTION_HASH --provenance

# 3. Mark bad assertion as superseded
episteme supersede --hash $BAD_HASH --reason "Proposal treated as approved" \
    --type "Invalidate"

# 4. See impact
episteme cascade --root $BAD_HASH

# 5. Verify correction propagated
episteme query --subject auth/jwt --predicate signing_algorithm --lifecycle approved

SRE's 3am workflow: Trace → Inspect → Correct → Check impact → Verify. Under 10 minutes.

Summary: Why Episteme for Agent Teams?

| Problem | Postgres Approach | Episteme Approach | Pillar |
| --- | --- | --- | --- |
| Proposal vs. Approved | Status column (unenforced) | Lifecycle enum with lens enforcement | First-Class Contradiction |
| Query audit trail | Application-level logging | Built-in with provenance | Multi-Signature Consensus |
| Time-travel debugging | Temporal tables + complex joins | Native `as_of` parameter | Semantic Decay |
| Paradigm shift (RS256→ES256) | O(n) updates or migrations | O(1) epoch supersession | Invalidation Cascades |
| Expert vs. junior weighting | Join tables with reputation | Cryptographic signatures + Authority lens | Multi-Signature Consensus |
| Corrections forgotten (Optimization Conflict) | System prompt drift | Negative Constraints + Resurrection | Semantic Decay |
| Agents repeat mistakes | No learning (stateless) | TrustRank back-propagation (The Gardener) | Multi-Signature Consensus |
| Pre-flight safety checks | Manual, forgettable | Lens::Constraints (enforced) | First-Class Contradiction |

The 47-minute outage I witnessed happened because an AI agent couldn't distinguish a proposal from an approved decision. Episteme ensures that distinction is structural—not a convention that agents might forget.

The deeper problem: Agents are stateless. They rely on fragile system prompts that suffer from context drift. When you correct an agent, that correction lives in the sliding context window—once it slides past, the agent reverts to its base weights. Episteme solves this by treating corrections as database writes that persist permanently, are retrieved token-efficiently via Lens::Constraints, and stay fresh through Resurrection.

Regulatory / Compliance Considerations

For production AI agent deployments:

- Model Governance: Query audit trails demonstrate what context influenced agent decisions
- Incident Response: Time-travel enables post-mortem without reconstructing state
- Change Control: Lifecycle stages enforce approval workflows for configuration changes
- Audit Requirements: Immutable Merkle DAG provides tamper-evident decision provenance

Episteme doesn't replace change management—it provides the data substrate that makes AI agent decisions auditable and correctable.
