Agile AI Agent Team: Knowledge Coordination
Tier: Production-Ready
Pillars Used: First-Class Contradiction, Invalidation Cascades, Multi-Signature Consensus, Semantic Decay
Postgres Test: FAILED - Lifecycle stages require application-level state machines; time-travel needs temporal tables with complex joins; query audit trails don't exist natively; epoch supersession requires recursive invalidation logic
The Catastrophe
I watched a production outage take down auth for 47 minutes because an AI agent deployed the wrong JWT configuration.
Here's what happened: Our team uses AI agents for development—a Lead Orchestrator coordinates specialists for research, implementation, and deployment. The deployment agent queried our knowledge base for "current JWT signing algorithm" and got "ES256."
It deployed with confidence. Tests passed. CI went green.
The auth service expected RS256. Every token validation failed. At 3am, the pager fired.
During the post-mortem, someone asked: "Why did the agent think ES256 was correct?"
Silence.
We dug through the knowledge base. Found an RFC from the security team proposing ES256 migration. Found Slack messages discussing it. Found a doc that said "we should use ES256" in future tense. The knowledge base had no distinction between "proposed" and "approved." The most recent entry was the RFC—a proposal, not a decision.
The agent queried, got the proposal, treated it as truth, deployed.
The failure mode: traditional databases do not enforce lifecycle state at query time. Unless every caller remembers to filter on a status column, proposals look like decisions and discussions look like conclusions. When an AI agent asks "what is X?", it gets whatever is most recent, whether that is a decision, a debate, or a rejected idea.
The Team
An agile development team uses AI agents to coordinate work across auth migrations, feature flag rollouts, deployment configurations, and research:
| Role | Need |
|---|---|
| Lead Orchestrator | Routes work, needs definitive current-state answers |
| Implementation Agent | Writes code, needs approved patterns only |
| Research Agent | Ingests docs, papers, discussions—often conflicting |
| Human Supervisor | Reviews agent decisions, needs to trace reasoning |
| On-Call SRE | Investigates incidents, needs time-travel debugging |
What They Need from Episteme
1. Lifecycle Stage (Proposed vs. Approved)
The Problem: Research Agent ingests an RFC proposing ES256. Implementation Agent queries "JWT signing algorithm" and gets ES256—even though it was never approved.
The Solution: Lifecycle is a first-class field with lens enforcement:
# Query with lifecycle filter
GET /query?subject=auth/jwt&predicate=signing_algorithm
&lens=authority
&lifecycle=approved
-> Returns RS256 (approved decision)
-> Proposal for ES256 is excluded by lifecycle filter
Proposals and approvals coexist in the DAG but are distinguished structurally—not by convention that agents might forget.
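To make the filtering behaviour concrete, here is a minimal in-memory sketch in Python. It is not Episteme's implementation: the `Assertion` dataclass, the `resolve` function, and the use of an insertion sequence number as the recency signal are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assertion:
    subject: str
    predicate: str
    value: str
    lifecycle: str  # "Proposed" or "Approved"
    seq: int        # hypothetical insertion order; higher means more recent


def resolve(assertions, subject, predicate, lifecycle: Optional[str] = None):
    """Return the most recent matching value, optionally restricted by lifecycle."""
    matches = [
        a for a in assertions
        if a.subject == subject
        and a.predicate == predicate
        and (lifecycle is None or a.lifecycle.lower() == lifecycle.lower())
    ]
    if not matches:
        return None
    return max(matches, key=lambda a: a.seq).value


# The approved decision is older than the RFC, as in the incident.
STORE = [
    Assertion("auth/jwt", "signing_algorithm", "RS256", "Approved", seq=1),
    Assertion("auth/jwt", "signing_algorithm", "ES256", "Proposed", seq=2),
]

unfiltered = resolve(STORE, "auth/jwt", "signing_algorithm")
approved_only = resolve(STORE, "auth/jwt", "signing_algorithm", lifecycle="approved")
print(unfiltered, approved_only)
```

Without the filter the newer proposal wins; with it, only the approved assertion is eligible.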
2. Query Audit Trail
The Problem: At 3am, auth is broken. What did the deployment agent query? What result did it get? What assertions contributed?
The Solution: Every query is automatically logged with full provenance:
GET /audit/queries?agent=deployment-agent&from=-6h
-> Returns:
{
  "query_id": "q_7f3a2b...",
  "timestamp": "2024-01-15T21:03:47Z",
  "subject": "auth/jwt",
  "predicate": "signing_algorithm",
  "lifecycle_filter": null, // PROBLEM: agent didn't filter!
  "result": { "value": "ES256", "confidence": 0.87 },
  "contributing_assertions": [
    { "hash": "rfc_2024_001...", "lifecycle": "Proposed", "weight": 0.9 }
  ]
}
The SRE sees immediately that the agent did not filter by lifecycle, so the proposal outweighed the approved configuration.
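The same check can be automated over a batch of audit records. The sketch below assumes records shaped like the example response above; the `risky_queries` helper is hypothetical and not part of any Episteme SDK.

```python
def risky_queries(audit_records):
    """Return (query_id, hashes) for unfiltered queries fed by non-approved assertions."""
    findings = []
    for rec in audit_records:
        non_approved = [
            a["hash"]
            for a in rec.get("contributing_assertions", [])
            if a.get("lifecycle") != "Approved"
        ]
        # Flag only when the caller skipped the lifecycle filter AND
        # something other than an approved assertion contributed.
        if rec.get("lifecycle_filter") is None and non_approved:
            findings.append((rec["query_id"], non_approved))
    return findings


records = [
    {
        "query_id": "q_7f3a2b...",
        "timestamp": "2024-01-15T21:03:47Z",
        "subject": "auth/jwt",
        "predicate": "signing_algorithm",
        "lifecycle_filter": None,
        "result": {"value": "ES256", "confidence": 0.87},
        "contributing_assertions": [
            {"hash": "rfc_2024_001...", "lifecycle": "Proposed", "weight": 0.9}
        ],
    }
]

findings = risky_queries(records)
print(findings)
```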
3. Time-Travel Queries
The Problem: Production is stable after rollback. Now the SRE needs to understand: what was the state of knowledge at 9pm when the agent made its decision?
The Solution: The Merkle DAG is inherently temporal:
# What did we believe at 9pm?
GET /query?subject=auth/jwt&predicate=signing_algorithm
&as_of=2024-01-15T21:00:00Z
-> Returns ES256 (the state at that moment)
-> Shows which assertions existed then
Time-travel is O(log n) via hash lookups, not complex temporal table joins.
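As an illustration of what an `as_of` lookup has to compute, the sketch below keeps an append-only history for one subject/predicate pair and answers point-in-time queries with a binary search. This swaps in a sorted list and `bisect` in place of the Merkle DAG hash lookups described above, so it shows the semantics and a logarithmic lookup, not Episteme's storage layout. The `History` class and the two assertion timestamps are assumptions.

```python
from bisect import bisect_right
from datetime import datetime


def parse(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


class History:
    """Append-only history of values for one (subject, predicate) pair."""

    def __init__(self):
        self._times = []   # timestamps, kept in ascending order
        self._values = []  # value asserted at the matching timestamp

    def append(self, ts: str, value: str) -> None:
        t = parse(ts)
        if self._times and t < self._times[-1]:
            raise ValueError("history must be appended in time order")
        self._times.append(t)
        self._values.append(value)

    def as_of(self, ts: str):
        """Return the latest value asserted at or before ts, or None."""
        i = bisect_right(self._times, parse(ts))
        return self._values[i - 1] if i else None


history = History()
history.append("2023-06-01T00:00:00Z", "RS256")  # hypothetical approval date
history.append("2024-01-15T18:00:00Z", "ES256")  # hypothetical RFC ingestion time

at_incident = history.as_of("2024-01-15T21:00:00Z")
before_rfc = history.as_of("2024-01-01T00:00:00Z")
print(at_incident, before_rfc)
```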
4. Paradigm Shifts (Epochs)
The Problem: Security team migrates from RS256 to ES256. This affects 47 related assertions. In Postgres, you'd need 47 UPDATEs or INSERT/deprecate pairs.
The Solution: Epochs enable O(1) supersession:
# Create new epoch
POST /epoch
{
  "name": "auth-es256-migration",
  "supersedes": "auth-rs256-era",
  "supersession_type": "Temporal",
  "effective_date": "2024-02-01T00:00:00Z"
}
# Queries automatically respect epoch boundaries
GET /query?subject=auth/jwt&predicate=signing_algorithm
-> Returns ES256 (from new epoch)
GET /query?subject=auth/jwt&predicate=signing_algorithm&epoch=auth-rs256-era
-> Returns RS256 (historical)
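One way to picture why supersession can be a constant-time write: assertions are keyed by epoch, and superseding an epoch moves a single head pointer without touching the old assertions. The `EpochStore` class below is a hypothetical single-chain model; it ignores `supersession_type` and `effective_date`, and the new epoch's own assertions still have to be written as usual.

```python
class EpochStore:
    """Toy model: assertions are keyed by epoch; one head pointer marks the current epoch."""

    def __init__(self, initial_epoch):
        self.head = initial_epoch
        self.supersedes = {}  # new epoch name -> epoch it replaced
        self.assertions = {}  # (epoch, subject, predicate) -> value

    def assert_value(self, epoch, subject, predicate, value):
        self.assertions[(epoch, subject, predicate)] = value

    def create_epoch(self, name, supersedes):
        if supersedes != self.head:
            raise ValueError("can only supersede the current epoch in this sketch")
        # A single write, however many assertions the old epoch holds.
        self.supersedes[name] = supersedes
        self.head = name

    def query(self, subject, predicate, epoch=None):
        # Default to the current epoch; an explicit epoch reads history.
        return self.assertions.get((epoch or self.head, subject, predicate))


store = EpochStore("auth-rs256-era")
store.assert_value("auth-rs256-era", "auth/jwt", "signing_algorithm", "RS256")

store.create_epoch("auth-es256-migration", supersedes="auth-rs256-era")
store.assert_value("auth-es256-migration", "auth/jwt", "signing_algorithm", "ES256")

current = store.query("auth/jwt", "signing_algorithm")
historical = store.query("auth/jwt", "signing_algorithm", epoch="auth-rs256-era")
print(current, historical)
```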
5. Expert Weighting (Authority Lens)
The Problem: A junior dev finds a Stack Overflow answer suggesting JWT rotation every 15 minutes. A senior security engineer says "That's for high-security contexts; our standard is daily."
The Solution: Multi-signature with domain-weighted reputation:
GET /query?subject=auth/jwt&predicate=rotation_interval
&lens=authority
&domain=security
-> Returns: 24h
-> security_lead has reputation 0.95 in security domain
-> junior_dev has reputation 0.4 in security domain
Signatures are cryptographic, immutable, and automatically weighted.
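The weighting step can be sketched as a reputation-weighted vote. The scoring rule below (sum each signer's domain reputation per candidate value, pick the highest total) is an assumption about how an authority lens might rank claims, and signature verification is left out entirely. The reputation figures are the ones quoted above.

```python
from collections import defaultdict

# (signer, domain) -> reputation, using the figures from the example above
REPUTATION = {
    ("security_lead", "security"): 0.95,
    ("junior_dev", "security"): 0.4,
}


def authority_lens(claims, domain, reputation):
    """Return (winning value, per-value scores) for a list of (signer, value) claims."""
    scores = defaultdict(float)
    for signer, value in claims:
        # Signers with no reputation in this domain contribute nothing.
        scores[value] += reputation.get((signer, domain), 0.0)
    if not scores:
        return None, {}
    winner = max(scores, key=scores.get)
    return winner, dict(scores)


claims = [("junior_dev", "15m"), ("security_lead", "24h")]
winner, scores = authority_lens(claims, "security", REPUTATION)
print(winner, scores)
```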
6. Persistent Learning (The Optimization Conflict)
The Problem: You correct an agent: "Don't use requests, use axios." Agent says "Got it!" Next week, new session—agent uses requests again. Repeat forever.
This is the Optimization Conflict: agents rely on context windows that drift. Once your correction slides out of the window, the agent falls back on the defaults baked into its base weights.
The Solution: Corrections become database writes that persist permanently:
# Day 1: Store correction with forbidden alternative
POST /assert
{
  "subject": "Project_X_Http_Client",
  "predicate": "must_use_library",
  "object": "axios",
  "meta": { "forbidden_alternative": "requests", "reason": "deprecated" },
  "confidence": 1.0
}
# Day 30: New session, agent checks constraints before coding
GET /query?context=python_http&lens=constraints
-> Returns: { must_use: "axios", forbidden: "requests" }
# Agent uses axios. Constraint honored across sessions.
The Gardener (background worker) also adjusts TrustRank—agents that make mistakes have reduced confidence on that topic.
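On the agent side, honoring a stored constraint amounts to a pre-flight check before any code is generated. The sketch below assumes the constraint arrives in the shape shown in the query response above; `check_dependencies` is a hypothetical helper, not an Episteme or ADK function.

```python
def check_dependencies(constraint, planned_libraries):
    """Return (forbidden_library, required_replacement) pairs found in the plan."""
    forbidden = constraint.get("forbidden")
    replacement = constraint.get("must_use")
    return [(lib, replacement) for lib in planned_libraries if lib == forbidden]


# Shape taken from the constraints-lens response shown above.
constraint = {"must_use": "axios", "forbidden": "requests"}

violations = check_dependencies(constraint, ["requests", "json"])
clean_plan = check_dependencies(constraint, ["axios", "json"])
print(violations, clean_plan)
```

An agent would run a check like this at the start of every session, so the correction no longer depends on anything surviving in its context window.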
The 5-Minute Demo
# Start server
cargo run --bin stemedb-server
# Insert APPROVED pattern (production) first
curl -X POST http://localhost:18180/assert -H 'Content-Type: application/json' -d '{
"subject": "auth/jwt", "predicate": "signing_algorithm",
"object": {"Text": "RS256"}, "lifecycle": "Approved", "confidence": 0.9
}'
# Insert PROPOSED pattern (RFC) afterwards, so it is the most recent entry
curl -X POST http://localhost:18180/assert -H 'Content-Type: application/json' -d '{
"subject": "auth/jwt", "predicate": "signing_algorithm",
"object": {"Text": "ES256"}, "lifecycle": "Proposed", "confidence": 0.75
}'
# Query WITHOUT lifecycle filter (the bug!)
curl "http://localhost:18180/query?subject=auth/jwt&predicate=signing_algorithm&lens=recency"
# Returns ES256 (proposal, most recent)
# Query WITH lifecycle filter (the fix!)
curl "http://localhost:18180/query?subject=auth/jwt&predicate=signing_algorithm&lens=recency&lifecycle=approved"
# Returns RS256 (correct)
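Since the bug in the demo is a missing query parameter, one client-side safeguard is to make the lifecycle argument impossible to omit. The sketch below only builds the URL and sends nothing. `build_query_url` is a hypothetical helper, and accepting `proposed` as a filter value is an assumption; the demo only shows `approved`.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:18180"           # server address from the demo
ALLOWED_LIFECYCLES = {"approved", "proposed"}  # "proposed" is an assumed value


def build_query_url(subject, predicate, *, lifecycle, lens="recency"):
    """Build a /query URL; lifecycle is keyword-only and has no default."""
    if lifecycle not in ALLOWED_LIFECYCLES:
        raise ValueError(f"unknown lifecycle: {lifecycle!r}")
    params = {
        "subject": subject,
        "predicate": predicate,
        "lens": lens,
        "lifecycle": lifecycle,
    }
    # Keep "/" unescaped so subjects such as auth/jwt stay readable.
    return f"{BASE_URL}/query?{urlencode(params, safe='/')}"


url = build_query_url("auth/jwt", "signing_algorithm", lifecycle="approved")
print(url)
```

Calling `build_query_url("auth/jwt", "signing_algorithm")` without `lifecycle` raises `TypeError`, so the omission fails when the agent code runs, not in production.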
Summary: Why Episteme for Agent Teams?
| Problem | Traditional Approach | Episteme Approach |
|---|---|---|
| Proposal vs. Approved | Status column (unenforced) | Lifecycle enum with lens enforcement |
| Query audit trail | Application-level logging | Built-in with provenance |
| Time-travel debugging | Temporal tables + complex joins | Native as_of parameter |
| Paradigm shift (RS256→ES256) | O(n) updates | O(1) epoch supersession |
| Expert vs. junior weighting | Join tables with reputation | Cryptographic signatures + Authority lens |
| Corrections forgotten | System prompt drift | Negative Constraints + Resurrection |
| Agents repeat mistakes | No learning (stateless) | TrustRank back-propagation |
The 47-minute outage happened because an AI agent couldn't distinguish a proposal from an approved decision. Episteme ensures that distinction is structural—not a convention that agents might forget.
Further Reading
- SDK Integration: See .claude/guides/integrations/adk-go-episteme.md for ADK-Go tool definitions, callback patterns, and multi-agent pipeline examples.
- Presentation: See docs/presentations/ for the visual walkthrough of this use case.
- Architecture: See architecture.md for Episteme internals.