jordan b3e8a9a058 feat: Multi-application expansion with chaos testing and community UI

Major additions:
- Community Next.js app (port 18187) for browsing claims with API docs
- stemedb-chaos crate: Fault injection, chaos testing, CRDT properties
- Latent ingestion system: Reddit/FDA ingesters with ADK-Go agents
- Disputed claims handling: Manual review workflows and validation
- Aphoria security scanner: New extractors (SQL injection, command
  injection, weak crypto, TLS version), policy-based ignores, UAT reports
- Docker infrastructure: Dockerfile, docker-compose.yml for full stack
- VulnBank demo: Intentionally vulnerable multi-language test corpus

SDK & API enhancements:
- Source registry handlers for tracking data provenance
- Metrics endpoint
- Skeptic filtering improvements

Code quality:
- Split 14 large files (>500 lines) into focused modules
- All files now under 500-line limit per project guidelines

Documentation:
- Chaos testing guide, circuit breakers, observability docs
- Phase 7 UAT documentation updates
- Martin Kleppmann technical writer agent

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-04 01:24:14 -07:00


Query Audit Trail

Quick Ref: Every query is logged with provenance for incident investigation.

Status: ✅ Implemented (Phase 2)

The Problem

It is 3am and production is broken. An agent deployed the wrong config. The SRE needs to know: What did the agent query? What result did it get? Which assertions contributed to that result?

Postgres query logs show SQL, not semantic meaning.

The Solution

Every query to the Episteme API is automatically logged with full provenance:

/// Stored at `AUD:{query_id}` in the KV store.
pub struct QueryAudit {
    pub query_id: QueryId,           // Content-addressed hash
    pub agent_id: Option<[u8; 32]>,  // From X-Agent-Id header
    pub timestamp: u64,
    pub params: QueryParams,         // Subject, predicate, lifecycle, epoch, lens
    pub result_hash: Option<Hash>,   // Hash of winning assertion
    pub result_confidence: f32,
    pub contributing_assertions: Vec<ContributingAssertion>,
}

pub struct ContributingAssertion {
    pub assertion_hash: Hash,
    pub weight: f32,        // How much it influenced result (1.0 for winner)
    pub source_hash: Hash,  // Original evidence
    pub lifecycle: LifecycleStage,
}

Storage Layout

Key Pattern	Value	Purpose
`AUD:{query_id}`	Serialized QueryAudit	Individual audit records
`AUDA:{agent_id}:{timestamp}:{query_id}`	Empty	Agent index for temporal queries
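
The key patterns in the table above can be sketched as follows. This is an illustrative sketch, not the `stemedb_storage` implementation: the hex encoding of IDs, the zero-padded timestamp, and the helper names (`audit_key`, `agent_index_key`, `scan_agent`) are all assumptions, and a `BTreeMap` stands in for an ordered KV store. Zero-padding matters in this sketch because it makes lexicographic key order match numeric time order, which is what lets a time-range query become a single range scan.

```rust
use std::collections::BTreeMap;

/// Lowercase hex encoding (no external crates).
fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Primary record key: `AUD:{query_id}`.
fn audit_key(query_id: &[u8]) -> String {
    format!("AUD:{}", hex(query_id))
}

/// Agent index key: `AUDA:{agent_id}:{timestamp}:{query_id}`.
/// ASSUMPTION: the timestamp is zero-padded to 20 digits so that
/// lexicographic key order matches numeric time order.
fn agent_index_key(agent_id: &[u8; 32], timestamp: u64, query_id: &[u8]) -> String {
    format!("AUDA:{}:{:020}:{}", hex(agent_id), timestamp, hex(query_id))
}

/// Returns the index keys for `agent_id` with `from <= timestamp <= to`.
/// The `BTreeMap` is a stand-in for an ordered KV store (assumption).
fn scan_agent(
    index: &BTreeMap<String, ()>,
    agent_id: &[u8; 32],
    from: u64,
    to: u64,
) -> Vec<String> {
    let lo = format!("AUDA:{}:{:020}:", hex(agent_id), from);
    // ';' is the byte right after ':', so this bound sits just past
    // every key whose timestamp equals `to`.
    let hi = format!("AUDA:{}:{:020};", hex(agent_id), to);
    index.range(lo..hi).map(|(k, _)| k.clone()).collect()
}
```

Because the index values are empty, a scan under this layout yields only keys; the `query_id` suffix of each key is then used to fetch the full record at `AUD:{query_id}`.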

API

List Query Audits

# List recent audits
GET /v1/audit/queries?limit=100

# Filter by agent
GET /v1/audit/queries?agent_id=<hex-encoded-pubkey>&from=1704067200&to=1704153600

Get Specific Audit

# Full reasoning trace for a single query
GET /v1/audit/query/{query_id}

Including Agent ID in Queries

To associate queries with an agent, include the X-Agent-Id header:

curl -H "X-Agent-Id: <hex-encoded-32-byte-pubkey>" \
     "http://localhost:18180/v1/query?subject=Tesla&predicate=revenue"

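A client needs to turn its 32-byte public key into the header value, and a server needs to turn it back into the `Option<[u8; 32]>` stored on the audit record. The sketch below shows one way to do both using only the standard library. The function names are hypothetical, and emitting lowercase hex and treating a malformed header as "no agent" are assumptions rather than documented behaviour.

```rust
/// Encodes a 32-byte public key as the hex string sent in `X-Agent-Id`.
/// ASSUMPTION: lowercase hex; the document only says "hex-encoded".
fn encode_agent_id(pubkey: &[u8; 32]) -> String {
    pubkey.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Parses an `X-Agent-Id` header value. Returns `None` unless the value is
/// exactly 64 hex characters, mirroring the `Option<[u8; 32]>` field.
fn parse_agent_id(value: &str) -> Option<[u8; 32]> {
    let bytes = value.as_bytes();
    if bytes.len() != 64 {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, pair) in bytes.chunks(2).enumerate() {
        // `to_digit(16)` accepts 0-9, a-f and A-F; anything else yields None.
        let hi = (pair[0] as char).to_digit(16)?;
        let lo = (pair[1] as char).to_digit(16)?;
        out[i] = (hi * 16 + lo) as u8;
    }
    Some(out)
}
```

The output of `encode_agent_id` is the value to substitute for `<hex-encoded-32-byte-pubkey>` in the curl command above.
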
Response Format

{
  "query_id": "a7f3a2b9c1d4e5f6...",
  "agent_id": "01020304...",
  "timestamp": 1704153600,
  "params": {
    "subject": "auth/jwt",
    "predicate": "signing_algorithm",
    "lifecycle": "Approved",
    "lens": "Authority"
  },
  "result_hash": "b8c9d0e1f2a3...",
  "result_confidence": 0.87,
  "contributing_assertions": [
    {
      "assertion_hash": "c1d2e3f4a5b6...",
      "weight": 1.0,
      "source_hash": "d2e3f4a5b6c7...",
      "lifecycle": "Approved"
    },
    {
      "assertion_hash": "e3f4a5b6c7d8...",
      "weight": 0.0,
      "source_hash": "f4a5b6c7d8e9...",
      "lifecycle": "Proposed"
    }
  ]
}
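
During an investigation, the first questions about a response like the one above are usually "which assertion won?" and "which ones were considered but ignored?". A minimal sketch of answering both follows. It assumes the response has already been parsed into a local `Contribution` struct (a hypothetical mirror of `ContributingAssertion` with hashes kept as hex strings), and it reads a weight of 0.0 as "no influence", which is an interpretation of the `weight` comment rather than documented behaviour.

```rust
/// Local mirror of one entry in `contributing_assertions` (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Contribution {
    assertion_hash: String,
    weight: f32,
    lifecycle: String,
}

/// Returns the entry with the highest weight (1.0 for the winner in the
/// example response), or `None` if the list is empty.
fn winner(items: &[Contribution]) -> Option<&Contribution> {
    items.iter().max_by(|a, b| a.weight.total_cmp(&b.weight))
}

/// Returns assertions that were considered but carried zero weight.
fn ignored(items: &[Contribution]) -> Vec<&Contribution> {
    items.iter().filter(|c| c.weight == 0.0).collect()
}
```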

Implementation Details

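Query ID Generation: Content-addressed hash of params + timestamp, so the same params and timestamp always produce the same ID
Fire-and-Forget: Audit logging doesn't block the query response; failures are logged but don't fail queries
Agent Index: Enables lookups by agent + time range via a prefix scan over `AUDA:{agent_id}:` keys; the cost scales with the number of matching entries rather than the total number of audits (a scan, not a strict O(1) lookup)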
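
The fire-and-forget behaviour can be illustrated with a bounded channel: the query path hands the record off without waiting, and a dropped record is logged instead of surfacing as a query error. This is only a sketch under stated assumptions. `AuditLogger` and `AuditRecord` are hypothetical names, and the real handlers may well use an async task instead of `std::sync::mpsc`.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical, trimmed-down audit record for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct AuditRecord {
    query_id: String,
    timestamp: u64,
}

/// Handle held by query handlers; cloning it is cheap.
#[derive(Clone)]
struct AuditLogger {
    tx: SyncSender<AuditRecord>,
}

impl AuditLogger {
    /// Creates a logger plus the receiving end that a background writer
    /// would drain and persist.
    fn new(capacity: usize) -> (Self, Receiver<AuditRecord>) {
        let (tx, rx) = sync_channel(capacity);
        (Self { tx }, rx)
    }

    /// Never blocks and never propagates an error to the query path.
    /// Returns `false` when the record was dropped (queue full or writer gone).
    fn log(&self, record: AuditRecord) -> bool {
        match self.tx.try_send(record) {
            Ok(()) => true,
            Err(TrySendError::Full(r)) | Err(TrySendError::Disconnected(r)) => {
                eprintln!("audit log dropped for query {}", r.query_id);
                false
            }
        }
    }
}
```

The trade-off in this design is that audits are best-effort: under sustained overload some records are lost, which keeps query latency unaffected by the audit store.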

Latency Requirements (from user research)

Query Type	Target Latency
Point query (current)	< 100ms
Time-travel query	< 500ms
Audit trace	< 2s
Full provenance chain	< 5s

Crates

Types: stemedb_core::types::{QueryAudit, QueryParams, ContributingAssertion}
Storage: stemedb_storage::{AuditStore, GenericAuditStore}
API Handlers: stemedb_api::handlers::audit::{list_audits, get_audit}

Origin

This feature emerged from SRE perspective interviews (see .claude/agents/perspective-oncall-sre.md). Core need: "I need to trace from agent decision → query → assertions in under 10 minutes."
