stemedb/ai-lookup/features/query-audit.md
jordan 1ce4004807 feat: Complete Phase 2 (The Cortex) - query, lens, and API layers
2026-02-01 13:22:44 -07:00


Query Audit Trail

Quick Ref: Every query is logged with provenance for incident investigation.
Status: Implemented (Phase 2)

The Problem

At 3am, production is broken. An agent deployed the wrong config. The SRE needs to know: What did the agent query? What result did it get? Which assertions contributed to that result?

Postgres query logs show SQL, not semantic meaning.

The Solution

Every query to the Episteme API is automatically logged with full provenance:

/// Stored at `AUD:{query_id}` in the KV store.
pub struct QueryAudit {
    pub query_id: QueryId,           // Content-addressed hash
    pub agent_id: Option<[u8; 32]>,  // From X-Agent-Id header
    pub timestamp: u64,              // Unix epoch seconds, as in the example response
    pub params: QueryParams,         // Subject, predicate, lifecycle, epoch, lens
    pub result_hash: Option<Hash>,   // Hash of winning assertion
    pub result_confidence: f32,
    pub contributing_assertions: Vec<ContributingAssertion>,
}

pub struct ContributingAssertion {
    pub assertion_hash: Hash,
    pub weight: f32,        // How much it influenced result (1.0 for winner)
    pub source_hash: Hash,  // Original evidence
    pub lifecycle: LifecycleStage,
}
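
The `query_id` field is described as a content-addressed hash, and the implementation notes say it is derived from the params plus the timestamp. The following is a minimal sketch of that idea, not the actual implementation: `SimpleParams` is a hypothetical cut-down stand-in for `QueryParams`, and 64-bit FNV-1a is used only because it needs no external crates. The real hash function is not specified here and is presumably a cryptographic one.

```rust
/// 64-bit FNV-1a. A dependency-free stand-in for whatever hash the real
/// system uses (assumption: the actual function is not documented here).
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical, simplified stand-in for `QueryParams`.
struct SimpleParams<'a> {
    subject: &'a str,
    predicate: &'a str,
    lens: &'a str,
}

/// Derive an ID from the query params plus the timestamp.
fn query_id(params: &SimpleParams, timestamp: u64) -> u64 {
    let mut buf = Vec::new();
    for field in [params.subject, params.predicate, params.lens] {
        // Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        buf.extend_from_slice(&(field.len() as u64).to_be_bytes());
        buf.extend_from_slice(field.as_bytes());
    }
    buf.extend_from_slice(&timestamp.to_be_bytes());
    fnv1a64(&buf)
}
```

Because the timestamp is part of the input, the same params issued at two different times produce two different IDs, while replaying identical params and timestamp reproduces the same ID.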

Storage Layout

| Key Pattern | Value | Purpose |
|---|---|---|
| AUD:{query_id} | Serialized QueryAudit | Individual audit records |
| AUDA:{agent_id}:{timestamp}:{query_id} | Empty | Agent index for temporal queries |
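
The key layout above can be sketched with an ordered map standing in for the KV store. This is an illustration under stated assumptions, not the `AuditStore` code: the function names are hypothetical, a `BTreeMap` plays the role of the store, and the timestamp is assumed to be zero-padded to a fixed width so that lexicographic key order matches chronological order (the real encoding is not documented here).

```rust
use std::collections::BTreeMap;

/// Lowercase hex encoding, hand-rolled to avoid external crates.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Key for an individual audit record: `AUD:{query_id}`.
fn audit_key(query_id: &str) -> String {
    format!("AUD:{}", query_id)
}

/// Key for the agent index: `AUDA:{agent_id}:{timestamp}:{query_id}`.
/// Assumption: the timestamp is zero-padded to 20 digits (the width of
/// u64::MAX) so string order equals numeric order.
fn agent_index_key(agent_id: &[u8; 32], timestamp: u64, query_id: &str) -> String {
    format!("AUDA:{}:{:020}:{}", to_hex(agent_id), timestamp, query_id)
}

/// Query IDs issued by `agent_id` with `from <= timestamp <= to`.
fn audits_for_agent(
    index: &BTreeMap<String, Vec<u8>>,
    agent_id: &[u8; 32],
    from: u64,
    to: u64,
) -> Vec<String> {
    if from > to {
        return Vec::new(); // an inverted range would make `range` panic
    }
    let agent = to_hex(agent_id);
    let start = format!("AUDA:{}:{:020}:", agent, from);
    // ';' sorts immediately after ':', so this exclusive bound still
    // covers every query_id recorded at timestamp `to`.
    let end = format!("AUDA:{}:{:020};", agent, to);
    index
        .range(start..end)
        .filter_map(|(key, _)| key.rsplit(':').next().map(String::from))
        .collect()
}
```

The index values are empty because the key itself carries everything needed; the full record is then fetched from `AUD:{query_id}`.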

API

List Query Audits

# List recent audits
GET /v1/audit/queries?limit=100

# Filter by agent
GET /v1/audit/queries?agent_id=<hex-encoded-pubkey>&from=1704067200&to=1704153600
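
For a client, the list request can be assembled from optional filters. The sketch below builds only the path and query string using the four parameters shown above (`limit`, `agent_id`, `from`, `to`). `AuditListFilter` and `list_audits_path` are hypothetical names for illustration and are not part of any SDK.

```rust
/// Hypothetical filter for `GET /v1/audit/queries`.
#[derive(Default)]
struct AuditListFilter {
    agent_id: Option<[u8; 32]>,
    from: Option<u64>,
    to: Option<u64>,
    limit: Option<u32>,
}

/// Build the request path, emitting only the parameters that are set.
fn list_audits_path(filter: &AuditListFilter) -> String {
    let mut pairs: Vec<String> = Vec::new();
    if let Some(id) = filter.agent_id {
        // The agent ID is sent hex-encoded.
        let hex: String = id.iter().map(|b| format!("{:02x}", b)).collect();
        pairs.push(format!("agent_id={}", hex));
    }
    if let Some(from) = filter.from {
        pairs.push(format!("from={}", from));
    }
    if let Some(to) = filter.to {
        pairs.push(format!("to={}", to));
    }
    if let Some(limit) = filter.limit {
        pairs.push(format!("limit={}", limit));
    }
    if pairs.is_empty() {
        "/v1/audit/queries".to_string()
    } else {
        format!("/v1/audit/queries?{}", pairs.join("&"))
    }
}
```

All values here are digits or hex characters, so no percent-encoding is needed in this sketch.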

Get Specific Audit

# Full reasoning trace for a single query
GET /v1/audit/query/{query_id}

Including Agent ID in Queries

To associate queries with an agent, include the X-Agent-Id header:

curl -H "X-Agent-Id: <hex-encoded-32-byte-pubkey>" \
     "http://localhost:3000/v1/query?subject=Tesla&predicate=revenue"

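On the server side, the header value has to become the `Option<[u8; 32]>` stored in `QueryAudit.agent_id`. A minimal sketch of that decoding step follows. `parse_agent_id` is a hypothetical helper, and both its acceptance of uppercase hex and its mapping of malformed values to `None` are assumptions rather than documented behavior.

```rust
/// Decode a hex-encoded 32-byte public key as sent in `X-Agent-Id`.
/// Returns `None` if the value is not exactly 64 hex characters.
fn parse_agent_id(header: &str) -> Option<[u8; 32]> {
    let s = header.trim();
    // Checking hex digits up front also rejects a leading '+', which
    // `from_str_radix` would otherwise accept.
    if s.len() != 64 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, chunk) in s.as_bytes().chunks(2).enumerate() {
        let pair = std::str::from_utf8(chunk).ok()?;
        out[i] = u8::from_str_radix(pair, 16).ok()?;
    }
    Some(out)
}
```

A request without the header would then simply be audited with `agent_id: None`.
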
Response Format

{
  "query_id": "a7f3a2b9c1d4e5f6...",
  "agent_id": "01020304...",
  "timestamp": 1704153600,
  "params": {
    "subject": "auth/jwt",
    "predicate": "signing_algorithm",
    "lifecycle": "Approved",
    "lens": "Authority"
  },
  "result_hash": "b8c9d0e1f2a3...",
  "result_confidence": 0.87,
  "contributing_assertions": [
    {
      "assertion_hash": "c1d2e3f4a5b6...",
      "weight": 1.0,
      "source_hash": "d2e3f4a5b6c7...",
      "lifecycle": "Approved"
    },
    {
      "assertion_hash": "e3f4a5b6c7d8...",
      "weight": 0.0,
      "source_hash": "f4a5b6c7d8e9...",
      "lifecycle": "Proposed"
    }
  ]
}
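
When reading a response like the one above during an incident, the first questions are usually which assertion won and which ones were considered but had no influence. A small sketch of that triage, assuming the JSON has already been deserialized: `Contribution` is a hypothetical, trimmed mirror of `ContributingAssertion` with string fields, and both helper functions are illustrative only.

```rust
/// Hypothetical, trimmed mirror of one `contributing_assertions` entry.
#[derive(Debug, Clone, PartialEq)]
struct Contribution {
    assertion_hash: String,
    weight: f32,
    lifecycle: String,
}

/// The contribution with the highest weight. Per the struct comments the
/// winning assertion carries weight 1.0.
fn winner(contribs: &[Contribution]) -> Option<&Contribution> {
    contribs.iter().max_by(|a, b| a.weight.total_cmp(&b.weight))
}

/// Contributions that were examined but did not influence the result.
fn ignored(contribs: &[Contribution]) -> Vec<&Contribution> {
    contribs.iter().filter(|c| c.weight == 0.0).collect()
}
```

In the example response, the `Approved` assertion is the winner and the `Proposed` one appears with weight 0.0, which tells the on-call engineer it was seen but did not affect the answer.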

Implementation Details

  • Query ID Generation: Content-addressed hash of params + timestamp for deterministic IDs
  • Fire-and-Forget: Audit logging doesn't block the query response; failures are logged but don't fail queries
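  • Agent Index: Enables lookups by agent + time range via a prefix scan over AUDA:{agent_id}:, so the cost scales with the number of matching records rather than the size of the whole audit log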
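
The fire-and-forget behavior can be sketched with a channel and a background writer. This uses only the standard library for self-containment; the real service is built on axum and presumably uses an async task instead, so treat `spawn_audit_writer`, `answer_query`, and `AuditRecord` as hypothetical names and the threading model as an assumption.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical minimal audit record.
struct AuditRecord {
    query_id: String,
}

/// Spawn a background writer. `write` stands in for the real store call.
/// The thread returns the number of failed writes once all senders are dropped.
fn spawn_audit_writer<F>(mut write: F) -> (Sender<AuditRecord>, JoinHandle<usize>)
where
    F: FnMut(&AuditRecord) -> Result<(), String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<AuditRecord>();
    let handle = thread::spawn(move || {
        let mut failures = 0usize;
        for record in rx {
            if let Err(err) = write(&record) {
                // Failures are logged, never surfaced to the query caller.
                eprintln!("audit write failed for {}: {}", record.query_id, err);
                failures += 1;
            }
        }
        failures
    });
    (tx, handle)
}

/// Query path: enqueue the audit record and return the result immediately.
fn answer_query(tx: &Sender<AuditRecord>, query_id: &str) -> &'static str {
    // The channel is unbounded, so `send` does not block. A send error
    // (writer gone) is deliberately ignored so the query still succeeds.
    let _ = tx.send(AuditRecord { query_id: query_id.to_string() });
    "result"
}
```

The trade-off is that an audit record can be lost if the writer fails or the process exits before the queue drains, which is why failures are at least logged.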

Latency Requirements (from user research)

| Query Type | Target Latency |
|---|---|
| Point query (current) | < 100ms |
| Time-travel query | < 500ms |
| Audit trace | < 2s |
| Full provenance chain | < 5s |
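
These targets could be encoded as constants so that tests or monitoring can check observed latencies against them. The sketch below is illustrative only: `QueryKind`, `target_latency`, and `within_target` are hypothetical names, and nothing here implies the codebase enforces the targets this way.

```rust
use std::time::Duration;

/// Hypothetical classification matching the rows of the latency table.
#[derive(Debug, Clone, Copy, PartialEq)]
enum QueryKind {
    Point,
    TimeTravel,
    AuditTrace,
    FullProvenance,
}

/// Target latency for each query kind, taken from the table.
fn target_latency(kind: QueryKind) -> Duration {
    match kind {
        QueryKind::Point => Duration::from_millis(100),
        QueryKind::TimeTravel => Duration::from_millis(500),
        QueryKind::AuditTrace => Duration::from_secs(2),
        QueryKind::FullProvenance => Duration::from_secs(5),
    }
}

/// True if the observed latency is strictly under the target.
fn within_target(kind: QueryKind, observed: Duration) -> bool {
    observed < target_latency(kind)
}
```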

Crates

  • Types: stemedb_core::types::{QueryAudit, QueryParams, ContributingAssertion}
  • Storage: stemedb_storage::{AuditStore, GenericAuditStore}
  • API Handlers: stemedb_api::handlers::audit::{list_audits, get_audit}

Origin

This feature emerged from SRE perspective interviews (see .claude/agents/perspective-oncall-sre.md). Core need: "I need to trace from agent decision → query → assertions in under 10 minutes."