# Query Audit Trail

> **Quick Ref:** Every query is logged with provenance for incident investigation
>
> **Status:** ✅ Implemented (Phase 2)

## The Problem

At 3am, production is broken. An agent deployed the wrong config. The SRE needs to know: What did the agent query? What result did it get? Which assertions contributed?

Postgres query logs show SQL, not semantic meaning.

## The Solution

Every query to the Episteme API is automatically logged with full provenance:

```rust
/// Stored at `AUD:{query_id}` in the KV store.
pub struct QueryAudit {
    pub query_id: QueryId,            // Content-addressed hash
    pub agent_id: Option<[u8; 32]>,   // From X-Agent-Id header
    pub timestamp: u64,
    pub params: QueryParams,          // Subject, predicate, lifecycle, epoch, lens
    pub result_hash: Option<Hash>,    // Hash of winning assertion
    pub result_confidence: f32,
    pub contributing_assertions: Vec<ContributingAssertion>,
}

pub struct ContributingAssertion {
    pub assertion_hash: Hash,
    pub weight: f32,                  // How much it influenced result (1.0 for winner)
    pub source_hash: Hash,            // Original evidence
    pub lifecycle: LifecycleStage,
}
```

## Storage Layout

| Key Pattern | Value | Purpose |
|-------------|-------|---------|
| `AUD:{query_id}` | Serialized QueryAudit | Individual audit records |
| `AUDA:{agent_id}:{timestamp}:{query_id}` | Empty | Agent index for temporal queries |

## API

### List Query Audits

```bash
# List recent audits
GET /v1/audit/queries?limit=100

# Filter by agent
GET /v1/audit/queries?agent_id=<agent_id>&from=1704067200&to=1704153600
```

### Get Specific Audit

```bash
# Full reasoning trace for a single query
GET /v1/audit/query/{query_id}
```

### Including Agent ID in Queries

To associate queries with an agent, include the `X-Agent-Id` header:

```bash
curl -H "X-Agent-Id: <agent_id>" \
  "http://localhost:18180/v1/query?subject=Tesla&predicate=revenue"
```

## Response Format

```json
{
  "query_id": "a7f3a2b9c1d4e5f6...",
  "agent_id": "01020304...",
  "timestamp": 1704153600,
  "params": {
    "subject": "auth/jwt",
    "predicate": "signing_algorithm",
    "lifecycle": "Approved",
    "lens": "Authority"
  },
  "result_hash": "b8c9d0e1f2a3...",
  "result_confidence": 0.87,
  "contributing_assertions": [
    {
      "assertion_hash": "c1d2e3f4a5b6...",
      "weight": 1.0,
      "source_hash": "d2e3f4a5b6c7...",
      "lifecycle": "Approved"
    },
    {
      "assertion_hash": "e3f4a5b6c7d8...",
      "weight": 0.0,
      "source_hash": "f4a5b6c7d8e9...",
      "lifecycle": "Proposed"
    }
  ]
}
```

## Implementation Details

- **Query ID Generation:** Content-addressed hash of params + timestamp, so the ID is deterministic for a given query at a given time
- **Fire-and-Forget:** Audit logging doesn't block the query response; failures are logged but don't fail queries
- **Agent Index:** Enables lookups by agent + time range via a prefix scan over the `AUDA:{agent_id}:` keys, without reading every audit record

## Latency Requirements (from user research)

| Query Type | Target Latency |
|------------|----------------|
| Point query (current) | < 100ms |
| Time-travel query | < 500ms |
| Audit trace | < 2s |
| Full provenance chain | < 5s |

## Crates

- **Types:** `stemedb_core::types::{QueryAudit, QueryParams, ContributingAssertion}`
- **Storage:** `stemedb_storage::{AuditStore, GenericAuditStore}`
- **API Handlers:** `stemedb_api::handlers::audit::{list_audits, get_audit}`

## Origin

This feature emerged from SRE perspective interviews (see `.claude/agents/perspective-oncall-sre.md`). Core need: "I need to trace from agent decision → query → assertions in under 10 minutes."
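
## Appendix: Agent Index Sketch

The two key patterns in the storage layout, and the agent + time range lookup they enable, can be illustrated with a minimal sketch. It is not the `AuditStore` implementation: a `BTreeMap` stands in for the ordered KV store, and every name here (`Kv`, `to_hex`, `audit_key`, `agent_index_key`, `put_audit`, `query_ids_for_agent`) is hypothetical. It also assumes that agent IDs are hex-encoded in keys and that timestamps are zero-padded to a fixed width, so that lexicographic key order matches numeric time order; the real encoding may differ.

```rust
use std::collections::BTreeMap;

/// Stand-in for the ordered KV store: keys iterate in lexicographic order.
type Kv = BTreeMap<String, Vec<u8>>;

/// Lowercase hex encoding of a byte slice.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Key for an individual audit record: `AUD:{query_id}`.
fn audit_key(query_id: &str) -> String {
    format!("AUD:{query_id}")
}

/// Key for the agent index: `AUDA:{agent_id}:{timestamp}:{query_id}`.
/// Assumption: the timestamp is zero-padded to 20 digits (the width of
/// u64::MAX) so that string order equals numeric order.
fn agent_index_key(agent_id: &[u8; 32], timestamp: u64, query_id: &str) -> String {
    format!("AUDA:{}:{:020}:{}", to_hex(agent_id), timestamp, query_id)
}

/// Writes the audit record plus its empty-valued agent index entry.
fn put_audit(kv: &mut Kv, agent_id: &[u8; 32], timestamp: u64, query_id: &str, record: Vec<u8>) {
    kv.insert(audit_key(query_id), record);
    kv.insert(agent_index_key(agent_id, timestamp, query_id), Vec::new());
}

/// Returns the query IDs logged for `agent_id` with `from <= timestamp <= to`,
/// in timestamp order, by scanning only the matching slice of the index.
fn query_ids_for_agent(kv: &Kv, agent_id: &[u8; 32], from: u64, to: u64) -> Vec<String> {
    if from > to {
        return Vec::new(); // BTreeMap::range panics on an inverted range
    }
    let agent = to_hex(agent_id);
    let start = format!("AUDA:{agent}:{from:020}:");
    // ';' is the byte right after ':', so this bound sits just past every
    // key whose timestamp equals `to`.
    let end = format!("AUDA:{agent}:{to:020};");
    kv.range(start..end)
        .filter_map(|(key, _)| key.rsplit(':').next().map(str::to_owned))
        .collect()
}
```

Each returned query ID can then be resolved to its full record with `audit_key`. The scan touches only the index entries for one agent inside the requested window, which is why the index value can stay empty.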