stemedb/ai-lookup/features/query-audit.md
jordan 3cfaa1e1d3 feat: Complete Phase 1 (The Spine) - storage foundation
Phase 1 delivers the complete durability and storage layer:

- WAL with crash recovery: Append-only journal with BLAKE3 checksums,
  fsync guarantees, and proper seek-to-EOF on reopen
- Storage engine: sled-backed KVStore with scan_prefix for range queries
- Content-addressed storage: H:{hash}, V:{hash}, E:{hash} key patterns
- Ingestor: Background worker tailing WAL, writing to KV with 8-byte
  aligned record headers for rkyv zero-copy deserialization
- Comprehensive tests: 31 tests covering crash recovery, round-trips,
  and multi-cycle durability

New crates: stemedb-wal, stemedb-storage, stemedb-ingest

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 14:15:34 -07:00

83 lines
2.0 KiB
Markdown

# Query Audit Trail
> **Quick Ref:** Every query is logged with provenance for incident investigation
## The Problem
At 3am, production is broken. An agent deployed wrong config. The SRE needs to know: What did the agent query? What result did it get? What assertions contributed?
Postgres query logs show SQL, not semantic meaning.
## The Solution
```rust
struct QueryAudit {
pub query_id: Hash,
pub agent_id: AgentId,
pub timestamp: u64,
pub subject: EntityId,
pub predicate: RelationId,
pub lens: LensType,
pub lifecycle_filter: Option<LifecycleStage>,
pub result_hash: Hash,
pub result_confidence: f32,
pub contributing_assertions: Vec<ContributingAssertion>,
}
struct ContributingAssertion {
pub assertion_hash: Hash,
pub weight: f32, // How much it influenced result
pub source_hash: Hash, // Original evidence
}
```
## API
```bash
# What queries did this agent run?
GET /audit/queries?agent=deployment-agent&from=2024-01-15T20:00:00Z
# Trace command for incident investigation
episteme trace --agent deployment-agent \
--time "6 hours ago" \
--subject "auth/*"
```
## Response Format
```json
{
"query_id": "q_7f3a2b...",
"timestamp": "2024-01-15T21:03:47Z",
"subject": "auth/jwt",
"predicate": "signing_algorithm",
"lens": "authority",
"lifecycle_filter": null,
"result": {
"value": "ES256",
"confidence": 0.87
},
"contributing_assertions": [
{
"hash": "rfc_2024_001...",
"lifecycle": "Proposed",
"weight": 0.9,
"source": "security-rfc-2024.md"
}
]
}
```
## Latency Requirements (from user research)
| Query Type | Target Latency |
|------------|---------------|
| Point query (current) | < 100ms |
| Time-travel query | < 500ms |
| Audit trace | < 2s |
| Full provenance chain | < 5s |
## Origin
This feature emerged from SRE perspective interviews (see `.claude/agents/perspective-oncall-sre.md`). Core need: "I need to trace from agent decision query assertions in under 10 minutes."