Implementation Audit Checklist
Systematic verification of every completed roadmap feature (Phases 1-5B).
Organized by functional theme with actionable /investigate and /trace-feature entries.
Scope: 8 Rust crates + Go SDK | Phases: 1, 2, 2.5, 3, 4, 5A, 5B
How to use: Work through each section. Run the suggested command for each entry.
Mark items as they are verified. Priority guides triage order.
Note: Phase 5B roadmap checkboxes are not yet updated, but the code was
delivered in commit 3320c24 ("feat: WAL hardening (Phase 5B)"). All 5B
features (CRC32C, crash recovery, group commit, log rotation) exist in the
codebase and are included in this checklist.
1. Write Path (WAL, Ingestion, Durability)
1.1 WAL Record Format & CRC32C Integrity
| Field | Value |
| --- | --- |
| Priority | Critical |
| Why it matters | Torn writes corrupt the append-only log. CRC32C is the first line of defense. |
| Key files | crates/stemedb-wal/src/format.rs, crates/stemedb-wal/src/journal.rs |
| What to verify | Record format is [len:u32][crc32c:u32][data][blake3:32]. CRC32C validated on every read before deserialization. Torn writes detected and rejected. |
| Command | /investigate WAL record format - confirm dual checksum (CRC32C + BLAKE3), torn write rejection, and that no record can be deserialized without passing CRC32C |
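As a reference point while auditing, the framing above can be sketched as follows. This is a minimal sketch, not the code in format.rs: the function names, little-endian integers, and a CRC that covers only the data bytes are all assumptions, and the 32-byte hash is passed in by the caller rather than computed with BLAKE3.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
        }
    }
    !crc
}

/// Hypothetical encoder for [len:u32][crc32c:u32][data][hash:32].
/// Assumption: integers are little-endian and the CRC covers `data` only.
fn encode_record(data: &[u8], content_hash: [u8; 32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + data.len() + 32);
    out.extend_from_slice(&(data.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32c(data).to_le_bytes());
    out.extend_from_slice(data);
    out.extend_from_slice(&content_hash);
    out
}

#[derive(Debug, PartialEq)]
enum DecodeError {
    Truncated,
    CrcMismatch,
}

/// Hands back the payload and trailing hash only after the CRC check passes.
fn decode_record(buf: &[u8]) -> Result<(&[u8], [u8; 32]), DecodeError> {
    if buf.len() < 8 {
        return Err(DecodeError::Truncated);
    }
    let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let stored = u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]);
    let end = 8usize
        .checked_add(len)
        .and_then(|e| e.checked_add(32))
        .ok_or(DecodeError::Truncated)?;
    if buf.len() < end {
        return Err(DecodeError::Truncated); // torn write: record is incomplete
    }
    let data = &buf[8..8 + len];
    if crc32c(data) != stored {
        return Err(DecodeError::CrcMismatch);
    }
    let mut hash = [0u8; 32];
    hash.copy_from_slice(&buf[8 + len..end]);
    Ok((data, hash))
}
```

The property to look for in the real code is the same ordering: length and CRC are checked before any byte of the payload reaches the deserializer.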
1.2 Crash Recovery (Full Scan & Truncate)
| Field | Value |
| --- | --- |
| Priority | Critical |
| Why it matters | After an unclean shutdown, the WAL must recover to a consistent state. Partial writes must be truncated, not served. |
| Key files | crates/stemedb-wal/src/journal.rs, crates/stemedb-wal/src/segment.rs |
| What to verify | recover() performs sequential record scan with CRC32C validation. Truncates at first corrupted/incomplete record. Recovery metrics tracked (valid/invalid records, bytes truncated). |
| Command | /trace-feature WAL crash recovery - trace from journal open through record scan, truncation, and metric reporting |
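The scan-and-truncate behaviour can be pictured with an in-memory sketch. It assumes the record layout from 1.1 with little-endian integers; `scan_valid_prefix`, `RecoveryStats`, and `frame` are hypothetical names, and a real recover() would truncate the segment file to `valid_len` rather than work on a byte slice.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
        }
    }
    !crc
}

const HEADER: usize = 8; // [len:u32][crc32c:u32]
const TRAILER: usize = 32; // trailing content hash

/// Hypothetical recovery metrics.
#[derive(Debug)]
struct RecoveryStats {
    valid_records: u64,
    valid_len: usize,
    bytes_truncated: usize,
}

/// Total size of the record at the start of `buf`, or None if it is
/// incomplete or fails the CRC check.
fn record_len_at(buf: &[u8]) -> Option<usize> {
    if buf.len() < HEADER {
        return None;
    }
    let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let stored = u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]);
    let total = HEADER.checked_add(len)?.checked_add(TRAILER)?;
    if buf.len() < total || crc32c(&buf[HEADER..HEADER + len]) != stored {
        return None;
    }
    Some(total)
}

/// Walks records from the start and stops at the first bad one.
/// Everything after `valid_len` is what recovery would truncate.
fn scan_valid_prefix(buf: &[u8]) -> RecoveryStats {
    let mut offset = 0;
    let mut valid_records = 0;
    while offset < buf.len() {
        match record_len_at(&buf[offset..]) {
            Some(total) => {
                offset += total;
                valid_records += 1;
            }
            None => break,
        }
    }
    RecoveryStats { valid_records, valid_len: offset, bytes_truncated: buf.len() - offset }
}

/// Test helper: frames `data` with a zeroed placeholder hash.
fn frame(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(data.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32c(data).to_le_bytes());
    out.extend_from_slice(data);
    out.extend_from_slice(&[0u8; TRAILER]);
    out
}
```

Note that the sketch stops at the first invalid record even if later bytes happen to form valid records; whether the real implementation does the same is part of what this entry verifies.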
1.3 Group Commit
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Without group commit, fsync-per-write caps throughput at ~1K writes/sec. Group commit batches fsyncs for dramatically higher throughput. |
| Key files | crates/stemedb-wal/src/group_commit.rs, crates/stemedb-wal/src/durability.rs |
| What to verify | GroupCommitBuffer buffers N writes or T milliseconds before single fsync. Writers wait on Notify. Background flusher calls fsync and notifies all waiters. Configurable max_writes and max_duration. |
| Command | /investigate group commit - verify buffer/flush cycle, writer notification, and that no write is acknowledged before its fsync completes |
1.4 Log Rotation
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Unbounded WAL growth will eventually exhaust disk. Rotation keeps disk usage bounded. |
| Key files | crates/stemedb-wal/src/segment.rs, crates/stemedb-wal/src/journal.rs |
| What to verify | Segment naming follows {seq:08x}.wal. New segment created when current exceeds threshold. Old segments deleted after cursor passes them. Recovery works across multiple segments. |
| Command | /investigate log rotation - verify segment creation, naming, cleanup after cursor advancement, and multi-segment recovery |
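The naming rule and the cleanup condition are easy to get subtly wrong, so a small sketch may help when reading segment.rs. The helper names are hypothetical, and "the cursor has passed a segment" is modelled here as the segment's sequence number being lower than the sequence of the segment the cursor currently points into, which is an assumption.

```rust
/// Formats a segment file name as {seq:08x}.wal (zero-padded lowercase hex).
fn segment_file_name(seq: u64) -> String {
    format!("{:08x}.wal", seq)
}

/// Parses the sequence number back out of a segment file name.
/// Rejects anything that is not at least 8 hex digits followed by ".wal".
fn parse_segment_seq(name: &str) -> Option<u64> {
    let stem = name.strip_suffix(".wal")?;
    if stem.len() < 8 || !stem.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    u64::from_str_radix(stem, 16).ok()
}

/// Segments that are safe to delete under the stated assumption:
/// every segment strictly older than the one the cursor is in.
fn segments_to_delete(names: &[&str], cursor_segment_seq: u64) -> Vec<String> {
    names
        .iter()
        .filter(|name| matches!(parse_segment_seq(name), Some(seq) if seq < cursor_segment_seq))
        .map(|name| name.to_string())
        .collect()
}
```

Zero-padding to a fixed width means a plain lexicographic sort of file names matches numeric order, which is worth confirming is relied upon (or not) during multi-segment recovery.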
1.5 Ed25519 Signature Verification
| Field | Value |
| --- | --- |
| Priority | Critical |
| Why it matters | Unsigned or mis-signed assertions bypass the entire trust model. Every assertion must be cryptographically verified before storage. |
| Key files | crates/stemedb-ingest/src/worker/processing.rs, crates/stemedb-ingest/src/worker/tests/signatures.rs |
| What to verify | Signature verified during ingestion before KV write. Invalid signatures rejected with error. Multi-sig (SignatureEntry vec) verified. Unsigned assertions rejected. |
| Command | /investigate Ed25519 verification in ingestion - confirm every assertion is verified, invalid sigs rejected, and multi-sig entries all checked |
1.6 Cursor Persistence
| Field | Value |
| --- | --- |
| Priority | Critical |
| Why it matters | If the ingest cursor is lost, the entire WAL is re-processed on restart. If it advances too early, assertions are silently dropped. |
| Key files | crates/stemedb-ingest/src/worker/run.rs, crates/stemedb-ingest/src/worker/storage.rs, crates/stemedb-ingest/src/worker/tests/cursor.rs |
| What to verify | Cursor stored at `__CURSOR__:ingest` key in KV. Updated after successful processing (not before). Survives restart and resumes from correct offset. |
| Command | /trace-feature cursor persistence - trace from WAL tail through processing, cursor update, restart, and resume |
1.7 Epoch Cascade & Supersession Markers
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Without transitive cascade, epoch chains require O(chain_length) walks. Markers enable O(1) supersession checks. |
| Key files | crates/stemedb-ingest/src/worker/processing.rs, crates/stemedb-ingest/src/worker/tests/epoch_cascade.rs, crates/stemedb-storage/src/supersession_store.rs |
| What to verify | write_supersession_cascade() writes SUPERSEDED:{old_epoch_id} for full transitive closure. All markers point to LATEST superseding epoch. Max depth guard (100) and cycle detection. |
| Command | /investigate epoch cascade - verify transitive closure, marker correctness for A->B->C chains, cycle detection, and depth guard |
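The expected marker state for a chain is easier to check against a model. The sketch below is not write_supersession_cascade() itself: it uses in-memory maps in place of the KV store, and the function name, error type, and the choice to validate the whole chain before writing any marker are assumptions.

```rust
use std::collections::{HashMap, HashSet};

const MAX_DEPTH: usize = 100;

#[derive(Debug, PartialEq)]
enum CascadeError {
    Cycle,
    TooDeep,
}

/// `supersedes` maps an epoch id to the epoch it directly supersedes.
/// Walks the chain from `latest` and writes SUPERSEDED:{old} -> latest
/// for every ancestor, so each marker points at the newest epoch.
fn cascade_markers(
    latest: &str,
    supersedes: &HashMap<String, String>,
    markers: &mut HashMap<String, String>,
) -> Result<usize, CascadeError> {
    let mut visited: HashSet<&str> = HashSet::new();
    visited.insert(latest);
    let mut chain: Vec<&str> = Vec::new();
    let mut current = latest;
    while let Some(old) = supersedes.get(current) {
        if !visited.insert(old.as_str()) {
            return Err(CascadeError::Cycle);
        }
        if chain.len() >= MAX_DEPTH {
            return Err(CascadeError::TooDeep);
        }
        chain.push(old.as_str());
        current = old.as_str();
    }
    // Write only after the whole chain has been validated (an assumption).
    for old in &chain {
        markers.insert(format!("SUPERSEDED:{}", old), latest.to_string());
    }
    Ok(chain.len())
}
```

For a chain where C supersedes B and B supersedes A, the model ends with both SUPERSEDED:A and SUPERSEDED:B pointing at C, which is the state the tests in epoch_cascade.rs should also assert.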
1.8 Ingestion Record Type Routing
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | The ingest worker handles assertions, votes, and epochs. Mis-routing corrupts data. |
| Key files | crates/stemedb-ingest/src/worker/record_types.rs, crates/stemedb-ingest/src/worker/processing.rs |
| What to verify | Each record type deserialized and routed correctly. Unknown record types handled gracefully (logged, not panicked). Indexes (S:, SP:) updated on assertion ingest. |
| Command | /investigate ingest record routing - verify assertion/vote/epoch dispatch, index updates, and unknown-type handling |
2. Read Path (Query Engine, Materialized Views)
2.1 MV Fast Path
| Field | Value |
| --- | --- |
| Priority | Critical |
| Why it matters | The MV fast path is the primary read performance mechanism. If it silently serves stale data or misses updates, query results are wrong. |
| Key files | crates/stemedb-query/src/engine.rs, crates/stemedb-query/src/query.rs |
| What to verify | try_fast_path() looks up MV:{subject}:{predicate}. Returns immediately if MV exists and no features bypass it. Falls through to slow path when MV missing. |
| Command | /trace-feature MV fast path - trace from QueryEngine::execute through try_fast_path, MV lookup, and fallback to slow path |
2.2 MV Staleness Detection
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Without staleness detection, the fast path serves arbitrarily old cached results. |
| Key files | crates/stemedb-query/src/engine.rs, crates/stemedb-query/src/query.rs |
| What to verify | max_stale parameter on Query. If set and MV age exceeds threshold, falls through to slow path. No max_stale = any MV age accepted (backward compatible). max_stale=0 rejects all but brand-new MVs. |
| Command | /investigate MV staleness - verify threshold comparison logic, edge cases (zero, None), and that stale MVs trigger slow path with debug log |
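The three cases listed above (unset, zero, and a positive threshold) reduce to one comparison. A minimal sketch, assuming the age and the threshold are both available as `Duration` values; the function name is hypothetical.

```rust
use std::time::Duration;

/// Returns true when the MV may be served from the fast path.
/// None means no staleness limit; Some(limit) accepts ages up to and
/// including the limit, so a zero limit only accepts an age of zero.
fn mv_is_fresh(mv_age: Duration, max_stale: Option<Duration>) -> bool {
    match max_stale {
        None => true,
        Some(limit) => mv_age <= limit,
    }
}
```

Whether the boundary is inclusive (`<=`) or exclusive (`<`) in the engine is exactly the kind of detail this entry is meant to pin down; the sketch picks inclusive so that max_stale=0 still admits a brand-new MV.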
2.3 Fast Path Bypass Conditions
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Several features must bypass the fast path because MVs don't capture their state. Missing a bypass means wrong results. |
| Key files | crates/stemedb-query/src/engine.rs |
| What to verify | Fast path bypassed when: as_of is set (time-travel), since is set (changelog), decay_halflife is set (confidence decay), source_class_decay is enabled, vector_near is set. All other queries use fast path. |
| Command | /investigate fast path bypass - enumerate all conditions that force slow path and verify each is correctly checked |
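The five bypass conditions can be written as a single predicate, which gives the audit something concrete to diff against. The `Query` struct below is a hypothetical stand-in that carries only the fields named in this entry; the field types are assumptions.

```rust
/// Hypothetical subset of the query type; only the bypass-relevant fields.
#[derive(Default)]
struct Query {
    as_of: Option<u64>,           // time-travel timestamp
    since: Option<u64>,           // changelog lower bound
    decay_halflife: Option<u64>,  // confidence decay half-life
    source_class_decay: bool,     // per-tier decay enabled
    vector_near: Option<Vec<f32>>, // semantic search target
}

/// True when any feature is set whose state an MV cannot capture.
fn must_bypass_fast_path(q: &Query) -> bool {
    q.as_of.is_some()
        || q.since.is_some()
        || q.decay_halflife.is_some()
        || q.source_class_decay
        || q.vector_near.is_some()
}
```

When checking engine.rs, any query field that changes results but is absent from the real predicate is a candidate bug.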
2.4 Slow Path Filtering & Index Routing
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | The slow path is the correctness fallback. If index routing picks the wrong index or filtering misses conditions, results are incorrect. |
| Key files | crates/stemedb-query/src/engine.rs, crates/stemedb-storage/src/index_store.rs |
| What to verify | QueryEngine routes: SP index (subject+predicate) -> S index (subject only) -> full scan. Query::matches() checks all filter fields (subject, predicate, lifecycle, epoch, as_of, visual_near, metadata filters). |
| Command | /trace-feature query index routing - trace from execute() through index selection, candidate fetch, matches() filtering, and result assembly |
2.5 Materializer Worker
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | The materializer pre-computes winning assertions for O(1) reads. If it fails to run or computes wrong winners, the fast path serves incorrect data. |
| Key files | crates/stemedb-query/src/materializer.rs |
| What to verify | step() processes pending subject/predicate pairs. run() loops continuously. run_notified() wakes on Notify events. materialize_pair() applies lens, writes MV, writes changelog on winner change, fires escalation checks. |
| Command | /trace-feature materializer - trace step/run/run_notified cycle, MV write, changelog write, and escalation trigger |
3. Lens Resolution
3.1 RecencyLens
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Default lens. Picks the newest assertion. If timestamp comparison is wrong, newest-wins semantics break. |
| Key files | crates/stemedb-lens/src/recency.rs |
| What to verify | Selects assertion with highest timestamp. Computes conflict_score. Empty candidates return empty resolution. |
| Command | /investigate RecencyLens - verify timestamp comparison, conflict score computation, empty/single candidate edge cases |
3.2 ConsensusLens
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Groups assertions by object value, picks the group with most support. Incorrect grouping means wrong consensus. |
| Key files | crates/stemedb-lens/src/consensus.rs |
| What to verify | Groups by object value. Picks group with highest total confidence. Conflict score reflects inter-group disagreement. |
| Command | /investigate ConsensusLens - verify grouping logic, confidence aggregation, and conflict score for multi-group scenarios |
3.3 ConfidenceLens (formerly AuthorityLens)
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Selects by raw confidence field. The rename from AuthorityLens must not have broken routing. |
| Key files | crates/stemedb-lens/src/confidence.rs |
| What to verify | Picks assertion with highest confidence field. LensDto::Confidence routes here. Old Authority name no longer routes here. |
| Command | /investigate ConfidenceLens - verify confidence-field selection, DTO routing, and that Authority name redirects to TrustAwareAuthorityLens |
3.4 VoteAwareConsensusLens
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Real vote-weighted resolution. If vote counts are miscounted or weights wrong, democratic consensus breaks. |
| Key files | crates/stemedb-lens/src/vote_aware_consensus.rs, crates/stemedb-storage/src/vote_store/ |
| What to verify | Fetches vote counts/weights from VoteStore. Groups by object value weighted by votes. Falls back gracefully when no votes exist. |
| Command | /trace-feature VoteAwareConsensusLens - trace from lens resolve through VoteStore lookup, weight calculation, and winner selection |
3.5 TrustAwareAuthorityLens
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Weights assertions by agent TrustRank. If trust scores are not fetched or are weighted incorrectly, the reputation system is decorative. |
| Key files | crates/stemedb-lens/src/trust_aware_authority.rs, crates/stemedb-storage/src/trust_rank_store/ |
| What to verify | Fetches per-agent TrustRank. Weights assertion confidence by trust score. LensDto::Authority routes here (not to ConfidenceLens). Falls back when TrustRankStore unavailable. |
| Command | /trace-feature TrustAwareAuthorityLens - trace from resolve through TrustRankStore lookup, weight multiplication, and winner selection |
3.6 EpochAwareLens
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Filters out assertions from superseded epochs. Without this, old paradigm data contaminates current results. |
| Key files | crates/stemedb-lens/src/lib.rs (epoch_aware logic may be inlined or split) |
| What to verify | Uses O(1) SUPERSEDED:{epoch_id} marker lookup. Fail-open on missing markers. Cycle detection + max depth 100. Decorates any inner lens. |
| Command | /investigate EpochAwareLens - verify marker-based filtering, fail-open semantics, cycle detection, and decorator pattern with inner lens |
3.7 SkepticLens
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Surfaces disagreement instead of resolving it. Critical for the browser extension's contradiction overlay. |
| Key files | crates/stemedb-lens/src/skeptic.rs, crates/stemedb-query/src/skeptic.rs, crates/stemedb-api/src/handlers/skeptic.rs |
| What to verify | Returns ConflictAnalysis with Shannon entropy-based conflict score. ResolutionStatus (Unanimous/Agreed/Contested) thresholds correct. All claims ranked by weight. API endpoint /v1/skeptic returns full analysis. |
| Command | /trace-feature SkepticLens - trace from API endpoint through SkepticResolver, entropy computation, status classification, and response assembly |
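One plausible reading of "Shannon entropy-based conflict score" is the entropy of the normalized claim weights divided by its maximum, log2(n). The sketch below implements that reading only; the normalization, the handling of non-positive weights, and the function name are assumptions, and the ResolutionStatus thresholds are not modelled because the checklist does not state them.

```rust
/// Normalized Shannon entropy over claim weights, in [0.0, 1.0].
/// 0.0 means a single claim carries all the weight; 1.0 means the
/// weight is spread evenly across all claims.
fn entropy_conflict(weights: &[f64]) -> f64 {
    // Ignore non-finite and non-positive weights (an assumption).
    let positive: Vec<f64> = weights
        .iter()
        .copied()
        .filter(|w| w.is_finite() && *w > 0.0)
        .collect();
    if positive.len() < 2 {
        return 0.0;
    }
    let total: f64 = positive.iter().sum();
    let entropy: f64 = positive
        .iter()
        .map(|w| {
            let p = w / total;
            -p * p.log2()
        })
        .sum();
    let max_entropy = (positive.len() as f64).log2();
    (entropy / max_entropy).clamp(0.0, 1.0)
}
```

If the real score is not normalized by log2(n), the status thresholds will only make sense for a fixed number of claims, which is worth checking in skeptic.rs.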
3.8 LayeredConsensusLens
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Per-source-class consensus for the pharma vertical. Ensures Regulatory tier outweighs Anecdotal even with fewer assertions. |
| Key files | crates/stemedb-lens/src/layered_consensus.rs, crates/stemedb-api/src/handlers/layered.rs |
| What to verify | Groups by SourceClass tier. Per-tier resolution with individual conflict scores. Cross-tier conflict via Shannon entropy. Overall winner from highest-authority tier. API endpoint /v1/layered returns LayeredQueryResponse. |
| Command | /trace-feature LayeredConsensusLens - trace from API endpoint through tier grouping, per-tier resolution, cross-tier conflict, and overall winner |
3.9 ConstraintsLens
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Pre-flight safety checks for must_use/forbidden/prefer predicates. Incorrect categorization could allow forbidden items or miss required ones. |
| Key files | crates/stemedb-lens/src/constraints.rs, crates/stemedb-api/src/handlers/constraints.rs |
| What to verify | Categorizes by predicate pattern: `must_use:*`, `forbidden:*`, `prefer:*`. Priority: must_use > forbidden > prefer. Sorted by confidence within category. Non-constraint predicates ignored. API endpoint /v1/constraints returns ConstraintsResponse. |
| Command | /investigate ConstraintsLens - verify predicate pattern matching, priority ordering, confidence sort, and that non-constraint predicates are fully excluded |
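The categorize-then-order rule can be modelled in a few lines. This is a sketch under assumptions: the names are hypothetical, the patterns are treated as plain prefixes, and "sorted by confidence" is taken to mean descending.

```rust
/// Variant order doubles as priority: MustUse < Forbidden < Prefer in
/// derive(Ord) terms, so sorting ascending puts must_use first.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum ConstraintKind {
    MustUse,
    Forbidden,
    Prefer,
}

/// Classifies a predicate by prefix; anything else is not a constraint.
fn classify(predicate: &str) -> Option<ConstraintKind> {
    if predicate.starts_with("must_use:") {
        Some(ConstraintKind::MustUse)
    } else if predicate.starts_with("forbidden:") {
        Some(ConstraintKind::Forbidden)
    } else if predicate.starts_with("prefer:") {
        Some(ConstraintKind::Prefer)
    } else {
        None
    }
}

/// Drops non-constraint predicates, then orders by category priority
/// and by descending confidence within a category.
fn rank_constraints(items: Vec<(String, f64)>) -> Vec<(ConstraintKind, String, f64)> {
    let mut out: Vec<(ConstraintKind, String, f64)> = items
        .into_iter()
        .filter_map(|(pred, conf)| classify(&pred).map(|kind| (kind, pred, conf)))
        .collect();
    out.sort_by(|a, b| a.0.cmp(&b.0).then(b.2.total_cmp(&a.2)));
    out
}
```

A predicate such as `must_used:x` or `prefer` without the colon should fall into the ignored bucket; those near-miss cases are useful inputs for the real pattern matcher.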
4. Trust & Safety
4.1 TrustRank Engine
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Foundation of the reputation system. If trust scores drift, decay wrong, or clamp incorrectly, the entire trust model is unreliable. |
| Key files | crates/stemedb-storage/src/trust_rank_store/store_impl.rs, crates/stemedb-storage/src/trust_rank_store/model.rs |
| What to verify | Per-agent trust score storage and retrieval. record_outcome() for accuracy tracking. Trust score clamping to valid range. Decay calculation with configurable half-life. |
| Command | /investigate TrustRank engine - verify score storage, outcome recording, clamping, and decay calculation |
4.2 Gold Standard Verification
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Sybil defense. If agents can game gold standards (verify repeatedly, get wrong rewards), the trust bootstrapping mechanism is broken. |
| Key files | crates/stemedb-storage/src/gold_standard_store.rs, crates/stemedb-storage/src/trust_rank_store/store_impl.rs, crates/stemedb-storage/src/trust_rank_store/gold_standard_tests.rs |
| What to verify | GoldStandard CRUD operations. verify_agent_against_gold_standard() with correct/incorrect matching. Deduplication markers at GS_VERIFIED:{agent_id}:{subject}:{predicate}. Trust adjustments: +0.05 reward, -0.1 penalty. Clamping after adjustment. |
| Command | /trace-feature gold standard verification - trace from admin API through gold standard creation, agent verification, trust adjustment, and dedup marker |
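The adjustment and deduplication rules combine as in the sketch below. The reward and penalty values and the marker key format come from this entry; the valid range of [0.0, 1.0], the in-memory `HashSet` standing in for the KV store, and the function names are assumptions.

```rust
use std::collections::HashSet;

const REWARD: f64 = 0.05;
const PENALTY: f64 = 0.1;

/// Applies the reward or penalty, then clamps (range assumed to be [0, 1]).
fn adjust_trust(score: f64, matched: bool) -> f64 {
    let delta = if matched { REWARD } else { -PENALTY };
    (score + delta).clamp(0.0, 1.0)
}

/// Returns the adjusted score on first verification, or None when the
/// dedup marker for this agent/subject/predicate already exists.
fn verify_once(
    seen: &mut HashSet<String>,
    agent_id: &str,
    subject: &str,
    predicate: &str,
    score: f64,
    matched: bool,
) -> Option<f64> {
    let marker = format!("GS_VERIFIED:{}:{}:{}", agent_id, subject, predicate);
    if !seen.insert(marker) {
        return None; // already verified: no second reward or penalty
    }
    Some(adjust_trust(score, matched))
}
```

One thing the sketch leaves open is whether the marker and the score update are written atomically; if they are not, a crash between the two could either lose the adjustment or allow a repeat.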
4.3 Escalation Triggers
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Active safety system. If high-conflict assertions don't fire escalations, dangerous disagreements go unnoticed. |
| Key files | crates/stemedb-storage/src/escalation_store.rs, crates/stemedb-core/src/types/escalation.rs, crates/stemedb-query/src/materializer.rs |
| What to verify | EscalationPolicy with configurable threshold + level. Materializer fires events when conflict_score exceeds policy threshold. Events stored at ESC:{timestamp_nanos}:{id_hex}. Predicate pattern matching on policies. API endpoints for query and resolution. |
| Command | /trace-feature escalation triggers - trace from materializer conflict_score computation through policy check, event creation, storage, and API retrieval |
4.4 Conflict Score Computation
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Conflict score drives escalations, filtering, and the Skeptic lens. If the formula is wrong, downstream features are unreliable. |
| Key files | crates/stemedb-lens/src/traits.rs (compute_conflict_score function) |
| What to verify | Normalized variance of confidence values. 0 or 1 candidates = 0.0. All same confidence = 0.0. Max variance (0.0 vs 1.0) = 1.0. NaN handling returns 0.0. Score added to all lens resolutions and MaterializedViews. |
| Command | /investigate conflict score - verify formula correctness, edge cases (empty, single, NaN), and propagation to Resolution and MaterializedView |
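The stated edge cases are consistent with population variance divided by 0.25, the largest variance any set of values in [0, 1] can have. The sketch below assumes that normalization; it is a model to compare compute_conflict_score against, not a copy of it.

```rust
/// Population variance of the confidences, scaled so that the extreme
/// split (0.0 vs 1.0) maps to 1.0. Returns 0.0 for fewer than two
/// candidates and for any input that produces NaN.
fn conflict_score_sketch(confidences: &[f64]) -> f64 {
    if confidences.len() < 2 {
        return 0.0;
    }
    let n = confidences.len() as f64;
    let mean = confidences.iter().sum::<f64>() / n;
    let variance = confidences.iter().map(|c| (c - mean).powi(2)).sum::<f64>() / n;
    // 0.25 is the maximum population variance for values in [0, 1].
    let score = variance / 0.25;
    if score.is_nan() {
        0.0
    } else {
        score.clamp(0.0, 1.0)
    }
}
```

If the real function uses sample variance (dividing by n - 1) instead, the 0.0 vs 1.0 case would exceed 1.0 before clamping, so the two-candidate test is a quick way to tell which variant is implemented.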
4.5 Conflict Score Filtering
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Browser extension needs "only high-conflict claims." If filtering is wrong, the UI shows wrong results or nothing. |
| Key files | crates/stemedb-query/src/engine.rs, crates/stemedb-api/src/dto/query_params.rs |
| What to verify | min_conflict_score and max_conflict_score on Query. Fast-path filtering checks MV conflict_score. API validation: scores 0.0-1.0, finite (rejects NaN/Inf). |
| Command | /investigate conflict score filtering - verify min/max thresholds on fast path, boundary behavior, NaN/Inf rejection, and combination with other filters |
4.6 Batch TrustRank Decay API
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | External orchestrators (Gardener) need to trigger scheduled trust decay. If the endpoint doesn't work, trust scores never decay. |
| Key files | crates/stemedb-api/src/handlers/admin.rs, crates/stemedb-storage/src/trust_rank_store/store_impl.rs |
| What to verify | POST /v1/admin/decay-trust-ranks accepts now and half_life_seconds. Delegates to TrustRankStore::decay_trust_ranks(). Response includes decayed_count, timestamp_used, half_life_used, status. |
| Command | /investigate trust decay API - verify endpoint accepts params, delegates correctly, and response has all required fields |
5. Search & Similarity
5.1 HNSW Vector Search
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Semantic k-NN search for embeddings. If the index returns wrong neighbors or crashes on edge cases, the semantic search feature is broken. |
| Key files | crates/stemedb-storage/src/vector_index.rs, crates/stemedb-query/src/engine.rs |
| What to verify | HnswVectorIndex with RwLock protection. Input validation: dimension mismatch, NaN, Infinity rejected. Idempotent insert. QueryEngine uses O(log N) HNSW when vector_near set + index configured. Falls back to standard path without index. |
| Command | /trace-feature vector search - trace from API query with vector_near through QueryEngine, HNSW lookup, candidate fetch, and result assembly |
5.2 BK-Tree Visual Search
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Image provenance via perceptual hashes. If hamming distance or BK-tree traversal is wrong, similar images aren't found. |
| Key files | crates/stemedb-storage/src/visual_index.rs, crates/stemedb-query/src/engine.rs |
| What to verify | BkTreeVisualIndex with hamming distance metric. Threshold clamped to 0-64. Results sorted by distance ascending. Idempotent insert. QueryEngine uses O(log N) BK-tree when visual_near set + index configured. Falls back to brute-force scan without index. |
| Command | /trace-feature visual search - trace from API query with visual_near through QueryEngine, BK-tree lookup, threshold filtering, and result assembly |
5.3 Visual Hash Brute-Force Fallback
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Without a BK-tree index, visual search falls back to O(N) scan in Query::matches(). If the fallback is broken, visual search fails silently when no index is configured. |
| Key files | crates/stemedb-query/src/query.rs |
| What to verify | visual_near and visual_threshold on Query. matches() computes hamming distance when set. Invalid hex rejected. Wrong-length hash rejected. Default threshold behavior. |
| Command | /investigate visual hash brute-force - verify hamming distance in matches(), invalid input handling, and default threshold |
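The brute-force check amounts to parsing two hashes, XOR-ing them, and counting set bits. The sketch assumes 64-bit perceptual hashes encoded as exactly 16 hex characters (suggested by the 0-64 threshold range in 5.2, but not stated outright); the function names are hypothetical.

```rust
/// Parses a 64-bit hash from exactly 16 hex characters.
/// The explicit digit check matters: from_str_radix alone would accept
/// a leading '+' sign.
fn parse_hash(hex: &str) -> Option<u64> {
    if hex.len() != 16 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    u64::from_str_radix(hex, 16).ok()
}

/// Number of differing bits between two hashes.
fn hamming(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

/// None when either hash is malformed; otherwise whether the distance
/// is within the threshold (clamped to the 0-64 range).
fn visual_match(candidate: &str, target: &str, threshold: u32) -> Option<bool> {
    let limit = threshold.min(64);
    let distance = hamming(parse_hash(candidate)?, parse_hash(target)?);
    Some(distance <= limit)
}
```

The sketch returns None for malformed input so the caller can decide between rejecting the query and skipping the candidate; which of the two matches() does is worth confirming, since a malformed stored hash and a malformed query hash probably deserve different handling.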
6. Time & Decay
6.1 Time-Travel (as_of)
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Historical queries are fundamental to "Git for Truth." If as_of doesn't exclude future assertions or incorrectly uses MVs, historical queries are wrong. |
| Key files | crates/stemedb-query/src/query.rs, crates/stemedb-query/src/engine.rs |
| What to verify | as_of parameter filters assertions by timestamp <= as_of. Fast path bypassed (MVs reflect current state). Edge case: assertion.timestamp == as_of included. Works with lens resolution (only historical candidates). |
| Command | /investigate time-travel - verify as_of filtering in matches(), fast path bypass, exact-timestamp edge case, and lens interaction |
6.2 Change Tracking (since)
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | "What changed since I last looked?" is the returning consumer story. If changelog entries are missed or timestamps wrong, consumers miss updates. |
| Key files | crates/stemedb-query/src/materializer.rs, crates/stemedb-query/src/engine.rs, crates/stemedb-core/src/types/materialized.rs |
| What to verify | MVC:{subject}:{predicate}:{timestamp_nanos} changelog keys. Written when MV winner changes (not on same-winner re-materialization). since parameter triggers fetch_changes_since(). Entries sorted ascending. Fast path bypassed when since is set. |
| Command | /trace-feature change tracking - trace from materializer winner-change detection through changelog write, since-based fetch, and API response |
6.3 Semantic Decay (Confidence Half-Life)
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Old assertions should lose influence. If decay formula is wrong or not applied before lens resolution, stale high-confidence assertions dominate forever. |
| Key files | crates/stemedb-query/src/decay.rs, crates/stemedb-query/src/engine.rs |
| What to verify | apply_decay(): confidence * 2^(-(age / halflife)). Applied after filtering, before lens. Zero halflife = no decay (avoids div-by-zero). Future assertions = no decay. Confidence clamped to [0.0, 1.0]. Only confidence changes; other fields preserved. Fast path bypassed. |
| Command | /investigate semantic decay - verify formula, application order (after filter, before lens), zero-halflife safety, and field preservation |
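The formula and its guard conditions fit in one function. A minimal sketch, assuming age and half-life are expressed in the same unit as floating-point seconds; `decayed_confidence` is a hypothetical name, not the signature of apply_decay().

```rust
/// confidence * 2^(-(age / halflife)), with the guards listed above:
/// a zero (or negative) half-life and a future-dated assertion both
/// mean no decay, and the result is clamped to [0.0, 1.0].
fn decayed_confidence(confidence: f64, age_secs: f64, halflife_secs: f64) -> f64 {
    if halflife_secs <= 0.0 || age_secs <= 0.0 {
        return confidence.clamp(0.0, 1.0);
    }
    (confidence * 2f64.powf(-(age_secs / halflife_secs))).clamp(0.0, 1.0)
}
```

After exactly one half-life the confidence should be halved, and after two it should be quartered; those two points are cheap assertions to add to the real tests if they are not already there.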
6.4 Source-Class-Aware Decay
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Regulatory data should never decay. Anecdotal data should decay in 30 days. If tier-specific half-lives are wrong, the evidence hierarchy is undermined. |
| Key files | crates/stemedb-query/src/decay.rs, crates/stemedb-core/src/types/source.rs |
| What to verify | SourceClass::default_decay_days() returns tier-specific half-lives. Tier 0 (Regulatory) = no decay. Tier 5 (Anecdotal) = 30 days. apply_source_class_decay() uses per-assertion source_class. Time-travel compatible (uses as_of if set). |
| Command | /investigate source-class decay - verify per-tier half-life values, Regulatory no-decay, Anecdotal rapid decay, and as_of interaction |
6.5 Epoch Supersession at Query Time
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Superseded epoch assertions should be excluded from results. If markers aren't checked or fail-open is wrong, old paradigm data leaks into current queries. |
| Key files | crates/stemedb-lens/src/lib.rs, crates/stemedb-storage/src/supersession_store.rs |
| What to verify | is_epoch_superseded() uses O(1) marker lookup. Assertions from superseded epochs filtered before lens resolution. Fail-open: missing marker = not superseded. Works with all inner lenses. |
| Command | /investigate epoch supersession at query time - verify marker lookup, filtering position in pipeline, fail-open behavior, and inner lens compatibility |
7. Source Provenance
7.1 Source Document Storage
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | 100% citation recall requires every source document to be retrievable by its hash. If storage or retrieval is broken, provenance claims are unverifiable. |
| Key files | crates/stemedb-api/src/handlers/source.rs |
| What to verify | POST /v1/source stores document at SRC:{hash}. BLAKE3 content hash. Base64 encoding for binary-safe transport. 10MB size limit. Content-addressed (idempotent). GET /v1/provenance/{hash} retrieves by hash. Format: [content_type_len:4][content_type][content]. |
| Command | /trace-feature source document storage - trace from POST /v1/source through hashing, storage format, and GET /v1/provenance retrieval |
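The stored value layout can be exercised with a round-trip sketch. It covers only the [content_type_len:4][content_type][content] framing; hashing, Base64 transport, and the size limit are left out. Little-endian length encoding and the function names are assumptions.

```rust
/// Frames a document as [content_type_len:4][content_type][content].
/// Assumption: the 4-byte length is a little-endian u32.
fn encode_source(content_type: &str, content: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + content_type.len() + content.len());
    out.extend_from_slice(&(content_type.len() as u32).to_le_bytes());
    out.extend_from_slice(content_type.as_bytes());
    out.extend_from_slice(content);
    out
}

/// Splits a stored value back into (content_type, content).
/// Returns None for truncated values or a non-UTF-8 content type.
fn decode_source(buf: &[u8]) -> Option<(&str, &[u8])> {
    if buf.len() < 4 {
        return None;
    }
    let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let type_end = 4usize.checked_add(len)?;
    let content_type = std::str::from_utf8(buf.get(4..type_end)?).ok()?;
    Some((content_type, &buf[type_end..]))
}
```

Because the content length is implicit (everything after the content type), an empty document is representable, and the decoder cannot detect a value that lost bytes off its end; the content hash is what catches that on retrieval.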
7.2 Source Metadata Indexing
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Queryable metadata fields (journal, doi, platform, study_design) enable filtered searches. If indexing is broken, metadata queries return empty or wrong results. |
| Key files | crates/stemedb-storage/src/lib.rs (SourceMetadataIndexStore), crates/stemedb-ingest/src/worker/processing.rs |
| What to verify | SMV:{field}:{value} key pattern. Case-insensitive normalization. IngestWorker extracts indexed fields from source_metadata JSON. Query supports source_journal, source_doi, source_platform, source_study_design with AND semantics. Malformed JSON gracefully skipped. |
| Command | /investigate source metadata indexing - verify index key pattern, case normalization, ingest-time extraction, query-time filtering, and malformed JSON handling |
7.3 Rich Source Metadata (Opaque Blob)
| Field | Value |
| --- | --- |
| Priority | Low |
| Why it matters | `source_metadata: Option<Vec<u8>>` stores arbitrary provenance. If serialization roundtrip is broken, metadata is silently lost. |
| Key files | crates/stemedb-core/src/types/assertion.rs, crates/stemedb-api/src/dto/create.rs, crates/stemedb-api/src/dto/responses.rs |
| What to verify | `Vec<u8>` field for rkyv zero-copy. API accepts JSON string, converts to bytes. Response converts bytes back with defensive UTF-8 handling. Builder supports .source_metadata_json() and .source_metadata(). |
| Command | /investigate source metadata blob - verify serialization roundtrip, API JSON<->bytes conversion, and defensive UTF-8 handling |
8. API & Integration
8.1 Core CRUD Endpoints
| Field | Value |
| --- | --- |
| Priority | Critical |
| Why it matters | These are the primary write and read endpoints. If any is broken, the database is unusable. |
| Key files | crates/stemedb-api/src/handlers/assert.rs, crates/stemedb-api/src/handlers/vote.rs, crates/stemedb-api/src/handlers/epoch.rs, crates/stemedb-api/src/handlers/query.rs, crates/stemedb-api/src/handlers/health.rs |
| What to verify | Per-endpoint checks in the table below. |
| Command | /trace-feature core API endpoints - trace each CRUD endpoint from HTTP handler through DTO validation, WAL write (or query), and response assembly |

| Endpoint | Method | Handler | Verify |
| --- | --- | --- | --- |
| /v1/assert | POST | assert.rs | Accepts JSON, writes to WAL, returns assertion hash |
| /v1/vote | POST | vote.rs | High-throughput vote ingestion with provenance fields |
| /v1/epoch | POST | epoch.rs | Creates epoch with optional supersedes field |
| /v1/query | GET | query.rs | Subject/Predicate/Lens/Lifecycle/Epoch/as_of/since/decay/vector/visual filters |
| /v1/health | GET | health.rs | Returns assertion count, uptime |
8.2 Advanced Query Endpoints
| Field | Value |
| --- | --- |
| Priority | High |
| Why it matters | Specialized query endpoints serve distinct use cases. If routing is wrong, queries silently fall through to the wrong handler. |
| Key files | crates/stemedb-api/src/handlers/skeptic.rs, crates/stemedb-api/src/handlers/layered.rs, crates/stemedb-api/src/handlers/constraints.rs |
| What to verify | Per-endpoint checks in the table below. |
| Command | /investigate advanced query endpoints - verify each endpoint returns correct response type and that LensDto redirects work |

| Endpoint | Method | Handler | Verify |
| --- | --- | --- | --- |
| /v1/skeptic | GET | skeptic.rs | Returns ConflictAnalysis with entropy-based scoring |
| /v1/layered | GET | layered.rs | Returns LayeredQueryResponse with per-tier resolution |
| /v1/constraints | GET | constraints.rs | Returns ConstraintsResponse with must_use/forbidden/prefer |
8.3 Admin Endpoints
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Admin endpoints control trust decay, gold standards, and escalations. If access control is missing, any agent can manipulate trust. |
| Key files | crates/stemedb-api/src/handlers/admin.rs, crates/stemedb-api/src/handlers/gold_standard.rs, crates/stemedb-api/src/handlers/escalation.rs |
| What to verify | Per-endpoint checks in the table below. |
| Command | /investigate admin endpoints - verify each endpoint works, and note whether any access control exists (or is missing) |

| Endpoint | Method | Handler | Verify |
| --- | --- | --- | --- |
| /v1/admin/decay-trust-ranks | POST | admin.rs | Batch trust decay with configurable params |
| /v1/admin/gold-standards | POST/GET/DELETE | gold_standard.rs | Gold standard CRUD |
| /v1/admin/verify-agent | POST | gold_standard.rs | Agent verification against gold standard |
| /v1/admin/escalations | GET | escalation.rs | Query escalation events |
| /v1/admin/escalations/{id}/resolve | POST | escalation.rs | Resolve escalation |
8.4 Provenance & Audit Endpoints
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Audit trail and source provenance are compliance requirements. If broken, query decisions are untraceable. |
| Key files | crates/stemedb-api/src/handlers/source.rs, crates/stemedb-api/src/handlers/audit.rs |
| What to verify | Per-endpoint checks in the table below. |
| Command | /trace-feature audit trail - trace from query execution through QueryAudit creation, storage at AUD: key, and retrieval via audit endpoints |

| Endpoint | Method | Handler | Verify |
| --- | --- | --- | --- |
| /v1/source | POST | source.rs | Store source document by BLAKE3 hash |
| /v1/provenance/{hash} | GET | source.rs | Retrieve source document |
| /v1/audit/queries | GET | audit.rs | Query audit history by agent |
| /v1/audit/query/{id} | GET | audit.rs | Full reasoning trace for single query |
8.5 Quota Meter (TAN)
| Field | Value |
| --- | --- |
| Priority | Medium |
| Why it matters | Economic throttling prevents abuse. If the meter doesn't deduct correctly or the middleware doesn't enforce, agents can write unlimited data. |
| Key files | crates/stemedb-api/src/middleware/meter.rs, crates/stemedb-api/src/handlers/meter.rs, crates/stemedb-storage/src/quota_store/ |
| What to verify | Token Bucket algorithm with per-agent per-hour quotas. Cost model: Assert=10, Vote=1, Query=5+lens, +1/KB payload. MeterLayer tower middleware deducts on every request. GET /v1/meter/quota returns remaining. POST /v1/meter/quota/limit sets custom limits. |
| Command | /trace-feature quota meter - trace from HTTP request through MeterLayer middleware, cost calculation, QuotaStore deduction, and quota check endpoint |
8.6 OpenAPI / Swagger
| Field | Value |
| --- | --- |
| Priority | Low |
| Why it matters | Developer experience. If OpenAPI spec doesn't match actual endpoints, SDK generation produces wrong clients. |
| Key files | crates/stemedb-api/src/lib.rs (utoipa annotations) |
| What to verify | GET /swagger-ui serves interactive docs. All endpoints annotated with utoipa. DTOs have proper schema annotations. Endpoint list in OpenAPI matches actual routes. |
| Command | /investigate OpenAPI spec - verify all endpoints are annotated, DTO schemas are correct, and swagger-ui is served |
8.7 Vote Provenance Witness
| Field |
Value |
| Priority |
Medium |
| Why it matters |
Votes with provenance (source_url, observed_context) are cryptographic witnesses. Without validation, votes carry no evidentiary weight. |
| Key files |
crates/stemedb-core/src/types/voting.rs, crates/stemedb-api/src/handlers/vote.rs, crates/stemedb-api/src/dto/create.rs |
| What to verify |
source_url max 2048 chars (non-empty if provided). observed_context max 64KB. Backward compatible (existing votes without provenance remain valid). API DTOs serialize/deserialize both fields. |
| Command |
/investigate vote provenance — verify input validation limits, backward compatibility, and API roundtrip for source_url and observed_context |
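
The limits above can be written down as a small validation routine to compare against the handler. This is a sketch under stated assumptions: `validate_provenance` and `ProvenanceError` are hypothetical names, "chars" is read as Unicode scalar values, 64KB is read as 64 * 1024 bytes, and `observed_context` is treated as raw bytes. Both fields are optional so that votes without provenance stay valid.

```rust
pub const MAX_SOURCE_URL_CHARS: usize = 2048;
pub const MAX_OBSERVED_CONTEXT_BYTES: usize = 64 * 1024;

#[derive(Debug, PartialEq, Eq)]
pub enum ProvenanceError {
    EmptySourceUrl,
    SourceUrlTooLong,
    ObservedContextTooLarge,
}

/// Checks the optional provenance fields of a vote.
/// `None` for both fields is accepted (backward compatibility).
pub fn validate_provenance(
    source_url: Option<&str>,
    observed_context: Option<&[u8]>,
) -> Result<(), ProvenanceError> {
    if let Some(url) = source_url {
        if url.is_empty() {
            return Err(ProvenanceError::EmptySourceUrl);
        }
        if url.chars().count() > MAX_SOURCE_URL_CHARS {
            return Err(ProvenanceError::SourceUrlTooLong);
        }
    }
    if let Some(ctx) = observed_context {
        if ctx.len() > MAX_OBSERVED_CONTEXT_BYTES {
            return Err(ProvenanceError::ObservedContextTooLarge);
        }
    }
    Ok(())
}
```

The boundary values (exactly 2048 characters, exactly 64 * 1024 bytes) are the cases most worth checking in the real tests.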
9. Storage Engine
9.1 HybridStore Routing
| Field |
Value |
| Priority |
Critical |
| Why it matters |
Every KV operation flows through HybridStore. Wrong routing sends write-heavy data to the read-optimized backend (or vice versa), causing performance degradation or correctness issues. |
| Key files |
crates/stemedb-storage/src/hybrid_backend.rs, crates/stemedb-storage/src/fjall_backend.rs, crates/stemedb-storage/src/redb_backend.rs |
| What to verify |
Prefix-based dispatch: fjall (LSM) for write-heavy (H:, V:, VC:, VW:, E:, SUPERSEDED:, __CURSOR__:), redb (B-tree) for read-heavy (S:, SP:, MV:, TR:, QA:, QT:, TP:, GS:, ESC:). All KVStore trait methods dispatched correctly. No key falls through unmatched. |
| Command |
/trace-feature HybridStore routing — trace prefix dispatch logic, verify every key prefix is routed, and confirm no unmatched-prefix fallthrough |
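
For orientation, prefix dispatch over the two lists above might look like the sketch below. It is illustrative only: `Backend` and `route` are hypothetical names, keys are assumed to be strings that begin with the prefix, and returning `None` for an unknown prefix is one possible way to make fallthrough visible rather than silently defaulting to a backend.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Fjall,
    Redb,
}

/// Write-heavy prefixes (LSM backend).
const FJALL_PREFIXES: &[&str] = &["H:", "V:", "VC:", "VW:", "E:", "SUPERSEDED:", "__CURSOR__:"];
/// Read-heavy prefixes (B-tree backend).
const REDB_PREFIXES: &[&str] = &["S:", "SP:", "MV:", "TR:", "QA:", "QT:", "TP:", "GS:", "ESC:"];

/// Returns the backend for `key`, or `None` if no known prefix matches.
pub fn route(key: &str) -> Option<Backend> {
    if FJALL_PREFIXES.iter().any(|p| key.starts_with(*p)) {
        Some(Backend::Fjall)
    } else if REDB_PREFIXES.iter().any(|p| key.starts_with(*p)) {
        Some(Backend::Redb)
    } else {
        None
    }
}
```

Because every prefix ends in a colon, short prefixes such as `S:` and `E:` do not shadow longer ones such as `SUPERSEDED:` and `ESC:`. The real dispatch should be checked for the same property.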
9.2 FjallStore Backend
| Field |
Value |
| Priority |
High |
| Why it matters |
Write-heavy paths depend on fjall. If atomic operations (increment, CAS) don't work correctly under concurrency, vote counts and cursors corrupt. |
| Key files |
crates/stemedb-storage/src/fjall_backend.rs |
| What to verify |
All KVStore trait methods implemented. DashMap per-key locks for atomics. ACID transactions. Error mapping to StorageError::Backend. |
| Command |
/investigate FjallStore — verify atomic operations under concurrent access, DashMap locking, and error mapping |
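
The per-key locking pattern under review can be illustrated with the standard library alone. The sketch below swaps DashMap for a `Mutex<HashMap<..>>` of per-key locks and uses an in-memory map instead of fjall, so it shows the read-modify-write discipline rather than the actual backend; `CounterStore` and its methods are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, MutexGuard};

/// Locks a mutex, recovering the guard if a previous holder panicked.
fn lock<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

#[derive(Default)]
pub struct CounterStore {
    values: Mutex<HashMap<String, u64>>,
    key_locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl CounterStore {
    /// Atomically adds `delta` to the counter at `key` and returns the new value.
    pub fn increment(&self, key: &str, delta: u64) -> u64 {
        // Fetch (or create) the lock for this key, then hold it across
        // the whole read-modify-write so increments on the same key serialize.
        let key_lock = lock(&self.key_locks)
            .entry(key.to_string())
            .or_default()
            .clone();
        let _guard = lock(&key_lock);

        let current = lock(&self.values).get(key).copied().unwrap_or(0);
        let next = current.saturating_add(delta);
        lock(&self.values).insert(key.to_string(), next);
        next
    }

    pub fn get(&self, key: &str) -> u64 {
        lock(&self.values).get(key).copied().unwrap_or(0)
    }
}
```

A concurrency test for the real backend can follow the same shape: spawn several threads that increment one key and assert that no update is lost.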
9.3 RedbStore Backend
| Field |
Value |
| Priority |
High |
| Why it matters |
Read-heavy paths depend on redb. If ACID transactions don't commit correctly, materialized views and indexes corrupt. |
| Key files |
crates/stemedb-storage/src/redb_backend.rs |
| What to verify |
All KVStore trait methods implemented. ACID transactions for writes. Prefix scan via range queries. Error mapping to StorageError::Backend. |
| Command |
/investigate RedbStore — verify ACID transactions, prefix scan correctness, and error mapping |
9.4 Key Codec & Subject Co-location
| Field |
Value |
| Priority |
High |
| Why it matters |
Key codec is the foundation for distributed sharding. If keys aren't co-located by subject, range sharding (Phase 6) will split related data across nodes. |
| Key files |
crates/stemedb-storage/src/key_codec/mod.rs, crates/stemedb-storage/src/key_codec/tests.rs |
| What to verify |
40+ key builder functions. Subject-prefixed keys use {subject}\x00 separator. Global keys use \x00 prefix (sort first). Subject validation. Zero hardcoded key patterns in store files (all use key_codec). test_subject_colocation and test_global_keys_sort_first pass. |
| Command |
/investigate key codec — verify subject co-location layout, \x00 separator/prefix usage, and that all 91+ call sites use key_codec (no hardcoded patterns) |
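
The layout rules above can be made concrete with two toy builders. This is a sketch, not the key_codec API: `subject_key` and `global_key` are hypothetical names, and subject validation is assumed to mean "non-empty and free of \x00 bytes".

```rust
/// Builds `{subject}\x00{suffix}`. Returns `None` for an empty subject or
/// one containing a NUL byte, since either would break the layout.
pub fn subject_key(subject: &str, suffix: &str) -> Option<Vec<u8>> {
    if subject.is_empty() || subject.as_bytes().contains(&0) {
        return None;
    }
    let mut key = Vec::with_capacity(subject.len() + 1 + suffix.len());
    key.extend_from_slice(subject.as_bytes());
    key.push(0);
    key.extend_from_slice(suffix.as_bytes());
    Some(key)
}

/// Builds `\x00{name}`. The leading NUL sorts before every valid subject.
pub fn global_key(name: &str) -> Vec<u8> {
    let mut key = Vec::with_capacity(1 + name.len());
    key.push(0);
    key.extend_from_slice(name.as_bytes());
    key
}
```

Under bytewise ordering, all keys for one subject form a contiguous range, and a subject that is a prefix of another (for example `aspirin` and `aspirin2`) cannot interleave with it because \x00 sorts below any byte allowed in a subject.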
9.5 StorageError Generalization
| Field |
Value |
| Priority |
Low |
| Why it matters |
The error type was generalized from Sled to Backend(String). Code that still references the old variant fails to compile, so this is a quick confirmation.
| Key files |
crates/stemedb-storage/src/error.rs |
| What to verify |
StorageError::Backend(String) exists. No references to StorageError::Sled. Both fjall and redb map their errors correctly. |
| Command |
/investigate StorageError — verify Backend variant exists, no Sled references remain, and both backends map errors correctly |
10. Cross-Cutting Concerns
10.1 No-Unwrap Enforcement
| Field |
Value |
| Priority |
Critical |
| Why it matters |
unwrap() and expect() panic on None or Err. CI enforces the ban at deny level because a single slip in a request path can crash the server.
| Key files |
All crates/stemedb-*/src/**/*.rs |
| What to verify |
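clippy::unwrap_used and clippy::expect_used at deny level, set via [workspace.lints.clippy] in the workspace Cargo.toml or crate-level #![deny(...)] attributes (clippy.toml can tune these lints, e.g. allow-unwrap-in-tests, but cannot set lint levels). No unwrap() or expect() in production code (test code is allowed). CI runs cargo clippy -- -D warnings.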
| Command |
/investigate no-unwrap enforcement — verify clippy config, scan for any unwrap/expect in production code, and confirm CI enforcement |
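
As a small illustration of what the lint expects from production code, the sketch below applies the two lints to a single function and propagates the error instead of unwrapping. `parse_port` and `port_or_default` are hypothetical examples, not functions from the codebase, and the attribute is placed on the function only to keep the snippet self-contained.

```rust
use std::num::ParseIntError;

/// With these lints at deny level, clippy rejects `.unwrap()` or `.expect()`
/// inside this function, so the error is returned to the caller instead.
#[deny(clippy::unwrap_used, clippy::expect_used)]
pub fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
    raw.trim().parse::<u16>()
}

/// Falls back to a default instead of panicking on bad input.
/// `unwrap_or` is not covered by `unwrap_used` because it cannot panic.
#[deny(clippy::unwrap_used, clippy::expect_used)]
pub fn port_or_default(raw: &str, default: u16) -> u16 {
    parse_port(raw).unwrap_or(default)
}
```

When scanning, non-panicking combinators such as `unwrap_or`, `unwrap_or_else`, and `unwrap_or_default` are acceptable and should not be counted as violations.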
10.2 Structured Logging
| Field |
Value |
| Priority |
Medium |
| Why it matters |
println!/eprintln! bypass structured logging, so their output carries no levels, spans, or fields and cannot be filtered or correlated in production.
| Key files |
All crates/stemedb-*/src/**/*.rs |
| What to verify |
tracing used everywhere (info!, warn!, error!, debug!). clippy::print_stdout/print_stderr at warn level. #[instrument] on public methods in WAL, storage, ingestion, and lens code. stemedb-sim may use #![allow()] for CLI output. |
| Command |
/investigate structured logging — verify tracing usage, clippy print enforcement, and #[instrument] on critical public methods |
10.3 rkyv Zero-Copy Serialization
| Field |
Value |
| Priority |
High |
| Why it matters |
All data goes through rkyv. If raw AllocSerializer is used instead of the wrapper, serialization may miss fields or produce incompatible formats. |
| Key files |
crates/stemedb-core/src/serde.rs |
| What to verify |
stemedb_core::serde::{serialize, deserialize} wrapper functions exist. No raw AllocSerializer in production code. Roundtrip tests for all core types (Assertion, Vote, Epoch, MaterializedView, ChangeEntry, etc.). |
| Command |
/investigate rkyv serialization — verify wrapper usage, scan for raw AllocSerializer, and confirm roundtrip tests for all core types |
10.4 Go SDK (steme)
| Field |
Value |
| Priority |
Medium |
| Why it matters |
The Go SDK is the primary client integration. If it's out of sync with the API, external consumers get errors. |
| Key files |
sdk/go/steme/client.go, sdk/go/steme/assertion.go, sdk/go/steme/query.go, sdk/go/steme/signer.go, sdk/go/steme/types.go, sdk/go/steme/errors.go |
| What to verify |
HTTP client covers all endpoints. Ed25519 signing matches server verification. Fluent builder pattern for assertions and queries. Error types match API error responses. Integration test exists. Types match latest API DTOs. |
| Command |
/trace-feature Go SDK — trace from client.Assert() through HTTP request construction, Ed25519 signing, and response parsing. Verify all API endpoints have SDK methods. |
10.5 Go ADK Integration
| Field |
Value |
| Priority |
Medium |
| Why it matters |
ADK-Go tools let AI agents interact with Episteme. If tool definitions are wrong, agents can't use the database. |
| Key files |
sdk/go/adk/tools.go, sdk/go/adk/callbacks.go, sdk/go/adk/config.go, sdk/go/adk/types.go |
| What to verify |
Tool definitions match Episteme API capabilities. Callbacks wire correctly. Config supports endpoint and auth. Types match API DTOs. |
| Command |
/investigate ADK-Go integration — verify tool definitions, callback wiring, and type alignment with latest API |
10.6 SourceClass Enum & Evidence Hierarchy
| Field |
Value |
| Priority |
Medium |
| Why it matters |
The 6-tier evidence hierarchy (Regulatory through Anecdotal) drives decay rates, authority weights, and layered consensus. If tiers are mis-numbered, the entire evidence model is inverted. |
| Key files |
crates/stemedb-core/src/types/source.rs |
| What to verify |
6 tiers: Regulatory(0), Clinical(1), Observational(2), Expert(3), Community(4), Anecdotal(5). tier() returns correct ordinal. default_decay_days() returns tier-specific values. authority_weight() returns tier-specific weights. Serialization preserves tier identity. |
| Command |
/investigate SourceClass — verify tier numbering, decay days, authority weights, and serialization roundtrip |
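
The tier numbering above can be pinned down with explicit discriminants, which is one way to rule out an accidental reordering. The sketch below is an assumption about shape only: `ALL` and `from_tier` are hypothetical helpers, and decay days and authority weights are left out because their values are not given here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[repr(u8)]
pub enum SourceClass {
    Regulatory = 0,
    Clinical = 1,
    Observational = 2,
    Expert = 3,
    Community = 4,
    Anecdotal = 5,
}

impl SourceClass {
    /// Every tier, listed from strongest to weakest evidence.
    pub const ALL: [SourceClass; 6] = [
        SourceClass::Regulatory,
        SourceClass::Clinical,
        SourceClass::Observational,
        SourceClass::Expert,
        SourceClass::Community,
        SourceClass::Anecdotal,
    ];

    /// Ordinal of the tier (0 = Regulatory, 5 = Anecdotal).
    pub fn tier(self) -> u8 {
        self as u8
    }

    /// Inverse of `tier`; `None` for values outside 0..=5.
    pub fn from_tier(tier: u8) -> Option<Self> {
        Self::ALL.iter().copied().find(|c| c.tier() == tier)
    }
}
```

A roundtrip check over every variant (tier to class and back) is a cheap way to confirm that serialization preserves tier identity.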
10.7 Simulation Pipeline
| Field |
Value |
| Priority |
Low |
| Why it matters |
The simulator tests the full pipeline under synthetic load. If it doesn't exercise all features, it provides false confidence. |
| Key files |
crates/stemedb-sim/src/runner.rs, crates/stemedb-sim/src/agent.rs, crates/stemedb-sim/src/strategy.rs |
| What to verify |
Runner exercises write path (assertions, votes, epochs). Agent strategies produce realistic data patterns. Results can be queried through standard query path. |
| Command |
/investigate simulation pipeline — verify runner exercises assertions/votes/epochs, agent strategies are diverse, and output is queryable |
Summary
| Section | Entries | Critical | High | Medium | Low |
|---|---|---|---|---|---|
| 1. Write Path | 8 | 3 | 3 | 1 | 0 |
| 2. Read Path | 5 | 1 | 4 | 0 | 0 |
| 3. Lens Resolution | 9 | 0 | 6 | 3 | 0 |
| 4. Trust & Safety | 6 | 0 | 3 | 3 | 0 |
| 5. Search & Similarity | 3 | 0 | 2 | 1 | 0 |
| 6. Time & Decay | 5 | 0 | 3 | 1 | 0 |
| 7. Source Provenance | 3 | 0 | 0 | 2 | 1 |
| 8. API & Integration | 7 | 1 | 1 | 4 | 1 |
| 9. Storage Engine | 5 | 1 | 3 | 0 | 1 |
| 10. Cross-Cutting | 7 | 1 | 1 | 4 | 1 |
| Total | 58 | 7 | 26 | 19 | 4 |
*Section 6 includes 1 entry at High that spans as_of+epoch interactions
Crate Coverage
| Crate | Entries |
|---|---|
| stemedb-wal | 1.1, 1.2, 1.3, 1.4 |
| stemedb-ingest | 1.5, 1.6, 1.7, 1.8 |
| stemedb-core | 7.3, 10.3, 10.6 |
| stemedb-storage | 5.1, 5.2, 9.1, 9.2, 9.3, 9.4, 9.5, 4.1, 4.2, 4.3, 7.2 |
| stemedb-query | 2.1-2.5, 5.3, 6.1-6.4 |
| stemedb-lens | 3.1-3.9, 4.4, 6.5 |
| stemedb-api | 8.1-8.7 |
| stemedb-sim | 10.7 |
| Go SDK (sdk/go/steme) | 10.4 |
| Go ADK (sdk/go/adk) | 10.5 |
Command Index
| Type | Count |
|---|---|
| /investigate | 38 |
| /trace-feature | 18 |
| Total | 56 |