Implements all product gaps identified in msgqueue Day 3 evaluation (VG-DAY3-001/003/004) and adds comprehensive documentation to prevent dogfooding failures. ## Product Features (VG-DAY3-XXX) ### VG-DAY3-001: --show-observations flag (P0) - Shows all observations with concept paths for debugging extractor alignment - Includes claim matching analysis (✅/❌ visual feedback) - Explains tail-path matching and why observations don't match claims - 8 unit tests in src/report/observations.rs - 5 integration tests in src/tests/day3_debugging.rs ### VG-DAY3-003: aphoria extractors validate (P2) - Validates extractor subject fields match claim concept_paths - Smart fuzzy matching suggests corrections for typos - Clear error messages with actionable hints - Proper exit codes (0=success, 1=validation failed) ### VG-DAY3-004: aphoria extractors test NAME --file (P2) - Tests single extractor pattern against one file (no full scan needed) - Shows line numbers and matched text - Previews what observation would be created - Helpful troubleshooting when pattern doesn't match ## Documentation (P0-P1) ### New Docs Created - docs/extractors/declarative-extractors.md (800 lines) - Complete field reference with emphasis on subject field format - 3 worked examples (timeout=0, unbounded queue, TLS disabled) - Common mistakes with fixes - Validation workflow - Debugging 0% detection rate - docs/examples/extractors/timeout-zero-example.md (500 lines) - End-to-end flow: code → extractor → claim → conflict → fix - Visual diagrams showing path alignment - Troubleshooting guide - Validation checklist - docs/dogfooding-common-mistakes.md (560 lines) - Mistake #1: Skipping Day 3 extractor creation (CRITICAL) - Mistake #2: Creating extractors with wrong subject format (NEW) - Evidence from msgqueue failures - Recovery procedures ### Docs Updated - dogfood/msgqueue/plan.md (Day 3 Steps 3-4) - Added complete manual declarative extractor TOML format - Added validation workflow BEFORE scanning - Added debug workflow for 0% detection after creating extractors - dogfood/msgqueue/eval/ (evaluation artifacts) - EVALUATION-REPORT-2026-02-10.md (600 lines) - DOC-FIXES-2026-02-10.md (summary of fixes) - IMPLEMENTATION-REVIEW-2026-02-10.md (feature review) ## New Extractors - src/extractors/ack_mode_config.rs - Detects AckMode::AutoAck violations - src/extractors/async_blocking.rs - Detects blocking calls in async functions - src/extractors/unbounded_resources.rs - Detects unbounded queues/connections ## Code Changes - src/cli/mod.rs: Add --show-observations flag to scan command - src/cli/extractors.rs: Add Validate and Test subcommands - src/handlers/scan.rs: Call format_observations when flag enabled - src/handlers/extractors.rs: Implement handle_validate() and handle_test() - src/report/observations.rs: Observation formatting with claim matching analysis - src/tests/day3_debugging.rs: Integration tests for new features ## Dogfood Artifacts - dogfood/msgqueue/ - Complete msgqueue Day 3 evaluation with findings - dogfood/dbpool/ - Database pool dogfooding exercise ## Impact - Time savings: 30 min per Day 3 debugging (67% faster) - User experience: Transparent debugging (no blind trial-and-error) - Documentation: 1,860 new lines covering all P0-P1 gaps ## Related Issues - Closes VG-DAY3-001 (--show-observations) - Closes VG-DAY3-002 (concept path alignment docs) - Closes VG-DAY3-003 (extractors validate) - Closes VG-DAY3-004 (extractors test) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
22 KiB
Episteme (StemeDB)
A probabilistic knowledge graph database that stores Claims, not Facts. Append-only Merkle DAG with read-time resolution via Lenses.
Core Concept: "Git for Truth" - conflicting assertions coexist, resolved at query time through Consensus, Recency, Authority, or custom Lenses.
ZERO TOLERANCE FOR MEDIOCRITY: We build enterprise-grade products that must survive in production. Panics are UNACCEPTABLE. Broken pipe errors are UNACCEPTABLE. Sloppy testing is UNACCEPTABLE. Every line of code ships to paying customers who depend on it. Test everything. Handle every error. No shortcuts. No excuses.
Find Your Guide
| If you need to... | Read this |
|---|---|
| Get started fast | quickstart.md |
| Understand what Episteme is | what-is-episteme.md |
| Understand the technical vision | vision.md |
| See use cases | use-cases/README.md |
| Understand architecture | architecture.md |
| Learn data structures | docs/data-structures.md |
| Understand governance models | docs/specs/governance-models.md |
| See the roadmap | roadmap.md |
| See completed phases | roadmap-archive.md |
| Build apps on Episteme | docs/app-concepts/index.md |
| Consumer Health vertical | docs/app-concepts/consumer-health.md |
| Use Go SDK | ai-lookup/services/sdk.md |
| Write Rust code | .claude/guides/backend/rust-guidelines.md |
| Set up local dev | .claude/guides/local/setup.md |
| Run tests | .claude/guides/local/testing.md |
| Understand quality checks | .claude/guides/local/quality-checks.md |
| Learn about simulation | ai-lookup/features/simulation.md |
| Advance the simulator | arena-roadmap.md |
| Work on storage/DAG | Load skill: stemedb-core |
| Implement a Lens | Load skill: stemedb-lens |
| Work on domain ontology | crates/stemedb-ontology/ |
| Consumer Health UAT | uat/consumer-health/README.md |
| Verify production readiness | uat/production-readiness/README.md |
| Plan a milestone | /plan-milestone command |
| Analyze use case gaps | /analyze-gaps command |
| Add an API endpoint | .claude/guides/backend/api-endpoints.md |
| Integrate with AI tools | .claude/guides/integrations/ai-coding-assistant-integration.md |
| ADK-Go + Episteme | .claude/guides/integrations/adk-go-episteme.md |
| Distributed architecture | docs/research/distributed-write-path.md |
| Write UAT reports | .claude/guides/local/uat-reports.md |
| Phase 6 UAT results | ai-lookup/features/phase6-uat.md |
| Configure Aphoria hosted mode | .claude/guides/services/aphoria-hosted-mode.md |
| Aphoria config reference | ai-lookup/features/aphoria-config.md |
| Work on Admin Dashboard | applications/stemedb-dashboard/ (Next.js + shadcn/ui) |
| Work on Disputed app | applications/disputed/ |
| Understand repo structure | ai-lookup/repo-structure.md |
| Understand Aphoria flywheel | ai-lookup/features/aphoria-flywheel.md |
| Aphoria LLM eval | Load skill: aphoria-llm-optimization |
| General LLM optimization | Load skill: llm-optimization |
| Install Aphoria | Load skill: aphoria-install |
| Run Aphoria self-review | Load skill: aphoria-self-review |
| Author claims from diffs | Load skill: aphoria-claims |
| Suggest new claims | Load skill: aphoria-suggest |
| Automate post-commit analysis | Load skill: aphoria-post-commit-hook |
| Set up CI/CD automation | Load skill: aphoria-ci-setup |
| Create declarative extractors | applications/aphoria/docs/extractors/declarative-extractors.md |
| Learn extractor examples | applications/aphoria/docs/examples/extractors/ |
| Avoid dogfooding mistakes | applications/aphoria/docs/dogfooding-common-mistakes.md |
Roadmap Maintenance
Two files, strict separation:
| File | Contains | When to modify |
|---|---|---|
roadmap.md |
Current + future work only | Add new phases, update task status |
roadmap-archive.md |
Completed phases (1-7, 8A, MVP) | Move items when phase completes |
Rules:
- When a phase completes: Move entire phase section to archive, update status table in both files
- When adding tasks: Add to current phase in
roadmap.mdwith- [ ]checkbox format - When completing tasks: Change
- [ ]to- [x], add brief implementation notes - Keep
roadmap.mdunder 500 lines — if it grows, archive more aggressively - Current phase always has "🎯" marker in status table
Task format:
- [ ] **P1.2 Feature Name**: Brief description
- [ ] Subtask one
- [ ] Subtask two
Phase completion checklist:
- All tasks marked
[x]inroadmap.md - Cut entire phase section, paste into
roadmap-archive.md - Update status tables in both files
- Update "Current Focus" in
roadmap.mdheader
Aphoria: The Autonomous Flywheel
Aphoria is a continuous learning system that runs on EVERY commit, NOT a CLI tool you invoke manually.
The Commit-Time Loop (Runs Automatically):
Developer commits code
↓
1. SCAN: Extractors → observations
↓
2. CHECK: Compare observations against claims → violations
↓
3. FIX: Developer fixes violations
↓
4. GET REMAINING CLAIMS: Identify claims without extractors
↓
5. CREATE EXTRACTORS: Dynamically generate extractors for uncovered claims
↓
6. SUGGEST NEW CLAIMS: LLM analyzes patterns → suggests new claims
↓
7. CREATE NEW EXTRACTORS: Generate extractors for new claims
↓
(Loop repeats, knowledge compounds)
Knowledge Compounding: Each commit benefits from all previous commits' learning - not through ML training, but through accumulated structured decisions.
LLM Workflows ARE the Core Product
CRITICAL: Aphoria's autonomous operation REQUIRES LLM-driven automation:
- Claude Code skills (
/aphoria-claims,/aphoria-suggest,/aphoria-custom-extractor-creator) - Go ADK agents (custom agent implementations)
- Other LLM methodology (API-driven workflows)
Manual CLI (aphoria scan, aphoria claims create) is debug interface for when LLM automation is unavailable. It is NOT the primary workflow.
Manual fallbacks to CLI operations are unacceptable in production workflows — if LLM automation is unavailable, the system is broken, not in "fallback mode."
Three Main Workflows:
- Commit-time (PRIMARY): Developer commits → Aphoria scans → checks policies → dynamically creates extractors for uncovered existing claims → LLM suggests new claims from patterns → LLM creates extractors for new claims
- Onboarding: New dev codes → Aphoria guides with team conventions + linked context (who, why, when)
- Graduation: Patterns with frequency + authority → auto-promote to conventions (shadow mode → promotion)
Critical: The commit-time workflow has TWO extractor creation phases:
- Phase 1: Dynamic creation for existing claims without extractors (ensures all authored claims are verifiable)
- Phase 2: Creation for new claims suggested by pattern analysis (expands coverage)
Skills That Drive the Flywheel:
| Skill | Purpose | When Used |
|---|---|---|
/aphoria-claims |
Analyze diffs, author/update claims | Every commit with code changes |
/aphoria-suggest |
Suggest new claims from patterns | When growing coverage |
/aphoria-custom-extractor-creator |
Generate extractors (for both existing uncovered claims AND new claims) | Continuous - both phases of loop |
/aphoria-corpus-import |
Import docs → create claims + extractors | Bootstrap from external sources |
/aphoria-post-commit-hook |
Automate all loop steps with post-commit hooks | One-time setup per project |
/aphoria-ci-setup |
Automate via CI/CD instead of local hooks | One-time setup per repo |
Dogfooding Day 3: The Extractor Creation Phase
Day 3 is where the flywheel validates. This is the step that separates Aphoria from static linters.
Why Day 3 is Critical:
- Day 3 IS Steps 4-5 of the commit-time loop (identify gaps → create extractors)
- Without Day 3 extractor creation, NO knowledge is captured
- This is the CORE validation of autonomous learning
Workflow:
- Baseline scan → Detect X violations (often 0-20% on new domains)
- Gap analysis → Identify claims with no extractors (MISSING verdicts)
- Extractor creation → Use
/aphoria-custom-extractor-creatorto generate extractors (REQUIRED) - Verification scan → Detect Y violations (target: ≥90%)
- Document → Record detection rate improvement (X% → Y%)
Success Criteria:
- Detection rate ≥90% after extractor creation
- All extractors produce correct observations (concept_path matches claim)
- Learning documented (which patterns were added to corpus)
- Time ≤2 hours (including all 5 phases)
Evidence of Correct Execution:
ls .aphoria/extractors/*.toml | wc -l # Should be: 8+ (number of violations)
ls scan-v2.json # Must exist (verification scan)
ls DAY3-SUMMARY.md # Must exist (daily summary)
If ANY of these are missing, Day 3 was NOT completed correctly.
Common Mistake: Running scan once, seeing low detection rate, and moving on without creating extractors. This breaks the entire flywheel. See applications/aphoria/docs/dogfooding-common-mistakes.md for full details.
CRITICAL PROHIBITION:
NEVER describe Aphoria as:
- ❌ "CLI tool with LLM features"
- ❌ "Static scanner with optional automation"
- ❌ "Tool you run when you want"
ALWAYS describe Aphoria as:
- ✅ "Autonomous continuous learning system"
- ✅ "LLM-driven commit-time flywheel"
- ✅ "System that runs on every commit"
For questions about "what is the flywheel?" or "main use cases", read:
/home/jml/Workspace/stemedb/applications/aphoria/vision.md
Aphoria: What Is a Claim?
A claim is a human-authored statement about what code MUST do and WHY, with provenance and consequences.
Claims vs Observations
| Type | What it is | Who creates it | Example |
|---|---|---|---|
| Observation | Grep result: "this code does X" | Extractors (automated) | imports/tokio: true |
| Claim | Rule: "code MUST do X because Y, or Z breaks" | Humans (via skill) | "Core MUST NOT import tokio because it creates runtime coupling. If tokio appears in core imports, the library becomes async-only and breaks sync users." |
Observations are garbage. They're indexed facts with no meaning. Nobody cares that imports/format: true — that's just grep output.
Claims are the product. They encode architectural decisions, safety invariants, and spec compliance with full context: provenance (where the rule came from), invariant (what must stay true), and consequence (what breaks if violated).
Structure of a Claim
[[claim]]
id = "core-no-tokio-001"
concept_path = "stemedb/core/imports/tokio"
predicate = "imported"
value = false
comparison = "absent" # Code MUST NOT have this
provenance = "Architecture decision by jml 2024-12-15"
invariant = "Core modules MUST remain sync-only"
consequence = "Importing tokio makes core async-only, breaking sync library users"
authority_tier = "expert"
category = "architecture"
evidence = ["ADR-003", "design review notes"]
status = "active"
Aphoria Workflows (Primary Use Cases)
Day-to-day (commit-time claim authoring):
- Look at the entire diff
- Use
aphoria-claimsskill to identify "claimable" patterns (spec constants, ordering changes, boundary violations, derive changes on wire types) - Skill does lookups:
aphoria claims listto check what exists - If alignment needed, skill uses
aphoria claims updateorsupersede - Skill crafts and submits new claims via
aphoria claims create - If needed for audit, create paired extractor
Audit (scan-time claim verification):
- Direction 1:
aphoria scanruns extractors → observations, compares against authored claims → PASS/CONFLICT/MISSING - Direction 2:
aphoria verify runwalks all claims, verifies each one's pattern exists in code → PASS/CONFLICT/MISSING
The skill drives the CLI. The CLI doesn't know about the skill. They connect via skill calling aphoria claims commands in a loop.
Inline Claim Markers (@aphoria:claim)
Capture claim intent while writing code with inline markers:
1. Add marker in comment:
// @aphoria:claim[safety] Pool size MUST NOT exceed 50 -- OOM under sustained load
const MAX_POOL_SIZE: u32 = 50;
2. Enable in config (.aphoria/config.toml):
[extractors.inline_markers]
enabled = true
sync_to_pending = true # Auto-sync during scan (default)
3. Scan detects markers:
aphoria scan
# Output: ℹ Detected 1 new claim marker(s). Run 'aphoria claims list-markers' to review.
4. Review pending markers:
aphoria claims list-markers --format table
# Shows: ID, file, line, category, invariant
aphoria claims list-markers --format json
# JSON output for skills to process
5. Formalize via CLI:
aphoria claims formalize-marker marker-abc123 \
--id myapp-pool-max-001 \
--tier expert \
--evidence "tests/pool_tests.rs load test" \
--by jml
# Creates full claim in .aphoria/claims.toml
# Updates marker status to "formalized"
Or reject if not worth a claim:
aphoria claims reject-marker marker-abc123 --reason "Implementation detail, not architecture"
6. Update comment after formalization:
// @aphoria:claimed myapp-pool-max-001
const MAX_POOL_SIZE: u32 = 50;
Supported comment styles:
// @aphoria:claim(Rust, Go, C, TypeScript, JavaScript)# @aphoria:claim(Python, Ruby, Shell, YAML)-- @aphoria:claim(SQL)/* @aphoria:claim */(CSS, C-style blocks)<!-- @aphoria:claim -->(HTML, XML)
Optional fields:
- Category in brackets:
@aphoria:claim[category] - Consequence after
--:invariant -- consequence
Storage:
- Detected markers →
.aphoria/pending_markers.toml(auto-synced during scan) - Formalized claims →
.aphoria/claims.toml - Already formalized →
@aphoria:claimed <claim-id>(skipped by extractor)
Critical Rules
- Append-Only: NEVER mutate existing Assertions. Create new ones.
- Content-Addressed: Assertion ID = BLAKE3 hash of content.
- No Unwrap: NEVER use
unwrap()orexpect()in production code. CI enforces viaclippy::unwrap_usedandclippy::expect_usedat deny level. - Defensive Writes: All writes go through WAL with fsync.
- Zero-Copy: Use
rkyvfor serialization. ALWAYS usestemedb_core::serde::{serialize, deserialize}— NEVER use rawAllocSerializerin production code. - Instrument Critical Paths: Use
#[instrument]on public methods in WAL, storage, ingestion, and lens code. Include meaningful fields (key_len, payload_len, offset, candidates_count, lens). - Structured Logging: Use
tracing(info!, warn!, error!) instead ofprintln!/eprintln!. Clippy enforces viaprint_stdout/print_stderrat warn level. CLI binaries (e.g.,stemedb-sim) may use#![allow()]for user-facing output. - Query Parameter Arrays: In API handlers, use
QsQueryextractor (not standardQuery) for any DTO withVec<T>orOption<Vec<T>>fields. Dashboard uses bracket notation (?sources[]=a&sources[]=b) which requiresserde_qs. StandardQuerysilently fails on array params. Seecrates/stemedb-api/src/extractors.rsfor details. - Document Changes: Update
ai-lookup/when adding new types/concepts. Keep skills in sync with code. - No Git Operations: NEVER use git stash, git branch, git checkout, or any git operations unless the user explicitly tells you to.
- No GitHub Workflows: We use pre-commit hooks, not GitHub Actions CI.
Quick Reference
# Build
cargo build --workspace
# Test (choose based on need)
cargo test -p stemedb-core # Fast: single crate (~30s)
cargo test --workspace --lib # Medium: all unit tests (~3min)
cargo nextest run # Full: parallel runner (~5min)
cargo test --workspace # Legacy: sequential (~15min)
# Lint (must pass before commit)
cargo clippy --workspace -- -D warnings
cargo fmt --check
Port Scheme (181XX)
| Offset | Service | Default | Env Var |
|---|---|---|---|
| +0 | HTTP API | 18180 | STEMEDB_BIND_ADDR |
| +1 | Cluster Gateway | 18181 | STEMEDB_NODE_API_ADDR |
| +2 | Cluster RPC | 18182 | STEMEDB_NODE_RPC_ADDR |
| +3 | SWIM Gossip | 18183 | via SwimConfig |
| +4 | Metrics | 18184 | (reserved) |
| +5 | Admin | 18185 | (reserved) |
| +6 | Latent Signal | 18186 | — |
| +7 | Community App | 18187 | — |
| +8 | StemeDB Dashboard | 18188 | — |
| +9 | Aphoria Dashboard | 18189 | — |
Specialized Agents
| Domain | Agent | When to use |
|---|---|---|
| Product Vision | episteme-product-visionary |
Use cases, "why not Postgres?", product-market fit |
| Pilot Prep | enterprise-skeptic-buyer |
Pressure-test demos, find gaps, prepare for tough questions |
| Aphoria Pitch | aphoria-skeptic-buyer |
Pressure-test Aphoria demos, security tool buyer objections |
| Aphoria Phase 7 | declarative-extractor-skeptic |
Pressure-test declarative extractors, LLM extraction, pattern learning |
| Aphoria Phase 9 | autonomous-learning-skeptic |
Pressure-test autonomous promotion, shadow mode, cross-project learning |
| General Rust | primary-developer |
Feature implementation, refactoring |
| Code Quality | rust-quality-engineer |
Reviews, test coverage, clippy |
| Storage | storage-engine-architect |
WAL, LSM, crash recovery |
| Graph Engine | rust-graph-engine-architect |
Lock-free structures, cache optimization |
| Defensive | defensive-systems-architect |
Rate limiting, circuit breakers, hostile input |
| Distributed | distributed-systems-engineer |
CRDT replication, Raft coordination, Merkle sync, clustering |
| Lenses | stemedb-lens-architect |
Query resolution, ranking algorithms |
| Planning | stemedb-planner |
Milestone planning, roadmap |
Architecture Overview
Write Path (Spine): Read Path (Cortex):
[Agent] -> [Ingestion] [Agent] <- [Lens Engine]
| |
v |
[WAL/Fsync] [Index Lookup]
| |
v |
[KV Store] <--------------------+
Crates
| Crate | Purpose | Status |
|---|---|---|
stemedb-core |
Assertion, LifecycleStage, MaterializedView, types, signing utilities | ✅ Implemented |
stemedb-wal |
Write-ahead log with crash recovery | ✅ Implemented |
stemedb-storage |
KVStore, VoteStore, IndexStore, TrustRankStore, QuarantineStore, SimilarityIndex | ✅ Implemented |
stemedb-ingest |
Ingestion pipeline, signature verification, ContentDefenseLayer | ✅ Implemented |
stemedb-query |
Query engine, Materializer for O(1) MV: reads | ✅ Implemented |
stemedb-lens |
Lenses (Recency, Consensus, Authority, Vote/Trust-aware) | ✅ Implemented |
stemedb-api |
HTTP API with axum + utoipa OpenAPI docs | ✅ Implemented |
stemedb-sim |
Simulation for testing the pipeline | ✅ Implemented |
stemedb-merkle |
BLAKE3 Merkle tree for diff detection | ✅ Implemented |
stemedb-rpc |
gRPC services for node-to-node communication | ✅ Implemented |
stemedb-sync |
Merkle sync, gossip broadcast, anti-entropy | ✅ Implemented |
stemedb-cluster |
Cluster membership (SWIM), sharding, gateway | ✅ Implemented |
stemedb-ontology |
Domain definitions (Pharma), subject builders, medical extractors | ✅ Implemented |
SDKs
| SDK | Purpose | Status |
|---|---|---|
sdk/go/steme |
Go HTTP client with Ed25519 signing and fluent builders | ✅ Implemented |
sdk/go/adk |
ADK-Go tools and callbacks for AI agents | ✅ Implemented |
Latent Signal (latent/)
Python CLI tools for adverse event signal detection. Different rules from Rust crates:
Allowed:
print()for user-facing CLI output (these are scripts, not libraries)except Exception as e:for CLI error handling (log and continue)
Required:
- Environment Variables for URLs: NEVER hardcode
localhostURLs without env fallback- Use
os.getenv("VAR", "http://localhost:...")in Python - Use
process.env.VAR || 'http://localhost:...'in TypeScript
- Use
- StemeDB Integration: New ingestors should use
StemeDBClientpattern fromadk-agent/, not write to JSONL files