jml 6430ff0fd6 fix(aphoria): move claims.toml to project root and fix verify integration

## Root Cause
Claims file was in applications/aphoria/.aphoria/ but all commands looked
for .aphoria/claims.toml relative to project root. Additionally, .aphoria/
was fully gitignored, preventing version control of claims.

## Changes

### Path Fixes
- Move claims.toml from applications/aphoria/.aphoria/ to .aphoria/ at project root
- Update .gitignore: .aphoria/ → .aphoria/* with !.aphoria/claims.toml exception
- Now claims can be version controlled while keys remain secret

### Verify Integration (Scanner)
- scanner.rs: Load claims from ClaimsFile and call verify_claims()
- ScanResult: Add verify field with VerifyReport
- Report formatters: Add claim verification sections showing PASS/CONFLICT/MISSING

### Clippy Fix
- report/json.rs: Replace filter().map().expect() with filter_map()

## Verification
- aphoria scan . → Shows claim verification with verdicts
- aphoria verify run → Per-claim verification results
- aphoria verify map → Extractor coverage mapping (7/10 claims = 70%)
- aphoria claims list → Reads from project root
- aphoria claims create → Writes to project root
- All tests pass (1120+ aphoria tests)
- clippy --workspace passes

## Impact
Both primary use cases now work:
1. Day-to-day (commit-time): Skills can read/create claims via CLI
2. Audit (scan-time): Scanner verifies code against authored claims

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-08 11:09:57 +00:00

13 KiB


Episteme (StemeDB)

A probabilistic knowledge graph database that stores Claims, not Facts. Append-only Merkle DAG with read-time resolution via Lenses.

Core Concept: "Git for Truth" - conflicting assertions coexist, resolved at query time through Consensus, Recency, Authority, or custom Lenses.

Find Your Guide

If you need to...	Read this
Get started fast	quickstart.md
Understand what Episteme is	what-is-episteme.md
Understand the technical vision	vision.md
See use cases	use-cases/README.md
Understand architecture	architecture.md
Learn data structures	docs/data-structures.md
Understand governance models	docs/specs/governance-models.md
See the roadmap	roadmap.md
See completed phases	roadmap-archive.md
Build apps on Episteme	docs/app-concepts/index.md
Consumer Health vertical	docs/app-concepts/consumer-health.md
Use Go SDK	ai-lookup/services/sdk.md
Write Rust code	.claude/guides/backend/rust-guidelines.md
Set up local dev	.claude/guides/local/setup.md
Run tests	.claude/guides/local/testing.md
Understand quality checks	.claude/guides/local/quality-checks.md
Learn about simulation	ai-lookup/features/simulation.md
Advance the simulator	arena-roadmap.md
Work on storage/DAG	Load skill: `stemedb-core`
Implement a Lens	Load skill: `stemedb-lens`
Work on domain ontology	`crates/stemedb-ontology/`
Consumer Health UAT	uat/consumer-health/README.md
Verify production readiness	uat/production-readiness/README.md
Plan a milestone	`/plan-milestone` command
Analyze use case gaps	`/analyze-gaps` command
Add an API endpoint	.claude/guides/backend/api-endpoints.md
Integrate with AI tools	.claude/guides/integrations/ai-coding-assistant-integration.md
ADK-Go + Episteme	.claude/guides/integrations/adk-go-episteme.md
Distributed architecture	docs/research/distributed-write-path.md
Write UAT reports	.claude/guides/local/uat-reports.md
Phase 6 UAT results	ai-lookup/features/phase6-uat.md
Configure Aphoria hosted mode	.claude/guides/services/aphoria-hosted-mode.md
Aphoria config reference	ai-lookup/features/aphoria-config.md
Work on Admin Dashboard	`applications/stemedb-dashboard/` (Next.js + shadcn/ui)
Work on Disputed app	`applications/disputed/`
Understand repo structure	ai-lookup/repo-structure.md
Aphoria LLM eval	Load skill: `aphoria-llm-optimization`
General LLM optimization	Load skill: `llm-optimization`
Install Aphoria	Load skill: `aphoria-install`
Run Aphoria self-review	Load skill: `aphoria-self-review`
Author claims from diffs	Load skill: `aphoria-claims`
Suggest new claims	Load skill: `aphoria-suggest`

Roadmap Maintenance

Two files, strict separation:

File	Contains	When to modify
`roadmap.md`	Current + future work only	Add new phases, update task status
`roadmap-archive.md`	Completed phases (1-7, 8A, MVP)	Move items when phase completes

Rules:

When a phase completes: Move entire phase section to archive, update status table in both files
When adding tasks: Add to current phase in roadmap.md with - [ ] checkbox format
When completing tasks: Change - [ ] to - [x], add brief implementation notes
Keep roadmap.md under 500 lines — if it grows, archive more aggressively
Current phase always has "🎯" marker in status table

Task format:

- [ ] **P1.2 Feature Name**: Brief description
    - [ ] Subtask one
    - [ ] Subtask two

Phase completion checklist:

All tasks marked [x] in roadmap.md
Cut entire phase section, paste into roadmap-archive.md
Update status tables in both files
Update "Current Focus" in roadmap.md header
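
The first checklist item can be checked mechanically. The following is a minimal sketch, assuming the phase section is available as a string in the task format shown above; `task_counts` and `phase_ready_to_archive` are hypothetical helpers for illustration, not part of the repository.

```rust
/// Counts open (`- [ ]`) and completed (`- [x]`) tasks in a roadmap section.
/// Hypothetical helper; subtasks are counted the same way as top-level tasks.
pub fn task_counts(section: &str) -> (usize, usize) {
    let mut open = 0;
    let mut done = 0;
    for line in section.lines() {
        // Subtasks are indented, so ignore leading whitespace.
        let item = line.trim_start();
        if item.starts_with("- [ ]") {
            open += 1;
        } else if item.starts_with("- [x]") || item.starts_with("- [X]") {
            done += 1;
        }
    }
    (open, done)
}

/// A phase is treated as ready to archive when it has at least one task
/// and none of its tasks are still open.
pub fn phase_ready_to_archive(section: &str) -> bool {
    let (open, done) = task_counts(section);
    open == 0 && done > 0
}
```

A section that still contains any - [ ] line is reported as not ready, so the remaining checklist steps (cut, paste, update tables) should wait.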

Aphoria: What Is a Claim?

A claim is a human-authored statement about what code MUST do and WHY, with provenance and consequences.

Claims vs Observations

Type	What it is	Who creates it	Example
Observation	Grep result: "this code does X"	Extractors (automated)	`imports/tokio: true`
Claim	Rule: "code MUST do X because Y, or Z breaks"	Humans (via skill)	"Core MUST NOT import tokio because it creates runtime coupling. If tokio appears in core imports, the library becomes async-only and breaks sync users."

Observations are garbage. They're indexed facts with no meaning. Nobody cares that imports/format: true — that's just grep output.

Claims are the product. They encode architectural decisions, safety invariants, and spec compliance with full context: provenance (where the rule came from), invariant (what must stay true), and consequence (what breaks if violated).

Structure of a Claim

[[claim]]
id = "core-no-tokio-001"
concept_path = "stemedb/core/imports/tokio"
predicate = "imported"
value = false
comparison = "absent"  # Code MUST NOT have this
provenance = "Architecture decision by jml 2024-12-15"
invariant = "Core modules MUST remain sync-only"
consequence = "Importing tokio makes core async-only, breaking sync library users"
authority_tier = "expert"
category = "architecture"
evidence = ["ADR-003", "design review notes"]
status = "active"
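
To show how these fields might be held in code, here is a minimal Rust sketch. The struct name `Claim`, the helpers `example_claim` and `missing_context`, and the use of `bool` for `value` are assumptions made for this example; the actual Aphoria types may differ.

```rust
/// Hypothetical in-memory mirror of one `[[claim]]` entry.
#[derive(Debug, Clone, PartialEq)]
pub struct Claim {
    pub id: String,
    pub concept_path: String,
    pub predicate: String,
    pub value: bool, // assumption: the example above only shows a boolean
    pub comparison: String,
    pub provenance: String,
    pub invariant: String,
    pub consequence: String,
    pub authority_tier: String,
    pub category: String,
    pub evidence: Vec<String>,
    pub status: String,
}

impl Claim {
    /// Names of the context fields (provenance, invariant, consequence)
    /// that are empty. A claim without them is closer to an observation.
    pub fn missing_context(&self) -> Vec<&'static str> {
        let fields = [
            ("provenance", &self.provenance),
            ("invariant", &self.invariant),
            ("consequence", &self.consequence),
        ];
        fields
            .iter()
            .filter(|(_, text)| text.trim().is_empty())
            .map(|(name, _)| *name)
            .collect()
    }
}

/// Builds the claim from the TOML example above.
pub fn example_claim() -> Claim {
    Claim {
        id: "core-no-tokio-001".into(),
        concept_path: "stemedb/core/imports/tokio".into(),
        predicate: "imported".into(),
        value: false,
        comparison: "absent".into(),
        provenance: "Architecture decision by jml 2024-12-15".into(),
        invariant: "Core modules MUST remain sync-only".into(),
        consequence: "Importing tokio makes core async-only, breaking sync library users".into(),
        authority_tier: "expert".into(),
        category: "architecture".into(),
        evidence: vec!["ADR-003".into(), "design review notes".into()],
        status: "active".into(),
    }
}
```

Calling `example_claim().missing_context()` returns an empty list because all three context fields are filled in.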

Aphoria Workflows (Primary Use Cases)

Day-to-day (commit-time claim authoring):

Look at the entire diff
Use the aphoria-claims skill to identify "claimable" patterns (spec constants, ordering changes, boundary violations, derive changes on wire types)
The skill runs aphoria claims list to check which claims already exist
If an existing claim needs alignment, the skill uses aphoria claims update or supersede
The skill crafts and submits new claims via aphoria claims create
If the claim must be checked at audit time, create a paired extractor

Audit (scan-time claim verification):

Direction 1: aphoria scan runs extractors → observations, compares against authored claims → PASS/CONFLICT/MISSING
Direction 2: aphoria verify run walks all claims, verifies each one's pattern exists in code → PASS/CONFLICT/MISSING
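
The comparison in Direction 1 can be sketched as a lookup of each claim's concept path in the set of observations. This is an illustration only: the `Verdict` enum, the `verify` function, and the rule that a path with no observation yields MISSING are assumptions, and the sketch ignores how Aphoria actually interprets the comparison field.

```rust
use std::collections::HashMap;

/// The three verdicts named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Conflict,
    Missing,
}

/// Compares one claim's expected value with the extractor observations.
/// Assumed semantics (not taken from the Aphoria source):
/// - an observation that matches the expected value is a PASS
/// - an observation that contradicts it is a CONFLICT
/// - no observation for the concept path is MISSING
pub fn verify(
    concept_path: &str,
    expected: bool,
    observations: &HashMap<String, bool>,
) -> Verdict {
    match observations.get(concept_path) {
        Some(observed) if *observed == expected => Verdict::Pass,
        Some(_) => Verdict::Conflict,
        None => Verdict::Missing,
    }
}
```

With an observation of true at stemedb/core/imports/tokio, the claim from the earlier example (expected value false) would come back as CONFLICT under these assumed rules.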

The skill drives the CLI; the CLI knows nothing about the skill. The two connect only through the skill calling aphoria claims commands in a loop.

Critical Rules

Append-Only: NEVER mutate existing Assertions. Create new ones.
Content-Addressed: Assertion ID = BLAKE3 hash of content.
No Unwrap: NEVER use unwrap() or expect() in production code. Clippy enforces this via clippy::unwrap_used and clippy::expect_used at deny level.
Defensive Writes: All writes go through WAL with fsync.
Zero-Copy: Use rkyv for serialization. ALWAYS use stemedb_core::serde::{serialize, deserialize} — NEVER use raw AllocSerializer in production code.
Instrument Critical Paths: Use #[instrument] on public methods in WAL, storage, ingestion, and lens code. Include meaningful fields (key_len, payload_len, offset, candidates_count, lens).
Structured Logging: Use tracing (info!, warn!, error!) instead of println!/eprintln!. Clippy enforces via print_stdout/print_stderr at warn level. CLI binaries (e.g., stemedb-sim) may use #![allow()] for user-facing output.
Document Changes: Update ai-lookup/ when adding new types/concepts. Keep skills in sync with code.
No Git Operations: NEVER use git stash, git branch, git checkout, or any git operations unless the user explicitly tells you to.
No GitHub Workflows: We use pre-commit hooks, not GitHub Actions CI.
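
The first three rules (Append-Only, Content-Addressed, No Unwrap) can be illustrated together. The sketch below is std-only and makes two loud assumptions: `DefaultHasher` stands in for BLAKE3 purely to avoid an external dependency, and `AssertionLog` and `StoreError` are hypothetical names, not the stemedb-storage API.

```rust
use std::collections::hash_map::{DefaultHasher, Entry};
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in content hash. The rule above calls for BLAKE3; a 64-bit
/// `DefaultHasher` is used here only to keep the sketch dependency-free.
fn content_id(content: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

#[derive(Debug, PartialEq)]
pub enum StoreError {
    NotFound(u64),
}

/// Hypothetical append-only log keyed by content hash.
#[derive(Default)]
pub struct AssertionLog {
    entries: HashMap<u64, Vec<u8>>,
    order: Vec<u64>,
}

impl AssertionLog {
    /// Appends content and returns its ID. Existing entries are never
    /// overwritten; appending the same bytes again is a no-op.
    pub fn append(&mut self, content: &[u8]) -> u64 {
        let id = content_id(content);
        if let Entry::Vacant(slot) = self.entries.entry(id) {
            slot.insert(content.to_vec());
            self.order.push(id);
        }
        id
    }

    /// Looks up content by ID, returning an error instead of panicking.
    pub fn get(&self, id: u64) -> Result<&[u8], StoreError> {
        self.entries
            .get(&id)
            .map(Vec::as_slice)
            .ok_or(StoreError::NotFound(id))
    }

    pub fn len(&self) -> usize {
        self.order.len()
    }

    pub fn is_empty(&self) -> bool {
        self.order.is_empty()
    }
}
```

There is no update or delete method by design: a changed assertion has different bytes, so it gets a different ID and is stored alongside the old one. Lookups return a Result, so callers propagate failures with ? rather than reaching for unwrap() or expect().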

Quick Reference

# Build
cargo build --workspace

# Test (choose based on need)
cargo test -p stemedb-core        # Fast: single crate (~30s)
cargo test --workspace --lib      # Medium: all unit tests (~3min)
cargo nextest run                 # Full: parallel runner (~5min)
cargo test --workspace            # Legacy: sequential (~15min)

# Lint (must pass before commit)
cargo clippy --workspace -- -D warnings
cargo fmt --check

Port Scheme (181XX)

Offset	Service	Default	Env Var
+0	HTTP API	18180	`STEMEDB_BIND_ADDR`
+1	Cluster Gateway	18181	`STEMEDB_NODE_API_ADDR`
+2	Cluster RPC	18182	`STEMEDB_NODE_RPC_ADDR`
+3	SWIM Gossip	18183	via `SwimConfig`
+4	Metrics	18184	(reserved)
+5	Admin	18185	(reserved)
+6	Latent Signal	18186	—
+7	Community App	18187	—
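
The table reads as a base port of 18180 plus a per-service offset. A minimal sketch of that arithmetic follows; the `Service` enum and `resolve_addr` helper are hypothetical, and the host:port format used for the fallback address is an assumption, since the table lists only port numbers.

```rust
use std::env;

/// Base of the 181XX port scheme.
pub const BASE_PORT: u16 = 18180;

/// Services from the table, with the offset as the discriminant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Service {
    HttpApi = 0,
    ClusterGateway = 1,
    ClusterRpc = 2,
    SwimGossip = 3,
    Metrics = 4,
    Admin = 5,
    LatentSignal = 6,
    CommunityApp = 7,
}

impl Service {
    /// Default port = base + offset.
    pub fn default_port(self) -> u16 {
        BASE_PORT + self as u16
    }
}

/// Reads an address from `env_var`, falling back to loopback on the
/// service's default port. The "127.0.0.1:PORT" format is an assumption.
pub fn resolve_addr(env_var: &str, service: Service) -> String {
    env::var(env_var)
        .unwrap_or_else(|_| format!("127.0.0.1:{}", service.default_port()))
}
```

For example, `resolve_addr("STEMEDB_BIND_ADDR", Service::HttpApi)` returns the value of the variable when it is set and a loopback address on port 18180 otherwise.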

Specialized Agents

Domain	Agent	When to use
Product Vision	`episteme-product-visionary`	Use cases, "why not Postgres?", product-market fit
Pilot Prep	`enterprise-skeptic-buyer`	Pressure-test demos, find gaps, prepare for tough questions
Aphoria Pitch	`aphoria-skeptic-buyer`	Pressure-test Aphoria demos, security tool buyer objections
Aphoria Phase 7	`declarative-extractor-skeptic`	Pressure-test declarative extractors, LLM extraction, pattern learning
Aphoria Phase 9	`autonomous-learning-skeptic`	Pressure-test autonomous promotion, shadow mode, cross-project learning
General Rust	`primary-developer`	Feature implementation, refactoring
Code Quality	`rust-quality-engineer`	Reviews, test coverage, clippy
Storage	`storage-engine-architect`	WAL, LSM, crash recovery
Graph Engine	`rust-graph-engine-architect`	Lock-free structures, cache optimization
Defensive	`defensive-systems-architect`	Rate limiting, circuit breakers, hostile input
Distributed	`distributed-systems-engineer`	CRDT replication, Raft coordination, Merkle sync, clustering
Lenses	`stemedb-lens-architect`	Query resolution, ranking algorithms
Planning	`stemedb-planner`	Milestone planning, roadmap

Architecture Overview

Write Path (Spine):           Read Path (Cortex):
[Agent] -> [Ingestion]        [Agent] <- [Lens Engine]
              |                              |
              v                              |
         [WAL/Fsync]                  [Index Lookup]
              |                              |
              v                              |
         [KV Store] <--------------------+
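
On the read path, the Lens Engine takes the conflicting assertions found by the index lookup and picks a winner at query time. The following is a minimal sketch of that idea, assuming a simplified `Assertion` with a string value and an integer timestamp; the `Lens` trait and both lens structs are illustrative names and not the stemedb-lens API.

```rust
use std::collections::HashMap;

/// Simplified assertion for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    pub subject: String,
    pub value: String,
    pub timestamp: u64,
}

/// A lens chooses one assertion from a set of coexisting candidates.
pub trait Lens {
    fn resolve<'a>(&self, candidates: &'a [Assertion]) -> Option<&'a Assertion>;
}

/// Recency: the newest assertion wins.
pub struct RecencyLens;

impl Lens for RecencyLens {
    fn resolve<'a>(&self, candidates: &'a [Assertion]) -> Option<&'a Assertion> {
        candidates.iter().max_by_key(|a| a.timestamp)
    }
}

/// Consensus: the most frequently asserted value wins; ties between
/// values are broken by recency (an assumption of this sketch).
pub struct ConsensusLens;

impl Lens for ConsensusLens {
    fn resolve<'a>(&self, candidates: &'a [Assertion]) -> Option<&'a Assertion> {
        let mut counts: HashMap<&str, usize> = HashMap::new();
        for a in candidates {
            *counts.entry(a.value.as_str()).or_insert(0) += 1;
        }
        candidates.iter().max_by_key(|a| {
            let votes = counts.get(a.value.as_str()).copied().unwrap_or(0);
            (votes, a.timestamp)
        })
    }
}
```

Nothing is resolved or discarded at write time: the same stored candidates can give different answers depending on which lens the reader applies.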

Crates

Crate	Purpose	Status
`stemedb-core`	Assertion, LifecycleStage, MaterializedView, types, signing utilities	✅ Implemented
`stemedb-wal`	Write-ahead log with crash recovery	✅ Implemented
`stemedb-storage`	KVStore, VoteStore, IndexStore, TrustRankStore, QuarantineStore, SimilarityIndex	✅ Implemented
`stemedb-ingest`	Ingestion pipeline, signature verification, ContentDefenseLayer	✅ Implemented
`stemedb-query`	Query engine, Materializer for O(1) MV: reads	✅ Implemented
`stemedb-lens`	Lenses (Recency, Consensus, Authority, Vote/Trust-aware)	✅ Implemented
`stemedb-api`	HTTP API with axum + utoipa OpenAPI docs	✅ Implemented
`stemedb-sim`	Simulation for testing the pipeline	✅ Implemented
`stemedb-merkle`	BLAKE3 Merkle tree for diff detection	✅ Implemented
`stemedb-rpc`	gRPC services for node-to-node communication	✅ Implemented
`stemedb-sync`	Merkle sync, gossip broadcast, anti-entropy	✅ Implemented
`stemedb-cluster`	Cluster membership (SWIM), sharding, gateway	✅ Implemented
`stemedb-ontology`	Domain definitions (Pharma), subject builders, medical extractors	✅ Implemented

SDKs

SDK	Purpose	Status
`sdk/go/steme`	Go HTTP client with Ed25519 signing and fluent builders	✅ Implemented
`sdk/go/adk`	ADK-Go tools and callbacks for AI agents	✅ Implemented

Latent Signal (latent/)

Python CLI tools for adverse event signal detection. These tools follow different rules from the Rust crates:

Allowed:

print() for user-facing CLI output (these are scripts, not libraries)
except Exception as e: for CLI error handling (log and continue)

Required:

Environment Variables for URLs: NEVER hardcode localhost URLs without env fallback
- Use os.getenv("VAR", "http://localhost:...") in Python
- Use process.env.VAR || 'http://localhost:...' in TypeScript
StemeDB Integration: New ingestors should use StemeDBClient pattern from adk-agent/, not write to JSONL files
