stemedb

Author	SHA1	Message	Date
jordan	02ecac9a07	fix: merge upstream 10 commits, fix DashMap deadlock, deterministic sim ingestion Merged 10 upstream commits (MemTable, read-your-writes tests, feed endpoint, security hardening, signed assertions, source registry, dashboard enhancements) and fixed all test failures across the full workspace (2656/2656 passing). Key fixes: - fix(cluster): DashMap deadlock in swim.rs suspect_node/fail_node/alive_node - DashMap::get_mut RefMut + iter() on same map = non-reentrant write lock deadlock - Fix: extract clone in scoped block to drop RefMut before calling update_node_gauges() - 6 previously-hanging SWIM tests now pass in <2s - fix(sim): replace background-task+polling ingestion with synchronous process_pending() - smoke_high_volume_simulation was CPU-starved under 2656 parallel tests - Removed ingestor.start() + wait_until_ingested() pattern throughout sim - All arena functions now call ingestor.process_pending() directly (deterministic) - fix(test): v2 signature helper used wrong hash (rkyv vs canonical compute_content_hash_v2) - fix(test): quota test signed "test" but v1 requires "subject:predicate" format - fix(test): http_validation now accepts 400 for valid-format-but-invalid-crypto hex - fix(test): scale_adaptive micro tier assertions updated (auto_promote upstream change) - config: add nextest.toml with slow-timeout for background-task-tests group Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 20:27:32 -07:00
jordan	ad07a75d0a	feat: add source content to source registry, signed assertions, feed endpoint, dashboard enhancements - Add `content: Option<String>` to SourceRecord with rkyv schema evolution (LegacySourceRecord compat deserializer for backward compatibility) - Add MAX_SOURCE_CONTENT_LEN (1MB) limit with API validation - Strip content from list responses, include in single-source GET - Update Go SDK RegisterSourceRequest with Content field - FCM pipeline extracts PDF text via pdftotext and passes to registration - Dashboard impact panel fetches and displays source content with expand/collapse - Add feed endpoint, dashboard feed panel, and signed assertion support - Update data-structures.md, API docs, and storage docs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 21:54:27 -07:00
jordan	58594bc7b9	feat: add feed endpoint, dashboard feed panel, and FindMyHealth app - Add /v1/feed API endpoint with handler and tests - Remove health endpoint rate limiting (behind firewall, caused spurious 429s) - Add dashboard feed panel with list, row, empty state, and loading skeleton - Update home page to show feed instead of redirecting to skeptic - Improve API key auth middleware and DTO create/query params - Add OpenAPI conceptual guide (api-intro.md) with semaglutide examples - Add FindMyHealth application scaffolding (vision, architecture, prototypes) - Add FindMyHealth designer/writer and Aphoria founder-CEO agents - Update roadmap with current progress Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-16 17:16:17 -07:00
jml	3df4aa7167	Signed assertions	2026-02-16 04:46:53 +00:00
jml	fae9b47fae	feat(aphoria): implement hosted mode with remote StemeDB integration Add remote mode infrastructure for querying claims from StemeDB API: - Remote client with caching layer for claim queries - Authority resolution logic with tier-based verdict system - StemeDB API handlers for claims CRUD operations - Enhanced conflict detection with remote claim support - Validation reports documenting A5.3 phase completion Changes: - applications/aphoria/src/remote/: New client + cache modules - applications/aphoria/src/resolution/: Authority tier resolution - crates/stemedb-api/src/handlers/stemedb_claims.rs: API handlers - applications/aphoria/validation/a5.3/: Phase validation reports - Updated roadmap with hosted mode milestones Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-14 09:29:56 +00:00
jordan	422e2d4416	feat(aphoria): wire claims through StemeDB — Gap Closure Phase 1 Claims now flow through StemeDB's append-only knowledge graph instead of mutable TOML files. This resolves all 6 critical claim-bypass code paths: - Bridge: lossless AuthoredClaim ↔ Assertion round-trip (comparison, status, lifecycle mapping) - LocalEpisteme: ingest_authored_claim() and fetch_authored_claims() with AUTHORED_CLAIM predicate index - EpistemeClaimStore: ClaimStore trait backed by StemeDB (append-only delete via deprecation) - CLI handlers: all claim commands read/write through StemeDB - Scanner: loads claims from StemeDB with auto-migration fallback to TOML - Export: new `aphoria claims export` serializes StemeDB claims to TOML/JSON Also cleans up dead code (EpistemeConfig.url), renames ingest_claims→ingest_observations, fixes ClaimFilter.authority_tier type, adds Draft variant to ClaimStatus, and fixes pre-existing clippy warnings (too_many_arguments, filter_next→rfind). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-12 02:02:51 -07:00
jml	3e7eddc074	feat: add enterprise production readiness infrastructure This commit implements comprehensive production hardening across multiple layers to prepare StemeDB for enterprise pilot deployments: ## API Layer - Add rate limiting middleware with configurable limits per endpoint - Enhance error handling with detailed context and proper HTTP status codes - Add security hardening tests for input validation and boundary conditions - Create store_helpers module for defensive storage access patterns ## Storage & WAL - Optimize group commit batching for higher throughput - Add defensive error handling in hybrid backend with proper fallbacks - Enhance WAL journal durability guarantees with fsync validation - Improve index store query performance with better caching ## Operations & Deployment - Add comprehensive operations documentation (deployment, monitoring, DR) - Create systemd units for backup, WAL archival, and verification - Add monitoring configs (Prometheus alerts, metrics exporters) - Implement backup/restore scripts with verification and S3 archival - Add DR drill automation and runbook procedures - Create load balancer configs (nginx, envoy) with health checks ## Documentation - Update CLAUDE.md with operations and troubleshooting guides - Expand roadmap with production readiness milestones - Add pilot success criteria and deployment reference architecture - Document TLS setup, monitoring integration, and incident response ## Configuration - Add .env.example with all required environment variables - Document resource sizing for different deployment scales - Add configuration examples for various deployment topologies This positions StemeDB for successful enterprise pilots with proper operational discipline, monitoring, backup/DR, and security hardening. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-12 06:08:15 +00:00
jml	bb0c33f8d3	fix(api): enable querying of CLI-created community corpus items ## Problem CLI-created community corpus items (tier 3) were stored correctly but invisible via API queries. Two issues blocked discoverability: 1. Prefix mismatch: API hardcoded 'community://pattern/' for aggregated patterns, but CLI creates 'community://rust/http/...' URIs 2. Query parameter parsing: Axum's default parser doesn't support bracket notation (?sources[]=value) used by the dashboard Result: 0/22 CLI-created items were queryable. ## Solution ### Fix 1: Broaden Community Prefix - Changed: 'community://pattern/' → 'community://' in corpus handler - Impact: Now matches both aggregated patterns AND CLI-created items - Backward compatible: Broader prefix includes narrower results ### Fix 2: Add QsQuery Extractor - Added: serde_qs dependency + custom QsQuery extractor - Supports: Bracket notation for array parameters (?sources[]=a&sources[]=b) - Compatible: Works with JavaScript URLSearchParams standard - Tested: 3 new unit tests for extractor behavior ## Verification - ✅ All 22 CLI-created community items now queryable (was 0) - ✅ Source filtering works: community (22), RFC (2), vendor (5) - ✅ Multi-source queries work: ?sources[]=community&sources[]=rfc → 24 - ✅ All 89 API tests pass + 3 new extractor tests - ✅ Clippy clean (0 warnings) - ✅ No regressions in existing functionality ## Files Changed - crates/stemedb-api/Cargo.toml: Add serde_qs dependency - crates/stemedb-api/src/extractors.rs: New QsQuery extractor (117 lines) - crates/stemedb-api/src/handlers/aphoria/corpus.rs: Use QsQuery, broaden prefix - crates/stemedb-api/src/lib.rs: Export extractors module Also includes: Scale-adaptive thresholds, wiki corpus extraction, documentation updates, and dashboard UI improvements from prior work. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-09 15:54:35 +00:00
jml	65065f3d8f	feat(aphoria): implement community corpus with wiki import and pattern aggregation Implements Phase 4 (A4) - Community corpus as first-class citizens: - Community Corpus Builder - Queries StemeDB pattern aggregates - Wiki Import - Bootstrap corpus from markdown docs (aphoria corpus import wiki) - Pattern Aggregation - Automatic learning from local scans (--sync flag) - Storage Layer - StemeDBPatternStore with content-addressed deduplication - Promotion Logic - Multi-tier thresholds (95%/80%/50% adoption rates) - Corpus Build - Unified registry for RFC/OWASP/Vendor/Community sources - Trust Packs - Export corpus as signed, distributable artifacts - Documentation - bootstrap-corpus.md guide + CLI reference updates Technical details: - Pattern aggregates stored as assertions with predicate "pattern_aggregate" - Content-addressed subjects via BLAKE3(subject:predicate:value) - PatternAggregator handles write path (observations → patterns) - StemeDBPatternStore handles read path (pattern queries) - Integration tests + fixtures in tests/wiki_import_test.rs Deleted hardcoded.rs (368 lines) - corpus now fully emergent from StemeDB. Deleted enriched-corpus-patterns.md (677 lines) - feature shipped. Closes VG-026 (community corpus), part of A4 milestone. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-09 00:12:31 +00:00
jml	e95c978481	feat(aphoria): add inline claim markers and claim enrichment infrastructure This commit implements Phase 17 of the Aphoria roadmap, adding: Inline Claim Markers (@aphoria:claim): - New extractor for detecting inline markers in comments - Pending markers tracked in .aphoria/pending_markers.toml - CLI commands: list-markers, formalize-marker, reject-marker - Support for all major comment styles (Rust, Python, SQL, etc.) - Auto-sync during scan (configurable) Claim Enrichment: - ClaimEnrichment type with source attribution (inline, extractor, manual) - EnrichedClaimInfo with full enrichment metadata - Extended AuthoredClaim with optional enrichment field - API endpoints for enriched claim queries - Dashboard UI components (enrichment badge, verdict badge) Enhanced Extractor Trait: - verifiable_predicates() method for declaring (tail_path, predicate) pairs - 10 security extractors now implement verifiable_predicates - Enables claim suggester skill to find unclaimed patterns Documentation: - Phase 17 summary with complete implementation details - Gap fixes summary documenting 8 closed vision gaps - Updated CLI reference with new commands - New aphoria-docs skill for documentation maintenance - Updated roadmap with Phase 17 completion Integration: - ClaimsFile support for claim enrichment persistence - Pattern aggregate store support for enrichment queries - Dashboard filters and display for enrichment metadata - API handlers for list-markers and enrichment queries Tests: - New gap_fixes_integration test suite - Corpus enricher module with best practices ingestion Closes: VG-005, VG-017, VG-018, VG-019, VG-020, VG-021, VG-022, VG-023 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 20:18:20 +00:00
jml	ef2c8c5940	fix(aphoria): fix 3 critical verification engine bugs Fixed 3 bugs in Aphoria's claim verification engine that were causing false positives in Maxwell validation testing: Bug 1: Path matching + predicate filtering - Added predicate filtering to prevent cross-predicate matches - Added path prefix matching to respect crate boundaries - Prevents core/imports/serde from matching hypervisor/vsock/imports/serde Bug 2: Value-specific absent checks - Absent mode now checks for specific forbidden value, not any observation - Example: "Clone absent" + "Debug present" = PASS (not CONFLICT) - Only conflicts when the exact forbidden value is found Bug 3: Wildcard pattern support - Wildcard patterns like message//derives now match multiple paths - Enhanced wildcard_matches() to support prefix//suffix patterns - Correctly strips full scheme+language from observation paths Test coverage: - All 39 existing tests passing - 3 new tests added for bug fixes - 2 tests updated to use correct predicates - Zero clippy warnings Maxwell validation: - maxwell-core-no-serde-001: CONFLICT → PASS (respects path boundaries) - maxwell-singleton-no-clone-001: CONFLICT → PASS (value-specific absent) - 5 claims now correctly show as MISSING (expose predicate mismatches) The fixes successfully eliminate false positives while exposing pre-existing issues where claims used incorrect predicates. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 15:13:10 +00:00
jml	6430ff0fd6	fix(aphoria): move claims.toml to project root and fix verify integration ## Root Cause Claims file was in applications/aphoria/.aphoria/ but all commands looked for .aphoria/claims.toml relative to project root. Additionally, .aphoria/ was fully gitignored, preventing version control of claims. ## Changes ### Path Fixes - Move claims.toml from applications/aphoria/.aphoria/ to .aphoria/ at project root - Update .gitignore: .aphoria/ → .aphoria/* with !.aphoria/claims.toml exception - Now claims can be version controlled while keys remain secret ### Verify Integration (Scanner) - scanner.rs: Load claims from ClaimsFile and call verify_claims() - ScanResult: Add verify field with VerifyReport - Report formatters: Add claim verification sections showing PASS/CONFLICT/MISSING ### Clippy Fix - report/json.rs: Replace filter().map().expect() with filter_map() ## Verification - aphoria scan . → Shows claim verification with verdicts - aphoria verify run → Per-claim verification results - aphoria verify map → Extractor coverage mapping (7/10 claims = 70%) - aphoria claims list → Reads from project root - aphoria claims create → Writes to project root - All tests pass (1120+ aphoria tests) - clippy --workspace passes ## Impact Both primary use cases now work: 1. Day-to-day (commit-time): Skills can read/create claims via CLI 2. Audit (scan-time): Scanner verifies code against authored claims Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 11:09:57 +00:00
jml	3b5f88b4f0	feat(aphoria): implement claims architecture (A1-A5) with verify engine, corpus, coverage, and explain Complete Aphoria claims system overhaul: - A1: Rename ExtractedClaim to Observation (extractors produce observations, not claims) - A2: Add AuthoredClaim with full provenance, invariants, and authority tiers - A3: Verify engine comparing observations against authored claims, CLI + formatters - A4: Corpus as first-class assertions with predicate indexing, authority lens, trust packs - A5: Coverage analysis, explain/docs generation, self-audit extractor, claim suggester skill Also includes: 42 extractors updated for Observation type, verifiable_predicates trait, conflict detection with comparison modes, claims TOML persistence, Grafana dashboard, backup/restore scripts, and comprehensive test coverage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 09:11:47 +00:00
jordan	e0d2940b82	Skill	2026-02-07 19:51:05 -07:00
jordan	c849627620	feat: add Aphoria dashboard scans and corpus UI - Add scans panel with finding details, verdict badges, and filters - Add corpus panel for managing knowledge sources - Add scan cache for API state management - Update sidebar navigation with new routes - Extend API types for scans and corpus endpoints - Add .aphoria/ to gitignore (contains project keys) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 15:56:49 -07:00
jordan	f2ffb63f79	fix: Add missing benchmark field and fix approx_constant warning - Add benchmark: false to ScanArgs in stemedb-api handler - Change test float from 3.14 to 7.25 to avoid clippy approx_constant Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-07 05:17:53 -07:00
jordan	157dbbb9eb	feat: Complete Aphoria Phase 8-9 + UAT suite (90/90 tests passing) ## Phase 8: Enterprise Extractor Improvements ✅ - 14 security extractors (TLS, JWT, SQL injection, XSS, etc.) - 10 framework-specific extractors (Spring, Django, Rails, etc.) - Config file security detection (YAML, TOML) ## Phase 9: Autonomous Extractor Generation ✅ - Shadow mode executor with TP/FP tracking - Graduation pipeline with confidence thresholds - Auto-rollback on regression detection - Cross-project pattern syncing ## UAT Suite Complete (14 scripts, 90 tests) - test-core-detection.sh (6 tests) - test-declarative-extractors.sh (5 tests) - test-domain-frameworks.sh (5 tests) - test-domain-unreal.sh (3 tests) - test-llm-extraction.sh (6 tests) - test-eval-harness.sh (5 tests) - test-cross-language.sh (3 tests) - test-precommit-performance.sh (4 tests) - test-output-formats.sh (8 tests) - test-drift-detection.sh (6 tests) - test-exit-codes.sh (12 tests) + 3 more scripts ## Other Changes - Updated roadmap to mark Phase 8-9 complete - Added .gitignore entries for build artifacts - Updated pre-commit: 800 line limit, exclude tests/data/cmd Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-06 22:50:55 -07:00
jordan	41c676a78e	feat: Aphoria enterprise features + ontology SDK + file length compliance Enterprise Features: - Hosted mode with remote sync for team pattern aggregation - Community sharing with privacy-preserving anonymization - LLM-based semantic claim extraction with Gemini integration - Pattern learning with promotion to declarative extractors - High-entropy secrets extractor with configurable thresholds - Auth bypass and insecure cookies extractors Module Refactoring: - Split oversized files to comply with 500-line limit - Config split: types/core.rs, types/extractors.rs, types/hosted.rs, etc. - Handlers split: scan.rs, policy.rs, report.rs modules - Extractors split: declarative/, high_entropy_secrets/, insecure_cookies/ - Learning split: store modules with metrics and persistence SDK & Ontology: - stemedb-ontology SDK with fluent builders and StemeDB client - Pharma domain extractors for FDA Orange Book data - Consumer health UAT test infrastructure Code Quality: - Fixed clippy warnings (needless_borrows_for_generic_args) - Added KVStore trait imports where needed - Fixed utoipa path re-exports for OpenAPI docs Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 12:55:29 -07:00
jordan	8f6506b70a	feat: Aphoria scan modes + stemedb-ontology crate + consumer health UAT Major additions: - Staged scanning modes (working tree, staged, committed) with git integration - Drift detection for baseline vs current state comparisons - Hosted API handlers for policy CRUD operations via StemeDB API - stemedb-ontology crate with domain definitions and medical extractors - Consumer health vertical UAT scenarios (GLP-1, gastroparesis, etc.) - Aphoria development skill documentation Code organization: - Split large files into focused modules to stay under 500-line limit - Extracted config tests, episteme helpers/drift/aliases, API helpers Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 21:57:33 -07:00
jordan	1cc453c97b	feat: Aphoria policy source tracking + claim extraction pipeline - Add PolicySourceStore for tracking where policies come from - Implement claim extraction skill and API endpoints - Add community UI text selection extractor component - Create Go SDK aphoria client for policy operations - Document patent specifications and legal disclosures - Add guides: golden path loop, policy audit trails, pre-flight checks - Expand Unreal Engine config extractor with source tracking - Add UAT reports for policy source tracking validation - Refactor tests.rs into modular test files Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 02:35:02 -07:00
jordan	b3e8a9a058	feat: Multi-application expansion with chaos testing and community UI Major additions: - Community Next.js app (port 18187) for browsing claims with API docs - stemedb-chaos crate: Fault injection, chaos testing, CRDT properties - Latent ingestion system: Reddit/FDA ingesters with ADK-Go agents - Disputed claims handling: Manual review workflows and validation - Aphoria security scanner: New extractors (SQL injection, command injection, weak crypto, TLS version), policy-based ignores, UAT reports - Docker infrastructure: Dockerfile, docker-compose.yml for full stack - VulnBank demo: Intentionally vulnerable multi-language test corpus SDK & API enhancements: - Source registry handlers for tracking data provenance - Metrics endpoint - Skeptic filtering improvements Code quality: - Split 14 large files (>500 lines) into focused modules - All files now under 500-line limit per project guidelines Documentation: - Chaos testing guide, circuit breakers, observability docs - Phase 7 UAT documentation updates - Martin Kleppmann technical writer agent Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 01:24:14 -07:00
jordan	a734be3a0d	feat: Phase 7 Content Defense + code structure refactoring Content Defense (Phase 7): - Add SimilarityIndex with MinHash/LSH for near-duplicate detection - Add QuarantineStore for flagged assertions awaiting admin review - Add CircuitBreakerStore for per-agent circuit breaker state - Add ContentDefenseLayer for ingestion pipeline integration - Add API endpoints for quarantine and circuit breaker management - Add research module with gap detection and documentation fetching Code Structure Improvements: - Extract research CLI commands to research_commands.rs - Extract API routers to routers.rs module - Extract key_codec extraction functions to separate module - Extract test modules to separate files across multiple crates - All files now under 500 line limit per pre-commit hook Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 12:44:05 -07:00
jordan	d3a88585fe	feat: Phase 6 UAT - Admission control, HLC recency, cluster coordination This commit includes comprehensive work on Phase 6 features: ## Admission Control (Phase 6 admission middleware) - AdmissionStore implementation backed by TrustRankStore - PoW verification with tier-based difficulty computation - Trust tier progression (Newcomer → Established → Trusted → Authority) - API integration with admission status endpoints ## HLC Recency Lens (Phase 6C) - HlcRecencyLens for distributed system ordering - Hybrid logical clock integration with causality preservation ## Cluster Coordination (Phase 6C) - Multi-node cluster tests (availability, partition tolerance) - CRDT convergence tests for anti-entropy sync - Gateway handler improvements ## Aphoria Code Linter (Phase 2A) - RFC/OWASP corpus builders with network fetching and caching - Concept hierarchy with auto-alias creation on conflict detection - Multiple security extractors (TLS, JWT, CORS, secrets, rate limiting) ## Code Organization - Split large files into modules to comply with 500-line limit - Improved test organization with separate test modules - Fixed rkyv serialization for EigenTrustState (AgentScore struct) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 00:43:37 -07:00
jordan	2b0923f20e	feat: Distributed replication foundation (Phase 6A) - HLC, Merkle trees, CRDT stores, sync protocol - Add Hybrid Logical Clock (HLC) for causality tracking across nodes - Implement Merkle tree for efficient diff/sync with BLAKE3 hashing - Add CRDT-aware stores for assertions and votes with vector clocks - Create stemedb-sync crate with anti-entropy and gossip protocols - Add stemedb-rpc crate with gRPC sync service (proto definitions) - Implement SupersessionChain for tracking assertion lifecycles - Add Aphoria application for code analysis/reporting - Add battery11 replication test scaffolding - Fix .gitignore to exclude nested target directories Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 19:31:54 -07:00
jordan	137a588ed0	feat: Concept hierarchy (Phase 5D) - ConceptPath, source schemes, AliasStore Implements hierarchical subject identifiers with scheme-based source tier inference: - ConceptPath type with parse/wire_format, leaf/parent, prefix matching - SourceScheme registry mapping schemes to default SourceClass tiers: - rfc://, fda://, ietf:// → Regulatory (Tier 0) - peer://, pubmed:// → PeerReviewed (Tier 1) - code://, wiki:// → Expert (Tier 3) - blog://, anon:// → Anecdotal (Tier 5) - AliasStore for cross-scheme entity resolution (bidirectional indexing) - API endpoints for concept operations - Battery tests 8, 9 & 10 for concepts, aliases, and advanced signatures - Go SDK updates for concept types and signing Completes Phase 5, advancing to Phase 6 (Distributed Writes). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 17:44:54 -07:00
jordan	3320c24afa	feat: WAL hardening (Phase 5B) - CRC32C, crash recovery, group commit, log rotation Add CRC32C checksums to WAL record format (v2), implement crash recovery with automatic truncation of corrupt records, add feature-gated group commit buffer for batched fsync under concurrent load, and implement log rotation via segment files with global offset addressing. Key changes: - Record format v2: [len:u32][crc32c:u32][blake3:32][payload:N] - recover_file() scans and truncates corrupt tail records - GroupCommitBuffer batches fsync via MPSC channel (tokio feature gate) - SegmentManager with binary search resolution and cursor-based cleanup - Journal::read() auto-refreshes segments on miss for writer/reader split - Split recovery.rs and key_codec.rs into directory modules for 500-line max Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 12:36:35 -07:00
jordan	55349845d0	refactor: Split all files to enforce 500-line max Break monolith source files into focused modules: - stemedb-core/types.rs → types/ directory (assertion, source, gold_standard, etc.) - stemedb-storage: audit_store, quota_store, trust_rank_store, vector_index, vote_store → module directories - stemedb-ingest/worker.rs → worker/ with separate test modules - stemedb-query: engine, materializer, query → module directories - stemedb-lens: epoch_aware, skeptic → module directories - stemedb-sim/lib.rs → agent, arenas/, helpers, runner, strategy, types - stemedb-api/tests: integration_tests → http_basic, http_validation, http_epoch, http_pipeline - stemedb-api/tests: e2e_flow_test → e2e_full_pipeline, e2e_lens_resolution - stemedb-query/tests: e2e_pipeline → e2e_pipeline + e2e_decay Also adds new features: gold standard verification, escalation handlers, admin endpoints, concept hierarchy spec, arena roadmap, and Go SDK. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 01:13:45 -07:00
jordan	c59066949a	feat: Add quickstart "Beyond Hello World" sections with Skeptic and Layered endpoints - Add Layered() method to Go SDK for per-source-class consensus queries - Add LayeredQueryParams, LayeredResult, TierResolution types to Go SDK - Create conflict example demonstrating Skeptic and Layered endpoints - Update quickstart.md with sections 6 (conflict detection) and 7 (authority tiers) - Remove tracked Go binary and add data/ to .gitignore The new quickstart sections demonstrate Episteme's differentiating features: - Skeptic endpoint shows "Trust but Verify" conflict analysis - Layered endpoint shows per-tier resolution (Clinical vs Anecdotal) Note: Pre-existing large files flagged by pre-commit hook (technical debt from prior sessions) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 21:00:59 -07:00
jordan	1ce4004807	feat: Complete Phase 2 (The Cortex) - query, lens, and API layers This commit adds the read path (Cortex) to complement the write path (Spine): ## Crates - stemedb-api: HTTP API with axum + utoipa OpenAPI - /v1/assert, /v1/query, /v1/epoch, /v1/skeptic, /v1/trace, /v1/audit - Metered endpoints with quota enforcement - Ed25519 signature verification - stemedb-lens: Truth resolution lenses - RecencyLens, ConsensusLens, ConfidenceLens - VoteAwareConsensusLens (Ballot Box pattern) - TrustAwareAuthorityLens (The Hive pattern) - SkepticLens (conflict analysis) - EpochAwareLens (paradigm-safe queries) - stemedb-query: Query engine with materialized views ## Storage Extensions - VoteStore: Vote aggregation with cached counts - TrustRankStore: Agent reputation with decay - AuditStore: Query audit trail - IndexStore: SP/P/S index structures - SupersessionStore: Epoch supersession chains ## SDKs - sdk/go/steme: Go HTTP client with Ed25519 signing - sdk/go/adk: ADK-Go tools for AI agents ## Documentation - Updated CLAUDE.md, architecture.md, roadmap.md - New ai-lookup entries for all services - Use case docs for consumer health intelligence - Arena roadmap for simulation advancement Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 13:22:44 -07:00

29 Commits