stemedb

Author	SHA1	Message	Date
jordan	cde30b9213	chore: apply rustfmt formatting across API handlers and core types Reformats import blocks, function signatures, and expression line wrapping in stemedb-api handlers, stemedb-core serde/source_record, and serde_helpers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-21 16:43:45 -07:00
jordan	ad07a75d0a	feat: add source content to source registry, signed assertions, feed endpoint, dashboard enhancements - Add `content: Option<String>` to SourceRecord with rkyv schema evolution (LegacySourceRecord compat deserializer for backward compatibility) - Add MAX_SOURCE_CONTENT_LEN (1MB) limit with API validation - Strip content from list responses, include in single-source GET - Update Go SDK RegisterSourceRequest with Content field - FCM pipeline extracts PDF text via pdftotext and passes to registration - Dashboard impact panel fetches and displays source content with expand/collapse - Add feed endpoint, dashboard feed panel, and signed assertion support - Update data-structures.md, API docs, and storage docs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 21:54:27 -07:00
jml	e95c978481	feat(aphoria): add inline claim markers and claim enrichment infrastructure This commit implements Phase 17 of the Aphoria roadmap, adding: Inline Claim Markers (@aphoria:claim): - New extractor for detecting inline markers in comments - Pending markers tracked in .aphoria/pending_markers.toml - CLI commands: list-markers, formalize-marker, reject-marker - Support for all major comment styles (Rust, Python, SQL, etc.) - Auto-sync during scan (configurable) Claim Enrichment: - ClaimEnrichment type with source attribution (inline, extractor, manual) - EnrichedClaimInfo with full enrichment metadata - Extended AuthoredClaim with optional enrichment field - API endpoints for enriched claim queries - Dashboard UI components (enrichment badge, verdict badge) Enhanced Extractor Trait: - verifiable_predicates() method for declaring (tail_path, predicate) pairs - 10 security extractors now implement verifiable_predicates - Enables claim suggester skill to find unclaimed patterns Documentation: - Phase 17 summary with complete implementation details - Gap fixes summary documenting 8 closed vision gaps - Updated CLI reference with new commands - New aphoria-docs skill for documentation maintenance - Updated roadmap with Phase 17 completion Integration: - ClaimsFile support for claim enrichment persistence - Pattern aggregate store support for enrichment queries - Dashboard filters and display for enrichment metadata - API handlers for list-markers and enrichment queries Tests: - New gap_fixes_integration test suite - Corpus enricher module with best practices ingestion Closes: VG-005, VG-017, VG-018, VG-019, VG-020, VG-021, VG-022, VG-023 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-08 20:18:20 +00:00
jordan	bbe6aedc40	feat: Aphoria security extractors + LLM evaluation architecture + ontology docs New security extractors: - insecure_deserialization, orm_injection, path_traversal, security_headers - ssrf, unvalidated_redirects, weak_password, xxe - Enhanced tls_version extractor with comprehensive cipher/protocol checks Architecture docs: - Scout-judge extraction pattern for LLM-based code analysis - LLM prompt evaluation framework - LLM eval implementation guide Core improvements: - stemedb-ontology README and client enhancements - WAL journal/segment instrumentation - Signing and ingestion refinements - Consumer health demo script Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 15:22:55 -07:00
jordan	41c676a78e	feat: Aphoria enterprise features + ontology SDK + file length compliance Enterprise Features: - Hosted mode with remote sync for team pattern aggregation - Community sharing with privacy-preserving anonymization - LLM-based semantic claim extraction with Gemini integration - Pattern learning with promotion to declarative extractors - High-entropy secrets extractor with configurable thresholds - Auth bypass and insecure cookies extractors Module Refactoring: - Split oversized files to comply with 500-line limit - Config split: types/core.rs, types/extractors.rs, types/hosted.rs, etc. - Handlers split: scan.rs, policy.rs, report.rs modules - Extractors split: declarative/, high_entropy_secrets/, insecure_cookies/ - Learning split: store modules with metrics and persistence SDK & Ontology: - stemedb-ontology SDK with fluent builders and StemeDB client - Pharma domain extractors for FDA Orange Book data - Consumer health UAT test infrastructure Code Quality: - Fixed clippy warnings (needless_borrows_for_generic_args) - Added KVStore trait imports where needed - Fixed utoipa path re-exports for OpenAPI docs Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 12:55:29 -07:00
jordan	b3e8a9a058	feat: Multi-application expansion with chaos testing and community UI Major additions: - Community Next.js app (port 18187) for browsing claims with API docs - stemedb-chaos crate: Fault injection, chaos testing, CRDT properties - Latent ingestion system: Reddit/FDA ingesters with ADK-Go agents - Disputed claims handling: Manual review workflows and validation - Aphoria security scanner: New extractors (SQL injection, command injection, weak crypto, TLS version), policy-based ignores, UAT reports - Docker infrastructure: Dockerfile, docker-compose.yml for full stack - VulnBank demo: Intentionally vulnerable multi-language test corpus SDK & API enhancements: - Source registry handlers for tracking data provenance - Metrics endpoint - Skeptic filtering improvements Code quality: - Split 14 large files (>500 lines) into focused modules - All files now under 500-line limit per project guidelines Documentation: - Chaos testing guide, circuit breakers, observability docs - Phase 7 UAT documentation updates - Martin Kleppmann technical writer agent Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 01:24:14 -07:00
jordan	a734be3a0d	feat: Phase 7 Content Defense + code structure refactoring Content Defense (Phase 7): - Add SimilarityIndex with MinHash/LSH for near-duplicate detection - Add QuarantineStore for flagged assertions awaiting admin review - Add CircuitBreakerStore for per-agent circuit breaker state - Add ContentDefenseLayer for ingestion pipeline integration - Add API endpoints for quarantine and circuit breaker management - Add research module with gap detection and documentation fetching Code Structure Improvements: - Extract research CLI commands to research_commands.rs - Extract API routers to routers.rs module - Extract key_codec extraction functions to separate module - Extract test modules to separate files across multiple crates - All files now under 500 line limit per pre-commit hook Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 12:44:05 -07:00
jordan	d3a88585fe	feat: Phase 6 UAT - Admission control, HLC recency, cluster coordination This commit includes comprehensive work on Phase 6 features: ## Admission Control (Phase 6 admission middleware) - AdmissionStore implementation backed by TrustRankStore - PoW verification with tier-based difficulty computation - Trust tier progression (Newcomer → Established → Trusted → Authority) - API integration with admission status endpoints ## HLC Recency Lens (Phase 6C) - HlcRecencyLens for distributed system ordering - Hybrid logical clock integration with causality preservation ## Cluster Coordination (Phase 6C) - Multi-node cluster tests (availability, partition tolerance) - CRDT convergence tests for anti-entropy sync - Gateway handler improvements ## Aphoria Code Linter (Phase 2A) - RFC/OWASP corpus builders with network fetching and caching - Concept hierarchy with auto-alias creation on conflict detection - Multiple security extractors (TLS, JWT, CORS, secrets, rate limiting) ## Code Organization - Split large files into modules to comply with 500-line limit - Improved test organization with separate test modules - Fixed rkyv serialization for EigenTrustState (AgentScore struct) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 00:43:37 -07:00
jordan	2b0923f20e	feat: Distributed replication foundation (Phase 6A) - HLC, Merkle trees, CRDT stores, sync protocol - Add Hybrid Logical Clock (HLC) for causality tracking across nodes - Implement Merkle tree for efficient diff/sync with BLAKE3 hashing - Add CRDT-aware stores for assertions and votes with vector clocks - Create stemedb-sync crate with anti-entropy and gossip protocols - Add stemedb-rpc crate with gRPC sync service (proto definitions) - Implement SupersessionChain for tracking assertion lifecycles - Add Aphoria application for code analysis/reporting - Add battery11 replication test scaffolding - Fix .gitignore to exclude nested target directories Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 19:31:54 -07:00
jordan	137a588ed0	feat: Concept hierarchy (Phase 5D) - ConceptPath, source schemes, AliasStore Implements hierarchical subject identifiers with scheme-based source tier inference: - ConceptPath type with parse/wire_format, leaf/parent, prefix matching - SourceScheme registry mapping schemes to default SourceClass tiers: - rfc://, fda://, ietf:// → Regulatory (Tier 0) - peer://, pubmed:// → PeerReviewed (Tier 1) - code://, wiki:// → Expert (Tier 3) - blog://, anon:// → Anecdotal (Tier 5) - AliasStore for cross-scheme entity resolution (bidirectional indexing) - API endpoints for concept operations - Battery tests 8, 9 & 10 for concepts, aliases, and advanced signatures - Go SDK updates for concept types and signing Completes Phase 5, advancing to Phase 6 (Distributed Writes). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 17:44:54 -07:00
jordan	55349845d0	refactor: Split all files to enforce 500-line max Break monolith source files into focused modules: - stemedb-core/types.rs → types/ directory (assertion, source, gold_standard, etc.) - stemedb-storage: audit_store, quota_store, trust_rank_store, vector_index, vote_store → module directories - stemedb-ingest/worker.rs → worker/ with separate test modules - stemedb-query: engine, materializer, query → module directories - stemedb-lens: epoch_aware, skeptic → module directories - stemedb-sim/lib.rs → agent, arenas/, helpers, runner, strategy, types - stemedb-api/tests: integration_tests → http_basic, http_validation, http_epoch, http_pipeline - stemedb-api/tests: e2e_flow_test → e2e_full_pipeline, e2e_lens_resolution - stemedb-query/tests: e2e_pipeline → e2e_pipeline + e2e_decay Also adds new features: gold standard verification, escalation handlers, admin endpoints, concept hierarchy spec, arena roadmap, and Go SDK. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 01:13:45 -07:00
jordan	c59066949a	feat: Add quickstart "Beyond Hello World" sections with Skeptic and Layered endpoints - Add Layered() method to Go SDK for per-source-class consensus queries - Add LayeredQueryParams, LayeredResult, TierResolution types to Go SDK - Create conflict example demonstrating Skeptic and Layered endpoints - Update quickstart.md with sections 6 (conflict detection) and 7 (authority tiers) - Remove tracked Go binary and add data/ to .gitignore The new quickstart sections demonstrate Episteme's differentiating features: - Skeptic endpoint shows "Trust but Verify" conflict analysis - Layered endpoint shows per-tier resolution (Clinical vs Anecdotal) Note: Pre-existing large files flagged by pre-commit hook (technical debt from prior sessions) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 21:00:59 -07:00
jordan	152df4b0b4	docs: Mark Phase 2.4, 2.5, 2.6 as complete in roadmap - 2.4 Visual Hash Query: hamming_distance, visual_near/threshold implemented - 2.5 Vector Field: N/A (Phase 3 work, scaffolding correct) - 2.6 E2E Integration Test: e2e_pipeline.rs with 5 comprehensive tests Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 13:33:03 -07:00
jordan	1ce4004807	feat: Complete Phase 2 (The Cortex) - query, lens, and API layers This commit adds the read path (Cortex) to complement the write path (Spine): ## Crates - stemedb-api: HTTP API with axum + utoipa OpenAPI - /v1/assert, /v1/query, /v1/epoch, /v1/skeptic, /v1/trace, /v1/audit - Metered endpoints with quota enforcement - Ed25519 signature verification - stemedb-lens: Truth resolution lenses - RecencyLens, ConsensusLens, ConfidenceLens - VoteAwareConsensusLens (Ballot Box pattern) - TrustAwareAuthorityLens (The Hive pattern) - SkepticLens (conflict analysis) - EpochAwareLens (paradigm-safe queries) - stemedb-query: Query engine with materialized views ## Storage Extensions - VoteStore: Vote aggregation with cached counts - TrustRankStore: Agent reputation with decay - AuditStore: Query audit trail - IndexStore: SP/P/S index structures - SupersessionStore: Epoch supersession chains ## SDKs - sdk/go/steme: Go HTTP client with Ed25519 signing - sdk/go/adk: ADK-Go tools for AI agents ## Documentation - Updated CLAUDE.md, architecture.md, roadmap.md - New ai-lookup entries for all services - Use case docs for consumer health intelligence - Arena roadmap for simulation advancement Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 13:22:44 -07:00
jordan	3cfaa1e1d3	feat: Complete Phase 1 (The Spine) - storage foundation Phase 1 delivers the complete durability and storage layer: - WAL with crash recovery: Append-only journal with BLAKE3 checksums, fsync guarantees, and proper seek-to-EOF on reopen - Storage engine: sled-backed KVStore with scan_prefix for range queries - Content-addressed storage: H:{hash}, V:{hash}, E:{hash} key patterns - Ingestor: Background worker tailing WAL, writing to KV with 8-byte aligned record headers for rkyv zero-copy deserialization - Comprehensive tests: 31 tests covering crash recovery, round-trips, and multi-cycle durability New crates: stemedb-wal, stemedb-storage, stemedb-ingest Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-31 14:15:34 -07:00
jordan	a776744889	Initial project setup with Claude Code monorepo structure - Rust workspace with stemedb-core crate - Full .claude/ configuration (agents, skills, commands, guides) - ai-lookup/ for token-efficient fact storage - Quality gates: clippy, fmt, jscpd duplication detection - Pre-commit hook with 5-phase quality checks - CLAUDE.md router and CODING_GUIDELINES.md standards Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-31 10:56:26 -07:00

16 Commits