Commit Graph

11 Commits

Author SHA1 Message Date
jordan
02ecac9a07 fix: merge upstream 10 commits, fix DashMap deadlock, deterministic sim ingestion
Merged 10 upstream commits (MemTable, read-your-writes tests, feed endpoint,
security hardening, signed assertions, source registry, dashboard enhancements)
and fixed all test failures across the full workspace (2656/2656 passing).

Key fixes:
- fix(cluster): DashMap deadlock in swim.rs suspect_node/fail_node/alive_node
  - DashMap::get_mut RefMut + iter() on same map = non-reentrant write lock deadlock
  - Fix: extract clone in scoped block to drop RefMut before calling update_node_gauges()
  - 6 previously-hanging SWIM tests now pass in <2s
- fix(sim): replace background-task+polling ingestion with synchronous process_pending()
  - smoke_high_volume_simulation was CPU-starved under 2656 parallel tests
  - Removed ingestor.start() + wait_until_ingested() pattern throughout sim
  - All arena functions now call ingestor.process_pending() directly (deterministic)
- fix(test): v2 signature helper used wrong hash (rkyv vs canonical compute_content_hash_v2)
- fix(test): quota test signed "test" but v1 requires "subject:predicate" format
- fix(test): http_validation now accepts 400 for valid-format-but-invalid-crypto hex
- fix(test): scale_adaptive micro tier assertions updated (auto_promote upstream change)
- config: add nextest.toml with slow-timeout for background-task-tests group

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 20:27:32 -07:00
jordan
ad07a75d0a feat: add source content to source registry, signed assertions, feed endpoint, dashboard enhancements
- Add `content: Option<String>` to SourceRecord with rkyv schema evolution
  (LegacySourceRecord compat deserializer for backward compatibility)
- Add MAX_SOURCE_CONTENT_LEN (1MB) limit with API validation
- Strip content from list responses, include in single-source GET
- Update Go SDK RegisterSourceRequest with Content field
- FCM pipeline extracts PDF text via pdftotext and passes to registration
- Dashboard impact panel fetches and displays source content with expand/collapse
- Add feed endpoint, dashboard feed panel, and signed assertion support
- Update data-structures.md, API docs, and storage docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 21:54:27 -07:00
jordan
58594bc7b9 feat: add feed endpoint, dashboard feed panel, and FindMyHealth app
- Add /v1/feed API endpoint with handler and tests
- Remove health endpoint rate limiting (behind firewall, caused spurious 429s)
- Add dashboard feed panel with list, row, empty state, and loading skeleton
- Update home page to show feed instead of redirecting to skeptic
- Improve API key auth middleware and DTO create/query params
- Add OpenAPI conceptual guide (api-intro.md) with semaglutide examples
- Add FindMyHealth application scaffolding (vision, architecture, prototypes)
- Add FindMyHealth designer/writer and Aphoria founder-CEO agents
- Update roadmap with current progress

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 17:16:17 -07:00
jordan
422e2d4416 feat(aphoria): wire claims through StemeDB — Gap Closure Phase 1
Claims now flow through StemeDB's append-only knowledge graph instead of
mutable TOML files. This resolves all 6 critical claim-bypass code paths:

- Bridge: lossless AuthoredClaim ↔ Assertion round-trip (comparison, status, lifecycle mapping)
- LocalEpisteme: ingest_authored_claim() and fetch_authored_claims() with AUTHORED_CLAIM predicate index
- EpistemeClaimStore: ClaimStore trait backed by StemeDB (append-only delete via deprecation)
- CLI handlers: all claim commands read/write through StemeDB
- Scanner: loads claims from StemeDB with auto-migration fallback to TOML
- Export: new `aphoria claims export` serializes StemeDB claims to TOML/JSON

Also cleans up dead code (EpistemeConfig.url), renames ingest_claims→ingest_observations,
fixes ClaimFilter.authority_tier type, adds Draft variant to ClaimStatus, and fixes
pre-existing clippy warnings (too_many_arguments, filter_next→rfind).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 02:02:51 -07:00
jml
3e7eddc074 feat: add enterprise production readiness infrastructure
This commit implements comprehensive production hardening across multiple
layers to prepare StemeDB for enterprise pilot deployments:

## API Layer
- Add rate limiting middleware with configurable limits per endpoint
- Enhance error handling with detailed context and proper HTTP status codes
- Add security hardening tests for input validation and boundary conditions
- Create store_helpers module for defensive storage access patterns

## Storage & WAL
- Optimize group commit batching for higher throughput
- Add defensive error handling in hybrid backend with proper fallbacks
- Enhance WAL journal durability guarantees with fsync validation
- Improve index store query performance with better caching

## Operations & Deployment
- Add comprehensive operations documentation (deployment, monitoring, DR)
- Create systemd units for backup, WAL archival, and verification
- Add monitoring configs (Prometheus alerts, metrics exporters)
- Implement backup/restore scripts with verification and S3 archival
- Add DR drill automation and runbook procedures
- Create load balancer configs (nginx, envoy) with health checks

## Documentation
- Update CLAUDE.md with operations and troubleshooting guides
- Expand roadmap with production readiness milestones
- Add pilot success criteria and deployment reference architecture
- Document TLS setup, monitoring integration, and incident response

## Configuration
- Add .env.example with all required environment variables
- Document resource sizing for different deployment scales
- Add configuration examples for various deployment topologies

This positions StemeDB for successful enterprise pilots with proper
operational discipline, monitoring, backup/DR, and security hardening.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-12 06:08:15 +00:00
jml
bb0c33f8d3 fix(api): enable querying of CLI-created community corpus items
## Problem
CLI-created community corpus items (tier 3) were stored correctly but
invisible via API queries. Two issues blocked discoverability:

1. **Prefix mismatch**: API hardcoded 'community://pattern/' for
   aggregated patterns, but CLI creates 'community://rust/http/...' URIs
2. **Query parameter parsing**: Axum's default parser doesn't support
   bracket notation (?sources[]=value) used by the dashboard

Result: 0/22 CLI-created items were queryable.

## Solution

### Fix 1: Broaden Community Prefix
- Changed: 'community://pattern/' → 'community://' in corpus handler
- Impact: Now matches both aggregated patterns AND CLI-created items
- Backward compatible: Broader prefix includes narrower results

### Fix 2: Add QsQuery Extractor
- Added: serde_qs dependency + custom QsQuery extractor
- Supports: Bracket notation for array parameters (?sources[]=a&sources[]=b)
- Compatible: Works with JavaScript URLSearchParams standard
- Tested: 3 new unit tests for extractor behavior

## Verification
-  All 22 CLI-created community items now queryable (was 0)
-  Source filtering works: community (22), RFC (2), vendor (5)
-  Multi-source queries work: ?sources[]=community&sources[]=rfc → 24
-  All 89 API tests pass + 3 new extractor tests
-  Clippy clean (0 warnings)
-  No regressions in existing functionality

## Files Changed
- crates/stemedb-api/Cargo.toml: Add serde_qs dependency
- crates/stemedb-api/src/extractors.rs: New QsQuery extractor (117 lines)
- crates/stemedb-api/src/handlers/aphoria/corpus.rs: Use QsQuery, broaden prefix
- crates/stemedb-api/src/lib.rs: Export extractors module

Also includes: Scale-adaptive thresholds, wiki corpus extraction,
documentation updates, and dashboard UI improvements from prior work.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-09 15:54:35 +00:00
jordan
d3a88585fe feat: Phase 6 UAT - Admission control, HLC recency, cluster coordination
This commit includes comprehensive work on Phase 6 features:

## Admission Control (Phase 6 admission middleware)
- AdmissionStore implementation backed by TrustRankStore
- PoW verification with tier-based difficulty computation
- Trust tier progression (Newcomer → Established → Trusted → Authority)
- API integration with admission status endpoints

## HLC Recency Lens (Phase 6C)
- HlcRecencyLens for distributed system ordering
- Hybrid logical clock integration with causality preservation

## Cluster Coordination (Phase 6C)
- Multi-node cluster tests (availability, partition tolerance)
- CRDT convergence tests for anti-entropy sync
- Gateway handler improvements

## Aphoria Code Linter (Phase 2A)
- RFC/OWASP corpus builders with network fetching and caching
- Concept hierarchy with auto-alias creation on conflict detection
- Multiple security extractors (TLS, JWT, CORS, secrets, rate limiting)

## Code Organization
- Split large files into modules to comply with 500-line limit
- Improved test organization with separate test modules
- Fixed rkyv serialization for EigenTrustState (AgentScore struct)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 00:43:37 -07:00
jordan
3320c24afa feat: WAL hardening (Phase 5B) - CRC32C, crash recovery, group commit, log rotation
Add CRC32C checksums to WAL record format (v2), implement crash recovery
with automatic truncation of corrupt records, add feature-gated group commit
buffer for batched fsync under concurrent load, and implement log rotation
via segment files with global offset addressing.

Key changes:
- Record format v2: [len:u32][crc32c:u32][blake3:32][payload:N]
- recover_file() scans and truncates corrupt tail records
- GroupCommitBuffer batches fsync via MPSC channel (tokio feature gate)
- SegmentManager with binary search resolution and cursor-based cleanup
- Journal::read() auto-refreshes segments on miss for writer/reader split
- Split recovery.rs and key_codec.rs into directory modules for 500-line max

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 12:36:35 -07:00
jordan
55349845d0 refactor: Split all files to enforce 500-line max
Break monolith source files into focused modules:
- stemedb-core/types.rs → types/ directory (assertion, source, gold_standard, etc.)
- stemedb-storage: audit_store, quota_store, trust_rank_store, vector_index, vote_store → module directories
- stemedb-ingest/worker.rs → worker/ with separate test modules
- stemedb-query: engine, materializer, query → module directories
- stemedb-lens: epoch_aware, skeptic → module directories
- stemedb-sim/lib.rs → agent, arenas/, helpers, runner, strategy, types
- stemedb-api/tests: integration_tests → http_basic, http_validation, http_epoch, http_pipeline
- stemedb-api/tests: e2e_flow_test → e2e_full_pipeline, e2e_lens_resolution
- stemedb-query/tests: e2e_pipeline → e2e_pipeline + e2e_decay

Also adds new features: gold standard verification, escalation handlers,
admin endpoints, concept hierarchy spec, arena roadmap, and Go SDK.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 01:13:45 -07:00
jordan
c59066949a feat: Add quickstart "Beyond Hello World" sections with Skeptic and Layered endpoints
- Add Layered() method to Go SDK for per-source-class consensus queries
- Add LayeredQueryParams, LayeredResult, TierResolution types to Go SDK
- Create conflict example demonstrating Skeptic and Layered endpoints
- Update quickstart.md with sections 6 (conflict detection) and 7 (authority tiers)
- Remove tracked Go binary and add data/ to .gitignore

The new quickstart sections demonstrate Episteme's differentiating features:
- Skeptic endpoint shows "Trust but Verify" conflict analysis
- Layered endpoint shows per-tier resolution (Clinical vs Anecdotal)

Note: Pre-existing large files flagged by pre-commit hook (technical debt from prior sessions)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 21:00:59 -07:00
jordan
1ce4004807 feat: Complete Phase 2 (The Cortex) - query, lens, and API layers
This commit adds the read path (Cortex) to complement the write path (Spine):

## Crates
- stemedb-api: HTTP API with axum + utoipa OpenAPI
  - /v1/assert, /v1/query, /v1/epoch, /v1/skeptic, /v1/trace, /v1/audit
  - Metered endpoints with quota enforcement
  - Ed25519 signature verification
- stemedb-lens: Truth resolution lenses
  - RecencyLens, ConsensusLens, ConfidenceLens
  - VoteAwareConsensusLens (Ballot Box pattern)
  - TrustAwareAuthorityLens (The Hive pattern)
  - SkepticLens (conflict analysis)
  - EpochAwareLens (paradigm-safe queries)
- stemedb-query: Query engine with materialized views

## Storage Extensions
- VoteStore: Vote aggregation with cached counts
- TrustRankStore: Agent reputation with decay
- AuditStore: Query audit trail
- IndexStore: SP/P/S index structures
- SupersessionStore: Epoch supersession chains

## SDKs
- sdk/go/steme: Go HTTP client with Ed25519 signing
- sdk/go/adk: ADK-Go tools for AI agents

## Documentation
- Updated CLAUDE.md, architecture.md, roadmap.md
- New ai-lookup entries for all services
- Use case docs for consumer health intelligence
- Arena roadmap for simulation advancement

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 13:22:44 -07:00