# Lens

**Last Updated:** 2026-02-01
**Confidence:** High
**Status:** Implemented in `stemedb-lens` v0.1.0

## Summary

A Lens resolves conflicting assertions into a deterministic answer at read time. Multiple truths coexist; the Lens chooses which to return.

**Key Facts:**
- Stateless compute (no side effects)
- Deterministic (same input = same output)
- Fast (runs on every read, so avoid allocations)
- Pluggable (implement the `Lens` trait)

**File Pointer:** `crates/stemedb-lens/src/lib.rs`

## The Traits

### Synchronous Lens

```rust
pub trait Lens: Send + Sync {
    fn resolve(&self, candidates: &[Assertion]) -> Resolution;
    fn name(&self) -> &'static str;
}
```

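To make the "pluggable" claim concrete, here is a minimal sketch of a custom lens. The `Assertion` and `Resolution` structs below are simplified stand-ins invented for illustration (the real types in `stemedb-lens` are not shown in this document), and `LatestWins` is a hypothetical name, not the crate's own `Recency` lens.

```rust
// ASSUMPTION: simplified stand-ins for the real stemedb types.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    pub object: String,  // the claimed value
    pub timestamp: u64,  // when the assertion was made
    pub confidence: f64, // source-declared certainty in [0.0, 1.0]
}

#[derive(Debug, Clone, PartialEq)]
pub struct Resolution {
    pub winner: Option<Assertion>,
    pub confidence: f64,
}

// Same shape as the trait shown above.
pub trait Lens: Send + Sync {
    fn resolve(&self, candidates: &[Assertion]) -> Resolution;
    fn name(&self) -> &'static str;
}

/// Hypothetical lens: the assertion with the latest timestamp wins.
pub struct LatestWins;

impl Lens for LatestWins {
    fn resolve(&self, candidates: &[Assertion]) -> Resolution {
        // Scan by reference and clone only the winner, so the hot path
        // performs a single allocation-bearing clone at most.
        let winner = candidates.iter().max_by_key(|a| a.timestamp).cloned();
        let confidence = winner.as_ref().map_or(0.0, |a| a.confidence);
        Resolution { winner, confidence }
    }

    fn name(&self) -> &'static str {
        "latest-wins"
    }
}
```

The lens holds no state and depends only on its input slice, which is what keeps it deterministic: the same candidates in the same order always produce the same `Resolution`.
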
### Async Lens

For lenses requiring I/O (e.g., VoteStore lookups):

```rust
#[async_trait]
pub trait AsyncLens: Send + Sync {
    async fn resolve_async(&self, candidates: &[Assertion]) -> Resolution;
    fn name(&self) -> &'static str;
}
```

### Analysis Lens (Trust but Verify)

For lenses that surface conflict instead of resolving it:

```rust
#[async_trait]
pub trait AnalysisLens: Send + Sync {
    async fn analyze(&self, candidates: &[Assertion]) -> ConflictAnalysis;
    fn name(&self) -> &'static str;
}
```

Returns `ConflictAnalysis` with:
- `status`: Unanimous, Agreed, or Contested
- `conflict_score`: 0.0 (unanimous) to 1.0 (chaos) using normalized Shannon entropy
- `claims`: All distinct claims ranked by weight share

## VoteAwareConsensus Implementation

The `VoteAwareConsensusLens` integrates with the Ballot Box pattern (VoteStore) to resolve based on actual vote counts.

**Resolution Strategy:**
1. For each candidate assertion, look up its vote count and aggregate weight (O(1), cached)
2. Rank by aggregate weight (sum of all vote weights)
3. Return the assertion with the highest aggregate weight
4. Tiebreaker: If weights are equal, prefer the most recent timestamp

**Confidence Calculation:**
```
confidence = winner_weight / total_weight_across_all_candidates
```

**Example:**
```rust
// AsyncLens must be in scope to call `resolve_async`.
use stemedb_lens::{AsyncLens, VoteAwareConsensusLens};
use stemedb_storage::{HybridStore, GenericVoteStore};
use std::sync::Arc;

let store = HybridStore::open("./data").await?;
let vote_store = Arc::new(GenericVoteStore::new(store));
let lens = VoteAwareConsensusLens::new(vote_store);

let resolution = lens.resolve_async(&candidates).await;
```

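The ranking, tiebreak, and confidence rules for `VoteAwareConsensusLens` can be sketched without any I/O. This is an illustration under stated assumptions, not the crate's code: the `Assertion` struct is a stand-in, the `HashMap` plays the role of cached VoteStore lookups, and `resolve_by_votes` is a hypothetical name.

```rust
use std::collections::HashMap;

// ASSUMPTION: stand-in type with only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    pub hash: String,
    pub timestamp: u64,
}

/// Pick the candidate with the highest aggregate vote weight; ties go to
/// the most recent timestamp. Returns the winner and its confidence,
/// i.e. winner_weight / total_weight_across_all_candidates.
pub fn resolve_by_votes<'a>(
    candidates: &'a [Assertion],
    weights: &HashMap<String, f64>, // stand-in for cached VoteStore lookups
) -> Option<(&'a Assertion, f64)> {
    // Candidates without any votes count as weight 0.0.
    let weight_of = |a: &Assertion| weights.get(&a.hash).copied().unwrap_or(0.0);

    let total: f64 = candidates.iter().map(|a| weight_of(a)).sum();
    let winner = candidates.iter().max_by(|a, b| {
        weight_of(*a)
            .total_cmp(&weight_of(*b))
            .then(a.timestamp.cmp(&b.timestamp))
    })?;

    // Guard against dividing by zero when nobody has voted.
    let confidence = if total > 0.0 { weight_of(winner) / total } else { 0.0 };
    Some((winner, confidence))
}
```

With weights 6.0, 3.0, and 1.0 across three candidates, the first wins with confidence 6.0 / 10.0 = 0.6. Reporting 0.0 confidence when no votes exist is a choice made for this sketch; the source does not say how the real lens handles that case.
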
## TrustAwareAuthority Implementation

The `TrustAwareAuthorityLens` integrates with TrustRank to weight assertions by agent reputation. This is the foundation of "The Hive" learning loop.

**Resolution Strategy:**
1. For each candidate assertion, look up the primary signer's TrustRank (O(1) lookup)
2. Calculate the weighted score: `assertion.confidence * agent.trust_rank`
3. Return the assertion with the highest weighted score
4. Tiebreaker: If scores are equal, prefer the most recent timestamp
5. New agents default to a trust score of 0.5
6. Unsigned assertions are treated as 0.0 trust

**Confidence Calculation:**
```
weighted_score = assertion.confidence * agent.trust_rank
confidence = weighted_score  // Direct weighted score
```

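The scoring rules above, including the defaults for new agents and unsigned assertions, can be sketched as follows. The `Assertion` struct and the `HashMap` standing in for the TrustRank store are assumptions made for illustration, as are the function names.

```rust
use std::collections::HashMap;

const DEFAULT_TRUST: f64 = 0.5;  // agents with no recorded TrustRank
const UNSIGNED_TRUST: f64 = 0.0; // assertions without a signer

// ASSUMPTION: stand-in type with only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    pub signer: Option<String>, // primary signer's agent id, None if unsigned
    pub confidence: f64,
    pub timestamp: u64,
}

/// weighted_score = assertion.confidence * agent.trust_rank
pub fn weighted_score(a: &Assertion, trust: &HashMap<String, f64>) -> f64 {
    let rank = match &a.signer {
        Some(agent) => trust.get(agent).copied().unwrap_or(DEFAULT_TRUST),
        None => UNSIGNED_TRUST,
    };
    a.confidence * rank
}

/// Highest weighted score wins; ties go to the most recent timestamp.
/// The returned f64 is the winner's weighted score, used as confidence.
pub fn resolve_by_trust<'a>(
    candidates: &'a [Assertion],
    trust: &HashMap<String, f64>,
) -> Option<(&'a Assertion, f64)> {
    let winner = candidates.iter().max_by(|a, b| {
        weighted_score(a, trust)
            .total_cmp(&weighted_score(b, trust))
            .then(a.timestamp.cmp(&b.timestamp))
    })?;
    Some((winner, weighted_score(winner, trust)))
}
```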
**TrustRank Learning Loop:**
- Agents start at 0.5 (neutral)
- Accurate predictions: +0.05 per correct assertion
- Inaccurate predictions: -0.1 per incorrect assertion (the higher penalty discourages spam)
- Confidence half-life: Scores decay with a 30-day half-life by default
- Scores are bounded to [0.0, 1.0]

**Example:**
```rust
// AsyncLens must be in scope to call `resolve_async`.
use stemedb_lens::{AsyncLens, TrustAwareAuthorityLens};
use stemedb_storage::{HybridStore, GenericTrustRankStore};
use std::sync::Arc;

let store = HybridStore::open("./data").await?;
let trust_store = Arc::new(GenericTrustRankStore::new(store));
// Clone the Arc so `trust_store` is still usable after building the lens.
let lens = TrustAwareAuthorityLens::new(trust_store.clone());

let resolution = lens.resolve_async(&candidates).await;

// Record outcome for learning
trust_store.record_outcome(&agent_id, was_accurate, timestamp).await?;

// Apply decay periodically
trust_store.decay_trust_ranks(current_timestamp, None).await?;
```

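The arithmetic behind the learning loop is small enough to sketch directly. The reward, penalty, bounds, and 30-day half-life come from the list above. The decay target does not: the source does not say what scores decay toward, so this sketch assumes they drift back to the neutral 0.5. The function names are hypothetical and are not the store's `record_outcome` or `decay_trust_ranks` methods.

```rust
const NEUTRAL: f64 = 0.5;
const REWARD: f64 = 0.05;
const PENALTY: f64 = 0.10;
const HALF_LIFE_DAYS: f64 = 30.0;

/// Apply one outcome to a trust score and keep it inside [0.0, 1.0].
pub fn apply_outcome(score: f64, was_accurate: bool) -> f64 {
    let delta = if was_accurate { REWARD } else { -PENALTY };
    (score + delta).clamp(0.0, 1.0)
}

/// Exponential half-life decay.
/// ASSUMPTION: the score decays toward NEUTRAL, so the distance from 0.5
/// halves every HALF_LIFE_DAYS. The real store may decay differently.
pub fn decay_toward_neutral(score: f64, elapsed_days: f64) -> f64 {
    let factor = 0.5_f64.powf(elapsed_days / HALF_LIFE_DAYS);
    (NEUTRAL + (score - NEUTRAL) * factor).clamp(0.0, 1.0)
}
```

Under that assumption, an agent at 0.9 that stays idle for 30 days falls to 0.7, and one wrong assertion costs as much as two correct ones earn.
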
## Standard Lenses

| Lens | Strategy | Use Case | Status |
|------|----------|----------|--------|
| Recency | Latest timestamp wins | News, real-time | ✅ Implemented |
| Consensus | Most common object value | Democratic truth (basic) | ✅ Implemented |
| VoteAwareConsensus | Highest vote weight from VoteStore | Democratic truth (advanced) | ✅ Implemented |
| Confidence | Highest assertion `confidence` field | Source-declared certainty | ✅ Implemented |
| Authority | Alias for TrustAwareAuthority | Reputation-weighted (user-friendly name) | ✅ Implemented |
| TrustAwareAuthority | Weighted by TrustRank reputation | Expert truth (The Hive) | ✅ Implemented |
| **Skeptic** | Returns all claims with conflict score | "Trust but Verify" dashboards | ✅ Implemented |
| EpochAware | Filters superseded epochs first | Paradigm-safe queries | ✅ Implemented |
| Constraints | Returns `must_use`/`forbidden` predicates | Pre-flight checks | 🔜 Planned |

**Note:** The `Authority` lens is now an alias for `TrustAwareAuthority` (both use agent reputation via TrustRank). Use `Confidence` if you want to select by the assertion's self-declared confidence field without considering agent reputation.

## EpochAwareLens Implementation

The `EpochAwareLens` filters assertions from superseded epochs before delegating to an inner lens. This enables "paradigm-safe" queries where obsolete worldviews are automatically excluded.

**Resolution Strategy:**
1. Collect all unique epoch IDs from candidate assertions
2. For each epoch, read `E:{epoch_id}` from store
3. Walk the `supersedes` chain to build a set of superseded epoch IDs
4. Filter candidates: exclude any assertion whose epoch is in the superseded set
5. Delegate filtered candidates to inner lens (default: RecencyLens)

**Key Design Decisions:**

| Behavior | Choice | Rationale |
|----------|--------|-----------|
| Missing epoch record | Include assertion (fail-open) | Data availability > metadata consistency |
| Cycle in supersession chain | Stop walking, include assertions | Pathological data shouldn't hide valid assertions |
| Max depth exceeded (100) | Stop walking, log warning | Prevent infinite loops |
| No epochs in candidates | Delegate directly to inner lens | Optimization for common case |

**Use Case: Accounting Standard Migration (GAAP → IFRS)**

```bash
# Create epochs representing paradigm shift
POST /v1/epoch {"name": "GAAP-Era", "start_timestamp": 0}
# Returns epoch_id: "abc123..."

POST /v1/epoch {
  "name": "IFRS-Transition",
  "supersedes": "abc123...",
  "supersession_type": "Temporal",
  "start_timestamp": 1704067200
}
# Returns epoch_id: "def456..."

# Query with epoch awareness
GET /v1/query?subject=Acme&predicate=lease_liability&lens=EpochAware

# Returns IFRS treatment (new epoch)
# GAAP treatment (old epoch) automatically excluded
```

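The supersession walk and the fail-open choices in the design table can be sketched with in-memory maps. Everything here is an assumption for illustration: `Epoch`, `Candidate`, and both function names are hypothetical, and a `HashMap` stands in for `E:{epoch_id}` reads. The cycle rule is interpreted as "discard whatever that walk collected", which is one way to satisfy "stop walking, include assertions".

```rust
use std::collections::{HashMap, HashSet};

const MAX_DEPTH: usize = 100;

// ASSUMPTION: minimal epoch record; `supersedes` names the replaced epoch.
pub struct Epoch {
    pub supersedes: Option<String>,
}

// ASSUMPTION: minimal candidate; `epoch` is None for assertions without one.
pub struct Candidate {
    pub id: &'static str,
    pub epoch: Option<&'static str>,
}

/// Walk the `supersedes` chain from each epoch seen in the candidates and
/// collect every epoch id that has been superseded.
pub fn superseded_epochs(
    candidate_epochs: &[&str],
    store: &HashMap<String, Epoch>,
) -> HashSet<String> {
    let mut superseded = HashSet::new();
    for &start in candidate_epochs {
        let mut chain: Vec<String> = Vec::new(); // superseded ids on this walk
        let mut visited: HashSet<String> = HashSet::new();
        visited.insert(start.to_string());
        let mut current = start.to_string();
        let mut cycle = false;
        // Bounded loop: a chain deeper than MAX_DEPTH simply stops here
        // (the real lens is described as logging a warning).
        for _ in 0..MAX_DEPTH {
            let prev = match store.get(&current).and_then(|e| e.supersedes.clone()) {
                Some(id) => id,
                None => break, // end of chain, or missing record (fail-open)
            };
            if !visited.insert(prev.clone()) {
                cycle = true; // pathological data: do not hide anything
                break;
            }
            chain.push(prev.clone());
            current = prev;
        }
        if !cycle {
            superseded.extend(chain);
        }
    }
    superseded
}

/// Keep candidates with no epoch or with an epoch that is not superseded.
pub fn filter_superseded<'a>(
    candidates: &'a [Candidate],
    store: &HashMap<String, Epoch>,
) -> Vec<&'a Candidate> {
    let epochs: Vec<&str> = candidates.iter().filter_map(|c| c.epoch).collect();
    let superseded = superseded_epochs(&epochs, store);
    candidates
        .iter()
        .filter(|c| c.epoch.map_or(true, |e| !superseded.contains(e)))
        .collect()
}
```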
**Example:**
```rust
// AsyncLens must be in scope to call `resolve_async`.
use stemedb_lens::{AsyncLens, EpochAwareLens};
use stemedb_storage::HybridStore;
use std::sync::Arc;

let store = Arc::new(HybridStore::open("./data").expect("store"));

// Default: filter superseded epochs, then pick most recent
let lens = EpochAwareLens::with_recency(store.clone());

// Custom: filter superseded epochs, then use consensus
use stemedb_lens::ConsensusLens;
let lens = EpochAwareLens::with_sync_lens(store, ConsensusLens);

let resolution = lens.resolve_async(&candidates).await;
```

**Limitation:** The lens filters an old epoch's assertions only when an assertion from the superseding epoch is also among the candidates. If a query returns only old-epoch assertions (none exist in the new epoch), they pass through. This fail-open behavior is intentional.

## SkepticLens Implementation

The `SkepticLens` surfaces conflict instead of hiding it. It implements `AnalysisLens` rather than `Lens`, returning a `ConflictAnalysis` with all competing claims.

**Resolution Strategy:**
1. Group assertions by object value
2. For each group, calculate the aggregate vote weight (or fall back to confidence)
3. Calculate normalized Shannon entropy as the conflict score
4. Determine status: Unanimous (<0.1), Agreed (<0.4), or Contested (>=0.4)
5. Build a ClaimSummary for each group with supporting agents and source provenance

**Conflict Score Formula:**
```
entropy = -sum(p * log2(p)) for each claim weight proportion
conflict_score = entropy / log2(num_claims)  // Normalized to 0.0-1.0
```

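A worked version of the formula and thresholds may help when checking scores by hand. The sketch takes one aggregate weight per distinct claim. Two details are assumptions rather than documented behavior: a single claim (where `log2(1) = 0` would otherwise divide by zero) scores 0.0, and zero-weight claims contribute nothing to the entropy. The names `conflict_score`, `status_for`, and `Status` are hypothetical.

```rust
#[derive(Debug, PartialEq)]
pub enum Status {
    Unanimous,
    Agreed,
    Contested,
}

/// Normalized Shannon entropy over per-claim weights, in [0.0, 1.0].
pub fn conflict_score(weights: &[f64]) -> f64 {
    let total: f64 = weights.iter().sum();
    // ASSUMPTION: fewer than two claims (or no weight) means no conflict.
    if weights.len() < 2 || total <= 0.0 {
        return 0.0;
    }
    let entropy: f64 = weights
        .iter()
        .map(|w| w / total)     // weight proportion p of each claim
        .filter(|p| *p > 0.0)   // treat 0 * log2(0) as 0
        .map(|p| -p * p.log2())
        .sum();
    entropy / (weights.len() as f64).log2()
}

/// Thresholds from the resolution strategy: <0.1, <0.4, >=0.4.
pub fn status_for(score: f64) -> Status {
    if score < 0.1 {
        Status::Unanimous
    } else if score < 0.4 {
        Status::Agreed
    } else {
        Status::Contested
    }
}
```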
**API Endpoint:**
```
GET /v1/skeptic?subject=Semaglutide&predicate=muscle_effect

{
  "status": "Contested",
  "conflict_score": 0.72,
  "claims": [
    {
      "value": {"type": "Text", "value": "Significant loss"},
      "weight_share": 0.45,
      "assertion_count": 12,
      "supporting_agents": [...]
    },
    {
      "value": {"type": "Text", "value": "Minimal loss"},
      "weight_share": 0.35,
      "assertion_count": 3
    }
  ],
  "candidates_count": 17
}
```

**Example:**
```rust
// AnalysisLens must be in scope to call `analyze`.
// ResolutionStatus is assumed to be exported from stemedb_lens.
use stemedb_lens::{AnalysisLens, ResolutionStatus, SkepticLens};
use stemedb_storage::{HybridStore, GenericVoteStore, GenericTrustRankStore};
use std::sync::Arc;

let store = HybridStore::open("./data").await?;
let vote_store = Arc::new(GenericVoteStore::new(store.clone()));
let trust_store = Arc::new(GenericTrustRankStore::new(store));
let lens = SkepticLens::new(vote_store, trust_store);

let analysis = lens.analyze(&candidates).await;
if analysis.status == ResolutionStatus::Contested {
    println!("⚠️ This fact is disputed! Conflict score: {:.2}", analysis.conflict_score);
}
```

## Lens::Constraints (Pre-Flight Check)

Special lens for agent safety. Returns rules, not facts.

```
GET /query?context=python_http&lens=constraints

-> Returns:
{
  "constraints": [
    { "must_use": "axios", "forbidden": "requests", "reason": "User correction" }
  ]
}
```

**Origin:** Solves the "Optimization Conflict" where agents forget corrections. Acts as a compiler error for agent intent.

See [agile-agent-team.md](../../use-cases/agile-agent-team.md#feature-6-persistent-learning-negative-constraints--the-gardener) for full explanation.

## Query Flow

1. Client: `GET(Subject="Tesla", Predicate="Revenue", Lens="Consensus")`
2. Index lookup: `SP:Tesla:Revenue` -> `[Hash1, Hash2, Hash3]`
3. Hydrate: Load assertions from hashes
4. Resolve: `ConsensusLens.resolve(&assertions)`
5. Return: Single deterministic answer with confidence

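
The five steps can be traced end to end with in-memory maps. This is a sketch under stated assumptions: `Db`, `query`, and `consensus` are hypothetical names, the maps stand in for the real index and assertion store, and reporting confidence as the winning value's share of candidates is an illustrative choice that the source does not specify.

```rust
use std::collections::HashMap;

// ASSUMPTION: stand-in assertion carrying only the claimed value.
#[derive(Debug, Clone, PartialEq)]
pub struct Assertion {
    pub object: String,
}

// ASSUMPTION: in-memory stand-ins for the index and the assertion store.
pub struct Db {
    pub index: HashMap<String, Vec<String>>,    // "SP:{subject}:{predicate}" -> hashes
    pub assertions: HashMap<String, Assertion>, // hash -> assertion
}

/// Basic consensus: the most common object value wins. Equal counts are
/// broken by the lexicographically smallest value, so the result does not
/// depend on HashMap iteration order.
pub fn consensus(candidates: &[Assertion]) -> Option<(String, f64)> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for a in candidates {
        *counts.entry(a.object.as_str()).or_insert(0) += 1;
    }
    let (value, count) = counts
        .into_iter()
        .max_by(|a, b| a.1.cmp(&b.1).then(b.0.cmp(a.0)))?;
    Some((value.to_string(), count as f64 / candidates.len() as f64))
}

/// Steps 2 to 5: index lookup, hydrate, resolve, return.
pub fn query(db: &Db, subject: &str, predicate: &str) -> Option<(String, f64)> {
    let key = format!("SP:{subject}:{predicate}");
    let hashes = db.index.get(&key)?; // step 2: index lookup
    let candidates: Vec<Assertion> = hashes
        .iter()
        .filter_map(|h| db.assertions.get(h).cloned()) // step 3: hydrate
        .collect();
    consensus(&candidates) // steps 4 and 5: resolve and return
}
```

The explicit tiebreak matters for the determinism guarantee: without it, two values with equal counts could swap places between runs.
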
## Related Topics

- [Assertion](./assertion.md)
- [stemedb-lens skill](../../.claude/skills/stemedb-lens/SKILL.md)