# Chaos Testing (Phase 8A)

The `stemedb-chaos` crate provides infrastructure for testing Episteme distributed clusters under failure conditions.

## Overview

Chaos testing verifies that Episteme clusters:
- Continue accepting writes during network partitions
- Converge correctly after a partition heals
- Handle node failures and recovery
- Maintain CRDT invariants under all conditions
- Handle clock skew correctly with HLC timestamps

## Components

### Test Harness

| Component | Purpose |
|-----------|---------|
| `ChaosNode` | Simulated cluster node with fault injection support |
| `TestCluster` | Manages N ChaosNodes with shared fault controllers |

### Fault Injection

| Controller | Capabilities |
|------------|--------------|
| `NetworkController` | Partitions, latency, message drops |
| `ClockController` | Clock skew injection for HLC testing |

### CRDT Property Verification

| Function | Verifies |
|----------|----------|
| `verify_commutativity()` | `merge(A, B) = merge(B, A)` |
| `verify_associativity()` | `merge(merge(A, B), C) = merge(A, merge(B, C))` |
| `verify_idempotence()` | `merge(A, A) = A` |

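
To make the three laws concrete, here is a minimal sketch that checks them on a toy grow-only set whose merge is set union. `GSet` and the `is_*` helpers are hypothetical names invented for this illustration; they are not the crate's `CrdtAssertionStore` or its `verify_*` functions, whose signatures are not shown in this document.

```rust
use std::collections::BTreeSet;

/// Toy grow-only set (G-Set); its merge is plain set union.
/// Hypothetical stand-in, not the crate's `CrdtAssertionStore`.
#[derive(Clone, Debug, PartialEq, Eq)]
struct GSet(BTreeSet<u64>);

impl GSet {
    fn from_items(items: &[u64]) -> Self {
        GSet(items.iter().copied().collect())
    }

    /// Merge two replicas by taking the union of their elements.
    fn merge(&self, other: &GSet) -> GSet {
        GSet(self.0.union(&other.0).copied().collect())
    }
}

/// merge(A, B) == merge(B, A)
fn is_commutative(a: &GSet, b: &GSet) -> bool {
    a.merge(b) == b.merge(a)
}

/// merge(merge(A, B), C) == merge(A, merge(B, C))
fn is_associative(a: &GSet, b: &GSet, c: &GSet) -> bool {
    a.merge(b).merge(c) == a.merge(&b.merge(c))
}

/// merge(A, A) == A
fn is_idempotent(a: &GSet) -> bool {
    a.merge(a) == *a
}

#[test]
fn gset_merge_obeys_crdt_laws() {
    let a = GSet::from_items(&[1, 2]);
    let b = GSet::from_items(&[2, 3]);
    let c = GSet::from_items(&[4]);
    assert!(is_commutative(&a, &b));
    assert!(is_associative(&a, &b, &c));
    assert!(is_idempotent(&a));
}
```

Together these laws are what allow replicas to exchange state in any order, any grouping, and any number of times while still arriving at the same result.
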
## Running Chaos Tests

```bash
# All chaos tests
cargo test -p stemedb-chaos

# Partition tests only
cargo test -p stemedb-chaos --test partition_tests

# Consistency tests only
cargo test -p stemedb-chaos --test consistency_tests

# Unit tests only
cargo test -p stemedb-chaos --lib
```

## Test Categories

### Partition Tests (8 tests)

| Test | Scenario |
|------|----------|
| `test_5_node_kill_2_convergence` | 5-node cluster survives 2 node failures |
| `test_partition_between_groups_convergence` | [0,1,2] vs [3,4] partition and heal |
| `test_message_reordering_convergence` | 100 writes in random order converge |
| `test_message_duplication_idempotent` | Repeated syncs don't create duplicates |
| `test_cascading_failure_recovery` | Sequential node failures and recovery |
| `test_swim_suspicion_not_false_positive` | Slow node marked Suspect, then Alive |
| `test_asymmetric_partition` | One-way partition (0→1 works, 1→0 blocked) |
| `test_write_availability_during_partition` | All nodes can write when fully partitioned |

### Consistency Tests (11 tests)

| Test | Scenario |
|------|----------|
| `test_crdt_eventual_consistency` | 1000 concurrent writes across 5 nodes |
| `test_crdt_commutativity` | Different merge orders produce same result |
| `test_crdt_associativity` | Merge grouping doesn't affect result |
| `test_crdt_idempotence` | Syncing same data repeatedly is stable |
| `test_hlc_handles_clock_skew` | ±5 second skew still converges |
| `test_hlc_monotonic_under_partition` | HLC remains monotonic during partition |
| `test_supersession_ordering_with_clock_skew` | HLC ordering with 2s skew |
| `test_concurrent_writes_same_subject_under_partition` | Both writes survive (append-only) |
| `test_large_merkle_diff_eventual_convergence` | 1500 vs 500 assertions converge |
| `test_all_crdt_properties` | Property-based verification |
| `test_eventual_consistency_property` | Eventual consistency verification |

## Example Usage

### Basic Cluster Test

```rust
use stemedb_chaos::TestCluster;

#[tokio::test]
async fn test_basic_convergence() {
    let mut cluster = TestCluster::spawn(3).await.expect("spawn");

    // Write to node 0
    cluster.get_node_mut(0)
        .write_assertion("subject", "pred", 1000)
        .await.expect("write");

    // Sync all nodes
    cluster.sync_all().await.expect("sync");

    // Verify convergence
    cluster.assert_converged();
}
```

### Partition Testing

```rust
use stemedb_chaos::TestCluster;

#[tokio::test]
async fn test_partition() {
    let mut cluster = TestCluster::spawn(4).await.expect("spawn");

    // Create partition: [0,1] vs [2,3]
    cluster.network().partition(&[0, 1], &[2, 3]);

    // Write to both sides
    cluster.get_node_mut(0).write_assertion("a", "pred", 1000).await.expect("write");
    cluster.get_node_mut(2).write_assertion("b", "pred", 2000).await.expect("write");

    // Heal and sync
    cluster.network().heal();
    cluster.sync_all().await.expect("sync");

    // Both writes survive
    cluster.assert_converged();
    assert_eq!(cluster.get_node(0).assertion_count(), 2);
}
```

### Clock Skew Testing

```rust
use stemedb_chaos::TestCluster;

#[tokio::test]
async fn test_clock_skew() {
    let mut cluster = TestCluster::spawn(2).await.expect("spawn");

    // Inject +5 second skew on node 0
    cluster.clock().inject_skew(0, 5000);

    // Verify skew is detected
    assert!(cluster.clock().has_significant_skew(0, 1));

    // Write with skewed timestamps
    cluster.get_node_mut(0).write_assertion("skewed", "pred", 1000).await.expect("write");

    // Cluster still converges
    cluster.sync_all().await.expect("sync");
    cluster.assert_converged();
}
```

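
For readers unfamiliar with why HLC timestamps tolerate skew, the following is a minimal sketch of the general hybrid logical clock technique. `Timestamp`, `Hlc`, `now`, and `observe` are hypothetical names for this illustration; the crate's `SkewedHlc` may be structured differently.

```rust
/// Hybrid logical clock timestamp: (physical ms, logical counter).
/// The derived ordering compares `physical` first, then `logical`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Timestamp {
    physical: u64,
    logical: u32,
}

/// Hypothetical HLC; not the crate's `SkewedHlc`.
struct Hlc {
    last: Timestamp,
}

impl Hlc {
    fn new() -> Self {
        Hlc { last: Timestamp { physical: 0, logical: 0 } }
    }

    /// Timestamp a local event given the (possibly skewed) wall clock in ms.
    fn now(&mut self, wall_ms: u64) -> Timestamp {
        if wall_ms > self.last.physical {
            self.last = Timestamp { physical: wall_ms, logical: 0 };
        } else {
            // Wall clock stalled or is behind: bump the logical counter instead.
            self.last.logical += 1;
        }
        self.last
    }

    /// Fold in a timestamp received from another node.
    fn observe(&mut self, remote: Timestamp, wall_ms: u64) -> Timestamp {
        let physical = wall_ms.max(self.last.physical).max(remote.physical);
        let logical = if physical == self.last.physical && physical == remote.physical {
            self.last.logical.max(remote.logical) + 1
        } else if physical == self.last.physical {
            self.last.logical + 1
        } else if physical == remote.physical {
            remote.logical + 1
        } else {
            0
        };
        self.last = Timestamp { physical, logical };
        self.last
    }
}

#[test]
fn hlc_stays_monotonic_under_skew() {
    let mut fast = Hlc::new(); // node whose wall clock runs 5 s ahead
    let mut slow = Hlc::new();
    let sent = fast.now(15_000); // true time 10_000 ms plus 5_000 ms skew
    let received = slow.observe(sent, 10_000);
    assert!(received > sent);
    let next = slow.now(10_001); // local wall clock is still behind the HLC
    assert!(next > received);
}
```

The receiving node adopts the larger physical component and advances the logical counter, so its later events still order after the skewed write even though its own wall clock is behind.
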
## Architecture

```
TestCluster
├── nodes: Vec<ChaosNode>
├── network: Arc<NetworkController>
└── clock: Arc<ClockController>

ChaosNode
├── crdt_store: CrdtAssertionStore
├── merkle_tree: MerkleTree
├── hash_to_data: HashMap<Hash, (Subject, Data)>
├── hlc: SkewedHlc (respects ClockController)
└── alive: bool (kill/revive simulation)

NetworkController
├── partitions: DashMap<(from, to), bool>
├── latencies: DashMap<(from, to), Duration>
└── drop_rates: DashMap<(from, to), f64>

ClockController
├── node_offsets: DashMap<node, offset_ms>
└── global_offset_ms: AtomicI64
```

## Design Decisions

### Channel-Based vs iptables/tc

**Chosen: Channel-based interception**

- Aligns with existing `SimNode` pattern in `partition_tolerance.rs`
- Deterministic and CI-friendly (no elevated privileges)
- Production code stays unchanged
- Real network tests can be added later as optional e2e suite

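
As a rough illustration of why in-process interception is deterministic, the sketch below models partitions as a set of blocked directed links that is consulted before each delivery. `PartitionTable` and its methods are hypothetical; the real `NetworkController` is described above as using `DashMap` and also tracks latencies and drop rates, which this single-threaded sketch omits.

```rust
use std::collections::HashSet;

/// Hypothetical, single-threaded stand-in for a fault controller.
/// Blocked links are stored as directed (from, to) pairs.
#[derive(Default)]
struct PartitionTable {
    blocked: HashSet<(usize, usize)>,
}

impl PartitionTable {
    /// Block traffic in both directions between every node in `a` and every node in `b`.
    fn partition(&mut self, a: &[usize], b: &[usize]) {
        for &x in a {
            for &y in b {
                self.blocked.insert((x, y));
                self.blocked.insert((y, x));
            }
        }
    }

    /// Block only the `from -> to` direction (asymmetric partition).
    fn block_one_way(&mut self, from: usize, to: usize) {
        self.blocked.insert((from, to));
    }

    /// Remove every partition.
    fn heal(&mut self) {
        self.blocked.clear();
    }

    /// Would a message sent from `from` reach `to`?
    fn can_deliver(&self, from: usize, to: usize) -> bool {
        !self.blocked.contains(&(from, to))
    }
}

#[test]
fn partitions_block_and_heal() {
    let mut net = PartitionTable::default();
    net.partition(&[0, 1], &[2, 3]);
    assert!(!net.can_deliver(0, 2));
    assert!(net.can_deliver(0, 1));
    net.heal();
    net.block_one_way(1, 0);
    assert!(net.can_deliver(0, 1));
    assert!(!net.can_deliver(1, 0));
}
```

Because the fault state is ordinary data rather than kernel packet-filter rules, a test can set it, assert on it, and clear it without elevated privileges.
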
### Sync Semantics

- `sync_from()` on ChaosNode checks partition state before syncing
- `sync_all()` on TestCluster does full mesh sync respecting partitions
- Content-addressed storage ensures idempotent merges

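
The link between content addressing and idempotent merges can be sketched as follows. `Store`, `content_hash`, and the `reachable` flag are assumptions made for this example, and `DefaultHasher` is used only to keep the sketch std-only; the hash function the crate actually uses is not stated here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical content-addressed store: each entry is keyed by a hash of its content.
#[derive(Default, Clone, Debug, PartialEq, Eq)]
struct Store {
    entries: BTreeMap<u64, String>,
}

/// Non-cryptographic hash, used here purely for illustration.
fn content_hash(data: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

impl Store {
    fn write(&mut self, data: &str) {
        self.entries.insert(content_hash(data), data.to_string());
    }

    /// Pull every entry from `peer` unless the link is partitioned.
    /// Returns false when the sync was skipped. Entries already present map
    /// to the same key, so repeating a sync cannot create duplicates.
    fn sync_from(&mut self, peer: &Store, reachable: bool) -> bool {
        if !reachable {
            return false;
        }
        for (hash, data) in &peer.entries {
            self.entries.entry(*hash).or_insert_with(|| data.clone());
        }
        true
    }
}

#[test]
fn repeated_sync_is_idempotent() {
    let mut a = Store::default();
    let mut b = Store::default();
    a.write("a:pred:1000");
    b.write("b:pred:2000");
    assert!(!a.sync_from(&b, false)); // partitioned: nothing is transferred
    assert!(a.sync_from(&b, true));
    assert!(a.sync_from(&b, true)); // second sync changes nothing
    assert_eq!(a.entries.len(), 2);
}
```

A full mesh sync would repeat `sync_from` for every ordered pair of nodes, passing the partition state for that pair.
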
## Metrics

The controllers track:
- `messages_dropped`: Total messages dropped (partition + drop rate)
- `messages_delayed`: Total messages delayed (latency)
- `partition_events`: Number of partition operations

```rust
let summary = cluster.summary();
println!("Dropped: {}", summary.messages_dropped);
println!("Delayed: {}", summary.messages_delayed);
println!("Max skew: {}ms", summary.max_clock_skew_ms);
```

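
One plausible way to collect such counters from shared controllers is with atomics, sketched below. `FaultMetrics` and its methods are hypothetical and only reuse the counter names listed above; how the crate actually stores or aggregates them is not shown in this document.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counter set; the field names mirror the metrics listed above.
#[derive(Default)]
struct FaultMetrics {
    messages_dropped: AtomicU64,
    messages_delayed: AtomicU64,
    partition_events: AtomicU64,
}

impl FaultMetrics {
    // Each recorder takes `&self`, so the struct can be shared behind an `Arc`.
    fn record_drop(&self) {
        self.messages_dropped.fetch_add(1, Ordering::Relaxed);
    }

    fn record_delay(&self) {
        self.messages_delayed.fetch_add(1, Ordering::Relaxed);
    }

    fn record_partition(&self) {
        self.partition_events.fetch_add(1, Ordering::Relaxed);
    }

    /// Read the counters as (dropped, delayed, partition_events).
    fn snapshot(&self) -> (u64, u64, u64) {
        (
            self.messages_dropped.load(Ordering::Relaxed),
            self.messages_delayed.load(Ordering::Relaxed),
            self.partition_events.load(Ordering::Relaxed),
        )
    }
}

#[test]
fn counters_accumulate() {
    let metrics = FaultMetrics::default();
    metrics.record_partition();
    metrics.record_drop();
    metrics.record_drop();
    metrics.record_delay();
    assert_eq!(metrics.snapshot(), (2, 1, 1));
}
```
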
## Related Documentation

- [Architecture](../../architecture.md) - Overall system design
- [Distributed Write Path](../../docs/research/distributed-write-path.md) - CRDT replication
- [Phase 6 UAT](./phase6-uat.md) - Cluster coordination tests