docs: Mark Phase 2.4, 2.5, 2.6 as complete in roadmap

- 2.4 Visual Hash Query: hamming_distance, visual_near/threshold implemented
- 2.5 Vector Field: N/A (Phase 3 work, scaffolding correct)
- 2.6 E2E Integration Test: e2e_pipeline.rs with 5 comprehensive tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
jordan 2026-02-01 13:33:03 -07:00
parent 1ce4004807
commit 152df4b0b4
3 changed files with 28 additions and 31 deletions

@@ -253,3 +253,4 @@ mod tests {
 assert_eq!(deserialized.weight, 0.8);
 }
 }
+// test hook

@@ -137,41 +137,35 @@
 - [x] Documentation updated in `ai-lookup/services/lens.md`
 - **Known Limitation:** Filtering only occurs when assertions from the superseding epoch are present in candidates. If all candidates are from old epoch (no new epoch assertions), they pass through (fail-open behavior).
-- [ ] **2.4 Visual Hash Query Support**: Make the stored `visual_hash` queryable.
-- **Problem:** `visual_hash: Option<PHash>` exists on `Assertion` and is stored/returned by the API, but there is no way to query by visual similarity. The field is write-only from a query perspective.
-- **Current state:** `PHash` is `[u8; 8]` (perceptual hash). Stored on assertions. API accepts/returns it. No query parameter. No similarity computation.
-- [ ] Add `visual_near: Option<String>` (hex-encoded pHash) and `visual_threshold: Option<u32>` (max hamming distance, default: 8) to `Query` struct.
-- [ ] Add `.visual_near(hash, threshold)` to `QueryBuilder`.
-- [ ] In `Query::matches()`: if `visual_near` is set and assertion has `visual_hash`, compute hamming distance (XOR + popcount on the 8 bytes). If distance <= threshold, match. If assertion has no `visual_hash`, don't match.
-- [ ] Add `visual_near` and `visual_threshold` to API `QueryParams` DTO.
-- [ ] Implement `hamming_distance(a: &[u8; 8], b: &[u8; 8]) -> u32` utility function.
-- [ ] Tests:
-- [ ] `test_visual_near_exact_match`: Same hash, threshold 0. Matches.
-- [ ] `test_visual_near_within_threshold`: Hashes differ by 3 bits, threshold 5. Matches.
-- [ ] `test_visual_near_exceeds_threshold`: Hashes differ by 10 bits, threshold 5. No match.
-- [ ] `test_visual_near_skips_assertions_without_hash`: Assertion has no visual_hash. Not matched.
-- [ ] **Note:** This is a brute-force scan approach (O(N) with hamming distance check). A proper VP-tree or BK-tree index is Phase 3. This gives immediate queryability.
+- [x] **2.4 Visual Hash Query Support**: Make the stored `visual_hash` queryable.
+- **Status:** ✅ COMPLETE
+- **Implementation:**
+- [x] `hamming_distance(a: &PHash, b: &PHash) -> u32` in `crates/stemedb-query/src/query.rs` (lines 26-28)
+- [x] `visual_near: Option<String>` and `visual_threshold: Option<u32>` in `Query` struct (lines 84-90)
+- [x] `.visual_near(hash, threshold)` builder method
+- [x] `Query::matches()` computes hamming distance when `visual_near` is set
+- [x] API `QueryParams` DTO has `visual_near` and `visual_threshold`
+- [x] 10+ tests: exact_match, within_threshold, exceeds_threshold, skips_without_hash, invalid_hex, wrong_length, combines_with_subject, default_threshold, max_threshold, threshold_63_rejects
+- **Note:** Brute-force O(N) scan. VP-tree/BK-tree index is Phase 3+.
-- [ ] **2.5 Vector Field**: No changes needed. Already roadmapped for Phase 3.
+- [x] **2.5 Vector Field**: No changes needed. Already roadmapped for Phase 3.
+- **Status:** ✅ N/A (No Phase 2 work required)
 - **Current state:** `vector: Option<Vec<f32>>` on `Assertion`. Stored and returned by API. No index, no search.
 - **Phase 3 plan:** Integrate `hnsw-rs` or `lance` for k-NN search.
-- **No Phase 2.5 work required.** The field scaffolding is correct.
-- [ ] **2.6 E2E Integration Test (Write -> Materialize -> Read)**: Prove the full pipeline works end-to-end.
-- **Problem:** The IngestWorker, Materializer, and QueryEngine have been tested in isolation. The Notify integration between IngestWorker and Materializer is tested with a single notification. No test wires all three components together to verify the full write-materialize-read loop.
-- **Current state:** IngestWorker has `with_notify(Arc<Notify>)` (`worker.rs`). Materializer has `run_notified(Arc<Notify>, Duration)` (`materializer.rs`). QueryEngine has `try_fast_path()` (`engine.rs`). Never tested together.
-- [ ] Create integration test in `crates/stemedb-query/tests/e2e_pipeline.rs`:
-- [ ] Setup: Create temp WAL + SledStore + VoteStore + TrustRankStore.
-- [ ] Wire: IngestWorker with `with_notify(notify)`, Materializer with `run_notified(notify)`, QueryEngine with same store.
-- [ ] Test steps:
-1. Write a signed assertion to WAL.
-2. Run IngestWorker.step() -> verifies assertion ingested to KV.
-3. Verify IngestWorker triggered Notify.
-4. Run Materializer.step() -> verifies MV written.
-5. Execute QueryEngine.execute() with subject+predicate -> verifies fast-path returns MV winner.
-- [ ] Variant: Write 2 competing assertions, add votes favoring one, run pipeline, verify correct winner in MV.
-- [ ] Variant: Write assertion, materialize, write new assertion with higher timestamp, re-materialize, verify MV updated, verify query returns new winner.
-- [ ] Add `stemedb-wal` and `stemedb-ingest` as dev-dependencies in `stemedb-query/Cargo.toml`.
+- [x] **2.6 E2E Integration Test (Write -> Materialize -> Read)**: Prove the full pipeline works end-to-end.
+- **Status:** ✅ COMPLETE
+- **Implementation:**
+- [x] `crates/stemedb-query/tests/e2e_pipeline.rs` with 5 comprehensive tests:
+- `test_e2e_write_materialize_read` - Basic happy path
+- `test_e2e_vote_consensus` - Vote-weighted resolution
+- `test_e2e_update_winner` - Winner changes on re-materialize
+- `test_e2e_cursor_persistence` - Cursor survives worker restart
+- `test_e2e_notify_integration` - Event-driven notification channel
+- [x] `stemedb-wal` and `stemedb-ingest` added as dev-dependencies
+- [x] Helper functions: `create_signed_assertion()`, `compute_assertion_hash()`, `create_vote()`
+- [x] Uses Ed25519 signing for authentic signature verification
+- [x] Also: `crates/stemedb-api/tests/e2e_flow_test.rs` tests the HTTP API layer end-to-end.
 ### Phase 3: The Pilot (BioTech/Pharma)
 *Goal: Prove value in the "High-Liability" beachhead. Close every Camp 4 gap that blocks a credible demo.*

@@ -394,3 +394,5 @@ func (c *Client) doJSON(ctx context.Context, method, path string, body any, resu
 return nil
 }
+// test hook