stemedb/applications/sentinel/roadmap.md
jordan 55349845d0 refactor: Split all files to enforce 500-line max
2026-02-02 01:13:45 -07:00


# Sentinel Roadmap
---
## Phase 0: StemeDB Foundation
Changes to the core database that Sentinel depends on. These ship before the CLI.
### 0.1 ConceptPath Type
Add the `ConceptPath` struct to `stemedb-core`, covering parsing, validation, the wire format (`scheme://segments/leaf`), prefix matching, and parent traversal. The change is backward-compatible: bare strings parse as `custom://{string}`.
**Depends on:** [concept-hierarchy spec](../../docs/specs/concept-hierarchy.md)
**Crate:** `stemedb-core`
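
A minimal sketch of what such a type could look like, using only the standard library. The struct layout, the `ParseError` variants, and the method names (`to_wire`, `leaf`, `parent`, `has_prefix`) are illustrative assumptions, not the API defined by the concept-hierarchy spec.

```rust
/// Hypothetical shape for a hierarchical concept path: `scheme://segment/.../leaf`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ConceptPath {
    scheme: String,
    segments: Vec<String>, // the last element is the leaf
}

#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    EmptyPath,
    EmptySegment,
}

impl ConceptPath {
    pub fn parse(input: &str) -> Result<Self, ParseError> {
        // Bare strings (no scheme) fall back to `custom://{string}`.
        let (scheme, rest) = match input.split_once("://") {
            Some((s, r)) if !s.is_empty() => (s, r),
            _ => ("custom", input),
        };
        if rest.is_empty() {
            return Err(ParseError::EmptyPath);
        }
        let segments: Vec<String> = rest.split('/').map(str::to_owned).collect();
        if segments.iter().any(|s| s.is_empty()) {
            return Err(ParseError::EmptySegment);
        }
        Ok(Self { scheme: scheme.to_owned(), segments })
    }

    /// Wire format: `scheme://segments/leaf`.
    pub fn to_wire(&self) -> String {
        format!("{}://{}", self.scheme, self.segments.join("/"))
    }

    pub fn leaf(&self) -> &str {
        self.segments.last().map(String::as_str).unwrap_or("")
    }

    /// Parent path, or `None` when only one segment is left.
    pub fn parent(&self) -> Option<Self> {
        if self.segments.len() <= 1 {
            return None;
        }
        let kept = self.segments[..self.segments.len() - 1].to_vec();
        Some(Self { scheme: self.scheme.clone(), segments: kept })
    }

    /// Segment-wise prefix match within the same scheme.
    pub fn has_prefix(&self, prefix: &ConceptPath) -> bool {
        self.scheme == prefix.scheme && self.segments.starts_with(&prefix.segments)
    }
}
```

Matching on whole segments rather than raw bytes means `code://rust/cit` is not treated as a prefix of `code://rust/citadeldb`.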
### 0.2 ConceptPath in Assertion
Replace `Assertion.subject: EntityId` with `Assertion.subject: ConceptPath`. Update the rkyv serialization and all downstream consumers (ingestion, query, lenses, API, tests).
**Crates:** `stemedb-core`, `stemedb-ingest`, `stemedb-query`, `stemedb-lens`, `stemedb-api`
### 0.3 Hierarchical Index
Update `IndexStore` key construction to use ConceptPath wire format. Verify that `scan_prefix` on `S:{concept_path}/` returns all descendants. No new index structure needed — the `/` in the path maps to byte-level prefix scanning.
**Crate:** `stemedb-storage`
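
The prefix-scan property can be illustrated with an ordered map standing in for the index. This is a sketch under the assumption that keys sort bytewise; the free function `scan_prefix` and the `u64` values are stand-ins here, not the real `IndexStore` interface.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Return every key that starts with `prefix`, in key order.
/// An ordered map stands in for the on-disk index in this sketch.
pub fn scan_prefix<'a>(index: &'a BTreeMap<String, u64>, prefix: &str) -> Vec<&'a str> {
    index
        // Seek to the first key >= prefix, then stop at the first non-match.
        .range::<str, _>((Bound::Included(prefix), Bound::Unbounded))
        .take_while(|(key, _)| key.starts_with(prefix))
        .map(|(key, _)| key.as_str())
        .collect()
}
```

Scanning with the trailing `/` (for example `S:code://rust/citadeldb/`) is what keeps a sibling such as `S:code://rust/citadeldb2/auth` out of the result.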
### 0.4 Alias Store
Add `CA:` (alias → canonical) and `CAR:` (canonical → all aliases) key prefixes. Implement alias resolution in the query path: look up aliases before the index scan, merge the results, and deduplicate. Resolution is transitive: an alias of an alias resolves to the same canonical path.
**Crates:** `stemedb-storage`, `stemedb-query`
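
A sketch of transitive resolution over two in-memory maps that mirror the `CA:` and `CAR:` prefixes. `AliasStore`, its fields, and its methods are hypothetical names for illustration. The cycle guard is an added assumption, since the roadmap does not say how alias cycles are handled.

```rust
use std::collections::{BTreeSet, HashMap};

/// In-memory stand-in for the two key prefixes.
#[derive(Default)]
pub struct AliasStore {
    to_canonical: HashMap<String, String>,         // CA:  alias -> canonical
    to_aliases: HashMap<String, BTreeSet<String>>, // CAR: canonical -> aliases
}

impl AliasStore {
    pub fn add_alias(&mut self, alias: &str, canonical: &str) {
        self.to_canonical.insert(alias.to_owned(), canonical.to_owned());
        self.to_aliases
            .entry(canonical.to_owned())
            .or_default()
            .insert(alias.to_owned());
    }

    /// Follow alias links until a path with no further mapping is reached.
    pub fn canonical(&self, path: &str) -> String {
        let mut current = path;
        let mut seen = BTreeSet::new();
        while let Some(next) = self.to_canonical.get(current) {
            if !seen.insert(current) {
                break; // cycle guard
            }
            current = next.as_str();
        }
        current.to_owned()
    }

    /// Every path that resolves to the same canonical path, deduplicated.
    pub fn equivalents(&self, path: &str) -> BTreeSet<String> {
        let mut out = BTreeSet::new();
        let mut stack = vec![self.canonical(path)];
        while let Some(p) = stack.pop() {
            if let Some(children) = self.to_aliases.get(&p) {
                stack.extend(children.iter().filter(|c| !out.contains(*c)).cloned());
            }
            out.insert(p);
        }
        out
    }
}
```

A query would then scan the index once per entry in `equivalents(subject)` and merge the results.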
### 0.5 Source Class Inference
Wire scheme-based tier inference into ingestion. If no explicit `source_class` is set, infer from ConceptPath scheme. `rfc://` → Tier 0, `code://` → Tier 3, etc.
**Crate:** `stemedb-ingest`
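
The inference rule can be sketched as a scheme lookup. The tier numbers for `rfc`, `owasp`, `vendor`, and `code` are the ones used throughout this roadmap. The function names and the `Option<u8>` representation are assumptions, and schemes not listed here are left unclassified rather than guessed.

```rust
/// Infer a tier from the scheme of a concept path's wire format.
pub fn infer_tier(path: &str) -> Option<u8> {
    let (scheme, _) = path.split_once("://")?;
    match scheme {
        "rfc" => Some(0),
        "owasp" => Some(1),
        "vendor" => Some(2),
        "code" => Some(3),
        _ => None, // unknown scheme: leave the decision to the caller
    }
}

/// An explicit source class always wins; inference is only a fallback.
pub fn effective_tier(explicit: Option<u8>, path: &str) -> Option<u8> {
    explicit.or_else(|| infer_tier(path))
}
```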
### 0.6 Concept API Endpoints
```
POST /v1/concepts/alias Create alias
GET /v1/concepts/aliases/{path} List aliases for a path
DELETE /v1/concepts/alias Remove alias
GET /v1/concepts/tree/{prefix} Browse hierarchy under prefix
GET /v1/concepts/suggest Suggested aliases (shared leaf detection)
```
**Crate:** `stemedb-api`
---
## Phase 1: Authoritative Corpus
Before Sentinel can find conflicts, Episteme needs the authoritative sources to conflict against.
### 1.1 RFC Ingester
A CLI tool (or ingestion module) that:
- Fetches RFC text from `rfc-editor.org` (text format, no PDF parsing needed)
- Extracts normative statements (MUST, MUST NOT, SHOULD, SHALL per RFC 2119)
- Maps each statement to a ConceptPath: `rfc://{number}/{topic}/{claim}`
- Ingests as Tier 0 assertions
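
The extraction step might start as a keyword scan over sentences. The sketch below handles only the four keywords listed above and splits sentences naively on `". "`, so section numbers and abbreviations in real RFC text would need more care. `Keyword`, `classify`, and `extract_normative` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Keyword {
    Must,
    MustNot,
    Should,
    Shall,
}

/// Find the first requirement keyword in a sentence (uppercase only).
pub fn classify(sentence: &str) -> Option<Keyword> {
    let words: Vec<&str> = sentence
        .split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_ascii_alphanumeric()))
        .collect();
    for (i, word) in words.iter().enumerate() {
        let next_is_not = words.get(i + 1) == Some(&"NOT");
        match *word {
            "MUST" if next_is_not => return Some(Keyword::MustNot),
            "MUST" => return Some(Keyword::Must),
            // "SHOULD NOT" and "SHALL NOT" are outside this sketch's keyword list.
            "SHOULD" if !next_is_not => return Some(Keyword::Should),
            "SHALL" if !next_is_not => return Some(Keyword::Shall),
            _ => {}
        }
    }
    None
}

/// Return each sentence that carries a keyword, with line breaks flattened.
pub fn extract_normative(text: &str) -> Vec<(Keyword, String)> {
    let flat = text.split_whitespace().collect::<Vec<_>>().join(" ");
    flat.split(". ")
        .filter_map(|s| classify(s).map(|k| (k, s.trim_end_matches('.').to_owned())))
        .collect()
}
```

Mapping each extracted sentence to a `{topic}/{claim}` leaf is the harder part and is not attempted here.
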
Start with a curated list of security-relevant RFCs:
| RFC | Topic |
|-----|-------|
| 7519 | JWT |
| 6749 | OAuth 2.0 |
| 6750 | Bearer tokens |
| 8446 | TLS 1.3 |
| 7525 | TLS best practices (obsoleted by RFC 9325) |
| 6238 | TOTP |
| 7617 | HTTP Basic Auth |
| 9110 | HTTP Semantics |
### 1.2 OWASP Ingester
Parse OWASP Cheat Sheets (markdown source on GitHub):
- Extract each recommendation as a claim
- Map to `owasp://cheatsheet/{topic}/{claim}`
- Ingest as Tier 1 assertions
Priority cheat sheets: Authentication, JWT, TLS, Secrets Management, Input Validation, Session Management.
### 1.3 Vendor Docs (Manual Bootstrap)
For v1, manually curate a small set of vendor doc claims:
- Postgres connection pool recommendations
- Redis timeout defaults
- Common HTTP client library defaults (reqwest, hyper, net/http)
These are `vendor://{product}/{topic}/{claim}` at Tier 2.
This doesn't need to be exhaustive. It needs to cover the claims that Sentinel's extractors will actually find in code.
---
## Phase 2: CLI Core
The Sentinel binary itself.
### 2.1 Project Walker
Input: a project root path.
Output: a list of files to scan, each tagged with:
- Language (rust, go, python, typescript, yaml, toml, json)
- ConceptPath prefix derived from directory structure
```
crates/citadeldb/src/auth/jwt.rs
  → language: rust
  → prefix: code://rust/citadeldb/auth/jwt
```
Normalization rules:
- Strip `src/`, `lib/`, `pkg/`, `internal/` (language boilerplate)
- Strip `crates/`, `packages/`, `apps/` (monorepo wrappers)
- Map `config/`, `deploy/`, `infra/` to `code://config/{project}/...`
- File extension determines language, not directory
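
The normalization rules above could be sketched as follows. This assumes forward-slash relative paths, treats a path as a config path only when its first directory is one of the config directories, and takes the project name as a parameter for that case. The function names and the extension table are illustrative assumptions.

```rust
const STRIPPED: [&str; 7] = ["src", "lib", "pkg", "internal", "crates", "packages", "apps"];
const CONFIG_DIRS: [&str; 3] = ["config", "deploy", "infra"];

/// The file extension decides the language, never the directory.
pub fn language_for_extension(ext: &str) -> Option<&'static str> {
    match ext {
        "rs" => Some("rust"),
        "go" => Some("go"),
        "py" => Some("python"),
        "ts" | "tsx" => Some("typescript"),
        "yaml" | "yml" => Some("yaml"),
        "toml" => Some("toml"),
        "json" => Some("json"),
        _ => None,
    }
}

/// Derive `(language, ConceptPath prefix)` for a file, or `None` if unsupported.
pub fn derive_prefix(project: &str, rel_path: &str) -> Option<(&'static str, String)> {
    let mut dirs: Vec<&str> = rel_path.split('/').filter(|s| !s.is_empty()).collect();
    let file = dirs.pop()?;
    let (stem, ext) = file.rsplit_once('.')?;
    let language = language_for_extension(ext)?;

    let is_config = dirs.first().map_or(false, |d| CONFIG_DIRS.contains(d));
    let mut segments: Vec<&str> = if is_config {
        // config/, deploy/, infra/ map to code://config/{project}/...
        let mut s = vec!["config", project];
        s.extend(dirs.iter().skip(1).copied());
        s
    } else {
        // Drop language boilerplate and monorepo wrapper directories.
        let mut s = vec![language];
        s.extend(dirs.iter().copied().filter(|d| !STRIPPED.contains(d)));
        s
    };
    segments.push(stem);
    Some((language, format!("code://{}", segments.join("/"))))
}
```

With these assumptions, `crates/citadeldb/src/auth/jwt.rs` yields the prefix from the example above, and `config/production.yaml` in project `citadeldb` yields `code://config/citadeldb/production`.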
### 2.2 Extractors
Each extractor is a module that:
- Takes a file path + content + language
- Returns a `Vec<ExtractedClaim>`
Ship these extractors in v1:
| Extractor | What it finds | Languages |
|-----------|--------------|-----------|
| `tls_verify` | TLS certificate verification disabled | rust, go, python, js/ts |
| `jwt_config` | JWT validation settings (aud, exp, alg) | rust, go, python, js/ts |
| `hardcoded_secrets` | Credentials in source (not .env) | all |
| `timeout_config` | HTTP/DB/Redis timeout values | all (config files) |
| `dep_versions` | Known-vulnerable dependency versions | Cargo.toml, go.mod, package.json, requirements.txt |
| `cors_config` | CORS allow-origin settings | rust, go, js/ts |
| `rate_limit` | Rate limiting disabled or unreasonable | rust, go, js/ts |
Extractors use regex + AST patterns, not LLMs. Each extractor declares:
- The patterns it searches for
- The ConceptPath leaf it maps to
- The predicate (e.g., `config_value`, `enabled`, `version`)
- How to extract the ObjectValue from the match
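
One possible shape for the extractor interface, with a deliberately simplified `tls_verify` implementation. Plain substring matching stands in for the regex and AST patterns described above, and the `Extractor` trait, the `ExtractedClaim` fields, and the `ObjectValue` variants shown are assumptions for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ObjectValue {
    Boolean(bool),
    Number(f64),
    Text(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct ExtractedClaim {
    pub leaf: String,            // ConceptPath leaf this claim maps to
    pub predicate: String,       // e.g. "enabled", "config_value", "version"
    pub value: ObjectValue,
    pub source_location: String, // "path:line"
}

pub trait Extractor {
    fn name(&self) -> &'static str;
    fn extract(&self, path: &str, content: &str, language: &str) -> Vec<ExtractedClaim>;
}

pub struct TlsVerify;

impl Extractor for TlsVerify {
    fn name(&self) -> &'static str {
        "tls_verify"
    }

    fn extract(&self, path: &str, content: &str, language: &str) -> Vec<ExtractedClaim> {
        // Well-known ways of switching certificate verification off.
        let patterns: &[&str] = match language {
            "rust" => &["danger_accept_invalid_certs(true)"],
            "go" => &["InsecureSkipVerify: true"],
            "python" => &["verify=False"],
            _ => &[],
        };
        content
            .lines()
            .enumerate()
            .filter(|(_, line)| patterns.iter().any(|p| line.contains(*p)))
            .map(|(i, _)| ExtractedClaim {
                leaf: "cert_verification".to_owned(),
                predicate: "enabled".to_owned(),
                value: ObjectValue::Boolean(false),
                source_location: format!("{}:{}", path, i + 1), // 1-based line numbers
            })
            .collect()
    }
}
```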
### 2.3 Ingestion Bridge
Connect extractor output to the Episteme ingestion pipeline:
```
ExtractedClaim {
    path: code://rust/citadeldb/auth/jwt/audience_validation
    predicate: "enabled"
    value: Boolean(false)
    source_location: "src/auth/jwt.rs:47"
    confidence: 1.0 // regex match, not heuristic
}

Assertion {
    subject: ConceptPath::parse("code://rust/citadeldb/auth/jwt/audience_validation")
    predicate: "enabled"
    object: ObjectValue::Boolean(false)
    source_class: SourceClass::Expert // inferred from code:// scheme
    source_hash: blake3(file_content)
    source_metadata: { "file": "src/auth/jwt.rs", "line": 47 }
    confidence: 1.0
    lifecycle: LifecycleStage::Approved // code is deployed, it's a fact about the code
}
```
The bridge handles:
- ConceptPath construction from extractor output
- Source hash computation (BLAKE3 of the file at scan time)
- Source metadata encoding (file path, line number, extraction method)
- Signing with the Sentinel agent's keypair
### 2.4 Conflict Query
After ingestion, query Episteme for each extracted concept:
```rust
for claim in extracted_claims {
    let results = query_engine.query(Query {
        subject: Some(claim.path.to_string()),
        resolve_aliases: true,
        hierarchical: false,
        lens: Some("skeptic"),
        ..Default::default()
    });
    if results.conflict_score > threshold {
        report.add_conflict(claim, results);
    }
}
```
The Skeptic lens returns all claims for the concept across all aliased paths, with a conflict score. If the code claim (Tier 3) contradicts an RFC claim (Tier 0), the conflict score will be high because of the tier spread.
### 2.5 Report Output
```
$ sentinel scan ./citadeldb --format table
┌──────────────────────────────────────────────────────────────────────┐
│ Sentinel Report: citadeldb │
│ Scanned: 142 files │ Claims: 23 │ Conflicts: 3 │
├──────────┬───────────────────────────────────────┬──────────┬───────┤
│ Verdict │ Concept │ Score │ Tier │
├──────────┼───────────────────────────────────────┼──────────┼───────┤
│ BLOCK │ auth/jwt/audience_validation │ 0.92 │ 0↔3 │
│ BLOCK │ net/tls/cert_verification │ 0.87 │ 1↔3 │
│ FLAG │ http/timeout │ 0.54 │ 2↔3 │
└──────────┴───────────────────────────────────────┴──────────┴───────┘
Details:
BLOCK code://rust/citadeldb/auth/jwt/audience_validation
Your code: aud validation disabled (src/auth/jwt.rs:47)
RFC 7519: aud validation MUST be enabled (Tier 0)
Action: Fix or acknowledge with: sentinel ack <path> --reason "..."
BLOCK code://rust/citadeldb/net/tls/cert_verification
Your code: verify = false (src/net/client.rs:23)
OWASP: verification required (Tier 1)
Action: Fix or acknowledge with: sentinel ack <path> --reason "..."
FLAG code://rust/citadeldb/http/timeout
Your code: timeout = 0 (infinite) (config/production.yaml:8)
reqwest: default timeout 30s (Tier 2)
Action: Review recommended
```
Output formats: `table` (default), `json`, `sarif` (for CI integration), `markdown`.
### 2.6 Acknowledge Command
```
$ sentinel ack code://rust/citadeldb/auth/jwt/audience_validation \
--reason "Internal service, no external JWT consumers. Accepted risk per SEC-2024-003."
```
This creates a new Assertion:
- Subject: `internal://decision/citadeldb/auth/jwt/audience_validation`
- Predicate: `deviation_accepted`
- Object: Text with the reason
- SourceClass: Expert (Tier 3)
- Aliased to: `code://rust/citadeldb/auth/jwt/audience_validation`
The conflict still exists in Episteme, but the acknowledgment is recorded. On the next scan the conflict still appears, now with context: "Acknowledged by [agent] on [date]: [reason]." The Skeptic lens sees the acknowledgment as an additional claim in the space.
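
The subject mapping in the example above drops the `code://` scheme and the language segment and re-roots the remainder under `internal://decision/`. A minimal sketch of that rewrite, with `decision_subject` as a hypothetical helper name:

```rust
/// Map a `code://{language}/{rest}` path to `internal://decision/{rest}`.
/// Returns `None` for paths that are not code paths or have nothing after the language.
pub fn decision_subject(code_path: &str) -> Option<String> {
    let rest = code_path.strip_prefix("code://")?;
    let (_language, tail) = rest.split_once('/')?;
    if tail.is_empty() {
        return None;
    }
    Some(format!("internal://decision/{}", tail))
}
```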
---
## Phase 3: Skill Integration
### 3.1 Claude Code Skill
A `/sentinel` skill that wraps the CLI:
```
/sentinel scan Scan current project, report conflicts
/sentinel scan --fix Scan and offer to fix each conflict
/sentinel ack <path> Acknowledge a conflict with a reason
/sentinel status Show current conflict summary
/sentinel diff Show new conflicts since last scan
```
The skill runs the CLI binary, parses the JSON output, and presents results inline in the Claude Code session.
### 3.2 Agent Pre-Flight Hook
A Claude Code hook that runs Sentinel before certain operations:
```json
{
  "hooks": {
    "pre-commit": "sentinel scan --format sarif --exit-code",
    "pre-deploy": "sentinel scan --strict --exit-code"
  }
}
```
`--exit-code` returns non-zero if any BLOCK verdicts exist, preventing the commit or deploy.
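
The exit-code rule reduces to a check over the verdicts in a report. In this sketch the `Verdict` enum and the concrete value `1` are assumptions, since the roadmap only requires a non-zero status.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Block,
    Flag,
}

/// Non-zero when at least one BLOCK verdict is present; FLAG alone passes.
pub fn exit_code(verdicts: &[Verdict]) -> i32 {
    if verdicts.contains(&Verdict::Block) {
        1
    } else {
        0
    }
}
```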
### 3.3 Alias Suggestion Workflow
When Sentinel scans a new project and finds concepts that share leaf names with existing authoritative paths, it prompts:
```
New concept detected: code://rust/newproject/auth/jwt/audience_validation
Suggested alias:
→ rfc://7519/jwt/audience_validation (Tier 0, RFC 7519 Section 4.1.3)
Accept? [y/n/defer]
```
Accepting creates the alias. Deferring flags it for later review. Rejecting records that these are intentionally different concepts.
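
Shared-leaf detection can be sketched as a comparison of the final path segment. `leaf` and `suggest_aliases` are hypothetical helpers. A real implementation would presumably rank candidates by tier and skip pairs that were previously rejected.

```rust
/// Final segment of a path's wire format.
pub fn leaf(path: &str) -> &str {
    path.rsplit('/').next().unwrap_or(path)
}

/// Authoritative paths that share a leaf name with the newly seen path.
pub fn suggest_aliases<'a>(new_path: &str, authoritative: &[&'a str]) -> Vec<&'a str> {
    authoritative
        .iter()
        .copied()
        .filter(|candidate| *candidate != new_path && leaf(candidate) == leaf(new_path))
        .collect()
}
```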
---
## Phase 4: CI Integration
### 4.1 GitHub Action
```yaml
- name: Sentinel Scan
  uses: orchard9/sentinel-action@v1
  with:
    episteme-url: ${{ secrets.EPISTEME_URL }}
    fail-on: block
    format: sarif
```
Publishes SARIF results to GitHub Security tab. BLOCK verdicts fail the check. FLAG verdicts appear as warnings.
### 4.2 PR Comment Bot
On pull request, Sentinel scans the diff (not the whole project) and comments:
```
## Sentinel Report
This PR introduces 1 new conflict:
| File | Conflict | Score |
|------|----------|-------|
| src/auth/jwt.rs:47 | Disables aud validation (RFC 7519 requires it) | 0.92 |
Run `sentinel ack` to acknowledge, or fix before merge.
```
### 4.3 Baseline Mode
For existing projects with many conflicts, `sentinel baseline` records the current state. Subsequent scans only report *new* conflicts. This prevents the "500 warnings so we ignore all of them" problem.
```
$ sentinel baseline
Baseline recorded: 12 existing conflicts frozen.
Future scans will only report new conflicts.
```
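
Baseline filtering amounts to a set difference between the frozen conflicts and the current scan. The sketch assumes each conflict is identified by its concept path. How Sentinel actually keys a conflict is not specified here.

```rust
use std::collections::BTreeSet;

/// Conflicts present in the current scan but absent from the recorded baseline.
pub fn new_conflicts(baseline: &BTreeSet<String>, current: &BTreeSet<String>) -> Vec<String> {
    current.difference(baseline).cloned().collect()
}
```

A conflict that is fixed and later reintroduced would still be hidden under this scheme unless the baseline is re-recorded.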
---
## Phase 5: Research Agent Loop
### 5.1 Gap Detection
When Sentinel extracts a claim and no authoritative source exists for that concept, log it as a gap:
```
GAP: code://rust/citadeldb/cache/redis/max_memory_policy
No authoritative source found for redis/max_memory_policy
Seen in 3 projects
```
### 5.2 Research Agent Trigger
When a gap is seen across N projects (configurable, default 3), dispatch a research agent:
1. Agent searches for authoritative documentation on `redis max_memory_policy`
2. Finds Redis official docs
3. Extracts normative claims: "default is `noeviction`, recommended `allkeys-lru` for cache use cases"
4. Ingests as `vendor://redis/cache/max_memory_policy` at Tier 2
5. Future Sentinel scans now have something to conflict against
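
The trigger condition can be sketched as a per-concept set of distinct projects. `GapTracker` and its `record` method are hypothetical. Dispatching at most once per concept is an added assumption.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub struct GapTracker {
    threshold: usize,                           // N distinct projects
    seen: BTreeMap<String, BTreeSet<String>>,   // concept -> projects reporting the gap
    dispatched: BTreeSet<String>,               // concepts already sent to research
}

impl GapTracker {
    pub fn new(threshold: usize) -> Self {
        Self { threshold, seen: BTreeMap::new(), dispatched: BTreeSet::new() }
    }

    /// Record a gap sighting. Returns `true` exactly once per concept: when the
    /// number of distinct projects first reaches the threshold.
    pub fn record(&mut self, concept: &str, project: &str) -> bool {
        let projects = self.seen.entry(concept.to_owned()).or_default();
        projects.insert(project.to_owned());
        projects.len() >= self.threshold && self.dispatched.insert(concept.to_owned())
    }
}
```

Repeated scans of the same project do not move a concept closer to the threshold, because projects are stored as a set.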
### 5.3 Community Corpus Contributions
Users who run Sentinel can opt in to contribute their alias mappings and acknowledgment patterns (anonymized) to a shared corpus. Common patterns propagate:
- "Every Rust project has this JWT pattern" → pre-built alias set for Rust JWT libraries
- "This Redis config is always flagged and always acknowledged" → raise the default conflict threshold for that concept, so it is reported less often
- "This TLS pattern is always a real bug" → lower the default conflict threshold for that concept, so it is reported sooner
---
## Milestone Summary
| Phase | Deliverable | Depends On |
|-------|-------------|------------|
| 0 | ConceptPath in StemeDB | concept-hierarchy spec |
| 1 | Authoritative corpus (RFCs, OWASP) | Phase 0 |
| 2 | Sentinel CLI (scan, report, ack) | Phase 0, Phase 1 |
| 3 | Claude Code skill + hooks | Phase 2 |
| 4 | CI integration (GitHub Action, PR bot) | Phase 2 |
| 5 | Research agent loop | Phase 2, Phase 4 (gap data) |
Phase 0 and Phase 1 can run in parallel — the corpus ingestion uses the ConceptPath types as they're built. Phase 2 is the critical path. Everything after Phase 2 is distribution and flywheel.