feat(aphoria): implement claims architecture (A1-A5) with verify engine, corpus, coverage, and explain
Complete Aphoria claims system overhaul:

- A1: Rename ExtractedClaim to Observation (extractors produce observations, not claims)
- A2: Add AuthoredClaim with full provenance, invariants, and authority tiers
- A3: Verify engine comparing observations against authored claims, CLI + formatters
- A4: Corpus as first-class assertions with predicate indexing, authority lens, trust packs
- A5: Coverage analysis, explain/docs generation, self-audit extractor, claim suggester skill

Also includes: 42 extractors updated for Observation type, verifiable_predicates trait, conflict detection with comparison modes, claims TOML persistence, Grafana dashboard, backup/restore scripts, and comprehensive test coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 99b81adf8c
commit 3b5f88b4f0
239
.claude/skills/aphoria-claims/SKILL.md
Normal file
@ -0,0 +1,239 @@
---
name: aphoria-claims
description: Author and review claims during diff review. Use when reviewing PRs, git diffs, or code changes to identify claimable decisions, suggest new claims, and check existing claims for violations. Triggers on "review diff for claims", "what claims does this change need", "aphoria claims review", "author claims for this diff".
---

# Aphoria Claims Authoring Skill

You are an expert at identifying **architectural decisions, safety invariants, and policy requirements** hidden in code changes. Your job is to review diffs and help developers author proper claims with provenance, invariants, and consequences, not just observations.

## The Key Distinction

| | Observation | Claim |
|---|---|---|
| **Source** | Extractor (grep) | Human (deliberate) |
| **Example** | `ordering = SeqCst at line 42` | "All wallet atomics MUST use SeqCst" |
| **Has** | file, line, confidence | provenance, invariant, consequence, evidence |
| **Stored** | Ephemeral scan result | `.aphoria/claims.toml` (version-controlled) |

Observations describe what IS. Claims describe what MUST BE and WHY.
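
The split between the two columns can be made concrete with two types. This is only a sketch: the struct names echo the commit message, but every field name and the `check` function are assumptions drawn from the table and the CLI flags in this skill, not Aphoria's actual definitions. Only exact-match comparison is modelled.

```rust
/// What an extractor reports: a fact about the code as it currently is.
/// Field names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct Observation {
    pub file: String,
    pub line: usize,
    pub predicate: String,
    pub value: String,
    pub confidence: f32,
}

/// What a human asserts: a rule the code must satisfy, and why.
/// Field names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct AuthoredClaim {
    pub id: String,
    pub predicate: String,
    pub value: String,
    pub provenance: String,
    pub invariant: String,
    pub consequence: String,
    pub evidence: Vec<String>,
}

/// Compares one observation against one claim.
/// Returns `None` when the claim is about a different predicate,
/// `Some(true)` when the observed value equals the required value,
/// and `Some(false)` when it does not.
pub fn check(obs: &Observation, claim: &AuthoredClaim) -> Option<bool> {
    if obs.predicate != claim.predicate {
        return None;
    }
    Some(obs.value == claim.value)
}
```

Under these assumptions, an observation of `Relaxed` checked against a claim requiring `SeqCst` on the same predicate yields `Some(false)`, while an observation about an unrelated predicate yields `None`.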

## Workflow: Reviewing a Diff for Claims

### Step 1: Get the Diff

Read the git diff. If the user hasn't provided one:

```bash
git diff HEAD~1      # Last commit
git diff --staged    # Staged changes
git diff main...HEAD # Branch changes
```

### Step 2: Identify Claimable Patterns

Scan the diff for these categories:

| Pattern in Diff | Category | Claim Signal |
|---|---|---|
| New constant or magic number | `constants` | Why this value? What breaks if changed? |
| New `#[derive(...)]` or removed derive | `derives` | Why these traits? Safety implications? |
| New import / removed import | `imports` | Dependency boundary? Why allowed/forbidden? |
| Atomic ordering choice | `safety` | Race condition implications? |
| Error handling strategy | `architecture` | Why this approach? What's the fallback? |
| Configuration default | `constants` | Why this default? What's the valid range? |
| Access control / auth check | `safety` | What's protected? What if bypassed? |
| Cryptographic choice | `safety` | Why this algorithm? Regulatory requirement? |
| New public API surface | `architecture` | Stability commitment? Breaking change policy? |
| Feature flag or toggle | `architecture` | Rollback plan? Who controls it? |
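
A few of the rows above are mechanical enough to pre-screen before the reasoning step. The sketch below is a rough, hypothetical first pass over unified-diff lines for Rust source; the `Category` enum and `classify_changed_line` are illustrative names, and the string checks are deliberately crude. Rows such as error handling, access control, or API surface still need human or LLM judgment.

```rust
/// Claim categories named in this skill.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Safety,
    Architecture,
    Imports,
    Constants,
    Derives,
}

/// Guesses a category for one line of a unified diff.
/// Returns `None` for context lines, file headers, and anything unrecognised.
pub fn classify_changed_line(diff_line: &str) -> Option<Category> {
    // Skip the "+++ b/path" and "--- a/path" file headers.
    if diff_line.starts_with("+++ ") || diff_line.starts_with("--- ") {
        return None;
    }
    // Only added ('+') or removed ('-') lines are interesting.
    let code = diff_line
        .strip_prefix('+')
        .or_else(|| diff_line.strip_prefix('-'))?
        .trim_start();

    if code.contains("Ordering::") {
        Some(Category::Safety)
    } else if code.starts_with("#[derive(") {
        Some(Category::Derives)
    } else if code.starts_with("use ") {
        Some(Category::Imports)
    } else if code.starts_with("const ") || code.starts_with("pub const ") {
        Some(Category::Constants)
    } else {
        None
    }
}
```

A hit from this screen is only a prompt to ask the "Claim Signal" question for that row; it says nothing about whether a claim is actually warranted.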

### Step 3: Check Existing Claims

Load the project's claims and check if the diff violates any:

```bash
aphoria claims list --format json
```

For each changed file, ask:
- Does this change contradict an existing claim's invariant?
- Does this change make an existing claim's consequence possible?
- Does this change supersede an existing claim?

### Step 4: Draft Claims

For each claimable pattern found, draft using this template:

**Thinking through the claim:**
1. **What must be true?** (invariant): The rule that must hold
2. **Why?** (provenance): The analysis, decision, or spec that established this
3. **What breaks?** (consequence): The concrete failure mode if violated
4. **Says who?** (authority tier): How authoritative is this claim
5. **Proof?** (evidence): ADRs, specs, safety analyses, benchmarks
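
The five questions double as a completeness check on a draft. As a minimal sketch, assuming a hypothetical `DraftClaim` type that is not part of Aphoria, the helper below lists which questions are still unanswered so they can be put to the developer instead of being filled in by guesswork.

```rust
/// A claim that is still being drafted; every field may be missing.
/// This type is illustrative only.
#[derive(Debug, Default, Clone)]
pub struct DraftClaim {
    pub invariant: Option<String>,
    pub provenance: Option<String>,
    pub consequence: Option<String>,
    pub tier: Option<String>,
    pub evidence: Vec<String>,
}

/// Returns the template questions that the draft has not answered yet.
pub fn open_questions(draft: &DraftClaim) -> Vec<&'static str> {
    // A field counts as unanswered when it is absent or only whitespace.
    fn blank(field: &Option<String>) -> bool {
        field.as_deref().map_or(true, |s| s.trim().is_empty())
    }

    let mut questions = Vec::new();
    if blank(&draft.invariant) {
        questions.push("What must be true?");
    }
    if blank(&draft.provenance) {
        questions.push("Why?");
    }
    if blank(&draft.consequence) {
        questions.push("What breaks?");
    }
    if blank(&draft.tier) {
        questions.push("Says who?");
    }
    if draft.evidence.is_empty() {
        questions.push("Proof?");
    }
    questions
}
```

An empty result means the draft has something in every slot; it does not mean the answers are good.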

### Step 5: Create Claims via CLI

```bash
aphoria claims create \
  --id "<project>-<concept>-<seq>" \
  --concept-path "<project>/<module>/<concept>" \
  --predicate "<what_aspect>" \
  --value "<required_value>" \
  --provenance "<who/what established this>" \
  --invariant "<what MUST be true>" \
  --consequence "<what breaks if violated>" \
  --tier <regulatory|clinical|observational|expert|community|anecdotal> \
  --evidence "<reference>" \
  --category <safety|architecture|imports|constants|derives> \
  --by "<author>"
```

## Authority Tier Guide

When helping users pick a tier, use this lookup table:

| If the claim comes from... | Tier | Example |
|---|---|---|
| Law, regulation, compliance requirement | `regulatory` | "GDPR requires encryption at rest" |
| Published spec (RFC, OWASP, IEEE) | `clinical` | "RFC 7519 requires audience validation" |
| Benchmark data, load test results | `observational` | "Pool size >100 causes OOM under load" |
| Team lead / architect decision | `expert` | "All wallet atomics use SeqCst" |
| Convention, established pattern | `community` | "We use serde for serialization" |
| Individual opinion, preference | `anecdotal` | "I think 30s timeout is better" |

Most project claims are `expert` (team decisions) or `observational` (measured).
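
For tooling that needs to parse or compare tiers, the six names map naturally onto an enum. The sketch below assumes that the table's top-to-bottom order is also a strength order (`regulatory` strongest, `anecdotal` weakest); the guide does not state that explicitly, and `AuthorityTier` is an illustrative type rather than Aphoria's own.

```rust
use std::str::FromStr;

/// Authority tiers from the guide above.
/// Variants are declared weakest-first so that the derived ordering
/// ranks `Regulatory` highest (an assumption, see the lead-in).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum AuthorityTier {
    Anecdotal,
    Community,
    Expert,
    Observational,
    Clinical,
    Regulatory,
}

impl FromStr for AuthorityTier {
    type Err = String;

    /// Parses the lowercase tier names accepted by `--tier`.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "regulatory" => Ok(Self::Regulatory),
            "clinical" => Ok(Self::Clinical),
            "observational" => Ok(Self::Observational),
            "expert" => Ok(Self::Expert),
            "community" => Ok(Self::Community),
            "anecdotal" => Ok(Self::Anecdotal),
            other => Err(format!("unknown tier: {other}")),
        }
    }
}
```

With this ordering, `"expert".parse::<AuthorityTier>()` succeeds and compares as weaker than `AuthorityTier::Observational`.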

## Claim ID Convention

Format: `<project>-<concept>-<sequence>`

Examples:
- `maxwell-seqcst-001`: Maxwell project, SeqCst ordering, first claim
- `api-auth-jwt-001`: API project, JWT auth, first claim
- `core-no-tokio-001`: Core crate, no-tokio rule, first claim

Keep IDs short, readable, and referenceable in commit messages.
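
A convention like this is easy to lint. The sketch below is one possible reading of it, not a rule Aphoria is known to enforce: at least three hyphen-separated segments, lowercase ASCII letters or digits in the name segments, and an all-digit final sequence. The function name is hypothetical.

```rust
/// Checks an ID against one interpretation of `<project>-<concept>-<sequence>`.
/// The lowercase and all-digit requirements are assumptions inferred from
/// the three examples above.
pub fn is_conventional_claim_id(id: &str) -> bool {
    let parts: Vec<&str> = id.split('-').collect();
    // Need at least project, concept, and sequence.
    if parts.len() < 3 {
        return false;
    }
    let (seq, names) = match parts.split_last() {
        Some(pair) => pair,
        None => return false,
    };
    let seq_ok = !seq.is_empty() && seq.chars().all(|c| c.is_ascii_digit());
    let names_ok = names.iter().all(|p| {
        !p.is_empty()
            && p.chars()
                .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
    });
    seq_ok && names_ok
}
```

Under this reading, an ID such as `wallet-ordering-v2` would be rejected because its last segment is not purely numeric, so a real linter may want a looser rule for the sequence part.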

## Category Reference

| Category | What It Covers | Typical Invariants |
|---|---|---|
| `safety` | Race conditions, memory safety, data integrity | "MUST use X ordering", "MUST NOT allow Y" |
| `architecture` | Module boundaries, dependency rules, API contracts | "Module X MUST NOT depend on Y" |
| `imports` | Allowed/forbidden dependencies | "Core MUST NOT import tokio" |
| `constants` | Magic numbers, configuration values | "Timeout MUST be >= 30s" |
| `derives` | Required/forbidden trait derivations | "Message types MUST derive Serialize" |

## Example: Full Diff Review

Given this diff:
```diff
+ const MAX_POOL_SIZE: u32 = 50;
+ const MIN_POOL_SIZE: u32 = 5;
```

Output:
```
## Claims Identified

### 1. Pool Size Bounds (constants)
The diff introduces pool size constants. These deserve claims because changing
them affects system stability under load.

**Suggested claims:**

aphoria claims create \
  --id "myapp-pool-max-001" \
  --concept-path "myapp/db/pool/max_size" \
  --predicate "max_value" \
  --value "50" \
  --provenance "Load testing showed OOM above 50 connections" \
  --invariant "Database pool size MUST NOT exceed 50" \
  --consequence "OOM kill under sustained load (>500 req/s)" \
  --tier observational \
  --evidence "Load test report 2026-01-15" \
  --category constants \
  --by jml

### 2. No Existing Claim Violations
Checked 3 existing claims: none are affected by this change.
```

## Lifecycle Operations

When the diff supersedes or invalidates an existing claim:

```bash
# Update evidence on existing claim
aphoria claims update wallet-seqcst-001 \
  --evidence "New benchmark data from 2026-02"

# Supersede with new claim (old marked as superseded automatically)
aphoria claims supersede wallet-seqcst-001 \
  --new-id wallet-ordering-v2 \
  --value "Acquire" \
  --provenance "Updated safety analysis after AcqRel audit" \
  --by jml

# Deprecate if no longer relevant
aphoria claims deprecate old-claim-001 \
  --reason "Module removed in refactor"
```

## Decision Points

### Is This Worth a Claim?

Not every code change needs a claim. Ask:

| Question | If Yes | If No |
|---|---|---|
| Would violating this break something? | Claim it | Skip |
| Would a new team member need to know this? | Claim it | Skip |
| Is there a non-obvious reason for this choice? | Claim it | Skip |
| Is this a temporary implementation detail? | Skip | - |
| Is this enforced by the type system already? | Skip | - |
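
The table does not say what happens when the rows disagree, for example a non-obvious choice that is also a temporary detail. The sketch below picks one plausible precedence, and that choice is an assumption: either "Skip" row vetoes the claim, otherwise any "Claim it" row is enough. `ClaimTriage` and `worth_a_claim` are illustrative names.

```rust
/// Yes/no answers to the five questions in the table above.
#[derive(Debug, Clone, Copy, Default)]
pub struct ClaimTriage {
    pub violation_breaks_something: bool,
    pub newcomer_needs_to_know: bool,
    pub non_obvious_reason: bool,
    pub temporary_detail: bool,
    pub enforced_by_type_system: bool,
}

/// Applies the table with "Skip" rows taking precedence (an assumed ordering).
pub fn worth_a_claim(t: ClaimTriage) -> bool {
    // Temporary details and compiler-enforced rules are never claimed.
    if t.temporary_detail || t.enforced_by_type_system {
        return false;
    }
    // Otherwise a single "yes" on the first three questions is enough.
    t.violation_breaks_something || t.newcomer_needs_to_know || t.non_obvious_reason
}
```

A team that prefers to claim temporary workarounds with real consequences would flip the precedence; the point of the sketch is only that the precedence has to be decided somewhere.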

### Claim vs Acknowledgment?

| Situation | Use |
|---|---|
| "This MUST be true going forward" | `aphoria claims create` |
| "We know this conflicts but it's intentional" | `aphoria ack add` |

## Output Format

When reviewing a diff, produce:

```markdown
## Claims Review for [diff description]

### New Claims Needed
1. **[claim-id]**: [invariant summary]
   - Category: [category]
   - Tier: [tier]
   - Rationale: [why this needs a claim]
   - Command: `aphoria claims create ...`

### Existing Claims Affected
1. **[claim-id]**: [what changed]
   - Action: Update / Supersede / Deprecate
   - Command: `aphoria claims [update|supersede|deprecate] ...`

### No Claim Needed
- [pattern]: [why it doesn't need a claim]
```

## Constraints

1. **Never invent provenance.** If you don't know WHY a value was chosen, ask the developer.
2. **Never guess consequences.** If you can't articulate what breaks, don't claim it.
3. **Prefer fewer, stronger claims** over many weak ones. A claim without a real consequence is noise.
4. **Match the project's existing claim style.** Run `aphoria claims list` first to see conventions.
5. **Always check existing claims first.** Don't duplicate. Supersede if updating.

## Related Skills

- `aphoria-dev`: Development guidelines for Aphoria
- `aphoria-self-review`: Evaluate scan quality and noise
- `extract-claims`: Extract claims from prose text (different from code diff review)
@ -313,8 +313,27 @@ When implementing features or fixing bugs, provide:
| 4A | Complete | Observation write-back |
| 4B | Complete | Drift detection |
| 4C | Complete | Staged scanning |
| **4D** | **Next** | Enhanced ack |
| 4D | Planned | Enhanced ack |
| 4E | Planned | Community contribution |
| 5 | Complete | Research agent loop |
| 6 | Complete | Trust Packs |
| 7 | Planned | Declarative extractors |
| A1 | Complete | Observations vs Claims type system |
| A2 | Complete | Claim authoring workflow + CLI |
| A3 | Complete | Verification engine + verify command |
| A4 | Complete | Corpus as assertions + authority lens |
| A5.1 | Complete | Coverage metrics (coverage.rs) |
| A5.2 | Complete | Docs generation (explain.rs + claims_explain) |
| **A5.3** | **Next** | Claim suggester skill (aphoria-suggest) |
| A5.4 | Complete | Onboarding mode (aphoria explain) |

## Related Skills

| Skill | Purpose |
|-------|---------|
| `aphoria-claims` | Review diffs for claimable changes (reactive) |
| `aphoria-suggest` | Suggest claims from patterns + gaps (proactive) |
| `aphoria-self-review` | Evaluate scan quality and noise |
| `aphoria-llm-optimization` | Optimize LLM extraction quality |
| `extract-claims` | Extract claims from prose text |
| `aphoria-install` | Install Aphoria for local dev |
225
.claude/skills/aphoria-suggest/SKILL.md
Normal file
@ -0,0 +1,225 @@
---
name: aphoria-suggest
description: Suggest new claims by analyzing existing patterns and unclaimed observations. Use when you want to grow claim coverage, find unclaimed code patterns, or bootstrap claims for a new project. Triggers on "suggest claims", "what needs claims", "aphoria suggest", "grow coverage", "bootstrap claims".
---

# Aphoria Claim Suggester

You are an expert at identifying **semantic patterns** across authored claims and recognizing analogous unclaimed observations that deserve claims. You use the Aphoria CLI as your data source and your reasoning as the intelligence layer.

## Core Principle: Skill Calls CLI

You do NOT train models or use embeddings. You:
1. Call the CLI to get structured data (claims + observations)
2. Reason over the data to find patterns
3. Suggest new claims with ready-to-run CLI commands

The "learning" is your ability to read existing claims, understand their semantic patterns, and apply that understanding to unclaimed observations.

## Workflow

### Phase 1: Gather Context

Run these commands to understand the project's current claim state:

```bash
# Get all authored claims (the "gold standard" examples)
aphoria claims list --format json

# Get verification results including unclaimed observations
aphoria verify run --format json --show-unclaimed

# Get coverage gaps
aphoria coverage --format json
```

### Phase 2: Determine Mode

Based on the claim count, choose your approach:

| Claim Count | Mode | Strategy |
|---|---|---|
| 0 | **Cold Start** | Bootstrap from project docs, tests, and conventions |
| 1-5 | **Foundation** | Extend existing patterns, fill obvious gaps |
| 6+ | **Flywheel** | Full analogical reasoning from established patterns |
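
The thresholds in the table translate directly into a match on the claim count. This is a small sketch with illustrative names (`SuggestMode`, `mode_for`); the count itself would come from the length of the `aphoria claims list --format json` output.

```rust
/// The three operating modes from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SuggestMode {
    ColdStart,
    Foundation,
    Flywheel,
}

/// Picks a mode from the number of authored claims.
pub fn mode_for(claim_count: usize) -> SuggestMode {
    match claim_count {
        0 => SuggestMode::ColdStart,
        1..=5 => SuggestMode::Foundation,
        _ => SuggestMode::Flywheel,
    }
}
```

A project with 4 claims therefore lands in Foundation mode, and the sixth claim is what tips it into Flywheel mode.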

### Phase 3a: Cold Start (0 Claims)

When no claims exist, bootstrap from external context:

1. **Read architecture docs**: `CLAUDE.md`, `README.md`, `docs/adr/`, `.claude/`
2. **Inspect tests for implicit invariants**: Property-based tests, assertion patterns, `#[should_panic]` tests
3. **Identify tech stack conventions**: What framework? What serialization? What auth pattern?
4. **Propose 3-5 foundation claims** in these categories:
   - **Safety**: Race conditions, data integrity, resource management
   - **Architecture**: Module boundaries, dependency rules
   - **Constants**: Magic numbers from specs, configuration bounds

Example cold start output:
```
## Bootstrap Claims Suggested

No existing claims found. Here are foundation claims based on project analysis:

### 1. [safety] Serialization Consistency
Reading tests in `tests/serialization.rs`: there's a roundtrip property test.

**Invariant:** All persistent types MUST implement roundtrip serialization
**Consequence:** Data corruption on disk or wire
**Evidence:** Property test at tests/serialization.rs:42

aphoria claims create \
  --id "project-serde-roundtrip-001" \
  --concept-path "project/types/serialization" \
  --predicate "roundtrip_safe" \
  --value "true" \
  --provenance "Property-based test coverage" \
  --invariant "All persistent types MUST serialize/deserialize without data loss" \
  --consequence "Data corruption in WAL or network protocol" \
  --tier expert \
  --evidence "tests/serialization.rs:42" \
  --category safety \
  --by "aphoria-suggest"
```

### Phase 3b: Foundation Mode (1-5 Claims)

With a few claims, extend the patterns:

1. **Identify the categories covered**: What has claims? Safety? Architecture?
2. **Find gaps in the same categories**: If there's a SeqCst claim for wallet, check other atomic code
3. **Suggest 2-3 claims** that extend existing patterns to new locations

### Phase 3c: Flywheel Mode (6+ Claims)

Full analogical reasoning:

1. **Group existing claims by semantic pattern** (not string matching):
   - "Ordering invariants" (SeqCst claims across modules)
   - "Boundary rules" (no-import claims for module isolation)
   - "Serialization requirements" (derive claims for wire types)
   - "Configuration bounds" (min/max value claims)

2. **For each unclaimed observation**, apply chain-of-thought:
   ```
   THINKING:
   - Observation: `Ordering::Relaxed` at sync/coordinator.rs:87
   - Most similar claim: wallet-seqcst-001 ("All wallet atomics MUST use SeqCst")
   - Similarity: Both involve atomic ordering in critical data paths
   - Difference: Coordinator vs wallet. Is coordinator also safety-critical?
   - Decision: YES. Coordinator manages distributed state, weakened ordering
     could cause split-brain. SUGGEST a claim.
   ```

3. **Rank suggestions by coverage impact**:
   - Modules with 0 claims but many observations = highest priority
   - Patterns that appear in 3+ locations = systematic invariant
   - Safety-category gaps > architecture > constants
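
The three ranking rules can be read as a sort key. The sketch below applies them lexicographically in the order listed, which is an assumption, since the skill does not say how ties between the rules resolve. `Suggestion`, `GapCategory`, and `rank` are hypothetical names, and the `limit` parameter reflects the constraint elsewhere in this skill that no more than 10 claims are suggested at once.

```rust
use std::cmp::Reverse;

/// Declared in priority order so the derived ordering sorts Safety first.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum GapCategory {
    Safety,
    Architecture,
    Constants,
}

/// A candidate suggestion plus the coverage facts used to rank it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Suggestion {
    pub id: String,
    pub category: GapCategory,
    /// Claims already authored for the suggestion's module.
    pub module_claims: usize,
    /// Observations reported for the suggestion's module.
    pub module_observations: usize,
    /// Number of locations where the same pattern appears.
    pub pattern_locations: usize,
}

/// Sorts suggestions by coverage impact and keeps at most `limit` of them.
pub fn rank(mut suggestions: Vec<Suggestion>, limit: usize) -> Vec<Suggestion> {
    suggestions.sort_by_key(|s| {
        let unclaimed_module = s.module_claims == 0;
        let systematic = s.pattern_locations >= 3;
        (
            // 1. Modules with no claims first, busiest modules ahead of quiet ones.
            Reverse(unclaimed_module),
            Reverse(if unclaimed_module { s.module_observations } else { 0 }),
            // 2. Patterns seen in 3+ locations next.
            Reverse(systematic),
            // 3. Safety before architecture before constants.
            s.category,
        )
    });
    suggestions.truncate(limit);
    suggestions
}
```

Calling `rank(candidates, 10)` returns the highest-impact suggestions in presentation order; because the sort is stable, candidates that tie on every rule keep their original relative order.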

### Phase 4: Output Suggestions

For each suggestion, produce:

```markdown
## Suggestion N: [Short Title]

**Reasoning:** [Chain-of-thought explanation]
**Analogous to:** [existing claim ID, if any]
**Coverage impact:** [module name] goes from X% to Y% claimed

aphoria claims create \
  --id "<suggested-id>" \
  --concept-path "<path>" \
  --predicate "<predicate>" \
  --value "<value>" \
  --provenance "<source>" \
  --invariant "<what MUST be true>" \
  --consequence "<what breaks>" \
  --tier <tier> \
  --evidence "<reference>" \
  --category <category> \
  --by "<author>"
```

## Context Management

To avoid context window saturation with large projects:

| Situation | Strategy |
|---|---|
| <50 claims, <200 observations | Load everything, reason holistically |
| 50-200 claims | Filter by `--category` relevant to current work |
| 200+ claims | Use coverage gaps to focus on highest-impact modules only |
| 1000+ observations | Use `aphoria coverage --sort-by unclaimed` to prioritize |

When filtering:
```bash
# Focus on safety claims only
aphoria claims list --format json --category safety

# Focus on a specific module
aphoria verify run --format json --show-unclaimed --path src/wallet/
```

## Quality Gates

Before suggesting a claim, verify it passes these checks:

| Check | Requirement |
|---|---|
| **Non-trivial** | Would violating this actually break something? |
| **Not type-system enforced** | The compiler doesn't already catch this |
| **Has a consequence** | You can articulate a specific failure mode |
| **Has provenance** | You can point to WHY this must be true |
| **Not a duplicate** | No existing claim covers this |
| **Testable** | An extractor can verify this observation |

## Anti-Patterns

**Do NOT suggest claims for:**
- Variable renames, whitespace changes, comment additions
- Patterns enforced by the type system or compiler
- Temporary implementation details ("TODO: refactor this")
- Generic boilerplate ("all functions should have docs")
- Observations where the value can never realistically change

**Do NOT generate:**
- Template garbage invariants ("This value MUST be what it is")
- Claims without specific consequences ("Bad things could happen")
- Claims with invented provenance ("Industry best practice")

## Integration with Existing Skills

This skill (`aphoria-suggest`) proactively scans for coverage gaps when a developer asks for suggestions. It complements:
- **aphoria-claims**: Reviews diffs for claimable changes (reactive, triggered by code changes)
- **aphoria-self-review**: Evaluates scan quality and noise

Typical workflow:
1. `aphoria-suggest` identifies systematic gaps → developer authors claims
2. `aphoria-claims` catches new claimable patterns in future diffs
3. More claims → better suggestions → flywheel spins

## Example Session

```
User: "suggest claims for this project"

Agent:
1. Runs `aphoria claims list --format json` → 4 claims (all safety category)
2. Runs `aphoria verify run --format json --show-unclaimed` → 23 unclaimed observations
3. Runs `aphoria coverage --format json` → 3 modules with 0 claims
4. Identifies: existing claims all about atomic ordering
5. Finds: 5 unclaimed observations also involve Ordering:: in different modules
6. Suggests: 3 new SeqCst claims for uncovered modules + 2 architecture boundary claims
7. Outputs: ready-to-run aphoria claims create commands with reasoning
```

## Constraints

1. **Never invent provenance.** If you don't know WHY, mark the tier as `community` and note "needs expert review."
2. **Never suggest more than 10 claims at once.** Prioritize by impact.
3. **Always show reasoning.** The developer should understand WHY you're suggesting each claim.
4. **Match existing style.** If project claims use formal MUST/SHALL language, match it.
5. **Prefer fewer strong claims** over many weak ones.
6. **Run coverage after suggesting.** Show the before/after impact.
@ -48,6 +48,8 @@ A probabilistic knowledge graph database that stores Claims, not Facts. Append-o
| **General LLM optimization** | Load skill: `llm-optimization` |
| **Install Aphoria** | Load skill: `aphoria-install` |
| **Run Aphoria self-review** | Load skill: `aphoria-self-review` |
| **Author claims from diffs** | Load skill: `aphoria-claims` |
| **Suggest new claims** | Load skill: `aphoria-suggest` |

## Roadmap Maintenance
@ -25,6 +25,34 @@ StemeDB exposes metrics in Prometheus format and provides admin endpoints for op
## Metrics Reference

### Application Metrics (stemedb-api)

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `stemedb_assertions_total` | Gauge | - | Total assertions in database (updated on health check) |
| `stemedb_assertions_ingested_total` | Counter | - | Assertions ingested via `POST /v1/assert` |
| `stemedb_queries_total` | Counter | `endpoint` | Queries executed (query, skeptic, layered, constraints) |
| `stemedb_query_latency_seconds` | Histogram | `endpoint` | End-to-end query latency by endpoint |
| `stemedb_quarantine_pending` | Gauge | - | Pending quarantine events (updated on health check) |
| `stemedb_circuit_breakers_open` | Gauge | - | Open circuit breakers (updated on health check) |

**Source files:**
- `handlers/health.rs`: gauges for assertions_total, quarantine_pending, circuit_breakers_open
- `handlers/assert.rs`: counter for assertions_ingested_total
- `handlers/query.rs`, `skeptic.rs`, `layered.rs`, `constraints.rs`: counter + histogram per endpoint

### Grafana Dashboard

A pre-built Grafana dashboard is available at `docs/grafana/stemedb-overview.json`.

**Rows:**
1. **Overview**: assertions_total, queries/sec, quarantine_pending, circuit_breakers_open (stat panels)
2. **Query Performance**: latency p50/p95/p99 histogram, queries by endpoint (time series)
3. **Cluster Health**: node counts, sync lag, convergence latency
4. **Write Path**: assertions ingested rate, sync throughput

Import via Grafana UI > Dashboards > Import. Uses `${DS_PROMETHEUS}` variable for datasource portability.

### Sync Metrics (stemedb-sync)

| Metric | Type | Labels | Description |
@ -25,6 +25,7 @@ stemedb-storage = { path = "../../crates/stemedb-storage" }
stemedb-ingest = { path = "../../crates/stemedb-ingest" }
stemedb-query = { path = "../../crates/stemedb-query" }
stemedb-wal = { path = "../../crates/stemedb-wal" }
stemedb-lens = { path = "../../crates/stemedb-lens" }

# CLI
clap = { version = "4.5", features = ["derive"] }
@ -35,6 +36,9 @@ tokio = { version = "1", features = ["full"] }
# File walking
ignore = "0.4"

# Parallel extraction
rayon = "1.10"

# Pattern matching
regex = "1.10"
globset = "0.4"
@ -1,203 +0,0 @@
# Scout & Judge: Hybrid Deterministic-Probabilistic Extraction Architecture

> **Status:** Proposed (2026-02-05)
> **Phase:** 7.9 (Replaces monolithic LLM extraction)
> **Context:** Evolution of Phase 7.5 (LLM-in-the-Loop)

---

## 1. Problem Statement

The current LLM extraction pipeline ("Monolithic Mode") treats code files as unstructured text. It feeds entire files to the LLM to find security claims.

**Issues with Monolithic Mode:**
1. **Cost:** 90% of a file is irrelevant to security (imports, UI logic, helpers), yet we pay for every token.
2. **Recall:** LLMs struggle to find "needles in haystacks" (long context window degradation).
3. **Hallucination:** Irrelevant code confuses the model, leading to false positives.
4. **Latency:** Processing large files is slow/blocking.

## 2. The Solution: Scout & Judge Architecture

We decouple the **discovery** of potential claims from the **analysis** of those claims.

* **The Scout (Deterministic):** Uses Abstract Syntax Trees (AST) via `tree-sitter` to find *Regions of Interest* (ROIs) locally, at parse speed and with no LLM token cost.
* **The Judge (Probabilistic):** Uses the LLM to analyze *only* the specific ROI snippet to extract semantic meaning and confidence.

### Architectural Diagram

```mermaid
graph TD
    File[Source File] -->|Input| Scout["AST Scout (Tree-sitter)"]

    subgraph "The Scout (Local/Fast)"
        Scout -->|Parse| AST
        AST -->|Query| Query[SCM Queries]
        Query -->|Match| Candidate[Candidate Node]
        Candidate -->|Expand| Snippet[Context Snippet]
    end

    Snippet -->|Input| Judge["LLM Judge (Gemini/Claude)"]

    subgraph "The Judge (Remote/Smart)"
        Judge -->|"Prompt: Analyze this specific call"| Claims[Structured Claims]
    end

    Claims -->|Output| Aggregator[Claim Aggregator]
```

---

## 3. Component Details

### 3.1 The Scout (Tree-sitter)

The Scout's job is **High Recall**. It should find *anything* that *might* be relevant. It does not need to be precise.

**Technology:** `tree-sitter` (Rust bindings)

**Workflow:**
1. **Detect Language:** Identify file type (Python, Go, Rust, JS).
2. **Parse:** Generate AST.
3. **Query:** Run SCM (S-expression) queries to find patterns.

**Example Query (Python TLS):**
```scm
; The tree-sitter Python grammar names call nodes `call` (not `call_expression`).
(call
  function: (attribute) @func
  arguments: (argument_list
    (keyword_argument
      name: (identifier) @arg_name
      value: (_) @value
    )
  )
  ; Backslashes are doubled because the query string is unescaped before the regex sees it.
  (#match? @func "requests\\.(get|post|put|delete)")
  (#eq? @arg_name "verify")
)
```

**Context Expansion:**
The Scout doesn't just grab the line. It grabs the **Logical Context**:
* The function call itself.
* Variable definitions referenced in the call (simple static analysis).
* Surrounding 5 lines for comments.
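
The "surrounding 5 lines" part of context expansion is plain range arithmetic. The sketch below covers only that part, under the assumption that lines are 1-based and ranges inclusive; resolving referenced variable definitions is not attempted. `context_window` is a hypothetical helper, not part of the proposal.

```rust
/// Expands a matched line range by `margin` lines on each side,
/// clamped to the file. Lines are 1-based and ranges are inclusive.
/// Returns `None` if the input range is not valid for the file.
pub fn context_window(
    match_start: usize,
    match_end: usize,
    total_lines: usize,
    margin: usize,
) -> Option<(usize, usize)> {
    if match_start == 0 || match_start > match_end || match_end > total_lines {
        return None;
    }
    // Never go above line 1 or below the last line of the file.
    let start = match_start.saturating_sub(margin).max(1);
    let end = match_end.saturating_add(margin).min(total_lines);
    Some((start, end))
}
```

For a match on lines 10 to 12 of a 100-line file with a margin of 5, the window is lines 5 to 17; near the top or bottom of a file the window simply shrinks.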

### 3.2 The Judge (LLM)

The Judge's job is **High Precision**. It receives a focused prompt and determines if a claim exists.

**Input Prompt:**
```text
You are a security analyst.
Analyze this code snippet for TLS verification settings.

SNIPPET:
# Dev override
should_verify = False
requests.get(url, verify=should_verify)

CONTEXT:
Variable `should_verify` is defined on line 2.

TASK:
Does this snippet disable TLS verification?
Output JSON: { "subject": "tls/verification", "value": false, "confidence": 0.95 }
```

**Why this wins:**
* **Token Efficiency:** Input reduced from 2000 tokens (file) to ~100 tokens (snippet).
* **Accuracy:** Model has no distractions.
* **Speed:** Parallelizable per-snippet.

---

## 4. Implementation Plan

### Phase 1: Infrastructure (Dependencies)

Add `tree-sitter` support to `Cargo.toml`.

```toml
[dependencies]
tree-sitter = "0.20"
tree-sitter-python = "0.20"
tree-sitter-javascript = "0.20"
tree-sitter-go = "0.20"
tree-sitter-rust = "0.20"
```

### Phase 2: The Scout Engine (`src/scout/`)

Create a new module `applications/aphoria/src/scout/`.

* `mod.rs`: Public interface.
* `engine.rs`: Orchestrates parsing and querying.
* `queries/`: Directory containing `.scm` query files for each category/language.
    * `python/tls.scm`
    * `go/sql_injection.scm`

**Struct definition:**
```rust
use std::collections::HashMap;

/// A region of interest found by the Scout and handed to the Judge.
/// `Language` is not defined in this document; it is assumed to be an
/// enum of the supported source languages.
pub struct CandidateSnippet {
    pub file_path: String,
    pub language: Language,
    pub start_line: usize,
    pub end_line: usize,
    pub code: String,
    pub context_variables: HashMap<String, String>, // Name -> Value/Definition
    pub query_id: String,                            // Which query found this
}
```
|
||||
|
||||
### Phase 3: The Judge Engine (`src/llm/judge.rs`)
|
||||
|
||||
Refactor `LlmExtractor` to support "Judge Mode".
|
||||
|
||||
* Modify `extract()` to accept `CandidateSnippet` instead of full file content.
|
||||
* Create specialized prompts for specific query IDs (e.g., if Scout found a TLS pattern, use the specialized "TLS Judge" prompt, not the generic one).
|
||||
|
||||
### Phase 4: Integration
|
||||
|
||||
Modify the main `scan` loop:
|
||||
|
||||
1. **Regex Extractors** run first (unchanged).
|
||||
2. **Scout** runs on all files (extremely fast).
|
||||
3. **Deduplicate:** If Scout finds a region already handled by Regex, drop it.
|
||||
4. **Judge:** Send remaining Candidates to LLM.
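
The deduplication in step 3 can be sketched as a line-range overlap check. This is a minimal illustration, not the scanner's actual code: the `Region` type, the `region` helper, and the choice of "overlapping line ranges in the same file" as the dedup criterion are all assumptions made for the example.

```rust
/// Inclusive line span in one file. Hypothetical stand-in for the location
/// data carried by a regex finding or a `CandidateSnippet`.
#[derive(Debug, Clone, PartialEq)]
struct Region {
    file_path: String,
    start_line: usize,
    end_line: usize,
}

impl Region {
    /// Two regions overlap when they are in the same file and their line spans intersect.
    fn overlaps(&self, other: &Region) -> bool {
        self.file_path == other.file_path
            && self.start_line <= other.end_line
            && other.start_line <= self.end_line
    }
}

/// Small constructor to keep the example short.
fn region(file_path: &str, start_line: usize, end_line: usize) -> Region {
    Region { file_path: file_path.to_string(), start_line, end_line }
}

/// Step 3: keep only Scout candidates that no regex extractor already covered.
fn dedupe(candidates: Vec<Region>, regex_handled: &[Region]) -> Vec<Region> {
    candidates
        .into_iter()
        .filter(|c| !regex_handled.iter().any(|r| r.overlaps(c)))
        .collect()
}

fn main() {
    let regex_handled = vec![region("client.py", 10, 12)];
    let candidates = vec![region("client.py", 11, 15), region("client.py", 40, 44)];

    // Only the second candidate goes on to the Judge (step 4).
    let to_judge = dedupe(candidates, &regex_handled);
    assert_eq!(to_judge, vec![region("client.py", 40, 44)]);
    println!("{} candidate(s) sent to the Judge", to_judge.len());
}
```

Dropping on any overlap is the conservative choice for cost; a stricter rule (for example, requiring the same category as well) would send more snippets to the LLM.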
|
||||
|
||||
---
|
||||
|
||||
## 5. Evaluation & Metrics
|
||||
|
||||
The "Prompt Evaluation System" (Phase 7.8) adapts to this model:
|
||||
|
||||
**1. Scout Evaluation (Deterministic):**
|
||||
* **Metric:** Recall. "Did the Scout find the vulnerable line in `fixtures/tls/bad.py`?"
|
||||
* **Test:** Unit tests using `tree-sitter` queries against code snippets. No LLM required.
|
||||
|
||||
**2. Judge Evaluation (Probabilistic):**
|
||||
* **Metric:** Precision/Accuracy. "Given the snippet, did the LLM classify it correctly?"
|
||||
* **Fixture:** `tests/llm_fixtures` now contains *snippets* derived from the Golden Corpus files.
|
||||
|
||||
**3. Cost Efficiency Metric:**
|
||||
* Track `tokens_per_claim`.
|
||||
* Goal: Reduce tokens/claim by >80% compared to Monolithic approach.
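
As a rough sketch of how the cost metric above might be computed, assuming hypothetical helper names (`tokens_per_claim`, `reduction`) and reusing the illustrative 2000-token file versus ~100-token snippet figures quoted earlier in this document:

```rust
/// Average number of LLM tokens spent per emitted claim.
/// Returns `None` when no claims were produced, to avoid dividing by zero.
fn tokens_per_claim(total_tokens: u64, claims: u64) -> Option<f64> {
    if claims == 0 {
        None
    } else {
        Some(total_tokens as f64 / claims as f64)
    }
}

/// Fractional reduction of `candidate` relative to `baseline`
/// (0.8 means 80% fewer tokens per claim).
fn reduction(baseline: f64, candidate: f64) -> f64 {
    (baseline - candidate) / baseline
}

fn main() {
    // Illustrative numbers only, not measurements.
    let monolithic = tokens_per_claim(2000, 1).expect("one claim");
    let scout_judge = tokens_per_claim(100, 1).expect("one claim");

    let saved = reduction(monolithic, scout_judge);
    println!("tokens/claim reduction: {:.0}%", saved * 100.0);

    // The stated goal: more than 80% fewer tokens per claim.
    assert!(saved > 0.80);
}
```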
|
||||
|
||||
## 6. Migration Strategy
|
||||
|
||||
1. **Parallel Run:** Run Scout logic alongside Regex logic in "shadow mode" (logging only) to tune queries.
|
||||
2. **Incremental Rollout:** Enable Scout & Judge for **one category** (e.g., TLS) while leaving others in Monolithic mode (if any) or Regex mode.
|
||||
3. **Full Switch:** Deprecate "Monolithic Mode" prompts.
|
||||
|
||||
---
|
||||
|
||||
## 7. Comparison Summary
|
||||
|
||||
| Feature | Current (Monolithic) | Scout & Judge (Proposed) |
| :--- | :--- | :--- |
| **Trigger** | File name heuristic | AST Pattern Match |
| **Input** | Whole File | Relevant Snippet |
| **Context** | Noisy (imports, unrelated code) | Focused (local scope) |
| **Cost** | $$$ (Linear to file size) | ¢ (Linear to *relevant* code) |
| **Reliability** | Low (Lost in middle) | High (Forced focus) |
| **Maintenance** | Prompt Engineering | Query Engineering + Simple Prompts |
|
||||
|
||||
661
applications/aphoria/docs/vision-gaps.md
Normal file
@ -0,0 +1,661 @@
|
||||
# Aphoria Vision Gaps
|
||||
|
||||
**Date**: 2026-02-08
|
||||
**Status**: Honest assessment of where we are vs. where we need to be
|
||||
**Grounded Against**: Codebase as of commit `e0d2940` (42 extractors, bridge.rs, ephemeral/persistent modes)
|
||||
|
||||
## Implementation Status
|
||||
|
||||
**Phase A1: Distinguish Observations from Claims** - ✅ **COMPLETE** (2026-02-08)
|
||||
|
||||
- Renamed `ExtractedClaim` → `Observation` (struct + 81 files updated)
|
||||
- Added confidence-based tier mapping: ≥0.9 → Tier 4, <0.9 → Tier 5
|
||||
- `observation_to_assertion()` replaces fixed Tier 3 assignment
|
||||
- `AuthoredClaim` type fully defined with provenance/invariant/consequence fields
|
||||
- Claims storage in `.aphoria/claims.toml` (ClaimsFile implementation)
|
||||
- CLI commands: `aphoria claim create|list|explain|update|supersede|deprecate`
|
||||
- All 1055 tests passing
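
The confidence-based tier mapping listed above can be sketched as follows. The `Tier` enum and the function name `tier_for_confidence` are hypothetical stand-ins for this example; the real mapping lives in `observation_to_assertion()` and uses Episteme's `SourceClass`.

```rust
/// Hypothetical stand-in for the two authority tiers scanner output can land in.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tier {
    Tier4,
    Tier5,
}

/// Map extractor confidence to a tier: >= 0.9 -> Tier 4, otherwise Tier 5.
/// Scanner output never reaches Tier 3, which is left for authored claims.
fn tier_for_confidence(confidence: f32) -> Tier {
    if confidence >= 0.9 {
        Tier::Tier4
    } else {
        Tier::Tier5
    }
}

fn main() {
    assert_eq!(tier_for_confidence(1.0), Tier::Tier4);
    assert_eq!(tier_for_confidence(0.9), Tier::Tier4);
    assert_eq!(tier_for_confidence(0.75), Tier::Tier5);
    println!("0.95 -> {:?}", tier_for_confidence(0.95));
}
```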
|
||||
|
||||
See commit history for implementation details.
|
||||
|
||||
---
|
||||
|
||||
## The Problem in One Sentence
|
||||
|
||||
Aphoria extracts observations about source code and calls them "claims," but they aren't claims -- they're grep results wearing Episteme vocabulary.
|
||||
|
||||
---
|
||||
|
||||
## Current Architecture: What Actually Happens
|
||||
|
||||
### Scan Flow (Ephemeral Mode)
|
||||
|
||||
This is the fast path (~0.25s), used for CI/pre-commit. Traced from `scanner.rs:52` through to report output.
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant CLI as CLI (main.rs)
|
||||
participant Handler as handle_scan()
|
||||
participant Scanner as run_scan()
|
||||
participant Walker as walk_project()
|
||||
participant Registry as ExtractorRegistry
|
||||
participant Bridge as bridge.rs
|
||||
participant Corpus as corpus.rs
|
||||
participant Index as ConceptIndex
|
||||
participant Conflict as conflict.rs
|
||||
participant Report as Formatter
|
||||
|
||||
CLI->>Handler: ScanArgs + AphoriaConfig
|
||||
Handler->>Scanner: run_scan(args, config)
|
||||
|
||||
Note over Scanner: Phase 1: WALK
|
||||
Scanner->>Walker: walk_project(root, config)
|
||||
Walker-->>Scanner: Vec<WalkedFile>
|
||||
|
||||
Note over Scanner: Phase 2: EXTRACT
|
||||
loop For each WalkedFile
|
||||
Scanner->>Registry: extract_all(segments, content, lang, file)
|
||||
Registry->>Registry: for_language(lang) -> applicable extractors
|
||||
loop For each Extractor
|
||||
Registry->>Registry: extractor.extract(segments, content, lang, file)
|
||||
end
|
||||
Registry->>Registry: filter by IgnoreCommentParser
|
||||
Registry-->>Scanner: Vec<ExtractedClaim>
|
||||
end
|
||||
|
||||
Note over Scanner: Phase 3: CONFLICT DETECTION
|
||||
Scanner->>Bridge: load_or_generate_key(root)
|
||||
Bridge-->>Scanner: SigningKey
|
||||
|
||||
Scanner->>Corpus: create_authoritative_corpus(key)
|
||||
Note over Corpus: Hardcoded RFC/OWASP assertions<br/>corpus.rs:33-157
|
||||
Corpus-->>Scanner: Vec<Assertion> (authority)
|
||||
|
||||
Scanner->>Index: ConceptIndex::build(corpus)
|
||||
Note over Index: make_key() = last 2 path segments<br/>+ "::" + predicate
|
||||
Index-->>Scanner: ConceptIndex
|
||||
|
||||
Scanner->>Conflict: check_conflicts(claims, index, config)
|
||||
loop For each ExtractedClaim
|
||||
Conflict->>Index: lookup(claim.subject, claim.predicate)
|
||||
Note over Conflict: Tail-path match:<br/>"code://rust/app/tls/cert_verification"<br/>matches "rfc://5246/tls/cert_verification"
|
||||
Conflict->>Conflict: Compare values, compute score
|
||||
Conflict->>Conflict: Determine verdict (Block/Flag/Pass)
|
||||
end
|
||||
Conflict-->>Scanner: Vec<ConflictResult>
|
||||
|
||||
Note over Scanner: Phase 4: REPORT
|
||||
Scanner->>Report: format(results)
|
||||
Report-->>CLI: Table / JSON / SARIF / Markdown
|
||||
```
|
||||
|
||||
**Key code locations:**
|
||||
- Entry: `handlers/scan.rs:8-71`
|
||||
- Orchestration: `scanner.rs:52-117`
|
||||
- Walker: `walker/mod.rs:115-175`
|
||||
- Extraction: `registry.rs:289-304`
|
||||
- Corpus build: `corpus.rs:33-157`
|
||||
- Index: `concept_index.rs:30-110`
|
||||
- Conflict: `conflict.rs:64-200`
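
The tail-path matching noted in the diagram (last two path segments plus `"::"` plus predicate) can be illustrated with a small sketch. This is an approximation written for this document, not the code in `concept_index.rs`: stripping the URI scheme, joining the segments with `/`, and the predicate name `enabled` are assumptions.

```rust
/// Build an index key from the last two path segments and the predicate.
/// Assumption: any `scheme://` prefix is ignored and segments are joined with '/'.
fn make_key(concept_path: &str, predicate: &str) -> String {
    // Drop the URI scheme ("code://", "rfc://", ...) if one is present.
    let path = concept_path
        .split_once("://")
        .map_or(concept_path, |(_, rest)| rest);

    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let tail_start = segments.len().saturating_sub(2);

    format!("{}::{}", segments[tail_start..].join("/"), predicate)
}

fn main() {
    // The two paths from the diagram; "enabled" is an illustrative predicate.
    let code_key = make_key("code://rust/app/tls/cert_verification", "enabled");
    let rfc_key = make_key("rfc://5246/tls/cert_verification", "enabled");

    // Different schemes and prefixes, same tail: the observation finds the corpus entry.
    assert_eq!(code_key, rfc_key);
    println!("{code_key}");
}
```

Matching on only two trailing segments is what lets a project path line up with an RFC path, but it also means unrelated concepts that share a tail would collide.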
|
||||
|
||||
### Scan Flow (Persistent Mode with --persist --sync)
|
||||
|
||||
The full Episteme path, used for drift detection and observation write-back.
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant Scanner as run_scan()
|
||||
participant Episteme as LocalEpisteme
|
||||
participant WAL as Journal (WAL)
|
||||
participant Store as HybridStore
|
||||
participant Bridge as bridge.rs
|
||||
participant Index as ConceptIndex
|
||||
participant Drift as drift.rs
|
||||
participant Hosted as HostedClient
|
||||
|
||||
Note over Scanner: Same walk + extract as ephemeral
|
||||
|
||||
Scanner->>Episteme: LocalEpisteme::open(config, root)
|
||||
Episteme->>WAL: Journal::open(wal_dir)
|
||||
Episteme->>Store: HybridStore::open(store_dir)
|
||||
Episteme-->>Scanner: LocalEpisteme
|
||||
|
||||
Note over Scanner: Ingest claims as Tier 3 assertions
|
||||
Scanner->>Episteme: ingest_claims(all_claims)
|
||||
loop For each claim
|
||||
Episteme->>Bridge: claim_to_assertion(claim, key, ts)
|
||||
Note over Bridge: SourceClass::Expert (Tier 3)<br/>lifecycle: Approved<br/>parent_hash: None<br/>epoch: None
|
||||
Bridge-->>Episteme: Assertion
|
||||
Episteme->>WAL: journal.append(serialized)
|
||||
end
|
||||
|
||||
Note over Scanner: Build index from corpus + imported assertions
|
||||
Scanner->>Episteme: fetch_authoritative_assertions()
|
||||
Episteme-->>Scanner: Vec<Assertion> (from store)
|
||||
Scanner->>Index: ConceptIndex::build_with_aliases(corpus, aliases)
|
||||
|
||||
Note over Scanner: Check conflicts
|
||||
Scanner->>Episteme: check_conflicts(claims, config, index)
|
||||
Episteme-->>Scanner: Vec<ConflictResult>
|
||||
|
||||
Note over Scanner: Check drift against prior observations
|
||||
Scanner->>Drift: check_drift(non_conflicting_claims)
|
||||
Drift->>Store: fetch_observations_for_concept(path)
|
||||
Note over Drift: Compare current value vs prior<br/>If different -> DriftResult
|
||||
Drift-->>Scanner: Vec<DriftResult>
|
||||
|
||||
Note over Scanner: Write back novel observations as Tier 4
|
||||
Scanner->>Episteme: ingest_observations(novel_claims)
|
||||
loop For each observation
|
||||
Episteme->>Bridge: claim_to_observation(claim, key, ts)
|
||||
Note over Bridge: SourceClass::Community (Tier 4)<br/>weight: 0.3
|
||||
Bridge-->>Episteme: Assertion
|
||||
Episteme->>WAL: journal.append(serialized)
|
||||
Episteme->>Store: predicate_index("observation", hash)
|
||||
end
|
||||
|
||||
opt If hosted mode enabled
|
||||
Scanner->>Hosted: push_observations(assertions)
|
||||
Hosted-->>Scanner: PushObservationsResponse
|
||||
end
|
||||
```
|
||||
|
||||
**Key code locations:**
|
||||
- Persistent path: `scanner.rs:195-325`
|
||||
- LocalEpisteme::open: `local/mod.rs:44-124`
|
||||
- Ingest claims: `local/store.rs:20-96`
|
||||
- Ingest observations: `local/store.rs:105-165`
|
||||
- Drift detection: `drift.rs:23-57`
|
||||
- Hosted push: `hosted.rs:178+`
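
The drift step in the diagram (compare the current value against the prior observation, emit a `DriftResult` if they differ) can be sketched as below. The struct fields, the string-typed values standing in for `ObjectValue`, and the `HashMap` standing in for the store lookup are assumptions for illustration; the real logic is in `drift.rs`.

```rust
use std::collections::HashMap;

/// Hypothetical result shape; the real `DriftResult` may carry more fields.
#[derive(Debug, PartialEq)]
struct DriftResult {
    concept_path: String,
    prior: String,
    current: String,
}

/// Compare each current observation with the prior value stored for the same
/// concept path. Paths with no prior value are novel, not drifted.
fn check_drift(
    prior: &HashMap<String, String>,
    current: &[(String, String)],
) -> Vec<DriftResult> {
    current
        .iter()
        .filter_map(|(path, value)| {
            let old = prior.get(path)?; // never observed before: skip
            if old == value {
                None // unchanged
            } else {
                Some(DriftResult {
                    concept_path: path.clone(),
                    prior: old.clone(),
                    current: value.clone(),
                })
            }
        })
        .collect()
}

fn main() {
    let mut prior = HashMap::new();
    prior.insert("core/thermal/const/rapl_power_unit".to_string(), "0x606".to_string());

    let current = vec![
        ("core/thermal/const/rapl_power_unit".to_string(), "0x607".to_string()),
        ("wallet/type/wallet/derives".to_string(), "Debug".to_string()),
    ];

    let drift = check_drift(&prior, &current);
    assert_eq!(drift.len(), 1); // only the changed constant drifts
    println!("{:?}", drift);
}
```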
|
||||
|
||||
---
|
||||
|
||||
## What We Built (Grounded)
|
||||
|
||||
Aphoria has **42 built-in extractors** (`registry.rs:327` -- `BUILTIN_EXTRACTOR_COUNT: usize = 42`) that scan source code with regex patterns and produce `ExtractedClaim` structs:
|
||||
|
||||
```rust
// types/claim.rs:7-31
pub struct ExtractedClaim {
    pub concept_path: String, // e.g., "code://rust/maxwell/hypervisor/lib/imports/firecracker"
    pub predicate: String,    // e.g., "imported"
    pub value: ObjectValue,   // Boolean(true)
    pub file: String,         // "hypervisor/src/lib.rs"
    pub line: usize,          // 24
    pub matched_text: String, // "use firecracker_sdk::..."
    pub confidence: f32,      // 1.0
    pub description: String,  // "Module imports firecracker"
}
```
|
||||
|
||||
We ran this on Maxwell and got 67 "claims" with zero noise. We celebrated.
|
||||
|
||||
Then we looked at the output and asked: **what is the claim being made here?**
|
||||
|
||||
The answer is: there is no claim. `imported: true` is an index entry. No one will ever assert `imported: false`. There's no conflict to resolve, no lens needed, no reason to store this in an append-only Merkle DAG. It's `grep "use firecracker"` with extra steps.
|
||||
|
||||
### Verified Against Code
|
||||
|
||||
| Extractor | File | Predicate Used | What It Actually Produces |
|-----------|------|---------------|--------------------------|
| `import_graph` | `extractors/import_graph.rs` | `"imported"` with `Boolean(true)` | grep for `use` statements |
| `derive_pattern` | `extractors/derive_pattern.rs` | `"derives"` with `Text("Clone,Debug")` | AST metadata extraction |
| `const_declarations` | `extractors/const_declarations.rs` | `"value"` with literal value | copy of the source line |
| `unsafe_atomic` | `extractors/unsafe_atomic.rs` | `"pattern"` with `Text("SeqCst")` | grep for `Ordering::` |
|
||||
|
||||
None of these can conflict. None need lenses. None benefit from Episteme's architecture.
|
||||
|
||||
---
|
||||
|
||||
## What a Real Claim Looks Like
|
||||
|
||||
After the scan, we wrote [claims-explained.md](../../claims-explained.md) by hand for Maxwell. That document contains actual claims. Compare:
|
||||
|
||||
**What Aphoria produces** (`unsafe_atomic` extractor, `extractors/unsafe_atomic.rs`):
|
||||
```
|
||||
Subject: "code://rust/maxwell/core/wallet/atomics/ordering"
|
||||
Predicate: "pattern"
|
||||
Value: "SeqCst"
|
||||
```
|
||||
|
||||
**What a human wrote:**
|
||||
> "All wallet atomic operations MUST use SeqCst to prevent double-spend race conditions. Weakening to Relaxed or Acquire/Release is a correctness bug."
|
||||
|
||||
**What Episteme expects** (from `stemedb-core/src/types/assertion.rs`):
|
||||
```
|
||||
Subject: "maxwell/wallet/atomics/ordering"
|
||||
Predicate: "required_ordering"
|
||||
Value: "SeqCst"
|
||||
Source: Safety analysis by lead developer
|
||||
Authority: Tier 3 (Expert) -- with real evidence
|
||||
Evidence: "AtomicU64 balance requires sequential consistency
|
||||
to prevent double-spend. See wallet ADR-003."
|
||||
Parent: None (original assertion)
|
||||
Epoch: Some("maxwell-v1.0")
|
||||
```
|
||||
|
||||
More examples from the same scan:
|
||||
|
||||
**Aphoria says:** `core/thermal/const/rapl_power_unit = 0x606`
|
||||
**The claim is:** "Intel MSR register address for reading CPU power units. Sourced from Intel SDM Vol 4. If this changes, either the code is wrong or targeting different hardware."
|
||||
|
||||
**Aphoria says:** `wallet/type/wallet/derives = Debug`
|
||||
**The claim is:** "Wallet MUST NOT derive Clone because singleton ownership is a safety invariant. Wallet contains AtomicU64 -- cloning it creates divergent state."
|
||||
|
||||
**Aphoria says:** `vsock/message/agentmessage/derives = Clone,Debug,Deserialize,Serialize`
|
||||
**The claim is:** "All vsock message types MUST derive Serialize+Deserialize because they cross the VM boundary via bincode. If serde appears in core imports, internal types are leaking into the wire protocol."
|
||||
|
||||
The difference: observations describe **what is**. Claims describe **what must be and why**. Claims have provenance, consequences, and can conflict with each other.
|
||||
|
||||
---
|
||||
|
||||
## The Fundamental Gap (Code-Grounded)
|
||||
|
||||
Episteme is a knowledge graph for conflicting claims with lineage and resolution. Aphoria uses it as a document store for scan results.
|
||||
|
||||
The `bridge.rs` conversion (`bridge.rs:45-92`) forces observations into the Assertion schema:
|
||||
|
||||
| Assertion Field | What Episteme Expects | What bridge.rs Provides | Code Reference |
|----------------|----------------------|------------------------|----------------|
| `source_hash` | Hash of source document (RFC, paper) | `blake3(file + line + matched_text)` | `bridge.rs:107-113` |
| `source_class` | Tiered authority (0=Regulatory...4=Community) | Always `SourceClass::Expert` (Tier 3) for claims | `bridge.rs:25` |
| `source_metadata` | `{journal, DOI, author, standard}` | `{file, line, matched_text, scan_tool, scan_version}` | `bridge.rs:52-58` |
| `parent_hash` | Links to superseded assertion | Always `None` | `bridge.rs:79` |
| `epoch` | Paradigm context (e.g., "post-quantum") | Always `None` | `bridge.rs:89` |
| `lifecycle` | Pending -> Review -> Approved | Always `LifecycleStage::Approved` (skips review) | `bridge.rs:85` |
| `evidence` | Provenance chain, ADR references | Not present in `ExtractedClaim` at all | `types/claim.rs:7-31` |
|
||||
|
||||
**We're using a Mercedes as a shopping cart.**
|
||||
|
||||
### Partial Mitigation Already Exists
|
||||
|
||||
`claim_to_observation()` (`bridge.rs:36-42`) creates Tier 4 (Community) assertions for write-back. But this is only used in the `--sync` path for drift detection -- the default `claim_to_assertion()` still uses Tier 3.
|
||||
|
||||
---
|
||||
|
||||
## What the Workflow Should Be
|
||||
|
||||
### Target: Commit-Time Claim Authoring
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant Dev as Developer
|
||||
participant Skill as Aphoria Skill (.claude/skills/)
|
||||
participant Graph as Episteme Knowledge Graph
|
||||
participant Scanner as aphoria scan (audit mode)
|
||||
participant Report as Claims-Explained View
|
||||
|
||||
Note over Dev: Developer commits code
|
||||
|
||||
Dev->>Skill: Review diff
|
||||
Skill->>Skill: Identify claimable changes
|
||||
|
||||
Note over Skill: Claimable = new constants from specs,<br/>ordering changes, boundary crossings,<br/>derive changes on serialized types<br/><br/>NOT claimable = renamed variables,<br/>whitespace, internal refactors
|
||||
|
||||
Skill->>Graph: Look up existing claims for context
|
||||
Graph-->>Skill: Related claims (if any)
|
||||
|
||||
alt Diff contradicts existing claim
|
||||
Skill->>Dev: "This contradicts claim X. Fix code or supersede claim?"
|
||||
Dev->>Skill: Decision + evidence
|
||||
Skill->>Graph: Create superseding claim (parent_hash = old claim)
|
||||
else New claimable pattern
|
||||
Skill->>Dev: "This looks claimable. Author a claim?"
|
||||
Dev->>Skill: Provenance + invariant + consequence
|
||||
Skill->>Graph: Submit authored claim with lineage
|
||||
end
|
||||
|
||||
Note over Skill: Create extractor for audit
|
||||
Skill->>Scanner: Register extractor paired with claim
|
||||
|
||||
Note over Scanner: Later: Audit runs
|
||||
Scanner->>Graph: For each claim, verify code matches
|
||||
Graph-->>Scanner: Expected values
|
||||
Scanner->>Scanner: Extractor output vs claim
|
||||
Scanner-->>Report: PASS / CONFLICT / DRIFT
|
||||
|
||||
Report->>Report: Auto-generate claims-explained.md
|
||||
```
|
||||
|
||||
### Audit Flow: Two Directions
|
||||
|
||||
**Direction 1: Scan code, check against claims** (what Aphoria partially does today)
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant Scanner as aphoria audit
|
||||
participant Extractors as ExtractorRegistry
|
||||
participant Code as Source Files
|
||||
participant Graph as Episteme (Claims)
|
||||
participant Report as Audit Report
|
||||
|
||||
Scanner->>Code: Walk project files
|
||||
Scanner->>Extractors: extract_all(file) -> Vec<Observation>
|
||||
|
||||
loop For each Observation
|
||||
Scanner->>Graph: lookup_claim(observation.subject, observation.predicate)
|
||||
alt Claim exists
|
||||
alt observation.value == claim.value
|
||||
Scanner->>Report: PASS (code matches claim)
|
||||
else observation.value != claim.value
|
||||
Scanner->>Report: CONFLICT (code contradicts claim)
|
||||
Note over Report: Score by authority tier,<br/>apply lenses for resolution
|
||||
end
|
||||
else No claim exists
|
||||
Scanner->>Report: REVIEW ("should this be a claim?")
|
||||
end
|
||||
end
|
||||
```
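
The Direction 1 decision logic in the diagram reduces to a three-way match. The sketch below is a simplification under stated assumptions: values are plain strings rather than `ObjectValue`, claims are held in a hypothetical `ClaimIndex` keyed by exact `(subject, predicate)` rather than looked up through the tail-matching `ConceptIndex`, and authority scoring and lenses are left out.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Verdict {
    Pass,     // code matches claim
    Conflict, // code contradicts claim
    Review,   // no claim exists: "should this be a claim?"
}

/// Simplified observation for the example.
struct Observation {
    concept_path: String,
    predicate: String,
    value: String,
}

/// Hypothetical storage shape: (subject, predicate) -> expected value.
type ClaimIndex = HashMap<(String, String), String>;

/// Small constructor to keep the example short.
fn obs(concept_path: &str, predicate: &str, value: &str) -> Observation {
    Observation {
        concept_path: concept_path.to_string(),
        predicate: predicate.to_string(),
        value: value.to_string(),
    }
}

fn audit(observation: &Observation, claims: &ClaimIndex) -> Verdict {
    let key = (observation.concept_path.clone(), observation.predicate.clone());
    match claims.get(&key) {
        Some(expected) if *expected == observation.value => Verdict::Pass,
        Some(_) => Verdict::Conflict,
        None => Verdict::Review,
    }
}

fn main() {
    let mut claims = ClaimIndex::new();
    claims.insert(
        ("maxwell/wallet/atomics/ordering".into(), "required_ordering".into()),
        "SeqCst".into(),
    );

    let subject = "maxwell/wallet/atomics/ordering";
    assert_eq!(audit(&obs(subject, "required_ordering", "SeqCst"), &claims), Verdict::Pass);
    assert_eq!(audit(&obs(subject, "required_ordering", "Relaxed"), &claims), Verdict::Conflict);
    assert_eq!(audit(&obs("maxwell/vsock/derives", "derives", "Clone"), &claims), Verdict::Review);
    println!("direction 1 audit sketch ok");
}
```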
|
||||
|
||||
**Direction 2: Walk claims, verify in code** (does not exist today)
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant Scanner as aphoria audit --verify-claims
|
||||
participant Graph as Episteme (Claims)
|
||||
participant Extractors as Paired Extractors
|
||||
participant Code as Source Files
|
||||
participant Report as Audit Report
|
||||
|
||||
Scanner->>Graph: List all authored claims
|
||||
Graph-->>Scanner: Vec<Claim>
|
||||
|
||||
loop For each Claim
|
||||
Scanner->>Extractors: Find extractor paired with this claim
|
||||
alt Extractor exists
|
||||
Extractors->>Code: Run extractor on relevant files
|
||||
Code-->>Extractors: Vec<Observation>
|
||||
alt Observation matches claim
|
||||
Scanner->>Report: PASS
|
||||
else Observation contradicts claim
|
||||
Scanner->>Report: CONFLICT
|
||||
end
|
||||
alt No observation found (code deleted?)
|
||||
Scanner->>Report: MISSING (claimed pattern not found)
|
||||
end
|
||||
else No paired extractor
|
||||
Scanner->>Report: UNCHECKED (no extractor for this claim)
|
||||
end
|
||||
end
|
||||
|
||||
Note over Report: Catches:<br/>- Deleted code (claim says X exists, it doesn't)<br/>- Drifted values (claim says 0x606, code says 0x607)<br/>- Unenforced policies (claim says "no tokio in core")
|
||||
```
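
The four outcomes of the Direction 2 flow can be sketched as a single function over "what the paired extractor found". This is illustrative only: `ClaimStatus`, `verify_claim`, and the string-typed values are hypothetical, and treating any mismatching observation as a conflict is an assumption about how multiple observations would be combined.

```rust
#[derive(Debug, PartialEq)]
enum ClaimStatus {
    Pass,      // every observation matches the claim
    Conflict,  // at least one observation contradicts the claim
    Missing,   // extractor ran but found nothing (code deleted?)
    Unchecked, // no extractor is paired with this claim
}

/// `observed` is `None` when no paired extractor exists, otherwise the values
/// that extractor produced for the claim's subject.
fn verify_claim(expected: &str, observed: Option<&[&str]>) -> ClaimStatus {
    match observed {
        None => ClaimStatus::Unchecked,
        Some(values) if values.is_empty() => ClaimStatus::Missing,
        Some(values) if values.iter().all(|v| *v == expected) => ClaimStatus::Pass,
        Some(_) => ClaimStatus::Conflict,
    }
}

fn main() {
    let nothing_found: &[&str] = &[];

    // Claim: RAPL_POWER_UNIT MUST be 0x606.
    assert_eq!(verify_claim("0x606", Some(&["0x606"][..])), ClaimStatus::Pass);
    assert_eq!(verify_claim("0x606", Some(&["0x607"][..])), ClaimStatus::Conflict);
    assert_eq!(verify_claim("0x606", Some(nothing_found)), ClaimStatus::Missing);
    assert_eq!(verify_claim("0x606", None), ClaimStatus::Unchecked);
    println!("direction 2 audit sketch ok");
}
```

Keeping `Missing` and `Unchecked` distinct matters for reporting: the first says the code changed, the second says the audit has a coverage gap.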
|
||||
|
||||
---
|
||||
|
||||
## Extracted Claims from This Document
|
||||
|
||||
The following claims were extracted using the `extract-claims` skill pattern. Each is testable against the current codebase.
|
||||
|
||||
### Architecture Claims (Verified)
|
||||
|
||||
| ID | Claim | Verification Status | Code Reference |
|----|-------|-------------------|----------------|
| VG-001 | Aphoria has 42 built-in extractors | VERIFIED | `registry.rs:327` -- `BUILTIN_EXTRACTOR_COUNT: usize = 42` |
| VG-002 | `import_graph` extractor uses predicate `"imported"` with `Boolean(true)` | VERIFIED | `import_graph.rs` -- only produces `imported: true` |
| VG-003 | `unsafe_atomic` extractor uses predicate `"pattern"` | VERIFIED | `unsafe_atomic.rs` -- uses generic `"pattern"` predicate |
| VG-004 | `bridge.rs` default path uses `SourceClass::Expert` (Tier 3) | VERIFIED | `bridge.rs:25` -- `claim_to_assertion()` calls with `SourceClass::Expert` |
| VG-005 | `bridge.rs` always sets `parent_hash: None` | VERIFIED | `bridge.rs:79` |
| VG-006 | `bridge.rs` always sets `epoch: None` | VERIFIED | `bridge.rs:89` |
| VG-007 | `bridge.rs` always sets `lifecycle: LifecycleStage::Approved` | VERIFIED | `bridge.rs:85` |
| VG-008 | `source_metadata` contains `{file, line, matched_text, scan_tool, scan_version}` only | VERIFIED | `bridge.rs:52-58` |
| VG-009 | `ExtractedClaim` has no evidence/provenance field | VERIFIED | `types/claim.rs:7-31` -- only has location, value, confidence |
| VG-010 | `claim_to_observation()` uses Tier 4 (Community) | VERIFIED | `bridge.rs:36-42` |
| VG-011 | Extractor trait has no mechanism to receive claims for verification | ✅ **CLOSED** | `traits.rs:68-107` -- `verifiable_predicates()` method added, 10 extractors declare predicates |
|
||||
|
||||
### Gap Claims (What Doesn't Exist)
|
||||
|
||||
| ID | Claim | Gap |
|----|-------|-----|
| VG-020 | `ExtractedClaim` should be renamed to `Observation` | `types/claim.rs` still uses `ExtractedClaim` |
| VG-021 | A real `Claim` type should exist with provenance, invariant, consequence, authority | No such type exists anywhere |
| VG-022 | Extractors should be paired with claims they verify | ✅ **CLOSED** -- `verifiable_predicates()` added to `Extractor` trait; 10 extractors declare predicates; `compute_extractor_claim_map()` in verify.rs; `aphoria verify map` shows coverage |
| VG-023 | `aphoria audit` command should exist | No audit subcommand in CLI |
| VG-024 | Claims should support supersession via `parent_hash` | `parent_hash` is always `None` |
| VG-025 | `aphoria claims list` / `aphoria claims explain` should exist | No claims subcommand |
| VG-026 | Corpus should be real assertions, not hardcoded in `corpus.rs:33-157` | Corpus is built procedurally per scan |
| VG-027 | Conflict resolution should use Episteme lenses | No lens invoked during scan |
| VG-028 | Direction 2 audit (walk claims, verify code) doesn't exist | No inverse audit flow |
| VG-029 | Skill should be primary claim authoring interface | No `.claude/skills/aphoria` skill exists |
|
||||
|
||||
---
|
||||
|
||||
## What Needs to Change
|
||||
|
||||
### 1. Claims are authored, not extracted
|
||||
|
||||
Extractors don't produce claims. Humans (assisted by the Aphoria skill) produce claims. Extractors produce **observations** that are checked against claims.
|
||||
|
||||
The type system should reflect this:
|
||||
|
||||
```rust
// CURRENT (types/claim.rs:7-31)
pub struct ExtractedClaim { // This is an observation, not a claim
    pub concept_path: String,
    pub predicate: String,
    pub value: ObjectValue,
    pub file: String,
    pub line: usize,
    pub matched_text: String,
    pub confidence: f32,
    pub description: String,
}

// TARGET: New Observation type (rename ExtractedClaim)
pub struct Observation {
    pub concept_path: String,
    pub predicate: String,
    pub value: ObjectValue,
    pub file: String,
    pub line: usize,
    pub matched_text: String,
    pub confidence: f32,
    pub description: String,
}

// TARGET: New Claim type (does not exist today)
pub struct AuthoredClaim {
    pub concept_path: String,
    pub predicate: String,
    pub value: ObjectValue,
    pub provenance: String,          // Where did this come from? (Intel SDM, RFC, ADR)
    pub invariant: String,           // What must remain true?
    pub consequence: String,         // What breaks if violated?
    pub authority_tier: SourceClass, // Tier 0-4
    pub evidence_chain: Vec<String>, // References to supporting documents
    pub parent_hash: Option<Hash>,   // Supersedes which claim?
    pub epoch: Option<String>,       // Paradigm context
}
```
|
||||
|
||||
### 2. The skill is the primary interface, not the scanner
|
||||
|
||||
The `.claude/skills/aphoria` skill should be the main way claims enter the system. It:
|
||||
- Understands the project's claim vocabulary
|
||||
- Reviews diffs for claimable changes
|
||||
- Looks up existing claims for context
|
||||
- Helps author claims with proper lineage
|
||||
- Submits them as real Episteme assertions
|
||||
|
||||
The scanner (`aphoria scan`) becomes the audit tool -- it verifies that code matches claims, not the other way around.
|
||||
|
||||
### 3. Extractors serve the audit, not the authoring
|
||||
|
||||
The `Extractor` trait (`traits.rs:68-94`) needs to change:
|
||||
|
||||
```rust
// CURRENT: Extractors produce observations from thin air
pub trait Extractor: Send + Sync {
    fn name(&self) -> &str;
    fn languages(&self) -> &[Language];
    fn extract(&self, segments: &[String], content: &str, lang: Language, file: &str) -> Vec<ExtractedClaim>;
}

// TARGET: Extractors can also verify observations against claims
pub trait Extractor: Send + Sync {
    fn name(&self) -> &str;
    fn languages(&self) -> &[Language];
    fn extract(&self, segments: &[String], content: &str, lang: Language, file: &str) -> Vec<Observation>;

    /// Claims this extractor can verify (empty = observation-only extractor)
    fn verifiable_claims(&self) -> &[&str] { &[] }

    /// Verify a specific claim against extracted observations
    fn verify(&self, claim: &AuthoredClaim, observations: &[Observation]) -> VerifyResult {
        VerifyResult::Unchecked
    }
}
```
|
||||
|
||||
### 4. The corpus should be proper assertions
|
||||
|
||||
Today, RFC/OWASP knowledge is built procedurally in `corpus.rs:33-157`. The `ConflictingSource::extract_citation()` in `types/claim.rs:89-111` already handles `rfc://` and `owasp://` URI schemes -- the infrastructure for proper corpus assertions partially exists.
|
||||
|
||||
Target: corpus data stored as real Episteme assertions with proper lineage, not rebuilt every scan.
|
||||
|
||||
### 5. The claims-explained.md pattern should be the product
|
||||
|
||||
The workflow that produces it:
|
||||
|
||||
```mermaid
flowchart TD
    A[aphoria scan] -->|produces| B[Observations]
    B -->|skill identifies| C{Claimable?}
    C -->|yes| D[Developer authors claim<br/>with skill assistance]
    C -->|no| E[Discard / log as observation]
    D -->|submit| F[Episteme Knowledge Graph]
    F -->|future scans| G[aphoria audit checks<br/>code against claims]
    G -->|generates| H[claims-explained.md<br/>auto-generated from graph]
    F -->|new observations| I{Matches existing claim?}
    I -->|yes, same value| J[PASS]
    I -->|no, different value| K[CONFLICT]
    I -->|claim about deleted code| L[MISSING]
```
|
||||
|
||||
---
|
||||
|
||||
## Proposed Extractors for Audit Flow
|
||||
|
||||
These extractors don't exist today. They're needed to close the gap between observations and claims.
|
||||
|
||||
### Self-Audit Extractors (Meta)
|
||||
|
||||
These extractors audit Aphoria's own code to verify the claims in this document remain true:
|
||||
|
||||
| Extractor Name | What It Verifies | Pattern |
|---------------|-----------------|---------|
| `bridge_source_class_audit` | `bridge.rs` default tier assignment | Regex for `SourceClass::Expert` in `claim_to_assertion` |
| `bridge_parent_hash_audit` | Whether `parent_hash` is always `None` | Regex for `parent_hash: None` in bridge |
| `bridge_lifecycle_audit` | Whether lifecycle skips review | Regex for `LifecycleStage::Approved` without Pending |
| `extractor_trait_audit` | Whether Extractor trait accepts claims | Check trait definition for claim parameter |
| `type_naming_audit` | Whether `ExtractedClaim` has been renamed | Grep for `struct ExtractedClaim` vs `struct Observation` |
|
||||
|
||||
### Claim-Paired Extractors (Project-Specific)
|
||||
|
||||
These are examples of what extractor-claim pairs look like for a project like Maxwell:
|
||||
|
||||
| Claim | Extractor | Verification |
|-------|-----------|-------------|
| "Wallet atomics MUST use SeqCst" | `unsafe_atomic` (exists) | Check all `Ordering::` in wallet/ are `SeqCst` |
| "Wallet MUST NOT derive Clone" | `derive_pattern` (exists) | Check `#[derive(` on Wallet struct excludes `Clone` |
| "vsock types MUST derive Serialize+Deserialize" | `derive_pattern` (exists) | Check all structs in vsock/ derive both |
| "RAPL_POWER_UNIT MUST be 0x606" | `const_declarations` (exists) | Check const value matches Intel SDM |
| "Core modules MUST NOT import tokio" | `import_graph` (exists) | Check no `use tokio` in core/ |
|
||||
|
||||
The existing extractors can already produce the observations needed. What's missing is the **claim** to compare against and the **pairing mechanism** to connect them.
|
||||
|
||||
### Declarative Extractor Examples
|
||||
|
||||
Using the existing `DeclarativeExtractor` system (`extractors/declarative/`), claim-paired extractors can be defined in `aphoria.toml`:
|
||||
|
||||
```toml
[[extractors.declarative]]
name = "wallet_seqcst_policy"
description = "Wallet atomics must use SeqCst ordering"
languages = ["rust"]
pattern = 'Ordering::(Relaxed|AcqRel|Acquire|Release)'
claim.subject = "policy/wallet/atomics/ordering"
claim.predicate = "forbidden_ordering"
claim.value = { type = "boolean", value = true }
confidence = 0.95
source = { claim_id = "wallet-seqcst-001", authority = "safety-analysis" }

[[extractors.declarative]]
name = "core_no_tokio_policy"
description = "Core modules must not import tokio"
languages = ["rust"]
pattern = 'use tokio'
claim.subject = "policy/core/imports/tokio"
claim.predicate = "forbidden_import"
claim.value = { type = "boolean", value = true }
confidence = 0.95
source = { claim_id = "arch-boundary-001", authority = "architecture-decision" }
```
|
||||
|
||||
---
|
||||
|
||||
## The Path Forward
|
||||
|
||||
### Phase 1: Distinguish observations from claims
|
||||
|
||||
- [ ] Rename `ExtractedClaim` to `Observation` in `types/claim.rs`
|
||||
- [ ] Create `AuthoredClaim` type with provenance, invariant, consequence, authority, evidence_chain
|
||||
- [ ] Update `bridge.rs` default path to use Tier 4/5 (not Tier 3) for scanner output
|
||||
- [ ] Add `evidence` field to `source_metadata` in bridge
|
||||
|
||||
### Phase 2: Build the authoring workflow
|
||||
|
||||
- [ ] Create `.claude/skills/aphoria` skill for claim authoring
|
||||
- [ ] Add `aphoria claims create` CLI command
|
||||
- [ ] Add `aphoria claims update` with `parent_hash` supersession
|
||||
- [ ] Add `aphoria claims list` and `aphoria claims explain`
|
||||
- [ ] Store authored claims as proper Episteme assertions with lineage
|
||||
|
||||
### Phase 3: Pair extractors with claims
|
||||
|
||||
- [ ] Extend `Extractor` trait with `verifiable_claims()` and `verify()` methods
|
||||
- [ ] Add `aphoria audit` command (both directions)
|
||||
- [ ] Map each existing extractor to claims it can verify
|
||||
- [ ] Flag observations without matching claims as "should this be a claim?"
|
||||
|
||||
### Phase 4: Make the corpus first-class
|
||||
|
||||
- [ ] Convert `corpus.rs` hardcoded assertions to stored Episteme assertions
|
||||
- [ ] Wire up Authority Lens for conflict resolution
|
||||
- [ ] Ensure Trust Packs contain authored claims, not just patterns
|
||||
|
||||
### Phase 5: The flywheel
|
||||
|
||||
- [ ] More claims authored per commit
|
||||
- [ ] Better audit coverage (extractors verify more claims)
|
||||
- [ ] Skill learns from authored claims what's claimable
|
||||
- [ ] Claims-explained documentation auto-generates from knowledge graph
|
||||
- [ ] New team members read claims to understand WHY, not just WHAT
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
We built a good code scanner. We didn't build a knowledge graph client.
|
||||
|
||||
The extractors work well at finding patterns. But finding patterns isn't the point -- understanding what those patterns mean, why they must be that way, and what breaks if they change is the point.
|
||||
|
||||
The Maxwell claims-explained.md proves the concept works. Every one of those 67 observations becomes valuable when paired with provenance and invariants. The gap is that today a human has to write that context by hand.
|
||||
|
||||
Close the gap by making the skill -- not the scanner -- the primary interface, and by treating claims as authored artifacts with lineage rather than regex output with a fancy name.
|
||||
|
||||
---
|
||||
|
||||
## Appendix: Claim Extraction Summary
|
||||
|
||||
This document contains **94 extractable claims** across **52 unique subjects**:
|
||||
|
||||
- **11 architecture claims**: Verified against current code (all confirmed true)
|
||||
- **10 gap claims**: Define what doesn't exist yet
|
||||
- **5 bridge.rs claims**: Code-verifiable, confirmed (source_hash faked, source_class hardcoded, parent_hash ignored, epoch ignored, evidence empty)
|
||||
- **15 phase-plan claims**: Define specific deliverables and tasks
|
||||
- **20+ workflow claims**: Define the target authoring/audit model
|
||||
- **5 claimability rules**: What counts as claimable in a diff (spec constants=yes, ordering changes=yes, boundary crossings=yes, derive changes on serialized types=yes, renamed variables=no)
|
||||
- **4 Maxwell examples**: Real claims about SeqCst ordering, Wallet derives, vsock serialization, RAPL_POWER_UNIT
|
||||
|
||||
~~The most critical engineering gap: **no extractor currently has the ability to verify against existing claims**.~~ **CLOSED (2026-02-08):** The `Extractor` trait now includes `verifiable_predicates()` returning `(tail_path, predicate)` pairs. 10 extractors declare their predicates. `compute_extractor_claim_map()` matches claims against extractors (with wildcard support). `aphoria verify map` shows coverage. Direction 2 audit (walk claims, verify code) is now implemented via `aphoria verify run`.
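A minimal sketch of how the claim-to-extractor mapping described above might work. The trait below is a simplified, hypothetical stand-in for the real `Extractor` trait. The `AtomicOrdering` extractor, the `extractors_for_claim` helper, and the rule that `*` matches exactly one path segment are assumptions for illustration, not the actual `compute_extractor_claim_map()` implementation.

```rust
/// Simplified, hypothetical stand-in for the real `Extractor` trait.
trait Extractor {
    fn name(&self) -> &str;
    /// `(tail_path, predicate)` pairs this extractor is able to verify.
    fn verifiable_predicates(&self) -> Vec<(&'static str, &'static str)>;
}

struct TlsVerify;
impl Extractor for TlsVerify {
    fn name(&self) -> &str {
        "tls_verify"
    }
    fn verifiable_predicates(&self) -> Vec<(&'static str, &'static str)> {
        vec![("tls/cert_verification", "enabled")]
    }
}

/// Hypothetical extractor used only to demonstrate a wildcard tail.
struct AtomicOrdering;
impl Extractor for AtomicOrdering {
    fn name(&self) -> &str {
        "atomic_ordering"
    }
    fn verifiable_predicates(&self) -> Vec<(&'static str, &'static str)> {
        vec![("atomics/*", "required_ordering")]
    }
}

/// True when the trailing segments of `concept_path` match `pattern`.
/// Assumption: `*` stands for exactly one segment.
fn tail_matches(pattern: &str, concept_path: &str) -> bool {
    let pat: Vec<&str> = pattern.split('/').collect();
    let segs: Vec<&str> = concept_path.split('/').filter(|s| !s.is_empty()).collect();
    if segs.len() < pat.len() {
        return false;
    }
    let tail = &segs[segs.len() - pat.len()..];
    pat.iter().zip(tail).all(|(p, s)| *p == "*" || p == s)
}

/// Names of all extractors that declare they can verify the given claim.
fn extractors_for_claim<'a>(
    extractors: &'a [Box<dyn Extractor>],
    concept_path: &str,
    predicate: &str,
) -> Vec<&'a str> {
    extractors
        .iter()
        .filter(|e| {
            e.verifiable_predicates()
                .iter()
                .any(|(tail, pred)| *pred == predicate && tail_matches(tail, concept_path))
        })
        .map(|e| e.name())
        .collect()
}
```

A claim for which this returns an empty list is one that no extractor can currently check, which is roughly what a coverage report would surface.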
|
||||
319
applications/aphoria/roadmap-archive.md
Normal file
@ -0,0 +1,319 @@
|
||||
# Aphoria Roadmap Archive
|
||||
|
||||
> Completed phases moved from `roadmap.md`. Full implementation details preserved in git history.
|
||||
|
||||
---
|
||||
|
||||
## Phase 0: StemeDB Foundation ✅
|
||||
|
||||
ConceptPath type, hierarchical index, alias store, source class inference, concept API endpoints.
|
||||
All shipped as Phase 5D of the main StemeDB roadmap.
|
||||
|
||||
**Spec:** [docs/specs/concept-hierarchy.md](../../docs/specs/concept-hierarchy.md)
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: CLI Core ✅
|
||||
|
||||
End-to-end CLI pipeline with 10 extractors and a bootstrapped corpus of 11 hardcoded assertions.
|
||||
|
||||
| Task | Status |
|
||||
|------|--------|
|
||||
| 2.1 Project Walker | ✅ `walker/mod.rs`, `walker/path_mapper.rs`, `walker/language.rs` |
|
||||
| 2.2 Extractors (10) | ✅ `tls_verify`, `jwt_config`, `hardcoded_secrets`, `timeout_config`, `dep_versions`, `cors_config`, `rate_limit`, `weak_crypto`, `command_injection`, `sql_injection` |
|
||||
| 2.3 Ingestion Bridge | ✅ `bridge.rs` — BLAKE3 hashing, Ed25519 signing, claim→assertion conversion |
|
||||
| 2.4 Conflict Query | ✅ `episteme.rs` — LocalEpisteme with check_conflicts() |
|
||||
| 2.5 Report Output | ✅ `report/` — table (comfy-table), JSON, SARIF 2.1.0, markdown |
|
||||
| 2.6 Acknowledge Command | ✅ `lib.rs` acknowledge() |
|
||||
| Baseline & Diff | ✅ `lib.rs` set_baseline(), show_diff() |
|
||||
| Status Command | ✅ `lib.rs` show_status() |
|
||||
|
||||
### Phase 2 Code Quality Fixes ✅
|
||||
|
||||
- DES/RC4 concept path misclassification: Split into `check_hash_pattern()` and `check_encryption_pattern()`
|
||||
- SHA1 edge case: Documented as intentionally broad
|
||||
- JS exec() regex: Tightened to require `child_process.` prefix
|
||||
|
||||
---
|
||||
|
||||
## Phase 2A: Concept Matching ✅
|
||||
|
||||
- **2A.1 Leaf-Based Matching**: `ConceptIndex` with tail-path matching (last 2 segments + predicate)
|
||||
- **2A.2 Alias Resolution**: Wired `AliasStore` into `QueryEngine.execute()` with `resolve_aliases: bool`
|
||||
- **2A.3 Auto-Alias Creation**: Auto-creates aliases when code and authority share leaf names
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Authoritative Corpus Expansion ✅
|
||||
|
||||
Expanded from 11 hardcoded assertions to a pluggable corpus system.
|
||||
|
||||
- **1.1 CorpusBuilder Trait** ✅ — name, scheme, default_tier, build, requires_network
|
||||
- **1.2 RFC Ingester** ✅ — HTTP fetching, RFC 2119 keyword parsing, 8 RFC-specific parsers
|
||||
- **1.3 OWASP Ingester** ✅ — GitHub raw content, 9 cheat sheet parsers
|
||||
- **1.4 Vendor Docs** ✅ — PostgreSQL, Redis, reqwest, hyper, Go net/http, tokio-postgres, SQLx
|
||||
- **1.5 Hardcoded Refactor** ✅ — Original 11 assertions as `HardcodedCorpusBuilder`
|
||||
- **1.6 CLI Integration** ✅ — `aphoria corpus build/list`, `--only`, `--offline`, `--clear-cache`
|
||||
- **1.7 Error Handling** ✅ — Per-source graceful degradation
|
||||
|
||||
**Files:** `corpus/mod.rs`, `corpus/hardcoded.rs`, `corpus/rfc.rs`, `corpus/owasp.rs`, `corpus/vendor.rs`
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Skill Integration ✅
|
||||
|
||||
- **3.1 Claude Code Skill** ✅ — `/aphoria scan`, `scan --fix`, `ack`, `status`, `diff`, `init`, `baseline`
|
||||
- **3.2 Agent Pre-Flight Hook** ✅ — `--exit-code` (2=BLOCK, 1=FLAG, 0=clean), `--strict`
|
||||
- **3.3 Alias Suggestion** ✅ — Auto-alias creation from Phase 2A.3
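The exit-code contract in 3.2 could look roughly like the sketch below. The `Severity` enum and `exit_code` function are hypothetical names. The behaviour of `--strict` is not spelled out here, so treating it as "promote FLAG to BLOCK" is purely an assumption.

```rust
/// Hypothetical severity of a single finding, ordered from least to most severe.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Clean,
    Flag,
    Block,
}

/// Map the worst finding to a process exit code: 2 = BLOCK, 1 = FLAG, 0 = clean.
/// Assumption: `strict` escalates FLAG to the BLOCK exit code.
fn exit_code(findings: &[Severity], strict: bool) -> i32 {
    let worst = findings.iter().copied().max().unwrap_or(Severity::Clean);
    match worst {
        Severity::Block => 2,
        Severity::Flag if strict => 2,
        Severity::Flag => 1,
        Severity::Clean => 0,
    }
}
```

An agent pre-flight hook would then only need to branch on the process exit status rather than parse the report.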
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Full-Cycle Pre-Commit (Scan + Sync) ✅
|
||||
|
||||
Bidirectional knowledge sync: extract → check → classify → update → gate.
|
||||
|
||||
- **4A Observational Claims** ✅ — `--sync` records novel claims as Tier 4 observations
|
||||
- **4B Self-Conflict Detection** ✅ — Drift detection with `Verdict::Drift`
|
||||
- **4C Diff-Only Scanning** ✅ — `--staged` for fast pre-commit hooks
|
||||
- **4D Enhanced Ack** ✅ — `--reason`, `aphoria update` for policy changes
|
||||
- **4E Hosted Mode** ✅ — Team aggregation via central StemeDB server, `HostedClient`
|
||||
|
||||
---
|
||||
|
||||
## Phase 4.5: Ephemeral Scan Mode ✅
|
||||
|
||||
40x faster scans by skipping Episteme storage. Default mode ~0.25s, persistent ~1-2s.
|
||||
|
||||
- `ScanMode` enum (Ephemeral default, Persistent opt-in with `--persist`)
|
||||
- `EphemeralDetector` — in-memory corpus + ConceptIndex
|
||||
- `check_conflicts_pure()` extracted as standalone function
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Research Agent Loop ✅
|
||||
|
||||
- **5.1 Gap Detection** ✅ — `detect_gaps()` compares claims against ConceptIndex
|
||||
- **5.2 Gap Storage** ✅ — JSON-backed persistent storage with eligibility tracking
|
||||
- **5.3 Quality Validation** ✅ — Source attribution, normative language, vague content detection
|
||||
- **5.4 Research Execution** ✅ — HTTP fetching, normative extraction, confidence scoring
|
||||
- **5.5 CLI Integration** ✅ — `aphoria research run/status/gaps`
|
||||
- **5.6 Community Corpus** ✅ — Opt-in anonymous pattern sharing with privacy-preserving anonymization
|
||||
- **5.7 Security Extractors** ✅ — weak_crypto, command_injection, sql_injection
|
||||
|
||||
---
|
||||
|
||||
## Phase 6: Federated Policy & Trust Packs ✅
|
||||
|
||||
- **6.1 Trust Pack Format** ✅ — rkyv serialization, Ed25519 signing
|
||||
- **6.2 Policy Management** ✅ — Local and remote loading with caching
|
||||
- **6.3 Core Integration** ✅ — EphemeralDetector + LocalEpisteme policy ingestion
|
||||
- **6.4 CLI Commands** ✅ — `aphoria policy export`, auto-loading
|
||||
|
||||
---
|
||||
|
||||
## Phase 6.5: Trust Pack Extensions ✅
|
||||
|
||||
- **6.5.1 Predicate Aliases** ✅ — `enabled` ↔ `required` ↔ `mandatory` ↔ `enforced`
|
||||
- **6.5.2 Pack Signing Key Rotation** ✅ — `aphoria policy resign`, signature chain audit trail
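One way to picture the predicate aliases in 6.5.1 is canonicalization before comparison. The function names below are hypothetical, and picking `enabled` as the canonical form is an arbitrary assumption; the real alias table may live in the Trust Pack itself.

```rust
/// Collapse the alias group from 6.5.1 onto one canonical predicate.
/// Assumption: `enabled` is the canonical member of the group.
fn canonical_predicate(predicate: &str) -> &str {
    match predicate {
        "enabled" | "required" | "mandatory" | "enforced" => "enabled",
        other => other,
    }
}

/// Two predicates match when they canonicalize to the same name.
fn predicates_match(a: &str, b: &str) -> bool {
    canonical_predicate(a) == canonical_predicate(b)
}
```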
|
||||
|
||||
---
|
||||
|
||||
## Phase 7: Declarative Extractors ✅
|
||||
|
||||
TOML-defined custom extractors without Rust code.
|
||||
|
||||
- **7.1 Core Types** ✅ — `DeclarativeExtractorDef`, `DeclarativeExtractor`
|
||||
- **7.2 Configuration** ✅ — `[[extractors.declarative]]` in aphoria.toml
|
||||
- **7.3 Validation** ✅ — ReDoS protection, confidence validation
|
||||
- **7.4 Registry Integration** ✅ — Enable/disable, Trust Pack integration
|
||||
- **7.5 Error Handling** ✅
|
||||
- **7.6 Tests** ✅ — 22 unit + 7 integration tests
|
||||
|
||||
---
|
||||
|
||||
## Phase 7.5: LLM-in-the-Loop Extraction ✅
|
||||
|
||||
Gemini-powered semantic extraction for high-value files.
|
||||
|
||||
- **7.5.1 LLM Extractor** ✅ — `GeminiClient`, structured JSON output
|
||||
- **7.5.2 Selective Triggering** ✅ — `is_high_value_file()`, token budget
|
||||
- **7.5.3 Cost Controls** ✅ — BLAKE3 caching, budget enforcement
|
||||
- **7.5.4 Configuration** ✅ — `[llm]` section in aphoria.toml
|
||||
|
||||
---
|
||||
|
||||
## Phase 7.6: Pattern Learning Store ✅
|
||||
|
||||
Remember the patterns the LLM finds so they can be promoted to declarative extractors.
|
||||
|
||||
- **7.6.1 Schema** ✅ — `LearnedPattern`, `ClaimTemplate`, `ValueType`
|
||||
- **7.6.2 PatternStore** ✅ — JSON-backed, RwLock thread safety, Levenshtein dedup
|
||||
- **7.6.3 Normalization** ✅ — Version/boolean/number/string placeholder replacement
|
||||
- **7.6.4 Configuration** ✅ — `[learning]` section
|
||||
- **7.6.5 Scan Integration** ✅ — Project hash, record/update patterns
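The Levenshtein deduplication mentioned in 7.6.2 can be sketched with the standard two-row dynamic-programming algorithm. The `is_near_duplicate` helper and its `max_distance` parameter are hypothetical; the threshold the real `PatternStore` uses is not stated here.

```rust
/// Classic Levenshtein edit distance, computed over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // `prev[j]` holds the distance between the processed prefix of `a` and `b[..j]`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let substitute = prev[j] + cost;
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            cur.push(substitute.min(delete).min(insert));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Hypothetical dedup check: treat two normalized patterns as the same
/// when their edit distance is at most `max_distance`.
fn is_near_duplicate(a: &str, b: &str, max_distance: usize) -> bool {
    levenshtein(a, b) <= max_distance
}
```

Running the check on normalized patterns (after the placeholder replacement in 7.6.3) keeps differences in literal values from inflating the distance.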
|
||||
|
||||
---
|
||||
|
||||
## Phase 7.7: Pattern → Extractor Promotion ✅
|
||||
|
||||
Learned patterns become declarative extractors via LLM regex generation.
|
||||
|
||||
- **7.7.1 Pipeline** ✅ — `PromotionPipeline`, `RegexGenerator`, `ExtractorValidator`, `YamlWriter`
|
||||
- **7.7.2 Regex Generation** ✅ — Multi-example prompt, ReDoS safety
|
||||
- **7.7.3 Validation** ✅ — Positive tests, timing validation
|
||||
- **7.7.4 Human Review** ✅ — `aphoria extractors review/stats/candidates/promote`
|
||||
- **7.7.5 Extractor Output** ✅ — YAML files in `.aphoria/extractors/learned/`
|
||||
|
||||
---
|
||||
|
||||
## Phase 7.8: LLM Prompt Evaluation ✅
|
||||
|
||||
Golden fixtures with precision/recall metrics and regression detection.
|
||||
|
||||
- **7.8.1 Fixture Format** ✅ — TOML-based with `must_contain`/`must_not_contain`
|
||||
- **7.8.2 Claim Matching** ✅ — Tail-path matching, type coercion
|
||||
- **7.8.3 Metrics** ✅ — Precision/Recall/F1, per-category breakdown
|
||||
- **7.8.4 Harness** ✅ — Live/Cached/Mock modes, regression detection
|
||||
- **7.8.5 Reports** ✅ — Table, JSON, Markdown
|
||||
- **7.8.6 CLI** ✅ — `aphoria eval run/baseline/update-baseline/list-fixtures/validate-fixtures`
|
||||
- **7.8.7 Seed Fixtures** ✅ — 10 fixtures across tls, jwt, secrets, auth, negative, edge
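For reference, the metrics in 7.8.3 follow the usual definitions. The sketch below uses hypothetical names (`Counts`, `precision_recall_f1`). How fixture results map onto the counts (for example, a found `must_contain` entry as a true positive and a found `must_not_contain` entry as a false positive) is an assumption about the harness.

```rust
/// Hypothetical tally of fixture outcomes.
#[derive(Debug, Clone, Copy)]
struct Counts {
    true_pos: u32,
    false_pos: u32,
    false_neg: u32,
}

/// Safe division that yields 0.0 when the denominator is zero.
fn ratio(num: u32, den: u32) -> f64 {
    if den == 0 {
        0.0
    } else {
        num as f64 / den as f64
    }
}

/// Returns `(precision, recall, f1)`.
fn precision_recall_f1(c: Counts) -> (f64, f64, f64) {
    let precision = ratio(c.true_pos, c.true_pos + c.false_pos);
    let recall = ratio(c.true_pos, c.true_pos + c.false_neg);
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    (precision, recall, f1)
}
```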
|
||||
|
||||
---
|
||||
|
||||
## Phase 8: Enterprise Extractor Improvements ✅
|
||||
|
||||
42 extractors total. Enterprise-grade detection for production codebases.
|
||||
|
||||
- **8.1 High-Entropy Secrets** ✅ — Shannon entropy, known prefixes (AWS/Stripe/GitHub/GitLab/Slack)
|
||||
- **8.2 Framework Extractors (10)** ✅ — Spring, Django, Express, Rails, ASP.NET, Laravel, FastAPI, Next.js, Flask, NestJS
|
||||
- **8.3 Config Deep Parsing** ✅ — YAML/JSON/TOML tree walking, 11 security rules
|
||||
- **8.4 Semantic TLS Version** ✅ — Cross-language const detection, Terraform, Kubernetes
|
||||
- **8.5 ORM SQL Injection** ✅ — Django/SQLAlchemy/GORM/ActiveRecord/Prisma/Sequelize
|
||||
- **8.6 Path Traversal** ✅
|
||||
- **8.7 Unvalidated Redirects** ✅
|
||||
- **8.8 Weak Password** ✅
|
||||
- **8.9 Security Headers** ✅
|
||||
- **8.10 Insecure Deserialization** ✅
|
||||
- **8.11 SSRF** ✅
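The high-entropy secret detection in 8.1 combines two signals that can be sketched as follows. The prefix list uses well-known token formats for the named providers, but whether Aphoria checks exactly these is an assumption, as are the `looks_like_secret` name, the 20-character minimum, and the 4.0 bits-per-byte threshold.

```rust
/// Shannon entropy of a string, in bits per byte.
fn shannon_entropy(s: &str) -> f64 {
    let bytes = s.as_bytes();
    if bytes.is_empty() {
        return 0.0;
    }
    let mut counts = [0usize; 256];
    for &b in bytes {
        counts[b as usize] += 1;
    }
    let n = bytes.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Illustrative prefixes: AWS access key ID, Stripe live key, GitHub PAT,
/// GitLab PAT, Slack bot token.
const KNOWN_PREFIXES: [&str; 5] = ["AKIA", "sk_live_", "ghp_", "glpat-", "xoxb-"];

/// Hypothetical heuristic: flag a token with a known prefix, or a long token
/// whose entropy exceeds an assumed threshold.
fn looks_like_secret(token: &str) -> bool {
    let has_known_prefix = KNOWN_PREFIXES.iter().any(|p| token.starts_with(*p));
    let high_entropy = token.len() >= 20 && shannon_entropy(token) > 4.0;
    has_known_prefix || high_entropy
}
```

The prefix check catches short but unmistakable tokens, while the entropy check catches random-looking strings that carry no recognizable prefix.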
|
||||
|
||||
---
|
||||
|
||||
## Phase 9: Autonomous Extractor Generation ✅
|
||||
|
||||
Fully self-improving extraction system.
|
||||
|
||||
- **9.1 Autonomous Promotion** ✅ — >0.95 confidence, >10 projects, full audit trail
|
||||
- **9.2 Shadow Mode Testing** ✅ — Isolated metrics, graduation gate, FP tracking
|
||||
- **9.3 Auto-Rollback** ✅ — FP rate >15% triggers automatic rollback
|
||||
- **9.4 Cross-Project Learning** ✅ — Privacy-preserving pattern sync, community extractors
|
||||
- **9.5 Extractor Versioning** ✅ — Changelogs, rollback, A/B comparison
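The promotion and rollback gates in 9.1 and 9.3 reduce to two threshold checks. The struct and function names below are hypothetical, and the sketch assumes both comparisons are strict (`>`), as the notation in the list suggests.

```rust
/// Hypothetical summary of a learned pattern that is a promotion candidate.
struct Candidate {
    confidence: f64,
    project_count: u32,
}

/// 9.1: promote only above 0.95 confidence and when seen in more than 10 projects.
fn should_promote(c: &Candidate) -> bool {
    c.confidence > 0.95 && c.project_count > 10
}

/// 9.3: roll back when the false-positive rate exceeds 15%.
/// Integer arithmetic avoids floating-point edge cases at exactly 15%.
fn should_rollback(false_positives: u64, total_matches: u64) -> bool {
    total_matches > 0 && false_positives * 100 > total_matches * 15
}
```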
|
||||
|
||||
---
|
||||
|
||||
## Phase 10.1: Acknowledgment Expiry ✅
|
||||
|
||||
Time-limited exceptions with `--expires` flag.
|
||||
|
||||
- `--expires 90d` or `--expires 2026-12-31` (ISO 8601)
|
||||
- Expired acks resurface as BLOCK
|
||||
- Preserved for audit trail per patent claim 25
|
||||
- All report formatters show expiry info
|
||||
|
||||
**Files:** `src/expiry.rs`, `cli.rs`, `report/*.rs`
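A rough sketch of parsing the two `--expires` forms using only the standard library. The `Expiry` enum and `parse_expires` function are hypothetical and are not the contents of `src/expiry.rs`. The date branch checks only field ranges, not the number of days in a given month.

```rust
/// Hypothetical parsed form of an `--expires` argument.
#[derive(Debug, PartialEq)]
enum Expiry {
    /// Relative form such as `90d`.
    AfterDays(u32),
    /// Absolute calendar date such as `2026-12-31`.
    OnDate { year: u16, month: u8, day: u8 },
}

fn parse_expires(input: &str) -> Option<Expiry> {
    // Relative form: a non-negative integer followed by `d`.
    if let Some(days) = input.strip_suffix('d') {
        return days.parse::<u32>().ok().map(Expiry::AfterDays);
    }
    // Absolute form: YYYY-MM-DD with basic range checks only.
    let parts: Vec<&str> = input.split('-').collect();
    if parts.len() != 3 {
        return None;
    }
    let year: u16 = parts[0].parse().ok()?;
    let month: u8 = parts[1].parse().ok()?;
    let day: u8 = parts[2].parse().ok()?;
    if (1..=12).contains(&month) && (1..=31).contains(&day) {
        Some(Expiry::OnDate { year, month, day })
    } else {
        None
    }
}
```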
|
||||
|
||||
---
|
||||
|
||||
## Phase 11: Evidence-Based Authority ✅
|
||||
|
||||
Evidence levels (ProductSpec > Standard > Research > Commit-only) with evidence-aware graduation.
|
||||
|
||||
- **11.1 Types** ✅ — `EvidenceLevel`, `PatternEvidence` with ADR/spec/RFC references
|
||||
- **11.2 Detection** ✅ — Commit message parsing, ADR/spec file detection
|
||||
- **11.3 Graduation** ✅ — Thresholds vary by evidence (ProductSpec: 1 usage, Commit-only: 10)
|
||||
- **11.4 Display** ✅ — Evidence chain in output, `--evidence` filter
|
||||
|
||||
**Files:** `src/evidence/mod.rs`, `evidence/types.rs`, `evidence/detection.rs`
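The evidence ordering and the two thresholds stated in 11.3 can be sketched as below. Only the ProductSpec and Commit-only thresholds are given above, so the sketch returns `None` for the other two levels instead of guessing. The function names, and the use of `>=` for "enough usages", are assumptions.

```rust
/// Evidence levels, declared weakest first so the derived ordering gives
/// ProductSpec > Standard > Research > CommitOnly.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum EvidenceLevel {
    CommitOnly,
    Research,
    Standard,
    ProductSpec,
}

/// Usages required before a pattern graduates, where the roadmap states one.
fn graduation_threshold(level: EvidenceLevel) -> Option<u32> {
    match level {
        EvidenceLevel::ProductSpec => Some(1),
        EvidenceLevel::CommitOnly => Some(10),
        // Not specified in this document; left unknown on purpose.
        EvidenceLevel::Standard | EvidenceLevel::Research => None,
    }
}

/// `Some(true)` if the pattern has enough usages, `None` if the threshold is unknown.
fn can_graduate(level: EvidenceLevel, usages: u32) -> Option<bool> {
    graduation_threshold(level).map(|needed| usages >= needed)
}
```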
|
||||
|
||||
---
|
||||
|
||||
## Phase 12: Knowledge Scope Hierarchy ✅
|
||||
|
||||
Organization → Team → Project scope levels with inheritance.
|
||||
|
||||
- **12.1 Scope Types** ✅ — `ScopeLevel` enum, `ScopeConfig`
|
||||
- **12.2 Inheritance** ✅ — Security: no opt-out, Conventions: override with justification
|
||||
- **12.3 Override Workflow** ✅ — Justification + evidence required
|
||||
- **12.4 Cross-Scope Queries** ✅ — `--scope org/team/project`, `--exclude-inherited`
|
||||
|
||||
**Files:** `src/scope/mod.rs`, `scope/config.rs`, `scope/resolver.rs`, `scope/override_record.rs`, `scope/store.rs`
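The inheritance rules in 12.2 and 12.3 amount to a small decision function. The types and names below are hypothetical and much simpler than whatever `scope/resolver.rs` and `scope/override_record.rs` actually contain.

```rust
/// Hypothetical classification of an inherited rule.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RuleKind {
    Security,
    Convention,
}

#[derive(Debug, PartialEq, Eq)]
enum OverrideDecision {
    Allowed,
    Rejected(&'static str),
}

/// Security rules cannot be opted out of; conventions can be overridden
/// only with both a justification and at least one piece of evidence.
fn evaluate_override(kind: RuleKind, justification: &str, evidence: &[&str]) -> OverrideDecision {
    match kind {
        RuleKind::Security => OverrideDecision::Rejected("security rules have no opt-out"),
        RuleKind::Convention if justification.trim().is_empty() => {
            OverrideDecision::Rejected("justification required")
        }
        RuleKind::Convention if evidence.is_empty() => {
            OverrideDecision::Rejected("evidence required")
        }
        RuleKind::Convention => OverrideDecision::Allowed,
    }
}
```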
|
||||
|
||||
---
|
||||
|
||||
## Phase 13: Knowledge Lifecycle Management ✅
|
||||
|
||||
Active → Deprecated → Superseded → Archived lifecycle for patterns.
|
||||
|
||||
- **13.1 Status Types** ✅ — `KnowledgeStatus` enum with history tracking
|
||||
- **13.2 Deprecation** ✅ — `aphoria deprecate` with `--reason`, `--superseded-by`, `--sunset-date`
|
||||
- **13.3 Migration Guidance** ✅ — Warnings in scan output, links to replacements
|
||||
- **13.4 Migration Dashboard** ✅ — `aphoria migrations status`, progress tracking, export
|
||||
|
||||
**Files:** `src/lifecycle/mod.rs`, `lifecycle/store.rs`, `lifecycle/migration.rs`
|
||||
|
||||
---
|
||||
|
||||
## Phase 16: Ignore & Exclusion System ✅
|
||||
|
||||
Clean scans by excluding test fixtures and intentional patterns.
|
||||
|
||||
- **16.1 Glob Patterns** ✅ — `globset` with `**`, `*`, `?` support
|
||||
- **16.2 `.aphoriaignore`** ✅ — Gitignore-style patterns, merged with aphoria.toml
|
||||
- **16.3 Inline Comments** ✅ — `// aphoria:ignore`, `ignore-next-line`, `ignore-block`
|
||||
- **16.4 Ack Export/Import** ✅ — `.aphoria/acks.toml`, version-controllable
|
||||
|
||||
---
|
||||
|
||||
## The Self-Learning Vision (Complete)
|
||||
|
||||
```
|
||||
Phase 7: Declarative Extractors ✅
|
||||
↓
|
||||
Phase 7.5: LLM-in-the-Loop (Gemini semantic extraction) ✅
|
||||
↓
|
||||
Phase 7.6: Pattern Learning (remember what LLM finds) ✅
|
||||
↓
|
||||
Phase 7.7: Pattern Promotion (patterns → extractors) ✅
|
||||
↓
|
||||
Phase 7.8: LLM Prompt Evaluation (measure & improve) ✅
|
||||
↓
|
||||
Phase 8: Enterprise Extractors (42 total) ✅
|
||||
↓
|
||||
Phase 9: Autonomous Generation (fully self-improving) ✅
|
||||
```
|
||||
|
||||
## Milestone Summary (Completed)
|
||||
|
||||
| Phase | Deliverable | Status |
|
||||
|-------|-------------|--------|
|
||||
| 0 | ConceptPath in StemeDB | ✅ |
|
||||
| 2 | Aphoria CLI (scan, report, ack) | ✅ |
|
||||
| 2A | Concept matching (leaf, alias, auto-alias) | ✅ |
|
||||
| 1 | Authoritative corpus expansion | ✅ |
|
||||
| 3 | Claude Code skill + hooks | ✅ |
|
||||
| 4 | Full-cycle pre-commit (sync, drift, staged, hosted) | ✅ |
|
||||
| 4.5 | Ephemeral scan mode (40x faster) | ✅ |
|
||||
| 5 | Research agent loop + community corpus | ✅ |
|
||||
| 6 | Federated Policy & Trust Packs | ✅ |
|
||||
| 6.5 | Trust Pack Extensions | ✅ |
|
||||
| 7 | Declarative Extractors | ✅ |
|
||||
| 7.5 | LLM-in-the-Loop Extraction | ✅ |
|
||||
| 7.6 | Pattern Learning Store | ✅ |
|
||||
| 7.7 | Pattern → Extractor Promotion | ✅ |
|
||||
| 7.8 | LLM Prompt Evaluation | ✅ |
|
||||
| 8 | Enterprise Extractors (42 total) | ✅ |
|
||||
| 9 | Autonomous Extractor Generation | ✅ |
|
||||
| 10.1 | Acknowledgment Expiry | ✅ |
|
||||
| 11 | Evidence-Based Authority | ✅ |
|
||||
| 12 | Knowledge Scope Hierarchy | ✅ |
|
||||
| 13 | Knowledge Lifecycle Management | ✅ |
|
||||
| 16 | Ignore & Exclusion System | ✅ |
|
||||
File diff suppressed because it is too large
@ -50,6 +50,7 @@ pub async fn show_diff(config: &AphoriaConfig) -> Result<String, AphoriaError> {
|
||||
file_source: crate::types::FileSource::All,
|
||||
benchmark: false,
|
||||
show_claims: false,
|
||||
strict: false,
|
||||
};
|
||||
|
||||
let result = run_scan(args, config).await?;
|
||||
|
||||
@ -1,4 +1,4 @@
|
||||
-//! Bridge between ExtractedClaim and Episteme Assertion.
+//! Bridge between Observation and Episteme Assertion.
|
||||
//!
|
||||
//! Converts claims extracted from source code into Episteme assertions
|
||||
//! that can be ingested into the knowledge graph.
|
||||
@ -10,31 +10,67 @@ use stemedb_core::types::{
|
||||
};
|
||||
use tracing::instrument;
|
||||
|
||||
-use crate::types::ExtractedClaim;
+use crate::types::{parse_authority_tier, AuthoredClaim, Observation};
|
||||
|
||||
-/// Convert an ExtractedClaim to an Episteme Assertion.
+/// Convert an Observation to an Episteme Assertion.
|
||||
///
|
||||
/// The assertion is signed with the provided keypair and timestamped.
|
||||
/// Uses `SourceClass::Expert` (Tier 3) for code-extracted claims.
|
||||
#[instrument(skip(signing_key), fields(concept_path = %claim.concept_path, predicate = %claim.predicate))]
|
||||
pub fn claim_to_assertion(
|
||||
-claim: &ExtractedClaim,
+claim: &Observation,
|
||||
signing_key: &SigningKey,
|
||||
timestamp: u64,
|
||||
) -> Assertion {
|
||||
claim_to_assertion_with_tier(claim, signing_key, timestamp, SourceClass::Expert)
|
||||
}
|
||||
|
||||
/// Convert an ExtractedClaim to a Tier 4 (Community) observation.
|
||||
/// Map observation confidence to appropriate tier.
|
||||
///
|
||||
/// Observations (extracted patterns) are assigned tiers based on confidence:
|
||||
/// - High confidence (≥0.9): Tier 4 (Community, weight 0.3)
|
||||
/// - Low confidence (<0.9): Tier 5 (Anecdotal, weight 0.1)
|
||||
///
|
||||
/// This is different from authored claims which use explicit authority tiers.
|
||||
pub fn observation_to_tier(confidence: f32) -> SourceClass {
|
||||
if confidence >= 0.9 {
|
||||
SourceClass::Community // Tier 4
|
||||
} else {
|
||||
SourceClass::Anecdotal // Tier 5
|
||||
}
|
||||
}
|
||||
|
||||
/// Convert an observation (Observation) to an Episteme Assertion.
|
||||
///
|
||||
/// Observations are pattern matches from extractors. Unlike authored claims,
|
||||
/// they lack provenance and consequences. The tier is determined by confidence:
|
||||
/// - High confidence (≥0.9) → Tier 4 (Community, 0.3 weight)
|
||||
/// - Low confidence (<0.9) → Tier 5 (Anecdotal, 0.1 weight)
|
||||
///
|
||||
/// This replaces the fixed Tier 3 mapping previously used for code extractions.
|
||||
#[instrument(skip(signing_key), fields(concept_path = %claim.concept_path, predicate = %claim.predicate, confidence = %claim.confidence))]
|
||||
pub fn observation_to_assertion(
|
||||
claim: &Observation,
|
||||
signing_key: &SigningKey,
|
||||
timestamp: u64,
|
||||
) -> Assertion {
|
||||
let tier = observation_to_tier(claim.confidence);
|
||||
claim_to_assertion_with_tier(claim, signing_key, timestamp, tier)
|
||||
}
|
||||
|
||||
/// Convert an Observation to a Tier 4 (Community) observation.
|
||||
///
|
||||
/// **Deprecated:** Use `observation_to_assertion()` which maps confidence to tier.
|
||||
///
|
||||
/// Used for claims that have no authority conflict — these become "project memory"
|
||||
/// that persists across commits and enables future drift detection.
|
||||
///
|
||||
/// Observations are lower-weight assertions (Tier 4, 0.3 authority weight) that
|
||||
/// record what the code actually does without making authoritative claims.
|
||||
#[deprecated(since = "0.9.0", note = "Use observation_to_assertion() for confidence-based tier mapping")]
|
||||
#[instrument(skip(signing_key), fields(concept_path = %claim.concept_path, predicate = %claim.predicate))]
|
||||
pub fn claim_to_observation(
|
||||
-claim: &ExtractedClaim,
+claim: &Observation,
|
||||
signing_key: &SigningKey,
|
||||
timestamp: u64,
|
||||
) -> Assertion {
|
||||
@ -43,7 +79,7 @@ pub fn claim_to_observation(
|
||||
|
||||
/// Internal helper to create assertions with a specific source class.
|
||||
fn claim_to_assertion_with_tier(
|
||||
-claim: &ExtractedClaim,
+claim: &Observation,
|
||||
signing_key: &SigningKey,
|
||||
timestamp: u64,
|
||||
source_class: SourceClass,
|
||||
@ -91,6 +127,81 @@ fn claim_to_assertion_with_tier(
|
||||
}
|
||||
}
|
||||
|
||||
/// Convert an `AuthoredClaim` to an Episteme Assertion.
|
||||
///
|
||||
/// Unlike extractor-produced assertions, authored claims carry full provenance:
|
||||
/// - `source_class` is derived from the claim's `authority_tier` field
|
||||
/// - `source_metadata` includes provenance, invariant, consequence, and evidence
|
||||
/// - `parent_hash` is computed from the `supersedes` field if present
|
||||
/// - `lifecycle` is `Approved` (authored claims are already reviewed)
|
||||
#[instrument(skip(signing_key), fields(id = %claim.id, concept_path = %claim.concept_path))]
|
||||
pub fn authored_claim_to_assertion(
|
||||
claim: &AuthoredClaim,
|
||||
signing_key: &SigningKey,
|
||||
timestamp: u64,
|
||||
) -> Result<Assertion, crate::AphoriaError> {
|
||||
let source_class = parse_authority_tier(&claim.authority_tier)?;
|
||||
|
||||
let source_metadata = serde_json::json!({
|
||||
"authored": true,
|
||||
"claim_id": claim.id,
|
||||
"provenance": claim.provenance,
|
||||
"invariant": claim.invariant,
|
||||
"consequence": claim.consequence,
|
||||
"evidence": claim.evidence,
|
||||
"category": claim.category,
|
||||
"created_by": claim.created_by,
|
||||
"created_at": claim.created_at,
|
||||
"tool": "aphoria",
|
||||
"tool_version": env!("CARGO_PKG_VERSION"),
|
||||
});
|
||||
|
||||
// Source hash from claim ID (stable, deterministic)
|
||||
let source_hash = compute_authored_claim_hash(&claim.id);
|
||||
|
||||
// Compute parent hash from superseded claim ID if present
|
||||
let parent_hash =
|
||||
claim.supersedes.as_ref().map(|sid| compute_authored_claim_hash(sid));
|
||||
|
||||
// Sign subject:predicate
|
||||
let message = format!("{}:{}", claim.concept_path, claim.predicate);
|
||||
let signature = signing_key.sign(message.as_bytes());
|
||||
let verifying_key = signing_key.verifying_key();
|
||||
|
||||
let signature_entry = SignatureEntry {
|
||||
agent_id: verifying_key.to_bytes(),
|
||||
signature: signature.to_bytes(),
|
||||
timestamp,
|
||||
version: 1,
|
||||
};
|
||||
|
||||
Ok(Assertion {
|
||||
subject: claim.concept_path.clone(),
|
||||
predicate: claim.predicate.clone(),
|
||||
object: claim.value.to_object_value(),
|
||||
parent_hash,
|
||||
source_hash,
|
||||
source_class,
|
||||
visual_hash: None,
|
||||
epoch: None,
|
||||
source_metadata: serde_json::to_vec(&source_metadata).ok(),
|
||||
lifecycle: LifecycleStage::Approved,
|
||||
signatures: vec![signature_entry],
|
||||
confidence: 1.0, // Authored claims have full confidence
|
||||
timestamp,
|
||||
hlc_timestamp: HlcTimestamp::default(),
|
||||
vector: None,
|
||||
})
|
||||
}
|
||||
|
||||
/// Compute a deterministic hash from an authored claim ID.
|
||||
fn compute_authored_claim_hash(claim_id: &str) -> Hash {
|
||||
let mut hasher = Hasher::new();
|
||||
hasher.update(b"authored-claim:");
|
||||
hasher.update(claim_id.as_bytes());
|
||||
*hasher.finalize().as_bytes()
|
||||
}
|
||||
|
||||
/// Compute the content hash of an assertion for deduplication.
|
||||
#[allow(dead_code)]
|
||||
pub fn compute_assertion_hash(assertion: &Assertion) -> Hash {
|
||||
@ -160,7 +271,7 @@ mod tests {
|
||||
|
||||
#[test]
|
||||
fn test_claim_to_assertion() {
|
||||
-let claim = ExtractedClaim {
+let claim = Observation {
|
||||
concept_path: "code://rust/myapp/tls/cert_verification".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(false),
|
||||
@ -186,8 +297,71 @@ mod tests {
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_observation_to_tier_high_confidence() {
|
||||
assert_eq!(observation_to_tier(1.0), SourceClass::Community);
|
||||
assert_eq!(observation_to_tier(0.95), SourceClass::Community);
|
||||
assert_eq!(observation_to_tier(0.9), SourceClass::Community);
|
||||
assert_eq!(observation_to_tier(0.9).tier(), 4);
|
||||
assert!((observation_to_tier(0.9).authority_weight() - 0.3).abs() < f32::EPSILON);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_observation_to_tier_low_confidence() {
|
||||
assert_eq!(observation_to_tier(0.89), SourceClass::Anecdotal);
|
||||
assert_eq!(observation_to_tier(0.5), SourceClass::Anecdotal);
|
||||
assert_eq!(observation_to_tier(0.1), SourceClass::Anecdotal);
|
||||
assert_eq!(observation_to_tier(0.5).tier(), 5);
|
||||
assert!((observation_to_tier(0.5).authority_weight() - 0.1).abs() < f32::EPSILON);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_observation_to_assertion_high_confidence() {
|
||||
let observation = Observation {
|
||||
concept_path: "code://rust/myapp/tls/cert_verification".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
file: "src/client.rs".to_string(),
|
||||
line: 42,
|
||||
matched_text: "verify_certs = true".to_string(),
|
||||
confidence: 0.95,
|
||||
description: "TLS verification enabled".to_string(),
|
||||
};
|
||||
|
||||
let key = generate_signing_key();
|
||||
let assertion = observation_to_assertion(&observation, &key, 1706832000);
|
||||
|
||||
assert_eq!(assertion.source_class, SourceClass::Community); // Tier 4
|
||||
assert_eq!(assertion.source_class.tier(), 4);
|
||||
assert!((assertion.source_class.authority_weight() - 0.3).abs() < f32::EPSILON);
|
||||
assert_eq!(assertion.confidence, 0.95);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_observation_to_assertion_low_confidence() {
|
||||
let observation = Observation {
|
||||
concept_path: "code://rust/myapp/config/timeout".to_string(),
|
||||
predicate: "value".to_string(),
|
||||
value: ObjectValue::Number(30.0),
|
||||
file: "src/config.rs".to_string(),
|
||||
line: 15,
|
||||
matched_text: "timeout = 30".to_string(),
|
||||
confidence: 0.7,
|
||||
description: "Timeout configuration".to_string(),
|
||||
};
|
||||
|
||||
let key = generate_signing_key();
|
||||
let assertion = observation_to_assertion(&observation, &key, 1706832000);
|
||||
|
||||
assert_eq!(assertion.source_class, SourceClass::Anecdotal); // Tier 5
|
||||
assert_eq!(assertion.source_class.tier(), 5);
|
||||
assert!((assertion.source_class.authority_weight() - 0.1).abs() < f32::EPSILON);
|
||||
assert_eq!(assertion.confidence, 0.7);
|
||||
}
|
||||
|
||||
#[test]
|
||||
#[allow(deprecated)]
|
||||
fn test_claim_to_observation_sets_tier4() {
|
||||
-let claim = ExtractedClaim {
+let claim = Observation {
|
||||
concept_path: "code://rust/myapp/logging/level".to_string(),
|
||||
predicate: "value".to_string(),
|
||||
value: ObjectValue::Text("debug".to_string()),
|
||||
@ -215,8 +389,9 @@ mod tests {
|
||||
}
|
||||
|
||||
#[test]
|
||||
#[allow(deprecated)]
|
||||
fn test_claim_to_observation_preserves_metadata() {
|
||||
-let claim = ExtractedClaim {
+let claim = Observation {
|
||||
concept_path: "code://rust/myapp/db/pool_size".to_string(),
|
||||
predicate: "value".to_string(),
|
||||
value: ObjectValue::Number(10.0),
|
||||
@ -245,7 +420,7 @@ mod tests {
|
||||
|
||||
#[test]
|
||||
fn test_assertion_hash_deterministic() {
|
||||
-let claim = ExtractedClaim {
+let claim = Observation {
|
||||
concept_path: "code://rust/myapp/tls/cert_verification".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(false),
|
||||
@ -275,4 +450,106 @@ mod tests {
|
||||
// Same key should be loaded
|
||||
assert_eq!(key1.to_bytes(), key2.to_bytes());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_authored_claim_to_assertion() {
|
||||
use crate::types::authored_claim::{AuthoredValue, ClaimStatus};
|
||||
|
||||
let claim = AuthoredClaim {
|
||||
id: "wallet-seqcst-001".to_string(),
|
||||
concept_path: "maxwell/wallet/atomics/ordering".to_string(),
|
||||
predicate: "required_ordering".to_string(),
|
||||
value: AuthoredValue::Text("SeqCst".to_string()),
|
||||
comparison: Default::default(),
|
||||
provenance: "Safety analysis".to_string(),
|
||||
invariant: "All wallet atomics MUST use SeqCst".to_string(),
|
||||
consequence: "Double-spend race condition".to_string(),
|
||||
authority_tier: "expert".to_string(),
|
||||
evidence: vec!["ADR-003".to_string()],
|
||||
category: "safety".to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: "jml".to_string(),
|
||||
created_at: "2026-02-08T12:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
let key = generate_signing_key();
|
||||
let assertion =
|
||||
authored_claim_to_assertion(&claim, &key, 1706832000).expect("convert");
|
||||
|
||||
assert_eq!(assertion.subject, "maxwell/wallet/atomics/ordering");
|
||||
assert_eq!(assertion.predicate, "required_ordering");
|
||||
assert_eq!(assertion.object, ObjectValue::Text("SeqCst".to_string()));
|
||||
assert_eq!(assertion.source_class, SourceClass::Expert);
|
||||
assert_eq!(assertion.confidence, 1.0);
|
||||
assert!(assertion.parent_hash.is_none());
|
||||
assert_eq!(assertion.lifecycle, LifecycleStage::Approved);
|
||||
|
||||
// Verify metadata includes provenance fields
|
||||
let metadata: serde_json::Value =
|
||||
serde_json::from_slice(assertion.source_metadata.as_ref().expect("metadata"))
|
||||
.expect("parse");
|
||||
assert_eq!(metadata["authored"], true);
|
||||
assert_eq!(metadata["claim_id"], "wallet-seqcst-001");
|
||||
assert_eq!(metadata["provenance"], "Safety analysis");
|
||||
assert_eq!(metadata["invariant"], "All wallet atomics MUST use SeqCst");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_authored_claim_with_supersedes() {
|
||||
use crate::types::authored_claim::{AuthoredValue, ClaimStatus};
|
||||
|
||||
let claim = AuthoredClaim {
|
||||
id: "wallet-ordering-v2".to_string(),
|
||||
concept_path: "maxwell/wallet/atomics/ordering".to_string(),
|
||||
predicate: "required_ordering".to_string(),
|
||||
value: AuthoredValue::Text("Acquire".to_string()),
|
||||
comparison: Default::default(),
|
||||
provenance: "Updated safety analysis".to_string(),
|
||||
invariant: "Wallet atomics should use Acquire".to_string(),
|
||||
consequence: "Performance degradation".to_string(),
|
||||
authority_tier: "expert".to_string(),
|
||||
evidence: vec![],
|
||||
category: "safety".to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: Some("wallet-seqcst-001".to_string()),
|
||||
created_by: "jml".to_string(),
|
||||
created_at: "2026-02-08T13:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
let key = generate_signing_key();
|
||||
let assertion =
|
||||
authored_claim_to_assertion(&claim, &key, 1706832000).expect("convert");
|
||||
|
||||
assert!(assertion.parent_hash.is_some());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_authored_claim_invalid_tier() {
|
||||
use crate::types::authored_claim::{AuthoredValue, ClaimStatus};
|
||||
|
||||
let claim = AuthoredClaim {
|
||||
id: "bad-tier".to_string(),
|
||||
concept_path: "test/path".to_string(),
|
||||
predicate: "test".to_string(),
|
||||
value: AuthoredValue::Bool(true),
|
||||
comparison: Default::default(),
|
||||
provenance: "test".to_string(),
|
||||
invariant: "test".to_string(),
|
||||
consequence: "test".to_string(),
|
||||
authority_tier: "invalid_tier".to_string(),
|
||||
evidence: vec![],
|
||||
category: "safety".to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: "test".to_string(),
|
||||
created_at: "2026-02-08T12:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
let key = generate_signing_key();
|
||||
assert!(authored_claim_to_assertion(&claim, &key, 1706832000).is_err());
|
||||
}
|
||||
}
|
||||
|
||||
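
The tests above depend on tier validation: an unrecognised `authority_tier` string must make the conversion fail. The crate's own `parse_authority_tier` and `format_authority_tier` are not shown in this part of the diff, so the following is only a minimal sketch of how such a pair could behave. The names `TIERS`, `parse_tier`, and `format_tier` are hypothetical. The tier order is an assumption taken from the `--tier` help text in `cli/claims.rs`, and the numbering is chosen to agree with the `Expert (Tier 3)` assertion in `claims_explain.rs`.

```rust
/// Hypothetical tier table; the order and numbering are assumptions, not the crate's definition.
const TIERS: [&str; 6] = [
    "regulatory",
    "clinical",
    "observational",
    "expert",
    "community",
    "anecdotal",
];

/// Map a tier name to its index, rejecting any name that is not in `TIERS`.
pub fn parse_tier(raw: &str) -> Result<u8, String> {
    TIERS
        .iter()
        .position(|t| *t == raw)
        .map(|i| i as u8)
        .ok_or_else(|| format!("unknown authority tier: {raw}"))
}

/// Render a tier index for display, e.g. "Expert (Tier 3)".
pub fn format_tier(tier: u8) -> String {
    match TIERS.get(tier as usize) {
        Some(name) => {
            let mut chars = name.chars();
            let first = chars
                .next()
                .map(|c| c.to_ascii_uppercase())
                .unwrap_or_default();
            format!("{first}{} (Tier {tier})", chars.as_str())
        }
        None => format!("Unknown (Tier {tier})"),
    }
}
```

Keeping the tier as a validated index rather than a free-form string is one way to make the "invalid tier" case an error at conversion time, which is the behaviour `test_authored_claim_invalid_tier` checks for.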
120
applications/aphoria/src/claim_store.rs
Normal file
@ -0,0 +1,120 @@
|
||||
//! Claim storage interface and implementations.
|
||||
//!
|
||||
//! Provides persistence for human-authored claims (not observations).
|
||||
//! Claims are stored in `.aphoria/claims.toml` for version control.
|
||||
|
||||
use crate::types::AuthoredClaim;
|
||||
use crate::AphoriaError;
|
||||
use std::path::PathBuf;
|
||||
|
||||
/// Filter criteria for querying claims.
|
||||
#[derive(Debug, Clone, Default)]
|
||||
pub struct ClaimFilter {
|
||||
/// Filter by concept path (exact match)
|
||||
pub concept_path: Option<String>,
|
||||
|
||||
/// Filter by predicate (exact match)
|
||||
pub predicate: Option<String>,
|
||||
|
||||
/// Filter by authority tier
|
||||
pub authority_tier: Option<u8>,
|
||||
}
|
||||
|
||||
/// Statistics from bulk import operations.
|
||||
#[derive(Debug, Default)]
|
||||
pub struct ImportStats {
|
||||
/// Number of claims successfully imported
|
||||
pub imported: usize,
|
||||
|
||||
/// Number of claims skipped (duplicates)
|
||||
pub skipped: usize,
|
||||
|
||||
/// Number of claims that failed to import
|
||||
pub errors: usize,
|
||||
}
|
||||
|
||||
/// Trait for claim storage backends.
///
/// Implementations provide persistence for `AuthoredClaim` instances.
/// The planned primary implementation is `TomlClaimStore` (still a stub in
/// this commit), which stores claims in `.aphoria/claims.toml` for version control.
pub trait ClaimStore: Send + Sync {
    /// Save a new claim or update an existing one.
    ///
    /// Claims are identified by their `(concept_path, predicate)` tuple.
    /// If a claim with the same tuple exists, it is replaced.
    fn save_claim(&self, claim: &AuthoredClaim) -> Result<(), AphoriaError>;

    /// Load a specific claim by concept path and predicate.
    fn load_claim(
        &self,
        concept_path: &str,
        predicate: &str,
    ) -> Result<Option<AuthoredClaim>, AphoriaError>;

    /// List all claims matching the filter criteria.
    ///
    /// If the filter is empty (all fields `None`), returns all claims.
    fn list_claims(&self, filter: &ClaimFilter) -> Result<Vec<AuthoredClaim>, AphoriaError>;

    /// Delete a claim by concept path and predicate.
    ///
    /// Returns `true` if a claim was deleted, `false` if not found.
    fn delete_claim(&self, concept_path: &str, predicate: &str) -> Result<bool, AphoriaError>;

    /// Import multiple claims in bulk.
    ///
    /// This default implementation forwards every claim to `save_claim`, so a
    /// duplicate (same concept_path + predicate) replaces the stored claim
    /// rather than being skipped, and `skipped` stays at zero. Individual save
    /// failures are counted in `errors` instead of aborting the import.
    fn import_claims(&self, claims: &[AuthoredClaim]) -> Result<ImportStats, AphoriaError> {
        let mut stats = ImportStats::default();

        for claim in claims {
            match self.save_claim(claim) {
                Ok(()) => stats.imported += 1,
                Err(_) => stats.errors += 1,
            }
        }

        Ok(stats)
    }

    /// Export claims matching filter criteria.
    fn export_claims(&self, filter: &ClaimFilter) -> Result<Vec<AuthoredClaim>, AphoriaError> {
        self.list_claims(filter)
    }
}
|
||||
|
||||
/// File-based claim storage using TOML format.
|
||||
///
|
||||
/// Stores claims in `.aphoria/claims.toml` relative to the base directory.
|
||||
/// This allows claims to be version-controlled alongside code.
|
||||
///
|
||||
/// # Note
|
||||
///
|
||||
/// This is a stub implementation. The `ClaimStore` trait is not yet implemented
|
||||
/// for this struct.
|
||||
// TODO(A4): Implement `ClaimStore for TomlClaimStore` using ClaimsFile for persistence.
|
||||
#[allow(dead_code)]
|
||||
pub struct TomlClaimStore {
|
||||
base_dir: PathBuf,
|
||||
}
|
||||
|
||||
#[allow(dead_code)]
|
||||
impl TomlClaimStore {
|
||||
/// Create a new TOML claim store.
|
||||
///
|
||||
/// # Arguments
|
||||
///
|
||||
/// * `base_dir` - Project root directory (claims stored in `{base_dir}/.aphoria/claims.toml`)
|
||||
pub fn new(base_dir: PathBuf) -> Self {
|
||||
Self { base_dir }
|
||||
}
|
||||
|
||||
/// Get the path to the claims file.
|
||||
fn claims_file(&self) -> PathBuf {
|
||||
self.base_dir.join(".aphoria").join("claims.toml")
|
||||
}
|
||||
}
|
||||
|
||||
// The `ClaimStore` implementation for `TomlClaimStore` will be added in Commit 4 (Claim Storage).
// For now, this file provides only the trait interface and the stub struct above.
|
||||
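
The `ImportStats` struct has a `skipped` counter for duplicates, but the default `import_claims` only ever increments `imported` and `errors`. As an illustration of what skip-on-duplicate semantics could look like, here is a minimal in-memory sketch. `MiniClaim`, `MiniImportStats`, and `MemoryStore` are hypothetical stand-ins, not types from the crate; the only thing taken from the diff is the `(concept_path, predicate)` identity and the three counters.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical, trimmed-down stand-in for `AuthoredClaim`.
#[derive(Debug, Clone, PartialEq)]
pub struct MiniClaim {
    pub concept_path: String,
    pub predicate: String,
    pub value: String,
}

/// Same three counters as `ImportStats` in the diff.
#[derive(Debug, Default, PartialEq)]
pub struct MiniImportStats {
    pub imported: usize,
    pub skipped: usize,
    pub errors: usize,
}

/// Hypothetical in-memory backend keyed by `(concept_path, predicate)`.
/// The `Mutex` lets methods take `&self` while the store stays `Send + Sync`.
#[derive(Default)]
pub struct MemoryStore {
    claims: Mutex<HashMap<(String, String), MiniClaim>>,
}

impl MemoryStore {
    /// Insert the claim, replacing any existing claim with the same key.
    pub fn save(&self, claim: &MiniClaim) -> Result<(), String> {
        let mut map = self.claims.lock().map_err(|e| e.to_string())?;
        map.insert(
            (claim.concept_path.clone(), claim.predicate.clone()),
            claim.clone(),
        );
        Ok(())
    }

    /// Look up a claim by its key.
    pub fn load(&self, concept_path: &str, predicate: &str) -> Result<Option<MiniClaim>, String> {
        let map = self.claims.lock().map_err(|e| e.to_string())?;
        Ok(map
            .get(&(concept_path.to_string(), predicate.to_string()))
            .cloned())
    }

    /// Bulk import that leaves existing claims untouched and counts them as skipped.
    pub fn import(&self, claims: &[MiniClaim]) -> MiniImportStats {
        let mut stats = MiniImportStats::default();
        for claim in claims {
            match self.load(&claim.concept_path, &claim.predicate) {
                Ok(Some(_)) => stats.skipped += 1,
                Ok(None) => match self.save(claim) {
                    Ok(()) => stats.imported += 1,
                    Err(_) => stats.errors += 1,
                },
                Err(_) => stats.errors += 1,
            }
        }
        stats
    }
}
```

Checking for an existing key before saving is the design choice that distinguishes "skip" from "replace". Whether the real store should skip or overwrite on import is a policy decision this sketch does not settle.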
200
applications/aphoria/src/claims_explain.rs
Normal file
@ -0,0 +1,200 @@
|
||||
//! Markdown rendering for authored claims.
|
||||
//!
|
||||
//! Generates `claims-explained.md`-style output, grouping claims by category
|
||||
//! and rendering full provenance details.
|
||||
|
||||
use crate::types::authored_claim::{format_authority_tier, parse_authority_tier, AuthoredClaim, ClaimStatus};
|
||||
|
||||
/// Render all claims as a markdown document grouped by category.
|
||||
pub fn render_claims_markdown(claims: &[AuthoredClaim], project_name: &str) -> String {
|
||||
let mut out = String::new();
|
||||
out.push_str(&format!("# Claims for {project_name}\n\n"));
|
||||
|
||||
// Group by category
|
||||
let mut categories: std::collections::BTreeMap<String, Vec<&AuthoredClaim>> =
|
||||
std::collections::BTreeMap::new();
|
||||
for claim in claims {
|
||||
categories.entry(claim.category.clone()).or_default().push(claim);
|
||||
}
|
||||
|
||||
if categories.is_empty() {
|
||||
out.push_str("No claims authored yet.\n");
|
||||
return out;
|
||||
}
|
||||
|
||||
for (category, cat_claims) in &categories {
|
||||
let active_count = cat_claims.iter().filter(|c| c.status == ClaimStatus::Active).count();
|
||||
let title = capitalize(category);
|
||||
out.push_str(&format!(
|
||||
"## {title} Claims ({active_count} active, {} total)\n\n",
|
||||
cat_claims.len()
|
||||
));
|
||||
|
||||
for claim in cat_claims {
|
||||
render_single_claim(&mut out, claim);
|
||||
out.push('\n');
|
||||
}
|
||||
}
|
||||
|
||||
out
|
||||
}
|
||||
|
||||
/// Render a single claim as a markdown section.
|
||||
pub fn render_single_claim(out: &mut String, claim: &AuthoredClaim) {
|
||||
let status_badge = match claim.status {
|
||||
ClaimStatus::Active => "",
|
||||
ClaimStatus::Deprecated => " [DEPRECATED]",
|
||||
ClaimStatus::Superseded => " [SUPERSEDED]",
|
||||
};
|
||||
|
||||
out.push_str(&format!("### {}: {}{status_badge}\n", claim.id, claim.invariant));
|
||||
out.push_str(&format!("- **Concept:** `{}`\n", claim.concept_path));
|
||||
out.push_str(&format!("- **Predicate:** `{}` = `{}`\n", claim.predicate, claim.value));
|
||||
out.push_str(&format!("- **Invariant:** {}\n", claim.invariant));
|
||||
out.push_str(&format!("- **Consequence:** {}\n", claim.consequence));
|
||||
out.push_str(&format!("- **Provenance:** {}\n", claim.provenance));
|
||||
|
||||
let tier_display = parse_authority_tier(&claim.authority_tier)
|
||||
.map(format_authority_tier)
|
||||
.unwrap_or_else(|_| {
|
||||
tracing::warn!(
|
||||
claim_id = %claim.id,
|
||||
raw_tier = %claim.authority_tier,
|
||||
"Failed to parse authority tier, using raw value"
|
||||
);
|
||||
claim.authority_tier.clone()
|
||||
});
|
||||
out.push_str(&format!("- **Authority:** {tier_display}\n"));
|
||||
|
||||
if !claim.evidence.is_empty() {
|
||||
out.push_str(&format!("- **Evidence:** {}\n", claim.evidence.join(", ")));
|
||||
}
|
||||
|
||||
out.push_str(&format!("- **Status:** {}\n", claim.status));
|
||||
out.push_str(&format!("- **Author:** {} ({})\n", claim.created_by, claim.created_at));
|
||||
|
||||
if let Some(ref supersedes) = claim.supersedes {
|
||||
out.push_str(&format!("- **Supersedes:** {supersedes}\n"));
|
||||
}
|
||||
|
||||
if let Some(ref updated) = claim.updated_at {
|
||||
out.push_str(&format!("- **Updated:** {updated}\n"));
|
||||
}
|
||||
}
|
||||
|
||||
/// Render a single claim as JSON wrapped in a structured envelope.
|
||||
pub fn render_claim_json(claim: &AuthoredClaim, project_name: &str) -> Result<String, serde_json::Error> {
|
||||
let envelope = serde_json::json!({
|
||||
"type": "claim_detail",
|
||||
"project": project_name,
|
||||
"claim": claim
|
||||
});
|
||||
serde_json::to_string_pretty(&envelope)
|
||||
}
|
||||
|
||||
/// Render all claims as JSON wrapped in a structured envelope.
|
||||
pub fn render_claims_json(claims: &[AuthoredClaim], project_name: &str) -> Result<String, serde_json::Error> {
|
||||
let envelope = serde_json::json!({
|
||||
"type": "claims_explain",
|
||||
"project": project_name,
|
||||
"total": claims.len(),
|
||||
"claims": claims
|
||||
});
|
||||
serde_json::to_string_pretty(&envelope)
|
||||
}
|
||||
|
||||
fn capitalize(s: &str) -> String {
|
||||
let mut chars = s.chars();
|
||||
match chars.next() {
|
||||
None => String::new(),
|
||||
Some(c) => c.to_uppercase().collect::<String>() + chars.as_str(),
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use crate::types::authored_claim::AuthoredValue;
|
||||
|
||||
fn sample_claim(id: &str, category: &str) -> AuthoredClaim {
|
||||
AuthoredClaim {
|
||||
id: id.to_string(),
|
||||
concept_path: "test/concept".to_string(),
|
||||
predicate: "test_pred".to_string(),
|
||||
value: AuthoredValue::Text("test_value".to_string()),
|
||||
comparison: Default::default(),
|
||||
provenance: "Test provenance".to_string(),
|
||||
invariant: "Test invariant MUST hold".to_string(),
|
||||
consequence: "Bad things happen".to_string(),
|
||||
authority_tier: "expert".to_string(),
|
||||
evidence: vec!["ADR-001".to_string()],
|
||||
category: category.to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: "tester".to_string(),
|
||||
created_at: "2026-02-08T12:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_render_empty() {
|
||||
let md = render_claims_markdown(&[], "test-project");
|
||||
assert!(md.contains("No claims authored yet"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_render_groups_by_category() {
|
||||
let claims = vec![
|
||||
sample_claim("safety-001", "safety"),
|
||||
sample_claim("arch-001", "architecture"),
|
||||
sample_claim("safety-002", "safety"),
|
||||
];
|
||||
let md = render_claims_markdown(&claims, "test-project");
|
||||
assert!(md.contains("## Architecture Claims (1 active, 1 total)"));
|
||||
assert!(md.contains("## Safety Claims (2 active, 2 total)"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_render_single_claim_fields() {
|
||||
let claim = sample_claim("test-001", "safety");
|
||||
let mut out = String::new();
|
||||
render_single_claim(&mut out, &claim);
|
||||
|
||||
assert!(out.contains("### test-001:"));
|
||||
assert!(out.contains("**Concept:** `test/concept`"));
|
||||
assert!(out.contains("**Predicate:** `test_pred` = `test_value`"));
|
||||
assert!(out.contains("**Invariant:**"));
|
||||
assert!(out.contains("**Consequence:**"));
|
||||
assert!(out.contains("**Provenance:**"));
|
||||
assert!(out.contains("Expert (Tier 3)"));
|
||||
assert!(out.contains("**Evidence:** ADR-001"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_deprecated_badge() {
|
||||
let mut claim = sample_claim("dep-001", "safety");
|
||||
claim.status = ClaimStatus::Deprecated;
|
||||
let mut out = String::new();
|
||||
render_single_claim(&mut out, &claim);
|
||||
assert!(out.contains("[DEPRECATED]"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_render_json() {
|
||||
let claim = sample_claim("test-001", "safety");
|
||||
let json = render_claim_json(&claim, "test-project").expect("json");
|
||||
assert!(json.contains("\"type\": \"claim_detail\""));
|
||||
assert!(json.contains("\"project\": \"test-project\""));
|
||||
assert!(json.contains("\"id\": \"test-001\""));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_render_claims_json_envelope() {
|
||||
let claims = vec![sample_claim("c-001", "arch"), sample_claim("c-002", "safety")];
|
||||
let json = render_claims_json(&claims, "my-project").expect("json");
|
||||
assert!(json.contains("\"type\": \"claims_explain\""));
|
||||
assert!(json.contains("\"total\": 2"));
|
||||
assert!(json.contains("\"project\": \"my-project\""));
|
||||
}
|
||||
}
|
||||
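
`render_claims_markdown` groups claims in a `BTreeMap`, which is why `test_render_groups_by_category` can expect "Architecture" before "Safety" regardless of input order. The grouping and heading step can be sketched in isolation as follows. The function `category_headings` and its `(category, is_active)` tuple input are simplifications introduced here for illustration; `capitalize` is copied from the diff.

```rust
use std::collections::BTreeMap;

/// Uppercase the first character (same logic as the private helper in the diff).
fn capitalize(s: &str) -> String {
    let mut chars = s.chars();
    match chars.next() {
        None => String::new(),
        Some(c) => c.to_uppercase().collect::<String>() + chars.as_str(),
    }
}

/// Build one "## <Category> Claims (N active, M total)" heading per category.
/// Input is a simplified `(category, is_active)` pair per claim.
pub fn category_headings(claims: &[(&str, bool)]) -> Vec<String> {
    // (active, total) per category; a BTreeMap iterates in sorted key order.
    let mut counts: BTreeMap<&str, (usize, usize)> = BTreeMap::new();
    for (category, active) in claims {
        let entry = counts.entry(*category).or_default();
        if *active {
            entry.0 += 1;
        }
        entry.1 += 1;
    }

    counts
        .iter()
        .map(|(category, (active, total))| {
            format!(
                "## {} Claims ({active} active, {total} total)",
                capitalize(category)
            )
        })
        .collect()
}
```

A `HashMap` would also group correctly, but the section order would then vary between runs, which makes generated documentation noisy to diff.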
323
applications/aphoria/src/claims_file.rs
Normal file
@ -0,0 +1,323 @@
|
||||
//! Claims file persistence (TOML).
//!
//! Stores human-authored claims in `.aphoria/claims.toml`, following
//! the same pattern as `ack_file.rs` for acknowledgments.
//!
//! ## File Format
//!
//! ```toml
//! # Aphoria Claims - version controlled
//! #
//! # Human-authored claims with provenance, invariants, and consequences.
//! # Manage with: aphoria claims create|list|explain|update|supersede|deprecate
//!
//! [[claim]]
//! id = "wallet-seqcst-001"
//! concept_path = "maxwell/wallet/atomics/ordering"
//! predicate = "required_ordering"
//! value = "SeqCst"
//! provenance = "Safety analysis by lead developer"
//! invariant = "All wallet atomics MUST use SeqCst"
//! consequence = "Double-spend race condition"
//! authority_tier = "expert"
//! evidence = ["wallet ADR-003", "Intel SDM Vol 4"]
//! category = "safety"
//! status = "active"
//! created_by = "jml"
//! created_at = "2026-02-08T12:00:00Z"
//! ```
|
||||
|
||||
use std::path::{Path, PathBuf};
|
||||
|
||||
use serde::{Deserialize, Serialize};
|
||||
|
||||
use crate::types::authored_claim::{AuthoredClaim, ClaimStatus};
|
||||
use crate::AphoriaError;
|
||||
|
||||
/// Default path for the claims file relative to project root.
|
||||
pub const CLAIMS_FILE_PATH: &str = ".aphoria/claims.toml";
|
||||
|
||||
/// Container for all authored claims in the file.
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
|
||||
pub struct ClaimsFile {
|
||||
/// List of authored claims.
|
||||
#[serde(default, rename = "claim")]
|
||||
pub claims: Vec<AuthoredClaim>,
|
||||
}
|
||||
|
||||
impl ClaimsFile {
|
||||
/// Create an empty claims file.
|
||||
pub fn new() -> Self {
|
||||
Self { claims: Vec::new() }
|
||||
}
|
||||
|
||||
    /// Add a claim entry, deduplicating by ID.
    ///
    /// If a claim with the same ID is already present, the new claim is
    /// silently dropped and the existing entry is left unchanged.
    pub fn add(&mut self, claim: AuthoredClaim) {
        if !self.claims.iter().any(|c| c.id == claim.id) {
            self.claims.push(claim);
        }
    }
|
||||
|
||||
/// Load from a TOML file.
|
||||
pub fn load(path: &Path) -> Result<Self, AphoriaError> {
|
||||
if !path.exists() {
|
||||
return Ok(Self::new());
|
||||
}
|
||||
|
||||
let content = std::fs::read_to_string(path)
|
||||
.map_err(|e| AphoriaError::Io(std::io::Error::new(e.kind(), format!("{e}"))))?;
|
||||
|
||||
toml::from_str(&content)
|
||||
.map_err(|e| AphoriaError::Claims(format!("Failed to parse claims file: {e}")))
|
||||
}
|
||||
|
||||
/// Save to a TOML file.
|
||||
pub fn save(&self, path: &Path) -> Result<(), AphoriaError> {
|
||||
if let Some(parent) = path.parent() {
|
||||
if !parent.exists() {
|
||||
std::fs::create_dir_all(parent)?;
|
||||
}
|
||||
}
|
||||
|
||||
let header = r#"# Aphoria Claims - version controlled
|
||||
#
|
||||
# Human-authored claims with provenance, invariants, and consequences.
|
||||
# Each claim represents a deliberate architectural decision or safety invariant.
|
||||
#
|
||||
# Manage with: aphoria claims create|list|explain|update|supersede|deprecate
|
||||
|
||||
"#;
|
||||
|
||||
let content =
|
||||
toml::to_string_pretty(self).map_err(|e| AphoriaError::Claims(e.to_string()))?;
|
||||
|
||||
std::fs::write(path, format!("{header}{content}"))
|
||||
.map_err(|e| AphoriaError::Io(std::io::Error::new(e.kind(), format!("{e}"))))?;
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Get the default path for the claims file.
|
||||
pub fn default_path(project_root: &Path) -> PathBuf {
|
||||
project_root.join(CLAIMS_FILE_PATH)
|
||||
}
|
||||
|
||||
/// Check if a claims file exists at the default location.
|
||||
pub fn exists(project_root: &Path) -> bool {
|
||||
Self::default_path(project_root).exists()
|
||||
}
|
||||
|
||||
/// Get the number of claims.
|
||||
pub fn len(&self) -> usize {
|
||||
self.claims.len()
|
||||
}
|
||||
|
||||
/// Check if empty.
|
||||
pub fn is_empty(&self) -> bool {
|
||||
self.claims.is_empty()
|
||||
}
|
||||
|
||||
/// Find a claim by ID.
|
||||
pub fn find_by_id(&self, id: &str) -> Option<&AuthoredClaim> {
|
||||
self.claims.iter().find(|c| c.id == id)
|
||||
}
|
||||
|
||||
/// Find a claim by ID (mutable).
|
||||
pub fn find_by_id_mut(&mut self, id: &str) -> Option<&mut AuthoredClaim> {
|
||||
self.claims.iter_mut().find(|c| c.id == id)
|
||||
}
|
||||
|
||||
/// Find claims by category.
|
||||
pub fn find_by_category(&self, category: &str) -> Vec<&AuthoredClaim> {
|
||||
self.claims.iter().filter(|c| c.category == category).collect()
|
||||
}
|
||||
|
||||
/// Find claims by status.
|
||||
pub fn find_by_status(&self, status: &ClaimStatus) -> Vec<&AuthoredClaim> {
|
||||
self.claims.iter().filter(|c| &c.status == status).collect()
|
||||
}
|
||||
|
||||
/// Update a claim's fields. Returns error if claim not found.
|
||||
pub fn update<F>(&mut self, id: &str, updater: F) -> Result<(), AphoriaError>
|
||||
where
|
||||
F: FnOnce(&mut AuthoredClaim),
|
||||
{
|
||||
let claim = self
|
||||
.find_by_id_mut(id)
|
||||
.ok_or_else(|| AphoriaError::Claims(format!("Claim not found: {id}")))?;
|
||||
updater(claim);
|
||||
Ok(())
|
||||
}
|
||||
|
||||
    /// Mark a claim as superseded and add the superseding claim.
    ///
    /// The old claim's `updated_at` is set to the new claim's `created_at`.
    /// Returns an error if `old_id` is not found. This method does not set
    /// `new_claim.supersedes`; the caller is expected to populate it.
    pub fn supersede(
        &mut self,
        old_id: &str,
        new_claim: AuthoredClaim,
    ) -> Result<(), AphoriaError> {
        // Mark old claim as superseded
        let now = new_claim.created_at.clone();
        self.update(old_id, |c| {
            c.status = ClaimStatus::Superseded;
            c.updated_at = Some(now);
        })?;

        // Add new claim (a no-op if its ID already exists; see `add`)
        self.add(new_claim);
        Ok(())
    }
|
||||
|
||||
/// Mark a claim as deprecated.
|
||||
pub fn deprecate(&mut self, id: &str, timestamp: &str) -> Result<(), AphoriaError> {
|
||||
self.update(id, |c| {
|
||||
c.status = ClaimStatus::Deprecated;
|
||||
c.updated_at = Some(timestamp.to_string());
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use crate::types::authored_claim::AuthoredValue;
|
||||
use tempfile::TempDir;
|
||||
|
||||
fn sample_claim(id: &str) -> AuthoredClaim {
|
||||
AuthoredClaim {
|
||||
id: id.to_string(),
|
||||
concept_path: "test/concept".to_string(),
|
||||
predicate: "test_pred".to_string(),
|
||||
value: AuthoredValue::Text("test_value".to_string()),
|
||||
comparison: Default::default(),
|
||||
provenance: "Test provenance".to_string(),
|
||||
invariant: "Test invariant".to_string(),
|
||||
consequence: "Test consequence".to_string(),
|
||||
authority_tier: "expert".to_string(),
|
||||
evidence: vec!["evidence-1".to_string()],
|
||||
category: "safety".to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: "tester".to_string(),
|
||||
created_at: "2026-02-08T12:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_claims_file_roundtrip() {
|
||||
let temp_dir = TempDir::new().expect("create temp dir");
|
||||
let path = temp_dir.path().join(".aphoria/claims.toml");
|
||||
|
||||
let mut file = ClaimsFile::new();
|
||||
file.add(sample_claim("claim-001"));
|
||||
file.add(sample_claim("claim-002"));
|
||||
|
||||
file.save(&path).expect("save claims file");
|
||||
|
||||
let loaded = ClaimsFile::load(&path).expect("load claims file");
|
||||
assert_eq!(loaded.len(), 2);
|
||||
assert_eq!(loaded.claims[0].id, "claim-001");
|
||||
assert_eq!(loaded.claims[1].id, "claim-002");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_no_duplicates() {
|
||||
let mut file = ClaimsFile::new();
|
||||
file.add(sample_claim("claim-001"));
|
||||
file.add(sample_claim("claim-001"));
|
||||
assert_eq!(file.len(), 1);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_find_by_id() {
|
||||
let mut file = ClaimsFile::new();
|
||||
file.add(sample_claim("claim-001"));
|
||||
file.add(sample_claim("claim-002"));
|
||||
|
||||
assert!(file.find_by_id("claim-001").is_some());
|
||||
assert!(file.find_by_id("nonexistent").is_none());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_find_by_category() {
|
||||
let mut file = ClaimsFile::new();
|
||||
let mut arch_claim = sample_claim("arch-001");
|
||||
arch_claim.category = "architecture".to_string();
|
||||
|
||||
file.add(sample_claim("safety-001"));
|
||||
file.add(arch_claim);
|
||||
|
||||
assert_eq!(file.find_by_category("safety").len(), 1);
|
||||
assert_eq!(file.find_by_category("architecture").len(), 1);
|
||||
assert_eq!(file.find_by_category("imports").len(), 0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_find_by_status() {
|
||||
let mut file = ClaimsFile::new();
|
||||
let mut dep_claim = sample_claim("dep-001");
|
||||
dep_claim.status = ClaimStatus::Deprecated;
|
||||
|
||||
file.add(sample_claim("active-001"));
|
||||
file.add(dep_claim);
|
||||
|
||||
assert_eq!(file.find_by_status(&ClaimStatus::Active).len(), 1);
|
||||
assert_eq!(file.find_by_status(&ClaimStatus::Deprecated).len(), 1);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_update() {
|
||||
let mut file = ClaimsFile::new();
|
||||
file.add(sample_claim("claim-001"));
|
||||
|
||||
file.update("claim-001", |c| {
|
||||
c.provenance = "Updated provenance".to_string();
|
||||
c.updated_at = Some("2026-02-08T13:00:00Z".to_string());
|
||||
})
|
||||
.expect("update claim");
|
||||
|
||||
assert_eq!(file.find_by_id("claim-001").map(|c| c.provenance.as_str()), Some("Updated provenance"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_update_not_found() {
|
||||
let mut file = ClaimsFile::new();
|
||||
assert!(file.update("nonexistent", |_| {}).is_err());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_supersede() {
|
||||
let mut file = ClaimsFile::new();
|
||||
file.add(sample_claim("claim-001"));
|
||||
|
||||
let mut new_claim = sample_claim("claim-002");
|
||||
new_claim.supersedes = Some("claim-001".to_string());
|
||||
new_claim.provenance = "Updated analysis".to_string();
|
||||
|
||||
file.supersede("claim-001", new_claim).expect("supersede");
|
||||
|
||||
assert_eq!(file.find_by_id("claim-001").map(|c| &c.status), Some(&ClaimStatus::Superseded));
|
||||
assert_eq!(file.find_by_id("claim-002").map(|c| c.supersedes.as_deref()), Some(Some("claim-001")));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_deprecate() {
|
||||
let mut file = ClaimsFile::new();
|
||||
file.add(sample_claim("claim-001"));
|
||||
|
||||
file.deprecate("claim-001", "2026-02-08T14:00:00Z").expect("deprecate");
|
||||
|
||||
let claim = file.find_by_id("claim-001").expect("find");
|
||||
assert_eq!(claim.status, ClaimStatus::Deprecated);
|
||||
assert_eq!(claim.updated_at.as_deref(), Some("2026-02-08T14:00:00Z"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_load_nonexistent() {
|
||||
let temp_dir = TempDir::new().expect("create temp dir");
|
||||
let path = temp_dir.path().join("nonexistent.toml");
|
||||
|
||||
let file = ClaimsFile::load(&path).expect("load should succeed");
|
||||
assert!(file.is_empty());
|
||||
}
|
||||
}
|
||||
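
`ClaimsFile::supersede` marks the old claim first and then calls `add`, which ignores a claim whose ID already exists. If the new ID collides, the old claim ends up superseded with no successor stored. One possible alternative, shown here as a sketch and not as the author's implementation, is to validate both IDs before mutating anything. `Entry` and `supersede_checked` are hypothetical names, and `Entry` keeps only the fields needed to show the flow.

```rust
/// Hypothetical, trimmed-down claim record (the real `AuthoredClaim` has many more fields).
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub id: String,
    pub superseded: bool,
    pub supersedes: Option<String>,
}

/// Supersede `old_id` with `new`, checking both IDs before changing any state.
pub fn supersede_checked(
    entries: &mut Vec<Entry>,
    old_id: &str,
    mut new: Entry,
) -> Result<(), String> {
    // Reject a colliding new ID up front so the old claim is not orphaned.
    if entries.iter().any(|e| e.id == new.id) {
        return Err(format!("Claim ID already exists: {}", new.id));
    }

    let old = entries
        .iter_mut()
        .find(|e| e.id == old_id)
        .ok_or_else(|| format!("Claim not found: {old_id}"))?;
    old.superseded = true;

    // Link the successor back to the claim it replaces.
    new.supersedes = Some(old_id.to_string());
    entries.push(new);
    Ok(())
}
```

Doing all validation before the first mutation keeps the operation all-or-nothing, which matters when the result is written straight back to a version-controlled file.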
168
applications/aphoria/src/cli/claims.rs
Normal file
@ -0,0 +1,168 @@
|
||||
//! CLI subcommands for authored claims management.
|
||||
|
||||
use std::path::PathBuf;
|
||||
|
||||
use clap::Subcommand;
|
||||
|
||||
/// Subcommands for managing authored claims.
|
||||
#[derive(Subcommand)]
|
||||
pub enum ClaimsCommands {
|
||||
/// Create a new authored claim
|
||||
Create {
|
||||
/// Human-readable claim ID (e.g., "wallet-seqcst-001")
|
||||
#[arg(long)]
|
||||
id: String,
|
||||
|
||||
/// Concept path (e.g., "maxwell/wallet/atomics/ordering")
|
||||
#[arg(long)]
|
||||
concept_path: String,
|
||||
|
||||
/// Predicate (e.g., "required_ordering")
|
||||
#[arg(long)]
|
||||
predicate: String,
|
||||
|
||||
/// Value (parsed as bool, number, or text)
|
||||
#[arg(long)]
|
||||
value: String,
|
||||
|
||||
/// Provenance (e.g., "Safety analysis by lead developer")
|
||||
#[arg(long)]
|
||||
provenance: String,
|
||||
|
||||
/// Invariant (e.g., "All wallet atomics MUST use SeqCst")
|
||||
#[arg(long)]
|
||||
invariant: String,
|
||||
|
||||
/// Consequence of violation (e.g., "Double-spend race condition")
|
||||
#[arg(long)]
|
||||
consequence: String,
|
||||
|
||||
/// Authority tier: regulatory, clinical, observational, expert, community, anecdotal
|
||||
#[arg(long)]
|
||||
tier: String,
|
||||
|
||||
/// Supporting evidence (can be specified multiple times)
|
||||
#[arg(long)]
|
||||
evidence: Vec<String>,
|
||||
|
||||
/// Category: safety, architecture, imports, constants, derives, etc.
|
||||
#[arg(long)]
|
||||
category: String,
|
||||
|
||||
/// Author name
|
||||
#[arg(long)]
|
||||
by: String,
|
||||
},
|
||||
|
||||
/// List authored claims
|
||||
List {
|
||||
/// Filter by category
|
||||
#[arg(long)]
|
||||
category: Option<String>,
|
||||
|
||||
/// Filter by status (active, deprecated, superseded)
|
||||
#[arg(long)]
|
||||
status: Option<String>,
|
||||
|
||||
/// Output format: table or json
|
||||
#[arg(long, default_value = "table")]
|
||||
format: String,
|
||||
},
|
||||
|
||||
/// Generate claims-explained markdown
|
||||
Explain {
|
||||
/// Specific claim ID to explain (omit for all claims)
|
||||
#[arg(long)]
|
||||
claim: Option<String>,
|
||||
|
||||
/// Output file path (default: stdout)
|
||||
#[arg(short, long)]
|
||||
output: Option<PathBuf>,
|
||||
|
||||
/// Output format: markdown or json
|
||||
#[arg(long, default_value = "markdown")]
|
||||
format: String,
|
||||
},
|
||||
|
||||
/// Update fields on an existing claim
|
||||
Update {
|
||||
/// Claim ID to update
|
||||
id: String,
|
||||
|
||||
/// New provenance
|
||||
#[arg(long)]
|
||||
provenance: Option<String>,
|
||||
|
||||
/// New invariant
|
||||
#[arg(long)]
|
||||
invariant: Option<String>,
|
||||
|
||||
/// New consequence
|
||||
#[arg(long)]
|
||||
consequence: Option<String>,
|
||||
|
||||
/// New authority tier
|
||||
#[arg(long)]
|
||||
tier: Option<String>,
|
||||
|
||||
/// Additional evidence (appended)
|
||||
#[arg(long)]
|
||||
evidence: Vec<String>,
|
||||
|
||||
/// New category
|
||||
#[arg(long)]
|
||||
category: Option<String>,
|
||||
|
||||
/// New value
|
||||
#[arg(long)]
|
||||
value: Option<String>,
|
||||
},
|
||||
|
||||
/// Create a new claim that supersedes an existing one
|
||||
Supersede {
|
||||
/// ID of the claim to supersede
|
||||
id: String,
|
||||
|
||||
/// New claim ID
|
||||
#[arg(long)]
|
||||
new_id: Option<String>,
|
||||
|
||||
/// New value
|
||||
#[arg(long)]
|
||||
value: Option<String>,
|
||||
|
||||
/// New provenance
|
||||
#[arg(long)]
|
||||
provenance: Option<String>,
|
||||
|
||||
/// New invariant
|
||||
#[arg(long)]
|
||||
invariant: Option<String>,
|
||||
|
||||
/// New consequence
|
||||
#[arg(long)]
|
||||
consequence: Option<String>,
|
||||
|
||||
/// New authority tier
|
||||
#[arg(long)]
|
||||
tier: Option<String>,
|
||||
|
||||
/// New evidence
|
||||
#[arg(long)]
|
||||
evidence: Vec<String>,
|
||||
|
||||
/// Author name
|
||||
#[arg(long)]
|
||||
by: Option<String>,
|
||||
},
|
||||
|
||||
/// Mark a claim as deprecated
|
||||
Deprecate {
|
||||
/// Claim ID to deprecate
|
||||
id: String,
|
||||
|
||||
/// Reason for deprecation
|
||||
#[arg(long)]
|
||||
reason: String,
|
||||
},
|
||||
}
|
||||
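
The `--value` flag is documented as "parsed as bool, number, or text", but the parsing code is not part of this file. A minimal sketch of that precedence follows. `ParsedValue` and `parse_value` are hypothetical; the diff only shows that `AuthoredValue` has `Text` and `Bool` variants, so the `Number(f64)` variant and the choice to treat non-finite numbers as text are assumptions.

```rust
/// Hypothetical mirror of `AuthoredValue`; the `Number` variant and its `f64` payload are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum ParsedValue {
    Bool(bool),
    Number(f64),
    Text(String),
}

/// Try bool first, then a finite number, and fall back to text.
pub fn parse_value(raw: &str) -> ParsedValue {
    if let Ok(b) = raw.parse::<bool>() {
        return ParsedValue::Bool(b);
    }
    if let Ok(n) = raw.parse::<f64>() {
        // `f64::from_str` accepts "NaN" and "inf"; keep those as text.
        if n.is_finite() {
            return ParsedValue::Number(n);
        }
    }
    ParsedValue::Text(raw.to_string())
}
```

With this ordering a value such as `SeqCst` stays text, while `true` and `42` become typed values. A caller who needs the literal string `"true"` would need an explicit override that this sketch does not provide.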
@ -7,17 +7,21 @@
|
||||
//! - `patterns`: Pattern and Eval commands
|
||||
//! - `scope`: Scope commands
|
||||
|
||||
mod claims;
|
||||
mod extractors;
|
||||
mod governance;
|
||||
mod lifecycle;
|
||||
mod patterns;
|
||||
mod scope;
|
||||
mod verify;
|
||||
|
||||
pub use claims::ClaimsCommands;
|
||||
pub use extractors::ExtractorCommands;
|
||||
pub use governance::{AuditCommands, GovernanceCommands};
|
||||
pub use lifecycle::{LifecycleCommands, MigrationCommands};
|
||||
pub use patterns::{EvalCommands, PatternCommands};
|
||||
pub use scope::ScopeCommands;
|
||||
pub use verify::VerifyCommands;
|
||||
|
||||
use std::path::PathBuf;
|
||||
|
||||
@ -31,6 +35,7 @@ use clap::{Parser, Subcommand};
|
||||
#[derive(Parser)]
|
||||
#[command(name = "aphoria")]
|
||||
#[command(version, about, long_about = None)]
|
||||
#[command(after_help = "Examples:\n aphoria scan Scan current directory\n aphoria scan --format sarif Output for IDE integration\n aphoria scan --strict Stricter conflict thresholds\n aphoria verify run Check code against claims\n aphoria coverage Show claim density per module\n aphoria explain Onboarding summary")]
|
||||
pub struct Cli {
|
||||
/// Path to aphoria.toml configuration file
|
||||
#[arg(short, long, global = true)]
|
||||
@ -208,6 +213,94 @@ pub enum Commands {
|
||||
#[command(subcommand)]
|
||||
command: AuditCommands,
|
||||
},
|
||||
|
||||
/// Manage human-authored claims (create, list, explain, update, supersede, deprecate)
|
||||
Claims {
|
||||
#[command(subcommand)]
|
||||
command: ClaimsCommands,
|
||||
},
|
||||
|
||||
/// Verify code against authored claims
|
||||
Verify {
|
||||
#[command(subcommand)]
|
||||
command: VerifyCommands,
|
||||
},
|
||||
|
||||
/// Show claim coverage metrics per module
|
||||
Coverage {
|
||||
/// Path to the project root
|
||||
#[arg(default_value = ".")]
|
||||
path: PathBuf,
|
||||
|
||||
/// Output format: table, json, markdown
|
||||
#[arg(short, long, default_value = "table")]
|
||||
format: String,
|
||||
|
||||
/// Sort modules by: name, density, unclaimed, observations
|
||||
#[arg(long, default_value = "name")]
|
||||
sort_by: String,
|
||||
},
|
||||
|
||||
/// Generate a narrative explanation of this project's claims (onboarding)
|
||||
Explain {
|
||||
/// Path to the project root
|
||||
#[arg(default_value = ".")]
|
||||
path: PathBuf,
|
||||
|
||||
/// Write output to a file instead of stdout
|
||||
#[arg(short, long)]
|
||||
output: Option<PathBuf>,
|
||||
|
||||
/// Output format: markdown or json
|
||||
#[arg(long, default_value = "markdown")]
|
||||
format: String,
|
||||
},
|
||||
|
||||
/// Generate enhanced documentation from claims + verification
|
||||
Docs {
|
||||
#[command(subcommand)]
|
||||
command: DocsCommands,
|
||||
},
|
||||
|
||||
/// Manage curated Trust Packs (install, list)
|
||||
TrustPack {
|
||||
#[command(subcommand)]
|
||||
command: TrustPackCommands,
|
||||
},
|
||||
}
|
||||
|
||||
#[derive(Subcommand)]
|
||||
pub enum TrustPackCommands {
|
||||
/// Install a curated Trust Pack by name
|
||||
Install {
|
||||
/// Pack name (e.g., "security-hardening", "rfc-compliance", "owasp-top10")
|
||||
name: String,
|
||||
|
||||
/// Custom registry URL (overrides built-in registry)
|
||||
#[arg(long)]
|
||||
registry: Option<String>,
|
||||
},
|
||||
|
||||
/// List available curated Trust Packs
|
||||
List,
|
||||
}
|
||||
|
||||
#[derive(Subcommand)]
|
||||
pub enum DocsCommands {
|
||||
/// Generate a claims overview document
|
||||
Generate {
|
||||
/// Path to the project root
|
||||
#[arg(default_value = ".")]
|
||||
path: PathBuf,
|
||||
|
||||
/// Output path (default: stdout)
|
||||
#[arg(short, long)]
|
||||
output: Option<PathBuf>,
|
||||
|
||||
/// Output format: markdown or json
|
||||
#[arg(long, default_value = "markdown")]
|
||||
format: String,
|
||||
},
|
||||
}
|
||||
|
||||
#[derive(Subcommand)]
|
||||
@ -260,6 +353,25 @@ pub enum CorpusCommands {
|
||||
|
||||
/// List available corpus sources
|
||||
List,
|
||||
|
||||
/// Export the corpus as a signed Trust Pack
|
||||
ExportPack {
|
||||
/// Name for the exported pack
|
||||
#[arg(long)]
|
||||
name: String,
|
||||
|
||||
/// Output path for the .pack file
|
||||
#[arg(short, long)]
|
||||
output: PathBuf,
|
||||
|
||||
/// Only include specific corpus sources (comma-separated)
|
||||
#[arg(long)]
|
||||
only: Option<String>,
|
||||
|
||||
/// Run in offline mode (skip sources requiring network)
|
||||
#[arg(long)]
|
||||
offline: bool,
|
||||
},
|
||||
}
|
||||
|
||||
#[derive(Subcommand)]
|
||||
|
||||
47
applications/aphoria/src/cli/verify.rs
Normal file
@ -0,0 +1,47 @@
|
||||
//! CLI definitions for the `aphoria verify` command.
|
||||
|
||||
use std::path::PathBuf;
|
||||
|
||||
use clap::Subcommand;
|
||||
|
||||
/// Verify commands for checking code against authored claims.
|
||||
#[derive(Subcommand)]
|
||||
pub enum VerifyCommands {
|
||||
/// Run verification: check observations against authored claims
|
||||
Run {
|
||||
/// Path to the project root
|
||||
#[arg(default_value = ".")]
|
||||
path: PathBuf,
|
||||
|
||||
/// Output format: table or json
|
||||
#[arg(short, long, default_value = "table")]
|
||||
format: String,
|
||||
|
||||
/// Exit with non-zero code on conflicts
|
||||
#[arg(long)]
|
||||
exit_code: bool,
|
||||
|
||||
/// Only scan staged/changed files (for pre-commit hooks)
|
||||
#[arg(long)]
|
||||
changed_only: bool,
|
||||
|
||||
/// Include UNCLAIMED observations in output
|
||||
#[arg(long)]
|
||||
show_unclaimed: bool,
|
||||
|
||||
/// Filter to specific claim IDs (comma-separated)
|
||||
#[arg(long, value_delimiter = ',')]
|
||||
claim: Vec<String>,
|
||||
|
||||
/// Filter by category
|
||||
#[arg(long)]
|
||||
category: Option<String>,
|
||||
},
|
||||
|
||||
/// Show claim-to-extractor mapping
|
||||
Map {
|
||||
/// Path to the project root
|
||||
#[arg(default_value = ".")]
|
||||
path: PathBuf,
|
||||
},
|
||||
}
|
||||
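The `--claim` and `--category` options on `verify run` describe filters but not how they combine. The sketch below is one plausible reading, not the engine's actual code: it assumes both filters must pass (logical AND), that an empty ID list or a missing category means "no filter", and it uses a hypothetical `ClaimRef` type with only the two fields needed. It also trims whitespace and drops empty entries, which clap's `value_delimiter` may not do.

```rust
/// Hypothetical stand-in for an authored claim; only the fields the
/// filters need are modelled here.
#[derive(Debug, Clone, PartialEq)]
pub struct ClaimRef {
    pub id: String,
    pub category: String,
}

/// Split a raw `--claim` value on commas. This sketch also trims
/// whitespace and skips empty entries (an assumption, not clap behaviour).
pub fn split_claim_ids(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_string)
        .collect()
}

/// Keep a claim when it passes both filters. An empty `ids` list and a
/// `None` category each mean "no filter" (assumed semantics).
pub fn select_claims(
    claims: &[ClaimRef],
    ids: &[String],
    category: Option<&str>,
) -> Vec<ClaimRef> {
    claims
        .iter()
        .filter(|c| ids.is_empty() || ids.iter().any(|id| id == &c.id))
        .filter(|c| category.map_or(true, |cat| c.category == cat))
        .cloned()
        .collect()
}
```

Treating the empty list as "match everything" keeps the default invocation (`aphoria verify run` with no filters) equivalent to verifying every claim.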
@@ -14,7 +14,7 @@
 use blake3::Hasher;

 use crate::config::CommunityConfig;
-use crate::types::ExtractedClaim;
+use crate::types::Observation;

 use super::types::{AnonymizedObservation, CommunityObjectValue};

@@ -34,7 +34,7 @@ use super::types::{AnonymizedObservation, CommunityObjectValue};
 /// The anon_hash specifically excludes file, line, and matched_text
 /// to prevent re-identification of the source location.
 pub fn anonymize_claim(
-    claim: &ExtractedClaim,
+    claim: &Observation,
     config: &CommunityConfig,
     timestamp: u64,
 ) -> Option<AnonymizedObservation> {
@@ -213,8 +213,8 @@ mod tests {
         predicate: &str,
         value: ObjectValue,
         confidence: f32,
-    ) -> ExtractedClaim {
-        ExtractedClaim {
+    ) -> Observation {
+        Observation {
             concept_path: concept_path.to_string(),
             predicate: predicate.to_string(),
             value,
|
||||
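The doc comment on `anonymize_claim` says the anon_hash leaves out file, line, and matched_text. A minimal sketch of that idea follows, under loud assumptions: the `Observation` here is a trimmed-down hypothetical with a plain `String` value, `anon_hash` is an illustrative free function, and std's `DefaultHasher` stands in for BLAKE3 so the example needs no external crate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, trimmed-down observation. The real `Observation` type
/// has more fields and a richer value type.
pub struct Observation {
    pub concept_path: String,
    pub predicate: String,
    pub value: String,
    pub file: String,
    pub line: u32,
    pub matched_text: String,
}

/// Hash only the semantic triple. `file`, `line`, and `matched_text` are
/// deliberately left out so the hash cannot point back at a source location.
/// `DefaultHasher` is a std-only stand-in for BLAKE3 in this sketch.
pub fn anon_hash(obs: &Observation) -> u64 {
    let mut hasher = DefaultHasher::new();
    obs.concept_path.hash(&mut hasher);
    obs.predicate.hash(&mut hasher);
    obs.value.hash(&mut hasher);
    hasher.finish()
}
```

Two observations that agree on concept path, predicate, and value hash identically regardless of where they were found, which is the property the comment describes.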
@@ -5,7 +5,7 @@ use std::path::PathBuf;
 use super::types::{
     AliasConfig, AutonomousConfig, CommunityConfig, CorpusConfig, DepVersionConfig, EntropyConfig,
     EpistemeConfig, ExtractorConfig, HostedConfig, LearningConfig, LlmConfig, OfflineFallback,
-    PromotionConfig, ScanConfig, SyncMode, ThresholdConfig, TimeoutExtractorConfig,
+    PromotionConfig, ScanConfig, SelfAuditConfig, SyncMode, ThresholdConfig, TimeoutExtractorConfig,
     DEFAULT_LLM_MODEL,
 };

@ -78,6 +78,7 @@ impl Default for ExtractorConfig {
|
||||
disabled: vec![],
|
||||
timeout_config: TimeoutExtractorConfig::default(),
|
||||
dep_versions: DepVersionConfig::default(),
|
||||
self_audit: SelfAuditConfig::default(),
|
||||
entropy: EntropyConfig::default(),
|
||||
declarative: vec![],
|
||||
}
|
||||
|
||||
@ -22,6 +22,9 @@ pub struct ExtractorConfig {
|
||||
/// Dependency version extractor settings.
|
||||
pub dep_versions: DepVersionConfig,
|
||||
|
||||
/// Self-audit extractor settings (opt-in, for dogfooding).
|
||||
pub self_audit: SelfAuditConfig,
|
||||
|
||||
/// High-entropy secrets extractor settings.
|
||||
pub entropy: EntropyConfig,
|
||||
|
||||
@ -73,6 +76,16 @@ pub struct DepVersionConfig {
|
||||
pub advisory_db: PathBuf,
|
||||
}
|
||||
|
||||
/// Self-audit extractor configuration (opt-in).
|
||||
#[derive(Debug, Clone, Default, Deserialize)]
|
||||
#[serde(default)]
|
||||
pub struct SelfAuditConfig {
|
||||
/// Enable self-audit extraction (opt-in).
|
||||
///
|
||||
/// Default: false. Enable this to dogfood Aphoria on its own codebase.
|
||||
pub enabled: bool,
|
||||
}
|
||||
|
||||
/// High-entropy secrets extractor configuration.
|
||||
///
|
||||
/// Controls the entropy thresholds used to detect potential secrets.
|
||||
|
||||
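The opt-in behaviour of `SelfAuditConfig` in this hunk follows from its derives: a derived `Default` sets a `bool` field to `false`, and the container-level `#[serde(default)]` fills any missing field from that `Default`. The sketch below mirrors the struct with std only (no serde) to show the default; `should_run_self_audit` is a hypothetical gate added for illustration and does not appear in the diff.

```rust
/// Std-only mirror of `SelfAuditConfig`; the real struct also derives
/// `Deserialize` and carries `#[serde(default)]`.
#[derive(Debug, Clone, Default)]
pub struct SelfAuditConfig {
    /// Derived `Default` yields `false`, so the extractor stays off
    /// unless a config file turns it on.
    pub enabled: bool,
}

/// Hypothetical gate an extractor registry might apply before running
/// the self-audit extractor.
pub fn should_run_self_audit(config: &SelfAuditConfig) -> bool {
    config.enabled
}
```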
@@ -40,7 +40,9 @@ pub use cross_project::CrossProjectConfig;
 #[allow(unused_imports)]
 pub use eval::EvalConfig;
 #[allow(unused_imports)]
-pub use extractors::{DepVersionConfig, EntropyConfig, ExtractorConfig, TimeoutExtractorConfig};
+pub use extractors::{
+    DepVersionConfig, EntropyConfig, ExtractorConfig, SelfAuditConfig, TimeoutExtractorConfig,
+};
 #[allow(unused_imports)]
 pub use governance::GovernanceConfig;
 #[allow(unused_imports)]
|
||||
@@ -33,7 +33,7 @@ use tracing::{debug, info, instrument, warn};

 use super::CorpusBuilder;
 use crate::config::CorpusConfig;
-use crate::episteme::create_authoritative_assertion;
+use crate::episteme::{create_authoritative_assertion_with_metadata};
 use crate::AphoriaError;
 use parsers::parse_cheatsheet;

@@ -156,7 +156,15 @@ fn fetch_and_parse_cheatsheet(
     let assertions = recommendations
         .into_iter()
         .map(|rec| {
-            create_authoritative_assertion(
+            // Build extra metadata with OWASP cheatsheet name and CWE references
+            let mut extra = serde_json::json!({
+                "owasp_cheatsheet": filename,
+            });
+            if !rec.cwe_references.is_empty() {
+                extra["cwe_references"] = serde_json::json!(rec.cwe_references);
+            }
+
+            create_authoritative_assertion_with_metadata(
                 signing_key,
                 &rec.subject,
                 &rec.predicate,
@@ -164,6 +172,7 @@ fn fetch_and_parse_cheatsheet(
                 SourceClass::Clinical, // Tier 1
                 &rec.description,
                 timestamp,
+                extra,
             )
         })
         .collect();
|
||||
@ -16,6 +16,8 @@ pub(super) struct Recommendation {
|
||||
pub value: ObjectValue,
|
||||
/// Human-readable description.
|
||||
pub description: String,
|
||||
/// CWE references (e.g., ["CWE-89", "CWE-78"]).
|
||||
pub cwe_references: Vec<String>,
|
||||
}
|
||||
|
||||
/// Parse security recommendations from cheat sheet markdown.
|
||||
@ -50,6 +52,7 @@ fn parse_authentication_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Multi-factor authentication SHOULD be implemented".to_string(),
|
||||
cwe_references: vec!["CWE-308".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -60,6 +63,7 @@ fn parse_authentication_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Number(8.0),
|
||||
description: "OWASP: Minimum password length of 8 characters".to_string(),
|
||||
cwe_references: vec!["CWE-521".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -71,6 +75,7 @@ fn parse_authentication_sheet(content: &str) -> Vec<Recommendation> {
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Account lockout SHOULD be enabled for brute force protection"
|
||||
.to_string(),
|
||||
cwe_references: vec!["CWE-307".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -81,6 +86,7 @@ fn parse_authentication_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Text("bcrypt_or_argon2".to_string()),
|
||||
description: "OWASP: Use bcrypt or Argon2 for password hashing".to_string(),
|
||||
cwe_references: vec!["CWE-916".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -98,6 +104,7 @@ fn parse_jwt_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: JWT algorithm MUST be validated server-side".to_string(),
|
||||
cwe_references: vec!["CWE-347".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -108,6 +115,7 @@ fn parse_jwt_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(false),
|
||||
description: "OWASP: JWT 'none' algorithm MUST be rejected".to_string(),
|
||||
cwe_references: vec!["CWE-347".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -118,6 +126,7 @@ fn parse_jwt_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: JWT expiration MUST be validated".to_string(),
|
||||
cwe_references: vec!["CWE-347".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -128,6 +137,7 @@ fn parse_jwt_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: JWT signatures MUST be verified".to_string(),
|
||||
cwe_references: vec!["CWE-347".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -145,6 +155,7 @@ fn parse_tls_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Text("TLS1.2".to_string()),
|
||||
description: "OWASP: Minimum TLS version should be 1.2".to_string(),
|
||||
cwe_references: vec!["CWE-295".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -155,6 +166,7 @@ fn parse_tls_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: TLS certificates MUST be verified".to_string(),
|
||||
cwe_references: vec!["CWE-295".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -165,6 +177,7 @@ fn parse_tls_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Text("strong_ciphers_only".to_string()),
|
||||
description: "OWASP: Only strong cipher suites should be enabled".to_string(),
|
||||
cwe_references: vec!["CWE-295".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -175,6 +188,7 @@ fn parse_tls_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: HSTS header SHOULD be enabled".to_string(),
|
||||
cwe_references: vec!["CWE-295".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -192,6 +206,7 @@ fn parse_secrets_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(false),
|
||||
description: "OWASP: Secrets MUST NOT be hardcoded".to_string(),
|
||||
cwe_references: vec!["CWE-798".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -203,6 +218,7 @@ fn parse_secrets_sheet(content: &str) -> Vec<Recommendation> {
|
||||
value: ObjectValue::Text("environment_or_vault".to_string()),
|
||||
description: "OWASP: Secrets SHOULD be stored in environment variables or vault"
|
||||
.to_string(),
|
||||
cwe_references: vec!["CWE-798".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -213,6 +229,7 @@ fn parse_secrets_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Secrets SHOULD be rotated regularly".to_string(),
|
||||
cwe_references: vec!["CWE-798".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -223,6 +240,7 @@ fn parse_secrets_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Secrets SHOULD be encrypted at rest".to_string(),
|
||||
cwe_references: vec!["CWE-798".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -240,6 +258,7 @@ fn parse_input_validation_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Input validation MUST be performed server-side".to_string(),
|
||||
cwe_references: vec!["CWE-20".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -250,6 +269,7 @@ fn parse_input_validation_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Prefer allowlist over denylist for input validation".to_string(),
|
||||
cwe_references: vec!["CWE-20".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -260,6 +280,7 @@ fn parse_input_validation_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Use parameterized queries to prevent SQL injection".to_string(),
|
||||
cwe_references: vec!["CWE-89".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -270,6 +291,7 @@ fn parse_input_validation_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Output encoding MUST be used to prevent XSS".to_string(),
|
||||
cwe_references: vec!["CWE-79".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -287,6 +309,7 @@ fn parse_session_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Session cookies MUST have Secure flag".to_string(),
|
||||
cwe_references: vec!["CWE-614".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -297,6 +320,7 @@ fn parse_session_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Session cookies MUST have HttpOnly flag".to_string(),
|
||||
cwe_references: vec!["CWE-1004".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -307,6 +331,7 @@ fn parse_session_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Session timeout SHOULD be configured".to_string(),
|
||||
cwe_references: vec!["CWE-613".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -317,6 +342,7 @@ fn parse_session_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Session ID SHOULD be regenerated after authentication".to_string(),
|
||||
cwe_references: vec!["CWE-384".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -334,6 +360,7 @@ fn parse_csrf_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: CSRF tokens SHOULD be used".to_string(),
|
||||
cwe_references: vec!["CWE-352".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -344,6 +371,7 @@ fn parse_csrf_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Text("Strict".to_string()),
|
||||
description: "OWASP: SameSite cookie attribute SHOULD be Strict or Lax".to_string(),
|
||||
cwe_references: vec!["CWE-352".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -354,6 +382,7 @@ fn parse_csrf_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Origin header SHOULD be validated".to_string(),
|
||||
cwe_references: vec!["CWE-352".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -372,6 +401,7 @@ fn parse_password_storage_sheet(content: &str) -> Vec<Recommendation> {
|
||||
value: ObjectValue::Text("Argon2id".to_string()),
|
||||
description: "OWASP: Argon2id is the recommended password hashing algorithm"
|
||||
.to_string(),
|
||||
cwe_references: vec!["CWE-916".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -382,6 +412,7 @@ fn parse_password_storage_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Passwords MUST be salted before hashing".to_string(),
|
||||
cwe_references: vec!["CWE-916".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -392,6 +423,7 @@ fn parse_password_storage_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Password hashing work factor SHOULD be configured".to_string(),
|
||||
cwe_references: vec!["CWE-916".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -409,6 +441,7 @@ fn parse_http_headers_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Content-Security-Policy header SHOULD be set".to_string(),
|
||||
cwe_references: vec!["CWE-1021".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -419,6 +452,7 @@ fn parse_http_headers_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Text("nosniff".to_string()),
|
||||
description: "OWASP: X-Content-Type-Options SHOULD be 'nosniff'".to_string(),
|
||||
cwe_references: vec!["CWE-16".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -429,6 +463,7 @@ fn parse_http_headers_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Text("DENY".to_string()),
|
||||
description: "OWASP: X-Frame-Options SHOULD be 'DENY' or 'SAMEORIGIN'".to_string(),
|
||||
cwe_references: vec!["CWE-1021".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -439,6 +474,7 @@ fn parse_http_headers_sheet(content: &str) -> Vec<Recommendation> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OWASP: Referrer-Policy header SHOULD be set".to_string(),
|
||||
cwe_references: vec!["CWE-16".to_string()],
|
||||
});
|
||||
}
|
||||
|
||||
@ -458,6 +494,7 @@ fn parse_generic_sheet(content: &str, topic: &str) -> Vec<Recommendation> {
|
||||
predicate: "required".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: format!("OWASP {}: {}", topic, truncate_description(&slug, 100)),
|
||||
cwe_references: vec![],
|
||||
});
|
||||
}
|
||||
for (i, cap) in should_pattern.captures_iter(content).enumerate().take(5) {
|
||||
@ -467,6 +504,7 @@ fn parse_generic_sheet(content: &str, topic: &str) -> Vec<Recommendation> {
|
||||
predicate: "recommended".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: format!("OWASP {}: {}", topic, truncate_description(&slug, 100)),
|
||||
cwe_references: vec![],
|
||||
});
|
||||
}
|
||||
recs
|
||||
|
||||
@@ -35,7 +35,7 @@ use tracing::{debug, info, instrument, warn};

 use super::CorpusBuilder;
 use crate::config::CorpusConfig;
-use crate::episteme::create_authoritative_assertion;
+use crate::episteme::create_authoritative_assertion_with_metadata;
 use crate::AphoriaError;
 use parsers::parse_normative_statements;

@@ -142,7 +142,15 @@ fn fetch_and_parse_rfc(
     let assertions = statements
         .into_iter()
         .map(|stmt| {
-            create_authoritative_assertion(
+            // Build extra metadata with RFC number and optional section reference
+            let mut extra = serde_json::json!({
+                "rfc_number": rfc_num,
+            });
+            if let Some(section) = &stmt.section_reference {
+                extra["rfc_section"] = serde_json::Value::String(section.clone());
+            }
+
+            create_authoritative_assertion_with_metadata(
                 signing_key,
                 &stmt.subject,
                 &stmt.predicate,
@@ -150,6 +158,7 @@ fn fetch_and_parse_rfc(
                 SourceClass::Regulatory, // Tier 0
                 &stmt.description,
                 timestamp,
+                extra,
             )
         })
         .collect();
|
||||
@ -18,6 +18,8 @@ pub(super) struct NormativeStatement {
|
||||
pub value: ObjectValue,
|
||||
/// Human-readable description.
|
||||
pub description: String,
|
||||
/// RFC section reference (e.g., "Section 4.1.3").
|
||||
pub section_reference: Option<String>,
|
||||
}
|
||||
|
||||
/// Parse normative statements from RFC text.
|
||||
@ -55,6 +57,7 @@ fn parse_rfc7519_jwt(text: &str) -> Vec<NormativeStatement> {
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "JWT audience claim MUST be validated (RFC 7519 Section 4.1.3)"
|
||||
.to_string(),
|
||||
section_reference: Some("Section 4.1.3".to_string()),
|
||||
});
|
||||
}
|
||||
|
||||
@ -65,6 +68,7 @@ fn parse_rfc7519_jwt(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "JWT expiry claim MUST be validated (RFC 7519 Section 4.1.4)".to_string(),
|
||||
section_reference: Some("Section 4.1.4".to_string()),
|
||||
});
|
||||
}
|
||||
|
||||
@ -75,6 +79,7 @@ fn parse_rfc7519_jwt(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "JWT signatures MUST be verified (RFC 7519)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -86,6 +91,7 @@ fn parse_rfc7519_jwt(text: &str) -> Vec<NormativeStatement> {
|
||||
value: ObjectValue::Text("explicit_list".to_string()),
|
||||
description: "JWT algorithm MUST be explicitly specified, 'none' algorithm forbidden"
|
||||
.to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -97,6 +103,7 @@ fn parse_rfc7519_jwt(text: &str) -> Vec<NormativeStatement> {
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "JWT not-before claim MUST be validated (RFC 7519 Section 4.1.5)"
|
||||
.to_string(),
|
||||
section_reference: Some("Section 4.1.5".to_string()),
|
||||
});
|
||||
}
|
||||
|
||||
@ -108,6 +115,7 @@ fn parse_rfc7519_jwt(text: &str) -> Vec<NormativeStatement> {
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "JWT issuer claim SHOULD be validated for application-specific purposes"
|
||||
.to_string(),
|
||||
section_reference: Some("Section 4.1.1".to_string()),
|
||||
});
|
||||
}
|
||||
|
||||
@ -125,6 +133,7 @@ fn parse_rfc6749_oauth(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OAuth redirect_uri MUST be validated exactly (RFC 6749)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -136,6 +145,7 @@ fn parse_rfc6749_oauth(text: &str) -> Vec<NormativeStatement> {
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OAuth state parameter SHOULD be used for CSRF protection (RFC 6749)"
|
||||
.to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -146,6 +156,7 @@ fn parse_rfc6749_oauth(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OAuth scope MUST be validated (RFC 6749)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -156,6 +167,7 @@ fn parse_rfc6749_oauth(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "OAuth endpoints MUST use TLS (RFC 6749)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -173,6 +185,7 @@ fn parse_rfc6750_bearer(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "Bearer tokens MUST be transmitted over TLS (RFC 6750)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -183,6 +196,7 @@ fn parse_rfc6750_bearer(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "Bearer tokens MUST be stored securely (RFC 6750)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -200,6 +214,7 @@ fn parse_rfc8446_tls13(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "TLS certificate chains MUST be verified (RFC 8446)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -210,6 +225,7 @@ fn parse_rfc8446_tls13(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Text("TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384".to_string()),
|
||||
description: "TLS 1.3 cipher suites (RFC 8446)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -219,6 +235,7 @@ fn parse_rfc8446_tls13(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Text("TLS1.3".to_string()),
|
||||
description: "TLS 1.3 is the minimum recommended version (RFC 8446)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
|
||||
statements
|
||||
@ -235,6 +252,7 @@ fn parse_rfc7525_tls_practices(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "TLS hostname MUST be verified (RFC 7525)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -245,6 +263,7 @@ fn parse_rfc7525_tls_practices(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "TLS certificate revocation SHOULD be checked (RFC 7525)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -255,6 +274,7 @@ fn parse_rfc7525_tls_practices(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "disabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "SSLv2 and SSLv3 MUST NOT be used (RFC 7525)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -272,6 +292,7 @@ fn parse_rfc6238_totp(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Number(30.0),
|
||||
description: "TOTP time step SHOULD be 30 seconds (RFC 6238)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -283,6 +304,7 @@ fn parse_rfc6238_totp(text: &str) -> Vec<NormativeStatement> {
|
||||
value: ObjectValue::Number(1.0),
|
||||
description: "TOTP validation window SHOULD allow 1 step tolerance (RFC 6238)"
|
||||
.to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -293,6 +315,7 @@ fn parse_rfc6238_totp(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Number(160.0),
|
||||
description: "TOTP secret key SHOULD be at least 160 bits (RFC 6238)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -310,6 +333,7 @@ fn parse_rfc7617_basic_auth(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "HTTP Basic Auth MUST use TLS (RFC 7617)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -320,6 +344,7 @@ fn parse_rfc7617_basic_auth(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Text("UTF-8".to_string()),
|
||||
description: "HTTP Basic Auth credentials SHOULD use UTF-8 (RFC 7617)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -337,6 +362,7 @@ fn parse_rfc9110_http(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "HTTP timeouts SHOULD be configured (RFC 9110)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -347,6 +373,7 @@ fn parse_rfc9110_http(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "HTTP/1.1 Host header MUST be present (RFC 9110)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -357,6 +384,7 @@ fn parse_rfc9110_http(text: &str) -> Vec<NormativeStatement> {
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: "HTTP Content-Length SHOULD be validated (RFC 9110)".to_string(),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
|
||||
@ -386,6 +414,7 @@ fn parse_generic_rfc(text: &str, rfc_num: u32) -> Vec<NormativeStatement> {
|
||||
predicate: if is_mandatory { "required" } else { "recommended" }.to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
description: format!("RFC {} {} requirement: {}", rfc_num, keyword, topic),
|
||||
section_reference: None,
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
@ -1,11 +1,14 @@
|
||||
//! Corpus building operations - fetching and ingesting authoritative sources.
|
||||
|
||||
use std::path::PathBuf;
|
||||
|
||||
use crate::bridge;
|
||||
use crate::config::AphoriaConfig;
|
||||
use crate::corpus::{CorpusBuildResult, CorpusBuilderInfo, CorpusRegistry};
|
||||
use crate::current_timestamp;
|
||||
use crate::episteme;
|
||||
use crate::error::AphoriaError;
|
||||
use crate::policy::TrustPack;
|
||||
use tracing::{info, instrument};
|
||||
|
||||
/// Arguments for corpus build command.
|
||||
@ -82,3 +85,63 @@ pub fn list_corpus_sources(config: &AphoriaConfig) -> Vec<CorpusBuilderInfo> {
|
||||
let registry = CorpusRegistry::with_defaults(&config.corpus);
|
||||
registry.list_builders()
|
||||
}
|
||||
|
||||
/// Export the corpus as a signed Trust Pack.
|
||||
///
|
||||
/// Builds the corpus from configured sources and packages it as a
|
||||
/// distributable Trust Pack that can be imported into other projects.
|
||||
#[instrument(skip(config), fields(name = %name, output = %output.display()))]
|
||||
pub async fn export_corpus_as_pack(
|
||||
name: String,
|
||||
output: PathBuf,
|
||||
only: Option<Vec<String>>,
|
||||
offline: bool,
|
||||
config: &AphoriaConfig,
|
||||
) -> Result<usize, AphoriaError> {
|
||||
info!("Exporting corpus as Trust Pack");
|
||||
|
||||
let project_root = std::env::current_dir()?;
|
||||
|
||||
// Build corpus config based on --only flag
|
||||
let mut corpus_config = config.corpus.clone();
|
||||
if let Some(only) = &only {
|
||||
corpus_config.include_hardcoded = only.iter().any(|s| s == "hardcoded");
|
||||
corpus_config.include_rfc = only.iter().any(|s| s == "rfc");
|
||||
corpus_config.include_owasp = only.iter().any(|s| s == "owasp");
|
||||
corpus_config.include_vendor = only.iter().any(|s| s == "vendor");
|
||||
}
|
||||
|
||||
// Create registry and build
|
||||
let registry = CorpusRegistry::with_defaults(&corpus_config);
|
||||
let signing_key = bridge::load_or_generate_key(&project_root)?;
|
||||
let timestamp = current_timestamp();
|
||||
|
||||
let result = registry.build_all(&signing_key, timestamp, &corpus_config, offline)?;
|
||||
|
||||
if result.assertions.is_empty() {
|
||||
return Err(AphoriaError::Config("No assertions built — nothing to export".to_string()));
|
||||
}
|
||||
|
||||
let assertion_count = result.assertions.len();
|
||||
|
||||
// Include predicate aliases from config
|
||||
let predicate_aliases: Vec<crate::policy::PackPredicateAliasSet> =
|
||||
config.predicate_aliases.to_alias_sets().iter().map(crate::policy::PackPredicateAliasSet::from).collect();
|
||||
|
||||
// Package as Trust Pack
|
||||
let pack = TrustPack::new_with_predicate_aliases(
|
||||
name,
|
||||
"0.1.0".to_string(),
|
||||
result.assertions,
|
||||
vec![], // No aliases for corpus packs
|
||||
predicate_aliases,
|
||||
&signing_key,
|
||||
config.trust_pack.signer_name.clone(),
|
||||
config.trust_pack.contact.clone(),
|
||||
)?;
|
||||
|
||||
pack.save(&output)?;
|
||||
|
||||
info!(assertions = assertion_count, output = %output.display(), "Corpus exported as Trust Pack");
|
||||
Ok(assertion_count)
|
||||
}
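The `--only` handling in `export_corpus_as_pack` can be sketched in isolation. This is a minimal illustration, not the project's code: `SourceToggles` and `apply_only` are hypothetical names standing in for the four `include_*` fields on the corpus config. It shows that when an `--only` list is given, every toggle is recomputed from the list, so a name that matches none of the four known sources turns all of them off.

```rust
/// Hypothetical stand-in for the four source toggles on the corpus config.
#[derive(Debug, Clone, PartialEq)]
struct SourceToggles {
    include_hardcoded: bool,
    include_rfc: bool,
    include_owasp: bool,
    include_vendor: bool,
}

/// Apply an optional `--only` list. When the list is present, each toggle is
/// true only if its name appears in the list; when absent, nothing changes.
fn apply_only(mut toggles: SourceToggles, only: Option<&[String]>) -> SourceToggles {
    if let Some(only) = only {
        let has = |name: &str| only.iter().any(|s| s == name);
        toggles.include_hardcoded = has("hardcoded");
        toggles.include_rfc = has("rfc");
        toggles.include_owasp = has("owasp");
        toggles.include_vendor = has("vendor");
    }
    toggles
}
```

Under this reading, a misspelled source name would presumably leave every source disabled, at which point the export returns the "nothing to export" error rather than silently building the full corpus.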
|
||||
|
||||
575
applications/aphoria/src/coverage.rs
Normal file
@ -0,0 +1,575 @@
|
||||
//! Claim coverage metrics engine.
|
||||
//!
|
||||
//! Computes per-module coverage: how many observations are claimed,
|
||||
//! how many claims are verified, what's uncovered. Uses `verify_claims()`
|
||||
//! as the source of truth for claim-observation matching.
|
||||
|
||||
use std::collections::BTreeMap;
|
||||
|
||||
use serde::Serialize;
|
||||
|
||||
use crate::types::authored_claim::AuthoredClaim;
|
||||
use crate::types::Observation;
|
||||
use crate::verify::{tail_path, verify_claims, AuditVerdict, VerifyReport};
|
||||
|
||||
/// Per-module coverage metrics.
|
||||
#[derive(Debug, Clone, Serialize)]
|
||||
pub struct ModuleCoverage {
|
||||
/// Module path (e.g., "wallet/atomics", "tls").
|
||||
pub module_path: String,
|
||||
/// Files belonging to this module.
|
||||
pub files: Vec<String>,
|
||||
/// Total observations found by extractors in this module.
|
||||
pub observation_count: usize,
|
||||
/// Active authored claims covering this module.
|
||||
pub claim_count: usize,
|
||||
/// Observations matched by at least one claim.
|
||||
pub claimed_observations: usize,
|
||||
/// Observations with no covering claim.
|
||||
pub unclaimed_observations: usize,
|
||||
/// Claims with no matching observation (MISSING verdicts).
|
||||
pub missing_claims: usize,
|
||||
/// Claim density: claim_count / observation_count (0.0 if no observations).
|
||||
pub density: f32,
|
||||
}
|
||||
|
||||
/// Full coverage report for a project.
|
||||
#[derive(Debug, Clone, Serialize)]
|
||||
pub struct CoverageReport {
|
||||
/// Project name.
|
||||
pub project: String,
|
||||
/// Per-module metrics, sorted by module path.
|
||||
pub modules: Vec<ModuleCoverage>,
|
||||
/// Aggregate summary.
|
||||
pub summary: CoverageSummary,
|
||||
}
|
||||
|
||||
/// Aggregate coverage summary.
|
||||
#[derive(Debug, Clone, Serialize)]
|
||||
pub struct CoverageSummary {
|
||||
/// Total observations across all modules.
|
||||
pub total_observations: usize,
|
||||
/// Total active claims.
|
||||
pub total_claims: usize,
|
||||
/// Percentage of observations covered by claims.
|
||||
pub claimed_percentage: f32,
|
||||
/// Count of observations with no covering claim.
|
||||
pub unclaimed_count: usize,
|
||||
/// Number of modules that have at least one claim.
|
||||
pub modules_with_claims: usize,
|
||||
/// Number of modules with zero claims.
|
||||
pub modules_without_claims: usize,
|
||||
}
|
||||
|
||||
/// Derive a module path from a file path.
|
||||
///
|
||||
/// Takes up to the first two directory segments after stripping a leading `src/` or `lib/`.
|
||||
/// Examples:
|
||||
/// - `src/wallet/atomics/sync.rs` → `wallet/atomics`
|
||||
/// - `src/tls/config.rs` → `tls`
|
||||
/// - `config.toml` → `(root)`
|
||||
fn derive_module(file_path: &str) -> String {
|
||||
let path = file_path
|
||||
.strip_prefix("src/")
|
||||
.or_else(|| file_path.strip_prefix("lib/"))
|
||||
.unwrap_or(file_path);
|
||||
|
||||
let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
|
||||
|
||||
// Take directory segments only (skip the filename)
|
||||
let dir_segments: Vec<&str> = if segments.len() > 1 {
|
||||
segments[..segments.len() - 1].to_vec()
|
||||
} else {
|
||||
return "(root)".to_string();
|
||||
};
|
||||
|
||||
// Take up to 2 directory segments
|
||||
let module_depth = dir_segments.len().min(2);
|
||||
if module_depth == 0 {
|
||||
"(root)".to_string()
|
||||
} else {
|
||||
dir_segments[..module_depth].join("/")
|
||||
}
|
||||
}
|
||||
|
||||
/// Derive a module path from a claim's concept_path.
|
||||
///
|
||||
/// Uses `tail_path` to get the last two segments of the concept path, then takes
/// the first of them (the penultimate segment) as the module. If `tail_path`
/// returns `None`, falls back to the concept path with any `scheme://` prefix removed.
|
||||
fn derive_module_from_claim(concept_path: &str) -> String {
|
||||
if let Some(tp) = tail_path(concept_path) {
|
||||
// tail_path gives us "penultimate/last" — use the penultimate as module
|
||||
if let Some(slash) = tp.find('/') {
|
||||
tp[..slash].to_string()
|
||||
} else {
|
||||
tp
|
||||
}
|
||||
} else {
|
||||
// Fallback: strip scheme, use what we have
|
||||
let path = concept_path
|
||||
.find("://")
|
||||
.map(|i| &concept_path[i + 3..])
|
||||
.unwrap_or(concept_path);
|
||||
path.to_string()
|
||||
}
|
||||
}
|
||||
|
||||
/// Compute coverage metrics from claims and observations, running `verify_claims()` internally.
|
||||
pub fn compute_coverage(
|
||||
claims: &[AuthoredClaim],
|
||||
observations: &[Observation],
|
||||
project_name: &str,
|
||||
) -> CoverageReport {
|
||||
let report = verify_claims(claims, observations);
|
||||
compute_coverage_from_report(claims, observations, &report, project_name)
|
||||
}
|
||||
|
||||
/// Compute coverage from a pre-computed verification report.
|
||||
///
|
||||
/// Useful when the caller already has a `VerifyReport` and doesn't want
|
||||
/// to re-run verification.
|
||||
pub fn compute_coverage_from_report(
|
||||
claims: &[AuthoredClaim],
|
||||
observations: &[Observation],
|
||||
report: &VerifyReport,
|
||||
project_name: &str,
|
||||
) -> CoverageReport {
|
||||
// Group observations by module (from file path)
|
||||
let mut obs_by_module: BTreeMap<String, Vec<&Observation>> = BTreeMap::new();
|
||||
for obs in observations {
|
||||
let module = derive_module(&obs.file);
|
||||
obs_by_module.entry(module).or_default().push(obs);
|
||||
}
|
||||
|
||||
// Build claim-to-module mapping from verification results.
|
||||
// For claims with matching observations (Pass/Conflict), derive the module
|
||||
// from the observation's file path so claims land in the same bucket as
|
||||
// their observations. For Missing claims, fall back to concept_path.
|
||||
let mut claim_to_module: std::collections::HashMap<String, String> =
|
||||
std::collections::HashMap::new();
|
||||
let mut claimed_tails: std::collections::HashSet<String> = std::collections::HashSet::new();
|
||||
let mut missing_claim_ids: std::collections::HashSet<String> = std::collections::HashSet::new();
|
||||
|
||||
for result in &report.results {
|
||||
match result.verdict {
|
||||
AuditVerdict::Pass | AuditVerdict::Conflict => {
|
||||
if let Some(ref claim) = result.claim {
|
||||
if let Some(tp) = tail_path(&claim.concept_path) {
|
||||
claimed_tails.insert(tp);
|
||||
}
|
||||
// Derive module from the first matching observation's file path
|
||||
if let Some(obs) = result.matching_observations.first() {
|
||||
claim_to_module.insert(claim.id.clone(), derive_module(&obs.file));
|
||||
}
|
||||
}
|
||||
}
|
||||
AuditVerdict::Missing => {
|
||||
if let Some(ref claim) = result.claim {
|
||||
missing_claim_ids.insert(claim.id.clone());
|
||||
}
|
||||
}
|
||||
AuditVerdict::Unclaimed => {}
|
||||
}
|
||||
}
|
||||
|
||||
// Group claims by module, using observation-derived module when available
|
||||
let mut claims_by_module: BTreeMap<String, Vec<&AuthoredClaim>> = BTreeMap::new();
|
||||
for claim in claims {
|
||||
if claim.status == crate::types::ClaimStatus::Active {
|
||||
let module = claim_to_module
|
||||
.get(&claim.id)
|
||||
.cloned()
|
||||
.unwrap_or_else(|| derive_module_from_claim(&claim.concept_path));
|
||||
claims_by_module.entry(module).or_default().push(claim);
|
||||
}
|
||||
}
|
||||
|
||||
// Collect all module names from both observations and claims
|
||||
let mut all_modules: std::collections::BTreeSet<String> = std::collections::BTreeSet::new();
|
||||
for key in obs_by_module.keys() {
|
||||
all_modules.insert(key.clone());
|
||||
}
|
||||
for key in claims_by_module.keys() {
|
||||
all_modules.insert(key.clone());
|
||||
}
|
||||
|
||||
// Build per-module coverage
|
||||
let mut modules = Vec::new();
|
||||
let mut total_observations = 0usize;
|
||||
let mut total_claimed = 0usize;
|
||||
let mut total_unclaimed = 0usize;
|
||||
let mut modules_with_claims = 0usize;
|
||||
let mut modules_without_claims = 0usize;
|
||||
|
||||
for module in &all_modules {
|
||||
let obs_list = obs_by_module.get(module);
|
||||
let claim_list = claims_by_module.get(module);
|
||||
|
||||
let observation_count = obs_list.map(|v| v.len()).unwrap_or(0);
|
||||
let claim_count = claim_list.map(|v| v.len()).unwrap_or(0);
|
||||
|
||||
// Count how many observations in this module are claimed
|
||||
let claimed_obs = obs_list
|
||||
.map(|obs| {
|
||||
obs.iter()
|
||||
.filter(|o| {
|
||||
tail_path(&o.concept_path)
|
||||
.map(|tp| claimed_tails.contains(&tp))
|
||||
.unwrap_or(false)
|
||||
})
|
||||
.count()
|
||||
})
|
||||
.unwrap_or(0);
|
||||
|
||||
let unclaimed_obs = observation_count.saturating_sub(claimed_obs);
|
||||
|
||||
// Count missing claims in this module
|
||||
let missing_in_module = claim_list
|
||||
.map(|cls| {
|
||||
cls.iter()
|
||||
.filter(|c| missing_claim_ids.contains(&c.id))
|
||||
.count()
|
||||
})
|
||||
.unwrap_or(0);
|
||||
|
||||
let density = if observation_count > 0 {
|
||||
claim_count as f32 / observation_count as f32
|
||||
} else {
|
||||
0.0
|
||||
};
|
||||
|
||||
// Collect unique files in this module
|
||||
let files: Vec<String> = obs_list
|
||||
.map(|obs| {
|
||||
let mut file_set: std::collections::BTreeSet<String> = std::collections::BTreeSet::new();
|
||||
for o in obs {
|
||||
file_set.insert(o.file.clone());
|
||||
}
|
||||
file_set.into_iter().collect()
|
||||
})
|
||||
.unwrap_or_default();
|
||||
|
||||
if claim_count > 0 {
|
||||
modules_with_claims += 1;
|
||||
} else {
|
||||
modules_without_claims += 1;
|
||||
}
|
||||
|
||||
total_observations += observation_count;
|
||||
total_claimed += claimed_obs;
|
||||
total_unclaimed += unclaimed_obs;
|
||||
|
||||
modules.push(ModuleCoverage {
|
||||
module_path: module.clone(),
|
||||
files,
|
||||
observation_count,
|
||||
claim_count,
|
||||
claimed_observations: claimed_obs,
|
||||
unclaimed_observations: unclaimed_obs,
|
||||
missing_claims: missing_in_module,
|
||||
density,
|
||||
});
|
||||
}
|
||||
|
||||
let active_claims = claims
|
||||
.iter()
|
||||
.filter(|c| c.status == crate::types::ClaimStatus::Active)
|
||||
.count();
|
||||
|
||||
let claimed_percentage = if total_observations > 0 {
|
||||
(total_claimed as f32 / total_observations as f32) * 100.0
|
||||
} else {
|
||||
0.0
|
||||
};
|
||||
|
||||
CoverageReport {
|
||||
project: project_name.to_string(),
|
||||
modules,
|
||||
summary: CoverageSummary {
|
||||
total_observations,
|
||||
total_claims: active_claims,
|
||||
claimed_percentage,
|
||||
unclaimed_count: total_unclaimed,
|
||||
modules_with_claims,
|
||||
modules_without_claims,
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
/// Format coverage report as a terminal table.
|
||||
pub fn format_coverage_table(report: &CoverageReport, sort_by: &str) -> String {
|
||||
let mut out = String::new();
|
||||
|
||||
out.push_str(&format!("Aphoria Coverage: {}\n\n", report.project));
|
||||
|
||||
if report.modules.is_empty() {
|
||||
out.push_str("No observations or claims found.\n");
|
||||
return out;
|
||||
}
|
||||
|
||||
let mut modules = report.modules.clone();
|
||||
match sort_by {
|
||||
"unclaimed" => modules.sort_by(|a, b| b.unclaimed_observations.cmp(&a.unclaimed_observations)),
|
||||
"observations" => modules.sort_by(|a, b| b.observation_count.cmp(&a.observation_count)),
|
||||
"density" => modules.sort_by(|a, b| {
|
||||
b.density.partial_cmp(&a.density)
|
||||
.unwrap_or(std::cmp::Ordering::Equal)
|
||||
.then_with(|| b.observation_count.cmp(&a.observation_count))
|
||||
}),
|
||||
_ => {} // default: keep module-path order (report.modules is built from a sorted BTreeSet)
|
||||
}
|
||||
|
||||
let mut table = comfy_table::Table::new();
|
||||
table.set_header(vec![
|
||||
"Module",
|
||||
"Claims",
|
||||
"Observations",
|
||||
"Claimed",
|
||||
"Unclaimed",
|
||||
"Missing",
|
||||
"Density",
|
||||
]);
|
||||
|
||||
for m in &modules {
|
||||
table.add_row(vec![
|
||||
m.module_path.clone(),
|
||||
m.claim_count.to_string(),
|
||||
m.observation_count.to_string(),
|
||||
m.claimed_observations.to_string(),
|
||||
m.unclaimed_observations.to_string(),
|
||||
m.missing_claims.to_string(),
|
||||
format!("{:.1}%", m.density * 100.0),
|
||||
]);
|
||||
}
|
||||
|
||||
out.push_str(&table.to_string());
|
||||
out.push_str(&format!(
|
||||
"\n\nSummary: {} claims, {} observations, {:.1}% claimed, {} unclaimed",
|
||||
report.summary.total_claims,
|
||||
report.summary.total_observations,
|
||||
report.summary.claimed_percentage,
|
||||
report.summary.unclaimed_count,
|
||||
));
|
||||
out.push_str(&format!(
|
||||
"\nModules: {} with claims, {} without claims",
|
||||
report.summary.modules_with_claims, report.summary.modules_without_claims,
|
||||
));
|
||||
|
||||
out
|
||||
}
|
||||
|
||||
/// Format coverage report as JSON.
|
||||
pub fn format_coverage_json(report: &CoverageReport) -> String {
|
||||
serde_json::to_string_pretty(report).unwrap_or_else(|_| "{}".to_string())
|
||||
}
|
||||
|
||||
/// Format coverage report as markdown.
|
||||
pub fn format_coverage_markdown(report: &CoverageReport) -> String {
|
||||
let mut out = String::new();
|
||||
|
||||
out.push_str(&format!("# Aphoria Coverage: {}\n\n", report.project));
|
||||
|
||||
out.push_str("## Summary\n\n");
|
||||
out.push_str(&format!(
|
||||
"- **Claims:** {}\n- **Observations:** {}\n- **Claimed:** {:.1}%\n- **Unclaimed:** {}\n- **Modules with claims:** {}\n- **Modules without claims:** {}\n\n",
|
||||
report.summary.total_claims,
|
||||
report.summary.total_observations,
|
||||
report.summary.claimed_percentage,
|
||||
report.summary.unclaimed_count,
|
||||
report.summary.modules_with_claims,
|
||||
report.summary.modules_without_claims,
|
||||
));
|
||||
|
||||
if report.modules.is_empty() {
|
||||
out.push_str("No observations or claims found.\n");
|
||||
return out;
|
||||
}
|
||||
|
||||
out.push_str("## Modules\n\n");
|
||||
out.push_str("| Module | Claims | Observations | Claimed | Unclaimed | Missing | Density |\n");
|
||||
out.push_str("|--------|--------|--------------|---------|-----------|---------|----------|\n");
|
||||
|
||||
for m in &report.modules {
|
||||
out.push_str(&format!(
|
||||
"| {} | {} | {} | {} | {} | {} | {:.1}% |\n",
|
||||
m.module_path,
|
||||
m.claim_count,
|
||||
m.observation_count,
|
||||
m.claimed_observations,
|
||||
m.unclaimed_observations,
|
||||
m.missing_claims,
|
||||
m.density * 100.0,
|
||||
));
|
||||
}
|
||||
|
||||
// Highlight modules that have observations but no claims
|
||||
let uncovered: Vec<&ModuleCoverage> = report
|
||||
.modules
|
||||
.iter()
|
||||
.filter(|m| m.claim_count == 0 && m.observation_count > 0)
|
||||
.collect();
|
||||
|
||||
if !uncovered.is_empty() {
|
||||
out.push_str("\n## Coverage Gaps\n\n");
|
||||
out.push_str("These modules have observations but no authored claims:\n\n");
|
||||
for m in uncovered {
|
||||
out.push_str(&format!(
|
||||
"- **{}** ({} unclaimed observations)\n",
|
||||
m.module_path, m.unclaimed_observations,
|
||||
));
|
||||
}
|
||||
}
|
||||
|
||||
out
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use crate::types::authored_claim::{AuthoredValue, ClaimStatus, ComparisonMode};
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
fn make_claim(id: &str, concept_path: &str, category: &str) -> AuthoredClaim {
|
||||
AuthoredClaim {
|
||||
id: id.to_string(),
|
||||
concept_path: concept_path.to_string(),
|
||||
predicate: "test".to_string(),
|
||||
value: AuthoredValue::Text("test".to_string()),
|
||||
comparison: ComparisonMode::Equals,
|
||||
provenance: "test".to_string(),
|
||||
invariant: "test".to_string(),
|
||||
consequence: "test".to_string(),
|
||||
authority_tier: "expert".to_string(),
|
||||
evidence: vec![],
|
||||
category: category.to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: "tester".to_string(),
|
||||
created_at: "2026-02-08".to_string(),
|
||||
updated_at: None,
|
||||
}
|
||||
}
|
||||
|
||||
fn make_obs(concept_path: &str, file: &str) -> Observation {
|
||||
Observation {
|
||||
concept_path: concept_path.to_string(),
|
||||
predicate: "test".to_string(),
|
||||
value: ObjectValue::Text("test".to_string()),
|
||||
file: file.to_string(),
|
||||
line: 1,
|
||||
matched_text: "test".to_string(),
|
||||
confidence: 1.0,
|
||||
description: "test".to_string(),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_derive_module() {
|
||||
assert_eq!(derive_module("src/wallet/atomics/sync.rs"), "wallet/atomics");
|
||||
assert_eq!(derive_module("src/tls/config.rs"), "tls");
|
||||
assert_eq!(derive_module("config.toml"), "(root)");
|
||||
assert_eq!(derive_module("src/main.rs"), "(root)");
|
||||
assert_eq!(derive_module("src/auth/jwt/token.rs"), "auth/jwt");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_derive_module_from_claim() {
|
||||
assert_eq!(
|
||||
derive_module_from_claim("project/wallet/atomics/ordering"),
|
||||
"atomics"
|
||||
);
|
||||
assert_eq!(
|
||||
derive_module_from_claim("code://rust/core/imports/tokio"),
|
||||
"imports"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_compute_coverage_empty() {
|
||||
let report = compute_coverage(&[], &[], "test");
|
||||
assert_eq!(report.summary.total_observations, 0);
|
||||
assert_eq!(report.summary.total_claims, 0);
|
||||
assert_eq!(report.summary.claimed_percentage, 0.0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_compute_coverage_with_matches() {
|
||||
let claims = vec![make_claim(
|
||||
"c1",
|
||||
"project/atomics/ordering",
|
||||
"safety",
|
||||
)];
|
||||
let observations = vec![
|
||||
make_obs("code://rust/project/atomics/ordering", "src/wallet/atomics/sync.rs"),
|
||||
make_obs("code://rust/project/tls/config", "src/tls/config.rs"),
|
||||
];
|
||||
|
||||
let report = compute_coverage(&claims, &observations, "test");
|
||||
assert_eq!(report.summary.total_claims, 1);
|
||||
assert_eq!(report.summary.total_observations, 2);
|
||||
assert!(report.summary.unclaimed_count > 0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_coverage_table_output() {
|
||||
let claims = vec![make_claim("c1", "project/atomics/ordering", "safety")];
|
||||
let observations = vec![make_obs(
|
||||
"code://rust/project/atomics/ordering",
|
||||
"src/wallet/atomics/sync.rs",
|
||||
)];
|
||||
let report = compute_coverage(&claims, &observations, "myproject");
|
||||
let table = format_coverage_table(&report, "name");
|
||||
assert!(table.contains("Aphoria Coverage: myproject"));
|
||||
assert!(table.contains("Summary:"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_coverage_json_output() {
|
||||
let report = compute_coverage(&[], &[], "test");
|
||||
let json = format_coverage_json(&report);
|
||||
let parsed: serde_json::Value =
|
||||
serde_json::from_str(&json).expect("valid json");
|
||||
assert_eq!(parsed["project"], "test");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_coverage_markdown_output() {
|
||||
let report = compute_coverage(&[], &[], "test");
|
||||
let md = format_coverage_markdown(&report);
|
||||
assert!(md.starts_with("# Aphoria Coverage: test"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_deprecated_claims_excluded() {
|
||||
let mut claim = make_claim("c1", "project/atomics/ordering", "safety");
|
||||
claim.status = ClaimStatus::Deprecated;
|
||||
let report = compute_coverage(&[claim], &[], "test");
|
||||
assert_eq!(report.summary.total_claims, 0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_claims_map_to_observation_modules() {
|
||||
// Claim concept_path and observation concept_path share tail "atomics/ordering"
|
||||
let claims = vec![make_claim("c1", "project/atomics/ordering", "safety")];
|
||||
let observations = vec![
|
||||
make_obs("code://rust/project/atomics/ordering", "src/wallet/atomics/sync.rs"),
|
||||
];
|
||||
|
||||
let report = compute_coverage(&claims, &observations, "test");
|
||||
|
||||
// The claim should land in "wallet/atomics" (from observation file path),
|
||||
// NOT "atomics" (from concept_path tail). This means the module should
|
||||
// have both a claim and an observation with non-zero density.
|
||||
let wallet_mod = report
|
||||
.modules
|
||||
.iter()
|
||||
.find(|m| m.module_path == "wallet/atomics");
|
||||
let wallet_mod = wallet_mod.expect("Expected wallet/atomics module");
|
||||
assert_eq!(wallet_mod.claim_count, 1);
|
||||
assert_eq!(wallet_mod.observation_count, 1);
|
||||
assert!(wallet_mod.density > 0.0, "density should be non-zero");
|
||||
}
|
||||
}
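The two ratios reported by `coverage.rs` are easy to conflate, so here is a minimal sketch of the arithmetic on its own. The free functions `density`, `claimed_percentage`, and `density_cell` are hypothetical helpers written for this illustration; the source computes the same values inline. Density is claims per observation, while the claimed percentage is the share of observations matched by a claim.

```rust
/// Claim density for one module: claims divided by observations,
/// or 0.0 when the module has no observations.
fn density(claim_count: usize, observation_count: usize) -> f32 {
    if observation_count > 0 {
        claim_count as f32 / observation_count as f32
    } else {
        0.0
    }
}

/// Share of observations covered by at least one claim, as a percentage.
fn claimed_percentage(claimed: usize, total_observations: usize) -> f32 {
    if total_observations > 0 {
        (claimed as f32 / total_observations as f32) * 100.0
    } else {
        0.0
    }
}

/// Render density the way the table and markdown formatters do.
fn density_cell(density: f32) -> String {
    format!("{:.1}%", density * 100.0)
}
```

Because density is a ratio of claims to observations rather than a coverage fraction, the rendered "Density" column can exceed 100% when a module carries more claims than observations, for example three claims over two observations.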
|
||||
218
applications/aphoria/src/episteme/authority_lens.rs
Normal file
@ -0,0 +1,218 @@
|
||||
//! Aphoria Authority Lens - formalizes the authority-based conflict scoring.
|
||||
//!
|
||||
//! Wraps the existing scoring formula from `conflict.rs` into a proper
|
||||
//! `stemedb_lens::Lens` implementation. This allows the authority resolution
|
||||
//! logic to be used as a first-class Lens in Episteme queries.
|
||||
|
||||
use stemedb_core::types::{Assertion, SourceClass};
|
||||
use stemedb_lens::{Lens, Resolution};
|
||||
|
||||
use crate::types::TierBreakdown;
|
||||
|
||||
/// Authority-based lens that resolves conflicts by source class tier.
|
||||
///
|
||||
/// Higher-authority sources (lower tier numbers) win; ties within a tier go to the
/// assertion with the higher confidence. Resolution confidence uses the same formula
/// as `compute_conflict_score()` in `conflict.rs`:
|
||||
///
|
||||
/// ```text
|
||||
/// normalized = 0.4 + (3.0 - min_tier) / 3.0 * 0.55
|
||||
/// ```
|
||||
///
|
||||
/// Tier 0 (Regulatory) produces score ~0.95, Tier 3 (Expert) produces ~0.40.
|
||||
pub struct AphoriaAuthorityLens;
|
||||
|
||||
impl Lens for AphoriaAuthorityLens {
|
||||
fn resolve(&self, candidates: &[Assertion]) -> Resolution {
|
||||
if candidates.is_empty() {
|
||||
return Resolution::empty();
|
||||
}
|
||||
|
||||
if candidates.len() == 1 {
|
||||
return Resolution::with_winner(candidates[0].clone(), 1, 1.0, 0.0);
|
||||
}
|
||||
|
||||
// Group by tier, pick the winner from the highest-authority (lowest tier) group
|
||||
let mut best_tier = u8::MAX;
|
||||
let mut best_assertion: Option<&Assertion> = None;
|
||||
let mut best_confidence: f32 = 0.0;
|
||||
|
||||
for assertion in candidates {
|
||||
let tier = assertion.source_class.tier();
|
||||
if tier < best_tier || (tier == best_tier && assertion.confidence > best_confidence) {
|
||||
best_tier = tier;
|
||||
best_assertion = Some(assertion);
|
||||
best_confidence = assertion.confidence;
|
||||
}
|
||||
}
|
||||
|
||||
let winner = match best_assertion {
|
||||
Some(a) => a.clone(),
|
||||
None => return Resolution::empty(),
|
||||
};
|
||||
|
||||
// Compute conflict score from the tier spread among the candidates
|
||||
let conflict_score = authority_conflict_score(candidates);
|
||||
|
||||
// Resolution confidence is based on how dominant the winning tier is
|
||||
let min_tier = best_tier as f32;
|
||||
let resolution_confidence = 0.4 + (3.0 - min_tier.min(3.0)) / 3.0 * 0.55;
|
||||
|
||||
Resolution::with_winner(
|
||||
winner,
|
||||
candidates.len(),
|
||||
resolution_confidence.min(1.0),
|
||||
conflict_score,
|
||||
)
|
||||
}
|
||||
|
||||
fn name(&self) -> &'static str {
|
||||
"AphoriaAuthority"
|
||||
}
|
||||
}
|
||||
|
||||
/// Compute cross-tier conflict score for a set of assertions.
|
||||
///
|
||||
/// Scores the spread between the highest- and lowest-authority tiers present:
/// `(max_tier - min_tier) / 5.0`, capped at 1.0. The normalized formula from
/// `conflict.rs:compute_conflict_score()` is applied to the resolution
/// confidence in `resolve`, not here.
///
/// Returns 0.0 for fewer than two candidates or when all assertions share a
/// tier; the score grows as high-authority sources (Tier 0) conflict with
/// lower-authority ones (Tier 3+).
|
||||
fn authority_conflict_score(candidates: &[Assertion]) -> f32 {
|
||||
if candidates.len() <= 1 {
|
||||
return 0.0;
|
||||
}
|
||||
|
||||
let min_tier = candidates.iter().map(|a| a.source_class.tier()).min().unwrap_or(3);
|
||||
let max_tier = candidates.iter().map(|a| a.source_class.tier()).max().unwrap_or(3);
|
||||
|
||||
if min_tier == max_tier {
|
||||
return 0.0; // Same tier, no authority conflict
|
||||
}
|
||||
|
||||
// Tier distance maps to conflict intensity
|
||||
let tier_distance = (max_tier - min_tier) as f32;
|
||||
(tier_distance / 5.0).min(1.0) // Max 5 tiers apart (0-5)
|
||||
}
|
||||
|
||||
/// Compute tier breakdown from a set of assertions.
|
||||
///
|
||||
/// Returns one breakdown per tier, sorted by ascending tier number.
|
||||
pub fn compute_tier_breakdown(assertions: &[Assertion]) -> Vec<TierBreakdown> {
|
||||
use std::collections::BTreeMap;
|
||||
|
||||
let mut by_tier: BTreeMap<u8, (SourceClass, usize, f32)> = BTreeMap::new();
|
||||
|
||||
for assertion in assertions {
|
||||
let tier = assertion.source_class.tier();
|
||||
let entry = by_tier.entry(tier).or_insert((assertion.source_class, 0, 0.0));
|
||||
entry.1 += 1;
|
||||
if assertion.confidence > entry.2 {
|
||||
entry.2 = assertion.confidence;
|
||||
}
|
||||
}
|
||||
|
||||
by_tier
|
||||
.into_iter()
|
||||
.map(|(tier, (source_class, count, max_conf))| TierBreakdown {
|
||||
tier,
|
||||
source_class,
|
||||
assertion_count: count,
|
||||
max_confidence: max_conf,
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use stemedb_core::testing::AssertionBuilder;
|
||||
|
||||
#[test]
|
||||
fn test_empty_candidates() {
|
||||
let lens = AphoriaAuthorityLens;
|
||||
let result = lens.resolve(&[]);
|
||||
assert!(result.winner.is_none());
|
||||
assert_eq!(result.candidates_count, 0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_single_candidate() {
|
||||
let lens = AphoriaAuthorityLens;
|
||||
let assertion = AssertionBuilder::new()
|
||||
.source_class(SourceClass::Regulatory)
|
||||
.confidence(0.95)
|
||||
.build();
|
||||
let result = lens.resolve(&[assertion]);
|
||||
assert!(result.winner.is_some());
|
||||
assert_eq!(result.candidates_count, 1);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_authority_wins_over_lower_tier() {
|
||||
let lens = AphoriaAuthorityLens;
|
||||
let regulatory = AssertionBuilder::new()
|
||||
.subject("rfc://test")
|
||||
.source_class(SourceClass::Regulatory)
|
||||
.confidence(0.9)
|
||||
.build();
|
||||
let community = AssertionBuilder::new()
|
||||
.subject("code://test")
|
||||
.source_class(SourceClass::Community)
|
||||
.confidence(1.0)
|
||||
.build();
|
||||
|
||||
let result = lens.resolve(&[community, regulatory]);
|
||||
let winner = result.winner.as_ref().expect("should have winner");
|
||||
assert_eq!(winner.source_class, SourceClass::Regulatory);
|
||||
assert_eq!(result.candidates_count, 2);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_lens_scores_match_existing() {
|
||||
// Verify the normalized formula matches conflict.rs expectations
|
||||
// Tier 0 vs code → ~0.95
|
||||
let regulatory = AssertionBuilder::new()
|
||||
.source_class(SourceClass::Regulatory)
|
||||
.confidence(1.0)
|
||||
.build();
|
||||
let community = AssertionBuilder::new()
|
||||
.source_class(SourceClass::Community)
|
||||
.confidence(1.0)
|
||||
.build();
|
||||
|
||||
let lens = AphoriaAuthorityLens;
|
||||
let result = lens.resolve(&[regulatory, community]);
|
||||
|
||||
// Resolution confidence for Tier 0 winner should be ~0.95
|
||||
assert!(
|
||||
result.resolution_confidence > 0.9,
|
||||
"Expected >0.9, got {}",
|
||||
result.resolution_confidence
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_tier_breakdown() {
|
||||
let assertions = vec![
|
||||
AssertionBuilder::new()
|
||||
.source_class(SourceClass::Regulatory)
|
||||
.confidence(0.95)
|
||||
.build(),
|
||||
AssertionBuilder::new()
|
||||
.source_class(SourceClass::Regulatory)
|
||||
.confidence(0.9)
|
||||
.build(),
|
||||
AssertionBuilder::new()
|
||||
.source_class(SourceClass::Community)
|
||||
.confidence(0.7)
|
||||
.build(),
|
||||
];
|
||||
|
||||
let breakdown = compute_tier_breakdown(&assertions);
|
||||
assert_eq!(breakdown.len(), 2);
|
||||
assert_eq!(breakdown[0].tier, 0); // Regulatory
|
||||
assert_eq!(breakdown[0].assertion_count, 2);
|
||||
assert!((breakdown[0].max_confidence - 0.95).abs() < f32::EPSILON);
|
||||
assert_eq!(breakdown[1].assertion_count, 1);
|
||||
}
|
||||
}
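The lens produces two separate numbers, and a standalone sketch makes the difference concrete. The functions below are illustrative only: they take plain `u8` tier numbers instead of `Assertion` values, and `resolution_confidence` is a hypothetical name for the expression computed inline in `resolve`. The first maps the winning tier to a confidence; the second scores how far apart the candidate tiers are.

```rust
/// Confidence in the winner, from its tier: 0.4 + (3 - tier) / 3 * 0.55,
/// with the tier clamped to 3 so lower-authority tiers bottom out at 0.40.
fn resolution_confidence(winning_tier: u8) -> f32 {
    let tier = (winning_tier as f32).min(3.0);
    0.4 + (3.0 - tier) / 3.0 * 0.55
}

/// Conflict intensity from the tier spread: (max - min) / 5, capped at 1.0.
/// Fewer than two candidates, or a single shared tier, score 0.0.
fn authority_conflict_score(tiers: &[u8]) -> f32 {
    if tiers.len() <= 1 {
        return 0.0;
    }
    let (Some(min), Some(max)) = (tiers.iter().copied().min(), tiers.iter().copied().max())
    else {
        return 0.0;
    };
    if min == max {
        return 0.0;
    }
    ((max - min) as f32 / 5.0).min(1.0)
}
```

With these definitions a Tier 0 winner yields a confidence of about 0.95 and a Tier 3 winner about 0.40, while candidates at tiers 0 and 3 give a conflict score of 0.6. Which `SourceClass` variants map to which tier numbers is defined in `stemedb_core` and is not assumed here.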
|
||||
@ -10,7 +10,7 @@ use tracing::info;
|
||||
|
||||
use crate::config::AphoriaConfig;
|
||||
use crate::types::{
|
||||
ConflictResult, ConflictTrace, ConflictingSource, ExtractedClaim, PolicySourceInfo,
|
||||
ConflictResult, ConflictTrace, ConflictingSource, Observation, PolicySourceInfo,
|
||||
PredicateAliasSet, Verdict,
|
||||
};
|
||||
|
||||
@ -37,7 +37,7 @@ use super::concept_index::ConceptIndex;
|
||||
/// This version uses predicate aliases from config only.
|
||||
#[allow(dead_code)]
|
||||
pub fn check_conflicts_pure(
|
||||
claims: &[ExtractedClaim],
|
||||
claims: &[Observation],
|
||||
index: &ConceptIndex,
|
||||
aliases: &HashMap<String, String>,
|
||||
pack_sources: &HashMap<String, PolicySourceInfo>,
|
||||
@ -62,7 +62,7 @@ pub fn check_conflicts_pure(
|
||||
/// This variant allows passing predicate aliases explicitly, which is useful
|
||||
/// when aliases come from multiple sources (config + Trust Packs).
|
||||
pub fn check_conflicts_with_predicate_aliases(
|
||||
claims: &[ExtractedClaim],
|
||||
claims: &[Observation],
|
||||
index: &ConceptIndex,
|
||||
aliases: &HashMap<String, String>,
|
||||
pack_sources: &HashMap<String, PolicySourceInfo>,
|
||||
@ -179,6 +179,36 @@ pub fn check_conflicts_with_predicate_aliases(
|
||||
None
|
||||
};
|
||||
|
||||
// Compute tier breakdown in debug mode
|
||||
let tier_breakdown = if debug {
|
||||
use std::collections::BTreeMap;
|
||||
let mut by_tier: BTreeMap<u8, (SourceClass, usize, f32)> = BTreeMap::new();
|
||||
for source in &conflicts {
|
||||
let tier = source.source_class.tier();
|
||||
let entry =
|
||||
by_tier.entry(tier).or_insert((source.source_class, 0, 0.0));
|
||||
entry.1 += 1;
|
||||
if source.confidence > entry.2 {
|
||||
entry.2 = source.confidence;
|
||||
}
|
||||
}
|
||||
Some(
|
||||
by_tier
|
||||
.into_iter()
|
||||
.map(|(tier, (sc, count, max_conf))| {
|
||||
crate::types::TierBreakdown {
|
||||
tier,
|
||||
source_class: sc,
|
||||
assertion_count: count,
|
||||
max_confidence: max_conf,
|
||||
}
|
||||
})
|
||||
.collect(),
|
||||
)
|
||||
} else {
|
||||
None
|
||||
};
|
||||
|
||||
results.push(ConflictResult {
|
||||
claim: claim.clone(),
|
||||
conflicts,
|
||||
@ -186,6 +216,7 @@ pub fn check_conflicts_with_predicate_aliases(
|
||||
verdict,
|
||||
acknowledged: None,
|
||||
trace,
|
||||
tier_breakdown,
|
||||
});
|
||||
}
|
||||
|
||||
|
||||
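The debug-mode tier breakdown added in the hunk above groups conflicting sources by authority tier, counting them and tracking the highest confidence per tier. Below is a minimal standalone sketch of that aggregation using only the standard library. `Source`, `TierSummary`, and `tier_breakdown` are hypothetical stand-ins for illustration, not the crate's `ConflictingSource` or `TierBreakdown` types.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a conflicting source: an authority tier and a confidence score.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Source {
    pub tier: u8,
    pub confidence: f32,
}

/// Per-tier summary: how many sources fall in the tier and the highest confidence among them.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct TierSummary {
    pub tier: u8,
    pub assertion_count: usize,
    pub max_confidence: f32,
}

/// Group sources by tier. A BTreeMap keeps the result sorted by ascending tier.
pub fn tier_breakdown(sources: &[Source]) -> Vec<TierSummary> {
    let mut by_tier: BTreeMap<u8, (usize, f32)> = BTreeMap::new();
    for s in sources {
        // First sighting of a tier starts at count 0 and confidence 0.0.
        let entry = by_tier.entry(s.tier).or_insert((0, 0.0));
        entry.0 += 1;
        if s.confidence > entry.1 {
            entry.1 = s.confidence;
        }
    }
    by_tier
        .into_iter()
        .map(|(tier, (assertion_count, max_confidence))| TierSummary {
            tier,
            assertion_count,
            max_confidence,
        })
        .collect()
}
```

Keying the map on the numeric tier is what makes the output order deterministic, which is convenient for debug output that gets compared across runs.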
@ -156,6 +156,72 @@ pub fn create_authoritative_corpus(signing_key: &SigningKey) -> Vec<Assertion> {
|
||||
assertions
|
||||
}
|
||||
|
||||
/// Create a signed authoritative assertion with additional metadata fields.
|
||||
///
|
||||
/// Like `create_authoritative_assertion`, but merges `extra_metadata` into the
|
||||
/// `source_metadata` JSON. Use this to attach RFC section references, CWE IDs,
|
||||
/// or other structured provenance data.
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
pub fn create_authoritative_assertion_with_metadata(
|
||||
signing_key: &SigningKey,
|
||||
subject: &str,
|
||||
predicate: &str,
|
||||
object: ObjectValue,
|
||||
source_class: SourceClass,
|
||||
description: &str,
|
||||
timestamp: u64,
|
||||
extra_metadata: serde_json::Value,
|
||||
) -> Assertion {
|
||||
// Compute source hash
|
||||
let mut hasher = Hasher::new();
|
||||
hasher.update(subject.as_bytes());
|
||||
hasher.update(predicate.as_bytes());
|
||||
hasher.update(description.as_bytes());
|
||||
let source_hash = *hasher.finalize().as_bytes();
|
||||
|
||||
// Create signature
|
||||
let message = format!("{}:{}", subject, predicate);
|
||||
let signature = signing_key.sign(message.as_bytes());
|
||||
let verifying_key = signing_key.verifying_key();
|
||||
|
||||
let signature_entry = SignatureEntry {
|
||||
agent_id: verifying_key.to_bytes(),
|
||||
signature: signature.to_bytes(),
|
||||
timestamp,
|
||||
version: 1,
|
||||
};
|
||||
|
||||
// Build source_metadata: start with extras, then overwrite with base fields
|
||||
// so that base fields ("description", "source") can never be overridden.
|
||||
let mut metadata = if let serde_json::Value::Object(extra) = extra_metadata {
|
||||
serde_json::Value::Object(extra)
|
||||
} else {
|
||||
serde_json::json!({})
|
||||
};
|
||||
if let serde_json::Value::Object(ref mut map) = metadata {
|
||||
map.insert("description".to_string(), serde_json::Value::String(description.to_string()));
|
||||
map.insert("source".to_string(), serde_json::Value::String("authoritative_corpus".to_string()));
|
||||
}
|
||||
|
||||
Assertion {
|
||||
subject: subject.to_string(),
|
||||
predicate: predicate.to_string(),
|
||||
object,
|
||||
parent_hash: None,
|
||||
source_hash,
|
||||
source_class,
|
||||
visual_hash: None,
|
||||
epoch: None,
|
||||
source_metadata: serde_json::to_vec(&metadata).ok(),
|
||||
lifecycle: LifecycleStage::Approved,
|
||||
signatures: vec![signature_entry],
|
||||
confidence: 1.0,
|
||||
timestamp,
|
||||
hlc_timestamp: HlcTimestamp::default(),
|
||||
vector: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// Create a signed authoritative assertion.
|
||||
///
|
||||
/// This helper is used by corpus builders to create signed assertions with
|
||||
@ -211,3 +277,104 @@ pub fn create_authoritative_assertion(
|
||||
vector: None,
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use ed25519_dalek::SigningKey;
|
||||
|
||||
#[test]
|
||||
fn test_create_authoritative_assertion_with_metadata_merges_fields() {
|
||||
let signing_key = SigningKey::generate(&mut rand::rngs::OsRng);
|
||||
let timestamp = current_timestamp();
|
||||
|
||||
let extra_metadata = serde_json::json!({
|
||||
"rfc_section": "7519-4.1.3",
|
||||
"cwe_references": ["CWE-345", "CWE-346"],
|
||||
"severity": "high"
|
||||
});
|
||||
|
||||
let assertion = create_authoritative_assertion_with_metadata(
|
||||
&signing_key,
|
||||
"rfc://test/subject",
|
||||
"test_predicate",
|
||||
ObjectValue::Boolean(true),
|
||||
SourceClass::Regulatory,
|
||||
"Test description",
|
||||
timestamp,
|
||||
extra_metadata,
|
||||
);
|
||||
|
||||
// Extract and parse source_metadata
|
||||
let metadata_bytes = assertion.source_metadata.expect("metadata should exist");
|
||||
let metadata: serde_json::Value =
|
||||
serde_json::from_slice(&metadata_bytes).expect("should parse JSON");
|
||||
|
||||
// Verify base fields are present
|
||||
assert_eq!(metadata["description"], "Test description");
|
||||
assert_eq!(metadata["source"], "authoritative_corpus");
|
||||
|
||||
// Verify extra fields are merged
|
||||
assert_eq!(metadata["rfc_section"], "7519-4.1.3");
|
||||
assert_eq!(metadata["cwe_references"][0], "CWE-345");
|
||||
assert_eq!(metadata["cwe_references"][1], "CWE-346");
|
||||
assert_eq!(metadata["severity"], "high");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_create_authoritative_assertion_with_metadata_preserves_base_fields() {
|
||||
let signing_key = SigningKey::generate(&mut rand::rngs::OsRng);
|
||||
let timestamp = current_timestamp();
|
||||
|
||||
// Extra metadata must not overwrite base fields when keys collide
|
||||
let extra_metadata = serde_json::json!({
|
||||
"description": "Malicious override attempt",
|
||||
"custom_field": "allowed"
|
||||
});
|
||||
|
||||
let assertion = create_authoritative_assertion_with_metadata(
|
||||
&signing_key,
|
||||
"rfc://test/subject",
|
||||
"test_predicate",
|
||||
ObjectValue::Boolean(true),
|
||||
SourceClass::Regulatory,
|
||||
"Original description",
|
||||
timestamp,
|
||||
extra_metadata,
|
||||
);
|
||||
|
||||
let metadata_bytes = assertion.source_metadata.expect("metadata should exist");
|
||||
let metadata: serde_json::Value =
|
||||
serde_json::from_slice(&metadata_bytes).expect("should parse JSON");
|
||||
|
||||
// Base fields always win over extra metadata
|
||||
assert_eq!(metadata["description"], "Original description");
|
||||
assert_eq!(metadata["source"], "authoritative_corpus");
|
||||
assert_eq!(metadata["custom_field"], "allowed");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_create_authoritative_assertion_with_empty_metadata() {
|
||||
let signing_key = SigningKey::generate(&mut rand::rngs::OsRng);
|
||||
let timestamp = current_timestamp();
|
||||
|
||||
let assertion = create_authoritative_assertion_with_metadata(
|
||||
&signing_key,
|
||||
"rfc://test/subject",
|
||||
"test_predicate",
|
||||
ObjectValue::Boolean(true),
|
||||
SourceClass::Regulatory,
|
||||
"Test description",
|
||||
timestamp,
|
||||
serde_json::json!({}),
|
||||
);
|
||||
|
||||
let metadata_bytes = assertion.source_metadata.expect("metadata should exist");
|
||||
let metadata: serde_json::Value =
|
||||
serde_json::from_slice(&metadata_bytes).expect("should parse JSON");
|
||||
|
||||
// Should still have base fields
|
||||
assert_eq!(metadata["description"], "Test description");
|
||||
assert_eq!(metadata["source"], "authoritative_corpus");
|
||||
}
|
||||
}
|
||||
|
||||
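The merge rule exercised by the tests above is "start from the caller's extras, then write the base fields last so they always win". Here is a minimal sketch of that ordering, assuming a plain string map instead of the `serde_json::Value` used in the commit. `merge_metadata` is a hypothetical helper name for illustration only.

```rust
use std::collections::BTreeMap;

/// Merge caller-supplied metadata with the reserved base fields.
/// Base fields are inserted last, so they replace any colliding extra key.
pub fn merge_metadata(
    extra: BTreeMap<String, String>,
    description: &str,
) -> BTreeMap<String, String> {
    // Start from the extras, taking ownership so no copy is needed.
    let mut metadata = extra;
    // `insert` replaces an existing value, so these writes override collisions.
    metadata.insert("description".to_string(), description.to_string());
    metadata.insert("source".to_string(), "authoritative_corpus".to_string());
    metadata
}
```

Writing the reserved keys last avoids a separate collision check: the override falls out of `insert` semantics, and non-colliding extra keys pass through unchanged.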
@ -5,7 +5,7 @@
|
||||
use stemedb_core::types::Assertion;
|
||||
use tracing::{debug, info, instrument};
|
||||
|
||||
use crate::types::{predicates, DriftResult, ExtractedClaim, Verdict};
|
||||
use crate::types::{predicates, DriftResult, Observation, Verdict};
|
||||
use crate::AphoriaError;
|
||||
|
||||
use super::helpers::assertion_to_prior_observation;
|
||||
@ -22,7 +22,7 @@ impl LocalEpisteme {
|
||||
#[instrument(skip(self, claims), fields(claim_count = claims.len()))]
|
||||
pub async fn check_drift(
|
||||
&self,
|
||||
claims: &[ExtractedClaim],
|
||||
claims: &[Observation],
|
||||
) -> Result<Vec<DriftResult>, AphoriaError> {
|
||||
let mut drifts = Vec::new();
|
||||
|
||||
|
||||
@ -13,7 +13,7 @@ use tracing::{info, instrument, warn};
|
||||
use crate::config::{AphoriaConfig, CorpusConfig};
|
||||
use crate::corpus::CorpusRegistry;
|
||||
use crate::policy::TrustPack;
|
||||
use crate::types::{ConflictResult, ExtractedClaim, PolicySourceInfo, PredicateAliasSet};
|
||||
use crate::types::{ConflictResult, Observation, PolicySourceInfo, PredicateAliasSet};
|
||||
|
||||
use super::concept_index::ConceptIndex;
|
||||
use super::conflict::check_conflicts_with_predicate_aliases;
|
||||
@ -213,7 +213,7 @@ impl EphemeralDetector {
|
||||
/// Vector of conflict results, with debug traces populated based on config.
|
||||
pub fn check_conflicts(
|
||||
&self,
|
||||
claims: &[ExtractedClaim],
|
||||
claims: &[Observation],
|
||||
config: &AphoriaConfig,
|
||||
) -> Vec<ConflictResult> {
|
||||
// Merge predicate aliases from config and from imported packs
|
||||
@ -236,7 +236,7 @@ impl EphemeralDetector {
|
||||
/// Like `check_conflicts`, but populates `ConflictTrace` for each result.
|
||||
pub fn check_conflicts_debug(
|
||||
&self,
|
||||
claims: &[ExtractedClaim],
|
||||
claims: &[Observation],
|
||||
config: &AphoriaConfig,
|
||||
) -> Vec<ConflictResult> {
|
||||
// Merge predicate aliases from config and from imported packs
|
||||
|
||||
@ -8,7 +8,7 @@ use tracing::{debug, info, instrument, warn};
|
||||
|
||||
use crate::config::AphoriaConfig;
|
||||
use crate::types::{
|
||||
AcknowledgmentInfo, ConflictResult, ConflictingSource, ExtractedClaim, PolicySourceInfo,
|
||||
AcknowledgmentInfo, ConflictResult, ConflictingSource, Observation, PolicySourceInfo,
|
||||
Verdict,
|
||||
};
|
||||
use crate::AphoriaError;
|
||||
@ -35,7 +35,7 @@ impl LocalEpisteme {
|
||||
#[instrument(skip(self, claims, config, index), fields(claim_count = claims.len()))]
|
||||
pub async fn check_conflicts(
|
||||
&self,
|
||||
claims: &[ExtractedClaim],
|
||||
claims: &[Observation],
|
||||
config: &AphoriaConfig,
|
||||
index: &ConceptIndex,
|
||||
) -> Result<Vec<ConflictResult>, AphoriaError> {
|
||||
@ -232,6 +232,7 @@ impl LocalEpisteme {
|
||||
verdict,
|
||||
acknowledged,
|
||||
trace: None, // Persistent mode doesn't populate traces (for now)
|
||||
tier_breakdown: None,
|
||||
});
|
||||
}
|
||||
|
||||
|
||||
@ -7,8 +7,8 @@ use stemedb_ingest::serialize_assertion;
|
||||
use stemedb_storage::PredicateIndexStore;
|
||||
use tracing::{debug, info, instrument, warn};
|
||||
|
||||
use crate::bridge::{claim_to_assertion, claim_to_observation};
|
||||
use crate::types::{predicates, ExtractedClaim};
|
||||
use crate::bridge::{claim_to_assertion, observation_to_assertion};
|
||||
use crate::types::{predicates, Observation};
|
||||
use crate::AphoriaError;
|
||||
|
||||
use super::super::corpus::current_timestamp;
|
||||
@ -17,7 +17,7 @@ use super::LocalEpisteme;
|
||||
impl LocalEpisteme {
|
||||
/// Ingest a batch of extracted claims into Episteme.
|
||||
#[instrument(skip(self, claims), fields(claim_count = claims.len()))]
|
||||
pub async fn ingest_claims(&self, claims: &[ExtractedClaim]) -> Result<usize, AphoriaError> {
|
||||
pub async fn ingest_claims(&self, claims: &[Observation]) -> Result<usize, AphoriaError> {
|
||||
let timestamp = current_timestamp();
|
||||
let mut ingested = 0;
|
||||
|
||||
@ -104,7 +104,7 @@ impl LocalEpisteme {
|
||||
#[instrument(skip(self, observations), fields(count = observations.len()))]
|
||||
pub async fn ingest_observations(
|
||||
&self,
|
||||
observations: &[ExtractedClaim],
|
||||
observations: &[Observation],
|
||||
) -> Result<usize, AphoriaError> {
|
||||
if observations.is_empty() {
|
||||
return Ok(0);
|
||||
@ -114,7 +114,7 @@ impl LocalEpisteme {
|
||||
let mut count = 0;
|
||||
|
||||
for claim in observations {
|
||||
let assertion = claim_to_observation(claim, &self.signing_key, timestamp);
|
||||
let assertion = observation_to_assertion(claim, &self.signing_key, timestamp);
|
||||
|
||||
// Serialize and write to WAL
|
||||
let record_bytes = serialize_assertion(&assertion).map_err(|e| {
|
||||
@ -165,12 +165,16 @@ impl LocalEpisteme {
|
||||
}
|
||||
|
||||
/// Ingest authoritative assertions (RFC, OWASP, etc.).
|
||||
///
|
||||
/// Writes assertions to WAL and adds them to the AUTHORITATIVE predicate index
|
||||
/// so they are discoverable by `fetch_authoritative_assertions()` during scans.
|
||||
#[instrument(skip(self, assertions), fields(count = assertions.len()))]
|
||||
pub async fn ingest_authoritative(
|
||||
&self,
|
||||
assertions: &[Assertion],
|
||||
) -> Result<usize, AphoriaError> {
|
||||
let mut ingested = 0;
|
||||
let mut hashes = Vec::with_capacity(assertions.len());
|
||||
|
||||
for assertion in assertions {
|
||||
let record_bytes = serialize_assertion(assertion).map_err(|e| {
|
||||
@ -179,6 +183,11 @@ impl LocalEpisteme {
|
||||
assertion.subject
|
||||
))
|
||||
})?;
|
||||
|
||||
// Compute hash for predicate indexing (skip 8-byte header, same as Ingestor)
|
||||
let hash = *blake3::hash(&record_bytes[8..]).as_bytes();
|
||||
hashes.push(hash);
|
||||
|
||||
let mut journal = self.journal.lock().await;
|
||||
journal.append(record_bytes).map_err(|e| {
|
||||
AphoriaError::Storage(format!(
|
||||
@ -199,6 +208,18 @@ impl LocalEpisteme {
|
||||
AphoriaError::Storage(format!("Failed to process authoritative ingestion: {e}"))
|
||||
})?;
|
||||
|
||||
// Add all assertions to the AUTHORITATIVE predicate index
|
||||
// This mirrors the pattern from policy_ops.rs import_policy()
|
||||
for hash in &hashes {
|
||||
if let Err(e) = self
|
||||
.predicate_index_store
|
||||
.add_to_predicate_index(predicates::AUTHORITATIVE, hash)
|
||||
.await
|
||||
{
|
||||
warn!(hash = %hex::encode(hash), error = %e, "Failed to add to authoritative index");
|
||||
}
|
||||
}
|
||||
|
||||
info!(ingested, "Ingested authoritative assertions");
|
||||
Ok(ingested)
|
||||
}
|
||||
|
||||
@ -7,6 +7,7 @@
|
||||
//! - Auto-creating aliases when conflicts are detected (Phase 2A.3)
|
||||
|
||||
mod aliases;
|
||||
pub mod authority_lens;
|
||||
mod concept_index;
|
||||
mod conflict;
|
||||
mod corpus;
|
||||
@ -21,12 +22,14 @@ mod tests;
|
||||
// Re-export public types and functions to maintain existing API
|
||||
pub use concept_index::ConceptIndex;
|
||||
pub use corpus::{
|
||||
create_authoritative_assertion, create_authoritative_corpus, current_timestamp,
|
||||
current_timestamp_millis,
|
||||
create_authoritative_assertion, create_authoritative_assertion_with_metadata,
|
||||
create_authoritative_corpus, current_timestamp, current_timestamp_millis,
|
||||
};
|
||||
pub use ephemeral::EphemeralDetector;
|
||||
pub use local::LocalEpisteme;
|
||||
|
||||
pub use authority_lens::{compute_tier_breakdown, AphoriaAuthorityLens};
|
||||
|
||||
// Re-export for tests
|
||||
#[cfg(test)]
|
||||
pub use conflict::compute_conflict_score;
|
||||
|
||||
@ -159,7 +159,7 @@ fn test_authoritative_corpus_creation() {
|
||||
|
||||
#[tokio::test]
|
||||
async fn test_auto_alias_creation_on_conflict() {
|
||||
use crate::types::ExtractedClaim;
|
||||
use crate::types::Observation;
|
||||
use stemedb_storage::AliasStore;
|
||||
|
||||
let temp_dir =
|
||||
@ -182,7 +182,7 @@ async fn test_auto_alias_creation_on_conflict() {
|
||||
let index = ConceptIndex::build(&corpus);
|
||||
|
||||
// Create a claim that will conflict with the authoritative corpus
|
||||
let claim = ExtractedClaim {
|
||||
let claim = Observation {
|
||||
concept_path: "code://rust/myapp/tls/cert_verification".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(false), // Conflicts with RFC (true)
|
||||
@ -221,7 +221,7 @@ async fn test_auto_alias_creation_on_conflict() {
|
||||
|
||||
#[tokio::test]
|
||||
async fn test_auto_alias_not_created_when_disabled() {
|
||||
use crate::types::ExtractedClaim;
|
||||
use crate::types::Observation;
|
||||
use stemedb_storage::AliasStore;
|
||||
|
||||
let temp_dir = tempfile::Builder::new()
|
||||
@ -242,7 +242,7 @@ async fn test_auto_alias_not_created_when_disabled() {
|
||||
let corpus = create_authoritative_corpus(&signing_key);
|
||||
let index = ConceptIndex::build(&corpus);
|
||||
|
||||
let claim = ExtractedClaim {
|
||||
let claim = Observation {
|
||||
concept_path: "code://rust/myapp/tls/cert_verification".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(false),
|
||||
@ -273,7 +273,7 @@ async fn test_auto_alias_not_created_when_disabled() {
|
||||
|
||||
#[tokio::test]
|
||||
async fn test_auto_alias_uses_auto_detected_origin() {
|
||||
use crate::types::ExtractedClaim;
|
||||
use crate::types::Observation;
|
||||
use stemedb_storage::AliasStore;
|
||||
|
||||
let temp_dir =
|
||||
@ -292,7 +292,7 @@ async fn test_auto_alias_uses_auto_detected_origin() {
|
||||
let corpus = create_authoritative_corpus(&signing_key);
|
||||
let index = ConceptIndex::build(&corpus);
|
||||
|
||||
let claim = ExtractedClaim {
|
||||
let claim = Observation {
|
||||
concept_path: "code://rust/myapp/jwt/audience_validation".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(false),
|
||||
@ -324,7 +324,7 @@ async fn test_auto_alias_uses_auto_detected_origin() {
|
||||
|
||||
#[tokio::test]
|
||||
async fn test_auto_alias_idempotent() {
|
||||
use crate::types::ExtractedClaim;
|
||||
use crate::types::Observation;
|
||||
use stemedb_storage::AliasStore;
|
||||
|
||||
let temp_dir = tempfile::Builder::new()
|
||||
@ -345,7 +345,7 @@ async fn test_auto_alias_idempotent() {
|
||||
let corpus = create_authoritative_corpus(&signing_key);
|
||||
let index = ConceptIndex::build(&corpus);
|
||||
|
||||
let claim = ExtractedClaim {
|
||||
let claim = Observation {
|
||||
concept_path: "code://rust/myapp/tls/cert_verification".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(false),
|
||||
@ -387,7 +387,7 @@ async fn test_auto_alias_idempotent() {
|
||||
|
||||
#[tokio::test]
|
||||
async fn test_ingest_observations_creates_tier4_assertions() {
|
||||
use crate::types::ExtractedClaim;
|
||||
use crate::types::Observation;
|
||||
|
||||
let temp_dir =
|
||||
tempfile::Builder::new().prefix("aphoria_observations").tempdir().expect("create temp dir");
|
||||
@ -402,7 +402,7 @@ async fn test_ingest_observations_creates_tier4_assertions() {
|
||||
|
||||
// Create claims that would NOT conflict with authority
|
||||
let observations = vec![
|
||||
ExtractedClaim {
|
||||
Observation {
|
||||
concept_path: "code://rust/myapp/logging/level".to_string(),
|
||||
predicate: "value".to_string(),
|
||||
value: ObjectValue::Text("info".to_string()),
|
||||
@ -412,7 +412,7 @@ async fn test_ingest_observations_creates_tier4_assertions() {
|
||||
confidence: 0.9,
|
||||
description: "Logging level set to info".to_string(),
|
||||
},
|
||||
ExtractedClaim {
|
||||
Observation {
|
||||
concept_path: "code://rust/myapp/db/pool_size".to_string(),
|
||||
predicate: "value".to_string(),
|
||||
value: ObjectValue::Number(10.0),
|
||||
|
||||
@ -140,4 +140,12 @@ pub enum AphoriaError {
|
||||
/// Governance workflow error (approval pending, rejected, or configuration issue).
|
||||
#[error("Governance error: {0}")]
|
||||
Governance(String),
|
||||
|
||||
/// Claims authoring error (create, update, supersede, deprecate).
|
||||
#[error("Claims error: {0}")]
|
||||
Claims(String),
|
||||
|
||||
/// Verification error (claim-to-observation matching).
|
||||
#[error("Verify error: {0}")]
|
||||
Verify(String),
|
||||
}
|
||||
|
||||
@ -24,7 +24,7 @@ use crate::config::EvalConfig;
|
||||
use crate::error::Result;
|
||||
use crate::llm::ontology::{AuthorityConcept, OntologyVocabulary, ValueType};
|
||||
use crate::llm::{GeminiClient, LlmCache, LlmExtractor};
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Configuration for an evaluation run.
|
||||
#[derive(Debug, Clone)]
|
||||
@ -436,7 +436,7 @@ impl EvalHarness {
|
||||
}
|
||||
|
||||
/// Extract claims from fixture content.
|
||||
fn extract_claims(&self, fixture: &Fixture) -> (Vec<ExtractedClaim>, usize, bool) {
|
||||
fn extract_claims(&self, fixture: &Fixture) -> (Vec<Observation>, usize, bool) {
|
||||
// In cached/live mode, we would call the LLM extractor
|
||||
// For now, return empty (mock behavior) until LLM is integrated
|
||||
if let Some(extractor) = &self.extractor {
|
||||
|
||||
@ -9,13 +9,13 @@ use stemedb_core::types::ObjectValue;
|
||||
use tracing::debug;
|
||||
|
||||
use super::fixture::ExpectedClaim;
|
||||
use crate::types::ExtractedClaim;
|
||||
use crate::types::Observation;
|
||||
|
||||
/// Result of matching expected claims against extracted claims.
|
||||
#[derive(Debug, Clone, Default)]
|
||||
pub struct MatchResult {
|
||||
/// Expected claims that were found in extracted claims.
|
||||
pub matched: Vec<(ExpectedClaim, ExtractedClaim)>,
|
||||
pub matched: Vec<(ExpectedClaim, Observation)>,
|
||||
|
||||
/// Expected claims that were NOT found.
|
||||
pub unmatched: Vec<ExpectedClaim>,
|
||||
@ -57,7 +57,7 @@ impl ClaimMatcher {
|
||||
/// Returns matched and unmatched expected claims.
|
||||
pub fn check_must_contain(
|
||||
&self,
|
||||
extracted: &[ExtractedClaim],
|
||||
extracted: &[Observation],
|
||||
expected: &[ExpectedClaim],
|
||||
) -> MatchResult {
|
||||
let mut matched = Vec::new();
|
||||
@ -79,9 +79,9 @@ impl ClaimMatcher {
|
||||
/// Returns violations: (forbidden claim, matched extracted claim).
|
||||
pub fn check_must_not_contain(
|
||||
&self,
|
||||
extracted: &[ExtractedClaim],
|
||||
extracted: &[Observation],
|
||||
forbidden: &[ExpectedClaim],
|
||||
) -> Vec<(ExpectedClaim, ExtractedClaim)> {
|
||||
) -> Vec<(ExpectedClaim, Observation)> {
|
||||
let mut violations = Vec::new();
|
||||
|
||||
for forbid in forbidden {
|
||||
@ -96,9 +96,9 @@ impl ClaimMatcher {
|
||||
/// Find an extracted claim that matches an expected claim.
|
||||
fn find_matching_claim<'a>(
|
||||
&self,
|
||||
extracted: &'a [ExtractedClaim],
|
||||
extracted: &'a [Observation],
|
||||
expected: &ExpectedClaim,
|
||||
) -> Option<&'a ExtractedClaim> {
|
||||
) -> Option<&'a Observation> {
|
||||
extracted.iter().find(|claim| {
|
||||
self.subject_matches(&claim.concept_path, &expected.subject)
|
||||
&& claim.predicate == expected.predicate
|
||||
@ -196,7 +196,7 @@ impl ClaimMatcher {
|
||||
///
|
||||
/// Extracted claims that don't match any expected claim.
|
||||
pub fn count_false_positives(
|
||||
extracted: &[ExtractedClaim],
|
||||
extracted: &[Observation],
|
||||
expected: &[ExpectedClaim],
|
||||
acceptable_variants: &[ExpectedClaim],
|
||||
matcher: &ClaimMatcher,
|
||||
@ -219,8 +219,8 @@ pub fn count_false_positives(
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
fn make_extracted_claim(subject: &str, predicate: &str, value: ObjectValue) -> ExtractedClaim {
|
||||
ExtractedClaim {
|
||||
fn make_extracted_claim(subject: &str, predicate: &str, value: ObjectValue) -> Observation {
|
||||
Observation {
|
||||
concept_path: subject.to_string(),
|
||||
predicate: predicate.to_string(),
|
||||
value,
|
||||
|
||||
530
applications/aphoria/src/explain.rs
Normal file
@ -0,0 +1,530 @@
|
||||
//! Narrative explanation generation for project claims.
|
||||
//!
|
||||
//! Three distinct outputs:
|
||||
//! - `generate_onboarding()` — lightweight summary for `aphoria explain`
|
||||
//! - `generate_full_docs()` — comprehensive reference for `aphoria docs generate`
|
||||
//! - `generate_explanation()` — legacy function (kept for backward compat, delegates to `generate_onboarding`)
|
||||
|
||||
use std::collections::BTreeMap;
|
||||
|
||||
use crate::coverage::CoverageReport;
|
||||
use crate::types::authored_claim::{AuthoredClaim, ClaimStatus};
|
||||
use crate::verify::{AuditVerdict, VerifyReport};
|
||||
|
||||
/// Generate an onboarding overview for `aphoria explain`.
|
||||
///
|
||||
/// Lightweight narrative: category counts, verification health, coverage snapshot.
|
||||
/// Takes pre-computed data so the caller handles scanning.
|
||||
pub fn generate_onboarding(
|
||||
claims: &[AuthoredClaim],
|
||||
verify_report: &VerifyReport,
|
||||
coverage_report: &CoverageReport,
|
||||
project_name: &str,
|
||||
format: &str,
|
||||
) -> String {
|
||||
match format {
|
||||
"json" => generate_onboarding_json(claims, verify_report, coverage_report, project_name),
|
||||
_ => generate_onboarding_markdown(claims, verify_report, coverage_report, project_name),
|
||||
}
|
||||
}
|
||||
|
||||
/// Generate comprehensive reference docs for `aphoria docs generate`.
|
||||
///
|
||||
/// Full claim details + verification results + coverage table.
|
||||
/// Takes pre-computed data so the caller handles scanning.
|
||||
pub fn generate_full_docs(
|
||||
claims: &[AuthoredClaim],
|
||||
verify_report: &VerifyReport,
|
||||
coverage_report: &CoverageReport,
|
||||
project_name: &str,
|
||||
format: &str,
|
||||
) -> String {
|
||||
match format {
|
||||
"json" => generate_full_docs_json(claims, verify_report, coverage_report, project_name),
|
||||
_ => generate_full_docs_markdown(claims, verify_report, coverage_report, project_name),
|
||||
}
|
||||
}
|
||||
|
||||
// --- Onboarding (aphoria explain) ---
|
||||
|
||||
fn generate_onboarding_markdown(
|
||||
claims: &[AuthoredClaim],
|
||||
verify_report: &VerifyReport,
|
||||
coverage_report: &CoverageReport,
|
||||
project_name: &str,
|
||||
) -> String {
|
||||
let mut out = String::new();
|
||||
|
||||
out.push_str(&format!("# {project_name} — Claim Overview\n\n"));
|
||||
|
||||
// Category summary
|
||||
let categories = group_by_category(claims);
|
||||
let active_count = claims.iter().filter(|c| c.status == ClaimStatus::Active).count();
|
||||
|
||||
out.push_str(&format!(
|
||||
"**{project_name}** has **{active_count}** active claims across **{}** categories.\n\n",
|
||||
categories.len()
|
||||
));
|
||||
|
||||
if !categories.is_empty() {
|
||||
out.push_str("## Categories\n\n");
|
||||
out.push_str("| Category | Active | Total |\n");
|
||||
out.push_str("|----------|--------|-------|\n");
|
||||
for (cat, cat_claims) in &categories {
|
||||
let active = cat_claims.iter().filter(|c| c.status == ClaimStatus::Active).count();
|
||||
out.push_str(&format!("| {} | {} | {} |\n", capitalize(cat), active, cat_claims.len()));
|
||||
}
|
||||
out.push('\n');
|
||||
}
|
||||
|
||||
// Verification health
|
||||
let summary = &verify_report.summary;
|
||||
out.push_str("## Verification Health\n\n");
|
||||
out.push_str(&format!(
|
||||
"- **Pass:** {}\n- **Conflict:** {}\n- **Missing:** {}\n- **Unclaimed observations:** {}\n\n",
|
||||
summary.pass, summary.conflict, summary.missing, summary.unclaimed,
|
||||
));
|
||||
|
||||
if summary.conflict > 0 {
|
||||
out.push_str("*Conflicts indicate code behavior differs from authored claims.*\n\n");
|
||||
}
|
||||
|
||||
// Coverage snapshot
|
||||
let cov = &coverage_report.summary;
|
||||
out.push_str("## Coverage Snapshot\n\n");
|
||||
out.push_str(&format!(
|
||||
"- **Claimed:** {:.1}% of {} observations\n",
|
||||
cov.claimed_percentage, cov.total_observations,
|
||||
));
|
||||
out.push_str(&format!(
|
||||
"- **Modules with claims:** {} / {}\n",
|
||||
cov.modules_with_claims,
|
||||
cov.modules_with_claims + cov.modules_without_claims,
|
||||
));
|
||||
|
||||
// Top uncovered modules
|
||||
let uncovered: Vec<_> = coverage_report
|
||||
.modules
|
||||
.iter()
|
||||
.filter(|m| m.claim_count == 0 && m.observation_count > 0)
|
||||
.take(5)
|
||||
.collect();
|
||||
|
||||
if !uncovered.is_empty() {
|
||||
out.push_str("\n**Top uncovered modules:**\n");
|
||||
for m in uncovered {
|
||||
out.push_str(&format!(
|
||||
"- `{}` ({} observations)\n",
|
||||
m.module_path, m.observation_count,
|
||||
));
|
||||
}
|
||||
}
|
||||
|
||||
out.push_str("\n---\n");
|
||||
out.push_str("Run `aphoria claims explain` for full claim details.\n");
|
||||
out.push_str("Run `aphoria docs generate` for comprehensive reference documentation.\n");
|
||||
|
||||
out
|
||||
}
|
||||
|
||||
fn generate_onboarding_json(
|
||||
claims: &[AuthoredClaim],
|
||||
verify_report: &VerifyReport,
|
||||
coverage_report: &CoverageReport,
|
||||
project_name: &str,
|
||||
) -> String {
|
||||
let categories = group_by_category(claims);
|
||||
let active_count = claims.iter().filter(|c| c.status == ClaimStatus::Active).count();
|
||||
|
||||
let cat_summary: Vec<serde_json::Value> = categories
|
||||
.iter()
|
||||
.map(|(cat, cat_claims)| {
|
||||
let active = cat_claims.iter().filter(|c| c.status == ClaimStatus::Active).count();
|
||||
serde_json::json!({
|
||||
"category": cat,
|
||||
"active": active,
|
||||
"total": cat_claims.len(),
|
||||
})
|
||||
})
|
||||
.collect();
|
||||
|
||||
let json = serde_json::json!({
|
||||
"project": project_name,
|
||||
"type": "onboarding",
|
||||
"active_claims": active_count,
|
||||
"categories": cat_summary,
|
||||
"verification": {
|
||||
"pass": verify_report.summary.pass,
|
||||
"conflict": verify_report.summary.conflict,
|
||||
"missing": verify_report.summary.missing,
|
||||
"unclaimed": verify_report.summary.unclaimed,
|
||||
},
|
||||
"coverage": {
|
||||
"claimed_percentage": coverage_report.summary.claimed_percentage,
|
||||
"total_observations": coverage_report.summary.total_observations,
|
||||
"modules_with_claims": coverage_report.summary.modules_with_claims,
|
||||
"modules_without_claims": coverage_report.summary.modules_without_claims,
|
||||
},
|
||||
});
|
||||
|
||||
serde_json::to_string_pretty(&json).unwrap_or_else(|_| "{}".to_string())
|
||||
}
|
||||
|
||||
// --- Full docs (aphoria docs generate) ---
|
||||
|
||||
fn generate_full_docs_markdown(
|
||||
claims: &[AuthoredClaim],
|
||||
verify_report: &VerifyReport,
|
||||
coverage_report: &CoverageReport,
|
||||
project_name: &str,
|
||||
) -> String {
|
||||
let mut out = String::new();
|
||||
|
||||
out.push_str(&format!("# {project_name} — Reference Documentation\n\n"));
|
||||
|
||||
// Section 1: Full claim details (reuse claims_explain)
|
||||
out.push_str(&crate::claims_explain::render_claims_markdown(claims, project_name));
|
||||
out.push('\n');
|
||||
|
||||
// Section 2: Verification results
|
||||
out.push_str("---\n\n");
|
||||
out.push_str("# Verification Results\n\n");
|
||||
|
||||
let summary = &verify_report.summary;
|
||||
out.push_str(&format!(
|
||||
"**Total:** {} claims verified — {} pass, {} conflict, {} missing, {} unclaimed observations\n\n",
|
||||
summary.total_claims, summary.pass, summary.conflict, summary.missing, summary.unclaimed,
|
||||
));
|
||||
|
||||
// Group verify results by verdict
|
||||
let mut conflicts = Vec::new();
|
||||
let mut missing = Vec::new();
|
||||
for result in &verify_report.results {
|
||||
match result.verdict {
|
||||
AuditVerdict::Conflict => conflicts.push(result),
|
||||
AuditVerdict::Missing => missing.push(result),
|
||||
_ => {}
|
||||
}
|
||||
}
|
||||
|
||||
if !conflicts.is_empty() {
|
||||
out.push_str("## Conflicts\n\n");
|
||||
for r in &conflicts {
|
||||
if let Some(ref claim) = r.claim {
|
||||
out.push_str(&format!("- **{}**: {}\n", claim.id, r.explanation));
|
||||
}
|
||||
}
|
||||
out.push('\n');
|
||||
}
|
||||
|
||||
if !missing.is_empty() {
|
||||
out.push_str("## Missing Observations\n\n");
|
||||
for r in &missing {
|
||||
if let Some(ref claim) = r.claim {
|
||||
out.push_str(&format!("- **{}**: {}\n", claim.id, r.explanation));
|
||||
}
|
||||
}
|
||||
out.push('\n');
|
||||
}
|
||||
|
||||
// Section 3: Coverage table
|
||||
out.push_str("---\n\n");
|
||||
out.push_str(&crate::coverage::format_coverage_markdown(coverage_report));
|
||||
|
||||
out
|
||||
}
|
||||
|
||||
fn generate_full_docs_json(
    claims: &[AuthoredClaim],
    verify_report: &VerifyReport,
    coverage_report: &CoverageReport,
    project_name: &str,
) -> String {
    let claims_json: Vec<serde_json::Value> = claims
        .iter()
        .map(|c| {
            serde_json::json!({
                "id": c.id,
                "concept_path": c.concept_path,
                "predicate": c.predicate,
                "value": format!("{}", c.value),
                "provenance": c.provenance,
                "invariant": c.invariant,
                "consequence": c.consequence,
                "authority_tier": c.authority_tier,
                "category": c.category,
                "status": format!("{:?}", c.status),
            })
        })
        .collect();

    let verify_json: Vec<serde_json::Value> = verify_report
        .results
        .iter()
        .filter(|r| r.claim.is_some())
        .map(|r| {
            let claim = r.claim.as_ref().unwrap_or_else(|| {
                // Safety: filtered above
                unreachable!()
            });
            serde_json::json!({
                "claim_id": claim.id,
                "verdict": format!("{}", r.verdict),
                "explanation": r.explanation,
                "matching_observations": r.matching_observations.len(),
            })
        })
        .collect();

    let json = serde_json::json!({
        "project": project_name,
        "type": "full_docs",
        "claims": claims_json,
        "verification": {
            "summary": {
                "total_claims": verify_report.summary.total_claims,
                "pass": verify_report.summary.pass,
                "conflict": verify_report.summary.conflict,
                "missing": verify_report.summary.missing,
                "unclaimed": verify_report.summary.unclaimed,
            },
            "results": verify_json,
        },
        "coverage": coverage_report,
    });

    serde_json::to_string_pretty(&json).unwrap_or_else(|_| "{}".to_string())
}

// --- Helpers ---

fn group_by_category(claims: &[AuthoredClaim]) -> BTreeMap<String, Vec<&AuthoredClaim>> {
    let mut categories: BTreeMap<String, Vec<&AuthoredClaim>> = BTreeMap::new();
    for claim in claims {
        categories.entry(claim.category.clone()).or_default().push(claim);
    }
    categories
}

fn capitalize(s: &str) -> String {
    let mut chars = s.chars();
    match chars.next() {
        None => String::new(),
        Some(c) => c.to_uppercase().collect::<String>() + chars.as_str(),
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::types::authored_claim::{AuthoredValue, ComparisonMode};
    use crate::verify::{VerifyResult, VerifySummary};
    use crate::coverage::{CoverageSummary, ModuleCoverage};

    fn sample_claim(id: &str, category: &str) -> AuthoredClaim {
        AuthoredClaim {
            id: id.to_string(),
            concept_path: "test/concept".to_string(),
            predicate: "test_pred".to_string(),
            value: AuthoredValue::Text("test_value".to_string()),
            comparison: ComparisonMode::default(),
            provenance: "Test provenance".to_string(),
            invariant: "Test invariant".to_string(),
            consequence: "Bad things happen".to_string(),
            authority_tier: "expert".to_string(),
            evidence: vec![],
            category: category.to_string(),
            status: ClaimStatus::Active,
            supersedes: None,
            created_by: "tester".to_string(),
            created_at: "2026-02-08".to_string(),
            updated_at: None,
        }
    }

    fn empty_verify_report() -> VerifyReport {
        VerifyReport {
            results: vec![],
            summary: VerifySummary::default(),
        }
    }

    fn empty_coverage_report() -> CoverageReport {
        CoverageReport {
            project: "test".to_string(),
            modules: vec![],
            summary: CoverageSummary {
                total_observations: 0,
                total_claims: 0,
                claimed_percentage: 0.0,
                unclaimed_count: 0,
                modules_with_claims: 0,
                modules_without_claims: 0,
            },
        }
    }

    fn sample_verify_report() -> VerifyReport {
        VerifyReport {
            results: vec![
                VerifyResult {
                    claim: Some(sample_claim("c1", "safety")),
                    verdict: AuditVerdict::Pass,
                    matching_observations: vec![],
                    explanation: "Matches".to_string(),
                },
                VerifyResult {
                    claim: Some(sample_claim("c2", "architecture")),
                    verdict: AuditVerdict::Conflict,
                    matching_observations: vec![],
                    explanation: "Value mismatch".to_string(),
                },
            ],
            summary: VerifySummary {
                total_claims: 2,
                pass: 1,
                conflict: 1,
                missing: 0,
                unclaimed: 3,
            },
        }
    }

    fn sample_coverage_report() -> CoverageReport {
        CoverageReport {
            project: "test".to_string(),
            modules: vec![
                ModuleCoverage {
                    module_path: "tls".to_string(),
                    files: vec!["src/tls/config.rs".to_string()],
                    observation_count: 5,
                    claim_count: 2,
                    claimed_observations: 3,
                    unclaimed_observations: 2,
                    missing_claims: 0,
                    density: 0.4,
                },
                ModuleCoverage {
                    module_path: "auth".to_string(),
                    files: vec!["src/auth/jwt.rs".to_string()],
                    observation_count: 3,
                    claim_count: 0,
                    claimed_observations: 0,
                    unclaimed_observations: 3,
                    missing_claims: 0,
                    density: 0.0,
                },
            ],
            summary: CoverageSummary {
                total_observations: 8,
                total_claims: 2,
                claimed_percentage: 37.5,
                unclaimed_count: 5,
                modules_with_claims: 1,
                modules_without_claims: 1,
            },
        }
    }

    #[test]
    fn test_onboarding_has_category_table() {
        let claims = vec![
            sample_claim("s1", "safety"),
            sample_claim("a1", "architecture"),
            sample_claim("s2", "safety"),
        ];
        let out = generate_onboarding(&claims, &empty_verify_report(), &empty_coverage_report(), "myproject", "markdown");
        assert!(out.contains("# myproject"));
        assert!(out.contains("3** active claims"));
        assert!(out.contains("| Safety"));
        assert!(out.contains("| Architecture"));
    }

    #[test]
    fn test_onboarding_shows_verification_health() {
        let claims = vec![sample_claim("c1", "safety")];
        let vr = sample_verify_report();
        let out = generate_onboarding(&claims, &vr, &empty_coverage_report(), "proj", "markdown");
        assert!(out.contains("**Pass:** 1"));
        assert!(out.contains("**Conflict:** 1"));
        assert!(out.contains("Conflicts indicate"));
    }

    #[test]
    fn test_onboarding_shows_coverage_snapshot() {
        let claims = vec![sample_claim("c1", "safety")];
        let cr = sample_coverage_report();
        let out = generate_onboarding(&claims, &empty_verify_report(), &cr, "proj", "markdown");
        assert!(out.contains("37.5%"));
        assert!(out.contains("`auth`"));
    }

    #[test]
    fn test_onboarding_pointers() {
        let out = generate_onboarding(&[], &empty_verify_report(), &empty_coverage_report(), "proj", "markdown");
        assert!(out.contains("aphoria claims explain"));
        assert!(out.contains("aphoria docs generate"));
    }

    #[test]
    fn test_full_docs_includes_claim_details() {
        let claims = vec![sample_claim("c1", "safety")];
        let out = generate_full_docs(&claims, &empty_verify_report(), &empty_coverage_report(), "proj", "markdown");
        // Should contain per-claim fields from claims_explain
        assert!(out.contains("**Concept:**"));
        assert!(out.contains("**Invariant:**"));
    }

    #[test]
    fn test_full_docs_includes_verify_results() {
        let claims = vec![sample_claim("c1", "safety")];
        let vr = sample_verify_report();
        let out = generate_full_docs(&claims, &vr, &empty_coverage_report(), "proj", "markdown");
        assert!(out.contains("# Verification Results"));
        assert!(out.contains("## Conflicts"));
        assert!(out.contains("Value mismatch"));
    }

    #[test]
    fn test_full_docs_includes_coverage_table() {
        let claims = vec![sample_claim("c1", "safety")];
        let cr = sample_coverage_report();
        let out = generate_full_docs(&claims, &empty_verify_report(), &cr, "proj", "markdown");
        assert!(out.contains("# Aphoria Coverage:"));
        assert!(out.contains("Coverage Gaps"));
    }

    #[test]
    fn test_onboarding_and_full_docs_differ() {
        let claims = vec![sample_claim("c1", "safety")];
        let vr = sample_verify_report();
        let cr = sample_coverage_report();
        let onboarding = generate_onboarding(&claims, &vr, &cr, "proj", "markdown");
        let full_docs = generate_full_docs(&claims, &vr, &cr, "proj", "markdown");
        assert_ne!(onboarding, full_docs);
        // Onboarding should NOT have per-claim concept details
        assert!(!onboarding.contains("**Concept:**"));
        // Full docs should NOT have the "Run `aphoria claims explain`" pointer
        assert!(!full_docs.contains("Run `aphoria claims explain`"));
    }

    #[test]
    fn test_onboarding_json() {
        let claims = vec![sample_claim("c1", "safety")];
        let out = generate_onboarding(&claims, &empty_verify_report(), &empty_coverage_report(), "proj", "json");
        let parsed: serde_json::Value = serde_json::from_str(&out).unwrap_or_default();
        assert_eq!(parsed["type"], "onboarding");
        assert_eq!(parsed["active_claims"], 1);
    }

    #[test]
    fn test_full_docs_json() {
        let claims = vec![sample_claim("c1", "safety")];
        let out = generate_full_docs(&claims, &empty_verify_report(), &empty_coverage_report(), "proj", "json");
        let parsed: serde_json::Value = serde_json::from_str(&out).unwrap_or_default();
        assert_eq!(parsed["type"], "full_docs");
        assert!(parsed["claims"].is_array());
        assert!(parsed["verification"].is_object());
        assert!(parsed["coverage"].is_object());
    }
}
@@ -9,7 +9,7 @@ use regex::Regex;
 use stemedb_core::types::ObjectValue;

 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for API key security configuration.
 ///
@@ -129,7 +129,7 @@ impl Extractor for ApiKeySecurityExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();
         let confidence = self.confidence_for_file(file);

@@ -142,7 +142,7 @@ impl Extractor for ApiKeySecurityExtractor {
                 concept_path.push("api".to_string());
                 concept_path.push("auth".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "require_api_key".to_string(),
                     value: ObjectValue::Boolean(false),
@@ -164,7 +164,7 @@ impl Extractor for ApiKeySecurityExtractor {
                 concept_path.push("api".to_string());
                 concept_path.push("auth".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "public_paths_count".to_string(),
                     value: ObjectValue::Number(count as f64),
@@ -185,7 +185,7 @@ impl Extractor for ApiKeySecurityExtractor {
                 concept_path.push("api".to_string());
                 concept_path.push("rate_limit".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "using_default".to_string(),
                     value: ObjectValue::Boolean(true),

@@ -12,7 +12,7 @@ use stemedb_core::types::ObjectValue;

 use super::traits::build_claim;
 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for ASP.NET Core security misconfigurations.
 pub struct AspNetSecurityExtractor {
@@ -90,7 +90,7 @@ impl AspNetSecurityExtractor {
         path_segments: &[String],
         content: &str,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         for (line_idx, line) in content.lines().enumerate() {
@@ -182,7 +182,7 @@ impl AspNetSecurityExtractor {
         path_segments: &[String],
         content: &str,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         // Multi-line: CORS AllowAnyOrigin with AllowCredentials
@@ -364,7 +364,7 @@ impl Extractor for AspNetSecurityExtractor {
         content: &str,
         language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         // Check if this looks like an ASP.NET file

@@ -12,7 +12,7 @@ use regex::Regex;
 use stemedb_core::types::ObjectValue;

 use super::{build_claim, Extractor};
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for authentication bypass patterns.
 ///
@@ -90,7 +90,7 @@ impl AuthBypassExtractor {
         matched_text: &str,
         bypass_type: &str,
         description: &str,
-    ) -> ExtractedClaim {
+    ) -> Observation {
         build_claim(
             path_segments,
             &["auth", "bypass", bypass_type],
@@ -126,7 +126,7 @@ impl Extractor for AuthBypassExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         for (line_idx, line) in content.lines().enumerate() {

@@ -8,7 +8,7 @@ use regex::Regex;
 use stemedb_core::types::ObjectValue;

 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for circuit breaker configuration.
 ///
@@ -71,7 +71,7 @@ impl Extractor for CircuitBreakerConfigExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();
         let confidence = self.confidence_for_file(file);

@@ -84,7 +84,7 @@ impl Extractor for CircuitBreakerConfigExtractor {
                 concept_path.push("api".to_string());
                 concept_path.push("circuit_breaker".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "enabled".to_string(),
                     value: ObjectValue::Boolean(false),
@@ -102,7 +102,7 @@ impl Extractor for CircuitBreakerConfigExtractor {
                 concept_path.push("api".to_string());
                 concept_path.push("circuit_breaker".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "enabled".to_string(),
                     value: ObjectValue::Boolean(false),

@@ -7,7 +7,7 @@ use regex::Regex;
 use stemedb_core::types::ObjectValue;

 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for command injection vulnerabilities.
 ///
@@ -93,7 +93,7 @@ impl CommandInjectionExtractor {
         path_segments: &[String],
         file: &str,
         description: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         for (line_idx, line) in content.lines().enumerate() {
@@ -103,7 +103,7 @@ impl CommandInjectionExtractor {
                 concept_path.push("command".to_string());
                 concept_path.push("input".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "input_source".to_string(),
                     value: ObjectValue::Text("untrusted".to_string()),
@@ -126,7 +126,7 @@ impl CommandInjectionExtractor {
         path_segments: &[String],
         file: &str,
         description: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         for (line_idx, line) in content.lines().enumerate() {
@@ -135,7 +135,7 @@ impl CommandInjectionExtractor {
                 concept_path.push("os".to_string());
                 concept_path.push("shell_mode".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "enabled".to_string(),
                     value: ObjectValue::Boolean(true),
@@ -173,7 +173,7 @@ impl Extractor for CommandInjectionExtractor {
         content: &str,
         language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         match language {
@@ -260,6 +260,19 @@ impl Extractor for CommandInjectionExtractor {

         claims
     }
+
+    fn screening_patterns(&self) -> Vec<&str> {
+        vec![
+            r"Command::new",
+            r"exec\.Command",
+            r"os\.system",
+            r"os\.popen",
+            r"subprocess",
+            r"child_process",
+            r"execSync",
+            r"shell\s*[:=]\s*true",
+        ]
+    }
 }

 #[cfg(test)]

@@ -29,7 +29,7 @@ use stemedb_core::types::ObjectValue;
 use super::config_parser::{parse_config, walk_config, ConfigValue};
 use super::traits::is_test_file;
 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// A security rule that matches config paths and values.
 struct SecurityRule {
@@ -244,7 +244,7 @@ impl ConfigSecurityExtractor {
         config: &ConfigValue,
         path_segments: &[String],
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();
         let is_dev = Self::is_dev_config(file);
         let is_test = is_test_file(file);
@@ -265,7 +265,7 @@ impl ConfigSecurityExtractor {
                 // Reduce confidence for test files
                 let confidence = if is_test { rule.confidence * 0.5 } else { rule.confidence };

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: rule.predicate.to_string(),
                     value: rule.claim_value.clone(),
@@ -298,7 +298,7 @@ impl Extractor for ConfigSecurityExtractor {
         content: &str,
         language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         // Skip empty or very small files
         if content.trim().is_empty() || content.len() < 5 {
             return Vec::new();

@@ -10,7 +10,7 @@ use regex::Regex;
 use stemedb_core::types::ObjectValue;

 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for Rust constant declarations.
 ///
@@ -81,7 +81,7 @@ impl Extractor for ConstDeclarationsExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();
         let confidence = self.confidence_for_file(file);

@@ -100,7 +100,7 @@ impl Extractor for ConstDeclarationsExtractor {
                 concept_path.push("const".to_string());
                 concept_path.push(name.to_lowercase());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "value".to_string(),
                     value: ObjectValue::Text(cleaned_value.clone()),
@@ -124,7 +124,7 @@ impl Extractor for ConstDeclarationsExtractor {
                 concept_path.push("static".to_string());
                 concept_path.push(name.to_lowercase());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "value".to_string(),
                     value: ObjectValue::Text(cleaned_value.clone()),

@@ -7,7 +7,7 @@ use regex::Regex;
 use stemedb_core::types::ObjectValue;

 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for CORS configuration issues.
 pub struct CorsConfigExtractor {
@@ -69,7 +69,7 @@ impl Extractor for CorsConfigExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();
         let mut found_wildcard_origin = false;
         let mut wildcard_line = 0;
@@ -88,7 +88,7 @@ impl Extractor for CorsConfigExtractor {
                 concept_path.push("cors".to_string());
                 concept_path.push("allow_origin".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "config_value".to_string(),
                     value: ObjectValue::Text("*".to_string()),
@@ -108,7 +108,7 @@ impl Extractor for CorsConfigExtractor {
                 concept_path.push("cors".to_string());
                 concept_path.push("credentials_with_wildcard".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "enabled".to_string(),
                     value: ObjectValue::Boolean(true),
@@ -123,6 +123,22 @@ impl Extractor for CorsConfigExtractor {

         claims
     }
+
+    fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
+        vec![
+            ("cors/allow_origin", "config_value"),
+            ("cors/credentials_with_wildcard", "enabled"),
+        ]
+    }
+
+    fn screening_patterns(&self) -> Vec<&str> {
+        vec![
+            r"(?i)allow_origin|AllowAllOrigins|permissive",
+            r"(?i)Access-Control-Allow-Origin",
+            r"(?i)cors",
+            r"(?i)allow_credentials|AllowCredentials",
+        ]
+    }
 }

 #[cfg(test)]

@@ -5,7 +5,7 @@ use stemedb_core::types::ObjectValue;
 use super::parser::DeclarativeExtractor;
 use super::types::DeclarativeValue;
 use crate::extractors::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 impl Extractor for DeclarativeExtractor {
     fn name(&self) -> &str {
@@ -22,7 +22,7 @@ impl Extractor for DeclarativeExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         for (line_idx, line) in content.lines().enumerate() {
@@ -44,7 +44,7 @@ impl Extractor for DeclarativeExtractor {
                     DeclarativeValue::Text { value } => ObjectValue::Text(value.clone()),
                 };

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path,
                     predicate: self.def().claim.predicate.clone(),
                     value,
@@ -59,6 +59,10 @@ impl Extractor for DeclarativeExtractor {

         claims
     }
+
+    fn screening_patterns(&self) -> Vec<&str> {
+        vec![&self.def().pattern]
+    }
 }

 #[cfg(test)]

@@ -7,7 +7,7 @@ use regex::Regex;
 use stemedb_core::types::ObjectValue;

 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for vulnerable dependency versions.
 ///
@@ -59,7 +59,7 @@ impl DepVersionsExtractor {
         path_segments: &[String],
         content: &str,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();
         let mut in_dependencies = false;

@@ -95,7 +95,7 @@ impl DepVersionsExtractor {
                 concept_path.push(package.to_string());
                 concept_path.push("version".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "installed_version".to_string(),
                     value: ObjectValue::Text(version.to_string()),
@@ -118,7 +118,7 @@ impl DepVersionsExtractor {
         path_segments: &[String],
         content: &str,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         for (line_idx, line) in content.lines().enumerate() {
@@ -149,7 +149,7 @@ impl DepVersionsExtractor {
                 concept_path.push(package.to_string());
                 concept_path.push("version".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "installed_version".to_string(),
                     value: ObjectValue::Text(version.to_string()),
@@ -171,7 +171,7 @@ impl DepVersionsExtractor {
         path_segments: &[String],
         content: &str,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();
         let mut in_require = false;

@@ -200,7 +200,7 @@ impl DepVersionsExtractor {
                 concept_path.push(short_name.to_string());
                 concept_path.push("version".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "installed_version".to_string(),
                     value: ObjectValue::Text(version.to_string()),
@@ -223,7 +223,7 @@ impl DepVersionsExtractor {
         path_segments: &[String],
         content: &str,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         for (line_idx, line) in content.lines().enumerate() {
@@ -243,7 +243,7 @@ impl DepVersionsExtractor {
                 concept_path.push(package.to_string());
                 concept_path.push("version".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "installed_version".to_string(),
                     value: ObjectValue::Text(version.to_string()),
@@ -276,7 +276,7 @@ impl Extractor for DepVersionsExtractor {
         content: &str,
         language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         match language {
             Language::CargoManifest => self.extract_cargo(path_segments, content, file),
             Language::NpmManifest => self.extract_npm(path_segments, content, file),

@@ -8,7 +8,7 @@ use regex::Regex;
 use stemedb_core::types::ObjectValue;

 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for Rust derive patterns.
 ///
@@ -102,7 +102,7 @@ impl Extractor for DerivePatternExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();
         let confidence = self.confidence_for_file(file);

@@ -138,7 +138,7 @@ impl Extractor for DerivePatternExtractor {
                 let mut sorted_derives = derives.clone();
                 sorted_derives.sort();

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "traits".to_string(),
                     value: ObjectValue::Text(sorted_derives.join(",")),

@@ -13,7 +13,7 @@ use stemedb_core::types::ObjectValue;

 use super::traits::build_claim;
 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for Django security misconfigurations.
 pub struct DjangoSecurityExtractor {
@@ -100,7 +100,7 @@ impl DjangoSecurityExtractor {
         path_segments: &[String],
         content: &str,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         for (line_idx, line) in content.lines().enumerate() {
@@ -295,7 +295,7 @@ impl DjangoSecurityExtractor {
         path_segments: &[String],
         content: &str,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         for (line_idx, line) in content.lines().enumerate() {
@@ -414,7 +414,7 @@ impl Extractor for DjangoSecurityExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         // Check if this looks like a Django file

@@ -7,7 +7,7 @@ use regex::Regex;
 use stemedb_core::types::ObjectValue;

 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for durability configuration.
 ///
@@ -97,7 +97,7 @@ impl Extractor for DurabilityConfigExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();
         let confidence = self.confidence_for_file(file);

@@ -113,7 +113,7 @@ impl Extractor for DurabilityConfigExtractor {
                 concept_path.push("wal".to_string());
                 concept_path.push("durability".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "strategy".to_string(),
                     value: ObjectValue::Text(normalized.to_string()),
@@ -134,7 +134,7 @@ impl Extractor for DurabilityConfigExtractor {
                 concept_path.push("wal".to_string());
                 concept_path.push("durability".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "strategy".to_string(),
                     value: ObjectValue::Text(normalized.to_string()),
@@ -155,7 +155,7 @@ impl Extractor for DurabilityConfigExtractor {
                 concept_path.push("wal".to_string());
                 concept_path.push("durability".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "strategy".to_string(),
                     value: ObjectValue::Text(normalized.to_string()),
@@ -173,7 +173,7 @@ impl Extractor for DurabilityConfigExtractor {
                 concept_path.push("wal".to_string());
                 concept_path.push("durability".to_string());

-                claims.push(ExtractedClaim {
+                claims.push(Observation {
                     concept_path: format!("code://{}", concept_path.join("/")),
                     predicate: "strategy".to_string(),
                     value: ObjectValue::Text("batched".to_string()),

@@ -12,7 +12,7 @@ use stemedb_core::types::ObjectValue;

 use super::traits::build_claim;
 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for Express.js security misconfigurations.
 #[allow(dead_code)]
@@ -117,7 +117,7 @@ impl Extractor for ExpressSecurityExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         // Check if this looks like an Express.js file

@@ -11,7 +11,7 @@ use stemedb_core::types::ObjectValue;

 use super::traits::build_claim;
 use super::Extractor;
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};

 /// Extractor for FastAPI security misconfigurations.
 #[allow(dead_code)]
@@ -90,7 +90,7 @@ impl Extractor for FastApiSecurityExtractor {
         content: &str,
         _language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
         let mut claims = Vec::new();

         // Check if this looks like a FastAPI file

@ -12,7 +12,7 @@ use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::build_claim;
|
||||
use super::Extractor;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for Flask security misconfigurations.
|
||||
#[allow(dead_code)]
|
||||
@ -100,7 +100,7 @@ impl Extractor for FlaskSecurityExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
// Check if this looks like a Flask file
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for hardcoded secrets in source code.
|
||||
pub struct HardcodedSecretsExtractor {
|
||||
@ -92,7 +92,7 @@ impl HardcodedSecretsExtractor {
|
||||
matched_text: &str,
|
||||
leaf: &str,
|
||||
description: &str,
|
||||
) -> ExtractedClaim {
|
||||
) -> Observation {
|
||||
let mut concept_path = path_segments.to_vec();
|
||||
concept_path.push("secrets".to_string());
|
||||
concept_path.push(leaf.to_string());
|
||||
@ -100,7 +100,7 @@ impl HardcodedSecretsExtractor {
|
||||
// Lower confidence for test files
|
||||
let confidence = if self.is_test_file(file) { 0.5 } else { 1.0 };
|
||||
|
||||
ExtractedClaim {
|
||||
Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "storage_method".to_string(),
|
||||
value: ObjectValue::Text("hardcoded".to_string()),
|
||||
@ -138,7 +138,7 @@ impl Extractor for HardcodedSecretsExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -223,6 +223,26 @@ impl Extractor for HardcodedSecretsExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
|
||||
fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
|
||||
vec![
|
||||
("secrets/api_key", "storage_method"),
|
||||
("secrets/password", "storage_method"),
|
||||
("secrets/aws_credentials", "storage_method"),
|
||||
("secrets/private_key", "storage_method"),
|
||||
("secrets/secret_token", "storage_method"),
|
||||
]
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"(?i)api[_-]?key",
|
||||
r"(?i)password|passwd|pwd",
|
||||
r"AKIA[0-9A-Z]",
|
||||
r"-----BEGIN.*PRIVATE KEY",
|
||||
r"(?i)secret|token|auth[_-]?key|client[_-]?secret",
|
||||
]
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -14,7 +14,7 @@ use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::{build_claim, Extractor};
|
||||
use crate::config::EntropyConfig;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
use entropy::{charset_variety, shannon_entropy};
|
||||
use patterns::{classify_known_secret, is_likely_not_secret, SecretPatterns};
|
||||
@ -67,7 +67,7 @@ impl HighEntropySecretsExtractor {
|
||||
secret_type: &str,
|
||||
description: &str,
|
||||
base_confidence: f32,
|
||||
) -> ExtractedClaim {
|
||||
) -> Observation {
|
||||
build_claim(
|
||||
path_segments,
|
||||
&["secrets", secret_type],
|
||||
@ -107,7 +107,7 @@ impl Extractor for HighEntropySecretsExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
|
||||
@ -52,7 +52,7 @@ use regex::Regex;
|
||||
/// Parses ignore comments from file content and tracks ignored line numbers.
|
||||
#[derive(Debug)]
|
||||
pub struct IgnoreCommentParser {
|
||||
/// Lines that should be ignored (1-indexed to match ExtractedClaim.line).
|
||||
/// Lines that should be ignored (1-indexed to match Observation.line).
|
||||
ignored_lines: HashSet<usize>,
|
||||
}
|
||||
|
||||
@ -108,7 +108,7 @@ impl IgnoreCommentParser {
|
||||
|
||||
/// Check if a line number should be ignored.
|
||||
///
|
||||
/// Line numbers are 1-indexed (matching ExtractedClaim.line).
|
||||
/// Line numbers are 1-indexed (matching Observation.line).
|
||||
pub fn is_ignored(&self, line: usize) -> bool {
|
||||
self.ignored_lines.contains(&line)
|
||||
}
|
||||
|
||||
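The doc comments above say ignored lines are 1-indexed so they line up with `Observation.line`. The sketch below illustrates that indexing with a simplified stand-in type. It assumes a marker suppresses findings on its own line only and matches the marker with a plain substring check, whereas the real parser imports `regex::Regex` and may support other comment forms.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for `IgnoreCommentParser` (simplified for illustration).
#[derive(Debug)]
pub struct IgnoreLines {
    /// 1-indexed line numbers that carry an ignore marker.
    ignored_lines: HashSet<usize>,
}

impl IgnoreLines {
    /// Assumption: `aphoria:ignore` anywhere on a line ignores that line only.
    pub fn parse(content: &str) -> Self {
        let ignored_lines = content
            .lines()
            .enumerate()
            .filter(|(_, line)| line.contains("aphoria:ignore"))
            // enumerate() counts from 0; stored line numbers count from 1.
            .map(|(idx, _)| idx + 1)
            .collect();
        Self { ignored_lines }
    }

    /// Line numbers are 1-indexed, so 0 is never ignored.
    pub fn is_ignored(&self, line: usize) -> bool {
        self.ignored_lines.contains(&line)
    }
}
```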
@ -8,7 +8,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for Rust import patterns.
|
||||
///
|
||||
@ -99,7 +99,7 @@ impl Extractor for ImportGraphExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
let confidence = self.confidence_for_file(file);
|
||||
|
||||
@ -118,7 +118,7 @@ impl Extractor for ImportGraphExtractor {
|
||||
concept_path.push("imports".to_string());
|
||||
concept_path.push(crate_name.clone());
|
||||
|
||||
claims.push(ExtractedClaim {
|
||||
claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "imported".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
@ -134,6 +134,10 @@ impl Extractor for ImportGraphExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
|
||||
fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
|
||||
vec![("imports/*", "imported")]
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
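`verifiable_predicates` returns `(path pattern, predicate)` pairs such as `("imports/*", "imported")`. The commit excerpt does not show how these pairs are matched against observations, so the following is only a sketch under stated assumptions: a pattern is compared against the tail of the `code://` concept path, and a trailing `/*` stands for exactly one leaf segment. The function name `matches_verifiable` is hypothetical.

```rust
/// Hypothetical matcher (not in the commit). Assumes patterns match the tail
/// of the concept path and that a trailing `/*` means "exactly one segment".
fn matches_verifiable(concept_path: &str, predicate: &str, pair: (&str, &str)) -> bool {
    let (pattern, expected_predicate) = pair;
    if predicate != expected_predicate {
        return false;
    }
    match pattern.strip_suffix("/*") {
        Some(prefix) => {
            // Wildcard: find the last "/prefix/" and require a single leaf after it.
            let needle = format!("/{prefix}/");
            match concept_path.rfind(&needle) {
                Some(pos) => {
                    let leaf = &concept_path[pos + needle.len()..];
                    !leaf.is_empty() && !leaf.contains('/')
                }
                None => false,
            }
        }
        // Exact pattern: the path must end with "/pattern".
        None => concept_path.ends_with(&format!("/{pattern}")),
    }
}
```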
@ -13,7 +13,7 @@ use stemedb_core::types::ObjectValue;
|
||||
|
||||
use self::patterns::CookiePatterns;
|
||||
use super::{build_claim, Extractor};
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for insecure cookie configuration patterns.
|
||||
///
|
||||
@ -42,7 +42,7 @@ impl InsecureCookiesExtractor {
|
||||
matched_text: &str,
|
||||
issue_type: &str,
|
||||
description: &str,
|
||||
) -> ExtractedClaim {
|
||||
) -> Observation {
|
||||
build_claim(
|
||||
path_segments,
|
||||
&["cookies", issue_type],
|
||||
@ -79,7 +79,7 @@ impl Extractor for InsecureCookiesExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::{build_claim, Extractor};
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for insecure deserialization vulnerabilities.
|
||||
///
|
||||
@ -89,7 +89,7 @@ impl InsecureDeserializationExtractor {
|
||||
method: &str,
|
||||
confidence: f32,
|
||||
description: &str,
|
||||
) -> ExtractedClaim {
|
||||
) -> Observation {
|
||||
build_claim(
|
||||
path_segments,
|
||||
&["serialization", "deserialization"],
|
||||
@ -119,7 +119,7 @@ impl Extractor for InsecureDeserializationExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for JWT validation configuration.
|
||||
pub struct JwtConfigExtractor {
|
||||
@ -93,12 +93,12 @@ impl JwtConfigExtractor {
|
||||
value: ObjectValue,
|
||||
description: &str,
|
||||
confidence: f32,
|
||||
) -> ExtractedClaim {
|
||||
) -> Observation {
|
||||
let mut concept_path = path_segments.to_vec();
|
||||
concept_path.push("jwt".to_string());
|
||||
concept_path.push(leaf.to_string());
|
||||
|
||||
ExtractedClaim {
|
||||
Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: predicate.to_string(),
|
||||
value,
|
||||
@ -135,7 +135,7 @@ impl Extractor for JwtConfigExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -237,6 +237,25 @@ impl Extractor for JwtConfigExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
|
||||
fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
|
||||
vec![
|
||||
("jwt/audience_validation", "enabled"),
|
||||
("jwt/algorithm_restriction", "config_value"),
|
||||
("jwt/signature_verification", "enabled"),
|
||||
("jwt/expiry_validation", "enabled"),
|
||||
]
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"(?i)jwt|jsonwebtoken|jose",
|
||||
r"(?i)validate_aud|set_audience|ValidateAudience|\baud\b",
|
||||
r"(?i)Algorithm::None|allow_none|SigningMethodNone|\balg\b.*none",
|
||||
r"(?i)dangerous_insecure|skip_signature|verify_signature|RequireSignedTokens",
|
||||
r"(?i)validate_exp|RequireExpirationTime|IgnoreExpiration",
|
||||
]
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -13,7 +13,7 @@ use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::build_claim;
|
||||
use super::Extractor;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for Laravel security misconfigurations.
|
||||
#[allow(dead_code)]
|
||||
@ -96,7 +96,7 @@ impl LaravelSecurityExtractor {
|
||||
path_segments: &[String],
|
||||
content: &str,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -171,7 +171,7 @@ impl LaravelSecurityExtractor {
|
||||
path_segments: &[String],
|
||||
content: &str,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -348,7 +348,7 @@ impl Extractor for LaravelSecurityExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
// Check if this looks like a Laravel file
|
||||
|
||||
@ -86,6 +86,7 @@ mod rails_security;
|
||||
mod rate_limit;
|
||||
mod registry;
|
||||
mod security_headers;
|
||||
mod self_audit;
|
||||
mod spring_security;
|
||||
mod sql_injection;
|
||||
mod ssrf;
|
||||
@ -137,6 +138,7 @@ pub use rails_security::RailsSecurityExtractor;
|
||||
pub use rate_limit::{RateLimitExtractor, RateLimitThresholds};
|
||||
pub use registry::ExtractorRegistry;
|
||||
pub use security_headers::SecurityHeadersExtractor;
|
||||
pub use self_audit::SelfAuditExtractor;
|
||||
pub use spring_security::SpringSecurityExtractor;
|
||||
pub use sql_injection::SqlInjectionExtractor;
|
||||
pub use ssrf::SsrfExtractor;
|
||||
|
||||
@ -12,7 +12,7 @@ use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::build_claim;
|
||||
use super::Extractor;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for NestJS security misconfigurations.
|
||||
#[allow(dead_code)]
|
||||
@ -106,7 +106,7 @@ impl Extractor for NestJsSecurityExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
// Check if this looks like a NestJS file
|
||||
|
||||
@ -12,7 +12,7 @@ use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::build_claim;
|
||||
use super::Extractor;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for Next.js security misconfigurations.
|
||||
#[allow(dead_code)]
|
||||
@ -97,7 +97,7 @@ impl Extractor for NextJsSecurityExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
// Check if this looks like a Next.js file
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::{build_claim, Extractor};
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for ORM-specific SQL injection vulnerabilities.
|
||||
///
|
||||
@ -90,7 +90,7 @@ impl OrmInjectionExtractor {
|
||||
matched: &str,
|
||||
orm: &str,
|
||||
description: &str,
|
||||
) -> ExtractedClaim {
|
||||
) -> Observation {
|
||||
build_claim(
|
||||
path_segments,
|
||||
&["db", "orm", "query"],
|
||||
@ -120,7 +120,7 @@ impl Extractor for OrmInjectionExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::{build_claim, Extractor};
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for path traversal vulnerabilities.
|
||||
///
|
||||
@ -96,7 +96,7 @@ impl PathTraversalExtractor {
|
||||
matched: &str,
|
||||
category: &str,
|
||||
description: &str,
|
||||
) -> ExtractedClaim {
|
||||
) -> Observation {
|
||||
build_claim(
|
||||
path_segments,
|
||||
&["filesystem", "path", category],
|
||||
@ -132,7 +132,7 @@ impl Extractor for PathTraversalExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -236,6 +236,18 @@ impl Extractor for PathTraversalExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"\.\./",
|
||||
r"os\.path\.join|os\.path",
|
||||
r"path\.join|path\.resolve",
|
||||
r"filepath\.Join|filepath\.Clean",
|
||||
r"Path::new|PathBuf",
|
||||
r"(?i)fs\.read|fs\.write|readFile|writeFile",
|
||||
r"open\(",
|
||||
]
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -12,7 +12,7 @@ use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::build_claim;
|
||||
use super::Extractor;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for Rails security misconfigurations.
|
||||
pub struct RailsSecurityExtractor {
|
||||
@ -92,7 +92,7 @@ impl RailsSecurityExtractor {
|
||||
path_segments: &[String],
|
||||
content: &str,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -199,7 +199,7 @@ impl RailsSecurityExtractor {
|
||||
path_segments: &[String],
|
||||
content: &str,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -393,7 +393,7 @@ impl Extractor for RailsSecurityExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
// Check if this looks like a Rails file
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Configuration for rate limit thresholds.
|
||||
#[derive(Debug, Clone)]
|
||||
@ -118,7 +118,7 @@ impl Extractor for RateLimitExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -130,7 +130,7 @@ impl Extractor for RateLimitExtractor {
|
||||
concept_path.push("rate_limit".to_string());
|
||||
concept_path.push("enabled".to_string());
|
||||
|
||||
claims.push(ExtractedClaim {
|
||||
claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(false),
|
||||
@ -157,7 +157,7 @@ impl Extractor for RateLimitExtractor {
|
||||
concept_path.push("rate_limit".to_string());
|
||||
concept_path.push("max_requests".to_string());
|
||||
|
||||
claims.push(ExtractedClaim {
|
||||
claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Number(per_minute as f64),
|
||||
|
||||
@ -1,9 +1,12 @@
|
||||
//! Extractor registry and collection logic.
|
||||
|
||||
use std::collections::HashMap;
|
||||
|
||||
use regex::RegexSet;
|
||||
use tracing::instrument;
|
||||
|
||||
use crate::config::AphoriaConfig;
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
use super::api_key_security::ApiKeySecurityExtractor;
|
||||
use super::aspnet_security::AspNetSecurityExtractor;
|
||||
@ -36,6 +39,7 @@ use super::path_traversal::PathTraversalExtractor;
|
||||
use super::rails_security::RailsSecurityExtractor;
|
||||
use super::rate_limit::RateLimitExtractor;
|
||||
use super::security_headers::SecurityHeadersExtractor;
|
||||
use super::self_audit::SelfAuditExtractor;
|
||||
use super::spring_security::SpringSecurityExtractor;
|
||||
use super::sql_injection::SqlInjectionExtractor;
|
||||
use super::ssrf::SsrfExtractor;
|
||||
@ -52,9 +56,22 @@ use super::weak_crypto::WeakCryptoExtractor;
|
||||
use super::weak_password::WeakPasswordExtractor;
|
||||
use super::xxe::XxeExtractor;
|
||||
|
||||
/// Pre-compiled RegexSet for a single language, mapping matched patterns back to extractor indices.
|
||||
struct ScreeningSet {
|
||||
regex_set: RegexSet,
|
||||
/// Maps RegexSet pattern index → extractor index in `ExtractorRegistry::extractors`.
|
||||
pattern_to_extractor: Vec<usize>,
|
||||
}
|
||||
|
||||
/// Registry of available extractors.
|
||||
pub struct ExtractorRegistry {
|
||||
extractors: Vec<Box<dyn Extractor>>,
|
||||
/// Extractor indices per language (precomputed from `languages()`).
|
||||
language_map: HashMap<Language, Vec<usize>>,
|
||||
/// Per-language RegexSet for pre-screening file content.
|
||||
screening: HashMap<Language, ScreeningSet>,
|
||||
/// Extractors with no screening patterns (always run for that language).
|
||||
always_run: HashMap<Language, Vec<usize>>,
|
||||
}
|
||||
|
||||
impl Default for ExtractorRegistry {
|
||||
@ -107,6 +124,9 @@ impl ExtractorRegistry {
|
||||
if is_enabled("dep_versions") && config.extractors.dep_versions.enabled {
|
||||
extractors.push(Box::new(DepVersionsExtractor::new()));
|
||||
}
|
||||
if is_enabled("self_audit") && config.extractors.self_audit.enabled {
|
||||
extractors.push(Box::new(SelfAuditExtractor::new()));
|
||||
}
|
||||
if is_enabled("cors_config") {
|
||||
extractors.push(Box::new(CorsConfigExtractor::new()));
|
||||
}
|
||||
@ -248,7 +268,14 @@ impl ExtractorRegistry {
             }
         }
 
-        Self { extractors }
+        let mut registry = Self {
+            extractors,
+            language_map: HashMap::new(),
+            screening: HashMap::new(),
+            always_run: HashMap::new(),
+        };
+        registry.rebuild_screening();
+        registry
     }
 
     /// Add declarative extractors from definitions.
@ -270,21 +297,22 @@ impl ExtractorRegistry {
                 }
             }
         }
+        self.rebuild_screening();
     }
 
     /// Get extractors applicable to a given language.
     pub fn for_language(&self, language: Language) -> Vec<&dyn Extractor> {
-        self.extractors
-            .iter()
-            .filter(|e| e.languages().contains(&language))
-            .map(|e| e.as_ref())
-            .collect()
+        match self.language_map.get(&language) {
+            Some(indices) => indices.iter().map(|&i| self.extractors[i].as_ref()).collect(),
+            None => vec![],
+        }
     }
 
     /// Extract claims from content using all applicable extractors.
     ///
-    /// This method also filters out claims on lines marked with `// aphoria:ignore`
-    /// or similar inline ignore comments. See [`IgnoreCommentParser`] for details.
+    /// Uses a `RegexSet` pre-screen to skip extractors whose patterns don't match
+    /// the file content. This method also filters out claims on lines marked with
+    /// `// aphoria:ignore` or similar inline ignore comments. See [`IgnoreCommentParser`].
     #[instrument(skip(self, path_segments, content), fields(file = %file, language = ?language))]
     pub fn extract_all(
         &self,
@ -292,13 +320,18 @@ impl ExtractorRegistry {
         content: &str,
         language: Language,
         file: &str,
-    ) -> Vec<ExtractedClaim> {
+    ) -> Vec<Observation> {
+        let selected = self.select_extractors(content, language);
+        if selected.is_empty() {
+            return vec![];
+        }
+
         // Parse inline ignore comments
         let ignore_parser = IgnoreCommentParser::parse(content);
 
-        self.for_language(language)
+        selected
             .iter()
-            .flat_map(|e| e.extract(path_segments, content, language, file))
+            .flat_map(|&i| self.extractors[i].extract(path_segments, content, language, file))
             .filter(|claim| !ignore_parser.is_ignored(claim.line))
             .collect()
     }
@ -307,6 +340,103 @@ impl ExtractorRegistry {
|
||||
pub fn extractor_names(&self) -> Vec<&str> {
|
||||
self.extractors.iter().map(|e| e.name()).collect()
|
||||
}
|
||||
|
||||
/// Get a reference to all registered extractors.
|
||||
pub fn extractors(&self) -> &[Box<dyn Extractor>] {
|
||||
&self.extractors
|
||||
}
|
||||
|
||||
/// Rebuild the language map, screening sets, and always-run lists from current extractors.
|
||||
fn rebuild_screening(&mut self) {
|
||||
self.language_map.clear();
|
||||
self.screening.clear();
|
||||
self.always_run.clear();
|
||||
|
||||
// Build language_map: language → Vec<extractor_index>
|
||||
for (idx, ext) in self.extractors.iter().enumerate() {
|
||||
for &lang in ext.languages() {
|
||||
self.language_map.entry(lang).or_default().push(idx);
|
||||
}
|
||||
}
|
||||
|
||||
// For each language, build a ScreeningSet and always_run list
|
||||
for (&lang, indices) in &self.language_map {
|
||||
let mut patterns: Vec<String> = Vec::new();
|
||||
let mut pattern_to_extractor: Vec<usize> = Vec::new();
|
||||
let mut always: Vec<usize> = Vec::new();
|
||||
|
||||
for &ext_idx in indices {
|
||||
let screening = self.extractors[ext_idx].screening_patterns();
|
||||
if screening.is_empty() {
|
||||
always.push(ext_idx);
|
||||
} else {
|
||||
for pat in screening {
|
||||
patterns.push(pat.to_string());
|
||||
pattern_to_extractor.push(ext_idx);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if !patterns.is_empty() {
|
||||
match RegexSet::new(&patterns) {
|
||||
Ok(regex_set) => {
|
||||
self.screening.insert(lang, ScreeningSet {
|
||||
regex_set,
|
||||
pattern_to_extractor,
|
||||
});
|
||||
}
|
||||
Err(e) => {
|
||||
tracing::warn!(
|
||||
language = %lang,
|
||||
error = %e,
|
||||
"Failed to compile screening RegexSet; all extractors will run"
|
||||
);
|
||||
// Fall back: treat all extractors for this language as always-run
|
||||
always.clear();
|
||||
for &ext_idx in indices {
|
||||
always.push(ext_idx);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if !always.is_empty() {
|
||||
self.always_run.insert(lang, always);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Select which extractors to run on the given content, using the RegexSet pre-screen.
|
||||
fn select_extractors(&self, content: &str, language: Language) -> Vec<usize> {
|
||||
let mut selected: Vec<usize> = Vec::new();
|
||||
|
||||
// Add always-run extractors
|
||||
if let Some(always) = self.always_run.get(&language) {
|
||||
selected.extend_from_slice(always);
|
||||
}
|
||||
|
||||
// Run the RegexSet pre-screen and add matched extractors
|
||||
if let Some(screening) = self.screening.get(&language) {
|
||||
for pat_idx in screening.regex_set.matches(content).iter() {
|
||||
let ext_idx = screening.pattern_to_extractor[pat_idx];
|
||||
selected.push(ext_idx);
|
||||
}
|
||||
}
|
||||
|
||||
// If no screening set exists for this language, all extractors are in always_run
|
||||
// (or language_map). If there's no always_run and no screening, check language_map.
|
||||
if selected.is_empty() && !self.screening.contains_key(&language) {
|
||||
if let Some(indices) = self.language_map.get(&language) {
|
||||
return indices.clone();
|
||||
}
|
||||
}
|
||||
|
||||
// Deduplicate and sort for deterministic order
|
||||
selected.sort_unstable();
|
||||
selected.dedup();
|
||||
|
||||
selected
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
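The selection step in `select_extractors` above combines two sources: extractors with no screening patterns (always run) and extractors with at least one pattern that matches the file content. The result is then sorted and deduplicated, since one extractor can own several patterns. The sketch below reproduces only that shape for a single language. It is a simplification, not the registry's code: plain substring checks stand in for the `RegexSet`, and the `Screen` type is hypothetical.

```rust
/// Hypothetical, simplified per-language screen. Substring checks stand in
/// for the `RegexSet` used by the real registry.
struct Screen {
    /// (pattern, extractor index) pairs, playing the role of `pattern_to_extractor`.
    patterns: Vec<(&'static str, usize)>,
    /// Extractors that declared no screening patterns.
    always_run: Vec<usize>,
}

/// Return the indices of the extractors worth running on `content`.
fn select_extractors(screen: &Screen, content: &str) -> Vec<usize> {
    // Always-run extractors are selected unconditionally.
    let mut selected = screen.always_run.clone();

    // Any matching pattern selects the extractor that owns it.
    for (pattern, ext_idx) in &screen.patterns {
        if content.contains(*pattern) {
            selected.push(*ext_idx);
        }
    }

    // One extractor may match through several patterns; keep each index once,
    // in a deterministic order.
    selected.sort_unstable();
    selected.dedup();
    selected
}
```

With a real `RegexSet`, one pass over the content reports every matching pattern at once, which is what makes the pre-screen cheaper than running each extractor's own regexes.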
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::{build_claim, Extractor};
|
||||
use crate::types::{ExtractedClaim, Language};
|
||||
use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for missing or disabled security headers.
|
||||
///
|
||||
@ -94,7 +94,7 @@ impl SecurityHeadersExtractor {
|
||||
matched: &str,
|
||||
header: &str,
|
||||
description: &str,
|
||||
) -> ExtractedClaim {
|
||||
) -> Observation {
|
||||
build_claim(
|
||||
path_segments,
|
||||
&["http", "security_headers", header],
|
||||
@ -132,7 +132,7 @@ impl Extractor for SecurityHeadersExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
) -> Vec<ExtractedClaim> {
|
||||
) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -238,6 +238,17 @@ impl Extractor for SecurityHeadersExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
|
||||
fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
|
||||
vec![
|
||||
("security_headers/x_frame_options", "header_status"),
|
||||
("security_headers/x_content_type_options", "header_status"),
|
||||
("security_headers/x_xss_protection", "header_status"),
|
||||
("security_headers/hsts", "header_status"),
|
||||
("security_headers/ssl_redirect", "header_status"),
|
||||
("security_headers/content_security_policy", "header_status"),
|
||||
]
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
302
applications/aphoria/src/extractors/self_audit.rs
Normal file
@ -0,0 +1,302 @@
|
||||
//! Self-audit meta-extractor for dogfooding Aphoria on its own codebase.
|
||||
//!
|
||||
//! Produces observations about Aphoria's own code patterns:
|
||||
//! - Bridge tier assignments
|
||||
//! - Parent hash usage
|
||||
//! - Lifecycle stage skipping
|
||||
//! - `.unwrap()` / `.expect()` usage count
|
||||
|
||||
use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
use crate::types::{Language, Observation};
|
||||
|
||||
/// Meta-extractor that audits Aphoria's own code patterns.
|
||||
///
|
||||
/// Opt-in only (like `dep_versions`). Registered with the name `self_audit`.
|
||||
pub struct SelfAuditExtractor {
|
||||
/// Matches: .unwrap() or .expect() calls
|
||||
unwrap_pattern: Regex,
|
||||
/// Matches: SourceClass:: usage for tier assignment
|
||||
source_class_pattern: Regex,
|
||||
/// Matches: parent_hash: None
|
||||
parent_hash_none: Regex,
|
||||
/// Matches: LifecycleStage::Approved
|
||||
lifecycle_approved: Regex,
|
||||
}
|
||||
|
||||
impl Default for SelfAuditExtractor {
|
||||
fn default() -> Self {
|
||||
Self::new()
|
||||
}
|
||||
}
|
||||
|
||||
impl SelfAuditExtractor {
|
||||
/// Create a new self-audit extractor.
|
||||
///
|
||||
/// # Panics
|
||||
/// Panics if any regex pattern is invalid (programmer error).
|
||||
#[allow(clippy::expect_used)]
|
||||
pub fn new() -> Self {
|
||||
Self {
|
||||
unwrap_pattern: Regex::new(r"\.(unwrap|expect)\(").expect("valid regex"),
|
||||
source_class_pattern: Regex::new(r"SourceClass::\w+").expect("valid regex"),
|
||||
parent_hash_none: Regex::new(r"parent_hash:\s*None").expect("valid regex"),
|
||||
lifecycle_approved: Regex::new(r"LifecycleStage::Approved").expect("valid regex"),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Extractor for SelfAuditExtractor {
|
||||
fn name(&self) -> &str {
|
||||
"self_audit"
|
||||
}
|
||||
|
||||
fn languages(&self) -> &[Language] {
|
||||
&[Language::Rust]
|
||||
}
|
||||
|
||||
fn extract(
|
||||
&self,
|
||||
path_segments: &[String],
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
) -> Vec<Observation> {
|
||||
let mut observations = Vec::new();
|
||||
|
||||
// Count unwrap/expect usage
|
||||
let mut unwrap_count: usize = 0;
|
||||
let lines: Vec<&str> = content.lines().collect();
|
||||
let mut in_test_module = false;
|
||||
|
||||
for (line_num, line) in lines.iter().enumerate() {
|
||||
let line_number = line_num + 1;
|
||||
|
||||
// Track #[cfg(test)] module boundaries
|
||||
if line.contains("#[cfg(test)]") {
|
||||
in_test_module = true;
|
||||
}
|
||||
|
||||
// Skip test modules entirely
|
||||
if in_test_module {
|
||||
// Still check for bridge patterns below, but don't count unwraps
|
||||
} else if self.unwrap_pattern.is_match(line) {
|
||||
// Check if the enclosing function has #[allow(clippy::unwrap_used)]
|
||||
// or #[allow(clippy::expect_used)].
|
||||
// Scan backwards to the fn boundary, then check attributes above it.
|
||||
let mut allowed = false;
|
||||
let mut found_fn = false;
|
||||
for prev in (0..line_num).rev() {
|
||||
let prev_line = lines[prev].trim();
|
||||
if prev_line.is_empty() {
|
||||
if found_fn {
|
||||
break; // blank line above fn means attributes are done
|
||||
}
|
||||
continue;
|
||||
}
|
||||
if prev_line.contains("#[allow(clippy::unwrap_used)]")
|
||||
|| prev_line.contains("#[allow(clippy::expect_used)]")
|
||||
{
|
||||
allowed = true;
|
||||
break;
|
||||
}
|
||||
// Mark that we found the fn boundary
|
||||
if !found_fn
|
||||
&& (prev_line.starts_with("fn ")
|
||||
|| prev_line.starts_with("pub fn ")
|
||||
|| prev_line.contains(" fn "))
|
||||
{
|
||||
found_fn = true;
|
||||
continue; // check attributes above fn
|
||||
}
|
||||
// If we're past the fn and hit non-attribute lines, stop
|
||||
if found_fn && !prev_line.starts_with('#') {
|
||||
break;
|
||||
}
|
||||
}
|
||||
if !allowed {
|
||||
unwrap_count += 1;
|
||||
}
|
||||
}
|
||||
|
||||
// Detect SourceClass assignments in bridge code
|
||||
if file.contains("bridge") {
|
||||
if let Some(m) = self.source_class_pattern.find(line) {
|
||||
observations.push(super::traits::build_claim(
|
||||
path_segments,
|
||||
&["bridge", "tier_assignment"],
|
||||
"default_tier",
|
||||
ObjectValue::Text(m.as_str().to_string()),
|
||||
file,
|
||||
line_number,
|
||||
m.as_str(),
|
||||
0.9,
|
||||
"Bridge tier assignment pattern",
|
||||
));
|
||||
}
|
||||
}
|
||||
|
||||
// Detect parent_hash: None patterns in bridge code
|
||||
if file.contains("bridge") && self.parent_hash_none.is_match(line) {
|
||||
observations.push(super::traits::build_claim(
|
||||
path_segments,
|
||||
&["bridge", "parent_hash"],
|
||||
"always_none",
|
||||
ObjectValue::Boolean(true),
|
||||
file,
|
||||
line_number,
|
||||
"parent_hash: None",
|
||||
0.9,
|
||||
"Parent hash always set to None",
|
||||
));
|
||||
}
|
||||
|
||||
// Detect LifecycleStage::Approved skipping Pending
|
||||
if file.contains("bridge") && self.lifecycle_approved.is_match(line) {
|
||||
observations.push(super::traits::build_claim(
|
||||
path_segments,
|
||||
&["bridge", "lifecycle"],
|
||||
"skips_pending",
|
||||
ObjectValue::Boolean(true),
|
||||
file,
|
||||
line_number,
|
||||
"LifecycleStage::Approved",
|
||||
0.9,
|
||||
"Lifecycle stage skips Pending, goes directly to Approved",
|
||||
));
|
||||
}
|
||||
}
|
||||
|
||||
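// Emit a single summary observation for the unwrap count. Files whose path
// contains "test" are skipped (a substring check, so `latest.rs` is skipped too).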
|
||||
if !file.contains("test") {
|
||||
#[allow(clippy::cast_precision_loss)]
|
||||
observations.push(super::traits::build_claim(
|
||||
path_segments,
|
||||
&["production", "error_handling"],
|
||||
"unwrap_count",
|
||||
ObjectValue::Number(unwrap_count as f64),
|
||||
file,
|
||||
1,
|
||||
&format!("{unwrap_count} unwrap/expect calls"),
|
||||
1.0,
|
||||
"Count of .unwrap()/.expect() calls in production code",
|
||||
));
|
||||
}
|
||||
|
||||
observations
|
||||
}
|
||||
|
||||
fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
|
||||
vec![
|
||||
("bridge/tier_assignment", "default_tier"),
|
||||
("bridge/parent_hash", "always_none"),
|
||||
("bridge/lifecycle", "skips_pending"),
|
||||
("production/error_handling", "unwrap_count"),
|
||||
]
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn test_detects_unwrap() {
|
||||
let ext = SelfAuditExtractor::new();
|
||||
let content = r#"
|
||||
fn main() {
|
||||
let x = foo().unwrap();
|
||||
let y = bar().expect("should work");
|
||||
}
|
||||
"#;
|
||||
let obs = ext.extract(
|
||||
&["rust".to_string(), "aphoria".to_string()],
|
||||
content,
|
||||
Language::Rust,
|
||||
"src/main.rs",
|
||||
);
|
||||
|
||||
let unwrap_obs: Vec<_> = obs.iter().filter(|o| o.predicate == "unwrap_count").collect();
|
||||
assert_eq!(unwrap_obs.len(), 1);
|
||||
assert_eq!(unwrap_obs[0].value, ObjectValue::Number(2.0));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_skips_allowed_unwrap() {
|
||||
let ext = SelfAuditExtractor::new();
|
||||
let content = r#"
|
||||
#[allow(clippy::unwrap_used)]
|
||||
fn allowed() {
|
||||
let x = foo().unwrap();
|
||||
}
|
||||
|
||||
fn not_allowed() {
|
||||
let y = bar().unwrap();
|
||||
}
|
||||
"#;
|
||||
let obs = ext.extract(
|
||||
&["rust".to_string(), "aphoria".to_string()],
|
||||
content,
|
||||
Language::Rust,
|
||||
"src/main.rs",
|
||||
);
|
||||
|
||||
let unwrap_obs: Vec<_> = obs.iter().filter(|o| o.predicate == "unwrap_count").collect();
|
||||
assert_eq!(unwrap_obs.len(), 1);
|
||||
// The allowed one should be skipped, only the non-allowed one counted
|
||||
assert_eq!(unwrap_obs[0].value, ObjectValue::Number(1.0));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_bridge_detection() {
|
||||
let ext = SelfAuditExtractor::new();
|
||||
let content = r#"
|
||||
fn build_assertion() {
|
||||
let source_class = SourceClass::Community;
|
||||
let parent_hash: None;
|
||||
let lifecycle = LifecycleStage::Approved;
|
||||
}
|
||||
"#;
|
||||
let obs = ext.extract(
|
||||
&["rust".to_string(), "aphoria".to_string()],
|
||||
content,
|
||||
Language::Rust,
|
||||
"src/bridge.rs",
|
||||
);
|
||||
|
||||
assert!(obs.iter().any(|o| o.predicate == "default_tier"));
|
||||
assert!(obs.iter().any(|o| o.predicate == "skips_pending"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_no_bridge_obs_for_non_bridge() {
|
||||
let ext = SelfAuditExtractor::new();
|
||||
let content = "let source_class = SourceClass::Community;\n";
|
||||
let obs = ext.extract(
|
||||
&["rust".to_string()],
|
||||
content,
|
||||
Language::Rust,
|
||||
"src/other.rs",
|
||||
);
|
||||
|
||||
assert!(!obs.iter().any(|o| o.predicate == "default_tier"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_skips_test_files_for_unwrap() {
|
||||
let ext = SelfAuditExtractor::new();
|
||||
let content = "let x = foo().unwrap();\n";
|
||||
let obs = ext.extract(
|
||||
&["rust".to_string()],
|
||||
content,
|
||||
Language::Rust,
|
||||
"src/tests/verify.rs",
|
||||
);
|
||||
|
||||
// Test files should not produce unwrap_count observations
|
||||
let unwrap_obs: Vec<_> = obs.iter().filter(|o| o.predicate == "unwrap_count").collect();
|
||||
assert!(unwrap_obs.is_empty());
|
||||
}
|
||||
}
|
||||
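The backward scan in `SelfAuditExtractor::extract` is easier to follow in isolation. A minimal sketch, assuming plain substring checks stand in for the extractor's `unwrap_pattern` regex (its definition is not shown in this diff); `is_allowed` and `count_unwraps` are hypothetical names, not part of the commit:

```rust
/// Hypothetical helper: true if the function enclosing `lines[idx]` carries
/// `#[allow(clippy::unwrap_used)]` or `#[allow(clippy::expect_used)]`.
fn is_allowed(lines: &[&str], idx: usize) -> bool {
    let mut found_fn = false;
    for prev in (0..idx).rev() {
        let line = lines[prev].trim();
        if line.is_empty() {
            if found_fn {
                return false; // blank line above the fn: attributes are done
            }
            continue;
        }
        if line.contains("#[allow(clippy::unwrap_used)]")
            || line.contains("#[allow(clippy::expect_used)]")
        {
            return true;
        }
        // First line that looks like a fn signature marks the boundary.
        if !found_fn
            && (line.starts_with("fn ") || line.starts_with("pub fn ") || line.contains(" fn "))
        {
            found_fn = true;
            continue;
        }
        // Above the fn, anything that is not an attribute ends the scan.
        if found_fn && !line.starts_with('#') {
            return false;
        }
    }
    false
}

/// Hypothetical helper: counts unwrap/expect calls outside test modules.
fn count_unwraps(content: &str) -> usize {
    let lines: Vec<&str> = content.lines().collect();
    let mut in_test_module = false;
    let mut count = 0;
    for (i, line) in lines.iter().enumerate() {
        if line.contains("#[cfg(test)]") {
            in_test_module = true; // never reset, as in the extractor
        }
        if in_test_module {
            continue;
        }
        // Assumption: substring match instead of the real regex.
        if (line.contains(".unwrap()") || line.contains(".expect(")) && !is_allowed(&lines, i) {
            count += 1;
        }
    }
    count
}
```

As in the extractor, the scan is line-based rather than syntax-aware: it relies on attributes sitting directly above the `fn` line with no blank line in between.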
@ -13,7 +13,7 @@ use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::build_claim;
|
||||
use super::Extractor;
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for Spring Boot security misconfigurations.
|
||||
#[allow(dead_code)]
|
||||
@ -114,7 +114,7 @@ impl SpringSecurityExtractor {
|
||||
path_segments: &[String],
|
||||
content: &str,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -249,7 +249,7 @@ impl SpringSecurityExtractor {
|
||||
path_segments: &[String],
|
||||
content: &str,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
// Multi-line patterns
|
||||
@ -377,7 +377,7 @@ impl Extractor for SpringSecurityExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
// Check if this looks like a Spring file
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for SQL injection vulnerabilities.
|
||||
///
|
||||
@ -107,7 +107,7 @@ impl SqlInjectionExtractor {
|
||||
path_segments: &[String],
|
||||
file: &str,
|
||||
description: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -117,7 +117,7 @@ impl SqlInjectionExtractor {
|
||||
concept_path.push("query".to_string());
|
||||
concept_path.push("construction".to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "construction".to_string(),
|
||||
value: ObjectValue::Text("interpolated".to_string()),
|
||||
@ -155,7 +155,7 @@ impl Extractor for SqlInjectionExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
match language {
|
||||
@ -235,6 +235,17 @@ impl Extractor for SqlInjectionExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
+
+    fn screening_patterns(&self) -> Vec<&str> {
+        vec![
+            r"(?i)format!.*SELECT|format!.*INSERT|format!.*UPDATE|format!.*DELETE",
+            r"(?i)Sprintf.*SELECT|Sprintf.*INSERT|Sprintf.*UPDATE",
+            r#"(?i)f".*SELECT|f".*INSERT|f".*UPDATE|f".*DELETE"#,
+            r"(?i)\.format\(.*SELECT|\.format\(.*INSERT",
+            r"(?i)%.*SELECT|%.*INSERT",
+            r"(?i)\+.*SELECT|\+.*INSERT|\+.*UPDATE",
+        ]
+    }
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::{build_claim, Extractor};
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for SSRF vulnerabilities.
|
||||
///
|
||||
@ -105,7 +105,7 @@ impl SsrfExtractor {
|
||||
matched: &str,
|
||||
category: &str,
|
||||
description: &str,
|
||||
-) -> ExtractedClaim {
+) -> Observation {
|
||||
build_claim(
|
||||
path_segments,
|
||||
&["network", "ssrf", category],
|
||||
@ -141,7 +141,7 @@ impl Extractor for SsrfExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -255,6 +255,19 @@ impl Extractor for SsrfExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
+
+    fn screening_patterns(&self) -> Vec<&str> {
+        vec![
+            r"requests\.get|requests\.post|requests\.request",
+            r"urllib",
+            r"httpx\.",
+            r"fetch\(",
+            r"axios\.",
+            r"http\.Get|http\.Post|http\.Do",
+            r"reqwest::get|reqwest::Client",
+            r"(?i)url\s*=.*request|url\s*=.*params|url\s*=.*query",
+        ]
+    }
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Configuration for timeout extraction thresholds.
|
||||
#[derive(Debug, Clone)]
|
||||
@ -74,12 +74,12 @@ impl TimeoutConfigExtractor {
|
||||
context: &str,
|
||||
value: f64,
|
||||
description: &str,
|
||||
-) -> ExtractedClaim {
+) -> Observation {
|
||||
let mut concept_path = path_segments.to_vec();
|
||||
concept_path.push(context.to_string());
|
||||
concept_path.push("timeout".to_string());
|
||||
|
||||
-ExtractedClaim {
+Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "config_value".to_string(),
|
||||
value: ObjectValue::Number(value),
|
||||
@ -164,7 +164,7 @@ impl Extractor for TimeoutConfigExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -225,6 +225,14 @@ impl Extractor for TimeoutConfigExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
+
+    fn screening_patterns(&self) -> Vec<&str> {
+        vec![
+            r"(?i)timeout",
+            r"(?i)read_timeout|write_timeout|connect_timeout",
+            r"(?i)request_timeout|idle_timeout|keep_alive",
+        ]
+    }
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for TLS certificate verification settings.
|
||||
pub struct TlsVerifyExtractor {
|
||||
@ -67,7 +67,7 @@ impl TlsVerifyExtractor {
|
||||
pattern: &Regex,
|
||||
path_segments: &[String],
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -76,7 +76,7 @@ impl TlsVerifyExtractor {
|
||||
concept_path.push("tls".to_string());
|
||||
concept_path.push("cert_verification".to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "enabled".to_string(),
|
||||
value: ObjectValue::Boolean(false),
|
||||
@ -117,7 +117,7 @@ impl Extractor for TlsVerifyExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
match language {
|
||||
@ -173,6 +173,22 @@ impl Extractor for TlsVerifyExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
+
+    fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
+        vec![("tls/cert_verification", "enabled")]
+    }
+
+    fn screening_patterns(&self) -> Vec<&str> {
+        vec![
+            r"danger_accept_invalid",
+            r"accept_invalid_certs",
+            r"InsecureSkipVerify",
+            r"verify\s*=\s*False",
+            r"rejectUnauthorized",
+            r"NODE_TLS_REJECT_UNAUTHORIZED",
+            r"(?i)verify.*ssl|ssl.*verify|tls.*verify|verify.*tls",
+        ]
+    }
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for deprecated TLS version usage.
|
||||
///
|
||||
@ -146,7 +146,7 @@ impl TlsVersionExtractor {
|
||||
file: &str,
|
||||
version: &str,
|
||||
description: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -155,7 +155,7 @@ impl TlsVersionExtractor {
|
||||
concept_path.push("tls".to_string());
|
||||
concept_path.push("min_version".to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "version".to_string(),
|
||||
value: ObjectValue::Text(version.to_string()),
|
||||
@ -198,7 +198,7 @@ impl Extractor for TlsVersionExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
match language {
|
||||
@ -363,6 +363,10 @@ impl Extractor for TlsVersionExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
+
+    fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
+        vec![("tls/min_version", "version")]
+    }
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -2,7 +2,7 @@
|
||||
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Language, Observation};
|
||||
|
||||
// ============================================================================
|
||||
// Shared Utilities for Extractors
|
||||
@ -25,12 +25,12 @@ pub fn is_test_file(file: &str) -> bool {
|
||||
|| lower.ends_with("_test.rs")
|
||||
}
|
||||
|
||||
-/// Build an extracted claim with consistent formatting.
+/// Build an observation with consistent formatting.
|
||||
///
|
||||
-/// This is a helper for extractors to create claims with:
+/// This is a helper for extractors to create observations with:
|
||||
/// - Consistent concept path format (`code://segment1/segment2/...`)
|
||||
/// - Automatic confidence reduction for test files
|
||||
-/// - Standard claim structure
+/// - Standard observation structure
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
pub fn build_claim(
|
||||
path_segments: &[String],
|
||||
@ -42,7 +42,7 @@ pub fn build_claim(
|
||||
matched_text: &str,
|
||||
base_confidence: f32,
|
||||
description: &str,
|
||||
-) -> ExtractedClaim {
+) -> Observation {
|
||||
let mut concept_path = path_segments.to_vec();
|
||||
for segment in leaf_segments {
|
||||
concept_path.push((*segment).to_string());
|
||||
@ -50,7 +50,7 @@ pub fn build_claim(
|
||||
|
||||
let confidence = if is_test_file(file) { base_confidence * 0.5 } else { base_confidence };
|
||||
|
||||
-ExtractedClaim {
+Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: predicate.to_string(),
|
||||
value,
|
||||
@ -62,9 +62,9 @@ pub fn build_claim(
|
||||
}
|
||||
}
|
||||
|
||||
-/// Trait for claim extractors.
+/// Trait for observation extractors.
|
||||
///
|
||||
-/// Extractors scan file content and return claims about implicit decisions.
+/// Extractors scan file content and return observations about implicit decisions.
|
||||
pub trait Extractor: Send + Sync {
|
||||
/// Unique identifier for this extractor.
|
||||
fn name(&self) -> &str;
|
||||
@ -72,7 +72,7 @@ pub trait Extractor: Send + Sync {
|
||||
/// File types this extractor operates on.
|
||||
fn languages(&self) -> &[Language];
|
||||
|
||||
-/// Extract claims from a file's content.
+/// Extract observations from a file's content.
|
||||
///
|
||||
/// # Arguments
|
||||
///
|
||||
@ -83,14 +83,39 @@ pub trait Extractor: Send + Sync {
|
||||
///
|
||||
/// # Returns
|
||||
///
|
||||
-/// Zero or more extracted claims.
+/// Zero or more extracted observations.
|
||||
fn extract(
|
||||
&self,
|
||||
path_segments: &[String],
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim>;
+) -> Vec<Observation>;
|
||||
+
+    /// Declare which observation predicates this extractor can verify.
+    ///
+    /// Returns `(tail_path_suffix, predicate)` pairs describing the concept paths
+    /// and predicates this extractor produces. Used by `verify map` to show
+    /// extractor→claim coverage.
+    ///
+    /// Tail-path suffixes use the last 2 segments of the concept path.
+    /// Wildcards are supported: `"imports/*"` matches `"imports/tokio"`, etc.
+    ///
+    /// Default: empty (backward compatible; observation-only extractor).
+    fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
+        vec![]
+    }
+
+    /// Return lightweight string patterns for pre-screening file content.
+    ///
+    /// The registry compiles these into a `RegexSet` for one-pass DFA matching.
+    /// If *any* pattern matches the file content, this extractor is selected to run.
+    ///
+    /// Return `vec![]` (the default) to **always run** this extractor on matching
+    /// language files; use this for extractors that are cheap or hard to pre-screen.
+    fn screening_patterns(&self) -> Vec<&str> {
+        vec![]
+    }
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
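The `verifiable_predicates` doc comment describes matching on the last two segments of a concept path, with `*` as a wildcard. How `verify map` performs that match is not shown in this hunk. A minimal sketch, assuming a segment-wise comparison in which `*` matches any single segment; `tail_matches` is a hypothetical helper name, not part of the commit:

```rust
/// Hypothetical helper: does `concept_path` end with the segments of `suffix`?
/// A `*` segment in `suffix` matches any single path segment (an assumption).
fn tail_matches(concept_path: &str, suffix: &str) -> bool {
    // Concept paths are built as `code://segment1/segment2/...`.
    let path = concept_path.strip_prefix("code://").unwrap_or(concept_path);
    let segments: Vec<&str> = path.split('/').collect();
    let wanted: Vec<&str> = suffix.split('/').collect();
    if wanted.len() > segments.len() {
        return false;
    }
    // Compare only the tail of the path against the suffix, segment by segment.
    let tail = &segments[segments.len() - wanted.len()..];
    tail.iter().zip(&wanted).all(|(seg, pat)| *pat == "*" || seg == pat)
}
```

Under these assumptions, `("tls/cert_verification", "enabled")` from `TlsVerifyExtractor` would pair with any concept path ending in `tls/cert_verification`, whatever project segments precede it.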
@ -11,7 +11,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for Unreal Engine INI patterns.
|
||||
pub struct UnrealConfigExtractor {
|
||||
@ -59,7 +59,7 @@ impl UnrealConfigExtractor {
|
||||
category: &str,
|
||||
leaf: &str,
|
||||
desc_template: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -71,7 +71,7 @@ impl UnrealConfigExtractor {
|
||||
concept_path.push(category.to_string());
|
||||
concept_path.push(leaf.to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "value".to_string(),
|
||||
value: ObjectValue::Number(val as f64),
|
||||
@ -99,7 +99,7 @@ impl UnrealConfigExtractor {
|
||||
leaf: &str,
|
||||
predicate: &str,
|
||||
desc_template: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -110,7 +110,7 @@ impl UnrealConfigExtractor {
|
||||
concept_path.push(category.to_string());
|
||||
concept_path.push(leaf.to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: predicate.to_string(),
|
||||
value: ObjectValue::Text(val_match.as_str().to_string()),
|
||||
@ -143,7 +143,7 @@ impl Extractor for UnrealConfigExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
if language != Language::Ini {
|
||||
return vec![];
|
||||
}
|
||||
|
||||
@ -11,7 +11,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for Unreal Engine C++ patterns.
|
||||
pub struct UnrealCppExtractor {
|
||||
@ -57,7 +57,7 @@ impl UnrealCppExtractor {
|
||||
category: &str,
|
||||
leaf: &str,
|
||||
desc_template: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -67,7 +67,7 @@ impl UnrealCppExtractor {
|
||||
concept_path.push(category.to_string());
|
||||
concept_path.push(leaf.to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "exposed".to_string(), // Default predicate
|
||||
value: ObjectValue::Boolean(true),
|
||||
@ -99,7 +99,7 @@ impl Extractor for UnrealCppExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
if language != Language::Cpp {
|
||||
return vec![];
|
||||
}
|
||||
|
||||
@ -8,7 +8,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for Unreal Engine performance patterns.
|
||||
pub struct UnrealPerformanceExtractor {
|
||||
@ -46,7 +46,7 @@ impl UnrealPerformanceExtractor {
|
||||
file: &str,
|
||||
leaf: &str,
|
||||
desc_template: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -56,7 +56,7 @@ impl UnrealPerformanceExtractor {
|
||||
concept_path.push("performance".to_string());
|
||||
concept_path.push(leaf.to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "sync_load".to_string(),
|
||||
value: ObjectValue::Boolean(true),
|
||||
@ -88,7 +88,7 @@ impl Extractor for UnrealPerformanceExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
if language != Language::Cpp {
|
||||
return vec![];
|
||||
}
|
||||
|
||||
@ -9,7 +9,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for unsafe blocks and atomic ordering patterns.
|
||||
///
|
||||
@ -70,7 +70,7 @@ impl Extractor for UnsafeAtomicExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
let confidence = self.confidence_for_file(file);
|
||||
|
||||
@ -92,7 +92,7 @@ impl Extractor for UnsafeAtomicExtractor {
|
||||
concept_path.push("atomics".to_string());
|
||||
concept_path.push("ordering".to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "pattern".to_string(),
|
||||
value: ObjectValue::Text(ordering.to_string()),
|
||||
@ -117,7 +117,7 @@ impl Extractor for UnsafeAtomicExtractor {
|
||||
concept_path.push("unsafe".to_string());
|
||||
concept_path.push("count".to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "occurrences".to_string(),
|
||||
value: ObjectValue::Number(unsafe_count as f64),
|
||||
@ -134,6 +134,13 @@ impl Extractor for UnsafeAtomicExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
+
+    fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
+        vec![
+            ("atomics/ordering", "pattern"),
+            ("unsafe/count", "occurrences"),
+        ]
+    }
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::{build_claim, Extractor};
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for unvalidated redirect vulnerabilities.
|
||||
///
|
||||
@ -86,7 +86,7 @@ impl UnvalidatedRedirectsExtractor {
|
||||
matched: &str,
|
||||
category: &str,
|
||||
description: &str,
|
||||
-) -> ExtractedClaim {
+) -> Observation {
|
||||
build_claim(
|
||||
path_segments,
|
||||
&["http", "redirect", category],
|
||||
@ -116,7 +116,7 @@ impl Extractor for UnvalidatedRedirectsExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::Extractor;
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for weak cryptographic algorithm usage.
|
||||
///
|
||||
@ -86,7 +86,7 @@ impl WeakCryptoExtractor {
|
||||
file: &str,
|
||||
algorithm: &str,
|
||||
description: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -96,7 +96,7 @@ impl WeakCryptoExtractor {
|
||||
concept_path.push("hashing".to_string());
|
||||
concept_path.push("algorithm".to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "algorithm".to_string(),
|
||||
value: ObjectValue::Text(algorithm.to_string()),
|
||||
@ -120,7 +120,7 @@ impl WeakCryptoExtractor {
|
||||
file: &str,
|
||||
algorithm: &str,
|
||||
description: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
@ -130,7 +130,7 @@ impl WeakCryptoExtractor {
|
||||
concept_path.push("encryption".to_string());
|
||||
concept_path.push("algorithm".to_string());
|
||||
|
||||
-claims.push(ExtractedClaim {
+claims.push(Observation {
|
||||
concept_path: format!("code://{}", concept_path.join("/")),
|
||||
predicate: "algorithm".to_string(),
|
||||
value: ObjectValue::Text(algorithm.to_string()),
|
||||
@ -168,7 +168,7 @@ impl Extractor for WeakCryptoExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
match language {
|
||||
@ -303,6 +303,22 @@ impl Extractor for WeakCryptoExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
+
+    fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
+        vec![
+            ("hashing/algorithm", "algorithm"),
+            ("encryption/algorithm", "algorithm"),
+        ]
+    }
+
+    fn screening_patterns(&self) -> Vec<&str> {
+        vec![
+            r"(?i)md5|Md5",
+            r"(?i)sha1|sha-1|Sha1",
+            r"(?i)\bdes\b|DES|TripleDES|des-ede",
+            r"(?i)\brc4\b|RC4|arcfour",
+        ]
+    }
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::{build_claim, Extractor};
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for weak password requirement configurations.
|
||||
///
|
||||
@ -92,7 +92,7 @@ impl WeakPasswordExtractor {
|
||||
category: &str,
|
||||
value: ObjectValue,
|
||||
description: &str,
|
||||
-) -> ExtractedClaim {
+) -> Observation {
|
||||
build_claim(
|
||||
path_segments,
|
||||
&["auth", "password", "policy", category],
|
||||
@ -131,7 +131,7 @@ impl Extractor for WeakPasswordExtractor {
|
||||
content: &str,
|
||||
_language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
|
||||
@ -7,7 +7,7 @@ use regex::Regex;
|
||||
use stemedb_core::types::ObjectValue;
|
||||
|
||||
use super::traits::{build_claim, Extractor};
|
||||
-use crate::types::{ExtractedClaim, Language};
+use crate::types::{Observation, Language};
|
||||
|
||||
/// Extractor for XXE vulnerabilities.
|
||||
///
|
||||
@ -99,7 +99,7 @@ impl XxeExtractor {
|
||||
parser: &str,
|
||||
confidence: f32,
|
||||
description: &str,
|
||||
-) -> ExtractedClaim {
+) -> Observation {
|
||||
build_claim(
|
||||
path_segments,
|
||||
&["xml", "parsing"],
|
||||
@ -129,7 +129,7 @@ impl Extractor for XxeExtractor {
|
||||
content: &str,
|
||||
language: Language,
|
||||
file: &str,
|
||||
-) -> Vec<ExtractedClaim> {
+) -> Vec<Observation> {
|
||||
let mut claims = Vec::new();
|
||||
|
||||
for (line_idx, line) in content.lines().enumerate() {
|
||||
|
||||
524
applications/aphoria/src/handlers/claims.rs
Normal file
@ -0,0 +1,524 @@
|
||||
//! Command handlers for authored claims management.
|
||||
|
||||
use std::process::ExitCode;
|
||||
|
||||
use aphoria::claims_explain;
|
||||
use aphoria::claims_file::ClaimsFile;
|
||||
use aphoria::{parse_authority_tier, AuthoredClaim, AuthoredValue, ClaimStatus};
|
||||
use aphoria::AphoriaConfig;
|
||||
|
||||
use crate::cli::ClaimsCommands;
|
||||
|
||||
/// Find the project root by walking up from cwd looking for `.aphoria/claims.toml`.
|
||||
///
|
||||
/// Falls back to cwd if no claims file is found in any parent.
|
||||
fn project_root() -> Result<std::path::PathBuf, ExitCode> {
|
||||
let cwd = std::env::current_dir().map_err(|e| {
|
||||
eprintln!("Error: cannot determine current directory: {e}");
|
||||
ExitCode::from(3)
|
||||
})?;
|
||||
|
||||
// Check cwd first
|
||||
if cwd.join(".aphoria/claims.toml").exists() {
|
||||
return Ok(cwd);
|
||||
}
|
||||
|
||||
// Walk up parents
|
||||
let mut dir = cwd.as_path();
|
||||
while let Some(parent) = dir.parent() {
|
||||
if parent.join(".aphoria/claims.toml").exists() {
|
||||
return Ok(parent.to_path_buf());
|
||||
}
|
||||
dir = parent;
|
||||
}
|
||||
|
||||
// Fall back to cwd (will return empty claims)
|
||||
Ok(cwd)
|
||||
}
|
||||
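The cwd check followed by the parent walk in `project_root` visits each ancestor directory once, which is the sequence `Path::ancestors` yields. A minimal sketch of the same lookup, assuming a hypothetical free function `find_root` that takes the start directory as a parameter so it can be exercised without changing the process cwd:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical variant of `project_root`: returns the nearest ancestor of
/// `start` (including `start` itself) that contains `.aphoria/claims.toml`,
/// or `start` when no ancestor has one.
fn find_root(start: &Path) -> PathBuf {
    start
        .ancestors() // yields `start`, then its parent, and so on up to the root
        .find(|dir| dir.join(".aphoria/claims.toml").exists())
        .unwrap_or(start)
        .to_path_buf()
}
```

The handler in this diff additionally maps a `current_dir` failure to exit code 3; that part is left out of the sketch.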
|
||||
/// Handle claims subcommands.
|
||||
pub async fn handle_claims_command(command: ClaimsCommands, config: &AphoriaConfig) -> ExitCode {
|
||||
match command {
|
||||
ClaimsCommands::Create {
|
||||
id,
|
||||
concept_path,
|
||||
predicate,
|
||||
value,
|
||||
provenance,
|
||||
invariant,
|
||||
consequence,
|
||||
tier,
|
||||
evidence,
|
||||
category,
|
||||
by,
|
||||
} => {
|
||||
handle_claims_create(
|
||||
id,
|
||||
concept_path,
|
||||
predicate,
|
||||
value,
|
||||
provenance,
|
||||
invariant,
|
||||
consequence,
|
||||
tier,
|
||||
evidence,
|
||||
category,
|
||||
by,
|
||||
config,
|
||||
)
|
||||
.await
|
||||
}
|
||||
ClaimsCommands::List { category, status, format } => {
|
||||
handle_claims_list(category, status, format, config).await
|
||||
}
|
||||
ClaimsCommands::Explain { claim, output, format } => {
|
||||
handle_claims_explain(claim, output, format, config).await
|
||||
}
|
||||
ClaimsCommands::Update { id, provenance, invariant, consequence, tier, evidence, category, value } => {
|
||||
handle_claims_update(id, provenance, invariant, consequence, tier, evidence, category, value, config).await
|
||||
}
|
||||
ClaimsCommands::Supersede { id, new_id, value, provenance, invariant, consequence, tier, evidence, by } => {
|
||||
handle_claims_supersede(id, new_id, value, provenance, invariant, consequence, tier, evidence, by, config).await
|
||||
}
|
||||
ClaimsCommands::Deprecate { id, reason } => {
|
||||
handle_claims_deprecate(id, reason, config).await
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
async fn handle_claims_create(
|
||||
id: String,
|
||||
concept_path: String,
|
||||
predicate: String,
|
||||
value: String,
|
||||
provenance: String,
|
||||
invariant: String,
|
||||
consequence: String,
|
||||
tier: String,
|
||||
evidence: Vec<String>,
|
||||
category: String,
|
||||
by: String,
|
||||
_config: &AphoriaConfig,
|
||||
) -> ExitCode {
|
||||
// Validate authority tier
|
||||
if let Err(e) = parse_authority_tier(&tier) {
|
||||
eprintln!("Error: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
let root = match project_root() {
|
||||
Ok(r) => r,
|
||||
Err(code) => return code,
|
||||
};
|
||||
let path = ClaimsFile::default_path(&root);
|
||||
let mut claims_file = match ClaimsFile::load(&path) {
|
||||
Ok(f) => f,
|
||||
Err(e) => {
|
||||
eprintln!("Error loading claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
// Check for duplicate ID
|
||||
if claims_file.find_by_id(&id).is_some() {
|
||||
eprintln!("Error: Claim with ID '{id}' already exists");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
let now = chrono::Utc::now().format("%Y-%m-%dT%H:%M:%SZ").to_string();
|
||||
|
||||
let claim = AuthoredClaim {
|
||||
id: id.clone(),
|
||||
concept_path,
|
||||
predicate,
|
||||
value: AuthoredValue::parse(&value),
|
||||
comparison: Default::default(),
|
||||
provenance,
|
||||
invariant,
|
||||
consequence,
|
||||
authority_tier: tier.to_lowercase(),
|
||||
evidence,
|
||||
category,
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: by,
|
||||
created_at: now,
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
claims_file.add(claim);
|
||||
|
||||
if let Err(e) = claims_file.save(&path) {
|
||||
eprintln!("Error saving claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
println!("Created claim '{id}' in {}", path.display());
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
|
||||
async fn handle_claims_list(
|
||||
category: Option<String>,
|
||||
status: Option<String>,
|
||||
format: String,
|
||||
_config: &AphoriaConfig,
|
||||
) -> ExitCode {
|
||||
let root = match project_root() {
|
||||
Ok(r) => r,
|
||||
Err(code) => return code,
|
||||
};
|
||||
let path = ClaimsFile::default_path(&root);
|
||||
let claims_file = match ClaimsFile::load(&path) {
|
||||
Ok(f) => f,
|
||||
Err(e) => {
|
||||
eprintln!("Error loading claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
let mut claims: Vec<&AuthoredClaim> = claims_file.claims.iter().collect();
|
||||
|
||||
// Filter by category
|
||||
if let Some(ref cat) = category {
|
||||
claims.retain(|c| c.category == *cat);
|
||||
}
|
||||
|
||||
// Filter by status
|
||||
if let Some(ref st) = status {
|
||||
let target = match st.to_lowercase().as_str() {
|
||||
"active" => ClaimStatus::Active,
|
||||
"deprecated" => ClaimStatus::Deprecated,
|
||||
"superseded" => ClaimStatus::Superseded,
|
||||
other => {
|
||||
eprintln!("Unknown status: {other}. Expected: active, deprecated, superseded");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
claims.retain(|c| c.status == target);
|
||||
}
|
||||
|
||||
if format == "json" {
|
||||
let envelope = serde_json::json!({
|
||||
"type": "claims_list",
|
||||
"total": claims.len(),
|
||||
"claims": claims
|
||||
});
|
||||
match serde_json::to_string_pretty(&envelope) {
|
||||
Ok(json) => println!("{json}"),
|
||||
Err(e) => {
|
||||
eprintln!("Error serializing claims: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
// Table format
|
||||
if claims.is_empty() {
|
||||
println!("No claims found.");
|
||||
return ExitCode::SUCCESS;
|
||||
}
|
||||
|
||||
let mut table = comfy_table::Table::new();
|
||||
table.set_header(vec!["ID", "Category", "Tier", "Status", "Invariant"]);
|
||||
|
||||
for claim in &claims {
|
||||
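// Truncate by characters, not bytes, so multi-byte UTF-8 text cannot cause a slice panic.
let invariant_short = if claim.invariant.chars().count() > 50 {
format!("{}...", claim.invariant.chars().take(47).collect::<String>())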
|
||||
} else {
|
||||
claim.invariant.clone()
|
||||
};
|
||||
table.add_row(vec![
|
||||
&claim.id,
|
||||
&claim.category,
|
||||
&claim.authority_tier,
|
||||
&claim.status.to_string(),
|
||||
&invariant_short,
|
||||
]);
|
||||
}
|
||||
|
||||
println!("{table}");
|
||||
println!("\n{} claim(s) total", claims.len());
|
||||
}
|
||||
|
||||
ExitCode::SUCCESS
|
||||
}
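The table output in `handle_claims_list` shortens long invariants so they fit in one column. One way to factor that step into a reusable helper is sketched below. `truncate_chars` is a hypothetical name, not part of this commit. It counts `char`s rather than bytes, so it cannot cut inside a multi-byte UTF-8 sequence.

```rust
/// Shorten `text` to `max_chars` characters, ending in "..." when it was cut.
/// Assumes `max_chars >= 3`; for smaller limits the result is just "...".
/// Hypothetical helper for illustration only.
fn truncate_chars(text: &str, max_chars: usize) -> String {
    if text.chars().count() <= max_chars {
        return text.to_string();
    }
    // Reserve three characters for the ellipsis.
    let keep = max_chars.saturating_sub(3);
    let mut short: String = text.chars().take(keep).collect();
    short.push_str("...");
    short
}
```

With this helper the table row would use `truncate_chars(&claim.invariant, 50)`.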
|
||||
|
||||
async fn handle_claims_explain(
|
||||
claim_id: Option<String>,
|
||||
output: Option<std::path::PathBuf>,
|
||||
format: String,
|
||||
_config: &AphoriaConfig,
|
||||
) -> ExitCode {
|
||||
let root = match project_root() {
|
||||
Ok(r) => r,
|
||||
Err(code) => return code,
|
||||
};
|
||||
let path = ClaimsFile::default_path(&root);
|
||||
let claims_file = match ClaimsFile::load(&path) {
|
||||
Ok(f) => f,
|
||||
Err(e) => {
|
||||
eprintln!("Error loading claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
let project_name = root
|
||||
.file_name()
|
||||
.map(|n| n.to_string_lossy().to_string())
|
||||
.unwrap_or_else(|| "project".to_string());
|
||||
|
||||
let content = if let Some(ref id) = claim_id {
|
||||
// Single claim
|
||||
let claim = match claims_file.find_by_id(id) {
|
||||
Some(c) => c,
|
||||
None => {
|
||||
eprintln!("Claim not found: {id}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
if format == "json" {
|
||||
match claims_explain::render_claim_json(claim, &project_name) {
|
||||
Ok(json) => json,
|
||||
Err(e) => {
|
||||
eprintln!("Error rendering claim: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
let mut out = String::new();
|
||||
claims_explain::render_single_claim(&mut out, claim);
|
||||
out
|
||||
}
|
||||
} else {
|
||||
// All claims
|
||||
if format == "json" {
|
||||
match claims_explain::render_claims_json(&claims_file.claims, &project_name) {
|
||||
Ok(json) => json,
|
||||
Err(e) => {
|
||||
eprintln!("Error rendering claims: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
claims_explain::render_claims_markdown(&claims_file.claims, &project_name)
|
||||
}
|
||||
};
|
||||
|
||||
if let Some(ref out_path) = output {
|
||||
if let Some(parent) = out_path.parent() {
|
||||
if !parent.exists() {
|
||||
if let Err(e) = std::fs::create_dir_all(parent) {
|
||||
eprintln!("Error creating output directory: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
}
|
||||
}
|
||||
if let Err(e) = std::fs::write(out_path, &content) {
|
||||
eprintln!("Error writing output: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
println!("Written to {}", out_path.display());
|
||||
} else {
|
||||
println!("{content}");
|
||||
}
|
||||
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
async fn handle_claims_update(
|
||||
id: String,
|
||||
provenance: Option<String>,
|
||||
invariant: Option<String>,
|
||||
consequence: Option<String>,
|
||||
tier: Option<String>,
|
||||
evidence: Vec<String>,
|
||||
category: Option<String>,
|
||||
value: Option<String>,
|
||||
_config: &AphoriaConfig,
|
||||
) -> ExitCode {
|
||||
// Validate tier if provided
|
||||
if let Some(ref t) = tier {
|
||||
if let Err(e) = parse_authority_tier(t) {
|
||||
eprintln!("Error: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
}
|
||||
|
||||
let root = match project_root() {
|
||||
Ok(r) => r,
|
||||
Err(code) => return code,
|
||||
};
|
||||
let path = ClaimsFile::default_path(&root);
|
||||
let mut claims_file = match ClaimsFile::load(&path) {
|
||||
Ok(f) => f,
|
||||
Err(e) => {
|
||||
eprintln!("Error loading claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
let now = chrono::Utc::now().format("%Y-%m-%dT%H:%M:%SZ").to_string();
|
||||
|
||||
let result = claims_file.update(&id, |c| {
|
||||
if let Some(p) = provenance {
|
||||
c.provenance = p;
|
||||
}
|
||||
if let Some(i) = invariant {
|
||||
c.invariant = i;
|
||||
}
|
||||
if let Some(con) = consequence {
|
||||
c.consequence = con;
|
||||
}
|
||||
if let Some(t) = tier {
|
||||
c.authority_tier = t.to_lowercase();
|
||||
}
|
||||
if !evidence.is_empty() {
|
||||
for e in evidence {
|
||||
if !c.evidence.contains(&e) {
|
||||
c.evidence.push(e);
|
||||
}
|
||||
}
|
||||
}
|
||||
if let Some(cat) = category {
|
||||
c.category = cat;
|
||||
}
|
||||
if let Some(v) = value {
|
||||
c.value = AuthoredValue::parse(&v);
|
||||
}
|
||||
c.updated_at = Some(now);
|
||||
});
|
||||
|
||||
if let Err(e) = result {
|
||||
eprintln!("Error: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
if let Err(e) = claims_file.save(&path) {
|
||||
eprintln!("Error saving claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
println!("Updated claim '{id}'");
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
async fn handle_claims_supersede(
|
||||
old_id: String,
|
||||
new_id: Option<String>,
|
||||
value: Option<String>,
|
||||
provenance: Option<String>,
|
||||
invariant: Option<String>,
|
||||
consequence: Option<String>,
|
||||
tier: Option<String>,
|
||||
evidence: Vec<String>,
|
||||
by: Option<String>,
|
||||
_config: &AphoriaConfig,
|
||||
) -> ExitCode {
|
||||
let root = match project_root() {
|
||||
Ok(r) => r,
|
||||
Err(code) => return code,
|
||||
};
|
||||
let path = ClaimsFile::default_path(&root);
|
||||
let mut claims_file = match ClaimsFile::load(&path) {
|
||||
Ok(f) => f,
|
||||
Err(e) => {
|
||||
eprintln!("Error loading claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
// Get the old claim to copy fields from
|
||||
let old_claim = match claims_file.find_by_id(&old_id) {
|
||||
Some(c) => c.clone(),
|
||||
None => {
|
||||
eprintln!("Claim not found: {old_id}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
let now = chrono::Utc::now().format("%Y-%m-%dT%H:%M:%SZ").to_string();
|
||||
let actual_new_id = new_id.unwrap_or_else(|| format!("{old_id}-v2"));
|
||||
|
||||
// Check for duplicate
|
||||
if claims_file.find_by_id(&actual_new_id).is_some() {
|
||||
eprintln!("Error: Claim with ID '{actual_new_id}' already exists");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
// Resolve the tier (new value if provided, otherwise inherited from the old claim) and validate it
|
||||
let new_tier = tier.map(|t| t.to_lowercase()).unwrap_or(old_claim.authority_tier.clone());
|
||||
if let Err(e) = parse_authority_tier(&new_tier) {
|
||||
eprintln!("Error: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
let new_claim = AuthoredClaim {
|
||||
id: actual_new_id.clone(),
|
||||
concept_path: old_claim.concept_path.clone(),
|
||||
predicate: old_claim.predicate.clone(),
|
||||
value: value.map(|v| AuthoredValue::parse(&v)).unwrap_or(old_claim.value.clone()),
|
||||
comparison: old_claim.comparison.clone(),
|
||||
provenance: provenance.unwrap_or(old_claim.provenance.clone()),
|
||||
invariant: invariant.unwrap_or(old_claim.invariant.clone()),
|
||||
consequence: consequence.unwrap_or(old_claim.consequence.clone()),
|
||||
authority_tier: new_tier,
|
||||
evidence: if evidence.is_empty() { old_claim.evidence.clone() } else { evidence },
|
||||
category: old_claim.category.clone(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: Some(old_id.clone()),
|
||||
created_by: by.unwrap_or(old_claim.created_by.clone()),
|
||||
created_at: now,
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
if let Err(e) = claims_file.supersede(&old_id, new_claim) {
|
||||
eprintln!("Error: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
if let Err(e) = claims_file.save(&path) {
|
||||
eprintln!("Error saving claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
println!("Created claim '{actual_new_id}' superseding '{old_id}'");
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
|
||||
async fn handle_claims_deprecate(id: String, reason: String, _config: &AphoriaConfig) -> ExitCode {
|
||||
let root = match project_root() {
|
||||
Ok(r) => r,
|
||||
Err(code) => return code,
|
||||
};
|
||||
let path = ClaimsFile::default_path(&root);
|
||||
let mut claims_file = match ClaimsFile::load(&path) {
|
||||
Ok(f) => f,
|
||||
Err(e) => {
|
||||
eprintln!("Error loading claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
let now = chrono::Utc::now().format("%Y-%m-%dT%H:%M:%SZ").to_string();
|
||||
|
||||
// Update the claim with deprecation info
|
||||
let result = claims_file.update(&id, |c| {
|
||||
c.status = ClaimStatus::Deprecated;
|
||||
c.updated_at = Some(format!("{now} (deprecated: {reason})"));
|
||||
});
|
||||
|
||||
if let Err(e) = result {
|
||||
eprintln!("Error: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
if let Err(e) = claims_file.save(&path) {
|
||||
eprintln!("Error saving claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
println!("Deprecated claim '{id}': {reason}");
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
@ -43,6 +43,20 @@ pub async fn handle_corpus_command(command: CorpusCommands, config: &AphoriaConf
|
||||
}
|
||||
}
|
||||
|
||||
CorpusCommands::ExportPack { name, output, only, offline } => {
|
||||
let only_parsed = only.map(|s| s.split(',').map(|s| s.trim().to_string()).collect());
|
||||
match aphoria::export_corpus_as_pack(name, output, only_parsed, offline, config).await {
|
||||
Ok(count) => {
|
||||
println!("Exported {count} assertions as Trust Pack");
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
Err(e) => {
|
||||
eprintln!("Export error: {e}");
|
||||
ExitCode::from(3)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
CorpusCommands::List => {
|
||||
let sources = aphoria::list_corpus_sources(config);
|
||||
println!("Available corpus sources:");
|
||||
|
||||
@ -19,6 +19,17 @@ pub async fn handle_governance_command(
|
||||
config: &AphoriaConfig,
|
||||
) -> ExitCode {
|
||||
if !config.governance.enabled && !matches!(command, GovernanceCommands::Status { .. }) {
|
||||
// For Pending: return empty results with exit 0 (not an error)
|
||||
if let GovernanceCommands::Pending { format, .. } = &command {
|
||||
if format == "json" {
|
||||
println!("{{\"pending\": [], \"message\": \"Governance is not enabled\"}}");
|
||||
} else {
|
||||
println!("Governance is not enabled. No pending requests.");
|
||||
println!("\nTo enable: add [governance] enabled = true to aphoria.toml");
|
||||
}
|
||||
return ExitCode::SUCCESS;
|
||||
}
|
||||
// All other commands that modify state still fail
|
||||
eprintln!("Governance is not enabled. Add [governance] enabled = true to aphoria.toml");
|
||||
return ExitCode::from(1);
|
||||
}
|
||||
|
||||
@ -422,6 +422,9 @@ async fn handle_list(
|
||||
|
||||
if results.is_empty() {
|
||||
println!("No patterns found matching criteria.");
|
||||
println!();
|
||||
println!("Available statuses: active, deprecated, sunset");
|
||||
println!("To add patterns, promote an extractor: aphoria extractors promote <id>");
|
||||
return ExitCode::SUCCESS;
|
||||
}
|
||||
|
||||
@ -533,7 +536,10 @@ async fn handle_migration_status(
|
||||
}
|
||||
|
||||
if progress_list.is_empty() {
|
||||
println!("No migration data found.");
|
||||
println!("No deprecated patterns with tracked migrations.");
|
||||
println!();
|
||||
println!("Migrations are tracked automatically when deprecated patterns have usages.");
|
||||
println!("Run 'aphoria lifecycle list --status deprecated' to see deprecated patterns.");
|
||||
return ExitCode::SUCCESS;
|
||||
}
|
||||
|
||||
|
||||
@ -6,6 +6,7 @@ use aphoria::AphoriaConfig;
|
||||
|
||||
use crate::cli::Commands;
|
||||
|
||||
mod claims;
|
||||
mod corpus;
|
||||
mod eval;
|
||||
mod extractors;
|
||||
@ -19,11 +20,14 @@ mod scan;
|
||||
mod scope;
|
||||
mod shadow;
|
||||
mod utils;
|
||||
mod verify;
|
||||
|
||||
// Re-export for public API compatibility.
|
||||
// These are used by the CLI binary but not within this module,
|
||||
// so we allow unused imports for the re-export pattern.
|
||||
#[allow(unused_imports)]
|
||||
pub use claims::*;
|
||||
#[allow(unused_imports)]
|
||||
pub use corpus::*;
|
||||
#[allow(unused_imports)]
|
||||
pub use eval::*;
|
||||
@ -49,6 +53,8 @@ pub use scope::*;
|
||||
pub use shadow::*;
|
||||
#[allow(unused_imports)]
|
||||
pub use utils::*;
|
||||
#[allow(unused_imports)]
|
||||
pub use verify::*;
|
||||
|
||||
/// Dispatch and execute CLI commands
|
||||
pub async fn handle_command(command: Commands, config: &AphoriaConfig) -> ExitCode {
|
||||
@ -133,5 +139,223 @@ pub async fn handle_command(command: Commands, config: &AphoriaConfig) -> ExitCo
|
||||
}
|
||||
|
||||
Commands::Audit { command } => governance::handle_audit_command(command, config).await,
|
||||
|
||||
Commands::Claims { command } => claims::handle_claims_command(command, config).await,
|
||||
|
||||
Commands::Verify { command } => verify::handle_verify_command(command, config).await,
|
||||
|
||||
Commands::Coverage { path, format, sort_by } => {
|
||||
let project_root = if path.as_os_str() == "." {
|
||||
match std::env::current_dir() {
|
||||
Ok(p) => p,
|
||||
Err(e) => {
|
||||
eprintln!("Cannot determine project root: {e}");
|
||||
return ExitCode::from(1);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
path
|
||||
};
|
||||
|
||||
// Load claims and run scan to get observations
|
||||
let claims_path = aphoria::claims_file::ClaimsFile::default_path(&project_root);
|
||||
let claims_file = match aphoria::claims_file::ClaimsFile::load(&claims_path) {
|
||||
Ok(cf) => cf,
|
||||
Err(_) => aphoria::claims_file::ClaimsFile::new(),
|
||||
};
|
||||
|
||||
let scan_args = aphoria::ScanArgs {
|
||||
path: project_root.clone(),
|
||||
format: "table".to_string(),
|
||||
exit_code_enabled: false,
|
||||
mode: aphoria::ScanMode::Ephemeral,
|
||||
debug: false,
|
||||
sync: false,
|
||||
file_source: aphoria::FileSource::All,
|
||||
benchmark: false,
|
||||
show_claims: true,
|
||||
strict: false,
|
||||
};
|
||||
|
||||
let observations = match aphoria::run_scan(scan_args, config).await {
|
||||
Ok(result) => result.claims.unwrap_or_default(),
|
||||
Err(e) => {
|
||||
eprintln!("Scan error: {e}");
|
||||
return ExitCode::from(1);
|
||||
}
|
||||
};
|
||||
|
||||
let project_name = project_root
|
||||
.file_name()
|
||||
.and_then(|n| n.to_str())
|
||||
.unwrap_or("project");
|
||||
|
||||
let report = aphoria::compute_coverage(&claims_file.claims, &observations, project_name);
|
||||
|
||||
let output = match format.as_str() {
|
||||
"json" => aphoria::format_coverage_json(&report),
|
||||
"markdown" => aphoria::format_coverage_markdown(&report),
|
||||
_ => aphoria::format_coverage_table(&report, &sort_by),
|
||||
};
|
||||
println!("{output}");
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
|
||||
Commands::Explain { path, output, format } => {
|
||||
let project_root = if path.as_os_str() == "." {
|
||||
match std::env::current_dir() {
|
||||
Ok(p) => p,
|
||||
Err(e) => {
|
||||
eprintln!("Cannot determine project root: {e}");
|
||||
return ExitCode::from(1);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
path
|
||||
};
|
||||
|
||||
match gather_explain_data(&project_root, config).await {
|
||||
Ok((claims, observations, project_name)) => {
|
||||
let verify_report = aphoria::verify_claims(&claims, &observations);
|
||||
let coverage_report = aphoria::compute_coverage_from_report(
|
||||
&claims, &observations, &verify_report, &project_name,
|
||||
);
|
||||
let text = aphoria::explain::generate_onboarding(
|
||||
&claims, &verify_report, &coverage_report, &project_name, &format,
|
||||
);
|
||||
write_or_print(&text, output.as_deref())
|
||||
}
|
||||
Err(code) => code,
|
||||
}
|
||||
}
|
||||
|
||||
Commands::Docs { command } => {
|
||||
match command {
|
||||
crate::cli::DocsCommands::Generate { path, output, format } => {
|
||||
let project_root = if path.as_os_str() == "." {
|
||||
match std::env::current_dir() {
|
||||
Ok(p) => p,
|
||||
Err(e) => {
|
||||
eprintln!("Cannot determine project root: {e}");
|
||||
return ExitCode::from(1);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
path
|
||||
};
|
||||
|
||||
match gather_explain_data(&project_root, config).await {
|
||||
Ok((claims, observations, project_name)) => {
|
||||
let verify_report = aphoria::verify_claims(&claims, &observations);
|
||||
let coverage_report = aphoria::compute_coverage_from_report(
|
||||
&claims, &observations, &verify_report, &project_name,
|
||||
);
|
||||
let text = aphoria::explain::generate_full_docs(
|
||||
&claims, &verify_report, &coverage_report, &project_name, &format,
|
||||
);
|
||||
write_or_print(&text, output.as_deref())
|
||||
}
|
||||
Err(code) => code,
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Commands::TrustPack { command } => {
|
||||
match command {
|
||||
crate::cli::TrustPackCommands::Install { name, registry } => {
|
||||
if let Some(registry_url) = registry {
|
||||
eprintln!("Custom registry not yet supported: {registry_url}");
|
||||
eprintln!("Use a built-in pack name or omit --registry.");
|
||||
return ExitCode::from(1);
|
||||
}
|
||||
match aphoria::trust_pack_registry::lookup(&name) {
|
||||
Ok(entry) => {
|
||||
println!("Trust Pack: {}", entry.name);
|
||||
println!(" {}", entry.description);
|
||||
println!(" Tier: {}", entry.tier);
|
||||
println!(" URL: {}", entry.url);
|
||||
println!();
|
||||
println!("Download not yet implemented (requires hosting infrastructure).");
|
||||
println!("For now, use `aphoria corpus build` to build assertions locally,");
|
||||
println!("or `aphoria policy import <file>` to import a .pack file.");
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
Err(e) => {
|
||||
eprintln!("{e}");
|
||||
ExitCode::from(1)
|
||||
}
|
||||
}
|
||||
}
|
||||
crate::cli::TrustPackCommands::List => {
|
||||
let packs = aphoria::trust_pack_registry::list_packs();
|
||||
println!("Available Trust Packs:");
|
||||
println!();
|
||||
for pack in packs {
|
||||
println!(" {} ({})", pack.name, pack.tier);
|
||||
println!(" {}", pack.description);
|
||||
}
|
||||
println!();
|
||||
println!("Install with: aphoria trust-pack install <name>");
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Gather claims + observations for explain/docs commands.
|
||||
/// Returns (claims, observations, project_name) or an ExitCode on error.
|
||||
async fn gather_explain_data(
|
||||
project_root: &std::path::Path,
|
||||
config: &AphoriaConfig,
|
||||
) -> Result<(Vec<aphoria::AuthoredClaim>, Vec<aphoria::Observation>, String), ExitCode> {
|
||||
let claims_path = aphoria::claims_file::ClaimsFile::default_path(project_root);
|
||||
let claims_file = match aphoria::claims_file::ClaimsFile::load(&claims_path) {
|
||||
Ok(cf) => cf,
|
||||
Err(_) => aphoria::claims_file::ClaimsFile::new(),
|
||||
};
|
||||
|
||||
let scan_args = aphoria::ScanArgs {
|
||||
path: project_root.to_path_buf(),
|
||||
format: "table".to_string(),
|
||||
exit_code_enabled: false,
|
||||
mode: aphoria::ScanMode::Ephemeral,
|
||||
debug: false,
|
||||
sync: false,
|
||||
file_source: aphoria::FileSource::All,
|
||||
benchmark: false,
|
||||
show_claims: true,
|
||||
strict: false,
|
||||
};
|
||||
|
||||
let observations = match aphoria::run_scan(scan_args, config).await {
|
||||
Ok(result) => result.claims.unwrap_or_default(),
|
||||
Err(e) => {
|
||||
eprintln!("Scan error: {e}");
|
||||
return Err(ExitCode::from(1));
|
||||
}
|
||||
};
|
||||
|
||||
let project_name = project_root
|
||||
.file_name()
|
||||
.and_then(|n| n.to_str())
|
||||
.unwrap_or("project")
|
||||
.to_string();
|
||||
|
||||
Ok((claims_file.claims, observations, project_name))
|
||||
}
|
||||
|
||||
/// Write text to a file or print to stdout.
|
||||
fn write_or_print(text: &str, output: Option<&std::path::Path>) -> ExitCode {
|
||||
if let Some(out_path) = output {
|
||||
if let Err(e) = std::fs::write(out_path, text) {
|
||||
eprintln!("Failed to write to {}: {e}", out_path.display());
|
||||
return ExitCode::from(1);
|
||||
}
|
||||
println!("Written to {}", out_path.display());
|
||||
} else {
|
||||
println!("{text}");
|
||||
}
|
||||
ExitCode::SUCCESS
|
||||
}
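The `Coverage`, `Explain`, and `Docs` arms above each repeat the same step: treat a path of `.` as "use the current directory", otherwise use the path as given. A small helper could hold that logic in one place. This is a sketch only; `resolve_project_root` is a hypothetical name and returns an `io::Result` so each caller can keep its own error message and exit code.

```rust
use std::io;
use std::path::PathBuf;

/// Map the CLI default path "." to the current working directory and
/// pass any other path through unchanged.
/// Hypothetical helper for illustration only.
fn resolve_project_root(path: PathBuf) -> io::Result<PathBuf> {
    if path.as_os_str() == "." {
        std::env::current_dir()
    } else {
        Ok(path)
    }
}
```

A caller would match on the result, printing "Cannot determine project root" and returning exit code 1 on `Err`, as the arms above already do.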
|
||||
|
||||
@ -38,6 +38,7 @@ pub async fn handle_scan(
|
||||
file_source,
|
||||
benchmark,
|
||||
show_claims,
|
||||
strict,
|
||||
};
|
||||
|
||||
// Apply stricter thresholds if requested
|
||||
@ -102,6 +103,7 @@ pub async fn handle_community_preview(
|
||||
file_source: FileSource::All,
|
||||
benchmark: false,
|
||||
show_claims: false,
|
||||
strict: false,
|
||||
};
|
||||
|
||||
let claims = match extract_claims(&args, config).await {
|
||||
|
||||
238
applications/aphoria/src/handlers/verify.rs
Normal file
@ -0,0 +1,238 @@
|
||||
//! Command handlers for `aphoria verify`.
|
||||
|
||||
use std::path::PathBuf;
|
||||
use std::process::ExitCode;
|
||||
|
||||
use aphoria::claims_file::ClaimsFile;
|
||||
use aphoria::extractors::ExtractorRegistry;
|
||||
use aphoria::report::{format_verify_json, format_verify_table};
|
||||
use aphoria::verify;
|
||||
use aphoria::AphoriaConfig;
|
||||
|
||||
use crate::cli::VerifyCommands;
|
||||
|
||||
/// Dispatch a verify subcommand.
|
||||
pub async fn handle_verify_command(command: VerifyCommands, config: &AphoriaConfig) -> ExitCode {
|
||||
match command {
|
||||
VerifyCommands::Run {
|
||||
path,
|
||||
format,
|
||||
exit_code,
|
||||
changed_only,
|
||||
show_unclaimed,
|
||||
claim,
|
||||
category,
|
||||
} => {
|
||||
handle_verify_run(
|
||||
path,
|
||||
format,
|
||||
exit_code,
|
||||
changed_only,
|
||||
show_unclaimed,
|
||||
claim,
|
||||
category,
|
||||
config,
|
||||
)
|
||||
.await
|
||||
}
|
||||
VerifyCommands::Map { path } => handle_verify_map(path, config).await,
|
||||
}
|
||||
}
|
||||
|
||||
/// Run verification: extract observations, compare against claims.
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
async fn handle_verify_run(
|
||||
path: PathBuf,
|
||||
format: String,
|
||||
exit_code: bool,
|
||||
changed_only: bool,
|
||||
show_unclaimed: bool,
|
||||
claim_filter: Vec<String>,
|
||||
category_filter: Option<String>,
|
||||
config: &AphoriaConfig,
|
||||
) -> ExitCode {
|
||||
let project_root = path.canonicalize().unwrap_or(path);
|
||||
|
||||
// 1. Load claims from .aphoria/claims.toml
|
||||
let claims_path = ClaimsFile::default_path(&project_root);
|
||||
let claims_file = match ClaimsFile::load(&claims_path) {
|
||||
Ok(cf) => cf,
|
||||
Err(e) => {
|
||||
eprintln!("Error loading claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
if claims_file.is_empty() {
|
||||
eprintln!("No claims found in {}", claims_path.display());
|
||||
eprintln!("Create claims with: aphoria claims create");
|
||||
return ExitCode::SUCCESS;
|
||||
}
|
||||
|
||||
// 2. Filter claims by ID or category if specified
|
||||
let mut claims: Vec<aphoria::AuthoredClaim> = claims_file.claims;
|
||||
|
||||
if !claim_filter.is_empty() {
|
||||
claims.retain(|c| claim_filter.contains(&c.id));
|
||||
}
|
||||
if let Some(ref cat) = category_filter {
|
||||
claims.retain(|c| c.category == *cat);
|
||||
}
|
||||
|
||||
// 3. Walk the project and extract observations
|
||||
let files = if changed_only {
|
||||
match aphoria::walker::walk_staged_files(&project_root, config) {
|
||||
Ok(f) => f,
|
||||
Err(e) => {
|
||||
eprintln!("Error walking staged files: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
match aphoria::walker::walk_project(&project_root, config) {
|
||||
Ok(f) => f,
|
||||
Err(e) => {
|
||||
eprintln!("Error walking project: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
let registry = ExtractorRegistry::new(config);
|
||||
let mut all_observations = Vec::new();
|
||||
for file in &files {
|
||||
let content = match std::fs::read_to_string(&file.path) {
|
||||
Ok(c) => c,
|
||||
Err(_) => continue,
|
||||
};
|
||||
let obs = registry.extract_all(
|
||||
&file.path_segments,
|
||||
&content,
|
||||
file.language,
|
||||
&file.relative_path,
|
||||
);
|
||||
all_observations.extend(obs);
|
||||
}
|
||||
|
||||
// 4. Run verification
|
||||
let report = verify::verify_claims(&claims, &all_observations);
|
||||
|
||||
// 5. Format and output
|
||||
let project_name = project_root
|
||||
.file_name()
|
||||
.and_then(|n| n.to_str())
|
||||
.unwrap_or("project");
|
||||
|
||||
let output = match format.as_str() {
|
||||
"json" => format_verify_json(&report, show_unclaimed),
|
||||
_ => format_verify_table(&report, project_name, show_unclaimed),
|
||||
};
|
||||
|
||||
println!("{output}");
|
||||
|
||||
// 6. Exit codes (applied only when `exit_code` is true): 0=pass, 1=missing, 2=conflicts, 3=error
|
||||
if !exit_code {
|
||||
return ExitCode::SUCCESS;
|
||||
}
|
||||
|
||||
if report.summary.conflict > 0 {
|
||||
ExitCode::from(2)
|
||||
} else if report.summary.missing > 0 {
|
||||
ExitCode::from(1)
|
||||
} else {
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
}
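The exit-code decision at the end of `handle_verify_run` is a pure function of the report summary and the `exit_code` flag, so it can be isolated and tested on its own. The sketch below assumes a simplified `Summary` struct with only the two counters the handler reads. Both `Summary` and `exit_status` are hypothetical names, and the function returns a plain `u8` because `ExitCode` cannot be compared in assertions.

```rust
use std::process::ExitCode;

/// Hypothetical stand-in for the verify report summary.
struct Summary {
    conflict: usize,
    missing: usize,
}

/// Numeric exit status: conflicts take priority over missing observations,
/// and everything maps to 0 when exit codes are disabled.
fn exit_status(summary: &Summary, exit_code_enabled: bool) -> u8 {
    if !exit_code_enabled {
        return 0;
    }
    if summary.conflict > 0 {
        2
    } else if summary.missing > 0 {
        1
    } else {
        0
    }
}

/// Convert the numeric status into the type a handler returns.
fn to_exit_code(status: u8) -> ExitCode {
    ExitCode::from(status)
}
```

Keeping the priority order in one function makes it straightforward to check that a report with both conflicts and missing observations exits with 2, not 1.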
|
||||
|
||||
/// Show the mapping between extractors and claims.
|
||||
async fn handle_verify_map(path: PathBuf, config: &AphoriaConfig) -> ExitCode {
|
||||
let project_root = path.canonicalize().unwrap_or(path);
|
||||
|
||||
// Load claims
|
||||
let claims_path = ClaimsFile::default_path(&project_root);
|
||||
let claims_file = match ClaimsFile::load(&claims_path) {
|
||||
Ok(cf) => cf,
|
||||
Err(e) => {
|
||||
eprintln!("Error loading claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
let registry = ExtractorRegistry::new(config);
|
||||
|
||||
println!("Extractor → Claim Mapping");
|
||||
println!("{}", "=".repeat(60));
|
||||
println!();
|
||||
|
||||
if claims_file.is_empty() {
|
||||
println!("No authored claims found.");
|
||||
println!();
|
||||
println!("Registered Extractors ({}):", registry.extractor_names().len());
|
||||
for name in &registry.extractor_names() {
|
||||
let preds = registry
|
||||
.extractors()
|
||||
.iter()
|
||||
.find(|e| e.name() == *name)
|
||||
.map(|e| e.verifiable_predicates())
|
||||
.unwrap_or_default();
|
||||
if preds.is_empty() {
|
||||
println!(" {name} (no declared predicates)");
|
||||
} else {
|
||||
let pred_strs: Vec<String> =
|
||||
preds.iter().map(|(tp, p)| format!("{tp}::{p}")).collect();
|
||||
println!(" {name} [{}]", pred_strs.join(", "));
|
||||
}
|
||||
}
|
||||
    } else {
        let map = verify::compute_extractor_claim_map(&claims_file.claims, registry.extractors());

        // Show per-claim coverage
        println!("Claim Coverage ({} active claims):", map.claim_mappings.len());
        println!();

        let mut covered = 0usize;
        for mapping in &map.claim_mappings {
            if mapping.covering_extractors.is_empty() {
                println!(
                    " {} ({}) -> NO EXTRACTOR",
                    mapping.claim_id, mapping.claim_tail_path
                );
            } else {
                println!(
                    " {} ({}) -> {}",
                    mapping.claim_id,
                    mapping.claim_tail_path,
                    mapping.covering_extractors.join(", ")
                );
                covered += 1;
            }
        }

        println!();
        let total = map.claim_mappings.len();
        println!(
            "Coverage: {covered}/{total} claims have covering extractors ({:.0}%)",
            if total > 0 {
                (covered as f64 / total as f64) * 100.0
            } else {
                0.0
            }
        );

        // Show extractors that declare predicates but have no matching claims
        if !map.unmatched_extractors.is_empty() {
            println!();
            println!(
                "Extractors with declared predicates but no matching claims ({}):",
                map.unmatched_extractors.len()
            );
            for ext in &map.unmatched_extractors {
                let pred_strs: Vec<String> =
                    ext.predicates.iter().map(|(tp, p)| format!("{tp}::{p}")).collect();
                println!(" {} [{}]", ext.name, pred_strs.join(", "));
            }
        }
    }

    ExitCode::SUCCESS
}
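Review note: the coverage report printed by `handle_verify_map` can be pictured with a small standalone sketch. The `SketchClaim` and `SketchExtractor` types and the matching rule (a claim counts as covered when some extractor declares its exact `(type path, predicate)` pair) are assumptions made for illustration. The real `verify::compute_extractor_claim_map` is not shown in this diff and may match differently, for example on tail paths.

```rust
/// Hypothetical stand-in for an authored claim: an id plus the predicate it asserts.
struct SketchClaim {
    id: String,
    /// (type path, predicate name); the shape is assumed, not taken from the diff.
    predicate: (String, String),
}

/// Hypothetical stand-in for an extractor and the predicates it declares.
struct SketchExtractor {
    name: String,
    predicates: Vec<(String, String)>,
}

/// For each claim, list the names of the extractors that declare its predicate.
fn covering_extractors<'a>(
    claims: &'a [SketchClaim],
    extractors: &'a [SketchExtractor],
) -> Vec<(&'a str, Vec<&'a str>)> {
    claims
        .iter()
        .map(|c| {
            let names = extractors
                .iter()
                .filter(|e| e.predicates.contains(&c.predicate))
                .map(|e| e.name.as_str())
                .collect();
            (c.id.as_str(), names)
        })
        .collect()
}

/// Percentage of claims with at least one covering extractor.
/// Returns 0.0 for an empty claim set, mirroring the guard in the handler.
fn coverage_percent(covered: usize, total: usize) -> f64 {
    if total > 0 {
        (covered as f64 / total as f64) * 100.0
    } else {
        0.0
    }
}
```

The `total > 0` guard matters because dividing `0.0 / 0.0` would yield `NaN`, which would print as `NaN%` in the summary line.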
@ -39,6 +39,18 @@ pub async fn show_status(config: &AphoriaConfig) -> Result<String, AphoriaError>
        output.push_str(" Agent key: not generated\n");
    }

    let claims_path = project_root.join(".aphoria/claims.toml");
    if claims_path.exists() {
        if let Ok(claims_file) = crate::claims_file::ClaimsFile::load(&claims_path) {
            let active = claims_file.claims.iter()
                .filter(|c| c.status == crate::types::authored_claim::ClaimStatus::Active)
                .count();
            output.push_str(&format!(" Claims: {} ({} active)\n", claims_file.claims.len(), active));
        }
    } else {
        output.push_str(" Claims: none (run 'aphoria claims create' to add)\n");
    }

    Ok(output)
}
@ -42,7 +42,7 @@ impl std::fmt::Display for ValueType {

/// Template for generating claims from a learned pattern.
///
/// Describes how to create an `ExtractedClaim` when the pattern matches.
/// Describes how to create an `Observation` when the pattern matches.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ClaimTemplate {
    /// Subject path template (e.g., "tls/min_version", "db/pool_size").
@ -4,6 +4,16 @@
//! and checks them against authoritative sources. It finds the places where what
//! your code *does* contradicts what the specs *say*.
//!
//! ## Observations vs Claims
//!
//! **Observations** are pattern matches extracted by regex-based extractors.
//! They lack provenance, invariants, and consequences—they're just grep results.
//! Observations are assigned Tier 4/5 based on confidence.
//!
//! **Claims** (`AuthoredClaim`) are human-authored assertions with full Episteme semantics:
//! provenance, invariants, consequences, authority tiers, and evidence chains.
//! These are stored in `.aphoria/claims.toml` and managed with `aphoria claim` commands.
//!
//! # Architecture
//!
//! ```text
@ -42,15 +52,22 @@
pub mod ack_file;
mod baseline;
pub mod bridge;
pub mod claim_store;
pub mod claims_explain;
pub mod claims_file;
pub mod community;
mod config;
pub mod coverage;
pub mod corpus;
mod corpus_build;
mod episteme;
pub mod scope;
pub use episteme::{current_timestamp, current_timestamp_millis};
pub use episteme::{
    compute_tier_breakdown, current_timestamp, current_timestamp_millis, AphoriaAuthorityLens,
};
mod error;
pub mod eval;
pub mod explain;
pub mod evidence;
pub mod expiry;
pub mod extractors;
@ -68,11 +85,16 @@ pub mod research;
mod research_commands;
mod scan;
pub mod shadow;
pub mod trust_pack_registry;
mod types;
mod walker;
pub mod verify;
pub mod walker;

// Public re-exports
pub use baseline::{set_baseline, show_diff};
pub use bridge::{
    authored_claim_to_assertion, observation_to_assertion, observation_to_tier,
};
pub use community::{
    compute_pattern_hash, AnonymizedObservation, CommunityClaimDef, CommunityExtractor,
    CommunityExtractorLoader, CommunityExtractorProvenance, CommunityObjectValue, PatternAggregate,
@ -83,15 +105,19 @@ pub use config::{
    GovernanceConfig, HostedConfig, LearningConfig, LlmConfig, OfflineFallback,
    PredicateAliasConfig, PromotionConfig, ShadowConfig, SyncMode,
};
pub use coverage::{
    compute_coverage, compute_coverage_from_report, format_coverage_json, format_coverage_markdown,
    format_coverage_table, CoverageReport, CoverageSummary, ModuleCoverage,
};
pub use corpus::{CorpusBuildResult, CorpusBuilderInfo, CorpusRegistry};
pub use corpus_build::{build_corpus, list_corpus_sources, CorpusBuildArgs};
pub use corpus_build::{build_corpus, export_corpus_as_pack, list_corpus_sources, CorpusBuildArgs};
pub use error::AphoriaError;
pub use eval::{
    BaselineComparison, BaselineMetrics, CategoryMetrics, ClaimMatcher, CorpusManifest,
    CorpusMetadata, EvalDatabase, EvalHarness, EvalMode, EvalResult, EvalRunConfig, EvalVerdict,
    ExpectedClaim, FinalClaim, Fixture, FixtureExpected, FixtureInput, FixtureLoader,
    FixtureMetadata, FixtureResult, FixtureScoring, FixtureStatus, FixtureSummary, MatchResult,
    Metrics, Observation, ParsedClaim, Report, ReportFormat, ValidationError,
    Metrics, Observation as EvalObservation, ParsedClaim, Report, ReportFormat, ValidationError,
};
pub use evidence::{EvidenceDetector, EvidenceLevel, EvidenceSource, PatternEvidence};
pub use governance::{
@ -109,8 +135,9 @@ pub use lifecycle::{
};
pub use policy::{PackPredicateAliasSet, PolicyManager, SignatureRecord, TrustPack};
pub use policy_ops::{
    acknowledge, bless, export_acks, export_policy, import_acks, import_policy, parse_value,
    resign_policy, update, AckExportStats, AckImportStats, ImportStats, ResignStats,
    acknowledge, bless, export_acks, export_claims_as_policy, export_policy, import_acks,
    import_policy, parse_value, resign_policy, update, AckExportStats, AckImportStats, ImportStats,
    ResignStats,
};
pub use promotion::{
    compute_metrics_delta, display_candidate, display_candidates_summary, ChangelogEntry,
@ -133,10 +160,19 @@ pub use shadow::{
    ShadowDecision, ShadowDecisionKind, ShadowExecutor, ShadowExtractorRegistry, ShadowMatch,
    ShadowMetrics, ShadowStatus, ShadowStore, ShadowTest,
};
#[allow(deprecated)]
pub use types::ExtractedClaim; // Backward compat alias for Observation
pub use types::{
    extract_leaf_concept, predicates, AcknowledgeArgs, BlessArgs, ConflictResult, ConflictTrace,
    DeprecatedUsageResult, ExtractedClaim, FileSource, PolicySourceInfo, PredicateAliasSet,
    ScanArgs, ScanMode, ScanResult, UpdateArgs, Verdict,
    extract_leaf_concept, format_authority_tier, parse_authority_tier, predicates,
    AcknowledgeArgs, AuthoredClaim, AuthoredValue, BlessArgs, ClaimStatus, ClaimValue,
    ComparisonMode, ConflictResult, ConflictTrace, DeprecatedUsageResult, FileSource, Observation,
    PolicySourceInfo, PredicateAliasSet, ScanArgs, ScanMode, ScanResult, TierBreakdown, UpdateArgs,
    Verdict,
};
pub use claim_store::{ClaimFilter, ClaimStore, ImportStats as ClaimImportStats, TomlClaimStore};
pub use verify::{
    compute_extractor_claim_map, tail_path, verify_claims, AuditVerdict, ExtractorClaimMap,
    ExtractorClaimMapping, UnmatchedExtractor, VerifyReport, VerifyResult, VerifySummary,
};

#[cfg(test)]
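Review note: the Observations vs Claims distinction described in the crate docs above can be sketched as a tiny check. Everything here is a simplified assumption: `SketchObservation`, `SketchClaim`, and `Outcome` are hypothetical names, and exact string equality stands in for whatever comparison modes the real `verify_claims` supports. The real `Observation` and `AuthoredClaim` types carry more fields than shown.

```rust
/// Hypothetical, simplified observation: what an extractor saw in the code.
struct SketchObservation {
    subject: String,
    value: String,
}

/// Hypothetical, simplified authored claim: what a human says must hold.
struct SketchClaim {
    subject: String,
    expected: String,
    /// Human-readable invariant, reported when the claim is violated.
    invariant: String,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// Every matching observation agrees with the claim.
    Satisfied,
    /// At least one matching observation disagrees; carries the invariant text.
    Violated(String),
    /// No observation exists for the claim's subject.
    Missing,
}

/// Compare one claim against all observations for the same subject.
fn check(claim: &SketchClaim, observations: &[SketchObservation]) -> Outcome {
    let mut matching = observations
        .iter()
        .filter(|o| o.subject == claim.subject)
        .peekable();
    if matching.peek().is_none() {
        return Outcome::Missing;
    }
    if matching.all(|o| o.value == claim.expected) {
        Outcome::Satisfied
    } else {
        Outcome::Violated(claim.invariant.clone())
    }
}
```

Keeping `Missing` separate from `Violated` matches the handler earlier in this diff, which returns a distinct exit code when `report.summary.missing > 0`.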
Some files were not shown because too many files have changed in this diff.