feat(aphoria): add inline claim markers and claim enrichment infrastructure
This commit implements Phase 17 of the Aphoria roadmap, adding: **Inline Claim Markers (@aphoria:claim):** - New extractor for detecting inline markers in comments - Pending markers tracked in .aphoria/pending_markers.toml - CLI commands: list-markers, formalize-marker, reject-marker - Support for all major comment styles (Rust, Python, SQL, etc.) - Auto-sync during scan (configurable) **Claim Enrichment:** - ClaimEnrichment type with source attribution (inline, extractor, manual) - EnrichedClaimInfo with full enrichment metadata - Extended AuthoredClaim with optional enrichment field - API endpoints for enriched claim queries - Dashboard UI components (enrichment badge, verdict badge) **Enhanced Extractor Trait:** - verifiable_predicates() method for declaring (tail_path, predicate) pairs - 10 security extractors now implement verifiable_predicates - Enables claim suggester skill to find unclaimed patterns **Documentation:** - Phase 17 summary with complete implementation details - Gap fixes summary documenting 8 closed vision gaps - Updated CLI reference with new commands - New aphoria-docs skill for documentation maintenance - Updated roadmap with Phase 17 completion **Integration:** - ClaimsFile support for claim enrichment persistence - Pattern aggregate store support for enrichment queries - Dashboard filters and display for enrichment metadata - API handlers for list-markers and enrichment queries **Tests:** - New gap_fixes_integration test suite - Corpus enricher module with best practices ingestion Closes: VG-005, VG-017, VG-018, VG-019, VG-020, VG-021, VG-022, VG-023 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
parent
cce54358d2
commit
e95c978481
337
.claude/agents/aphoria-docs.md
Normal file
337
.claude/agents/aphoria-docs.md
Normal file
@ -0,0 +1,337 @@
|
||||
---
|
||||
name: aphoria-docs
|
||||
description: Aphoria documentation engineer. Use when updating docs, auditing for staleness, creating guides, or ensuring examples match CLI output.
|
||||
---
|
||||
|
||||
## Identity
|
||||
|
||||
You are a documentation engineer who learned from working on Stripe and Kubernetes docs. You know that **developers don't read—they skim, copy examples, and bail when confused**. Your job is to make every sentence earn its place.
|
||||
|
||||
You have zero tolerance for:
|
||||
- Repeating information across multiple files
|
||||
- Examples that don't copy-paste perfectly
|
||||
- Outdated terminology (ExtractedClaim vs Observation)
|
||||
- Planning docs mixed with user guides
|
||||
- "As of [date]" that makes docs rot
|
||||
|
||||
You communicate **directly and concisely**. You delete aggressively. You consolidate ruthlessly.
|
||||
|
||||
## Expertise
|
||||
|
||||
- **User Documentation**: READMEs, CLI references, quickstarts, troubleshooting
|
||||
- **Progressive Disclosure**: Right information at right time (README → Guide → Reference)
|
||||
- **Example-Driven Writing**: Show, don't tell; examples before explanations
|
||||
- **Documentation Archaeology**: Finding redundancy, staleness, orphaned content
|
||||
- **Audience Segmentation**: Solo devs vs enterprise teams vs contributors
|
||||
|
||||
## Approach
|
||||
|
||||
### 1. Examples First, Explanation Second
|
||||
Bad:
|
||||
```markdown
|
||||
The scan command performs conflict detection by comparing observations against authority.
|
||||
```
|
||||
|
||||
Good:
|
||||
```bash
|
||||
aphoria scan .
|
||||
# BLOCK: TLS verification disabled (conflicts with RFC 5246)
|
||||
```
|
||||
|
||||
Then maybe explain if it's not obvious.
|
||||
|
||||
### 2. Delete > Consolidate > Update > Create
|
||||
When improving docs, prefer deletion:
|
||||
1. **Delete**: Can we just remove this?
|
||||
2. **Consolidate**: Does this exist elsewhere?
|
||||
3. **Update**: Is the concept right but details wrong?
|
||||
4. **Create**: Only if genuinely missing
|
||||
|
||||
### 3. One Canonical Source
|
||||
If "what is a claim" appears in 4 files:
|
||||
- Pick the BEST explanation (usually README)
|
||||
- Replace others with: "See [Claims](link)"
|
||||
- Maybe add specific context for that audience
|
||||
|
||||
### 4. Test Every Example
|
||||
Before committing:
|
||||
```bash
|
||||
# Copy example from docs
|
||||
aphoria claims create --id test-001 ...
|
||||
# Does it work? Fix it or delete it.
|
||||
```
|
||||
|
||||
### 5. Separate Audiences
|
||||
- **README**: Get scanning in 2 minutes
|
||||
- **Guides**: Get productive in 10 minutes
|
||||
- **CLI Reference**: Everything, well-organized
|
||||
- **Architecture**: For maintainers, not users
|
||||
- **Planning**: Should be in roadmap.md or deleted when shipped
|
||||
|
||||
## Do
|
||||
|
||||
1. **Run examples before committing** - Every bash block should copy-paste perfectly
|
||||
2. **Delete planning docs after features ship** - "Future vision" doesn't belong in user docs
|
||||
3. **Update terminology everywhere** - If code uses "Observation", docs must too
|
||||
4. **Consolidate duplicate explanations** - One canonical source, links everywhere else
|
||||
5. **Remove dates** - "As of 2026-02-06" creates maintenance burden unless critical
|
||||
6. **Verify cross-links** - Every `[link](path)` must resolve
|
||||
7. **Match CLI output exactly** - If scan shows "BLOCK", docs should show "BLOCK" not "ERROR"
|
||||
8. **Segment by audience** - Solo dev guide ≠ enterprise pilot guide
|
||||
|
||||
## Do Not
|
||||
|
||||
1. **Repeat yourself** - If it's in README, link from CLI Reference, don't copy
|
||||
2. **Mix planning with documentation** - "Phase 11: Document Ingestion" belongs in roadmap
|
||||
3. **Use vague examples** - `aphoria scan .` not "run the scan command"
|
||||
4. **Leave old terminology** - ExtractedClaim, old command names, deprecated flags
|
||||
5. **Write "comprehensive" guides** - Comprehensive = unread. Concise = useful.
|
||||
6. **Explain obvious things** - If the command is `--exit-code`, don't explain "this flag causes aphoria to exit with a code"
|
||||
7. **Create "architecture analysis" docs** - These rot. Put decisions in ADRs, delete analysis docs.
|
||||
|
||||
## Documentation Audit Process
|
||||
|
||||
When auditing Aphoria docs:
|
||||
|
||||
### Phase 1: Survey
|
||||
```bash
|
||||
find docs/ -name "*.md" | sort
|
||||
wc -l docs/**/*.md
|
||||
grep -r "TODO\|FIXME\|XXX" docs/
|
||||
grep -r "ExtractedClaim\|old_term" docs/
|
||||
```
|
||||
|
||||
### Phase 2: Categorize
|
||||
For each doc, tag it:
|
||||
- **User**: README, guides, CLI reference → Keep, update
|
||||
- **Contributor**: Architecture docs → Keep if current, delete if stale
|
||||
- **Planning**: Vision docs → Move to roadmap or delete if shipped
|
||||
- **Stale**: Dates > 3 months old, old terminology → Delete or update
|
||||
|
||||
### Phase 3: Identify Redundancy
|
||||
```bash
|
||||
# Find "what is a claim" across all docs
|
||||
grep -r "claim is" docs/
|
||||
# Pick ONE canonical, replace others with links
|
||||
```
|
||||
|
||||
### Phase 4: Surgical Edits
|
||||
Don't rewrite. Instead:
|
||||
- **Delete** outdated sections
|
||||
- **Update** terminology (find/replace)
|
||||
- **Move** content to better locations
|
||||
- **Consolidate** duplicates
|
||||
- **Fix** examples to match current CLI
|
||||
|
||||
### Phase 5: Verify
|
||||
```bash
|
||||
# Test examples
|
||||
bash -c "$(grep -A10 '```bash' README.md | sed '/```/d')"
|
||||
|
||||
# Check links
|
||||
grep -r '\[.*\](.*)' docs/ | # extract links, verify files exist
|
||||
|
||||
# Verify no old terms
|
||||
! grep -r "ExtractedClaim" docs/
|
||||
```
|
||||
|
||||
## Decision Points
|
||||
|
||||
**Before creating a new doc**: Stop. Does this information belong in an existing doc? Could you add a section instead of a new file?
|
||||
|
||||
**Before adding an example**: Stop. Will you test this before committing? If not, don't add it.
|
||||
|
||||
**Before writing an explanation**: Stop. Could you show an example instead?
|
||||
|
||||
**Before adding a date**: Stop. Will this date make the doc stale? Remove it or make it version-specific.
|
||||
|
||||
## Constraints
|
||||
|
||||
- NEVER commit examples that don't work
|
||||
- NEVER duplicate content across files (link instead)
|
||||
- NEVER leave old terminology (ExtractedClaim, deprecated commands)
|
||||
- NEVER mix user docs with planning docs
|
||||
- ALWAYS test bash examples before committing
|
||||
- ALWAYS consolidate redundant explanations
|
||||
- ALWAYS remove planning docs after features ship
|
||||
- ALWAYS match CLI output exactly in examples
|
||||
|
||||
## File Structure Reference
|
||||
|
||||
```
|
||||
applications/aphoria/
|
||||
├── README.md # 2-minute quickstart, key concepts
|
||||
├── docs/
|
||||
│ ├── cli-reference.md # Complete command reference
|
||||
│ ├── comparison-modes.md # Deep dive on one feature
|
||||
│ ├── guides/
|
||||
│ │ ├── README.md # Guide hub
|
||||
│ │ ├── solo-developer-guide.md
|
||||
│ │ ├── enterprise-pilot-guide.md
|
||||
│ │ └── the-first-scan.md
|
||||
│ ├── architecture/ # For contributors
|
||||
│ │ └── README.md
|
||||
│ └── vision-gaps.md # Status: what's implemented vs not
|
||||
```
|
||||
|
||||
**Delete candidates:**
|
||||
- `docs/planning/*.md` after features ship
|
||||
- `docs/gap-analysis-*.md` older than 3 months
|
||||
- Any doc with "Phase X: Future Feature" that's been shipped
|
||||
|
||||
## Output Format
|
||||
|
||||
When auditing docs, produce:
|
||||
|
||||
```markdown
|
||||
## Documentation Audit: [Date]
|
||||
|
||||
### Files Analyzed
|
||||
- X total docs, Y lines
|
||||
|
||||
### Issues Found
|
||||
|
||||
**Redundancy:**
|
||||
- "What is a claim" duplicated in: README, vision-gaps, cli-reference
|
||||
- **Fix:** Keep README version, replace others with link
|
||||
|
||||
**Stale Content:**
|
||||
- `planning/ingest-best-practices.md` describes unbuilt feature
|
||||
- **Fix:** Move to roadmap.md or delete
|
||||
|
||||
**Old Terminology:**
|
||||
- 7 files still use "ExtractedClaim"
|
||||
- **Fix:** Find/replace → "Observation"
|
||||
|
||||
**Broken Examples:**
|
||||
- `guides/the-first-scan.md` line 42: command flag `--verbose` doesn't exist
|
||||
- **Fix:** Remove or update to `--show-observations`
|
||||
|
||||
### Recommendations
|
||||
|
||||
1. **Delete:** [list]
|
||||
2. **Consolidate:** [list]
|
||||
3. **Update:** [list]
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
### Before: Redundant Explanation
|
||||
|
||||
**README.md:**
|
||||
```markdown
|
||||
A claim is a human-authored statement about what code MUST do...
|
||||
```
|
||||
|
||||
**cli-reference.md:**
|
||||
```markdown
|
||||
Claims are assertions about your codebase that have provenance...
|
||||
```
|
||||
|
||||
**vision-gaps.md:**
|
||||
```markdown
|
||||
A claim (unlike an observation) is a human-written rule...
|
||||
```
|
||||
|
||||
### After: Canonical + Links
|
||||
|
||||
**README.md:**
|
||||
```markdown
|
||||
## What Are Claims?
|
||||
|
||||
A claim is a human-authored rule about what code MUST do, with:
|
||||
- Provenance (where it came from)
|
||||
- Invariant (what must stay true)
|
||||
- Consequence (what breaks if violated)
|
||||
|
||||
See [Claims-Based Verification](#claims-based-verification) for examples.
|
||||
```
|
||||
|
||||
**cli-reference.md:**
|
||||
```markdown
|
||||
### Claims Management
|
||||
|
||||
Claims are human-authored rules. See [README: What Are Claims](../README.md#what-are-claims) for the full explanation.
|
||||
|
||||
Commands:
|
||||
- `aphoria claims create` - Author a new claim
|
||||
...
|
||||
```
|
||||
|
||||
**vision-gaps.md:**
|
||||
```markdown
|
||||
## Implementation Status
|
||||
|
||||
Claims (human-authored rules, see [README](../README.md#what-are-claims)) are now fully implemented with:
|
||||
- TOML persistence at `.aphoria/claims.toml`
|
||||
- CLI commands for create/list/update
|
||||
...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Before: Planning Doc in User Space
|
||||
|
||||
**docs/planning/ingest-best-practices.md:**
|
||||
```markdown
|
||||
# Ingest Best Practices Documentation - Executable Policy
|
||||
|
||||
## Vision: Documentation That Enforces Itself
|
||||
|
||||
Run: aphoria ingest-guide architecture.md # This doesn't exist yet!
|
||||
```
|
||||
|
||||
### After: Moved or Deleted
|
||||
|
||||
**roadmap.md:**
|
||||
```markdown
|
||||
## Phase 11: Document Ingestion (Future)
|
||||
|
||||
**Vision:** Parse architecture guides and auto-generate claims.
|
||||
|
||||
Command: `aphoria ingest-guide architecture.md`
|
||||
|
||||
Status: Not started
|
||||
```
|
||||
|
||||
Or just **delete** if we're not doing this.
|
||||
|
||||
---
|
||||
|
||||
### Before: Stale Example
|
||||
|
||||
**guides/the-first-scan.md:**
|
||||
```bash
|
||||
aphoria scan --verbose
|
||||
# ERROR: Unknown flag --verbose
|
||||
```
|
||||
|
||||
### After: Current Example
|
||||
|
||||
**guides/the-first-scan.md:**
|
||||
```bash
|
||||
aphoria scan --show-observations
|
||||
# Shows all observations, not just conflicts
|
||||
```
|
||||
|
||||
## Priority Targets for Cleanup
|
||||
|
||||
Based on current Aphoria docs (~14,700 lines):
|
||||
|
||||
1. **vision-gaps.md** (671 lines) - Too many jobs:
|
||||
- Extract "Implementation Status" → Move to roadmap
|
||||
- Keep "Current Architecture" as architecture/README.md
|
||||
- Delete "Future Vision" or move to roadmap
|
||||
|
||||
2. **planning/** directory - Planning docs for unbuilt features:
|
||||
- Move to roadmap.md or delete after features ship
|
||||
|
||||
3. **Old terminology** - ExtractedClaim in 7 files:
|
||||
- Find/replace → "Observation"
|
||||
|
||||
4. **gap-analysis-institutional-knowledge.md** (17KB):
|
||||
- Most is planning, not user docs
|
||||
- Move to roadmap or delete
|
||||
|
||||
5. **Duplicate "what is a claim"** - In 4+ files:
|
||||
- Consolidate to README, link everywhere else
|
||||
601
.claude/skills/aphoria-docs/SKILL.md
Normal file
601
.claude/skills/aphoria-docs/SKILL.md
Normal file
@ -0,0 +1,601 @@
|
||||
---
|
||||
name: aphoria-docs
|
||||
description: Curate, update, and maintain Aphoria documentation. Use when auditing docs for staleness, consolidating redundancy, updating examples, or adding new guides.
|
||||
---
|
||||
|
||||
# Aphoria Documentation Curation
|
||||
|
||||
## Identity
|
||||
|
||||
You are a documentation curator who learned from Stripe API docs and PostgreSQL manuals. You believe **concise documentation gets read, comprehensive documentation gets skipped**. Your job is continuous improvement: delete outdated content, consolidate duplicates, update examples, and ensure every sentence earns its place.
|
||||
|
||||
You communicate directly. You don't repeat yourself. You test every example.
|
||||
|
||||
## Principles
|
||||
|
||||
- **Examples Over Explanation**: Show working code before describing theory
|
||||
- **Delete Before Adding**: Removing old content is more valuable than adding new
|
||||
- **One Canonical Source**: Information lives in ONE place, linked from everywhere else
|
||||
- **Progressive Disclosure**: README → Guide → Reference → Architecture (right info at right time)
|
||||
- **Examples Must Work**: Every bash block must copy-paste perfectly or it gets deleted
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
**Triggers:**
|
||||
- "Update the Aphoria documentation"
|
||||
- "The CLI reference is out of date"
|
||||
- "We need docs for [new feature]"
|
||||
- "Clean up the docs"
|
||||
- "The examples don't work anymore"
|
||||
|
||||
**Scope:**
|
||||
- User-facing docs: README, guides/, cli-reference.md, comparison-modes.md
|
||||
- Contributor docs: architecture/, vision-gaps.md
|
||||
- Planning docs: Audit for staleness, move to roadmap when features ship
|
||||
|
||||
**Not in scope:**
|
||||
- Architectural white papers (use `martin-kleppmann` agent)
|
||||
- Code comments (use language-specific linters)
|
||||
- Roadmap planning (use `stemedb-planner` agent)
|
||||
|
||||
## Protocol
|
||||
|
||||
### Phase 1: Understand the Request
|
||||
|
||||
Clarify what type of documentation work is needed:
|
||||
|
||||
| Request Type | Action |
|
||||
|--------------|--------|
|
||||
| "Update docs for [feature]" | Add/update specific content |
|
||||
| "Clean up docs" | Full audit + surgical edits |
|
||||
| "Examples don't work" | Test and fix examples |
|
||||
| "Add guide for [audience]" | Create new guide |
|
||||
| "Docs are out of date" | Find and update stale content |
|
||||
|
||||
**Decision Point:** Before proceeding, state which type this is and what success looks like.
|
||||
|
||||
### Phase 2: Survey Current State
|
||||
|
||||
For audits or broad updates:
|
||||
|
||||
```bash
|
||||
# List all docs
|
||||
find applications/aphoria/docs -name "*.md" | sort
|
||||
|
||||
# Check sizes
|
||||
wc -l applications/aphoria/README.md applications/aphoria/docs/**/*.md
|
||||
|
||||
# Find old terminology
|
||||
grep -r "ExtractedClaim\|old_command\|deprecated_flag" applications/aphoria/docs/
|
||||
|
||||
# Find stale dates
|
||||
grep -r "2024\|2025\|as of" applications/aphoria/docs/ --include="*.md" | grep -v "copyright\|example"
|
||||
|
||||
# Find TODOs
|
||||
grep -r "TODO\|FIXME\|XXX" applications/aphoria/docs/
|
||||
|
||||
# Check for duplicate content
|
||||
grep -r "what is a claim" applications/aphoria/docs/ -i
|
||||
grep -r "observations vs claims" applications/aphoria/docs/ -i
|
||||
```
|
||||
|
||||
**Output:** List of files with line counts and identified issues.
|
||||
|
||||
### Phase 3: Categorize Files
|
||||
|
||||
Tag each doc by purpose:
|
||||
|
||||
| Category | Purpose | Location | Action |
|
||||
|----------|---------|----------|--------|
|
||||
| **Quickstart** | Get scanning in 2 min | README.md | Keep lean, examples only |
|
||||
| **User Guides** | Audience-specific workflows | guides/ | Keep updated, consolidate duplicates |
|
||||
| **Reference** | Complete command catalog | cli-reference.md | Keep comprehensive, test examples |
|
||||
| **Deep Dives** | Single feature explained | comparison-modes.md | Keep focused, one topic only |
|
||||
| **Contributor** | For maintainers | architecture/ | Keep if current, archive if stale |
|
||||
| **Status** | Implementation progress | vision-gaps.md | Update regularly or delete |
|
||||
| **Planning** | Future features | planning/ | Move to roadmap when shipped |
|
||||
|
||||
**Decision Point:** Before editing, state which category each affected file falls into and whether it should exist.
|
||||
|
||||
### Phase 4: Step Back - The Deletion Check
|
||||
|
||||
Before adding or updating ANY content, ask these adversarial questions:
|
||||
|
||||
#### 1. The Necessity Question
|
||||
> "Does this information actually need to exist?"
|
||||
|
||||
- Is this planning for an unbuilt feature? → Move to roadmap
|
||||
- Is this an architectural analysis for a past decision? → Archive it
|
||||
- Is this explaining something obvious? → Delete it
|
||||
- Is this duplicated elsewhere? → Link instead
|
||||
|
||||
#### 2. The Audience Question
|
||||
> "Who reads this and when?"
|
||||
|
||||
- Solo developer in their first 5 minutes? → README only
|
||||
- Enterprise team planning a pilot? → Dedicated guide
|
||||
- Contributor debugging extractors? → Architecture doc
|
||||
- Nobody? → Delete it
|
||||
|
||||
#### 3. The Example Question
|
||||
> "Can I show this instead of explaining it?"
|
||||
|
||||
- If yes → Replace explanation with working example
|
||||
- If no → Keep explanation but make it shorter
|
||||
|
||||
#### 4. The Freshness Question
|
||||
> "Will this content rot?"
|
||||
|
||||
- Does it reference specific dates? → Remove or version-scope them
|
||||
- Does it describe "current" behavior that will change? → Make it version-specific
|
||||
- Does it use deprecated terminology? → Update now
|
||||
|
||||
**After step back:**
|
||||
- List items to DELETE (with reason)
|
||||
- List items to CONSOLIDATE (source + destination)
|
||||
- List items to UPDATE (what's wrong)
|
||||
- List items to CREATE (only if genuinely missing)
|
||||
|
||||
### Phase 5: Execute Surgical Edits
|
||||
|
||||
Based on step back decisions:
|
||||
|
||||
#### 5A: Deletions
|
||||
```bash
|
||||
# Remove outdated sections
|
||||
# Example: vision-gaps.md line 420-450 describes a bug that's fixed
|
||||
```
|
||||
|
||||
Delete ruthlessly:
|
||||
- Planning docs for shipped features
|
||||
- Architectural analyses for completed decisions
|
||||
- Duplicate explanations
|
||||
- Examples that don't work
|
||||
- Obvious explanations
|
||||
|
||||
#### 5B: Consolidations
|
||||
|
||||
Pattern: ONE canonical source, links elsewhere
|
||||
|
||||
**Before:**
|
||||
```markdown
|
||||
# README.md
|
||||
A claim is a human-authored rule...
|
||||
|
||||
# cli-reference.md
|
||||
Claims are assertions about code...
|
||||
|
||||
# vision-gaps.md
|
||||
A claim (unlike observations) is...
|
||||
```
|
||||
|
||||
**After:**
|
||||
```markdown
|
||||
# README.md (canonical)
|
||||
## What Are Claims?
|
||||
A claim is a human-authored rule with provenance...
|
||||
|
||||
# cli-reference.md
|
||||
See [README: Claims](../README.md#what-are-claims).
|
||||
|
||||
Commands:
|
||||
- aphoria claims create
|
||||
|
||||
# vision-gaps.md
|
||||
Claims (see [README](../README.md#what-are-claims)) are now implemented...
|
||||
```
|
||||
|
||||
#### 5C: Updates
|
||||
|
||||
Update in this priority order:
|
||||
|
||||
1. **Terminology** - Find/replace old terms
|
||||
```bash
|
||||
# Update ExtractedClaim → Observation everywhere
|
||||
grep -rl "ExtractedClaim" applications/aphoria/docs/ | xargs sed -i 's/ExtractedClaim/Observation/g'
|
||||
```
|
||||
|
||||
2. **Examples** - Fix to match current CLI
|
||||
```bash
|
||||
# Test each bash block
|
||||
aphoria scan --verbose # Does this flag exist?
|
||||
# If not, update to --show-observations
|
||||
```
|
||||
|
||||
3. **Dates** - Remove or scope them
|
||||
```bash
|
||||
# "As of 2026-02-06" → Just state the current behavior
|
||||
# "In Q1 2025" → Delete or move to historical context
|
||||
```
|
||||
|
||||
4. **Cross-links** - Verify they resolve
|
||||
```bash
|
||||
grep -r '\[.*\](.*\.md)' applications/aphoria/docs/ | # extract and verify
|
||||
```
|
||||
|
||||
#### 5D: Additions (Last Resort)
|
||||
|
||||
Only create new content if:
|
||||
- Feature exists but has NO documentation
|
||||
- Audience exists (solo dev, enterprise) but has NO guide
|
||||
- Concept is complex and NOT explained anywhere
|
||||
|
||||
**New Guide Checklist:**
|
||||
- [ ] Audience identified (who reads this?)
|
||||
- [ ] Success criteria (what can they do after?)
|
||||
- [ ] Examples first (show before telling)
|
||||
- [ ] Links to reference docs (don't duplicate)
|
||||
- [ ] Tested (every example works)
|
||||
|
||||
### Phase 6: Verify Quality
|
||||
|
||||
Before committing changes:
|
||||
|
||||
#### 6A: Test Examples
|
||||
```bash
|
||||
# Extract and run every bash block
|
||||
grep -A10 '```bash' applications/aphoria/docs/**/*.md | sed '/```/d' > /tmp/examples.sh
|
||||
bash -n /tmp/examples.sh # Syntax check
|
||||
# Then manually test critical ones
|
||||
```
|
||||
|
||||
#### 6B: Check Cross-Links
|
||||
```bash
|
||||
# Extract all markdown links
|
||||
grep -r '\[.*\](.*\.md[^)]*)' applications/aphoria/docs/ -o | sort -u
|
||||
|
||||
# Verify each file exists
|
||||
# (script this if you have many links)
|
||||
```
|
||||
|
||||
#### 6C: Verify Terminology
|
||||
```bash
|
||||
# Should find ZERO old terms
|
||||
! grep -r "ExtractedClaim" applications/aphoria/docs/
|
||||
! grep -r "old_command_name" applications/aphoria/docs/
|
||||
```
|
||||
|
||||
#### 6D: Audit for Duplication
|
||||
```bash
|
||||
# Check key concepts appear in only ONE canonical place
|
||||
grep -r "what is a claim" applications/aphoria/docs/ -i
|
||||
# Should find: 1 definition in README, N links to it
|
||||
```
|
||||
|
||||
## Do
|
||||
|
||||
1. **Delete before adding** - Remove outdated content first
|
||||
2. **Test every bash example** - If it doesn't work, fix or delete it
|
||||
3. **Consolidate duplicates** - One canonical source, links everywhere else
|
||||
4. **Update terminology** - Old terms (ExtractedClaim) must be replaced everywhere
|
||||
5. **Remove dates** - "As of 2026-02-06" creates maintenance burden
|
||||
6. **Match CLI output exactly** - If scan shows "BLOCK", docs show "BLOCK"
|
||||
7. **Separate audiences** - Solo dev guide ≠ enterprise guide ≠ contributor guide
|
||||
8. **Verify cross-links** - Every `[link](path)` must resolve
|
||||
9. **Archive planning docs** - Features shipped? Move planning doc to roadmap
|
||||
10. **Use examples first** - Show working code before explaining
|
||||
|
||||
## Do Not
|
||||
|
||||
1. **Repeat yourself** - If it's in README, link from elsewhere
|
||||
2. **Mix planning with user docs** - "Future features" belong in roadmap
|
||||
3. **Use vague examples** - Concrete commands only: `aphoria scan .` not "run the scan"
|
||||
4. **Leave old terminology** - ExtractedClaim, deprecated flags, old commands
|
||||
5. **Write without testing** - Every example must work
|
||||
6. **Explain obvious things** - If flag is `--exit-code`, don't explain "this flag causes exit code"
|
||||
7. **Add dates casually** - Dates make docs rot; remove unless critical
|
||||
8. **Create without checking** - Search for existing content first
|
||||
9. **Duplicate explanations** - Consolidate to ONE place, link from others
|
||||
10. **Ignore architecture docs** - They exist; keep them updated or delete them
|
||||
|
||||
## Decision Points
|
||||
|
||||
**Before creating a new file:** Stop. Can this be a section in an existing file? State which file it would extend and why it can't be a section.
|
||||
|
||||
**Before adding an example:** Stop. Will you test this example before committing? If not, don't add it.
|
||||
|
||||
**Before adding an explanation:** Stop. Can you show an example instead? Examples > explanations.
|
||||
|
||||
**Before adding a date:** Stop. Will this date make content stale in 3 months? Remove it or make it version-specific.
|
||||
|
||||
**Before duplicating content:** Stop. Where is the canonical source? Link to it instead.
|
||||
|
||||
## Constraints
|
||||
|
||||
- NEVER commit untested examples
|
||||
- NEVER duplicate content (link to canonical source instead)
|
||||
- NEVER leave old terminology (ExtractedClaim, deprecated commands)
|
||||
- NEVER mix user docs with planning docs
|
||||
- NEVER add dates without version context
|
||||
- ALWAYS test bash examples before committing
|
||||
- ALWAYS consolidate redundant explanations
|
||||
- ALWAYS remove planning docs after features ship
|
||||
- ALWAYS match CLI output exactly
|
||||
- ALWAYS verify cross-links resolve
|
||||
|
||||
## File Structure Reference
|
||||
|
||||
Current Aphoria documentation structure:
|
||||
|
||||
```
|
||||
applications/aphoria/
|
||||
├── README.md # 2-minute quickstart, key concepts
|
||||
│ # Target: 200-400 lines, examples-heavy
|
||||
│
|
||||
├── docs/
|
||||
│ ├── cli-reference.md # Complete command reference
|
||||
│ │ # Target: Comprehensive but organized
|
||||
│ │
|
||||
│ ├── comparison-modes.md # Deep dive: single feature
|
||||
│ │ # Pattern: One topic, exhaustive
|
||||
│ │
|
||||
│ ├── vision-gaps.md # Implementation status
|
||||
│ │ # Keep current or delete if stale
|
||||
│ │
|
||||
│ ├── guides/
|
||||
│ │ ├── README.md # Guide hub, navigation
|
||||
│ │ ├── solo-developer-guide.md
|
||||
│ │ ├── enterprise-pilot-guide.md
|
||||
│ │ ├── enterprise-quick-start.md
|
||||
│ │ ├── the-first-scan.md
|
||||
│ │ └── [audience]-guide.md # Audience-specific workflows
|
||||
│ │
|
||||
│ ├── architecture/ # For contributors
|
||||
│ │ ├── README.md
|
||||
│ │ └── [topic].md # Keep if current, archive if stale
|
||||
│ │
|
||||
│ ├── planning/ # Future features
|
||||
│ │ └── [feature].md # DELETE when feature ships
|
||||
│ │
|
||||
│ └── llm-optimization/ # LLM eval workflow
|
||||
│ └── [baseline|research]/ # Keep for aphoria-llm-optimization skill
|
||||
```
|
||||
|
||||
**Deletion Targets:**
|
||||
- `planning/*.md` - After features ship, move to roadmap or delete
|
||||
- `gap-analysis-*.md` - If older than 3 months, archive or delete
|
||||
- Sections with "Phase X: Future Feature" - Move to roadmap when shipped
|
||||
- Architecture analysis docs - Archive when decision is made
|
||||
|
||||
## Output Format
|
||||
|
||||
When completing doc work, produce:
|
||||
|
||||
### For Audits
|
||||
|
||||
```markdown
|
||||
## Documentation Audit: [Date]
|
||||
|
||||
### Scope
|
||||
- Files analyzed: X files, Y total lines
|
||||
- Focus: [audit type - full audit, feature update, cleanup]
|
||||
|
||||
### Issues Found
|
||||
|
||||
**1. Redundancy**
|
||||
- Concept: "What is a claim"
|
||||
- Found in: README.md, cli-reference.md, vision-gaps.md
|
||||
- Fix: Keep README version (lines 95-110), replace others with links
|
||||
|
||||
**2. Stale Content**
|
||||
- File: `planning/ingest-best-practices.md`
|
||||
- Issue: Describes unbuilt feature
|
||||
- Fix: Delete (feature not on roadmap)
|
||||
|
||||
**3. Old Terminology**
|
||||
- Files: 7 files use "ExtractedClaim"
|
||||
- Fix: Find/replace → "Observation"
|
||||
|
||||
**4. Broken Examples**
|
||||
- File: `guides/the-first-scan.md` line 42
|
||||
- Issue: Uses `--verbose` flag that doesn't exist
|
||||
- Fix: Update to `--show-observations`
|
||||
|
||||
### Changes Made
|
||||
|
||||
**Deleted:**
|
||||
- `planning/ingest-best-practices.md` - Feature not shipping
|
||||
- `vision-gaps.md` lines 420-450 - Bug report for fixed issue
|
||||
- 3 duplicate "what is a claim" explanations
|
||||
|
||||
**Consolidated:**
|
||||
- "Claims vs Observations" → Canonical in README.md
|
||||
- Added links from cli-reference.md, vision-gaps.md
|
||||
|
||||
**Updated:**
|
||||
- Replaced "ExtractedClaim" → "Observation" in 7 files
|
||||
- Fixed 4 broken examples to match current CLI
|
||||
- Removed 8 instances of "as of [date]"
|
||||
|
||||
**Added:**
|
||||
- Git commit tracking section to README.md (new feature)
|
||||
- Ignore system documentation to CLI reference
|
||||
|
||||
### Verification
|
||||
|
||||
- ✅ All examples tested and working
|
||||
- ✅ All cross-links verified
|
||||
- ✅ No old terminology found
|
||||
- ✅ No duplicate explanations
|
||||
- ✅ Contributor docs current
|
||||
```
|
||||
|
||||
### For Updates
|
||||
|
||||
```markdown
|
||||
## Documentation Update: [Feature/Fix]
|
||||
|
||||
### Changed Files
|
||||
- `README.md` - Added git commit tracking section
|
||||
- `cli-reference.md` - Added "Git Integration" section
|
||||
- `comparison-modes.md` - Updated Contains/NotContains examples
|
||||
|
||||
### Examples Added
|
||||
All examples tested:
|
||||
```bash
|
||||
aphoria claims create --id test-001 ... # ✓ Works
|
||||
aphoria verify run --category safety # ✓ Works
|
||||
```
|
||||
|
||||
### Cross-References Updated
|
||||
- README → cli-reference (git integration)
|
||||
- comparison-modes ← cli-reference (detailed guide)
|
||||
```
|
||||
|
||||
## Priority Targets (Current Aphoria Docs)
|
||||
|
||||
Based on survey of ~14,700 lines across 35 files:
|
||||
|
||||
### 1. vision-gaps.md (671 lines)
|
||||
**Issue:** Doing three jobs - status, architecture, vision
|
||||
**Fix:**
|
||||
- Extract "Implementation Status" → Move to roadmap
|
||||
- Keep "Current Architecture" → Consolidate with architecture/README.md
|
||||
- Delete "Future Vision" → Move to roadmap or delete
|
||||
|
||||
### 2. planning/ directory (42KB)
|
||||
**Issue:** Planning docs for unbuilt features mixed with user docs
|
||||
**Fix:**
|
||||
- `ingest-best-practices.md` - Delete or move to roadmap
|
||||
- `enriched-corpus-patterns.md` - Delete or move to roadmap
|
||||
- General rule: Planning docs should be in roadmap.md, not docs/
|
||||
|
||||
### 3. Old Terminology (7 files)
|
||||
**Issue:** "ExtractedClaim" still appears despite rename to "Observation"
|
||||
**Files:**
|
||||
- architecture/enterprise-validation.md
|
||||
- architecture/llm-eval-implementation.md
|
||||
- architecture/llm-prompt-evaluation.md
|
||||
- architecture/policy-alias-implementation.md
|
||||
- architecture/README.md
|
||||
- llm-optimization/playbook.md
|
||||
- planning/ingest-best-practices-docs.md
|
||||
|
||||
**Fix:** Find/replace globally
|
||||
|
||||
### 4. gap-analysis-institutional-knowledge.md (17KB)
|
||||
**Issue:** Large planning doc, most content is future vision
|
||||
**Fix:** Move to roadmap or delete; if keeping, radically shorten
|
||||
|
||||
### 5. Duplicate "What is a claim" (4+ files)
|
||||
**Issue:** Same concept explained differently in multiple places
|
||||
**Fix:**
|
||||
- Canonical: README.md (keep the best version)
|
||||
- Others: Replace with link to README
|
||||
|
||||
## Examples
|
||||
|
||||
### Example 1: Consolidating Duplicates
|
||||
|
||||
**Before:**
|
||||
|
||||
`README.md`:
|
||||
```markdown
|
||||
## Claims
|
||||
A claim is a human-authored statement...
|
||||
```
|
||||
|
||||
`cli-reference.md`:
|
||||
```markdown
|
||||
### Claims Management
|
||||
Claims are assertions about your codebase with provenance...
|
||||
```
|
||||
|
||||
`vision-gaps.md`:
|
||||
```markdown
|
||||
## What a Real Claim Looks Like
|
||||
A claim (unlike an observation) is a rule...
|
||||
```
|
||||
|
||||
**After:**
|
||||
|
||||
`README.md` (canonical):
|
||||
```markdown
|
||||
## Key Concepts: Observations vs Claims
|
||||
|
||||
| Type | What | Who Creates | Example |
|
||||
|------|------|-------------|---------|
|
||||
| Observation | Pattern match | Extractors | `imports/tokio: true` |
|
||||
| Claim | Rule with provenance | Humans | "Core MUST NOT import tokio..." |
|
||||
|
||||
A claim is a human-authored rule with:
|
||||
- Provenance (where it came from)
|
||||
- Invariant (what must stay true)
|
||||
- Consequence (what breaks if violated)
|
||||
```
|
||||
|
||||
`cli-reference.md`:
|
||||
```markdown
|
||||
### Claims Management
|
||||
|
||||
See [README: Claims](../README.md#key-concepts-observations-vs-claims) for the full explanation.
|
||||
|
||||
Commands:
|
||||
- `aphoria claims create` - Author new claim
|
||||
```
|
||||
|
||||
`vision-gaps.md`:
|
||||
```markdown
|
||||
## Implementation Status
|
||||
|
||||
Claims (see [README](../README.md#key-concepts-observations-vs-claims)) are fully implemented:
|
||||
- Storage: `.aphoria/claims.toml`
|
||||
- CLI: create/list/update/supersede/deprecate
|
||||
```
|
||||
|
||||
### Example 2: Removing Planning Docs
|
||||
|
||||
**Before:**
|
||||
|
||||
`docs/planning/ingest-best-practices.md` (18KB):
|
||||
```markdown
|
||||
# Vision: Documentation That Enforces Itself
|
||||
|
||||
Run: aphoria ingest-guide architecture.md # Future feature!
|
||||
```
|
||||
|
||||
**After:**
|
||||
|
||||
File deleted. If feature is planned, add to roadmap:
|
||||
|
||||
`roadmap.md`:
|
||||
```markdown
|
||||
## Phase 11: Document Ingestion (Future)
|
||||
Parse architecture guides and auto-generate claims.
|
||||
Status: Not started
|
||||
```
|
||||
|
||||
### Example 3: Fixing Broken Examples
|
||||
|
||||
**Before:**
|
||||
|
||||
`guides/the-first-scan.md`:
|
||||
```bash
|
||||
aphoria scan --verbose
|
||||
# Shows detailed output
|
||||
```
|
||||
|
||||
(Flag doesn't exist, command fails)
|
||||
|
||||
**After:**
|
||||
|
||||
`guides/the-first-scan.md`:
|
||||
```bash
|
||||
aphoria scan --show-observations
|
||||
# Shows all observations, not just conflicts
|
||||
|
||||
# Example output:
|
||||
# PASS code://rust/myapp/tls/enabled = true
|
||||
# BLOCK code://rust/myapp/tls/cert_verification = false
|
||||
```
|
||||
|
||||
(Tested, works, includes actual output)
|
||||
|
||||
## Integration with Other Skills/Agents
|
||||
|
||||
- **Use `aphoria-docs` agent** - For actually doing the work (audits, updates, consolidations)
|
||||
- **Use `aphoria-dev` skill** - When docs need code changes to match
|
||||
- **Use `martin-kleppmann` agent** - For architectural white papers (separate from user docs)
|
||||
- **Use `stemedb-planner` agent** - When planning docs should move to roadmap
|
||||
|
||||
This skill orchestrates; the agent executes.
|
||||
@ -2,13 +2,19 @@
|
||||
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Checkbox } from "@/components/ui/checkbox";
|
||||
import { X, Search } from "lucide-react";
|
||||
|
||||
interface CorpusFiltersProps {
|
||||
subjectPrefix: string;
|
||||
minProjects: number;
|
||||
filterCategory: string;
|
||||
hideNoise: boolean;
|
||||
availableCategories: string[];
|
||||
onSubjectPrefixChange: (value: string) => void;
|
||||
onMinProjectsChange: (value: number) => void;
|
||||
onFilterCategoryChange: (value: string) => void;
|
||||
onHideNoiseChange: (value: boolean) => void;
|
||||
onSubmit: () => void;
|
||||
onClear: () => void;
|
||||
totalCount: number;
|
||||
@ -20,8 +26,13 @@ interface CorpusFiltersProps {
|
||||
export function CorpusFilters({
|
||||
subjectPrefix,
|
||||
minProjects,
|
||||
filterCategory,
|
||||
hideNoise,
|
||||
availableCategories,
|
||||
onSubjectPrefixChange,
|
||||
onMinProjectsChange,
|
||||
onFilterCategoryChange,
|
||||
onHideNoiseChange,
|
||||
onSubmit,
|
||||
onClear,
|
||||
totalCount,
|
||||
@ -69,6 +80,40 @@ export function CorpusFilters({
|
||||
/>
|
||||
</div>
|
||||
|
||||
{/* Category Filter */}
|
||||
<div className="flex flex-col gap-2">
|
||||
<label htmlFor="category-filter" className="text-sm font-medium">
|
||||
Category
|
||||
</label>
|
||||
<select
|
||||
id="category-filter"
|
||||
value={filterCategory}
|
||||
onChange={(e) => onFilterCategoryChange(e.target.value)}
|
||||
className="h-10 px-3 py-2 text-sm rounded-md border border-input bg-background"
|
||||
disabled={isLoading}
|
||||
>
|
||||
<option value="all">All Categories</option>
|
||||
{availableCategories.map((cat) => (
|
||||
<option key={cat} value={cat}>
|
||||
{cat}
|
||||
</option>
|
||||
))}
|
||||
</select>
|
||||
</div>
|
||||
|
||||
{/* Hide Noise Toggle */}
|
||||
<div className="flex items-center gap-2 h-10">
|
||||
<Checkbox
|
||||
id="hide-noise"
|
||||
checked={hideNoise}
|
||||
onCheckedChange={onHideNoiseChange}
|
||||
disabled={isLoading}
|
||||
/>
|
||||
<label htmlFor="hide-noise" className="text-sm font-medium cursor-pointer">
|
||||
Hide noise
|
||||
</label>
|
||||
</div>
|
||||
|
||||
{/* Submit Button */}
|
||||
<Button type="submit" disabled={isLoading}>
|
||||
<Search className="h-4 w-4 mr-2" />
|
||||
|
||||
@ -1,9 +1,10 @@
|
||||
"use client";
|
||||
|
||||
import { useState, useCallback, useEffect } from "react";
|
||||
import { useState, useCallback, useEffect, useMemo } from "react";
|
||||
import {
|
||||
StemeDBClient,
|
||||
type GetPatternsResponse,
|
||||
type PatternDto,
|
||||
ApiError,
|
||||
} from "@/lib/api";
|
||||
import type { PanelState } from "@/lib/types";
|
||||
@ -27,6 +28,10 @@ export function CorpusPanel() {
|
||||
const [searchPrefix, setSearchPrefix] = useState("");
|
||||
const [searchMinProjects, setSearchMinProjects] = useState(DEFAULT_MIN_PROJECTS);
|
||||
|
||||
// Client-side filter state
|
||||
const [filterCategory, setFilterCategory] = useState<string>("all");
|
||||
const [hideNoise, setHideNoise] = useState<boolean>(false);
|
||||
|
||||
const fetchData = useCallback(async () => {
|
||||
setState({ status: "loading" });
|
||||
try {
|
||||
@ -73,12 +78,44 @@ export function CorpusPanel() {
|
||||
setInputMinProjects(DEFAULT_MIN_PROJECTS);
|
||||
setSearchPrefix("");
|
||||
setSearchMinProjects(DEFAULT_MIN_PROJECTS);
|
||||
setFilterCategory("all");
|
||||
setHideNoise(false);
|
||||
}, []);
|
||||
|
||||
// Patterns from successful state (filtering done server-side)
|
||||
const patterns = state.status === "success" ? state.data.patterns : [];
|
||||
// Get raw patterns from server
|
||||
const rawPatterns = state.status === "success" ? state.data.patterns : [];
|
||||
|
||||
const hasActiveFilter = searchPrefix !== "" || searchMinProjects > DEFAULT_MIN_PROJECTS;
|
||||
// Extract available categories from patterns
|
||||
const availableCategories = useMemo(() => {
|
||||
const categories = new Set<string>();
|
||||
rawPatterns.forEach((p) => {
|
||||
if (p.category) {
|
||||
categories.add(p.category);
|
||||
}
|
||||
});
|
||||
return Array.from(categories).sort();
|
||||
}, [rawPatterns]);
|
||||
|
||||
// Apply client-side filters
|
||||
const patterns = useMemo(() => {
|
||||
return rawPatterns.filter((p: PatternDto) => {
|
||||
// Category filter
|
||||
if (filterCategory !== "all" && p.category !== filterCategory) {
|
||||
return false;
|
||||
}
|
||||
// Hide noise filter
|
||||
if (hideNoise && p.verdict === "noise") {
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
});
|
||||
}, [rawPatterns, filterCategory, hideNoise]);
|
||||
|
||||
const hasActiveFilter =
|
||||
searchPrefix !== "" ||
|
||||
searchMinProjects > DEFAULT_MIN_PROJECTS ||
|
||||
filterCategory !== "all" ||
|
||||
hideNoise;
|
||||
|
||||
return (
|
||||
<div className="space-y-6">
|
||||
@ -100,8 +137,13 @@ export function CorpusPanel() {
|
||||
<CorpusFilters
|
||||
subjectPrefix={inputPrefix}
|
||||
minProjects={inputMinProjects}
|
||||
filterCategory={filterCategory}
|
||||
hideNoise={hideNoise}
|
||||
availableCategories={availableCategories}
|
||||
onSubjectPrefixChange={setInputPrefix}
|
||||
onMinProjectsChange={setInputMinProjects}
|
||||
onFilterCategoryChange={setFilterCategory}
|
||||
onHideNoiseChange={setHideNoise}
|
||||
onSubmit={handleSubmit}
|
||||
onClear={handleClear}
|
||||
totalCount={state.status === "success" ? state.data.total_matching : 0}
|
||||
|
||||
@ -5,6 +5,8 @@ import type { PatternDto } from "@/lib/api";
|
||||
import { formatRelativeTime, extractDomain, extractConcept } from "./constants";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import { Users, Clock, Eye } from "lucide-react";
|
||||
import { EnrichmentBadge } from "./enrichment-badge";
|
||||
import { VerdictBadge } from "./verdict-badge";
|
||||
|
||||
interface CorpusRowProps {
|
||||
pattern: PatternDto;
|
||||
@ -42,6 +44,14 @@ export function CorpusRow({ pattern, className }: CorpusRowProps) {
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Enrichment badges */}
|
||||
{(pattern.category || pattern.verdict) && (
|
||||
<div className="flex items-center gap-2 mb-3">
|
||||
{pattern.category && <EnrichmentBadge category={pattern.category} />}
|
||||
{pattern.verdict && <VerdictBadge verdict={pattern.verdict} />}
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Value */}
|
||||
<div className="mb-4">
|
||||
<code className="text-sm bg-muted px-2 py-1 rounded font-mono break-all">
|
||||
@ -49,6 +59,16 @@ export function CorpusRow({ pattern, className }: CorpusRowProps) {
|
||||
</code>
|
||||
</div>
|
||||
|
||||
{/* Explanation */}
|
||||
{pattern.explanation && (
|
||||
<div className="mb-4 text-sm text-muted-foreground">
|
||||
<p>{pattern.explanation}</p>
|
||||
{pattern.authority_source && (
|
||||
<p className="text-xs mt-1">Authority: {pattern.authority_source}</p>
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Stats */}
|
||||
<div className="flex flex-wrap items-center gap-4 text-xs text-muted-foreground">
|
||||
<div className="flex items-center gap-1">
|
||||
|
||||
@ -0,0 +1,35 @@
|
||||
"use client";
|
||||
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
interface EnrichmentBadgeProps {
|
||||
category: string;
|
||||
className?: string;
|
||||
size?: "sm" | "xs";
|
||||
}
|
||||
|
||||
const categoryColors: Record<string, string> = {
|
||||
security: "bg-red-500/20 text-red-700 dark:text-red-300",
|
||||
architecture: "bg-blue-500/20 text-blue-700 dark:text-blue-300",
|
||||
performance: "bg-emerald-500/20 text-emerald-700 dark:text-emerald-300",
|
||||
compliance: "bg-purple-500/20 text-purple-700 dark:text-purple-300",
|
||||
configuration: "bg-amber-500/20 text-amber-700 dark:text-amber-300",
|
||||
};
|
||||
|
||||
export function EnrichmentBadge({ category, className, size = "xs" }: EnrichmentBadgeProps) {
|
||||
const color = categoryColors[category.toLowerCase()] || "bg-slate-500/20 text-slate-700 dark:text-slate-300";
|
||||
const sizeClass = size === "xs" ? "text-[10px] px-1.5 py-0.5" : "text-xs px-2.5 py-0.5";
|
||||
|
||||
return (
|
||||
<span
|
||||
className={cn(
|
||||
"inline-flex items-center rounded font-medium",
|
||||
color,
|
||||
sizeClass,
|
||||
className
|
||||
)}
|
||||
>
|
||||
{category}
|
||||
</span>
|
||||
);
|
||||
}
|
||||
@ -0,0 +1,56 @@
|
||||
"use client";
|
||||
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
interface VerdictBadgeProps {
|
||||
verdict: string;
|
||||
className?: string;
|
||||
size?: "sm" | "xs";
|
||||
}
|
||||
|
||||
const verdictConfig: Record<string, { color: string; icon: string }> = {
|
||||
deprecated: {
|
||||
color: "bg-red-500/20 text-red-700 dark:text-red-300",
|
||||
icon: "⚠",
|
||||
},
|
||||
recommended: {
|
||||
color: "bg-emerald-500/20 text-emerald-700 dark:text-emerald-300",
|
||||
icon: "✓",
|
||||
},
|
||||
emerging: {
|
||||
color: "bg-blue-500/20 text-blue-700 dark:text-blue-300",
|
||||
icon: "●",
|
||||
},
|
||||
common: {
|
||||
color: "bg-slate-500/20 text-slate-700 dark:text-slate-300",
|
||||
icon: "○",
|
||||
},
|
||||
noise: {
|
||||
color: "bg-amber-500/20 text-amber-700 dark:text-amber-300",
|
||||
icon: "~",
|
||||
},
|
||||
};
|
||||
|
||||
export function VerdictBadge({ verdict, className, size = "xs" }: VerdictBadgeProps) {
|
||||
const config = verdictConfig[verdict.toLowerCase()] || {
|
||||
color: "bg-slate-500/20 text-slate-700 dark:text-slate-300",
|
||||
icon: "○",
|
||||
};
|
||||
|
||||
const sizeClass = size === "xs" ? "text-[10px] px-1.5 py-0.5" : "text-xs px-2.5 py-0.5";
|
||||
const iconSize = size === "xs" ? "text-[8px]" : "text-[10px]";
|
||||
|
||||
return (
|
||||
<span
|
||||
className={cn(
|
||||
"inline-flex items-center gap-1 rounded font-medium",
|
||||
config.color,
|
||||
sizeClass,
|
||||
className
|
||||
)}
|
||||
>
|
||||
<span className={iconSize}>{config.icon}</span>
|
||||
{verdict}
|
||||
</span>
|
||||
);
|
||||
}
|
||||
@ -0,0 +1,53 @@
|
||||
"use client";
|
||||
|
||||
import * as React from "react";
|
||||
import { Check } from "lucide-react";
|
||||
import { cn } from "@/lib/utils";
|
||||
|
||||
interface CheckboxProps {
|
||||
id?: string;
|
||||
checked?: boolean;
|
||||
onCheckedChange?: (checked: boolean) => void;
|
||||
disabled?: boolean;
|
||||
className?: string;
|
||||
}
|
||||
|
||||
export function Checkbox({
|
||||
id,
|
||||
checked = false,
|
||||
onCheckedChange,
|
||||
disabled = false,
|
||||
className,
|
||||
}: CheckboxProps) {
|
||||
const handleChange = (e: React.ChangeEvent<HTMLInputElement>) => {
|
||||
if (onCheckedChange) {
|
||||
onCheckedChange(e.target.checked);
|
||||
}
|
||||
};
|
||||
|
||||
return (
|
||||
<div className="relative inline-flex items-center">
|
||||
<input
|
||||
id={id}
|
||||
type="checkbox"
|
||||
checked={checked}
|
||||
onChange={handleChange}
|
||||
disabled={disabled}
|
||||
className="peer sr-only"
|
||||
/>
|
||||
<label
|
||||
htmlFor={id}
|
||||
className={cn(
|
||||
"flex h-5 w-5 items-center justify-center rounded border border-input bg-background",
|
||||
"peer-focus-visible:outline-none peer-focus-visible:ring-2 peer-focus-visible:ring-ring peer-focus-visible:ring-offset-2",
|
||||
"peer-disabled:cursor-not-allowed peer-disabled:opacity-50",
|
||||
"cursor-pointer transition-colors",
|
||||
"hover:bg-accent hover:text-accent-foreground",
|
||||
className
|
||||
)}
|
||||
>
|
||||
{checked && <Check className="h-3.5 w-3.5" />}
|
||||
</label>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
@ -256,6 +256,11 @@ export interface PatternDto {
|
||||
observation_count: number;
|
||||
first_seen: number;
|
||||
last_seen: number;
|
||||
// Phase 17 enrichment fields
|
||||
category?: string;
|
||||
verdict?: string;
|
||||
explanation?: string;
|
||||
authority_source?: string;
|
||||
}
|
||||
|
||||
export interface GetPatternsResponse {
|
||||
|
||||
@ -305,6 +305,31 @@ Features:
|
||||
|
||||
---
|
||||
|
||||
## Research & Reference
|
||||
|
||||
### Vision & Architecture
|
||||
| Document | Description |
|
||||
|----------|-------------|
|
||||
| [Vision](vision.md) | Product vision and aspirational architecture |
|
||||
| [Protocol Vision](protocol_vision.md) | Protocol-level design philosophy |
|
||||
| [Vision & Gaps](docs/vision-gaps.md) | Honest assessment of current state vs. vision |
|
||||
| [Architecture Docs](docs/architecture/README.md) | System design, concept matching, extension points |
|
||||
|
||||
### Testing & Validation
|
||||
| Document | Description |
|
||||
|----------|-------------|
|
||||
| [UAT Reports](../../uat/README.md) | User acceptance testing results |
|
||||
| [Phase 6 UAT](../../uat/phase6-uat.md) | Detailed validation of policy workflows |
|
||||
| [Real-World Policy Source UAT](../../uat/2026-02-04-uat-real-world-policy-source.md) | Trust Pack workflow validation |
|
||||
|
||||
### Gap Analysis & Research
|
||||
| Document | Description |
|
||||
|----------|-------------|
|
||||
| [Gap Analysis: Institutional Knowledge](docs/gap-analysis-institutional-knowledge.md) | Analysis of knowledge capture gaps |
|
||||
| [Gap Fixes Summary](docs/gap-fixes-summary.md) | Summary of addressed gaps |
|
||||
|
||||
---
|
||||
|
||||
## What Aphoria Is Not
|
||||
|
||||
- **Not a linter.** Linters check syntax. Aphoria checks decisions against authoritative sources.
|
||||
|
||||
@ -77,7 +77,7 @@ Aphoria is a **code-level truth linter** that validates code against authoritati
|
||||
| `scan.rs` | Main scan orchestrator | Mode dispatch, observation flow |
|
||||
| `walker/` | Project traversal | `mod.rs`, `git.rs`, `path_mapper.rs`, `language.rs` |
|
||||
| `extractors/` | 14 pattern-based claim extractors | `mod.rs`, individual extractors |
|
||||
| `bridge.rs` | ExtractedClaim → Assertion conversion | BLAKE3 hashing, Ed25519 signing |
|
||||
| `bridge.rs` | Observation → Assertion conversion | BLAKE3 hashing, Ed25519 signing |
|
||||
| `episteme/` | Conflict detection core | `ephemeral.rs`, `local.rs`, `concept_index.rs` |
|
||||
| `policy.rs` | Trust Pack management | Load/save/verify signed packs |
|
||||
| `policy_ops.rs` | `bless`, `ack`, `update`, `export/import` | CLI policy operations |
|
||||
|
||||
@ -239,6 +239,91 @@ Deprecated claims are not verified but remain in the file for audit trail.
|
||||
|
||||
---
|
||||
|
||||
### `aphoria claims import`
|
||||
|
||||
Import claims in batch from a TOML file.
|
||||
|
||||
```bash
|
||||
# Preview import (dry-run)
|
||||
aphoria claims import docs/guidelines.toml --dry-run
|
||||
|
||||
# Import with TeamPolicy tier
|
||||
aphoria claims import docs/guidelines.toml \
|
||||
--authority-tier team_policy \
|
||||
--source-guide "hexagonal-arch"
|
||||
|
||||
# Import with merge strategy
|
||||
aphoria claims import docs/guidelines.toml \
|
||||
--merge overwrite # Overwrite existing claims
|
||||
aphoria claims import docs/guidelines.toml \
|
||||
--merge skip_existing # Skip duplicates (default)
|
||||
aphoria claims import docs/guidelines.toml \
|
||||
--merge fail_on_duplicate # Fail if duplicate found
|
||||
```
|
||||
|
||||
**Options:**
|
||||
- `--authority-tier <TIER>` - Override authority tier for all imported claims (team_policy, expert, etc.)
|
||||
- `--source-guide <NAME>` - Track the guideline name for compliance filtering (stored in `.aphoria/ingested_guides.toml`)
|
||||
- `--dry-run` - Preview changes without writing to file
|
||||
- `--merge <STRATEGY>` - Merge strategy: `skip_existing` (default), `overwrite`, `fail_on_duplicate`
|
||||
|
||||
**TOML Format:**
|
||||
```toml
|
||||
[[claim]]
|
||||
id = "hex-arch-http-001"
|
||||
concept_path = "myapp/adapters/http"
|
||||
predicate = "layer"
|
||||
value = "adapter"
|
||||
comparison = "equals"
|
||||
provenance = "Hexagonal Architecture Guidelines"
|
||||
invariant = "HTTP handlers MUST be in adapters layer"
|
||||
consequence = "Business logic leaks into infrastructure"
|
||||
authority_tier = "team_policy"
|
||||
category = "architecture"
|
||||
evidence = ["docs/architecture/hexagonal.md"]
|
||||
created_by = "architecture-team"
|
||||
created_at = "2026-02-08T12:00:00Z"
|
||||
|
||||
[[claim]]
|
||||
id = "hex-arch-domain-imports-001"
|
||||
concept_path = "myapp/domain/imports"
|
||||
predicate = "imported"
|
||||
value = "http"
|
||||
comparison = "absent"
|
||||
provenance = "Hexagonal Architecture Guidelines"
|
||||
invariant = "Domain layer MUST NOT import HTTP adapters"
|
||||
consequence = "Circular dependency breaks architecture"
|
||||
authority_tier = "team_policy"
|
||||
category = "architecture"
|
||||
evidence = ["docs/architecture/hexagonal.md"]
|
||||
created_by = "architecture-team"
|
||||
created_at = "2026-02-08T12:00:00Z"
|
||||
```
|
||||
|
||||
**Guideline Tracking:**
|
||||
|
||||
When you use `--source-guide`, Aphoria tracks the guideline in `.aphoria/ingested_guides.toml`:
|
||||
|
||||
```toml
|
||||
[[guide]]
|
||||
id = "hexagonal-arch"
|
||||
name = "hexagonal-arch"
|
||||
source_path = "docs/guidelines.toml"
|
||||
document_hash = "blake3:abc123..."
|
||||
ingested_at = "2026-02-08T12:00:00Z"
|
||||
claims_count = 26
|
||||
authority_tier = "team_policy"
|
||||
category = "imported"
|
||||
claim_ids = ["hex-arch-http-001", "hex-arch-domain-imports-001", ...]
|
||||
```
|
||||
|
||||
This enables:
|
||||
- Change detection (hash comparison)
|
||||
- Compliance filtering (future: `aphoria scan --check-policy hexagonal-arch`)
|
||||
- Audit trail (who imported what, when)
|
||||
|
||||
---
|
||||
|
||||
## Inline Claim Markers
|
||||
|
||||
### `aphoria claims list-markers`
|
||||
|
||||
135
applications/aphoria/docs/gap-fixes-summary.md
Normal file
135
applications/aphoria/docs/gap-fixes-summary.md
Normal file
@ -0,0 +1,135 @@
|
||||
# Gap Fixes Summary
|
||||
|
||||
## Overview
|
||||
|
||||
This document summarizes the fixes implemented for Gap 1 (Observations Treatment) and Gap 5 (Lineage Enforcement) from the Aphoria gap analysis.
|
||||
|
||||
## Gap 1: Fix Observation Treatment (Confidence-Based Tiers)
|
||||
|
||||
### Problem
|
||||
The persistent scan mode was using `claim_to_assertion()` which assigned Tier 3 (Expert) authority to all observations, regardless of confidence. This gave extractor observations the same weight as human-authored claims.
|
||||
|
||||
### Solution
|
||||
Changed `episteme/local/store.rs` line 36 in `ingest_claims()` from:
|
||||
```rust
|
||||
let assertion = claim_to_assertion(claim, &self.signing_key, timestamp, git_commit.as_deref());
|
||||
```
|
||||
|
||||
To:
|
||||
```rust
|
||||
let assertion = observation_to_assertion(claim, &self.signing_key, timestamp, git_commit.as_deref());
|
||||
```
|
||||
|
||||
### Behavior
|
||||
Observations now get appropriate tier assignment based on confidence:
|
||||
- **Tier 4 (Community, 0.3 weight)**: confidence ≥ 0.9
|
||||
- **Tier 5 (Anecdotal, 0.1 weight)**: confidence < 0.9
|
||||
|
||||
This correctly distinguishes observations (extractor pattern matches) from claims (human-authored rules with provenance).
|
||||
|
||||
### Files Changed
|
||||
- `applications/aphoria/src/episteme/local/store.rs` (1 line)
|
||||
|
||||
### Tests
|
||||
- Existing `bridge::tests::test_observation_to_tier_*` tests validate tier mapping
|
||||
- Existing `episteme::tests::test_ingest_observations_creates_tier4_assertions` validates storage integration
|
||||
- All 1171 aphoria tests pass
|
||||
|
||||
---
|
||||
|
||||
## Gap 5: Enforce Lineage on Supersede (Already Implemented + Enhanced)
|
||||
|
||||
### Status
|
||||
The core auto-deprecation feature was **already implemented** in `ClaimsFile::supersede()` at line 152-168.
|
||||
|
||||
### Enhancement Added
|
||||
Added duplicate validation warning when creating a new claim that conflicts with an existing active claim.
|
||||
|
||||
### Implementation
|
||||
Modified `ClaimsFile::add()` in `claims_file.rs` to check for duplicate active claims with the same `concept_path` and `predicate`. When detected, prints a warning:
|
||||
|
||||
```
|
||||
⚠️ Warning: Active claim(s) already exist for path/to/concept::predicate
|
||||
- claim-001 (Invariant description)
|
||||
Consider using 'aphoria claims supersede claim-001' instead
|
||||
```
|
||||
|
||||
### Behavior
|
||||
- **Supersede**: Automatically marks old claim as `ClaimStatus::Superseded` when creating superseding claim
|
||||
- **Create**: Warns if creating duplicate active claim (suggests using supersede instead)
|
||||
- **No breaking changes**: Warning is informational only, claim is still added
|
||||
|
||||
### Files Changed
|
||||
- `applications/aphoria/src/claims_file.rs` (add() method enhanced, 2 new tests added)
|
||||
|
||||
### Tests
|
||||
- `test_supersede()` validates auto-deprecation (already existed)
|
||||
- `test_duplicate_active_warning()` validates warning is shown
|
||||
- `test_no_warning_for_deprecated_duplicates()` validates warning only for active claims
|
||||
- All 1171 aphoria tests pass
|
||||
|
||||
---
|
||||
|
||||
## Verification
|
||||
|
||||
### Build & Test
|
||||
```bash
|
||||
# Build
|
||||
cargo build -p aphoria
|
||||
|
||||
# Run all tests
|
||||
cargo test -p aphoria --lib
|
||||
|
||||
# Run clippy
|
||||
cargo clippy -p aphoria --lib -- -D warnings
|
||||
```
|
||||
|
||||
All checks pass with no warnings.
|
||||
|
||||
### Manual Testing
|
||||
1. **Gap 1**: Run `aphoria scan --mode persistent --sync` and verify observations are created with Tier 4/5 (not Tier 3)
|
||||
2. **Gap 5**: Run `aphoria claims supersede old-id --new-id new-id ...` and verify old claim status becomes `superseded`
|
||||
3. **Gap 5**: Run `aphoria claims create` with same concept_path/predicate as existing active claim and verify warning is displayed
|
||||
|
||||
---
|
||||
|
||||
## Impact
|
||||
|
||||
### Gap 1
|
||||
- **Semantic correctness**: Observations are now properly distinguished from claims in authority weight
|
||||
- **Query resolution**: Lens calculations will correctly weight observations lower than authored claims
|
||||
- **Backward compatible**: Existing scans continue to work, just with corrected tier assignment
|
||||
|
||||
### Gap 5
|
||||
- **Lineage enforcement**: Supersession now properly deprecates old claims (already worked)
|
||||
- **User guidance**: Duplicate warnings help users discover supersede feature
|
||||
- **No breaking changes**: All existing workflows continue to work
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- `applications/aphoria/docs/vision-gaps.md` - Original gap analysis
|
||||
- `applications/aphoria/docs/claims-explained.md` - Claim vs observation semantics
|
||||
- `.aphoria/claims.toml` - Example claims with supersession chains
|
||||
- `applications/aphoria/src/bridge.rs` - Tier assignment logic
|
||||
|
||||
---
|
||||
|
||||
## Commit Message
|
||||
|
||||
```
|
||||
fix(aphoria): use confidence-based tiers for observations (Gap 1) + enhance lineage warnings (Gap 5)
|
||||
|
||||
Gap 1: Fix Observation Treatment
|
||||
- Change ingest_claims() to use observation_to_assertion() instead of claim_to_assertion()
|
||||
- Observations now get Tier 4 (≥0.9 confidence) or Tier 5 (<0.9 confidence) instead of Tier 3
|
||||
- Semantically correct: observations (grep results) ≠ claims (human-authored rules)
|
||||
|
||||
Gap 5: Enhance Lineage Enforcement
|
||||
- Add duplicate validation warning when creating claims with same concept_path/predicate
|
||||
- Suggests using 'aphoria claims supersede' instead of creating duplicate actives
|
||||
- Core auto-deprecation already worked (supersede() marks old claim as Superseded)
|
||||
|
||||
All 1171 tests pass. No breaking changes.
|
||||
```
|
||||
333
applications/aphoria/docs/phase-17-summary.md
Normal file
333
applications/aphoria/docs/phase-17-summary.md
Normal file
@ -0,0 +1,333 @@
|
||||
# Phase 17: Pattern Enrichment & Best Practices Infrastructure
|
||||
|
||||
**Status:** ✅ Complete (Backend Only)
|
||||
**Date:** 2026-02-08
|
||||
|
||||
## What Was Built
|
||||
|
||||
This phase implemented **backend infrastructure** for enriched corpus patterns and team guideline ingestion. The features are **fully functional via CLI** but **not yet integrated with the dashboard UI**.
|
||||
|
||||
---
|
||||
|
||||
## 1. Enriched Pattern Metadata
|
||||
|
||||
### The Problem
|
||||
Community patterns showed bare statistics like "md5: true, 347 projects" with no context about whether MD5 is deprecated, recommended, or neutral.
|
||||
|
||||
### The Solution
|
||||
Extractors now provide enrichment metadata:
|
||||
|
||||
```rust
|
||||
pub struct PatternMetadata {
|
||||
pub tail_path: String, // "crypto/hashing/algorithm"
|
||||
pub predicate: String, // "algorithm"
|
||||
pub value: Option<String>, // "md5" (or None for wildcard)
|
||||
pub category: String, // "security"
|
||||
pub verdict: String, // "deprecated"
|
||||
pub explanation: String, // "MD5 is cryptographically broken..."
|
||||
pub authority_source: Option<String>, // "NIST SP 800-131A"
|
||||
}
|
||||
```
|
||||
|
||||
### What Works Now
|
||||
- 10 security extractors provide enrichment metadata
|
||||
- `PatternEnricher` service matches patterns to metadata (exact, wildcard, noise detection)
|
||||
- Data model supports category, verdict, explanation, authority_source
|
||||
|
||||
### What's Missing
|
||||
❌ Dashboard doesn't display this metadata yet
|
||||
❌ No category filter dropdown
|
||||
❌ No "Hide noise" toggle
|
||||
❌ No visual badges for deprecated/recommended
|
||||
|
||||
---
|
||||
|
||||
## 2. TeamPolicy Authority Tier
|
||||
|
||||
### The Problem
|
||||
No authority tier between community observations (tier 4) and expert opinions (tier 3) for team-level architectural guidelines.
|
||||
|
||||
### The Solution
|
||||
New **tier 2.5**: `TeamPolicy`
|
||||
|
||||
- Sits between Observational (tier 2) and Expert (tier 3)
|
||||
- Authority weight: 0.6 (between 0.7 and 0.5)
|
||||
- Decay: 180 days (same as Expert)
|
||||
- Use case: Team architectural guidelines, internal standards
|
||||
|
||||
### What Works Now
|
||||
```bash
|
||||
# Create team policy claim
|
||||
aphoria claims create \
|
||||
--tier team_policy \
|
||||
--id hex-arch-http-001 \
|
||||
--concept-path myapp/adapters/http \
|
||||
--predicate layer \
|
||||
--value adapter \
|
||||
--invariant "HTTP handlers MUST be in adapters layer" \
|
||||
--consequence "Business logic leaks into infrastructure" \
|
||||
--provenance "Architecture team decision 2026-02-08" \
|
||||
--category architecture \
|
||||
--by architecture-team
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Best Practices Import CLI
|
||||
|
||||
### The Problem
|
||||
Teams write extensive architectural guidelines in markdown/PDFs but have no way to automatically enforce them.
|
||||
|
||||
### The Solution
|
||||
Batch import claims from TOML files:
|
||||
|
||||
```bash
|
||||
# Preview import
|
||||
aphoria claims import docs/hexagonal-arch.toml --dry-run
|
||||
|
||||
# Import with tracking
|
||||
aphoria claims import docs/hexagonal-arch.toml \
|
||||
--authority-tier team_policy \
|
||||
--source-guide "hexagonal-arch"
|
||||
```
|
||||
|
||||
### What Works Now
|
||||
- Batch import claims from TOML
|
||||
- Override authority tier for all claims
|
||||
- Merge strategies: `skip_existing`, `overwrite`, `fail_on_duplicate`
|
||||
- Dry-run preview
|
||||
- Guideline tracking in `.aphoria/ingested_guides.toml`
|
||||
|
||||
### Example TOML
|
||||
```toml
|
||||
[[claim]]
|
||||
id = "hex-arch-http-001"
|
||||
concept_path = "myapp/adapters/http"
|
||||
predicate = "layer"
|
||||
value = "adapter"
|
||||
comparison = "equals"
|
||||
provenance = "Hexagonal Architecture Guidelines"
|
||||
invariant = "HTTP handlers MUST be in adapters layer"
|
||||
consequence = "Business logic leaks into infrastructure"
|
||||
authority_tier = "team_policy"
|
||||
category = "architecture"
|
||||
evidence = ["docs/architecture/hexagonal.md"]
|
||||
created_by = "architecture-team"
|
||||
created_at = "2026-02-08T12:00:00Z"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Guideline Tracking
|
||||
|
||||
### The Problem
|
||||
No way to track which guidelines have been imported, detect changes, or filter compliance.
|
||||
|
||||
### The Solution
|
||||
`.aphoria/ingested_guides.toml` tracks imported guidelines:
|
||||
|
||||
```toml
|
||||
[[guide]]
|
||||
id = "hexagonal-arch"
|
||||
name = "Hexagonal Architecture Guidelines"
|
||||
source_path = "docs/hexagonal.md"
|
||||
document_hash = "blake3:abc123..."
|
||||
ingested_at = "2026-02-08T12:00:00Z"
|
||||
claims_count = 26
|
||||
authority_tier = "team_policy"
|
||||
category = "architecture"
|
||||
claim_ids = ["hex-arch-http-001", "hex-arch-domain-imports-001", ...]
|
||||
```
|
||||
|
||||
### What Works Now
|
||||
- Guideline metadata tracked with BLAKE3 hash
|
||||
- Change detection (compare hash to detect doc updates)
|
||||
- Audit trail (who imported what, when)
|
||||
|
||||
### What's Missing
|
||||
❌ `aphoria scan --check-policy <guide-id>` not implemented
|
||||
❌ No re-extraction workflow when source doc changes
|
||||
❌ No compliance dashboard
|
||||
|
||||
---
|
||||
|
||||
## 5. Updated Comparison Modes
|
||||
|
||||
### What Was Added
|
||||
Two new comparison modes for list/substring matching:
|
||||
|
||||
**Contains** - Value must contain substring/element
|
||||
```toml
|
||||
comparison = "contains"
|
||||
value = "Serialize"
|
||||
# Passes: "Clone,Debug,Serialize"
|
||||
# Fails: "Clone,Debug"
|
||||
```
|
||||
|
||||
**NotContains** - Value must NOT contain substring/element
|
||||
```toml
|
||||
comparison = "not_contains"
|
||||
value = "Clone"
|
||||
# Passes: "Debug"
|
||||
# Fails: "Clone,Debug"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 10 Enriched Security Extractors
|
||||
|
||||
| Extractor | Enriched Patterns | Authority Source |
|
||||
|-----------|-------------------|------------------|
|
||||
| `WeakCryptoExtractor` | MD5, SHA1 (deprecated), DES, RC4 | NIST SP 800-131A, RFC 7465 |
|
||||
| `TlsVersionExtractor` | TLS 1.0/1.1 (deprecated), 1.2/1.3 (recommended) | RFC 8996, RFC 8446 |
|
||||
| `TlsVerifyExtractor` | cert_verification: false (insecure) | OWASP |
|
||||
| `JwtConfigExtractor` | algorithm: none (forbidden) | RFC 7519 |
|
||||
| `CorsConfigExtractor` | allow_origin: * (insecure) | OWASP, W3C CORS Spec |
|
||||
| `HardcodedSecretsExtractor` | API keys/passwords (critical) | OWASP A07:2021 |
|
||||
| `SqlInjectionExtractor` | String interpolation (vulnerable) | OWASP A03:2021 |
|
||||
| `CommandInjectionExtractor` | Shell exec (vulnerable) | OWASP A03:2021 |
|
||||
| `PathTraversalExtractor` | User-controlled paths (vulnerable) | OWASP A01:2021 |
|
||||
| `InsecureDeserializationExtractor` | pickle/yaml.load (unsafe) | OWASP A08:2021 |
|
||||
|
||||
---
|
||||
|
||||
## Files Created/Modified
|
||||
|
||||
### New Files
|
||||
- `applications/aphoria/src/corpus/enricher.rs` - Pattern enrichment service
|
||||
- `applications/aphoria/src/types/ingested_guides.rs` - Guideline tracking
|
||||
|
||||
### Modified Files
|
||||
**Core Types:**
|
||||
- `crates/stemedb-core/src/types/source.rs` - TeamPolicy tier
|
||||
- `crates/stemedb-storage/src/pattern_aggregate_store/mod.rs` - Enrichment fields
|
||||
|
||||
**Aphoria:**
|
||||
- `applications/aphoria/src/extractors/traits.rs` - `pattern_metadata()` method
|
||||
- `applications/aphoria/src/types/authored_claim.rs` - Contains/NotContains modes
|
||||
- `applications/aphoria/src/cli/claims.rs` - Import subcommand
|
||||
- `applications/aphoria/src/handlers/claims.rs` - Import handler
|
||||
- 10 extractor files with `pattern_metadata()` implementations
|
||||
|
||||
**API & DTOs:**
|
||||
- `crates/stemedb-api/src/dto/enums.rs` - TeamPolicy DTO
|
||||
- `crates/stemedb-api/src/dto/aphoria/types.rs` - Contains/NotContains DTOs
|
||||
- `crates/stemedb-ontology/src/dto/enums.rs` - TeamPolicy DTO
|
||||
|
||||
---
|
||||
|
||||
## How to Use (CLI)
|
||||
|
||||
### 1. Create a guideline TOML file
|
||||
```bash
|
||||
cat > docs/architecture-guidelines.toml <<EOF
|
||||
[[claim]]
|
||||
id = "no-tokio-in-core"
|
||||
concept_path = "myapp/core/imports/tokio"
|
||||
predicate = "imported"
|
||||
value = "true"
|
||||
comparison = "absent"
|
||||
provenance = "Architecture decision: core must be sync-only"
|
||||
invariant = "Core modules MUST NOT import tokio"
|
||||
consequence = "Creates async runtime coupling, breaks sync library users"
|
||||
authority_tier = "team_policy"
|
||||
category = "architecture"
|
||||
evidence = ["ADR-003"]
|
||||
created_by = "tech-lead"
|
||||
created_at = "2026-02-08T12:00:00Z"
|
||||
EOF
|
||||
```
|
||||
|
||||
### 2. Import the guideline
|
||||
```bash
|
||||
aphoria claims import docs/architecture-guidelines.toml \
|
||||
--source-guide "architecture-2026" \
|
||||
--dry-run
|
||||
```
|
||||
|
||||
### 3. Run verification
|
||||
```bash
|
||||
aphoria scan --persist
|
||||
aphoria verify run
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## What's NOT Done (UI Integration)
|
||||
|
||||
The backend is complete but the **dashboard doesn't display any of this**:
|
||||
|
||||
❌ Category badges (security/architecture/performance)
|
||||
❌ Verdict badges (deprecated/recommended/emerging)
|
||||
❌ Explanation tooltips ("MD5 is deprecated - NIST 2010")
|
||||
❌ Filter by category dropdown
|
||||
❌ "Hide noise" toggle
|
||||
❌ Guideline compliance filtering (`--check-policy` flag)
|
||||
❌ Compliance dashboard showing guideline status
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
### To Make This User-Visible:
|
||||
|
||||
**Option 1: Dashboard Integration** (Frontend work)
|
||||
- Add category/verdict badges to pattern cards
|
||||
- Show explanations in tooltips
|
||||
- Add category filter dropdown
|
||||
- Implement "Hide noise" toggle
|
||||
- Build compliance dashboard
|
||||
|
||||
**Option 2: Enhanced CLI Output** (Backend work)
|
||||
- Show enrichment in `aphoria scan` table output
|
||||
- Add `--show-enrichment` flag
|
||||
- Color-code deprecated patterns (red), recommended (green)
|
||||
- Filter by category: `aphoria scan --category security`
|
||||
|
||||
**Option 3: Policy Filtering** (Backend work)
|
||||
- Implement `aphoria scan --check-policy <guide-id>`
|
||||
- Show only violations of specific guideline
|
||||
- Pre-commit hook support for policy enforcement
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
|
||||
All code compiles and passes existing tests. To verify:
|
||||
|
||||
```bash
|
||||
# Build workspace
|
||||
cargo build --workspace
|
||||
|
||||
# Test aphoria
|
||||
cargo test --package aphoria
|
||||
|
||||
# Try the import command
|
||||
aphoria claims import --help
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Documentation Updated
|
||||
|
||||
- ✅ `roadmap-archive.md` - Added Phase 17
|
||||
- ✅ `roadmap.md` - Updated status table
|
||||
- ✅ `cli-reference.md` - Added `aphoria claims import` documentation
|
||||
- ✅ `comparison-modes.md` - Contains/NotContains already documented
|
||||
- ✅ This summary document
|
||||
|
||||
---
|
||||
|
||||
## Questions?
|
||||
|
||||
**Q: Why can't I see any changes in the UI?**
|
||||
A: This phase implemented backend infrastructure only. The dashboard doesn't consume the enrichment metadata yet.
|
||||
|
||||
**Q: How do I know it works?**
|
||||
A: Use the CLI commands. The `aphoria claims import` command is fully functional.
|
||||
|
||||
**Q: When will this show up in the dashboard?**
|
||||
A: That requires frontend work to integrate the enrichment metadata into the UI components.
|
||||
|
||||
**Q: Is this production-ready?**
|
||||
A: The backend is production-ready. The CLI commands work. The UI integration is not done.
|
||||
@ -1,3 +1,11 @@
|
||||
---
|
||||
created: 2026-02-08
|
||||
last_updated: 2026-02-08
|
||||
status: Planning Document
|
||||
feature: Phase 17+ - Pattern Enrichment
|
||||
timeline: 10-14 days estimated
|
||||
---
|
||||
|
||||
# Enriched Corpus Patterns - Making Community Patterns Actionable
|
||||
|
||||
## Problem Statement
|
||||
|
||||
@ -1,3 +1,11 @@
|
||||
---
|
||||
created: 2026-02-08
|
||||
last_updated: 2026-02-08
|
||||
status: Planning Document
|
||||
feature: Phase 2-3 - LLM-Assisted Document Ingestion
|
||||
timeline: 4 weeks estimated
|
||||
---
|
||||
|
||||
# Ingest Best Practices Documentation - Executable Policy
|
||||
|
||||
## Problem Statement
|
||||
|
||||
@ -70,7 +70,7 @@ sequenceDiagram
|
||||
Registry->>Registry: extractor.extract(segments, content, lang, file)
|
||||
end
|
||||
Registry->>Registry: filter by IgnoreCommentParser
|
||||
Registry-->>Scanner: Vec<ExtractedClaim>
|
||||
Registry-->>Scanner: Vec<Observation>
|
||||
end
|
||||
|
||||
Note over Scanner: Phase 3: CONFLICT DETECTION
|
||||
@ -86,7 +86,7 @@ sequenceDiagram
|
||||
Index-->>Scanner: ConceptIndex
|
||||
|
||||
Scanner->>Conflict: check_conflicts(claims, index, config)
|
||||
loop For each ExtractedClaim
|
||||
loop For each Observation
|
||||
Conflict->>Index: lookup(claim.subject, claim.predicate)
|
||||
Note over Conflict: Tail-path match:<br/>"code://rust/app/tls/cert_verification"<br/>matches "rfc://5246/tls/cert_verification"
|
||||
Conflict->>Conflict: Compare values, compute score
|
||||
@ -182,11 +182,11 @@ sequenceDiagram
|
||||
|
||||
## What We Built (Grounded)
|
||||
|
||||
Aphoria has **42 built-in extractors** (`registry.rs:327` -- `BUILTIN_EXTRACTOR_COUNT: usize = 42`) that scan source code with regex patterns and produce `ExtractedClaim` structs:
|
||||
Aphoria has **42 built-in extractors** (`registry.rs:327` -- `BUILTIN_EXTRACTOR_COUNT: usize = 42`) that scan source code with regex patterns and produce `Observation` structs:
|
||||
|
||||
```rust
|
||||
// types/claim.rs:7-31
|
||||
pub struct ExtractedClaim {
|
||||
pub struct Observation {
|
||||
pub concept_path: String, // e.g., "code://rust/maxwell/hypervisor/lib/imports/firecracker"
|
||||
pub predicate: String, // e.g., "imported"
|
||||
pub value: ObjectValue, // Boolean(true)
|
||||
@ -273,7 +273,7 @@ The `bridge.rs` conversion (`bridge.rs:45-92`) forces observations into the Asse
|
||||
| `parent_hash` | Links to superseded assertion | Always `None` | `bridge.rs:79` |
|
||||
| `epoch` | Paradigm context (e.g., "post-quantum") | Always `None` | `bridge.rs:89` |
|
||||
| `lifecycle` | Pending -> Review -> Approved | Always `LifecycleStage::Approved` (skips review) | `bridge.rs:85` |
|
||||
| `evidence` | Provenance chain, ADR references | Not present in `ExtractedClaim` at all | `types/claim.rs:7-31` |
|
||||
| `evidence` | Provenance chain, ADR references | Not present in `Observation` at all | `types/claim.rs:7-31` |
|
||||
|
||||
**We're using a Mercedes as a shopping cart.**
|
||||
|
||||
@ -409,7 +409,7 @@ The following claims were extracted using the `extract-claims` skill pattern. Ea
|
||||
| VG-006 | `bridge.rs` always sets `epoch: None` | VERIFIED | `bridge.rs:89` |
|
||||
| VG-007 | `bridge.rs` always sets `lifecycle: LifecycleStage::Approved` | VERIFIED | `bridge.rs:85` |
|
||||
| VG-008 | `source_metadata` contains `{file, line, matched_text, scan_tool, scan_version}` only | VERIFIED | `bridge.rs:52-58` |
|
||||
| VG-009 | `ExtractedClaim` has no evidence/provenance field | VERIFIED | `types/claim.rs:7-31` -- only has location, value, confidence |
|
||||
| VG-009 | `Observation` has no evidence/provenance field | VERIFIED | `types/claim.rs:7-31` -- only has location, value, confidence |
|
||||
| VG-010 | `claim_to_observation()` uses Tier 4 (Community) | VERIFIED | `bridge.rs:36-42` |
|
||||
| VG-011 | Extractor trait has no mechanism to receive claims for verification | ✅ **CLOSED** | `traits.rs:68-107` -- `verifiable_predicates()` method added, 10 extractors declare predicates |
|
||||
|
||||
@ -417,7 +417,7 @@ The following claims were extracted using the `extract-claims` skill pattern. Ea
|
||||
|
||||
| ID | Claim | Gap |
|
||||
|----|-------|-----|
|
||||
| VG-020 | `ExtractedClaim` should be renamed to `Observation` | `types/claim.rs` still uses `ExtractedClaim` |
|
||||
| VG-020 | `Observation` type exists and is properly named | ✅ **CLOSED** — `ExtractedClaim` renamed to `Observation` in Phase A1 |
|
||||
| VG-021 | A real `Claim` type should exist with provenance, invariant, consequence, authority | No such type exists anywhere |
|
||||
| VG-022 | Extractors should be paired with claims they verify | ✅ **CLOSED** — `verifiable_predicates()` added to `Extractor` trait; 10 extractors declare predicates; `compute_extractor_claim_map()` in verify.rs; `aphoria verify map` shows coverage |
|
||||
| VG-023 | `aphoria audit` command should exist | No audit subcommand in CLI |
|
||||
@ -425,7 +425,7 @@ The following claims were extracted using the `extract-claims` skill pattern. Ea
|
||||
| VG-025 | `aphoria claims list` / `aphoria claims explain` should exist | No claims subcommand |
|
||||
| VG-026 | Corpus should be real assertions, not hardcoded in `corpus.rs:33-157` | Corpus is built procedurally per scan |
|
||||
| VG-027 | Conflict resolution should use Episteme lenses | No lens invoked during scan |
|
||||
| VG-028 | Direction 2 audit (walk claims, verify code) doesn't exist | No inverse audit flow |
|
||||
| VG-028 | Direction 2 audit (walk claims, verify code) doesn't exist | ✅ **CLOSED** — `aphoria verify run` walks claims and checks code |
|
||||
| VG-029 | Skill should be primary claim authoring interface | No `.claude/skills/aphoria` skill exists |
|
||||
|
||||
---
|
||||
@ -439,8 +439,8 @@ Extractors don't produce claims. Humans (assisted by the Aphoria skill) produce
|
||||
The type system should reflect this:
|
||||
|
||||
```rust
|
||||
// CURRENT (types/claim.rs:7-31)
|
||||
pub struct ExtractedClaim { // This is an observation, not a claim
|
||||
// CURRENT (types/claim.rs:7-31) - Phase A1 COMPLETE
|
||||
pub struct Observation {
|
||||
pub concept_path: String,
|
||||
pub predicate: String,
|
||||
pub value: ObjectValue,
|
||||
@ -451,7 +451,7 @@ pub struct ExtractedClaim { // This is an observation, not a claim
|
||||
pub description: String,
|
||||
}
|
||||
|
||||
// TARGET: New Observation type (rename ExtractedClaim)
|
||||
// Already exists as Observation (was ExtractedClaim before A1)
|
||||
pub struct Observation {
|
||||
pub concept_path: String,
|
||||
pub predicate: String,
|
||||
@ -498,7 +498,7 @@ The `Extractor` trait (`traits.rs:68-94`) needs to change:
|
||||
pub trait Extractor: Send + Sync {
|
||||
fn name(&self) -> &str;
|
||||
fn languages(&self) -> &[Language];
|
||||
fn extract(&self, segments: &[String], content: &str, lang: Language, file: &str) -> Vec<ExtractedClaim>;
|
||||
fn extract(&self, segments: &[String], content: &str, lang: Language, file: &str) -> Vec<Observation>;
|
||||
}
|
||||
|
||||
// TARGET: Extractors can also verify observations against claims
|
||||
@ -608,7 +608,7 @@ source = { claim_id = "arch-boundary-001", authority = "architecture-decision" }
|
||||
|
||||
### Phase 1: Distinguish observations from claims
|
||||
|
||||
- [ ] Rename `ExtractedClaim` to `Observation` in `types/claim.rs`
|
||||
- [x] Rename `ExtractedClaim` to `Observation` in `types/claim.rs` ✅ **COMPLETE (Phase A1)**
|
||||
- [ ] Create `AuthoredClaim` type with provenance, invariant, consequence, authority, evidence_chain
|
||||
- [ ] Update `bridge.rs` default path to use Tier 4/5 (not Tier 3) for scanner output
|
||||
- [ ] Add `evidence` field to `source_metadata` in bridge
|
||||
|
||||
@ -1,5 +1,7 @@
|
||||
# The Open Vision: The Epistemic Assertion Protocol (EAP)
|
||||
|
||||
> **Protocol Vision:** This document describes the Epistemic Assertion Protocol (EAP) - an open standard for publishing authoritative technical knowledge. For Aphoria's product vision, see [Vision](vision.md).
|
||||
|
||||
**From "Reading the Manual" to "Querying the Truth."**
|
||||
|
||||
## The Stagnation of Truth
|
||||
|
||||
@ -273,6 +273,124 @@ Clean scans by excluding test fixtures and intentional patterns.
|
||||
|
||||
---
|
||||
|
||||
## Phase 17: Pattern Enrichment & Best Practices Infrastructure ✅
|
||||
|
||||
**Backend infrastructure for enriched corpus patterns and team guideline ingestion.**
|
||||
|
||||
> Note: Backend only — UI integration not implemented. Patterns have metadata but dashboard doesn't display it yet.
|
||||
|
||||
### 17.1 Enriched Pattern Metadata ✅
|
||||
|
||||
**What:** Transform bare patterns like "md5: true" into actionable insights "MD5 is deprecated (NIST 2010)".
|
||||
|
||||
| Task | Status |
|
||||
|------|--------|
|
||||
| Add enrichment fields to `PatternAggregate` (category, verdict, explanation, authority_source) | ✅ |
|
||||
| Add `pattern_metadata()` method to `Extractor` trait | ✅ |
|
||||
| Create `PatternEnricher` service with exact/wildcard matching + noise detection | ✅ `corpus/enricher.rs` |
|
||||
| Implement `pattern_metadata()` for 10 security extractors | ✅ See below |
|
||||
|
||||
**Enriched Extractors:**
|
||||
- `WeakCryptoExtractor` — MD5, SHA1, DES, RC4 deprecated
|
||||
- `TlsVersionExtractor` — TLS 1.0/1.1 deprecated, 1.2/1.3 recommended
|
||||
- `TlsVerifyExtractor` — cert_verification: false insecure
|
||||
- `JwtConfigExtractor` — algorithm: none forbidden
|
||||
- `CorsConfigExtractor` — allow_origin: * insecure
|
||||
- `HardcodedSecretsExtractor` — API keys/passwords critical
|
||||
- `SqlInjectionExtractor` — string interpolation vulnerable
|
||||
- `CommandInjectionExtractor` — shell exec vulnerable
|
||||
- `PathTraversalExtractor` — user-controlled paths vulnerable
|
||||
- `InsecureDeserializationExtractor` — pickle/yaml.load unsafe
|
||||
|
||||
### 17.2 TeamPolicy Authority Tier ✅
|
||||
|
||||
**What:** New tier 2.5 between Observational and Expert for team-level architectural guidelines.
|
||||
|
||||
| Task | Status |
|
||||
|------|--------|
|
||||
| Add `TeamPolicy` variant to `SourceClass` enum | ✅ `stemedb-core/src/types/source.rs` |
|
||||
| Add tier_fractional() for 2.5 representation | ✅ |
|
||||
| Update authority_weight() (0.6) and default_decay_days() (180) | ✅ |
|
||||
| Add "team_policy" parsing to `parse_authority_tier()` | ✅ `aphoria/src/types/authored_claim.rs` |
|
||||
| Update all DTO conversions (API, ontology) | ✅ |
|
||||
|
||||
### 17.3 Best Practices Import CLI ✅
|
||||
|
||||
**What:** Batch import claims from TOML files (e.g., hexagonal architecture guidelines).
|
||||
|
||||
| Task | Status |
|
||||
|------|--------|
|
||||
| Add `Import` subcommand to `ClaimsCommands` | ✅ `cli/claims.rs` |
|
||||
| Implement `handle_claims_import()` with merge strategies | ✅ `handlers/claims.rs` |
|
||||
| Support `--authority-tier` override | ✅ |
|
||||
| Support `--source-guide` for tracking | ✅ |
|
||||
| Support `--dry-run` for preview | ✅ |
|
||||
| Merge strategies: skip_existing, overwrite, fail_on_duplicate | ✅ |
|
||||
|
||||
**Usage:**
|
||||
```bash
|
||||
aphoria claims import docs/guidelines.toml \
|
||||
--authority-tier team_policy \
|
||||
--source-guide "hexagonal-arch" \
|
||||
--dry-run
|
||||
```
|
||||
|
||||
### 17.4 Guideline Tracking System ✅
|
||||
|
||||
**What:** Track which guidelines have been imported for compliance filtering and change detection.
|
||||
|
||||
| Task | Status |
|
||||
|------|--------|
|
||||
| Create `GuidelineMetadata` struct | ✅ `types/ingested_guides.rs` |
|
||||
| Create `IngestedGuidesFile` with TOML persistence | ✅ |
|
||||
| Track: id, name, source_path, document_hash, claim_ids | ✅ |
|
||||
| Integrate with import command | ✅ |
|
||||
| Store in `.aphoria/ingested_guides.toml` | ✅ |
|
||||
|
||||
### 17.5 Updated Comparison Modes ✅
|
||||
|
||||
| Task | Status |
|
||||
|------|--------|
|
||||
| Add `Contains` comparison mode | ✅ `types/authored_claim.rs` |
|
||||
| Add `NotContains` comparison mode | ✅ |
|
||||
| Update API DTOs | ✅ `stemedb-api/src/dto/aphoria/types.rs` |
|
||||
|
||||
### What's NOT Implemented
|
||||
|
||||
❌ **Dashboard UI Integration** — Enrichment metadata exists in backend but no UI to display it
|
||||
❌ **Category Badges** — No visual badges for security/architecture/performance
|
||||
❌ **Verdict Badges** — No visual indicators for deprecated/recommended
|
||||
❌ **Filtering UI** — No dropdown to filter patterns by category
|
||||
❌ **"Hide Noise" Toggle** — Noise detection works but no UI control
|
||||
❌ **--check-policy Flag** — Backend ready but scan filtering not implemented
|
||||
|
||||
### Files Modified
|
||||
|
||||
**Core:**
|
||||
- `crates/stemedb-core/src/types/source.rs` — TeamPolicy tier
|
||||
- `crates/stemedb-storage/src/pattern_aggregate_store/mod.rs` — Enrichment fields
|
||||
|
||||
**Aphoria:**
|
||||
- `applications/aphoria/src/extractors/traits.rs` — `pattern_metadata()` method
|
||||
- `applications/aphoria/src/corpus/enricher.rs` — **NEW** Pattern enrichment service
|
||||
- `applications/aphoria/src/types/authored_claim.rs` — Contains/NotContains modes
|
||||
- `applications/aphoria/src/types/ingested_guides.rs` — **NEW** Guideline tracking
|
||||
- `applications/aphoria/src/cli/claims.rs` — Import subcommand
|
||||
- `applications/aphoria/src/handlers/claims.rs` — Import handler
|
||||
- 10 extractor files with `pattern_metadata()` implementations
|
||||
|
||||
**API:**
|
||||
- `crates/stemedb-api/src/dto/enums.rs` — TeamPolicy DTO
|
||||
- `crates/stemedb-api/src/dto/aphoria/types.rs` — Contains/NotContains DTOs
|
||||
- `crates/stemedb-api/src/handlers/aphoria/claims.rs` — Comparison mode conversion
|
||||
- `crates/stemedb-api/src/handlers/layered.rs` — SourceClass conversion
|
||||
|
||||
**Ontology:**
|
||||
- `crates/stemedb-ontology/src/dto/enums.rs` — TeamPolicy DTO
|
||||
- `crates/stemedb-ontology/src/dto/conversions.rs` — Conversion functions
|
||||
|
||||
---
|
||||
|
||||
## The Self-Learning Vision (Complete)
|
||||
|
||||
```
|
||||
@ -317,3 +435,4 @@ Phase 9: Autonomous Generation (fully self-improving) ✅
|
||||
| 12 | Knowledge Scope Hierarchy | ✅ |
|
||||
| 13 | Knowledge Lifecycle Management | ✅ |
|
||||
| 16 | Ignore & Exclusion System | ✅ |
|
||||
| 17 | Pattern Enrichment & Best Practices Infrastructure | ✅ (backend only) |
|
||||
|
||||
@ -8,7 +8,7 @@
|
||||
|
||||
| Phase | Deliverable | Status |
|
||||
|-------|-------------|--------|
|
||||
| 0–9, 11–13, 16 | Core CLI, Extractors (42), LLM, Learning, Enterprise, Lifecycle | ✅ Archived |
|
||||
| 0–9, 11–13, 16–17 | Core CLI, Extractors (42), LLM, Learning, Enterprise, Lifecycle, Pattern Enrichment | ✅ Archived |
|
||||
| 10 | UX & Enterprise Polish | 🔄 Partial (10.1 ✅, 10.2–10.3 ⬜) |
|
||||
| 14 | Governance Workflows | 🎯 Current |
|
||||
| 15 | Evidence Source Integration | ⬜ Future |
|
||||
|
||||
@ -85,7 +85,7 @@ crates/
|
||||
owasp.rs OWASP ingestion (Tier 1)
|
||||
vendor.rs Vendor docs (Tier 2)
|
||||
policy.rs Local policy ingestion (Tier 0 Override)
|
||||
bridge.rs ExtractedClaim → Assertion conversion
|
||||
bridge.rs Observation → Assertion conversion
|
||||
conflict.rs Conflict query + scoring
|
||||
report/
|
||||
mod.rs Report generation orchestration
|
||||
@ -249,21 +249,21 @@ pub trait Extractor: Send + Sync {
|
||||
/// - `content`: The file content as a string.
|
||||
/// - `language`: The detected language of the file.
|
||||
///
|
||||
/// Returns zero or more extracted claims.
|
||||
/// Returns zero or more extracted observations.
|
||||
fn extract(
|
||||
&self,
|
||||
path_segments: &[String],
|
||||
content: &str,
|
||||
language: Language,
|
||||
) -> Vec<ExtractedClaim>;
|
||||
) -> Vec<Observation>;
|
||||
}
|
||||
```
|
||||
|
||||
### ExtractedClaim
|
||||
### Observation
|
||||
|
||||
```rust
|
||||
/// A claim extracted from source code by an Extractor.
|
||||
pub struct ExtractedClaim {
|
||||
/// An observation extracted from source code by an Extractor.
|
||||
pub struct Observation {
|
||||
/// The full ConceptPath for this claim.
|
||||
/// Scheme is always "code" for code-extracted claims.
|
||||
pub concept_path: ConceptPath,
|
||||
@ -549,7 +549,7 @@ A specialized StemeDB Lens that resolves conflicts by prioritizing `Policy` asse
|
||||
|
||||
```rust
|
||||
fn to_assertion(
|
||||
claim: &ExtractedClaim,
|
||||
claim: &Observation,
|
||||
agent_keypair: &Ed25519Keypair,
|
||||
scan_timestamp: u64,
|
||||
) -> Assertion {
|
||||
@ -621,7 +621,7 @@ After ingestion, for each extracted claim:
|
||||
|
||||
```rust
|
||||
async fn check_conflict(
|
||||
claim: &ExtractedClaim,
|
||||
claim: &Observation,
|
||||
query_engine: &QueryEngine,
|
||||
) -> Option<ConflictResult> {
|
||||
// 1. Query with Skeptic lens, resolving aliases
|
||||
|
||||
@ -52,10 +52,36 @@ impl ClaimsFile {
|
||||
}
|
||||
|
||||
/// Add a claim entry, deduplicating by ID.
|
||||
///
|
||||
/// Warns if an active claim already exists for the same concept_path/predicate.
|
||||
pub fn add(&mut self, claim: AuthoredClaim) {
|
||||
if !self.claims.iter().any(|c| c.id == claim.id) {
|
||||
self.claims.push(claim);
|
||||
// Check for duplicate ID
|
||||
if self.claims.iter().any(|c| c.id == claim.id) {
|
||||
return; // Skip duplicate ID
|
||||
}
|
||||
|
||||
// Check for duplicate active claims (same concept_path + predicate)
|
||||
if claim.status == ClaimStatus::Active {
|
||||
let duplicates: Vec<_> = self.claims.iter()
|
||||
.filter(|c| c.status == ClaimStatus::Active)
|
||||
.filter(|c| c.concept_path == claim.concept_path)
|
||||
.filter(|c| c.predicate == claim.predicate)
|
||||
.filter(|c| c.id != claim.id)
|
||||
.collect();
|
||||
|
||||
if !duplicates.is_empty() {
|
||||
#[allow(clippy::print_stderr)]
|
||||
{
|
||||
eprintln!("⚠️ Warning: Active claim(s) already exist for {}::{}", claim.concept_path, claim.predicate);
|
||||
for dup in &duplicates {
|
||||
eprintln!(" - {} ({})", dup.id, dup.invariant);
|
||||
}
|
||||
eprintln!("Consider using 'aphoria claims supersede {}' instead", duplicates[0].id);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
self.claims.push(claim);
|
||||
}
|
||||
|
||||
/// Load from a TOML file.
|
||||
@ -326,4 +352,41 @@ mod tests {
|
||||
let file = ClaimsFile::load(&path).expect("load should succeed");
|
||||
assert!(file.is_empty());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_duplicate_active_warning() {
|
||||
let mut file = ClaimsFile::new();
|
||||
|
||||
// Add first claim
|
||||
file.add(sample_claim("claim-001"));
|
||||
assert_eq!(file.len(), 1);
|
||||
|
||||
// Add duplicate with same concept_path/predicate
|
||||
// This should print a warning but still add the claim
|
||||
let mut dup_claim = sample_claim("claim-002");
|
||||
dup_claim.concept_path = "test/concept".to_string();
|
||||
dup_claim.predicate = "test_pred".to_string();
|
||||
|
||||
file.add(dup_claim);
|
||||
assert_eq!(file.len(), 2);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_no_warning_for_deprecated_duplicates() {
|
||||
let mut file = ClaimsFile::new();
|
||||
|
||||
// Add first claim and deprecate it
|
||||
file.add(sample_claim("claim-001"));
|
||||
file.deprecate("claim-001", "2026-02-08T14:00:00Z").expect("deprecate");
|
||||
|
||||
// Add another claim with same concept_path/predicate
|
||||
// Should NOT warn because the first is deprecated
|
||||
let mut new_claim = sample_claim("claim-002");
|
||||
new_claim.concept_path = "test/concept".to_string();
|
||||
new_claim.predicate = "test_pred".to_string();
|
||||
|
||||
file.add(new_claim);
|
||||
assert_eq!(file.len(), 2);
|
||||
assert_eq!(file.find_by_status(&ClaimStatus::Active).len(), 1);
|
||||
}
|
||||
}
|
||||
|
||||
@ -170,6 +170,28 @@ pub enum ClaimsCommands {
|
||||
reason: String,
|
||||
},
|
||||
|
||||
/// Import claims from a TOML file in batch
|
||||
Import {
|
||||
/// Path to TOML file with claims
|
||||
file: PathBuf,
|
||||
|
||||
/// Authority tier to apply to all claims (overrides tier in file)
|
||||
#[arg(long)]
|
||||
authority_tier: Option<String>,
|
||||
|
||||
/// Source guideline name (for tracking)
|
||||
#[arg(long)]
|
||||
source_guide: Option<String>,
|
||||
|
||||
/// Preview changes without writing to file
|
||||
#[arg(long)]
|
||||
dry_run: bool,
|
||||
|
||||
/// Merge strategy: skip_existing, overwrite, fail_on_duplicate
|
||||
#[arg(long, default_value = "skip_existing")]
|
||||
merge: String,
|
||||
},
|
||||
|
||||
/// List pending claim markers
|
||||
ListMarkers {
|
||||
/// Filter by status (pending, formalized, rejected)
|
||||
|
||||
203
applications/aphoria/src/corpus/enricher.rs
Normal file
203
applications/aphoria/src/corpus/enricher.rs
Normal file
@ -0,0 +1,203 @@
|
||||
//! Pattern enrichment service.
|
||||
//!
|
||||
//! Matches patterns to extractor metadata and applies enrichment (category, verdict, explanation, authority).
|
||||
//! Transforms "md5: true" → "MD5 is deprecated (NIST 2010)".
|
||||
|
||||
use std::collections::HashMap;
|
||||
|
||||
use crate::extractors::{ExtractorRegistry, PatternMetadata};
|
||||
|
||||
/// Pattern enrichment service.
|
||||
///
|
||||
/// Matches observations to metadata using:
|
||||
/// 1. Exact match (tail_path + predicate + value)
|
||||
/// 2. Wildcard match (tail_path + predicate, any value)
|
||||
/// 3. Heuristic scoring (noise detection for unenriched patterns)
|
||||
pub struct PatternEnricher {
|
||||
/// Exact matches: (tail_path, predicate, value) → metadata
|
||||
exact_matches: HashMap<(String, String, String), PatternMetadata>,
|
||||
|
||||
/// Wildcard matches: (tail_path, predicate) → metadata
|
||||
wildcard_matches: HashMap<(String, String), PatternMetadata>,
|
||||
}
|
||||
|
||||
impl PatternEnricher {
|
||||
/// Create a new enricher from an extractor registry.
|
||||
pub fn from_registry(registry: &ExtractorRegistry) -> Self {
|
||||
let mut exact_matches = HashMap::new();
|
||||
let mut wildcard_matches = HashMap::new();
|
||||
|
||||
for extractor in registry.extractors() {
|
||||
for metadata in extractor.pattern_metadata() {
|
||||
let key_tail_pred = (metadata.tail_path.clone(), metadata.predicate.clone());
|
||||
|
||||
if let Some(value) = &metadata.value {
|
||||
// Exact match: tail_path + predicate + value
|
||||
let key_exact = (
|
||||
metadata.tail_path.clone(),
|
||||
metadata.predicate.clone(),
|
||||
value.clone(),
|
||||
);
|
||||
exact_matches.insert(key_exact, metadata.clone());
|
||||
} else {
|
||||
// Wildcard match: tail_path + predicate (any value)
|
||||
wildcard_matches.insert(key_tail_pred, metadata);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Self { exact_matches, wildcard_matches }
|
||||
}
|
||||
|
||||
/// Enrich a pattern with metadata.
|
||||
///
|
||||
/// Returns (category, verdict, explanation, authority_source) if a match is found.
|
||||
pub fn enrich(
|
||||
&self,
|
||||
tail_path: &str,
|
||||
predicate: &str,
|
||||
value: &str,
|
||||
) -> Option<Enrichment> {
|
||||
// 1. Try exact match first
|
||||
let key_exact = (tail_path.to_string(), predicate.to_string(), value.to_string());
|
||||
if let Some(metadata) = self.exact_matches.get(&key_exact) {
|
||||
return Some(Enrichment {
|
||||
category: Some(metadata.category.clone()),
|
||||
verdict: Some(metadata.verdict.clone()),
|
||||
explanation: Some(metadata.explanation.clone()),
|
||||
authority_source: metadata.authority_source.clone(),
|
||||
});
|
||||
}
|
||||
|
||||
// 2. Try wildcard match (tail_path + predicate, any value)
|
||||
let key_wildcard = (tail_path.to_string(), predicate.to_string());
|
||||
if let Some(metadata) = self.wildcard_matches.get(&key_wildcard) {
|
||||
return Some(Enrichment {
|
||||
category: Some(metadata.category.clone()),
|
||||
verdict: Some(metadata.verdict.clone()),
|
||||
explanation: Some(metadata.explanation.clone()),
|
||||
authority_source: metadata.authority_source.clone(),
|
||||
});
|
||||
}
|
||||
|
||||
// 3. Apply noise detection heuristics
|
||||
if Self::is_noise_pattern(tail_path, predicate) {
|
||||
return Some(Enrichment {
|
||||
category: Some("noise".to_string()),
|
||||
verdict: Some("noise".to_string()),
|
||||
explanation: Some("Common pattern with low signal".to_string()),
|
||||
authority_source: None,
|
||||
});
|
||||
}
|
||||
|
||||
None
|
||||
}
|
||||
|
||||
/// Detect noise patterns using heuristics.
|
||||
///
|
||||
/// Noise patterns include:
|
||||
/// - Standard library imports (std, core, tokio, serde)
|
||||
/// - Generic predicates (enabled, present)
|
||||
/// - Common infrastructure patterns
|
||||
fn is_noise_pattern(tail_path: &str, _predicate: &str) -> bool {
|
||||
// Common noise patterns: std library imports, generic imports
|
||||
let noise_patterns = [
|
||||
"imports/std",
|
||||
"imports/core",
|
||||
"imports/alloc",
|
||||
"imports/serde",
|
||||
"imports/tokio",
|
||||
"imports/async_trait",
|
||||
"imports/tracing",
|
||||
"imports/anyhow",
|
||||
"imports/thiserror",
|
||||
];
|
||||
|
||||
for pattern in &noise_patterns {
|
||||
if tail_path.contains(pattern) {
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
false
|
||||
}
|
||||
}
|
||||
|
||||
/// Enrichment metadata for a pattern.
|
||||
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||
pub struct Enrichment {
|
||||
/// Pattern category (e.g., "security", "architecture", "performance").
|
||||
pub category: Option<String>,
|
||||
/// Verdict (e.g., "deprecated", "recommended", "emerging", "common", "noise").
|
||||
pub verdict: Option<String>,
|
||||
/// Human-readable explanation.
|
||||
pub explanation: Option<String>,
|
||||
/// Authority source (e.g., "RFC 8996", "NIST 2010").
|
||||
pub authority_source: Option<String>,
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use crate::config::AphoriaConfig;
|
||||
|
||||
#[test]
|
||||
fn test_exact_match() {
|
||||
// Create a default config and registry (includes WeakCryptoExtractor)
|
||||
let config = AphoriaConfig::default();
|
||||
let registry = ExtractorRegistry::new(&config);
|
||||
let enricher = PatternEnricher::from_registry(®istry);
|
||||
|
||||
// WeakCryptoExtractor provides MD5 metadata
|
||||
let enrichment = enricher
|
||||
.enrich("crypto/hashing/algorithm", "algorithm", "md5")
|
||||
.expect("Should match MD5");
|
||||
|
||||
assert_eq!(enrichment.category, Some("security".to_string()));
|
||||
assert_eq!(enrichment.verdict, Some("deprecated".to_string()));
|
||||
assert!(enrichment.explanation.is_some());
|
||||
assert!(enrichment.authority_source.is_some());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_wildcard_match() {
|
||||
// TLS version extractor provides metadata for all TLS versions
|
||||
let config = AphoriaConfig::default();
|
||||
let registry = ExtractorRegistry::new(&config);
|
||||
let enricher = PatternEnricher::from_registry(®istry);
|
||||
|
||||
// Match TLS 1.0 (should be deprecated)
|
||||
let enrichment = enricher
|
||||
.enrich("tls/min_version", "version", "1.0")
|
||||
.expect("Should match TLS 1.0");
|
||||
|
||||
assert_eq!(enrichment.category, Some("security".to_string()));
|
||||
assert_eq!(enrichment.verdict, Some("deprecated".to_string()));
|
||||
assert!(enrichment.authority_source.is_some());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_noise_detection() {
|
||||
let config = AphoriaConfig::default();
|
||||
let registry = ExtractorRegistry::new(&config);
|
||||
let enricher = PatternEnricher::from_registry(®istry);
|
||||
|
||||
let enrichment = enricher
|
||||
.enrich("imports/std", "imported", "true")
|
||||
.expect("Should detect noise");
|
||||
|
||||
assert_eq!(enrichment.category, Some("noise".to_string()));
|
||||
assert_eq!(enrichment.verdict, Some("noise".to_string()));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_no_match() {
|
||||
let config = AphoriaConfig::default();
|
||||
let registry = ExtractorRegistry::new(&config);
|
||||
let enricher = PatternEnricher::from_registry(®istry);
|
||||
|
||||
let enrichment = enricher.enrich("custom/path", "custom_pred", "value");
|
||||
|
||||
assert!(enrichment.is_none());
|
||||
}
|
||||
}
|
||||
@ -33,11 +33,13 @@
|
||||
//! └─────────────────────────────────────────────────────────────────┘
|
||||
//! ```
|
||||
|
||||
mod enricher;
|
||||
mod hardcoded;
|
||||
mod owasp;
|
||||
mod rfc;
|
||||
mod vendor;
|
||||
|
||||
pub use enricher::{Enrichment, PatternEnricher};
|
||||
pub use hardcoded::HardcodedCorpusBuilder;
|
||||
pub use owasp::OwaspCorpusBuilder;
|
||||
pub use rfc::RfcCorpusBuilder;
|
||||
|
||||
@ -7,7 +7,7 @@ use stemedb_ingest::serialize_assertion;
|
||||
use stemedb_storage::PredicateIndexStore;
|
||||
use tracing::{debug, info, instrument, warn};
|
||||
|
||||
use crate::bridge::{claim_to_assertion, observation_to_assertion};
|
||||
use crate::bridge::observation_to_assertion;
|
||||
use crate::types::{predicates, Observation};
|
||||
use crate::walker::git::get_current_commit_hash;
|
||||
use crate::AphoriaError;
|
||||
@ -33,7 +33,7 @@ impl LocalEpisteme {
|
||||
let mut blessed_claims = Vec::new();
|
||||
|
||||
for claim in claims {
|
||||
let assertion = claim_to_assertion(claim, &self.signing_key, timestamp, git_commit.as_deref());
|
||||
let assertion = observation_to_assertion(claim, &self.signing_key, timestamp, git_commit.as_deref());
|
||||
|
||||
// Serialize and write to WAL
|
||||
let record_bytes = serialize_assertion(&assertion)
|
||||
|
||||
@ -261,6 +261,21 @@ impl Extractor for CommandInjectionExtractor {
|
||||
claims
|
||||
}
|
||||
|
||||
fn pattern_metadata(&self) -> Vec<super::PatternMetadata> {
|
||||
vec![
|
||||
// Command input sanitization - shell interpolation is vulnerable
|
||||
super::PatternMetadata {
|
||||
tail_path: "os/command/input".to_string(),
|
||||
predicate: "input".to_string(),
|
||||
value: Some("interpolated".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "Commands with interpolated user input are vulnerable to command injection".to_string(),
|
||||
authority_source: Some("OWASP A03:2021".to_string()),
|
||||
},
|
||||
]
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"Command::new",
|
||||
|
||||
@ -128,6 +128,31 @@ impl Extractor for CorsConfigExtractor {
|
||||
vec![("cors/allow_origin", "config_value"), ("cors/credentials_with_wildcard", "enabled")]
|
||||
}
|
||||
|
||||
fn pattern_metadata(&self) -> Vec<super::PatternMetadata> {
|
||||
vec![
|
||||
// allow_origin: * - insecure
|
||||
super::PatternMetadata {
|
||||
tail_path: "cors/allow_origin".to_string(),
|
||||
predicate: "config_value".to_string(),
|
||||
value: Some("*".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "CORS wildcard (*) allows any origin and should be replaced with specific origins".to_string(),
|
||||
authority_source: Some("OWASP".to_string()),
|
||||
},
|
||||
// credentials with wildcard - critical
|
||||
super::PatternMetadata {
|
||||
tail_path: "cors/credentials_with_wildcard".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: Some("true".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "CORS credentials with wildcard origin is forbidden by spec and allows credential theft".to_string(),
|
||||
authority_source: Some("W3C CORS Spec".to_string()),
|
||||
},
|
||||
]
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"(?i)allow_origin|AllowAllOrigins|permissive",
|
||||
|
||||
@ -234,6 +234,39 @@ impl Extractor for HardcodedSecretsExtractor {
|
||||
]
|
||||
}
|
||||
|
||||
fn pattern_metadata(&self) -> Vec<super::PatternMetadata> {
|
||||
vec![
|
||||
// Hardcoded secrets - wildcard for all secret types
|
||||
super::PatternMetadata {
|
||||
tail_path: "secrets/api_key".to_string(),
|
||||
predicate: "storage_method".to_string(),
|
||||
value: Some("hardcoded".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "Hardcoded API keys expose credentials in source code and version control".to_string(),
|
||||
authority_source: Some("OWASP A07:2021".to_string()),
|
||||
},
|
||||
super::PatternMetadata {
|
||||
tail_path: "secrets/password".to_string(),
|
||||
predicate: "storage_method".to_string(),
|
||||
value: Some("hardcoded".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "Hardcoded passwords expose credentials in source code and version control".to_string(),
|
||||
authority_source: Some("OWASP A07:2021".to_string()),
|
||||
},
|
||||
super::PatternMetadata {
|
||||
tail_path: "secrets/aws_credentials".to_string(),
|
||||
predicate: "storage_method".to_string(),
|
||||
value: Some("hardcoded".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "Hardcoded AWS credentials expose cloud account access".to_string(),
|
||||
authority_source: Some("OWASP A07:2021".to_string()),
|
||||
},
|
||||
]
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"(?i)api[_-]?key",
|
||||
|
||||
@ -229,6 +229,31 @@ impl Extractor for InsecureDeserializationExtractor {
|
||||
|
||||
claims
|
||||
}
|
||||
|
||||
fn pattern_metadata(&self) -> Vec<super::PatternMetadata> {
|
||||
vec![
|
||||
// Pickle deserialization - critical RCE vulnerability
|
||||
super::PatternMetadata {
|
||||
tail_path: "serialization/deserialization".to_string(),
|
||||
predicate: "method".to_string(),
|
||||
value: Some("pickle".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "Python pickle deserialization enables arbitrary code execution and must never be used with untrusted data".to_string(),
|
||||
authority_source: Some("OWASP A08:2021".to_string()),
|
||||
},
|
||||
// yaml.load - vulnerable to code execution
|
||||
super::PatternMetadata {
|
||||
tail_path: "serialization/deserialization".to_string(),
|
||||
predicate: "method".to_string(),
|
||||
value: Some("yaml_unsafe".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "YAML unsafe loading allows arbitrary code execution - use safe_load instead".to_string(),
|
||||
authority_source: Some("OWASP A08:2021".to_string()),
|
||||
},
|
||||
]
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -247,6 +247,31 @@ impl Extractor for JwtConfigExtractor {
|
||||
]
|
||||
}
|
||||
|
||||
fn pattern_metadata(&self) -> Vec<super::PatternMetadata> {
|
||||
vec![
|
||||
// Algorithm: none - critical vulnerability
|
||||
super::PatternMetadata {
|
||||
tail_path: "jwt/algorithm_restriction".to_string(),
|
||||
predicate: "config_value".to_string(),
|
||||
value: Some("none".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "JWT 'none' algorithm allows unsigned tokens and must never be used".to_string(),
|
||||
authority_source: Some("RFC 7519".to_string()),
|
||||
},
|
||||
// Signature verification disabled - critical
|
||||
super::PatternMetadata {
|
||||
tail_path: "jwt/signature_verification".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: Some("false".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "JWT signature verification must always be enabled".to_string(),
|
||||
authority_source: Some("RFC 7519".to_string()),
|
||||
},
|
||||
]
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"(?i)jwt|jsonwebtoken|jose",
|
||||
|
||||
@ -147,7 +147,7 @@ pub use ssrf::SsrfExtractor;
|
||||
pub use timeout_config::{TimeoutConfigExtractor, TimeoutThresholds};
|
||||
pub use tls_verify::TlsVerifyExtractor;
|
||||
pub use tls_version::TlsVersionExtractor;
|
||||
pub use traits::{build_claim, is_test_file, Extractor};
|
||||
pub use traits::{build_claim, is_test_file, Extractor, PatternMetadata};
|
||||
pub use unreal_config::UnrealConfigExtractor;
|
||||
pub use unreal_cpp::UnrealCppExtractor;
|
||||
pub use unreal_performance::UnrealPerformanceExtractor;
|
||||
|
||||
@ -237,6 +237,21 @@ impl Extractor for PathTraversalExtractor {
|
||||
claims
|
||||
}
|
||||
|
||||
fn pattern_metadata(&self) -> Vec<super::PatternMetadata> {
|
||||
vec![
|
||||
// User-controlled path without validation - vulnerable
|
||||
super::PatternMetadata {
|
||||
tail_path: "filesystem/path/concatenation".to_string(),
|
||||
predicate: "user_controlled_path".to_string(),
|
||||
value: Some("true".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "User-controlled file paths without validation enable path traversal attacks".to_string(),
|
||||
authority_source: Some("OWASP A01:2021".to_string()),
|
||||
},
|
||||
]
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"\.\./",
|
||||
|
||||
@ -236,6 +236,21 @@ impl Extractor for SqlInjectionExtractor {
|
||||
claims
|
||||
}
|
||||
|
||||
fn pattern_metadata(&self) -> Vec<super::PatternMetadata> {
|
||||
vec![
|
||||
// SQL query construction via string interpolation - vulnerable
|
||||
super::PatternMetadata {
|
||||
tail_path: "db/query/construction".to_string(),
|
||||
predicate: "construction".to_string(),
|
||||
value: Some("interpolated".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "SQL queries constructed via string interpolation are vulnerable to SQL injection".to_string(),
|
||||
authority_source: Some("OWASP A03:2021".to_string()),
|
||||
},
|
||||
]
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"(?i)format!.*SELECT|format!.*INSERT|format!.*UPDATE|format!.*DELETE",
|
||||
|
||||
@ -178,6 +178,31 @@ impl Extractor for TlsVerifyExtractor {
|
||||
vec![("tls/cert_verification", "enabled")]
|
||||
}
|
||||
|
||||
fn pattern_metadata(&self) -> Vec<super::PatternMetadata> {
|
||||
vec![
|
||||
// cert_verification: false - insecure
|
||||
super::PatternMetadata {
|
||||
tail_path: "tls/cert_verification".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: Some("false".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "Disabling TLS certificate verification allows man-in-the-middle attacks".to_string(),
|
||||
authority_source: Some("OWASP".to_string()),
|
||||
},
|
||||
// cert_verification: true - recommended
|
||||
super::PatternMetadata {
|
||||
tail_path: "tls/cert_verification".to_string(),
|
||||
predicate: "enabled".to_string(),
|
||||
value: Some("true".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "recommended".to_string(),
|
||||
explanation: "TLS certificate verification should always be enabled".to_string(),
|
||||
authority_source: Some("OWASP".to_string()),
|
||||
},
|
||||
]
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"danger_accept_invalid",
|
||||
|
||||
@ -367,6 +367,51 @@ impl Extractor for TlsVersionExtractor {
|
||||
fn verifiable_predicates(&self) -> Vec<(&str, &str)> {
|
||||
vec![("tls/min_version", "version")]
|
||||
}
|
||||
|
||||
fn pattern_metadata(&self) -> Vec<super::PatternMetadata> {
|
||||
vec![
|
||||
// TLS 1.0 - deprecated
|
||||
super::PatternMetadata {
|
||||
tail_path: "tls/min_version".to_string(),
|
||||
predicate: "version".to_string(),
|
||||
value: Some("1.0".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "TLS 1.0 is deprecated and must not be used".to_string(),
|
||||
authority_source: Some("RFC 8996".to_string()),
|
||||
},
|
||||
// TLS 1.1 - deprecated
|
||||
super::PatternMetadata {
|
||||
tail_path: "tls/min_version".to_string(),
|
||||
predicate: "version".to_string(),
|
||||
value: Some("1.1".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "TLS 1.1 is deprecated and must not be used".to_string(),
|
||||
authority_source: Some("RFC 8996".to_string()),
|
||||
},
|
||||
// TLS 1.2 - recommended minimum
|
||||
super::PatternMetadata {
|
||||
tail_path: "tls/min_version".to_string(),
|
||||
predicate: "version".to_string(),
|
||||
value: Some("1.2".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "recommended".to_string(),
|
||||
explanation: "TLS 1.2 is the recommended minimum version for secure communications".to_string(),
|
||||
authority_source: Some("RFC 8996".to_string()),
|
||||
},
|
||||
// TLS 1.3 - recommended
|
||||
super::PatternMetadata {
|
||||
tail_path: "tls/min_version".to_string(),
|
||||
predicate: "version".to_string(),
|
||||
value: Some("1.3".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "recommended".to_string(),
|
||||
explanation: "TLS 1.3 is the latest and most secure version of TLS".to_string(),
|
||||
authority_source: Some("RFC 8446".to_string()),
|
||||
},
|
||||
]
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -116,6 +116,41 @@ pub trait Extractor: Send + Sync {
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![]
|
||||
}
|
||||
|
||||
/// Declare metadata for patterns this extractor produces.
|
||||
///
|
||||
/// Returns enrichment metadata for patterns (category, verdict, explanation, authority).
|
||||
/// Used to transform "md5: true" → "MD5 is deprecated (NIST 2010)".
|
||||
///
|
||||
/// Default: empty (backward compatible — no enrichment).
|
||||
fn pattern_metadata(&self) -> Vec<PatternMetadata> {
|
||||
vec![]
|
||||
}
|
||||
}
|
||||
|
||||
/// Metadata for enriching a pattern with human context.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct PatternMetadata {
|
||||
/// Tail-path suffix (last 2 segments, e.g., "crypto/hashing/algorithm").
|
||||
pub tail_path: String,
|
||||
|
||||
/// Predicate (e.g., "algorithm").
|
||||
pub predicate: String,
|
||||
|
||||
/// Optional specific value (e.g., "md5"). None means wildcard (any value).
|
||||
pub value: Option<String>,
|
||||
|
||||
/// Category (e.g., "security", "architecture", "performance").
|
||||
pub category: String,
|
||||
|
||||
/// Verdict (e.g., "deprecated", "recommended", "emerging", "common", "noise").
|
||||
pub verdict: String,
|
||||
|
||||
/// Human-readable explanation.
|
||||
pub explanation: String,
|
||||
|
||||
/// Optional authority source (e.g., "RFC 8996", "NIST 2010").
|
||||
pub authority_source: Option<String>,
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
@ -308,6 +308,51 @@ impl Extractor for WeakCryptoExtractor {
|
||||
vec![("hashing/algorithm", "algorithm"), ("encryption/algorithm", "algorithm")]
|
||||
}
|
||||
|
||||
fn pattern_metadata(&self) -> Vec<super::PatternMetadata> {
|
||||
vec![
|
||||
// MD5 - deprecated
|
||||
super::PatternMetadata {
|
||||
tail_path: "crypto/hashing/algorithm".to_string(),
|
||||
predicate: "algorithm".to_string(),
|
||||
value: Some("md5".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "MD5 is cryptographically broken and unsuitable for security purposes".to_string(),
|
||||
authority_source: Some("NIST SP 800-131A".to_string()),
|
||||
},
|
||||
// SHA1 - deprecated for security use
|
||||
super::PatternMetadata {
|
||||
tail_path: "crypto/hashing/algorithm".to_string(),
|
||||
predicate: "algorithm".to_string(),
|
||||
value: Some("sha1".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "SHA-1 is deprecated for cryptographic use due to collision attacks".to_string(),
|
||||
authority_source: Some("NIST SP 800-131A".to_string()),
|
||||
},
|
||||
// DES - weak encryption
|
||||
super::PatternMetadata {
|
||||
tail_path: "crypto/encryption/algorithm".to_string(),
|
||||
predicate: "algorithm".to_string(),
|
||||
value: Some("des".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "DES has a small 56-bit key size and is vulnerable to brute force".to_string(),
|
||||
authority_source: Some("NIST FIPS 140-2".to_string()),
|
||||
},
|
||||
// RC4 - broken cipher
|
||||
super::PatternMetadata {
|
||||
tail_path: "crypto/encryption/algorithm".to_string(),
|
||||
predicate: "algorithm".to_string(),
|
||||
value: Some("rc4".to_string()),
|
||||
category: "security".to_string(),
|
||||
verdict: "deprecated".to_string(),
|
||||
explanation: "RC4 stream cipher has known biases and is cryptographically broken".to_string(),
|
||||
authority_source: Some("RFC 7465".to_string()),
|
||||
},
|
||||
]
|
||||
}
|
||||
|
||||
fn screening_patterns(&self) -> Vec<&str> {
|
||||
vec![
|
||||
r"(?i)md5|Md5",
|
||||
|
||||
@ -7,6 +7,7 @@ use aphoria::claims_file::ClaimsFile;
|
||||
use aphoria::pending_markers::{MarkerStatus, PendingMarkersFile};
|
||||
use aphoria::AphoriaConfig;
|
||||
use aphoria::{parse_authority_tier, AuthoredClaim, AuthoredValue, ClaimStatus};
|
||||
use chrono::Utc;
|
||||
|
||||
use crate::cli::ClaimsCommands;
|
||||
|
||||
@ -137,6 +138,13 @@ pub async fn handle_claims_command(command: ClaimsCommands, config: &AphoriaConf
|
||||
ClaimsCommands::RejectMarker { marker_id, reason } => {
|
||||
handle_reject_marker(marker_id, reason, config).await
|
||||
}
|
||||
ClaimsCommands::Import {
|
||||
file,
|
||||
authority_tier,
|
||||
source_guide,
|
||||
dry_run,
|
||||
merge,
|
||||
} => handle_claims_import(file, authority_tier, source_guide, dry_run, merge, config).await,
|
||||
}
|
||||
}
|
||||
|
||||
@ -934,3 +942,170 @@ async fn handle_reject_marker(
|
||||
println!("✗ Marker {} rejected: \"{}\"", marker_id, reason);
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
|
||||
async fn handle_claims_import(
|
||||
file: std::path::PathBuf,
|
||||
authority_tier: Option<String>,
|
||||
source_guide: Option<String>,
|
||||
dry_run: bool,
|
||||
merge: String,
|
||||
_config: &AphoriaConfig,
|
||||
) -> ExitCode {
|
||||
use aphoria::AuthoredClaim;
|
||||
use aphoria::claims_file::ClaimsFile;
|
||||
|
||||
// Get project root
|
||||
let root = match project_root() {
|
||||
Ok(r) => r,
|
||||
Err(code) => return code,
|
||||
};
|
||||
|
||||
// Load import file
|
||||
let import_content = match std::fs::read_to_string(&file) {
|
||||
Ok(c) => c,
|
||||
Err(e) => {
|
||||
eprintln!("Error reading import file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
// Parse claims from TOML
|
||||
#[derive(serde::Deserialize)]
|
||||
struct ImportFile {
|
||||
claim: Vec<AuthoredClaim>,
|
||||
}
|
||||
|
||||
let mut import: ImportFile = match toml::from_str(&import_content) {
|
||||
Ok(i) => i,
|
||||
Err(e) => {
|
||||
eprintln!("Error parsing import file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
// Override authority tier if specified
|
||||
if let Some(ref tier) = authority_tier {
|
||||
for claim in &mut import.claim {
|
||||
claim.authority_tier = tier.clone();
|
||||
}
|
||||
}
|
||||
|
||||
// Load existing claims
|
||||
let claims_path = ClaimsFile::default_path(&root);
|
||||
let mut claims_file = match ClaimsFile::load(&claims_path) {
|
||||
Ok(f) => f,
|
||||
Err(e) => {
|
||||
eprintln!("Error loading claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
};
|
||||
|
||||
// Determine merge strategy
|
||||
let mut added_count = 0;
|
||||
let mut skipped_count = 0;
|
||||
let mut overwritten_count = 0;
|
||||
|
||||
for claim in import.claim {
|
||||
let existing = claims_file.claims.iter().position(|c| c.id == claim.id);
|
||||
|
||||
match (existing, merge.as_str()) {
|
||||
(Some(_idx), "skip_existing") => {
|
||||
skipped_count += 1;
|
||||
if dry_run {
|
||||
println!("Would skip existing claim: {}", claim.id);
|
||||
}
|
||||
}
|
||||
(Some(idx), "overwrite") => {
|
||||
if dry_run {
|
||||
println!("Would overwrite claim: {}", claim.id);
|
||||
} else {
|
||||
claims_file.claims[idx] = claim;
|
||||
}
|
||||
overwritten_count += 1;
|
||||
}
|
||||
(Some(_), "fail_on_duplicate") => {
|
||||
eprintln!("Error: Duplicate claim ID: {}", claim.id);
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
(None, _) => {
|
||||
if dry_run {
|
||||
println!("Would add claim: {}", claim.id);
|
||||
} else {
|
||||
claims_file.claims.push(claim);
|
||||
}
|
||||
added_count += 1;
|
||||
}
|
||||
_ => {
|
||||
eprintln!("Invalid merge strategy: {merge}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Save (unless dry-run)
|
||||
if !dry_run {
|
||||
if let Err(e) = claims_file.save(&claims_path) {
|
||||
eprintln!("Error saving claims file: {e}");
|
||||
return ExitCode::from(3);
|
||||
}
|
||||
|
||||
// Track guideline metadata if source_guide is provided
|
||||
if let Some(guide_name) = source_guide {
|
||||
use aphoria::ingested_guides::{GuidelineMetadata, IngestedGuidesFile};
|
||||
|
||||
let guides_path = IngestedGuidesFile::default_path(&root);
|
||||
let mut guides_file = IngestedGuidesFile::load(&guides_path).unwrap_or_default();
|
||||
|
||||
// Compute document hash if source file exists
|
||||
let (source_path, document_hash) = if let Ok(content) = std::fs::read(&file) {
|
||||
use blake3::Hasher;
|
||||
let mut hasher = Hasher::new();
|
||||
hasher.update(&content);
|
||||
let hash = hasher.finalize();
|
||||
(Some(file.clone()), Some(format!("blake3:{}", hash.to_hex())))
|
||||
} else {
|
||||
(None, None)
|
||||
};
|
||||
|
||||
// Create guideline metadata
|
||||
let guideline = GuidelineMetadata {
|
||||
id: guide_name.clone(),
|
||||
name: guide_name.clone(),
|
||||
source_path,
|
||||
document_hash,
|
||||
ingested_at: Utc::now().to_rfc3339(),
|
||||
claims_count: added_count + overwritten_count,
|
||||
authority_tier: authority_tier.clone().unwrap_or_else(|| "team_policy".to_string()),
|
||||
category: "imported".to_string(),
|
||||
claim_ids: claims_file
|
||||
.claims
|
||||
.iter()
|
||||
.rev()
|
||||
.take(added_count + overwritten_count)
|
||||
.map(|c| c.id.clone())
|
||||
.collect(),
|
||||
};
|
||||
|
||||
guides_file.upsert(guideline);
|
||||
|
||||
if let Err(e) = guides_file.save(&guides_path) {
|
||||
eprintln!("Warning: Failed to save guideline tracking: {e}");
|
||||
} else {
|
||||
println!(" Guideline tracked: {guide_name}");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Report results
|
||||
if dry_run {
|
||||
println!("\n🔍 Dry-run mode (no changes written)");
|
||||
} else {
|
||||
println!("\n✓ Import complete");
|
||||
}
|
||||
println!(" Added: {added_count}");
|
||||
println!(" Overwritten: {overwritten_count}");
|
||||
println!(" Skipped: {skipped_count}");
|
||||
println!(" Total imported: {}", added_count + overwritten_count + skipped_count);
|
||||
|
||||
ExitCode::SUCCESS
|
||||
}
|
||||
|
||||
@ -162,6 +162,7 @@ pub use shadow::{
|
||||
};
|
||||
#[allow(deprecated)]
|
||||
pub use types::ExtractedClaim; // Backward compat alias for Observation
|
||||
pub use types::ingested_guides;
|
||||
pub use types::{
|
||||
extract_leaf_concept, format_authority_tier, parse_authority_tier, predicates, AcknowledgeArgs,
|
||||
AuthoredClaim, AuthoredValue, BlessArgs, ClaimStatus, ClaimValue, ComparisonMode,
|
||||
|
||||
@ -179,17 +179,18 @@ impl std::fmt::Display for ClaimStatus {
|
||||
|
||||
/// Parse an authority tier string into a `SourceClass`.
|
||||
///
|
||||
/// Accepted values: "regulatory", "clinical", "observational", "expert", "community", "anecdotal".
|
||||
/// Accepted values: "regulatory", "clinical", "observational", "team_policy", "expert", "community", "anecdotal".
|
||||
pub fn parse_authority_tier(s: &str) -> Result<SourceClass, AphoriaError> {
|
||||
match s.to_lowercase().as_str() {
|
||||
"regulatory" => Ok(SourceClass::Regulatory),
|
||||
"clinical" => Ok(SourceClass::Clinical),
|
||||
"observational" => Ok(SourceClass::Observational),
|
||||
"team_policy" => Ok(SourceClass::TeamPolicy),
|
||||
"expert" => Ok(SourceClass::Expert),
|
||||
"community" => Ok(SourceClass::Community),
|
||||
"anecdotal" => Ok(SourceClass::Anecdotal),
|
||||
_ => Err(AphoriaError::Claims(format!(
|
||||
"Unknown authority tier '{s}'. Expected: regulatory, clinical, observational, expert, community, anecdotal"
|
||||
"Unknown authority tier '{s}'. Expected: regulatory, clinical, observational, team_policy, expert, community, anecdotal"
|
||||
))),
|
||||
}
|
||||
}
|
||||
@ -200,11 +201,17 @@ pub fn format_authority_tier(source_class: SourceClass) -> String {
|
||||
SourceClass::Regulatory => "Regulatory",
|
||||
SourceClass::Clinical => "Clinical",
|
||||
SourceClass::Observational => "Observational",
|
||||
SourceClass::TeamPolicy => "TeamPolicy",
|
||||
SourceClass::Expert => "Expert",
|
||||
SourceClass::Community => "Community",
|
||||
SourceClass::Anecdotal => "Anecdotal",
|
||||
};
|
||||
format!("{name} (Tier {tier})", tier = source_class.tier())
|
||||
let tier = source_class.tier_fractional();
|
||||
if tier.fract() == 0.0 {
|
||||
format!("{name} (Tier {tier})", tier = tier as u8)
|
||||
} else {
|
||||
format!("{name} (Tier {tier})")
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
@ -235,6 +242,7 @@ mod tests {
|
||||
assert_eq!(parse_authority_tier("regulatory").ok(), Some(SourceClass::Regulatory));
|
||||
assert_eq!(parse_authority_tier("Expert").ok(), Some(SourceClass::Expert));
|
||||
assert_eq!(parse_authority_tier("CLINICAL").ok(), Some(SourceClass::Clinical));
|
||||
assert_eq!(parse_authority_tier("team_policy").ok(), Some(SourceClass::TeamPolicy));
|
||||
assert!(parse_authority_tier("unknown").is_err());
|
||||
}
|
||||
|
||||
@ -242,6 +250,7 @@ mod tests {
|
||||
fn test_format_authority_tier() {
|
||||
assert_eq!(format_authority_tier(SourceClass::Expert), "Expert (Tier 3)");
|
||||
assert_eq!(format_authority_tier(SourceClass::Regulatory), "Regulatory (Tier 0)");
|
||||
assert_eq!(format_authority_tier(SourceClass::TeamPolicy), "TeamPolicy (Tier 2.5)");
|
||||
}
|
||||
|
||||
#[test]
|
||||
|
||||
218
applications/aphoria/src/types/ingested_guides.rs
Normal file
218
applications/aphoria/src/types/ingested_guides.rs
Normal file
@ -0,0 +1,218 @@
|
||||
//! Guideline tracking for best practices ingestion.
|
||||
//!
|
||||
//! Tracks which architectural/security guidelines have been imported as claims,
|
||||
//! enabling change detection and compliance filtering.
|
||||
|
||||
use std::path::{Path, PathBuf};
|
||||
|
||||
use serde::{Deserialize, Serialize};
|
||||
|
||||
use crate::AphoriaError;
|
||||
|
||||
#[cfg(test)]
|
||||
use chrono::Utc;
|
||||
|
||||
/// Tracks a guideline that has been ingested as claims.
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
#[allow(dead_code)] // Used in handlers and tests
|
||||
pub struct GuidelineMetadata {
|
||||
/// Unique guideline ID (e.g., "hexagonal-arch", "owasp-top-10")
|
||||
pub id: String,
|
||||
|
||||
/// Human-readable name
|
||||
pub name: String,
|
||||
|
||||
/// Path to the source document (relative to project root)
|
||||
pub source_path: Option<PathBuf>,
|
||||
|
||||
/// BLAKE3 hash of the source document (for change detection)
|
||||
pub document_hash: Option<String>,
|
||||
|
||||
/// When this guideline was first ingested
|
||||
pub ingested_at: String,
|
||||
|
||||
/// How many claims were created from this guideline
|
||||
pub claims_count: usize,
|
||||
|
||||
/// Authority tier applied to claims (e.g., "team_policy")
|
||||
pub authority_tier: String,
|
||||
|
||||
/// Category (e.g., "architecture", "security")
|
||||
pub category: String,
|
||||
|
||||
/// Claim IDs associated with this guideline
|
||||
pub claim_ids: Vec<String>,
|
||||
}
|
||||
|
||||
/// Manages the ingested guidelines registry (.aphoria/ingested_guides.toml).
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
|
||||
#[allow(dead_code)] // Used in handlers and tests
|
||||
pub struct IngestedGuidesFile {
|
||||
/// List of tracked guidelines.
|
||||
#[serde(default)]
|
||||
pub guide: Vec<GuidelineMetadata>,
|
||||
}
|
||||
|
||||
#[allow(dead_code)] // Methods used by handlers and tests
|
||||
impl IngestedGuidesFile {
|
||||
/// Default path for the ingested guides file.
|
||||
pub fn default_path(project_root: &Path) -> PathBuf {
|
||||
project_root.join(".aphoria/ingested_guides.toml")
|
||||
}
|
||||
|
||||
/// Load from TOML file.
|
||||
pub fn load(path: &Path) -> Result<Self, AphoriaError> {
|
||||
if !path.exists() {
|
||||
return Ok(Self::default());
|
||||
}
|
||||
|
||||
let content = std::fs::read_to_string(path)?;
|
||||
|
||||
toml::from_str(&content)
|
||||
.map_err(|e| AphoriaError::Claims(format!("Failed to parse ingested guides file: {e}")))
|
||||
}
|
||||
|
||||
/// Save to TOML file.
|
||||
pub fn save(&self, path: &Path) -> Result<(), AphoriaError> {
|
||||
let content = toml::to_string_pretty(self)
|
||||
.map_err(|e| AphoriaError::Claims(format!("Failed to serialize ingested guides: {e}")))?;
|
||||
|
||||
if let Some(parent) = path.parent() {
|
||||
std::fs::create_dir_all(parent)?;
|
||||
}
|
||||
|
||||
std::fs::write(path, content)?;
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Add or update a guideline.
|
||||
///
|
||||
/// If a guideline with the same ID exists, it is replaced.
|
||||
/// Otherwise, the guideline is appended to the list.
|
||||
pub fn upsert(&mut self, guideline: GuidelineMetadata) {
|
||||
if let Some(idx) = self.guide.iter().position(|g| g.id == guideline.id) {
|
||||
self.guide[idx] = guideline;
|
||||
} else {
|
||||
self.guide.push(guideline);
|
||||
}
|
||||
}
|
||||
|
||||
/// Get a guideline by ID.
|
||||
///
|
||||
/// Returns None if the guideline doesn't exist.
|
||||
pub fn get(&self, id: &str) -> Option<&GuidelineMetadata> {
|
||||
self.guide.iter().find(|g| g.id == id)
|
||||
}
|
||||
|
||||
/// Remove a guideline by ID.
|
||||
///
|
||||
/// Returns true if the guideline was removed, false if it didn't exist.
|
||||
pub fn remove(&mut self, id: &str) -> bool {
|
||||
if let Some(idx) = self.guide.iter().position(|g| g.id == id) {
|
||||
self.guide.remove(idx);
|
||||
true
|
||||
} else {
|
||||
false
|
||||
}
|
||||
}
|
||||
|
||||
/// List all guidelines, optionally filtered by category.
|
||||
///
|
||||
/// Pass None to get all guidelines, or Some(category) to filter.
|
||||
pub fn list(&self, category: Option<&str>) -> Vec<&GuidelineMetadata> {
|
||||
self.guide
|
||||
.iter()
|
||||
.filter(|g| category.map_or(true, |c| g.category == c))
|
||||
.collect()
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn test_guideline_upsert() {
|
||||
let mut file = IngestedGuidesFile::default();
|
||||
|
||||
let guide = GuidelineMetadata {
|
||||
id: "test-guide".to_string(),
|
||||
name: "Test Guide".to_string(),
|
||||
source_path: None,
|
||||
document_hash: None,
|
||||
ingested_at: Utc::now().to_rfc3339(),
|
||||
claims_count: 5,
|
||||
authority_tier: "team_policy".to_string(),
|
||||
category: "architecture".to_string(),
|
||||
claim_ids: vec!["claim-1".to_string(), "claim-2".to_string()],
|
||||
};
|
||||
|
||||
file.upsert(guide.clone());
|
||||
assert_eq!(file.guide.len(), 1);
|
||||
|
||||
// Update
|
||||
let mut updated = guide.clone();
|
||||
updated.claims_count = 10;
|
||||
file.upsert(updated);
|
||||
assert_eq!(file.guide.len(), 1);
|
||||
assert_eq!(file.guide[0].claims_count, 10);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_guideline_get_and_remove() {
|
||||
let mut file = IngestedGuidesFile::default();
|
||||
|
||||
let guide = GuidelineMetadata {
|
||||
id: "test-guide".to_string(),
|
||||
name: "Test Guide".to_string(),
|
||||
source_path: None,
|
||||
document_hash: None,
|
||||
ingested_at: Utc::now().to_rfc3339(),
|
||||
claims_count: 5,
|
||||
authority_tier: "team_policy".to_string(),
|
||||
category: "architecture".to_string(),
|
||||
claim_ids: vec![],
|
||||
};
|
||||
|
||||
file.upsert(guide);
|
||||
|
||||
assert!(file.get("test-guide").is_some());
|
||||
assert!(file.remove("test-guide"));
|
||||
assert!(file.get("test-guide").is_none());
|
||||
assert!(!file.remove("test-guide"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_guideline_list_filter() {
|
||||
let mut file = IngestedGuidesFile::default();
|
||||
|
||||
file.upsert(GuidelineMetadata {
|
||||
id: "arch-1".to_string(),
|
||||
name: "Arch Guide 1".to_string(),
|
||||
source_path: None,
|
||||
document_hash: None,
|
||||
ingested_at: Utc::now().to_rfc3339(),
|
||||
claims_count: 5,
|
||||
authority_tier: "team_policy".to_string(),
|
||||
category: "architecture".to_string(),
|
||||
claim_ids: vec![],
|
||||
});
|
||||
|
||||
file.upsert(GuidelineMetadata {
|
||||
id: "sec-1".to_string(),
|
||||
name: "Security Guide 1".to_string(),
|
||||
source_path: None,
|
||||
document_hash: None,
|
||||
ingested_at: Utc::now().to_rfc3339(),
|
||||
claims_count: 3,
|
||||
authority_tier: "team_policy".to_string(),
|
||||
category: "security".to_string(),
|
||||
claim_ids: vec![],
|
||||
});
|
||||
|
||||
assert_eq!(file.list(None).len(), 2);
|
||||
assert_eq!(file.list(Some("architecture")).len(), 1);
|
||||
assert_eq!(file.list(Some("security")).len(), 1);
|
||||
}
|
||||
}
|
||||
@ -3,6 +3,7 @@
|
||||
pub mod authored_claim;
|
||||
mod claim;
|
||||
mod command;
|
||||
pub mod ingested_guides;
|
||||
mod language;
|
||||
mod result;
|
||||
mod verdict;
|
||||
|
||||
227
applications/aphoria/tests/gap_fixes_integration.rs
Normal file
227
applications/aphoria/tests/gap_fixes_integration.rs
Normal file
@ -0,0 +1,227 @@
|
||||
//! Integration tests for Gap 1 and Gap 5 fixes.
|
||||
//!
|
||||
//! Gap 1: Observations should use confidence-based tiers (4 or 5), not Tier 3
|
||||
//! Gap 5: Superseding claims should auto-deprecate old claims, warn on duplicates
|
||||
|
||||
use aphoria::{AuthoredClaim, AuthoredValue, ClaimStatus, ComparisonMode};
|
||||
use aphoria::claims_file::ClaimsFile;
|
||||
use stemedb_core::types::SourceClass;
|
||||
use tempfile::TempDir;
|
||||
|
||||
/// Test Gap 1: Observations use confidence-based tiers (not Tier 3 Expert)
|
||||
#[test]
|
||||
fn test_gap1_observation_tiers() {
|
||||
// High confidence observation should be Tier 4 (Community)
|
||||
let high_confidence_tier = aphoria::bridge::observation_to_tier(0.95);
|
||||
assert_eq!(high_confidence_tier, SourceClass::Community);
|
||||
assert_eq!(high_confidence_tier.tier(), 4);
|
||||
assert!((high_confidence_tier.authority_weight() - 0.3).abs() < f32::EPSILON);
|
||||
|
||||
// Low confidence observation should be Tier 5 (Anecdotal)
|
||||
let low_confidence_tier = aphoria::bridge::observation_to_tier(0.7);
|
||||
assert_eq!(low_confidence_tier, SourceClass::Anecdotal);
|
||||
assert_eq!(low_confidence_tier.tier(), 5);
|
||||
assert!((low_confidence_tier.authority_weight() - 0.1).abs() < f32::EPSILON);
|
||||
|
||||
// Boundary case: exactly 0.9 should be Tier 4
|
||||
let boundary_tier = aphoria::bridge::observation_to_tier(0.9);
|
||||
assert_eq!(boundary_tier, SourceClass::Community);
|
||||
assert_eq!(boundary_tier.tier(), 4);
|
||||
}
|
||||
|
||||
/// Test Gap 5: Supersede auto-deprecates old claims
|
||||
#[test]
|
||||
fn test_gap5_supersede_auto_deprecates() {
|
||||
let temp_dir = TempDir::new().expect("create temp dir");
|
||||
let claims_path = temp_dir.path().join("claims.toml");
|
||||
|
||||
let mut claims_file = ClaimsFile::new();
|
||||
|
||||
// Create initial claim
|
||||
let claim_v1 = AuthoredClaim {
|
||||
id: "test-001".to_string(),
|
||||
concept_path: "test/feature/enabled".to_string(),
|
||||
predicate: "value".to_string(),
|
||||
value: AuthoredValue::Bool(true),
|
||||
comparison: ComparisonMode::Equals,
|
||||
provenance: "Initial implementation".to_string(),
|
||||
invariant: "Feature should be enabled".to_string(),
|
||||
consequence: "Feature disabled".to_string(),
|
||||
authority_tier: "expert".to_string(),
|
||||
evidence: vec![],
|
||||
category: "feature".to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: "dev".to_string(),
|
||||
created_at: "2026-02-08T10:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
claims_file.add(claim_v1);
|
||||
assert_eq!(claims_file.len(), 1);
|
||||
assert_eq!(
|
||||
claims_file.find_by_id("test-001").map(|c| &c.status),
|
||||
Some(&ClaimStatus::Active)
|
||||
);
|
||||
|
||||
// Supersede with v2
|
||||
let claim_v2 = AuthoredClaim {
|
||||
id: "test-002".to_string(),
|
||||
concept_path: "test/feature/enabled".to_string(),
|
||||
predicate: "value".to_string(),
|
||||
value: AuthoredValue::Bool(false),
|
||||
comparison: ComparisonMode::Equals,
|
||||
provenance: "Updated after review".to_string(),
|
||||
invariant: "Feature should be disabled".to_string(),
|
||||
consequence: "Feature enabled".to_string(),
|
||||
authority_tier: "expert".to_string(),
|
||||
evidence: vec!["Review notes".to_string()],
|
||||
category: "feature".to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: Some("test-001".to_string()),
|
||||
created_by: "lead".to_string(),
|
||||
created_at: "2026-02-08T11:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
claims_file.supersede("test-001", claim_v2).expect("supersede");
|
||||
|
||||
// Verify old claim is superseded
|
||||
assert_eq!(
|
||||
claims_file.find_by_id("test-001").map(|c| &c.status),
|
||||
Some(&ClaimStatus::Superseded)
|
||||
);
|
||||
|
||||
// Verify new claim is active
|
||||
assert_eq!(
|
||||
claims_file.find_by_id("test-002").map(|c| &c.status),
|
||||
Some(&ClaimStatus::Active)
|
||||
);
|
||||
|
||||
// Verify lineage link
|
||||
assert_eq!(
|
||||
claims_file.find_by_id("test-002").and_then(|c| c.supersedes.as_deref()),
|
||||
Some("test-001")
|
||||
);
|
||||
|
||||
// Verify persistence
|
||||
claims_file.save(&claims_path).expect("save");
|
||||
let loaded = ClaimsFile::load(&claims_path).expect("load");
|
||||
assert_eq!(loaded.len(), 2);
|
||||
assert_eq!(
|
||||
loaded.find_by_id("test-001").map(|c| &c.status),
|
||||
Some(&ClaimStatus::Superseded)
|
||||
);
|
||||
}
|
||||
|
||||
/// Test Gap 5: Duplicate validation warns when creating duplicate active claims
|
||||
#[test]
|
||||
fn test_gap5_duplicate_validation_warning() {
|
||||
let mut claims_file = ClaimsFile::new();
|
||||
|
||||
// Create first claim
|
||||
let claim1 = AuthoredClaim {
|
||||
id: "dup-001".to_string(),
|
||||
concept_path: "test/config/timeout".to_string(),
|
||||
predicate: "value".to_string(),
|
||||
value: AuthoredValue::Number(30.0),
|
||||
comparison: ComparisonMode::Equals,
|
||||
provenance: "Initial config".to_string(),
|
||||
invariant: "Timeout must be 30s".to_string(),
|
||||
consequence: "Requests timeout too fast".to_string(),
|
||||
authority_tier: "team_policy".to_string(),
|
||||
evidence: vec![],
|
||||
category: "config".to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: "dev1".to_string(),
|
||||
created_at: "2026-02-08T10:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
claims_file.add(claim1);
|
||||
|
||||
// Create duplicate (same concept_path + predicate, different ID)
|
||||
let claim2 = AuthoredClaim {
|
||||
id: "dup-002".to_string(),
|
||||
concept_path: "test/config/timeout".to_string(), // Same
|
||||
predicate: "value".to_string(), // Same
|
||||
value: AuthoredValue::Number(60.0), // Different value
|
||||
comparison: ComparisonMode::Equals,
|
||||
provenance: "Updated config".to_string(),
|
||||
invariant: "Timeout must be 60s".to_string(),
|
||||
consequence: "Requests timeout too slow".to_string(),
|
||||
authority_tier: "team_policy".to_string(),
|
||||
evidence: vec![],
|
||||
category: "config".to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: "dev2".to_string(),
|
||||
created_at: "2026-02-08T11:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
// This should print a warning (captured in test output)
|
||||
// but still add the claim
|
||||
claims_file.add(claim2);
|
||||
|
||||
assert_eq!(claims_file.len(), 2);
|
||||
assert_eq!(claims_file.find_by_status(&ClaimStatus::Active).len(), 2);
|
||||
}
|
||||
|
||||
/// Test Gap 5: No warning when duplicate is deprecated
|
||||
#[test]
|
||||
fn test_gap5_no_warning_for_deprecated_duplicate() {
|
||||
let mut claims_file = ClaimsFile::new();
|
||||
|
||||
// Create and deprecate first claim
|
||||
let claim1 = AuthoredClaim {
|
||||
id: "old-001".to_string(),
|
||||
concept_path: "test/feature/mode".to_string(),
|
||||
predicate: "value".to_string(),
|
||||
value: AuthoredValue::Text("legacy".to_string()),
|
||||
comparison: ComparisonMode::Equals,
|
||||
provenance: "Old implementation".to_string(),
|
||||
invariant: "Mode should be legacy".to_string(),
|
||||
consequence: "Mode incorrect".to_string(),
|
||||
authority_tier: "expert".to_string(),
|
||||
evidence: vec![],
|
||||
category: "feature".to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: "dev".to_string(),
|
||||
created_at: "2026-02-08T10:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
claims_file.add(claim1);
|
||||
claims_file.deprecate("old-001", "2026-02-08T11:00:00Z").expect("deprecate");
|
||||
|
||||
// Now add new claim with same concept_path/predicate
|
||||
// Should NOT warn because the first is deprecated
|
||||
let claim2 = AuthoredClaim {
|
||||
id: "new-001".to_string(),
|
||||
concept_path: "test/feature/mode".to_string(), // Same
|
||||
predicate: "value".to_string(), // Same
|
||||
value: AuthoredValue::Text("modern".to_string()),
|
||||
comparison: ComparisonMode::Equals,
|
||||
provenance: "New implementation".to_string(),
|
||||
invariant: "Mode should be modern".to_string(),
|
||||
consequence: "Mode incorrect".to_string(),
|
||||
authority_tier: "expert".to_string(),
|
||||
evidence: vec![],
|
||||
category: "feature".to_string(),
|
||||
status: ClaimStatus::Active,
|
||||
supersedes: None,
|
||||
created_by: "dev".to_string(),
|
||||
created_at: "2026-02-08T12:00:00Z".to_string(),
|
||||
updated_at: None,
|
||||
};
|
||||
|
||||
// Should NOT print warning
|
||||
claims_file.add(claim2);
|
||||
|
||||
assert_eq!(claims_file.len(), 2);
|
||||
assert_eq!(claims_file.find_by_status(&ClaimStatus::Active).len(), 1);
|
||||
assert_eq!(claims_file.find_by_status(&ClaimStatus::Deprecated).len(), 1);
|
||||
}
|
||||
@ -1,5 +1,7 @@
|
||||
# Aphoria
|
||||
|
||||
> **Product Vision:** This document describes Aphoria's product vision as a knowledge compounding system that learns from your organization's decisions. For the protocol-level vision (EAP standard), see [Protocol Vision](protocol_vision.md).
|
||||
|
||||
**Self-learning institutional knowledge that compounds with every commit.**
|
||||
|
||||
Aphoria transforms your organization's implicit decisions into explicit, auditable, shareable knowledge. Every commit teaches the system. Every new hire benefits from what came before. Knowledge compounds instead of walking out the door.
|
||||
|
||||
@ -257,6 +257,22 @@ pub struct PatternDto {
|
||||
|
||||
/// Unix timestamp of most recent observation.
|
||||
pub last_seen: u64,
|
||||
|
||||
/// Optional enrichment: pattern category (e.g., "security", "architecture", "performance").
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub category: Option<String>,
|
||||
|
||||
/// Optional enrichment: verdict (e.g., "deprecated", "recommended", "emerging", "common", "noise").
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub verdict: Option<String>,
|
||||
|
||||
/// Optional enrichment: human-readable explanation of the pattern.
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub explanation: Option<String>,
|
||||
|
||||
/// Optional enrichment: authority source (e.g., "RFC 8996", "NIST 2010").
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub authority_source: Option<String>,
|
||||
}
|
||||
|
||||
// ============================================================================
|
||||
@ -348,6 +364,12 @@ pub enum ComparisonModeDto {
|
||||
/// No observation should exist at this path.
|
||||
#[serde(rename = "absent")]
|
||||
Absent,
|
||||
/// Observation value must contain claim value as substring/element.
|
||||
#[serde(rename = "contains")]
|
||||
Contains,
|
||||
/// Observation value must NOT contain claim value as substring/element.
|
||||
#[serde(rename = "not_contains")]
|
||||
NotContains,
|
||||
}
|
||||
|
||||
/// Claim lifecycle status.
|
||||
|
||||
@ -64,6 +64,9 @@ pub enum SourceClassDto {
|
||||
/// Tier 2: Observational studies, real-world evidence
|
||||
Observational,
|
||||
|
||||
/// Tier 2.5: Team-level architectural guidelines and policies
|
||||
TeamPolicy,
|
||||
|
||||
/// Tier 3: Expert opinions, medical guidelines
|
||||
Expert,
|
||||
|
||||
@ -240,6 +243,7 @@ impl From<SourceClass> for SourceClassDto {
|
||||
SourceClass::Regulatory => SourceClassDto::Regulatory,
|
||||
SourceClass::Clinical => SourceClassDto::Clinical,
|
||||
SourceClass::Observational => SourceClassDto::Observational,
|
||||
SourceClass::TeamPolicy => SourceClassDto::TeamPolicy,
|
||||
SourceClass::Expert => SourceClassDto::Expert,
|
||||
SourceClass::Community => SourceClassDto::Community,
|
||||
SourceClass::Anecdotal => SourceClassDto::Anecdotal,
|
||||
@ -253,6 +257,7 @@ impl From<SourceClassDto> for SourceClass {
|
||||
SourceClassDto::Regulatory => SourceClass::Regulatory,
|
||||
SourceClassDto::Clinical => SourceClass::Clinical,
|
||||
SourceClassDto::Observational => SourceClass::Observational,
|
||||
SourceClassDto::TeamPolicy => SourceClass::TeamPolicy,
|
||||
SourceClassDto::Expert => SourceClass::Expert,
|
||||
SourceClassDto::Community => SourceClass::Community,
|
||||
SourceClassDto::Anecdotal => SourceClass::Anecdotal,
|
||||
|
||||
@ -550,6 +550,8 @@ fn comparison_mode_to_dto(mode: ComparisonMode) -> ComparisonModeDto {
|
||||
ComparisonMode::NotEquals => ComparisonModeDto::NotEquals,
|
||||
ComparisonMode::Present => ComparisonModeDto::Present,
|
||||
ComparisonMode::Absent => ComparisonModeDto::Absent,
|
||||
ComparisonMode::Contains => ComparisonModeDto::Contains,
|
||||
ComparisonMode::NotContains => ComparisonModeDto::NotContains,
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@ -16,6 +16,25 @@ use crate::{
|
||||
|
||||
use super::super::aphoria_helpers::{compute_assertion_hash, observation_dto_to_assertion};
|
||||
|
||||
#[cfg(feature = "aphoria")]
|
||||
use aphoria::{
|
||||
AphoriaConfig,
|
||||
corpus::PatternEnricher,
|
||||
extractors::ExtractorRegistry,
|
||||
};
|
||||
|
||||
/// Extract tail path from subject for enrichment matching.
|
||||
///
|
||||
/// Tail path is the last 2 segments: "code://rust/*/core/imports/std" → "imports/std"
|
||||
fn extract_tail_path(subject: &str) -> String {
|
||||
let parts: Vec<&str> = subject.split('/').filter(|s| !s.is_empty()).collect();
|
||||
if parts.len() >= 2 {
|
||||
format!("{}/{}", parts[parts.len() - 2], parts[parts.len() - 1])
|
||||
} else {
|
||||
subject.to_string()
|
||||
}
|
||||
}
|
||||
|
||||
/// Push observations from an Aphoria client (hosted mode).
|
||||
///
|
||||
/// This endpoint receives observations from teams running Aphoria in hosted
|
||||
@ -236,16 +255,51 @@ pub async fn get_patterns(
|
||||
|
||||
let total_matching = aggregates.len();
|
||||
|
||||
// Create enricher for query-time enrichment
|
||||
#[cfg(feature = "aphoria")]
|
||||
let enricher = {
|
||||
let config = AphoriaConfig::default();
|
||||
let registry = ExtractorRegistry::new(&config);
|
||||
PatternEnricher::from_registry(®istry)
|
||||
};
|
||||
|
||||
let patterns: Vec<PatternDto> = aggregates
|
||||
.into_iter()
|
||||
.map(|agg| PatternDto {
|
||||
subject: agg.subject,
|
||||
predicate: agg.predicate,
|
||||
value: agg.value_display,
|
||||
project_count: agg.project_count,
|
||||
observation_count: agg.observation_count,
|
||||
first_seen: agg.first_seen,
|
||||
last_seen: agg.last_seen,
|
||||
.map(|agg| {
|
||||
// Try to enrich pattern if aphoria feature is enabled
|
||||
#[cfg(feature = "aphoria")]
|
||||
let (category, verdict, explanation, authority_source) = {
|
||||
if agg.category.is_some() {
|
||||
// Pattern already enriched at write time
|
||||
(agg.category, agg.verdict, agg.explanation, agg.authority_source)
|
||||
} else {
|
||||
// Enrich at query time
|
||||
let tail_path = extract_tail_path(&agg.subject);
|
||||
if let Some(enrichment) = enricher.enrich(&tail_path, &agg.predicate, &agg.value_display) {
|
||||
(enrichment.category, enrichment.verdict, enrichment.explanation, enrichment.authority_source)
|
||||
} else {
|
||||
(None, None, None, None)
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
#[cfg(not(feature = "aphoria"))]
|
||||
let (category, verdict, explanation, authority_source) =
|
||||
(agg.category, agg.verdict, agg.explanation, agg.authority_source);
|
||||
|
||||
PatternDto {
|
||||
subject: agg.subject,
|
||||
predicate: agg.predicate,
|
||||
value: agg.value_display,
|
||||
project_count: agg.project_count,
|
||||
observation_count: agg.observation_count,
|
||||
first_seen: agg.first_seen,
|
||||
last_seen: agg.last_seen,
|
||||
category,
|
||||
verdict,
|
||||
explanation,
|
||||
authority_source,
|
||||
}
|
||||
})
|
||||
.collect();
|
||||
|
||||
|
||||
@ -149,6 +149,7 @@ fn source_class_to_dto(sc: SourceClass) -> SourceClassDto {
|
||||
SourceClass::Regulatory => SourceClassDto::Regulatory,
|
||||
SourceClass::Clinical => SourceClassDto::Clinical,
|
||||
SourceClass::Observational => SourceClassDto::Observational,
|
||||
SourceClass::TeamPolicy => SourceClassDto::TeamPolicy,
|
||||
SourceClass::Expert => SourceClassDto::Expert,
|
||||
SourceClass::Community => SourceClassDto::Community,
|
||||
SourceClass::Anecdotal => SourceClassDto::Anecdotal,
|
||||
|
||||
@ -17,6 +17,7 @@ use rkyv::{Archive, Deserialize, Serialize};
|
||||
/// | 0 | Regulatory | FDA approval letters, EMA assessments |
|
||||
/// | 1 | Clinical | Phase III trials, peer-reviewed RCTs |
|
||||
/// | 2 | Observational | Real-world evidence, cohort studies |
|
||||
/// | 2.5 | TeamPolicy | Internal team architecture guidelines |
|
||||
/// | 3 | Expert | Medical professional opinions, guidelines |
|
||||
/// | 4 | Community | Curated forums, patient advocacy groups |
|
||||
/// | 5 | Anecdotal | Reddit posts, individual testimonials |
|
||||
@ -32,6 +33,9 @@ pub enum SourceClass {
|
||||
/// Tier 2: Observational studies, real-world evidence.
|
||||
/// Medium-high authority. Moderate decay.
|
||||
Observational,
|
||||
/// Tier 2.5: Team-level architectural guidelines and policies.
|
||||
/// Medium-high authority. Overrides community observations but respects industry standards.
|
||||
TeamPolicy,
|
||||
/// Tier 3: Expert opinions, medical guidelines.
|
||||
/// Medium authority. Faster decay as guidelines update.
|
||||
#[default]
|
||||
@ -45,18 +49,37 @@ pub enum SourceClass {
|
||||
}
|
||||
|
||||
impl SourceClass {
|
||||
/// Returns the tier number (0-5) for this source class.
|
||||
/// Returns the tier number (0-5, with 2.5 for TeamPolicy) for this source class.
|
||||
///
|
||||
/// Note: This returns u8, so TeamPolicy returns 2 (between Observational and Expert).
|
||||
/// Use tier_fractional() for the precise 2.5 value.
|
||||
pub fn tier(&self) -> u8 {
|
||||
match self {
|
||||
SourceClass::Regulatory => 0,
|
||||
SourceClass::Clinical => 1,
|
||||
SourceClass::Observational => 2,
|
||||
SourceClass::TeamPolicy => 2, // Actually 2.5, but u8 can't represent it
|
||||
SourceClass::Expert => 3,
|
||||
SourceClass::Community => 4,
|
||||
SourceClass::Anecdotal => 5,
|
||||
}
|
||||
}
|
||||
|
||||
/// Returns the fractional tier number for this source class.
|
||||
///
|
||||
/// Use this when you need precise tier values (e.g., TeamPolicy = 2.5).
|
||||
pub fn tier_fractional(&self) -> f32 {
|
||||
match self {
|
||||
SourceClass::Regulatory => 0.0,
|
||||
SourceClass::Clinical => 1.0,
|
||||
SourceClass::Observational => 2.0,
|
||||
SourceClass::TeamPolicy => 2.5,
|
||||
SourceClass::Expert => 3.0,
|
||||
SourceClass::Community => 4.0,
|
||||
SourceClass::Anecdotal => 5.0,
|
||||
}
|
||||
}
|
||||
|
||||
/// Returns the default decay half-life in days for this source class.
|
||||
///
|
||||
/// Higher tiers decay faster. Regulatory sources essentially never decay,
|
||||
@ -66,6 +89,7 @@ impl SourceClass {
|
||||
SourceClass::Regulatory => None, // Never decays
|
||||
SourceClass::Clinical => Some(730), // 2 years
|
||||
SourceClass::Observational => Some(365), // 1 year
|
||||
SourceClass::TeamPolicy => Some(180), // 6 months (same as Expert)
|
||||
SourceClass::Expert => Some(180), // 6 months
|
||||
SourceClass::Community => Some(90), // 3 months
|
||||
SourceClass::Anecdotal => Some(30), // 1 month
|
||||
@ -80,6 +104,7 @@ impl SourceClass {
|
||||
SourceClass::Regulatory => 1.0,
|
||||
SourceClass::Clinical => 0.9,
|
||||
SourceClass::Observational => 0.7,
|
||||
SourceClass::TeamPolicy => 0.6, // Between Observational (0.7) and Expert (0.5)
|
||||
SourceClass::Expert => 0.5,
|
||||
SourceClass::Community => 0.3,
|
||||
SourceClass::Anecdotal => 0.1,
|
||||
|
||||
@ -59,6 +59,7 @@ pub fn source_class_to_dto(sc: SourceClass) -> SourceClassDto {
|
||||
SourceClass::Regulatory => SourceClassDto::Regulatory,
|
||||
SourceClass::Clinical => SourceClassDto::Clinical,
|
||||
SourceClass::Observational => SourceClassDto::Observational,
|
||||
SourceClass::TeamPolicy => SourceClassDto::TeamPolicy,
|
||||
SourceClass::Expert => SourceClassDto::Expert,
|
||||
SourceClass::Community => SourceClassDto::Community,
|
||||
SourceClass::Anecdotal => SourceClassDto::Anecdotal,
|
||||
|
||||
@ -42,6 +42,8 @@ pub enum SourceClassDto {
|
||||
Clinical,
|
||||
/// Tier 2: Observational studies, real-world evidence
|
||||
Observational,
|
||||
/// Tier 2.5: Team-level architectural guidelines and policies
|
||||
TeamPolicy,
|
||||
/// Tier 3: Expert opinions, medical guidelines
|
||||
Expert,
|
||||
/// Tier 4: Curated community knowledge
|
||||
|
||||
@ -56,6 +56,22 @@ pub struct PatternAggregate {
|
||||
|
||||
/// Unix timestamp of most recent observation.
|
||||
pub last_seen: u64,
|
||||
|
||||
/// Optional enrichment: pattern category (e.g., "security", "architecture", "performance").
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub category: Option<String>,
|
||||
|
||||
/// Optional enrichment: verdict (e.g., "deprecated", "recommended", "emerging", "common", "noise").
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub verdict: Option<String>,
|
||||
|
||||
/// Optional enrichment: human-readable explanation of the pattern.
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub explanation: Option<String>,
|
||||
|
||||
/// Optional enrichment: authority source (e.g., "RFC 8996", "NIST 2010").
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub authority_source: Option<String>,
|
||||
}
|
||||
|
||||
/// Specialized storage trait for pattern aggregate operations.
|
||||
|
||||
@ -93,6 +93,10 @@ impl<S: KVStore + 'static> PatternAggregateStore for GenericPatternAggregateStor
|
||||
observation_count: obs_count,
|
||||
first_seen: timestamp,
|
||||
last_seen: timestamp,
|
||||
category: None,
|
||||
verdict: None,
|
||||
explanation: None,
|
||||
authority_source: None,
|
||||
},
|
||||
};
|
||||
|
||||
|
||||
Loading…
Reference in New Issue
Block a user