This commit implements Phase 17 of the Aphoria roadmap, adding: **Inline Claim Markers (@aphoria:claim):** - New extractor for detecting inline markers in comments - Pending markers tracked in .aphoria/pending_markers.toml - CLI commands: list-markers, formalize-marker, reject-marker - Support for all major comment styles (Rust, Python, SQL, etc.) - Auto-sync during scan (configurable) **Claim Enrichment:** - ClaimEnrichment type with source attribution (inline, extractor, manual) - EnrichedClaimInfo with full enrichment metadata - Extended AuthoredClaim with optional enrichment field - API endpoints for enriched claim queries - Dashboard UI components (enrichment badge, verdict badge) **Enhanced Extractor Trait:** - verifiable_predicates() method for declaring (tail_path, predicate) pairs - 10 security extractors now implement verifiable_predicates - Enables claim suggester skill to find unclaimed patterns **Documentation:** - Phase 17 summary with complete implementation details - Gap fixes summary documenting 8 closed vision gaps - Updated CLI reference with new commands - New aphoria-docs skill for documentation maintenance - Updated roadmap with Phase 17 completion **Integration:** - ClaimsFile support for claim enrichment persistence - Pattern aggregate store support for enrichment queries - Dashboard filters and display for enrichment metadata - API handlers for list-markers and enrichment queries **Tests:** - New gap_fixes_integration test suite - Corpus enricher module with best practices ingestion Closes: VG-005, VG-017, VG-018, VG-019, VG-020, VG-021, VG-022, VG-023 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
334 lines
9.6 KiB
Markdown
334 lines
9.6 KiB
Markdown
# Phase 17: Pattern Enrichment & Best Practices Infrastructure
|
|
|
|
**Status:** ✅ Complete (Backend Only)
|
|
**Date:** 2026-02-08
|
|
|
|
## What Was Built
|
|
|
|
This phase implemented **backend infrastructure** for enriched corpus patterns and team guideline ingestion. The features are **fully functional via CLI** but **not yet integrated with the dashboard UI**.
|
|
|
|
---
|
|
|
|
## 1. Enriched Pattern Metadata
|
|
|
|
### The Problem
|
|
Community patterns showed bare statistics like "md5: true, 347 projects" with no context about whether MD5 is deprecated, recommended, or neutral.
|
|
|
|
### The Solution
|
|
Extractors now provide enrichment metadata:
|
|
|
|
```rust
|
|
pub struct PatternMetadata {
|
|
pub tail_path: String, // "crypto/hashing/algorithm"
|
|
pub predicate: String, // "algorithm"
|
|
pub value: Option<String>, // "md5" (or None for wildcard)
|
|
pub category: String, // "security"
|
|
pub verdict: String, // "deprecated"
|
|
pub explanation: String, // "MD5 is cryptographically broken..."
|
|
pub authority_source: Option<String>, // "NIST SP 800-131A"
|
|
}
|
|
```
|
|
|
|
### What Works Now
|
|
- 10 security extractors provide enrichment metadata
|
|
- `PatternEnricher` service matches patterns to metadata (exact, wildcard, noise detection)
|
|
- Data model supports category, verdict, explanation, authority_source
|
|
|
|
### What's Missing
|
|
❌ Dashboard doesn't display this metadata yet
|
|
❌ No category filter dropdown
|
|
❌ No "Hide noise" toggle
|
|
❌ No visual badges for deprecated/recommended
|
|
|
|
---
|
|
|
|
## 2. TeamPolicy Authority Tier
|
|
|
|
### The Problem
|
|
No authority tier between community observations (tier 4) and expert opinions (tier 3) for team-level architectural guidelines.
|
|
|
|
### The Solution
|
|
New **tier 2.5**: `TeamPolicy`
|
|
|
|
- Sits between Observational (tier 2) and Expert (tier 3)
|
|
- Authority weight: 0.6 (between 0.7 and 0.5)
|
|
- Decay: 180 days (same as Expert)
|
|
- Use case: Team architectural guidelines, internal standards
|
|
|
|
### What Works Now
|
|
```bash
|
|
# Create team policy claim
|
|
aphoria claims create \
|
|
--tier team_policy \
|
|
--id hex-arch-http-001 \
|
|
--concept-path myapp/adapters/http \
|
|
--predicate layer \
|
|
--value adapter \
|
|
--invariant "HTTP handlers MUST be in adapters layer" \
|
|
--consequence "Business logic leaks into infrastructure" \
|
|
--provenance "Architecture team decision 2026-02-08" \
|
|
--category architecture \
|
|
--by architecture-team
|
|
```
|
|
|
|
---
|
|
|
|
## 3. Best Practices Import CLI
|
|
|
|
### The Problem
|
|
Teams write extensive architectural guidelines in markdown/PDFs but have no way to automatically enforce them.
|
|
|
|
### The Solution
|
|
Batch import claims from TOML files:
|
|
|
|
```bash
|
|
# Preview import
|
|
aphoria claims import docs/hexagonal-arch.toml --dry-run
|
|
|
|
# Import with tracking
|
|
aphoria claims import docs/hexagonal-arch.toml \
|
|
--authority-tier team_policy \
|
|
--source-guide "hexagonal-arch"
|
|
```
|
|
|
|
### What Works Now
|
|
- Batch import claims from TOML
|
|
- Override authority tier for all claims
|
|
- Merge strategies: `skip_existing`, `overwrite`, `fail_on_duplicate`
|
|
- Dry-run preview
|
|
- Guideline tracking in `.aphoria/ingested_guides.toml`
|
|
|
|
### Example TOML
|
|
```toml
|
|
[[claim]]
|
|
id = "hex-arch-http-001"
|
|
concept_path = "myapp/adapters/http"
|
|
predicate = "layer"
|
|
value = "adapter"
|
|
comparison = "equals"
|
|
provenance = "Hexagonal Architecture Guidelines"
|
|
invariant = "HTTP handlers MUST be in adapters layer"
|
|
consequence = "Business logic leaks into infrastructure"
|
|
authority_tier = "team_policy"
|
|
category = "architecture"
|
|
evidence = ["docs/architecture/hexagonal.md"]
|
|
created_by = "architecture-team"
|
|
created_at = "2026-02-08T12:00:00Z"
|
|
```
|
|
|
|
---
|
|
|
|
## 4. Guideline Tracking
|
|
|
|
### The Problem
|
|
No way to track which guidelines have been imported, detect changes, or filter compliance.
|
|
|
|
### The Solution
|
|
`.aphoria/ingested_guides.toml` tracks imported guidelines:
|
|
|
|
```toml
|
|
[[guide]]
|
|
id = "hexagonal-arch"
|
|
name = "Hexagonal Architecture Guidelines"
|
|
source_path = "docs/hexagonal.md"
|
|
document_hash = "blake3:abc123..."
|
|
ingested_at = "2026-02-08T12:00:00Z"
|
|
claims_count = 26
|
|
authority_tier = "team_policy"
|
|
category = "architecture"
|
|
claim_ids = ["hex-arch-http-001", "hex-arch-domain-imports-001", ...]
|
|
```
|
|
|
|
### What Works Now
|
|
- Guideline metadata tracked with BLAKE3 hash
|
|
- Change detection (compare hash to detect doc updates)
|
|
- Audit trail (who imported what, when)
|
|
|
|
### What's Missing
|
|
❌ `aphoria scan --check-policy <guide-id>` not implemented
|
|
❌ No re-extraction workflow when source doc changes
|
|
❌ No compliance dashboard
|
|
|
|
---
|
|
|
|
## 5. Updated Comparison Modes
|
|
|
|
### What Was Added
|
|
Two new comparison modes for list/substring matching:
|
|
|
|
**Contains** - Value must contain substring/element
|
|
```toml
|
|
comparison = "contains"
|
|
value = "Serialize"
|
|
# Passes: "Clone,Debug,Serialize"
|
|
# Fails: "Clone,Debug"
|
|
```
|
|
|
|
**NotContains** - Value must NOT contain substring/element
|
|
```toml
|
|
comparison = "not_contains"
|
|
value = "Clone"
|
|
# Passes: "Debug"
|
|
# Fails: "Clone,Debug"
|
|
```
|
|
|
|
---
|
|
|
|
## 10 Enriched Security Extractors
|
|
|
|
| Extractor | Enriched Patterns | Authority Source |
|
|
|-----------|-------------------|------------------|
|
|
| `WeakCryptoExtractor` | MD5, SHA1 (deprecated), DES, RC4 | NIST SP 800-131A, RFC 7465 |
|
|
| `TlsVersionExtractor` | TLS 1.0/1.1 (deprecated), 1.2/1.3 (recommended) | RFC 8996, RFC 8446 |
|
|
| `TlsVerifyExtractor` | cert_verification: false (insecure) | OWASP |
|
|
| `JwtConfigExtractor` | algorithm: none (forbidden) | RFC 7519 |
|
|
| `CorsConfigExtractor` | allow_origin: * (insecure) | OWASP, W3C CORS Spec |
|
|
| `HardcodedSecretsExtractor` | API keys/passwords (critical) | OWASP A07:2021 |
|
|
| `SqlInjectionExtractor` | String interpolation (vulnerable) | OWASP A03:2021 |
|
|
| `CommandInjectionExtractor` | Shell exec (vulnerable) | OWASP A03:2021 |
|
|
| `PathTraversalExtractor` | User-controlled paths (vulnerable) | OWASP A01:2021 |
|
|
| `InsecureDeserializationExtractor` | pickle/yaml.load (unsafe) | OWASP A08:2021 |
|
|
|
|
---
|
|
|
|
## Files Created/Modified
|
|
|
|
### New Files
|
|
- `applications/aphoria/src/corpus/enricher.rs` - Pattern enrichment service
|
|
- `applications/aphoria/src/types/ingested_guides.rs` - Guideline tracking
|
|
|
|
### Modified Files
|
|
**Core Types:**
|
|
- `crates/stemedb-core/src/types/source.rs` - TeamPolicy tier
|
|
- `crates/stemedb-storage/src/pattern_aggregate_store/mod.rs` - Enrichment fields
|
|
|
|
**Aphoria:**
|
|
- `applications/aphoria/src/extractors/traits.rs` - `pattern_metadata()` method
|
|
- `applications/aphoria/src/types/authored_claim.rs` - Contains/NotContains modes
|
|
- `applications/aphoria/src/cli/claims.rs` - Import subcommand
|
|
- `applications/aphoria/src/handlers/claims.rs` - Import handler
|
|
- 10 extractor files with `pattern_metadata()` implementations
|
|
|
|
**API & DTOs:**
|
|
- `crates/stemedb-api/src/dto/enums.rs` - TeamPolicy DTO
|
|
- `crates/stemedb-api/src/dto/aphoria/types.rs` - Contains/NotContains DTOs
|
|
- `crates/stemedb-ontology/src/dto/enums.rs` - TeamPolicy DTO
|
|
|
|
---
|
|
|
|
## How to Use (CLI)
|
|
|
|
### 1. Create a guideline TOML file
|
|
```bash
|
|
cat > docs/architecture-guidelines.toml <<EOF
|
|
[[claim]]
|
|
id = "no-tokio-in-core"
|
|
concept_path = "myapp/core/imports/tokio"
|
|
predicate = "imported"
|
|
value = "true"
|
|
comparison = "absent"
|
|
provenance = "Architecture decision: core must be sync-only"
|
|
invariant = "Core modules MUST NOT import tokio"
|
|
consequence = "Creates async runtime coupling, breaks sync library users"
|
|
authority_tier = "team_policy"
|
|
category = "architecture"
|
|
evidence = ["ADR-003"]
|
|
created_by = "tech-lead"
|
|
created_at = "2026-02-08T12:00:00Z"
|
|
EOF
|
|
```
|
|
|
|
### 2. Import the guideline
|
|
```bash
|
|
aphoria claims import docs/architecture-guidelines.toml \
|
|
--source-guide "architecture-2026" \
|
|
--dry-run
|
|
```
|
|
|
|
### 3. Run verification
|
|
```bash
|
|
aphoria scan --persist
|
|
aphoria verify run
|
|
```
|
|
|
|
---
|
|
|
|
## What's NOT Done (UI Integration)
|
|
|
|
The backend is complete but the **dashboard doesn't display any of this**:
|
|
|
|
❌ Category badges (security/architecture/performance)
|
|
❌ Verdict badges (deprecated/recommended/emerging)
|
|
❌ Explanation tooltips ("MD5 is deprecated - NIST 2010")
|
|
❌ Filter by category dropdown
|
|
❌ "Hide noise" toggle
|
|
❌ Guideline compliance filtering (`--check-policy` flag)
|
|
❌ Compliance dashboard showing guideline status
|
|
|
|
---
|
|
|
|
## Next Steps
|
|
|
|
### To Make This User-Visible:
|
|
|
|
**Option 1: Dashboard Integration** (Frontend work)
|
|
- Add category/verdict badges to pattern cards
|
|
- Show explanations in tooltips
|
|
- Add category filter dropdown
|
|
- Implement "Hide noise" toggle
|
|
- Build compliance dashboard
|
|
|
|
**Option 2: Enhanced CLI Output** (Backend work)
|
|
- Show enrichment in `aphoria scan` table output
|
|
- Add `--show-enrichment` flag
|
|
- Color-code deprecated patterns (red), recommended (green)
|
|
- Filter by category: `aphoria scan --category security`
|
|
|
|
**Option 3: Policy Filtering** (Backend work)
|
|
- Implement `aphoria scan --check-policy <guide-id>`
|
|
- Show only violations of specific guideline
|
|
- Pre-commit hook support for policy enforcement
|
|
|
|
---
|
|
|
|
## Testing
|
|
|
|
All code compiles and passes existing tests. To verify:
|
|
|
|
```bash
|
|
# Build workspace
|
|
cargo build --workspace
|
|
|
|
# Test aphoria
|
|
cargo test --package aphoria
|
|
|
|
# Try the import command
|
|
aphoria claims import --help
|
|
```
|
|
|
|
---
|
|
|
|
## Documentation Updated
|
|
|
|
- ✅ `roadmap-archive.md` - Added Phase 17
|
|
- ✅ `roadmap.md` - Updated status table
|
|
- ✅ `cli-reference.md` - Added `aphoria claims import` documentation
|
|
- ✅ `comparison-modes.md` - Contains/NotContains already documented
|
|
- ✅ This summary document
|
|
|
|
---
|
|
|
|
## Questions?
|
|
|
|
**Q: Why can't I see any changes in the UI?**
|
|
A: This phase implemented backend infrastructure only. The dashboard doesn't consume the enrichment metadata yet.
|
|
|
|
**Q: How do I know it works?**
|
|
A: Use the CLI commands. The `aphoria claims import` command is fully functional.
|
|
|
|
**Q: When will this show up in the dashboard?**
|
|
A: That requires frontend work to integrate the enrichment metadata into the UI components.
|
|
|
|
**Q: Is this production-ready?**
|
|
A: The backend is production-ready. The CLI commands work. The UI integration is not done.
|