# Phase 17: Pattern Enrichment & Best Practices Infrastructure **Status:** ✅ Complete (Backend Only) **Date:** 2026-02-08 ## What Was Built This phase implemented **backend infrastructure** for enriched corpus patterns and team guideline ingestion. The features are **fully functional via CLI** but **not yet integrated with the dashboard UI**. --- ## 1. Enriched Pattern Metadata ### The Problem Community patterns showed bare statistics like "md5: true, 347 projects" with no context about whether MD5 is deprecated, recommended, or neutral. ### The Solution Extractors now provide enrichment metadata: ```rust pub struct PatternMetadata { pub tail_path: String, // "crypto/hashing/algorithm" pub predicate: String, // "algorithm" pub value: Option, // "md5" (or None for wildcard) pub category: String, // "security" pub verdict: String, // "deprecated" pub explanation: String, // "MD5 is cryptographically broken..." pub authority_source: Option, // "NIST SP 800-131A" } ``` ### What Works Now - 10 security extractors provide enrichment metadata - `PatternEnricher` service matches patterns to metadata (exact, wildcard, noise detection) - Data model supports category, verdict, explanation, authority_source ### What's Missing ❌ Dashboard doesn't display this metadata yet ❌ No category filter dropdown ❌ No "Hide noise" toggle ❌ No visual badges for deprecated/recommended --- ## 2. TeamPolicy Authority Tier ### The Problem No authority tier between community observations (tier 4) and expert opinions (tier 3) for team-level architectural guidelines. ### The Solution New **tier 2.5**: `TeamPolicy` - Sits between Observational (tier 2) and Expert (tier 3) - Authority weight: 0.6 (between 0.7 and 0.5) - Decay: 180 days (same as Expert) - Use case: Team architectural guidelines, internal standards ### What Works Now ```bash # Create team policy claim aphoria claims create \ --tier team_policy \ --id hex-arch-http-001 \ --concept-path myapp/adapters/http \ --predicate layer \ --value adapter \ --invariant "HTTP handlers MUST be in adapters layer" \ --consequence "Business logic leaks into infrastructure" \ --provenance "Architecture team decision 2026-02-08" \ --category architecture \ --by architecture-team ``` --- ## 3. Best Practices Import CLI ### The Problem Teams write extensive architectural guidelines in markdown/PDFs but have no way to automatically enforce them. ### The Solution Batch import claims from TOML files: ```bash # Preview import aphoria claims import docs/hexagonal-arch.toml --dry-run # Import with tracking aphoria claims import docs/hexagonal-arch.toml \ --authority-tier team_policy \ --source-guide "hexagonal-arch" ``` ### What Works Now - Batch import claims from TOML - Override authority tier for all claims - Merge strategies: `skip_existing`, `overwrite`, `fail_on_duplicate` - Dry-run preview - Guideline tracking in `.aphoria/ingested_guides.toml` ### Example TOML ```toml [[claim]] id = "hex-arch-http-001" concept_path = "myapp/adapters/http" predicate = "layer" value = "adapter" comparison = "equals" provenance = "Hexagonal Architecture Guidelines" invariant = "HTTP handlers MUST be in adapters layer" consequence = "Business logic leaks into infrastructure" authority_tier = "team_policy" category = "architecture" evidence = ["docs/architecture/hexagonal.md"] created_by = "architecture-team" created_at = "2026-02-08T12:00:00Z" ``` --- ## 4. Guideline Tracking ### The Problem No way to track which guidelines have been imported, detect changes, or filter compliance. ### The Solution `.aphoria/ingested_guides.toml` tracks imported guidelines: ```toml [[guide]] id = "hexagonal-arch" name = "Hexagonal Architecture Guidelines" source_path = "docs/hexagonal.md" document_hash = "blake3:abc123..." ingested_at = "2026-02-08T12:00:00Z" claims_count = 26 authority_tier = "team_policy" category = "architecture" claim_ids = ["hex-arch-http-001", "hex-arch-domain-imports-001", ...] ``` ### What Works Now - Guideline metadata tracked with BLAKE3 hash - Change detection (compare hash to detect doc updates) - Audit trail (who imported what, when) ### What's Missing ❌ `aphoria scan --check-policy ` not implemented ❌ No re-extraction workflow when source doc changes ❌ No compliance dashboard --- ## 5. Updated Comparison Modes ### What Was Added Two new comparison modes for list/substring matching: **Contains** - Value must contain substring/element ```toml comparison = "contains" value = "Serialize" # Passes: "Clone,Debug,Serialize" # Fails: "Clone,Debug" ``` **NotContains** - Value must NOT contain substring/element ```toml comparison = "not_contains" value = "Clone" # Passes: "Debug" # Fails: "Clone,Debug" ``` --- ## 10 Enriched Security Extractors | Extractor | Enriched Patterns | Authority Source | |-----------|-------------------|------------------| | `WeakCryptoExtractor` | MD5, SHA1 (deprecated), DES, RC4 | NIST SP 800-131A, RFC 7465 | | `TlsVersionExtractor` | TLS 1.0/1.1 (deprecated), 1.2/1.3 (recommended) | RFC 8996, RFC 8446 | | `TlsVerifyExtractor` | cert_verification: false (insecure) | OWASP | | `JwtConfigExtractor` | algorithm: none (forbidden) | RFC 7519 | | `CorsConfigExtractor` | allow_origin: * (insecure) | OWASP, W3C CORS Spec | | `HardcodedSecretsExtractor` | API keys/passwords (critical) | OWASP A07:2021 | | `SqlInjectionExtractor` | String interpolation (vulnerable) | OWASP A03:2021 | | `CommandInjectionExtractor` | Shell exec (vulnerable) | OWASP A03:2021 | | `PathTraversalExtractor` | User-controlled paths (vulnerable) | OWASP A01:2021 | | `InsecureDeserializationExtractor` | pickle/yaml.load (unsafe) | OWASP A08:2021 | --- ## Files Created/Modified ### New Files - `applications/aphoria/src/corpus/enricher.rs` - Pattern enrichment service - `applications/aphoria/src/types/ingested_guides.rs` - Guideline tracking ### Modified Files **Core Types:** - `crates/stemedb-core/src/types/source.rs` - TeamPolicy tier - `crates/stemedb-storage/src/pattern_aggregate_store/mod.rs` - Enrichment fields **Aphoria:** - `applications/aphoria/src/extractors/traits.rs` - `pattern_metadata()` method - `applications/aphoria/src/types/authored_claim.rs` - Contains/NotContains modes - `applications/aphoria/src/cli/claims.rs` - Import subcommand - `applications/aphoria/src/handlers/claims.rs` - Import handler - 10 extractor files with `pattern_metadata()` implementations **API & DTOs:** - `crates/stemedb-api/src/dto/enums.rs` - TeamPolicy DTO - `crates/stemedb-api/src/dto/aphoria/types.rs` - Contains/NotContains DTOs - `crates/stemedb-ontology/src/dto/enums.rs` - TeamPolicy DTO --- ## How to Use (CLI) ### 1. Create a guideline TOML file ```bash cat > docs/architecture-guidelines.toml <` - Show only violations of specific guideline - Pre-commit hook support for policy enforcement --- ## Testing All code compiles and passes existing tests. To verify: ```bash # Build workspace cargo build --workspace # Test aphoria cargo test --package aphoria # Try the import command aphoria claims import --help ``` --- ## Documentation Updated - ✅ `roadmap-archive.md` - Added Phase 17 - ✅ `roadmap.md` - Updated status table - ✅ `cli-reference.md` - Added `aphoria claims import` documentation - ✅ `comparison-modes.md` - Contains/NotContains already documented - ✅ This summary document --- ## Questions? **Q: Why can't I see any changes in the UI?** A: This phase implemented backend infrastructure only. The dashboard doesn't consume the enrichment metadata yet. **Q: How do I know it works?** A: Use the CLI commands. The `aphoria claims import` command is fully functional. **Q: When will this show up in the dashboard?** A: That requires frontend work to integrate the enrichment metadata into the UI components. **Q: Is this production-ready?** A: The backend is production-ready. The CLI commands work. The UI integration is not done.