stemedb/applications/aphoria/docs/phase-17-summary.md
jml e95c978481 feat(aphoria): add inline claim markers and claim enrichment infrastructure
This commit implements Phase 17 of the Aphoria roadmap, adding:

**Inline Claim Markers (@aphoria:claim):**
- New extractor for detecting inline markers in comments
- Pending markers tracked in .aphoria/pending_markers.toml
- CLI commands: list-markers, formalize-marker, reject-marker
- Support for all major comment styles (Rust, Python, SQL, etc.)
- Auto-sync during scan (configurable)

**Claim Enrichment:**
- ClaimEnrichment type with source attribution (inline, extractor, manual)
- EnrichedClaimInfo with full enrichment metadata
- Extended AuthoredClaim with optional enrichment field
- API endpoints for enriched claim queries
- Dashboard UI components (enrichment badge, verdict badge)

**Enhanced Extractor Trait:**
- verifiable_predicates() method for declaring (tail_path, predicate) pairs
- 10 security extractors now implement verifiable_predicates
- Enables claim suggester skill to find unclaimed patterns

**Documentation:**
- Phase 17 summary with complete implementation details
- Gap fixes summary documenting 8 closed vision gaps
- Updated CLI reference with new commands
- New aphoria-docs skill for documentation maintenance
- Updated roadmap with Phase 17 completion

**Integration:**
- ClaimsFile support for claim enrichment persistence
- Pattern aggregate store support for enrichment queries
- Dashboard filters and display for enrichment metadata
- API handlers for list-markers and enrichment queries

**Tests:**
- New gap_fixes_integration test suite
- Corpus enricher module with best practices ingestion

Closes: VG-005, VG-017, VG-018, VG-019, VG-020, VG-021, VG-022, VG-023

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-08 20:18:20 +00:00

9.6 KiB

Phase 17: Pattern Enrichment & Best Practices Infrastructure

Status: Complete (Backend Only) Date: 2026-02-08

What Was Built

This phase implemented backend infrastructure for enriched corpus patterns and team guideline ingestion. The features are fully functional via CLI but not yet integrated with the dashboard UI.


1. Enriched Pattern Metadata

The Problem

Community patterns showed bare statistics like "md5: true, 347 projects" with no context about whether MD5 is deprecated, recommended, or neutral.

The Solution

Extractors now provide enrichment metadata:

pub struct PatternMetadata {
    pub tail_path: String,         // "crypto/hashing/algorithm"
    pub predicate: String,          // "algorithm"
    pub value: Option<String>,      // "md5" (or None for wildcard)
    pub category: String,           // "security"
    pub verdict: String,            // "deprecated"
    pub explanation: String,        // "MD5 is cryptographically broken..."
    pub authority_source: Option<String>,  // "NIST SP 800-131A"
}

What Works Now

  • 10 security extractors provide enrichment metadata
  • PatternEnricher service matches patterns to metadata (exact, wildcard, noise detection)
  • Data model supports category, verdict, explanation, authority_source

What's Missing

Dashboard doesn't display this metadata yet No category filter dropdown No "Hide noise" toggle No visual badges for deprecated/recommended


2. TeamPolicy Authority Tier

The Problem

No authority tier between community observations (tier 4) and expert opinions (tier 3) for team-level architectural guidelines.

The Solution

New tier 2.5: TeamPolicy

  • Sits between Observational (tier 2) and Expert (tier 3)
  • Authority weight: 0.6 (between 0.7 and 0.5)
  • Decay: 180 days (same as Expert)
  • Use case: Team architectural guidelines, internal standards

What Works Now

# Create team policy claim
aphoria claims create \
  --tier team_policy \
  --id hex-arch-http-001 \
  --concept-path myapp/adapters/http \
  --predicate layer \
  --value adapter \
  --invariant "HTTP handlers MUST be in adapters layer" \
  --consequence "Business logic leaks into infrastructure" \
  --provenance "Architecture team decision 2026-02-08" \
  --category architecture \
  --by architecture-team

3. Best Practices Import CLI

The Problem

Teams write extensive architectural guidelines in markdown/PDFs but have no way to automatically enforce them.

The Solution

Batch import claims from TOML files:

# Preview import
aphoria claims import docs/hexagonal-arch.toml --dry-run

# Import with tracking
aphoria claims import docs/hexagonal-arch.toml \
  --authority-tier team_policy \
  --source-guide "hexagonal-arch"

What Works Now

  • Batch import claims from TOML
  • Override authority tier for all claims
  • Merge strategies: skip_existing, overwrite, fail_on_duplicate
  • Dry-run preview
  • Guideline tracking in .aphoria/ingested_guides.toml

Example TOML

[[claim]]
id = "hex-arch-http-001"
concept_path = "myapp/adapters/http"
predicate = "layer"
value = "adapter"
comparison = "equals"
provenance = "Hexagonal Architecture Guidelines"
invariant = "HTTP handlers MUST be in adapters layer"
consequence = "Business logic leaks into infrastructure"
authority_tier = "team_policy"
category = "architecture"
evidence = ["docs/architecture/hexagonal.md"]
created_by = "architecture-team"
created_at = "2026-02-08T12:00:00Z"

4. Guideline Tracking

The Problem

No way to track which guidelines have been imported, detect changes, or filter compliance.

The Solution

.aphoria/ingested_guides.toml tracks imported guidelines:

[[guide]]
id = "hexagonal-arch"
name = "Hexagonal Architecture Guidelines"
source_path = "docs/hexagonal.md"
document_hash = "blake3:abc123..."
ingested_at = "2026-02-08T12:00:00Z"
claims_count = 26
authority_tier = "team_policy"
category = "architecture"
claim_ids = ["hex-arch-http-001", "hex-arch-domain-imports-001", ...]

What Works Now

  • Guideline metadata tracked with BLAKE3 hash
  • Change detection (compare hash to detect doc updates)
  • Audit trail (who imported what, when)

What's Missing

aphoria scan --check-policy <guide-id> not implemented No re-extraction workflow when source doc changes No compliance dashboard


5. Updated Comparison Modes

What Was Added

Two new comparison modes for list/substring matching:

Contains - Value must contain substring/element

comparison = "contains"
value = "Serialize"
# Passes: "Clone,Debug,Serialize"
# Fails: "Clone,Debug"

NotContains - Value must NOT contain substring/element

comparison = "not_contains"
value = "Clone"
# Passes: "Debug"
# Fails: "Clone,Debug"

10 Enriched Security Extractors

Extractor Enriched Patterns Authority Source
WeakCryptoExtractor MD5, SHA1 (deprecated), DES, RC4 NIST SP 800-131A, RFC 7465
TlsVersionExtractor TLS 1.0/1.1 (deprecated), 1.2/1.3 (recommended) RFC 8996, RFC 8446
TlsVerifyExtractor cert_verification: false (insecure) OWASP
JwtConfigExtractor algorithm: none (forbidden) RFC 7519
CorsConfigExtractor allow_origin: * (insecure) OWASP, W3C CORS Spec
HardcodedSecretsExtractor API keys/passwords (critical) OWASP A07:2021
SqlInjectionExtractor String interpolation (vulnerable) OWASP A03:2021
CommandInjectionExtractor Shell exec (vulnerable) OWASP A03:2021
PathTraversalExtractor User-controlled paths (vulnerable) OWASP A01:2021
InsecureDeserializationExtractor pickle/yaml.load (unsafe) OWASP A08:2021

Files Created/Modified

New Files

  • applications/aphoria/src/corpus/enricher.rs - Pattern enrichment service
  • applications/aphoria/src/types/ingested_guides.rs - Guideline tracking

Modified Files

Core Types:

  • crates/stemedb-core/src/types/source.rs - TeamPolicy tier
  • crates/stemedb-storage/src/pattern_aggregate_store/mod.rs - Enrichment fields

Aphoria:

  • applications/aphoria/src/extractors/traits.rs - pattern_metadata() method
  • applications/aphoria/src/types/authored_claim.rs - Contains/NotContains modes
  • applications/aphoria/src/cli/claims.rs - Import subcommand
  • applications/aphoria/src/handlers/claims.rs - Import handler
  • 10 extractor files with pattern_metadata() implementations

API & DTOs:

  • crates/stemedb-api/src/dto/enums.rs - TeamPolicy DTO
  • crates/stemedb-api/src/dto/aphoria/types.rs - Contains/NotContains DTOs
  • crates/stemedb-ontology/src/dto/enums.rs - TeamPolicy DTO

How to Use (CLI)

1. Create a guideline TOML file

cat > docs/architecture-guidelines.toml <<EOF
[[claim]]
id = "no-tokio-in-core"
concept_path = "myapp/core/imports/tokio"
predicate = "imported"
value = "true"
comparison = "absent"
provenance = "Architecture decision: core must be sync-only"
invariant = "Core modules MUST NOT import tokio"
consequence = "Creates async runtime coupling, breaks sync library users"
authority_tier = "team_policy"
category = "architecture"
evidence = ["ADR-003"]
created_by = "tech-lead"
created_at = "2026-02-08T12:00:00Z"
EOF

2. Import the guideline

aphoria claims import docs/architecture-guidelines.toml \
  --source-guide "architecture-2026" \
  --dry-run

3. Run verification

aphoria scan --persist
aphoria verify run

What's NOT Done (UI Integration)

The backend is complete but the dashboard doesn't display any of this:

Category badges (security/architecture/performance) Verdict badges (deprecated/recommended/emerging) Explanation tooltips ("MD5 is deprecated - NIST 2010") Filter by category dropdown "Hide noise" toggle Guideline compliance filtering (--check-policy flag) Compliance dashboard showing guideline status


Next Steps

To Make This User-Visible:

Option 1: Dashboard Integration (Frontend work)

  • Add category/verdict badges to pattern cards
  • Show explanations in tooltips
  • Add category filter dropdown
  • Implement "Hide noise" toggle
  • Build compliance dashboard

Option 2: Enhanced CLI Output (Backend work)

  • Show enrichment in aphoria scan table output
  • Add --show-enrichment flag
  • Color-code deprecated patterns (red), recommended (green)
  • Filter by category: aphoria scan --category security

Option 3: Policy Filtering (Backend work)

  • Implement aphoria scan --check-policy <guide-id>
  • Show only violations of specific guideline
  • Pre-commit hook support for policy enforcement

Testing

All code compiles and passes existing tests. To verify:

# Build workspace
cargo build --workspace

# Test aphoria
cargo test --package aphoria

# Try the import command
aphoria claims import --help

Documentation Updated

  • roadmap-archive.md - Added Phase 17
  • roadmap.md - Updated status table
  • cli-reference.md - Added aphoria claims import documentation
  • comparison-modes.md - Contains/NotContains already documented
  • This summary document

Questions?

Q: Why can't I see any changes in the UI? A: This phase implemented backend infrastructure only. The dashboard doesn't consume the enrichment metadata yet.

Q: How do I know it works? A: Use the CLI commands. The aphoria claims import command is fully functional.

Q: When will this show up in the dashboard? A: That requires frontend work to integrate the enrichment metadata into the UI components.

Q: Is this production-ready? A: The backend is production-ready. The CLI commands work. The UI integration is not done.