
# Gap Analysis: Institutional Knowledge Vision
**Date:** 2026-02-06

**Status:** Roadmap Planning

**Vision:** Self-learning institutional knowledge that compounds with every commit

---
## Executive Summary
Aphoria has **strong foundations** for pattern discovery and learning (Phases 7-9), but **critical gaps** exist for the institutional knowledge vision. The missing components center on **authority, governance, scoping, and lifecycle management**.
### Current State vs. Vision
| Capability | Current State | Vision State | Gap |
|------------|---------------|--------------|-----|
| Pattern discovery | ✅ Strong | ✅ Strong | None |
| Security scanning | ✅ 24 extractors | ✅ Comprehensive | None |
| Learning/promotion | ✅ Shadow mode works | ✅ Works | None |
| **Authority model** | ❌ Binary (human/auto) | Evidence-based (merit, not titles) | **CRITICAL** |
| **Scope hierarchy** | ❌ Flat team_id | Org → Team → Project | **CRITICAL** |
| **Knowledge lifecycle** | ❌ No deprecation | Active/Deprecated/Superseded | **CRITICAL** |
| **Governance** | ❌ Manual or 0.95 threshold | Evidence-aware approval | **HIGH** |
| **External integration** | ❌ None | ADR/Spec/Standard linking | **HIGH** |
---
## Phase 10: Evidence-Based Authority Model (CRITICAL)
**Problem:** All patterns are treated equally. A random commit carries the same weight as a pattern backed by RFC research and product specs.
**Principle:** Authority comes from **evidence**, not titles. We go by merit.
**Required Components:**
### 10.1 Evidence Levels
```rust
/// Variants are declared weakest to strongest so that the derived `Ord`
/// lets callers pick the strongest evidence with `max()`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EvidenceLevel {
    /// Just a commit, no supporting context
    Commit,
    /// Commit + research, ADR, or documentation
    Research,
    /// Pattern references RFC, OWASP, or external standard
    Standard,
    /// Pattern linked to product spec, task file, or explicit decision
    ProductSpec,
}

impl EvidenceLevel {
    pub fn authority_weight(&self) -> f32 {
        match self {
            EvidenceLevel::Commit => 0.40,
            EvidenceLevel::Research => 0.70,
            EvidenceLevel::Standard => 0.85,
            EvidenceLevel::ProductSpec => 0.95,
        }
    }
}
```
### 10.2 Evidence Detection
```rust
pub struct PatternEvidence {
    pub level: EvidenceLevel,
    pub sources: Vec<EvidenceSource>,
}

pub enum EvidenceSource {
    /// Just the commit itself
    Commit { hash: String, author: String },
    /// ADR or documentation in repo
    Adr { path: String, title: String },
    /// Research notes or investigation
    Research { path: String, summary: String },
    /// External standard reference
    Standard {
        standard_type: StandardType, // RFC, OWASP, NIST, Vendor
        reference: String,           // e.g., "RFC 7519 Section 4.1.3"
    },
    /// Product spec or task file
    ProductSpec {
        path: String,                   // e.g., "specs/auth-flow.md"
        requirement_id: Option<String>, // e.g., "REQ-AUTH-001"
    },
}

pub enum StandardType {
    Rfc,
    Owasp,
    Nist,
    Vendor,
    Internal, // Internal policy document
}
```
### 10.3 Evidence Detection Logic
```rust
impl PatternEvidence {
    pub fn detect(commit: &Commit, pattern: &Pattern) -> Self {
        let mut sources = vec![EvidenceSource::Commit {
            hash: commit.hash.clone(),
            author: commit.author.clone(),
        }];

        // Check commit message for RFC/standard references
        if let Some(std) = extract_standard_reference(&commit.message) {
            sources.push(EvidenceSource::Standard {
                standard_type: std.std_type,
                reference: std.reference,
            });
        }

        // Check for linked ADR in same commit
        if let Some(adr) = find_linked_adr(commit) {
            sources.push(EvidenceSource::Adr {
                path: adr.path,
                title: adr.title,
            });
        }

        // Check for product spec reference
        if let Some(spec) = find_linked_spec(commit) {
            sources.push(EvidenceSource::ProductSpec {
                path: spec.path,
                requirement_id: spec.requirement_id,
            });
        }

        // The strongest source determines the overall evidence level
        let level = sources.iter()
            .map(|s| s.evidence_level())
            .max()
            .unwrap_or(EvidenceLevel::Commit);

        Self { level, sources }
    }
}
```
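The `detect` function above calls `evidence_level()` on each source and takes the `max()`, but that mapping is not spelled out in this document. A minimal sketch of one possible mapping follows. It assumes ADRs and research notes both count as `Research` (following the `EvidenceLevel` doc comments), trims the payload fields for brevity, and introduces a hypothetical helper `strongest_level`:

```rust
/// Ordered weakest to strongest so the derived `Ord` makes `max()` meaningful.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EvidenceLevel {
    Commit,
    Research,
    Standard,
    ProductSpec,
}

/// Simplified stand-in for `EvidenceSource` (payloads trimmed for the sketch).
#[derive(Debug, Clone)]
pub enum EvidenceSource {
    Commit { hash: String },
    Adr { path: String },
    Research { path: String },
    Standard { reference: String },
    ProductSpec { path: String },
}

impl EvidenceSource {
    /// Assumed mapping from a source to the evidence level it supports.
    pub fn evidence_level(&self) -> EvidenceLevel {
        match self {
            EvidenceSource::Commit { .. } => EvidenceLevel::Commit,
            EvidenceSource::Adr { .. } | EvidenceSource::Research { .. } => {
                EvidenceLevel::Research
            }
            EvidenceSource::Standard { .. } => EvidenceLevel::Standard,
            EvidenceSource::ProductSpec { .. } => EvidenceLevel::ProductSpec,
        }
    }
}

/// Hypothetical helper: strongest level among the sources,
/// falling back to `Commit` when the list is empty.
pub fn strongest_level(sources: &[EvidenceSource]) -> EvidenceLevel {
    sources
        .iter()
        .map(EvidenceSource::evidence_level)
        .max()
        .unwrap_or(EvidenceLevel::Commit)
}
```

With this mapping, a commit that links an ADR and cites an RFC resolves to `Standard`, because the strongest source wins.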
### 10.4 Graduation by Evidence
| Evidence Level | Usages for Convention | Usages for Org Policy |
|----------------|----------------------|----------------------|
| ProductSpec | 1 | 3 (multi-team) |
| Standard | 3 | 5 (multi-team) |
| Research | 5 | 10 (multi-team) |
| Commit | 10 | 25 (multi-team) |
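**A single product spec reference can graduate a pattern to a convention immediately (one usage); promotion to org policy still requires three usages across multiple teams.**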
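The graduation table can be read as a simple lookup. The sketch below encodes it directly; the names `GraduationTarget`, `required_usages`, and `can_graduate` are hypothetical, and "multi-team" is assumed to mean usages spanning at least two teams:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EvidenceLevel {
    Commit,
    Research,
    Standard,
    ProductSpec,
}

/// Hypothetical name for the two graduation targets in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GraduationTarget {
    Convention,
    OrgPolicy,
}

/// Minimum usage count, taken row by row from the table.
pub fn required_usages(level: EvidenceLevel, target: GraduationTarget) -> u32 {
    use EvidenceLevel::*;
    use GraduationTarget::*;
    match (level, target) {
        (ProductSpec, Convention) => 1,
        (ProductSpec, OrgPolicy) => 3,
        (Standard, Convention) => 3,
        (Standard, OrgPolicy) => 5,
        (Research, Convention) => 5,
        (Research, OrgPolicy) => 10,
        (Commit, Convention) => 10,
        (Commit, OrgPolicy) => 25,
    }
}

/// Returns true when a pattern has enough usages (and teams) to graduate.
pub fn can_graduate(
    level: EvidenceLevel,
    target: GraduationTarget,
    usages: u32,
    teams: u32,
) -> bool {
    let enough_usages = usages >= required_usages(level, target);
    // Assumption: "multi-team" means at least two distinct teams.
    let enough_teams = match target {
        GraduationTarget::Convention => true,
        GraduationTarget::OrgPolicy => teams >= 2,
    };
    enough_usages && enough_teams
}
```

Under these assumptions, `can_graduate(EvidenceLevel::ProductSpec, GraduationTarget::Convention, 1, 1)` holds, while a commit-only pattern with nine usages does not yet qualify as a convention.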
### 10.5 Display with Evidence
```bash
$ aphoria patterns show "api-versioning"
Pattern: API Versioning (/api/v{major}/{resource})
Authority: 0.95 (ProductSpec)
Evidence:
• specs/api-design.md → REQ-API-001
• ADR-042: API Versioning Strategy
• RFC 7231: HTTP Semantics
Usages: 25 across 3 teams
Status: Org Convention
```
### 10.6 Files to Create/Modify
| File | Change |
|------|--------|
| `src/evidence/mod.rs` | New module |
| `src/evidence/levels.rs` | EvidenceLevel enum |
| `src/evidence/sources.rs` | EvidenceSource types |
| `src/evidence/detection.rs` | Auto-detection from commits |
| `src/learning/types.rs` | Add evidence to LearnedPattern |
| `src/promotion/pipeline.rs` | Evidence-aware graduation |
| `src/handlers/patterns.rs` | Include evidence in responses |
---
## Phase 11: Knowledge Scope Hierarchy (CRITICAL)
**Problem:** All knowledge exists at one flat level. No way to say "this applies org-wide" vs "this is just our team's preference."
**Required Components:**
### 11.1 Scope Levels
```rust
pub enum ScopeLevel {
    Organization, // Applies to all teams
    Team,         // Applies to team's projects
    Project,      // Single project only
}

pub struct ScopedKnowledge {
    pub scope_level: ScopeLevel,
    pub scope_id: String, // org_id, team_id, or project_id
    pub knowledge: Knowledge,
    pub inherited_from: Option<Box<ScopedKnowledge>>,
}
```
### 11.2 Scope Inheritance
```
Organization: "TLS 1.3 required" (BLOCK)
├── Team A: (inherits automatically)
└── Team B: "TLS 1.2 allowed for legacy" (OVERRIDE with justification)
    ├── Project B1: (inherits Team B override)
    └── Project B2: (inherits Team B override)
```
**Default behavior:**
- Security policies (TLS, auth, secrets): Auto-apply org → team → project
- Conventions (API patterns, error formats): Auto-apply, teams can override with justification
- Observations: Never inherited, team-specific only
**Resolution rules:**
1. Project-level override wins (if exists with justification)
2. Else team-level (if exists)
3. Else org-level
4. Else external authority (RFC/OWASP)
**Override requirements:**
- Must provide justification
- Must link to evidence (spec, ADR, or ticket)
- Auditable in SOC 2 reports
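The resolution rules above amount to a first-match search from the most specific scope outward. A minimal sketch follows; the `RuleCandidates`, `ProjectOverride`, and `ResolvedFrom` types are hypothetical shapes chosen for illustration, not the planned `src/scope/resolution.rs` API:

```rust
/// Which level a resolved rule came from.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ResolvedFrom {
    Project,
    Team,
    Organization,
    External,
}

/// A project-level override (hypothetical shape).
#[derive(Debug, Clone)]
pub struct ProjectOverride {
    pub rule: String,
    pub justification: Option<String>,
}

/// Candidate rules for one topic at each level (hypothetical shape).
#[derive(Debug, Clone, Default)]
pub struct RuleCandidates {
    pub project: Option<ProjectOverride>,
    pub team: Option<String>,
    pub organization: Option<String>,
    pub external: Option<String>, // e.g. an RFC or OWASP reference
}

/// Applies the four resolution rules in order and returns the winning rule.
pub fn resolve(c: &RuleCandidates) -> Option<(ResolvedFrom, &str)> {
    // 1. Project-level override wins, but only if it carries a justification.
    if let Some(p) = &c.project {
        if p.justification.is_some() {
            return Some((ResolvedFrom::Project, p.rule.as_str()));
        }
    }
    // 2. Else team-level.
    if let Some(rule) = &c.team {
        return Some((ResolvedFrom::Team, rule.as_str()));
    }
    // 3. Else org-level.
    if let Some(rule) = &c.organization {
        return Some((ResolvedFrom::Organization, rule.as_str()));
    }
    // 4. Else external authority.
    c.external
        .as_deref()
        .map(|rule| (ResolvedFrom::External, rule))
}
```

In this sketch an unjustified project override is ignored rather than rejected outright; a stricter implementation might surface it as an error instead.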
### 11.3 Scope Configuration
```toml
# .aphoria.toml
[scope]
organization = "acme"
team = "platform"
project = "payment-service"
[scope.inheritance]
inherit_org = true
inherit_team = true
allow_project_overrides = true # false = strict mode
```
### 11.4 Cross-Scope Queries
```bash
# What patterns does the org enforce?
aphoria patterns --scope org
# What has my team added on top?
aphoria patterns --scope team --exclude-inherited
# What's unique to this project?
aphoria patterns --scope project --only-local
```
### 11.5 Files to Create/Modify
| File | Change |
|------|--------|
| `src/scope/mod.rs` | New module |
| `src/scope/levels.rs` | ScopeLevel enum and hierarchy |
| `src/scope/resolution.rs` | Inheritance resolution |
| `src/config/types/scope.rs` | ScopeConfig |
| `src/episteme/local/queries.rs` | Scope-aware queries |
| `src/handlers/patterns.rs` | --scope flag handling |
---
## Phase 12: Knowledge Lifecycle Management (CRITICAL)
**Problem:** Knowledge exists forever. No way to deprecate patterns or track evolution.
**Required Components:**
### 12.1 Knowledge Status
```rust
pub enum KnowledgeStatus {
    Active,
    Deprecated {
        deprecated_at: u64,
        reason: String,
        superseded_by: Option<String>,
    },
    Archived {
        archived_at: u64,
        reason: String,
    },
}

pub struct KnowledgeLifecycle {
    pub knowledge_id: String,
    pub current_status: KnowledgeStatus,
    pub status_history: Vec<StatusChange>,
    pub expires_at: Option<u64>,
}
```
### 12.2 Deprecation Flow
```bash
# Mark pattern as deprecated
aphoria deprecate "use-requests-lib" \
--reason "Migrating to httpx for async" \
--superseded-by "use-httpx-lib" \
--sunset-date 2026-06-01
# What happens on scan:
# Old pattern usage → FLAG with migration guidance
# New pattern usage → No flag
```
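The scan behavior described in the comments above could be driven directly by the pattern's status. The sketch below uses a trimmed copy of `KnowledgeStatus` (timestamps omitted) and a hypothetical `ScanOutcome` type; treating archived patterns as unflagged is an assumption, since the document does not say how they are handled:

```rust
/// Trimmed copy of the status enum (timestamps omitted for the sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum KnowledgeStatus {
    Active,
    Deprecated {
        reason: String,
        superseded_by: Option<String>,
    },
    Archived {
        reason: String,
    },
}

/// Hypothetical result of scanning one usage of a pattern.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ScanOutcome {
    NoFlag,
    Flag { guidance: String },
}

/// Decides whether a usage of `pattern_id` should be flagged.
pub fn scan_outcome(pattern_id: &str, status: &KnowledgeStatus) -> ScanOutcome {
    match status {
        KnowledgeStatus::Active => ScanOutcome::NoFlag,
        KnowledgeStatus::Deprecated { reason, superseded_by } => {
            // Include migration guidance when a successor is known.
            let guidance = match superseded_by {
                Some(new_id) => format!(
                    "'{}' is deprecated ({}); migrate to '{}'",
                    pattern_id, reason, new_id
                ),
                None => format!("'{}' is deprecated ({})", pattern_id, reason),
            };
            ScanOutcome::Flag { guidance }
        }
        // Assumption: archived patterns are no longer enforced.
        KnowledgeStatus::Archived { .. } => ScanOutcome::NoFlag,
    }
}
```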
### 12.3 Knowledge Versioning
```rust
pub struct KnowledgeVersion {
    pub version: u32,
    pub content_hash: String, // BLAKE3 of knowledge content
    pub created_at: u64,
    pub created_by: String,
    pub changelog: String,
}
```
**Use cases:**
- "When did this pattern change?"
- "What was the previous version?"
- "Who made the last update?"
### 12.4 Expiry and Refresh
```rust
pub struct KnowledgeExpiry {
    pub expires_at: u64,
    pub reminder_at: u64,            // 30 days before expiry
    pub auto_archive: bool,          // Archive on expiry?
    pub requires_revalidation: bool, // Must be re-approved?
}
```
**Use cases:**
- Temporary exceptions expire automatically
- Annual security review required for certain patterns
- Stale knowledge gets flagged for review
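One way the expiry fields could drive these use cases is a single function that maps the current time to an action. The sketch assumes timestamps are Unix seconds (the document does not state the unit), and the `ExpiryAction` enum, the constructor, and the precedence of `auto_archive` over `requires_revalidation` are all illustrative choices:

```rust
const SECONDS_PER_DAY: u64 = 24 * 60 * 60;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KnowledgeExpiry {
    pub expires_at: u64,
    pub reminder_at: u64,
    pub auto_archive: bool,
    pub requires_revalidation: bool,
}

/// Hypothetical set of actions a periodic job could take.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExpiryAction {
    Nothing,
    Remind,
    Archive,
    Revalidate,
    FlagForReview,
}

impl KnowledgeExpiry {
    /// Builds an expiry whose reminder fires 30 days before `expires_at`.
    pub fn new(expires_at: u64, auto_archive: bool, requires_revalidation: bool) -> Self {
        Self {
            expires_at,
            // saturating_sub avoids underflow for very early timestamps.
            reminder_at: expires_at.saturating_sub(30 * SECONDS_PER_DAY),
            auto_archive,
            requires_revalidation,
        }
    }

    /// Action to take at time `now` (Unix seconds, by assumption).
    pub fn action_at(&self, now: u64) -> ExpiryAction {
        if now >= self.expires_at {
            if self.auto_archive {
                ExpiryAction::Archive
            } else if self.requires_revalidation {
                ExpiryAction::Revalidate
            } else {
                ExpiryAction::FlagForReview
            }
        } else if now >= self.reminder_at {
            ExpiryAction::Remind
        } else {
            ExpiryAction::Nothing
        }
    }
}
```

A temporary exception would set `auto_archive = true`, while a pattern under annual security review would set `requires_revalidation = true` so that expiry triggers re-approval rather than removal.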
### 12.5 Files to Create/Modify
| File | Change |
|------|--------|
| `src/lifecycle/mod.rs` | New module |
| `src/lifecycle/status.rs` | Status enum and transitions |
| `src/lifecycle/versioning.rs` | Version tracking |
| `src/lifecycle/expiry.rs` | Expiry and refresh logic |
| `src/handlers/deprecate.rs` | Deprecation command handler |
| `src/cli.rs` | Add `deprecate` command |
---
## Phase 13: Governance Workflows (HIGH)
**Problem:** Governance is binary: manual review or >0.95 auto-promote. No approval workflows, SLAs, or role-based gates.
**Required Components:**
### 13.1 Approval Workflow Definition
```rust
use std::time::Duration;

pub struct ApprovalWorkflow {
    pub name: String,
    pub applies_to: WorkflowScope,
    pub stages: Vec<ApprovalStage>,
    pub timeout: Duration,
    pub escalation: EscalationPolicy,
}

pub struct ApprovalStage {
    pub name: String,
    pub required_approvers: u32,
    pub approver_roles: Vec<ContributorRole>,
    pub auto_approve_threshold: Option<f32>,
}
```
**Example:**
```toml
[[governance.workflows]]
name = "pattern-promotion"
applies_to = "conventions"
[[governance.workflows.stages]]
name = "team-review"
required_approvers = 1
approver_roles = ["SeniorEngineer", "Architect"]
auto_approve_threshold = 0.95
[[governance.workflows.stages]]
name = "security-review"
required_approvers = 1
approver_roles = ["SecurityReviewer"]
# No auto-approve for security-sensitive patterns
```
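To make the stage semantics concrete, the sketch below evaluates whether a single stage passes. It is illustrative only: roles are plain strings rather than `ContributorRole`, the `Approval` type and `stage_passes` function are hypothetical, a confidence at or above the threshold is assumed to auto-approve, and approvals are counted per distinct approver:

```rust
#[derive(Debug, Clone)]
pub struct ApprovalStage {
    pub name: String,
    pub required_approvers: u32,
    pub approver_roles: Vec<String>, // role names as strings for brevity
    pub auto_approve_threshold: Option<f32>,
}

/// Hypothetical record of one approval.
#[derive(Debug, Clone)]
pub struct Approval {
    pub approver: String,
    pub role: String,
}

/// Returns true when the stage is satisfied, either by auto-approval
/// or by enough approvals from distinct approvers in an allowed role.
pub fn stage_passes(stage: &ApprovalStage, confidence: f32, approvals: &[Approval]) -> bool {
    // Auto-approve only when the stage defines a threshold.
    if let Some(threshold) = stage.auto_approve_threshold {
        if confidence >= threshold {
            return true;
        }
    }
    // Count distinct approvers whose role is allowed for this stage.
    let mut seen: Vec<&str> = Vec::new();
    for a in approvals {
        let role_ok = stage.approver_roles.iter().any(|r| r == &a.role);
        if role_ok && !seen.contains(&a.approver.as_str()) {
            seen.push(a.approver.as_str());
        }
    }
    seen.len() >= stage.required_approvers as usize
}
```

With the example configuration, a pattern at confidence 0.97 would clear `team-review` automatically but would still need a `SecurityReviewer` approval, because `security-review` has no threshold.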
### 13.2 Approval State Machine
```
PENDING → STAGE_1_PENDING → STAGE_1_APPROVED → STAGE_2_PENDING → APPROVED
↓ ↓ ↓
REJECTED REJECTED REJECTED
↓ ↓ ↓
ARCHIVED ARCHIVED ARCHIVED
```
### 13.3 Approval CLI
```bash
# List pending approvals
aphoria governance pending
# Approve a pattern promotion
aphoria governance approve <pattern-id> --comment "LGTM"
# Reject with reason
aphoria governance reject <pattern-id> --reason "Too specific to one project"
# Escalate to next stage
aphoria governance escalate <pattern-id>
```
### 13.4 Files to Create/Modify
| File | Change |
|------|--------|
| `src/governance/mod.rs` | New module |
| `src/governance/workflow.rs` | Workflow definitions |
| `src/governance/approval.rs` | Approval state machine |
| `src/governance/store.rs` | Persistence |
| `src/handlers/governance.rs` | CLI handlers |
| `src/cli.rs` | Add `governance` subcommand |
---
## Phase 14: Evidence Source Integration (HIGH)
**Problem:** Evidence sources (ADRs, specs, standards) aren't automatically linked. Developers must manually reference them.
**Required Components:**
### 14.1 ADR Auto-Detection
```rust
pub struct AdrDetector {
    pub patterns: Vec<String>, // e.g., ["docs/adr/*.md", "adr/*.md", "decisions/*.md"]
}

impl AdrDetector {
    pub fn find_linked_adr(&self, commit: &Commit) -> Option<AdrReference> {
        // Check commit message for ADR-XXX pattern
        // Check modified files for ADR documents
        // Parse ADR content for related patterns
        todo!("detection steps above are not implemented yet")
    }
}
```
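The first detection step, scanning the commit message for `ADR-XXX`, needs only string matching. A minimal sketch with a hypothetical helper `find_adr_ids`, assuming identifiers are the literal prefix `ADR-` followed by digits and that matching is case-sensitive:

```rust
/// Returns the distinct ADR identifiers (e.g. "ADR-042") found in a
/// commit message, in order of first appearance.
pub fn find_adr_ids(message: &str) -> Vec<String> {
    let mut ids: Vec<String> = Vec::new();
    let mut rest = message;
    while let Some(pos) = rest.find("ADR-") {
        // "ADR-" is 4 ASCII bytes, so this slice is on a char boundary.
        let after = &rest[pos + 4..];
        let digits: String = after.chars().take_while(|c| c.is_ascii_digit()).collect();
        if !digits.is_empty() {
            let id = format!("ADR-{}", digits);
            if !ids.contains(&id) {
                ids.push(id);
            }
        }
        // Continue searching after this occurrence.
        rest = after;
    }
    ids
}
```

The returned identifiers could then be matched against files found via the detector's glob `patterns`, which this sketch does not attempt.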
### 14.2 Spec File Detection
```rust
pub struct SpecDetector {
    pub patterns: Vec<String>, // e.g., ["specs/*.md", "*.spec.md", "requirements/*.md"]
}

impl SpecDetector {
    pub fn find_linked_spec(&self, commit: &Commit) -> Option<SpecReference> {
        // Check for spec file modifications in same commit
        // Parse spec for requirement IDs (REQ-XXX)
        // Link pattern to requirement
        todo!("detection steps above are not implemented yet")
    }
}
```
### 14.3 Standard Reference Extraction
```rust
pub fn extract_standard_reference(text: &str) -> Option<StandardReference> {
    // Match patterns like:
    // - "RFC 7519" → Standard(Rfc, "7519")
    // - "OWASP A03:2021" → Standard(Owasp, "A03:2021")
    // - "NIST SP 800-53" → Standard(Nist, "SP 800-53")
    todo!("matching rules above are not implemented yet")
}
```
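The three example patterns can be matched with a small token scan and no regex dependency. The sketch below is one possible shape, not the planned implementation: it returns only the first match, uses a trimmed `StandardType`, and its `StandardReference` fields are inferred from how `detect` uses the result (`std_type`, `reference`):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StandardType {
    Rfc,
    Owasp,
    Nist,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StandardReference {
    pub std_type: StandardType,
    pub reference: String,
}

/// Strips surrounding punctuation such as "(", ")", "," or "." from a token.
fn clean(token: &str) -> &str {
    token.trim_matches(|c: char| !c.is_ascii_alphanumeric())
}

/// Returns the first RFC, OWASP, or NIST SP reference found in `text`.
pub fn extract_standard_reference(text: &str) -> Option<StandardReference> {
    let tokens: Vec<&str> = text.split_whitespace().map(clean).collect();
    for (i, token) in tokens.iter().enumerate() {
        let next = tokens.get(i + 1).copied().unwrap_or("");
        match *token {
            // "RFC 7519" → (Rfc, "7519")
            "RFC" if !next.is_empty() && next.chars().all(|c| c.is_ascii_digit()) => {
                return Some(StandardReference {
                    std_type: StandardType::Rfc,
                    reference: next.to_string(),
                });
            }
            // "OWASP A03:2021" → (Owasp, "A03:2021")
            "OWASP" if next.starts_with('A') && next.contains(':') => {
                return Some(StandardReference {
                    std_type: StandardType::Owasp,
                    reference: next.to_string(),
                });
            }
            // "NIST SP 800-53" → (Nist, "SP 800-53")
            "NIST" if next == "SP" => {
                if let Some(number) = tokens.get(i + 2) {
                    if !number.is_empty() {
                        return Some(StandardReference {
                            std_type: StandardType::Nist,
                            reference: format!("SP {}", number),
                        });
                    }
                }
            }
            _ => {}
        }
    }
    None
}
```

Because the function returns an `Option`, a commit message that cites several standards yields only the first; collecting all of them would need a `Vec` return type instead.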
### 14.4 Evidence Display
```bash
$ aphoria patterns show "api-versioning"
Pattern: API Versioning (/api/v{major}/{resource})
Status: Active (Convention)
Scope: Team (Platform)
Authority: 0.95 (ProductSpec)
Evidence Chain:
1. ProductSpec: specs/api-design.md → REQ-API-001
2. Research: docs/adr/ADR-042-api-versioning.md
3. Standard: RFC 7231 Section 5.1
First seen: 2024-03-15 in payment-service
Usages: 25 across 3 teams
Adoption: 89% of team projects (8/9)
```
### 14.5 Files to Create/Modify
| File | Change |
|------|--------|
| `src/attribution/mod.rs` | New module |
| `src/attribution/git.rs` | Git extraction |
| `src/attribution/directory.rs` | LDAP/SSO integration |
| `src/attribution/display.rs` | Human-readable output |
| `src/handlers/patterns.rs` | Include attribution |
---
## Phase 15: External Knowledge Integration (MEDIUM)
**Problem:** Internal documentation (ADRs, Confluence) is not integrated, so teams maintain knowledge in two places.
**Required Components:**
### 15.1 ADR Importer
```bash
# Import ADRs as authoritative sources
aphoria import adr ./docs/adr/
# ADR-042: API Versioning
# Status: Accepted
# → Imports as Tier 1 policy for the project
```
### 15.2 Confluence Connector (Future)
```toml
[integration.confluence]
url = "https://acme.atlassian.net/wiki"
space = "PLATFORM"
label = "aphoria-policy" # Only import pages with this label
```
### 15.3 Jira Integration (Future)
```toml
[integration.jira]
url = "https://acme.atlassian.net"
project = "PLAT"
conflict_link = true # Link conflicts to Jira issues
```
### 15.4 Files to Create/Modify
| File | Change |
|------|--------|
| `src/integration/mod.rs` | New module |
| `src/integration/adr.rs` | ADR parser |
| `src/integration/confluence.rs` | Confluence API client |
| `src/integration/jira.rs` | Jira API client |
| `src/handlers/import.rs` | Import command handlers |
---
## Implementation Priority
### Must Have for Enterprise Pilot (Phases 10-12)
| Phase | Feature | Effort | Why Critical |
|-------|---------|--------|--------------|
| **10** | Evidence-based authority | 2 weeks | "Why trust this pattern?" - merit, not titles |
| **11** | Scope hierarchy | 2 weeks | "Does this apply to my team?" |
| **12** | Knowledge lifecycle | 1 week | "Is this still current?" |
### Should Have for Production (Phases 13-15)
| Phase | Feature | Effort | Why Important |
|-------|---------|--------|---------------|
| **13** | Governance workflows | 2 weeks | SOC 2 compliance |
| **14** | Evidence source integration | 1 week | Auto-detect ADRs, specs, standards |
| **15** | External integration | 2 weeks | Confluence, Jira linking |
---
## Success Metrics
### Enterprise Pilot (90 days)
| Metric | Target | Measurement |
|--------|--------|-------------|
| Patterns captured | 100+ observations | Count in knowledge graph |
| Patterns promoted | 10+ conventions | Count with status=Active |
| Cross-team adoption | 2+ teams connected | Unique team_ids |
| New hire guidance | 5+ accepted suggestions | Accept rate tracking |
| False positive rate | <10% | FP feedback / total flags |
### Production Ready (180 days)
| Metric | Target | Measurement |
|--------|--------|-------------|
| Knowledge retention | 0 lost patterns on departures | Audit log |
| Onboarding velocity | 50% faster ramp | Time to first PR |
| Convention adoption | 80% across org | Compliance rate |
| SOC 2 evidence | Audit pass | External validation |
---
## Next Steps
1. **Finalize Phase 10-12 specs** - Detailed API design
2. **Create enterprise simulation UAT** - Multi-month scenario
3. **Build Phase 10** - Authority model is foundational
4. **Pilot with internal team** - Eat our own dog food
5. **External pilot** - First enterprise customer