
Gap Analysis: Institutional Knowledge Vision

Date: 2026-02-06
Status: Roadmap Planning
Vision: Self-learning institutional knowledge that compounds with every commit


Executive Summary

Aphoria has strong foundations for pattern discovery and learning (Phases 7-9), but critical gaps exist for the institutional knowledge vision. The missing components center on authority, governance, scoping, and lifecycle management.

Current State vs. Vision

| Capability | Current State | Vision State | Gap |
|---|---|---|---|
| Pattern discovery | Strong | Strong | None |
| Security scanning | 24 extractors | Comprehensive | None |
| Learning/promotion | Shadow mode works | Works | None |
| Authority model | Binary (human/auto) | Evidence-based (merit, not titles) | CRITICAL |
| Scope hierarchy | Flat team_id | Org → Team → Project | CRITICAL |
| Knowledge lifecycle | No deprecation | Active/Deprecated/Superseded | CRITICAL |
| Governance | Manual or 0.95 threshold | Evidence-aware approval | HIGH |
| External integration | None | ADR/Spec/Standard linking | HIGH |

Phase 10: Evidence-Based Authority Model (CRITICAL)

Problem: All patterns are treated equally. A random commit carries the same weight as a pattern backed by RFC research and product specs.

Principle: Authority comes from evidence, not titles. We go by merit.

Required Components:

10.1 Evidence Levels

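// Variants are declared weakest to strongest so the derived `Ord` ranks them;
// the `.max()` call in 10.3 needs `Ord` to compile.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EvidenceLevel {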
    /// Just a commit, no supporting context
    Commit,
    /// Commit + research, ADR, or documentation
    Research,
    /// Pattern references RFC, OWASP, or external standard
    Standard,
    /// Pattern linked to product spec, task file, or explicit decision
    ProductSpec,
}

impl EvidenceLevel {
    pub fn authority_weight(&self) -> f32 {
        match self {
            EvidenceLevel::Commit => 0.40,
            EvidenceLevel::Research => 0.70,
            EvidenceLevel::Standard => 0.85,
            EvidenceLevel::ProductSpec => 0.95,
        }
    }
}

10.2 Evidence Detection

pub struct PatternEvidence {
    pub level: EvidenceLevel,
    pub sources: Vec<EvidenceSource>,
}

pub enum EvidenceSource {
    /// Just the commit itself
    Commit { hash: String, author: String },

    /// ADR or documentation in repo
    Adr { path: String, title: String },

    /// Research notes or investigation
    Research { path: String, summary: String },

    /// External standard reference
    Standard {
        standard_type: StandardType,  // RFC, OWASP, NIST, Vendor
        reference: String,            // e.g., "RFC 7519 Section 4.1.3"
    },

    /// Product spec or task file
    ProductSpec {
        path: String,                 // e.g., "specs/auth-flow.md"
        requirement_id: Option<String>, // e.g., "REQ-AUTH-001"
    },
}

pub enum StandardType {
    Rfc,
    Owasp,
    Nist,
    Vendor,
    Internal,  // Internal policy document
}

10.3 Evidence Detection Logic

impl PatternEvidence {
    pub fn detect(commit: &Commit, pattern: &Pattern) -> Self {
        let mut sources = vec![EvidenceSource::Commit {
            hash: commit.hash.clone(),
            author: commit.author.clone(),
        }];

        // Check commit message for RFC/standard references
        if let Some(std) = extract_standard_reference(&commit.message) {
            sources.push(EvidenceSource::Standard {
                standard_type: std.std_type,
                reference: std.reference,
            });
        }

        // Check for linked ADR in same commit
        if let Some(adr) = find_linked_adr(commit) {
            sources.push(EvidenceSource::Adr {
                path: adr.path,
                title: adr.title,
            });
        }

        // Check for product spec reference
        if let Some(spec) = find_linked_spec(commit) {
            sources.push(EvidenceSource::ProductSpec {
                path: spec.path,
                requirement_id: spec.requirement_id,
            });
        }

        let level = sources.iter()
            .map(|s| s.evidence_level())
            .max()
            .unwrap_or(EvidenceLevel::Commit);

        Self { level, sources }
    }
}
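
The detection logic above calls `s.evidence_level()`, which is not defined anywhere in this document. A minimal sketch of what it might look like follows. The mapping of ADR and research sources to `Research` is an assumption read off the doc comments in 10.1, the `EvidenceSource` enum is trimmed to one field per variant, and `strongest_level` is a hypothetical helper name.

```rust
// Sketch only: the source-to-level mapping is an assumption based on 10.1.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum EvidenceLevel {
    Commit,
    Research,
    Standard,
    ProductSpec,
}

// Trimmed-down stand-in for the EvidenceSource enum in 10.2.
pub enum EvidenceSource {
    Commit { hash: String },
    Adr { path: String },
    Research { path: String },
    Standard { reference: String },
    ProductSpec { path: String },
}

impl EvidenceSource {
    /// Map a source to the evidence level it supports (assumed mapping).
    pub fn evidence_level(&self) -> EvidenceLevel {
        match self {
            EvidenceSource::Commit { .. } => EvidenceLevel::Commit,
            // ADRs and research notes both count as "Research" per 10.1.
            EvidenceSource::Adr { .. } | EvidenceSource::Research { .. } => {
                EvidenceLevel::Research
            }
            EvidenceSource::Standard { .. } => EvidenceLevel::Standard,
            EvidenceSource::ProductSpec { .. } => EvidenceLevel::ProductSpec,
        }
    }
}

/// Strongest level among the sources; falls back to Commit when empty.
pub fn strongest_level(sources: &[EvidenceSource]) -> EvidenceLevel {
    sources
        .iter()
        .map(|s| s.evidence_level())
        .max()
        .unwrap_or(EvidenceLevel::Commit)
}
```

Because the ranking comes from the derived `Ord`, the variant declaration order is the ranking. Reordering the enum would silently change which level wins.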

10.4 Graduation by Evidence

| Evidence Level | Usages for Convention | Usages for Org Policy |
|---|---|---|
| ProductSpec | 1 | 3 (multi-team) |
| Standard | 3 | 5 (multi-team) |
| Research | 5 | 10 (multi-team) |
| Commit | 10 | 25 (multi-team) |

A single product spec reference can graduate a pattern to a convention immediately, on its first usage.
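
The graduation thresholds could be encoded roughly as follows. This is a sketch, not the promotion pipeline itself: the `Graduation` enum and both function names are hypothetical, and "multi-team" is assumed to mean usages spanning at least two teams.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EvidenceLevel {
    Commit,
    Research,
    Standard,
    ProductSpec,
}

// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Graduation {
    Observation,
    Convention,
    OrgPolicy,
}

/// (usages for convention, usages for org policy), copied from the table in 10.4.
fn thresholds(level: EvidenceLevel) -> (u32, u32) {
    match level {
        EvidenceLevel::ProductSpec => (1, 3),
        EvidenceLevel::Standard => (3, 5),
        EvidenceLevel::Research => (5, 10),
        EvidenceLevel::Commit => (10, 25),
    }
}

/// Assumption: "multi-team" means the usages span at least two teams.
pub fn graduation(level: EvidenceLevel, usages: u32, teams: u32) -> Graduation {
    let (convention, org_policy) = thresholds(level);
    if usages >= org_policy && teams >= 2 {
        Graduation::OrgPolicy
    } else if usages >= convention {
        Graduation::Convention
    } else {
        Graduation::Observation
    }
}
```

With these numbers, a commit-only pattern needs ten usages to become a convention, while a spec-backed pattern needs one.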

10.5 Display with Evidence

$ aphoria patterns show "api-versioning"

Pattern: API Versioning (/api/v{major}/{resource})
Authority: 0.95 (ProductSpec)
Evidence:
  • specs/api-design.md → REQ-API-001
  • ADR-042: API Versioning Strategy
  • RFC 7231: HTTP Semantics

Usages: 25 across 3 teams
Status: Org Convention

10.6 Files to Create/Modify

| File | Change |
|---|---|
| src/evidence/mod.rs | New module |
| src/evidence/levels.rs | EvidenceLevel enum |
| src/evidence/sources.rs | EvidenceSource types |
| src/evidence/detection.rs | Auto-detection from commits |
| src/learning/types.rs | Add evidence to LearnedPattern |
| src/promotion/pipeline.rs | Evidence-aware graduation |
| src/handlers/patterns.rs | Include evidence in responses |

Phase 11: Knowledge Scope Hierarchy (CRITICAL)

Problem: All knowledge exists at one flat level. No way to say "this applies org-wide" vs "this is just our team's preference."

Required Components:

11.1 Scope Levels

pub enum ScopeLevel {
    Organization,  // Applies to all teams
    Team,          // Applies to team's projects
    Project,       // Single project only
}

pub struct ScopedKnowledge {
    pub scope_level: ScopeLevel,
    pub scope_id: String,  // org_id, team_id, or project_id
    pub knowledge: Knowledge,
    pub inherited_from: Option<Box<ScopedKnowledge>>,
}

11.2 Scope Inheritance

Organization: "TLS 1.3 required" (BLOCK)
    └── Team A: (inherits automatically)
    └── Team B: "TLS 1.2 allowed for legacy" (OVERRIDE with justification)
        └── Project B1: (inherits Team B override)
        └── Project B2: (inherits Team B override)

Default behavior:

  • Security policies (TLS, auth, secrets): Auto-apply org → team → project
  • Conventions (API patterns, error formats): Auto-apply, teams can override with justification
  • Observations: Never inherited, team-specific only

Resolution rules:

  1. Project-level override wins (if exists with justification)
  2. Else team-level (if exists)
  3. Else org-level
  4. Else external authority (RFC/OWASP)

Override requirements:

  • Must provide justification
  • Must link to evidence (spec, ADR, or ticket)
  • Auditable in SOC 2 reports
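
The resolution order above can be sketched as a small lookup. `ScopedRule`, `ResolvedFrom`, and `resolve` are hypothetical names for illustration. The sketch covers only the four resolution rules and the justification requirement on project overrides. It does not model evidence links or the different inheritance defaults for security policies and observations.

```rust
/// A rule defined at one scope. Overrides are expected to carry a justification.
#[derive(Debug, Clone, PartialEq)]
pub struct ScopedRule {
    pub rule: String,
    pub justification: Option<String>,
}

/// Which scope supplied the winning rule.
#[derive(Debug, PartialEq)]
pub enum ResolvedFrom {
    Project,
    Team,
    Organization,
    External,
}

pub fn resolve<'a>(
    project: Option<&'a ScopedRule>,
    team: Option<&'a ScopedRule>,
    org: Option<&'a ScopedRule>,
    external: Option<&'a ScopedRule>,
) -> Option<(ResolvedFrom, &'a ScopedRule)> {
    // Rule 1: a project override only wins if it carries a justification.
    if let Some(p) = project.filter(|p| p.justification.is_some()) {
        return Some((ResolvedFrom::Project, p));
    }
    // Rules 2-4: fall through team, then org, then external authority.
    team.map(|t| (ResolvedFrom::Team, t))
        .or_else(|| org.map(|o| (ResolvedFrom::Organization, o)))
        .or_else(|| external.map(|e| (ResolvedFrom::External, e)))
}
```

Returning the winning scope alongside the rule keeps the decision auditable: a report can state which level a policy came from, not just what it says.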

11.3 Scope Configuration

# .aphoria.toml
[scope]
organization = "acme"
team = "platform"
project = "payment-service"

[scope.inheritance]
inherit_org = true
inherit_team = true
allow_project_overrides = true  # false = strict mode

11.4 Cross-Scope Queries

# What patterns does the org enforce?
aphoria patterns --scope org

# What has my team added on top?
aphoria patterns --scope team --exclude-inherited

# What's unique to this project?
aphoria patterns --scope project --only-local

11.5 Files to Create/Modify

| File | Change |
|---|---|
| src/scope/mod.rs | New module |
| src/scope/levels.rs | ScopeLevel enum and hierarchy |
| src/scope/resolution.rs | Inheritance resolution |
| src/config/types/scope.rs | ScopeConfig |
| src/episteme/local/queries.rs | Scope-aware queries |
| src/handlers/patterns.rs | --scope flag handling |

Phase 12: Knowledge Lifecycle Management (CRITICAL)

Problem: Knowledge exists forever. No way to deprecate patterns or track evolution.

Required Components:

12.1 Knowledge Status

pub enum KnowledgeStatus {
    Active,
    Deprecated {
        deprecated_at: u64,
        reason: String,
        superseded_by: Option<String>,
    },
    Archived {
        archived_at: u64,
        reason: String,
    },
}

pub struct KnowledgeLifecycle {
    pub knowledge_id: String,
    pub current_status: KnowledgeStatus,
    pub status_history: Vec<StatusChange>,
    pub expires_at: Option<u64>,
}

12.2 Deprecation Flow

# Mark pattern as deprecated
aphoria deprecate "use-requests-lib" \
    --reason "Migrating to httpx for async" \
    --superseded-by "use-httpx-lib" \
    --sunset-date 2026-06-01

# What happens on scan:
# Old pattern usage → FLAG with migration guidance
# New pattern usage → No flag

12.3 Knowledge Versioning

pub struct KnowledgeVersion {
    pub version: u32,
    pub content_hash: String,  // BLAKE3 of knowledge content
    pub created_at: u64,
    pub created_by: String,
    pub changelog: String,
}

Use cases:

  • "When did this pattern change?"
  • "What was the previous version?"
  • "Who made the last update?"

12.4 Expiry and Refresh

pub struct KnowledgeExpiry {
    pub expires_at: u64,
    pub reminder_at: u64,       // 30 days before expiry
    pub auto_archive: bool,     // Archive on expiry?
    pub requires_revalidation: bool,  // Must be re-approved?
}

Use cases:

  • Temporary exceptions expire automatically
  • Annual security review required for certain patterns
  • Stale knowledge gets flagged for review
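
A sketch of how the expiry fields could drive these use cases follows. It assumes timestamps are Unix seconds, which the document does not state. `ExpiryAction`, `new`, and `action_at` are hypothetical names, and the precedence of revalidation over auto-archive is an assumption.

```rust
pub struct KnowledgeExpiry {
    pub expires_at: u64,
    pub reminder_at: u64,
    pub auto_archive: bool,
    pub requires_revalidation: bool,
}

// Hypothetical outcome type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ExpiryAction {
    Nothing,
    Remind,
    Archive,
    Revalidate,
    FlagForReview,
}

// 30 days, assuming timestamps are Unix seconds.
const REMINDER_LEAD_SECS: u64 = 30 * 24 * 60 * 60;

impl KnowledgeExpiry {
    /// Build an expiry whose reminder fires 30 days before `expires_at`.
    pub fn new(expires_at: u64, auto_archive: bool, requires_revalidation: bool) -> Self {
        Self {
            expires_at,
            reminder_at: expires_at.saturating_sub(REMINDER_LEAD_SECS),
            auto_archive,
            requires_revalidation,
        }
    }

    /// Decide what should happen to the knowledge item at time `now`.
    pub fn action_at(&self, now: u64) -> ExpiryAction {
        if now >= self.expires_at {
            // Assumption: revalidation takes precedence over auto-archive.
            if self.requires_revalidation {
                ExpiryAction::Revalidate
            } else if self.auto_archive {
                ExpiryAction::Archive
            } else {
                ExpiryAction::FlagForReview
            }
        } else if now >= self.reminder_at {
            ExpiryAction::Remind
        } else {
            ExpiryAction::Nothing
        }
    }
}
```

Passing `now` in as a parameter, rather than reading the clock inside the method, keeps the check deterministic and easy to test.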

12.5 Files to Create/Modify

| File | Change |
|---|---|
| src/lifecycle/mod.rs | New module |
| src/lifecycle/status.rs | Status enum and transitions |
| src/lifecycle/versioning.rs | Version tracking |
| src/lifecycle/expiry.rs | Expiry and refresh logic |
| src/handlers/deprecate.rs | Deprecation command handler |
| src/cli.rs | Add deprecate command |

Phase 13: Governance Workflows (HIGH)

Problem: Governance is binary: manual review or >0.95 auto-promote. No approval workflows, SLAs, or role-based gates.

Required Components:

13.1 Approval Workflow Definition

pub struct ApprovalWorkflow {
    pub name: String,
    pub applies_to: WorkflowScope,
    pub stages: Vec<ApprovalStage>,
    pub timeout: Duration,
    pub escalation: EscalationPolicy,
}

pub struct ApprovalStage {
    pub name: String,
    pub required_approvers: u32,
    pub approver_roles: Vec<ContributorRole>,
    pub auto_approve_threshold: Option<f32>,
}

Example:

[[governance.workflows]]
name = "pattern-promotion"
applies_to = "conventions"

[[governance.workflows.stages]]
name = "team-review"
required_approvers = 1
approver_roles = ["SeniorEngineer", "Architect"]
auto_approve_threshold = 0.95

[[governance.workflows.stages]]
name = "security-review"
required_approvers = 1
approver_roles = ["SecurityReviewer"]
# No auto-approve for security-sensitive patterns
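
One possible reading of how a single stage is evaluated is sketched below. The document does not say what `auto_approve_threshold` is compared against, so this assumes a numeric score such as the pattern's authority weight, and assumes a score at or above the threshold passes. Roles are plain strings here rather than `ContributorRole`, and `stage_passes` is a hypothetical name.

```rust
pub struct ApprovalStage {
    pub name: String,
    pub required_approvers: u32,
    pub approver_roles: Vec<String>, // stand-in for Vec<ContributorRole>
    pub auto_approve_threshold: Option<f32>,
}

/// `approvals` holds the role of each person who has approved so far.
pub fn stage_passes(stage: &ApprovalStage, approvals: &[&str], score: f32) -> bool {
    // Assumption: a score at or above the threshold skips human review.
    if let Some(threshold) = stage.auto_approve_threshold {
        if score >= threshold {
            return true;
        }
    }
    // Otherwise count approvals that come from a role the stage accepts.
    let qualifying = approvals
        .iter()
        .filter(|role| stage.approver_roles.iter().any(|r| r.as_str() == **role))
        .count();
    (qualifying as u32) >= stage.required_approvers
}
```

A stage with no threshold, like the security review in the example, can only pass through a qualifying human approval regardless of score.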

13.2 Approval State Machine

PENDING → STAGE_1_PENDING → STAGE_1_APPROVED → STAGE_2_PENDING → APPROVED
              ↓                    ↓                   ↓
           REJECTED            REJECTED            REJECTED
              ↓                    ↓                   ↓
          ARCHIVED             ARCHIVED            ARCHIVED
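
The diagram can be expressed as a transition function. The sketch below generalises the two stages to `stage_count` stages, and the `Event` names are hypothetical. It also assumes rejection is possible from any in-progress stage state, which is one reading of the arrows.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalState {
    Pending,
    StagePending(usize),
    StageApproved(usize),
    Approved,
    Rejected,
    Archived,
}

// Hypothetical event names; the diagram labels states only.
#[derive(Debug, Clone, Copy)]
pub enum Event {
    Submit,
    Approve,
    Reject,
    Advance,
    Archive,
}

/// Returns the next state, or None for a transition the diagram does not allow.
pub fn next(state: ApprovalState, event: Event, stage_count: usize) -> Option<ApprovalState> {
    use ApprovalState::*;
    match (state, event) {
        (Pending, Event::Submit) if stage_count > 0 => Some(StagePending(1)),
        // Approving the final stage completes the workflow.
        (StagePending(n), Event::Approve) if n == stage_count => Some(Approved),
        (StagePending(n), Event::Approve) => Some(StageApproved(n)),
        (StageApproved(n), Event::Advance) => Some(StagePending(n + 1)),
        (StagePending(_), Event::Reject) | (StageApproved(_), Event::Reject) => Some(Rejected),
        (Rejected, Event::Archive) => Some(Archived),
        _ => None,
    }
}
```

Returning `None` for anything not drawn in the diagram means terminal states such as `Approved` and `Archived` cannot be left by accident.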

13.3 Approval CLI

# List pending approvals
aphoria governance pending

# Approve a pattern promotion
aphoria governance approve <pattern-id> --comment "LGTM"

# Reject with reason
aphoria governance reject <pattern-id> --reason "Too specific to one project"

# Escalate to next stage
aphoria governance escalate <pattern-id>

13.4 Files to Create/Modify

| File | Change |
|---|---|
| src/governance/mod.rs | New module |
| src/governance/workflow.rs | Workflow definitions |
| src/governance/approval.rs | Approval state machine |
| src/governance/store.rs | Persistence |
| src/handlers/governance.rs | CLI handlers |
| src/cli.rs | Add governance subcommand |

Phase 14: Evidence Source Integration (HIGH)

Problem: Evidence sources (ADRs, specs, standards) aren't automatically linked. Developers must manually reference them.

Required Components:

14.1 ADR Auto-Detection

pub struct AdrDetector {
    pub patterns: Vec<String>,  // e.g., ["docs/adr/*.md", "adr/*.md", "decisions/*.md"]
}

impl AdrDetector {
    pub fn find_linked_adr(&self, commit: &Commit) -> Option<AdrReference> {
        // Check commit message for ADR-XXX pattern
        // Check modified files for ADR documents
        // Parse ADR content for related patterns
    }
}
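
The first two checks in `find_linked_adr` could be sketched with the standard library alone. This is an illustration under assumptions: ADR identifiers are taken to look like `ADR-` followed by digits, the glob patterns are simplified to "a `.md` file directly inside a configured directory", and both function names are hypothetical.

```rust
/// Find the first "ADR-<digits>" token in a commit message.
pub fn find_adr_id(message: &str) -> Option<String> {
    let mut rest = message;
    while let Some(pos) = rest.find("ADR-") {
        // "ADR-" is four ASCII bytes, so this slice is on a char boundary.
        let after = &rest[pos + 4..];
        let digits: String = after.chars().take_while(|c| c.is_ascii_digit()).collect();
        if !digits.is_empty() {
            return Some(format!("ADR-{}", digits));
        }
        rest = after;
    }
    None
}

/// True if a changed file is a Markdown file directly inside an ADR directory.
/// Simplification: directories stand in for globs such as "docs/adr/*.md".
pub fn is_adr_path(path: &str, adr_dirs: &[&str]) -> bool {
    adr_dirs.iter().any(|dir| {
        path.strip_prefix(*dir)
            .and_then(|p| p.strip_prefix('/'))
            .map_or(false, |file| !file.contains('/') && file.ends_with(".md"))
    })
}
```

A real implementation would more likely use a glob or regex crate. The point here is only that message scanning and path matching are two independent signals.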

14.2 Spec File Detection

pub struct SpecDetector {
    pub patterns: Vec<String>,  // e.g., ["specs/*.md", "*.spec.md", "requirements/*.md"]
}

impl SpecDetector {
    pub fn find_linked_spec(&self, commit: &Commit) -> Option<SpecReference> {
        // Check for spec file modifications in same commit
        // Parse spec for requirement IDs (REQ-XXX)
        // Link pattern to requirement
    }
}
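
Requirement ID parsing might be sketched as below. The ID grammar is an assumption inferred from the examples in this document (`REQ-AUTH-001`, `REQ-API-001`): the prefix `REQ-` followed by uppercase letters, digits, and hyphens. `requirement_ids` is a hypothetical helper name.

```rust
/// Collect distinct requirement IDs from spec text, in order of first appearance.
pub fn requirement_ids(spec_text: &str) -> Vec<String> {
    let mut ids: Vec<String> = Vec::new();
    // Split on every character that cannot be part of an ID.
    for token in spec_text.split(|c: char| !(c.is_ascii_alphanumeric() || c == '-')) {
        let token = token.trim_matches('-');
        if let Some(rest) = token.strip_prefix("REQ-") {
            // Assumed grammar: uppercase letters, digits, and hyphens after "REQ-".
            let well_formed = !rest.is_empty()
                && rest
                    .chars()
                    .all(|c| c.is_ascii_uppercase() || c.is_ascii_digit() || c == '-');
            if well_formed && !ids.iter().any(|id| id.as_str() == token) {
                ids.push(token.to_string());
            }
        }
    }
    ids
}
```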

14.3 Standard Reference Extraction

pub fn extract_standard_reference(text: &str) -> Option<StandardReference> {
    // Match patterns like:
    // - "RFC 7519" → Standard(Rfc, "7519")
    // - "OWASP A03:2021" → Standard(Owasp, "A03:2021")
    // - "NIST SP 800-53" → Standard(Nist, "SP 800-53")
}
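
The matching rules in the comments above can be sketched without a regex dependency. The `StandardReference` field names follow their use in 10.3 (`std_type`, `reference`). The word-based matching is a simplification, and `StandardType` is trimmed to the three variants that the comments cover.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StandardType {
    Rfc,
    Owasp,
    Nist,
}

#[derive(Debug, PartialEq, Eq)]
pub struct StandardReference {
    pub std_type: StandardType,
    pub reference: String,
}

pub fn extract_standard_reference(text: &str) -> Option<StandardReference> {
    // Strip surrounding punctuation so "(OWASP" and "7519," still match.
    let words: Vec<&str> = text
        .split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_ascii_alphanumeric()))
        .collect();
    for (i, word) in words.iter().enumerate() {
        let next = words.get(i + 1).copied();
        let found = match (*word, next) {
            // "RFC 7519": the reference is the all-digit RFC number.
            ("RFC", Some(n)) if !n.is_empty() && n.chars().all(|c| c.is_ascii_digit()) => {
                Some((StandardType::Rfc, n.to_string()))
            }
            // "OWASP A03:2021": category code, colon, year.
            ("OWASP", Some(n)) if n.starts_with('A') && n.contains(':') => {
                Some((StandardType::Owasp, n.to_string()))
            }
            // "NIST SP 800-53": keep the "SP" prefix in the reference.
            ("NIST", Some("SP")) => words
                .get(i + 2)
                .map(|n| (StandardType::Nist, format!("SP {}", n))),
            _ => None,
        };
        if let Some((std_type, reference)) = found {
            return Some(StandardReference { std_type, reference });
        }
    }
    None
}
```

The function returns only the first match. A commit that cites several standards would need a variant that collects all of them.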

14.4 Evidence Display

$ aphoria patterns show "api-versioning"

Pattern: API Versioning (/api/v{major}/{resource})
Status: Active (Convention)
Scope: Team (Platform)
Authority: 0.95 (ProductSpec)

Evidence Chain:
  1. ProductSpec: specs/api-design.md → REQ-API-001
  2. Research: docs/adr/ADR-042-api-versioning.md
  3. Standard: RFC 7231 Section 5.1

First seen: 2024-03-15 in payment-service
Usages: 25 across 3 teams
Adoption: 89% of team projects (8/9)

14.5 Files to Create/Modify

| File | Change |
|---|---|
| src/attribution/mod.rs | New module |
| src/attribution/git.rs | Git extraction |
| src/attribution/directory.rs | LDAP/SSO integration |
| src/attribution/display.rs | Human-readable output |
| src/handlers/patterns.rs | Include attribution |

Phase 15: External Knowledge Integration (MEDIUM)

Problem: Internal documentation (ADRs, Confluence) not integrated. Teams maintain knowledge in two places.

Required Components:

15.1 ADR Importer

# Import ADRs as authoritative sources
aphoria import adr ./docs/adr/

# ADR-042: API Versioning
# Status: Accepted
# → Imports as Tier 1 policy for the project

15.2 Confluence Connector (Future)

[integration.confluence]
url = "https://acme.atlassian.net/wiki"
space = "PLATFORM"
label = "aphoria-policy"  # Only import pages with this label

15.3 Jira Integration (Future)

[integration.jira]
url = "https://acme.atlassian.net"
project = "PLAT"
conflict_link = true  # Link conflicts to Jira issues

15.4 Files to Create/Modify

| File | Change |
|---|---|
| src/integration/mod.rs | New module |
| src/integration/adr.rs | ADR parser |
| src/integration/confluence.rs | Confluence API client |
| src/integration/jira.rs | Jira API client |
| src/handlers/import.rs | Import command handlers |

Implementation Priority

Must Have for Enterprise Pilot (Phases 10-12)

| Phase | Feature | Effort | Why Critical |
|---|---|---|---|
| 10 | Evidence-based authority | 2 weeks | "Why trust this pattern?" - merit, not titles |
| 11 | Scope hierarchy | 2 weeks | "Does this apply to my team?" |
| 12 | Knowledge lifecycle | 1 week | "Is this still current?" |

Should Have for Production (Phases 13-15)

| Phase | Feature | Effort | Why Important |
|---|---|---|---|
| 13 | Governance workflows | 2 weeks | SOC 2 compliance |
| 14 | Evidence source integration | 1 week | Auto-detect ADRs, specs, standards |
| 15 | External integration | 2 weeks | Confluence, Jira linking |

Success Metrics

Enterprise Pilot (90 days)

| Metric | Target | Measurement |
|---|---|---|
| Patterns captured | 100+ observations | Count in knowledge graph |
| Patterns promoted | 10+ conventions | Count with status=Active |
| Cross-team adoption | 2+ teams connected | Unique team_ids |
| New hire guidance | 5+ accepted suggestions | Accept rate tracking |
| False positive rate | <10% | FP feedback / total flags |

Production Ready (180 days)

| Metric | Target | Measurement |
|---|---|---|
| Knowledge retention | 0 lost patterns on departures | Audit log |
| Onboarding velocity | 50% faster ramp | Time to first PR |
| Convention adoption | 80% across org | Compliance rate |
| SOC 2 evidence | Audit pass | External validation |

Next Steps

  1. Finalize Phase 10-12 specs - Detailed API design
  2. Create enterprise simulation UAT - Multi-month scenario
  3. Build Phase 10 - Authority model is foundational
  4. Pilot with internal team - Eat our own dog food
  5. External pilot - First enterprise customer