
Scale-Adaptive Promotion Thresholds

Overview

Scale-adaptive thresholds automatically adjust promotion criteria based on organization size, enabling small teams to see value immediately while maintaining quality gates for larger organizations.

The Problem

Before adaptive thresholds:

  • Hardcoded minimums: 850/100/50 projects for regulatory/clinical/emerging
  • Small teams (2-5 projects) → 0 patterns promoted → empty dashboard
  • No immediate value demonstration → adoption killed before flywheel starts

Root cause:

  • Thresholds designed for enterprise scale (850 projects for regulatory)
  • Small teams locked out: can't meet 50-project minimum for emerging tier
  • Dashboard queries promoted patterns only (no visibility into raw aggregates)

The Solution

Adaptive Formula

effective_min_projects = max(
    absolute_floor,           // Safety: prevent single-project noise
    (percentage * total_projects).ceil()  // Scale: grow with team
)
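
The formula above can be sketched in Rust as follows. This is a minimal illustration, not the code in corpus/thresholds.rs: the struct layout is an assumption, and only the field names min_projects_floor and min_projects_percentage are taken from the configuration keys shown later in this document.

```rust
/// Hypothetical stand-in for the per-tier criteria. Field names mirror the
/// config keys in this document; the real struct may differ.
#[derive(Debug, Clone, Copy)]
pub struct AdaptiveCriteria {
    /// Absolute floor: the minimum never drops below this value.
    pub min_projects_floor: u64,
    /// Fraction of total projects (0.0..=1.0) that must contain the pattern.
    pub min_projects_percentage: f64,
}

impl AdaptiveCriteria {
    /// Computes max(floor, ceil(percentage * total_projects)).
    pub fn effective_min_projects(&self, total_projects: u64) -> u64 {
        // Round up so a fractional requirement (e.g. 1.5 projects) becomes 2.
        let scaled = (self.min_projects_percentage * total_projects as f64).ceil() as u64;
        self.min_projects_floor.max(scaled)
    }
}
```

With floor=2 and percentage=0.50, a 3-project team gets a minimum of 2 (the percentage term rounds 1.5 up to 2). With floor=5 and percentage=0.40, a 5-project total still requires 5, because the floor dominates. One caveat of this sketch: floating-point multiplication can land slightly above an exact integer for some percentages, so a production version might prefer integer arithmetic.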

Scale Tiers (Auto-Detected)

| Tier | Project Range | Behavior |
|------|---------------|----------|
| Micro | 1-5 | Only emerging tier, floor=2, rate=50% |
| Small | 6-25 | All tiers enabled, lower floors |
| Medium | 26-100 | Balanced thresholds |
| Large | 101-500 | Higher quality gates |
| Enterprise | 501+ | Current defaults (backward compatible) |
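
Tier detection from the ranges above could look roughly like this. The enum and method names follow the ScaleTier and from_total_projects names mentioned under Implementation, but the body is a sketch, and mapping a count of 0 to Micro is an assumption (the table starts at 1).

```rust
/// Organization scale tiers, using the ranges from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScaleTier {
    Micro,
    Small,
    Medium,
    Large,
    Enterprise,
}

impl ScaleTier {
    /// Maps a total project count to its scale tier.
    pub fn from_total_projects(total_projects: u64) -> ScaleTier {
        match total_projects {
            // Assumption: 0 projects is treated as Micro; the table lists 1-5.
            0..=5 => ScaleTier::Micro,
            6..=25 => ScaleTier::Small,
            26..=100 => ScaleTier::Medium,
            101..=500 => ScaleTier::Large,
            _ => ScaleTier::Enterprise,
        }
    }
}
```

Using inclusive range patterns keeps each boundary visible in one place, which makes boundary tests (5 vs 6, 500 vs 501) straightforward to write.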

Example: Emerging Tier Scaling

| Team Size | Projects | Formula | Min Projects | Adoption Required |
|-----------|----------|---------|--------------|-------------------|
| Micro | 3 | max(2, 0.50*3) | 2 | 2/3 projects (67%) |
| Small | 10 | max(2, 0.40*10) | 4 | 4/10 projects (40%) |
| Medium | 50 | max(5, 0.40*50) | 20 | 20/50 projects (40%) |
| Enterprise | 1000 | max(25, 0.50*1000) | 500 | 500/1000 projects (50%) |

Quality Maintained

  • Floor prevents noise: Single-project patterns blocked
  • Adoption rate required: Community consensus still matters
  • Authority matching enforced: Regulatory/clinical tiers need RFC/OWASP match
  • Manual review: Emerging tier still requires review (auto_promote=false)
  • Backward compatible: Enterprise behavior unchanged

Configuration

Default (Adaptive)

# .aphoria/config.toml
[corpus]
use_community = true
aggregation_enabled = true
# adaptive_thresholds = <optional custom thresholds>
use_legacy_thresholds = false  # Default: use adaptive

Legacy Mode (Static Thresholds)

[corpus]
use_legacy_thresholds = true  # Use fixed 850/100/50

Custom Thresholds

[corpus.adaptive_thresholds.micro.emerging]
min_projects_floor = 1       # Override: allow 1 project (risky!)
min_projects_percentage = 0.40
min_adoption_rate = 0.40

Implementation

Core Components

  1. ScaleTier (corpus/thresholds.rs):

    • from_total_projects(u64) -> ScaleTier
    • Auto-detects tier from project count
  2. AdaptiveCriteria (corpus/thresholds.rs):

    • effective_min_projects(total_projects) -> u64
    • Applies max(floor, percentage * total) formula
  3. ScaleAdaptiveThresholds (corpus/thresholds.rs):

    • evaluate(project_count, total_projects, ...) -> PromotionDecision
    • Returns AutoPromote(tier), RequireReview, or Skip
  4. CommunityCorpusBuilder (corpus/community.rs):

    • Updated to use adaptive thresholds when use_adaptive=true
    • Falls back to legacy thresholds when use_legacy_thresholds=true
    • Logs scale tier and threshold mode on build
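
To make the decision flow concrete, here is a simplified sketch of how a single promotion tier might be evaluated. It is not the actual evaluate signature (whose remaining parameters are not shown in this document): the names PromotionTier, TierCriteria, and evaluate_tier are hypothetical, and authority matching (RFC/OWASP) is deliberately left out.

```rust
/// Hypothetical name for the promotion tiers described in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PromotionTier {
    Regulatory,
    Clinical,
    Emerging,
}

/// Outcomes named in this document: AutoPromote(tier), RequireReview, Skip.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PromotionDecision {
    AutoPromote(PromotionTier),
    RequireReview,
    Skip,
}

/// Hypothetical per-tier criteria; field names follow the config keys.
#[derive(Debug, Clone, Copy)]
pub struct TierCriteria {
    pub min_projects_floor: u64,
    pub min_projects_percentage: f64,
    pub min_adoption_rate: f64,
    pub auto_promote: bool,
}

/// Evaluates one pattern against one tier (authority matching omitted).
pub fn evaluate_tier(
    tier: PromotionTier,
    criteria: &TierCriteria,
    project_count: u64,
    total_projects: u64,
) -> PromotionDecision {
    // Nothing can be promoted when no projects have been scanned.
    if total_projects == 0 {
        return PromotionDecision::Skip;
    }
    // max(floor, ceil(percentage * total)), as in the adaptive formula.
    let scaled = (criteria.min_projects_percentage * total_projects as f64).ceil() as u64;
    let min_projects = criteria.min_projects_floor.max(scaled);
    let adoption = project_count as f64 / total_projects as f64;

    if project_count < min_projects || adoption < criteria.min_adoption_rate {
        return PromotionDecision::Skip;
    }
    if criteria.auto_promote {
        PromotionDecision::AutoPromote(tier)
    } else {
        // E.g. the emerging tier, which keeps auto_promote=false.
        PromotionDecision::RequireReview
    }
}
```

Under these assumptions, a micro team's emerging criteria (floor=2, percentage=0.50, adoption rate=0.50, auto_promote=false) yields RequireReview for a pattern seen in 2 of 3 projects and Skip for a pattern seen in only 1.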

Configuration Fields

CorpusConfig (config/types/scan.rs):

  • adaptive_thresholds: Option<ScaleAdaptiveThresholds> - Custom thresholds
  • use_legacy_thresholds: bool - Backward compatibility flag (default: false)
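
The two fields and their defaults can be pictured with the sketch below. The placeholder ScaleAdaptiveThresholds type, the ThresholdMode enum, and the threshold_mode helper are all hypothetical, and the choice to let use_legacy_thresholds take precedence over custom thresholds is an assumption rather than documented behavior.

```rust
/// Placeholder for the real thresholds type; its contents are not shown here.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ScaleAdaptiveThresholds;

/// Only the two fields discussed above; the real CorpusConfig has more.
#[derive(Debug, Clone, Default)]
pub struct CorpusConfig {
    /// Custom thresholds; None means the built-in adaptive defaults apply.
    pub adaptive_thresholds: Option<ScaleAdaptiveThresholds>,
    /// Backward compatibility flag; Default gives false (adaptive mode).
    pub use_legacy_thresholds: bool,
}

/// Hypothetical summary of which thresholds a build would use.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThresholdMode {
    Legacy,
    AdaptiveDefault,
    AdaptiveCustom,
}

impl CorpusConfig {
    /// Assumption: the legacy flag wins over any custom adaptive thresholds.
    pub fn threshold_mode(&self) -> ThresholdMode {
        if self.use_legacy_thresholds {
            ThresholdMode::Legacy
        } else if self.adaptive_thresholds.is_some() {
            ThresholdMode::AdaptiveCustom
        } else {
            ThresholdMode::AdaptiveDefault
        }
    }
}
```

Deriving Default gives use_legacy_thresholds=false and adaptive_thresholds=None, which matches the documented default of adaptive mode with no custom overrides.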

Usage

Micro Team Example (3 projects)

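# Scan 3 projects (each scan runs in a subshell, so the working directory
# resets and the sibling directories project2 and project3 stay reachable)
(cd project1 && aphoria scan --persist --sync)
(cd project2 && aphoria scan --persist --sync)
(cd project3 && aphoria scan --persist --sync)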

# Check logs
# Should see:
# scale_tier=Micro, use_adaptive=true
# Pattern promoted: 2/3 projects (67%) → RequireReview

Query Patterns

# API: Patterns with min 1 project (shows all for micro teams)
curl 'http://localhost:18180/api/patterns?min_projects=1&limit=10'

# Dashboard will show:
# - Scale tier: "Micro (3 projects)"
# - Promoted patterns visible
# - Thresholds: "Emerging: 2/3 projects (67%)"

Testing

Unit Tests

  • test_scale_tier_detection() - Verify tier boundaries
  • test_effective_min_projects() - Floor vs percentage dominance
  • test_micro_team_promotion() - 2/3 projects promoted
  • test_regulatory_disabled_for_micro() - Tier disabling works
  • test_enterprise_backward_compatible() - Same as legacy

Integration Tests

  • scale_adaptive_test.rs - 7 tests covering all scenarios
  • All 1199 library tests pass

Migration

Existing deployments: No action required

  • Adaptive thresholds are enabled by default
  • Enterprise behavior unchanged (501+ projects)
  • Legacy mode available if needed

New deployments: Immediate value

  • Small teams see patterns after 2-3 scans
  • Quality maintained via floors and adoption rates
  • Natural growth path as team scales

Philosophy

Start simple, scale naturally:

  • Small teams see value immediately (2-3 projects → patterns visible)
  • Quality maintained via floors (no single-project noise)
  • Adoption rate still matters (community consensus)
  • Enterprise behavior unchanged (backward compatible)
  • Configuration optional (defaults are intended to work for about 95% of teams)

This unlocks the flywheel:

  • Small teams adopt → see patterns → gain trust
  • Teams grow → thresholds tighten → quality improves
  • Cross-team patterns emerge → community corpus strengthens
  • No manual threshold tuning required