stemedb/applications/aphoria/docs/scale-adaptive-thresholds.md

# Scale-Adaptive Promotion Thresholds

## Overview

Scale-adaptive thresholds automatically adjust promotion criteria based on organization size, enabling small teams to see value immediately while maintaining quality gates for larger organizations.

## The Problem

**Before adaptive thresholds:**
- Hardcoded minimums: 850/100/50 projects for regulatory/clinical/emerging
- Small teams (2-5 projects) → **0 patterns promoted** → empty dashboard
- No immediate value demonstration → adoption killed before flywheel starts

**Root cause:**
- Thresholds designed for enterprise scale (850 projects for regulatory)
- Small teams locked out: can't meet 50-project minimum for emerging tier
- Dashboard queries promoted patterns only (no visibility into raw aggregates)

## The Solution

### Adaptive Formula

```rust
effective_min_projects = max(
    absolute_floor,           // Safety: prevent single-project noise
    (percentage * total_projects).ceil()  // Scale: grow with team
)
```

### Scale Tiers (Auto-Detected)

| Tier | Project Range | Behavior |
|------|--------------|----------|
| **Micro** | 1-5 | Only emerging tier, floor=2, rate=50% |
| **Small** | 6-25 | All tiers enabled, lower floors |
| **Medium** | 26-100 | Balanced thresholds |
| **Large** | 101-500 | Higher quality gates |
| **Enterprise** | 501+ | Current defaults (backward compatible) |

### Example: Emerging Tier Scaling

| Team Size | Projects | Formula | Min Projects | Adoption Required |
|-----------|----------|---------|--------------|-------------------|
| Micro | 3 | `max(2, 0.50*3)` | **2** | 2/3 projects (67%) |
| Small | 10 | `max(2, 0.40*10)` | **4** | 4/10 projects (40%) |
| Medium | 50 | `max(5, 0.40*50)` | **20** | 20/50 projects (40%) |
| Enterprise | 1000 | `max(25, 0.50*1000)` | **500** | 500/1000 projects (50%) |

## Quality Maintained

✅ **Floor prevents noise:** Single-project patterns blocked
✅ **Adoption rate required:** Community consensus still matters
✅ **Authority matching enforced:** Regulatory/clinical tiers need RFC/OWASP match
✅ **Manual review:** Emerging tier still requires review (auto_promote=false)
✅ **Backward compatible:** Enterprise behavior unchanged

## Configuration

### Default (Adaptive)

```toml
# .aphoria/config.toml
[corpus]
use_community = true
aggregation_enabled = true
# adaptive_thresholds = <optional custom thresholds>
use_legacy_thresholds = false  # Default: use adaptive
```

### Legacy Mode (Static Thresholds)

```toml
[corpus]
use_legacy_thresholds = true  # Use fixed 850/100/50
```

### Custom Thresholds

```toml
[corpus.adaptive_thresholds.micro.emerging]
min_projects_floor = 1       # Override: allow 1 project (risky!)
min_projects_percentage = 0.40
min_adoption_rate = 0.40
```

## Implementation

### Core Components

1. **`ScaleTier`** (`corpus/thresholds.rs`):
   - `from_total_projects(u64) -> ScaleTier`
   - Auto-detects tier from project count

2. **`AdaptiveCriteria`** (`corpus/thresholds.rs`):
   - `effective_min_projects(total_projects) -> u64`
   - Applies `max(floor, percentage * total)` formula

3. **`ScaleAdaptiveThresholds`** (`corpus/thresholds.rs`):
   - `evaluate(project_count, total_projects, ...) -> PromotionDecision`
   - Returns `AutoPromote(tier)`, `RequireReview`, or `Skip`

4. **`CommunityCorpusBuilder`** (`corpus/community.rs`):
   - Updated to use adaptive thresholds when `use_adaptive=true`
   - Falls back to legacy thresholds when `use_legacy_thresholds=true`
   - Logs scale tier and threshold mode on build

### Configuration Fields

**`CorpusConfig`** (`config/types/scan.rs`):
- `adaptive_thresholds: Option<ScaleAdaptiveThresholds>` - Custom thresholds
- `use_legacy_thresholds: bool` - Backward compatibility flag (default: false)

## Usage

### Micro Team Example (3 projects)

```bash
# Scan 3 projects
cd project1 && aphoria scan --persist --sync
cd project2 && aphoria scan --persist --sync
cd project3 && aphoria scan --persist --sync

# Check logs
# Should see:
# scale_tier=Micro, use_adaptive=true
# Pattern promoted: 2/3 projects (67%) → RequireReview
```

### Query Patterns

```bash
# API: Patterns with min 1 project (shows all for micro teams)
curl 'http://localhost:18180/api/patterns?min_projects=1&limit=10'

# Dashboard will show:
# - Scale tier: "Micro (3 projects)"
# - Promoted patterns visible
# - Thresholds: "Emerging: 2/3 projects (67%)"
```

## Testing

### Unit Tests

- `test_scale_tier_detection()` - Verify tier boundaries
- `test_effective_min_projects()` - Floor vs percentage dominance
- `test_micro_team_promotion()` - 2/3 projects promoted
- `test_regulatory_disabled_for_micro()` - Tier disabling works
- `test_enterprise_backward_compatible()` - Same as legacy

### Integration Tests

- `scale_adaptive_test.rs` - 7 tests covering all scenarios
- All 1199 library tests pass

## Migration

**Existing deployments:** No action required
- Adaptive thresholds default to enabled
- Enterprise behavior unchanged (501+ projects)
- Legacy mode available if needed

**New deployments:** Immediate value
- Small teams see patterns after 2-3 scans
- Quality maintained via floors and adoption rates
- Natural growth path as team scales

## Philosophy

**Start simple, scale naturally:**
- Small teams see value immediately (2-3 projects → patterns visible)
- Quality maintained via floors (no single-project noise)
- Adoption rate still matters (community consensus)
- Enterprise behavior unchanged (backward compatible)
- Configuration optional (defaults work for 95%)

**This unlocks the flywheel:**
- Small teams adopt → see patterns → gain trust
- Teams grow → thresholds tighten → quality improves
- Cross-team patterns emerge → community corpus strengthens
- No manual threshold tuning required