jml 9bfa626203 docs: reorganize documentation structure for clarity

Major documentation restructure to improve discoverability and reduce duplication.

## Changes

**Deleted (Archived/Consolidated)**:
- Removed duplicate getting started guides
- Archived outdated planning documents
- Consolidated corpus and configuration docs
- Removed obsolete vision/spec files (superseded by vision.md)
- Cleaned up scrapyard and old PDFs

**New Structure**:
- docs/about/ - Project overview and introduction
- docs/guides/ - User guides (moved from root)
- docs/specs/ - Technical specifications
- docs/sdk/ - SDK documentation (Go)
- docs/references/ - API references
- docs/archive/ - Archived historical docs
- applications/aphoria/docs/advanced/ - Advanced topics
- applications/aphoria/docs/reference/ - CLI reference
- applications/aphoria/docs/archive/ - Archived aphoria docs

**Updated**:
- README.md - New root README with clear navigation
- CONTRIBUTING.md - Contribution guidelines
- CLAUDE.md - Updated paths to new structure
- roadmap.md - Added recent completions

## Files Changed
- 57 files changed
- 1,977 insertions(+)
- 961 deletions(-)

**Net change**: +1,016 lines (added CONTRIBUTING.md, README.md, reorganized content)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-11 07:33:40 +00:00

8.6 KiB

Raw Blame History

Aphoria Configuration Reference

Complete reference for aphoria.toml configuration options.

File Location

.aphoria/config.toml - Created by aphoria init in your project root.

Quick Start

Minimal configuration (defaults work for most projects):

[project]
name = "my-project"

That's it! Aphoria uses sensible defaults for everything else.

Database Configuration

Per-Project Databases (Default)

New in 2026-02-09: Each project now has its own isolated database by default.

[episteme]
# Project database (observations from this project)
# Default: .aphoria/db (project-local)
data_dir = ".aphoria/db"

# Corpus database (aggregated patterns across all projects)
# Default: ~/.aphoria/corpus-db (home-based, shared)
corpus_data_dir = "~/.aphoria/corpus-db"

Architecture:

~/projects/
├── maxwell/
│   └── .aphoria/db/        # Maxwell's observations
├── billing-api/
│   └── .aphoria/db/        # Billing API's observations
└── ~/.aphoria/
    └── corpus-db/          # Shared corpus (all projects)

Legacy Shared Mode

To use the old behavior (single shared database for all projects):

[episteme]
data_dir = "~/.aphoria/db"

Disable Corpus Aggregation

To disable cross-project pattern aggregation:

[episteme]
corpus_data_dir = null

Full Configuration Example

[project]
name = "my-project"
language = "rust"

[episteme]
# Per-project database (default: .aphoria/db)
data_dir = ".aphoria/db"

# Shared corpus database (default: ~/.aphoria/corpus-db)
corpus_data_dir = "~/.aphoria/corpus-db"

# Optional: Remote Episteme URL (future feature)
# url = "https://episteme.example.com"

[thresholds]
block = 0.7  # Conflict score at or above → BLOCK verdict
flag = 0.4   # Conflict score at or above → FLAG verdict

[extractors]
enabled = [
    "tls_verify",
    "tls_version",
    "jwt_config",
    "hardcoded_secrets",
    "timeout_config",
    "dep_versions",
    "cors_config",
    "durability_config",
    "rate_limit",
    # ... (42 total extractors, see cli-reference.md for full list)
]
disabled = []

[extractors.timeout_config]
min_reasonable_ms = 1000
max_reasonable_ms = 300_000

[extractors.dep_versions]
enabled = false  # OPT-IN: Disabled by default to reduce noise
advisory_db = "~/.aphoria/advisory-db"

[extractors.entropy]
min_entropy = 4.5
min_charset_variety = 0.4
min_length = 20
max_length = 200

[extractors.inline_markers]
enabled = false        # OPT-IN: Disabled by default
sync_to_pending = true # Auto-sync when enabled

[scan]
exclude = [
    "target/",
    "node_modules/",
    ".git/",
    "vendor/",
]
max_file_size = 1_048_576  # 1MB
include_tests = false

[aliases]
auto_suggest = true
auto_accept_tier0 = true
auto_create_aliases = true

[corpus]
cache_dir = "~/.cache/aphoria"  # Or system cache dir
include_rfc = true
include_owasp = true
include_vendor = true
use_community = true
aggregation_enabled = true
use_legacy_thresholds = false  # Use adaptive thresholds (default)

# Optional: Override adaptive thresholds
# adaptive_thresholds = { micro_floor = 2, small_floor = 5 }

[hosted]
# Optional: Hosted mode for team aggregation
# url = "https://aphoria-hosted.example.com"
# project_id = "billing-api"
# team_id = "platform-team"
# sync_mode = "push_only"  # or "bidirectional"
# max_retries = 3
# retry_delay_ms = 1000
# api_key_env = "APHORIA_API_KEY"

[community]
enabled = false  # CRITICAL: Opt-in only
anonymize = true # CRITICAL: Privacy by default
exclude = []
include = []
min_confidence = 0.8

[llm]
enabled = false
provider = "gemini"
model = "gemini-3-flash-preview"
api_key_env = "GEMINI_API_KEY"
max_tokens_per_scan = 50000
max_tokens_per_file = 4000
cache_responses = true
timeout_secs = 60
high_value_only = true
min_confidence = 0.7

[learning]
enabled = false
store = "local"
min_confidence = 0.7
prune_after_days = 90
max_patterns = 10_000

[learning.promotion]
min_projects = 5
min_confidence = 0.8
auto_promote = false
output_dir = ".aphoria/extractors/learned"
require_review = true

[autonomous]
# CRITICAL: Opt-in only - kill switch defaults to off
enabled = false
min_confidence = 0.95
min_projects = 10
require_zero_failures = true
require_zero_warnings = true
audit_log = true
# audit_dir defaults to ~/.aphoria/audit/

Key Sections

Project

Basic project metadata.

[project]
name = "my-project"       # Optional: auto-detected from directory name
language = "rust"          # Optional: auto-detected from file extensions

Episteme

Database and storage configuration.

[episteme]
data_dir = ".aphoria/db"              # Per-project observations
corpus_data_dir = "~/.aphoria/corpus-db"  # Shared corpus (optional)
url = null                            # Remote Episteme (future)

Key Options:

data_dir - Where to store this project's observations
- Default: .aphoria/db (project-local)
- Override to ~/.aphoria/db for legacy shared mode
corpus_data_dir - Where to store aggregated patterns
- Default: ~/.aphoria/corpus-db (home-based, shared)
- Set to null to disable cross-project aggregation

Thresholds

Conflict severity thresholds.

[thresholds]
block = 0.7  # High severity (blocks CI)
flag = 0.4   # Medium severity (warns)

Conflict scores range from 0.0 (no conflict) to 1.0 (total conflict).

Extractors

Control which extractors run.

[extractors]
enabled = ["tls_verify", "jwt_config", ...]
disabled = []

See cli-reference.md for the full list of 42 available extractors.

Scan

Control which files are scanned.

[scan]
exclude = ["target/", "node_modules/"]
max_file_size = 1_048_576  # 1MB
include_tests = false

You can also use .aphoriaignore files (gitignore syntax).

Corpus

Control corpus building and thresholds.

[corpus]
include_rfc = true
include_owasp = true
include_vendor = true
use_community = true
aggregation_enabled = true
use_legacy_thresholds = false  # Use adaptive thresholds

Scale-Adaptive Thresholds (default):

Automatically adjusts promotion thresholds based on team size:

Micro (1-5 projects): Patterns visible with 2/3 adoption
Small (6-25 projects): Patterns visible with 5+ projects
Enterprise (501+): Unchanged behavior

See ../advanced/scale-adaptive-thresholds.md for details.

Legacy Thresholds:

[corpus]
use_legacy_thresholds = true

Fixed thresholds regardless of team size (old behavior).

Hosted Mode

For team collaboration and pattern sharing.

[hosted]
url = "https://aphoria.example.com"
project_id = "billing-api"
team_id = "platform-team"
sync_mode = "push_only"

Requires hosted Aphoria server (future feature).

CRITICAL: Opt-in only. Anonymous pattern contribution.

[community]
enabled = false  # Must explicitly opt-in
anonymize = true # Project names are wildcarded

When enabled with --sync, observations are anonymized and shared with the community corpus.

Privacy Guarantees:

Project names are wildcarded in paths
No file paths, line numbers, or source code
Only pattern aggregates (subject + predicate + value)

LLM Extraction

Use LLMs (Gemini) for semantic claim detection.

[llm]
enabled = false  # OPT-IN
provider = "gemini"
model = "gemini-3-flash-preview"
api_key_env = "GEMINI_API_KEY"

Requires API key in environment.

Learning & Autonomous Promotion

CRITICAL: Both require explicit opt-in.

[learning]
enabled = false  # Pattern learning from scans

[autonomous]
enabled = false  # Auto-promotion to extractors (kill switch)

Environment Variables

Aphoria respects these environment variables:

Variable	Purpose	Default
`APHORIA_API_KEY`	Hosted mode API key	None (required if hosted.enabled)
`GEMINI_API_KEY`	Gemini API key	None (required if llm.enabled)
`STEMEDB_DB_DIR`	Override `data_dir`	`.aphoria/db`
`APHORIA_CONFIG`	Config file path	`.aphoria/config.toml`

Migration Guide

From Old Home-Based Database

Before (legacy):

# Default in old versions: ~/.aphoria/db

After (new default):

# Default now: ./.aphoria/db (per-project)

To keep legacy behavior:

[episteme]
data_dir = "~/.aphoria/db"

No migration needed - just set data_dir to old path.

8.6 KiB Raw Blame History