stemedb/applications/aphoria/src/research/mod.rs
jordan a734be3a0d feat: Phase 7 Content Defense + code structure refactoring
Content Defense (Phase 7):
- Add SimilarityIndex with MinHash/LSH for near-duplicate detection
- Add QuarantineStore for flagged assertions awaiting admin review
- Add CircuitBreakerStore for per-agent circuit breaker state
- Add ContentDefenseLayer for ingestion pipeline integration
- Add API endpoints for quarantine and circuit breaker management
- Add research module with gap detection and documentation fetching

Code Structure Improvements:
- Extract research CLI commands to research_commands.rs
- Extract API routers to routers.rs module
- Extract key_codec extraction functions to separate module
- Extract test modules to separate files across multiple crates
- All files now under 500 line limit per pre-commit hook

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 12:44:05 -07:00


//! Research Agent for Aphoria.
//!
//! The Research Agent detects gaps in authoritative coverage and researches
//! official documentation to fill those gaps. This module provides:
//!
//! - **Gap Detection**: Identifies code claims with no authoritative coverage
//! - **Gap Storage**: Persists gaps with tracking metadata (project count, first seen)
//! - **Research Trigger**: Dispatches research when gaps reach threshold
//! - **Claim Extraction**: Parses official documentation for normative claims
//! - **Quality Validation**: Ensures extracted claims meet quality standards
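//!
//! # Example: research trigger
//!
//! The trigger rule listed above can be sketched as follows. This is a minimal
//! illustration, not the module's implementation: it assumes a gap becomes
//! eligible once at least three projects have reported it, and that a gap older
//! than 90 days is skipped as stale (mirroring `DEFAULT_GAP_THRESHOLD` and
//! `DEFAULT_GAP_MAX_AGE_DAYS`). `TrackedGap` and `should_research` are
//! hypothetical names; the real `GapRecord` and `GapStore` may differ.
//!
//! ```rust
//! /// Hypothetical stand-in for a stored gap; not the real `GapRecord`.
//! struct TrackedGap {
//!     /// Number of distinct projects that reported the gap.
//!     project_count: u32,
//!     /// Days elapsed since the gap was first seen.
//!     age_days: u64,
//! }
//!
//! /// Returns true when the gap has enough reports and is not stale.
//! /// Treating both boundaries as inclusive is an assumption.
//! fn should_research(gap: &TrackedGap, threshold: u32, max_age_days: u64) -> bool {
//!     gap.project_count >= threshold && gap.age_days <= max_age_days
//! }
//!
//! fn main() {
//!     let fresh = TrackedGap { project_count: 3, age_days: 10 };
//!     let rare = TrackedGap { project_count: 2, age_days: 10 };
//!     let stale = TrackedGap { project_count: 5, age_days: 120 };
//!
//!     assert!(should_research(&fresh, 3, 90));
//!     assert!(!should_research(&rare, 3, 90));
//!     assert!(!should_research(&stale, 3, 90));
//! }
//! ```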
//!
//! # Architecture
//!
//! ```text
//! ┌─────────────────────────────────────────────────────────────────────┐
//! │                         Research Agent Flow                         │
//! │                                                                     │
//! │  ┌────────────┐   ┌──────────────┐   ┌─────────────────────────────┐│
//! │  │    Gap     │──▶│  Gap Store   │──▶│      Research Trigger       ││
//! │  │  Detector  │   │   (SQLite)   │   │   (threshold: 3 projects)   ││
//! │  └────────────┘   └──────────────┘   └─────────────────────────────┘│
//! │                                                     │               │
//! │                                                     ▼               │
//! │  ┌────────────────────────────────────────────────────────────────┐ │
//! │  │                       Research Pipeline                        │ │
//! │  │                                                                │ │
//! │  │  ┌───────────┐   ┌─────────────┐   ┌──────────────────────┐    │ │
//! │  │  │    Web    │──▶│   Content   │──▶│       Quality        │    │ │
//! │  │  │  Fetcher  │   │  Extractor  │   │      Validator       │    │ │
//! │  │  └───────────┘   └─────────────┘   └──────────────────────┘    │ │
//! │  │                                                │               │ │
//! │  │                                                ▼               │ │
//! │  │                                    ┌──────────────────────┐    │ │
//! │  │                                    │   Corpus Ingestion   │    │ │
//! │  │                                    │ (if quality passes)  │    │ │
//! │  │                                    └──────────────────────┘    │ │
//! │  └────────────────────────────────────────────────────────────────┘ │
//! └─────────────────────────────────────────────────────────────────────┘
//! ```
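//!
//! # Example: quality gate
//!
//! The final step in the diagram, ingestion only if quality passes, might look
//! like the sketch below. It assumes each extracted claim carries a confidence
//! score that is compared against a 0.7 cutoff (the value of
//! `DEFAULT_QUALITY_THRESHOLD`) and that a score equal to the cutoff passes.
//! `ExtractedClaim` and `accepted_claims` are hypothetical names and the claim
//! texts are made up; the real `QualityValidator` may score claims differently.
//!
//! ```rust
//! /// Hypothetical claim parsed from documentation.
//! #[derive(Debug, Clone, PartialEq)]
//! struct ExtractedClaim {
//!     text: String,
//!     confidence: f32,
//! }
//!
//! /// Keeps only the claims whose confidence meets the threshold.
//! fn accepted_claims(claims: &[ExtractedClaim], threshold: f32) -> Vec<ExtractedClaim> {
//!     claims.iter().filter(|c| c.confidence >= threshold).cloned().collect()
//! }
//!
//! fn main() {
//!     let claims = vec![
//!         ExtractedClaim { text: "clear normative claim".into(), confidence: 0.9 },
//!         ExtractedClaim { text: "ambiguous sentence".into(), confidence: 0.4 },
//!     ];
//!
//!     // Only the high-confidence claim would be passed on to corpus ingestion.
//!     let kept = accepted_claims(&claims, 0.7);
//!     assert_eq!(kept.len(), 1);
//!     assert_eq!(kept[0].text, "clear normative claim");
//! }
//! ```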

mod gap_detector;
mod gap_store;
mod helpers;
mod quality;
mod researcher;

#[cfg(test)]
mod tests;

pub use gap_detector::{detect_gaps, Gap};
pub use gap_store::{GapRecord, GapStore};
pub use quality::{QualityReport, QualityValidator};
pub use researcher::{ResearchConfig, ResearchResult, Researcher};

/// Minimum number of projects that must report a gap before research is triggered.
pub const DEFAULT_GAP_THRESHOLD: u32 = 3;

/// Maximum age of a gap, in days, before it is considered stale.
pub const DEFAULT_GAP_MAX_AGE_DAYS: u64 = 90;

/// Minimum confidence score required to accept a researched claim.
pub const DEFAULT_QUALITY_THRESHOLD: f32 = 0.7;

/// Result of a research operation.
#[derive(Debug)]
pub struct ResearchOutcome {
    /// Number of gaps analyzed.
    pub gaps_analyzed: usize,
    /// Number of gaps successfully researched.
    pub gaps_filled: usize,
    /// Number of assertions created from research.
    pub assertions_created: usize,
    /// Gaps that could not be filled (insufficient quality).
    pub gaps_failed: Vec<String>,
    /// Detailed results per gap.
    pub results: Vec<GapResearchResult>,
}

/// Result of researching a single gap.
#[derive(Debug, Clone)]
pub struct GapResearchResult {
    /// The gap that was researched.
    pub gap: String,
    /// Whether research was successful.
    pub success: bool,
    /// Number of assertions created.
    pub assertions_created: usize,
    /// Quality report for the research.
    pub quality_report: Option<QualityReport>,
    /// Error message if research failed.
    pub error: Option<String>,
}

impl ResearchOutcome {
    /// Create an empty outcome with all counters at zero and no per-gap results.
    pub fn empty() -> Self {
        Self {
            gaps_analyzed: 0,
            gaps_filled: 0,
            assertions_created: 0,
            gaps_failed: vec![],
            results: vec![],
        }
    }

    /// Returns `true` if the research created at least one assertion.
    pub fn has_results(&self) -> bool {
        self.assertions_created > 0
    }
}