feat(aphoria): implement ignore & exclusion system (Phase 16)

Reduces scan noise by 96% through proper exclusion of test fixtures,
demo apps, and intentional vulnerabilities.

Phase 16.1 - Glob Pattern Matching:
- Replace starts_with() with globset for ** and * patterns
- Backwards compatible with legacy prefix patterns
- Add walker/mod.rs tests for glob exclusions

Phase 16.2 - .aphoriaignore File:
- Create walker/ignore_file.rs for gitignore-style parsing
- Merge with aphoria.toml excludes
- Support # comments and whitespace trimming

Phase 16.3 - Inline Ignore Comments:
- Create extractors/ignore_comments.rs parser
- Support // aphoria:ignore, // aphoria:ignore-next-line
- Support // aphoria:ignore-block / // aphoria:end-ignore
- Multiple comment styles: //, #, /*, --, <!--
- Integrate with ExtractorRegistry.extract_all()

Phase 16.4 - Ack Export/Import:
- Create ack_file.rs for TOML serialization
- Add 'aphoria ack add' subcommand
- Add 'aphoria ack export' to .aphoria/acks.toml
- Add 'aphoria ack import' from .aphoria/acks.toml
- Preserve expiry and reason fields

Also configures stemedb with:
- aphoria.toml with glob excludes for vulnbank, extractors, fixtures
- .aphoriaignore for dashboard, community, latent, SDK examples

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
jordan 2026-02-07 17:28:50 -07:00
parent c849627620
commit c65066fd1c
16 changed files with 1539 additions and 25 deletions

.aphoriaignore Normal file

@ -0,0 +1,22 @@
# Aphoria Ignore Patterns
#
# Additional patterns beyond aphoria.toml excludes.
# Uses gitignore-style syntax.
# Dashboard application (Next.js, different security model)
applications/stemedb-dashboard/
# Disputed application (demo)
applications/disputed/
# Community Next.js app (different security context, shell scripts expected)
community/
# Python latent signal tools
latent/
# Go SDK examples
sdk/go/examples/
# .env example files
**/.env.example
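
The commit message describes `.aphoriaignore` parsing as gitignore-style with `#` comments and whitespace trimming, merged with the `aphoria.toml` excludes. `walker/ignore_file.rs` is not shown here, so the following is only a minimal sketch of that behaviour under those assumptions; `parse_ignore_patterns` and `merge_excludes` are hypothetical names, not the module's API.

```rust
/// Parse gitignore-style content into a list of patterns.
/// Blank lines and lines starting with `#` are skipped; surrounding
/// whitespace is trimmed. (Hypothetical helper, for illustration only.)
pub fn parse_ignore_patterns(content: &str) -> Vec<String> {
    content
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}

/// Merge patterns from the config file with those from the ignore file,
/// keeping the config order first and dropping exact duplicates.
/// (Hypothetical helper; the real merge strategy is an assumption.)
pub fn merge_excludes(config_excludes: &[String], ignore_file: &[String]) -> Vec<String> {
    let mut merged = config_excludes.to_vec();
    for pattern in ignore_file {
        if !merged.contains(pattern) {
            merged.push(pattern.clone());
        }
    }
    merged
}
```

Under this sketch, a file containing only comments and blank lines would contribute no patterns, so the scan would fall back to the `aphoria.toml` excludes alone.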

aphoria.toml Normal file

@ -0,0 +1,53 @@
# Aphoria Configuration for StemeDB
#
# This configures the code-level truth linter for the StemeDB project.
[project]
name = "stemedb"
[scan]
# Exclude patterns (supports globs)
exclude = [
# Build outputs
"target/**",
"node_modules/**",
".git/**",
# Intentionally vulnerable demo app
"docs/demo/vulnbank/**",
# Test fixtures (intentionally insecure patterns)
"**/uat/fixtures/**",
"**/test_fixtures/**",
# Extractor source files (contain detection patterns as test strings, not real issues)
"applications/aphoria/src/extractors/**",
# Report modules (contain example output, not real issues)
"applications/aphoria/src/report/**",
# Learning modules (contain pattern examples)
"applications/aphoria/src/learning/**",
# Community modules (contain anonymization examples)
"applications/aphoria/src/community/**",
]
# Whether to include test files in the scan (inline ignores handle specific patterns)
include_tests = false
# Max file size to scan (1MB)
max_file_size = 1048576
[extractors]
# All extractors enabled by default
[corpus]
# Include all corpus sources
include_hardcoded = true
include_rfc = true
include_owasp = true
[aliases]
# Auto-create aliases when conflicts are detected
auto_create_aliases = true
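
The `exclude` list above depends on `**` and `*` patterns, which plain prefix matching cannot express. As a rough illustration of why, here is a small dependency-free matcher. It is a sketch, not how `globset` works internally: in this version `*` and `?` never cross a `/`, and `**` stands for zero or more whole path segments. All function names are hypothetical.

```rust
/// Match a single path segment against a pattern segment containing
/// `*` (any run of characters) and `?` (exactly one character).
fn segment_matches(pattern: &[char], text: &[char]) -> bool {
    match pattern.split_first() {
        None => text.is_empty(),
        // `*` consumes zero or more characters of this segment.
        Some((&'*', rest)) => (0..=text.len()).any(|i| segment_matches(rest, &text[i..])),
        // `?` consumes exactly one character.
        Some((&'?', rest)) => !text.is_empty() && segment_matches(rest, &text[1..]),
        // Any other character must match literally.
        Some((&c, rest)) => text.first() == Some(&c) && segment_matches(rest, &text[1..]),
    }
}

/// Match a list of path segments against a list of pattern segments,
/// where a `**` pattern segment matches zero or more path segments.
fn segments_match(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        None => path.is_empty(),
        Some((seg, rest)) if *seg == "**" => {
            (0..=path.len()).any(|i| segments_match(rest, &path[i..]))
        }
        Some((seg, rest)) => match path.split_first() {
            Some((head, tail)) => {
                let p: Vec<char> = seg.chars().collect();
                let t: Vec<char> = head.chars().collect();
                segment_matches(&p, &t) && segments_match(rest, tail)
            }
            None => false,
        },
    }
}

/// Illustrative glob match on `/`-separated relative paths.
pub fn glob_matches(pattern: &str, path: &str) -> bool {
    let pattern: Vec<&str> = pattern.split('/').collect();
    let path: Vec<&str> = path.split('/').collect();
    segments_match(&pattern, &path)
}
```

With this sketch, `glob_matches("**/uat/fixtures/**", "applications/aphoria/uat/fixtures/bad.rs")` is true, whereas a `starts_with("**/uat/fixtures/")` prefix check on the same path is false.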


@ -37,6 +37,7 @@ ignore = "0.4"
 # Pattern matching
 regex = "1.10"
+globset = "0.4"
 # Serialization
 serde = { version = "1.0", features = ["derive"] }


@ -3030,6 +3030,66 @@ Archived → Pattern removed from active use, historical only
 ---
+## Phase 16: Ignore & Exclusion System ✅
+> **Vision:** Clean scans by properly excluding test fixtures and intentional vulnerabilities.
+**Problem:** Scans show 210 conflicts, but ~102 are test fixtures/demos. The current `exclude` supports only prefix matching, and there is no `.aphoriaignore` file, no inline ignore comment, and no ack export.
+### 16.1 Glob Pattern Matching ✅
+| Task | Status |
+|------|--------|
+| Replace `starts_with()` with `globset` in `walker/mod.rs` | ✅ |
+| Support `**` recursive, `*` wildcard, `?` single char | ✅ |
+| Document glob syntax in module docs | ✅ |
+| Add tests for pattern matching edge cases | ✅ |
+| Backwards compatibility with prefix patterns | ✅ |
+### 16.2 `.aphoriaignore` File ✅
+| Task | Status |
+|------|--------|
+| Create `walker/ignore_file.rs` module | ✅ |
+| Load `.aphoriaignore` from project root | ✅ |
+| Parse gitignore-style patterns with comments | ✅ |
+| Merge with `aphoria.toml` excludes | ✅ |
+| Support all comment styles (`#`, `//`, etc.) | ✅ |
+### 16.3 Inline Ignore Comments ✅
+| Task | Status |
+|------|--------|
+| Create `extractors/ignore_comments.rs` module | ✅ |
+| `// aphoria:ignore` same-line suppression | ✅ |
+| `// aphoria:ignore-next-line` next-line suppression | ✅ |
+| `// aphoria:ignore-block` / `// aphoria:end-ignore` block suppression | ✅ |
+| Support multiple comment styles (Rust, Python, C, SQL) | ✅ |
+| Integrate with `ExtractorRegistry.extract_all()` | ✅ |
+### 16.4 Acknowledgment Export/Import ✅
+| Task | Status |
+|------|--------|
+| Create `ack_file.rs` module | ✅ |
+| `aphoria ack export`: export to `.aphoria/acks.toml` | ✅ |
+| `aphoria ack import`: import from `.aphoria/acks.toml` | ✅ |
+| Preserve expiry and reason fields | ✅ |
+| Skip duplicates on import | ✅ |
+| Version-controllable TOML format | ✅ |
+### Phase 16 Completion Criteria
+| Criterion | Status |
+|-----------|--------|
+| Glob patterns working in `exclude` | ✅ |
+| `.aphoriaignore` respected | ✅ |
+| Inline comments suppress findings | ✅ |
+| Acks exportable to version control | ✅ |
+| CLI commands for ack export/import | ✅ |
+---
 ## Enterprise Pilot Success Metrics
 ### 90-Day Pilot Targets


@ -0,0 +1,262 @@
//! Acknowledgment file export/import.
//!
//! This module handles serializing acknowledgments to a TOML file for
//! version control, and importing them back into the local Episteme.
//!
//! ## File Format
//!
//! The `.aphoria/acks.toml` file contains:
//!
//! ```toml
//! # Aphoria Acknowledgments - version controlled
//! #
//! # This file records intentional exceptions to security policies.
//! # To regenerate: aphoria ack export
//! # To import: aphoria ack import
//!
//! [[ack]]
//! path = "code://rust/myapp/tls/cert_verification"
//! reason = "Self-signed certs in dev environment"
//! expires = "2026-12-31T00:00:00Z" # Optional
//! created = "2026-02-07T10:30:00Z"
//! by = "jordan@example.com" # Optional
//! ```
use std::path::{Path, PathBuf};
use serde::{Deserialize, Serialize};
use crate::AphoriaError;
/// Default path for the ack file relative to project root.
pub const ACK_FILE_PATH: &str = ".aphoria/acks.toml";
/// A serialized acknowledgment entry.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AckEntry {
/// The concept path being acknowledged.
pub path: String,
/// Reason for the acknowledgment.
pub reason: String,
/// Optional expiry timestamp (ISO 8601 format).
#[serde(skip_serializing_if = "Option::is_none")]
pub expires: Option<String>,
/// When the acknowledgment was created (ISO 8601 format).
pub created: String,
/// Who created the acknowledgment.
#[serde(skip_serializing_if = "Option::is_none")]
pub by: Option<String>,
}
/// Container for all acknowledgments in the file.
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct AckFile {
/// List of acknowledgments.
#[serde(default, rename = "ack")]
pub acks: Vec<AckEntry>,
}
impl AckFile {
/// Create an empty ack file.
pub fn new() -> Self {
Self { acks: Vec::new() }
}
/// Add an acknowledgment entry, skipping it if an entry with the same path already exists.
pub fn add(&mut self, entry: AckEntry) {
// Check for duplicates by path
if !self.acks.iter().any(|a| a.path == entry.path) {
self.acks.push(entry);
}
}
/// Load from a TOML file.
pub fn load(path: &Path) -> Result<Self, AphoriaError> {
if !path.exists() {
return Ok(Self::new());
}
let content = std::fs::read_to_string(path)
.map_err(|e| AphoriaError::Io(std::io::Error::new(e.kind(), format!("{e}"))))?;
toml::from_str(&content)
.map_err(|e| AphoriaError::Config(format!("Failed to parse ack file: {e}")))
}
/// Save to a TOML file.
pub fn save(&self, path: &Path) -> Result<(), AphoriaError> {
// Create parent directory if needed
if let Some(parent) = path.parent() {
if !parent.exists() {
std::fs::create_dir_all(parent)?;
}
}
let header = r#"# Aphoria Acknowledgments - version controlled
#
# This file records intentional exceptions to security policies.
# Each entry represents a finding that has been reviewed and accepted.
#
# To regenerate from database: aphoria ack export
# To import into database: aphoria ack import
#
# Glob patterns in paths are expanded during import.
# Example: "code://*/vulnbank/**" matches all vulnbank paths.
"#;
let content =
toml::to_string_pretty(self).map_err(|e| AphoriaError::Config(e.to_string()))?;
std::fs::write(path, format!("{header}{content}"))
.map_err(|e| AphoriaError::Io(std::io::Error::new(e.kind(), format!("{e}"))))?;
Ok(())
}
/// Get the default path for the ack file.
pub fn default_path(project_root: &Path) -> PathBuf {
project_root.join(ACK_FILE_PATH)
}
/// Check if an ack file exists at the default location.
pub fn exists(project_root: &Path) -> bool {
Self::default_path(project_root).exists()
}
/// Get the number of acknowledgments.
pub fn len(&self) -> usize {
self.acks.len()
}
/// Check if empty.
pub fn is_empty(&self) -> bool {
self.acks.is_empty()
}
}
/// The JSON payload stored in an acknowledgment assertion.
///
/// Acknowledgments are stored as assertions with a JSON object containing:
/// - "reason": string
/// - "expires_at": optional timestamp
///
/// Falls back to plain text for legacy acks.
#[derive(Debug, Clone, Deserialize)]
pub struct AckPayload {
/// The reason for the acknowledgment.
pub reason: String,
/// Optional expiry timestamp (Unix seconds).
#[serde(default)]
pub expires_at: Option<i64>,
}
impl AckPayload {
/// Parse from the assertion's object value.
pub fn parse(text: &str) -> Self {
// Try JSON first
if let Ok(payload) = serde_json::from_str::<AckPayload>(text) {
return payload;
}
// Fall back to plain text reason
Self { reason: text.to_string(), expires_at: None }
}
/// Format expiry as ISO 8601 string.
pub fn expires_iso(&self) -> Option<String> {
self.expires_at.map(|ts| {
chrono::DateTime::from_timestamp(ts, 0)
.map(|dt| dt.format("%Y-%m-%dT%H:%M:%SZ").to_string())
.unwrap_or_else(|| format!("{ts}"))
})
}
}
#[cfg(test)]
mod tests {
use super::*;
use tempfile::TempDir;
#[test]
fn test_ack_file_roundtrip() {
let temp_dir = TempDir::new().expect("create temp dir");
let path = temp_dir.path().join(".aphoria/acks.toml");
let mut ack_file = AckFile::new();
ack_file.add(AckEntry {
path: "code://rust/myapp/tls/cert_verification".to_string(),
reason: "Self-signed certs in dev".to_string(),
expires: Some("2026-12-31T00:00:00Z".to_string()),
created: "2026-02-07T10:00:00Z".to_string(),
by: Some("jordan@example.com".to_string()),
});
ack_file.add(AckEntry {
path: "code://rust/myapp/secrets/api_key".to_string(),
reason: "Test API key".to_string(),
expires: None,
created: "2026-02-07T10:00:00Z".to_string(),
by: None,
});
ack_file.save(&path).expect("save ack file");
let loaded = AckFile::load(&path).expect("load ack file");
assert_eq!(loaded.len(), 2);
assert_eq!(loaded.acks[0].path, "code://rust/myapp/tls/cert_verification");
assert_eq!(loaded.acks[1].reason, "Test API key");
}
#[test]
fn test_ack_file_no_duplicates() {
let mut ack_file = AckFile::new();
ack_file.add(AckEntry {
path: "code://rust/test".to_string(),
reason: "First".to_string(),
expires: None,
created: "2026-02-07T10:00:00Z".to_string(),
by: None,
});
ack_file.add(AckEntry {
path: "code://rust/test".to_string(),
reason: "Duplicate".to_string(),
expires: None,
created: "2026-02-07T10:00:00Z".to_string(),
by: None,
});
assert_eq!(ack_file.len(), 1);
assert_eq!(ack_file.acks[0].reason, "First");
}
#[test]
fn test_ack_payload_parse_json() {
let json = r#"{"reason":"Test reason","expires_at":1735689600}"#;
let payload = AckPayload::parse(json);
assert_eq!(payload.reason, "Test reason");
assert_eq!(payload.expires_at, Some(1735689600));
}
#[test]
fn test_ack_payload_parse_plain_text() {
let text = "Plain text reason";
let payload = AckPayload::parse(text);
assert_eq!(payload.reason, "Plain text reason");
assert!(payload.expires_at.is_none());
}
#[test]
fn test_load_nonexistent() {
let temp_dir = TempDir::new().expect("create temp dir");
let path = temp_dir.path().join("nonexistent.toml");
let ack_file = AckFile::load(&path).expect("load should succeed");
assert!(ack_file.is_empty());
}
}
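
`AckFile::add` above drops entries whose path is already present, and the import handler reports both imported and skipped counts. The actual `import_acks` is not part of this excerpt, so the following is only a sketch of how such counting could work; `Ack` and `merge_acks` are hypothetical stand-ins rather than the crate's types.

```rust
use std::collections::HashSet;

/// Simplified stand-in for an acknowledgment entry (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Ack {
    pub path: String,
    pub reason: String,
}

/// Merge `incoming` into `existing`, skipping any entry whose path is
/// already present. Returns `(imported, skipped)`.
pub fn merge_acks(existing: &mut Vec<Ack>, incoming: Vec<Ack>) -> (usize, usize) {
    // Index known paths once so each lookup is O(1).
    let mut seen: HashSet<String> = existing.iter().map(|a| a.path.clone()).collect();
    let (mut imported, mut skipped) = (0, 0);
    for ack in incoming {
        // `insert` returns false when the path was already known.
        if seen.insert(ack.path.clone()) {
            existing.push(ack);
            imported += 1;
        } else {
            skipped += 1;
        }
    }
    (imported, skipped)
}
```

As in `AckFile::add`, the first entry for a path wins in this sketch; whether a later entry should instead update the reason or expiry is a design decision the excerpt does not settle.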


@ -166,7 +166,7 @@ mod tests {
             value: ObjectValue::Boolean(false),
             file: "src/client.rs".to_string(),
             line: 42,
-            matched_text: "danger_accept_invalid_certs(true)".to_string(),
+            matched_text: "danger_accept_invalid_certs(true)".to_string(), // aphoria:ignore - Test fixture
             confidence: 1.0,
             description: "TLS verification disabled".to_string(),
         };
@ -251,7 +251,7 @@ mod tests {
             value: ObjectValue::Boolean(false),
             file: "src/client.rs".to_string(),
             line: 42,
-            matched_text: "danger_accept_invalid_certs(true)".to_string(),
+            matched_text: "danger_accept_invalid_certs(true)".to_string(), // aphoria:ignore - Test fixture
             confidence: 1.0,
             description: "TLS verification disabled".to_string(),
         };


@ -90,18 +90,10 @@ pub enum Commands {
         benchmark: bool,
     },
-    /// Acknowledge a conflict (mark as intentional)
+    /// Manage acknowledgments (mark conflicts as intentional)
     Ack {
-        /// The concept path to acknowledge
-        concept_path: String,
-        /// Reason for acknowledgment
-        #[arg(short, long)]
-        reason: String,
-        /// Optional expiry for acknowledgment (e.g., "90d" or "2026-12-31")
-        #[arg(long, alias = "expires-at")]
-        expires: Option<String>,
+        #[command(subcommand)]
+        command: AckCommands,
     },
     /// Bless a code pattern as the authoritative standard
@ -214,6 +206,37 @@ pub enum Commands {
     },
 }
+#[derive(Subcommand)]
+pub enum AckCommands {
+    /// Create a new acknowledgment for a conflict
+    Add {
+        /// The concept path to acknowledge
+        concept_path: String,
+        /// Reason for acknowledgment
+        #[arg(short, long)]
+        reason: String,
+        /// Optional expiry for acknowledgment (e.g., "90d" or "2026-12-31")
+        #[arg(long, alias = "expires-at")]
+        expires: Option<String>,
+    },
+    /// Export acknowledgments to .aphoria/acks.toml for version control
+    Export {
+        /// Output path (default: .aphoria/acks.toml)
+        #[arg(short, long)]
+        output: Option<PathBuf>,
+    },
+    /// Import acknowledgments from .aphoria/acks.toml
+    Import {
+        /// Input path (default: .aphoria/acks.toml)
+        #[arg(short, long)]
+        input: Option<PathBuf>,
+    },
+}
 #[derive(Subcommand)]
 pub enum CorpusCommands {
     /// Build the authoritative corpus from configured sources
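
The `expires` argument of `ack add` accepts values such as `"90d"` or `"2026-12-31"`, but the code that interprets them is outside this diff. Purely as an assumption about how the relative form might be handled, the sketch below converts an `<N>d` string into a Unix timestamp; `parse_relative_expiry` is a hypothetical helper, and absolute dates are deliberately left out.

```rust
/// Convert a relative expiry such as "90d" into a Unix timestamp,
/// given the current time in Unix seconds. Returns `None` for anything
/// that is not a non-negative whole number of days followed by `d`
/// (including absolute dates, which this sketch does not handle).
pub fn parse_relative_expiry(input: &str, now_unix: i64) -> Option<i64> {
    let days: i64 = input.trim().strip_suffix('d')?.parse().ok()?;
    if days < 0 {
        return None;
    }
    // Use checked arithmetic so absurdly large inputs fail instead of wrapping.
    days.checked_mul(86_400)
        .and_then(|secs| now_unix.checked_add(secs))
}
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps a helper like this deterministic and easy to test.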


@ -0,0 +1,392 @@
//! Inline ignore comment parsing.
//!
//! This module handles parsing `// aphoria:ignore` comments that suppress
//! specific findings.
//!
//! ## Supported Syntax
//!
//! ### Single-Line Ignore
//!
//! Ignores the finding on the same line:
//!
//! ```text
//! let password = "test123"; // aphoria:ignore - Test credential
//! ```
//!
//! ### Next-Line Ignore
//!
//! Ignores the finding on the following line:
//!
//! ```text
//! // aphoria:ignore-next-line - Intentional for testing
//! .danger_accept_invalid_certs(true)
//! ```
//!
//! ### Block Ignore
//!
//! Ignores all findings within a block:
//!
//! ```text
//! // aphoria:ignore-block - Test fixtures
//! fn test_patterns() {
//! let key = "sk_test_abc123";
//! let password = "hunter2";
//! }
//! // aphoria:end-ignore
//! ```
//!
//! ## Comment Variants
//!
//! The parser supports various comment styles:
//!
//! - `// aphoria:ignore` (Rust, Go, TypeScript, JavaScript, C, C++)
//! - `# aphoria:ignore` (Python, Ruby, YAML, Shell)
//! - `/* aphoria:ignore */` (CSS, block comments)
//! - `-- aphoria:ignore` (SQL)
//! - `<!-- aphoria:ignore -->` (HTML, XML)
use std::collections::HashSet;
use regex::Regex;
/// Parses ignore comments from file content and tracks ignored line numbers.
#[derive(Debug)]
pub struct IgnoreCommentParser {
/// Lines that should be ignored (1-indexed to match ExtractedClaim.line).
ignored_lines: HashSet<usize>,
}
impl IgnoreCommentParser {
    /// Parse ignore comments from file content.
    ///
    /// Returns a parser with the set of ignored line numbers.
    pub fn parse(content: &str) -> Self {
        let mut ignored_lines = HashSet::new();
        // Collect the lines once so the previous line can be looked up in O(1)
        // instead of re-walking the content for every line.
        let lines: Vec<&str> = content.lines().collect();
        // Track if we're in an ignore block
        let mut in_block = false;
        for (line_idx, line) in lines.iter().copied().enumerate() {
            let line_num = line_idx + 1; // 1-indexed
            // Check for block start/end
            if contains_block_start(line) {
                in_block = true;
                // The block start line itself is not ignored (it's a comment)
                continue;
            }
            if contains_block_end(line) {
                in_block = false;
                // The block end line itself is not ignored (it's a comment)
                continue;
            }
            // If we're in a block, ignore this line
            if in_block {
                ignored_lines.insert(line_num);
                continue;
            }
            // Check for same-line ignore
            if contains_same_line_ignore(line) {
                ignored_lines.insert(line_num);
                continue;
            }
            // Check for next-line ignore (look at previous line)
            if line_idx > 0 && contains_next_line_ignore(lines[line_idx - 1]) {
                ignored_lines.insert(line_num);
            }
        }
        Self { ignored_lines }
    }
/// Check if a line number should be ignored.
///
/// Line numbers are 1-indexed (matching ExtractedClaim.line).
pub fn is_ignored(&self, line: usize) -> bool {
self.ignored_lines.contains(&line)
}
/// Get the set of ignored line numbers.
#[allow(dead_code)]
pub fn ignored_lines(&self) -> &HashSet<usize> {
&self.ignored_lines
}
/// Get the count of ignored lines.
#[allow(dead_code)]
pub fn ignored_count(&self) -> usize {
self.ignored_lines.len()
}
}
/// Check if a line contains a same-line ignore comment.
fn contains_same_line_ignore(line: &str) -> bool {
// Match variations:
// // aphoria:ignore
// # aphoria:ignore
// /* aphoria:ignore */
// -- aphoria:ignore
// <!-- aphoria:ignore -->
//
// But NOT:
// // aphoria:ignore-next-line
// // aphoria:ignore-block
// // aphoria:end-ignore
// Using lazy_static would be better, but we'll keep it simple
let patterns = [
r"//\s*aphoria:ignore(?:\s|$|-\s)",
r"#\s*aphoria:ignore(?:\s|$|-\s)",
r"/\*\s*aphoria:ignore(?:\s|$|-\s)",
r"--\s*aphoria:ignore(?:\s|$|-\s)",
r"<!--\s*aphoria:ignore(?:\s|$|-\s)",
];
for pattern in &patterns {
if let Ok(re) = Regex::new(pattern) {
if re.is_match(line) {
// Make sure it's not ignore-next-line or ignore-block
if !line.contains("ignore-next-line") && !line.contains("ignore-block") {
return true;
}
}
}
}
false
}
/// Check if a line contains a next-line ignore comment.
fn contains_next_line_ignore(line: &str) -> bool {
let patterns = [
r"//\s*aphoria:ignore-next-line",
r"#\s*aphoria:ignore-next-line",
r"/\*\s*aphoria:ignore-next-line",
r"--\s*aphoria:ignore-next-line",
r"<!--\s*aphoria:ignore-next-line",
];
for pattern in &patterns {
if let Ok(re) = Regex::new(pattern) {
if re.is_match(line) {
return true;
}
}
}
false
}
/// Check if a line contains a block start marker.
fn contains_block_start(line: &str) -> bool {
let patterns = [
r"//\s*aphoria:ignore-block",
r"#\s*aphoria:ignore-block",
r"/\*\s*aphoria:ignore-block",
r"--\s*aphoria:ignore-block",
r"<!--\s*aphoria:ignore-block",
];
for pattern in &patterns {
if let Ok(re) = Regex::new(pattern) {
if re.is_match(line) {
return true;
}
}
}
false
}
/// Check if a line contains a block end marker.
fn contains_block_end(line: &str) -> bool {
let patterns = [
r"//\s*aphoria:end-ignore",
r"#\s*aphoria:end-ignore",
r"/\*\s*aphoria:end-ignore",
r"--\s*aphoria:end-ignore",
r"<!--\s*aphoria:end-ignore",
];
for pattern in &patterns {
if let Ok(re) = Regex::new(pattern) {
if re.is_match(line) {
return true;
}
}
}
false
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_same_line_ignore() {
let content = r#"
let password = "test123"; // aphoria:ignore - Test credential
let real_password = "hunter2";
"#;
let parser = IgnoreCommentParser::parse(content);
assert!(parser.is_ignored(2), "Line 2 should be ignored");
assert!(!parser.is_ignored(3), "Line 3 should NOT be ignored");
}
#[test]
fn test_same_line_ignore_with_reason() {
let content = r#"
.danger_accept_invalid_certs(true) // aphoria:ignore - Required for self-signed certs
.danger_accept_invalid_certs(true)
"#;
let parser = IgnoreCommentParser::parse(content);
assert!(parser.is_ignored(2), "Line 2 should be ignored");
assert!(!parser.is_ignored(3), "Line 3 should NOT be ignored");
}
#[test]
fn test_next_line_ignore() {
let content = r#"
// aphoria:ignore-next-line - Intentional for testing
.danger_accept_invalid_certs(true)
.danger_accept_invalid_certs(true)
"#;
let parser = IgnoreCommentParser::parse(content);
assert!(!parser.is_ignored(2), "Comment line itself not ignored");
assert!(parser.is_ignored(3), "Line 3 should be ignored");
assert!(!parser.is_ignored(4), "Line 4 should NOT be ignored");
}
#[test]
fn test_block_ignore() {
let content = r#"
// aphoria:ignore-block - Test fixtures
fn test_patterns() {
let key = "sk_test_abc123";
let password = "hunter2";
}
// aphoria:end-ignore
let real_key = "sk_live_real";
"#;
let parser = IgnoreCommentParser::parse(content);
assert!(!parser.is_ignored(2), "Block start not ignored");
assert!(parser.is_ignored(3), "Line 3 should be ignored");
assert!(parser.is_ignored(4), "Line 4 should be ignored");
assert!(parser.is_ignored(5), "Line 5 should be ignored");
assert!(parser.is_ignored(6), "Line 6 should be ignored");
assert!(!parser.is_ignored(7), "Block end not ignored");
assert!(!parser.is_ignored(8), "Line 8 after block NOT ignored");
}
#[test]
fn test_python_style_comments() {
let content = r#"
password = "test123" # aphoria:ignore - Test
# aphoria:ignore-next-line
api_key = "sk_test_123"
"#;
let parser = IgnoreCommentParser::parse(content);
assert!(parser.is_ignored(2), "Python same-line ignore");
assert!(parser.is_ignored(4), "Python next-line ignore");
}
#[test]
fn test_c_style_block_comments() {
let content = r#"
/* aphoria:ignore-block */
const char* password = "test";
const char* key = "secret";
/* aphoria:end-ignore */
const char* real = "production";
"#;
let parser = IgnoreCommentParser::parse(content);
assert!(parser.is_ignored(3), "C-style block line 3");
assert!(parser.is_ignored(4), "C-style block line 4");
assert!(!parser.is_ignored(6), "After block end");
}
#[test]
fn test_sql_style_comments() {
let content = r#"
SELECT * FROM users; -- aphoria:ignore - Debug query
-- aphoria:ignore-next-line
SELECT password FROM secrets;
"#;
let parser = IgnoreCommentParser::parse(content);
assert!(parser.is_ignored(2), "SQL same-line ignore");
assert!(parser.is_ignored(4), "SQL next-line ignore");
}
#[test]
fn test_empty_content() {
let parser = IgnoreCommentParser::parse("");
assert!(parser.ignored_lines().is_empty());
}
#[test]
fn test_no_ignore_comments() {
let content = r#"
let password = "test123";
let api_key = "sk_test_abc";
"#;
let parser = IgnoreCommentParser::parse(content);
assert!(!parser.is_ignored(2));
assert!(!parser.is_ignored(3));
}
#[test]
fn test_ignore_does_not_match_other_patterns() {
let content = r#"
// aphoria:ignore-next-line should not trigger same-line
.danger_accept_invalid_certs(true)
// aphoria:ignore-block should not trigger same-line
let x = 1;
// aphoria:end-ignore should not trigger same-line
"#;
let parser = IgnoreCommentParser::parse(content);
// These comment lines should NOT be treated as same-line ignores
assert!(!parser.is_ignored(2), "ignore-next-line is not same-line");
// But the NEXT line should be ignored
assert!(parser.is_ignored(3), "Next line after ignore-next-line");
// Block content should be ignored
assert!(parser.is_ignored(5), "Content in block");
}
#[test]
fn test_multiple_ignore_regions() {
let content = r#"
let a = 1; // aphoria:ignore
let b = 2;
// aphoria:ignore-next-line
let c = 3;
// aphoria:ignore-block
let d = 4;
// aphoria:end-ignore
let e = 5;
"#;
let parser = IgnoreCommentParser::parse(content);
assert!(parser.is_ignored(2), "a ignored");
assert!(!parser.is_ignored(3), "b not ignored");
assert!(parser.is_ignored(5), "c ignored via next-line");
assert!(parser.is_ignored(7), "d ignored via block");
assert!(!parser.is_ignored(9), "e not ignored");
}
}
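
The four `contains_*` helpers above recompile their regexes on every call, which the code itself flags as something to improve. One possible alternative, sketched here without the `regex` crate, classifies a line in a single pass by checking the longer directives before the bare `aphoria:ignore`. `Directive` and `classify` are hypothetical names; the boundary rule for the bare form is meant to mirror the `(?:\s|$|-\s)` suffix used above.

```rust
/// The kind of inline directive found on a line (hypothetical type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Directive {
    SameLine,
    NextLine,
    BlockStart,
    BlockEnd,
}

/// Comment leaders taken from the module docs above.
const COMMENT_LEADERS: [&str; 5] = ["//", "#", "/*", "--", "<!--"];

/// Classify a line by the first `aphoria:` directive that follows a
/// comment leader, or return `None` if there is no such directive.
pub fn classify(line: &str) -> Option<Directive> {
    for leader in COMMENT_LEADERS {
        for (idx, _) in line.match_indices(leader) {
            let rest = line[idx + leader.len()..].trim_start();
            let tail = match rest.strip_prefix("aphoria:") {
                Some(tail) => tail,
                None => continue,
            };
            // Longer directives first, so they are never read as a bare ignore.
            if tail.starts_with("ignore-next-line") {
                return Some(Directive::NextLine);
            }
            if tail.starts_with("ignore-block") {
                return Some(Directive::BlockStart);
            }
            if tail.starts_with("end-ignore") {
                return Some(Directive::BlockEnd);
            }
            if let Some(after) = tail.strip_prefix("ignore") {
                // Bare ignore must be followed by end of line, whitespace,
                // or a dash plus whitespace (as in "ignore - reason").
                let mut chars = after.chars();
                let bounded = match chars.next() {
                    None => true,
                    Some(c) if c.is_whitespace() => true,
                    Some('-') => chars.next().map_or(false, |c| c.is_whitespace()),
                    Some(_) => false,
                };
                if bounded {
                    return Some(Directive::SameLine);
                }
            }
        }
    }
    None
}
```

Returning a single enum also removes the need for the extra `line.contains("ignore-next-line")` guard, since a line can only be classified one way.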


@ -60,6 +60,7 @@ mod fastapi_security;
 mod flask_security;
 mod hardcoded_secrets;
 mod high_entropy;
+mod ignore_comments;
 mod insecure_cookies;
 mod insecure_deserialization;
 mod jwt_config;
@ -103,6 +104,7 @@ pub use fastapi_security::FastApiSecurityExtractor;
 pub use flask_security::FlaskSecurityExtractor;
 pub use hardcoded_secrets::HardcodedSecretsExtractor;
 pub use high_entropy::HighEntropySecretsExtractor;
+pub use ignore_comments::IgnoreCommentParser;
 pub use insecure_cookies::InsecureCookiesExtractor;
 pub use insecure_deserialization::InsecureDeserializationExtractor;
 pub use jwt_config::JwtConfigExtractor;


@ -18,6 +18,7 @@ use super::fastapi_security::FastApiSecurityExtractor;
 use super::flask_security::FlaskSecurityExtractor;
 use super::hardcoded_secrets::HardcodedSecretsExtractor;
 use super::high_entropy::HighEntropySecretsExtractor;
+use super::ignore_comments::IgnoreCommentParser;
 use super::insecure_cookies::InsecureCookiesExtractor;
 use super::insecure_deserialization::InsecureDeserializationExtractor;
 use super::jwt_config::JwtConfigExtractor;
@ -253,6 +254,9 @@ impl ExtractorRegistry {
     }
     /// Extract claims from content using all applicable extractors.
+    ///
+    /// This method also filters out claims on lines marked with `// aphoria:ignore`
+    /// or similar inline ignore comments. See [`IgnoreCommentParser`] for details.
     #[instrument(skip(self, path_segments, content), fields(file = %file, language = ?language))]
     pub fn extract_all(
         &self,
@ -261,9 +265,13 @@ impl ExtractorRegistry {
         language: Language,
         file: &str,
     ) -> Vec<ExtractedClaim> {
+        // Parse inline ignore comments
+        let ignore_parser = IgnoreCommentParser::parse(content);
         self.for_language(language)
             .iter()
             .flat_map(|e| e.extract(path_segments, content, language, file))
+            .filter(|claim| !ignore_parser.is_ignored(claim.line))
             .collect()
     }
@ -497,4 +505,72 @@ mod tests {
         assert_eq!(registry.extractor_names().len(), BUILTIN_EXTRACTOR_COUNT + 1);
         assert!(registry.extractor_names().contains(&"runtime_added"));
     }
+    #[test]
+    fn test_extract_all_respects_inline_ignore() {
+        let config = AphoriaConfig::default();
+        let registry = ExtractorRegistry::new(&config);
+        // Same finding twice, but one is ignored
+        let content = r#"
+let client = reqwest::Client::builder()
+    .danger_accept_invalid_certs(true) // aphoria:ignore - Test only
+    .build()?;
+
+let client2 = reqwest::Client::builder()
+    .danger_accept_invalid_certs(true)
+    .build()?;
+"#;
+        let claims =
+            registry.extract_all(&["rust".to_string()], content, Language::Rust, "src/client.rs");
+        // Should only find one TLS claim (the non-ignored one)
+        let tls_claims: Vec<_> = claims.iter().filter(|c| c.concept_path.contains("tls")).collect();
+        assert_eq!(tls_claims.len(), 1, "Should have exactly 1 TLS claim (ignored one filtered)");
+        assert_eq!(tls_claims[0].line, 7, "The claim should be from line 7");
+    }
+    #[test]
+    fn test_extract_all_respects_ignore_next_line() {
+        let config = AphoriaConfig::default();
+        let registry = ExtractorRegistry::new(&config);
+        let content = r#"
+// aphoria:ignore-next-line - Intentional for testing
+.danger_accept_invalid_certs(true)
+.danger_accept_invalid_certs(true)
+"#;
+        let claims =
+            registry.extract_all(&["rust".to_string()], content, Language::Rust, "src/client.rs");
+        // Should only find one TLS claim (line 4, not line 3)
+        let tls_claims: Vec<_> = claims.iter().filter(|c| c.concept_path.contains("tls")).collect();
+        assert_eq!(tls_claims.len(), 1);
+        assert_eq!(tls_claims[0].line, 4, "The claim should be from line 4");
+    }
+    #[test]
+    fn test_extract_all_respects_ignore_block() {
+        let config = AphoriaConfig::default();
+        let registry = ExtractorRegistry::new(&config);
+        let content = r#"
+// aphoria:ignore-block - Test fixtures
+.danger_accept_invalid_certs(true)
+let password = "test123";
+// aphoria:end-ignore
+.danger_accept_invalid_certs(true)
+"#;
+        let claims =
+            registry.extract_all(&["rust".to_string()], content, Language::Rust, "src/client.rs");
+        // Should only find claims from line 6 (after block)
+        // Lines 3 and 4 are in the block
+        let tls_claims: Vec<_> = claims.iter().filter(|c| c.concept_path.contains("tls")).collect();
+        assert_eq!(tls_claims.len(), 1);
+        assert_eq!(tls_claims[0].line, 6, "The claim should be from line 6");
+    }
 }


@ -76,9 +76,7 @@ pub async fn handle_command(command: Commands, config: &AphoriaConfig) -> ExitCo
             }
         }
-        Commands::Ack { concept_path, reason, expires } => {
-            policy_ops::handle_ack(concept_path, reason, expires, config).await
-        }
+        Commands::Ack { command } => policy_ops::handle_ack_command(command, config).await,
         Commands::Bless { concept_path, predicate, value, reason } => {
             policy_ops::handle_bless(concept_path, predicate, value, reason, config).await


@ -4,7 +4,20 @@ use std::process::ExitCode;
 use aphoria::{AcknowledgeArgs, AphoriaConfig, BlessArgs, UpdateArgs};
-pub async fn handle_ack(
+use crate::cli::AckCommands;
+/// Handle the ack command and its subcommands.
+pub async fn handle_ack_command(command: AckCommands, config: &AphoriaConfig) -> ExitCode {
+    match command {
+        AckCommands::Add { concept_path, reason, expires } => {
+            handle_ack_add(concept_path, reason, expires, config).await
+        }
+        AckCommands::Export { output } => handle_ack_export(output, config).await,
+        AckCommands::Import { input } => handle_ack_import(input, config).await,
+    }
+}
+async fn handle_ack_add(
     concept_path: String,
     reason: String,
     expires: Option<String>,
@ -28,6 +41,39 @@ pub async fn handle_ack(
     }
 }
+async fn handle_ack_export(output: Option<std::path::PathBuf>, config: &AphoriaConfig) -> ExitCode {
+    match aphoria::export_acks(output, config).await {
+        Ok(stats) => {
+            println!(
+                "Exported {} acknowledgments to {}",
+                stats.exported,
+                stats.output_path.display()
+            );
+            ExitCode::SUCCESS
+        }
+        Err(e) => {
+            eprintln!("Export error: {e}");
+            ExitCode::from(3)
+        }
+    }
+}
+async fn handle_ack_import(input: Option<std::path::PathBuf>, config: &AphoriaConfig) -> ExitCode {
+    match aphoria::import_acks(input, config).await {
+        Ok(stats) => {
+            println!(
+                "Imported {} acknowledgments ({} skipped as duplicates)",
+                stats.imported, stats.skipped
+            );
+            ExitCode::SUCCESS
+        }
+        Err(e) => {
+            eprintln!("Import error: {e}");
+            ExitCode::from(3)
+        }
+    }
+}
 pub async fn handle_bless(
     concept_path: String,
     predicate: String,


@ -39,6 +39,7 @@
//! ``` //! ```
// Module declarations // Module declarations
pub mod ack_file;
mod baseline; mod baseline;
pub mod bridge; pub mod bridge;
pub mod community; pub mod community;
@ -108,8 +109,8 @@ pub use lifecycle::{
}; };
pub use policy::{PackPredicateAliasSet, PolicyManager, SignatureRecord, TrustPack}; pub use policy::{PackPredicateAliasSet, PolicyManager, SignatureRecord, TrustPack};
pub use policy_ops::{ pub use policy_ops::{
- acknowledge, bless, export_policy, import_policy, parse_value, resign_policy, update,
- ImportStats, ResignStats,
+ acknowledge, bless, export_acks, export_policy, import_acks, import_policy, parse_value,
+ resign_policy, update, AckExportStats, AckImportStats, ImportStats, ResignStats,
}; };
pub use promotion::{ pub use promotion::{
compute_metrics_delta, display_candidate, display_candidates_summary, ChangelogEntry, compute_metrics_delta, display_candidate, display_candidates_summary, ChangelogEntry,


@ -1,13 +1,15 @@
//! Policy export and import operations. //! Policy export and import operations.
use std::path::PathBuf;
use tracing::{info, instrument, warn};
use crate::bridge; use crate::bridge;
use crate::config::AphoriaConfig; use crate::config::AphoriaConfig;
use crate::episteme::LocalEpisteme; use crate::episteme::LocalEpisteme;
use crate::error::AphoriaError; use crate::error::AphoriaError;
use crate::policy::{PackPredicateAliasSet, SignatureRecord, TrustPack}; use crate::policy::{PackPredicateAliasSet, SignatureRecord, TrustPack};
use crate::types::{predicates, AcknowledgeArgs, ExtractedClaim, UpdateArgs}; use crate::types::{predicates, AcknowledgeArgs, ExtractedClaim, UpdateArgs};
use std::path::PathBuf;
use tracing::{info, instrument, warn};
/// Export policy from the current project. /// Export policy from the current project.
/// ///
@ -499,3 +501,173 @@ pub fn parse_value(s: &str) -> stemedb_core::types::ObjectValue {
} }
} }
} }
// ============================================================================
// Ack Export/Import (Version Control)
// ============================================================================
/// Statistics from ack export.
#[derive(Debug, Clone, Default)]
pub struct AckExportStats {
/// Number of acks exported.
pub exported: usize,
/// Path to the export file.
pub output_path: std::path::PathBuf,
}
/// Statistics from ack import.
#[derive(Debug, Clone, Default)]
pub struct AckImportStats {
/// Number of acks imported (created new).
pub imported: usize,
/// Number of acks skipped (already exist).
pub skipped: usize,
}
/// Export acknowledgments to a version-controllable TOML file.
///
/// Reads all acks from the local Episteme and writes them to `output`, or to the
/// default path (`.aphoria/acks.toml` under the project root) when `output` is `None`.
/// This file can be committed to version control for team collaboration.
#[instrument(skip(config))]
pub async fn export_acks(
output: Option<std::path::PathBuf>,
config: &AphoriaConfig,
) -> Result<AckExportStats, AphoriaError> {
use crate::ack_file::{AckEntry, AckFile, AckPayload};
info!("Exporting acknowledgments to file");
let project_root = std::env::current_dir()?;
let episteme = LocalEpisteme::open(config, &project_root).await?;
// Fetch all acks from database
let acks = episteme.fetch_acknowledgments().await?;
// Convert to AckFile format
let mut ack_file = AckFile::new();
for assertion in &acks {
// Parse the JSON payload to extract reason and expiry
let payload_text = match &assertion.object {
stemedb_core::types::ObjectValue::Text(t) => t.clone(),
other => format!("{other:?}"),
};
let payload = AckPayload::parse(&payload_text);
// Format created timestamp (timestamp is u64, need to convert for chrono)
let created = i64::try_from(assertion.timestamp)
.ok()
.and_then(|ts| chrono::DateTime::from_timestamp(ts, 0))
.map(|dt| dt.format("%Y-%m-%dT%H:%M:%SZ").to_string())
.unwrap_or_else(|| format!("{}", assertion.timestamp));
let expires = payload.expires_iso();
ack_file.add(AckEntry {
path: assertion.subject.clone(),
reason: payload.reason,
expires,
created,
by: None, // Could add agent_id here if we want to track who created it
});
}
// Determine output path
let output_path = output.unwrap_or_else(|| AckFile::default_path(&project_root));
// Save to file
ack_file.save(&output_path)?;
info!(
count = ack_file.len(),
path = %output_path.display(),
"Acknowledgments exported"
);
Ok(AckExportStats { exported: ack_file.len(), output_path })
}
/// Import acknowledgments from a TOML file into the local Episteme.
///
/// Reads `input`, or the default path (`.aphoria/acks.toml` under the project root)
/// when `input` is `None`, and creates an acknowledgment assertion for each entry.
/// Skips entries whose concept path already has an acknowledgment.
#[instrument(skip(config))]
pub async fn import_acks(
input: Option<std::path::PathBuf>,
config: &AphoriaConfig,
) -> Result<AckImportStats, AphoriaError> {
use crate::ack_file::AckFile;
use crate::expiry;
info!("Importing acknowledgments from file");
let project_root = std::env::current_dir()?;
// Determine input path
let input_path = input.unwrap_or_else(|| AckFile::default_path(&project_root));
if !input_path.exists() {
return Err(AphoriaError::Config(format!("Ack file not found: {}", input_path.display())));
}
// Load ack file
let ack_file = AckFile::load(&input_path)?;
info!(count = ack_file.len(), path = %input_path.display(), "Loaded ack file");
// Open the Episteme only after the file has loaded, so a missing or
// malformed file returns early without leaving an open handle behind
let mut episteme = LocalEpisteme::open(config, &project_root).await?;
// Fetch existing acks to avoid duplicates
let existing_acks = episteme.fetch_acknowledgments().await?;
let existing_paths: std::collections::HashSet<&str> =
existing_acks.iter().map(|a| a.subject.as_str()).collect();
let mut stats = AckImportStats::default();
for entry in &ack_file.acks {
// Check if ack already exists
if existing_paths.contains(entry.path.as_str()) {
stats.skipped += 1;
continue;
}
// Parse expiry if present
let expires_at: Option<i64> = entry.expires.as_deref().and_then(|exp_str| {
// Try RFC 3339 first, then fall back to expiry::parse_expiry for duration strings
// (parse_expiry returns u64, convert to i64)
let parsed = chrono::DateTime::parse_from_rfc3339(exp_str)
.map(|dt| dt.timestamp())
.ok()
.or_else(|| expiry::parse_expiry(exp_str).ok().and_then(|u| i64::try_from(u).ok()));
if parsed.is_none() {
// Surface unparseable values instead of silently importing a non-expiring ack
warn!(path = %entry.path, expires = %exp_str, "Could not parse expiry; importing without one");
}
parsed
});
// Build ack payload as JSON
let ack_payload = serde_json::json!({
"reason": entry.reason,
"expires_at": expires_at,
});
// Create acknowledgment assertion
let claim = ExtractedClaim {
concept_path: entry.path.clone(),
predicate: predicates::ACKNOWLEDGED.to_string(),
value: stemedb_core::types::ObjectValue::Text(ack_payload.to_string()),
file: "aphoria_ack_import".to_string(),
line: 0,
matched_text: format!("Acknowledged: {}", entry.reason),
confidence: 1.0,
description: format!("Imported: {}", entry.reason),
};
episteme.ingest_claims(&[claim]).await?;
stats.imported += 1;
}
episteme.shutdown().await;
info!(imported = stats.imported, skipped = stats.skipped, "Acknowledgments imported");
Ok(stats)
}
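
The duplicate check in `import_acks` can be sketched in isolation. The block below is a minimal, standard-library-only illustration, not the commit's code: `Entry`, `Stats`, and `plan_import` are hypothetical names, and the Episteme is replaced by a plain slice of existing concept paths. One deliberate difference from the code above: the sketch also skips a path that appears twice within the same file, which the original loop would ingest twice because its set of existing paths is built once, before the loop.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for one entry of the ack file.
pub struct Entry {
    pub path: String,
    pub reason: String,
}

/// Counters mirroring the fields of `AckImportStats`.
#[derive(Debug, Default, PartialEq)]
pub struct Stats {
    pub imported: usize,
    pub skipped: usize,
}

/// Decide which entries are new, given the concept paths already stored.
/// Returns the counters plus the entries that would be ingested.
pub fn plan_import<'a>(existing: &[&str], entries: &'a [Entry]) -> (Stats, Vec<&'a Entry>) {
    let mut seen: HashSet<&str> = existing.iter().copied().collect();
    let mut stats = Stats::default();
    let mut to_ingest = Vec::new();
    for entry in entries {
        // `insert` returns false when the path is already present, which
        // also catches a path repeated within the file itself.
        if !seen.insert(entry.path.as_str()) {
            stats.skipped += 1;
            continue;
        }
        stats.imported += 1;
        to_ingest.push(entry);
    }
    (stats, to_ingest)
}
```

Separating the decision from the ingest step keeps the skip logic testable without a database, at the cost of holding the planned entries in memory until they are written.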


@ -0,0 +1,179 @@
//! `.aphoriaignore` file parser.
//!
//! This module handles loading and parsing `.aphoriaignore` files, which use
//! gitignore-style syntax for excluding paths from scanning.
//!
//! ## Syntax
//!
//! The `.aphoriaignore` file follows a subset of gitignore conventions:
//!
//! - Lines starting with `#` are comments
//! - Empty lines are ignored
//! - Leading and trailing whitespace is stripped
//! - Patterns use glob syntax (see `walker/mod.rs` for details)
//!
//! ## Example
//!
//! ```text
//! # .aphoriaignore
//!
//! # Test fixtures - intentionally vulnerable
//! uat/fixtures/
//! tests/fixtures/
//!
//! # Demo apps
//! docs/demo/vulnbank/
//!
//! # Pattern definitions in extractor source
//! applications/aphoria/src/extractors/*.rs
//!
//! # Vendor code
//! vendor/
//! third_party/
//! ```
use std::path::Path;
use crate::AphoriaError;
/// File name for the ignore file.
pub const IGNORE_FILE_NAME: &str = ".aphoriaignore";
/// Patterns loaded from an `.aphoriaignore` file.
#[derive(Debug, Default)]
pub struct IgnorePatterns {
/// The parsed patterns.
patterns: Vec<String>,
}
impl IgnorePatterns {
/// Create an empty set of patterns.
pub fn empty() -> Self {
Self { patterns: Vec::new() }
}
/// Load patterns from `.aphoriaignore` in the given directory.
///
/// If the file doesn't exist, returns an empty pattern set.
/// If the file exists but can't be read, returns an error.
pub fn load(root: &Path) -> Result<Self, AphoriaError> {
let ignore_path = root.join(IGNORE_FILE_NAME);
if !ignore_path.exists() {
return Ok(Self::empty());
}
let content = std::fs::read_to_string(&ignore_path)
.map_err(|e| AphoriaError::Walker(format!("Failed to read {IGNORE_FILE_NAME}: {e}")))?;
Ok(Self::parse(&content))
}
/// Parse patterns from a string (the contents of `.aphoriaignore`).
pub fn parse(content: &str) -> Self {
let patterns = content
.lines()
.map(|line| line.trim())
.filter(|line| !line.is_empty() && !line.starts_with('#'))
.map(|line| line.to_string())
.collect();
Self { patterns }
}
/// Get the parsed patterns.
pub fn patterns(&self) -> &[String] {
&self.patterns
}
/// Check if the pattern set is empty.
#[allow(dead_code)]
pub fn is_empty(&self) -> bool {
self.patterns.is_empty()
}
/// Get the number of patterns.
#[allow(dead_code)]
pub fn len(&self) -> usize {
self.patterns.len()
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::fs;
use tempfile::TempDir;
#[test]
fn test_parse_empty() {
let patterns = IgnorePatterns::parse("");
assert!(patterns.is_empty());
}
#[test]
fn test_parse_comments_only() {
let content = r#"
# This is a comment
# Another comment
"#;
let patterns = IgnorePatterns::parse(content);
assert!(patterns.is_empty());
}
#[test]
fn test_parse_simple_patterns() {
let content = r#"
# Test fixtures
uat/fixtures/
tests/fixtures/
# Demo apps
docs/demo/vulnbank/
"#;
let patterns = IgnorePatterns::parse(content);
assert_eq!(patterns.len(), 3);
assert!(patterns.patterns().contains(&"uat/fixtures/".to_string()));
assert!(patterns.patterns().contains(&"tests/fixtures/".to_string()));
assert!(patterns.patterns().contains(&"docs/demo/vulnbank/".to_string()));
}
#[test]
fn test_parse_glob_patterns() {
let content = r#"
**/test_fixtures/**
**/*_test.rs
applications/aphoria/src/extractors/*.rs
"#;
let patterns = IgnorePatterns::parse(content);
assert_eq!(patterns.len(), 3);
assert!(patterns.patterns().contains(&"**/test_fixtures/**".to_string()));
assert!(patterns.patterns().contains(&"**/*_test.rs".to_string()));
}
#[test]
fn test_load_nonexistent() {
let temp_dir = TempDir::new().expect("create temp dir");
let patterns = IgnorePatterns::load(temp_dir.path()).expect("load patterns");
assert!(patterns.is_empty());
}
#[test]
fn test_load_existing() {
let temp_dir = TempDir::new().expect("create temp dir");
let content = "uat/fixtures/\ntests/";
fs::write(temp_dir.path().join(IGNORE_FILE_NAME), content).expect("write file");
let patterns = IgnorePatterns::load(temp_dir.path()).expect("load patterns");
assert_eq!(patterns.len(), 2);
}
#[test]
fn test_trim_whitespace() {
let content = " uat/fixtures/ \n tests/ ";
let patterns = IgnorePatterns::parse(content);
assert_eq!(patterns.len(), 2);
assert!(patterns.patterns().contains(&"uat/fixtures/".to_string()));
assert!(patterns.patterns().contains(&"tests/".to_string()));
}
}
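
The line filter in `IgnorePatterns::parse` is small enough to try standalone. The sketch below re-implements the same trim-and-filter steps using only the standard library; `parse_ignore` is a hypothetical name used for illustration. It makes one edge case visible: a `#` is only treated as a comment marker at the start of a (trimmed) line, so a trailing `# ...` on a pattern line becomes part of the pattern.

```rust
/// Standalone sketch of the line filter applied to `.aphoriaignore` content:
/// trim each line, then drop blank lines and lines that start with `#`.
pub fn parse_ignore(content: &str) -> Vec<String> {
    content
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(str::to_string)
        .collect()
}
```

For example, the input `"# c\n\n  uat/fixtures/  \nvendor/ # deps\n"` yields the two patterns `uat/fixtures/` and `vendor/ # deps`; the second is probably not what the author of such a line intended, so comments are best kept on their own lines.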


@ -5,16 +5,40 @@
//! 2. Detects the primary language //! 2. Detects the primary language
//! 3. Maps file paths to ConceptPath segments //! 3. Maps file paths to ConceptPath segments
//! 4. Filters files based on configuration //! 4. Filters files based on configuration
//!
//! ## Exclude Patterns
//!
//! The `exclude` list in `aphoria.toml` supports glob patterns:
//!
//! - `*` matches any sequence of characters except `/`
//! - `**` matches any sequence of characters including `/`
//! - `?` matches any single character except `/`
//! - `[abc]` matches any character in the brackets
//! - `{a,b}` matches either `a` or `b`
//!
//! Examples:
//! ```toml
//! [scan]
//! exclude = [
//! "target/**", # All files under target/
//! "**/test_fixtures/**", # test_fixtures anywhere
//! "**/*_test.rs", # All Rust test files
//! "docs/demo/**", # Demo apps
//! ]
//! ```
mod git; mod git;
mod ignore_file;
mod language; mod language;
mod path_mapper; mod path_mapper;
pub use git::{find_repo_root, get_staged_files}; pub use git::{find_repo_root, get_staged_files};
pub use ignore_file::IgnorePatterns;
pub use path_mapper::PathMapper; pub use path_mapper::PathMapper;
use std::path::Path; use std::path::Path;
use globset::{Glob, GlobSet, GlobSetBuilder};
use ignore::WalkBuilder; use ignore::WalkBuilder;
use crate::config::AphoriaConfig; use crate::config::AphoriaConfig;
@ -37,12 +61,80 @@ pub struct WalkedFile {
pub path_segments: Vec<String>, pub path_segments: Vec<String>,
} }
/// Build a GlobSet from a list of exclude patterns.
///
/// Supports standard glob syntax:
/// - `*` matches any sequence of characters except `/`
/// - `**` matches any sequence of characters including `/`
/// - `?` matches any single character except `/`
/// - `[abc]` matches any character in the brackets
/// - `{a,b}` matches either `a` or `b`
///
/// For backwards compatibility, a pattern without glob characters is treated as a
/// directory prefix: it matches the path itself and everything under it.
fn build_exclude_globset(patterns: &[String]) -> Result<GlobSet, AphoriaError> {
let mut builder = GlobSetBuilder::new();
for pattern in patterns {
// Normalize pattern: remove trailing slashes for consistent matching
let pattern = pattern.trim_end_matches('/');
// Check if pattern contains glob characters
let has_glob_chars = pattern.contains('*')
|| pattern.contains('?')
|| pattern.contains('[')
|| pattern.contains('{');
let glob_pattern = if has_glob_chars {
// Use pattern as-is (it's already a glob)
pattern.to_string()
} else {
// Legacy prefix pattern: convert to glob by adding /**
format!("{pattern}/**")
};
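// literal_separator(true) keeps `*` and `?` from matching `/`, as documented above;
// globset's default lets them cross directory boundaries
let glob = globset::GlobBuilder::new(&glob_pattern)
.literal_separator(true)
.build()
.map_err(|e| {
AphoriaError::Walker(format!("Invalid exclude pattern '{pattern}': {e}"))
})?;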
builder.add(glob);
// Also match the directory itself (not just its contents)
if !has_glob_chars {
let dir_glob = Glob::new(pattern).map_err(|e| {
AphoriaError::Walker(format!("Invalid exclude pattern '{pattern}': {e}"))
})?;
builder.add(dir_glob);
}
}
builder
.build()
.map_err(|e| AphoriaError::Walker(format!("Failed to build exclude patterns: {e}")))
}
/// Walk a project directory and yield files for extraction. /// Walk a project directory and yield files for extraction.
pub fn walk_project(root: &Path, config: &AphoriaConfig) -> Result<Vec<WalkedFile>, AphoriaError> { pub fn walk_project(root: &Path, config: &AphoriaConfig) -> Result<Vec<WalkedFile>, AphoriaError> {
// Load .aphoriaignore if present
let ignore_patterns = IgnorePatterns::load(root)?;
walk_project_with_ignore(root, config, &ignore_patterns)
}
/// Walk a project directory with pre-loaded ignore patterns.
pub fn walk_project_with_ignore(
root: &Path,
config: &AphoriaConfig,
ignore_patterns: &IgnorePatterns,
) -> Result<Vec<WalkedFile>, AphoriaError> {
if !root.exists() { if !root.exists() {
return Err(AphoriaError::ProjectNotFound(root.to_path_buf())); return Err(AphoriaError::ProjectNotFound(root.to_path_buf()));
} }
// Combine config excludes with .aphoriaignore patterns
let mut all_excludes = config.scan.exclude.clone();
all_excludes.extend(ignore_patterns.patterns().iter().cloned());
// Build globset for exclude patterns
let exclude_globset = build_exclude_globset(&all_excludes)?;
let mut files = Vec::new(); let mut files = Vec::new();
let mapper = PathMapper::new(root, config); let mapper = PathMapper::new(root, config);
@ -74,7 +166,7 @@ pub fn walk_project(root: &Path, config: &AphoriaConfig) -> Result<Vec<WalkedFil
for entry in walker { for entry in walker {
let entry = entry let entry = entry
.map_err(|e| AphoriaError::Walker(format!("Failed to read directory entry: {e}")))?; .map_err(|e| AphoriaError::Walker(format!("Failed to read directory entry: {e}")))?;
- if let Some(walked) = process_file(entry.path(), root, config, &mapper) {
+ if let Some(walked) = process_file(entry.path(), root, config, &mapper, &exclude_globset) {
files.push(walked); files.push(walked);
} }
} }
@ -102,6 +194,7 @@ fn process_file(
scan_root: &Path, scan_root: &Path,
config: &AphoriaConfig, config: &AphoriaConfig,
mapper: &PathMapper, mapper: &PathMapper,
exclude_globset: &GlobSet,
) -> Option<WalkedFile> { ) -> Option<WalkedFile> {
// Skip directories // Skip directories
if path.is_dir() { if path.is_dir() {
@ -119,8 +212,8 @@ fn process_file(
let relative = path.strip_prefix(scan_root).ok()?; let relative = path.strip_prefix(scan_root).ok()?;
let relative_str = relative.to_string_lossy().to_string(); let relative_str = relative.to_string_lossy().to_string();
- // Check exclusions
- if config.scan.exclude.iter().any(|ex| relative_str.starts_with(ex.trim_end_matches('/'))) {
+ // Check exclusions using glob patterns
+ if exclude_globset.is_match(&relative_str) {
return None; return None;
} }
@ -168,6 +261,16 @@ pub fn walk_staged_files(
// Canonicalize scan_root for path comparisons // Canonicalize scan_root for path comparisons
let scan_root_canonical = scan_root.canonicalize().unwrap_or_else(|_| scan_root.to_path_buf()); let scan_root_canonical = scan_root.canonicalize().unwrap_or_else(|_| scan_root.to_path_buf());
// Load .aphoriaignore if present
let ignore_patterns = IgnorePatterns::load(&scan_root_canonical)?;
// Combine config excludes with .aphoriaignore patterns
let mut all_excludes = config.scan.exclude.clone();
all_excludes.extend(ignore_patterns.patterns().iter().cloned());
// Build globset for exclude patterns
let exclude_globset = build_exclude_globset(&all_excludes)?;
let mapper = PathMapper::new(&scan_root_canonical, config); let mapper = PathMapper::new(&scan_root_canonical, config);
let mut files = Vec::new(); let mut files = Vec::new();
@ -184,7 +287,9 @@ pub fn walk_staged_files(
} }
// Use shared helper for filtering and processing // Use shared helper for filtering and processing
- if let Some(walked) = process_file(&path_canonical, &scan_root_canonical, config, &mapper) {
+ if let Some(walked) =
+ process_file(&path_canonical, &scan_root_canonical, config, &mapper, &exclude_globset)
+ {
files.push(walked); files.push(walked);
} }
} }
@ -234,4 +339,126 @@ mod tests {
// Check that .git is excluded // Check that .git is excluded
assert!(!paths.iter().any(|p| p.contains(".git")), ".git should be excluded"); assert!(!paths.iter().any(|p| p.contains(".git")), ".git should be excluded");
} }
#[test]
fn test_glob_exclude_patterns() {
let temp_dir = TempDir::new().expect("create temp dir");
let root = temp_dir.path();
// Create a nested structure
fs::create_dir_all(root.join("src/auth")).expect("create src/auth");
fs::create_dir_all(root.join("uat/fixtures")).expect("create uat/fixtures");
fs::create_dir_all(root.join("docs/demo/vulnbank")).expect("create docs/demo/vulnbank");
// Create files
fs::write(root.join("src/auth/jwt.rs"), "fn verify() {}").expect("write jwt.rs");
fs::write(root.join("src/auth/jwt_test.rs"), "fn test() {}").expect("write jwt_test.rs");
fs::write(root.join("uat/fixtures/bad_tls.rs"), "// insecure").expect("write fixture");
fs::write(root.join("docs/demo/vulnbank/main.rs"), "// vuln").expect("write vulnbank");
// Configure glob excludes
let mut config = AphoriaConfig::default();
config.scan.exclude = vec![
"uat/fixtures/**".to_string(),
"docs/demo/**".to_string(),
"**/*_test.rs".to_string(),
];
config.scan.include_tests = true; // Don't let test detection interfere
let files = walk_project(root, &config).expect("walk project");
let paths: Vec<_> = files.iter().map(|f| f.relative_path.as_str()).collect();
// jwt.rs should be included
assert!(paths.contains(&"src/auth/jwt.rs"), "jwt.rs should be included");
// Test file should be excluded by glob
assert!(
!paths.contains(&"src/auth/jwt_test.rs"),
"jwt_test.rs should be excluded by **/*_test.rs"
);
// Fixtures should be excluded
assert!(
!paths.iter().any(|p| p.contains("uat/fixtures")),
"uat/fixtures should be excluded"
);
// Vulnbank should be excluded
assert!(!paths.iter().any(|p| p.contains("vulnbank")), "vulnbank should be excluded");
}
#[test]
fn test_legacy_prefix_patterns() {
let temp_dir = TempDir::new().expect("create temp dir");
let root = temp_dir.path();
// Create structure
fs::create_dir_all(root.join("vendor/lib")).expect("create vendor/lib");
fs::create_dir_all(root.join("src")).expect("create src");
// Create files
fs::write(root.join("vendor/lib/dep.rs"), "fn dep() {}").expect("write dep.rs");
fs::write(root.join("src/main.rs"), "fn main() {}").expect("write main.rs");
// Use legacy prefix pattern (no glob chars)
let mut config = AphoriaConfig::default();
config.scan.exclude = vec!["vendor/".to_string()];
let files = walk_project(root, &config).expect("walk project");
let paths: Vec<_> = files.iter().map(|f| f.relative_path.as_str()).collect();
// main.rs should be included
assert!(paths.contains(&"src/main.rs"), "main.rs should be included");
// vendor should be excluded (legacy prefix behavior)
assert!(!paths.iter().any(|p| p.contains("vendor")), "vendor should be excluded");
}
#[test]
fn test_aphoriaignore_file() {
let temp_dir = TempDir::new().expect("create temp dir");
let root = temp_dir.path();
// Create structure
fs::create_dir_all(root.join("uat/fixtures")).expect("create uat/fixtures");
fs::create_dir_all(root.join("src")).expect("create src");
// Create files
fs::write(root.join("uat/fixtures/bad.rs"), "// bad").expect("write bad.rs");
fs::write(root.join("src/main.rs"), "fn main() {}").expect("write main.rs");
// Create .aphoriaignore
fs::write(root.join(".aphoriaignore"), "uat/fixtures/").expect("write .aphoriaignore");
let config = AphoriaConfig::default();
let files = walk_project(root, &config).expect("walk project");
let paths: Vec<_> = files.iter().map(|f| f.relative_path.as_str()).collect();
// main.rs should be included
assert!(paths.contains(&"src/main.rs"), "main.rs should be included");
// uat/fixtures should be excluded via .aphoriaignore
assert!(
!paths.iter().any(|p| p.contains("uat/fixtures")),
"uat/fixtures should be excluded"
);
}
#[test]
fn test_build_exclude_globset_valid() {
let patterns =
vec!["target/**".to_string(), "**/test_fixtures/**".to_string(), "vendor/".to_string()];
let globset = build_exclude_globset(&patterns).expect("build globset");
assert!(globset.is_match("target/debug/build"));
assert!(globset.is_match("src/test_fixtures/bad.rs"));
assert!(globset.is_match("vendor/lib.rs"));
}
#[test]
fn test_build_exclude_globset_invalid() {
let patterns = vec!["[invalid".to_string()];
let result = build_exclude_globset(&patterns);
assert!(result.is_err());
}
} }
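
The pattern normalization inside `build_exclude_globset` can be examined without the `globset` crate. The sketch below reproduces only the string handling (trailing-slash trimming and the legacy-prefix expansion) and returns the glob strings that would be registered for one input pattern; `expand_exclude` is a hypothetical helper written for illustration, not part of the commit.

```rust
/// Return the glob strings that a single exclude pattern expands to,
/// following the same steps as the normalization in `build_exclude_globset`.
pub fn expand_exclude(pattern: &str) -> Vec<String> {
    // Trailing slashes are dropped before anything else.
    let pattern = pattern.trim_end_matches('/');
    let has_glob_chars = pattern.chars().any(|c| matches!(c, '*' | '?' | '[' | '{'));
    if has_glob_chars {
        // Already a glob: registered unchanged.
        vec![pattern.to_string()]
    } else {
        // Legacy prefix: everything under the path, plus the path itself.
        vec![format!("{pattern}/**"), pattern.to_string()]
    }
}
```

With this helper, `vendor/` expands to `vendor/**` and `vendor`, while `**/*_test.rs` is passed through as-is. A globbed directory pattern such as `**/fixtures/` is reduced to `**/fixtures` with no `/**` suffix, which suggests it would match only an entry named `fixtures` and not the files beneath it; writing `**/fixtures/**` avoids relying on that case.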