jordan b3e8a9a058 feat: Multi-application expansion with chaos testing and community UI

Major additions:
- Community Next.js app (port 18187) for browsing claims with API docs
- stemedb-chaos crate: Fault injection, chaos testing, CRDT properties
- Latent ingestion system: Reddit/FDA ingesters with ADK-Go agents
- Disputed claims handling: Manual review workflows and validation
- Aphoria security scanner: New extractors (SQL injection, command
  injection, weak crypto, TLS version), policy-based ignores, UAT reports
- Docker infrastructure: Dockerfile, docker-compose.yml for full stack
- VulnBank demo: Intentionally vulnerable multi-language test corpus

SDK & API enhancements:
- Source registry handlers for tracking data provenance
- Metrics endpoint
- Skeptic filtering improvements

Code quality:
- Split 14 large files (>500 lines) into focused modules
- All files now under 500-line limit per project guidelines

Documentation:
- Chaos testing guide, circuit breakers, observability docs
- Phase 7 UAT documentation updates
- Martin Kleppmann technical writer agent

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-04 01:24:14 -07:00

8.3 KiB

Raw Blame History

UAT Report: Citadel Scan

Date: 2026-02-03 Target: /Users/jordanwashburn/Workspace/orchard9/citadel Aphoria Version: 0.1.0 Status: PASS

Executive Summary

Ran Aphoria scan against the Citadel codebase (a production Rust monorepo with ~50 crates). Initial scan revealed critical precision issues with extractors producing 98% false positives. After bug fixes, precision improved to 100% with only 3 true positive findings remaining.

Metric	Before Fixes	After Fixes
Files scanned	1,438	1,438
Claims extracted	1,259	1,259
Conflicts reported	168	3
True positives	~3	3
False positives	~165	0
Precision	~2%	100%
Scan time	1.7s	1.7s

Test Environment

Target Codebase: Citadel

Type: Rust workspace with ~50 crates
Domain: Log aggregation and observability platform
Notable crates:
- citadel-auth, citadel-auth-db, citadel-auth-oidc (authentication)
- citadel-api (HTTP API)
- citadel-agent (log collection agent)
- ingestion-edge, tiered-storage (data pipeline)
- citadel-community (Next.js frontend)
Languages: Rust, TypeScript, YAML, TOML, JSON

Scan Configuration

cargo run --release -- scan /Users/jordanwashburn/Workspace/orchard9/citadel

Mode: Ephemeral (no persistence)
Format: Table (default), JSON for analysis

Bugs Discovered and Fixed

Bug #1: Rate Limit Regex Matches Embedded Zeros (HIGH)

Symptom: 150 of 168 conflicts were rate_limit false positives.

Example False Positive:

// Line: RateLimiter::new(10); // 10 req/min
// Matched because: "RateLimiter" contains "rate...limit", and "10" contains "0"

Fix: Changed pattern to require assignment context:

// Before
r"(?i)(rate_?limit|ratelimit).*(?:disabled|off|false|0|none|skip)"

// After
r"(?i)(rate_?limit|ratelimit)(?:_enabled)?\s*[:=]\s*(?:disabled|off|false|\b0\b|none|skip)"

Key changes:

Added (?:_enabled)? to match ratelimit_enabled = false
Required \s*[:=]\s* for assignment context
Changed 0 to \b0\b (word boundary) to avoid matching 10, 100, etc.

File: src/extractors/rate_limit.rs:55-58

Bug #2: Rate Limit Numeric Pattern Doesn't Handle Underscores (MEDIUM)

Symptom: rateLimitPerTenant: 100_000 not detected as high limit.

Root Cause: Regex \d+ doesn't match underscore-separated numbers common in YAML/Rust.

Fix:

Changed pattern to [\d_]+ to capture underscored numbers
Added underscore stripping before parsing: value_str.chars().filter(|c| *c != '_').collect()

File: src/extractors/rate_limit.rs:59-70, 143-146

Bug #3: JWT Audience Pattern Matches "audit" (MEDIUM)

Symptom: audit_log_path.is_none() flagged as "JWT audience validation disabled".

Root Cause: Pattern aud.*None matched audit_log_path...None because aud is a prefix of audit.

Example False Positive:

// Line: assert!(config.audit_log_path.is_none());
// Matched because: "audit" starts with "aud", followed eventually by "None"

Fix: Made pattern specific to JWT audience contexts:

// Before
r"(?i)(set_audience.*\[\]|validate_aud.*false|aud.*None|ValidateAudience.*false)"

// After
r"(?i)(set_audience\s*\(\s*(?:vec!)?\s*\[\s*\]\s*\)|validate_aud\w*\s*[:=]\s*false|\baud(?:ience)?\s*[:=]\s*None|ValidateAudience\s*[:=]\s*false)"

Key changes:

Used \baud(?:ience)? with word boundary
Required assignment context [:=]

File: src/extractors/jwt_config.rs:40-47

Bug #4: JWT Signature Verify Pattern Too Broad (MEDIUM)

Symptom: insecure_skip_verify = false flagged as "signature verification disabled" when it's actually the SECURE setting.

Root Cause: Pattern verify.*false matched any line with "verify" followed by "false".

Example False Positives:

// Line: insecure_skip_verify = false  (SECURE - not skipping!)
// Line: const [isVerifying, setIsVerifying] = useState(false)  (React UI state)

Fix: Required specific verify_signature or signature_verify context:

// Before
r"(?i)(dangerous_insecure|skip_signature|verify.*false|RequireSignedTokens.*false)"

// After
r"(?i)(dangerous_insecure|skip_signature|(?:verify_signature|signature_verify)\s*[:=]\s*false|RequireSignedTokens\s*[:=]\s*false)"

File: src/extractors/jwt_config.rs:60-69

Bug #5: JWT Algorithm None Matches Documentation (LOW)

Symptom: Documentation line (no 'alg: none') flagged when it's describing PREVENTION.

Example False Positive:

// Line: /// - Algorithm whitelist (no `alg: none`)
// This is documentation saying alg:none is NOT allowed (secure!)

Fix: Added post-processing filter to exclude prevention documentation:

let is_prevention_doc = lower_line.contains("no alg")
    || lower_line.contains("no `alg")
    || lower_line.contains("whitelist")
    || lower_line.contains("reject")
    // ... etc

File: src/extractors/jwt_config.rs:170-183

Regression Tests Added

Test	Purpose	File
`test_rate_limiter_constructor_not_flagged`	`RateLimiter::new(10)` not flagged	rate_limit.rs
`test_rate_limit_constant_not_flagged`	`RATE_LIMIT: u64 = 100` not flagged	rate_limit.rs
`test_rate_limit_per_tenant_not_flagged`	Underscore numbers handled	rate_limit.rs
`test_rate_limit_explicitly_zero`	`rate_limit = 0` IS flagged	rate_limit.rs
`test_rate_limit_documentation_not_flagged`	Doc comments not flagged	rate_limit.rs
`test_skip_verify_false_not_flagged`	`skip_verify = false` not flagged	jwt_config.rs
`test_ui_state_not_flagged`	`isVerifying` state not flagged	jwt_config.rs
`test_verify_release_not_flagged`	Non-JWT verify functions	jwt_config.rs
`test_actual_signature_verify_disabled`	True positives still caught	jwt_config.rs

Final Scan Results

Command

cargo run --release -- scan /Users/jordanwashburn/Workspace/orchard9/citadel

Output Summary

3 BLOCK, 0 FLAG, 0 PASS

Remaining True Positives

File	Line	Finding
`tools/citadel-cli/src/commands/agent.rs`	903	API key hardcoded in source
`crates/citadel-cli/src/commands/query.rs`	736	API key hardcoded in source
`crates/citadel-agent/src/config.rs`	101	API key hardcoded in source

Analysis: All 3 are example API keys in test code or documentation. They follow the pattern ck_live_* or ck_test_* which correctly matches the hardcoded secrets extractor. These are true positives (low severity) that can be:

Fixed by moving examples to environment variables
Acknowledged with aphoria ack if intentional for documentation

Quality Gates Passed

Check	Status
`cargo test`	158 passed, 0 failed
`cargo clippy -- -D warnings`	0 warnings
`cargo fmt --check`	Formatted

Known Limitations

No TLS conflicts found: Citadel's TLS configuration appears compliant, or patterns aren't matching their specific setup.
Secrets extractor catches test keys: Low-severity true positives in test/doc code. May want to add confidence reduction for test or example path segments.
Algorithm none filter is heuristic: The documentation filter uses keyword matching which could miss edge cases or over-filter.

Recommendations

Add path-aware confidence: Reduce confidence for findings in /tests/, /examples/, or files containing "test" in the name.
Consider YAML-aware extraction: The current line-by-line regex approach could be improved with proper YAML/TOML parsing for structured configs.
Expand test corpus: Use Citadel findings to build a larger regression test suite.
Document patterns: Add ai-lookup entries explaining the regex patterns for future maintainers.

Appendix: Conflict Categories

Before Fixes

Category	Count	%
rate_limit	150	89%
jwt	15	9%
secrets	3	2%
Total	168

After Fixes

Category	Count	%
secrets	3	100%
Total	3

8.3 KiB Raw Blame History

UAT Report: Citadel Scan

Executive Summary

Test Environment

Target Codebase: Citadel

Scan Configuration

Bugs Discovered and Fixed

Bug #1: Rate Limit Regex Matches Embedded Zeros (HIGH)

Bug #2: Rate Limit Numeric Pattern Doesn't Handle Underscores (MEDIUM)

Bug #3: JWT Audience Pattern Matches "audit" (MEDIUM)

Bug #4: JWT Signature Verify Pattern Too Broad (MEDIUM)

Bug #5: JWT Algorithm None Matches Documentation (LOW)

Regression Tests Added

Final Scan Results

Command

Output Summary

Remaining True Positives

Quality Gates Passed

Known Limitations

Recommendations

Appendix: Conflict Categories

Before Fixes

After Fixes

8.3 KiB

Raw Blame History