stemedb/applications/aphoria/docs/guides/the-first-scan.md
jml bb0c33f8d3 fix(api): enable querying of CLI-created community corpus items
2026-02-09 15:54:35 +00:00


Guide: The First Scan — From Zero to Compliance

Target Audience: Developers, Tech Leads
Time to Value: 5 minutes


Introduction

Every codebase contains invisible decisions. When you set verify=false in a config file, you aren't just toggling a boolean; you are making a claim: "Certificate verification is not required for this connection."

Authoritative sources (like RFCs and OWASP) disagree. They claim: "Certificate verification is mandatory."

Aphoria detects this disagreement. It is a "truth linter" that finds where your code contradicts authoritative specs. This guide will take you from zero to your first clean scan.


1. Installation

Aphoria ships as a single binary.

# From source (assuming you have the repo)
cd /path/to/stemedb/applications/aphoria
cargo install --path .

Verify it works:

$ aphoria --version
aphoria 0.1.0

2. Initialize the Cortex

Before scanning, Aphoria needs to know "the truth." It needs a corpus of authoritative assertions (RFCs, OWASP cheat sheets, vendor docs).

$ aphoria init
Initializing Aphoria...
Ingested 1,240 authoritative assertions.
Ready.

This downloads strict security requirements (RFC 7519 for JWT, RFC 5246 for TLS, etc.) into your project database (.aphoria/db).

Note: By default, each project has its own isolated database. To share a database across all projects on your machine, set data_dir = "~/.aphoria/db" in aphoria.toml.

3. The First Scan

Navigate to any project—a Rust backend, a Node.js API, or a Python script.

$ aphoria scan .

You will likely see output like this:

Scanning my-project ...

  BLOCK    code://node/server/tls/cert_verification
         Your code:  rejectUnauthorized: false (server.js:42)
         RFC 5246:   TLS certificate verification MUST be enabled (Tier 0)
         Conflict:   0.92

  BLOCK    code://node/auth/jwt/algorithm
         Your code:  algorithms: ["none"] (auth.js:15)
         RFC 7519:   'none' algorithm MUST NOT be accepted (Tier 0)
         Conflict:   0.98

2 conflicts found (2 BLOCK).

Interpreting the Output

  • BLOCK: A high-confidence conflict with a Regulatory (Tier 0) or Clinical (Tier 1) source. This is likely a bug or security vulnerability.
  • FLAG: A conflict with a lower-tier source (Vendor recommendation, Expert opinion). Worth reviewing, but might be intentional.
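  • Conflict Score: How strongly your code's claim disagrees with the cited source, from 0.0 (no disagreement) to 1.0 (direct contradiction). It appears on the "Conflict:" line of each finding.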

Notice that Aphoria didn't just say "Error." It cited RFC 5246. It told you why it's wrong.
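
The labeling rule described above can be sketched in a few lines. This is a hypothetical model for illustration, not Aphoria's actual implementation: the function name `labelFor` and the `tier` field are assumptions, and the guide gives no numeric threshold for "high-confidence", so none is applied here.

```javascript
// Hypothetical model of the BLOCK / FLAG rule from this guide.
// Assumption: tiers are integers, and lower numbers are more authoritative.
function labelFor(finding) {
  // Tier 0 (Regulatory) and Tier 1 (Clinical) conflicts are blocking.
  // Anything from a lower-tier source (vendor, expert opinion) is only flagged.
  return finding.tier <= 1 ? "BLOCK" : "FLAG";
}

console.log(labelFor({ tier: 0, conflict: 0.92 }));
console.log(labelFor({ tier: 3, conflict: 0.6 }));
```

In this sketch the conflict score is carried along but does not change the label; it only tells you how strong the disagreement is.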

4. Fixing the Drift

You have two choices: Fix or Acknowledge.

Option A: Fix the Code (Compliance)

You realize that rejectUnauthorized: false is a leftover from a debugging session. Delete the line or set it to true.
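
As a sketch of that change, assume the flagged line configures an outbound HTTPS agent. The guide does not show the real server.js, so the helper name `tlsOptions` and the `keepAlive` option below are hypothetical.

```javascript
const https = require("node:https");

// Hypothetical stand-in for the code around server.js:42.
// Before (flagged): new https.Agent({ rejectUnauthorized: false })
function tlsOptions(extra = {}) {
  // Spread caller options first so they cannot switch verification back off.
  return { ...extra, rejectUnauthorized: true };
}

// After: every agent built from these options verifies certificates.
const agent = new https.Agent(tlsOptions({ keepAlive: true }));
```

Node.js verifies certificates by default, so simply deleting the rejectUnauthorized line has the same effect; setting it explicitly makes the intent visible in review.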

Run the scan again:

$ aphoria scan .
0 conflicts found.

Epistemic drift resolved. The code now aligns with the spec.
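
If you want a script to react to the result, one option is to read the summary line. The sketch below assumes the summary keeps the shape shown in this guide ("2 conflicts found (2 BLOCK)." or "0 conflicts found."); the function name `parseSummary` is hypothetical, and this guide does not say whether aphoria also signals conflicts through its exit code.

```javascript
// Assumption: the scan output contains one summary line shaped like
// "2 conflicts found (2 BLOCK)." or "0 conflicts found."
function parseSummary(output) {
  const line = output
    .split("\n")
    .find((l) => /^\d+ conflicts? found/.test(l.trim()));
  if (!line) return null; // no recognizable summary line

  const total = Number(line.trim().match(/^(\d+)/)[1]);
  const block = line.match(/(\d+) BLOCK/);
  return { total, blocking: block ? Number(block[1]) : 0 };
}

const sample = "Scanning my-project ...\n\n2 conflicts found (2 BLOCK).";
console.log(parseSummary(sample));
```

A wrapper script could then fail the build whenever `blocking` is greater than zero.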

Option B: Acknowledge the Deviation (Constructive Disagreement)

Sometimes, you are right and the RFC is wrong for your context.

  • Scenario: This is a local test harness. It uses self-signed certs. rejectUnauthorized: false is correct here.

Instead of ignoring it, you Acknowledge it.

$ aphoria ack "code://node/server/tls/cert_verification" --reason "Local dev harness, self-signed certs OK"

Run the scan again:

$ aphoria scan .

  ACK      code://node/server/tls/cert_verification
         Reason:     "Local dev harness, self-signed certs OK"
         Status:     Acknowledged (passed)

Why this matters

By acknowledging the conflict, you created a new assertion in the database: "In this project context, disabling TLS certificate verification is acceptable."

You didn't just suppress a warning. You created a permanent, signed audit trail of why the security rule was broken.

5. Next Steps

You've successfully aligned your code with reality.

  • If you fixed it, you improved security.
  • If you acknowledged it, you improved documentation and auditability.

Next: Learn how to enforce these rules across your entire company with Federating Truth.