## Problem

CLI-created community corpus items (tier 3) were stored correctly but invisible via API queries. Two issues blocked discoverability:

1. **Prefix mismatch**: API hardcoded 'community://pattern/' for aggregated patterns, but CLI creates 'community://rust/http/...' URIs
2. **Query parameter parsing**: Axum's default parser doesn't support bracket notation (?sources[]=value) used by the dashboard

Result: 0/22 CLI-created items were queryable.

## Solution

### Fix 1: Broaden Community Prefix

- Changed: 'community://pattern/' → 'community://' in corpus handler
- Impact: Now matches both aggregated patterns AND CLI-created items
- Backward compatible: Broader prefix includes narrower results

### Fix 2: Add QsQuery Extractor

- Added: serde_qs dependency + custom QsQuery extractor
- Supports: Bracket notation for array parameters (?sources[]=a&sources[]=b)
- Compatible: Works with JavaScript URLSearchParams standard
- Tested: 3 new unit tests for extractor behavior

## Verification

- ✅ All 22 CLI-created community items now queryable (was 0)
- ✅ Source filtering works: community (22), RFC (2), vendor (5)
- ✅ Multi-source queries work: ?sources[]=community&sources[]=rfc → 24
- ✅ All 89 API tests pass + 3 new extractor tests
- ✅ Clippy clean (0 warnings)
- ✅ No regressions in existing functionality

## Files Changed

- crates/stemedb-api/Cargo.toml: Add serde_qs dependency
- crates/stemedb-api/src/extractors.rs: New QsQuery extractor (117 lines)
- crates/stemedb-api/src/handlers/aphoria/corpus.rs: Use QsQuery, broaden prefix
- crates/stemedb-api/src/lib.rs: Export extractors module

Also includes: Scale-adaptive thresholds, wiki corpus extraction, documentation updates, and dashboard UI improvements from prior work.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
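
The prefix mismatch described under Fix 1 can be illustrated with a small bash sketch. This is not the handler's Rust code: `matches_prefix` is a hypothetical helper, and the example URI is made up to stand in for a CLI-created item. It only shows why the narrow prefix excluded CLI-created URIs while the broad one accepts both kinds.

```bash
#!/usr/bin/env bash
# Illustrative only: the real check lives in the Rust corpus handler.
# matches_prefix is a hypothetical helper; it succeeds when $1 starts with $2.
matches_prefix() {
  local uri="$1" prefix="$2"
  [[ "$uri" == "$prefix"* ]]
}

# Hypothetical URI standing in for a CLI-created community item.
cli_created='community://rust/http/example'

# Old, narrow prefix: the CLI-created URI is filtered out.
matches_prefix "$cli_created" 'community://pattern/' || echo "old prefix: no match"

# New, broad prefix: the same URI is now included.
matches_prefix "$cli_created" 'community://' && echo "new prefix: match"
```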
# Corpus Quick Start Guide

## TL;DR - API is Already Running!

The corpus API is currently serving data at:
- **URL:** `http://localhost:18180/v1/aphoria/corpus`
- **Database:** `~/.aphoria/corpus-db`
- **Data:** 2 RFC items (TLS cert verification, JWT audience validation)

## Test It Right Now

```bash
# Get all RFC corpus items
curl -s 'http://localhost:18180/v1/aphoria/corpus?sources[]=rfc' | jq '.items[].subject'

# Expected output:
# "rfc://5246/tls/certificate_verification"
# "rfc://7519/audience_validation"
```

## Import Production Wiki

```bash
cd ~/Workspace/stemedb
target/release/aphoria corpus import wiki ~/Workspace/orchard9/wiki/content
```

## Start Dashboard

```bash
cd applications/aphoria-dashboard
npm run dev
# Open: http://localhost:3000/corpus
```

## Restart API Later (if needed)

```bash
cd ~/Workspace/stemedb
STEMEDB_DB_DIR=$HOME/.aphoria/corpus-db \
STEMEDB_WAL_DIR=$HOME/.aphoria/corpus-db/wal \
target/release/stemedb-api
```

## Query Examples

```bash
# Get all sources (RFC, OWASP, vendor, community)
curl 'http://localhost:18180/v1/aphoria/corpus'

# Filter by multiple sources
curl 'http://localhost:18180/v1/aphoria/corpus?sources[]=rfc&sources[]=owasp'

# Filter by category
curl 'http://localhost:18180/v1/aphoria/corpus?category=security'

# Pagination
curl 'http://localhost:18180/v1/aphoria/corpus?limit=10&offset=0'
```

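
When the list of sources varies, the bracket-notation query string shown above can be assembled in a small function instead of typed by hand. The following is a minimal sketch: `corpus_url` is a hypothetical helper, not part of the `aphoria` CLI, and it assumes plain source names (such as `rfc` or `owasp`) that need no URL-encoding.

```bash
#!/usr/bin/env bash
# corpus_url is a hypothetical helper, not part of the aphoria CLI.
# It prints the corpus endpoint with one sources[]=<name> pair per argument.
# No URL-encoding is done, so it assumes plain names such as rfc or owasp.
corpus_url() {
  local base='http://localhost:18180/v1/aphoria/corpus'
  local query='' src
  for src in "$@"; do
    # Prefix with "&" only when the query already has a pair in it.
    query+="${query:+&}sources[]=${src}"
  done
  # Emit "?<query>" only when at least one source was given.
  printf '%s%s\n' "$base" "${query:+?$query}"
}
```

For example, `curl -s "$(corpus_url rfc owasp)" | jq '.total_matching'` queries both sources. If a given curl build rejects the brackets as a glob range, adding `-g` (`--globoff`) disables URL globbing.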
## Response Format

```json
{
  "items": [
    {
      "subject": "rfc://5246/tls/certificate_verification",
      "predicate": "enabled",
      "value": "true",
      "source": "rfc://",
      "tier": 0,
      "category": "security",
      "explanation": "TLS certificate verification MUST be enabled...",
      "authority_source": "RFC 5246 Section 7.4.2"
    }
  ],
  "total_matching": 2,
  "sources_included": ["rfc://"]
}
```

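
To scan a response like the one above at a glance, the fields can be flattened to one line per item with `jq`, which this guide already uses. This is a sketch only: `summarize_corpus` is a hypothetical helper, and it assumes the response has the shape shown in this section.

```bash
#!/usr/bin/env bash
# summarize_corpus is a hypothetical helper: it reads a corpus response on
# stdin and prints "tier<TAB>subject<TAB>predicate=value" for each item.
# Requires jq; assumes the response shape documented in this guide.
summarize_corpus() {
  jq -r '.items[] | "\(.tier)\t\(.subject)\t\(.predicate)=\(.value)"'
}
```

A possible invocation is `curl -s 'http://localhost:18180/v1/aphoria/corpus' | summarize_corpus`.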
## Files to Know

- **Corpus DB:** `~/.aphoria/corpus-db/` (shared across projects)
- **Project DB:** `.aphoria/db/` (per-project)
- **Import CLI:** `aphoria corpus import wiki <path>`
- **API Config:** Set `STEMEDB_DB_DIR` to choose database

## Troubleshooting

**Dashboard shows empty results?**
- Check API is running on port 18180
- Verify API is using corpus database: `ps aux | grep stemedb-api`
- Check API logs for database path

**API won't start?**
- Make sure corpus DB exists: `ls ~/.aphoria/corpus-db/`
- Check port not in use: `lsof -i :18180`
- View logs: `tail -f /tmp/api-corpus.log`

**Need to reimport wiki?**
```bash
rm -rf ~/.aphoria/corpus-db
target/release/aphoria corpus import wiki <path>
```

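
Two of the checks in this section (does the corpus DB exist, does the API answer) can be bundled into reusable functions. The following is a rough sketch: both function names are hypothetical, and the defaults are simply the path and URL used in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical helpers that bundle checks from the troubleshooting list;
# the defaults are the path and URL used in this guide.

# Succeeds if the corpus database directory exists.
corpus_db_exists() {
  local dir="${1:-$HOME/.aphoria/corpus-db}"
  if [ -d "$dir" ]; then
    echo "ok: $dir exists"
  else
    echo "missing: $dir" >&2
    return 1
  fi
}

# Succeeds if the corpus endpoint answers within 2 seconds without an
# HTTP error status (curl -f fails on responses of 400 and above).
corpus_api_up() {
  local url="${1:-http://localhost:18180/v1/aphoria/corpus}"
  curl -sf -o /dev/null --max-time 2 "$url"
}

# Usage:
#   corpus_db_exists && corpus_api_up && echo "corpus API looks healthy"
```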
---

✅ **Current Status:** API running, corpus database populated, ready for dashboard!