stemedb/ai-lookup/repo-structure.md
jordan 157dbbb9eb feat: Complete Aphoria Phase 8-9 + UAT suite (90/90 tests passing)
2026-02-06 22:50:55 -07:00


# Repository Structure
This document describes the folder organization for the Episteme (StemeDB) monorepo.
## Top-Level Directories
```
episteme/
├── .claude/ # Claude Code configuration (agents, guides, skills, commands)
├── ai-lookup/ # AI-readable documentation and feature references
├── applications/ # End-user applications and tools
├── batteries/ # Pre-built integrations and batteries-included packages
├── community/ # Community Next.js app (research agent chat UI)
├── crates/ # Rust workspace crates (core database engine)
├── data/ # Sample data and demo datasets
├── docs/ # Human-readable documentation
├── latent/ # Python CLI tools (Latent Signal detection)
├── scripts/ # Build, deploy, and utility scripts
├── sdk/ # Client SDKs (Go, potentially others)
├── uat/ # User Acceptance Testing scenarios and results
└── use-cases/ # Vertical-specific use case documentation
```
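
The layout above can be checked with a short script. This is a minimal sketch and not part of the repository: the directory list is copied from the tree above, and `EXPECTED_DIRS` and `missing_top_level_dirs` are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Top-level directories copied from the tree above.
EXPECTED_DIRS = [
    ".claude", "ai-lookup", "applications", "batteries", "community",
    "crates", "data", "docs", "latent", "scripts", "sdk", "uat", "use-cases",
]


def missing_top_level_dirs(repo_root: str) -> list[str]:
    """Return the expected top-level directories that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_top_level_dirs(".")` from the repository root should return an empty list if the checkout matches this document.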
## `/applications/` - End-User Applications
All standalone applications live here, regardless of language or framework.
| Directory | Description | Tech Stack |
|-----------|-------------|------------|
| `aphoria/` | Code-level truth linter powered by Episteme | Rust |
| `disputed/` | Web app for exploring claim conflicts | Next.js |
| `stemedb-dashboard/` | Admin dashboard for StemeDB | Next.js + shadcn/ui |

**Rules:**
- Each application has its own `package.json`, `Cargo.toml`, or equivalent
- Applications may depend on crates or SDKs from the monorepo
- Each application should have a `README.md` explaining its purpose
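
The manifest and README rules lend themselves to a mechanical check. The sketch below is illustrative only: `check_application` and `MANIFESTS` are hypothetical names, and it recognizes just the two manifest files named in the rules, so an application using an "equivalent" manifest would be flagged incorrectly.

```python
from pathlib import Path

# Manifest files named in the rules above. "Or equivalent" manifests
# for other toolchains are NOT covered by this sketch.
MANIFESTS = ("package.json", "Cargo.toml")


def check_application(app_dir: str) -> list[str]:
    """Return a list of rule violations for one directory under applications/."""
    app = Path(app_dir)
    problems = []
    if not any((app / name).is_file() for name in MANIFESTS):
        problems.append(f"{app.name}: no manifest ({' or '.join(MANIFESTS)})")
    if not (app / "README.md").is_file():
        problems.append(f"{app.name}: missing README.md")
    return problems
```

An empty result means the directory satisfies both checked rules. The dependency rule is a permission rather than a requirement, so there is nothing to verify for it.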
## `/crates/` - Rust Workspace Crates
The core database engine and supporting libraries.
| Crate | Purpose |
|-------|---------|
| `stemedb-core` | Assertion, LifecycleStage, types, signing utilities |
| `stemedb-wal` | Write-ahead log with crash recovery |
| `stemedb-storage` | KVStore, IndexStore, QuarantineStore |
| `stemedb-ingest` | Ingestion pipeline, signature verification |
| `stemedb-query` | Query engine, Materializer |
| `stemedb-lens` | Lenses (Recency, Consensus, Authority, etc.) |
| `stemedb-api` | HTTP API with axum |
| `stemedb-sim` | Simulation and testing |
| `stemedb-merkle` | BLAKE3 Merkle tree |
| `stemedb-rpc` | gRPC node-to-node communication |
| `stemedb-sync` | Merkle sync, gossip, anti-entropy |
| `stemedb-cluster` | SWIM membership, sharding, gateway |
| `stemedb-ontology` | Domain definitions, subject builders |
| `stemedb-chaos` | Chaos testing infrastructure |
## `/sdk/` - Client SDKs
| Directory | Language | Purpose |
|-----------|----------|---------|
| `sdk/go/steme` | Go | HTTP client with Ed25519 signing |
| `sdk/go/adk` | Go | ADK-Go tools for AI agents |
## `/docs/` - Documentation
| Path | Purpose |
|-----------|---------|
| `docs/app-concepts/` | Application concept documents |
| `docs/data-structures.md` | Core data structure reference |
| `docs/demo/` | Demo scripts and materials |
| `docs/research/` | Research documents and design notes |
| `docs/runbooks/` | Operational runbooks (planned) |
## `/.claude/` - Claude Code Configuration
| Directory | Purpose |
|-----------|---------|
| `.claude/agents/` | Specialized agent definitions |
| `.claude/guides/` | Task-specific guidelines |
| `.claude/skills/` | Reusable skill documents |
| `.claude/commands/` | Slash command definitions |
## `/ai-lookup/` - AI-Readable Documentation
Quick reference documents optimized for AI assistants.
| File | Purpose |
|------|---------|
| `index.md` | Entry point and directory |
| `services/sdk.md` | SDK usage reference |
| `features/*.md` | Feature-specific documentation |
| `repo-structure.md` | This file |
## `/community/` - Community App
Next.js application for the research agent chat interface.
- Runs on port 18187
- Uses the Claim component for inline citations
## `/latent/` - Latent Signal
Python CLI tools for adverse event signal detection.
- Follows its own coding rules, distinct from those for the Rust crates
- Uses StemeDB as its backend
## Naming Conventions
- **Crates:** `stemedb-{name}` (lowercase, hyphens)
- **Applications:** descriptive name (e.g., `disputed`, `aphoria`)
- **SDKs:** `sdk/{language}/{package}`
- **Docs:** lowercase with hyphens (e.g., `data-structures.md`)
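
These conventions can be expressed as patterns for use in a lint step. The regular expressions below are one interpretation of the rules above, not an official rule set. The function names are hypothetical, and allowing digits or underscores in some positions is an assumption. Application names are left out because "descriptive name" is not a mechanical rule.

```python
import re

# Assumed reading of "stemedb-{name} (lowercase, hyphens)".
CRATE_RE = re.compile(r"stemedb-[a-z0-9]+(?:-[a-z0-9]+)*")
# Assumed reading of "sdk/{language}/{package}".
SDK_RE = re.compile(r"sdk/[a-z0-9]+/[a-z0-9][a-z0-9_-]*")
# Assumed reading of "lowercase with hyphens" for Markdown docs.
DOC_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*\.md")


def is_valid_crate_name(name: str) -> bool:
    """True if name looks like stemedb-{name} in lowercase with hyphens."""
    return CRATE_RE.fullmatch(name) is not None


def is_valid_sdk_path(path: str) -> bool:
    """True if path looks like sdk/{language}/{package}."""
    return SDK_RE.fullmatch(path) is not None


def is_valid_doc_name(filename: str) -> bool:
    """True if filename is a lowercase, hyphen-separated .md file name."""
    return DOC_RE.fullmatch(filename) is not None
```

For example, `is_valid_crate_name("stemedb-core")` and `is_valid_doc_name("data-structures.md")` both hold, while `is_valid_sdk_path("sdk/steme")` does not because the language segment is missing.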
## Port Allocations
| Port | Service |
|------|---------|
| 18180 | StemeDB HTTP API |
| 18181 | Cluster Gateway |
| 18182 | Cluster RPC |
| 18183 | SWIM Gossip |
| 18184 | Metrics (reserved) |
| 18185 | Admin (reserved) |
| 18186 | Latent Signal |
| 18187 | Community App |
| 18188 | Admin Dashboard |
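
The table can also be kept as data so that tooling can look up a port or detect a double assignment. This is a minimal sketch: the values are copied from the table above, while `PORTS`, `service_for_port`, and `duplicate_ports` are hypothetical names rather than anything defined in the repository.

```python
from collections import Counter
from typing import Optional

# Port assignments copied from the table above.
PORTS = {
    "StemeDB HTTP API": 18180,
    "Cluster Gateway": 18181,
    "Cluster RPC": 18182,
    "SWIM Gossip": 18183,
    "Metrics (reserved)": 18184,
    "Admin (reserved)": 18185,
    "Latent Signal": 18186,
    "Community App": 18187,
    "Admin Dashboard": 18188,
}


def service_for_port(port: int) -> Optional[str]:
    """Return the service assigned to a port, or None if the port is unallocated."""
    for service, assigned in PORTS.items():
        if assigned == port:
            return service
    return None


def duplicate_ports() -> list[int]:
    """Return ports assigned to more than one service (expected to be empty)."""
    counts = Counter(PORTS.values())
    return sorted(port for port, n in counts.items() if n > 1)
```

When a new service is added, extending `PORTS` and asserting that `duplicate_ports()` is empty is a cheap way to catch a collision before it reaches a running cluster.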