docs: reorganize documentation structure for clarity

Major documentation restructure to improve discoverability and reduce duplication.

## Changes

**Deleted (Archived/Consolidated)**:
- Removed duplicate getting started guides
- Archived outdated planning documents
- Consolidated corpus and configuration docs
- Removed obsolete vision/spec files (superseded by vision.md)
- Cleaned up scrapyard and old PDFs

**New Structure**:
- docs/about/ - Project overview and introduction
- docs/guides/ - User guides (moved from root)
- docs/specs/ - Technical specifications
- docs/sdk/ - SDK documentation (Go)
- docs/references/ - API references
- docs/archive/ - Archived historical docs
- applications/aphoria/docs/advanced/ - Advanced topics
- applications/aphoria/docs/reference/ - CLI reference
- applications/aphoria/docs/archive/ - Archived aphoria docs

**Updated**:
- README.md - New root README with clear navigation
- CONTRIBUTING.md - Contribution guidelines
- CLAUDE.md - Updated paths to new structure
- roadmap.md - Added recent completions

## Files Changed
- 57 files changed
- 1,977 insertions(+)
- 961 deletions(-)

**Net change**: +1,016 lines (added CONTRIBUTING.md, README.md, reorganized content)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
jml 2026-02-11 07:33:40 +00:00
parent e758f2ebfb
commit 9bfa626203
57 changed files with 1977 additions and 961 deletions

@ -4,7 +4,7 @@
## Framework ## Framework
All code MUST follow patterns in [CODING_GUIDELINES.md](../../../CODING_GUIDELINES.md). All code MUST follow patterns in [Coding Guidelines](../coding-guidelines.md).
## Core Principles ## Core Principles

@ -51,7 +51,7 @@ Settings (`.vscode/settings.json`):
``` ```
stemedb/ stemedb/
CLAUDE.md # AI router (start here) CLAUDE.md # AI router (start here)
CODING_GUIDELINES.md # Rust standards .claude/guides/coding-guidelines.md # Rust standards
Cargo.toml # Workspace root Cargo.toml # Workspace root
crates/ crates/
stemedb-core/ # Core types and storage stemedb-core/ # Core types and storage

@ -27,7 +27,7 @@ A probabilistic knowledge graph database that stores Claims, not Facts. Append-o
| **Run tests** | [.claude/guides/local/testing.md](.claude/guides/local/testing.md) | | **Run tests** | [.claude/guides/local/testing.md](.claude/guides/local/testing.md) |
| **Understand quality checks** | [.claude/guides/local/quality-checks.md](.claude/guides/local/quality-checks.md) | | **Understand quality checks** | [.claude/guides/local/quality-checks.md](.claude/guides/local/quality-checks.md) |
| **Learn about simulation** | [ai-lookup/features/simulation.md](ai-lookup/features/simulation.md) | | **Learn about simulation** | [ai-lookup/features/simulation.md](ai-lookup/features/simulation.md) |
| **Advance the simulator** | [arena-roadmap.md](./arena-roadmap.md) | | **Advance the simulator** | [roadmap.md#arena-simulation-roadmap](./roadmap.md#arena-simulation-roadmap) |
| **Work on storage/DAG** | Load skill: `stemedb-core` | | **Work on storage/DAG** | Load skill: `stemedb-core` |
| **Implement a Lens** | Load skill: `stemedb-lens` | | **Implement a Lens** | Load skill: `stemedb-lens` |
| **Work on domain ontology** | `crates/stemedb-ontology/` | | **Work on domain ontology** | `crates/stemedb-ontology/` |

431
CONTRIBUTING.md Normal file
@ -0,0 +1,431 @@
# Contributing to Episteme
Thank you for your interest in contributing to Episteme (StemeDB)! This document provides guidelines for contributing code, documentation, and ideas to the project.
---
## Table of Contents
- [Code of Conduct](#code-of-conduct)
- [Getting Started](#getting-started)
- [Development Workflow](#development-workflow)
- [Documentation Standards](#documentation-standards)
- [Coding Standards](#coding-standards)
- [Testing Requirements](#testing-requirements)
- [Commit Guidelines](#commit-guidelines)
- [Pull Request Process](#pull-request-process)
---
## Code of Conduct
**ZERO TOLERANCE FOR MEDIOCRITY:** We build enterprise-grade products that must survive in production. Panics are UNACCEPTABLE. Broken pipe errors are UNACCEPTABLE. Sloppy testing is UNACCEPTABLE. Every line of code ships to paying customers who depend on it.
**Principles:**
- Test everything
- Handle every error
- No shortcuts
- No excuses
- Leave code better than you found it
---
## Getting Started
### Prerequisites
- Rust 1.75+ (`rustup update stable`)
- Git
- Basic understanding of knowledge graphs and conflict resolution
### Initial Setup
```bash
# Clone repository
git clone https://github.com/orchard9/stemedb.git
cd stemedb
# Build workspace
cargo build --workspace
# Run tests
cargo test --workspace
# Verify quality checks pass
cargo clippy --workspace -- -D warnings
cargo fmt --check
```
**[→ Full Setup Guide](./.claude/guides/local/setup.md)**
---
## Development Workflow
### 1. Create a Branch
```bash
# Feature branch
git checkout -b feature/your-feature-name
# Bug fix branch
git checkout -b fix/issue-description
# Documentation branch
git checkout -b docs/what-youre-documenting
```
### 2. Make Changes
- Write clean, readable code
- Add tests for new functionality
- Update documentation as needed
- Run quality checks locally before committing
### 3. Pre-Commit Checks
Before committing, ensure all checks pass:
```bash
# Format code
cargo fmt
# Check for warnings
cargo clippy --workspace -- -D warnings
# Run tests
cargo test --workspace
# Build entire workspace
cargo build --workspace
```
**[→ Quality Checks Guide](./.claude/guides/local/quality-checks.md)**
---
## Documentation Standards
### File Organization
| Location | Content | When to Use |
|----------|---------|-------------|
| **Top level** | Core docs only (README, quickstart, architecture, vision, roadmap) | Essential project docs |
| **docs/** | All other documentation | Everything else |
| **docs/about/** | Audience-specific overviews (investors, public, technical) | Marketing and positioning |
| **docs/guides/** | How-to guides and tutorials | Step-by-step instructions |
| **docs/specs/** | Specifications and RFCs | Technical specifications |
| **docs/sdk/** | SDK and integration guides | Client library docs |
| **docs/research/** | Research and design documents | Exploration and analysis |
| **.claude/guides/** | Developer guides | Coding standards, setup, testing |
### Required Conventions
1. **Directory Indexes**: Every directory with 3+ markdown files needs a README.md index
2. **Guide Structure**: All guides must have:
   - **Prerequisites** section
   - **What You'll Learn** section
   - **See Also** section with related docs
3. **Internal Links**: Use relative paths (e.g., `./quickstart.md`, not absolute URLs)
4. **Top-Level Limit**: Keep top-level directory to <20 files
5. **Code Examples**: Include runnable code snippets where possible
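The directory-index rule above can be checked mechanically. The following is a minimal sketch, assuming only the standard library; `dirs_missing_index` is a hypothetical helper, not an existing tool in this repository, and it counts only files with a lowercase `.md` extension.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns every directory under `root` (inclusive) that holds three or more
/// Markdown files but no `README.md` index. Hypothetical helper for illustration.
fn dirs_missing_index(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut missing = Vec::new();
    let mut stack = vec![root.to_path_buf()];

    while let Some(dir) = stack.pop() {
        let mut markdown_files = 0usize;
        let mut has_index = false;

        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                // Visit subdirectories later instead of recursing.
                stack.push(path);
            } else if path.extension().and_then(|ext| ext.to_str()) == Some("md") {
                markdown_files += 1;
                if path.file_name().and_then(|name| name.to_str()) == Some("README.md") {
                    has_index = true;
                }
            }
        }

        if markdown_files >= 3 && !has_index {
            missing.push(dir);
        }
    }

    // Sort so the report is stable across runs.
    missing.sort();
    Ok(missing)
}
```

Calling `dirs_missing_index(Path::new("docs"))` would list the directories that still need an index; an empty list means the rule holds.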
### Writing Style
- **Clear and concise**: Technical but accessible
- **Active voice**: "Run the server" not "The server should be run"
- **Present tense**: "The lens resolves conflicts" not "The lens will resolve"
- **Specific examples**: Show concrete examples, not abstract descriptions
---
## Coding Standards
### Rust Guidelines
See **[Coding Guidelines](./.claude/guides/coding-guidelines.md)** for complete standards.
**Critical Rules:**
1. **No Unwrap**: NEVER use `unwrap()` or `expect()` in production code
   - CI enforces via `clippy::unwrap_used` and `clippy::expect_used` at deny level
   - Use `?` operator or explicit match for error handling
2. **Append-Only**: NEVER mutate existing Assertions
   - Create new assertions instead of modifying
   - Content-addressed: Assertion ID = BLAKE3 hash of content
3. **Structured Logging**: Use `tracing` (info!, warn!, error!)
   - Clippy enforces via `print_stdout`/`print_stderr` at warn level
   - CLI binaries may use `#![allow()]` for user-facing output
4. **Defensive Writes**: All writes go through WAL with fsync
   - Storage operations must be durable
   - Test crash recovery scenarios
5. **Zero-Copy Serialization**: Use `stemedb_core::serde::{serialize, deserialize}`
   - NEVER use raw `AllocSerializer` in production code
   - Prefer rkyv for serialization
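The append-only, content-addressed rule can be pictured with a small sketch. It assumes a simplified `Assertion` with three string fields; `Assertion`, `AssertionLog`, and `content_id` are hypothetical names, and the standard library's `DefaultHasher` stands in for BLAKE3 only so the example needs no external crates. It is not a cryptographic hash and not the project's real ID scheme.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified assertion; the real type lives in the core crate.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Assertion {
    subject: String,
    predicate: String,
    value: String,
}

/// Stand-in for a BLAKE3 digest (assumption made for this sketch only).
type AssertionId = u64;

/// Derives the ID purely from the assertion's content.
fn content_id(assertion: &Assertion) -> AssertionId {
    let mut hasher = DefaultHasher::new();
    assertion.hash(&mut hasher);
    hasher.finish()
}

/// Append-only store: entries are inserted, never updated or removed.
#[derive(Default)]
struct AssertionLog {
    entries: BTreeMap<AssertionId, Assertion>,
}

impl AssertionLog {
    /// Appends an assertion and returns its content-derived ID.
    /// Appending identical content again leaves the log unchanged.
    fn append(&mut self, assertion: Assertion) -> AssertionId {
        let id = content_id(&assertion);
        self.entries.entry(id).or_insert(assertion);
        id
    }

    fn get(&self, id: AssertionId) -> Option<&Assertion> {
        self.entries.get(&id)
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

A correction is expressed by appending a second assertion with the new value. The first assertion keeps its ID and stays readable, which is what lets conflicting claims coexist.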
### Code Organization
```rust
// Good: Clear module structure
mod types;
mod storage;
mod query;

// Good: Explicit error handling
fn process() -> Result<(), Error> {
    let data = fetch_data()?;
    validate(data)?;
    Ok(())
}

// Bad: Unwrap in production
fn process() {
    let data = fetch_data().unwrap(); // ❌ NEVER DO THIS
}
```
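When a function can fail in more than one way, `?` still works if each source error converts into a single error enum. The sketch below shows one way to do that with the standard library only; `LoadError` and `read_sequence_number` are hypothetical names chosen for illustration and are not part of the codebase.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::num::ParseIntError;
use std::path::Path;

/// Hypothetical error type that unifies two failure sources.
#[derive(Debug)]
enum LoadError {
    Io(io::Error),
    Parse(ParseIntError),
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LoadError::Io(err) => write!(f, "I/O failure: {err}"),
            LoadError::Parse(err) => write!(f, "invalid number: {err}"),
        }
    }
}

impl std::error::Error for LoadError {}

impl From<io::Error> for LoadError {
    fn from(err: io::Error) -> Self {
        LoadError::Io(err)
    }
}

impl From<ParseIntError> for LoadError {
    fn from(err: ParseIntError) -> Self {
        LoadError::Parse(err)
    }
}

/// Reads a decimal number from a file without `unwrap()` or `expect()`.
fn read_sequence_number(path: &Path) -> Result<u64, LoadError> {
    // `?` converts `io::Error` into `LoadError` through the `From` impl.
    let text = fs::read_to_string(path)?;
    // The same operator converts `ParseIntError` here.
    let number = text.trim().parse::<u64>()?;
    Ok(number)
}
```

Callers can then `match` on `LoadError` to decide whether a failure is retryable (I/O) or a data problem (parse).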
---
## Testing Requirements
### Test Coverage
- **Unit tests**: All public functions and methods
- **Integration tests**: Critical user workflows
- **Property tests**: Complex algorithms and data structures
- **Chaos tests**: Failure scenarios and recovery
### Running Tests
```bash
# Fast: Single crate
cargo test -p stemedb-core
# Medium: All unit tests
cargo test --workspace --lib
# Full: Parallel runner
cargo nextest run
# Legacy: Sequential
cargo test --workspace
```
**[→ Testing Guide](./.claude/guides/local/testing.md)**
### Test Standards
```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn descriptive_test_name() {
        // Arrange
        let input = create_test_data();

        // Act
        let result = function_under_test(input);

        // Assert
        assert_eq!(result, expected);
    }

    #[test]
    fn handles_error_case() {
        let result = function_with_bad_input();
        assert!(result.is_err());
    }
}
```
---
## Commit Guidelines
### Commit Message Format
```
<type>(<scope>): <subject>

<body>

<footer>
```
**Types:**
- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation only
- `refactor`: Code refactoring
- `test`: Adding or updating tests
- `chore`: Maintenance tasks
**Examples:**
```
feat(lens): add TrustAwareAuthorityLens

Implements lens that weights assertions by agent TrustRank.

- Higher TrustRank agents get more weight
- Reputation updates based on vote outcomes
- Tested with concurrent agent simulation

Closes #123
```
```
fix(wal): prevent data loss on crash during fsync

Changed fsync error handling to retry on EINTR instead
of returning error. Prevents data loss when process
receives signal during write.

Fixes #456
```
```
docs(quickstart): update cluster setup instructions

Added steps for multi-node cluster deployment and
fixed broken links to API reference.
```
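The subject-line format can be checked before a commit is made. This is a minimal sketch, assuming the six types listed above are the full set. `parse_subject` and `CommitSubject` are hypothetical names, and treating the `(<scope>)` part as optional is an assumption of the sketch, since the format above always shows a scope.

```rust
/// Commit types listed in the guidelines.
const ALLOWED_TYPES: [&str; 6] = ["feat", "fix", "docs", "refactor", "test", "chore"];

/// Parsed pieces of a subject line (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
struct CommitSubject<'a> {
    kind: &'a str,
    scope: Option<&'a str>,
    subject: &'a str,
}

/// Parses `<type>(<scope>): <subject>`; returns `None` if the line does not fit.
fn parse_subject(line: &str) -> Option<CommitSubject<'_>> {
    // Everything before the first ": " is the type plus optional scope.
    let (head, subject) = line.split_once(": ")?;
    let (kind, scope) = match head.split_once('(') {
        Some((kind, rest)) => (kind, Some(rest.strip_suffix(')')?)),
        None => (head, None),
    };
    // An explicit but empty scope such as `fix():` is rejected.
    let scope_ok = scope.map_or(true, |s| !s.is_empty());
    if !ALLOWED_TYPES.contains(&kind) || !scope_ok || subject.trim().is_empty() {
        return None;
    }
    Some(CommitSubject { kind, scope, subject })
}
```

A `commit-msg` hook could call `parse_subject` on the first line of the message and reject the commit when it returns `None`.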
### Commit Signing
Commits should be signed with GPG:
```bash
git config --global commit.gpgsign true
git config --global user.signingkey YOUR_GPG_KEY
```
---
## Pull Request Process
### Before Submitting
1. **Tests pass**: All tests must pass locally
2. **Clippy clean**: No warnings from clippy
3. **Formatted**: Code formatted with `cargo fmt`
4. **Documented**: Public APIs have doc comments
5. **Changelog**: Update CHANGELOG.md if applicable
### PR Description Template
```markdown
## Summary
Brief description of changes
## Motivation
Why are these changes needed?
## Changes
- Bullet list of specific changes
- Include file paths for major changes
## Testing
How was this tested?
- [ ] Unit tests
- [ ] Integration tests
- [ ] Manual testing
## Checklist
- [ ] Tests pass (`cargo test --workspace`)
- [ ] Clippy passes (`cargo clippy --workspace -- -D warnings`)
- [ ] Code formatted (`cargo fmt`)
- [ ] Documentation updated
- [ ] CHANGELOG.md updated (if applicable)
```
### Review Process
1. **Automated Checks**: CI runs tests and lints
2. **Code Review**: At least one maintainer must approve
3. **Testing**: Reviewer may run code locally
4. **Merge**: Squash and merge to main
---
## Project Structure
```
stemedb/
├── README.md # Public entry point
├── CONTRIBUTING.md # This file
├── quickstart.md # 5-minute setup
├── architecture.md # System design
├── vision.md # Product philosophy
├── roadmap.md # Current and planned work
├── crates/ # Rust workspace
│ ├── stemedb-core/ # Core types
│ ├── stemedb-wal/ # Write-ahead log
│ ├── stemedb-storage/ # KV store
│ ├── stemedb-ingest/ # Ingestion pipeline
│ ├── stemedb-query/ # Query engine
│ ├── stemedb-lens/ # Conflict resolution
│ └── stemedb-api/ # HTTP API
├── applications/ # Applications built on Episteme
│ ├── aphoria/ # Code-level truth linter
│ ├── stemedb-dashboard/ # Admin dashboard
│ └── disputed/ # Controversy explorer
├── sdk/ # Client libraries
│ └── go/ # Go SDK
├── docs/ # Documentation
│ ├── README.md # Documentation hub
│ ├── about/ # Audience-specific docs
│ ├── guides/ # How-to guides
│ ├── specs/ # Specifications
│ ├── sdk/ # SDK documentation
│ └── research/ # Research documents
└── .claude/ # AI agent guides
    ├── guides/ # Developer guides
    ├── skills/ # Claude Code skills
    └── agents/ # Specialized agents
```
**[→ Full Repo Structure](./docs/README.md)**
---
## Getting Help
| Question | Resource |
|----------|----------|
| How do I... | [Documentation Index](./docs/README.md) |
| Setup issues | [Setup Guide](./.claude/guides/local/setup.md) |
| Test failures | [Testing Guide](./.claude/guides/local/testing.md) |
| Code questions | [GitHub Discussions](https://github.com/orchard9/stemedb/discussions) |
| Bug reports | [GitHub Issues](https://github.com/orchard9/stemedb/issues) |
---
## Recognition
Contributors are recognized in:
- Git commit history
- Release notes
- Project README (for significant contributions)
---
## License
By contributing to Episteme, you agree that your contributions will be licensed under the same license as the project.
---
**Thank you for contributing to Episteme!**
**[← Back to README](./README.md)**

@ -43,7 +43,7 @@ The system follows a "Spine -> Lattice -> Cortex" architecture:
* `stemedb-sim/`: "The Arena" simulation for end-to-end verification. * `stemedb-sim/`: "The Arena" simulation for end-to-end verification.
* `architecture.md`: Detailed system design and data flow. * `architecture.md`: Detailed system design and data flow.
* `roadmap.md`: Phased implementation plan and status. * `roadmap.md`: Phased implementation plan and status.
* `usage.md`: Rust API usage guide and vision for agent interaction. * `docs/sdk/go-usage-guide.md`: Go SDK usage guide and patterns.
* `Makefile`: Build and quality automation. * `Makefile`: Build and quality automation.
## Building and Running ## Building and Running

173
README.md Normal file
@ -0,0 +1,173 @@
# Episteme (StemeDB)
**A probabilistic knowledge graph database that stores Claims, not Facts.**
Append-only Merkle DAG with read-time resolution via Lenses. Think of it as "Git for Truth" - conflicting assertions coexist, resolved at query time through Consensus, Recency, Authority, or custom Lenses.
---
## Quick Start
```bash
# Get running in under 5 minutes
make validate
# Start the server
cargo run --package stemedb-api
# Open API docs
open http://localhost:18180/swagger-ui
```
**[→ Full Quick Start Guide](./quickstart.md)**
---
## Understanding Episteme
- **[What is Episteme?](./what-is-episteme.md)** - Concept overview and real-world examples
- **[Vision](./vision.md)** - Product philosophy and "Git for Truth" principles
- **[Architecture](./architecture.md)** - Technical design and data structures
- **[Use Cases](./use-cases/README.md)** - Consumer health, financial due diligence, AI agents
---
## Documentation
- **[📚 Full Documentation Index](./docs/README.md)** - Complete documentation hub
- **[App Development Guide](./docs/app-concepts/index.md)** - Build applications on Episteme
- **[Go SDK](./docs/sdk/go-sdk.md)** - Client library and examples
- **[ADK-Go Integration](./docs/references/go-adk/reference-guide.md)** - AI agent integration
- **[RFCs & Specs](./docs/rfcs/README.md)** - Technical specifications
---
## For Developers
### Getting Started
- **[Development Setup](./.claude/guides/local/setup.md)** - Local environment setup
- **[Testing Guide](./.claude/guides/local/testing.md)** - Running tests
- **[Coding Guidelines](./.claude/guides/coding-guidelines.md)** - Rust standards and patterns
- **[Quality Checks](./.claude/guides/local/quality-checks.md)** - Pre-commit hooks and CI
### Project Management
- **[Roadmap](./roadmap.md)** - Current and planned work
- **[Roadmap Archive](./roadmap-archive.md)** - Completed phases
- **[Contributing Guide](./CONTRIBUTING.md)** - How to contribute
### Architecture Deep Dives
- **[Data Structures](./docs/data-structures.md)** - Core types and design
- **[Consistency Model](./docs/consistency-model.md)** - Conflict resolution
- **[Distributed Architecture](./docs/research/distributed-write-path.md)** - Clustering and sharding
- **[Storage Engine](./docs/research/wal-crash-recovery-research.md)** - WAL and recovery
---
## Applications
Episteme powers multiple applications:
- **[Aphoria](./applications/aphoria/README.md)** - Code-level truth linter and continuous learning system
- **[Admin Dashboard](./applications/stemedb-dashboard/)** - Web UI for cluster management
- **[Disputed](./applications/disputed/)** - Claim disagreement visualization
---
## For AI Agents
- **[CLAUDE.md](./CLAUDE.md)** - Claude Code agent instructions
- **[GEMINI.md](./GEMINI.md)** - Gemini CLI agent instructions
---
## Core Principles
**ZERO TOLERANCE FOR MEDIOCRITY:** We build enterprise-grade products that must survive in production. Panics are UNACCEPTABLE. Broken pipe errors are UNACCEPTABLE. Sloppy testing is UNACCEPTABLE. Every line of code ships to paying customers who depend on it. Test everything. Handle every error. No shortcuts. No excuses.
### Technical Principles
- **Append-Only**: NEVER mutate existing Assertions. Create new ones.
- **Content-Addressed**: Assertion ID = BLAKE3 hash of content
- **No Unwrap**: NEVER use `unwrap()` or `expect()` in production code
- **Defensive Writes**: All writes go through WAL with fsync
- **Structured Logging**: Use `tracing` (info!, warn!, error!)
**[→ Full Coding Guidelines](./.claude/guides/coding-guidelines.md)**
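To make the "Defensive Writes" principle concrete, here is a minimal sketch of a write-ahead log that fsyncs every record. It uses only the standard library; `Wal`, `replay`, and the length-prefixed record layout are assumptions made for illustration and do not describe the actual on-disk format of the project's WAL crate.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical minimal write-ahead log: length-prefixed records in one file.
struct Wal {
    file: File,
}

impl Wal {
    fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Self { file })
    }

    /// Appends one record and returns only after the OS reports it synced.
    fn append(&mut self, payload: &[u8]) -> io::Result<()> {
        if payload.len() > u32::MAX as usize {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "record too large"));
        }
        let len = payload.len() as u32;
        self.file.write_all(&len.to_le_bytes())?;
        self.file.write_all(payload)?;
        // `sync_all` flushes data and metadata to the device (fsync on Unix).
        self.file.sync_all()
    }
}

/// Replays every complete record; a truncated tail (torn write) is skipped.
fn replay(path: &Path) -> io::Result<Vec<Vec<u8>>> {
    let bytes = fs::read(path)?;
    let mut records = Vec::new();
    let mut offset = 0usize;

    while offset + 4 <= bytes.len() {
        let mut len_bytes = [0u8; 4];
        len_bytes.copy_from_slice(&bytes[offset..offset + 4]);
        let len = u32::from_le_bytes(len_bytes) as usize;
        let start = offset + 4;
        let end = match start.checked_add(len) {
            Some(end) if end <= bytes.len() => end,
            _ => break, // Incomplete record: stop at the last good one.
        };
        records.push(bytes[start..end].to_vec());
        offset = end;
    }
    Ok(records)
}
```

The design choice worth noting is that `append` propagates the `sync_all` error instead of ignoring it, so a caller never treats an unsynced write as durable. A production log would also checksum each record, which this sketch omits.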
---
## Port Scheme (181XX)
| Service | Port | Env Var |
|---------|------|---------|
| HTTP API | 18180 | `STEMEDB_BIND_ADDR` |
| Cluster Gateway | 18181 | `STEMEDB_NODE_API_ADDR` |
| Cluster RPC | 18182 | `STEMEDB_NODE_RPC_ADDR` |
| SWIM Gossip | 18183 | via `SwimConfig` |
| StemeDB Dashboard | 18188 | - |
| Aphoria Dashboard | 18189 | - |
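A service could resolve its bind address from the table above roughly as follows. This is a sketch under stated assumptions: that `STEMEDB_BIND_ADDR` holds a `host:port` string, and that the fallback host is loopback. Only the port number 18180 comes from the table; `resolve_bind_addr` and `bind_addr_from_env` are hypothetical helpers.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Port from the table; the loopback host is an assumption of this sketch.
const DEFAULT_BIND_ADDR: &str = "127.0.0.1:18180";

/// Resolves the bind address from an optional override value.
fn resolve_bind_addr(override_value: Option<&str>) -> Result<SocketAddr, AddrParseError> {
    override_value.unwrap_or(DEFAULT_BIND_ADDR).trim().parse()
}

/// Reads the override from the process environment, falling back to the default.
fn bind_addr_from_env() -> Result<SocketAddr, AddrParseError> {
    let value = std::env::var("STEMEDB_BIND_ADDR").ok();
    resolve_bind_addr(value.as_deref())
}
```

Keeping the parsing in `resolve_bind_addr` separate from the environment lookup makes the logic testable without mutating process-wide state.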
---
## Quick Reference
```bash
# Build
cargo build --workspace
# Test
cargo test --workspace --lib # Unit tests (~3min)
cargo nextest run # Parallel runner (~5min)
# Lint (must pass before commit)
cargo clippy --workspace -- -D warnings
cargo fmt --check
# Run server
cargo run --package stemedb-api
# Run cluster node
cargo run --package stemedb-cluster --bin stemedb-node
```
---
## Community & Support
- **Issues**: [GitHub Issues](https://github.com/orchard9/stemedb/issues)
- **Discussions**: [GitHub Discussions](https://github.com/orchard9/stemedb/discussions)
- **License**: See LICENSE file
---
## What Makes Episteme Different?
Traditional databases force you to pick "the right answer." Episteme holds all the answers, tracks who said them and why, and lets you decide how to resolve disagreements at query time.
| Traditional DB | Episteme |
|----------------|----------|
| One canonical truth | Multiple competing claims |
| Update overwrites | Append-only history |
| Consensus enforced at write | Resolution deferred to read |
| Time-travel via backups | Built-in temporal queries |
| Source tracking via app logic | First-class provenance |
**When a Reddit community reports gastroparesis months before the FDA adds a warning label, both claims coexist in Episteme. You can query by authority tier (the FDA wins), by recency (the FDA label is the newer claim, while the history still shows Reddit reported it first), or by consensus (see the disagreement).**
**This is critical for domains where truth is contested, evolving, or depends on perspective: health, finance, research, intelligence.**
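Read-time resolution can be sketched in a few lines. The example assumes a hypothetical `Claim` shape and two illustrative lenses. Field names, the tier numbering, and the timestamps are invented for the sketch and are not the real schema or data.

```rust
/// Hypothetical claim shape for illustration; not the real schema.
#[derive(Debug, Clone, PartialEq)]
struct Claim {
    source: &'static str,
    statement: &'static str,
    /// Lower number means higher authority (assumption of this sketch).
    authority_tier: u8,
    /// Logical timestamp; only the ordering matters here.
    asserted_at: u64,
}

/// Two of the resolution strategies named in this README.
#[derive(Debug, Clone, Copy)]
enum Lens {
    Authority,
    Recency,
}

/// Picks one winner at read time; the stored claims are never modified.
fn resolve(claims: &[Claim], lens: Lens) -> Option<&Claim> {
    match lens {
        Lens::Authority => claims.iter().min_by_key(|c| c.authority_tier),
        Lens::Recency => claims.iter().max_by_key(|c| c.asserted_at),
    }
}

/// Answers "who said it first?" from the same append-only history.
fn earliest(claims: &[Claim]) -> Option<&Claim> {
    claims.iter().min_by_key(|c| c.asserted_at)
}
```

With a Reddit claim at an earlier timestamp and an FDA claim at a later one, both lenses return the FDA claim while `earliest` returns the Reddit claim, and the underlying slice still holds both.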
---
## Getting Help
| Question | Resource |
|----------|----------|
| How do I... | [Documentation Index](./docs/README.md) |
| Why did you... | [Architecture](./architecture.md) + [Vision](./vision.md) |
| Can I use this for... | [Use Cases](./use-cases/README.md) |
| It's not working... | [GitHub Issues](https://github.com/orchard9/stemedb/issues) |
| I want to contribute... | [Contributing Guide](./CONTRIBUTING.md) |
---
**[Get Started →](./quickstart.md)**

@ -17,7 +17,7 @@ The Simulation is an Agent-Based Modeling (ABM) environment that validates Steme
**File Pointers:** **File Pointers:**
- Implementation: `crates/stemedb-sim/src/main.rs` - Implementation: `crates/stemedb-sim/src/main.rs`
- Vision document: `/simulation-vision.md` - Vision document: `/simulation-vision.md`
- Incremental roadmap: `/arena-roadmap.md` - Incremental roadmap: `/roadmap.md#arena-simulation-roadmap`
## Agent Personas ## Agent Personas
@ -70,7 +70,7 @@ The simulator currently validates **Phase 1: The Spine** (WAL + Ingestor + KV St
## Roadmap ## Roadmap
See [arena-roadmap.md](/arena-roadmap.md) for the incremental path from current state to full ABM environment. See [roadmap.md - Arena section](/roadmap.md#arena-simulation-roadmap) for the incremental path from current state to full ABM environment.
## Related Topics ## Related Topics

@ -66,4 +66,4 @@ fn load_assertion(&self, hash: &Hash) -> Result<Assertion, StemeError> {
## Related Topics ## Related Topics
- [Rust Guidelines](../../.claude/guides/backend/rust-guidelines.md) - [Rust Guidelines](../../.claude/guides/backend/rust-guidelines.md)
- [CODING_GUIDELINES.md](../../../CODING_GUIDELINES.md) - [Coding Guidelines](../../../.claude/guides/coding-guidelines.md)

@ -72,9 +72,9 @@ BLOCK code://python/requests/tls/cert_verification
3. **[Build an agent](../../sdk/go/adk/)** - ADK-Go integration for autonomous operation 3. **[Build an agent](../../sdk/go/adk/)** - ADK-Go integration for autonomous operation
**Fallback (No LLM Access):** **Fallback (No LLM Access):**
- **[CLI Quick Start (2 min)](docs/getting-started/solo-developer-quick-start.md)** - Manual scan workflow (debug interface) - **[CLI Quick Start (2 min)](docs/guides/solo-developer-guide.md#quick-start-2-minutes)** - Manual scan workflow (debug interface)
See [Getting Started Hub](docs/getting-started/) for all paths. See [Getting Started Hub](docs/guides/) for all paths.
--- ---
@ -199,10 +199,10 @@ aphoria extractors test timeout_zero_detector --file src/config.rs
- Verify observation format before scanning - Verify observation format before scanning
- Faster iteration when creating extractors (< 5 seconds per test vs full scan) - Faster iteration when creating extractors (< 5 seconds per test vs full scan)
**Typical Day 3 workflow:** **Iterative development workflow:**
1. Create extractors → 2. `aphoria extractors validate` → 3. Fix subjects → 4. `aphoria extractors test` for each → 5. `aphoria scan --show-observations` → 6. Iterate 1. Create extractors → 2. `aphoria extractors validate` → 3. Fix subjects → 4. `aphoria extractors test` for each → 5. `aphoria scan --show-observations` → 6. Iterate
This workflow reduces Day 3 debugging time from ~70 minutes to ~30 minutes. This workflow enables rapid iteration when building custom extractors.
### Handle Conflicts ### Handle Conflicts
@ -333,7 +333,7 @@ repos:
| `aphoria governance pending` | List approval requests (Phase 14) | | `aphoria governance pending` | List approval requests (Phase 14) |
| `aphoria audit export` | Export audit trail for SOC 2 compliance | | `aphoria audit export` | Export audit trail for SOC 2 compliance |
See [CLI Reference](docs/cli-reference.md) for complete command documentation. See [CLI Reference](docs/reference/cli-reference.md) for complete command documentation.
--- ---
@ -377,7 +377,7 @@ Claims support six comparison modes for different verification patterns:
- `contains` - Value must contain substring/list element (e.g., "Serialize" in "Clone,Debug,Serialize") - `contains` - Value must contain substring/list element (e.g., "Serialize" in "Clone,Debug,Serialize")
- `not_contains` - Value must NOT contain substring/list element (e.g., "Clone" NOT in derives) - `not_contains` - Value must NOT contain substring/list element (e.g., "Clone" NOT in derives)
See [Comparison Modes Guide](docs/comparison-modes.md) for detailed examples and decision tree. See [Comparison Modes Guide](docs/reference/comparison-modes.md) for detailed examples and decision tree.
### Inline Markers ### Inline Markers
@ -487,10 +487,9 @@ Features:
### Reference ### Reference
| Document | Description | | Document | Description |
|----------|-------------| |----------|-------------|
| [CLI Reference](docs/cli-reference.md) | Complete command documentation | | [CLI Reference](docs/reference/cli-reference.md) | Complete command documentation |
| [Comparison Modes](docs/comparison-modes.md) | Guide to claim comparison modes | | [Comparison Modes](docs/reference/comparison-modes.md) | Guide to claim comparison modes |
| [Declarative Extractors](docs/extractors/declarative-extractors.md) | Complete field reference for declarative extractors | | [Declarative Extractors](docs/extractors/declarative-extractors.md) | Complete field reference for declarative extractors |
| [Vision & Gaps](docs/vision-gaps.md) | Architecture and implementation status |
### Examples ### Examples
| Example | Description | | Example | Description |
@ -511,8 +510,7 @@ Features:
| Document | Description | | Document | Description |
|----------|-------------| |----------|-------------|
| [Vision](vision.md) | Product vision and aspirational architecture | | [Vision](vision.md) | Product vision and aspirational architecture |
| [Protocol Vision](protocol_vision.md) | Protocol-level design philosophy | | [Protocol Vision](docs/advanced/eap-protocol.md) | Protocol-level design philosophy |
| [Vision & Gaps](docs/vision-gaps.md) | Honest assessment of current state vs. vision |
| [Architecture Docs](docs/architecture/README.md) | System design, concept matching, extension points | | [Architecture Docs](docs/architecture/README.md) | System design, concept matching, extension points |
### Testing & Validation ### Testing & Validation
@ -522,10 +520,11 @@ Features:
| [Phase 6 UAT](../../uat/phase6-uat.md) | Detailed validation of policy workflows | | [Phase 6 UAT](../../uat/phase6-uat.md) | Detailed validation of policy workflows |
| [Real-World Policy Source UAT](../../uat/2026-02-04-uat-real-world-policy-source.md) | Trust Pack workflow validation | | [Real-World Policy Source UAT](../../uat/2026-02-04-uat-real-world-policy-source.md) | Trust Pack workflow validation |
### Gap Analysis & Research ### Historical Documents (Archived)
| Document | Description | | Document | Description |
|----------|-------------| |----------|-------------|
| [Gap Analysis: Institutional Knowledge](docs/gap-analysis-institutional-knowledge.md) | Analysis of knowledge capture gaps | | [Vision & Gaps (2026-02-08)](docs/archive/vision-gaps-2026-02-08.md) | Historical: Architecture analysis and implementation status |
| [Gap Analysis: Institutional Knowledge](docs/archive/gap-analysis-institutional-knowledge-2026-02.md) | Historical: Knowledge capture gap analysis |
| [Gap Fixes Summary](docs/gap-fixes-summary.md) | Summary of addressed gaps | | [Gap Fixes Summary](docs/gap-fixes-summary.md) | Summary of addressed gaps |
--- ---

@ -691,7 +691,7 @@ To deduplicate, ensure CLI-created items use unique subjects or authorities that
## See Also ## See Also
- [CLI Reference](cli-reference.md) - Complete command reference - [CLI Reference](../reference/cli-reference.md) - Complete command reference
- [Configuration Reference](configuration.md) - Configuration file reference - [Configuration Reference](configuration.md) - Configuration file reference
- [README](../README.md) - Quickstart and key concepts - [README](../README.md) - Quickstart and key concepts
- [Comparison Modes](comparison-modes.md) - Deep dive on verification logic - [Comparison Modes](comparison-modes.md) - Deep dive on verification logic

@ -554,7 +554,7 @@ aphoria trust-pack install rfc-owasp-baseline
## Related Documentation ## Related Documentation
### Product ### Product
- [Product Overview](../../product.md) - What Aphoria does - [Product Overview](../advanced/product-overview.md) - What Aphoria does
- [Roadmap](../../roadmap.md) - Implementation status and plans - [Roadmap](../../roadmap.md) - Implementation status and plans
### Guides ### Guides

@ -154,7 +154,7 @@ let client = reqwest::Client::builder()
**Rust Extractor Output:** **Rust Extractor Output:**
```rust ```rust
ExtractedClaim { Observation {
concept_path: "code://rust/backend-api/tls/cert_verification", concept_path: "code://rust/backend-api/tls/cert_verification",
predicate: "enabled", predicate: "enabled",
value: ObjectValue::Boolean(false), value: ObjectValue::Boolean(false),

@ -597,7 +597,7 @@ impl ClaimMatcher {
/// Check if extracted claims satisfy must_contain requirements. /// Check if extracted claims satisfy must_contain requirements.
pub fn check_must_contain( pub fn check_must_contain(
&self, &self,
extracted: &[ExtractedClaim], extracted: &[Observation],
expected: &[ExpectedClaim], expected: &[ExpectedClaim],
) -> MatchResult { ) -> MatchResult {
let mut matched = vec![]; let mut matched = vec![];
@ -617,9 +617,9 @@ impl ClaimMatcher {
/// Check if any extracted claim matches (for must_not_contain). /// Check if any extracted claim matches (for must_not_contain).
pub fn check_must_not_contain( pub fn check_must_not_contain(
&self, &self,
extracted: &[ExtractedClaim], extracted: &[Observation],
forbidden: &[ExpectedClaim], forbidden: &[ExpectedClaim],
) -> Vec<(ExpectedClaim, ExtractedClaim)> { ) -> Vec<(ExpectedClaim, Observation)> {
let mut violations = vec![]; let mut violations = vec![];
for forbid in forbidden { for forbid in forbidden {
@ -633,9 +633,9 @@ impl ClaimMatcher {
fn find_matching_claim( fn find_matching_claim(
&self, &self,
extracted: &[ExtractedClaim], extracted: &[Observation],
expected: &ExpectedClaim, expected: &ExpectedClaim,
) -> Option<&ExtractedClaim> { ) -> Option<&Observation> {
extracted.iter().find(|claim| { extracted.iter().find(|claim| {
self.subject_matches(&claim.concept_path, &expected.subject) && self.subject_matches(&claim.concept_path, &expected.subject) &&
claim.predicate == expected.predicate && claim.predicate == expected.predicate &&

@ -288,7 +288,7 @@ pub struct ExtractionOutput {
pub raw_response: String, pub raw_response: String,
/// Parsed claims (may be empty if parsing failed) /// Parsed claims (may be empty if parsing failed)
pub claims: Vec<ExtractedClaim>, pub claims: Vec<Observation>,
/// Whether parsing succeeded /// Whether parsing succeeded
pub parse_success: bool, pub parse_success: bool,
@ -503,7 +503,7 @@ impl ClaimMatcher {
/// Check if extracted claims satisfy must_contain requirements /// Check if extracted claims satisfy must_contain requirements
pub fn check_must_contain( pub fn check_must_contain(
&self, &self,
extracted: &[ExtractedClaim], extracted: &[Observation],
expected: &[ExpectedClaim], expected: &[ExpectedClaim],
) -> MatchResult { ) -> MatchResult {
// For each expected claim: // For each expected claim:
@ -516,7 +516,7 @@ impl ClaimMatcher {
/// Check if extracted claims violate must_not_contain requirements /// Check if extracted claims violate must_not_contain requirements
pub fn check_must_not_contain( pub fn check_must_not_contain(
&self, &self,
extracted: &[ExtractedClaim], extracted: &[Observation],
forbidden: &[ExpectedClaim], forbidden: &[ExpectedClaim],
) -> MatchResult { ) -> MatchResult {
// For each forbidden claim: // For each forbidden claim:

@ -242,7 +242,7 @@ mod tests {
```rust ```rust
async fn check_conflicts_persistent( async fn check_conflicts_persistent(
all_claims: &[ExtractedClaim], all_claims: &[Observation],
project_root: &Path, project_root: &Path,
config: &AphoriaConfig, config: &AphoriaConfig,
sync: bool, sync: bool,
@ -291,7 +291,7 @@ impl LocalEpisteme {
/// Check conflicts with policy alias support. /// Check conflicts with policy alias support.
pub async fn check_conflicts_with_aliases( pub async fn check_conflicts_with_aliases(
&self, &self,
claims: &[ExtractedClaim], claims: &[Observation],
config: &AphoriaConfig, config: &AphoriaConfig,
index: &ConceptIndex, index: &ConceptIndex,
policy_aliases: &[PolicyAlias], policy_aliases: &[PolicyAlias],
@ -322,7 +322,7 @@ impl LocalEpisteme {
// Keep existing method for ephemeral mode // Keep existing method for ephemeral mode
pub async fn check_conflicts( pub async fn check_conflicts(
&self, &self,
claims: &[ExtractedClaim], claims: &[Observation],
config: &AphoriaConfig, config: &AphoriaConfig,
index: &ConceptIndex, index: &ConceptIndex,
) -> Result<Vec<ConflictResult>, AphoriaError> { ) -> Result<Vec<ConflictResult>, AphoriaError> {
@ -354,7 +354,7 @@ impl EphemeralDetector {
pub fn check_conflicts( pub fn check_conflicts(
&self, &self,
claims: &[ExtractedClaim], claims: &[Observation],
config: &AphoriaConfig, config: &AphoriaConfig,
) -> Vec<ConflictResult> { ) -> Vec<ConflictResult> {
let index = ConceptIndex::build(&self.corpus); let index = ConceptIndex::build(&self.corpus);
@ -611,7 +611,7 @@ async fn test_policy_alias_matching_integration() {
).unwrap(); ).unwrap();
// 4. Simulate scan with code claim // 4. Simulate scan with code claim
let code_claim = ExtractedClaim { let code_claim = Observation {
concept_path: "code://rust/myapp/tls/cert_verification".to_string(), concept_path: "code://rust/myapp/tls/cert_verification".to_string(),
predicate: "enabled".to_string(), predicate: "enabled".to_string(),
value: ObjectValue::Boolean(false), // CONFLICT value: ObjectValue::Boolean(false), // CONFLICT

View File

@ -363,7 +363,7 @@ impl DocumentIngester {
async fn extract_claims_from_section( async fn extract_claims_from_section(
&self, &self,
section: &DocumentSection, section: &DocumentSection,
) -> Result<Vec<ExtractedClaim>, LlmError> { ) -> Result<Vec<Observation>, LlmError> {
let prompt = format!( let prompt = format!(
r#"Extract architectural claims from this section. r#"Extract architectural claims from this section.
@ -390,7 +390,7 @@ Return as JSON array."#,
/// Validate extracted claims for quality. /// Validate extracted claims for quality.
fn validate_claims( fn validate_claims(
&self, &self,
claims: Vec<ExtractedClaim>, claims: Vec<Observation>,
) -> Result<Vec<ValidatedClaim>, ValidationError> { ) -> Result<Vec<ValidatedClaim>, ValidationError> {
claims claims
.into_iter() .into_iter()

View File

@ -8,7 +8,7 @@
**Phase A1: Distinguish Observations from Claims** - ✅ **COMPLETE** (2026-02-08) **Phase A1: Distinguish Observations from Claims** - ✅ **COMPLETE** (2026-02-08)
- Renamed `ExtractedClaim` → `Observation` (struct + 81 files updated) - Renamed `ExtractedClaim` → `Observation` (struct + 81 files updated)
- Added confidence-based tier mapping: ≥0.9 → Tier 4, <0.9 → Tier 5 - Added confidence-based tier mapping: ≥0.9 → Tier 4, <0.9 → Tier 5
- `observation_to_assertion()` replaces fixed Tier 3 assignment - `observation_to_assertion()` replaces fixed Tier 3 assignment
- `AuthoredClaim` type fully defined with provenance/invariant/consequence fields - `AuthoredClaim` type fully defined with provenance/invariant/consequence fields
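
The confidence-based tier mapping listed above could look roughly like the following. This is a minimal sketch, not the actual `observation_to_assertion()` code: the `Tier` enum and the `tier_for_confidence` function are hypothetical names introduced only for illustration, and only the 0.9 cut-off comes from the bullet above.

```rust
/// Trust tier for scanner output (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Tier4,
    Tier5,
}

/// Map an observation's confidence to a tier:
/// confidence >= 0.9 gives Tier 4, anything lower gives Tier 5.
/// A NaN confidence fails the comparison and therefore falls into Tier 5.
pub fn tier_for_confidence(confidence: f64) -> Tier {
    if confidence >= 0.9 {
        Tier::Tier4
    } else {
        Tier::Tier5
    }
}
```

The point of the mapping is that scanner output never receives the fixed Tier 3 it was given before Phase A1.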
@ -417,7 +417,7 @@ The following claims were extracted using the `extract-claims` skill pattern. Ea
| ID | Claim | Gap | | ID | Claim | Gap |
|----|-------|-----| |----|-------|-----|
| VG-020 | `Observation` type exists and is properly named | ✅ **CLOSED**: `ExtractedClaim` renamed to `Observation` in Phase A1 | | VG-020 | `Observation` type exists and is properly named | ✅ **CLOSED**: `ExtractedClaim` renamed to `Observation` in Phase A1 |
| VG-021 | A real `Claim` type should exist with provenance, invariant, consequence, authority | No such type exists anywhere | | VG-021 | A real `Claim` type should exist with provenance, invariant, consequence, authority | No such type exists anywhere |
| VG-022 | Extractors should be paired with claims they verify | ✅ **CLOSED**: `verifiable_predicates()` added to `Extractor` trait; 10 extractors declare predicates; `compute_extractor_claim_map()` in verify.rs; `aphoria verify map` shows coverage | | VG-022 | Extractors should be paired with claims they verify | ✅ **CLOSED**: `verifiable_predicates()` added to `Extractor` trait; 10 extractors declare predicates; `compute_extractor_claim_map()` in verify.rs; `aphoria verify map` shows coverage |
| VG-023 | `aphoria audit` command should exist | No audit subcommand in CLI | | VG-023 | `aphoria audit` command should exist | No audit subcommand in CLI |
@ -451,7 +451,7 @@ pub struct Observation {
pub description: String, pub description: String,
} }
// Already exists as Observation (was ExtractedClaim before A1) // Already exists as Observation (was ExtractedClaim before A1)
pub struct Observation { pub struct Observation {
pub concept_path: String, pub concept_path: String,
pub predicate: String, pub predicate: String,
@ -558,7 +558,7 @@ These extractors audit Aphoria's own code to verify the claims in this document
| `bridge_parent_hash_audit` | Whether `parent_hash` is always `None` | Regex for `parent_hash: None` in bridge | | `bridge_parent_hash_audit` | Whether `parent_hash` is always `None` | Regex for `parent_hash: None` in bridge |
| `bridge_lifecycle_audit` | Whether lifecycle skips review | Regex for `LifecycleStage::Approved` without Pending | | `bridge_lifecycle_audit` | Whether lifecycle skips review | Regex for `LifecycleStage::Approved` without Pending |
| `extractor_trait_audit` | Whether Extractor trait accepts claims | Check trait definition for claim parameter | | `extractor_trait_audit` | Whether Extractor trait accepts claims | Check trait definition for claim parameter |
| `type_naming_audit` | Whether `ExtractedClaim` has been renamed | Grep for `struct ExtractedClaim` vs `struct Observation` | | `type_naming_audit` | Whether `ExtractedClaim` has been renamed | Grep for `struct ExtractedClaim` vs `struct Observation` |
### Claim-Paired Extractors (Project-Specific) ### Claim-Paired Extractors (Project-Specific)
@ -608,7 +608,7 @@ source = { claim_id = "arch-boundary-001", authority = "architecture-decision" }
### Phase 1: Distinguish observations from claims ### Phase 1: Distinguish observations from claims
- [x] Rename `ExtractedClaim` to `Observation` in `types/claim.rs` ✅ **COMPLETE (Phase A1)** - [x] Rename `ExtractedClaim` to `Observation` in `types/claim.rs` ✅ **COMPLETE (Phase A1)**
- [ ] Create `AuthoredClaim` type with provenance, invariant, consequence, authority, evidence_chain - [ ] Create `AuthoredClaim` type with provenance, invariant, consequence, authority, evidence_chain
- [ ] Update `bridge.rs` default path to use Tier 4/5 (not Tier 3) for scanner output - [ ] Update `bridge.rs` default path to use Tier 4/5 (not Tier 3) for scanner output
- [ ] Add `evidence` field to `source_metadata` in bridge - [ ] Add `evidence` field to `source_metadata` in bridge

View File

@ -109,7 +109,7 @@ All checks pass with no warnings.
## Related Documentation ## Related Documentation
- `applications/aphoria/docs/vision-gaps.md` - Original gap analysis - `applications/aphoria/docs/archive/vision-gaps-2026-02-08.md` - Original gap analysis (archived)
- `applications/aphoria/docs/claims-explained.md` - Claim vs observation semantics - `applications/aphoria/docs/claims-explained.md` - Claim vs observation semantics
- `.aphoria/claims.toml` - Example claims with supersession chains - `.aphoria/claims.toml` - Example claims with supersession chains
- `applications/aphoria/src/bridge.rs` - Tier assignment logic - `applications/aphoria/src/bridge.rs` - Tier assignment logic

View File

@ -1,112 +0,0 @@
# Getting Started with Aphoria
**Aphoria is an autonomous learning system powered by LLM workflows.** Choose your integration path:
## 🤖 I Want Autonomous Operation (Recommended)
**LLM-Driven Workflows:** Skills, agents, or custom integrations
**Claude Code Skills:**
- Load `/aphoria-claims` - Commit-time claim authoring
- Load `/aphoria-suggest` - Pattern-based claim suggestions
- Load `/aphoria-custom-extractor-creator` - Generate custom extractors
**Go ADK Agents:**
- See [ADK-Go Integration](../../../../sdk/go/adk/) - Fully autonomous tool-use agents
**Custom Integration:**
- Any LLM with tool-use capability can drive Aphoria via CLI
---
## 📚 I Want to Learn It (20 minutes)
**Worked Example:** Follow a complete use case from documentation → claims → violations → fixes
[Database Connection Pool Example](../../dogfood/dbpool/) - See how a solo developer:
1. Extracts 25-30 claims from HikariCP/PostgreSQL docs
2. Writes code (with intentional violations)
3. Runs Aphoria scan (catches all 7-8 violations)
4. Fixes violations incrementally
5. Reaches production-ready code
**What you get:**
- Complete claim extraction walkthrough with decision framework
- Pre-flight validator to check your environment
- Expected output examples for every command
- Real scan results showing BLOCK/FLAG/PASS verdicts
**Time:** 20 minutes to read, 5 days to execute (optional)
---
## ⚠️ Critical: Day 3 of Dogfooding
If you're following a dogfooding exercise (e.g., `dogfood/msgqueue/`), **Day 3 is the most important day** - it's where the autonomous learning flywheel is validated.
**What makes Day 3 different:**
- Days 1-2: Setup (claims authoring, code writing)
- **Day 3: LEARNING** (creating extractors to close gaps) ← **This is the flywheel**
- Days 4-5: Verification (fixes, documentation)
**Common mistake:** Running scan once, seeing low detection rate (0-20%), and moving on without creating extractors. This breaks the entire flywheel.
**Correct approach:**
1. Run baseline scan (expect 0-20% detection on new domain)
2. Analyze gaps (which extractors are missing?)
3. Create extractors with `/aphoria-custom-extractor-creator` (8 invocations for 8 violations)
4. Run verification scan (should be ≥90% detection)
5. Document improvement (0% → 90%+)
**How to verify Day 3 was done correctly:**
```bash
ls .aphoria/extractors/*.toml | wc -l # Should be: 8+
ls scan-v2.json # Must exist
ls DAY3-SUMMARY.md # Must exist
```
If ANY are missing, Day 3 is incomplete. See [Common Mistakes](../dogfooding-common-mistakes.md) for details.
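
The detection targets above reduce to simple arithmetic. The sketch below is illustrative only: `detection_rate` and `meets_day3_target` are hypothetical helpers, not part of the Aphoria CLI, and the 0.9 bar is taken from the "≥90% detection" step.

```rust
/// Fraction of known violations that a scan detected.
/// Returns `None` when there is nothing to detect or the counts are inconsistent.
pub fn detection_rate(detected: u32, total: u32) -> Option<f64> {
    if total == 0 || detected > total {
        return None;
    }
    Some(f64::from(detected) / f64::from(total))
}

/// True when the verification scan reaches the 90% bar described for Day 3.
pub fn meets_day3_target(detected: u32, total: u32) -> bool {
    matches!(detection_rate(detected, total), Some(rate) if rate >= 0.9)
}
```

With 8 seeded violations, 7 detections give 87.5%, which is below the bar. Under these assumptions all 8 must be caught, which fits the guidance to create one extractor per violation.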
---
## 🚀 Fallback: No LLM Access (Debug Interface)
**CLI-Only Mode:** For environments without LLM access or debugging
[Solo Developer Quick Start](./solo-developer-quick-start.md) - Manual scan workflow (debug interface)
**⚠️ Limitations:**
- Manual claim authoring (naming errors break tail-path matching)
- No autonomous flywheel (scan only, no evaluate/claim/create)
- Requires manual pattern analysis
---
## 🔧 I Want to Integrate It (30 minutes)
**Production Integration:** Pre-commit hooks, CI/CD, team workflows
See:
- [Pre-Flight Checks Guide](../guides/pre-flight-checks.md) - Git hooks and CI integration
- [Enterprise Quick Start](../guides/enterprise-quick-start.md) - Team deployment
- [Multi-Team Policy Governance](../guides/multi-team-policy-governance.md) - Scaling to multiple teams
---
## Reference Materials
| Document | Purpose |
|----------|---------|
| [CLI Reference](../cli-reference.md) | Complete command documentation |
| [Comparison Modes](../comparison-modes.md) | How Aphoria evaluates conflicts |
| [Configuration](../configuration.md) | .aphoria/config.toml reference |
| [Architecture](../architecture/README.md) | System design and algorithms |
---
## Support
- **Installation issues:** See [Solo Developer Guide](../guides/solo-developer-guide.md#install)
- **Scan not finding violations:** Check [Troubleshooting](../cli-reference.md#troubleshooting)
- **Custom extractors:** See [Architecture: Extractors](../architecture/README.md#extractors)
- **Enterprise deployment:** See [Enterprise Pilot Guide](../guides/enterprise-pilot-guide.md)

View File

@ -1,185 +0,0 @@
# Solo Developer Quick Start
Get Aphoria running on your project in 2 minutes. No team coordination, no complex setup.
---
## Prerequisites
- **Rust toolchain** - `cargo --version` (Rust 1.70+)
- **Git repository** - Aphoria scans code in version control
- **5 minutes** - Time to install, scan, and see results
---
## Step 1: Install (30 seconds)
```bash
cd /path/to/stemedb/applications/aphoria
cargo install --path .
```
Verify:
```bash
aphoria --version
```
**Expected output:**
```
aphoria 0.1.0
```
---
## Step 2: Initialize Your Project (30 seconds)
```bash
cd /path/to/your-project
aphoria init
```
This creates `.aphoria/config.toml` and loads the authoritative corpus (RFCs, OWASP) into your local database.
**Expected output:**
```
✓ Created .aphoria/config.toml
✓ Loaded 247 authoritative claims from corpus
✓ Project initialized: your-project
```
---
## Step 3: Run Your First Scan (30 seconds)
```bash
aphoria scan
```
**Expected output (if violations found):**
```
┌──────────────────────┬──────┬─────────┬──────────────────────────────────────────┐
│ File │ Line │ Verdict │ Explanation │
├──────────────────────┼──────┼─────────┼──────────────────────────────────────────┤
│ api/client.py │ 42 │ BLOCK │ TLS cert verification disabled │
│ │ │ │ (RFC 5246: MUST verify, confidence: 0.92)│
├──────────────────────┼──────┼─────────┼──────────────────────────────────────────┤
│ config/settings.py │ 18 │ FLAG │ DEBUG=True in production config │
│ │ │ │ (OWASP: SHOULD disable, confidence: 0.68)│
└──────────────────────┴──────┴─────────┴──────────────────────────────────────────┘
Summary: 1 BLOCK, 1 FLAG, 0 PASS
Scan completed in 0.24s
```
**Expected output (if clean):**
```
✓ No violations found
```
---
## Step 4: Understand the Results
### Verdicts
| Verdict | Meaning | Confidence Threshold |
|---------|---------|---------------------|
| **BLOCK** | Critical violation - production risk | ≥ 0.7 |
| **FLAG** | Warning - best practice violation | ≥ 0.5 |
| **PASS** | No conflict with authoritative sources | < 0.5 |
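
Read as code, the threshold table might look like this. It is a hedged sketch rather than Aphoria's actual implementation: the `Verdict` enum and the `verdict_for_confidence` function are hypothetical names, and only the 0.7 and 0.5 cut-offs come from the table.

```rust
/// Scan verdicts from the table above (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Block,
    Flag,
    Pass,
}

/// Classify a conflict confidence using the documented thresholds:
/// >= 0.7 is BLOCK, >= 0.5 is FLAG, anything lower is PASS.
pub fn verdict_for_confidence(confidence: f64) -> Verdict {
    if confidence >= 0.7 {
        Verdict::Block
    } else if confidence >= 0.5 {
        Verdict::Flag
    } else {
        Verdict::Pass
    }
}
```

This matches the sample scan output in Step 3, where a confidence of 0.92 is reported as BLOCK and 0.68 as FLAG.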
### What Aphoria Catches
- **TLS/SSL:** Disabled cert verification, weak protocols (SSLv3, TLS 1.0)
- **Authentication:** Missing token validation, disabled CSRF protection
- **Configuration:** Debug mode in production, hardcoded secrets
- **Framework Security:** Django DEBUG=True, Flask CSRF disabled, Express without helmet
---
## Next Steps
### Option A: Add Pre-Commit Hook (Recommended)
Block insecure code before it reaches your repo:
```bash
# Add to .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: aphoria
        name: Aphoria security check
        entry: aphoria scan --staged --exit-code
        language: system
        pass_filenames: false
Then:
```bash
pre-commit install
```
Now every commit is checked automatically.
### Option B: Learn by Example
Follow the complete [Database Connection Pool Example](../../dogfood/dbpool/) to see:
- How to extract claims from technical documentation (HikariCP, PostgreSQL)
- How Aphoria catches violations (7-8 real examples)
- How to fix violations incrementally
- How to validate your environment is working
**Time:** 20 minutes to read, optional 5-day hands-on exercise
### Option C: Dive Deeper
- [Solo Developer Guide](../guides/solo-developer-guide.md) - Comprehensive workflows
- [CLI Reference](../cli-reference.md) - All commands and options
- [Comparison Modes](../comparison-modes.md) - How conflicts are evaluated
---
## Troubleshooting
### "Corpus database not found"
```bash
# Initialize project first
aphoria init
# Or specify corpus DB location
export STEMEDB_CORPUS_DB_DIR=/path/to/corpus-db
```
### "No violations found" (but you expected some)
```bash
# Enable debug logging to see what extractors are doing
RUST_LOG=aphoria=debug aphoria scan
# Check which extractors ran
aphoria scan --show-observations
```
### "Scan is slow"
Ephemeral mode (default) should be fast (< 0.3s). If slow:
```bash
# Check file count
find . -name "*.rs" -o -name "*.py" | wc -l
# Exclude large directories
# Edit .aphoria/config.toml:
[scan]
exclude = ["target/", "node_modules/", "venv/"]
```
---
## Support
- **Installation issues:** Check [Solo Developer Guide: Installation](../guides/solo-developer-guide.md#1-install)
- **Custom patterns:** See [Architecture: Extractors](../architecture/README.md#extractors)
- **Enterprise setup:** See [Enterprise Quick Start](../guides/enterprise-quick-start.md)

View File

@ -1,20 +1,23 @@
# Aphoria Guides # Aphoria Guides
Quick-start guides and workflows for Aphoria users. **Aphoria is an autonomous learning system powered by LLM workflows.** Choose your integration path:
**New to Aphoria?** Start with **LLM-driven workflows** for autonomous operation.
--- ---
## LLM Workflows (Primary Interface) ## 🤖 I Want Autonomous Operation (Recommended)
**Aphoria is designed for LLM-driven autonomous operation:** **LLM-Driven Workflows:** Skills, agents, or custom integrations
| Interface | Use Case | Documentation | **Claude Code Skills:**
|-----------|----------|---------------| - Load `/aphoria-claims` - Commit-time claim authoring
| **Claude Code Skills** | Interactive agent workflows | Load `/aphoria-claims`, `/aphoria-suggest` | - Load `/aphoria-suggest` - Pattern-based claim suggestions
| **Go ADK Agents** | Fully autonomous CI/CD | See [ADK-Go Integration](../../../sdk/go/adk/) | - Load `/aphoria-custom-extractor-creator` - Generate custom extractors
| **Custom LLM Integration** | Any tool-use capable LLM | See [LLM Wiki Extraction](./llm-wiki-extraction.md) |
**Go ADK Agents:**
- See [ADK-Go Integration](../../../sdk/go/adk/) - Fully autonomous tool-use agents
**Custom Integration:**
- Any LLM with tool-use capability can drive Aphoria via CLI
**Why LLM workflows?** **Why LLM workflows?**
- Enforce naming conventions (manual errors break tail-path matching) - Enforce naming conventions (manual errors break tail-path matching)
@ -27,7 +30,48 @@ Quick-start guides and workflows for Aphoria users.
--- ---
## Getting Started (Fallback: No LLM Access) ## 📚 I Want to Learn It (20 minutes)
**Worked Example:** Follow a complete use case from documentation → claims → violations → fixes
[Database Connection Pool Example](../../dogfood/dbpool/) - See how a solo developer:
1. Extracts 25-30 claims from HikariCP/PostgreSQL docs
2. Writes code (with intentional violations)
3. Runs Aphoria scan (catches all 7-8 violations)
4. Fixes violations incrementally
5. Reaches production-ready code
**What you get:**
- Complete claim extraction walkthrough with decision framework
- Pre-flight validator to check your environment
- Expected output examples for every command
- Real scan results showing BLOCK/FLAG/PASS verdicts
**Time:** 20 minutes to read, 5 days to execute (optional)
---
## 🚀 Fallback: No LLM Access (Debug Interface)
**CLI-Only Mode:** For environments without LLM access or debugging
**⚠️ Limitations:**
- Manual claim authoring (naming errors break tail-path matching)
- No autonomous flywheel (scan only, no evaluate/claim/create)
- Requires manual pattern analysis
## 🔧 I Want to Integrate It (30 minutes)
**Production Integration:** Pre-commit hooks, CI/CD, team workflows
See:
- [Pre-Flight Checks Guide](./pre-flight-checks.md) - Git hooks and CI integration
- [Enterprise Quick Start](./enterprise-quick-start.md) - Team deployment
- [Multi-Team Policy Governance](./multi-team-policy-governance.md) - Scaling to multiple teams
---
## Getting Started Guides
| Guide | Audience | Description | | Guide | Audience | Description |
|-------|----------|-------------| |-------|----------|-------------|
@ -55,21 +99,25 @@ Quick-start guides and workflows for Aphoria users.
| [AAA Game Development](./aaa-game-development.md) | Unreal Engine patterns | | [AAA Game Development](./aaa-game-development.md) | Unreal Engine patterns |
| [LLM Wiki Extraction](./llm-wiki-extraction.md) | Extract claims from technical docs using LLM skill | | [LLM Wiki Extraction](./llm-wiki-extraction.md) | Extract claims from technical docs using LLM skill |
## Reference Documentation ## Reference Materials
| Document | Description | | Document | Purpose |
|----------|-------------| |----------|---------|
| [CLI Reference](../cli-reference.md) | Complete command documentation | | [CLI Reference](../reference/cli-reference.md) | Complete command documentation |
| [Comparison Modes](../comparison-modes.md) | Detailed guide for claim comparison modes | | [Comparison Modes](../reference/comparison-modes.md) | How Aphoria evaluates conflicts |
| [Configuration](../reference/configuration.md) | .aphoria/config.toml reference |
## Architecture | [Architecture](../architecture/README.md) | System design and algorithms |
See [Architecture Documentation](../architecture/README.md) for:
- System design and data flow
- Concept matching algorithms
- Extension points and performance targets
## UAT Results ## UAT Results
See [UAT Reports](../../uat/) for validation results: See [UAT Reports](../../uat/) for validation results:
- [Policy Source Tracking UAT](../../uat/2026-02-04-uat-real-world-policy-source.md) - Trust Pack workflow validation - [Policy Source Tracking UAT](../../uat/2026-02-04-uat-real-world-policy-source.md) - Trust Pack workflow validation
---
## Support
- **Installation issues:** See [Solo Developer Guide](./solo-developer-guide.md#quick-start-2-minutes)
- **Scan not finding violations:** Check [Troubleshooting](../reference/cli-reference.md#troubleshooting)
- **Custom extractors:** See [Architecture: Extractors](../architecture/README.md#extractors)
- **Enterprise deployment:** See [Enterprise Pilot Guide](./enterprise-pilot-guide.md)

View File

@ -497,5 +497,5 @@ This enables:
## See Also ## See Also
- [Claims Guide](./claims.md) - Understanding claims vs observations - [Claims Guide](./claims.md) - Understanding claims vs observations
- [Getting Started](../getting-started/README.md) - Initial setup - [Getting Started](./README.md) - All guides and integration paths
- [CLI Reference](../README.md) - All commands - [CLI Reference](../reference/cli-reference.md) - All commands

View File

@ -478,6 +478,6 @@ Summary: 27 claims extracted, 27 stored successfully
## See Also ## See Also
- [CLI Reference](../cli-reference.md) - All `aphoria corpus` commands - [CLI Reference](../reference/cli-reference.md) - All `aphoria corpus` commands
- [Corpus API](../api-reference.md) - Query corpus programmatically - [Corpus API](../api-reference.md) - Query corpus programmatically
- [Claims vs Observations](../../README.md#claims-vs-observations) - Key concepts - [Claims vs Observations](../../README.md#claims-vs-observations) - Key concepts

View File

@ -15,10 +15,10 @@ You don't need enterprise workflows. You need to know when your code contradicts
## Quick Start (2 Minutes) ## Quick Start (2 Minutes)
### 1. Install ### Step 1: Install (30 seconds)
```bash ```bash
cd applications/aphoria cd /path/to/stemedb/applications/aphoria
cargo install --path . cargo install --path .
``` ```
@ -27,32 +27,70 @@ Verify:
aphoria --version aphoria --version
``` ```
### 2. Initialize **Expected output:**
```
aphoria 0.1.0
```
---
### Step 2: Initialize Your Project (30 seconds)
```bash ```bash
cd your-project cd /path/to/your-project
aphoria init aphoria init
``` ```
This loads the authoritative corpus (RFCs, OWASP guidelines) into your local database. This creates `.aphoria/config.toml` and loads the authoritative corpus (RFCs, OWASP) into your local database.
### 3. Scan **Expected output:**
```
✓ Created .aphoria/config.toml
✓ Loaded 247 authoritative claims from corpus
✓ Project initialized: your-project
```
---
### Step 3: Run Your First Scan (30 seconds)
```bash ```bash
aphoria scan aphoria scan
``` ```
That's it. You'll see output like: **Expected output (if violations found):**
``` ```
BLOCK code://python/requests/tls/cert_verification ┌──────────────────────┬──────┬─────────┬──────────────────────────────────────────┐
Your code: verify=False (api/client.py:42) │ File │ Line │ Verdict │ Explanation │
RFC 5246: TLS certificate verification MUST be enabled ├──────────────────────┼──────┼─────────┼──────────────────────────────────────────┤
Conflict: 0.92 │ api/client.py │ 42 │ BLOCK │ TLS cert verification disabled │
│ │ │ │ (RFC 5246: MUST verify, confidence: 0.92)│
├──────────────────────┼──────┼─────────┼──────────────────────────────────────────┤
│ config/settings.py │ 18 │ FLAG │ DEBUG=True in production config │
│ │ │ │ (OWASP: SHOULD disable, confidence: 0.68)│
└──────────────────────┴──────┴─────────┴──────────────────────────────────────────┘
1 conflict found (1 BLOCK). Summary: 1 BLOCK, 1 FLAG, 0 PASS
Scan completed in 0.24s
``` ```
**Expected output (if clean):**
```
✓ No violations found
```
---
### Step 4: Understand the Results
#### Verdicts
| Verdict | Meaning | Confidence Threshold |
|---------|---------|---------------------|
| **BLOCK** | Critical violation - production risk | ≥ 0.7 |
| **FLAG** | Warning - best practice violation | ≥ 0.5 |
| **PASS** | No conflict with authoritative sources | < 0.5 |
--- ---
## Pre-Commit Hook ## Pre-Commit Hook
@ -223,6 +261,44 @@ aphoria corpus list
--- ---
## Troubleshooting
### "Corpus database not found"
```bash
# Initialize project first
aphoria init
# Or specify corpus DB location
export STEMEDB_CORPUS_DB_DIR=/path/to/corpus-db
```
### "No violations found" (but you expected some)
```bash
# Enable debug logging to see what extractors are doing
RUST_LOG=aphoria=debug aphoria scan
# Check which extractors ran
aphoria scan --show-observations
```
### "Scan is slow"
Ephemeral mode (default) should be fast (< 0.3s). If slow:
```bash
# Check file count
find . -name "*.rs" -o -name "*.py" | wc -l
# Exclude large directories
# Edit .aphoria/config.toml:
[scan]
exclude = ["target/", "node_modules/", "venv/"]
```
---
## FAQ ## FAQ
**Q: How is this different from a linter?** **Q: How is this different from a linter?**
@ -250,6 +326,23 @@ See `aphoria extractors list` for current coverage.
## Next Steps ## Next Steps
### Option A: Add Pre-Commit Hook (Recommended)
Block insecure code before it reaches your repo - see [Pre-Commit Hook](#pre-commit-hook) section above.
### Option B: Learn by Example
Follow the complete [Database Connection Pool Example](../../dogfood/dbpool/) to see:
- How to extract claims from technical documentation (HikariCP, PostgreSQL)
- How Aphoria catches violations (7-8 real examples)
- How to fix violations incrementally
- How to validate your environment is working
**Time:** 20 minutes to read, optional 5-day hands-on exercise
### Option C: Dive Deeper
- [The First Scan](./the-first-scan.md) - Detailed walkthrough - [The First Scan](./the-first-scan.md) - Detailed walkthrough
- [Pre-Flight Checks](./pre-flight-checks.md) - CI integration - [Pre-Flight Checks](./pre-flight-checks.md) - CI integration
- Read the [main README](../../README.md) for command reference - [CLI Reference](../reference/cli-reference.md) - All commands and options
- [Comparison Modes](../reference/comparison-modes.md) - How conflicts are evaluated

View File

@ -693,7 +693,7 @@ EOF
```rust ```rust
// In llm/extractor.rs // In llm/extractor.rs
impl LlmExtractor { impl LlmExtractor {
pub fn extract(&self, content: &str, language: Language) -> Vec<ExtractedClaim> { pub fn extract(&self, content: &str, language: Language) -> Vec<Observation> {
// Edge case: empty content // Edge case: empty content
if content.trim().is_empty() { if content.trim().is_empty() {
return Vec::new(); return Vec::new();

View File

@ -824,5 +824,4 @@ For complete configuration reference, see [configuration.md](configuration.md).
- [Comparison Modes Guide](comparison-modes.md) - Detailed guide for `--comparison` parameter - [Comparison Modes Guide](comparison-modes.md) - Detailed guide for `--comparison` parameter
- [Solo Developer Guide](guides/solo-developer-guide.md) - Quick start for individuals - [Solo Developer Guide](guides/solo-developer-guide.md) - Quick start for individuals
- [Enterprise Pilot Guide](guides/enterprise-pilot-guide.md) - Enterprise deployment - [Enterprise Pilot Guide](guides/enterprise-pilot-guide.md) - Enterprise deployment
- [Scale-Adaptive Thresholds](scale-adaptive-thresholds.md) - Threshold configuration for small teams - [Scale-Adaptive Thresholds](../advanced/scale-adaptive-thresholds.md) - Threshold configuration for small teams
- [Vision & Gaps](vision-gaps.md) - Architecture and implementation status

View File

@ -296,7 +296,7 @@ Automatically adjusts promotion thresholds based on team size:
- Small (6-25 projects): Patterns visible with 5+ projects - Small (6-25 projects): Patterns visible with 5+ projects
- Enterprise (501+): Unchanged behavior - Enterprise (501+): Unchanged behavior
See [scale-adaptive-thresholds.md](scale-adaptive-thresholds.md) for details. See [../advanced/scale-adaptive-thresholds.md](../advanced/scale-adaptive-thresholds.md) for details.
**Legacy Thresholds:** **Legacy Thresholds:**
@ -364,8 +364,6 @@ enabled = false # Pattern learning from scans
enabled = false # Auto-promotion to extractors (kill switch) enabled = false # Auto-promotion to extractors (kill switch)
``` ```
See [vision-gaps.md](vision-gaps.md) for implementation status.
--- ---
## Environment Variables ## Environment Variables
@ -408,6 +406,5 @@ No migration needed - just set `data_dir` to old path.
## See Also ## See Also
- [CLI Reference](cli-reference.md) - All commands and flags - [CLI Reference](cli-reference.md) - All commands and flags
- [Scale-Adaptive Thresholds](scale-adaptive-thresholds.md) - Threshold configuration - [Scale-Adaptive Thresholds](../advanced/scale-adaptive-thresholds.md) - Threshold configuration
- [Comparison Modes](comparison-modes.md) - Claim comparison operators - [Comparison Modes](comparison-modes.md) - Claim comparison operators
- [Vision Gaps](vision-gaps.md) - Implementation status

View File

@ -150,7 +150,7 @@ aphoria claims import claims-template.toml
### ⚠️ May Need Updates: ### ⚠️ May Need Updates:
- `applications/aphoria/dogfood/httpclient/` - Still uses shell script? - `applications/aphoria/dogfood/httpclient/` - Still uses shell script?
- `applications/aphoria/docs/getting-started/` - Check if mentions shell scripts - `applications/aphoria/docs/guides/` - Check if mentions shell scripts
- Other existing dogfood exercises - Other existing dogfood exercises
--- ---

View File

@ -635,7 +635,7 @@ LLM receives focused snippet + authored claim → structured verdict.
| Verification prompt: "Does this code satisfy this claim?" | ⬜ | | Verification prompt: "Does this code satisfy this claim?" | ⬜ |
| Structured output: `{ verdict: PASS|FAIL|UNCERTAIN, evidence: "..." }` | ⬜ | | Structured output: `{ verdict: PASS|FAIL|UNCERTAIN, evidence: "..." }` | ⬜ |
| Wire into `aphoria verify` Direction 2 (walk claims, verify in code) | ⬜ | | Wire into `aphoria verify` Direction 2 (walk claims, verify in code) | ⬜ |
| Maps to `Extractor::verify()` from vision-gaps | ⬜ | | Maps to `Extractor::verify()` concept (historical: vision-gaps-2026-02-08) | ⬜ |
**Token efficiency:** Snippet (~100 tokens) vs whole file (~2000 tokens) = 95% cost reduction per verification. **Token efficiency:** Snippet (~100 tokens) vs whole file (~2000 tokens) = 95% cost reduction per verification.
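
The 95% figure follows from the two token estimates. As a small worked sketch, assuming cost scales linearly with input tokens (`cost_reduction` is a hypothetical helper, not part of Aphoria):

```rust
/// Relative saving from sending a snippet instead of a whole file,
/// assuming cost is proportional to the number of input tokens.
/// Returns `None` when the whole-file token count is zero.
pub fn cost_reduction(snippet_tokens: u32, file_tokens: u32) -> Option<f64> {
    if file_tokens == 0 {
        return None;
    }
    Some(1.0 - f64::from(snippet_tokens) / f64::from(file_tokens))
}
```

For the estimates above, `cost_reduction(100, 2000)` evaluates to roughly 0.95, since 100 / 2000 = 0.05 of the original cost remains.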

View File

@ -1,6 +1,6 @@
# Aphoria # Aphoria
> **Product Vision:** This document describes Aphoria's product vision as a knowledge compounding system that learns from your organization's decisions. For the protocol-level vision (EAP standard), see [Protocol Vision](protocol_vision.md). > **Product Vision:** This document describes Aphoria's product vision as a knowledge compounding system that learns from your organization's decisions. For the protocol-level vision (EAP standard), see [Protocol Vision](docs/advanced/eap-protocol.md).
**Self-learning institutional knowledge that compounds with every commit.** **Self-learning institutional knowledge that compounds with every commit.**

View File

@ -1,488 +0,0 @@
# Arena Roadmap: The Simulation
> **Goal:** Incrementally evolve the simulator from Spine validation to a full Agent-Based Modeling environment.
> **Philosophy:** Make it run. Then add. Verify at every step.
> **Alignment:** Tracks main `roadmap.md` phases; exercises features as they land.
---
## Current State (Baseline)
The simulator (`stemedb-sim`) currently validates **Phase 1: The Spine**:
| Component | Status | What It Proves |
|-----------|--------|----------------|
| WAL Durability | ✓ Works | Writes persist |
| rkyv Serialization | ✓ Works | Roundtrip correctness |
| Ed25519 Signatures | ✓ Works | Sign on write, verify on read |
| Ingestor Pipeline | ✓ Works | WAL → KV async flow |
| Agent Identity | ✓ Works | Keypair generation |
**Run command:** `cargo run --bin stemedb-sim`
**What's NOW tested (Arena 1-3):**
- ✅ Queries via QueryEngine
- ✅ Lens resolution (Recency, VoteAwareConsensus)
- ✅ Lifecycle filtering
- ✅ Voting & consensus
- ✅ Query audit trail
- ✅ Materialized Views (Arena 3)
- ✅ Fast-path MV reads
- ✅ MV freshness under load
**What's NOT yet tested:**
- ❌ Concurrent agents (Arena 6)
- ❌ TrustRank (Arena 5)
- ❌ Time-travel queries (Arena 7)
---
## Arena Phases
### Arena 0: Make It Verifiable ✅ COMPLETE
*Goal: Add assertions for what the simulation proves. Currently it prints logs; we need programmatic success/failure.*
**Why first:** Without assertions, we don't know if later changes break things.
- [x] **0.1 Return Result from main()**: Change from print-and-exit to structured outcome.
- [x] Define `SimulationResult { assertions_written: u64, assertions_verified: u64, errors: Vec<SimulationError> }`.
- [x] Library function `run_simulation(config) -> Result<SimulationResult, SimulationSetupError>`.
- [x] Print summary at end, exit 0 on success, exit 1 on failure, exit 2 on setup error.
- [x] **0.2 Integration Test Wrapper**: Make the sim runnable as a test.
- [x] Add `crates/stemedb-sim/tests/smoke.rs` with 6 integration tests.
- [x] Assert on `SimulationResult` fields.
- [x] Run in CI via `cargo test -p stemedb-sim`.
**Exit Criteria:** `cargo test -p stemedb-sim` passes ✅
---
### Arena 1: Query Path (Exercises Phase 2 Features) ✅ COMPLETE
*Goal: Extend simulation to read via QueryEngine, not direct KV access.*
**Depends on:** Phase 2 complete (Query Engine, Lenses)
**Aligns with:** `roadmap.md` Phase 2 "The Lattice"
- [x] **1.1 Add Query Engine to Simulation**
- [x] Import `stemedb-query` crate.
- [x] After ingestion wait, query each assertion via `QueryEngine::execute()`.
- [x] Verify result matches what was written.
- [x] **1.2 Exercise Recency Lens**
- [x] Write 2 assertions for same subject+predicate with different timestamps.
- [x] Query with `lens=Recency`.
- [x] Verify most recent wins.
- [x] **1.3 Exercise Lifecycle Filtering**
- [x] Write one `Proposed` and one `Approved` assertion for same fact.
- [x] Query with `lifecycle=Approved`.
- [x] Verify only Approved returned.
- [x] **Use Case Alignment:** This is the JWT signing algorithm bug from Agile Agent Team.
- [x] **1.4 Query Audit Verification**
- [x] Set `X-Agent-Id` header (or equivalent in direct API call).
- [x] After queries, call `AuditStore::get_audits_for_agent()`.
- [x] Verify audit trail exists with correct contributing assertions.
- [x] **Use Case Alignment:** "What did the deployment agent query?"
**Exit Criteria:** Simulation writes, ingests, queries, and verifies all three scenarios. ✅
---
### Arena 2: Voting & Consensus (Exercises Phase 2 VoteStore) ✅ COMPLETE
*Goal: Simulate agents voting on assertions and resolving via VoteAwareConsensusLens.*
**Depends on:** Arena 1 complete, Phase 2 VoteStore
**Aligns with:** `roadmap.md` Phase 2 "The Ballot Box"
- [x] **2.1 Add Vote Creation to Agents**
- [x] `Agent::vote(assertion_hash, weight)` method.
- [x] Votes stored directly via VoteStore (bypasses WAL for now - see note below).
- [x] **2.2 Conflicting Assertions with Votes**
- [x] Scientist_Alpha asserts "Protein_X binds Receptor_Y" (confidence 0.8).
- [x] Scientist_Beta asserts "Protein_X binds Receptor_Z" (confidence 0.8).
- [x] Alpha votes for own assertion (weight 1.0).
- [x] Beta votes for own assertion (weight 1.0).
- [x] Third agent (Believer) votes for Alpha's assertion.
- [x] Query with `lens=VoteAwareConsensus`.
- [x] Verify Alpha's assertion wins (2 votes vs 1).
- [x] **2.3 Troll Vote Resistance**
- [x] Troll creates low-confidence assertion contradicting consensus.
- [x] Troll votes for own assertion.
- [x] Verify high-vote assertions still win.
**Exit Criteria:** Vote-based consensus correctly resolves conflicts. ✅
---
### Arena 2.5: Hardening (Critical Gap Remediation) ✅ COMPLETE
*Goal: Fix critical gaps discovered during Arena 0-2 review before adding more features.*
**Depends on:** Arena 2 complete
**Blocks:** Arena 3+ (don't add features on a shaky foundation)
**Rationale:** Gap analysis revealed 58/100 production readiness score with 3 critical blockers.
- [x] **2.5.1 Fix Vote Cache Race Condition** (P0 - CRITICAL)
- [x] VoteStore `put_vote()` uses `fetch_and_add_u64` + `compare_and_swap_f32` (vote_store.rs:182-189)
- [x] Two concurrent calls can lose updates (final count = N instead of N+1)
- [x] Solution: Add atomic increment or compare-and-swap operation
- [x] Add test: 100 concurrent `put_vote()` calls, verify final count
- [x] **File:** `crates/stemedb-storage/src/vote_store.rs:181-206`
- [x] **2.5.2 Add API Integration Tests** (P0 - CRITICAL)
- [x] Create `crates/stemedb-api/tests/http_integration.rs`
- [x] Test `POST /assertions` - create assertion via HTTP
- [x] Test `POST /votes` - submit vote via HTTP
- [x] Test `GET /query` - query with lens parameter
- [x] Test error responses (400 Bad Request, 500 Internal Error)
- [x] Test rate limiting via QuotaStore middleware
- [x] **Gap:** Entire HTTP layer is currently untested
- [x] **2.5.3 Add Crash Recovery Test** (P0 - CRITICAL)
- [x] Write assertions to WAL
- [x] Kill IngestWorker mid-step (simulate crash)
- [x] Restart IngestWorker with same WAL + KV store
- [x] Verify: cursor resumes correctly, no duplicate ingestion (worker.rs:1518)
- [x] Verify: all pre-crash data is recoverable
- [x] **Validates:** Durability claims in architecture.md
- [x] **2.5.4 Add Input Validation** (P1 - HIGH)
- [x] Max subject length: 1024 characters
- [x] Max predicate length: 1024 characters
- [x] Confidence range: 0.0 to 1.0, reject NaN/Inf
- [x] Vote weight: non-negative, reject NaN/Inf
- [x] Timestamp: reject values > current time + 1 hour (clock skew protection)
- [x] Add validation in `IngestWorker::validate_assertion()` and `validate_vote()`
- [x] **File:** `crates/stemedb-ingest/src/worker.rs`
- [x] **2.5.5 Replace Sleep Timers with Ingestion Sync** (P1 - HIGH)
- [x] `wait_until_ingested()` cursor-based polling replaces all sleeps
- [x] Add: `wait_for_ingestion(store, expected_count, timeout)` helper
- [x] Poll store until expected assertions exist or timeout
- [x] Replace all hardcoded sleeps in simulation
- [x] **Benefit:** Faster tests, deterministic behavior
- [x] **2.5.6 Fix Defensive Error Handling** (P2 - MEDIUM)
- [x] Input validation propagates errors properly
- [x] Change to propagate error or skip candidate with warning
- [x] `worker.rs:161-173`: Ambiguous EOF handling treats all I/O errors as "no data"
- [x] Distinguish true EOF from transient errors
**Exit Criteria:**
- [x] Vote cache is atomic (concurrent test passes)
- [x] API layer has integration tests (POST/GET work via HTTP)
- [x] Crash recovery is verified (no data loss on restart)
- [x] Input validation rejects malformed data
- [x] No hardcoded sleep timers in simulation
- [x] Production readiness score: 75+ (up from 58)
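The lost-update race from 2.5.1 and its "100 concurrent calls" test can be illustrated outside the codebase. The sketch below is written in Go rather than the project's Rust, and `voteCounter` is a hypothetical stand-in for the VoteStore cache, not its actual implementation. It only shows why a single atomic read-modify-write keeps the final count equal to the number of calls.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// voteCounter is a hypothetical stand-in for a cached vote count.
type voteCounter struct {
	count atomic.Uint64
}

// putVote records one vote with a single atomic read-modify-write,
// so concurrent callers cannot overwrite each other's increments.
func (c *voteCounter) putVote() {
	c.count.Add(1)
}

// runConcurrentVotes fires n concurrent putVote calls and returns the final count.
func runConcurrentVotes(n int) uint64 {
	var c voteCounter
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.putVote()
		}()
	}
	wg.Wait()
	return c.count.Load()
}

func main() {
	fmt.Println(runConcurrentVotes(100)) // prints 100
}
```

A separate load followed by a store would be the racy variant: two goroutines can read the same value and both write back N+1, losing one vote.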
---
### Arena 3: Materialized Views (Exercises Phase 2 Materializer) ✅ COMPLETE
*Goal: Verify fast-path MV reads work under simulation load.*
**Depends on:** Arena 2 complete, Phase 2 Materializer
**Aligns with:** `roadmap.md` Phase 2 "Materializer"
- [x] **3.1 Materializer Integration**
- [x] Spin up Materializer alongside Ingestor.
- [x] Wire `Notify` between IngestWorker and Materializer.
- [x] After ingestion, verify MV keys exist in store.
- [x] **3.2 Fast-Path Verification**
- [x] Query via QueryEngine with subject+predicate.
- [x] Log whether fast-path or slow-path was used (add debug output).
- [x] Verify MV winner matches slow-path result.
- [x] **3.3 MV Freshness Under Load**
- [x] Write 10 assertions in rapid succession.
- [x] Wait for materialization.
- [x] Verify MV reflects latest state.
- [x] **Aligns with:** Phase 2.5 "MV Staleness Detection"
**Exit Criteria:** Fast-path queries return correct results under load. ✅
---
### Arena 4: Agent Personas (First Strategy Differentiation) ✅ COMPLETE
*Goal: Agents behave differently based on persona. No longer uniform.*
**Depends on:** Arena 3 complete
**Aligns with:** Vision document "The Players"
- [x] **4.1 AgentStrategy Trait**
- [x] `AgentStrategy` trait with `decide_action()`, `base_confidence()`, `name()`.
- [x] `AgentAction` enum: Assert, Vote, Query, Skip.
- [x] `WorldState`, `StrategyMetrics`, `AgentSpec`, `StrategyType`.
- [x] `GroundTruth` hardcoded dataset (5 known-true facts).
- [x] **File:** `crates/stemedb-sim/src/strategy.rs`
- [x] **4.2 Scientist Strategy**
- [x] High base confidence (0.9).
- [x] Even ticks: assert next ground truth fact.
- [x] Odd ticks: vote for truth-aligned assertions, or assert if none found.
- [x] **4.3 Troll Strategy**
- [x] Low base confidence (0.4).
- [x] Contradicts existing assertions (NOT_value, negate, flip, FAKE_ref).
- [x] Skips on empty world (tick 0).
- [x] **4.4 Believer Strategy**
- [x] Medium base confidence (0.65).
- [x] Votes for highest-confidence assertion (weight 0.7).
- [x] Pure amplifier - never creates assertions, only votes.
- [x] **4.5 Strategy-Driven Tick Loop**
- [x] `SimulationConfig.agents: Vec<AgentSpec>` replaces `agent_count`.
- [x] Each tick: strategy decides action, executed by agent.
- [x] Per-strategy metrics tracked and logged.
- [x] Arena 4 verification: differentiated behavior check.
**Exit Criteria:** Different agent types produce different behaviors in logs. ✅
---
### Arena 5: TrustRank Integration (Exercises Phase 4 Foundation)
*Goal: Reputation updates based on agent behavior.*
**Depends on:** Arena 4 complete, TrustRank implemented
**Aligns with:** `roadmap.md` Phase 4 "TrustRank Engine"
- [ ] **5.1 Initialize TrustRank for Agents**
- [ ] Each agent starts with base TrustRank (e.g., 0.5).
- [ ] Store in TrustRankStore at simulation start.
- [ ] **5.2 Reputation Adjustment After Votes**
- [ ] When an assertion gains votes, increase author's TrustRank.
- [ ] When an assertion is contradicted by consensus, decrease author's TrustRank.
- [ ] Use `TrustRankStore::record_outcome()`.
- [ ] **5.3 TrustAwareAuthorityLens Verification**
- [ ] Two assertions from different agents, same confidence.
- [ ] Agent with higher TrustRank should win via `TrustAwareAuthorityLens`.
- [ ] **Use Case Alignment:** "Expert vs. junior weighting" from Agile Agent Team.
- [ ] **5.4 Troll Reputation Decay**
- [ ] After 100 ticks, verify Troll's TrustRank has decreased.
- [ ] Verify Scientist's TrustRank has increased.
- [ ] **Success Criteria:** "Trust clusters form naturally without hardcoded rules."
**Exit Criteria:** TrustRank diverges based on behavior; Troll reputation tanks.
---
### Arena 6: Concurrent Agents (Performance Validation)
*Goal: Move from sequential to parallel agent execution.*
**Depends on:** Arena 5 complete
**Aligns with:** Vision "1000 concurrent agents without locking"
- [ ] **6.1 Tokio Task Per Agent**
- [ ] Wrap each agent's tick in `tokio::spawn()`.
- [ ] Use `Arc<Mutex<Journal>>` for WAL access (already in place).
- [ ] Run 10 agents concurrently.
- [ ] **6.2 Scale to 100 Agents**
- [ ] Parameterize agent count.
- [ ] Run with 100 agents for 50 ticks.
- [ ] Verify no deadlocks, no data corruption.
- [ ] **6.3 Contention Metrics**
- [ ] Add timing around WAL lock acquisition.
- [ ] Log P50/P99 latencies.
- [ ] Identify bottlenecks if any.
- [ ] **6.4 Target: 1000 Agents**
- [ ] Run with 1000 agents (stretch goal).
- [ ] May require connection pooling or batching.
- [ ] Document findings.
**Exit Criteria:** 100 agents run concurrently without errors.
---
### Arena 7: Time-Travel & Epochs (Exercises Phase 3 Features)
*Goal: Validate temporal queries and epoch supersession.*
**Depends on:** Arena 6 complete, Phase 3 Time-Travel + EpochAwareLens
**Aligns with:** `roadmap.md` Phase 3 "Time-Travel Engine", Phase 2.5 "EpochAwareLens"
- [ ] **7.1 Time-Travel Query Verification**
- [ ] At tick 50, record timestamp T1.
- [ ] At tick 100, write a new assertion superseding an old one.
- [ ] Query with `as_of=T1`.
- [ ] Verify result reflects tick-50 state, not tick-100 state.
- [ ] **Use Case Alignment:** "What was the state of knowledge at 9pm?"
- [ ] **7.2 Epoch Creation and Supersession**
- [ ] Create epoch "v1" at tick 0.
- [ ] Create epoch "v2" superseding "v1" at tick 50.
- [ ] Assertions referencing "v1" should be filtered by EpochAwareLens.
- [ ] **Use Case Alignment:** "Security team migrates from RS256 to ES256."
- [ ] **7.3 Epoch Cascade Verification**
- [ ] Chain: v3 supersedes v2 supersedes v1.
- [ ] Query with EpochAwareLens.
- [ ] Only v3 assertions visible.
**Exit Criteria:** Historical queries and epoch filtering work correctly.
---
### Arena 8: Skeptic & Conflict (Exercises Phase 3 Lenses)
*Goal: Surface disagreement, measure consensus.*
**Depends on:** Arena 7 complete, Phase 3 Skeptic Lens + Conflict Score
**Aligns with:** `roadmap.md` Phase 3C "Skeptic Lens", Phase 3A.2 "Conflict Score"
- [ ] **8.1 High-Conflict Scenario**
- [ ] 3 Scientist agents assert conflicting values for same fact.
- [ ] Each votes for own assertion.
- [ ] Query with `lens=Skeptic`.
- [ ] Verify `conflict_score` is high (> 0.5).
- [ ] **8.2 Low-Conflict Scenario**
- [ ] 3 Scientists assert same value (agreement).
- [ ] Query with `lens=Skeptic`.
- [ ] Verify `conflict_score` is low (< 0.2).
- [ ] **8.3 Skeptic Surfaces Outlier**
- [ ] Consensus is A, one dissenter says B.
- [ ] Skeptic lens returns B (the controversial position).
- [ ] **Use Case Alignment:** Financial Due Diligence "disagreement is the information."
**Exit Criteria:** Conflict score accurately reflects disagreement.
---
### Arena 9: Full Gameplay Loop (The Vision)
*Goal: Run the complete vision scenario end-to-end.*
**Depends on:** Arena 8 complete, all Phase 3 features
**Aligns with:** `simulation-vision.md` "The Gameplay Loop"
- [ ] **9.1 Ground Truth Injection**
- [ ] Load ground truth from YAML config.
- [ ] Scientists read ground truth, assert facts.
- [ ] **9.2 The 5-Tick Scenario**
- [ ] Tick 1: Scientist asserts "Protein_X binds Receptor_Y".
- [ ] Tick 2: Troll forks with "Protein_X binds Nothing".
- [ ] Tick 3: Believer queries, votes for Scientist.
- [ ] Tick 4: TrustRank updates (Scientist up, Troll down).
- [ ] Tick 5: Verify consensus via lens.
- [ ] **9.3 Extended Run (1000 Ticks)**
- [ ] Run full scenario for 1000 ticks.
- [ ] Track metrics:
- `truth_convergence`: % of facts matching ground truth.
- `reputation_distribution`: Scientist vs Troll ranks.
- `fork_depth_max`: Deepest contradiction chain.
- [ ] **9.4 Success Criteria Verification**
- [ ] ✓ Truth survives: High-reputation assertions outlive spam.
- [ ] ✓ Lenses work: Consensus lens filters Troll noise.
- [ ] ✓ Performance: 1000 ticks complete in < 30 seconds.
- [ ] ✓ Emergence: Trust clusters form naturally.
**Exit Criteria:** All 4 success criteria from vision document pass.
---
## Alignment with Main Roadmap
| Arena Phase | Exercises Roadmap Phase | Key Features Validated |
|-------------|------------------------|------------------------|
| Arena 0 ✅ | - | Test infrastructure |
| Arena 1 ✅ | Phase 2 | QueryEngine, Lenses, Lifecycle, Query Audit |
| Arena 2 ✅ | Phase 2 | VoteStore, VoteAwareConsensusLens |
| Arena 2.5 ✅ | - (Hardening) | Race conditions, API tests, crash recovery, input validation |
| Arena 3 ✅ | Phase 2 | Materializer, Fast-Path MV, MV Freshness |
| Arena 4 ✅ | - | Agent personas: Scientist, Troll, Believer (simulator-only) |
| Arena 5 | Phase 4 | TrustRank, TrustAwareAuthorityLens |
| Arena 6 | Phase 4 | Concurrency, Performance |
| Arena 7 | Phase 2.5 + Phase 3 | Time-Travel, Epochs, EpochAwareLens |
| Arena 8 | Phase 3 | Skeptic Lens, Conflict Score |
| Arena 9 | All | Full integration |
---
## Alignment with Use Cases
| Use Case | Arena Phase That Validates It |
|----------|-------------------------------|
| **Agile Agent Team** | |
| - Lifecycle filtering | Arena 1.3 |
| - Query audit trail | Arena 1.4 |
| - Time-travel debugging | Arena 7.1 |
| - Expert weighting | Arena 5.3 |
| - Persistent learning | Arena 5.4 (TrustRank) |
| **Financial Due Diligence** | |
| - Conflict detection | Arena 8.1, 8.3 |
| - Time-travel | Arena 7.1 |
| - Epoch cascades | Arena 7.2, 7.3 |
| **Consumer Health** | |
| - Source-class hierarchy | Phase 3 dependency (not in Arena yet) |
| - Layered consensus | Phase 3 dependency |
---
## Development Cadence
| Week | Focus | Deliverable |
|------|-------|-------------|
| 1 | Arena 0 | CI-runnable simulation ✅ |
| 2 | Arena 1 | Query path verified ✅ |
| 3 | Arena 2 | Voting verified ✅ |
| **4** | **Arena 2.5** | **Hardening: race fix, API tests, crash recovery** ✅ |
| 5 | Arena 3 | Materializer + MVs verified ✅ |
| 6 | Arena 4 | Agent personas differentiated ✅ |
| 7-8 | Arena 5-6 | TrustRank + concurrency |
| 9-10 | Arena 7-8 | Time-travel + Skeptic |
| 11-12 | Arena 9 | Full gameplay loop |
---
## Metrics to Track
Once Arena 6+ is complete, export these to logs (and eventually Prometheus):
| Metric | Description | Success Target |
|--------|-------------|----------------|
| `truth_convergence` | % of facts matching ground truth | > 95% |
| `troll_reputation` | Troll agent TrustRank at end | < 0.2 |
| `scientist_reputation` | Scientist agent TrustRank at end | > 0.8 |
| `fork_depth_max` | Deepest contradiction chain | < 10 |
| `p99_write_latency_ms` | Write path latency | < 10ms |
| `p99_query_latency_ms` | Query path latency | < 50ms |
| `concurrent_agents` | Max concurrent agents without errors | 1000 |
---
## Non-Goals (Kept Simple)
These are explicitly out of scope for the Arena:
- **Prometheus/Grafana integration** - Logs suffice for Phase 3.
- **YAML scenario config** - Hardcoded scenarios are fine until Arena 9.
- **Full chaos injection (network partitions, node kills)** - Basic crash recovery in 2.5; advanced chaos deferred to Phase 4+.
- **External agent frameworks (ADK-Go)** - Simulator uses Rust agents.
**Note:** HTTP API testing was previously a non-goal but is now addressed in Arena 2.5.2 due to critical gap discovery.
---
## Next Step
Arena 0-4 and Arena 2.5 are complete. Proceed to **Arena 5: TrustRank Integration**.
```bash
# Verify Arena 0 + 1 + 2 + 2.5 + 3 + 4 still work:
cargo test -p stemedb-sim
# Binary also works (shows persona differentiation):
RUST_LOG=info cargo run --bin stemedb-sim
```

docs/README.md Normal file

@ -0,0 +1,158 @@
# Episteme Documentation
Complete documentation for Episteme (StemeDB) - a probabilistic knowledge graph database.
---
## Getting Started
### First Steps
- **[Quick Start](../quickstart.md)** - Get running in 5 minutes
- **[What is Episteme?](../what-is-episteme.md)** - Core concepts and examples
- **[Architecture Overview](../architecture.md)** - Technical design
- **[Vision](../vision.md)** - Product philosophy
### Use Cases
- **[Use Cases Index](../use-cases/README.md)** - All scenarios
- **[Consumer Health Intelligence](../use-cases/consumer-health-intelligence.md)** - Drug safety tracking
- **[Financial Due Diligence](../use-cases/financial-due-diligence.md)** - M&A analysis
- **[Agile Agent Team](../use-cases/agile-agent-team.md)** - AI collaboration
- **[GLP-1 Living Review](../use-cases/glp1-living-review.md)** - Real-world example
---
## Application Development
### Building on Episteme
- **[App Concepts Index](./app-concepts/index.md)** - Application layer patterns
- **[Consumer Health Vertical](./app-concepts/consumer-health.md)** - Healthcare application design
- **[Boundary Principles](./app-concepts/index.md#the-boundary-principle)** - What Episteme does vs. what your app does
### SDKs & Integration
- **[Go SDK Guide](./sdk/go-sdk.md)** - HTTP client with Ed25519 signing
- **[Go Usage Examples](./sdk/go-usage-guide.md)** - Fluent builders and patterns
- **[ADK-Go Reference](./references/go-adk/reference-guide.md)** - AI agent integration
- **[ADK-Go Examples](./references/go-adk/)** - Agent patterns (chatbot, researcher, etc.)
---
## Technical References
### Core Architecture
- **[Data Structures](./data-structures.md)** - Assertion, Vote, Epoch, MaterializedView
- **[Consistency Model](./consistency-model.md)** - Conflict resolution and convergence
- **[Governance Models](./specs/governance-models.md)** - Authority hierarchies
### Advanced Topics
- **[Distributed Write Path](./research/distributed-write-path.md)** - Clustering and sharding
- **[WAL Crash Recovery](./research/wal-crash-recovery-research.md)** - Storage durability
- **[Concept Hierarchy](./specs/concept-hierarchy.md)** - Subject/predicate organization
- **[Visual Hash Query](./specs/visual-hash-query.md)** - Screenshot-based retrieval
### Specifications
- **[RFCs Index](./rfcs/README.md)** - All RFCs
- **[RFC-001: Enterprise Policy Aliases](./rfcs/rfc-001-enterprise-policy-aliases.md)** - Multi-tenant policy
---
## Development Guides
### Setting Up
- **[Local Development Setup](../.claude/guides/local/setup.md)** - Environment configuration
- **[Testing Guide](../.claude/guides/local/testing.md)** - Running test suites
- **[Quality Checks](../.claude/guides/local/quality-checks.md)** - Pre-commit hooks
- **[Coding Guidelines](../.claude/guides/coding-guidelines.md)** - Rust standards
### How-To Guides
- **[Adding a Domain](./guides/adding-a-domain.md)** - Extend ontology for new verticals
- **[Implementation Audit Checklist](./guides/implementation-audit-checklist.md)** - Production readiness
- **[Writing UAT Reports](../.claude/guides/local/uat-reports.md)** - User acceptance testing
### Integration Guides
- **[AI Coding Assistant Integration](../.claude/guides/integrations/ai-coding-assistant-integration.md)** - Claude/Cursor setup
- **[ADK-Go + Episteme](../.claude/guides/integrations/adk-go-episteme.md)** - Building agents
---
## Demos & Examples
- **[VulnBank Demo](./demo/vulnbank/README.md)** - Security vulnerability tracking example
---
## Project Planning
### Current Work
- **[Roadmap](../roadmap.md)** - Current and planned features
- **[Roadmap Archive](../roadmap-archive.md)** - Completed phases
### Specifications & Planning
- **[Aphoria Claims API](./specs/aphoria-claims-api.md)** - Claims management design
- **[Ontology Layer: Medical Vertical](./specs/ontology-layer-medical-vertical.md)** - Healthcare domain modeling
---
## About Episteme
### Vision & Strategy
- **[Market Position](./about/market-position.md)** - Competitive landscape and thesis
- **[Simulation Vision](./about/simulation-vision.md)** - Agent-based validation
### Legal
- **[Patent Disclosure](./legal/patent-disclosure.md)** - IP documentation
- **[Patent Figures](./legal/patent-figures.md)** - Visual diagrams
- **[Patent Specification](./legal/patent-specification.md)** - Technical claims
---
## Tools & Infrastructure
- **[Grafana Dashboard](../tools/grafana/)** - Metrics visualization
- **[Presentations](./presentations/README.md)** - Slide decks and demos
---
## Archive
Historical documentation and snapshots:
- **[Documentation Updates Summary](./archive/DOCUMENTATION_UPDATES.md)** - 2026-02-08 audit
- **[Corpus Quick Start](./archive/CORPUS-QUICK-START.md)** - Legacy setup guide
---
## Quick Reference: What Goes Where
| If you need to... | Episteme provides... | You build... |
|-------------------|---------------------|--------------|
| Store conflicting facts | Assertion type, append-only DAG | Nothing - just POST assertions |
| Resolve conflicts | Lenses (Recency, Consensus, Skeptic) | Lens selection logic |
| Query historical state | `as_of` parameter | Time-travel UI |
| Track changes | `since` parameter + MV changelog | Notification system |
| Weight by source authority | `source_class` field + decay | Tier classifier |
| Detect emerging signals | Skeptic Lens + conflict_score | Gardener (threshold logic) |
| Show per-tier consensus | Layered Consensus Lens | Dashboard UI |
| Extract claims from papers | Nothing - pre-assertion transform | NLP pipeline |
| Sign assertions | Signature verification | Agent wallet / key management |
| Generate summaries | Structured query responses | LLM summarizer |
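As a rough illustration of how the read-time parameters in the table combine, the Go sketch below builds a query URL. The `lens`, `as_of`, and `since` names come from the table and `GET /query` appears in the API test list elsewhere in this changeset; the `subject` and `predicate` parameter names, the Unix-seconds encoding, and the `QueryOptions` and `BuildQueryURL` helpers are assumptions made for the example, not the documented API.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// QueryOptions collects the read-time knobs from the table above.
// The field-to-parameter mapping is an assumption for illustration.
type QueryOptions struct {
	Subject   string
	Predicate string
	Lens      string // e.g. "Recency", "Consensus", "Skeptic"
	AsOf      int64  // Unix seconds; 0 means "current state" (assumed encoding)
	Since     int64  // Unix seconds; 0 means "no changelog filter" (assumed encoding)
}

// BuildQueryURL renders the options as a GET /query URL.
func BuildQueryURL(base string, o QueryOptions) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/query"
	q := url.Values{}
	q.Set("subject", o.Subject)
	q.Set("predicate", o.Predicate)
	if o.Lens != "" {
		q.Set("lens", o.Lens)
	}
	if o.AsOf != 0 {
		q.Set("as_of", strconv.FormatInt(o.AsOf, 10))
	}
	if o.Since != 0 {
		q.Set("since", strconv.FormatInt(o.Since, 10))
	}
	u.RawQuery = q.Encode() // Encode sorts parameters by key
	return u.String(), nil
}

func main() {
	u, err := BuildQueryURL("http://localhost:18180", QueryOptions{
		Subject: "Protein_X", Predicate: "binds", Lens: "Skeptic", AsOf: 1700000000,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

Keeping lens choice and time bounds as plain query parameters is what lets the application layer own "lens selection logic" and the "time-travel UI" without changing stored data.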
---
## Contributing
See **[CONTRIBUTING.md](../CONTRIBUTING.md)** for:
- Documentation standards
- File organization principles
- How to add new guides
- Review process
---
## Navigation Tips
- **Breadth-first**: Start with [What is Episteme?](../what-is-episteme.md) → [Use Cases](../use-cases/README.md) → [Quick Start](../quickstart.md)
- **Depth-first**: Start with [Architecture](../architecture.md) → [Data Structures](./data-structures.md) → [Distributed Write Path](./research/distributed-write-path.md)
- **Hands-on**: Start with [Quick Start](../quickstart.md) → [Go SDK](./sdk/go-sdk.md) → [App Concepts](./app-concepts/index.md)
---
**[← Back to Main README](../README.md)**

130
docs/legal/README.md Normal file

@ -0,0 +1,130 @@
# Legal Documentation
Patent and intellectual property documentation for Episteme (StemeDB).
---
## Patent Documentation
Episteme's core innovations are documented for patent purposes. These documents describe the technical inventions and their implementation.
### Documents
- **[Patent Disclosure](./patent-disclosure.md)** - Formal invention disclosure document
- **[Patent Specification](./patent-specification.md)** - Technical specification and claims
- **[Patent Figures](./patent-figures.md)** - Visual diagrams and illustrations
---
## Overview
### Core Innovations
1. **Content-Addressed Knowledge Graph**
- Append-only Merkle DAG for conflicting assertions
- BLAKE3 content addressing for deduplication
- Cryptographic signature verification
2. **Read-Time Conflict Resolution (Lenses)**
- Multiple resolution strategies without data mutation
- Source-class aware authority hierarchies
- Time-travel queries via materialized views
3. **Probabilistic Materialization**
- O(1) reads via pre-computed consensus
- Background compaction and staleness detection
- Weighted voting with TrustRank integration
4. **Epistemic Provenance**
- Full audit trail of reasoning chains
- Invalidation cascades via DAG structure
- Source retraction with downstream impact analysis
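The deduplication property of content addressing from item 1 can be shown with a toy store. This is a minimal sketch under stated assumptions: it uses SHA-256 from the Go standard library as a stand-in for BLAKE3 (which is not in the standard library), and `Store`, `NewStore`, and `Put` are hypothetical names, not Episteme types.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Store is a toy append-only, content-addressed store.
type Store struct {
	byHash map[string][]byte
	order  []string // addresses in first-seen order
}

func NewStore() *Store { return &Store{byHash: map[string][]byte{}} }

// Put stores content under its digest. It returns the hex address and
// whether the content was new; identical bytes always map to one entry.
func (s *Store) Put(content []byte) (string, bool) {
	sum := sha256.Sum256(content) // stand-in for BLAKE3
	addr := hex.EncodeToString(sum[:])
	if _, ok := s.byHash[addr]; ok {
		return addr, false
	}
	s.byHash[addr] = append([]byte(nil), content...) // defensive copy
	s.order = append(s.order, addr)
	return addr, true
}

// Len reports how many distinct contents are stored.
func (s *Store) Len() int { return len(s.order) }

func main() {
	s := NewStore()
	s.Put([]byte("Protein_X binds Receptor_Y"))
	s.Put([]byte("Protein_X binds Receptor_Y")) // duplicate, not stored again
	s.Put([]byte("Protein_X binds Receptor_Z"))
	fmt.Println(s.Len()) // prints 2
}
```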
---
## Key Claims
### 1. Multi-Truth Storage
**Problem Solved:** Traditional databases force resolution at write time, losing disagreement.
**Innovation:** Assertions are immutable proposals; resolution happens at read time via Lenses.
```
Traditional DB: Episteme:
Write → Overwrite Write → Append
Read → Single Value Read → Lens → Resolved Value
```
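A small in-memory sketch can make the append-then-resolve flow concrete. Everything here is illustrative: `Assertion` is a simplified record, `Log` is a toy append-only list, and `HighestConfidence` is a made-up lens added only to show that two lenses can read the same data differently. `Recency` mirrors the lens of that name mentioned elsewhere in this changeset, but the logic is an assumption.

```go
package main

import "fmt"

// Assertion is a simplified, hypothetical record.
type Assertion struct {
	Subject, Predicate, Object string
	Confidence                 float64
	Timestamp                  int64
}

// Lens picks one winner from the candidates without mutating them.
type Lens func(candidates []Assertion) (Assertion, bool)

// Log is an append-only list: writes never overwrite earlier entries.
type Log struct{ entries []Assertion }

func (l *Log) Append(a Assertion) { l.entries = append(l.entries, a) }

// Read gathers every assertion for the key and lets the lens resolve them.
func (l *Log) Read(subject, predicate string, lens Lens) (Assertion, bool) {
	var candidates []Assertion
	for _, a := range l.entries {
		if a.Subject == subject && a.Predicate == predicate {
			candidates = append(candidates, a)
		}
	}
	return lens(candidates)
}

// pick returns the candidate preferred by better; earlier entries win ties.
func pick(c []Assertion, better func(a, b Assertion) bool) (Assertion, bool) {
	if len(c) == 0 {
		return Assertion{}, false
	}
	best := c[0]
	for _, a := range c[1:] {
		if better(a, best) {
			best = a
		}
	}
	return best, true
}

// Recency prefers the newest assertion.
func Recency(c []Assertion) (Assertion, bool) {
	return pick(c, func(a, b Assertion) bool { return a.Timestamp > b.Timestamp })
}

// HighestConfidence is an illustrative lens, not one named in the source.
func HighestConfidence(c []Assertion) (Assertion, bool) {
	return pick(c, func(a, b Assertion) bool { return a.Confidence > b.Confidence })
}

func main() {
	var log Log
	log.Append(Assertion{"Protein_X", "binds", "Receptor_Y", 0.8, 1})
	log.Append(Assertion{"Protein_X", "binds", "Receptor_Z", 0.6, 2})

	newest, _ := log.Read("Protein_X", "binds", Recency)
	surest, _ := log.Read("Protein_X", "binds", HighestConfidence)
	fmt.Println(newest.Object, surest.Object) // prints Receptor_Z Receptor_Y
}
```

Both assertions stay in the log; only the lens passed at read time decides which one is returned.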
### 2. Source-Class Hierarchy
**Problem Solved:** All sources treated equally, regardless of authority.
**Innovation:** Structural source tiers (Regulatory > Clinical > Expert > Anecdotal) with decay curves.
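One way to picture tiered weights with decay is the sketch below. The tier order comes from the text above; the base weights, half-lives, and the choice of an exponential half-life curve are all assumptions chosen for the example, and `SourceClass` and `EffectiveWeight` are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

// SourceClass pairs a tier with an assumed base weight and half-life.
type SourceClass struct {
	Name         string
	BaseWeight   float64 // illustrative value, not from the source
	HalfLifeDays float64 // illustrative value, not from the source
}

var (
	Regulatory = SourceClass{"Regulatory", 1.00, 3650}
	Clinical   = SourceClass{"Clinical", 0.75, 1825}
	Expert     = SourceClass{"Expert", 0.50, 365}
	Anecdotal  = SourceClass{"Anecdotal", 0.25, 30}
)

// EffectiveWeight halves the base weight once per elapsed half-life.
func EffectiveWeight(c SourceClass, ageDays float64) float64 {
	return c.BaseWeight * math.Pow(0.5, ageDays/c.HalfLifeDays)
}

func main() {
	for _, c := range []SourceClass{Regulatory, Clinical, Expert, Anecdotal} {
		fmt.Printf("%-10s fresh=%.3f after 1y=%.3f\n",
			c.Name, EffectiveWeight(c, 0), EffectiveWeight(c, 365))
	}
}
```

With these assumed numbers an anecdotal report loses half its weight in a month, while a regulatory source barely moves over a year.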
### 3. Invalidation Cascades
**Problem Solved:** Retracted sources leave orphaned decisions.
**Innovation:** Merkle DAG enables instant downstream identification via content addressing.
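The downstream walk can be sketched as a breadth-first traversal over a reverse index. This is an illustration only: node IDs are plain strings standing in for content addresses, and `Graph`, `AddDerived`, and `Downstream` are hypothetical helpers rather than Episteme APIs.

```go
package main

import (
	"fmt"
	"sort"
)

// Graph records, for each node, which nodes were derived from it.
// Node IDs stand in for content addresses.
type Graph struct {
	children map[string][]string
}

func NewGraph() *Graph { return &Graph{children: map[string][]string{}} }

// AddDerived registers child as depending on each parent.
func (g *Graph) AddDerived(child string, parents ...string) {
	for _, p := range parents {
		g.children[p] = append(g.children[p], child)
	}
}

// Downstream returns every node reachable from the retracted one,
// i.e. everything that needs re-examination, in sorted order.
func (g *Graph) Downstream(retracted string) []string {
	seen := map[string]bool{retracted: true}
	queue := []string{retracted}
	var out []string
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		for _, c := range g.children[n] {
			if !seen[c] {
				seen[c] = true
				out = append(out, c)
				queue = append(queue, c)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	g := NewGraph()
	g.AddDerived("claim_a", "study_1")
	g.AddDerived("claim_b", "study_1", "study_2")
	g.AddDerived("decision", "claim_a")
	fmt.Println(g.Downstream("study_1")) // prints [claim_a claim_b decision]
}
```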
### 4. Weighted Consensus Without Byzantine Fault Tolerance
**Problem Solved:** Consensus algorithms require majority agreement, which fails when truth is contested.
**Innovation:** Vote-weighted resolution with TrustRank (PageRank for agents), enabling dissent visibility.
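A hedged sketch of vote-weighted resolution follows. It does not compute TrustRank; trust scores are taken as given inputs, and the `weight * trust` scoring rule, the default trust of 1.0, and the `Vote`, `Tally`, and `Resolve` names are assumptions made for illustration. The point is that the full ranking is returned, so the losing positions stay visible.

```go
package main

import (
	"fmt"
	"sort"
)

// Vote is one agent's weighted support for an assertion.
type Vote struct {
	Agent     string
	Assertion string
	Weight    float64
}

// Tally is the accumulated score for one assertion.
type Tally struct {
	Assertion string
	Score     float64
}

// Resolve sums weight*trust per assertion and returns every tally,
// highest score first. Entries after the first are the dissent.
func Resolve(votes []Vote, trust map[string]float64) []Tally {
	scores := map[string]float64{}
	for _, v := range votes {
		t, ok := trust[v.Agent]
		if !ok {
			t = 1.0 // assumed default for agents without a trust score
		}
		scores[v.Assertion] += v.Weight * t
	}
	out := make([]Tally, 0, len(scores))
	for a, s := range scores {
		out = append(out, Tally{a, s})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].Assertion < out[j].Assertion // deterministic tie-break
	})
	return out
}

func main() {
	votes := []Vote{
		{"alpha", "binds Receptor_Y", 1.0},
		{"beta", "binds Receptor_Z", 1.0},
		{"believer", "binds Receptor_Y", 0.7},
	}
	for _, t := range Resolve(votes, nil) {
		fmt.Printf("%s %.1f\n", t.Assertion, t.Score)
	}
}
```

Passing a trust map that down-weights `alpha` and `believer` flips the winner without touching the stored votes.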
---
## Prior Art Analysis
Episteme differs from existing systems:
| System | Limitation | Episteme Innovation |
|--------|-----------|---------------------|
| **Traditional RDBMS** | Single truth per cell | Multi-truth via append-only DAG |
| **Version Control (Git)** | Branches require merge | Lenses enable coexistence |
| **Graph Databases (Neo4j)** | No conflict resolution | Lenses + Source-Class hierarchy |
| **Vector DBs (Pinecone)** | Semantic similarity, not truth | Authority-weighted resolution |
| **Knowledge Graphs (Wikidata)** | Manual curation | Automated ingestion + voting |
| **Blockchain** | Consensus required | Dissent preserved, resolution optional |
---
## Filing Status
**Status:** Disclosure phase (not yet filed)
**Next Steps:**
1. Patent attorney review
2. Claims refinement
3. USPTO filing
---
## Confidentiality
These documents are:
- ✅ **Public**: Available in open-source repository
- ⚠️ **Pre-Filing**: Not yet filed with USPTO
- 📋 **Disclosure**: Establishes prior art date
**Important:** Public disclosure creates prior art. In the US, file within 12 months of the first public release to stay inside the grace period.
---
## Related Documents
- **[Architecture](../../architecture.md)** - Technical implementation
- **[Vision](../../vision.md)** - Product philosophy
- **[What is Episteme?](../../what-is-episteme.md)** - Concept overview
---
## License
The Episteme codebase is open-source. Patent rights are retained by the project maintainers but licensed for use with the open-source implementation.
---
**[← Back to Documentation Index](../README.md)**


@ -0,0 +1,227 @@
# ADK-Go Reference Documentation
Google ADK-Go integration patterns and agent examples for Episteme.
---
## Overview
The Google Agent Development Kit (ADK) for Go enables building AI agents that can use tools to interact with external systems. These guides show how to integrate ADK-Go agents with Episteme for knowledge graph-powered AI applications.
---
## Core Reference
- **[Reference Guide](./reference-guide.md)** - Complete ADK-Go integration documentation
- **[Research Notes](./research.md)** - Research and design considerations
---
## Agent Examples
### Basic Agents
- **[Chatbot Agent](./agent-chatbot.md)** - Simple conversational agent with Episteme integration
- **[Sales Agent](./agent-sales-agent.md)** - Sales automation with knowledge graph lookups
### Advanced Agents
- **[Researcher Agent](./agent-researcher.md)** - Research assistant with claim tracking and validation
- **[Multi-Chatters Agent](./agent-multi-chatters.md)** - Multi-agent conversation system
- **[Planning Facilitator](./agent-planning-facilitator.md)** - Planning and coordination agent
---
## Key Concepts
### Tool Integration
ADK-Go agents use "tools" to interact with Episteme:
```go
// Query knowledge graph
func QueryKnowledge(ctx context.Context, subject, predicate string, lens string) (string, error) {
	resp, err := episteme.Query(ctx, &QueryParams{
		Subject:   subject,
		Predicate: predicate,
		Lens:      lens,
	})
	if err != nil {
		// Surface the failure instead of formatting a nil response
		return "", err
	}
	return formatForAgent(resp), nil
}

// Assert new facts
func AssertFact(ctx context.Context, subject, predicate, object string) error {
	return episteme.Assert(ctx, &AssertionRequest{
		Subject:    subject,
		Predicate:  predicate,
		Object:     ObjectText(object),
		Confidence: 0.9,
	})
}
```
### Agent Patterns
1. **Query-First**: Agents query Episteme before making decisions
2. **Claim-Backed**: Agents cite sources from knowledge graph
3. **Conflict-Aware**: Agents surface disagreement instead of hiding it
4. **Auditable**: All agent queries tracked in audit trail
---
## Integration Guide
### Prerequisites
```bash
go get google.golang.org/genai
go get github.com/orchard9/stemedb-go/steme
```
### Basic Setup
```go
import (
	"google.golang.org/genai"
	"github.com/orchard9/stemedb-go/steme"
)

// Initialize Episteme client
signer, _ := steme.GenerateSigner()
episteme := steme.NewClient("http://localhost:18180", signer)

// Create agent with tools
agent := genai.NewAgent(
	genai.WithModel("gemini-2.0-flash"),
	genai.WithTools(
		QueryKnowledgeTool(episteme),
		AssertFactTool(episteme),
	),
)
```
**[→ Full Integration Guide](../../.claude/guides/integrations/adk-go-episteme.md)**
---
## Use Cases
| Agent Type | Use Case | Example |
|------------|----------|---------|
| **Chatbot** | Customer support with knowledge base | Answer questions with cited sources |
| **Researcher** | Literature review and synthesis | Track conflicting studies, surface disagreement |
| **Sales Agent** | Lead qualification and outreach | Query customer history, log interactions |
| **Planning Facilitator** | Project planning and coordination | Track decisions, maintain context across sessions |
| **Multi-Chatters** | Multi-agent collaboration | Agents share knowledge graph, resolve conflicts |
---
## Best Practices
### 1. Query Before Assert
Always query existing knowledge before creating new assertions:
```go
// Check whether the fact already exists
existing, err := episteme.Query(ctx, &QueryParams{
	Subject:   subject,
	Predicate: predicate,
})
if err != nil {
	return err
}
if existing.Winner == nil {
	// No existing fact, safe to assert
	if _, err := episteme.Assert(ctx, assertion); err != nil {
		return err
	}
}
```
### 2. Use Appropriate Lenses
Choose the right lens for your use case:
- **Consensus**: General queries where majority view matters
- **Recency**: Real-time data (prices, status)
- **Authority**: Regulatory or official information
- **Skeptic**: Research where disagreement is valuable
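
The lens guidance above can be captured in a small helper so that agents pick a lens consistently. This is a minimal sketch: the `QueryKind` classification and the lowercase lens strings are assumptions made for illustration and may not match the identifiers the SDK exposes.

```go
package main

import "fmt"

// Lens names mirror the four lenses listed above. The string values are
// illustrative and are not taken from the SDK.
type Lens string

const (
	LensConsensus Lens = "consensus"
	LensRecency   Lens = "recency"
	LensAuthority Lens = "authority"
	LensSkeptic   Lens = "skeptic"
)

// QueryKind is a hypothetical classification of what the agent is asking.
type QueryKind int

const (
	KindGeneral QueryKind = iota
	KindRealTime
	KindRegulatory
	KindResearch
)

// LensFor maps a query kind to the lens suggested in the list above.
// Unknown kinds fall back to Consensus.
func LensFor(kind QueryKind) Lens {
	switch kind {
	case KindRealTime:
		return LensRecency
	case KindRegulatory:
		return LensAuthority
	case KindResearch:
		return LensSkeptic
	default:
		return LensConsensus
	}
}

func main() {
	fmt.Println(LensFor(KindRealTime))   // recency
	fmt.Println(LensFor(KindRegulatory)) // authority
}
```

Keeping the mapping in one function makes the choice easy to review and to change when a new lens is introduced.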
### 3. Include Confidence Scores
Always include confidence when asserting:
```go
assertion := steme.NewAssertion(subject, predicate).
WithString(value).
WithConfidence(0.85). // Be explicit about certainty
Build()
```
### 4. Track Agent Identity
Use agent-specific keys for audit trail:
```go
// Each agent has its own signer
agentSigner, _ := steme.LoadSigner("agent-key.pem")
client := steme.NewClient(endpoint, agentSigner)
```
---
## Testing
### Mock Episteme
Use test doubles for unit testing:
```go
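type MockEpisteme struct {
	assertions map[string]*Assertion
}

// Parameter types are assumed from the snippets above.
func (m *MockEpisteme) Query(ctx context.Context, params *QueryParams) (*QueryResult, error) {
	// Return test data
	return &QueryResult{}, nil
}

func (m *MockEpisteme) Assert(ctx context.Context, assertion *Assertion) (string, error) {
	// Track assertions for verification
	if m.assertions == nil {
		m.assertions = make(map[string]*Assertion)
	}
	hash := fmt.Sprintf("mock-%d", len(m.assertions))
	m.assertions[hash] = assertion
	return hash, nil
}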
```
### Integration Tests
Test against real Episteme instance:
```go
func TestAgentIntegration(t *testing.T) {
	ctx := context.Background()

	// Start test Episteme instance
	episteme := startTestInstance(t)
	defer episteme.Shutdown()

	// Run agent (assumed to be constructed against the test instance)
	result := agent.Run(ctx, "Test query")
	assert.NotNil(t, result)

	// Verify assertions created
	assertions := episteme.GetAssertions()
	assert.Len(t, assertions, 1)
}
```
---
## Resources
- **[Go SDK Reference](../../sdk/go-sdk.md)** - Episteme Go client library
- **[API Documentation](../../../crates/stemedb-api/README.md)** - HTTP API reference
- **[Architecture](../../../architecture.md)** - System design
- **[Use Cases](../../../use-cases/README.md)** - Real-world examples
---
## See Also
- **[Integration Guide](../../.claude/guides/integrations/adk-go-episteme.md)** - Detailed integration walkthrough
- **[AI Assistant Integration](../../.claude/guides/integrations/ai-coding-assistant-integration.md)** - Claude/Cursor setup
---
**[← Back to Documentation Index](../../README.md)**


@ -1,80 +0,0 @@
#!/bin/bash
# Setup nginx reverse proxy for both dashboards - ACTUALLY WORKS
set -e
echo "Setting up nginx for both dashboards..."
# Add to /etc/hosts
if ! grep -q "aphoria.local" /etc/hosts 2>/dev/null; then
echo "127.0.0.1 stemedb.local aphoria.local api.local" | sudo tee -a /etc/hosts
fi
# Aphoria Dashboard
sudo tee /etc/nginx/sites-available/aphoria-dashboard > /dev/null <<'EOF'
server {
listen 80;
server_name aphoria.local;
location / {
proxy_pass http://127.0.0.1:18189;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
proxy_read_timeout 300s;
proxy_connect_timeout 300s;
proxy_send_timeout 300s;
}
}
EOF
# StemeDB Dashboard
sudo tee /etc/nginx/sites-available/stemedb-dashboard > /dev/null <<'EOF'
server {
listen 80;
server_name stemedb.local;
location / {
proxy_pass http://127.0.0.1:18188;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
}
EOF
# API
sudo tee /etc/nginx/sites-available/stemedb-api > /dev/null <<'EOF'
server {
listen 80;
server_name api.local;
location / {
proxy_pass http://127.0.0.1:18180;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_read_timeout 300s;
proxy_connect_timeout 300s;
proxy_send_timeout 300s;
}
}
EOF
# Enable sites
sudo ln -sf /etc/nginx/sites-available/aphoria-dashboard /etc/nginx/sites-enabled/
sudo ln -sf /etc/nginx/sites-available/stemedb-dashboard /etc/nginx/sites-enabled/
sudo ln -sf /etc/nginx/sites-available/stemedb-api /etc/nginx/sites-enabled/
# Remove old broken config
sudo rm -f /etc/nginx/sites-enabled/stemedb
# Test and reload
sudo nginx -t && sudo systemctl reload nginx
echo ""
echo "✅ Done!"
echo ""
echo "Access:"
echo " http://aphoria.local - Aphoria Dashboard"
echo " http://stemedb.local - StemeDB Dashboard"
echo " http://api.local - Backend API"

Binary file not shown.

docs/sdk/README.md Normal file

@ -0,0 +1,383 @@
# SDK Documentation
Client libraries and integration guides for Episteme (StemeDB).
---
## Go SDK
The official Go client library provides type-safe access to the Episteme API with built-in Ed25519 signing.
### Quick Start
```bash
go get github.com/orchard9/stemedb-go/steme
```
```go
import "github.com/orchard9/stemedb-go/steme"
// Generate keypair
signer, _ := steme.GenerateSigner()
// Create client
client := steme.NewClient("http://localhost:18180", signer)
// Assert a fact
assertion := steme.NewAssertion("Tesla_Inc", "has_revenue").
WithNumber(96.7).
WithConfidence(0.95).
Build()
hash, _ := client.Assert(ctx, assertion)
// Query with conflict resolution
params := steme.NewQuery().
WithSubject("Tesla_Inc").
WithPredicate("has_revenue").
WithLens(steme.LensConsensus).
Build()
result, _ := client.Query(ctx, params)
```
### Documentation
- **[Go SDK Reference](./go-sdk.md)** - Complete API documentation
- **[Usage Guide](./go-usage-guide.md)** - Patterns and examples
- **[ADK-Go Integration](../references/go-adk/reference-guide.md)** - Building AI agents
---
## Features
### Fluent Builders
The SDK provides fluent builders for readable, type-safe API calls:
```go
// Assertions
assertion := steme.NewAssertion(subject, predicate).
WithString(value).
WithConfidence(0.9).
WithSourceHash(hash).
WithLifecycle("Approved").
Build()
// Queries
params := steme.NewQuery().
WithSubject("Semaglutide").
WithPredicate("has_side_effect").
WithLens(steme.LensSkeptic).
WithAsOf(timestamp).
Build()
// Votes
vote := steme.NewVote(assertionHash).
WithWeight(0.8).
Build()
```
### Automatic Signing
All assertions are automatically signed with Ed25519:
```go
signer, _ := steme.GenerateSigner()
client := steme.NewClient(endpoint, signer)
// Every assertion includes:
// - Timestamp
// - Agent public key
// - Ed25519 signature
```
### Lens Support
Query with different conflict resolution strategies:
```go
// Most recent wins
steme.LensRecency
// Weighted consensus
steme.LensConsensus
// Vote-aware resolution
steme.LensVoteAware
// Surface disagreement
steme.LensSkeptic
// Per-tier breakdown
steme.LensLayered
// Trust-aware authority
steme.LensTrustAware
// Epoch-aware filtering
steme.LensEpochAware
```
---
## AI Agent Integration
### ADK-Go Tools
The SDK integrates with Google ADK-Go for building AI agents:
```go
// Define tools for agents
func QueryKnowledge(ctx context.Context, subject, predicate string, lens string) (string, error) {
	resp, err := episteme.Query(ctx, &QueryParams{
		Subject:   subject,
		Predicate: predicate,
		Lens:      lens,
	})
	if err != nil {
		return "", err
	}
	return formatForAgent(resp), nil
}

// Assert also returns the assertion hash, which is discarded here
func AssertFact(ctx context.Context, subject, predicate, object string) error {
	_, err := episteme.Assert(ctx, &AssertionRequest{
		Subject:    subject,
		Predicate:  predicate,
		Object:     ObjectText(object),
		Confidence: 0.9,
	})
	return err
}
```
**[→ Full ADK-Go Guide](../references/go-adk/reference-guide.md)**
---
## Examples
### Basic Workflow
```go
package main
import (
"context"
"github.com/orchard9/stemedb-go/steme"
)
func main() {
ctx := context.Background()
// Initialize
signer, _ := steme.GenerateSigner()
client := steme.NewClient("http://localhost:18180", signer)
// Write
assertion := steme.NewAssertion("GLP1_Agonists", "cardiovascular_benefit").
WithBoolean(true).
WithConfidence(0.85).
Build()
hash, _ := client.Assert(ctx, assertion)
// Read
params := steme.NewQuery().
WithSubject("GLP1_Agonists").
WithPredicate("cardiovascular_benefit").
WithLens(steme.LensConsensus).
Build()
result, _ := client.Query(ctx, params)
// result.Winner.Object contains the resolved value
}
```
### Conflict Handling
```go
// Create competing claims
client1.Assert(ctx, steme.NewAssertion("Drug_X", "safety_profile").
WithString("safe").
WithConfidence(0.7).
Build())
client2.Assert(ctx, steme.NewAssertion("Drug_X", "safety_profile").
WithString("unsafe").
WithConfidence(0.6).
Build())
// Query with Skeptic lens to see disagreement
params := steme.NewQuery().
WithSubject("Drug_X").
WithPredicate("safety_profile").
WithLens(steme.LensSkeptic).
Build()
result, _ := client.Query(ctx, params)
// result.Status: "Contested"
// result.ConflictScore: 0.72
// result.Claims: [{value: "safe", ...}, {value: "unsafe", ...}]
```
### Time Travel
```go
// Query historical state
pastTime := time.Now().Add(-24 * time.Hour)
params := steme.NewQuery().
WithSubject("Bitcoin").
WithPredicate("price_usd").
WithAsOf(pastTime).
Build()
result, _ := client.Query(ctx, params)
// Returns what was believed 24 hours ago
```
---
## Error Handling
The SDK returns typed errors:
```go
hash, err := client.Assert(ctx, assertion)
if err != nil {
switch err.(type) {
case *steme.ValidationError:
// Invalid assertion structure
case *steme.SignatureError:
// Signing failed
case *steme.NetworkError:
// Connection issues
case *steme.QuotaExceededError:
// Rate limited
default:
// Other errors
}
}
```
---
## Configuration
### Client Options
```go
client := steme.NewClient(endpoint, signer,
steme.WithTimeout(30*time.Second),
steme.WithRetry(3),
steme.WithUserAgent("MyApp/1.0"),
)
```
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `STEMEDB_ENDPOINT` | `http://localhost:18180` | API endpoint |
| `STEMEDB_TIMEOUT` | `30s` | Request timeout |
| `STEMEDB_RETRY_COUNT` | `3` | Retry attempts |
---
## Testing
### Mock Client
The SDK provides a mock for testing:
```go
import "github.com/orchard9/stemedb-go/steme/mock"
func TestMyCode(t *testing.T) {
mockClient := mock.NewClient()
// Configure expected calls
mockClient.ExpectAssert(assertion).Return(hash, nil)
mockClient.ExpectQuery(params).Return(result, nil)
// Run your code
myFunction(mockClient)
// Verify expectations
mockClient.AssertExpectations(t)
}
```
---
## Performance
### Batching
```go
// Batch assertions
batch := steme.NewBatch()
batch.Add(assertion1)
batch.Add(assertion2)
batch.Add(assertion3)
hashes, err := client.SubmitBatch(ctx, batch)
```
### Connection Pooling
The client automatically manages connection pooling:
```go
// Reuse the client across requests
var globalClient *steme.Client
func init() {
signer, _ := steme.GenerateSigner()
globalClient = steme.NewClient(endpoint, signer)
}
```
---
## Migration Guide
### From Direct HTTP
Before:
```go
// Manual HTTP + signing
body := marshal(assertion)
signature := sign(privateKey, body)
req, _ := http.NewRequest(http.MethodPost, endpoint+"/v1/assert", bytes.NewReader(body))
req.Header.Set("X-Signature", signature)
resp, _ := http.DefaultClient.Do(req)
```
After:
```go
// SDK handles it all
hash, _ := client.Assert(ctx, assertion)
```
---
## Resources
- **[API Reference](../../crates/stemedb-api/README.md)** - HTTP API documentation
- **[Architecture](../../architecture.md)** - System design
- **[Use Cases](../../use-cases/README.md)** - Real-world examples
- **[ADK-Go Examples](../references/go-adk/)** - Agent patterns
---
## Support
- **GitHub Issues**: Report bugs or request features
- **Examples**: See `sdk/go/examples/` directory
- **Integration Help**: Check [ADK-Go Integration](../references/go-adk/reference-guide.md)
---
**[← Back to Documentation Index](../README.md)**

docs/specs/README.md Normal file

@ -0,0 +1,164 @@
# Technical Specifications
Formal specifications and design documents for Episteme (StemeDB) features.
---
## Core Specifications
### Governance & Organization
- **[Governance Models](./governance-models.md)** - Authority hierarchies and source classification
  - Source-class tiers (Regulatory, Clinical, Expert, Anecdotal)
  - Decay curves and confidence over time
  - Trust pack distribution and policy federation
- **[Concept Hierarchy](./concept-hierarchy.md)** - Subject/predicate organization
  - Namespace structure
  - Path-based indexing
  - Query optimization via hierarchy
### Query & Retrieval
- **[Visual Hash Query](./visual-hash-query.md)** - Screenshot-based retrieval
  - Perceptual hashing for visual anchoring
  - Drift detection via image comparison
  - Time-travel for "what did I see when I first read this?"
---
## Application Specifications
### Aphoria (Code Linter)
- **[Aphoria Claims API](./aphoria-claims-api.md)** - Claims management design
  - AuthoredClaim vs Observation types
  - Provenance, invariant, consequence structure
  - CLI commands for claim lifecycle
### Domain Modeling
- **[Ontology Layer: Medical Vertical](./ontology-layer-medical-vertical.md)** - Healthcare domain
  - Drug, condition, study, patient ontology
  - Pharma-specific extractors and builders
  - FAERS and PubMed integration patterns
---
## Specification Format
All specifications follow this structure:
```markdown
# Specification Title
## Overview
- Problem being solved
- High-level approach
- Key constraints
## Requirements
- Functional requirements
- Non-functional requirements
- Success criteria
## Design
- Architecture diagrams
- Data structures
- API surface
## Implementation Notes
- Platform-specific details
- Performance considerations
- Testing strategy
## Open Questions
- Unresolved decisions
- Future work
```
---
## Status Legend
| Status | Meaning |
|--------|---------|
| ✅ **Implemented** | Specification fully implemented and tested |
| 🚧 **In Progress** | Partial implementation, actively being built |
| 📝 **Draft** | Specification being written, not yet implemented |
| 💡 **Proposed** | Idea stage, needs more design work |
| 🗄️ **Archived** | Superseded or deprecated |
---
## Current Status
| Specification | Status | Last Updated |
|---------------|--------|--------------|
| Governance Models | ✅ Implemented | Phase 3 |
| Concept Hierarchy | ✅ Implemented | Phase 2 |
| Visual Hash Query | ✅ Implemented | Phase 3A |
| Aphoria Claims API | ✅ Implemented | Aphoria A2 |
| Ontology: Medical | ✅ Implemented | Pilot 2 |
---
## RFCs (Request for Comments)
For proposals and design discussions, see **[RFCs](../rfcs/README.md)**.
RFCs go through a review process before becoming specifications:
1. **RFC Draft** → 2. **Review** → 3. **Accepted** → 4. **Specification** → 5. **Implementation**
---
## Design Principles
All specifications must address:
1. **Epistemic Honesty**: Does it preserve disagreement or force false consensus?
2. **Append-Only**: Can it be implemented without mutating existing data?
3. **Query-Time Resolution**: Is complexity deferred to read time?
4. **Source Attribution**: Does it maintain provenance?
5. **Performance**: What are the latency/throughput characteristics?
---
## Contributing Specifications
### Creating a New Specification
1. **Check for duplicates**: Search existing specs and RFCs
2. **Start with RFC**: Draft as RFC first for feedback
3. **Use template**: Follow the specification format above
4. **Include diagrams**: Add ASCII art or mermaid diagrams
5. **Define success criteria**: How will we know it works?
### Specification Review Process
1. **Author** writes initial draft
2. **Technical review** by domain expert
3. **Architecture review** for system-wide impact
4. **Approval** by project maintainers
5. **Implementation** tracking in roadmap
---
## Related Documents
- **[RFCs](../rfcs/README.md)** - Proposals under review
- **[Architecture](../../architecture.md)** - System overview
- **[Roadmap](../../roadmap.md)** - Implementation timeline
- **[Guides](../guides/README.md)** - How-to guides for developers
---
## See Also
- **[Data Structures](../data-structures.md)** - Core types reference
- **[Consistency Model](../consistency-model.md)** - Conflict resolution
- **[Research](../research/)** - Exploratory design work
---
**[← Back to Documentation Index](../README.md)**


@ -263,6 +263,85 @@ cargo run --bin aphoria -- scan /path/to/project --show-observations
---
## Arena: Simulation Roadmap
> **Goal:** Incrementally evolve the simulator from Spine validation to a full Agent-Based Modeling environment.
> **Philosophy:** Make it run. Then add. Verify at every step.
> **Alignment:** Tracks main roadmap phases; exercises features as they land.
### Current State
The simulator (`stemedb-sim`) currently validates the system through Arenas 0-4:
**Completed Arenas:**
- ✅ **Arena 0**: Test infrastructure with assertions and CI integration
- ✅ **Arena 1**: Query path via QueryEngine, Recency lens, lifecycle filtering, query audit
- ✅ **Arena 2**: Voting & VoteAwareConsensus, troll resistance
- ✅ **Arena 2.5**: Hardening (race conditions, API tests, crash recovery, input validation)
- ✅ **Arena 3**: Materialized Views, fast-path verification, MV freshness
- ✅ **Arena 4**: Agent personas (Scientist, Troll, Believer with differentiated strategies)
**What's Tested:**
- WAL durability, rkyv serialization, Ed25519 signatures
- Ingestor pipeline (WAL → KV async flow)
- QueryEngine with multiple lenses
- Lifecycle filtering, voting, consensus
- Query audit trail, materialized views
- Strategy-driven agent behaviors
**What's Not Yet Tested:**
- ❌ TrustRank (Arena 5)
- ❌ Concurrent agents at scale (Arena 6)
- ❌ Time-travel queries (Arena 7)
- ❌ Skeptic lens & conflict scores (Arena 8)
### Upcoming Arena Phases
**Arena 5: TrustRank Integration** (Next)
- Initialize TrustRank for agents
- Reputation adjustment after votes
- TrustAwareAuthorityLens verification
- Troll reputation decay over time
**Arena 6: Concurrent Agents**
- Tokio task per agent
- Scale to 100 agents, then 1000
- Contention metrics and bottleneck identification
**Arena 7: Time-Travel & Epochs**
- Time-travel query verification
- Epoch creation and supersession
- Epoch cascade validation
**Arena 8: Skeptic & Conflict**
- High/low conflict scenarios
- Skeptic lens surfacing outliers
- Conflict score accuracy
**Arena 9: Full Gameplay Loop**
- Ground truth injection
- Complete 5-tick scenario
- Extended 1000-tick run
- Emergence validation
### Alignment with Use Cases
| Use Case | Arena Phase |
|----------|-------------|
| **Agile Agent Team** ||
| Lifecycle filtering | Arena 1.3 |
| Query audit trail | Arena 1.4 |
| Time-travel debugging | Arena 7.1 |
| Expert weighting | Arena 5.3 |
| **Financial Due Diligence** ||
| Conflict detection | Arena 8.1, 8.3 |
| Epoch cascades | Arena 7.2, 7.3 |
**Run command:** `cargo run --bin stemedb-sim`
**Test suite:** `cargo test -p stemedb-sim`
---
## Related Documents
- [CLAUDE.md](./CLAUDE.md) - AI assistant instructions and project rules


@ -312,7 +312,7 @@ All SDK types align 1:1 with API DTOs:
- **SDK README:** `sdk/go/steme/README.md` - Quick start and API overview
- **Full Guide:** `docs/sdk/go-sdk.md` - Comprehensive usage guide
- **Examples:** `sdk/go/examples/` - Runnable code examples
- **Usage Guide:** `usage.md` - Updated with SDK quick start - **Usage Guide:** `docs/sdk/go-usage-guide.md` - Updated with SDK quick start
## Future Enhancements