jml 3b5f88b4f0 feat(aphoria): implement claims architecture (A1-A5) with verify engine, corpus, coverage, and explain

Complete Aphoria claims system overhaul:
- A1: Rename ExtractedClaim to Observation (extractors produce observations, not claims)
- A2: Add AuthoredClaim with full provenance, invariants, and authority tiers
- A3: Verify engine comparing observations against authored claims, CLI + formatters
- A4: Corpus as first-class assertions with predicate indexing, authority lens, trust packs
- A5: Coverage analysis, explain/docs generation, self-audit extractor, claim suggester skill

Also includes: 42 extractors updated for Observation type, verifiable_predicates trait,
conflict detection with comparison modes, claims TOML persistence, Grafana dashboard,
backup/restore scripts, and comprehensive test coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-08 09:11:47 +00:00


Aphoria Claims API — Sidecar Service

Goal: A lightweight HTTP API that exposes the same claim operations the aphoria-claims Claude Code skill performs — review diffs, identify claimable patterns, check existing claims, suggest new claims, create/update/verify claims — so any tool, agent, or CI pipeline can build claim authoring flows.

Key insight: The skill is just an LLM calling CLI commands. This API replaces the CLI calls with HTTP endpoints and replaces the Claude skill prompt with a Gemini call for the reasoning. Same workflow, any client.

Architecture

                    ┌──────────────────────────┐
                    │   Any Client             │
                    │   (CI, IDE, ADK agent,   │
                    │    custom UI, webhook)    │
                    └───────────┬──────────────┘
                                │ HTTP
                                ▼
                    ┌──────────────────────────┐
                    │  aphoria-claims-api       │
                    │  (Rust, axum, port 18189) │
                    │                          │
                    │  ┌─────────┐ ┌─────────┐ │
                    │  │ Claims  │ │ Gemini  │ │
                    │  │ Engine  │ │ Client  │ │
                    │  └────┬────┘ └────┬────┘ │
                    │       │           │      │
                    │  ┌────▼───────────▼────┐ │
                    │  │  aphoria lib crate  │ │
                    │  │  (ClaimsFile,       │ │
                    │  │   verify_claims,    │ │
                    │  │   extract_claims,   │ │
                    │  │   run_scan)         │ │
                    │  └─────────────────────┘ │
                    └──────────────────────────┘
                                │
                    ┌───────────▼──────────────┐
                    │  .aphoria/claims.toml     │
                    │  (project claim store)    │
                    └──────────────────────────┘

The sidecar calls aphoria as a library crate (not shelling out to CLI). It links against the same types: ClaimsFile, AuthoredClaim, Observation, verify_claims(), extract_claims().

For the reasoning parts (identifying claimable patterns in diffs, suggesting claims), it calls Gemini via the HTTP API.

Why a Sidecar, Not Extending stemedb-api

Aphoria claims are file-based (.aphoria/claims.toml) and project-scoped. The StemeDB API serves the knowledge graph (assertions, lenses, queries). These are different concerns:

	stemedb-api (18180)	aphoria-claims-api (18189)
Data	Episteme assertions (append-only DAG)	Authored claims (TOML file)
Scope	Cluster-wide knowledge	Single project
Storage	WAL + KV	`.aphoria/claims.toml`
Auth	API keys, per-agent	Local only (or simple token)

The sidecar runs alongside a project checkout. It needs filesystem access to the project root (for claims.toml, extractors, git).

API Surface

Claims CRUD

POST   /v1/claims              Create a claim
GET    /v1/claims              List claims (filter by ?category=&status=&format=json)
GET    /v1/claims/:id          Get a single claim
PATCH  /v1/claims/:id          Update claim fields
POST   /v1/claims/:id/supersede  Create superseding claim
DELETE /v1/claims/:id          Deprecate a claim (body: {reason})

Request/response types use the existing AuthoredClaim struct serialized as JSON. The API reads/writes .aphoria/claims.toml via ClaimsFile.
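The `?category=&status=` filters on GET /v1/claims could be applied in memory after loading the file. A minimal sketch, assuming a simplified `ClaimRow` stand-in for `AuthoredClaim` (the struct, its field names, and `ListFilter` are hypothetical, not the aphoria crate's types):

```rust
/// Simplified stand-in for AuthoredClaim; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct ClaimRow {
    pub id: String,
    pub category: String,
    pub status: String,
}

/// Optional query filters, mirroring ?category=&status= on GET /v1/claims.
#[derive(Debug, Default)]
pub struct ListFilter {
    pub category: Option<String>,
    pub status: Option<String>,
}

/// Keep only the claims that match every filter that is set.
/// An unset filter (None) matches everything.
pub fn filter_claims<'a>(claims: &'a [ClaimRow], filter: &ListFilter) -> Vec<&'a ClaimRow> {
    claims
        .iter()
        .filter(|c| filter.category.as_deref().map_or(true, |cat| c.category == cat))
        .filter(|c| filter.status.as_deref().map_or(true, |st| c.status == st))
        .collect()
}
```

Because each request reloads the file, filtering after load keeps the handler simple; no index is needed for a file of this size.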

Verification

Mirrors aphoria verify run:

POST   /v1/verify              Run verification (claims vs observations)
GET    /v1/verify/map          Show claim-to-extractor mapping

POST /v1/verify body:

{
  "path": ".",
  "show_unclaimed": true,
  "categories": ["safety", "imports"],
  "claims": ["wallet-seqcst-001"]
}

Response: VerifyReport as JSON — per-claim verdicts (pass/conflict/missing/unclaimed) + summary counts.
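The summary counts can be derived from the per-claim verdicts. A minimal sketch, assuming a `Verdict` enum with the four outcomes named above and a `Summary` struct whose field names are guesses (the real `VerifyReport` layout may differ):

```rust
/// The four verdicts named in the VerifyReport description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Conflict,
    Missing,
    Unclaimed,
}

/// Summary counts; the field names are assumptions.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Summary {
    pub pass: usize,
    pub conflict: usize,
    pub missing: usize,
    pub unclaimed: usize,
}

/// Tally one counter per verdict kind.
pub fn summarize(verdicts: &[Verdict]) -> Summary {
    let mut summary = Summary::default();
    for verdict in verdicts {
        match verdict {
            Verdict::Pass => summary.pass += 1,
            Verdict::Conflict => summary.conflict += 1,
            Verdict::Missing => summary.missing += 1,
            Verdict::Unclaimed => summary.unclaimed += 1,
        }
    }
    summary
}
```

A client that only cares about enforcement can check `summary.conflict > 0` and ignore the rest.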

Coverage (A5.1)

GET    /v1/coverage            Coverage metrics per module

Response:

{
  "project": "maxwell",
  "summary": {
    "total_observations": 67,
    "total_claims": 12,
    "claimed_percentage": 44.8,
    "unclaimed_count": 37
  },
  "modules": [
    {
      "module_path": "wallet/atomics",
      "observation_count": 5,
      "claim_count": 3,
      "density": 0.6
    }
  ]
}
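The per-module `density` in this example is consistent with claims divided by observations (3 / 5 = 0.6). A minimal sketch of that arithmetic; the definition of `claimed_percentage` as the share of observations that are not unclaimed is an assumption, as are both function names:

```rust
/// Claims per observation for one module; 0.0 when nothing was observed.
pub fn density(claim_count: u32, observation_count: u32) -> f64 {
    if observation_count == 0 {
        0.0
    } else {
        f64::from(claim_count) / f64::from(observation_count)
    }
}

/// Share of observations covered by a claim, as a percentage.
/// Assumed definition: (total - unclaimed) / total * 100.
pub fn claimed_percentage(total_observations: u32, unclaimed_count: u32) -> f64 {
    if total_observations == 0 {
        return 0.0;
    }
    let claimed = total_observations.saturating_sub(unclaimed_count);
    100.0 * f64::from(claimed) / f64::from(total_observations)
}
```

Guarding the zero-observation case avoids NaN values leaking into the JSON response for empty modules.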

Diff Review (A5.3 — the reasoning endpoint)

This is the core reasoning endpoint. It calls Gemini and does what the aphoria-claims skill does:

POST   /v1/review              Review a diff for claimable patterns

Request:

{
  "diff": "... unified diff text ...",
  "context": {
    "repo": "maxwell",
    "branch": "feat/new-ordering"
  }
}

Internally:

1. Load existing claims from .aphoria/claims.toml
2. Run extractors on changed files to get observations
3. Run verify_claims() to check for violations
4. Send diff + existing claims + observations to Gemini with the claim-identification prompt
5. Return structured suggestions
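Running extractors on changed files first requires the list of changed files, which has to come from the diff text in the request. A minimal sketch, assuming git-style unified diffs with `+++ b/<path>` headers; `changed_files` is a hypothetical helper, not an existing aphoria function:

```rust
/// Collect paths of files added or modified in a git-style unified diff.
/// Deleted files (new side is /dev/null) are skipped.
pub fn changed_files(diff: &str) -> Vec<String> {
    let mut files: Vec<String> = Vec::new();
    for line in diff.lines() {
        // The new-side header looks like "+++ b/src/pool.rs".
        if let Some(rest) = line.strip_prefix("+++ ") {
            let path = rest.strip_prefix("b/").unwrap_or(rest).trim_end();
            let already_seen = files.iter().any(|f| f.as_str() == path);
            if path != "/dev/null" && !already_seen {
                files.push(path.to_string());
            }
        }
    }
    files
}
```

This is line-prefix matching, not a full diff parser: an added source line whose own text begins with `++ ` would be misread as a header, so a production version would track hunk boundaries.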

Response:

{
  "violations": [
    {
      "claim_id": "wallet-seqcst-001",
      "invariant": "All wallet atomics MUST use SeqCst",
      "violation": "Ordering::Relaxed at sync.rs:42",
      "action": "fix_code_or_supersede"
    }
  ],
  "suggestions": [
    {
      "observation": {
        "file": "src/pool.rs",
        "line": 15,
        "matched_text": "const MAX_POOL_SIZE: u32 = 50;"
      },
      "suggested_claim": {
        "id": "maxwell-pool-max-001",
        "concept_path": "maxwell/db/pool/max_size",
        "predicate": "max_value",
        "value": "50",
        "category": "constants",
        "invariant": "Database pool size MUST NOT exceed 50",
        "consequence": "OOM under sustained load",
        "authority_tier": "observational"
      },
      "reason": "New constant with non-obvious value. Similar to 2 existing claims about pool configuration.",
      "confidence": 0.85
    }
  ],
  "no_claim_needed": [
    {
      "pattern": "whitespace change in types.rs",
      "reason": "Internal refactor, no behavioral change"
    }
  ]
}
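Clients will probably want to treat suggestions differently depending on `confidence`. A minimal sketch of one way to triage them, assuming a trimmed-down `Suggestion` type and a client-chosen threshold (both hypothetical; the API itself does not define a cutoff):

```rust
/// Trimmed-down view of one entry in "suggestions"; only the fields used here.
#[derive(Debug, Clone, PartialEq)]
pub struct Suggestion {
    pub claim_id: String,
    pub confidence: f64,
}

/// Split suggestions into (confident, uncertain) around a threshold.
/// Entries with confidence >= threshold land in the first vector.
pub fn triage(suggestions: Vec<Suggestion>, threshold: f64) -> (Vec<Suggestion>, Vec<Suggestion>) {
    suggestions
        .into_iter()
        .partition(|s| s.confidence >= threshold)
}
```

An IDE might show the confident group inline and tuck the rest behind a "more suggestions" action, while CI would ignore suggestions entirely and gate only on `violations`.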

Docs Generation (A5.2)

POST   /v1/docs/generate       Generate claims-explained documentation

Response: Markdown string or JSON with full provenance chains, verification status, coverage gaps.

Onboarding (A5.4)

GET    /v1/explain              Narrative project overview from claims

Response: Markdown narrative: architectural boundaries, safety invariants, key constants with provenance, coverage gaps.

Gemini Integration

The reasoning endpoints (/v1/review, docs generation, onboarding narrative) call Gemini for LLM tasks.

Config:

# In aphoria.toml or env vars
[claims_api]
gemini_model = "gemini-2.5-flash"

GEMINI_API_KEY=AIzaSy...  # env var, never in config files

Gemini calls are structured, not conversational. Each call has:

A system prompt (the same logic from the aphoria-claims SKILL.md — claimability rules, category reference, authority tier guide)
Structured input (diff text, existing claims as JSON, observations as JSON)
Structured output (JSON schema for suggestions)

The prompt is essentially the skill document converted to a system prompt, with the human-in-the-loop parts replaced by structured JSON output.

Prompt Structure for `/v1/review`

System: You are an expert at identifying architectural decisions, safety
invariants, and policy requirements in code changes.

Given:
- A unified diff
- Existing authored claims (JSON)
- Observations extracted from changed files (JSON)

Identify:
1. Violations: Does the diff contradict any existing claim?
2. Suggestions: What new claims should be authored? (Only if a violation
   would break something, a new team member would need to know, or there's
   a non-obvious reason for the choice)
3. No-claim-needed: What patterns don't need claims and why?

For each suggestion, provide:
- id, concept_path, predicate, value, category
- invariant (what MUST be true)
- consequence (what breaks if violated)
- authority_tier (regulatory/clinical/observational/expert/community)
- reason (why this needs a claim)
- confidence (0.0-1.0)

Respond in JSON matching this schema: { ... }
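Assembling the final prompt is plain string composition over the three inputs. A minimal sketch, assuming the claims and observations arrive already serialized as JSON strings; the section labels (`## Diff` and so on) and the abbreviated instruction constant are illustrative choices, not the skill's wording:

```rust
/// Instruction text, abbreviated here; the full wording is the prompt above.
pub const REVIEW_INSTRUCTIONS: &str = "You are an expert at identifying architectural \
decisions, safety invariants, and policy requirements in code changes.";

/// Build the text sent to the model for /v1/review.
/// `claims_json` and `observations_json` must already be serialized JSON.
pub fn build_review_prompt(diff: &str, claims_json: &str, observations_json: &str) -> String {
    format!(
        "{}\n\n## Diff\n{}\n\n## Existing claims (JSON)\n{}\n\n## Observations (JSON)\n{}\n",
        REVIEW_INSTRUCTIONS, diff, claims_json, observations_json
    )
}
```

Keeping the instructions in a constant and the inputs in clearly labeled sections makes prompt changes reviewable as ordinary code diffs in prompts.rs.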

Implementation

Crate Structure

New binary in applications/aphoria-claims-api/:

applications/aphoria-claims-api/
  Cargo.toml          # depends on aphoria (lib), axum, reqwest, serde_json
  src/
    main.rs           # axum server setup, routes
    routes/
      claims.rs       # CRUD endpoints
      verify.rs       # verification endpoint
      coverage.rs     # coverage metrics
      review.rs       # diff review (calls Gemini)
      docs.rs         # docs generation
      explain.rs      # onboarding narrative
    gemini/
      client.rs       # Gemini API client (generateContent)
      prompts.rs      # Prompt templates for each reasoning task
      types.rs        # Request/response types for Gemini API
    state.rs          # AppState (project_root, config, gemini client)
    error.rs          # Error types -> axum responses

Dependencies

aphoria (path dependency) — all the domain logic
axum — HTTP framework (already used by stemedb-api)
reqwest — Gemini API calls
serde_json — JSON serialization
tokio — async runtime
tower-http — CORS middleware
tracing — structured logging

Key Design Decisions

Library, not shell-out. The API imports aphoria as a crate and calls ClaimsFile::load(), verify_claims(), extract_claims() directly. No Command::new("aphoria").

Stateless per request. Each request reads .aphoria/claims.toml fresh. There is no in-memory cache of claims (the file is small and TOML parsing is fast), so clients never see stale data and cannot corrupt each other's in-memory state. Concurrent writes to the file itself are covered by file locking.

File locking for writes. Every mutating endpoint (POST /v1/claims, PATCH /v1/claims/:id, POST /v1/claims/:id/supersede, DELETE /v1/claims/:id) acquires an exclusive file lock on claims.toml before its read-modify-write cycle. Use fs2::FileExt or fd-lock.
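To illustrate the read-modify-write cycle without external crates, here is a std-only sketch. It swaps in a different technique from the one named above: an exclusive lock file created with `create_new`, plus a temp-file rename, instead of an advisory lock from fs2 or fd-lock. `locked_update` and `replace_contents` are hypothetical helper names.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::Path;

/// Read the file, apply `update`, and swap the result in via rename.
fn replace_contents<F: FnOnce(String) -> String>(path: &Path, update: F) -> io::Result<()> {
    let current = fs::read_to_string(path)?;
    let tmp = path.with_extension("toml.tmp");
    fs::write(&tmp, update(current))?;
    // Rename within one directory replaces the target in a single step,
    // so readers see either the old file or the new one.
    fs::rename(&tmp, path)
}

/// Run `update` while holding `<name>.toml.lock` next to the claims file.
pub fn locked_update<F: FnOnce(String) -> String>(path: &Path, update: F) -> io::Result<()> {
    let lock_path = path.with_extension("toml.lock");
    // create_new fails with AlreadyExists if another writer holds the lock.
    OpenOptions::new().write(true).create_new(true).open(&lock_path)?;
    let result = replace_contents(path, update);
    // Release the lock whether or not the update succeeded.
    fs::remove_file(&lock_path)?;
    result
}
```

The trade-off: a lock file survives a crash and must be cleaned up manually, whereas an OS-level advisory lock is released when the process exits. That is a reason to prefer fs2 or fd-lock in the real service.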

Gemini is optional. The CRUD, verify, and coverage endpoints work without Gemini. Only /v1/review, /v1/docs/generate, and /v1/explain need LLM reasoning. If GEMINI_API_KEY is not set, these return 503 with a clear message.
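The availability check can be a small pure function that the three reasoning handlers call first. A minimal sketch, assuming the key is read once from the environment and passed in; the enum and function names are hypothetical:

```rust
/// Outcome of checking whether a reasoning endpoint can run.
#[derive(Debug, PartialEq, Eq)]
pub enum GeminiAvailability {
    /// A non-empty key is configured.
    Ready { api_key: String },
    /// Maps to an HTTP error response with this status and message.
    Unavailable { status: u16, message: &'static str },
}

/// Treat a missing or blank key as "not configured" and answer with 503.
pub fn check_gemini(api_key: Option<String>) -> GeminiAvailability {
    match api_key {
        Some(key) if !key.trim().is_empty() => GeminiAvailability::Ready { api_key: key },
        _ => GeminiAvailability::Unavailable {
            status: 503,
            message: "GEMINI_API_KEY is not set; /v1/review, /v1/docs/generate and /v1/explain are unavailable",
        },
    }
}
```

A handler would call it as `check_gemini(std::env::var("GEMINI_API_KEY").ok())` and turn the `Unavailable` variant into the response; the CRUD, verify, and coverage handlers never call it.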

Port Assignment

Following the existing 181XX scheme:

Offset	Service	Port
+9	Claims API	18189

Env var: APHORIA_CLAIMS_API_BIND_ADDR (default 127.0.0.1:18189)
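Resolving the bind address is a parse with a fallback. A minimal sketch using only the standard library; `resolve_bind_addr` is a hypothetical helper name:

```rust
use std::net::{AddrParseError, SocketAddr};

/// Default from the port table: loopback only, port 18189.
pub const DEFAULT_BIND_ADDR: &str = "127.0.0.1:18189";

/// Parse the value of APHORIA_CLAIMS_API_BIND_ADDR, falling back to the default
/// when the variable is unset. A malformed value is an error, not a silent fallback.
pub fn resolve_bind_addr(env_value: Option<&str>) -> Result<SocketAddr, AddrParseError> {
    env_value.unwrap_or(DEFAULT_BIND_ADDR).parse()
}
```

At startup this would be called as `resolve_bind_addr(std::env::var("APHORIA_CLAIMS_API_BIND_ADDR").ok().as_deref())`. Failing loudly on a bad value avoids accidentally serving on the default address when an operator mistyped an override.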

Example Flows

CI Pipeline: Block PR if Claims Violated

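#!/usr/bin/env bash
# In CI script: fail the job if the diff violates any authored claim.
set -euo pipefail

# Wrap the raw diff in a JSON object; jq handles quoting and escaping.
PAYLOAD=$(git diff origin/main...HEAD | jq -Rs '{diff: .}')

# -f makes curl exit non-zero on HTTP errors (e.g. 503 when Gemini is not
# configured), so a failed review cannot be mistaken for a clean one.
RESULT=$(printf '%s' "$PAYLOAD" | curl -sf -X POST http://localhost:18189/v1/review \
  -H "Content-Type: application/json" \
  --data-binary @-)

VIOLATIONS=$(printf '%s' "$RESULT" | jq '.violations | length')
if [ "$VIOLATIONS" -gt 0 ]; then
  echo "Claims violated:"
  printf '%s' "$RESULT" | jq '.violations[]'
  exit 1
fi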

IDE Extension: Suggest Claims on Save

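// VS Code extension pseudocode
const diff = await getDiffSinceLastSave();
const response = await fetch('http://localhost:18189/v1/review', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ diff })
});
if (!response.ok) {
  // e.g. 503 when GEMINI_API_KEY is not set on the sidecar
  throw new Error(`Claims review failed: HTTP ${response.status}`);
}
const { suggestions } = await response.json();

for (const s of suggestions) {
  showInlineHint(s.observation.file, s.observation.line,
    `Claim suggested: ${s.suggested_claim.invariant}`);
}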

ADK-Go Agent: Claims-Aware Code Generation

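// Before generating code, check constraints via claims.
// Needs imports: bytes, encoding/json, fmt, net/http.
resp, err := http.Post("http://localhost:18189/v1/verify",
    "application/json",
    bytes.NewReader([]byte(`{"show_unclaimed": false}`)))
if err != nil {
    return fmt.Errorf("verify request failed: %w", err)
}
defer resp.Body.Close()

var report VerifyReport // client-side struct mirroring the /v1/verify JSON
if err := json.NewDecoder(resp.Body).Decode(&report); err != nil {
    return fmt.Errorf("decoding verify report: %w", err)
}

if report.Summary.Conflict > 0 {
    // Don't generate code that conflicts with claims
    return fmt.Errorf("existing claims would be violated")
}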

New Developer Onboarding

curl -s http://localhost:18189/v1/explain | less

Gets the narrative: what this codebase claims about itself, why, and where to find evidence.

Gemini API Integration Details

Endpoint

POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GEMINI_API_KEY

Request Shape

{
  "contents": [{
    "parts": [{
      "text": "System prompt + structured input"
    }]
  }],
  "generationConfig": {
    "responseMimeType": "application/json",
    "responseSchema": { ... }
  }
}

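Setting responseMimeType to "application/json" together with a responseSchema constrains Gemini to return JSON that conforms to our types. The response text still has to be deserialized (for example with serde_json), but no free-text parsing or repair step is needed.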

Cost Estimate

Diff review for a typical PR (~500 lines):

Input: ~2K tokens (prompt) + ~1K (diff) + ~1K (existing claims JSON) = ~4K tokens
Output: ~500 tokens (suggestions JSON)
At Gemini 2.5 Flash pricing: ~$0.001 per review

Negligible. Run it on every PR.
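The estimate is simple arithmetic over token counts and per-million-token prices. A minimal sketch with the prices as parameters; the function name is hypothetical, and any rates passed in should come from the current published price list, not from this document:

```rust
/// Estimated cost in USD for one model call.
/// Prices are expressed in USD per one million tokens.
pub fn review_cost_usd(
    input_tokens: u64,
    output_tokens: u64,
    usd_per_million_input: f64,
    usd_per_million_output: f64,
) -> f64 {
    let input_cost = input_tokens as f64 * usd_per_million_input;
    let output_cost = output_tokens as f64 * usd_per_million_output;
    (input_cost + output_cost) / 1_000_000.0
}
```

With placeholder rates of $0.15 (input) and $0.60 (output) per million tokens, the 4K-in / 500-out case above comes to $0.0009, in line with the ~$0.001 figure. Larger diffs scale the input term linearly.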

What This Enables

Any LLM can author claims. Not just Claude Code. Gemini, GPT, local models — they call the API.
CI enforcement. Block PRs that violate claims. No human needs to remember.
IDE integration. Inline suggestions as you type, not just at review time.
ADK-Go agents. Agents that generate code can check claims before writing, and author claims after.
Custom dashboards. Coverage metrics as a web service. Build whatever UI you want.
The flywheel without the skill. The aphoria-claims skill is great for Claude Code users. This API is for everyone else.

Implementation Order

1. Claims CRUD endpoints — Wrap ClaimsFile in axum routes. No LLM needed. Test with curl.
2. Verify endpoint — Call verify_claims(), return JSON. No LLM needed.
3. Coverage endpoint — Compute from verify report. No LLM needed.
4. Gemini client — reqwest + structured output schema.
5. Review endpoint — The reasoning endpoint. Diff + claims + observations -> Gemini -> suggestions.
6. Docs + Explain endpoints — Narrative generation via Gemini.

Steps 1-3 are pure engineering (wrap existing Rust functions in HTTP). Steps 4-6 add the Gemini reasoning layer.
