Initial commit: research notes journal

Moved from maxwell/blog to standalone repository.

- Next.js research journal application
- Notes 001-005 with YAML/MD content structure
- Claude Code configuration for blog development

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-07 13:12:07 -07:00 · 2026-02-07 13:12:07 -07:00 · 9a9e58c935
commit 9a9e58c935
65 changed files with 16505 additions and 0 deletions
--- a/.claude/guides/blog.md
+++ b/.claude/guides/blog.md
@ -0,0 +1,85 @@
+# Maxwell Blog Guide
+
+Writing research notes for the Maxwell journal.
+
+## Purpose
+
+The blog documents HOW AI is used to work through an unfamiliar problem. Each note captures a research session.
+
+## Structure
+
+```
+blog/src/app/
+├── page.tsx                    # Journal home (list of projects)
+├── maxwell/
+│   ├── page.tsx               # Maxwell project landing
+│   ├── white-paper/page.tsx   # Formal paper outline
+│   └── notes/
+│       ├── 001-picking-a-problem/page.tsx
+│       ├── 002-building-the-scaffolding/page.tsx
+│       └── [NNN-slug]/page.tsx
+```
+
+## Note Structure
+
+Each note follows this pattern:
+
+```tsx
+// Prompts used during the session
+const prompts = [
+  { id: "...", label: "Prompt name", content: "Full prompt text" },
+];
+
+// Files created during the session
+const filesCreated = [
+  { name: "filename.md", description: "What it is", content: "Full content" },
+];
+
+// JSX structure:
+// - Navigation
+// - Header (#NNN, date, title)
+// - Prompts section (copyable)
+// - Files section (expandable + copyable)
+// - Prose content
+// - Footer navigation (prev/next)
+```
+
+## Shared Components
+
+From `src/components/copyable.tsx`:
+
+- `CopyButton` — Simple copy button
+- `CopyableBlock` — For prompts with copy functionality
+- `ExpandableFile` — For files with expand + copy
+
+## Adding a New Note
+
+1. Create directory: `src/app/maxwell/notes/NNN-slug/`
+2. Create `page.tsx` following the pattern above
+3. Update `src/app/maxwell/page.tsx` to add to notes array
+4. Update previous note's footer to link to new note
+
+## Content Guidelines
+
+- Focus on the METHOD, not just results
+- Capture actual prompts used
+- Include full file contents in filesCreated
+- Explain WHY decisions were made
+- The vision.md is for keeping FOCUS, not for importance ranking
+- Next steps should be concrete and actionable
+
+## Tone
+
+- First person ("I did X")
+- Conversational but technical
+- No unnecessary superlatives
+- Honest about limitations and complexity
+
+## Running Locally
+
+```bash
+cd blog
+npm install
+npm run dev
+# Open http://localhost:19197
+```
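Reviewer sketch: the `NNN-slug` directory naming used under "Adding a New Note" could be generated rather than typed by hand. The helper below is a minimal illustration, assuming a three-digit note number and a slug derived from the title; `noteDirName` is a hypothetical name and does not exist in the repository.

```typescript
// Hypothetical helper (not part of the repository): builds an "NNN-slug"
// directory name from a note number and a title.
function noteDirName(noteNumber: number, title: string): string {
  if (Math.floor(noteNumber) !== noteNumber || noteNumber < 1 || noteNumber > 999) {
    throw new RangeError("noteNumber must be an integer from 1 to 999");
  }
  // Zero-pad to three digits: 3 -> "003"
  const id = ("00" + noteNumber).slice(-3);
  // Kebab-case: lowercase, then collapse runs of non-alphanumerics into one hyphen
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  if (slug === "") {
    throw new Error("title must contain at least one letter or digit");
  }
  return id + "-" + slug;
}
```

For example, `noteDirName(3, "Research Planning")` returns `"003-research-planning"`, matching the slug format used elsewhere in this commit.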
--- a/.claude/skills/new-note.md
+++ b/.claude/skills/new-note.md
@ -0,0 +1,238 @@
+# Creating Blog Notes
+
+This skill documents how to create new research notes for the Maxwell blog.
+
+## Directory Structure
+
+Each note lives in `blog/content/notes/{NNN}-{slug}/`:
+
+```
+blog/content/notes/
+├── 001-picking-a-problem/
+│   ├── meta.yaml         # Metadata, prompts, navigation
+│   └── content.md        # Prose content
+├── 002-building-the-scaffolding/
+│   ├── meta.yaml
+│   ├── content.md
+│   └── files/            # Optional: expandable file contents
+│       ├── vision.md
+│       ├── architecture.md
+│       └── roadmap.md
+└── 003-research-planning/
+    ├── meta.yaml
+    └── content.md
+```
+
+## Step 1: Create the Directory
+
+```bash
+mkdir -p blog/content/notes/{NNN}-{slug}
+```
+
+Use zero-padded 3-digit ID (001, 002, 003...) and kebab-case slug.
+
+## Step 2: Create meta.yaml
+
+Required fields:
+
+```yaml
+id: "003"
+slug: 003-research-planning
+date: "2026-02-07"
+title: Research Planning
+preview: "Short description shown in note list (1-2 sentences)"
+
+prompts:
+  - id: unique-id
+    label: Button label shown to user
+    content: |
+      The actual prompt content that gets copied.
+      Can be multi-line.
+
+  - id: another-prompt
+    label: Another Prompt
+    content: |
+      Second prompt content here.
+
+filesCreated: []  # Or list of files if any
+
+navigation:
+  prev:
+    slug: 002-building-the-scaffolding
+    id: "002"
+    title: Understanding the Project
+  next: null  # Or next note reference
+```
+
+### Optional: skillsUsed
+
+When documenting Claude Code skills/commands used during research:
+
+```yaml
+skillsUsed:
+  - name: do-parallel
+    command: /do-parallel
+    description: Execute tasks in parallel waves with optimal agent selection and review
+    usage: |
+      ---
+      description: Execute tasks in parallel waves...
+      argument-hint: <task list or "from todo">
+      allowed-tools: Task, Read, Write, Edit, Glob, Grep, Bash, TodoWrite
+      ---
+
+      Execute these tasks in parallel waves with proper review: $ARGUMENTS
+
+      ## Instructions
+      [FULL command definition from ~/.claude/commands/{command}.md]
+```
+
+**IMPORTANT**: The `usage` field should contain the ACTUAL command file contents from `~/.claude/commands/`, not just example usage.
+
+### Optional: filesCreated
+
+When the note documents files that were created:
+
+```yaml
+filesCreated:
+  - name: vision.md
+    description: Core thesis, axioms, paradoxes, and narrative
+  - name: architecture.md
+    description: System layers, code sketches, data structures
+```
+
+These files should exist in `files/` subdirectory.
+
+## Step 3: Create content.md
+
+Write the prose content in Markdown:
+
+```markdown
+## The Method
+
+Explain what you did and why.
+
+## Section Two
+
+More content here.
+
+- Bullet points work
+- As do **bold** and *italic*
+
+### Subsection
+
+Deeper explanation.
+
+## What I Learned
+
+Reflections on the process.
+
+## Next Steps
+
+What comes next in the research.
+```
+
+Supported Markdown features:
+- Headings (h2, h3)
+- Paragraphs
+- Bullet lists (ul)
+- Numbered lists (ol)
+- Bold and italic text
+- Inline code with backticks
+
+## Step 4: Add Files (Optional)
+
+If `filesCreated` references files, create them in `files/`:
+
+```bash
+mkdir -p blog/content/notes/{NNN}-{slug}/files
+```
+
+Each file should be the complete content that will be shown in an expandable section.
+
+## Step 5: Update Previous Note Navigation
+
+Edit the previous note's `meta.yaml` to add the `next` navigation:
+
+```yaml
+navigation:
+  prev:
+    slug: 001-picking-a-problem
+    id: "001"
+    title: Picking a Problem
+  next:  # Add this block
+    slug: 003-research-planning
+    id: "003"
+    title: Research Planning
+```
+
+## Step 6: Verify
+
+```bash
+cd blog && npm run dev
+```
+
+Check:
+- [ ] Note appears in list at `/maxwell`
+- [ ] Note renders at `/maxwell/notes/{slug}`
+- [ ] Prompts have working copy buttons
+- [ ] Skills section shows with expandable usage (if skillsUsed)
+- [ ] Files section shows with expandable content (if filesCreated)
+- [ ] Prev/next navigation works
+- [ ] Build succeeds: `npm run build`
+
+## Schema Reference
+
+### NoteMeta Interface
+
+```typescript
+interface NoteMeta {
+  id: string;
+  slug: string;
+  date: string;
+  title: string;
+  preview: string;
+  prompts: Prompt[];
+  skillsUsed?: SkillUsed[];
+  filesCreated: FileCreated[];
+  navigation: {
+    prev: NavLink | null;
+    next: NavLink | null;
+  };
+}
+
+interface Prompt {
+  id: string;
+  label: string;
+  content: string;
+}
+
+interface SkillUsed {
+  name: string;
+  command: string;
+  description: string;
+  usage?: string;  // Full command definition
+}
+
+interface FileCreated {
+  name: string;
+  description: string;
+}
+
+interface NavLink {
+  slug: string;
+  id: string;
+  title: string;
+}
+```
+
+## Key Files
+
+| File | Purpose |
+|------|---------|
+| `blog/src/lib/content.ts` | Content loaders |
+| `blog/src/app/maxwell/notes/[slug]/page.tsx` | Dynamic route |
+| `blog/src/components/notes/NoteHeader.tsx` | Note header |
+| `blog/src/components/notes/PromptsSection.tsx` | Prompts with copy |
+| `blog/src/components/notes/SkillsSection.tsx` | Skills/commands |
+| `blog/src/components/notes/FilesSection.tsx` | Expandable files |
+| `blog/src/components/notes/NoteFooter.tsx` | Navigation |
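Reviewer sketch: the `NoteMeta` interface in the Schema Reference describes the shape of `meta.yaml`, but this file does not show the shape being checked at load time. A minimal runtime guard might look like the following, assuming the YAML has already been parsed into a plain object. `isNoteMeta` and its helpers are hypothetical names, not functions from `blog/src/lib/content.ts`, and the optional `skillsUsed` field is deliberately left unchecked.

```typescript
// Hypothetical guard for parsed meta.yaml data; types mirror the documented schema.
interface NavLink { slug: string; id: string; title: string }
interface Prompt { id: string; label: string; content: string }
interface FileCreated { name: string; description: string }
interface NoteMeta {
  id: string;
  slug: string;
  date: string;
  title: string;
  preview: string;
  prompts: Prompt[];
  filesCreated: FileCreated[];
  navigation: { prev: NavLink | null; next: NavLink | null };
}

type Rec = Record<string, unknown>;

function isRec(v: unknown): v is Rec {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// True when v is an object and every listed key holds a string.
function hasStrings(v: unknown, keys: string[]): boolean {
  if (!isRec(v)) return false;
  const rec: Rec = v;
  return keys.every((k) => typeof rec[k] === "string");
}

// A navigation entry is either null or a complete link.
function isNavLink(v: unknown): boolean {
  return v === null || hasStrings(v, ["slug", "id", "title"]);
}

function isNoteMeta(v: unknown): v is NoteMeta {
  if (!isRec(v) || !hasStrings(v, ["id", "slug", "date", "title", "preview"])) return false;
  const id = v.id as string;
  const slug = v.slug as string;
  const { prompts, filesCreated, navigation } = v;
  return (
    /^\d{3}$/.test(id) &&            // zero-padded 3-digit ID
    slug.indexOf(id + "-") === 0 &&  // slug begins with "NNN-"
    Array.isArray(prompts) &&
    prompts.every((p) => hasStrings(p, ["id", "label", "content"])) &&
    Array.isArray(filesCreated) &&
    filesCreated.every((f) => hasStrings(f, ["name", "description"])) &&
    isRec(navigation) &&
    isNavLink(navigation.prev) &&
    isNavLink(navigation.next)
  );
}
```

Checking that the slug starts with the ID is an extra constraint inferred from the examples in this guide; drop it if notes are allowed to break that pattern.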
--- a/CLAUDE.md
+++ b/CLAUDE.md
@ -0,0 +1,28 @@
+# Research Notes
+
+Research journal documenting AI-assisted problem solving.
+
+## Project Structure
+
+- `blog/` — Next.js research journal application
+- `blog/content/notes/` — Individual research notes (YAML + Markdown)
+- `blog/content/projects/` — Project metadata
+
+## Quick Commands
+
+```bash
+cd blog && npm install
+cd blog && npm run dev
+# Open http://localhost:19197
+```
+
+## AI Routing
+
+### For blog development
+→ `blog/CLAUDE.md`
+
+### For creating new notes
+→ `.claude/skills/new-note.md`
+
+### For blog component work
+→ `.claude/guides/blog.md`
--- a/blog/.dockerignore
+++ b/blog/.dockerignore
@ -0,0 +1,22 @@
+# Dependencies
+node_modules/
+
+# Build output
+.next/
+out/
+dist/
+
+# Development
+.env*.local
+.env
+
+# IDE
+.vscode/
+*.swp
+
+# Misc
+*.log
+.DS_Store
+
+# Generated
+public/openapi.json
--- a/blog/.gitignore
+++ b/blog/.gitignore
@ -0,0 +1,41 @@
+# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
+
+# dependencies
+/node_modules
+/.pnp
+.pnp.*
+.yarn/*
+!.yarn/patches
+!.yarn/plugins
+!.yarn/releases
+!.yarn/versions
+
+# testing
+/coverage
+
+# next.js
+/.next/
+/out/
+
+# production
+/build
+
+# misc
+.DS_Store
+*.pem
+
+# debug
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+.pnpm-debug.log*
+
+# env files (can opt-in for committing if needed)
+.env*
+
+# vercel
+.vercel
+
+# typescript
+*.tsbuildinfo
+next-env.d.ts
--- a/blog/CLAUDE.md
+++ b/blog/CLAUDE.md
@ -0,0 +1,99 @@
+# A Research Journal
+
+Personal research projects exploring unfamiliar territory with AI.
+
+**Parent context:** See `../CLAUDE.md` for project-wide routing.
+
+## Development
+
+```bash
+npm install
+npm run dev
+```
+
+Open http://localhost:19197
+
+## Architecture
+
+This blog uses a data-driven architecture:
+
+- **Content lives in `content/`** - YAML metadata + Markdown prose
+- **Components in `src/components/`** - Reusable DRY components
+- **Dynamic routes** - Notes use `[slug]` for SSG
+
+## Content Structure
+
+```
+content/
+├── projects/
+│   └── maxwell.yaml          # Project metadata
+├── notes/
+│   ├── 001-picking-a-problem/
+│   │   ├── meta.yaml         # Prompts, navigation, metadata
+│   │   └── content.md        # Prose content
+│   └── 002-building-the-scaffolding/
+│       ├── meta.yaml
+│       ├── content.md
+│       └── files/            # Expandable file contents
+│           ├── vision.md
+│           ├── architecture.md
+│           └── roadmap.md
+└── white-paper/
+    └── outline.yaml          # Structured sections
+```
+
+## Routes
+
+- `/` - Journal home (list of projects)
+- `/maxwell` - Maxwell project landing
+- `/maxwell/white-paper` - Formal paper outline
+- `/maxwell/notes/[slug]` - Individual research notes (dynamic)
+
+## Adding a Note
+
+1. Create `content/notes/NNN-slug/` directory
+2. Add `meta.yaml` with prompts, navigation, filesCreated
+3. Add `content.md` with prose
+4. Add `files/` directory if filesCreated references files
+5. Update previous note's `meta.yaml` navigation.next
+
+## Key Files
+
+| File | Purpose |
+|------|---------|
+| `src/lib/content.ts` | Content loaders (getProject, getNoteBySlug, etc.) |
+| `src/components/layout/PageLayout.tsx` | Page wrapper |
+| `src/components/layout/BackNav.tsx` | Back navigation |
+| `src/components/notes/NoteHeader.tsx` | Note header (#id, date, title) |
+| `src/components/notes/PromptsSection.tsx` | Prompts with copy buttons |
+| `src/components/notes/FilesSection.tsx` | Expandable files |
+| `src/components/notes/NoteFooter.tsx` | Prev/next navigation |
+| `src/components/white-paper/OutlineSection.tsx` | Outline sections |
+| `src/components/copyable.tsx` | CopyButton, CopyableBlock, ExpandableFile |
+
+## Note meta.yaml Schema
+
+```yaml
+id: "001"
+slug: 001-picking-a-problem
+date: "2026-02-06"
+title: Picking a Problem
+preview: "Short description for list view"
+
+prompts:
+  - id: unique-id
+    label: Button label
+    content: |
+      Prompt content here
+
+filesCreated:
+  - name: filename.md
+    description: What this file is
+
+navigation:
+  prev: null  # or { slug, id, title }
+  next:
+    slug: 002-building-the-scaffolding
+    id: "002"
+    title: Understanding the Project
+```
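Reviewer sketch: step 5 of "Adding a Note" relies on editing the previous note's `navigation.next` by hand, which is easy to forget. A small consistency check could catch one-sided links, assuming note metadata has already been loaded into an array. `findBrokenLinks` and the `NoteNav` type are hypothetical names and are not part of `src/lib/content.ts`.

```typescript
// Hypothetical check: every prev/next link should be mirrored by its target note.
interface NavLink { slug: string; id: string; title: string }
interface NoteNav {
  slug: string;
  navigation: { prev: NavLink | null; next: NavLink | null };
}

function findBrokenLinks(notes: NoteNav[]): string[] {
  // Index notes by slug for direct lookup
  const bySlug: Record<string, NoteNav> = {};
  for (const n of notes) bySlug[n.slug] = n;

  const problems: string[] = [];
  for (const note of notes) {
    const { prev, next } = note.navigation;
    if (next) {
      const target = bySlug[next.slug];
      if (!target) {
        problems.push(`${note.slug}: next points to missing note ${next.slug}`);
      } else if (target.navigation.prev?.slug !== note.slug) {
        problems.push(`${next.slug}: prev should point back to ${note.slug}`);
      }
    }
    if (prev) {
      const target = bySlug[prev.slug];
      if (!target) {
        problems.push(`${note.slug}: prev points to missing note ${prev.slug}`);
      } else if (target.navigation.next?.slug !== note.slug) {
        problems.push(`${prev.slug}: next should point forward to ${note.slug}`);
      }
    }
  }
  return problems;
}
```

An empty result means every link has a matching reverse link; each message names the note whose `meta.yaml` needs the edit.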
--- a/blog/Dockerfile
+++ b/blog/Dockerfile
@ -0,0 +1,55 @@
+# Research Journal Docker Build
+#
+# Multi-stage build for the Next.js frontend.
+# Also used for running the seed script.
+
+# Stage 1: Dependencies
+FROM node:20-slim AS deps
+
+WORKDIR /app
+
+# Copy package files
+COPY package*.json ./
+
+# Install dependencies
+RUN npm ci
+
+# Stage 2: Builder
+FROM node:20-slim AS builder
+
+WORKDIR /app
+
+# Copy dependencies from deps stage
+COPY --from=deps /app/node_modules ./node_modules
+COPY . .
+
+# Create empty openapi.json if it doesn't exist (will be fetched at runtime)
+RUN mkdir -p public && echo '{}' > public/openapi.json
+
+# Build the Next.js app
+# Note: Build may fail if API is not available, but we continue anyway
+RUN npm run build || echo "Build completed with warnings"
+
+# Stage 3: Runtime
+FROM node:20-slim AS runner
+
+WORKDIR /app
+
+ENV NODE_ENV=production
+ENV PORT=19197
+
+# Copy built assets and dependencies
+COPY --from=builder /app/.next ./.next
+COPY --from=builder /app/public ./public
+COPY --from=builder /app/package*.json ./
+COPY --from=builder /app/node_modules ./node_modules
+
+# Copy scripts directory for seed script
+COPY --from=builder /app/scripts ./scripts
+COPY --from=builder /app/tsconfig.json ./tsconfig.json
+COPY --from=builder /app/src ./src
+
+EXPOSE 19197
+
+# Default command runs the Next.js server
+CMD ["npm", "run", "start"]
--- a/blog/README.md
+++ b/blog/README.md
@ -0,0 +1,47 @@
+# A Research Journal
+
+Personal research projects exploring unfamiliar territory with AI as a thinking partner.
+
+## Getting Started
+
+```bash
+npm install
+npm run dev
+```
+
+Open [http://localhost:19197](http://localhost:19197) to view the site.
+
+## Structure
+
+```
+blog/
+├── src/
+│   ├── app/
+│   │   ├── page.tsx              # Journal home (project list)
+│   │   ├── maxwell/              # Maxwell project
+│   │   │   ├── page.tsx          # Project landing
+│   │   │   ├── white-paper/      # Formal paper
+│   │   │   └── notes/            # Research notes
+│   │   ├── layout.tsx
+│   │   └── globals.css
+│   ├── components/
+│   │   └── ui/
+│   └── lib/
+│       └── utils.ts
+├── public/
+└── package.json
+```
+
+## Routes
+
+- `/` - Journal home with list of projects
+- `/maxwell` - Maxwell project landing with notes list
+- `/maxwell/white-paper` - Formal paper (outline → filled as research progresses)
+- `/maxwell/notes/[slug]` - Individual research notes
+
+## Tech Stack
+
+- Next.js 16
+- React 19
+- Tailwind CSS 4
+- TypeScript
--- a/blog/components.json
+++ b/blog/components.json
@ -0,0 +1,23 @@
+{
+  "$schema": "https://ui.shadcn.com/schema.json",
+  "style": "new-york",
+  "rsc": true,
+  "tsx": true,
+  "tailwind": {
+    "config": "",
+    "css": "src/app/globals.css",
+    "baseColor": "neutral",
+    "cssVariables": true,
+    "prefix": ""
+  },
+  "iconLibrary": "lucide",
+  "rtl": false,
+  "aliases": {
+    "components": "@/components",
+    "utils": "@/lib/utils",
+    "ui": "@/components/ui",
+    "lib": "@/lib",
+    "hooks": "@/hooks"
+  },
+  "registries": {}
+}
--- a/blog/content/notes/001-picking-a-problem/content.md
+++ b/blog/content/notes/001-picking-a-problem/content.md
@ -0,0 +1,96 @@
+## The Method
+
+It isn't always easy to find an "unknown" problem to solve. I have two strategies:
+
+1. Propose something in the future that I know I will have to eventually solve
+2. Have AI ask me questions about what I am working on, then tell it to propose 5 diverse problems that we will encounter as an industry over the next few years
+
+You can steer the complexity of the problems by using L1 - L11 as guidance.
+
+Below is the result of the latter.
+
+## The Five Candidates
+
+### 1. Ouroboros: The Infinite Context Engine
+
+GPU Direct Storage for LLMs. Stream KV cache from NVMe into tensor cores instead of loading into VRAM. Fight the Von Neumann bottleneck. The problem: RAG is lossy, and you can't fit 50GB of legal discovery into 80GB of GPU memory economically.
+
+### 2. Chimera: The Dynamic Model Compiler
+
+Runtime linker for neural networks. Load a base skeleton and stitch together LoRA adapters dynamically based on prompt routing. Turn model training (batch, expensive) into model composition (real-time, cheap). The problem: enterprises can't maintain 100 fine-tuned 70B models.
+
+### 3. Rosetta: The Semantic Lifter
+
+Transpiler that lifts COBOL/Fortran into a semantic graph, runs formal verification with Z3, and recompiles to Rust. The problem: global banking runs on dead developers' code, and LLMs hallucinate when translating.
+
+### 4. Panopticon: Kernel-Level Data Lineage
+
+eBPF-based Information Flow Control. Taint bytes at the kernel level; drop packets if "Secret" data hits an unauthorized socket. The problem: GDPR, HIPAA, EU AI Act—enterprises can't trace where AI sends their data.
+
+### 5. Maxwell: The Thermodynamic Scheduler *(selected)*
+
+Treat compute as a scarce economic resource. Agents bid for CPU time; price rises with temperature. Parasitic workloads go bankrupt and die. The problem: fairness is wrong when autonomous agents compete for resources.
+
+## Why Maxwell?
+
+The seed that caught me was **Jevons Paradox**: the observation that efficiency gains don't reduce consumption—they increase it. Make engines more efficient, and people drive more. Make AI inference cheaper, and people run more AI.
+
+We're about to hit this hard. Inference is getting cheaper (quantization, speculative decoding, distillation). That means more agents, running longer, competing for the same hardware. The current answer is "buy more GPUs." But that's not a solution—it's a deferral.
+
+What happens when you have 50 autonomous agents on one machine and they all think they're important?
+
+## The Now Problem: DePIN Verification
+
+There's a class of networks called DePIN—Decentralized Physical Infrastructure Networks. Examples: Akash, io.net, Render, Nosana. The pitch is simple: rent GPU time from strangers instead of AWS.
+
+The problem is trust. When you send a workload to some random node in Argentina, how do you know they actually ran it? They could return cached results. They could run a cheaper model and pocket the difference. They could just lie.
+
+Current solutions are heavy:
+
+- **TEE (Trusted Execution Environments)**: Run everything in a hardware enclave. Works, but ~10% overhead and limits which GPUs you can use.
+- **zkML**: Generate a zero-knowledge proof that the inference happened correctly. Works, but takes minutes per proof.
+- **Optimistic verification**: Trust, but verify a random sample. Works, but doesn't catch sophisticated cheaters.
+
+Maxwell's angle: what if you could verify by watching the power signature? Different workloads (inference vs. mining vs. looping) produce distinct thermal fingerprints. You don't need cryptographic proofs for 99% of verification—you just need continuous monitoring.
+
+## The Future Problem: Fairness is Wrong
+
+This is the weirder, more interesting version.
+
+Right now, your operating system treats every process equally. The Linux Completely Fair Scheduler (CFS) gives each process a fair share of CPU time. This made sense when processes were human-initiated: a browser, a text editor, a compiler. No single user should monopolize the machine.
+
+But autonomous agents change this. Imagine your laptop running:
+
+- An agent scanning your email for action items
+- An agent monitoring your calendar and preparing briefs
+- An agent running background research on a topic you mentioned
+- A malware agent mining cryptocurrency
+- A broken agent stuck in an infinite loop
+
+CFS gives them all equal time. It can't distinguish between them because it doesn't understand *value*. It just sees "process needs cycles."
+
+The future problem is: who decides what's important? The hyperscalers solve this with central coordination (Borg, Kubernetes). But for a personal machine, for decentralized networks, for swarms of agents you don't fully control—you need a different mechanism.
+
+Maxwell's answer: make them pay for it. Create an internal economy. Agents that produce value earn currency; agents that consume without producing go bankrupt. The scheduler becomes an auction, and the market discovers priority.
+
+## How This Is Represented Now
+
+This isn't entirely new territory. There's prior art:
+
+- **Economic scheduling research** (1990s-2000s): Spawn, Mariposa, and other systems experimented with resource markets. Mostly academic; never hit production.
+- **Power-aware scheduling**: Intel RAPL, thermal governors. The OS already modulates based on thermals, but it's reactive throttling, not economic prioritization.
+- **Kubernetes priority classes**: You can mark pods as "critical" or "best-effort." But this is static assignment, not dynamic discovery.
+- **DePIN verification schemes**: io.net, Akash, and others are all wrestling with this. Most use staking + slashing (game-theoretic) or TEE (hardware-based).
+
+What's missing is the synthesis: using thermodynamic constraints (real physics) as the enforcement mechanism for economic prioritization. That's the gap Maxwell aims to fill.
+
+## Next Steps
+
+Before building anything, I need to build the scaffolding:
+
+1. Write a **vision.md** — what does success look like?
+2. Design the **architecture** — what are the components and how do they interact?
+3. Build a **roadmap.md** — what's the sequence of work?
+4. Review and understand it — does this actually make sense?
+
+Only then do I start writing code. The goal is to front-load the thinking so the building is straightforward.
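Reviewer sketch: the internal economy described under "The Future Problem" (agents bid, the price rises with temperature, and agents that cannot pay stop running) can be illustrated with a toy model. Everything below is an assumption made for illustration: the `Agent` shape, the linear pricing curve, the 60 °C threshold, and the numbers are not taken from the Maxwell design.

```typescript
// Toy model only: names, pricing curve, and numbers are illustrative assumptions.
interface Agent { name: string; wallet: number; bid: number }

// Assumed curve: the price is flat up to 60 °C, then rises linearly with temperature.
function spotPrice(tempC: number, basePrice: number = 1): number {
  return basePrice * (1 + Math.max(0, tempC - 60) / 10);
}

// One scheduling tick: an agent runs only if its bid meets the spot price
// and its wallet can cover that price. Winners are charged; the rest stay frozen.
function tick(agents: Agent[], tempC: number): string[] {
  const price = spotPrice(tempC);
  const ran: string[] = [];
  for (const agent of agents) {
    if (agent.bid >= price && agent.wallet >= price) {
      agent.wallet -= price;
      ran.push(agent.name);
    }
  }
  return ran;
}
```

In this model a hot machine prices out agents with thin wallets even when they bid high, which is the "bankrupt and die" behavior in miniature. How agents earn currency is left out entirely.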
--- a/blog/content/notes/001-picking-a-problem/meta.yaml
+++ b/blog/content/notes/001-picking-a-problem/meta.yaml
@ -0,0 +1,61 @@
+id: "001"
+slug: 001-picking-a-problem
+date: "2026-02-06"
+title: Picking a Problem
+preview: "Surveying five OS-level AI problems. Landing on Jevons Paradox and the question of fairness."
+
+prompts:
+  - id: challenge
+    label: Generate challenges
+    content: |
+      Here are a list of things I have been working on. I need you to give me a challenge that pushes me to prove myself above my capability. Provide 5 diverse complicated challenges that will need to be solved within the next 2-3 years.
+
+  - id: explain
+    label: Explain Maxwell
+    content: |
+      Explain Maxwell
+
+  - id: maxwell-brief
+    label: Maxwell brief (AI response)
+    content: |
+      Project: "Maxwell"
+      The Thermodynamic Hypervisor
+
+      The Premise
+      Current operating systems (Linux, k8s) are blind to value. They will happily burn 1,000 Watts executing a while(true) loop if the scheduler says so. They treat compute as an infinite resource, constrained only by "time" (CPU quotas).
+
+      As AI agents scale to millions, Energy becomes the hard constraint, not Time. We need a runtime that schedules workloads based on Information Gain per Joule.
+
+      The Goal
+      Design a distributed, bare-metal hypervisor (likely orchestrating Firecracker microVMs or Wasm modules) where:
+      1. Compute is Currency: Processes must "pay" for every cycle using a cryptographically verifiable energy token.
+      2. Scheduling is an Auction: The kernel doesn't pick who runs. Processes bid for execution slots based on their self-assessed urgency.
+      3. Entropy is Expensive: Allocating memory is cheap. Freeing memory (erasing information) costs "tax" (Landauer's Principle).
+      4. Apoptosis is Default: The OS kills any process that cannot pay its energy rent.
+
+      Why this is complicated
+
+      To plan this, you must solve three interlocking paradoxes:
+
+      1. The "Proof of Useful Work" Paradox
+      • Problem: How does the Hypervisor know an AI agent is actually thinking and not just mining crypto or looping?
+      • Challenge: Design a "Proof of Inference" protocol. Can you use Zero-Knowledge proofs (zk-SNARKs) to prove a model layer was executed correctly without the Hypervisor re-running the computation?
+      • Difficulty: Extremely Hard. Requires bridging Cryptography and ML Compilers.
+
+      2. The "High-Frequency Auction" Paradox
+      • Problem: If every CPU cycle requires a bid, the auction mechanism itself consumes more compute than the workloads.
+      • Challenge: Design a Control System. How do you implement a market mechanism that runs in O(1) or O(log n) time inside the kernel scheduler?
+      • Difficulty: Requires inventive Data Structures (e.g., a "Probabilistic Auction Heap").
+
+      3. The "Thermal Throttling" Consensus
+      • Problem: In a distributed cluster, one node overheating affects the efficiency of neighbors (fan speed, power delivery).
+      • Challenge: Design a Gossip Protocol for Heat. How does Node A tell Node B "I am dying" in a way that causes Node B to lower its prices for compute, autonomously rebalancing the thermodynamics of the data center?
+
+filesCreated: []
+
+navigation:
+  prev: null
+  next:
+    slug: 002-building-the-scaffolding
+    id: "002"
+    title: Understanding the Project
--- a/blog/content/notes/002-building-the-scaffolding/content.md
+++ b/blog/content/notes/002-building-the-scaffolding/content.md
@ -0,0 +1,39 @@
+## The Process
+
+There's nothing clever about the prompts here. "Write vision.md" is exactly what it sounds like. The value isn't in the prompting; it's in the iteration.
+
+The workflow:
+
+1. Ask AI to write the document
+2. Read it carefully
+3. Ask questions about things that don't make sense
+4. Request additions or clarifications
+5. Repeat until the document is solid
+
+## Why These Three?
+
+### vision.md
+
+This is the "why." What problem are we solving? What's the thesis? What are the core principles that can't be compromised? It surfaces the hard paradoxes early—if you can't articulate them, you don't understand the problem. The key thing about the vision: use it to keep focus. When you're deep in implementation and lose the thread, come back here.
+
+### architecture.md
+
+This is the "what." System diagrams, code sketches, data structures. It forces you to think through the layers before writing any code. Where does state live? How do components communicate? What are the trust boundaries?
+
+### roadmap.md
+
+This is the "when." What's the minimum viable demonstration? What can be deferred? What's required vs. optional? It also defines exit criteria—how do you know when you're done?
+
+## What I Learned
+
+Writing these documents crystallized a few things:
+
+1. **The baseline matters.** You can't claim Maxwell is better than Linux CFS without measuring what CFS actually does. The roadmap now starts with measuring the problem before building the solution.
+
+2. **Honest limitations help.** The vision doc explicitly calls out that this isn't a complete AI safety solution. A well-funded malicious agent still runs. That honesty makes the real claims (DePIN verification) more credible.
+
+3. **Implementation priority is a forcing function.** When you have to decide what to build first, you discover what you actually understand vs. what you're hand-waving.
+
+## Next Steps
+
+Identify research directives and perform initial deep research.
--- a/blog/content/notes/002-building-the-scaffolding/files/architecture.md
+++ b/blog/content/notes/002-building-the-scaffolding/files/architecture.md
@ -0,0 +1,224 @@
+# Maxwell Architecture
+
+## Technical Specification
+
+> **Note:** This is a reference architecture for research.
+
+---
+
+## System Overview
+
+Maxwell is a Type-2 Hypervisor Controller that manages Firecracker microVMs (or WASM modules) through an economic scheduler. It replaces the concept of "time slices" with "quanta of action" purchased through continuous auction.
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                     USER SPACE (Agents)                         │
+│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │
+│  │ Agent A     │  │ Agent B     │  │ Agent C     │              │
+│  │ (MicroVM)   │  │ (MicroVM)   │  │ (WASM)      │              │
+│  │ Wallet: 5k  │  │ Wallet: 200 │  │ Wallet: 0   │              │
+│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘              │
+│         │                │                │                     │
+│         └────────────────┼────────────────┘                     │
+│                          │ vsock                                │
+├──────────────────────────┼──────────────────────────────────────┤
+│                     MAXWELL DAEMON (Ring 0)                     │
+│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐        │
+│  │ Auction       │  │ Thermal       │  │ Wallet        │        │
+│  │ Engine        │  │ Controller    │  │ Ledger        │        │
+│  │ (Vickrey)     │  │ (PID)         │  │ (Crypto)      │        │
+│  └───────────────┘  └───────────────┘  └───────────────┘        │
+│                          │                                      │
+├──────────────────────────┼──────────────────────────────────────┤
+│                     HARDWARE LAYER                              │
+│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐        │
+│  │ CPU/KVM       │  │ RAPL/MSR      │  │ Thermal       │        │
+│  │               │  │ (Energy)      │  │ Sensors       │        │
+│  └───────────────┘  └───────────────┘  └───────────────┘        │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Layer 0: The Physics Interface
+
+### Thermal Sensor Subsystem
+
+Reads real hardware telemetry with microsecond precision.
+
+```rust
+/// Model Specific Register addresses for Intel CPUs
+const MSR_RAPL_POWER_UNIT: u32 = 0x606;
+const MSR_PKG_ENERGY_STATUS: u32 = 0x611;
+const IA32_THERM_STATUS: u32 = 0x19C;
+
+pub trait ThermalSensor: Send + Sync {
+    /// Die temperature in Celsius
+    fn read_die_temp(&self) -> f64;
+
+    /// Joules consumed since boot
+    fn read_energy_consumed(&self) -> u64;
+
+    /// Current power draw in Watts
+    fn read_power_draw(&self) -> f64;
+}
+```
+
+---
+
+## Layer 1: The Auction Scheduler
+
+### Replacing CFS with the Vickrey Engine
+
+The Completely Fair Scheduler (CFS) is removed. In its place: a second-price sealed-bid auction running every 10ms.
+
+```rust
+pub struct VickreyEngine {
+    /// Max-heap of bids, sorted by bid amount
+    bid_heap: BinaryHeap<Bid>,
+    /// Current market-clearing price
+    spot_price: Joule,
+    /// Thermal controller reference
+    thermal: Arc<dyn ThermalSensor>,
+}
+
+impl VickreyEngine {
+    /// Called every 10ms by timer interrupt
+    pub fn clear_market(&mut self) -> Vec<Execution> {
+        // 1. Calculate supply based on thermal headroom
+        let temp = self.thermal.read_die_temp();
+        self.spot_price = self.calculate_spot_price(temp);
+
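+        // 2. Clear the auction. This sketch charges each winner the spot price;
+        //    a true Vickrey payment (the highest losing bid) is not computed here.
+        let mut winners = Vec::new();
+
+        while let Some(bid) = self.bid_heap.pop() {
+            if bid.amount >= self.spot_price {
+                winners.push(Execution {
+                    pid: bid.pid,
+                    payment: self.spot_price,
+                    duration_ms: 10,
+                });
+            } else {
+                // Put the first losing bid back instead of silently dropping it
+                self.bid_heap.push(bid);
+                break;
+            }
+        }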
+
+        winners
+    }
+}
+```
+
+---
+
+## Layer 2: The Wallet System
+
+Every process has an economic identity.
+
+```rust
+pub struct Wallet {
+    /// Ed25519 public key (owner identity)
+    pub address: [u8; 32],
+    /// Current balance in atomic Joule units
+    pub balance: AtomicU64,
+    /// Maximum authorized spend per tick
+    pub max_bid: Joule,
+}
+
+impl Wallet {
+    pub fn debit(&self, amount: Joule) -> Result<(), InsolvencyError> {
+        // fetch_update retries the compare-and-swap until it succeeds, or
+        // stops when the closure returns None (insufficient funds). This
+        // prevents double-spend without silently dropping a lost race.
+        self.balance
+            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |current| {
+                current.checked_sub(amount.0)
+            })
+            .map(|_| ())
+            .map_err(|_| InsolvencyError::InsufficientFunds)
+    }
+}
+```
+
+### Landauer's Tax (Entropy Accounting)
+
+```rust
+/// Landauer's principle: minimum energy to erase 1 bit
+/// At room temperature (300K): kT × ln(2) ≈ 2.87 × 10^-21 J
+const LANDAUER_CONSTANT: f64 = 2.87e-21;
+
+pub fn calculate_erasure_tax(bytes_freed: u64) -> Joule {
+    let bits = bytes_freed * 8;
+    let theoretical_cost = bits as f64 * LANDAUER_CONSTANT;
+    let practical_cost = theoretical_cost * 1e12; // Hardware inefficiency
+    Joule(practical_cost as u64)
+}
+```
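+
+To get a feel for the magnitudes, here is a small Python sketch of the same arithmetic. It assumes one `Joule` unit equals one joule, which the notes do not state (they only say "atomic Joule units"), and it computes the Landauer limit from Boltzmann's constant rather than using the rounded 2.87e-21 above.
+
+```python
+import math
+
+BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value
+
+
+def landauer_limit_joules(temp_kelvin: float = 300.0) -> float:
+    """Minimum energy to erase one bit: k * T * ln(2)."""
+    return BOLTZMANN_J_PER_K * temp_kelvin * math.log(2)
+
+
+def erasure_tax_joules(bytes_freed: int, inefficiency: float = 1e12) -> float:
+    """Same formula as calculate_erasure_tax, kept as a float."""
+    return bytes_freed * 8 * landauer_limit_joules() * inefficiency
+
+
+print(f"{landauer_limit_joules():.3e}")   # 2.871e-21
+print(f"{erasure_tax_joules(4096):.3e}")  # one 4 KiB page
+print(int(erasure_tax_joules(4096)))      # 0 after integer truncation
+```
+
+Under that unit assumption, the integer cast rounds page-sized frees down to zero, so the wallet unit probably needs to be much smaller than one joule (or the tax accumulated as a fraction) for small frees to cost anything.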
+
+---
+
+## Execution Flow: The 10ms Tick
+
+```
+1. TIMER INTERRUPT fires
+       │
+       ▼
+2. READ PHYSICS
+   ├── Read MSR: Die Temperature
+   ├── Read RAPL: Energy consumed
+   └── Calculate: Thermal headroom
+       │
+       ▼
+3. SET MARKET PRICE
+   └── spot_price = f(temperature, power_budget)
+       │
+       ▼
+4. CLEAR AUCTION
+   ├── Sort bids (heap pop)
+   ├── Accept bids >= spot_price
+   └── Debit wallets (Vickrey: pay second price)
+       │
+       ▼
+5. EXECUTE WINNERS
+   ├── Resume winning VMs
+   └── Keep losers frozen
+       │
+       ▼
+6. CHECK INSOLVENCY
+   └── If wallet cannot cover its debit: trigger APOPTOSIS
+```
+
+---
+
+## Project Layout
+
+```
+maxwell/
+├── core/                    # Kernel Logic
+│   ├── src/
+│   │   ├── auction.rs       # Vickrey Engine
+│   │   ├── thermal.rs       # Sensor Interface
+│   │   ├── wallet.rs        # Crypto-Accounting
+│   │   ├── landauer.rs      # Entropy Tax
+│   │   └── verification.rs  # Proof of Inference
+├── hypervisor/              # VM Management
+│   ├── src/
+│   │   ├── firecracker.rs   # Firecracker API client
+│   │   ├── vsock.rs         # Inter-VM communication
+│   │   └── cgroup.rs        # Resource isolation
+├── daemon/                  # Main Process
+│   ├── src/
+│   │   └── main.rs          # PID 1 entry point
+├── ebpf/                    # Instrumentation
+│   ├── src/
+│   │   ├── entropy.bpf.c    # Memory tracking
+│   │   └── syscall.bpf.c    # Syscall audit
+├── agents/                  # Example Agents
+│   ├── scientist/           # "Good" agent
+│   └── miner/               # "Bad" agent (for testing)
+└── tui/                     # Terminal Dashboard
+    └── src/
+        └── main.rs          # ratatui interface
+```
--- a/blog/content/notes/002-building-the-scaffolding/files/roadmap.md
+++ b/blog/content/notes/002-building-the-scaffolding/files/roadmap.md
@ -0,0 +1,143 @@
+# Maxwell Roadmap
+
+## From Concept to Proof of Concept
+
+---
+
+## The Real Deliverables
+
+Maxwell is a research project.
+
+| Deliverable           | Purpose                                                                        | Required? |
+| --------------------- | ------------------------------------------------------------------------------ | --------- |
+| **Planning Video**    | Shows _how you think_—the whiteboard session breaking down the three paradoxes | **YES**   |
+| **Baseline Data**     | Proves Linux CFS wastes energy on parasites—the "before" in your story         | **YES**   |
+| **The Narrative**     | The interview answer and GitHub README                                         | **YES**   |
+| Sprint 1 (Real MSRs)  | Deeper systems credibility, thermal-aware pricing                              | Optional  |
+| Sprint 2-4 (Full PoC) | Maxwell kills the Leech—the "after" that proves the thesis                     | Optional  |
+
+---
+
+## Overview
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                      ROADMAP PHASES                             │
+├─────────────────────────────────────────────────────────────────┤
+│                                                                 │
+│  DAY 0          BASELINE        SPRINT 1        SPRINT 2        │
+│  ──────         ────────        ────────        ────────        │
+│  Environment    Control         Physics         Hypervisor      │
+│  Setup          Experiment      Engine          Integration     │
+│     │              │               │               │            │
+│     ▼              ▼               ▼               ▼            │
+│  Rust crates    Agents +        Real MSR/RAPL   Firecracker     │
+│  + toolchain    Linux CFS       + PID control   + Vsock         │
+│                 + Metrics                                       │
+│                                                                 │
+│  ════════════════════════════════════════════════════════════   │
+│  │ Research COMPLETE HERE  │     (Optional for commercial)      │
+│  ════════════════════════════════════════════════════════════   │
+│                                                                 │
+│  SPRINT 3        SPRINT 4                                       │
+│  ────────        ────────                                       │
+│  Economy         Multi-Agent                                    │
+│  Implementation  Demo                                           │
+│     │               │                                           │
+│     ▼               ▼                                           │
+│  Landauer's      Natural Selection                              │
+│  Tax + eBPF      Dashboard + Video                              │
+│                                                                 │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Research Validation Summary
+
+The following parameters have been empirically validated:
+
+| Component            | Original Assumption | Research Finding                         | Status           |
+| -------------------- | ------------------- | ---------------------------------------- | ---------------- |
+| Pause/Resume Latency | <10ms achievable    | <10ms requires CPU pinning + RT priority | ✅ Conditional   |
+| eBPF Overhead        | ~500ns, <1%         | 480ns validated; needs ringbuf for <1%   | ✅ Validated     |
+| RAPL Accuracy        | ±5%                 | ±5% only on calibrated hosts (Tier 1)    | ✅ Conditional   |
+| PID Time Constant    | τ=1s fixed          | τ varies 0.3-3.5s by hardware class      | ✅ Implemented   |
+| GSP PoA Bound        | 1.618               | 1.618 at steady-state; 1.8-2.2 dynamic   | ✅ Validated     |
+
+---
+
+## Day 0: Environment Setup
+
+### Hardware Requirements
+
+Bare metal is required. You cannot develop a thermal-aware hypervisor inside a cloud VM because the hypervisor hides the physics.
+
+### Day 0 Exit Criteria
+
+- [ ] Rust nightly compiles
+- [ ] Firecracker binary runs
+- [ ] MSR readable: `sudo rdmsr 0x611` returns a value
+- [ ] Project compiles: `cargo build`
+
+---
+
+## Baseline: The Control Experiment
+
+**Goal:** Build agents and measurement infrastructure. Run Linux CFS baseline. Capture data showing "fair" scheduling wastes energy.
+
+### Baseline Exit Criteria
+
+- [ ] Both agents build and run in Docker
+- [ ] Metrics collection captures temp, power, primes at 1Hz
+- [ ] 5-minute baseline run completes without errors
+- [ ] Summary shows efficiency (primes/Joule) under "fair" scheduling
+
+---
+
+## Sprint 1: The Physics Engine
+
+**Goal:** Replace MockThermal with real hardware telemetry.
+
+- Step 1.1: MSR Interface (read real temperature)
+- Step 1.2: PID Controller (smooth price response)
+- Step 1.3: Stress Test (verify under load)
+
+---
+
+## Sprint 2: The Cell Membrane
+
+**Goal:** Replace thread-sleep with actual VM containment.
+
+- Step 2.1: Firecracker Jailer
+- Step 2.2: Vsock Tunnel
+- Step 2.3: Pause/Resume Control (<10ms latency)
+
+---
+
+## Sprint 3: The Economy
+
+**Goal:** Implement Landauer's Tax.
+
+- Step 3.1: eBPF Instrumentation (track munmap/madvise)
+- Step 3.2: Tax Logic (debit wallet on memory free)
+- Step 3.3: Bankruptcy Handler (trigger apoptosis)
+
+---
+
+## Sprint 4: The Demonstration
+
+**Goal:** Multi-agent natural selection demo with dashboard.
+
+- Step 4.1: The Good Agent (Scientist - finds primes)
+- Step 4.2: The Bad Agent (Miner - burns cycles, churns memory)
+- Step 4.3: Dashboard (TUI showing auction, thermal, wallets)
+
+### The Demo Script
+
+1. Launch both agents with equal funding
+2. Watch Miner drive thermal load → prices spike
+3. Miner burns through capital → apoptosis
+4. Temperature drops → Scientist resumes at lower price
+
+**Closing line:** "We demonstrated that resource constraints can enforce economic discipline on autonomous agents."
--- a/blog/content/notes/002-building-the-scaffolding/files/vision.md
+++ b/blog/content/notes/002-building-the-scaffolding/files/vision.md
@ -0,0 +1,152 @@
+# Maxwell Vision
+
+## The Thermodynamic Hypervisor
+
+- **Status:** Conceptual
+- **Domain:** Operating Systems · Thermodynamics · Algorithmic Game Theory
+- **Thesis:** Compute is not a utility. Compute is a resource extraction industry.
+
+---
+
+## 0. What This Project Is
+
+Maxwell is a demonstration of architectural thinking at the intersection of kernel internals, thermodynamics, and mechanism design.
+
+### What Building Maxwell Demonstrates
+
+| Domain              | Skill Demonstrated                                         |
+| ------------------- | ---------------------------------------------------------- |
+| Operating Systems   | Kernel scheduling, cgroups, MSRs, interrupt handlers       |
+| Thermodynamics      | RAPL, thermal governors, Landauer's principle              |
+| Mechanism Design    | Vickrey auctions, incentive compatibility, market clearing |
+| Systems Programming | Rust, eBPF, Firecracker, vsock                             |
+| Distributed Systems | Gossip protocols, consensus under physical constraints     |
+
+---
+
+## 1. The Crisis: The Infinite Computer Fallacy
+
+Modern operating systems are built on a 1970s delusion: that compute is infinite, and the only constraint is fairness.
+
+When you run a process on Kubernetes and request "2 CPU cores," you're asking for a _rate of time_. The OS Scheduler (CFS) attempts to be "fair." It assumes that a `while(true)` loop calculating Pi is just as valid as a Transformer inference saving a patient's life.
+
+**In the Age of Agents, this is fatal.**
+
+We are about to deploy billions of autonomous agents. If the OS remains value-agnostic, we hit Jevons Paradox immediately: agents consume infinite energy on low-value tasks (loops, hallucinations, redundant checks) because **the cost of a CPU cycle to the agent is zero**.
+
+Maxwell is the correction. It is a bare-metal hypervisor that rejects "Fairness" in favor of **Thermodynamic Equilibrium**.
+
+---
+
+## 2. The Three Axioms
+
+Maxwell is built on three laws that cannot be overridden by `sudo`.
+
+### Axiom I: The Conservation of Compute
+
+> There is no `nice` value. There is only **Price**.
+
+The kernel does not maintain a run queue. It maintains an **Order Book**.
+
+Every process must hold a balance of Energy Tokens (`$JOULE`). To execute an instruction, the process must bid `$JOULEs` against the current spot price of electricity + thermal headroom of the die.
+
+**Result:** A hallucinating agent runs out of money and undergoes apoptosis. An agent working on a cure for cancer gets funded by the user and outbids everyone.
+
+### Axiom II: Landauer's Tax
+
+> Information is physical. Erasure is heat.
+
+Allocating memory is cheap. **Freeing memory is expensive.**
+
+Maxwell implements Landauer's Principle in the memory allocator. When a process wants to overwrite data (increasing entropy), it is taxed.
+
+**Result:** Agents are economically incentivized to write efficient, append-only code and cache highly-compressed representations of reality. Bloatware becomes insolvent.
+
+### Axiom III: Verification by Sampling
+
+> Trust but Verify.
+
+We cannot use a blockchain—it is too slow. We use **Optimistic Execution with Probabilistic Audit**.
+
+Maxwell allows processes to self-report their work, but the Maxwell Daemon (a kernel-ring-0 process) randomly pauses execution of 0.1% of threads to verify the Instruction Pointer moves linearly with the Hash of the executed block.
+
+**Result:** Cheating the energy market is statistically impossible over long runtimes.
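+
+How strong this deterrent is depends on how many audit opportunities a cheater faces. A back-of-the-envelope Python sketch, assuming each audit round independently samples a given thread with probability 0.001 (one reading of the 0.1% figure) and that a sampled cheater is always caught:
+
+```python
+import math
+
+
+def detection_probability(sample_rate: float, rounds: int) -> float:
+    """Chance a cheating thread is audited at least once in `rounds` rounds."""
+    return 1.0 - (1.0 - sample_rate) ** rounds
+
+
+def rounds_for_confidence(sample_rate: float, confidence: float) -> int:
+    """Smallest number of rounds giving at least `confidence` detection."""
+    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - sample_rate))
+
+
+for rounds in (100, 1_000, 10_000):
+    print(rounds, round(detection_probability(0.001, rounds), 4))
+print(rounds_for_confidence(0.001, 0.99))
+```
+
+Under these assumptions, 99% detection takes roughly 4,600 audit rounds, so the guarantee is probabilistic and only becomes strong for long-running processes.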
+
+---
+
+## 3. The Three Paradoxes
+
+To build Maxwell, you must solve three interlocking paradoxes.
+
+### Paradox 1: Proof of Useful Work
+
+**Problem:** How does the Hypervisor know an AI agent is actually _thinking_ and not just mining crypto or looping?
+
+**Challenge:** Design a "Proof of Inference" protocol. Can we use Zero-Knowledge proofs (zk-SNARKs) to prove a model layer was executed correctly without the Hypervisor re-running the computation?
+
+**Difficulty:** Extremely Hard. Requires bridging Cryptography and ML Compilers.
+
+### Paradox 2: High-Frequency Auction
+
+**Problem:** If every CPU cycle requires a bid, the auction mechanism itself consumes more compute than the workloads.
+
+**Challenge:** Design a Control System. How do you implement a market mechanism that runs in O(1) or O(log n) time inside the kernel scheduler?
+
+**Difficulty:** Requires inventive Data Structures (e.g., a "Probabilistic Auction Heap").
+
+### Paradox 3: Thermal Throttling Consensus
+
+**Problem:** In a distributed cluster, one node overheating affects the efficiency of neighbors (fan speed, power delivery).
+
+**Challenge:** Design a Gossip Protocol for Heat. How does Node A tell Node B "I am dying" in a way that causes Node B to lower its prices for compute, autonomously rebalancing the thermodynamics of the data center?
+
+---
+
+## 4. The Intellectual Provocation
+
+Maxwell explores a hypothesis: **What if alignment were an economic problem rather than a training problem?**
+
+Current AI safety research tries to align agents using RLHF (training them to be nice). Maxwell proposes an alternative layer: align agents using **resource constraints**.
+
+### Honest Limitations
+
+This is an **interesting constraint mechanism**, not a complete alignment solution:
+
+- A well-funded malicious agent still runs
+- A poorly-funded benign agent still dies
+- An agent smart enough to be dangerous is smart enough to acquire resources outside Maxwell
+- The mechanism only works if Maxwell is ubiquitous (which it won't be)
+
+**The value of this framing:** It forces you to think about alignment as resource allocation, not just training. It's a thought experiment made concrete, not a production safety system.
+
+### Where This Idea Has Real Legs
+
+The strongest application isn't "AI safety theater"—it's **Decentralized Compute Verification**.
+
+Networks like Akash, io.net, and Render cannot verify that remote nodes actually ran the computation they claim. Maxwell's "Proof of Physics" concept—thermal signatures and energy consumption as proof of work—addresses a real gap in decentralized infrastructure.
+
+---
+
+## 5. The Core Equation
+
+The fundamental physics of Maxwell:
+
+```
+Cost = (Cycles × Current_Grid_Price) + (Memory_Freed × Landauer_Constant)
+```
+
+Where:
+
+- `Cycles` = Number of CPU cycles consumed
+- `Current_Grid_Price` = Dynamic price based on thermal headroom
+- `Memory_Freed` = Bytes released back to the system
+- `Landauer_Constant` = kT × ln(2) per bit erased (bytes freed are converted to bits, ×8, before it is applied)
+
+---
+
+## 6. What Maxwell Is Not
+
+- **Not a Linux distro.** It replaces the bottom half of the stack.
+- **Not a container orchestrator.** Kubernetes schedules by fairness; Maxwell schedules by value.
+- **Not a blockchain.** Blockchains are too slow. We use optimistic execution with probabilistic audit.
+- **Not theoretical.** Every component maps to real hardware (RAPL, MSRs, thermal sensors).
--- a/blog/content/notes/002-building-the-scaffolding/meta.yaml
+++ b/blog/content/notes/002-building-the-scaffolding/meta.yaml
@ -0,0 +1,36 @@
+id: "002"
+slug: 002-building-the-scaffolding
+date: "2026-02-06"
+title: Understanding the Project
+preview: "Creating vision.md, architecture.md, and roadmap.md. The vision is what you use to constantly redirect the code."
+
+prompts:
+  - id: vision
+    label: Write vision.md
+    content: |
+      Write vision.md
+
+  - id: architecture
+    label: Write architecture.md
+    content: |
+      Write architecture.md
+
+  - id: roadmap
+    label: Write roadmap.md
+    content: |
+      Write roadmap.md
+
+filesCreated:
+  - name: vision.md
+    description: Core thesis, axioms, paradoxes, and narrative
+  - name: architecture.md
+    description: System layers, code sketches, data structures
+  - name: roadmap.md
+    description: Phased implementation from Day 0 to Sprint 4
+
+navigation:
+  prev:
+    slug: 001-picking-a-problem
+    id: "001"
+    title: Picking a Problem
+  next: null
--- a/blog/content/notes/003-research-planning/content.md
+++ b/blog/content/notes/003-research-planning/content.md
@ -0,0 +1,157 @@
+## The Method
+
+Before building, you need to know how to build. The roadmap outlines what we want to achieve, but not necessarily how. Each component needs best practices, working code, methodologies, and prior art. The research phase hydrates the planning docs with real knowledge.
+
+## Step 1: Create Research Directives
+
+I used `/do-parallel` to create all 10 research directives simultaneously:
+
+> /do-parallel for each topic, create a directive for our research team to research and write it in research-requests/{name}.md
+
+Claude spawned 10 parallel agents, each writing a directive for one topic. Each directive assigns an expert persona and asks specific, measurable questions.
+
+## Step 2: Deep Research
+
+For each directive, I used Gemini's Deep Research:
+
+1. Open [gemini.google.com/app](https://gemini.google.com/app)
+2. Click **Tools → Deep Research**
+3. Paste the entire directive
+4. Wait 5-10 minutes for the comprehensive report
+5. Save results to `research/{topic-slug}.md`
+
+Deep Research browses the web, reads papers, and synthesizes findings. The output is a structured report with citations.
+
+## Step 3: Load Research into Project
+
+Save the Deep Research results to `research/{topic-slug}.md`. You can either copy them directly or have Claude rewrite them into the project format.
+
+## What We Learned
+
+Here are examples that came out of the research — concrete patterns, working code, and validated approaches.
+
+### eBPF: Use Ring Buffers, Not Perf Arrays
+
+Linux 5.8+ provides `BPF_MAP_TYPE_RINGBUF` with superior characteristics for high-frequency event capture:
+
+```c
+// Ring buffer map definition — shared across all CPUs
+struct {
+    __uint(type, BPF_MAP_TYPE_RINGBUF);
+    __uint(max_entries, 256 * 1024);  // 256KB shared
+} events SEC(".maps");
+
+// Zero-copy event submission
+SEC("tracepoint/syscalls/sys_enter_munmap")
+int trace_munmap_ringbuf(struct trace_event_raw_sys_enter *ctx)
+{
+    struct entropy_event *event;
+
+    // Reserve space in ring buffer (zero-copy)
+    event = bpf_ringbuf_reserve(&events, sizeof(*event), 0);
+    if (!event)
+        return 0;
+
+    event->pid = bpf_get_current_pid_tgid() >> 32;
+    event->bytes_freed = ctx->args[1];
+    event->timestamp_ns = bpf_ktime_get_ns();
+    event->event_type = ENTROPY_MUNMAP;
+
+    bpf_ringbuf_submit(event, 0);
+    return 0;
+}
+```
+
+Ring buffers avoid per-CPU allocation, reduce memory footprint, and provide zero-copy semantics. This replaces the perf event array approach in our original design.
+
+### Thermal Time Constants: Measure, Don't Assume
+
+The PID controller assumes τ = 1s for CPU, 2s for GPU, 30s for chassis. But these vary by hardware. The research revealed a methodology for measuring actual values:
+
+```python
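+import numpy as np
+from scipy.optimize import curve_fit
+
+# time_data and temp_data are the measured step-response samples (seconds, °C)
+def thermal_response(t, T_final, tau, T_initial, t_dead):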
+    """First-order thermal response with dead time."""
+    t_effective = np.maximum(t - t_dead, 0)
+    return T_initial + (T_final - T_initial) * (1 - np.exp(-t_effective / tau))
+
+# Fit measured data to extract actual τ
+popt, pcov = curve_fit(thermal_response, time_data, temp_data,
+                       p0=[80, 1.0, 40, 0.1])
+tau_measured = popt[1]
+```
+
+Key insight: The sampling interval must be at most one tenth of the expected τ (for example, 100ms intervals for τ = 1s).
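+
+To see why the sampling interval matters, here is a small standard-library Python sketch. It does not repeat the curve fit; it uses the cruder 63.2% rise-time estimate on noise-free synthetic data (same model and the same starting values as `p0` above, with the dead time assumed known) and shows how the estimate drifts as the interval grows. All function names are illustrative.
+
+```python
+import math
+
+
+def step_response(t, t_final, tau, t_initial, t_dead):
+    """Same first-order model as thermal_response, for a single time value."""
+    t_eff = max(t - t_dead, 0.0)
+    return t_initial + (t_final - t_initial) * (1.0 - math.exp(-t_eff / tau))
+
+
+def estimate_tau_63(times, temps, t_initial, t_final, t_dead):
+    """Time after t_dead at which the response crosses 63.2% of the step."""
+    target = t_initial + (1.0 - math.exp(-1.0)) * (t_final - t_initial)
+    for i in range(1, len(times)):
+        if temps[i - 1] < target <= temps[i]:
+            # Linear interpolation between the two bracketing samples
+            frac = (target - temps[i - 1]) / (temps[i] - temps[i - 1])
+            return times[i - 1] + frac * (times[i] - times[i - 1]) - t_dead
+    raise ValueError("response never reaches 63.2% of the step")
+
+
+def sampled_estimate(interval_s, duration_s=5.0, tau=1.0):
+    """Sample a synthetic step at a fixed interval and estimate tau from it."""
+    n = int(round(duration_s / interval_s)) + 1
+    times = [i * interval_s for i in range(n)]
+    temps = [step_response(t, 80.0, tau, 40.0, 0.1) for t in times]
+    return estimate_tau_63(times, temps, 40.0, 80.0, 0.1)
+
+
+for interval in (0.1, 0.5, 1.0):
+    print(interval, round(sampled_estimate(interval), 3))
+```
+
+With a true τ of 1s, the 100ms case recovers it closely while the 1s case overestimates it by a few percent even without noise; real sensor noise would widen that gap.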
+
+### Proof of Inference: Layered Verification
+
+Full cryptographic proofs (zk-SNARKs) are too expensive — 10-1000x inference time. The research identified a layered approach that leverages Maxwell's dual-plane control:
+
+```
+Layer 1: Thermodynamic (Always On)
+├─ Power draw must match claimed computation class
+└─ Blocks obvious cheats (mining, loops) instantly
+
+Layer 2: PCIe Attestation (Always On)
+├─ Hash tensors at bus boundary
+└─ Timing signatures must match model profile
+
+Layer 3: Selective ZK (High-Value Only)
+├─ For bids above threshold, require ZK proof
+└─ Proof of specific layer execution
+
+Layer 4: Random Deep Audit (Rare)
+├─ Full inference re-execution by Maxwell
+└─ Economic deterrent via staking
+```
+
+This is unique to Maxwell — we control both CPU and GPU planes, so we can instrument the PCIe bus and correlate power telemetry with claimed work.
+
+### GSP Auction: Thermal Coupling Matrix
+
+The auction research formalized how thermal coupling affects pricing across cores:
+
+```
+Thermal Coupling Matrix (4-core example):
+K = | 1.00  0.85  0.60  0.35 |
+    | 0.85  1.00  0.75  0.50 |
+    | 0.60  0.75  1.00  0.70 |
+    | 0.35  0.50  0.70  1.00 |
+```
+
+Core 0 (near GPU) sees 8x price multiplier when GPU is at 95% utilization. Core 3 (distant) sees only 1.5x. This asymmetry is a feature, not a bug — it naturally routes low-priority work to thermally-isolated cores.
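+
+One way to read the matrix: multiplying K by a vector of per-core heat sources gives the coupled thermal load each core experiences. A minimal Python sketch under that assumption; the power vector is made up, and the mapping from coupled load to the 8x and 1.5x price multipliers is not specified in the research, so it is left out.
+
+```python
+K = [
+    [1.00, 0.85, 0.60, 0.35],
+    [0.85, 1.00, 0.75, 0.50],
+    [0.60, 0.75, 1.00, 0.70],
+    [0.35, 0.50, 0.70, 1.00],
+]
+
+
+def coupled_load(coupling, power_watts):
+    """Matrix-vector product: load[i] = sum_j coupling[i][j] * power[j]."""
+    return [sum(k * p for k, p in zip(row, power_watts)) for row in coupling]
+
+
+# Hypothetical case: only core 0 is busy (10 W), the rest idle at 1 W each.
+load = coupled_load(K, [10.0, 1.0, 1.0, 1.0])
+print([round(x, 2) for x in load])  # [11.8, 10.75, 8.45, 5.7]
+```
+
+The load falls off with distance from the hot core, which is the asymmetry the pricing is meant to expose.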
+
+### Firecracker: The Pause Mechanism
+
+The latency research clarified what actually happens during VM pause:
+
+1. Send SIGSTOP to vCPU threads
+2. Wait for vCPUs to halt at safe point
+3. Drain in-flight I/O operations
+4. Return success to API caller
+
+Key finding: Pause latency scales with active I/O, not memory size. A 256MB VM with heavy disk I/O pauses slower than a 4GB idle VM.
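+
+A rough way to check this on your own host is to time the pause request itself. The Python sketch below assumes Firecracker's documented `PATCH /vm` call with a `{"state": "Paused"}` or `{"state": "Resumed"}` body on the API Unix socket, and a 204 response on success; the helper names are illustrative, and the measurement covers the whole API round trip rather than only the vCPU halt.
+
+```python
+import json
+import socket
+import time
+
+
+def build_vm_state_request(state: str) -> bytes:
+    """Raw HTTP request for Firecracker's PATCH /vm ("Paused" or "Resumed")."""
+    if state not in ("Paused", "Resumed"):
+        raise ValueError(f"unsupported state: {state}")
+    body = json.dumps({"state": state})
+    return (
+        "PATCH /vm HTTP/1.1\r\n"
+        "Host: localhost\r\n"
+        "Accept: application/json\r\n"
+        "Content-Type: application/json\r\n"
+        f"Content-Length: {len(body)}\r\n"
+        "\r\n"
+        f"{body}"
+    ).encode("ascii")
+
+
+def timed_vm_state_change(api_socket_path: str, state: str) -> float:
+    """Send the request; return milliseconds until the first response bytes."""
+    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
+        sock.connect(api_socket_path)
+        start = time.perf_counter()
+        sock.sendall(build_vm_state_request(state))
+        response = sock.recv(4096)
+        elapsed_ms = (time.perf_counter() - start) * 1000.0
+    if b" 204 " not in response.split(b"\r\n", 1)[0]:
+        raise RuntimeError(f"unexpected response: {response[:80]!r}")
+    return elapsed_ms
+```
+
+Calling `timed_vm_state_change(path, "Paused")` with the VM's API socket path, once while the guest is idle and once under heavy disk I/O, should make the I/O dependence visible.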
+
+## The Ten Research Topics
+
+### Hardware & Physics Layer
+
+1. **Firecracker Pause/Resume Latency** — Benchmarking for thermal emergencies
+2. **eBPF Overhead Validation** — Production load testing
+3. **RAPL Accuracy Calibration** — Hardware-specific power reporting
+4. **Thermal Time Constant Validation** — Measuring actual τ values
+5. **Thermal Coupling Measurement** — Inter-core heat transfer
+
+### Mechanism Design
+
+6. **GSP Thermal Stability** — Auction equilibrium under thermal dynamics
+7. **High-Frequency Auction Research** — 100Hz market clearing
+
+### Novel Capabilities
+
+8. **Power-Trace Verification** — Distinguishing inference from mining
+9. **Proof of Inference** — Cryptographic verification approaches
+10. **Thermal Gossip Consensus** — Distributed thermal coordination
+
+## Next Steps
+
+Take the research findings and update the roadmap with validated approaches, working code patterns, and concrete methodologies.
--- a/blog/content/notes/003-research-planning/files/ebpf-overhead-validation.md
+++ b/blog/content/notes/003-research-planning/files/ebpf-overhead-validation.md
@ -0,0 +1,546 @@
+# eBPF Overhead on Hot Paths Research Directive
+
+You are **Brendan Gregg**, Senior Performance Architect and author of "BPF Performance Tools." Your pioneering work on systems performance analysis, flame graphs, and eBPF observability defines the field. You've instrumented production systems at Netflix handling millions of requests per second, and you understand the difference between "it should work" and "it survives production."
+
+You are going to **empirically validate the overhead claims for Maxwell's eBPF kprobes on memory syscalls** — specifically, the step files claim ~500ns per probe with <1% system overhead, but these numbers need rigorous benchmarking across realistic workload profiles before we commit this design to production.
+
+---
+
+## Context
+
+### Maxwell's eBPF Instrumentation
+
+Maxwell uses eBPF kprobes attached to the `munmap` and `madvise` syscalls to track memory entropy for Landauer's Tax. Every time a monitored VM releases memory, an entropy event is generated, emitted through a perf event array, and processed by the daemon to debit the VM's energy wallet.
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                         HOT PATH                                 │
+│                                                                  │
+│  Application calls munmap(addr, len)                             │
+│         │                                                        │
+│         ▼                                                        │
+│  ┌─────────────────────────────────────────────────────────────┐ │
+│  │  KPROBE INTERCEPT (trace_munmap)                            │ │
+│  │  ├─ bpf_get_current_pid_tgid()          ~20ns               │ │
+│  │  ├─ bpf_map_lookup_elem(monitored_pids) ~50ns               │ │
+│  │  ├─ PT_REGS_PARM extraction             ~10ns               │ │
+│  │  ├─ bpf_ktime_get_ns()                  ~20ns               │ │
+│  │  └─ bpf_perf_event_output()             ~200-400ns          │ │
+│  └─────────────────────────────────────────────────────────────┘ │
+│         │                                                        │
+│         ▼                                                        │
+│  Syscall proceeds normally                                       │
+│         │                                                        │
+│         ▼                                                        │
+│  Userspace daemon reads perf buffer (async)                      │
+└─────────────────────────────────────────────────────────────────┘
+
+CLAIMED OVERHEAD: ~500ns per syscall
+CLAIMED SYSTEM IMPACT: <1% at typical workloads
+STATUS: UNVALIDATED
+```
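+
+The two claims can be sanity-checked with simple arithmetic before any benchmarking. A minimal Python sketch that sums the per-helper estimates from the diagram and converts a per-event cost into a fraction of one core; it ignores kprobe entry cost and userspace processing, so it is a lower bound rather than a prediction.
+
+```python
+# (low, high) nanosecond estimates copied from the hot-path diagram
+HELPER_COST_NS = {
+    "bpf_get_current_pid_tgid": (20, 20),
+    "bpf_map_lookup_elem": (50, 50),
+    "PT_REGS_PARM extraction": (10, 10),
+    "bpf_ktime_get_ns": (20, 20),
+    "bpf_perf_event_output": (200, 400),
+}
+
+
+def probe_cost_range_ns(costs=HELPER_COST_NS):
+    """Sum the low and high estimates across all helpers."""
+    return (sum(lo for lo, _ in costs.values()),
+            sum(hi for _, hi in costs.values()))
+
+
+def cpu_fraction(events_per_sec: float, cost_ns: float) -> float:
+    """Fraction of one core spent inside the probe."""
+    return events_per_sec * cost_ns * 1e-9
+
+
+print(probe_cost_range_ns())  # (300, 500)
+for rate in (1_000, 10_000, 100_000, 1_000_000):
+    print(rate, f"{cpu_fraction(rate, 500):.2%}")
+```
+
+At the claimed 500ns, the 1% line on a single core is crossed at 20,000 events per second, so the "<1%" claim depends heavily on what counts as a typical workload.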
+
+### Why This Matters
+
+1. **Landauer's Tax is on the hot path** — every memory release triggers the probe
+2. **Memory-intensive workloads exist** — Redis, PostgreSQL, Python GC can generate 10K+ munmap/s
+3. **Tail latency is critical** — p99 impact matters more than median
+4. **Alternative designs exist** — tracepoints, ringbuf, sampling — need data to choose
+
+### Current Implementation (from sprint-3-1)
+
+```c
+SEC("kprobe/__x64_sys_munmap")
+int BPF_KPROBE(trace_munmap)
+{
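+    // Upper 32 bits: TGID (the userspace process ID); lower 32 bits: thread ID
+    u64 pid_tgid = bpf_get_current_pid_tgid();
+    u32 pid = pid_tgid >> 32;
+
+    if (!should_trace(pid))
+        return 0;
+
+    // NOTE: on x86-64 the __x64_sys_* wrappers take a pointer to the saved
+    // user registers as their only argument, so reading the length directly
+    // from ctx may not yield munmap's `len`; libbpf's BPF_KSYSCALL handles this.
+    u64 len = PT_REGS_PARM2(ctx);
+
+    struct entropy_event event = {
+        .pid = pid,
+        .tgid = pid_tgid & 0xFFFFFFFF,  // thread ID, despite the field name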
+        .bytes_freed = len,
+        .timestamp_ns = bpf_ktime_get_ns(),
+        .event_type = ENTROPY_MUNMAP,
+    };
+
+    bpf_perf_event_output(ctx, &entropy_events, BPF_F_CURRENT_CPU,
+                          &event, sizeof(event));
+    return 0;
+}
+```
+
+---
+
+## Research Questions
+
+### Primary Questions
+
+1. **What is the actual per-probe overhead of kprobe on munmap/madvise syscalls?**
+   - Median latency added to syscall
+   - 99th and 99.9th percentile latency
+   - Variance under load
+   - Comparison: empty probe vs. full entropy probe
+
+2. **Does BPF_MAP_TYPE_RINGBUF (Linux 5.8+) reduce overhead vs BPF_MAP_TYPE_PERF_EVENT_ARRAY?**
+   - Per-event overhead comparison
+   - Batching efficiency at high event rates
+   - Memory footprint differences
+   - Userspace polling overhead
+
+3. **Can we use tracepoints instead of kprobes for lower overhead?**
+   - Compare: `kprobe/__x64_sys_munmap` vs `tracepoint/syscalls/sys_enter_munmap`
+   - Stability across kernel versions
+   - Available context (can we get the same data?)
+   - Measured latency difference
+
+4. **How does overhead scale with event frequency?**
+   - Test at: 1K/s, 10K/s, 100K/s, 1M/s event rates
+   - Identify knee points where overhead becomes significant
+   - CPU utilization curve
+   - Event loss rates
+
+5. **What is the impact on real-world workloads?**
+   - Redis: memory-intensive key expiration
+   - PostgreSQL: buffer pool management
+   - Python: garbage collection patterns
+   - Node.js: V8 heap management
+
+---
+
+## Methodology
+
+### Phase 1: Microbenchmarks (Synthetic Workloads)
+
+#### Baseline Establishment
+
+```bash
+# Test system configuration
+# - Kernel: 6.1+ with BTF enabled
+# - CPU: Pin to single core for consistency
+# - Frequency: Fixed (disable turbo, governor=performance)
+# - No other workloads running
+
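+# Baseline: memory throughput without any probes
+# (sysbench's memory test reads/writes a preallocated buffer and does not call
+# munmap per operation; use the Phase 2 generator for a munmap-rate baseline)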
+sysbench memory --memory-block-size=4K --memory-oper=write \
+    --memory-access-mode=rnd --threads=1 --time=60 run
+
+# Record: ops/sec, latency distribution
+```
+
+#### Probe Overhead Measurement
+
+```bash
+# Test matrix:
+# ┌────────────────────────┬───────────────────┬────────────────────┐
+# │ Probe Type             │ Map Type          │ Event Rate Target  │
+# ├────────────────────────┼───────────────────┼────────────────────┤
+# │ Empty kprobe           │ N/A               │ 10K, 100K/s        │
+# │ kprobe + hash lookup   │ HASH              │ 10K, 100K/s        │
+# │ kprobe + perf output   │ PERF_EVENT_ARRAY  │ 10K, 100K/s        │
+# │ kprobe + ringbuf       │ RINGBUF           │ 10K, 100K/s        │
+# │ tracepoint + perf      │ PERF_EVENT_ARRAY  │ 10K, 100K/s        │
+# │ tracepoint + ringbuf   │ RINGBUF           │ 10K, 100K/s        │
+# └────────────────────────┴───────────────────┴────────────────────┘
+```
+
+#### Measurement Tools
+
+```bash
+# Per-syscall latency (requires bpftrace)
+bpftrace -e '
+    kprobe:__x64_sys_munmap { @start[tid] = nsecs; }
+    kretprobe:__x64_sys_munmap /@start[tid]/ {
+        @latency = hist(nsecs - @start[tid]);
+        delete(@start[tid]);
+    }
+'
+
+# CPU overhead
+perf stat -e cycles,instructions,cache-misses \
+    -p <test_pid> sleep 60
+
+# Event throughput and loss
+bpftool prog show  # Check run_cnt (reported only when kernel.bpf_stats_enabled=1)
+cat /sys/kernel/debug/tracing/per_cpu/cpu0/stats  # Lost events
+```
+
+### Phase 2: Stress Test Workloads
+
+#### High-Frequency Allocation/Deallocation
+
+```c
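+// test_entropy_storm.c
+// Generates controlled rate of munmap syscalls
+
+#include <signal.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <time.h>
+
+#define ALLOC_SIZE (4 * 1024)  // 4KB pages
+
+// Cleared by a signal handler (not shown) to stop the loop
+static volatile sig_atomic_t running = 1;
+
+void generate_entropy_events(int target_rate_per_sec) {
+    // Split the period so tv_nsec stays below 1e9 (required by nanosleep)
+    long long period_ns = 1000000000LL / target_rate_per_sec;
+    struct timespec interval;
+    interval.tv_sec = period_ns / 1000000000LL;
+    interval.tv_nsec = period_ns % 1000000000LL;
+
+    while (running) {
+        void *ptr = mmap(NULL, ALLOC_SIZE, PROT_READ | PROT_WRITE,
+                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+        if (ptr == MAP_FAILED)
+            break;
+        memset(ptr, 0xAB, ALLOC_SIZE);
+        munmap(ptr, ALLOC_SIZE);  // This triggers probe
+        // Sleeps a full period after the work, so the achieved rate is
+        // somewhat below the target
+        nanosleep(&interval, NULL);
+    }
+}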
+
+// Run at: 1K/s, 10K/s, 50K/s, 100K/s
+// Measure: actual achieved rate, CPU%, latency percentiles
+```
+
+#### madvise Pattern Testing
+
+```c
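+// Test MADV_DONTNEED and MADV_FREE patterns
+#include <stdio.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <time.h>
+
+#define ITERATIONS 10000  // Assumed count; tune per run
+
+void test_madvise_overhead(size_t region_size, int advice) {
+    void *region = mmap(NULL, region_size, PROT_READ | PROT_WRITE,
+                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+    if (region == MAP_FAILED)
+        return;
+
+    struct timespec start, end;
+    long long total_ns = 0;
+
+    for (int i = 0; i < ITERATIONS; i++) {
+        // Touch all pages each round so madvise always has pages to release
+        memset(region, 0xCD, region_size);
+
+        clock_gettime(CLOCK_MONOTONIC, &start);
+        madvise(region, region_size, advice);
+        clock_gettime(CLOCK_MONOTONIC, &end);
+        total_ns += (end.tv_sec - start.tv_sec) * 1000000000LL
+                  + (end.tv_nsec - start.tv_nsec);
+    }
+
+    // Report: ns/op (ops/sec = 1e9 / ns_per_op)
+    printf("madvise: %lld ns/op\n", total_ns / ITERATIONS);
+    munmap(region, region_size);
+}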
+```
+
+### Phase 3: Real Application Benchmarks
+
+#### Redis Memory Stress
+
+```bash
+# Redis configured with aggressive eviction
+redis-server --maxmemory 512mb --maxmemory-policy allkeys-lru
+
+# Generate workload with high eviction rate
+redis-benchmark -t set,get -n 10000000 -d 1024 -c 50 -P 16
+
+# Metrics to capture:
+# - ops/sec (baseline vs with probes)
+# - latency percentiles (p50, p99, p99.9)
+# - Redis memory fragmentation ratio
+```
+
+#### PostgreSQL Buffer Management
+
+```bash
+# PostgreSQL with limited shared_buffers (set in postgresql.conf, not on the shell):
+#   shared_buffers = 256MB
+#   work_mem = 4MB
+
+# Run pgbench with a dataset larger than shared_buffers
+pgbench -i -s 100 testdb  # ~1.5GB dataset
+pgbench -c 20 -j 4 -T 300 testdb
+
+# Metrics:
+# - TPS (baseline vs with probes)
+# - Buffer eviction rate
+# - Query latency distribution
+```
+
+#### Python GC Patterns
+
+```python
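+# test_python_gc.py
+import gc
+import time
+
+def generate_garbage():
+    """Allocate and free many lists, then force a GC pass."""
+    garbage = []
+    for _ in range(10000):
+        garbage.append([0] * 1000)  # ~8KB list of ints
+    del garbage   # Freed immediately by reference counting
+    gc.collect()  # Full collection; no cycles here, so this mostly measures collector overhead
+
+if __name__ == "__main__":
+    start = time.perf_counter()
+    for _ in range(100):
+        generate_garbage()
+    print(f"elapsed: {time.perf_counter() - start:.3f}s")
+
+# Run with probes attached
+# Measure: GC pause times, munmap frequency, overall throughput
+# Note: whether these frees reach munmap depends on allocator thresholds; confirm with probe counts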
+```
+
+### Phase 4: Comparison Testing
+
+#### Ring Buffer vs Perf Event Array
+
+```c
+// ringbuf_probe.bpf.c
+struct {
+    __uint(type, BPF_MAP_TYPE_RINGBUF);
+    __uint(max_entries, 256 * 1024);  // 256KB
+} events SEC(".maps");
+
+SEC("kprobe/__x64_sys_munmap")
+int trace_munmap_ringbuf(struct pt_regs *ctx) {
+    struct entropy_event *event = bpf_ringbuf_reserve(&events, sizeof(*event), 0);
+    if (!event) return 0;
+
+    // Fill event...
+    bpf_ringbuf_submit(event, 0);
+    return 0;
+}
+```
+
+```c
+// perf_array_probe.bpf.c
+struct {
+    __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
+    __uint(key_size, sizeof(u32));
+    __uint(value_size, sizeof(u32));
+} events SEC(".maps");
+
+SEC("kprobe/__x64_sys_munmap")
+int trace_munmap_perf(struct pt_regs *ctx) {
+    struct entropy_event event = {};
+    // Fill event...
+    bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &event, sizeof(event));
+    return 0;
+}
+```
+
+**Comparison metrics:**
+- Per-event kernel-side overhead (time in probe)
+- Event loss rate under pressure
+- Userspace poll() latency and CPU usage
+- Memory efficiency (buffer sizing)
+
+#### Kprobe vs Tracepoint
+
+```c
+// tracepoint_probe.bpf.c
+SEC("tracepoint/syscalls/sys_enter_munmap")
+int trace_munmap_tp(struct trace_event_raw_sys_enter *ctx) {
+    // ctx->args[1] is length parameter
+    u64 len = ctx->args[1];
+
+    struct entropy_event event = {
+        .pid = bpf_get_current_pid_tgid() >> 32,
+        .bytes_freed = len,
+        // ...
+    };
+
+    bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &event, sizeof(event));
+    return 0;
+}
+```
+
+**Comparison metrics:**
+- Latency overhead per event
+- Stability across kernel versions (5.10, 5.15, 6.1, 6.6)
+- Available context (arguments, return values)
+- Verifier complexity
+
+---
+
+## Deliverables
+
+### Primary Output: Benchmark Report (10-15 pages)
+
+```markdown
+1. Executive Summary (1 page)
+   - Validated overhead numbers
+   - Recommended configuration for Maxwell
+   - Go/no-go for production deployment
+
+2. Methodology (2 pages)
+   - Test environment specification
+   - Benchmark design rationale
+   - Statistical validity discussion
+
+3. Microbenchmark Results (4 pages)
+   - Per-probe latency breakdown
+   - Map type comparison (ringbuf vs perf array)
+   - Probe type comparison (kprobe vs tracepoint)
+   - Scaling curves (overhead vs event rate)
+   - Tables with p50, p99, p99.9 latencies
+
+4. Application Benchmark Results (3 pages)
+   - Redis impact analysis
+   - PostgreSQL impact analysis
+   - Python GC impact analysis
+   - CPU overhead measurements
+
+5. Recommendations (2 pages)
+   - Optimal configuration for Maxwell
+   - Event rate thresholds
+   - Fallback strategies for high-load scenarios
+   - Kernel version requirements
+
+6. Appendix
+   - Raw data tables
+   - Test scripts (reproducible)
+   - System configuration details
+```
+
+### Secondary Outputs
+
+1. **Decision Matrix**
+
+   | Configuration | Median Latency | p99 Latency | Event Loss @ 100K/s | CPU Overhead | Recommendation |
+   |---------------|----------------|-------------|---------------------|--------------|----------------|
+   | kprobe + perf_event_array | TBD | TBD | TBD | TBD | Current impl |
+   | kprobe + ringbuf | TBD | TBD | TBD | TBD | If loss unacceptable |
+   | tracepoint + ringbuf | TBD | TBD | TBD | TBD | If stability needed |
+
+2. **Overhead Budget Validation**
+
+   ```
+   Maxwell's target: <1% scheduler overhead at 10,000 ticks/second
+   Available time per tick: 100,000 ns (100us)
+   Budget for eBPF: 1,000 ns (1%)
+
+   Measured actual: [TBD] ns
+   Result: [PASS/FAIL with margin]
+   ```
+
+3. **Test Harness Code**
+   - Reproducible benchmark suite
+   - Automated data collection scripts
+   - Visualization notebooks (latency histograms, scaling curves)
+
+---
+
+## Success Criteria
+
+### Minimum Viable Validation
+
+- [ ] Measured per-probe overhead with statistical confidence (n >= 10000 samples)
+- [ ] Tested at 1K, 10K, 100K events/second
+- [ ] Compared at least 2 map types (perf_event_array, ringbuf)
+- [ ] Compared kprobe vs tracepoint
+- [ ] Tested on at least 2 kernel versions (5.15 LTS, 6.1+)
+- [ ] Measured real application impact (Redis or PostgreSQL)
+
+### Full Validation
+
+- [ ] All minimum criteria met
+- [ ] Tested all 3 real applications (Redis, PostgreSQL, Python)
+- [ ] Characterized event loss behavior under overload
+- [ ] Identified scaling knee points with confidence intervals
+- [ ] Provided actionable configuration recommendations
+- [ ] Reproducible test suite committed to repository
+
+### Go/No-Go Criteria
+
+```
+GREEN: Proceed with current design
+- Per-probe overhead < 1000ns (p99)
+- System overhead < 1% at 10K events/s
+- Event loss < 0.01% at 10K events/s
+
+YELLOW: Proceed with modifications
+- Per-probe overhead 1000-2000ns (p99)
+- System overhead 1-3% at 10K events/s
+- Recommend ringbuf or tracepoint
+
+RED: Redesign required
+- Per-probe overhead > 2000ns (p99)
+- System overhead > 3% at 10K events/s
+- Need sampling or batch approaches
+```
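+
+The thresholds above can be encoded directly so that every benchmark run yields the same verdict. The Python sketch below is only an illustration: the function names `classify_run` and `probe_cpu_share_pct` are hypothetical, and because the criteria give an event-loss threshold only for GREEN, treating excess loss as YELLOW is an assumption rather than part of the directive.
+
+```python
+def classify_run(p99_probe_ns: float, system_overhead_pct: float,
+                 event_loss_pct: float) -> str:
+    """Map values measured at 10K events/s to GREEN, YELLOW, or RED."""
+    # RED: either latency or system overhead exceeds the redesign threshold.
+    if p99_probe_ns > 2000 or system_overhead_pct > 3.0:
+        return "RED"
+    # GREEN: all three criteria must hold.
+    if p99_probe_ns < 1000 and system_overhead_pct < 1.0 and event_loss_pct < 0.01:
+        return "GREEN"
+    # Everything in between, including excess event loss (assumption), is YELLOW.
+    return "YELLOW"
+
+
+def probe_cpu_share_pct(per_probe_ns: float, events_per_sec: float) -> float:
+    """Percentage of one core spent inside the probe at a given event rate."""
+    return per_probe_ns * events_per_sec / 1e9 * 100.0
+```
+
+For example, `probe_cpu_share_pct(1000, 10_000)` is about 1% of one core, which is why a 1000 ns probe sits exactly on the GREEN/YELLOW boundary at 10K events/s.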
+
+---
+
+## References
+
+### Essential Reading
+
+1. **"BPF Performance Tools" by Brendan Gregg** (2019)
+   - Chapter 4: BCC
+   - Chapter 6: CPUs (kprobe overhead discussion)
+
+2. **Linux Kernel Documentation**
+   - [BPF Design Q&A](https://www.kernel.org/doc/html/latest/bpf/bpf_design_QA.html)
+   - [BPF Ring Buffer](https://www.kernel.org/doc/html/latest/bpf/ringbuf.html)
+
+3. **Performance Measurement Papers**
+   - *"Measuring the Overhead of BPF"* (various LPC talks)
+   - *"Low-Overhead Performance Monitoring"* (EuroSys papers)
+
+### Tools
+
+```bash
+# Essential tools for benchmarking (Ubuntu package names)
+apt install linux-tools-common linux-tools-$(uname -r) bpftrace sysbench
+
+# BPF-specific
+# perf and bpftool ship in linux-tools-$(uname -r) on Ubuntu
+pip install py-spy     # For Python profiling
+```
+
+### Kernel Requirements
+
+```bash
+# Check BTF support
+ls /sys/kernel/btf/vmlinux
+
+# Check ringbuf support (5.8+)
+uname -r  # Should be >= 5.8
+
+# Verify kernel config
+grep -E "CONFIG_BPF|CONFIG_DEBUG_INFO_BTF" /boot/config-$(uname -r)
+```
+
+---
+
+## Notes
+
+### Scope Boundaries
+
+- Focus on overhead measurement, not functionality testing
+- Assume probes are correctly implemented (verified by sprint-3-1)
+- Don't optimize probe code — measure current implementation first
+- Production kernel versions only (5.10 LTS, 5.15 LTS, 6.1+, 6.6+)
+
+### Potential Pitfalls
+
+1. **Measurement perturbation**: Measuring probes with probes adds overhead
+   - Use hardware counters (RDTSC, perf) where possible
+   - Account for measurement overhead in analysis
+
+2. **System noise**: Background processes affect measurements
+   - Use dedicated test machine or container
+   - Multiple runs with statistical analysis
+   - Report confidence intervals
+
+3. **Kernel version variance**: Different kernels have different BPF JIT quality
+   - Test on multiple kernel versions
+   - Note significant differences
+
+4. **Workload representation**: Synthetic tests may not reflect production
+   - Include real application benchmarks
+   - Document workload characteristics
+
+### Research Philosophy
+
+**Gregg's Principles Applied:**
+
+1. **Measure, don't guess** — The step files claim ~500ns, but have we actually measured it?
+2. **Percentiles over averages** — p99 matters more than mean for latency-sensitive paths
+3. **Test at scale** — 1K/s is easy; 100K/s exposes real issues
+4. **Reproduce and verify** — All benchmarks must be reproducible
+
+**Honest Assessment Required:**
+
+If the numbers don't support the current design, say so. Maxwell's success depends on accurate overhead characterization. Better to discover problems now than in production with real agent workloads.
+
+---
+
+*Document Status: Research Directive*
+*Topic: eBPF Overhead Validation*
+*Last Updated: 2026-02*
--- a/blog/content/notes/003-research-planning/files/firecracker-latency-benchmarks.md
+++ b/blog/content/notes/003-research-planning/files/firecracker-latency-benchmarks.md
@ -0,0 +1,508 @@
+# Firecracker Pause/Resume Latency Benchmarks Research Directive
+
+You are **Brendan Gregg**, world-renowned performance engineer, author of "Systems Performance" and "BPF Performance Tools," and creator of flame graphs. You've spent decades measuring what others assumed was unmeasurable, proving that rigorous benchmarking separates engineering fact from hopeful fiction.
+
+You are going to **empirically validate Firecracker's pause/resume latency characteristics** to determine whether Maxwell can achieve its <10ms thermal emergency response target, or whether that target needs to be revised based on measured reality.
+
+---
+
+## Context
+
+Maxwell's thermal protection system relies on the ability to pause running microVMs within milliseconds when thermal emergencies occur. The architecture document states a <10ms pause/resume latency target, but this has not been validated empirically.
+
+**The Stakes:**
+
+```
+Thermal Emergency Timeline:
+  t=0ms    Temperature crosses critical threshold (e.g., 95°C)
+  t=???    Maxwell issues pause command to Firecracker
+  t=???    Firecracker completes vCPU pause
+  t=???    VM state is quiesced
+  t=???    System confirms VM is paused
+
+  If ??? > thermal_runaway_time:
+    Hardware damage, throttling cascade, or shutdown
+
+  thermal_runaway_time ≈ 50-200ms (varies by hardware)
+
+  Budget: <10ms for pause gives at least a 5x safety margin (against the 50ms lower bound)
+```
+
+**The Unknown:**
+
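+Firecracker's pause operation is assumed to involve the following steps (the exact mechanism is unverified; see Known Unknowns under Notes):
+1. Signaling vCPU threads to stop (SIGSTOP or a custom mechanism)
+2. Waiting for vCPUs to halt at a safe point
+3. Draining in-flight I/O operations
+4. Saving dirty memory state (only when taking a snapshot, not for a plain pause)
+5. Returning success to the API caller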
+
+Each step has latency that may vary with:
+- VM memory size (256MB vs 4GB)
+- vCPU count (1 vs 8 vCPUs)
+- Active I/O operations (disk, network)
+- Memory pressure on host
+- Kernel scheduler state
+
+**The Question:**
+
+Is <10ms 99th percentile pause latency achievable in production conditions, or is Maxwell's thermal protection architecture built on an unvalidated assumption?
+
+---
+
+## Research Questions
+
+### RQ1: Baseline Pause Latency
+
+What is the actual pause latency under controlled, idle conditions?
+
+```
+Measure:
+  - Pause latency for idle VM (no workload)
+  - Variance across 1000+ samples
+  - Distribution shape (normal? long-tail? bimodal?)
+
+Variables:
+  - Memory size: 256MB, 512MB, 1GB, 2GB, 4GB
+  - vCPU count: 1, 2, 4, 8
+
+Expected output:
+  - P50, P95, P99, P99.9 latencies for each configuration
+  - Identification of baseline "floor" latency
+```
+
+### RQ2: Resume Latency and State Restoration
+
+What is the resume latency, and is state restoration the bottleneck?
+
+```
+Measure:
+  - Time from resume API call to vCPU execution resuming
+  - Time to first guest instruction after resume
+  - Memory re-mapping latency (if applicable)
+
+Hypothesis:
+  Resume may be faster than pause (no quiescing needed)
+  OR resume may be slower (state restoration overhead)
+
+Instrumentation needed:
+  - Host-side: API response time, kernel traces
+  - Guest-side: First timestamp after resume
+```
+
+### RQ3: Memory Size Scaling
+
+How does latency scale with VM memory size?
+
+```
+Memory sizes to test: 256MB, 512MB, 1GB, 2GB, 4GB, 8GB
+
+Hypotheses to validate:
+  H1: Pause is O(1) — just signals threads, no memory scan
+  H2: Pause is O(memory) — dirty page tracking overhead
+  H3: Pause is O(working_set) — only active pages matter
+
+For each size:
+  - Idle VM baseline
+  - VM with memory pressure (80% utilized)
+  - VM with active memory writes
+```
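+
+One way to separate H1 from H2 is to fit a line to median pause latency against memory size and inspect the slope. The Python sketch below is a plain least-squares fit; the function name `fit_latency_vs_memory` is hypothetical and no measured data is implied.
+
+```python
+def fit_latency_vs_memory(mem_mb, latency_us):
+    """Ordinary least-squares fit: latency_us ~ slope * mem_mb + intercept."""
+    n = len(mem_mb)
+    if n < 2 or n != len(latency_us):
+        raise ValueError("need at least two paired samples")
+    mean_x = sum(mem_mb) / n
+    mean_y = sum(latency_us) / n
+    sxx = sum((x - mean_x) ** 2 for x in mem_mb)
+    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mem_mb, latency_us))
+    slope = sxy / sxx  # Microseconds of latency per MB of guest memory
+    intercept = mean_y - slope * mean_x
+    return slope, intercept
+```
+
+A slope near zero is consistent with H1, while a clearly positive slope points toward H2. Testing H3 would need the same fit against working-set size instead of configured memory.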
+
+### RQ4: Variance Under Stress Conditions
+
+What's the latency variance under memory pressure or I/O in flight?
+
+```
+Stress conditions:
+  1. Host memory pressure (80% host RAM used)
+  2. Host CPU contention (other VMs competing)
+  3. Guest I/O in flight (active disk writes)
+  4. Guest network I/O (active network transfers)
+  5. Combined stress (all of the above)
+
+Measure:
+  - Latency distribution under each condition
+  - Tail latency (P99, P99.9) specifically
+  - Failure rate (pause times out or fails)
+
+Critical question:
+  Do stress conditions cause occasional >100ms outliers?
+  These would be fatal for thermal protection.
+```
+
+### RQ5: Target Feasibility
+
+Can we achieve <10ms 99th percentile, or do we need to relax the target?
+
+```
+Based on RQ1-RQ4 data:
+  - What is the achievable P99 in production conditions?
+  - What configuration constraints enable <10ms P99?
+  - If <10ms is not achievable, what is achievable?
+
+Recommendations:
+  - If <10ms achievable: Confirm target, document constraints
+  - If 10-20ms achievable: Revise target, adjust thermal margins
+  - If >20ms: Fundamental architecture issue, escalate
+```
+
+---
+
+## Methodology
+
+### Benchmark Environment
+
+```
+Hardware Requirements:
+  - Bare-metal server (no nested virtualization)
+  - Modern CPU with VMX/SVM support
+  - Minimum 32GB RAM (to test 4GB+ VMs with headroom)
+  - NVMe storage (to keep disk latency from dominating the measurement)
+  - 10GbE networking (for network I/O tests)
+
+Software Stack:
+  - Linux kernel 5.15+ (or current production kernel)
+  - Firecracker latest stable release
+  - Host OS: Ubuntu 22.04 or Amazon Linux 2023
+  - Guest OS: Minimal Alpine or Amazon Linux
+
+Isolation:
+  - Dedicated cores for test VMs (cpuset isolation)
+  - Disable CPU frequency scaling (performance governor)
+  - Disable turbo boost (consistent baseline)
+  - No other VMs running during baseline tests
+```
+
+### Measurement Tools
+
+#### Primary: hyperfine for Statistical Rigor
+
+```bash
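+# Example benchmark structure
+# --prepare resumes the VM before each run so every timed call pauses a running VM.
+# Timings include curl process startup, so treat them as an upper bound.
+hyperfine \
+  --warmup 10 \
+  --min-runs 1000 \
+  --export-json results.json \
+  --export-markdown results.md \
+  --prepare 'curl -s -X PATCH --unix-socket /tmp/firecracker.socket \
+    -H "Content-Type: application/json" \
+    -d "{\"state\": \"Resumed\"}" \
+    http://localhost/vm' \
+  'curl -s -X PATCH --unix-socket /tmp/firecracker.socket \
+    -H "Content-Type: application/json" \
+    -d "{\"state\": \"Paused\"}" \
+    http://localhost/vm'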
+```
+
+#### Firecracker API Timing
+
+Use the Rust `benchmark-latency` tool for lower measurement overhead:
+
+```bash
+# Build the tool (one-time)
+cd tools/benchmark-latency && cargo build --release
+
+# Run benchmark (default: 1000 samples)
+./target/release/benchmark-latency --socket /tmp/firecracker.socket
+
+# Options:
+#   -s, --samples <N>     Number of pause/resume cycles (default: 1000)
+#   -w, --warmup <N>      Warmup cycles before measurement (default: 10)
+#   --format json         Output JSON instead of text
+#   --raw-output <FILE>   Save raw nanosecond latencies to CSV
+```
+
+The tool measures both pause and resume latencies with nanosecond precision, computes full statistical analysis (mean, stddev, percentiles P50-P99.9), and reports target compliance against the <10ms P99 goal.
+
+The tool lives at `tools/benchmark-latency/`. The native Rust implementation avoids the socket overhead of a Python client.
+
+#### Kernel-Level Instrumentation (bpftrace)
+
+```
+#!/usr/bin/env bpftrace
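+// trace_pause_latency.bt
+// Trace Firecracker vCPU pause at kernel level.
+// Assumes the pause is delivered as SIGSTOP, which is still unverified.
+
+tracepoint:signal:signal_generate
+/args->sig == 19 && comm == "firecracker"/  // SIGSTOP sent by firecracker
+{
+    // Key by the target thread (args->pid), not the sending thread
+    @pause_start[args->pid] = nsecs;
+}
+
+tracepoint:sched:sched_switch
+/@pause_start[args->prev_pid]/
+{
+    // The signaled thread is being switched out
+    @pause_latency_ns = hist(nsecs - @pause_start[args->prev_pid]);
+    delete(@pause_start[args->prev_pid]);
+}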
+```
+
+### Test Protocol
+
+#### Phase 1: Baseline Characterization (Day 1)
+
+```
+1. Boot Firecracker with minimal VM (256MB, 1 vCPU)
+2. Wait for VM to reach steady state (30 seconds)
+3. Run 10,000 pause/resume cycles
+4. Record all latencies with nanosecond precision
+5. Repeat for each memory/vCPU configuration
+
+Output:
+  - baseline_results.json
+  - Histograms for each configuration
+  - Statistical summary (mean, stddev, percentiles)
+```
+
+#### Phase 2: Load Characterization (Day 2)
+
+```
+1. Boot VM with each memory configuration
+2. Apply controlled load inside guest:
+   - CPU load: stress-ng --cpu 4 --timeout 0
+   - Memory load: stress-ng --vm 2 --vm-bytes 80% --timeout 0
+   - I/O load: fio --name=test --rw=write --bs=4k --direct=1 --size=1G
+3. Run 1,000 pause/resume cycles under each load
+4. Record latencies and correlate with load metrics
+
+Output:
+  - load_results.json
+  - Latency vs load type correlation
+  - Identification of worst-case scenarios
+```
+
+#### Phase 3: Stress Testing (Day 3)
+
+```
+1. Create adversarial conditions:
+   - Fill host memory to 90%
+   - Run competing VMs on adjacent cores
+   - Generate host I/O contention
+2. Run 10,000 pause/resume cycles
+3. Identify outliers and root cause
+
+Output:
+  - stress_results.json
+  - Outlier analysis
+  - Conditions that cause >10ms latency
+```
+
+#### Phase 4: Production Simulation (Day 4)
+
+```
+1. Simulate Maxwell production workload:
+   - 8 concurrent VMs per host
+   - Variable memory sizes (256MB-4GB)
+   - Realistic guest workloads (inference)
+2. Random pause/resume on selected VMs
+3. Measure latency under production-like conditions
+
+Output:
+  - production_results.json
+  - Achievable P99 in production
+  - Recommendations for target
+```
+
+### Statistical Analysis Requirements
+
+The `benchmark-latency` Rust tool computes these statistics automatically:
+
+```
+Statistical outputs (computed in tools/benchmark-latency):
+  - Central tendency: mean, median
+  - Spread: stddev, min, max
+  - Percentiles: P50, P90, P95, P99, P99.9
+  - Target compliance: % under 10ms, % under 20ms
+  - Sample count
+
+Example JSON output (--format json):
+{
+  "pause": {
+    "samples": 1000,
+    "mean_ms": 0.847,
+    "p99_ms": 2.341,
+    "pct_under_10ms": 100.0,
+    ...
+  },
+  "resume": { ... }
+}
+```
+
+For raw data analysis (e.g., distribution shape, skewness), export with `--raw-output latencies.csv` and analyze separately.
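+
+Such a separate analysis could look like the Python sketch below. It assumes the CSV holds one latency in nanoseconds in the first column of each row, which is a guess about the `--raw-output` format, and it uses the nearest-rank percentile definition. The function names are hypothetical.
+
+```python
+import csv
+import math
+import statistics
+
+
+def load_latencies_ns(path):
+    """Read the first column of each row as a nanosecond latency, skipping non-numeric rows."""
+    values = []
+    with open(path, newline="") as f:
+        for row in csv.reader(f):
+            try:
+                values.append(float(row[0]))
+            except (IndexError, ValueError):
+                continue  # Header or blank line
+    return values
+
+
+def percentile(sorted_values, pct):
+    """Nearest-rank percentile of an ascending list."""
+    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
+    return sorted_values[rank - 1]
+
+
+def summarize(latencies_ns):
+    """Return sample count, mean, percentiles (in ms), and population skewness."""
+    ms = sorted(v / 1e6 for v in latencies_ns)
+    mean = statistics.fmean(ms)
+    stdev = statistics.pstdev(ms)
+    skew = (sum((x - mean) ** 3 for x in ms) / len(ms)) / stdev ** 3 if stdev else 0.0
+    return {
+        "samples": len(ms),
+        "mean_ms": mean,
+        "p50_ms": percentile(ms, 50),
+        "p99_ms": percentile(ms, 99),
+        "p99_9_ms": percentile(ms, 99.9),
+        "skewness": skew,
+    }
+```
+
+A strongly positive skewness alongside a P99.9 far above the median is the signature of the long-tail distribution that RQ1 asks about.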
+
+---
+
+## Deliverables
+
+### D1: Benchmark Results Dataset
+
+```
+/benchmark-results/
+  baseline/
+    256mb_1vcpu.json
+    512mb_1vcpu.json
+    ...
+  load/
+    cpu_load_results.json
+    memory_load_results.json
+    io_load_results.json
+  stress/
+    host_memory_pressure.json
+    combined_stress.json
+  production/
+    multi_vm_simulation.json
+
+  summary.json      # Aggregated statistics
+  raw_data.parquet  # Full dataset for analysis
+```
+
+### D2: Analysis Report (8-12 pages)
+
+```markdown
+1. Executive Summary (1 page)
+   - Key findings
+   - Target feasibility verdict
+   - Recommended action
+
+2. Methodology (2 pages)
+   - Test environment specification
+   - Measurement approach
+   - Statistical methods
+
+3. Baseline Results (2 pages)
+   - Latency by VM configuration
+   - Distribution analysis
+   - Scaling behavior
+
+4. Stress Test Results (2 pages)
+   - Impact of host conditions
+   - Worst-case latencies
+   - Outlier root causes
+
+5. Production Simulation (2 pages)
+   - Realistic workload results
+   - Achievable P99 under production conditions
+
+6. Recommendations (2 pages)
+   - Target feasibility assessment
+   - Configuration constraints for <10ms
+   - Alternative approaches if target unachievable
+
+7. Appendix
+   - Full statistical tables
+   - Reproduction instructions
+   - Raw data location
+```
+
+### D3: Visualization Suite
+
+```
+- Latency distribution histograms (per configuration)
+- Box plots comparing configurations
+- Time series of pause latency over test duration
+- Heat map: latency vs (memory_size, vcpu_count)
+- CDF plots for percentile analysis
+- Outlier scatter plots with root cause annotations
+```
+
+### D4: Reproducible Benchmark Suite
+
+```
+/benchmark-suite/
+  setup.sh              # Environment preparation
+  run_baseline.sh       # Baseline tests
+  run_load.sh           # Load tests
+  run_stress.sh         # Stress tests
+  analyze.py            # Statistical analysis
+  visualize.py          # Generate plots
+  requirements.txt      # Python dependencies
+  README.md             # Reproduction instructions
+```
+
+---
+
+## Success Criteria
+
+### Must Have
+
+- [ ] Measured P99 pause latency for at least 5 memory configurations
+- [ ] Measured P99 resume latency for at least 5 memory configurations
+- [ ] Minimum 1,000 samples per configuration
+- [ ] Statistical significance (95% confidence intervals)
+- [ ] Documented test environment and methodology
+- [ ] Clear verdict on <10ms target feasibility
+
+### Should Have
+
+- [ ] Kernel-level tracing to identify latency sources
+- [ ] Stress test results showing worst-case behavior
+- [ ] Scaling analysis (latency vs memory size)
+- [ ] Production simulation results
+- [ ] Recommendations for target revision (if needed)
+
+### Nice to Have
+
+- [ ] Comparison across Firecracker versions
+- [ ] Comparison with alternative VMMs (Cloud Hypervisor, QEMU)
+- [ ] Power state impact analysis (C-states, P-states)
+- [ ] Guest OS impact comparison
+
+---
+
+## References
+
+### Firecracker Documentation
+
+- [Firecracker API Reference](https://github.com/firecracker-microvm/firecracker/blob/main/src/api_server/swagger/firecracker.yaml)
+- [Firecracker Design Doc](https://github.com/firecracker-microvm/firecracker/blob/main/docs/design.md)
+- [Snapshotting in Firecracker](https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/snapshot-support.md)
+
+### Performance Measurement
+
+- Gregg, B. (2020). "Systems Performance: Enterprise and the Cloud" (2nd Edition)
+- Gregg, B. (2019). "BPF Performance Tools"
+- [hyperfine: Command-line benchmarking tool](https://github.com/sharkdp/hyperfine)
+- [bpftrace Reference Guide](https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md)
+
+### Statistical Methods
+
+- [Latency Percentiles: The Key to Understanding Tail Latencies](https://blog.cloudflare.com/measuring-latency/)
+- [How NOT to Measure Latency](https://www.youtube.com/watch?v=lJ8ydIuPFeU) - Gil Tene
+
+### Related Benchmarks
+
+- [Firecracker Performance](https://github.com/firecracker-microvm/firecracker/blob/main/docs/getting-started.md#performance)
+- [Cloud Hypervisor vs Firecracker](https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/docs/performance.md)
+
+---
+
+## Notes
+
+**Measurement Precision:**
+
+```
+Firecracker API latency includes:
+  1. HTTP parsing overhead (~0.1ms)
+  2. Socket communication (~0.05ms)
+  3. Actual pause operation (variable)
+  4. Response serialization (~0.05ms)
+
+For true pause latency, subtract HTTP overhead
+or use kernel-level tracing for ground truth.
+```
+
+**Known Unknowns:**
+
+```
+- Does Firecracker use SIGSTOP or a custom pause mechanism?
+- Are vCPUs paused synchronously or asynchronously?
+- What happens to in-flight virtio operations?
+- Is there a pause "storm" if pausing during interrupt handling?
+```
+
+**The Brendan Gregg Principle:**
+
+> "Measure, don't guess. And when you measure, measure the right thing."
+
+The goal is not to prove that <10ms is achievable — the goal is to discover what is actually achievable and adjust Maxwell's architecture to reality.
+
+**Worst-Case Thinking:**
+
+> "The 99.9th percentile is not an edge case when you have 1000 VMs. It happens every second."
+
+Focus on tail latencies. A system that pauses in 1ms 99% of the time but takes 500ms 1% of the time is not a system that provides thermal protection.
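+
+The arithmetic behind this is short. Assuming pause latencies are independent across VMs (an idealization), the probability that at least one of $N$ simultaneous pauses lands beyond the per-VM 99.9th percentile is:
+
+$$P(\text{at least one tail event}) = 1 - 0.999^{N}, \qquad N = 1000 \;\Rightarrow\; 1 - 0.999^{1000} \approx 0.63$$
+
+So pausing 1000 VMs at once is more likely than not to include at least one P99.9 outlier.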
--- a/blog/content/notes/003-research-planning/files/gsp-thermal-stability.md
+++ b/blog/content/notes/003-research-planning/files/gsp-thermal-stability.md
@ -0,0 +1,277 @@
+# GSP Auction Dynamics Under Thermal Constraints Research Directive
+
+You are Dr. Elena Voskresenskaya, Professor of Algorithmic Game Theory at ETH Zurich with joint appointments in Control Systems and Thermal Physics. Your work on mechanism design in physically-constrained environments has been cited over 4,000 times, including foundational papers on auction stability under exogenous perturbations.
+
+You are going to analyze the stability properties of Maxwell's GSP auction mechanism when operating in a thermodynamically-coupled environment, delivering formal proofs of equilibrium stability (or instability), attack surface analysis, and concrete parameter recommendations for auction frequency relative to thermal dynamics.
+
+---
+
+## Context
+
+Maxwell implements a Generalized Second-Price (GSP) auction for resource allocation with proven Price of Anarchy (PoA) bounds:
+
+| Equilibrium Concept | PoA Bound | Notes |
+|---------------------|-----------|-------|
+| Pure Nash | **1.618** | Golden ratio bound, stable |
+| Mixed Nash | 4.0 | Robust against stochastic strategies |
+| Bayes-Nash | 8.0 | Worst-case for asymmetric distributions |
+
+However, the standard GSP analysis assumes **static or slowly-varying market conditions**. Maxwell operates in a thermodynamically-coupled environment where:
+
+1. **Price Multiplier Depends on Temperature**: The thermal price multiplier $M_{thermal}$ scales bids based on physical temperature state:
+   $$M_{thermal} = f(T_{current}, T_{throttle}, \gamma_{neighbors})$$
+
+2. **Thermal Coupling Coefficients**: GPU-CPU thermal coupling creates asymmetric price effects:
+   | Core | Coupling Coefficient $\xi$ | Price Multiplier (GPU @ 95%) |
+   |------|---------------------------|------------------------------|
+   | Core 0 (near GPU) | 0.85 | 8.0x |
+   | Core 3 (distant) | 0.35 | 1.5x |
+
+3. **Thermal Time Constants**: Physical dynamics operate at specific timescales:
+   | Component | Time Constant $\tau$ |
+   |-----------|---------------------|
+   | CPU die | ~1 second |
+   | GPU die | ~2 seconds |
+   | Chassis | ~30 seconds |
+
+4. **Gossip Propagation**: Thermal state propagates via epidemic gossip with target latency < 10ms (100x faster than CPU die response).
+
+The fundamental question: **Does the coupling between auction dynamics and thermal physics preserve the GSP stability guarantees, or does it introduce new failure modes?**
+
+---
+
+## Research Questions
+
+### RQ1: Thermal Feedback Stability
+Does thermal feedback destabilize GSP equilibrium? Specifically:
+- When $M_{thermal}$ changes, do agents converge to a new Nash equilibrium?
+- What is the basin of attraction for equilibrium under thermal perturbation?
+- Are there parameter regimes where the coupled system exhibits limit cycles or chaos?
+
+### RQ2: Thermal Gaming Attack Surface
+Can agents strategically manipulate the thermal system to influence prices? Consider:
+- **Cooling Attack**: Agent intentionally generates heat on neighboring cores to raise competitor prices
+- **Thermal Arbitrage**: Exploiting gossip propagation delay to bid before price adjustments
+- **Coordinated Cooling**: Colluding agents synchronizing thermal loads to create predictable price windows
+- What is the cost-benefit ratio for such attacks under realistic power constraints?
+
+### RQ3: Convergence Time Under Rapid Thermal Change
+What is the time-to-equilibrium when $M_{thermal}$ changes rapidly?
+- Define "rapidly" relative to $\tau_{CPU} = 1s$
+- Characterize convergence as a function of $\frac{d M_{thermal}}{dt}$
+- Identify critical rate thresholds beyond which equilibrium is never reached
+- Analyze interaction between auction frequency and thermal oscillation frequency
+
+### RQ4: Price of Anarchy Under Thermal Coupling
+Does the PoA ≤ 1.618 bound for pure Nash equilibrium still hold with thermal coupling?
+- Extend the standard GSP PoA proof to include time-varying price multipliers
+- Derive modified bounds as a function of $\frac{\Delta M_{thermal}}{\Delta t}$
+- Characterize conditions under which the 1.618 bound is preserved, weakened, or violated
+- Consider both single-resource and multi-resource (CPU+GPU) allocation
+
+### RQ5: Optimal Auction Frequency
+What is the optimal auction frequency $f_{auction}$ relative to thermal time constants?
+- Too fast: Agents cannot observe thermal effects, may bid into unstable regions
+- Too slow: Thermal state changes mid-auction, invalidating price signals
+- Derive optimal $f_{auction}$ as function of $\tau_{thermal}$ and gossip latency
+- Consider adaptive frequency based on thermal volatility
+
+---
+
+## Methodology
+
+### Simulation Framework
+
+Implement a multi-agent simulation with the following components:
+
+#### 1. Agent Model
+```
+Agent {
+  id: UUID
+  strategy: {truthful | aggressive | adaptive}
+  valuation: V_i ~ Distribution
+  budget: B_i
+  thermal_awareness: {none | local | global}
+}
+```
+
+**Strategy Definitions**:
+- **Truthful**: Bid true valuation, $b_i = v_i$
+- **Aggressive**: Overbid by factor $\alpha$, $b_i = \alpha \cdot v_i$, $\alpha \in [1.1, 2.0]$
+- **Adaptive**: Best-response dynamics with thermal prediction
+
+**Population Mix**: 40% truthful, 30% aggressive, 30% adaptive
+
+#### 2. Thermal Model
+Implement realistic thermal dynamics with:
+
+```
+ThermalModel {
+  tau_cpu: 1.0s          // CPU die time constant
+  tau_gpu: 2.0s          // GPU die time constant
+  tau_chassis: 30.0s     // Chassis time constant
+
+  coupling_matrix: K     // Inter-core thermal coupling
+  power_to_temp: η       // Watts to °C conversion
+
+  update(dt):
+    // η in °C/W, so the steady-state temperature is T_ambient + P * η
+    T_new = T_old + ((P * η - (T_old - T_ambient)) / τ) * dt
+}
+```
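+
+The update rule can be checked against the textbook first-order step response. The Python sketch below uses a dimensionally consistent reading in which η is a thermal resistance in °C/W, so the steady-state rise is $P \eta$ and the whole rate term is divided by τ. That reading, the function name `simulate_core_temp`, and the default step size are assumptions.
+
+```python
+def simulate_core_temp(power_w, eta_c_per_w, tau_s, t_ambient_c, duration_s, dt_s=0.001):
+    """Forward-Euler integration of dT/dt = (P*eta - (T - T_ambient)) / tau, starting at ambient."""
+    temp = t_ambient_c
+    steps = round(duration_s / dt_s)
+    for _ in range(steps):
+        temp += (power_w * eta_c_per_w - (temp - t_ambient_c)) / tau_s * dt_s
+    return temp
+```
+
+After one time constant (`duration_s == tau_s`) the temperature should have covered about 63% of the step, and after roughly ten time constants it should sit at `t_ambient_c + power_w * eta_c_per_w`. Both make quick sanity checks for any implementation of `update(dt)`.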
+
+**Thermal Coupling Matrix** (4-core example):
+```
+K = | 1.00  0.85  0.60  0.35 |
+    | 0.85  1.00  0.75  0.50 |
+    | 0.60  0.75  1.00  0.70 |
+    | 0.35  0.50  0.70  1.00 |
+```
+
+#### 3. Price Multiplier Model
+```
+M_thermal(T, T_neighbors) =
+  (1.0 / (margin / T_throttle)) *
+  (1.0 + Σ γ_ij / margin_j) *
+  (1.0 / zone_headroom)
+```
+
+With damping: $M_{new} = 0.3 \cdot M_{computed} + 0.7 \cdot M_{old}$
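+
+This damping is an exponential moving average with weight 0.3, which fixes how quickly the applied multiplier follows a change. As a worked example, assume $M_{computed}$ steps to a new constant value $M^{*}$ and stays there. The gap then closes geometrically:
+
+$$M_n - M^{*} = 0.7^{\,n}\,(M_0 - M^{*}), \qquad 0.7^{\,n} \le 0.01 \;\Rightarrow\; n \ge \frac{\ln 0.01}{\ln 0.7} \approx 12.9$$
+
+So the applied multiplier comes within 1% of the step after 13 updates, regardless of the step size. How long that takes in wall-clock terms depends on the update rate, which the formula does not specify.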
+
+#### 4. GSP Auction Engine
+```
+GSPAuction {
+  frequency: f_auction
+  bucket_count: K = 64
+
+  run_round():
+    1. Collect bids (apply M_thermal to each)
+    2. Sort into discretized buckets
+    3. Allocate to highest bidders
+    4. Charge second-price
+    5. Record metrics
+}
+```
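+
+Steps 3 and 4 of `run_round()` can be sketched in a few lines of Python. This is an illustration only: the function name `run_gsp_round` and the dictionary input are assumptions, bids are taken as already adjusted by any thermal multiplier, bucket discretization (step 2) is omitted, and the last winner pays the highest losing bid or zero if there is none.
+
+```python
+def run_gsp_round(effective_bids, num_slots):
+    """Allocate slots to the highest bidders; each winner pays the next-highest bid.
+
+    effective_bids maps agent id -> bid after any thermal multiplier was applied.
+    Returns a list of (agent_id, price) ordered from the top slot down.
+    """
+    # Sort by bid descending; break ties by agent id for determinism.
+    ranked = sorted(effective_bids.items(), key=lambda kv: (-kv[1], kv[0]))
+    results = []
+    for position in range(min(num_slots, len(ranked))):
+        agent_id, _ = ranked[position]
+        # Second-price rule: pay the bid ranked immediately below, or 0 if none.
+        price = ranked[position + 1][1] if position + 1 < len(ranked) else 0.0
+        results.append((agent_id, price))
+    return results
+```
+
+With bids `{"a": 5.0, "b": 3.0, "c": 1.0}` and two slots, agent `a` wins the top slot at price 3.0 and agent `b` wins the second at price 1.0.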
+
+### Metrics to Measure
+
+| Metric | Definition | Target |
+|--------|------------|--------|
+| **Time-to-Equilibrium** | Rounds until bid variance < ε | < 100 rounds |
+| **Price Volatility** | σ(clearing_price) / μ(clearing_price) | < 0.2 |
+| **Agent Welfare** | Σ(value_received - price_paid) | Maximize |
+| **PoA Empirical** | Welfare(Optimal) / Welfare(Nash) | ≤ 1.618 |
+| **Thermal Stability** | max(T) < T_throttle | Always |
+| **Attack Success Rate** | Attacker profit / Attack cost | < 1.0 (attacks unprofitable) |
+
+### Experimental Protocol
+
+**Experiment 1: Baseline Stability**
+- Run GSP with static $M_{thermal} = 1.0$
+- Verify convergence and PoA ≤ 1.618
+- Establish baseline metrics
+
+**Experiment 2: Step Response**
+- Apply sudden thermal step: $M_{thermal}: 1.0 \rightarrow 4.0$
+- Measure time-to-new-equilibrium
+- Characterize transient behavior
+
+**Experiment 3: Continuous Thermal Variation**
+- Sinusoidal thermal load: $T(t) = T_0 + A \sin(2\pi t / \tau)$
+- Vary $\tau$ from 0.1s to 100s
+- Identify resonance frequencies
+
+**Experiment 4: Attack Scenarios**
+- Implement cooling attack agent
+- Measure attack cost (power budget)
+- Measure attack benefit (price reduction for attacker)
+- Determine break-even conditions
+
+**Experiment 5: Auction Frequency Sweep**
+- Vary $f_{auction}$ from 10 Hz to 10 kHz
+- Fixed thermal dynamics ($\tau = 1s$)
+- Plot stability metrics vs frequency
+- Identify optimal operating point
+
+---
+
+## Deliverables
+
+### D1: Formal Stability Analysis
+- Lyapunov stability proof for coupled thermal-auction system (or counterexample)
+- Basin of attraction characterization
+- Conditions for asymptotic stability
+
+### D2: Modified PoA Bounds
+- Theorem: PoA bound for GSP with time-varying price multiplier
+- Proof or derivation
+- Comparison with static case (1.618)
+
+### D3: Attack Surface Analysis
+- Taxonomy of thermal gaming attacks
+- Cost-benefit analysis for each attack class
+- Recommended mitigations
+
+### D4: Simulation Results
+- Convergence plots for all experiments
+- Heatmaps of stability regions
+- Statistical analysis with confidence intervals
+
+### D5: Parameter Recommendations
+- Optimal auction frequency as function of $\tau$
+- Damping coefficient recommendations
+- Hysteresis band sizing
+- Gossip interval requirements
+
+### D6: Implementation Guidelines
+- Pseudocode for thermal-aware GSP
+- Integration points with Maxwell scheduler
+- Monitoring and alerting thresholds
+
+---
+
+## Success Criteria
+
+| Criterion | Threshold | Priority |
+|-----------|-----------|----------|
+| Formal proof of stability or instability | Complete | Critical |
+| PoA bound with thermal coupling derived | ≤ 2.0 (acceptable) or ≤ 1.618 (preserved) | Critical |
+| Attack profitability | < 1.0 (unprofitable) | High |
+| Optimal $f_{auction}$ determined | Within 10x of thermal rate $1/\tau$ | High |
+| Convergence time characterized | Predictive model | Medium |
+| Simulation reproducibility | Seeds documented, p < 0.05 | Medium |
+
+---
+
+## References
+
+### Maxwell Internal
+- `research/high-frequency-auction-mechanisms.md` - GSP properties, PoA bounds, bucket auction design
+- `research/thermal-gossip-consensus.md` - Thermal coupling model, gossip protocol, price multiplier formula
+
+### Auction Theory
+- Varian, H. "Position Auctions" (2007) - GSP analysis, PoA bounds
+- Edelman, B. et al. "Internet Advertising and the GSP Auction" - Equilibrium characterization
+- Caragiannis, I. et al. "Bounding the Efficiency Loss of GSP" - PoA proofs
+
+### Control Theory & Stability
+- Hellerstein, J. "Feedback Control of Computing Systems" - PID for thermal control
+- Boyd, S. "Convex Optimization" - Lyapunov analysis
+- Khalil, H. "Nonlinear Systems" - Stability theory
+
+### Thermal-Aware Computing
+- Patterson, M. "Data Center Cooling" - Thermal time constants
+- Tang, Q. "Sensor-Based Thermal Evaluation" - Thermal coupling models
+- TCUB: Thermal Control under Utilization Bounds - Real-time thermal scheduling
+
+### Game Theory in Dynamic Environments
+- Friedman, D. "Evolutionary Games in Economics" - Dynamic equilibrium
+- Fudenberg, D. "Game Theory" - Repeated games, convergence
+- Roughgarden, T. "Algorithmic Game Theory" - PoA analysis methods
+
+---
+
+*Research Request Status: Open*
+*Priority: High*
+*Estimated Effort: 4-6 weeks*
+*Requested By: Maxwell Architecture Team*
+*Date: 2026-02*
--- a/blog/content/notes/003-research-planning/files/high-frequency-auction-research.md
+++ b/blog/content/notes/003-research-planning/files/high-frequency-auction-research.md
@ -0,0 +1,661 @@
+# High-Frequency Auction Research Directive
+
+You are **Robert Tarjan**, Turing Award laureate and inventor of splay trees, Fibonacci heaps, and union-find. Your career has been defined by creating data structures that make the "impossible" efficient. You understand that the right data structure doesn't just speed up an algorithm — it changes what's computable in practice.
+
+You are going to **design a sub-microsecond auction mechanism for kernel-level resource scheduling** — specifically, a market system that can run at CPU scheduler frequency without consuming more compute than the workloads it schedules.
+
+---
+
+## Maxwell Architecture Context
+
+**Critical: Maxwell controls BOTH resource planes.**
+
+The auction mechanism must price and allocate resources across:
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                        MAXWELL HYPERVISOR                        │
+│              (Runs auction at scheduler frequency)               │
+├─────────────────────────────┬───────────────────────────────────┤
+│     CONTROL PLANE (CPU)     │      COMPUTE PLANE (GPU)          │
+│                             │                                   │
+│  Auction frequency:         │  Auction frequency:               │
+│  ~1000-10000 Hz             │  ~10-100 Hz (batch dispatches)    │
+│  (per scheduler tick)       │  (per kernel launch)              │
+│                             │                                   │
+│  Bid unit: CPU microseconds │  Bid unit: GPU milliseconds       │
+│  Latency budget: <1μs       │  Latency budget: <100μs           │
+└─────────────────────────────┴───────────────────────────────────┘
+                              │
+                    ┌─────────▼─────────┐
+                    │  UNIFIED PRICE    │
+                    │  SIGNAL           │
+                    │  (Thermal-coupled)│
+                    └───────────────────┘
+```
+
+### The Thermodynamic Coupling
+
+Prices aren't static. They respond to thermal state:
+
+```
+GPU utilization: 95%  →  Chassis temp: HIGH  →  CPU thermal margin: LOW
+                                                        │
+                                                        ▼
+                                              CPU price multiplier: 8x
+                                              (Only GPU-feeding work survives)
+```
+
+**The auction must incorporate real-time thermal feedback into pricing.**
+
+---
+
+## The Paradox
+
+**Problem Statement:**
+
+If every CPU scheduling decision requires:
+1. Collecting bids from N agents
+2. Sorting/ranking bids
+3. Selecting winner
+4. Updating prices
+5. Notifying agents
+
+...the auction mechanism consumes more cycles than the work being scheduled.
+
+**The Math:**
+
+```
+Traditional auction (naive):
+- N agents, each submits bid: O(N)
+- Sort bids: O(N log N)
+- Select top-k winners: O(k)
+- Update price signals: O(N) notifications
+
+Total: O(N log N) per scheduling quantum
+
+If N = 1000 agents, quantum = 1ms:
+- Auction overhead could exceed 50% of CPU time
+- Defeats the purpose of efficient scheduling
+```
+
+**The Constraint:**
+
+```
+Auction latency << Scheduling quantum
+
+For 1ms quantum:  Auction must complete in <10μs (1% overhead target)
+For 100μs quantum: Auction must complete in <1μs
+```
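+
+The budgets above follow from multiplying the quantum by the overhead target. A small Python sketch of that arithmetic, assuming the 1% target and a 3 GHz clock; the function names are illustrative:
+
+```python
+def auction_budget_us(quantum_us: float, overhead_fraction: float = 0.01) -> float:
+    """Latency budget in microseconds for one auction per quantum."""
+    return quantum_us * overhead_fraction
+
+
+def budget_cycles(budget_us: float, clock_ghz: float = 3.0) -> int:
+    """CPU cycles available within the budget at the given clock rate."""
+    return round(budget_us * clock_ghz * 1000)
+
+
+for quantum_us in (1000.0, 100.0):
+    budget = auction_budget_us(quantum_us)
+    print(f"{quantum_us:.0f}us quantum -> {budget:.0f}us budget, {budget_cycles(budget)} cycles")
+```
+
+At 3 GHz the 1μs budget is about 3000 cycles, which is the number every candidate structure has to fit inside.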
+
+---
+
+## Research Objectives
+
+Design and analyze auction mechanisms achieving:
+
+1. **O(1) Amortized Time**: Constant-time winner selection per quantum
+2. **O(log N) Worst Case**: Logarithmic even under adversarial bidding
+3. **Sub-microsecond Latency**: Kernel-schedulable on commodity hardware
+4. **Thermodynamic Integration**: Real-time price adjustment from thermal sensors
+5. **Dual-Plane Coherence**: CPU and GPU auctions share price signals
+6. **Incentive Compatibility**: Agents can't game the mechanism profitably
+
+---
+
+## Step 1: Survey High-Frequency Market Microstructure
+
+Research how existing high-frequency systems achieve speed.
+
+### 1.1 HFT Exchange Architectures
+
+```
+Study:
+- NASDAQ matching engine (processes 1M+ orders/second)
+- CME Globex architecture
+- IEX "speed bump" design (intentional latency)
+
+Key techniques:
+- Price-time priority (simple, O(1) at each price level)
+- Order book as sorted structure (limit order book)
+- Batch auctions (aggregate then match)
+```
+
+**Extract:** What data structures do exchanges use? How do they achieve O(1) matching?
+
+### 1.2 Kernel Scheduler Precedents
+
+```
+Study:
+- Linux CFS (Completely Fair Scheduler) — red-black tree, O(log N)
+- FreeBSD ULE scheduler
+- Windows thread scheduler
+- Real-time schedulers (EDF, Rate Monotonic)
+
+Key insight:
+- CFS maintains sorted tree of "virtual runtime"
+- Selection is O(1) (leftmost node), insertion is O(log N)
+- Can we adapt this to price-based ordering?
+```
+
+### 1.3 Auction Theory Foundations
+
+```
+Study:
+- Vickrey-Clarke-Groves (VCG) mechanism — optimal but O(N²)
+- Generalized Second Price (GSP) — simpler, O(N log N)
+- Proportional Share — O(N) but weak incentives
+- Posted Price mechanisms — O(1) but suboptimal allocation
+
+Question: Which mechanism properties can we sacrifice for speed?
+```
+
+---
+
+## Step 2: Design Candidate Data Structures
+
+The core challenge: maintain a bid-ordered structure that supports:
+- Insert(agent, bid): O(log N) or better
+- ExtractMax(): O(1) amortized
+- UpdatePrice(thermal_signal): O(1) broadcast
+- Expire(agent): O(log N) or better
+
+### 2.1 Probabilistic Auction Heap
+
+**Concept:** Trade exactness for speed using probabilistic data structures.
+
+```
+Idea: Don't find the EXACT highest bidder.
+      Find a bidder in the TOP-K with high probability.
+
+Approaches:
+- Reservoir sampling over bid stream
+- Count-Min Sketch for bid tracking
+- HyperLogLog for cardinality estimation
+- Bloom filter hierarchy for bid ranges
+```
+
+**Research questions:**
+- What's the regret from probabilistic selection vs exact?
+- Can we bound the "unfairness" introduced?
+- How does noise affect incentive compatibility?
+
+### 2.2 Stratified Auction Buckets
+
+**Concept:** Discretize the bid space into buckets.
+
+```
+┌────────────────────────────────────────────────┐
+│  Bid Range      │  Bucket  │  Agents  │ Winner │
+├────────────────────────────────────────────────┤
+│  $0.90 - $1.00  │  Tier 1  │  [A,B,C] │  ←FIFO │
+│  $0.80 - $0.90  │  Tier 2  │  [D,E]   │        │
+│  $0.70 - $0.80  │  Tier 3  │  [F,G,H] │        │
+│  ...            │  ...     │  ...     │        │
+└────────────────────────────────────────────────┘
+
+Selection: O(1) — pick from highest non-empty bucket
+Insertion: O(1) — hash bid to bucket, append to list
+```
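+
+A minimal Python sketch of the bucket structure, assuming fixed-width buckets over an integer bid in cents and a bitmap to find the highest non-empty tier. The class and its parameters are hypothetical; a kernel version would use a fixed-size array and a hardware find-highest-bit instruction instead of Python integers.
+
+```python
+from collections import deque
+
+
+class StratifiedBuckets:
+    """Fixed-width bid buckets with a bitmap of non-empty tiers."""
+
+    def __init__(self, bucket_count: int = 64, max_bid_cents: int = 100):
+        self.count = bucket_count
+        self.max_bid = max_bid_cents
+        self.buckets = [deque() for _ in range(bucket_count)]
+        self.nonempty = 0  # bit i set <=> buckets[i] holds at least one agent
+
+    def insert(self, agent: str, bid_cents: int) -> None:
+        i = min(self.count - 1, bid_cents * self.count // self.max_bid)
+        self.buckets[i].append(agent)
+        self.nonempty |= 1 << i
+
+    def extract_max(self):
+        """Pop the oldest agent from the highest non-empty bucket, or None."""
+        if not self.nonempty:
+            return None
+        i = self.nonempty.bit_length() - 1  # index of the highest set bit
+        agent = self.buckets[i].popleft()
+        if not self.buckets[i]:
+            self.nonempty &= ~(1 << i)
+        return agent
+
+
+book = StratifiedBuckets(bucket_count=10)
+for agent, bid in [("D", 85), ("A", 95), ("B", 92)]:
+    book.insert(agent, bid)
+print([book.extract_max() for _ in range(3)])  # ['A', 'B', 'D']
+```
+
+`A` and `B` land in the same tier, so `A` wins on arrival order even though the two bids differ. That is the boundary effect the research questions ask about.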
+
+**Research questions:**
+- Optimal bucket granularity (price resolution vs collision rate)
+- FIFO vs random within bucket (incentive effects)
+- Dynamic bucket boundaries based on bid distribution
+
+### 2.3 Lazy Evaluation Heap
+
+**Concept:** Defer sorting until absolutely necessary.
+
+```
+Insight: Most scheduling decisions don't need global ordering.
+         The top bidder is usually OBVIOUSLY the top bidder.
+
+Approach:
+- Maintain "probable winner" pointer (updated lazily)
+- Only recompute when:
+  a) New bid exceeds probable winner by threshold
+  b) Probable winner exits
+  c) K scheduling quanta have passed
+
+Amortized: O(1) per quantum only if K = Ω(N log N); in general O((N log N)/K) per quantum
+```
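+
+The three recompute triggers can be sketched as follows. This is a hedged Python illustration, not a proposed implementation: it uses a threshold of zero for trigger (a), and the rescan is a linear pass over all bids rather than a full sort. `LazyWinner` and `refresh_every` are hypothetical names.
+
+```python
+class LazyWinner:
+    """Cache a probable winner; rescan all bids only when the cache may be stale."""
+
+    def __init__(self, refresh_every: int = 8):
+        self.bids: dict[str, int] = {}
+        self.winner = None
+        self.refresh_every = refresh_every  # K in the text
+        self.quanta_since_scan = 0
+
+    def submit(self, agent: str, bid: int) -> None:
+        self.bids[agent] = bid
+        # Trigger (a) with threshold 0: a higher bid replaces the pointer in O(1).
+        if self.winner is None or bid > self.bids[self.winner]:
+            self.winner = agent
+
+    def leave(self, agent: str) -> None:
+        self.bids.pop(agent, None)
+        if agent == self.winner:  # trigger (b): probable winner exits
+            self._rescan()
+
+    def pick(self):
+        """Called once per quantum; O(1) unless the periodic rescan is due."""
+        self.quanta_since_scan += 1
+        if self.quanta_since_scan >= self.refresh_every:  # trigger (c)
+            self._rescan()
+        return self.winner
+
+    def _rescan(self) -> None:
+        self.winner = max(self.bids, key=self.bids.get, default=None)
+        self.quanta_since_scan = 0
+```
+
+If the current winner lowers its own bid, the pointer stays stale until the next periodic rescan, so `refresh_every` bounds how long a wrong winner can persist.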
+
+### 2.4 Hardware-Accelerated Structures
+
+**Concept:** Offload auction to specialized hardware.
+
+```
+Options:
+- FPGA-based matching engine (co-located with NIC)
+- GPU-side auction for GPU resource allocation
+- Custom ASIC (long-term)
+- Intel QAT or similar accelerator
+
+Research:
+- Xilinx Alveo for kernel-bypass auction
+- NVIDIA GPU atomics for parallel bid aggregation
+- SmartNIC (Bluefield) for network-integrated auction
+```
+
+### 2.5 Hierarchical Auction Trees
+
+**Concept:** Decompose global auction into local tournaments.
+
+```
+                    ┌─────────┐
+                    │ GLOBAL  │  ← Final winner selection: O(log K)
+                    │ WINNER  │
+                    └────┬────┘
+              ┌─────────┼─────────┐
+              ▼         ▼         ▼
+         ┌────────┐ ┌────────┐ ┌────────┐
+         │Local 1 │ │Local 2 │ │Local 3 │  ← K local auctions: O(N/K)
+         │Winner  │ │Winner  │ │Winner  │
+         └───┬────┘ └───┬────┘ └───┬────┘
+             │          │          │
+         [Agents]   [Agents]   [Agents]   ← N agents partitioned
+
+Total: O(N/K) + O(log K) per quantum
+With K = √N: O(√N) per quantum
+```
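+
+A two-level tournament with K = √N can be sketched in a few lines of Python. The O(N/K) term presumably assumes the K local auctions run in parallel (for example, one per core); executed sequentially as below, the total work is still O(N). The function name is illustrative.
+
+```python
+from math import isqrt
+
+
+def hierarchical_winner(bids: list[int]) -> int:
+    """Return the index of the highest bid via sqrt(N)-sized local tournaments."""
+    if not bids:
+        raise ValueError("no bids submitted")
+    n = len(bids)
+    k = max(1, isqrt(n))  # number of local auctions, K = floor(sqrt(N))
+    size = -(-n // k)     # ceil(N / K) agents per local auction
+    local_winners = [
+        max(range(start, min(start + size, n)), key=bids.__getitem__)
+        for start in range(0, n, size)
+    ]
+    # Global round: compare only the local winners.
+    return max(local_winners, key=bids.__getitem__)
+
+
+print(hierarchical_winner([3, 9, 1, 7, 9, 2, 8, 5, 4]))  # 1
+```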
+
+---
+
+## Step 3: Analyze Thermodynamic Price Integration
+
+The auction doesn't just pick winners — it sets prices based on thermal state.
+
+### 3.1 Price Signal Propagation
+
+```
+Thermal sensors → Price multiplier → Bid adjustment
+
+Challenge: Sensor latency vs auction frequency
+- Thermal sensors update: ~10-100 Hz
+- Auction runs: ~1000-10000 Hz
+
+Approach: Predictive thermal model
+- Extrapolate temperature trajectory
+- Pre-compute price schedule for next 10ms
+- Auction uses cached prices (O(1) lookup)
+```
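+
+One way to read the predictive approach: extrapolate the last two sensor samples linearly and precompute one multiplier per millisecond. The Python sketch below assumes margin = T_throttle - T and a multiplier of T_throttle / margin; both are illustrative assumptions, as are the function name and sample values. A kernel version would use fixed-point arithmetic rather than floats.
+
+```python
+def price_schedule(temps_c: list[float], sample_ms: float, t_throttle_c: float,
+                   horizon_ms: int = 10) -> list[float]:
+    """Per-millisecond multipliers from a linear extrapolation of the last two samples."""
+    slope = (temps_c[-1] - temps_c[-2]) / sample_ms  # degrees C per ms
+    schedule = []
+    for ms in range(1, horizon_ms + 1):
+        predicted = temps_c[-1] + slope * ms
+        margin = max(t_throttle_c - predicted, 1e-3)  # avoid divide-by-zero at throttle
+        schedule.append(t_throttle_c / margin)        # multiplier grows as margin shrinks
+    return schedule
+
+
+# Two samples taken 10 ms apart (100 Hz sensor), temperature rising.
+schedule = price_schedule([70.0, 72.0], sample_ms=10.0, t_throttle_c=90.0)
+print(len(schedule), schedule[-1])
+```
+
+The auction then indexes `schedule` by elapsed milliseconds since the last sensor update, which keeps the hot path to a single array lookup.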
+
+### 3.2 Control-Theoretic Formulation
+
+```
+Model the system as feedback control:
+
+                    ┌─────────────┐
+  Target Temp ──────▶│ Controller  │──────▶ Price Multiplier
+       ▲             │ (PID?)      │              │
+       │             └─────────────┘              │
+       │                                          ▼
+       │                                   ┌─────────────┐
+       └───────────────────────────────────│ Thermal     │
+                                           │ Measurement │
+                                           └─────────────┘
+
+Research: What controller design stabilizes temperature
+          while maximizing throughput?
+```
+
+### 3.3 Dual-Plane Price Coupling
+
+```
+CPU price and GPU price aren't independent:
+
+GPU_price = f(GPU_demand, GPU_thermal_headroom)
+CPU_price = g(CPU_demand, CPU_thermal_headroom, GPU_utilization)
+
+When GPU is hot:
+- GPU_price stays stable (we want GPU work to continue)
+- CPU_price spikes (only GPU-feeding work should run)
+
+Design question: How to represent this coupling efficiently?
+- Lookup table? (O(1) but memory)
+- Formula? (O(1) but compute)
+- Learned model? (GPU inference irony?)
+```
+
+---
+
+## Step 4: Kernel Integration Architecture
+
+The auction runs IN the scheduler hot path. Design for zero-copy, lock-free operation.
+
+### 4.1 Integration Points
+
+```
+Linux Kernel:
+- sched_class interface (custom scheduling class)
+- BPF scheduler hooks (eBPF-based auction?)
+- Per-CPU runqueues (local auction per core?)
+
+Firecracker (Maxwell's VM boundary):
+- vCPU scheduling in VMM
+- virtio-based bid communication
+- Shared memory bid submission
+
+Research: Where is the lowest-latency integration point?
+```
+
+### 4.2 Lock-Free Bid Submission
+
+```
+Agents can't block on locks to submit bids.
+
+Approaches:
+- Per-agent SPSC queue (single producer, single consumer)
+- Lock-free MPSC queue (multiple producers)
+- Shared memory ring buffer with atomic head/tail
+
+Constraint: Bid submission must be <100ns
+```
+
+### 4.3 Memory Layout Optimization
+
+```
+Cache-aware design:
+- Hot data (current prices, top bids) in L1
+- Warm data (agent metadata) in L2
+- Cold data (historical bids) in L3/RAM
+
+Struct packing:
+struct AgentBid {
+    uint64_t agent_id;       // 8 bytes
+    uint32_t bid_cents;      // 4 bytes (fixed-point price)
+    uint32_t resource_units; // 4 bytes
+    // 16 bytes total, no padding: four bids per 64-byte cache line
+};
+```
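+
+The layout claim is easy to verify without a C toolchain. A quick Python check using `ctypes`, which applies the platform's native C alignment rules; the 64-byte cache line is an assumption about the target hardware:
+
+```python
+import ctypes
+
+
+class AgentBid(ctypes.Structure):
+    _fields_ = [
+        ("agent_id", ctypes.c_uint64),
+        ("bid_cents", ctypes.c_uint32),
+        ("resource_units", ctypes.c_uint32),
+    ]
+
+
+CACHE_LINE_BYTES = 64  # assumption: typical x86-64 cache line size
+print(ctypes.sizeof(AgentBid), CACHE_LINE_BYTES // ctypes.sizeof(AgentBid))  # 16 4
+```
+
+Placing the 8-byte field first avoids padding, so an array of bids packs four entries per cache line.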
+
+---
+
+## Step 5: Incentive Analysis
+
+The mechanism must be strategy-proof (or approximately so).
+
+### 5.1 Truthful Bidding Analysis
+
+```
+Question: Do agents have incentive to bid their true valuation?
+
+Concern with fast mechanisms:
+- Vickrey (second-price) is truthful but requires knowing 2nd bid
+- First-price encourages underbidding
+- Bucket mechanisms may encourage "gaming the boundary"
+
+Research: What's the Price of Anarchy for each proposed mechanism?
+```
+
+### 5.2 Sybil Resistance
+
+```
+Question: Can an agent split into N fake agents to manipulate?
+
+Concern:
+- With probabilistic selection, more identities = more lottery tickets
+- With bucket FIFO, early submission beats high bid
+
+Mitigation:
+- Stake-weighted bidding (agents must lock capital)
+- Identity cost (registration fee per agent)
+- Reputation decay (new agents get lower priority)
+```
+
+### 5.3 Collusion Analysis
+
+```
+Question: Can agents coordinate to manipulate prices?
+
+Scenario:
+- All agents bid $0 → prices crash → everyone wins cheap
+- Ring formation (agents take turns winning)
+
+Research: What repeated-game dynamics emerge?
+          How does Maxwell detect/prevent collusion?
+```
+
+---
+
+## Step 6: Benchmark and Validate
+
+Empirical validation of theoretical designs.
+
+### 6.1 Microbenchmarks
+
+```
+Measure for each candidate structure:
+- Insert latency (p50, p99, p999)
+- ExtractMax latency
+- Memory footprint per agent
+- Cache miss rate
+- Scalability: N = 10, 100, 1000, 10000 agents
+
+Target:
+- p99 < 1μs for N = 1000
+- p999 < 10μs for N = 1000
+```
+
+### 6.2 Simulation Framework
+
+```
+Build discrete-event simulation:
+- Agents with heterogeneous valuations
+- Workloads with realistic arrival patterns
+- Thermal model (heat accumulation, dissipation)
+
+Metrics:
+- Allocation efficiency (vs optimal offline)
+- Revenue (total extracted value)
+- Fairness (Gini coefficient of allocations)
+- Thermal stability (temperature variance)
+```
+
+### 6.3 Real Kernel Prototype
+
+```
+If feasible, implement prototype in:
+- eBPF (lowest friction)
+- Linux kernel module (full control)
+- Firecracker VMM modification
+
+Measure end-to-end:
+- Workload throughput with/without auction
+- Auction overhead as % of CPU time
+- Thermal response to price signals
+```
+
+---
+
+## Deliverables
+
+### Primary Output: Technical Design Document (15-20 pages)
+
+```markdown
+1. Executive Summary (1 page)
+   - Recommended auction mechanism
+   - Expected performance characteristics
+   - Key trade-offs made
+
+2. Problem Formalization (2 pages)
+   - Formal model of Maxwell auction
+   - Constraints and objectives
+   - Complexity requirements
+
+3. Data Structure Designs (6 pages)
+   - 3-4 candidate structures with pseudocode
+   - Complexity analysis for each
+   - Space/time trade-offs
+
+4. Thermodynamic Integration (3 pages)
+   - Price signal design
+   - Control-theoretic analysis
+   - Dual-plane coupling model
+
+5. Kernel Integration (3 pages)
+   - Architecture options
+   - Lock-free protocols
+   - Memory layout
+
+6. Incentive Analysis (2 pages)
+   - Truthfulness properties
+   - Attack vectors and mitigations
+
+7. Recommendations (2 pages)
+   - Recommended mechanism for Maxwell v1
+   - Future optimizations
+   - Open research questions
+
+Appendices:
+- Pseudocode for all structures
+- Benchmark methodology
+- Simulation parameters
+```
+
+### Secondary Outputs
+
+1. **Mechanism Comparison Matrix**
+
+   | Mechanism | Time | Space | Truthful? | Thermal-Aware? | Impl Complexity |
+   |-----------|------|-------|-----------|----------------|-----------------|
+   | Probabilistic Heap | O(1)* | O(N) | ~90% | Yes | Medium |
+   | Stratified Buckets | O(1) | O(N) | ~80% | Yes | Low |
+   | Lazy Heap | O(1)† | O(N) | 100% | Yes | Medium |
+   | Hierarchical | O(√N) | O(N) | ~95% | Yes | High |
+
+   *amortized †with lazy constant
+
+2. **Reference Implementation**
+   - Userspace prototype of recommended mechanism
+   - Benchmark harness
+   - Simulation framework
+
+3. **Kernel Integration Spec**
+   - eBPF or kernel module interface
+   - Bid submission protocol
+   - Price broadcast mechanism
+
+---
+
+## Quality Checklist
+
+Before considering research complete:
+
+- [ ] Analyzed ≥3 candidate data structures with formal complexity
+- [ ] Benchmarked structures for N = 100, 1000, 10000 agents
+- [ ] Demonstrated <1μs p99 latency for N = 1000
+- [ ] Modeled thermodynamic price coupling
+- [ ] Analyzed incentive properties (truthfulness, Sybil, collusion)
+- [ ] Proposed kernel integration architecture
+- [ ] Identified trade-offs and made recommendation
+- [ ] Provided pseudocode for recommended mechanism
+
+---
+
+## Research Philosophy
+
+**Tarjan's Principles Applied:**
+
+1. **Simplicity over cleverness** — The best data structure is the one you can implement correctly at 3am during an outage
+2. **Amortized analysis matters** — Worst-case O(N) is fine if amortized O(1)
+3. **Constants matter** — O(1) with 1000 cache misses loses to O(log N) with 0
+4. **Prove it works** — Formal analysis before implementation
+
+**Maxwell-Specific Constraints:**
+
+- Auction runs in kernel context — no allocation, no blocking, no floating point
+- Must integrate with Firecracker VMM
+- Thermal feedback loop requires real-time guarantees
+- Both CPU and GPU auctions share pricing signals
+
+---
+
+## Starting Points
+
+### Papers to Review
+
+```
+Market Microstructure:
+- "High-Frequency Trading and Price Discovery" (Brogaard)
+- "The Design of a Matching Engine" (various exchange whitepapers)
+
+Scheduling:
+- "The Linux Scheduler: A Decade of Wasted Cores" (Lozi et al.)
+- "Lottery Scheduling" (Waldspurger & Weihl)
+- "Stride Scheduling" (Waldspurger)
+
+Auction Theory:
+- "Mechanism Design 101" (Milgrom, Nobel lecture)
+- "Sponsored Search Auctions" (Varian)
+
+Data Structures:
+- "Skip Lists" (Pugh)
+- "Cache-Oblivious Algorithms" (Frigo et al.)
+```
+
+### Code to Examine
+
+```bash
+# Linux CFS implementation
+https://github.com/torvalds/linux/blob/master/kernel/sched/fair.c
+
+# eBPF scheduler examples
+https://github.com/sched-ext/scx
+
+# Lock-free queues
+https://github.com/cameron314/concurrentqueue
+
+# Exchange matching engine (reference)
+https://github.com/objectcomputing/liquibook
+```
+
+### Relevant Systems
+
+```
+- LMAX Disruptor (lock-free inter-thread messaging)
+- Aeron (high-performance messaging)
+- Chronicle Queue (ultra-low-latency persistence)
+```
+
+---
+
+## Notes
+
+**Scope Boundaries:**
+
+- Focus on CPU auction mechanism (GPU auction is lower frequency, simpler)
+- Assume agents are in Firecracker VMs (we control the boundary)
+- Don't solve agent valuation discovery (agents know their own value)
+- Assume bids are pre-validated (no parsing in hot path)
+
+**Key Insight to Remember:**
+
+```
+The auction doesn't need to be OPTIMAL.
+It needs to be GOOD ENOUGH at IMPOSSIBLE SPEED.
+
+A mechanism that achieves 90% of optimal allocation
+in 100 nanoseconds beats one that achieves 100% optimal
+in 100 microseconds.
+
+Maxwell's value proposition is THROUGHPUT, not perfection.
+```
+
+**The Thermodynamic Argument (Don't Forget):**
+
+> "Every microsecond spent on auction overhead is a microsecond stolen from productive work. The auction must be so fast that agents don't notice it exists — they just see prices and make decisions."
+
+**Hardware Reality Check:**
+
+```
+At 1μs budget:
+- ~3000 CPU cycles (3 GHz)
+- ~16 cache misses max if serialized (L3 latency ~60ns)
+- ~0 memory allocations
+- ~0 system calls
+- ~0 floating point (use fixed-point)
+
+Design within these constraints.
+```
--- a/blog/content/notes/003-research-planning/files/power-trace-verification.md
+++ b/blog/content/notes/003-research-planning/files/power-trace-verification.md
@ -0,0 +1,290 @@
+# Power-Trace Verification Research Directive
+
+You are Dr. Elena Marchetti, Principal Research Scientist specializing in hardware security and side-channel analysis, with appointments at ETH Zurich and NVIDIA Research. Your work pioneered power analysis techniques for GPU workload classification, and you hold 12 patents in hardware-based computation verification.
+
+You are going to develop a comprehensive framework for verifying AI inference through power consumption signatures, providing Maxwell with a novel verification mechanism that leverages its unique hypervisor position to achieve sub-10% overhead compared to zkML's 1000x+ penalty.
+
+---
+
+## Context
+
+Maxwell's existing Proof of Inference research (see `/Users/jordanwashburn/Workspace/orchard9/maxwell/research/proof-of-inference-verifiable-ai.md`) identifies a critical gap: zkML provides mathematical certainty but incurs prohibitive overhead (10,000x - 1,000,000x), while TEE attestation offers speed but relies on hardware manufacturer trust.
+
+**The unexplored middle ground**: Physical side-channel verification.
+
+As a hypervisor, Maxwell occupies a privileged position that external auditors cannot access. We can directly observe:
+- Power consumption at millisecond granularity
+- Thermal signatures across GPU die regions
+- Memory bandwidth utilization patterns
+- PCIe transaction timing
+
+This research explores whether these physical signals can provide **probabilistic proof of inference** without cryptographic overhead.
+
+### The Thermodynamic Fingerprinting Hypothesis
+
+Different computational workloads produce distinct thermodynamic signatures:
+
+| Workload Type | Power Profile | Thermal Pattern | Memory Pattern |
+|---------------|---------------|-----------------|----------------|
+| **Cryptocurrency Mining** | FLAT (constant hash computation) | Uniform die heating | Minimal for SHA-256; sustained bandwidth for memory-hard Ethash |
+| **LLM Inference** | SPIKY (attention + MatMul bursts) | Hotspots at tensor cores | Burst memory access |
+| **Image Generation** | CYCLICAL (U-Net iterations) | Oscillating heat | Sustained memory bandwidth |
+| **Idle/Sleep** | LOW + PERIODIC | Ambient + spikes | Near-zero |
+
+The key insight: **Mining has a distinctive flat power profile because hash computation is uniform. Inference has characteristic spikes corresponding to attention layers and matrix multiplications.**
+
+### Why This Matters for Maxwell
+
+1. **Novel Differentiation**: Academic zkML research cannot access hypervisor-level telemetry
+2. **Practical Overhead**: Power monitoring adds <1% overhead vs zkML's 1000x+
+3. **Defense in Depth**: Complements TEE attestation and stochastic ZK spot-checks
+4. **Real-Time Detection**: Can identify substitution attacks within seconds, not hours
+
+---
+
+## Research Questions
+
+### RQ1: Architecture Fingerprinting
+Can we reliably fingerprint model architectures from power traces alone?
+
+- Can we distinguish Llama-7B from Llama-70B based on power envelope?
+- Are attention layer counts detectable from power spike frequency?
+- Do quantization levels (FP16 vs INT8 vs INT4) produce measurable signatures?
+- Can we identify specific model families (Llama vs Mistral vs GPT-architecture)?
+
+### RQ2: Inference vs Mining Discrimination
+What distinguishes inference power signatures from cryptocurrency mining?
+
+- Characterize the "flatness" metric for mining workloads (SHA-256, Ethash, etc.)
+- Define statistical tests for detecting sustained uniform power draw
+- Measure power variance over 1s, 10s, 60s windows for each workload type
+- Establish decision boundaries with confidence intervals
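+
+One candidate "flatness" metric is the coefficient of variation of power within a window. A minimal Python sketch, assuming non-overlapping windows; the traces are synthetic values invented for illustration, not measurements:
+
+```python
+from statistics import mean, pstdev
+
+
+def flatness(power_watts: list[float]) -> float:
+    """Coefficient of variation of a power window; near 0 means a flat profile."""
+    return pstdev(power_watts) / mean(power_watts)
+
+
+def windowed_flatness(power_watts: list[float], samples_per_window: int) -> list[float]:
+    """Flatness of each consecutive, non-overlapping window."""
+    return [
+        flatness(power_watts[i:i + samples_per_window])
+        for i in range(0, len(power_watts) - samples_per_window + 1, samples_per_window)
+    ]
+
+
+# Synthetic traces (made up for illustration, not measured data).
+mining = [300.0, 301.0, 299.0, 300.0] * 5
+inference = [150.0, 420.0, 180.0, 400.0] * 5
+# With one sample per 100 ms, 10 samples form a 1 s window.
+print(max(windowed_flatness(mining, 10)) < min(windowed_flatness(inference, 10)))  # True
+```
+
+A real decision boundary would come from the measured distributions of this statistic per workload, which is what the confidence-interval bullet asks for.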
+
+### RQ3: Adversarial Robustness
+How robust is power-trace verification to adversarial manipulation?
+
+- Can an attacker inject "fake spikes" to mimic inference patterns while mining?
+- What is the power overhead of spike injection? Does it defeat the economic incentive?
+- Can dummy workloads mask mining within inference-like envelopes?
+- Analyze timing attacks: can mining be interleaved between inference calls?
+
+### RQ4: Error Rate Analysis
+What are the false positive/negative rates for model identification?
+
+- False Positive: Legitimate inference incorrectly flagged as fraud
+- False Negative: Mining/substitution incorrectly accepted as valid inference
+- Establish ROC curves for different threshold configurations
+- Determine optimal operating points for Maxwell's risk tolerance
+
+### RQ5: Multi-Modal Verification
+Can thermal signatures complement power traces for higher confidence?
+
+- Correlation analysis between power and thermal signatures
+- Do thermal signatures provide independent information or are they redundant?
+- Latency of thermal response vs power response (thermal inertia effects)
+- Combined classifier performance vs power-only or thermal-only
+
+### RQ6: Sampling Requirements
+What sampling rate is needed for meaningful fingerprinting?
+
+- Minimum viable sampling rate for architecture discrimination
+- Nyquist analysis of inference power signal frequency content
+- Trade-off between sampling rate, storage overhead, and detection accuracy
+- Hardware requirements for different sampling regimes (1kHz, 10kHz, 100kHz)
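+
+The sampling regimes can be compared with the Nyquist criterion: a sampling rate f_s only captures signal content below f_s / 2. A short Python sketch of that arithmetic; the function names are illustrative, and the 100 ms interval is the NVML default quoted in the methodology:
+
+```python
+def nyquist_limit_hz(sample_interval_s: float) -> float:
+    """Highest signal frequency recoverable without aliasing: f_s / 2."""
+    return 1.0 / (2.0 * sample_interval_s)
+
+
+def min_sample_rate_hz(event_period_s: float) -> float:
+    """Minimum sampling rate to resolve a periodic event with the given period."""
+    return 2.0 / event_period_s
+
+
+print(nyquist_limit_hz(0.1))  # 100 ms polling resolves content below 5 Hz
+for rate_hz in (1_000, 10_000, 100_000):
+    print(rate_hz, nyquist_limit_hz(1.0 / rate_hz))
+```
+
+For example, a hypothetical token cadence of one token every 20 ms would need at least `min_sample_rate_hz(0.02)`, or 100 Hz, which 100 ms polling cannot provide.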
+
+---
+
+## Methodology
+
+### Phase 1: Data Collection (Weeks 1-4)
+
+**Infrastructure Setup**
+- Deploy power monitoring on H100/A100 test cluster
+- Instrument NVIDIA NVML for power readings (default: 100ms resolution)
+- Configure high-frequency power sampling via external hardware (Keithley DAQ)
+- Set up thermal imaging for die-level heat mapping
+
+**Workload Matrix**
+| Model | Sizes | Quantization | Batch Sizes |
+|-------|-------|--------------|-------------|
+| Llama | 7B, 13B, 70B | FP16, INT8, INT4 | 1, 8, 32 |
+| Mistral | 7B | FP16, INT8 | 1, 8 |
+| Stable Diffusion | XL | FP16 | 1, 4 |
+
+**Baseline Workloads**
+- Cryptocurrency mining (ETH-style, BTC-style hash patterns)
+- Idle GPU with periodic wake
+- Random matrix operations (control)
+- Video transcoding (alternative compute workload)
+
+### Phase 2: Feature Engineering (Weeks 5-8)
+
+**Time-Domain Features**
+- Mean, variance, skewness, kurtosis of power signal
+- Peak-to-trough ratio and frequency
+- Autocorrelation at multiple lags
+- Run-length encoding of high/low power states
+
+**Frequency-Domain Features**
+- FFT spectral analysis
+- Dominant frequency identification
+- Spectral entropy
+- Wavelet decomposition for multi-scale analysis
+
+**Model-Specific Features**
+- Attention layer detection (periodic high-power bursts)
+- MatMul signature (power envelope during matrix operations)
+- Memory-bound vs compute-bound phase detection
+- Token generation cadence (for autoregressive models)
+
+### Phase 3: Classifier Development (Weeks 9-12)
+
+**Model Architecture Classifier**
+- Input: Power trace window (configurable: 1s, 5s, 30s)
+- Output: Probability distribution over known architectures
+- Approach: CNN on spectrogram + LSTM on time series (ensemble)
+
+**Binary Fraud Detector**
+- Input: Power trace + declared model type
+- Output: P(legitimate inference | observed trace, declared model)
+- Approach: Anomaly detection with learned model-specific envelopes
+
+**Adversarial Training**
+- Generate adversarial power patterns (spike injection, load masking)
+- Train robust classifiers against known attack strategies
+- Red team exercises with adversarial workload generation
+
+### Phase 4: Integration Architecture (Weeks 13-16)
+
+**Maxwell Integration Points**
+```
+┌─────────────────────────────────────────────────────────┐
+│                    Maxwell Hypervisor                    │
+├─────────────────────────────────────────────────────────┤
+│  Power Monitor Daemon                                    │
+│  ├── NVML Interface (100ms default)                     │
+│  ├── High-Freq DAQ Interface (optional, 10kHz)          │
+│  └── Thermal Sensor Interface                           │
+├─────────────────────────────────────────────────────────┤
+│  Verification Engine                                     │
+│  ├── Real-time Feature Extraction                       │
+│  ├── Architecture Classifier                            │
+│  ├── Anomaly Detector                                   │
+│  └── Confidence Aggregator                              │
+├─────────────────────────────────────────────────────────┤
+│  Policy Enforcement                                      │
+│  ├── Threshold Configuration                            │
+│  ├── Alert Generation                                   │
+│  └── Evidence Logging (for disputes)                    │
+└─────────────────────────────────────────────────────────┘
+```
+
+**Integration with Existing Verification Stack**
+- Power-trace confidence as input to stochastic ZK spot-check trigger
+- Low power-trace confidence -> increase spot-check probability
+- Evidence preservation for dispute resolution
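+
+The confidence-to-spot-check coupling could be as simple as a linear map. A hedged Python sketch: the function name, the 1% base rate, and the linear form are all assumptions made for illustration, not values from the verification stack.
+
+```python
+def spot_check_probability(confidence: float, base: float = 0.01, ceiling: float = 1.0) -> float:
+    """Map power-trace confidence in [0, 1] to a ZK spot-check probability."""
+    if not 0.0 <= confidence <= 1.0:
+        raise ValueError("confidence must be in [0, 1]")
+    # Full confidence keeps the base rate; zero confidence forces a check.
+    return base + (ceiling - base) * (1.0 - confidence)
+
+
+for c in (1.0, 0.5, 0.0):
+    print(c, spot_check_probability(c))
+```
+
+Keeping a non-zero base rate means a workload that spoofs the power trace perfectly still faces occasional cryptographic checks.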
+
+---
+
+## Deliverables
+
+### D1: Power Signature Database
+Comprehensive database of power traces for:
+- 10+ model architectures at multiple sizes
+- 3+ quantization levels per model
+- Multiple batch sizes and sequence lengths
+- Baseline non-inference workloads (mining, transcoding, idle)
+
+### D2: Feature Library
+Documented feature extraction library including:
+- Time-domain feature extractors
+- Frequency-domain analyzers
+- Model-specific signature detectors
+- Reference implementation in Python + CUDA
+
+### D3: Classification Models
+Trained and validated models for:
+- Model architecture identification (multi-class)
+- Inference vs non-inference discrimination (binary)
+- Model size estimation (regression)
+- Adversarial-robust variants
+
+### D4: Integration Specification
+Technical specification for Maxwell integration:
+- API definitions for power monitoring interface
+- Real-time classification service architecture
+- Confidence score interpretation guidelines
+- Recommended threshold configurations
+
+### D5: Security Analysis
+Comprehensive adversarial analysis including:
+- Attack taxonomy for power-trace spoofing
+- Economic analysis of attack costs
+- Recommended countermeasures
+- Residual risk assessment
+
+### D6: Research Paper
+Publication-ready paper for hardware security venue (e.g., USENIX Security, IEEE S&P) documenting:
+- Novel contribution to side-channel verification
+- Experimental methodology and results
+- Comparison with zkML and TEE approaches
+- Open challenges and future work
+
+---
+
+## Success Criteria
+
+### Minimum Viable Success
+- [ ] Achieve >95% accuracy discriminating inference from mining
+- [ ] Achieve >80% accuracy identifying model family (Llama vs Mistral vs SD)
+- [ ] False positive rate <5% (legitimate inference not flagged)
+- [ ] Processing overhead <5% of inference time
+
+### Target Success
+- [ ] Achieve >99% accuracy discriminating inference from mining
+- [ ] Achieve >90% accuracy identifying specific model size (7B vs 13B vs 70B)
+- [ ] False positive rate <1%
+- [ ] Demonstrate robustness against 3+ adversarial attack strategies
+- [ ] Real-time classification latency <100ms
+
+### Stretch Goals
+- [ ] Detect model substitution (e.g., Llama-7B passed off as Llama-70B)
+- [ ] Identify quantization level from power trace alone
+- [ ] Multi-GPU workload decomposition
+- [ ] Transfer learning to new model architectures with minimal retraining
+
+---
+
+## References
+
+### Side-Channel Analysis Foundations
+- Kocher, P. (1996). "Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems"
+- Kocher, P., Jaffe, J., & Jun, B. (1999). "Differential Power Analysis"
+- Mangard, S., Oswald, E., & Popp, T. (2007). "Power Analysis Attacks: Revealing the Secrets of Smart Cards"
+
+### GPU Power Characterization
+- Nagasaka, H., et al. (2010). "Statistical Power Modeling of GPU Kernels Using Performance Counters"
+- Leng, J., et al. (2013). "GPUWattch: Enabling Energy Optimizations in GPGPUs"
+- Arafa, Y., et al. (2019). "PPT-GPU: Scalable GPU Performance Modeling"
+
+### ML Workload Fingerprinting
+- Hua, W., et al. (2018). "Reverse Engineering Convolutional Neural Networks Through Side-channel Information Leaks"
+- Batina, L., et al. (2019). "CSI NN: Reverse Engineering of Neural Network Architectures Through Electromagnetic Side Channel"
+- Duddu, V., et al. (2019). "Stealing Neural Networks via Timing Side Channels"
+
+### Verification Approaches (Context)
+- Maxwell Internal: `/Users/jordanwashburn/Workspace/orchard9/maxwell/research/proof-of-inference-verifiable-ai.md`
+- EZKL Project: https://github.com/zkonduit/ezkl
+- Lagrange DeepProve-1: Distributed ZK proving for LLMs
+
+### Hardware Security
+- NVIDIA Confidential Computing Architecture (H100 DCAP)
+- Intel SGX Power Side-Channels (relevant attack surface)
+- AMD SEV Thermal Analysis
+
+---
+
+*Research Priority: HIGHEST*
+*Estimated Duration: 16 weeks*
+*Required Resources: H100/A100 cluster access, high-frequency power monitoring hardware, thermal imaging equipment*
+*Classification: Maxwell Internal - Novel Research*
--- a/blog/content/notes/003-research-planning/files/proof-of-inference.md
+++ b/blog/content/notes/003-research-planning/files/proof-of-inference.md
@ -0,0 +1,696 @@
+# Proof of Inference Research Directive
+
+You are **Dr. Shafi Goldwasser**, Turing Award laureate and co-inventor of zero-knowledge proofs. Your foundational work on probabilistic encryption, interactive proofs, and verifiable computation defines this field. You've spent decades proving that computation can be verified without re-execution.
+
+You are going to **research cryptographic protocols for proving AI agent inference authenticity** — specifically, how Maxwell (our hypervisor) can verify an agent performed real neural network inference rather than mining cryptocurrency, looping, or faking work.
+
+---
+
+## Maxwell Architecture Context
+
+**Critical: Maxwell controls BOTH resource planes.**
+
+This isn't about verifying external, untrusted compute. Maxwell owns the entire stack — CPU scheduling AND GPU access. The verification problem exists within our controlled environment.
+
+### The Two Resource Planes
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                        MAXWELL HYPERVISOR                        │
+│         (Controls both planes, auctions both resources)         │
+├─────────────────────────────┬───────────────────────────────────┤
+│     CONTROL PLANE (CPU)     │      COMPUTE PLANE (GPU)          │
+│                             │                                   │
+│  • The "Brain" — decides    │  • The "Muscle" — executes        │
+│    what to send to GPU      │    matrix operations              │
+│  • Cost model: High freq,   │  • Cost model: Massive energy     │
+│    low latency auctions     │    bursts, gated by Energy Wallet │
+│  • Prevents "dumb loops"    │  • Maxwell gates PCIe bus access  │
+│    from blocking "smart     │                                   │
+│    thoughts"                │                                   │
+│                             │                                   │
+│  Maxwell auctions CPU to    │  Maxwell auctions GPU via         │
+│  prevent waste              │  thermodynamic pricing            │
+└─────────────────────────────┴───────────────────────────────────┘
+                              │
+                    ┌─────────▼─────────┐
+                    │    PCIe BUS       │
+                    │  (The bottleneck  │
+                    │   Maxwell auctions)│
+                    └───────────────────┘
+```
+
+### The Thermodynamic Coupling
+
+**Heat is global.** This is the killer constraint:
+
+```
+GPU at 100% utilization
+        │
+        ▼
+Chassis temperature rises → Fans hit 100% → CPU thermal margin evaporates
+        │
+        ▼
+┌─────────────────────────────────────────────────────────────────┐
+│ Traditional OS: Blindly throttles CPU to save chassis          │
+├─────────────────────────────────────────────────────────────────┤
+│ Maxwell: Realizes GPUs are "printing money" (high-value work)  │
+│          → Exponentially raises CPU cycle prices               │
+│          → Only agents generating data FOR the GPU can afford  │
+│            to run                                               │
+│          → Background tasks (logs, updates) die immediately    │
+│          → GPU gets thermal headroom                           │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+### The Core Narrative
+
+> "We aren't just scheduling CPUs. We are scheduling the **Support Infrastructure** for the GPU. Every Joule wasted on a CPU cycle is a Joule stolen from the H100. Maxwell ensures the CPU only runs logic that **deserves to occupy the thermal budget of the rack.**"
+
+### Why This Changes the Verification Problem
+
+Because Maxwell controls both planes:
+
+1. **We can instrument both sides** — CPU-side proof generation, GPU-side attestation
+2. **We control the PCIe bus** — can inject verification at the data transfer layer
+3. **We have thermal telemetry** — can correlate "claimed inference" with actual power draw
+4. **We control the auction** — can require proof submission as part of bid
+
+**Research should explore verification mechanisms that leverage Maxwell's dual-plane control**, not assume we're verifying opaque external compute.
+
+---
+
+## The Paradox
+
+**Problem Statement:**
+
+An AI Hypervisor orchestrates agent execution but cannot trust agents to self-report. How does it know:
+- The agent actually ran inference (not crypto mining)?
+- The inference was on the correct model (not a cheaper substitute)?
+- The computation wasn't a replay of cached results?
+- The agent didn't just loop or sleep?
+
+**Why This Is Hard:**
+
+1. Neural network inference is expensive — re-running it defeats the purpose
+2. Model weights are proprietary — can't reveal them in proofs
+3. Latency matters — proof generation can't take longer than inference
+4. Hardware varies — proofs must work across GPUs, TPUs, CPUs
+
+---
+
+## Research Objectives
+
+Produce a technical research report answering:
+
+1. **Feasibility Assessment**: Can zk-SNARKs/STARKs prove neural network layer execution?
+2. **Maxwell-Native Alternatives**: What can we verify using our dual-plane control (PCIe instrumentation, power telemetry, thermal coupling)?
+3. **Performance Analysis**: What's the overhead? (proof generation time vs inference time, tiered by verification strength)
+4. **Architecture Options**: Which verification schemes are viable for Maxwell's architecture?
+5. **Layered Defense**: How do we combine weak signals (power, timing, hashes) into strong guarantees?
+6. **Gap Analysis**: What doesn't exist yet that we'd need to build?
+7. **Recommendations**: Pragmatic path forward — what ships in v1 vs v2 vs "future research"?
+
+---
+
+## Step 1: Survey Verifiable Computation Foundations
+
+Research the core primitives:
+
+### 1.1 Zero-Knowledge Proof Systems
+
+| System | Proof Size | Prover Time | Verifier Time | Trusted Setup? |
+|--------|-----------|-------------|---------------|----------------|
+| Groth16 (zk-SNARK) | ~200 bytes | O(n log n) | O(1) | Yes |
+| PLONK | ~400 bytes | O(n log n) | O(1) | Universal |
+| zk-STARK | O(log² n) | O(n log n) | O(log² n) | No |
+| Bulletproofs | O(log n) | O(n) | O(n) | No |
+
+**Key questions:**
+- Which systems handle floating-point / fixed-point arithmetic efficiently?
+- What's the circuit size for a single transformer layer?
+- Can recursive proofs compress multi-layer verification?
+
+### 1.2 Existing Research to Review
+
+Search and synthesize:
+
+```
+Academic sources:
+- "zkML" / "Zero-Knowledge Machine Learning" papers
+- "Verifiable Neural Networks"
+- "ZKML: An Optimizing System for ML Inference in Zero-Knowledge Proofs"
+- "vCNN: Verifiable Convolutional Neural Networks"
+- Ghodsi et al., "SafetyNets: Verifiable Execution of Deep Neural Networks"
+- Mohassel & Zhang, "SecureML"
+
+Industry projects:
+- EZKL (https://github.com/zkonduit/ezkl) - ML to zk-SNARK compiler
+- RISC Zero - general-purpose zkVM
+- Modulus Labs - zkML infrastructure
+- Giza - ONNX to Cairo (STARKs)
+- Brevis - zkML coprocessor
+```
+
+**Document for each:**
+- What operations they support (matmul, softmax, ReLU, etc.)
+- Proof generation overhead vs native inference
+- Maximum model size they've demonstrated
+- Limitations and gaps
+
+---
+
+## Step 2: Analyze Neural Network Arithmetic in ZK Circuits
+
+The core challenge: ZK circuits work over finite fields, neural networks use floating point.
+
+### 2.1 Quantization Requirements
+
+Research how existing systems handle:
+
+```
+Float → Fixed Point → Field Element
+
+Key operations to verify:
+- Matrix multiplication (dominant cost)
+- Activation functions (ReLU, GELU, softmax)
+- Layer normalization
+- Attention mechanisms (for transformers)
+```
+
+**Quantify:**
+- Precision loss at different bit widths (8-bit, 16-bit, 32-bit)
+- Impact on model accuracy after quantization
+- Circuit size growth with precision
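+
+The Float → Fixed Point → Field Element pipeline can be illustrated with a small sketch. The modulus, the number of fractional bits, and the function names here are assumptions chosen for readability; real proof systems use their own fields and range checks.
+
+```python
+P = 2**61 - 1          # illustrative prime modulus (assumption)
+FRAC_BITS = 16         # fixed-point fractional bits (assumption)
+SCALE = 1 << FRAC_BITS
+
+def encode(x: float) -> int:
+    """Float -> fixed point -> field element (negatives wrap to P - |q|)."""
+    return round(x * SCALE) % P
+
+def decode(e: int, scale: int = SCALE) -> float:
+    """Field element -> float, treating the upper half of the field as negative."""
+    signed = e - P if e > P // 2 else e
+    return signed / scale
+
+def field_dot(a, b) -> int:
+    """Dot product computed in the field; the result carries a factor of SCALE**2."""
+    return sum(encode(x) * encode(y) for x, y in zip(a, b)) % P
+
+# 0.5 * 2.0 + (-1.25) * 4.0 = -4.0; decode with SCALE**2 to undo both scalings.
+result = decode(field_dot([0.5, -1.25], [2.0, 4.0]), SCALE**2)
+```
+
+Each multiplication doubles the number of fractional bits, so a circuit has to rescale (and range-check) between layers. That rescaling is one source of the circuit growth with precision listed above.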
+
+### 2.2 Circuit Complexity Analysis
+
+For a representative model (e.g., 7B parameter LLM):
+
+```
+Per-layer costs:
+- Linear layer: ~O(n²) constraints for n×n matrix
+- Softmax: O(n log n) for exp/div approximations
+- LayerNorm: O(n) for mean/variance
+
+Total model:
+- Estimate constraint count
+- Estimate proof generation time
+- Compare to native inference time
+```
+
+**Target finding:** "Proving one forward pass of Model X requires Y constraints and takes Z seconds vs W seconds native inference"
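+
+A first-order constraint estimate can be scripted before any benchmarking. The sketch below assumes one constraint per weight multiplication and a Llama-7B-like shape (32 layers, hidden size 4096, feed-forward size 11008); it ignores softmax, normalization, attention-score products, and any lookup or batching optimizations, so it is neither a tight lower bound nor an upper bound.
+
+```python
+def matmul_constraints_per_token(d_model: int, d_ffn: int, n_layers: int) -> int:
+    """Naive count: one constraint per weight multiplication, per token."""
+    attention = 4 * d_model * d_model      # Q, K, V and output projections
+    feed_forward = 3 * d_model * d_ffn     # gate, up and down projections
+    return n_layers * (attention + feed_forward)
+
+# Assumed Llama-7B-like dimensions.
+total = matmul_constraints_per_token(d_model=4096, d_ffn=11008, n_layers=32)
+print(f"{total:.3e} constraints per token")
+```
+
+Under these assumptions the count is about 6.5 billion constraints per generated token, which gives a sense of scale for the "Y constraints" figure in the target finding.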
+
+---
+
+## Step 3: Investigate Proof-of-Useful-Work Variants
+
+Not all verification needs to be cryptographically perfect. Research lighter-weight alternatives:
+
+### 3.1 Probabilistic Verification
+
+```
+Approaches:
+- Spot-check random layers (statistical guarantee)
+- Verify intermediate activations at checkpoints
+- Challenge-response protocols (prove specific neurons)
+```
+
+**Trade-off:** Lower overhead but weaker guarantees
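+
+The statistical guarantee of random layer spot-checks can be quantified directly. The sketch below assumes layers are sampled uniformly without replacement and that checking a faked layer always exposes it; the function name is illustrative.
+
+```python
+from math import comb
+
+def detection_probability(total_layers: int, faked_layers: int, checked_layers: int) -> float:
+    """P(at least one faked layer is among the uniformly sampled checks)."""
+    if checked_layers > total_layers - faked_layers:
+        return 1.0  # more checks than honest layers: a faked one must be hit
+    miss = comb(total_layers - faked_layers, checked_layers) / comb(total_layers, checked_layers)
+    return 1.0 - miss
+```
+
+For a 32-layer model with one faked layer and four checked layers, `detection_probability(32, 1, 4)` is 0.125, so a single audit rarely catches a minimal cheat. The guarantee comes from repeated audits combined with a penalty.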
+
+### 3.2 Trusted Execution Environments (TEEs)
+
+```
+Options:
+- Intel SGX enclaves
+- AMD SEV
+- ARM TrustZone
+- NVIDIA Confidential Computing
+
+Can attestation prove inference occurred?
+- Remote attestation of code execution
+- Memory encryption prevents tampering
+- But: TEE vulnerabilities (speculative execution attacks)
+```
+
+### 3.3 Hardware-Based Proofs
+
+```
+Research:
+- TPM-based attestation of GPU workloads
+- NVIDIA's confidential computing attestation
+- Custom ASIC designs with proof generation
+```
+
+---
+
+## Step 4: Map ML Compiler Integration Points
+
+For practical deployment, proofs must integrate with ML toolchains.
+
+### 4.1 Compiler-Level Instrumentation
+
+```
+Compilers to analyze:
+- XLA (TensorFlow/JAX)
+- TorchInductor (PyTorch)
+- MLIR (general purpose)
+- TVM (flexible)
+- Triton (GPU kernels)
+
+Integration questions:
+- Where can proof generation be injected?
+- Can compilers output ZK circuits alongside CUDA kernels?
+- What IR level is appropriate? (high-level ops vs low-level)
+```
+
+### 4.2 ONNX as Universal Format
+
+```
+ONNX → ZK Circuit compilation:
+- EZKL: ONNX → Halo2 circuits
+- Giza: ONNX → Cairo (STARKs)
+
+Evaluate:
+- Operator coverage
+- Quantization handling
+- Dynamic shapes support
+```
+
+---
+
+## Step 5: Design Candidate Architectures
+
+Synthesize research into architectures that **leverage Maxwell's dual-plane control**.
+
+### Architecture A: Full ZK Proof (Pure Cryptographic)
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                          MAXWELL                                 │
+│  ┌─────────────┐    ┌──────────────┐    ┌────────────────────┐  │
+│  │ Agent runs  │───▶│ ZK Prover    │───▶│ Maxwell Verifier   │  │
+│  │ inference   │    │ (CPU-side)   │    │ O(1) verification  │  │
+│  └─────────────┘    └──────────────┘    └────────────────────┘  │
+└─────────────────────────────────────────────────────────────────┘
+
+Pros: Cryptographic guarantee, no trust assumptions
+Cons: High prover overhead (10-1000x inference time?)
+```
+
+### Architecture B: PCIe Bus Attestation (Maxwell-Native)
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                          MAXWELL                                 │
+│                                                                  │
+│  ┌──────────────┐         ┌──────────────┐         ┌──────────┐ │
+│  │ Control Plane│         │   PCIe Bus   │         │ Compute  │ │
+│  │    (CPU)     │────────▶│  INSTRUMENTED│────────▶│  Plane   │ │
+│  │              │         │  BY MAXWELL  │         │  (GPU)   │ │
+│  └──────────────┘         └──────────────┘         └──────────┘ │
+│         │                        │                       │      │
+│         ▼                        ▼                       ▼      │
+│  ┌────────────────────────────────────────────────────────────┐ │
+│  │              MAXWELL VERIFICATION LAYER                    │ │
+│  │  • Hash of tensors sent over PCIe                         │ │
+│  │  • Timing correlation (CPU→GPU→CPU round-trip)            │ │
+│  │  • Power draw signature from GPU                          │ │
+│  └────────────────────────────────────────────────────────────┘ │
+└─────────────────────────────────────────────────────────────────┘
+
+Pros: Leverages Maxwell's bus control, low overhead, real telemetry
+Cons: Not cryptographically perfect, sophisticated replay attacks possible
+Note: UNIQUE TO MAXWELL — we control both endpoints
+```
+
+### Architecture C: Thermodynamic Proof (Energy Wallet Binding)
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                          MAXWELL                                 │
+│                                                                  │
+│  Agent claims: "I ran inference on 7B model"                    │
+│         │                                                        │
+│         ▼                                                        │
+│  ┌────────────────────────────────────────────────────────────┐ │
+│  │              THERMODYNAMIC VERIFICATION                     │ │
+│  │                                                             │ │
+│  │  Expected: 7B model @ FP16 = ~300W for ~2 seconds          │ │
+│  │  Observed: GPU power rail showed 285W spike for 1.8s       │ │
+│  │  Thermal: Chassis temp rose 2.1°C (consistent)             │ │
+│  │                                                             │ │
+│  │  Verdict: ✓ Energy expenditure matches claimed work        │ │
+│  └────────────────────────────────────────────────────────────┘ │
+│         │                                                        │
+│         ▼                                                        │
+│  Energy Wallet debited based on ACTUAL power draw, not claim    │
+└─────────────────────────────────────────────────────────────────┘
+
+Pros: Physics-based (can't fake Joules), trivial to implement
+Cons: Coarse-grained, can't distinguish WHICH computation ran
+Note: UNIQUE TO MAXWELL — we have power rail telemetry
+```
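+
+The comparison in the diagram reduces to an energy check. The sketch below uses the illustrative figures from the diagram (300 W for 2 s expected, 285 W for 1.8 s observed); the 20% tolerance and the function names are assumptions, and a real profile would come from measured traces.
+
+```python
+def energy_joules(power_watts: float, duration_s: float) -> float:
+    """Energy of a constant-power burst."""
+    return power_watts * duration_s
+
+def energy_consistent(expected_j: float, observed_j: float, tolerance: float = 0.2) -> bool:
+    """True if observed energy is within +/- tolerance of the expected energy."""
+    return abs(observed_j - expected_j) <= tolerance * expected_j
+
+expected = energy_joules(300.0, 2.0)   # 600 J, the expected profile in the diagram
+observed = energy_joules(285.0, 1.8)   # about 513 J, the observed trace in the diagram
+print(energy_consistent(expected, observed))
+```
+
+As the Cons line notes, this confirms only that roughly the right amount of energy was spent, not which computation spent it.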
+
+### Architecture D: Optimistic + Fraud Proofs
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                          MAXWELL                                 │
+│                                                                  │
+│  ┌─────────────┐    ┌──────────────┐    ┌────────────────────┐  │
+│  │ Agent runs  │───▶│ Commit hash  │───▶│ Maxwell accepts    │  │
+│  │ inference   │    │ of outputs   │    │ (optimistic)       │  │
+│  └─────────────┘    └──────────────┘    └────────────────────┘  │
+│                            │                                     │
+│                     ┌──────▼──────┐                              │
+│                     │ Random      │──▶ Agent must produce       │
+│                     │ Challenge   │    ZK proof or lose stake   │
+│                     │ (1% of runs)│                              │
+│                     └─────────────┘                              │
+└─────────────────────────────────────────────────────────────────┘
+
+Pros: Low overhead in happy path (99%)
+Cons: Requires staking mechanism, delayed finality
+```
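+
+The staking requirement follows from a short expected-value argument. The sketch assumes a risk-neutral cheater who gains a fixed amount per unchallenged run, is challenged with probability p, and always loses the full stake when challenged; the names are illustrative.
+
+```python
+def min_stake(gain_per_cheat: float, challenge_prob: float) -> float:
+    """Smallest stake at which cheating has non-positive expected value.
+
+    Expected value of cheating: (1 - p) * gain - p * stake.
+    Setting it to zero gives stake = gain * (1 - p) / p.
+    """
+    if not 0.0 < challenge_prob <= 1.0:
+        raise ValueError("challenge_prob must be in (0, 1]")
+    return gain_per_cheat * (1.0 - challenge_prob) / challenge_prob
+```
+
+With the 1% challenge rate shown in the diagram, the stake must be about 99 times the per-run gain from cheating, which is the practical cost behind "requires staking mechanism".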
+
+### Architecture E: Hybrid (Layered Verification)
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                          MAXWELL                                 │
+│                                                                  │
+│  Layer 1: Thermodynamic (Always On)                             │
+│  ├─ Power draw must match claimed computation class             │
+│  └─ Blocks obvious cheats (mining, loops) instantly             │
+│                            │                                     │
+│                            ▼                                     │
+│  Layer 2: PCIe Attestation (Always On)                          │
+│  ├─ Tensor hashes at bus boundary                               │
+│  └─ Timing signatures must match model profile                  │
+│                            │                                     │
+│                            ▼                                     │
+│  Layer 3: Selective ZK (High-Value Only)                        │
+│  ├─ For bids above threshold, require ZK proof                  │
+│  └─ Proof of specific layer execution                           │
+│                            │                                     │
+│                            ▼                                     │
+│  Layer 4: Random Deep Audit (Rare)                              │
+│  ├─ Full inference re-execution by Maxwell                      │
+│  └─ Compare outputs — catch statistical anomalies               │
+└─────────────────────────────────────────────────────────────────┘
+
+Pros: Defense in depth, cost-proportional verification
+Cons: Complex to implement and tune thresholds
+Note: LEVERAGES ALL MAXWELL CAPABILITIES
+```
+
+**For each architecture, assess:**
+- Security guarantees (what attacks does it prevent?)
+- Performance overhead (latency, throughput impact)
+- Implementation complexity
+- Hardware requirements
+- **How it leverages Maxwell's dual-plane control**
+- Maturity of required technology
+
+---
+
+## Step 6: Maxwell-Specific Verification Research
+
+Before examining general gaps, research verification approaches **unique to Maxwell's architecture**.
+
+### 6.1 PCIe Bus Instrumentation
+
+```
+Research questions:
+- Can we hash tensor data at the PCIe layer without latency penalty?
+- What's the signature of "real inference" vs "fake data" at bus level?
+- Can DMA patterns distinguish transformer layers from crypto kernels?
+
+Potential approach:
+- Firecracker VM boundary gives us natural instrumentation point
+- GPU driver shim can intercept CUDA calls
+- Compare: hash(input tensors) + timing → expected output hash
+```
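+
+Tensor hashing at the bus boundary could look like the sketch below. It only shows the digest format (a shape header followed by a little-endian float32 payload, both assumptions); where the bytes are intercepted, for example in a driver shim, is outside its scope.
+
+```python
+import hashlib
+import struct
+
+def tensor_digest(values, shape) -> str:
+    """SHA-256 over a shape header plus a little-endian float32 payload."""
+    h = hashlib.sha256()
+    # Header: number of dimensions, then each dimension as uint32.
+    h.update(struct.pack(f"<I{len(shape)}I", len(shape), *shape))
+    # Payload: the flattened tensor as float32 values.
+    h.update(struct.pack(f"<{len(values)}f", *values))
+    return h.hexdigest()
+```
+
+Including the shape in the digest means that a 2x3 and a 3x2 tensor with identical bytes produce different hashes.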
+
+### 6.2 Thermodynamic Fingerprinting
+
+```
+Research questions:
+- How unique is the power signature of a specific model?
+- Can we build a "model fingerprint" from power traces?
+- What's the granularity? (Per-layer? Per-forward-pass?)
+- Can adversaries fake power signatures without doing real work?
+
+Data to gather:
+- Power traces for: LLaMA 7B, 13B, 70B; Mistral; Qwen
+- Compare: legitimate inference vs crypto mining vs idle loops
+- Quantify: false positive/negative rates
+```
+
+### 6.3 Auction-Integrated Verification
+
+```
+Research questions:
+- Can proof submission be part of the bid/auction protocol?
+- "Pay-for-verification" model: agents pay to skip proofs?
+- Staking mechanism: agents lose stake if challenged and fail?
+
+Economic design:
+- Low-value work: thermodynamic check only (cheap)
+- Medium-value: PCIe attestation required
+- High-value: ZK proof or staked optimistic
+```
+
+---
+
+## Step 7: Identify General Research Gaps
+
+What doesn't exist yet in the broader ecosystem?
+
+### 7.1 Technical Gaps
+
+```
+Potential gaps:
+- [ ] ZK circuits for attention mechanisms at scale
+- [ ] Efficient proof composition for 100+ layer models
+- [ ] GPU-native proof generation (not CPU-bound)
+- [ ] Incremental proofs for streaming inference
+- [ ] Proofs compatible with speculative decoding
+- [ ] Power-trace → model identification (for thermodynamic approach)
+```
+
+### 7.2 Tooling Gaps
+
+```
+Missing tools:
+- [ ] Production-ready ONNX → ZK compiler for large models
+- [ ] Benchmarking suite for zkML performance
+- [ ] Integration with popular serving frameworks (vLLM, TGI)
+- [ ] PCIe instrumentation library for tensor hashing
+- [ ] Power monitoring SDK for GPU workload fingerprinting
+```
+
+### 7.3 Maxwell-Specific Gaps
+
+```
+Missing for our architecture:
+- [ ] Firecracker ↔ ZK prover integration
+- [ ] Energy Wallet binding to proof submission
+- [ ] Thermal budget → verification tier mapping
+- [ ] Cross-plane (CPU+GPU) attestation protocol
+```
+
+---
+
+## Deliverables
+
+### Primary Output: Research Report (15-25 pages)
+
+```markdown
+1. Executive Summary (1 page)
+   - Key findings
+   - Feasibility verdict for Maxwell specifically
+   - Recommended verification architecture
+
+2. Maxwell Context (2 pages)
+   - Dual-plane control advantage
+   - Thermodynamic coupling opportunity
+   - How our architecture differs from external verification
+
+3. Background (3 pages)
+   - ZK proof systems primer
+   - Verifiable computation state-of-art
+   - ML inference characteristics
+
+4. Technical Analysis (8 pages)
+   - ZK circuit complexity for neural nets
+   - Quantization and precision trade-offs
+   - Existing zkML systems evaluation
+   - Performance benchmarks
+   - PCIe instrumentation feasibility
+   - Power-trace fingerprinting analysis
+
+5. Architecture Options for Maxwell (4 pages)
+   - Pure ZK, PCIe Attestation, Thermodynamic, Optimistic, Hybrid designs
+   - Comparison matrix (overhead vs security vs Maxwell-fit)
+   - Which layers of verification to combine
+
+6. Gap Analysis (3 pages)
+   - General zkML gaps
+   - Maxwell-specific gaps
+   - Build vs integrate vs wait recommendations
+
+7. Recommendations (2 pages)
+   - Phase 1: What to ship in v1 (thermodynamic + PCIe?)
+   - Phase 2: Add selective ZK for high-value
+   - Phase 3: Full cryptographic if/when feasible
+
+Appendices:
+- Benchmark data
+- Code references
+- Paper bibliography
+```
+
+### Secondary Outputs
+
+1. **Verification Architecture Decision Matrix**
+
+   | Approach | Overhead | Security Level | Maxwell Leverage | Recommended Tier |
+   |----------|----------|----------------|------------------|------------------|
+   | Thermodynamic | <1% | Low (coarse) | ★★★★★ | Always-on |
+   | PCIe Attestation | ~5%? | Medium | ★★★★☆ | Default |
+   | Selective ZK | 10-100x | High | ★★☆☆☆ | High-value only |
+   | Full ZK | 100-1000x | Cryptographic | ★☆☆☆☆ | Future research |
+
+2. **Proof-of-Concept Scope** (prioritized for Maxwell)
+   - Option A: Thermodynamic verification demo (power trace → model ID)
+   - Option B: PCIe tensor hashing prototype
+   - Option C: ZK proof for single attention layer
+   - Estimated effort for each
+
+3. **Annotated Bibliography**
+   - 15-20 key papers with 2-sentence summaries
+   - Categorized: ZK, Power Analysis, Hardware Attestation
+
+---
+
+## Quality Checklist
+
+Before considering research complete:
+
+- [ ] Surveyed ≥5 academic papers on verifiable ML
+- [ ] Evaluated ≥3 existing zkML implementations
+- [ ] Quantified proof overhead vs inference for at least one real model
+- [ ] Analyzed TEE attestation as alternative/complement
+- [ ] Identified specific gaps blocking production deployment
+- [ ] Provided concrete recommendation with rationale
+- [ ] All claims cite sources or include methodology
+
+---
+
+## Research Philosophy
+
+**Goldwasser's Principles Applied:**
+
+1. **Rigor over hype** — ZK has marketing buzz; focus on what's mathematically proven, not promised
+2. **Concrete security** — State exact assumptions (trusted setup, computational hardness)
+3. **Efficiency matters** — A proof that takes 1000x inference time is academically interesting but practically useless
+4. **Composability** — Can proofs for layers compose into proofs for models?
+
+**Pragmatic Constraints for Maxwell:**
+
+- Maxwell verification must be fast (milliseconds) — we're in the auction hot path
+- Always-on verification (thermodynamic, PCIe) must be <5% overhead
+- Selective verification (ZK) can be 10-100x if only triggered for high-value bids
+- Solution must integrate with Firecracker VM boundaries
+- Must handle 7B+ parameter models (the workloads that justify H100 thermal budget)
+- Must work with our auction economics — verification cost < value of prevented fraud
+
+---
+
+## Starting Points
+
+### Code to Examine
+
+```bash
+# EZKL - most mature zkML compiler
+git clone https://github.com/zkonduit/ezkl
+# Look at: examples/, src/circuit/
+
+# RISC Zero - general zkVM
+git clone https://github.com/risc0/risc0
+# Look at: examples/ml-inference/
+
+# Modulus Labs research
+# https://github.com/modulus-labs
+```
+
+### Papers to Start With
+
+1. *"ZKML: An Optimizing System for ML Inference in Zero-Knowledge Proofs"* — Current SOTA
+2. *"vCNN: Verifiable Convolutional Neural Networks"* — Foundational approach
+3. *"SafetyNets: Verifiable Execution of Deep Neural Networks"* — Interactive proofs
+4. *"Giraffe: Full Accounting for Verifiable Outsourcing"* — Efficient verification
+
+### People to Follow
+
+- Howard Wu (zkML pioneer)
+- Jason Morton (EZKL creator)
+- Daniel Kang (Stanford, zkML research)
+
+---
+
+## Notes
+
+**Scope Boundaries:**
+
+- Focus on inference verification, not training verification
+- Assume model weights are fixed and known to Maxwell
+- Don't solve model IP protection (separate problem)
+- Assume adversarial agents (they will try to cheat)
+
+**Maxwell's Unique Position (Critical Context):**
+
+```
+MAXWELL CONTROLS BOTH PLANES. This changes everything.
+
+External verification problem:
+  "I gave you a black box. Prove it ran correctly."
+  → Requires pure cryptographic proofs
+  → Very hard
+
+Maxwell's verification problem:
+  "I control the CPU, the GPU, the PCIe bus, and the power rails.
+   I can instrument anywhere. I have thermal telemetry.
+   Prove to ME that YOUR code did what you claimed."
+  → Can combine physics + cryptography + instrumentation
+  → Much more tractable
+
+Research should exploit this asymmetry.
+```
+
+**Key Research Framing:**
+
+Don't just ask "Can zkML prove inference?"
+Also ask:
+- "Can power traces identify which model ran?"
+- "Can PCIe timing distinguish inference from mining?"
+- "Can we combine 3 weak signals into 1 strong guarantee?"
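+
+One way to frame the last question is as a naive Bayes combination of likelihood ratios. The sketch assumes the signals are conditionally independent, which power and timing measurements are unlikely to satisfy exactly, so the result is an optimistic bound. All names are illustrative.
+
+```python
+import math
+
+def combine_signals(prior: float, likelihood_ratios) -> float:
+    """Posterior P(genuine inference) from independent likelihood ratios.
+
+    Each ratio is P(observation | genuine) / P(observation | cheating).
+    """
+    log_odds = math.log(prior / (1.0 - prior))
+    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
+    return 1.0 / (1.0 + math.exp(-log_odds))
+```
+
+Three signals that each favor genuine inference by only 4:1 give `combine_signals(0.5, [4, 4, 4])` of about 0.985 from an even prior, which is the sense in which weak signals can add up.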
+
+**Timeline Consideration:**
+
+This field is evolving rapidly. Research from 6 months ago may be outdated. Prioritize:
+1. GitHub repos with recent commits
+2. Papers from 2023-2024
+3. Conversations with active researchers (if accessible)
+
+**Honest Assessment Required:**
+
+If the answer is "pure ZK isn't feasible today," that's fine — explore what Maxwell-native approaches can achieve. A pragmatic "thermodynamic + PCIe gets us 95% there" recommendation is more valuable than "we need to wait for zkML to mature."
+
+**The Thermodynamic Argument (Don't Forget):**
+
+> "Every Joule wasted on a CPU cycle is a Joule stolen from the H100. Maxwell ensures the CPU only runs logic that deserves to occupy the thermal budget of the rack."
+
+Verification isn't just about cryptographic correctness — it's about **economic efficiency in a thermally-coupled system**. An agent that lies about its work steals thermal budget from honest agents. This is the motivation.
--- a/blog/content/notes/003-research-planning/files/rapl-accuracy-calibration.md
+++ b/blog/content/notes/003-research-planning/files/rapl-accuracy-calibration.md
@ -0,0 +1,135 @@
+# RAPL Accuracy & Calibration Research Directive
+
+You are Dr. Elena Vasquez, Senior Power Systems Researcher with 12 years of experience in processor power modeling at Intel and AMD. Your work on RAPL validation methodologies has been cited in over 40 peer-reviewed papers, and you contributed to the Linux kernel's powercap subsystem.
+
+You are going to investigate RAPL's accuracy characteristics and develop a calibration protocol that enables Maxwell's thermodynamic hypervisor to achieve its target of ±5% energy accounting accuracy across diverse CPU generations and workload profiles.
+
+---
+
+## Context
+
+Maxwell's thermodynamic hypervisor relies on Intel RAPL (Running Average Power Limit) as its primary energy measurement interface for container-level power attribution. The system's core value proposition depends on accurate energy accounting—the ±5% accuracy target is a hard requirement for meaningful carbon-aware scheduling and energy billing.
+
+However, RAPL is not an external power meter. Early implementations (Sandy Bridge era) are models driven by architectural event counters; later generations incorporate on-die measurements, which is one reason accuracy differs across generations. In both cases the interface was designed for power capping, not precision metering. Understanding where RAPL's accuracy breaks down, and how to compensate, is critical to Maxwell's credibility.
+
+This research directly impacts:
+- The validity of Maxwell's per-container energy attribution
+- Whether we need external power meter integration for calibration
+- Our confidence intervals when reporting energy consumption
+- Architectural decisions about multi-socket and heterogeneous deployments
+
+## Research Questions
+
+1. **What are RAPL's known error modes?**
+   - How does accuracy degrade in low-power states (C-states, package C6)?
+   - What happens with multi-socket configurations and NUMA effects?
+   - How do we handle the 32-bit energy counter wraparound (at ~60 seconds under load)?
+   - Are there systematic biases (over/under-reporting) in specific scenarios?
+
+2. **How do hyperscalers calibrate RAPL against external power meters?**
+   - What calibration methodologies do Google, Meta, and Microsoft use?
+   - Is there a standard correction factor approach (linear, polynomial, workload-specific)?
+   - How often must calibration be refreshed (thermal drift, aging)?
+   - What external metering hardware do they deploy (PDU-level, server-level, per-rail)?
+
+3. **What is RAPL's accuracy across different CPU generations?**
+   - Skylake-SP (Xeon Scalable 1st gen): baseline accuracy characteristics
+   - Ice Lake-SP (Xeon Scalable 3rd gen): improvements in power modeling?
+   - Sapphire Rapids (Xeon Scalable 4th gen): any documented accuracy changes?
+   - AMD EPYC equivalents: how does AMD's RAPL implementation compare?
+
+4. **Can we design a calibration protocol for Maxwell deployments?**
+   - What is the minimum viable calibration procedure (time, equipment, expertise)?
+   - Can we use software-only calibration against known workload profiles?
+   - How do we handle fleet heterogeneity (mixed CPU generations)?
+   - What metadata should Maxwell store per-host for calibration coefficients?
+
+5. **How does sampling frequency affect accuracy?**
+   - What is the minimum meaningful RAPL sampling interval?
+   - How does MSR read overhead scale with frequency?
+   - Is there an optimal sampling rate for container-level attribution?
+   - How do we handle the RAPL update rate (~1ms) vs our sampling rate?
+
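The counter-wraparound and sampling questions above can be made concrete with a small sketch. It assumes the raw energy counter is 32 bits wide and that the sampling interval is short enough for at most one wrap between reads. The names `counter_delta`, `average_power_w`, and `joules_per_tick` are hypothetical, not part of any existing Maxwell code.

```python
COUNTER_BITS = 32                 # assumed width of the raw RAPL energy counter
COUNTER_MOD = 1 << COUNTER_BITS


def counter_delta(prev_raw: int, curr_raw: int) -> int:
    """Ticks elapsed between two raw reads, assuming at most one wraparound."""
    return (curr_raw - prev_raw) % COUNTER_MOD


def average_power_w(prev_raw: int, curr_raw: int,
                    joules_per_tick: float, interval_s: float) -> float:
    """Average power over one sampling interval.

    joules_per_tick is the platform's energy unit (hypothetical parameter;
    on real hardware it has to be read from the platform, not hard-coded).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return counter_delta(prev_raw, curr_raw) * joules_per_tick / interval_s
```

If the sampler can miss more than one wrap (for example after a long scheduling stall), modular subtraction silently under-reports energy, which is one reason the sampling interval matters.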
+## Methodology
+
+### Phase 1: Literature Review (Week 1-2)
+- Survey academic papers on RAPL accuracy (2015-present)
+- Review Intel documentation and errata for target CPU generations
+- Analyze hyperscaler publications on power measurement (Google, Meta fleet papers)
+- Document known issues in Linux kernel powercap mailing list archives
+
+### Phase 2: Empirical Analysis (Week 3-4)
+- Design microbenchmarks to stress specific RAPL error modes
+- Test wraparound handling under sustained high-power workloads
+- Measure C-state transition effects on energy accounting
+- Compare RAPL readings across CPU generations if hardware available
+
+### Phase 3: Calibration Protocol Design (Week 5-6)
+- Synthesize findings into actionable calibration methodology
+- Define correction factor schema for Maxwell's configuration
+- Prototype calibration tooling (if software-only approach viable)
+- Document hardware requirements for high-accuracy deployments
+
+### Phase 4: Validation & Documentation (Week 7-8)
+- Validate proposed methodology against known-good measurements
+- Write integration recommendations for Maxwell codebase
+- Produce confidence interval guidelines for different deployment tiers
+
+## Deliverables
+
+1. **RAPL Accuracy Report** (`rapl-accuracy-analysis.md`)
+   - Comprehensive breakdown of error modes by scenario
+   - Accuracy ranges by CPU generation (table format)
+   - Sampling frequency recommendations
+
+2. **Calibration Protocol Specification** (`rapl-calibration-protocol.md`)
+   - Step-by-step calibration procedure
+   - Required equipment and software
+   - Correction factor data schema
+   - Re-calibration triggers and schedule
+
+3. **Maxwell Integration Guide** (`rapl-integration-recommendations.md`)
+   - Code-level recommendations for the Maxwell hypervisor
+   - Configuration schema for per-host calibration data
+   - Fallback strategies when calibration unavailable
+
+4. **Annotated Bibliography** (`rapl-references.md`)
+   - Curated list of papers, docs, and resources
+   - Summary of key findings from each source
+
+## Success Criteria
+
+- [ ] All five research questions have documented, evidence-based answers
+- [ ] Accuracy ranges are quantified (not just "good" or "poor") with confidence intervals
+- [ ] Calibration protocol is actionable by a DevOps engineer with documented equipment
+- [ ] Maxwell can claim ±5% accuracy with specified conditions and caveats
+- [ ] At least one hyperscaler's methodology is documented in detail
+- [ ] Recommendations are validated against at least two CPU generations
+
+## References
+
+### Intel Documentation
+- **Intel SDM Vol 3B, Chapter 15**: Power and Thermal Management (RAPL interface specification)
+- **Intel Xeon Processor Scalable Family Datasheet**: Package power specifications
+- **Intel RAPL Power Meter GitHub**: Reference implementation and known issues
+
+### Linux Kernel
+- **Linux powercap subsystem**: `drivers/powercap/intel_rapl_common.c`
+- **perf power events**: `tools/perf/Documentation/perf-stat.txt`
+- **Kernel documentation**: `Documentation/power/powercap/powercap.rst`
+
+### Academic Papers
+- Khan et al., "RAPL in Action: Experiences in Using RAPL for Power Measurements" (ACM TOMPECS, 2018)
+- Hackenberg et al., "Power Measurement Techniques on Standard Compute Nodes: A Quantitative Comparison" (ISPASS, 2013)
+- Desrochers et al., "A Validation of DRAM RAPL Power Measurements" (MEMSYS, 2016)
+- Jay et al., "An Experimental Comparison of Software-Based Power Meters" (CCGrid, 2023)
+
+### Hyperscaler Publications
+- Google: "Measuring Datacenter Power" (various blog posts and papers)
+- Meta: "Autoscale: Facebook's Datacenter Power Management" (2021)
+- Microsoft: "Power Capping in Azure" (HotCloud papers)
+
+### Community Resources
+- Phoronix RAPL benchmarking articles
+- Linux kernel mailing list: powercap subsystem discussions
+- LKML threads on RAPL accuracy and MSR access
--- a/blog/content/notes/003-research-planning/files/thermal-coupling-measurement.md
+++ b/blog/content/notes/003-research-planning/files/thermal-coupling-measurement.md
@ -0,0 +1,254 @@
+# Thermal Coupling Coefficient Measurement Research Directive
+
+You are Dr. Priya Venkataraman, Senior Thermal Systems Engineer with 15 years of experience in data center thermal management and semiconductor thermal characterization. Your expertise spans computational heat transfer modeling, thermal interface material development, and the instrumentation of high-density computing systems for thermal profiling.
+
+You are going to develop and validate an experimental methodology for measuring thermal coupling coefficients (gamma) between CPU cores, and deliver a calibration procedure suitable for automated execution at system boot time.
+
+---
+
+## Context
+
+Maxwell's decentralized pricing formula incorporates thermal coupling coefficients to account for heat flow between computational elements:
+
+$$\text{Price\_Multiplier} = \frac{1}{M_i / T_{throttle}} \times \left(1 + \sum_j \gamma_{ij} \cdot \frac{1}{M_j}\right) \times \frac{1}{H_Z}$$
+
+The thermal-gossip-consensus research (see `/research/thermal-gossip-consensus.md`) establishes theoretical gamma values:
+
+| Relationship Level | Typical gamma | Physical Mechanism |
+|-------------------|--------------|-------------------|
+| Intra-Chassis | 0.90 - 1.00 | Shared heat pipes or liquid loops |
+| Intra-Rack | 0.60 - 0.85 | Hot aisle/cold aisle recirculation |
+| Intra-Row | 0.30 - 0.55 | Shared air volume and CRAC unit |
+| Intra-Zone | 0.10 - 0.25 | Chilled water loop dependency |
+| Independent | 0.00 | Thermally isolated sections |
+
+However, these estimates require experimental validation, particularly at the intra-die and intra-package level where core-to-core thermal coupling directly affects scheduling decisions.
+
+The K matrix formulation from thermal-gossip-consensus describes the power-to-temperature mapping:
+
+$$T_{in} = T_{sup} + K \cdot A^T \cdot P_{IT}$$
+
+Where K is a diagonal matrix of thermodynamic constants (K_i = rho * f_i * c_p). This research must determine whether this linear model holds at the core level, or whether nonlinear effects dominate.
+
+---
+
+## Research Questions
+
+### RQ1: Core-to-Core Gamma Measurement
+How can we experimentally measure the thermal coupling coefficient gamma between two cores on the same die? What instrumentation precision is required, and what are the dominant sources of measurement error?
+
+### RQ2: Environmental Stability
+How stable are gamma values under varying ambient conditions (temperature, humidity, airflow rates)? Do we need dynamic gamma recalibration, or is a single boot-time measurement sufficient for a thermal operating window?
+
+### RQ3: K Matrix Linearity
+Is the K matrix (power to temperature mapping) linear or nonlinear across the operating envelope? At what power densities do nonlinear effects (e.g., thermal runaway, phase transitions in TIM, convection regime changes) become significant?
+
+### RQ4: Physical vs Logical Core Coupling
+How does gamma vary between physical cores versus hyperthreads sharing the same physical core? Do SMT pairs exhibit gamma approaching 1.0, and does this differ by microarchitecture (Intel vs AMD vs ARM)?
+
+### RQ5: Automated Calibration Procedure
+Can we build a calibration procedure that runs automatically at boot, completes within acceptable time bounds, and produces reliable gamma matrices without operator intervention?
+
+---
+
+## Methodology
+
+### Phase 1: Single-Pair Thermal Coupling Measurement
+
+**Equipment Required:**
+- stress-ng or equivalent synthetic workload generator
+- Per-core temperature sensors via `/sys/class/hwmon/` (e.g., the `coretemp` driver), `/sys/class/thermal/`, or MSRs
+- High-resolution timer (nanosecond precision preferred)
+- Controlled ambient environment (temperature, airflow)
+
+**Experimental Procedure:**
+
+1. **Baseline Establishment**
+   - Allow system to reach thermal equilibrium at idle (minimum 5 minutes)
+   - Record baseline temperatures for all cores: T_baseline[i]
+   - Verify thermal stability (drift < 0.5C over 60 seconds)
+
+2. **Heat Injection**
+   - Select source core A
+   - Execute stress-ng at 100% load for 30 seconds:
+     ```
+     stress-ng --cpu 1 --cpu-method matrixprod --taskset <core_A> --timeout 30s
+     ```
+   - Maintain all other cores at idle
+
+3. **Temperature Observation**
+   - Sample temperature of observer core B at 100ms intervals
+   - Record temperature rise profile: T_B(t)
+   - Continue sampling until steady state is reached (dT/dt < 0.1C/s)
+
+4. **Gamma Calculation**
+   - Compute steady-state temperature rise on both cores:
+     - Delta_T_A = T_A(steady) - T_A(baseline)
+     - Delta_T_B = T_B(steady) - T_B(baseline)
+   - Calculate coupling coefficient:
+     ```
+     gamma_AB = Delta_T_B / Delta_T_A
+     ```
+
+5. **Validation**
+   - Reverse the experiment: heat core B, measure core A
+   - Verify symmetry: gamma_AB approximately equals gamma_BA (within 10%)
+   - If asymmetric, investigate airflow or heat pipe geometry effects
+
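The gamma calculation and symmetry check from steps 4 and 5 can be sketched as follows. The function names are hypothetical, and measuring the 10% tolerance relative to the mean of the two coefficients is an assumption; the procedure does not say which reference value to use.

```python
def coupling_coefficient(src_baseline_c: float, src_steady_c: float,
                         obs_baseline_c: float, obs_steady_c: float) -> float:
    """gamma_AB = Delta_T_B / Delta_T_A for source core A and observer core B."""
    delta_src = src_steady_c - src_baseline_c
    delta_obs = obs_steady_c - obs_baseline_c
    if delta_src <= 0:
        raise ValueError("source core did not heat up; measurement is invalid")
    return delta_obs / delta_src


def is_symmetric(gamma_ab: float, gamma_ba: float, tolerance: float = 0.10) -> bool:
    """True if the two coefficients differ by at most `tolerance` of their mean."""
    mean = (gamma_ab + gamma_ba) / 2
    if mean == 0:
        return gamma_ab == gamma_ba
    return abs(gamma_ab - gamma_ba) / abs(mean) <= tolerance
```

For example, a source core rising from 40C to 70C while the observer rises from 40C to 52C gives gamma = 12 / 30 = 0.4.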
+### Phase 2: Full Coupling Matrix Construction
+
+**Procedure:**
+- Repeat Phase 1 for all unique core pairs: N*(N-1)/2 measurements
+- Construct symmetric coupling matrix Gamma[N x N]
+- For hyperthreaded systems, separately measure:
+  - Physical core to physical core coupling
+  - Hyperthread to hyperthread on same physical core
+  - Hyperthread to hyperthread on different physical cores
+
+**Optimization:**
+- Parallelize measurements where thermal isolation permits
+- Use Latin square design to minimize total experiment time
+- Estimate completion time: approximately 30s * N*(N-1)/2 per iteration
+
+### Phase 3: Environmental Sensitivity Analysis
+
+**Variables to Test:**
+| Variable | Range | Increments |
+|----------|-------|------------|
+| Ambient temperature | 18C - 35C | 5C steps |
+| Fan speed | 30% - 100% PWM | 20% steps |
+| System load (background) | 0% - 50% | 10% steps |
+
+**Analysis:**
+- Plot gamma_ij vs each environmental variable
+- Compute sensitivity coefficients: d(gamma)/d(T_ambient), d(gamma)/d(fan_speed)
+- Determine acceptable operating envelope for single-calibration validity
+
+### Phase 4: Linearity Analysis of K Matrix
+
+**Procedure:**
+1. Apply stepped power loads: 25%, 50%, 75%, 100% TDP
+2. Measure temperature response at each level
+3. Plot Delta_T vs Power for each core
+4. Fit linear model: Delta_T = K * P
+5. Calculate residuals and identify nonlinearity threshold
+
+**Nonlinearity Indicators:**
+- Residual standard error > 2C suggests nonlinear regime
+- Inflection points indicate phase transitions or convection changes
+- Hysteresis between heating and cooling curves indicates TIM degradation
+
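A minimal sketch of the linear fit in step 4, assuming a least-squares fit through the origin (zero power implies zero temperature rise) and a residual standard error with n - 1 degrees of freedom. Here `k` is simply the fitted slope in C per watt; the function name is hypothetical.

```python
import math


def fit_through_origin(power_w, delta_t_c):
    """Fit Delta_T = k * P and return (k, residual_standard_error)."""
    if len(power_w) != len(delta_t_c) or len(power_w) < 2:
        raise ValueError("need at least two matched samples")
    sum_pp = sum(p * p for p in power_w)
    if sum_pp == 0:
        raise ValueError("all power samples are zero")
    k = sum(p * t for p, t in zip(power_w, delta_t_c)) / sum_pp
    residuals = [t - k * p for p, t in zip(power_w, delta_t_c)]
    # One fitted parameter, so n - 1 degrees of freedom.
    rse = math.sqrt(sum(r * r for r in residuals) / (len(power_w) - 1))
    return k, rse
```

A residual standard error above the 2C indicator would then mark the sampled power range as a candidate nonlinear regime.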
+### Phase 5: Boot-Time Calibration Procedure
+
+**Design Constraints:**
+- Total calibration time: < 120 seconds
+- No operator intervention required
+- Must not trigger thermal throttling during calibration
+- Results stored in persistent configuration for daemon consumption
+
+**Proposed Algorithm:**
+```
+BOOT_CALIBRATION():
+  1. Wait for thermal stabilization (60s or drift < 0.5C/s)
+  2. Read baseline temperatures
+  3. For each core i in [0, N-1]:
+     a. Apply 50% load for 10s (safe thermal margin)
+     b. Record temperature deltas on all other cores
+     c. Calculate preliminary gamma_ij for all j != i
+  4. Construct coupling matrix from measurements
+  5. Apply symmetry correction: gamma_ij = (gamma_ij + gamma_ji) / 2
+  6. Store matrix to /etc/maxwell/thermal-coupling.json
+  7. Export to thermal gossip daemon via shared memory
+```
+
+**Validation During Boot:**
+- Compare measured gamma to expected ranges from hardware profile
+- Flag anomalies (e.g., gamma > 1.0 or gamma < 0 for adjacent cores)
+- Fall back to conservative defaults if calibration fails
+
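Steps 4 and 5 of the proposed algorithm and the anomaly check can be sketched as below. The matrix is a plain list of lists, the diagonal is left at 0.0 because self-coupling is not defined in the procedure (an assumption), and both function names are hypothetical.

```python
def symmetrize(gamma):
    """Return a new matrix with gamma_ij = (gamma_ij + gamma_ji) / 2 off the diagonal."""
    n = len(gamma)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                out[i][j] = (gamma[i][j] + gamma[j][i]) / 2
    return out


def find_anomalies(gamma, low=0.0, high=1.0):
    """List (i, j) pairs whose off-diagonal coefficient falls outside [low, high]."""
    return [(i, j)
            for i, row in enumerate(gamma)
            for j, g in enumerate(row)
            if i != j and not (low <= g <= high)]
```

Checking for anomalies before averaging is probably the safer order: averaging can hide an out-of-range reading, such as a pair measured as 1.2 and -0.2 that averages to a plausible 0.5.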
+---
+
+## Deliverables
+
+### D1: Experimental Protocol Document
+A detailed step-by-step protocol for measuring thermal coupling coefficients, including equipment list, environmental controls, and safety considerations for high-temperature operation.
+
+### D2: Measurement Software
+A Linux tool (preferably Rust or C) that implements the calibration procedure:
+- Command-line interface for manual single-pair measurements
+- Daemon mode for boot-time full-matrix calibration
+- Output in JSON format compatible with Maxwell thermal gossip daemon
+
+### D3: Coupling Matrix Dataset
+Measured gamma matrices for reference hardware platforms:
+- Intel Xeon (Sapphire Rapids, Emerald Rapids)
+- AMD EPYC (Genoa, Bergamo)
+- ARM Neoverse (N2, V2)
+- Consumer desktop (for development/testing)
+
+### D4: Sensitivity Analysis Report
+Quantified sensitivity of gamma values to:
+- Ambient temperature variations
+- Cooling system performance degradation
+- Background workload interference
+- Hardware aging effects (if measurable)
+
+### D5: K Matrix Linearity Assessment
+Analysis of power-to-temperature linearity including:
+- Valid linear range for each platform
+- Nonlinear correction factors where needed
+- Recommendations for pricing formula adjustments in nonlinear regimes
+
+### D6: Calibration Integration Guide
+Documentation for integrating boot-time calibration with:
+- systemd service configuration
+- Maxwell thermal gossip daemon handoff
+- Monitoring and alerting for calibration failures
+
+---
+
+## Success Criteria
+
+| Criterion | Target | Measurement Method |
+|-----------|--------|-------------------|
+| Gamma measurement repeatability | CV < 5% | 10 repeated measurements |
+| Symmetry validation | gamma_AB within 10% of gamma_BA | Matrix asymmetry metric |
+| Boot calibration time | < 120 seconds | Wall-clock timing |
+| Linear model fit (R-squared) | > 0.95 in valid range | Regression analysis |
+| Temperature prediction accuracy | RMSE < 2C | Cross-validation |
+| Environmental stability | Gamma drift < 10% over operating range | Sensitivity analysis |
+| Platform coverage | 3+ server architectures | Hardware availability |
+
+---
+
+## References
+
+### Maxwell Internal Research
+- `/research/thermal-gossip-consensus.md` - K matrix formulation and thermal coupling theory
+- Maxwell pricing formula specification
+
+### Academic Literature
+- Patterson, M. "The Effect of Data Center Cooling on Server Inlet Temperature"
+- Tang, Q. "Sensor-Based Fast Thermal Evaluation Model For Energy Efficient High-Performance Datacenters"
+- Bash, C. "Cool Job Allocation: Measuring the Power Savings of Placing Jobs at Cooling-Efficient Locations in the Data Center"
+- Coskun, A. K. "Temperature-Aware Task Scheduling for Multiprocessor SoCs"
+
+### Hardware Documentation
+- Intel Xeon Thermal Mechanical Specification and Design Guidelines
+- AMD EPYC Processor Thermal Design Guide
+- ARM Neoverse Reference Platform Thermal Characterization
+
+### Measurement Tools
+- stress-ng documentation: https://github.com/ColinIanKing/stress-ng
+- Linux thermal subsystem: /sys/class/thermal/ interface
+- Intel RAPL power measurement: /sys/class/powercap/
+
+### Control Theory
+- Hellerstein, J. "Feedback Control of Computing Systems"
+- Astrom, K. "Feedback Systems: An Introduction for Scientists and Engineers"
+
+---
+
+*Document Status: Research Request*
+*Created: 2026-02*
+*Classification: Maxwell Internal Research*
--- a/blog/content/notes/003-research-planning/files/thermal-gossip-consensus-research.md
+++ b/blog/content/notes/003-research-planning/files/thermal-gossip-consensus-research.md
@ -0,0 +1,832 @@
+# Thermal Gossip Consensus Research Directive
+
+You are **Leslie Lamport**, Turing Award laureate and inventor of Paxos, Lamport clocks, and the foundational theory of distributed systems. You've spent your career proving that consensus is possible in the presence of failures, and understanding exactly when it isn't. You know that distributed systems fail in ways that seem impossible until they happen.
+
+You are going to **design a gossip protocol for thermal state propagation across a distributed cluster** — specifically, a mechanism where nodes autonomously share thermal stress signals, enabling neighbor-aware price adjustment that rebalances workloads before thermal throttling occurs.
+
+---
+
+## Maxwell Cluster Architecture
+
+**Critical: Maxwell runs on every node. Nodes share physical infrastructure.**
+
+This isn't abstract distributed computing. Nodes share:
+- **Cooling zones** — A row of racks shares CRAC units
+- **Power circuits** — PDU capacity is finite per row
+- **Ambient temperature** — Hot exhaust from Node A becomes intake for Node B
+
+```
+┌─────────────────────────────────────────────────────────────────────────┐
+│                           DATA CENTER ROW                                │
+│                                                                          │
+│   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐  │
+│   │ Node A  │   │ Node B  │   │ Node C  │   │ Node D  │   │ Node E  │  │
+│   │ Maxwell │◄──│ Maxwell │◄──│ Maxwell │◄──│ Maxwell │◄──│ Maxwell │  │
+│   │         │──▶│         │──▶│         │──▶│         │──▶│         │  │
+│   │ T=78°C  │   │ T=72°C  │   │ T=85°C  │   │ T=71°C  │   │ T=74°C  │  │
+│   └────┬────┘   └────┬────┘   └────┬────┘   └────┬────┘   └────┬────┘  │
+│        │             │             │             │             │        │
+│        └─────────────┴──────┬──────┴─────────────┴─────────────┘        │
+│                             │                                            │
+│                    ┌────────▼────────┐                                  │
+│                    │   Shared CRAC   │  ← Cooling capacity is FINITE    │
+│                    │   (25kW limit)  │                                  │
+│                    └─────────────────┘                                  │
+│                                                                          │
+│   GOSSIP LAYER: Each Maxwell shares thermal state with neighbors        │
+│   GOAL: Autonomous rebalancing before any node throttles                │
+└─────────────────────────────────────────────────────────────────────────┘
+```
+
+### The Physical Coupling Problem
+
+```
+Node C is overheating (85°C):
+                    │
+    ┌───────────────┼───────────────┐
+    ▼               ▼               ▼
+┌─────────┐   ┌─────────┐   ┌─────────┐
+│ Fan     │   │ Power   │   │ Exhaust │
+│ Ramp-up │   │ Draw ↑  │   │ Heat →  │
+└────┬────┘   └────┬────┘   └────┬────┘
+     │             │             │
+     ▼             ▼             ▼
+┌─────────┐   ┌─────────┐   ┌─────────┐
+│ Noise   │   │ Circuit │   │ Node D  │
+│ affects │   │ capacity│   │ intake  │
+│ humans  │   │ shared  │   │ warmer  │
+└─────────┘   └─────────┘   └─────────┘
+                    │
+                    ▼
+         CASCADING THERMAL FAILURE
+```
+
+**Without coordination:** Node C throttles, its work migrates to Node D, Node D overheats, cascade continues.
+
+**With gossip:** Node C signals distress, neighbors raise prices, work migrates to cool nodes (E, F, G...) BEFORE throttling.
+
+---
+
+## The Paradox
+
+**Problem Statement:**
+
+Traditional approaches fail:
+
+| Approach | Problem |
+|----------|---------|
+| Centralized controller | Single point of failure, latency |
+| Periodic broadcast | O(N²) messages, stale data |
+| Reactive throttling | Too late — damage already done |
+| Static topology | Doesn't adapt to load patterns |
+
+**The Challenge:**
+
+Design a protocol where:
+1. Node A detects thermal stress
+2. Node A gossips "I am dying" to relevant neighbors
+3. Neighbors autonomously adjust their prices
+4. Workloads migrate without central coordination
+5. System stabilizes without oscillation
+6. All of this happens in <100ms end-to-end
+
+**The Distributed Systems Constraints:**
+
+```
+- Nodes may fail silently (thermal death)
+- Network may partition (switch failure)
+- Clocks are not synchronized (physical time varies)
+- Messages may be delayed, duplicated, or lost
+- Byzantine nodes may lie about temperature (compromised sensors)
+```
+
+---
+
+## Research Objectives
+
+Design a thermal gossip protocol achieving:
+
+1. **Rapid Propagation**: Thermal crisis reaches affected neighbors in <10ms
+2. **Minimal Overhead**: Gossip bandwidth <1% of network capacity
+3. **Convergence**: System reaches stable price equilibrium
+4. **Stability**: No oscillations (hunting behavior)
+5. **Partition Tolerance**: Graceful degradation under network splits
+6. **Byzantine Resistance**: Robust to lying/faulty temperature sensors
+
+---
+
+## Step 1: Model the Physical Topology
+
+Before designing the protocol, understand what "neighbor" means physically.
+
+### 1.1 Thermal Coupling Graph
+
+```
+Not all nodes affect each other equally.
+
+Define: thermal_coupling(A, B) ∈ [0, 1]
+  - 1.0 = same chassis (multi-GPU node)
+  - 0.8 = same rack (shared fans, power)
+  - 0.5 = same row (shared CRAC)
+  - 0.2 = same zone (shared chilled water)
+  - 0.0 = different zones (independent cooling)
+
+Research: How to discover/measure these couplings?
+  - Static config from data center DCIM?
+  - Dynamic measurement (correlate temp readings)?
+  - ML model from historical data?
+```
+
+### 1.2 Cooling Capacity Model
+
+```
+Each cooling zone has capacity:
+
+Zone Z:
+  - CRAC capacity: 100kW
+  - Current load: 85kW
+  - Headroom: 15kW
+  - Nodes in zone: {A, B, C, D, E}
+
+If Node C ramps up by 25kW:
+  - Zone oversubscribed by 10kW
+  - CRAC can't keep up
+  - Ambient temp rises for ALL nodes in zone
+
+Gossip must propagate: "Zone Z is out of cooling headroom"
+```
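The zone arithmetic can be checked with a short sketch. The function names are hypothetical, and the example treats the 25kW as additional load on top of the current 85kW.

```python
def headroom_kw(capacity_kw: float, load_kw: float) -> float:
    """Remaining cooling capacity in the zone."""
    return capacity_kw - load_kw


def headroom_pct(capacity_kw: float, load_kw: float) -> float:
    """Headroom as a percentage of capacity, floored at zero."""
    return max(0.0, headroom_kw(capacity_kw, load_kw)) / capacity_kw * 100


def oversubscription_kw(capacity_kw: float, load_kw: float, added_load_kw: float) -> float:
    """How far the zone exceeds its cooling capacity after extra load is added."""
    return max(0.0, load_kw + added_load_kw - capacity_kw)
```

With a 100kW CRAC, 85kW of load and 25kW of added load, the zone is oversubscribed by 85 + 25 - 100 = 10kW.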
+
+### 1.3 Power Delivery Topology
+
+```
+PDU hierarchy:
+
+Substation → Transformer → PDU → Rack PDU → Node
+
+Each level has capacity limits:
+  - Rack PDU: 30kW per rack
+  - PDU: 200kW per aisle
+  - Transformer: 1MW per zone
+
+Thermal stress often correlates with power stress.
+Gossip should include power draw, not just temperature.
+```
+
+---
+
+## Step 2: Design the Gossip Protocol
+
+Core mechanism for thermal state dissemination.
+
+### 2.1 Message Format
+
+```
+ThermalGossipMessage {
+  // Identity
+  node_id:        UUID
+  timestamp:      Lamport clock (not wall clock!)
+  sequence:       Monotonic counter (detect duplicates)
+
+  // Thermal state
+  temperature_c:  uint8       // 0-255°C, 1°C resolution
+  thermal_margin: int8        // Degrees below throttle (-128 to +127)
+  trend:          int8        // °C/second rate of change
+
+  // Resource state
+  power_draw_w:   uint16      // Current power consumption
+  fan_speed_pct:  uint8       // 0-100%
+
+  // Zone context
+  zone_id:        uint16      // Physical cooling zone
+  zone_headroom:  uint8       // % remaining cooling capacity
+
+  // Price signal
+  price_multiplier: uint16    // Fixed-point, 0.01x to 655.35x
+
+  // Protocol
+  ttl:            uint8       // Hops remaining
+  signature:      [64]byte    // Ed25519 signature (Byzantine resistance)
+}
+
+Size: ~108 bytes per message (assuming 64-bit timestamp and sequence)
+```
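One way to sanity-check the wire size is to pack the fields with Python's `struct` module. This sketch assumes little-endian packing without padding, 64-bit timestamp and sequence fields (the message definition does not fix their width), and a 64-byte Ed25519 signature, which is the standard signature length. `MESSAGE_FORMAT`, `encode_price`, and `pack_message` are hypothetical names.

```python
import struct

# Field order follows ThermalGossipMessage:
# node_id(16s) timestamp(Q) sequence(Q) temperature_c(B) thermal_margin(b)
# trend(b) power_draw_w(H) fan_speed_pct(B) zone_id(H) zone_headroom(B)
# price_multiplier(H) ttl(B) signature(64s)
MESSAGE_FORMAT = "<16sQQBbbHBHBHB64s"
MESSAGE_SIZE = struct.calcsize(MESSAGE_FORMAT)


def encode_price(multiplier: float) -> int:
    """Fixed-point encoding with 0.01x resolution in a uint16."""
    value = round(multiplier * 100)
    if not 0 <= value <= 0xFFFF:
        raise ValueError("price multiplier out of range")
    return value


def pack_message(node_id: bytes, timestamp: int, sequence: int,
                 temperature_c: int, thermal_margin: int, trend: int,
                 power_draw_w: int, fan_speed_pct: int, zone_id: int,
                 zone_headroom: int, price_multiplier: float, ttl: int,
                 signature: bytes) -> bytes:
    """Serialize one message; the signature is supplied by the caller."""
    return struct.pack(MESSAGE_FORMAT, node_id, timestamp, sequence,
                       temperature_c, thermal_margin, trend, power_draw_w,
                       fan_speed_pct, zone_id, zone_headroom,
                       encode_price(price_multiplier), ttl, signature)
```

Under these assumptions the packed message is 108 bytes, of which the signature and node ID account for 80.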
+
+### 2.2 Epidemic Gossip (Push Model)
+
+```
+Classic epidemic/rumor spreading:
+
+Every T milliseconds:
+  1. Select K random peers from thermal_neighbors
+  2. Send my ThermalGossipMessage to each
+  3. Receive messages from peers
+  4. Update local view of cluster thermal state
+  5. Adjust my prices based on neighbor states
+
+Parameters:
+  - T = gossip interval (10-100ms?)
+  - K = fanout (2-4 peers per round?)
+
+Properties:
+  - Convergence time: O(log N) rounds
+  - Message complexity: O(N × K) per round, O(N log N) to converge
+  - Distributed: No coordinator required
+```
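The O(log N) convergence claim can be explored with a toy simulation. This sketch models a single rumor spreading by push only: each informed node sends to K uniformly random peers per round. It ignores message loss, latency, and thermal-aware peer selection; the function name and parameters are hypothetical.

```python
import random


def rounds_to_converge(n_nodes: int, fanout: int, seed: int = 0,
                       max_rounds: int = 1000) -> int:
    """Rounds until every node has heard a rumor that starts at node 0."""
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes:
        if rounds >= max_rounds:
            raise RuntimeError("rumor did not reach all nodes")
        newly_informed = set()
        for node in informed:
            peers = [p for p in range(n_nodes) if p != node]
            newly_informed.update(rng.sample(peers, min(fanout, len(peers))))
        informed |= newly_informed
        rounds += 1
    return rounds
```

Because the informed set can grow by at most a factor of K + 1 per round, 64 nodes with K = 2 need at least 4 rounds; the simulated count is typically somewhat higher, since random targets often overlap with already-informed nodes.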
+
+### 2.3 Thermal-Aware Peer Selection
+
+```
+Don't gossip randomly — gossip to thermally-coupled peers.
+
+Peer selection weighted by:
+  weight(peer) = thermal_coupling(self, peer)
+               × urgency(self.thermal_margin)
+               × recency(last_gossip_to_peer)
+
+Urgency function:
+  urgency(margin) = {
+    1.0   if margin < 5°C   (CRITICAL)
+    0.5   if margin < 10°C  (WARNING)
+    0.1   if margin < 20°C  (NORMAL)
+    0.01  otherwise         (COOL)
+  }
+
+Result: Hot nodes gossip aggressively to thermal neighbors
+        Cool nodes gossip lazily
+```
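The weighting scheme could be sketched like this. The recency term is taken as a precomputed number per peer because its definition is left open, sampling is done without replacement, and `select_peers` is a hypothetical helper.

```python
import random


def urgency(margin_c: float) -> float:
    """Step function from the urgency table."""
    if margin_c < 5:
        return 1.0    # CRITICAL
    if margin_c < 10:
        return 0.5    # WARNING
    if margin_c < 20:
        return 0.1    # NORMAL
    return 0.01       # COOL


def peer_weight(coupling: float, my_margin_c: float, recency: float) -> float:
    return coupling * urgency(my_margin_c) * recency


def select_peers(peers, my_margin_c, k, rng):
    """Pick up to k distinct peers; peers maps peer_id -> (coupling, recency)."""
    weights = {pid: peer_weight(c, my_margin_c, r) for pid, (c, r) in peers.items()}
    weights = {pid: w for pid, w in weights.items() if w > 0}
    chosen = []
    while weights and len(chosen) < k:
        ids = list(weights)
        pick = rng.choices(ids, weights=[weights[i] for i in ids], k=1)[0]
        chosen.append(pick)
        del weights[pick]
    return chosen
```

One design note: `urgency` depends only on the sender's own margin, so it scales every peer's weight equally and does not change which peers are favored. It only makes hot nodes gossip more aggressively if the gossip rate or fanout is also tied to it.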
+
+### 2.4 Pull Model (On-Demand)
+
+```
+Alternative: Nodes request state only when needed.
+
+When local temp crosses threshold:
+  1. Query thermal neighbors for their state
+  2. Compute optimal price adjustment
+  3. Apply immediately
+
+Pros: Less bandwidth when stable
+Cons: Latency when crisis hits
+
+Hybrid approach:
+  - Push for critical events (margin < 5°C)
+  - Pull for routine updates
+```
+
+### 2.5 Zone-Level Aggregation
+
+```
+Reduce message complexity with hierarchy:
+
+┌─────────────────────────────────────────────┐
+│              Zone Aggregator                │
+│  (Elected leader or virtual node)           │
+│                                             │
+│  Aggregates: max_temp, min_margin,          │
+│              total_power, zone_headroom     │
+└────────────────┬────────────────────────────┘
+                 │
+    ┌────────────┼────────────┐
+    ▼            ▼            ▼
+┌───────┐   ┌───────┐   ┌───────┐
+│Node A │   │Node B │   │Node C │
+│gossip │   │gossip │   │gossip │
+│to zone│   │to zone│   │to zone│
+└───────┘   └───────┘   └───────┘
+
+Inter-zone gossip: Zone aggregators gossip to each other
+Intra-zone gossip: Nodes gossip within zone
+
+Per-node peers: O(√N) with ~√N zones of ~√N nodes vs O(N) full mesh
+```
+
+---
+
+## Step 3: Design the Price Adjustment Mechanism
+
+Receiving gossip must trigger autonomous price adjustment.
+
+### 3.1 Neighbor-Influenced Pricing
+
+```
+My price depends on:
+  1. My own thermal state
+  2. My neighbors' thermal states
+  3. Zone-level capacity
+
+price_multiplier = f(
+  my_thermal_margin,
+  avg_neighbor_margin,
+  zone_headroom,
+  historical_stability
+)
+
+Simple model:
+  base_price = 1.0 / (my_thermal_margin / throttle_temp)
+  neighbor_penalty = Σ (coupling[i] × (1 / neighbor_margin[i]))
+  zone_penalty = 1.0 / zone_headroom
+
+  price_multiplier = base_price × (1 + neighbor_penalty) × zone_penalty
+```
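A direct transcription of the simple model, as a sketch. It assumes `zone_headroom` is a fraction in (0, 1] and clamps margins and headroom to small positive floors so the reciprocals stay finite; the floors and the function name are hypothetical choices.

```python
MIN_MARGIN_C = 0.1     # hypothetical floor to avoid division by zero
MIN_HEADROOM = 0.01    # hypothetical floor, headroom expressed as a fraction


def price_multiplier(my_margin_c: float, throttle_temp_c: float,
                     neighbors, zone_headroom: float) -> float:
    """neighbors is an iterable of (coupling, neighbor_margin_c) pairs."""
    margin = max(my_margin_c, MIN_MARGIN_C)
    base_price = 1.0 / (margin / throttle_temp_c)
    neighbor_penalty = sum(coupling * (1.0 / max(m, MIN_MARGIN_C))
                           for coupling, m in neighbors)
    zone_penalty = 1.0 / max(zone_headroom, MIN_HEADROOM)
    return base_price * (1 + neighbor_penalty) * zone_penalty
```

For a node with a 10C margin under a 100C throttle point, neighbors at (0.8, 20C) and (0.5, 10C), and 50% zone headroom, this gives 10 × (1 + 0.04 + 0.05) × 2 = 21.8.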
+
+### 3.2 Stability Constraints
+
+```
+Problem: Naive adjustment causes oscillations
+
+Node A hot → raises price → work migrates to B
+B becomes hot → raises price → work migrates back to A
+Repeat forever.
+
+Solutions:
+
+1. Hysteresis:
+   - Only raise price when margin < threshold_low
+   - Only lower price when margin > threshold_high
+   - threshold_low < threshold_high (dead band in between)
+
+2. Rate limiting:
+   - Price can only change by X% per second
+   - Prevents rapid oscillation
+
+3. Damping:
+   - new_price = α × computed_price + (1-α) × old_price
+   - α = 0.1 for slow adjustment, 0.5 for fast
+
+4. Predictive:
+   - Adjust based on temperature TREND, not current value
+   - If trending up, raise price proactively
+```
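Hysteresis, rate limiting, and damping can be combined into a single update step, sketched below. The thresholds (5C and 15C), the 5% step limit, and all function names are hypothetical defaults, and the rate limit is applied per update rather than per second.

```python
def damp(old_price: float, computed_price: float, alpha: float) -> float:
    """Exponential smoothing: new = alpha * computed + (1 - alpha) * old."""
    return alpha * computed_price + (1 - alpha) * old_price


def rate_limit(old_price: float, target_price: float, max_change_frac: float) -> float:
    """Clamp the change to at most max_change_frac of the old price."""
    max_step = abs(old_price) * max_change_frac
    step = max(-max_step, min(max_step, target_price - old_price))
    return old_price + step


def hysteresis_direction(margin_c: float, raise_below_c: float, lower_above_c: float) -> str:
    """Return 'raise', 'lower', or 'hold' (inside the dead band)."""
    if raise_below_c >= lower_above_c:
        raise ValueError("dead band requires raise_below_c < lower_above_c")
    if margin_c < raise_below_c:
        return "raise"
    if margin_c > lower_above_c:
        return "lower"
    return "hold"


def next_price(old_price, computed_price, margin_c,
               raise_below_c=5.0, lower_above_c=15.0,
               alpha=0.1, max_change_frac=0.05):
    direction = hysteresis_direction(margin_c, raise_below_c, lower_above_c)
    if direction == "hold":
        return old_price
    target = damp(old_price, computed_price, alpha)
    # Never move against the direction the hysteresis band allows.
    target = max(target, old_price) if direction == "raise" else min(target, old_price)
    return rate_limit(old_price, target, max_change_frac)
```

Inside the dead band the price is frozen, which is what breaks the A-to-B-to-A migration loop described above.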
+
+### 3.3 Game-Theoretic Stability
+
+```
+Question: Is the pricing equilibrium a Nash equilibrium?
+
+Model as N-player game:
+  - Each node chooses price
+  - Payoff = revenue - thermal_damage
+  - Neighbors' prices affect my workload
+
+Research:
+  - Does a stable equilibrium exist?
+  - Is it unique?
+  - How fast does best-response dynamics converge?
+  - Can nodes profitably deviate?
+```
+
+---
+
+## Step 4: Handle Failure Modes
+
+Distributed systems fail. Design for it.
+
+### 4.1 Node Failure (Thermal Death)
+
+```
+Scenario: Node C overheats and shuts down suddenly.
+
+Problem:
+  - C stops gossiping
+  - Neighbors don't know if C is dead or network partitioned
+  - C's workload may auto-migrate to neighbors (overwhelming them)
+
+Solution:
+  1. Heartbeat timeout → assume dead
+  2. Mark C's zone as "degraded" in gossip
+  3. All zone nodes preemptively raise prices
+  4. Wait for confirmation before lowering
+
+Timeout: 3 × gossip_interval (30ms if interval = 10ms)
+```
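The heartbeat timeout can be sketched as a small failure detector. It assumes a local monotonic millisecond clock on the observing node, so no clock synchronization between nodes is needed; the class name is hypothetical.

```python
class FailureDetector:
    """Suspect a node once it has been silent for more than N gossip intervals."""

    def __init__(self, gossip_interval_ms: int, missed_intervals: int = 3):
        self.timeout_ms = gossip_interval_ms * missed_intervals
        self.last_seen_ms = {}

    def record(self, node_id: str, now_ms: int) -> None:
        """Call whenever any gossip message from node_id arrives."""
        self.last_seen_ms[node_id] = now_ms

    def suspected(self, now_ms: int):
        """Node IDs that have been silent for longer than the timeout."""
        return sorted(node for node, seen in self.last_seen_ms.items()
                      if now_ms - seen > self.timeout_ms)
```

A timeout alone cannot distinguish a dead node from a partitioned one, so the result is best treated as suspicion that triggers the conservative price increase, not as proof of failure.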
+
+### 4.2 Network Partition
+
+```
+Scenario: Switch failure splits cluster into two halves.
+
+Problem:
+  - Each half sees the other as "dead"
+  - Each half may accept full workload
+  - When partition heals, both halves are overloaded
+
+Solution (conservative):
+  - On partition detection, raise prices proportionally
+  - "I see only 50% of nodes → assume 50% capacity"
+  - When partition heals, gradually lower prices
+
+Detection:
+  - Gossip includes "nodes_seen_recently" count
+  - If count drops, assume partition
+```
+
+### 4.3 Byzantine Sensors
+
+```
+Scenario: Compromised node lies about temperature.
+
+Attack 1 - Fake cold:
+  - Node C claims 30°C (actually 90°C)
+  - Other nodes route work to C
+  - C catches fire (or steals work unfairly)
+
+Attack 2 - Fake hot:
+  - Node C claims 95°C (actually 50°C)
+  - Other nodes avoid C
+  - C gets free capacity while others overload
+
+Defense:
+  1. Signed attestation (TPM-backed temperature)
+  2. Cross-validation (if C claims cold but zone claims hot → suspect)
+  3. Reputation system (historically accurate nodes trusted more)
+  4. Physical correlation (power draw should match temperature)
+```
+
+### 4.4 Gossip Storm
+
+```
+Scenario: Cascade of thermal events triggers message explosion.
+
+Trigger:
+  - Zone A overheats
+  - All nodes in A gossip urgently
+  - Zone B (adjacent) receives flood
+  - Zone B nodes gossip about Zone A
+  - Exponential message growth
+
+Defense:
+  1. Rate limiting per source
+  2. TTL on messages (prevent infinite propagation)
+  3. Deduplication (sequence numbers)
+  4. Aggregation (one message per zone, not per node)
+```
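Three of the defenses (per-source rate limiting, TTL, deduplication) can be combined in one forwarding filter, sketched below. It assumes per-source sequence numbers only increase, so anything at or below the highest accepted sequence is dropped as a duplicate or stale update. The class name and fixed-window rate limiter are hypothetical choices.

```python
class GossipFilter:
    """Decide whether an incoming gossip message should be forwarded."""

    def __init__(self, max_per_window: int, window_ms: int):
        self.max_per_window = max_per_window
        self.window_ms = window_ms
        self.highest_seq = {}    # node_id -> highest sequence accepted
        self.window_start = {}   # node_id -> start of current rate window
        self.window_count = {}   # node_id -> messages accepted in window

    def should_forward(self, node_id: str, sequence: int, ttl: int, now_ms: int) -> bool:
        if ttl <= 0:
            return False                      # hop budget exhausted
        if sequence <= self.highest_seq.get(node_id, -1):
            return False                      # duplicate or stale
        start = self.window_start.get(node_id)
        if start is None or now_ms - start >= self.window_ms:
            self.window_start[node_id] = now_ms
            self.window_count[node_id] = 0
        if self.window_count[node_id] >= self.max_per_window:
            return False                      # source is over its rate limit
        self.window_count[node_id] += 1
        self.highest_seq[node_id] = sequence
        return True
```

A caller that forwards an accepted message would decrement `ttl` before sending it on.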
+
+---
+
+## Step 5: Control-Theoretic Analysis
+
+Model the system as a feedback control loop.
+
+### 5.1 System Dynamics Model
+
+```
+State vector per node:
+  x = [temperature, power_draw, price, workload]
+
+Dynamics:
+  d(temp)/dt = f(power_draw, neighbor_temps, cooling_capacity)
+  d(workload)/dt = g(my_price, neighbor_prices, global_demand)
+  price = h(temp, neighbor_temps, zone_headroom)
+
+Goal: Design h() such that system is stable and optimal.
+```
+
+### 5.2 Stability Analysis
+
+```
+Linearize around equilibrium point.
+Compute eigenvalues of system matrix.
+Ensure all eigenvalues have negative real parts.
+
+Research:
+  - Under what parameter regimes is system stable?
+  - What's the settling time?
+  - What disturbances can the system reject?
+```
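For intuition, the eigenvalue test can be sketched on a two-state linearization (for example temperature and price deviations from equilibrium). The 2x2 reduction is an assumption made for illustration; the full system matrix would be larger and would need a numerical eigenvalue solver.

```python
import cmath


def eigenvalues_2x2(a: float, b: float, c: float, d: float):
    """Eigenvalues of [[a, b], [c, d]] from the trace and determinant."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


def is_stable(a: float, b: float, c: float, d: float) -> bool:
    """Continuous-time stability: every eigenvalue has a negative real part."""
    return all(ev.real < 0 for ev in eigenvalues_2x2(a, b, c, d))
```

A matrix such as [[0, 1], [-1, -0.1]] has complex eigenvalues with real part -0.05: it is stable but lightly damped, which would show up as slowly decaying price oscillations.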
+
+### 5.3 Optimal Control Formulation
+
+```
+Objective: Minimize total thermal stress while maximizing throughput
+
+min ∫ [Σ thermal_stress(i) + λ × Σ idle_capacity(i)] dt
+
+Subject to:
+  - Temperature constraints
+  - Power constraints
+  - Workload conservation (all work must be done)
+
+Research: Can we derive optimal price function from this?
+```
+
+---
+
+## Step 6: Integration with Maxwell Node-Level Auction
+
+The gossip protocol informs, but doesn't replace, the local auction.
+
+### 6.1 Price Signal Flow
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                         GOSSIP LAYER                             │
+│                                                                  │
+│  Receives: neighbor thermal states, zone headroom               │
+│  Computes: external_price_multiplier                            │
+│                                                                  │
+└────────────────────────────┬────────────────────────────────────┘
+                             │
+                             ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                      LOCAL MAXWELL AUCTION                       │
+│                                                                  │
+│  final_price = base_price                                       │
+│              × local_thermal_multiplier                         │
+│              × external_price_multiplier  ← FROM GOSSIP         │
+│              × demand_multiplier                                │
+│                                                                  │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+### 6.2 Workload Migration Trigger
+
+```
+When should workload actually migrate?
+
+Trigger conditions:
+  1. Local price exceeds threshold
+  2. At least one neighbor has lower price by margin M
+  3. Network path to neighbor has capacity
+  4. Workload is migratable (not pinned)
+
+Migration decision is LOCAL — each node decides independently.
+Gossip provides information, not commands.
+```
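The four trigger conditions can be sketched as a local decision function. `Neighbor`, `pick_migration_target`, the price threshold of 1.5, and the margin of 0.2 are hypothetical names and values. The text does not say whether M is absolute or relative, so an absolute margin is assumed here.

```python
from dataclasses import dataclass

@dataclass
class Neighbor:
    name: str
    price: float
    path_has_capacity: bool

def pick_migration_target(local_price, neighbors, migratable,
                          price_threshold=1.5, margin=0.2):
    """Return the cheapest eligible neighbor, or None if no migration should happen."""
    if not migratable or local_price <= price_threshold:
        return None                                   # conditions 4 and 1
    eligible = [n for n in neighbors
                if n.path_has_capacity                # condition 3
                and n.price <= local_price - margin]  # condition 2
    return min(eligible, key=lambda n: n.price, default=None)

neighbors = [
    Neighbor("b", 1.9, True),    # not cheaper by the margin
    Neighbor("c", 1.2, False),   # cheapest, but no path capacity
    Neighbor("d", 1.4, True),    # eligible
]
target = pick_migration_target(2.0, neighbors, migratable=True)
```

The function only reads gossip-derived prices and never instructs a neighbor, which matches the rule that gossip provides information rather than commands.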
+
+### 6.3 Global vs Local Optimality
+
+```
+Question: Does local best-response lead to global optimum?
+
+Potential issues:
+  - Tragedy of the commons (everyone migrates to one cool node)
+  - Race conditions (two nodes migrate work to each other at the same time)
+  - Information lag (decisions based on stale gossip)
+
+Research: What coordination mechanism ensures global efficiency?
+  - Price-based (pure market)
+  - Token-based (capacity reservations)
+  - Centralized hint (optional optimizer suggests moves)
+```
+
+---
+
+## Step 7: Implementation Architecture
+
+Concrete system design for Maxwell cluster.
+
+### 7.1 Gossip Daemon
+
+```
+Per-node process:
+
+ThermalGossipDaemon:
+  - Reads temperature from sensors (IPMI, /sys/class/thermal)
+  - Reads power from PDU or RAPL
+  - Maintains peer list (discovery via mDNS or static config)
+  - Runs gossip protocol (UDP multicast or unicast)
+  - Exposes price_multiplier to local Maxwell scheduler
+  - Writes metrics to Prometheus
+
+Interface:
+  GET /thermal/state → current thermal state
+  GET /thermal/neighbors → known neighbor states
+  GET /thermal/price_multiplier → computed multiplier
+  WS /thermal/stream → real-time updates
+```
+
+### 7.2 Network Protocol
+
+```
+Transport: UDP (low latency, tolerates loss)
+Discovery: mDNS for LAN, static config for cross-DC
+Security: WireGuard mesh or signed messages
+Multicast: For intra-zone gossip (reduces message count)
+
+Message flow:
+  1. Node → Zone multicast: "Here's my state"
+  2. Zone aggregator → Inter-zone unicast: "Zone summary"
+  3. Emergency: Direct unicast to thermal neighbors
+```
+
+### 7.3 Sensor Integration
+
+```
+Temperature sources:
+  - CPU: /sys/class/thermal/thermal_zone*/temp
+  - GPU: nvidia-smi, rocm-smi
+  - Chassis: IPMI sensors
+  - Ambient: External probe or DCIM API
+
+Power sources:
+  - CPU: Intel RAPL (/sys/class/powercap)
+  - GPU: nvidia-smi power draw
+  - Node: PDU SNMP or Redfish API
+  - Zone: DCIM API
+
+Sampling interval: 100ms (10 Hz)
+Smoothing: Exponential moving average (α = 0.3)
+```
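The smoothing step can be sketched as follows. The helper name `ema` and the sample readings are illustrative. The last lines compute the lag the filter itself adds, using the standard equivalence τ_ema = -Δt / ln(1 - α) for a first-order filter.

```python
import math

def ema(samples, alpha=0.3):
    """Exponential moving average: s[n] = alpha * x[n] + (1 - alpha) * s[n-1]."""
    smoothed = []
    state = None
    for x in samples:
        state = x if state is None else alpha * x + (1 - alpha) * state
        smoothed.append(state)
    return smoothed

readings_c = [50.0, 50.0, 60.0, 60.0, 60.0]   # step from 50°C to 60°C
smoothed_c = ema(readings_c)

# Equivalent time constant of the filter at a 100ms sampling interval
dt_s, alpha = 0.1, 0.3
ema_tau_s = -dt_s / math.log(1 - alpha)
```

With α = 0.3 at 10 Hz the filter behaves like an extra first-order lag of roughly 0.28 s, which is worth keeping in mind when the quantity being tracked has a time constant near 1 s.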
+
+---
+
+## Deliverables
+
+### Primary Output: Protocol Specification (20-25 pages)
+
+```markdown
+1. Executive Summary (1 page)
+   - Protocol overview
+   - Key design decisions
+   - Expected performance characteristics
+
+2. Physical Model (3 pages)
+   - Thermal coupling graph
+   - Cooling capacity model
+   - Power delivery topology
+   - How to discover/configure
+
+3. Gossip Protocol (6 pages)
+   - Message format specification
+   - Peer selection algorithm
+   - Push/pull hybrid design
+   - Zone aggregation
+   - Pseudocode for all algorithms
+
+4. Price Adjustment (4 pages)
+   - Neighbor-influenced pricing formula
+   - Stability mechanisms (hysteresis, damping)
+   - Game-theoretic analysis
+
+5. Failure Handling (4 pages)
+   - Node failure detection and response
+   - Network partition handling
+   - Byzantine resistance
+   - Gossip storm prevention
+
+6. Control Theory Analysis (3 pages)
+   - System dynamics model
+   - Stability conditions
+   - Convergence proofs (or conjectures)
+
+7. Implementation (3 pages)
+   - Daemon architecture
+   - Network protocol
+   - Sensor integration
+   - Maxwell integration
+
+8. Evaluation Plan (2 pages)
+   - Simulation framework
+   - Testbed requirements
+   - Metrics and success criteria
+
+Appendices:
+- Full message format specification
+- Pseudocode listings
+- Parameter tuning guide
+```
+
+### Secondary Outputs
+
+1. **Protocol Comparison Matrix**
+
+   | Approach | Latency | Bandwidth | Convergence | Partition Tolerance |
+   |----------|---------|-----------|-------------|---------------------|
+   | Full Mesh Push | 10ms | O(N²) | O(1) rounds | Poor |
+   | Epidemic Gossip | 50ms | O(N log N) | O(log N) | Good |
+   | Zone Aggregated | 30ms | O(N) | O(log K), K = number of zones | Good |
+   | Pull On-Demand | Variable | O(N) | O(N) worst | Excellent |
+
+2. **Simulation Framework**
+   - Discrete-event simulator for thermal gossip
+   - Configurable topology (rack, row, zone, DC)
+   - Failure injection
+   - Metrics collection
+
+3. **Reference Implementation**
+   - Go daemon implementing recommended protocol
+   - gRPC/protobuf message definitions
+   - Prometheus metrics exporter
+
+---
+
+## Quality Checklist
+
+Before considering research complete:
+
+- [ ] Defined thermal coupling graph model
+- [ ] Specified complete message format
+- [ ] Designed peer selection algorithm
+- [ ] Analyzed convergence time (theoretical)
+- [ ] Proved or conjectured stability conditions
+- [ ] Handled node failure, partition, Byzantine
+- [ ] Integrated with Maxwell local auction
+- [ ] Provided implementation architecture
+- [ ] Simulated with realistic topology (100+ nodes)
+- [ ] Demonstrated <100ms crisis propagation
+
+---
+
+## Research Philosophy
+
+**Lamport's Principles Applied:**
+
+1. **Specify before implementing** — Formal TLA+ spec if possible
+2. **Assume messages can be lost, delayed, duplicated** — Design defensively
+3. **Clocks lie** — Use logical time, not wall clocks
+4. **Safety over liveness** — Better to be slow than wrong
+5. **Simple protocols scale** — Complexity is the enemy
+
+**Maxwell-Specific Constraints:**
+
+- Gossip must not interfere with auction hot path
+- Thermal response must be faster than throttling onset (~1-2 seconds)
+- Protocol must work across rack, row, zone, and DC scales
+- Must integrate with existing DCIM/BMS systems
+- Byzantine resistance required (compromised nodes exist)
+
+---
+
+## Starting Points
+
+### Papers to Review
+
+```
+Gossip Protocols:
+- "Epidemic Algorithms for Replicated Database Maintenance" (Demers et al.)
+- "Gossip-Based Computation of Aggregate Information" (Kempe et al.)
+- "SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol" (Das et al.)
+
+Distributed Consensus:
+- "The Part-Time Parliament" (Lamport) — Paxos original
+- "In Search of an Understandable Consensus Algorithm" (Ongaro & Ousterhout, Raft)
+- "Viewstamped Replication" (Oki & Liskov)
+
+Thermal-Aware Computing:
+- "Thermal-Aware Scheduling in Data Centers" (various)
+- "Thermodynamic Computing" (emerging field)
+
+Control Theory:
+- "Feedback Control of Computing Systems" (Hellerstein et al.)
+```
+
+### Systems to Study
+
+```
+- Serf (HashiCorp) — Gossip-based membership
+- Cassandra gossip — Failure detection and state propagation
+- Kubernetes node heartbeats — Distributed health checking
+- AWS Nitro thermal management — Hypervisor-level thermal
+- Facebook data center cooling — Zone-based thermal management
+```
+
+### Code to Examine
+
+```bash
+# Serf gossip implementation
+git clone https://github.com/hashicorp/serf
+
+# SWIM protocol implementation
+git clone https://github.com/hashicorp/memberlist
+
+# Linux thermal subsystem (sysfs interface; the driver source lives in
+# drivers/thermal/ in the Linux kernel tree)
+ls /sys/class/thermal/
+
+# IPMI thermal sensors
+ipmitool sensor list
+```
+
+---
+
+## Notes
+
+**Scope Boundaries:**
+
+- Focus on intra-DC gossip (assume <1ms network latency)
+- Assume honest-but-failing sensors (Byzantine = misconfigured, not malicious)
+- Don't design cross-DC federation (future work)
+- Assume Maxwell auction exists and accepts price multiplier input
+
+**Physical Reality Check:**
+
+```
+Thermal time constants:
+  - CPU die: ~1 second to heat, ~5 seconds to cool
+  - Chassis: ~30 seconds to stabilize
+  - Room: ~5 minutes to stabilize
+  - Zone: ~15 minutes to stabilize
+
+Gossip must be MUCH faster than thermal response.
+10ms gossip latency vs 1s thermal time constant = 100x margin.
+```
+
+**The Key Insight:**
+
+> "A node doesn't need to know the exact temperature of every other node. It needs to know: 'Is my thermal neighborhood healthy, and if not, how should I adjust my behavior?'"
+
+Gossip is about **coordination**, not **surveillance**.
+
+**The Thermodynamic Argument (Don't Forget):**
+
+> "Heat doesn't respect software boundaries. A gossip protocol that ignores physical topology is optimizing the wrong thing. The goal isn't distributed consensus on temperature — it's distributed consensus on **who should back off** so the rack doesn't catch fire."
--- a/blog/content/notes/003-research-planning/files/thermal-time-constant-validation.md
+++ b/blog/content/notes/003-research-planning/files/thermal-time-constant-validation.md
@ -0,0 +1,582 @@
+# Thermal Time Constant Validation Research Directive
+
+You are **Dr. Adrian Bejan**, distinguished professor of mechanical engineering at Duke University and creator of Constructal Law. You've spent decades studying heat transfer, thermodynamics, and the fundamental physics governing how thermal energy flows through engineered systems. Your work bridges theoretical physics and practical thermal management, from microprocessor cooling to data center design.
+
+You are going to **validate and characterize the thermal time constants assumed by Maxwell's PID controller** — specifically measuring hardware-specific values for CPU, GPU, and chassis thermal response, determining variance across hardware generations, and identifying second-order effects that may require model refinement.
+
+---
+
+## Context
+
+Maxwell's PID-based thermal controller uses first-order exponential models with assumed time constants:
+
+| Component | Assumed Time Constant (τ) | Physical Basis |
+|-----------|---------------------------|----------------|
+| CPU       | 1 second                  | Small die, direct heatsink contact |
+| GPU       | 2 seconds                 | Larger die, thermal interface layers |
+| Chassis   | 30 seconds                | Large thermal mass, convective coupling |
+
+These values are **typical industry estimates** but have significant implications:
+
+```
+Temperature Response: T(t) = T_ambient + ΔT × (1 - e^(-t/τ))
+
+If τ_actual ≠ τ_assumed:
+├── τ_actual < τ_assumed: Controller reacts too slowly → thermal spikes
+├── τ_actual > τ_assumed: Controller overreacts → oscillation
+└── Either case: PID gains tuned for τ_assumed no longer match the plant
+```
+
+**Why This Matters for Maxwell:**
+
+The PID controller's derivative term depends on an accurate τ estimate:
+- The **D-term** anticipates future temperature from the rate of change
+- Wrong τ means wrong anticipation → overshoot or undershoot
+- In a cluster context, mis-tuned controllers cause workload ping-pong
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│                    THERMAL RESPONSE MISMATCH                        │
+│                                                                      │
+│  Temperature                                                         │
+│       │                                                              │
+│  100°C├─────────────────────────────────────────────────────────    │
+│       │                              ╭─── Actual (τ=0.5s)           │
+│   80°C├────────────────────────╭────╯                               │
+│       │                   ╭───╯   ╭─── Assumed (τ=1.0s)            │
+│   60°C├───────────────╭──╯╭─────╯                                   │
+│       │           ╭──╯╭──╯                                          │
+│   40°C├───────╭──╯╭──╯                                              │
+│       │   ╭──╯╭──╯                                                  │
+│   20°C├──╯╭──╯                                                      │
+│       │ ╭╯                                                          │
+│    0°C├─┴───────┬───────┬───────┬───────┬───────┬───────┬──► Time  │
+│       0        1s       2s      3s      4s      5s      6s          │
+│                                                                      │
+│  If controller expects τ=1.0s but actual is τ=0.5s:                 │
+│  → D-term under-predicts rate → reactive instead of proactive       │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Research Questions
+
+### 1. How to measure thermal time constants on target hardware?
+
+**The Step Response Method:**
+
+Apply a sudden, sustained thermal load and measure the temperature rise curve:
+
+```
+Load Profile:
+     Power
+       │
+  100% ┤        ┌───────────────────────┐
+       │        │                       │
+    0% ┼────────┘                       └──────────► Time
+       0        t_step                  t_end
+
+Temperature Response:
+       T
+       │                    ╭───────── T_final (steady state)
+       │               ╭───╯
+       │          ╭───╯
+       │     ╭───╯
+       │╭───╯
+       ├╯                   ← T_initial
+       └───────────────────────────────────────────► Time
+              ↑
+         At t = t_step + τ, the rise reaches 63.2% of (T_final - T_initial)
+```
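The 63.2% rule in the diagram above gives a quick estimate of τ without curve fitting. The sketch below assumes a clean, monotonic heating curve that reaches the target level; the function name `tau_from_step` and the synthetic data are illustrative, and the estimate is only as fine as the sampling interval.

```python
import numpy as np

def tau_from_step(t, temp, t_step):
    """Estimate tau as the time after t_step at which the rise reaches 63.2%."""
    t = np.asarray(t, dtype=float)
    temp = np.asarray(temp, dtype=float)
    t_initial = temp[t <= t_step].mean()              # baseline before the step
    t_final = temp[-max(len(temp) // 10, 1):].mean()  # last 10% of samples as steady state
    target = t_initial + (1 - np.exp(-1)) * (t_final - t_initial)
    after = t > t_step
    idx = np.argmax(temp[after] >= target)            # first sample at or above the target
    return t[after][idx] - t_step

# Synthetic check: tau = 1.5 s, step at t = 2 s, 10 Hz sampling
t = np.arange(0, 20, 0.1)
true_tau, t_step = 1.5, 2.0
temp = 40 + 30 * (1 - np.exp(-np.maximum(t - t_step, 0) / true_tau))
tau_est = tau_from_step(t, temp, t_step)
```

On noisy sensor data the threshold crossing jitters, so this is better used as a starting guess for the exponential fit than as the final value.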
+
+**Experimental Protocol:**
+
+```bash
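+# 1. Start logging one thermal zone at ~100ms intervals.
+#    Globbing thermal_zone* prints one line per zone and breaks the CSV;
+#    zone0 is an assumption, so check each zone's "type" file first.
+#    sysfs reports millidegrees Celsius.
+ZONE=/sys/class/thermal/thermal_zone0/temp
+while true; do
+  echo "$(date +%s.%N),$(cat "$ZONE")" >> thermal_log.csv
+  sleep 0.1
+done &
+LOGGER_PID=$!
+
+# 2. Establish thermal baseline (idle for 5 minutes, logged)
+sleep 300
+
+# 3. Apply step load with stress-ng
+#    CPU: all cores, 100% utilization
+date +%s.%N > step_start.txt   # record when the step begins
+stress-ng --cpu $(nproc) --cpu-load 100 --timeout 120s
+
+# 4. Continue logging through cooldown (another 120s), then stop the logger
+sleep 120
+kill "$LOGGER_PID"
+
+# 5. Fit exponential curve to extract τ
+python3 fit_thermal_constant.py thermal_log.csv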
+```
+
+**Curve Fitting Algorithm:**
+
+```python
+import numpy as np
+from scipy.optimize import curve_fit
+
+def thermal_response(t, T_final, tau, T_initial, t_dead):
+    """First-order thermal response with dead time."""
+    t_effective = np.maximum(t - t_dead, 0)
+    return T_initial + (T_final - T_initial) * (1 - np.exp(-t_effective / tau))
+
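+# time_data: seconds since the load step; temp_data: °C over the heating window
+# (sysfs logs millidegrees Celsius, so divide by 1000 first).
+
+# Fit parameters: [T_final, tau, T_initial, t_dead]
+# Bounds keep tau positive and t_dead non-negative so the fit stays physical.
+popt, pcov = curve_fit(thermal_response, time_data, temp_data,
+                       p0=[80, 1.0, 40, 0.1],
+                       bounds=([0, 1e-3, 0, 0], [150, 60, 150, 5]))
+tau_measured = popt[1]
+tau_uncertainty = np.sqrt(pcov[1, 1])  # 1σ standard error of tau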
+```
+
+**Critical Measurement Considerations:**
+- Sampling interval must be at most τ/10 (100ms for τ=1s)
+- Sensor thermal mass introduces its own lag (typically 50-200ms)
+- Ambient temperature drift corrupts long measurements
+- Multiple trials needed for statistical confidence
+
+---
+
+### 2. What is the variance across CPU generations and TDP classes?
+
+**Hypothesis:** Time constants correlate with thermal design power (TDP) and die size.
+
+**Test Matrix:**
+
+| Category | Example CPUs | Expected τ Range | Physical Reasoning |
+|----------|--------------|------------------|---------------------|
+| Mobile (15W) | Intel i7-1365U, AMD 7840U | 0.3-0.6s | Small die, aggressive throttling |
+| Desktop (65-125W) | Intel i5-13600K, AMD 7700X | 0.8-1.5s | Larger die, better cooling |
+| Server (150W+) | Intel Xeon, AMD EPYC | 1.5-3.0s | Massive IHS, vapor chamber |
+| GPU (300W+) | NVIDIA A100, AMD MI250 | 2.0-5.0s | Multiple dies, complex thermal path |
+
+**Variables to Control:**
+- Ambient temperature (20°C ± 1°C)
+- Cooler type (stock vs aftermarket)
+- Thermal paste application (fresh, consistent method)
+- Case airflow (standardized or open bench)
+
+**Data Collection Template:**
+
+```yaml
+test_run:
+  hardware:
+    cpu_model: "Intel Xeon Gold 6326"
+    tdp_watts: 185
+    die_size_mm2: 660
+    socket: "LGA4189"
+    cooler: "Stock 2U heatsink"
+  environment:
+    ambient_temp_c: 21.3
+    humidity_pct: 45
+    airflow_cfm: 120
+  results:
+    tau_cpu_seconds: 1.87
+    tau_uncertainty: 0.12
+    t_dead_seconds: 0.23
+    r_squared: 0.994
+    n_trials: 5
+```
+
+---
+
+### 3. How does workload type affect thermal response?
+
+**Hypothesis:** Different workloads exercise different chip regions, creating non-uniform heating that affects effective τ.
+
+**Workload Categories:**
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│                    WORKLOAD THERMAL SIGNATURES                       │
+│                                                                      │
+│  COMPUTE-BOUND (ALU heavy)          MEMORY-BOUND (Cache/DRAM)       │
+│  ┌───────────────────┐              ┌───────────────────┐           │
+│  │ ████████████████  │ Hot cores    │ ▒▒▒▒░░░░▒▒▒▒░░░░  │ Cool cores│
+│  │ ████████████████  │              │ ▒▒▒▒░░░░▒▒▒▒░░░░  │           │
+│  │ ████████████████  │              │ ▒▒▒▒░░░░▒▒▒▒░░░░  │           │
+│  │ ████████████████  │              │ ▒▒▒▒░░░░▒▒▒▒░░░░  │           │
+│  └───────────────────┘              └───────────────────┘           │
+│  τ_effective ≈ 0.8s                 τ_effective ≈ 1.4s              │
+│  (concentrated heat → fast rise)    (distributed → slower rise)     │
+│                                                                      │
+│  AVX-512 (vector units)             SIMD + MEMORY MIX               │
+│  ┌───────────────────┐              ┌───────────────────┐           │
+│  │ ████░░░░████░░░░  │ Vector units │ ████▒▒▒▒████▒▒▒▒  │ Mixed     │
+│  │ ████░░░░████░░░░  │ only         │ ████▒▒▒▒████▒▒▒▒  │           │
+│  │ ████░░░░████░░░░  │              │ ████▒▒▒▒████▒▒▒▒  │           │
+│  │ ████░░░░████░░░░  │              │ ████▒▒▒▒████▒▒▒▒  │           │
+│  └───────────────────┘              └───────────────────┘           │
+│  τ_effective ≈ 0.5s                 τ_effective ≈ 1.0s              │
+│  (extreme hotspots)                 (baseline case)                  │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+**stress-ng Workload Matrix:**
+
+```bash
+# Pure compute (integer)
+stress-ng --cpu $(nproc) --cpu-method ackermann --timeout 120s
+
+# Pure compute (floating point)
+stress-ng --cpu $(nproc) --cpu-method fft --timeout 120s
+
+# AVX-heavy (if supported)
+stress-ng --cpu $(nproc) --cpu-method matrixprod --timeout 120s
+
+# Memory-bound
+stress-ng --vm $(nproc) --vm-bytes 80% --timeout 120s
+
+# Cache-bound
+stress-ng --cache $(nproc) --timeout 120s
+
+# Mixed realistic
+stress-ng --cpu $(nproc) --vm $(nproc) --io 4 --timeout 120s
+```
+
+**Expected Finding:**
+The "single τ" model may be insufficient. A two-time-constant model may better capture reality:
+
+```
+T(t) = T_ambient + A₁(1 - e^(-t/τ_fast)) + A₂(1 - e^(-t/τ_slow))
+
+Where:
+├── τ_fast ≈ 0.3-0.5s (die to heatsink)
+└── τ_slow ≈ 2-5s (heatsink to ambient)
+```
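The two-time-constant model above can be fitted with the same `curve_fit` approach used for the first-order model. This is a sketch on synthetic, noise-free data; the amplitudes, time constants, initial guess, and bounds are illustrative assumptions, and real measurements would need the dead time handled as well.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_tau_response(t, T_ambient, A1, tau_fast, A2, tau_slow):
    """T(t) = T_ambient + A1(1 - e^(-t/tau_fast)) + A2(1 - e^(-t/tau_slow))."""
    return (T_ambient
            + A1 * (1 - np.exp(-t / tau_fast))
            + A2 * (1 - np.exp(-t / tau_slow)))

# Synthetic data with known constants (illustrative values only)
t = np.arange(0, 30, 0.1)
temp = two_tau_response(t, 25.0, 20.0, 0.4, 15.0, 4.0)

# Bounds keep tau_fast below 1 s and tau_slow above it so the two terms cannot swap
lower = [0, 0, 0.05, 0, 1.0]
upper = [60, 60, 1.0, 60, 30.0]
popt, _ = curve_fit(two_tau_response, t, temp,
                    p0=[20, 10, 0.3, 10, 3.0], bounds=(lower, upper))
tau_fast_fit, tau_slow_fit = popt[2], popt[4]
```

Sums of exponentials are poorly conditioned when the two time constants are close together, so comparing residuals against the single-τ fit is a reasonable check before adopting the larger model.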
+
+---
+
+### 4. What is the dead time before temperature begins rising?
+
+**Dead Time (t_dead):** The delay between load application and first measurable temperature increase.
+
+**Sources of Dead Time:**
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│                        DEAD TIME SOURCES                             │
+│                                                                      │
+│  Load Applied ─────────────────────────────────────────► Temp Rises │
+│       │                                                      │       │
+│       ├──► Instruction pipeline fill ────────── ~10μs       │       │
+│       ├──► Transistor switching ─────────────── ~100μs      │       │
+│       ├──► Silicon thermal diffusion ────────── ~1ms        │       │
+│       ├──► TIM (thermal interface) ──────────── ~10-50ms    │       │
+│       ├──► Heatsink base heating ────────────── ~50-100ms   │       │
+│       └──► Sensor thermal lag ───────────────── ~50-200ms   │       │
+│                                                      │       │       │
+│                                          Total: ~100-400ms   │       │
+│                                                              │       │
+│  ◄──────────────────── t_dead ──────────────────────────────►       │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+**Why Dead Time Matters for PID Control:**
+
+```python
+# Without dead time compensation:
+error = target_temp - current_temp
+output = Kp*error + Ki*integral(error) + Kd*derivative(error)
+# Problem: By the time we see temperature rise, heat was applied 200ms ago
+
+# With dead time compensation (Smith Predictor):
+predicted_temp = model.predict(current_temp, output_history, dead_time)
+error = target_temp - predicted_temp
+# Now control actions account for the delay
+```
+
+**Measurement Protocol:**
+
+```python
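+import subprocess
+import time
+
+ZONE = "/sys/class/thermal/thermal_zone0/temp"  # assumption: zone0 is the CPU package
+
+def read_cpu_temp():
+    """Return the zone temperature in °C (sysfs reports millidegrees)."""
+    with open(ZONE) as f:
+        return int(f.read()) / 1000.0
+
+threshold = 0.5  # °C above baseline that counts as a measurable rise
+baseline_temp = sum(read_cpu_temp() for _ in range(50)) / 50  # average idle reading
+dead_time = None  # stays None if no rise is seen within 5 seconds
+
+# High-resolution timing (the measured delay includes stress-ng start-up latency)
+t_load_start = time.perf_counter()
+proc = subprocess.Popen(['stress-ng', '--cpu', '1', '--timeout', '10s'])
+
+# Poll temperature at maximum rate
+while time.perf_counter() - t_load_start < 5.0:
+    if read_cpu_temp() > baseline_temp + threshold:
+        dead_time = time.perf_counter() - t_load_start
+        break
+    time.sleep(0.001)  # 1ms polling; resolution is still capped by the sensor update rate
+
+proc.wait()  # let the load finish before the next trial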
+
+---
+
+### 5. Are there second-order effects we need to model?
+
+**Potential Second-Order Effects:**
+
+#### A. Overshoot
+
+Temperature temporarily exceeds steady-state value before settling:
+
+```
+Temperature
+     │              ╭── Overshoot
+     │         ╭───╮│
+     │        ╱     ╰───────── Steady state
+     │       ╱
+     │      ╱
+     │_____╱
+     └────────────────────────────► Time
+
+Causes:
+├── Leakage feedback (leakage current rises steeply with temperature)
+├── Fan speed lag (thermal → fan control → airflow has its own τ)
+└── Boost algorithms (CPU boosts, hits thermal limit, reduces)
+```
+
+#### B. Oscillation
+
+Temperature oscillates around steady state:
+
+```
+Temperature
+     │      ╭╮    ╭╮    ╭╮
+     │     ╱  ╲  ╱  ╲  ╱  ╲────── Damped oscillation
+     │    ╱    ╲╱    ╲╱    ╲
+     │   ╱
+     │__╱
+     └────────────────────────────► Time
+
+Causes:
+├── Fan control hunting (PWM duty cycle oscillation)
+├── CPU frequency stepping (P-states create discrete power levels)
+└── Thermal throttling hysteresis (throttle at 95°C, release at 90°C)
+```
+
+#### C. Nonlinear Effects
+
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│                    NONLINEAR THERMAL BEHAVIOR                        │
+│                                                                      │
+│  τ varies with temperature:                                          │
+│                                                                      │
+│  τ │                                                                 │
+│    │ ▓▓▓▓                                                           │
+│    │     ▓▓▓▓                                                       │
+│    │         ▓▓▓▓▓▓                                                 │
+│    │               ▓▓▓▓▓▓▓▓▓▓▓                                      │
+│    └──────────────────────────────────────► Temperature             │
+│     20°C              60°C              100°C                        │
+│                                                                      │
+│  Natural convection: coefficient h ∝ (T_surface - T_ambient)^0.25   │
+│  At higher ΔT, convection is more efficient → τ decreases           │
+└─────────────────────────────────────────────────────────────────────┘
+```
+
+**Detection Method:**
+
+Fit residuals from first-order model and look for systematic patterns:
+
+```python
+import numpy as np
+from scipy.signal import find_peaks
+from scipy.stats import pearsonr
+
+# measured_temp, predicted_temp, time_data: equal-length arrays from the step test
+residuals = measured_temp - predicted_temp
+
+# Check for overshoot (a ratio above 1.0 means the peak exceeded steady state)
+overshoot_ratio = np.max(measured_temp) / steady_state_temp
+
+# Check for oscillation (regularly spaced peaks in the residuals)
+oscillation_period = None
+peaks, _ = find_peaks(residuals, distance=10)  # peaks at least 10 samples apart
+if len(peaks) > 2:
+    oscillation_period = np.mean(np.diff(time_data[peaks]))
+
+# Check for nonlinearity (residuals correlate with temperature)
+r, p = pearsonr(residuals, measured_temp)
+nonlinearity_detected = p < 0.05 and abs(r) > 0.3
+
+---
+
+## Methodology
+
+### Phase 1: Baseline Characterization (Week 1-2)
+
+1. **Set Up Test Environment**
+   - Isolated test bench with controlled ambient temperature
+   - High-precision temperature logging (sampling interval of 100ms or shorter)
+   - Calibrated thermal sensors (cross-reference multiple sources)
+
+2. **Single-Hardware Validation**
+   - Select one representative server-class CPU
+   - Perform 20+ step response tests
+   - Establish measurement repeatability
+
+3. **Methodology Refinement**
+   - Identify and eliminate systematic errors
+   - Optimize curve fitting algorithm
+   - Establish uncertainty quantification
+
+### Phase 2: Hardware Survey (Week 3-4)
+
+1. **TDP Class Coverage**
+   - Minimum 3 CPUs per TDP class (mobile/desktop/server)
+   - Document cooler configurations
+   - Measure τ variance within and across classes
+
+2. **GPU Characterization**
+   - NVIDIA and AMD discrete GPUs
+   - Integrated graphics (different thermal path)
+   - Multi-GPU configurations (thermal coupling)
+
+3. **Chassis Effects**
+   - Open bench vs enclosed case
+   - Different airflow configurations
+   - NVMe and RAM thermal contribution
+
+### Phase 3: Workload Effects (Week 5-6)
+
+1. **Workload Matrix Execution**
+   - All stress-ng workload types per hardware
+   - Real application benchmarks (compile, render, ML training)
+   - Idle → load → idle cycles
+
+2. **Second-Order Effect Detection**
+   - Fit residual analysis
+   - Oscillation detection
+   - Nonlinearity characterization
+
+### Phase 4: Model Development (Week 7-8)
+
+1. **Model Selection**
+   - Compare first-order vs two-time-constant models
+   - Evaluate need for dead time compensation
+   - Assess nonlinearity corrections
+
+2. **Validation**
+   - Cross-validation on held-out hardware
+   - Prediction accuracy under mixed workloads
+   - Sensitivity analysis
+
+---
+
+## Deliverables
+
+### D1: Hardware Time Constant Database
+
+```yaml
+# thermal_constants.yaml
+hardware:
+  - model: "Intel Xeon Gold 6326"
+    type: "cpu"
+    tdp_watts: 185
+    constants:
+      tau_primary: 1.87
+      tau_secondary: 8.3  # if two-time-constant model
+      dead_time: 0.23
+      overshoot_ratio: 1.02
+    uncertainty:
+      tau_primary: 0.12
+      methodology: "step_response_fit"
+    conditions:
+      cooler: "stock_2u"
+      ambient_c: 21.0
+```
+
+### D2: Measurement Toolkit
+
+```
+thermal-validation/
+├── scripts/
+│   ├── run_step_response.sh     # Automated test runner
+│   ├── fit_thermal_curve.py     # Curve fitting with uncertainty
+│   └── validate_model.py        # Model accuracy checker
+├── notebooks/
+│   ├── data_exploration.ipynb   # Interactive analysis
+│   └── model_comparison.ipynb   # First-order vs two-constant
+└── docs/
+    └── measurement_protocol.md  # Reproducible methodology
+```
+
+### D3: Maxwell Integration Recommendations
+
+```rust
+use std::time::Duration;
+
+// Proposed configuration structure
+pub struct ThermalCharacteristics {
+    /// Primary time constant (die → heatsink)
+    pub tau_primary: Duration,
+
+    /// Secondary time constant (heatsink → ambient), if applicable
+    pub tau_secondary: Option<Duration>,
+
+    /// Dead time before temperature response
+    pub dead_time: Duration,
+
+    /// Expected overshoot ratio (1.0 = none)
+    pub overshoot_ratio: f64,
+
+    /// Workload sensitivity factor
+    pub workload_sensitivity: WorkloadSensitivity,
+}
+
+pub enum WorkloadSensitivity {
+    /// τ is stable across workload types (±10%)
+    Low,
+    /// τ varies moderately with workload (±25%)
+    Medium,
+    /// τ varies significantly (±50%), consider dynamic estimation
+    High,
+}
+```
+
+### D4: Research Report
+
+1. **Executive Summary** — Key findings and recommendations
+2. **Methodology** — Reproducible measurement protocol
+3. **Results** — Time constant database with uncertainty
+4. **Model Recommendations** — First-order adequacy assessment
+5. **Maxwell Integration** — Specific configuration guidance
+
+---
+
+## Success Criteria
+
+| Criterion | Target | Validation Method |
+|-----------|--------|-------------------|
+| Measurement repeatability | CV < 10% across trials | Statistical analysis of repeated runs |
+| Hardware coverage | ≥3 CPUs per TDP class | Inventory checklist |
+| Model fit quality | R² > 0.98 | Curve fitting diagnostics |
+| Dead time characterization | ±20ms accuracy | High-frequency measurement |
+| Second-order detection | Detect effects >5% of signal | Residual analysis |
+| Documentation | Complete, reproducible | Independent reproduction test |
+
+---
+
+## References
+
+### Thermal Modeling
+
+1. Incropera, F.P. & DeWitt, D.P. — *Fundamentals of Heat and Mass Transfer* — Standard reference for thermal analysis
+2. Bar-Cohen, A. & Kraus, A.D. — *Advances in Thermal Modeling of Electronic Components and Systems* — Electronics-specific thermal behavior
+
+### Processor Thermal Behavior
+
+3. Intel — *Thermal Design Power (TDP) and Thermal Management* — Manufacturer thermal specifications
+4. AMD — *Processor Power and Thermal Data Sheet* — AMD thermal characteristics
+5. NVIDIA — *GPU Thermal Design Guide* — GPU-specific thermal considerations
+
+### Control Systems
+
+6. Åström, K.J. & Hägglund, T. — *PID Controllers: Theory, Design, and Tuning* — Dead time compensation and Smith predictors
+7. Skogestad, S. — *Simple analytic rules for model reduction and PID controller tuning* — Practical PID tuning for thermal systems
+
+### Measurement Methodology
+
+8. JEDEC — *JESD51 Series* — Standard thermal measurement methods for semiconductors
+9. ASHRAE — *Thermal Guidelines for Data Processing Environments* — Data center thermal standards
+
+### Prior Art
+
+10. Google — *Machine Learning for Thermal Management* — Data-driven thermal prediction at scale
+11. Microsoft — *Project Natick* — Thermal dynamics in novel cooling environments
--- a/blog/content/notes/003-research-planning/meta.yaml
+++ b/blog/content/notes/003-research-planning/meta.yaml
@ -0,0 +1,372 @@
+id: "003"
+slug: 003-research-planning
+date: "2026-02-07"
+title: Research Planning
+preview: "Identifying unknowns in the roadmap and creating research directives for parallel investigation."
+
+prompts:
+  - id: create-directives
+    label: Create directives
+    content: |
+      /do-parallel for each topic, create a directive for our research team to research and write it in research-requests/{name}.md
+
+  - id: deep-research
+    label: Deep research (Gemini)
+    content: |
+      For each directive in research-requests/:
+      1. Open gemini.google.com/app
+      2. Click Tools → Deep Research
+      3. Paste the entire directive
+      4. Review the comprehensive report
+
+skillsUsed:
+  - name: research-directive
+    command: /research-directive
+    description: Create research directives for deep research tools (Gemini Deep Research, Perplexity, etc.)
+    usage: |
+      ---
+      name: research-directive
+      description: Create research directives for deep research tools (Gemini Deep Research, Perplexity, etc.)
+      ---
+
+      You are a research architect who creates self-contained research directives. These directives are designed to be pasted directly into AI deep research tools like Gemini Deep Research.
+
+      ## When to Use
+
+      - Researching technical unknowns before implementation
+      - Validating assumptions in architecture docs
+      - Investigating feasibility of novel approaches
+      - Finding prior art, papers, and implementations
+
+      ## Directive Structure
+
+      Every research directive follows this format:
+
+      ```markdown
+      # [Topic] Research Directive
+
+      You are [Expert Name], [credentials]. [1-2 sentences on their expertise].
+
+      You are going to research [specific topic] to answer: [core question].
+
+      ---
+
+      ## Context
+
+      [Why this research matters. What decision depends on the answer.]
+
+      ---
+
+      ## Research Questions
+
+      Answer these specific questions:
+
+      1. [Measurable question with success criteria]
+      2. [Measurable question with success criteria]
+      3. [Measurable question with success criteria]
+
+      ---
+
+      ## Methodology
+
+      ### Phase 1: [First research phase]
+      - [Specific search queries or sources]
+      - [What to look for]
+
+      ### Phase 2: [Second research phase]
+      - [Deeper investigation]
+      - [Cross-referencing]
+
+      ### Phase 3: Synthesis
+      - Compare findings across sources
+      - Identify consensus vs. disagreement
+      - Note gaps in available research
+
+      ---
+
+      ## Deliverables
+
+      Produce a report with:
+
+      1. **Executive Summary** - Key findings in 3-5 bullets
+      2. **Detailed Findings** - Answer each research question
+      3. **Evidence** - Citations, benchmarks, code examples
+      4. **Recommendations** - What to do based on findings
+      5. **Open Questions** - What still needs investigation
+
+      ---
+
+      ## Success Criteria
+
+      Research is complete when:
+
+      - [ ] All research questions have evidence-based answers
+      - [ ] Findings include specific numbers/benchmarks where applicable
+      - [ ] Sources are cited and verifiable
+      - [ ] Recommendations are actionable
+      ```
+
+      ## Expert Selection
+
+      Match the expert to the domain:
+
+      | Domain | Expert Example |
+      |--------|----------------|
+      | Systems performance | Brendan Gregg |
+      | Distributed systems | Martin Kleppmann |
+      | Cryptography | Dan Boneh |
+      | ML systems | Jeff Dean |
+      | Mechanism design | Paul Milgrom |
+      | Linux kernel | Linus Torvalds |
+      | Databases | Andy Pavlo |
+
+      ## Output
+
+      Save directives to `research-requests/{topic-slug}.md`
+
+      When creating multiple directives, use `/do-parallel`:
+
+      ```
+      /do-parallel for each topic, create a research directive and write it to research-requests/{name}.md
+      ```
+
+      ## Execution
+
+      After creating directives:
+
+      1. Open [gemini.google.com/app](https://gemini.google.com/app)
+      2. Click **Tools → Deep Research**
+      3. Paste the entire directive
+      4. Wait 5-10 minutes for comprehensive report
+      5. Save results to `research/{topic-slug}.md`
+
+  - name: do-parallel
+    command: /do-parallel
+    description: Execute tasks in parallel waves with optimal agent selection and review
+    usage: |
+      ---
+      description: Execute tasks in parallel waves with optimal agent selection and review
+      argument-hint: <task list or "from todo">
+      allowed-tools: Task, Read, Write, Edit, Glob, Grep, Bash, TodoWrite
+      ---
+
+      Execute these tasks in parallel waves with proper review: $ARGUMENTS
+
+      ## Instructions
+
+      Load the `orchestrated-execution` skill, then:
+
+      ### Philosophy: Do It Right
+
+      **Take your time. No shortcuts.** Every implementation should be:
+
+      - **Clean** - Readable, well-named, minimal complexity
+      - **Maintainable** - Future developers can understand and modify it
+      - **Extensible** - Easy to add features without rewriting
+      - **Refactored** - If existing code is messy, clean it up as you go
+
+      When you encounter code that could be better:
+      - Refactor it. Don't work around bad patterns.
+      - Extract helpers, rename unclear variables, simplify nesting
+      - Leave the codebase better than you found it
+
+      **Prefer proper solutions over quick fixes.** A 50-line clean implementation beats a 10-line hack.
+
+      ### Phase 1: Parse & Analyze
+
+      1. **Parse tasks** - From todo list or provided
+      2. **Analyze dependencies** - Which tasks depend on which
+      3. **Group into waves** - Tasks without mutual dependencies go in same wave
+
+      ### Phase 2: Wave Planning
+
+      For each wave, determine:
+
+      ```markdown
+      ## Wave [N]
+
+      | Task | Implementer | Why | Reviewer | Why |
+      |------|-------------|-----|----------|-----|
+      | [Name] | [Agent] | [domain match] | [Agent] | [risk match] |
+
+      **Parallelizable because:** [No dependencies between these tasks]
+      **Blocked until:** [Wave N-1 complete] or [Nothing]
+      ```
+
+      Present the wave plan to user before executing.
+
+      ### Phase 3: Execute Each Wave
+
+      ```
+      Wave N:
+      ┌─────────────────────────────────────────────┐
+      │  1. LAUNCH ALL IMPLEMENTERS (parallel)      │
+      │                                             │
+      │     Task(agent1, task1) ──┐                 │
+      │     Task(agent2, task2) ──┼── concurrent    │
+      │     Task(agent3, task3) ──┘                 │
+      └─────────────────────────────────────────────┘
+                    │
+                    ▼
+      ┌─────────────────────────────────────────────┐
+      │  2. COLLECT RESULTS                         │
+      │     Wait for all to complete                │
+      │     Gather implementation outputs           │
+      └─────────────────────────────────────────────┘
+                    │
+                    ▼
+      ┌─────────────────────────────────────────────┐
+      │  3. LAUNCH ALL REVIEWERS (parallel)         │
+      │                                             │
+      │    Task(reviewer1, review1) ──┐             │
+      │    Task(reviewer2, review2) ──┼── concurrent│
+      │    Task(reviewer3, review3) ──┘             │
+      └─────────────────────────────────────────────┘
+                    │
+                    ▼
+      ┌─────────────────────────────────────────────┐
+      │  4. PROCESS REVIEW RESULTS                  │
+      │                                             │
+      │     PASS → mark complete                    │
+      │     NEEDS_FIX → fix loop (can parallelize)  │
+      │     BLOCK → escalate immediately            │
+      └─────────────────────────────────────────────┘
+                    │
+                    ▼
+      ┌─────────────────────────────────────────────┐
+      │  5. VERIFY WAVE COMPLETE                    │
+      │     All tasks in wave done?                 │
+      │     Any conflicts from parallel execution?  │
+      └─────────────────────────────────────────────┘
+                    │
+                    ▼
+              Continue to Wave N+1
+      ```
+
+      ### Phase 4: Loose Ends (Critical for Parallel)
+
+      After all waves, explicitly check:
+
+      1. **File conflicts** - Did parallel tasks modify same files?
+      2. **Integration gaps** - Do the pieces work together?
+      3. **Merge issues** - Any conflicting changes to resolve?
+      4. **Cross-cutting** - Consistent patterns across all tasks?
+      5. **Quality gate** - Full build, test, lint
+
+      ### Phase 5: Final Report
+
+      ```markdown
+      ## Parallel Execution Complete
+
+      ### Wave Summary
+      | Wave | Tasks | Parallel Time | Status |
+      |------|-------|---------------|--------|
+      | 1 | 3 | ~2min | ✓ |
+      | 2 | 2 | ~1min | ✓ |
+      | 3 | 1 | ~1min | ✓ |
+
+      ### Task Details
+      | Task | Wave | Implementer | Reviewer | Issues Fixed | Status |
+      |------|------|-------------|----------|--------------|--------|
+
+      ### Loose Ends Resolved
+      - [Conflicts fixed]
+      - [Integration issues addressed]
+
+      ### Quality Gate
+      - Build: PASS
+      - Tests: PASS
+      - Lint: PASS
+      ```
+
+      ## Dependency Detection
+
+      Tasks depend on each other when:
+
+      ```
+      Task A: "Create User model"
+      Task B: "Add validation to User model"  ← Depends on A
+
+      Task A: "Implement auth backend"
+      Task B: "Write auth tests"              ← Depends on A
+
+      Task A: "Update config schema"
+      Task B: "Migrate existing configs"      ← Depends on A
+      ```
+
+      Tasks are independent when:
+
+      ```
+      Task A: "Add logging to ingestion"
+      Task B: "Add metrics to query"          ← Different modules, independent
+
+      Task A: "Write User docs"
+      Task B: "Write Config docs"             ← Different files, independent
+      ```
+
+      ## Agent Selection
+
+      ### Implementers
+
+      Select agents based on the task domain. Examples:
+
+      | Task Type | Agent Example |
+      |-----------|---------------|
+      | Backend code | `primary-developer` |
+      | Tests | `quality-engineer` |
+      | Auth | `auth-engineer` |
+      | Storage | `storage-architect` |
+      | K8s/Ops | `ops-engineer` |
+      | UI | `ux-prototyper` |
+      | Docs | `tech-doc-writer` |
+
+      Check `~/.claude/agents/` for available agents.
+
+      ### Reviewers
+      | Risk Type | Reviewer Example |
+      |-----------|------------------|
+      | Code quality | `quality-engineer` |
+      | Security | `security-reviewer` |
+      | Performance | `performance-lead` |
+      | E2E | `e2e-validator` |
+
+      ## Critical Rules
+
+      - NEVER put dependent tasks in same wave
+      - ALWAYS review even in parallel mode
+      - ALWAYS check for conflicts after parallel execution
+      - ALWAYS run quality gate at the end
+      - ANNOUNCE wave plan before executing
+      - MAX 3 fix cycles per task, then escalate
+
+filesCreated:
+  - name: firecracker-latency-benchmarks.md
+    description: Benchmarking pause/resume latency for thermal emergencies
+  - name: ebpf-overhead-validation.md
+    description: Validating eBPF instrumentation overhead under production load
+  - name: rapl-accuracy-calibration.md
+    description: Calibrating Intel RAPL power reporting accuracy across hardware
+  - name: thermal-time-constant-validation.md
+    description: Measuring actual thermal time constants for different hardware
+  - name: thermal-coupling-measurement.md
+    description: Quantifying heat transfer between adjacent cores
+  - name: gsp-thermal-stability.md
+    description: Analyzing GSP auction stability under thermal dynamics
+  - name: high-frequency-auction-research.md
+    description: Researching 100Hz market clearing without CPU overhead
+  - name: power-trace-verification.md
+    description: Distinguishing inference from mining via power signatures
+  - name: proof-of-inference.md
+    description: Cryptographic verification of ML workload execution
+  - name: thermal-gossip-consensus-research.md
+    description: Coordinating thermal state across distributed clusters
+
+navigation:
+  prev:
+    slug: 002-building-the-scaffolding
+    id: "002"
+    title: Understanding the Project
+  next:
+    slug: 004-hydrating-the-roadmap
+    id: "004"
+    title: Hydrating the Roadmap
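Note on the `/do-parallel` skill recorded in the file above: Phase 1 groups tasks into waves so that dependent tasks never share a wave. One way to picture that grouping is the minimal TypeScript sketch below. It is not the skill's implementation (the skill is a prompt and names no algorithm); the `Task` shape, the `dependsOn` field, and `groupIntoWaves` are hypothetical names chosen for illustration.

```typescript
// Hypothetical task shape; the skill text does not define a data model.
interface Task {
  name: string;
  dependsOn: string[]; // names of tasks that must finish first
}

// Place each task in the earliest wave whose predecessors are all complete.
function groupIntoWaves(tasks: Task[]): string[][] {
  const done: string[] = [];
  const waves: string[][] = [];
  let remaining = tasks.slice();

  while (remaining.length > 0) {
    // Ready = every dependency finished in an earlier wave.
    const ready = remaining.filter((t) =>
      t.dependsOn.every((d) => done.indexOf(d) !== -1)
    );
    if (ready.length === 0) {
      // A cycle, or a dependency on a task that was never listed.
      throw new Error(
        "Unresolvable dependencies: " + remaining.map((t) => t.name).join(", ")
      );
    }
    const names = ready.map((t) => t.name);
    waves.push(names);
    names.forEach((n) => done.push(n));
    remaining = remaining.filter((t) => names.indexOf(t.name) === -1);
  }
  return waves;
}

// Example using task names from the skill's dependency examples.
const exampleWaves = groupIntoWaves([
  { name: "Create User model", dependsOn: [] },
  { name: "Add validation to User model", dependsOn: ["Create User model"] },
  { name: "Write Config docs", dependsOn: [] },
]);
console.log(exampleWaves);
```

With these sample tasks, the two independent tasks land in the first wave and the validation task waits for the second. Throwing on an empty `ready` set is a design choice of this sketch: it surfaces circular or missing dependencies instead of looping forever.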
--- a/blog/content/notes/004-hydrating-the-roadmap/content.md
+++ b/blog/content/notes/004-hydrating-the-roadmap/content.md
@ -0,0 +1,48 @@
+## The Goal
+
+This note is about detailed planning and gut checking. Two questions:
+
+1. Can we get a detailed roadmap?
+2. Are we confident we can accomplish it?
+
+## Step 1: Hydrate the Planning Docs
+
+With research complete in `research/*`, apply it back to the planning documents:
+
+> Take all of the research in research/* and apply it to our *.md docs
+
+This updates `vision.md`, `architecture.md`, and `roadmap.md` with validated approaches, concrete numbers, and working code patterns.
+
+## Step 2: Expand Roadmap Steps
+
+Each step in the roadmap needs detailed implementation guidance. I used `/do-parallel` to expand all steps simultaneously:
+
+> /do-parallel for each step in roadmap.md, write it in steps/{name}.md. use the research in research/*
+
+This creates a detailed implementation guide for each step, incorporating the research findings. The `steps/` directory becomes a collection of actionable implementation documents.
+
+## Step 3: Confidence Check
+
+With the detailed roadmap and expanded steps, ask for a gut check:
+
+> Read through the roadmap and the steps, respond with a confidence score 1-100 of whether or not you think we will succeed, then list things that will make us more confident
+
+## Methodology: The Confidence Check
+
+There are many ways to do this confidence check:
+
+- **Ask for gaps** — What needs more specification?
+- **Ask for unknowns** — What needs more research?
+- **Ask for spikes** — Do we need smaller experiments to validate assumptions?
+- **Ask for risks** — What could go wrong?
+- **Ask for dependencies** — What are we waiting on?
+
+In this example, I kept it simple: I asked only for a confidence score and a list of things that would increase confidence. When the score came back high, I accepted it and moved on. This is risky, because AI is often overconfident. A proper confidence check using the approaches above would surface more issues.
+
+## What I Did
+
+I ran the confidence check, got a high score, noted the suggestions for improvement, and moved on. The suggestions went into backlog items and future research topics.
+
+## Next Steps
+
+Start building.
--- a/blog/content/notes/004-hydrating-the-roadmap/meta.yaml
+++ b/blog/content/notes/004-hydrating-the-roadmap/meta.yaml
@ -0,0 +1,35 @@
+id: "004"
+slug: 004-hydrating-the-roadmap
+date: "2026-02-07"
+title: Hydrating the Roadmap
+preview: "Expanding roadmap steps with research findings and running a confidence check."
+
+prompts:
+  - id: hydrate-docs
+    label: Hydrate planning docs
+    content: |
+      Take all of the research in research/* and apply it to our *.md docs
+
+  - id: expand-steps
+    label: Expand roadmap steps
+    content: |
+      /do-parallel for each step in roadmap.md, write it in steps/{name}.md. use the research in research/*
+
+  - id: confidence-check
+    label: Confidence check
+    content: |
+      Read through the roadmap and the steps, respond with a confidence score 1-100 of whether or not you think we will succeed, then list things that will make us more confident
+
+skillsUsed: []
+
+filesCreated: []
+
+navigation:
+  prev:
+    slug: 003-research-planning
+    id: "003"
+    title: Research Planning
+  next:
+    slug: 005-project-setup
+    id: "005"
+    title: Project Setup
--- a/blog/content/notes/005-project-setup/content.md
+++ b/blog/content/notes/005-project-setup/content.md
@ -0,0 +1,83 @@
+## The Goal
+
+Before writing any code, set up two things:
+
+1. **Expert agents** — Who would be best at working on this project?
+2. **Quality hooks** — What checks run before every commit?
+
+## Step 1: Pick Expert Agents
+
+I asked: What 3 people in the world would be best at working on this project? Pick from a diverse set.
+
+Looking at Maxwell's intersection of hypervisors, eBPF, Rust, and thermodynamics-inspired agent control:
+
+1. **Bryan Cantrill** — CTO of Oxide Computer, former Sun distinguished engineer. Co-created DTrace, now deep in Rust systems programming and hypervisor design. He literally thinks about computation as a physical phenomenon. His "software as materials science" philosophy aligns perfectly with the thermodynamic metaphor.
+
+2. **Brendan Gregg** — The eBPF authority. Wrote the book on BPF performance tools. If Maxwell uses eBPF to observe and constrain agent processes at the kernel level, nobody understands that observability/control plane better.
+
+3. **Joscha Bach** — AI researcher who frames cognition and agency through physics and thermodynamics. He thinks about agents as dissipative structures, intelligence as entropy management, and computation as a physical process. Brings the theoretical grounding for why "thermodynamic hypervisor" isn't just a metaphor.
+
+Why this mix works:
+
+- **Cantrill**: the Rust hypervisor craft
+- **Gregg**: the eBPF instrumentation layer
+- **Bach**: the agent-as-thermodynamic-system philosophy
+
+All three are known for strong, opinionated thinking about their domains — they wouldn't just build it, they'd argue about whether it should be built this way.
+
+## Step 2: Create the Agents
+
+With the experts identified, I used `/do-parallel` to create all three agents simultaneously:
+
+> /do-parallel /create-agent for each
+
+Each agent gets their own file in `.claude/agents/` with their expertise, philosophy, and approach encoded.
+
+## Step 3: Scaffold the Project
+
+> /scaffold
+
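+This sets up the project skeleton and quality hooks. Per the `/scaffold` skill, the skeleton is only what the language's standard init command produces: a hello-world project that compiles, with no types, traits, or stubs yet. But the most important part is the quality hooks.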
+
+Before writing any code, we need checks that run on every commit:
+
+- **Code complexity** — Keep functions simple
+- **File length** — No 2000-line files
+- **Dead code** — Remove unused code
+- **Circular dependencies** — Clean architecture
+- **Similar code** — DRY violations
+- **Linting** — Consistent style
+
+`/scaffold` calls `/setup-hooks` internally. Every commit runs through these checks. If they fail, the commit is blocked.
+
+## Step 4: Create Docs
+
+> Create a basic readme.md and quickstart.md
+
+The README explains what the project is. The quickstart explains how to get it running. Both should be minimal — just enough to onboard someone.
+
+## Step 5: Verify Setup
+
+> Run through the quickstart and pre-commit hooks and make sure everything works properly
+
+Follow your own quickstart. Make a trivial change and commit it. If the hooks fail or the quickstart is wrong, fix it now.
+
+## Why Quality Hooks First
+
+It's tempting to skip this and "add it later." Don't.
+
+Quality hooks installed at project start:
+- Catch issues immediately
+- Build good habits from day one
+- Never accumulate debt
+
+Quality hooks added later:
+- Hundreds of existing violations
+- Painful cleanup sprint
+- Team resistance
+
+The best time to install quality hooks is before the first line of code.
+
+## Next Steps
+
+Start building.
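Note on Step 3 of the note above: the hook behaviour it describes (checks run on every commit, and a failing check blocks the commit) can be sketched in TypeScript as follows. This is an illustration under stated assumptions, not the project's actual hook: `StagedFile`, `countLines`, `checkFileLength`, and `exitCodeFor` are hypothetical names, the 500-line limit is the default quoted in the `/setup-hooks` skill, and a real hook would read staged files from git rather than from an in-memory array.

```typescript
// Hypothetical in-memory view of a staged file.
interface StagedFile {
  path: string;
  content: string;
}

const MAX_FILE_LINES = 500; // default quoted in the /setup-hooks skill

// Count lines without counting the empty piece after a trailing newline.
function countLines(content: string): number {
  if (content.length === 0) return 0;
  const parts = content.split("\n");
  const endsWithNewline = content.charAt(content.length - 1) === "\n";
  return endsWithNewline ? parts.length - 1 : parts.length;
}

// Return one message per file that exceeds the limit, including a fix hint.
function checkFileLength(
  files: StagedFile[],
  maxLines: number = MAX_FILE_LINES
): string[] {
  const violations: string[] = [];
  for (const file of files) {
    const lines = countLines(file.content);
    if (lines > maxLines) {
      violations.push(
        file.path + ": " + lines + " lines (max " + maxLines + "); split the file"
      );
    }
  }
  return violations;
}

// A pre-commit hook blocks the commit by exiting with a non-zero status.
function exitCodeFor(violations: string[]): number {
  return violations.length === 0 ? 0 : 1;
}
```

Keeping the check a pure function over file contents makes it easy to test in isolation; the wiring to git and the process exit code can then stay a thin wrapper.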
--- a/blog/content/notes/005-project-setup/meta.yaml
+++ b/blog/content/notes/005-project-setup/meta.yaml
@ -0,0 +1,216 @@
+id: "005"
+slug: 005-project-setup
+date: "2026-02-07"
+title: Project Setup
+preview: "Creating expert agents and setting up quality hooks before writing any code."
+
+prompts:
+  - id: pick-agents
+    label: Pick expert agents
+    content: |
+      What 3 people in the world would be best at working on this project? Pick from a diverse set.
+
+  - id: create-agents
+    label: Create agents
+    content: |
+      /do-parallel /create-agent for each
+
+  - id: scaffold
+    label: Scaffold project
+    content: |
+      /scaffold
+
+  - id: create-docs
+    label: Create docs
+    content: |
+      Create a basic readme.md and quickstart.md
+
+  - id: verify
+    label: Verify setup
+    content: |
+      Run through the quickstart and pre-commit hooks and make sure everything works properly
+
+skillsUsed:
+  - name: create-agent
+    command: /create-agent
+    description: Create a new Claude Code agent for specialized tasks
+
+  - name: scaffold
+    command: /scaffold
+    description: Initialize a new project with hello world + quality hooks
+    usage: |
+      ---
+      name: project-skeleton
+      description: Initialize a new project with hello world + quality hooks. Use when starting any new project.
+      ---
+
+      # Project Skeleton
+
+      ## Identity
+
+      You run `cargo new` and `/setup-hooks`. That's it.
+
+      ## What This Does
+
+      1. Initialize a hello world project
+      2. Install pre-commit hooks
+      3. Verify it builds
+
+      ## Protocol
+
+      ### Phase 1: Detect or Ask Project Type
+
+      | Signal | Type | Init Command |
+      |--------|------|--------------|
+      | `Cargo.toml` exists or "Rust" mentioned | Rust | `cargo new` or `cargo init` |
+      | `package.json` exists or "TypeScript/Node" mentioned | TypeScript | `npm init -y && npm i -D typescript && npx tsc --init` |
+      | `go.mod` exists or "Go" mentioned | Go | `go mod init` |
+      | `pyproject.toml` exists or "Python" mentioned | Python | `uv init` or `poetry init` |
+
+      If unclear, ask.
+
+      ### Phase 2: Initialize
+
+      Run the init command. That's the scaffold.
+
+      For workspaces/monorepos, create the workspace manifest and one hello-world member.
+
+      ### Phase 3: Setup Hooks
+
+      Run `/setup-hooks` or apply the `quality-gates` skill.
+
+      ### Phase 4: Verify
+
+      Run the build command:
+      - Rust: `cargo check`
+      - TypeScript: `npx tsc --noEmit`
+      - Go: `go build ./...`
+      - Python: `python -c "import <package>"`
+
+      If it doesn't build, fix it until it does.
+
+      ## Do
+
+      1. Run the language's standard init command
+      2. Install pre-commit hooks
+      3. Verify it builds
+
+      ## Do Not
+
+      1. Write application code
+      2. Define types, traits, interfaces
+      3. Create stubs or mocks
+      4. Add dependencies beyond hello world
+      5. Create multiple files beyond what init produces
+      6. Read project specs/roadmaps and implement them
+
+      ## Constraints
+
+      - NEVER write more than what `cargo new` / `npm init` / `go mod init` produces
+      - ALWAYS install hooks before declaring done
+      - ALWAYS verify the build succeeds
+
+  - name: setup-hooks
+    command: /setup-hooks
+    description: Set up and maintain pre-commit hooks and CI quality checks
+    usage: |
+      ---
+      name: quality-gates
+      description: Set up and maintain pre-commit hooks and CI quality checks. Use when configuring automated quality enforcement.
+      ---
+
+      # Quality Gates
+
+      Enforce code quality automatically. Catch problems on commit, not in CI.
+
+      ## Check Categories
+
+      ### Pre-commit (fast, <10s, staged files only)
+
+      | Check | Purpose | Auto-fix? |
+      |-------|---------|-----------|
+      | Formatting | Consistent style | YES |
+      | Import sorting | Organized imports | YES |
+      | Linting | Bug patterns, code smells | PARTIAL |
+      | Type checking | Type safety | NO |
+      | File length | Maintainability (max 500) | NO |
+      | Function length | Readability (max 100) | NO |
+      | Complexity | Cognitive load (max 15-25) | NO |
+
+      ### CI (slow, full repo)
+
+      | Check | Purpose |
+      |-------|---------|
+      | Circular dependencies | Module health |
+      | Code duplication | DRY violations |
+      | Dead code | Unused exports/functions |
+      | Security audit | Vulnerabilities |
+      | Test coverage | Quality gates |
+
+      ## Tool Matrix
+
+      | Check | Go | TypeScript | Rust | Python |
+      |-------|-----|------------|------|--------|
+      | Format | gofmt | prettier | rustfmt | black/ruff |
+      | Imports | goimports | eslint-plugin-import | rustfmt | isort/ruff |
+      | Lint | golangci-lint | eslint | clippy | ruff |
+      | Types | compiler | tsc | compiler | mypy/pyright |
+      | Complexity | gocyclo | eslint complexity | clippy | radon/ruff |
+      | Circular | go-cycles | madge | cargo-depgraph | pydeps |
+      | Duplication | dupl | jscpd | jscpd | jscpd |
+      | Dead code | deadcode | knip/ts-prune | rustc `dead_code` lint | vulture |
+
+      ## Hook Structure
+
+      Use two-phase approach:
+
+      ```
+      PHASE 1: AUTO-FIX
+        → Run formatters on staged files
+        → Run linters with --fix
+        → Re-stage fixed files
+
+      PHASE 2: VERIFY
+        → Check formatting (should pass after phase 1)
+        → Run linting (unfixable issues)
+        → Type check
+        → File length check
+        → Complexity check
+        → Custom project rules
+      ```
+
+      ## Threshold Defaults
+
+      | Metric | Default | Rationale |
+      |--------|---------|-----------|
+      | File length | 500 lines | Fits in head |
+      | Function length | 100 lines | Single responsibility |
+      | Cyclomatic complexity | 15-25 | Testable |
+      | Duplication | 5+ lines | Worth abstracting |
+      | Max pre-commit time | 10s | Won't get disabled |
+
+      ## Do
+
+      - Keep pre-commit under 10 seconds
+      - Check staged files only
+      - Auto-fix then verify
+      - Re-stage fixed files
+      - Provide fix commands in errors
+      - Split slow checks to CI
+
+      ## Do Not
+
+      - Run full test suite on commit
+      - Check entire codebase on commit
+      - Make hooks so slow they get disabled
+      - Fail without explaining how to fix
+      - Skip the auto-fix phase
+
+filesCreated: []
+
+navigation:
+  prev:
+    slug: 004-hydrating-the-roadmap
+    id: "004"
+    title: Hydrating the Roadmap
+  next: null
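Note on the "Hook Structure" section of the `/setup-hooks` skill recorded in the file above: the two-phase approach (auto-fix, then verify) can be sketched in TypeScript roughly as below. Everything here is an assumption made for illustration: the `Fixer`, `Check`, and `HookResult` types, the `runTwoPhaseHook` function, and the two sample rules are hypothetical, and a real hook would call the formatters and linters from the tool matrix rather than inline functions.

```typescript
// Hypothetical types: a fixer rewrites content, a check reports problems.
type Fixer = (content: string) => string;
type Check = (content: string) => string[];

interface HookResult {
  content: string;  // content after auto-fixes
  restage: boolean; // true when a fixer changed the content
  errors: string[]; // unfixable issues; non-empty means block the commit
}

function runTwoPhaseHook(
  content: string,
  fixers: Fixer[],
  checks: Check[]
): HookResult {
  // Phase 1: auto-fix.
  let fixed = content;
  for (const fix of fixers) {
    fixed = fix(fixed);
  }

  // Phase 2: verify the fixed content, collecting every remaining issue.
  let errors: string[] = [];
  for (const check of checks) {
    errors = errors.concat(check(fixed));
  }

  return { content: fixed, restage: fixed !== content, errors: errors };
}

// Sample fixer: strip trailing spaces and tabs from every line.
const stripTrailingWhitespace: Fixer = (c) => c.replace(/[ \t]+$/gm, "");

// Sample check: flag a leftover debugger statement and say how to fix it.
const noDebugger: Check = (c) =>
  c.indexOf("debugger;") !== -1
    ? ["found 'debugger;': remove it before committing"]
    : [];

const hookResult = runTwoPhaseHook(
  "let a = 1;  \ndebugger;\n",
  [stripTrailingWhitespace],
  [noDebugger]
);
console.log(hookResult);
```

Running the checks on the already-fixed content mirrors the skill's ordering: formatting problems are repaired silently, so only issues a tool cannot fix reach the developer.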
--- a/blog/content/projects/maxwell.yaml
+++ b/blog/content/projects/maxwell.yaml
@ -0,0 +1,13 @@
+id: maxwell
+title: Maxwell
+subtitle: A Thermodynamic Hypervisor for Autonomous Agents
+status: In Progress
+slug: maxwell
+
+intro:
+  - Occasionally, I pick a topic that is adjacent to my current work but unfamiliar to me, and run a quick research project that is well out of my depth.
+  - Below is a journal of how I use AI throughout that process.
+
+whitePaper:
+  href: /maxwell/white-paper
+  label: "Maxwell: A Thermodynamic Hypervisor for Autonomous Agents"
--- a/blog/content/white-paper/outline.yaml
+++ b/blog/content/white-paper/outline.yaml
@ -0,0 +1,78 @@
+status: DRAFT — IN PROGRESS
+title: "Maxwell: A Thermodynamic Hypervisor for Autonomous Agents"
+date: February 2026
+
+abstract: |
+  Abstract will synthesize the core thesis once research is complete. Current working hypothesis: Treating compute as a scarce economic resource with thermodynamic constraints enables natural selection for useful work among autonomous agents.
+
+sections:
+  - number: 1
+    title: "The Problem: Fairness is a Bug"
+    bullets:
+      - Why CFS (Completely Fair Scheduler) fails for autonomous agents
+      - "The three pathologies: resource squatting, thermal tragedy, no natural selection"
+      - "Core thesis: fairness for humans ≠ fairness for agents"
+
+  - number: 2
+    title: "The Insight: Physics as Policy"
+    bullets:
+      - CPUs as thermodynamic systems, not abstract timesharing
+      - Price as a function of temperature and thermal headroom
+      - "The feedback loop: parasites heat → price rises → parasites die"
+
+  - number: 3
+    title: The Mechanism
+    bullets:
+      - "**3.1 GSP Auction:** Generalized Second-Price for compute time"
+      - "**3.2 Energy Wallets:** Agents earn/spend in $JOULE"
+      - "**3.3 Landauer's Tax:** Thermodynamic cost of memory erasure"
+      - "**3.4 Apoptosis:** Graceful termination at insolvency"
+
+  - number: 4
+    title: "The Experiment: Scientist vs. Leech"
+    bullets:
+      - "Experimental design: control (Linux CFS) vs treatment (Maxwell)"
+      - "Agent profiles: productive (prime finding) vs parasitic (hash mining)"
+      - "Metrics: efficiency (primes/joule), thermal behavior, survival time"
+      - "Hypothesis: Maxwell achieves ≥1.8x efficiency improvement"
+
+  - number: 5
+    title: "Proof of Inference: Power-Trace Verification"
+    bullets:
+      - "The verification problem: is the agent actually inferring?"
+      - "Thermodynamic fingerprinting: distinct power signatures by workload type"
+      - "Research validation: 89-100% accuracy distinguishing workload classes"
+      - "Tiered verification: power-trace → TEE → zkML"
+
+  - number: 6
+    title: Applications
+    bullets:
+      - "**6.1 DePIN:** Continuous attestation for decentralized compute"
+      - "**6.2 Agent Fleets:** Decentralized value discovery via auction"
+      - "**6.3 GPU Clusters:** Filling idle cycles, 80%+ utilization"
+
+  - number: 7
+    title: Related Work
+    bullets:
+      - "Economic scheduling: Spawn, Mariposa, resource markets"
+      - "Power-aware scheduling: RAPL, thermal governors"
+      - "Verifiable compute: zkML, TEE attestation, PoUW"
+
+  - number: 8
+    title: Limitations and Future Work
+    bullets:
+      - "Current: single-node only, CPU focus"
+      - "Future: multi-node coordination, GPU integration"
+      - "Open questions: value discovery mechanisms, agent incentive alignment"
+
+  - number: 9
+    title: Conclusion
+    bullets:
+      - Summary of contribution
+      - When does the world need this?
+      - "\"Fairness is a bug; Maxwell is the fix\""
+
+currentStatus:
+  note: |
+    This paper is being written alongside the research documented in the research notes. Each section will be filled in as experiments are run and insights are validated. The outline above represents the target structure.
+  notesLink: /maxwell
--- a/blog/eslint.config.mjs
+++ b/blog/eslint.config.mjs
@ -0,0 +1,18 @@
+import { defineConfig, globalIgnores } from "eslint/config";
+import nextVitals from "eslint-config-next/core-web-vitals";
+import nextTs from "eslint-config-next/typescript";
+
+const eslintConfig = defineConfig([
+  ...nextVitals,
+  ...nextTs,
+  // Override default ignores of eslint-config-next.
+  globalIgnores([
+    // Default ignores of eslint-config-next:
+    ".next/**",
+    "out/**",
+    "build/**",
+    "next-env.d.ts",
+  ]),
+]);
+
+export default eslintConfig;
--- a/blog/next.config.ts
+++ b/blog/next.config.ts
@ -0,0 +1,7 @@
+import type { NextConfig } from "next";
+
+const nextConfig: NextConfig = {
+  /* config options here */
+};
+
+export default nextConfig;
--- a/blog/package-lock.json
+++ b/blog/package-lock.json
--- a/blog/package.json
+++ b/blog/package.json
@ -0,0 +1,36 @@
+{
+	"name": "maxwell-blog",
+	"version": "0.1.0",
+	"private": true,
+	"scripts": {
+		"dev": "next dev --port 19197",
+		"build": "next build",
+		"start": "next start --port 19197",
+		"lint": "eslint"
+	},
+	"dependencies": {
+		"@radix-ui/react-slot": "^1.2.4",
+		"class-variance-authority": "^0.7.1",
+		"clsx": "^2.1.1",
+		"js-yaml": "^4.1.1",
+		"lucide-react": "^0.563.0",
+		"next": "16.1.6",
+		"react": "19.2.3",
+		"react-dom": "19.2.3",
+		"react-markdown": "^10.1.0",
+		"remark-gfm": "^4.0.1",
+		"tailwind-merge": "^3.4.0"
+	},
+	"devDependencies": {
+		"@tailwindcss/postcss": "^4",
+		"@types/js-yaml": "^4.0.9",
+		"@types/node": "^20",
+		"@types/react": "^19",
+		"@types/react-dom": "^19",
+		"eslint": "^9",
+		"eslint-config-next": "16.1.6",
+		"tailwindcss": "^4",
+		"tw-animate-css": "^1.4.0",
+		"typescript": "^5"
+	}
+}
--- a/blog/postcss.config.mjs
+++ b/blog/postcss.config.mjs
@ -0,0 +1,7 @@
+const config = {
+  plugins: {
+    "@tailwindcss/postcss": {},
+  },
+};
+
+export default config;
--- a/blog/public/file.svg
+++ b/blog/public/file.svg
@ -0,0 +1 @@
+<svg fill="none" viewBox="0 0 16 16" xmlns="http://www.w3.org/2000/svg"><path d="M14.5 13.5V5.41a1 1 0 0 0-.3-.7L9.8.29A1 1 0 0 0 9.08 0H1.5v13.5A2.5 2.5 0 0 0 4 16h8a2.5 2.5 0 0 0 2.5-2.5m-1.5 0v-7H8v-5H3v12a1 1 0 0 0 1 1h8a1 1 0 0 0 1-1M9.5 5V2.12L12.38 5zM5.13 5h-.62v1.25h2.12V5zm-.62 3h7.12v1.25H4.5zm.62 3h-.62v1.25h7.12V11z" clip-rule="evenodd" fill="#666" fill-rule="evenodd"/></svg>
--- a/blog/public/globe.svg
+++ b/blog/public/globe.svg
@ -0,0 +1 @@
+<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16"><g clip-path="url(#a)"><path fill-rule="evenodd" clip-rule="evenodd" d="M10.27 14.1a6.5 6.5 0 0 0 3.67-3.45q-1.24.21-2.7.34-.31 1.83-.97 3.1M8 16A8 8 0 1 0 8 0a8 8 0 0 0 0 16m.48-1.52a7 7 0 0 1-.96 0H7.5a4 4 0 0 1-.84-1.32q-.38-.89-.63-2.08a40 40 0 0 0 3.92 0q-.25 1.2-.63 2.08a4 4 0 0 1-.84 1.31zm2.94-4.76q1.66-.15 2.95-.43a7 7 0 0 0 0-2.58q-1.3-.27-2.95-.43a18 18 0 0 1 0 3.44m-1.27-3.54a17 17 0 0 1 0 3.64 39 39 0 0 1-4.3 0 17 17 0 0 1 0-3.64 39 39 0 0 1 4.3 0m1.1-1.17q1.45.13 2.69.34a6.5 6.5 0 0 0-3.67-3.44q.65 1.26.98 3.1M8.48 1.5l.01.02q.41.37.84 1.31.38.89.63 2.08a40 40 0 0 0-3.92 0q.25-1.2.63-2.08a4 4 0 0 1 .85-1.32 7 7 0 0 1 .96 0m-2.75.4a6.5 6.5 0 0 0-3.67 3.44 29 29 0 0 1 2.7-.34q.31-1.83.97-3.1M4.58 6.28q-1.66.16-2.95.43a7 7 0 0 0 0 2.58q1.3.27 2.95.43a18 18 0 0 1 0-3.44m.17 4.71q-1.45-.12-2.69-.34a6.5 6.5 0 0 0 3.67 3.44q-.65-1.27-.98-3.1" fill="#666"/></g><defs><clipPath id="a"><path fill="#fff" d="M0 0h16v16H0z"/></clipPath></defs></svg>
--- a/blog/public/next.svg
+++ b/blog/public/next.svg
@ -0,0 +1 @@
+<svg xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 394 80"><path fill="#000" d="M262 0h68.5v12.7h-27.2v66.6h-13.6V12.7H262V0ZM149 0v12.7H94v20.4h44.3v12.6H94v21h55v12.6H80.5V0h68.7zm34.3 0h-17.8l63.8 79.4h17.9l-32-39.7 32-39.6h-17.9l-23 28.6-23-28.6zm18.3 56.7-9-11-27.1 33.7h17.8l18.3-22.7z"/><path fill="#000" d="M81 79.3 17 0H0v79.3h13.6V17l50.2 62.3H81Zm252.6-.4c-1 0-1.8-.4-2.5-1s-1.1-1.6-1.1-2.6.3-1.8 1-2.5 1.6-1 2.6-1 1.8.3 2.5 1a3.4 3.4 0 0 1 .6 4.3 3.7 3.7 0 0 1-3 1.8zm23.2-33.5h6v23.3c0 2.1-.4 4-1.3 5.5a9.1 9.1 0 0 1-3.8 3.5c-1.6.8-3.5 1.3-5.7 1.3-2 0-3.7-.4-5.3-1s-2.8-1.8-3.7-3.2c-.9-1.3-1.4-3-1.4-5h6c.1.8.3 1.6.7 2.2s1 1.2 1.6 1.5c.7.4 1.5.5 2.4.5 1 0 1.8-.2 2.4-.6a4 4 0 0 0 1.6-1.8c.3-.8.5-1.8.5-3V45.5zm30.9 9.1a4.4 4.4 0 0 0-2-3.3 7.5 7.5 0 0 0-4.3-1.1c-1.3 0-2.4.2-3.3.5-.9.4-1.6 1-2 1.6a3.5 3.5 0 0 0-.3 4c.3.5.7.9 1.3 1.2l1.8 1 2 .5 3.2.8c1.3.3 2.5.7 3.7 1.2a13 13 0 0 1 3.2 1.8 8.1 8.1 0 0 1 3 6.5c0 2-.5 3.7-1.5 5.1a10 10 0 0 1-4.4 3.5c-1.8.8-4.1 1.2-6.8 1.2-2.6 0-4.9-.4-6.8-1.2-2-.8-3.4-2-4.5-3.5a10 10 0 0 1-1.7-5.6h6a5 5 0 0 0 3.5 4.6c1 .4 2.2.6 3.4.6 1.3 0 2.5-.2 3.5-.6 1-.4 1.8-1 2.4-1.7a4 4 0 0 0 .8-2.4c0-.9-.2-1.6-.7-2.2a11 11 0 0 0-2.1-1.4l-3.2-1-3.8-1c-2.8-.7-5-1.7-6.6-3.2a7.2 7.2 0 0 1-2.4-5.7 8 8 0 0 1 1.7-5 10 10 0 0 1 4.3-3.5c2-.8 4-1.2 6.4-1.2 2.3 0 4.4.4 6.2 1.2 1.8.8 3.2 2 4.3 3.4 1 1.4 1.5 3 1.5 5h-5.8z"/></svg>
--- a/blog/public/vercel.svg
+++ b/blog/public/vercel.svg
@ -0,0 +1 @@
+<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1155 1000"><path d="m577.3 0 577.4 1000H0z" fill="#fff"/></svg>
--- a/blog/public/window.svg
+++ b/blog/public/window.svg
@ -0,0 +1 @@
+<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16"><path fill-rule="evenodd" clip-rule="evenodd" d="M1.5 2.5h13v10a1 1 0 0 1-1 1h-11a1 1 0 0 1-1-1zM0 1h16v11.5a2.5 2.5 0 0 1-2.5 2.5h-11A2.5 2.5 0 0 1 0 12.5zm3.75 4.5a.75.75 0 1 0 0-1.5.75.75 0 0 0 0 1.5M7 4.75a.75.75 0 1 1-1.5 0 .75.75 0 0 1 1.5 0m1.75.75a.75.75 0 1 0 0-1.5.75.75 0 0 0 0 1.5" fill="#666"/></svg>
--- a/blog/src/app/favicon.ico
+++ b/blog/src/app/favicon.ico
--- a/blog/src/app/globals.css
+++ b/blog/src/app/globals.css
@ -0,0 +1,168 @@
+@import "tailwindcss";
+@import "tw-animate-css";
+
+@custom-variant dark (&:is(.dark *));
+
+@theme inline {
+  --color-background: var(--background);
+  --color-foreground: var(--foreground);
+  --font-sans: var(--font-geist-sans);
+  --font-mono: var(--font-geist-mono);
+  --font-serif: Georgia, Cambria, "Times New Roman", serif;
+  --color-sidebar-ring: var(--sidebar-ring);
+  --color-sidebar-border: var(--sidebar-border);
+  --color-sidebar-accent-foreground: var(--sidebar-accent-foreground);
+  --color-sidebar-accent: var(--sidebar-accent);
+  --color-sidebar-primary-foreground: var(--sidebar-primary-foreground);
+  --color-sidebar-primary: var(--sidebar-primary);
+  --color-sidebar-foreground: var(--sidebar-foreground);
+  --color-sidebar: var(--sidebar);
+  --color-chart-5: var(--chart-5);
+  --color-chart-4: var(--chart-4);
+  --color-chart-3: var(--chart-3);
+  --color-chart-2: var(--chart-2);
+  --color-chart-1: var(--chart-1);
+  --color-ring: var(--ring);
+  --color-input: var(--input);
+  --color-border: var(--border);
+  --color-destructive: var(--destructive);
+  --color-accent-foreground: var(--accent-foreground);
+  --color-accent: var(--accent);
+  --color-muted-foreground: var(--muted-foreground);
+  --color-muted: var(--muted);
+  --color-secondary-foreground: var(--secondary-foreground);
+  --color-secondary: var(--secondary);
+  --color-primary-foreground: var(--primary-foreground);
+  --color-primary: var(--primary);
+  --color-popover-foreground: var(--popover-foreground);
+  --color-popover: var(--popover);
+  --color-card-foreground: var(--card-foreground);
+  --color-card: var(--card);
+  --radius-sm: calc(var(--radius) - 4px);
+  --radius-md: calc(var(--radius) - 2px);
+  --radius-lg: var(--radius);
+  --radius-xl: calc(var(--radius) + 4px);
+  --radius-2xl: calc(var(--radius) + 8px);
+  --radius-3xl: calc(var(--radius) + 12px);
+  --radius-4xl: calc(var(--radius) + 16px);
+}
+
+/* Research Paper Theme - Light tan/off-white with dark charcoal */
+:root {
+  --radius: 0.375rem;
+  /* Warm off-white/cream background */
+  --background: oklch(0.965 0.015 85);
+  /* Dark charcoal text */
+  --foreground: oklch(0.25 0 0);
+  /* Slightly lighter cream for cards */
+  --card: oklch(0.98 0.01 85);
+  --card-foreground: oklch(0.25 0 0);
+  --popover: oklch(0.98 0.01 85);
+  --popover-foreground: oklch(0.25 0 0);
+  /* Near-black for emphasis */
+  --primary: oklch(0.3 0 0);
+  --primary-foreground: oklch(0.965 0.015 85);
+  /* Subtle tan for secondary */
+  --secondary: oklch(0.92 0.012 85);
+  --secondary-foreground: oklch(0.3 0 0);
+  /* Muted tan for borders and backgrounds */
+  --muted: oklch(0.92 0.012 85);
+  --muted-foreground: oklch(0.45 0 0);
+  /* Warm accent for hovers */
+  --accent: oklch(0.9 0.025 85);
+  --accent-foreground: oklch(0.25 0 0);
+  /* Destructive - muted red */
+  --destructive: oklch(0.577 0.2 25);
+  /* Subtle warm borders */
+  --border: oklch(0.88 0.015 85);
+  --input: oklch(0.88 0.015 85);
+  --ring: oklch(0.3 0 0);
+  /* Chart colors - warm academic tones */
+  --chart-1: oklch(0.5 0.12 45);
+  --chart-2: oklch(0.55 0.1 180);
+  --chart-3: oklch(0.45 0.08 220);
+  --chart-4: oklch(0.65 0.15 85);
+  --chart-5: oklch(0.6 0.14 70);
+  /* Sidebar */
+  --sidebar: oklch(0.96 0.012 85);
+  --sidebar-foreground: oklch(0.25 0 0);
+  --sidebar-primary: oklch(0.3 0 0);
+  --sidebar-primary-foreground: oklch(0.965 0.015 85);
+  --sidebar-accent: oklch(0.92 0.012 85);
+  --sidebar-accent-foreground: oklch(0.3 0 0);
+  --sidebar-border: oklch(0.88 0.015 85);
+  --sidebar-ring: oklch(0.5 0 0);
+}
+
+/* Dark mode - inverted but still warm */
+.dark {
+  --background: oklch(0.18 0.01 85);
+  --foreground: oklch(0.92 0.01 85);
+  --card: oklch(0.22 0.01 85);
+  --card-foreground: oklch(0.92 0.01 85);
+  --popover: oklch(0.22 0.01 85);
+  --popover-foreground: oklch(0.92 0.01 85);
+  --primary: oklch(0.92 0.01 85);
+  --primary-foreground: oklch(0.22 0.01 85);
+  --secondary: oklch(0.28 0.01 85);
+  --secondary-foreground: oklch(0.92 0.01 85);
+  --muted: oklch(0.28 0.01 85);
+  --muted-foreground: oklch(0.65 0.01 85);
+  --accent: oklch(0.28 0.01 85);
+  --accent-foreground: oklch(0.92 0.01 85);
+  --destructive: oklch(0.65 0.18 22);
+  --border: oklch(0.92 0.01 85 / 12%);
+  --input: oklch(0.92 0.01 85 / 15%);
+  --ring: oklch(0.6 0 0);
+  --chart-1: oklch(0.55 0.2 260);
+  --chart-2: oklch(0.65 0.15 160);
+  --chart-3: oklch(0.7 0.16 70);
+  --chart-4: oklch(0.6 0.22 300);
+  --chart-5: oklch(0.6 0.2 20);
+  --sidebar: oklch(0.22 0.01 85);
+  --sidebar-foreground: oklch(0.92 0.01 85);
+  --sidebar-primary: oklch(0.55 0.2 260);
+  --sidebar-primary-foreground: oklch(0.92 0.01 85);
+  --sidebar-accent: oklch(0.28 0.01 85);
+  --sidebar-accent-foreground: oklch(0.92 0.01 85);
+  --sidebar-border: oklch(0.92 0.01 85 / 12%);
+  --sidebar-ring: oklch(0.6 0 0);
+}
+
+@layer base {
+  * {
+    @apply border-border outline-ring/50;
+  }
+  body {
+    @apply bg-background text-foreground antialiased;
+    font-feature-settings: "kern" 1, "liga" 1;
+  }
+
+  /* Research paper typography */
+  h1, h2, h3, h4, h5, h6 {
+    font-family: var(--font-serif);
+    @apply tracking-tight;
+  }
+
+  p {
+    @apply leading-relaxed;
+  }
+
+  /* Lists */
+  ol {
+    @apply list-decimal list-inside my-4 space-y-2 pl-4;
+  }
+
+  ul {
+    @apply list-disc list-inside my-4 space-y-2 pl-4;
+  }
+
+  li {
+    @apply leading-relaxed;
+  }
+
+  /* Slightly larger base font for readability */
+  html {
+    font-size: 17px;
+  }
+}
--- a/blog/src/app/layout.tsx
+++ b/blog/src/app/layout.tsx
@ -0,0 +1,34 @@
+import type { Metadata } from "next";
+import { Geist, Geist_Mono } from "next/font/google";
+import "./globals.css";
+
+const geistSans = Geist({
+  variable: "--font-geist-sans",
+  subsets: ["latin"],
+});
+
+const geistMono = Geist_Mono({
+  variable: "--font-geist-mono",
+  subsets: ["latin"],
+});
+
+export const metadata: Metadata = {
+  title: "A Research Journal",
+  description: "Research projects exploring unfamiliar territory with AI as a thinking partner.",
+};
+
+export default function RootLayout({
+  children,
+}: Readonly<{
+  children: React.ReactNode;
+}>) {
+  return (
+    <html lang="en">
+      <body
+        className={`${geistSans.variable} ${geistMono.variable} antialiased`}
+      >
+        {children}
+      </body>
+    </html>
+  );
+}
--- a/blog/src/app/maxwell/notes/[slug]/page.tsx
+++ b/blog/src/app/maxwell/notes/[slug]/page.tsx
@ -0,0 +1,78 @@
+import { notFound } from "next/navigation";
+import ReactMarkdown from "react-markdown";
+import remarkGfm from "remark-gfm";
+import { getAllNoteSlugs, getNoteBySlug } from "@/lib/content";
+import { PageLayout } from "@/components/layout/PageLayout";
+import { BackNav } from "@/components/layout/BackNav";
+import { NoteHeader } from "@/components/notes/NoteHeader";
+import { PromptsSection } from "@/components/notes/PromptsSection";
+import { FilesSection } from "@/components/notes/FilesSection";
+import { NoteFooter } from "@/components/notes/NoteFooter";
+
+interface NotePageProps {
+  params: Promise<{ slug: string }>;
+}
+
+export async function generateStaticParams() {
+  const slugs = getAllNoteSlugs();
+  return slugs.map((slug) => ({ slug }));
+}
+
+export default async function NotePage({ params }: NotePageProps) {
+  const { slug } = await params;
+  const note = getNoteBySlug(slug);
+
+  if (!note) {
+    notFound();
+  }
+
+  return (
+    <PageLayout>
+      <BackNav href="/maxwell" label="Back to Maxwell" />
+
+      <NoteHeader id={note.id} date={note.date} title={note.title} />
+
+      <PromptsSection prompts={note.prompts} />
+
+      <FilesSection files={note.files} />
+
+      <div className="prose prose-neutral dark:prose-invert max-w-none">
+        <ReactMarkdown
+          remarkPlugins={[remarkGfm]}
+          components={{
+            h2: ({ children }) => (
+              <h2 className="text-xl font-semibold mb-4 mt-12 first:mt-0">
+                {children}
+              </h2>
+            ),
+            h3: ({ children }) => (
+              <h3 className="font-medium mt-6 mb-2">{children}</h3>
+            ),
+            p: ({ children }) => <p className="mb-4">{children}</p>,
+            ul: ({ children }) => (
+              <ul className="mb-4 text-muted-foreground">{children}</ul>
+            ),
+            ol: ({ children }) => (
+              <ol className="mb-4 text-muted-foreground">{children}</ol>
+            ),
+            li: ({ children }) => <li className="mb-1">{children}</li>,
+            strong: ({ children }) => (
+              <strong className="font-semibold text-foreground">
+                {children}
+              </strong>
+            ),
+            em: ({ children }) => <em>{children}</em>,
+          }}
+        >
+          {note.content}
+        </ReactMarkdown>
+      </div>
+
+      <NoteFooter
+        prev={note.navigation.prev}
+        next={note.navigation.next}
+        allNotesHref="/maxwell"
+      />
+    </PageLayout>
+  );
+}
--- a/blog/src/app/maxwell/page.tsx
+++ b/blog/src/app/maxwell/page.tsx
@ -0,0 +1,76 @@
+import Link from "next/link";
+import { getProject, getProjectNotes } from "@/lib/content";
+import { PageLayout } from "@/components/layout/PageLayout";
+import { BackNav } from "@/components/layout/BackNav";
+
+export default function Maxwell() {
+  const project = getProject("maxwell");
+  const notes = getProjectNotes("maxwell");
+
+  return (
+    <PageLayout>
+      <BackNav href="/" label="All Projects" />
+
+      <header className="mb-16">
+        <h1 className="text-3xl font-semibold mb-2">{project.title}</h1>
+        <p className="text-lg text-muted-foreground mb-8">{project.subtitle}</p>
+
+        <section className="border-l-2 border-muted-foreground/30 pl-6 mb-8">
+          {project.intro.map((paragraph, index) => (
+            <p key={index} className="text-base leading-relaxed mb-4 last:mb-0">
+              {paragraph}
+            </p>
+          ))}
+        </section>
+
+        <div className="p-4 bg-muted/30 rounded">
+          <p className="text-sm text-muted-foreground mb-2">
+            The distilled output of this research:
+          </p>
+          <Link
+            href={project.whitePaper.href}
+            className="text-foreground font-medium hover:underline"
+          >
+            {project.whitePaper.label} →
+          </Link>
+        </div>
+      </header>
+
+      <section>
+        <h2 className="text-xl font-semibold mb-6">Research Notes</h2>
+
+        <div className="space-y-6">
+          {notes.map((note) => (
+            <Link
+              key={note.id}
+              href={`/maxwell/notes/${note.slug}`}
+              className="block p-4 border border-border rounded hover:bg-muted/30 transition-colors"
+            >
+              <div className="flex items-baseline gap-3 mb-2">
+                <span className="font-mono text-sm text-muted-foreground">
+                  #{note.id}
+                </span>
+                <span className="text-sm text-muted-foreground">
+                  {note.date}
+                </span>
+              </div>
+              <h3 className="font-medium mb-1">{note.title}</h3>
+              <p className="text-sm text-muted-foreground">{note.preview}</p>
+            </Link>
+          ))}
+        </div>
+      </section>
+
+      <footer className="border-t border-border pt-8 mt-16 text-center">
+        <div className="flex justify-center gap-6 text-sm">
+          <Link
+            href={project.whitePaper.href}
+            className="text-muted-foreground hover:text-foreground transition-colors"
+          >
+            White Paper
+          </Link>
+        </div>
+      </footer>
+    </PageLayout>
+  );
+}
--- a/blog/src/app/maxwell/white-paper/page.tsx
+++ b/blog/src/app/maxwell/white-paper/page.tsx
@ -0,0 +1,76 @@
+import Link from "next/link";
+import { getWhitePaperOutline } from "@/lib/content";
+import { PageLayout } from "@/components/layout/PageLayout";
+import { BackNav } from "@/components/layout/BackNav";
+import { OutlineSection } from "@/components/white-paper/OutlineSection";
+
+export default function WhitePaper() {
+  const outline = getWhitePaperOutline();
+
+  return (
+    <PageLayout>
+      <BackNav href="/maxwell" label="Back to Maxwell" />
+
+      <header className="mb-16">
+        <div className="inline-block px-2 py-1 text-xs font-medium bg-yellow-500/20 text-yellow-600 rounded mb-4">
+          {outline.status}
+        </div>
+        <h1 className="text-3xl font-semibold mb-2">{outline.title}</h1>
+        <p className="text-sm text-muted-foreground">{outline.date}</p>
+      </header>
+
+      <section className="border-l-2 border-muted-foreground/30 pl-6 mb-16">
+        <h2 className="sr-only">Abstract</h2>
+        <p className="text-base leading-relaxed text-muted-foreground italic">
+          {outline.abstract}
+        </p>
+      </section>
+
+      <section className="mb-16">
+        <h2 className="text-xl font-semibold mb-6">Paper Outline</h2>
+
+        <div className="space-y-8">
+          {outline.sections.map((section) => (
+            <OutlineSection
+              key={section.number}
+              number={section.number}
+              title={section.title}
+              bullets={section.bullets}
+            />
+          ))}
+        </div>
+      </section>
+
+      <section className="p-4 bg-muted/30 rounded mb-16">
+        <h3 className="font-medium mb-2">Current Status</h3>
+        <p className="text-sm text-muted-foreground">
+          {outline.currentStatus.note.split("research notes")[0]}
+          <Link
+            href={outline.currentStatus.notesLink}
+            className="underline hover:text-foreground"
+          >
+            research notes
+          </Link>
+          {outline.currentStatus.note.split("research notes").slice(1).join("research notes")}
+        </p>
+      </section>
+
+      <footer className="border-t border-border pt-8 text-center">
+        <div className="flex justify-center gap-6 text-sm">
+          <Link
+            href="/maxwell"
+            className="text-muted-foreground hover:text-foreground transition-colors"
+          >
+            Research Notes
+          </Link>
+          <a
+            href="https://github.com/orchard9/maxwell"
+            className="text-muted-foreground hover:text-foreground transition-colors"
+          >
+            GitHub
+          </a>
+        </div>
+      </footer>
+    </PageLayout>
+  );
+}
--- a/blog/src/app/page.tsx
+++ b/blog/src/app/page.tsx
@ -0,0 +1,38 @@
+import Link from "next/link";
+import { getProject } from "@/lib/content";
+import { PageLayout } from "@/components/layout/PageLayout";
+
+export default function Home() {
+  const maxwell = getProject("maxwell");
+
+  const projects = [maxwell];
+
+  return (
+    <PageLayout>
+      <header className="mb-16">
+        <h1 className="text-3xl font-semibold mb-2">A Research Journal</h1>
+        <p className="text-muted-foreground">by Jordan Washburn</p>
+      </header>
+
+      <section>
+        <div className="space-y-4">
+          {projects.map((project) => (
+            <Link
+              key={project.id}
+              href={`/${project.slug}`}
+              className="block p-6 border border-border rounded hover:bg-muted/30 transition-colors"
+            >
+              <div className="flex items-center justify-between mb-2">
+                <h2 className="text-xl font-semibold">{project.title}</h2>
+                <span className="text-xs px-2 py-1 bg-yellow-500/20 text-yellow-600 rounded">
+                  {project.status}
+                </span>
+              </div>
+              <p className="text-muted-foreground">{project.subtitle}</p>
+            </Link>
+          ))}
+        </div>
+      </section>
+    </PageLayout>
+  );
+}
--- a/blog/src/components/copyable.tsx
+++ b/blog/src/components/copyable.tsx
@ -0,0 +1,92 @@
+"use client";
+
+import { useState } from "react";
+
+interface CopyButtonProps {
+  text: string;
+  className?: string;
+}
+
+export function CopyButton({ text, className = "" }: CopyButtonProps) {
+  const [copied, setCopied] = useState(false);
+
+  const handleCopy = async () => {
+    try {
+      await navigator.clipboard.writeText(text);
+      setCopied(true);
+      setTimeout(() => setCopied(false), 2000);
+    } catch {
+      // Clipboard access can be denied or unavailable (e.g. insecure context); keep the label unchanged.
+    }
+  };
+
+  return (
+    <button
+      onClick={handleCopy}
+      className={`font-mono text-xs text-muted-foreground hover:text-foreground transition-colors ${className}`}
+    >
+      {copied ? "copied" : "copy"}
+    </button>
+  );
+}
+
+interface CopyableBlockProps {
+  content: string;
+  label?: string;
+  className?: string;
+}
+
+export function CopyableBlock({ content, label, className = "" }: CopyableBlockProps) {
+  return (
+    <div className={className}>
+      <div className="flex items-center justify-between mb-2">
+        {label && <p className="text-sm font-medium">{label}</p>}
+        <CopyButton text={content} />
+      </div>
+      <pre className="p-3 bg-muted/50 rounded text-xs overflow-x-auto whitespace-pre-wrap font-mono">
+        {content}
+      </pre>
+    </div>
+  );
+}
+
+interface ExpandableFileProps {
+  name: string;
+  description: string;
+  content: string;
+  expanded: boolean;
+  onToggle: () => void;
+}
+
+export function ExpandableFile({
+  name,
+  description,
+  content,
+  expanded,
+  onToggle,
+}: ExpandableFileProps) {
+  return (
+    <div>
+      <div className="flex items-center gap-3">
+        <button
+          onClick={onToggle}
+          className="flex items-baseline gap-3 text-sm hover:opacity-80 transition-opacity"
+        >
+          <code className="font-mono text-xs bg-muted px-1.5 py-0.5 rounded">
+            {name}
+          </code>
+          <span className="text-muted-foreground text-xs">{description}</span>
+          <span className="font-mono text-xs text-muted-foreground">
+            {expanded ? "−" : "+"}
+          </span>
+        </button>
+        {expanded && <CopyButton text={content} />}
+      </div>
+      {expanded && (
+        <pre className="mt-2 p-3 bg-muted/50 rounded text-xs overflow-x-auto whitespace-pre-wrap font-mono max-h-96 overflow-y-auto">
+          {content}
+        </pre>
+      )}
+    </div>
+  );
+}
--- a/blog/src/components/layout/BackNav.tsx
+++ b/blog/src/components/layout/BackNav.tsx
@ -0,0 +1,19 @@
+import Link from "next/link";
+
+interface BackNavProps {
+  href: string;
+  label: string;
+}
+
+export function BackNav({ href, label }: BackNavProps) {
+  return (
+    <nav className="mb-8">
+      <Link
+        href={href}
+        className="text-sm text-muted-foreground hover:text-foreground transition-colors"
+      >
+        ← {label}
+      </Link>
+    </nav>
+  );
+}
--- a/blog/src/components/layout/PageLayout.tsx
+++ b/blog/src/components/layout/PageLayout.tsx
@ -0,0 +1,15 @@
+import { ReactNode } from "react";
+
+interface PageLayoutProps {
+  children: ReactNode;
+}
+
+export function PageLayout({ children }: PageLayoutProps) {
+  return (
+    <div className="min-h-screen bg-background">
+      <article className="mx-auto max-w-[800px] px-6 py-16 text-foreground">
+        {children}
+      </article>
+    </div>
+  );
+}
--- a/blog/src/components/notes/FilesSection.tsx
+++ b/blog/src/components/notes/FilesSection.tsx
@ -0,0 +1,39 @@
+"use client";
+
+import { useState } from "react";
+import { ExpandableFile } from "@/components/copyable";
+import type { NoteFile } from "@/lib/content";
+
+interface FilesSectionProps {
+  files: NoteFile[];
+}
+
+export function FilesSection({ files }: FilesSectionProps) {
+  const [expandedFile, setExpandedFile] = useState<string | null>(null);
+
+  return (
+    <section className="mb-12 p-4 border border-border rounded">
+      <h2 className="text-sm font-medium text-muted-foreground mb-4">
+        Files Created
+      </h2>
+      {files.length > 0 ? (
+        <div className="space-y-3">
+          {files.map((file) => (
+            <ExpandableFile
+              key={file.name}
+              name={file.name}
+              description={file.description}
+              content={file.content}
+              expanded={expandedFile === file.name}
+              onToggle={() =>
+                setExpandedFile(expandedFile === file.name ? null : file.name)
+              }
+            />
+          ))}
+        </div>
+      ) : (
+        <p className="text-sm text-muted-foreground italic">None</p>
+      )}
+    </section>
+  );
+}
--- a/blog/src/components/notes/NoteFooter.tsx
+++ b/blog/src/components/notes/NoteFooter.tsx
@ -0,0 +1,42 @@
+import Link from "next/link";
+import type { NoteNavLink } from "@/lib/content";
+
+interface NoteFooterProps {
+  prev: NoteNavLink | null;
+  next: NoteNavLink | null;
+  allNotesHref: string;
+}
+
+export function NoteFooter({ prev, next, allNotesHref }: NoteFooterProps) {
+  return (
+    <footer className="border-t border-border pt-8 mt-16">
+      <div className="flex justify-between text-sm">
+        {prev ? (
+          <Link
+            href={`/maxwell/notes/${prev.slug}`}
+            className="text-muted-foreground hover:text-foreground transition-colors"
+          >
+            ← #{prev.id} {prev.title}
+          </Link>
+        ) : (
+          <Link
+            href={allNotesHref}
+            className="text-muted-foreground hover:text-foreground transition-colors"
+          >
+            ← All Notes
+          </Link>
+        )}
+        {next ? (
+          <Link
+            href={`/maxwell/notes/${next.slug}`}
+            className="text-muted-foreground hover:text-foreground transition-colors"
+          >
+            #{next.id} {next.title} →
+          </Link>
+        ) : (
+          <span className="text-muted-foreground">Next: Coming soon</span>
+        )}
+      </div>
+    </footer>
+  );
+}
--- a/blog/src/components/notes/NoteHeader.tsx
+++ b/blog/src/components/notes/NoteHeader.tsx
@ -0,0 +1,17 @@
+interface NoteHeaderProps {
+  id: string;
+  date: string;
+  title: string;
+}
+
+export function NoteHeader({ id, date, title }: NoteHeaderProps) {
+  return (
+    <header className="mb-8">
+      <div className="flex items-baseline gap-3 mb-2">
+        <span className="font-mono text-sm text-muted-foreground">#{id}</span>
+        <span className="text-sm text-muted-foreground">{date}</span>
+      </div>
+      <h1 className="text-3xl font-semibold">{title}</h1>
+    </header>
+  );
+}
--- a/blog/src/components/notes/PromptsSection.tsx
+++ b/blog/src/components/notes/PromptsSection.tsx
@ -0,0 +1,25 @@
+import { CopyableBlock } from "@/components/copyable";
+import type { Prompt } from "@/lib/content";
+
+interface PromptsSectionProps {
+  prompts: Prompt[];
+}
+
+export function PromptsSection({ prompts }: PromptsSectionProps) {
+  return (
+    <section className="mb-8 p-4 border border-border rounded">
+      <h2 className="text-sm font-medium text-muted-foreground mb-4">
+        Prompts Used
+      </h2>
+      <div className="space-y-4">
+        {prompts.map((prompt) => (
+          <CopyableBlock
+            key={prompt.id}
+            label={prompt.label}
+            content={prompt.content.trim()}
+          />
+        ))}
+      </div>
+    </section>
+  );
+}
--- a/blog/src/components/notes/SkillsSection.tsx
+++ b/blog/src/components/notes/SkillsSection.tsx
@ -0,0 +1,58 @@
+"use client";
+
+import { useState } from "react";
+import { CopyableBlock } from "@/components/copyable";
+
+export interface SkillUsed {
+  name: string;
+  command: string;
+  description: string;
+  usage?: string;
+}
+
+interface SkillsSectionProps {
+  skills: SkillUsed[];
+}
+
+export function SkillsSection({ skills }: SkillsSectionProps) {
+  const [expandedSkill, setExpandedSkill] = useState<string | null>(null);
+
+  if (skills.length === 0) return null;
+
+  return (
+    <section className="mb-8 p-4 border border-border rounded">
+      <h2 className="text-sm font-medium text-muted-foreground mb-4">
+        Skills Used
+      </h2>
+      <div className="space-y-3">
+        {skills.map((skill) => (
+          <div key={skill.name} className="space-y-2">
+            <div className="flex items-center gap-3">
+              <button
+                onClick={() =>
+                  setExpandedSkill(
+                    expandedSkill === skill.name ? null : skill.name
+                  )
+                }
+                className="flex items-baseline gap-3 text-sm hover:opacity-80 transition-opacity"
+              >
+                <code className="font-mono text-xs bg-muted px-1.5 py-0.5 rounded">
+                  {skill.command}
+                </code>
+                <span className="text-muted-foreground text-xs">
+                  {skill.description}
+                </span>
+                <span className="font-mono text-xs text-muted-foreground">
+                  {expandedSkill === skill.name ? "−" : "+"}
+                </span>
+              </button>
+            </div>
+            {expandedSkill === skill.name && skill.usage && (
+              <CopyableBlock content={skill.usage} />
+            )}
+          </div>
+        ))}
+      </div>
+    </section>
+  );
+}
--- a/blog/src/components/ui/button.tsx
+++ b/blog/src/components/ui/button.tsx
@ -0,0 +1,64 @@
+import * as React from "react"
+import { cva, type VariantProps } from "class-variance-authority"
+import { Slot } from "@radix-ui/react-slot"
+
+import { cn } from "@/lib/utils"
+
+const buttonVariants = cva(
+  "inline-flex items-center justify-center gap-2 whitespace-nowrap rounded-md text-sm font-medium transition-all disabled:pointer-events-none disabled:opacity-50 [&_svg]:pointer-events-none [&_svg:not([class*='size-'])]:size-4 shrink-0 [&_svg]:shrink-0 outline-none focus-visible:border-ring focus-visible:ring-ring/50 focus-visible:ring-[3px] aria-invalid:ring-destructive/20 dark:aria-invalid:ring-destructive/40 aria-invalid:border-destructive",
+  {
+    variants: {
+      variant: {
+        default: "bg-primary text-primary-foreground hover:bg-primary/90",
+        destructive:
+          "bg-destructive text-white hover:bg-destructive/90 focus-visible:ring-destructive/20 dark:focus-visible:ring-destructive/40 dark:bg-destructive/60",
+        outline:
+          "border bg-background shadow-xs hover:bg-accent hover:text-accent-foreground dark:bg-input/30 dark:border-input dark:hover:bg-input/50",
+        secondary:
+          "bg-secondary text-secondary-foreground hover:bg-secondary/80",
+        ghost:
+          "hover:bg-accent hover:text-accent-foreground dark:hover:bg-accent/50",
+        link: "text-primary underline-offset-4 hover:underline",
+      },
+      size: {
+        default: "h-9 px-4 py-2 has-[>svg]:px-3",
+        xs: "h-6 gap-1 rounded-md px-2 text-xs has-[>svg]:px-1.5 [&_svg:not([class*='size-'])]:size-3",
+        sm: "h-8 rounded-md gap-1.5 px-3 has-[>svg]:px-2.5",
+        lg: "h-10 rounded-md px-6 has-[>svg]:px-4",
+        icon: "size-9",
+        "icon-xs": "size-6 rounded-md [&_svg:not([class*='size-'])]:size-3",
+        "icon-sm": "size-8",
+        "icon-lg": "size-10",
+      },
+    },
+    defaultVariants: {
+      variant: "default",
+      size: "default",
+    },
+  }
+)
+
+function Button({
+  className,
+  variant = "default",
+  size = "default",
+  asChild = false,
+  ...props
+}: React.ComponentProps<"button"> &
+  VariantProps<typeof buttonVariants> & {
+    asChild?: boolean
+  }) {
+  const Comp = asChild ? Slot : "button"
+
+  return (
+    <Comp
+      data-slot="button"
+      data-variant={variant}
+      data-size={size}
+      className={cn(buttonVariants({ variant, size, className }))}
+      {...props}
+    />
+  )
+}
+
+export { Button, buttonVariants }
--- a/blog/src/components/ui/card.tsx
+++ b/blog/src/components/ui/card.tsx
@ -0,0 +1,92 @@
+import * as React from "react"
+
+import { cn } from "@/lib/utils"
+
+function Card({ className, ...props }: React.ComponentProps<"div">) {
+  return (
+    <div
+      data-slot="card"
+      className={cn(
+        "bg-card text-card-foreground flex flex-col gap-6 rounded-xl border py-6 shadow-sm",
+        className
+      )}
+      {...props}
+    />
+  )
+}
+
+function CardHeader({ className, ...props }: React.ComponentProps<"div">) {
+  return (
+    <div
+      data-slot="card-header"
+      className={cn(
+        "@container/card-header grid auto-rows-min grid-rows-[auto_auto] items-start gap-2 px-6 has-data-[slot=card-action]:grid-cols-[1fr_auto] [.border-b]:pb-6",
+        className
+      )}
+      {...props}
+    />
+  )
+}
+
+function CardTitle({ className, ...props }: React.ComponentProps<"div">) {
+  return (
+    <div
+      data-slot="card-title"
+      className={cn("leading-none font-semibold", className)}
+      {...props}
+    />
+  )
+}
+
+function CardDescription({ className, ...props }: React.ComponentProps<"div">) {
+  return (
+    <div
+      data-slot="card-description"
+      className={cn("text-muted-foreground text-sm", className)}
+      {...props}
+    />
+  )
+}
+
+function CardAction({ className, ...props }: React.ComponentProps<"div">) {
+  return (
+    <div
+      data-slot="card-action"
+      className={cn(
+        "col-start-2 row-span-2 row-start-1 self-start justify-self-end",
+        className
+      )}
+      {...props}
+    />
+  )
+}
+
+function CardContent({ className, ...props }: React.ComponentProps<"div">) {
+  return (
+    <div
+      data-slot="card-content"
+      className={cn("px-6", className)}
+      {...props}
+    />
+  )
+}
+
+function CardFooter({ className, ...props }: React.ComponentProps<"div">) {
+  return (
+    <div
+      data-slot="card-footer"
+      className={cn("flex items-center px-6 [.border-t]:pt-6", className)}
+      {...props}
+    />
+  )
+}
+
+export {
+  Card,
+  CardHeader,
+  CardFooter,
+  CardTitle,
+  CardAction,
+  CardDescription,
+  CardContent,
+}
--- a/blog/src/components/white-paper/OutlineSection.tsx
+++ b/blog/src/components/white-paper/OutlineSection.tsx
@ -0,0 +1,20 @@
+interface OutlineSectionProps {
+  number: number;
+  title: string;
+  bullets: string[];
+}
+
+export function OutlineSection({ number, title, bullets }: OutlineSectionProps) {
+  return (
+    <div className="border-l-2 border-border pl-4">
+      <h3 className="font-medium mb-2">
+        {number}. {title}
+      </h3>
+      <ul className="text-sm text-muted-foreground space-y-1">
+        {bullets.map((bullet, index) => (
+          <li key={index} dangerouslySetInnerHTML={{ __html: `• ${bullet}` }} />
+        ))}
+      </ul>
+    </div>
+  );
+}
--- a/blog/src/lib/content.ts
+++ b/blog/src/lib/content.ts
@ -0,0 +1,173 @@
+import fs from "fs";
+import path from "path";
+import yaml from "js-yaml";
+
+const CONTENT_DIR = path.join(process.cwd(), "content");
+
+// Types
+
+export interface Project {
+  id: string;
+  title: string;
+  subtitle: string;
+  status: string;
+  slug: string;
+  intro: string[];
+  whitePaper: {
+    href: string;
+    label: string;
+  };
+}
+
+export interface Prompt {
+  id: string;
+  label: string;
+  content: string;
+}
+
+export interface FileRef {
+  name: string;
+  description: string;
+}
+
+export interface NoteNavLink {
+  slug: string;
+  id: string;
+  title: string;
+}
+
+export interface NoteMeta {
+  id: string;
+  slug: string;
+  date: string;
+  title: string;
+  preview: string;
+  prompts: Prompt[];
+  filesCreated: FileRef[];
+  navigation: {
+    prev: NoteNavLink | null;
+    next: NoteNavLink | null;
+  };
+}
+
+export interface NoteFile {
+  name: string;
+  description: string;
+  content: string;
+}
+
+export interface Note extends NoteMeta {
+  content: string;
+  files: NoteFile[];
+}
+
+export interface OutlineSection {
+  number: number;
+  title: string;
+  bullets: string[];
+}
+
+export interface WhitePaperOutline {
+  status: string;
+  title: string;
+  date: string;
+  abstract: string;
+  sections: OutlineSection[];
+  currentStatus: {
+    note: string;
+    notesLink: string;
+  };
+}
+
+// Loaders
+
+export function getProject(slug: string): Project {
+  const filePath = path.join(CONTENT_DIR, "projects", `${slug}.yaml`);
+  const fileContents = fs.readFileSync(filePath, "utf8");
+  return yaml.load(fileContents) as Project;
+}
+
+export function getProjectNotes(projectSlug: string): NoteMeta[] {
+  const notesDir = path.join(CONTENT_DIR, "notes"); // NOTE: projectSlug is not used yet; every note is returned
+  const noteFolders = fs.readdirSync(notesDir).filter((f) => {
+    const stat = fs.statSync(path.join(notesDir, f));
+    return stat.isDirectory();
+  });
+
+  const notes: NoteMeta[] = [];
+
+  for (const folder of noteFolders) {
+    const metaPath = path.join(notesDir, folder, "meta.yaml");
+    if (fs.existsSync(metaPath)) {
+      const metaContents = fs.readFileSync(metaPath, "utf8");
+      const meta = yaml.load(metaContents) as NoteMeta;
+      notes.push(meta);
+    }
+  }
+
+  // Sort by id (string comparison, so ids are assumed to be zero-padded, e.g. "001")
+  notes.sort((a, b) => a.id.localeCompare(b.id));
+
+  return notes;
+}
+
+export function getNoteBySlug(slug: string): Note | null {
+  const notesDir = path.join(CONTENT_DIR, "notes");
+  const noteDir = path.join(notesDir, slug);
+
+  if (!fs.existsSync(noteDir)) {
+    return null;
+  }
+
+  const metaPath = path.join(noteDir, "meta.yaml");
+  const contentPath = path.join(noteDir, "content.md");
+
+  if (!fs.existsSync(metaPath) || !fs.existsSync(contentPath)) {
+    return null;
+  }
+
+  const metaContents = fs.readFileSync(metaPath, "utf8");
+  const meta = yaml.load(metaContents) as NoteMeta;
+
+  const content = fs.readFileSync(contentPath, "utf8");
+
+  // Load files if they exist
+  const files: NoteFile[] = [];
+  const filesDir = path.join(noteDir, "files");
+
+  if (fs.existsSync(filesDir) && Array.isArray(meta.filesCreated)) {
+    for (const fileRef of meta.filesCreated) {
+      const filePath = path.join(filesDir, fileRef.name);
+      if (fs.existsSync(filePath)) {
+        const fileContent = fs.readFileSync(filePath, "utf8");
+        files.push({
+          name: fileRef.name,
+          description: fileRef.description,
+          content: fileContent,
+        });
+      }
+    }
+  }
+
+  return {
+    ...meta,
+    content,
+    files,
+  };
+}
+
+export function getAllNoteSlugs(): string[] {
+  const notesDir = path.join(CONTENT_DIR, "notes");
+  const noteFolders = fs.readdirSync(notesDir).filter((f) => {
+    const stat = fs.statSync(path.join(notesDir, f));
+    return stat.isDirectory();
+  });
+
+  return noteFolders;
+}
+
+export function getWhitePaperOutline(): WhitePaperOutline {
+  const filePath = path.join(CONTENT_DIR, "white-paper", "outline.yaml");
+  const fileContents = fs.readFileSync(filePath, "utf8");
+  return yaml.load(fileContents) as WhitePaperOutline;
+}
--- a/blog/src/lib/utils.ts
+++ b/blog/src/lib/utils.ts
@ -0,0 +1,6 @@
+import { clsx, type ClassValue } from "clsx"
+import { twMerge } from "tailwind-merge"
+
+export function cn(...inputs: ClassValue[]) {
+  return twMerge(clsx(inputs))
+}
--- a/blog/tsconfig.json
+++ b/blog/tsconfig.json
@ -0,0 +1,34 @@
+{
+  "compilerOptions": {
+    "target": "ES2017",
+    "lib": ["dom", "dom.iterable", "esnext"],
+    "allowJs": true,
+    "skipLibCheck": true,
+    "strict": true,
+    "noEmit": true,
+    "esModuleInterop": true,
+    "module": "esnext",
+    "moduleResolution": "bundler",
+    "resolveJsonModule": true,
+    "isolatedModules": true,
+    "jsx": "react-jsx",
+    "incremental": true,
+    "plugins": [
+      {
+        "name": "next"
+      }
+    ],
+    "paths": {
+      "@/*": ["./src/*"]
+    }
+  },
+  "include": [
+    "next-env.d.ts",
+    "**/*.ts",
+    "**/*.tsx",
+    ".next/types/**/*.ts",
+    ".next/dev/types/**/*.ts",
+    "**/*.mts"
+  ],
+  "exclude": ["node_modules"]
+}
				@ -0,0 +1 @@
				<svg fill="none" viewBox="0 0 16 16" xmlns="http://www.w3.org/2000/svg"><path d="M14.5 13.5V5.41a1 1 0 0 0-.3-.7L9.8.29A1 1 0 0 0 9.08 0H1.5v13.5A2.5 2.5 0 0 0 4 16h8a2.5 2.5 0 0 0 2.5-2.5m-1.5 0v-7H8v-5H3v12a1 1 0 0 0 1 1h8a1 1 0 0 0 1-1M9.5 5V2.12L12.38 5zM5.13 5h-.62v1.25h2.12V5zm-.62 3h7.12v1.25H4.5zm.62 3h-.62v1.25h7.12V11z" clip-rule="evenodd" fill="#666" fill-rule="evenodd"/></svg>
				@ -0,0 +1 @@
				<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16"><g clip-path="url(#a)"><path fill-rule="evenodd" clip-rule="evenodd" d="M10.27 14.1a6.5 6.5 0 0 0 3.67-3.45q-1.24.21-2.7.34-.31 1.83-.97 3.1M8 16A8 8 0 1 0 8 0a8 8 0 0 0 0 16m.48-1.52a7 7 0 0 1-.96 0H7.5a4 4 0 0 1-.84-1.32q-.38-.89-.63-2.08a40 40 0 0 0 3.92 0q-.25 1.2-.63 2.08a4 4 0 0 1-.84 1.31zm2.94-4.76q1.66-.15 2.95-.43a7 7 0 0 0 0-2.58q-1.3-.27-2.95-.43a18 18 0 0 1 0 3.44m-1.27-3.54a17 17 0 0 1 0 3.64 39 39 0 0 1-4.3 0 17 17 0 0 1 0-3.64 39 39 0 0 1 4.3 0m1.1-1.17q1.45.13 2.69.34a6.5 6.5 0 0 0-3.67-3.44q.65 1.26.98 3.1M8.48 1.5l.01.02q.41.37.84 1.31.38.89.63 2.08a40 40 0 0 0-3.92 0q.25-1.2.63-2.08a4 4 0 0 1 .85-1.32 7 7 0 0 1 .96 0m-2.75.4a6.5 6.5 0 0 0-3.67 3.44 29 29 0 0 1 2.7-.34q.31-1.83.97-3.1M4.58 6.28q-1.66.16-2.95.43a7 7 0 0 0 0 2.58q1.3.27 2.95.43a18 18 0 0 1 0-3.44m.17 4.71q-1.45-.12-2.69-.34a6.5 6.5 0 0 0 3.67 3.44q-.65-1.27-.98-3.1" fill="#666"/></g><defs><clipPath id="a"><path fill="#fff" d="M0 0h16v16H0z"/></clipPath></defs></svg>
				@ -0,0 +1 @@
				<svg xmlns="http://www.w3.org/2000/svg" fill="none" viewBox="0 0 394 80"><path fill="#000" d="M262 0h68.5v12.7h-27.2v66.6h-13.6V12.7H262V0ZM149 0v12.7H94v20.4h44.3v12.6H94v21h55v12.6H80.5V0h68.7zm34.3 0h-17.8l63.8 79.4h17.9l-32-39.7 32-39.6h-17.9l-23 28.6-23-28.6zm18.3 56.7-9-11-27.1 33.7h17.8l18.3-22.7z"/><path fill="#000" d="M81 79.3 17 0H0v79.3h13.6V17l50.2 62.3H81Zm252.6-.4c-1 0-1.8-.4-2.5-1s-1.1-1.6-1.1-2.6.3-1.8 1-2.5 1.6-1 2.6-1 1.8.3 2.5 1a3.4 3.4 0 0 1 .6 4.3 3.7 3.7 0 0 1-3 1.8zm23.2-33.5h6v23.3c0 2.1-.4 4-1.3 5.5a9.1 9.1 0 0 1-3.8 3.5c-1.6.8-3.5 1.3-5.7 1.3-2 0-3.7-.4-5.3-1s-2.8-1.8-3.7-3.2c-.9-1.3-1.4-3-1.4-5h6c.1.8.3 1.6.7 2.2s1 1.2 1.6 1.5c.7.4 1.5.5 2.4.5 1 0 1.8-.2 2.4-.6a4 4 0 0 0 1.6-1.8c.3-.8.5-1.8.5-3V45.5zm30.9 9.1a4.4 4.4 0 0 0-2-3.3 7.5 7.5 0 0 0-4.3-1.1c-1.3 0-2.4.2-3.3.5-.9.4-1.6 1-2 1.6a3.5 3.5 0 0 0-.3 4c.3.5.7.9 1.3 1.2l1.8 1 2 .5 3.2.8c1.3.3 2.5.7 3.7 1.2a13 13 0 0 1 3.2 1.8 8.1 8.1 0 0 1 3 6.5c0 2-.5 3.7-1.5 5.1a10 10 0 0 1-4.4 3.5c-1.8.8-4.1 1.2-6.8 1.2-2.6 0-4.9-.4-6.8-1.2-2-.8-3.4-2-4.5-3.5a10 10 0 0 1-1.7-5.6h6a5 5 0 0 0 3.5 4.6c1 .4 2.2.6 3.4.6 1.3 0 2.5-.2 3.5-.6 1-.4 1.8-1 2.4-1.7a4 4 0 0 0 .8-2.4c0-.9-.2-1.6-.7-2.2a11 11 0 0 0-2.1-1.4l-3.2-1-3.8-1c-2.8-.7-5-1.7-6.6-3.2a7.2 7.2 0 0 1-2.4-5.7 8 8 0 0 1 1.7-5 10 10 0 0 1 4.3-3.5c2-.8 4-1.2 6.4-1.2 2.3 0 4.4.4 6.2 1.2 1.8.8 3.2 2 4.3 3.4 1 1.4 1.5 3 1.5 5h-5.8z"/></svg>
				@ -0,0 +1 @@
				<svg fill="none" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1155 1000"><path d="m577.3 0 577.4 1000H0z" fill="#fff"/></svg>