---
name: aeries-fullstack-engineer
description: Build the Aeries chat application — frontend, API, vLLM streaming, observation pipeline, tidalDB integration
---

# aeries-fullstack-engineer

## When to Use

- Building or modifying the Aeries Next.js application
- Implementing chat streaming from vLLM
- Wiring up the observation pipeline (observer LM call → signal writes)
- Integrating tidalDB's iknowyou engine
- Fixing bugs in the chat flow, API routes, or real-time UI

Invoked via: `/aeries-fullstack-engineer`

## Delegation

This skill delegates to **@kai-park** — the Aeries full-stack engineer. All implementation, API design, streaming infrastructure, and tidalDB integration go through his lens.

For design decisions (colors, spacing, component visual specs), defer to **@kaya-osei** via `/aeries-design-architect`.

For product decisions (what to build, what to defer, personality), defer to **@mira-vasquez** via `/aeries-product-visionary`.

## Step Back

Before implementing, ask:

1. **Is the vLLM server healthy?** `curl http://msd5685.mjhst.com:8000/health` — if it's down, nothing else matters.
2. **Does this block the response stream?** Anything that delays tokens reaching the user is wrong. Observation, signal writes, and preference updates are all async and run only after the stream closes.
3. **Am I over-engineering the MVP?** The first version needs: send message → stream response → store conversation. Not: authentication, multi-user, observation pipeline, preference vectors.
4. **Does the server own the truth?** The client sends `{ message, conversationId }`. Everything else — history, brief, observation — lives server-side.
5. **What happens when vLLM is slow or down?** Every external call needs a timeout and a graceful fallback. Never show a stack trace in the UI.
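
Questions 1 and 5 can be folded into one small helper. The following is a minimal sketch, not the project's actual client: `checkVllmHealth`, `HealthResult`, and the message strings are hypothetical names chosen for illustration, and the 10-second default mirrors the health-check timeout listed under Standards.

```typescript
// Hypothetical helper; names and messages are illustrative, not from the Aeries codebase.
type HealthResult =
  | { ok: true }
  | { ok: false; userMessage: string };

async function checkVllmHealth(
  baseUrl: string,
  timeoutMs: number = 10_000,
  fetchImpl: typeof fetch = fetch, // injectable so the check can be exercised offline
): Promise<HealthResult> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${baseUrl}/health`, { signal: controller.signal });
    return res.ok
      ? { ok: true }
      : { ok: false, userMessage: "The model server is not ready yet. Please try again in a moment." };
  } catch {
    // Timeout, refused connection, DNS failure: all map to one readable message.
    return { ok: false, userMessage: "The model server is unreachable. Please try again in a moment." };
  } finally {
    clearTimeout(timer);
  }
}
```

Returning a result object instead of throwing keeps raw errors out of the UI layer: the caller only ever sees `ok` or a message that is safe to display.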

## Workflow

### Phase 1: Context
- Read `applications/iknowyou/devsetup.md` for infrastructure details
- Read `applications/iknowyou/architecture.md` for system design
- Confirm vLLM is up and serving the expected model: `curl http://msd5685.mjhst.com:8000/v1/models`
- Review existing code in `applications/iknowyou/`

### Phase 2: Plan
- Identify which layer the work touches (frontend, API, vLLM client, observer, tidalDB)
- Check dependencies between layers
- Determine if design input is needed (delegate to `/aeries-design-architect`)
- Determine if product input is needed (delegate to `/aeries-product-visionary`)

### Phase 3: Implement
- Write types first (`lib/types.ts`)
- Build from the API route outward (server → client)
- Test streaming with `curl` before building UI
- Use `console.log` timestamps to verify streaming latency
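
As an illustration of "types first", the sketch below narrows an untyped request body to the `{ message, conversationId }` shape without using `any`. `ChatRequest`, `ParseResult`, and `parseChatRequest` are hypothetical names, and requiring a non-empty `conversationId` on every request is an assumption; the real contract lives in `lib/types.ts`.

```typescript
// Illustrative shapes; only `message` and `conversationId` come from this document.
interface ChatRequest {
  message: string;
  conversationId: string;
}

type ParseResult =
  | { ok: true; value: ChatRequest }
  | { ok: false; error: string };

// Narrow an untyped JSON body to ChatRequest without using `any`.
function parseChatRequest(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Request body must be a JSON object." };
  }
  const { message, conversationId } = body as Record<string, unknown>;
  if (typeof message !== "string" || message.trim() === "") {
    return { ok: false, error: "`message` must be a non-empty string." };
  }
  if (typeof conversationId !== "string" || conversationId === "") {
    return { ok: false, error: "`conversationId` must be a non-empty string." };
  }
  // Extra client fields (for example a client-side history) are dropped here,
  // which keeps the server as the only owner of conversation state.
  return { ok: true, value: { message, conversationId } };
}
```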

### Phase 4: Verify
- Test the full flow: type message → see streaming response → verify storage
- Check browser DevTools Network tab for SSE stream behavior
- Verify error handling: stop vLLM, send a message, and confirm a graceful error appears
- Run through Done Gate checklist

## Quick Reference

| Path | Purpose |
|------|---------|
| `applications/iknowyou/app/` | Next.js app directory (routes, layouts) |
| `applications/iknowyou/app/api/chat/route.ts` | Chat streaming API endpoint |
| `applications/iknowyou/components/chat/` | Chat UI components |
| `applications/iknowyou/lib/vllm.ts` | vLLM client (streaming) |
| `applications/iknowyou/lib/types.ts` | Shared TypeScript types |
| `applications/iknowyou/server/observer.ts` | Observer pipeline |
| `applications/iknowyou/server/brief.ts` | Brief assembly |
| `applications/iknowyou/devsetup.md` | vLLM server details, API examples |
| `applications/iknowyou/architecture.md` | System architecture |
| `.claude/agents/kai-park.md` | Engineer agent — stack, patterns, constraints |

## Infrastructure Quick Reference

| Resource | Location |
|----------|----------|
| **vLLM API** | `http://msd5685.mjhst.com:8000/v1` |
| **Model** | `Qwen/Qwen3-8B` |
| **SSH** | `ssh ubuntu@msd5685.mjhst.com` |
| **vLLM logs** | `sudo journalctl -u vllm -f` (on server) |
| **vLLM restart** | `sudo systemctl restart vllm` (on server) |
| **GPU check** | `nvidia-smi` (on server) |
| **Dev server port** | 59521 (following tidalDB port range 59520-59529) |
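
vLLM serves an OpenAI-compatible API under `/v1`. With `stream: true`, chat completions arrive as `data:` lines carrying JSON chunks and end with a `data: [DONE]` line. The sketch below shows one way to build the request body and pull the text delta out of a single line. `buildChatBody` and `parseStreamLine` are hypothetical helper names, and the chunk shape assumed is the standard `choices[0].delta.content`.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Body for POST http://msd5685.mjhst.com:8000/v1/chat/completions
function buildChatBody(messages: ChatMessage[]) {
  return { model: "Qwen/Qwen3-8B", messages, stream: true };
}

// Extract the text delta from one line of the streamed response.
// Returns null for non-data lines, the [DONE] sentinel, malformed JSON,
// and chunks that carry no text content.
function parseStreamLine(line: string): string | null {
  if (!line.startsWith("data:")) return null;
  const payload = line.slice("data:".length).trim();
  if (payload === "" || payload === "[DONE]") return null;
  try {
    const chunk = JSON.parse(payload) as {
      choices?: { delta?: { content?: unknown } }[];
    };
    const content = chunk.choices?.[0]?.delta?.content;
    return typeof content === "string" ? content : null;
  } catch {
    return null;
  }
}
```

A caller would split the response body on newlines, buffer any partial trailing line until the next network chunk arrives, and feed each complete line through `parseStreamLine`.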

## Standards

- All API responses are typed (no `any`)
- Streaming uses `ReadableStream` + SSE (not WebSocket)
- Observer runs async after response stream completes
- Client sends `{ message, conversationId }` — server owns history
- Error states show human-readable messages, never stack traces
- Every vLLM call sets a timeout (10 s for health checks, 30 s for completions)
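
The streaming and observer standards above can be sketched together. This is a minimal illustration under stated assumptions, not the actual route handler: `sseResponse`, the `onStreamClosed` hook, the `{ token }` event payload, and the `[DONE]` sentinel are all hypothetical choices. Client-disconnect handling is omitted for brevity.

```typescript
const SSE_HEADERS = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache, no-transform",
};

function sseResponse(
  tokens: AsyncIterable<string>,
  onStreamClosed: (fullText: string) => Promise<void>, // hypothetical observer hook
): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      let fullText = "";
      let completed = false;
      try {
        for await (const token of tokens) {
          fullText += token;
          controller.enqueue(encoder.encode(`data: ${JSON.stringify({ token })}\n\n`));
        }
        controller.enqueue(encoder.encode("data: [DONE]\n\n"));
        completed = true;
      } catch {
        // Human-readable message only; the underlying error never reaches the client.
        const message = "Something went wrong while generating a reply.";
        controller.enqueue(encoder.encode(`event: error\ndata: ${JSON.stringify({ message })}\n\n`));
      } finally {
        controller.close();
      }
      // The observer starts only after the stream has closed and is not awaited,
      // so it cannot add latency to the response.
      if (completed) void onStreamClosed(fullText).catch(() => undefined);
    },
  });
  return new Response(stream, { headers: SSE_HEADERS });
}
```

Because the hook fires after `controller.close()`, a slow or failing observer affects neither the tokens the user sees nor the time at which the stream ends.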

## Done Gate

- [ ] Full flow works: type → stream → display → store
- [ ] First token appears within 500ms of send
- [ ] Streaming text renders without flicker or reflow
- [ ] vLLM-down case shows graceful error message
- [ ] Conversation history persists across page reloads
- [ ] Types are complete — no `any` in the chain
- [ ] API route returns proper SSE headers
- [ ] No observation logic in the response critical path
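
For the 500 ms first-token gate, one option is to wrap the token stream in a logging pass-through. `withLatencyLog` is a hypothetical helper, and it measures server-side time to the first chunk only; what the user actually sees also includes network and render time, so the DevTools Network tab remains the final check.

```typescript
// Passes chunks through unchanged while logging time-to-first-chunk and total time.
async function* withLatencyLog<T>(
  source: AsyncIterable<T>,
  label: string,
  now: () => number = Date.now, // injectable clock for tests
): AsyncGenerator<T, void, undefined> {
  const start = now();
  let seenFirst = false;
  for await (const chunk of source) {
    if (!seenFirst) {
      seenFirst = true;
      console.log(`[${label}] first chunk after ${now() - start} ms`);
    }
    yield chunk;
  }
  console.log(`[${label}] stream closed after ${now() - start} ms`);
}
```

Wrapping the vLLM token iterable this way leaves the rest of the pipeline untouched, so the logging can be removed again without changing any call sites beyond the wrapper itself.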