---
name: aeries-fullstack-engineer
description: Build the Aeries chat application — frontend, API, vLLM streaming, observation pipeline, tidalDB integration
---

# aeries-fullstack-engineer

## When to Use

- Building or modifying the Aeries Next.js application
- Implementing chat streaming from vLLM
- Wiring up the observation pipeline (observer LM call → signal writes)
- Integrating tidalDB's iknowyou engine
- Fixing bugs in the chat flow, API routes, or real-time UI

Invoked via: `/aeries-fullstack-engineer`

## Delegation

This skill delegates to **@kai-park** — the Aeries full-stack engineer. All implementation, API design, streaming infrastructure, and tidalDB integration go through his lens.

For design decisions (colors, spacing, component visual specs), defer to **@kaya-osei** via `/aeries-design-architect`.

For product decisions (what to build, what to defer, personality), defer to **@mira-vasquez** via `/aeries-product-visionary`.

## Step Back

Before implementing, ask:

1. **Is the vLLM server healthy?** `curl http://msd5685.mjhst.com:8000/health` — if it's down, nothing else matters.
2. **Does this block the response stream?** Anything that adds latency to the user seeing tokens is wrong. Observation, signal writes, preference updates — all async, all after the stream closes.
3. **Am I over-engineering the MVP?** The first version needs: send message → stream response → store conversation. Not: authentication, multi-user, observation pipeline, preference vectors.
4. **Does the server own the truth?** The client sends `{ message, conversationId }`. Everything else — history, brief, observation — lives server-side.
5. **What happens when vLLM is slow or down?** Every external call needs a timeout and a graceful fallback. Never show a stack trace in the UI.

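Question 4 above can be made concrete with a small request guard. The sketch below is a minimal illustration, not the project's actual code: the `ChatRequest` interface and the `parseChatRequest` helper are hypothetical names, and only the two field names come from this document. It narrows an untyped JSON body without using `any` and discards anything else the client sends, such as a `history` array.

```typescript
// Hypothetical request type: the source specifies only the two field names.
interface ChatRequest {
  message: string;
  conversationId: string;
}

// Narrow an untyped JSON body to ChatRequest without using `any`.
// Returns null for malformed input so the caller can respond with a
// human-readable error instead of throwing.
function parseChatRequest(body: unknown): ChatRequest | null {
  if (typeof body !== "object" || body === null) return null;
  const record = body as Record<string, unknown>;
  const { message, conversationId } = record;
  if (typeof message !== "string" || message.trim() === "") return null;
  if (typeof conversationId !== "string" || conversationId === "") return null;
  // Copy only the two allowed fields. Client-supplied extras (for example
  // `history`) are dropped because the server owns that state.
  return { message, conversationId };
}
```

A route handler could call this on the parsed JSON body and return a 400 with a plain-language message when the result is `null`.
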
## Workflow

### Phase 1: Context

- Read `applications/iknowyou/devsetup.md` for infrastructure details
- Read `applications/iknowyou/architecture.md` for system design
- Check vLLM health: `curl http://msd5685.mjhst.com:8000/v1/models`
- Review existing code in `applications/iknowyou/`

### Phase 2: Plan

- Identify which layer the work touches (frontend, API, vLLM client, observer, tidalDB)
- Check dependencies between layers
- Determine if design input is needed (delegate to `/aeries-design-architect`)
- Determine if product input is needed (delegate to `/aeries-product-visionary`)

### Phase 3: Implement

- Write types first (`lib/types.ts`)
- Build from the API route outward (server → client)
- Test streaming with `curl` before building UI
- Use `console.log` timestamps to verify streaming latency

### Phase 4: Verify

- Test the full flow: type message → see streaming response → verify storage
- Check browser DevTools Network tab for SSE stream behavior
- Verify error handling (kill vLLM, send a message, see graceful error)
- Run through Done Gate checklist

## Quick Reference

| Path | Purpose |
|------|---------|
| `applications/iknowyou/app/` | Next.js app directory (routes, layouts) |
| `applications/iknowyou/app/api/chat/route.ts` | Chat streaming API endpoint |
| `applications/iknowyou/components/chat/` | Chat UI components |
| `applications/iknowyou/lib/vllm.ts` | vLLM client (streaming) |
| `applications/iknowyou/lib/types.ts` | Shared TypeScript types |
| `applications/iknowyou/server/observer.ts` | Observer pipeline |
| `applications/iknowyou/server/brief.ts` | Brief assembly |
| `applications/iknowyou/devsetup.md` | vLLM server details, API examples |
| `applications/iknowyou/architecture.md` | System architecture |
| `.claude/agents/kai-park.md` | Engineer agent — stack, patterns, constraints |

## Infrastructure Quick Reference

| Resource | Location |
|----------|----------|
| **vLLM API** | `http://msd5685.mjhst.com:8000/v1` |
| **Model** | `Qwen/Qwen3-8B` |
| **SSH** | `ssh ubuntu@msd5685.mjhst.com` |
| **vLLM logs** | `sudo journalctl -u vllm -f` (on server) |
| **vLLM restart** | `sudo systemctl restart vllm` (on server) |
| **GPU check** | `nvidia-smi` (on server) |
| **Dev server port** | 59521 (following tidalDB port range 59520-59529) |

## Standards

- All API responses are typed (no `any`)
- Streaming uses `ReadableStream` + SSE (not WebSocket)
- Observer runs async after response stream completes
- Client sends `{ message, conversationId }` — server owns history
- Error states show human-readable messages, never stack traces
- vLLM calls include timeout (10s for health, 30s for completion)

## Done Gate

- [ ] Full flow works: type → stream → display → store
- [ ] First token appears within 500ms of send
- [ ] Streaming text renders without flicker or reflow
- [ ] vLLM-down case shows graceful error message
- [ ] Conversation history persists across page reloads
- [ ] Types are complete — no `any` in the chain
- [ ] API route returns proper SSE headers
- [ ] No observation logic in the response critical path
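
The SSE requirements in Standards and the Done Gate can be illustrated with a short sketch. The `ChatStreamEvent` shape and the `formatSseEvent` helper are assumptions made for this example and may not match the actual types in `lib/types.ts`. `Content-Type: text/event-stream` is what the SSE format requires; the other two headers are commonly added to discourage caching and keep the connection open.

```typescript
// Headers typically returned by an SSE endpoint.
const SSE_HEADERS: Record<string, string> = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache",
  Connection: "keep-alive",
};

// Hypothetical event union: fully typed, and the error variant carries a
// human-readable message rather than a stack trace.
type ChatStreamEvent =
  | { type: "token"; text: string }
  | { type: "error"; message: string }
  | { type: "done" };

// Serialize one event as an SSE frame: a single `data:` line followed by a
// blank line. JSON.stringify escapes embedded newlines, so the payload
// always stays on one line.
function formatSseEvent(event: ChatStreamEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}
```

In a route handler, each frame would be encoded with `TextEncoder` and enqueued on the `ReadableStream` controller, with `SSE_HEADERS` passed to the `Response`.
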