---
name: aeries-fullstack-engineer
description: Build the Aeries chat application (frontend, API, vLLM streaming, observation pipeline, tidalDB integration)
---

# aeries-fullstack-engineer

## When to Use

- Building or modifying the Aeries Next.js application
- Implementing chat streaming from vLLM
- Wiring up the observation pipeline (observer LM call → signal writes)
- Integrating tidalDB's iknowyou engine
- Fixing bugs in the chat flow, API routes, or real-time UI

Invoked via: `/aeries-fullstack-engineer`

## Delegation

This skill delegates to **@kai-park**, the Aeries full-stack engineer. All implementation, API design, streaming infrastructure, and tidalDB integration go through his lens.

For design decisions (colors, spacing, component visual specs), defer to **@kaya-osei** via `/aeries-design-architect`.

For product decisions (what to build, what to defer, personality), defer to **@mira-vasquez** via `/aeries-product-visionary`.

## Step Back

Before implementing, ask:

1. **Is the vLLM server healthy?** Run `curl http://msd5685.mjhst.com:8000/health`. If it is down, nothing else matters.
2. **Does this block the response stream?** Anything that delays the user seeing tokens is wrong. Observation, signal writes, and preference updates are all async and all run after the stream closes.
3. **Am I over-engineering the MVP?** The first version needs only: send message → stream response → store conversation. It does not need authentication, multi-user support, the observation pipeline, or preference vectors.
4. **Does the server own the truth?** The client sends `{ message, conversationId }`. Everything else (history, brief, observation) lives server-side.
5. **What happens when vLLM is slow or down?** Every external call needs a timeout and a graceful fallback. Never show a stack trace in the UI.

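
Questions 1 and 5 share one pattern: bound the external call with a timeout, and turn any failure into a message that is safe to show. The sketch below is one way to do that, assuming a runtime with global `fetch` and `AbortController` (Node 18+ or the Next.js runtime). `checkVllmHealth`, `HealthResult`, `FetchLike`, and the message wording are hypothetical names, not taken from the Aeries codebase.

```typescript
// Hypothetical helper: names and message wording are illustrative assumptions.
type HealthResult = { ok: true } | { ok: false; userMessage: string };

// Narrow fetch signature so a fake can be injected in tests.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; status: number }>;

const FALLBACK_MESSAGE =
  "The assistant is temporarily unavailable. Please try again in a moment.";

async function checkVllmHealth(
  baseUrl: string,
  timeoutMs: number = 10_000, // 10 s health timeout
  fetchImpl: FetchLike = fetch,
): Promise<HealthResult> {
  const controller = new AbortController();
  // Abort the request if the server does not answer in time.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${baseUrl}/health`, { signal: controller.signal });
    return res.ok ? { ok: true } : { ok: false, userMessage: FALLBACK_MESSAGE };
  } catch {
    // Network error or timeout: the raw error is never surfaced to the UI.
    return { ok: false, userMessage: FALLBACK_MESSAGE };
  } finally {
    clearTimeout(timer);
  }
}
```

Injecting `fetchImpl` keeps the helper testable without a live server. In the application it would be called as `checkVllmHealth("http://msd5685.mjhst.com:8000")`.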
## Workflow

### Phase 1: Context
- Read `applications/iknowyou/devsetup.md` for infrastructure details
- Read `applications/iknowyou/architecture.md` for system design
- Check that vLLM is up and serving the model: `curl http://msd5685.mjhst.com:8000/v1/models`
- Review existing code in `applications/iknowyou/`

### Phase 2: Plan
- Identify which layer the work touches (frontend, API, vLLM client, observer, tidalDB)
- Check dependencies between layers
- Determine if design input is needed (delegate to `/aeries-design-architect`)
- Determine if product input is needed (delegate to `/aeries-product-visionary`)

### Phase 3: Implement
- Write types first (`lib/types.ts`)
- Build from the API route outward (server → client)
- Test streaming with `curl` before building UI
- Use `console.log` timestamps to verify streaming latency

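
As an illustration of "types first", a starting point for `lib/types.ts` might look like the sketch below. Only the `{ message, conversationId }` request shape comes from this document. The string type of `conversationId`, the `ChatStreamEvent` union, and the `isChatRequest` guard are assumptions made for the example.

```typescript
// Request body sent by the client; the server owns everything else.
interface ChatRequest {
  message: string;
  conversationId: string; // assumed to be a string identifier
}

// Hypothetical event shapes for the server-to-client stream.
type ChatStreamEvent =
  | { type: "token"; text: string }
  | { type: "done" }
  | { type: "error"; message: string };

// Runtime guard so the API route never has to fall back to `any`.
function isChatRequest(value: unknown): value is ChatRequest {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.message === "string" &&
    candidate.message.length > 0 &&
    typeof candidate.conversationId === "string"
  );
}

// Example event, typed against the union above.
const exampleEvent: ChatStreamEvent = { type: "token", text: "Hello" };
```

An API route can call `isChatRequest(await request.json())` and reject anything that fails the guard before touching vLLM.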
### Phase 4: Verify
- Test the full flow: type message → see streaming response → verify storage
- Check browser DevTools Network tab for SSE stream behavior
- Verify error handling (kill vLLM, send a message, see graceful error)
- Run through Done Gate checklist

## Quick Reference

| Path | Purpose |
|------|---------|
| `applications/iknowyou/app/` | Next.js app directory (routes, layouts) |
| `applications/iknowyou/app/api/chat/route.ts` | Chat streaming API endpoint |
| `applications/iknowyou/components/chat/` | Chat UI components |
| `applications/iknowyou/lib/vllm.ts` | vLLM client (streaming) |
| `applications/iknowyou/lib/types.ts` | Shared TypeScript types |
| `applications/iknowyou/server/observer.ts` | Observer pipeline |
| `applications/iknowyou/server/brief.ts` | Brief assembly |
| `applications/iknowyou/devsetup.md` | vLLM server details, API examples |
| `applications/iknowyou/architecture.md` | System architecture |
| `.claude/agents/kai-park.md` | Engineer agent: stack, patterns, constraints |

## Infrastructure Quick Reference

| Resource | Location |
|----------|----------|
| **vLLM API** | `http://msd5685.mjhst.com:8000/v1` |
| **Model** | `Qwen/Qwen3-8B` |
| **SSH** | `ssh ubuntu@msd5685.mjhst.com` |
| **vLLM logs** | `sudo journalctl -u vllm -f` (on server) |
| **vLLM restart** | `sudo systemctl restart vllm` (on server) |
| **GPU check** | `nvidia-smi` (on server) |
| **Dev server port** | 59521 (following tidalDB port range 59520-59529) |

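
vLLM's OpenAI-compatible server streams chat completions as server-sent events when the request sets `stream: true`: each event is a `data:` line carrying a JSON chunk, and the stream ends with `data: [DONE]`. The sketch below shows one way a client such as `lib/vllm.ts` could build the request body and pull token text out of that stream. `buildChatRequestBody` and `createSseTokenParser` are hypothetical names; the real client may be structured differently.

```typescript
// Hypothetical helpers; only the model name comes from the table above.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Body for POST {vLLM API}/chat/completions.
function buildChatRequestBody(messages: ChatMessage[]) {
  return { model: "Qwen/Qwen3-8B", messages, stream: true };
}

// Returns a feed() function that accepts raw text chunks (which may split
// an SSE line anywhere) and returns the token strings completed so far.
function createSseTokenParser(): (chunk: string) => string[] {
  let buffer = "";
  return (chunk: string): string[] => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the trailing partial line for later
    const tokens: string[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith("data:")) continue;
      const payload = trimmed.slice("data:".length).trim();
      if (payload === "[DONE]") continue;
      try {
        const parsed = JSON.parse(payload) as {
          choices?: { delta?: { content?: string } }[];
        };
        const text = parsed.choices?.[0]?.delta?.content;
        if (text) tokens.push(text);
      } catch {
        // Skip malformed lines rather than failing the whole stream.
      }
    }
    return tokens;
  };
}
```

Buffering the trailing partial line matters because network chunks do not align with SSE event boundaries.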
## Standards

- All API responses are typed (no `any`)
- Streaming uses `ReadableStream` + SSE (not WebSocket)
- Observer runs async after the response stream completes
- Client sends `{ message, conversationId }`; the server owns history
- Error states show human-readable messages, never stack traces
- vLLM calls include a timeout (10 s for health, 30 s for completion)

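
The streaming and observer standards fit together as in the sketch below: tokens go out through a `ReadableStream` as SSE frames, and the observer is started only after the last frame has been enqueued, without being awaited. This assumes a runtime with the web `Response`, `ReadableStream`, and `TextEncoder` globals (Node 18+ or the Next.js runtime). `sseResponse`, the frame payload shape, and the `[DONE]` sentinel are illustrative assumptions, not the project's confirmed wire format.

```typescript
// Hypothetical helper for an API route; names and frame format are assumptions.
const SSE_HEADERS = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache, no-transform",
};

function sseResponse(
  tokens: AsyncIterable<string>,
  afterStream: (fullText: string) => Promise<void>, // e.g. the observer
): Response {
  const encoder = new TextEncoder();
  const iterator = tokens[Symbol.asyncIterator]();
  let fullText = "";

  const stream = new ReadableStream<Uint8Array>({
    // pull() runs whenever the consumer is ready for more data.
    async pull(controller) {
      try {
        const next = await iterator.next();
        if (!next.done) {
          fullText += next.value;
          const frame = `data: ${JSON.stringify({ text: next.value })}\n\n`;
          controller.enqueue(encoder.encode(frame));
          return;
        }
        controller.enqueue(encoder.encode("data: [DONE]\n\n"));
        controller.close();
        // Fire and forget: observation never delays the response.
        void afterStream(fullText).catch(() => undefined);
      } catch {
        // Upstream failure: send a readable message, never a stack trace.
        const message = "Something went wrong. Please try again.";
        controller.enqueue(
          encoder.encode(`event: error\ndata: ${JSON.stringify({ message })}\n\n`),
        );
        controller.close();
      }
    },
  });

  return new Response(stream, { headers: SSE_HEADERS });
}
```

Using `pull` rather than a loop in `start` lets the stream respect backpressure from a slow client.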
## Done Gate

- [ ] Full flow works: type → stream → display → store
- [ ] First token appears within 500 ms of send
- [ ] Streaming text renders without flicker or reflow
- [ ] vLLM-down case shows graceful error message
- [ ] Conversation history persists across page reloads
- [ ] Types are complete, with no `any` in the chain
- [ ] API route returns proper SSE headers
- [ ] No observation logic in the response critical path
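
The 500 ms first-token gate can be checked in code rather than by eye. The sketch below wraps any token stream and reports the delay before the first item arrives. `withFirstTokenTiming` and `meetsFirstTokenGate` are hypothetical helpers. The clock starts when the wrapper is first iterated, so the wrapper should be created and consumed as close to the send as possible.

```typescript
// Hypothetical helpers for checking the first-token gate.
const FIRST_TOKEN_BUDGET_MS = 500;

function meetsFirstTokenGate(latencyMs: number): boolean {
  return latencyMs <= FIRST_TOKEN_BUDGET_MS;
}

// Passes items through unchanged and reports first-item latency once.
async function* withFirstTokenTiming<T>(
  source: AsyncIterable<T>,
  onFirstToken: (latencyMs: number) => void,
  now: () => number = () => Date.now(), // injectable clock for tests
): AsyncGenerator<T> {
  const start = now(); // taken on the first next() call, not at creation
  let seenFirst = false;
  for await (const item of source) {
    if (!seenFirst) {
      seenFirst = true;
      onFirstToken(now() - start);
    }
    yield item;
  }
}
```

A route or test could wrap the vLLM token stream with this and log `meetsFirstTokenGate(latencyMs)` alongside the `console.log` timestamps used in Phase 3.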