FindMyHealth: Technical Architecture
The Automated Ingestion Pipeline
This architecture leverages the existing StemeDB backbone and shifts the focus from "database" to "automated research engine." The pipeline identifies what is trending, finds the evidence, and structures the Truth Lenses with minimal human effort.
Pipeline Overview
[Watchtower] Trend detection & topic selection
|
v
[Harvester] Fan-out scraping across all tiers
|
v
[Extraction Cortex] LLM-powered claim extraction
|
v
[StemeDB Spine] Assertion storage, indexing, lens resolution
|
v
[Output Engine] Newsletter generation & premium alerts
1. The Watchtower (Trigger Layer)
A cron job (or serverless function) that identifies the "Subject" for the research sprint.
- Google Trends API: Filter for `Health` and `Science` categories with >50% breakout velocity.
- Reddit Scraper: Monitor `r/all`, `r/Biohackers`, and `r/Nootropics` for keyword frequency spikes.
- PubMed RSS: Watch for new publications in high-impact journals (NEJM, Lancet).
- Output: A `TopicID` (e.g., `magnesium_threonate_sleep`) sent to the Orchestrator.
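As a rough illustration of the trigger logic, the sketch below flags a keyword as breaking out when its mention count grows by more than 50% over a baseline window, then turns the winner into a `TopicID`-style slug. The `KeywordCount` shape, the `breakoutVelocity` formula, and the `toTopicId` slug rule are all assumptions made for this example; the document does not define how velocity is measured.

```typescript
// Hypothetical input: mention counts for one keyword in two adjacent windows.
interface KeywordCount {
  keyword: string;
  baseline: number; // mentions in the previous window
  current: number;  // mentions in the latest window
}

// Assumed definition of velocity: relative growth over the baseline window.
function breakoutVelocity(k: KeywordCount): number {
  if (k.baseline <= 0) return k.current > 0 ? Infinity : 0;
  return (k.current - k.baseline) / k.baseline;
}

// Convert free text into a lowercase, underscore-separated slug.
function toTopicId(keyword: string): string {
  return keyword
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

// Return the fastest-rising keyword strictly above the threshold, or null.
function selectTopic(counts: KeywordCount[], threshold = 0.5): string | null {
  const rising = counts
    .map((k) => ({ k, v: breakoutVelocity(k) }))
    .filter((x) => x.v > threshold)
    .sort((a, b) => b.v - a.v);
  return rising.length > 0 ? toTopicId(rising[0].k.keyword) : null;
}
```

Returning `null` when nothing clears the threshold lets the cron job exit without starting a research sprint.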
2. The Harvester (Scraping Layer)
Once a topic is picked, the system fans out to gather the "Evidence Stack."
Tier 0/1 (Regulatory/Clinical):
- PubMed/NCBI API: Fetch abstracts of the top 5 most cited papers on the topic.
- FDA/EMA Crawlers: Search official label databases and Adverse Event reports.
Tier 5 (Anecdotal):
- Reddit API: Fetch the top 3 threads from the last 90 days.
- Twitter/X API: Sample recent high-engagement posts for sentiment signal.
Output: A collection of raw text blobs associated with the TopicID.
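The fan-out step might be sketched as below, assuming each source is wrapped in an async function that returns raw text for a topic. The `RawDocument` and `Source` types and the `harvest` function are hypothetical names for this example; real PubMed, FDA, or Reddit clients would sit behind the `Source` signature.

```typescript
type Tier = 0 | 1 | 5;

// Hypothetical shape of one raw text blob in the Evidence Stack.
interface RawDocument {
  topicId: string;
  tier: Tier;
  source: string;
  text: string;
}

// A source is any async function that returns raw documents for a topic.
type Source = (topicId: string) => Promise<RawDocument[]>;

// Run every source concurrently. A failing source is recorded by name and
// skipped, so one rate-limited API does not sink the whole sprint.
async function harvest(
  topicId: string,
  sources: Record<string, Source>
): Promise<{ documents: RawDocument[]; failed: string[] }> {
  const names = Object.keys(sources);
  const results = await Promise.allSettled(names.map((n) => sources[n](topicId)));

  const documents: RawDocument[] = [];
  const failed: string[] = [];
  results.forEach((r, i) => {
    if (r.status === "fulfilled") documents.push(...r.value);
    else failed.push(names[i]);
  });
  return { documents, failed };
}
```

`Promise.allSettled` is used instead of `Promise.all` so that partial evidence is still returned; the `failed` list gives the orchestrator something to retry.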
3. The Extraction Cortex (LLM Layer)
Raw text becomes StemeDB Signed Assertions via structural extraction using a high-reasoning model.
Prompt Instruction:
"Extract every distinct claim regarding [Topic]. For each claim, identify: The Proposition (Subject-Predicate-Object), the Date of the claim, the Source Type, and the Confidence level of the author. Output as a JSON array of StemeDB assertions."
Transformation Example:
- Input: "Reddit user says: 'I took Mag Threonate and had vivid nightmares for a week.'"
- Output:
```json
{
  "subject": "magnesium_threonate",
  "predicate": "side_effect",
  "object": "vivid_nightmares",
  "source_class": 5,
  "confidence": 0.9,
  "timestamp": 1707340800,
  "source_metadata": {"user": "u/jdoe", "platform": "reddit"}
}
```
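Model output is not guaranteed to match the schema, so a validation pass before storage is worth considering. The sketch below mirrors the fields of the example above; the `Assertion` interface, the `isAssertion` guard, and the assumption that `confidence` lies in the range 0 to 1 are illustrative and not taken from StemeDB itself.

```typescript
// Field names follow the transformation example; the types are assumed.
interface Assertion {
  subject: string;
  predicate: string;
  object: string;
  source_class: number;
  confidence: number;
  timestamp: number;
  source_metadata: Record<string, string>;
}

// Structural check only: it does not validate the values inside source_metadata.
function isAssertion(value: unknown): value is Assertion {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.subject === "string" &&
    typeof v.predicate === "string" &&
    typeof v.object === "string" &&
    typeof v.source_class === "number" &&
    Number.isInteger(v.source_class) &&
    typeof v.confidence === "number" &&
    v.confidence >= 0 &&
    v.confidence <= 1 &&
    typeof v.timestamp === "number" &&
    typeof v.source_metadata === "object" &&
    v.source_metadata !== null
  );
}

// Parse the raw model response, keep well-formed assertions, count the rest.
function parseAssertions(modelOutput: string): { valid: Assertion[]; rejected: number } {
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelOutput);
  } catch {
    return { valid: [], rejected: 1 };
  }
  const items: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
  const valid = items.filter(isAssertion);
  return { valid, rejected: items.length - valid.length };
}
```

Tracking the `rejected` count per run gives an early signal when a prompt change degrades extraction quality.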
4. The Spine (StemeDB Integration)
Extracted assertions are pushed into StemeDB.
- Latticing: StemeDB automatically indexes new assertions against the existing graph.
- Lens Resolution: The system runs a `SkepticLens` query. If the Tier 5 "Social" cluster deviates significantly from the Tier 0 "Regulatory" consensus, a Conflict Flag is raised.
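The document does not say how deviation between tiers is measured. One possible reading, shown purely as an assumption, is the confidence-weighted share of Tier 5 claims that have no matching Tier 0 claim. The `Claim` type and `conflictScore` function below are hypothetical and do not represent the StemeDB lens API.

```typescript
interface Claim {
  subject: string;
  predicate: string;
  object: string;
  source_class: number; // 0 = regulatory, 5 = anecdotal
  confidence: number;   // assumed to lie in [0, 1]
}

// Two claims match when subject, predicate, and object are identical.
const claimKey = (c: Claim): string => `${c.subject}|${c.predicate}|${c.object}`;

// Assumed metric: weight of Tier 5 claims unsupported by Tier 0, divided by
// the total Tier 5 weight. Returns 0 when there is no Tier 5 evidence.
function conflictScore(claims: Claim[]): number {
  const regulatory = new Set(claims.filter((c) => c.source_class === 0).map(claimKey));
  const social = claims.filter((c) => c.source_class === 5);

  const total = social.reduce((sum, c) => sum + c.confidence, 0);
  if (total === 0) return 0;

  const unsupported = social
    .filter((c) => !regulatory.has(claimKey(c)))
    .reduce((sum, c) => sum + c.confidence, 0);
  return unsupported / total;
}
```

Under this definition, a score near 1 means almost all anecdotal weight sits on claims the regulatory record does not mention.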
5. The Output Engine (Newsletter/App)
The final layer converts database state into human-readable intelligence.
- Automated Summarization: An LLM reads the resolved state of StemeDB (the output of the Truth Lens) and writes a 200-word summary for the newsletter.
- Alert Trigger: If `ConflictScore > 0.8`, the system pushes a notification to Premium users: "Emerging Signal: High volume of anecdotal reports for [Topic] contradicts clinical data."
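The alert rule can be expressed as a small pure function, which keeps the threshold in one place and makes the boundary case explicit. `buildAlert` and `PremiumAlert` are hypothetical names for this sketch; delivery to users is out of scope here.

```typescript
const CONFLICT_THRESHOLD = 0.8;

interface PremiumAlert {
  topic: string;
  message: string;
}

// Returns an alert only when the score is strictly greater than the threshold.
function buildAlert(topic: string, conflictScore: number): PremiumAlert | null {
  if (!(conflictScore > CONFLICT_THRESHOLD)) return null;
  return {
    topic,
    message: `Emerging Signal: High volume of anecdotal reports for ${topic} contradicts clinical data.`,
  };
}
```

Writing the guard as `!(score > threshold)` also rejects `NaN`, so a failed score computation never triggers a notification.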
Email Architecture: The Dual-Track System
Resend handles two tracks: Transactional (high-priority, immediate) and Broadcast (bulk, scheduled).
| Track | Type | Usage | Strategy |
|---|---|---|---|
| Track A: Transactional | Individual API calls | Alerts, password resets, opt-ins | resend.emails.send() |
| Track B: Broadcasts | Batch/Audience API | Daily Evidence Pulse, Weekly Trends | resend.broadcasts.create() |
Broadcast Engine (Newsletter)
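```typescript
// /lib/email/broadcast.ts
import { Resend } from 'resend';
import { render } from '@react-email/render';
// Assumed location of the digest template; adjust to the project's layout.
import { DigestTemplate } from '@/emails/digest-template';

const resend = new Resend(process.env.RESEND_API_KEY);

export async function sendDailyDigest(topicData: any) {
  // create() only stores the broadcast; send() dispatches it to the audience.
  const { data, error } = await resend.broadcasts.create({
    audienceId: process.env.FMH_AUDIENCE_ID!,
    from: 'FindMyHealth Intel <digest@findmyhealth.com>',
    subject: `[Evidence Alert] ${topicData.topic_name}: Signal Shift Detected`,
    html: await render(DigestTemplate({ data: topicData })),
  });
  if (error || !data) throw new Error(`Broadcast creation failed: ${error?.message}`);
  await resend.broadcasts.send(data.id);
}
```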
Evidence Alert (Transactional)
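```typescript
// /api/alerts/trigger.ts
import { Resend } from 'resend';
import { AlertTemplate } from '@/emails/alert-template'; // assumed path

const resend = new Resend(process.env.RESEND_API_KEY);

// `user`, `substance`, and `conflictDetails` come from the surrounding handler.
const { data, error } = await resend.emails.send({
  from: 'FindMyHealth Alerts <alerts@findmyhealth.com>',
  to: user.email,
  subject: `Urgent: New Research Conflict for ${substance}`,
  react: AlertTemplate({ substance, conflictDetails }),
  tags: [{ name: 'category', value: 'conflict_alert' }],
});
// The SDK returns errors instead of throwing, so check before using `data`.
if (error) throw new Error(`Alert send failed: ${error.message}`);
```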
Subscriber Flow
- Opt-in: User signs up on the homepage.
- Double Opt-in (Mandatory): Transactional email with unique verification link.
- Audience Sync: Add to Resend `Audience` only after verification click.
- Tagging: Add metadata (e.g., `specialty: "oncology"`, `tier: "premium"`) for segmented broadcasts.
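The flow above amounts to a small state machine in which a subscriber can only reach the audience after verification. The sketch below models that ordering; the `Subscriber` type, the status names, and the `addContact` callback (a stand-in for whatever call adds a contact to the Resend audience) are assumptions for illustration.

```typescript
type SubscriberStatus = "pending" | "verified" | "synced";

interface Subscriber {
  email: string;
  status: SubscriberStatus;
  token: string; // unique verification token sent in the opt-in email
  tags: Record<string, string>;
}

// Step 1: sign-up creates a pending record; nothing is added to the audience yet.
function signUp(email: string, token: string): Subscriber {
  return { email, status: "pending", token, tags: {} };
}

// Step 2: the verification click promotes the record only if the token matches.
// A production version should use a constant-time comparison and token expiry.
function verify(sub: Subscriber, presentedToken: string): Subscriber {
  if (sub.status !== "pending" || presentedToken !== sub.token) return sub;
  return { ...sub, status: "verified" };
}

// Steps 3 and 4: sync to the audience and attach segmentation tags.
function syncToAudience(
  sub: Subscriber,
  tags: Record<string, string>,
  addContact: (email: string) => void
): Subscriber {
  if (sub.status !== "verified") {
    throw new Error(`Cannot sync ${sub.email}: status is ${sub.status}`);
  }
  addContact(sub.email);
  return { ...sub, status: "synced", tags };
}
```

Making `syncToAudience` throw on unverified records enforces the mandatory double opt-in in code, not just by convention.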
Webhook Feedback Loop
- `email.bounced`: Mark user as `inactive` to protect sender reputation.
- `email.clicked`: Track which Truth Tiers users engage with to inform ingestion priority.
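A webhook handler for these two events might reduce to the function below. The payload shape (`type`, `data.to`, `data.click.link`) is an assumption about the event body, as is the idea that newsletter links carry a `tier` query parameter; both should be checked against the actual payloads, and a real endpoint would also need to verify the webhook signature before trusting the body.

```typescript
// Assumed minimal shape of an incoming webhook event.
interface EmailEvent {
  type: string;
  data: {
    to: string[];
    click?: { link: string };
  };
}

interface FeedbackState {
  inactive: Set<string>;           // addresses to stop emailing
  tierClicks: Map<number, number>; // click counts per Truth Tier
}

// Update the feedback state in place for one event and return it.
function applyEvent(state: FeedbackState, event: EmailEvent): FeedbackState {
  if (event.type === "email.bounced") {
    for (const address of event.data.to) state.inactive.add(address);
  } else if (event.type === "email.clicked" && event.data.click) {
    // Hypothetical convention: links look like https://.../topic?tier=5
    const match = /[?&]tier=(\d+)/.exec(event.data.click.link);
    if (match) {
      const tier = Number(match[1]);
      state.tierClicks.set(tier, (state.tierClicks.get(tier) ?? 0) + 1);
    }
  }
  return state;
}
```

Unknown event types fall through untouched, so new webhook subscriptions do not break the handler.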
Deliverability Checklist
- Subdomain Isolation: `digest.findmyhealth.com` for newsletters, `auth.findmyhealth.com` for transactional.
- DKIM/SPF: Authenticate via Resend DNS settings.
- Plain Text Fallback: Always include a `text` version via `@react-email/components`.
- Batching: Use `resend.batch.send()` (up to 100 per call) to stay under rate limits.
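For the batching item, the only logic needed on the caller's side is splitting the outgoing list into groups of at most 100. A minimal sketch, with `chunk` and `MAX_BATCH_SIZE` as illustrative names:

```typescript
const MAX_BATCH_SIZE = 100;

// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number = MAX_BATCH_SIZE): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}
```

Each group can then be passed to `resend.batch.send()` in turn, ideally with a short pause between calls if the account's rate limit is tight.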
Tech Stack
| Component | Technology | Why |
|---|---|---|
| Orchestrator | Temporal.io or Node-RED | Long-running, retriable scraping workflows |
| Scrapers | Firecrawl or Apify | Turns complex websites into LLM-ready Markdown |
| The Cortex | Claude API | Best-in-class at complex JSON extraction schemas |
| The Spine | StemeDB | Custom probabilistic knowledge graph |
| Frontend | Next.js + Tailwind | Fast, SEO-friendly, Linear/Stripe aesthetic |
| Email | Resend + React Email | Developer-first, compliance built-in |