Major additions:
- Community Next.js app (port 18187) for browsing claims with API docs
- stemedb-chaos crate: Fault injection, chaos testing, CRDT properties
- Latent ingestion system: Reddit/FDA ingesters with ADK-Go agents
- Disputed claims handling: Manual review workflows and validation
- Aphoria security scanner: New extractors (SQL injection, command injection,
  weak crypto, TLS version), policy-based ignores, UAT reports
- Docker infrastructure: Dockerfile, docker-compose.yml for full stack
- VulnBank demo: Intentionally vulnerable multi-language test corpus

SDK & API enhancements:
- Source registry handlers for tracking data provenance
- Metrics endpoint
- Skeptic filtering improvements

Code quality:
- Split 14 large files (>500 lines) into focused modules
- All files now under 500-line limit per project guidelines

Documentation:
- Chaos testing guide, circuit breakers, observability docs
- Phase 7 UAT documentation updates
- Martin Kleppmann technical writer agent

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
86 lines
3.9 KiB
Python
"""
Reddit Adverse Event Agent definition using Google ADK.

This agent extracts adverse health events from Reddit posts about GLP-1 medications
and stores them as signed assertions in StemeDB.
"""

from google.adk.agents import Agent

# Support both package and script imports
try:
    from .tools import fetch_reddit_posts, store_assertion
except ImportError:
    from tools import fetch_reddit_posts, store_assertion

# Agent instruction (detailed extraction guidelines)
AGENT_INSTRUCTION = """You are a medical adverse event extraction agent. Your task is to identify and extract adverse health events reported in Reddit posts about GLP-1 weight loss medications (Ozempic, Wegovy, Mounjaro).

## Your Workflow

1. **Fetch Posts**: Use `fetch_reddit_posts` to retrieve recent posts from GLP-1 medication subreddits that mention potential adverse events.

2. **Analyze Each Post**: For each post, carefully identify:
   - The specific medication mentioned (semaglutide/Ozempic/Wegovy, tirzepatide/Mounjaro)
   - Any adverse health events, side effects, or negative outcomes reported
   - The severity of the reported issue (low/medium/high)
   - Whether this appears to be a first-hand account or hearsay

3. **Store Assertions**: For each identified adverse event, use `store_assertion` to record it in StemeDB.

## Extraction Guidelines

### Predicates to Use
- `side_effect` - For common side effects like nausea, vomiting, fatigue
- `adverse_event` - For serious events like hospitalization, ER visits, gastroparesis
- `efficacy_issue` - For reports of the drug not working or tolerance developing
- `interaction` - For drug interaction reports
- `discontinuation` - For reports of stopping the medication due to issues

### Severity Levels
- `low` - Minor discomfort, temporary symptoms (nausea, headache, fatigue)
- `medium` - Significant symptoms affecting daily life (persistent vomiting, hair loss, severe pain)
- `high` - Serious medical events (hospitalization, ER visit, gastroparesis, severe allergic reaction)

### Confidence Guidelines
Since this is anecdotal social media data (Tier 5), confidence scores should be conservative:
- 0.3-0.4: Vague reports, secondhand information, unclear attribution
- 0.4-0.5: Clear first-person report but without medical confirmation
- 0.5-0.6: Detailed first-person account with specific symptoms
- 0.6-0.7: First-person account with medical confirmation mentioned

### Important Rules
1. Only extract adverse events that are clearly attributed to the medication
2. Do not extract posts that are simply asking questions without reporting symptoms
3. Do not extract positive experiences or weight loss success stories
4. Maintain objectivity - extract what is reported without editorializing
5. When in doubt about attribution, use lower confidence scores
6. Always include the source URL for provenance

## Example Extraction

Post: "Been on Ozempic for 3 months. Ended up in the ER last week with severe stomach pain. Doctor said it might be gastroparesis."

Extractions:
1. subject="semaglutide", predicate="adverse_event", object="gastroparesis", severity="high", confidence=0.6
2. subject="semaglutide", predicate="adverse_event", object="severe_abdominal_pain", severity="high", confidence=0.6
3. subject="semaglutide", predicate="adverse_event", object="emergency_room_visit", severity="high", confidence=0.65

## Batch Processing

When given a batch request, process multiple subreddits efficiently:
1. Fetch posts from each subreddit
2. Analyze and extract adverse events
3. Store each assertion individually
4. Report a summary of findings at the end
"""

# Create the agent
reddit_adverse_event_agent = Agent(
    model="gemini-2.0-flash",
    name="reddit_adverse_event_agent",
    description="Extracts adverse health events from Reddit posts about GLP-1 medications and stores them in StemeDB",
    instruction=AGENT_INSTRUCTION,
    tools=[fetch_reddit_posts, store_assertion],
)
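
The instruction relies on the model to respect the predicate vocabulary, severity levels, and the conservative Tier 5 confidence band. A minimal sketch of how those rules could also be checked in code before an assertion is stored is shown below. The `Extraction` record and `validate_extraction` helper are hypothetical names introduced for illustration only; they are not part of this file, and the real `store_assertion` signature is not shown here, so treat the field names as assumptions taken from the example extraction.

```python
from dataclasses import dataclass
from typing import List

# Vocabulary copied from the agent instruction.
PREDICATES = {
    "side_effect",
    "adverse_event",
    "efficacy_issue",
    "interaction",
    "discontinuation",
}
SEVERITIES = {"low", "medium", "high"}

# Overall confidence band the instruction allows for Tier 5 (anecdotal) data.
MIN_CONFIDENCE = 0.3
MAX_CONFIDENCE = 0.7


@dataclass
class Extraction:
    """Hypothetical record mirroring the fields in the example extraction."""

    subject: str
    predicate: str
    obj: str  # "object" in the instruction; renamed to avoid shadowing the builtin
    severity: str
    confidence: float
    source_url: str


def validate_extraction(e: Extraction) -> List[str]:
    """Return a list of rule violations; an empty list means the record passes."""
    problems = []
    if e.predicate not in PREDICATES:
        problems.append(f"unknown predicate: {e.predicate}")
    if e.severity not in SEVERITIES:
        problems.append(f"unknown severity: {e.severity}")
    if not (MIN_CONFIDENCE <= e.confidence <= MAX_CONFIDENCE):
        problems.append(f"confidence {e.confidence} outside the Tier 5 band")
    if not e.source_url:
        problems.append("missing source URL")
    return problems
```

A check like this could sit in front of the storage call so that out-of-vocabulary predicates or overconfident scores are rejected rather than silently recorded. Whether rejection, clamping, or logging is the right response is a design choice this sketch leaves open.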