# Latent: Reddit Ingestor (Tier 5)

This component monitors social signals ("The Noise") to detect latent safety issues before they appear in clinical literature.

## Scope (Week 2)

- **Source:** Reddit API (PRAW)
- **Targets:** r/Ozempic, r/Mounjaro, r/Semaglutide, r/Wegovy
- **Method:** Fetches `new` posts, filters by severity keywords (`paralysis`, `er`, `hospital`), and extracts structured assertions.

## Setup

1. **Credentials:**
   You need a Reddit App (Script type). Go to [https://www.reddit.com/prefs/apps](https://www.reddit.com/prefs/apps).
   Create a `.env` file in `latent/ingest-reddit/`:

   ```env
   REDDIT_CLIENT_ID=your_id_here
   REDDIT_CLIENT_SECRET=your_secret_here
   REDDIT_USER_AGENT=LatentBot/0.1
   # Optional: For real extraction (otherwise uses Regex Mock)
   OPENAI_API_KEY=sk-...
   ```

2. **Install:**

   ```bash
   pip install -r requirements.txt
   ```

3. **Run:**

   ```bash
   python main.py
   ```

## Output

Generates `tier5_social_graph.jsonl`. Entries look like:

```json
{
  "subject": "semaglutide",
  "predicate": "side_effect",
  "object": "gastroparesis",
  "source_class": 5,
  "source_metadata": {
    "type": "reddit_post",
    "subreddit": "Ozempic",
    "severity": "high"
  }
}
```

## Privacy Note

Author names are hashed (`hash(post["author"])`) before storage to provide basic anonymization while allowing for "Cluster" analysis (detecting if one user is spamming vs many unique users).
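
One caveat on the hashing step: Python's built-in `hash()` is salted per process for strings, so the same author can map to different values across runs unless `PYTHONHASHSEED` is fixed, which would weaken cluster analysis over multiple ingests. The sketch below shows one possible stable alternative using a keyed SHA-256 digest. The function name `anonymize_author` and the `AUTHOR_HASH_SALT` environment variable are hypothetical and not part of this project; treat it as an illustration under those assumptions rather than the ingestor's actual implementation.

```python
import hashlib
import hmac
import os

# Hypothetical secret salt; the variable name AUTHOR_HASH_SALT is an assumption.
SALT = os.environ.get("AUTHOR_HASH_SALT", "change-me").encode("utf-8")


def anonymize_author(author: str, salt: bytes = SALT) -> str:
    """Return a stable, keyed hex digest for a Reddit author name."""
    digest = hmac.new(salt, author.encode("utf-8"), hashlib.sha256)
    # Truncated to 16 hex characters for readability in the JSONL output.
    return digest.hexdigest()[:16]


if __name__ == "__main__":
    post = {"author": "example_user"}  # illustrative post dict
    print(anonymize_author(post["author"]))
```

Because the digest depends only on the author name and the salt, the same user yields the same identifier on every run, while the raw name cannot be recovered without the salt.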