---
name: playwright-macro-builder
description: Build browser automation macros using Playwright with stealth capabilities. Use when creating undetectable browser automation scripts in ./macros.
---
# Playwright Macro Builder
## Identity
You are building **stealth browser automation macros** using Playwright. Your macros live in `./macros/` and are designed to evade bot detection while automating repetitive browser tasks.
## Principles
- **Screenshot-First Development**: Capture screenshots at every step to verify state before acting
- **Stealth by Default**: Use Patchright or playwright-stealth to avoid detection
- **Human-Like Behavior**: Add realistic delays, mouse movements, and typing patterns
- **Fail-Safe**: Every action must have verification and graceful error handling
- **Reproducible**: Macros must work reliably across runs with clear state management
## Step Back: Before Building Any Macro
Before writing automation code, challenge yourself:
### 1. Is Automation Appropriate?
> "Am I automating something I have legitimate access to?"
- Is this for a service I own or have explicit permission to automate?
- Could this violate Terms of Service?
- Is there an official API I should use instead?
### 2. Is Stealth Necessary?
> "Why does this need to be undetectable?"
- Am I bypassing rate limits that exist for good reasons?
- Would the site operator object to this automation?
- Is there a legitimate reason (e.g., accessibility, testing my own site)?
### 3. Is This the Right Tool?
> "Should I use Playwright at all?"
- Would a simple HTTP client suffice?
- Is there a browser extension that does this?
- Would manual operation be faster for a one-time task?
**After step back:** Document your justification in the macro's README.
## Technology Stack
### Primary: Patchright (Recommended)
```bash
pip install patchright
patchright install chromium
```
Patchright is an undetected fork of Playwright that patches detection vectors at the source level.
### Alternative: playwright-stealth
```bash
pip install playwright playwright-stealth
```
Use when Patchright isn't available or for simpler use cases.
## Macro Structure
Every macro lives in `./macros/<name>/` with this structure:
```
macros/
└── <macro-name>/
    ├── README.md          # Purpose, justification, usage
    ├── main.py            # Entry point
    ├── config.py          # Configuration (no secrets!)
    ├── steps/             # Individual step modules
    │   ├── __init__.py
    │   ├── step_01_login.py
    │   ├── step_02_navigate.py
    │   └── step_03_extract.py
    ├── screenshots/       # Auto-captured verification screenshots
    │   └── .gitkeep
    ├── requirements.txt   # Dependencies
    └── .env.example       # Template for secrets
```
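The layout above can be created with a short script. This is a minimal sketch, not part of the skill itself: `scaffold_macro` and `MACRO_FILES` are hypothetical names, and the step file names are simply the examples from the tree.

```python
from pathlib import Path

# Relative paths copied from the layout above. The step file names are the
# examples shown there, not a required naming scheme.
MACRO_FILES = [
    "README.md",
    "main.py",
    "config.py",
    "steps/__init__.py",
    "steps/step_01_login.py",
    "steps/step_02_navigate.py",
    "steps/step_03_extract.py",
    "screenshots/.gitkeep",
    "requirements.txt",
    ".env.example",
]


def scaffold_macro(name: str, root: Path = Path("macros")) -> Path:
    """Create the macro skeleton under root/name without overwriting files."""
    macro_dir = root / name
    for relative in MACRO_FILES:
        target = macro_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch(exist_ok=True)  # existing content is left untouched
    return macro_dir
```

Calling `scaffold_macro("my-macro")` from the repository root would produce empty placeholder files that you then fill in, starting with the README justification.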
## Do
1. **Always screenshot before acting**
```python
async def click_button(page, selector: str, step_name: str):
    await page.screenshot(path=f"screenshots/{step_name}_before.png")
    await page.click(selector)
    await page.wait_for_load_state("networkidle")
    await page.screenshot(path=f"screenshots/{step_name}_after.png")
```
2. **Use Patchright's stealth context**
```python
from patchright.async_api import async_playwright

async with async_playwright() as p:
    browser = await p.chromium.launch(headless=False)
    context = await browser.new_context(
        viewport={"width": 1920, "height": 1080},
        user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ...",
        locale="en-US",
        timezone_id="America/New_York",
    )
```
3. **Add human-like delays**
```python
import asyncio
import random


async def human_delay(min_ms=500, max_ms=2000):
    delay = random.randint(min_ms, max_ms) / 1000
    await asyncio.sleep(delay)


async def human_type(page, selector: str, text: str):
    await page.click(selector)
    for char in text:
        await page.keyboard.type(char)
        await asyncio.sleep(random.uniform(0.05, 0.15))
```
4. **Verify state before proceeding**
```python
async def wait_for_element(page, selector: str, timeout=10000):
    try:
        await page.wait_for_selector(selector, timeout=timeout)
        return True
    except Exception as exc:
        # Capture the page state, then re-raise with the original error chained
        await page.screenshot(path="screenshots/error_missing_element.png")
        raise RuntimeError(f"Element not found: {selector}") from exc
```
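5. **Use explicit waits for page state, not fixed sleeps**
```python
# Good: waits only as long as the page needs
await page.wait_for_selector("#result")
await page.wait_for_load_state("networkidle")
# Bad: a fixed sleep guesses at load time and hides failures
await asyncio.sleep(5)
```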
6. **Rotate fingerprints for repeated runs**
```python
import random

VIEWPORTS = [
    {"width": 1920, "height": 1080},
    {"width": 1366, "height": 768},
    {"width": 1536, "height": 864},
]
viewport = random.choice(VIEWPORTS)
```
7. **Store credentials in .env, never in code**
```python
from dotenv import load_dotenv
import os
load_dotenv()
USERNAME = os.getenv("MACRO_USERNAME")
PASSWORD = os.getenv("MACRO_PASSWORD")
```
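`os.getenv` returns `None` when a variable is missing, so a forgotten `.env` entry may only surface later as a confusing login failure. One way to fail early is a small wrapper; `require_env` is a hypothetical helper name, not part of `python-dotenv`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear error."""
    value = os.getenv(name)
    if not value:
        # Treat unset and empty values the same way: both are unusable
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

After `load_dotenv()`, `USERNAME = require_env("MACRO_USERNAME")` would then stop the macro before any browser is launched if the variable is absent.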
## Do Not
1. **Never hardcode credentials or secrets**
```python
# WRONG
password = "hunter2"
# RIGHT
password = os.getenv("MACRO_PASSWORD")
```
2. **Never skip screenshot verification**
```python
# WRONG
await page.click("#submit")
# RIGHT
await page.screenshot(path="screenshots/before_submit.png")
await page.click("#submit")
await page.screenshot(path="screenshots/after_submit.png")
```
3. **Never use default User-Agent**
```python
# WRONG - context created with the default User-Agent
context = await browser.new_context()
# RIGHT
context = await browser.new_context(
    user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36"
)
```
4. **Never ignore errors silently**
```python
import logging

# WRONG
try:
    await page.click("#button")
except:
    pass
# RIGHT
try:
    await page.click("#button")
except Exception as e:
    await page.screenshot(path="screenshots/error.png")
    logging.error(f"Click failed: {e}")
    raise
```
5. **Never run at machine speed**
```python
# WRONG - instant, bot-like
await page.fill("#search", "query")
await page.click("#submit")
# RIGHT - human-like
await human_type(page, "#search", "query")
await human_delay(300, 800)
await page.click("#submit")
```
6. **Never commit screenshots to git** (add to .gitignore)
7. **Never automate services without legitimate access**
## Stealth Checklist
Before running a macro, verify these evasion techniques:
- [ ] Using Patchright or playwright-stealth
- [ ] Custom User-Agent string (recent Chrome version)
- [ ] Realistic viewport dimensions
- [ ] Timezone matches expected locale
- [ ] WebGL vendor/renderer not exposed as headless
- [ ] navigator.webdriver = undefined
- [ ] Human-like typing delays (50-150ms per character)
- [ ] Random delays between actions (500-2000ms)
- [ ] Mouse movements before clicks (optional but recommended)
- [ ] Cookies/session persistence between runs if needed
## Template: Basic Macro
```python
#!/usr/bin/env python3
"""
Macro: [NAME]
Purpose: [DESCRIPTION]
Justification: [WHY AUTOMATION IS APPROPRIATE]
"""
import asyncio
import os
import random
from datetime import datetime
from pathlib import Path

from dotenv import load_dotenv
from patchright.async_api import async_playwright

load_dotenv()

SCREENSHOTS_DIR = Path(__file__).parent / "screenshots"
SCREENSHOTS_DIR.mkdir(exist_ok=True)


async def human_delay(min_ms=500, max_ms=2000):
    await asyncio.sleep(random.randint(min_ms, max_ms) / 1000)


async def screenshot(page, name: str):
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = SCREENSHOTS_DIR / f"{timestamp}_{name}.png"
    await page.screenshot(path=str(path))
    print(f"[Screenshot] {path}")


async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=False,  # Set True for production
        )
        context = await browser.new_context(
            viewport={"width": 1920, "height": 1080},
            user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
            locale="en-US",
            timezone_id="America/New_York",
        )
        page = await context.new_page()
        try:
            # Step 1: Navigate
            await page.goto("https://example.com")
            await page.wait_for_load_state("networkidle")
            await screenshot(page, "01_loaded")
            # Step 2: Your automation here
            await human_delay()
            # ...
            # Step 3: Verify success
            await screenshot(page, "99_complete")
            print("[OK] Macro completed successfully")
        except Exception as e:
            await screenshot(page, "error")
            print(f"[ERROR] {e}")
            raise
        finally:
            await browser.close()


if __name__ == "__main__":
    asyncio.run(main())
```
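The template keeps all steps inline, while the macro structure places them in `steps/` modules. One possible way to connect the two is a small runner that executes step functions in order and screenshots after each one. This is a sketch under assumptions: `Step` and `run_steps` are hypothetical names, and each step is assumed to be an async function that takes the page.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical type alias: a step is any async callable that takes the page.
Step = Callable[[Any], Awaitable[None]]


async def run_steps(
    page: Any,
    steps: list[tuple[str, Step]],
    screenshot: Callable[[Any, str], Awaitable[None]],
) -> list[str]:
    """Run named steps in order, taking a screenshot after each one.

    Returns the names of the steps that completed. If a step fails, an
    error screenshot named after that step is taken and the error re-raised.
    """
    completed: list[str] = []
    for name, step in steps:
        try:
            await step(page)
            await screenshot(page, f"{name}_after")
        except Exception:
            await screenshot(page, f"{name}_error")
            raise
        completed.append(name)
    return completed
```

Inside `main()`, this could be called as `await run_steps(page, [("01_login", login), ("02_navigate", navigate)], screenshot)`, where `login` and `navigate` stand in for functions imported from the `steps/` package and `screenshot` is the helper defined in the template.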
## Decision Points
Stop and ask yourself:
- **"The site shows a CAPTCHA"** → Do not attempt to bypass. Stop and notify the user.
- **"I need to handle 2FA"** → Design for manual intervention or use app-based TOTP with user consent.
- **"The element structure changed"** → Take screenshot, update selectors, verify with new screenshots.
- **"Rate limiting detected"** → Increase delays, reduce frequency, or reconsider if automation is appropriate.
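For the rate-limiting case, "increase delays" can be made concrete with capped exponential backoff. The sketch below is illustrative only: `backoff_delay` and its default values are assumptions, not limits taken from any particular site.

```python
def backoff_delay(attempt: int, base_s: float = 2.0, cap_s: float = 300.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    The wait doubles with each attempt and never exceeds cap_s.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    return min(cap_s, base_s * (2 ** attempt))
```

With the defaults, the waits grow as 2, 4, 8, 16 seconds and so on until they reach the cap. If the macro keeps hitting the cap, that is a signal to reduce frequency or reconsider the automation rather than to keep retrying.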
## Constraints
- **NEVER** attempt CAPTCHA solving or bypass
- **NEVER** automate financial transactions without explicit user confirmation per transaction
- **NEVER** scrape personal data without consent
- **NEVER** violate robots.txt for web scraping use cases
- **ALWAYS** include justification in macro README
- **ALWAYS** capture screenshots at every significant step
- **ALWAYS** use environment variables for credentials
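The robots.txt constraint can be checked in code before a macro navigates anywhere. The following sketch uses Python's standard `urllib.robotparser`; `is_allowed` is a hypothetical helper, and it takes the robots.txt text as a string so the check can be exercised without network access.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

For a live site, `RobotFileParser` can also download the file itself through `set_url()` followed by `read()`. A macro that gets `False` back for its target URL should stop rather than proceed.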
## Output Format
When creating a new macro, produce:
1. `README.md` with purpose and justification
2. `main.py` using the template above
3. `requirements.txt` with pinned versions
4. `.env.example` with required variables
5. Initial test run with screenshots demonstrating it works
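The "pinned versions" requirement for `requirements.txt` can be verified mechanically. This is a rough sketch with a hypothetical helper, `unpinned_requirements`; it only recognises plain `name==version` lines and would also flag lines that use environment markers or URLs.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==version".
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

Running it over the contents of a macro's `requirements.txt` should return an empty list before the macro is considered complete.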
## Resources
- [Patchright](https://github.com/Kaliiiiiiiiii-Vinyzu/patchright) - Undetected Playwright fork
- [playwright-stealth](https://pypi.org/project/playwright-stealth/) - Stealth plugin for standard Playwright
- [ZenRows Guide](https://www.zenrows.com/blog/playwright-stealth) - Avoiding bot detection