# Forage — Build Plan

## What We Are Proving

Each phase proves something specific. Do not build phase N+1 until phase N has proven its thesis.

| Phase | Proves | Delivers |
|-------|--------|----------|
| **P0** | The loop closes — signal in, re-rank out, observable in real time | Local server + seed data + Claude observes interactions |
| **P1** | The Chrome extension can drive the entire signal surface from real web pages | Extension posts signals automatically from browsing behavior |
| **P2** | Semantic search works over content Forage finds on the real web | Embedding service + real web crawl |
| **P3** | The MAB sharpens — exploration items hit more often over time | Adaptive exploration budget, centroid tracking, exploration-hit instrumentation |
| **P4** | The surprise moment — cross-centroid discoveries emerge naturally | Multi-session preference evolution, intersection surfacing |

---

## Phase 0 — Close the Loop (MVP Demo)

**Goal:** A running demo where a user interacts with a local feed page, signals are posted by the page itself, and Claude's Chrome extension observes visible ranking shifts. No real web crawl. No real embeddings. Proves the feedback loop.

**What we build:**

### `forage-engine` (library crate)

The reusable core. Wraps tidalDB with the foraging-specific schema, seed corpus, multi-armed bandit (MAB) layer, and public API. This is what transfers to other applications.

```rust
// Public API sketch: signatures only, bodies omitted.
pub struct ForageEngine { db: TidalDb }

impl ForageEngine {
    // Constructors: throwaway in-memory engine vs. data persisted under `data_dir`.
    pub fn ephemeral() -> Result<Self>
    pub fn persistent(data_dir: &Path) -> Result<Self>
    pub fn seed_default_corpus(&self) -> Result<()>

    // Signal writes.
    pub fn signal(&self, user: u64, item: u64, kind: SignalKind) -> Result<()>
    pub fn signal_dwell(&self, user: u64, item: u64, duration_ms: u64) -> Result<()>

    // Reads.
    pub fn feed(&self, user: u64, limit: usize) -> Result<Vec<ForageItem>>
    pub fn all_items(&self) -> &[SeedItem]

    pub fn add_item(&self, item: ForageItemInput) -> Result<u64>  // P1
}
```

### `forage-server` (Axum binary)

A thin HTTP wrapper over `forage-engine`. Serves:

```
POST /signal   { user_id, item_id, signal_type, duration_ms? }
GET  /feed     ?user=X&limit=7
GET  /items    (all items, for page render)
GET  /         (serves the feed HTML page)
```

Runtime mode:
- Persistent by default (`~/.forage/data`)
- Optional `--ephemeral` mode for throwaway demo runs

Schema on startup:
- 100 seed items across 8 categories (tech, music, jazz, cooking, fitness, travel, science, literature)
- Each item: title, url (placeholder), category, source, reading_time, description
- Seeded RNG — same items every run, deterministic
- 3 pre-built users: `cold` (no signals), `explorer` (light signals), `convergent` (heavy signals in 2 categories)

Signal types registered:
- `view` — half-life 7d, AllTime + 24h windows
- `dwell` — half-life 3d (reading time is stronger signal than click)
- `save` — half-life 30d (strong intent)
- `skip` — half-life 1d (mild negative, decays fast)
- `share` — half-life 14d (strongest positive)

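
To make the half-lives concrete, here is a minimal sketch of how a signal's weight could decay with age. It assumes plain exponential decay (the weight halves once per half-life); the plan does not say which decay function tidalDB applies. The `SignalKind` variants, `half_life_days`, and `decayed_weight` are illustrative names, not the engine's confirmed API.

```rust
use std::str::FromStr;

/// Hypothetical mirror of the five registered signal types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SignalKind {
    View,
    Dwell,
    Save,
    Skip,
    Share,
}

impl SignalKind {
    /// Half-life in days, copied from the signal type list.
    pub fn half_life_days(self) -> f64 {
        match self {
            SignalKind::View => 7.0,
            SignalKind::Dwell => 3.0,
            SignalKind::Save => 30.0,
            SignalKind::Skip => 1.0,
            SignalKind::Share => 14.0,
        }
    }
}

impl FromStr for SignalKind {
    type Err = String;

    /// Parses the `signal_type` string carried by `POST /signal`.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "view" => Ok(SignalKind::View),
            "dwell" => Ok(SignalKind::Dwell),
            "save" => Ok(SignalKind::Save),
            "skip" => Ok(SignalKind::Skip),
            "share" => Ok(SignalKind::Share),
            other => Err(format!("unknown signal type: {other}")),
        }
    }
}

/// Assumed decay model: the weight halves once per half-life.
pub fn decayed_weight(kind: SignalKind, age_days: f64) -> f64 {
    0.5_f64.powf(age_days / kind.half_life_days())
}
```

Under this model a `skip` from two days ago carries a quarter of its original weight, while a `save` of the same age keeps roughly 95%.
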
Ranking profiles:
- `forage_default` — personalized, 14% exploration (~1 in 7), max_per_category:2
- `forage_explore` — heavy exploration, weighted toward underexplored categories
- `forage_converge` — pure exploitation, no exploration

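
The `max_per_category:2` cap can be pictured as a single pass over score-ordered candidates. This is a sketch under assumptions: the plan does not describe how tidalDB enforces the cap, and `Candidate` and `cap_per_category` are hypothetical names.

```rust
use std::collections::HashMap;

/// Hypothetical candidate: an item id plus its category.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub id: u64,
    pub category: String,
}

/// Walks candidates in ranked order and keeps at most `max_per_category`
/// items from each category, stopping once `limit` items are kept.
pub fn cap_per_category(
    ranked: &[Candidate],
    max_per_category: usize,
    limit: usize,
) -> Vec<Candidate> {
    let mut seen: HashMap<&str, usize> = HashMap::new();
    let mut kept = Vec::new();
    for c in ranked {
        if kept.len() == limit {
            break;
        }
        let count = seen.entry(c.category.as_str()).or_insert(0);
        if *count < max_per_category {
            *count += 1;
            kept.push(c.clone());
        }
    }
    kept
}
```

With a cap of 2, a full 7-item feed has to draw from at least four categories, which comfortably covers the "≥3 categories" acceptance criterion.
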
MAB layer (thin wrapper over tidalDB query):
```
candidate_pool = RETRIEVE items FOR USER @u USING PROFILE forage_default LIMIT 20
exploit = candidate_pool[0..6]                                    // top 6 by score
explore = candidate_pool filtered by (category_signal_count < 5)  // pick 1 from underexplored
final = interleave(exploit, explore, ratio=0.14)                  // ~1 in 7
```

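
The MAB pseudocode can be sketched in Rust roughly as follows. It is an illustration under assumptions: `Scored`, `compose_feed`, and the `category_signal_count` closure are hypothetical, the pool is assumed to arrive sorted by descending score, and the exploration item is simply placed last because the plan's `interleave` does not specify a position.

```rust
/// Hypothetical scored candidate, as the profile query might return it.
#[derive(Debug, Clone, PartialEq)]
pub struct Scored {
    pub id: u64,
    pub category: String,
    pub score: f64,
}

/// Builds a 7-item feed from a pool already sorted by descending score:
/// the top 6 items plus 1 item from an underexplored category.
pub fn compose_feed(
    pool: &[Scored],
    category_signal_count: impl Fn(&str) -> u32,
) -> Vec<(Scored, &'static str)> {
    // Exploit: top 6 by score. Labelling all of them "match" is a simplification.
    let mut feed: Vec<(Scored, &'static str)> =
        pool.iter().take(6).map(|c| (c.clone(), "match")).collect();

    // Explore: first remaining item whose category has fewer than 5 signals.
    let explore = pool
        .iter()
        .skip(6)
        .find(|c| category_signal_count(c.category.as_str()) < 5);

    match explore {
        Some(c) => feed.push((c.clone(), "exploring")),
        // Nothing underexplored in the pool: fall back to the 7th item by score.
        None => feed.extend(pool.get(6).map(|c| (c.clone(), "match"))),
    }
    feed
}
```

One exploration slot in a feed of seven gives the ~14% exploration share named in `forage_default`.
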
Item labels returned in feed response:
- `"match"` — near a known centroid
- `"exploring"` — from exploration budget
- `"trending"` — high velocity regardless of personalization
- `"resurfaced"` — user had prior engagement, decayed, being re-checked

### Feed Page (`/`)

Static HTML + minimal JS. No framework.

- Grid of 7 item cards
- Each card: title, source, category chip, reading time, description, label badge
- Click card → `POST /signal {signal_type: "view"}`, open URL in new tab
- Hover for >3s → `POST /signal {signal_type: "dwell", duration_ms: N}`
- "Skip" button on card → `POST /signal {signal_type: "skip"}`
- "Save" button → `POST /signal {signal_type: "save"}`
- Auto-refresh feed every 5s (or on any signal write)
- Visual: ranking shift animation when feed re-orders

### What Claude Does in P0

Claude uses the Chrome extension **lightly** — as an observer, not a puppeteer. The feed page handles signal posting itself via JS `fetch()`. Claude's role:

1. `navigate` to `localhost:4242` — one call
2. `read_page` to snapshot the initial feed state — one call
3. Wait while a human (or scripted JS) interacts with the feed for 10+ interactions
4. `read_page` again to snapshot the final feed state — one call
5. Report: what categories dominated before vs. after, which exploration items appeared, how the labels shifted

Three MCP tool calls per session. That is the ceiling. The interesting loop — signal → re-rank → new feed — runs entirely in the browser without Claude's involvement. Claude observes the outcome; it does not produce it.

This is the demo. This is the proof-of-concept that makes the thesis visible.

**Deliverables:**
- `applications/forage/engine/` — `ForageEngine` library crate (tidalDB + MAB + schema)
- `applications/forage/server/` — thin Axum binary wrapping the engine
- `applications/forage/server/static/index.html` — feed page (plain HTML/JS, signals via fetch())
- CORS headers on the server so signal posts are not blocked by the browser when the posting page is not served from the server's own origin

---

## Phase 1 — Real Signal Surface

**Goal:** The Chrome extension captures signals from real browsing behavior, not just the demo feed page.

**What changes:**

Claude uses `javascript_tool` to inject a lightweight signal collector on pages it navigates to:
```js
// injected on each visited page via javascript_tool
const title = document.title;
const url = location.href;
// Estimated reading time in minutes, assuming ~200 words per minute
const readingTime = Math.round(document.body.innerText.split(/\s+/).length / 200);
// POST to forage-server: add item if unknown, write "view" signal
fetch('http://localhost:4242/signal', { method: 'POST', ... });
// After 30s dwell, fire "dwell" signal
setTimeout(() => fetch(...), 30_000);
```

`ForageEngine` gains an `add_item` method — engine API extends to:
```rust
pub fn add_item(&self, item: ForageItemInput) -> Result<u64>  // returns item_id
```

The feed page now shows a mix of:
- Seed items (known corpus)
- Items the user actually visited (added via `add_item`)

**No publishable Chrome extension is built.** Claude is the browsing agent. The signal injection is Claude executing JS on pages it visits.

**Proves:** tidalDB can serve as a memory layer for real browsing behavior, not just a demo corpus.

---

## Phase 2 — Real Embeddings

**Goal:** Semantic search and similarity-based recommendations over content Forage actually finds.

**What changes:**

A thin embedding sidecar (separate process, any language):
```
POST /embed  { text: string } → { vector: f32[1536] }
```

Default: OpenAI `text-embedding-3-small`. Swappable. Forage calls this when writing new items.

With real embeddings:
- `SearchBuilder::semantic("jazz theory")` works for real
- `SearchBuilder::similar_to(item_id)` produces genuine similarity
- Preference vectors actually mean something — they are in embedding space

The feed profile adds:
- `semantic_boost: 0.3` — items semantically near preference centroid score higher
- `similar_to_saved: true` — items near saved items get boosted

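
As a rough illustration of how `semantic_boost: 0.3` might enter a score, the sketch below adds the boost times the cosine similarity between an item's embedding and the preference centroid. Both the additive combination and the choice of cosine similarity are assumptions, since the plan does not specify how tidalDB applies the boost; `cosine` and `boosted_score` are hypothetical helpers.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 when either vector has zero norm.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Assumed combination: base score plus `semantic_boost` times the
/// similarity between the item and the user's preference centroid.
pub fn boosted_score(base: f32, item: &[f32], centroid: &[f32], semantic_boost: f32) -> f32 {
    base + semantic_boost * cosine(item, centroid)
}
```

In practice the vectors would be the 1536-dimensional embeddings returned by the sidecar; the function is dimension-agnostic, so short vectors are enough to try it out.
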
**Proves:** The preference vector is not just a signal frequency map — it is a semantic model of what the user cares about, queryable by meaning.

---

## Phase 3 — Adaptive MAB

**Goal:** The exploration budget adapts per-user based on their exploration-hit history.

**What changes:**

Track per-user: `exploration_hits` / `exploration_total` → hit rate.

```
if hit_rate > 0.5:       exploration_ratio = 0.25   (adventurous user)
else if hit_rate < 0.2:  exploration_ratio = 0.10   (convergent user)
else:                    exploration_ratio = 0.14   (default)
```

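
The thresholds above translate directly into a small function. This is a sketch, not the engine's code: `ExplorationStats` and `exploration_ratio` are hypothetical names, and falling back to the default ratio when a user has no exploration history yet is an assumption the plan does not state.

```rust
/// Per-user exploration outcomes (field names taken from the plan).
pub struct ExplorationStats {
    pub exploration_hits: u32,
    pub exploration_total: u32,
}

/// Maps a user's exploration hit rate to an exploration ratio.
pub fn exploration_ratio(stats: &ExplorationStats) -> f64 {
    // Assumption: with no exploration history, stay on the default ratio.
    if stats.exploration_total == 0 {
        return 0.14;
    }
    let hit_rate = f64::from(stats.exploration_hits) / f64::from(stats.exploration_total);
    if hit_rate > 0.5 {
        0.25 // adventurous user
    } else if hit_rate < 0.2 {
        0.10 // convergent user
    } else {
        0.14 // default
    }
}
```

Both comparisons are strict, so a hit rate of exactly 0.5 or 0.2 stays on the default ratio.
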
UCB1 bonus on underexplored categories:
```
ucb_bonus = sqrt(2 * ln(total_signals) / category_signal_count)
```

Categories with few signals get a score boost, naturally surfacing exploration candidates higher.

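
A direct transcription of the UCB1 bonus needs a decision for categories with zero signals, where the formula divides by zero. The sketch below treats such a category as maximally attractive by returning infinity, which is the usual "try every arm once" convention but is an assumption here; `ucb_bonus` as a free function is also hypothetical.

```rust
/// UCB1 exploration bonus for one category.
pub fn ucb_bonus(total_signals: u64, category_signal_count: u64) -> f64 {
    // Assumption: a category with no signals yet gets an unbounded bonus,
    // so it is surfaced at least once.
    if category_signal_count == 0 {
        return f64::INFINITY;
    }
    // Guard against ln(0) if the totals are inconsistent.
    let total = total_signals.max(1) as f64;
    (2.0 * total.ln() / category_signal_count as f64).sqrt()
}
```

For a user with 100 signals overall, a category seen twice gets a bonus of about 2.15, while one seen 50 times gets about 0.43. An implementation that adds the bonus to a bounded score would probably cap the infinite case at some large finite value.
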
Instrumentation persists per-user exploration outcomes (`exploration_hits`, `exploration_total`) and feeds adaptation logic.

**Proves:** The MAB is not static noise — it learns the user's exploration tolerance and adjusts. Power users feel the system getting bolder with them.

---

## Phase 4 — The Surprise Moment

**Goal:** Cross-centroid discoveries emerge. Users find things at the intersection of two interests they did not know were related.

**What changes:**

Centroid intersection query:
```
centroids = top_2_active_centroids(user)
midpoint = (centroid_a.vector + centroid_b.vector) / 2
intersection_candidates = ANN(midpoint, limit=5)
inject 1 intersection candidate into every 7-item feed
label it: "bridge: {category_a} × {category_b}"
```

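
The midpoint step of the intersection query is easy to sketch. The example below computes the element-wise midpoint of two centroid vectors and then runs an exact nearest-neighbour scan as a stand-in for the ANN index. Squared Euclidean distance, the `(id, vector)` item layout, and the names `midpoint` and `nearest` are all assumptions for illustration; the real lookup would go through tidalDB.

```rust
/// Element-wise midpoint of two centroid vectors.
pub fn midpoint(a: &[f32], b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "centroids must have the same dimension");
    a.iter().zip(b).map(|(x, y)| (x + y) / 2.0).collect()
}

/// Squared Euclidean distance (assumed metric).
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact scan standing in for the ANN index: returns the ids of the
/// `limit` items closest to `query`, nearest first.
pub fn nearest(query: &[f32], items: &[(u64, Vec<f32>)], limit: usize) -> Vec<u64> {
    let mut scored: Vec<(u64, f32)> = items
        .iter()
        .map(|(id, v)| (*id, squared_distance(query, v)))
        .collect();
    scored.sort_by(|l, r| l.1.total_cmp(&r.1));
    scored.into_iter().take(limit).map(|(id, _)| id).collect()
}
```

An item that sits between the two centroids ranks ahead of items that are close to only one of them, which is the behaviour the "bridge" label is meant to surface.
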
Over time, if intersection items hit consistently, they form a new centroid — the user's interests have genuinely merged into a new territory.

**Proves:** This is not a feature that could be added to a recommendation system after the fact. It is the natural consequence of having a semantic preference model that updates with the feedback loop. The "surprise moment" is an emergent property of the system working correctly.

---

## What We Are Not Building

- A Chrome extension we publish to the Chrome Web Store (P0 is Claude-driven, not user-installed)
- A mobile app
- Multi-user / server-hosted version (single user, local process)
- Content moderation, NSFW filtering, language filtering
- Payment, accounts, authentication
- A scraper that violates robots.txt or rate limits

---

## Phase 0 Acceptance Criteria

The P0 demo is complete when:

1. `cargo run -p forage-server --manifest-path applications/forage/server/Cargo.toml` starts a server at `localhost:4242`
2. The feed page loads with 7 items across ≥3 categories
3. A user generates 10+ signals from the feed page (mix of views, skips, saves) while Claude observes before/after state
4. After 10 signals, the feed has visibly shifted toward the signaled categories
5. At least 1 item in the feed is labeled `exploring` (from the exploration budget)
6. The signal-to-re-rank latency is < 200ms (measured by feed refresh after `POST /signal`)
7. A second user (`?user=2`) with no signals gets a different, more exploratory feed than a user with 20+ signals

If these 7 criteria are met, the loop is closed and the thesis is proven.