# Site and Blog Analysis: The Cohort Pivot

The analysis below covers every dimension you asked about. It quotes current copy, suggests replacements, describes new sections, and provides enough specificity to start editing files immediately.

---

## 1. Site Messaging Changes

### The Hero Must Expand Its Claim

The current hero headline in `/Users/jordanwashburn/Workspace/orchard9/tidalDB/site/src/app/page.tsx` (lines 24-29):

```tsx
"Ranking is not a feature. It is a primitive."
```

This still works. Ranking-as-primitive is the foundational insight and remains true with cohorts. But the subtitle underneath (lines 31-35) now sells the product too small:

```tsx
"Replace Elasticsearch + Redis + Kafka + feature store + vector DB + ranking service with a single process, a single query, and a single operational model."
```

The "replace 6 systems" pitch was the right entry point for individual user ranking. The cohort direction makes the ambition larger. tidalDB does not just answer "what should this user see?" It answers "what's happening among users who look like this?" The first is a recommendation engine. The second is audience intelligence.

**Recommended subtitle replacement:**

> One database for personalized ranking and audience intelligence. Know what's trending globally, within any cohort, and for any individual -- in a single query.

Or, more concise:

> The database that ranks content for individuals, cohorts, and populations. One process. One query. One model of the world.

The "replace 6 systems" line moves down to the Problem section where it already lives. It becomes supporting evidence, not the lead pitch.

### Does the Cohort Story Strengthen or Complicate the "Replace 6 Systems" Pitch?

It strengthens it. The original pitch had one vulnerability: a skeptical CTO might think "I can glue Elasticsearch and Redis together. It's ugly but it works." The cohort story removes that escape hatch.
No one has a clean solution for "show me what's trending among US females 18-24 who like jazz." That query currently requires a data warehouse join, a custom aggregation pipeline, and a separate trending computation -- none of which operate in real-time.

Cohorts make the argument harder to dismiss because cohort-scoped trending is something the 6-system stack genuinely cannot do well. It turns the pitch from "we make the same thing simpler" into "we make things possible that weren't before."

The risk of complication is real but manageable: the site must not feel like it's pitching two products. The narrative arc should be:

1. Ranking is a primitive (the thesis -- unchanged).
2. Existing systems can't do it (the problem -- unchanged).
3. tidalDB ranks for individuals, cohorts, and entire populations (the solution -- expanded).
4. Here's the query (the proof -- expanded to show all three layers).

### New Value Propositions Unlocked by Cohort-Based Trending

- **Audience intelligence as a query.** "What's trending among jazz fans in Brazil?" is not a data science project. It is a database query.
- **Three-layer trending.** Global, cohort, individual. Same engine, same query interface, same latency.
- **Cohorts as named predicates.** Not ad-hoc SQL WHERE clauses. Named, versioned, reusable audience definitions that live in schema alongside ranking profiles.
- **Real-time cohort signals.** Cohort trending updates as signals arrive. Not batch-computed overnight.
- **Search within trending.** "Jazz piano tutorials trending among beginners" -- scoped discovery.

### Talking About Scale Without Undermining Simplicity

The current messaging leans hard on "single-node-first, embeddable, runs in your process." The new scale ambitions create tension. Here is how to navigate it.

Do not lead with scale. Lead with the mental model. The pitch: tidalDB models the world correctly (signals, cohorts, ranking as primitives).
Correct modeling enables both embeddable single-node deployment AND horizontal scale. The architecture is distribution-aware from day one, but the first experience is `cargo add tidaldb`.

Suggested framing for the Vision page (`/Users/jordanwashburn/Workspace/orchard9/tidalDB/site/src/app/vision/page.tsx`, lines 116-118 -- the "Single-node first" principle):

**Current:**

> Single-node first. Embeddable. Runs in your process. Scales vertically before horizontally. Distribution is a later problem.

**Replacement:**

> Embeddable first. Runs in your process. The architecture is distribution-aware from day one -- sharding, replication, and multi-node cohort aggregation are built into the data model, not bolted on later. But your first experience is `cargo add tidaldb` and a query that returns in under 50ms.

---

## 2. New Content Needed

### A "Three Layers" Section on the Homepage

Insert after the current "One Query" section. This is the visual proof that tidalDB operates at every level. Three queries, three scopes, one database:

```
-- Global: what's trending everywhere
RETRIEVE items
USING PROFILE trending
LIMIT 25

-- Cohort: what's trending for this audience
RETRIEVE items
USING PROFILE trending
FOR COHORT us_gen_z_jazz
LIMIT 25

-- Individual: what should this person see
RETRIEVE items
FOR USER @user_id
USING PROFILE for_you
FILTER unseen, unblocked
DIVERSITY max_per_creator:2
LIMIT 50
```

### Showing the Three-Layer Model Visually

The three-layer model is the most compelling new concept. Show it as a narrowing scope, not three separate boxes. A terminal-aesthetic rendering:

```
GLOBAL        "AI music video" trending at 4.2x baseline
COHORT:jazz   "Modal jazz comeback" trending at 12.8x baseline
SEARCH:piano  "Jazz piano tutorial" trending at 8.1x in cohort
```

This shows something moderately trending globally can be massively trending within a cohort. That is the insight worth showing.
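
The multiples in the rendering above (4.2x, 12.8x) imply some ratio of recent activity to a longer-run baseline. This analysis does not say how tidalDB computes that number, so the following is only a minimal sketch under stated assumptions: `lift` and its parameters are hypothetical names, the metric is a plain ratio of events per hour in a recent window to events per hour in a baseline window, and the counts are illustrative rather than measured.

```rust
/// Hypothetical helper, not tidalDB API: ratio of the recent event rate
/// to the baseline event rate, both expressed in events per hour.
/// Returns None when either window is empty or the baseline has no events,
/// because the ratio is undefined in those cases.
fn lift(
    recent_events: u64,
    recent_hours: f64,
    baseline_events: u64,
    baseline_hours: f64,
) -> Option<f64> {
    if recent_hours <= 0.0 || baseline_hours <= 0.0 || baseline_events == 0 {
        return None;
    }
    let recent_rate = recent_events as f64 / recent_hours;
    let baseline_rate = baseline_events as f64 / baseline_hours;
    Some(recent_rate / baseline_rate)
}

fn main() {
    // Illustrative counts only: the same item measured at two scopes.
    // Global: 400 events in the last hour vs 2400 over the prior 24 hours.
    let global = lift(400, 1.0, 2400, 24.0);
    // Cohort: 60 events in the last hour vs 120 over the prior 24 hours.
    let cohort = lift(60, 1.0, 120, 24.0);

    println!("global lift: {:?}", global);
    println!("cohort lift: {:?}", cohort);
}
```

With these made-up counts the item is far smaller in absolute terms inside the cohort, yet its lift there is three times the global figure, which is the effect the rendering is meant to convey. A production metric would likely need smoothing for small cohorts; that is outside the scope of this sketch.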
### Cohorts as a Fifth Primitive

In the Primitives section (`page.tsx`, the `HowItWorks` function, lines 150-172), add Cohorts:

```typescript
{
  title: "Cohorts",
  description:
    "Named predicates over user attributes -- locale, demographics, interests, behavioral segments. Define a cohort once, query trending within it forever. Not filters applied after the fact. First-class scopes the database maintains.",
},
```

Update the Entities primitive:

```typescript
{
  title: "Entities",
  description:
    "Items, Users, Creators. Users carry demographics, behavioral segments, and interest taxonomies -- not just preference vectors. The database understands populations, not just individuals.",
},
```

### New Query Examples That Resonate

**Cohort-scoped trending:**

```
RETRIEVE items
USING PROFILE trending
FOR COHORT us_gen_z_jazz
FILTER format:video
LIMIT 25
```

**Audience intelligence:**

```
RETRIEVE items
USING PROFILE rising
FOR COHORT brazil_subscribers
LIMIT 10
```

**Search within cohort trending:**

```
SEARCH items
QUERY "piano tutorial"
USING PROFILE trending
FOR COHORT jazz_beginners
LIMIT 20
```

### Cohort Definition Code Block

A code example showing cohort declaration in schema:

```rust
db.define_cohort(CohortDef {
    name: "us_gen_z_jazz",
    predicate: Predicate::all(vec![
        Predicate::eq("region", "US"),
        Predicate::range("age", 18..25),
        Predicate::contains("interests", "jazz"),
    ]),
})?;
```

---

## 3. What to Remove or Tone Down

### "Single-Node First" as a Lead Message

On the Vision page (line 118), the statement "Distribution is a later problem" now conflicts with the scale ambitions. Replace with:

> Embeddable first. The architecture is distribution-aware from day one -- but your first deployment is a single binary.
### Claims That Now Feel Too Small

**Meta description** in `/Users/jordanwashburn/Workspace/orchard9/tidalDB/site/src/app/layout.tsx` (lines 22-23):

**Current:**

> "Replace Elasticsearch + Redis + Kafka + feature store + vector DB + ranking service with a single process, a single query, and a single operational model."

**Replacement:**

> "The database for personalized content ranking and audience intelligence. Trending globally, within any cohort, and for any individual -- in one query."

**Get Started section copy** (`page.tsx`, line 262):

**Current:**

> "tidalDB is open source, embeddable, and purpose-built for the personalized content ranking problem."

**Replacement:**

> "tidalDB is open source, embeddable, and purpose-built for personalized ranking and audience intelligence."

### Problem Section Stats

The current stats (lines 88-92):

```
6  -- Systems to operate
N  -- Seams where data drifts
0  -- Of them built for ranking
```

Consider updating the middle stat:

```
6  -- Systems to operate
0  -- That understand your audience
0  -- Built for ranking
```

This sets up the cohort pitch. No existing system in the 6-system stack has a concept of a user cohort as a first-class queryable entity.

---

## 4. Blog Post #1 Changes

### Current State

The existing post at `/Users/jordanwashburn/Workspace/orchard9/tidalDB/site/content/blog/why-tidaldb.mdx` is titled "Why we're building tidalDB." It tells the 6-system stack problem, the "ranking is a primitive" thesis, the core primitives, and the roadmap. It is well-written.

### What Needs to Change

The post needs a second act. The current version ends at "ranking is a primitive." The cohort pivot adds a second, larger insight: **trending is broken because it ignores audience structure.**

**New narrative arc:**

1. Every platform builds the same 6-system stack. (Problem -- keep)
2. Ranking is a primitive, not a feature. (Thesis -- keep)
3. But individual ranking is only half the problem. (Pivot -- **new**)
4. Trending is broken because it treats all users as one population. (Second problem -- **new**)
5. Cohorts as a database primitive. (Second thesis -- **new**)
6. Three layers: global, cohort, individual. (Solution -- **new**)
7. Here are the primitives (expanded). (Proof -- expand)
8. Here's what we're building. (What's next -- update)

### New Section to Insert: "The second observation"

After the current "The observation" section (line 17), add:

````markdown
## The second observation

Individual ranking is only half the problem. Every content platform also needs to answer: what's trending?

Not globally -- that's the easy version. What's trending *among users who look like this?*

A Gen Z jazz fan in the US and a 45-year-old classical listener in Germany are on the same platform. "Trending" means something completely different to each of them. But every existing system computes one global trending list, maybe bucketed by category, and calls it done.

The reality is richer. Trending has layers:

- **Global trending** -- what the whole platform is engaging with right now.
- **Cohort trending** -- what's gaining traction among a specific audience segment. US females 18-24 who listen to jazz. Brazilian subscribers who watch cooking content. Any named predicate over user attributes.
- **Search within cohort trending** -- find specific content within what's trending for an audience. "Jazz piano tutorials" that are trending among beginners.

No database supports this natively. Data teams build it with batch jobs, warehouse queries, and custom aggregation pipelines that run overnight. By the time the numbers arrive, the trends have moved.

tidalDB models cohorts as a first-class primitive. A cohort is a named predicate over user attributes -- locale, demographics, interests, behavioral segments. You define it once. The database maintains real-time trending signals scoped to that cohort. Querying it is one operation:

```
RETRIEVE items
USING PROFILE trending
FOR COHORT us_gen_z_jazz
LIMIT 25
```

Same engine that ranks for individuals. Same latency. Same signal system.
````

### Updates to "What tidalDB is" (line 29)

Add Cohorts to the primitives list:

```markdown
- **Cohorts** -- Named predicates over user attributes. Define an audience segment once, query trending within it forever. Real-time aggregation, not batch computation.
```

### Updates to "What we're building first" (line 60)

Replace the current roadmap list:

```markdown
1. **Signal engine** -- WAL, entity store, signal ledger with forward-decay scoring. Signals are the atomic unit of engagement data.
2. **Cohort engine** -- Named audience predicates over user attributes. Real-time signal aggregation scoped to any cohort. Three-layer trending.
3. **Query engine** -- RETRIEVE, SEARCH, and SUGGEST with filtering, ranking, and cohort scoping in a single query path.
4. **Vector and text search** -- HNSW via USearch, BM25 via Tantivy, hybrid fusion with RRF. Search within any trending scope.
```

### Updated Closing (line 77)

**Current:**

> If you're operating a 6-system stack for content ranking and wondering why it has to be this hard -- it doesn't. That's why we're building tidalDB.

**Replacement:**

> If you're operating a 6-system stack for content ranking, running nightly batch jobs to compute trending by audience segment, and wondering why you can't answer "what's trending among our jazz fans in Brazil?" in real time -- that's why we're building tidalDB.

### Updated Description (frontmatter, line 5)

**Current:**

> "Every content platform builds the same 6-system stack from scratch. We're replacing it with one database."

**Replacement:**

> "Every content platform builds the same 6-system stack. Trending ignores audience structure. We're building the database that fixes both."
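
The revised roadmap in "What we're building first" names "hybrid fusion with RRF" without spelling it out, and a blog reader may need a one-paragraph explanation. Reciprocal rank fusion scores each item by summing 1/(k + rank) over every ranked list the item appears in. The sketch below is an illustration for this analysis, not tidalDB code: `rrf_fuse`, the item IDs, and the choice of k = 60 (a commonly used default) are all assumptions.

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion over several ranked lists of item IDs.
/// Each list is ordered best-first and ranks are 1-based.
/// Hypothetical helper for illustration; not part of any tidalDB API.
fn rrf_fuse(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (idx, id) in list.iter().enumerate() {
            let rank = (idx + 1) as f64;
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by ID so the order is deterministic.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(&b.0)));
    fused
}

fn main() {
    // Made-up result lists standing in for a text ranking and a vector ranking.
    let text_hits = vec!["a", "b", "c"];
    let vector_hits = vec!["b", "d", "a"];

    let fused = rrf_fuse(&[text_hits, vector_hits], 60.0);
    for (id, score) in &fused {
        println!("{id}\t{score:.5}");
    }
}
```

Items that appear near the top of both lists ("b" and "a" here) outrank items that appear in only one, which is the property that makes the technique a reasonable fit for combining text and vector results.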
### Recommended Second Blog Post

Consider a standalone Post #2: **"Why trending is broken."** This is the cohort manifesto. It stands alone as a shareable artifact.

Narrative:

1. Global trending is a solved problem (and a boring one).
2. The interesting question is: trending for whom?
3. How TikTok, Spotify, and YouTube approximate cohort trending internally (batch jobs, ML pipelines, custom infrastructure with hundreds of engineers).
4. Why no database product offers this natively.
5. Cohorts as database primitives -- what the query looks like, how signals aggregate in real-time.
6. The three-layer model and why it matters for any content platform.

Title is a thesis statement: "Why trending is broken." CTOs forward this one to their teams.

---

## 5. Visual and Design Implications

### New Visualizations Needed

**1. Three-Layer Trending Visualization (homepage).** Terminal-aesthetic. Not a flowchart. Something that looks like data output showing narrowing scope and amplification:

```
GLOBAL        "AI music video" trending at 4.2x baseline
COHORT:jazz   "Modal jazz comeback" trending at 12.8x baseline
SEARCH:piano  "Jazz piano tutorial" trending at 8.1x in cohort
```

**2. Cohort Definition Code Block (homepage or vision page).** The Rust schema declaration showing a named cohort predicate. Proves cohorts are declared, not ad-hoc.

**3. Before/After Comparison for Cohort Trending:**

**Before (the 6-system way):**

```
1. Query warehouse for user segment membership
2. Batch-compute trending per segment (nightly)
3. Store results in Redis
4. Query Redis for trending in segment
5. Cross-reference with Elasticsearch for filtering
6. Apply ranking service for personalization
```

**After (tidalDB):**

```
RETRIEVE items
USING PROFILE trending
FOR COHORT us_gen_z_jazz
FILTER format:video
LIMIT 25
```

### Design System Implications

No changes needed. The dark-first editorial aesthetic supports the new content naturally.
The only new component is a potential "layered code block" showing three queries stacked with subtle labels between them -- buildable with the existing code block component and spacing.

---

## 6. Competitive Positioning

### Differentiation from Algolia, Typesense, Meilisearch

These are search-first products. They answer "what matches this query?" tidalDB answers "what should this user/audience see?"

| Capability | Algolia/Typesense/Meilisearch | tidalDB |
|---|---|---|
| Full-text search | Yes | Yes |
| Signal-based ranking | Manual relevance tuning | Native decay, velocity, windowed aggregation |
| Personalization | Rules-based or plugin | User preference vectors, feedback loops |
| Trending | Not a concept | Native, three-layer (global/cohort/individual) |
| Cohort intelligence | Not a concept | First-class primitive |
| Diversity enforcement | Not a concept | Query parameter |
| Feedback loop | Separate system | Built-in, atomic signal writes |

The cohort story widens the gap. Algolia can search. tidalDB can tell you what's trending among jazz fans in Brazil.

### Comparison to Spotify, TikTok, YouTube Internal Systems

These companies have built exactly what tidalDB is building -- as custom internal infrastructure:

- **Spotify** has Discover Weekly: cohort-based collaborative filtering requiring hundreds of engineers and a custom ML pipeline.
- **TikTok** has the For You Page: individualized ranking with population-level trending awareness, built on a custom real-time feature store.
- **YouTube** has trending per region and category -- a coarse version of cohort trending.

tidalDB's position: **the infrastructure these companies built internally, available as an embeddable database.**

Suggested site copy:

> Every platform with serious content ranking -- Spotify, TikTok, YouTube -- has built custom infrastructure for cohort-scoped trending and real-time signal aggregation. tidalDB puts that infrastructure in a database.

Use as conceptual comparison, not a claim of equivalence.

### A New Category

The existing categories (search engines, recommendation engines, feature stores, analytics databases) do not contain tidalDB. The new category is something like **audience-aware ranking database** or **content intelligence database**.

The site should not name the category explicitly. Describe the capability and let the reader realize there is no existing category for it. That realization is more powerful than a label.

---

## 7. Summary of Changes by File

### `/Users/jordanwashburn/Workspace/orchard9/tidalDB/site/src/app/page.tsx`

| Section | Change | Priority |
|---|---|---|
| Hero subtitle | Replace "Replace 6 systems" with population+cohort+individual framing | High |
| Problem section stats | Consider updating middle stat to "0 that understand your audience" | Medium |
| One Query section | Expand to show three queries (global, cohort, individual) | High |
| **New: Three Layers section** | Insert after One Query | High |
| Primitives section | Add Cohorts as 5th primitive. Update Entities description. | High |
| Feedback Loop section | Keep as-is | Low |
| Get Started section | Update description to include "audience intelligence" | Medium |

### `/Users/jordanwashburn/Workspace/orchard9/tidalDB/site/src/app/vision/page.tsx`

| Section | Change | Priority |
|---|---|---|
| Header subtitle | Expand to include cohort/audience language | Medium |
| Thesis section | Add second paragraph about cohort insight | Medium |
| What tidalDB models | Add Cohorts primitive. Expand User entity. | High |
| Design principles | Rewrite "Single-node first" principle | High |
| What tidalDB is not | Nuance the cloud-native/embeddable point re: distribution | Medium |

### `/Users/jordanwashburn/Workspace/orchard9/tidalDB/site/content/blog/why-tidaldb.mdx`

| Section | Change | Priority |
|---|---|---|
| Frontmatter description | Expand to include cohort angle | Medium |
| After "The observation" | Add "The second observation" section on cohort trending | High |
| "What tidalDB is" | Add Cohorts to primitives list | High |
| "What we're building first" | Add cohort engine to roadmap | Medium |
| Closing | Rewrite to include cohort use case in emotional hook | Medium |

### `/Users/jordanwashburn/Workspace/orchard9/tidalDB/site/src/app/layout.tsx`

| Field | Change | Priority |
|---|---|---|
| `title` meta | "tidalDB -- Ranking and audience intelligence for content platforms" | Medium |
| `description` meta | Include cohort/audience intelligence framing | Medium |

### New Content

| Asset | Priority |
|---|---|
| Blog Post #2: "Why trending is broken" | High |

---

## 8. What NOT to Change

- **The design system.** Black background, copper accent, serif headlines, gray body. It works.
- **The "ranking is a primitive" thesis.** Cohorts extend it. They do not replace it.
- **The tone.** Direct, engineering-first, no fluff.
- **The code block aesthetic.** Terminal-like, monospace, dark surface.
- **The blog infrastructure.** MDX, gray-matter, the card design. Ship more posts, not more infrastructure.
- **The Feedback Loop section.** Signal writes updating user state atomically is still the key write-path differentiator. Cohorts are primarily a read-path concept.

---

The cohort pivot does not break the existing story. It completes it. tidalDB was always about the question "what should this user see?" Cohorts expand that to "what's happening among users who look like this?" Same thesis, larger aperture.