# Capacity Planning

This document provides RAM, disk, and startup time estimates for tidalDB deployments. Use these tables to provision hardware before going to production.

All estimates assume a single-node deployment with default configuration (30-second checkpoint interval, f16 vector quantization, DashMap-based hot tier).

---

## RAM Capacity

tidalDB is an in-memory-first database. USearch HNSW indexes, the signal ledger hot tier, and Tantivy reader segments all reside in RAM during operation. There is no swap tolerance for USearch -- if the process is swapped, ANN query latency degrades from microseconds to seconds.

| Items | Embedding Dims | USearch RAM | Signal Ledger RAM (10 signals) | Tantivy RAM | Total Estimate |
|------:|---------------:|------------:|-------------------------------:|------------:|---------------:|
| 100K | 128D | ~26 MB | ~110 MB | ~50 MB | ~200 MB |
| 100K | 768D | ~154 MB | ~110 MB | ~50 MB | ~320 MB |
| 100K | 1536D | ~307 MB | ~110 MB | ~50 MB | ~470 MB |
| 1M | 128D | ~256 MB | ~1.1 GB | ~200 MB | ~1.6 GB |
| 1M | 768D | ~1.5 GB | ~1.1 GB | ~200 MB | ~2.8 GB |
| 1M | 1536D | ~3.1 GB | ~1.1 GB | ~200 MB | ~4.4 GB |
| 10M | 128D | ~2.6 GB | ~11 GB | ~500 MB | ~14 GB |
| 10M | 768D | ~15 GB | ~11 GB | ~500 MB | ~27 GB |
| 10M | 1536D | ~31 GB | ~11 GB | ~500 MB | ~43 GB |

### Formulas

**USearch HNSW index:**

```
items * dims * 2 bytes (f16 quantization) * 1.2 (HNSW graph overhead)
```

The 20% graph overhead accounts for HNSW neighbor lists (M=16 default, two layers). Actual overhead varies with M and ef_construction parameters. The USearch RAM column in the table above lists raw f16 vector storage (`items * dims * 2 bytes`); multiply it by 1.2 to include the graph overhead.

**Signal ledger hot tier:**

```
items * signal_count * ~1,088 bytes/entry
```

Each `(entity_id, signal_type_id)` entry in the DashMap holds the running decay score, windowed counters (BucketedCounter with minute and hour buckets), velocity state, and the DashMap per-shard overhead. The 1,088 bytes/entry figure was measured in the m7p3 scale benchmarks. Note that the Signal Ledger RAM column in the table above equals `items * ~1,088 bytes`, that is, about 1,088 bytes per entity across its 10 signal types; if your own measurements show ~1,088 bytes per individual entry, multiply that column by 10.
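
The two formulas above can be turned into a small estimator. The sketch below is illustrative only: the function names are hypothetical and not part of tidalDB. The graph overhead and per-entry cost are parameters so you can substitute values measured on your own workload. It applies the formulas exactly as written, using integer arithmetic and decimal bytes.

```rust
/// Bytes per vector component under f16 quantization.
const F16_BYTES: u64 = 2;

/// Estimated USearch HNSW RAM in bytes.
///
/// `overhead_percent` is the graph overhead added on top of the raw
/// vectors; 20 corresponds to the 1.2 factor in the formula above.
pub fn usearch_ram_bytes(items: u64, dims: u64, overhead_percent: u64) -> u64 {
    let raw = items * dims * F16_BYTES;
    raw + raw * overhead_percent / 100
}

/// Estimated signal ledger hot-tier RAM in bytes.
///
/// `bytes_per_entry` is the cost of one (entity_id, signal_type_id)
/// entry, ~1,088 in the formula above.
pub fn signal_ledger_ram_bytes(items: u64, signal_count: u64, bytes_per_entry: u64) -> u64 {
    items * signal_count * bytes_per_entry
}
```

For 1M items at 768D, `usearch_ram_bytes(1_000_000, 768, 20)` returns 1,843,200,000 bytes (~1.8 GB), while passing `0` for the overhead gives the raw vector size of 1,536,000,000 bytes (~1.5 GB).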

The signal ledger has a memory budget of 5M entries (`DEFAULT_MAX_SIGNAL_ENTRIES`). When exceeded, the checkpoint thread evicts cold entries (oldest `last_update` timestamp). If your workload has more than 5M active `(entity, signal_type)` pairs, cold entries will be served from fjall checkpoints (slower, but correct).

**Tantivy text index:**

Tantivy's RAM usage depends on the number of indexed documents, average document length, and the number of open reader segments. The estimates above assume short metadata fields (title + description, ~200 bytes average). Long-form content indexing will increase RAM proportionally.

### Notes

- Signal ledger RAM is for the in-memory hot tier only. The WAL and fjall checkpoints add disk usage, not RAM.
- The "10 signals" column assumes 10 distinct signal types per entity. Scale linearly for more signal types.
- USearch RAM is the dominant cost at high dimensionality. If you use 1536D embeddings (e.g., OpenAI text-embedding-3-large), plan for USearch to consume 70%+ of total RAM at 10M items.

---

## Disk Capacity

Disk usage comes from three sources: fjall LSM-tree storage (metadata, relationships, signal checkpoints), WAL segments (append-only signal event log), and Tantivy/USearch index files.

| Items | Metadata Size | Signal Events/Day | Disk/Day (WAL) | Fjall (90 days) | Total (90 days) |
|------:|:----------------|------------------:|----------------:|----------------:|----------------:|
| 100K | small (256B avg) | 50K | ~2 MB | ~1 GB | ~1.2 GB |
| 1M | small | 500K | ~20 MB | ~10 GB | ~11.8 GB |
| 10M | small | 5M | ~200 MB | ~100 GB | ~118 GB |

### Formulas

**WAL daily growth:**

```
signal_events_per_day * ~40 bytes/event
```

Each WAL entry contains: 4-byte magic, 8-byte sequence, 1-byte event type, 8-byte entity ID, 2-byte signal type ID, 8-byte timestamp, 8-byte weight (f64), 32-byte BLAKE3 checksum.
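
As a minimal sketch, the daily growth formula above translates directly into code. The same per-event figure, combined with the default 30-second checkpoint interval, gives a rough size for the uncompacted WAL tail between checkpoints. The function names are hypothetical and not part of tidalDB, and the 40-byte constant is the approximate figure from the formula, not a measured value.

```rust
/// Approximate bytes per WAL entry (the ~40 bytes/event figure above).
const WAL_BYTES_PER_EVENT: u64 = 40;

/// Estimated WAL bytes written per day for a given daily signal event count.
pub fn wal_bytes_per_day(signal_events_per_day: u64) -> u64 {
    signal_events_per_day * WAL_BYTES_PER_EVENT
}

/// Approximate uncompacted WAL bytes accumulated between two checkpoints,
/// assuming a steady write rate.
pub fn wal_tail_bytes(events_per_second: u64, checkpoint_interval_secs: u64) -> u64 {
    events_per_second * WAL_BYTES_PER_EVENT * checkpoint_interval_secs
}
```

For example, `wal_bytes_per_day(500_000)` returns 20,000,000 bytes, matching the ~20 MB/day row for 1M items, and `wal_tail_bytes(1_000, 30)` returns 1,200,000 bytes for a steady 1,000 events per second.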

WAL segments are compacted after each successful checkpoint (every 30 seconds), so WAL disk usage represents only the uncompacted tail, not cumulative growth.

**Fjall storage:**

```
items * metadata_avg_bytes * 1.5 (LSM write amplification)
```

The 1.5x amplification factor accounts for LSM-tree space amplification (multiple sorted runs before compaction merges them). Actual amplification depends on the compaction strategy and write pattern.

Signal checkpoints are also stored in fjall -- add ~100 bytes per active `(entity, signal_type)` pair for the serialized checkpoint data.

**Tantivy and USearch on disk:**

- Tantivy: roughly 1.5-2x the raw text size after indexing (inverted index + postings + term dictionary).
- USearch: saved index files are approximately the same size as the in-memory representation (items * dims * 2 bytes + graph metadata).

### WAL Compaction

WAL segments older than the last successful checkpoint are automatically deleted by the checkpoint thread (every 30 seconds). Under normal operation, WAL disk usage stays bounded at roughly `signal_rate * 40 bytes * 30 seconds`.

Monitor `tidaldb_wal_lag_bytes` -- if it grows unbounded, checkpointing may be failing (check `tidaldb_checkpoint_failures_total`).

---

## Startup Time

Startup involves: opening fjall keyspaces, restoring the signal ledger from checkpoint, replaying WAL events since the last checkpoint, rebuilding in-memory indexes (bitmap, range, universe, creator-items, collections, suggestions), and loading USearch vector indexes.

| Items | Vectors | Typical Startup |
|------:|--------:|:----------------|
| 100K | 100K | ~2-5 sec |
| 1M | 1M | ~15-45 sec |
| 10M | 10M | ~3-8 min |

### Dominant Costs

1. **USearch index load** is the dominant startup cost at 1M+ vectors. USearch rebuilds the HNSW graph from its serialized format. Progress is logged every 10K vectors.
2. **Signal ledger restore** reads the checkpoint from fjall (a single prefix scan of `Tag::Sig` keys), then replays any WAL events with sequence numbers higher than the checkpoint's `wal_sequence`. Time is proportional to the number of active signal entries + unreplayed WAL events.
3. **Entity state rebuild** scans the items and users keyspaces to reconstruct creator-items bitmaps, relationship indexes (follows, blocks, hides), and interaction weights. Progress is logged every 10K items.
4. **Suggestion index rebuild** scans all item metadata for "title" fields and indexes terms for autocomplete. This is a sequential scan -- fast for 100K items, noticeable at 10M.
5. **Collection index rebuild** reconstructs collection membership bitmaps from fjall.

### Notes

- Startup time is I/O-bound, not CPU-bound. Fast NVMe storage reduces startup time significantly compared to spinning disk.
- WAL replay time depends on how many signals were written since the last checkpoint (at most ~30 seconds of writes under normal operation).
- Tantivy indexes are opened directly from disk (memory-mapped) and do not require a rebuild step.

---

## Recommended Provisioning

**General rule:** provision 2x the estimated RAM for headroom.

| Scale | Recommended RAM | Recommended Disk | CPU Cores |
|:---------|:----------------|:-----------------|:----------|
| 100K items, 128D | 512 MB | 5 GB SSD | 2 |
| 100K items, 768D | 1 GB | 5 GB SSD | 2 |
| 1M items, 128D | 4 GB | 25 GB SSD | 4 |
| 1M items, 768D | 8 GB | 25 GB SSD | 4 |
| 10M items, 128D | 32 GB | 250 GB NVMe | 8 |
| 10M items, 768D | 64 GB | 250 GB NVMe | 8 |
| 10M items, 1536D | 96 GB | 250 GB NVMe | 16 |

### Why 2x headroom?

- Signal ledger entries grow as new `(entity, signal_type)` pairs are written. The hot tier can hold up to 5M entries before trimming kicks in.
- Tantivy segment merges temporarily double the index size during merge operations.
- USearch does not support incremental resize -- if you approach capacity, you need enough free RAM to hold both the old and new index during a potential rebuild.
- The Rust allocator (jemalloc or system) has its own fragmentation overhead.

### Swap

Do not configure swap for production tidalDB instances. USearch HNSW traversal accesses memory in a random-access pattern that defeats page-level caching. A single swapped page in the HNSW graph can turn a 50-microsecond ANN query into a 50-millisecond disk seek.

### Disk Type

SSD is strongly recommended for all deployments. NVMe is recommended at 10M+ items. The WAL uses synchronous `fsync` on every segment rotation, and fjall's journal uses `persist(SyncAll)` during checkpoint. Spinning disk latency on these operations directly impacts signal write throughput.
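
The 2x headroom rule from Recommended Provisioning can be sketched as a small helper that doubles a RAM estimate and rounds it up to the next available machine size. The size ladder below is an assumption chosen for illustration, and the function name is hypothetical; substitute the sizes your hosting environment actually offers.

```rust
/// Hypothetical ladder of available RAM sizes in MB (decimal units).
/// This list is an assumption; adjust it to your provider's offerings.
const RAM_SIZES_MB: [u64; 10] = [
    512, 1_000, 2_000, 4_000, 8_000, 16_000, 32_000, 64_000, 96_000, 128_000,
];

/// Headroom multiplier from the general rule (2x the estimated RAM).
const HEADROOM_FACTOR: u64 = 2;

/// Returns the smallest size on the ladder that is at least 2x the
/// estimate, or `None` if the estimate exceeds what the ladder offers.
pub fn recommended_ram_mb(estimated_mb: u64) -> Option<u64> {
    let target = estimated_mb * HEADROOM_FACTOR;
    RAM_SIZES_MB.iter().copied().find(|&size| size >= target)
}
```

With this ladder, `recommended_ram_mb(2_800)` returns `Some(8_000)`, which lines up with the 8 GB recommendation for 1M items at 768D, and `recommended_ram_mb(43_000)` returns `Some(96_000)` for 10M items at 1536D.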