stemedb/crates/stemedb-core/src/limits.rs
jordan 55349845d0 refactor: Split all files to enforce 500-line max
Break monolith source files into focused modules:
- stemedb-core/types.rs → types/ directory (assertion, source, gold_standard, etc.)
- stemedb-storage: audit_store, quota_store, trust_rank_store, vector_index, vote_store → module directories
- stemedb-ingest/worker.rs → worker/ with separate test modules
- stemedb-query: engine, materializer, query → module directories
- stemedb-lens: epoch_aware, skeptic → module directories
- stemedb-sim/lib.rs → agent, arenas/, helpers, runner, strategy, types
- stemedb-api/tests: integration_tests → http_basic, http_validation, http_epoch, http_pipeline
- stemedb-api/tests: e2e_flow_test → e2e_full_pipeline, e2e_lens_resolution
- stemedb-query/tests: e2e_pipeline → e2e_pipeline + e2e_decay

Also adds new features: gold standard verification, escalation handlers,
admin endpoints, concept hierarchy spec, arena roadmap, and Go SDK.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 01:13:45 -07:00

69 lines
2.5 KiB
Rust

//! Shared configuration constants and limits for StemeDB.
//!
//! This module centralizes validation limits and default values used across
//! the codebase to prevent duplication and ensure consistency.
/// Maximum allowed subject (entity ID) length in bytes.
///
/// Subjects should be concise identifiers, not full-text descriptions.
/// This limit prevents unbounded memory growth and ensures reasonable
/// index performance.
///
/// # Example
/// - Valid: `"Tesla_Inc"` (9 bytes)
/// - Valid: `"COMPOUND_CID_12345"` (18 bytes)
/// - Invalid: A 2KB entity description embedded in the subject field
pub const MAX_SUBJECT_LEN: usize = 1024;
/// Maximum allowed predicate (relation ID) length in bytes.
///
/// Predicates should be short relation names, not sentences.
/// This limit ensures efficient indexing on the predicate dimension.
///
/// # Example
/// - Valid: `"has_revenue"` (11 bytes)
/// - Valid: `"treats_condition"` (16 bytes)
/// - Invalid: A 512-byte natural language description of the relationship
pub const MAX_PREDICATE_LEN: usize = 256;
/// Maximum allowed object (value) length in bytes.
///
/// Objects can contain numbers, strings, or structured data (JSON).
/// This limit prevents unbounded memory growth while allowing reasonable
/// values like long text descriptions or JSON payloads.
///
/// # Example
/// - Valid: `"96.7"` (4 bytes)
/// - Valid: `"oblate_spheroid"` (16 bytes)
/// - Valid: JSON with nested structure (up to 4096 bytes)
/// - Invalid: A 10KB embedded document in the object field
pub const MAX_OBJECT_LEN: usize = 4096;
/// Maximum source document size in bytes (10 MB).
///
/// Source documents are stored as opaque byte blobs indexed by content hash.
/// This limit prevents storage exhaustion from maliciously large uploads
/// while supporting typical PDFs, images, and text documents.
///
/// # Rationale
/// - Average research paper PDF: 1-3 MB
/// - High-resolution screenshot: 2-5 MB
/// - 10 MB allows reasonable documents while preventing abuse
///
/// # Note
/// For larger source materials, consider storing a pointer (URL, IPFS CID)
/// in the source metadata instead of the raw bytes.
pub const MAX_SOURCE_SIZE: usize = 10 * 1024 * 1024;
/// Default limit for paginated query results.
///
/// Applied when no explicit limit is provided in the query parameters.
/// Prevents accidentally fetching unbounded result sets.
///
/// # Example
/// ```ignore
/// GET /v1/query?subject=Tesla_Inc&limit=50 // explicit limit
/// GET /v1/query?subject=Tesla_Inc // uses DEFAULT_QUERY_LIMIT
/// ```
pub const DEFAULT_QUERY_LIMIT: usize = 100;