# Task 03: Degradation Level in Response + Backpressure Error

## Delivers

`degradation_level` field on `Results` and `SearchResults` so callers can observe the current load state. `TidalError::Backpressure` variant for WAL queue saturation. Backpressure check in the signal write path before enqueueing to the WAL channel.

## Complexity: M

## Dependencies

- task-01 (`DegradationLevel` enum)
- m2p5 `Results` struct (`tidal/src/query/retrieve/types.rs`)
- m5p3 `SearchResults` struct (`tidal/src/query/search/types.rs`)
- m1p4 WAL handle (`tidal/src/wal/mod.rs`, `DEFAULT_CHANNEL_CAPACITY`)
- `TidalError` (`tidal/src/schema/error.rs`)

## Technical Design

### 1. Add degradation_level to Results

```rust
// In tidal/src/query/retrieve/types.rs, add to Results:

/// The response from executing a RETRIEVE query.
pub struct Results {
    pub items: Vec<RetrieveResult>,
    pub next_cursor: Option<Cursor>,
    pub total_candidates: usize,
    pub constraints_satisfied: bool,
    pub warnings: Vec<String>,
    pub session_snapshot: Option<SessionSnapshot>,
    /// The degradation level under which this query was executed.
    ///
    /// `Full` means the query ran at full fidelity. Any other level
    /// indicates that quality was reduced due to load pressure. Callers
    /// should treat non-`Full` responses as best-effort: results are
    /// valid but may be less diverse, use coarser aggregation windows,
    /// or draw from a smaller candidate pool.
    pub degradation_level: crate::load::DegradationLevel,
}
```

Update ALL construction sites of `Results` to include the new field. They fall into two groups:

1. `RetrieveExecutor::execute()` -- sets `degradation_level: self.degradation_level`
2. Any test helpers that construct `Results` directly -- set `degradation_level: DegradationLevel::Full`

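To illustrate how a caller might consume the new field, here is a minimal sketch. It uses trimmed stand-ins for `DegradationLevel` and `Results` (only the `Full` and `ReducedCandidates` variants named in this task are modelled; the real enum from task-01 may have more), and the helpers `is_cacheable` and `quality_label` are hypothetical, not part of the planned API.

```rust
// Stand-in for the real `DegradationLevel` from task-01. Only the two
// variants mentioned in this task are modelled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DegradationLevel {
    Full,
    ReducedCandidates,
}

// Trimmed stand-in for `Results`; the real struct has more fields.
pub struct Results {
    pub items: Vec<u64>,
    pub degradation_level: DegradationLevel,
}

/// Hypothetical caller policy: only cache responses produced at full fidelity,
/// since degraded responses are best-effort.
pub fn is_cacheable(results: &Results) -> bool {
    results.degradation_level == DegradationLevel::Full
}

/// Hypothetical helper that maps the level to a label for logs or metrics.
pub fn quality_label(level: DegradationLevel) -> &'static str {
    match level {
        DegradationLevel::Full => "full",
        DegradationLevel::ReducedCandidates => "reduced-candidates",
    }
}
```

Matching exhaustively on the enum (rather than comparing against `Full` everywhere) means the compiler flags every call site when a new degradation level is added.
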
### 2. Add degradation_level to SearchResults

```rust
// In tidal/src/query/search/types.rs, add to SearchResults:

pub struct SearchResults {
    pub items: Vec<SearchResultItem>,
    pub next_cursor: Option<Cursor>,
    pub total_candidates: usize,
    pub constraints_satisfied: bool,
    pub warnings: Vec<String>,
    pub session_snapshot: Option<SessionSnapshot>,
    /// The degradation level under which this search was executed.
    pub degradation_level: crate::load::DegradationLevel,
}
```

Update the construction site in `SearchExecutor::execute()`.

### 3. Add TidalError::Backpressure variant

```rust
// In tidal/src/schema/error.rs, add to TidalError:

/// The WAL write queue is saturated. The caller should retry after the
/// suggested delay. This is NOT a data loss event -- the signal was
/// never enqueued, so it can be safely retried.
#[error("backpressure: WAL queue full, retry after {retry_after_ms}ms")]
Backpressure {
    /// Suggested delay before retrying, in milliseconds.
    retry_after_ms: u64,
},
```

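Because the variant carries a retry hint and guarantees the signal was never enqueued, a caller can retry safely. The following is a minimal, std-only sketch of such a loop. The `TidalError` here is a stand-in with a hand-written `Display` (the real type is assumed to derive it from the `#[error(...)]` attribute), and `write_with_retry` and the `Other` variant are hypothetical names introduced for illustration.

```rust
use std::fmt;
use std::thread;
use std::time::Duration;

// Stand-in for `TidalError`, reduced to what this sketch needs.
#[derive(Debug, PartialEq, Eq)]
pub enum TidalError {
    Backpressure { retry_after_ms: u64 },
    Other(String),
}

impl fmt::Display for TidalError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TidalError::Backpressure { retry_after_ms } => {
                write!(f, "backpressure: WAL queue full, retry after {retry_after_ms}ms")
            }
            TidalError::Other(msg) => write!(f, "{msg}"),
        }
    }
}

/// Hypothetical caller-side helper: call `write` until it succeeds, sleeping
/// for the suggested delay after each backpressure rejection. Gives up after
/// `max_attempts` total attempts. Returns the number of attempts used.
pub fn write_with_retry<F>(mut write: F, max_attempts: u32) -> Result<u32, TidalError>
where
    F: FnMut() -> Result<(), TidalError>,
{
    let mut attempt = 1;
    loop {
        match write() {
            Ok(()) => return Ok(attempt),
            // Retry only on backpressure, and only while attempts remain.
            Err(TidalError::Backpressure { retry_after_ms }) if attempt < max_attempts => {
                thread::sleep(Duration::from_millis(retry_after_ms));
                attempt += 1;
            }
            // Any other error, or backpressure on the final attempt, is returned.
            Err(err) => return Err(err),
        }
    }
}
```

A production caller would likely add jitter to the delay so that many rejected writers do not retry in lockstep; that is left out here to keep the sketch short.
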
### 4. Backpressure threshold config

```rust
// In tidal/src/load/detector.rs (or a separate backpressure.rs):

/// Configuration for WAL backpressure.
///
/// When the WAL command channel's pending message count reaches
/// `queue_depth_threshold`, signal writes are rejected with
/// `TidalError::Backpressure` so that load is shed before the bounded
/// channel fills and the writer thread has time to drain.
#[derive(Debug, Clone, Copy)]
pub struct BackpressureConfig {
    /// Pending-message count in the WAL channel at which writes are rejected.
    /// Default: 80% of `DEFAULT_CHANNEL_CAPACITY` (8000 out of 10000).
    pub queue_depth_threshold: usize,
    /// Suggested retry delay in milliseconds returned to the caller.
    /// Default: 50ms.
    pub retry_after_ms: u64,
}

impl Default for BackpressureConfig {
    fn default() -> Self {
        Self {
            queue_depth_threshold: 8_000,
            retry_after_ms: 50,
        }
    }
}
```

Store this config on `TidalDb`:

```rust
// In tidal/src/db/mod.rs:
backpressure_config: crate::load::BackpressureConfig,
```

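The default threshold is described as 80% of the channel capacity but written as a literal. One possible alternative, sketched below, derives it from the capacity so the two cannot drift apart. The constant value of 10,000 is an assumption taken from the "8000 out of 10000" note in the doc comment, and `for_capacity` is a hypothetical constructor, not part of the design.

```rust
/// Assumed value of `DEFAULT_CHANNEL_CAPACITY`; the real constant lives in
/// `tidal/src/wal/mod.rs` and may differ.
pub const DEFAULT_CHANNEL_CAPACITY: usize = 10_000;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BackpressureConfig {
    pub queue_depth_threshold: usize,
    pub retry_after_ms: u64,
}

impl BackpressureConfig {
    /// Hypothetical constructor: set the threshold to `percent` of `capacity`.
    /// `percent` is clamped to 100 so the threshold never exceeds the capacity.
    pub fn for_capacity(capacity: usize, percent: usize, retry_after_ms: u64) -> Self {
        let percent = percent.min(100);
        Self {
            // Multiply before dividing so small capacities are not truncated to 0.
            queue_depth_threshold: capacity * percent / 100,
            retry_after_ms,
        }
    }
}

impl Default for BackpressureConfig {
    fn default() -> Self {
        // 80% of 10_000 = 8_000, matching the literal default above.
        Self::for_capacity(DEFAULT_CHANNEL_CAPACITY, 80, 50)
    }
}
```

The trade-off is that a literal is easier to grep for, while the derived form stays correct if the channel capacity is ever changed.
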
### 5. Backpressure check in TidalDb::signal()

The `TidalDb::signal()` method currently writes to the WAL via the `WalHandleWriter` (which implements `signals::WalWriter`). The backpressure check must happen BEFORE the WAL enqueue, not inside the WAL writer, because:

1. The WAL writer trait returns `signals::WalError`, not `TidalError`, so it cannot surface the `Backpressure` variant directly
2. The check is a policy decision belonging to the database layer, not the WAL layer

```rust
// In tidal/src/db/signals.rs, in TidalDb::signal():

impl TidalDb {
    pub fn signal(
        &self,
        signal_type: &str,
        entity_id: EntityId,
        weight: f64,
        ts: Timestamp,
    ) -> crate::Result<()> {
        // Backpressure check: inspect WAL channel depth before enqueuing.
        // This is O(1) -- crossbeam's `Sender::len()` reads the channel's
        // atomic counters and does not block.
        if let Ok(guard) = self.wal.lock()
            && let Some(wal) = guard.as_ref()
        {
            let queue_depth = wal.channel_len();
            if queue_depth >= self.backpressure_config.queue_depth_threshold {
                tracing::warn!(
                    queue_depth,
                    threshold = self.backpressure_config.queue_depth_threshold,
                    "WAL backpressure: rejecting signal write"
                );
                return Err(TidalError::Backpressure {
                    retry_after_ms: self.backpressure_config.retry_after_ms,
                });
            }
        }

        // ... existing signal write logic ...
    }
}
```

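The check-before-enqueue behaviour can be exercised without a real WAL. The sketch below is a std-only model, assuming a mock queue that is only drained on request (mimicking a writer thread that has fallen behind). `MockWal`, `MockDb`, `enqueue`, and `drain` are hypothetical names for illustration; the real `TidalDb` and `WalHandle` have different shapes.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

#[derive(Debug, PartialEq, Eq)]
pub enum TidalError {
    Backpressure { retry_after_ms: u64 },
}

#[derive(Debug, Clone, Copy)]
pub struct BackpressureConfig {
    pub queue_depth_threshold: usize,
    pub retry_after_ms: u64,
}

/// Stand-in for the WAL handle: nothing is consumed unless `drain` is called.
#[derive(Default)]
pub struct MockWal {
    pending: Mutex<VecDeque<u64>>,
}

impl MockWal {
    /// Number of commands waiting to be written.
    pub fn channel_len(&self) -> usize {
        self.pending.lock().unwrap().len()
    }

    pub fn enqueue(&self, entity_id: u64) {
        self.pending.lock().unwrap().push_back(entity_id);
    }

    /// Simulate the writer thread consuming up to `n` commands.
    pub fn drain(&self, n: usize) {
        let mut queue = self.pending.lock().unwrap();
        for _ in 0..n {
            if queue.pop_front().is_none() {
                break;
            }
        }
    }
}

pub struct MockDb {
    pub wal: MockWal,
    pub backpressure_config: BackpressureConfig,
}

impl MockDb {
    /// Mirrors the policy above: reject when depth >= threshold, else enqueue.
    pub fn signal(&self, entity_id: u64) -> Result<(), TidalError> {
        if self.wal.channel_len() >= self.backpressure_config.queue_depth_threshold {
            return Err(TidalError::Backpressure {
                retry_after_ms: self.backpressure_config.retry_after_ms,
            });
        }
        self.wal.enqueue(entity_id);
        Ok(())
    }
}
```

With a threshold of 2, the first two writes are accepted, the third is rejected without being enqueued, and a write succeeds again once the mock writer drains one command. Note that the depth check and the enqueue are separate steps, so under concurrent writers the threshold acts as a soft limit.
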
### 6. Expose channel length on WalHandle

The `WalHandle` currently does not expose the channel's pending message count. Add a method:

```rust
// In tidal/src/wal/mod.rs, add to WalHandle:

impl WalHandle {
    /// Return the number of pending commands in the writer channel.
    ///
    /// O(1) operation. Used by the backpressure check in `TidalDb::signal()`
    /// to detect queue saturation before enqueuing.
    #[must_use]
    pub fn channel_len(&self) -> usize {
        self.tx.len()
    }
}
```

`crossbeam::channel::Sender::len()` returns the number of messages currently in the channel in O(1) time and does not require holding a lock. The value is a point-in-time snapshot: under concurrent writers the depth can change between the check and the enqueue, so the threshold behaves as a soft limit rather than an exact cap.

### 7. Re-export from lib.rs

The `DegradationLevel` should be accessible from the public API:

```rust
// In tidal/src/lib.rs, add:
pub use load::DegradationLevel;
```

## Acceptance Criteria

- [ ] `Results.degradation_level` field present and set by `RetrieveExecutor`
- [ ] `SearchResults.degradation_level` field present and set by `SearchExecutor`
- [ ] `TidalError::Backpressure { retry_after_ms }` variant added
- [ ] `BackpressureConfig` with configurable threshold (default 8000) and retry delay (default 50ms)
- [ ] `WalHandle::channel_len()` returns pending command count
- [ ] Backpressure check in `TidalDb::signal()` rejects writes when queue depth reaches the threshold
- [ ] Backpressure does NOT affect query reads (only signal writes)
- [ ] `DegradationLevel` re-exported as `tidaldb::DegradationLevel`
- [ ] All existing tests pass (Results construction updated with `degradation_level: DegradationLevel::Full`)
- [ ] `cargo clippy -- -D warnings` clean

## Test Strategy

```rust
#[cfg(test)]
#[allow(clippy::unwrap_used)]
mod tests {
    use super::*;

    #[test]
    fn results_includes_degradation_level() {
        let results = Results {
            items: vec![],
            next_cursor: None,
            total_candidates: 0,
            constraints_satisfied: true,
            warnings: vec![],
            session_snapshot: None,
            degradation_level: DegradationLevel::ReducedCandidates,
        };
        assert_eq!(
            results.degradation_level,
            DegradationLevel::ReducedCandidates
        );
    }

    #[test]
    fn search_results_includes_degradation_level() {
        let results = SearchResults {
            items: vec![],
            next_cursor: None,
            total_candidates: 0,
            constraints_satisfied: true,
            warnings: vec![],
            session_snapshot: None,
            degradation_level: DegradationLevel::Full,
        };
        assert_eq!(results.degradation_level, DegradationLevel::Full);
    }

    #[test]
    fn backpressure_error_display() {
        let err = TidalError::Backpressure { retry_after_ms: 50 };
        let msg = err.to_string();
        assert!(msg.contains("backpressure"));
        assert!(msg.contains("50"));
    }

    #[test]
    fn wal_channel_len_reports_zero_when_empty() {
        let dir = tempfile::tempdir().unwrap();
        let config = crate::wal::WalConfig {
            dir: dir.path().to_path_buf(),
            ..Default::default()
        };
        let (handle, _, _) = crate::wal::WalHandle::open(config).unwrap();
        // After open with no pending commands, len should be 0 (or very small).
        // The writer thread may have consumed any initial commands.
        assert!(handle.channel_len() < 10);
        handle.shutdown().unwrap();
    }

    #[test]
    #[ignore = "integration test outline; not yet implemented"]
    fn backpressure_rejects_signal_when_queue_full() {
        // Integration test:
        // 1. Open a TidalDb with a very low backpressure threshold (e.g., 1).
        // 2. Flood the WAL channel by sending commands faster than the writer
        //    can drain them (or use a mock WAL that never consumes).
        // 3. Call db.signal() and assert TidalError::Backpressure is returned.
    }

    #[test]
    #[ignore = "integration test outline; not yet implemented"]
    fn backpressure_does_not_affect_queries() {
        // Integration test:
        // Even when the WAL queue is saturated, retrieve() and search()
        // should still return Ok results (possibly with degraded quality,
        // but never a Backpressure error).
    }
}
```