---
title: "Agents get memory lanes, not context windows"
date: "2026-02-22"
author: "Jordan Washburn"
description: "An LLM agent curating content for a user can write structured signals into tidalDB -- reward, preference hints, annotations -- scoped to a session, governed by a schema-declared policy, and immediately reflected in the next ranking query. No Redis key. No appended context window. Decaying, queryable, crash-safe memory."
tags: ["agents", "sessions", "signals", "architecture"]
---

Here is how agent memory works in the 6-system stack:

The agent appends its observations to the LLM context window. "The user seems to like jazz piano." "This creator posts consistently high-quality content." "Suppress cooking videos for this session." These observations are strings. They have no structure, no decay, no ranking integration. When the context window fills, the oldest observations are truncated. When the process restarts, they are gone.

Or: the agent writes a Redis key. `agent:planner:user:42:session:abc:preferences = "jazz,piano"`. The key has a TTL, but a TTL is expiry, not decay: the value counts in full until the moment it disappears. The ranking service does not read it. The key exists in a different system from the one that scores content. Connecting the two requires glue code that someone writes, nobody tests, and everyone regrets.

Or: the agent writes to a vector store. The observation is embedded and stored. But the vector store has no concept of decay, no concept of signal weight, no concept of policy enforcement. The observation persists at full strength forever, or until someone manually deletes it.

None of these give the agent what it actually needs: a structured, decaying, policy-governed memory that feeds directly into the ranking query.

## What M4 shipped

An agent session in tidalDB is a scoped memory lane. The agent starts a session, writes signals within it, and the next `RETRIEVE` query reflects those writes. The session is tied to a user, governed by a schema-declared policy, and crash-recoverable via WAL.

The lifecycle:

```rust
// 1. Declare a policy in the schema.
let _ = builder.session_policy(
    "planner_policy",
    AgentPolicy {
        allowed_signals: vec!["reward".to_string(), "view".to_string()],
        denied_signals: vec!["skip".to_string()],
        max_session_duration: Duration::from_secs(3600),
        max_signals_per_session: 100,
    },
);

// 2. Start a session.
let mut meta = HashMap::new();
meta.insert("context".to_string(), "video-feed".to_string());

let handle = db.start_session(
    42,                  // user_id
    "planner-agent",     // agent_id
    "planner_policy",    // policy from schema
    meta,                // session metadata
)?;

// 3. Write signals with optional annotations.
db.session_signal(
    &handle,
    "reward",            // signal type
    EntityId::new(5),    // entity being rewarded
    1.0,                 // weight
    Timestamp::now(),
    Some("rust programming language".to_string()), // annotation
)?;

// 4. Query with session context.
let query = RetrieveBuilder::new(EntityKind::Item, ProfileRef::new("hot"))
    .for_session(handle.id)
    .limit(10)
    .build()?;
let results = db.retrieve(&query)?;

// 5. Close the session.
let summary = db.close_session(handle)?;
// summary.signals_written, summary.rejections, summary.duration_ms
```

The `SessionHandle` is move-only. When you call `close_session`, it consumes the handle. The compiler prevents use-after-close. This is not a runtime check. It is a type-level guarantee.
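
To make the move-only claim concrete, here is a minimal sketch of the pattern using stand-in types. The `SessionHandle`, `SessionSummary`, and both function bodies below are hypothetical illustrations, not tidalDB's actual definitions. The only assumption carried over from the text is that closing takes the handle by value while signaling borrows it.

```rust
// Hypothetical stand-ins for illustration; not tidalDB's real types.
// The handle deliberately does not derive Clone or Copy, so it can only be moved.
#[derive(Debug)]
pub struct SessionHandle {
    pub id: u64,
}

#[derive(Debug, PartialEq)]
pub struct SessionSummary {
    pub session_id: u64,
}

// Borrows the handle: the caller keeps ownership and can signal again.
pub fn session_signal(handle: &SessionHandle, signal_type: &str) -> String {
    format!("session {} <- {}", handle.id, signal_type)
}

// Takes the handle by value: the caller's binding is moved and unusable afterwards.
pub fn close_session(handle: SessionHandle) -> SessionSummary {
    SessionSummary { session_id: handle.id }
}

// Usage that compiles:
//     let handle = SessionHandle { id: 7 };
//     session_signal(&handle, "reward");
//     let summary = close_session(handle);
//
// Usage that the compiler rejects:
//     session_signal(&handle, "reward");
//     // error[E0382]: borrow of moved value: `handle`
```

Because the rejection happens at compile time, there is no "session already closed" branch to handle for a handle the caller has already given away.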

## Policy enforcement inside the database

The agent does not get to write arbitrary signals. The policy is declared in the schema, validated at build time, and enforced on every `session_signal` call.

The `planner_policy` above allows `reward` and `view`. It denies `skip`. It caps the session at 100 signals and 1 hour. If the agent tries to write a `skip` signal:

```rust
let ts = Timestamp::now();
let result = db.session_signal(&handle, "skip", EntityId::new(99), 1.0, ts, None);
// Err(PolicyViolation { signal_type: "skip", policy_name: "planner_policy", ... })
```

The rejection is counted, logged to the audit trail, and returned as a typed error. The agent knows exactly what failed and why. The database enforced the constraint. The application did not need a permission check, a middleware layer, or a policy service.

Four checks run on every session signal, in order:

1. **Duration** -- has the session exceeded `max_session_duration`? If so, `SessionExpired`.
2. **Count cap** -- has the session hit `max_signals_per_session`? If so, `CountCap`.
3. **Deny list** -- is this signal type explicitly denied? If so, `Denied`.
4. **Allow list** -- if the allow list is non-empty, is this signal type on it? If not, `NotAllowed`.
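
The ordering of the four checks can be sketched as a single function. This is an illustration under stated assumptions, not the code in `policy.rs`: the `Rejection` enum, the `evaluate` signature, the `u64` counter type, and the boundary choices (`>` for duration, `>=` for the count cap) are all hypothetical. The field names on `AgentPolicy` follow the schema example above.

```rust
use std::time::Duration;

// Field names mirror the schema example; the concrete types are assumptions.
pub struct AgentPolicy {
    pub allowed_signals: Vec<String>,
    pub denied_signals: Vec<String>,
    pub max_session_duration: Duration,
    pub max_signals_per_session: u64,
}

// Hypothetical rejection reasons, named after the four checks.
#[derive(Debug, PartialEq)]
pub enum Rejection {
    SessionExpired,
    CountCap,
    Denied,
    NotAllowed,
}

// Runs the checks in the documented order and returns the first failure.
pub fn evaluate(
    policy: &AgentPolicy,
    elapsed: Duration,
    signals_written: u64,
    signal_type: &str,
) -> Result<(), Rejection> {
    if elapsed > policy.max_session_duration {
        return Err(Rejection::SessionExpired);
    }
    if signals_written >= policy.max_signals_per_session {
        return Err(Rejection::CountCap);
    }
    if policy.denied_signals.iter().any(|s| s.as_str() == signal_type) {
        return Err(Rejection::Denied);
    }
    // An empty allow list means "allow anything not denied".
    if !policy.allowed_signals.is_empty()
        && !policy.allowed_signals.iter().any(|s| s.as_str() == signal_type)
    {
        return Err(Rejection::NotAllowed);
    }
    Ok(())
}

// The planner policy from the schema example, rebuilt for this sketch.
pub fn planner_policy() -> AgentPolicy {
    AgentPolicy {
        allowed_signals: vec!["reward".to_string(), "view".to_string()],
        denied_signals: vec!["skip".to_string()],
        max_session_duration: Duration::from_secs(3600),
        max_signals_per_session: 100,
    }
}
```

Order matters for the reason the agent sees. An expired session that tries to write `skip` reports the expiry, not the deny-list hit, because the duration check runs first.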

Every decision is recorded in a bounded audit log (capped at 10,000 entries with FIFO eviction). The application can inspect the log at any time:

```rust
let audit = db.session_audit(session_id)?;
// Vec<AuditEntry> -- each with timestamp_ns, signal_type, accepted, reason
```
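
A bounded log with FIFO eviction is a small data structure. The sketch below shows one way to build it on `std::collections::VecDeque`. The `AuditLog` type and its methods are hypothetical, and the field types on `AuditEntry` are assumptions; only the field names and the 10,000-entry cap come from the text.

```rust
use std::collections::VecDeque;

// Field names follow the audit entry described above; the types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct AuditEntry {
    pub timestamp_ns: u64,
    pub signal_type: String,
    pub accepted: bool,
    pub reason: Option<String>,
}

// Hypothetical bounded log: the oldest entry is evicted once the cap is reached.
pub struct AuditLog {
    entries: VecDeque<AuditEntry>,
    capacity: usize,
}

impl AuditLog {
    // The text states a cap of 10,000; the capacity is a parameter here.
    pub fn new(capacity: usize) -> Self {
        Self {
            entries: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    // Appends an entry, dropping the oldest one first if the log is full.
    pub fn record(&mut self, entry: AuditEntry) {
        if self.capacity == 0 {
            return;
        }
        if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back(entry);
    }

    // Returns a copy of the retained entries, oldest first.
    pub fn entries(&self) -> Vec<AuditEntry> {
        self.entries.iter().cloned().collect()
    }
}
```

Both `pop_front` and `push_back` are O(1), so recording a decision does not slow down as the log fills.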

This is the difference between "the agent can write whatever it wants and we hope the application validates it" and "the database enforces what the agent can do, audits every decision, and the schema is the source of truth."

## Immediate reflection in the ranking query

Session signals feed into the `RETRIEVE` pipeline through the `for_session` clause. When the query executor sees a session ID, it loads the session snapshot, extracts a `SessionContext`, and applies an additive boost during scoring.

The boost has two components:

**Keyword hint matching.** Annotations written by the agent are split into keywords. Each keyword is matched against item metadata values. The fraction of matching keywords produces a hint score, weighted at 0.3. An agent that annotates "rust programming language" on a reward signal will boost items whose metadata contains those terms.

**Reward velocity.** The session tracks a running decay score for the `reward` signal. Higher reward velocity produces a larger boost, weighted at 0.2, with Michaelis-Menten saturation (`vel / (vel + 1)`), so this component approaches 0.2 asymptotically rather than growing without bound.

The boost is additive. It applies after base scoring, before normalization. An item that the agent rewarded heavily in the current session ranks higher in the next query. An item the agent ignored is unaffected. The agent's observations shape ranking within the session scope without contaminating the global signal ledger.
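
The arithmetic of the two components can be sketched in a few lines. The weights (0.3 and 0.2) and the saturation formula come from the text. Everything else is an assumption made for illustration: the function names, whitespace tokenization, case-insensitive substring matching, and the simple sum of the two terms. How the executor decides which items receive the velocity term is not shown here.

```rust
pub const HINT_WEIGHT: f64 = 0.3;
pub const VELOCITY_WEIGHT: f64 = 0.2;

// Fraction of annotation keywords found in any metadata value.
// Tokenization and matching rules are assumptions for this sketch.
pub fn hint_score(annotations: &[&str], metadata_values: &[&str]) -> f64 {
    let keywords: Vec<String> = annotations
        .iter()
        .flat_map(|a| a.split_whitespace())
        .map(|k| k.to_lowercase())
        .collect();
    if keywords.is_empty() {
        return 0.0;
    }
    let haystack: Vec<String> = metadata_values.iter().map(|v| v.to_lowercase()).collect();
    let matched = keywords
        .iter()
        .filter(|k| haystack.iter().any(|v| v.contains(k.as_str())))
        .count();
    matched as f64 / keywords.len() as f64
}

// Michaelis-Menten saturation: 0 at vel = 0, 0.5 at vel = 1, approaches 1 as vel grows.
pub fn velocity_term(vel: f64) -> f64 {
    let vel = vel.max(0.0);
    vel / (vel + 1.0)
}

// Additive boost, applied on top of the base score in this sketch.
pub fn session_boost(hint: f64, vel: f64) -> f64 {
    HINT_WEIGHT * hint + VELOCITY_WEIGHT * velocity_term(vel)
}
```

With these weights the total boost can never exceed 0.5: at most 0.3 from a full keyword match plus a velocity term that stays below 0.2.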

The acceptance test proves this directly:

```rust
// Signal entity 5 heavily in the session.
let ts = Timestamp::now();
for _ in 0..10 {
    db.session_signal(&handle, "reward", EntityId::new(5), 2.0, ts, None)?;
}

// Query WITH for_session: entity 5 gets a boost.
let query = RetrieveBuilder::new(EntityKind::Item, ProfileRef::new("hot"))
    .for_session(handle.id)
    .limit(10)
    .build()?;
let results_with = db.retrieve(&query)?;

// Query WITHOUT for_session: entity 5 has no global signals.
let query_no_session = RetrieveBuilder::new(EntityKind::Item, ProfileRef::new("hot"))
    .limit(10)
    .build()?;
let results_without = db.retrieve(&query_no_session)?;

// Entity 5 ranks higher with the session boost.
```

There is no delay between the `session_signal` calls and the query. The session state is in-memory. The query reads it directly. Same process, same memory space.

## Session isolation

Two agents running concurrently for different users do not see each other's session state. Session A's signals do not appear in Session B's snapshot. Session B's annotations do not influence Session A's ranking boost.

```rust
let handle_a = db.start_session(70, "agent-a", "planner_policy", HashMap::new())?;
let handle_b = db.start_session(71, "agent-b", "planner_policy", HashMap::new())?;

let ts = Timestamp::now();
db.session_signal(&handle_a, "reward", EntityId::new(1), 1.0, ts, None)?;
db.session_signal(&handle_b, "reward", EntityId::new(5), 1.0, ts, None)?;

let snap_a = db.session_snapshot(handle_a.id)?;
assert!(snap_a.signaled_entities.contains(&1));
assert!(!snap_a.signaled_entities.contains(&5));  // agent-b's signal is not here

let snap_b = db.session_snapshot(handle_b.id)?;
assert!(snap_b.signaled_entities.contains(&5));
assert!(!snap_b.signaled_entities.contains(&1));  // agent-a's signal is not here
```

The isolation is structural. Each session has its own `DashMap` of signal state, its own set of signaled entities, its own annotation list. There is no shared mutable state between sessions. The `for_session` clause in a query reads exactly one session.

## Crash safety

Session metadata, annotations, and signal writes are WAL-backed. Every `start_session` writes a start record to durable storage and a WAL event. Every `session_signal` writes a WAL event with the signal type, entity ID, weight, timestamp, and annotation. Every `close_session` persists a frozen snapshot and removes the start record atomically.

If the process crashes mid-session, the recovery path at next startup:

1. The WAL is replayed. Start events without matching Close events identify sessions that were active at crash time.
2. For each orphaned session, the start record is read from storage to recover metadata.
3. Signal events are replayed into a fresh `SessionSignalState` with the correct decay lambda.
4. Annotations are restored from WAL signal events.
5. The session resumes as active. The agent can continue writing signals.
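
The core of this replay can be sketched as one pass over the log. The `WalEvent` enum, `RecoveredSession`, and both functions below are hypothetical; the real record layout lives in `session_restore.rs` and is not reproduced here. The sketch assumes events arrive in write order and skips the step that reads the start record back from storage.

```rust
use std::collections::BTreeMap;

// Hypothetical WAL event shapes; signal fields follow the list in the text.
#[derive(Debug, Clone, PartialEq)]
pub enum WalEvent {
    Start { session_id: u64 },
    Signal {
        session_id: u64,
        signal_type: String,
        entity_id: u64,
        weight: f64,
        timestamp_ns: u64,
        annotation: Option<String>,
    },
    Close { session_id: u64 },
}

// Hypothetical in-memory state rebuilt for a session that was open at crash time.
#[derive(Debug, Default, PartialEq)]
pub struct RecoveredSession {
    pub signals: Vec<(String, u64, f64)>,
    pub annotations: Vec<String>,
}

// Replays the log: a Start opens a session, a Close removes it,
// and whatever remains at the end was active when the process died.
pub fn recover_active(events: &[WalEvent]) -> BTreeMap<u64, RecoveredSession> {
    let mut active = BTreeMap::new();
    for ev in events {
        match ev {
            WalEvent::Start { session_id } => {
                active.insert(*session_id, RecoveredSession::default());
            }
            WalEvent::Signal { session_id, signal_type, entity_id, weight, annotation, .. } => {
                if let Some(session) = active.get_mut(session_id) {
                    session.signals.push((signal_type.clone(), *entity_id, *weight));
                    if let Some(text) = annotation {
                        session.annotations.push(text.clone());
                    }
                }
            }
            WalEvent::Close { session_id } => {
                active.remove(session_id);
            }
        }
    }
    active
}

// Advances the ID counter past every restored session to avoid collisions.
pub fn next_session_id(recovered: &BTreeMap<u64, RecoveredSession>, current: u64) -> u64 {
    recovered
        .keys()
        .next_back()
        .map_or(current, |max_id| current.max(max_id + 1))
}
```

In this sketch the annotations ride along inside the signal events, which is why steps 3 and 4 fall out of the same pass.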

The `next_session_id` counter advances past any restored session IDs to prevent collisions. The session's `Instant` (monotonic clock) is approximated as "now" since monotonic timestamps do not survive restarts -- but the nanosecond Unix timestamp from the original start is preserved for archival and audit purposes.

Closed sessions are also restored. The frozen snapshot is persisted to storage at close time, and `restore_sessions()` scans for these at startup, populating the `closed_sessions` cache. A `session_snapshot` call for a closed session reads from this cache, falling back to storage if evicted.

## The agent-as-user model

The architectural insight behind M4 is that agents and users are not different kinds of actors. They are different *scopes* of the same primitives.

A user writes signals: view, like, share, skip. Those signals decay. They contribute to ranking profiles. They update interaction weights and preference vectors. They survive crashes via WAL replay.

An agent writes signals: reward, view, preference hints via annotations. Those signals decay. They contribute to ranking via the `for_session` boost. They are governed by a policy. They survive crashes via WAL replay.

Same primitive. Different scope. The user's signals shape global ranking. The agent's signals shape session-scoped ranking. Both use the same decay math, the same windowed counters, the same WAL infrastructure. The agent does not need a separate memory system. It writes into the same database, through the same signal interface, with the same durability guarantees.
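
For readers unfamiliar with decayed counters, here is a minimal sketch of a running score with exponential decay that updates in constant time. It illustrates the general technique only. The `DecayScore` type, its use of seconds as `f64`, and the exact update rule are assumptions, not tidalDB's signal state.

```rust
// Hypothetical running score with exponential decay.
// Only the latest score and its timestamp are stored, so each update is O(1).
pub struct DecayScore {
    score: f64,
    last_ts_secs: f64,
    lambda: f64,
}

impl DecayScore {
    // `lambda` is the decay rate per second; half-life = ln(2) / lambda.
    pub fn new(lambda: f64) -> Self {
        Self { score: 0.0, last_ts_secs: 0.0, lambda }
    }

    // Score as seen at `now_secs`, decayed from the last update.
    pub fn value_at(&self, now_secs: f64) -> f64 {
        let dt = (now_secs - self.last_ts_secs).max(0.0);
        self.score * (-self.lambda * dt).exp()
    }

    // Decays the stored score forward to `now_secs`, then adds the new weight.
    pub fn add(&mut self, weight: f64, now_secs: f64) {
        self.score = self.value_at(now_secs) + weight;
        self.last_ts_secs = self.last_ts_secs.max(now_secs);
    }
}
```

With a 10-second half-life, a weight of 1.0 written at t = 0 reads as 0.5 at t = 10. No background job touches the value in between; the decay is computed when the score is read or written.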

The policy is the boundary. It defines what the agent can write, how long the session lasts, and how many signals it can produce. The policy lives in the schema, not in application middleware. The database enforces it. The application trusts it.

## What this replaces

In the 6-system stack, agent memory is an afterthought bolted onto infrastructure designed for request-response serving:

```
Agent observes user preference
  -> writes Redis key (no decay, no ranking integration)
  -> or appends to context window (no persistence, truncation on overflow)
  -> or writes to vector store (no decay, no policy, no ranking integration)
  -> ranking service does not read any of these
  -> next query serves the same results as if the agent had done nothing
```

In tidalDB:

```
Agent observes user preference
  -> db.session_signal(&handle, "reward", entity_id, weight, ts, annotation)
       -> policy evaluated (4 checks, audit logged)
       -> session signal state updated (O(1) decay + windowed counter)
       -> entity tracked in session's signaled_entities set
       -> annotation stored (capped at 100)
       -> WAL event written (crash recovery)
  -> next query with for_session(session_id)
       -> session snapshot loaded
       -> keyword hints matched against item metadata
       -> reward velocity applied as boost
       -> results reflect agent's observations
```

One function call. One process. One consistency model. The agent's memory is structured, decaying, policy-governed, crash-safe, and immediately queryable.

---

*The session lifecycle is at [tidal/src/db/sessions.rs](https://github.com/orchard9/tidalDB/blob/main/tidal/src/db/sessions.rs). Policy evaluation is at [tidal/src/session/policy.rs](https://github.com/orchard9/tidalDB/blob/main/tidal/src/session/policy.rs). Session crash recovery is at [tidal/src/db/session_restore.rs](https://github.com/orchard9/tidalDB/blob/main/tidal/src/db/session_restore.rs). The acceptance tests are at [tidal/tests/m4_uat.rs](https://github.com/orchard9/tidalDB/blob/main/tidal/tests/m4_uat.rs). Follow the build on [GitHub](https://github.com/orchard9/tidalDB).*