Memory

In one sentence

Memory is what lets your agent remember things — like a good assistant who knows your name, recalls what you discussed last week, and doesn’t ask you the same question twice. By default, an AI model forgets everything the moment a conversation ends. Memory fixes that.

The 30-second version: Add a memory block to your agent, point it at a database, and your agent now remembers users across conversations — automatically. No extra code. Everything else on this page is optional fine-tuning.

Why it matters

Think about the difference between two customer-service experiences:

Without memory: “Hi, can I get your name? And your order number? And what was the issue again?” — every single time, even if you called yesterday.
With memory: “Hi Akash — is this about the delayed order ORD-7421 we discussed yesterday? I see it shipped this morning.”

The second one feels like a relationship. That’s memory. It’s the difference between a chatbot people tolerate and an assistant people trust.

You are a…	Why you care about memory
Business / CEO	It turns one-off chats into lasting customer relationships — the thing that drives retention and loyalty.
Product / CTO	It’s a built-in, production-grade system — no need to build (and secure) your own. Works across chat, voice, and browser agents identically.
Developer	One config block. Point it at MongoDB/Postgres and you get persistence and cross-session recall for free; fact extraction is one flag (userFacts: true) away.

The mental model

Think of memory as a smart filing cabinet that sits behind your agent:

Every conversation, the agent opens the relevant drawers before answering (“what do I know about this person?”).
After the conversation, it files away anything new it learned (“they mentioned they prefer email”).
The cabinet has separate drawers for separate things — one for chat history, one for facts about the person, one for company knowledge, and so on.
Each person’s drawers are private — your data never shows up in someone else’s folder.

The rest of this page explains what’s in each drawer and how to configure them. But you can start with just one line: a storage backend.

Quick Start

import { Agent, MongoDBStorage, openai } from "@agentium/core";

const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  memory: {
    storage: new MongoDBStorage({ uri: "mongodb://localhost/agentium" }),
  },
});

With just storage, you get:

Session persistence — message history saved across runs
Summaries — overflow messages automatically summarized

Full Configuration

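import { Agent, InMemoryGraphStore, MongoDBStorage, openai } from "@agentium/core";
// qdrant(...) below is assumed to come from your vector store integration (import not shown).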

const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  memory: {
    storage: new MongoDBStorage({ uri: "mongodb://localhost/agentium" }),
    maxMessages: 20,          // messages in session history (default: 50)
    maxTokens: 128_000,       // auto-trim history to fit context window

    summaries: true,          // ON by default — long-term conversation context
    userFacts: true,          // OFF by default — "prefers dark mode", "lives in Mumbai"
    userProfile: true,        // OFF by default — structured: name, role, timezone
    entities: true,           // OFF by default — companies, people, projects
    decisions: true,          // OFF by default — audit trail of agent choices

    learnings: {              // OFF by default — needs a vector store
      vectorStore: qdrant(...),
    },

    graph: {                  // OFF by default — knowledge graph
      store: new InMemoryGraphStore(),
    },

    procedures: true,         // OFF by default — learns multi-step workflows

    contextBudget: {          // optional — controls context token allocation
      maxTokens: 4000,
      priorities: { summaries: 0.3, graph: 0.2 },
    },

    model: openai("gpt-4o-mini"), // cheaper model for background extraction
    timezone: "Asia/Kolkata",     // IANA timezone — anchors date-relative
                                  // extraction ("today", "yesterday"). Falls
                                  // back to UTC. Always set in production.
    tenantId: "acme-corp",        // optional — required for tenant-scoped
                                  // learnings/procedures to be visible.
  },
});
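
To make contextBudget easier to reason about, here is a small sketch with real numbers. It is not Agentium's allocator. It assumes each value in priorities is a fraction of maxTokens reserved for that section, and that whatever is left is split evenly across the other enabled sections. The allocateBudget function and that splitting rule are hypothetical.

```typescript
// Hypothetical allocator for illustration; NOT Agentium's implementation.
// Assumption: each priority is a fraction of maxTokens reserved for that section,
// and the unreserved remainder is split evenly among the other enabled sections.
type Budget = { maxTokens: number; priorities: Record<string, number> };

function allocateBudget(budget: Budget, enabled: string[]): Record<string, number> {
  const result: Record<string, number> = {};
  const unprioritized: string[] = [];
  let reserved = 0;

  for (const section of enabled) {
    const share = budget.priorities[section];
    if (share === undefined) {
      unprioritized.push(section);
    } else {
      const tokens = Math.round(budget.maxTokens * share);
      result[section] = tokens;
      reserved += tokens;
    }
  }

  // Whatever was not reserved is shared equally by the remaining sections.
  const remainder = Math.max(0, budget.maxTokens - reserved);
  for (const section of unprioritized) {
    result[section] = Math.floor(remainder / unprioritized.length);
  }
  return result;
}

const allocation = allocateBudget(
  { maxTokens: 4000, priorities: { summaries: 0.3, graph: 0.2 } },
  ["summaries", "graph", "userFacts", "entities"],
);
// allocation: summaries 1200, graph 800, userFacts 1000, entities 1000
```

Under these assumptions the two prioritized sections take half of the 4000-token budget and the other two share the rest. Check the Context Budget page for the actual rules before relying on specific numbers.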

Architecture

Memory is a layered subsystem, not a single store. One orchestrator (MemoryManager) coordinates up to nine specialized stores, each owning its own schema, extraction prompt, and scope rules, all sitting on a shared StorageDriver.

        ┌──────────────────────────────────────────────────┐
        │       Agent / VoiceAgent / BrowserAgent           │
        │  beforeRun → buildContext()                       │
        │  afterRun  → appendMessages + background extract   │
        └────────────────────────┬───────────────────────────┘
                                 ▼
        ┌──────────────────────────────────────────────────┐
        │                 MemoryManager                      │
        │  buildContext() · appendMessages() · afterRun()    │
        │  recall() · remember() · forget() · curator        │
        └───┬──────┬──────┬──────┬──────┬──────┬──────┬─────┘
            ▼      ▼      ▼      ▼      ▼      ▼      ▼
        Sessions Summary Facts Profile Entity Decision Procedures
            │                                          │
            ▼                                          ▼
        ┌──────────────────────────────────────────────────┐
        │              StorageDriver interface               │
        │     MongoDB · Postgres · Redis · SQLite · …         │
        └──────────────────────────────────────────────────┘

        Learnings ──► VectorStore (Qdrant · Pinecone · InMemory · …)
        Graph     ──► GraphStore  (Neo4j · InMemory)

Each layer talks to the next only through a typed interface — swap MongoDB for Postgres, or Qdrant for Pinecone, without touching any store logic.
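
The idea of swapping backends behind a typed interface can be sketched in a few lines. The real StorageDriver interface is not shown on this page, so the two-method shape below, along with MapDriver, ObjectDriver, and FactStore, is a hypothetical stand-in meant only to show the pattern.

```typescript
// Hypothetical shapes for illustration; the real StorageDriver has its own methods.
interface StorageDriver {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Backend 1: backed by a Map.
class MapDriver implements StorageDriver {
  private data = new Map<string, string>();
  get(key: string): string | undefined { return this.data.get(key); }
  set(key: string, value: string): void { this.data.set(key, value); }
}

// Backend 2: backed by a plain object.
class ObjectDriver implements StorageDriver {
  private data: Record<string, string> = {};
  get(key: string): string | undefined { return this.data[key]; }
  set(key: string, value: string): void { this.data[key] = value; }
}

// Store logic depends only on the interface, never on a concrete backend.
class FactStore {
  private driver: StorageDriver;
  constructor(driver: StorageDriver) { this.driver = driver; }

  addFact(userId: string, fact: string): void {
    const existing = this.getFacts(userId);
    this.driver.set(`facts:${userId}`, JSON.stringify([...existing, fact]));
  }

  getFacts(userId: string): string[] {
    const raw = this.driver.get(`facts:${userId}`);
    return raw ? (JSON.parse(raw) as string[]) : [];
  }
}

const viaMap = new FactStore(new MapDriver());
const viaObject = new FactStore(new ObjectDriver());
for (const store of [viaMap, viaObject]) {
  store.addFact("user-42", "prefers email");
}
```

FactStore behaves the same with either driver, which is the property the architecture relies on when you change databases.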

The nine “drawers” (stores)

Each store is one drawer in the filing cabinet. You turn on the ones you need.

Store	In plain terms	Example	Default
Sessions	The transcript of the current chat	“everything we’ve said this conversation”	ON
Summaries	A short recap of older chats	“last week you asked about refunds”	ON
User Facts	Things you know about the person	“Akash lives in Mumbai, prefers email”	OFF
User Profile	A tidy profile card	name · role · company · timezone	OFF
Entities	People/companies/products mentioned	“Acme Corp is their employer”	OFF
Decisions	A log of what the agent decided & why	“approved refund (7-day delay)”	OFF
Learnings	Lessons that apply to many chats	“Vendor X invoices always have errors”	OFF
Procedures	Step-by-step playbooks the agent learned	“how to reconcile a mismatched invoice”	OFF
Graph	A web of how things connect	“Raj reports to Priya at Acme”	OFF

You don’t need all of them. Most agents use Sessions + Summaries (on by default) plus User Facts. Turn on the rest only when your use case calls for it — the table later in this page tells you when.

Rule of thumb:

Want the agent to remember the person → turn on User Facts.
Want it to remember the conversation → Sessions + Summaries (already on).
Want a team to share knowledge → turn on Learnings or Procedures (see Scope Hierarchy).

How It Works

Every time your agent answers, two things happen automatically — like an assistant glancing at their notes before speaking, then jotting down anything new afterward:

Before answering → the agent reads its memory and brings the relevant bits into the conversation.
After answering → the agent quietly files away anything new it learned.

You write zero code for either. Here’s what’s happening under the hood.
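
If it helps to see the read-then-write loop in miniature, here is a toy version. Nothing in it is an Agentium API: ToyMemory, fakeModel, and extractFacts are hypothetical stand-ins, and the "model" is just a string check. It only shows why the second turn can use something learned in the first.

```typescript
// Toy stand-ins for illustration; none of these are Agentium APIs.
type ToyMemory = { facts: string[] };

// Before answering: turn stored facts into a context string.
function buildContext(memory: ToyMemory): string {
  if (memory.facts.length === 0) return "";
  return "Known facts:\n" + memory.facts.map((f) => `- ${f}`).join("\n");
}

// Placeholder for a model call: it answers differently when the fact is in context.
function fakeModel(systemPrompt: string): string {
  return systemPrompt.includes("prefers email")
    ? "I'll email you the update."
    : "How would you like to be contacted?";
}

// Placeholder extractor: a real one would use an LLM, not a regex.
function extractFacts(input: string): string[] {
  return /prefer email/i.test(input) ? ["prefers email"] : [];
}

function run(memory: ToyMemory, input: string): string {
  const context = buildContext(memory);                     // 1. read memory
  const answer = fakeModel(`You are a helpful assistant.\n\n${context}`);
  memory.facts.push(...extractFacts(input));                // 2. file away what is new
  return answer;
}

const memory: ToyMemory = { facts: [] };
const first = run(memory, "I prefer email for updates.");
const second = run(memory, "Any news on my order?");
// first:  "How would you like to be contacted?"
// second: "I'll email you the update."
```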

1. Before the answer: gather what we know (`buildContext()`)

MemoryManager.buildContext() gathers relevant data from all enabled stores and creates a context string injected into the system prompt:

// What buildContext() produces (approximate; the real output wraps each
// section in scope markers, see Inspecting Memory Context):
`
## Memory Context

### Session Summary
The user previously discussed shipping delays for order #12345
and requested a refund, which was processed successfully.

### About This User
- Name: Akash Sengar
- Role: Product Manager
- Company: Xhipment
- Prefers dark mode
- Timezone: Asia/Kolkata

### Relevant Entities
- Xhipment (company): Logistics platform, user's employer
- Order #12345: Delayed shipment from Dec 15

### Recent Decisions
- Approved refund for order #12345 (reason: 7-day delay exceeded SLA)

### Relevant Learnings
- Refunds for delays >5 days should be auto-approved per company policy
`

This context is appended to the system prompt, giving the model persistent awareness across sessions.

2. After the answer: remember what’s new (`afterRun()`)

The user already has their answer — so this step runs in the background and never slows down the response. It quietly re-reads the conversation (using a cheaper model to keep costs low) and files away anything worth remembering:

New user facts and profile updates
Entity mentions (companies, people, projects)
Decision records
Learnings worth remembering

// Background extraction happens automatically — no code needed.
// To use a cheaper model for extraction:
memory: {
  storage,
  model: openai("gpt-4o-mini"), // cheaper per token than the main model
  summaries: true,
  userFacts: true,
}
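
The reason extraction never slows the response is that it is deferred until after the answer has been returned. A minimal sketch of that ordering, assuming a simple job queue: BackgroundQueue and respond are hypothetical names, and Agentium's real scheduling is not shown on this page.

```typescript
// Hypothetical sketch of deferred extraction; not Agentium's scheduler.
type Job = () => void;

class BackgroundQueue {
  private jobs: Job[] = [];

  enqueue(job: Job): void {
    this.jobs.push(job);
  }

  // In a real system this would run off the request path (for example on a later tick).
  drain(): number {
    let ran = 0;
    let job = this.jobs.shift();
    while (job !== undefined) {
      job();
      ran++;
      job = this.jobs.shift();
    }
    return ran;
  }
}

const queue = new BackgroundQueue();
const storedFacts: string[] = [];

function respond(input: string): string {
  const answer = `Got it: ${input}`;
  // Extraction is queued, not executed, so the answer returns immediately.
  queue.enqueue(() => {
    if (input.toLowerCase().includes("dark mode")) {
      storedFacts.push("prefers dark mode");
    }
  });
  return answer;
}

const reply = respond("Please switch me to dark mode");
const factsBeforeDrain = storedFacts.length; // 0: nothing extracted yet
const jobsRun = queue.drain();               // 1: extraction runs afterwards
```

The user-facing reply exists before any fact is written, which is the behavior described above.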

3. Keeping it from getting too big (session overflow)

A conversation can’t grow forever: that would exceed the model’s context window and drive up cost. So when a chat gets long (past maxMessages), the oldest messages are summarized into a short recap and then removed. The agent keeps the gist without carrying every word. Think of it as turning ten pages of notes into a single sticky note.
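
Here is a rough sketch of that overflow step. It assumes the oldest messages are folded into a running summary as soon as the history exceeds maxMessages. The summarize function is a placeholder that just concatenates text (a real one would call a model), and appendWithOverflow is a hypothetical name.

```typescript
// Hypothetical sketch of session overflow; summarize() stands in for an LLM call.
type Message = { role: "user" | "assistant"; content: string };
type Session = { summary: string; messages: Message[] };

function summarize(previous: string, overflow: Message[]): string {
  const recap = overflow.map((m) => `${m.role}: ${m.content}`).join(" | ");
  return previous ? `${previous} | ${recap}` : recap;
}

function appendWithOverflow(session: Session, incoming: Message, maxMessages: number): Session {
  const all = [...session.messages, incoming];
  if (all.length <= maxMessages) {
    return { summary: session.summary, messages: all };
  }
  // Oldest messages beyond the limit are summarized, then dropped from history.
  const overflowCount = all.length - maxMessages;
  const overflow = all.slice(0, overflowCount);
  return {
    summary: summarize(session.summary, overflow),
    messages: all.slice(overflowCount),
  };
}

let session: Session = { summary: "", messages: [] };
for (let i = 1; i <= 5; i++) {
  session = appendWithOverflow(session, { role: "user", content: `msg ${i}` }, 3);
}
// session.messages now holds msg 3, msg 4, msg 5
// session.summary is "user: msg 1 | user: msg 2"
```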

Works Everywhere

The same memory config works across all agent types:

// Text Agent
new Agent({ model, memory: { storage } });

// Voice Agent
new VoiceAgent({ provider, memory: { storage } });

// Browser Agent
new BrowserAgent({ model, memory: { storage } });

Simplified API

For quick operations without dealing with individual stores, use the high-level remember, recall, and forget methods:

const mm = agent.memory!;

// Store a fact
await mm.remember("User prefers dark mode", { userId: "user-42" });

// Search across all stores with composite scoring
const results = await mm.recall("dark mode preference", { userId: "user-42" });
console.log(results[0].content, results[0].score);

// Remove memories
await mm.forget({ userId: "user-42", factId: "fact-abc" });

See Simplified API for full details.

Default Feature States

Feature	Default	Requires
Sessions	ON	`storage`
Summaries	ON	`storage`
User Facts	OFF	`userFacts: true`
User Profile	OFF	`userProfile: true`
Entities	OFF	`entities: true`
Decisions	OFF	`decisions: true`
Learnings	OFF	`learnings: { vectorStore }`
Graph Memory	OFF	`graph: { store }`
Procedures	OFF	`procedures: true`
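
The table above can be read as a small decision function. The sketch below is a hypothetical helper, not part of the library; it also assumes that summaries: false switches summaries off, which this page does not state explicitly.

```typescript
// Hypothetical helper mirroring the defaults table; not part of Agentium.
type MemoryFlags = {
  summaries?: boolean;
  userFacts?: boolean;
  userProfile?: boolean;
  entities?: boolean;
  decisions?: boolean;
  procedures?: boolean;
  learnings?: { vectorStore: unknown };
  graph?: { store: unknown };
};

function enabledStores(config: MemoryFlags): string[] {
  const enabled = ["sessions"];                               // ON whenever storage is set
  if (config.summaries !== false) enabled.push("summaries");  // ON by default (assumed: false disables)
  if (config.userFacts === true) enabled.push("userFacts");
  if (config.userProfile === true) enabled.push("userProfile");
  if (config.entities === true) enabled.push("entities");
  if (config.decisions === true) enabled.push("decisions");
  if (config.learnings !== undefined) enabled.push("learnings"); // needs a vector store
  if (config.graph !== undefined) enabled.push("graph");         // needs a graph store
  if (config.procedures === true) enabled.push("procedures");
  return enabled;
}

const defaults = enabledStores({});
const custom = enabledStores({ userFacts: true, graph: { store: {} } });
// defaults: ["sessions", "summaries"]
// custom:   ["sessions", "summaries", "userFacts", "graph"]
```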

Accessing Stores Directly

You can access individual stores via the MemoryManager:

const mm = agent.memory; // MemoryManager | null

const facts = await mm?.getUserFacts()?.getFacts("user-123");
const profile = await mm?.getUserProfile()?.getProfile("user-123");
// Entity / graph / procedure stores require a userId. Every read and write
// is scoped to a user, so one user can never see another user's data.
const entities = await mm?.getEntityMemory()?.listEntities("user-123");

Inspecting Memory Context

You can call buildContext() directly to see what the model receives:

const mm = agent.memory;
if (mm) {
  const ctx = await mm.buildContext(
    "session-abc",      // sessionId
    "user-42",          // userId
    "current user input", // currentInput (used for relevance scoring)
    "assistant",        // agentName
  );
  console.log(ctx);
  // Prints the full context string that would be injected into the system prompt.
}

Each section is wrapped in an explicit scope marker so the LLM never conflates user/session/agent data:

<memory section="userFacts" scope="current_user">
What you know about this user:
Facts the user told you directly (high confidence):
- User's name is Akash.
- Akash is based in Mumbai.
</memory>

<memory section="summaries" scope="current_user">
Previous conversation context (most recent first):
...
</memory>

This is useful for debugging — if the model seems to “forget” something, check if the relevant store is enabled and producing context.
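
When debugging, it can help to list which sections actually made it into the context string. The sketch below assumes the marker format shown above. renderSection, renderContext, and sectionsPresent are hypothetical helpers written for this example, not library functions.

```typescript
// Hypothetical helpers built around the <memory ...> marker format shown above.
type Section = { section: string; scope: string; body: string };

function renderSection(s: Section): string {
  return `<memory section="${s.section}" scope="${s.scope}">\n${s.body}\n</memory>`;
}

// Empty sections are skipped, so a disabled or empty store leaves no marker behind.
function renderContext(sections: Section[]): string {
  return sections
    .filter((s) => s.body.trim() !== "")
    .map(renderSection)
    .join("\n\n");
}

// Debug aid: list the section names present in a context string.
function sectionsPresent(context: string): string[] {
  const pattern = /<memory section="([^"]+)"/g;
  const found: string[] = [];
  let match = pattern.exec(context);
  while (match !== null) {
    found.push(match[1]);
    match = pattern.exec(context);
  }
  return found;
}

const ctx = renderContext([
  { section: "userFacts", scope: "current_user", body: "- User's name is Akash." },
  { section: "entities", scope: "current_user", body: "" },
  { section: "summaries", scope: "current_user", body: "Asked about refunds last week." },
]);
const present = sectionsPresent(ctx);
// present: ["userFacts", "summaries"]
```

If a section you expected is missing from the list, the store is either disabled or had nothing relevant to contribute.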

Multi-User Isolation

The short version: one user’s memory never leaks into another user’s conversation. Akash’s data stays in Akash’s drawers. This is enforced automatically; you don’t have to do anything to get it.

This matters because the moment you have more than one user (every real product), a memory system that mixes people’s data is a privacy incident waiting to happen. Agentium treats memory as a security boundary, not just a feature.

Every memory store is scoped by default, most of them to the calling user (see the table below). Even when users in two different tenants happen to share the same userId, they still cannot see each other’s data, because every read and write includes the relevant scope key.

Store	Default scope key	Supports shared scopes?
Session messages	`sessionId`	—
Summaries	`sessionId`	—
User facts	`userId`	personal only
User profile	`userId`	personal only
Entity memory	`userId`	personal only
Graph memory	`userId` (per-node `_userId`)	personal only
Procedure memory	`userId`	agent / tenant / global ✓
Decision log	`agentName` + `sessionId`	—
Learnings	`userId` (vector post-filter)	agent / tenant / global ✓

Learnings and Procedures support an explicit scope hierarchy so that genuinely shared knowledge, like “invoice reconciliation workflow” or “refunds > $500 need VP approval”, can be saved once and seen by every authorised user. See Multi-User Isolation for the full contract and a worked example.

When you call MemoryManager.buildContext(sessionId, userId, ...) without a userId, stores that require one return empty strings rather than risk surfacing another user’s data.
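
To picture how the scope hierarchy might decide visibility, here is a small sketch. The Scope, Learning, and Caller types and the matching rules are assumptions made for illustration. The real contract lives on the Multi-User Isolation page.

```typescript
// Hypothetical model of scoped visibility; the real rules may differ.
type Scope =
  | { kind: "personal"; userId: string }
  | { kind: "agent"; agentName: string }
  | { kind: "tenant"; tenantId: string }
  | { kind: "global" };

type Learning = { text: string; scope: Scope };
type Caller = { userId?: string; agentName: string; tenantId?: string };

function isVisible(scope: Scope, caller: Caller): boolean {
  switch (scope.kind) {
    case "personal":
      return caller.userId !== undefined && scope.userId === caller.userId;
    case "agent":
      return scope.agentName === caller.agentName;
    case "tenant":
      // Without a tenantId on the caller, tenant-scoped items stay hidden.
      return caller.tenantId !== undefined && scope.tenantId === caller.tenantId;
    case "global":
      return true;
  }
}

function visibleLearnings(all: Learning[], caller: Caller): string[] {
  // Mirrors the documented behaviour: no userId means nothing is returned.
  if (!caller.userId) return [];
  return all.filter((l) => isVisible(l.scope, caller)).map((l) => l.text);
}

const learnings: Learning[] = [
  { text: "Akash prefers email", scope: { kind: "personal", userId: "user-1" } },
  { text: "refunds > $500 need VP approval", scope: { kind: "tenant", tenantId: "acme-corp" } },
  { text: "internal note for another tenant", scope: { kind: "tenant", tenantId: "other-co" } },
];

const seenByUser2 = visibleLearnings(learnings, {
  userId: "user-2",
  agentName: "assistant",
  tenantId: "acme-corp",
});
const seenAnonymously = visibleLearnings(learnings, { agentName: "assistant" });
// seenByUser2: ["refunds > $500 need VP approval"]
// seenAnonymously: []
```

In this sketch user-2 sees the shared tenant rule but not user-1's personal item, and a call without a userId sees nothing at all.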

Observability

Memory subsystem failures emit events on the agent’s EventBus so they don’t silently disappear. The most useful one is memory.error:

agent.eventBus.on("memory.error", ({ store, error, agentName }) => {
  console.error(`[${agentName}] ${store} extraction failed:`, error.message);
  // Forward to Sentry / Datadog / Langfuse / etc.
});

Other memory events:

memory.fact.added / memory.fact.invalidated — fact-store mutations.
memory.extract — background extraction triggered.
memory.context.built — buildContext returned (with totalTokens and per-section breakdown).

These are first-class members of AgentEventMap, so you can wire them into @agentium/observability and chart extraction failure rates over time.
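
One simple way to get a failure rate out of these events is to count extractions and errors yourself. ExtractionStats below is a hypothetical helper, not part of @agentium/observability. The wiring is shown only as comments because the payload of memory.extract is not documented on this page.

```typescript
// Hypothetical counter for illustration; not part of @agentium/observability.
class ExtractionStats {
  private extractions = 0;
  private errorsByStore = new Map<string, number>();

  recordExtraction(): void {
    this.extractions++;
  }

  recordError(store: string): void {
    this.errorsByStore.set(store, (this.errorsByStore.get(store) ?? 0) + 1);
  }

  errorsFor(store: string): number {
    return this.errorsByStore.get(store) ?? 0;
  }

  // Errors divided by extraction runs; 0 when nothing has run yet.
  failureRate(): number {
    let errors = 0;
    this.errorsByStore.forEach((count) => { errors += count; });
    return this.extractions === 0 ? 0 : errors / this.extractions;
  }
}

const stats = new ExtractionStats();

// Wiring, using the handler style shown above:
// agent.eventBus.on("memory.extract", () => stats.recordExtraction());
// agent.eventBus.on("memory.error", ({ store }) => stats.recordError(store));

// Simulated traffic: four extraction runs, one failure in the userFacts store.
for (let i = 0; i < 4; i++) stats.recordExtraction();
stats.recordError("userFacts");

const rate = stats.failureRate(); // 0.25
```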

Cross-References

Multi-User Isolation — Security guarantees and the user-scoping contract
Memory Stores — Deep dive into each store type
Graph Memory — Knowledge graph with traversal
Temporal Awareness — Fact validity and contradiction detection
Composite Scoring — How memories are ranked
Procedural Memory — Learning multi-step workflows
Simplified API — remember/recall/forget convenience methods
Cross-Agent Sharing — Shared memory in teams
Context Budget — Token-aware allocation
Memory Curator — Pruning, dedup, consolidation, and maintenance
Storage Backends — MongoDB, Postgres, in-memory options
