Memory
In one sentence
Memory is what lets your agent remember things — like a good assistant who knows your name, recalls what you discussed last week, and doesn’t ask you the same question twice.
By default, an AI model forgets everything the moment a conversation ends. Memory fixes that.
The 30-second version: Add a memory block to your agent, point it at a database, and your agent now remembers users across conversations — automatically. No extra code. Everything else on this page is optional fine-tuning.
Why it matters
Think about the difference between two customer-service experiences:
- Without memory: “Hi, can I get your name? And your order number? And what was the issue again?” — every single time, even if you called yesterday.
- With memory: “Hi Akash — is this about the delayed order ORD-7421 we discussed yesterday? I see it shipped this morning.”
The second one feels like a relationship. That’s memory. It’s the difference between a chatbot people tolerate and an assistant people trust.
| You are a… | Why you care about memory |
|---|---|
| Business / CEO | It turns one-off chats into lasting customer relationships — the thing that drives retention and loyalty. |
| Product / CTO | It’s a built-in, production-grade system — no need to build (and secure) your own. Works across chat, voice, and browser agents identically. |
| Developer | One config block. Point it at MongoDB/Postgres and you get persistence, fact extraction, and cross-session recall for free. |
The mental model
Think of memory as a smart filing cabinet that sits behind your agent:
- Every conversation, the agent opens the relevant drawers before answering (“what do I know about this person?”).
- After the conversation, it files away anything new it learned (“they mentioned they prefer email”).
- The cabinet has separate drawers for separate things — one for chat history, one for facts about the person, one for company knowledge, and so on.
- Each person’s drawers are private — your data never shows up in someone else’s folder.
The rest of this page explains what’s in each drawer and how to configure them. But you can start with just one line: a storage backend.
Quick Start
import { Agent, MongoDBStorage, openai } from "@agentium/core";
const agent = new Agent({
name: "assistant",
model: openai("gpt-4o"),
memory: {
storage: new MongoDBStorage({ uri: "mongodb://localhost/agentium" }),
},
});
With just storage, you get:
- Session persistence — message history saved across runs
- Summaries — overflow messages automatically summarized
Full Configuration
import { Agent, InMemoryGraphStore, MongoDBStorage, openai } from "@agentium/core";
const agent = new Agent({
name: "assistant",
model: openai("gpt-4o"),
memory: {
storage: new MongoDBStorage({ uri: "mongodb://localhost/agentium" }),
maxMessages: 20, // messages in session history (default: 50)
maxTokens: 128_000, // auto-trim history to fit context window
summaries: true, // ON by default — long-term conversation context
userFacts: true, // OFF by default — "prefers dark mode", "lives in Mumbai"
userProfile: true, // OFF by default — structured: name, role, timezone
entities: true, // OFF by default — companies, people, projects
decisions: true, // OFF by default — audit trail of agent choices
learnings: { // OFF by default — needs a vector store
vectorStore: qdrant(...),
},
graph: { // OFF by default — knowledge graph
store: new InMemoryGraphStore(),
},
procedures: true, // OFF by default — learns multi-step workflows
contextBudget: { // optional — controls context token allocation
maxTokens: 4000,
priorities: { summaries: 0.3, graph: 0.2 },
},
model: openai("gpt-4o-mini"), // cheaper model for background extraction
timezone: "Asia/Kolkata", // IANA timezone — anchors date-relative
// extraction ("today", "yesterday"). Falls
// back to UTC. Always set in production.
tenantId: "acme-corp", // optional — required for tenant-scoped
// learnings/procedures to be visible.
},
});
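The timezone option matters because "today" and "yesterday" land on different calendar dates depending on where the user is. The sketch below illustrates the idea with the standard Intl.DateTimeFormat API. It is not Agentium's extraction code, and resolveRelativeDate is a hypothetical helper invented for this example.

```typescript
// Hypothetical helper, not Agentium's extraction code: resolves a relative
// day word to a YYYY-MM-DD calendar date in a given IANA timezone.
type RelativeDay = "today" | "yesterday";

function resolveRelativeDate(
  word: RelativeDay,
  now: Date,
  timeZone: string = "UTC", // mirrors the documented fallback to UTC
): string {
  // Simplification: "yesterday" is taken as 24 hours earlier, which can be
  // off by a day around daylight-saving transitions.
  const DAY_MS = 24 * 60 * 60 * 1000;
  const instant = word === "yesterday" ? new Date(now.getTime() - DAY_MS) : now;

  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(instant);
  const part = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";

  return `${part("year")}-${part("month")}-${part("day")}`;
}

// 20:30 UTC on Dec 15 is already 02:00 on Dec 16 in Asia/Kolkata (UTC+5:30).
const sampleInstant = new Date("2024-12-15T20:30:00Z");
console.log(resolveRelativeDate("today", sampleInstant, "Asia/Kolkata")); // 2024-12-16
console.log(resolveRelativeDate("today", sampleInstant)); // 2024-12-15
```

The same instant yields two different values for "today", which is why a fact like "the order shipped yesterday" can be dated wrongly if extraction silently falls back to UTC.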
Architecture
Memory is a layered subsystem, not a single store. One orchestrator (MemoryManager) coordinates up to nine specialized stores, each owning its own schema, extraction prompt, and scope rules. Most sit on a shared StorageDriver; Learnings and Graph use a VectorStore and a GraphStore instead.
┌──────────────────────────────────────────────────┐
│ Agent / VoiceAgent / BrowserAgent │
│ beforeRun → buildContext() │
│ afterRun → appendMessages + background extract │
└────────────────────────┬───────────────────────────┘
▼
┌──────────────────────────────────────────────────┐
│ MemoryManager │
│ buildContext() · appendMessages() · afterRun() │
│ recall() · remember() · forget() · curator │
└───┬──────┬──────┬──────┬──────┬──────┬──────┬─────┘
▼ ▼ ▼ ▼ ▼ ▼ ▼
Sessions Summary Facts Profile Entity Decision Procedures
│ │
▼ ▼
┌──────────────────────────────────────────────────┐
│ StorageDriver interface │
│ MongoDB · Postgres · Redis · SQLite · … │
└──────────────────────────────────────────────────┘
Learnings ──► VectorStore (Qdrant · Pinecone · InMemory · …)
Graph ──► GraphStore (Neo4j · InMemory)
Each layer talks to the next only through a typed interface — swap MongoDB for Postgres, or Qdrant for Pinecone, without touching any store logic.
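To see why the swap is safe, here is a minimal sketch of store logic written against an interface rather than a concrete backend. The real StorageDriver contract is not shown on this page, so KeyValueDriver, MapDriver, ArrayDriver, and ToyFactStore are all hypothetical names, and the methods are synchronous only to keep the example short.

```typescript
// Hypothetical, simplified contract. The real StorageDriver methods are not
// shown on this page, and a real driver would be asynchronous.
interface KeyValueDriver {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Two interchangeable toy backends.
class MapDriver implements KeyValueDriver {
  private readonly data = new Map<string, string>();
  get(key: string): string | undefined {
    return this.data.get(key);
  }
  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

class ArrayDriver implements KeyValueDriver {
  private readonly rows: Array<[string, string]> = [];
  get(key: string): string | undefined {
    return this.rows.find(([k]) => k === key)?.[1];
  }
  set(key: string, value: string): void {
    const index = this.rows.findIndex(([k]) => k === key);
    if (index >= 0) this.rows[index] = [key, value];
    else this.rows.push([key, value]);
  }
}

// Store logic sees only the interface, so the backend can be swapped freely.
class ToyFactStore {
  private readonly driver: KeyValueDriver;
  constructor(driver: KeyValueDriver) {
    this.driver = driver;
  }

  addFact(userId: string, fact: string): void {
    const facts = [...this.getFacts(userId), fact];
    this.driver.set(`facts:${userId}`, JSON.stringify(facts));
  }

  getFacts(userId: string): string[] {
    const raw = this.driver.get(`facts:${userId}`);
    return raw === undefined ? [] : (JSON.parse(raw) as string[]);
  }
}

// The same store code runs unchanged on either backend.
for (const driver of [new MapDriver(), new ArrayDriver()]) {
  const store = new ToyFactStore(driver);
  store.addFact("user-42", "prefers email");
  console.log(store.getFacts("user-42"));
}
```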
The nine “drawers” (stores)
Each store is one drawer in the filing cabinet. You turn on the ones you need.
| Store | In plain terms | Example | Default |
|---|---|---|---|
| Sessions | The transcript of the current chat | ”everything we’ve said this conversation” | ON |
| Summaries | A short recap of older chats | ”last week you asked about refunds” | ON |
| User Facts | Things you know about the person | ”Akash lives in Mumbai, prefers email” | OFF |
| User Profile | A tidy profile card | name · role · company · timezone | OFF |
| Entities | People/companies/products mentioned | ”Acme Corp is their employer” | OFF |
| Decisions | A log of what the agent decided & why | ”approved refund — 7-day delay” | OFF |
| Learnings | Lessons that apply to many chats | ”Vendor X invoices always have errors” | OFF |
| Procedures | Step-by-step playbooks the agent learned | ”how to reconcile a mismatched invoice” | OFF |
| Graph | A web of how things connect | ”Raj reports to Priya at Acme” | OFF |
You don’t need all of them. Most agents use Sessions + Summaries (on by default) plus User Facts. Turn on the rest only when your use case calls for it — the table later in this page tells you when.
Rule of thumb:
- Want the agent to remember the person → turn on User Facts.
- Want it to remember the conversation → Sessions + Summaries (already on).
- Want a team to share knowledge → turn on Learnings or Procedures (see Scope Hierarchy).
How It Works
Every time your agent answers, two things happen automatically — like an assistant glancing at their notes before speaking, then jotting down anything new afterward:
- Before answering → the agent reads its memory and brings the relevant bits into the conversation.
- After answering → the agent quietly files away anything new it learned.
You write zero code for either. Here’s what’s happening under the hood.
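As a rough mental picture of that read-before, write-after loop, consider the toy version below. Everything in it is hypothetical: runTurn, Notes, and the regex "extractor" are stand-ins, and the real after-step uses a model and runs in the background rather than inline.

```typescript
// A toy version of the read-before / write-after loop. Everything here is
// hypothetical: real extraction uses a model and runs in the background.
type Notes = Map<string, string[]>; // userId -> remembered facts
type Model = (prompt: string) => string;

function runTurn(notes: Notes, userId: string, input: string, model: Model): string {
  // Before answering: pull what is already known into the prompt.
  const known = notes.get(userId) ?? [];
  const prompt =
    `Known about this user: ${known.join("; ") || "nothing yet"}\n` +
    `User: ${input}`;
  const answer = model(prompt);

  // After answering: file away anything new. A single regex stands in for
  // the extraction model.
  const match = input.match(/\bI prefer (.+?)[.!]?$/i);
  if (match) notes.set(userId, [...known, `prefers ${match[1] ?? ""}`]);

  return answer;
}

const notes: Notes = new Map();
const echoModel: Model = (prompt) => prompt; // returns the prompt so it can be inspected

console.log(runTurn(notes, "user-42", "I prefer email.", echoModel));
console.log(runTurn(notes, "user-42", "Where is my order?", echoModel));
// The second prompt now carries "prefers email" from the first turn.
```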
1. Before the answer: gather what we know (buildContext())
MemoryManager.buildContext() gathers relevant data from all enabled stores and creates a context string injected into the system prompt:
// What buildContext() produces (approximate and simplified; the real output wraps each section in <memory> scope markers):
`
## Memory Context
### Session Summary
The user previously discussed shipping delays for order #12345
and requested a refund, which was processed successfully.
### About This User
- Name: Akash Sengar
- Role: Product Manager
- Company: Xhipment
- Prefers dark mode
- Timezone: Asia/Kolkata
### Relevant Entities
- Xhipment (company): Logistics platform, user's employer
- Order #12345: Delayed shipment from Dec 15
### Recent Decisions
- Approved refund for order #12345 (reason: 7-day delay exceeded SLA)
### Relevant Learnings
- Refunds for delays >5 days should be auto-approved per company policy
`
This context is appended to the system prompt, giving the model persistent awareness across sessions.
2. After the answer: remember what’s new (afterRun())
The user already has their answer — so this step runs in the background and never slows down the response. It quietly re-reads the conversation (using a cheaper model to keep costs low) and files away anything worth remembering:
- New user facts and profile updates
- Entity mentions (companies, people, projects)
- Decision records
- Learnings worth remembering
// Background extraction happens automatically — no code needed.
// To use a cheaper model for extraction:
memory: {
storage,
model: openai("gpt-4o-mini"), // cheaper per token than the main model; token counts are unchanged
summaries: true,
userFacts: true,
}
3. Keeping it from getting too big (session overflow)
A conversation can’t grow forever: it would eventually exceed the model’s context window and drive up cost. So when a chat gets long (past maxMessages), the oldest messages are summarized into a short recap and then removed. The agent keeps the gist without carrying every word. Think of it as turning ten pages of notes into a single sticky note.
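The overflow step can be sketched in a few lines. This is an illustration of the idea only, under the assumption that overflow is measured purely by message count. compactSession, SessionState, and the counting summarizer are hypothetical, and in practice a model would write the recap.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

interface SessionState {
  summary: string;
  messages: ChatMessage[];
}

type Summarizer = (overflow: ChatMessage[], previousSummary: string) => string;

// Hypothetical sketch: once the history exceeds maxMessages, fold the oldest
// messages into the summary and keep only the newest maxMessages.
function compactSession(
  state: SessionState,
  maxMessages: number,
  summarize: Summarizer,
): SessionState {
  const overflowCount = state.messages.length - maxMessages;
  if (overflowCount <= 0) return state; // still within budget, nothing to do

  const overflow = state.messages.slice(0, overflowCount);
  return {
    summary: summarize(overflow, state.summary),
    messages: state.messages.slice(overflowCount),
  };
}

// Stand-in summarizer that only counts messages; a model would write real prose.
const countingSummarizer: Summarizer = (overflow, previous) =>
  `${previous}[${overflow.length} older messages summarized]`;

const longChat: SessionState = {
  summary: "",
  messages: ["m1", "m2", "m3", "m4", "m5"].map(
    (content): ChatMessage => ({ role: "user", content }),
  ),
};

const compacted = compactSession(longChat, 3, countingSummarizer);
console.log(compacted.summary); // [2 older messages summarized]
console.log(compacted.messages.map((m) => m.content)); // [ 'm3', 'm4', 'm5' ]
```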
Works Everywhere
The same memory config works across all agent types:
// Text Agent
new Agent({ model, memory: { storage } });
// Voice Agent
new VoiceAgent({ provider, memory: { storage } });
// Browser Agent
new BrowserAgent({ model, memory: { storage } });
Simplified API
For quick operations without dealing with individual stores, use the high-level remember, recall, and forget methods:
const mm = agent.memory!;
// Store a fact
await mm.remember("User prefers dark mode", { userId: "user-42" });
// Search across all stores with composite scoring
const results = await mm.recall("dark mode preference", { userId: "user-42" });
console.log(results[0].content, results[0].score);
// Remove memories
await mm.forget({ userId: "user-42", factId: "fact-abc" });
See Simplified API for full details.
Default Feature States
| Feature | Default | Requires |
|---|---|---|
| Sessions | ON | storage |
| Summaries | ON | storage |
| User Facts | OFF | userFacts: true |
| User Profile | OFF | userProfile: true |
| Entities | OFF | entities: true |
| Decisions | OFF | decisions: true |
| Learnings | OFF | learnings: { vectorStore } |
| Graph Memory | OFF | graph: { store } |
| Procedures | OFF | procedures: true |
Accessing Stores Directly
You can access individual stores via the MemoryManager:
const mm = agent.memory; // MemoryManager | null
const facts = await mm?.getUserFacts()?.getFacts("user-123");
const profile = await mm?.getUserProfile()?.getProfile("user-123");
// Entity / graph / procedure stores require a userId — every read and write
// is scoped to a user so two tenants can never see each other's data.
const entities = await mm?.getEntityMemory()?.listEntities("user-123");
Inspecting Memory Context
You can call buildContext() directly to see what the model receives:
const mm = agent.memory;
if (mm) {
const ctx = await mm.buildContext(
"session-abc", // sessionId
"user-42", // userId
"current user input", // currentInput (used for relevance scoring)
"assistant", // agentName
);
console.log(ctx);
// Prints the full context string that would be injected into the system prompt.
}
Each section is wrapped in an explicit scope marker so the LLM never conflates
user/session/agent data:
<memory section="userFacts" scope="current_user">
What you know about this user:
Facts the user told you directly (high confidence):
- User's name is Akash.
- Akash is based in Mumbai.
</memory>
<memory section="summaries" scope="current_user">
Previous conversation context (most recent first):
...
</memory>
This is useful for debugging — if the model seems to “forget” something, check if the relevant store is enabled and producing context.
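One way to make that check concrete is to list which sections actually appear in the context string. The helper below is a hypothetical debugging aid, not part of the library, and it assumes the marker format matches the sample shown above exactly.

```typescript
interface MemorySectionInfo {
  section: string;
  scope: string;
}

// Hypothetical debugging helper: lists the <memory> sections found in a
// context string. Assumes the marker format matches the sample above.
function listMemorySections(context: string): MemorySectionInfo[] {
  const pattern = /<memory section="([^"]+)" scope="([^"]+)">/g;
  const found: MemorySectionInfo[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(context)) !== null) {
    found.push({ section: match[1] ?? "", scope: match[2] ?? "" });
  }
  return found;
}

// A hand-written sample in the same shape as the output shown above.
const sampleContext = [
  '<memory section="userFacts" scope="current_user">',
  "- User's name is Akash.",
  "</memory>",
  '<memory section="summaries" scope="current_user">',
  "...",
  "</memory>",
].join("\n");

console.log(listMemorySections(sampleContext).map((s) => s.section));
// [ 'userFacts', 'summaries' ]
```

If a store you enabled is missing from the list, it either has nothing to contribute yet or is not configured.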
Multi-User Isolation
The short version: one user’s memory never leaks into another user’s conversation. Akash’s data stays in Akash’s drawers. This is enforced automatically — you don’t have to do anything to get it.
This matters because the moment you have more than one user (every real product), a memory system that mixes people’s data is a privacy incident waiting to happen. Agentium treats memory as a security boundary, not just a feature.
Every memory store is scoped to the calling user by default. Even if users in
two different tenants happen to share the same userId, they still cannot see
each other’s data, because every read and write includes the relevant scope key.
| Store | Default scope key | Supports shared scopes? |
|---|---|---|
| Session messages | sessionId | — |
| Summaries | sessionId | — |
| User facts | userId | personal only |
| User profile | userId | personal only |
| Entity memory | userId | personal only |
| Graph memory | userId (per-node _userId) | personal only |
| Procedure memory | userId | agent / tenant / global ✓ |
| Decision log | agentName + sessionId | — |
| Learnings | userId (vector post-filter) | agent / tenant / global ✓ |
Learnings and Procedures support an explicit scope hierarchy so that
genuinely shared knowledge — like “invoice reconciliation workflow” or
“refunds > $500 need VP approval” — can be saved once and seen by every
authorised user. See Multi-User Isolation for the
full contract and a worked example.
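As a rough illustration of how a scope hierarchy can decide visibility, consider the sketch below. It is not the library's contract: SharedItem, Caller, canSee, and the ownerId field are hypothetical, and the exact matching rules (for example, whether agent scope is also limited to a tenant) are assumptions.

```typescript
type Scope = "personal" | "agent" | "tenant" | "global";

// Hypothetical shape: ownerId holds a userId, agentName, or tenantId
// depending on the scope, and is unused for global items.
interface SharedItem {
  text: string;
  scope: Scope;
  ownerId?: string;
}

interface Caller {
  userId: string;
  agentName: string;
  tenantId?: string;
}

// Assumed rules: global is visible to everyone; the other scopes require the
// matching identifier on the caller.
function canSee(item: SharedItem, caller: Caller): boolean {
  switch (item.scope) {
    case "global":
      return true;
    case "tenant":
      return caller.tenantId !== undefined && item.ownerId === caller.tenantId;
    case "agent":
      return item.ownerId === caller.agentName;
    case "personal":
      return item.ownerId === caller.userId;
  }
}

const policy: SharedItem = {
  text: "refunds > $500 need VP approval",
  scope: "tenant",
  ownerId: "acme-corp",
};

const insider: Caller = { userId: "user-42", agentName: "assistant", tenantId: "acme-corp" };
const outsider: Caller = { userId: "user-42", agentName: "assistant", tenantId: "globex" };
const noTenant: Caller = { userId: "user-42", agentName: "assistant" };

console.log(canSee(policy, insider)); // true
console.log(canSee(policy, outsider)); // false
console.log(canSee(policy, noTenant)); // false
```

The last case mirrors the note on tenantId in the configuration: without a tenant on the caller, tenant-scoped knowledge stays hidden.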
When you call MemoryManager.buildContext(sessionId, userId, ...) without a
userId, stores that require one return empty strings rather than risk
surfacing another user’s data.
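That "fail closed" behaviour is simple to picture. The sketch below is a hypothetical stand-in for a user-scoped store, not the library's code: when no userId is supplied it returns an empty string instead of guessing.

```typescript
// Hypothetical user-scoped store: facts are only ever looked up by userId.
const factsByUser = new Map<string, string[]>([
  ["user-42", ["User's name is Akash.", "Akash is based in Mumbai."]],
  ["user-77", ["Prefers phone calls."]],
]);

// Fail closed: with no userId there is no safe way to pick a drawer, so the
// section is empty rather than possibly belonging to someone else.
function userFactsSection(userId: string | undefined): string {
  if (!userId) return "";
  const facts = factsByUser.get(userId) ?? [];
  return facts.map((fact) => `- ${fact}`).join("\n");
}

console.log(JSON.stringify(userFactsSection(undefined))); // ""
console.log(userFactsSection("user-42"));
```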
Observability
Memory subsystem failures emit events on the agent’s EventBus so they don’t
silently disappear. The most useful one is memory.error:
agent.eventBus.on("memory.error", ({ store, error, agentName }) => {
console.error(`[${agentName}] ${store} extraction failed:`, error.message);
// Forward to Sentry / Datadog / Langfuse / etc.
});
Other memory events:
- memory.fact.added / memory.fact.invalidated: fact-store mutations.
- memory.extract: background extraction triggered.
- memory.context.built: buildContext returned (with totalTokens and per-section breakdown).
These are first-class members of AgentEventMap, so
you can wire them into @agentium/observability and
chart extraction failure rates over time.
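If you only need a quick number before wiring up a full observability stack, a small in-process counter is enough. The tracker below is a hypothetical sketch: it assumes memory.extract fires once per extraction attempt and that memory.error carries the { store, error, agentName } payload shown above.

```typescript
// Payload shape taken from the memory.error handler shown above.
interface MemoryErrorEvent {
  store: string;
  error: Error;
  agentName: string;
}

// Hypothetical tracker. Assumes memory.extract fires once per extraction
// attempt, so errors / extractions approximates a failure rate.
class ExtractionFailureTracker {
  private extractions = 0;
  private readonly errorsByStore = new Map<string, number>();

  // Arrow properties so they can be passed directly as event handlers.
  onExtract = (): void => {
    this.extractions += 1;
  };

  onError = ({ store }: MemoryErrorEvent): void => {
    this.errorsByStore.set(store, (this.errorsByStore.get(store) ?? 0) + 1);
  };

  failureRate(store: string): number {
    if (this.extractions === 0) return 0;
    return (this.errorsByStore.get(store) ?? 0) / this.extractions;
  }
}

const tracker = new ExtractionFailureTracker();
// Wiring, using the event names listed above:
//   agent.eventBus.on("memory.extract", tracker.onExtract);
//   agent.eventBus.on("memory.error", tracker.onError);

// Simulated traffic: four extractions, one userFacts failure.
for (let i = 0; i < 4; i++) tracker.onExtract();
tracker.onError({ store: "userFacts", error: new Error("timeout"), agentName: "assistant" });

console.log(tracker.failureRate("userFacts")); // 0.25
console.log(tracker.failureRate("entities")); // 0
```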
Cross-References