The four kinds of memory an AI agent needs

An AI agent that makes the same mistake over and over isn't stupid — it simply doesn't remember. And that's an important distinction, because "memory" isn't one thing here. Just like a person, a well-designed agent needs several different kinds of memory, each serving a different purpose. I'll show you four: how they work, what they're for, and what breaks when one is missing.

Two terms first, because they'll keep coming back. An AI agent is a program built on a language model that doesn't just answer a question but can carry out a task on its own — reach for data, use a tool, work through a process step by step. Context is everything the agent "sees" at a given moment of work: your instruction, the files it has loaded, the conversation so far. Memory is what draws the sharpest line between an ordinary chatbot and an agent — a chatbot gives an answer, while an agent gives an answer shaped by lasting knowledge and accumulated experience.

The easiest way to grasp these four kinds is to look at yourself first. A person has a memory of what's happening here and now — whatever you're thinking about right now. You have knowledge of facts — that Warsaw is the capital of Poland, that your company has one security policy rather than another. You have learned skills — riding a bike, running a meeting. And you have personal memories — specific past events you can return to. It turns out a well-designed agent needs exactly the same set.

Working memory: what the agent has in its hands right now

The first kind is working memory — everything the agent has in front of it at this moment. The current conversation, the instructions it was started with, the files and data loaded into the current prompt. It's the agent's scratchpad: the place where the thinking happens for the task it's working on right now.

In technical terms, working memory is the context window — the amount of text the agent takes in at once. The closest everyday analogy is the short-term store in your head while you're working on something: you hold what you're dealing with right now vividly, but step away from your desk and come back the next day, and some of the detail has evaporated. The agent's working memory behaves the same way — it's instant and immediately at hand, but fleeting. When the session ends, its contents are gone.

It also has a ceiling. The largest context windows available today are genuinely roomy (some hold a million tokens or more, a token being a word or a fragment of one), but there's still a limit. What's more, if you cram too much in, the quality of the work drops: the model starts losing whatever got buried somewhere in the middle. Every agent has working memory, but so does every ordinary chatbot, because it's simply the context window. So the question becomes: what else does a system need if it's meant to be more than a chatbot?
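To make the ceiling concrete, here is a minimal sketch of a working memory that drops its oldest messages once a size budget is exceeded. The class name, the budget value, and the word-count stand-in for a real tokenizer are all assumptions made for illustration; real systems count tokens with the model's own tokenizer and often summarize rather than simply discard.

```python
def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


class WorkingMemory:
    """Holds the current session's messages under a fixed size budget."""

    def __init__(self, budget: int) -> None:
        self.budget = budget
        self.messages = []

    def size(self) -> int:
        return sum(estimate_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        self.messages.append(message)
        # Drop the oldest messages until the total fits the budget again,
        # but always keep the newest one.
        while self.size() > self.budget and len(self.messages) > 1:
            self.messages.pop(0)


memory = WorkingMemory(budget=8)
memory.add("review the login module")   # 4 words
memory.add("focus on error handling")   # 4 words, total 8: still fits
memory.add("also check the tests")      # total 12: the oldest message is dropped
print(memory.messages)
```

After the third message, the first one is gone: that is the "fleeting" part in practice.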

Semantic memory: what the agent knows permanently

The second kind is semantic memory — the agent's knowledge base. This is where facts, rules, conventions, and documentation live. In a person, its equivalent is knowledge of the world: things you simply know, which are true regardless of what you happen to be doing at any given moment.

In the technical literature, it's often described through concepts like the vector database or the knowledge graph. A vector database is a way of storing information not by keyword but by meaning — so the agent can find a passage that matches the sense of a question, even if the question uses an entirely different word. It sounds heavyweight, and sometimes it really is implemented that way — but in many systems running today, semantic memory is something far simpler: just plain text files. A single well-written document covering the project's architecture, its agreed rules, and a list of do's and don'ts — loaded at the start of every session — serves perfectly well as a knowledge base in practice.
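The plain-text variant can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: the file name `PROJECT_RULES.md`, the function `start_session`, and the prompt layout are all hypothetical, and the temporary directory exists only so the example runs on its own.

```python
import tempfile
from pathlib import Path


def start_session(knowledge_file: Path, task: str) -> str:
    """Build the opening prompt: lasting project knowledge first, then the task."""
    knowledge = knowledge_file.read_text(encoding="utf-8").strip()
    return f"# Project knowledge\n{knowledge}\n\n# Task\n{task}"


with tempfile.TemporaryDirectory() as tmp:
    rules = Path(tmp) / "PROJECT_RULES.md"  # hypothetical file name
    rules.write_text(
        "- Use UTC for all timestamps.\n- Never commit secrets.\n",
        encoding="utf-8",
    )
    # Every session begins by loading the same document.
    prompt = start_session(rules, "Add a created_at field to the orders table.")

print(prompt)
```

The point of the design is that the knowledge lives outside any one session: the file survives, so every new session starts with the same rules in view.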

Semantic memory tells the agent what it should know in general, independent of any particular task. Without it, the agent is doomed to repeat the same mistakes over and over — because it has no lasting knowledge to draw on. Each time it starts from scratch, as if it had never heard of your rules before.

Procedural memory: how the agent knows a thing is done

The third kind is procedural memory — the one responsible for how the agent knows a thing is done. In a person, it's like riding a bike or running a meeting: skills you carry in your muscles and reflexes rather than as facts to recite. In an agent, it takes the form of skills — named procedures, each describing what it can do and how to do it step by step. It can be anything: preparing a presentation, running a structured review, producing a report.

There's a clever mechanism here worth understanding, because it explains why procedural memory doesn't clog the agent up. If the agent loaded the descriptions of all its skills at once, a long list would blow the working-memory budget. So it does it differently: day to day it sees only a lightweight index — just the name and a short description of each skill, literally a few lines. Only when a task arrives that matches one of those descriptions does the agent load the full instruction. And if that instruction refers to additional files or templates, those are pulled in later still — only at the moment they're genuinely needed during execution.
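The index-then-load mechanism can be sketched as follows. Everything named here (`Skill`, `build_index`, `pick_skill`, the two example skills) is hypothetical, and the word-overlap matching is only a placeholder: in a real agent the model itself would read the index and decide which skill fits the task.

```python
from dataclasses import dataclass
from typing import Callable, Optional

loaded: list[str] = []  # records which full instructions were actually read


def make_loader(name: str, body: str) -> Callable[[], str]:
    def load() -> str:
        loaded.append(name)  # note that the full text was pulled in
        return body
    return load


@dataclass
class Skill:
    name: str
    description: str         # short, always visible in the index
    load: Callable[[], str]  # full instruction, fetched only on use


def build_index(skills: list[Skill]) -> str:
    return "\n".join(f"{s.name}: {s.description}" for s in skills)


def pick_skill(skills: list[Skill], task: str) -> Optional[Skill]:
    # Placeholder matching by shared words between task and description.
    task_words = set(task.lower().split())
    best, best_overlap = None, 0
    for skill in skills:
        overlap = len(task_words & set(skill.description.lower().split()))
        if overlap > best_overlap:
            best, best_overlap = skill, overlap
    return best


skills = [
    Skill("report", "produce a weekly status report",
          make_loader("report", "1. Gather metrics. 2. Draft. 3. Send.")),
    Skill("review", "run a structured code review",
          make_loader("review", "1. Read the diff. 2. List risks. 3. Summarize.")),
]

index = build_index(skills)  # cheap: one short line per skill
chosen = pick_skill(skills, "please run a code review on this branch")
instructions = chosen.load() if chosen else ""
print(index)
print(loaded)
```

Only the chosen skill's full text is ever read; the other stays a one-line entry in the index.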

This is the fundamental difference from semantic memory. Semantic knowledge is present in the context the whole time: the agent "knows it" continuously. A procedural skill is only advertised in the index, and the agent reaches for the details solely when it's about to use it. It's the difference between "knowing that" and "knowing how", and between carrying everything with you and pulling a tool out of the drawer for a moment when you need it.

Episodic memory: what the agent remembers from the past

The fourth kind is episodic memory — a record of what happened in past interactions, what decisions were made, and what the agent learned from them. It's the closest equivalent to human memories: not general knowledge, but specific events you can return to.

The most naive implementation is to keep every conversation in full and search those records when needed. Technically that counts as episodic memory — but it's rarely useful. Working systems do something smarter: they distill experience. As it works, the agent builds up its own notes, but it doesn't record everything — it decides what's worth remembering based on whether a given piece of information will actually be useful in a future conversation. The result is condensed experience. Instead of keeping a full record of an hour-long analysis, the agent keeps a single sentence of the gist: "last time the problem wasn't where it looked, but in the intermediate layer." That's a far more valuable keepsake than a raw transcript.
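Distillation can be sketched as a filter applied at the end of a session. The names below are hypothetical, and the `useful_later` flag is an assumption standing in for the hard part: in a working system the agent's model would make that judgment itself rather than receive it as a ready-made boolean.

```python
from dataclasses import dataclass, field


@dataclass
class EpisodicMemory:
    lessons: list[str] = field(default_factory=list)

    def end_of_session(self, candidates: list[tuple[str, bool]]) -> None:
        """Keep only the notes judged useful for a future conversation."""
        for lesson, useful_later in candidates:
            if useful_later and lesson not in self.lessons:
                self.lessons.append(lesson)

    def recall(self, keyword: str) -> list[str]:
        # Simple substring search; a real system might search by meaning.
        return [l for l in self.lessons if keyword.lower() in l.lower()]


episodic = EpisodicMemory()
episodic.end_of_session([
    ("Last time the problem was in the intermediate layer, not where it looked.", True),
    ("The user said good morning at the start of the session.", False),
])
print(episodic.recall("layer"))
```

An hour of transcript goes in, one sentence stays. The small talk is never stored at all.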

And this is exactly where memory starts to genuinely resemble learning — because the agent gets better over time. It remembers the project, it remembers preferences, and well-designed memory can also remember mistakes so as not to repeat them. The catch is that episodic memory is the hardest to get right, because it forces decisions a person makes by reflex. What do you delete? When does information become stale? If someone changes role, do the old memories of their projects stay, or should they disappear? People forget efficiently and — frustrating as it can be — usually to good effect. For an agent, forgetting doesn't happen on its own: you have to design it.
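One simple way to design forgetting is to give each memory an optional shelf life and prune on a schedule. This is only a sketch of one possible policy: the `max_age_days` field, the example memories, and the dates are invented for illustration, and age is just one of many signals a real system might use.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Memory:
    text: str
    recorded_on: date
    max_age_days: Optional[int] = None  # None means "never expires"


def prune(memories: list[Memory], today: date) -> list[Memory]:
    """Return only the memories that are still within their shelf life."""
    kept = []
    for m in memories:
        age = today - m.recorded_on
        if m.max_age_days is None or age <= timedelta(days=m.max_age_days):
            kept.append(m)
    return kept


memories = [
    Memory("Ana leads the billing project", date(2025, 1, 1), max_age_days=90),
    Memory("The team prefers small pull requests", date(2024, 1, 1)),
    Memory("The staging database is being migrated", date(2025, 5, 20), max_age_days=30),
]

remaining = prune(memories, today=date(2025, 6, 1))
print([m.text for m in remaining])
```

The role assignment has aged out, the long-standing preference stays, and the recent migration note is still fresh. Deciding those shelf lives is exactly the design work that a person does by reflex.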

Not every agent needs all four

Here we reach the most important practical point: not every agent needs the full set. The choice of memory follows from the task, not from an ambition to have everything.

The simplest reflex agent — something like a thermostat, or a bot that merely routes queries to the right place — needs nothing beyond working memory. It sees what's in front of it, reacts, and that's all.

A customer-support agent, still fairly narrow — say, one that resets passwords — has working memory, of course, but it also needs procedural memory, because it has to call up the password-reset procedure. And that may be enough for it.

A coding agent is a different weight class altogether — it needs all four at once. Working memory, to hold the current task. Semantic memory, to know the project's architecture and rules. Procedural memory, to reach for its skills. And episodic memory, to learn between sessions. The rough pattern is this: the more an agent is meant to do on its own, and the longer it's meant to operate beyond a single conversation, the more kinds of memory you have to give it.
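The three examples above can be written down as a small table of profiles. This is a sketch for illustration only: the enum, the profile names, and the `missing_for` helper are hypothetical, not part of any particular framework.

```python
from enum import Flag, auto


class MemoryKind(Flag):
    WORKING = auto()
    SEMANTIC = auto()
    PROCEDURAL = auto()
    EPISODIC = auto()


# Which kinds of memory each example agent is given.
AGENT_PROFILES = {
    "query_router": MemoryKind.WORKING,
    "password_reset_bot": MemoryKind.WORKING | MemoryKind.PROCEDURAL,
    "coding_agent": (MemoryKind.WORKING | MemoryKind.SEMANTIC
                     | MemoryKind.PROCEDURAL | MemoryKind.EPISODIC),
}


def missing_for(agent: str, needed: MemoryKind) -> MemoryKind:
    """Return the kinds of memory a task needs that the agent lacks."""
    return needed & ~AGENT_PROFILES[agent]


# A task that needs a procedure and lessons from earlier sessions.
gap = missing_for("password_reset_bot", MemoryKind.PROCEDURAL | MemoryKind.EPISODIC)
print(gap)
```

Asking the password-reset bot to learn between sessions exposes the gap: it has the procedure but no episodic memory to learn with.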

This leads to a principle that holds even once you've forgotten all the names. When an agent forgets, repeats the same mistake over and over, or can't make use of something you already worked through together — it's almost never a matter of a "weak model," but of a missing kind of memory. Repeating mistakes despite clear rules? It's missing semantic memory. Can't carry out a process it supposedly knows? It's missing procedural memory. Drawing no lessons from past conversations? It's missing episodic memory. The next time an agent lets you down, don't immediately ask whether it's smart enough — ask which kind of memory you failed to give it. Naming the gap is usually half the solution: now you know what to add.
