Memory Layer
Also known as: persistent memory, long-term memory, cross-session memory
A context window resets with every new session. A memory layer doesn't. It's an external store, typically a vector database or key-value store, that the agent can read from and write to. When a user returns to the agent, it queries the memory layer for relevant prior context and injects it into the current session. From the user's perspective, the agent remembers them.
Different memory designs have different trade-offs. Episodic memory stores summaries of past conversations. Semantic memory stores facts and relationships as embeddings for fuzzy retrieval. Working memory is the current context window. Procedural memory encodes how to do things, closer to skills or fine-tuning. Most production agents use a combination.
For builders, implementing a memory layer is one of the most impactful UX improvements you can make to a multi-session AI product. A customer support agent that remembers your account history, a coding assistant that remembers your architectural preferences, or a research tool that remembers your prior findings all feel qualitatively different from stateless alternatives.