Concept · Patterns & Practices · Added 1 day ago

Retrieval

Also known as: document retrieval, semantic retrieval, vector retrieval, search retrieval

The step in a RAG pipeline where the system searches a knowledge base and fetches the most relevant documents or chunks before the model generates an answer. The model can only work with what retrieval gives it. Good retrieval is what separates a useful AI assistant from one that makes things up.

Language models don't have access to your company's Notion docs, your customer database, or last week's news. Retrieval is how you fix that. When a user asks a question, the system first searches a stored index, finds the most relevant content, and appends that content to the prompt before asking the model to answer. The model's job is then to synthesize those retrieved passages into a response.
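The search-then-prompt flow described above can be sketched in a few lines. This is a toy illustration, not a production design: the word-overlap scorer, the `retrieve` and `build_prompt` helpers, the sample chunks, and the prompt wording are all assumptions made for the example, and no model is actually called.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, index: list[str], k: int = 2) -> list[str]:
    """Return the k chunks sharing the most words with the query (toy scorer)."""
    q = tokenize(query)
    ranked = sorted(index, key=lambda chunk: len(q & tokenize(chunk)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Place the retrieved passages in the prompt ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Hypothetical knowledge base: three pre-chunked passages.
index = [
    "A refund is processed within 5 business days.",
    "The office is closed on public holidays.",
    "Refund requests require an order number.",
]

query = "How long does a refund take for my order?"
passages = retrieve(query, index, k=2)
prompt = build_prompt(query, passages)  # this string is what the model would receive
```

In a real system the overlap scorer would be replaced by a proper keyword or embedding search, but the shape stays the same: search first, then hand the model only what was found.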

Retrieval can be vector-based (semantic search using embeddings, which capture meaning rather than exact words), keyword-based (classic lexical ranking such as BM25), or hybrid (combining both). Vector retrieval is better at finding conceptually relevant content even when the wording differs. Keyword retrieval is better at finding exact matches such as proper nouns, IDs, and error codes. Many production systems use a hybrid approach.
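One way to combine the two approaches is to run both searches and merge their ranked lists. The sketch below uses reciprocal rank fusion as the merge step, which is one common option rather than the only one. Everything in it is illustrative: the term-count scorer is a simplified stand-in for BM25, the two-dimensional vectors are hand-written placeholders for real embeddings, and the document IDs and function names are hypothetical.

```python
import math
import re

docs = {
    "d1": "Reset your password from the account settings page.",
    "d2": "Error E1042 means the login token has expired.",
    "d3": "Login problems are usually fixed by clearing saved credentials.",
}
# Hand-written stand-ins for embeddings; a real system would compute these with a model.
doc_vecs = {"d1": [0.1, 0.9], "d2": [0.6, 0.5], "d3": [0.7, 0.3]}

def words(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_rank(query, docs):
    """Rank docs by raw query-term counts (simplified stand-in for BM25)."""
    terms = words(query)
    scores = {d: sum(words(text).count(t) for t in terms) for d, text in docs.items()}
    matched = [d for d in docs if scores[d] > 0]  # drop docs with no term match
    return sorted(matched, key=scores.get, reverse=True)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def vector_rank(query_vec, doc_vecs):
    """Rank docs by cosine similarity to the query vector."""
    return sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)

def reciprocal_rank_fusion(rankings, k=60):
    """Score each doc by the sum of 1 / (k + rank) over every list it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

query = "can't log in, getting E1042"
query_vec = [0.8, 0.3]  # hypothetical embedding of the query

fused = reciprocal_rank_fusion([
    keyword_rank(query, docs),
    vector_rank(query_vec, doc_vecs),
])
```

Here the keyword search finds only `d2`, through the exact error code, while the vector search ranks `d3` first because it is closest in meaning. Fusion puts `d2` on top because it scores in both lists.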

Retrieval quality often matters more than model choice: a mediocre model with excellent retrieval will often beat a frontier model with bad retrieval. If the right information never reaches the model's context window, no amount of prompt engineering can fix the output. This is why builders treat 'retrieval quality' as a first-class concern, separate from 'model quality.'
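Treating retrieval quality as its own concern usually means measuring it without the model in the loop. A minimal sketch, assuming you have a small hand-labeled set of queries with known relevant documents: recall@k, the fraction of relevant documents that appear in the top k results. The document IDs and labels below are invented for illustration, and recall@k is only one of several metrics in use.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        raise ValueError("each query needs at least one relevant document")
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)

# Hypothetical evaluation set: (ranked results from the retriever, labeled relevant docs).
eval_set = [
    (["d2", "d7", "d1"], {"d2"}),        # the one relevant doc was retrieved
    (["d4", "d9", "d3"], {"d3", "d5"}),  # one of two relevant docs was retrieved
]

scores = [recall_at_k(results, relevant, k=3) for results, relevant in eval_set]
mean_recall = sum(scores) / len(scores)
```

A low score here points to a retrieval problem before any prompt or model changes are considered.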

This definition is AI-generated and refreshed weekly. It may contain inaccuracies. Use your own judgment, especially for production decisions.
Related terms: RAG, Retrieval-augmented generation, Embeddings, Chunking, Reranking, Vector database