Glossary

chunking

Splitting source documents into smaller passages for embedding and retrieval, where the chunk size and overlap directly affect retrieval quality and context efficiency.

Category: Retrieval and Memory. Also known as: text chunking, document chunking.

The preprocessing step in any retrieval-augmented generation (RAG) pipeline. Documents are split into overlapping passages of roughly 200 to 1000 tokens; each passage is embedded separately and indexed in a vector database. Retrieval returns these passages, not whole documents. Chunk size is a tradeoff: small chunks give precise retrieval but fragment context; large chunks preserve context but blur the retrieval signal.
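A minimal sketch of how chunk size and overlap interact, assuming the document has already been tokenized into a list. The function name `chunk_fixed` and the tiny sizes are illustrative only; a real pipeline would use the embedding model's own tokenizer and sizes in the 200 to 1000 token range.

```python
def chunk_fixed(tokens, size=200, overlap=50):
    """Split a token list into windows of `size` tokens sharing `overlap` tokens."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reached the end of the document
    return chunks


# Whitespace "tokens" stand in for real tokenizer output (an assumption).
tokens = "a b c d e f g h i j".split()
chunks = chunk_fixed(tokens, size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

Each window starts `size - overlap` tokens after the previous one, so a larger overlap yields more chunks (and more embeddings to store) for the same document.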

Three pattern families are common. Fixed-size chunking splits at character or token counts with optional overlap. Recursive chunking respects document structure (paragraphs, then sentences, then characters) to keep semantic units intact. Late chunking (Jina, 2024) runs the full document through the embedding model first and then pools the token embeddings per chunk, so each chunk embedding retains cross-chunk context.
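The recursive family can be sketched as follows. This is a simplified illustration, not the implementation of any particular library: the function names, the regex-based sentence splitter, and the character budget `max_chars` are all assumptions made for the example.

```python
import re


def _split(text, level):
    """Level 0 splits on blank lines (paragraphs); level 1 on sentence ends."""
    pattern = r"\n\s*\n" if level == 0 else r"(?<=[.!?])\s+"
    return [p.strip() for p in re.split(pattern, text) if p.strip()]


def recursive_chunk(text, max_chars=200, level=0):
    """Split by paragraphs, then sentences, then raw characters as a last resort."""
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    if level >= 2:
        # No structure left to respect: hard split at the character budget.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    joiner = "\n\n" if level == 0 else " "
    chunks, current = [], ""
    for piece in _split(text, level):
        candidate = current + joiner + piece if current else piece
        if len(candidate) <= max_chars:
            current = candidate  # keep packing units into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # The unit itself is too large: descend to the next finer separator.
            chunks.extend(recursive_chunk(piece, max_chars, level + 1))
    if current:
        chunks.append(current)
    return chunks


sample = "Alpha beta. Gamma delta.\n\nEpsilon zeta eta theta iota kappa lambda."
pieces = recursive_chunk(sample, max_chars=30)
for piece in pieces:
    print(repr(piece))
```

The first paragraph fits the budget and stays whole; the second has no finer sentence boundary, so only that paragraph falls through to the character-level split.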

In practice, chunking strategy is the most-tweaked knob in RAG pipelines and the source of much retrieval-quality variation. Production systems often layer chunking with a parent-document fallback: embed small chunks for precise matching, then return the larger parent passage to the generator for context.
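The parent-document pattern can be sketched like this. The `embed` function here is a toy bag-of-words stand-in for a real embedding model, and the linear scan stands in for a vector database; both are assumptions made to keep the example self-contained, as are all the function names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(parents, child_size=5):
    """Embed small child chunks, each tagged with the id of its parent passage."""
    index = []
    for parent_id, parent in enumerate(parents):
        words = parent.split()
        for i in range(0, len(words), child_size):
            child = " ".join(words[i:i + child_size])
            index.append((embed(child), parent_id))
    return index


def retrieve_parent(query, index, parents):
    """Match the query against child chunks, but return the whole parent."""
    q = embed(query)
    _, best_parent = max(index, key=lambda entry: cosine(q, entry[0]))
    return parents[best_parent]


parents = [
    "Chunk size controls retrieval precision. Small chunks match queries "
    "tightly but lose surrounding context.",
    "Vector databases store embeddings. They answer nearest neighbor "
    "queries over high dimensional vectors.",
]
index = build_index(parents)
print(retrieve_parent("nearest neighbor search", index, parents))
```

Matching happens at child granularity, so a short query can hit a specific five-word window, while the generator still receives the full parent passage.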

Sources

LangChain: Text Splitters documentation
