If naive RAG's failure modes cluster around precision and recall, Advanced RAG is the set of techniques that target those two metrics directly, by wrapping the naive core with pre-retrieval and post-retrieval stages.
The diagram above shows the three-stage structure. Each stage has a specific job: pre-retrieval improves what you ask for, retrieval casts a wide net, post-retrieval narrows it to what actually matters.
Pre-retrieval: improving the query before it hits the index
Query rewriting reformulates ambiguous queries into a form that retrieves better. Multi-query expansion generates several paraphrased versions of the question, retrieves for each, and merges the results. Decomposition splits compound questions into sub-queries retrieved independently, directly addressing the multi-hop failure mode from Part 2.
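As a rough illustration of multi-query expansion, the sketch below retrieves for the original question and each paraphrase, then merges the results with de-duplication. The names `multi_query_retrieve`, `toy_expand`, and `toy_retrieve` are hypothetical: in a real pipeline the expansion step would call an LLM and the retriever would query your index. Both are replaced here with hard-coded stand-ins so the example runs on its own.

```python
from typing import Callable

def multi_query_retrieve(
    question: str,
    expand: Callable[[str], list[str]],
    retrieve: Callable[[str], list[str]],
) -> list[str]:
    """Retrieve for the question and each paraphrase, merging without duplicates."""
    merged: list[str] = []
    seen: set[str] = set()
    for query in [question, *expand(question)]:
        for doc_id in retrieve(query):
            if doc_id not in seen:  # keep the first occurrence only
                seen.add(doc_id)
                merged.append(doc_id)
    return merged

# Toy corpus (assumption: stands in for a real index).
CORPUS = {
    "doc-reset": "how to reset your password from the login page",
    "doc-recover": "recover account access when credentials are forgotten",
    "doc-billing": "update billing details and invoices",
}

def toy_expand(question: str) -> list[str]:
    # Assumption: hard-coded paraphrases in place of an LLM call.
    return ["recover account access", "forgotten credentials"]

def toy_retrieve(query: str) -> list[str]:
    # Assumption: naive term-overlap match in place of vector or keyword search.
    terms = set(query.lower().split())
    return [doc_id for doc_id, text in CORPUS.items() if terms & set(text.split())]

results = multi_query_retrieve("reset password", toy_expand, toy_retrieve)
print(results)
```

The original query alone finds only `doc-reset`; the paraphrases pull in `doc-recover`, which shares no terms with the user's wording. Simple de-duplication is one possible merge strategy; a rank-fusion merge is another.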
Hybrid retrieval: combining two fundamentally different search mechanisms
Dense (vector) search catches semantic similarity. Sparse (BM25/keyword) search catches exact matches: error codes, product IDs, names. Neither alone is sufficient for enterprise content, which mixes prose with identifiers constantly. Reciprocal Rank Fusion (RRF) merges the two ranked lists: each document scores the sum of 1/(k + rank) over the lists it appears in, so documents that multiple retrieval methods agree on rise to the top.
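A minimal sketch of RRF follows, assuming each retriever returns an ordered list of document IDs. The function name, the sample IDs, and the alphabetical tie-break are illustrative choices; `k = 60` is a commonly used default rather than a requirement.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank)) across lists."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results from a dense and a sparse retriever.
dense_hits = ["kb-12", "kb-07", "kb-31"]
sparse_hits = ["kb-31", "kb-12", "kb-44"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Because RRF uses only ranks, it needs no calibration between cosine similarities and BM25 scores, which live on different scales. Here `kb-12` and `kb-31` appear in both lists and therefore outrank the documents that only one retriever found.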
Post-retrieval: refining results before they reach the model
Cross-encoder reranking is the highest-leverage technique here. The standard pattern: retrieve top-50 with fast vector search, rerank down to top-5 with a cross-encoder that scores (query, document) jointly. Contextual compression then strips irrelevant sentences from surviving chunks.
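The retrieve-then-rerank pattern can be sketched as below. The scorer is the part a cross-encoder would fill: it sees the query and the document together and returns one relevance score. `toy_score_pair` is a deliberately crude term-overlap stand-in so the example runs without a model; it is an assumption for illustration, not how a cross-encoder computes relevance.

```python
from typing import Callable

def rerank(
    query: str,
    candidates: list[str],
    score_pair: Callable[[str, str], float],
    top_n: int = 5,
) -> list[str]:
    """Score every (query, document) pair jointly and keep the top_n documents."""
    scored = [(score_pair(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable for equal scores
    return [doc for _, doc in scored[:top_n]]

def toy_score_pair(query: str, document: str) -> float:
    # Assumption: fraction of query terms present in the document.
    # A real pipeline would call a cross-encoder model here instead.
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    return len(query_terms & doc_terms) / len(query_terms) if query_terms else 0.0

# Pretend these came back from the fast first-stage retriever.
candidates = [
    "Invoices are emailed monthly",
    "Error E42 means the disk is full",
    "Disk cleanup frees space on the volume",
    "Error codes are listed in the appendix",
]

top = rerank("what does error E42 mean", candidates, toy_score_pair, top_n=2)
print(top)
```

The split matters for cost: the first stage compares precomputed vectors and can scan a large index cheaply, while the joint scorer runs once per candidate and is therefore applied only to the shortlist.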
Spotlight: HyDE (Hypothetical Document Embeddings)
Questions and answers often live in different regions of embedding space. HyDE's fix: have the LLM generate a hypothetical answer to the question, then embed that instead. The hypothetical answer does not need to be factually correct, because only its embedding is used. That embedding tends to land much closer to real documents because it's written in the same register as the documents. HyDE is particularly valuable in legal, medical, and technical support domains.
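To make the mechanism concrete, here is a toy sketch of HyDE. Everything model-related is a stand-in: `embed` is a bag-of-words counter rather than a neural embedding, and `toy_generate_answer` returns a hard-coded string where a real system would prompt an LLM. The documents and question are invented for illustration.

```python
import math
from collections import Counter
from typing import Callable

def embed(text: str) -> Counter:
    # Assumption: bag-of-words counts stand in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def hyde_search(
    question: str,
    generate_answer: Callable[[str], str],
    documents: list[str],
    top_k: int = 1,
) -> list[str]:
    """Embed a hypothetical answer instead of the question, then search with it."""
    query_vec = embed(generate_answer(question))
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:top_k]

DOCS = [
    "the tenant must give thirty days written notice before ending the lease",
    "how can i pay my rent online",
]

def toy_generate_answer(question: str) -> str:
    # Assumption: hard-coded text in place of an LLM-generated hypothetical answer.
    return "the tenant must give written notice before ending the lease"

question = "how can i end my lease early"
direct = sorted(DOCS, key=lambda d: cosine(embed(question), embed(d)), reverse=True)[0]
via_hyde = hyde_search(question, toy_generate_answer, DOCS)[0]
print("direct:", direct)
print("HyDE:  ", via_hyde)
```

In this toy setup the raw question matches the document that is phrased like a question, while the hypothetical answer matches the document that actually contains the answer. Real embeddings are far less literal than word counts, so the gap is usually subtler than this contrived case suggests.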
The takeaway: optimise retrieval for recall first - cast a wide net. Then optimise for precision - turning a noisy top-50 into a trustworthy top-5.