Parts 2 and 3 covered upgrades within a single retrieval pipeline. This part covers two structural changes to what the system does before or during retrieval. The two are often confused with each other, yet they solve completely different problems.
The diagram shows both patterns: Modular RAG routing queries across distinct data domains, and Graph RAG traversing entity relationships that vector search cannot follow.
Modular RAG: routing across domains
A router classifies query intent first, then dispatches to whichever index, retriever, or structured data source is best suited. A query about parental leave goes to the HR policy index. A query about a specific error code goes to engineering documentation. A query asking "how many tickets did we close last month" goes to a text-to-SQL path against a structured database - not to a document index at all.
This earns its complexity when you have multiple genuinely distinct data domains - HR, Legal, Engineering, Finance - each with different content types and different access rules. The router itself is usually lightweight: a small classifier or a single LLM call, either of which is cheap relative to the retrieval it directs.
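The routing step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the keyword rules stand in for the small classifier or LLM call, and every name here (`classify_intent`, `route_query`, the backend stubs, the route labels) is hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical retrieval backends. Each stub stands in for a real index,
# retriever, or text-to-SQL path and only tags the query with its destination.
def search_hr_index(query: str) -> str:
    return f"[hr_policy_index] {query}"

def search_engineering_docs(query: str) -> str:
    return f"[engineering_docs] {query}"

def run_text_to_sql(query: str) -> str:
    return f"[text_to_sql] {query}"

def search_general_index(query: str) -> str:
    return f"[general_index] {query}"

# Route label -> backend. The labels are assumptions made for this sketch.
ROUTES: Dict[str, Callable[[str], str]] = {
    "hr": search_hr_index,
    "engineering": search_engineering_docs,
    "analytics": run_text_to_sql,
    "general": search_general_index,
}

# Placeholder intent rules. In practice this would be a small trained
# classifier or a single LLM call that returns one of the route labels.
KEYWORDS: Dict[str, List[str]] = {
    "hr": ["leave", "benefits", "vacation"],
    "engineering": ["error code", "stack trace", "exception"],
    "analytics": ["how many", "count of", "last month"],
}

def classify_intent(query: str) -> str:
    """Return the first route whose keywords appear in the query."""
    text = query.lower()
    for route, keywords in KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return route
    return "general"  # fallback when no rule matches

def route_query(query: str) -> str:
    """Classify first, then dispatch to the matching backend."""
    return ROUTES[classify_intent(query)](query)

if __name__ == "__main__":
    for q in [
        "What is our parental leave policy?",
        "What does error code E1234 mean?",
        "How many tickets did we close last month?",
    ]:
        print(route_query(q))
```

Keeping the classifier and the dispatch table separate means the keyword rules can later be swapped for a model without touching the backends, and access rules can be enforced per route.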
Graph RAG: reasoning over relationships
At ingestion time, an LLM extracts entities and relationships from documents and builds a knowledge graph. At query time, retrieval can traverse these relationships - not just match on semantic similarity. This matters for questions vector search structurally cannot answer, such as "how does Policy A affect Department Y?" when the only link between the two runs through an intermediate Department X. Vector search finds documents mentioning either, but has no mechanism for discovering that the connection passes through Department X. Graph traversal can follow that path explicitly.
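A toy version of that traversal, as a sketch only: the triples below are hand-written stand-ins for what an LLM might extract at ingestion time, the relationship names ("governs", "supplies") are invented for illustration, and a plain breadth-first search stands in for whatever graph store and query language a real system would use.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical extraction output; a real pipeline would produce these
# from documents at ingestion time.
TRIPLES: List[Triple] = [
    ("Policy A", "governs", "Department X"),
    ("Department X", "supplies", "Department Y"),
    ("Policy B", "governs", "Department Z"),
]

def build_graph(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Build a directed adjacency list: subject -> [(relation, object), ...]."""
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, []).append((relation, obj))
    return graph

def find_path(
    graph: Dict[str, List[Tuple[str, str]]], start: str, goal: str
) -> Optional[List[Triple]]:
    """Breadth-first search returning the shortest chain of triples, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None  # no chain of relationships connects start to goal

if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    path = find_path(graph, "Policy A", "Department Y")
    for subject, relation, obj in path or []:
        print(f"{subject} --{relation}--> {obj}")
```

The returned chain of triples is what would be handed to the LLM as context, so the answer can cite the intermediate hop rather than two unrelated documents.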
Cost note: Graph RAG's ingestion cost isn't one-time - it recurs every time your corpus changes. Justify it when relationship-aware queries are a meaningful share of real traffic, evidenced by eval data - not because relationship reasoning sounds like a natural next level.