RAG Architecture Series Part 6 of 6

RAG Architecture Decision Tree: A Practical Guide From Naive to Agentic

This is the wrap-up for the series - and the part worth bookmarking, because it's the one you'll return to when scoping a new system.

The decision tree below maps every pattern from the series to the specific query characteristic that justifies it. Work top to bottom, and add complexity only where you have evidence of a gap, not anticipation of one.

The decision logic, in plain language

Single-hop, FAQ-style queries? Start with Naive RAG. If evaluation shows precision gaps, add Advanced RAG (hybrid search + reranking) before reaching for anything more structural.

Multiple data domains, or mixed structured and unstructured data? Add Modular RAG routing.

Answers that require entity relationship traversal? Add Graph RAG on top.

Multi-hop, self-correcting queries? Use Agentic RAG, with hard iteration limits, a cost ceiling, and full tracing from day one.
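The decision logic above can be sketched as a small function. This is a minimal illustration, not a prescribed implementation: `QueryProfile`, its field names, and `recommend_layers` are hypothetical names introduced here, and each boolean flag stands in for a finding you would get from your own evaluation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QueryProfile:
    """Hypothetical summary of what evaluation has shown about the workload."""
    precision_gap: bool = False              # eval shows retrieval precision gaps
    multiple_domains: bool = False           # several domains, or mixed structured/unstructured data
    relationship_traversal: bool = False     # answers need entity relationship traversal
    multi_hop_self_correcting: bool = False  # queries need multi-hop, self-correcting retrieval


def recommend_layers(profile: QueryProfile) -> List[str]:
    """Start from Naive RAG and add one layer per observed gap."""
    layers = ["Naive RAG"]
    if profile.precision_gap:
        layers.append("Advanced RAG (hybrid search + reranking)")
    if profile.multiple_domains:
        layers.append("Modular RAG routing")
    if profile.relationship_traversal:
        layers.append("Graph RAG")
    if profile.multi_hop_self_correcting:
        layers.append("Agentic RAG (iteration limit, cost ceiling, tracing)")
    return layers
```

With every flag left at its default the function returns only the Naive RAG baseline, which mirrors the advice to start simple and add layers on evidence.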

Figure: RAG decision tree. What kind of queries?

Single-hop, FAQ-style, single domain? YES: Naive RAG (start here always; if there is a precision gap, add Advanced RAG). NO: next question.

Multiple data domains? YES: Modular RAG + intent routing. NO: next question.

Relationship traversal needed? YES: Graph RAG (entity relationships). NO: Advanced RAG.

+ Agentic RAG: at any tier, when multi-hop self-correction is required. Adds the Reason → Act → Observe → Evaluate loop. Requires: max iterations, cost ceiling, full tracing.
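The three guardrails named for Agentic RAG (max iterations, cost ceiling, full tracing) can be sketched as a bounded loop. This is an illustrative skeleton under stated assumptions: `run_agent_loop`, `LoopResult`, and the `step` and `evaluate` callables are hypothetical, `step` collapses the Reason, Act, and Observe stages into a single call, and the default limits are arbitrary placeholders rather than recommendations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# step(question, trace) -> (draft_answer, cost_of_this_step_in_usd)
StepFn = Callable[[str, List[Dict]], Tuple[str, float]]
# evaluate(question, draft_answer) -> True if the draft is good enough
EvalFn = Callable[[str, str], bool]


@dataclass
class LoopResult:
    answer: Optional[str]
    stop_reason: str          # "accepted", "cost_ceiling", or "max_iterations"
    cost_usd: float
    trace: List[Dict] = field(default_factory=list)


def run_agent_loop(
    question: str,
    step: StepFn,
    evaluate: EvalFn,
    max_iterations: int = 4,         # placeholder default
    cost_ceiling_usd: float = 0.50,  # placeholder default
) -> LoopResult:
    trace: List[Dict] = []
    spent = 0.0
    for iteration in range(1, max_iterations + 1):
        draft, cost = step(question, trace)   # Reason + Act + Observe
        spent += cost
        accepted = evaluate(question, draft)  # Evaluate
        # Record every iteration so a failed run can be inspected afterwards.
        trace.append(
            {"iteration": iteration, "draft": draft, "cost_usd": cost, "accepted": accepted}
        )
        if accepted:
            return LoopResult(draft, "accepted", spent, trace)
        if spent >= cost_ceiling_usd:
            return LoopResult(None, "cost_ceiling", spent, trace)
    return LoopResult(None, "max_iterations", spent, trace)
```

Returning the stop reason alongside the trace makes it possible to tell a run that converged from one that was cut off by a limit.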

Scale tiers

Small (under 50K documents): Naive or lightly-Advanced RAG. pgvector or a serverless Pinecone keeps infrastructure overhead minimal. ~50 golden questions for evaluation is a reasonable starting point.

Medium (50K-1M documents): Advanced RAG's hybrid search and reranking earn their cost here. Dedicated vector infrastructure - Qdrant, Weaviate, OpenSearch. Automated regression suite: RAGAS-style metrics on every change, not just at launch.

Large (1M+ documents): Modular + Graph + Agentic, applied selectively per domain based on real query patterns - not uniformly. Multi-tenant indexes with access-control-aware retrieval become a requirement.
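To make "access-control-aware retrieval" concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the `Chunk` shape, the `tenant_id` and `allowed_groups` fields, and `retrieve_for_user` are hypothetical names. A production system would more likely push the same conditions into the vector store's own metadata filter, so that unauthorized chunks are never scored or returned at all.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Chunk:
    """Hypothetical retrieved chunk carrying its access-control metadata."""
    text: str
    score: float                    # similarity score from the retriever
    tenant_id: str
    allowed_groups: FrozenSet[str]  # groups permitted to read this chunk


def retrieve_for_user(
    candidates: Iterable[Chunk],
    tenant_id: str,
    user_groups: FrozenSet[str],
    k: int = 3,
) -> List[Chunk]:
    """Drop chunks the caller may not see, then return the top-k by score."""
    visible = [
        c for c in candidates
        if c.tenant_id == tenant_id and (c.allowed_groups & user_groups)
    ]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:k]
```

The filter runs before the top-k cut, so a restricted chunk cannot take a slot away from one the user is allowed to read.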

The four eval metrics that actually matter

Closing thought: every pattern in this series is a response to a specific, observable gap. The discipline that separates well-architected RAG from over-engineered systems isn't knowing these patterns - it's having the evaluation infrastructure to know which gap you actually have, and adding exactly the layer that closes it.

