Graph RAG

Graph RAG augments retrieval-augmented generation with structured knowledge graph traversal. The promise: questions whose answers span multiple documents can be answered by following graph relationships, not just by vector similarity.

This page covers what graph RAG is, when it helps, and the costs.

Standard RAG, briefly

Embed query
Find similar passages from a vector index
Pass passages + query to LLM
LLM generates answer
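
The four steps above can be sketched as follows. This is a minimal illustration, not a production pipeline: `embed` is a toy bag-of-words stand-in for a real embedding model, the passages are made up, and the final LLM call is left out so the sketch stays self-contained.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: bag-of-words term counts.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    # Steps 1-2: embed the query and rank passages by similarity.
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    # Step 3: pack the retrieved passages and the query into one prompt.
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context.\n\nContext:\n{context}\n\nQuestion: {query}"

passages = [
    "Neo4j is a property graph database.",
    "Cypher is the query language used by Neo4j.",
    "Bananas are rich in potassium.",
]
question = "Which query language does Neo4j use?"
top = retrieve(question, passages, k=2)
prompt = build_prompt(question, top)  # step 4 would send this to an LLM
```

Note that each passage is scored independently, which is why answers that need several passages combined are hard for this setup.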

Works well when:

Answer exists in one passage
Top-k retrieval surfaces it

Fails when:

Answer requires combining passages
Implicit reasoning across documents needed
Specific entities or relationships need to be tracked

What graph RAG adds

Build a knowledge graph from your corpus:

Entities (nodes)
Relationships (edges)
Entity properties

Graph queries can:

Traverse relationships (find papers that cite X and are cited by Y)
Aggregate (count, list)
Constrain by entity type or property

For multi-hop questions, graph traversal can find answers vector retrieval misses.
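
As a small illustration of a traversal, the sketch below stores citation edges in memory and answers "which papers cite X and are cited by Y". The paper names and the `CITES` relation label are hypothetical; a real system would run the equivalent query in a graph database.

```python
from collections import defaultdict

# (source, relation, target) triples; names are made up for illustration.
edges = [
    ("paper_B", "CITES", "paper_X"),
    ("paper_C", "CITES", "paper_X"),
    ("paper_Y", "CITES", "paper_B"),
    ("paper_Y", "CITES", "paper_D"),
]

out_edges = defaultdict(set)  # node -> {(relation, target)}
in_edges = defaultdict(set)   # node -> {(relation, source)}
for src, rel, dst in edges:
    out_edges[src].add((rel, dst))
    in_edges[dst].add((rel, src))

def citing(paper: str) -> set[str]:
    # Papers with a CITES edge pointing at `paper`.
    return {src for rel, src in in_edges[paper] if rel == "CITES"}

def cited_by(paper: str) -> set[str]:
    # Papers that `paper` points at with a CITES edge.
    return {dst for rel, dst in out_edges[paper] if rel == "CITES"}

def cites_x_and_cited_by_y(x: str, y: str) -> set[str]:
    # Two-hop question answered by intersecting two one-hop neighborhoods.
    return citing(x) & cited_by(y)

result = cites_x_and_cited_by_y("paper_X", "paper_Y")
```

No single passage needs to state the answer; it falls out of two edges that may come from different documents.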

Architectures

Naive: graph-only retrieval

Extract entities from query; traverse graph; pass results to LLM.

Issues: brittle to entity extraction errors; misses passages without entities.

Hybrid: vector + graph

Vector retrieval for breadth; graph for relationships.

Combine results before LLM call.
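
How the two result lists are combined is a design choice the text leaves open. One common option is reciprocal rank fusion, sketched below; the passage ids are hypothetical and `k=60` is a conventional default, not a tuned value.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Each list contributes 1 / (k + rank) per item; items found by
    # both retrievers accumulate score from both lists.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score (descending), breaking ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

vector_hits = ["p3", "p1", "p7"]  # ranked by vector similarity
graph_hits = ["p1", "p9"]         # passages attached to traversed entities
merged = reciprocal_rank_fusion([vector_hits, graph_hits])
```

Rank-based fusion avoids having to put vector similarity scores and graph results on a common scale.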

Iterative: agent with graph tool

An LLM agent uses graph queries as a tool and decides when to traverse the graph and when to read passages.

Most flexible; highest cost.

Microsoft GraphRAG

Build hierarchical community summaries from the graph. Use those summaries to answer global questions.

Specifically targets "what are the major themes" type queries.

Building the graph

Entity extraction

Identify entities mentioned in documents.

Tools:

spaCy NER
LLM-based extraction (more flexible)
Custom domain models

Relationship extraction

Identify relationships between entities.

LLMs are good at this with appropriate prompts.
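
A minimal sketch of prompt-based relationship extraction follows. The prompt wording, the JSON output format, and the `call_llm` callable are all assumptions for illustration; `fake_llm` stands in for a real model so the example runs on its own.

```python
import json
from typing import Callable

# Hypothetical prompt; a real one would name the allowed relation types.
PROMPT = (
    "Extract relationships from the text as a JSON list of objects "
    'with keys "subject", "relation", "object". '
    "Use only facts stated in the text.\n\nText: {text}\nJSON:"
)

REQUIRED_KEYS = ("subject", "relation", "object")

def extract_triples(text: str, call_llm: Callable[[str], str]) -> list[tuple[str, str, str]]:
    raw = call_llm(PROMPT.format(text=text))
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []  # treat malformed output as "nothing extracted"
    if not isinstance(items, list):
        return []
    triples = []
    for item in items:
        # Keep only well-formed objects; silently skip the rest.
        if isinstance(item, dict) and all(key in item for key in REQUIRED_KEYS):
            triples.append((str(item["subject"]), str(item["relation"]), str(item["object"])))
    return triples

def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call, returning a canned response.
    return '[{"subject": "Ada Lovelace", "relation": "WORKED_WITH", "object": "Charles Babbage"}]'

triples = extract_triples("Ada Lovelace worked with Charles Babbage.", fake_llm)
```

Parsing defensively matters here: model output is not guaranteed to be valid JSON or to follow the requested shape.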

Schema

Two extremes:

Schema-free: any entity, any relationship. Flexible but messy.
Strongly typed: predefined entity and relationship types. Cleaner but limited.

Most production systems land in the middle.
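
One way to land in the middle, sketched under assumptions: keep a small set of typed relationships and map everything else to a generic fallback instead of rejecting it. The entity types, relation names, and `RELATED_TO` fallback are hypothetical.

```python
# Typed core of the schema: (subject type, relation, object type).
ALLOWED = {
    ("Person", "AUTHORED", "Paper"),
    ("Paper", "CITES", "Paper"),
    ("Person", "AFFILIATED_WITH", "Organization"),
}
FALLBACK_RELATION = "RELATED_TO"

def normalize_relation(subject_type: str, relation: str, object_type: str) -> str:
    # Relations in the typed core pass through unchanged; anything else
    # is kept in the graph but under a generic label.
    if (subject_type, relation, object_type) in ALLOWED:
        return relation
    return FALLBACK_RELATION

kept = normalize_relation("Paper", "CITES", "Paper")
loosened = normalize_relation("Person", "MENTORED", "Person")
```

Counting how often the fallback fires is a cheap signal for which relation types deserve promotion into the typed core.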

Storage

Property graph: Neo4j, Amazon Neptune
Triple store: Apache Jena, Stardog
General databases: PostgreSQL with graph extensions (for example, Apache AGE)
Specialized: TigerGraph, ArangoDB

Choice depends on query patterns and existing infrastructure.

Querying the graph

Direct graph queries

Cypher (Neo4j), SPARQL (RDF), Gremlin (TinkerPop).

Powerful but require schema knowledge.

Natural language to graph query

LLM translates question to graph query language.

Quality depends on schema clarity and example coverage.
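
A sketch of what "schema clarity and example coverage" can look like in practice: the prompt carries a schema description plus a few question/query pairs. The schema, the example pair, and the prompt wording are hypothetical, and the generated query would still need validation before execution.

```python
# Hypothetical schema description shown to the model.
SCHEMA = (
    "Node labels: Person(name), Paper(title, year)\n"
    "Relationships: (Person)-[:AUTHORED]->(Paper), (Paper)-[:CITES]->(Paper)"
)

# Few-shot examples: (question, Cypher query) pairs.
EXAMPLES = [
    (
        "Who wrote the paper titled 'Graph Basics'?",
        "MATCH (p:Person)-[:AUTHORED]->(paper:Paper {title: 'Graph Basics'}) RETURN p.name",
    ),
]

def build_query_prompt(question: str) -> str:
    shots = "\n\n".join(f"Question: {q}\nCypher: {c}" for q, c in EXAMPLES)
    return (
        "Translate the question into a Cypher query.\n"
        "Use only the labels, properties, and relationships in the schema.\n\n"
        f"Schema:\n{SCHEMA}\n\n{shots}\n\nQuestion: {question}\nCypher:"
    )

prompt = build_query_prompt("Which papers cite 'Graph Basics'?")
```

Adding one example per question type the system must support tends to be more useful than many examples of the same shape.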

Path queries

"Find paths from X to Y." Useful for knowledge exploration.

Graph algorithms

PageRank for importance, community detection for clusters, shortest path for relationships.
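
Two of these algorithms are small enough to sketch in plain Python: breadth-first shortest path and a basic power-iteration PageRank. The graphs are made up, and a graph database or a library such as NetworkX would normally provide both; this is only meant to show what they compute.

```python
from collections import deque
from typing import Optional

def shortest_path(graph: dict[str, list[str]], start: str, goal: str) -> Optional[list[str]]:
    # Breadth-first search; returns one shortest path or None if unreachable.
    if start == goal:
        return [start]
    previous: dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

def pagerank(graph: dict[str, list[str]], damping: float = 0.85, iterations: int = 50) -> dict[str, float]:
    # Assumes every target node also appears as a key in `graph`.
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            # Dangling nodes (no out-edges) spread their rank over all nodes.
            receivers = targets if targets else nodes
            share = damping * rank[node] / len(receivers)
            for target in receivers:
                new_rank[target] += share
        rank = new_rank
    return rank

collab = {"alice": ["acme", "paper_1"], "acme": ["bob"],
          "paper_1": ["carol"], "bob": ["carol"], "carol": []}
path = shortest_path(collab, "alice", "carol")

citations = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
ranks = pagerank(citations)
```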

When graph RAG helps

Genuinely relational questions

"Who collaborates with researchers at company X?" "What are the consequences of decision Y?"

Vector retrieval often misses these, because no single passage states the answer.

Disambiguation

Multiple entities share the same name. Graph context (neighbors, types, properties) tells them apart.

Aggregation

"How many papers cite this work in the last year?"

Provenance / lineage

"Where did this claim originate?"

Multi-hop reasoning

Connecting facts across documents.

When standard RAG is enough

Most question-answering doesn't need graphs:

Single-document answers
"What does X mean" / definitions
Procedural questions
Most enterprise FAQ use cases

For these, graph RAG adds complexity without improving quality.

Costs

Construction

Building a knowledge graph from documents:

Significant LLM costs (entity + relationship extraction)
Iteration on schema
Quality issues to fix

Often as much work as the rest of the system.

Maintenance

New documents → new entities and relationships
Entity resolution (the same entity mentioned under different names in different docs)
Schema evolution
Quality drift over time
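
Entity resolution is the maintenance task most easily shown in code. The sketch below groups mentions by a normalized key; the suffix list and the example mentions are assumptions, and real systems usually add fuzzy matching or embedding similarity on top of rules like these.

```python
import re
from collections import defaultdict

def canonical_key(name: str) -> str:
    # Lowercase, drop punctuation, drop common company suffixes.
    key = re.sub(r"[^\w\s]", "", name.lower())
    key = re.sub(r"\b(inc|corp|corporation|ltd|llc)\b", "", key)
    return " ".join(key.split())

def resolve(mentions: list[str]) -> dict[str, list[str]]:
    # Mentions that share a canonical key are treated as one entity.
    groups: dict[str, list[str]] = defaultdict(list)
    for mention in mentions:
        groups[canonical_key(mention)].append(mention)
    return dict(groups)

mentions = ["Acme Corp.", "ACME Corporation", "acme", "Globex Inc"]
entities = resolve(mentions)
```

Rules this simple will both over-merge and under-merge on real data, which is part of why resolution is an ongoing cost rather than a one-time step.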

Query latency

Graph queries can be slow for complex traversals, especially deep or unbounded ones on densely connected graphs.

Engineering complexity

Now you have two retrieval systems to maintain.

Practical patterns

Start without graph

Build standard RAG. Measure quality.

If quality is good enough, you don't need graph.

Add graph for specific queries

Identify question types where standard RAG fails.

Build graph features specifically for those.

Hybrid retrieval

For each query, do both vector retrieval and graph queries; combine.

Often the best balance.

Document the graph

Schema documentation matters. LLMs querying the graph need it.

Microsoft GraphRAG specifically

Microsoft's open-source GraphRAG builds:

Entity-relationship graph
Hierarchical community detection
Community summaries at multiple levels

For "global" questions about a corpus:

Generate a partial answer from each relevant community summary (map)
Combine the partial answers into a single answer (reduce)
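
The global-question flow can be sketched as a map-reduce over summaries. This is not the GraphRAG library's API: `global_answer`, the prompt wording, and the `NONE` convention are assumptions, and `fake_llm` stands in for a real model.

```python
from typing import Callable

def global_answer(question: str, summaries: list[str], call_llm: Callable[[str], str]) -> str:
    # Map: ask for a partial answer from each community summary.
    partials = []
    for summary in summaries:
        prompt = (
            f"Community summary:\n{summary}\n\nQuestion: {question}\n"
            "Answer from this summary only, or reply NONE."
        )
        partial = call_llm(prompt).strip()
        if partial and partial != "NONE":
            partials.append(partial)
    if not partials:
        return "No relevant information found."
    # Reduce: merge the partial answers into one response.
    joined = "\n".join(f"- {p}" for p in partials)
    return call_llm(f"Combine these partial answers into one answer.\n{joined}\n\nQuestion: {question}")

def fake_llm(prompt: str) -> str:
    # Stand-in model with canned behavior, for illustration only.
    if prompt.startswith("Combine"):
        return "COMBINED:" + prompt
    return "NONE" if "cooking" in prompt else "Theme: graph databases"

summaries = ["Papers on graph databases and query languages.", "Recipes and cooking tips."]
answer = global_answer("What are the major themes?", summaries, fake_llm)
```

The cost is one LLM call per summary plus one for the reduce step, so the choice of community level directly controls query cost.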

For "local" questions:

Standard graph + vector retrieval

Effective for "summarize the main themes in this corpus" use cases.

Evaluation

Evaluation is hard. Standard RAG evaluation (precision/recall on retrieved passages) doesn't capture what the graph contributes.

Approaches

Question typology: easy / multi-hop / aggregation
Answer correctness on multi-hop questions
Latency / cost comparison
Coverage of graph relationships

Curate eval sets that test relational reasoning.
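
Reporting correctness per question type, rather than one overall number, makes the graph's contribution visible. A minimal sketch, with placeholder records that are not measurements from any system:

```python
from collections import defaultdict

def accuracy_by_type(results: list[dict]) -> dict[str, float]:
    # Each record has a question "type" and a boolean "correct" flag.
    totals: dict[str, int] = defaultdict(int)
    correct: dict[str, int] = defaultdict(int)
    for record in results:
        totals[record["type"]] += 1
        correct[record["type"]] += int(record["correct"])
    return {qtype: correct[qtype] / totals[qtype] for qtype in totals}

# Placeholder records for illustration only.
results = [
    {"type": "single-hop", "correct": True},
    {"type": "single-hop", "correct": True},
    {"type": "multi-hop", "correct": True},
    {"type": "multi-hop", "correct": False},
    {"type": "aggregation", "correct": False},
]
report = accuracy_by_type(results)
```

Running the same eval set against standard RAG and graph RAG, then comparing the per-type numbers, shows whether the graph helps where it was expected to.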

Common failure patterns

Graph quality issues

Bad entity extraction → useless graph.

Schema rigidity

Schema doesn't match evolving content.

Over-engineering

Graph RAG when standard RAG would suffice.

LLM-generated queries fail silently

The generated query returns no rows, and the agent makes up an answer instead of reporting that nothing was found.
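
One simple guard is to handle the empty result in code before the LLM ever sees it. In this sketch `answer_from_graph` and `call_llm` are hypothetical names, and `rows` is whatever the graph query returned.

```python
from typing import Callable

NOT_FOUND = "I could not find matching data in the graph for this question."

def answer_from_graph(question: str, rows: list[dict], call_llm: Callable[[str], str]) -> str:
    # Empty result: say so explicitly instead of letting the model guess.
    if not rows:
        return NOT_FOUND
    facts = "\n".join(str(row) for row in rows)
    return call_llm(f"Answer using only these facts:\n{facts}\n\nQuestion: {question}")

calls = []

def recording_llm(prompt: str) -> str:
    # Stand-in model that records each prompt it receives.
    calls.append(prompt)
    return "stub answer"

empty_case = answer_from_graph("Who cites paper X?", [], recording_llm)
found_case = answer_from_graph("Who cites paper X?", [{"paper": "paper_B"}], recording_llm)
```

Logging empty results is also useful: a spike often means the generated queries no longer match the schema.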

Update lag

Graph stale relative to documents.

Hallucinated entities

LLM extraction creates entities that aren't in the source.
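
A cheap partial defense is to keep an extracted triple only if both entities literally appear in the source text. The sketch below assumes exact, case-insensitive substring matching, which will also reject legitimate paraphrases and aliases; the example triples are made up.

```python
Triple = tuple[str, str, str]

def split_grounded(triples: list[Triple], source_text: str) -> tuple[list[Triple], list[Triple]]:
    # Returns (kept, rejected); a triple is kept only when both its
    # subject and object occur verbatim (ignoring case) in the source.
    text = source_text.lower()
    kept, rejected = [], []
    for subject, relation, obj in triples:
        if subject.lower() in text and obj.lower() in text:
            kept.append((subject, relation, obj))
        else:
            rejected.append((subject, relation, obj))
    return kept, rejected

source = "Ada Lovelace worked with Charles Babbage on the Analytical Engine."
extracted = [
    ("Ada Lovelace", "WORKED_WITH", "Charles Babbage"),
    ("Ada Lovelace", "WORKED_WITH", "Alan Turing"),  # not in the source
]
kept, rejected = split_grounded(extracted, source)
```

This checks that the entities exist in the source, not that the relationship between them is actually stated, so it reduces hallucinated nodes but not hallucinated edges.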

Decision framework

Build graph RAG when:

Standard RAG measurably fails on important question types
Your domain has clear entity/relationship structure
You can invest in graph maintenance
Multi-hop reasoning is core to use cases

Skip graph RAG when:

Standard RAG works
Domain is unstructured
Maintenance cost outweighs benefit
Team can't support two retrieval systems

Future direction

Graph RAG is evolving:

Better automated schema discovery
LLM-native graph reasoning
Better evaluation methodologies
Tooling maturing (LangChain, LlamaIndex graph integrations)

It's promising for complex, relational domains, but not always needed.