Graph RAG

Graph RAG augments retrieval-augmented generation with structured knowledge graph traversal. The promise: questions whose answers span multiple documents can be answered by following graph relationships, not just by vector similarity.

This page covers what graph RAG is, when it helps, and what it costs.

Standard RAG, briefly

1. Embed query

2. Find similar passages from a vector index

3. Pass passages + query to LLM

4. LLM generates answer
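
The first three steps above can be sketched in a few lines. This is a toy illustration only: `embed` is a bag-of-words stand-in for a real embedding model, the passages are invented, and the final LLM call (step 4) is left out.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Steps 1-2: embed the query and return the k most similar passages."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Step 3: assemble the retrieved passages and the query into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Invented example corpus
passages = [
    "Neo4j is a property graph database.",
    "PageRank scores nodes by importance.",
    "Cypher is the query language used by Neo4j.",
]
question = "What query language does Neo4j use?"
top = retrieve(question, passages, k=2)
prompt = build_prompt(question, top)  # this string would be sent to the LLM
```

Each passage is scored on its own, which is why this design struggles when an answer is spread over several passages.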

Works well when:

- Answer exists in one passage

- Top-k retrieval surfaces it

Fails when:

- Answer requires combining passages

- Answer requires implicit reasoning across documents

- Specific entities or relationships need to be tracked

What graph RAG adds

Build a knowledge graph from your corpus:

- Entities (nodes)

- Relationships (edges)

- Entity properties

Graph queries can:

- Traverse relationships (e.g., find papers that cite X and are cited by Y)

- Aggregate (count, list)

- Constrain by entity type or property

For multi-hop questions, graph traversal can find answers vector retrieval misses.
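
A minimal in-memory sketch of these ideas follows. The `KnowledgeGraph` class, the `CITES` relation, and the paper IDs are all hypothetical; a production system would use one of the graph stores discussed later rather than Python dictionaries.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Tiny property graph: nodes with properties, typed directed edges."""

    def __init__(self):
        self.node_props = {}
        self.out_edges = defaultdict(set)  # (source, relation) -> targets
        self.in_edges = defaultdict(set)   # (target, relation) -> sources

    def add_node(self, node_id, **props):
        self.node_props[node_id] = props

    def add_edge(self, source, relation, target):
        self.out_edges[(source, relation)].add(target)
        self.in_edges[(target, relation)].add(source)

    def targets(self, source, relation):
        """Nodes reached from `source` over `relation`."""
        return set(self.out_edges.get((source, relation), set()))

    def sources(self, target, relation):
        """Nodes that point at `target` over `relation`."""
        return set(self.in_edges.get((target, relation), set()))

kg = KnowledgeGraph()
for pid, year in [("paper_x", 2019), ("paper_y", 2023), ("paper_a", 2021), ("paper_b", 2022)]:
    kg.add_node(pid, type="Paper", year=year)
kg.add_edge("paper_a", "CITES", "paper_x")
kg.add_edge("paper_b", "CITES", "paper_x")
kg.add_edge("paper_y", "CITES", "paper_a")

# Traverse: papers that cite X and are themselves cited by Y
bridge = kg.sources("paper_x", "CITES") & kg.targets("paper_y", "CITES")

# Aggregate: how many papers cite X
citation_count = len(kg.sources("paper_x", "CITES"))

# Constrain by property: citing papers published in 2022 or later
recent = {p for p in kg.sources("paper_x", "CITES") if kg.node_props[p]["year"] >= 2022}
```

The `bridge` query is the kind of two-hop question that no single passage answers, since the two citation facts would normally live in different documents.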

Architectures

Naive: graph-only retrieval

Extract entities from query; traverse graph; pass results to LLM.

Issues: brittle to entity extraction errors; misses passages without entities.

Hybrid: vector + graph

Vector retrieval for breadth; graph for relationships.

Combine results before LLM call.
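
How to combine the two result lists is a design choice the text leaves open. One common option is reciprocal rank fusion, sketched here under the assumption that both retrievers return ranked lists of passage IDs (the IDs and the constant `k=60` are illustrative, not prescribed by this page).

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each item scores 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the output is deterministic
    return sorted(scores, key=lambda item: (-scores[item], item))

vector_hits = ["p3", "p1", "p7"]  # hypothetical vector-index results, best first
graph_hits = ["p1", "p9"]         # hypothetical passages linked to traversed entities
fused = reciprocal_rank_fusion([vector_hits, graph_hits])
```

Items found by both retrievers rise to the top, and the merged list is deduplicated before it reaches the LLM.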

Iterative: agent with graph tool

LLM agent uses graph queries as a tool. Decides when to traverse vs read.

Most flexible; highest cost.

Microsoft GraphRAG

Build hierarchical community summaries from the graph. Use community summaries for global questions.

Specifically targets "what are the major themes" type queries.

Building the graph

Entity extraction

Identify entities mentioned in documents.

Tools:

- spaCy NER

- LLM-based extraction (more flexible)

- Custom domain models
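
To show the shape of the output an extractor produces, here is a deliberately simple dictionary-based sketch. It stands in for the tools listed above rather than reproducing any of them; the `GAZETTEER` entries and type names are made up for illustration.

```python
import re

# Hypothetical lookup table: lowercase surface form -> (canonical name, entity type)
GAZETTEER = {
    "neo4j": ("Neo4j", "Product"),
    "amazon neptune": ("Amazon Neptune", "Product"),
    "microsoft": ("Microsoft", "Organization"),
}

def extract_entities(text: str) -> list[tuple[str, str, int]]:
    """Return (canonical_name, type, char_offset) for each gazetteer match."""
    found = []
    for surface, (name, etype) in GAZETTEER.items():
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append((name, etype, match.start()))
    return sorted(found, key=lambda entity: entity[2])

mentions = extract_entities("Microsoft published GraphRAG; some teams store the graph in Neo4j.")
```

Keeping the character offset makes it possible to link each graph node back to the passage it came from.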

Relationship extraction

Identify relationships between entities.

LLMs are good at this with appropriate prompts.

Schema

Two extremes:

- **Schema-free**: any entity, any relationship. Flexible but messy.

- **Strongly typed**: predefined entity and relationship types. Cleaner but limited.

Most production systems land in the middle.
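
One possible middle ground is to accept any entity name but restrict relationship types and the entity types they may connect. The sketch below assumes the extraction LLM returns a JSON list of triples with the field names shown; both the `SCHEMA` and that output format are assumptions for illustration.

```python
import json

# Hypothetical schema: relation -> (allowed subject type, allowed object type)
SCHEMA = {
    "AUTHORED": ("Person", "Paper"),
    "CITES": ("Paper", "Paper"),
    "WORKS_AT": ("Person", "Organization"),
}

def validate_triples(raw_json: str) -> tuple[list[dict], list[dict]]:
    """Split extracted triples into (accepted, rejected) against SCHEMA."""
    accepted, rejected = [], []
    for triple in json.loads(raw_json):
        expected = SCHEMA.get(triple.get("relation"))  # None for unknown relations
        actual = (triple.get("subject_type"), triple.get("object_type"))
        (accepted if expected == actual else rejected).append(triple)
    return accepted, rejected

# Assumed shape of the extraction output
llm_output = json.dumps([
    {"subject": "Ada", "subject_type": "Person", "relation": "AUTHORED",
     "object": "Graph Basics", "object_type": "Paper"},
    {"subject": "Graph Basics", "subject_type": "Paper", "relation": "WORKS_AT",
     "object": "Acme", "object_type": "Organization"},  # wrong subject type
    {"subject": "Ada", "subject_type": "Person", "relation": "ADMIRES",
     "object": "Bob", "object_type": "Person"},          # relation not in schema
])
accepted, rejected = validate_triples(llm_output)
```

Logging the rejected triples rather than discarding them gives a running signal of where the schema may need to grow.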

Storage

- Property graph: Neo4j, Amazon Neptune (Neptune also supports RDF)

- Triple store: Apache Jena, Stardog

- General databases: PostgreSQL with graph extensions

- Specialized: TigerGraph, ArangoDB

Choice depends on query patterns and existing infrastructure.

Querying the graph

Direct graph queries

Cypher (Neo4j), SPARQL (RDF), Gremlin (TinkerPop).

Powerful but require schema knowledge.

Natural language to graph query

LLM translates question to graph query language.

Quality depends on schema clarity and example coverage.
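
A sketch of how schema text and worked examples might be packed into the translation prompt. The schema, the example questions, and the Cypher strings are all hypothetical, and the prompt wording is one possibility among many.

```python
# Hypothetical schema description given to the model
SCHEMA_DOC = """\
Node labels: Person(name), Paper(title, year), Organization(name)
Relationships: (Person)-[:AUTHORED]->(Paper), (Paper)-[:CITES]->(Paper),
               (Person)-[:WORKS_AT]->(Organization)"""

# Hypothetical few-shot examples: (question, Cypher query)
EXAMPLES = [
    ("Which papers cite 'Graph Basics'?",
     "MATCH (p:Paper)-[:CITES]->(:Paper {title: 'Graph Basics'}) RETURN p.title"),
    ("Who works at Acme?",
     "MATCH (a:Person)-[:WORKS_AT]->(:Organization {name: 'Acme'}) RETURN a.name"),
]

def build_query_prompt(question: str) -> str:
    """Assemble instructions, schema, and examples ahead of the new question."""
    shots = "\n\n".join(f"Question: {q}\nCypher: {c}" for q, c in EXAMPLES)
    return (
        "Translate the question into a Cypher query. "
        "Use only the labels and relationships in the schema.\n\n"
        f"Schema:\n{SCHEMA_DOC}\n\n{shots}\n\nQuestion: {question}\nCypher:"
    )

prompt = build_query_prompt("Who authored papers that cite 'Graph Basics'?")
```

The new question here needs a two-hop pattern that neither example shows, which is the kind of gap in example coverage that tends to produce wrong queries.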

Path queries

"Find paths from X to Y." Useful for knowledge exploration.

Graph algorithms

PageRank for importance, community detection for clusters, shortest path for relationships.
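
Two of these algorithms are small enough to sketch directly. The code below is a plain-Python illustration on an invented citation graph, assuming a directed adjacency dict; graph databases and libraries ship tuned versions that would normally be used instead.

```python
from collections import deque

def pagerank(out_links: dict, damping: float = 0.85, iterations: int = 50) -> dict:
    """Power-iteration PageRank; dangling nodes spread their rank evenly."""
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = out_links.get(node, [])
            recipients = targets if targets else nodes  # dangling node: share with all
            share = damping * rank[node] / len(recipients)
            for target in recipients:
                new_rank[target] += share
        rank = new_rank
    return rank

def shortest_path(out_links: dict, start, goal):
    """Breadth-first search; returns the node list of a shortest path, or None."""
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in out_links.get(node, []):
            if nxt in parents:
                continue
            parents[nxt] = node
            if nxt == goal:
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None

# Invented citation graph: a -> c, b -> c, c -> d
citations = {"a": ["c"], "b": ["c"], "c": ["d"], "d": []}
ranks = pagerank(citations)
path = shortest_path(citations, "a", "d")
```

In a retrieval setting, rank scores can help order candidate entities, and the returned path can be passed to the LLM as an explicit chain of relationships.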

When graph RAG helps

Genuinely relational questions

"Who collaborates with researchers at company X?"

"What are the consequences of decision Y?"

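Vector retrieval tends to miss these: the answer depends on links between entities, not on any single passage that resembles the question.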

Disambiguation

Multiple entities share the same name. Graph context (neighbors, types, properties) tells them apart.

Aggregation

"How many papers cite this work in the last year?"

Provenance / lineage

"Where did this claim originate?"

Multi-hop reasoning

Connecting facts across documents.

When standard RAG is enough

Most question-answering doesn't need graphs:

- Single-document answers

- "What does X mean" / definitions

- Procedural questions

- Most enterprise FAQ use cases

For these, graph RAG adds complexity without improving answer quality.

Costs

Construction

Building a knowledge graph from documents:

- Significant LLM costs (entity + relationship extraction)

- Iteration on schema

- Quality issues to fix

Often as much work as the rest of the system.

Maintenance

- New documents → new entities and relationships

- Entity resolution (the same entity X mentioned under different names across documents)

- Schema evolution

- Quality drift over time

Query latency

Complex traversals can be slow: the number of paths explored can grow quickly with each additional hop, especially through high-degree nodes.

Engineering complexity

Now you have two retrieval systems to maintain.

Practical patterns

Start without graph

Build standard RAG. Measure quality.

If quality is good enough, you don't need graph.

Add graph for specific queries

Identify question types where standard RAG fails.

Build graph features specifically for those.

Hybrid retrieval

For each query, do both vector retrieval and graph queries; combine.

Often the best balance.

Document the graph

Schema documentation matters. LLMs querying the graph need it.

Microsoft GraphRAG specifically

Microsoft's open-source GraphRAG builds:

- Entity-relationship graph

- Hierarchical community detection

- Community summaries at multiple levels

For "global" questions about a corpus:

- Generate a partial answer from each community summary

- Reduce the partial answers to a single answer

For "local" questions:

- Standard graph + vector retrieval

Effective for "summarize the main themes in this corpus" use cases.
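
The global flow has a map-reduce shape, which the sketch below illustrates. It is not the GraphRAG library's API: `answer_global_question`, the prompt wording, and `fake_llm` (a stand-in that only records calls) are all hypothetical.

```python
from typing import Callable

def answer_global_question(
    question: str,
    community_summaries: list[str],
    llm: Callable[[str], str],
) -> str:
    """Map: one partial answer per community summary. Reduce: merge them."""
    partials = [
        llm(f"Question: {question}\nCommunity summary: {summary}\nPartial answer:")
        for summary in community_summaries
    ]
    joined = "\n".join(f"- {partial}" for partial in partials)
    return llm(f"Question: {question}\nPartial answers:\n{joined}\nFinal answer:")

# Stand-in for a real model call: records each prompt and returns a numbered reply
calls: list[str] = []

def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"response {len(calls)}"

answer = answer_global_question(
    "What are the major themes?",
    ["Community A covers graph storage.", "Community B covers evaluation."],
    fake_llm,
)
```

The number of LLM calls grows with the number of community summaries, so this kind of query is comparatively expensive.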

Evaluation

Hard. Standard RAG evaluation (precision/recall on retrieved passages) doesn't capture graph value.

Approaches

- Question typology: easy / multi-hop / aggregation

- Answer correctness on multi-hop questions

- Latency / cost comparison

- Coverage of graph relationships

Curate eval sets that test relational reasoning.
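
Breaking results down by question type can be as simple as the sketch below. The record format (`type` and `correct` fields) and the sample results are assumptions for illustration; how correctness is judged is a separate problem.

```python
from collections import defaultdict

def accuracy_by_type(results: list[dict]) -> dict[str, float]:
    """Fraction of correct answers per question type."""
    totals: dict[str, int] = defaultdict(int)
    correct: dict[str, int] = defaultdict(int)
    for record in results:
        totals[record["type"]] += 1
        correct[record["type"]] += int(record["correct"])
    return {qtype: correct[qtype] / totals[qtype] for qtype in totals}

# Invented graded results for one system
results = [
    {"type": "easy", "correct": True},
    {"type": "easy", "correct": True},
    {"type": "multi-hop", "correct": True},
    {"type": "multi-hop", "correct": False},
    {"type": "aggregation", "correct": False},
]
scores = accuracy_by_type(results)
```

Running the same eval set through standard RAG and graph RAG and comparing the per-type scores shows where the graph actually earns its cost.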

Common failure patterns

Graph quality issues

Bad entity extraction → useless graph.

Schema rigidity

Schema doesn't match evolving content.

Over-engineering

Graph RAG when standard RAG would suffice.

LLM-generated queries fail silently

Query returns empty; agent makes up answer.
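
One way to make this failure visible is to wrap the graph tool so that empty results and errors come back as explicit statuses rather than as a bare empty list. In this sketch, `run_query` is a hypothetical callable that executes a query against your graph store, and the status strings are arbitrary.

```python
from typing import Callable

def graph_tool(run_query: Callable[[str], list], query: str) -> dict:
    """Run a graph query and report ok / empty / error explicitly to the agent."""
    try:
        rows = run_query(query)
    except Exception as exc:  # e.g., a syntax error in an LLM-generated query
        return {"status": "error", "detail": str(exc), "rows": []}
    if not rows:
        return {
            "status": "empty",
            "detail": "Query ran but matched nothing; do not answer from memory.",
            "rows": [],
        }
    return {"status": "ok", "detail": "", "rows": rows}

def broken_backend(query: str) -> list:
    """Stand-in backend that always fails, to exercise the error path."""
    raise ValueError("syntax error near 'MATCH'")

ok_result = graph_tool(lambda q: [{"name": "Ada"}], "hypothetical query")
empty_result = graph_tool(lambda q: [], "hypothetical query")
error_result = graph_tool(broken_backend, "hypothetical query")
```

The agent prompt can then tell the model to retry, rephrase, or say it does not know when the status is anything other than `ok`.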

Update lag

Graph stale relative to documents.

Hallucinated entities

LLM extraction creates entities that aren't in source.
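
A rough first-line filter is to keep only entities whose name literally appears in the source text. The sketch below assumes exact, case-insensitive substring matching, which will also reject legitimate aliases and abbreviations, so rejected entities are better reviewed than silently dropped.

```python
def grounded_entities(entities: list[str], source_text: str) -> tuple[list[str], list[str]]:
    """Split entity names into (found in source, not found in source)."""
    normalized = " ".join(source_text.lower().split())  # lowercase, collapse whitespace
    kept, dropped = [], []
    for entity in entities:
        needle = " ".join(entity.lower().split())
        (kept if needle and needle in normalized else dropped).append(entity)
    return kept, dropped

source = "Ada Lovelace worked with Charles Babbage on the Analytical Engine."
extracted = ["Ada Lovelace", "Analytical  Engine", "Alan Turing"]  # last one is not in the source
kept, dropped = grounded_entities(extracted, source)
```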

Decision framework

Build graph RAG when:

- Standard RAG measurably fails on important question types

- Your domain has clear entity/relationship structure

- You can invest in graph maintenance

- Multi-hop reasoning is core to use cases

Skip graph RAG when:

- Standard RAG works

- Domain is unstructured

- Maintenance cost outweighs benefit

- Team can't support two retrieval systems

Future direction

Graph RAG is evolving:

- Better automated schema discovery

- LLM-native graph reasoning

- Better evaluation methodologies

- Tooling maturing (LangChain, LlamaIndex graph integrations)

It's promising for complex domains; not always needed.

Further Reading

- [AgentPromptEngineering](AgentPromptEngineering) — Agent patterns

- [TransformerArchitecture](TransformerArchitecture) — LLM foundation

- [Generative AI Hub](GenerativeAIHub) — Cluster index