Knowledge Graph vs Relational Database

The "we need a knowledge graph" decision often defaults to "let's deploy Neo4j" without examining whether the data and queries actually warrant a graph database. Most knowledge-graph use cases work fine in Postgres with the right schema. Some genuinely don't.

This page covers the decision criteria, the trade-offs, and the hybrid pattern that most mature systems converge on.

What a knowledge graph actually is

A KG stores entities (nodes) and typed relationships (edges) between them, with the relationships as first-class objects you can query directly.

(Anthropic) -[:founded_in]-> (San Francisco)
(Dario Amodei) -[:ceo_of]-> (Anthropic)
(Anthropic) -[:produces]-> (Claude)
(Anthropic) -[:competitor_of]-> (OpenAI)

Queries traverse: "what companies does Dario lead?" "who does Anthropic compete with?" "what are the products of companies founded in SF that compete with OpenAI?"

The shape that benefits: queries that follow many edges, where the chain of joins isn't known in advance and the question is about how things connect rather than what a single row contains.
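To make "queries traverse" concrete, here is a minimal sketch that holds the example triples above in memory and answers two of the questions by following edges. The `follow` and `follow_back` helpers are hypothetical names for illustration, not any graph library's API.

```python
from collections import defaultdict

# Toy edge list mirroring the example triples: (source, relation, target).
EDGES = [
    ("Anthropic", "founded_in", "San Francisco"),
    ("Dario Amodei", "ceo_of", "Anthropic"),
    ("Anthropic", "produces", "Claude"),
    ("Anthropic", "competitor_of", "OpenAI"),
]

# Index edges in both directions so a traversal can start from either end.
outgoing = defaultdict(list)  # source -> [(relation, target)]
incoming = defaultdict(list)  # target -> [(relation, source)]
for src, rel, dst in EDGES:
    outgoing[src].append((rel, dst))
    incoming[dst].append((rel, src))


def follow(node, relation):
    """Targets reachable from `node` over one outgoing `relation` edge."""
    return [dst for rel, dst in outgoing[node] if rel == relation]


def follow_back(node, relation):
    """Sources that point at `node` over one `relation` edge."""
    return [src for rel, src in incoming[node] if rel == relation]


# "What companies does Dario lead?" (one hop)
led_by_dario = follow("Dario Amodei", "ceo_of")

# "Products of companies founded in SF that compete with OpenAI" (three hops)
products = [
    product
    for company in follow_back("San Francisco", "founded_in")
    if "OpenAI" in follow(company, "competitor_of")
    for product in follow(company, "produces")
]
```

Each extra hop is one more `follow` call. In a relational schema the same question is one more JOIN, which is the comparison the rest of this page turns on.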

When a relational database is enough

You probably don't need a KG when:

Your relationships are stable. Each User has exactly one Account; each Order has many OrderLines. Foreign keys handle this.
Queries follow at most 2-3 hops. A relational JOIN does this fine.
You don't need to traverse "any edge." Specific known traversals fit a relational schema.
You want SQL. ACID, SQL ecosystem, mature tooling, your team already knows it.

Most CRUD systems, e-commerce, SaaS dashboards — these work in Postgres without a KG. A graph database adds operational complexity without providing usable benefit.
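As a rough illustration of the "2-3 hops along stable relationships" case, the sketch below walks user -> orders -> order lines with plain foreign keys. It uses Python's built-in `sqlite3` so it runs anywhere; the table names, columns, and rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         user_id INTEGER NOT NULL REFERENCES users(id));
    CREATE TABLE order_lines (id INTEGER PRIMARY KEY,
                              order_id INTEGER NOT NULL REFERENCES orders(id),
                              sku TEXT NOT NULL);

    INSERT INTO users VALUES (1, 'ada'), (2, 'grace');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO order_lines VALUES
        (100, 10, 'book'), (101, 10, 'pen'), (102, 11, 'book'), (103, 12, 'lamp');
""")

# Two hops along foreign keys: user -> orders -> order lines.
skus_for_ada = [
    row[0]
    for row in conn.execute("""
        SELECT DISTINCT ol.sku
        FROM users u
        JOIN orders o       ON o.user_id = u.id
        JOIN order_lines ol ON ol.order_id = o.id
        WHERE u.name = ?
        ORDER BY ol.sku
    """, ("ada",))
]
```

The traversal is fixed at schema-design time, so two JOINs express it completely. Nothing here would get simpler in a graph database.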

When a KG fits

You probably benefit from a KG when:

The relationships are themselves the data. Citation networks, social graphs, biological pathways, fraud rings — the structure is the point.
Queries are deeply traversal-heavy. "Find all paths between A and B with at most 5 hops." Doable in SQL but ugly.
The entity types and relationship types proliferate. Schema-flexible queries help.
You want pattern matching across relationships. "Find any person who reports to someone who reports to someone" — Cypher / Gremlin handle this naturally.
Provenance / context per edge matters. Edges carry their own metadata, sources, confidence.

These are the cases where a graph model earns its keep. Spotting them requires understanding both your data and your queries.

The graph-on-relational pattern

Most KGs in 2026 don't use a graph database. They use Postgres with a graph schema:

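-- Entities. (type, name) is the natural key; everything else goes in properties.
CREATE TABLE nodes (
    id BIGSERIAL PRIMARY KEY,
    type TEXT NOT NULL,
    name TEXT NOT NULL,
    properties JSONB,
    UNIQUE (type, name)
);

-- Typed, directed relationships, each carrying its own metadata.
CREATE TABLE edges (
    id BIGSERIAL PRIMARY KEY,
    source_id BIGINT NOT NULL REFERENCES nodes(id),
    target_id BIGINT NOT NULL REFERENCES nodes(id),
    relation TEXT NOT NULL,
    properties JSONB,
    confidence REAL,  -- how much the assertion is trusted
    source TEXT       -- provenance: where the assertion came from
);

-- One index per direction, so a traversal can start from either end of an edge.
CREATE INDEX ON edges (source_id, relation);
CREATE INDEX ON edges (target_id, relation);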

For 1-3 hop queries, this is fast. Recursive CTEs handle deeper traversals. The whole stack is Postgres; you have transactions, joins with non-graph tables, and the operational simplicity of one database.
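Here is a sketch of what a recursive-CTE traversal over this schema can look like. It runs on SQLite through Python's `sqlite3` so it is self-contained; the schema is trimmed (no JSONB, `INTEGER PRIMARY KEY` instead of `BIGSERIAL`), and the `reports_to` data, the `chain` CTE name, and `MAX_HOPS` are assumptions made for the example. The `WITH RECURSIVE` shape carries over to Postgres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- SQLite stand-ins for the Postgres schema; metadata columns omitted.
    CREATE TABLE nodes (
        id   INTEGER PRIMARY KEY,
        type TEXT NOT NULL,
        name TEXT NOT NULL,
        UNIQUE (type, name)
    );
    CREATE TABLE edges (
        id        INTEGER PRIMARY KEY,
        source_id INTEGER NOT NULL REFERENCES nodes(id),
        target_id INTEGER NOT NULL REFERENCES nodes(id),
        relation  TEXT NOT NULL
    );
    CREATE INDEX edges_out ON edges (source_id, relation);

    -- Hypothetical reporting chain: dana -> cy -> bo -> al.
    INSERT INTO nodes (id, type, name) VALUES
        (1, 'person', 'al'), (2, 'person', 'bo'),
        (3, 'person', 'cy'), (4, 'person', 'dana');
    INSERT INTO edges (source_id, target_id, relation) VALUES
        (2, 1, 'reports_to'), (3, 2, 'reports_to'), (4, 3, 'reports_to');
""")

MAX_HOPS = 5  # the depth cap bounds the recursion even if the data has cycles

rows = conn.execute("""
    WITH RECURSIVE chain(node_id, hops) AS (
        -- Anchor: the starting node, zero hops away from itself.
        SELECT id, 0 FROM nodes WHERE name = :start
        UNION
        -- Step: follow one more reports_to edge from every node found so far.
        SELECT e.target_id, c.hops + 1
        FROM chain c
        JOIN edges e ON e.source_id = c.node_id
        WHERE e.relation = 'reports_to' AND c.hops < :max_hops
    )
    SELECT n.name, MIN(c.hops) AS min_hops
    FROM chain c
    JOIN nodes n ON n.id = c.node_id
    WHERE c.hops > 0
    GROUP BY n.id, n.name
    ORDER BY min_hops, n.name
""", {"start": "dana", "max_hops": MAX_HOPS}).fetchall()
```

`rows` holds everyone above dana in the chain with their distance. It works, but the anchor/step/cap boilerplate is what people mean when they call deep traversals in SQL ugly.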

This is what most production "knowledge graphs" actually are. Calling it a KG and serving it from Postgres is fine.

When you need an actual graph database

For specific cases:

Heavy graph algorithms. Centrality, shortest path on million-node graphs, community detection. Neo4j Graph Data Science library wins on these.
Complex pattern matching. Cypher's MATCH (a)-[*..5]-(b) is genuinely simpler than the SQL recursive CTE equivalent.
Very high traversal depth. 10+ hop traversals. Graph databases are tuned for this.
Schema fluidity at extreme scale. Adding new entity types and relationships dynamically without migrations.

For these, Neo4j, JanusGraph, TigerGraph, or AgensGraph make sense.

For most other cases, Postgres + graph schema wins on operational simplicity.
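For a sense of what the "graph algorithms" class of workload involves, here is a minimal breadth-first shortest path in plain Python. It is only a sketch over a hypothetical adjacency dict, not how any graph database implements it; at a million nodes this is the kind of work you would hand to a dedicated engine.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Return one fewest-hops path from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    parents = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour in parents:
                continue
            parents[neighbour] = node
            if neighbour == goal:
                # Walk the parent pointers back to the start, then reverse.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical directed graph for the example.
GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": ["f"],
    "e": ["f"],
    "f": [],
}
path = shortest_path(GRAPH, "a", "f")
```

The algorithm touches neighbours of neighbours repeatedly. With a nodes/edges table each of those steps is an index lookup, which is why hop-heavy algorithms are where a native graph store tends to pull ahead.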

Triple stores (RDF)

A different KG flavour: triples (subject-predicate-object) with formal semantics, built on the RDF data model, with OWL for ontologies and SPARQL for querying. Stronger reasoning capabilities; useful for ontology-heavy domains (life sciences, library science, semantic web).

Less common in industry; most "knowledge graph" projects in 2026 use property graphs (Neo4j-style) or relational implementations.
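The triple model itself is small enough to sketch. The snippet below stores subject-predicate-object tuples and matches them against a pattern with wildcards, which is roughly the idea behind a single SPARQL triple pattern. It is an illustration only: the `ex:` identifiers and the `match` helper are made up, and nothing here implements RDF semantics, OWL reasoning, or SPARQL itself.

```python
# Hypothetical triples; "ex:" stands in for a namespace prefix.
TRIPLES = {
    ("ex:Anthropic", "ex:foundedIn", "ex:SanFrancisco"),
    ("ex:DarioAmodei", "ex:ceoOf", "ex:Anthropic"),
    ("ex:Anthropic", "ex:produces", "ex:Claude"),
    ("ex:Anthropic", "ex:competitorOf", "ex:OpenAI"),
}


def match(subject=None, predicate=None, obj=None):
    """Return triples matching the pattern, sorted; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Everything asserted about one subject.
about_anthropic = match(subject="ex:Anthropic")

# Every use of one predicate, whatever the subject and object.
ceos = match(predicate="ex:ceoOf")
```

Note that the predicate is just another value you can leave open. In the nodes/edges schema that corresponds to not filtering on `relation`.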

Hybrid: Postgres with graph + relational + vector

The pattern most mature production knowledge bases land on:

Postgres with extensions:
  - Relational tables for structured data (users, accounts, orders).
  - nodes / edges tables for the graph layer.
  - pgvector for embeddings (semantic search).
  - JSONB for flexible properties.

Single substrate; transactional consistency; one ops story.

Queries cross layers:

-- Users interested in "AI", with a count of their orders from the last 30 days
SELECT u.name, COUNT(o.id) AS recent_orders
FROM users u
JOIN edges e ON e.source_id = u.kg_node_id
JOIN nodes n ON n.id = e.target_id
-- Filter on date in the ON clause so users with no recent orders
-- still appear, with a count of 0.
LEFT JOIN orders o ON o.user_id = u.id
  AND o.created_at > NOW() - INTERVAL '30 days'
WHERE n.name = 'AI' AND e.relation = 'interested_in'
GROUP BY u.id, u.name;

This is the pattern behind the wiki you're reading: the Wikantik knowledge graph is built on Postgres + pgvector + a graph schema. It works.
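A scaled-down, runnable variant of a cross-layer query like the one above, using SQLite through Python's `sqlite3`. The rows are invented, dates are stored as text, and SQLite's `datetime('now', ...)` stands in for `NOW() - INTERVAL`. The date filter sits in the `ON` clause so that a user with no recent orders is still returned with a count of zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, type TEXT NOT NULL, name TEXT NOT NULL);
    CREATE TABLE edges (id INTEGER PRIMARY KEY,
                        source_id INTEGER NOT NULL REFERENCES nodes(id),
                        target_id INTEGER NOT NULL REFERENCES nodes(id),
                        relation TEXT NOT NULL);
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                        kg_node_id INTEGER REFERENCES nodes(id));
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         user_id INTEGER NOT NULL REFERENCES users(id),
                         created_at TEXT NOT NULL);

    INSERT INTO nodes VALUES
        (1, 'topic', 'AI'), (2, 'person', 'ada'), (3, 'person', 'grace');
    INSERT INTO edges (source_id, target_id, relation) VALUES
        (2, 1, 'interested_in'), (3, 1, 'interested_in');
    INSERT INTO users VALUES (1, 'ada', 2), (2, 'grace', 3);
    -- ada has one recent and one old order; grace has none.
    INSERT INTO orders (user_id, created_at) VALUES
        (1, datetime('now', '-2 days')),
        (1, datetime('now', '-90 days'));
""")

rows = conn.execute("""
    SELECT u.name, COUNT(o.id) AS recent_orders
    FROM users u
    JOIN edges e ON e.source_id = u.kg_node_id   -- relational -> graph layer
    JOIN nodes n ON n.id = e.target_id
    LEFT JOIN orders o                           -- graph -> relational layer
           ON o.user_id = u.id
          AND o.created_at > datetime('now', '-30 days')
    WHERE n.name = 'AI' AND e.relation = 'interested_in'
    GROUP BY u.id, u.name
    ORDER BY u.name
""").fetchall()
```

Both users come back: ada with one recent order, grace with zero. The graph lookup and the order count happen in one statement and one transaction, which is the point of keeping the layers in a single database.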

What KGs add to RAG

For retrieval-augmented generation:

Entity-aware retrieval. "Tell me about Anthropic" returns paragraphs about Anthropic, not paragraphs that merely mention Anthropic.
Multi-hop retrieval. "Companies competing with companies in SF" — KG traversal can construct the candidate set.
Constraint enforcement. "Documents about Anthropic and RAG, after 2023" — structured metadata + relation lookup.

Pure vector RAG doesn't do these well. KG-augmented RAG ("GraphRAG") fills the gap. Microsoft's GraphRAG project popularised the approach; many production systems now combine KG and vector retrieval.
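One way to picture the combination, as a sketch under toy assumptions: a graph traversal builds the set of candidate entities, then vector similarity ranks only the chunks about those entities. The edges, the chunk records, and the 2-d "embeddings" are invented for the example; this is not Microsoft's GraphRAG pipeline, just the general traverse-then-rank idea.

```python
import math

# Hypothetical KG edges: (source, relation, target).
EDGES = [
    ("Anthropic", "founded_in", "San Francisco"),
    ("OpenAI", "founded_in", "San Francisco"),
    ("Anthropic", "competitor_of", "OpenAI"),
    ("Mistral", "competitor_of", "OpenAI"),
    ("Mistral", "founded_in", "Paris"),
]

# Hypothetical chunks: the entity each one is about, plus a toy 2-d embedding.
CHUNKS = [
    {"id": "c1", "entity": "Anthropic", "vec": (0.9, 0.1)},
    {"id": "c2", "entity": "Mistral", "vec": (0.8, 0.2)},
    {"id": "c3", "entity": "OpenAI", "vec": (0.7, 0.7)},
    {"id": "c4", "entity": "Anthropic", "vec": (0.1, 0.9)},
]


def cosine(a, b):
    """Cosine similarity of two non-zero 2-d vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def companies_competing_with_companies_in(city):
    """Two hops: companies founded in `city`, then whoever competes with them."""
    located = {s for s, r, t in EDGES if r == "founded_in" and t == city}
    return {s for s, r, t in EDGES if r == "competitor_of" and t in located}


def retrieve(query_vec, city, k=2):
    """Scope chunks to the traversal's candidate entities, then rank by similarity."""
    candidates = companies_competing_with_companies_in(city)
    scoped = [c for c in CHUNKS if c["entity"] in candidates]
    scoped.sort(key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["id"] for c in scoped[:k]]


top = retrieve(query_vec=(1.0, 0.0), city="San Francisco")
```

The OpenAI chunk never reaches the ranking step, however similar its vector is, because the traversal did not nominate that entity. That scoping is what pure vector search cannot express.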

See KnowledgeGraphCompletion for the construction side; RagImplementationPatterns for retrieval.

Practical decision criteria

For a new project considering a KG:

1. Sketch the queries. What questions will you ask?
2. For each query, write the SQL (assuming relational + JSONB). Is it ugly?
3. For each query, write the Cypher (assuming graph DB). Is it materially better?
4. If queries are simple in SQL, use Postgres. Add a graph schema if you need some graph-shaped questions.
5. If queries are genuinely graph-shaped and deep, evaluate Neo4j vs graph-on-Postgres at your data scale.

Most projects stop at step 4. The minority needing step 5's graph DB are the genuinely graph-heavy use cases.

Failure modes

Premature graph adoption. Adopt Neo4j; spend months ramping up Cypher; realize relational was fine.
Schemaless graph chaos. "We don't need a schema!" → 50 different "Person" node types with overlapping properties. Validate at write time even in graph stores.
Forgetting transactions. Some graph databases have weaker transactional guarantees than Postgres. Verify; design accordingly.
Vector store bolted on awkwardly. Storing graph in Neo4j and embeddings in Pinecone produces sync nightmares. Co-locate when you can.
Query language proliferation. The app speaks SPARQL, Cypher, Gremlin, SQL, and the Elasticsearch query DSL. Pick a few; resist the urge to add another.
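On the "validate at write time" point, a minimal sketch of what that can mean: declare which (source type, relation, target type) combinations are legal and reject everything else before it is stored. `ALLOWED_EDGES`, `SchemaError`, and `ValidatedGraph` are hypothetical names; in a real system the same check might live in a database constraint, a trigger, or the ingestion code.

```python
# Hypothetical schema: the only (source type, relation, target type) triples allowed.
ALLOWED_EDGES = {
    ("person", "ceo_of", "company"),
    ("company", "produces", "product"),
    ("company", "competitor_of", "company"),
}
# Node types are whatever the allowed edges mention.
NODE_TYPES = {t for s, _, o in ALLOWED_EDGES for t in (s, o)}


class SchemaError(ValueError):
    """Raised when a write does not conform to the declared graph schema."""


class ValidatedGraph:
    def __init__(self):
        self.nodes = {}  # name -> type
        self.edges = []  # (source, relation, target)

    def add_node(self, name, node_type):
        if node_type not in NODE_TYPES:
            raise SchemaError(f"unknown node type: {node_type!r}")
        existing = self.nodes.setdefault(name, node_type)
        if existing != node_type:
            raise SchemaError(f"{name!r} already exists as {existing!r}")

    def add_edge(self, source, relation, target):
        for name in (source, target):
            if name not in self.nodes:
                raise SchemaError(f"unknown node: {name!r}")
        signature = (self.nodes[source], relation, self.nodes[target])
        if signature not in ALLOWED_EDGES:
            raise SchemaError(f"edge not allowed by schema: {signature}")
        self.edges.append((source, relation, target))


graph = ValidatedGraph()
graph.add_node("Dario Amodei", "person")
graph.add_node("Anthropic", "company")
graph.add_edge("Dario Amodei", "ceo_of", "Anthropic")

try:
    graph.add_edge("Anthropic", "ceo_of", "Dario Amodei")  # wrong direction
    rejected = False
except SchemaError:
    rejected = True
```

The reversed edge is refused and never reaches storage. A "Person" node with a capital P would be refused the same way, which is how you avoid ending up with fifty spellings of one type.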

A pragmatic recommendation

For most teams in 2026:

Start with Postgres + pgvector. Add a graph schema (nodes + edges tables) only when you have queries that benefit.
Use a graph DB only when you've hit specific limits with the relational approach (deep traversals, graph algorithms at scale, complex pattern matching).
Don't conflate "we have related data" with "we need a graph database." Foreign keys are also graphs.

This is conservative advice; deviate when you have a specific reason.