Ontology Design Patterns

An ontology is a formal description of the kinds of things in a domain and the relationships among them. Ontology design patterns are reusable templates for common modelling situations — taxonomies, parts-of-things, time-stamped facts, n-ary relationships.

Most knowledge-graph projects in 2026 don't use formal ontologies (OWL, RDFS) directly. They use lighter property-graph schemas with informal modelling guidelines. But the patterns from the formal-ontology tradition still inform good schema design.

When formal ontologies are worth it

Specific cases:

- **Long-lived, broadly-shared knowledge** — biomedical (UMLS, SNOMED), library science (Dublin Core), legal (LKIF). Multi-decade lifespan; cross-organisation use.

- **Strong reasoning requirements** — entailment ("if A is a B and B is a C, then A is a C") computed by an ontology reasoner. Useful in some research and compliance contexts.

- **Cross-organisation interoperability** — linked data, semantic web. Standardised vocabularies enable integration.

- **Regulatory or compliance frameworks** — pharma, financial services where formal definitions matter.

Most enterprise knowledge graphs don't fit these cases. They benefit from ontology-style thinking without ontology-style formalism.

Common patterns

Class hierarchy (taxonomy)

The simplest pattern: types organised in an is-a hierarchy.

```
Vehicle
├── Car
│   ├── Sedan
│   ├── SUV
│   └── Hatchback
├── Truck
└── Motorcycle
```

In RDFS / OWL, this is expressed with `rdfs:subClassOf`. In a property graph, use multiple labels on a node or an explicit `parent_class` relationship between type nodes.

Patterns to follow:

- **Single inheritance is simpler**; multiple inheritance is sometimes necessary but creates ambiguity.

- **Don't over-classify**. A 6-level deep tree is harder to maintain and query than 2-3 levels combined with finer-grained relations.

- **Keep types orthogonal where possible**. A vehicle might be `(:Car {fuel:'electric'})` rather than `:ElectricCar` if "electric" is a property.
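As a rough illustration of a single-inheritance hierarchy, the sketch below stores each type's parent in a plain dictionary and walks upward to answer is-a questions. The `PARENT` table and the helper names are hypothetical, chosen to mirror the vehicle tree above; a real graph store would hold this as labels or `parent_class` edges.

```python
# Hypothetical type table: each type maps to its single parent (None for the root).
PARENT = {
    "Vehicle": None,
    "Car": "Vehicle",
    "Truck": "Vehicle",
    "Motorcycle": "Vehicle",
    "Sedan": "Car",
    "SUV": "Car",
    "Hatchback": "Car",
}


def ancestors(type_name):
    """Return the ancestors of a type, from its direct parent up to the root."""
    chain = []
    current = PARENT[type_name]
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def is_a(type_name, candidate):
    """True if type_name is the candidate type or inherits from it."""
    return type_name == candidate or candidate in ancestors(type_name)
```

With single inheritance the upward walk is unambiguous. Allowing several parents per type would turn the chain into a set of paths, which is where the ambiguity mentioned above comes from.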

Part-of (mereology)

Modelling parts of things:

```
Engine -[:PART_OF]-> Car
Wheel -[:PART_OF]-> Car
Cylinder -[:PART_OF]-> Engine
```

Subtleties:

- **Composition vs aggregation.** Composition: the part can't exist without the whole (an engine in a specific car). Aggregation: the part exists independently and may be shared (an off-the-shelf bolt).

- **Transitive part-of.** A cylinder is part of an engine, which is part of a car; queries for "what's part of this car" should follow the relation transitively. Most graph DBs handle this via variable-length traversal.

- **Part-of vs has-property.** "The car has a colour" is property; "the car has wheels" is part-of.
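The transitive case can be sketched in plain Python as a graph walk over `PART_OF` pairs. This is only a stand-in for the variable-length traversal a graph database would run; the `PART_OF` list and the `all_parts` helper are hypothetical names for illustration.

```python
from collections import defaultdict

# Hypothetical (part, whole) pairs mirroring the PART_OF edges above.
PART_OF = [
    ("Engine", "Car"),
    ("Wheel", "Car"),
    ("Cylinder", "Engine"),
]


def all_parts(whole, edges=PART_OF):
    """Return every direct and indirect part of `whole`."""
    # Index the edges by whole so each lookup is a dictionary access.
    children = defaultdict(list)
    for part, parent in edges:
        children[parent].append(part)

    found = []
    seen = {whole}
    stack = [whole]
    while stack:
        node = stack.pop()
        for part in children.get(node, []):
            if part not in seen:  # guards against accidental cycles
                seen.add(part)
                found.append(part)
                stack.append(part)
    return found
```

The `seen` set matters in practice: ingested part-of data occasionally contains cycles, and an unguarded traversal would never terminate on them.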

Membership (taxonomic)

Things belonging to groups, roles, categories:

```
Alice -[:MEMBER_OF]-> EngineeringTeam
Alice -[:HAS_ROLE]-> Manager
Alice -[:WORKS_FOR]-> Anthropic
```

Pattern: use distinct relationship types for distinct membership concepts. Don't overload `MEMBER_OF` to mean both "is in this team" and "has this role."

Time-indexed facts

Most facts are true at a particular time. The CEO of a company changes; the price of a product changes; a relationship is established and ended.

Three approaches:

Reified relationships (common in RDF)

A relationship becomes its own node:

```
(Dario)-[:HOLDS_POSITION]->(position_1)
(position_1:Position {role:'CEO', company:'Anthropic', from:2021})
```

Pros: time, source, confidence all attached cleanly.

Cons: more nodes; queries are more verbose.

Edge properties (property graphs)

Properties on the edge:

```
(Dario)-[:CEO_OF {since:2021, until:NULL, source:'web'}]->(Anthropic)
```

Pros: less verbose; queries simpler.

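Cons: limited support for time-querying patterns. Time-range queries have to filter on `since` / `until` by hand, and in the standard property-graph model an edge cannot itself be the endpoint of another edge, so anything richer than flat properties has no place to attach.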

Effective-dated rows (relational)

Each fact gets a separate row with `valid_from` / `valid_until`. See [DatabaseDesign]().

For most modern KGs, edge properties suffice. Reified relationships are formally cleaner but heavier.
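A minimal sketch of the edge-property approach, assuming half-open validity intervals (`since` inclusive, `until` exclusive, `None` meaning still valid). The `Edge` class, the helper names, and the sample people and company are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Edge:
    source: str
    rel: str
    target: str
    since: int
    until: Optional[int] = None  # None means the fact is still valid


def holds_at(edge, year):
    """True if the edge is valid in `year` (since inclusive, until exclusive)."""
    return edge.since <= year and (edge.until is None or year < edge.until)


def sources_at(edges, rel, target, year):
    """Return the source nodes linked to `target` by `rel` in the given year."""
    return [
        e.source
        for e in edges
        if e.rel == rel and e.target == target and holds_at(e, year)
    ]


# Hypothetical data: one company, two successive CEOs.
EDGES = [
    Edge("alice", "CEO_OF", "ExampleCorp", since=2015, until=2021),
    Edge("bob", "CEO_OF", "ExampleCorp", since=2021),
]
```

Here `sources_at(EDGES, "CEO_OF", "ExampleCorp", 2018)` selects only the first edge. The half-open convention is a design choice made for this sketch: it keeps a handover year from matching both edges.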

N-ary relationships

A relationship that involves more than two things. Take an illustrative sentence (the modelling matters here, not the facts):

```
"In 2021, Anthropic, with Series A funding from Google, founded its San Francisco office."
```

Three entities (Anthropic, Google, the San Francisco office), a funding relation, a year (2021), and an event (founding the office).

Modelling options:

- **A relationship node** that connects all the participants:

```
(event_1) -[:HAPPENED_IN]-> (2021)
(event_1) -[:INVOLVES]-> (Anthropic)
(event_1) -[:FUNDED_BY]-> (Google)
(event_1) -[:ESTABLISHED]-> (SF_office)
```

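- **Multiple binary relationships with shared metadata**: each pair of participants gets its own edge, and every edge repeats the same year and event details. This clutters the graph and makes the event harder to query as a unit.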

Reified n-ary relationships are how RDF/OWL handle this; property graphs increasingly adopt the same pattern.
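The relationship-node option can be sketched with plain triples, where the event is an ordinary node and every participant hangs off it by a role-named edge. The triple list reuses the illustrative identifiers from the example above; `describe` and `events_involving` are hypothetical helper names.

```python
from collections import defaultdict

# Triples of (subject, relation, object); the event node is the hub.
TRIPLES = [
    ("event_1", "HAPPENED_IN", "2021"),
    ("event_1", "INVOLVES", "Anthropic"),
    ("event_1", "FUNDED_BY", "Google"),
    ("event_1", "ESTABLISHED", "SF_office"),
]


def describe(event, triples=TRIPLES):
    """Collect every participant of an event, grouped by role."""
    roles = defaultdict(list)
    for subject, relation, obj in triples:
        if subject == event:
            roles[relation].append(obj)
    return dict(roles)


def events_involving(entity, triples=TRIPLES):
    """Return the event nodes that point at `entity` in any role."""
    return sorted({subject for subject, _, obj in triples if obj == entity})
```

Because everything about the event is reachable from one node, adding a fifth participant or a provenance field is a single extra triple rather than a change to several binary edges.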

Provenance

Where did this fact come from? Critical for any KG that ingests from multiple sources.

Patterns:

- **Provenance on edges**: `since`, `source`, `confidence`, `extraction_method`, `extracted_at`.

- **Provenance graph**: a separate graph layer linking facts to their sources, witnesses, derivation chains.

For agentic / RAG use cases, provenance is non-negotiable. Without it, you can't tell "the model said this from training data" from "the KG said this from a verified source."
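As one possible shape for provenance on edges, the sketch below attaches the fields listed above to each fact and filters on them at query time. The `Fact` class, the helper functions, the default threshold, and the sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    rel: str
    obj: str
    source: str
    confidence: float
    extraction_method: str
    extracted_at: str  # ISO 8601 date string


def trusted(facts, allowed_sources, min_confidence=0.8):
    """Keep only facts from allowed sources at or above the confidence floor."""
    return [
        f
        for f in facts
        if f.source in allowed_sources and f.confidence >= min_confidence
    ]


def sources_for(facts, subject, rel, obj):
    """List the distinct sources that assert a given statement."""
    return sorted(
        {f.source for f in facts if (f.subject, f.rel, f.obj) == (subject, rel, obj)}
    )


# Hypothetical facts: the same statement from two sources, plus a conflicting one.
FACTS = [
    Fact("acme", "HEADQUARTERED_IN", "Berlin", "company_registry", 0.99,
         "structured_import", "2025-01-10"),
    Fact("acme", "HEADQUARTERED_IN", "Berlin", "news_crawl", 0.70,
         "llm_extraction", "2025-02-03"),
    Fact("acme", "HEADQUARTERED_IN", "Munich", "forum_post", 0.35,
         "llm_extraction", "2025-02-04"),
]
```

Keeping one record per (statement, source) pair, rather than merging on ingestion, is what lets conflicting claims be compared later instead of silently overwritten.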

Identity and equivalence

Same entity in different sources, or the same entity referred to differently:

- **`owl:sameAs`** in RDF: declares two URIs refer to the same thing.

- **Entity resolution table** in property graphs: maps source-specific IDs to canonical IDs.

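This is also where the open-world versus closed-world assumption matters. Open world: the absence of a fact doesn't mean it's false, only that it's unknown. Closed world: anything not stated is treated as false. RDF/OWL knowledge graphs typically assume an open world; SQL databases assume a closed world. Mixing the two without noticing produces bugs, for example treating a missing `owl:sameAs` link as proof that two records are different entities.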
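A small sketch of an entity-resolution table that also respects the open-world reading: when either record is unmapped, the comparison returns `None` (unknown) rather than `False`. The table contents, source names, and function names are hypothetical.

```python
# Hypothetical resolution table: (source system, source-specific ID) -> canonical ID.
CANONICAL = {
    ("crm", "C-1042"): "org:acme",
    ("news", "Acme Corp."): "org:acme",
    ("crm", "C-2000"): "org:globex",
}


def resolve(source, local_id):
    """Return the canonical ID for a source record, or None if it is unmapped."""
    return CANONICAL.get((source, local_id))


def same_entity(record_a, record_b):
    """Compare two (source, local_id) records.

    Returns True or False when both are mapped, and None when either is
    unmapped: under an open-world reading, a missing mapping means
    "unknown", not "different".
    """
    canonical_a = resolve(*record_a)
    canonical_b = resolve(*record_b)
    if canonical_a is None or canonical_b is None:
        return None
    return canonical_a == canonical_b
```

Callers then have to handle three outcomes explicitly. Collapsing `None` into `False` is the closed-world shortcut, and it is safe only when the table is known to be complete.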

Lightweight alternative patterns

For most teams in 2026 building a KG, the formal-ontology toolkit (OWL, RDF, SPARQL) is overkill. Lighter alternatives:

Schema as documentation

Document your KG's vocabulary in a wiki or schema-as-code (a Markdown file, a YAML schema, dbt docs).

```yaml
node_types:
  Person:
    description: A natural person
    properties: [name, email, birth_year]
  Company:
    description: A legal entity
    properties: [name, founded_year, headquarters]
edge_types:
  WORKS_AT:
    description: Employment
    source: Person
    target: Company
    properties: [role, start_date, end_date]
```

The schema is enforced by code at ingestion. No reasoner is required; the constraints are concrete.

Schema validation

For structured KGs, validate insertions against the schema:

- Allowed node types and properties.

- Allowed edge types and source/target combinations.

- Property type constraints.

Tools: `pydantic` for Python; JSON Schema; custom validators. Reject malformed data at insertion.
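The three checks above can be sketched with the standard library alone. This is a plain-Python stand-in, not the `pydantic` or JSON Schema API. The `SCHEMA` dictionary mirrors the node and edge types from the YAML example, but the property types in it are assumptions added for illustration.

```python
# Hypothetical schema: property names follow the YAML example; types are assumed.
SCHEMA = {
    "node_types": {
        "Person": {"name": str, "email": str, "birth_year": int},
        "Company": {"name": str, "founded_year": int, "headquarters": str},
    },
    "edge_types": {
        "WORKS_AT": {
            "source": "Person",
            "target": "Company",
            "properties": {"role": str, "start_date": str, "end_date": str},
        },
    },
}


def check_props(props, allowed):
    """Return a list of errors for unknown or wrongly typed properties."""
    errors = []
    for key, value in props.items():
        if key not in allowed:
            errors.append(f"unknown property: {key}")
        elif not isinstance(value, allowed[key]):
            errors.append(f"{key}: expected {allowed[key].__name__}")
    return errors


def validate_node(label, props):
    """Validate a node's label and properties; an empty list means valid."""
    allowed = SCHEMA["node_types"].get(label)
    if allowed is None:
        return [f"unknown node type: {label}"]
    return check_props(props, allowed)


def validate_edge(rel, source_label, target_label, props):
    """Validate an edge's type, endpoint labels, and properties."""
    spec = SCHEMA["edge_types"].get(rel)
    if spec is None:
        return [f"unknown edge type: {rel}"]
    errors = []
    if source_label != spec["source"]:
        errors.append(f"source must be {spec['source']}, got {source_label}")
    if target_label != spec["target"]:
        errors.append(f"target must be {spec['target']}, got {target_label}")
    return errors + check_props(props, spec["properties"])
```

Returning a list of errors instead of raising on the first one tends to suit batch ingestion, where a rejected record is easier to fix if all of its problems are reported together.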

Light formal vocabulary

If you need some formal-ontology benefits without full RDF/OWL:

- Use SKOS (Simple Knowledge Organization System) for taxonomies.

- Adopt schema.org vocabulary for common entity types.

- Use Wikidata IDs as a cross-reference for well-known entities.

This gives interoperability and shared vocabulary without committing to the full semantic-web stack.

Anti-patterns

- **Over-engineered class hierarchies.** 12 levels of `Thing → Object → ... → SmartphoneCase`. The deeper the hierarchy, the less it helps.

- **Properties masquerading as classes.** "RedCar" as a class instead of "Car with colour=red."

- **No provenance.** Facts pour in from everywhere; no record of where; debugging is impossible.

- **Mixing time-varying and time-invariant facts** without distinguishing. The CEO is time-varying; the founding year isn't.

- **Trying to model the world.** Your domain is bounded; resist the urge to import every related concept.

Pragmatic recommendations

For a new KG project:

1. **Start with a small, concrete schema.** Five to ten node types; ten to fifteen edge types. Document each.

2. **Use property graphs (not RDF) unless you have a specific reason.** They are generally easier to get started with and have more tooling.

3. **Adopt time, provenance, and confidence as edge properties.** Standardise from day one.

4. **Reuse vocabulary from existing schemas where applicable.** Schema.org, Wikidata, domain-specific ontologies.

5. **Validate at ingestion.** Schema as code; reject malformed.

6. **Iterate.** The schema will evolve; design for additive change.

You'll have an ontology, just an informal one. That's usually enough.

Further reading

- [KnowledgeGraphCompletion]() — building / extending the graph

- [KnowledgeGraphVsRelationalDatabase]() — substrate decision

- [DatabaseDesign]() — relational schema discipline carries over

- [AbstractAlgebra]() — formal-structure math underlying ontologies