Frontmatter and Knowledge Graphs

A wiki page is text. With frontmatter, it's also structured data. With consistent frontmatter conventions, the wiki becomes a queryable knowledge graph.

This page covers how frontmatter feeds a knowledge graph and the patterns for designing a useful schema.

What frontmatter is

Frontmatter is a block of YAML (or a similar format) at the top of a Markdown file, delimited by --- lines:

---
title: Page Title
tags:
  - tag1
  - tag2
related:
  - OtherPage
status: published
---

The metadata is structured; the body is unstructured. Tools can use the metadata without reading the body.
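To make the split between metadata and body concrete, here is a minimal sketch that separates the two using only the Python standard library. The function name split_frontmatter is hypothetical, and the parser handles only the subset of YAML shown above (scalar values and simple lists); a real wiki would most likely use a full YAML library instead.

```python
def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split a page into (metadata, body).

    Handles only `key: value` scalars and `key:` followed by indented
    `- item` lines. This is not a general YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block
    meta: dict = {}
    current_key = None
    for i, line in enumerate(lines[1:], start=1):
        stripped = line.strip()
        if stripped == "---":  # closing delimiter: the rest is the body
            return meta, "\n".join(lines[i + 1:])
        if stripped.startswith("- ") and isinstance(meta.get(current_key), list):
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # An empty value means a list follows on the next lines.
            meta[current_key] = value if value else []
    return {}, text  # unterminated block: treat the page as plain text


PAGE = """---
title: Page Title
tags:
  - tag1
  - tag2
related:
  - OtherPage
status: published
---
Body text here.
"""

meta, body = split_frontmatter(PAGE)
print(meta["tags"])  # ['tag1', 'tag2']
print(body)          # Body text here.
```

The metadata comes back as a plain dictionary, which is all the later tooling on this page needs.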

What frontmatter enables

Filtering

"Show all pages tagged 'security'" is a query against frontmatter. No body parsing needed.

Cross-reference graphs

The related: field links one page to others. Across the wiki, these links form a graph.
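As a rough illustration, the graph can be assembled from related: lists in a few lines. The page names and the build_graph helper below are made up for the example, and the input is assumed to be frontmatter that has already been parsed into dictionaries.

```python
from collections import defaultdict

# Hypothetical pages, keyed by page name, holding already-parsed frontmatter.
pages = {
    "Authentication": {"related": ["Sessions", "Passwords"]},
    "Sessions": {"related": ["Authentication"]},
    "Passwords": {"related": []},
}


def build_graph(pages):
    """Return (outgoing, incoming) adjacency maps built from related: lists."""
    outgoing = {name: set(meta.get("related", [])) for name, meta in pages.items()}
    incoming = defaultdict(set)
    for source, targets in outgoing.items():
        for target in targets:
            incoming[target].add(source)
    return outgoing, dict(incoming)


outgoing, incoming = build_graph(pages)
print(sorted(incoming["Passwords"]))  # ['Authentication']
```

The incoming map is the backlink view: it answers which pages point at a given page without any body parsing.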

Lifecycle management

status: deprecated lets tools surface old pages.

Search relevance

Title, tags, and other frontmatter fields can be weighted in search. See WikiSearchOptimization.

Knowledge-graph queries

With consistent schemas, the wiki answers questions:

"Which pages reference this concept?"
"Which pages are stale (old + low confidence)?"
"What's the dependency graph for this topic?"

Frontmatter schemas

Required fields

Some fields every page needs:

title: ...
date: ...
type: article | hub | runbook | ...

Validation enforces presence.
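A presence check of this kind might look like the sketch below. The field names come from the list above; the set of allowed types is an assumption, since the source list is open-ended, and validate_required is a hypothetical helper name.

```python
REQUIRED_FIELDS = ("title", "date", "type")
# Assumed closed set for the example; the real list of types is open-ended.
ALLOWED_TYPES = {"article", "hub", "runbook"}


def validate_required(meta: dict) -> list[str]:
    """Return a list of error messages; an empty list means the page passes."""
    errors = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if not meta.get(field)
    ]
    page_type = meta.get("type")
    if page_type and page_type not in ALLOWED_TYPES:
        errors.append(f"unknown type: {page_type}")
    return errors


print(validate_required({"title": "Page Title", "type": "memo"}))
# ['missing required field: date', 'unknown type: memo']
```

Returning a list of messages rather than raising on the first problem lets a save-time check report every issue at once.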

Optional fields

Fields that may or may not apply:

tags: ...           # for topic indexing
related: ...        # cross-references
status: ...         # lifecycle
canonical_id: ...   # stable identifier

Type-specific fields

A page with type: runbook might have a structured runbook: block:

type: runbook
runbook:
  when_to_use: [...]
  steps: [...]
  references: [...]

A page with type: article doesn't need that.

The Wikantik agent-cookbook uses this pattern.
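A type-specific check could be sketched as follows. This is not Wikantik's validator; it is an illustration that assumes all three keys of the runbook: block are mandatory lists, which the source does not state, and validate_runbook is a hypothetical name.

```python
def validate_runbook(meta: dict) -> list[str]:
    """Check that a runbook page carries a structured runbook: block."""
    if meta.get("type") != "runbook":
        return []  # other page types have no runbook: requirement
    block = meta.get("runbook")
    if not isinstance(block, dict):
        return ["type: runbook requires a runbook: block"]
    errors = []
    # Assumption: every key is required and must hold a list.
    for key in ("when_to_use", "steps", "references"):
        if not isinstance(block.get(key), list):
            errors.append(f"runbook.{key} must be a list")
    return errors


page = {
    "type": "runbook",
    "runbook": {"when_to_use": ["disk is full"], "steps": ["free space"]},
}
print(validate_runbook(page))  # ['runbook.references must be a list']
```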

Knowledge graph mechanics

Nodes

Each page is a node.

Edges

Frontmatter fields such as related: and links_to:, plus embedded references, are the edges.
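Once pages are nodes and related: entries are edges, a question like "what's the dependency graph for this topic?" becomes a graph traversal. The sketch below walks the edges breadth-first from one page; the page names and the reachable_from helper are invented for the example.

```python
from collections import deque

# Hypothetical related: lists, keyed by page name.
related = {
    "Deployment": ["BuildPipeline", "Rollback"],
    "BuildPipeline": ["SourceControl"],
    "Rollback": ["Deployment"],
    "SourceControl": [],
}


def reachable_from(start: str, related: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk over related: edges, returning pages in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in related.get(page, []):
            if target not in seen:  # the seen set guards against cycles
                seen.add(target)
                queue.append(target)
    return order


print(reachable_from("Deployment", related))
# ['Deployment', 'BuildPipeline', 'Rollback', 'SourceControl']
```

The seen set matters in practice because related: links are often mutual, so cycles are the normal case rather than the exception.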

Properties

Frontmatter fields are properties of the node.

Querying

Tools query the graph:

Find all pages where:
  type = "runbook"
  AND tags contain "security"
  AND status = "published"

This is a structured query; it doesn't require text search.
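Expressed over parsed frontmatter, the query above is a plain filter. The pages and the find_pages helper here are hypothetical; the point is only that no page body is ever read.

```python
# Hypothetical pages with already-parsed frontmatter.
pages = {
    "RotateSecrets": {"type": "runbook", "tags": ["security"], "status": "published"},
    "RestoreBackup": {"type": "runbook", "tags": ["operations"], "status": "published"},
    "ThreatModel": {"type": "article", "tags": ["security"], "status": "published"},
    "OldFirewallFix": {"type": "runbook", "tags": ["security"], "status": "deprecated"},
}


def find_pages(pages: dict, page_type: str, tag: str, status: str) -> list[str]:
    """Return names of pages matching all three frontmatter conditions."""
    return sorted(
        name
        for name, meta in pages.items()
        if meta.get("type") == page_type
        and tag in meta.get("tags", [])
        and meta.get("status") == status
    )


print(find_pages(pages, "runbook", "security", "published"))  # ['RotateSecrets']
```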

Wikantik specifics

Per CLAUDE.md, Wikantik's structural spine + agent-grade content design uses frontmatter heavily:

canonical_id: stable identifier (ULID)
cluster: thematic grouping
tags: topic tags
related: cross-references
hubs: hub page memberships
type: page type (article, hub, runbook)
status: lifecycle state
verified_at, verified_by, confidence: verification metadata
audience: humans / agents / both

These feed:

The structural-spine index
Knowledge-graph traversal
Agent-grade content projection (/api/pages/for-agent/{id})
Verification dashboards

Design principles

Schema first

Decide which fields exist, what they mean, and which values are valid. Then write pages that follow the schema.

Without a schema, frontmatter drifts into inconsistency and queries return unreliable results.

Validation

Tools validate frontmatter at save time. Pages with invalid frontmatter are rejected (or saved with a warning).

Wikantik uses save-time enforcement via StructuralSpinePageFilter and RunbookValidationPageFilter.

Consistency

A field's name and meaning don't change, and both stay consistent across all pages.

Renaming related → links_to is a major migration. Avoid unless necessary.

Extensibility

New fields can be added; existing pages without them still work.

For a new required field with a default, this is straightforward: pages that lack the field take the default. For a new required field without a default, existing pages must be migrated.
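One way to read the defaults case is as a merge applied when frontmatter is loaded. The default values below are assumptions chosen for the example, not part of any real schema, and with_defaults is a hypothetical helper.

```python
# Assumed defaults; a real schema would define its own.
FIELD_DEFAULTS = {"status": "published", "audience": "both"}


def with_defaults(meta: dict) -> dict:
    """Return a copy of meta with schema defaults filled in for absent fields."""
    return {**FIELD_DEFAULTS, **meta}  # values in meta win over defaults


old_page = {"title": "Legacy Page", "status": "deprecated"}
print(with_defaults(old_page))
# {'status': 'deprecated', 'audience': 'both', 'title': 'Legacy Page'}
```

Because the merge happens at read time, older pages never need to be rewritten just to carry the new field.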

Specific patterns

Tag taxonomy

Tags should come from a controlled vocabulary. Otherwise users invent variants ("security", "Security", "infosec", "SecurityRelated").

Ways to keep tags consistent:

Suggest tags from existing vocabulary
Auto-correct to canonical form
Periodic taxonomy review
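The auto-correct option can be sketched as a lookup against the vocabulary plus an alias table. The vocabulary and aliases below are invented for the example, reusing the variants mentioned above; normalize_tag is a hypothetical helper.

```python
from typing import Optional

# Assumed controlled vocabulary and alias table.
CANONICAL_TAGS = {"security", "networking"}
ALIASES = {"infosec": "security", "securityrelated": "security"}


def normalize_tag(raw: str) -> Optional[str]:
    """Map a user-entered tag to its canonical form, or None if unknown."""
    key = raw.strip().lower()
    key = ALIASES.get(key, key)
    return key if key in CANONICAL_TAGS else None


for raw in ("security", "Security", "infosec", "SecurityRelated", "misc"):
    print(raw, "->", normalize_tag(raw))
# security -> security
# Security -> security
# infosec -> security
# SecurityRelated -> security
# misc -> None
```

Unknown tags return None rather than being accepted silently, which gives the periodic taxonomy review a concrete list of candidates to add or alias.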

Cross-reference enforcement

related: entries point to other pages. Validate the targets exist.

Tools surface broken cross-references.
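A minimal checker only needs the set of known page names. The pages below are hypothetical, including the deliberately misspelled target, and broken_references is a made-up helper name.

```python
# Hypothetical pages; "Pasword" is a deliberate typo for a missing target.
pages = {
    "Authentication": {"related": ["Sessions", "Pasword"]},
    "Sessions": {"related": ["Authentication"]},
}


def broken_references(pages: dict) -> dict[str, list[str]]:
    """Map each page name to the related: targets that do not exist."""
    broken = {}
    for name, meta in pages.items():
        missing = [target for target in meta.get("related", []) if target not in pages]
        if missing:
            broken[name] = missing
    return broken


print(broken_references(pages))  # {'Authentication': ['Pasword']}
```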

Hub pages

Hub pages have type: hub and list cluster members. The cluster's pages reference back via the hubs: field.

The links are bidirectional: the hub lists its members, and each member references the hub.
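Checking that both directions agree might look like the sketch below. The source does not name the field a hub uses to list its members, so members: is an assumption here, as are the page names and the hub_mismatches helper.

```python
# Hypothetical pages; the members: field name is an assumption.
pages = {
    "SecurityHub": {"type": "hub", "members": ["Authentication", "Sessions"]},
    "Authentication": {"type": "article", "hubs": ["SecurityHub"]},
    "Sessions": {"type": "article", "hubs": []},
    "Passwords": {"type": "article", "hubs": ["SecurityHub"]},
}


def hub_mismatches(pages: dict) -> list[str]:
    """Report hub/member links that exist in only one direction."""
    problems = []
    for name, meta in pages.items():
        if meta.get("type") == "hub":
            for member in meta.get("members", []):
                if name not in pages.get(member, {}).get("hubs", []):
                    problems.append(f"{name} lists {member}, but {member} lacks hubs: {name}")
        for hub in meta.get("hubs", []):
            if name not in pages.get(hub, {}).get("members", []):
                problems.append(f"{name} names hub {hub}, but {hub} does not list {name}")
    return problems


for problem in hub_mismatches(pages):
    print(problem)
# SecurityHub lists Sessions, but Sessions lacks hubs: SecurityHub
# Passwords names hub SecurityHub, but SecurityHub does not list Passwords
```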

Confidence and verification

Pages can have:

verified_at: 2026-04-26
verified_by: alice
confidence: authoritative | provisional | stale

Tools surface stale or unverified pages.
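A tool might flag pages for review along these lines. The 180-day window is an arbitrary assumption, verified_at is assumed to arrive as an ISO date string (some YAML loaders return a date object instead), and needs_review is a hypothetical helper.

```python
from datetime import date, timedelta

MAX_AGE = timedelta(days=180)  # assumed review window, not from any real policy


def needs_review(meta: dict, today: date) -> bool:
    """Return True if the page is marked stale, unverified, or verified too long ago."""
    if meta.get("confidence") == "stale":
        return True
    verified_at = meta.get("verified_at")
    if not verified_at:
        return True  # never verified
    return today - date.fromisoformat(verified_at) > MAX_AGE


page = {"verified_at": "2026-04-26", "verified_by": "alice", "confidence": "authoritative"}
print(needs_review(page, today=date(2026, 6, 1)))  # False
print(needs_review(page, today=date(2027, 1, 1)))  # True
```

Passing today in as a parameter, rather than reading the clock inside the function, keeps the check easy to test.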

Cluster membership

Frontmatter cluster: name groups pages. Hub pages organize the cluster.

For 5+ pages on a topic, a cluster + hub provides structure.

Tooling

Frontmatter linters

Validate schema on save. Reject invalid pages.

Index builders

Read every page's frontmatter and build a searchable index. Rebuild it periodically.
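One simple shape for such an index is an inverted map from (field, value) pairs to page names, so that a multi-field query becomes a set intersection. The pages, the choice of indexed fields, and the build_index helper below are assumptions for the example.

```python
from collections import defaultdict

# Hypothetical pages with already-parsed frontmatter.
pages = {
    "RotateSecrets": {"type": "runbook", "tags": ["security"], "status": "published"},
    "RestoreBackup": {"type": "runbook", "tags": ["operations"], "status": "published"},
    "ThreatModel": {"type": "article", "tags": ["security"], "status": "published"},
}


def build_index(pages: dict, fields=("tags", "type", "status")) -> dict:
    """Map (field, value) pairs to the set of page names carrying that value."""
    index = defaultdict(set)
    for name, meta in pages.items():
        for field in fields:
            value = meta.get(field)
            if value is None:
                continue
            # List-valued fields contribute one entry per item.
            values = value if isinstance(value, list) else [value]
            for item in values:
                index[(field, item)].add(name)
    return dict(index)


index = build_index(pages)
print(index[("tags", "security")] & index[("type", "runbook")])  # {'RotateSecrets'}
```

A periodic rebuild is then just a matter of running build_index again over the current set of pages.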

Cross-reference checkers

Find broken related: links and missing hub members.

Visualization

Graph visualization of the knowledge graph. Useful for understanding wiki structure.

Common failure patterns

Inconsistent schemas

Pages use different fields for the same purpose, or the same field with incompatible values. Queries become unreliable.

No validation

Bad frontmatter slips in. Subtle issues compound.

Tag drift

Tags multiply without curation until the vocabulary is too messy to query.

Frontmatter as afterthought

Pages are created without thinking about metadata. Later, they can't be queried.

Hand-maintained graphs

related: lists are updated by hand, so they drift and cross-references go missing.

A reasonable approach

For wikis using frontmatter for knowledge graphs:

Define schema explicitly
Validate on save
Tags from controlled vocabulary
Bidirectional cross-references (hubs ↔ members)
Periodic graph audit
Tool-supported updates (don't expect humans to maintain manually)

The Wikantik approach (structural spine + agent-grade content) shows a mature implementation of these practices.