Frontmatter and Knowledge Graphs
A wiki page is text. With frontmatter, it's also structured data. With consistent frontmatter conventions, the wiki becomes a queryable knowledge graph.
This page covers how frontmatter feeds knowledge graphs and the patterns for designing a useful schema.
What frontmatter is
Frontmatter is a block of YAML (or a similar format) at the top of a markdown file:
```yaml
---
title: Page Title
tags:
- tag1
- tag2
related:
- OtherPage
status: published
---
```
The metadata is structured; the body is unstructured. Tools can use the metadata without reading the body.
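A minimal sketch of how a tool might separate the two parts, assuming the frontmatter is delimited by `---` lines as in the example above. The function name `split_frontmatter` is hypothetical, and the sketch only splits the text; parsing the YAML itself is left to a YAML library.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter_text, body). Frontmatter is "" if no block is found."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # Everything between the delimiters is metadata; the rest is body.
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text  # opening delimiter without a close: treat as plain body


page = """---
title: Page Title
status: published
---
Body text here.
"""
meta, body = split_frontmatter(page)
```

A tool that only needs metadata can stop after the closing delimiter and never touch `body`.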
What frontmatter enables
Filtering
"Show all pages tagged 'security'" is a query against frontmatter. No body parsing needed.
Cross-reference graphs
The `related:` field links one page to others. Across the wiki, these links form a graph.
Lifecycle management
`status: deprecated` lets tools surface old pages.
Search relevance
Title, tags, and other frontmatter fields can be weighted in search. See [WikiSearchOptimization](WikiSearchOptimization).
Knowledge-graph queries
With consistent schemas, the wiki answers questions:
- "Which pages reference this concept?"
- "Which pages are stale (old + low confidence)?"
- "What's the dependency graph for this topic?"
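The first and third questions can be sketched as graph operations over `related:` lists. This assumes frontmatter has already been parsed into a dict of page name to metadata; the page names and the helper names `backlinks` and `reachable` are hypothetical.

```python
from collections import deque

# Hypothetical pages, keyed by name, holding already-parsed frontmatter.
pages = {
    "AuthOverview": {"related": ["TokenRotation", "SessionHandling"]},
    "TokenRotation": {"related": ["SecretsStorage"]},
    "SessionHandling": {"related": ["TokenRotation"]},
    "SecretsStorage": {"related": []},
}


def backlinks(pages, target):
    """Pages whose `related:` list mentions `target`."""
    return sorted(
        name for name, meta in pages.items() if target in meta.get("related", [])
    )


def reachable(pages, start):
    """Breadth-first walk over `related:` edges, returning pages in visit order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        current = queue.popleft()
        order.append(current)
        for nxt in pages.get(current, {}).get("related", []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order
```

`backlinks(pages, "TokenRotation")` answers "which pages reference this concept?", and `reachable(pages, "AuthOverview")` walks the dependency graph outward from one topic.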
Frontmatter schemas
Required fields
Some fields every page needs:
```yaml
title: ...
date: ...
type: article | hub | runbook | ...
```
Validation enforces presence.
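A presence check can be very small. The sketch below assumes the three required fields listed above and frontmatter already parsed into a dict; `missing_required` is a hypothetical helper name.

```python
REQUIRED_FIELDS = ("title", "date", "type")


def missing_required(meta: dict) -> list[str]:
    """Return required fields that are absent or empty, in schema order."""
    return [field for field in REQUIRED_FIELDS if meta.get(field) in (None, "")]
```

A save hook could reject the page whenever the returned list is non-empty, and show the list to the author.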
Optional fields
Fields that may or may not apply:
```yaml
tags: ... # for topic indexing
related: ... # cross-references
status: ... # lifecycle
canonical_id: ... # stable identifier
```
Type-specific fields
A page with `type: runbook` might have a structured `runbook:` block:
```yaml
type: runbook
runbook:
  when_to_use: [...]
  steps: [...]
  references: [...]
```
A page with `type: article` doesn't need that.
The Wikantik agent-cookbook uses this pattern.
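One way to check type-specific blocks is a lookup table from page type to the block it requires. This is an illustrative sketch, not Wikantik's implementation; it assumes all three `runbook:` keys shown above are mandatory, and the names `TYPE_BLOCKS` and `type_errors` are hypothetical.

```python
# Page type -> (block name, keys the block must contain). Assumed, not canonical.
TYPE_BLOCKS = {
    "runbook": ("runbook", ("when_to_use", "steps", "references")),
}


def type_errors(meta: dict) -> list[str]:
    """Return problems with the type-specific block, or [] if none apply."""
    spec = TYPE_BLOCKS.get(meta.get("type"))
    if spec is None:
        return []  # e.g. type: article carries no extra block
    block_name, keys = spec
    block = meta.get(block_name)
    if not isinstance(block, dict):
        return [f"missing '{block_name}' block"]
    return [f"{block_name}.{key} is missing" for key in keys if key not in block]
```

Adding a new page type then means adding one table entry rather than another branch of validation code.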
Knowledge graph mechanics
Nodes
Each page is a node.
Edges
Frontmatter fields such as `related:` and `links_to:`, along with references embedded in the body, are the edges.
Properties
Frontmatter fields are properties of the node.
Querying
Tools query the graph:
```
Find all pages where:
  type = "runbook"
  AND tags contain "security"
  AND status = "published"
```
This is a structured query; it doesn't require text search.
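The same query as a predicate over parsed frontmatter. A minimal sketch: the sample pages are invented, and a real tool would more likely run this against a prebuilt index than a list in memory.

```python
# Hypothetical pages with already-parsed frontmatter.
pages = [
    {"title": "RotateKeys", "type": "runbook", "tags": ["security"], "status": "published"},
    {"title": "KeyTheory", "type": "article", "tags": ["security"], "status": "published"},
    {"title": "OldRotate", "type": "runbook", "tags": ["security"], "status": "deprecated"},
    {"title": "RestartQueue", "type": "runbook", "tags": ["ops"], "status": "published"},
]


def matches(meta: dict) -> bool:
    """type = runbook AND tags contain security AND status = published."""
    return (
        meta.get("type") == "runbook"
        and "security" in meta.get("tags", [])
        and meta.get("status") == "published"
    )


results = [meta["title"] for meta in pages if matches(meta)]
```

Only `RotateKeys` satisfies all three conditions; the page bodies are never read.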
Wikantik specifics
Per CLAUDE.md, Wikantik's structural spine + agent-grade content design uses frontmatter heavily:
- `canonical_id`: stable identifier (ULID)
- `cluster`: thematic grouping
- `tags`: topic tags
- `related`: cross-references
- `hubs`: hub page memberships
- `type`: page type (article, hub, runbook)
- `status`: lifecycle state
- `verified_at`, `verified_by`, `confidence`: verification metadata
- `audience`: humans / agents / both
These feed:
- The structural-spine index
- Knowledge-graph traversal
- Agent-grade content projection (`/api/pages/for-agent/{id}`)
- Verification dashboards
Design principles
Schema first
Decide which fields exist, what they mean, and which values are valid. Then write pages that follow the schema.
Without a schema, frontmatter drifts into inconsistency and queries become unreliable.
Validation
Tools validate frontmatter at save time. Pages with invalid frontmatter are rejected (or saved with a warning).
Wikantik uses save-time enforcement via `StructuralSpinePageFilter` and `RunbookValidationPageFilter`.
Consistency
A field's name and meaning stay the same over time and across all pages.
Renaming `related` → `links_to` is a major migration. Avoid unless necessary.
Extensibility
New fields can be added; existing pages without them still work.
For required fields with defaults, this is straightforward. For required fields without defaults, migration is needed.
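The "with defaults" case can be handled at read time, so old pages never need editing. A sketch under assumed defaults: the values in `DEFAULTS` are hypothetical, not a recommendation.

```python
# Hypothetical defaults for fields added after older pages were written.
DEFAULTS = {"status": "published", "audience": "both"}


def with_defaults(meta: dict) -> dict:
    """Fill in missing fields; values already on the page always win."""
    return {**DEFAULTS, **meta}
```

A field with no sensible default cannot be handled this way, which is why it forces a migration of existing pages.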
Specific patterns
Tag taxonomy
Tags should be from a controlled vocabulary. Otherwise users invent variants ("security", "Security", "infosec", "SecurityRelated").
Options include:
- Suggest tags from existing vocabulary
- Auto-correct to canonical form
- Periodic taxonomy review
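Auto-correction can be sketched as case folding plus an alias table. The vocabulary and aliases below are illustrative assumptions built from the variants mentioned above; `canonical_tag` is a hypothetical helper.

```python
VOCABULARY = {"security", "operations"}  # assumed controlled vocabulary
ALIASES = {"infosec": "security", "securityrelated": "security"}  # assumed aliases


def canonical_tag(tag: str):
    """Map a user-entered tag to its canonical form, or None if unknown."""
    key = tag.strip().lower()
    key = ALIASES.get(key, key)
    return key if key in VOCABULARY else None
```

Returning `None` for unknown tags leaves the decision to the caller: reject the save, suggest close matches, or queue the tag for taxonomy review.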
Cross-reference enforcement
`related:` entries point to other pages. Validate that the targets exist.
Tools surface broken cross-references.
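A target-existence check, assuming pages are held as a dict of name to parsed frontmatter. The function name and sample pages are hypothetical.

```python
def broken_references(pages: dict) -> list[tuple[str, str]]:
    """Return (source, target) pairs where `related:` names a page that does not exist."""
    return sorted(
        (name, target)
        for name, meta in pages.items()
        for target in meta.get("related", [])
        if target not in pages
    )


# Hypothetical wiki: PageA points at a page that was never created.
pages = {
    "PageA": {"related": ["PageB", "MissingPage"]},
    "PageB": {"related": ["PageA"]},
}
```

Here `broken_references(pages)` reports the single dangling edge from `PageA` to `MissingPage`.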
Hub pages
Hub pages have `type: hub` and list cluster members. The cluster's pages reference back via the `hubs:` field.
Bidirectional links: hub lists members; members reference hub.
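Checking both directions might look like the sketch below. It assumes the hub lists its members in a `members:` field, which is a hypothetical name since the field is not specified here, and that pages are a dict of name to parsed frontmatter.

```python
def hub_mismatches(pages: dict) -> list[str]:
    """Report hub/member links that exist in only one direction."""
    problems = []
    for name, meta in pages.items():
        if meta.get("type") == "hub":
            # Hub -> member: the member should list this hub in `hubs:`.
            for member in meta.get("members", []):
                if name not in pages.get(member, {}).get("hubs", []):
                    problems.append(f"{member} does not list hub {name}")
        # Member -> hub: the hub should list this page as a member.
        for hub in meta.get("hubs", []):
            if name not in pages.get(hub, {}).get("members", []):
                problems.append(f"hub {hub} does not list member {name}")
    return sorted(problems)


# Hypothetical cluster with one gap in each direction.
pages = {
    "SecurityHub": {"type": "hub", "members": ["PageA", "PageB"]},
    "PageA": {"hubs": ["SecurityHub"]},
    "PageB": {"hubs": []},
    "PageC": {"hubs": ["SecurityHub"]},
}
```

Tools that generate one side of the link from the other avoid this class of mismatch entirely.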
Confidence and verification
Pages can have:
```yaml
verified_at: 2026-04-26
verified_by: alice
confidence: authoritative # one of: authoritative | provisional | stale
```
Tools surface stale or unverified pages.
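A staleness check over these fields could be sketched as follows. The 180-day threshold is an arbitrary assumption, and `needs_review` is a hypothetical helper; `today` is passed in so the check stays deterministic.

```python
from datetime import date, timedelta


def needs_review(meta: dict, today: date, max_age_days: int = 180) -> bool:
    """True if the page is unverified, marked stale, or verified too long ago."""
    verified = meta.get("verified_at")
    if verified is None or meta.get("confidence") == "stale":
        return True
    if isinstance(verified, str):
        # Some YAML parsers yield date objects, others plain strings.
        verified = date.fromisoformat(verified)
    return today - verified > timedelta(days=max_age_days)
```

A verification dashboard could list every page for which this returns `True`, sorted by `verified_at`.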
Cluster membership
Frontmatter `cluster: name` groups pages. Hub pages organize the cluster.
For 5+ pages on a topic, a cluster + hub provides structure.
Tooling
Frontmatter linters
Validate schema on save. Reject invalid pages.
Index builders
Read every page's frontmatter and build a searchable index. Rebuild it periodically.
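The simplest such index is an inverted map from tag to page names. A sketch assuming pages are a dict of name to parsed frontmatter; `build_tag_index` and the sample pages are hypothetical.

```python
from collections import defaultdict


def build_tag_index(pages: dict) -> dict:
    """Map each tag to the sorted list of pages that carry it."""
    index = defaultdict(set)
    for name, meta in pages.items():
        for tag in meta.get("tags", []):
            index[tag].add(name)
    return {tag: sorted(names) for tag, names in index.items()}


# Hypothetical pages.
pages = {
    "RotateKeys": {"tags": ["security", "ops"]},
    "KeyTheory": {"tags": ["security"]},
    "Untagged": {},
}
tag_index = build_tag_index(pages)
```

With the index built, "show all pages tagged 'security'" becomes a single dictionary lookup.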
Cross-reference checkers
Find broken `related:` links and missing hub members.
Visualization
Render the knowledge graph visually. This is useful for understanding wiki structure.
Common failure patterns
Inconsistent schemas
Pages use different fields, or the same fields with different value conventions. Queries become unreliable.
No validation
Bad frontmatter slips in, and subtle issues compound over time.
Tag drift
Tags multiply without curation until the vocabulary is too messy to query.
Frontmatter as afterthought
Pages are created without thinking about metadata. Later, they can't be queried.
Hand-maintained graphs
`related:` lists are updated by hand. They drift, and cross-references go missing.
A reasonable approach
For wikis using frontmatter for knowledge graphs:
1. Define schema explicitly
2. Validate on save
3. Tags from controlled vocabulary
4. Bidirectional cross-references (hubs ↔ members)
5. Periodic graph audit
6. Tool-supported updates (don't expect humans to maintain the graph manually)
The Wikantik approach (structural spine + agent-grade content) shows a mature implementation of these practices.
Further Reading
- [WikiPageTemplates](WikiPageTemplates) — Templates encode frontmatter conventions
- [WikiSearchOptimization](WikiSearchOptimization) — Frontmatter affects search
- [PropertyGraphModel](PropertyGraphModel) — Knowledge graph foundations