Wikantik Knowledge Graph Administration

The Wikantik knowledge graph is a structured layer that sits on top of the wiki's content, capturing entities, relationships, and properties extracted from page frontmatter and body links. When properly maintained, it enables AI agents to traverse contextual relationships between topics, discover connections that would otherwise require reading dozens of pages, and propose new knowledge for human review. This guide walks through every aspect of administering the knowledge graph to keep it accurate, complete, and valuable.

The Knowledge Graph Architecture

Before diving into operations, it helps to understand how the knowledge graph is built and maintained.

Nodes and Edges

Every entity in the knowledge graph is a node. Nodes have:

Name — typically matching a wiki page name (e.g. AssetAllocation, RetirementPlanningHub)
Node type — an optional classification (e.g. hub, article) drawn from the page's type frontmatter field
Source page — the wiki page this node was projected from (if any)
Properties — key-value metadata stored alongside the node (tags, summary, status, cluster, etc.)
Stub flag — true when the node was created as a reference target but has no wiki page yet

Edges connect two nodes with a named relationship. For example, if a page's frontmatter contains related: [IndexFundPortfolioConstruction, UnderstandingRiskTolerance], the projector creates edges of type related from the page's node to each target node.
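
As a rough mental model, the node and edge records described above could be represented as follows. This is only a sketch: the class and field names (`Node`, `Edge`, `is_stub`, and so on) are assumptions made for illustration and are not taken from Wikantik's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Illustrative node record; field names are assumptions, not Wikantik's schema."""
    name: str
    node_type: Optional[str] = None      # e.g. "hub" or "article"; optional
    source_page: Optional[str] = None    # page the node was projected from, if any
    properties: dict = field(default_factory=dict)
    is_stub: bool = False                # True when no wiki page exists yet


@dataclass(frozen=True)
class Edge:
    """Illustrative directed edge carrying a named relationship."""
    source: str
    relationship: str
    target: str


def edges_from_related(page: str, targets: list) -> list:
    """Build one `related` edge from the page's node to each target."""
    return [Edge(page, "related", target) for target in targets]


edges = edges_from_related(
    "AssetAllocation",
    ["IndexFundPortfolioConstruction", "UnderstandingRiskTolerance"],
)
```

Each entry in a `related` list becomes its own directed edge, so a page with two targets contributes two edges.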

Provenance Tracking

Every node and edge carries a provenance value that records how it entered the graph:

| Provenance | Meaning |
| --- | --- |
| `human-authored` | Created from page frontmatter, body links, or manual admin action in the UI |
| `ai-inferred` | Submitted by an AI agent via the MCP `propose_knowledge` tool, still pending review |
| `ai-reviewed` | Originally proposed by an AI agent, then approved by a human administrator |

Provenance tracking ensures you always know whether a fact in the graph was authored by a human or suggested by AI, and whether AI suggestions have been vetted.

How Pages Become Graph Data

The Graph Projector is a page filter that fires automatically on every page save. It:

Parses the page's YAML frontmatter
Upserts a node for the page itself
Detects which frontmatter keys represent relationships versus properties
Creates edges for each relationship target
Scans the page body for markdown links and creates links_to edges
Removes stale edges that no longer appear in the frontmatter or body
Creates stub nodes for any target that doesn't have its own wiki page yet
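
The body-link and stub steps in the list above can be sketched like this. The regular expression, the function names, and the choice to skip external URLs are assumptions for illustration; the real projector's link parsing may differ.

```python
import re

# Matches inline markdown links such as [Asset Allocation](AssetAllocation).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def body_link_targets(body: str) -> list:
    """Return link targets in order of first appearance, skipping external URLs."""
    seen = []
    for _text, target in LINK_RE.findall(body):
        if "://" in target:          # assumption: external links are not graph nodes
            continue
        if target not in seen:
            seen.append(target)
    return seen


def links_to_edges(page: str, body: str, existing_pages: set):
    """Return (edges, stubs): links_to triples plus targets that lack a page."""
    targets = body_link_targets(body)
    edges = [(page, "links_to", target) for target in targets]
    stubs = [target for target in targets if target not in existing_pages]
    return edges, stubs


body = (
    "See [AssetAllocation](AssetAllocation) and [Bonds](BondBasics), "
    "or the [docs](https://example.com)."
)
edges, stubs = links_to_edges("RetirementPlanningHub", body, {"AssetAllocation"})
```

Here `BondBasics` has no page yet, so it is reported as a stub while still receiving a `links_to` edge.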

Relationship detection rule: A frontmatter key is treated as a relationship (creating edges) when its value is a list of strings and the key is not in the reserved property set. The reserved property keys — always treated as node properties, never as edges — are: tags, keywords, type, summary, date, author, cluster, status, title, description, category, and language.

For example, given this frontmatter:

---
type: article
tags: [investing, diversification]
summary: How to allocate assets across classes.
related: [UnderstandingRiskTolerance, IndexFundPortfolioConstruction]
depends_on: [MarketFundamentals]
---

The projector creates:

Properties on the node: type=article, tags=[investing, diversification], summary=...
Edges from this page: related → UnderstandingRiskTolerance, related → IndexFundPortfolioConstruction, depends_on → MarketFundamentals
Additionally, any [PageName](PageName) links in the body text produce links_to edges
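
The detection rule and the example above can be expressed as a short sketch. The reserved key list is copied from this guide; the function names and the treatment of an empty list are assumptions, and this is not the code of the actual FrontmatterRelationshipDetector class.

```python
RESERVED_PROPERTY_KEYS = {
    "tags", "keywords", "type", "summary", "date", "author",
    "cluster", "status", "title", "description", "category", "language",
}


def is_relationship(key: str, value) -> bool:
    """A key is a relationship when its value is a list of strings
    and the key is not in the reserved property set."""
    return (
        key not in RESERVED_PROPERTY_KEYS
        and isinstance(value, list)
        and all(isinstance(item, str) for item in value)
    )


def split_frontmatter(frontmatter: dict):
    """Split parsed frontmatter into (properties, edges)."""
    properties, edges = {}, []
    for key, value in frontmatter.items():
        if is_relationship(key, value):
            # Assumption: an empty list simply yields no edges here.
            edges.extend((key, target) for target in value)
        else:
            properties[key] = value
    return properties, edges


frontmatter = {
    "type": "article",
    "tags": ["investing", "diversification"],
    "summary": "How to allocate assets across classes.",
    "related": ["UnderstandingRiskTolerance", "IndexFundPortfolioConstruction"],
    "depends_on": ["MarketFundamentals"],
}
properties, edges = split_frontmatter(frontmatter)
```

Note that `tags` is a list of strings yet stays a property, because the reserved-key check runs before the list check matters.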

System pages (CSS themes, navigation fragments, etc.) are automatically excluded from projection.

Accessing the Knowledge Graph Admin Panel

Navigate to Admin > Knowledge in the Wikantik admin panel. The page presents five tabs:

Proposals — Review queue for AI-submitted knowledge proposals
Node Explorer — Browse, search, and manage graph nodes
Edge Explorer — Browse, search, and manage graph edges
KG Embeddings — Structural embedding model for link prediction and anomaly detection
Content Embeddings — Content similarity model for finding related and duplicate pages

A Clear All button in the toolbar permanently deletes all knowledge graph data (nodes, edges, proposals, and embeddings). Use with extreme caution — this is irreversible.

Step 1: Initial Graph Population

When starting with a fresh knowledge graph (or after a Clear All), the first task is to populate it from existing wiki content.

1.1 Ensure Pages Have Frontmatter

The knowledge graph is only as rich as the frontmatter in your wiki pages. Before projecting, check the Content Embeddings tab — it lists all Pages Without Frontmatter. Each entry links to the page editor so you can add frontmatter.

A well-structured page should have at minimum:

---
type: article
tags: [relevant, topic, keywords]
summary: One sentence describing the page's purpose.
related: [RelatedPageOne, RelatedPageTwo]
---

Hub pages that organize a cluster of articles should use type: hub and include a cluster identifier:

---
type: hub
tags: [topic-area]
cluster: my-topic-cluster
summary: Overview hub for the topic area.
related: [SubArticleOne, SubArticleTwo]
---

The more relationship keys you add to frontmatter (e.g. depends_on, supersedes, implements, related), the richer the graph becomes. Any non-reserved key whose value is a list of page names is detected as a relationship; reserved keys such as tags stay node properties even when their value is a list.

1.2 Project All Pages

Go to the Node Explorer tab and click Project All Pages. This scans every non-system wiki page, parses its frontmatter and body links, and creates the corresponding nodes and edges. The status line will report how many pages were scanned, how many were projected, and any errors encountered.

After projection, the schema summary at the top of the Node Explorer shows the total node count, edge count, and which node types and relationship types exist in the graph.

1.3 Verify the Initial Graph

After projection, review the graph for quality:

Node Explorer: Search for key topics and verify they appear as nodes. Check that node types are correct. Look for unexpected stub nodes — these indicate frontmatter references to pages that don't exist yet.
Edge Explorer: Filter by relationship type to see whether the expected relationships were created. Check that links_to edges from body links are present.
Schema summary: Confirm the relationship types match what you expect from your frontmatter conventions.

Step 2: Embeddings

Wikantik uses a single Ollama-backed embedding model that produces dense vectors for every page chunk. Those vectors back hybrid search and are also reused — as on-the-fly centroids over chunk_entity_mentions — for KG-node similarity, so there is no longer a separate "structural" or "content" embedding model to retrain.

The chunk embedding indexer runs continuously: it picks up newly chunked pages on each save and writes vectors into content_chunk_embeddings. The status bar at /admin/knowledge/embeddings/status reports:

Ready — true when the in-memory mention-centroid index has loaded
Dimension — vector size for the active model
Mentioned node count — KG nodes for which at least one mention chunk has an embedding

There is no manual "retrain" button — operators rebuild the embedding layer with bin/kg-rebuild.sh --skip-chunks (which forwards to the embedding indexer) when they want a wholesale recompute, e.g. after switching the active embedding model.
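
To make the idea of mention centroids concrete, here is a minimal sketch of averaging the chunk vectors that mention each node and comparing nodes by cosine similarity. The two-dimensional vectors and the node-to-chunk mapping are made-up illustration data, and the function names are assumptions; the real index works on the model's full vector dimension.

```python
import math


def centroid(vectors: list) -> list:
    """Element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("a node needs at least one mention chunk with an embedding")
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def cosine(a: list, b: list) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical chunk vectors for the chunks that mention each node.
mentions = {
    "AssetAllocation": [[1.0, 0.0], [0.8, 0.2]],
    "PortfolioRebalancing": [[0.9, 0.1]],
    "TaxLossHarvesting": [[0.0, 1.0]],
}
centroids = {node: centroid(vecs) for node, vecs in mentions.items()}

# Rank the other nodes by similarity to AssetAllocation.
ranked = sorted(
    (node for node in centroids if node != "AssetAllocation"),
    key=lambda node: cosine(centroids["AssetAllocation"], centroids[node]),
    reverse=True,
)
```

Because the centroids are derived from chunk vectors on demand, a node with no embedded mention chunk has no centroid, which is what the mentioned node count in the status bar reflects.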

Step 3: Reviewing AI Proposals

AI agents that interact with the wiki via the MCP server can submit knowledge graph proposals using the propose_knowledge tool. These proposals appear in the Proposals tab for human review.

3.1 Proposals

Each proposal includes:

| Field | Description |
| --- | --- |
| Type | `new-node`, `new-edge`, `new-property`, or `modify-property` |
| Source Page | The wiki page that motivated the proposal |
| Proposed Data | The full data: node definition, edge definition, or property change |
| Confidence | The agent's self-assessed confidence (0-100%) |
| Reasoning | Why the agent believes this is correct, citing specific evidence |

3.2 Reviewing Effectively

When reviewing proposals:

Read the reasoning carefully. A well-formed proposal cites specific evidence from the source page. Vague reasoning ("these seem related") is a warning sign.
Check the confidence score. Low-confidence proposals deserve more scrutiny. Very high confidence (>90%) from an AI agent should also be verified, because overconfidence can indicate hallucination.
Verify the source page. Click through to the source page and confirm the proposed relationship or property actually reflects the page content.
Consider the graph impact. A new edge between two major hub nodes has broader implications than an edge between two leaf articles.
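
The review heuristics above could be turned into a small triage helper like the one below. This is a sketch under stated assumptions: the `Proposal` class, the 50% lower threshold, and the word-count proxy for vague reasoning are all invented for illustration; only the 90% upper threshold and the four proposal types come from this guide.

```python
from dataclasses import dataclass

VALID_TYPES = {"new-node", "new-edge", "new-property", "modify-property"}


@dataclass
class Proposal:
    type: str
    source_page: str
    confidence: float   # 0-100, as self-reported by the agent
    reasoning: str


def review_flags(p: Proposal, low: float = 50.0, high: float = 90.0) -> list:
    """Return the reasons this proposal deserves extra scrutiny."""
    flags = []
    if p.type not in VALID_TYPES:
        flags.append("unknown proposal type")
    if p.confidence < low:
        flags.append("low confidence")
    elif p.confidence > high:
        flags.append("very high confidence; verify against the source page")
    if len(p.reasoning.split()) < 8:   # crude proxy for vague reasoning
        flags.append("reasoning is short; check for cited evidence")
    return flags


vague = Proposal("new-edge", "AssetAllocation", 97, "these seem related")
flags = review_flags(vague)
```

A helper like this only orders the queue; it does not replace opening the source page and checking the claim.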

3.3 Approving Proposals

Click Approve to accept a proposal. For new-edge proposals, approval triggers two actions:

The edge is created in the knowledge graph with ai-reviewed provenance
The relationship is automatically written back into the source page's frontmatter, creating a durable record that persists even if the knowledge graph is rebuilt

This frontmatter write-back is important: it means approved AI knowledge becomes part of the page's permanent content, visible to future page editors and graph projections.
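
A minimal sketch of the two approval actions follows, assuming the page's frontmatter is available as a parsed dictionary. The function name and the returned edge shape are hypothetical; the actual write-back also has to serialize the frontmatter and save the page, which is omitted here.

```python
def approve_new_edge(frontmatter: dict, relationship: str, target: str):
    """Return (edge, updated_frontmatter) for an approved new-edge proposal.

    The input frontmatter is not modified.
    """
    updated = dict(frontmatter)
    targets = list(updated.get(relationship, []))
    if target not in targets:            # keep the write-back idempotent
        targets.append(target)
    updated[relationship] = targets

    edge = {
        "relationship": relationship,
        "target": target,
        "provenance": "ai-reviewed",     # approved AI proposal
    }
    return edge, updated


page_frontmatter = {"type": "article", "related": ["UnderstandingRiskTolerance"]}
edge, new_frontmatter = approve_new_edge(
    page_frontmatter, "related", "IndexFundPortfolioConstruction"
)
```

Because the relationship now lives in the page itself, a later projection of that page can recreate the edge from frontmatter.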

3.4 Rejecting Proposals

Click Reject to decline a proposal. You'll be prompted for an optional rejection reason. Providing a clear reason is important because:

The rejection is recorded in a rejection history
When an AI agent tries to submit the same relationship again, the system automatically declines it and returns the rejection reason
This teaches agents what relationships are inappropriate, reducing future noise
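
The rejection history behaviour described above can be pictured as a lookup keyed on the proposed relationship. The class below is a hypothetical in-memory sketch; keying on the (source, relationship, target) triple is an assumption about what counts as "the same relationship".

```python
class RejectionHistory:
    """Illustrative in-memory rejection history (not Wikantik's implementation)."""

    def __init__(self):
        self._reasons = {}

    def reject(self, source: str, relationship: str, target: str, reason: str = ""):
        """Record a rejection together with its (optional) reason."""
        self._reasons[(source, relationship, target)] = reason

    def check(self, source: str, relationship: str, target: str):
        """Return (accepted, reason); a previously rejected triple is declined."""
        key = (source, relationship, target)
        if key in self._reasons:
            return False, self._reasons[key]
        return True, None


history = RejectionHistory()
first = history.check("AssetAllocation", "supersedes", "MarketFundamentals")
history.reject(
    "AssetAllocation", "supersedes", "MarketFundamentals",
    "AssetAllocation builds on this page; it does not replace it.",
)
second = history.check("AssetAllocation", "supersedes", "MarketFundamentals")
```

An empty reason still blocks re-submission in this sketch, but it gives the agent nothing to learn from, which is why a clear reason matters.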

3.5 Maintaining a Healthy Review Cadence

The Node Explorer's schema summary shows the count of pending proposals. Aim to keep this near zero. A backlog of unreviewed proposals means AI agents are working with an incomplete graph, and stale proposals may become irrelevant as pages change.

Step 4: Exploring and Curating Nodes

The Node Explorer is your primary tool for understanding and curating the graph's entities.

4.1 Browsing and Filtering

Search: Type in the search box to filter nodes by name
Type filter: Select a node type (e.g. hub, article) to narrow the list
Status filter: Filter by status values present in node properties
Pagination: Navigate through large node sets with Prev/Next buttons

4.2 Inspecting Node Details

Click any node row to view its details in the right panel:

Properties: All key-value metadata stored on the node
Outbound Edges: Relationships where this node is the source (this node → target)
Inbound Edges: Relationships where this node is the target (source → this node)
Similar by Structure: Nodes that occupy similar positions in the graph topology (requires trained KG embedding model)
Similar by Content: Nodes whose wiki page content is semantically similar (requires trained content model)

Click any node name in the edge tables to navigate to that node's detail view.

4.3 Handling Stub Nodes

Stub nodes appear with a "Yes" in the Stub column and an italic warning in the detail panel. These represent entities that are referenced in frontmatter or body links but don't have their own wiki page. Stubs are normal — they're placeholders that will be fleshed out when a page is created. However, a large number of stubs may indicate:

Typos in frontmatter — a misspelled page name creates an orphaned stub
Planned but unwritten pages — a hub page references articles that haven't been authored yet
External concepts — references to things that may never get their own page

Review stubs periodically. Fix typos, create missing pages, or delete stubs that will never be resolved.
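
One way to spot typo-created stubs during this review is to compare each stub name with existing page names. The sketch below uses Python's standard `difflib.get_close_matches`; the 0.85 cutoff and the sample names are assumptions, and the stub and page lists would have to be exported from the Node Explorer by whatever means you have available.

```python
import difflib


def likely_typos(stubs: list, page_names: list, cutoff: float = 0.85) -> dict:
    """Map each stub to the closest existing page name, if one is similar enough."""
    suggestions = {}
    for stub in stubs:
        match = difflib.get_close_matches(stub, page_names, n=1, cutoff=cutoff)
        if match:
            suggestions[stub] = match[0]
    return suggestions


pages = ["AssetAllocation", "UnderstandingRiskTolerance", "MarketFundamentals"]
stubs = ["AssetAlocation", "ComplianceTeam"]
suggestions = likely_typos(stubs, pages)
```

Stubs with no close match, such as `ComplianceTeam` here, are more likely planned pages or external concepts than typos.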

4.4 Deleting Nodes

Click Delete in the node detail panel to remove a node and all its edges. Use this for:

Cleaning up typo-created stubs
Removing test data
Eliminating obsolete entities after a page is deleted

Deletion is permanent — the node and all its edges are removed from the database.

Step 5: Exploring and Curating Edges

The Edge Explorer lets you browse the relationship layer of the graph.

5.1 Browsing Edges

Search: Filter edges by source or target node name
Relationship type filter: Select a specific relationship type (e.g. related, links_to, depends_on)
Pagination: Navigate with Prev/Next buttons

5.2 Inspecting Edge Details

Click any edge row to see:

The full source → relationship → target path
The provenance badge (human-authored, ai-inferred, or ai-reviewed)
Detailed cards for both the source and target nodes, including their properties and source pages

5.3 Relationship Type Conventions

Maintain consistent relationship types across your wiki. Common patterns:

| Relationship Type | Meaning | Example |
| --- | --- | --- |
| `related` | General topical relationship | `AssetAllocation → UnderstandingRiskTolerance` |
| `links_to` | Body text contains a markdown link (auto-generated) | `RetirementPlanningHub → AiDrivenRetirementPlanning` |
| `depends_on` | The source topic requires understanding of the target topic | `PortfolioRebalancing → AssetAllocation` |
| `supersedes` | This page replaces an older one | `NewPolicy → DeprecatedPolicy` |
| `implements` | Describes an implementation of a concept | `OurDeployProcess → ContinuousDeployment` |

Document your conventions and share them with content authors. Consistent relationship types make the graph queryable and meaningful.

Step 6: Using Structural Embeddings for Graph Improvement

The KG Embeddings tab provides three AI-powered tools for improving graph quality. All require a trained structural embedding model.

6.1 Predicted Missing Edges

The model scores all potential edges (entity pairs not currently connected) and surfaces the highest-scoring predictions — relationships the model believes should exist based on the graph's structure.

Each prediction shows:

Source and Target node names
Relationship type the model predicts
Score — higher means the model is more confident

Review each prediction and click Create to add the edge to the graph. The edge is created with human-authored provenance since you're making the editorial decision.

Best practice: Don't blindly accept high-scoring predictions. Verify that the relationship makes semantic sense by checking both pages. The model is finding structural patterns, not reading content — it might predict that two nodes should be connected because they have similar graph neighborhoods, even if the actual topics are unrelated.

6.2 Low-Plausibility Edges (Anomaly Detection)

These are existing edges that the model considers unlikely given the rest of the graph structure. A low plausibility score suggests the relationship is unusual or potentially incorrect.

Review these edges and ask:

Is this a data quality issue? Perhaps a typo in frontmatter created an edge to the wrong node.
Is this genuinely unusual but correct? Some legitimate relationships are structurally unusual — a cross-domain link between two otherwise unrelated topic clusters, for example.
Should this be deleted? If the edge is clearly wrong, go to the Edge Explorer to delete it, or fix the source page's frontmatter.

6.3 Merge Candidates

The model identifies pairs of nodes that may be duplicates based on three similarity scores:

Structure — how similar their graph neighborhoods are (same edges, same neighbors)
Content — how similar their wiki page content is (TF-IDF cosine similarity)
Combined — a weighted blend of structural and content similarity
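
As an illustration of how a weighted blend behaves, the sketch below combines the two scores with a single weight and ranks candidate pairs. The 0.5 default weight, the function names, and the sample scores are assumptions; this guide does not state the weights Wikantik actually uses.

```python
def combined_score(structure: float, content: float, structure_weight: float = 0.5) -> float:
    """Weighted blend of structural and content similarity."""
    if not 0.0 <= structure_weight <= 1.0:
        raise ValueError("structure_weight must be between 0 and 1")
    return structure_weight * structure + (1.0 - structure_weight) * content


def rank_merge_candidates(pairs: list, structure_weight: float = 0.5) -> list:
    """pairs: list of (node_a, node_b, structure, content) tuples."""
    scored = [
        (a, b, combined_score(s, c, structure_weight))
        for a, b, s, c in pairs
    ]
    return sorted(scored, key=lambda row: row[2], reverse=True)


candidates = [
    ("AssetAllocation", "TaxLossHarvesting", 0.7, 0.1),   # hypothetical scores
    ("RestApi", "REST API", 0.9, 0.8),
]
ranked = rank_merge_candidates(candidates)
```

A pair that scores high on structure but low on content drops down the list, which is one reason to look at all three scores before merging.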

Click Merge to combine two duplicate nodes. Merging:

Moves all edges from the source node to the target node
Updates all frontmatter references across wiki pages (renames the old name to the new name)
Deletes the source node
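
The three merge steps above can be sketched over plain in-memory data. Everything here is illustrative: the function name, the data shapes, and the choices to drop self-loops and de-duplicate renamed references are assumptions, not documented Wikantik behaviour.

```python
RESERVED = {"tags", "keywords", "type", "summary", "date", "author",
            "cluster", "status", "title", "description", "category", "language"}


def merge_nodes(edges: set, frontmatters: dict, source: str, target: str):
    """Merge node `source` into node `target`.

    edges: set of (src, relationship, dst) triples.
    frontmatters: {page_name: parsed frontmatter dict}.
    Returns (new_edges, new_frontmatters); the inputs are left untouched.
    """
    def rename(name):
        return target if name == source else name

    # 1. Move every edge that touches the source node onto the target node.
    new_edges = set()
    for src, rel, dst in edges:
        src, dst = rename(src), rename(dst)
        if src != dst:  # assumption: drop self-loops the merge would create
            new_edges.add((src, rel, dst))

    # 2. Rename references in relationship keys across all pages.
    new_frontmatters = {}
    for page, fm in frontmatters.items():
        updated = {}
        for key, value in fm.items():
            is_rel = (key not in RESERVED and isinstance(value, list)
                      and all(isinstance(v, str) for v in value))
            if is_rel:
                # dict.fromkeys de-duplicates while keeping order
                value = list(dict.fromkeys(rename(v) for v in value))
            updated[key] = value
        new_frontmatters[page] = updated

    # 3. The source node is gone: nothing references it any more.
    return new_edges, new_frontmatters


edges = {
    ("Hub", "related", "RestApi"),
    ("RestApi", "depends_on", "Http"),
    ("RestApi", "related", "REST API"),
}
pages = {"Hub": {"type": "hub", "tags": ["RestApi"], "related": ["RestApi", "REST API"]}}
new_edges, new_pages = merge_nodes(edges, pages, "RestApi", "REST API")
```

The sketch shows why a merge can touch pages you did not open: every page whose relationship keys mention the source name is rewritten.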

Before merging, verify that the two nodes genuinely represent the same concept. Common legitimate merge scenarios:

AssetAllocation and Asset Allocation (naming inconsistency)
REST API and RestApi (different naming conventions)
A stub node and a fully-realized page node for the same concept

Step 7: Using Content Embeddings for Page Quality

The Content Embeddings tab provides content-level intelligence.

7.1 Similar Page Pairs

After training the content model, this table shows the most similar page pairs ranked by TF-IDF cosine similarity. High similarity between two pages may indicate:

Duplicate content that should be merged or deduplicated
Overlapping topics that should cross-reference each other
Content that belongs in the same cluster but isn't tagged that way

Review the top pairs and take action:

Add related frontmatter links between legitimately related pages
Merge or consolidate truly duplicative pages
Update cluster assignments to group related content

7.2 Pages Without Frontmatter

This table lists all wiki pages that lack YAML frontmatter. These pages are invisible to the knowledge graph — they produce no nodes, no edges, and no properties. Each page name links to the editor so you can add frontmatter.

Prioritize adding frontmatter to:

Frequently accessed pages — these represent important knowledge that should be in the graph
Hub pages — these organize clusters and their relationships are particularly valuable
Pages referenced by other pages' frontmatter — without frontmatter, these are stub nodes instead of full entities

Step 8: Ongoing Maintenance

A healthy knowledge graph requires periodic attention, not just initial setup.

8.1 Weekly Review Checklist

Check the Proposals tab — approve or reject all pending proposals
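Check the embedding status at /admin/knowledge/embeddings/status and confirm it reports Ready (the chunk embedding indexer runs continuously, so there is no manual retrain step)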
Review Predicted Missing Edges — accept valid predictions to fill gaps
Review Low-Plausibility Edges — investigate and fix any data quality issues
Review Merge Candidates — merge confirmed duplicates
Check Similar Page Pairs — ensure high-similarity pages cross-reference each other
Review Pages Without Frontmatter — add frontmatter to newly created pages

8.2 After Major Content Changes

When you add a new cluster of pages, reorganize existing content, or bulk-edit frontmatter:

Click Project All Pages in the Node Explorer to refresh the entire graph
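Verify the embedding status at /admin/knowledge/embeddings/status; run bin/kg-rebuild.sh --skip-chunks only when you want a wholesale recompute, e.g. after switching the active embedding model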
Check the Node Explorer for any unexpected stubs (typos in new frontmatter)
Review the Edge Explorer to confirm new relationship types are consistent with conventions

8.3 Monitoring Graph Health

Use the schema summary in the Node Explorer to track key metrics:

Node count — should grow steadily as content is added
Edge count — should grow proportionally to nodes; a very low ratio may indicate sparse frontmatter
Pending proposals — should stay near zero
Node types and Relationship types — watch for unexpected types that indicate inconsistent naming
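
If you export node and edge lists from the graph, the metrics above are easy to compute yourself. The sketch below assumes simple tuples as input and a hand-maintained set of expected relationship types; the function name and data shapes are hypothetical.

```python
from collections import Counter


def graph_health(nodes: list, edges: list, expected_relationships: set) -> dict:
    """nodes: list of (name, node_type); edges: list of (src, relationship, dst)."""
    rel_counts = Counter(rel for _src, rel, _dst in edges)
    return {
        "node_count": len(nodes),
        "edge_count": len(edges),
        # A very low ratio may indicate sparse frontmatter.
        "edges_per_node": len(edges) / len(nodes) if nodes else 0.0,
        # Types outside your conventions often point to inconsistent naming.
        "unexpected_relationships": sorted(set(rel_counts) - set(expected_relationships)),
    }


nodes = [("AssetAllocation", "article"), ("RetirementPlanningHub", "hub")]
edges = [
    ("RetirementPlanningHub", "related", "AssetAllocation"),
    ("AssetAllocation", "depend_on", "MarketFundamentals"),   # typo for depends_on
]
report = graph_health(nodes, edges, {"related", "links_to", "depends_on"})
```

In this example the misspelled `depend_on` key surfaces as an unexpected relationship type.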

8.4 Re-projection vs. Incremental Updates

The Graph Projector runs automatically on every page save, so the graph stays current incrementally. Project All Pages is only needed when:

Setting up the graph for the first time
Recovering from a Clear All
The graph has drifted out of sync due to a bug or database issue
You've made bulk frontmatter changes outside the normal page save flow

Step 9: Optimizing the Graph for AI Agents

AI agents interact with the knowledge graph through the MCP server's propose_knowledge and list_proposals tools. To maximize the value agents get from the graph:

9.1 Maintain Rich Frontmatter

The more relationship types and targets you define in frontmatter, the more context agents can discover through graph traversal. A page with only tags and summary produces a node with properties but no frontmatter-derived edges, so its only connections are the links_to edges from its body links. A page with related, depends_on, implements, and other relationship keys produces a richly connected node that agents can traverse.

9.2 Use Descriptive Relationship Types

Agents use relationship types to understand the nature of connections, not just their existence. depends_on conveys a different meaning than related, which in turn differs from supersedes. Use specific, descriptive relationship types rather than dumping everything into related.

9.3 Provide Fast Feedback on Proposals

When agents submit proposals and receive timely feedback (approvals or rejections with reasons), they learn what kinds of knowledge are valued. Rejected proposals with clear reasons are especially valuable — the rejection history prevents agents from re-submitting the same incorrect relationships.

9.4 Fill Stub Nodes

Stub nodes are dead ends for agent traversal. When an agent follows an edge to a stub node, it finds no properties, no source page to read, and no onward edges. Prioritize creating wiki pages for stubs that appear as targets of many edges — these are clearly important concepts that the graph references but cannot describe.
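
Prioritizing stubs by how often they are referenced amounts to counting inbound edges. A minimal sketch, assuming you have the edge list and the set of stub names at hand (both data shapes are assumptions):

```python
from collections import Counter


def stubs_by_inbound_edges(edges: list, stubs: set) -> list:
    """Return (stub, inbound_edge_count) pairs, most-referenced first."""
    counts = Counter(dst for _src, _rel, dst in edges if dst in stubs)
    return counts.most_common()


edges = [
    ("RetirementPlanningHub", "related", "SafeWithdrawalRates"),
    ("AssetAllocation", "related", "SafeWithdrawalRates"),
    ("AssetAllocation", "links_to", "BondBasics"),
    ("AssetAllocation", "related", "UnderstandingRiskTolerance"),
]
stubs = {"SafeWithdrawalRates", "BondBasics"}
priorities = stubs_by_inbound_edges(edges, stubs)
```

Stubs at the top of this list are the pages whose absence agents will hit most often.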

9.5 Keep the Chunk Embeddings Current

Agents benefit from up-to-date similarity data when looking for related concepts. The chunk embedding indexer runs continuously on every page save, but if the active embedding model changes you'll want to issue bin/kg-rebuild.sh --skip-chunks to recompute all chunk vectors against the new model.

Step 10: Advanced Configuration

10.1 Adding Custom Relationship Types

To introduce a new relationship type, simply start using it in page frontmatter:

---
audited_by: [ComplianceTeam, SecurityReview]
---

Because audited_by is not in the reserved property set and its value is a list of strings, the Graph Projector will automatically create edges of type audited_by to each target. No configuration changes are needed — the schema is dynamic.

10.2 Reserving New Property Keys

If you need a list-valued frontmatter key to be treated as a property (not a relationship), it must be added to the PROPERTY_ONLY_KEYS set in the FrontmatterRelationshipDetector class. The current reserved set is: tags, keywords, type, summary, date, author, cluster, status, title, description, category, language.

Troubleshooting

Projection produces zero nodes

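Check that your pages have valid YAML frontmatter blocks delimited by --- lines. Pages without frontmatter still produce nodes but with no properties or typed edges. Also confirm that the pages' clusters are included under the KG inclusion policy (see Controlling KG Inclusion): inclusion is default-exclude, so a cluster you haven't touched contributes nothing.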

Proposals keep getting re-submitted by AI

Ensure you're rejecting unwanted proposals (not just ignoring them) and providing clear rejection reasons. Only rejected proposals are recorded in the rejection history that prevents re-submission.

Merge causes unexpected page edits

When you merge node A into node B, the system renames all frontmatter references from A to B across all wiki pages that have edges pointing to A. This is by design — it keeps frontmatter consistent with the graph. Review the merge confirmation dialog carefully before proceeding.

Stale edges persist after editing frontmatter

The Graph Projector runs a diff on every page save that removes edges no longer present in the frontmatter or body. If stale edges persist, try re-saving the affected page, or use Project All Pages to refresh the entire graph.
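
The per-save diff can be pictured as a set difference over one page's outbound edges. This is a sketch, not the projector's code: the function name is hypothetical, and it assumes that only edges whose source is the saved page are candidates for removal.

```python
def reconcile_page_edges(graph_edges: set, page: str, desired: set):
    """Diff one page's outbound edges against what its current content implies.

    graph_edges and desired are sets of (src, relationship, dst) triples.
    Returns (updated_graph_edges, to_add, to_remove).
    """
    current = {edge for edge in graph_edges if edge[0] == page}
    to_add = desired - current
    to_remove = current - desired        # stale: no longer in the page
    updated = (graph_edges - to_remove) | to_add
    return updated, to_add, to_remove


graph = {
    ("AssetAllocation", "related", "UnderstandingRiskTolerance"),
    ("AssetAllocation", "related", "OldPage"),
    ("RetirementPlanningHub", "related", "AssetAllocation"),
}
desired = {
    ("AssetAllocation", "related", "UnderstandingRiskTolerance"),
    ("AssetAllocation", "depends_on", "MarketFundamentals"),
}
updated, added, removed = reconcile_page_edges(graph, "AssetAllocation", desired)
```

Edges that originate from other pages are untouched, which is why re-saving the page that owns a stale edge is the first thing to try.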

Controlling KG Inclusion

The knowledge graph isn't built from every page. Cluster-level policy decides what contributes; per-page frontmatter overrides handle the rest. For the full model and dashboard walkthrough, see KgInclusionPolicy.

The short version:

Default-exclude. A cluster you haven't touched contributes nothing.
Cluster dashboard at /admin/kg-policy lets you toggle cluster inclusion with a reason. Eager reconciliation runs on commit.
Frontmatter override (kg_include: true | false) wins over cluster policy. Useful for WIP, sensitive, or one-off content.
CLI at bin/kg-policy.sh mirrors the dashboard for scripting and emergencies. purge --confirm is the only destructive operation.

System pages (Sandbox, Main, etc.) are always excluded — both from the KG and from the search index — via the existing SystemPageRegistry.
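
The precedence among these rules can be summarized in a short sketch. The function name and data shapes are hypothetical, and reading a page's cluster from its `cluster` frontmatter key is an assumption; see KgInclusionPolicy for the authoritative model.

```python
def is_included(page: str, frontmatter: dict, cluster_policy: dict, system_pages: set) -> bool:
    """Decide whether a page contributes to the knowledge graph."""
    # System pages are always excluded.
    if page in system_pages:
        return False
    # A per-page frontmatter override wins over cluster policy.
    override = frontmatter.get("kg_include")
    if isinstance(override, bool):
        return override
    # Default-exclude: clusters without an explicit policy contribute nothing.
    return cluster_policy.get(frontmatter.get("cluster"), False)


cluster_policy = {"retirement": True, "drafts": False}
system_pages = {"Sandbox", "Main"}

included = is_included(
    "AssetAllocation", {"cluster": "retirement"}, cluster_policy, system_pages
)
excluded_wip = is_included(
    "WorkInProgress", {"cluster": "retirement", "kg_include": False},
    cluster_policy, system_pages,
)
```

The order of the checks mirrors the rules above: system pages first, then the frontmatter override, then cluster policy with exclusion as the default.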