Choosing a Retrieval Mode

Wikantik exposes three retrieval surfaces. Each has a different cost profile and a different sweet spot. Picking the wrong one wastes tokens and produces low-quality answers; picking the right one tends to land the agent on a useful page in one or two calls.

When to use this runbook

Whenever an agent's first instinct is "I'll just call search_knowledge and see what comes back". That works most of the time, but the failure modes are real and the cost of a wrong second call is high. Use the frontmatter steps block as the contract; this body explains the why behind each step.

Context

BM25 (/api/search) — fast, deterministic, no dependency on the embedding service. Best when the query already contains the right vocabulary.
Hybrid (/knowledge-mcp/search_knowledge) — BM25 + dense embeddings fused via Reciprocal Rank Fusion (k=60). The default. Best for natural-language queries that don't necessarily share vocabulary with the target page.
Graph traversal (/knowledge-mcp/query_nodes + traverse) — only useful when the query is about a named entity the knowledge graph has indexed. Wrong tool for "how does feature X work" questions.

Walkthrough

The frontmatter steps are the canonical sequence. A few elaborations:

The "top-5 share one cluster" trick (step 2) is a classic over-eager hybrid result. Hybrid loves clusters; if you want breadth, you have to ask for it explicitly via the structural index.
The graph-traversal path (step 3) sounds powerful but is a niche optimisation. Most agent queries are not about named entities; reach for it only when you can name the entity in the query.
The BM25 fallback (step 4) is the diagnostic move. If BM25 returns results that hybrid missed, the embedding service is probably misbehaving — see HandlingEmbeddingServiceOutages.

Pitfalls

The frontmatter pitfalls list captures the failure modes worth internalising. The chaining-budget pitfall (≤ 3 calls) is the most expensive to violate — agents that loop "search again with a different phrasing" routinely burn 15+ retrieval calls without improving the hit set.