Choosing a Retrieval Mode

Wikantik exposes three retrieval surfaces. Each has a different cost
profile and a different sweet spot. Picking the wrong one wastes tokens
and produces low-quality answers; picking the right one tends to land
the agent on a useful page in one or two calls.

When to use this runbook

Use this runbook whenever an agent's first instinct is "I'll just call
`search_knowledge` and see what comes back". That works most of the
time, but the failure modes are real and the cost of a wrong second
call is high. Treat the frontmatter `steps` block as the contract; this
body explains the *why* behind each step.

Context

- **BM25 (`/api/search`)**: fast, deterministic, no dependency on the
  embedding service. Best when the query already contains the right
  vocabulary.

- **Hybrid (`/knowledge-mcp/search_knowledge`)**: BM25 + dense
  embeddings fused via Reciprocal Rank Fusion (k=60). The default. Best
  for natural-language queries that don't necessarily share vocabulary
  with the target page.

- **Graph traversal (`/knowledge-mcp/query_nodes` + `traverse`)**: only
  useful when the query is about a *named entity* the knowledge graph
  has indexed. Wrong tool for "how does feature X work" questions.
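
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion with k=60. It illustrates the general technique, not Wikantik's actual implementation; the function name `rrf_fuse` and the page names are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Hypothetical helper, for illustration only.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers.
bm25_hits = ["PageA", "PageB", "PageC"]
dense_hits = ["PageC", "PageA", "PageD"]
fused = rrf_fuse([bm25_hits, dense_hits])
```

Pages that rank well in both lists rise to the top, which is why hybrid can surface a page whose vocabulary differs from the query as long as the dense ranking places it highly.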

Walkthrough

The frontmatter `steps` are the canonical sequence. A few elaborations:

- The "top-5 share one cluster" condition (step 2) signals a classic
  over-eager hybrid result. Hybrid loves clusters; if you want breadth,
  you have to ask for it explicitly via the structural index.

- The graph-traversal path (step 3) sounds powerful but is a niche
  optimisation. Most agent queries are not about named entities; reach
  for it only when you can name the entity in the query.

- The BM25 fallback (step 4) is the diagnostic move. If BM25 returns
  results that hybrid missed, the embedding service is probably
  misbehaving; see `HandlingEmbeddingServiceOutages`.
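
The two checks described above (step 2 and step 4) can be sketched as small helpers. This is an illustration under assumptions: the hit shape (dicts with `page` and `cluster` keys) and both function names are hypothetical, not the actual response schema of `search_knowledge` or `/api/search`.

```python
def top_hits_share_cluster(hits, n=5):
    """Return True when the first n hits all belong to a single cluster.

    Assumes each hit is a dict with a "cluster" key (hypothetical shape).
    """
    top = hits[:n]
    clusters = {hit["cluster"] for hit in top}
    return len(top) == n and len(clusters) == 1


def missed_by_hybrid(bm25_pages, hybrid_pages):
    """Return pages BM25 found that hybrid did not, in BM25 order."""
    seen = set(hybrid_pages)
    return [page for page in bm25_pages if page not in seen]


# Hypothetical hybrid result where every top hit sits in one cluster.
hybrid_hits = [{"page": f"Page{i}", "cluster": "search"} for i in range(5)]
needs_breadth = top_hits_share_cluster(hybrid_hits)

# Hypothetical comparison of BM25 and hybrid page lists.
gaps = missed_by_hybrid(["PageA", "PageB"], ["PageB", "PageC"])
```

A non-empty `gaps` list is only a hint that the embedding service deserves a look; it does not prove an outage.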

Pitfalls

The frontmatter `pitfalls` list captures the failure modes worth
internalising. The chaining-budget pitfall (≤ 3 calls) is the most
expensive to violate: agents that loop "search again with a different
phrasing" routinely burn 15+ retrieval calls without improving the
hit set.
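
One way to hold an agent to the chaining budget is a small counter that refuses further retrieval calls once the limit is reached. The sketch below assumes the limit of 3 from the pitfall above; the class `RetrievalBudget` and its methods are hypothetical and not part of Wikantik.

```python
class RetrievalBudget:
    """Track retrieval calls and fail loudly once the budget is spent."""

    def __init__(self, max_calls=3):
        self.max_calls = max_calls
        self.used = 0

    def spend(self):
        """Record one retrieval call, or raise if the budget is exhausted."""
        if self.used >= self.max_calls:
            raise RuntimeError(
                f"retrieval budget of {self.max_calls} calls exhausted; "
                "stop rephrasing and change retrieval mode instead"
            )
        self.used += 1

    @property
    def remaining(self):
        return self.max_calls - self.used


budget = RetrievalBudget()
for _ in range(3):
    budget.spend()  # three calls fit within the budget

try:
    budget.spend()  # a fourth call is rejected
    fourth_call_allowed = True
except RuntimeError:
    fourth_call_allowed = False
```

Raising an error, instead of silently returning an empty result, forces the calling loop to decide what to do next instead of rephrasing the same query again.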