Choosing a Retrieval Mode
Wikantik exposes three retrieval surfaces. Each has a different cost
profile and a different sweet spot. Picking the wrong one wastes tokens
and produces low-quality answers; picking the right one tends to land
the agent on a useful page in one or two calls.
When to use this runbook
Whenever an agent's first instinct is "I'll just call search_knowledge
and see what comes back". That works most of the time, but the failure
modes are real and the cost of a wrong second call is high. Use the
frontmatter `steps` block as the contract; this body explains the *why*
behind each step.
Context
- **BM25 (`/api/search`)** — fast, deterministic, no dependency on the
embedding service. Best when the query already contains the right
vocabulary.
- **Hybrid (`/knowledge-mcp/search_knowledge`)** — BM25 + dense
embeddings fused via Reciprocal Rank Fusion (k=60). The default. Best
for natural-language queries that don't necessarily share vocabulary
with the target page.
- **Graph traversal (`/knowledge-mcp/query_nodes` + `traverse`)** — only
useful when the query is about a *named entity* the knowledge graph
has indexed. Wrong tool for "how does feature X work" questions.
Walkthrough
The frontmatter `steps` are the canonical sequence. A few elaborations:
- The "top-5 share one cluster" trick (step 2) is a classic over-eager
hybrid result. Hybrid loves clusters; if you want breadth, you have to
ask for it explicitly via the structural index.
- The graph-traversal path (step 3) sounds powerful but is a niche
optimisation. Most agent queries are not about named entities; reach
for it only when you can name the entity in the query.
- The BM25 fallback (step 4) is the diagnostic move. If BM25 returns
results that hybrid missed, the embedding service is probably
misbehaving — see `HandlingEmbeddingServiceOutages`.
Pitfalls
The frontmatter `pitfalls` list captures the failure modes worth
internalising. The chaining-budget pitfall (≤ 3 calls) is the most
expensive to violate — agents that loop "search again with a different
phrasing" routinely burn 15+ retrieval calls without improving the
hit set.