Interpreting Hybrid Retrieval Metrics

Wikantik publishes a small set of Prometheus metrics covering structural

index health, for-agent projection size, and (post-Phase 5) retrieval

quality. They're all under the `wikantik_` prefix.

When to use this runbook

When you have a `/metrics` scrape and want to convert numbers into

verdicts.

Context

- **Phase 1 (shipped):** `wikantik_structural_index_{pages_total,clusters_total,tags_total,unclaimed_total,lag_seconds}` — gauges. Plus `wikantik_structural_index_rebuild_duration_seconds` (timer) for rebuild cost.

- **Phase 2 (shipped):** `wikantik_for_agent_response_bytes` — DistributionSummary with percentile histogram. Records every projection's serialised size.

- **Phase 5 (planned):** `wikantik_retrieval_{ndcg_at_5,ndcg_at_10,recall_at_20,mrr}{set,mode}` — gauges, plus `wikantik_retrieval_run_duration_seconds` (histogram) and `wikantik_retrieval_run_failed_total` (counter).

Walkthrough

The frontmatter `steps` walk the metric set in priority order: index

health first (the foundation), projection size second (the agent

contract), retrieval quality third (the eventual signal).

Pitfalls

The frontmatter `pitfalls` capture the recurring misreads. The

"comparing absolute numbers across deploys" trap is especially common —

agents take a snapshot of metrics, deploy a different corpus, then

report that "retrieval got worse" when the page count just shifted.