Knowledge Graph Rerank
Knowledge Graph Rerank is the final stage of the Wikantik retrieval pipeline. It transforms a raw list of search results into a contextually relevant set of pages by leveraging the semantic relationships stored in the [Knowledge Graph](Knowledge Graph).
The Retrieval Pipeline
The Wikantik search engine (exposed via `/api/search` and the `/knowledge-mcp` tool `retrieve_context`) follows a multi-stage process:
1. **Lexical Retrieval (BM25)**: A fast keyword-based search against the Lucene index.
2. **Semantic Retrieval (Dense Vector)**: Cosine similarity search using embeddings stored in `pgvector`.
3. **Hybrid Fusion**: Combining the BM25 and vector result rankings using Reciprocal Rank Fusion (RRF), which scores each page by its rank in each list rather than by the raw retrieval scores.
4. **Graph Rerank (Final Stage)**: Adjusting the scores based on "Node Mention" density and co-occurrence in the graph.
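The fusion step can be illustrated with a minimal sketch of weighted RRF. This is not the Wikantik implementation: the function name `weighted_rrf`, the page names, the constant `k = 60` (a commonly used default), and the assumption that the configured BM25 and vector weights scale each retriever's contribution are all illustrative.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of page names into a single ranking.

    rankings: dict of retriever name -> list of page names, best first.
    weights:  dict of retriever name -> weight (assumed to scale each term).
    k:        smoothing constant; 60 is a common default, not a Wikantik setting.
    """
    scores = defaultdict(float)
    for name, ranked_pages in rankings.items():
        for rank, page in enumerate(ranked_pages, start=1):
            # Each retriever contributes weight / (k + rank) for a page.
            scores[page] += weights[name] / (k + rank)
    # Highest fused score first; ties broken by page name for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings for one query.
rankings = {
    "bm25": ["PageA", "PageB", "PageC"],
    "vector": ["PageB", "PageD", "PageA"],
}
weights = {"bm25": 0.4, "vector": 0.6}
fused = weighted_rrf(rankings, weights)
```

Because only ranks enter the formula, the BM25 and cosine-similarity scores never need to be normalized onto a common scale.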
How Graph Rerank Works
The reranker identifies "seed nodes" within the top-N results from the hybrid fusion stage. It then uses the [KnowledgeGraphService](KnowledgeGraphService) to find co-mentioned neighbors and high-confidence relationships.
- **Boost by Co-mention**: If a page is not in the top results but is heavily co-mentioned with multiple pages that *are* in the top results, its score is boosted.
- **Entity Density**: Pages that contain a high density of entities (nodes) related to the query are prioritized.
- **Fail-Closed Fallback**: As documented in [WikantikDevelopment](WikantikDevelopment), if the embedding service or the graph database is unreachable, the system falls back to a BM25-only result set, skipping the vector and graph stages so that search stays available.
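The co-mention boost and the BM25 fallback can be sketched roughly as follows. Everything here is an assumption made for illustration: the function names, the `boost` factor, the "at least two seeds" threshold, the use of `ConnectionError` to signal an unreachable service, and the shape of the co-mention data. The actual [KnowledgeGraphService](KnowledgeGraphService) API is not shown in this page.

```python
def graph_rerank(fused, co_mentions, depth=20, boost=0.5):
    """Boost pages outside the top `depth` that are co-mentioned with seeds.

    fused:       list of (page, score) pairs from hybrid fusion, best first.
    co_mentions: dict of page -> {neighbor_page: co-mention count}.
    boost:       hypothetical scaling factor for the co-mention bonus.
    """
    seeds = [page for page, _ in fused[:depth]]
    scores = dict(fused)
    for page, score in fused[depth:]:
        neighbors = co_mentions.get(page, {})
        linked = [seed for seed in seeds if neighbors.get(seed, 0) > 0]
        # Only boost pages linked to multiple seed pages.
        if len(linked) >= 2:
            total = sum(neighbors[seed] for seed in linked)
            scores[page] = score * (1 + boost * total)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def retrieve(query, bm25_search, hybrid_search, load_co_mentions, depth=20):
    """Run hybrid search plus rerank, falling back to BM25 on outages."""
    try:
        fused = hybrid_search(query)
        co_mentions = load_co_mentions([page for page, _ in fused])
    except ConnectionError:
        # Embedding service or graph database unreachable: BM25 only.
        return bm25_search(query)
    return graph_rerank(fused, co_mentions, depth=depth)


# Hypothetical data: with depth=2, pages A and B are the seeds.
fused = [("A", 0.9), ("B", 0.8), ("D", 0.4), ("C", 0.3)]
co_mentions = {"C": {"A": 3, "B": 2}, "D": {"A": 1}}
reranked = graph_rerank(fused, co_mentions, depth=2)
```

In this example page `C` moves to the top because it is co-mentioned with both seeds, while `D` keeps its score because it is linked to only one.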
Configuration
Reranking behavior is controlled via `wikantik-custom.properties`:
```properties
# Enable/Disable graph reranking
jspwiki.search.graphRerank.enabled = true
# Weights for different retrieval signals
jspwiki.search.hybrid.bm25Weight = 0.4
jspwiki.search.hybrid.vectorWeight = 0.6
# Rerank depth (how many initial results to consider)
jspwiki.search.graphRerank.depth = 20
```
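When experimenting with these settings outside the wiki, a small script can check the values before they are deployed. The sketch below is a simplified parser for `key = value` lines, not a full implementation of the Java properties format, and `parse_properties` is a hypothetical helper rather than part of Wikantik.

```python
def parse_properties(text):
    """Parse simple `key = value` lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


SAMPLE = """
# Enable/Disable graph reranking
jspwiki.search.graphRerank.enabled = true
jspwiki.search.hybrid.bm25Weight = 0.4
jspwiki.search.hybrid.vectorWeight = 0.6
jspwiki.search.graphRerank.depth = 20
"""

props = parse_properties(SAMPLE)
rerank_enabled = props.get("jspwiki.search.graphRerank.enabled", "false").lower() == "true"
bm25_weight = float(props["jspwiki.search.hybrid.bm25Weight"])
vector_weight = float(props["jspwiki.search.hybrid.vectorWeight"])
depth = int(props["jspwiki.search.graphRerank.depth"])

# Basic sanity checks: weights must be non-negative, depth must be positive.
if bm25_weight < 0 or vector_weight < 0 or depth <= 0:
    raise ValueError("invalid search configuration")
```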
Performance Impact
Benchmarks recorded in [KnowledgeGraphExtractionBenchmarks](KnowledgeGraphExtractionBenchmarks) show that graph reranking adds approximately 15-20 ms of latency per query. In exchange, it significantly improves `Recall@5` for "multi-hop" queries, where the relevant information is spread across related topics rather than contained in a single page.
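For reference, `Recall@5` is the fraction of a query's relevant pages that appear among the first five results. The sketch below shows the calculation. The page names and result lists are invented for illustration and are not taken from the benchmark data.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of relevant pages found in the top k retrieved pages."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = set(retrieved[:k]) & relevant
    return len(hits) / len(relevant)


# Hypothetical multi-hop query whose answer spans two pages.
relevant_pages = {"PageA", "PageD"}
before_rerank = ["PageA", "PageB", "PageC", "PageE", "PageF", "PageD"]
after_rerank = ["PageA", "PageD", "PageB", "PageC", "PageE", "PageF"]

recall_before = recall_at_k(before_rerank, relevant_pages)  # PageD is ranked 6th
recall_after = recall_at_k(after_rerank, relevant_pages)    # both pages in top 5
```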