Knowledge Graph Completion

Knowledge Graph Completion (KGC) is the task of inferring missing triples (s, r, o) in a graph. In a production Wikantik instance, this isn't an academic exercise; it's the mechanism that turns a sparse set of extracted entities into a dense reasoning substrate for agents.

A complete KG allows an agent to answer "What is the security posture of the authentication service?" even if no single document explicitly links AuthenticationService to OAuth2.

1. The Geometry of Link Prediction

Link prediction assumes that entities and relations can be mapped to a continuous vector space where the truth of a triple is proportional to a score function $f_r(s, o)$ .

Translational Distance Models (TransE, RotatE)

In TransE, the relation is a translation vector: $\mathbf{s} + \mathbf{r} \approx \mathbf{o}$ .

Failure mode: Cannot handle 1-to-N relations. If (USA, has_state, NewYork) and (USA, has_state, California), TransE forces NewYork and California to the same vector.
Production Fix (RotatE): Maps entities to complex vectors $\mathbb{C}^d$ and relations to rotations: $\mathbf{o} = \mathbf{s} \circ \mathbf{r}$ , where $|\mathbf{r}_i| = 1$ . This handles symmetry, antisymmetry, and inversion.

Bilinear Models (ComplEx)

ComplEx uses the Hermitian dot product in complex space:

f_r(s, o) = \text{Re}(\langle \mathbf{w}_r, \mathbf{e}_s, \bar{\mathbf{e}}_o \rangle)

This is the state-of-the-art baseline for large-scale KGs because it scales linearly with entity count and captures asymmetric relations (e.g., parent_of) effectively.

2. LLM-Augmented Extraction (The Production Path)

While embedding models predict links from existing structure, LLMs extract links from unstructured evidence. In Wikantik, we use a verification loop:

def verify_extracted_triple(subject, relation, obj, context_chunk):
    # 1. Structural Check
    if not kg.has_relation_type(relation):
        return False, "INVALID_RELATION"
    
    # 2. Embedding Consensus (using RotatE score)
    score = kg_embedding_model.score(subject, relation, obj)
    if score < THRESHOLD_ANOMALY:
        # If the model is shocked by this triple, require higher LLM confidence
        min_confidence = 0.95
    else:
        min_confidence = 0.70
        
    # 3. LLM Multi-Pass Verification
    return llm.verify(
        f"Does '{context_chunk}' prove ({subject}, {relation}, {obj})?",
        min_confidence=min_confidence
    )

3. Evaluation: Moving Beyond Hits@10

Standard academic benchmarks (FB15k-237) are often leaked into LLM training sets. For production KGC, you must measure:

Metric	Why it matters	Calculation
MRR (Mean Reciprocal Rank)	Rewards the model for putting the truth at #1 vs #10.	$\frac{1}{\|Q\|} \sum_{i=1}^{\|Q\|} \frac{1}{rank_i}$
Filtered Hits@1	Strict accuracy. "Filtered" means we don't penalize the model for picking a different true triple that isn't the ground truth for this specific test case.	Count of true positives at rank 1 / total queries
Relation-Specific Precision	Some relations (e.g., `is_a`) are easier than others (`impacts`).	$\frac{TP}{TP + FP}$ per relation type

Critical Trap: Beware of "Entity Leakage." If your training set contains (A, part_of, B) and your test set contains (B, contains, A), a simple model will "predict" the link via inversion without understanding the semantics.

4. Implementation Checklist

Entity Resolution First: If OpenAI and OpenAI Inc. are separate nodes, KGC will fail. Perform hard-string normalization and LLM-based fuzzy matching before training embeddings.
Negative Sampling: To train, you need "false" triples. Generate these by corrupting true triples (replace o with a random entity o').
Graph Neural Networks (CompGCN): If your graph has high-order dependencies (A affects B, B affects C, therefore A affects C), use a GCN layer to propagate features before the scoring function.
Schema Enforcement: Never allow an LLM to invent a relation type. Use a RelationRegistry to map LLM-produced strings to canonical IDs.