Graph Rot & Knowledge Graph Quality: Chapter 2

The Edge That Shouldn't Exist: Detecting Wrong Relationships in a Knowledge Graph

Muhammad Mudassir

Founder & CEO, Cognilium AI

TL;DR

A mislink is an edge between two real nodes that no document supports. Here is how we detect wrong relationships in a production knowledge graph.


A knowledge graph we run for a family office once told an agent that a managing director sat on the board of a company he had never been part of. Both the director and the company were real. Both had been correctly identified, deduplicated, and scored. The only thing wrong was the line drawn between them: an edge no document actually supported.

That edge passed every check we had at the time. The director node was clean. The company node was clean. The relationship had a type, a direction, and a confidence score. Nothing was missing. It was simply false.

This is the third post in the series on graph rot. The first named the seven ways a knowledge graph rots. The second went deep on duplicate entities and how you decide that eleven names are one company. This one is about the failure mode that is hardest to see and most dangerous to leave in place: the mislink, an edge that should not exist.

What is a mislink, and how is it different from a duplicate?

A mislink is a relationship between two nodes that are each correct, where the connection itself is wrong. The endpoints are right. The edge is a lie.

That makes it the mirror image of the duplicate problem. A duplicate is a failure of identity: one real thing stored as several nodes. A mislink is a failure of relationship: two real things joined by an edge that no source supports. Fixing duplicates is about deciding what a node is. Fixing mislinks is about deciding whether a connection is true.

In a knowledge graph, the nodes are the nouns and the edges are the claims. “Acme is owned by Fund II” is a claim. “Jane Doe is a director of Acme” is a claim. The entire reason you build a graph instead of keeping a pile of documents is so an agent can traverse those claims to answer questions. Which means a wrong edge is not a cosmetic flaw. It is a false statement the agent will repeat as fact.

The nodes are the nouns. The edges are the claims. A mislink is a false claim that passed every check you had.

Why are mislinks the hardest kind of graph rot to catch?

Because every individual piece of a mislink looks valid.

When a node is duplicated, you can often spot it by scanning for similar names. When a fact is stale, you can check it against a date. A mislink has none of those tells. The source node is a real, validated entity. The target node is a real, validated entity. The relationship type is one your schema allows. The edge even carries a confidence score, because the extraction step that created it was confident.

Nothing about a single mislink is anomalous on its own. It reveals itself only in context: when you cross-check it against the source documents, against the other edges around it, or against a ground truth like a cap table. A spot check finds them by accident. A system has to go looking for them on purpose. That is exactly why most pipelines never catch them. They were built to extract relationships, not to doubt them.

In the research literature, this falls under knowledge graph refinement, the body of work on finding and repairing wrong facts in a graph. Heiko Paulheim's survey on the subject is the standard reference, along with the error-detection work that followed it. Very little of that research has been turned into production tooling. That gap is part of why a real pipeline so rarely ships with a step whose only job is to catch wrong edges.

Where do mislinks come from?

They come from four places, and naming them is half the battle.

  1. Over-eager extraction. The language model is asked to find relationships, so it finds relationships. Given a document that mentions a director and a company in the same paragraph, a model will often connect them even when the text only places them on the same page. Co-occurrence is not a relationship, but to a model under instruction to extract edges, it can look like one.
  2. Ambiguous references. A document says “the Fund,” or “the Company,” or “he,” and the extractor has to decide which fund, which company, which person. Resolve that reference to the wrong entity and you get a perfectly typed edge pointing at the wrong node. This is where identity and relationship blur together: a near-miss in entity resolution becomes a wrong edge.
  3. Cross-document stitching errors. When you assemble one graph from six document types, you stitch facts from a PPM (private placement memorandum) to facts from a Schedule K-1 to facts from a cap table. Each stitch is an inference. Get one wrong and you connect the right two entities through the wrong intermediary.
  4. Schema pressure. A pipeline built to populate a fixed set of relationship types tends to force ambiguous evidence into one of those slots rather than leave it unconnected. The edge gets created because the schema has a place for it, not because the document earned it.

How do you detect an edge that shouldn't exist?

You run a dedicated pass, after the graph is built, whose only job is to doubt edges.

[Figure: Flowchart of a mislink-detection pass. All edges pass through a grounding check, a structural-anomaly check, then an LLM evidence re-read, with suspicious edges flagged for human review and a cap-table reconciler as ground truth.]

On the family office platform, this is a post-creation mislink-detection stage. It sits after extraction, identity resolution, and validation in our eight-stage pipeline, and before any agent is allowed to query the Neo4j graph of five node types: company, person, investment, vehicle, and document. It works on three signals, because no single signal catches every mislink.

  1. Grounding. Every edge must be able to point to the sentence it came from. An edge whose supporting evidence cannot be located in any source document is the first thing we flag. This is the same grounding discipline used against hallucination elsewhere: a claim that cannot cite its source does not get to stay.
  2. Structural anomaly. Some edges are wrong because they are structurally impossible or improbable: a person who suddenly holds board seats at forty companies, a fund that owns itself, or an investment that points at a person instead of a company. Type constraints and degree checks catch the edges that violate the shape the graph is supposed to have.
  3. Evidence re-reading with a model. For the edges that survive the first two checks but still look suspicious, we hand the model the edge and the exact passages it was supposedly drawn from, and ask a narrow question: does this text actually support this relationship, yes or no, with a reason. This is LLM disambiguation pointed at edges instead of entities, and it is deliberately scoped to the ambiguous middle so the cost stays bounded.
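
The first two signals are cheap enough to sketch in a few lines of Python. This is a minimal illustration, not our production code: the Edge fields, the relationship names in ALLOWED, and the MAX_BOARD_SEATS threshold are all assumptions made for the example. Grounding here is a normalized substring match of the cited sentence against the cited document, which is the simplest possible version of the check.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str         # id of the source node
    target: str         # id of the target node
    rel_type: str       # relationship name, e.g. "DIRECTOR_OF"
    evidence: str = ""  # sentence the extractor cited, if any
    doc_id: str = ""    # document that sentence should appear in


# Hypothetical schema: allowed (source type, target type) per relationship.
ALLOWED = {
    "DIRECTOR_OF": ("person", "company"),
    "OWNS": ("vehicle", "company"),
    "INVESTED_IN": ("investment", "company"),
}
MAX_BOARD_SEATS = 10  # illustrative threshold, tune per portfolio


def normalize(text):
    """Lowercase and collapse whitespace so line breaks do not defeat matching."""
    return " ".join(text.lower().split())


def is_grounded(edge, documents):
    """True only if the cited sentence can be found in the cited document."""
    quote = normalize(edge.evidence)
    body = normalize(documents.get(edge.doc_id, ""))
    return bool(quote) and quote in body


def structural_flags(edges, node_types):
    """Return {edge: [reasons]} for edges that break the expected graph shape."""
    seats = Counter(e.source for e in edges if e.rel_type == "DIRECTOR_OF")
    flags = {}
    for e in edges:
        reasons = []
        if e.source == e.target:
            reasons.append("self-loop")
        expected = ALLOWED.get(e.rel_type)
        actual = (node_types.get(e.source), node_types.get(e.target))
        if expected is not None and actual != expected:
            reasons.append("type mismatch")
        if e.rel_type == "DIRECTOR_OF" and seats[e.source] > MAX_BOARD_SEATS:
            reasons.append("too many board seats")
        if reasons:
            flags[e] = reasons
    return flags
```

An exact-quote match is strict on purpose: it fails any edge whose evidence was paraphrased, which is usually what you want for a first filter. A fuzzier matcher would trade some of that strictness for fewer false alarms.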

When do you bring a model in to judge an edge?

Only on the suspicious minority, never on the whole graph.

Re-reading every edge with a model would be slow and expensive, and most edges are obviously fine. So the cheap checks run first: grounding and structural anomaly filter the graph down to the edges that are actually in doubt. Only those reach the model, and when they do, the model is not asked to imagine a relationship. It is asked to confirm or reject one against evidence already in hand, and to explain its verdict so the decision can be audited later.

This is the same principle from the entity-resolution work: the model is the most expensive tool in the pipeline, so you spend it only where cheaper signals have already narrowed the question. It judges. It does not hunt.
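
The ordering can be sketched as a small routing function. Everything named here is hypothetical: review_edges, the Verdict shape, and the cheap_checks mapping are illustration only, and judge stands in for whatever wraps the actual model call. The point of the sketch is the call count: edges that pass every cheap check never reach the judge.

```python
from typing import NamedTuple


class Verdict(NamedTuple):
    supported: bool  # does the passage support the relationship?
    reason: str      # short explanation kept for later audit


def review_edges(edges, cheap_checks, judge):
    """Run cheap checks on every edge; call the judge only on flagged ones.

    cheap_checks maps a check name to a function returning True when the
    edge passes. judge takes an edge (with its passages) and returns a Verdict.
    """
    kept, rejected, judge_calls = [], [], 0
    for edge in edges:
        failed = [name for name, check in cheap_checks.items() if not check(edge)]
        if not failed:
            kept.append(edge)  # obviously fine, no model call spent
            continue
        judge_calls += 1
        verdict = judge(edge)
        if verdict.supported:
            kept.append(edge)
        else:
            # Keep the failed checks and the model's reason for the audit trail.
            rejected.append((edge, failed, verdict.reason))
    return kept, rejected, judge_calls
```

In a real pipeline the rejected list would go to human review rather than straight to deletion, and judge_calls is the number to watch when estimating cost.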

How do you check edges at scale without re-reading everything?

By treating suspicious-edge selection as its own step.

A graph built from a multi-document portfolio has far more edges than you can afford to re-verify one by one. The trick is the same one that makes entity resolution tractable: you do not compare everything to everything. You generate a candidate set of edges worth doubting, using cheap structural and grounding signals, and you spend the expensive verification only on that set. Most edges never need a second look. The ones that do are the edges where the evidence is thin, the structure is odd, or the same relationship was asserted inconsistently across two documents.
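
One way to build that candidate set, sketched under assumptions: each assertion is a dict with source, rel_type, target, an optional value (such as a stake), a confidence, and a doc_id. The field names and the 0.6 threshold are invented for the example. Assertions are grouped by the relationship they claim, and a group becomes a candidate when documents disagree on the value or when no assertion in the group is confident.

```python
from collections import defaultdict


def select_candidates(assertions, min_confidence=0.6):
    """Return {(source, rel_type, target): [reasons]} for edges worth doubting."""
    grouped = defaultdict(list)
    for a in assertions:
        grouped[(a["source"], a["rel_type"], a["target"])].append(a)

    candidates = {}
    for key, group in grouped.items():
        reasons = []
        # The same relationship asserted with different values across documents.
        values = {a["value"] for a in group if a.get("value") is not None}
        if len(values) > 1:
            reasons.append("conflicting values")
        # No supporting assertion clears the confidence bar.
        if max(a["confidence"] for a in group) < min_confidence:
            reasons.append("thin evidence")
        if reasons:
            candidates[key] = reasons
    return candidates
```

Grouping is a single pass over the assertions, so the cost grows with the number of edges rather than with the number of edge pairs.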

And where a ground truth exists, you use it without mercy. A cap table is a closed system: ownership percentages sum to a whole, and every stake ties to an owner. So the cap-table reconciler doubles as a mislink detector. If the ownership edges into a company imply stakes that break the totals, at least one of those edges is wrong, and the arithmetic says so without anyone re-reading a single page. This is the same reconciliation we lean on for data engineering and pipeline correctness across the platform: trust the signal that is expensive to fake, and let the math flag what the prose hides.
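
A minimal sketch of that arithmetic, assuming ownership edges are (owner, company, percent) tuples and the cap table is a dict keyed by (owner, company). Both shapes, the function names, and the 0.01 tolerance are assumptions for illustration. Percentages are passed as strings and parsed with Decimal so that the totals do not pick up floating-point noise.

```python
from collections import defaultdict
from decimal import Decimal


def reconcile(edges, cap_table, tolerance=Decimal("0.01")):
    """Compare each ownership edge in the graph against the cap table."""
    flagged = []
    for owner, company, pct in edges:
        expected = cap_table.get((owner, company))
        if expected is None:
            flagged.append((owner, company, "owner not on cap table"))
        elif abs(Decimal(pct) - Decimal(expected)) > tolerance:
            flagged.append((owner, company, "stake differs from cap table"))
    return flagged


def overallocated(edges, tolerance=Decimal("0.01")):
    """Return companies whose graph stakes add up to more than 100 percent."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for _, company, pct in edges:
        totals[company] += Decimal(pct)
    return {c: t for c, t in totals.items() if t > Decimal("100") + tolerance}
```

The totals check only says that something flowing into a company is wrong. The per-edge comparison against the cap table is what points at the specific edge.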

What does a mislink cost once an agent acts on it?

More than a wrong answer. It costs a wrong decision that looks well-sourced.

When a GraphRAG or agent system traverses the graph to answer “who controls this company” or “what is our exposure to this counterparty,” it treats every edge as true. A mislink does not produce a vague or hedged answer. It produces a specific, confident, traceable one that happens to be false. The agent will even cite the node it walked through, which makes the wrong answer more believable, not less.

This is why edge correctness is a precondition for everything built on top of the graph. When we put an enterprise retrieval and agent layer over a knowledge graph, its trustworthiness is capped by the edges underneath it. A reranker cannot fix a false relationship. A better model only states the falsehood more fluently. The correctness has to live in the graph itself.

How do you know your edges are trustworthy?

You measure the graph against its sources and its ground truths, not against itself.

Every edge in the family office graph carries a confidence score from 0.0 to 1.0 and, where possible, a pointer to its supporting evidence. That lets us ask the graph directly for its weakest edges and route them to review, the same way we surface low-confidence entities. We also score the extraction the way we score everything else: judge models running a 100-point rubric across five dimensions, against a suite of 61 evaluation cases, including grounding checks that fail an extraction for asserting a relationship the text does not support.
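
Surfacing the weakest edges is a ranking problem. The sketch below assumes each edge has already been pulled out of the graph as a dict with a confidence between 0.0 and 1.0 and an evidence pointer that may be empty. The function name and field names are hypothetical. Edges with no evidence pointer are ranked first regardless of score, since a confident edge with nothing to cite is the classic mislink profile.

```python
import heapq


def weakest_edges(edges, k=10):
    """Return the k edges most in need of review.

    Ordering: edges without an evidence pointer first, then lowest confidence.
    """
    def weakness(edge):
        # False sorts before True, so missing evidence ranks ahead of any score.
        return (bool(edge.get("evidence")), edge["confidence"])

    return heapq.nsmallest(k, edges, key=weakness)
```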

The test that matters in the end is blunt. Pick a claim the agent makes, follow the edge back to the document, and see whether the document actually says it. When that round trip holds, the edges are trustworthy. When it does not, you have mislinks you have not found yet.

What did building it teach us?

The first lesson was that you cannot prevent mislinks at extraction time, only reduce them. An extractor tuned to never invent an edge also misses real ones. So the leverage is not a perfect extractor. It is a good-enough extractor followed by a dedicated pass that doubts edges after the fact. Detection beats prevention here, because detection gets to use the whole graph and every ground truth, while extraction only ever sees one document at a time.

The second lesson was that the most valuable signal was also the cheapest: grounding. Most mislinks could not point at a real sentence, because they were never in the text to begin with. Requiring every edge to cite its source caught more wrong edges than any clever model did, and it cost almost nothing to run.

We build and fix knowledge graphs for AI systems, including a document-intelligence platform for a family office managing $850M in assets. If your agents are acting on relationships you are not sure are real, book a 15-minute call.
