How often should I run a knowledge graph health check?

Run the twenty-minute sampling check monthly for any graph in production, and immediately after any large ingestion or schema change. Graphs rot continuously because the world they describe keeps changing, so a single audit at launch tells you nothing about the graph six months later. The point of keeping the check short is that you will actually repeat it, and a cheap check run often beats a thorough audit run once.

What is the most common knowledge graph quality problem?

In our experience the two that do the most damage are stale edges and mislinks. A stale edge is a relationship that was true when extracted and is no longer true now. A mislink is a relationship the extraction got wrong in the first place. Both are invisible from the outside, because the graph keeps answering confidently, and both surface only when a downstream agent states something false. Duplicate entities and missing confidence scores are common too, but they degrade answers rather than fabricate them.

Can I check knowledge graph quality without special tools?

Yes. The entire twenty-minute check needs only read access and a way to pull a random sample. You sample twenty edges, trace each back to its source, look at your fifty highest-degree nodes for duplicates, and ask how the graph updates when a source changes. Tools help you measure the exact rate across a large graph, but they are not required to find out whether a problem exists. Sampling answers that in minutes.

What is the difference between a twenty-minute check and a full graph audit?

The check is a smoke test that tells you whether your graph has a quality problem. A full audit measures the rate of each problem across the entire graph, identifies which parts are worst, and fixes the extraction and update pipeline that caused them. The check uses a small random sample and your own judgment. The audit instruments the whole graph and the system that builds it. Use the check to decide whether you need the audit.

How do I know if I even need a knowledge graph at all?

Look at your real queries. If they are multi-hop traversals, where the answer is a path through several relationships, a graph is the right tool. If they are similarity lookups, where you are retrieving passages that resemble a query, a vector index is usually cheaper and has far less to rot. Many graphs that cause quality problems should never have been built, which is why "do you need one" is check six and a post of its own: do you actually need a knowledge graph.

The 20-Minute Knowledge Graph Health Check


A knowledge graph almost never tells you it is sick. It keeps answering. The queries return, the agent responds in a confident voice, the demo looks fine. The rot shows up later, as a wrong answer that nobody can trace, about a relationship that quietly stopped being true months ago. By the time you notice, the graph has been feeding bad facts to every system downstream of it for weeks.

This is the last post in our series on graph rot, and it is the practical one. The previous seven were deep dives, one failure mode at a time. This one collapses all of them into a checklist you can run yourself, today, in about twenty minutes, without a six-week audit or a single new tool. You need read access to your graph, a way to sample it, and the willingness to look at twenty random edges and ask whether each one deserves to exist.

You do not need a long audit to find out your graph is rotting. You need twenty minutes and the right seven questions.

Each check below maps to one earlier post in this series, so if a check fails and you want the full treatment of why, the link takes you there. Run them in order. Keep a tally of fails as you go, because the score at the end is the part that tells you how worried to be.

How to run this without instrumentation

You do not need dashboards or a quality pipeline to take the temperature of a graph. You need a sample. Pull a random set of edges and a list of your highest-degree nodes, put them in front of you, and answer seven questions about what you see. Sampling is the whole trick: if a problem shows up in twenty random edges, it is in the population, and you have learned what you needed to learn in minutes instead of weeks. The check finds rot. Measuring exactly how much of it there is comes later, and that is a different job.
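If you can export your edges, the sampling step can be sketched in a few lines. This is a minimal illustration, not a prescribed tool: it assumes edges arrive as `(source_node, relation, target_node)` tuples, and the function names `sample_edges` and `top_degree_nodes` are hypothetical.

```python
import random
from collections import Counter

def sample_edges(edges, k=20, seed=None):
    """Pick up to k edges uniformly at random, without replacement."""
    pool = list(edges)
    rng = random.Random(seed)  # pass a seed only if you want a repeatable sample
    return rng.sample(pool, min(k, len(pool)))

def top_degree_nodes(edges, n=50):
    """Return the n most-connected nodes as (node, degree) pairs."""
    degree = Counter()
    for source, _relation, target in edges:
        degree[source] += 1  # every edge counts toward both endpoints
        degree[target] += 1
    return degree.most_common(n)
```

The sample has to be uniformly random to mean anything. Hand-picking edges you already recognize tends to hide exactly the ones worth finding.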

[Figure: a knowledge graph health scorecard with seven checks (freshness, duplicate entities, defensible edges, confidence scores, update path, whether a graph is needed, and facts versus control flow), each with a time estimate and a fail signal.]

Check 1: Can your graph tell a fresh fact from a stale one?

Sample twenty edges and ask, for each, when it was last confirmed against its source. If your edges carry no "as of" date and no pointer back to where they came from, your graph cannot tell the difference between a fact verified yesterday and one that was true three years ago and has since changed.

This is the failure the whole series is named after. Decay is not uniform. A company's founding year never changes, but ownership, employment, pricing, and org structure rot in months. A graph that stamps every edge with a source and a date can be aged and refreshed selectively. A graph that does not has no way to know which of its facts are still load-bearing. We covered the mechanism in the opening post on graph rot.

Pass if every sampled edge carries a source and a timestamp. Fail if more than one edge in five has neither, because that is the share of your graph you are trusting blind.
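The pass and fail rule for this check can be written down as a small tally. This is a sketch under assumptions: each sampled edge is a dict, and the field names `source` and `as_of` are placeholders for whatever your graph actually calls them. The middle "review" verdict is also an assumption, covering samples that neither pass cleanly nor cross the one-in-five line.

```python
def provenance_report(sampled_edges):
    """Apply the Check 1 rule to a sample of edge dicts."""
    if not sampled_edges:
        raise ValueError("sample is empty")
    total = len(sampled_edges)
    # Edges with neither a source pointer nor a date are the ones trusted blind.
    blind = sum(1 for e in sampled_edges
                if not e.get("source") and not e.get("as_of"))
    complete = sum(1 for e in sampled_edges
                   if e.get("source") and e.get("as_of"))
    share_blind = blind / total
    if complete == total:
        verdict = "pass"
    elif share_blind > 0.2:  # more than one edge in five has neither field
        verdict = "fail"
    else:
        verdict = "review"   # assumed middle band, not defined by the check
    return {"share_blind": share_blind, "verdict": verdict}
```

With a sample of twenty, five edges carrying neither field is enough to fail.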

Check 2: How many of your nodes are secretly the same thing?

Pull your fifty highest-degree nodes, the most connected entities in the graph, and read the list for duplicates. "Acme Inc", "Acme Incorporated", and "ACME Corp" as three separate nodes is the classic tell. If you can run a query, group entities by a normalized name and count the collisions.
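Grouping by a normalized name might look like the sketch below. The normalization rule here (lowercase, strip punctuation, drop trailing legal suffixes) and the `LEGAL_SUFFIXES` list are illustrative assumptions. Real entity resolution needs more than this, but a crude key is enough to surface the obvious collisions.

```python
import re
from collections import defaultdict

# Assumed, non-exhaustive list of suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation",
                  "llc", "ltd", "limited", "co", "company"}

def normalize_name(name):
    """Lowercase, strip punctuation, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def find_collisions(names):
    """Map each normalized key to the raw names that share it (two or more)."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}
```

Run against the example above, the three Acme spellings collapse to a single key, which is the signal to look closer. Treat every collision as a candidate for a human to confirm, not as an automatic merge.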

The reason to look at the most-connected nodes first is that those are the ones your agents query most, and a fragmented entity is the most expensive kind to have. When one company is split across three nodes, every question about it sees a third of the evidence, and the graph confidently returns a partial answer as if it were complete. This is the problem we walked through in one company, eleven names, where entity resolution is the difference between a graph that knows who it is talking about and one that only thinks it does.

Pass if there are zero duplicates among your top fifty entities. Fail on the first one you find, because the most-connected nodes are exactly the ones you cannot afford to fragment.

Check 3: Pull twenty random edges. Can you defend every one?

For each of twenty randomly sampled edges, trace it back to the sentence in the source that justifies it. An edge you cannot defend from the source text is a mislink: a relationship the extraction invented or got wrong. This is the single most useful check in the list, and it takes about four minutes.

A missing edge is a problem you notice, because the answer comes back incomplete and someone complains. A wrong edge is the dangerous one, because the answer comes back confident and complete and false. Mislinks are what make an agent state, with no hedging, that two companies are related when they are not. The arithmetic is unforgiving at scale: if even two edges in a hundred are wrong, a fifty-thousand-edge graph is carrying a thousand bad relationships, and you will not feel any of them until one surfaces in an answer. We dedicated a whole post to finding these in the edge that shouldn't exist.

Pass if all twenty edges are defensible from source. Fail on a single edge you cannot trace, and multiply the rate by your graph size to see the real number you are living with.
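The multiplication is simple enough to do by hand, but writing it down keeps the assumption visible: the sample rate is treated as a rough point estimate for the whole graph. The function name below is hypothetical, and the figures in the usage note are the illustrative ones from this post, not measurements.

```python
def estimated_bad_edges(undefended, sample_size, total_edges):
    """Extrapolate a sample's mislink rate to the full graph (point estimate only)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    rate = undefended / sample_size
    return rate, round(rate * total_edges)
```

Two bad edges in a hundred on a fifty-thousand-edge graph gives `estimated_bad_edges(2, 100, 50_000)`, a rate of 0.02 and roughly 1,000 bad relationships. One undefendable edge in a sample of twenty on the same graph would point to something near 2,500, with wide uncertainty around that number.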

Check 4: Does every fact carry a confidence score?

Look at how your edges are stored. If a relationship is binary, either present or absent, with no confidence attached, your graph is treating a hard fact and a soft inference as identical, and so is every agent that reads it.

Confidence is what lets you set a floor. With a score from 0.0 to 1.0 on each edge, you can refuse to serve anything below a threshold, route uncertain claims to a human, and tell a strong relationship from a weak guess. Without it, a guess extracted from one ambiguous sentence carries the same weight as a fact stated outright in fifty documents. When we score a graph before trusting it, we grade it against a hundred-point rubric across sixty-one evaluation cases, and confidence on every edge is the precondition that makes any of that possible. The full method is in scoring a knowledge graph before you trust it.

Pass if edges carry a confidence value and you have a threshold below which you do not serve them. Fail if your edges are binary, because you have no way to keep your shakiest facts out of your most important answers.
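A serving floor can be sketched as a three-way split. The thresholds `serve_floor=0.8` and `review_floor=0.5` are placeholder assumptions, not recommended values, and the `confidence` field name is hypothetical. Edges with no score at all are routed to review here, which is one reasonable choice among several.

```python
def triage_by_confidence(edges, serve_floor=0.8, review_floor=0.5):
    """Split edge dicts into served, human-review, and withheld lists."""
    served, review, withheld = [], [], []
    for edge in edges:
        score = edge.get("confidence")
        if score is None:
            review.append(edge)    # unscored: a person decides (assumption)
        elif score >= serve_floor:
            served.append(edge)    # strong enough to reach an agent
        elif score >= review_floor:
            review.append(edge)    # uncertain: route to a human
        else:
            withheld.append(edge)  # below the floor: never served
    return served, review, withheld
```

The exact numbers matter less than having a floor at all. A graph with binary edges cannot run this function in any form.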

Check 5: When a source changes, what happens to the graph?

Ask whoever maintains the system one question: when an underlying document updates, how does the graph find out? There are three answers that should worry you. "It does not, until the next full rebuild." "We re-ingest everything on a schedule." And "It does not."

A graph with no incremental update path is a snapshot that begins decaying the moment it is built. Full rebuilds are expensive enough that teams run them rarely, which means the graph spends most of its life out of date between rebuilds. The healthy pattern is targeted: when a source changes, you re-extract only the nodes and edges that source touched, and you leave the rest alone. That is the difference between a living graph and a photograph of one, and it is the subject of keeping a graph fresh without rebuilding it.

Pass if a changed source triggers a targeted re-extraction of the affected part of the graph. Fail if the only update mechanism is a full rebuild, or if there is no mechanism at all.
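The targeted pattern depends on one piece of bookkeeping: knowing which edges each source produced. The sketch below assumes edges are hashable tuples and that `extract` is your own extraction function, passed in as a callable. `ProvenanceIndex` and `refresh_source` are hypothetical names for illustration, not part of any library.

```python
class ProvenanceIndex:
    """Tracks which edges were extracted from which source."""

    def __init__(self):
        self._edges_by_source = {}

    def record(self, source_id, edge):
        self._edges_by_source.setdefault(source_id, set()).add(edge)

    def edges_from(self, source_id):
        return set(self._edges_by_source.get(source_id, set()))

    def replace(self, source_id, edges):
        self._edges_by_source[source_id] = set(edges)

def refresh_source(index, source_id, extract):
    """Re-extract one changed source and report what moved; other sources are untouched."""
    old = index.edges_from(source_id)
    new = set(extract(source_id))  # extract is supplied by the caller
    index.replace(source_id, new)
    return {"added": new - old, "removed": old - new, "unchanged": old & new}
```

One caveat this sketch leaves out: an edge in `removed` should only be deleted from the graph if no other source still supports it, so a real implementation would check the other sources before dropping it.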

Check 6: Could a plain vector index answer your actual queries?

Take your last twenty real queries and sort them into two piles. One pile is multi-hop traversals: questions like which companies the person who controls one entity also has a stake in, where the answer is a path through several relationships. The other pile is similarity lookups: find the documents about a topic, retrieve passages like this one. Now count.

If most of your queries are in the similarity pile, you are paying the maintenance cost of a graph to do work a vector index does more cheaply and with far less to rot. Many of the graphs that rot should never have been built, and the cheapest graph to keep healthy is the one you did not build. This check can save you the entire cost of the other six. We made the full case for and against building one in do you actually need a knowledge graph.

Pass if your queries are genuinely path-shaped, the kind a vector store cannot answer. Fail if they are mostly similarity lookups, and treat that as a signal to price out a vector index before you invest another quarter in graph upkeep.

Check 7: Is your graph storing facts, or coordinating agents?

This is the subtlest one. Look at what your graph actually encodes. If the nodes and edges describe entities and the relationships you traverse to answer questions, that is a knowledge graph. If they mostly describe which agent runs after which, under what condition, that is an orchestration graph wearing a knowledge graph's clothes, and a graph database is the wrong home for it.

The two get conflated constantly, because both are "graphs." We built a contract reviewer with twenty-three agents and no knowledge graph at all: twelve scoring agents and eleven analysts, coordinated by a scores table and a router, where smart routing cut model calls by roughly seventy-five percent. The graph that mattered there was the routing, not the data, and storing it in Neo4j would have bought nothing. Contrast that with our family-office platform, which is genuinely graph-native, six entity types joined by five kinds of edge, because there the relationships are the product. The full distinction is in what 23 agents taught us about knowledge graphs, and it draws on our multi-agent work.

Pass if your graph stores entity relationships you traverse to produce answers. Fail if it mostly encodes control flow, because that belongs in a router and a state table, not a graph database.

What your score means

Add up the fails.

Zero fails is rare, and it usually means someone has been treating the graph as a maintained system rather than a one-time build. If that is you, the work now is to keep it that way, because a clean graph drifts back toward rot the moment the update discipline lapses.

One or two fails is normal and fixable. Start with whichever of checks one and three failed, because those two, staleness and mislinks, are the ones that directly produce wrong answers. A duplicate entity or a missing confidence score degrades answers. A stale or invented edge fabricates them.

Three or more fails means your agents are very likely already serving confident wrong answers, and the graph is rotting faster than you are catching it. The same order of operations holds, with one exception: if check six also failed, resolve that first, because the cheapest fix for a rotting graph is sometimes to retire it in favor of a vector index that does the actual job. There is no point hardening a graph you did not need.
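The scoring rules above fit in one small function. This is a sketch of how the bands and priorities might be encoded. The band labels are made up for illustration, and putting checks one and three next in line within the three-or-more band is an assumption that extends the one-or-two rule.

```python
def interpret_score(failed_checks):
    """Return (band, ordered list of failed checks to address first)."""
    failed = set(failed_checks)
    unknown = failed - set(range(1, 8))
    if unknown:
        raise ValueError(f"unknown check numbers: {sorted(unknown)}")
    if not failed:
        return "healthy", []
    band = "fixable" if len(failed) <= 2 else "serious"
    # In the serious band, the "do you need a graph" question comes first.
    first = [6] if band == "serious" and 6 in failed else []
    # Staleness (1) and mislinks (3) directly produce wrong answers.
    first += [check for check in (1, 3) if check in failed]
    rest = sorted(failed - set(first))
    return band, first + rest
```

For example, failing checks 2, 6, 1, and 5 yields the band "serious" and the order 6, 1, 2, 5.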

Why twenty minutes finds the rot but does not fix it

This check is a smoke test, not a repair. Finding one undefendable edge in a sample of twenty tells you the population has a problem. It does not tell you whether the real rate is four percent or fourteen, which nodes are worst, or how to correct the extraction that produced them. Those answers require measuring the whole graph, and measuring is a different and larger job than sampling.
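To see how little a sample of twenty pins down, one option is a standard Wilson score interval for a proportion. This is a swapped-in statistical technique used here for illustration, not something the check itself requires, and the function name is hypothetical.

```python
import math

def wilson_interval(failures, sample_size, z=1.96):
    """Approximate 95% Wilson score interval for the true failure rate."""
    if sample_size <= 0 or not 0 <= failures <= sample_size:
        raise ValueError("need 0 <= failures <= sample_size and sample_size > 0")
    p = failures / sample_size
    denom = 1 + z * z / sample_size
    center = (p + z * z / (2 * sample_size)) / denom
    margin = z * math.sqrt(
        p * (1 - p) / sample_size + z * z / (4 * sample_size ** 2)
    ) / denom
    return max(0.0, center - margin), min(1.0, center + margin)
```

For one undefendable edge in twenty, `wilson_interval(1, 20)` spans roughly one percent to roughly twenty-four percent, a range that comfortably contains both four and fourteen. The sample flags the problem. It cannot size it.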

That is the honest boundary. The twenty-minute check is enough to know whether your graph is healthy. It is not enough to make a sick one well, the same way a thermometer tells you that you have a fever but not what to do about it. When the checks come back dirty and the graph is load-bearing, the next step is to measure the actual rates across the full graph and fix the pipeline that produced them, which is the substance of our data engineering and intelligence work.

That is also the whole series in one idea. A knowledge graph is a powerful tool that fails silently, and the only defense is to look at it on purpose, often, with the seven questions above. Graphs rot. The teams whose graphs stay trustworthy are not the ones with the fanciest pipeline. They are the ones who keep checking.

We build knowledge graphs and the agent systems that run on them, and a large part of that work is keeping graphs trustworthy as the world they describe keeps changing. If your health check came back dirty and the graph is load-bearing, book a 15-minute call and we will walk through measuring and fixing the rot. We work US business hours.


Muhammad Mudassir

