---
title: Cross-reference graphs and the supersession problem
slug: cross-reference-graphs-supersession
category: Engineering
tags: Retrieval, Supersession, Graphs
date: 2026-04-28
read_time: 7 min read
word_count: 1080
canonical: https://quelvio.com/blog/cross-reference-graphs-supersession
machine_url: https://quelvio.com/ai/blog/cross-reference-graphs-supersession
publisher: Quelvio
---

# Cross-reference graphs and the supersession problem

Engineering · April 28, 2026 · 7 min read

How we detect when a Notion page silently replaces a Confluence one — and which signal wins.

The hardest version of the retrieval problem is not what to return — it is what to demote without anyone telling you to demote it. A Confluence page describing the company's deployment process was written in 2023. In late 2025, someone wrote a Notion doc that quietly replaced it. Nobody marked the Confluence page deprecated. Nobody emailed the team. The new doc just started getting linked from runbooks, referenced in Slack threads, and updated as the actual canonical source. Six months later, an agent retrieving the old Confluence page is confidently wrong about how the company deploys.

This is the supersession problem, and it is everywhere. Most large companies' canonical knowledge is shaped by silent supersession events that the source systems do not record. Solving it is what the cross-reference graph layer of Quelvio Brain exists to do.

## What the graph stores

Each passage in the corpus — the meaning-based chunk unit, not the document — is a node in the cross-reference graph. Edges between nodes are typed. The four types we care about are *agreement*, *disagreement*, *reference*, and *supersession*.

An **agreement** edge says two passages make claims that are consistent. A **disagreement** edge says two passages make claims that conflict. A **reference** edge says one passage explicitly cites another — by URL, by document name, or by canonical identifier inside the source system. A **supersession** edge is the strongest: passage B replaces passage A in the role of canonical claim on a topic.
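The four edge types can be pictured as a small typed-edge structure. This is a minimal sketch, not Quelvio Brain's actual schema: the names `EdgeType`, `Edge`, `CrossReferenceGraph`, and `superseded_by`, as well as the passage identifiers, are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from enum import Enum


class EdgeType(Enum):
    AGREEMENT = "agreement"
    DISAGREEMENT = "disagreement"
    REFERENCE = "reference"
    SUPERSESSION = "supersession"


@dataclass(frozen=True)
class Edge:
    source: str      # passage id the edge starts from
    target: str      # passage id the edge points to
    kind: EdgeType   # one of the four edge types


@dataclass
class CrossReferenceGraph:
    edges: list[Edge] = field(default_factory=list)

    def add_edge(self, source: str, target: str, kind: EdgeType) -> None:
        self.edges.append(Edge(source, target, kind))

    def superseded_by(self, passage_id: str) -> list[str]:
        """Return ids of passages that replace `passage_id` (edges run old -> new)."""
        return [
            e.target
            for e in self.edges
            if e.kind is EdgeType.SUPERSESSION and e.source == passage_id
        ]


# Hypothetical passage ids, used only to exercise the structure.
graph = CrossReferenceGraph()
graph.add_edge("confluence:deploy#p3", "notion:deploy#p1", EdgeType.SUPERSESSION)
graph.add_edge("runbook:oncall#p7", "notion:deploy#p1", EdgeType.REFERENCE)
replacements = graph.superseded_by("confluence:deploy#p3")
```

Keeping the edge type as an enum rather than a free-form string makes it harder to write an edge the rest of the system does not understand.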

The first three edges are relatively easy to identify. Agreement and disagreement fall out of a classifier comparing claims in matched domains. References are surfaced by parsing URLs, link blocks, and source-system identifiers — every wiki has its own link syntax, and the ingestion layer normalizes them. Supersession is the one that takes work.

## How supersession is detected

Supersession is rarely declared. The original document is not edited. The new document does not announce itself as the replacement. The system has to infer the edge from three independent signals.

**Signal one — claim divergence in the same domain.** Two passages in the same taxonomy domain make contradictory claims about the same underlying behavior. The newer one is a supersession candidate. Just being newer is not enough — drafts, opinions, and side experiments are also newer. Claim divergence is necessary but not sufficient.

**Signal two — authority concentration on the newer source.** The contributors to the newer passage, weighted by their domain authority, are the ones with current ownership of the topic. The contributors to the older passage may have moved on, or may be reviewers rather than owners. When the authorship graph for the topic has shifted toward the newer source, that's a supersession signal.

**Signal three — link decay around the older source.** The number of inbound references to the older passage from other passages in the corpus is decreasing over time; the number of inbound references to the newer passage is increasing. Other authors in the corpus are voting with their links. This is the slowest signal to develop but the most reliable one when it does.
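One way to make the link-decay signal concrete is to compare the trend of new inbound references per period for each passage. The sketch below assumes equally spaced periods (months, say) and uses a plain least-squares slope; the function names `trend` and `link_decay_signal` and the sample counts are hypothetical, and the real graph builder may measure this differently.

```python
def trend(counts: list[float]) -> float:
    """Least-squares slope of counts over equally spaced periods."""
    n = len(counts)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def link_decay_signal(old_inbound: list[float], new_inbound: list[float]) -> bool:
    """True when references to the old passage fall while the new one gains them."""
    return trend(old_inbound) < 0 and trend(new_inbound) > 0


# Hypothetical monthly counts of new inbound references over six months.
old_page = [9, 7, 6, 3, 2, 1]
new_page = [0, 2, 3, 6, 8, 11]
decaying = link_decay_signal(old_page, new_page)
```

A slope is deliberately crude. It needs several periods of data before it says anything, which matches why this signal is the slowest of the three to develop.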

When all three signals point in the same direction with high enough confidence, the graph builder writes a supersession edge from old → new. The older passage's authority score drops sharply; the newer one accumulates the canonical role. Retrieval results from that point forward return the newer source first, with the older one labeled `SUPERSEDED` and accessible via `--include-superseded`.
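The "all three signals agree" rule can be sketched as a simple gate. This assumes each signal has already been reduced to a confidence between 0 and 1 that the newer passage supersedes the older one; the `SupersessionSignals` type, the function name, and the `0.8` threshold are illustrative assumptions, not values taken from Quelvio Brain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupersessionSignals:
    claim_divergence: float   # confidence that the claims conflict in the same domain
    authority_shift: float    # confidence that topic ownership moved to the newer source
    link_decay: float         # confidence that inbound links are moving to the newer source


def should_write_supersession_edge(
    signals: SupersessionSignals, threshold: float = 0.8
) -> bool:
    """Require every signal to clear the threshold; one weak signal blocks the edge."""
    weakest = min(
        signals.claim_divergence, signals.authority_shift, signals.link_decay
    )
    return weakest >= threshold


# Hypothetical confidences for two candidate pairs.
strong = SupersessionSignals(claim_divergence=0.93, authority_shift=0.88, link_decay=0.91)
draft = SupersessionSignals(claim_divergence=0.93, authority_shift=0.88, link_decay=0.40)
write_strong = should_write_supersession_edge(strong)
write_draft = should_write_supersession_edge(draft)
```

Taking the minimum rather than an average reflects the point that no single signal is sufficient: a newer draft with strong claim divergence but no link movement does not get an edge.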

## What this looks like in practice

Take the Confluence-to-Notion case. Around the time the Notion page was written, all three signals start moving. Claim divergence appears — the deployment steps in the Notion page differ in material ways from the Confluence page. The engineer who wrote the Notion page has higher current authority in `engineering.deploys` than the original Confluence author, who shipped that document and moved to a different team. And over six months, the number of Slack messages and code-comment links pointing to the new Notion doc steadily climbs while inbound references to the Confluence page flatten.

By the time anyone explicitly asks the brain *"what's our deployment process?"*, the graph has already written the supersession edge. The retrieval result returns the Notion content first. The Confluence page is still in the corpus — sometimes the historical view matters — but it ranks below the canonical version and carries the lifecycle label that says so.
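The effect on retrieval can be sketched as a post-ranking step. This assumes results arrive as `(passage_id, relevance)` pairs and that superseded passages are hidden unless explicitly requested, mirroring the `--include-superseded` flag; the function `rank_results`, its parameters, and the sample scores are hypothetical.

```python
from typing import Optional


def rank_results(
    results: list[tuple[str, float]],
    superseded: set[str],
    include_superseded: bool = False,
) -> list[tuple[str, Optional[str]]]:
    """Order by relevance, then place superseded passages after current ones."""
    ordered = sorted(results, key=lambda r: r[1], reverse=True)
    current = [(pid, None) for pid, _ in ordered if pid not in superseded]
    if not include_superseded:
        return current
    old = [(pid, "SUPERSEDED") for pid, _ in ordered if pid in superseded]
    return current + old


# Hypothetical scores: the old page matches the query text slightly better.
hits = [("confluence:deploy", 0.92), ("notion:deploy", 0.88)]
default_view = rank_results(hits, superseded={"confluence:deploy"})
historical_view = rank_results(
    hits, superseded={"confluence:deploy"}, include_superseded=True
)
```

Even though the Confluence passage has the higher raw relevance in this example, the supersession edge keeps it below the Notion passage, and it only appears at all in the historical view.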

## Why this can't be solved at the source level

The natural objection is *"shouldn't this be solved by the source systems themselves?"* The honest answer is that source systems have never been able to solve it, and probably never will. Confluence does not know about your Notion. Notion does not know about your Drive. Your wiki cannot mark itself superseded by content in a different tool. The act of supersession crosses the source boundary, which is exactly why it is invisible to any one source's UI.

Cross-source supersession is a property of the *combined* corpus, not of any individual document. It can only be solved at the layer that sees the whole corpus at once. That layer is the brain.

**Inspecting the graph.** If you want to see which supersession edges have been written for a recent query, `quelvio source <query-id>` prints lifecycle labels in v0.3.0. The full per-passage graph view ships in the next Brain release; it is the most-requested item on the dashboard roadmap right now.
