From scattered Drive to a single brain in 48 hours

What the connector setup actually does in the background — and why most companies are surprised by the coverage map.

May 12, 2026 · 6 min read

A 500-person company we onboarded last quarter expected the same thing every customer expects on day one — connect Google Drive, connect Notion, connect Slack, and the system starts answering questions. They were not wrong. What surprised them was what showed up on the coverage map 48 hours later.

Their engineering.platform domain showed strong coverage. Their customer-support.refunds domain showed strong coverage. Their legal.contracts domain showed almost nothing, despite a Drive folder with 1,400 contracts in it. Their product.roadmap domain showed a fractured shape: half the canonical content was in Notion, and half had migrated to a Confluence space someone had set up six months earlier and never told anyone about. A senior product lead saw the coverage map and said, "I knew that, but I forgot I knew."

This is what the 48-hour setup actually delivers. The connectors are the easy part. The coverage map is the part that changes the conversation.

What happens during the 48 hours

Once a connector is authorized, Quelvio runs through five passes on the source corpus, each one writing into the same brain but solving a different problem.

Ingestion. Documents are pulled with permissions intact. Every document carries its source ACL, the contributor identity, and the lifecycle metadata the source provides — last edit, last comment, retention policy if any. The ingestion layer never strips permissions; downstream retrieval inherits them.
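
To make "downstream retrieval inherits them" concrete, here is a minimal sketch of a document record that keeps its source ACL and a retrieval filter that respects it. The names (IngestedDocument, retrievable, the principal strings) are illustrative assumptions, not Quelvio's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class IngestedDocument:
    """Hypothetical record shape: the source ACL travels with the document."""
    doc_id: str
    text: str
    acl: FrozenSet[str]            # principals copied verbatim from the source
    contributor: str
    last_edit: datetime
    last_comment: Optional[datetime] = None
    retention_policy: Optional[str] = None


def retrievable(docs: List[IngestedDocument],
                principals: FrozenSet[str]) -> List[IngestedDocument]:
    """Keep only documents whose ACL shares at least one principal with the caller."""
    return [d for d in docs if d.acl & principals]


docs = [
    IngestedDocument("contract-17", "Master services agreement ...",
                     frozenset({"group:legal"}), "dana", datetime(2025, 3, 4)),
    IngestedDocument("runbook-2", "Restart the ingest workers ...",
                     frozenset({"group:eng", "group:sre"}), "lee",
                     datetime(2025, 11, 20)),
]

# An engineer's query never sees the legal contract.
visible = retrievable(docs, frozenset({"user:lee", "group:eng"}))
print([d.doc_id for d in visible])  # ['runbook-2']
```

The point of the design is that the filter runs on metadata captured at ingestion, so no later pass has to go back to the source to ask who may see what.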

Extraction. Each document is parsed by format-aware extractors. PDFs go through layout-aware OCR. Slides are split into title, body, and notes. Videos and audio recordings are transcribed with speaker diarization. Spreadsheets are parsed for sheet-level structure, not just cell text. The output is a normalized, semantically structured representation of the document rather than a flat text blob.

Embedding. Each document is split into meaning-based passages, not fixed-size chunks. Section boundaries, semantic shifts, and natural paragraph breaks drive the chunking. Each passage gets a vector that captures what it means, not the words it contains. This is why a question phrased with completely different vocabulary can still find the right passage.
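
A rough sketch of what meaning-based splitting can look like: break on paragraph boundaries, then start a new passage only when the topic shifts. A bag-of-words cosine stands in for a real embedding model here, and the function names and the 0.1 threshold are assumptions made for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def split_passages(document: str, shift_threshold: float = 0.1) -> List[str]:
    """Merge adjacent paragraphs until similarity drops below the threshold."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]
    groups: List[List[str]] = []
    for para in paragraphs:
        if groups and cosine(bag_of_words(groups[-1][-1]),
                             bag_of_words(para)) >= shift_threshold:
            groups[-1].append(para)   # same topic: extend the current passage
        else:
            groups.append([para])     # semantic shift: start a new passage
    return ["\n\n".join(g) for g in groups]


doc = (
    "Refunds are issued within five days.\n\n"
    "Refunds over 500 dollars need manager approval.\n\n"
    "The deploy pipeline runs on every merge to main."
)
passages = split_passages(doc)
print(len(passages))  # 2
```

The two refund paragraphs stay together and the deploy paragraph becomes its own passage. A fixed-size chunker would have cut wherever the character count ran out.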

Classification. Every passage is routed into the taxonomy — the per-tenant domain graph that organizes content by what it's about. Classification draws on the passage content, the document's location in the source, the contributor's domain history, and cross-references to other documents already classified. The output is a per-passage taxonomy assignment with a confidence score.
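
One way to picture the confidence score is as a weighted vote across the four signals named above. The weights, signal names, and the classify function below are hypothetical; the sketch only shows how several signals can combine into one domain assignment with a confidence.

```python
from typing import Dict, Tuple

# Hypothetical weights for the four signals; a real system would tune these.
WEIGHTS = {"content": 0.5, "location": 0.2, "contributor": 0.15, "cross_refs": 0.15}


def classify(signal_scores: Dict[str, Dict[str, float]]) -> Tuple[str, float]:
    """Combine per-signal domain scores (each in [0, 1]) into (domain, confidence)."""
    totals: Dict[str, float] = {}
    for signal, per_domain in signal_scores.items():
        for domain, score in per_domain.items():
            totals[domain] = totals.get(domain, 0.0) + WEIGHTS[signal] * score
    if not totals:
        return "unclassified", 0.0
    best = max(totals, key=totals.get)
    # Confidence is the winner's share of all weighted evidence.
    return best, totals[best] / sum(totals.values())


domain, confidence = classify({
    "content":     {"legal.contracts": 0.8, "product.roadmap": 0.2},
    "location":    {"legal.contracts": 1.0},
    "contributor": {"legal.contracts": 0.6, "product.roadmap": 0.4},
    "cross_refs":  {"legal.contracts": 1.0},
})
print(domain, round(confidence, 2))  # legal.contracts 0.84
```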

Cross-reference resolution. This is the slowest of the five passes and the one that produces the surprising coverage map. The graph builder walks every passage and looks for relationships — does this passage agree with, disagree with, supersede, or reference another passage in the corpus? Supersession is detected by comparing claims in the same domain across time, with the authorship graph weighing in. Disagreement gets surfaced rather than averaged away.
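
The logic of this pass can be sketched with a toy model in which each passage makes one claim about one subject. The Claim shape, the authority table, and the min_authority cutoff are assumptions for illustration, and the "references" relation is left out to keep the example short.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from itertools import combinations
from typing import Dict, List, Tuple


class Relation(Enum):
    AGREES = "agrees_with"
    DISAGREES = "disagrees_with"
    SUPERSEDES = "supersedes"


@dataclass(frozen=True)
class Claim:
    passage_id: str
    domain: str
    subject: str
    value: str
    stated_on: date
    author: str


def resolve(claims: List[Claim],
            authority: Dict[Tuple[str, str], float],
            min_authority: float = 0.5) -> List[Tuple[str, Relation, str]]:
    """Emit (newer, relation, older) edges for claims on the same domain and subject."""
    edges = []
    ordered = sorted(claims, key=lambda c: c.stated_on)
    for older, newer in combinations(ordered, 2):
        if (older.domain, older.subject) != (newer.domain, newer.subject):
            continue
        if older.value == newer.value:
            relation = Relation.AGREES
        elif (newer.stated_on > older.stated_on
              and authority.get((newer.author, newer.domain), 0.0) >= min_authority):
            relation = Relation.SUPERSEDES   # later claim from an authoritative author
        else:
            relation = Relation.DISAGREES    # surfaced for a human, not averaged away
        edges.append((newer.passage_id, relation, older.passage_id))
    return edges


refunds = "customer-support.refunds"
claims = [
    Claim("p1", refunds, "refund window", "30 days", date(2025, 1, 10), "ana"),
    Claim("p2", refunds, "refund window", "14 days", date(2025, 9, 2), "ben"),
    Claim("p3", refunds, "refund window", "21 days", date(2025, 10, 1), "cy"),
]
edges = resolve(claims, authority={("ben", refunds): 0.9})
for edge in edges:
    print(edge[0], edge[1].value, edge[2])
```

Here p2 supersedes p1 because its author has standing in the domain, while p3 is only flagged as disagreeing with both. Comparing every pair is also a hint at why this pass is the slowest.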

Why the coverage map is the surprising part

Every company we have onboarded sees three predictable patterns on the coverage map in the first week.

The first is ghost domains: areas of the business where the team knows knowledge exists but the corpus carries almost nothing. Sometimes the knowledge lives in people's heads, sometimes in Loom recordings that nobody connected, and sometimes in a private channel that the integration wasn't authorized for. The coverage map makes the gap visible.

The second is fractured canon: two or more sources, each treated as authoritative, whose contents contradict one another. The classic case is a Confluence space and a Notion workspace that both claim to be the current architecture reference. Quelvio detects the contradiction during cross-reference resolution and flags it; humans then decide which one is canonical.

The third is silent supersession: a Slack decision, an updated runbook, or a new spec quietly replaced what was previously canonical, but the old document was never marked stale. The graph picks this up and updates the authority ranking — the old document still exists, it just no longer ranks first. Engineers seeing this for the first time usually identify three or four documents they didn't realize were dead.
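
"Still exists, just no longer ranks first" can be sketched as a penalty applied at ranking time rather than a deletion. The rank function, the 0.5 penalty, and the document ids are hypothetical.

```python
from typing import Dict, List, Tuple


def rank(candidates: List[Tuple[str, float]],
         superseded_by: Dict[str, str],
         penalty: float = 0.5) -> List[str]:
    """Order documents by relevance, down-weighting any that have been superseded."""
    def score(item: Tuple[str, float]) -> float:
        doc_id, relevance = item
        return relevance * (penalty if doc_id in superseded_by else 1.0)

    return [doc_id for doc_id, _ in sorted(candidates, key=score, reverse=True)]


# The old runbook matches the query slightly better on text alone,
# but the graph knows it was replaced.
order = rank(
    candidates=[("runbook-v1", 0.9), ("runbook-v2", 0.8)],
    superseded_by={"runbook-v1": "runbook-v2"},
)
print(order)  # ['runbook-v2', 'runbook-v1']
```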

Why this matters more than connector count

Most knowledge-tools onboarding stories optimize for breadth — how many connectors did you add, how many documents indexed, how much storage. The coverage map reframes the question. Indexing 10 million documents is not the goal; understanding which of them are still load-bearing on which topics is. A corpus where 2,000 documents carry 80% of the answers for 80% of real queries is in better shape than one where 10 million documents are indexed but the authority signals haven't been built yet.
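
If you log which documents answers actually cite, the "load-bearing" set is easy to estimate. The sketch below assumes a flat list of cited document ids (a hypothetical log format) and finds the smallest set of top-cited documents that accounts for a given share of citations.

```python
from collections import Counter
from typing import List


def load_bearing_docs(citations: List[str], share: float = 0.8) -> List[str]:
    """Return the most-cited documents that together cover `share` of all citations."""
    counts = Counter(citations)
    total = sum(counts.values())
    selected: List[str] = []
    covered = 0
    for doc_id, n in counts.most_common():
        if covered / total >= share:
            break
        selected.append(doc_id)
        covered += n
    return selected


# Ten citations across four documents: two of them carry 80% of the load.
citations = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
core = load_bearing_docs(citations)
print(core)  # ['a', 'b']
```

Comparing the size of that set to the total document count gives a quick read on how concentrated the real authority in a corpus is.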

The 48-hour setup is when the brain becomes useful, not just connected. The connectors get you to the raw material; the five passes turn it into something an agent can reason over.

If you've already connected

The coverage map lives at enterprise.quelvio.com/coverage — open it and look for the three patterns above. Most teams find at least one fractured-canon case in the first session, and addressing it is usually the highest-leverage move available on day one.
