TL;DR
Building a single retrieval surface over heterogeneous unstructured media — meeting transcripts, Slack threads, Confluence pages, Loom recordings — with cross-source ranking and source attribution that deep-links back to the original.
A useful enterprise knowledge assistant has to answer questions like "what did we decide about pricing in last week's meeting?" The answer is rarely in one place. It might be a meeting transcript on Loom, a follow-up Slack thread, and a Confluence page someone updated three days later. A separate connector per source returns three result lists; the user mentally merges them. One retrieval surface with cross-source ranking does the merging at the right layer.
Heterogeneous chunking, homogeneous retrieval
Each source gets its own chunking strategy. Slack: thread-as-chunk (a 30-message thread is one retrievable unit). Confluence: heading-aware split with parent context preserved. Meeting transcripts (Zoom/Teams/Meet/Loom): speaker-aware chunking on speaker change + 90-second sliding window. Each chunk lands in the same vector store with normalized metadata: source, source_id, source_url, created_at, updated_at, author, source_authority_score, recency_decay_anchor.
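As a sketch of the transcript strategy above (the segment tuple format and the exact `Chunk` shape are assumptions, not the production schema; the metadata field names are the ones listed in the text):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    # Normalized metadata shared by every source
    source: str                   # "slack" | "confluence" | "transcript" | "loom"
    source_id: str
    source_url: str
    created_at: float             # unix seconds
    updated_at: float
    author: str
    source_authority_score: float
    recency_decay_anchor: float   # timestamp the decay clock is measured from

def chunk_transcript(segments, source_id, source_url, window_s=90.0):
    """Speaker-aware chunking: cut on every speaker change, and also cut
    when a single speaker's run exceeds the 90-second sliding window.
    `segments` is a list of (start_seconds, speaker, text) tuples."""
    chunks, buf, buf_start, buf_speaker = [], [], None, None

    def flush():
        if buf:
            chunks.append(Chunk(
                text=" ".join(buf),
                source="transcript",
                source_id=source_id,
                source_url=f"{source_url}?t={int(buf_start)}s",
                created_at=buf_start,
                updated_at=buf_start,
                author=buf_speaker,
                source_authority_score=0.85,   # per-source score from the ranking section
                recency_decay_anchor=buf_start,
            ))

    for start, speaker, text in segments:
        if buf and (speaker != buf_speaker or start - buf_start > window_s):
            flush()
            buf = []
        if not buf:
            buf_start, buf_speaker = start, speaker
        buf.append(text)
    flush()
    return chunks
```

Every chunker emits the same `Chunk` shape, which is what makes a single vector store and a single ranking pass possible downstream.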
Cross-source ranking is the hard part
Naively ranking across sources by cosine similarity destroys quality. A Slack message and a Confluence heading match the query similarly but should not rank similarly. Three layers handle this:
- Per-source authority score: Confluence 1.0, meeting transcript 0.85, Loom 0.75, Slack 0.55. Multiplied by the cosine score before fusion.
- Per-source recency decay: Slack ages out fast (half-life 30 days), Confluence slow (180 days), transcripts never decay.
- Reranker sees source as a feature: trained on click-through pairs from real queries — Slack with a high vote count beats stale Confluence in practice.
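The first two layers can be sketched as a single pre-fusion scoring function. Exponential half-life decay is one reasonable reading of "ages out"; the numbers are the ones in the bullets above, and Loom is left undecayed since the text does not specify it:

```python
AUTHORITY = {"confluence": 1.00, "transcript": 0.85, "loom": 0.75, "slack": 0.55}
HALF_LIFE_DAYS = {"slack": 30.0, "confluence": 180.0}  # transcripts never decay

def fused_score(cosine: float, source: str, age_days: float) -> float:
    """Authority-weighted, recency-decayed score computed per chunk
    before rank fusion. Sources without a half-life entry do not decay."""
    score = cosine * AUTHORITY[source]
    half_life = HALF_LIFE_DAYS.get(source)
    if half_life is not None:
        score *= 0.5 ** (age_days / half_life)  # halves every half_life days
    return score
```

With these weights, a month-old Slack hit at cosine 0.9 scores 0.9 x 0.55 x 0.5 ~ 0.25, while a year-old transcript at the same cosine keeps 0.9 x 0.85 ~ 0.77 — which is why the third layer, a learned reranker with source as a feature, is still needed to let a heavily-upvoted Slack thread win when it should.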
Source attribution that survives fan-out
Every chunk carries enough metadata to deep-link back to the source. Slack: workspace + channel + thread_ts, so the answer links into the thread, not just the channel. Loom: video URL + start timestamp, so loom.com/share/abc?t=42m13s lands at the cited moment. Confluence: page version, so later edits or deletions do not invalidate the link.
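A minimal link builder illustrating those three rules (the metadata field names and the Confluence `pageVersion` query parameter are assumptions; the Slack permalink shape is the standard `p<thread_ts without the dot>` form):

```python
def deep_link(meta: dict) -> str:
    """Build a deep link from normalized chunk metadata."""
    src = meta["source"]
    if src == "slack":
        # thread_ts pins the link to the thread, not just the channel
        ts = meta["thread_ts"].replace(".", "")
        return (f"https://{meta['workspace']}.slack.com/archives/"
                f"{meta['channel']}/p{ts}")
    if src == "loom":
        # start timestamp -> ?t=42m13s so the video opens at the cited moment
        m, s = divmod(int(meta["start_s"]), 60)
        return f"{meta['source_url']}?t={m}m{s}s"
    if src == "confluence":
        # pin the cited revision so later edits/deletions keep the link valid
        return f"{meta['source_url']}?pageVersion={meta['version']}"
    return meta["source_url"]
```

The key design choice is that the link is derivable from chunk metadata alone — no source API call at answer time, so citation rendering stays on the fast path.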
The LLM is prompted to cite inline — "according to the Confluence doc on pricing strategy (updated 2024-01-15)…" — so freshness shows up in the answer text, not just the search ranking. Users learn to trust answers that cite recent transcripts more than ones citing year-old docs.
Numbers from production
- 7 source connectors: Zoom + Teams + Slack + Confluence + Loom + Drive + SharePoint
- Sub-3-second response time on a 50,000-chunk corpus
- 95% knowledge accuracy on internal eval set (verified citations match the cited source)
- 70-85% of common support tickets answered from organizational memory before reaching a human
Where this gets hard
Permissions. A Confluence space is private to a department; a Slack channel is private to a team; a Loom is private to its owner. The vector store has to honor those permissions at query time, not just at ingestion. We attach an ACL to every chunk and filter on user.groups at query time. Re-permission events (e.g., a channel made public) arrive via webhook and feed a re-attach job: re-embedding is unnecessary, only the ACL changes.
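A sketch of the query-time filter, assuming a group-based ACL stored on each chunk (the representation is an assumption; the point is that the check runs against candidates, not against the index):

```python
def visible(chunk_acl: list[str], user_groups: list[str]) -> bool:
    """A chunk is visible if any of its ACL entries matches one of the
    querying user's groups. The ACL lives next to the chunk, so a
    re-permission webhook only rewrites this list, never the embedding."""
    return bool(set(chunk_acl) & set(user_groups))

def filter_candidates(candidates: list[dict], user_groups: list[str]) -> list[dict]:
    # Applied after vector search but before reranking, so the reranker
    # never sees (and can never surface) chunks the user cannot open.
    return [c for c in candidates if visible(c["acl"], user_groups)]
```

Filtering after retrieval keeps the index shared across users; the trade-off is over-fetching candidates so that enough survive the ACL cut, which is worth it given how cheap the set intersection is.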
Muhammad Mudassir
Founder & CEO, Cognilium AI | 10+ years experience
