TL;DR
Building a single retrieval surface over heterogeneous unstructured media — meeting transcripts, Slack threads, Confluence pages, Loom recordings — with cross-source ranking and source attribution that deep-links back to the original.
A useful enterprise knowledge assistant has to answer questions like "what did we decide about pricing in last week's meeting?" The answer is rarely in one place. It might be a meeting transcript on Loom, a follow-up Slack thread, and a Confluence page someone updated three days later. A separate connector per source returns three result lists; the user mentally merges them. One retrieval surface with cross-source ranking does the merging at the right layer.
Heterogeneous chunking, homogeneous retrieval
Each source gets its own chunking strategy. Slack: thread-as-chunk (a 30-message thread is one retrievable unit). Confluence: heading-aware split with parent context preserved. Meeting transcripts (Zoom/Teams/Meet/Loom): speaker-aware chunking on speaker change + 90-second sliding window. Each chunk lands in the same vector store with normalized metadata: source, source_id, source_url, created_at, updated_at, author, source_authority_score, recency_decay_anchor.
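As a sketch of the transcript strategy above (the segment tuple format and the exact `Chunk` shape are assumptions, not the production schema; the metadata field names are the ones listed in the text):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    # Normalized metadata shared by every source
    source: str                   # "slack" | "confluence" | "transcript" | "loom"
    source_id: str
    source_url: str
    created_at: float             # unix seconds
    updated_at: float
    author: str
    source_authority_score: float
    recency_decay_anchor: float   # timestamp the decay clock is measured from

def chunk_transcript(segments, source_id, source_url, window_s=90.0):
    """Speaker-aware chunking: cut on every speaker change, and also cut
    when a single speaker's run exceeds the 90-second sliding window.
    `segments` is a list of (start_seconds, speaker, text) tuples."""
    chunks, buf, buf_start, buf_speaker = [], [], None, None

    def flush():
        if buf:
            chunks.append(Chunk(
                text=" ".join(buf),
                source="transcript",
                source_id=source_id,
                source_url=f"{source_url}?t={int(buf_start)}s",
                created_at=buf_start,
                updated_at=buf_start,
                author=buf_speaker,
                source_authority_score=0.85,   # per-source score from the ranking section
                recency_decay_anchor=buf_start,
            ))

    for start, speaker, text in segments:
        if buf and (speaker != buf_speaker or start - buf_start > window_s):
            flush()
            buf = []
        if not buf:
            buf_start, buf_speaker = start, speaker
        buf.append(text)
    flush()
    return chunks
```

Every chunker emits the same `Chunk` shape, which is what makes a single vector store and a single ranking pass possible downstream.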
Cross-source ranking is the hard part
Naively ranking across sources by cosine similarity destroys quality. A Slack message and a Confluence heading match the query similarly but should not rank similarly. Three layers handle this:
- Per-source authority score: Confluence 1.0, meeting transcript 0.85, Loom 0.75, Slack 0.55. Multiplied by the cosine score before fusion.
- Per-source recency decay: Slack ages out fast (half-life 30 days), Confluence slow (180 days), transcripts never decay.
- Reranker sees source as a feature: trained on click-through pairs from real queries — Slack with a high vote count beats stale Confluence in practice.
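The first two layers can be sketched as a single pre-fusion scoring function. Exponential half-life decay is one reasonable reading of "ages out"; the numbers are the ones in the bullets above, and Loom is left undecayed since the text does not specify it:

```python
AUTHORITY = {"confluence": 1.00, "transcript": 0.85, "loom": 0.75, "slack": 0.55}
HALF_LIFE_DAYS = {"slack": 30.0, "confluence": 180.0}  # transcripts never decay

def fused_score(cosine: float, source: str, age_days: float) -> float:
    """Authority-weighted, recency-decayed score computed per chunk
    before rank fusion. Sources without a half-life entry do not decay."""
    score = cosine * AUTHORITY[source]
    half_life = HALF_LIFE_DAYS.get(source)
    if half_life is not None:
        score *= 0.5 ** (age_days / half_life)  # halves every half_life days
    return score
```

With these weights, a month-old Slack hit at cosine 0.9 scores 0.9 x 0.55 x 0.5 ~ 0.25, while a year-old transcript at the same cosine keeps 0.9 x 0.85 ~ 0.77 — which is why the third layer, a learned reranker with source as a feature, is still needed to let a heavily-upvoted Slack thread win when it should.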
Source attribution that survives fan-out
Every chunk carries enough metadata to deep-link back to the source. Slack: workspace + channel + thread_ts, so the answer links into the thread, not just the channel. Loom: video URL + start timestamp, so loom.com/share/abc?t=42m13s lands at the cited moment. Confluence: page version, so later edits or deletions do not invalidate the link.
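A minimal link builder illustrating those three rules (the metadata field names and the Confluence `pageVersion` query parameter are assumptions; the Slack permalink shape is the standard `p<thread_ts without the dot>` form):

```python
def deep_link(meta: dict) -> str:
    """Build a deep link from normalized chunk metadata."""
    src = meta["source"]
    if src == "slack":
        # thread_ts pins the link to the thread, not just the channel
        ts = meta["thread_ts"].replace(".", "")
        return (f"https://{meta['workspace']}.slack.com/archives/"
                f"{meta['channel']}/p{ts}")
    if src == "loom":
        # start timestamp -> ?t=42m13s so the video opens at the cited moment
        m, s = divmod(int(meta["start_s"]), 60)
        return f"{meta['source_url']}?t={m}m{s}s"
    if src == "confluence":
        # pin the cited revision so later edits/deletions keep the link valid
        return f"{meta['source_url']}?pageVersion={meta['version']}"
    return meta["source_url"]
```

The key design choice is that the link is derivable from chunk metadata alone — no source API call at answer time, so citation rendering stays on the fast path.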
The LLM is prompted to cite inline — "according to the Confluence doc on pricing strategy (updated 2024-01-15)…" — so freshness shows up in the answer text, not just the search ranking. Users learn to trust answers that cite recent transcripts more than ones citing year-old docs.
Numbers from production
- 7 source connectors: Zoom + Teams + Slack + Confluence + Loom + Drive + SharePoint
- Sub-3-second response time on a 50,000-chunk corpus
- 95% knowledge accuracy on internal eval set (verified citations match the cited source)
- 70-85% of common support tickets answered from organizational memory before reaching a human
Where this gets hard
Permissions. A Confluence space is private to a department; a Slack channel is private to a team; a Loom is private to its owner. The vector store has to honor those permissions at query time, not just at ingestion. We attach an ACL to every chunk and filter on user.groups at query time. Re-permission events (e.g., a channel made public) arrive via webhook and feed a re-attach job: re-embedding is unnecessary, only the ACL changes.
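A sketch of the query-time filter, assuming a group-based ACL stored on each chunk (the representation is an assumption; the point is that the check runs against candidates, not against the index):

```python
def visible(chunk_acl: list[str], user_groups: list[str]) -> bool:
    """A chunk is visible if any of its ACL entries matches one of the
    querying user's groups. The ACL lives next to the chunk, so a
    re-permission webhook only rewrites this list, never the embedding."""
    return bool(set(chunk_acl) & set(user_groups))

def filter_candidates(candidates: list[dict], user_groups: list[str]) -> list[dict]:
    # Applied after vector search but before reranking, so the reranker
    # never sees (and can never surface) chunks the user cannot open.
    return [c for c in candidates if visible(c["acl"], user_groups)]
```

Filtering after retrieval keeps the index shared across users; the trade-off is over-fetching candidates so that enough survive the ACL cut, which is worth it given how cheap the set intersection is.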
Muhammad Mudassir
Founder & CEO, Cognilium AI | 10+ years experience
