Vector search finds semantically similar content but misses exact matches. Keyword search finds exact matches but misses synonyms. Graph search finds relationships but misses context. Hybrid search combines all three, and in our benchmarks it outperforms the best single method by about 23% (89% average accuracy versus 72% for vector search alone). Here's how to implement it.
What is Hybrid Search?
Hybrid search combines multiple retrieval methods—typically vector (semantic), keyword (lexical), and graph (relational)—and fuses their results into a single ranked list. The fusion algorithm (usually Reciprocal Rank Fusion) weights and combines results so the strengths of each method compensate for others' weaknesses.
1. Why Single-Method Search Fails
Vector Search Weaknesses
Query: "MSA-2024-001"
Vector Search: Returns contracts about "master service agreements" (semantic match)
Expected: The specific contract with ID MSA-2024-001 (exact match)
Result: ❌ Wrong documents
Keyword Search Weaknesses
Query: "unauthorized termination consequences"
Keyword Search: No exact phrase match
Document has: "breach of contract penalties" (same meaning)
Result: ❌ Missed relevant document
Graph Search Weaknesses
Query: "What are the security requirements?"
Graph Search: No entity to start traversal
Needed: Semantic understanding of "security requirements"
Result: ❌ Can't start without entities
Hybrid Search Wins
| Query Type | Vector | Keyword | Graph | Hybrid |
|---|---|---|---|---|
| Semantic | ✅ | ❌ | ⚠️ | ✅ |
| Exact match | ❌ | ✅ | ⚠️ | ✅ |
| Relationships | ⚠️ | ❌ | ✅ | ✅ |
| Average Accuracy | 72% | 68% | 71% | 89% |
2. The Three Search Methods
Method 1: Vector Search (Semantic)
import voyageai
from pinecone import Pinecone

pc = Pinecone(api_key="your-key")
index = pc.Index("documents")
# voyage-3 is a Voyage AI model; the Anthropic SDK has no embeddings endpoint.
# The client reads VOYAGE_API_KEY from the environment.
vo = voyageai.Client()

def vector_search(query: str, top_k: int = 10) -> list:
    # Embed the query with the same model used to index the documents
    response = vo.embed([query], model="voyage-3", input_type="query")
    query_embedding = response.embeddings[0]

    # Nearest-neighbor lookup in Pinecone
    results = index.query(
        vector=query_embedding,
        top_k=top_k,
        include_metadata=True
    )
    return [
        {
            "id": match.id,
            "score": match.score,
            "content": match.metadata.get("content", ""),
            "source": "vector"
        }
        for match in results.matches
    ]
Method 2: Keyword Search (Lexical)
from opensearchpy import OpenSearch

os_client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

def keyword_search(query: str, top_k: int = 10) -> list:
    response = os_client.search(
        index="documents",
        body={
            "query": {
                "multi_match": {
                    "query": query,
                    # title^2 boosts matches in the title field by a factor of 2
                    "fields": ["title^2", "content", "summary"],
                    "type": "best_fields"
                }
            },
            "size": top_k
        }
    )
    return [
        {
            "id": hit["_id"],
            "score": hit["_score"],
            "content": hit["_source"].get("content", ""),
            "source": "keyword"
        }
        for hit in response["hits"]["hits"]
    ]
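The multi_match query above searches analyzed text fields. If documents also store their identifier in a non-analyzed keyword field, an exact-ID clause can sit alongside it. The following is a minimal sketch under that assumption: the `doc_id` field, the `build_keyword_query` helper, the ID pattern, and the boost value are all hypothetical and not part of the setup above.

```python
import re

# Matches identifiers such as "MSA-2024-001" (the pattern is an assumption)
ID_PATTERN = re.compile(r"[A-Z]{2,}(?:-\d+)+")

def build_keyword_query(query: str, top_k: int = 10) -> dict:
    """Build an OpenSearch request body; add a term clause per ID found in the query."""
    should = [
        {
            "multi_match": {
                "query": query,
                "fields": ["title^2", "content", "summary"],
                "type": "best_fields",
            }
        }
    ]
    for doc_id in ID_PATTERN.findall(query):
        # "doc_id" is a hypothetical keyword-typed field holding the exact identifier
        should.append({"term": {"doc_id": {"value": doc_id, "boost": 5.0}}})
    return {
        "query": {"bool": {"should": should, "minimum_should_match": 1}},
        "size": top_k,
    }

body = build_keyword_query("MSA-2024-001")
```

The returned dictionary can be passed as the `body` argument of `os_client.search`. Queries without an ID fall back to the plain multi_match clause.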
Method 3: Graph Search (Relational)
from neo4j import GraphDatabase

driver = GraphDatabase.driver("neo4j+s://xxx.neo4j.io", auth=("neo4j", "password"))

def graph_search(query: str, top_k: int = 10) -> list:
    # extract_entities is assumed to be defined elsewhere (for example an NER step)
    # and to return a list of dicts with a "name" key.
    entities = extract_entities(query)
    if not entities:
        return []  # nothing to anchor the traversal on
    entity_names = [e["name"] for e in entities]
    with driver.session() as session:
        # Find documents within two hops of any matched entity
        result = session.run("""
            MATCH (e)-[r*1..2]-(related:Document)
            WHERE e.name IN $names
            RETURN DISTINCT related.id as id,
                   related.content as content,
                   count(r) as connection_strength
            ORDER BY connection_strength DESC
            LIMIT $limit
        """, {"names": entity_names, "limit": top_k})
        # Consume the records before the session closes
        return [
            {
                "id": record["id"],
                "score": record["connection_strength"],
                "content": record["content"],
                "source": "graph"
            }
            for record in result
        ]
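graph_search depends on an `extract_entities` helper that is not shown. One possible stand-in, sketched here with regular expressions only, returns dicts with a `name` key as the function above expects. The patterns and the `type` field are assumptions for illustration; a production system would more likely use an NER model or an LLM.

```python
import re

# Identifiers such as "MSA-2024-001" (pattern is an assumption)
ID_RE = re.compile(r"\b[A-Z]{2,}(?:-\d+)+\b")
# Runs of two or more capitalized words, e.g. "Acme Corp"
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")

def extract_entities(query: str) -> list[dict]:
    """Return candidate entities as dicts with 'name' and 'type' keys, without duplicates."""
    entities = []
    seen = set()
    for kind, pattern in (("id", ID_RE), ("name", NAME_RE)):
        for match in pattern.finditer(query):
            name = match.group(0)
            if name not in seen:
                seen.add(name)
                entities.append({"name": name, "type": kind})
    return entities

print(extract_entities("Which contracts did Acme Corp sign under MSA-2024-001?"))
print(extract_entities("What are the security requirements?"))  # no entities found
```

The second query yields an empty list, which is the case where graph_search returns no results and the other two methods have to carry the query.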
3. Reciprocal Rank Fusion
RRF combines ranked lists by considering rank positions, not raw scores.
The Formula
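RRF_score(doc) = Σ (1 / (k + rank_in_list))
The sum runs over every result list that contains the document, and rank_in_list is the document's 1-based position in that list. k is a smoothing constant, typically 60, that keeps a single top-ranked hit from dominating the fused score.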
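To make the arithmetic concrete, here is a small worked example of the unweighted formula. The `rrf_score` helper name is made up for this illustration. It compares a document ranked 2nd and 3rd in two lists with a document ranked 1st in only one list.

```python
def rrf_score(ranks: list[int], k: int = 60) -> float:
    """Sum 1 / (k + rank) over the 1-based ranks a document received."""
    return sum(1.0 / (k + rank) for rank in ranks)

both = rrf_score([2, 3])   # 1/62 + 1/63
single = rrf_score([1])    # 1/61
print(f"{both:.4f} vs {single:.4f}")  # prints "0.0320 vs 0.0164"
```

With k = 60, agreement between two methods outweighs a single first-place hit, which is the behavior that makes RRF useful for fusion.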
Implementation
from typing import Optional

def reciprocal_rank_fusion(
    result_lists: list[list[dict]],
    weights: Optional[list[float]] = None,
    k: int = 60
) -> list[dict]:
    # Default to equal weights, then normalize so the weights sum to 1
    if weights is None:
        weights = [1.0] * len(result_lists)
    total_weight = sum(weights)
    weights = [w / total_weight for w in weights]

    doc_scores = {}
    for results, weight in zip(result_lists, weights):
        # enumerate is 0-based, so rank + 1 is the 1-based rank from the formula
        for rank, doc in enumerate(results):
            doc_id = doc["id"]
            if doc_id not in doc_scores:
                doc_scores[doc_id] = {
                    "doc": doc,
                    "rrf_score": 0,
                    "sources": []
                }
            doc_scores[doc_id]["rrf_score"] += weight / (k + rank + 1)
            # Track which methods returned this document
            doc_scores[doc_id]["sources"].append(doc["source"])

    ranked = sorted(
        doc_scores.values(),
        key=lambda x: x["rrf_score"],
        reverse=True
    )
    return [
        {
            **item["doc"],
            "rrf_score": item["rrf_score"],
            "sources": item["sources"]
        }
        for item in ranked
    ]
4. Implementation: Vector + Keyword
Start with the most common hybrid: vector + keyword.
class HybridSearchV1:
    def __init__(self, vector_store, keyword_store):
        self.vector = vector_store
        self.keyword = keyword_store

    def search(
        self,
        query: str,
        top_k: int = 10,
        vector_weight: float = 0.6,
        keyword_weight: float = 0.4
    ) -> list:
        vector_results = self.vector.search(query, top_k=top_k * 2)
        keyword_results = self.keyword.search(query, top_k=top_k * 2)
        fused = reciprocal_rank_fusion(
            [vector_results, keyword_results],
            weights=[vector_weight, keyword_weight]
        )
        return fused[:top_k]
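HybridSearchV1 expects store objects with a `.search(query, top_k=...)` method, while the earlier sections define plain functions such as vector_search and keyword_search. One way to bridge the two is a thin adapter. The `FunctionStore` class and the stand-in backend below are hypothetical names used only for this sketch.

```python
from typing import Callable

class FunctionStore:
    """Wrap a search function so it exposes a .search(query, top_k) method."""

    def __init__(self, search_fn: Callable[[str, int], list]):
        self._search_fn = search_fn

    def search(self, query: str, top_k: int = 10) -> list:
        return self._search_fn(query, top_k)

# Stand-in backend for illustration; it replaces a real vector_search here
def fake_vector_search(query: str, top_k: int = 10) -> list:
    docs = [
        {"id": "a", "score": 0.9, "content": "master service agreement", "source": "vector"},
        {"id": "b", "score": 0.7, "content": "vendor onboarding policy", "source": "vector"},
    ]
    return docs[:top_k]

vector_store = FunctionStore(fake_vector_search)
print(vector_store.search("service agreements", top_k=1))
```

In real use, the wrapped functions would be the vector_search and keyword_search functions defined earlier, and the resulting stores would be passed to the HybridSearchV1 constructor.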
5. Implementation: Adding Graph Search
Full three-way hybrid for maximum recall.
import concurrent.futures

class HybridSearchV2:
    def __init__(self, vector_store, keyword_store, graph_store):
        self.vector = vector_store
        self.keyword = keyword_store
        self.graph = graph_store

    def search(self, query: str, top_k: int = 10, weights: dict = None) -> list:
        if weights is None:
            weights = {"vector": 0.4, "keyword": 0.3, "graph": 0.3}
        # Query the three backends in parallel so latency tracks the slowest one
        with concurrent.futures.ThreadPoolExecutor() as executor:
            vector_future = executor.submit(self.vector.search, query, top_k * 2)
            keyword_future = executor.submit(self.keyword.search, query, top_k * 2)
            graph_future = executor.submit(self.graph.search, query, top_k * 2)
            vector_results = vector_future.result()
            keyword_results = keyword_future.result()
            graph_results = graph_future.result()
        fused = reciprocal_rank_fusion(
            [vector_results, keyword_results, graph_results],
            weights=[weights["vector"], weights["keyword"], weights["graph"]]
        )
        return fused[:top_k]
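In the class above, an exception raised by any one backend propagates out of `future.result()` and fails the whole query. A possible variation, sketched here with a hypothetical `run_searches` helper, treats a failed backend as an empty result list so the remaining methods can still answer.

```python
import concurrent.futures
from typing import Callable

def run_searches(
    searchers: dict[str, Callable[[str, int], list]],
    query: str,
    top_k: int,
) -> dict[str, list]:
    """Run every searcher in its own thread; a backend that raises yields []."""
    results: dict[str, list] = {}
    workers = max(1, len(searchers))
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as executor:
        futures = {
            name: executor.submit(fn, query, top_k)
            for name, fn in searchers.items()
        }
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception:
                # Degrade gracefully; real code would also log the failure
                results[name] = []
    return results

# Stand-in backends for illustration
def ok_search(query: str, top_k: int) -> list:
    return [{"id": "a", "source": "vector"}][:top_k]

def broken_search(query: str, top_k: int) -> list:
    raise ConnectionError("graph database unreachable")

print(run_searches({"vector": ok_search, "graph": broken_search}, "query", 5))
```

An empty list adds nothing to the fused scores, so it can be passed to reciprocal_rank_fusion unchanged.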
Adaptive Weights
import re

# Add this as a method on HybridSearchV2
def adaptive_search(self, query: str, top_k: int = 10) -> list:
    # IDs such as "MSA-2024-001" call for exact matching
    has_exact_ids = bool(re.search(r'[A-Z]{2,}-\d+', query))
    has_entities = len(extract_entities(query)) > 0
    # Longer natural-language queries are treated as semantic
    is_semantic = len(query.split()) > 5
    if has_exact_ids:
        weights = {"vector": 0.2, "keyword": 0.7, "graph": 0.1}
    elif has_entities:
        weights = {"vector": 0.3, "keyword": 0.2, "graph": 0.5}
    elif is_semantic:
        weights = {"vector": 0.6, "keyword": 0.3, "graph": 0.1}
    else:
        weights = {"vector": 0.4, "keyword": 0.3, "graph": 0.3}
    return self.search(query, top_k, weights)
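To check which branch a given query takes, the routing rules can be pulled out into a pure function that is easy to unit-test. This is a sketch with the same thresholds as above. The `choose_weights` name is hypothetical, and entity detection is passed in as a flag so the example stays independent of any particular extract_entities implementation.

```python
import re

ID_RE = re.compile(r"[A-Z]{2,}-\d+")

def choose_weights(query: str, has_entities: bool = False) -> dict[str, float]:
    """Pick fusion weights from query shape; earlier rules take priority."""
    if ID_RE.search(query):
        return {"vector": 0.2, "keyword": 0.7, "graph": 0.1}
    if has_entities:
        return {"vector": 0.3, "keyword": 0.2, "graph": 0.5}
    if len(query.split()) > 5:
        return {"vector": 0.6, "keyword": 0.3, "graph": 0.1}
    return {"vector": 0.4, "keyword": 0.3, "graph": 0.3}

for q in ("MSA-2024-001",
          "what happens when a vendor ends the contract early",
          "security requirements"):
    print(q, "->", choose_weights(q))
```

Because the ID rule is checked first, a long question that mentions a contract ID is still routed toward keyword search.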
6. Re-ranking for Final Quality
After fusion, re-rank with a cross-encoder or LLM for best results.
Cross-Encoder Re-ranking
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-12-v2")

def rerank(query: str, results: list, top_k: int = 5) -> list:
    # Score each (query, document) pair jointly with the cross-encoder
    pairs = [(query, r["content"]) for r in results]
    scores = reranker.predict(pairs)
    # Note: this adds "rerank_score" to the input dicts in place
    for i, result in enumerate(results):
        result["rerank_score"] = float(scores[i])
    reranked = sorted(results, key=lambda x: x["rerank_score"], reverse=True)
    return reranked[:top_k]
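The same re-ranking step can be written against any scoring function, which makes it possible to swap the cross-encoder for an LLM-based scorer or to test the pipeline without loading a model. The sketch below assumes a hypothetical `rerank_with` helper and uses a toy token-overlap scorer purely as a stand-in. It also copies each result instead of mutating the input.

```python
from typing import Callable

def rerank_with(
    score_fn: Callable[[str, str], float],
    query: str,
    results: list[dict],
    top_k: int = 5,
) -> list[dict]:
    """Score each result with score_fn and return the top_k as new dicts."""
    scored = [
        {**r, "rerank_score": float(score_fn(query, r["content"]))}
        for r in results
    ]
    scored.sort(key=lambda r: r["rerank_score"], reverse=True)
    return scored[:top_k]

def token_overlap(query: str, content: str) -> float:
    """Toy scorer: fraction of query tokens that appear in the content."""
    q = set(query.lower().split())
    c = set(content.lower().split())
    return len(q & c) / len(q) if q else 0.0

candidates = [
    {"id": "a", "content": "payment terms and invoices"},
    {"id": "b", "content": "security requirements for vendors"},
]
print(rerank_with(token_overlap, "security requirements", candidates, top_k=1))
```

A cross-encoder could be plugged in through a small wrapper that calls its `predict` method on a single pair, though scoring all pairs in one batch, as in the function above, is usually faster.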
7. Tuning Fusion Weights
Optimal weights depend on your data and queries.
Grid Search Approach
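def tune_weights(test_queries: list, ground_truth: dict) -> dict:
    # evaluate(test_queries, ground_truth, weights) is assumed to return a
    # quality score (higher is better) for the given weights.
    best_weights = None
    best_score = float("-inf")
    for v in [0.2, 0.3, 0.4, 0.5, 0.6]:
        for kw in [0.2, 0.3, 0.4]:
            # The graph weight takes the remainder; round away float noise
            g = round(1.0 - v - kw, 2)
            if g < 0:
                continue
            weights = {"vector": v, "keyword": kw, "graph": g}
            score = evaluate(test_queries, ground_truth, weights)
            if score > best_score:
                best_score = score
                best_weights = weights
    return best_weights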
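tune_weights relies on an `evaluate` function that is not shown. One plausible version, sketched here, computes mean recall@k. It assumes that `ground_truth` maps each query string to a list of relevant document IDs and that the search callable has the same `(query, top_k, weights)` signature as HybridSearchV2.search. The `make_evaluator` factory and the choice of metric are assumptions.

```python
from typing import Callable

def make_evaluator(
    search_fn: Callable[[str, int, dict], list],
    top_k: int = 10,
) -> Callable[[list, dict, dict], float]:
    """Return an evaluate(test_queries, ground_truth, weights) function."""

    def evaluate(test_queries: list, ground_truth: dict, weights: dict) -> float:
        recalls = []
        for query in test_queries:
            relevant = set(ground_truth.get(query, []))
            if not relevant:
                continue  # skip queries with no labeled relevant documents
            retrieved = {doc["id"] for doc in search_fn(query, top_k, weights)}
            recalls.append(len(relevant & retrieved) / len(relevant))
        # Mean recall@k over the queries that have labels
        return sum(recalls) / len(recalls) if recalls else 0.0

    return evaluate

# Stand-in search function for illustration: always returns d1 and d2
def fake_search(query: str, top_k: int, weights: dict) -> list:
    return [{"id": "d1"}, {"id": "d2"}][:top_k]

evaluate = make_evaluator(fake_search)
labels = {"q1": ["d1", "d3"], "q2": ["d2"]}
print(evaluate(["q1", "q2"], labels, {"vector": 0.4, "keyword": 0.3, "graph": 0.3}))
```

Other metrics such as MRR or nDCG fit the same shape. Whichever is chosen, the tuning set should be kept separate from the queries used to report final accuracy.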
Recommended Starting Weights
| Use Case | Vector | Keyword | Graph |
|---|---|---|---|
| General purpose | 0.4 | 0.3 | 0.3 |
| Technical docs (exact IDs matter) | 0.3 | 0.5 | 0.2 |
| Legal/contracts (relationships matter) | 0.3 | 0.2 | 0.5 |
| Knowledge base (semantic) | 0.5 | 0.3 | 0.2 |
8. Production Architecture
Next Steps
- GraphRAG Implementation Guide: Full architecture for graph-enhanced RAG
- RAG vs GraphRAG: When to add graph search
- Evidence-Mapped Retrieval: Traceable citations in search results
Need help implementing hybrid search?
At Cognilium, we've built hybrid search systems processing millions of documents. Let's discuss your retrieval needs →
Muhammad Mudassir
Founder & CEO, Cognilium AI
Mudassir Marwat is the Founder & CEO of Cognilium AI, where he leads the design and deployment of pr...