
GraphRAG Implementation Guide: From Basic RAG to Enterprise Knowledge Systems


Muhammad Mudassir

Founder & CEO, Cognilium AI

GraphRAG architecture diagram showing Neo4j knowledge graph connected to vector embeddings and LLM
Complete GraphRAG implementation guide. From basic RAG to enterprise knowledge graphs with Neo4j. Architecture patterns, code, and production deployment.
Tags: GraphRAG Neo4j, enterprise RAG search, knowledge graph LLM, RAG best practices, evidence-mapped retrieval, hybrid search RAG

Basic RAG retrieves chunks. GraphRAG retrieves understanding. When your enterprise knowledge spans 1.2 million documents with complex relationships—contracts referencing other contracts, policies linking to regulations, employees connected to projects—basic vector search fails. GraphRAG uses knowledge graphs to capture these relationships and deliver accurate, contextual answers.

What is GraphRAG?

GraphRAG (Graph Retrieval-Augmented Generation) is an architecture that combines knowledge graphs with vector embeddings for LLM retrieval. Unlike basic RAG, which retrieves text chunks based on semantic similarity, GraphRAG models entity relationships, traverses connections, and retrieves contextually relevant information even when the exact words don't match. Microsoft Research popularized the term in 2024, and the approach is now widely used for enterprise knowledge systems.

What is Basic RAG?

Basic RAG (Retrieval-Augmented Generation) embeds documents into vectors and retrieves the chunks most semantically similar to a query. It works well for simple lookups but struggles with multi-hop reasoning, relationship queries, and large document collections where the connections between documents matter more than surface similarity.

1. Why Basic RAG Fails at Scale

Basic RAG works great for simple use cases:

  • "What's our refund policy?"
  • "How do I reset my password?"
  • "What are the product features?"

But it breaks down when queries require:

Multi-Hop Reasoning

Query: "Which contracts with Acme Corp reference the 2023 pricing amendment?"

Basic RAG: Searches for "Acme Corp" + "2023 pricing amendment"
Result: Returns chunks mentioning these terms, but misses contracts that 
        reference the amendment indirectly through clause numbers

GraphRAG: Traverses: Acme Corp → Contracts → References → Amendment 2023-P
Result: Returns all 7 contracts, including 3 with indirect references

Relationship Queries

Query: "Who approved the budget for Project Atlas and what's their reporting chain?"

Basic RAG: Finds mentions of "Project Atlas" and "budget" and "approved"
Result: Scattered chunks, no clear answer

GraphRAG: Traverses: Project Atlas → Budget → Approval → Sarah Chen → Reports To → ...
Result: "Sarah Chen (VP Engineering) approved. Reports to: Mike Johnson (CTO) → CEO"

Temporal Context

Query: "What changed in our security policy after the 2024 audit?"

Basic RAG: Returns current security policy and audit report
Result: No diff, no timeline

GraphRAG: Traverses: Security Policy → Versions → Changes After → Audit 2024
Result: "3 changes: MFA requirement added, password rotation increased, 
        new vendor review process. Effective March 2024."

The Numbers

| Scenario | Basic RAG Accuracy | GraphRAG Accuracy |
|---|---|---|
| Simple lookup | 92% | 94% |
| Multi-hop reasoning | 54% | 89% |
| Relationship queries | 41% | 87% |
| Temporal queries | 38% | 82% |
| Average (complex queries) | 44% | 86% |

GraphRAG delivers ~2x better accuracy on complex enterprise queries.
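
The "Average (complex queries)" row and the ~2x figure follow from the three complex-query rows. The check below is a small sketch that uses only the numbers in the table; the variable names are illustrative.

```python
# Accuracy figures copied from the table above (percent).
basic_rag = {"multi_hop": 54, "relationship": 41, "temporal": 38}
graph_rag = {"multi_hop": 89, "relationship": 87, "temporal": 82}

def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

basic_avg = mean(basic_rag.values())   # about 44.3
graph_avg = mean(graph_rag.values())   # 86.0
improvement = graph_avg / basic_avg    # about 1.94

print(f"Basic RAG: {basic_avg:.0f}%, GraphRAG: {graph_avg:.0f}%, ratio: {improvement:.2f}x")
```

The simple-lookup row is excluded from the average, which is why the gap looks much larger than the 92% vs 94% difference on easy queries.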

2. GraphRAG Architecture Overview

The Three Layers

Architecture Diagram

Component Breakdown

| Component | Purpose | Technology Options |
|---|---|---|
| Vector Store | Semantic similarity search | Pinecone, Weaviate, Qdrant, pgvector |
| Graph Database | Relationship traversal | Neo4j, Amazon Neptune, TigerGraph |
| Keyword Search | Exact match, filters | OpenSearch, Elasticsearch, PostgreSQL |
| Fusion Layer | Combine and re-rank results | Custom logic, Cohere Rerank |
| LLM | Generate final response | Claude, GPT-4, Gemini |

3. When to Use GraphRAG vs Basic RAG

Use Basic RAG When:

  ✅ Documents are independent (no relationships matter)
  ✅ Queries are simple lookups ("What is X?")
  ✅ Document collection is small (<10,000 docs)
  ✅ Speed is critical (graph traversal adds latency)
  ✅ Budget is limited (graphs add infrastructure cost)

Examples: FAQ bots, simple documentation search, customer support for straightforward issues.

Use GraphRAG When:

  ✅ Documents reference each other (contracts, policies, regulations)
  ✅ Queries involve relationships ("Who approved what?")
  ✅ Multi-hop reasoning is common
  ✅ Accuracy on complex queries is critical
  ✅ Document collection is large and interconnected

Examples: Legal document analysis, enterprise knowledge bases, compliance systems, research databases.

Decision Matrix

| Factor | Choose Basic RAG | Choose GraphRAG |
|---|---|---|
| Query complexity | Simple lookups | Multi-hop reasoning |
| Document relationships | Independent | Interconnected |
| Accuracy requirement | Good enough (85%+) | Must be high (95%+) |
| Latency tolerance | <500ms required | 1-3s acceptable |
| Infrastructure budget | Limited | Available |
| Team expertise | Basic ML | Graph + ML |
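
The matrix can be encoded as a small rule function for use in planning discussions. This is only a sketch: the `Workload` fields, the rule ordering, and the choice to treat a sub-500ms budget as a hard constraint are assumptions, not part of the matrix itself.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    multi_hop_queries: bool         # queries need more than one lookup step
    documents_interconnected: bool  # documents reference each other
    required_accuracy: float        # e.g. 0.95 for "must be high"
    latency_budget_ms: int          # end-to-end response budget
    has_graph_expertise: bool       # team can operate a graph database

def recommend(workload: Workload) -> str:
    """Return 'GraphRAG' or 'Basic RAG' using the matrix thresholds."""
    # Assumption: a sub-500ms budget rules out graph traversal outright.
    if workload.latency_budget_ms < 500:
        return "Basic RAG"
    needs_graph = (
        workload.multi_hop_queries
        or workload.documents_interconnected
        or workload.required_accuracy >= 0.95
    )
    # Without graph expertise, start with basic RAG even if a graph would help.
    if needs_graph and workload.has_graph_expertise:
        return "GraphRAG"
    return "Basic RAG"

print(recommend(Workload(True, True, 0.95, 2000, True)))
```

In practice the factors trade off against each other, so treat the output as a starting point for discussion.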

4. Building Your Knowledge Graph

Step 1: Entity Extraction

Extract entities from your documents:

import json

# These import paths match older LangChain releases; newer releases move
# ChatAnthropic into the separate langchain_anthropic package.
from langchain.chat_models import ChatAnthropic
from langchain.prompts import ChatPromptTemplate

llm = ChatAnthropic(model="claude-3-sonnet-20240229")

extraction_prompt = ChatPromptTemplate.from_template("""
Extract all entities and relationships from this document.

Document:
{document}

Return JSON with:
- entities: list of {{name, type, properties}}
- relationships: list of {{source, target, type, properties}}

Entity types: Person, Organization, Contract, Policy, Project, Date, Amount
Relationship types: WORKS_FOR, APPROVED_BY, REFERENCES, EFFECTIVE_DATE, AMOUNT

JSON:
""")

def extract_entities(document: str) -> dict:
    """Ask the LLM for entities and relationships and parse its JSON reply."""
    response = llm.invoke(extraction_prompt.format_messages(document=document))
    # json.loads raises json.JSONDecodeError if the reply is not valid JSON
    return json.loads(response.content)
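
LLM output rarely matches the requested schema perfectly, so it helps to validate the reply before it reaches the graph. The sketch below is one possible post-processing step, not part of the extraction code above: `parse_extraction` is a hypothetical helper, and the allowed type sets are copied from the prompt.

```python
import json

ENTITY_TYPES = {"Person", "Organization", "Contract", "Policy", "Project", "Date", "Amount"}
RELATIONSHIP_TYPES = {"WORKS_FOR", "APPROVED_BY", "REFERENCES", "EFFECTIVE_DATE", "AMOUNT"}

def parse_extraction(raw: str) -> dict:
    """Parse an LLM reply and keep only entities/relationships that fit the schema."""
    # Models sometimes wrap JSON in prose; keep the outermost {...} span.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in model output")
    data = json.loads(raw[start:end + 1])

    entities = [
        e for e in data.get("entities", [])
        if e.get("name") and e.get("type") in ENTITY_TYPES
    ]
    known = {e["name"] for e in entities}
    # Drop relationships with unknown types or endpoints that were not extracted.
    relationships = [
        r for r in data.get("relationships", [])
        if r.get("type") in RELATIONSHIP_TYPES
        and r.get("source") in known
        and r.get("target") in known
    ]
    return {"entities": entities, "relationships": relationships}
```

Dropping invalid items silently is a design choice; logging how many were discarded per document gives an early signal that the prompt needs work.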

Step 2: Build the Graph in Neo4j

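import re

from neo4j import GraphDatabase

# Labels and relationship types cannot be passed as Cypher parameters, so
# validate them before interpolating them into the query text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def _safe_identifier(value: str) -> str:
    if not _IDENTIFIER.match(value):
        raise ValueError(f"Unsafe label or relationship type: {value!r}")
    return value

class KnowledgeGraph:
    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def close(self):
        self.driver.close()

    def add_entity(self, entity: dict):
        # MERGE keeps ingestion idempotent: re-running never duplicates a node
        query = """
        MERGE (e:{type} {{name: $name}})
        SET e += $properties
        RETURN e
        """.format(type=_safe_identifier(entity['type']))

        with self.driver.session() as session:
            session.run(query, name=entity['name'], properties=entity.get('properties', {}))

    def add_relationship(self, rel: dict):
        # Matching by name without a label scans all nodes; add labels to the
        # MATCH clauses if the extractor returns types for both endpoints
        query = """
        MATCH (source {{name: $source}})
        MATCH (target {{name: $target}})
        MERGE (source)-[r:{type}]->(target)
        SET r += $properties
        """.format(type=_safe_identifier(rel['type']))

        with self.driver.session() as session:
            session.run(query,
                       source=rel['source'],
                       target=rel['target'],
                       properties=rel.get('properties', {}))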
    
    def ingest_document(self, document: str, doc_id: str):
        # Extract entities and relationships
        extracted = extract_entities(document)
        
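        # Add document node; 'id' matches the Document queries and the
        # uniqueness constraint used elsewhere in this guide
        self.add_entity({
            'type': 'Document',
            'name': doc_id,
            'properties': {'id': doc_id, 'content': document[:1000]}  # Store snippet
        })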
        
        # Add entities
        for entity in extracted['entities']:
            self.add_entity(entity)
            # Link entity to document
            self.add_relationship({
                'source': doc_id,
                'target': entity['name'],
                'type': 'CONTAINS'
            })
        
        # Add relationships
        for rel in extracted['relationships']:
            self.add_relationship(rel)
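
The class above opens one session per entity, which becomes slow for large ingests. A common alternative is to batch rows per label and send them with `UNWIND`. The sketch below only builds the Cypher text and parameter rows, so it runs without a database; `build_entity_batches` is a hypothetical helper.

```python
import re
from collections import defaultdict

_LABEL = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def build_entity_batches(entities: list[dict]) -> list[tuple[str, list[dict]]]:
    """Group entities by label and return (cypher, rows) pairs, one per label."""
    rows_by_label = defaultdict(list)
    for entity in entities:
        label = entity["type"]
        # Labels are interpolated into the query text, so reject unsafe values.
        if not _LABEL.match(label):
            raise ValueError(f"Unsafe label: {label!r}")
        rows_by_label[label].append(
            {"name": entity["name"], "properties": entity.get("properties", {})}
        )

    batches = []
    for label, rows in rows_by_label.items():
        cypher = (
            "UNWIND $rows AS row "
            f"MERGE (e:{label} {{name: row.name}}) "
            "SET e += row.properties"
        )
        batches.append((cypher, rows))
    return batches
```

Each pair can then be executed in a single round trip, for example with `session.run(cypher, rows=rows)`. The right batch size depends on the deployment, so very large row lists may need to be split further.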

Step 3: Create Vector Embeddings

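import os

from langchain.embeddings import BedrockEmbeddings
from pinecone import Pinecone  # pinecone-client v3+; older clients use pinecone.init(...)

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")

# The index must already exist with the same dimension as the embedding model
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
pinecone_index = pc.Index("graphrag-prod")

def embed_and_store(document: str, doc_id: str, metadata: dict):
    # Generate embedding; embedding models cap input length, so split long
    # documents into chunks before calling this
    vector = embeddings.embed_query(document)

    # Store in Pinecone
    pinecone_index.upsert([
        {
            'id': doc_id,
            'values': vector,
            'metadata': {
                **metadata,
                'text': document[:8000]  # Store text for retrieval
            }
        }
    ])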
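
Embedding a whole document as one vector loses detail on long contracts, so documents are usually split into overlapping chunks first. The sketch below shows one simple character-based splitter. The chunk size, overlap, and ID format are illustrative assumptions, and a token-based splitter may suit a production system better.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("Require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

def chunk_ids(doc_id: str, chunks: list[str]) -> list[str]:
    """Derive stable per-chunk IDs so re-ingestion overwrites instead of duplicating."""
    return [f"{doc_id}#chunk-{i}" for i in range(len(chunks))]
```

Each chunk can then be passed to `embed_and_store` with its chunk ID, keeping the parent `doc_id` in the metadata so results can be traced back to the Document node in the graph.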

Graph Schema Example

// Schema for a contract analysis system

// Nodes
(:Contract {id, title, effective_date, status})
(:Party {name, type, jurisdiction})
(:Clause {id, type, text})
(:Amendment {id, date, description})
(:Person {name, title, email})

// Relationships
(:Contract)-[:BETWEEN]->(:Party)
(:Contract)-[:CONTAINS]->(:Clause)
(:Contract)-[:AMENDED_BY]->(:Amendment)
(:Clause)-[:REFERENCES]->(:Clause)
(:Contract)-[:APPROVED_BY]->(:Person)
(:Person)-[:WORKS_FOR]->(:Party)

5. Implementing Hybrid Search

The power of GraphRAG comes from combining multiple search methods.

Hybrid Query Flow

class HybridSearch:
    def __init__(self, vector_store, graph_db, keyword_search):
        self.vector = vector_store
        self.graph = graph_db
        self.keyword = keyword_search
    
    def search(self, query: str, top_k: int = 10) -> list:
        # 1. Vector search (semantic similarity)
        vector_results = self.vector.similarity_search(query, k=top_k)
        
        # 2. Graph traversal (relationship-based)
        entities = self.extract_entities_from_query(query)
        graph_results = self.graph.traverse(entities, max_hops=2)
        
        # 3. Keyword search (exact match)
        keyword_results = self.keyword.search(query, k=top_k)
        
        # 4. Fuse results using Reciprocal Rank Fusion
        fused = self.reciprocal_rank_fusion([
            vector_results,
            graph_results,
            keyword_results
        ], weights=[0.4, 0.4, 0.2])
        
        return fused[:top_k]
    
    def reciprocal_rank_fusion(self, result_lists: list, weights: list, k=60) -> list:
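        """
        Combine multiple ranked lists using weighted RRF.
        Score = sum(weight / (k + rank)) over the lists, with rank starting at 1
        (enumerate() is 0-based, hence rank + 1 below).
        Each result must be a dict with an 'id' key.
        """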
        scores = {}
        
        for results, weight in zip(result_lists, weights):
            for rank, doc in enumerate(results):
                doc_id = doc['id']
                if doc_id not in scores:
                    scores[doc_id] = {'doc': doc, 'score': 0}
                scores[doc_id]['score'] += weight / (k + rank + 1)
        
        # Sort by fused score
        ranked = sorted(scores.values(), key=lambda x: x['score'], reverse=True)
        return [item['doc'] for item in ranked]
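
To see how weighted RRF behaves, it helps to run it on toy data. The standalone sketch below uses the same formula and weights as the class above but works on plain ID lists. The three hit lists are made up for illustration.

```python
def weighted_rrf(ranked_lists, weights, k=60):
    """Fuse ranked lists of IDs; each hit adds weight / (k + rank), rank starting at 1."""
    scores = {}
    for ids, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["a", "b", "c"]   # hypothetical vector search ranking
graph_hits = ["b", "d"]         # hypothetical graph traversal ranking
keyword_hits = ["c", "b"]       # hypothetical keyword search ranking

fused = weighted_rrf([vector_hits, graph_hits, keyword_hits], weights=[0.4, 0.4, 0.2])
print(fused)  # ['b', 'c', 'a', 'd']
```

Document "b" wins because it appears in all three lists, even though it is ranked first in only one. Because RRF uses ranks rather than raw scores, the three retrievers do not need comparable scoring scales.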

Graph Traversal Queries

// Find all contracts related to a party and their amendments
MATCH (p:Party {name: $party_name})<-[:BETWEEN]-(c:Contract)
OPTIONAL MATCH (c)-[:AMENDED_BY]->(a:Amendment)
OPTIONAL MATCH (c)-[:CONTAINS]->(clause:Clause)-[:REFERENCES]->(ref:Clause)
RETURN c, collect(DISTINCT a) as amendments, collect(DISTINCT ref) as references

// Find approval chain for a document
MATCH (d:Document {id: $doc_id})-[:APPROVED_BY]->(approver:Person)
MATCH path = (approver)-[:REPORTS_TO*0..3]->(manager:Person)
RETURN path

// Find related documents through shared entities
// (entity nodes carry type labels such as Person or Contract, so e has no label filter)
MATCH (d1:Document {id: $doc_id})-[:CONTAINS]->(e)<-[:CONTAINS]-(d2:Document)
WHERE d1 <> d2
RETURN d2, count(e) as shared_entities
ORDER BY shared_entities DESC
LIMIT 10
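
The hop limit used in these traversals (and in `max_hops=2` in the hybrid search code) is easiest to understand as a bounded breadth-first search. The sketch below shows the idea on an in-memory adjacency map. It is not how Neo4j executes Cypher, and the node names other than "Acme Corp" and "Amendment 2023-P" are invented for illustration.

```python
from collections import deque

def traverse(adjacency: dict, start_nodes: list, max_hops: int = 2) -> dict:
    """Breadth-first expansion; returns {node: hop_distance} for nodes within max_hops."""
    distances = {node: 0 for node in start_nodes}
    queue = deque(distances)
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in adjacency.get(node, []):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances

graph = {
    "Acme Corp": ["Contract-1", "Contract-2"],
    "Contract-1": ["Amendment 2023-P"],
    "Contract-2": ["Clause 5.2"],
    "Clause 5.2": ["Amendment 2023-P"],
    "Amendment 2023-P": [],
}

reachable = traverse(graph, ["Acme Corp"], max_hops=2)
```

Here `reachable` contains both contracts at distance 1, and the amendment and the clause at distance 2. Each extra hop can multiply the number of visited nodes, which is why limiting hops is one of the main latency controls.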

6. Production Architecture

Recommended Stack

Architecture Diagram

AWS Implementation

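# Infrastructure as Code (CDK v2)
from aws_cdk import (
    Duration,
    Stack,
    aws_apigateway as apigw,
    aws_ec2 as ec2,
    aws_lambda as lambda_,
    aws_s3 as s3,
)
# The high-level Neptune constructs live in the separate alpha package
# (aws-cdk.aws-neptune-alpha). Alternatively, use Neo4j Aura.
import aws_cdk.aws_neptune_alpha as neptune

class GraphRAGStack(Stack):
    # opensearch_endpoint is passed in; the search domain is assumed to be
    # created elsewhere
    def __init__(self, scope, id, *, opensearch_endpoint: str, **kwargs):
        super().__init__(scope, id, **kwargs)

        # Neptune only runs inside a VPC
        vpc = ec2.Vpc(self, "Vpc", max_azs=2)

        # Document bucket
        self.doc_bucket = s3.Bucket(self, "DocumentBucket")

        # Graph database (Neptune or Neo4j Aura)
        self.graph_db = neptune.DatabaseCluster(self, "GraphDB",
            instance_type=neptune.InstanceType.R5_LARGE,
            vpc=vpc
        )

        # Query handler; it runs in the same VPC so it can reach Neptune
        self.query_handler = lambda_.Function(self, "QueryHandler",
            runtime=lambda_.Runtime.PYTHON_3_11,
            handler="query.handler",
            code=lambda_.Code.from_asset("lambda"),  # assumed path to the handler code
            vpc=vpc,
            timeout=Duration.seconds(30),
            memory_size=1024,
            environment={
                "GRAPH_ENDPOINT": self.graph_db.cluster_endpoint.hostname,
                "PINECONE_INDEX": "graphrag-prod",
                "OPENSEARCH_ENDPOINT": opensearch_endpoint
            }
        )
        # Allow the function to open connections to the cluster
        self.graph_db.connections.allow_default_port_from(self.query_handler)

        # API Gateway
        self.api = apigw.RestApi(self, "GraphRAGAPI")
        self.api.root.add_resource("query").add_method(
            "POST",
            apigw.LambdaIntegration(self.query_handler)
        )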

Scaling Considerations

| Component | Scaling Strategy |
|---|---|
| Neo4j | Read replicas, causal clustering |
| Pinecone | Managed scaling, pod selection |
| OpenSearch | Horizontal sharding |
| Lambda | Concurrency limits, provisioned concurrency |
| LLM | Request queuing, fallback models |

7. Performance Optimization

Query Latency Breakdown

| Stage | Typical Latency | Optimization |
|---|---|---|
| Intent classification | 50-100ms | Cache common patterns |
| Vector search | 50-150ms | Reduce k, use approximate NN |
| Graph traversal | 100-500ms | Index hot paths, limit hops |
| Keyword search | 20-50ms | Optimize index settings |
| Result fusion | 10-30ms | Pre-compute common fusions |
| LLM generation | 1-3s | Stream response, use smaller context |
| Total | 1.5-4s | |
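
Summing the stage ranges shows where the time goes and how much concurrent retrieval could save. This is a back-of-the-envelope sketch using only the table's figures. Running the three retrievers concurrently is an assumption about the deployment, and the table's 1.5-4s total presumably also includes overhead that is not itemized per stage.

```python
# (min_ms, max_ms) per stage, copied from the table above.
stages = {
    "intent_classification": (50, 100),
    "vector_search": (50, 150),
    "graph_traversal": (100, 500),
    "keyword_search": (20, 50),
    "result_fusion": (10, 30),
    "llm_generation": (1000, 3000),
}
retrievers = ["vector_search", "graph_traversal", "keyword_search"]

def total(bound: int, parallel_retrieval: bool) -> int:
    """Sum one bound (0 = min, 1 = max) across stages, in milliseconds."""
    others = sum(v[bound] for name, v in stages.items() if name not in retrievers)
    retrieval = [stages[name][bound] for name in retrievers]
    # Concurrent retrieval costs as much as its slowest retriever.
    return others + (max(retrieval) if parallel_retrieval else sum(retrieval))

sequential = (total(0, False), total(1, False))  # (1230, 3830)
parallel = (total(0, True), total(1, True))      # (1160, 3630)
print(sequential, parallel)
```

Either way, LLM generation dominates the total, which is why streaming the response usually improves perceived latency more than tuning retrieval.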

Optimization Techniques

1. Graph Indexing

// Create indexes for common traversal patterns
CREATE INDEX party_name FOR (p:Party) ON (p.name);
CREATE INDEX clause_type FOR (cl:Clause) ON (cl.type);
CREATE INDEX person_name FOR (p:Person) ON (p.name);

// Create full-text index for text search in graph
CREATE FULLTEXT INDEX contract_text FOR (c:Contract) ON EACH [c.title, c.summary];

2. Caching Strategy

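import hashlib
import json

import redis

cache = redis.Redis(host='localhost', port=6379)

def cached_search(query: str, cache_ttl: int = 3600):
    # Built-in hash() is salted per process for strings, so its value differs
    # between workers and restarts; a SHA-256 digest gives a stable key
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    cache_key = f"graphrag:{digest}"

    # Check cache
    cached = cache.get(cache_key)
    if cached:
        return json.loads(cached)

    # Execute search (hybrid_search is a HybridSearch instance from Section 5)
    results = hybrid_search.search(query)

    # Cache results; they must be JSON-serializable
    cache.setex(cache_key, cache_ttl, json.dumps(results))

    return results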

3. Query Planning

def plan_query(query: str) -> dict:
    """
    Analyze query and determine optimal search strategy.
    """
    # Classify query type
    if is_relationship_query(query):
        return {
            'strategy': 'graph_first',
            'weights': {'vector': 0.2, 'graph': 0.7, 'keyword': 0.1}
        }
    elif is_exact_match_query(query):
        return {
            'strategy': 'keyword_first',
            'weights': {'vector': 0.2, 'graph': 0.2, 'keyword': 0.6}
        }
    else:
        return {
            'strategy': 'balanced',
            'weights': {'vector': 0.4, 'graph': 0.4, 'keyword': 0.2}
        }
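
`plan_query` relies on two classifiers that are not defined above. The sketch below is one simple way to implement them with regular expressions. The cue lists are hypothetical and should be tuned against real query logs; a small LLM or trained classifier is a reasonable alternative.

```python
import re

# Hypothetical cue words that suggest the query is about connections between entities.
RELATIONSHIP_CUES = re.compile(
    r"\b(who|approved|reports? to|references?|referencing|related to|linked to|amended)\b",
    re.IGNORECASE,
)
# Quoted phrases or identifier-like tokens such as "Section 5.2" or "Contract-2024-001".
EXACT_MATCH_CUES = re.compile(
    r'"[^"]+"|\b(?:section|clause)\s+\d+(?:\.\d+)*\b|\b[A-Za-z]+-\d{4}-\d+\b',
    re.IGNORECASE,
)

def is_relationship_query(query: str) -> bool:
    """True when the query contains a relationship cue."""
    return bool(RELATIONSHIP_CUES.search(query))

def is_exact_match_query(query: str) -> bool:
    """True when the query contains a quoted phrase or an identifier-like token."""
    return bool(EXACT_MATCH_CUES.search(query))
```

Because `plan_query` checks the relationship classifier first, a query such as "Find contracts referencing Section 5.2" is routed to the graph-first strategy even though it also contains an exact identifier.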

8. Real Implementation: Legal Lens AI

At Cognilium, we built Legal Lens AI using GraphRAG architecture. Here's how it performs in production.

The Challenge

  • 1.2 million contracts to analyze
  • Contracts reference other contracts, amendments, clauses
  • Users ask complex queries: "Find all contracts with Acme Corp that have non-standard termination clauses referencing the 2022 master agreement"
  • 95% accuracy required for enterprise compliance

Our Architecture

Architecture Diagram

Results

| Metric | Basic RAG | GraphRAG (Legal Lens) |
|---|---|---|
| Simple query accuracy | 91% | 95% |
| Multi-hop query accuracy | 52% | 94% |
| Relationship query accuracy | 43% | 96% |
| Average latency | 1.2s | 2.8s |
| User satisfaction | 72% | 94% |

Key Learnings

  1. Entity extraction quality matters most: Garbage in, garbage out. We spent 40% of development time on extraction prompts.

  2. Graph schema design is critical: Bad schema = slow queries. We redesigned twice before production.

  3. Hybrid search beats single-method: Neither vector nor graph alone matched hybrid performance.

  4. Caching reduces latency significantly: 60% of queries hit cache, reducing average latency from 2.8s to 1.1s.

9. Common GraphRAG Mistakes

Mistake 1: Over-Extracting Entities

# ❌ Bad: Extract everything
entities = ["the", "contract", "shall", "be", "effective", ...]

# ✅ Good: Extract meaningful entities
entities = ["Acme Corp", "Master Agreement 2024", "Section 5.2", "$500,000"]
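
A cheap safety net against over-extraction is a post-filter on candidate entities. The heuristic below is a rough sketch rather than a substitute for a good extraction prompt. The stopword list and the rules are assumptions chosen to reproduce the example above.

```python
# Hypothetical stopword list; extend it with domain-specific filler terms.
STOPWORDS = {"the", "a", "an", "of", "and", "or", "shall", "be", "is", "contract", "effective"}

def keep_entity(candidate: str) -> bool:
    """Heuristic: keep multi-word names, identifiers with digits, or capitalized terms."""
    text = candidate.strip()
    if not text or text.lower() in STOPWORDS:
        return False
    has_digit = any(ch.isdigit() for ch in text)
    is_multiword = " " in text
    is_capitalized = text[0].isupper()
    return has_digit or is_multiword or is_capitalized

candidates = ["the", "contract", "shall", "be", "effective",
              "Acme Corp", "Master Agreement 2024", "Section 5.2", "$500,000"]
filtered = [c for c in candidates if keep_entity(c)]
```

With these rules `filtered` keeps only the four meaningful entities, so the graph is not flooded with nodes that every document links to.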

Mistake 2: Ignoring Relationship Direction

// ❌ Bad: Undirected relationships lose meaning
(Contract)-[:RELATED]-(Amendment)

// ✅ Good: Directed relationships preserve semantics
(Contract)-[:AMENDED_BY]->(Amendment)
(Amendment)-[:AMENDS]->(Contract)

Mistake 3: No Graph Indexes

// ❌ Bad: CONTAINS on an unindexed property scans every Contract node
MATCH (c:Contract) WHERE c.title CONTAINS "Acme" RETURN c

// ✅ Good: Create a full-text index once, then query it
CREATE FULLTEXT INDEX contract_title IF NOT EXISTS FOR (c:Contract) ON EACH [c.title];
CALL db.index.fulltext.queryNodes("contract_title", "Acme") YIELD node, score
RETURN node, score

Mistake 4: Ignoring Edge Cases

# ❌ Bad: Assumes extraction always works
extracted = extract_entities(document)
for entity in extracted['entities']:
    graph.add_entity(entity)

# ✅ Good: Handle extraction failures
try:
    extracted = extract_entities(document)
    # Fall back to keyword extraction when the LLM returns no entities
    entities = extracted.get('entities') or keyword_extract(document)
    for entity in entities:
        graph.add_entity(entity)
except ExtractionError as e:
    # ExtractionError stands for whatever your extractor raises,
    # e.g. json.JSONDecodeError on malformed LLM output
    logger.warning(f"Extraction failed for {doc_id}: {e}")
    # Store document without graph enrichment
    vector_store.add(document, doc_id)

Mistake 5: No Evidence Mapping

# ❌ Bad: Return answer without sources
return {"answer": "The contract was approved by John Smith."}

# ✅ Good: Include evidence trail
return {
    "answer": "The contract was approved by John Smith.",
    "evidence": [
        {
            "source": "Contract-2024-001, Section 12",
            "text": "Approved by: John Smith, VP Legal",
            "confidence": 0.95
        }
    ],
    "graph_path": ["Contract-2024-001", "APPROVED_BY", "John Smith"]
}
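
One way to enforce evidence mapping is to build every response through a single function that refuses to return an unsupported answer. The sketch below follows the response shape shown above. The `Evidence` class, `build_response`, and the 0.5 threshold are illustrative assumptions.

```python
from dataclasses import dataclass, asdict

@dataclass
class Evidence:
    source: str        # e.g. "Contract-2024-001, Section 12"
    text: str          # the quoted passage supporting the answer
    confidence: float  # retrieval or extraction confidence in [0, 1]

def build_response(answer: str, evidence: list[Evidence], graph_path: list[str],
                   min_confidence: float = 0.5) -> dict:
    """Attach evidence to an answer; withhold the answer if nothing supports it."""
    supported = [e for e in evidence if e.confidence >= min_confidence]
    if not supported:
        return {"answer": None, "evidence": [], "graph_path": [],
                "reason": "No supporting evidence above the confidence threshold"}
    return {
        "answer": answer,
        "evidence": [asdict(e) for e in supported],
        "graph_path": graph_path,
    }

response = build_response(
    "The contract was approved by John Smith.",
    [Evidence("Contract-2024-001, Section 12", "Approved by: John Smith, VP Legal", 0.95)],
    ["Contract-2024-001", "APPROVED_BY", "John Smith"],
)
```

Returning `None` instead of a best guess is a deliberate choice for compliance settings, where an unsupported answer is usually worse than no answer.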

10. Getting Started

Quick Start: 30 Minutes to First GraphRAG Query

Prerequisites:

  • Python 3.9+
  • Neo4j Aura account (free tier available)
  • Pinecone account (free tier available)
  • AWS Bedrock access

Step 1: Install Dependencies

pip install neo4j pinecone-client langchain boto3

Step 2: Set Up Neo4j

from neo4j import GraphDatabase

driver = GraphDatabase.driver(
    "neo4j+s://your-instance.neo4j.io",
    auth=("neo4j", "your-password")
)

# Create schema
with driver.session() as session:
    session.run("""
        CREATE CONSTRAINT doc_id IF NOT EXISTS FOR (d:Document) REQUIRE d.id IS UNIQUE
    """)

Step 3: Ingest a Document

# See code examples in Section 4 above
graph = KnowledgeGraph(uri, user, password)
graph.ingest_document(document_text, "doc-001")

Step 4: Query

hybrid = HybridSearch(vector_store, graph, keyword_search)
results = hybrid.search("Find contracts referencing Section 5.2")

Next Steps

  1. RAG vs GraphRAG Comparison → Detailed benchmarks and decision framework

  2. Building Knowledge Graphs with Neo4j → Deep dive on graph construction

  3. Enterprise RAG Security → RBAC, audit trails, compliance

  4. Hybrid Search Implementation → Vector + Keyword + Graph fusion

  5. Evidence-Mapped Retrieval → Why citations matter in enterprise


Need help building GraphRAG for your enterprise?

At Cognilium, we built Legal Lens AI analyzing 1.2M contracts with 95% accuracy. Let's discuss your knowledge system →
