Smart Category Routing for Contract Review

Muhammad Mudassir

Founder & CEO, Cognilium AI

TL;DR

A focused application of the LLMOps routing pattern to legal contract analysis — the analyst-selection logic that ships fewer clauses to fewer agents and finishes a 3,300-call review in 154 seconds.
AI contract review, legal document AI, clause analysis, smart routing, multi-agent legal, ChromaDB, HyDE retrieval

Contract review is a domain where the LLMOps routing pattern earns its complexity. A typical contract has 50-100 chunks. Each chunk is potentially relevant to one or two of 11 specialist analysts (compliance, indemnity, IP, payment, termination, etc.). Running every analyst on every chunk of a 100-chunk contract is 1,100+ LLM calls. Routing cuts that to ~250.

Playbook-driven configuration

Each customer has a playbook in S3 — categories that matter to them, severity weights, and party-specific clauses (one customer cares deeply about IP indemnity; another cares about data-residency clauses). At job start, the system loads the playbook and configures analysts accordingly.

  • Categories: 12 standard, 1-3 customer-specific
  • Severity weights: how much to escalate findings in each category
  • Party-specific clauses: customer-defined patterns the analyst should specifically look for
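A minimal sketch of what loading a playbook might look like, assuming the S3 object body is a JSON document. The field names (`custom_categories`, `severity_weights`, `party_clauses`), the `Playbook` class, and the shortened category list are hypothetical illustrations, not the production schema.

```python
import json
from dataclasses import dataclass, field

# Placeholder subset: the article's 12 standard categories are not enumerated.
STANDARD_CATEGORIES = ["compliance", "indemnity", "ip", "payment", "termination"]


@dataclass
class Playbook:
    categories: list
    severity_weights: dict = field(default_factory=dict)
    party_clauses: dict = field(default_factory=dict)

    def weight(self, category: str) -> float:
        # Categories without an explicit weight fall back to a neutral 1.0.
        return self.severity_weights.get(category, 1.0)


def load_playbook(raw_json: str) -> Playbook:
    """Parse a playbook document (for example, the body of an S3 object)."""
    data = json.loads(raw_json)
    custom = [c for c in data.get("custom_categories", [])
              if c not in STANDARD_CATEGORIES]
    return Playbook(
        categories=STANDARD_CATEGORIES + custom,
        severity_weights=data.get("severity_weights", {}),
        party_clauses=data.get("party_clauses", {}),
    )


example = load_playbook(json.dumps({
    "custom_categories": ["data-residency"],
    "severity_weights": {"ip": 2.0},
    "party_clauses": {"ip": ["IP indemnity"]},
}))
```

Keeping the parse step separate from the fetch step means the same function can be exercised in tests without touching S3.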

Per-chunk scoring

Each chunk runs through 12 category scorers (cheap model, $0.25/M tokens). Each scorer emits a 0-100 score answering "is this chunk relevant to my category?" The router then selects analysts based on those scores: a category that scores above the threshold triggers its analyst, and a category below the threshold is skipped.
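The selection step can be sketched as a simple threshold filter. The threshold value of 60, the optional per-category overrides, and the function name are assumptions for illustration; the article does not state the production threshold.

```python
DEFAULT_THRESHOLD = 60  # hypothetical value, not taken from the article


def select_analysts(scores, thresholds=None, default_threshold=DEFAULT_THRESHOLD):
    """Return the categories whose 0-100 score meets the threshold.

    scores: mapping of category name to scorer output for one chunk.
    thresholds: optional per-category overrides (an assumed feature).
    """
    thresholds = thresholds or {}
    selected = [
        category
        for category, score in scores.items()
        if score >= thresholds.get(category, default_threshold)
    ]
    # Highest-scoring categories first, so the most relevant analyst runs first.
    return sorted(selected, key=lambda c: scores[c], reverse=True)


chunk_scores = {"ip": 85, "payment": 10, "indemnity": 72, "termination": 59}
routed = select_analysts(chunk_scores)
```

With these example scores, only the `ip` and `indemnity` analysts would run on the chunk; the other two categories are skipped.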

HyDE-augmented retrieval

Within each analyst's context, retrieval pulls related chunks. HyDE (Hypothetical Document Embeddings) generates a hypothetical ideal answer to the analyst's question, embeds that answer, and retrieves the real chunks most similar to it. This gives better recall than embedding the literal question, especially when the analyst question uses legal jargon and the contract uses plain English (or vice versa).
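A toy sketch of the HyDE flow, assuming a bag-of-words stand-in for the embedding model and a hard-coded stand-in for the LLM. The names `embed`, `hyde_retrieve`, and `fake_generate` are hypothetical; a real system would call an embedding model and a vector store instead.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def hyde_retrieve(question, chunks, generate, k=30):
    hypothetical = generate(question)   # an LLM call in a real system
    query_vec = embed(hypothetical)     # embed the answer, not the question
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)),
                    reverse=True)
    return ranked[:k]


def fake_generate(question):
    # Stand-in for the LLM: a plain-English answer to a jargon-heavy question.
    return "Either party may end this agreement by giving thirty days written notice."


chunks = [
    "Invoices are payable within forty five days of receipt.",
    "Either party may end this agreement with thirty days written notice to the other party.",
    "Each party retains ownership of its pre-existing intellectual property.",
]
top = hyde_retrieve("What are the termination-for-convenience provisions?",
                    chunks, fake_generate, k=1)
```

The question says "termination-for-convenience" while the contract says "end this agreement"; the hypothetical answer bridges that vocabulary gap, which is the effect the pattern relies on.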

LLM reranking after HyDE

HyDE retrieves 30 candidates; an LLM reranker (a cheap model that scores each candidate 0-100) picks the top 5 to include in the analyst context. Reranking buys ~10-15% F1 over pure embedding similarity at modest cost.
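The rerank step reduces to "score each candidate, keep the best five." In this sketch the `score` callable stands in for the cheap LLM scorer; its signature and the tie-breaking rule are assumptions.

```python
def rerank(question, candidates, score, top_n=5):
    """Keep the top_n candidates by reranker score.

    score(question, candidate) is assumed to return a 0-100 relevance score,
    for example from a cheap LLM call.
    """
    scored = []
    for position, candidate in enumerate(candidates):
        value = max(0, min(100, score(question, candidate)))  # clamp to 0-100
        scored.append((value, position, candidate))
    # Highest score first; ties keep the original retrieval order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [candidate for _, _, candidate in scored[:top_n]]


def fake_score(question, candidate):
    # Stand-in scorer: longer candidates score higher.
    return len(candidate) * 10


kept = rerank("q", ["a", "bbb", "cc", "dddd"], fake_score, top_n=2)
```

Breaking ties by retrieval position keeps the output deterministic when the scorer returns the same value for several candidates.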

Numbers from production

  • 22 chunks → 116 LLM calls per chunk (12 scorers + ~3 routed analysts × ~30 LLM calls each) = ~660 calls per chunk on the misleading top-line
  • Actually: 22 chunks × 12 scorers = 264 scoring calls, plus 22 chunks × ~3 selected analysts per chunk × 8 calls = 528 analyst calls, for ~800 calls per typical contract
  • P50 review time: 154 seconds end-to-end
  • Per-contract cost: $0.50-2.00 depending on contract length and customer playbook
  • Reduction vs. naive fan-out: ~75%
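The call arithmetic above can be checked with a few lines. The naive baseline here assumes one analyst per standard category (12) at 8 calls each, which the article does not state explicitly; the function names are illustrative.

```python
def routed_calls(chunks, scorers, analysts_per_chunk, calls_per_analyst):
    """Return (scoring calls, analyst calls) for a routed review."""
    scoring = chunks * scorers
    analysis = chunks * analysts_per_chunk * calls_per_analyst
    return scoring, analysis


def naive_analyst_calls(chunks, analysts, calls_per_analyst):
    """Analyst calls if every analyst runs on every chunk (assumed baseline)."""
    return chunks * analysts * calls_per_analyst


scoring, analysis = routed_calls(chunks=22, scorers=12,
                                 analysts_per_chunk=3, calls_per_analyst=8)
naive = naive_analyst_calls(chunks=22, analysts=12, calls_per_analyst=8)

total_routed = scoring + analysis            # 264 + 528 = 792
analyst_reduction = 1 - analysis / naive     # 1 - 528/2112 = 0.75
total_reduction = 1 - total_routed / naive   # 1 - 792/2112 = 0.625
```

Under this assumed baseline, the ~75% figure matches the drop in analyst calls alone; counting the 264 scoring calls as routing overhead gives a smaller net reduction.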

Where this fails

Routing fails when a customer playbook has overlapping categories: for example, the "compliance" category overlaps with "regulatory" and "data-handling" 70% of the time. In that case most categories clear the threshold together and routing collapses to "everyone." Mitigation: a routing analytics dashboard shows per-category overlap rates, which surfaces the problem and encourages playbook tightening.
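One way a per-category overlap rate could be computed from routing logs is sketched below. The definition used (chunks that triggered both categories divided by chunks that triggered either) and the 0.7 flagging limit are assumptions, not the dashboard's documented metric.

```python
from collections import Counter
from itertools import combinations


def overlap_rates(routing_log):
    """routing_log: one collection of triggered categories per chunk.

    Returns {(a, b): both / either} for every pair that co-triggered at least
    once, with a < b alphabetically.
    """
    single = Counter()
    pair = Counter()
    for triggered in routing_log:
        categories = sorted(set(triggered))
        single.update(categories)
        pair.update(combinations(categories, 2))
    rates = {}
    for (a, b), both in pair.items():
        either = single[a] + single[b] - both
        rates[(a, b)] = both / either
    return rates


def flag_overlaps(rates, limit=0.7):
    """Pairs whose overlap rate meets the (hypothetical) limit."""
    return sorted(p for p, rate in rates.items() if rate >= limit)


log = [
    {"compliance", "regulatory"},
    {"compliance", "regulatory"},
    {"compliance", "regulatory", "payment"},
    {"payment"},
]
rates = overlap_rates(log)
flagged = flag_overlaps(rates)
```

Pairs that are flagged this way are candidates for merging or for tighter category definitions in the playbook.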
