Supervisor-Router on Google ADK with Per-Org Tool Registration

9 min read
1,700 words
Muhammad Mudassir

Founder & CEO, Cognilium AI

TL;DR

Building a multi-tenant agent platform on Google ADK where the supervisor binds only the tools each org has paid for and integrated — without forking the agent definition per tenant.
Google ADK, agent supervisor, dynamic tool registration, RBAC, multi-tenant agents, Vertex AI, Firestore claims, agent factory

A multi-tenant SaaS that exposes 7 specialist agents through a supervisor needs a model where each org sees only the tools they have paid for and connected. The naive approach — register every tool, let the LLM ignore the irrelevant ones — fails on cost (every tool description in the system prompt costs tokens on every turn) and on security (the LLM hallucinates a Salesforce call for an org without Salesforce, the user sees a 401 and blames the AI).

The system this writeup describes runs on Google ADK 1.15 with Gemini 2.0 Flash for chat and Gemini 2.5 Pro for document intelligence. The supervisor sits in front of seven specialists (Doc, Email, Calendar, Investment, Compliance, Audit, Reporting). What makes it shippable is that the supervisor is instantiated per-request from a factory that reads org-level RBAC and integration status before binding tools.

The agent factory

Every request hits a TenantContextMiddleware that reads the immutable Firebase custom claim, looks up organizations/{orgId}/permissions, and attaches it to request.context. The supervisor factory takes that context and assembles the agent: which specialists to register (Investment is excluded for ops-only orgs), which tools to bind to each specialist (Salesforce-write only for orgs with the Salesforce integration in connected state), and which system prompt fragment to splice in (compliance-mode adds an audit-trail clause).
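The assembly step can be sketched as follows. This is a minimal sketch, not the production code: the `Agent` class here is a stand-in for ADK's `Agent`, and `TenantContext`, the specialist list, the tool names, and the prompt fragments are all illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:  # stand-in for google.adk.agents.Agent, kept minimal to stay runnable
    name: str
    model: str
    instruction: str
    tools: list = field(default_factory=list)
    sub_agents: list = field(default_factory=list)

@dataclass(frozen=True)
class TenantContext:  # what TenantContextMiddleware attaches to request.context
    org_id: str
    permissions: frozenset   # e.g. frozenset({"investment", "compliance"})
    integrations: tuple      # e.g. (("salesforce", "connected"),)

SPECIALISTS = ("doc", "email", "calendar", "investment",
               "compliance", "audit", "reporting")

def build_specialist(name: str, ctx: TenantContext) -> Agent:
    tools = []
    # Bind Salesforce-write only when the org's integration is in "connected" state.
    if name == "email" and dict(ctx.integrations).get("salesforce") == "connected":
        tools.append("salesforce_write")
    return Agent(name=name, model="gemini-2.0-flash",
                 instruction=f"You are the {name} specialist.", tools=tools)

def build_supervisor(ctx: TenantContext) -> Agent:
    # Investment is excluded for orgs without the investment permission.
    subs = [build_specialist(n, ctx) for n in SPECIALISTS
            if not (n == "investment" and "investment" not in ctx.permissions)]
    instruction = "Route each request to the right specialist."
    if "compliance" in ctx.permissions:
        # Compliance-mode splices in the audit-trail clause.
        instruction += " Append an audit-trail entry for every tool call."
    return Agent(name="supervisor", model="gemini-2.0-flash",
                 instruction=instruction, sub_agents=subs)
```

The point of the shape is that everything org-specific arrives through `TenantContext`; the factory function itself is one reviewable piece of code.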

The factory output is a fresh ADK Agent instance per request. A naive implementation rebuilds it from scratch on every call, which costs ~150ms in Firestore reads. The factory instead caches by {orgId, integrations_hash}, so the warm path is ~5ms. The hash is invalidated on any integration change via a Pub/Sub topic the integrations service publishes to.
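The cache key logic can be sketched like this; the function names and the 16-char digest truncation are illustrative assumptions, not the production implementation.

```python
import hashlib
import json

_cache: dict = {}  # (org_id, integrations_hash) -> assembled supervisor

def integrations_hash(integrations: dict) -> str:
    # Stable digest over sorted integration states, so any change yields a new key.
    blob = json.dumps(sorted(integrations.items())).encode()
    return hashlib.sha256(blob).hexdigest()[:16]

def get_supervisor(org_id: str, integrations: dict, build_fn):
    key = (org_id, integrations_hash(integrations))
    if key not in _cache:          # cold path: Firestore reads + factory, ~150ms
        _cache[key] = build_fn()
    return _cache[key]             # warm path: in-memory lookup, ~5ms

def invalidate_org(org_id: str) -> None:
    # Invoked when the integrations service publishes a change for this org.
    for key in [k for k in _cache if k[0] == org_id]:
        del _cache[key]
```

Keying on the hash rather than a version counter means a flipped-and-restored integration state naturally re-hits the old cache entry.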

RBAC at two layers

Layer one: tool registration. The factory only binds tools the org is allowed to use. The LLM literally does not know the others exist — system prompt is shorter, hallucination cannot reach into a non-connected Salesforce.

Layer two: per-tool permission check inside the tool handler. Every tool starts with assert_permission(request.context, "salesforce:read"). This catches the case where someone bypasses the factory (test fixtures, internal scripts) and acts as a runtime audit point. Either layer alone is brittle — the factory layer can be bypassed accidentally; the tool layer is enforced last but cannot trim the prompt. Together they cover both gaps.
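A sketch of the layer-two check, under assumed names: `Ctx` stands in for `request.context`, and `PermissionDenied` for whatever typed error the real system surfaces.

```python
from dataclasses import dataclass

class PermissionDenied(Exception):
    """Raised when an org invokes a tool outside its granted scopes."""

@dataclass
class Ctx:  # minimal stand-in for request.context
    org_id: str
    permissions: set

def assert_permission(ctx: Ctx, scope: str) -> None:
    if scope not in ctx.permissions:
        raise PermissionDenied(f"org {ctx.org_id} lacks scope {scope}")

def salesforce_read(ctx: Ctx, query: str) -> dict:
    # Layer two: runs even when the factory layer was bypassed by a test
    # fixture or internal script, and doubles as a runtime audit point.
    assert_permission(ctx, "salesforce:read")
    # ... the actual Salesforce call would go here ...
    return {"query": query, "rows": []}
```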

Why ADK and not Assistants v2

  • ADK gives you agent-as-code with versioning in git — every supervisor change is reviewable
  • Vertex AI handles model routing without a third-party gateway
  • Native streaming through ADK works with Gemini 2.5 Pro grounded responses
  • Tool definitions live in Python with full type-checking, not in a Web Studio

The Assistants v2 alternative is one assistant per org, which works but turns "add a feature to the supervisor" into a fan-out migration across N orgs. ADK keeps the source of truth in code and the per-org variation in data.

Numbers from production

  • Cold supervisor instantiation: ~150ms (Firestore × 3 + tool factory)
  • Warm: ~5ms (in-memory cache, hash-keyed)
  • 7 specialists × up to 12 tools each = ~84-tool catalog at registration time
  • Average org sees 4-6 specialists with 3-5 tools each — supervisor prompt stays under 4KB
  • Pub/Sub-driven cache invalidation: <1s from integration change to next request seeing the new state

