TL;DR
Building a multi-tenant agent platform on Google ADK where the supervisor binds only the tools each org has paid for and integrated, without forking the agent code per tenant.
A multi-tenant SaaS that exposes 7 specialist agents through a supervisor needs a model where each org sees only the tools they have paid for and connected. The naive approach — register every tool, let the LLM ignore the irrelevant ones — fails on cost (every tool description in the system prompt costs tokens on every turn) and on security (the LLM hallucinates a Salesforce call for an org without Salesforce, the user sees a 401 and blames the AI).
The system this writeup describes runs on Google ADK 1.15 with Gemini 2.0 Flash for chat and Gemini 2.5 Pro for document intelligence. The supervisor sits in front of seven specialists (Doc, Email, Calendar, Investment, Compliance, Audit, Reporting). What makes it shippable is that the supervisor is instantiated per-request from a factory that reads org-level RBAC and integration status before binding tools.
The agent factory
Every request hits a TenantContextMiddleware that reads the immutable Firebase custom claim, looks up organizations/{orgId}/permissions, and attaches it to request.context. The supervisor factory takes that context and assembles the agent: which specialists to register (Investment is excluded for ops-only orgs), which tools to bind to each specialist (Salesforce-write only for orgs with the Salesforce integration in connected state), and which system prompt fragment to splice in (compliance-mode adds an audit-trail clause).
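The selection logic can be sketched in plain Python, independent of ADK's actual `Agent` API. `CATALOG`, `TenantContext`, and `build_supervisor_spec` are hypothetical names for illustration; the real catalog lives in the tool factory and the real context comes off `request.context`:

```python
from dataclasses import dataclass, field

# Hypothetical entitlement catalog: specialist -> tools it can offer.
# The production system has 7 specialists with up to 12 tools each.
CATALOG = {
    "doc": ["doc:search", "doc:summarize"],
    "investment": ["portfolio:read", "portfolio:rebalance"],
    "compliance": ["audit:read"],
}

@dataclass
class TenantContext:
    org_id: str
    permissions: set                                   # e.g. {"doc:search", "audit:read"}
    integrations: dict = field(default_factory=dict)   # name -> connection state

def build_supervisor_spec(ctx: TenantContext) -> dict:
    """Assemble the per-org agent spec: only entitled specialists and tools."""
    spec = {}
    for specialist, tools in CATALOG.items():
        allowed = [t for t in tools if t in ctx.permissions]
        if allowed:  # a specialist with zero entitled tools is excluded entirely
            spec[specialist] = allowed
    return spec
```

The output of this step is what the factory feeds into the ADK agent constructor; an org without any Investment entitlements never gets the Investment specialist registered at all.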
The factory output is a fresh ADK Agent instance per request. A naive implementation rebuilds it from scratch on every call, which costs ~150ms in Firestore reads. The factory caches by {orgId, integrations_hash}, so the warm path is ~5ms. The hash is invalidated on any integration change via a Pub/Sub topic that the integrations service publishes to.
RBAC at two layers
Layer one: tool registration. The factory only binds tools the org is allowed to use. The LLM literally does not know the others exist — system prompt is shorter, hallucination cannot reach into a non-connected Salesforce.
Layer two: per-tool permission check inside the tool handler. Every tool starts with assert_permission(request.context, "salesforce:read"). This catches the case where someone bypasses the factory (test fixtures, internal scripts) and acts as a runtime audit point. Either layer alone is brittle — the factory layer can be bypassed accidentally; the tool layer is enforced last but cannot trim the prompt. Together they cover both gaps.
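Layer two can be expressed as a small decorator that every tool handler passes through. `assert_permission` and `requires` are illustrative names, not ADK API; the context is shown as a plain dict for brevity:

```python
import functools

def assert_permission(ctx: dict, scope: str):
    """Layer two: runtime check inside the tool, independent of registration."""
    if scope not in ctx["permissions"]:
        raise PermissionError(f"{ctx['org_id']} lacks scope {scope!r}")

def requires(scope: str):
    """Decorator so no tool handler can forget the check."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(ctx, *args, **kwargs):
            assert_permission(ctx, scope)  # fires even if the factory was bypassed
            return fn(ctx, *args, **kwargs)
        return wrapper
    return deco

@requires("salesforce:read")
def salesforce_read(ctx, query):
    # Stubbed body; the real tool calls the Salesforce integration service.
    return {"records": []}
```

In the happy path the decorator is redundant with the factory's registration filter, which is exactly the point: the check that matters is the one that fires when a test fixture or internal script wires a tool in without going through the factory.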
Why ADK and not Assistants v2
- ADK gives you agent-as-code with versioning in git — every supervisor change is reviewable
- Vertex AI handles model routing without a third-party gateway
- Native streaming through ADK works with Gemini 2.5 Pro grounded responses
- Tool definitions live in Python with full type-checking, not in a Web Studio
The Assistants v2 alternative is one assistant per org, which works but turns "add a feature to the supervisor" into a fan-out migration across N orgs. ADK keeps the source of truth in code and the per-org variation in data.
Numbers from production
- Cold supervisor instantiation: ~150ms (Firestore × 3 + tool factory)
- Warm: ~5ms (in-memory cache, hash-keyed)
- 7 specialists × up to 12 tools each = ~84-tool catalog at registration time
- Average org sees 4-6 specialists with 3-5 tools each — supervisor prompt stays under 4KB
- Pub/Sub-driven cache invalidation: <1s from integration change to next request seeing the new state
Muhammad Mudassir
Founder & CEO, Cognilium AI | 10+ years experience
