Supervisor-Router on Google ADK with Per-Org Tool Registration

9 min read
1,700 words
Muhammad Mudassir

Founder & CEO, Cognilium AI

TL;DR

Building a multi-tenant agent platform on Google ADK where the supervisor binds only the tools each org has paid for and integrated — without forking the agent definition per tenant.
Google ADK, agent supervisor, dynamic tool registration, RBAC, multi-tenant agents, Vertex AI, Firestore claims, agent factory

A multi-tenant SaaS that exposes 7 specialist agents through a supervisor needs a model where each org sees only the tools they have paid for and connected. The naive approach — register every tool, let the LLM ignore the irrelevant ones — fails on cost (every tool description in the system prompt costs tokens on every turn) and on security (the LLM hallucinates a Salesforce call for an org without Salesforce, the user sees a 401 and blames the AI).

The system this writeup describes runs on Google ADK 1.15 with Gemini 2.0 Flash for chat and Gemini 2.5 Pro for document intelligence. The supervisor sits in front of seven specialists (Doc, Email, Calendar, Investment, Compliance, Audit, Reporting). What makes it shippable is that the supervisor is instantiated per-request from a factory that reads org-level RBAC and integration status before binding tools.

The agent factory

Every request hits a TenantContextMiddleware that reads the immutable Firebase custom claim, looks up organizations/{orgId}/permissions, and attaches it to request.context. The supervisor factory takes that context and assembles the agent: which specialists to register (Investment is excluded for ops-only orgs), which tools to bind to each specialist (Salesforce-write only for orgs with the Salesforce integration in connected state), and which system prompt fragment to splice in (compliance-mode adds an audit-trail clause).
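The assembly step can be sketched as follows. This is a minimal sketch, not the production code: the `Agent` class here is a stand-in for ADK's `Agent`, and `TenantContext`, the specialist list, the tool names, and the prompt fragments are all illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:  # stand-in for google.adk.agents.Agent, kept minimal to stay runnable
    name: str
    model: str
    instruction: str
    tools: list = field(default_factory=list)
    sub_agents: list = field(default_factory=list)

@dataclass(frozen=True)
class TenantContext:  # what TenantContextMiddleware attaches to request.context
    org_id: str
    permissions: frozenset   # e.g. frozenset({"investment", "compliance"})
    integrations: tuple      # e.g. (("salesforce", "connected"),)

SPECIALISTS = ("doc", "email", "calendar", "investment",
               "compliance", "audit", "reporting")

def build_specialist(name: str, ctx: TenantContext) -> Agent:
    tools = []
    # Bind Salesforce-write only when the org's integration is in "connected" state.
    if name == "email" and dict(ctx.integrations).get("salesforce") == "connected":
        tools.append("salesforce_write")
    return Agent(name=name, model="gemini-2.0-flash",
                 instruction=f"You are the {name} specialist.", tools=tools)

def build_supervisor(ctx: TenantContext) -> Agent:
    # Investment is excluded for orgs without the investment permission.
    subs = [build_specialist(n, ctx) for n in SPECIALISTS
            if not (n == "investment" and "investment" not in ctx.permissions)]
    instruction = "Route each request to the right specialist."
    if "compliance" in ctx.permissions:
        # Compliance-mode splices in the audit-trail clause.
        instruction += " Append an audit-trail entry for every tool call."
    return Agent(name="supervisor", model="gemini-2.0-flash",
                 instruction=instruction, sub_agents=subs)
```

The point of the shape is that everything org-specific arrives through `TenantContext`; the factory function itself is one reviewable piece of code.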

The factory output is a fresh ADK Agent instance per request. A naive implementation rebuilds it from scratch on every call, which costs ~150ms in Firestore reads. The factory instead caches by {orgId, integrations_hash}, so the warm path is ~5ms. The hash is invalidated on any integration change via a Pub/Sub topic the integrations service publishes to.
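The cache key logic can be sketched like this; the function names and the 16-char digest truncation are illustrative assumptions, not the production implementation.

```python
import hashlib
import json

_cache: dict = {}  # (org_id, integrations_hash) -> assembled supervisor

def integrations_hash(integrations: dict) -> str:
    # Stable digest over sorted integration states, so any change yields a new key.
    blob = json.dumps(sorted(integrations.items())).encode()
    return hashlib.sha256(blob).hexdigest()[:16]

def get_supervisor(org_id: str, integrations: dict, build_fn):
    key = (org_id, integrations_hash(integrations))
    if key not in _cache:          # cold path: Firestore reads + factory, ~150ms
        _cache[key] = build_fn()
    return _cache[key]             # warm path: in-memory lookup, ~5ms

def invalidate_org(org_id: str) -> None:
    # Invoked when the integrations service publishes a change for this org.
    for key in [k for k in _cache if k[0] == org_id]:
        del _cache[key]
```

Keying on the hash rather than a version counter means a flipped-and-restored integration state naturally re-hits the old cache entry.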

RBAC at two layers

Layer one: tool registration. The factory only binds tools the org is allowed to use. The LLM literally does not know the others exist — system prompt is shorter, hallucination cannot reach into a non-connected Salesforce.

Layer two: per-tool permission check inside the tool handler. Every tool starts with assert_permission(request.context, "salesforce:read"). This catches the case where someone bypasses the factory (test fixtures, internal scripts) and acts as a runtime audit point. Either layer alone is brittle — the factory layer can be bypassed accidentally; the tool layer is enforced last but cannot trim the prompt. Together they cover both gaps.
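A sketch of the layer-two check, under assumed names: `Ctx` stands in for `request.context`, and `PermissionDenied` for whatever typed error the real system surfaces.

```python
from dataclasses import dataclass

class PermissionDenied(Exception):
    """Raised when an org invokes a tool outside its granted scopes."""

@dataclass
class Ctx:  # minimal stand-in for request.context
    org_id: str
    permissions: set

def assert_permission(ctx: Ctx, scope: str) -> None:
    if scope not in ctx.permissions:
        raise PermissionDenied(f"org {ctx.org_id} lacks scope {scope}")

def salesforce_read(ctx: Ctx, query: str) -> dict:
    # Layer two: runs even when the factory layer was bypassed by a test
    # fixture or internal script, and doubles as a runtime audit point.
    assert_permission(ctx, "salesforce:read")
    # ... the actual Salesforce call would go here ...
    return {"query": query, "rows": []}
```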

Why ADK and not Assistants v2

  • ADK gives you agent-as-code with versioning in git — every supervisor change is reviewable
  • Vertex AI handles model routing without a third-party gateway
  • Native streaming through ADK works with Gemini 2.5 Pro grounded responses
  • Tool definitions live in Python with full type-checking, not in a Web Studio

The Assistants v2 alternative is one assistant per org, which works but turns "add a feature to the supervisor" into a fan-out migration across N orgs. ADK keeps the source of truth in code and the per-org variation in data.

Numbers from production

  • Cold supervisor instantiation: ~150ms (Firestore × 3 + tool factory)
  • Warm: ~5ms (in-memory cache, hash-keyed)
  • 7 specialists × up to 12 tools each = ~84-tool catalog at registration time
  • Average org sees 4-6 specialists with 3-5 tools each — supervisor prompt stays under 4KB
  • Pub/Sub-driven cache invalidation: <1s from integration change to next request seeing the new state

