What are the main multi-agent architecture patterns?

Four topologies cover almost everything. A sequential pipeline runs agents in a line, one feeding the next, and fits inherently ordered work but suffers multiplicative reliability decay. An orchestrator-worker or supervisor pattern has one coordinator delegate to workers and synthesize their results, and fits independent subtasks that can run in parallel. A hierarchical pattern layers supervisors over sub-supervisors for very large decompositions, at the cost of a serial hop and a reliability factor per layer. A network or peer-to-peer pattern lets every agent talk to every other, which is flexible but the hardest to keep consistent and debug. Real systems compose these primitives, so the useful skill is reading the right one off the task's dependency graph rather than accepting a framework's default.

Which multi-agent topology is the most reliable?

Reliability is set by how the agents are wired, not by which topology has the nicest name. Any arrangement where every step must succeed (a pipeline, or an orchestrator that needs all of its workers) multiplies the per-step reliabilities and decays fast: three agents at eighty percent each are only about fifty-one percent reliable together. The same three agents wired for redundancy as a majority vote reach about ninety percent, and if you can verify which output is correct, so that any one success is enough, about ninety-nine percent. So the most reliable pattern is one that uses redundancy on critical, verifiable steps rather than stacking dependent steps. The caveat is that voting only helps when the agents' errors are reasonably independent, which you engineer by diversifying prompts, models, or tools.

When should I use an orchestrator-worker pattern versus a pipeline?

Use an orchestrator-worker pattern when the task decomposes into subtasks that are genuinely independent and can run in parallel, like exploring several sources or reviewing many documents at once, because the parallelism is the whole payoff and it fans out cleanly from a coordinator. Use a pipeline when the work is inherently ordered and each stage truly needs the finished output of the previous one. But first check whether the pipeline stages involve any autonomous decisions at all: if they are fixed transformations, build a deterministic pipeline in ordinary code and use no agents, since wrapping deterministic steps in agents adds non-determinism and token cost for no benefit.

What is the biggest mistake teams make wiring multi-agent systems?

Passing natural-language messages between agents instead of sharing a structured store. Every hand-off is a fresh chance to misread, meaning drifts at each step like a game of telephone, and there is no single place to inspect the system's state. The fix is a shared structured store, a blackboard, that every agent reads from and writes to, paired with the single-writer principle so that many agents can read and propose in parallel but only one path commits changes to shared state. This substrate decision costs almost nothing to make and reduces coordination bugs more than any change to the topology itself, which is why it is worth deciding before you pick a framework.

Do more agents make a system more capable?

Not by themselves. Adding agents adds coordination cost, token cost, and, if they are wired as dependent steps, reliability decay, so a poorly wired five-agent system is often worse than one strong agent. Capability comes from matching the topology to the task's dependency structure and from using redundancy where you can verify results, not from raw head count. The prior question, whether you need multiple agents at all, is worth settling first, and we covered it in the decision post; this one is about wiring the agents well once you have decided you genuinely need them.

Multi-Agent Architecture Patterns: Choosing a Topology

TL;DR: Once you have decided you actually need more than one agent, the next choice matters more than most teams realize: how you wire them together. There are four canonical topologies. A sequential pipeline runs agents in a line. It is simple to reason about, but its per-step reliabilities multiply, so a ten-step chain at ninety-five percent per step is only about sixty percent reliable end to end. An orchestrator-worker setup has one coordinator delegate to parallel workers and synthesize their results, which is the right shape for independent subtasks but leaves the orchestrator as both a single point of failure and a serial spine that caps your speedup under Amdahl's law. A hierarchical setup layers supervisors over sub-supervisors for very large decompositions, and every layer it adds is another tax on latency, cost, and reliability. A network, where every agent talks to every other agent, is the most flexible and the least controllable, and in production it is almost always a trap. The wiring changes the numbers by more than the agent count does: the same three agents at eighty percent each are about fifty-one percent reliable if all must succeed, about ninety percent under a majority vote, and about ninety-nine percent if any one success is enough, as long as you can verify which one is right. And underneath the topology sits a second axis most teams ignore: whether agents share a structured store or pass each other natural-language messages. Share structured state, keep the writes single-threaded, and route instead of fanning out, and the topology stops fighting you.

The last post in this series was about whether to use multiple agents at all, and its answer was to default to one and make the second agent prove it is necessary. This post assumes you did that, the gate came back yes, and now you have a genuinely multi-agent problem on your hands. The question is no longer how many agents. It is how they connect. And that question gets skipped constantly, because the frameworks make one particular shape so easy to spin up that teams adopt it by default and never notice they chose an architecture at all. They notice later, when the bill is an order of magnitude too high, or the system contradicts itself, or a single flaky step takes the whole run down with it. All three of those are wiring problems, not agent problems.

The number of agents is a rounding error next to how they are connected. Two systems with the same five agents and the same models can differ by forty points of reliability and ten times the cost, purely because one was wired as a pipeline and the other as a voting pool. Topology is the decision. Head count is a consequence.

The wiring matters more than the head count

Here is the claim, stated plainly, because the rest of the post is evidence for it. When people describe a multi-agent system, they say how many agents it has. That is the least informative fact about it. What determines whether the system is fast, cheap, reliable, and coherent is the graph: which agents can talk to which, in what order, and through what medium. Change the graph and hold everything else fixed, and you get a different system with different economics and different failure modes.

This is not an abstract point. In the decision post we established that a multi-agent architecture charges you three costs every time, token multiplication, coordination failure, and compounding unreliability, in exchange for two possible gains, parallelism and isolation. The topology is precisely what decides how much of each cost you pay and how much of each gain you get. A pipeline gets you almost no parallelism and pays full compounding cost. An orchestrator-worker graph gets you real parallelism on the independent parts but pays for a coordinator. A network gets you maximum flexibility and pays maximum coordination cost. Same agents, same models, radically different outcomes, and the only thing that changed was the shape.

So the useful way to design a multi-agent system is to start from the dependency structure of the task, not from a framework's default. Map which parts of the work truly depend on which other parts, and the right topology falls out of that map almost mechanically. The parts that must happen in order want a sequence. The parts that can happen independently want to fan out from a coordinator. The parts that are enormous want a hierarchy, if anything. The parts that genuinely need open negotiation want a network, though that is rarer than it sounds. Get the map right and the wiring is nearly determined. Get it wrong and no amount of prompt tuning will save you, because you will be fighting the graph.

The four topologies, and the shape each one fits

There are four patterns worth knowing by name. Real systems combine them (an orchestrator whose workers each run a small internal pipeline is common), but you compose from these four primitives, so it pays to know exactly what each is for and exactly where each breaks.

The sequential pipeline is agents in a line: agent A's output is agent B's input, and so on to the end. It is the easiest to reason about because it is just a chain, and it fits work that is genuinely ordered, where each stage depends on the finished output of the one before it. Its failure is the one you can put a number on. Reliabilities in a chain multiply, so a pipeline is only as strong as the product of its links, and that product falls off a cliff. There is a second, quieter failure: a lot of what teams build as agent pipelines has no autonomous decision-making in it at all. If each stage does a fixed transformation and hands off, that is not a multi-agent system, it is a data pipeline, and wrapping deterministic steps in autonomous agents buys you non-determinism and token cost in exchange for nothing. Frameworks like CrewAI's sequential process or a linear graph in LangGraph make this shape trivial, which is exactly why it gets overused.
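To make the "it is a data pipeline" point concrete, here is a minimal sketch of a fixed, ordered chain written as ordinary functions with no agents involved. The stage names (`extract`, `normalize`, `score`) and the record shape are hypothetical, chosen only for illustration.

```python
from functools import reduce
from typing import Callable

Record = dict
Stage = Callable[[Record], Record]

def extract(record: Record) -> Record:
    # Hypothetical stage: split raw text into tokens.
    return {**record, "tokens": record["raw"].split()}

def normalize(record: Record) -> Record:
    # Hypothetical stage: lowercase every token.
    return {**record, "tokens": [t.lower() for t in record["tokens"]]}

def score(record: Record) -> Record:
    # Hypothetical stage: a fixed rule, not a model call.
    return {**record, "score": len(record["tokens"])}

def run_pipeline(record: Record, stages: list[Stage]) -> Record:
    # Each stage receives the finished output of the one before it.
    return reduce(lambda acc, stage: stage(acc), stages, record)

result = run_pipeline({"raw": "Alpha Beta Gamma"}, [extract, normalize, score])
print(result["tokens"], result["score"])
```

Every stage here is deterministic, so the same input always produces the same output and there is no per-step reliability factor to multiply.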

The orchestrator-worker pattern, also called supervisor or hub-and-spoke, has one coordinating agent that decomposes the task, delegates pieces to worker agents, and synthesizes their results. This is the shape most people mean when they say multi-agent, and it is the right one for the canonical good case: subtasks that are genuinely independent and can run in parallel. Anthropic's public account of a multi-agent research system is the textbook example: an orchestrator plans a query, spins up three to five subagents to explore different threads at the same time, then runs a synthesis and citation pass over what they found. It fits because researching one source does not depend on researching another. Its failures are structural. The orchestrator is a serial spine: the planning at the front and the synthesis at the back cannot be parallelized, and by Amdahl's law that serial fraction caps your speedup no matter how many workers you add. If thirty percent of the work is the orchestrator's serial glue, infinite workers still only get you about three and a third times faster. The orchestrator is also a single point of failure and a context bottleneck, since everything the workers produce has to funnel back through it. Supervisor graphs in LangGraph, hierarchical crews in CrewAI, and the handoff model in OpenAI's Agents SDK all implement versions of this.
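The Amdahl's law ceiling can be checked in a few lines. This sketch uses the standard formula, speedup = 1 / (s + (1 - s) / n), where s is the serial fraction and n the number of workers. The function name is illustrative, and the thirty percent figure is the example from the paragraph above.

```python
def amdahl_speedup(serial_fraction: float, workers: float) -> float:
    """Speedup when only the non-serial share of the work is spread across workers."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# Thirty percent serial orchestration work, increasing worker counts.
for n in (1, 5, 10, 100):
    print(f"{n:>3} workers: {amdahl_speedup(0.30, n):.2f}x")

# The ceiling as workers grow without bound is 1 / serial_fraction.
print(f"ceiling: {1.0 / 0.30:.2f}x")
```

Going from ten workers to a hundred buys only about half a multiple of extra speedup, which is why shrinking the orchestrator's serial share usually matters more than adding workers.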

The hierarchical pattern is orchestrator-worker taken up a level: a top supervisor manages sub-supervisors, each of which manages its own workers, forming a tree. It exists for one honest reason: decompositions so large that a single orchestrator cannot hold the whole plan in one context. That is a real situation and a rare one. The cost is that every layer you add is another serial hop in the spine, another context handoff where meaning can drift, and another multiplicative factor in the reliability product. A three-layer hierarchy has three chances to misroute a task before any real work happens. Most systems that reach for a hierarchy are not too big for an orchestrator; they are over-decomposed, and the fix is fewer, more capable agents rather than more layers of management. When you genuinely do need it, frameworks like MetaGPT and Amazon Bedrock AgentCore model this kind of layered structure, but the burden of proof for a third layer should be high.

The network, or peer-to-peer, pattern lets every agent talk to every other agent directly, with no central coordinator, often as a group chat where agents take turns contributing. It is the most flexible topology and the most dangerous in production. Any-to-any communication means the number of possible interactions grows with the square of the agent count, there is no single place that owns the truth, and the conversation can loop, drift, or stall with no authority to stop it. This is Cognition's warning at full volume: agents on partial context making conflicting decisions, except now there is nobody whose job is to reconcile them. Networks are useful for open-ended exploration and brainstorming-style tasks where you want emergent disagreement, and AutoGen's group-chat mode is built for exactly that. But for a system that has to produce one correct, consistent answer and be debuggable when it does not, a network is the hardest possible shape to trust, because when it goes wrong there is no single trace to read.
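The "grows with the square" claim is easy to put numbers on. This short sketch counts the distinct agent pairs in an any-to-any network, n(n - 1) / 2, and compares that with the n - 1 channels of a hub where workers talk only to a coordinator. The function names are illustrative.

```python
from math import comb

def pairwise_channels(agents: int) -> int:
    """Distinct agent pairs that can talk directly in an any-to-any network."""
    return comb(agents, 2)

def hub_channels(agents: int) -> int:
    """Channels when every other agent talks only to a single coordinator."""
    return max(agents - 1, 0)

for n in (3, 5, 10, 23):
    print(f"{n:>2} agents: network={pairwise_channels(n):>3}  hub={hub_channels(n):>2}")
```

At 23 agents the network has 253 possible conversations to keep consistent, against 22 for a hub.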

The same agents, wired three ways, are not equally reliable

The strongest argument that topology dominates head count is arithmetic, so here it is with the numbers shown. Take three agents, each of which does its job correctly eighty percent of the time. Eighty percent per agent is fixed. What changes is how you wire them.

Wire them so that all three must succeed for the system to succeed, which is what a pipeline does and what an orchestrator that needs every worker's result does. The reliabilities multiply: 0.8 times 0.8 times 0.8 is about 0.51. Three solid agents combine into a system that is barely better than a coin flip, and it got worse, not better, by adding agents, because every additional required step is another factor below one. This is the same compounding math from the decision post, and it is why long chains are fragile: 0.95 to the tenth power is only about 0.60.
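The multiplication above, as a minimal sketch that assumes each step succeeds or fails independently of the others. The function name is illustrative.

```python
import math

def all_must_succeed(reliabilities: list[float]) -> float:
    """End-to-end reliability when every step is required (independent steps assumed)."""
    return math.prod(reliabilities)

three_agents = all_must_succeed([0.8] * 3)    # about 0.51
ten_step_chain = all_must_succeed([0.95] * 10)  # about 0.60

print(f"three agents at 0.80: {three_agents:.3f}")
print(f"ten steps at 0.95:    {ten_step_chain:.3f}")

# How the product decays as a 0.95 chain gets longer.
for steps in (1, 5, 10, 20):
    print(f"{steps:>2} steps: {all_must_succeed([0.95] * steps):.3f}")
```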

Now wire the same three agents as a majority vote: run all three on the same task and take the answer at least two of them agree on. The system is now correct whenever any two or all three are right. That probability is 0.8 cubed plus three times 0.8 squared times 0.2, which is about 0.90. The identical three agents jumped from fifty-one percent to ninety percent, not by getting smarter, but by being wired for redundancy instead of dependency. And if you can actually verify which answer is correct, say the task is to produce code and you can run the test, then you only need any one of the three to succeed, and the OR math is 1 minus 0.2 cubed, about 0.99.
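Both redundancy calculations are cases of one binomial sum: the probability that at least k of n independent agents are correct. A minimal sketch, with an illustrative function name and the independence assumption stated in the docstring:

```python
from math import comb

def at_least_k_of_n(p: float, n: int, k: int) -> float:
    """Probability that at least k of n agents are correct.

    Assumes every agent is correct with probability p, independently of the others.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

all_three = at_least_k_of_n(0.8, n=3, k=3)  # every agent required
majority = at_least_k_of_n(0.8, n=3, k=2)   # two of three agree
any_one = at_least_k_of_n(0.8, n=3, k=1)    # one verified success is enough

print(f"all must succeed: {all_three:.3f}")
print(f"majority vote:    {majority:.3f}")
print(f"any one succeeds: {any_one:.3f}")
```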

So the same three agents span from about fifty-one percent to about ninety-nine percent reliable, a forty-eight point swing, entirely on wiring. That is the whole thesis in one calculation: topology, not head count, sets your reliability. Two honest caveats keep this from being a magic trick. First, all of that redundancy math assumes the agents' errors are independent, and they are not fully independent when they are the same model with the same prompt seeing the same input, so real voting gains are smaller than the clean numbers and you get them by deliberately diversifying prompts, models, or tools. Second, the OR case only helps when you can cheaply and correctly verify which output is right, because if you cannot tell the good answer from the bad one, running three and picking wrong is just three times the cost. Redundancy buys reliability, but only when you engineer for independence and verification on purpose.
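To see how much the independence caveat can matter, here is a sketch under a deliberately simple, assumed error model (not something measured in this post): with probability `rho` all agents share one outcome, as if they were a single agent, and otherwise they err independently. Each agent is still correct eighty percent of the time on its own.

```python
from math import comb

def majority_independent(p: float, n: int = 3) -> float:
    """Majority-vote reliability when agent errors are fully independent."""
    need = n // 2 + 1
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(need, n + 1))

def majority_with_shared_failures(p: float, rho: float, n: int = 3) -> float:
    """Majority-vote reliability under an ASSUMED common-cause model.

    With probability rho every agent produces the same outcome (all right with
    probability p, all wrong otherwise); with probability 1 - rho they are independent.
    """
    return rho * p + (1 - rho) * majority_independent(p, n)

for rho in (0.0, 0.25, 0.5, 1.0):
    print(f"rho={rho:.2f}: {majority_with_shared_failures(0.8, rho):.3f}")
```

In this model the vote slides from about 0.90 with fully independent errors back to 0.80 when the agents always fail together, which is the sense in which diversifying prompts, models, or tools is what earns the gain.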

Figure: A map of the four ways to wire a multi-agent system, each shown as a small node graph with the task shape it fits and the failure it hides: sequential pipeline, orchestrator-worker, hierarchical, and network. A middle band shows the same three agents, each eighty percent reliable, wired three ways: all must succeed (about fifty-one percent), majority vote (about ninety percent), and any one verified success (about ninety-nine percent), with the caveat that this assumes independent errors that same-model agents only partly satisfy. A bottom band shows the second axis, the substrate: message-passing handoffs versus a shared structured state or blackboard with single-threaded writes.

Topology is only half the decision: the substrate underneath

Choosing who talks to whom is the first axis. The second, which teams skip almost universally, is how they talk: what actually moves between agents. This substrate decision quietly determines more of your coordination cost than the topology does, and there are two options.

The first is message passing, where agents hand each other natural-language messages and one agent's output prose becomes the next one's input prompt. It is the default in most conversational frameworks and in the handoff model, and it has a built-in defect: every hand-off is a fresh translation, and natural language is lossy and ambiguous, so meaning drifts a little at each step. Ten agents passing messages is a game of telephone with ten players, and the final message rarely matches the first. Worse, there is no single place to inspect the system's state, because the state is smeared across a transcript that no two agents read the same way.

The second is shared structured state, where every agent reads from and writes to one common store with defined fields, a table of facts, scores, decisions, or claims, rather than mailing prose around. This is an old and well-proven idea. The blackboard architecture, developed for the Hearsay-II speech-understanding system in the 1970s, had independent specialist modules all reading and writing a shared blackboard instead of talking to each other directly, precisely so there would be one consistent source of truth. It is the same reason the memory series treated the structured store, not the message history, as the real memory, and the same discipline that keeps a knowledge graph trustworthy: structured state is queryable, consistent, and does not mutate on each retelling. Give agents a blackboard and coordination bugs drop, because there is nothing to misread, only state to read.

Pair the substrate with the single-writer principle and most of the remaining risk goes away. Cognition's sharpened conclusion, which the field is converging on, is that extra agents are welcome to read, analyze, and propose in parallel, because reading never conflicts, but the writes, the actions that commit to shared state, should flow through one path so two agents can never commit contradictory changes. A shared blackboard for reading plus a single writer for committing gives you the parallelism you paid for without the consistency tax that sinks most multi-agent builds. Topology decides the shape of the conversation. Substrate decides whether the conversation stays coherent, and it is the cheaper of the two to get right.
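A minimal sketch of a blackboard with a single write path. All names here (`Proposal`, `Blackboard`, `commit`) are hypothetical, and resolving conflicts by highest confidence is an assumption made for illustration; a real system might use priority, recency, or a reviewer instead. The point is only that agents read and propose freely while exactly one function mutates shared state.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass(frozen=True)
class Proposal:
    agent: str
    key: str
    value: Any
    confidence: float

@dataclass
class Blackboard:
    state: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def read(self, key: str, default: Any = None) -> Any:
        # Reads never conflict, so any agent may call this at any time.
        return self.state.get(key, default)

    def commit(self, proposals: list[Proposal]) -> None:
        # The only path that mutates state. Conflicting proposals for the same
        # key are resolved here (assumed policy: highest confidence wins).
        winners: dict[str, Proposal] = {}
        for p in proposals:
            if p.key not in winners or p.confidence > winners[p.key].confidence:
                winners[p.key] = p
        for key, p in winners.items():
            self.state[key] = p.value
            self.log.append((key, p.agent, p.value))

board = Blackboard()
board.commit([
    Proposal("scorer_a", "risk", "high", confidence=0.6),
    Proposal("scorer_b", "risk", "low", confidence=0.9),
    Proposal("analyst_c", "jurisdiction", "NY", confidence=0.8),
])
print(board.read("risk"), board.read("jurisdiction"))
```

Because every change passes through `commit`, the log doubles as the single trace to read when an answer looks wrong.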

Matching the wiring to the task's dependency graph

With the primitives and the two axes in hand, the design procedure is short. Draw the task as a dependency graph, which parts need which other parts' output before they can start, and read the topology off it.

If the graph is a straight line, every step needing the last, you have a sequence, and your first question should be whether the steps involve any real autonomous decisions or just fixed transformations. If they are fixed, build a deterministic pipeline in code and skip agents entirely, the way we did on the data platform described below. If the graph is a hub with independent spokes, subtasks that share a parent but not each other, you have an orchestrator-worker case, so build the coordinator, run the workers in parallel, and keep the synthesis honest about which workers actually mattered. If the graph is genuinely enormous and no single coordinator can hold the plan, and only then, add a layer of hierarchy, knowing each layer costs you. If the graph is a dense mesh where everything depends on everything and the value is in open negotiation rather than a single correct output, a network might fit, but treat that as the exception that has to argue for itself, the same way a second agent had to in the previous post.
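Reading the topology off the graph can be roughed out in code. This is a deliberately crude heuristic, not a complete classifier: it only recognizes a straight chain and a single hub with independent spokes, and reports everything else as mixed. The function name and the task names are hypothetical; `graphlib.TopologicalSorter` is the standard-library topological sorter and raises `CycleError` if the input is not a valid dependency graph.

```python
from graphlib import TopologicalSorter

def suggest_topology(deps: dict[str, set[str]]) -> str:
    """Map each task to the set of tasks it needs first, then classify the shape."""
    order = list(TopologicalSorter(deps).static_order())  # validates there are no cycles
    if len(order) <= 1:
        return "single step"

    # Count how many tasks depend directly on each task.
    dependents = {task: 0 for task in order}
    for prereqs in deps.values():
        for prereq in prereqs:
            dependents[prereq] += 1

    roots = [t for t in order if not deps.get(t)]
    one_root = len(roots) == 1

    # Straight line: one start, and every task has at most one prerequisite and one dependent.
    if (one_root
            and all(len(deps.get(t, ())) <= 1 for t in order)
            and all(count <= 1 for count in dependents.values())):
        return "sequence"

    # Hub with independent spokes: every other task depends only on the shared root.
    if one_root and all(deps.get(t) == {roots[0]} for t in order if t != roots[0]):
        return "orchestrator-worker"

    return "mixed"

print(suggest_topology({"resolve": {"extract"}, "score": {"resolve"}}))
print(suggest_topology({"source_a": {"plan"}, "source_b": {"plan"}, "source_c": {"plan"}}))
print(suggest_topology({"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}))
```

A "mixed" result is the cue to split the graph into sub-graphs and classify each one, which is how composed systems such as an orchestrator over small pipelines arise.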

Then layer the reliability decision on top. Where a step is critical and you can verify its output, consider redundancy, run it more than once and vote or take the first verified success, because that is how you buy back the reliability the chain took away. Where you cannot verify, you cannot lean on redundancy, so you invest in making that single step as reliable as possible and in detecting its failure explicitly, since a downstream agent cannot tell that an upstream one quietly returned a confident wrong answer. And regardless of topology, put the whole thing on a shared structured store with single-threaded writes, because that decision costs almost nothing and pays off in every shape. The wiring is not a matter of taste once you have the dependency graph. The graph tells you the topology, the criticality tells you where to add redundancy, and the need for coherence tells you to use a blackboard. Taste is what fills the gaps.
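The "take the first verified success" idea can be sketched as a small helper. The names are hypothetical, the attempts are stand-ins for agent calls, and `verify` stands for whatever cheap check you actually have, such as running a test suite. Returning `None` makes the failure explicit rather than passing along an unverified answer.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")

def first_verified(
    attempts: Iterable[Callable[[], T]],
    verify: Callable[[T], bool],
) -> Optional[T]:
    """Run attempts in order and return the first output that passes verification."""
    for attempt in attempts:
        try:
            candidate = attempt()
        except Exception:
            # A crashed attempt counts as a failed attempt, not a failed run.
            continue
        if verify(candidate):
            return candidate
    # No attempt could be verified: surface that explicitly to the caller.
    return None

# Stand-ins for three diversified agents proposing a square root of 144.
attempts = [lambda: 11, lambda: 12, lambda: 13]
answer = first_verified(attempts, verify=lambda x: x * x == 144)
print(answer)
```

Attempts run lazily here, so later agents are only paid for when earlier ones fail verification.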

What this looks like when we build it

The abstract version is four primitives and two axes. Here is the concrete version, anonymized, from systems we have shipped.

Paralegent, our multi-agent legal-analysis system, is an orchestrator-worker graph, and it is that shape for the reason the pattern exists: analyzing a legal document decomposes into many independent passes over the same source, and independence is exactly what fans out cleanly from a coordinator. It runs twenty-three agents: twelve scorers and eleven analysts. But the topology is only why it can parallelize; the substrate is why it stays coherent. Those agents do not pass transcripts to each other. They read and write a shared, structured scores table and a queue, which is a blackboard in everything but name, so there is one source of truth instead of twenty-three drifting retellings. The writes land in that shared table in a disciplined way rather than twenty-three agents taking conflicting actions, which is the single-writer principle in production. And a routing layer decides which agents a given document actually needs instead of firing all twenty-three on every input, a change that cut model calls by roughly seventy-five percent. That routing figure is not a performance flex. It is what keeps a twenty-three-agent orchestrator economically sane, and it exists because we treated the topology and its cost as a design problem, not a default.
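As a rough illustration of routing before fan-out, and not a description of how Paralegent's router actually works, the sketch below selects only the agents whose predicate matches a document and then runs them concurrently. Every agent name, predicate, and the `run_agent` stand-in is hypothetical.

```python
import asyncio
from typing import Callable

Document = dict

# Hypothetical registry: agent name -> predicate deciding whether it applies.
ROUTES: dict[str, Callable[[Document], bool]] = {
    "indemnity_scorer": lambda d: "indemnif" in d["text"].lower(),
    "termination_analyst": lambda d: "terminat" in d["text"].lower(),
    "governing_law_scorer": lambda d: "governing law" in d["text"].lower(),
    "summary_analyst": lambda d: True,  # always runs
}

def route(doc: Document) -> list[str]:
    """Pick the agents this document needs instead of firing all of them."""
    return [name for name, applies in ROUTES.items() if applies(doc)]

async def run_agent(name: str, doc: Document) -> tuple[str, str]:
    # Stand-in for a model call; a real worker would call an LLM here.
    await asyncio.sleep(0)
    return name, f"{name} reviewed {doc['id']}"

async def analyze(doc: Document) -> dict[str, str]:
    selected = route(doc)
    # Fan out only to the selected workers and wait for all of them.
    results = await asyncio.gather(*(run_agent(name, doc) for name in selected))
    return dict(results)

doc = {"id": "doc-1", "text": "Either party may terminate this agreement on notice."}
findings = asyncio.run(analyze(doc))
print(sorted(findings), f"{len(findings)} of {len(ROUTES)} agents called")
```

In this toy case two of four agents run, so half the calls are skipped; the saving in a real system depends entirely on how selective the routing rules are.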

The counter-example is just as instructive. On a wealth-management platform we built, the document-to-knowledge pipeline runs eight stages, extraction, entity resolution, confidence scoring, and the rest, feeding a knowledge graph in Neo4j with six entity types and five relationship types. On paper that is a sequential topology. In practice we built it as a deterministic pipeline in code, not as a chain of autonomous agents, because the stages are fixed, ordered transformations with no genuine decision for an agent to make. Wrapping them in agents would have handed us the sequential topology's worst property, multiplicative reliability decay across eight non-deterministic steps, in exchange for none of the flexibility agents are supposed to provide. The skill is not knowing how to wire agents. It is recognizing when the honest topology has no agents in it at all, and having the discipline to ship the boring, reliable version. It is the same judgment we apply when we grade a graph before trusting it or evaluate whether a memory system actually works: the architecture serves the task, never the fashion.

How to tell your topology is wrong

A few symptoms point straight at a wiring problem rather than an agent problem, and each one maps to a specific fix.

Your reliability drops as you add stages, and adding a retry or a better prompt to one agent barely moves the end-to-end number. That is compounding decay from a chain that is too long, and the fix is redundancy on the critical steps or collapsing the chain, not a better model.

Your token bill is an order of magnitude high and your workers spend most of their tokens re-establishing context that another worker already had. That is a substrate problem, your agents are passing prose instead of sharing state, and the fix is a blackboard, not a bigger context window.

Your system produces subtly self-contradictory answers, and there is a step whose real job is to reconcile disagreements the architecture itself created. That is a network or a fan-out with no single writer, and the fix is to route through a coordinator and commit through one path.

Your workers wait on each other in a line despite being separate agents. That is a pipeline wearing an orchestrator's clothes, and the fix is to either make the subtasks genuinely independent or stop paying for agents you are running serially anyway.

And the tell that settles it: you cannot draw your system's dependency graph on a whiteboard in under a minute. If the shape is not clear to you, it is not clear to the system either, and clarity of topology is the thing that makes a multi-agent system debuggable.

None of these fixes is a new framework. They are all the same move: match the wiring to the dependency structure of the work, share structured state instead of messages, keep the writes single-threaded, and add redundancy only where you can verify it. The frameworks are interchangeable. The topology is the decision, and it is one you should make on purpose, from the task's dependency graph, before the first agent gets spun up. The next posts in this series go deeper into the coordination substrate, into how you evaluate a multi-agent system specifically rather than just its individual agents, and into the cost engineering that keeps all of this affordable. But the shape comes first, because every other decision is easier once the graph is right and nearly impossible when it is wrong.

Not sure whether your problem wants an orchestrator, a pipeline, or no agents at all? That is one of the first things we work out on an engagement, because the topology decision is cheaper to get right on a whiteboard than to unwind in production. Book a 15-minute call and we will map your task's dependency graph with you and tell you honestly which shape it wants, where redundancy earns its cost, and where a single well-built agent or a plain deterministic pipeline beats a swarm. We work US business hours.

Four Ways to Wire a Multi-Agent System (and When Each One Breaks)


Muhammad Mudassir
