TL;DR
Once you have decided you need more than one agent, how you wire them matters more than how many you have. There are four canonical topologies: a sequential pipeline (simple but its reliability multiplies down, so a ten-step chain at ninety-five percent per step is only about sixty percent end to end), an orchestrator-worker or supervisor pattern (right for independent parallel subtasks, but the orchestrator is a serial spine capped by Amdahl and a single point of failure), a hierarchical pattern (for very large decompositions, at a cost per layer), and a network (most flexible, least controllable, usually a trap in production). The same three agents at eighty percent each are about fifty-one percent reliable if all must succeed, about ninety percent under a majority vote, and about ninety-nine percent if any one can and you can verify it. Underneath the topology sits a second axis: share a structured store (a blackboard) instead of passing natural-language messages, and keep the writes single-threaded.
TL;DR: Once you have decided you actually need more than one agent, the next choice matters more than most teams realize: how you wire them together. There are four canonical topologies. A sequential pipeline runs agents in a line, simple to reason about but its reliability multiplies down, so a ten-step chain at ninety-five percent per step is only about sixty percent reliable end to end. An orchestrator-worker setup has one coordinator delegate to parallel workers and synthesize their results, which is the right shape for independent subtasks but leaves the orchestrator as a serial spine that Amdahl's law caps and a single point of failure. A hierarchical setup layers supervisors over sub-supervisors for very large decompositions, and every layer it adds is another tax on latency, cost, and reliability. A network, where every agent talks to every other agent, is the most flexible and the least controllable, and in production it is almost always a trap. The wiring changes the numbers by more than the agent count does: the same three agents at eighty percent each are about fifty-one percent reliable if all must succeed, about ninety percent under a majority vote, and about ninety-nine percent if any one can, as long as you can verify which one is right. And underneath the topology sits a second axis most teams ignore: whether agents share a structured store or pass each other natural-language messages. Share structured state, keep the writes single-threaded, and route instead of fanning out, and the topology stops fighting you.
The last post in this series was about whether to use multiple agents at all, and its answer was to default to one and make the second agent prove it is necessary. This post assumes you did that, the gate came back yes, and now you have a genuinely multi-agent problem on your hands. The question is no longer how many agents. It is how they connect. And that question gets skipped constantly, because the frameworks make one particular shape so easy to spin up that teams adopt it by default and never notice they chose an architecture at all. They notice later, when the bill is an order of magnitude too high, or the system contradicts itself, or a single flaky step takes the whole run down with it. All three of those are wiring problems, not agent problems.
The number of agents is a rounding error next to how they are connected. Two systems with the same five agents and the same models can differ by forty points of reliability and ten times the cost, purely because one was wired as a pipeline and the other as a voting pool. Topology is the decision. Head count is a consequence.
The wiring matters more than the head count
Here is the claim, stated plainly, because the rest of the post is evidence for it. When people describe a multi-agent system, they say how many agents it has. That is the least informative fact about it. What determines whether the system is fast, cheap, reliable, and coherent is the graph: which agents can talk to which, in what order, and through what medium. Change the graph and hold everything else fixed, and you get a different system with different economics and different failure modes.
This is not an abstract point. In the decision post we established that a multi-agent architecture charges you three costs every time (token multiplication, coordination failure, and compounding unreliability) in exchange for two possible gains (parallelism and isolation). The topology is precisely what decides how much of each cost you pay and how much of each gain you get. A pipeline gets you almost no parallelism and pays full compounding cost. An orchestrator-worker graph gets you real parallelism on the independent parts but pays for a coordinator. A network gets you maximum flexibility and pays maximum coordination cost. Same agents, same models, radically different outcomes, and the only thing that changed was the shape.
So the useful way to design a multi-agent system is to start from the dependency structure of the task, not from a framework's default. Map which parts of the work truly depend on which other parts, and the right topology falls out of that map almost mechanically. The parts that must happen in order want a sequence. The parts that can happen independently want to fan out from a coordinator. The parts that are enormous want a hierarchy, if anything. The parts that genuinely need open negotiation want a network, though that is rarer than it sounds. Get the map right and the wiring is nearly determined. Get it wrong and no amount of prompt tuning will save you, because you will be fighting the graph.
The four topologies, and the shape each one fits
There are four patterns worth knowing by name. Real systems combine them (an orchestrator whose workers each run a small internal pipeline is common), but you compose from these four primitives, so it pays to know exactly what each is for and exactly where each breaks.
The sequential pipeline is agents in a line: agent A's output is agent B's input, and so on to the end. It is the easiest to reason about because it is just a chain, and it fits work that is genuinely ordered, where each stage depends on the finished output of the one before it. Its failure is the one you can put a number on. Reliabilities in a chain multiply, so a pipeline is only as strong as the product of its links, and that product falls off a cliff. There is a second, quieter failure: a lot of what teams build as agent pipelines has no autonomous decision-making in it at all. If each stage does a fixed transformation and hands off, that is not a multi-agent system, it is a data pipeline, and wrapping deterministic steps in autonomous agents buys you non-determinism and token cost in exchange for nothing. Frameworks like CrewAI's sequential process or a linear graph in LangGraph make this shape trivial, which is exactly why it gets overused.
The orchestrator-worker pattern, also called supervisor or hub-and-spoke, has one coordinating agent that decomposes the task, delegates pieces to worker agents, and synthesizes their results. This is the shape most people mean when they say multi-agent, and it is the right one for the canonical good case: subtasks that are genuinely independent and can run in parallel. Anthropic's public account of a multi-agent research system is the textbook example: an orchestrator plans a query, spins up three to five subagents to explore different threads at the same time, then runs a synthesis and citation pass over what they found. It fits because researching one source does not depend on researching another. Its failures are structural. The orchestrator is a serial spine: the planning at the front and the synthesis at the back cannot be parallelized, and by Amdahl's law that serial fraction caps your speedup no matter how many workers you add. If thirty percent of the work is the orchestrator's serial glue, infinite workers still only get you about three and a third times faster. The orchestrator is also a single point of failure and a context bottleneck, since everything the workers produce has to funnel back through it. Supervisor graphs in LangGraph, hierarchical crews in CrewAI, and the handoff model in OpenAI's Agents SDK all implement versions of this.
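The Amdahl ceiling quoted above is easy to check. The sketch below is only the textbook formula; the function names are ours, and the thirty percent serial fraction is the illustrative figure from the paragraph, not a measurement.

```python
def amdahl_speedup(serial_fraction: float, workers: float) -> float:
    """Speedup from Amdahl's law when only the parallel part scales with workers."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def amdahl_ceiling(serial_fraction: float) -> float:
    """Upper bound on speedup as the worker count grows without limit."""
    return float("inf") if serial_fraction == 0 else 1.0 / serial_fraction


# Illustrative case: 30% of the run is the orchestrator's planning and synthesis.
for n in (1, 3, 5, 20):
    print(f"{n:>2} workers: {amdahl_speedup(0.30, n):.2f}x")
print(f"ceiling:    {amdahl_ceiling(0.30):.2f}x")
```

Under these assumptions, going from five workers to twenty moves the speedup from roughly 2.3x to roughly 3.0x, which is why shrinking the orchestrator's serial share usually pays more than adding workers.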
The hierarchical pattern is orchestrator-worker taken up a level: a top supervisor manages sub-supervisors, each of which manages its own workers, forming a tree. It exists for one honest reason, decompositions so large that a single orchestrator cannot hold the whole plan in one context. That is a real situation and a rare one. The cost is that every layer you add is another serial hop in the spine, another context handoff where meaning can drift, and another multiplicative factor in the reliability product. A three-layer hierarchy has three chances to misroute a task before any real work happens. Most systems that reach for a hierarchy are not too big for an orchestrator, they are over-decomposed, and the fix is fewer, more capable agents rather than more layers of management. When you genuinely do need it, frameworks like MetaGPT and Amazon Bedrock AgentCore model this kind of layered structure, but the burden of proof for a third layer should be high.
The network, or peer-to-peer, pattern lets every agent talk to every other agent directly, with no central coordinator, often as a group chat where agents take turns contributing. It is the most flexible topology and the most dangerous in production. Any-to-any communication means the number of possible interactions grows with the square of the agent count, there is no single place that owns the truth, and the conversation can loop, drift, or stall with no authority to stop it. This is Cognition's warning at full volume: agents on partial context making conflicting decisions, except now there is nobody whose job is to reconcile them. Networks are useful for open-ended exploration and brainstorming-style tasks where you want emergent disagreement, and AutoGen's group-chat mode is built for exactly that. But for a system that has to produce one correct, consistent answer and be debuggable when it does not, a network is the hardest possible shape to trust, because when it goes wrong there is no single trace to read.
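To put a number on "grows with the square", the sketch below counts distinct agent-to-agent channels in a full mesh versus a hub-and-spoke layout. The function names are ours, and counting channels is only a rough proxy for coordination cost.

```python
def peer_channels(agents: int) -> int:
    """Distinct pairs in a full mesh of `agents` agents: n * (n - 1) / 2."""
    return agents * (agents - 1) // 2


def hub_channels(agents: int) -> int:
    """Channels when one of the `agents` is a coordinator and the rest only talk to it."""
    return max(agents - 1, 0)


for n in (3, 5, 10, 23):
    print(f"{n:>2} agents: mesh={peer_channels(n):>3}  hub={hub_channels(n):>2}")
```

At ten agents the mesh already has 45 possible conversations to reason about against 9 for a hub, and each of those conversations is a place where state can diverge.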
The same agents, wired three ways, are not equally reliable
The strongest argument that topology dominates head count is arithmetic, so here it is with the numbers shown. Take three agents, each of which does its job correctly eighty percent of the time. Eighty percent per agent is fixed. What changes is how you wire them.
Wire them so that all three must succeed for the system to succeed, which is what a pipeline does and what an orchestrator that needs every worker's result does. The reliabilities multiply: 0.8 times 0.8 times 0.8 is about 0.51. Three solid agents combine into a system that is barely better than a coin flip, and it got worse, not better, by adding agents, because every additional required step is another factor below one. This is the same compounding math from the decision post, and it is why long chains are fragile: 0.95 to the tenth power is only about 0.60.
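The chain arithmetic can be turned around to ask how long a chain you can afford. This is a minimal sketch assuming independent steps with identical reliability; the 0.60 floor in the example is an arbitrary threshold chosen for illustration.

```python
def chain_reliability(per_step: float, steps: int) -> float:
    """End-to-end success probability when every step must succeed."""
    return per_step ** steps


def max_steps(per_step: float, target: float) -> int:
    """Longest chain whose end-to-end reliability stays at or above `target`."""
    if not 0.0 < per_step < 1.0 or not 0.0 < target <= 1.0:
        raise ValueError("per_step must be in (0, 1) and target in (0, 1]")
    steps, reliability = 0, 1.0
    while reliability * per_step >= target:
        reliability *= per_step
        steps += 1
    return steps


print(f"{chain_reliability(0.80, 3):.3f}")   # three required agents at 80%
print(f"{chain_reliability(0.95, 10):.3f}")  # ten-step chain at 95%
print(max_steps(0.95, 0.60))                 # steps you can afford above a 0.60 floor
```

With 95 percent steps, a 0.60 floor allows nine steps and the tenth drops just below it.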
Now wire the same three agents as a majority vote: run all three on the same task and take the answer at least two of them agree on. The system is now correct whenever any two or all three are right. That probability is 0.8 cubed plus three times 0.8 squared times 0.2, which is about 0.90. The identical three agents jumped from fifty-one percent to ninety percent, not by getting smarter, but by being wired for redundancy instead of dependency. And if you can actually verify which answer is correct, say the task is to produce code and you can run the test, then you only need any one of the three to succeed, and the OR math is 1 minus 0.2 cubed, about 0.99.
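All three wirings are the same binomial sum with a different threshold, which the sketch below makes explicit. It assumes independent agents with equal reliability; the function name is ours.

```python
from math import comb


def at_least_k_of_n(p: float, n: int, k: int) -> float:
    """Probability that at least k of n independent agents succeed, each with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p, n = 0.8, 3
all_required = at_least_k_of_n(p, n, 3)  # pipeline: every agent must succeed
majority = at_least_k_of_n(p, n, 2)      # vote: any two must agree on the right answer
any_verified = at_least_k_of_n(p, n, 1)  # verified OR: one success is enough

print(f"all three: {all_required:.3f}")
print(f"majority:  {majority:.3f}")
print(f"any one:   {any_verified:.3f}")
```

The same function covers larger pools, for example `at_least_k_of_n(0.8, 5, 3)` for a five-agent majority.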
So the same three agents span from about fifty-one percent to about ninety-nine percent reliable, a forty-eight point swing, entirely on wiring. That is the whole thesis in one calculation: topology, not head count, sets your reliability. Two honest caveats keep this from being a magic trick. First, all of that redundancy math assumes the agents' errors are independent, and they are not fully independent when they are the same model with the same prompt seeing the same input, so real voting gains are smaller than the clean numbers and you get them by deliberately diversifying prompts, models, or tools. Second, the OR case only helps when you can cheaply and correctly verify which output is right, because if you cannot tell the good answer from the bad one, running three and picking wrong is just three times the cost. Redundancy buys reliability, but only when you engineer for independence and verification on purpose.
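The first caveat can be made concrete with a deliberately crude toy model, which is our assumption and not a measured property of any system: with probability `rho` the three agents share one outcome (all right or all wrong together, as identical prompts on one model tend to), and otherwise they behave independently.

```python
def majority_of_three(p: float) -> float:
    """Majority-vote reliability for three independent agents."""
    return p**3 + 3 * p**2 * (1 - p)


def any_of_three(p: float) -> float:
    """Verified-OR reliability for three independent agents."""
    return 1 - (1 - p) ** 3


def with_shared_failures(independent_value: float, p: float, rho: float) -> float:
    """Toy mixture: with probability rho the pool collapses to a single agent's outcome."""
    return rho * p + (1 - rho) * independent_value


p = 0.8
for rho in (0.0, 0.5, 1.0):
    vote = with_shared_failures(majority_of_three(p), p, rho)
    either = with_shared_failures(any_of_three(p), p, rho)
    print(f"rho={rho:.1f}  majority={vote:.3f}  any-one={either:.3f}")
```

In this model, fully correlated agents (`rho` of 1.0) give exactly the single-agent 0.80 at three times the cost, and half-correlated agents lose about half of the voting gain.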
Topology is only half the decision: the substrate underneath
Choosing who talks to whom is the first axis. The second, which teams skip almost universally, is how they talk: what actually moves between agents. This substrate decision quietly determines more of your coordination cost than the topology does, and there are two options.
The first is message passing, where agents hand each other natural-language messages, one agent's output prose becomes the next one's input prompt. It is the default in most conversational frameworks and in the handoff model, and it has a built-in defect: every hand-off is a fresh translation, and natural language is lossy and ambiguous, so meaning drifts a little at each step. Ten agents passing messages is a game of telephone with ten players, and the final message rarely matches the first. Worse, there is no single place to inspect the system's state, because the state is smeared across a transcript that no two agents read the same way.
The second is shared structured state, where every agent reads from and writes to one common store with defined fields (a table of facts, scores, decisions, or claims) rather than mailing prose around. This is an old and well-proven idea. The blackboard architecture, introduced with the Hearsay-II speech-understanding system (built at Carnegie Mellon in the 1970s and written up in 1980), had independent specialist modules all reading and writing a shared blackboard instead of talking to each other directly, precisely so there would be one consistent source of truth. It is the same reason the memory series treated the structured store, not the message history, as the real memory, and the same discipline that keeps a knowledge graph trustworthy: structured state is queryable, consistent, and does not change with each retelling. Give agents a blackboard and coordination bugs drop, because there is no prose to misread, only state to read.
Pair the substrate with the single-writer principle and most of the remaining risk goes away. Cognition's sharpened conclusion, which the field is converging on, is that extra agents are welcome to read, analyze, and propose in parallel, because reading never conflicts, but the writes, the actions that commit to shared state, should flow through one path so two agents can never commit contradictory changes. A shared blackboard for reading plus a single writer for committing gives you the parallelism you paid for without the consistency tax that sinks most multi-agent builds. Topology decides the shape of the conversation. Substrate decides whether the conversation stays coherent, and it is the cheaper of the two to get right.
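A minimal sketch of that pairing, using only the standard library: agents read snapshots and submit proposals in parallel, while a single writer thread is the only code that mutates the store. Every name here (`Blackboard`, `Proposal`, `scorer`) is hypothetical, and the first-commit-wins conflict policy is an assumption picked for brevity; a real system would choose its own.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    agent: str
    field: str
    value: float


class Blackboard:
    """Shared structured store: any agent may read, one writer thread commits."""

    def __init__(self) -> None:
        self._state: dict[str, float] = {}
        self._lock = threading.Lock()
        self._inbox: queue.Queue = queue.Queue()
        self.rejected: list[Proposal] = []
        self._writer = threading.Thread(target=self._commit_loop, daemon=True)
        self._writer.start()

    def read(self) -> dict[str, float]:
        with self._lock:
            return dict(self._state)  # copy, so readers never hold the live dict

    def propose(self, proposal: Proposal) -> None:
        self._inbox.put(proposal)  # agents never touch _state directly

    def _commit_loop(self) -> None:
        while True:
            proposal = self._inbox.get()
            if proposal is None:  # sentinel sent by close()
                return
            with self._lock:
                if proposal.field in self._state:
                    self.rejected.append(proposal)  # assumed policy: first commit wins
                else:
                    self._state[proposal.field] = proposal.value

    def close(self) -> None:
        self._inbox.put(None)
        self._writer.join()


def scorer(name: str, board: Blackboard) -> None:
    board.read()  # a real agent would analyze the snapshot here
    board.propose(Proposal(name, f"{name}_score", len(name) / 10))


board = Blackboard()
with ThreadPoolExecutor(max_workers=3) as pool:
    list(pool.map(lambda name: scorer(name, board), ["risk", "clarity", "scope"]))

# Two agents disagree about the same field; the writer resolves it in one place.
board.propose(Proposal("analyst_a", "verdict", 1.0))
board.propose(Proposal("analyst_b", "verdict", 0.0))
board.close()

state = board.read()
print(state)
print([p.agent for p in board.rejected])
```

The point of the shape is that the contradiction between the two analysts is detected and recorded at the commit path instead of surfacing later as an inconsistent answer.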
Matching the wiring to the task's dependency graph
With the primitives and the two axes in hand, the design procedure is short. Draw the task as a dependency graph, which parts need which other parts' output before they can start, and read the topology off it.
If the graph is a straight line, every step needing the last, you have a sequence, and your first question should be whether the steps involve any real autonomous decisions or just fixed transformations. If they are fixed, build a deterministic pipeline in code and skip agents entirely, the way we did on the wealth-management platform described below. If the graph is a hub with independent spokes, subtasks that share a parent but not each other, you have an orchestrator-worker case, so build the coordinator, run the workers in parallel, and keep the synthesis honest about which workers actually mattered. If the graph is genuinely enormous and no single coordinator can hold the plan, and only then, add a layer of hierarchy, knowing each layer costs you. If the graph is a dense mesh where everything depends on everything and the value is in open negotiation rather than a single correct output, a network might fit, but treat that as the exception that has to argue for itself, the same way a second agent had to in the previous post.
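Reading the shape off the graph can be partly mechanized. The sketch below uses the standard library's `graphlib` to group tasks into waves that could run at the same time; the task names are made up, and `suggest_shape` is a deliberately crude heuristic of ours that only separates a straight line from a fan-out. It does not detect hierarchies, and a cyclic graph raises `graphlib.CycleError`.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave has all its prerequisites done."""
    sorter = TopologicalSorter(dependencies)  # maps task -> tasks it depends on
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def suggest_shape(dependencies: dict[str, set[str]]) -> str:
    """Crude first read: a line if nothing can overlap, a fan-out otherwise."""
    waves = execution_waves(dependencies)
    if not waves:
        return "empty"
    return "sequence" if max(len(wave) for wave in waves) == 1 else "fan-out"


chain = {"extract": set(), "resolve": {"extract"}, "score": {"resolve"}}
hub = {
    "plan": set(),
    "source_a": {"plan"},
    "source_b": {"plan"},
    "source_c": {"plan"},
    "synthesize": {"source_a", "source_b", "source_c"},
}

print(suggest_shape(chain), execution_waves(chain))
print(suggest_shape(hub), execution_waves(hub))
```

The widest wave is also a quick upper bound on how many workers can ever be busy at once, which is worth knowing before paying for more.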
Then layer the reliability decision on top. Where a step is critical and you can verify its output, consider redundancy, run it more than once and vote or take the first verified success, because that is how you buy back the reliability the chain took away. Where you cannot verify, you cannot lean on redundancy, so you invest in making that single step as reliable as possible and in detecting its failure explicitly, since a downstream agent cannot tell that an upstream one quietly returned a confident wrong answer. And regardless of topology, put the whole thing on a shared structured store with single-threaded writes, because that decision costs almost nothing and pays off in every shape. The wiring is not a matter of taste once you have the dependency graph. The graph tells you the topology, the criticality tells you where to add redundancy, and the need for coherence tells you to use a blackboard. Taste is what fills the gaps.
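The two redundancy modes look like this in miniature. This is a sketch under stated assumptions: `first_verified` and `majority_vote` are our names, the "agents" are stand-in functions returning fixed lists, and the verifier is a trivial sortedness check standing in for something like a test suite.

```python
from collections import Counter
from collections.abc import Callable, Hashable, Iterable, Sequence
from typing import Optional, TypeVar

T = TypeVar("T")


def first_verified(attempts: Iterable[Callable[[], T]], verify: Callable[[T], bool]) -> Optional[T]:
    """Run attempts in order and return the first result the verifier accepts."""
    for attempt in attempts:
        candidate = attempt()
        if verify(candidate):
            return candidate  # later attempts are never run
    return None  # explicit failure instead of a confident wrong answer


def majority_vote(answers: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the answer given by a strict majority, or None if there is none."""
    if not answers:
        return None
    winner, count = Counter(answers).most_common(1)[0]
    return winner if count > len(answers) / 2 else None


calls: list[str] = []


def make_attempt(name: str, result: list[int]) -> Callable[[], list[int]]:
    def attempt() -> list[int]:
        calls.append(name)  # record which stand-in agents actually ran
        return result
    return attempt


def is_sorted(xs: list[int]) -> bool:
    return all(a <= b for a, b in zip(xs, xs[1:]))


attempts = [
    make_attempt("agent_a", [3, 1, 2]),
    make_attempt("agent_b", [1, 2, 3]),
    make_attempt("agent_c", [1, 3, 2]),
]
result = first_verified(attempts, is_sorted)
print(result, calls)
print(majority_vote(["B", "A", "B"]), majority_vote(["A", "B", "C"]))
```

Both helpers return `None` when redundancy did not produce a trustworthy answer, so the failure is visible to whatever runs downstream.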
What this looks like when we build it
The abstract version is four primitives and two axes. Here is the concrete version, anonymized, from systems we have shipped.
Paralegent, our multi-agent legal-analysis system, is an orchestrator-worker graph, and it is that shape for the reason the pattern exists: analyzing a legal document decomposes into many independent passes over the same source, and independence is exactly what fans out cleanly from a coordinator. It runs twenty-three agents, twelve scorers and eleven analysts. But the topology is only why it can parallelize, the substrate is why it stays coherent. Those agents do not pass transcripts to each other, they read and write a shared, structured scores table and a queue, which is a blackboard in everything but name, so there is one source of truth instead of twenty-three drifting retellings. The writes land in that shared table in a disciplined way rather than twenty-three agents taking conflicting actions, which is the single-writer principle in production. And a routing layer decides which agents a given document actually needs instead of firing all twenty-three on every input, the change that cut model calls by roughly seventy-five percent. That routing figure is not a performance flex, it is what keeps a twenty-three-agent orchestrator economically sane, and it exists because we treated the topology and its cost as a design problem, not a default.
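To illustrate the idea of a routing layer in general terms, here is a minimal sketch. It is not Paralegent's router: the agent names, the document fields, and the keyword rules are all invented for the example, and a production router would more plausibly use a classifier than substring checks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Route:
    agent: str                       # hypothetical agent name
    applies: Callable[[dict], bool]  # cheap check that decides whether to call it


# Hypothetical routing table; every rule runs without a model call.
ROUTES = [
    Route("indemnity_scorer", lambda doc: "indemnif" in doc["text"].lower()),
    Route("termination_scorer", lambda doc: "terminat" in doc["text"].lower()),
    Route("jurisdiction_analyst", lambda doc: doc.get("doc_type") == "contract"),
    Route("privacy_analyst", lambda doc: "personal data" in doc["text"].lower()),
]


def select_agents(doc: dict, routes: list[Route] = ROUTES) -> list[str]:
    """Return only the agents whose rule matches this document."""
    return [route.agent for route in routes if route.applies(doc)]


doc = {
    "doc_type": "contract",
    "text": "Either party may terminate this agreement with 30 days notice.",
}
selected = select_agents(doc)
print(selected)
print(f"model calls: {len(selected)} of {len(ROUTES)}")
```

The saving comes from the routing checks being far cheaper than the agents they gate, so the check itself must stay simple enough not to become another model call per agent.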
The counter-example is just as instructive. On a wealth-management platform we built, the document-to-knowledge pipeline runs eight stages, extraction, entity resolution, confidence scoring, and the rest, feeding a knowledge graph in Neo4j with six entity types and five relationship types. On paper that is a sequential topology. In practice we built it as a deterministic pipeline in code, not as a chain of autonomous agents, because the stages are fixed, ordered transformations with no genuine decision for an agent to make. Wrapping them in agents would have handed us the sequential topology's worst property, multiplicative reliability decay across eight non-deterministic steps, in exchange for none of the flexibility agents are supposed to provide. The skill is not knowing how to wire agents. It is recognizing when the honest topology has no agents in it at all, and having the discipline to ship the boring, reliable version. It is the same judgment we apply when we grade a graph before trusting it or evaluate whether a memory system actually works: the architecture serves the task, never the fashion.
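For contrast with an agent chain, a deterministic pipeline is just function composition. The sketch below is a generic illustration, not the platform's code: the three stage functions and their logic are placeholders we made up, and the only point is that the same input always yields the same output with no model call in the loop.

```python
from functools import reduce
from typing import Callable

Stage = Callable[[dict], dict]


def extract(record: dict) -> dict:
    """Placeholder extraction: split the text into candidate tokens."""
    return {**record, "entities": record["text"].split()}


def normalize(record: dict) -> dict:
    """Placeholder entity resolution: lowercase, strip punctuation, deduplicate."""
    cleaned = {token.strip(".,").lower() for token in record["entities"]}
    return {**record, "entities": sorted(cleaned)}


def score(record: dict) -> dict:
    """Placeholder confidence score derived from the entity count."""
    return {**record, "confidence": min(1.0, len(record["entities"]) / 5)}


def run_pipeline(record: dict, stages: list[Stage]) -> dict:
    """Apply fixed stages in order; each stage returns a new record."""
    return reduce(lambda acc, stage: stage(acc), stages, record)


stages = [extract, normalize, score]
output = run_pipeline({"text": "Acme Corp. acquired Acme Holdings"}, stages)
print(output["entities"], output["confidence"])
```

Because every stage is an ordinary function, each one can be unit-tested on its own, and a failure points at one stage rather than at a transcript.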
How to tell your topology is wrong
A few symptoms point straight at a wiring problem rather than an agent problem, and each one maps to a specific fix.
Your reliability drops as you add stages, and adding a retry or a better prompt to one agent barely moves the end-to-end number. That is compounding decay from a chain that is too long, and the fix is redundancy on the critical steps or collapsing the chain, not a better model.
Your token bill is an order of magnitude high and your workers spend most of their tokens re-establishing context that another worker already had. That is a substrate problem, your agents are passing prose instead of sharing state, and the fix is a blackboard, not a bigger context window.
Your system produces subtly self-contradictory answers, and there is a step whose real job is to reconcile disagreements the architecture itself created. That is a network or a fan-out with no single writer, and the fix is to route through a coordinator and commit through one path.
Your workers wait on each other in a line despite being separate agents. That is a pipeline wearing an orchestrator's clothes, and the fix is to either make the subtasks genuinely independent or stop paying for agents you are running serially anyway.
And the tell that settles it: you cannot draw your system's dependency graph on a whiteboard in under a minute. If the shape is not clear to you, it is not clear to the system either, and clarity of topology is the thing that makes a multi-agent system debuggable.
None of these fixes is a new framework. They are all the same move, match the wiring to the dependency structure of the work, share structured state instead of messages, keep the writes single-threaded, and add redundancy only where you can verify it. The frameworks are interchangeable. The topology is the decision, and it is one you should make on purpose, from the task's dependency graph, before the first agent gets spun up. The next posts in this series go deeper into the coordination substrate, into how you evaluate a multi-agent system specifically rather than just its individual agents, and into the cost engineering that keeps all of this affordable. But the shape comes first, because every other decision is easier once the graph is right and nearly impossible when it is wrong.
Not sure whether your problem wants an orchestrator, a pipeline, or no agents at all? That is one of the first things we work out on an engagement, because the topology decision is cheaper to get right on a whiteboard than to unwind in production. Book a 15-minute call and we will map your task's dependency graph with you and tell you honestly which shape it wants, where redundancy earns its cost, and where a single well-built agent or a plain deterministic pipeline beats a swarm. We work US business hours.
Muhammad Mudassir
Founder & CEO, Cognilium AI | 10+ years
Mudassir Marwat is the Founder & CEO of Cognilium AI. He has shipped 100+ production AI systems acro...
