Multi-Agent Systems Are Repeating the Microservices Mistake
In 2015, microservices were the answer to everything. Monoliths were bad. Small, focused services were good. Split everything. Deploy independently. Scale horizontally. The pattern seemed obviously correct — until teams discovered they’d traded a local complexity problem for a distributed one. Latency budgets. Partial failures. Distributed transactions. Observability nightmares.
We are doing this again, but with AI agents.
The current orthodoxy: decompose your AI workloads into specialized agents, wire them together with an orchestration layer, and watch them collaborate like a well-run engineering team. In practice, teams are discovering the same coordination tax that killed naive microservices adoption — and for exactly the same reasons.
Why Multi-Agent Decomposition Feels Right
The logic is seductive. A single general-purpose agent that codes, searches the web, manages files, calls APIs, and writes documentation is doing too much. It lacks focus. Context windows fill up with noise. Errors in one domain corrupt reasoning in another.
So you split: a research agent, a coding agent, a QA agent, a planning agent. Each is smaller, more focused, easier to evaluate. You can swap individual agents for better models as they improve. The system feels modular and composable.
This works extremely well — for simple pipelines with clear handoff points. Research → summarize → write → publish. Linear flow. Defined interfaces. Bounded scope.
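A linear handoff like that can be sketched as a plain function chain, where each stage receives only the previous stage's output. All stage names below are illustrative placeholders, not a real framework:

```python
# A linear agent pipeline: each stage is a plain function of the
# previous stage's output, so the flow is easy to test and reason about.
# Every function here is a hypothetical stand-in for an LLM-backed step.

def research(topic: str) -> list[str]:
    return [f"finding about {topic}"]

def summarize(findings: list[str]) -> str:
    return "; ".join(findings)

def write(summary: str) -> str:
    return f"Draft: {summary}"

def publish(draft: str) -> str:
    return f"Published: {draft}"

def pipeline(topic: str) -> str:
    # Research -> summarize -> write -> publish, in strict sequence.
    return publish(write(summarize(research(topic))))
```

The appeal is obvious: each stage is independently testable, and the interface between stages is just a value.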
The problem starts when you need coordination.
The Coordination Tax Is Real
The moment agents need to collaborate on a shared goal with ambiguous sub-tasks, you’ve introduced all the classic distributed systems problems:
Partial failure. One agent in a pipeline fails or produces garbage output. Does the orchestrator retry? Skip? Abort? Each choice has downstream consequences that are hard to predict and harder to test. Unlike a function call that throws an exception, agent failure is often soft — the agent returns something plausible but wrong, and the downstream agent accepts it without complaint.
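One way to catch soft failures is to validate each agent's output against an explicit check before handing it downstream, and fail loudly when retries are exhausted. A minimal sketch; the agent callable and validator are hypothetical:

```python
# Sketch: guard each handoff with an explicit validator so that
# plausible-but-wrong output is rejected instead of silently accepted.
from typing import Callable

class SoftFailure(Exception):
    """Raised when an agent's output fails validation after retries."""

def call_with_validation(agent: Callable[[str], str],
                         task: str,
                         validate: Callable[[str], bool],
                         max_retries: int = 2) -> str:
    for _ in range(max_retries + 1):
        output = agent(task)
        if validate(output):
            return output
    # Fail loudly rather than pass garbage downstream.
    raise SoftFailure(
        f"agent output failed validation after {max_retries + 1} attempts")
```

This does not solve the hard part (writing a validator that actually detects "plausible but wrong"), but it makes the retry/skip/abort decision explicit instead of implicit.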
State consistency. Multiple agents reading and writing shared state (files, databases, API resources) without coordination creates races. An orchestration framework that doesn’t solve distributed state properly just moves the problem from agent logic to infrastructure configuration.
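One common mitigation, not specific to any framework, is optimistic concurrency: state carries a version number, and a write only succeeds if the version the writer read is still current. A minimal sketch:

```python
# Sketch: optimistic concurrency for shared agent state.
# A write succeeds only if the writer's version still matches,
# forcing the losing agent to re-read instead of clobbering.

class VersionConflict(Exception):
    pass

class SharedState:
    def __init__(self, value: str = ""):
        self.value = value
        self.version = 0

    def read(self) -> tuple[str, int]:
        return self.value, self.version

    def write(self, new_value: str, expected_version: int) -> None:
        if expected_version != self.version:
            raise VersionConflict("state changed since read; re-read and retry")
        self.value = new_value
        self.version += 1
```

The point is not this particular mechanism; it's that some mechanism has to exist, and it lives in infrastructure, not in agent prompts.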
Latency amplification. Each agent handoff is a round-trip to an LLM. Chain six specialized agents and you’ve multiplied your wall-clock time by six, minimum. Human users working at interactive speed don’t tolerate this. Neither do downstream services with SLAs.
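The arithmetic is unforgiving. With an illustrative figure of 3 seconds per LLM round-trip (an assumption, not a benchmark), a six-agent serial chain pays:

```python
# Back-of-envelope latency for a serial agent chain.
# The per-hop figure is illustrative, not a measured benchmark.

def serial_latency(hops: int, per_hop_seconds: float) -> float:
    # Every handoff waits on the previous one; no overlap, no retries,
    # no tool calls -- so this is a floor, not an estimate.
    return hops * per_hop_seconds

# serial_latency(6, 3.0) -> 18.0 seconds before any retry or tool call
```

Eighteen seconds of floor latency is already outside interactive range, and every retry or tool call only adds to it.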
Context loss at boundaries. When you pass work between agents, you lose implicit context. The research agent knows why it chose certain sources. The coding agent doesn’t. This produces subtly wrong outputs that look correct until you read them carefully.
The Real Lesson From Microservices
The industry eventually learned that microservices aren’t wrong — premature decomposition is wrong. You split a system when you have a clear, stable interface between components, independent scaling requirements, and a team structure that maps to the split. You don’t split because splitting feels architecturally pure.
The same principle applies to agents.
A single, well-scoped agent with a large context window and good tool use will outperform a four-agent pipeline on most real-world tasks — not because the single agent is smarter, but because it doesn’t pay the coordination tax. It holds the full context. It can revise its own work without negotiating across a message bus. It fails in ways that are easier to observe and correct.
The cases where multi-agent systems genuinely win are narrower than the current hype suggests:
- Parallelism that can’t be serialized. Tasks where you need multiple independent workstreams running simultaneously, not sequentially.
- Specialization with a clean interface. A code-execution sandbox that an orchestrator calls with defined inputs and outputs. Not “a coding agent that talks to a planning agent.”
- Scale beyond a single context window. Very large tasks that genuinely exceed what any model can hold — though context windows are growing fast enough that this threshold keeps moving.
What Good Agent Architecture Actually Looks Like
Start with one agent. Give it good tools. Define its scope clearly. Measure its failure modes.
When you hit a genuine scalability or quality ceiling — not a theoretical one, a real one — identify the specific bottleneck. Is it context size? Is it a capability gap a specialized model would fill? Is it a task that’s genuinely parallelizable?
Then decompose along that specific seam, with a defined protocol at the boundary. Not a vague “the planning agent will tell the coding agent what to do” — an actual schema, a clear success/failure signal, an owner for the interface.
Apply the same discipline you’d apply to splitting a service: define the API first, write the contract tests, understand the failure modes before you commit to the split.
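What "an actual schema, a clear success/failure signal" might look like in practice, sketched with plain dataclasses; all field names here are illustrative, not taken from any particular framework:

```python
# Sketch of a defined boundary between a planning agent and a coding
# agent: an explicit request/result schema with an unambiguous success
# signal, plus a contract check both sides can run.
from dataclasses import dataclass, field

@dataclass
class CodingTask:
    goal: str                                      # what to build
    acceptance_criteria: list[str] = field(default_factory=list)

@dataclass
class CodingResult:
    ok: bool                 # the clear success/failure signal
    artifact: str = ""       # produced code; empty on failure
    failure_reason: str = "" # must be set when ok is False

def result_is_well_formed(result: CodingResult) -> bool:
    # The contract test: every result must be internally consistent.
    if result.ok:
        return bool(result.artifact) and not result.failure_reason
    return bool(result.failure_reason)
```

The contract check runs on both sides of the boundary, so a malformed handoff fails at the interface instead of propagating downstream.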
If you can’t define a clean interface between two agents, they shouldn’t be two agents.
The Orchestration Layer Problem
Most current orchestration frameworks are solving the wrong problem. They focus on routing messages between agents and retrying failures. They don’t solve the hard problems: detecting soft failures where an agent returns plausible-but-wrong output, maintaining coherent state across agent boundaries, or giving a human operator a legible view into what the system is actually doing.
The frameworks that will win in 2026 are the ones that treat observability as a first-class concern — where every agent action is logged with enough context to answer “why did it do that?” and every state mutation is traceable to a causal chain. Without this, multi-agent systems are black boxes that fail silently and degrade gradually.
This isn’t glamorous. It doesn’t make for impressive architecture diagrams. But it’s the difference between a system you can operate in production and one that works in demos.
Conclusion
Multi-agent systems are not inherently wrong. The pattern is genuinely powerful for the right problems. But the industry is pattern-matching to microservices without learning microservices’ hardest lesson: distribution is a cost, not a feature.
Before you split your agent into four specialized agents, ask whether the problem is that your agent is doing too much, or that you don’t have good enough tools, context, or prompting for it to do the one thing it should do well.
Start simple. Decompose deliberately. Define your interfaces before you build them. Treat observability as load-bearing infrastructure, not an afterthought.
The teams that get this right will build systems that compound in capability over time. The ones that don’t will spend 2027 rewriting their orchestration layer — the same way 2018 was spent consolidating microservices back into sensible service boundaries.
History is only useful if you actually learn from it.