Multi-Agent Orchestration
A single agent has its own context, its own attention, and its own work to do. For complex tasks, multiple agents working in parallel can be dramatically faster than one agent working sequentially, or dramatically more wasteful, depending on how the orchestration is designed.
This page covers the patterns.
When multi-agent helps
Independent investigation
Three different things to research; each can be a subagent. Results return; main agent synthesizes.
For research-heavy tasks, parallelism gives a real speedup.
Parallel implementation
Three independent files to modify; three subagents in parallel. Each completes; main agent moves on.
For tasks that decompose cleanly, the subagents do genuinely parallel work.
Different specializations
One subagent handles backend; another handles frontend; another handles tests. Each has its own focus.
Context isolation
Some work is exploratory and produces lots of intermediate context. Doing it in a subagent keeps the parent context clean.
The "research in subagent; report in main" pattern.
When multi-agent doesn't help
Sequential dependencies
Step B needs Step A's results. Spawning subagents doesn't help if work is inherently sequential.
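One way to check whether work is inherently sequential is to write down the dependencies and see how many tasks are ever ready at the same time. The sketch below uses the standard-library `graphlib` module; the task names and the `parallel_batches` helper are hypothetical, chosen only for illustration.

```python
from graphlib import TopologicalSorter

def parallel_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; every task in a wave has all its dependencies met."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks that could run in parallel now
        batches.append(ready)
        sorter.done(*ready)
    return batches

# Hypothetical task graphs: each key depends on the tasks in its set.
chain = {"A": set(), "B": {"A"}, "C": {"B"}}
independent = {"A": set(), "B": set(), "C": set()}

print(parallel_batches(chain))        # three waves of one task each
print(parallel_batches(independent))  # one wave of three tasks
```

If every wave contains a single task, as in the chain, subagents add overhead without adding parallelism.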
Small tasks
For 2-minute tasks, the orchestration overhead exceeds the gain.
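A rough break-even estimate makes the point concrete. The model below is a deliberate simplification (parallel wall time is assumed to be the slowest task plus a fixed orchestration overhead), and all numbers are illustrative, not measurements.

```python
def estimated_speedup(task_seconds: list[float], overhead_seconds: float) -> float:
    """Ratio of sequential time to parallel time under a simple model.

    Assumes parallel wall time = slowest task + fixed orchestration overhead.
    """
    sequential = sum(task_seconds)
    parallel = max(task_seconds) + overhead_seconds
    return sequential / parallel

# Illustrative numbers only: 90 s of spawn/prompt/synthesis overhead is an assumption.
small = estimated_speedup([40, 40, 40], overhead_seconds=90)     # about 2 minutes of work
large = estimated_speedup([600, 600, 600], overhead_seconds=90)  # about 30 minutes of work
print(f"small tasks: {small:.2f}x, large tasks: {large:.2f}x")
```

A ratio below 1 means the orchestrated version is slower than simply doing the work in sequence.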
Tightly-coupled work
If subagents constantly need to communicate, they're not really parallel.
Subagents replicating the main agent
If a subagent only does what the main agent could do itself, one step after another, it adds overhead without any benefit.
Patterns
Fan-out, fan-in
Main agent decomposes the task; spawns N subagents; collects results; synthesizes.
Common for: research, multi-file refactors, parallel analysis.
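The shape of fan-out, fan-in can be sketched with `asyncio`. Here `run_subagent` is a hypothetical stand-in for whatever actually launches a subagent, and the "synthesis" step is reduced to joining strings.

```python
import asyncio

async def run_subagent(task: str) -> str:
    """Hypothetical stand-in for a real subagent call."""
    await asyncio.sleep(0.01)  # simulates the subagent doing work
    return f"summary of {task}"

async def fan_out_fan_in(subtasks: list[str]) -> str:
    # Fan out: start every subagent at once; gather returns results in input order.
    results = await asyncio.gather(*(run_subagent(t) for t in subtasks))
    # Fan in: a real parent would reason over the results; this sketch just joins them.
    return "\n".join(results)

report = asyncio.run(fan_out_fan_in(["source A", "source B", "source C"]))
print(report)
```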
Specialist subagents
Different subagents for different specializations. Each has its own skill set.
Common for: code review (different reviewers); multi-stage pipelines; complex workflows.
Hierarchical
Subagents may spawn their own sub-subagents. Tree of work.
Rarely needed; usually two levels are enough. Every extra level adds coordination overhead.
Serial via subagents (non-parallel)
Sometimes a subagent is used not for parallelism but for context isolation. The parent dispatches; waits; gets a clean result.
Useful when the work would be context-heavy but the result is concise.
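As a toy model of context isolation, the sketch below treats the parent context as a list of strings. The `explore` function is a hypothetical subagent that produces many intermediate notes but hands back only a short answer.

```python
def explore(question: str) -> tuple[list[str], str]:
    """Hypothetical subagent: returns (intermediate notes, final answer)."""
    notes = [f"read file {i} while investigating {question!r}" for i in range(50)]
    answer = f"answer to {question!r}"
    return notes, answer

def dispatch_isolated(parent_context: list[str], question: str) -> None:
    notes, answer = explore(question)  # the 50 notes stay in this scope
    parent_context.append(answer)      # only the concise result crosses back

parent_context = ["user request"]
dispatch_isolated(parent_context, "where is auth configured?")
print(len(parent_context))  # the parent grew by one entry, not fifty
```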
Designing for multi-agent
Self-contained tasks
Each subagent's task should be self-contained. The prompt has everything needed; no implicit context from the parent.
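One way to enforce this is to build every subagent prompt from an explicit structure, so nothing depends on what the parent happens to have in context. The `SubagentTask` class and its field names are assumptions made for this sketch, not part of any tool's API.

```python
from dataclasses import dataclass, field

@dataclass
class SubagentTask:
    """Hypothetical container for everything a subagent needs to know."""
    goal: str
    background: str
    constraints: list[str] = field(default_factory=list)
    return_format: str = "A summary of at most 10 lines."

    def to_prompt(self) -> str:
        parts = [f"Goal: {self.goal}", f"Background: {self.background}"]
        if self.constraints:
            parts.append("Constraints:\n" + "\n".join(f"- {c}" for c in self.constraints))
        parts.append(f"Return format: {self.return_format}")
        return "\n\n".join(parts)

task = SubagentTask(
    goal="Find every call site of load_config()",
    background="Python repo; configuration code lives under src/settings/.",
    constraints=["Read-only: do not edit files"],
)
print(task.to_prompt())
```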
Concise results
Subagents return their work as text. Keep it concise: long results bloat the parent's context.
Clear scope
What can each subagent decide? When does it return for parent decision?
Failure handling
If a subagent fails, what does the parent do? Retry? Different approach? Report?
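A minimal sketch of one answer to those questions: retry a bounded number of times, then report the failure explicitly so the parent cannot mistake it for success. The subagent functions here are hypothetical stand-ins, and the retry count is an arbitrary choice.

```python
def run_with_retry(run, task: str, max_attempts: int = 2) -> dict:
    """Try a subagent up to max_attempts times, then report failure explicitly."""
    errors = []
    for attempt in range(1, max_attempts + 1):
        try:
            return {"task": task, "ok": True, "result": run(task), "attempts": attempt}
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(str(exc))
    return {"task": task, "ok": False, "errors": errors, "attempts": max_attempts}

# Hypothetical flaky subagent: fails on its first call, succeeds afterwards.
calls = {"n": 0}

def flaky_subagent(task: str) -> str:
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("timeout")
    return f"done: {task}"

def broken_subagent(task: str) -> str:
    raise RuntimeError("tool unavailable")

recovered = run_with_retry(flaky_subagent, "update config.py")
failed = run_with_retry(broken_subagent, "update config.py")
print(recovered)
print(failed)
```

The parent can then branch on `ok`: continue, try a different approach, or surface the errors to the user.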
Specific patterns
Research fan-out
```
Main agent: "Research X across these 3 sources"
↓ Spawn 3 subagents, one per source
↓ Each subagent investigates its source
↓ Returns summary
↓ Main agent synthesizes
```
Multi-file refactor
```
Main agent: identify files needing changes
↓ Spawn subagent per file
↓ Each modifies its file
↓ Returns "done" or specific issues
↓ Main agent verifies
```
Code review with multiple aspects
```
Main agent: "Review this PR"
↓ Spawn:
- Subagent for security review
- Subagent for style review
- Subagent for test coverage
↓ Each returns its findings
↓ Main agent aggregates
```
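The review flow above can be sketched with a thread pool. Each reviewer function is a toy stand-in for a specialist subagent, and the checks inside them are placeholders rather than real review logic.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical reviewers; each stands in for a specialist subagent.
def security_review(diff: str) -> list[str]:
    return ["possible hardcoded secret"] if "password=" in diff else []

def style_review(diff: str) -> list[str]:
    too_long = any(len(line) > 100 for line in diff.splitlines())
    return ["line longer than 100 characters"] if too_long else []

def coverage_review(diff: str) -> list[str]:
    return [] if "def test_" in diff else ["no tests added"]

REVIEWERS = {"security": security_review, "style": style_review, "tests": coverage_review}

def review_pr(diff: str) -> dict[str, list[str]]:
    """Run every reviewer in parallel and aggregate findings by aspect."""
    with ThreadPoolExecutor(max_workers=len(REVIEWERS)) as pool:
        futures = {aspect: pool.submit(fn, diff) for aspect, fn in REVIEWERS.items()}
        return {aspect: future.result() for aspect, future in futures.items()}

findings = review_pr('db.connect(password="hunter2")')
print(findings)
```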
Parallel option exploration
```
Main agent: "Should we do A, B, or C?"
↓ Spawn subagent for each
↓ Each explores its option in depth
↓ Returns trade-offs
↓ Main agent recommends
```
Specific operational concerns
Tool access per subagent
Subagents have their own tool access. Configure per subagent type if needed.
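Conceptually, per-type tool access is an allowlist keyed by subagent type. The sketch below is a generic illustration; the type names, tool names, and `is_allowed` helper are all hypothetical and do not reflect any specific product's configuration format.

```python
# Hypothetical allowlist: subagent type -> tools it may call.
TOOL_ALLOWLIST: dict[str, set[str]] = {
    "researcher": {"read_file", "search"},
    "implementer": {"read_file", "edit_file", "run_tests"},
}

def is_allowed(subagent_type: str, tool: str) -> bool:
    """Unknown subagent types get no tools at all (deny by default)."""
    return tool in TOOL_ALLOWLIST.get(subagent_type, set())

print(is_allowed("researcher", "edit_file"))   # a read-only type cannot edit
print(is_allowed("implementer", "edit_file"))
```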
Cost per subagent
Each subagent has its own context and its own tokens, so multi-agent work uses more tokens than sequential work. That is often worth it when the speedup is real, but it is never free.
Coordination
Some patterns need subagents to coordinate. The Anthropic Agent SDK has some support; dedicated frameworks such as CrewAI and AutoGen provide more.
For most needs in Claude Code: simple fan-out without inter-subagent communication.
Result format
Subagents return text. Structured output makes synthesis easier:
```
Subagent reports:
- Key finding: X
- Supporting evidence: Y
- Recommended action: Z
```
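With a fixed report shape, the parent can parse results instead of rereading free text. A minimal parser for the three-field format above might look like this; the `SubagentReport` class and `parse_report` function are names invented for the sketch.

```python
from dataclasses import dataclass

@dataclass
class SubagentReport:
    key_finding: str
    supporting_evidence: str
    recommended_action: str

# Maps the labels used in the text report to dataclass field names.
FIELDS = {
    "Key finding": "key_finding",
    "Supporting evidence": "supporting_evidence",
    "Recommended action": "recommended_action",
}

def parse_report(text: str) -> SubagentReport:
    values = {}
    for line in text.splitlines():
        label, sep, value = line.lstrip("- ").partition(":")
        if sep and label.strip() in FIELDS:
            values[FIELDS[label.strip()]] = value.strip()
    missing = set(FIELDS.values()) - set(values)
    if missing:
        raise ValueError(f"report is missing fields: {sorted(missing)}")
    return SubagentReport(**values)

report = parse_report(
    "Subagent reports:\n"
    "- Key finding: X\n"
    "- Supporting evidence: Y\n"
    "- Recommended action: Z\n"
)
print(report)
```

Raising on missing fields also gives the parent a cheap signal that a subagent did not follow the requested format.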
Frameworks
Claude Code's Agent tool
Built-in support for spawning subagents. Each invocation specifies a subagent_type and a prompt.
Anthropic Agent SDK
For building multi-agent systems on the Claude API. Programmatic agent construction.
LangGraph
LangChain's multi-agent framework. Graph-based agent orchestration.
CrewAI
Multi-agent framework with role-based agents.
AutoGen (Microsoft)
Multi-agent framework with conversation patterns.
For Claude Code, the built-in Agent tool covers most needs. For more complex agent systems, use the SDK or one of the frameworks above.
Common failure patterns
Spawning subagents reflexively
Subagents for everything; even tiny tasks. Overhead exceeds benefit.
Sequential disguised as parallel
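Subagents called serially, one after another, when the goal was speed. No real parallelism, only added overhead. (Serial dispatch for context isolation is a different, legitimate use.)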
Subagents needing constant coordination
Defeats parallelism; turns into expensive serialization.
Result aggregation overhead
Subagents return huge results; parent agent spends most of its tokens parsing them.
No error handling
Subagent fails; parent doesn't notice; produces wrong result.
Recursion without bounds
Subagents spawning subagents spawning subagents. Coordination overhead explodes.
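A simple guard is to pass the current depth down and stop spawning at a fixed limit. In this sketch `run_agent` is a hypothetical agent that always splits its task in two; the log it returns shows how quickly the agent count grows with depth.

```python
def run_agent(task: str, depth: int = 0, max_depth: int = 1) -> list[str]:
    """Return a log of agents that ran.

    Agents at max_depth do the work themselves instead of spawning more subagents.
    """
    log = [f"{'  ' * depth}{task}"]
    if depth < max_depth:
        for i in (1, 2):  # hypothetical decomposition into two subtasks
            log += run_agent(f"{task}.{i}", depth + 1, max_depth)
    return log

for line in run_agent("root"):  # default: a single level of subagents
    print(line)

print(len(run_agent("root", max_depth=3)), "agents at three levels")
```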
A reasonable approach
For multi-agent design:
1. Identify genuinely parallel work
2. Use subagents for that
3. Keep tasks self-contained
4. Use concise result formats
5. Stick to a single level of subagents when possible
6. Measure: faster than sequential? worth the cost?
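For the measurement step, a small timing harness that runs the same tasks both ways is often enough. The sketch below measures wall-clock time only; `run_subagent` is a hypothetical stand-in whose sleep simulates work, and a real comparison would also record token usage.

```python
import asyncio
import time

async def run_subagent(task: str) -> str:
    """Hypothetical stand-in; the sleep simulates work."""
    await asyncio.sleep(0.05)
    return f"done: {task}"

async def sequential(tasks: list[str]) -> list[str]:
    return [await run_subagent(t) for t in tasks]

async def parallel(tasks: list[str]) -> list[str]:
    return list(await asyncio.gather(*(run_subagent(t) for t in tasks)))

def timed(coro_fn, tasks: list[str]) -> tuple[list[str], float]:
    start = time.perf_counter()
    results = asyncio.run(coro_fn(tasks))
    return results, time.perf_counter() - start

tasks = ["a", "b", "c"]
seq_results, seq_s = timed(sequential, tasks)
par_results, par_s = timed(parallel, tasks)
print(f"sequential {seq_s:.2f}s, parallel {par_s:.2f}s, same results: {seq_results == par_results}")
```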
Further Reading
- [CustomSkillsArchitecture](CustomSkillsArchitecture) — Skill basics
- [SkillComposition](SkillComposition) — Single-agent skill chains
- [TokenMetrics](TokenMetrics) — Cost measurement
- [AgenticAi Hub](AgenticAiHub) — Cluster index