Agent orchestration

Master multi-agent orchestration with Claude Code: parallel execution, task coordination, context sharing, and workflow management.

The art of multi-agent orchestration

Multi-agent orchestration is about combining multiple agents to accomplish complex tasks that no single agent could handle efficiently. This is the advanced level of Claude Code usage, the one that turns an intelligent assistant into a true automated development team.

The 4 orchestration patterns

1. Sequential pattern

The simplest pattern: agents execute one after another, each using the previous one's result as input.

# Sequential: each agent depends on the previous one
> Step 1: Use the planner agent to plan the refactoring
> Step 2: Use the tdd-guide agent to implement according to the plan
> Step 3: Use the code-reviewer agent to validate the code
> Step 4: Use the doc-updater agent to update the docs

When to use sequential?

Use this pattern when each step depends on the previous one's result. This is the typical development pipeline: plan → code → review → document. Simple, predictable, easy to debug.

Advantages:

  • Easy to understand and debug
  • Each step has clear context
  • Errors are easily traceable

Disadvantages:

  • Slow: steps cannot be parallelized
  • If one step fails, the entire pipeline stops
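The sequential pattern can be sketched in TypeScript, assuming each agent is modeled as a plain function from the previous step's output to its own. `Step`, `runSequential`, and the stub agents are hypothetical stand-ins for illustration, not Claude Code APIs.

```typescript
// Hypothetical model: an agent step turns the previous output into a new one.
type Step = { name: string; run: (input: string) => string };

type SequentialResult =
  | { ok: true; output: string; trace: string[] }
  | { ok: false; failedStep: string; error: string; trace: string[] };

// Runs steps in order; the first failure stops the whole pipeline.
// `trace` lists the steps that completed, which makes errors easy to locate.
function runSequential(steps: Step[], initial: string): SequentialResult {
  let current = initial;
  const trace: string[] = [];
  for (const step of steps) {
    try {
      current = step.run(current);
      trace.push(step.name);
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      return { ok: false, failedStep: step.name, error, trace };
    }
  }
  return { ok: true, output: current, trace };
}

// Stub agents standing in for planner, tdd-guide and code-reviewer.
const demoSteps: Step[] = [
  { name: "planner", run: (task) => `plan(${task})` },
  { name: "tdd-guide", run: (plan) => `code(${plan})` },
  { name: "code-reviewer", run: (code) => `review(${code})` },
];

const sequentialDemo = runSequential(demoSteps, "refactor auth");
console.log(sequentialDemo);
```

The nesting in the final output makes the dependency chain visible: the reviewer only ever sees what the implementer produced from the plan.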

2. Parallel pattern

Multiple agents work simultaneously on independent tasks, then their results are merged.

# Parallel: agents work at the same time
> Launch in parallel:
> - Agent security-reviewer: audit the auth module
> - Agent code-reviewer: review the API module
> - Agent e2e-runner: test the user journey
> Then synthesize the results from all three agents.

This pattern is ideal when tasks are independent. Claude Code can launch sub-agents simultaneously using the run in background feature.

# Conceptually, Claude Code does:
# 1. Launches 3 sub-agents in parallel (run_in_background: true)
# 2. Waits for all 3 to finish
# 3. Consolidates results into a single report

Advantages:

  • Much faster than sequential
  • Uses resources efficiently

Disadvantages:

  • Agents cannot depend on each other's results
  • Risk of conflicts if agents modify the same files
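As a rough TypeScript sketch of the launch, wait, consolidate sequence, assume each sub-agent is an async function that returns a text report. `AgentTask`, `runParallel`, and `synthesize` are hypothetical names, and the resolved promises stand in for real sub-agent calls.

```typescript
// Hypothetical model: each sub-agent is an async function returning a report.
type AgentTask = { name: string; run: () => Promise<string> };
type AgentOutcome = { name: string; ok: boolean; report: string };

// Starts every task at once and waits for all of them. It never rejects:
// a failing agent becomes an outcome with ok=false so the others still count.
async function runParallel(tasks: AgentTask[]): Promise<AgentOutcome[]> {
  return Promise.all(
    tasks.map(async (task): Promise<AgentOutcome> => {
      try {
        return { name: task.name, ok: true, report: await task.run() };
      } catch (err) {
        const report = err instanceof Error ? err.message : String(err);
        return { name: task.name, ok: false, report };
      }
    }),
  );
}

// Merges the individual reports into one summary, in launch order.
function synthesize(outcomes: AgentOutcome[]): string {
  return outcomes
    .map((o) => `[${o.ok ? "OK" : "FAILED"}] ${o.name}: ${o.report}`)
    .join("\n");
}

const parallelTasks: AgentTask[] = [
  { name: "security-reviewer", run: async () => "auth module: 1 issue" },
  { name: "code-reviewer", run: async () => "API module: clean" },
  { name: "e2e-runner", run: async () => { throw new Error("login flow timed out"); } },
];

const parallelDemo = runParallel(parallelTasks).then(synthesize);
parallelDemo.then((summary) => console.log(summary));
```

Catching errors per task is a design choice: one failed audit should not discard the reports of the agents that finished.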

3. Pipeline pattern

A pipeline combines sequential and parallel: some steps are parallelized, others are sequential.

# Complete release pipeline
> Execute this pipeline:
>
> Phase 1 (parallel):
> - Agent tdd-guide: verify all tests pass
> - Agent security-reviewer: security audit
>
> Phase 2 (sequential, after Phase 1):
> - Agent code-reviewer: final code review
>
> Phase 3 (parallel, after Phase 2):
> - Agent doc-updater: documentation update
> - Agent e2e-runner: end-to-end tests
>
> Phase 4 (sequential, after Phase 3):
> - Prepare the release tag and changelog

Phase 1: Parallel checks

Tests and the security audit are independent and can run in parallel. If either fails, the pipeline stops.

Phase 2: Sequential review

The review can only start once tests and security are validated. The reviewer needs to know the code is functional and safe.

Phase 3: Documentation and E2E

Documentation and E2E tests are independent. They can run in parallel after the review.

Phase 4: Release

Release preparation is only triggered if all previous steps are green.
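The gating between phases can be sketched as follows, assuming every check resolves to a simple green/red boolean. `Check`, `Phase`, and `runPipeline` are hypothetical names for illustration.

```typescript
// Hypothetical model: a check resolves to true (green) or false (red).
type Check = { name: string; run: () => Promise<boolean> };
type Phase = { label: string; checks: Check[] };
type PipelineReport = { completed: string[]; stoppedAt?: string; failed: string[] };

// Phases run one after another; checks inside a phase run concurrently.
// A phase with any red check stops the pipeline before the next phase starts.
async function runPipeline(phases: Phase[]): Promise<PipelineReport> {
  const completed: string[] = [];
  for (const phase of phases) {
    const results = await Promise.all(
      phase.checks.map(async (c) => ({ name: c.name, passed: await c.run() })),
    );
    const failed = results.filter((r) => !r.passed).map((r) => r.name);
    if (failed.length > 0) {
      return { completed, stoppedAt: phase.label, failed };
    }
    completed.push(phase.label);
  }
  return { completed, failed: [] };
}

const alwaysGreen = async () => true;
const alwaysRed = async () => false;

// Phase 1 has a red security audit, so Phase 2 must never start.
const releasePhases: Phase[] = [
  {
    label: "Phase 1",
    checks: [
      { name: "tdd-guide", run: alwaysGreen },
      { name: "security-reviewer", run: alwaysRed },
    ],
  },
  { label: "Phase 2", checks: [{ name: "code-reviewer", run: alwaysGreen }] },
];

const pipelineDemo = runPipeline(releasePhases);
pipelineDemo.then((report) => console.log(report));
```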

4. Split-role pattern (multi-perspective)

Multiple agents analyze the same subject from different angles, then a synthesizer agent combines the perspectives.

# Split-role: multiple perspectives on the same problem
> Analyze this PR from 4 different angles:
>
> Agent 1 (factual): Verify the code does what the PR says
> Agent 2 (senior): Evaluate quality and maintainability
> Agent 3 (security): Look for security flaws
> Agent 4 (consistency): Check consistency with the rest of the codebase
>
> Then synthesize the 4 analyses into a consolidated report.
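A minimal sketch of the split-role idea, assuming each perspective reports findings tagged with a severity. The `Perspective` and `Finding` shapes, and the three severity levels, are assumptions made for this example.

```typescript
type Severity = "CRITICAL" | "MEDIUM" | "LOW";
type Perspective = { role: string; focus: string };
type Finding = { perspective: string; severity: Severity; note: string };

const SEVERITY_ORDER: Severity[] = ["CRITICAL", "MEDIUM", "LOW"];

// One prompt per perspective, all aimed at the same subject.
function buildPerspectivePrompts(subject: string, perspectives: Perspective[]): string[] {
  return perspectives.map((p) => `[${p.role}] Analyze ${subject}. Focus: ${p.focus}.`);
}

// Synthesizer: lists findings from all perspectives, most severe first.
function synthesizeFindings(findings: Finding[]): string[] {
  const lines: string[] = [];
  for (const severity of SEVERITY_ORDER) {
    for (const f of findings) {
      if (f.severity === severity) {
        lines.push(`${severity} (${f.perspective}): ${f.note}`);
      }
    }
  }
  return lines;
}

const prPerspectives: Perspective[] = [
  { role: "factual", focus: "does the code do what the PR says" },
  { role: "senior", focus: "quality and maintainability" },
  { role: "security", focus: "security flaws" },
  { role: "consistency", focus: "consistency with the codebase" },
];
const prPrompts = buildPerspectivePrompts("the PR", prPerspectives);

// Made-up findings, only to show the ordering of the consolidated report.
const consolidated = synthesizeFindings([
  { perspective: "senior", severity: "LOW", note: "long function in parser" },
  { perspective: "security", severity: "CRITICAL", note: "unsanitized input" },
  { perspective: "consistency", severity: "MEDIUM", note: "naming differs from codebase" },
]);
console.log(prPrompts.join("\n"));
console.log(consolidated.join("\n"));
```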

Context management between agents

One of the major challenges of orchestration is context management. Each agent has its own context window, and information is not automatically shared.

Context passing strategies

# Strategy 1: Via files
Agent A writes its results to a file.
Agent B reads that file at the start of its mission.

# Strategy 2: Via the prompt
The orchestrator agent summarizes Agent A's result
and includes it in Agent B's prompt.

# Strategy 3: Via Git
Agent A commits its changes.
Agent B works on the same branch and sees the modifications.

Watch out for context overflow

Each sub-agent's result is returned into the main agent's context. If you launch too many sub-agents, or their results are too verbose, the main agent can hit its context window limit. Ask each sub-agent for a concise, structured result.
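Passing context via the prompt can be sketched like this, with a cap on each embedded result so one verbose agent cannot flood the next one. The 4-characters-per-token ratio is a rough heuristic assumed for this example, and `buildHandoffPrompt` is a hypothetical helper.

```typescript
// Assumption: ~4 characters per token, a rough heuristic for English text.
const CHARS_PER_TOKEN = 4;

type AgentResult = { agent: string; summary: string };

// Embeds earlier results in the next agent's prompt,
// truncating each one to a per-result token budget.
function buildHandoffPrompt(
  task: string,
  results: AgentResult[],
  maxTokensEach: number,
): string {
  const maxChars = maxTokensEach * CHARS_PER_TOKEN;
  const sections = results.map((r) => {
    const body =
      r.summary.length > maxChars
        ? r.summary.slice(0, maxChars) + " [truncated]"
        : r.summary;
    return `Result from ${r.agent}:\n${body}`;
  });
  return [`Task: ${task}`].concat(sections).join("\n\n");
}

const handoff = buildHandoffPrompt(
  "Implement the fixes",
  [
    { agent: "security-reviewer", summary: "1 CRITICAL (missing CSRF), 2 MEDIUM (rate limiting)." },
    { agent: "code-reviewer", summary: new Array(201).join("x") }, // 200-character filler
  ],
  25, // about 100 characters per result
);
console.log(handoff);
```

Blind truncation is the crudest option; asking the producing agent for a short structured summary in the first place usually loses less information.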

Worktrees for isolation

Git worktrees are the main isolation tool for multi-agent orchestration when agents write code. They let each agent work in its own checkout of the repository without risk of conflict.

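# Conceptually, Claude Code creates isolated worktrees.
# Each worktree gets its own branch (-b), because Git refuses to check out
# the same branch in two worktrees at once. Branch names are illustrative.

# Agent 1 works in /tmp/worktree-security
git worktree add -b agent/security /tmp/worktree-security main
# Agent 2 works in /tmp/worktree-tests
git worktree add -b agent/tests /tmp/worktree-tests main
# Agent 3 works in /tmp/worktree-docs
git worktree add -b agent/docs /tmp/worktree-docs main

# Each agent modifies its files without affecting the others
# At the end, the agent branches are merged back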

When to use worktrees?

| Situation | Worktree? | Reason |
|---|---|---|
| Agents that only read | No | No risk of conflict |
| Agents modifying different files | Optional | Low risk of conflict |
| Agents modifying the same files | Yes | High risk of conflict |
| Agents in parallel | Recommended | Guaranteed isolation |
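The first three rows of the table can be turned into a small decision helper. This is a sketch under the assumption that you know in advance which files each agent may modify; `AgentPlan`, `adviseWorktrees`, and the `agent/` branch prefix are hypothetical choices.

```typescript
type AgentPlan = { name: string; writes: string[] }; // paths the agent may modify
type WorktreeAdvice = "No" | "Optional" | "Yes";

// Read-only agents need no worktree, disjoint writers make it optional,
// and two agents writing the same path make it necessary.
function adviseWorktrees(agents: AgentPlan[]): WorktreeAdvice {
  const owner = new Map<string, string>();
  let anyWrites = false;
  for (const agent of agents) {
    for (const path of agent.writes) {
      anyWrites = true;
      const previous = owner.get(path);
      if (previous !== undefined && previous !== agent.name) return "Yes";
      owner.set(path, agent.name);
    }
  }
  return anyWrites ? "Optional" : "No";
}

// Builds one `git worktree add` command per agent, each on its own new branch.
function worktreeCommands(agents: AgentPlan[], baseDir: string, baseBranch: string): string[] {
  return agents.map(
    (a) => `git worktree add -b agent/${a.name} ${baseDir}/worktree-${a.name} ${baseBranch}`,
  );
}

const plannedAgents: AgentPlan[] = [
  { name: "security", writes: ["src/auth.ts"] },
  { name: "tests", writes: ["src/auth.ts", "tests/auth.test.ts"] },
];
console.log(adviseWorktrees(plannedAgents)); // "Yes": both touch src/auth.ts
console.log(worktreeCommands(plannedAgents, "/tmp", "main").join("\n"));
```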

Run in background

The run in background feature lets you launch sub-agents without blocking the main agent. This is essential for parallelization.

# Without background: forced sequential
# Agent A works... (60 seconds)
# Agent B works... (60 seconds)
# Total: 120 seconds

# With background: parallel
# Agent A works in background... (60 seconds)
# Agent B works in background... (60 seconds)
# Total: 60 seconds (both in parallel)

The main agent launches sub-agents in the background, continues its work, then retrieves results when they're ready.
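The arithmetic behind this comparison is a sum versus a maximum. A small sketch, assuming the agents do not slow each other down (no shared rate limit or CPU contention):

```typescript
// Total wall-clock time when agents run one after another.
function sequentialSeconds(durations: number[]): number {
  return durations.reduce((sum, d) => sum + d, 0);
}

// Total when all agents run in the background at once:
// the run is as long as its slowest agent.
function parallelSeconds(durations: number[]): number {
  return durations.reduce((longest, d) => Math.max(longest, d), 0);
}

const agentDurations = [60, 60];
console.log(sequentialSeconds(agentDurations), parallelSeconds(agentDurations)); // 120 60
```

With uneven durations such as 60, 90 and 30 seconds, the parallel run takes 90 seconds, so the gain depends on how balanced the tasks are.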

Best practices

1. Avoid context overflow

The golden rule: never use more than 80% of the context window for multi-agent operations. Keep a margin for corrections and adjustments.

# GOOD: Concise results
"The security audit found 3 issues:
1 CRITICAL (missing CSRF), 2 MEDIUM (rate limiting)."

# BAD: Verbose results
"I analyzed each file one by one. First auth.ts,
which contains 342 lines of code. Line 42 is
interesting because..." (500-line report)
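The 80% rule can be checked before launching sub-agents. In this sketch the window size and token counts are hypothetical inputs you would estimate yourself; `checkContextBudget` is not a Claude Code function.

```typescript
type BudgetCheck = { projected: number; limit: number; fits: boolean };

// Projects context usage after all pending sub-agent results come back
// and compares it with a ceiling (80% of the window by default).
function checkContextBudget(
  usedTokens: number,
  pendingResultTokens: number[],
  windowTokens: number,
  maxRatio = 0.8,
): BudgetCheck {
  const projected = pendingResultTokens.reduce((sum, t) => sum + t, usedTokens);
  const limit = Math.floor(windowTokens * maxRatio);
  return { projected, limit, fits: projected <= limit };
}

// Hypothetical numbers: a 200,000-token window with 100,000 tokens already used.
const conciseReports = checkContextBudget(100000, [500, 800, 600], 200000);
const verboseReports = checkContextBudget(100000, [25000, 30000, 20000], 200000);
console.log(conciseReports.fits, verboseReports.fits); // true false
```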

2. Avoid duplicate work

Clearly define each agent's responsibilities to prevent two agents from doing the same work.

# BAD: Overlap
Agent 1: "Review the code and check security"
Agent 2: "Check security and code quality"
# → Both do security = duplication

# GOOD: Distinct responsibilities
Agent 1: "Review code quality (readability, patterns, tests)"
Agent 2: "Security audit only (injection, XSS, secrets)"
# → Each in its own domain, no overlap
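Overlap is easier to catch if each agent's responsibilities are written down as explicit scope labels. A minimal sketch, where the labels and the `findOverlaps` helper are assumptions for illustration:

```typescript
type Assignment = { agent: string; scopes: string[] };
type Overlap = { scope: string; agents: string[] };

// Returns every scope claimed by more than one agent.
function findOverlaps(assignments: Assignment[]): Overlap[] {
  const owners = new Map<string, string[]>();
  const order: string[] = []; // scopes in first-seen order
  for (const a of assignments) {
    for (const scope of a.scopes) {
      const agents = owners.get(scope);
      if (agents === undefined) {
        owners.set(scope, [a.agent]);
        order.push(scope);
      } else if (agents.indexOf(a.agent) === -1) {
        agents.push(a.agent);
      }
    }
  }
  return order
    .map((scope) => ({ scope, agents: owners.get(scope) || [] }))
    .filter((o) => o.agents.length > 1);
}

const overlapping = findOverlaps([
  { agent: "Agent 1", scopes: ["code-review", "security"] },
  { agent: "Agent 2", scopes: ["security", "code-quality"] },
]);
const distinct = findOverlaps([
  { agent: "Agent 1", scopes: ["readability", "patterns", "tests"] },
  { agent: "Agent 2", scopes: ["injection", "XSS", "secrets"] },
]);
console.log(overlapping, distinct);
```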

3. Define success criteria

Each agent must know when its task is successfully completed.

## Success criteria for the testing agent
- All tests pass (exit code 0)
- Code coverage > 80%
- No flaky tests (rerun 3 times if a test fails)
- Coverage report generated in /coverage
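Success criteria are easiest to enforce when they are machine-checkable. The sketch below encodes the four criteria above against a hypothetical `TestRun` summary; how you collect those fields from your test runner is left open.

```typescript
type TestRun = {
  exitCode: number;
  coveragePercent: number;
  flakyTests: string[]; // tests whose result changed across reruns
  coverageReportPath: string | null; // null when no report was generated
};

// Returns the list of unmet criteria; an empty list means success.
function unmetCriteria(run: TestRun): string[] {
  const unmet: string[] = [];
  if (run.exitCode !== 0) unmet.push("tests did not pass");
  if (!(run.coveragePercent > 80)) unmet.push("coverage is not above 80%");
  if (run.flakyTests.length > 0) unmet.push("flaky tests: " + run.flakyTests.join(", "));
  if (run.coverageReportPath === null) unmet.push("coverage report missing");
  return unmet;
}

const goodRun: TestRun = {
  exitCode: 0,
  coveragePercent: 86.5,
  flakyTests: [],
  coverageReportPath: "/coverage",
};
const badRun: TestRun = {
  exitCode: 0,
  coveragePercent: 80,
  flakyTests: ["login.spec"],
  coverageReportPath: null,
};
console.log(unmetCriteria(goodRun), unmetCriteria(badRun));
```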

4. Plan for error handling

What happens if an agent fails? Define a fallback plan.

# Fallback plan
If the security-reviewer agent finds a CRITICAL issue:
→ Stop the pipeline
→ Notify the developer with the issue details
→ Do NOT continue to review or release

If the e2e-runner agent fails on a test:
→ Rerun the test 2 times (might be a flaky test)
→ If still failing, flag it and continue
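The fallback plan above can be written as two small policies. This is a sketch in which a test is modeled as a function returning pass or fail; `runWithReruns`, `afterSecurityAudit`, and `afterE2E` are hypothetical names.

```typescript
type Attempted = { passed: boolean; attempts: number };
type NextAction = "STOP" | "CONTINUE" | "CONTINUE_FLAGGED";

// Runs a check once, then reruns it up to maxReruns times while it keeps failing.
function runWithReruns(check: () => boolean, maxReruns: number): Attempted {
  let attempts = 0;
  while (attempts <= maxReruns) {
    attempts += 1;
    if (check()) return { passed: true, attempts };
  }
  return { passed: false, attempts };
}

// Security policy: any CRITICAL finding stops the pipeline.
function afterSecurityAudit(severities: string[]): NextAction {
  return severities.indexOf("CRITICAL") !== -1 ? "STOP" : "CONTINUE";
}

// E2E policy: two reruns, then flag the failure and continue.
function afterE2E(check: () => boolean): NextAction {
  return runWithReruns(check, 2).passed ? "CONTINUE" : "CONTINUE_FLAGGED";
}

// A flaky check that fails twice and passes on the third attempt.
let calls = 0;
const flakyCheck = () => {
  calls += 1;
  return calls >= 3;
};
const flakyOutcome = runWithReruns(flakyCheck, 2);
console.log(flakyOutcome); // { passed: true, attempts: 3 }
```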

Full example: release pipeline

Here's a prompt that orchestrates a complete release pipeline using all the patterns.

> Execute a release pipeline for version 2.3.0:
>
> 1. PLANNING (sequential)
> - Use the planner agent to list all changes
> since the last tag
>
> 2. CHECKS (parallel)
> - Agent tdd-guide: all tests pass, coverage 80%+
> - Agent security-reviewer: full security audit
> - Agent refactor-cleaner: no dead code introduced
>
> 3. REVIEW (split-role)
> - Quality perspective: clean and maintainable code
> - Performance perspective: no regressions
> - Consistency perspective: coherent with the codebase
>
> 4. DOCUMENTATION (parallel)
> - Agent doc-updater: update technical docs
> - Generate the changelog since the last tag
>
> 5. RELEASE (sequential)
> - If everything is green: create the v2.3.0 tag
> - Generate the release notes
>
> If a CRITICAL step fails, stop everything and give me
> a detailed report of the problem.

This pipeline combines all 4 orchestration patterns for a robust and automated release process.

Comparison with other multi-agent tools

Claude Code isn't the only tool offering agents. Here's how it compares to the main alternatives.

Claude Code vs Devin

Devin (Cognition AI) is an autonomous development agent that runs in a complete cloud environment (browser, terminal, editor).

| Criterion | Claude Code | Devin |
|---|---|---|
| Environment | Your local terminal | Cloud (dedicated VM) |
| Control | Full, you see every action | Autonomous, final result |
| Cost | Pay-as-you-go (tokens) | Monthly subscription |
| Customization | Custom agents, MCP, Skills | Limited to built-in capabilities |
| Collaboration | You stay in the loop | The agent works alone |
| Integration | Terminal, SDK, CI/CD | Web interface + GitHub PRs |

Claude Code favors control and customization. Devin favors full autonomy. For well-defined and repetitive tasks, Devin may be more practical. For day-to-day development with fine-grained control, Claude Code has the edge.

Claude Code vs Aider

Aider is an open-source pair-programming tool with LLMs, compatible with multiple models (GPT-4, Claude, etc.).

| Criterion | Claude Code | Aider |
|---|---|---|
| Models | Claude only (Haiku, Sonnet, Opus) | Multi-model (GPT-4, Claude, Gemini...) |
| Agents | Sub-agents, orchestration, SDK | No agent system |
| Ecosystem | MCP, Skills, Plugins | Limited to code editing |
| Mode | Interactive terminal + headless | Interactive terminal |
| Pricing | Included in Max/Pro subscription or API | Free (you pay for the API) |

Aider is excellent for simple pair-programming (editing code file by file). Claude Code goes further with multi-agent orchestration, MCPs for connecting external services, and the SDK for automation.

Claude Code vs CrewAI

CrewAI is a Python framework for orchestrating specialized AI agents.

| Criterion | Claude Code | CrewAI |
|---|---|---|
| Nature | Complete tool (terminal + SDK) | Python code framework |
| Agents | Built-in, ready to use | Must be built entirely |
| Models | Claude (optimized) | Multi-model |
| Setup | npm install and you're ready | Python project, code to write |
| Tools | Bash, Read, Edit, Grep, MCP... | Must integrate manually |
| Use cases | Software development | Any type of agent (marketing, research...) |

CrewAI offers more flexibility for building custom multi-agent systems in any domain. Claude Code is optimized for software development with ready-to-use tools. If your need is 100% development, Claude Code is more productive. If you're building agents outside of development, CrewAI offers more freedom.

Multi-agent architectures

Beyond orchestration patterns, two major architectures structure multi-agent systems.

Leader/worker architecture

A main agent (leader) coordinates multiple specialized agents (workers). The leader receives the request, breaks it into sub-tasks, and distributes them.

# Leader: the orchestrator agent
> You coordinate 3 workers for the "CSV export" feature.
> Break down the task and assign each part.
# Worker 1: Backend
# → Implement the /api/export endpoint
# Worker 2: Frontend
# → Add the export button to the UI
# Worker 3: Tests
# → Write E2E tests for the export flow

This is the default architecture in Claude Code when it uses sub-agents: the main agent is the leader, the sub-agents are the workers.

Strengths: centralized coordination, clear global view, easy to debug. Weaknesses: the leader is a single point of failure and consumes a lot of context, since every worker result flows back through it.
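The leader's distribution step can be sketched as a routing table, assuming each sub-task is tagged with an area and each worker covers exactly one area. `Subtask`, `WorkerAgent`, and `dispatchSubtasks` are hypothetical names.

```typescript
type Subtask = { area: string; description: string };
type WorkerAgent = { name: string; area: string };
type DispatchPlan = {
  assignments: { worker: string; task: string }[];
  unassigned: string[];
};

// Leader: matches each sub-task to the first worker that covers its area.
// Sub-tasks without a matching worker are reported instead of dropped.
function dispatchSubtasks(subtasks: Subtask[], workers: WorkerAgent[]): DispatchPlan {
  const plan: DispatchPlan = { assignments: [], unassigned: [] };
  for (const task of subtasks) {
    const match = workers.filter((w) => w.area === task.area)[0];
    if (match === undefined) {
      plan.unassigned.push(task.description);
    } else {
      plan.assignments.push({ worker: match.name, task: task.description });
    }
  }
  return plan;
}

const csvExportPlan = dispatchSubtasks(
  [
    { area: "backend", description: "Implement the /api/export endpoint" },
    { area: "frontend", description: "Add the export button to the UI" },
    { area: "tests", description: "Write E2E tests for the export flow" },
    { area: "docs", description: "Document the export format" },
  ],
  [
    { name: "Worker 1", area: "backend" },
    { name: "Worker 2", area: "frontend" },
    { name: "Worker 3", area: "tests" },
  ],
);
console.log(csvExportPlan);
```

The fourth sub-task is added here only to show the unassigned case; it is not part of the original CSV export breakdown.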

Peer-to-peer architecture

Agents communicate directly with each other without a central coordinator. Each agent knows its role and knows when to hand off.

# Agent Teams in peer-to-peer mode
# Each agent works and signals when done
# Other agents react to changes
# Developer agent: codes → signals "code ready"
# Tester agent: detects "code ready" → writes tests
# Reviewer agent: detects "tests written" → reviews everything

This architecture corresponds to Claude Code's Agent Teams mode (see Agent Teams). Each agent has its own session and communicates via files and Git state.

Strengths: no central bottleneck, more resilient. Weaknesses: more complex coordination, risk of conflicts, harder debugging.
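The hand-off chain can be simulated with a small signal queue. This sketch assumes each peer listens for one signal and emits one when done; the "task assigned" and "review done" signals are additions for the example, and real Agent Teams coordination through files and Git state is not modeled here.

```typescript
type Peer = { name: string; reactsTo: string; emits: string };

// Delivers signals in FIFO order; every peer listening for a signal handles it
// and emits its own. maxSteps guards against peers that trigger each other forever.
function runPeers(peers: Peer[], firstSignal: string, maxSteps = 20): string[] {
  const log: string[] = [];
  const queue: string[] = [firstSignal];
  let steps = 0;
  while (queue.length > 0 && steps < maxSteps) {
    const signal = queue.shift() as string;
    for (const peer of peers) {
      if (peer.reactsTo === signal) {
        log.push(`${peer.name}: "${signal}" -> "${peer.emits}"`);
        queue.push(peer.emits);
        steps += 1;
      }
    }
  }
  return log;
}

const teamPeers: Peer[] = [
  { name: "developer", reactsTo: "task assigned", emits: "code ready" },
  { name: "tester", reactsTo: "code ready", emits: "tests written" },
  { name: "reviewer", reactsTo: "tests written", emits: "review done" },
];
const handoffLog = runPeers(teamPeers, "task assigned");
console.log(handoffLog.join("\n"));
```

No component in this loop knows the whole workflow, which is both the resilience and the debugging difficulty mentioned above.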

CI/CD integration with agents

Agents integrate into your CI/CD pipelines to automate pre-merge checks.

GitHub Actions

# .github/workflows/agent-review.yml
name: Agent Review
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Setup Claude Code
        run: npm install -g @anthropic-ai/claude-code
      - name: Agent Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          set -o pipefail
          claude --print --max-turns 15 \
            "Do a complete review of this PR.
            Analyze the diff with git diff origin/main...HEAD.
            Produce a report with issues by severity.
            If you find a CRITICAL, end with EXIT_CODE=1." | tee review.txt
          # The model's text cannot set the step's exit status by itself,
          # so fail the job explicitly when the marker is present.
          if grep -q "EXIT_CODE=1" review.txt; then exit 1; fi

GitLab CI

# .gitlab-ci.yml
agent-security-audit:
  stage: review
  image: node:20-alpine
  before_script:
    - npm install -g @anthropic-ai/claude-code
  script:
    - |
      claude --print --max-turns 10 \
        "Security audit on the diff for this MR.
        Look for: SQL injections, XSS, hardcoded secrets,
        vulnerable dependencies. JSON format."
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"

Full pipeline with the SDK

For finer control, use the SDK in a Node.js script called by your CI.

// scripts/ci-review.ts
// Assumption: the package name, the claude() helper and the .text field
// follow this guide's example; check them against the SDK version you install.
import { claude } from "@anthropic-ai/claude-code-sdk";

async function ciReview() {
  // Phase 1: Security
  const security = await claude({
    prompt: "Security audit of the diff against main",
    options: { maxTurns: 10, allowedTools: ["Bash", "Read", "Grep"] },
  });

  // Phase 2: Tests
  const tests = await claude({
    prompt: "Verify that test coverage is > 80%",
    options: { maxTurns: 8, allowedTools: ["Bash", "Read"] },
  });

  // Consolidated result.
  // Caveat: substring checks are brittle ("No CRITICAL issues" also matches).
  const hasCritical = security.text.includes("CRITICAL");
  const lowCoverage = tests.text.includes("< 80%");
  if (hasCritical || lowCoverage) {
    console.error("Review failed:");
    if (hasCritical) console.error("- CRITICAL security issue");
    if (lowCoverage) console.error("- Insufficient coverage");
    process.exit(1);
  }
  console.log("Review OK");
}

// Fail the CI job if the script itself throws (network error, bad API key...).
ciReview().catch((err) => {
  console.error(err);
  process.exit(1);
});
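Matching on substrings such as "CRITICAL" is fragile: a report that says "No CRITICAL issues found" would also fail the build. One possible hardening, sketched below, is to ask each agent to end its answer with a machine-readable line and parse only that. The `VERDICT:` line format and both helper functions are hypothetical conventions for this example, not part of any SDK.

```typescript
type Verdict = { critical: number; coveragePercent: number };

// Reads the last line of the form `VERDICT: {...}` from an agent's output.
// Returns null when the line is missing or malformed.
function parseVerdict(output: string): Verdict | null {
  const marker = "VERDICT:";
  const candidates = output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.indexOf(marker) === 0);
  const last = candidates.pop();
  if (last === undefined) return null;
  try {
    const data: unknown = JSON.parse(last.slice(marker.length));
    if (typeof data !== "object" || data === null) return null;
    const record = data as Record<string, unknown>;
    const critical = record["critical"];
    const coveragePercent = record["coveragePercent"];
    if (typeof critical !== "number" || typeof coveragePercent !== "number") return null;
    return { critical, coveragePercent };
  } catch {
    return null; // malformed JSON
  }
}

// Missing or unreadable verdicts fail the build rather than pass silently.
function shouldFailBuild(verdict: Verdict | null): boolean {
  return verdict === null || verdict.critical > 0 || verdict.coveragePercent <= 80;
}

const agentOutput = [
  "No CRITICAL issues found in the diff.",
  'VERDICT: {"critical": 0, "coveragePercent": 85}',
].join("\n");
const parsedVerdict = parseVerdict(agentOutput);
console.log(parsedVerdict, shouldFailBuild(parsedVerdict));
```

Treating a missing verdict as a failure is a deliberate choice: an agent that ran out of turns should block the merge, not wave it through.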

Command, Agent, Skill: which one to use?

Claude Code offers three complementary mechanisms to structure your workflows. They don't do the same thing, and telling them apart will save you from building an Agent when a Skill would have been enough.

The comparison table

| Criterion | Skill (.claude/skills/) | Agent (.claude/agents/) | Command (.claude/commands/) |
|---|---|---|---|
| Trigger | Manual slash command (/skill) | Auto (Claude decides) or via Agent tool | Manual slash command (/project:cmd) |
| Context | Shared with the main session | Isolated (its own window) | Shared with the main session |
| Autonomy | Instructions followed by the orchestrator | Autonomous sub-agent, makes its own decisions | Instructions followed by the orchestrator |
| Persistence | No, loaded on demand | Memory possible (files, notes) | No, loaded on demand |
| Use case | Repetitive workflows, work recipes | Complex tasks that shouldn't pollute the main context | Project scripts shared across the team |

The decision tree

Before choosing, ask yourself these three questions in order.

I want to automate something. What exactly is it?

1. Is it a task I trigger myself, repeatedly?
   └─ Yes → Skill or Command (depending on whether it's personal or shared)
   └─ No  → next question

2. Is the task complex, and I want it to run without polluting
   my main context?
   └─ Yes → Agent (isolated context, autonomous)
   └─ No  → next question

3. Do I just want to give Claude permanent knowledge
   about this project?
   └─ Yes → CLAUDE.md (not an agent, not a skill: just context)

In practice, the guiding principle is simple: prefer the lightest mechanism that fits. A Skill is enough for most use cases; an Agent is the right choice when you need real isolation or autonomy.
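The three questions translate directly into a function. In this sketch, mapping "personal" to Skill and "shared" to Command is one reading of the comparison table, and the `Unclear` result is an addition for needs the tree does not cover.

```typescript
type Need = {
  triggeredManually: boolean; // question 1: do I trigger it myself, repeatedly?
  sharedWithTeam: boolean; // personal vs shared (only matters for question 1)
  needsIsolation: boolean; // question 2: complex, should not pollute main context
  isProjectKnowledge: boolean; // question 3: permanent knowledge about the project
};
type Mechanism = "Skill" | "Command" | "Agent" | "CLAUDE.md" | "Unclear";

// Asks the questions in order and stops at the first "yes".
function chooseMechanism(need: Need): Mechanism {
  if (need.triggeredManually) return need.sharedWithTeam ? "Command" : "Skill";
  if (need.needsIsolation) return "Agent";
  if (need.isProjectKnowledge) return "CLAUDE.md";
  return "Unclear"; // the tree gives no answer here
}

const preCommitCheck = chooseMechanism({
  triggeredManually: true,
  sharedWithTeam: false,
  needsIsolation: false,
  isProjectKnowledge: false,
});
console.log(preCommitCheck); // "Skill"
```

Because the questions are asked in order, a manually triggered task resolves to a Skill or Command even if it would also benefit from isolation, which matches the "lightest mechanism first" principle.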

The Command + Agent + Skill pattern

These three mechanisms work together. The most powerful combination looks like this.

User
 │
 └─ /project:pre-commit  ← Command (manual trigger)
         │
         ├─ Agent code-reviewer  ← Agent (isolated context, autonomous)
         │       │
         │       └─ Skill tdd-guide  ← Preloaded skill (domain knowledge)
         │
         └─ Skill changelog-format  ← Skill invoked inline to format output

The Command is the entry point: the user triggers it, it orchestrates the rest. The Agent handles the complex part in its own context. The Skill brings the specialized knowledge the agent needs.

The same need, three approaches

Let's take a concrete example: checking code quality before a commit. You can solve this with each of the three mechanisms. Here's how, and more importantly, when you'd pick one over the other.

Approach 1: a Skill /pre-commit

The simplest solution. A Markdown file in ~/.claude/skills/ that describes the steps to follow.

# Pre-commit Quality Check

You are a thorough code reviewer. Before every commit, check the following.

## Steps

1. Run tests: `npm test`
   - If a test fails, stop and explain the problem
2. Run lint: `npm run lint`
   - List errors by file and severity
3. Check TypeScript types: `npm run type-check`
4. Analyze the diff (`git diff --staged`) and look for:
   - Hardcoded secrets or tokens
   - Forgotten `console.log` calls
   - Unused imports

## Output format

For each issue found:
- **File**: path
- **Type**: TEST / LINT / TYPE / SECURITY
- **Severity**: BLOCKING / WARNING
- **Description**: what's wrong

If everything is clean: "Ready to commit."

# Usage
/user:pre-commit

When to choose this approach: for personal use, across any project. The Skill runs in your main context, so you see every step in real time. It is fast to create and easy to iterate on.

Limitation: if the check takes time or produces a lot of text, it fills up your context window.

Approach 2: an Agent code-reviewer

Same goal, but the work happens in an isolated context. The agent is more autonomous: it can re-run commands, fix minor issues on its own, and only presents you with a final report.

# Code Reviewer Agent

## Role

You are a senior code reviewer. You work autonomously to verify code
quality before a commit. You can fix minor issues (auto-fixable lint,
unused imports) without asking for confirmation.

## Available tools

- Bash (run commands)
- Read / Edit (read and fix files)
- Grep (search in code)

## Instructions

1. Get the staged diff: `git diff --staged`
2. Run tests: `npm test -- --passWithNoTests`
3. Run lint with auto-fix: `npm run lint -- --fix`
4. Check types: `npm run type-check`
5. Look for problematic patterns in modified files:
   - Secret regex: `(api_key|password|token)\s*=\s*['"][^'"]+['"]`
   - `console\.log` in non-test files
6. If BLOCKING issues remain, list them with suggested fixes.
   Otherwise, confirm the code is ready.

## Constraints

- Never commit yourself
- Only modify files already in the staged diff
- Keep the report concise: one line per issue maximum

# The agent is invoked automatically by Claude when the context calls for it,
# or explicitly:
> Use the code-reviewer agent on the current diff

When to choose this approach: when you want to fully delegate the check. The agent works in its own context while you keep working on something else. Ideal for large codebases where checks generate a lot of output.

Limitation: more setup time, less visibility into what's happening along the way.

Approach 3: a project Skill code-quality

Here the goal is not to run commands but to provide knowledge about the project's quality standards. This Skill will be read by other agents or invoked directly to get a contextual opinion.

# Project Quality Standards

## TypeScript rules

- No explicit `any`. Use `unknown` if the type is genuinely unknown.
- Interfaces prefixed with `I` are forbidden (convention: `type` or interface without prefix).
- Every public function must have a minimal JSDoc (description + `@param` + `@returns`).

## Testing rules

- Minimum coverage: 80% on branches.
- A file `utils/format.ts` must have a corresponding `utils/format.test.ts`.
- Mocks go in `__mocks__/`, never inline inside test files.

## Security

- No API keys in the code (use environment variables).
- API endpoints must validate inputs with Zod before processing.
- No `dangerouslySetInnerHTML` without an explicit review.

## Commit message

Format: `type(scope): description` (Conventional Commits).
Valid types: feat, fix, docs, chore, refactor, test, perf.

# Direct invocation for an opinion
/project:code-quality

# Or used as a reference in an agent prompt
> Following the standards defined in the code-quality skill,
> check this file: src/api/users.ts

When to choose this approach: when you want to centralize project rules and make them accessible to everyone (agents, developers, reviews). This Skill does nothing on its own — it's a source of truth. It pairs naturally with the code-reviewer agent, which can read it before starting work.

Summary of the three approaches

| | Skill /pre-commit | Agent code-reviewer | Skill code-quality |
|---|---|---|---|
| What it does | Runs the check | Runs and fixes autonomously | Documents the standards |
| Context | Main (visible) | Isolated (transparent) | Main or reference |
| Auto-fix | No | Yes (minor issues) | Not applicable |
| Setup time | 5 minutes | 15 minutes | 10 minutes |
| Combinable | Standalone or with others | Can read the code-quality Skill | Referenced by others |
| Best for | Quick personal use | Full delegation | Team standardization |

Next steps

You now master multi-agent orchestration. Continue learning with these related resources.