Multi-Agent Architecture Patterns Skill

Multi-Agent Architecture Patterns

Distribute work across multiple LM instances w/ isolated context windows. Sub-agents exist to isolate context, not to anthropomorphize roles.

When to Activate

Single-agent context limits constrain task complexity
Tasks decompose into parallel subtasks
Different subtasks need different tools | system prompts
Building multi-domain agent systems

Core Concepts

Three patterns: supervisor/orchestrator (centralized), peer-to-peer/swarm (flexible handoffs), hierarchical (layered abstraction). Key principle: context isolation — sub-agents partition context, not simulate org roles. Requires explicit coordination protocols & consensus mechanisms avoiding sycophancy.

Token Economics

| Architecture | Token Multiplier | Use Case | |---|---|---| | Single agent | 1x | Simple queries | | Agent w/ tools | ~4x | Tool-using tasks | | Multi-agent | ~15x | Complex research/coordination |

BrowseComp: token usage explains 80% of performance variance. Model upgrades often outperform doubling token budgets — model selection & multi-agent architecture are complementary.

Parallelization

Tasks w/ independent subtasks: assign each to dedicated agent w/ fresh context. All work simultaneously -> total time approaches longest subtask, not sum of all.

Architectural Patterns

Pattern 1: Supervisor/Orchestrator

User Query -> Supervisor -> [Specialist, Specialist, Specialist] -> Aggregation -> Final Output

Use when: Clear decomposition, cross-domain coordination, human oversight needed Pros: Strict workflow control, easier human-in-the-loop Cons: Supervisor context bottleneck, cascade failures, "telephone game" problem

Telephone Game Fix: forward_message tool lets sub-agents pass responses directly to users:

def forward_message(message: str, to_user: bool = True):
    """Forward sub-agent response directly to user w/o supervisor synthesis."""
    if to_user:
        return {"type": "direct_response", "content": message}
    return {"type": "supervisor_input", "content": message}

Pattern 2: Peer-to-Peer/Swarm

Agents communicate directly via handoff mechanisms. No central control.

def transfer_to_agent_b():
    return agent_b  # Handoff via fn return

agent_a = Agent(name="Agent A", functions=[transfer_to_agent_b])

Use when: Flexible exploration, emergent requirements Pros: No single point of failure, scales for breadth-first exploration Cons: Coordination complexity grows w/ agent count, divergence risk

Pattern 3: Hierarchical

Strategy Layer (Goals) -> Planning Layer (Decomposition) -> Execution Layer (Atomic Tasks)

Use when: Large-scale projects, enterprise workflows, mixed high/low-level tasks Pros: Clear separation of concerns, different context per level Cons: Inter-layer coordination overhead, strategy-execution misalignment

Context Isolation

Primary purpose of multi-agent architecture. Three mechanisms:

Full context delegation: Planner shares entire context -> max capability but defeats isolation purpose
Instruction passing: Planner creates instructions via fn call -> maintains isolation, limits flexibility
File system memory: Agents read/write persistent storage -> shared state w/o context bloat, adds latency

Consensus & Coordination

Weighted voting: Weight by confidence/expertise (not simple majority)
Debate protocols: Adversarial critique > collaborative consensus for complex reasoning
Trigger-based intervention: Stall triggers (no progress), sycophancy triggers (mimicked answers)

Failure Modes

| Failure | Mitigation | |---|---| | Supervisor bottleneck | Output schema constraints, workers return distilled summaries, checkpointing | | Coordination overhead | Clear handoff protocols, batch results, async communication | | Divergence | Objective boundaries per agent, convergence checks, TTL limits | | Error propagation | Validate outputs before passing, retry w/ circuit breakers, idempotent ops |

Examples

Supervisor
├── Researcher (web search, doc retrieval)
├── Analyzer (data analysis, statistics)
├── Fact-checker (verification)
└── Writer (report generation)

def handle_customer_request(request):
    if request.type == "billing":
        return transfer_to(billing_agent)
    elif request.type == "technical":
        return transfer_to(technical_agent)
    elif request.type == "sales":
        return transfer_to(sales_agent)
    else:
        return handle_general(request)

Guidelines

Design for context isolation as primary benefit
Choose pattern by coordination needs, not org metaphor
Explicit handoff protocols w/ state passing
Weighted voting | debate for consensus
Monitor supervisor bottlenecks, use checkpointing
Validate outputs between agents
Set TTL limits to prevent infinite loops
Test failure scenarios explicitly

Integration

Builds on context-fundamentals & context-degradation:

memory-systems — shared state across agents
tool-design — tool specialization per agent
context-optimization — context partitioning strategies

References

LangGraph — Multi-agent patterns & state management
AutoGen — GroupChat & conversational patterns
CrewAI — Hierarchical agent processes
Multi-Agent Coordination Survey

Agent Skills: Multi-Agent Architecture Patterns

Install this agent skill to your local

Skill Files