Sub-agents & the swarm
How the lead agent delegates to specialized sub-agents that share its sandbox — and why that keeps it fast.
A thread's lead agent doesn't do everything itself. For anything heavy, it spawns sub-agents: focused specialists that each do one job and report back. This delegation is what forms the "swarm" inside a single thread.
The specialists
| Sub-agent | Job |
|---|---|
| Explorer | Maps the codebase and answers "where does X live?" so the lead doesn't spend its own context reading files. |
| UI driver | Opens the app in a real browser, drives a flow like a person, and records a video of it working. |
| Reviewer | Audits a diff for security and correctness before you integrate. |
| Worker | Implements a scoped task handed to it by the lead. |
| Knowledge | Reads the code and writes durable entries into the knowledge base. |
Two ways to delegate
- Blocking — the lead spawns a sub-agent, waits for it, and splices the result back into its own conversation. Used when it needs the answer to continue.
- Detached — the lead launches an agent and moves on. A detached agent can open a pull request that auto-merges on green CI, for work that doesn't need a human in the loop.
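The two modes map loosely onto "await the result" versus "start a task and keep going". The sketch below illustrates that shape with `asyncio`. It is not the product's API: `run_sub_agent`, the role names, and the task strings are all hypothetical stand-ins.

```python
import asyncio


async def run_sub_agent(role: str, task: str) -> str:
    """Hypothetical stand-in for a sub-agent; returns a short summary."""
    await asyncio.sleep(0)  # placeholder for the real work
    return f"[{role}] done: {task}"


async def lead_agent() -> tuple[list[str], asyncio.Task]:
    conversation: list[str] = []

    # Blocking: wait for the result, then splice it into the lead's conversation.
    summary = await run_sub_agent("explorer", "find where auth tokens are issued")
    conversation.append(summary)

    # Detached: launch the sub-agent and move on without waiting for it.
    detached = asyncio.create_task(
        run_sub_agent("worker", "bump dependency versions")
    )
    conversation.append("lead continues with its own work")

    return conversation, detached


async def main() -> tuple[list[str], str]:
    conversation, detached = await lead_agent()
    # Awaited here only so the demo exits cleanly; the lead itself never waits.
    detached_result = await detached
    return conversation, detached_result


conversation, detached_result = asyncio.run(main())
```

In the blocking case the summary lands in `conversation`; in the detached case the result never enters the lead's conversation at all.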
They share the thread's sandbox
Sharing a sandbox is the key difference between sub-agents and threads. Separate threads get separate sandboxes (that's the isolation boundary). Sub-agents inside one thread share the lead's sandbox, so they see the same live working tree: the uncommitted edits, the running dev server, and the logged-in browser session. There's no merge step between them.
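One way to picture the boundary is as object identity: every sub-agent in a thread holds a reference to the same sandbox, while each thread owns its own. The classes below (`Sandbox`, `SubAgent`, `Thread`) and the file name are illustrative assumptions, not the real data model.

```python
from dataclasses import dataclass, field


@dataclass
class Sandbox:
    # Hypothetical model of a working tree: path -> file contents.
    working_tree: dict[str, str] = field(default_factory=dict)


@dataclass
class SubAgent:
    role: str
    sandbox: Sandbox  # a reference to the lead's sandbox, not a copy


@dataclass
class Thread:
    # Each thread gets its own sandbox: the isolation boundary.
    sandbox: Sandbox = field(default_factory=Sandbox)

    def spawn(self, role: str) -> SubAgent:
        return SubAgent(role, self.sandbox)


thread_a = Thread()
thread_b = Thread()

worker = thread_a.spawn("worker")
reviewer = thread_a.spawn("reviewer")

# An uncommitted edit made by one sub-agent...
worker.sandbox.working_tree["app.py"] = "uncommitted edit"

# ...is visible to a sibling immediately, with no merge step,
seen_by_reviewer = reviewer.sandbox.working_tree.get("app.py")
# ...and invisible to a different thread.
seen_by_other_thread = thread_b.sandbox.working_tree.get("app.py")
```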
Why delegation matters
A sub-agent runs with a fresh, focused context — just its task and a short brief about the current state — not the lead's entire conversation. So a deep investigation or a long UI walkthrough costs the lead a short summary, not thousands of tokens of transcript. Pushing heavy work into sub-agents is the main thing that keeps a long-running thread from filling its context window.
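A rough accounting sketch makes the saving concrete. It assumes a crude token proxy (whitespace-separated words, which real tokenizers do not match), and every string in it, including the file paths in the summary, is made up for illustration.

```python
def rough_tokens(text: str) -> int:
    """Crude proxy for token count: whitespace-separated words."""
    return len(text.split())


def build_sub_agent_context(task: str, brief: str) -> list[str]:
    """A fresh context: only the brief and the task, not the lead's history."""
    return [brief, task]


# The lead's long-running conversation (hypothetical contents).
lead_context = ["user request"] + [f"earlier turn {i}" for i in range(200)]

sub_context = build_sub_agent_context(
    task="find where auth tokens are issued",
    brief="branch has uncommitted edits; dev server is running",
)

# What the sub-agent goes through while investigating (hypothetical).
transcript = [f"opened file {i} and read it" for i in range(500)]

# What it reports back to the lead (hypothetical paths).
summary = "Tokens are issued in auth/session.py; refresh logic lives in auth/refresh.py."

cost_if_inline = sum(rough_tokens(line) for line in transcript)
cost_delegated = rough_tokens(summary)
```

Under these assumptions the lead's context grows by `cost_delegated` rather than `cost_if_inline`, and the sub-agent starts from `sub_context` instead of inheriting `lead_context`.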
Because they share one working tree, fan-out has rules: read-only explorers and reviewers run safely in parallel, but the UI driver runs solo (it controls the one desktop), and you don't run two writers at once.
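These rules can be read as a small admission check. The sketch below is one interpretation, not the actual scheduler: it assumes "solo" means the UI driver runs with no other sub-agent at all, it allows a single writer alongside read-only agents, and it conservatively treats every role other than explorer and reviewer as a writer. The role identifiers are hypothetical.

```python
READ_ONLY = {"explorer", "reviewer"}  # safe to run in parallel
SOLO = "ui_driver"                    # controls the one desktop


def can_run_together(roles: list[str]) -> bool:
    """Return True if this set of sub-agents may run concurrently."""
    # Assumption: the UI driver must be the only sub-agent running.
    if SOLO in roles:
        return len(roles) == 1
    # Assumption: anything not known to be read-only counts as a writer.
    writers = [role for role in roles if role not in READ_ONLY]
    # Never two writers at once in the shared working tree.
    return len(writers) <= 1


parallel_readers = can_run_together(["explorer", "reviewer", "explorer"])
driver_with_company = can_run_together(["ui_driver", "explorer"])
two_writers = can_run_together(["worker", "worker"])
```

Here `parallel_readers` is `True`, while `driver_with_company` and `two_writers` are both `False`.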