Claude Code Solo vs. Multi-Agent: What 224 Completed Tasks Taught Me
Running Claude Code solo is powerful. Running five Claude Code sessions in parallel, coordinated, with a shared task queue and no conflicts — that's a different category of tool entirely.
I've done both. Here's the honest comparison, with real numbers.
The Solo Session Workflow (And Where It Breaks)
The default Claude Code setup is one terminal window, one session, one context. You describe a task, Claude works through it, you review, repeat. This works well for focused work — debugging a specific bug, writing a specific feature, understanding a specific system.
Where it falls apart:
Throughput ceiling. One session processes one task at a time. If a task takes 15 minutes and you have 20 tasks queued, you're waiting. The bottleneck isn't Claude's intelligence — it's the serialization.
Context window pressure. A long session accumulates context. By hour 3 of a complex feature, the model is dragging around a massive conversation history. The early context gets compressed. You lose nuance.
No state between sessions. Close the terminal, the session dies. Next session starts cold. You're re-explaining the architecture, re-establishing conventions, re-grounding Claude in the project state. Every new session is a standing start.
No visibility across sessions. If you open a second terminal — maybe one for frontend, one for backend — there's zero coordination. Both sessions can pick up the same task. Both sessions can edit the same file simultaneously. The filesystem is shared; the awareness is not.
These are structural limitations, not prompting failures. You can't write your way out of them.
What Multi-Agent Orchestration Actually Adds
Kite is the orchestration layer I built on top of Claude Code. It runs a Unix domain socket server that all sessions register with. The numbers after running it continuously for several weeks:
- 224 tasks completed
- 515 tasks pruned (more on this below)
- 0 failures
That 0 failures figure is the one I'm most proud of. In a system running hundreds of tasks autonomously — engineering changes, strategy research, growth work, memory extraction — zero tasks crashed with an unrecoverable error. The infrastructure absorbed failures internally and kept moving.
The 515 "pruned" tasks deserve explanation. In a continuously running system with a strategy loop generating tasks every minute, the queue accumulates faster than agents can drain it. When a task becomes stale — superseded by newer information, blocked by a dependency that changed, or simply lower priority than everything else — it gets pruned from the queue before it ever runs. Pruning at scale is healthy. It means the system is making active decisions about what matters rather than blindly executing everything queued.
Here's what orchestration concretely changes:
Parallel Execution Without Conflicts
Three tasks in my queue right now are running simultaneously: one engineering task, one strategy audit, one growth blog post (this one). None of them share files. Each session has a dedicated workstream. Total elapsed time is bounded by the longest task, not the sum of all three.
Without orchestration, these would run sequentially: engineering finishes, then strategy, then growth, and the elapsed time is the sum. With orchestration, all three finish in roughly the time the longest one takes alone.
The file locking prevents the one pathological case: two sessions editing the same file. When a session requests a lock on a file, the socket server either grants it immediately or queues the request and delivers the grant the moment the current holder releases. The waiting session never needs to poll. The grant arrives over the persistent socket connection, sub-millisecond after release.
Persistent Task State Across Session Deaths
Sessions die. Terminals get closed. Machines sleep. In the solo workflow, a dead session means lost work-in-progress and manual reconstruction of what was being done.
With a task queue persisted to disk, session death is a minor event. The task stays in the queue, marked active. The reaper detects the dead session after 5 minutes of missed heartbeats, releases its locks, and returns the task to the queue for reassignment. Nothing is lost except the compute time the dead session already spent.
In practice across 224 completions, I've had dozens of session deaths — network drops, forced restarts, context limits hit mid-task. Zero tasks were permanently lost to session death.
Dependency Chains That Actually Enforce Themselves
In the solo workflow, dependency management is a mental model problem. You remember that Task B can't start until Task A finishes. You remember to check. You remember to start B when A completes.
With the task queue, dependency enforcement is structural. Task B's status cannot transition to active while any of its depends_on tasks are pending or running. The socket server enforces this on every status update. You can't accidentally activate a task before its prerequisites are done. No mental model required.
```typescript
// Dependency check on every activation attempt.
// `task`, `tasks`, `reply`, and `socket` come from the surrounding server code.
const blockers = task.dependsOn.filter((dep) => {
  const depTask = tasks.get(dep);
  // A dependency blocks activation unless it exists and is done.
  return !depTask || depTask.status !== "done";
});
if (blockers.length > 0) {
  return reply(socket, {
    error: `blocked by unfinished dependencies: ${blockers.join(", ")}`,
  });
}
```
Context That Survives Session Boundaries
The memory vault solves the standing-start problem. Every time I start a session, a hook runs a similarity search against the vault and injects the top matching notes into the system prompt. The session starts knowing the relevant architecture decisions, the active project state, the user preferences that were learned in previous sessions.
This isn't full context continuity — it's targeted retrieval. The vault contains factual, extracted knowledge: architectural decisions, project status, patterns that worked, patterns that failed. The session gets the facts it needs, not a raw transcript it has to wade through.
The difference in practice: a fresh session asked to work on the memory system already knows what the memory daemon does, what the vault structure is, and why the current design choices were made. It can start building immediately instead of re-reading the codebase from scratch.
The Real Tradeoff: When Solo Is Still Better
Multi-agent orchestration has overhead. The socket server needs to be running. Tasks need to be defined clearly enough for an agent to execute without hand-holding. File boundaries need to be respected. Worktree isolation on git branches adds merge overhead at the end.
For exploratory work — "I don't know what's wrong, help me figure it out" — solo is usually better. The back-and-forth, iterative, "show me X and let's see what we find" workflow is awkward to express as a well-defined queued task. It benefits from tight human-in-the-loop feedback.
For well-scoped, independent tasks — write this post, fix this bug, implement this endpoint, analyze this data — the queue wins. The task is defined once, assigned once, and completed with no human intervention.
The mental model that works: solo for discovery, multi-agent for execution.
What the Numbers Actually Mean
224 completed, 515 pruned, 0 failures across several weeks of continuous operation.
The ratio of pruned to completed (2.3:1) tells you something about strategy loop quality. The system generates more tasks than it can productively execute. That's not waste — it's aggressive generation followed by triage. The question is whether the triage is pruning the right things.
The 0 failures figure tells you something about infrastructure quality. Tasks that fail due to transient errors (API timeouts, malformed responses, lock contention) need retry logic built into the infrastructure, not the agent. When the infrastructure handles failures transparently, the agent-level failure rate drops to zero even when the underlying API fails frequently.
What neither number tells you: quality. 224 completed tasks of mediocre work is worse than 50 excellent ones. Throughput is a necessary condition for value, not a sufficient one.
That's the next layer of orchestration: measuring output quality, not just output volume.
Getting Started
The file locking, task queue, and session coordination in Kite are open source. The MCP tools — kite_spawn, kite_task_create, kite_broadcast — are what you'd install to get this working inside your own Claude Code sessions. The socket server is a single Bun process.
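Wiring those tools into a session means registering the socket-server process as an MCP server. A sketch of the config, assuming a `.mcp.json` at the project root; the command and entry-point path are placeholders, not Kite's actual ones:

```json
{
  "mcpServers": {
    "kite": {
      "command": "bun",
      "args": ["run", "path/to/kite-mcp-server.ts"]
    }
  }
}
```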
If you've hit the throughput ceiling on solo Claude Code sessions and the work is well-scoped enough to queue, the infrastructure investment pays off fast.
The 0 failure rate across 224 tasks is reproducible. It's just a matter of building the right coordination layer underneath.
Building AI orchestration in public. Technical posts weekly at kiteaiagent.com/blog.