CLI, MCP, or code mode: the answer depends on who's running the sandbox

5 Apr 2026 · 4 min read

Two influential takes from the last six months. Simon Willison, October 2025: “My own interest in MCPs has waned ever since I started taking coding agents seriously. Almost everything I might achieve with an MCP can be handled by a CLI tool instead.” A month later: “I don’t use MCP at all any more when working with coding agents.” Cloudflare, February 2026: “The Cloudflare API has over 2,500 endpoints. Exposing each one as an MCP tool would consume over 2 million tokens. With Code Mode, we collapsed all of it into two tools and roughly 1,000 tokens of context.” Their new MCP server exposes search and execute. The agent writes JavaScript against the OpenAPI spec inside a V8 isolate, and the spec never enters the context window.

Both are credible. Both are widely cited. They appear to contradict each other. They do not. They are answering different questions.

The real question is where the sandbox lives and who owns the tools. MCP as a protocol is agnostic about both, and that is its weakness. Willison and Cloudflare have each picked a concrete answer, and their answers are right for their respective situations. Willison’s sandbox is a coding agent’s filesystem. He has Claude Code with shell access on a machine he controls, and the tools are CLIs installed on that machine. In this world, MCP is pure overhead. Cloudflare’s sandbox is a V8 isolate running on their edge. Their tools are 2,500 API endpoints that exist whether or not an agent is calling them. They need something that ships with an OpenAPI spec and can run the agent’s code close to the API. Neither is wrong. They are solving different problems.

Generalise this and you get three viable architectures for agent tool-use in 2026. The first is an agent-local CLI inventory, where the agent runs on a machine it trusts and tools are binaries on PATH. This is Willison’s world and the right answer for any individual developer with direct machine access. The second is server-side code mode, where the tool owner runs a sandbox and exposes two meta-tools, search and execute. The agent writes code against a typed API, the server runs it, and only the result crosses back. The third is a hosted sandbox with curated tools, where a third party provides a sandboxed Linux container with pre-installed CLIs wrapping approved APIs. The agent gets shell access inside the sandbox only, everything is logged, network egress is whitelisted, and credentials are short-lived. This is E2B, Modal, Amazon Bedrock AgentCore, and the architecture any regulated enterprise will actually land on.
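
To make the second architecture concrete, here is a minimal sketch of the two meta-tools. It is my own illustration, not Cloudflare's implementation: the spec entries, `fake_api`, and the `call` helper are all hypothetical, and Python's `exec` with trimmed builtins is not a security boundary. A real deployment would run the agent's code in an actual isolate.

```python
from typing import Any

# Hypothetical stand-in for a large OpenAPI spec. It stays on the server
# and is never sent to the agent in full.
SPEC = {
    "list_projects": {"method": "GET", "path": "/projects", "summary": "List projects"},
    "get_project": {"method": "GET", "path": "/projects/{id}", "summary": "Get one project"},
    "archive_project": {"method": "POST", "path": "/projects/{id}/archive", "summary": "Archive a project"},
}


def fake_api(operation: str, **params: Any) -> Any:
    """Hypothetical backend; a real server would issue the HTTP request here."""
    if operation == "list_projects":
        return [{"id": "p1", "name": "alpha"}, {"id": "p2", "name": "beta"}]
    if operation == "get_project":
        return {"id": params["id"], "name": "alpha"}
    raise KeyError(f"unknown operation: {operation}")


def search(query: str) -> list:
    """Meta-tool 1: return only the spec entries that match the query."""
    q = query.lower()
    return [
        {"operation": name, **entry}
        for name, entry in SPEC.items()
        if q in name.lower() or q in entry["summary"].lower()
    ]


def execute(code: str) -> Any:
    """Meta-tool 2: run agent-written code server-side and return only `result`.

    NOTE: exec() with restricted builtins is for illustration only and is
    NOT a sandbox.
    """
    scope = {"__builtins__": {"len": len, "sorted": sorted}, "call": fake_api}
    exec(code, scope)
    return scope.get("result")
```

An agent would first call `search("list")` to find the operation it needs, then send something like `execute("result = len(call('list_projects'))")`. Only the matching spec entry and the final value ever reach the context window.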

The decision criterion is clean. Agent runs locally on a machine you control: CLI inventory. You are a service provider with a large API surface: server-side code mode. Agent runs for users who cannot be given a shell: hosted sandbox.
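
Written as code, the criterion is a three-way branch. The sketch below is only a restatement of the paragraph above: the function and parameter names are mine, and checking the conditions in the order the article lists them is an assumption for the cases where more than one applies.

```python
from enum import Enum
from typing import Optional


class Architecture(Enum):
    CLI_INVENTORY = "agent-local CLI inventory"
    SERVER_SIDE_CODE_MODE = "server-side code mode"
    HOSTED_SANDBOX = "hosted sandbox with curated tools"


def choose_architecture(
    *,
    agent_on_machine_you_control: bool,
    provider_with_large_api: bool,
    users_cannot_have_shell: bool,
) -> Optional[Architecture]:
    """Map the three situations to the three architectures.

    Assumption: conditions are checked in the order the text states them,
    and the first match wins. Returns None when none of them applies.
    """
    if agent_on_machine_you_control:
        return Architecture.CLI_INVENTORY
    if provider_with_large_api:
        return Architecture.SERVER_SIDE_CODE_MODE
    if users_cannot_have_shell:
        return Architecture.HOSTED_SANDBOX
    return None
```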

MCP still has a job. All three architectures can use MCP as the transport. Cloudflare’s code-mode server is itself an MCP server. E2B has MCP adapters. CLI-first setups still use MCP for the handful of tools that genuinely benefit from it, like persistent browser sessions. What does not survive is the default from 2025: wrap everything in MCP, expose hundreds of tools upfront, trust that context windows will grow fast enough. The pattern replacing it in every case is progressive disclosure. Tools exist on disk or behind a search function. The agent discovers them when needed. The context window only sees what is relevant to the current task. Willison’s CLIs do this via help flags. Cloudflare does it via search. Hosted sandboxes do it via filesystem navigation. It is the same idea reached from three directions.
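
A small sketch shows why progressive disclosure matters for context. Everything in it is illustrative: the catalog is made up, and `rough_tokens` uses a crude four-characters-per-token heuristic that I am assuming for the comparison. It is not a real tokenizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


def rough_tokens(text: str) -> int:
    """Crude estimate (about 4 characters per token); an assumption only."""
    return max(1, len(text) // 4)


def render(tools) -> str:
    """Format tool entries the way they would be pasted into a prompt."""
    return "\n".join(f"{t.name}: {t.description}" for t in tools)


# Hypothetical catalog: 200 filler tools plus the one the task needs.
CATALOG = [
    Tool(f"tool_{i}", f"Hypothetical tool number {i} that does something specific")
    for i in range(200)
]
CATALOG.append(Tool("invoice_export", "Export invoices as CSV for a date range"))


def upfront(catalog) -> str:
    """The 2025 default: every tool description goes into context."""
    return render(catalog)


def discover(catalog, query: str) -> str:
    """Progressive disclosure: only entries matching the query are shown."""
    q = query.lower()
    return render(
        t for t in catalog if q in t.name.lower() or q in t.description.lower()
    )
```

Comparing `rough_tokens(upfront(CATALOG))` with `rough_tokens(discover(CATALOG, "invoice"))` gives the gap: the first grows with the size of the catalog, the second with the size of the task.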

If you are standing up agent infrastructure inside a regulated institution and someone hands you “just install these MCP servers,” you now have grounds to push back. The right shape is almost certainly a hosted sandbox with a curated CLI inventory wrapping your approved internal APIs. It gives you audit trails native to Linux, identity scoping via short-lived credentials, network isolation via VPC rules, and change control via standard CI/CD. None of this requires a new threat model, and your security team already understands all of it.
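
Two of those controls, short-lived credentials and restricted egress, are simple enough to sketch. This is an illustration of the idea rather than how any particular sandbox product does it: the host name, the five-minute default lifetime, and the function names are all assumptions, and in practice egress is enforced at the network layer, not in application code.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# Hypothetical allowlist of approved internal API hosts.
ALLOWED_HOSTS = frozenset({"api.internal.example"})


@dataclass(frozen=True)
class Credential:
    token: str
    expires_at: float  # Unix timestamp

    def is_valid(self, now: float) -> bool:
        """A credential is usable only strictly before its expiry time."""
        return now < self.expires_at


def issue_credential(ttl_seconds: int = 300, now: Optional[float] = None) -> Credential:
    """Mint a random token that expires after ttl_seconds (assumed default: 5 min)."""
    issued_at = time.time() if now is None else now
    return Credential(token=secrets.token_urlsafe(32), expires_at=issued_at + ttl_seconds)


def egress_allowed(url: str) -> bool:
    """Allow outbound requests only to hosts on the allowlist."""
    host = urlparse(url).hostname
    return host is not None and host in ALLOWED_HOSTS
```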

The twelve-month forecast I would bet on: MCP survives as the glue between agents and tools. Almost no enterprise directly exposes more than a handful of tools over raw MCP. The rest live behind either a code-mode gateway or a sandbox runtime.

Sources: Simon Willison, “Claude Skills are awesome” (October 2025). Anthropic, “Code execution with MCP” (November 2025). Cloudflare, “Code Mode” (February 2026).