Terry Li

I spent last weekend migrating a CLI from Click to cyclopts and fell down a research hole: what makes a CLI good for AI agents to invoke?

Three authors published independently on this in early 2026: Joel Hooks, Ugo Enyioha, and the OpenStatus team. They never cite each other, yet they converge on almost identical patterns. That convergence is the signal.

The seven patterns

1. JSON is the only output format. Not --json as a flag. JSON by default, always. Humans pipe through jq. This is the most opinionated pattern and the one most existing CLIs get wrong. The --json flag means agents must remember to pass it. Forgetting produces unparseable output that wastes a retry.

2. Every response is a HATEOAS envelope. Success:

{
  "ok": true,
  "command": "mytool status abc123",
  "result": {"id": "abc123", "state": "running"},
  "next_actions": [
    {"command": "mytool logs abc123", "description": "Fetch output logs"},
    {"command": "mytool cancel abc123", "description": "Cancel if stuck"}
  ]
}

The next_actions array is what makes this HATEOAS. The agent doesn’t need to know the full API upfront. Each response tells it what to do next. Roy Fielding’s constraint, applied to CLIs.

3. Errors include a fix field. Not just “what went wrong” but “what to do about it”:

{
  "ok": false,
  "command": "mytool status bad-id",
  "error": {"message": "Workflow not found", "code": "NOT_FOUND"},
  "fix": "Verify the ID with: mytool list",
  "next_actions": [{"command": "mytool list", "description": "List all workflows"}]
}

Machine-readable error codes. Human-readable fix strings. The agent doesn’t need to reason about recovery from scratch.
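To make the recovery path concrete, here is a minimal sketch of the agent side, assuming the envelope shape shown above. The function name next_command is hypothetical, and a real agent would likely weigh the descriptions rather than blindly take the first suggestion.

```python
import json
from typing import Optional


def next_command(raw_stdout: str) -> Optional[str]:
    """Pick a follow-up command using only what the envelope offers."""
    envelope = json.loads(raw_stdout)
    actions = envelope.get("next_actions", [])
    if not actions:
        return None
    # Simplification: take the first suggestion on success and failure alike.
    return actions[0]["command"]


# The error envelope from the example above, serialized as the CLI would emit it.
failure = json.dumps({
    "ok": False,
    "command": "mytool status bad-id",
    "error": {"message": "Workflow not found", "code": "NOT_FOUND"},
    "fix": "Verify the ID with: mytool list",
    "next_actions": [{"command": "mytool list", "description": "List all workflows"}],
})

print(next_command(failure))  # mytool list
```

The point of the sketch is that no recovery logic is hard-coded per command: the envelope carries it.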

4. Bare invocation returns the command tree. Running mytool with no arguments returns a JSON object describing every command, parameter, type, enum value, and default. One call and the agent knows the entire API surface. No --help parsing required.

5. Exit codes are a semantic contract. Not just 0 and 1. A defined vocabulary: 0 = ok, 1 = generic error, 2 = usage error, 3 = resource not found, 4 = permission denied, 5 = conflict. The agent’s control flow branches on the exit code before parsing the body.
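One way to pin that vocabulary down in code is an IntEnum shared by the CLI and its callers. This is only a sketch: the numeric values come from the list above, but the enum name, the member names, and the strategy labels in agent_strategy are all made up for illustration.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit code vocabulary from the pattern above (member names are assumed)."""
    OK = 0
    ERROR = 1
    USAGE = 2
    NOT_FOUND = 3
    PERMISSION_DENIED = 4
    CONFLICT = 5


def agent_strategy(returncode: int) -> str:
    """Map an exit code to a coarse next step (hypothetical agent-side policy)."""
    try:
        code = ExitCode(returncode)
    except ValueError:
        # Code outside the contract: fall back to reading the JSON body.
        return "inspect-body"
    strategies = {
        ExitCode.OK: "continue",
        ExitCode.USAGE: "reread-command-tree",
        ExitCode.NOT_FOUND: "list-resources",
        ExitCode.PERMISSION_DENIED: "escalate",
        ExitCode.CONFLICT: "refetch-state",
    }
    # The generic error (1) has no specific strategy, so it also falls through.
    return strategies.get(code, "inspect-body")
```

On the CLI side the same enum can be passed straight to sys.exit(), since IntEnum members are plain integers.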

6. NDJSON for streaming. Long-running operations emit newline-delimited JSON. Each line has a type discriminator. The last line is always the standard envelope. Tools that don’t understand streaming just read the last line. Backwards compatible by design.
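A rough sketch of both ends of that stream follows. The event types ("progress", "log", "result") and the helper names are assumptions for illustration. The only properties taken from the pattern are one JSON object per line, a type field on each line, and the envelope as the final line.

```python
import io
import json
from typing import IO, Iterable


def emit_stream(events: Iterable[dict], envelope: dict, out: IO[str]) -> None:
    """Write each event as one JSON line, then the envelope as the last line."""
    for event in events:
        out.write(json.dumps(event) + "\n")
    out.write(json.dumps(envelope) + "\n")


def final_envelope(stream_text: str) -> dict:
    """What a non-streaming consumer does: ignore everything but the last line."""
    lines = [line for line in stream_text.splitlines() if line.strip()]
    return json.loads(lines[-1])


buf = io.StringIO()
emit_stream(
    [
        {"type": "progress", "percent": 50},
        {"type": "log", "message": "halfway"},
    ],
    {
        "type": "result",  # assumed discriminator value for the envelope line
        "ok": True,
        "command": "mytool run abc123",
        "result": {"state": "done"},
        "next_actions": [],
    },
    buf,
)

print(final_envelope(buf.getvalue())["ok"])  # True
```

A streaming-aware consumer would instead iterate line by line and dispatch on the type field as each line arrives.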

7. TTY detection for dual mode. isatty() on stdout determines behavior. TTY = human mode (spinners to stderr, colors, progress bars). Pipe = agent mode (clean JSON, no decoration). One binary, two audiences. The OpenStatus team calls this “the single source of truth” pattern: the CLI command returns one dict, --json dumps it raw, human mode pretty-prints the same dict.
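In Python the check is a single isatty() call on the output stream. The sketch below assumes a hypothetical render function and a deliberately plain human format. The one idea it illustrates is that both modes are produced from the same dict.

```python
import io
import json
from typing import IO


def render(result: dict, out: IO[str]) -> None:
    """Write one result dict in human or agent mode, depending on the stream."""
    if out.isatty():
        # Human mode: simple key/value lines (colors and spinners omitted here).
        for key, value in result.items():
            out.write(f"{key}: {value}\n")
    else:
        # Agent mode: one clean JSON line, no decoration.
        out.write(json.dumps(result) + "\n")


# io.StringIO reports isatty() == False, so it behaves like a pipe.
piped = io.StringIO()
render({"id": "abc123", "state": "running"}, piped)
print(piped.getvalue(), end="")
```

In a real CLI the second argument would be sys.stdout, so the same call adapts to whether the output is a terminal or a pipe.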

What the agents themselves use

I checked the source code of six major AI coding agents:

| Agent | Language | Framework |
| --- | --- | --- |
| Claude Code | TypeScript | Custom |
| Codex CLI | Rust | clap |
| Gemini CLI | TypeScript | commander.js |
| Goose | Rust | Custom |
| Aider | Python | configargparse |
| OpenCode | Go | cobra |

None use Click, Typer, or cyclopts. But this is the agent harness, not the tools the agent invokes. The agent calls git, gh, rg, and your custom CLIs via subprocess. That’s where the framework choice matters.

Why nobody has built a framework for this

The envelope pattern is about 30 lines of helper code. ok(), err(), action(). Three functions. Most developers would copy-paste them rather than pip install a library.
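For a sense of scale, the three helpers might look roughly like this. The function names come from the paragraph above, and the field names follow the envelope examples earlier in the post. The exact signatures are my own guess, not anyone's published API.

```python
import json
from typing import Any, Optional


def action(command: str, description: str) -> dict:
    """One entry for the next_actions array."""
    return {"command": command, "description": description}


def ok(command: str, result: Any, next_actions: Optional[list] = None) -> str:
    """Serialize a success envelope."""
    return json.dumps({
        "ok": True,
        "command": command,
        "result": result,
        "next_actions": next_actions or [],
    })


def err(command: str, message: str, code: str, fix: str,
        next_actions: Optional[list] = None) -> str:
    """Serialize an error envelope with a machine code and a human fix string."""
    return json.dumps({
        "ok": False,
        "command": command,
        "error": {"message": message, "code": code},
        "fix": fix,
        "next_actions": next_actions or [],
    })


print(err(
    "mytool status bad-id",
    "Workflow not found",
    "NOT_FOUND",
    "Verify the ID with: mytool list",
    [action("mytool list", "List all workflows")],
))
```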

But the part that is genuinely framework-worthy is auto-generating the command tree and schema from type hints. My reference CLI has 200+ lines of manually maintained JSON describing every command, parameter, type, and enum value. That information already exists in the function signatures. A framework that introspects registered commands and generates the command tree eliminates that duplication.
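As a sketch of what that introspection could look like with only the standard library: the describe and command_tree helpers, the State enum, and the list_workflows command below are all hypothetical, and the output schema is an assumption, not a standard. A real layer would read the commands from the CLI framework's registry rather than from a hand-written list.

```python
import inspect
from enum import Enum
from typing import Callable, get_type_hints


class State(Enum):
    RUNNING = "running"
    DONE = "done"


def list_workflows(state: State = State.RUNNING, limit: int = 20) -> None:
    """List workflows filtered by state."""


def describe(func: Callable) -> dict:
    """Build a command description from a function's signature and type hints."""
    hints = get_type_hints(func)
    params = []
    for name, param in inspect.signature(func).parameters.items():
        hint = hints.get(name, str)  # unannotated parameters are treated as str
        entry = {"name": name, "type": getattr(hint, "__name__", str(hint))}
        if isinstance(hint, type) and issubclass(hint, Enum):
            entry["enum"] = [member.value for member in hint]
        if param.default is not inspect.Parameter.empty:
            default = param.default
            entry["default"] = default.value if isinstance(default, Enum) else default
        params.append(entry)
    return {
        "name": func.__name__.replace("_", "-"),
        "description": inspect.getdoc(func) or "",
        "parameters": params,
    }


def command_tree(commands: list) -> dict:
    """The JSON-serializable object a bare invocation could return."""
    return {"commands": [describe(func) for func in commands]}
```

Calling command_tree([list_workflows]) yields the parameter names, types, enum values, and defaults without any hand-maintained JSON, which is the duplication the paragraph above describes.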

The MCP angle also explains the gap. The people who care most about structured agent-tool interfaces have been building MCP servers, not CLIs. MCP gives you typed schemas, tool discovery, and structured responses by design. But as Enyioha notes: “I deleted three MCP servers in favor of direct CLI usage. Token cost: switching from MCP to CLI cut token usage by roughly 40%.”

The market for agent-facing CLIs is growing as people discover this cost difference.

The framework-agnostic layer

The right abstraction isn’t a replacement for Click or Typer. It’s a layer on top. The seven patterns don’t depend on which CLI parser you use. They depend on what comes out of stdout and how the agent discovers what’s available.

A thin library that provides envelope helpers, auto-generates command trees from any CLI framework’s registered commands, and handles NDJSON streaming would close the gap. The parsing framework is a commodity. The agent-facing contract is the value.

Sources