Human DX optimizes for discoverability and forgiveness.
Agent DX optimizes for predictability and defense-in-depth.
These are different enough that retrofitting a human-first CLI for agents is a losing bet.
> The real question: what does it actually look like to build for this?
What was the not-so-real question? Or the surreal question?
I know it's becoming tiresome to complain about slop on HN. But folks! Put a bit of care into your writing! It is starting to look as if people had one more agent skill, "write blogpost", with predictable results. We are not a Python interpreter putting up with meh-to-disgusting code, but actual humans with real lives and a sense of taste in communication.
I just did the opposite and am seeing better results.
Claude was struggling to use the ‘gh’ command to reliably read and respond to code review line level comments because it had to use the api. I had it write a few simple command line tools and a skill for invoking it, instantly better results.
I love how AI gave the command-line and TUI interfaces a kind of Second Renaissance. It is not just AI that loves CLIs. It is especially blind people like me, who still use a lot of text-mode tools for their implicit accessibility. I gave codex a whirl recently, and hey! No accessibility problems at all. Just works. A few years back, that would have been released as a GUI-only program and would have locked me out completely[1]. A blessing that text oriented interaction is becoming important again!!!
1: Strictly speaking, there are ways to access some GUI programs on Linux with a screen reader. However, frankly, most are not really a joy to use. The speed of interaction I get from a TUI is simply unmatched. Whenever I work with a true GUI, no matter if Windows, Mac or Linux, it feels like I am trying to run away from a monster in a dream. I try to run, but all I manage to do is wobble about...
A second renaissance? The internet has been running on CLIs the entire time. All modern services that rely on containers rely on images based on CLIs. No renaissance is needed because it never stopped.
No. Nope. Agents do just fine with all sorts of CLIs. Old standards, new custom stuff, whatever.
The CLIs I’ve seen agents struggle with are those that wrap an enormous, unwieldy, poorly designed API under one namespace. All of Google Workspace apis, for example.
This feels completely speculative: there's no measure of whether this approach is actually effective.
Personally, I'm skeptical:
- Having the agent look up the JSON schemas and skills to use the CLI still dumps a lot of tokens into its context.
- Designing for AI agents over humans doesn't seem very future proof. Much of the world is still designed for humans, so the developers of agents are incentivized to make agents increasingly tolerate human design.
- This design is novel and may be fairly unfamiliar in the LLM's training data, so I'd imagine the agent would spend more tokens figuring this CLI out compared to a more traditional, human-centered CLI.
Yeah, people seem to forget one of the L's in LLM stands for Language, and human language is likely the largest chunk in training data.
A cli that is well designed for humans is well designed for agents too. The only difference is that you shouldn't dump pages of content that can pollute context needlessly. But then again, you probably shouldn't be dumping pages of content for humans either.
It's not obvious that human language is or should be the largest share of training data. It's much easier to generate training data from computers than from humans, and having more training data is very valuable. In particular, one could imagine creating a vast number of debugging problems, with logs and associated command outputs, and training on them.
Claude will load the name and description of each enabled skill into context at startup[0]; the LLM needs to know what it can invoke, after all. It's negligible for a few skills, but a hundred skills will likely have some impact, e.g. deemphasizing other skills by adding noise.
And for more persistent services, it's worth considering varlink, both for your agent's sake and for when you just need two CLI things to chat.
https://varlink.org/
The systemd universe is moving this way from dbus, and there doesn't seem to be a ton of protest against giving up dbus for json over unix sockets. There's really not that many protocols that are super pleasant for conversing with across sockets.
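To make "JSON over unix sockets" concrete, here is a minimal sketch of varlink-style framing: each message is one JSON object terminated by a NUL byte. The method name `org.example.notes.Ping` and the helpers `send_message`, `recv_message`, `handle_call`, and `demo` are hypothetical names made up for illustration, and this is not the varlink reference implementation. A socket pair stands in for a real listening socket so the example runs on its own.

```python
import json
import socket


def send_message(sock: socket.socket, message: dict) -> None:
    """Serialize one message as JSON and terminate it with a NUL byte."""
    sock.sendall(json.dumps(message).encode("utf-8") + b"\0")


def recv_message(sock: socket.socket) -> dict:
    """Read bytes until the NUL terminator, then decode the JSON object."""
    buffer = bytearray()
    while not buffer.endswith(b"\0"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("peer closed before a full message arrived")
        buffer += chunk
    return json.loads(buffer[:-1].decode("utf-8"))


def handle_call(call: dict) -> dict:
    """Hypothetical service: answers one made-up method, rejects the rest."""
    if call.get("method") == "org.example.notes.Ping":
        ping = call.get("parameters", {}).get("ping")
        return {"parameters": {"pong": ping}}
    return {
        "error": "org.varlink.service.MethodNotFound",
        "parameters": {"method": call.get("method")},
    }


def demo() -> dict:
    """Send one call across a connected socket pair and return the reply."""
    client, server = socket.socketpair()
    try:
        send_message(client, {
            "method": "org.example.notes.Ping",  # assumed method name
            "parameters": {"ping": "hello"},
        })
        send_message(server, handle_call(recv_message(server)))
        return recv_message(client)
    finally:
        client.close()
        server.close()
```

Calling `demo()` round-trips a single request and reply. A real service would listen on a socket path and loop over incoming connections instead.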
With all due respect, but if humans can figure out how new unseen programs work by using -h and seeing what options exist and what they do, I am sure robots can figure it out too, or else they weren’t that intelligent to begin with.
That's how artificial this "intelligence" is, when LLMs can't even use text-based tools full of coherently formatted text documentation without those very tools being adapted.
It looks like an AI generated fluff article without any evidence. People also did this for image generators as if you needed these arcane templates to prompt them, but actually the latest models are great at figuring out what you want from messy human input. Similarly LLMs can use regular CLI just fine. But how do you write a hype FOMO article about the fact that actually you don't need to do anything...
I write my tools for humans, without help or use of AI. If the AI agent wants to use my tool so bad, they need to rise to that level. I'll not crouch on my knees to meet it.
If I ever write a tool for AI interaction, I'd give it a well-defined API, to make it even easier for the agent.
I like CLI tools with json output that can be piped through jq. I've seen llms do that with existing tools.
The human needs and llm needs seem to overlap, especially if the human is using scripts and piping between tools.
The number of times it implies you didn't need to validate "human" input until llms arrived is scary too.
I'm also surprised to hear them say llms shouldn't Google, as that seems an area that Google themselves could optimize their search service for their llms and get an advantage.
Finally, I wonder if just using older, smaller llms is a valid fuzzing approach for clis (or anything else llms might be controlling). Or do you need a high powered llm trained to generate adversarial input?
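As an illustration of the overlap described above, here is a small sketch of a CLI that prints an aligned table by default and switches to JSON with a `--json` flag, so the same command works for a person at a terminal and for a pipeline. The tool name `my-tool`, the `ITEMS` data, and the `--status` filter are all hypothetical, not taken from any real tool.

```python
import argparse
import json
from typing import Optional

# Hypothetical data; a real tool would query a service or the filesystem.
ITEMS = [
    {"id": 1, "name": "alpha", "status": "open"},
    {"id": 2, "name": "beta", "status": "closed"},
]


def render(items: list, as_json: bool) -> str:
    """Return JSON for machines, or an aligned table for people."""
    if as_json:
        return json.dumps(items, indent=2)
    return "\n".join(
        f"{item['id']:>3}  {item['name']:<10} {item['status']}"
        for item in items
    )


def main(argv: Optional[list] = None) -> int:
    parser = argparse.ArgumentParser(prog="my-tool", description="List items.")
    parser.add_argument("--json", action="store_true",
                        help="emit JSON instead of a table")
    parser.add_argument("--status", choices=["open", "closed"],
                        help="only show items with this status")
    args = parser.parse_args(argv)
    # With no --status given, every item passes the filter.
    items = [i for i in ITEMS if args.status in (None, i["status"])]
    print(render(items, args.json))
    return 0
```

With `main()` wired up as the entry point, a script or an agent could run something like `my-tool --json | jq '.[].name'`, while a person would simply run `my-tool`.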
I strongly disagree here. Yes, build CLIs. No, don't target them at agents.
Build for humans, including good man pages or `--help` docs as needed.
If LLMs are worth the name AI, they will understand how to discover and use Unix-style commands. In my experience, this is exactly the case, and I need only say "I use tool X for use case Y."
If AI agents need CLIs, then what's stopping them from using APIs directly? I see CLIs as good wrappers over APIs, and nothing more. What will CLIs provide that `curl -X POST` can't or won't?
Context limits and context poisoning/rot/whatnot stop them. CLIs are a great way to make a focused context. But any other trick to filter out noise would also work.
John Carmack made this observation (cli-centred dev for agents) a year ago:
LLM assistants are going to be a good forcing function to make sure all app features are accessible from a textual interface as well as a gui. Yes, a strong enough AI can drive a gui, but it makes so much more sense to just make the gui a wrapper around a command line interface that an LLM can talk to directly.
Andrej Karpathy reiterated it a couple of weeks ago:
CLIs are super exciting precisely because they are a "legacy" technology, which means AI agents can natively and easily use them, combine them, interact with them via the entire terminal toolkit.
Thanks for sharing these. They are very interesting. I found "making all app features accessible from a textual interface..." quite challenging in certain domains, such as graphics-related editing tools. Many editing functions can be exposed properly through a CLI, but the content being edited is very hard to convert into text without losing its geometric meaning. Maybe this is where we truly need multimodal models, or training on specialized data.
I don't follow the need to write CLIs for the agent. Why not simply use the API and document it well? The token difference between using an API and a CLI is not that much, and models are trained to use REST APIs and understand their patterns, compared to your random CLI.
One thing I keep wondering about with agents is what happens once they start running real workflows.
Right now a lot of the focus is on getting them to execute tasks reliably. But the harder problem might be reconstructing what actually happened later, why a decision was made, what context the agent had, and how you audit or override it if something goes wrong.
Feels like autonomy is moving faster than the accountability layer around it.
I absolutely disagree. Tools should always be built for humans, not for machines! Believe it or not, I'm still one of those people who is actually capable of operating their own computer and doing meaningful work without any AI assistant (and I am proud of it). If that possibility or ability is taken away from me because the applications on my computer can only be operated through an AI, then I'm out and will start a career in woodworking or something.
[OP] justinwp | a day ago
---
dang | 16 hours ago
Google Workspace CLI - https://news.ycombinator.com/item?id=47255881 - March 2026 (136 comments)
smy20011 | 16 hours ago
Maybe asking the agent to write/execute code that wraps the CLI is a better solution.
tayo42 | 16 hours ago
CamperBob2 | 15 hours ago
Everything old is new again...
lejalv | 16 hours ago
redanddead | 16 hours ago
vasco | 15 hours ago
jeppeb | 16 hours ago
[OP] justinwp | 14 hours ago
peddling-brink | 14 hours ago
YMMV
computerfriend | 16 hours ago
I don't think this is true?
MattGaiser | 15 hours ago
You want me to hand type a file name? I’ll flip a letter or skip one!
bewuethr | 15 hours ago
esseph | 15 hours ago
devmor | 15 hours ago
If AI agents are so underdeveloped and useless that they can’t parse out CLI flags, then the answer is not to rewrite the CLI.
You either give the agents an API layer or you don’t use them because they’re not mature enough for the problem space.
lynx97 | 15 hours ago
utopiah | 15 hours ago
lynx97 | 14 hours ago
utopiah | 12 hours ago
lynx97 | 10 hours ago
rsanheim | 15 hours ago
danw1979 | 14 hours ago
I don’t disagree with your point about agent abilities with older, concise, well-represented-in-training-data tools though.
jsunderland323 | 15 hours ago
The pattern I used was this:
1) made a docs command that printed out the path of the available docs
$ my-cli docs
- README.md
- DOC1.md
- dir2/DOC2.md
2) added a --path flag to print out a specific doc (tried to keep each doc less than 400 lines).
$ my-cli docs --path dir2/DOC2.md
# Contents of DOC2.md
3) added embeddings so I could do semantic search
$ my-cli search "how do I install x?"
[1] DOC1.md
"You can install x by ..."
[2] dir2/DOC2.md
"after you install..."
You then just need a simple skill to tell the agent about the docs and search command.
I actually love this as a pattern, it works really well. I got it to work with i18n too.
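The three steps above could be sketched roughly as follows. This is not the commenter's implementation: the `DOCS` contents are invented placeholders, docs live in an in-memory dict rather than on disk, and a plain word-overlap score stands in for the embeddings-based semantic search so the example has no dependencies.

```python
import argparse
import re

# Hypothetical bundled docs; a real tool would read files shipped with the CLI.
DOCS = {
    "README.md": "Overview of my-cli and its commands.",
    "DOC1.md": "You can install x by running the installer script.",
    "dir2/DOC2.md": "After you install x, configure the workspace.",
}


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query: str, limit: int = 2) -> list:
    """Rank docs by words shared with the query (stand-in for embeddings)."""
    wanted = tokenize(query)
    scored = [(len(wanted & tokenize(body)), path) for path, body in DOCS.items()]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))  # best score, then path
    return [path for _, path in hits[:limit]]


def run(argv: list) -> str:
    """Parse the arguments and return the text the command would print."""
    parser = argparse.ArgumentParser(prog="my-cli")
    sub = parser.add_subparsers(dest="command", required=True)
    docs = sub.add_parser("docs", help="list docs, or print one with --path")
    docs.add_argument("--path", help="print this doc instead of the index")
    find = sub.add_parser("search", help="find docs relevant to a question")
    find.add_argument("query")
    args = parser.parse_args(argv)

    if args.command == "docs":
        if args.path is None:
            return "\n".join(f"- {path}" for path in DOCS)
        if args.path not in DOCS:
            parser.error(f"unknown doc: {args.path}")
        return DOCS[args.path]
    results = search(args.query)
    return "\n".join(
        f'[{rank}] {path}\n"{DOCS[path]}"'
        for rank, path in enumerate(results, 1)
    )
```

For example, `run(["docs"])` returns the index, `run(["docs", "--path", "dir2/DOC2.md"])` returns one document, and `run(["search", "how do I install x?"])` returns the ranked matches. Swapping `search` for a real embedding lookup would leave the command surface unchanged.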
danw1979 | 14 hours ago
jsunderland323 | 14 hours ago
Not shilling, just easier to show you the repo since it's open source. https://github.com/coast-guard/coasts
sheept | 15 hours ago
gck1 | 15 hours ago
Smaug123 | 14 hours ago
rkagerer | 14 hours ago
Is there progress happening in that trajectory?
Imustaskforhelp | 11 hours ago
There was a recent Hacker News post with a novel approach to making agents interact with GUIs/computer use:
https://news.ycombinator.com/item?id=47125014: The First Fully General Computer Action Model : https://si.inc/posts/fdm1/
Hope this helps
magospietato | 15 hours ago
sheept | 15 hours ago
> gws ships 100+ SKILL.md files
Which must altogether be hundreds of lines of YAML frontmatter polluting your context.
[OP] justinwp | 14 hours ago
danw1979 | 14 hours ago
sheept | 14 hours ago
[0]: https://platform.claude.com/docs/en/agents-and-tools/agent-s...
danw1979 | 10 hours ago
Also, the author specifically mentions OpenClaw in the example skill frontmatter, so I'm wondering if their workflow even involves CC.
jauntywundrkind | 15 hours ago
inkdust2021 | 15 hours ago
antisol | 15 hours ago
theshrike79 | 14 hours ago
danirod | 15 hours ago
theshrike79 | 14 hours ago
AIs don't. If they don't reach for the --help switch every time they'll attempt the statistical average, which may or may not work.
For super-common or popular tools like `gh` the usage is already in the training data though.
utopiah | 15 hours ago
bonoboTP | 14 hours ago
bayindirh | 14 hours ago
donpark | 14 hours ago
ZeroGravitas | 14 hours ago
danw1979 | 14 hours ago
I took away a completely different message: humans and LLMs make different mistakes that require different validation.
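One way to picture "different validation" is a strict check on an identifier argument before it is used. The comment does not say which mistakes it has in mind, so the specific checks below (control characters, percent-encoding, path traversal, stray query syntax) are assumptions chosen for illustration, and `validate_resource_id` and `RESOURCE_ID` are hypothetical names.

```python
import re
import unicodedata

# Assumed policy: ids are short and use only a conservative character set.
RESOURCE_ID = re.compile(r"[A-Za-z0-9_-]{1,64}")


def validate_resource_id(value: str) -> str:
    """Return value unchanged if it is a plain identifier, else raise ValueError."""
    # Reject invisible control characters such as newlines or NUL.
    if any(unicodedata.category(ch) == "Cc" for ch in value):
        raise ValueError("control characters are not allowed")
    # Reject input that was already URL-encoded by the caller.
    if "%" in value:
        raise ValueError("percent-encoded input is not allowed; pass the raw id")
    # Reject anything that could walk out of the intended namespace.
    if ".." in value or "/" in value or "\\" in value:
        raise ValueError("path separators and '..' are not allowed")
    # Everything else (spaces, '?', '&', empty strings) fails the allowlist.
    if not RESOURCE_ID.fullmatch(value):
        raise ValueError("id must be 1-64 characters from [A-Za-z0-9_-]")
    return value
```

The earlier checks exist mainly to give a specific error message; the allowlist at the end is what actually bounds the accepted input.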
948382828528 | 14 hours ago
climike | 14 hours ago
_heimdall | 14 hours ago
bingemaker | 14 hours ago
zihotki | 9 hours ago
lloydatkinson | 13 hours ago
mellosouls | 13 hours ago
LLM assistants are going to be a good forcing function to make sure all app features are accessible from a textual interface as well as a gui. Yes, a strong enough AI can drive a gui, but it makes so much more sense to just make the gui a wrapper around a command line interface that an LLM can talk to directly.
https://x.com/ID_AA_Carmack/status/1874124927130886501
https://xcancel.com/ID_AA_Carmack/status/1874124927130886501
Andrej Karpathy reiterated it a couple of weeks ago:
CLIs are super exciting precisely because they are a "legacy" technology, which means AI agents can natively and easily use them, combine them, interact with them via the entire terminal toolkit.
https://x.com/karpathy/status/2026360908398862478
https://xcancel.com/karpathy/status/2026360908398862478
lidn12 | 12 hours ago
resiros | 13 hours ago
impure | 12 hours ago
Them: Agents will write code 10x faster than a human.
Also them: You will have to dumb down your CLI for them to function properly.
Dansvidania | 11 hours ago
stratifyintel | 10 hours ago
pickleglitch | 7 hours ago
alexruf | 3 hours ago