> I tried to avoid writing this for a long time, but I'm convinced MCP provides no real-world benefit
IMO this is 100% correct and I'm glad someone finally said it. I run AI agents that control my entire dev workflow through shell commands and they are shockingly good at it. The agent figures out CLI flags it has never seen before just from --help output. Meanwhile every MCP server I've used has been a flaky process that needs babysitting.
The composability argument is the one that should end this debate tbh. You can pipe CLI output through jq, grep it, redirect to files - try doing that with MCP. You can't. You're stuck with whatever the MCP server decided to return, and if it's too verbose you're burning tokens for nothing.
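A toy pipeline illustrating the kind of post-processing a CLI gives you for free (the log file and its contents are made up for the example) - filtering and counting output before any of it reaches the model's context:

```shell
# Generate some fake tool output, then filter it down with standard
# POSIX tools so only the relevant lines would be fed to the agent.
printf 'ERROR disk full\nINFO ok\nERROR timeout\n' > app.log

grep '^ERROR' app.log      # just the two error lines
grep -c '^ERROR' app.log   # or only their count
```

With an MCP tool, whatever the server returns is what lands in context; with a pipe, the agent (or you) can trim it first.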
> companies scrambled to ship MCP servers as proof they were "AI first"
FWIW this is the real story. MCP adoption is a marketing signal, not a technical one. 242% growth in MCP servers means nothing if most of them are worse than the CLIs that already existed.
MCP blew up in 2024, before terminal agents (Claude Code) blew up in early 2025. The story isn't "MCP was a fake marketing thing pushed on us". It's a story of how quickly the meta evolves. These frameworks are discovered!
> The story isn’t “MCP was a fake marketing thing pushed on us”. It’s a story of how quickly the meta evolves.
The original take was that "We need to make tools which an AI can hold, because they don't have fingers" (like a quick-switch on a CNC mill).
My $job has been generating code for MCP calls, because we found that MCP is not a good way to take actions from a model, because it is hard to make it scriptable.
It definitely does a good job of progressively filling a context window, but changing things is often multiple operations + a transactional commit (or rename) on success.
We went from using a model to "change this" vs "write me a reusable script to change this" & running it with the right auth tokens.
Strongly disagree, despite that meaning I'm swimming upstream here.
Unlike CLI flags, with MCP I can tune the descriptions for the tool more easily (for my own MCPs at least) than a CLI flag. You can only put so much in a CLI --help output. The error handling and debuggability are also nicer.
Heck, I would even favor writing an MCP tool to wrap CLI commands. It's easier for me to ensure dangerous flags or parameters aren't used, and to ensure concrete restrictions and checks are in place. If you control the CLI tools it isn't as bad, but if you don't, and it isn't a well-known CLI tool, the agent might need help with things like vague error messages.
MCP is more like "REST" or "gRPC"; at the simplest level just think of it as a wrapper.
You mentioned redirecting to files; if the output is too much that way, you'll still burn tokens. But with MCP, if the output is too much you can count the tokens and limit it, or, better yet, you can paginate so the agent gets some results, sees how many there are, and either re-runs the tool with params that will yield fewer results, or consumes the results page by page.
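A minimal sketch of that pagination idea - the tool name, backend, and response shape below are all hypothetical, but this is the pattern: return one page plus a total, and let the agent decide whether to fetch more or narrow the query.

```python
# Hypothetical MCP-style tool that paginates a large result set instead
# of dumping everything into the context window at once.
def search_issues(query, page=1, page_size=5):
    # Stand-in for a real backend query; 22 fake matches.
    all_results = [f"{query}-issue-{i}" for i in range(1, 23)]
    start = (page - 1) * page_size
    return {
        "total": len(all_results),          # agent sees how big the set is
        "page": page,
        "results": all_results[start:start + page_size],
    }

first = search_issues("login-bug")
print(first["total"], len(first["results"]))  # 22 total, only 5 in context
```

The agent reads `total`, then either pages through or re-runs with a tighter query - neither of which a fixed blob of CLI output lets it do without re-executing the whole command.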
This is what I've been working on. I've written a project wrapper CLI that has a consistent interface that wraps a bunch of tools. The reason I wrote the CLI wrapper is for consistency. I wrote a skill that states when and how to call the CLI. AI agents are frequently inconsistent with how they will call something. There are some things I want executed in a consistent and controlled way.
It is also easier to write and debug CLI tooling, and other human devs get to benefit from the CLI tools. MCP includes agent instructions of how to use it, but the same can be done with skills or AGENTS.md (CLAUDE.md) for CLI.
that's what the MCP server is, except I don't always want a cli.
If I need to call an API on top of a CLI tool, I don't have to have a second wrapper, or extend my existing wrapper. You're suggesting I recreate everything MCP does, just so... it's my own?
MCP is just a way to use wrappers other people have built, and to easily manage wrapping "tools" - those could be CLI tools, API calls, database queries, etc.
CLI tools aren't aware of the context window either; they're not keeping track of it. I might want my CLI tool to output lots of useful text, but maybe I don't want some of that going to the LLM, to save on tokens. Sure, I could create another CLI tool to wrap my CLI tool, but now I have two CLI tools to maintain. I'd prefer to do all the wrapping and pre-LLM cleanup in one consistent place. The instructions for the LLM letting it know what tools, parameters, etc. are available are also defined in a consistent way, instead of me inventing my own scheme. I'd rather just focus on getting a usable agent.
I don't get the issue people in this thread have with MCP; is there some burden about it I haven't run into? It's pretty easy to set one up.
It does not; your MCP server can be a small Python file, and your agent would execute it as a process and use stdio to communicate with it. You can also run it as an HTTP server, but if it's all on the same machine, I don't see the point. I'm pretty sure that in under 15 lines of Python you can wrap subprocess.check_output as a stdio MCP server, for example, to let your agent run any command or a specific command.
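A stdlib-only sketch of the dispatch at the heart of such a server - a real one would use the official `mcp` SDK and speak full JSON-RPC over stdin/stdout, so treat the method names and response shape here as illustrative:

```python
import json
import subprocess

# One hypothetical tool: run a shell command and return its output.
TOOLS = [{"name": "run_command", "description": "Run a shell command"}]

def handle(request):
    """Dispatch one MCP-style request dict to a response dict."""
    if request["method"] == "tools/list":
        return {"tools": TOOLS}
    if request["method"] == "tools/call":
        cmd = request["params"]["arguments"]["command"]
        out = subprocess.check_output(cmd, shell=True, text=True)
        return {"content": [{"type": "text", "text": out}]}
    return {"error": "unknown method"}

# The stdio loop would be one JSON request per line in, one response out:
# for line in sys.stdin:
#     print(json.dumps(handle(json.loads(line))), flush=True)
```

The agent launches this as a subprocess and talks to it over stdio - which is also why, as noted above, an HTTP server is overkill when everything is on one machine.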
Because that is a consistent and reliable way of doing it? What happens when I have to use something that can't be done via CLI, or if I have lots of small use cases (like I sometimes do with MCP servers - lots of tiny functions)? Do I create a separate readme for each of them and manage the mess? What exactly is the issue with MCP? Is it too well organized?
I mean, technically I could be using CLI tools to browse HN as well, I guess. curl would do fine, I suppose, but that'd be too annoying. Why not use the best tool for the task? As far as I'm concerned, a stdio MCP server is a CLI tool; it just happens to be an integration layer that can either run other CLI tools or do other things as it makes sense.
And FFS! I know jq can do wonderful things, but I'd seriously question anyone's competency if you're building a production code base that relies on a tangled mess of piped jq commands when you could just write a Python function to parse, validate and process the content. And don't get me started on the risks of letting an agent run commands unchecked. What happens when your agent is running your CLI tool with user-input arguments and you forgot to make sure command injection won't be a thing? That can happen with MCP as well, but in many cases you shouldn't just run CLI commands; you would call libraries and APIs, or process data files directly instead. You wouldn't call the sqlite3 command when you can just use the library/module.
There are certainly things that can't be done via CLI, or that are more suitable for a persistent daemon with various RPCs rather than a CLI. But most things are simpler than that, and MCP is overcomplicating it.
MCP does not make things more organized. Everything is a file, and the filesystem is mature infrastructure that we can trust. I don't see how MCP can be more organized than that.
curl is a great example of what CLI can do. Is there really a better way than curl for AI to browse HN?
Of course we should use Python or other sane scripts rather than shell to process JSON in production, but there is no need to hide the script in an MCP server. Also I don't see how it's easier to sandbox an MCP server than the CLI tools.
Maybe I don't understand how other people are using MCP; if it is for code-generation agents, I can't speak to that. But for your own agent code, for me at least, an MCP server is much easier to use than running commands directly.
> Is there really a better way than curl for AI to browse HN?
Yes, curl can't render the DOM and HN requires captcha when registering. WebMCP is a better way!
If you do that, you end up with all the problems that MCP attempts to solve: how to authorize using a standard mechanism, how to summarize operations in a better way than just dumping the OpenAPI spec on the LLM, providing structured input/output, providing answers to users that the LLM cannot see (for sensitive data or just avoiding polluting the context) and so on.
Authorization, in my opinion, is the big problem you need MCP for, though the current MCP Authorization spec still needs some refinement.
I avoid most MCPs. They tend to take more context than getting the LLM to script and ingest outputs. Trying to use the Jira MCP was a mess; way better to have the LLM hit the API, figure out our custom schemas, then write a couple of scripts to do exactly what I need to do. Now those scripts are reusable, and way less context is used.
I don't know, to me it seems like the LLM cli tools are the current pinnacle. All the LLM companies are throwing a ton of shit at the wall to see what else they can get to stick.
For Jira/Confluence, I also struggled with their MCPs. Jira's MCP was hit or miss and Confluence's never worked for me.
We don’t use the cloud versions, so not sure if they work better with cloud.
On the other hand, I found some unofficial CLIs for both and they work great.
I wrote a small skill just to give enough detail about how to format Epics, Stories, etc., and then some guidance on formatting content, and I can get the agent to do anything I need with them.
I deal with a ton of different Atlassian instances, and the most infuriating thing to me about the MCP configuration is that Atlassian really thinks you should only have one Atlassian instance to auth against. Their MCP auth window takes you to a webpage where you can't see which thing you are authorizing against, forcing you to paste the login page URL into an incognito window. Pretty half-baked implementation.
I noticed that it's better for some things than others. It's pretty bad at working with Confluence - it just eats tokens - but if you lay out a roadmap you want created or updated in Jira, it's pretty good at that.
I have had some positive experiences using the Jira and Confluence MCPs. However, I use a third-party MCP because my company has a data centre deployment of Jira and Confluence, which the official Atlassian MCP does not support.
My use case was for using it as an advanced search tool rather than for creating tickets or documentation. Considering how poor the Confluence search function is, the results from Confluence via an MCP-powered search are remarkably good. I was able to solve one or two obscure, company-specific issues purely by using the MCP search, and I'm convinced that finding these pages would have been almost impossible without it.
MCP servers were also created at a time when AI and LLMs were less developed and capable in many ways.
It always seemed weird that we'd want to post-train on MCP servers when I'm sure we have a lot of data on using CLI and shell commands to improve tool calling.
It's telling that the best MCP implementations are those which are a CLI to handle the auth flow, then allow the agent to invoke it and return results to stdout.
But even those are not better for agent use than the human cli counterpart.
How do you segregate the CLI interface the LLM sees versus a human? For example if you’d like the LLM to only have access to read but not write data. One obvious fix is to put this at the authz layer. But it can be ergonomic to use MCP in this case.
I've been running Claude Code in a Docker Compose environment with two containers - one without Claude that has all the credentials set up, and a Claude container which transparently executes commands via ssh. The auth container then has wrappers which explicitly allow certain subcommands (e.g. `gh api` isn't allowed). The `gh` command in the Claude container is just a wrapper script which basically runs `ssh auth-container gh-wrapper`.
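A minimal sketch of that wrapper idea, assuming a host named `auth-container` and an allowlist of `gh` subcommands (both hypothetical; the real forwarding line is left commented so the sketch is self-contained):

```shell
# Installed as `gh` on the Claude container's PATH. Allowlisted
# subcommands get forwarded over ssh; everything else is refused,
# so e.g. `gh api` never reaches the credentials.
gh() {
  case "$1" in
    pr|issue|repo)
      # ssh auth-container gh-wrapper "$@"   # real forwarding step
      echo "forwarded: gh $*"
      ;;
    *)
      echo "blocked: gh $1" >&2
      return 1
      ;;
  esac
}

gh pr list
```

The key property is that the allowlist lives on the side that holds the credentials, so Claude can't route around it by editing its own wrapper.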
Lots of manual, opinionated stuff in here, but it prevents Claude from even accessing the credentials and limits what it can do with them.
I've been testing with an ENV variable for a CLI tool for LLMs that I'm making. Basically, I have a script that sets an ENV variable to launch the TUI that I want, and that ENV variable changes the behavior for LLMs if they run it (changes the -h output, changes the default output format to JSON to make it easier to grep).
None of those work when dealing with external services, I wouldn’t even trust them as a solution for dealing with access to a database. It seems like the pushback against MCPs is based on their application to problems like filesystem access, but I’d say there are plenty of cases in which they are useful or can be a tool used to solve a problem.
Interestingly, I think I just came to the opposite conclusion after building CLIs + MCPs for code.deepline.com.
Where MCPs fit in - the short answer is enterprise auth for non-technical users.
CLIs (or APIs + Skills) are easier and faster to set up, and the UX is better for most use cases, but MCP gives you a generalized API with an easier auth UX (in some cases; the MCP OAuth flow is usually flaky too).
So it feels like an imperfect solution, but once you start doing a ton of enterprise auth setups, MCP starts to make more sense.
That said, the core argument for MCP servers is providing an LLM a guard-railed API around some enterprise service. A Gmail integration is a great example. Without MCP, you need a VM as scratch space, some way to refresh OAuth, and some way to prevent your LLM from doing insane things like deleting half of your emails. An MCP server built by a trusted provider solves all of these problems.
But that's not what happened.
Developers and Anthropic got coked up about the whole thing and extended the concept to nuts and bolts. I always found the example servers useless and hilarious.[0] Unbelievably, they're still maintained.
In my experience, a skill is better suited for this instead of an MCP.
If you don’t want the agent to probe the CLI when it needs it, a skill can describe the commands, arguments and flags so the agent can use them as needed.
They make a big difference. For example, if you use the Jira CLI, most LLMs aren't trained on it. A simple MCP wrapper makes a huge difference in usability, unless you're okay with having the LLM poke and prod a bunch of different commands.
FWIW I'm having a good experience with a skill using the Jira CLI directly. My first attempt using a Jira MCP failed. I didn't invest much time debugging the MCP issues; I just switched to the skill and it just worked.
Yes, occasionally Claude uses the wrong flag and has to retry the command (I didn't even bother to fork the skill and add some memory about the bad flag), but in practice it just works.
Do you mean wrap the CLI with an MCP? I don't get that approach. I wrapped the Jira CLI with a skill. It's taken a few iterations to dial it in, but it works pretty damn well now.
I'm good, yet my coworkers keep having problems using the Atlassian MCP.
Even if the help isn't great, good coding agents can try out the CLI for a few minutes and write up a skill, or read the sources or online docs and write up a skill. That takes the place of the --help if needed. I found that I can save quite a lot of time: I don't have to type up how to use it, because if there is available info about it on the web, in man pages, in help pages, or in the source, it can figure it out. I've had Claude download the sources of ffmpeg and blender to obtain deeper understanding of certain things that aren't well documented. Recent LLMs are great at quickly finding where a feature is implemented, working out how it behaves from the code, testing the hypothesis, writing it up so it's not lost, and moving on with the work with much more grounding and far less guessing.
Using the source code to ask questions about poorly documented features in projects you have no experience with is my favourite thing that LLMs make possible (of course you could do this before, but it would take way, way more time). There are so many little annoyances that I've been able to patch and, thanks to NixOS, have the patched software permanently available to me.
In fact NixOS + LLMs feels like the full promise of open source software is finally available to me. Everything is within reach. If you don’t like something, patch it out. If you want to change a default, patch that in.
No need to know the language, the weird build process, or the custom tooling. Idea to working binary in minutes. I love it so much.
Yes, the idea that you can meaningfully modify the program for your own purposes (one of Stallman's four freedoms) was quite unrealistic except for the most skilled and invested among users. LLMs change this. I mean, as long as you use open models. I fear that in the future, corporate models may start to refuse building software like this that is inconvenient for them. Like possible future-Gemini saying, "oh I see you're patching chromium to continue working with adblockers, this is harmful activity, I cannot help you and reported your account to Google. Cease and desist from these plans or you lose your Gmail!"
Today is the honeymoon phase, enshittification will come later when the pie stops growing and the aspect of control comes more into focus.
It's just too good to be true. Most people still don't know that you can now do what you just described. Once people in suits understand this, the propaganda will start about how unsafe this all is and that platforms must be locked down at the hardware level, subscriptions cut off if you build unapproved software, etc.
We have a lot of tools (starting with the internal wiki) which are normally only exposed to engineers through web interfaces; MCPs make them available to terminal agents to use autonomously. This can get really interesting with e.g. giving Claude access to query logs and metrics to debug a production issue.
It is obnoxious that MCP results always go directly into the context window. I'd prefer to dump a large payload into Claude's filesystem and let him figure it out from there. But some of the places MCPs can be used don't even have filesystems.
Because "him" is objectively wrong, under almost any interpretation of any words involved. You can cause Claude, or any text-based LLM, to emit language that matches almost any personality / gender / character in the training set. At best you might be able to say "the default outputs have a masculine tone / vibe", but this still doesn't justify, by modern discourse, the "him".
The use of "him" by GP is extremely unusual IMO, and I suspect is odd for anyone with English as their native language. The current convention among normal people seems to me to be to avoid pronouns other than "it" with these tools, and generally just use the name. The name is not really relevant: like, sure, in some contexts we think of ships as "she/her", and may prefer feminine names for them, but if you used e.g. "she" rather than "it" to refer to the Titanic or any other ship with a female name, this is going to cause some double-takes / disfluent comprehension in the vast majority of native speakers in most cases.
Only if you imagine e.g. some stereotypical pirate with an eyepatch slapping the hull and saying something like "Aye, but she weathered the storm, as she always does" might this feel normal. Or, maybe if you are a Redditor and trying to make it your AI boyfriend / girlfriend, you can use he/him or some other neo-pronoun, but this is currently abnormal and not the general context.
And the fact that you can make the model act as any gender again shows why choosing "him" as some default here is strange. Absent any specific context, the choice of "him" here is poorly justified.
The GPTs are "it" because they were deliberately named in a way to discourage anthropomorphizing them. Anthropic does want you to anthropomorphize Claude, and they gave their model a male name. It's not that deep!
Thanks for reading! And yes, if anyone takes anything away from this, it's around composition of tools. The other arguments in the post are debatable, but not that one.
I'll just disagree with an example: Codex on Windows.
They are known to be very inefficient using only PowerShell to interact with files, unless put in WSL. They tend to make mistakes and have to retry with different commands.
Another example is Serena. I've known about it since the first day I tried out MCP but didn't appreciate it; trying it out again in IDEs recently showed impressive results. The symbolic tools are very efficient and help the agents a lot.
> IMO this is 100% correct and I'm glad someone finally said it. I run AI agents that control my entire dev workflow through shell commands and they are shockingly good at it.
That's a very developer-centric view. So many things in the world don't have a CLI API at all and will never have one, like the huge majority of desktop GUI programs.
And your most common command sequences can be dropped into a script that takes options. Add a tools.md documenting the tools and providing a few examples. How many workflows need more than maybe two dozen robust scripts?
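A sketch of one such script - the `todo_report` name and the files it scans are made up, but it shows the shape: a command sequence the agent would otherwise rediscover every session, frozen into one documented entry point.

```shell
# Hypothetical project script (would be listed in tools.md): count
# TODO markers per file and sort by count, descending.
todo_report() {
  grep -c 'TODO' "$@" 2>/dev/null | sort -t: -k2 -nr
}

# Sample files so the sketch is self-contained.
printf 'TODO a\nTODO b\n' > x.txt
printf 'TODO c\n' > y.txt

todo_report x.txt y.txt
```

Each script like this replaces a fragile multi-step prompt with one robust, reviewable command.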
When is MCP the right choice, though? For example, letting internal users ask questions on top of datasets? Is it better to just offer the OpenAPI specs and let Claude run wild, or to provide an MCP with instructions?
If you want to build an AI app that lets people “do random thing here”, then build an app.
Peak MCP is people trying to write a declarative UI as part of the MCP spec (1, I’m not kidding); which is “tldr; embed a webview of a web app and call it MCP”.
MCP is just "me too"; people want MCP to be an "AI App Store", but the blunt, harsh reality is that it's basically impossible to achieve that dream - that any MCP consumer can have the same app-like experience for installed apps.
Seriously, if we can barely do that for browsers, which have a consistent UI, there was never any hope, even remotely, that it would work out for the myriad of different MCP consumer apps.
It’s just stupid. Build a web app or an API.
You don't need an MCP server; agents can very happily interact with higher-level functions.
While I do agree that MCP probably went a bit too far beyond what's required, there is some benefit for sure. Providing information in a consistent format across all the services makes it easier to work with. It lowers the brittleness of figuring things out, making the products built using LLMs more stable/predictable. Most importantly, it becomes the latest version of the documentation for a service. This can go a long way in M2M communication - pretty much standardization of the application layer.
Oh wait, things like OpenAPI already exist and were pretty much built to solve the same problem.
This was my gut feeling from the beginning. If they can't do "fully observable" or "deterministic" (for what is probably a loose definition of that word) - then what's the point?