1) It chews through tokens. If you're on a metered API plan I would avoid it. I've spent $300+ on this just in the last 2 days, doing what I perceived to be fairly basic tasks.
2) It's terrifying. No directory sandboxing, etc. On one hand, it's cool that this thing can modify anything on my machine that I can. On the other, it's terrifying that it can modify anything on my machine that I can.
That said, some really nice things that make this "click":
1) Dynamic skill creation is awesome.
2) Having the ability to schedule recurring and one-time tasks makes it terribly convenient.
3) Persistent agents with remote messaging make it really feel like an assistant.
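On the sandboxing worry above: a minimal sketch of the kind of path check a directory sandbox would need, assuming the agent's file writes could be routed through a guard. The function name `is_inside` and the `/tmp/agent` root are hypothetical and not part of the tool. This only validates paths; it is not a real sandbox and does nothing to restrict what the process itself can do.

```python
from pathlib import Path


def is_inside(root: str, candidate: str) -> bool:
    """Return True if `candidate` resolves to a location at or under `root`.

    Hypothetical helper for illustration; not part of any real agent tool.
    """
    root_path = Path(root).resolve()
    # Joining against root means relative paths are interpreted inside the
    # sandbox. An absolute candidate replaces root entirely, and resolve()
    # collapses any ".." segments, so escapes show up in the final path.
    target = (root_path / candidate).resolve()
    return target == root_path or root_path in target.parents


print(is_inside("/tmp/agent", "notes/todo.txt"))    # stays under the root
print(is_inside("/tmp/agent", "../../etc/passwd"))  # climbs out of the root
print(is_inside("/tmp/agent", "/etc/passwd"))       # absolute path elsewhere
```

A check like this would have to run before every write, and a real setup would more likely lean on OS-level isolation (a container or a separate user account) than on path checks alone.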
> It chews through tokens. If you're on a metered API plan I would avoid it. I've spent $300+ on this just in the last 2 days, doing what I perceived to be fairly basic tasks.
Didn’t Anthropic make it so you can’t use your Claude Code Pro/Max with other tools? Has anyone experienced a block because of that policy while using this tool?
Also really curious what kind of tasks ran up $300 in 2 days? Definitely believe it’s possible. Just curious.
I've seen a couple of people on X post about their Claude accounts being suspended after using this. All of them seem to have used it with Claude Code, so yes, it looks like it violates their policy (not surprising really, it breaks their TOS).
I've tried it on Codex (ChatGPT Pro) and within an hour of just getting stuff set up and tested I'd used half my weekly limit, so I can see using $300 in a couple of days being very easy.
Until that's figured out this is basically a non-starter: you can't use it if it's going to cost $1k+ per week, and I'm not sure there are any local models that'd handle it without $10k+ in hardware costs.
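For what it's worth, the $1k+ per week figure is just the $300 in 2 days reported upthread scaled to 7 days. A quick sketch of that arithmetic, assuming spend stays linear (which it may well not); `weekly_cost` is a made-up helper name:

```python
def weekly_cost(spent_usd: float, days: float) -> float:
    """Extrapolate a weekly spend from a shorter sample, assuming a constant daily rate."""
    return spent_usd / days * 7


# $300 over 2 days is $150/day, so 7 days comes to $1,050.
print(weekly_cost(300, 2))  # 1050.0
```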
I’ve been working on adapting Claude Code to do some repetitive “personal assistant” type tasks so I was really excited to try this tool.
One of my tasks is a skill that fetches my calendar via MCP and slots events into a JSON to be used for an OR-Tools constraint optimizer that finds a workable schedule for something. It then uploads those events to the calendar using MCP when I choose my favorite candidate solution.
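To give a rough idea of the shape of that pipeline, here's a toy sketch. It is not the actual skill and it does not use OR-Tools or MCP: it's a plain-Python gap finder standing in for the optimizer, and the JSON layout (`title`, `start`, `end` in minutes since midnight) is an assumption made up for the example.

```python
import json


def free_slots(busy, day_start, day_end, duration):
    """Return (start, end) gaps of at least `duration` minutes between busy intervals."""
    slots = []
    cursor = day_start
    for start, end in sorted(busy):
        # A gap before this event counts only if it is long enough.
        if start - cursor >= duration:
            slots.append((cursor, start))
        # max() handles overlapping or nested events.
        cursor = max(cursor, end)
    if day_end - cursor >= duration:
        slots.append((cursor, day_end))
    return slots


# Hypothetical calendar export; times are minutes since midnight.
events_json = """
[
  {"title": "standup", "start": 540, "end": 570},
  {"title": "lunch", "start": 720, "end": 780}
]
"""
busy = [(e["start"], e["end"]) for e in json.loads(events_json)]

# Candidate 60-minute windows between 08:00 (480) and 17:00 (1020).
print(free_slots(busy, 480, 1020, 60))
# [(480, 540), (570, 720), (780, 1020)]
```

A real constraint solver earns its keep once there are multiple events to place with ordering or preference constraints; this only lists the open windows.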
I checked token usage for this task last time I ran it. It would’ve cost $29 in API usage with Opus 4.5.
So yea, you’re absolutely right that this stuff isn’t going to go mainstream at these rates.
One thing you can try is powering Clawdbot with a local model. My company recently wrote[0] about it.
Unclear what kind of quality you'll get out of it, but since the tokens are all local, kinda doesn't matter if it burns through 10x more for the same outcome.
I offhandedly set it up to do a weather alert every 4 hours during the big winter storm. Absent a well-specified API, I can only assume it was repeatedly doing a bunch of work to access some open API it discovered.
Very much the LLM equivalent of “to bake an apple pie you must first invent the universe”.
Hear hear. Elixir is a dream for this kind of stuff. But it requires very different decisions "all the way down" to make it work outside of BEAM. And BEAM itself feels heavy to most systems devs.
(IMO it's not for many use cases, and to the extent it is I'm happy to see things like AtomVM start to address it.)
That plus the new React diff viewer in beta. The old one seemed to be a simpler Web Component inside a Rails turbo frame.
I've tested the beta one and like most SPAs it doesn't scale well to large amounts of data (large numbers of files / line counts). You can feel the DOM slowing down even on a high-end MacBook. It even blanked out the page a couple of times, another common issue when browsers are overloaded. So I switched back to the old one.
>The point is that they are prioritizing this over new features.
Good! Shoring up infrastructure vs. delivering the latest hotness is something that is rarely prioritized. I'll take boring and reliable every day of the week.
You would be a fool to think the Copilot Coding Agent is not their most important feature at the moment. It's not particularly great, but it must become so.
The infrastructure behind serving git repos the way they do is pretty fiddly—I'd not be a bit surprised if this move reduces stability and/or performance.
That started with MS and accelerated with Copilot. Word is that GH leadership doesn't care about anything other than Copilot/AI. All other features are receiving far less focus and fewer resources. I've heard this repeatedly from current and former employees.
IMHO: the acceleration curve into point-of-no-return was when Microsoft decided to go hard on AI, and saw GitHub's Copilot as one of the key inflection points they were going to use to do so - even going so far as to adopt the Copilot brand across the entire company.
Before that, it still felt like there was _some_ degree of autonomy and ability to think about the developer experience on the platform as a whole. Once ChatGPT took off and MSFT decided that they were going to go hard on AI, though, Copilot (and therefore GitHub) became too important to Microsoft to leave alone.
I kinda suspect the slide was inevitable anyway, given how acquisitions tend to go. But IMO, Copilot was the tsunami that washed the octocat out to sea.
What I find pretentious is the legion of commenters who can't find anything better to comment on and instead pretend they're smart by nitpicking some stylistic choice in the most low-effort way possible.
Classic case of "you're pretentious", "no, you're pretentious". It's exhausting how often we reach for the word "pretentious" when we have bitter feelings about one person's opinion of another person or their work.
I just used it because he did. My real feeling was exhaustion. More of those comments than comments about the subject matter of the post. Like going to a swimming meet as a pro and finding it full of kids instead.