So what's best low cost coding agent these days? Kimi 2.6? Qwen's latest closed ...

bwfan123 · 2026-05-24T14:31:11 1779633071

In my experience, it is claude-code paired with deepseek-v4. As a penny-pincher, I can have long coding sessions with it with no anxiety about the cost. Also, prompting it for what you want and verifying the outputs matter more than the quality of the model. So I am better off with a cheaper model and taking responsibility for prompting it and verifying the results.

esperent · 2026-05-24T15:14:41 1779635681

It's obviously much cheaper paying by the token, but how does it compare to a codex subscription on cost?

epolanski · 2026-05-24T14:58:37 1779634717

Can you quantify the actual costs in a week and the use you make?

wongarsu · 2026-05-24T15:26:22 1779636382

Not GP, but for my use I'd estimate $0.10-0.30 per hour of use per agent with DeepSeek v4 Pro
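To turn that hourly figure into the weekly number asked about above, here is a rough sketch. Only the $0.10-0.30 per agent-hour range comes from this comment; the hours per week and the number of parallel agents are made-up inputs, so plug in your own.

```python
def weekly_cost_range(hours_per_week, agents=1, low_rate=0.10, high_rate=0.30):
    """Return (low, high) weekly spend in USD.

    low_rate / high_rate are the per-agent-hour estimates quoted in the
    comment; hours_per_week and agents are assumptions supplied by the caller.
    """
    agent_hours = hours_per_week * agents
    return (round(agent_hours * low_rate, 2), round(agent_hours * high_rate, 2))


# Hypothetical usage pattern: 30 hours a week with 2 agents in parallel.
low, high = weekly_cost_range(hours_per_week=30, agents=2)
print(f"${low:.2f} to ${high:.2f} per week")
```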

raybb · 2026-05-25T03:07:51 1779678471

How do you connect DeepSeek to Claude Code?

qaz_plm · 2026-05-25T12:03:54 1779710634

https://api-docs.deepseek.com/quick_start/agent_integrations...
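For the impatient, a minimal sketch of what that kind of integration usually looks like: Claude Code is pointed at a different backend through environment variables. The variable names and the base URL below are assumptions based on how this setup has been documented in the past, and the model id is a deliberate placeholder, so check the linked page for the current values before relying on any of it.

```python
import os

# Assumption: DeepSeek exposes an Anthropic-compatible endpoint at this URL.
ASSUMED_BASE_URL = "https://api.deepseek.com/anthropic"


def deepseek_env(api_key, model="MODEL_ID_FROM_THE_DOCS"):
    """Build an environment for launching Claude Code against DeepSeek.

    The ANTHROPIC_* names are assumed to be the variables Claude Code reads;
    `model` is a placeholder, not a real model id.
    """
    env = dict(os.environ)
    env.update({
        "ANTHROPIC_BASE_URL": ASSUMED_BASE_URL,
        "ANTHROPIC_AUTH_TOKEN": api_key,
        "ANTHROPIC_MODEL": model,
    })
    return env


# DEEPSEEK_API_KEY is a hypothetical variable name for wherever you keep the key.
env = deepseek_env(os.environ.get("DEEPSEEK_API_KEY", "sk-placeholder"))
# Print only the variable names, never the token itself.
print(sorted(k for k in env if k.startswith("ANTHROPIC_")))
```

You would then start the CLI with that environment, for example `subprocess.run(["claude"], env=env)`, or just export the same variables in your shell.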

passive · 2026-05-24T14:39:28 1779633568

I've gone through ~600M tokens in Xiaomi Mimo through Claude, and it's been the most effective use of an agent I've had yet. It's very capable, but generally not ambitious, picking simple but effective solutions to most problems I give it. Going to write something longer about the experience when I get to a billion tokens.

Alifatisk · 2026-05-24T15:26:34 1779636394

I do have my eyes on the coding plan, which is quite generous.

https://mimo.mi.com

Alifatisk · 2026-05-28T16:49:02 1779986942

Update: they increased the Lite plan from 60M credits/month to 4,1B credits/month. It's more than generous now, it's a steal.

gandreani · 2026-05-24T14:45:02 1779633902

Are you using Mimo 2.5 pro?

passive · 2026-05-24T15:55:19 1779638119

Yes. I tried a couple of weeks with non-Pro, and it was pretty good, but I had too many spare tokens, so I switched back to Pro. :)

gandreani · 2026-05-28T20:14:49 1779999289

I use it through my opencode go subscription and it's exactly how you described. Very pragmatic and not too ambitious. It's similar to Kimi 2.5/6 in that regard.

I like it!

ac29 · 2026-05-24T14:35:09 1779633309

Kimi 2.6 is great. Qwen3.7-max benchmarks similarly, but I haven't used it yet.

abalashov · 2026-05-24T16:03:47 1779638627

Although I have little interest in agentic coding, when I do use it, I have found Kimi K2.6 to give Opus-quality output, and have switched entirely to it for pretty much everything.

throw10920 · 2026-05-24T16:16:28 1779639388

I've used Opus extensively and tried K2.6 on a few projects, and the gap is huge. K2.6 is nowhere near the performance of Opus. That's fine because it's also far cheaper, but public benchmarks line up with my own personal experience that they aren't comparable in terms of intelligence.

(that is, different places on the Pareto efficiency graph)

abalashov · 2026-05-24T19:15:34 1779650134

No two uses are alike, I suppose. For me, whatever difference is a wash. However, I probably tend to shy away from throwing high-complexity/long-horizon tasks at the model.

skeledrew · 2026-05-24T14:36:27 1779633387

Seems to be DeepSeek.

https://qht.co/item?id=48237663

stavros · 2026-05-24T14:52:19 1779634339

For me, it's by far Deepseek. It's many times cheaper than competitors, and about as good as Sonnet 4.6.

fouric · 2026-05-24T16:21:00 1779639660

I'd generally agree about Deepseek being as good as Sonnet - but I have extreme trouble with prompt compliance with V4 Pro in a way that I've never had with Sonnet. I'll tell it "find the bug, but don't fix it" or "please use this tool I just developed" and it'll ignore me a high fraction of the time.

It's bad enough that I'm working on guardrails at the harness level because prompting appears to be useless.
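Roughly the shape of what I mean by harness-level guardrails, as a sketch rather than my actual code: the harness, not the prompt, decides which tool calls go through. The tool names, `Policy` fields, and helper functions here are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical names for tools that can change the workspace.
WRITE_TOOLS = {"edit_file", "write_file", "run_shell"}


@dataclass
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)


@dataclass
class Policy:
    read_only: bool = False              # "find the bug, but don't fix it"
    required_tool: Optional[str] = None  # "please use this tool"


def allow(call, policy):
    """Return (allowed, reason) for a single tool call."""
    if policy.read_only and call.name in WRITE_TOOLS:
        return False, f"{call.name} denied: this task is diagnose-only"
    return True, "ok"


def missing_required_tool(calls, policy):
    """True if the session ended without ever calling the required tool."""
    if policy.required_tool is None:
        return False
    return all(c.name != policy.required_tool for c in calls)


policy = Policy(read_only=True, required_tool="my_profiler")
calls = [ToolCall("read_file"), ToolCall("edit_file")]
for c in calls:
    print(c.name, allow(c, policy))
print("required tool skipped:", missing_required_tool(calls, policy))
```

The denial reason can be fed back to the model as the tool result, so it sees why the call failed instead of silently retrying.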

Do you have the same issue?

stavros · 2026-05-24T16:25:50 1779639950

I have Opus make a fairly detailed plan, then Deepseek implements, and GPT reviews. With that setup, I have zero issues, probably because what you mention is handled (the plan keeps it on track and the reviewer catches any issues).

Now that you mention it, though, I have seen it do a few things that weren't in the plan. The reviewer caught them, though, so they didn't cause a problem, and it's so cheap that overall it's a massive improvement.
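In code, the plan / implement / review loop is roughly the following. This is a sketch of the idea, not my actual setup: each "model" is just a function from prompt to text, the prompts and the APPROVE convention are invented for illustration, and the three fake models stand in for real API calls.

```python
from typing import Callable

Model = Callable[[str], str]  # prompt in, text out


def plan_implement_review(task, planner: Model, implementer: Model,
                          reviewer: Model, max_rounds=3):
    """Return (patch, rounds_used) once the reviewer approves."""
    plan = planner(f"Write a detailed implementation plan for: {task}")
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        patch = implementer(
            f"Plan:\n{plan}\n\nReviewer feedback:\n{feedback or 'none'}\n\n"
            "Implement the plan exactly; do nothing outside it."
        )
        verdict = reviewer(
            f"Plan:\n{plan}\n\nPatch:\n{patch}\n\n"
            "Reply APPROVE, or list deviations from the plan."
        )
        if verdict.strip().upper().startswith("APPROVE"):
            return patch, round_no
        feedback = verdict  # send the objections back to the implementer
    raise RuntimeError("reviewer did not approve within max_rounds")


# Fake models: the implementer drifts from the plan once, then complies.
def fake_planner(prompt):
    return "1. add parse_date() 2. add a unit test"


def fake_implementer(prompt):
    if "Reviewer feedback:\nnone" in prompt:
        return "parse_date + test + unrequested refactor"
    return "parse_date + test"


def fake_reviewer(prompt):
    patch = prompt.split("Patch:\n", 1)[1]
    return "REJECT: unrequested refactor" if "unrequested" in patch else "APPROVE"


print(plan_implement_review("parse ISO dates", fake_planner,
                            fake_implementer, fake_reviewer))
```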

e2e4 · 2026-05-24T22:20:43 1779661243

Which CLIs are you using for each of the steps?

stavros · 2026-05-24T22:25:52 1779661552

OpenCode for everything: https://www.stavros.io/posts/how-i-write-software-with-llms/

e2e4 · 2026-05-25T00:16:08 1779668168

thank you; will read your post

gandreani · 2026-05-28T20:16:47 1779999407

I also have this problem!

It's the only model where an explicit instruction at the end of my message is sometimes ignored. This doesn't happen with any of the gpts, kimis, glms, qwen, etc. Just a deepseek problem.

Hope it improves!

fouric · 2026-05-29T00:38:12 1780015092

I'm glad I'm not going insane...

I have also noticed this with Sonnet, funnily enough - it's not as strong, but it's still there. But yeah, I haven't seen this with any other model so far (although I mostly use the stronger ones - maybe it's a function of intelligence?).

throw10920 · 2026-05-24T18:12:00 1779646320

Cursor with Composer 2.5 seems to be competitive with frontier models (Opus and GPT-5.5) for a significant price discount. Benchmarks are gamed, as always, but $0.55/task vs $11.02 a task definitely indicates that there's some cost advantage.

https://cursor.com/evals
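Quick arithmetic on those two per-task figures, taking them at face value (they are the only inputs; nothing else here is measured):

```python
composer_cost = 0.55   # USD per task, as quoted above
frontier_cost = 11.02  # USD per task, as quoted above

ratio = frontier_cost / composer_cost
savings_pct = (1 - composer_cost / frontier_cost) * 100
print(f"{ratio:.1f}x cheaper per task, {savings_pct:.1f}% lower cost")
```

That works out to about a 20x gap, so even a sizable benchmark-gaming discount would leave a real cost advantage.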

lostmsu · 2026-05-24T14:33:59 1779633239

Just use codex with 5.5 on low reasoning levels