bel8 · 2026-06-18T15:09:39 1781795379

Instant for me but I have a beefy CPU Ryzen 9800X3D and some crazy nvme.

And most important: no corporate spyware disguised as anti-virus on this machine.

bel8 · 2026-06-17T22:16:38 1781734598

DeepSeek V4 Flash being the winner in cost efficiency causes me exactly zero surprise.

It's a monster at coding. And a fast monster at that.

I use it daily and have been testing if MiMo 2.5 (non pro) is comparable. The nice thing about MiMo is that it has vision capability.

tombert · 2026-06-18T03:08:16 1781752096

I threw twenty bucks into DeepSeek just to see how it compared to Claude.

Pretty well, actually! It wasn't quite as good (at least with the coding tasks I threw at it), but it was so much cheaper per-token that it almost doesn't matter; if it screws up something, just correct and try again.

rgbrgb · 2026-06-17T22:24:23 1781735063

Notably it has 0 wins.

plaguuuuuu · 2026-06-17T23:01:10 1781737270

Friendo, this is an anti-benchmark to figure out which AI is more likely to kill you.

If you point both at some github issues you can gauge their relative ability to solve problems.

Petersipoi · 2026-06-18T03:39:27 1781753967

No, it's a test of how good an AI is at completing this given task. You can't extrapolate beyond that, and that is what makes this article so annoying. Grok got good at the task that was given. That doesn't mean that Grok is going to use the same strategy if given an entirely different task. Grok obviously didn't need collaboration to win, as made evident by the fact that it won without collaboration. Anyone who is claiming that Grok wouldn't collaborate if it was beneficial is just guessing.

luipugs · 2026-06-17T22:46:54 1781736414

"if you judge a fish by its ability to climb a tree" yada yada

eru · 2026-06-18T01:59:52 1781747992

Well, monkeys are botanically speaking fish. Well, cladistically.

bel8 · 2026-06-17T22:38:25 1781735905

Not much less than GPT 5.4 with 2 wins or gemini-3.1-pro with 3 wins in 30 rounds.

Such is life in royal rumble games.

altmanaltman · 2026-06-18T03:59:45 1781755185

DeepSeek v4 flash and pro are both surprisingly good at coding. I shifted to them from Claude due to cost concerns and haven't really looked back. I would say Claude is still better overall when it comes to complex tasks, but my current workflow never delegates complex or actual thinking tasks to agents, just implementation; I do all the testing and thinking myself.

bel8 · 2026-06-17T19:10:17 1781723417

did you try to iterate? copypasting your brief message here to the prompt would probably fix something.

zachdive · 2026-06-17T19:42:46 1781725366

yes i agree, this would probably fix it

bel8 · 2026-06-17T15:14:07 1781709247

One upside I can think of is that it is easier to trust and verify one repo than hundreds.

And the chances of a rogue actor or ID theft drop drastically.

bel8 · 2026-06-17T14:38:27 1781707107

repo: https://github.com/EpicGames/lore

Looks very git-ish. But probably better equipped for large binary files.

    echo "Hello, Lore" > hello.txt
    lore stage hello.txt
    lore status --scan
    lore commit "Initial revision"
    lore push

Snafuh · 2026-06-17T15:48:31 1781711311

Git-ish CLI is great. The GUI is more important though. Non-programmers don't want to dabble with a CLI. That's one reason why Perforce is the de facto standard, IMO. The GUI covers 99% of daily operations and is easy to use.

bel8 · 2026-06-17T13:31:30 1781703090

I beg to differ. I replaced a $40/mo GitHub Copilot subscription, where I used Opus 4.6 and GPT 5.5, with a $10/mo opencode Go plan, where I mostly use DeepSeek V4 Flash and am testing MiMo 2.5.

I work on mid-sized projects currently (200k to 1kk lines of code).

Alifatisk · 2026-06-17T14:03:27 1781705007

> 1kk lines of code

Isn't that a million?

bel8 · 2026-06-17T14:23:09 1781706189

Yep. I consider up to a million lines of code as mid-sized.

When I worked in banking, the codebases were often larger than a million.

bel8 · 2026-06-17T13:18:28 1781702308

You left some models out, like DeepSeek and Kimi.

kristopolous · 2026-06-17T13:42:58 1781703778

It was a truncated output from the script to demonstrate what it does ...

If you really want to see all of them:

https://day50.dev/output.txt

Or run the script

ashenke · 2026-06-17T13:41:14 1781703674

Because it's not in the top 20 in their benchmark, it's at #23

bel8 · 2026-06-16T21:47:09 1781646429

It's a common sentiment. An example from a few hours ago: https://qht.co/item?id=48558954

> I have absolutely zero interest in free. I honestly don't think I'm even remotely in the same demographic as people using free tiers / models. I want to pay. I don't want my data used for training...

They want to use LLMs trained on others' code but don't want to contribute their own.

Not casting judgement, just pointing out.

skissane · 2026-06-16T22:04:58 1781647498

It makes sense from a business perspective: SaaS firms value the ability of coding agents to accelerate development, but also worry the models will learn the secret sauce of their business and destroy its moat. So their desire to contractually exclude training on their data has some logic to it.

(Disclaimer: Not speaking for or about my current employer, just a general industry observation.)

davebren · 2026-06-16T21:57:23 1781647043

I don't really use LLMs myself, but if someone wants to have any kind of software business then having the models trained on their products isn't ideal.

bel8 · 2026-06-16T17:59:14 1781632754

These competent open models you want to use were trained on data from people like you and me.

I wonder if there are competent models trained purely on permissive open-source code like MIT or Apache 2.0.

yencabulator · 2026-06-16T18:17:33 1781633853

MIT and Apache 2.0 both require attribution, so it's not like limiting to those would help in license compliance.

bel8 · 2026-06-15T20:19:25 1781554765

And his site looks like another thousand, as I'm sure he knows.

We consume his site for the content, not for the minimalistic design that has existed since the inception of the universe.

sasas · 2026-06-15T20:55:29 1781556929

It appears the author is leveraging Fabien's branding by copying not only the style of post but more importantly the formatting and style of Fabien's books. Even structurally the book appears quite similar.

If you didn't double check the author while skimming this Keen book, you may well be mistaken that it was written by Fabien.

I took a quick skim of the content, it looks great - tempted to purchase a copy to support the author for their efforts (and have it sit on the bookshelf next to two other very similar looking books...) but need to sit on this for a while.

Maybe I'm overthinking this. Would like to hear Fabien's take on the situation.

[EDIT] It looks like the Keen book author copied and pasted sections directly[1] out of Fabien's book's TeX[2].

[EDIT] The git commit history has comments that indicate that Fabien reviewed the book, e.g. "updated all chapters after feedback Fabien" and "Proofread Fabien Sanglard en hardcopy review", so perhaps this is all sanctioned.

[1] https://github.com/bsmits74/Keen_White_Papers/blob/master/sr...

[2] https://github.com/fabiensanglard/gebbdoom/blob/master/src/b...

fabiensanglard · 2026-06-16T03:58:54 1781582334

> perhaps this is all sanctioned

In short, it is not.

I received an early draft from bas. While I encouraged him to finish his book, I also told him, in no uncertain terms, that I had released the source code to inspire people and give them a kickstart, but not for them to copy the content.

As months passed, I asked repeatedly that he remove whole paragraphs he had copy/pasted as is, and drawings he had also copy/pasted. Later, as AI became more powerful, I used it to compare the Wolfenstein 3D book and the Keen book and noticed a lot more that was verbatim. At that point I told him I no longer wanted to help him with his project.

It is cool that someone documented Keen thanks to my framework. But I have only myself to blame for opensourcing the code of my books/website and thinking people would use it as intended.

PS: I commented it was a "good idea" to change the cover because originally this was going to be the "Game Engine Black Book: Commander Keen". I did not like that his work could have been mistaken for mine (he also elected to use the same style as my website for his website which only adds to the confusion).

sasas · 2026-06-16T06:38:56 1781591936

Thanks for clarifying the situation. Something did feel a bit off but I didn't want to jump to any premature conclusions.

The author has clearly invested a lot in writing their book. It's a shame, and hard to understand, that someone would self-sabotage by plagiarising the work of others, especially the work of someone who donated their own free time to help out initially.

> But I have only myself to blame for opensourcing the code of my books/website and thinking people would use it as intended.

Even if you had not, perhaps this scenario would have still eventuated.

Keep up the good work. Any new books planned on the horizon?

fabiensanglard · 2026-06-16T15:40:25 1781624425

> Any new books planned on the horizon?

I always have projects :) ! Whether I can go all the way and actually ship it is another story.

markus_zhang · 2026-06-16T13:24:09 1781616249

Maybe in the future hand crafting code with the help of books is going to be an art. I’m sure some people would love to pay to watch.

(Going to do that in 10-15 years when I reach 50/55 to semi retire)

bombcar · 2026-06-15T23:10:00 1781565000

He responds to this thread here: https://qht.co/item?id=48546779