Hacker Times | bel8's comments

Instant for me, but I have a beefy CPU (Ryzen 9800X3D) and a crazy fast NVMe drive.

And most important: no corporate spyware disguised as antivirus on this machine.


DeepSeek V4 Flash being the winner in cost efficiency causes me exactly zero surprise.

It's a monster at coding. And a fast monster at that.

I use it daily and have been testing whether MiMo 2.5 (non-Pro) is comparable. The nice thing about MiMo is that it has vision capability.


I threw twenty bucks into DeepSeek just to see how it compared to Claude.

Pretty well, actually! It wasn't quite as good (at least with the coding tasks I threw at it), but it was so much cheaper per-token that it almost doesn't matter; if it screws up something, just correct and try again.


Notably it has 0 wins.

Friendo, this is an anti-benchmark to figure out which AI is more likely to kill you.

If you point both at some github issues you can gauge their relative ability to solve problems.


No, it's a test of how good an AI is at completing this given task. You can't extrapolate beyond that, and that is what makes this article so annoying. Grok got good at the task that was given. That doesn't mean that Grok is going to use the same strategy if given an entirely different task. Grok obviously didn't need collaboration to win, as made evident by the fact that it won without collaboration. Anyone who is claiming that Grok wouldn't collaborate if it was beneficial is just guessing.

"if you judge a fish by its ability to climb a tree" yada yada

Well, monkeys are botanically speaking fish. Well, cladistically.

Not much less than GPT 5.4 with 2 wins or gemini-3.1-pro with 3 wins in 30 rounds.

Such is life in royal rumble games.


DeepSeek V4 Flash and Pro are both surprisingly good at coding. I shifted to them from Claude due to cost concerns and haven't really looked back. I would say Claude is still better overall when it comes to complex tasks, but my current workflow never delegates complex tasks or actual thinking to agents, only implementation. I do all the testing and thinking myself.

did you try to iterate? copypasting your brief message here to the prompt would probably fix something.

yes i agree, this would probably fix it

One upside I can think of is that it is easier to trust and verify one repo than hundreds.

And the chances of a rogue actor or identity theft drop drastically.


repo: https://github.com/EpicGames/lore

Looks very git-ish. But probably better equipped for large binary files.

    echo "Hello, Lore" > hello.txt
    lore stage hello.txt
    lore status --scan
    lore commit "Initial revision"
    lore push

A Git-ish CLI is great. The GUI is more important though. Non-programmers don't want to dabble with a CLI. That's one reason why Perforce is the de facto standard, IMO. The GUI covers 99% of daily operations and is easy to use.

I beg to differ. I replaced a $40/mo GitHub Copilot subscription, where I used Opus 4.6 and GPT 5.5, with a $10/mo opencode Go plan, where I mostly use DeepSeek V4 Flash and am testing MiMo 2.5.

I work on mid-sized projects currently (200k to 1kk lines of code).


> 1kk lines of code

Isn't that a million?


Yep. I consider up to a million lines of code as mid-sized.

When I worked in banking, the codebases were often larger than a million.


You left some models out, like DeepSeek and Kimi, for example.

It was a truncated output from the script to demonstrate what it does ...

If you really want to see all of them:

https://day50.dev/output.txt

Or run the script


Because it's not in the top 20 in their benchmark, it's at #23

It's a common sentiment. An example from a few hours ago: https://qht.co/item?id=48558954

> I have absolutely zero interest in free. I honestly don't think I'm even remotely in the same demographic as people using free tiers / models. I want to pay. I don't want my data used for training...

They want to use LLMs trained on others' code but don't want to contribute their own.

Not casting judgement, just pointing it out.


It makes sense from a business perspective: SaaS firms value the ability of coding agents to accelerate development, but also worry the models will learn the secret sauce of their business and destroy its moat. So their desire to contractually exclude training on their data has some logic to it.

(Disclaimer: Not speaking for or about my current employer, just a general industry observation.)


I don't really use LLMs myself, but if someone wants to have any kind of software business then having the models trained on their products isn't ideal.

These competent open models you want to use were trained on data from people like you and me.

I wonder if there are competent models trained purely on permissive open-source code like MIT or Apache 2.0.


MIT and Apache 2.0 both require attribution, so it's not like limiting to those would help in license compliance.

And his site looks like a thousand others, as I'm sure he knows.

We consume his site for the content, not for the minimalistic design that has existed since the inception of the universe.


It appears the author is leveraging Fabien's branding by copying not only the style of post but more importantly the formatting and style of Fabien's books. Even structurally the book appears quite similar.

If you didn't double check the author while skimming this Keen book, you may well be mistaken that it was written by Fabien.

I took a quick skim of the content and it looks great. I'm tempted to purchase a copy to support the author for their efforts (and have it sit on the bookshelf next to two other very similar looking books...) but need to sit on this for a while.

Maybe I'm overthinking this. Would like to hear Fabien's take on the situation.

[EDIT] It looks like the Keen book author copied and pasted sections[1] directly out of the TeX source of Fabien's book[2].

[EDIT] The git commit history has comments that indicate that Fabien reviewed the book, e.g. "updated all chapters after feedback Fabien" and "Proofread Fabien Sanglard en hardcopy review", so perhaps this is all sanctioned.

[1] https://github.com/bsmits74/Keen_White_Papers/blob/master/sr...

[2] https://github.com/fabiensanglard/gebbdoom/blob/master/src/b...


> perhaps this is all sanctioned

In short, it is not.

I received an early draft from bas. While I encouraged him to finish his book, I also told him, in no uncertain terms, that I had released the source code to inspire people and give them a kickstart, but not for them to copy the content.

As months passed, I asked repeatedly that he remove whole paragraphs he had copy/pasted as is, and drawings he had also copy/pasted. Later, as AI became more powerful, I used it to compare the Wolfenstein 3D book and the Keen book and noticed a lot more that was verbatim. At that point I told him I no longer wanted to help him with his project.

It is cool that someone documented Keen thanks to my framework. But I have only myself to blame for opensourcing the code of my books/website and thinking people would use it as intended.

PS: I commented it was a "good idea" to change the cover because originally this was going to be the "Game Engine Black Book: Commander Keen". I did not like that his work could have been mistaken for mine (he also elected to use the same style as my website for his website which only adds to the confusion).


Thanks for clarifying the situation. Something did feel a bit off but I didn't want to jump to any premature conclusions.

The author has clearly put quite some investment into writing their book. It's a shame, and hard to understand, that someone would self-sabotage by plagiarising the work of others, especially the work of someone who donated their own free time to help out initially.

> But I have only myself to blame for opensourcing the code of my books/website and thinking people would use it as intended.

Even if you had not, perhaps this scenario would have still eventuated.

Keep up the good work. Any new books planned on the horizon?


> Any new books planned on the horizon?

I always have projects :) ! Whether I can go all the way and actually ship it is another story.


Maybe in the future hand crafting code with the help of books is going to be an art. I’m sure some people would love to pay to watch.

(Going to do that in 10-15 years when I reach 50/55 to semi retire)


He responds to this thread here: https://qht.co/item?id=48546779
