Hacker News | new | past | comments | ask | show | jobs | submit | skinner_'s comments

Also, if Claude had regurgitated a known solution, it would have come up with it in the first exploration round, not the 31st, as it actually did.


I think the nuanced take on Joel's rant is this: it was good advice for 26 years. It became slightly less good advice a few months ago. This is a good time to warn overenthusiastic people that it's still good advice in 2026, and to start a discussion about which of its assumptions will remain true in 2027 and later.


Then I think you’ll like our project which aims to find the missing link between transformers and swarm simulations:

https://github.com/danielvarga/transformer-as-swarm

Basically a boid simulation where a swarm of birds can collectively solve MNIST. The goal is not some new SOTA architecture, it is to find the right trade-off where the system already exhibits complex emergent behavior while the swarming rules are still simple.
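For anyone unfamiliar with boids, here is a minimal sketch of the kind of update rule involved (hypothetical parameters and function names; the repo's actual rules differ, since its boids also have to carry out a computation):

```python
import numpy as np

# Classic boid update: each bird steers toward the local centroid
# (cohesion), away from very close neighbors (separation), and toward
# the flock's mean heading (alignment). All coefficients are made up.
def boid_step(pos, vel, cohesion=0.01, separation=0.05,
              alignment=0.125, sep_radius=0.5, dt=1.0):
    n = len(pos)
    new_vel = vel.copy()
    for i in range(n):
        others = np.arange(n) != i
        # cohesion: steer toward the centroid of the other boids
        centroid = pos[others].mean(axis=0)
        new_vel[i] += cohesion * (centroid - pos[i])
        # separation: steer away from boids closer than sep_radius
        close = np.linalg.norm(pos - pos[i], axis=1) < sep_radius
        close[i] = False
        if close.any():
            new_vel[i] += separation * (pos[i] - pos[close].mean(axis=0))
        # alignment: nudge velocity toward the mean of the others'
        new_vel[i] += alignment * (vel[others].mean(axis=0) - vel[i])
    return pos + dt * new_vel, new_vel
```

The project's question is how far rules of roughly this complexity can be pushed before the swarm's collective state can implement something like attention.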

It is currently abandoned due to a serious lack of free time (*), but I would consider collaborating with anyone willing to put in some effort.

(*) In my defense, I’m not slacking meanwhile: https://arxiv.org/abs/2510.26543 https://arxiv.org/abs/2510.16522 https://www.youtube.com/watch?v=U5p3VEOWza8


https://www.astralcodexten.com/p/in-search-of-ai-psychosis is very relevant, but the main reason I’m posting it here is that, unlike this paper, it takes the opportunity to build the cleverest pun out of the same ingredients:

Folie A Deux Ex Machina


I interpreted it loosely, as "be aware of the possibility, and stop looking at it at the first signs of issues".


That seems to me to be a VERY generous interpretation of:

> I would check to make sure this can't trigger migraines or seizures. Maybe it's just me, but also, please double check.


100% frontpage-worthy! Frankly I was already bored with all those pelicans, and a bit worried that the labs are overfitting on pelicans specifically. This test clearly demonstrates that they are not.


That's very cool, but it's not an apples to apples comparison. The reasoning model learned how to do long multiplication. (Either from the internet, or from generated examples of long multiplication that were used to sharpen its reasoning skills. In principle, it might have invented it on its own during RL, but no, I don't think so.)

In this paper, the task is to learn how to multiply, strictly from AxB=C examples, with 4-digit numbers. Their vanilla transformer can't learn it, but the one with (their variant of) chain-of-thought can. These are transformers that have never encountered written text, and are too small to understand any of it anyway.
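To make the setup concrete, the training data is just strings of this shape (hypothetical rendering; the paper's exact tokenization differs):

```python
import random

# One "AxB=C" training example with 4-digit operands. The model sees
# only strings like this, never any natural-language text.
def make_example(rng: random.Random) -> str:
    a = rng.randint(1000, 9999)
    b = rng.randint(1000, 9999)
    return f"{a}x{b}={a * b}"

rng = random.Random(0)
examples = [make_example(rng) for _ in range(3)]
```

The model has to infer the multiplication algorithm purely from the statistical structure of such strings.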


If being probabilistic prevented learning deterministic functions, transformers couldn’t learn addition either. But they can, so that can't be the reason.


People are probabilistic, and I've been informed that people are able to perform multiplication.


Yes, and unlike the LLM they can iterate on a problem.

When I multiply, I take it in chunks.

Put the LLM into a loop, instruct it to keep track of where it is, and have it solve one digit at a time.

I bet it does just fine. See my other comment as to why I think that is.
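A quick sketch of that loop structure, with a plain Python expression standing in for the LLM's answer at each step (the running total and the current place are the state it would be told to keep track of):

```python
# Long multiplication, one digit of `a` per iteration. Each loop body
# is the small subproblem you'd hand the model on each pass; (total,
# place) is the carried state. Assumes non-negative integers.
def multiply_digit_at_a_time(a: int, b: int) -> int:
    total, place = 0, 1
    for ch in reversed(str(a)):        # least significant digit first
        total += int(ch) * b * place   # one single-digit step
        place *= 10
    return total
```

Whether a model can execute this reliably then reduces to whether it can do one single-digit-times-b step correctly, which is a far easier ask than the whole product in one shot.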


Are you sure? I bet that if you pull 10 people off the street and ask them to multiply a 5-digit number by a 5-digit number by hand, you won't get a 100% success rate.


The pertinent fact is that there exist people who can reliably perform 5x5 multiplication, not that every single person on the planet can do it.


I bet with a little training, practically anyone could multiply 5 digit numbers reliably.


> But which contributes more, they ask? Who gives a shit, really?

Funding agencies? Should they prioritize established researchers or newcomers? Should they support many smaller grant proposals or fewer large ones?


My uninformed and perhaps overly charitable interpretation: he warned them they were going to be steamrolled, they built their product anyway, and now OpenAI is buying them because (1) OpenAI doesn't want the negative publicity of steamrolling them all, and (2) OpenAI has the money and is a bit too lazy to build a clone.

