adwf · 2026-06-11T23:12:59 1781219579

I imagine it's the same foundation model on the 4 series, with Fable 5/Mythos being a new or upgraded foundation model. Then the point releases are fine-tuning plus post-training alignment with desired outcomes. The "thinking" can involve multiple steps, e.g. asking the model first what it thinks the user wants to do and why, rewriting the prompt to generate better outcomes, working out how it should do it, coming up with a plan, etc. So when they announce each point release like Opus 4.8, they're probably adding new layers of thinking to try and get good results on benchmarks. And that of course has cost and speed implications.

Then Sonnet/Haiku are just attempts to quantise/distil down to an acceptable performance/cost ratio. The cynic in me says we probably won't see any more of those until post-IPO, keep people addicted to the most costly models to pump a quarter or two of revenue figures, unless a competitor starts seriously undercutting them on price/performance. Hence the recent requests to slow down model training worldwide with their competitors.

Of course it could be that Fable "5" is just a marketing bump to the version, not a new foundation model...

ValentineC · 2026-06-12T00:00:45 1781222445

> Then Sonnet/Haiku are just attempts to quantise/distil down to an acceptable performance/cost ratio. The cynic in me says we probably won't see any more of those until post-IPO, keep people addicted to the most costly models to pump a quarter or two of revenue figures, unless a competitor starts seriously undercutting them on price/performance. Hence the recent requests to slow down model training worldwide with their competitors.

I'm guessing there'll be a Sonnet/Haiku 5 release just around IPO, to keep the news cycle going, and so that user numbers will get a boost.

adwf · 2026-04-23T23:35:30 1776987330

That's the least of it: https://www.bbc.co.uk/news/articles/cpvxgl3n138o

All 500,000 participants for sale on Alibaba...

And official response: https://www.ukbiobank.ac.uk/news/a-message-to-our-participan...

nxobject · 2026-04-24T17:38:53 1777052333

More details on leaked information from El Reg, especially after (laudably) the British government has been more transparent: https://www.theregister.com/2026/04/23/500k_biobank_voluntee...

"The charity did not specify the types of data that were included, but Murray stated in the Commons that several markers were included in the listings:

- Gender

- Age

- Month and year of birth

- Assessment center data

- Attendance dates

- Socioeconomic status

- Lifestyle habits

- Measures from biological samples related to haematology, biology, and chemistry

- Sleep, diet, work environment, mental health, and health outcomes data."

fastaguy88 · 2026-04-24T15:19:57 1777043997

BioBank claims (1) only de-identified data was available and (2) none of the data was actually sold before the datasets were taken down.

john_strinlai · 2026-04-24T15:34:54 1777044894

unfortunately for most people, de-identified data is typically a very short analysis away from being re-identified.

the field of de-anonymization is booming.
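
A toy sketch of why that is, assuming made-up records and treating gender, month and year of birth, and attendance date as quasi-identifiers (the field names and rows are hypothetical, not taken from any real dataset): count how many records share each combination. Any combination that appears only once singles out one person to anyone who already knows those facts from another source.

```python
from collections import Counter

# Hypothetical, made-up records: no names, but several quasi-identifiers.
records = [
    {"gender": "F", "birth": "1961-03", "attended": "2008-05-14"},
    {"gender": "F", "birth": "1961-03", "attended": "2008-05-14"},
    {"gender": "M", "birth": "1955-11", "attended": "2009-01-20"},
    {"gender": "F", "birth": "1970-07", "attended": "2008-05-14"},
]

# Assumed set of quasi-identifiers; real analyses pick these per dataset.
QUASI_IDENTIFIERS = ("gender", "birth", "attended")


def unique_records(rows, keys=QUASI_IDENTIFIERS):
    """Return the rows whose quasi-identifier combination appears exactly once."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [row for row in rows if counts[tuple(row[k] for k in keys)] == 1]


singled_out = unique_records(records)
print(f"{len(singled_out)} of {len(records)} records are unique on {QUASI_IDENTIFIERS}")
```

The more columns you add to the key, the more records end up in a group of one, which is why a long list of "harmless" fields is the worrying part.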

nxobject · 2026-04-24T17:40:20 1777052420

Especially for a nation-state that's already hoovering up data broker products.

adwf · 2026-03-02T22:56:45 1772492205

This looks wonderful! After playing Cities Skylines 2 for the last week, all I can say is that as long as you have a half-decent traffic system, I'll be happy!

adwf · 2026-02-26T14:52:42 1772117562

Plus the SUV is usually point-to-point, leave home, go to work, come back. Whereas the bus is going back and forth ten times per day.

In Europe, the numbers differ even more. Lighter-weight cars are typically 1.5-2 tons, while a new London bus can be up to 18 tons when loaded. Taking wear as weight to the fourth power, that's ~5-16 units of wear for the car versus 104,976 units for the bus...
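
For anyone who wants to check the arithmetic, here's a minimal sketch assuming wear scales with the fourth power of weight. The rule is usually stated per axle load, so using total vehicle weight is a simplification; the function name is just for illustration.

```python
def relative_wear(weight_tons: float, exponent: int = 4) -> float:
    """Relative road wear, assuming wear scales with weight ** exponent."""
    return weight_tons ** exponent


# Weights taken from the figures quoted above.
vehicles = [("light car", 1.5), ("heavier car", 2.0), ("loaded bus", 18.0)]

for label, tons in vehicles:
    print(f"{label:12s} {tons:5.1f} t -> {relative_wear(tons):,.1f} units")

# How many 2-ton car passes equal one loaded bus pass under this assumption.
bus_vs_car = relative_wear(18.0) / relative_wear(2.0)
print(f"one bus pass ~ {bus_vs_car:,.0f} passes of a 2-ton car")
```

That only compares a single pass of each vehicle; it says nothing about how many passengers each pass carries.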

But this is all supposing we're optimising for road wear, which isn't really the point of a bus system.

snk · 2026-02-26T21:16:57 1772140617

I'm old. Back in the olden days - the 1900s - 2-ton cars were not lightweight, the so-called heavy Chevys.

adwf · 2026-02-24T11:28:26 1771932506

I work on ML problems in the healthcare/life sciences area, so anything that enhances explainability is helpful. To a regulator, it's not really good enough to point at a black box and say you don't know why it gave the wrong answer this time. They have an odd acceptance of human error, but very little for technological uncertainty.

adwf · 2026-02-13T01:02:45 1770944565

Oh god, the bad mocks are the worst. Try adding instructions not to make mocks and it creates "placeholders", ask it to not create mocks or placeholders and it creates "stubs". Drives me mad...

To add to this list:

- Duplicate functions when you've asked for a slight change of functionality (e.g. write_to_database and write_to_database_with_cache), never actually updating all the calls to the old function, so you end up with a split codebase.

- In a similar vein, the backup code path of "else: do a stupid static default" instead of erroring, which would be much more helpful for debugging.

- Strong desires to follow architecture choices it was trained on, regardless of instruction. It might have been trained on some presumably high quality, large and enterprise-y codebases, but I'm just trying to write a short little throwaway program which doesn't need the complexity. KISS seems anathema to coding agents.
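
A minimal sketch of the "static default" complaint, with hypothetical function and key names made up for illustration. The first version is the pattern agents tend to produce; the second errors out so the broken caller shows up straight away.

```python
def load_timeout_silent(config: dict) -> int:
    """The pattern being complained about: hide a missing key behind a default."""
    if "timeout" in config:
        return int(config["timeout"])
    else:
        return 30  # static default quietly masks a misconfigured caller


def load_timeout_strict(config: dict) -> int:
    """Fail loudly, so the traceback points at the call site that is wrong."""
    try:
        return int(config["timeout"])
    except KeyError:
        raise KeyError("config is missing required key 'timeout'") from None


print(load_timeout_silent({}))  # runs fine, and the bug goes unnoticed

try:
    load_timeout_strict({})
except KeyError as exc:
    print(f"strict version raised: {exc}")
```

A default is fine when it's a deliberate, documented choice; the problem is when it's added as a reflex to stop the code from ever raising.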

ziml77 · 2026-02-13T04:16:09 1770956169

I'm sort of happy to see all these things I run into listed out as issues people have so I know it's not just me experiencing and being bothered by these behaviors.

adwf · on May 25, 2025

Not agreeing or disagreeing with your point, just adding info for context:

Fortnite: July 25, 2017 (Battle Royale mode launched September 26, 2017)

Apex Legends: February 4, 2019

Valorant: June 2, 2020

Overwatch: May 24, 2016

Call of Duty: 2003, Annual release

League of Legends: October 27, 2009

Dota 2: July 9, 2013

Roblox: 2006 (initially as DynaBlocks, rebranded to Roblox the same year)

Blame Claude 4 if any date is wrong...

leecommamichael · on May 25, 2025

I will not blame an LLM, I will blame your laziness for repeating a message you didn't verify.

johnisgood · on May 25, 2025

I know, right?! People misuse a valuable tool and blame the tool for it.

adwf · on May 22, 2025

In the UK and a few other countries (Norway, Hungary, Canada, ???), EVs will have a green "flash" on the number plate. Makes it a bit easier to identify!

adwf · on May 8, 2025

Forgive me if I find it somewhat difficult to take seriously an argument by a person judging progress on the Kardashev scale...

You could pick some slightly less sci-fi measures like "number of trivially preventable deaths from diseases for which we have vaccines", for example.

adwf · on April 8, 2025

Because an autistic person can be an amazing programmer? As could a blind person, a deaf person, etc...

Simple accommodations can be made if needed and then there's no need to exclude people on old-fashioned prejudice.

throwaway290 · on April 8, 2025

Where did I mention being an amazing programmer? If that's the requirement then why not. The comment was replying specifically about an environment where you gotta sit through hour-long meetings, and that is what I wrote about.

Maybe there is a company where being an amazing programmer is enough. I worked with a capable depressed programmer who never delivers and is too shy to delegate anything, a capable psycho programmer who no one wants to work with, and a bad programmer who works crazy hours, carries the project and interacts nicely with customers when needed. The last one was probably the most valuable.

adwf · on April 8, 2025

> Where did I mention being an amazing programmer?

I mean... that's what the title and context of the discussion thread is all about?

throwaway290 · on April 8, 2025

If you are an amazing programmer but can't function in the 1-hour sitdown meeting which is part of your job activities, then you are de facto a worse candidate than the next amazing programmer who can, that's just how it is.