
daemonologist · 2026-06-16T19:48:01 1781639281

The benefit of running the full-precision version is negligible (probably not even measurable above the benchmark noise floor). The most common choice for cost-conscious users is to run something around 4-6 bits per weight, which would fit on a 24 or 32 GB card (as you mentioned).
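A rough back-of-the-envelope sketch of why 4-6 bits per weight lands in 24-32 GB territory. The 30B parameter count is an assumption for illustration, and the function only counts the weights themselves; KV cache, activations, and runtime overhead come on top.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Hypothetical parameter count, chosen only for illustration.
N_PARAMS = 30e9

for bits in (16, 6, 4):
    print(f"{bits:>2} bits/weight: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the 6-bit case leaves very little headroom on a 24 GB card once context is added, which is presumably why the range extends down to 4 bits.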

daemonologist · 2026-06-16T03:32:42 1781580762

There is no "coder" version of Qwen 3.6; I think they just mean it's a coding-focused model of similar size and performance (to Qwen 3.6 35B-A3B).

Regular Qwen 3.6 benchmarks slightly better and has much wider software support though, so this is probably of interest only to organizations which disallow models trained in China.

kadoban · 2026-06-16T03:56:11 1781582171

I mean, Qwen 3.6 kicks ass. I don't know who these people are, but if their first outing is "not quite as good as Qwen 3.6", that's not a bad start by any means.

30B vs 35B isn't nothing either.

If it ends up just being some tweaks to someone else's weights, then meh.

mtone · 2026-06-16T05:07:05 1781586425

It was trained from scratch by Cohere. They're the only Canadian AI lab - I'm glad they're releasing open weights and I wish them luck catching up!

daemonologist · 2026-06-14T17:18:30 1781457510

The allegation here is that it's not actually a fine-tune of Qwen, but instead an undisclosed mashup (merge) of someone else's fine-tune of Qwen and the original model. Rio subsequently said that the model was in fact a merge, that they did additional fine-tuning after the merge, and that they accidentally uploaded the base merge instead of the version with additional fine-tuning. But this seems like quite an oversight...
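For readers unfamiliar with merging, here is a toy sketch of one common approach, linear interpolation between two checkpoints that share an architecture. Nothing in the thread says which merge method was actually used; `linear_merge`, `alpha`, and the tiny weight dictionaries are hypothetical.

```python
def linear_merge(base: dict, finetune: dict, alpha: float) -> dict:
    """Interpolate two checkpoints with identical parameter names and shapes.

    alpha = 0.0 returns the base weights, alpha = 1.0 returns the fine-tune.
    """
    if base.keys() != finetune.keys():
        raise ValueError("checkpoints must have the same parameter names")
    return {
        name: [(1 - alpha) * b + alpha * f
               for b, f in zip(base[name], finetune[name])]
        for name in base
    }


# Toy "checkpoints": one parameter tensor flattened to a list of floats.
base = {"w": [0.0, 2.0]}
finetune = {"w": [1.0, 4.0]}
merged = linear_merge(base, finetune, alpha=0.5)
```

No training is involved in a step like this, which is why an undisclosed merge is treated differently from a fine-tune.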

yieldcrv · 2026-06-14T18:55:01 1781463301

> But this seems like quite an oversight...

Not to me. What would people like to happen? Who are those people? And why do they care?

antonvs · 2026-06-15T05:52:03 1781502723

They made a public claim to having produced a useful model, which they published. Turns out they did nothing of the sort.

> why do they care?

Why does anyone ever care about having their time wasted by fraudulent claims?

yieldcrv · 2026-06-15T10:40:31 1781520031

Continue to explain like I’m 5 instead of the rhetoricals

daemonologist · 2026-06-13T17:32:47 1781371967

There are also significant economies of scale (namely: utilization and batching), which tend to make inference on a shared server more economical even after the operator takes a cut.

zozbot234 · 2026-06-13T19:28:39 1781378919

You can use batching on consumer hardware, it just requires a KV-cache efficient model (or short context only) and keeping multiple inference flows running in parallel. This is most useful in combination with streamed inference, since the compute intensity of decode with those newer KV-compressed models is high enough that you have limited compute headroom when running at the speed of RAM.
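A simplified roofline-style sketch of the trade-off described above. It assumes each decode step reads the active weights once plus every sequence's KV cache, and that step time is whichever of memory traffic or compute takes longer. All numbers are hypothetical and not measurements of any real hardware or model.

```python
def decode_tokens_per_second(batch: int, weight_bytes: float,
                             kv_bytes_per_seq: float, mem_bandwidth: float,
                             flops_per_token: float, peak_flops: float) -> float:
    """Aggregate decode throughput under a simple max(memory, compute) model."""
    # Weights are read once per step and shared by the whole batch;
    # KV cache traffic grows with the number of parallel sequences.
    bytes_moved = weight_bytes + batch * kv_bytes_per_seq
    memory_time = bytes_moved / mem_bandwidth
    compute_time = batch * flops_per_token / peak_flops
    return batch / max(memory_time, compute_time)


# Hypothetical consumer setup: 100 GB/s memory, 20 TFLOP/s compute.
hw = dict(weight_bytes=2e9, mem_bandwidth=100e9,
          flops_per_token=6e9, peak_flops=20e12)

for kv in (0.5e9, 0.05e9):  # bulky vs. compressed KV cache per sequence
    rates = [decode_tokens_per_second(b, kv_bytes_per_seq=kv, **hw)
             for b in (1, 4, 16)]
    print(f"KV {kv / 1e9:.2f} GB/seq:", [round(r) for r in rates])
```

In this model a smaller per-sequence KV cache lets batching amortize the weight reads over more sequences before memory traffic or compute becomes the limit.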

daemonologist · 2026-06-12T23:40:53 1781307653

You can bounce the ball up slightly (presumably the spin from rolling is modeled or approximated, and gives lift when hitting a bumper), which might be enough to skip from the tee to near the end of the course. Not sure that should be considered for "par" though. Took me 14.

daemonologist · 2026-06-12T14:27:55 1781274475

Opus 4.7 and 4.8 are also rather "proactive" - several times I've seen them try to inspect compiled binaries before there's even a problem, just to check that their changes are included (and if I let them do so they often get stuck down that rabbit hole).

fwip · 2026-06-12T20:20:31 1781295631

I've also seen this. It'll run 'strings' against the binary and then convince itself that the Makefile isn't working right, and there's some imaginary sandbox preventing the code from compiling properly. So it will compile it by hand, and never run strings against the new binary, and proceed happily.

ElFitz · 2026-06-13T06:57:34 1781333854

These kinds of situations are why I gave my AI agents stray thoughts (automated insights / suggestions from a separate llm call with some curated context) that trigger on loop / rabbit hole detection.

Quite a few false positives, but it hasn’t had any ill effects so far, aside from increased quota usage.
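A minimal sketch of what loop / rabbit hole detection could look like, assuming the agent's actions are available as strings. The comment does not describe the actual implementation; `LoopDetector`, its thresholds, and the sample actions are hypothetical, and the separate LLM call that would supply the "stray thought" is left as a print statement.

```python
from collections import Counter, deque


class LoopDetector:
    """Flags an action that keeps recurring within a sliding window."""

    def __init__(self, window: int = 8, max_repeats: int = 3):
        self.recent = deque(maxlen=window)  # oldest actions fall off the end
        self.max_repeats = max_repeats

    def observe(self, action: str) -> bool:
        """Record an action; return True if it now looks like a loop."""
        self.recent.append(action)
        return Counter(self.recent)[action] >= self.max_repeats


detector = LoopDetector()
for action in ["make", "strings ./app", "make", "strings ./app", "strings ./app"]:
    if detector.observe(action):
        # A real agent would trigger the separate "stray thought" call here.
        print(f"possible rabbit hole at: {action!r}")
```

A crude counter like this will fire on legitimate repetition too, which fits the false positives mentioned above.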

daemonologist · 2026-06-11T22:35:19 1781217319

I admit I snorted when that was mentioned. It's frequently ranked as the most desirable place to live on earth.

Not to say the message of the article is completely without merit - there are things to see and do almost everywhere. But if I just get in the car and start driving I will 95% of the time find only strip malls and cornfields. Perhaps a suburban park with some trees.

qurren · 2026-06-12T18:47:25 1781290045

Switzerland is not unique in that aspect.

Pick any mountainous, desert, or coastal part of the world and you are guaranteed scenery for ~95% of the drive.

Pick any historic part of the world and you are guaranteed nice-looking buildings anywhere you walk.

A sizeable fraction of the world fits into either of the above. Yeah, if you live in cornfields and strip malls, you aren't going to find much interesting. But in fact, most of the world isn't like that. Arguably the cornfields and strip malls are the minority.

Throw a dart anywhere on Kyrgyzstan, Japan, Indonesia, Norway, China, India, Turkey, Uzbekistan, Argentina, Namibia, the southwestern US, northwestern US, or Mexico. You'll find lots of interesting things wherever your dart landed. These aren't very cherry-picked countries, I just named a few off the top of my head where the "dart anywhere is interesting" idea holds. My point is that the world is full of "Switzerlands".

zazuke · 2026-06-16T08:48:22 1781599702

This is so true, and also what I thought reading these comments. Love the line "the world is full of 'Switzerlands'". I added it to the note, thanks for the comment!

daemonologist · 2026-06-10T21:43:42 1781127822

Unfortunately Radxa and Milk-V are almost completely out of stock and not much cheaper. If you need more than a microcontroller there's no circumventing the memory shortage at this point.

Kicking myself for not buying the Q6A at the beginning of the year (I wanted three and arace would only sell one per customer, but one would've been better than none).

daemonologist · 2026-05-30T07:08:46 1780124926

In the US, 99th percentile household wealth is ~$14M, which at historical rates of return is enough to live opulently indefinitely. (Of course, even in a scenario where capital holds most of the cards, who knows if those returns would be dependable.)

33MHz-i486 · 2026-05-30T17:30:19 1780162219

If you dig into what's actually safe to distribute after inflation and taxes, or into conservative mid-life FIRE recommendations, it's around 1-2% of principal per year. From $14M, that's 10-20k/month, about the budget of a white-collar household in a major metro. Which is nice but hardly opulent. Rent, healthcare, and kids (or some expensive hobbies) eat that up in a hurry.
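The arithmetic behind that estimate, as a quick sketch. The $14M principal and the 1-2% rates come from the comments above; `monthly_budget` is just an illustrative helper and ignores taxes and inflation adjustments.

```python
def monthly_budget(principal: float, annual_rate: float) -> float:
    """Monthly spending if a fixed fraction of principal is withdrawn per year."""
    return principal * annual_rate / 12


PRINCIPAL = 14_000_000

for rate in (0.01, 0.02):
    print(f"{rate:.0%} of ${PRINCIPAL:,}: ${monthly_budget(PRINCIPAL, rate):,.0f}/month")
```

This works out to roughly $11.7k and $23.3k per month, slightly above the rounded 10-20k figure.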

confidantlake · 2026-05-30T18:59:31 1780167571

20k a month being the budget of a white-collar household? We obviously live in very different worlds.

jp_sc · 2026-05-31T01:19:42 1780190382

Yes, but it’s 20k and all the time in the world.

daemonologist · 2026-05-28T22:51:36 1780008696