The benefit of running the full precision version is negligible (probably not even measurable above the benchmark noise floor). Cost-conscious users most commonly run something around 4-6 bits per weight, which would fit on a 24 or 32 GB card (as you mentioned).
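As a rough back-of-the-envelope check (weights only, ignoring KV cache and runtime overhead, and assuming the 30B and 35B parameter counts mentioned elsewhere in this thread), something like this gives the sizes:

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    # params_billion * 1e9 weights * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    for params in (30, 35):          # assumed parameter counts, in billions
        for bits in (4, 5, 6, 16):   # 16 = unquantized bf16/fp16
            print(f"{params}B @ {bits} bpw: {weight_gb(params, bits):.1f} GB")
```

By this estimate a 35B model at 4 bits per weight is about 17.5 GB and leaves room for context on a 24 GB card, while 6 bits per weight (about 26 GB) already needs the 32 GB card.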
There is no "coder" version of Qwen 3.6; I think they just mean it's a coding-focused model of similar size and performance (to Qwen 3.6 35B-A3B).
Regular Qwen 3.6 benchmarks slightly better and has much wider software support though, so this is probably of interest only to organizations which disallow models trained in China.
I mean, Qwen 3.6 kicks ass. I don't know who these people are, but if their first outing is "not quite as good as Qwen 3.6", that's not a bad start by any means.
30B vs 35B isn't nothing either.
If it ends up just being some tweaks to someone else's weights, then meh.
The allegation here is that it's not actually a fine-tune of Qwen, but instead an undisclosed mashup (merge) of someone else's fine-tune of Qwen and the original model. Rio subsequently said that the model was in fact a merge, that they did additional fine-tuning after the merge, and that they accidentally uploaded the base merge instead of the version with additional fine-tuning. But this seems like quite an oversight...
There are also significant economies of scale (namely: utilization and batching), which tend to make inference on a shared server more economical even after the operator takes a cut.
You can use batching on consumer hardware, it just requires a KV-cache efficient model (or short context only) and keeping multiple inference flows running in parallel. This is most useful in combination with streamed inference, since the compute intensity of decode with those newer KV-compressed models is high enough that you have limited compute headroom when running at the speed of RAM.
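To make the batching argument concrete, here is a toy roofline sketch. It assumes each decode step reads the weights once plus one KV cache per sequence, and that throughput is capped by a fixed compute ceiling. Every number and name below is hypothetical, not a measurement of any real model or machine.

```python
def decode_tokens_per_s(batch: int,
                        weights_read_gb: float,
                        kv_gb_per_seq: float,
                        bandwidth_gbs: float,
                        compute_cap_tok_s: float) -> float:
    """Toy estimate of aggregate decode throughput across the whole batch."""
    # Data moved per decode step: weights once, KV cache once per sequence.
    gb_per_step = weights_read_gb + batch * kv_gb_per_seq
    # Each step yields `batch` tokens, so memory-bound throughput is:
    memory_bound = batch * bandwidth_gbs / gb_per_step
    # Compute sets a ceiling no matter how large the batch gets.
    return min(memory_bound, compute_cap_tok_s)


if __name__ == "__main__":
    # Hypothetical machine: 100 GB/s memory stream, 400 tok/s compute ceiling.
    for kv in (0.25, 0.01):  # large vs. compressed KV cache per sequence, in GB
        for batch in (1, 4, 16, 64):
            rate = decode_tokens_per_s(batch, 2.0, kv, 100.0, 400.0)
            print(f"kv={kv} GB  batch={batch:>2}: {rate:6.1f} tok/s")
```

In this model a small KV cache lets the batch amortize the weight reads almost perfectly until the compute ceiling is hit, which is the "limited compute headroom" effect described above.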
You can bounce the ball up slightly (presumably the spin from rolling is modeled or approximated, and gives lift when hitting a bumper), which might be enough to skip from the tee to near the end of the course. Not sure that should be considered for "par" though. Took me 14.
Opus 4.7 and 4.8 are also rather "proactive" - several times I've seen them try to inspect compiled binaries before there's even a problem, just to check that their changes are included (and if I let them do so they often get stuck down that rabbit hole).
I've also seen this. It'll run 'strings' against the binary and then convince itself that the Makefile isn't working right, and there's some imaginary sandbox preventing the code from compiling properly. So it will compile it by hand, and never run strings against the new binary, and proceed happily.
These kinds of situations are why I gave my AI agents stray thoughts (automated insights / suggestions from a separate llm call with some curated context) that trigger on loop / rabbit hole detection.
Quite a few false positives, but it hasn't had any ill effect so far, aside from increased quota usage.
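For anyone curious, the detection half can be very simple. This is only a minimal sketch of one possible heuristic (flag an action that recurs several times within a sliding window), not the actual setup described above; `LoopDetector`, `window`, and `threshold` are made-up names.

```python
from collections import Counter, deque


class LoopDetector:
    """Flags an action that keeps recurring within the last `window` actions."""

    def __init__(self, window: int = 8, threshold: int = 3):
        self.recent = deque(maxlen=window)  # oldest actions fall off automatically
        self.threshold = threshold

    def observe(self, action: str) -> bool:
        """Record an action; return True if it now appears `threshold`+ times."""
        self.recent.append(action)
        return Counter(self.recent)[action] >= self.threshold


if __name__ == "__main__":
    detector = LoopDetector(window=5, threshold=3)
    for call in ["make", "strings ./bin", "make", "ls", "make"]:
        if detector.observe(call):
            # This is where the separate "stray thought" LLM call would go.
            print(f"possible loop on: {call}")
```

A heuristic this crude produces false positives (repeating `make` is often legitimate), which is tolerable when a trigger only adds a suggestion rather than stopping the agent.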
I admit I snorted when that was mentioned. It's frequently ranked as the most desirable place to live on earth.
Not to say the message of the article is completely without merit - there are things to see and do almost everywhere. But if I just get in the car and start driving I will 95% of the time find only strip malls and cornfields. Perhaps a suburban park with some trees.
Pick any mountainous, desert, or coastal part of the world and you are guaranteed scenery for ~95% of the drive.
Pick any historic part of the world and you are guaranteed nice-looking buildings anywhere you walk.
A sizeable fraction of the world fits into either of the above. Yeah, if you live in cornfields and strip malls, you aren't going to find much interesting. But in fact, most of the world isn't like that. Arguably the cornfields and strip malls are the minority.
Throw a dart anywhere on Kyrgyzstan, Japan, Indonesia, Norway, China, India, Turkey, Uzbekistan, Argentina, Namibia, the southwestern US, northwestern US, or Mexico. You'll find lots of interesting things wherever your dart landed. These aren't very cherry-picked countries; I just named a few off the top of my head where "a dart anywhere is interesting" holds true. My point is that the world is full of "Switzerlands".
This is so true, and also what I thought reading these comments. Love the line "the world is full of 'Switzerlands'". I added it to the note, thanks for the comment!
Unfortunately Radxa and Milk-V are almost completely out of stock and not much cheaper. If you need more than a microcontroller there's no circumventing the memory shortage at this point.
Kicking myself for not buying the Q6A at the beginning of the year (I wanted three and arace would only sell one per customer, but one would've been better than none).
In the US, 99th percentile household wealth is ~$14M, which at historical rates of return is enough to live opulently indefinitely. (Of course, in the scenario we're discussing, where capital holds most of the cards, who knows if those returns would be dependable.)
If you dig into what's actually safe to withdraw after inflation and taxes, or into conservative FIRE recommendations for mid-life retirees, it's around 1-2% of principal per year. From $14M, that's roughly $12-23k/month, about the budget of a white-collar household in a major metro. Which is nice but hardly opulent. Rent, healthcare, and kids (or some expensive hobbies) eat that up in a hurry.
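The arithmetic, as a quick sketch (the 1-2% rates and the $14M principal are the figures from this thread; the function name is just illustrative):

```python
def monthly_budget(principal: float, annual_rate: float) -> float:
    """Monthly spend if `annual_rate` of the principal is withdrawn each year."""
    return principal * annual_rate / 12


if __name__ == "__main__":
    principal = 14_000_000  # ~99th percentile US household wealth, per the parent comment
    for rate in (0.01, 0.02):
        print(f"{rate:.0%} of ${principal:,}: ${monthly_budget(principal, rate):,.0f}/month")
```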