
PhilippGille · 2026-06-09T06:36:36 1780986996

The interesting bits on how they achieved it:

> On the model side, we applied FP4 quantization

> introduced DFlash, an efficient speculative decoding method based on block-level masked parallel prediction

> On the system side, TileRT perfectly adapts to the dynamic characteristics of these algorithms

> 1000+ tokens/s output [...] using just a single standard 8-GPU commodity node

PhilippGille · 2026-06-01T07:29:00 1780298940

The blog post has more info: https://www.minimax.io/blog/minimax-m3

PhilippGille · 2026-05-29T22:11:23 1780092683

Do you mean MiMo V2 Flash? V2.5 doesn't have a Flash version.

whatifitoldyou · 2026-05-30T12:11:22 1780143082

You're right. I mean MiMo V2.5. The smaller model compared to MiMo V2.5 Pro.

PhilippGille · 2026-05-12T21:31:55 1778621515

It's in the article:

> HTTP also allows the DuckDB-Wasm distribution to speak Quack natively! So DuckDB running in a browser can e.g., directly connect to a DuckDB instance running in an EC2 server using Quack.

anentropic · 2026-05-13T14:06:22 1778681182

I missed that and it seems like one of the more compelling features...!

znite · 2026-05-13T01:48:56 1778636936

Thanks, I thought I had searched for it and it didn't come up. Great stuff.

philipallstar · 2026-05-13T10:24:28 1778667868

That is a pretty amazing feature.

PhilippGille · 2026-05-09T09:32:07 1778319127

Both the original Markdown spec [1] and CommonMark [2] clearly specify support for inline HTML. With that you can get the best of both worlds, depending on your use case.

For the most part you just write regular Markdown headers and paragraphs, embed images, insert tables, etc. without needing any HTML tags, which keeps it readable in source form. And if you want to embed an SVG file, for example, which the author of the article mentions as one use case, you just embed the SVG directly, and people can render the Markdown in their favorite viewer.

Let's say you're viewing a raw Markdown file in VS Code. You come across an HTML tag, so you hit Cmd+Shift+V to open the preview and that's it.

Of course for full-fledged web pages with interactive buttons and fully customized styling and all of that, which the author shows in some examples, this is not feasible. But you can get very far when you have mostly text/images/tables and just want to add some extras here and there.

[1] https://daringfireball.net/projects/markdown/syntax#html

[2] https://spec.commonmark.org/0.31.2/#html-blocks
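To make the inline-HTML approach above concrete, here is a minimal sketch in Python that assembles a Markdown document with one embedded SVG. The function name `build_markdown`, the sample SVG, and the file name `report.md` are made up for illustration. The blank-line check reflects how CommonMark treats this kind of HTML block: it starts at a complete tag on its own line and ends at the next blank line, so the SVG is kept free of blank lines.

```python
from pathlib import Path

# Hypothetical sample SVG. The opening tag is complete on its own line and
# there are no blank lines inside, so a CommonMark renderer should treat the
# whole thing as one HTML block.
SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="120" height="40" role="img">
  <rect width="120" height="40" fill="#eee"/>
  <text x="10" y="25">inline SVG</text>
</svg>"""


def build_markdown(title: str, body: str, svg: str) -> str:
    """Return a Markdown document: heading, paragraph, then a raw SVG block."""
    # A blank line would end the HTML block early, so reject it up front.
    if any(line.strip() == "" for line in svg.splitlines()):
        raise ValueError("SVG block must not contain blank lines")
    # Blank lines around the SVG separate it from the surrounding paragraphs.
    return f"# {title}\n\n{body}\n\n{svg}\n\n"


doc = build_markdown(
    "Report",
    "Regular *Markdown* text stays readable in source form.",
    SVG,
)

if __name__ == "__main__":
    # Write the file so it can be opened in any Markdown viewer.
    Path("report.md").write_text(doc, encoding="utf-8")
```

The heading and paragraph stay plain Markdown, and only the part that needs it drops down to HTML.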

the_gipsy · 2026-05-09T09:43:14 1778319794

You should never have to preview a markdown document, in my opinion. At that point, just make an HTML document.

PhilippGille · 2026-05-07T18:24:54 1778178294

On max it uses more than twice as many tokens as on high when running the ArtificialAnalysis benchmark suite, and then it's indeed the model with the highest token usage (among the current top tier models). See the "Intelligence vs. Token Use" chart here:

https://artificialanalysis.ai/models?models=gpt-5-5%2Cgpt-5-...

amunozo · 2026-05-07T19:11:41 1778181101

Wow, the difference is quite considerable and the gain in intelligence is not that much. I might try to use high and just iterate more often. I am working with hobby stuff so I don't have to worry whether it breaks things or not.

PhilippGille · 2026-05-05T06:04:56 1777961096

Benchmarks only paint part of the picture, but it's still a decent place to start looking into recent models:

https://huggingface.co/spaces/mteb/leaderboard

PhilippGille · 2026-04-24T14:26:17 1777040777

When you say "Gemini", which exact model do you mean? You know there are several and they vary a lot in how capable they are? Pro 3.1 Preview, 2.5 Pro (their latest non-preview pro model), Flash 3 Preview, ...

Same with GPT-5: Latest 5.5, prior 5.4, or actually the original 5 (.0)?

You can't talk about model performance without specifying the exact model.

hodgehog11 · 2026-04-24T15:27:38 1777044458

My apologies, I thought it would be implicit that I am using the top-tier model of the time given the challenge of the tasks. GPT-5.5 was too new in this top comment (although I did test it a bit in a comment below), so I was using GPT-5.4. Gemini is Pro 3.1 Preview.

WarmWash · 2026-04-24T14:39:23 1777041563

High bet on 3.1 pro. I use it a lot for math and classic engineering, it's very strong.

PhilippGille · 2026-04-12T08:30:44 1775982644

> C# [...] only really works properly in Windows

What do you mean by this? Maybe you are thinking of the old ".NET Framework" runtime, which only runs on Windows? Nowadays there is ".NET Core", which runs on macOS and Linux as well.

jurschreuder · 2026-04-12T19:12:08 1776021128

Even on Windows, .NET does not work properly with MySQL and Postgres. It only really works properly with Microsoft's MySQL clone; I don't know the official name.

leosanchez · 2026-04-13T04:10:37 1776053437

It works pretty well with Postgres, SQLite and MySQL. You don't know what you are talking about.

PhilippGille · 2026-04-12T07:49:57 1775980197

He specifically mentions that he is using GitHub Copilot because Microsoft bills per request instead of per token.
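As a rough illustration of why the billing model matters, here is a small Python sketch comparing the two schemes for a single long agent session. Every number in it (request count, token counts, and all prices) is a made-up placeholder, not an actual Copilot or API price; the point is only the shape of the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Usage of one working session (all values hypothetical)."""
    requests: int
    input_tokens: int
    output_tokens: int


def per_request_cost(s: Session, price_per_request: float) -> float:
    # Per-request billing ignores how many tokens each request consumes.
    return s.requests * price_per_request


def per_token_cost(s: Session, in_price_per_mtok: float, out_price_per_mtok: float) -> float:
    # Per-token billing scales with context size and output length.
    return (s.input_tokens / 1e6) * in_price_per_mtok + (s.output_tokens / 1e6) * out_price_per_mtok


# Hypothetical session: few requests, but each one carries a lot of context.
session = Session(requests=10, input_tokens=2_000_000, output_tokens=100_000)

# Placeholder prices, chosen only for illustration.
by_request = per_request_cost(session, price_per_request=0.04)
by_token = per_token_cost(session, in_price_per_mtok=3.0, out_price_per_mtok=15.0)

if __name__ == "__main__":
    print(f"per request: {by_request:.2f}")
    print(f"per token:   {by_token:.2f}")
```

Under these assumed numbers, a few token-heavy requests come out cheaper with per-request billing; with many small requests the comparison would flip.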