PhilippGille's comments

The interesting bits on how they achieved it:

> On the model side, we applied FP4 quantization

> introduced DFlash, an efficient speculative decoding method based on block-level masked parallel prediction

> On the system side, TileRT perfectly adapts to the dynamic characteristics of these algorithms

> 1000+ tokens/s output [...] using just a single standard 8-GPU commodity node
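
For anyone unfamiliar with FP4, here is a minimal sketch of what 4-bit float quantization does to a block of values. It assumes the common E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit) and a simple per-block absmax scale. The quoted text does not say which FP4 variant or scaling scheme is used, so the function names and the scaling choice are illustrative assumptions, not their method.

```python
# Magnitudes representable in E2M1 (assumed FP4 layout; the source does not specify one).
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_block_fp4(block):
    """Return (scale, quantized), where every quantized value lies on the E2M1 grid.

    Hypothetical helper: per-block absmax scaling is an assumption for illustration.
    """
    absmax = max((abs(x) for x in block), default=0.0)
    # Map the largest magnitude in the block onto 6.0, the largest E2M1 value.
    scale = absmax / 6.0 if absmax > 0 else 1.0
    quantized = []
    for x in block:
        target = abs(x) / scale
        # Round to the nearest representable magnitude, then restore the sign.
        nearest = min(FP4_E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        quantized.append(-nearest if x < 0 else nearest)
    return scale, quantized


def dequantize_block_fp4(scale, quantized):
    """Recover approximate original values from the grid values and the block scale."""
    return [q * scale for q in quantized]
```

With only 8 magnitudes per sign, the rounding error is large for any single value, which is why real schemes lean on small blocks and carefully chosen scales.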


The blog post has more info: https://www.minimax.io/blog/minimax-m3

Do you mean MiMo V2 Flash? V2.5 doesn't have a Flash version.


You're right. I mean MiMo V2.5. The smaller model compared to MiMo V2.5 Pro.


It's in the article:

> HTTP also allows the DuckDB-Wasm distribution to speak Quack natively! So DuckDB running in a browser can e.g., directly connect to a DuckDB instance running in an EC2 server using Quack.


I missed that and it seems like one of the more compelling features...!


Thanks, I thought I had searched for it and it didn't come up. Great stuff.


That is a pretty amazing feature.


Both the original Markdown spec [1] and CommonMark [2] clearly specify support for inline HTML. With that you can kind of get the best of both worlds, depending on your use case.

For the most part you just write regular Markdown headers and paragraphs, embed images, insert tables, etc. without the need for any HTML tags, which keeps it readable in source form. And if you want to embed an SVG file, for example, which the author of the article mentions as one use case, you just embed the SVG directly, and people can render the Markdown in their favorite viewer.

Let's say you're viewing a raw Markdown file in VS Code. You come across an HTML tag, so you hit Cmd+Shift+V to open the preview and that's it.

Of course for full-fledged web pages with interactive buttons and fully customized styling and all of that, which the author shows in some examples, this is not feasible. But you can get very far when you have mostly text/images/tables and just want to add some extras here and there.

[1] https://daringfireball.net/projects/markdown/syntax#html

[2] https://spec.commonmark.org/0.31.2/#html-blocks


You should never have to preview a markdown document, in my opinion. At that point, just make an HTML document.


On max it uses more than twice as many tokens as on high when running the ArtificialAnalysis benchmark suite, and then it's indeed the model with the highest token usage (among the current top tier models). See the "Intelligence vs. Token Use" chart here:

https://artificialanalysis.ai/models?models=gpt-5-5%2Cgpt-5-...


Wow, the difference is quite considerable and the gain in intelligence is not that much. I might try to use high and just iterate more often. I am working with hobby stuff so I don't have to worry whether it breaks things or not.


Benchmarks only paint part of the picture, but it's still a decent place to start looking into recent models:

https://huggingface.co/spaces/mteb/leaderboard


When you say "Gemini", which exact model do you mean? You know there are several and they vary a lot in how capable they are? Pro 3.1 Preview, 2.5 Pro (their latest non-preview pro model), Flash 3 Preview, ...

Same with GPT-5: Latest 5.5, prior 5.4, or actually the original 5 (.0)?

You can't talk about model performance without specifying the exact model.


My apologies, I thought it would be implicit that I am using the top-tier model of the time given the challenge of the tasks. GPT-5.5 was too new in this top comment (although I did test it a bit in a comment below), so I was using GPT-5.4. Gemini is Pro 3.1 Preview.


High bet on 3.1 pro. I use it a lot for math and classic engineering, it's very strong.


> C# [...] only really works properly in Windows

What do you mean by this? Maybe you are thinking of the old ".NET Framework" runtime, which only runs on Windows? Nowadays there is ".NET Core", which runs on macOS and Linux as well.


Even on Windows, .NET does not work properly with MySQL and Postgres. It only really works properly with the Microsoft MySQL clone, or whatever the official name is.


It works pretty well with Postgres, SQLite and MySQL. You don't know what you are talking about.


He specifically mentions that he is using GitHub Copilot because Microsoft bills per request instead of per token.

