
rhdunn · 2026-05-29T20:22:56 1780086176

It's all relative. For local use I'd classify it by hardware (VRAM size) using FP8 or Q6 quantization:

1. tiny <2-3B -- easily runnable on lower-spec hardware

2. small 4-8B -- runnable on 8GB GPUs

3. medium 9-12B -- runnable on 12GB GPUs

4. large 13-24B -- runnable on 16GB (for the lower end models) and 24GB GPUs

5. very large 25-32B -- runnable on 32GB GPUs

6. huge >32B -- not easily runnable on consumer GPUs without compromising performance (offloading layers to the CPU/RAM), quality (heavy quantization, esp. at <= Q4), or price (investing in multi-GPU setups and/or server-grade hardware).

You could possibly split huge down further, as 70B models (e.g. llama 3) are easier to get working than >120B models, and 1T models are completely intractable.
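As a rough illustration of the arithmetic behind those tiers, here is a minimal sketch that estimates the size of the weights alone and maps a parameter count (in billions) to the tier names above. The function names are hypothetical, Q6 is approximated as exactly 6 bits per weight, and the estimate ignores the KV cache and runtime overhead, so real VRAM use will be higher.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8

# Upper bound (billions of parameters) for each tier in the list above.
TIERS = [(3, "tiny"), (8, "small"), (12, "medium"), (24, "large"), (32, "very large")]

def size_class(params_billion: float) -> str:
    """Map a parameter count to a tier name; anything above 32B is "huge"."""
    for upper, name in TIERS:
        if params_billion <= upper:
            return name
    return "huge"

for params in (3, 8, 12, 24, 32, 70):
    fp8 = weight_gb(params, 8)  # FP8: 8 bits per weight
    q6 = weight_gb(params, 6)   # Q6: assumed to be ~6 bits per weight
    print(f"{params}B ({size_class(params)}): FP8 ~{fp8:.1f} GB, Q6 ~{q6:.1f} GB")
```

At the top of each tier the FP8 weights alone roughly fill the GPU named for that tier, which is why Q6 (or a smaller model within the tier) leaves more room for context.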

sroussey · 2026-05-29T20:58:41 1780088321

As a Mac user:

1. tiny <2-3B -- could run in a browser even, mac neo

2. small 4-8B -- last of browser options, MacBook Air base

3. medium 9-24B -- 32GB machine, air or pro notebook or mini

4. large 25-48B -- 64GB, pro notebook or mini

5. x-large 49-100B -- 128GB MacBook Pro or Studio

6. Huge > 100B -- 256/512GB Mac Studio

ElFitz · 2026-05-29T21:42:04 1780090924

> tiny <2-3B -- could run in a browser even, mac neo

Or a phone. I’m running Gemma 4 E2B in one of my apps on my 14 Pro (which may or may not be killing my display through overheating; it might just be a coincidence).

rhdunn · 2026-05-29T19:59:21 1780084761

Yeah. I run LLM models locally and for me 22B-32B is the largest I'm willing to invest in trying out.

Even though Mistral 4 has 6B active parameters per token (so the active parameters for roughly 3-3.5 tokens fit on a 4090), the ~240GB download and storage requirement pushes the limits of what is practical to try locally, especially if you are downloading and evaluating multiple models.

It also makes it harder for other people to make downstream finetunes, as happened with the older Mistral/Magistral models.

wolttam · 2026-05-29T22:21:01 1780093261

I think machines like the DGX Spark are about to become a lot more common/popular. It’s big enough to run sparse 150-250B MoEs with enough throughput for a single user. DeepSeek v4 Flash is #1 (in terms of usage) on OpenRouter because it’s good enough to be useful. You can run it on a Spark (though it runs better across 2, which is getting up there in cost).

rhdunn · 2026-05-26T17:27:24 1779816444

NVDA is a free screen reader for Windows (written by blind devs) that works with Firefox and Chrome.

You don't need to pay for a specialist browser as all web browsers (Firefox, Chrome, Edge, Safari, etc.) will implement the native accessibility model of the operating system they are running on (IAccessible/MSAA for Windows, etc.).

In Firefox you can right-click and select "Inspect Accessibility Properties", or select the "Accessibility" tab in the developer tools, and it will show the accessibility tree (roles, states, properties, etc.) just like the DOM tree in the "Inspector" tab. That tree is what the browser exposes to screen readers and other accessibility software. The browser constructs it from the built-in behaviour of the HTML elements along with the ARIA roles/states/properties defined by the webpage. Thus, it will expose an ol/ul as `role=list` unless the website overrides it to be e.g. a `tablist`.

See https://www.w3.org/TR/wai-aria-implementation/ for the specification of how browsers should map HTML and ARIA to the different operating system accessibility APIs.

rhdunn · 2026-05-26T17:18:42 1779815922

See https://www.w3.org/WAI/ARIA/apg/patterns/ for a guide on how to create accessible markup for custom controls and the associated examples.

See specifically https://www.w3.org/WAI/ARIA/apg/practices/names-and-descript... for details on naming. That has extensive notes and details for labeling elements correctly.

See https://getbootstrap.com/docs/5.0/components/ for Bootstrap's markup for creating accessible components.

There are plenty of other resources.

nailer · 2026-05-26T19:57:01 1779825421

I didn't ask for resources on ARIA, are you replying to another comment?

rhdunn · 2026-05-26T20:20:11 1779826811

You said "see this article" re: how aria-label is not applicable to div elements, hence the second link which is the WAI-ARIA guide on labelling elements.

You also said that ARIA can't help with custom controls in that post, which is where the other links are applicable, as they describe doing just that, i.e. using ARIA attributes to implement tabs, accordions, etc., either with or without a framework library.

nailer · 2026-05-26T20:25:11 1779827111

> You also said that ARIA can't help with custom controls in that post

I didn't write the post. The author believes in ARIA, I believe ARIA is fundamentally broken.

rhdunn · 2026-05-26T21:09:51 1779829791

post != article

From https://qht.co/item?id=48281764:

> > ARIA can help when devs want to use the wrong elements for some reason or for custom controls.

> But it can't. See this article.

nailer · 2026-05-27T00:11:52 1779840712

People post both articles and comments to HN. My comment pointed out that ARIA can't help with anything as ARIA is a boil the ocean approach.

RobMurray · 2026-05-28T08:48:45 1779958125

ARIA is a solution to a specific problem, not something that should be used on every site. HTML is accessible out of the box when semantic elements are used as intended. If you are using a div as a button, you probably aren't hand writing HTML. It is likely part of a library. Adding the necessary ARIA attributes benefits every site using the library. Your boiling the ocean analogy implies that every web developer needs to scatter ARIA attributes all over their code, which just isn't true.

rhdunn · 2026-05-26T08:08:07 1779782887

IIRC, libraries like numpy and pytorch can already do that as they store the matrices as 1D arrays with information on things like the stride length (advancing to the next row). That allows you to implement operations like transposition by editing the stride length and other parameters without manipulating the content of the matrix array.
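A small sketch of that behaviour in NumPy: the transpose is a view over the same buffer with the strides swapped, so no element data is copied. The array contents and the `int64` dtype are arbitrary choices for illustration.

```python
import numpy as np

# A 2x3 matrix stored as one contiguous 1D buffer of six int64 values.
a = np.arange(6, dtype=np.int64).reshape(2, 3)

# Transposing returns a view: same buffer, shape and strides swapped.
t = a.T

print(a.shape, a.strides)      # strides are in bytes: (row step, column step)
print(t.shape, t.strides)
print(np.shares_memory(a, t))  # both arrays refer to the same memory

# Writing through the transposed view changes the original array.
t[0, 1] = 99
print(a[1, 0])
```

With 8-byte elements, `a` steps 24 bytes to reach the next row and 8 bytes to reach the next column; `t` reverses those two numbers.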

rhdunn · 2026-05-25T09:50:26 1779702626

That goes back to Ken Thompson's NFA regex interpreter from 1968 [1], [2], [3]. Note: that whole regex series by Russ Cox [4] is great.

[1] https://dl.acm.org/doi/10.1145/363347.363387 -- Programming Techniques: Regular expression search algorithm

[2] https://swtch.com/~rsc/regexp/regexp1.html -- Regular Expression Matching Can Be Simple And Fast

[3] https://swtch.com/~rsc/regexp/regexp2.html -- Regular Expression Matching: the Virtual Machine Approach

[4] https://swtch.com/~rsc/regexp/ -- Implementing Regular Expressions
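To give a flavour of the approach, here is a minimal sketch of the state-set ("lockstep") NFA simulation described in [2]. It is an illustration written for this thread, not Thompson's or Russ Cox's code: the helper names are made up, it only supports literals, `|`, `*`, `+`, `?` and parentheses, it does whole-string matching, and it does no error handling for malformed patterns.

```python
class NFA:
    def __init__(self):
        self.eps = []   # eps[s]: states reachable from s without consuming input
        self.edge = []  # edge[s]: (char, target) or None

    def new_state(self):
        self.eps.append([])
        self.edge.append(None)
        return len(self.eps) - 1


def compile_regex(pattern):
    """Recursive-descent parser that builds NFA fragments as (start, end) pairs."""
    nfa = NFA()
    pos = 0

    def peek():
        return pattern[pos] if pos < len(pattern) else None

    def alt():
        nonlocal pos
        start, end = concat()
        while peek() == "|":
            pos += 1
            s2, e2 = concat()
            s, e = nfa.new_state(), nfa.new_state()
            nfa.eps[s] += [start, s2]   # branch into either alternative
            nfa.eps[end].append(e)
            nfa.eps[e2].append(e)
            start, end = s, e
        return start, end

    def concat():
        start = end = nfa.new_state()   # empty fragment to chain onto
        while peek() not in (None, "|", ")"):
            s, e = repeat()
            nfa.eps[end].append(s)
            end = e
        return start, end

    def repeat():
        nonlocal pos
        start, end = atom()
        while peek() in ("*", "+", "?"):
            op = pattern[pos]
            pos += 1
            s, e = nfa.new_state(), nfa.new_state()
            nfa.eps[s].append(start)
            nfa.eps[end].append(e)
            if op in "*?":
                nfa.eps[s].append(e)        # allow zero occurrences
            if op in "*+":
                nfa.eps[end].append(start)  # allow repetition
            start, end = s, e
        return start, end

    def atom():
        nonlocal pos
        c = pattern[pos]
        pos += 1
        if c == "(":
            frag = alt()
            pos += 1                        # skip the closing ")"
            return frag
        s, e = nfa.new_state(), nfa.new_state()
        nfa.edge[s] = (c, e)
        return s, e

    start, accept = alt()
    return nfa, start, accept


def closure(nfa, states):
    """Add every state reachable through empty transitions."""
    stack, seen = list(states), set(states)
    while stack:
        for t in nfa.eps[stack.pop()]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def matches(pattern, text):
    """Advance the whole set of live states one input character at a time."""
    nfa, start, accept = compile_regex(pattern)
    current = closure(nfa, {start})
    for ch in text:
        stepped = {nfa.edge[s][1] for s in current
                   if nfa.edge[s] is not None and nfa.edge[s][0] == ch}
        current = closure(nfa, stepped)
    return accept in current


print(matches("a(b|c)*d", "abcbd"))
print(matches("a?" * 20 + "a" * 20, "a" * 20))
```

Because the simulation tracks a set of states instead of backtracking, the work per input character is bounded by the number of NFA states, which is why a pattern like the second one (the pathological case discussed in [2]) stays fast.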

kqr · 2026-05-25T11:26:57 1779708417

I second the Russ Cox recommendation. I read that ages ago and that was what made me realise some theory could actually be useful in practice.

rhdunn · 2026-05-24T14:36:52 1779633412

Ben Eater's 6502 series [1] uses MSBASIC for programming along with WozMon as the terminal interface.

[1] https://www.youtube.com/playlist?list=PLowKtXNTBypFbtuVMUVXN...

BeefySwain · 2026-05-24T14:47:25 1779634045

Is that the same BASIC as this?

rhdunn · 2026-05-24T15:56:24 1779638184

The video [1] links to Ben Eater's fork, which has extensions and configuration specific to his 6502 breadboard computer [2]. That in turn is forked from `mist64/msbasic`, which refers to a blog post [3] that states:

> This episode of “Computer Archeology” is about reverse engineering eight different versions of Microsoft BASIC 6502 (Commodore, AppleSoft etc.), ...

> This article also presents a set of assembly source files that can be made to compile into a byte exact copy of seven different versions of Microsoft BASIC, and lets you even create your own version.

So Ben Eater's version is based on a reverse engineered version of the same program. You should be able to adapt the code released here to run on Ben Eater's 6502 with a bit of work.

[1] https://www.youtube.com/watch?v=XlbPnihCM0E&list=PLowKtXNTBy...

[2] https://github.com/beneater/msbasic

[3] https://www.pagetable.com/?p=46

rhdunn · 2026-05-22T17:16:49 1779470209

A better formulation would be something like Fuzzy Logic [1]. It represents truth as a real value from 0 (false) to 1 (true), so 0.5 could be "unsure", 0.9 could be "very likely", etc. However, that doesn't make boolean logic invalid.

Boolean logic is also the foundation of computing: logic gates, circuits like BCD, etc.

[1] https://en.wikipedia.org/wiki/Fuzzy_logic

gobdovan · 2026-05-22T17:37:45 1779471465

Fuzzy logic is indeed a better formulation. One nit tho: the intermediary values don't mean 'very likely' or 'unsure' in general. They usually represent degrees of truth or degrees of membership. So it's more like '0.9 tall' means 'quite tall', while '0.5 tall' would be interpreted as 'this guy is tall to a degree of 0.5 out of 1'.

They could technically refer to 'very likely' or 'unsure' only if the predicate you're modeling is itself about certainty or belief. For example, you could say "I'm certain about X to degree 0.8 out of 1" meaning you're quite certain about X. But notice that the 0.8 is about your belief, not about X itself.
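A minimal sketch of the "degree of membership" reading, using the common min/max/complement operators for fuzzy AND/OR/NOT. The `tall` membership function and its 160 cm and 190 cm thresholds are invented for illustration, not taken from any standard.

```python
def tall(height_cm: float) -> float:
    """Hypothetical membership function: 0 at or below 160 cm,
    1 at or above 190 cm, and a linear ramp in between."""
    return min(1.0, max(0.0, (height_cm - 160.0) / 30.0))

def f_and(a: float, b: float) -> float:
    return min(a, b)

def f_or(a: float, b: float) -> float:
    return max(a, b)

def f_not(a: float) -> float:
    return 1.0 - a

# Degrees of membership, not probabilities.
print(tall(175))                      # halfway up the ramp
print(f_and(tall(175), tall(190)))    # limited by the weaker degree
print(f_not(tall(175)))

# Restricted to 0.0 and 1.0, the operators agree with boolean logic.
for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        assert f_and(a, b) == float(bool(a) and bool(b))
        assert f_or(a, b) == float(bool(a) or bool(b))
```

The final loop shows that boolean logic is the special case you get when every degree is exactly 0 or 1.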

smaudet · 2026-05-25T15:03:15 1779721395

Yeah, this (the OP's) reads like a confused teenager's post, from someone who has just started to explore the intricacies of logic. The whole post disproves itself...

Fuzzy logic is fine, I suspect they saw something like this and got confused. I would recommend they think harder about how very pertinent boolean logic is to everything they are doing before dismissing it...

rhdunn · 2026-05-20T21:12:31 1779311551

> First, the discussion is about news, not science (nor about general LLM behaviour).

What if science is the news, such as:

1. advancements in fusion power; or

2. progress/status of the Artemis missions; or

3. new LLM models and/or capabilities (e.g. Project Glasswing).

With things like that you typically have a press announcement/briefing, a research paper/publication, or both. That information is then presented in newspapers/media that may obscure, misrepresent, or overly generalize the original finding/announcement.

There may also be clarifications, retractions, etc. after publication, such as with the first announced proof of Fermat's Last Theorem, which contained an error that was later corrected.

rhdunn · 2026-05-19T16:39:01 1779208741

That doesn't work if you have limited or no connectivity (e.g. on a mountain range). There are also privacy concerns, e.g. a doctor using it to transcribe medical information.