If turboquant can reliably cut LLM inference RAM requirements by 6x, that should cause a dramatic shift in the hardware market, or at least we can all hope. I know the 6x figure is the key-value cache saving, though, so I'm not sure it really translates to a 6x decrease in total RAM requirements for inference.
There are many sources of before-and-after data on school cell phone bans. Oregon is far from the first to implement one: 35 US states have some form of school cell phone ban, and I believe the UK is doing a nationwide ban. There is a good amount of supporting data measuring results on this topic.
For ~$60 you get a device that can play every type of audio file and has better sound quality than your cellphone + streamer combo.
I've been reading more about Chinese hardware, and if you've been sleeping on it, there are a lot of great Chinese consumer products that are both extremely high quality and very cheap.
Turns out when you have tens of millions of engineers, they pump out banger after banger. It's also endlessly entertaining to find the factory engineers engaging with consumers on random forums and taking their feedback seriously.
Note that in this case, you get what you pay for: I had a FiiO DAC that sounded amazing but was really bad about full-scale turn-on pops and sync/desync pops, to the extent that it damaged my speakers. Yes, perfect power-sequencing hygiene would have prevented the problem, but you can't always be ready at the amplifier volume knob when your playback system crashes.
Ah, good to know. Outside of a very basic DAC for my cans on my desktop, I wouldn't have thought any serious equipment failures could happen. Probably wrong to assume that these things are engineered to be safe/redundant.
This is going to be my first DAP in about 15 years; a Zune was the last one I had. Pretty excited to rock it out for a bit.
There's a current fad of moving to more single-purpose devices rather than using a phone for everything. I want to try it out myself, to be more intentional with my digital actions and wean myself away from corporate social media.
If they're allowed, and they help in cases where phones wouldn't or don't, there are still lots of options for standalone MP3 players with minimal or no connectivity. They still exist as a market because they're dirt cheap to make.
This is what I've been working on. I've written a project wrapper CLI with a consistent interface that wraps a bunch of tools; the whole reason for the wrapper is consistency. I then wrote a skill that states when and how to call the CLI. AI agents are frequently inconsistent in how they call things, and there are some things I want executed in a consistent and controlled way.
It is also easier to write and debug CLI tooling, and other human devs get to benefit from the CLI tools. MCP includes agent instructions for how to use a tool, but the same can be done for a CLI with skills or AGENTS.md (CLAUDE.md).
I thought this was going to be about a nerfed Opus 4.6 experience. I believe I experienced one of those yesterday. I usually have multiple active Claude Code sessions running on Opus 4.6. The other sessions were great, but one really felt off — much more dumbed down than what I was used to. I accidentally gave that session a "good" rating, and my inner conspiracy theorist immediately concluded that I had just helped validate a hamstrung model in some A/B test.
This article has some cowboy-coding themes I don't agree with. If the takeaway is that frameworks are bad in the age of AI, I disagree. Standardization, and a team of developers all using the same framework, has huge benefits. The same is true for agents. Agents have finite context; when an agent knows it is using Rails, it can automatically assume a lot about how things work. LLM training data has framework usage patterns deeply instilled, so agents using frameworks that LLMs have extensive training on produce high-quality, consistent results without needing a bunch of custom context for bespoke foundational code. Multiple devs and agents all using a well-known framework automatically benefit from a shared mental model.
When multiple devs and agents all interact with the same code base, consistency and standards are essential for maintainability. Each time a dev fires up their agent on a framework project, the context doesn't need to be saturated with bespoke foundational information; LLMs and devs alike can leverage their extensive prior exposure to the framework.
I didn't even touch on all the other benefits mature frameworks bring beyond the shared mental model: security hardening, teams providing security patches, performance tuning, dependability, documentation, third-party ecosystems, etc.
VRAM is the new moat, and controlling pricing and access to VRAM is part of it. There will be very few hobbyists who can run models of this size. I appreciate the spirit of making the weights open, but realistically, it is impractical for >99.999% of users to run locally.
I've often thought about this. There are times I would rather have CI run locally, and use my PGP signature to add a git note to the commit. Something like:
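A rough sketch of what I have in mind — the notes ref name (`ci/verified`) and the local test command are placeholders, not an established convention:

```shell
# Run the test suite locally first (whatever your project uses).
make test

# Sign the commit hash with your PGP key and attach the clearsigned
# text as a git note under a dedicated notes ref.
git rev-parse HEAD | gpg --clearsign > /tmp/ci-attestation.asc
git notes --ref=ci/verified add -F /tmp/ci-attestation.asc HEAD

# Push the notes ref alongside the branch so CI can see it.
git push origin refs/notes/ci/verified
```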
Then CI could check git notes and check the dev signature, and skip the workflow/pipeline if correctly signed. With more local CI, the incentive may shift to buying devs fancier machines instead of spending that money on cloud CI. I bet most devs have extra cores to spare and would not mind having a beefier dev machine.
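The CI-side gate could then look something like this (again a sketch — the notes ref is hypothetical, and in practice `gpg --verify` would need to run against a keyring of trusted dev keys):

```shell
# Fetch the attestation notes ref, if it exists.
git fetch origin refs/notes/ci/verified:refs/notes/ci/verified || true

# If HEAD carries a note whose clearsigned signature verifies against
# the team keyring, skip the full pipeline; otherwise run it.
if git notes --ref=ci/verified show HEAD 2>/dev/null | gpg --verify 2>/dev/null; then
  echo "valid dev attestation found; skipping CI build"
  exit 0
fi
echo "no valid attestation; running full pipeline"
```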
I think this is a sound approach, but I do see one legitimate reason to keep using a third-party CI service: reducing the chance of a software supply chain attack by building in a hardened environment that has (presumably) had attention from security people. I'd say the importance of this is increasing.