The design of the exposed mechanism is explicitly about booting unsigned versions of macOS. There is zero support for booting anything else, but no enforcement that it must be macOS.
However, Apple's justification for exposing this mechanism to users appears to explicitly include "booting Linux", even if the mechanism has zero explicit support for booting Linux.
And if Apple were going to change their mind and try to block Linux, they would intentionally modify the bootloader to remove that functionality, not break the boot picker.
Speedrunning communities generally hate it when having more money leads to an advantage.
If you ban deliberately smudging/scratching the disc, then some runner with a lot of money will just buy a lot of copies of the disc and find the one that glitches the most consistently (because of pre-existing scratches, or even manufacturing defects that aren’t visible).
Allowing some kind of mod is the most equitable compromise.
They typically ban the glitch entirely. For example, cartridge manipulation or "CD streaming" glitches in Zelda speedruns are banned, and if you submitted a run containing them while claiming the game did it on its own, they would probably tell you to get a new copy of the game.
Yes, they loved the compactness and convenience (well, I’m not sure anyone ever loved the rewinding/fast-forwarding experience).
But the quality/color was always a noticeable downgrade from broadcast quality video (and that was a noticeable downgrade from film). But the sacrifice was absolutely worth it.
It is notable that LaserDisc came out only two years after VHS (and before VHS reached mass adoption), and it could produce perfect broadcast-quality video (and often exceed it). Anyone could see the improvement.
Yet LaserDisc never had much success outside of enthusiasts, simply because it couldn’t match the convenience of VHS. Well… it was mostly the lack of recording, but that’s an aspect of convenience too.
You also had to flip the disc halfway through a movie; it couldn’t do two hours of continuous video, unlike a VHS tape.
The lack of recording was also a killer. If you went with VHS you could record and watch home movies if you had a camera, rent videos at the video store, and record from broadcast TV. It was much more versatile.
For me and most people I knew at the time, VHS didn't have a noticeable quality loss over broadcast unless you were watching LP/EP recordings.
Many TVs people already had in the 80s didn't have RCA connections, so VCRs were connected via twin-lead to F-connector adapters. They had the same noise as the antenna or cable input. So your commercial tapes usually looked about as good as broadcast. If you actually read the instructions that came with your VCR and set the timing correctly, recorded broadcasts in SP mode also tended to look pretty good.
In absolute terms the VHS video was worse than the original broadcast but on the TVs we had it was hard to notice.
This definitely changed through the 90s. Larger and brighter tubes made the deficiencies of VHS more noticeable. Moving to cable TV from antenna was also very noticeable and made VHS quality more apparent.
If you happened to see a LaserDisc video as a comparison to VHS, then the quality difference was stark. As stark as the difference between VHS and DVD in the late 90s and early 00s. However, I think that direct comparison was out of reach for most people.
The tokeniser is not a dictionary. It doesn't provide definitions, or give the LLM any kind of mapping at all.
At best, it's a wordlist. It gives the LLM some idea of what humans consider to be common words. But it doesn't tell the LLM anything at all about those words. And it's not even comprehensive: many words map to multiple tokens. Nor is it exclusively words: some of those tokens are punctuation, or modifiers, or control tokens. On multimodal LLMs, some of the tokens actually represent image and audio data.
The LLM doesn't get informed about any of this up front; it has to learn what every single token means from context.
You are technically right that it's something in an LLM that's not weights, but it's not that structured. And really it's only there so the LLM can interact with the outside world.
> There are grammar rules
There is no dedicated "grammar rule" structure in the LLM or the tokeniser. It has to learn them all from context; they get encoded as part of the 80 layers of weights.
I see people give too much importance to specific engineering design choices of the current generation of LLMs. The tokenizer is not an absolutely essential part of the system. It’s just an adapter for text input/output. It can be eliminated completely and the model can use bytes directly.
I think the short story captures this well. Weights (connections) are the essential and philosophically important part. They do the thinking, memory, singing etc.
A tokenizer is roughly Huffman-coding sequences of input (bytes of English text, etc.) into shorter sequences (lists of tokens), as a performance optimization.
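To make the "shorter sequences" point concrete, here is a minimal sketch. The vocabulary is made up for illustration and the greedy longest-match rule is a simplification; real tokenizers typically use learned BPE-style merges, so treat this as a toy and not how any particular LLM does it.

```python
# Toy illustration only: VOCAB is hypothetical, not taken from any real tokenizer.
VOCAB = ["The", "the", " the", " quick", " brown", " fox",
         " jump", "s", " over", " lazy", " dog"]

def greedy_tokenize(text, vocab):
    """Greedy longest-match tokenization, falling back to single characters."""
    by_length = sorted(vocab, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        for piece in by_length:
            if text.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            # No vocabulary entry matched: emit one character as its own token.
            tokens.append(text[i])
            i += 1
    return tokens

text = "The quick brown fox jumps over the lazy dog"
tokens = greedy_tokenize(text, VOCAB)
n_bytes = len(text.encode("utf-8"))

# The model has to process len(tokens) positions instead of n_bytes positions.
print(len(tokens), n_bytes, round(n_bytes / len(tokens), 1))  # 10 43 4.3
```

The ratio you get depends entirely on the vocabulary and the text, so the 4.3x here says nothing about real models; it only shows where the sequence-length saving comes from.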
As you said, it's not in any way intrinsic to the LLM, though it may be a very necessary optimization on today's hardware.
IMO, we are probably talking about a 6x slowdown (for typical English). You would need to be absolutely stupid not to implement some kind of optimisation along these lines.
Slower and maybe a little dumber, but it would work.
My thinking is that for most tasks, a byte-orientated LLM still needs something like the wide "single activation per word" formatting that the tokeniser mostly provides. And it will likely waste its first and last few layers implementing a replacement tokeniser (and would probably do a much better job at it). It would also need to decode and encode unicode at the same time.
My estimate is that it might lose about 10% of its weights to these new tasks. Your 80B parameter model becomes as smart as a 72B parameter model - Measurably dumber, but not drastically so.
IMO, the way the error bars combine is very intuitive. You are really just rounding to 6 or 12 sig-figs after every operation.
People just seem to get really hung up on the point that error bars exist in the first place, and combine.
I suspect it has a lot to do with the way that rounding is taught in school. It's absolutely hammered into us that you should never round until the very end, otherwise you lose precision.
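A small sketch of the "round after every operation" picture, using Python's decimal module as a stand-in. This is an assumption for illustration: real hardware floats round in binary, not decimal, but the way the errors combine looks similar.

```python
from decimal import Decimal, localcontext

def three_thirds(sig_figs):
    """Compute 1/3 + 1/3 + 1/3, rounding to sig_figs digits after every operation."""
    with localcontext() as ctx:
        ctx.prec = sig_figs              # every operation below rounds to this many digits
        third = Decimal(1) / Decimal(3)  # already rounded, e.g. 0.333333 at 6 digits
        return third + third + third

low = three_thirds(6)
high = three_thirds(12)
print(low)   # 0.999999
print(high)  # 0.999999999999

# The error is small and predictable: about one unit in the last kept digit.
print(Decimal(1) - low)  # 0.000001
```

The single rounding in the division is what shows up at the end; the additions just carry it along. More digits shrink the error bar, they don't remove it.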
Is a Mach-5 passenger aircraft actually the goal of this project?
Seems more likely that Japan is designing this engine for a hypersonic cruise missile program, and the passenger aircraft concept is somewhat of a cover.
IMO, there is no point in a Mach-5 aircraft (other than cruise missiles). There is potentially some point in Mach 2-3 aircraft (not that we have ever made them commercially viable), but at the boundary to hypersonic, you might as well just switch to a suborbital hop concept.
A suborbital hop gets you to anywhere in the world within ~90 min, avoids issues of supersonic overflight, and you don't need to worry about the massive engineering issues caused by sustaining hypersonic flight. And as a bonus, the passengers get an hour of weightlessness.
> (not that we have ever made them commercially viable)
Concorde was commercially viable at Mach 2.2 in supercruise (although there's a common misconception that it was not).
However, its overheads were very high, and its applicability was severely limited by fears around the sonic boom (most particularly in the US, which banned supersonic flight over land, possibly largely because they wanted to kill off foreign competition).
Air-breathing engines don't need the oxidizer tank, so something like 2/3 of a rocket just goes away before even touching the Tsiolkovsky math. That improves payload mass fraction massively.
Also, this doesn't scale down to Mach 3-4 and under. This thing uses a scramjet, i.e. a supersonic combustion ramjet. It REQUIRES intake air to be at high supersonic speeds for it to work.
> It REQUIRES intake air to be at high supersonic speeds for it to work
This is why I am highly sceptical it can be part of a commercial supersonic passenger jet: how do you get from subsonic -> supersonic without also tacking on some kind of conventional jet engine?
Japan, Italy and the UK have a program for a competing F-35 design, GCAP. And Japan is focusing mainly on the engines.
Given that competing cruise missiles will at some point need to be delivered for this platform, and that the US, after the crisis of not being able to keep up with demand from Israel's and Ukraine's orders, greenlighted South Korea and Japan to enter the European defense market: to answer your question, yes, this is of course a defense-related project.
There's been an industry request to develop native defense components on these matters within the EU, following pressures and contrasts with the US (in a joint report to the EC for the ReArm campaign, the EU's biggest aerospace industry players estimated that 60-80% of their components and tech are sourced from the US).
> Is a Mach-5 passenger aircraft actually the goal of this project?
> Seems more likely that Japan is designing this engine for a hypersonic cruise missile program, and the passenger aircraft concept is somewhat of a cover.
Case of China's got them, and can't rely on the Orange Emperor and his heirs to have their backs.
I didn't initially believe these numbers, but if you look at some real life stats, you are probably right.
Nominal SECO for the last Starship mission was at ~8 minutes, and it took ~20 minutes from when deceleration started (well, from when air resistance outweighed the forces of acceleration) to landing. So basically 30 minutes of flight is just the "getting up to speed" and "slowing down" part. Both account for some distance traveled, but still. ~45 minutes is probably a good bet.
Do note however that you may have to go around the world "the wrong way" to get some places due to launch constraints. But living in a world where going around the world "the wrong way" is the easier path is interesting. Imagine that.
Unless a suborbital trip is nearly at orbital velocity, it will involve a high, arcing trajectory. This will make the deceleration at the end unacceptably (lethally) high for all but short arcs. Some of the Mercury suborbital missions involved deceleration of 15 gees, if I recall correctly.
And all but rather short ballistic trajectories (well below orbital speed) will come in at a steep angle.
Unless one has seriously variable aerodynamics, the vehicle will have to swerve to nearly horizontal over a distance of about 1 scale height of the atmosphere, which is about 10 km. The exponentially thinning atmosphere goes from "too thin to matter" to "brick wall" over a short distance.
The acceleration for turning is v^2/r; for v = 5000 m/s and r = 10 km this is 250 g.
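For anyone who wants to check the arithmetic, here is a quick sketch of the v^2/r estimate. The function names are mine, and the 5 g "tolerable" figure in the second calculation is an assumed number for illustration, not something from the comment above.

```python
G0 = 9.80665  # standard gravity in m/s^2

def turn_acceleration_g(speed_m_s, radius_m):
    """Centripetal acceleration v^2 / r, expressed in multiples of g."""
    return speed_m_s ** 2 / radius_m / G0

def turn_radius_m(speed_m_s, accel_g):
    """Turn radius needed to keep the centripetal acceleration at accel_g."""
    return speed_m_s ** 2 / (accel_g * G0)

# Pulling out of a steep entry within about one scale height (10 km) at 5 km/s:
print(round(turn_acceleration_g(5000, 10_000)))  # 255

# Radius needed to hold the same turn to an assumed 5 g, in km:
print(round(turn_radius_m(5000, 5) / 1000))  # 510
```

So the turn would need a radius of roughly 500 km to be survivable under that assumption, about fifty times the distance over which the atmosphere goes from negligible to dense.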
Acceleration also limits how rapidly one can reenter from beyond Earth orbit. At > LEO velocity, the vehicle has to use (downward) lift to stay in the atmosphere, and if v is too high the required acceleration is too high.
It would still be possible to fly a predominantly ballistic trajectory, yet use rocket propulsion to decelerate whilst still outside the atmosphere. It would require a huge amount of extra fuel compared to a purely ballistic trajectory, but perhaps still less than achieving a full orbit and de-orbiting again for some destinations.
At global distances, a fractional orbital trajectory would use less total rocket delta-V.
More practical would be a trajectory that was a series of small suborbital arcs with skipping off the atmosphere (perhaps with some airbreathing propulsion during the skips.) The thermal protection can cool by radiation between the skips.
No, it's an issue for most arcing trajectories. Lift doesn't help much if you're coming in at a steep angle. Reentry from orbit only works well because the entry is almost flat; there even a little lift helps a lot.
No contest really: RISC-V is a much better ISA, VexRiscv is a hyper-optimised implementation of it (for FPGAs), and it's not hindered by trying to be microcode compatible.
The roughly equivalent VexRiscv configuration (full with MMU) is only 2736 LUTs, running at 124 MHz (on Cyclone V, which I'm pretty sure is the same arch).
I wouldn’t say it didn’t have any microcode. It actually had a small PLA for sequencing the multi-cycle instructions. [0]
I don’t think anyone would actually label it as microcode (not when the entire point of RISC was to avoid microcode); they would call it a sequencer or finite state machine. But really it’s the same thing. It’s certainly much simpler than the full microcode of any contemporary CISC, and the bulk of instructions execute in a single cycle without using it.
If you want a design with zero microcode, you really need to look at MIPS, or the original Berkeley RISC. Those ISAs go out of their way to avoid multicycle instructions. Not entirely successfully, but they don't use PLAs [1] to implement any state machines for the few remaining instructions like multiply and divide.
[1] At least on the few MIPS designs I've looked at. And I'm not sure if they deliberately avoided PLAs for doctrine reasons, or whether it was just more efficient to do so.
It's a bit harder to avoid Windows than it is to avoid Bun.
More importantly, it's not the same thing at all. All the code in Windows (at least until recently) was written by humans, understood by humans and reviewed by humans. And that code has stood the test of time, proven its value and stability in the wild, on billions of systems. The fact that the current maintainers haven't needed to understand or replace the code is some indication of the code's quality.
Almost none of Bun's Rust code has even been seen by a human, and it's only about two weeks old.
I'm somewhat willing to accept vibe-coded code if it's either absolutely non-critical, well reviewed, or maybe in the long term if it's proven itself. But not two-week-old code.
That's a valid way to approach this: Bun isn't valuable enough to bother with, or at least you can wait a while before using it; Windows is.
But I think the comparison is closer than you are making it sound. I sincerely doubt the Windows codebase was all written by humans, let alone reviewed. And my understanding is that the code is being regularly rewritten and replaced because of how flawed it is, it's just a massive undertaking.
Also if you look at their investment in AI-driven code rewriting into Rust, my bet would be that some modern Windows code itself is being vibe-coded.
I mean, should we even compare Bun to Windows in the first place? Microsoft, with its resources, would find a way to support Bun, and Windows is one of their most popular and most used products. The situation with Bun is very different in terms of business/product.