[edit: apparently the author is a deno employee but did not choose to disclose such? If so then that throws any moral position they have out the window. The below however is still accurate to my reading of the complaints.]
First up is the FFI test. On the one hand I think that this is a legitimate complaint: changing an existing benchmark for your comparisons is always suspect. That said, I can see technically legitimate reasons for the change, for example if bun's ffi interface doesn't actually handle byte arrays as well as other runtimes. But in such a case that should be explicitly called out as a caveat.
The sqlite complaint is not reasonable. If sqlite is being used as a demonstration of wasm performance, there is nothing at all wrong with this test. If anything, using wasm makes it more valuable as a comparison of runtime perf; otherwise you're very quickly going to just be benchmarking one native build of sqlite against another native build of sqlite.
The React complaint - this just looks like bog standard formatted minified code, which is an absolutely reasonable thing to do - in fact I would argue it is a much better reflection of real world performance than using the non-minified code. Production websites only ever send minified content (at the react scale) and minification absolutely impacts page load performance, and historically could result in catastrophic runtime impact (though they're better at not doing this now).
As for the benchmark: while it seems (without knowing the function specifics) to be overtesting (testing hash(toptr(n)) instead of just hash(m)), it is the same in both tests, so it does feel like a fair comparison. Sure, it could be done differently to make a more atomic comparison, but the composite comparison still feels fair if both operations tend to be performed together. Def not an "exposé".
Overall though, Bun is an amazing accomplishment for many reasons, and I'd love it if the project focused a bit less on the supposed benchmark performance and more on some of the other things it brings to the table, like being based on `JavaScriptCore`, being an alternative to Node that is still mostly compatible, the speed of development (vs. Node, which we've been complaining has stagnated for years), etc. Not saying not to mention impressive benchmarks, just that it'd be nice to highlight the other points as well!
I personally also don't like the direction Deno was taking, and def welcome Bun as a modern alternative to Node.
This reads as a bit salty, but I’m not mired in javascript enough to evaluate the specifics. My impression is that bun is quite liked in the community. These appear to be a few minor gripes.
The post is claiming that the Deno benchmarks created by the Bun.js devs to compare/market their runtime make nonsensical choices that are more adequately explained by a malicious intent to misrepresent Deno than by oversights.
Seems to be the case IMO, but it's not 100% cut and dry. Bun.js' position could be adequately explained in every case as "the defaults/obvious choices for these benchmarks in Deno are slow", but then again they are consciously making apples to oranges comparisons which feels dishonest to me.
Mind you, despite being a possible conflict of interest this also provides credentials for likely relevant expertise, with skill and interest in correcting errors.
Or, perhaps the complete opposite, where it's someone who actually knows how to benchmark JS engines because they work for deno. But we won't know unless they explain which it is.
Agreed, simply having a conflict of interest isn't a reason to disregard the complaints. But the problem is that failing to disclose it up front creates the appearance of dishonesty, which undercuts the message. But it's important for us as readers because it makes us more aware that we should be looking for some of the techniques that can be more confounding or misleading.
I want to be clear, I am not saying that the post is wrong, just that we would know to be more careful in reviewing claims.
This is almost a meme in the Rust subreddit at this point. So many times, people come and post some kind of benchmark claiming that Rust is slower than X. Then, after some back-and-forth discussion and tests, we find out that release mode wasn't used.
Looks like the sqlite bench complaint was fixed 5 days ago and the react-hello benchmark was fixed earlier today (before the creation time of this gist).
I don't know if the suggested FFI benchmark is fair, as the change would no longer be simulating doing many ffi_hashes on many byteArrays. Instead it simulates doing many ffi_hashes on the same byteArray over and over, and benefits from deno caching the necessary type cast.
Replace the const byte array with a randomly generated list of many byte arrays (or real bytes you'd want to hash), passed to the benchmark as a parameter, and the deno benchmark loses the advantage from the cached type cast.
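To make the "same buffer vs. many buffers" distinction concrete, here is a rough sketch. Everything in it is an assumption for illustration: `hash` is a pure-TypeScript FNV-1a stand-in for the native function that the real benchmark calls over FFI, and `makeInputs` / `hashAll` are hypothetical helper names, not anything from the Bun or Deno repos.

```typescript
// Hypothetical stand-in for the native hash reached through FFI (32-bit FNV-1a).
function hash(bytes: Uint8Array): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < bytes.length; i++) {
    h ^= bytes[i];
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Build `count` buffers of `size` pseudo-random bytes. A simple LCG with a
// fixed seed keeps the inputs identical from run to run.
function makeInputs(count: number, size: number, seed = 1): Uint8Array[] {
  let state = seed >>> 0;
  const inputs: Uint8Array[] = [];
  for (let i = 0; i < count; i++) {
    const buf = new Uint8Array(size);
    for (let j = 0; j < size; j++) {
      state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
      buf[j] = state >>> 24;
    }
    inputs.push(buf);
  }
  return inputs;
}

// Hash the inputs round-robin. The XOR checksum is returned so the calls
// have an observable result.
function hashAll(inputs: Uint8Array[], iterations: number): number {
  let acc = 0;
  for (let i = 0; i < iterations; i++) {
    acc = (acc ^ hash(inputs[i % inputs.length])) >>> 0;
  }
  return acc;
}

// One buffer reused on every call: any per-buffer conversion could be cached.
const sameBuffer = makeInputs(1, 64);
// Many distinct buffers: each call sees a different object.
const manyBuffers = makeInputs(1000, 64);
```

Handing `() => hashAll(sameBuffer, 100000)` and `() => hashAll(manyBuffers, 100000)` to whatever benchmark harness you use would then show whether a runtime's number depends on seeing the same buffer repeatedly. With a pure-JS `hash` there is no FFI cast to cache, so the gap only shows up once `hash` is swapped for the real FFI call.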
The thing about Bun that turns me off is around the OS-specific system call optimizations (using different optimizations for different platforms) which means that each new feature they add to bun that uses any system calls must be done 3 times for the 3 major platforms (MacOS, Windows, Linux)
Another thing is I believe Deno supports WebGPU and Node supports WebGL via Angle etc. (e.g. for running Tensorflow with GPU support), but Bun not only has no story there, its author has also refused to provide any info about their intent around GPU support.
The thing about Bun that turns me off is around the OS-specific system call optimizations (using different optimizations for different platforms) which means that each new feature they add to bun that uses any system calls must be done 3 times for the 3 major platforms (MacOS, Windows, Linux)
It might drive some skepticism toward long-term support or reliability (everything has to be developed three times) and unknown performance characteristics (or even potential behavior) across platforms.
not sure I agree with the take, but that's how I read it
It's surprisingly easy to develop OS-specific code using Zig (the language Bun uses), because of its excellent integrated support for cross compilation and comptime feature. For example, with some zig code I write (a posix layer for node), I build for many platforms in parallel every time I make a change.
Very sad to read this. I thought Bun was above this type of thing and was the real deal. Do you think they were resorting to that kind of thing to pursue funding? This sucks :(
I wouldn't say it's even that. To me it reads like they're using sane defaults for Deno that any developer would use. The fact performance can be improved by leveraging knowledge of Deno internals is irrelevant. Comparing default OOB performance is an apples-to-apples comparison.
This "exposé" seems like mud slinging from a disgruntled Deno maintainer with inherent bias to see Deno perform better. If anything, it paints a poor picture of Deno, not Bun.
No, I disagree. A couple of those things pointed out are fair: they’re unfair comparisons that aren’t apples to apples, which matters for benchmarks. The SQLite one for example is like me benchmarking a game on two PCs to test graphics performance, but with wildly different settings on both. It tells the end user nothing useful about the comparative performance.
All of those things are easy mistakes to make though. Where it’s a bit not great is if there was apparently communication about it and the Bun team didn’t at least change the SQLite comparison — it’s not “OOTB” behaviour, it’s a library.
The "view source" link for SQLite is out of date (fixing shortly), but the numbers are correct. I forgot to change the label on the landing page from "x/sqlite" to "x/sqlite3" and I forgot to update the source link.
Here is what it shows for me, but I encourage you to run it on your own computer to see for yourself:
deno run --unstable -A deno.js
cpu: Apple M1 Max
runtime: deno 1.26.1 (aarch64-apple-darwin)
benchmark                        time (avg)              (min … max)        p75        p99       p995
-----------------------------------------------------------------------------------------------------
SELECT * FROM "Order"          26.3 ms/iter    (25.06 ms … 29.23 ms)   26.74 ms   29.23 ms   29.23 ms
SELECT * FROM "Product"       53.91 µs/iter   (52.17 µs … 317.75 µs)      54 µs   65.63 µs   76.75 µs
SELECT * FROM "OrderDetail"  269.41 ms/iter  (240.72 ms … 308.82 ms)  279.05 ms  308.82 ms  308.82 ms

bun bun.js
[0.26ms] ".env"
cpu: Apple M1 Max
runtime: bun 0.2.0 (arm64-darwin)
benchmark                        time (avg)              (min … max)        p75        p99       p995
-----------------------------------------------------------------------------------------------------
SELECT * FROM "Order"         14.44 ms/iter    (13.71 ms … 17.82 ms)   14.56 ms   17.82 ms   17.82 ms
SELECT * FROM "Product"       34.39 µs/iter     (30.46 µs … 4.55 ms)   32.88 µs    48.5 µs   60.13 µs
SELECT * FROM "OrderDetail"  148.17 ms/iter   (144.8 ms … 154.92 ms)  149.85 ms  154.92 ms  154.92 ms
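For anyone unsure how to read the avg / min / max / p75 / p99 columns, here is a minimal sketch of how such a summary can be computed from raw per-iteration timings. The nearest-rank percentile used here is an assumption; I have not checked which method the benchmark library actually uses, and `summarize` / `percentile` are hypothetical names.

```typescript
interface Summary {
  avg: number;
  min: number;
  max: number;
  p75: number;
  p99: number;
}

// Nearest-rank percentile over an ascending-sorted array (assumed method).
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarize per-iteration timings (any unit, as long as it is consistent).
function summarize(samples: number[]): Summary {
  if (samples.length === 0) {
    throw new Error("need at least one sample");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const sum = sorted.reduce((total, x) => total + x, 0);
  return {
    avg: sum / sorted.length,
    min: sorted[0],
    max: sorted[sorted.length - 1],
    p75: percentile(sorted, 75),
    p99: percentile(sorted, 99),
  };
}
```

Under this method, a run with fewer than 100 samples reports a p99 equal to the max, which is at least consistent with the slower queries above showing identical p99, p995 and max values.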
Regarding the FFI benchmark, you can see the commit from today here:
It threw an error until I googled "Deno pointer ffi", which mentioned this function https://doc.deno.land/deno/unstable/~/Deno.UnsafePointer and that worked. I assumed that `Deno.UnsafePointer.of` is the expected way to get a pointer to a buffer, but it seems that changing the type to "buffer" is a faster way for this case. Will update the page shortly.
This makes SQLite transactions no longer serializable (in regard to the schema), and breaks the safety of any kind of external concurrency (e.g. mvSQLite and Litestream).