Back in the Leopard days, I was playing with the x86-64 ABI on Mac OS X (no real documentation existed, or at least none that I could find, except the source code). Very soon I accidentally ran into a kernel panic that could be reduced to a three-instruction program:
    mov rax, 1
    mov rdi, 1
    syscall
and it would bring down the entire OS (kernel panic) when run by any user-mode application under a standard non-root user. It took a few releases for them to fix that. I stopped trusting them from a low-level reliability perspective after that.
Fuzzing their system call interface may not have been a bad idea.
P.S. XNU is interesting in that it lets userspace use Mach syscalls as well as a bunch of BSD system calls directly. The esoteric interactions between them are probably not very well thought out. (I hope I am not offending Avie Tevanian here :))
I immediately thought of the htop bug and then she references that in the article. How is this not fixed yet? Like, this is a security bug, right? You can DoS from a simple usermode app with this bug.
I don’t think the attack vector (presumably asking for admin credentials to install a startup item) would be any different than an app that wanted to fork bomb, allocate too much memory, or spin wait on all cores. In that way I don’t think it’s any more security critical than any other bug that hangs the system.
It’s definitely something that should be fixed of course.
> This will be a kernel data structure protected by a mutex or semaphore. task_for_pid waits at pri>=0 for a wakeup that won't happen because race. ps queues behind it at pri<0 (disabling ^C). At least two bugs there.
Seems like a reasonable explanation as to the underlying cause of the behaviour.
Might not just be the browser cache -- I use Cloudflare on my blog and have a script that uses the Cloudflare API to clear the CDN cache when I update the site, but the Cloudflare API doesn't appear to work 100% reliably and I'm not sure why. Sorry about that!
Wanted to point out the same. It’s time for Apple to fix this. Not sure I like the idea of working with timeouts in code just to prevent the bug from happening.
Timers in general are a code smell to me. It’s one thing if you’re waiting for something to complete, but having arbitrary sleeps in your program is undesirable IMO. It makes reasoning about your program more difficult.
The htop report (in TFA and another comment) seems High Sierra specific (as in, users report the issue starting right when they upgraded to High Sierra). Running the C snippet from TFA on my system (a 2010 MBP running El Cap'), I can run through 15000 iterations without any freeze, even though according to TFA "sometimes it needs to try 10 times before it’ll freeze."
Htop also had segfaults after a while in OpenBSD 6.2 i386, when I used OpenBSD exclusively for a couple of weeks. It could also be present in other BSD-like kernels.
Htop segfaulting on OpenBSD doesn't show that the OpenBSD kernel has the same bug, it shows that htop has some (unrelated) memory management issues (OpenBSD is pretty good at exposing those).
At first I remembered this and wondered if it was related:
"The rules for using Objective-C between fork() and exec() have changed in macOS 10.13. Incorrect code that happened to work most of the time in the past may now fail. Some workarounds are available."
Question from someone who knows very little about the Mach/XNU APIs: Does this code leak Mach ports? If you call task_for_pid, you get back a Mach task port. Do you have to close the port with mach_port_deallocate? Could a resource leak be contributing to the system freeze?
You mean Sierra / High Sierra. This is because of compatibility breaking changes to some low-level kernel system calls. Since valgrind is essentially a CPU emulator, it is tightly integrated with the OS kernel, and has to be updated accordingly. The macOS contributors to valgrind seem to be relatively few, probably because most macOS developers primarily use the various sanitisers in clang (they also have UI integration in Xcode).
Have you tried the clang or gcc asan/tsan/ubsan sanitisers as a replacement? There are pros and cons of valgrind vs compile-time instrumentation. The sanitisers increase the memory footprint but run with less overhead; valgrind can detect some errors that the sanitisers cannot, etc.
I haven't, mostly just because it happens so rarely and I just want a quick fix, but some of my colleagues have started building the sanitizers into their production process. I probably should just for the sake of good practice.
It's not bad, nothing's a show-stopper; there are just several little annoyances and embarrassing security issues.
My example is that there's a blank line at the bottom of about half of my terminal sessions. I haven't looked into it, and I assume it'll disappear in a future point release.
I waited for the dust to settle before doing the upgrade, and upgraded one machine after another, but I didn't hit anything problematic; this has been a smooth upgrade.
I've waited a bit after the "root" security issue, and now I prefer to be on the latest version.
I found the style of the article quite refreshing somehow. The OP is not trying to look like a smartass about the discovery (a trait very common in the IT industry), and she acknowledges that she doesn't really understand what the underlying cause is. She is just happy that she discovered something and is keen on sharing it with the world.
I've been writing a devlog on implementing a Lua VM in handwritten WebAssembly (https://www.patreon.com/serprex). It pieces together the commit log with a stream-of-consciousness aspect.
It very much conveys an 'I have no idea what I'm doing' experience.
As long as we are talking about anecdotal evidence: my MBP with High Sierra has been running fine for the last few months. I haven't encountered any issues in my day-to-day work as an iOS app developer, nor in my home use. I think it's a pretty decent release, though it didn't add any new features that I feel I really need.
Are you using anecdote to mean 'not factual/confirmed' or simply to mean 'single data point'? I understand that it's common to switch between the two meanings. I can usually guess from the context, but it's unclear from your comment.
In any case, I would say that even a single data point of a kernel crashing bug is cause for concern.
Actually, my mid-2014 13" rMBP randomly reboots on High Sierra, even on a clean install. It leaves no trace of anything in any logs. I had to downgrade to Sierra.
"Snow leopard" was 10.6, was it not? 10.6.8 was the best OS X ever. I wish I could have stuck with it forever, but that machine eventually died and the replacement wouldn't boot under anything older than 10.9. Someday I'm sure I'll be forced to "upgrade" again, but in the meantime I'm leaving things well enough alone.
The Mach kernels in general are buggy, regardless of the release. E.g., I'm pretty sure they end up delivering SIGPIPE to the wrong thread in some circumstances on Sierra. There is also a problem with recvmsg sometimes not returning control messages.
They killed server because that was a losing battle: Apple was never going to "win" at servers. Despite all their effort they managed to score only a few modest wins, and the intense competition in that space made it a huge distraction.
Remember Apple started down the "server" road decades ago with their A/UX UNIX server (https://en.wikipedia.org/wiki/A/UX). Throughout the 1990s you could get a high-end Mac spec'd as a UNIX machine if you wanted, though I've only ever seen a handful in the wild.
The introduction of a special-purpose server was an anomaly, the Xserve had no specific predecessor, perhaps some kind of "go big or go home" effort on Apple's part.
While that system wasn't bad by the standards of the time, it couldn't compete with the likes of Dell, HP, and others, who offered far more flexibility on configuration and who would seemingly sell super-budget low-end servers at a loss to lock people into their product line so they could squeeze them on support costs, a model Apple's allergic to.
Even then they realized it was a losing battle. Apple is an Apple shop and even they couldn't use their own gear at scale. I'm sure when the Xserve team saw racks and racks and racks of generic kit in Apple's own datacentres they realized they weren't going to win.
You're right that the Mini is a very capable workgroup server, and providers like http://macminicolo.net/ are hosting Mac "servers" by the thousands, so it's not like they've completely given up. Intel's the only other player in this space with their NUC machines and those tend to cost as much or more.
Hopefully they'll kit out the Mini better in the 2018 iterations. The 2012 quad-core i7 variant was an exceptional unit.
On the other hand, the Darwin kernel also powers iOS, and Apple seems to be eager to keep the system secure so that the App Store appears trustworthy. While a local privilege escalation might seem dull on macOS, the fact that the same flaw might affect iOS where untrustworthy local apps need to be sandboxed is probably the saving grace for Macs.
Qualitatively, it feels like Apple’s software quality has been on a slide for several years.
What attracted me to move to Mac OS in the first place some 15 years ago was the sheer quality. It was thrilling to use a computer that Just Worked, with no BSODs or the endless dependency hell that was Linux at the time.
It doesn’t feel like that any more, across either OSX or iOS - it feels fragile. Things crash, behaviours are inconsistent, and it feels like more emphasis has been placed on immediate commerciality than long term retention through quality.
For what it’s worth, I’m in the process of moving to Linux on my MBP. The pros of OSX just aren’t as strong any more.
Thank goodness that Linux is consistent and has no bugs :-)
Ubuntu still doesn't let me change my IP address to a static one via the UI, and video drivers still crash or require hours of googling and reading contradictory articles on how to change xorg.conf.
Software has bugs. Operating systems are hard. It doesn't matter where you will go, you will most likely deal with similar issues. Whether it be Linux, Windows or macOS.
The difference is you pay a huge premium for Apple products, Linux distros are generally free (unless you want enterprise-style support).
The same criticism applies to Windows - give it away for free and I might care less about crashes and the huge slide in quality over the last ~5 years.
You’re not paying for the OS; you’re paying for the hardware. If Macintosh hardware was buggy when running Linux or Windows, for reasons to do with, say, badly-written ACPI tables, then you’d have an argument.
As it is, it’s the opposite: Macs run both Windows and Linux “easily”, while other vendors’ buggy hardware has to be patched over with heuristics in these OSes because it has non-conforming device names, responses to introspection queries, etc. Apple hardware is uniquely well-engineered. If you’re writing your own hobbyist OS, it’s a breath of fresh air to run it on a Mac, compared to other kinds of PCs.
This is, coincidentally, also the reason it’s so hard to run macOS on other vendors’ machines. Windows and Linux paper over all the brokenness in hardware land, and so machines built for Windows/Linux rely on these heuristics with sloppy integration work. Apple, meanwhile, just does the integration to the standard in their hardware, and then relies on said standards-conformance in their OS. If everyone conformed to hardware ABI standards as well as Apple does, Hackintoshing would be a matter of just dropping a patched dont-steal-osx.kext into your install disc and calling it good. Everything else you have to do is the hardware vendors’ fault.
You will not find significant fundamental hardware differences between the latest high-end Apple and Dell laptops.
> As it is, it’s the opposite: Macs run both Windows and Linux “easily”, while other vendors’ buggy hardware has to be patched over with heuristics in these OSes because it has non-conforming device names
> This is, coincidentally, also the reason it’s so hard to run macOS on other vendors’ machines
Are you not aware that OS X does various explicit checks to make sure it is running on approved hardware, regardless of compatibility and capability attestation? Apple doesn't want OS X running on non-Apple hardware.
> various explicit checks to make sure it is running on approved hardware
Er, yes, that's the "dont-steal-osx.kext" that I mentioned. My point was that thwarting those checks is a very small part of getting macOS to run on a system (and is a solved problem—if it was all that was required, Hackintoshing would be a one-and-done thing, rather than something that breaks on every system update.)
The majority of the (continuing) effort of getting macOS to run on arbitrary hardware—hardware that, by its components, should be compatible with macOS's drivers—is dealing with vendors' whack-ass "doesn't even pass static analysis using Intel's own provided AML compiler" DSDTs (which macOS rightfully tosses its hands up at, but which Windows and Linux heuristically munge into something barely passable and then use it.)
> latest high-end Apple and Dell laptops
Dell (along with HP and Lenovo) are the better vendors as far as spec-compliance goes. Really, any of the PC makers who have an "enterprise workstation" arm, have the in-house expertise for things like ACPI compliance, or UEFI compliance, or PXE compliance, etc. But other vendors? Acer? LG? Xiaomi? Razer? Better to not even try.
I'm a bit skeptical about your claim that Apple hardware is well engineered. I think they pick aesthetics over function, to the point that functionality is compromised - leading to computers that overheat.
The result is they have a lot of problems with tin whiskers. Which is why you'll find about a billion forum threads about people baking their macs in ovens, and similar hijinks.
Granted, if you're not using a mac for anything that intensively uses the hardware (for instance, just text editing) you'll never run into this problem.
Yeah, except 15 years ago it was xf86config with xinerama if you wanted multiple heads, and LCDs are a walk in the park compared to CRTs - I actually killed a nice 21” monitor by screwing up the frequencies. I even recall the delightful process of having to modify graphics drivers to work with the bizarre Chips&Technologies integrated graphics I had. At that point, OSX was such a breath of fresh air.
Xorg.conf is a delightful stroll through a meadow compared to the nightmare fuel that came before, and in setting up Linux (Kali, Ubuntu) on several machines of various specs over the last few weeks, it has Just Worked.
FWIW, X configures itself nowadays and Wayland also just works. But I know where you are coming from. Everything audio related for example is still in very bad shape when you want more than Intel HDA.
Wayland may be ready for default in Ubuntu and Fedora, but they clearly didn't consider it ready for LTS. They have already said the next non-LTS version will still have Wayland as default.
Re Ubuntu and docs, this is something where I find CentOS valuable: it moves slowly but is very stable. I’ve never seen network issues on CentOS on the CLI or GUI. Red Hat also puts out great comprehensive docs so you don’t have to dredge the forums.
Yes, all software has bugs. And pretty much all modern operating systems are good. But some are better than others in much the same way as all Olympic 100m runners are fast but some are faster than others.
Sure, but it doesn't help when you are rewriting fundamental parts of the stack like the window server, for instance.
And I'm very glad that they did, the move to Metal shows they are committed to the Mac and not standing still. Just don't treat us as involuntary beta testers.
Except their window-system rewrite actually slowed everything down. Every single OS release has added bugs to the graphics layer. I've been maintaining a simple little screensaver since the Mac OS X beta (almost 20 years now), and in the last 3 or 4 releases there's always something that breaks it. Bug reports go unheeded.
That is right. I fill in the fields, then submit the change and nothing happens and it jumps back to DHCP. Always have to just edit the network settings manually.
Not sure why your comment is grayed out. This is very much true, and part of the reason I run CentOS when I want infrastructure to work 24/7/365, but eg Fedora for testing new software.
Agreed. 10.2 was the first OS X release that felt usable day-to-day without running into too many bugs, annoyingly missing features, or frustrating UI decisions.
10.4 felt like the first version that was a pleasure to use. It felt like the polish had finally caught up with OS 9.
I've edited 'he' to 'she' in the two otherwise fine comments that made this mistake (https://qht.co/item?id=16251566 and https://qht.co/item?id=16251562) and grouped several empty replies and one lame off-topic subthread under this one. It's rare that we do something like this (and I've emailed the author), but it seems fairer than to penalize their original posts, which were otherwise informative and on topic.
The gender of the author is one of the least interesting things about this blog post, yet it's brought up in (at the time of writing this) three separate comments here.
Comments that just say "It's a she" or "she" are basically spam and add nothing to the conversation unless the gender of the author is really that important. Since the blog post doesn't mention anything gender specific, I think it's safe to assume these comments are just spam.
No doubt, but by getting further into it like this, you blew it up by a lot and produced by far the least interesting thing about this thread.
The internet is replete with opportunities for getting triggered and starting flamewars. For HN not to sink into a deeper circle of hell, we all need to resist these temptations. So could you and everyone else please not take HN threads on generic, divisive tangents in the future?
I apologize for how this blew up, I did not anticipate that. I was trying to channel the guidelines similar to "it never does any good, and it makes boring reading", hoping HN could focus on the substance of the article, rather than the politics of whether or not it's okay to use "he" as a generic gender term.
It does get extremely boring to read on every article written by or about a woman someone correcting every comment that uses "he" as a generic antecedent, but in the future I will just ignore it instead of inciting a potential flamewar.
It's true, and a bit weird, that this happens even when one's sincere intention is just the opposite. It takes a bit of forethought to realize this in advance and avoid it. That's the skill we're most hoping to see become more widespread here.
Look, I don't particularly want this to degrade into a stupid pointless flame war, but getting this sort of thing right does matter.
Given the relative gender disparity in the technology field, it's easy to make the assumption that any given developer is a man – we all do it sometimes! It's useful to have that pointed out when we're wrong, so that we can be reminded that we sometimes make incorrect assumptions, and eventually it hopefully won't happen as much.
None of the comments along those lines have been needlessly faux-offended, or attacking anyone – just pointing out a mistake that it would be good to rectify. It's really not harming anyone :)
I am not sure if you have ever been on the other side of this, but having people constantly assume you are a man because you know how to program and you're on the internet gets very old, very quickly. You can either accept being constantly referred to as a guy, which becomes flat-out alienating, or you can speak up for yourself, in which case you get responses like the one you just wrote.
I get that the gender of the author doesn't matter to you, and it may not even matter to the author of the article, but it does matter to a lot of people. No one is suggesting mistakes are a grave sin—even if her blog does say "Julia Evans" in giant letters right at the top. If someone makes one though, the correct response to people pointing it out is to accept the correction and keep it in mind for next time, not to accuse them of spam.
You're perfectly allowed to use "they" since you weren't sure.
It's not nice to assume "he" to always be the default - that is what was being corrected. One did not have to know that "she" was correct, but it was easy to find out that "he" was probably wrong. Let's keep discussion on topic, rather than venturing into talking about what you're "allowed" to do.
The parent comment did not bring up their gender. They simply corrected the poster who erroneously thought they were a man. This was an error, and we should hold ourselves to a standard that corrects mistakes. If anyone "brought up their gender" it was the poster that used masculine pronouns as a default for some reason.
Referring to her with the wrong gender is no different than using the wrong name of an author. She is a woman. Anyone who glanced at the blog would see a giant, honking "JULIA" on top of the blog.
Getting the obvious gender right is a matter of courtesy.
This comment made me laugh. There's a bunch of charitable explanations as to why it was written, most notably "honest mistake".
But there are also very uncharitable explanations! For instance, I'm sure I'm not the only one who's noticed that hard technical IT blogs written by women are disproportionately often written by women who were born men (disproportionately wrt the general population of transgender women). Just like the one starting this thread might have assumed that "it's deep tech, so it must be a guy", I can imagine that stevenh assumed that "it's deep tech, so she must be trans". This is beautifully similar to the sentiment of his comment in general, to not under any circumstance let anyone get away with getting gender stuff wrong.
So it looks like even the people who have a strong opinion about being hard on each other about gender stuff, can do it wrong. Maybe the lesson is that we should be slightly less hard on each other about it?
I propose we politely point out mistakes and assume the best intentions (even though of course sometimes that assumption will be wrong).
> I propose we politely point out mistakes and assume the best intentions (even though of course sometimes that assumption will be wrong).
It's possible to politely point out people's mistakes while giving them no room to purposely misgender, which is exactly what this thread does. Let's examine the course of this conversation:
"He said..."
"It's a she" (In my opinion a polite response. This is where any discussion should have ended.)
"Why does that matter?"
"You can't let people get away with misgendering a trans person"
If you only respond to the final comment, then your point makes sense. But the act of politely pointing out the mistake already happened - now we must justify why it matters.
I only responded to the final comment. Sorry, I thought that was clear.
I agree with you for the rest. I did think stevenh's comment had a needlessly militant vibe to it, which is what I responded to. The rest of the thread lacks that vibe, which is good.