> A change in the handling of URL schemes was deployed a couple of days ago
This was deployed Thanksgiving week? I realize Github isn't a consumer company, so it doesn't face the same pressures as Amazon, but I'm surprised there wasn't a code freeze so people can have a quiet holiday weekend.
No. But this appears to now be the most important online shopping week of the year in all of Europe as well. Shops are abnormally sensitive to outages. You'd expect all kinds of service providers to be extremely conservative with code and config pushes this week due to that. Maybe GitHub is far enough removed from the actual consumers that they don't feel the pressure, but it does seem surprising.
That's amazing. Had no idea we had a GitHub office here, let alone in the country!
If you're working remotely, are you employed by Californian standards/regulations/benefits or by local ones here in Denmark? If you don't mind me asking.
Yep. A bug is something that happens (though too many of them can fairly be held against you).
But I don't understand how GitHub didn't already know about the regression, from automated testing or error monitoring. I have high expectations of GitHub, because they have historically met them.
They have error monitoring, but download URLs are one thing that's tricky to monitor for "errors" correctly, because if two URLs 404, how do you know from your Grafana dashboard or whatever which were valid and which weren't? Having run a server whose only purpose was to serve static files myself -- you get a lot of 404s, all the time, and they're no indication anything is wrong at all. This case would only be picked up by a dashboard if it caused a hard error somewhere (like a 500), but by definition this was never a 500, it's a 404.
As you note, it's just a bug. Sometimes things you don't understand might actually surprise you, it turns out.
This was reported as affecting all download URLs based on a git tag, which also means all download URLs appearing on GitHub "release" pages.
If so, I'd have expected there would have been some testing that would have caught this too. Of course, sure, bugs in tests happen too.
Obviously bugs happen because bugs happen. I stand by being more disturbed that it took two days to notice and revert the regression than by the fact that a regression happened. Users noticed right away and tried to report the problem; it took two days after that for GitHub to comment on it and to revert, which seems problematic, no? Especially if the bug really affected as many download URLs as reported; if it only hit a minority of edge cases, that's more understandable.
It's never possible to eliminate regressions. (It may be possible to reduce the rate of them of course). But whether by testing or by receiving error reports from users, it ought to be possible to notice all major regressions in less than two days.
If this affected all links, couldn’t you just monitor it with a general “do downloads of these URLs work” check? You don’t have to (and shouldn’t) rely on 404 rates to test whether downloads are working; the naive and obvious test is to actually attempt to download something.
If Github doesn't at least monitor their 404 error rate for large-scale spikes, whoever is in charge of SRE should be fired.
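For what it's worth, a "large-scale spike" check of the kind described above is cheap to sketch. This is a minimal illustration only; the baseline counts, the current count, and the 3x threshold are all invented:

```shell
# Hypothetical sketch: compare the current hour's 404 count against a rolling
# baseline average and alert on a large multiple. All numbers are made up.
baseline="980 1010 995 1023 988 1002"    # recent hourly 404 counts (normal noise)
current=5200                             # what "every tag URL now 404s" looks like
avg=$(echo "$baseline" | tr ' ' '\n' | awk '{ s += $1 } END { printf "%d", s / NR }')
if [ "$current" -gt $((avg * 3)) ]; then
  echo "ALERT: 404 rate ${current}/hr vs baseline ~${avg}/hr"
fi
```

A real deployment would pull the counts from the metrics system and tune the threshold, but the point stands: a regression that 404s a whole URL class shows up clearly against any reasonable baseline.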
With no announcements and no response to a now two day old bug report, I see two possibilities:
1) Their monitoring of their infrastructure and of incoming issue reports is shockingly incompetent for a company of their size and importance (the fact that it's a US holiday is irrelevant).
2) This was 100% intentional and they're purposely looking "incompetent" to get people to shift to other services for downloads.
My money is on the latter, given others in this discussion are reporting random download link failures starting a month or two ago. A huge number of projects seem to use GitHub as a sort of free file hosting service. I imagine the opex for both storage and bandwidth is a not insignificant amount of money and someone has been told to shoo the freeloaders off the grass.
Announcing they're ending free file hosting for unpaid projects would generate a lot of noise and bad PR. Instead they just make it unreliable, and people go elsewhere. Multiple people in this discussion have described moving downloads off GitHub in response, which is exactly what GitHub likely wants.
You might want to not pick up your pitchfork so quickly.
> A change in the handling of URL schemes was deployed a couple of days ago that caused the regression being discussed here. Due to the amount of traffic that the archive endpoints see, and the high baseline of 404s on them, this regression did not cause an unusual increase of errors that would've caused our alerting to kick in. The change has just been rolled back, so the issue is fixed. We will investigate this issue further after the weekend and take the appropriate steps to make sure similar regressions don't happen in the future.
We ran into intermittent failures with GitHub download URLs a month or two ago that caused our builds to fail (there was no GitHub status incident, but we easily replicated the failure manually). In response, we started self-hosting the dependency.
The fewer external services you depend on at the actual point of builds/deploys, the better. Since you're pinning a fixed version of the dependency anyway, you might as well self-host it or bake it into the AMI or container.
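Since the version is fixed, it's also cheap to pin the vendored artifact by checksum so the build fails loudly if the bytes ever change. A minimal sketch, with the paths and tarball contents invented for the demo:

```shell
set -e
# Minimal sketch of checksum-pinning a self-hosted/vendored dependency.
# The path and the tarball contents are stand-ins invented for the demo.
tmp=$(mktemp -d)
printf 'dependency payload\n' > "$tmp/dep-1.2.3.tar.gz"        # the vendored artifact
expected=$(sha256sum "$tmp/dep-1.2.3.tar.gz" | cut -d' ' -f1)  # recorded at vendoring time
# Later, at build time: refuse to build if the bytes changed under us.
actual=$(sha256sum "$tmp/dep-1.2.3.tar.gz" | cut -d' ' -f1)
if [ "$actual" = "$expected" ]; then
  echo "dependency verified"
else
  echo "checksum mismatch" && exit 1
fi
```

This way even a self-hosted mirror going bad is caught at build time rather than silently shipping a different artifact.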
I suspect you have no idea what AUR is and how it works and furthermore, you have no experience with software packaging.
If a project is using github to publish releases, where else are consumers of that software going to get them from?
Having all sources of everything that is packaged backed up is a must for the official repository of a competent distro, but even in that case there is no reason not to use github in normal operation.
> Arch Linux as well for every package downloading tarballs from GitHub. Packages checking out the git source tree are not affected, but they are a minority.
When building distro packages from source, the full revision history of a Git repo is unnecessary. Downloading just the source code of a specific commit or tag with curl is simpler, faster, has fewer dependencies, and is less prone to breakage (well, unless your URLs change under you, as happened here).
The parameter is called --branch but it also takes tags.
It's not as fast as fetching a ZIP file, but it gets pretty close.
By my count, this method only requires one dependency (git), whereas the curl + unzip method requires, well, both curl and unzip.
The zip download method (download, decompress, build, compress into package, decompress onto system) is already convoluted enough; the first decompression step can easily be dropped.
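The --branch-takes-tags behaviour is easy to verify against a throwaway local repository; everything below (paths, tag name, commit) is invented for the demo:

```shell
set -e
# Demo that `git clone --branch` accepts a tag name, using a throwaway repo.
tmp=$(mktemp -d)
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m "initial"
git -C "$tmp/upstream" tag v1.0.0
# --depth 1 --branch <tag> fetches only that tag's commit
# (file:// forces the real transport so the shallow clone takes effect)
git clone -q --depth 1 --branch v1.0.0 "file://$tmp/upstream" "$tmp/checkout"
git -C "$tmp/checkout" describe --tags   # prints: v1.0.0
```

Swap the `file://` URL for the project's GitHub URL and this is the whole packaging fetch step, with no archive endpoint involved.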
> It's not as fast as fetching a ZIP file, but it gets pretty close.
In the context of distro packages (the bug report mentioned OpenBSD and Fedora) you might be building tens of thousands of packages of which thousands are likely to come from GitHub. A small difference becomes greatly magnified.
> By my count, this method only requires one dependency (git), whereas the curl + unzip method requires, well, both curl and unzip.
You’re forgetting that git itself depends on curl.
Combined with the outage yesterday, this hasn't been a good weekend for GitHub SREs.
I'm surprised something like this happened on a weekend, though, since I wouldn't expect anyone to be changing anything in the codebase (then again, it could be that some infrastructure just ran out of storage, or the like).
Perhaps there’s a lack of attention due to the holiday weekend in the US? Seems like major incidents in infrastructure and services tend to happen over holidays (general observation, not GitHub specific).
Whether this specific problem is intentional or not, these kinds of problems show the issue with using a single centralized service for distribution of third-party dependencies. But it's just so much more darned convenient than hosting your own Git server! It would be super cool if there were a decentralized alternative to GitHub that used Git under the hood. Perhaps one would upload their repositories to a node, which would then be synchronized with all other nodes, and all you would need to do to use it is specify any_node.com/author/project. This would keep GitHub's discoverability while allowing all the benefits of decentralization.
Some software like gitea [1] can mirror github repositories (or vice versa). Some other software like fossil [2] is designed to do not just source code but also issue tracking etc. in a decentralised fashion.
It's definitely more work than just throwing it onto GitHub though, something more convenient would be really cool.
I would love to see a Git+Matrix forge, completely decentralised. With Spaces and the upcoming threads support, it seems like it wouldn't be so difficult to create a client that exposes that. It would be easier to make the review process closer to Gerrit's than GitHub's, which would be a step up!
Another gripe; unrelated, but since we're piling on...
My username ends in a hyphen. Apparently, that's no longer allowed, though my username appears to be grandfathered in.
Trying to give feedback about new experimental features lands me on the GitHub communities site, which is treated as a standalone app and thus requires you to log in via GitHub (it doesn't re-use the existing session token).
However, Communities won't let me sign up with my username since it has a hanging hyphen, and I can't change the username in the form. So I effectively can't sign up.
Support has not responded for over a month. Feels like things are inching toward getting worse with GitHub.
I am surprised a Microsoft MVP Certified Professional expert did not pop up yet in the newsgroup, asking you to reboot your machine and review the steps in a certain technote ;-)
Hey now, that's libel. There are problems that can be solved by rebooting, and a Microsoft MVP would never post a genuine potential solution. I'm pretty sure you meant to say "an MVP popped up in the newsgroup to copy/paste a paragraph of random intro text and then ask if he's solved your problem".
You are giving them too much credit; it will be copied and pasted from the post/thread/comment/accepted solution that you have already stated you tried and that does not work.
I still remember the night when logrus's author decided to rename his Github account and broke our production build (https://github.com/sirupsen/logrus/pull/384). Since then, I always vendor third party dependencies.
I don't mean to "victim blame", but I'm curious why you'd choose a username ending in a hyphen in the first place? (I would have thought this wouldn't work on lots of services)
I signed up to GitHub almost 11 years ago, and someone else already registered the username I used for everything. So 11-years-ago-me tacked on a hyphen as I had seen a few other people do it, too.
> I signed up to GitHub almost 11 years ago, and someone else already registered the username I used for everything. So 11-years-ago-me tacked on a hyphen
"I purposefully chose to create a nearly identical username to an existing user" isn't a great defense to someone saying the problem is between the chair and keyboard.
At best you didn't think of the possible confusion you'd cause.
Both the other account and I are quite active on GitHub. There have been maybe 5 mistypes in 11 years.
So sorry, but no. Things have been fine. In places where the hyphen is ambiguous, I always make a point to mention it. It's never really been an issue, and the other individual has always been quite nice when the hyphen is omitted, kindly pinging me instead.
I'm not trying to defend against anything except poor attempts to paint me as somehow malicious...
Hyphens and underscores are often permitted characters in usernames, more so than exclamation marks or other special characters.
I don't really see what problem a hyphen in a username could pose, unless there's some kind of filter being applied that doesn't account for the previously permitted characters. I'd guess someone applied an [A-Za-z0-9]+ check without thinking too much about it, because that's what the current username rules are.
I'm more surprised that there's a second authorization endpoint, Github could've just used their existing OAuth2 implementation to log users in if they didn't want to reuse the existing login code.
I imagine if support got back to you, they'd say "Yes, that was grandfathered in for backwards compatibility, but that username is no longer allowed and won't always work with new services or features, as you discovered. We recommend changing your username."
However, not getting a response is not encouraging, it's true.
Please don’t shop unrelated concerns to threads that aren’t about that concern. GitHub breaking release URLs worldwide after a multi-hour global outage has no relationship whatsoever to your problem. I sympathize with your frustration, but pet peeve derails pollute HN discussions about every topic under the sun these days. Submit a post instead, and if it doesn’t get traction, so be it.
This isn't a pet peeve derailment. This is commentary on the multitude of issues GitHub has had in recent history, at least from my perspective.
A "pet peeve" is something I find annoying. This isn't that - it's a bug.
Further, my comment doesn't break the guidelines. I'm not commenting on the layout. I'm not flagrantly dismissing someone's work. If you don't like my comment, either keep scrolling or downvote and move on?
By your logic, any time someone posts about GitHub, it’s appropriate and sensible for each of us to post about whatever GitHub bug upsets us most. This leads to the HN we have today, with hundreds of upvoted comments per day clamoring about each individual’s personal upsets, wholly unrelated to the topics at hand.
If you had violated a guideline, I would have contacted the mods rather than reply. The tragedy is that no rule can sufficiently be written that keeps people from concern shopping their personal issues into discussions on the most tenuous of links. “This is a post about a GitHub bug” opens the door for me to post about the hundreds of GitHub bugs I’ve encountered over time, and with thousands of users at HN, if we each do this, there’s so much less room left for discussion about the actual bug this post is about.
This affects Show HN, too: when someone posts their cool thing, everyone chimes in with all the other cool things that they like better. It’s incredibly disheartening and sets aside the purpose of the post – “a thing, discuss” – so that people can use that request as a launchpad to discuss other things instead, without making even the slightest effort to tie it back with relative comparisons to the Show HN topic itself.
There is no guideline that asks us to set aside our personal needs and desires in these comments and focus on what brings the most value to the original topic, no matter what we feel about other topics that happen to also be about GitHub. But I continue to hope, out loud and with salient arguments, that HN will step up and respect itself more than the guidelines require.
Say what you want about "cloud" reliability, but my little home server on a residential ISP has had more uptime than Github.com over (at least!) the past year.
And if github.com tried to host their website entirely off your little home server, they’d surely have 24x7 outages from being bombarded by too much load. All your anecdote proves is that it is easier to keep a single server online than to operate a big distributed system, which has been obvious for quite a while.
No, it is not "obvious" to everyone at all. You can still see people here claiming that one should move to a centralized provider since they can guarantee nine nines of whatever, and that self-hosting is way too hard to make reliable. (Which is double irony when the centralized provider goes down and then the excuse is "well, that's because they're big!". If only...).
In any case, the point was that Github.com just sucks, rather than everything cloud sucks. For the past year, they have been down a couple of magnitudes more time than I have spent managing my server.
> For the past year, they have been down a couple of magnitudes more time than I have spent managing my server.
I have spent orders of magnitude less time feeding my pet rock than the average dog owner spends feeding their pet.
The difficulty of keeping a service online depends on what it actually does. Not to mention, outages are generally caused by making changes, changes which are required if a service is going to continuously improve.
If you are trying to make the point that Github.com is the only "non-pet-rock" web-based project management and bug tracking system available (for self-hosting or not), well, that is just false. The alternatives have at least as many features, if not more. Some of them even predate Github.com by decades.
If you are again trying to make the point that they have a bazillion times the load, I will say again that it is self-inflicted, and thus part of the problem, and not a valid excuse nor justification. Other providers do not seem to have these problems anyway.
Comparing uptime between comparable services is reasonable. (Though I will say that having used some GitHub competitors, counting the number of features really overlooks the actual user experience between them.)
Comparing uptime between a production service and a single personal server however isn’t reasonable. You haven’t even said what your server actually does, only that it rarely goes down!
The problem is that you’re computing the probability of one service going down, when I actually care about the union of the downtime of all of the services that I need (or, equivalently, the intersection of their uptime).
If every Arch package hosted its own source code, then even if each one of them has better uptime than GitHub, at least one of them is probably going to go down every single day (assuming their uptime is uncorrelated).
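A quick back-of-envelope calculation supports the claim above (the 1000-host count and the per-host uptime figure are invented for illustration):

```shell
# If each of 1000 independently hosted package sources is up with probability
# 0.999 on a given day, the chance that all of them are reachable at once is:
awk 'BEGIN { printf "P(all 1000 up) = %.3f\n", 0.999 ^ 1000 }'
# prints: P(all 1000 up) = 0.368  (so on most days, at least one source is down)
```

Each individual host looks very reliable; it's the intersection of all their uptimes that falls apart.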
If you can personally centralize all of your own work on a server that you host, like a company-run Mattermost or something, and you have someone who can be on call to keep it up, do it. The result will, as you’ve described, be simpler and better than GitHub. But expecting every individual to run their own Git server is just stupid, because then my project is going to wind up depending on dozens of separately-hosted Git servers and I’m back to taking the intersection of their uptimes to compute the uptime of the whole system. There’s a sweet-spot here; GitHub is above the sweet spot, but a lot of “FOSS projects” (the kind that have a single maintainer) are below it.
I remain unconvinced. If GitHub goes down, it goes down for everyone's projects. If everyone ran their own servers, only one or two packages would go down at a time. So instead of 100%, it's like 0.1%. And since their servers are simpler, as per the OP's comment, they'd stay up longer. So distributed git, as it was originally intended to be used, is more robust than a single point of failure like GitHub.
For a lot of use cases, there really isn’t any important difference between 100% and 0.1%. If any of the source tarballs are unavailable, then I can’t produce a build, and the fact that some of them are available isn’t much consolation. I’m blocked either way.
Especially for an organization like Arch Linux, the solution is really obvious: maintain your own mirror. Debian and Fedora do it, so can they.
Obvious, maybe, but there are many who will criticise anyone self-hosting, saying that a cloud solution will inherently be more stable, since it has a bigger ops team.
Well, GitHub came back online without me doing anything. GitHub’s life depends on providing a good service to customers. I think occasional downtime is a good tradeoff for what I am getting, as opposed to having to manage my own server.
Well, my home server has had about the same uptime as WhatsApp this year. Yet I am still happy that I moved my XMPP server to the Hetzner cloud, as it seems to have better uptime than both ;-)