
I've definitely had my share of annoyances with musl-libc and would probably consider building my base images around distroless if that were something I were dealing with right now.

In the meantime, it's kinda shitty of someone to casually squat on the alpine linux namespace. If you want to make a small distribution that also ships glibc, then it's not alpine. Don't call it that.



I've had to battle quite a few portability issues, but every time it was a program incorrectly assuming some sort of behaviour from glibc and not really musl's fault.


We had to stop using Alpine because we have to resolve a DNS name that resolves to 100 hosts. musl fails to resolve it because it does not support “upgrade to TCP”: the response does not fit into a single UDP packet, so it gets truncated and Node fails to resolve the name. And not only Node, normal Linux tools as well. The author says it’s a feature, not a bug, so for me it is kind of hard to take that thing seriously when parts of the standard are unsupported.
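
For a rough sense of why 100 hosts cannot fit, here is a back-of-the-envelope sketch. It assumes plain A records, a 2-byte compression pointer for every answer name, and the classic 512-byte UDP limit without EDNS0. The name `service.example.internal` is made up for illustration.

```python
# Rough size estimate for a DNS response carrying many A records.
# Assumptions (not from the thread): no EDNS0, every answer name is a
# 2-byte compression pointer, and the queried name is hypothetical.

HEADER_BYTES = 12       # fixed-size DNS header
UDP_LIMIT_BYTES = 512   # classic limit for DNS over UDP without EDNS0


def question_size(name: str) -> int:
    """Encoded QNAME (length-prefixed labels + root byte) + QTYPE + QCLASS."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    qname = sum(1 + len(label) for label in labels) + 1
    return qname + 2 + 2


def a_record_size() -> int:
    """One A record: name pointer, type, class, TTL, rdlength, IPv4 address."""
    return 2 + 2 + 2 + 4 + 2 + 4


def response_size(name: str, answers: int) -> int:
    """Total response size for `answers` A records and no other sections."""
    return HEADER_BYTES + question_size(name) + answers * a_record_size()


def max_answers(name: str, limit: int = UDP_LIMIT_BYTES) -> int:
    """Largest number of A records that still fits under `limit` bytes."""
    return (limit - HEADER_BYTES - question_size(name)) // a_record_size()


if __name__ == "__main__":
    name = "service.example.internal"
    print(response_size(name, 100), max_answers(name))
```

Under these assumptions 100 answers come to roughly 1.6 KB, about three times the limit, so the server has to set the truncation bit and expects the client to retry over TCP.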



That thread from 2018 refers to RFC 5966 which was obsoleted by RFC 7766 in 2016. RFC 7766 is much stricter saying TCP support is required. https://datatracker.ietf.org/doc/html/rfc7766#section-5


Oh wow. There's a big contrast between Linus screaming "don't break userspace," and that sort of crusade against the spec.


It's not the same because this is not an ABI, but an API needing a recompilation. It's actually explained in the article why it is not an ABI.


Why is DNS resolution even part of the libc and not, say, the base OS, a service on the base OS, or if need be, an external library like c-ares?

In fact, I thought Node already depended on c-ares, why is it failing on this?


Adding to the other responder.

Traditionally, in Unix, libc is part of the OS. The situation is different on Linux, but Linux is the outlier here: the various BSDs keep libc in the same tree as the kernel.


If the outlier has >100x as much market share as the rest of the other unixes combined, is Linux really still the outlier?


Historically speaking? Yes.

C and Unix are considerably older than Linux after all.


Remember "Linux" is just the kernel, it is not an operating system itself.

Alpine or Debian including libc is more equivalent to the BSDs including it.


Yes. It explains why things work the way they do, and have since long before linux was a thing.


It has certainly evolved to be pretty complicated. You have whatever libc chooses to do with getaddrinfo(), nsswitch.conf, resolv.conf, systemd-resolved, various pieces of software (docker, vpns, wsl), and so on, all trying to control the local resolver.


Linux as a system is very ill-defined. I'd argue that GNU/Linux by definition contains glibc even if they are not in the same tree and musl based distributions are a variation which you could call "musl/Linux".


GNU formalized a system of tuple definitions for identifying build, host, and target environments, which was popularized by Autotools. See https://autotools.io/autoconf/canonical.html#autoconf.canoni... and https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/a... Even if you don't use Autotools, this is the canonical way to specify environments in the Unix world, though often a simplified version is employed. (By canonical I mean the one project-agnostic system that everybody at least nominally acknowledges. It's hardly the only system out there. Even Debian has their own alternative: https://wiki.debian.org/Multiarch/Tuples .)

The tuples historically had 3 components--cpu, vendor and operating system. But especially as uclibc and musl became more widespread the last component is commonly split into kernel-libc. (I think this was originally extended for the benefit of Debian GNU/kFreeBSD.) The formal OS identifier for glibc-based Linux systems is "linux-gnu" (e.g. x86_64-pc-linux-gnu), and for musl "linux-musl" (e.g. aarch64-alpine-linux-musl).

Vendor is not very useful these days. It's common to see 3-tuples of cpu-kernel-libc, as opposed to 4-tuples or traditional 3-tuples. Sometimes the system is extended into, e.g., 5-tuples like cpu-vendor-kernel-libc-compiler. Autotools projects commonly have a bit of generated shell code for parsing tuples; it's quite complex owing to ~30 years of accumulated idiosyncrasies.
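
To make the tuple shapes concrete, here is a much simplified sketch of splitting one. It is not what config.sub does; it only handles the 4-field form and two 3-field forms, and the `KNOWN_KERNELS` set and the `Target` type are my own illustrative inventions.

```python
from typing import NamedTuple, Optional

# Illustrative and far from exhaustive; real tools carry long tables.
KNOWN_KERNELS = {"linux", "kfreebsd"}


class Target(NamedTuple):
    cpu: str
    vendor: Optional[str]
    kernel: str
    libc: Optional[str]


def parse_tuple(text: str) -> Target:
    """Split a target tuple into cpu, vendor, kernel and libc fields."""
    parts = text.split("-")
    if len(parts) == 4:
        # cpu-vendor-kernel-libc, e.g. x86_64-pc-linux-gnu
        cpu, vendor, kernel, libc = parts
        return Target(cpu, vendor, kernel, libc)
    if len(parts) == 3:
        cpu, second, third = parts
        if second in KNOWN_KERNELS:
            # Shortened cpu-kernel-libc form with the vendor dropped.
            return Target(cpu, None, second, third)
        # Traditional cpu-vendor-os form; the libc is implied by the OS.
        return Target(cpu, second, third, None)
    raise ValueError(f"unsupported tuple: {text!r}")


if __name__ == "__main__":
    for example in ("x86_64-pc-linux-gnu", "aarch64-alpine-linux-musl",
                    "x86_64-linux-musl"):
        print(example, "->", parse_tuple(example))
```

The guess in the 3-field branch is exactly the kind of ambiguity that makes the real parsing code so long.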


Because people treat libraries differently today than 30 years ago. We're used to the integration points for things being some form of blocking IPC (like dbus on GNU/Linux, COM, or syscalls) but libc is different.

libc is that service on the base OS. But rather than connecting to an OS service and passing messages back and forth you dlopen and setjmp to do the same thing. On GNU/Linux libc isn't an interface to the NSS service, libc is the NSS service. The fact that you access it via your linker is just an implementation thing.

The kernel itself actually exposes integration points this way too with the vDSO! The kernel will actually just stick its own routines in your program's memory space so that you can avoid the syscall overhead for certain calls.



I think back then libresolv was separate from libc since many programs didn’t need it, and memory was tight


libc provides the standard POSIX sockets APIs, which include DNS functions such as gethostbyname() and getaddrinfo()


I certainly don't want DNS resolution inside the kernel, and outside the kernel, libc is as "base OS" as "base OS" comes, imo.


The Linux kernel can actually resolve names but it farms out the actual work to userspace using the request-key(8) machinery.

I personally think it should be renamed, because it's a generic way for the kernel to ask userspace for data, not just keys, but still.


Someone added it to libc and now it needs to be provided forever for compatibility.


Typically, on a modern Linux system, DNS queries from libc (and everything else) will always query a local resolver (example: systemd-resolved).


That's up to your distro technically. DNS queries that use glibc (so everything basically) parse /etc/nsswitch.conf and follow the path of NSS modules which can do whatever they want to produce a name.

The resolve module provided by systemd talks to systemd-resolved but the dns module parses /etc/resolv.conf and does the resolution itself.
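
To see which modules a given box would consult, you can read the `hosts:` line yourself. A minimal sketch, assuming the usual `database: source [actions] source ...` layout; the sample file in the code is invented, not taken from any particular distro.

```python
import re
from typing import List


def hosts_sources(nsswitch_text: str) -> List[str]:
    """Return the NSS sources listed for the `hosts` database, in order."""
    for raw_line in nsswitch_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line.startswith("hosts:"):
            continue
        services = line[len("hosts:"):]
        # Drop bracketed action items such as [!UNAVAIL=return].
        services = re.sub(r"\[[^\]]*\]", " ", services)
        return services.split()
    return []


# Invented sample content for illustration only.
SAMPLE = """\
# /etc/nsswitch.conf
passwd: files systemd
hosts:  files resolve [!UNAVAIL=return] dns myhostname
"""

if __name__ == "__main__":
    print(hosts_sources(SAMPLE))
```

With glibc, each name in that list corresponds to a `libnss_<name>.so.2` module that gets loaded into the calling process.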


"For what it's worth, musl's DNS resolver is slated to gain support for TCP responses in the near future" from https://www.linkedin.com/pulse/musl-libc-alpines-greatest-we...


Ha, I ran into that one at work one time in our custom DNS resolver and had to add TCP upgrade. I was very confused why the tool worked when I shelled out to dig but not when I did the “correct” thing and used a resolver library. I’m very surprised that a project as big as musl would not have support for this.
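
For anyone who hits the same thing, the upgrade is not much code. Below is a minimal sketch, not the resolver from my job: it sends an A query over UDP, checks the TC (truncated) bit in the header, and repeats the same query over TCP with the 2-byte length prefix. It assumes an IPv4 server address, does no answer parsing or source validation, and every function name is made up for the example.

```python
import socket
import struct

TC_FLAG = 0x0200  # "truncated" bit in the DNS header flags word


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal recursive query for the A records of `name`."""
    # Flags 0x0100 = recursion desired; one question, no other sections.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def is_truncated(response: bytes) -> bool:
    """True if the server set the TC bit, meaning the answer did not fit."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_FLAG)


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes from a stream socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        data += chunk
    return data


def query(name: str, server: str, port: int = 53, timeout: float = 3.0) -> bytes:
    """Return the raw DNS response, retrying over TCP when truncated."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(packet, (server, port))
        response, _ = udp.recvfrom(4096)
    if not is_truncated(response):
        return response
    # Over TCP each DNS message is preceded by a 2-byte length field.
    with socket.create_connection((server, port), timeout=timeout) as tcp:
        tcp.sendall(struct.pack("!H", len(packet)) + packet)
        (length,) = struct.unpack("!H", recv_exact(tcp, 2))
        return recv_exact(tcp, length)
```

The whole "upgrade" is the `is_truncated` check plus the second `with` block.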


If someone on my team had built an application and put 100 hosts into a DNS server, I would suggest they upload their hosts file to a webserver someplace. 100 hosts just doesn't do anything useful with most applications using gethostbyname() even in glibc, it's going to be slow, and the bug reports you get are going to be really confusing. Custom applications that are prepared to deal with all 100 hosts will be easier to implement using the output from a webserver.

What are you doing?


Service-Discovery-Over-DNS is typically the use-case. It's used as a compatibility layer for software where you either can't or don't want to integrate the native discovery APIs. Consul is a good example of this. You don't actually have to know how to speak Consul to get automatic service discovery, all you have to do is query a DNS name to get the hosts registered for a particular service.


That doesn’t answer my question at all, unless you mean that people make bad engineering decisions because they like using cute things.

dig is not affected by alpine’s decision here because dig does not use gethostbyname.

No DNS client would be.

This affects gethostbyname which very few programs in my experience even support robustly, so any “use-case” where someone is using 100 results would surprise me.

It seems if you need to write something custom, a www client is better (which consul also supports).

I think if you insist on writing gethostbyname instead of the res_* calls in bind, and robustly handle all results in a sensible way, then that’s silly, and if you have an existing application that works great with ~70 addresses but not 100 I would be curious to know what it is.


I'm not really sure what you mean by "support gethostbyname robustly" or that "dns clients aren't affected." Because on a GNU/Linux system the only correct method of resolving DNS is by using gethostbyname (or nowadays getaddrinfo) and friends. If you do anything else things will be broken because you aren't following the distro's/system integrator's/sysadmin's/user's configured NSS modules for name resolution.

And getaddrinfo returns a linked list of results so it's not exactly hard to support 100 results. All the actual junk about TCP/UDP is completely abstracted away from the caller.
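
To show how little the caller has to do, here is a small sketch using Python's `socket.getaddrinfo`, which wraps the libc call and hands the results back as a list of tuples. The helper name `all_addresses` is mine, and the usage line uses a numeric address so it does not depend on any resolver.

```python
import socket
from typing import List


def all_addresses(host: str, port: int, flags: int = 0) -> List[str]:
    """Collect every distinct address returned for `host`, in order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM, flags=flags)
    addresses: List[str] = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        address = sockaddr[0]  # first element is the address for IPv4 and IPv6
        if address not in addresses:
            addresses.append(address)
    return addresses


if __name__ == "__main__":
    # AI_NUMERICHOST skips name resolution entirely, so this always works.
    print(all_addresses("127.0.0.1", 80, socket.AI_NUMERICHOST))
```

Whether the list holds 1 entry or 100, the loop is the same; how the answer arrived is invisible at this level.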

So sure, while you could use your own DNS client specifically for talking to Consul's DNS server the whole point of the thing is to act as a compatibility layer for software you didn't write and which will 100% of the time use glibc's methods.


> on a GNU/Linux system the only correct method of resolving DNS is by using gethostbyname

I don't think that's right.

gethostbyname() doesn't query DNS, it queries names, which includes /etc/hosts, and possibly NIS, active directory, and other possible things. Most applications would never be expecting 100 results from one of these queries and many will not tolerate it well.

Specialised users of gethostbyname() can certainly do better, but what I doubt is the wisdom of such specialisation: It certainly has nothing to do with the application -- it is literally under the control of the network administrator as you are well aware. Specialisation can occur in your application, but it can just as easily specialise another way.

On the other hand, if your application really wants to specially speak to Consul's DNS (as opposed to whatever the network administrator is doing) it can definitely use res_query()

> so it's not exactly hard to support 100 results

Maybe we mean different things by "support": What do you do with them?

> I'm not really sure what you mean by "support gethostbyname robustly"

When most applications connect to a host they get from gethostbyname they often connect to the first, and give up if the connection opens and resets: This is exceptionally common with load balancers and address translation. To those applications, what is the point of giving them multiple results in this situation?

A few applications try to handle the result robustly: connect to a random member of the list, or connect to several in parallel and try the request in parallel. Some applications do really wild stuff here to make a good user-experience.

Most do not.

When someone types `ping google.com` (for example) you only ever get one result. If that name doesn't ping, it doesn't try another.

Most are like that.

Hopefully that makes what I mean by "robustly" clearer.
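
As a sketch of the simplest "robust" behaviour I have in mind, assuming you already hold the full address list: shuffle it and walk through it until one connection succeeds. The names `connect_any` and `tcp_connect` are invented for this example, and the `connect` parameter exists only so the logic can be exercised without a network.

```python
import random
import socket
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def tcp_connect(address: Address, timeout: float = 3.0) -> socket.socket:
    """Default connector: open a TCP connection to one address."""
    return socket.create_connection(address, timeout=timeout)


def connect_any(
    addresses: Iterable[Address],
    connect: Callable[[Address], object] = tcp_connect,
    rng: Optional[random.Random] = None,
):
    """Try the addresses in random order and return the first success."""
    candidates = list(addresses)
    (rng or random).shuffle(candidates)  # spread load across the hosts
    last_error: Optional[OSError] = None
    for address in candidates:
        try:
            return connect(address)
        except OSError as error:
            last_error = error  # remember the failure, try the next host
    raise last_error or OSError("no addresses to try")
```

An application that only ever calls the equivalent of `connect(addresses[0])` gets nothing out of the other 99 results.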


Does it matter whose fault it is when one just needs to ship a product and keep it running reliably?

I'm conflicted on this. I'm currently running an Alpine-based container in production but am thinking about revisiting the choice of base image.

On the one hand, using a smaller base system and (especially) a simpler libc translates to a smaller attack surface, and less noise in static scans for vulnerabilities. So I could argue that using Alpine is the responsible choice from a security perspective.

But maybe I'm just rationalizing a desire to pursue the kind of software quality (simplicity, minimization of bloat) that only we developers appreciate and that often has hidden downsides. Then I read about such downsides, like the sibling comment about DNS resolution, and I wonder if the responsible thing to do as a pragmatic product developer (and future manager of such developers) is to banish Alpine from the stack, tolerate the relative bloat of something like Debian, and throw more (and more complex) tools at the problem of the larger attack surface and more noise in vulnerability scans.


alternatively you could just use a dedicated local resolver


unfortunately that will not solve the problem: DNS RRsets cannot be split up, so if the answer is too big for UDP, it will still be too big when requested via a local resolver


The `dnsfunnel` resolver has been added to Alpine to solve that issue. Additionally, musl will gain TCP support for situations like DANE where individual records may be very large.


I mean, okay, it's the program's fault. But it is part of the trade-off you make using musl, and one that I, personally, wouldn't decide in its favor.

As the user, you have the choice to use musl and "battle quite a few portability issues" (whether that's the program's fault or not!) or to use glibc and not have to battle. If the benefits of musl outweigh that in your opinion, then go ahead. I don't see it.


Agreed. People knowingly using musl usually know the benefits.


For a very long time glibc was the only game in town, so to some extent glibc was Linux. You can forgive people for not testing with other libcs if none was available to test against.


Maintainer of the Alpine package referenced here. It's called `glibc`, makes no claims on Alpine Linux and certainly isn't a Linux distribution. You install the package in Alpine Linux, nothing else.


Most people's experience with Alpine is through containers. The fact that you provide `alpine-glibc` containers is likely where the confusion comes from. If you don't know the backstory, you could easily assume this is an Alpine-endorsed container.


I don't provide "alpine-glibc" Docker images or containers, so I don't really know what you're referencing here. I maintain an Alpine package called `glibc`. Please enlighten me!


Try nix sometime! When I'm forced to use Docker as a container runtime, nix has made building lightweight images fairly painless without the manual dependency management I see in so many dockerfiles.

https://nix.dev/tutorials/building-and-running-docker-images


If it's alpine with glibc I don't see what's wrong with calling it that


What's wrong with that is the reason trademarks exist: Confusion.

If something is called "alpine-glibc", it's reasonable to expect that it's by the alpine people and supported like alpine.

This, as we see here, annoys the alpine people because they now get bug reports and support requests from people using it, and have to direct people elsewhere. And when it doesn't work, they get the hit to their image even tho they've had nothing to do with it.


It's the old Iceweasel story all over again. Sadly open-source is ill-equipped to handle naming issues.


I think it's clear from the website that this is based on Alpine and not part of the project https://hub.docker.com/r/frolvlad/alpine-glibc/

Without context, I think what this guy is pissed off about is that this project enables people to use alpine to run proprietary software.


It doesn't outright state anywhere that it's not an official alpine product, and I don't know if everyone reads the site - many people will just copy the "FROM frolvlad/alpine-glibc" from elsewhere.

In any case, the Alpine people would know, and they apparently think it's a problem.

>Without context,

But you have context here! It has issues with symbol versioning! There is "strange behavior and possible crashes, ". This makes alpine look bad, because people think it's the alpine project's fault!

> what this guy is pissed off

Please don't assume everyone is a "guy". In this case, Ariadne is not a "he" (she uses "she"), so a male-coded word like "guy" is ill-fitting.


  FROM frolvlad/alpine-glibc
That namespacing does make it look pretty unofficial to my eyes.


But do you know if alpine does official docker images namespaced as e.g. alpine/?

Is "alpine-glibc" an official project and someone just helpfully made the image (or an image including an official glibc package)? Or is this a prerelease?

Without a deep knowledge of alpine (or now reading this post) I couldn't answer any of these questions and I'm not sure I wouldn't try to go to alpine for bug reports. I think there's a reasonable potential for confusion, even with the namespace. (but granted, I don't use docker either, so maybe this is a common thing)

And I assume the alpine people (like the author) know that they get bug reports for it and that the issues with it cause bad publicity, and that that's the context for the post and the proposal to block the package.


The answer to the first question is an obvious yes, and it's obvious to anyone that uses Docker. No deep knowledge required.

If you want Alpine, you do FROM alpine:(version).

https://hub.docker.com/_/alpine


The problem is the people who use Docker but don't know anything about it. Or don't know that mixing libcs is a problem. They use this glibc package, break their containers and then blame Alpine for the breakage.


Ah, that's good to know, thanks!


> > what this guy is pissed off

>Please don't assume everyone is a "guy". In this case, Ariadne is not a "he" (she uses "she"), so a male-coded word like "guy" is ill-fitting.

Please do not mince words, people have a tendency to refer to their own gender identity when referring to people whose gender identity they do not know. You knew what they meant.

from the hn guidelines:

> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.


There Are No Women On The Internet, so they got corrected. It isn't a bad faith interpretation, more like a gentle correction.


"somebody using a generic is really suggesting that women don't exist on the internet" sounds exactly like a bad faith interpretation.


I'm having a hard time here - do you really not see the connection between what I'm saying, what you're saying, and the use of a default masculine pronoun?


I see the connection, what I'm suggesting is that you knew what they meant when they said guy, meaning you did not need to suggest that they thought women didn't exist on the internet, nor did you need to suggest that they were assuming everybody on the internet is a man.

Both of those statements are bad faith.


I don't think we're operating under the same impression of bad faith. You seem to be using it to mean "an argument or line of reasoning I don't find compelling", where I am using it to mean "bringing up a line of argument or discussion for some reason other than participating".

So to circle back around on this, some people feel like using male as the default gender is rude and exclusionary. Try assuming people are women, just as an experiment, and see what sort of push back you get. This isn't in any way a bad faith argument.


Except it's not, as it mixes glibc with musl in ways that induce undefined behaviour you don't expect. If it was recompiling all packages to use glibc, the name would be more appropriate… but also still a trademark violation.


So, of the 49,530 images that show up with several using Alpine somewhere in the name or description... you think this is a trademark violation how? Alpine is synonymous with lightweight images. Several people and vendors use it in their image names.


There is a difference between "python:3.10-alpine" and "alpine-glibc".

One stands for: "we use alpine" (not a trademark violation)

The other one stands for "this is alpine" (a trademark violation)


First, the repo name is alpine-pkg-glibc because it is merely a package you install on Alpine. The container name, created by a different individual is frolvlad/alpine-glibc, and they make it clear that it is based off Alpine with the glibc package installed. In fact, you can even look at the source code. This is ridiculous, and if Alpine starts going after people for using alpine in the container image name or tag then I now know what distro to avoid entirely.


  debian-stable
  debian-buster
  debian-slim
Those names are clearly not packages but distros.

  python:3.10-debian
  python:3.10-alpine
Those names are clearly packages based on a distro.

So why not:

  glibc-alpine
This would avoid confusion.

> they make it clear that it is based off Alpine with the glibc package installed

Well, seeing the number of issues opened on the official Alpine bug tracker regarding this package, it seems it's not that clear.

> if Alpine starts going after people for using alpine in the container image name or tag then I now know what distro to avoid entirely.

Alpine starts going after people for misusing the name and impacting their reputation. This is completely normal and understandable.

Marketing and communication are an important part of every project, even open source projects; this is not exclusive to businesses. If you want people to support your project, you need to protect your image.


I do expect images with "alpine" in the name to be based on the Alpine distro. And not some weird bastard of Alpine. And I see no problem with them asserting their trademark here.


This is merely a package you install on Alpine so it isn't a fork. Did you not do any independent research and just assume? Want to move the goalposts?


It makes it look like an "official" project.

Just give it a different name and explain that it's Alpine with glibc somewhere


It does not. They use Alpine in the name so people can find the image. Searching for Alpine on Docker Hub has 49,530 results. This image doesn't even show up on the first page of results. I think whoever wrote this needs to rethink how ridiculous they are being.


I assumed it did.



