Well yes, 12pt should be the same everywhere, but points are a terrible unit of measurement for anything computer-related.
Points are useful on paper only. They should only exist in the context of word processing and the like, where we expect things to be printed at an actual physical size, with 72 points to an inch.
For computer interfaces or web documents, we just need some kind of measurement that bears a relatively known proportion to the computer or browser's interface. Fortunately, CSS "px" does that rather well -- it doesn't indicate physical pixels, but rather logical pixels, and we all know what "16px" text looks like relative to our OS, and it works great. And happily, people rarely use "pt" in CSS, and in my opinion it should never have been an option in the first place.
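For what it's worth, the CSS spec pins down the relationship exactly: 1in is defined as 96px and 1pt as 1/72in, so converting between them is pure arithmetic (the function name here is mine, just a sketch):

```python
def pt_to_css_px(pt: float) -> float:
    """Convert CSS points to CSS logical pixels.

    CSS defines 1in = 96px and 1pt = 1/72in, so 1pt = 96/72 px.
    These are logical units, not physical pixels.
    """
    return pt * 96.0 / 72.0

print(pt_to_css_px(12))  # 16.0 -- why "12pt" and "16px" text render identically in CSS
```

Which is also why the "default" browser text size gets quoted as both 12pt and 16px.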
That's like telling me points are a terrible unit of measurement for print because we use different sizes of paper and points should be relative to the paper size.
The whole goal of points is that a point is a point is a point, regardless of whether I'm printing on a business card or a billboard. Why can't we have that uniformity for digital devices, too?
Older industries often had specialty units for measurement. The Point unit was designed during the early days of printing, so the system is several hundred years old. It was by no means the only system, but it is the one that seems to have taken over the industry by the 20th century.
A point is about 1/72 of an inch, or around 0.35 millimeters.
The usefulness of a specialty unit, particularly in a pre-digital, largely labor-intensive world, is that you can tailor the unit so that the common sizes are easy to remember and subdivide usefully, without having to do a lot of math involving fractions or decimals.
The most commonly used type sizes are around 12 points, or one pica (1 pica = 12 points). 12 is a good number because it has several whole-number factors: 1, 2, 3, 4, and 6. Most common font sizes are even numbers, and the most commonly used sizes are obtained by adding, subtracting, multiplying, and dividing these numbers. Thus in mechanical type you would often see fonts at 6, 8, 10, 12, 14, 16, 18, 24, 28, 36, 48, 64, and so on. I'm not saying you never saw sizes other than these, but they were comparatively rare. Remember, each size of a typeface had to have its letter forms cut, molds made, type cast, stored, and maintained. A big shop might have a larger collection; a smaller shop would get by with fewer sizes.
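The divisibility argument is easy to check for yourself, a trivial sketch:

```python
# Whole-number divisors of 12 (short of 12 itself), per the pica argument above.
divisors = [d for d in range(1, 12) if 12 % d == 0]
print(divisors)  # [1, 2, 3, 4, 6]

# Compare a "round" decimal number like 10, which splits far less conveniently:
print([d for d in range(1, 10) if 10 % d == 0])  # [1, 2, 5]
```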
These days it's certainly no big deal to specify a 17.3 pt font size. At one time, when digital type still relied more heavily on hand-tuned, pixel-based versions for screen display, the numbers above were useful because you could have algorithms for regular pixel thinning, since the numbers all factored more easily; but even these used to require a lot of hand tuning.
And even now it's useful for humans to deal with a few, easily distinguished font sizes, at least conceptually. It's easy to remember what an 18 pt font might look like vs., say, a 24 pt character. And simply using whole-number millimeters is a bit coarse right around the range of most common variation, say 6 to 24 points.
The thing is that with computers it shouldn't really matter what unit you use, assuming you have enough resolution and appropriate anti-aliasing. But the unit should be an absolute measurement, not something based on an undefined pixel size.
For a while there was a sort of standard. In the early days of the Macintosh, Apple declared that one screen pixel was one point, and for practical purposes we would pretend that was 1/72 of an inch. And it was pretty close, at least if you were only using a word processor. Designers did have to worry about the difference and understood that the image on the screen was not exactly the same size. And there were wrinkles and variations in how the various font rendering technologies dealt with the issue.
The simplest solution would be to simply require the OS to know the actual size and aspect ratio of the pixels on your screen. But getting everyone to rewrite the software to be pixel agnostic is going to be the rub.
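A sketch of what "the OS knowing the actual pixel size" would amount to, assuming the display reports its resolution and physical diagonal (the function names are mine):

```python
import math

def effective_dpi(width_px: int, height_px: int, diagonal_inches: float) -> float:
    """Pixels per inch, computed from resolution and the panel's physical diagonal
    (assumes square pixels)."""
    return math.hypot(width_px, height_px) / diagonal_inches

def pt_to_device_px(points: float, dpi: float) -> float:
    """A true point is 1/72 inch, so its size in device pixels depends on the DPI."""
    return points / 72.0 * dpi

dpi = effective_dpi(2560, 1440, 27.0)      # a common 27" desktop panel: ~109 DPI
print(round(pt_to_device_px(12, dpi), 1))  # ~18.1 device pixels for a true 12 pt
```

On Apple's old 72 DPI assumption that same 12 pt would be exactly 12 pixels, which is the gap designers had to keep in their heads.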
For UI uses, points would seem not terribly useful. Many elements would be large numbers, so the advantage over, say, centimeters runs in the wrong direction. And you are going to see much more variation in size among UI elements than you do in typography: you have little bitty UI elements, and you have UI elements that easily take up half the page or screen. Normal metric measurements would seem the most appropriate.
And in fact, in typography and page layout you see units other than points used to describe things. There's the aforementioned pica, which is often used to describe things like column and page widths. And of course, these days, plain old inches and centimeters are used for laying out pages all the time.
Edit: Pixels do still factor heavily into measurement on video displays, as opposed to computer displays. A video display is "stupid" in the sense that it has a known number of pixels, but the physical size of the display is assumed to vary quite a bit, from a couple of inches to tens of meters. Of course, these screens aren't conceived of as an up-close medium. You look at them from across the living room, or across the football stadium.
In video it is quite common for measurements to be expressed in percentages of the screen rather than pixels or absolute measurements. You have concepts like "lower thirds" for supplementary text and images, or 20% safe areas around the edges of the screen. You can put things in the center of the field or use the old standby "rule of thirds" from photography composition. Confusingly and irritatingly, type is typically specified in pixels. And yes, this does cause problems when designing imagery for use on different devices like real-time character generators and special compositing hardware. These days you simply make the assumption that you are designing for one particular format, say 720p, specify in pixels for that, and then "do the math" when targeting the graphics for use at other formats.
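That "do the math" step is just linear scaling by frame height, assuming square pixels and matching aspect ratios; a sketch (names are mine):

```python
def scale_for_format(px: float, source_height: int, target_height: int) -> float:
    """Rescale a pixel measurement specced for one video format to another,
    assuming square pixels and the same aspect ratio."""
    return px * target_height / source_height

# A 32 px caption specced at 720p, retargeted to 1080p and 2160p (4K):
print(scale_for_format(32, 720, 1080))  # 48.0
print(scale_for_format(32, 720, 2160))  # 96.0
```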
Points/Picas are common in the print world. I assume he used them for the text example because that's what he was talking about. I don't know how well they'd translate for UI elements though.
But on any piece of paper, you know what the context is. You know if you're designing for a billboard or a business card. You never design anything that gets printed on both without human intervention in the middle.
We don't want that for digital devices, because having text be the same physical size on my iPhone and my iMac would be a usability disaster. But I know that CSS "16px" text on both devices will be readable and a little on the large side.
Some random points here.
(Disclosure: I'm a professional UI designer with a strong development background. Erstwhile CS dude, but still geek to the core. I consulted with the OP on this article.)
----
You really don't know what "16px" means with respect to your OS. The DPI presented to the user is hardware dependent, not OS dependent. True, on certain tablets and phones the OS and the hardware are bound together (iPads, etc.), but not so on the desktop, where screens have different DPI (as evidenced by the OP's photos of his screen setup).
This issue is pretty subtle -- there are lots of places where ems are the better way to specify point size, like print-centric experiences.
Other experiences, like 10' UI on set top boxes, I could argue that pixels are as good as anything given (a) the huge range of screen sizes and distances-to-screen the interface will find itself in, and (b) pixels are, for all practical purposes, the same as expressing things in percentages (HD being a fixed size), but easier for designers to reason about.
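The "pixels are effectively percentages" point holds because the HD frame size is fixed for the life of the design; a sketch of the equivalence (names and the 1080p frame are my assumptions):

```python
FRAME_W, FRAME_H = 1920, 1080  # 1080p: a fixed frame, so px and % are interchangeable

def px_to_pct(px: float, frame_dim: int) -> float:
    """Express a pixel measurement as a percentage of a frame dimension."""
    return 100.0 * px / frame_dim

# A 20% safe-area margin and the lower-third boundary, expressed both ways:
print(px_to_pct(216, FRAME_H))  # 20.0  (216 px = 20% of frame height)
print(px_to_pct(720, FRAME_H))  # ~66.7% -- the top edge of the "lower third"
```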
And finally: 12 point on the screen might be some other type size when moving to mobile (probably smaller) and should almost certainly be using different layout techniques when moving to mobile. Line length, inter-line spacing and distance to screen are all in play here, so don't get suckered into the idea that you can spec the text once and expect scaling on the platform to automagically adjust it everywhere.
Related: responsive layout. Spec'ing a type size is really only useful within a known domain range. You should think through the design as you move from platform to platform.
Quite the mess. There's no Grand Unified Theory for this stuff yet, and I'm not sure it even makes sense to go find one. Great design isn't a matter of being uniform, but of being right for the purpose.
That said, the OP is dead on. IF you're in a place where specifying points is the right thing to do, then a point should be a point, not a pixel.