I rather like this. A represents major changes like a substantial redesign of the whole API, while B catches all other breaking changes. Tiny changes to the public API of a library may not be strictly backwards compatible, even if they don't affect most users of the package or require substantial work to address.
A problem with Semver is that a jump from 101.1.2 to 102.0.0 might be a trivial upgrade, and then the jump to 103.0.0 requires rewriting half your code. With two major version numbers, that would be 1.101.1.2 to 1.102.0.0 to 2.0.0.0. That makes the difference immediately clear, and lets library authors push a 1.103.0.0 release if they really need to.
In practice, with Semver, changes like this get reflected in the package name instead of the version number. (Like maybe you go from data-frames 101.1.2 to data-frames-2 1.0.0.) But there's no consistent convention for how this works, and it always felt awkward to me, especially if the intention is that everyone migrates to the new version of the API eventually.
You put into words why I appreciate SemVer so much! It is so much better at being deterministic and therefore allows me a greater confidence in version control.
The author of a library has no idea how tightly coupled my code is to theirs, and should therefore only give yes/no answers to "is this a breaking change?"
For example, when a large ORM library I use changed a small thing like "no longer expose db tables for certain queries because not all db engines support it anyway" (i.e. moving a protected property to private), it required a two-week effort to restructure the code base.
> In practice, with Semver, changes like this get reflected in the package name instead of the version number.
Not once have I seen this happen. Any specific examples?
The practical upside is that it makes using higher-order functions much smoother, with less distracting visual noise.
In Haskell this comes up all over the place. It's somewhat nice for "basic" cases (`map (encode UTF8) lines` vs `map (\ line -> encode UTF8 line) lines`) and, especially, for more involved examples with operators: `encode <$> readEncoding env <*> readStdin` vs, well, I don't even know what.
You could replace the latter uses with some alternative or special syntax that covered the most common cases, like replacing monads with an effect system that used direct syntax, but that would be a lot less flexible and extensible. Libraries would not be able to define their own higher-order operations that did not fit into the effect system without incurring a lot of syntactic overhead, which would make higher-order abstractions and embedded DSLs much harder to use. The only way I can think of for recovering a similar level of expressiveness would be to have a good macro system. That might actually be a better alternative, but it has its own costs and downsides!
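A minimal sketch of the "less visual noise" point, using made-up names rather than the `encode`/`readStdin` example above: with currying, a partially applied function slots straight into a higher-order position; without it, every call site grows an explicit lambda.

```haskell
import Data.Char (toUpper)

-- Curried style: partial application of `map toUpper` is free.
shout :: [String] -> [String]
shout = map (map toUpper)

-- The same function with every lambda spelled out:
shoutVerbose :: [String] -> [String]
shoutVerbose xs = map (\line -> map (\c -> toUpper c) line) xs

main :: IO ()
main = do
  print (shout ["hi", "there"])
  print (shoutVerbose ["hi", "there"])
```

Both definitions behave identically; the difference is purely in how much syntax sits between the reader and the idea.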
In a non-strict language without side-effects, having a function with no arguments does not make sense. Haskell doesn't even let you do that.
You can write a function that takes a single throw-away argument (e.g. `0` vs `\ () -> 0`) and, while the two have some slight differences at runtime, they're so close in practice that you almost never write functions taking a `()` argument in Haskell. (Which is very different from OCaml!)
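A small illustration of why, under laziness, the `()` argument buys you almost nothing (names here are invented for the example):

```haskell
-- A plain binding is already a thunk: nothing is computed until demanded,
-- and the result is cached (shared) after the first force.
expensive :: Int
expensive = sum [1 .. 1000000]

-- The OCaml-style explicit-unit version; the main runtime difference is
-- that this recomputes the sum on every call instead of sharing it.
expensiveFn :: () -> Int
expensiveFn () = sum [1 .. 1000000]

main :: IO ()
main = do
  print expensive          -- forced (and cached) here
  print (expensiveFn ())   -- recomputed at each call site
</imports>
```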
They've been gamed in the "study for the test" sense for years—a sort of human over-fitting—but managers did not mind. I've heard some insist that this is a feature, not a bug. (Depending on levels of cynicism, it's either testing for diligent workers who put effort into preparation, or selecting out non-conformists who aren't willing to put up with management bullshit.)
LLMs make it easier to cheat and give managers a push to develop new, AI-aware, assessment methods, but don't really change the underlying organizational dynamics that led to these tests in the first place.
> in what sense is make_string_comparator_for_locale "really" a function which takes a locale and a string and returns a function from string to ordering?
In the sense that "make_string_comparator" is not a useful concept. Being able to make a "string comparator" is inherently a function of being able to compare strings, and carving out a bespoke concept for some variation of this universal idea adds complexity that is neither necessary nor particularly useful. At the extreme, that's how you end up with Enterprise-style OO codebases full of useless nouns like "FooAdapter" and "BarFactory".
The alternative is to have a consistent, systematic way to turn verbs into nouns. In English we have gerunds. I don't have to say "the sport where you ski" and "the activity where you write", I can just say "skiing" and "writing". In functional programming we have lambdas. On top of that, curried functions are just a sort of convenient contraction to make the common case smoother. And hey, maybe the contraction isn't worth the learning curve or usability edge-cases, but the function it's serving is still important!
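To make the "lambdas as gerunds" point concrete, here is a sketch with a hypothetical `Locale` type and a stubbed-out comparison: one curried function covers both "compare two strings" and "make a string comparator", because the partially applied form *is* the noun.

```haskell
import Data.List (sortBy)

-- Hypothetical locale type; the real logic is stubbed out below.
data Locale = En | Sv

-- Fully applied: compares two strings in a locale.
-- Partially applied (`compareInLocale En`): a "string comparator".
compareInLocale :: Locale -> String -> String -> Ordering
compareInLocale _ = compare   -- stub; a real version would consult the locale

main :: IO ()
main = print (sortBy (compareInLocale En) ["pear", "apple"])
```

No `StringComparatorFactory` needed: partial application gives you the factory for free.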
> Because of point 3, our codebase has a trivial wrapper to put round functions when your function actually returns a function
That seems either completely self-inflicted, or a limitation of whatever language you're using. I've worked on a number of codebases in Haskell, OCaml and a couple of Lisps, and I have never seen or wanted anything remotely like this.
No, it's worse than that. The city council very much implemented an anti-car (harassment) policy, to the point that car owners felt hounded by their own council's policies. It seriously wasn't a matter of "marginally less privileged".
Motorists are an easy scapegoat but without alternatives it's just political handwaving. And most people are motorists.
Take my city for example. I work in an office block about a 15-minute walk from the centre, which has free parking for employees. Monday this week the city announced that the land is now paid parking to the city, effective immediately. When it was pointed out that they hadn't provided any of the necessary signage or machines for this, they decided it was illegal to park there at all, with fines and tow trucks for non-compliance. An email from them suggested "cycling or using public transport as the weather is nicer".
I cannot stress this enough. No warning, no compromise, no other use for this land, just an immediate draconian announcement.
It's very easy to call another group entitled if you're not one of them
> the city announced that the land is now paid parking to the city
what a strange way to put it... why didn't they just say that they are not using any more taxpayer money to finance your parking space? Land in a city is not "for free".
> It's very easy to call another group entitled if you're not one of them
I'll be totally honest: I don't know what the arrangement was before, but that free parking was previously enforced by permits, so it's a reasonable assumption that it was not at the taxpayers' expense.
Your job in any political office is not to leave everything as-is and to cement yourself into that position, but to make marginal improvements, even if doing so costs you the next elections or inconveniences people (hopefully only temporarily).
Most of those marginal improvements can only be seen as something positive in retrospect, not while they're being made. While they're being made, they'll always be unpopular, as the voter base is usually not keen on defending the people who are currently in charge. That doesn't mean they won't show up at the next elections, just that they are quieter in the meantime.
In the ideal world, maybe - but we don't live in the ideal world: most politicians are trying to get re-elected, or elected to a higher office now that they have experience.
And even in the ideal world, a great leader can do more in the next term if they get re-elected.
I don’t know that it’s a helpful distinction. A lot of people do it all - drive, walk, bike, and take public transit. Only in this kind of discussion do I see people declaring it a team you have to choose.
The starting point is anti-anything-but-a-car, so it's understandable that in the process of getting to any sort of parity you'd feel like it's "harassment".
It's like claiming getting rid of slavery is "harassment", because your unfair privileges are being taken back.
The rule of 3 is awful because it focuses on the wrong thing. If two instances of the same logic represent the same concept, they should be shared. If 10 instances of the same logic represent unrelated concepts, they should be duplicated.
The goal is to have code that corresponds to a coherent conceptual model for whatever you are doing, and the resulting codebase should clearly reflect the design of the system. Once I started thinking about code in these terms, I realized that questions like "DRY vs YAGNI" were not meaningful.
Of course, the rule of 3 is saying that you often _can't tell_ what the shared concept between different instances is until you have at least 3 examples.
It's not about copying identical code twice, it's about refactoring similar code into a shared function once you have enough examples to be able to see what the shared core is.
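An illustrative sketch (the validators here are invented) of how the shared core only becomes visible once you have a few instances side by side:

```haskell
-- Three similar-looking validators. With only one or two, the "right"
-- abstraction is a guess; with three, the varying parameter is obvious.
validName, validEmail, validBio :: String -> Bool
validName  s = not (null s) && length s <= 50
validEmail s = not (null s) && length s <= 100
validBio   s = not (null s) && length s <= 500

-- The shared function the three examples suggest: non-empty, bounded length.
validLen :: Int -> String -> Bool
validLen limit s = not (null s) && length s <= limit

main :: IO ()
main = print (validName "Ada", validLen 50 "Ada")
```

Whether the three really share a concept (a "bounded non-empty field") or merely share a shape is exactly the judgment call the rule of 3 is buying time for.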
But don’t let the rule of 3 be an excuse for you to not critically assess the abstract concepts that your program is operating upon and within.
I too often see junior engineers (and senior data scientists…) write code procedurally, with giant functions and many, many if statements, presumably because in their brain they’re thinking about “1st I do this if this, 2nd I do that if that, etc”.
I agree. And I think this also distills down to Rob Pike’s rule 5, or something quite like it. If your design prioritizes modeling the domain’s data, shaping algorithms around that model, it’s usually trivial to determine how likely some “duplication” is operating on shared concepts, versus merely following a similar pattern. It may even help you refine the data model itself when confronted with the question.
> If two instances of the same logic represent the same concept, they should be shared. If 10 instances of the same logic represent unrelated concepts, they should be duplicated.
Worth pointing out a success story: all ACM publications have gone open access starting this year[1]. Papers are now going to be CC licensed, with either the very open CC-BY[2] license or the pretty restrictive (but still better than nothing!) CC-BY-NC-ND[3] license.
Computer science as a discipline has always been relatively open and has had its own norms on publication that are different from most other fields (the top venues are almost always conferences rather than journals, and turn-around times on publications are relatively short), so it isn't a surprise that CS is one of the first areas to embrace open access.
Still, having a single example of how this approach works and how grass-roots efforts by CS researchers led to change in the community is useful to demonstrate that this idea is viable, and to motivate other research communities to follow suit.
That works nicely if your institution participates in ACM Open (no such institution in my country, and no, my country is not in the list of lower-middle income countries).
The combination of 'publish or perish' with 'pay for publication' and 'miserly grant money' is deadly.
While in theory the idea is nice, in practice this is a problem (maybe not in most rich countries, but here definitely).
Nowadays, you can always get the article you are interested in, even if it is behind a paywall. Hence, perversely, the old model (which I hate, for reasons well explained in the original post) worked better for me. :-(
> you can't spec out something you have no clue how to build
Ideally—and at least somewhat in practice—a specification language is as much a tool for design as it is for correctness. Writing the specification lets you explore the design space of your problem quickly with feedback from the specification language itself, even before you get to implementing anything. A high-level spec lets you pin down which properties of the system actually matter, automatically finds inconsistencies, and forces you to resolve them explicitly. (This is especially important for using AI, because an AI model will silently resolve inconsistencies in ways that don't always make sense but are also easy to miss!)
Then, when you do start implementing the system and inevitably find issues you missed, the specification language gives you a clear place to update your design to match your understanding. You get a concrete artifact that captures your understanding of the problem and the solution, and you can use that to keep the overall complexity of the system from getting beyond practical human comprehension.
A key insight is that formal specification absolutely does not have to be a totally up-front tool. If anything, it's a tool that makes iterating on the design of the system easier.
Traditionally, formal specifications have been hard to use as design tools partly because of incidental complexity in the spec systems themselves, but mostly because of the overhead needed to not only implement the spec but also maintain a connection between the spec and the implementation. The tools that have been practical outside of specific niches are the ones that solve this connection problem. Type systems are a lightweight sort of formal verification, and the reason they took off more than other approaches is that typechecking automatically maintains the connection between the types and the rest of the code.
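A minimal sketch of the "types as lightweight specs" point, with entirely made-up names: distinct newtypes make the compiler reject a swapped argument, so this fragment of the "spec" can't silently drift from the implementation.

```haskell
-- Wrapping raw Ints in newtypes encodes a design decision the
-- compiler will now enforce on every build.
newtype AccountId = AccountId Int deriving (Eq, Show)
newtype Amount    = Amount Int    deriving (Eq, Show)

-- The type is the spec: two distinct accounts and a positive amount.
transfer :: AccountId -> AccountId -> Amount -> Either String Amount
transfer _from _to (Amount n)
  | n <= 0    = Left "amount must be positive"
  | otherwise = Right (Amount n)

main :: IO ()
main = print (transfer (AccountId 1) (AccountId 2) (Amount 50))
-- Swapping arguments, e.g. `transfer (Amount 50) ...`, is a compile-time error.
```

This only checks a thin slice of what a real specification language can express, which is exactly the trade-off the comment describes: less power, but the spec-to-code connection is maintained for free.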
LLMs help smooth out the learning curve for using specification languages, and make it much easier to generate and check that implementations match the spec. There are still a lot of rough edges to work out but, to me, this absolutely seems to be the most promising direction for AI-supported system design and development in the future.