Hacker Times | Twey's comments

John Day talks about this a fair bit: what people want to do is naming of applications, in a way that makes it independent of addressing (i.e. where the application is running). DNS still names the machine — it's just a one-step abstracted machine addressing scheme, not an application naming scheme per se. Then we designed an ad-hoc protocol on top that associates applications with specific machine names (and port numbers): if you go to facebook.com:443 (a.k.a. https) you expect to find an instance of the Facebook application, not a webmail client or an SSH server.

This isn't how any of this was really supposed to work. Back in the day the application identifier was the _port number_, according to a big list maintained by IANA. The idea was that you could go to a machine (identified by IP or more conveniently by DNS) and see if it was running an instance of the ‘Facebook’ application, i.e. you'd find Facebook not at facebook.com:https but at meta.com:facebook. The end goal was to eliminate the need for the former part at some point, and come up with a better way of looking up applications than distributing a list by email. Instead the port number is now used to identify the transport and the host name encodes the application ID, which it was never meant for, and that's why we can't have nice things (like device mobility).
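
To make the contrast concrete, here is a rough sketch of the two lookup schemes. The four well-known ports are real assignments; the ‘facebook’ entry, its port number 49152, and both function names are hypothetical and exist only for illustration.

```python
# A tiny excerpt of real well-known port assignments.
WELL_KNOWN_PORTS = {"ssh": 22, "smtp": 25, "http": 80, "https": 443}

# Hypothetical: an imagined registry in which 'facebook' is an application
# with its own port. No such assignment exists.
HYPOTHETICAL_PORTS = {**WELL_KNOWN_PORTS, "facebook": 49152}


def endpoint_today(app_host: str, transport: str) -> tuple[str, int]:
    """Today: the host name picks the application, the port picks the transport."""
    return (app_host, WELL_KNOWN_PORTS[transport])


def endpoint_as_intended(machine: str, application: str) -> tuple[str, int]:
    """Original intent: the host names a machine, the port names the application."""
    return (machine, HYPOTHETICAL_PORTS[application])


today = endpoint_today("facebook.com", "https")
intended = endpoint_as_intended("meta.com", "facebook")
print(today, intended)
```

In the second scheme the machine name is the only part tied to a location, which is the part the original plan hoped to make unnecessary.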


The concept of linting feels somehow inimical to Markdown, a language designed to progressively enhance existing plain-text formatting conventions.

> On the higher philosophical level, I wanted to avoid the cursed tower-of-abstractions trap that I felt quite sharply in C++.

Then you might want to avoid computers in general: C also sits on a tower of ‘imaginary’ abstractions (binary, gates, functions, allocation, virtual memory…). The computer itself is an abstraction, and sits on top of a teetering tower of other abstractions like electronics, physics, and (if you want to get philosophical about it) discrete objects.

> same bytes packaged differently become entirely different incompatible entities (like std::string vs std::vector<char> vs std::valarray<> etc)

The same bytes simply _are_ (in)compatible with different things depending on semantic context, and C++ is just surfacing that to you. A given slice of linear memory (which is an abstraction, of course) could be representing a heap or a string with the same bytes, and you'd better know which when using them — that information is not stored in the bytes themselves. Your choice is either to represent this in a way that the compiler can check for you, or encode that information in freeform human documentation about how to avoid the error cases that the compiler can no longer help you with.
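
A minimal sketch of the point, using Python's `struct` module: the same four bytes decode to quite different values depending on which interpretation you bring to them. The variable names are just for illustration.

```python
import struct

# Four bytes with no intrinsic meaning of their own.
raw = bytes([0x00, 0x00, 0x80, 0x3F])

# Read as a little-endian IEEE 754 single-precision float.
as_float = struct.unpack("<f", raw)[0]

# Read as a little-endian unsigned 32-bit integer.
as_uint = struct.unpack("<I", raw)[0]

# Read as two little-endian unsigned 16-bit integers.
as_shorts = struct.unpack("<2H", raw)

print(as_float, as_uint, as_shorts)  # 1.0 1065353216 (0, 16256)
```

Nothing in `raw` says which reading is the right one; that knowledge has to live in a type or in documentation.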


https://en.wikipedia.org/wiki/Fundamental_theorem_of_softwar...

My point, more or less. Of course, a heap is different from an append-only array at a higher level. At a lower level, both are bit strings, which is handy if, for example, you can send either over the network with exactly the same function.


But my bitstring representation of a heap may not be your bitstring representation of the same heap. Neither C nor C++ makes enough guarantees about the representation of the heap that you can assume that, so a higher-level ‘heap’ type that abstracts over the respective representations of my heap and your heap is inevitable, and not something to be scared of. Of course you can reasonably ask for a language in which the representation of a heap is uniform across hardware, but it will come with some performance penalties.

Everything on the computer can be _represented as_ a bit string. But it's important not to confuse that fact with everything on the computer _being_ a bit string. The bit string is only a ‘name’ to represent the thing in conversation between people who share an understanding of what that representation should represent.
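
The simplest case of two parties disagreeing about a representation is byte order. A small sketch (variable names are illustrative only): the same integer has two different byte strings, and reading one with the other convention silently gives a different number.

```python
value = 1

# Two equally valid 4-byte representations of the same integer.
little = value.to_bytes(4, "little")  # b'\x01\x00\x00\x00'
big = value.to_bytes(4, "big")        # b'\x00\x00\x00\x01'

# Decoding with the convention it was written in recovers the value.
round_trip = int.from_bytes(little, "little")

# Decoding with the other convention yields a different integer.
misread = int.from_bytes(little, "big")

print(little != big, round_trip, misread)  # True 1 16777216
```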


In the same vein Wasm is also an option that might be more suited to this kind of thing thanks to the higher-level ABI (/components).


Even the _design_ of languages is a generational project. There are open problems in the PL space that we know need to be fixed but we have no idea how. Once you've got the ‘core ideas’ of the language in some papers somewhere (and proofs that they're coherent, which is usually the meat of the process) it's a pretty quick step to get a toy implementation, but then the path to an implementation that is usable for day-to-day work (especially if performance is important) can take decades, and the path to adoption after that can easily take between 5 and 20 more years.

Then people figure out the thing about your core ideas that gets in the way when writing the kinds of real-world programs they want to write and you get to go back to the drawing board for the next language idea :)

As an example of the kind of time scales involved, linear logic was introduced in 1987. Linear (well, affine) types made it into Rust 1.0 in 2015, which IMO is the first time substructural types have made it to a ‘mainstream’ (albeit still far from ubiquitous) language. And that's a very straightforward language feature that doesn't really challenge the dominant imperative/functional hybrid paradigm or have any inherent effect on performance (since it's ‘just’ a type system feature).

IMHO the next big thing up (assuming LLMs and other AI advancements don't throw everything off-kilter) is probably effect systems, introduced in ~2013, for which we can linearly extrapolate a time frame of about 2041!

But maybe not — one exciting thing that's been happening is that (as Jonathan Blow noted) with the growth of lower-level substrates like LLVM the work to go from a toy to a working and performant language has decreased significantly.


Were there perhaps [licensing issues](https://www.phoronix.com/news/Chardet-LLM-Rewrite-Relicense) with the original?


The traditional name for this spec is ‘source code’ — a canonical source of truth for the behaviour of a system that is as human-readable as we know how to make it, that will be processed by automated tools into a less-readable derived artefact for a computer to execute.

Checking the compiled artefact into the codebase without checking in its source code has always been a risky move!


A specification, whether formal or less formal, is very different from the source code.


But it is also always less specific than source code, even if the attempt is to dictate the latter as closely as possible.


I agree with you. I think the replies are misunderstanding the basis for code and specs and making semantic distinctions. Code is specs, just in a different syntax for machines to understand. This is a pillar of the discipline of requirement specifications that Uncle Bob talks about in Clean Agile.


Technology evolves and traditions change. What persists is the role, not the filename and its extension. Weddings are still weddings even after things went from painted portraits to film cameras to camcorders to smartphones to livestreams. Same with birthdays. Cards became phone calls, Facebook wall posts, group chats, shared albums, or generated videos (Sora, RIP).

The tradition of having a deck of punch cards evolved to having assembly, then Pascal, Fortran, C, BASIC. The important part is a human-auditable directive, not an opaque, generated artifact as the thing that matters.



> The important part is a human-auditable directive, not an opaque, generated artifact as the thing that matters.

Your arguments create a false dichotomy. You look at it from a consumer perspective, while coding and its artifacts are usually done by suppliers. If you change camcorder to TV advertisement, the requirements shift. The human-auditable directive and the outcome both matter. Coca-Cola probably has very high standards for their IP (the directive) and doesn't care about the outcome (AI slop ads). The result is disgruntled consumers.

If you don't care about the "opaque" generated artifact, then you are Coca-Cola.


As far as I understand it I'm on your side in this argument — I think ‘code’ continues to matter so long as the LLM-to-execution pipeline remains a leaky abstraction — but I don't think the analogy is correct. The ads are the resultant behaviour of the software, e.g. the UX, not the code.


>The traditional name for this spec is ‘source code’

Specs are the end goal, not how the software looks at a moment in time.


Specs also evolve over time. There's no ‘end goal’ because requirements are always changing.

Specs are traditionally more forward-looking only because, by removing a lot of the implementation details that are required to write code, the specification can be written to be much broader in scope than code in an equivalent time period. But periodically we invent software that lets us automatically fill in more details of the software that now don't need to be specified by humans, and a level of specification that was previously ‘spec’ turns into ‘code’.


A spec isn't code. There's a C language specification and many implementations. There are a handful of browsers each implementing HTML, JS, and CSS specs in their own way.


And given a C description of a program, a C runtime can implement that program in various different ways — interpreted vs compiled, explicit memory management vs garbage collection, different pointer sizes and memory layouts, parallelism at various points or not. It's turtles all the way down :) It just becomes ‘code’ at the point where a computer can execute it (in one way or another) without further human intervention.


The source code is not the specification; the source code is an implementation of the specification. The specification tells you what happens, the source code tells you how it happens. Ideally you also have some additional documentation for the why.


As any four-year-old can tell you, ‘why’ is infinitely recursive. ‘What’ from the perspective of level n is ‘how’ looking down from level n+1 and ‘why’ looking up from level n-k.


That usually does not matter in practice because you quickly reach a level of sufficient understanding.

We usually use UUIDs for this type of object but we have to send those objects to the legacy system XYZ, which only supports IDs with up to sixteen characters and is case-insensitive, so we generate sixteen-character random alphanumeric strings with uppercase letters, which provide 82 bits of entropy.

Could you go deeper? Sure. Why do we have to send those objects to XYZ? Why does the legacy system still exist? Why does it not support UUIDs? Why is there no secondary key specifically for that system? Why are we using UUIDs?

But most likely you do not have to spell all those out. The point of a why is to explain why something is not what one would expect; you explain on top of some common knowledge. Everyone involved might know what XYZ does and why some objects have to get sent there. If not, that is probably written down elsewhere. Why is the system using UUIDs? Maybe written down in the design for the persistence layer.
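
For what it is worth, the ID scheme from the example comment above might look something like this in code. This is only a sketch: the function name `new_legacy_id` and the constants are made up, and XYZ's real constraints would need checking against its actual documentation.

```python
import math
import secrets
import string

# Uppercase letters plus digits: 36 symbols, safe for a case-insensitive system.
ALPHABET = string.ascii_uppercase + string.digits
ID_LENGTH = 16


def new_legacy_id() -> str:
    """Return a random 16-character ID for the (hypothetical) legacy system XYZ."""
    return "".join(secrets.choice(ALPHABET) for _ in range(ID_LENGTH))


# Entropy in bits: length times log2(alphabet size), a little under 83.
entropy_bits = ID_LENGTH * math.log2(len(ALPHABET))

print(new_legacy_id(), round(entropy_bits, 1))
```

The arithmetic is the part worth keeping next to the code: 16 × log2(36) ≈ 82.7, which is where the "82 bits" in the comment comes from.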


Sure, I'm not suggesting we need to go into infinite regress for every explanation! I'm saying that you should bear in mind that you _are_ in the middle of an infinite stack, and what is a ‘how’, a ‘what’ or a ‘why’ is just a function of your current position in it relative to the thing you're talking about. In the ID generation code you might want to explain why you're using this weird format here instead of a more standard format (because it needs to be passed to legacy system XYZ). But if you go up a step or two to where the ID is passed to XYZ in code, that ‘why’ has become a ‘what’ — the calling code acts as a ‘specification’ for the behaviour of that ID generation code.


Intuitionistic logic is a refinement of classical logic, not a limitation: for every proposition you can prove in classical logic there is at least one equivalent proposition in intuitionistic logic. But when your use of LEM is tracked by the logic (in intuitionistic logic a proof by LEM can only prove ¬¬A, not A, which are not equivalent) it's a constant temptation to try to produce a constructive proof that lets you erase the sin marker.

In compsci that's actually sometimes relevant, because the programs you can extract from a ¬¬A are not the same programs you can extract from an A.
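
A small Lean 4 sketch of the distinction (the theorem names are my own, not from any library): the double negation of excluded middle has a purely constructive proof, as does going from A to ¬¬A, but getting from ¬¬A back to A requires reaching for a classical axiom.

```lean
-- Going from A to ¬¬A is constructive: no axioms needed.
theorem dn_intro (A : Prop) (a : A) : ¬¬A :=
  fun na => na a

-- The double negation of excluded middle is also constructive.
theorem nn_em (A : Prop) : ¬¬(A ∨ ¬A) :=
  fun h => h (Or.inr (fun a => h (Or.inl a)))

-- Erasing the double negation needs a classical principle.
theorem dn_elim (A : Prop) (h : ¬¬A) : A :=
  Classical.byContradiction h
```

Only the last proof depends on the `Classical` namespace, which is the ‘sin marker’ showing up in the proof term.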


And WhatsApp and Instagram were acquisitions, not creations.


Depending on your lambda calculus! From a categorical perspective a lambda calculus is just a nice syntax for Cartesian closed categories (or similar, e.g. *-autonomous categories for linear lambda calculus) so you can use it to reason about anything you can fit into that mould. For example, Paul Taylor likes to do exactly this: https://www.paultaylor.eu/ASD/analysis#lamcra

