I have just spent a month writing about 2000 lines of Forth. My answer is no, at least w/r to generating something that looks like the by-hand code I wrote. LLMs coast by on being able to reproduce idiomatic syntax and having other forms of tooling(type checkers, linters, unit tests, etc.) back them up.
But Forth taken holistically is a do-anything-anytime imperative language, not just "concatenative" or "postfix". It has a stack but the stack is an implementation detail, not a robust abstraction. If you want to do larger scale things you don't pile more things on the stack, you start doing load and store and random access, inventing the idioms as you go along to load more and store more. This breaks all kinds of tooling models that rely on robust abstractions with compiler-enforced boundaries. I briefly tested to see what LLMs would do with it and gave up quickly because it was a complete rewrite every single time.
Now, if we were talking about a simplistic stack machine it might be more relevant, but that wouldn't be the same model of computation.
> It has a stack but the stack is an implementation detail, not a robust abstraction.
Not exactly. Not only the stack is central in the design of Forth (see my comment over there [1]).
It seems to me that a point-free language like Forth would be highly problematic for an LLM, because it has to work with things that literally are not in the text. I suppose it has to make a lot of guesses to build a theory of the semantic of the words it can see.
Nearly every time the topic of Forth is discussed on HN, someone points out that the cognitive overload* of full point-free style is not viable.
I didn't finish my second sentence: not only the stack is a central element in the design of Forth, it also has 2 of them: data stack and return stack. the return stack is also used directly by programmers.
The claim seems extremely unlikely to me. LLM comprehension is very sophisticated by any metric, the idea that something as trivial as concatenative syntactic structure would make a significant difference is implausible.
LLMs handle deeply nested syntax just fine - parentheses and indentation are not the hard part. Linearization is not a meaningful advantage.
In fact, it’s much more likely to be a disadvantage, much as it is for humans. Stack effects are implicit, so correct composition requires global reasoning. A single missing dup breaks everything downstream. LLMs, and humans, are much more effective when constraints are named and localized, not implicit and global.
I’m not claiming forth should be used as is. I’ve opened the benchmark so others can reproduce the result I share in the post: https://github.com/rescrv/stack-bench
Most models are multi-paradigm, and so they get... Fixated on procedural language design. Concepts like the stack, backtracking, etc. violate the logic they've absorbed, leading to... Burning tokens whilst it corrects itself.
This won't show up in a smaller benchmark, because the clutching at straws tends to happen nearer to the edge of the window. The place where you can get it to give up obvious things that don't work, and actually try the problem space you've given.
Even though I really like postfix from an elegance standpoint, and I use an RPN calculator, IMO it's harder to reason about subexpressions with postfix. Being able to decompose an expression into independent parts is what allows us to understand it. If you just randomly scan a complex expression in infix, if you see parenthesis or a +, you know that what's outside of the parenthesis or on the other side of a + can't affect the part you're looking at.
If you're executing the operations interactively, you're seeing what's happening on the stack, and so it's easy to keep track of where you are, but if you're reading postfix expressions, it's significantly harder.
Probably prefix notation would work better, but I suspect that there would be a stronger effect from predictable and reasonable declinations/suffixes/prefixes at a grammatical level.
From where I sit today, I can say that I see things others don't. I have a hard time saying they aren't real, though, as experience has been a zero-knowledge proof the the commentary.
The front half of the book, up until a clear pivot chapter, is me trying to put to words a coherent schizophrenic vision; both to establish credentials and to give people a sense of what it can be like.
A common theme I've seen both in myself and in others who have spoken to me about their experience is that it's common to have a delusion that you are at the center of a religion.
Personally, I've been blessed with insights at the core of every major religion that I know more than a slight deal about. It just so happens that the episode I detail in this book has me at the center of a prophecy; It rings in my head as truth, but I don't have any proof of that truth to produce.
In the latter half of the book, I introduce a conversation with Master Yoshu that I feel is critical to the message. It would have been awkward to say, "I heard this from a voice" without the setup to give that.
Regarding your last paragraph, I think what you'll find is that life is rarely ever centered around something that you can ever adequately put to words. Even if someone is always 100% literal, their concepts will not match your concepts; labels are the source of much of our suffering, and trying to label something as example or metaphor or "not literal" already betrays that you're unable to falsify it; thus, there's something to it.
I don't want to dismiss you though: If I went through that burden of categorizing my own text, how would your life be improved?
My life wouldn't be improved at all, I mostly just wanted to get your reaction because I was curious. I have a person I care about who suffers from schizophrenia and I struggle to tell when they are exaggerating or saying something for comedic effect or hyperbole or if they are making an analogy or are saying something literal, so my question / comment was mostly to hopefully gain some perspective. I could just ask them "are you saying you literally believe X" every few minutes but it's hard to get a word in with them sometimes.
I guess the other point of my comment is potentially for your benefit in that if you want to have many people read your book, I have a fear that a large portion of potential readers will hear your story, open up the book, falsely conclude "oh this guy is still just crazy, this is a sci-fi/mythology text" and close it without reaching the parts that to me were the most interesting. But I would understand if you decide that you don't really care to change anything, just sharing in case it turned out to be helpful.
I appreciate the feedback. I can definitely see people thinking I'm crazy by the first few chapters and not getting to the good part.
I was trying (and failed) to be clever about improving your life.
What you say about exaggeration vs hyperbole vs analogy vs literal resonates with me. Have you considered they may not, themselves, have an answer to this? For example, I often throw out (hyperbolicly) really wild numbers at work that are off by orders of magnitude to illustrate what happens when stuff grows to an extreme; yet, I'm also very cautious to get precise numbers when building. My co-workers may be bothered by this (so I'll be more mindful now that I've probed for this insight), but I would like to think that hyperbole is easily detected.
Honestly, I don't know if I can help with perspective without witnessing you with your care-person. Everyone's different and the nature of the beast is you only interact with really ill people when you yourself are out of commission. I wouldn't want to generalize me onto someone else. Perhaps you just show them the post I'm replying to and ask what they think?
My understanding is schizoaffective disorder is one of four categories of schizophrenia. I think people often use them interchangeably.
Medication can be tricky because when it works it seems increasingly unnecessary and often has side effects of its own.
I hope no one reads my book as a substitute or prescription to avoid medication on their path. To each their own, but it's not what I'm advocating for.
I appreciate you saying something. As you'll read in the book, I've got a project going to primarily help myself, but to date it's been 100% open-source.
I haven't been able to be consistent enough to both pay the bills and volunteer and work on the open source, so I focus on the former.
Would you mind saying more about what you do every day? For me, I anchor myself for at least a set time; when I sit during that time, nothing can touch me and I can deconstruct symptoms to recover any deviation from baseline.
I'm surprised you can remember me. We once sat near each other on a bus for SOSP, and that's the extent we interacted in person, so it warms my heart to hear someone like you remembers someone like me.
In case you're looking for the technical, the book doesn't say such, but one innovation laid out for me was my work on lsmtk, a new compaction algorithm for LSM trees. I'm not sure if I'm off my rocker here or not, but I documented it when I released the crate: https://crates.io/crates/lsmtk. I know you're busy, but in case you revisit this thread and want a neat trick, I thought I'd bring it up.
Lastly (and most important as I'm prone to doing), I appreciate you sharing something personal like that. Hearing that others who have tried to make it work can often do so keeps me going on my worst days.
Yes. From a quick scan of the paper, it includes a formal proof in Lean4. That said, it is very long and complicated, with lots of steps in the chain (as you might expect) so it would need to be checked carefully to ensure it proves what it claims to prove.
Lean uses Curry-Howard correspondence, so how proofs work is you declare your propositions as types and then your proof is actually a recipe that goes from things that have already been established and finishes by instantiating that type. The guarantees there are very strong - if you succeed in instantiating the type you have definitely proved something. The question is whether you have proved the thing you said you have. So here scanning the proof (it’s like 100 pages and I am sick so definitely sub-par intellectually) they use category theory to embed the problem, so the proof is actually a proof of the properties of this embedding. So if there is a problem with the proof, my guess would be that it would lie in the embedding not being exactly representative of the problem somehow.
It seems a pretty serious attempt though- it’s not just some random crank paper.
Thank you! This is the kind of comment I hoped to see.
I'm betting it was published in a hurry. I know I would hit "publish" within 24 hours of creating such a result, and would hope it would go wide. I'd publish to arxiv before getting clearance to release the code. I bet that's what happened here.
I appreciate your explanation of the Curry-Howard correspondance. I was familiar with it, but not with Lean in particular. I'd heard of Lean, but didn't know how it worked.
You are most welcome. Reading it a little more, the summary of the proof is they embed computations as a particular sort of category and then use homological algebra to show that computations in P have a certain property[1] and computations in NP have a different property[2], and they say they go on to demonstrate these properties are mutually incompatible, thus proving P != NP.
I don't know homological algebra at all and only the very basics of category theory and while the (107-page) proof gives a lot of background it would take more time for me to get myself up to speed than I can really afford to spend right now. But that's the gist.
The fact that they have formalized the proof should mean it will be quicker to verify whether or not this is indeed it.
[1] which they call "contractible computational complexes (Hn(L) = 0 for all n > 0)."
[2] which they call "non-trivial homology (H1(SAT)̸ = 0)"
Most agent frameworks aim for chat; stigmergy aims to replace the Fortune 500 CEO's role, or be able to run a healthcare clinic. Something of that scale. This is a prototype. I’m looking for feedback before implementing the main loop.