> ALL context/prompt is instructions, there is no data. It is just unsolvable, period.
That really isn't true. There's no law of physics preventing you from having separate data and instruction inputs to models. The model's transcript format generally distinguishes between prompts and instructions and tool output and such. This isn't a solved problem, and it's possible it's entire unsolvable, but it probably is possible (in general, not with current models) to reject prompt injection to several nines.
This is a lot like making the same statement about CPUs, "the von Neumann architecture doesn't distinguish between code and data so it's impossible to reject malicious instructions." There's actually a lot you can do to reject malicious instructions, you can prevent execution in certain pages, you can prevent certain privileged instructions from being executed in certain pages, you can employ stack cookies, et cetera. Do they prevent all exploitation in all circumstances? No. But each component does function in it's lane and it is possible to create programs with high (though not absolute) guarantees against unauthorized code execution by composing them.
Similarly, you could prevent certain tokens from appearing in the prompt portions of a transcript, you can have a model with multiple input heads only one of which is trusted, etc. I'm not saying those techniques will necessarily work, but it is more complex than "models can only possibly take a single and undifferentiated input stream".
A lot of the solutions in the CPU space involve things like memory allocation flags, NX bits, canaries, etc. that fire deterministically. Those things are fundamentally not applicable to LLMs, and without those things modern software would be in a vastly worse place.
You could imagine that there are things to change around LLM architecture that will improve its ability to reject prompt "injection", but I think it's fundamentally true that from an information theory perspective there's no bright line between "instruction" and "input data" possible.
Nondeterminism is a red herring. There is a bright line between instructions and data right now, in virtually every transcript format. That we have not succeeded in training an LLM to respect it to a very high degree doesn't imply it is impossible; that they are nondeterministic doesn't imply it is impossible; only that we won't succeed 100% of the time.
A cosmic ray (or rowhammer attack) could flip an X bit too, there isn't anything truly deterministic under the sun.
I read the misanthropy as ironic. They're applying the same reductionist logic to humans, not because they are misanthropic, but to illustrate that it doesn't help us understand the case we can all agree on. "Humans aren't sentient either" is definitely not the takeaway.
The point is, we have no idea what "sentient" or "intelligent" even means. If we agreed on the definitions, the debate would have been settled long ago.
Let's assume we have infinite memory with constant time lookups. With a sufficiently large lookup table, you could exactly replicate the behavior of any person. You could encode it as a next-token predictor: you have precomputed every possible prefix and assigned it a next token. This is a Chinese room, but it is completely indistinguishable from an intelligent, sentient person. There is no experiment you can design to slip a piece of paper (a prompt) under the door to determine whether it is Bob or the lookup table clone of Bob inside the room.
Does that make the lookup table conscious or alive? Undefined. It's the wrong question. Or it's not a question science can address.
So we cannot dismiss on it's face the idea that next token predictors "are not and never will be alive" unless by "alive" you simply mean "biological," but that's not really what's debatable.
The argument is also very brittle because they are not in fact all next token predictors. I doubt people making this argument would be willing to concede that diffusion models are more likely to be conscious than causal models (which I do not believe but is an implication of the argument).
I'm not saying that they are conscious or sentient to be clear, but the reductionist argument that they are next token predictors and therefore don't have some property humans have is not an argument. That's going from A directly to Z. You need to flesh out the bit in the middle because that doesn't follow.
Right. Humans are a biological computer. They have a state and they compute an output. I had to look this up (and use AI) but an estimate for the state of a human mind is about 5 peta-bits (10^15) and the estimated processing power is about 1 exa-FLOP (10^18). Compare this to the largest models at ~5 tera-bits (10^12) of state space and ~2 x 10^14 FLOPS (for one session with some reasonable token rate).
Assuming the above is anywhere near true (I think there's a lot of debate about the capacity of the human mind, where data is actually stored, and where compute happens) then we are talking about 3 orders of magnitude win for humans in state and 4 orders of magnitude in compute. And we're doing all that pretty energy efficient as well.
The other big difference in humans is that we learn and the model only "learns" in context. Out "learn" space is much larger than the 1M tokens that frontier models struggle with.
Anyways, point is that a computer can appear to be alive. If we simulate the human brain perfectly and train it like a human then we'll have something that has human capabilities. LLMs have interesting capabilities but at least at this point not fully human ones (and the delta-state/compute would be a hint that there is still a large gap to cover).
human context/memory could just be an Agents.md file too that gets read instantly before your next token prediction runs. The AI can make multiple such memory files and read on demand depending on what the topic is, kind of like how as a human when you try to remember a math problem you don't go to your childhood bicycling Agents.md file either.
My experience teaching is limited (but I have taught some, to be clear) but I have found learned helplessness to be the biggest barrier. People have varying aptitudes for different tasks, and varying aptitude and a finite lifespan does imply some people have a lower ceiling than others in a given subject, but humans are powerful general learners. They don't generally reach their ceiling in most subjects. The limit for someone "bad at math" is almost certainly self-fulfilling prophesies they internalized.
Speaking for myself I have in the last five years or so been learning I have much more of a capacity for making art than I had thought. My art is nothing special, but I am improving every time I practice. But when I was younger I thought that I was just good at STEM-yy things and bad at other things. Relatively speaking I am better at STEM-yy than art-yy things, and I'm probably worse at art-yy things than most other people. But I have huge room for growth and I think I will eventually produce some beautiful watercolors.
As an aside, I've also found that almost everyone thinks they're bad at math? My friends with PhDs don't think they're good at math but they've forgotten more than I know about it. I think I'm bad at math but I can prove a thing or two. My spouse thinks they're bad at math, and they can't do the things I can do. But a few months ago they needed to do some simple algebra at work, and a coworker said, "dang, I wish I was good at math."
Somewhere out there Terence Tao is saying he's alright at math but he has nothing on that Euler fellow.
Could not agree more. I also taught for some time and generally had good results, almost everybody "got it".
There was one person, though, that I just could not get anywhere with. Even after several private lessons. Turned out that somehow she convinced herself that she will never get it and never be able to progress. Even if she did get it right one week, the next week was as if that never happened. I found no way across that barrier :(
I don't understand the alternative, that it would be legal to break the law as long as it were a petty crime and you were being paid to do it? It's a principle, it applies broadly. We tend to think about it in terms of the most dramatic and memorable example but that's neither here nor there.
No, this isn't a generational thing, if you don't see the problem with trashing someone's house (let alone doing so to the tune of $12k) that is a comment on your values alone.
I don't think GP disagrees with you. They said it wouldn't be a good idea if they had been honest. Elsewhere they call for the employees to face charges.
Ah, I see. Pardon me making him a scapegoat. My rant in between bouts of wrestling with mobile sandbox subscription testing weirdness should have been directed at a vague cohort of less considerate SV folks, so instead I wish to thank him for trying to make the world a better place, and I wouldn't mind if he invited me skiing or flying sometime so we can discuss how I can help him make the world a better place. :)
That really isn't true. There's no law of physics preventing you from having separate data and instruction inputs to models. The model's transcript format generally distinguishes between prompts and instructions and tool output and such. This isn't a solved problem, and it's possible it's entire unsolvable, but it probably is possible (in general, not with current models) to reject prompt injection to several nines.
This is a lot like making the same statement about CPUs, "the von Neumann architecture doesn't distinguish between code and data so it's impossible to reject malicious instructions." There's actually a lot you can do to reject malicious instructions, you can prevent execution in certain pages, you can prevent certain privileged instructions from being executed in certain pages, you can employ stack cookies, et cetera. Do they prevent all exploitation in all circumstances? No. But each component does function in it's lane and it is possible to create programs with high (though not absolute) guarantees against unauthorized code execution by composing them.
Similarly, you could prevent certain tokens from appearing in the prompt portions of a transcript, you can have a model with multiple input heads only one of which is trusted, etc. I'm not saying those techniques will necessarily work, but it is more complex than "models can only possibly take a single and undifferentiated input stream".
reply