Since the leaked source code showed they key off of swearing to trigger certain behavior, I actually swear intentionally when I run into things like insufficient thinking and/or hallucinations. It also unironically makes it easier for me to grep later and run analysis on how often it's happening.
Well, it's not that simple, in the same way that throwing an LLM into a process will always carry a risk of blowing up spectacularly.
In this case it failed open. It didn't recognize that it was in an edge case (which itself is an edge case). So what are you proposing as the solution to that? If the car itself does not recognize that it's in an abnormal situation that needs intervention, then how do you intervene?
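For what it's worth, the "grep later" part can be a few lines of Python. This is only a rough sketch: the marker words and the idea that each prompt is one line of text are my assumptions, so swap in whatever you actually type and however your logs are stored.

```python
import re
from collections import Counter

# Hypothetical marker words; replace with whatever you actually type.
MARKERS = ("damn", "wtf")

def count_markers(lines, markers=MARKERS):
    """Count whole-word, case-insensitive occurrences of each marker."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(m) for m in markers) + r")\b",
        re.IGNORECASE,
    )
    counts = Counter()
    for line in lines:
        # findall returns the text of the single capture group per match
        for match in pattern.findall(line):
            counts[match.lower()] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "WTF, you hallucinated that API",
        "damn, think harder. damn.",
        "looks fine",
    ]
    print(count_markers(sample))
```

Feeding it the lines of an exported session log gives a per-word tally you can track over time.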
His point is that humans are prone to the same error. The flooded engine damage doesn't come from humans recognising the danger of a flooded road and choosing not to attempt it.
I'm responding to the implication that you "have to be ok with good enough" and that somehow this will be a mostly fine autonomous experience:
> A car that only fails in a road conditions edge case is good enough for the vast majority of cases. You accept that, and issue a manual override for when that edge case pops up
But it's just like LLMs. They will never be perfect, so if you aren't actively paying attention and steering the behavior, there is always a risk of spectacular failure. If you aren't watching for the moment you "need to [apply a] manual override", then all of a sudden the AI has run `rm -rf /` and you had it in "bypass permissions" mode.
No one cares about flooded engines that Google has to pay for. They care about a taxi that might kill them.
You have to compare this to the number of taxi and Uber drivers who will drive into moving water with passengers on board while a passenger is telling them to stop.
Yep, I've sold my entire collection at this point (all digital on MODO) and will likely never go back. I think planeswalkers were the beginning of the end for me, and it's been a non-stop slide of power creep and paid promotions ever since.
Genymobile is also behind Genymotion which was an incredible product when it came out. Head and shoulders above the other emulator options in performance.
I hate to spam this, but I've seen this misconception about bun repeatedly in each of these incident threads. It should really be noted that bun _does_ run lifecycle scripts for the top 500 most popular packages by default. You can opt out of this, but it's not the default config. It's much better than the npm strategy, but I think it would be better still if there was a way to explicitly acknowledge that you want this default whitelist applied (e.g. scriptPolicy = allow, deny, or allow popular only).
Note that bun is only immune to this because it isn't in the "top 500" that bypass this system by default. I was actually surprised (pleasantly, but still surprised) that tanstack wasn't in that list already.
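To make the suggestion concrete, here is roughly the decision I have in mind, sketched in Python. To be clear, this is not bun's actual config or code: `ScriptPolicy`, `should_run_lifecycle_scripts`, and the package names are all hypothetical, and the "popular" set is a stand-in for whatever default list the package manager ships.

```python
from enum import Enum

class ScriptPolicy(Enum):
    """Hypothetical three-way setting, named after the suggestion above."""
    ALLOW = "allow"                      # run lifecycle scripts for everything
    DENY = "deny"                        # run lifecycle scripts for nothing
    ALLOW_POPULAR_ONLY = "allow-popular-only"  # run only for the default list

def should_run_lifecycle_scripts(package, policy, popular):
    """Return True if the package's install scripts may run under the policy.

    `popular` stands in for the package manager's built-in default list.
    """
    if policy is ScriptPolicy.ALLOW:
        return True
    if policy is ScriptPolicy.DENY:
        return False
    # ALLOW_POPULAR_ONLY: the user has explicitly opted in to the default list
    return package in popular

if __name__ == "__main__":
    popular = frozenset({"example-popular-pkg"})  # placeholder, not a real list
    for policy in ScriptPolicy:
        for name in ("example-popular-pkg", "example-obscure-pkg"):
            print(policy.value, name,
                  should_run_lifecycle_scripts(name, policy, popular))
```

The point is that the popular-only behavior becomes something you pick by name rather than something you get silently.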
Good to know. Though according to that page, bun still wouldn't have run it if it were on that list, since it came through a git dependency and not npm.
I think moving straight to local models is missing the required next step of open/self-hostable models which is certain to be the "AI future" end-state. Then local models become an optimization on top of that.
I just don't want us to put all this effort into on-device computation when we need to get to "SOTA-equivalent" self-hosted computation faster.