Hacker News | new | past | comments | ask | show | jobs | submit | p-e-w's comments

> A 500k line codebase for an agent CLI proves one thing: making a probabilistic LLM behave deterministically is a massive state-management nightmare.

Considering what the entire system ends up being capable of, 500k lines is about 0.001% of what I would have expected something like that to require 10 years ago.

You can combine that with all the training and inference code, and at the end of the day, a system that literally writes code ends up being smaller than the LibreOffice codebase.

It boggles the mind, really.



> You can combine that with all the training and inference code, and at the end of the day, a system that literally writes code ends up being smaller than the LibreOffice codebase.

You really need to compare it to the model weights though. That’s the “code”.


>You really need to compare it to the model weights though

Then you'd need to compare the education of any developer in relation to how many LOC their IDE is. That's the "code".

So yea, the analogy doesn't make a whole lot of sense.


It even wrote an entire browser!

By "just" wrapping a browser engine.


... what are you even talking about? "The system that literally writes code" has hundreds of billions of parameters. How is this smaller than LibreOffice?

I know xkcd 1053, but come on.


> Across these variations, the overall result stays quite consistent: under certain conditions, ordinary people can be led to do harmful things.

The pop culture version of what happened in those experiments is “regular people will administer potentially lethal shocks when told to”, and that claim has been refuted experimentally many times over.

Contrary to most reports, the original experimenters never told participants that the shocks were supposedly lethal or even dangerous. When participants in a later recreation were actually told that there was a health risk, and that they should ignore it, the vast majority refused to administer the shocks.[1]

In other words, the Milgram experiment, as commonly understood, is somewhere between sensationalism and an outright lie.

[1] https://www.mdpi.com/2076-0760/3/2/194


Many cloud products now continuously send themselves the input you type while you are typing it, to squeeze the maximum possible amount of data from your interactions.

I don’t know whether ChatGPT is one of those products, but if it is, that behavior might be a side effect of blocking the input pipeline until verification completes. It might be that they want to get every single one of your keystrokes, but only after checking that you’re not a bot.


It's still possible to let users start typing right away and simply delay sending the characters until the checks are complete, holding them in memory until then.
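The buffer-until-verified idea is simple enough to sketch. Here is a minimal Python model; all class and method names are hypothetical illustrations, not any product's actual API:

```python
# Minimal model of "let the user type immediately, send only after checks pass".
# Everything here is a hypothetical sketch, not any product's real client code.

class BufferedInput:
    def __init__(self):
        self._buffer = []        # keystrokes held in memory pre-verification
        self._verified = False   # has the bot check completed?
        self._sent = []          # what actually went over the wire

    def on_keypress(self, ch):
        if self._verified:
            self._sent.append(ch)    # stream directly once verified
        else:
            self._buffer.append(ch)  # otherwise just hold it locally

    def on_verification_complete(self):
        self._verified = True
        self._sent.extend(self._buffer)  # flush everything typed so far
        self._buffer.clear()

    @property
    def sent_text(self):
        return "".join(self._sent)

inp = BufferedInput()
for ch in "hel":
    inp.on_keypress(ch)          # typed before the check finishes: buffered
inp.on_verification_complete()   # check passes: buffer is flushed
for ch in "lo":
    inp.on_keypress(ch)          # typed after: streamed immediately
print(inp.sent_text)             # -> hello
```

The user never waits on the check; only the network traffic does.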

Instagram was uploading the images while the user was adding post details, back in 2012!

https://news.ycombinator.com/item?id=3913919

No one seems to use or care about their own product anymore. They only look at dashboards and metrics, which don't tell the full story.


That makes total sense from a UX perspective though, the ChatGPT thing does not.

There were a lot of helpdesk chats doing the same, so you could see users typing messages, then deleting words, etc., before hitting send.

This was actually one of the reasons why Instagram felt smooth.

Another thing: Facebook/Instagram have also detected when a person uploads an image and then deletes it, and inferred from that that they are insecure. In the case of teenage girls, that inference actually becomes part of their profile (that they are insecure), and they are then shown beauty products....

I really like telling this example because people in real life, and even online, get so shocked. I mean, they know Facebook is bad, but they don't know it's this bad.

[Also a bit off-topic, but I really like how in item?id=3913919 the 391 appears twice :-), it's a good item id]


I just checked the network inspector; the only thing it does per keypress is generate an autocomplete list. It doesn't seem too hard to hold off on autocomplete generation until whichever checks you run have passed.

I wondered whether ChatGPT streams my message to the GPU while I type it, because the response comes weirdly fast after I submit the message. But I don't know much about how this stuff works.

Likely prefix caching, among many other things.
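For context: prefix caching in LLM serving means reusing the attention state already computed for a prompt prefix, so when the final message arrives only the new suffix costs compute. A toy Python sketch of the bookkeeping (real servers cache KV tensors per token block; this just models cache hits by string prefix, and all names are made up for illustration):

```python
class PrefixCache:
    """Toy model: remembers already-processed prefixes, so submitting
    the full prompt only pays for the part not seen before."""

    def __init__(self):
        self._cache = {}          # prefix -> stand-in for cached KV state
        self.tokens_processed = 0  # total compute spent (chars, as a proxy)

    def process(self, prompt):
        # Find the longest cached prefix of this prompt.
        best = ""
        for prefix in self._cache:
            if prompt.startswith(prefix) and len(prefix) > len(best):
                best = prefix
        # Only the uncached suffix costs compute.
        suffix = prompt[len(best):]
        self.tokens_processed += len(suffix)
        self._cache[prompt] = object()  # stand-in for real KV tensors
        return len(suffix)

cache = PrefixCache()
cache.process("Why is the sky")               # full cost: 14 chars
cost = cache.process("Why is the sky blue?")  # only " blue?" is new
print(cost)  # -> 6
```

If the client streams partial input as the user types, the server can warm this cache ahead of submission, which would explain the fast responses.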

> I'm amazed that wasn't taken into account!

Agreed. While I didn't anticipate this, nor would I likely have figured it out myself, I also don't expect my claims to influence global policy.

The scientists who failed to realize this do expect that, so the standards we hold them to need to be correspondingly higher.


I mean, you can simply use Linux and save yourself all those hacks…

Absolutely. I went to great lengths to install Asahi on my work M1, only to find that most things didn't work (RTFM). So when one is forced to use macOS, may it round corners in hell, for work…


Yeah I've used Karabiner to get windows-style shortcuts (home/end, etc.) and it works very well.

Aside: the new, large radius Liquid Ass corners that make some parts of the window basically unusable are really annoying me.

I think it’s even crazier that a visible slice of the address space that is supposed to last for the rest of humanity’s future has already been allocated.

It's 1/8 of that space and it's being allocated in big blocks that are expected not to run out unless humanity expands to the whole solar system. If it does run out, there are 7 tries left. More if you only use half as much space next time.
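Assuming the thread is about IPv6, the "1/8" refers to the global unicast range 2000::/3, which covers one eighth of the full 2^128 address space. The arithmetic checks out:

```python
# Assumption: the allocated slice is IPv6 global unicast, 2000::/3.
total = 2 ** 128           # all IPv6 addresses
global_unicast = 2 ** 125  # a /3 prefix fixes 3 bits: 2^(128-3) addresses

print(global_unicast / total)       # -> 0.125, i.e. 1/8 of the space
print(total // global_unicast - 1)  # -> 7 equally sized /3 blocks remain
```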

This is a bad argument. Established things are established. “If you don’t like what the president of your country is doing, just run for the office yourself.”

"Established things are established", BUT established things don't always stay that way. Things can change if enough people support said change. The power of many is really something.

Exactly. And it seems that "many people" do not, in fact, support this change, to the point that LibreOffice felt it necessary to defend it after the fact on their official website.

Maybe "many people" remember what's been going on at Mozilla over the past decade. After all, Mozilla went there before and set the example of the downward slope: first donations, then partnerships; first opt-in, then opt-out, then automatically installed addons; first "contribute to the browser", then to side projects/non-technical causes; etc.

A similar case could be made for Wikimedia.


I fully expect that future programs for formalizing mathematics will reveal that most sufficiently complex proofs are riddled with gaps and errors, and that some of them actually led to false results.

Annals of Mathematics once published a supposed proof (related to intersection bodies IIRC) for a statement that turned out to be false, and it was discovered only by someone else proving the opposite, not by someone finding an error.


The same way Chinese tech companies "compete" with Western ones: by not permitting them to do business in China.

They are permitted to do business there; you just have to make a bargain with the devil: 50% of your domestically incorporated branch is Chinese-owned, with the requisite technology and IP transfer to follow. Most sensible companies would not accept such a bargain, but quite a few investors are only interested in the next quarterly profit going up and to the right. And they've made that bargain repeatedly.

It’s so important, in fact, that there should be more than one such institution.

People keep falling into the same trap. They love monopolies, then are shocked when those monopolies jerk them around.


I have been using Zenodo for a while now instead. It is more user-friendly as well.

Zenodo is more for IT papers and datasets, isn't it?

It can host large datasets as well, yes. It is hosted by CERN, so it is not specifically IT in any way. It also allows you to restrict access to the files of your submission. It has no requirements to submit your LaTeX sources, any PDF will be fine. There are also no restrictions on who can publish. You'll get a DOI, of course.

Everything published on arXiv could also be published on Zenodo, but not the other way around.


Zenodo is great too, yes, but their metadata management is somewhat problematic; i.e., it can be changed at whim, which makes indexing difficult.

Oh interesting, I didn't know this.

I like it as well, it works great. But I wonder if it would scale if at some point there were a massive exodus from arXiv.

I think it already hosts much more data than arXiv, given that they also host large datasets.

It is just a preprint repository. It is pretty open (the stories where a preprint was rejected or delayed unreasonably are extremely rare). It offers the basic services for a math/compsci/physics themed preprint repository.

I don't see much of a monopoly, nor any "moat" apart from it being recognised. You can already post preprints on a personal website or on GitHub, and there are "alternatives" such as ResearchGate or Zenodo that can also host preprints, plus some lesser-known options. I do not see anything special in hosting preprints online apart from the convenience of having a centralised place to put them and search for them (which you call a "monopoly"). If anything, the recognisability and centrality of arXiv helped a lot in the old, darker days to establish open access to papers. There was a time when many journals would not let you publish a preprint, or had all kinds of weird rules about when you can and when you can't. Probably some still do, to a degree.


There is: bioRxiv.
