More

i_cannot_hack · 2026-04-01T22:14:06 1775081646

Graphs can be abused and statistics can be misleading, and some things are hard to quantify and measure. But the author never makes any convincing case why the statistics would be wrong or misleading in this case: "I’m not here to argue with Scott’s statistics. I think they’re about as accurate as we could hope to make them. I’m here to argue that you don’t require them to make sense of the world".

His main argument is that many people feel crime is increasing, and that in itself is a good argument to disregard any falling numbers as obviously incorrect without any further justification being necessary.

The obvious problem is that people almost always say that crime is increasing, and they have consistently been shown to misjudge the actual trend for decades on end: "In 23 of 27 Gallup surveys conducted since 1993, at least 60% of U.S. adults have said there is more crime nationally than there was the year before, despite the downward trend in crime rates during most of that period." If we bought into the author's argument we would never be able to reach any other conclusion than that that crime has always been increasing and will always continue to increase.

During the satanic panic the the 1980's the populace at large were convinced that large swaths of satanists were routinely sacrificing and abusing children. The police was convinced it was a real problem and had special "satanic experts" to combat the issue, a huge amount of parents were genuinely afraid of their childrens' safety, and there were thousands and thousands of cases of reported ritual abuse. In reality and in hindsight there were zero evidence of satanic cults abusing children. The author's argument could, completely unmodified, be used to argue that we should listen to the people's lived experience instead of the evidence and conclude that the satanic cults must actually have been a real societal danger back then. Or is he only against disregarding someone's lived experience in favor of evidence when it is his lived experience?

It doesn't even matter if he is right in this case. Maybe the all the statistics is flawed and his feeling of rising crime rates is justified. The problem is that he offers no actionable heuristic that allows us to separate his intuition from other people's intuition that has been obviously wrong in hindsight, like the satanic panic.

i_cannot_hack · 2026-04-01T06:50:53 1775026253

The first one seems to indeed be a real RCE in vim.

Also including the emacs one as a "found vulnerability" seems really disingenuous. It basically amounts to "emacs will call git status, and git status will call git hooks that can execute arbitrary code".

1. As the Emacs maintainers point out, it is indeed an issue with git, not emacs, and they are completely right to not address the issue.

2. It is something that has been known for decades. That is the reason hooks are never copied when doing git clone, to prevent this scenario (notice that the author uses wget instead of git clone to get around this).

Funnily enough this posts highlights both the strengths and the hazards of using AI, (1) quickly and easily finding real issues that would have taken a human a laborious audit to find (2) quickly and unthinkingly generating plausible sounding but ultimately meaningless vulnerability reports on some clout chasing mission and overwhelming open source maintainers with AI slop.

lloeki · 2026-04-01T07:21:17 1775028077

> The first one seems to indeed be a real RCE in vim.

Barely, since there is little restriction as to what options modelines can set they should be largely considered equivalent to eval (if unintentionally). And generally they are which is why distros typically disable them by default.

IMHO in this day and age securemodelines should just be the default.

https://www.vim.org/scripts/script.php?script_id=1876

i_cannot_hack · 2026-04-01T08:15:05 1775031305

I don't know much about vim, but from the report it sounds like part of the issue was that disabling modelines would not prevent it:

> tabpanel is missing P_MLE Unlike statusline and tabline, tabpanel is not marked with the P_MLE flag. This allows a modeline to inject %{...} expressions even when modelineexpr is disabled.

Edit: Upon re-reading the above I guess disabling modelineexpr is not the same as disabling modelines, and disabling modelines altogether might indeed prevent the issue.

cryptbe · 2026-04-01T07:06:29 1775027189

When I wget a tarball, unzip, and emacs a.txt inside, I don't expect that it'd execute arbitrary commands.

I think people should be aware of this risk, especially when it looks like it's not getting fixed.

Disclosure: I didn't find the bugs. I helped wrote the blog post.

i_cannot_hack · 2026-04-01T07:58:59 1775030339

But you would expect running "git status" or "git ls-files" in the unzipped directory to completely pwn your system? Probably not either.

If you don't trust git, you can remove from your system or configure emacs not to use it. If you are worried about unsuspecting people with both git and emacs getting into trouble when downloading and interacting with untrusted malware from the internet, the correct solution is to add better safeguards in git before executing hooks. But you did not report this to the git project (where even minor research beyond Claude Code would reveal to you that this has already been discussed in the git community).

I suspect that what happened here was that (1) you asked Claude to find RCEs in Emacs (2) Claude, always eager to please, told you that it indeed has found an RCE in Emacs and conjured up a convincing report with included PoC (3) since Claude told you it had found an RCE "in Emacs", you thought "success!", didn't think critically about it and simply submitted Claude's report to the Emacs project.

Had you instead asked Claude to find RCEs in git itself and it told you about git hooks, you probably would not have turned around and submitted vulnerability reports to all tools and editors that ever call a git command.

cryptbe · 2026-04-01T08:12:01 1775031121

>But you would expect running "git status" or "git ls-files" in the unzipped directory to completely pwn your system? Probably not either.

That’s fair, but it would be pretty unusual for me to run Git commands in a directory I’m not actively working on. On the other hand, I open files from random folders all the time without really thinking about it, so that scenario feels much more realistic.

chrismorgan · 2026-04-01T09:10:09 1775034609

It’s extremely common for shell prompts to integrate Git status for the working directory.

Who’s responsible for the vulnerability? Your text editor? The version control system with a useful feature that also happens to be a vulnerability if run on a malicious repository? The thing you extracted the repository with? The thing you downloaded the malicious repository with?

Windows + NTFS has a solution, sometimes called the “mark of the web”: add a Zone.Identifier alternate data stream to files. And that’s the way you could mostly fix the vulnerability: a world where curl sets that on the downloaded file, tar propagates it to all of the extracted files, and Git ignores (and warns about) config and hooks in marked files. But figuring out where the boundaries of propagation lie would be tricky and sometimes controversial, and would break some people’s workflows.

chotmat · 2026-04-01T07:34:16 1775028856

I don't think this is fair, as it will likely also affect any editor with Git integration (or not?)

cryptbe · 2026-04-01T07:42:55 1775029375

Yes, likely. And git is not going to fix it. So isn't it fair to expect the editor maintainers to do something about it, to protect their users, no?

chotmat · 2026-04-01T07:54:21 1775030061

Tested with zed and vscode. They don't seem to have the issue. Probably due to "Restricted Access" mode when opening new folder?

Edit: yeah pwned when clicking the big green trust button

snarf_br · 2026-04-01T08:15:56 1775031356

Ferret7446 · 2026-04-01T11:19:53 1775042393

If you untar a file and get a git repository, you should absolutely expect malicious behavior. No one does that, you clone repos not tarball them, and cloning doesn't copy hooks for precisely this reason

i_cannot_hack · 2026-03-24T10:39:14 1774348754

Pulling the emergency break promising to improve a situation will in general not build any trust unless the mea culpa also includes:

1. An analysis of what allowed the situation to get out of control to begin with

2. Systematic changes to prevent it from happening again

Otherwise you will just be in the same situation again in 3 years. And neither is included in Microsoft's messaging here.

Wobbles42 · 2026-03-24T10:42:21 1774348941

I don't really see that happening here.

Microsoft doesn't have any trust to lose, and they won't be gaining any by this move.

That is the one advantage they have in all of this. Their public image is as bad as it can get.

raddan · 2026-03-24T12:07:36 1774354056

> they won't be gaining any by this move.

Then why even do it?

itopaloglu83 · 2026-03-24T12:24:05 1774355045

Microslop is saying “I’m sorry that you’re offended” and will continue to abuse their users. All of this is a PR campaign to fix their image so that they can raise more money.

gentleman11 · 2026-03-25T21:53:23 1774475603

I think this "emergency break" is just going to get eye rolls from most people

i_cannot_hack · 2026-03-21T09:35:17 1774085717

It's a really interesting case study, but the summary seems to lean into the AI hype to an extent that borders on lying.

> His fabrication shop uses it daily, and he built the entire thing in 8 weeks. During those 8 weeks he also had to learn everything about Claude Code, the terminal, VS Code, everything.

I don't see how he can give this summary with a straight face after posting the interview that CLEARLY contradicts it.

In the interview the engineer says "When Claud Code came out almost a year ago, I started dabbling with web based tools ..." and "When it first came out I had so many ideas and tried all these different things", so he had clearly already used extensively it for a year. I would also guess the engineer was somewhat technically minded from the get-go, since he claims he was "really good with excel" before starting with Claude Code, but that is beside the point.

The interviewer later asks "How much of those 8 weeks was learning Claude Code versus actually building the thing?", and the interviewee answers "Well, I started Claude Code when it first came out so the learning curve has really gone down for me now..." and then trails off to a different subject. Which further confirms that the summary in the post is false.

It really seems like the engineer has spent the year prior learning Claude Code and then spent 8 weeks on solely building this specific application.

The interviewer also claims "This would normally have taken a developer a year to build", which seems really unsubstantiated. It's of course hard to judge without all the details, but looking at the short demo in the video, 8 weeks of regular development time from a somewhat experienced developer doesn't seem too far fetched if the objective is "don't make it pretty, just make it work".

As I said, it's a really interesting case study about a paradigm shift in how software is developed, and it's clear this app would never have existed without Claude Code. So I don't really see the need for the blatant lying.

mirsadm · 2026-03-21T09:56:53 1774087013

I've noticed even experienced engineers have started overestimating how long things would take to build without AI. Believe it or not we coded before AI and not everything took years all the time.

le-mark · 2026-03-21T13:39:45 1774100385

We’ve all worked on projects where it took months to get requirements from the business. Sometimes to see the project cancelled after months of sitting around waiting for them to decide on things.

Coding has never been the roadblock in software. Indeed don’t we experience this now with ai? Vibe code a basics idea then discover the things we didn’t consider. Try to vibe that and the code base quickly gets out of hand. Then we all discover “spec driven development” SDD and in turn discover thinking of specifying everything our selves is an even bigger of PITA?

sumedh · 2026-03-21T21:59:04 1774130344

He might be a bit nervous to speak to the camera and might have messed up the timeline.

abelitoo · 2026-03-21T17:38:58 1774114738

> So I don't really see the need for the blatant lying.

Because this is an advert

i_cannot_hack · 2026-03-20T17:10:15 1774026615

The standard for obscurity is different for LLMs, something can be very widespread and public without the average person knowing about it. DICOM is used at practically every hospital in the world, there's whole websites dedicated to browsing the documentation, companies employ people solely for DICOM work, there's popular maintained libraries for several different languages, etc, so the LLM has an enormous amount of it in its training data.

The question relevant for LLMs would be "how many high quality results would I get if I googled something related to this", and for DICOM the answer is "many". As long the that is the case LLMs will not have trouble answering questions about it either.

i_cannot_hack · 2026-03-18T21:12:28 1773868348

They mention false positives as well on github: The rate of false positives is harder to measure, but based on limited manual reviews it's well within 20% range and the majority of it is a gray zone.

riteshkew1001 · 2026-03-19T16:48:35 1773938915

That 20% figure is actually better than it sounds. Coverity on kernel-scale C codebases typically lands in the 40-60% false positive range... "not wrong but not the bug you'd prioritize" is different from a true false positive.

kleiba · 2026-03-19T06:57:28 1773903448

Hard to measure, how? Either something is a bug or not - otherwise how would you be able to count anything at all?

lstodd · 2026-03-19T14:59:42 1773932382

Assign each line of code a bugginess factor then count those exceeding an arbitrary threshold obviously.

i_cannot_hack · 2026-03-18T09:19:44 1773825584

You make it seem like it's not predominantly skewed right wing, just a "healthy" mix of right wingers and left wingers due to not banning anyone. Which might be an unpopular take, but in this scenario I think it's unpopular simply because it is demonstrably wrong.

> A study published by science journal Nature has examined the impact of Elon Musk’s changes to X/Twitter, and outlines how X’s algorithm shapes political attitudes, and leans towards conservative perspectives. They found that the algorithm promotes conservative content and demotes posts by traditional media. Exposure to algorithmic content leads users to follow conservative political activist accounts, which they continue to follow even after switching off the algorithm. https://www.socialmediatoday.com/news/x-formerly-twitter-amp...

> Sky News team ran a study where they created nine new Twitter/X accounts. Right-wing accounts got almost exclusively right-wing material, all accounts got more of it than left-wing or neutral stuff. (Notably, the three “politically neutral” accounts got about twice as much right-wing content as left-wing content. https://news.sky.com/story/the-x-effect-how-elon-musk-is-boo...

> New X users with interests in topics such as crafts, sports and cooking are being blanketed with political content and fed a steady diet of posts that lean toward Donald Trump and that sow doubt about the integrity of the Nov. 5 election, a Wall Street Journal analysis found. https://www.wsj.com/politics/elections/x-twitter-political-c...

> A Washington Post analysis found that Republicans are posting more, getting followed more and going viral more now that the world’s richest Trump supporter is running the show. https://www.washingtonpost.com/technology/2024/10/29/elon-mu...

i_cannot_hack · 2026-03-16T22:28:40 1773700120

It becomes a problem for everone when spaces meant for meaningful work become overrun with an awful stream of endless mediocre slop that someone quickly generated without giving it a second thought. The problem here is not that it is fast and easy. The cardinal sin is that it is fast, easy AND bad.

sbarre · 2026-03-16T22:44:48 1773701088

Huge gatekeeping energy right here.

i_cannot_hack · 2026-03-16T23:01:26 1773702086

Do you think people complaining about online marketplaces being overrun with unscrupulous drop-shippers are "gatekeeping e-commerce" as well?

sbarre · 2026-03-17T12:30:34 1773750634

No I do not because that's not a reasonable comparison?

i_cannot_hack · 2026-03-17T13:37:38 1773754658

Then you haven't understood the complaint.

sbarre · 2026-03-17T15:58:23 1773763103

I understood it just fine. You object to creations and creativity that do not pass your subjective quality bar and/or aren't produced in a way that is satisfactory to the people already behind the gate.

It's the literal definition of gatekeeping.

The problem you describe (quantity over so-called quality) is a discovery and curation problem.

Yet you blame the tools of creation and lament the lack of restriction or controls on production instead.

Yes these tools make it easier to produce, and yes that means that you have more low-quality work out there. I'm not pretending like that doesn't introduce new challenges.

But the answer isn't to gate-keep the tools or the process of creation or to stop or shame people from being creative with these new tools by universally calling their work "slop" or "bad".

i_cannot_hack · 2026-03-17T18:39:09 1773772749

So you completely agree with the factual description of the problem I supplied when asked to describe the problem, your only real complaint is that I used the phrase "more awful slop" instead of your preferred euphemism "more low-quality work". Having a frank discussion about the problems caused by new technology is not gatekeeping, and I don't think we should sugarcoat it out of fear of hurting people's feelings.

sbarre · 2026-03-17T21:10:07 1773781807

You initially said:

> It becomes a problem for everone when spaces meant for meaningful work become overrun with an awful stream of endless mediocre slop that someone quickly generated without giving it a second thought. The problem here is not that it is fast and easy. The cardinal sin is that it is fast, easy AND bad.

So..

"a problem for everyone" <- the fallacy of assuming your personal feelings and opinions are universal and apply to all of us (they're not and they don't).

"spaces meant for meaningful work" <- tells me that you don't seem to believe anything made with these new tools can be meaningful, implying they don't belong etc..

And again the hubris of believing that your personal opinion reflects the ideal state or voice of a broad and diverse community (a fucking textbook definition of gatekeeping btw)

And lastly, do you truly believe that AI tooling is the dividing line?

That all non-AI games made today are meaningful?

There's tons of quick and dirty stuff out there like asset flips and weekend projects that people throw up on Steam or Itch for sale, and there have been for years and years.

If your fear is that bad games are going to get out into the world you haven't been paying attention for the last (checks watch) 50+ years...

i_cannot_hack · 2026-03-17T22:29:32 1773786572

> "a problem for everyone" <- the fallacy of assuming your personal feelings and opinions are universal and apply to all of us (they're not and they don't).

The phrase "a problem for everyone" doesn't mean everyone agrees, it just means the described situation would affect everyone broadly...

And even you literally admitted you agree it will introduce problems just in the previous post: "I'm not pretending like that doesn't introduce new challenges", it's a little too late to try walk that back now.

> "spaces meant for meaningful work" <- tells me that you don't seem to believe anything made with these new tools can be meaningful, implying they don't belong etc..

No, just that the non-meaningful work they create risks overwhelming any meaningful work created with or without the tools, which is a real problem AI is already creating in online communities today. Knitting patterns on Etsy is a prime example. It is an accurate description of a problem that already exists today, and trying to avoid discussing it helps no-one.

Again, even you admit the problem is real and don't really have any real complaints except that you keep complaining about my phrasing. It seems you would have been happy if I'd just used the more polite terms you introduce instead, like "new challenges" instead of "problems", "low-quality work" instead of "awful slop", and "not low-quality" instead of "meaningful"? Which is fine, but not really an interesting discussion.

To avoid admitting you are simply annoyed with my phrasing you instead try to pin extreme opinions on me that are nothing close to anything I have ever said, like "you believe your personal opinion reflects the ideal voice of the community", "you believe your personal feelings and opinions are universal", "you believe nothing made with these new tools can be meaningful" and that I think "all non-AI games made today are meaningful", which is just silly.

Since you agree that you see the same problem I see, and just want to discuss other opinions you invent for me that I don't actually share, I don't think we will reach any conclusion here and I probably won't engage further. Thank you for your time anyway.

i_cannot_hack · 2026-03-10T18:21:32 1773166892

Your characterization of the event as a simple reminder to follow established best practices is directly contradicted by the briefing note of the meeting, which specifically mentions a lack of best practices related to AI. Which makes me skeptical of your assessment of the situation in general.

> Under “contributing factors” the note included “novel GenAI usage for which best practices and safeguards are not yet fully established”.

i_cannot_hack · 2026-02-28T17:32:18 1772299938

Reviewing the correctness of code is a lot harder than writing correct code, in my experience. Especially when the code given looks correct on an initial glance, and leads you into faulty assumptions you would not have made otherwise.

I'm not claiming AI-written and human-reviewed code is necessarily bad, just that the claim that reviewing code is equivalent to writing it yourself does not match my experience at all.

tempest_ · 2026-02-28T17:39:51 1772300391

Plus if you look at the commit cadence there is a lot of commits like 5-10 minutes a part in places that add new functionality (which I realize doesn't mean they were "written" in that time)

I find people do argue a lot about "if it is reviewed it is the same" which might be easy when you start but I think the allure of just glancing going "it makes sense" and hammering on is super high and hard to resist.

We are still early into the use of these tools so perhaps best practices will need to be adjusted with these tools in mind. At the moment it seems to be a bit of a crap shoot to me.

mountainriver · 2026-03-03T20:49:26 1772570966

eh with plenty of tests that I can easily read and are well documented I haven't actually ever found this to be a problem in practice