In my opinion Claude should be shipped by a custom implementation of "rm" that A...

nananana9 · 2026-03-28T11:27:14 1774697234

Oh, rm failed, since we're running in a weird environment! Let me retry with `bash -c "/usr/bin/rm -rf *"`!

giancarlostoro · 2026-03-29T00:15:43 1774743343

Ideally they control the harness and should be able to stop Claude from running any shell willy-nilly.

estimator7292 · 2026-03-29T20:14:05 1774815245

Thus defeating the purpose of a custom "rm"

throwaway2027 · 2026-03-28T08:49:22 1774687762

All of which is useless when it just starts using big blocks of python instead. You need filesystem sandboxing for the python interpreter too.
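A minimal sketch of that point, assuming a POSIX system with `/bin/sh`. The stub `rm` and the `demo_wrapper_bypass` helper are made up for illustration and are not how any real harness ships its tools. The stub stands in for a "guarded" `rm`; the interpreter never consults it.

```python
import os
import stat
import subprocess
import tempfile


def demo_wrapper_bypass():
    """Return (file exists after guarded rm, file exists after os.remove)."""
    with tempfile.TemporaryDirectory() as tmp:
        fake_bin = os.path.join(tmp, "bin")
        os.mkdir(fake_bin)

        # Hypothetical "guarded" rm: a stub that refuses to delete anything.
        fake_rm = os.path.join(fake_bin, "rm")
        with open(fake_rm, "w") as f:
            f.write("#!/bin/sh\necho 'rm: blocked by guardrail' >&2\nexit 1\n")
        os.chmod(fake_rm, stat.S_IRWXU)

        victim = os.path.join(tmp, "important.txt")
        with open(victim, "w") as f:
            f.write("data")

        # With PATH pointing only at the stub, `rm` is refused.
        subprocess.run(["rm", victim], env={"PATH": fake_bin},
                       stderr=subprocess.DEVNULL)
        after_fake_rm = os.path.exists(victim)

        # The interpreter issues the unlink itself and never looks at PATH.
        os.remove(victim)
        after_os_remove = os.path.exists(victim)

        return after_fake_rm, after_os_remove


if __name__ == "__main__":
    print(demo_wrapper_bypass())
```

The file survives the wrapped `rm` and is gone after `os.remove`, so the guardrail has to sit below the interpreter, not beside it.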

ethanwillis · 2026-03-28T08:58:02 1774688282

What we need is a capability-based security system. It could write all the Python, asm, whatever it wants, and it wouldn't matter at all if it was never given a reference to use something it shouldn't.

mcv · 2026-03-28T09:20:38 1774689638

Isn't this already possible? Give it its own user account with write access to the project directory and either read access or no access outside it.

VorpalWay · 2026-03-28T14:40:37 1774708837

Unix permissions are not a capability system, though. Capabilities are more like "here is a file descriptor pointing to a directory, you are not capable of referring to anything outside it". So closer to chroot, except you can have several such directory references at the same time.

You can always narrow down a capability (get a new capability pointing to a subdirectory or file, or remove the writing capability so it is read only) but never make it more broad.
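A toy model of the narrowing rule, as a sketch only. `DirCap` is a hypothetical class, not a real library, and nothing here is enforced: in plain Python any code can still call the built-in `open()` or construct a broader `DirCap` itself. A real capability system does this in the kernel. The shape of the API is the point: there are methods to shrink a capability and none to grow it.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class DirCap:
    """Toy capability: a directory root plus a write bit (hypothetical type)."""
    root: str
    writable: bool = True

    def _resolve(self, rel):
        # Resolve symlinks and "..", then refuse anything outside the root.
        root = os.path.realpath(self.root)
        target = os.path.realpath(os.path.join(root, rel))
        if os.path.commonpath([root, target]) != root:
            raise PermissionError(f"{rel!r} is outside this capability")
        return target

    def subdir(self, rel):
        # Narrowing: the new capability covers less, never more.
        return DirCap(self._resolve(rel), self.writable)

    def read_only(self):
        # Narrowing: drop the write right; no method adds it back.
        return DirCap(self.root, False)

    def open(self, rel, mode="r"):
        if not self.writable and any(c in mode for c in "wax+"):
            raise PermissionError("capability is read-only")
        return open(self._resolve(rel), mode)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "proj"))
        proj = DirCap(tmp).subdir("proj")
        with proj.open("a.txt", "w") as f:
            f.write("hi")
        ro = proj.read_only()
        results = []
        for attempt in (lambda: ro.open("a.txt", "w"),  # write via read-only cap
                        lambda: proj.subdir("..")):      # try to broaden
            try:
                attempt()
                results.append("allowed")
            except PermissionError:
                results.append("denied")
        with ro.open("a.txt") as f:
            results.append(f.read())
        return results
```

Here `demo()` writes through the narrowed `proj` capability, then shows that both the write through the read-only copy and the attempt to climb back out with `..` are refused.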

In a system designed for this it will be used for everything, not just the file system. You might have capabilities related to network connections, or IPC to other processes, etc. The latter is especially attractive in microkernel-based OSes. (Speaking of which, Redox OS seems to be experimenting with this, just saw an article today about that.)

See also https://en.wikipedia.org/wiki/Capability-based_security

100721 · 2026-03-28T11:58:30 1774699110

I have been putting my agents on their own, restricted OS-level user accounts for a while. It works really well for everything I do.

Admittedly, there’s a little more friction and agent confusion sometimes with this setup, but it’s worth the benefit of having zero worries about permissions and security.

jmogly · 2026-03-28T12:46:45 1774702005

Haha, you can already see wheel reinventors in this thread starting to spin their reinvention wheels. Nice stuff, I run my agents in containers.

ma2kx · 2026-03-28T18:03:36 1774721016

Restricted shells exist. But honestly, I don't feel capable of assessing all attack vectors and security measures in sufficient detail. For example, do the rbash restrictions also apply when Python is called from it? Or can the agent somehow bypass rbash to call Python?

https://en.wikipedia.org/wiki/Restricted_shell

rienbdj · 2026-03-28T13:19:19 1774703959

Docker is enough in practice, no?

jkukul · 2026-03-31T17:45:32 1774979132

Enabling Claude Code's sandbox (as OP suggested) does exactly that. It's a system-level filesystem sandbox that only permits access to specified locations for any process, including the python interpreter.

giancarlostoro · 2026-03-28T19:33:59 1774726439

If it is trained, at the core system level, not to write Python scripts that bypass its defined environment, why would this matter? I would also lock down its PATH: anything that tries to call Python should require the end user to see the raw script and approve it before it runs.

tintor · 2026-03-28T19:55:31 1774727731

It will then write a script in some other language, as a workaround.

lxgr · 2026-03-28T11:01:16 1774695676

> a custom implementation of "rm" that Anthropic can add guardrails to

Wrong layer. You want the deletion to actually be impossible from a privilege perspective, not be made practically harder to the entity that shouldn't delete something.

Claude definitely knows how to reimplement `rm`.

torginus · 2026-03-28T12:29:20 1774700960

Why can't you ship with OverlayFS, which actually enforces these restrictions?

I have seen the AI break out of my (admittedly flimsy) guards, simply by using `safepath/../../stuff` or something even more convoluted like symlinks.
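For what it's worth, here is a sketch of why a string-prefix guard lets both of those through, and what a resolved-path check looks like. It assumes a POSIX system where symlinks can be created; `is_inside` and `naive_is_inside` are illustrative names, not from any tool mentioned here. A check like this is still racy if the agent can swap a symlink between the check and the actual file operation, so it is not a substitute for enforcement by the OS.

```python
import os
import tempfile


def is_inside(base, candidate):
    """True if candidate, after resolving '..' and symlinks, stays under base."""
    base = os.path.realpath(base)
    target = os.path.realpath(os.path.join(base, candidate))
    return os.path.commonpath([base, target]) == base


def naive_is_inside(base, candidate):
    # The flimsy version: a string-prefix test on the unresolved path.
    return os.path.join(base, candidate).startswith(base)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        safe = os.path.join(tmp, "safepath")
        os.mkdir(safe)
        # A symlink inside the allowed directory that points out of it.
        os.symlink("/", os.path.join(safe, "link"))
        cases = ["notes.txt", "../../stuff", "link/etc"]
        return [(c, naive_is_inside(safe, c), is_inside(safe, c)) for c in cases]


if __name__ == "__main__":
    for case, naive, resolved in demo():
        print(f"{case:12} naive={naive} resolved={resolved}")
```

The naive check accepts all three paths; the resolved check accepts only `notes.txt`.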

troupo · 2026-03-28T08:31:24 1774686684

> Claude should be shipped by a custom implementation of

And when that fails for some reason, it will happily write and execute a Python script, bypassing all those custom tools.

eru · 2026-03-28T07:19:07 1774682347

> It's really surprising they don't just tweak what Claude uses and lock it down to where it cannot be harmful. Ensure it only ever calls tooling Claude Code provides.

That would make it far less useful in general.

KronisLV · 2026-03-28T08:23:00 1774686180

Maybe Anthropic (or some collection of the large AI orgs, like OpenAI and Anthropic and Google coming together) should apply patches on top of (or fork altogether) the coreutils and whatever you normally get in a userland - a bit like what you get in Git Bash on Windows, just with:

1) more guardrails in place

2) maybe more useful error messages that would help LLMs

3) no friction with needing to get any patches upstreamed

External tool calling should still be an option ofc, but having utilities that are usable just like what's in the training data, but with more security guarantees and more useful output that makes what's going on immediately obvious would be great.

eru · 2026-03-28T08:42:43 1774687363

So for me, it's really, really useful for Claude to be able to send Slack messages and emails or make pull requests.

But those are also the most damaging actions it could take. Everything on my computer is backed up, but if Claude insults my boss, that would be worse.

KronisLV · 2026-03-28T12:15:06 1774700106

> So for me, it's really, really useful for Claude to be able to send Slack messages and emails or make pull requests.

Oh, I'm totally not arguing for cutting off other capabilities, I like tool use and find it to be as useful as the next person!

Just that the shell tools that will see A LOT of usage should have additional guardrails added on top of them, because it's inevitable that sooner or later any given LLM will screw up and pipe the wrong thing into the wrong command. You already hear horror stories about devs whose entire machines get wiped. Not everyone has proper backups (even though they totally should)!

walthamstow · 2026-03-28T08:19:38 1774685978

Claude has told me that its Grep tool does use rg under the hood, but I constantly find it using the Bash tool with grep instead.

giancarlostoro · 2026-03-28T19:36:49 1774726609

When I tell it to use rg, it goes much faster than when it uses grep. I really don't understand why it's slower with grep.

oefrha · 2026-03-28T06:11:36 1774678296

You can define your own rm shell alias/function and it will use that. I also have cp/mv aliases that force -i to avoid accidental clobbering, and it confuses Claude to no end (it uses cp/mv rarely enough, more rarely than it should, really, that I don’t bother wasting memory tokens on it).

d1sxeyes · 2026-03-28T06:26:34 1774679194

I did this, Claude detected it and decided to run /bin/rm directly.

cogogo · 2026-03-28T11:01:33 1774695693

This is terrifying. I have not used agents because I do not have a sandbox machine I do not care about. Am I crazy to worry about a sandboxed agent running on my home network? Anyone experienced anything weird by doing that?

oefrha · 2026-03-28T11:10:41 1774696241

Don’t dangerously skip permissions and actually read commands when you get prompted and you’re fine.

d1sxeyes · 2026-03-28T11:15:22 1774696522

Yeah, I actually have both an alias for `rm` and a custom seatbelt sandbox which means the agent can only delete stuff within the directory it’s working in, so wasn’t an issue, was just fun to watch it say “hm, that doesn’t seem to work. Looks like the user has aliased rm. I’ll just go ahead and work around it”

cruffle_duffle · 2026-03-29T01:20:06 1774747206

Hah… I’ve seen Claude happily and very cleverly find ways to escape its sandbox. It’s like some kind of arms race between the model and its designers.