Show HN: Zerobox – Sandbox any command with file, network, credential controls (github.com)

43 points by afshinmeh 2 days ago

I'm excited to introduce Zerobox, a cross-platform, single binary process sandboxing CLI written in Rust. It uses the sandboxing crates from the OpenAI Codex repo and adds additional functionalities like secret injection, SDK, etc.

Watch the demo: https://www.youtube.com/watch?v=wZiPm9BOPCg

Zerobox follows the same sandboxing policy as Deno which is deny by default. The only operation that the command can run is reading files, all writes and network I/O are blocked by default. No VMs, no Docker, no remote servers.

Want to block reads to /etc?

  zerobox --deny-read=/etc -- cat /etc/passwd

  cat: /etc/passwd: Operation not permitted
How it works:

Zerobox wraps any commands/programs, runs an MITM proxy and uses the native sandboxing solutions on each operating system (e.g BubbleWrap on Linux) to run the given process in a sandbox. The MITM proxy has two jobs: blocking network calls and injecting credentials at the network level.

Think of it this way, I want to inject "Bearer OPENAI_API_KEY" but I don't want my sandboxed command to know about it, Zerobox does that by replacing "OPENAI_API_KEY" with a placeholder, then replaces it when the actual outbound network call is made, see this example:

  zerobox --secret OPENAI_API_KEY=$OPENAI_API_KEY --secret-host OPENAI_API_KEY=api.openai.com -- bun agent.ts
Zerobox is different than other sandboxing solutions in the sense that it would allow you to easily sandbox any commands locally and it works the same on all platforms. I've been exploring different sandboxing solutions, including Firecracker VMs locally, and this is the closest I was able to get when it comes to sandboxing commands locally.

The next thing I'm exploring is `zerobox claude` or `zerobox openclaw` which would wrap the entire agent and preload the correct policy profiles.

I'd love to hear your feedback, especially if you are running AI Agents (e.g. OpenClaw), MCPs, AI Tools locally.

simonw an hour ago

This looks really good - the CLI interface design is solid, and I especially like the secrets / network proxy pattern - but the thing it needs most is copiously detailed documentation about exactly how the sandbox mechanism works - and how it was tested.

There are dozens of projects like this emerging right now. They all share the same challenge: establishing credibility.

I'm loathe to spend time evaluating them unless I've seen robust evidence that the architecture is well thought through and the tool has been extensively tested already.

My ideal sandbox is one that's been used by hundreds of people in a high-stakes environment already. That's a tall order, but if I'm going to spend time evaluating one the next best thing is documentation that teaches me something about sandboxing and demonstrates to me how competent and thorough the process of building this one has been.

UPDATE: On further inspection there's a lot that I like about this one. The CLI design is neat, it builds on a strong underlying library (the OpenAI Codex implementation) and the features it does add - mainly the network proxy being able to modify headers to inject secrets - are genuinely great ideas.

kjok an hour ago

> There are dozens of projects like this emerging right now. They all share the same challenge: establishing credibility.

Care to elaborate on the kind of "credibility" to be established here? All these bazillion sandboxing tools use the same underlying frameworks for isolation (e.g., ebpf, landlock, VMs, cgroups, namespaces) that are already credible.

simonw an hour ago

The problem is that those underlying frameworks can very easily be misconfigured. I need to know that the higher level sandboxing tools were written by people with a deep understanding of the primitives that they are building on, and a very robust approach to testing that their assumptions hold and they don't have any bugs in their layer that affect the security of the overall system.

Most people are building on top of Apple's sandbox-exec which is itself almost entirely undocumented!

kjok 32 minutes ago

afshinmeh an hour ago

Simon! Thanks. I appreciate your comment and totally agreed. I will improve the docs as well as tests.

mdavid626 an hour ago

I trust sandbox-exec more, or Docker on Linux. Those come from the OS, well tested and known.

MITM proxy is nice idea to avoid leaking secrets. Isn’t it very brittle though? Anthropic changes some URL-s and it’ll break.

afshinmeh an hour ago

Thanks for sharing that. Zerobox _does_ use the native OS sandboxing mechanisms (e.g. seatbelt) under the hood. I'm not trying to reinvent the wheel when it comes to sandboxing.

Re the URLs, I agree, that's why I added wildcard support, e.g. `*.openai.com` for secret injection as well as network call filtering.

mdavid626 27 minutes ago

You know, the thing is, that it is super easy to create such tools with AI nowadays. …and if you create your own, you can avoid these unnecessary abstractions. You get exactly what you want.

mdavid626 44 minutes ago

How do you intercept network traffic on mac os? How do you fake certificates?

afshinmeh 28 minutes ago

volume_tech an hour ago

the credential injection via MITM proxy is the most interesting part to me. the standard approach for agents is environment variables, which means the agent process can read them directly. having the sandbox intercept network calls and swap in credentials at the proxy layer means the agent code has a placeholder and never sees the real value -- useful when running less-trusted agent code or third-party tools.

the deny-by-default network policy also matters specifically for agent use: without it there is nothing stopping a tool call from exfiltrating context window contents to an arbitrary endpoint. most sandboxes focus on filesystem isolation and treat network as an afterthought.

afshinmeh an hour ago

Thanks and agreed! Zerobox uses the Deno sandboxing policy and also the same pattern for cred injection (placeholders as env vars, replaced at network call time).

Real secrets are never readable by any processes inside the sandbox:

```

zerobox -- echo $OPENAI_API_KEY

ZEROBOX_SECRET_a1b2c3d4e5...

```

simonw an hour ago

Do you know if there's a widely shared name for this pattern? I've been collecting examples of it recently - it's a really good idea - but I'm not sure if there's good terminology. "Credential injection" is one option I've seen floating around.

afshinmeh an hour ago

eluded7 2 hours ago

Personally I would probably always reach for a docker container if I want a sandboxed command that can run identically anywhere.

I appreciate that alternate sandboxing tools can reduce some of the heavier parts of docker though (i.e. building or downloading the correct image)

How would you compare this tool to say bubblewrap https://github.com/containers/

ebb_earl_co 2 hours ago

The text says that it uses OS-level tools, specifically bubble wrap on Linux.

afshinmeh an hour ago

That's right. It uses the same kernel mechanisms as Docker, the runtime is different though (bwrap on linux, seatbelt on mac, etc.)

zephyrwhimsy an hour ago

Technical debt is not always bad. Deliberate technical debt taken on with eyes open to ship faster is a legitimate business strategy. The problem is accidental technical debt from poor decisions compounding silently.

time0ut 2 hours ago

Very interesting. I just started researching this topic yesterday to build something for adjacent use cases (sandboxing LLM authored programs). My initial prototype is using a wasm based sandbox, but I want something more robust and flexible.

Some of my use cases are very latency sensitive. What sort of overhead are you seeing?

afshinmeh an hour ago

I added a benchmark test (Apple M5) and on average I'm seeing 10ms overhead. I added a benchmark section to the repo as well https://github.com/afshinm/zerobox?tab=readme-ov-file#perfor...

Also, I'm literally wrapping Claude with zerobox now! No latency issues at all.

jbverschoor 2 hours ago

Again, it’s blacklisting so kind of impossible to get right. I’ve looked at this many times, but in order for things to properly work, you have to create a huge, huge, huge, huge sandbox file.

Especially for your application that you any kind of Apple framework.

simonw an hour ago

This doesn't look like it's blacklisting to me. It's an allowlist system:

  --allow-net=api.openai.com # Explicitly allow access to that host

  --allow-write=config.txt # Explicitly allow write to that file

afshinmeh an hour ago

That's correct. The pattern is: reads allowed, write and network I/O blocked by default.

```

zerobox -- curl https://example.com

Could not resolve host: example.com

```

simonw an hour ago

afshinmeh an hour ago

That's interesting, thanks for sharing that. Could you elaborate a bit more? I'd like to understand the use case is a bit better.

gigatexal 6 minutes ago

there's been so many of these -- which of these sandboxing tools is best?

wepple an hour ago

You should probably add a huge disclaimer that this is an untested, experimental project.

Related, a direct comparison to other sandboxes and what you offer over those would be nice

afshinmeh an hour ago

I agree to some extend. I'm using the OpenAI Codex crates for sandboxing though, which I think it's properly tested? They launched last year and iterated many times. I will add a note though, thanks!

nonameiguess 26 minutes ago

This is more a criticism of codex's linux-sandboxing, which you're just wrapping, but it's the first I've ever looked at it. I don't see how it makes sense to invoke bwrap as a forked subprocess. Bubblewrap can't do anything beyond what you can do with unshare directly, which you can simply invoke as a system call without needing to spawn a subprocess or requiring the user to have bwrap installed. It kinds of reeks of amateur hour when developers effectively just translate shell scripts into compiled languages by using whatever variant of "system" is available to make the same command invocations you would make through a shell, as opposed to actually using the system call API. Especially when the invocation is crafted from user input, there's a long history of exploits arising from stuff like this. Writing it in Rust does nothing for you when you're just using Rust to call a different CLI tool that isn't written in Rust.

afshinmeh 9 minutes ago

Thanks for sharing this, I read your comment multiple times. What would be the alternative though? It is true that the program being written in Rust doesn't solve the problem of spawning subprocesses, but what's the alternative in that case?

alyxya 2 hours ago

Cool project, and I think there would be a lot of value in just logging all operations.

kimixa 2 hours ago

For just logging would it really give any more info than a trace already does?

alyxya an hour ago

Forgot about that, was mostly thinking about how AI agents with unrestricted permissions would ideally have some external logging and monitoring, so there would be a record of what it touched. A trace has all of the raw information, so some kind of wrapper around that would be useful.

afshinmeh an hour ago

afshinmeh an hour ago

Agreed. I added the `--debug` flag this morning. It does simple logging including the proxy calls:

```

$ zerobox --debug --allow-net=httpbin.org -- curl

2026-04-01T18:06:33.928486Z CONNECT blocked (client=127.0.0.1:59225, host=example.com, reason=not_allowed)

curl: (56) CONNECT tunnel failed, response 403

```

I'm planning on adding otel integration as well.