Hacker News | ushakov's comments

We are running Sandboxes for AI Agents using Firecracker microVMs at E2B

$1.5M seed bets, maybe. not $60M though

just from looking at it

on Linux it runs Firecracker: https://github.com/jingkaihe/matchlock/blob/main/pkg/vm/linu...

on macOS it uses a Go wrapper for Apple's Virtualization.framework: https://github.com/jingkaihe/matchlock/blob/main/pkg/vm/darw...


nice - I was wondering about the cross-platform story. firecracker on linux for the isolation, virtualization.framework on mac so you don't need vmware.

very cool, if you want cross-platform microvms, there's an interesting project called libkrun that powers projects like Podman and Colima.

here's a Go binding: https://github.com/mishushakov/libkrun-go

demo (on Mac): https://x.com/mishushakov/status/2020236380572643720


Since when does libkrun power Podman? Last time I checked, Podman used non-virtualized containers based on `crun`.

(Though you can certainly configure Podman to use krun[0], which fires up a libkrun VM inside a crun container.)

[0]: https://github.com/containers/crun/blob/main/krun.1


i think there’s some confusion around what use-case Monty is solving (i was confused as well). it seems to isolate at the scope of a single execution, like a function call, not entire Python applications

agree. you still need a secure boundary like a VM to isolate the tenants in case the model breaks out of the sandbox.

everything that you don’t want your agent to access should live outside of the sandbox.


best answer is probably to have a layered approach - use this to limit what the generated code can do, wrap it in a secure VM to prevent leaking out to other tenants.

there’s no way around VMs for secure, untrusted workloads. everything else, like Monty, has too many tradeoffs that make it non-viable for any real workloads

disclaimer: i work at E2B, opinions my own


As discussed on twitter, v8 shows that's not true.

But to be clear, we're not even targeting the same "computer use" use case I think e2b, daytona, cloudflare, modal, fly.io, deno, google, aws are going after - we're aiming to support programmatic tool calling with minimal latency and complexity - it's a fundamentally different offering.

Chill, e2b has its use case, at least for now.


There's been a constant stream of v8 VM sandbox escape discoveries since its dawn of course. Considering those have mostly existed for a long time before publication it's very porous most of the time.

And Python VM had/has its sandboxing features too, previously rexec and still https://github.com/zopefoundation/RestrictedPython - in the same category I'd argue.

Then there's of course hypervisor based virtualization and the vulnerabilities and VM escapes there.

Browsers use belt-and-suspenders approaches of employing both language runtime VMs and hardware memory protection as layers to some effect, but still are the star act at pwn2own etc.

It's all layers of porous defenses. There'd definitely be room in the world for performant dynamic language implementations with provably secure foundations.


> It's all layers of porous defenses.

Also known as the "swiss cheese model" in risk management.
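To put rough numbers on the layering argument, here is a minimal sketch. It assumes each layer fails independently, which is a strong simplification (one kernel bug can defeat several layers at once), and the probabilities are made-up illustration values, not measurements of any real sandbox.

```python
from math import prod


def combined_escape_probability(layer_escape_probs):
    """Chance that an attacker gets through every layer.

    Assumes the layers fail independently; correlated weaknesses
    make the real figure higher than this product.
    """
    return prod(layer_escape_probs)


# Hypothetical numbers: a language-level sandbox that fails 1 time in 10,
# wrapped in a VM boundary that fails 1 time in 100.
language_only = combined_escape_probability([0.1])
language_plus_vm = combined_escape_probability([0.1, 0.01])
```

The point is only the shape of the result: every extra independent layer multiplies the attacker's odds down, and no single layer has to be perfect.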


part of why rexec is "historical" is that Guido was looking at some lockdown work and asked (twitter, probably?) the community to come up with attack ideas (on a specific more-locked-down-than-default proposed version.) After a couple of hours, it was clear that "patching the problems" was entirely doomed given how flexible python is and it was better to do something else entirely and stop pretending...
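A small illustration of why "patching the problems" tends to fail in Python. This is not one of the attacks from that discussion, just a well-known introspection trick: even with every builtin removed from the namespace, an expression can still walk from a plain tuple up to `object` and enumerate every loaded class.

```python
# Namespace with all builtins stripped out, a common first attempt at lockdown.
LOCKED_DOWN = {"__builtins__": {}}


def name_lookup_blocked(name):
    """Return True if a bare name such as 'open' is unavailable."""
    try:
        eval(name, dict(LOCKED_DOWN))
    except NameError:
        return True
    return False


def reachable_classes():
    """Reach every subclass of object without using any builtin name."""
    return eval("().__class__.__base__.__subclasses__()", dict(LOCKED_DOWN))


blocked = name_lookup_blocked("open")
classes = reachable_classes()
```

Blocking the name `open` works, but the class list is still reachable, and from there an attacker can go looking for classes that lead back to file or module access.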

V8 itself is intended to be heavily sandboxed. Not through a microvm, but otherwise it's probably the most heavily sandboxed piece of code ever, i.e. in Chrome it can make virtually no system calls and runs with every restriction an OS can possibly provide and more; seccomp-bpf was basically invented for it.

Perhaps you're using v8 isolates, in which case you're back to the "heavily restricted environment within the process" and you lose the things you'd want your AI to be able to do. Even then you still have to sandbox the hell out of it to be safe, and you have to seriously consider side channel leaks.

And even after all of that you'd better hope you're staying up to date with patches.

MicroVMs are going to just be way simpler IMO. I don't really get the appeal of using V8 for this unless you have platform/deployment limitations. Talking over Firecracker's vsock is extremely fast. Firecracker is also insanely safe - 3 CVEs ever, and IMO none are exploitable.
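For anyone curious what talking over vsock looks like from the host, here is a rough sketch. It assumes the host-initiated flow as I understand it from Firecracker's vsock docs: connect to the Unix socket configured for the vsock device, send `CONNECT <port>`, and wait for an `OK <host_port>` line. The helper names, the socket path and the port you pass in are all hypothetical.

```python
import socket


def connect_command(guest_port: int) -> bytes:
    # Text command asking to be forwarded to a guest-side vsock port.
    return f"CONNECT {guest_port}\n".encode("ascii")


def parse_ack(line: bytes) -> int:
    # The acknowledgement is expected to look like "OK <host_port>".
    text = line.decode("ascii").strip()
    if not text.startswith("OK "):
        raise ConnectionError(f"unexpected vsock reply: {text!r}")
    return int(text[3:])


def open_guest_stream(uds_path: str, guest_port: int, timeout: float = 5.0) -> socket.socket:
    # uds_path is whatever Unix socket path the vsock device was configured with.
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    sock.connect(uds_path)
    sock.sendall(connect_command(guest_port))
    # Read the acknowledgement one byte at a time so no payload is consumed.
    ack = b""
    while not ack.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("socket closed before acknowledgement")
        ack += chunk
    parse_ack(ack)
    return sock
```

After the handshake the returned socket is a plain byte stream to whatever is listening on that port inside the guest, so any RPC framing can go on top.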


we’re not disagreeing here - i meant that for the general use-case VMs are better, while for some application-specific calls Monty might suffice.

although you’d still need another boundary to run your app in to prevent breaking out to other tenants.


Factory, Nvidia, Perplexity and Manus are using E2B in production - we ran more than 200 million Sandboxes for our customers

both Docker and bubblewrap are not secure sandboxes. the only way to have actually isolated sandboxes is by using VMs

disclaimer: i work on secure sandboxes at E2B


No disagreement from me. From the article:

> Bubblewrap and Docker are not hardened security isolation mechanisms, but that's okay with me.

Edit to add: my understanding is that the major flaw in this approach is potential bugs in the Linux kernel that would allow sandbox escape. Would appreciate your insight if there are some easier/more probable attack vectors.


What about cgroups? I know they are not exactly analogous, but to me that seems like a pretty decent solution.

Do you have more information on how to set up such VMs?

for personal use, many ways: Vagrant, Docker Sandbox, NixOS VMs, Lima, OrbStack.

if you want multi-tenant: E2B (open-source, self-hosted)


Hashicorp has mostly abandoned Vagrant, so I'd avoid it.
