BunkerVM – Run AI agents in isolated Firecracker microVM sandboxes

Source: hackernews · Summarized and analyzed by Genesis Park

Summary

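BunkerVM provides a hardware-isolated sandbox built on Firecracker microVMs to protect the host system from malicious code generated by AI agents. With no complex setup, users can double-click the BunkerDesktop installer for Windows and have an isolated environment running in about 3 seconds; WSL2 and the backend are installed automatically along the way. Unlike Docker, it does not share the host kernel, so container-escape attacks do not apply, and it integrates with LangChain and the OpenAI Agents SDK to keep an agent's code execution under safe control.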

Body

Time-travel debugging for AI agent sandboxes. Hardware-isolated Firecracker microVMs with snapshot, replay, and diff, not containers.

AI agents execute code on your machine. When something goes wrong (and it will), you have no way to see what the agent actually did, rewind to the moment before it broke, or compare why one agent succeeded and another failed. Containers share your kernel (escapes are real). Cloud sandboxes send your data to someone else's server. Neither gives you observability into agent behaviour.

BunkerVM solves all three: isolation, observability, and time-travel. Each sandbox is a Firecracker microVM, the same technology behind AWS Lambda. Own kernel, own filesystem, hardware-level (KVM) isolation. Not a container. On top of that, BunkerVM adds capabilities that no other sandbox provides:

    from bunkervm import Sandbox

    with Sandbox(record=True) as sb:
        sb.run("import pandas as pd")
        sb.run("df = pd.read_csv('/data/input.csv')")
        sb.run("df['total'] = df.price * df.qty")
        sb.run("df.to_csv('/output/result.csv')")
        # Every step recorded: command, output, filesystem changes, VM snapshot

        sb.restore(step=2)        # VM state rewinds to after read_csv
        sb.run("df.describe()")   # explore from that exact point

The VM's memory, CPU registers, and filesystem all revert to exactly what they were after step 2. This is not a re-run; it is an actual restore from a Firecracker snapshot.
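To make the step-indexed restore concrete, here is a minimal toy model in plain Python. It is not BunkerVM's implementation: `ToySession` is a hypothetical class, a dictionary stands in for the VM's memory and filesystem, and `copy.deepcopy` stands in for a Firecracker snapshot. Dropping the checkpoints after the restored step is an assumption of this sketch, not documented behaviour.

```python
import copy


class ToySession:
    """Hypothetical in-memory stand-in for a recorded sandbox session."""

    def __init__(self):
        self.state = {}        # stands in for VM memory + filesystem
        self.checkpoints = []  # one entry per recorded step

    def run(self, command, apply):
        """Apply a state change, then record a checkpoint for this step."""
        apply(self.state)
        self.checkpoints.append({
            "step": len(self.checkpoints) + 1,
            "command": command,
            "snapshot": copy.deepcopy(self.state),
        })

    def restore(self, step):
        """Rewind to the state captured right after `step`.

        Assumption of this sketch: later checkpoints are discarded.
        """
        self.state = copy.deepcopy(self.checkpoints[step - 1]["snapshot"])
        del self.checkpoints[step:]


session = ToySession()
session.run("load", lambda s: s.update(rows=[1, 2, 3]))
session.run("double", lambda s: s.update(rows=[r * 2 for r in s["rows"]]))
session.run("clear", lambda s: s["rows"].clear())

session.restore(step=2)
print(session.state)  # {'rows': [2, 4, 6]}
```

The point of the model is that a restore copies saved state back rather than re-executing commands, so side effects of step 3 simply never happened from the session's point of view.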
    for cp in sb.history():
        print(f"step {cp['step']}: {cp['command']}")
        if cp['trace']:
            for f in cp['trace']['files_created']:
                print(f"  + {f['path']} ({f['size']} bytes)")

    step 1: import pandas as pd
    step 2: df = pd.read_csv('/data/input.csv')
      ~ /data/input.csv (read)
    step 3: df['total'] = df.price * df.qty
    step 4: df.to_csv('/output/result.csv')
      + /output/result.csv (1247 bytes)

    bunkervm diff session-abc session-def

    Agent Diff
    Session A: abc (12 steps, 3400ms)
    Session B: def (8 steps, 1200ms)
    Files only in A: /tmp/debug.log, /tmp/retry_3.py
    Files only in B: /output/result.csv
    step 1 [same] import pandas as pd
    step 2 [same] df = pd.read_csv('/data/input.csv')
    step 3 [diff] A: df = df.dropna()
                  B: df = df.fillna(0)
    step 4 [diff] A: # crashed — KeyError: 'total'
                  B: df['total'] = df.price * df.qty  ← OK

Agent A dropped rows and lost a required column. Agent B filled missing values and succeeded. Without diff, you'd never know why.

    pip install bunkervm

    from bunkervm import run_code

    result = run_code("print('Hello from a microVM!')")
    print(result)  # Hello from a microVM!

VM boots, code runs, VM dies. Your host was never touched.

    AI Agent
        │
        ▼
    bunkervm (host) ──vsock──▶ Firecracker MicroVM
        │                      ┌────────────────────┐
        │  record=True         │ Alpine Linux       │
        │  ─────────▶          │ Own kernel         │
        │  snapshot()          │ exec_agent.py      │
        │  trace()             │ (filesystem trace) │
        │  restore()           └────────────────────┘
        │                      KVM hardware isolation
        ▼
    ~/.bunkervm/sessions/      ~/.bunkervm/snapshots/
      session-abc.json           step1/ vmstate + memory
      session-def.json           step2/ vmstate + memory

Firecracker provides the isolation.
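The session comparison shown in the diff output above can be approximated in a few lines. This is a sketch only: the real session JSON schema is not shown in the source, so the `commands` and `files` fields and the `diff_sessions` helper are hypothetical, and the comparison is a simple position-by-position match rather than whatever alignment `bunkervm diff` actually performs.

```python
from itertools import zip_longest


def diff_sessions(a, b):
    """Compare two sessions by file set and by command at each step.

    `a` and `b` are dicts with hypothetical "commands" and "files" keys.
    """
    files_a, files_b = set(a["files"]), set(b["files"])
    steps = []
    # zip_longest pads the shorter session with None for missing steps.
    for n, (cmd_a, cmd_b) in enumerate(
        zip_longest(a["commands"], b["commands"]), start=1
    ):
        status = "same" if cmd_a == cmd_b else "diff"
        steps.append((n, status, cmd_a, cmd_b))
    return {
        "only_in_a": sorted(files_a - files_b),
        "only_in_b": sorted(files_b - files_a),
        "steps": steps,
    }


session_a = {
    "commands": [
        "import pandas as pd",
        "df = pd.read_csv('/data/input.csv')",
        "df = df.dropna()",
    ],
    "files": ["/tmp/debug.log", "/tmp/retry_3.py"],
}
session_b = {
    "commands": [
        "import pandas as pd",
        "df = pd.read_csv('/data/input.csv')",
        "df = df.fillna(0)",
        "df['total'] = df.price * df.qty",
    ],
    "files": ["/output/result.csv"],
}

report = diff_sessions(session_a, session_b)
print("Files only in A:", ", ".join(report["only_in_a"]))
print("Files only in B:", ", ".join(report["only_in_b"]))
for n, status, cmd_a, cmd_b in report["steps"]:
    if status == "same":
        print(f"step {n} [same] {cmd_a}")
    else:
        print(f"step {n} [diff] A: {cmd_a}")
        print(f"              B: {cmd_b}")
```

A positional comparison is the simplest choice; it marks every later step as different once one session inserts an extra command, which a sequence-alignment approach would handle better.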
BunkerVM adds the instrumentation layer:

| Layer | What it does |
|---|---|
| exec_agent (inside VM) | Traces filesystem changes per command: files created, modified, deleted, bytes written |
| Firecracker API (host→VM) | Pauses VM, snapshots CPU + memory state to disk, resumes, all via Firecracker's built-in snapshot API |
| Snapshot manager (host) | Stores and indexes snapshots at ~/.bunkervm/snapshots/, manages lifecycle |
| Session recorder (host) | Chains commands → traces → snapshots into a replayable session JSON |

No custom kernel modules. No eBPF. No ptrace. The VM is the isolation boundary; the API socket is the control plane. Pure Python, stdlib-only transport.

Every command execution can return a trace of what changed on disk.

    result = client.exec("python3 train.py", trace=True)
    print(result["trace"])
    # {
    #   "files_created": [{"path": "/output/model.pkl", "size": 4820}],
    #   "files_modified": [{"path": "/tmp/loss.log", "old_size": 0, "new_size": 312}],
    #   "files_deleted": [],
    #   "bytes_written": 5132
    # }

This happens inside the VM as a pre/post filesystem snapshot diff. No host-side hooks, no strace, no overhead on non-traced commands.

Full VM state (CPU, memory, filesystem) saved to disk. Restore boots a new Firecracker process from that state instead of cold-booting.

    from bunkervm import Sandbox

    with Sandbox() as sb:
        sb.run("import torch; model = torch.load('bert.pt')")
        sb.checkpoint("model-loaded")        # snapshot: 45ms
        sb.run("output = model(bad_input)")  # crashes
        sb.restore(step=1)                   # restore: [...]

    [...] --trace                     # replay recorded session
    bunkervm diff                     # compare two agent runs
    bunkervm snapshot list            # list VM snapshots
    bunkervm snapshot delete          # delete a snapshot
    bunkervm server --transport sse   # MCP server
    bunkervm info                     # system readiness check

See CONTRIBUTING.md. See SECURITY.md. License: Apache-2.0.

If BunkerVM helps you build safer agents, star the repo.
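The pre/post filesystem snapshot diff that the body describes for traced commands can be sketched with the standard library alone. This is an illustration of the general technique, not the code inside exec_agent: the `scan` and `trace` helpers are hypothetical, change detection is by file size only (a same-size rewrite would be missed), and the `bytes_written` field from the example output is left out because it cannot be derived reliably from two size listings.

```python
import os
import tempfile


def scan(root):
    """Map every file path under `root` to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            sizes[path] = os.path.getsize(path)
    return sizes


def trace(before, after):
    """Diff two scans into created / modified / deleted lists."""
    created = [
        {"path": p, "size": s}
        for p, s in sorted(after.items())
        if p not in before
    ]
    modified = [
        {"path": p, "old_size": before[p], "new_size": s}
        for p, s in sorted(after.items())
        if p in before and before[p] != s
    ]
    deleted = sorted(p for p in before if p not in after)
    return {
        "files_created": created,
        "files_modified": modified,
        "files_deleted": deleted,
    }


with tempfile.TemporaryDirectory() as root:
    log_path = os.path.join(root, "loss.log")
    open(log_path, "w").close()          # empty log exists before the command

    before = scan(root)

    # Stand-in for the traced command: grow the log, create a model file.
    with open(log_path, "w") as f:
        f.write("x" * 312)
    with open(os.path.join(root, "model.pkl"), "wb") as f:
        f.write(b"\0" * 4820)

    after = scan(root)
    result = trace(before, after)

for entry in result["files_created"]:
    print("+", os.path.basename(entry["path"]), entry["size"])
for entry in result["files_modified"]:
    print("~", os.path.basename(entry["path"]), entry["old_size"], "->", entry["new_size"])
```

Scanning only when tracing is requested is what keeps untraced commands free of overhead; comparing modification times or content hashes instead of sizes would catch more changes at the cost of a slower scan.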

This analysis was written by the Genesis Park editorial team with the help of AI. The original article is available through the source link.
