Show HN: BunkerVM – Secure runtime for AI agents using microVM sandboxes
hackernews
🔬 Research
#ai agents
#bunkervm
#microvm
#review
#debugging
#security
Source: hackernews · Summary and analysis by Genesis Park
Summary
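Code run by AI agents can pose a security risk when it executes directly on the host system, and the author built BunkerVM to address this. BunkerVM runs agent code inside a lightweight Firecracker microVM, which provides strong isolation and starts in about 2 seconds. The project is currently exploring tool execution for LLM agents, persistent VM sessions, and environment snapshots, and the author is asking developers working in this area for feedback.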
Body
Time-travel debugging for AI agent sandboxes. Hardware-isolated Firecracker microVMs with snapshot, replay, and diff, not containers.

AI agents execute code on your machine. When something goes wrong (and it will), you have no way to see what the agent actually did, rewind to the moment before it broke, or compare why one agent succeeded and another failed.

Containers share your kernel (escapes are real). Cloud sandboxes send your data to someone else's server. Neither gives you observability into agent behaviour.

BunkerVM solves all three: isolation, observability, and time-travel.

Each sandbox is a Firecracker microVM, the same technology behind AWS Lambda. Own kernel, own filesystem, hardware-level (KVM) isolation. Not a container.

On top of that, BunkerVM adds capabilities that no other sandbox provides:

    from bunkervm import Sandbox

    with Sandbox(record=True) as sb:
        sb.run("import pandas as pd")
        sb.run("df = pd.read_csv('/data/input.csv')")
        sb.run("df['total'] = df.price * df.qty")
        sb.run("df.to_csv('/output/result.csv')")
        # Every step recorded: command, output, filesystem changes, VM snapshot

        sb.restore(step=2)        # VM state rewinds to after read_csv
        sb.run("df.describe()")   # explore from that exact point

The VM's memory, CPU registers, and filesystem all revert to exactly what they were after step 2. Not a re-run. An actual restore from a Firecracker snapshot.
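For readers unfamiliar with Firecracker snapshots, here is a minimal sketch of the pause, snapshot, resume sequence that Firecracker exposes through its HTTP API on a Unix socket. This is not BunkerVM's code: `UnixHTTPConnection`, `snapshot_requests`, `take_snapshot`, and all file paths are hypothetical names chosen for illustration. Only the three endpoint calls follow Firecracker's documented API.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP over a Unix domain socket, using only the standard library."""

    def __init__(self, socket_path, timeout=10.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        # Replace the default TCP connect with a Unix-socket connect.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def snapshot_requests(snapshot_path, mem_path):
    """Return the (method, path, body) calls for pause -> snapshot -> resume."""
    return [
        ("PATCH", "/vm", {"state": "Paused"}),
        ("PUT", "/snapshot/create", {
            "snapshot_type": "Full",
            "snapshot_path": snapshot_path,  # serialized VM (CPU, device) state
            "mem_file_path": mem_path,       # guest memory contents
        }),
        ("PATCH", "/vm", {"state": "Resumed"}),
    ]


def take_snapshot(api_socket, snapshot_path, mem_path):
    """Send the sequence to a running Firecracker API socket (hypothetical helper)."""
    for method, path, body in snapshot_requests(snapshot_path, mem_path):
        conn = UnixHTTPConnection(api_socket)
        try:
            conn.request(method, path, body=json.dumps(body),
                         headers={"Content-Type": "application/json"})
            response = conn.getresponse()
            response.read()
            if response.status >= 300:
                raise RuntimeError(f"{method} {path} failed: {response.status}")
        finally:
            conn.close()
```

Restoring goes the other way: a fresh Firecracker process is started and the two files are loaded through `PUT /snapshot/load`, which is why a restore resumes the saved state instead of re-running earlier commands.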
The recorded history of a session can then be listed step by step:

    for cp in sb.history():
        print(f"step {cp['step']}: {cp['command']}")
        if cp['trace']:
            for f in cp['trace']['files_created']:
                print(f"  + {f['path']} ({f['size']} bytes)")

Output:

    step 1: import pandas as pd
    step 2: df = pd.read_csv('/data/input.csv')
      ~ /data/input.csv (read)
    step 3: df['total'] = df.price * df.qty
    step 4: df.to_csv('/output/result.csv')
      + /output/result.csv (1247 bytes)

Two recorded sessions can be compared from the command line:

    bunkervm diff session-abc session-def

Output:

    Agent Diff
    Session A: abc (12 steps, 3400ms)
    Session B: def (8 steps, 1200ms)

    Files only in A: /tmp/debug.log, /tmp/retry_3.py
    Files only in B: /output/result.csv

    step 1 [same] import pandas as pd
    step 2 [same] df = pd.read_csv('/data/input.csv')
    step 3 [diff] A: df = df.dropna()
                  B: df = df.fillna(0)
    step 4 [diff] A: # crashed — KeyError: 'total'
                  B: df['total'] = df.price * df.qty ← OK

Agent A dropped rows and lost a required column. Agent B filled missing values and succeeded. Without diff, you'd never know why.

Install:

    pip install bunkervm

Run code in a throwaway VM:

    from bunkervm import run_code

    result = run_code("print('Hello from a microVM!')")
    print(result)  # Hello from a microVM!

VM boots, code runs, VM dies. Your host was never touched.

Architecture:

    AI Agent
       │
       ▼
    bunkervm (host) ──vsock──▶ Firecracker MicroVM
       │                      ┌────────────────────┐
       │ record=True          │ Alpine Linux       │
       │ ─────────▶           │ Own kernel         │
       │ snapshot()           │ exec_agent.py      │
       │ trace()              │ (filesystem trace) │
       │ restore()            └────────────────────┘
       │                       KVM hardware isolation
       ▼
    ~/.bunkervm/sessions/     ~/.bunkervm/snapshots/
      session-abc.json          step1/ vmstate + memory
      session-def.json          step2/ vmstate + memory

Firecracker provides the isolation.
BunkerVM adds the instrumentation layer:

| Layer | What it does |
|---|---|
| exec_agent (inside VM) | Traces filesystem changes per command: files created, modified, deleted, bytes written |
| Firecracker API (host→VM) | Pauses VM, snapshots CPU + memory state to disk, resumes, all via Firecracker's built-in snapshot API |
| Snapshot manager (host) | Stores and indexes snapshots at ~/.bunkervm/snapshots/, manages lifecycle |
| Session recorder (host) | Chains commands → traces → snapshots into a replayable session JSON |

No custom kernel modules. No eBPF. No ptrace. The VM is the isolation boundary; the API socket is the control plane. Pure Python, stdlib-only transport.

Every command execution can return a trace of what changed on disk.

    result = client.exec("python3 train.py", trace=True)
    print(result["trace"])
    # {
    #   "files_created": [{"path": "/output/model.pkl", "size": 4820}],
    #   "files_modified": [{"path": "/tmp/loss.log", "old_size": 0, "new_size": 312}],
    #   "files_deleted": [],
    #   "bytes_written": 5132
    # }

This happens inside the VM, as a pre/post filesystem snapshot diff. No host-side hooks, no strace, no overhead on non-traced commands.

Full VM state (CPU, memory, filesystem) is saved to disk. Restore boots a new Firecracker process from that state instead of cold-booting.

    from bunkervm import Sandbox

    with Sandbox() as sb:
        sb.run("import torch; model = torch.load('bert.pt')")
        sb.checkpoint("model-loaded")         # snapshot: 45ms
        sb.run("output = model(bad_input)")   # crashes
        sb.restore(step=1)                    # restore: […]

CLI commands:

    […] --trace                        # replay recorded session
    bunkervm diff                      # compare two agent runs
    bunkervm snapshot list             # list VM snapshots
    bunkervm snapshot delete           # delete a snapshot
    bunkervm server --transport sse    # MCP server
    bunkervm info                      # system readiness check

See CONTRIBUTING.md. See SECURITY.md. License: Apache-2.0.

If BunkerVM helps you build safer agents, star the repo.
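The README describes the per-command trace as a pre/post filesystem diff taken inside the VM. The sketch below illustrates that idea under stated assumptions and is not the `exec_agent` implementation. It assumes files are compared by path and size only, `bytes_written` counts new files plus growth of modified ones, and deleted entries carry just a `path` (the README only shows an empty list). The names `scan`, `diff_scans`, and `trace` are hypothetical.

```python
import os


def scan(root):
    """Map each file path under root to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes[path] = os.lstat(path).st_size
            except OSError:
                continue  # file vanished between listing and stat
    return sizes


def diff_scans(before, after):
    """Compare two scans and build a trace shaped like the README example."""
    created = [{"path": p, "size": s}
               for p, s in sorted(after.items()) if p not in before]
    modified = [
        {"path": p, "old_size": before[p], "new_size": s}
        for p, s in sorted(after.items())
        if p in before and before[p] != s
    ]
    deleted = [{"path": p} for p in sorted(before) if p not in after]
    # Assumption: bytes written = size of new files + growth of modified files.
    bytes_written = sum(f["size"] for f in created) + sum(
        max(f["new_size"] - f["old_size"], 0) for f in modified
    )
    return {
        "files_created": created,
        "files_modified": modified,
        "files_deleted": deleted,
        "bytes_written": bytes_written,
    }


def trace(root, action):
    """Scan root, run action(), scan again, and return the difference."""
    before = scan(root)
    action()
    return diff_scans(before, scan(root))
```

With these assumptions, the README's sample (a new 4820-byte file plus a log growing from 0 to 312 bytes) yields `bytes_written` of 5132. A size-only comparison misses rewrites that leave the size unchanged, so a fuller version would also compare modification times or content hashes.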
This analysis was written by the Genesis Park editorial team with the help of AI. The original text is available through the source link.