Show HN: NervOS – Sandbox for AI agents using Firecracker MicroVMs
hackernews
🔬 Research
#ai agents
#firecracker
#microvm
#nervos
#review
#sandbox
#debugging
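Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
NervOS provides a safe sandbox by running an AI agent's code inside a Firecracker microVM rather than on the host machine. The lightweight Alpine VM supports tools such as Python, bash, curl, and git, and if something goes wrong the VM can be destroyed, which keeps the host machine safe. Experiments integrating it with MCP for local agents are currently under way.
Body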
Time-travel debugging for AI agent sandboxes. Hardware-isolated Firecracker microVMs with snapshot, replay, and diff. Not containers.

AI agents execute code on your machine. When something goes wrong (and it will), you have no way to see what the agent actually did, rewind to the moment before it broke, or compare why one agent succeeded and another failed.

Containers share your kernel (escapes are real). Cloud sandboxes send your data to someone else's server. Neither gives you observability into agent behaviour.

BunkerVM solves all three: isolation, observability, and time-travel.

Each sandbox is a Firecracker microVM, the same technology behind AWS Lambda. Own kernel, own filesystem, hardware-level (KVM) isolation. Not a container.

On top of that, BunkerVM adds capabilities that no other sandbox provides:

    from bunkervm import Sandbox

    with Sandbox(record=True) as sb:
        sb.run("import pandas as pd")
        sb.run("df = pd.read_csv('/data/input.csv')")
        sb.run("df['total'] = df.price * df.qty")
        sb.run("df.to_csv('/output/result.csv')")
        # Every step recorded: command, output, filesystem changes, VM snapshot

        sb.restore(step=2)       # VM state rewinds to after read_csv
        sb.run("df.describe()")  # explore from that exact point

The VM's memory, CPU registers, and filesystem all revert to exactly what they were after step 2. Not a re-run. An actual restore from a Firecracker snapshot.
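To make "an actual restore from a Firecracker snapshot" more concrete, here is a minimal sketch of the host-side request sequence against Firecracker's HTTP API (`PATCH /vm`, `PUT /snapshot/create`, `PUT /snapshot/load`). It is not BunkerVM's code. The function names `snapshot_plan` and `restore_plan` are hypothetical, the `vmstate` and `memory` file names are assumptions, and the field names follow Firecracker's snapshot documentation as I understand it, so they may differ between Firecracker versions. The sketch only builds the requests; it does not send them over the API socket.

```python
from pathlib import PurePosixPath


def snapshot_plan(step_dir: str) -> list[tuple[str, str, dict]]:
    """Requests to pause a running microVM, write a full snapshot, and resume.

    Each entry is (HTTP method, API path, JSON body). The 'vmstate' and
    'memory' file names are hypothetical.
    """
    base = PurePosixPath(step_dir)
    return [
        ("PATCH", "/vm", {"state": "Paused"}),
        ("PUT", "/snapshot/create", {
            "snapshot_type": "Full",
            "snapshot_path": str(base / "vmstate"),
            "mem_file_path": str(base / "memory"),
        }),
        ("PATCH", "/vm", {"state": "Resumed"}),
    ]


def restore_plan(step_dir: str) -> list[tuple[str, str, dict]]:
    """Request for a freshly started Firecracker process to load a snapshot."""
    base = PurePosixPath(step_dir)
    return [
        ("PUT", "/snapshot/load", {
            "snapshot_path": str(base / "vmstate"),
            "mem_backend": {
                "backend_type": "File",
                "backend_path": str(base / "memory"),
            },
            "resume_vm": True,  # start running immediately after loading
        }),
    ]


# Show the sequence for a hypothetical step directory.
for method, path, body in snapshot_plan("/tmp/snapshots/step2"):
    print(method, path, body)
```

Separating "what to send" from "how to send it" keeps the sequence easy to inspect and test; a real client would issue these requests over the Unix domain socket that Firecracker exposes for its API.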
    for cp in sb.history():
        print(f"step {cp['step']}: {cp['command']}")
        if cp['trace']:
            for f in cp['trace']['files_created']:
                print(f"  + {f['path']} ({f['size']} bytes)")

    step 1: import pandas as pd
    step 2: df = pd.read_csv('/data/input.csv')
      ~ /data/input.csv (read)
    step 3: df['total'] = df.price * df.qty
    step 4: df.to_csv('/output/result.csv')
      + /output/result.csv (1247 bytes)

    bunkervm diff session-abc session-def

    Agent Diff
    Session A: abc (12 steps, 3400ms)
    Session B: def (8 steps, 1200ms)

    Files only in A: /tmp/debug.log, /tmp/retry_3.py
    Files only in B: /output/result.csv

    step 1 [same] import pandas as pd
    step 2 [same] df = pd.read_csv('/data/input.csv')
    step 3 [diff] A: df = df.dropna()
                  B: df = df.fillna(0)
    step 4 [diff] A: # crashed — KeyError: 'total'
                  B: df['total'] = df.price * df.qty  ← OK

Agent A dropped rows and lost a required column. Agent B filled missing values and succeeded. Without diff, you'd never know why.

    pip install bunkervm

    from bunkervm import run_code

    result = run_code("print('Hello from a microVM!')")
    print(result)  # Hello from a microVM!

VM boots, code runs, VM dies. Your host was never touched.

    AI Agent
        │
        ▼
    bunkervm (host) ──vsock──▶ Firecracker MicroVM
        │                      ┌────────────────────┐
        │  record=True         │ Alpine Linux       │
        │  ─────────▶          │ Own kernel         │
        │  snapshot()          │ exec_agent.py      │
        │  trace()             │ (filesystem trace) │
        │  restore()           └────────────────────┘
        │                      KVM hardware isolation
        ▼
    ~/.bunkervm/sessions/      ~/.bunkervm/snapshots/
      session-abc.json           step1/ vmstate + memory
      session-def.json           step2/ vmstate + memory

Firecracker provides the isolation.
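The step-by-step comparison in the `bunkervm diff` output above can be approximated in a few lines. The sketch below is an illustration only, not BunkerVM's implementation: it assumes each session is simply a list of command strings (the real session JSON format is not shown in the source) and compares steps by position, labelling each one `same` or `diff`. The function name `diff_sessions` is hypothetical.

```python
from itertools import zip_longest
from typing import Optional


def diff_sessions(
    a: list[str], b: list[str]
) -> list[tuple[int, str, Optional[str], Optional[str]]]:
    """Compare two command lists position by position.

    Returns (step_number, label, command_a, command_b) tuples. The label is
    'same' when both sessions ran the identical command at that step and
    'diff' otherwise. If one session is shorter, its missing steps are None.
    """
    rows = []
    for step, (cmd_a, cmd_b) in enumerate(zip_longest(a, b), start=1):
        label = "same" if cmd_a == cmd_b else "diff"
        rows.append((step, label, cmd_a, cmd_b))
    return rows


# Hypothetical sessions modelled on the example output in the text.
session_a = [
    "import pandas as pd",
    "df = pd.read_csv('/data/input.csv')",
    "df = df.dropna()",
]
session_b = [
    "import pandas as pd",
    "df = pd.read_csv('/data/input.csv')",
    "df = df.fillna(0)",
    "df['total'] = df.price * df.qty",
]

for step, label, cmd_a, cmd_b in diff_sessions(session_a, session_b):
    if label == "same":
        print(f"step {step} [same] {cmd_a}")
    else:
        print(f"step {step} [diff] A: {cmd_a}")
        print(f"              B: {cmd_b}")
```

Positional comparison is the simplest possible choice and breaks down when one agent inserts an extra step early on. A more forgiving variant could align the two command lists first, for example with `difflib.SequenceMatcher` from the standard library.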
BunkerVM adds the instrumentation layer:

| Layer | What it does |
|---|---|
| exec_agent (inside VM) | Traces filesystem changes per command: files created, modified, deleted, bytes written |
| Firecracker API (host→VM) | Pauses VM, snapshots CPU + memory state to disk, resumes, all via Firecracker's built-in snapshot API |
| Snapshot manager (host) | Stores and indexes snapshots at ~/.bunkervm/snapshots/, manages lifecycle |
| Session recorder (host) | Chains commands → traces → snapshots into a replayable session JSON |

No custom kernel modules. No eBPF. No ptrace. The VM is the isolation boundary; the API socket is the control plane. Pure Python, stdlib-only transport.

Every command execution can return a trace of what changed on disk.

    result = client.exec("python3 train.py", trace=True)
    print(result["trace"])
    # {
    #   "files_created": [{"path": "/output/model.pkl", "size": 4820}],
    #   "files_modified": [{"path": "/tmp/loss.log", "old_size": 0, "new_size": 312}],
    #   "files_deleted": [],
    #   "bytes_written": 5132
    # }

This happens inside the VM as a pre/post filesystem snapshot diff. No host-side hooks, no strace, no overhead on non-traced commands.

Full VM state (CPU, memory, filesystem) is saved to disk. Restore boots a new Firecracker process from that state instead of cold-booting.

    from bunkervm import Sandbox

    with Sandbox() as sb:
        sb.run("import torch; model = torch.load('bert.pt')")
        sb.checkpoint("model-loaded")        # snapshot: 45ms
        sb.run("output = model(bad_input)")  # crashes
        sb.restore(step=1)                   # restore: […]

    […] --trace                        # replay recorded session
    bunkervm diff                      # compare two agent runs
    bunkervm snapshot list             # list VM snapshots
    bunkervm snapshot delete           # delete a snapshot
    bunkervm server --transport sse    # MCP server
    bunkervm info                      # system readiness check

See CONTRIBUTING.md.

See SECURITY.md.

License: Apache-2.0

If BunkerVM helps you build safer agents, star the repo.
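The "pre/post filesystem snapshot diff" described in the tracing section can be illustrated with the standard library alone. The sketch below shows the general technique under stated assumptions and is not the actual `exec_agent` code: the helper names `scan` and `trace_diff` are hypothetical, files are compared by size and modification time only, the shape of `files_deleted` entries is a guess (the sample trace shows an empty list), and `bytes_written` is computed as the size of created files plus the growth of modified files. That formula agrees with the sample trace (4820 + 312 = 5132) but is otherwise an assumption.

```python
import os
import tempfile


def scan(root: str) -> dict[str, tuple[int, int]]:
    """Map each file path under root to (size, mtime_ns)."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between walk and stat
            state[path] = (st.st_size, st.st_mtime_ns)
    return state


def trace_diff(before: dict, after: dict) -> dict:
    """Diff two scans into a trace shaped like the example in the text."""
    created = [
        {"path": p, "size": after[p][0]}
        for p in sorted(after.keys() - before.keys())
    ]
    deleted = [{"path": p} for p in sorted(before.keys() - after.keys())]
    modified = [
        {"path": p, "old_size": before[p][0], "new_size": after[p][0]}
        for p in sorted(before.keys() & after.keys())
        if before[p] != after[p]  # size or mtime changed
    ]
    grown = sum(max(m["new_size"] - m["old_size"], 0) for m in modified)
    return {
        "files_created": created,
        "files_modified": modified,
        "files_deleted": deleted,
        "bytes_written": sum(c["size"] for c in created) + grown,
    }


# Usage: scan a scratch directory, simulate a command, scan again.
with tempfile.TemporaryDirectory() as root:
    log = os.path.join(root, "loss.log")
    open(log, "w").close()                  # pre-existing empty file
    before = scan(root)
    with open(log, "w") as fh:              # the "command" rewrites the log
        fh.write("x" * 312)
    with open(os.path.join(root, "model.pkl"), "wb") as fh:
        fh.write(b"\0" * 4820)              # and creates a new file
    trace = trace_diff(before, scan(root))

print(trace["bytes_written"])  # 5132
```

Comparing metadata keeps the scan cheap, but it would miss an in-place edit that preserves both size and modification time. Hashing file contents would catch that case at the cost of reading every file twice per traced command.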
This analysis was written by the Genesis Park editorial team with the help of AI. The original can be found via the source link.