I built a time-travel debugger for AI agents.

Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

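Debugging a multi-step AI agent usually means patching the code and re-running every step from scratch. 'Agent VCR', a newly released open-source tool, addresses this by letting you return to a specific point in a run and continue execution from there. It records every agent step frame by frame; when an error occurs, you go back to the offending frame, edit the state, and resume from that point. It also offers git-backed ACID transactions whose rollback physically reverts files on disk, a cache that stores successful runs and replays them instantly without spending tokens, and both a dashboard and a TUI. Recording overhead is kept under 5 ms per frame. The tool integrates with major frameworks such as LangGraph and CrewAI in a single line of code, which can substantially cut development time and API token costs.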
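The record, rewind, edit, and resume cycle summarized above can be illustrated with a small in-memory sketch. This is not Agent VCR's implementation or API: `run_from`, `plan`, `act`, and the state keys are hypothetical names chosen for illustration, and the snapshots are plain deep copies of a dict.

```python
import copy
from typing import Callable, List

Step = Callable[[dict], dict]


def run_from(steps: List[Step], frames: List[dict], start: int) -> List[dict]:
    """Run steps[start:] and return the updated list of snapshots.

    frames[i] is the state *before* step i runs, so after a full run the
    last entry is the final state.
    """
    frames = frames[: start + 1]           # drop snapshots after the rewind point
    state = copy.deepcopy(frames[start])   # never mutate a stored snapshot
    for step in steps[start:]:
        state = step(state)
        frames.append(copy.deepcopy(state))
    return frames


calls = []  # tracks which steps actually executed


def plan(state: dict) -> dict:
    calls.append("plan")
    return {**state, "plan": f"answer: {state['prompt']}"}


def act(state: dict) -> dict:
    calls.append("act")
    return {**state, "result": state["plan"].upper()}


steps = [plan, act]

# First run: records a snapshot before and after every step.
frames = run_from(steps, [{"prompt": "draft"}], start=0)

# Rewind to frame 1 (the state just before `act`), edit it, and resume.
frames[1]["plan"] = "answer: fixed"
frames = run_from(steps, frames, start=1)

print(frames[-1]["result"])  # ANSWER: FIXED
print(calls)                 # ['plan', 'act', 'act']
```

In this sketch `plan` runs only once even though the chain finished twice, which is the saving the summary describes: steps before the rewind point are not re-executed.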

Body

Time-travel debugging for AI agents.

Building multi-step AI agents (LangGraph, CrewAI, OpenHands) is painfully slow to debug. When your agent fails on step 8 out of 10, observability tools like LangSmith or LangFuse only show you what went wrong. To fix it, you patch the code and re-run all 10 steps from scratch. Every typo costs minutes of wall time and dollars in wasted tokens.

Agent VCR records your agent's complete state at every step. When something breaks, you rewind to the failing step, edit the state, and resume execution from that exact point. No re-running the whole chain.

LangSmith shows you what happened. Agent VCR lets you change it.

    pip install ai-agent-vcr

    from agent_vcr import VCRRecorder, VCRPlayer

    # Record your agent
    recorder = VCRRecorder()
    recorder.start_session("bug_hunt")
    # ... your agent code ...
    recorder.save()

    # Time-travel and fix
    player = VCRPlayer.load(".vcr/bug_hunt.vcr")
    state = player.goto_frame(2)       # jump to step 2
    state["prompt"] = "Fixed prompt"   # fix the state
    player.resume(from_frame=2)        # continue from there

- Time Travel: Jump back to any step. Full state snapshot at every node.
- State Injection & Resume: Edit the state at any frame (fix a prompt, patch tool output, inject context), then resume mid-chain.
- ACID Transactions: Wrap agent execution in real database-style transactions backed by git. Rollback physically reverts files on disk, not just in-memory state.
- Golden Run Cache: Save successful runs as replayable paths. Next time you hit the same task, skip all LLM calls. Same task, zero tokens, instant.
- React Dashboard: Run vcr-server, open localhost:8000. Glassmorphism UI for inspecting state, viewing JSON diffs, live WebSocket streaming.
- TUI Debugger: Run vcr-tui in your terminal. Navigate frames, press e to edit state, press r to resume.
- Visual Diffs: Color-coded state mutation tracking in Dashboard and TUI.
- DAG Visualization: See parallel execution branches, search/filter sessions by tags.
- Framework Agnostic: 1-line integration with LangGraph, CrewAI, or raw Python.
- Git-Friendly Storage: JSONL files, version controllable, append-only.
- Production Safe: [...]

    [...] str: return "findings..."

Install with:

    pip install "ai-agent-vcr[crewai]"

See examples/crewai_integration.py for a full runnable demo.

Agent VCR uses JSONL (JSON Lines):

    {"type": "session", "data": {"session_id": "abc123", "created_at": "2024-01-01T00:00:00Z", ...}}
    {"type": "frame", "data": {"frame_id": "...", "node_name": "planner", "input_state": {...}, "output_state": {...}, ...}}
    {"type": "frame", "data": {...}}

- Human-readable
- Git-diffable
- Append-only (efficient for streaming)
- Line-by-line parsing (no need to load the entire file)

Recording overhead is continuously benchmarked in CI to stay under 5ms per frame.

    pytest tests/benchmarks/ -v

    class VCRRecorder:
        def start_session(
            self,
            session_id: str = None,
            parent_session_id: str = None,
            forked_from_frame: int = None,
            metadata: dict = None,
            tags: list[str] = None,
        ) -> Session
        def record_step(
            self,
            node_name: str,
            input_state: dict,
            output_state: dict,
            metadata: FrameMetadata = None,
            frame_type: FrameType = FrameType.NODE_EXECUTION,
        ) -> Frame
        def record_llm_call(...)
        def record_tool_call(...)
        def record_error(...)
        def save(self) -> Path
        def fork(self, from_frame: int, ...) -> VCRRecorder

    class VCRPlayer:
        @classmethod
        def load(cls, filepath: str) -> VCRPlayer
        def goto_frame(self, index: int) -> dict
        def get_frame(self, index: int) -> Frame
        def list_nodes(self) -> list[str]
        def get_errors(self) -> list[Frame]
        def compare_frames(self, a: int, b: int) -> dict
        def resume(self, agent_callable: Callable, config: ResumeConfig) -> str
        def export_state(self, frame_index: int) -> dict

    class ACIDWorkspace:
        def __init__(self, workspace_path: str, recorder: VCRRecorder = None)
        def begin(self, session_id: str) -> None
        def savepoint(self, state: dict, node_name: str) -> None
        def rollback(self, to_frame_index: int) -> None
        def commit(self) -> None

    class GoldenRunCache:
        def __init__(self, cache_dir: str = ".vcr/golden")
        def save_golden_run(self, task: str, recorder: VCRRecorder) -> str
        def replay(self, task: str) -> tuple[list[dict], CostLedger]
        def invalidate(self, task: str) -> bool

    class ResumeConfig:
        from_frame: int                # Frame to resume from
        new_session_id: str = None     # Optional ID for forked session
        state_overrides: dict = {}     # State changes to apply
        mode: ResumeMode = FORK        # FORK, REPLAY, or MOCK
        skip_nodes: list[str] = []     # Nodes to skip during replay
        inject_mocks: dict = {}        # Mock values for dependencies

See the examples/ directory:

- basic_usage.py: Recording and playback
- time_travel_demo.py: Full time-travel workflow
- langgraph_integration.py: LangGraph auto-instrumentation
- acid_golden_run.py: ACID transactions and Golden Run Cache

    python examples/acid_golden_run.py

Contributions welcome. See CONTRIBUTING.md for guidelines.

    git clone https://github.com/agent-vcr/agent-vcr.git
    cd agent-vcr
    pip install -e ".[dev]"
    pytest tests/unit/ -v
    pyte
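The JSONL storage format described earlier is what makes line-by-line parsing possible. The following is a minimal sketch using only the Python standard library, not Agent VCR's own loader. Only the "type"/"data" record shape is taken from the format shown in the README; the function name `iter_frames` and the sample records are assumptions made for illustration.

```python
import io
import json
from typing import Iterator, TextIO


def iter_frames(stream: TextIO) -> Iterator[dict]:
    """Yield the "data" payload of each frame record, one line at a time.

    The stream is consumed lazily, so a large recording never has to be
    loaded into memory in full.
    """
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if record.get("type") == "frame":
            yield record["data"]


# Hypothetical sample recording; a real one would be opened with open(path).
sample = io.StringIO(
    '{"type": "session", "data": {"session_id": "abc123"}}\n'
    '{"type": "frame", "data": {"node_name": "planner", '
    '"input_state": {}, "output_state": {"plan": "search"}}}\n'
    '{"type": "frame", "data": {"node_name": "researcher", '
    '"input_state": {"plan": "search"}, "output_state": {"notes": "findings"}}}\n'
)

node_names = [frame["node_name"] for frame in iter_frames(sample)]
print(node_names)  # ['planner', 'researcher']
```

Because every record is a complete JSON document on its own line, a reader can stop early (for example, at the first frame of interest) without parsing the rest of the file.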

This analysis was written by the Genesis Park editorial team with the help of AI. The original can be found via the source link.
