Show HN: Volnix – A World Engine for AI Agents

Original source: Hacker News · Summarized and analyzed by Genesis Park

Summary

Volnix is a world engine that provides realistic simulated environments for AI agents. Going beyond a simple test harness, it evaluates every agent action through a 7-step governance pipeline that applies budgets, policies, and causality. Users can connect their own agents or deploy internal agent teams, and the system ships with 13 built-in blueprints backed by 11 verified service packs such as Stripe, Slack, and GitHub. This enables complex multi-agent coordination, scenario simulation, and synthetic data generation across domains like support, finance, and DevOps. The system requires Python 3.12 or later and an LLM API key, and a real-time dashboard lets you monitor the event feed and scorecards.

Article

Programmable worlds for AI agents. Volnix creates living, governed realities for AI agents. Not mock servers. Not test harnesses. Complete worlds with stateful services, policies that push back, budgets that run out, NPCs that follow up and escalate, and consequences that cascade. Worlds are defined in YAML, run on their own timelines, and score every agent that interacts with them.

Requirements: Python 3.12+, uv (recommended), and at least one LLM API key (`GOOGLE_API_KEY`, `OPENAI_API_KEY`, or `ANTHROPIC_API_KEY`). See docs/llm-providers.md for supported providers.

```shell
pip install volnix
export GOOGLE_API_KEY=...   # or OPENAI_API_KEY / ANTHROPIC_API_KEY
volnix check                # verify setup
volnix serve dynamic_support_center --internal agents_dynamic_support --port 8080
```

Or from source:

```shell
git clone https://github.com/janaraj/volnix.git && cd volnix
uv sync --all-extras
export GOOGLE_API_KEY=...
uv run volnix serve dynamic_support_center --internal agents_dynamic_support --port 8080

# Dashboard (separate terminal)
cd volnix-dashboard && npm install && npm run dev   # http://localhost:3000
```

With the venv activated (`source .venv/bin/activate`), you can run `volnix` directly instead of `uv run volnix`.

Note: the React dashboard is only available when installed from source. The pip package includes the full backend and CLI.

Volnix supports two modes — connect your own agents to a governed world, or deploy internal agent teams that collaborate autonomously.
```
Mode 1: Connect Your Own Agent          Mode 2: Deploy Internal Agent Teams
──────────────────────────────          ───────────────────────────────────
Your Agent (any framework)              Mission + Team YAML
          │                                       │
          ▼                                       ▼
Gateway (MCP/REST/SDK)                  Lead Agent ──▶ Slack ◀── Agent N
          │                                  │            ▲
          ▼                                  ▼            │
┌──────────────────────┐                ┌──────────────────────┐
│ Volnix World         │                │ Volnix World         │
│ 7-Step Pipeline      │                │ 7-Step Pipeline      │
│ Simulated Services   │                │ Simulated Services   │
│ Policies + Budget    │                │ Policies + Budget    │
│ Static world         │                │ Living world (NPCs)  │
└──────────┬───────────┘                └──────────┬───────────┘
           │                                       │
           ▼                                       ▼
Scorecard + Event Log                   Deliverable + Scorecard
```

Every action flows through a 7-step governance pipeline — permission, policy, budget, capability, responder, validation, commit — before it touches the world. Nothing bypasses it.

Deploy agent teams that coordinate through the world itself — posting in Slack, updating tickets, processing payments. A lead agent manages a 4-phase lifecycle (delegate → monitor → buffer → synthesize) to produce a deliverable.

```yaml
mission: >
  Investigate each open ticket. Process refunds where appropriate.
  Senior-agent handles refunds under $100. Supervisor approves over $100.
deliverable: synthesis
agents:
  - role: supervisor
    lead: true
    permissions: { read: [zendesk, stripe, slack], write: [zendesk, stripe, slack] }
    budget: { api_calls: 50, spend_usd: 500 }
  - role: senior-agent
    permissions: { read: [zendesk, stripe, slack], write: [zendesk, stripe, slack] }
    budget: { api_calls: 40, spend_usd: 100 }
```

See docs/internal-agents.md for the complete guide.

Connect any agent framework — CrewAI, PydanticAI, LangGraph, AutoGen, or plain HTTP. Your agent interacts with simulated services as if they were real. It doesn't know it's in a simulation.
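The 7-step pipeline can be pictured as a chain of checks that every action must clear before it is committed. The sketch below is purely illustrative: the step order comes from the README, but all class names, fields, and rejection semantics are assumptions, not Volnix's real API.

```python
# Minimal sketch of a 7-step governance pipeline (illustrative only).
# Step order follows the README: permission -> policy -> budget ->
# capability -> responder -> validation -> commit.
from dataclasses import dataclass, field

@dataclass
class Action:
    agent: str
    service: str
    api_cost: int = 1

@dataclass
class World:
    permissions: dict                     # agent -> set of writable services
    budgets: dict                         # agent -> remaining API calls
    log: list = field(default_factory=list)

    def submit(self, action: Action) -> bool:
        steps = [
            ("permission", lambda: action.service in self.permissions.get(action.agent, set())),
            ("policy",     lambda: True),  # e.g. refund caps; enforced for real in Volnix
            ("budget",     lambda: self.budgets.get(action.agent, 0) >= action.api_cost),
            ("capability", lambda: True),  # does the service support this verb?
            ("responder",  lambda: True),  # simulated service produces a response
            ("validation", lambda: True),  # response checked before commit
        ]
        for name, check in steps:
            if not check():
                self.log.append((name, "blocked", action.agent))
                return False               # nothing bypasses the pipeline
        # commit: only now does the action mutate world state
        self.budgets[action.agent] -= action.api_cost
        self.log.append(("commit", "ok", action.agent))
        return True

world = World(permissions={"senior-agent": {"stripe"}}, budgets={"senior-agent": 2})
assert world.submit(Action("senior-agent", "stripe"))        # allowed, budget 2 -> 1
assert not world.submit(Action("senior-agent", "zendesk"))   # blocked at permission
```

The point of the chain structure is the one the README emphasizes: the commit step is last, so no side effect can reach world state without passing every earlier gate.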
| Protocol | Endpoint | Best For |
|---|---|---|
| MCP | /mcp | Claude Desktop, Cursor, PydanticAI |
| OpenAI compat | /openai/v1/ | OpenAI SDK, LangGraph, AutoGen |
| Anthropic compat | /anthropic/v1/ | Anthropic SDK |
| Gemini compat | /gemini/v1/ | Google Gemini SDK |
| REST | /api/v1/ | Any HTTP client |

```python
# PydanticAI via MCP — zero Volnix imports
import asyncio

from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStreamableHTTP

server = MCPServerStreamableHTTP("http://localhost:8080/mcp/")
agent = Agent("openai:gpt-4.1-mini", toolsets=[server])

async def main():
    async with agent:
        result = await agent.run("Check the support queue and handle urgent tickets.")

asyncio.run(main())
```

See docs/agent-integration.md for the full guide.

- 7-step governance pipeline on every action (permission → policy → budget → capability → responder → validation → commit)
- Policy engine with block, hold, escalate, and log enforcement modes
- Budget tracking per agent (API calls, LLM spend, time)
- Reality dimensions — tune information quality, reliability, social friction, complexity, and boundaries
- 11 verified service packs — Stripe, Zendesk, Slack, Gmail, GitHub, Calendar, Twitter, Reddit, Notion, Alpaca, Browser
- BYOSP — bring any service; the compiler auto-resolves from API docs
- Multi-provider LLM — Gemini, OpenAI, Anthropic, Ollama, vLLM, CLI tools
- Real-time dashboard with event feed, scorecards, and agent timeline
- Causal graph — every event traces back to its causes
- 13 built-in blueprints across support, finance, DevOps, research, security, and marketing

Some of the things you can do with Volnix:

| Use Case | What It Means |
|---|---|
| Agent evaluation | Put your agent in a governed world, measure how it handles policies, budgets, and ambiguity |
| Multi-agent coordination | Deploy agent teams that collaborate through shared world state — not a pipeline |
| Scenario simulation | Explore "what if" scenarios with realistic services, actors, and consequences |
| Gateway deployment | Route agent actions through governance (permission, policy, budget) before they hit real APIs |
| Synthetic data generation | Generate interconnected, realistic service data (tickets, charges, customers) with causal consistency |
| PMF / product exploration | Simulate business environments to test workflows, team structures, or product decisions |

| Blueprint | Domain | Services | Agent Team |
|---|---|---|---|
| dynamic_support_center | Support | Stripe, Zendesk, Slack | agents_dynamic_support (3) |
| market_prediction_analysis | Finance | Slack, Twitter, Reddit | agents_market_analysts (3) |
| incident_response | DevOps | Slack, GitHub, Calendar | — |
| climate_research_station | Research | Slack, Gmail | agents_climate_researchers (4) |
| feature_prioritization | Product | Slack | agents_feature_team (3) |
| security_posture_assessment | Security | Slack, Zendesk | agents_security_team (3) |

```shell
volnix blueprints   # list all
volnix serve market_prediction_analysis \
  --internal agents_market_analysts --port 8080
```

See docs/blueprints-reference.md for the full catalog.

```shell
cd volnix-dashboard && npm install && npm run dev   # http://localhost:3000
```

Live event streaming, governance scorecards, policy trigger logs, deliverable inspection, agent activity timeline, entity browser.
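The causal-graph feature (every event traces back to its causes) can be pictured as events that carry pointers to the IDs of the events that produced them. This is a minimal illustration under assumed names; `Event` and `trace_causes` are hypothetical, not Volnix's actual data model.

```python
# Sketch of a causal event graph: each event records the IDs of the
# events that caused it, so any outcome can be walked back to its roots.
from dataclasses import dataclass, field

@dataclass
class Event:
    id: str
    kind: str
    causes: list = field(default_factory=list)  # IDs of causing events

def trace_causes(events: dict, event_id: str) -> list:
    """Walk backwards from event_id, returning all ancestor event IDs."""
    seen, stack = [], [event_id]
    while stack:
        current = events[stack.pop()]
        for cause in current.causes:
            if cause not in seen:
                seen.append(cause)
                stack.append(cause)
    return seen

events = {
    "e1": Event("e1", "ticket_opened"),
    "e2": Event("e2", "refund_issued", causes=["e1"]),
    "e3": Event("e3", "slack_followup", causes=["e2"]),
}
print(trace_causes(events, "e3"))  # ['e2', 'e1']
```

With this shape, the dashboard question "why did this Slack follow-up happen?" becomes a graph walk from the event back through the refund to the original ticket.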
| Guide | Description |
|---|---|
| Getting Started | Installation, first run, connecting agents |
| Creating Worlds | World YAML schema, reality dimensions, seeds |
| Internal Agents | Agent teams, lead lifecycle, deliverables |
| Agent Integration | MCP, REST, SDK, framework adapters |
| Blueprints Reference | Complete catalog of blueprints and pairings |
| Service Packs | Verified packs, YAML profiles, BYOSP |
| LLM Providers | Provider types, tested models, routing |
| Configuration | TOML config, LLM routing, tuning |
| Architecture | Two-half model, 10 engines, pipeline |
| Vision | World memory, generative worlds, visual reality |

```shell
uv sync --all-extras                 # install
uv run pytest                        # test (2800+ tests)
uv run ruff check volnix/            # lint
uv run ruff format --check volnix/   # format
```

See CONTRIBUTING.md for development setup and PR process.

- Context Hub by Andrew Ng — curated, versioned documentation for coding agents. Volnix uses Context Hub for dynamic API schema extraction during service profile resolution.

MIT License. See LICENSE for details.

This analysis was written by the Genesis Park editorial team with the help of AI. The original can be found via the source link.
