AI 에이전트를 위한 실제 환경

hackernews | 🔬 Research
#ai agents #review #real environments #infrastructure engineering #kubernetes
Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

Research on enabling AI agents to operate in real environments is progressing rapidly. The emergent.sh project is one such effort, focused on building technology that helps AI understand and adapt to real-world complexity. With the goal of improving agent performance in real environments, the team is working through a range of technical challenges. This work is expected to give AI agents more practical problem-solving capabilities.

Full Text

Engineering · Real Environments for AI Agents: Why We Bet on Kubernetes

This post kicks off a multi-part series on how we built durable, persistent environments for agents. Part 1 covers the architecture: why we chose Kubernetes over VMs and lightweight sandboxes, how we solve persistence with a single content-addressed approach, and how isolation works when you give root-equivalent access to an AI agent in every pod. In Part 2, we cover the scaling layer: the warm pool, multi-cluster orchestration across multi-region production clusters, and what we learned the hard way.

Introduction: What Emergent Does

Emergent is an AI-powered software development platform. You describe the product you want to build (a SaaS dashboard, a REST API, a data pipeline) and our agent takes it from there: planning the work, writing the code, installing dependencies, running tests, and deploying a live, working application. The goal is to compress the time between an idea and a deployed product from days to minutes.

When a user describes what they want to build, Emergent assigns an AI agent to a private workspace. The agent writes code, runs tests, and sets everything up automatically. Once it is ready, the user gets a live preview link to see the app in action. If everything looks good, they can deploy it to production with a single click and get a hosted link to share with anyone.

Building and running real software is not just about writing code. It requires a place to work.

Why the Environment Matters

Consider how a software engineer works in a remote development environment. They SSH into it, write and edit code directly on the filesystem, install system packages and language dependencies, run tests, check logs, and trigger deployments, all from the same environment. The environment has memory, persistent state, and a network. It behaves like the developer's local environment because it essentially is a close replica of it.
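To make the "full environment in a pod" idea concrete, here is a minimal, hypothetical pod manifest in the spirit the post describes; all names, the image, and the resource values are illustrative assumptions, not Emergent's actual configuration. It gives an agent a real Linux userland, a persistent workspace volume, and an explicit compute budget:

```yaml
# Hypothetical per-agent workspace pod; names and values are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: agent-workspace-1234
  labels:
    app: agent-workspace
spec:
  containers:
    - name: workspace
      image: ubuntu:22.04            # full Linux userland, not a stripped sandbox
      command: ["sleep", "infinity"] # keep the pod alive for the agent session
      resources:
        requests: { cpu: "1", memory: 2Gi }
        limits:   { cpu: "2", memory: 4Gi }  # per-agent compute budget
      volumeMounts:
        - name: workspace-data
          mountPath: /workspace      # filesystem the agent reads and writes
  volumes:
    - name: workspace-data
      persistentVolumeClaim:
        claimName: agent-workspace-1234-pvc  # survives pod restarts
```

The key design point this sketch mirrors is that the agent gets a stateful filesystem and normal package tooling (apt, pip, npm) rather than a constrained runtime; the persistence and isolation machinery around it are what the rest of the series covers.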
An agent doing software development needs the same thing. With a real environment, an agent can run the code it generates, verify assumptions, and test the output. The environment is foundational for closing the development loop, and it elevates the agent from a code generator into a system that can build and ship software. The output is production-grade because the agent operates in a production-grade environment. That's the thesis.

The Fundamental Challenge

We launched Emergent eight months ago. We grew faster than any of our initial infrastructure assumptions allowed, from a few hundred concurrent environments to over 30K (and continuously growing). The environment management system had to evolve at the same pace. The one conviction that held through all of it: AI coding agents need real infrastructure, not lightweight sandboxes dressed up as infrastructure. Each environment is a full Linux system with its own filesystem, network isolation, and compute budget.

The difficulty isn't just provisioning at scale; it's solving for three distinct imperatives simultaneously:

Startup speed: spinning environments up fast enough to keep pace with user demand
State durability: reliably persisting work to survive any disaster
Clean teardown: releasing resources completely when sessions end

All of this while working against the grain of Kubernetes, which is explicitly designed to treat pods as disposable. Every architectural decision in this series flows from that tension.

Two Common Approaches

The VM approach: accurate but slow

VMs are honest. Give an agent a full VM, a dedicated GCE instance or EC2 environment, and it gets root, a real filesystem, system packages, databases, the whole stack. The agent can produce real software: security headers, database migrations, SSL configuration, actual auth flows. The environment stops being the excuse. The problem is speed. VM provisioning takes 30 to 120 seconds.
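The post mentions a "content-addressed approach" to persistence without detailing it here. The general idea behind content addressing is standard: key each blob by a hash of its contents, so identical data is stored once and workspace snapshots are cheap. A minimal sketch of that general technique, assuming nothing about Emergent's actual implementation:

```python
import hashlib

class ContentStore:
    """Toy content-addressed blob store: blobs are keyed by SHA-256 of their bytes."""

    def __init__(self):
        # digest -> bytes; a real store would back this with disk or object storage
        self.blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # identical content is stored exactly once
        return digest

    def get(self, digest: str) -> bytes:
        return self.blobs[digest]

def snapshot(store: ContentStore, files: dict) -> dict:
    """Snapshot a workspace as a mapping of path -> content digest."""
    return {path: store.put(content) for path, content in files.items()}

# Two snapshots that share an unchanged file reuse the same blob.
store = ContentStore()
s1 = snapshot(store, {"app.py": b"print('v1')", "README": b"docs"})
s2 = snapshot(store, {"app.py": b"print('v2')", "README": b"docs"})
assert s1["README"] == s2["README"]  # unchanged file deduplicated across snapshots
assert len(store.blobs) == 3         # only three unique blobs stored in total
```

The appeal for durable agent environments is that a snapshot is just a small manifest of digests, so persisting or restoring a workspace only moves the blobs that actually changed.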
At scale, that means dedicated VMs, each with its own OS image, boot sequence, and persistent disk. The cost is significant, scaling is measured in minutes, and the user experience begins with waiting before the agent writes its first line of code.

The lightweight sandbox approach: fast but constrained

Sandboxes trade capability for speed. Many sandboxed runtimes (WebContainers, edge functions, browser-based runtimes) spin up in milliseconds, and the agent starts immediately. For simple use cases, it feels frictionless. In our workloads, the ceiling shows up quickly. The moment the agent needs a real database, a native Node addon, apt-get install, server-side rendering with a persistent backend, or an auth flow that sets secure HTTP-only cookies, typical sandbox constraints make it hard. The agent generates code fast, but the code is constrained by the environment: no real databases, no system packages, no background processes, no multi-port services. For the current generation of lightweight sandboxes, this is a structural ceiling, not a model limitation. When an agent cannot run a MongoDB migration, cannot start Redis, cannot

This analysis was produced by the Genesis Park editorial team with the help of AI. The original article is available via the source link.
