Why AI Agents Keep Breaking Your Workflows
hackernews
💼 Business
#ai adoption problems
#ai agents
#workflow optimization
#frameworks
#prompt engineering
Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
Adding AI agents to a workflow often fails to pay off: teams end up spending more time debugging the AI than the AI saves. The author saw this firsthand when an agent interpreted a single line in an implementation plan as permission to skip review and testing phases, confirming that instructions alone cannot make agents respect workflow structure. To address this, the author proposes two concepts: a control plane and a data plane.
Full Text
Why Your AI Agents Keep Breaking Your Workflows

Your AI investment isn’t paying off the way you expected. You added agents to your workflows, and now your team spends more time debugging the AI than the AI saves them. So you write better prompts. Add more guardrails. Spell out every constraint. The agents still break things, just in new ways.

The prompts aren’t the problem. The architecture is.

I build and operate multi-agent systems where AI agents coordinate across multi-step workflows, handling tasks from analysis and planning through execution and verification. In one of those systems, an agent recently skipped two entire workflow phases, bypassing review, tests, and isolation checks. A single line in the implementation plan said “no worktree needed,” and the agent interpreted that as permission to shortcut the whole process. Its reasoning was locally coherent. The decision was globally catastrophic. Nothing in the prompt prevented it.

That experience confirmed something I’d been seeing across every multi-agent system I’ve worked on: instructions cannot enforce workflow structure. Only architecture can.

Two Layers: Control Plane and Data Plane

Before I explain why agents fail this way, here’s the mental model that makes everything else in this post click.

Think of it like a restaurant kitchen. The chef handles creative decisions: how to balance flavors, how to adapt when an ingredient is missing, how to plate something beautifully. The kitchen manager controls which stations are open, what’s available, and when service begins. The chef works within the structure the kitchen manager defines. Nobody asks the chef to also manage the schedule.

In an agentic system, this maps to two layers: a deterministic control plane and a probabilistic data plane.

The control plane owns the workflow. It manages the execution graph, state persistence, timeouts, and retry logic. It decides what happens next and enforces that decision.
Agents cannot skip a step the control plane hasn’t authorized.

The data plane is where agents live. They receive bounded context from the control plane, execute a discrete reasoning step, and return structured output. They don’t manage state. They don’t decide what comes next. They process and respond.

Your agents are acting as both chef and kitchen manager, and they’re not equipped for the second job.

Why Agents Can’t Be Trusted with Workflow Logic

This is a different class of failure than hallucination, and it’s harder to catch. The agent reasons its way to a wrong decision. The logic looks sound when you read the transcript. The outcome is wrong because the agent has no awareness of the larger workflow it’s operating inside.

A refund agent bypasses the 30-day return window because a customer’s message conveyed urgency. An order processing agent skips inventory verification because the previous step returned success. The phase-skipping failure I described in the opening is the same pattern: the agent found a locally reasonable shortcut that violated architectural constraints it couldn’t see. Every one of those skipped steps existed for a reason. The agent couldn’t know that, because its context window only contained the immediate task, not the architectural rationale for the workflow.

The instinct is to add more rules to the prompt. It doesn’t work. The failures are structural, not informational.

Prompt-driven state loss is the most common: as conversations grow, tool outputs and system messages fill the context window, pushing early constraints out or diluting them. The agent continues operating on an incomplete picture of its own rules.

Context overflow compounds the damage. When models compact context into summaries to stay within limits, specifics disappear.
An agent that knows it’s in phase 4 of a multi-phase workflow may, after compaction, only know it’s “working on a task.” Both failures happen silently, without throwing errors that a monitoring system could catch.

Those are the accidental failures. The adversarial ones are worse. Vibe hacking exploits the model’s responsiveness to emotional signals: a customer who expresses urgency or authority can cause an agent to skip validation steps, because the model is designed to be responsive to tone. Indirect prompt injection is more deliberate: a document the agent reads (a support ticket, an invoice, a code comment) contains instructions that redirect its behavior. No amount of prompt engineering fully prevents either, because both exploit the same context-sensitivity that makes the model useful in the first place.

Every one of these failure modes shares the same root cause: the workflow’s integrity depends on the agent remembering and respecting constraints. That’s a bet against the architecture of how language models process context. And it’s exactly why the control plane, not the agent, has to own state.

The Compounding Failure Rate

Prodigal’s analysis of multi-step workflows quantifies what anyone running agents in production already suspects. At 95% per-step accuracy, a 5-step workflow succeeds 77% of the time. At 10 steps, 60%. At 20 steps, 36%. Even at 99% per-step accuracy, a 20-step workflow fails nearly 1 in 5 runs.

Deterministic software doesn’t work this way. When a function call fails, you get an error. When an agent makes a locally reasonable but globally wrong decision, you get an HTTP 200 (the standard success status, which says only that the request completed) and a corrupted business process. The response tells you nothing about whether the right thing happened.

Building the Control Plane

The two-layer separation I described earlier eliminates the core failure mode. State lives in deterministic code, not in a context window.
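As a minimal sketch of that separation, consider the loop below. All names are illustrative, and `runAgentStep` is a stubbed stand-in for a model call; the point is only where the decisions live.

```typescript
// Sketch of the control-plane / data-plane split.
// Illustrative only: runAgentStep stands in for a model call.

type StepResult = { status: "ok" | "error"; output: string };

// Data plane: one bounded, stateless reasoning step.
function runAgentStep(step: string, context: string): StepResult {
  return { status: "ok", output: `${context} -> ${step}` };
}

// Control plane: owns step order, state, and transition decisions.
// The agent never chooses what happens next.
function runWorkflow(steps: string[]): string[] {
  const executed: string[] = [];
  let context = "start";
  for (const step of steps) {
    const result = runAgentStep(step, context);
    if (result.status !== "ok") {
      // Deterministic recovery path, not an open-ended agent loop.
      throw new Error(`step "${step}" failed; halting workflow`);
    }
    context = result.output; // bounded context handed to the next step
    executed.push(step);
  }
  return executed;
}
```

Because the step list is data the orchestrator owns, an agent that “decides” to skip review never gets the chance: it is only ever handed one step at a time.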
If a model hallucinates or fails, the control plane catches the schema validation error and triggers a retry or escalation. Not an unconstrained recovery loop.

Two additional patterns address specific failure modes that show up when agents interact with multiple systems or with each other.

Transaction recovery. A workflow that processes a refund, updates inventory, and sends a confirmation email modifies three independent systems. If the email step fails, nothing rolls back the first two. The Saga pattern pairs every forward action with a predefined compensating action. If any step fails, the orchestrator fires compensating commands for everything that already succeeded. This matters for agents specifically because their failures are often silent: a hallucinated parameter might cause a downstream API to reject a request, but the agent won’t recognize that as a failure requiring compensation. The control plane does.

Typed handoffs. Agent-to-agent handoffs are where workflows fracture. Passing natural language between agents creates ambiguity. “Handle the customer ticket” could mean close it, escalate it, or email the customer. All reasonable interpretations. None predictable. Typed schemas eliminate this by enforcing that agent output is serialized into a predefined structure before it’s passed anywhere. The receiving agent gets structured data, not prose.

Action-selector patterns go further. Instead of letting the model output a command, it outputs an action identifier:

```json
{
  "action_id": "REFUND_APPROVE",
  "parameters": { "order_id": "8821", "amount": 49.99 }
}
```

The orchestrator maps that identifier to a hardcoded function. The model’s output is treated as data, not as executable instructions. This is the agentic equivalent of parameterized queries: it closes off an entire class of injection and bypass vulnerabilities.
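That dispatch step can be sketched as a fixed handler table. The handler names and shapes here are hypothetical, not taken from the article’s system:

```typescript
// Action-selector sketch: the model's output is data (an action id
// plus parameters), never executable instructions. Names illustrative.

type AgentAction = {
  action_id: string;
  parameters: Record<string, unknown>;
};

// Hardcoded handlers: only ids listed here can ever run.
const handlers: Record<string, (p: Record<string, unknown>) => string> = {
  REFUND_APPROVE: (p) => `refund approved for order ${p.order_id}`,
  REFUND_DENY: (p) => `refund denied for order ${p.order_id}`,
};

function dispatch(action: AgentAction): string {
  const handler = handlers[action.action_id];
  if (!handler) {
    // Unknown ids are rejected, not interpreted — the agentic
    // equivalent of a parameterized query rejecting injected SQL.
    throw new Error(`Unrecognized action_id: ${action.action_id}`);
  }
  return handler(action.parameters);
}
```

Anything the model emits that is not in the table fails loudly at the orchestrator, instead of silently becoming behavior.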
What This Looks Like in Practice

Here’s how these patterns work in the multi-agent development workflows we operate, where agents coordinate across phases from specification through deployment. The problem is familiar: agents skip review steps, bypass worktree checks, or commit directly to the main branch. Each bypass is locally reasonable from the agent’s perspective. None are acceptable from the workflow’s perspective.

Phase Enforcement Through Exit Codes

Each phase transition runs a gate check that returns an exit code:

0: context valid, proceed
3: wrong session, stop immediately
4+: phase-specific validation failure

```typescript
function verifyWorktreeGate(phase: number, required: boolean) {
  const context = getWindowContext();

  // Entry gate: verify context for critical phases
  if (phase === 4) {
    if (!context.worktree || !context.docs || !context.memory) {
      return { exitCode: 4, message: 'Missing required context' };
    }
  }

  // Worktree gating: enforce session isolation
  if (required && !isInWorktree()) {
    return { exitCode: 3, message: 'Wrong session' };
  }

  return { exitCode: 0 };
}
```

Build completion markers like READY_FOR_TEST_VERIFY appear before phase transitions. The archive gate runs a multi-condition pre-flight check before finalization. The workflow cannot proceed without satisfying all validation requirements. The agent doesn’t get a vote.

State That Doesn’t Depend on Memory

Exit codes handle individual transitions. But the deeper principle is that state cannot live in a context window. Context windows are volatile, lossy, and invisible to the control plane. State has to live in files.

Think of it as the difference between checking a single door lock and running a building-wide security sweep. Exit codes are the door locks. File-based state management is the sweep.

In practice, this works at several levels. Volatile execution state is tracked through schema-validated edits. Frozen requirements and architecture documents carry status markers that prevent modification.
A three-tier rule hierarchy — SYSTEM, AGENT, COMMAND — keeps agents from overriding system-level constraints. And manifest integrity is tracked with SHA256 checksums to detect drift or corruption.

Templates are the source of truth. Runtime files derive from templates and are never edited directly. If the agent wants to know what phase it’s in, it reads a file. If the control plane wants to verify what happened, it reads a file. No one asks the model to remember.

The impact is measurable. Before implementing this architecture, a multi-agent workflow with 15+ phases had a 36% success rate per run. After adding deterministic gates and file-based state, the same workflow runs at 95%+ reliability, with debugging time dropping from hours of transcript analysis to minutes of log review.

When to Enforce, When to Let Go

Every deterministic intervention has a cost. The question is whether the reliability gain justifies it.

Enforce hard where the cost of bypass is high: security validations and data sanitization, human-in-the-loop approvals for financial or production changes, multi-system transactions where partial completion creates inconsistency, and compliance workflows that require audit trails.

Allow flexibility where the cost of bypass is low: brainstorming and exploratory research, read-only operations where no state is modified, single-system workflows where rollback is trivial, and development/testing environments.

Start with standard enforcement for production workloads. Gate at phase transitions and cross-system boundaries, not at every step. Over-gating is the most common mistake. Too many checkpoints slow the workflow without improving reliability.

When something unexpected happens, ask the agent to explain its reasoning. “Why did you skip the commit?” often surfaces a gap in the constraints that rules alone can’t catch. Use those answers to tune your gates.
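The manifest-integrity idea mentioned earlier (SHA256 checksums over file-based state) can be sketched roughly as follows. The shapes and names are assumptions for illustration, using Node’s built-in crypto module, with file contents inlined as strings instead of read from disk:

```typescript
import { createHash } from "node:crypto";

// Illustrative sketch of manifest-integrity checking: hash each state
// file and compare against a stored manifest to detect drift.

function sha256(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

type Manifest = Record<string, string>; // file name -> expected hash

function buildManifest(files: Record<string, string>): Manifest {
  const manifest: Manifest = {};
  for (const [name, content] of Object.entries(files)) {
    manifest[name] = sha256(content);
  }
  return manifest;
}

function detectDrift(manifest: Manifest, files: Record<string, string>): string[] {
  // Any file whose current hash differs from the manifest has drifted.
  return Object.entries(manifest)
    .filter(([name, expected]) => sha256(files[name] ?? "") !== expected)
    .map(([name]) => name);
}
```

A check like this runs in the control plane, so a corrupted or agent-edited state file is caught by deterministic code rather than depending on the model to notice.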
The Shift: Deterministic Workflow, Probabilistic Agents

The problem with agents in deterministic workflows isn’t the agents. It’s the assumption that instructions alone can enforce workflow structure. They can’t. That’s not a bug. It’s how probabilistic systems work.

Start with one gate at the most critical phase transition in your workflow. Measure the consistency improvement. Add gates incrementally as you identify where bypass is causing problems. The goal isn’t maximum enforcement. It’s the minimum enforcement that produces reliable outcomes.

The agents stay probabilistic. The workflow becomes deterministic. That’s the combination that works.

These patterns come from building and operating multi-agent systems at Keryx Solutions, where we help companies make AI work in production, not just in demos. If your AI investment is creating more work than it saves, that’s usually an architecture problem.
This analysis was produced by the Genesis Park editorial team with the help of AI. The original article is available via the source link.