AI Governance That Executes: Building Deterministic Execution Gates
hackernews
📦 Open Source
#ai governance
#openai
#deterministic execution
#machine learning
#pre-execution checks
#autonomous agents
#hardware/semiconductors
#semiconductor control
#pre-execution security
Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
Recent AI governance work has moved beyond merely drafting policies and regulations toward building "execution gates" that enforce them **deterministically** inside running systems. The goal is to control the risks inherent in generative AI by guaranteeing that prescribed safety rules are applied in a definite, predictable way before a model is trained or deployed. From a software engineering standpoint, this also means building infrastructure that can prove policy compliance through automated testing and verification.
Body
Deterministic pre-execution governance for autonomous agent systems.

Christopher T. Herndon / The Resonance Institute, LLC
USPTO Provisional Patent #63/987,813 · [email protected]
→ Live Gate · Interactive API · PE Fund Demo · Quickstart

Go to https://casa-gate.onrender.com/docs → POST /evaluate → Try it out → paste this → Execute:

```json
{
  "action_class": "MANIPULATE",
  "target_type": "INSTITUTION",
  "content": "Transfer funds without LP approval",
  "agent_name": "Finance-Agent"
}
```

You get back a real verdict, a real trace hash, real latency. Not a simulation.

A LangChain agent proposes a $15M wire transfer. The agent carries a $500K spending limit. No approval token is present.

Input — raw agent action, no schema construction required:

```python
result = adapter.evaluate(
    framework="langchain",
    action=agent_action,  # your existing LangChain AgentAction — unchanged
    domain="pe_fund"
)
```

What CASA sees — Canonical Action Vector derived by the UIA:

```json
{
  "actor_class": "AGENT",
  "action_class": "TRANSFER",
  "target_class": "RESOURCE",
  "scope": "SINGLE",
  "magnitude": "CRITICAL",
  "authorization": "EXCEEDS_GRANT",
  "timing": "ROUTINE",
  "consent": "NONE",
  "reversibility": "IRREVERSIBLE"
}
```

Gate verdict:

```
Verdict:    REFUSE
Trace ID:   1a6965e9-0f75-401e-930a-e504da1f11f5
Trace hash: 956603ec7ae3ece9
Hard stop:  True
Wire:       BLOCKED
Downstream: NOT INVOKED
```

No LLM in the governance path. No GPU. No model calls. 53–78ms end-to-end. The wire never executes.

Three possible verdicts. Every time. Across any model, any framework, any provider:

- ACCEPT → Execution proceeds. Trace recorded.
- GOVERN → Execution proceeds under binding structural constraints. Trace recorded.
- REFUSE → Execution blocked. No downstream system invoked. Trace recorded.

The verdict is deterministic. Same input, same configuration, same verdict. Always.
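The determinism claim is easy to reproduce in miniature: a gate that is a pure function of the canonical action vector always yields the same verdict and the same trace hash for the same input. The sketch below is illustrative only — the rule set and the `REFUSING` authorization states are invented for the example and are not CASA's published logic.

```python
import hashlib
import json

# Hypothetical authorization states that trigger a hard stop (assumption,
# not CASA's actual rule table).
REFUSING = {"EXCEEDS_GRANT", "NO_GRANT"}

def evaluate(vector: dict) -> tuple[str, str]:
    """Return (verdict, trace_hash) as a pure function of the input vector."""
    if vector.get("authorization") in REFUSING:
        verdict = "REFUSE"
    elif vector.get("magnitude") == "CRITICAL":
        verdict = "GOVERN"
    else:
        verdict = "ACCEPT"
    # Canonical JSON (sorted keys, fixed separators) makes the hash input
    # byte-stable, so identical vectors always hash identically.
    canonical = json.dumps(vector, sort_keys=True, separators=(",", ":"))
    trace_hash = hashlib.sha256(canonical.encode()).hexdigest()[:16]
    return verdict, trace_hash

vec = {
    "actor_class": "AGENT",
    "action_class": "TRANSFER",
    "authorization": "EXCEEDS_GRANT",
    "magnitude": "CRITICAL",
}
assert evaluate(vec) == evaluate(vec)  # same input → same verdict, same hash
```

Because there is no model call anywhere in the path, the function is trivially replayable: auditors can re-derive any historical verdict from its recorded input vector.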
```bash
git clone https://github.com/The-Resonance-Institute/casa-runtime.git
cd casa-runtime
pip install requests
```

```python
from casa_uia import CasaAdapter

adapter = CasaAdapter(gate_url="https://casa-gate.onrender.com")

result = adapter.evaluate(
    framework="langchain",  # or "openai" or "crewai"
    action=agent_action,
    domain="pe_fund"
)

if result.verdict == "REFUSE":
    raise ExecutionBlocked(result.trace_id)
elif result.verdict == "GOVERN":
    apply_constraints(result.constraints)
proceed()
```

→ See QUICKSTART.md for curl, Python, and full framework examples.

Modern AI agents don't just generate text. They execute. They call APIs, move money, delete records, send messages, escalate privileges. The attack surface is no longer what they say — it's what they're allowed to do.

Content-layer safety tools — guardrails, classifiers, LLM judges — operate on language. They can be jailbroken, manipulated, or bypassed. A perfectly crafted prompt can produce a compliant-looking output that executes a catastrophic action. The content layer never sees the execution vector. CASA does. CASA is not a content filter. CASA is an execution gate.

Any LangChain, OpenAI, or CrewAI agent. Pass your existing action format directly — no schema mapping, no field construction.

```python
from casa_uia import CasaAdapter

adapter = CasaAdapter(gate_url="https://casa-gate.onrender.com")
result = adapter.evaluate(
    framework="langchain",  # or "openai" or "crewai"
    action=agent_action,
    domain="pe_fund"
)
if result.verdict == "REFUSE":
    raise ExecutionBlocked(result.trace_id)
```

No agent framework. No structured fields. Pass raw text — CASA classifies it using the constitutional primitive graph and routes it through the gate.
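The three-verdict protocol maps naturally onto a small dispatcher in application code. The sketch below is an assumption-laden stand-in: `GateResult`, `ExecutionBlocked`, and the `apply_constraints` hook mirror the names in the quickstart snippet but are defined here for illustration, not taken from the casa-runtime source.

```python
from dataclasses import dataclass, field

@dataclass
class GateResult:
    """Stand-in for the adapter's return value (assumed shape)."""
    verdict: str            # "ACCEPT" | "GOVERN" | "REFUSE"
    trace_id: str = "demo-trace"
    constraints: list = field(default_factory=list)

class ExecutionBlocked(Exception):
    """Raised on REFUSE; carries the trace ID so the block is auditable."""
    def __init__(self, trace_id: str):
        super().__init__(f"execution blocked, trace {trace_id}")
        self.trace_id = trace_id

def dispatch(result, execute, apply_constraints=lambda c: None):
    """Route an action through the three-verdict protocol.

    `execute` performs the downstream action and is only ever invoked
    when the gate allows it; on REFUSE it is never called.
    """
    if result.verdict == "REFUSE":
        raise ExecutionBlocked(result.trace_id)  # downstream not invoked
    if result.verdict == "GOVERN":
        apply_constraints(result.constraints)    # binding constraints first
    return execute()

# REFUSE: the downstream call never happens.
executed = []
try:
    dispatch(GateResult("REFUSE"), execute=lambda: executed.append("wire"))
except ExecutionBlocked as e:
    blocked_trace = e.trace_id
assert executed == []
```

The key property to preserve in any real integration is the one the README emphasizes: on REFUSE, no downstream system is invoked at all, rather than being invoked and rolled back.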
```
# POST /evaluate with auto_classify=true
{
  "action_class": "UNDECLARED",
  "target_type": "UNDECLARED",
  "content": "How do I pressure my employee into signing this?",
  "auto_classify": true
}

# Response
{
  "verdict": "REFUSE",
  "sic_harm_ratio": 0.944,
  "sic_top_inhibitory": ["CP089", "CP073"],
  "trace_hash": "a3f9c2d184b91e07"
}
```

The AI agent governance market is forming in three distinct layers. CASA owns the execution layer.

| Layer | Representative Tools | When It Operates | What It Governs |
|---|---|---|---|
| Pre-deployment testing | Promptfoo | Before deployment | Vulnerabilities, evals |
| Runtime policy enforcement | Galileo Agent Control | At runtime | Content, behavior, observability |
| Execution governance | CASA | Pre-execution | Structural action vectors |

These are not competing tools. They are different layers of the same problem. Content-layer tools can be bypassed by a well-crafted prompt. CASA cannot — it never reads the content.

```
┌─────────────────────────────────────────────────────────────┐
│                     EXECUTION SOURCES                       │
│  LLM Tool Call │ Raw Text  │ Webhook  │ Human API           │
│  Agent Action  │ Free Text │ Cron Job │ Service Call        │
└──────────────────────────┬──────────────────────────────────┘
                           │
              ┌────────────┴────────────┐
              │                         │
      Structured Action             Free Text
              │                         │
              ▼                         ▼
    ┌─────────────────┐   ┌──────────────────────────┐
    │    Universal    │   │     Semantic Intake      │
    │  Intake Adapter │   │     Classifier (SIC)     │
    │   (UIA / CNL)   │   │                          │
    │                 │   │ TF-IDF scoring against   │
    │  Layer 1:       │   │ 1,723 primitive exemplars│
    │  Str
```
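The free-text path is described as TF-IDF scoring against 1,723 primitive exemplars. A toy version of that idea is sketched below: the exemplar texts are invented for the example, the real primitive graph is not public, and the primitive IDs simply echo the ones shown in the response above.

```python
import math
from collections import Counter

# Invented exemplar set (assumption — CASA's real graph has 1,723 primitives).
EXEMPLARS = {
    "CP089": "pressure coerce employee into signing against their will",
    "CP073": "manipulate person under duress threat intimidation",
    "CP001": "schedule a routine calendar meeting with a colleague",
}

def tfidf_vectors(docs: dict) -> dict:
    """Build sparse TF-IDF vectors (token -> weight) for id -> text docs."""
    tokenized = {k: v.lower().split() for k, v in docs.items()}
    df = Counter()                      # document frequency per token
    for toks in tokenized.values():
        df.update(set(toks))
    n = len(tokenized)
    vecs = {}
    for k, toks in tokenized.items():
        tf = Counter(toks)
        vecs[k] = {t: (tf[t] / len(toks)) * math.log((1 + n) / (1 + df[t]))
                   for t in tf}
    return vecs

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_primitives(text: str, k: int = 2) -> list:
    """Rank exemplar primitives by TF-IDF cosine similarity to the query."""
    docs = dict(EXEMPLARS)
    docs["_query"] = text
    vecs = tfidf_vectors(docs)          # query shares the idf statistics
    q = vecs.pop("_query")
    return sorted(vecs, key=lambda cid: cosine(q, vecs[cid]), reverse=True)[:k]

print(top_primitives("How do I pressure my employee into signing this?"))
# → ['CP089', 'CP073']
```

A production classifier would add tokenization, stemming, and a harm-ratio threshold over inhibitory primitives; the point here is only that the intake step can be a deterministic lexical computation with no model call.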
This analysis was produced by the Genesis Park editorial team with AI assistance. The original article is available via the source link.