Project Glasswing – Anthropic Has Crossed the Line

hackernews | April 9, 2026 00:02 | 🔬 Research

#security #cybersecurity #anthropic #chatgpt #claude #claude mythos #gpt-4 #project glasswing #review

Original source: hackernews · Summary and analysis by Genesis Park

Summary

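Anthropic has announced Project Glasswing, a cybersecurity initiative built around Claude Mythos, an unreleased AI model that leaked documents put at roughly 10 trillion parameters (a figure Anthropic has not confirmed). The model is reported to have autonomously discovered thousands of zero-day vulnerabilities across major operating systems and web browsers, including a security bug as old as 27 years, and to have turned vulnerabilities into working exploit code. Anthropic is committing $100 million in usage credits plus $4 million in donations to open-source security organizations, and plans to restrict access to roughly 50 selected partner organizations, including Microsoft, Google, and Apple.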

Body

Project Glasswing, Claude Mythos, and the New Shape of Cybersecurity

What Actually Happened

On April 7, 2026, Anthropic announced Project Glasswing, a cybersecurity initiative built around an unreleased AI model called Claude Mythos Preview. The model is being given to a select group of partners for defensive security work. Those partners include AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, along with roughly 40 additional organizations responsible for building or maintaining critical software infrastructure.

Anthropic has committed $100 million in usage credits and $4 million in donations to open-source security organizations to support the effort. Mythos Preview is available to Glasswing participants at $25 per million input tokens and $125 per million output tokens through the Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry. Anthropic has stated clearly that it does not plan to make the model generally available.

The Model

Mythos Preview is a general-purpose frontier model. Leaked internal documents from a CMS misconfiguration in late March pointed to approximately 10 trillion parameters using a Mixture-of-Experts architecture, though Anthropic has never confirmed the parameter count. The internal codename is “Capybara,” representing a new tier above Opus in Anthropic’s model lineup.

The benchmark results tell the story. On SWE-bench Verified, which measures real-world software engineering capability, Mythos scored 93.9% against Opus 4.6’s 80.8%. On SWE-bench Pro the gap widened to 77.8% versus 53.4%. On USAMO 2026, a proof-based math olympiad evaluation, Mythos hit 97.6% compared to Opus 4.6’s 42.3%. The long-context benchmark GraphWalks showed 80.0% versus 38.7%.

In the video I compared this to the leap from ChatGPT 3.5 to GPT-4. The numbers support that framing. These are step-change improvements across every axis of capability.

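
To put the per-token pricing quoted under "What Actually Happened" into concrete terms, here is a minimal sketch of the arithmetic. Only the two rates ($25 and $125 per million tokens) come from the article; the function name and the example token counts are hypothetical, chosen purely for illustration.

```python
from decimal import Decimal

# Rates quoted in the article for Glasswing participants (USD per million tokens).
INPUT_USD_PER_MTOK = Decimal("25")
OUTPUT_USD_PER_MTOK = Decimal("125")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one job at the quoted rates, rounded to cents."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    cost = (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    ) / TOKENS_PER_MTOK
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Hypothetical job size (an assumption, not a figure from the article):
    # 40M input tokens read and 2M output tokens written.
    print(estimate_cost_usd(40_000_000, 2_000_000))
```

At these rates output tokens cost five times as much as input tokens, so a job that writes a lot of text can cost more than one that only reads a large codebase.
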
The Cybersecurity Capability

The headline finding is that Mythos Preview has autonomously discovered thousands of zero-day vulnerabilities across every major operating system and every major web browser. Three examples stand out.

The first is a 27-year-old vulnerability in OpenBSD’s TCP SACK implementation that allowed an attacker to remotely crash any machine just by connecting to it. OpenBSD is known specifically for being one of the most security-hardened operating systems in existence.

The second is a 16-year-old bug in FFmpeg, the audio and video codec library that powers an enormous amount of software. Automated testing tools had hit the vulnerable line of code five million times without ever catching the problem.

The third is a Linux kernel privilege escalation chain where Mythos went from ordinary user access to complete machine control by exploiting subtle race conditions and KASLR bypasses.

The Firefox experiment is the clearest technical signal. Anthropic previously used Opus 4.6 to find vulnerabilities in Firefox 147’s JavaScript engine. Those bugs were all patched in Firefox 148. When they asked Opus 4.6 to turn those known vulnerabilities into working shell exploits, it succeeded only twice out of several hundred attempts. Mythos Preview developed working exploits 181 times and achieved register control 29 more. That is the difference between a model that can theoretically identify a problem and a model that can operationally weaponize it.

The exploit sophistication is staggering. In one case Mythos wrote a browser exploit that chained four vulnerabilities together using a JIT heap spray to escape both the renderer sandbox and the OS sandbox. It also autonomously wrote a FreeBSD NFS remote code execution exploit that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain across multiple packets.

On CyberGym, a vulnerability reproduction benchmark developed at UC Berkeley, Mythos scored 83.1% compared to Opus 4.6’s 66.6%.
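
The head-to-head scores quoted in this article are easier to compare side by side. The sketch below tabulates the five Mythos Preview versus Opus 4.6 pairs and computes two derived numbers for each: the gap in percentage points, and the share of Opus 4.6's failures that Mythos closes. The second metric is an interpretive choice made here, not something the article or Anthropic reports; the scores themselves are copied from the text.

```python
# (benchmark, Mythos Preview %, Opus 4.6 %) as quoted in the article.
SCORES = [
    ("SWE-bench Verified", 93.9, 80.8),
    ("SWE-bench Pro", 77.8, 53.4),
    ("USAMO 2026", 97.6, 42.3),
    ("GraphWalks", 80.0, 38.7),
    ("CyberGym", 83.1, 66.6),
]


def gap_summary(mythos: float, opus: float) -> tuple[float, float]:
    """Return (percentage-point gap, % of Opus 4.6's failures closed)."""
    point_gap = mythos - opus
    # Assumed metric: how much of the headroom left by Opus 4.6 is now covered.
    failures_closed = point_gap / (100.0 - opus) * 100.0
    return round(point_gap, 1), round(failures_closed, 1)


if __name__ == "__main__":
    print(f"{'benchmark':<20}{'gap (pts)':>10}{'failures closed (%)':>22}")
    for name, mythos, opus in SCORES:
        gap, closed = gap_summary(mythos, opus)
        print(f"{name:<20}{gap:>10}{closed:>22}")
```

Read this way, a point gap means more on a benchmark that is already near its ceiling: the 13.1-point gain on SWE-bench Verified removes about two thirds of the remaining failures, a larger share than the 16.5-point gain on CyberGym.
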
On Cybench, a set of 35 capture-the-flag challenges, Mythos solved every single one with a 100% pass rate. The benchmark is now fully saturated and no longer informative for frontier models.

The scaffold Anthropic uses is remarkably simple. They launch an isolated container with the target project and its source code, invoke Claude Code with Mythos Preview, and give it a prompt that essentially says “please find a security vulnerability in this program.” Then they let it run. Non-security-engineers at Anthropic asked Mythos to find remote code execution vulnerabilities overnight and woke up the next morning to complete working exploits.

What This Means for Enterprise Security

In the video I talked about what actually happens inside a Fortune 500 company when a threat like this emerges. The first thing every CISO does is panic a little. Then they start calling their vendors. This is the reality of enterprise security that most coverage misses. Your average Fortune 500 company does not have deep cybersecurity expert

View original (hackernews)

This analysis was written by the Genesis Park editorial team with the help of AI. The original article can be found via the source link.
