Anthropic의 Mythos AI는 모든 주요 OS 및 브라우저에서 심각한 보안 허점을 발견했습니다.

Singularity Hub | 2026년 4월 11일 04:21 | 🔬 연구

#ai 보안 #anthropic #mythos ai #보안 취약점 #제로데이 #취약점/보안 #claude #cybersecurity #exploits #openai #review #security #ai 위협 #사이버보안

원문 출처: Singularity Hub · Genesis Park에서 요약 및 분석

요약

Anthropic이 전문가 수준의 사이버 공격 능력을 갖춘 최신 AI 모델 'Mythos'를 발표했습니다. 이 모델은 모든 주요 운영체제와 웹 브라우저에서 미발견된 제로데이 취약점을 자율적으로 찾아내고 이를 연쇄적으로 악용해 복잡한 공격을 구성할 수 있습니다. 기존 최고 수준의 모델보다 압도적인 성능을 보여주며, 전문가가 수주가 걸리던 해킹 코드 생성을 단 몇 시간 만에 가능하게 합니다. 악용 가능성을 우려한 Anthropic은 해당 모델을 공개하지 않고 주요 기술 기업 등 제한된 그룹에만 제공해 방어 목적으로 활용하도록 할 방침이며, 6~24개월 내 유사한 수준의 AI 능력이 널리 퍼질 것에 대비해 근본적인 보안 패러다임의 재정비가 필요하다고 경고했습니다.

본문

It’s a step change in cybersecurity. Exploits that would take experts weeks to develop can now be generated in hours. Concerns about AI’s ability to turbocharge cybersecurity threats have been building for years. Anthropic’s latest model could mark a turning point after the company claimed the model could identify and exploit zero-day vulnerabilities in every major operating system and web browser. One of the standout use cases for large language models is analyzing and writing code. This has long raised worries that the technology could help automate much of the work of hackers, potentially lowering the barrier for cyberattacks. Leading models have demonstrated steady progress on various cybersecurity-related benchmarks, and there has been evidence malicious actors are using the technology. But so far, the impact appears to have been modest, suggesting practical barriers remain that prevent the widespread use of the technology. According to Anthropic, that’s about to change. The company says its latest model, Mythos, has hacking capabilities so potent the company will not make it publicly available. Instead, it’s releasing Mythos to a select group of major technology companies and open source developers as part of an initiative called Project Glasswing. Those participating can use the model to identify vulnerabilities in their code and patch them before hackers get access to similar capabilities. “The vulnerabilities that Mythos Preview finds and then exploits are the kind of findings that were previously only achievable by expert professionals,” the company’s researchers write in a blog post. “We believe the capabilities that future language models bring will ultimately require a much broader, ground-up reimagining of computer security as a field.” Fortune first reported news of Mythos last month, after a data leak at Anthropic revealed details about the new model. While the AI excels at cybersecurity tasks, it’s designed to be a general purpose model, and the company says its hacking capabilities are simply a result of vastly improved coding and reasoning skills. In testing, Anthropic’s researchers discovered the model was able to find “zero-day” vulnerabilities—ones that were previously undiscovered—in every major operating system and web browser. Many were decades old, an indicator of how hard they were to detect. But the model isn’t just good at finding vulnerabilities. The company’s red team—security researchers who simulate hacking attacks to identify security weaknesses—showed the model could chain together multiple vulnerabilities to create complex attacks capable of sidestepping defenses. Its capabilities are a step change from the previous best models. Given the challenge of attacking the Firefox web browser’s JavaScript engine, Anthropic’s previous most powerful model Opus 4.6 succeeded just twice, compared to 181 times for Mythos. Most worryingly, the team found that engineers with no security background could use it to develop successful attacks overnight. Key to the new capabilities is the model’s ability to operate autonomously for long stretches. To find bugs, the researchers used Anthropic’s coding agent Claude Code to call the model and give it a simple prompt to scan for vulnerabilities in a particular codebase. The model then read the code, came up with hypotheses about potential bugs, and ran tests to validate them without any human involvement. The Anthropic team says Mythos fundamentally reshapes the cybersecurity landscape as exploits that would take experts weeks to develop can now be generated in hours. In particular, they note that so-called “defense-in-depth” measures that make it time-consuming and costly to attack a system may prove ineffective against models like Mythos. “When run at large scale, language models grind through these tedious steps quickly,” they write. “Mitigations whose security value comes primarily from friction rather than hard barriers may become considerably weaker against model-assisted adversaries.” The head of Anthropic’s frontier red team, Logan Graham, told Axios that they expect other companies to produce models with similar capabilities in the coming six to 18 months. Sources familiar with the matter told Axios that OpenAI is already finalizing a model with similar capabilities to Mythos, which will have a similarly limited release. In its blog post, the company’s researchers note that new security technology has historically benefited defenders more than attackers. If frontier labs are careful about model releases, they think the same could be true here too, but the transitional period is likely to be disruptive. “We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months,” Graham told Wired. “Many things would be different about security. Many of the assumptions that we’ve built the modern security paradigms on might break.” Whether AI developers can keep a lid on these capabilities long enough for the

원문 보기 (Singularity Hub)

Genesis Park 편집팀이 AI를 활용하여 작성한 분석입니다. 원문은 출처 링크를 통해 확인할 수 있습니다.

요약

본문

관련 저널 읽기