Cyber experts warn that Anthropic's Claude is spreading vulnerable code.
🔒 Security
#ai models
#anthropic
#claude
#openai
#security vulnerabilities
#code quality degradation
Summary
Dave Kennedy, CEO of cybersecurity firm TrustedSec, says the code quality of Anthropic's latest model, Claude Opus, has degraded by 47.3% since the February update. The performance drop has introduced serious security issues and defects, to the point where he calls the model unusable. Experts warn that novice developers who fail to spot these flaws and ship the code face significant security risks.
Why It Matters
Developer perspective
Under review.
Researcher perspective
Under review.
Business perspective
Under review.
Full Text
This is the online edition of The Wiretap newsletter, your weekly digest of cybersecurity, internet privacy and surveillance news. To get it in your inbox, subscribe here.

In March, developers at Ohio-based cybersecurity company TrustedSec were regularly using Anthropic’s premium Claude Opus model to speed up app development and generate attacks to test client defenses. But in recent weeks, they’ve stopped using it. Performance dropped so sharply in the weeks after the release of Opus 4.6 in early February that the model began introducing “serious defects and security issues,” says TrustedSec CEO and former NSA analyst Dave Kennedy.

“Right now, from five weeks ago to today, the code quality is over 47.3% worse than when it was first released,” Kennedy tells Forbes. “It’s really bad, I mean unusably bad.” That figure is according to a tool he built to test Claude’s quality, which tracks code quality, bugs, security issues and whether the model completes a coding job from start to finish without problems. The ultimate risk, he says, is that novice developers using Claude for coding won’t spot flaws, “introducing serious defects.” “It’s very alarming,” he says. Kennedy says Opus 4.7, the latest model, was “marginally better” but still not at the quality level of 4.6 when it was released.

In recent weeks, scores of once-happy Anthropic customers have flocked to Reddit and X to vent similar frustrations. It’s not just programmers experiencing usability issues. An AI executive at chipmaker AMD wrote on GitHub that her team had seen Claude’s thinking become so “shallow” that it “cannot be trusted to perform complex engineering tasks.”

Analyses from coding security company Veracode have also found that Claude models are writing less secure code than competitors. Over the last year, Veracode has been testing AI systems by asking them to complete 80 coding tasks. In 52% of those, Opus 4.7 included a vulnerability in the code.
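Kennedy's tool isn't public, but the metric he describes, scoring each test run on bugs, security findings and whether the job completed, then comparing against a baseline, can be sketched in a few lines. Everything below (the `RunResult` fields, the penalty weights) is a hypothetical illustration, not his actual harness.

```python
from dataclasses import dataclass

@dataclass
class RunResult:
    """Outcome of one scripted coding task given to the model."""
    bugs: int             # functional defects caught by the test suite
    security_issues: int  # findings from a static-analysis pass
    completed: bool       # did the model finish the task end to end?

def degradation(baseline: list[RunResult], current: list[RunResult]) -> float:
    """Percent worsening of a simple composite quality score.

    Each run starts at 1.0 and loses points for defects, security
    findings, and failure to complete. The weights are arbitrary.
    """
    def score(runs: list[RunResult]) -> float:
        total = 0.0
        for r in runs:
            s = 1.0 - 0.1 * r.bugs - 0.2 * r.security_issues
            if not r.completed:
                s -= 0.5
            total += max(s, 0.0)
        return total / len(runs)

    base, cur = score(baseline), score(current)
    return (base - cur) / base * 100.0
```

With a clean baseline and a current batch where one run has a bug plus a security finding and another fails to complete, the score drops 40%: the kind of release-over-release number a harness like this would surface.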
That’s up from 51% for Opus 4.1 and 50% for Claude Sonnet 4.5, a lower-level, more cost-efficient model that doesn’t use up as much compute. OpenAI’s models perform notably better at around 30%.

Jens Wessling, Veracode’s chief innovation officer, says the data backed up user claims of model degradation. Wessling believes models are being trained to write working code, “not to consistently apply the controls that make software secure.” “It reflects a real dynamic where faster, more capable models can still produce insecure output at meaningful rates,” he tells Forbes. “Without changes to how that code is validated and remediated, the net effect can look like more buggy or vulnerable software, not less.”

Anthropic said it was actively investigating the claims of degradation in Opus and that engineers should always check for vulnerabilities. Previously, head of Claude Code Boris Cherny posted on X that the company had chosen to turn down how hard Claude thinks before editing code, from "high" to "medium" effort, in response to complaints about token usage, a token being a unit of text or code that a model uses to process and generate language.

Adding irony to injury, this month Anthropic announced it had developed a new model, Mythos, capable of autonomously finding security issues in commonly used browsers and operating systems, and at scale. The company limited Mythos use to 40 major organizations, from Apple to Google, so it can be used to secure widely used products before hackers get hold of similarly powerful AI.

Kennedy is so concerned about the potential for any AI giant’s models to regress that he’s reconsidering how his team uses AI. Now, he is building his own on-premise AI infrastructure so he can run bespoke models that he controls. “Who can we really trust here?” he asks.

Got a tip on surveillance or cybercrime? Get me on Signal at +1 929-512-7964.
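Veracode's headline numbers, as described above, reduce to a simple tally: the share of a fixed task set whose generated code was flagged as vulnerable. The sketch below is illustrative only; the model names and per-task results are made up, and the real benchmark's scanning pipeline is far more involved.

```python
def vulnerability_rate(task_results: dict[str, bool]) -> float:
    """Share of benchmark tasks whose generated code was flagged
    as containing at least one vulnerability (True = flagged)."""
    return sum(task_results.values()) / len(task_results)

def compare_models(results: dict[str, dict[str, bool]]) -> dict[str, float]:
    """Per-model vulnerability rate as a percentage, one decimal place."""
    return {model: round(vulnerability_rate(r) * 100, 1)
            for model, r in results.items()}

# Made-up data: 42 of 80 tasks flagged for one model, 24 of 80 for another,
# roughly echoing the ~52% vs ~30% rates the article reports.
fake = {
    "model_a": {f"task{i}": i < 42 for i in range(80)},
    "model_b": {f"task{i}": i < 24 for i in range(80)},
}
```

The point of such a metric is that a one-point move (51% to 52% on 80 tasks) is a single additional flagged task, which is why Wessling pairs the rates with the broader claim about how models are trained rather than leaning on any one delta.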
THE BIG STORY

Inside Madison Square Garden’s Surveillance Machine

Wired has a deep dive into the surveillance apparatus at New York’s Madison Square Garden, where one trans woman was tracked for two years and protestors were snooped on by people pretending to be cops. Even Knicks players are warning about rooms being bugged, and staff fear being followed to local bars.

Stories You Have To Read Today

Tinder and Zoom announced partnerships with Sam Altman’s World company, which scans people’s eyeballs to prove they’re human and validate their identity.

Palantir published a 22-point manifesto from the new book of cofounder and CEO Alex Karp, which included a call for national service. “We should, as a society, seriously consider moving away from an all-volunteer force and only fight the next war if everyone shares in the risk and the cost,” he wrote. While Karp’s business works closely with the Pentagon, Karp himself is not known to have served in the military.

As part of an international police operation, the DOJ seized some of the biggest online markets offering to provide distributed denial of service (DDoS) attacks, which flood websites and apps with traffic to take them offline.

Tyler Robert Buchanan, a 24-year-old from Dundee, Scotland, pleaded guilty to his role in a hacking conspiracy to steal at least $8 million in virtual currency from U.S. companies. Investigators alleged Buchanan was part of the Scattered Spider crew, which targeted a range of retail and telecommunications companies globally.

In case you missed it, Forbes published its eighth annual AI 50 list, with sponsoring partner Mayfield, which highlights the most promising privately held AI companies in the world. There are a lot of familiar names, like Anthropic, Harvey and ElevenLabs, but this year Forbes has also highlighted some exciting newcomers, including presentation builder Gamma, drug discovery startup Chai Discovery and New York-based Rogo, which is building AI for bankers and investors.
We also launched our first-ever AI 50 Brink list, featuring early stage companies with the potential to rival their more established peers in the future.

Winner of the Week

Last year, the DOJ sued Rhode Island to acquire non-public voter databases that included sensitive information like birth dates and Social Security numbers, without any justification. Now, a U.S. district court judge has granted a motion from voting rights groups and the ACLU to dismiss the suit.

Loser of the Week

A 35-year-old former cop, Robert Jay Josett, pleaded guilty to using Flock Safety car surveillance technology, among other snooping tools, to track the whereabouts of his wife, mistress and romantic rivals. He’s been ordered to serve a 52-week domestic violence program and sentenced to three years of informal probation.