Claude Code: Widespread abnormal usage limit drain across all paid tiers

hackernews | 📦 Open source
#anthropic #claude
Original source: hackernews · Summarized and analyzed by Genesis Park

Summary

Since March 23, users on all of Claude's paid tiers have been hit by a severe usage-limit bug in which even a single short greeting can consume up to 7% of a session's quota. The problem is the combined result of at least four causes, including cache invalidation bugs and peak-hour throttling, and it is badly disrupting service: some users burn through a 5-hour session window in as little as 19 minutes. Regrettably, Anthropic has responded to this serious outage only through individual engineers' social media posts, with no official blog post or email notification, and is drawing criticism for it.

Full text

[BUG] Critical: Widespread abnormal usage limit drain across all paid tiers since March 23, 2026 — multiple root causes identified, no formal communication issued #41930

A note to the Anthropic team — from a paying customer and fellow software developer.

I want to be upfront: I know this GitHub issue tracker is for Claude Code bugs, and this report covers the broader Claude ecosystem. I'm filing it here because, after three days, I have exhausted every other channel available to me — and received no response from any of them.

I have submitted support tickets through the Mac Claude subscription interface. No response. I have reached out on Twitter/X. No response. I have looked for updates on Reddit. The only official replies are variations of "we're working on it." There is no blog post. No email to subscribers. No status page entry. Nothing that would tell a paying customer what is happening, why, or when it will be resolved.

I am a professional software developer. I understand that bugs happen. I understand that scaling is hard. I understand that demand surges can catch even well-resourced teams off guard. What I do not understand — and what I find genuinely difficult to accept — is the communication approach Anthropic has chosen here.

This is not a minor inconvenience affecting a handful of edge-case users. This is a service degradation affecting every paid tier, documented by hundreds of users across every public channel, covered by multiple tech publications, and trending on Hacker News. The financial impact is real: users are paying $20–$200/month for a service that, for over a week now, delivers a fraction of its advertised value. For some users, a single "hello" consumes 2% of their session.
That is not a degraded experience — that is a broken product.

As a developer, I have to ask the engineering question: if a community member can identify the root cause by reverse-engineering your binary in a weekend, why has it taken over a week without a fix — or even a confirmed diagnosis? If users report that downgrading to v2.1.34 resolves the issue, the regression surface is well-bounded. A rollback is a standard incident response. A/B testing (canary releases, staged rollouts to 10% of users) is a standard deployment practice that would have caught this before it reached your entire user base. These are not exotic techniques. They are the basics.

I say this not to be disrespectful, but because I genuinely believe Anthropic builds an excellent product, and the way this incident is being handled does not reflect the quality I have come to expect. Your users are not adversaries. Many of us, myself included, would happily help debug this if given the chance. But right now, the silence is more damaging to trust than the bug itself.

What I am specifically asking for:

1. Respond to support tickets. Three days without a reply is not acceptable for a paid service.
2. Publish a public update — a blog post, a status page entry, an email to subscribers. Something official that acknowledges the scope of the problem.
3. If a fix is not imminent, roll back the changes that introduced the regression. Users have demonstrated that v2.1.34 does not exhibit the issue.
4. Consider offering usage credits or refunds for the affected period. Users paid for capacity they did not receive due to a software bug.

I am filing this in good faith, as a last resort after all other communication channels failed. I would much rather have received a support ticket response.

## Preflight Checklist

- [x] I have searched existing issues and this hasn't been reported yet
- [x] This is a single bug report (please file separate reports for different bugs)
- [x] I am using the latest version of Claude Code

## What's Wrong?
Since March 23, 2026, users across all paid tiers (Pro, Max 5×, Max 20×) have experienced catastrophic usage limit drain. Single prompts are consuming 3–7% of session quota. Five-hour session windows are depleting in as little as 19 minutes. The issue has been independently confirmed by hundreds of users across Reddit, Twitter/X, GitHub, and tech press outlets.

This is not a single bug. Community investigation has identified at least four overlapping root causes hitting simultaneously:

- Intentional peak-hour throttling (confirmed by Anthropic on March 26)
- Two prompt-caching bugs silently inflating token costs 10–20× (under investigation as of March 31)
- A session-resume bug generating hundreds of thousands of output tokens without user prompts
- Expiration of the 2× off-peak usage promotion on March 28

Despite widespread impact, Anthropic has issued no blog post, no email notification, and no status page entry. All official communication has been limited to personal X/Twitter posts by individual engineers and a handful of Reddit comments.

This also affects claude.ai web users, not just Claude Code. See anthropics/anthropic-sdk-python#1215: "This is NOT the Claude Code prompt caching bug — I do not use Claude Code."

## Root Cause Analysis

### 1. Peak-Hour Throttling (Confirmed)

On March 26, Anthropic engineer Thariq Shihipar announced via personal X/Twitter post that session limits now drain faster during weekday peak hours (5am–11am PT / 1pm–7pm GMT). Estimated to affect ~7% of users, but no specifics were given on the magnitude of the reduction. PCWorld confirmed the statement through direct outreach to Anthropic.

### 2. Two Prompt-Caching Bugs (Under Investigation)

A community member reverse-engineered the Claude Code standalone binary (228MB) using Ghidra, a MITM proxy, and radare2, identifying two independent cache invalidation bugs:

- **Bug A — Billing sentinel string replacement:** Anthropic's custom Bun fork performs a string replacement on every API request targeting a billing attribution sentinel. If conversation history mentions billing-related terms, the replacement hits the wrong position, breaking the cache prefix and forcing a full (uncached) token rebuild. Uncached tokens cost 10–20× more against the usage quota than cached tokens.
- **Bug B — Resume/continue flag cache invalidation:** Using `--resume` or `--continue` flags causes tool attachments to be injected in a different position than in fresh sessions, invalidating the entire conversation cache and forcing complete reprocessing of all prior context.

A verification script was published at gitlab.com/treetank/cc-diag. This was submitted as GitHub issue #40524 and flagged as a regression. Anthropic's Thariq Shihipar responded on March 31 that they are "actively looking into this in particular."

### 3. Session-Resume Token Generation Bug

GitHub issue #38029 documented 652,069 output tokens generated without corresponding user prompts during a session resume. Issue #37436 reported quota rising from 23% to 32% within minutes of starting sessions with no active work. A same-day Claude Code update note — "Improved memory usage and startup time when resuming large sessions" — suggests Anthropic was aware of this vector.

### 4. Promotion Expiration Compounding Effect

From March 13–28, Anthropic doubled usage limits during off-peak hours. Users accustomed to this capacity experienced the return to baseline as a sudden reduction, amplifying the perception of the other three issues.
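The caching failure mode behind Bugs A and B can be illustrated with a toy model. Everything below is a sketch under stated assumptions: prefix-based caching as the report describes it (a cache hit covers only the longest byte-identical prefix shared with the previous request) and a flat 10× charge for uncached tokens, taken from the low end of the report's 10–20× estimate. It is not Anthropic's actual metering code.

```python
# Toy model of prefix-based prompt caching. Characters stand in for
# tokens; cached characters cost 1 unit, uncached ones cost 10 units.

def tokens_charged(prev_request: str, new_request: str,
                   uncached_multiplier: int = 10) -> int:
    """Charge 1 unit per cached char, `uncached_multiplier` per uncached char."""
    # The cache hit extends only as far as the longest common prefix.
    prefix = 0
    for a, b in zip(prev_request, new_request):
        if a != b:
            break
        prefix += 1
    cached = prefix
    uncached = len(new_request) - prefix
    return cached + uncached * uncached_multiplier

history = "system: ...\nuser: refactor the parser\n" * 50   # long prior context
normal_turn = history + "user: run the tests\n"

# Healthy turn: history is byte-identical, so only the new text is uncached.
cheap = tokens_charged(history, normal_turn)

# Bug A in miniature: a string replacement lands early in the context
# (here, inside the first repeated line), so the shared prefix collapses
# and the entire conversation is re-billed at the uncached rate.
mutated_turn = normal_turn.replace("refactor", "rewrite", 1)
expensive = tokens_charged(history, mutated_turn)

print(f"healthy turn: {cheap} units, corrupted prefix: {expensive} units")
```

In this toy run the turn with the corrupted prefix costs roughly 9× more than the healthy one for the same new text, which matches the order of magnitude users reported.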
## Timeline

| Date | Event |
|------|-------|
| Mar 13 | 2× off-peak usage promotion begins |
| Mar 23 | First widespread user reports of abnormal usage drain |
| Mar 24 | GitHub #38335 filed; r/ClaudeCode threads gain traction |
| Mar 26 | Anthropic confirms peak-hour throttling via engineer's personal X post |
| Mar 26 | Reverse-engineering analysis published identifying two caching bugs |
| Mar 31 | Lydia Hallie (Anthropic product lead) acknowledges on X; Anthropic Reddit account calls it "top priority" |
| Mar 31 | The Register: "Anthropic admits Claude Code quotas running out too fast" hits Hacker News front page (136 points, 115 comments) |
| Apr 1 | No blog post, no email, no status page entry. Issue persists. |

## Steps to Reproduce

1. Open any Claude client (claude.ai, Claude Code CLI, Claude Desktop) on a Pro or Max subscription
2. Send a short message (even "hello" or "good morning")
3. Observe the usage meter — a single short exchange may consume 2–7% of the 5-hour session quota
4. Continue normal usage; observe that 5-hour sessions may deplete in 1–2 hours instead of the expected 5

For the caching bugs specifically:

1. Start a Claude Code session with `--resume` on a previous conversation
2. Monitor token usage via MITM proxy or the verification script at gitlab.com/treetank/cc-diag
3. Observe that `cache_read` tokens remain near zero while `cache_write` tokens grow linearly, indicating the cache is being fully rebuilt on every turn

## What Should Happen?
- A short message ("hello") should consume a negligible fraction of session quota
- A 5-hour session window should last approximately 5 hours of active use
- Prompt caching should reduce token costs by 10–20× on subsequent turns within a session
- Session resume should not generate hundreds of thousands of tokens without user prompts
- Significant changes to usage metering should be announced proactively via blog, email, and status page

## Error Messages/Logs

### Actual Behavior

- Short messages consume 2–7% of session quota
- 5-hour sessions deplete in 19 minutes to ~2 hours
- Prompt cache is silently invalidated, causing 10–20× cost inflation
- Session resume generates 650K+ output tokens silently
- The only official communication has been informal social media posts by individual engineers

## Environment

- Affected products: claude.ai (web), Claude Code (CLI), Claude Desktop (all share the same usage pool)
- Affected tiers: Pro ($20/mo), Max 5× ($100/mo), Max 20× ($200/mo)
- Affected Claude Code versions: v2.1.69+ through at least v2.1.87 (caching regression); v2.1.88 current
- Known partial workaround: Downgrade to v2.1.34 for Claude Code users
- OS: Cross-platform (macOS, Linux, Windows all affected)
- Region: Global

## Community-Discovered Workarounds

These are temporary mitigations documented by affected users:

- Downgrade Claude Code to v2.1.34 to avoid the caching regressions
- Run via `npx @anthropic-ai/claude-code` instead of the standalone binary to avoid the Bun fork string-replacement bug
- Avoid `--resume` and `--continue` flags, which trigger full cache invalidation
- Use `/clear` to start fresh sessions rather than continuing long conversations
- Shift intensive work to off-peak hours (after 11am PT weekdays, evenings, weekends)
- Monitor usage in the first few turns — if a single short message burns >3–5% of session, restart immediately
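The cache_read/cache_write check from the reproduction steps can be automated. The sketch below assumes per-turn usage records shaped like the Anthropic Messages API usage object (`cache_read_input_tokens`, `cache_creation_input_tokens`); the helper and its thresholds are illustrative assumptions of mine, not part of the cc-diag script.

```python
# Heuristic detector for the "cache rebuilt on every turn" pathology:
# cache reads stay near zero while cache writes keep growing, meaning
# the full context is re-written (and re-billed) each turn.

def cache_is_broken(turns: list[dict], min_turns: int = 3) -> bool:
    """True if cache reads stay near zero while cache writes keep growing."""
    if len(turns) < min_turns:
        return False  # too little data to judge
    reads = [t.get("cache_read_input_tokens", 0) for t in turns]
    writes = [t.get("cache_creation_input_tokens", 0) for t in turns]
    # Healthy sessions read the cached prefix back on later turns.
    reads_near_zero = all(r < 0.05 * max(w, 1) for r, w in zip(reads, writes))
    # Broken sessions re-write a monotonically growing context every turn.
    writes_growing = all(b >= a for a, b in zip(writes, writes[1:]))
    return reads_near_zero and writes_growing

# Example: a resumed session where every turn rebuilds the full context.
broken_session = [
    {"cache_read_input_tokens": 0,   "cache_creation_input_tokens": 40_000},
    {"cache_read_input_tokens": 120, "cache_creation_input_tokens": 55_000},
    {"cache_read_input_tokens": 0,   "cache_creation_input_tokens": 71_000},
]
print(cache_is_broken(broken_session))  # True
```

Feeding this the per-turn usage objects captured from a MITM proxy would flag an affected session within a few turns, before much quota is burned.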

This analysis was written by the Genesis Park editorial team with the help of AI. The original can be found via the source link.
