Durable Object alarm loop: $34k in 8 days, zero users, no platform warning
Summary
A developer received a $34,895 invoice after an alarm-loop bug in Cloudflare Durable Objects (DO) ran for 8 days. The DO instances called setAlarm() in onStart() on every wake-up without checking for an already scheduled alarm, creating a self-health-check loop that peaked at about 930 billion row reads per day on April 4-5. According to the post, Cloudflare's usage notifications monitor only CPU time, not DO row reads or writes, so no warning fired before the bill arrived. Cloudflare is currently running a marketing campaign steering AI agent developers toward DO, but there is no spending cap for DO operations, so a usage spike is not cut off automatically.
Body
Sharing this as a warning to anyone using Cloudflare Durable Objects with alarms.<p>Root cause:</p><p>My DO agent's onStart() handler called this.ctx.storage.setAlarm() on every wake-up without checking whether an alarm was already scheduled. Combined with 60+ preview Worker deployments each creating independent DO instances, this created a runaway self-health-check loop.</p><p>Timeline:
- April 3: loop began (zero DO usage before this date)</p><p>- April 4-5: peaked at ~930 billion row reads/day</p><p>- April 11: found it, fixed it</p><p>- April 15: $34,895 invoice due with no billing response yet</p><p>The fix:</p><p>// Before (dangerous)
async onStart() {
  await this.ctx.storage.setAlarm(Date.now() + 60_000)
}</p><p>// After (safe)
async onStart() {
  const existing = await this.ctx.storage.getAlarm()
  if (!existing) {
    await this.ctx.storage.setAlarm(Date.now() + 60_000)
  }
}
}</p><p>Other things worth doing:
- Strip DO bindings from preview environments entirely
- Deploy a budget monitor kill switch Worker
- Add a circuit breaker that checks alarm state before scheduling</p><p>Why I had no warning:</p><p>Cloudflare's Workers Usage Notifications only monitors CPU time. Not Durable Object row reads or writes. There is also no hard spending cap for DO operations available in the dashboard or Wrangler config. Nothing would have fired an alert during this runaway. I found out when the bill showed up.</p><p>This is worth knowing if you're using DO alarms. The platform will not tell you when DO row reads/writes go exponential. You have to build your own kill switch.</p><p>One more thing that I think deserves more attention than it's getting: this is Agents Week. Cloudflare is running a dedicated marketing push right now to get individual developers building AI agents on Durable Objects. Blog posts, announcements, the whole thing. That is a deliberate effort to onboard solo developers and indie founders into a product that can silently generate a five-figure bill with zero platform warning. There is no spending cap for DO operations. The usage notification system doesn't cover DO reads or writes. Cloudflare knows this. Running Agents Week while that gap exists is not a neutral decision.</p><p>I've filed Case 02067725. I'm a pre-launch sole proprietor who put all my personal savings into this startup. This bill would financially destroy me for usage that generated zero business value. Sharing here both as a technical warning and because I need help getting this in front of someone at Cloudflare who can make a decision.</p><p>Has anyone escalated a billing dispute with Cloudflare successfully?</p>
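<p>A minimal sketch of the "check before scheduling" guard from the fix above, pulled into a helper so it can be exercised without the Workers runtime. AlarmStorage, ensureAlarm, and FakeAlarmStorage are hypothetical names introduced for illustration, not part of Cloudflare's API; only getAlarm() and setAlarm() mirror the storage calls used in the post.</p>

```typescript
// Minimal slice of the Durable Object storage API that the guard needs.
// Assumption: in a real Durable Object, `this.ctx.storage` satisfies this shape.
interface AlarmStorage {
  getAlarm(): Promise<number | null>;
  setAlarm(scheduledTime: number): Promise<void>;
}

// Hypothetical helper: schedule an alarm only if none is pending.
// Returns true when a new alarm was scheduled, false when one already existed.
async function ensureAlarm(
  storage: AlarmStorage,
  delayMs: number,
  now: number = Date.now(),
): Promise<boolean> {
  const existing = await storage.getAlarm();
  if (existing !== null) {
    return false; // an alarm is already pending, so leave it alone
  }
  await storage.setAlarm(now + delayMs);
  return true;
}

// In-memory stand-in for local testing only (not a Cloudflare API).
class FakeAlarmStorage implements AlarmStorage {
  private alarm: number | null = null;
  setCalls = 0; // counts how many times an alarm was actually written

  async getAlarm(): Promise<number | null> {
    return this.alarm;
  }

  async setAlarm(scheduledTime: number): Promise<void> {
    this.alarm = scheduledTime;
    this.setCalls += 1;
  }
}
```

<p>Inside onStart() this would be called as await ensureAlarm(this.ctx.storage, 60_000). Returning a boolean makes it easy to log or count how often a wake-up actually schedules new work, which is the number that ran away here.</p>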