We reverse-engineered Claude Code's billing system to fix overage charges

hackernews | 2026년 4월 9일 03:55 | 📦 오픈소스

#claude max #gemini #openai #proxy #review #subscription #claude code #과금 #리뷰 #ai 딜 #anthropic #api #claude #openclaw #로컬 api

원문 출처: hackernews · Genesis Park에서 요약 및 분석

요약

오픈 소스 도구인 'dario'는 Claude Max/Pro 구독을 로컬 API 엔드포인트로 변환하여 별도의 API 키나 추가 비용 없이 다양한 개발 도구에서 Claude 모델을 사용할 수 있게 해줍니다. 기존 프록시와 달리 dario는 Claude Code의 바이너리를 역설계하여 청구 태그와 기기 식별 정보를 주입함으로써, 사용량이 제한 요금(Extra Usage)으로 중복 청구되는 문제를 해결하고 플랜 한도를 정확히 적용합니다. 또한, --cli 모드를 통해 API 속도 제한이 발생해도 중단 없이 Opus 모델을 계속 사용할 수 있는 우회 라우팅 기능을 제공합니다.

본문

Use your Claude subscription as an API. The only proxy that bills correctly. No API key needed. Your Claude Max/Pro subscription becomes a local API endpoint that any tool, SDK, or framework can use — with native billing classification, so your Max plan limits actually work. Quick Start • OpenAI Compat • CLI Backend • Examples • Trust • FAQ npx @askalf/dario login # detects Claude Code credentials, starts proxy # now use it from anywhere — Anthropic or OpenAI SDK export ANTHROPIC_BASE_URL=http://localhost:3456 # or OPENAI_BASE_URL=http://localhost:3456/v1 export ANTHROPIC_API_KEY=dario # or OPENAI_API_KEY=dario Opus, Sonnet, Haiku — all models, streaming, tool use. Works with Cursor, Continue, Aider, LiteLLM, Hermes, OpenClaw, or any tool that speaks the Anthropic or OpenAI API. When rate limited, --cli routes through Claude Code for uninterrupted Opus access. Independently reviewed by 3 competing AI companies | || | Grok (xAI) "Dario works great and is safe. Fully functional with OpenClaw / Hermes. Gives you Opus 4.6, Sonnet & Haiku using your existing Claude Max/Pro sub. No extra API key or billing needed. Streaming + tools work perfectly. 100% open-source (~1100 lines TS), runs locally only, proper OAuth (PKCE), no telemetry. Highly recommended if you want a clean local proxy." | Copilot (Microsoft) "Verdict: Safe for local use — well-implemented with strong security practices. Minimal attack surface: 1 production dependency, PKCE OAuth, localhost-only binding, timing-safe auth, zero telemetry. The main risk vector is operator error rather than code defects." | Gemini (Google) "Highly recommended for personal, local development. Solves a massive pain point for developers by bridging Claude Max/Pro subscriptions with developer IDEs, saving substantial API costs. Modular & lean (~1100 lines), modern PKCE auth, SSRF protection, mature CI/CD pipeline with CodeQL and npm provenance attestations." | In production | || | "The 429s were driving us crazy running a multi-agent stack on Claude Max. You found the billing tag, fixed the checksum, reverse-engineered the per-request hash from the binary — running clean, zero reclassification." — @belangertrading, multi-agent stack on Claude Max | Most Claude subscription proxies have a critical billing problem: Anthropic classifies their requests as third-party and routes all usage to Extra Usage billing — even when you have Max plan limits available. You're paying for your subscription twice. dario is the only proxy that solves this. It injects native Claude Code device identity, per-request billing checksums (reverse-engineered from the Claude Code binary), and priority routing into every request — so Anthropic's billing system treats your requests exactly like Claude Code itself. Your Max plan limits work correctly, and Opus/Sonnet stay available even at high utilization. | dario | Other proxies | | |---|---|---| | Billing classification | Native Claude Code session | Third-party (Extra Usage) | | Max plan limits | Used correctly | Bypassed — billed separately | | Device identity | Injected automatically | Missing | | Priority routing | Full billing tag fingerprint | Missing | | Billing tag fingerprint | Per-request SHA-256 matching binary RE | Static or missing | | Beta flags | Match Claude Code v2.1.100 | Outdated or missing | | Billable beta filtering | Strips surprise charges | Passes everything through | vs competitors | Feature | dario | Meridian (710 stars) | CLIProxyAPI (24K stars) | |---|---|---|---| | Native billing classification | Yes | No | Inherited (CLI-only) | | Direct OAuth (streaming, tools) | Yes | Yes (SDK-based) | No | | CLI fallback (rate limit bypass) | Yes | No | Yes (only mode) | | OpenAI API compat | Yes | Yes | Yes | | Orchestration sanitization | Yes | Yes | No | | Token anomaly detection | Yes | Yes | No | | Codebase size | ~1,500 lines | ~9,000 lines | Platform | | Dependencies | 1 | Many | Many | | Setup | 2 commands | Config + build | Config + dashboard | You pay $100-200/mo for Claude Max or Pro. But that subscription only works on claude.ai and Claude Code. If you want to use Claude with any other tool — Cursor, Continue, Aider, your own scripts — you need a separate API key with separate billing. Note: Claude subscriptions have usage limits that reset on rolling 5-hour and 7-day windows. When exceeded, Opus and Sonnet may return 429 errors while Haiku continues working. You can check your utilization via Claude Code's /usage command or statusline. Use --cli mode to route through Claude Code's binary, which is not affected by these limits. dario fixes this. It creates a local proxy that translates API key auth into your subscription's OAuth tokens — and with --cli mode, routes through the Claude Code binary for uninterrupted access. Claude Code installed and logged in (recommended). Dario detects your existing Claude Code credentials automatically. If Claude Code isn't installed, dario runs its own OAuth flow — opens your browser, you authorize, done. npm install -g @askalf/dario Or use npx (no install needed): npx @askalf/dario login dario login - With Claude Code installed: Detects your credentials automatically and starts the proxy. No browser needed. - Without Claude Code: Opens your browser to Claude's OAuth page. Authorize, and dario captures the token automatically via a local callback server. Then run dario proxy to start the server. dario proxy dario — http://localhost:3456 Your Claude subscription is now an API. Usage: ANTHROPIC_BASE_URL=http://localhost:3456 ANTHROPIC_API_KEY=dario Auth: open (no DARIO_API_KEY set) OAuth: healthy (expires in 11h 42m) Model: passthrough (client decides) # Set these two env vars — every Anthropic SDK respects them export ANTHROPIC_BASE_URL=http://localhost:3456 export ANTHROPIC_API_KEY=dario # Now any tool just works openclaw start aider --model claude-opus-4-6 continue # in VS Code, set base URL in config python my_script.py If you're getting rate limited on Opus or Sonnet, use --cli mode. This routes requests through the Claude Code binary instead of hitting the API directly. Claude Code has priority routing that continues working even when direct API calls return 429. dario proxy --cli # Opus works even when rate limited dario proxy --cli --model=opus # Force Opus + CLI backend dario — http://localhost:3456 Your Claude subscription is now an API. Usage: ANTHROPIC_BASE_URL=http://localhost:3456 ANTHROPIC_API_KEY=dario Backend: Claude CLI (bypasses rate limits) Model: claude-opus-4-6 (all requests) Trade-offs vs direct API mode: | Direct API (default) | CLI Backend (--cli ) | Passthrough (--passthrough ) | | |---|---|---|---| | Streaming | Native SSE | SSE (converted from JSON) | Native SSE | | Tool use | Yes | No | Yes | | Thinking/billing injection | Yes (Claude-optimized) | N/A | No (OAuth swap only) | | Latency | Low | Higher (process spawn) | Low | | Rate limits | Priority routing | Not affected | Standard (no priority) | | Opus when throttled | Auto CLI fallback | Always works | May return 429 | For tools that need exact Anthropic protocol fidelity with zero modification, use --passthrough . This does OAuth swap only — no billing tag, no thinking injection, no device identity, no extra beta flags. Note: most tools (including Hermes and OpenClaw) work better through default mode, which handles billing classification and token optimization automatically. dario proxy --passthrough # Thin proxy, zero injection dario proxy --passthrough --model=opus # Thin proxy + model override Force a specific model for all requests — useful when your tool doesn't let you configure the model: dario proxy --model=opus # Force Opus 4.6 dario proxy --model=sonnet # Force Sonnet 4.6 dario proxy --model=haiku # Force Haiku 4.5 dario proxy # Passthrough (client decides) Full model IDs also work: --model=claude-opus-4-6 Combine with --cli for rate-limit-proof Opus: dario proxy --cli --model=opus Dario implements /v1/chat/completions — any tool built for the OpenAI API works with your Claude subscription. No code changes needed. dario proxy --model=opus export OPENAI_BASE_URL=http://localhost:3456/v1 export OPENAI_API_KEY=dario # Cursor, Continue, LiteLLM, any OpenAI SDK — all work Use --model=opus to force the model regardless of what the client sends. Or pass claude-opus-4-6 as the model name directly — Claude model names work as-is. curl http://localhost:3456/v1/messages \ -H "Content-Type: application/json" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-opus-4-6", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hello!"}] }' import anthropic client = anthropic.Anthropic( base_url="http://localhost:3456", api_key="dario" ) message = client.messages.create( model="claude-opus-4-6", max_tokens=1024, messages=[{"role": "user", "content": "Hello!"}] ) print(message.content[0].text) import Anthropic from "@anthropic-ai/sdk"; const client = new Anthropic({ baseURL: "http://localhost:3456", apiKey: "dario", }); const message = await client.messages.create({ model: "claude-opus-4-6", max_tokens: 1024, messages: [{ role: "user", content: "Hello!" }], }); curl http://localhost:3456/v1/messages \ -H "Content-Type: application/json" \ -H "anthropic-version: 2023-06-01" \ -d '{ "model": "claude-opus-4-6", "max_tokens": 1024, "stream": true, "messages": [{"role": "user", "content": "Write a haiku about APIs"}] }' # Cursor / Continue / any OpenAI-compatible tool OPENAI_BASE_URL=http://localhost:3456/v1 OPENAI_API_KEY=dario cursor # Aider ANTHROPIC_BASE_URL=http://localhost:3456 ANTHROPIC_API_KEY=dario aider --model claude-opus-4-6 # Any tool that uses ANTHROPIC_BASE_URL ANTHROPIC_BASE_URL=http://localhost:3456 ANTHROPIC_API_KEY=dario your-tool-here Add to ~/.hermes/config.yaml : model: base_url: "http://localhost:3456/v1" api_key: "dario" default: claude-opus-4-6 Then run hermes normally — it routes through dario using your Claude subscription. ┌──────────┐ ┌─────────────────┐ ┌──────────────────┐ │ Your App │ ──> │ dario (proxy) │ ──> │ api.anthropic.com│ │ │ │ localhost:3456 │ │ │ │ sends │ │ swaps API key │ │ sees valid │ │ API key │ │ for OAuth │ │ OAuth bearer │ │ "dario" │ │ bearer token │ │ token │ └──────────┘ └─────────────────┘ └──────────────────┘ ┌──────────┐ ┌─────────────────┐ ┌──────────────────┐ │ Your App │ ──> │ dario (proxy) │ ──> │ claude --print │ │ │ │ localhost:3456 │ │ (Claude Code) │ │ sends │ │ extracts prompt│ │ │ │ API │ │ spawns CLI │ │ has priority │ │ request │ │ wraps response │ │ routing │ └──────────┘ └─────────────────┘ └──────────────────┘ ┌──────────┐ ┌─────────────────┐ ┌──────────────────┐ │ Your App │ ──> │ dario (proxy) │ ──> │ api.anthropic.com│ │ │ │ localhost:3456 │ │ │ │ sends │ │ swaps API key │ │ sees valid │ │ API │ │ for OAuth │ │ OAuth bearer │ │ request │ │ nothing else │ │ token │ └──────────┘ └─────────────────┘ └──────────────────┘ - dario login — Detects your existing Claude Code credentials (~/.claude/.credentials.json ) and starts the proxy automatically. If Claude Code isn't installed, runs a PKCE OAuth flow with a local callback server to capture the token automatically. - dario proxy — Starts an HTTP server on localhost that implements the Anthropic Messages API. In direct mode, it swaps your API key for an OAuth bearer token. In CLI mode, it routes through the Claude Code binary. - Auto-refresh — OAuth tokens expire. Dario refreshes them automatically in the background every 15 minutes. Refresh tokens rotate on each use. | Command | Description | |---|---| dario login | Detect credentials and start proxy | dario proxy | Start the local API proxy | dario status | Check if your token is healthy | dario refresh | Force an immediate token refresh | dario logout | Delete stored credentials | dario help | Show usage information | | Flag/Env | Description | Default | |---|---|---| --cli | Use Claude CLI as backend (bypasses rate limits) | off | --passthrough | Thin proxy — OAuth swap only, no injection | off | --model=MODEL | Force a model (opus , sonnet , haiku , or full ID) | passthrough | --port=PORT | Port to listen on | 3456 | --verbose / -v | Log every request | off | DARIO_API_KEY | If set, all endpoints (except /health ) require matching x-api-key header or Authorization: Bearer header | unset (open) | - All Claude models (Opus 4.6, Sonnet 4.6, Haiku 4.5) + 1M extended context aliases ( opus1m ,sonnet1m ) - Native billing classification — device identity, per-request billing tag with SHA-256 checksums matching real Claude Code (extracted via binary RE), ensures Max plan limits work correctly - Stealth layer (v2.9.0) — strips thinking blocks from conversation history (saves 50-80% input tokens), scrubs non-CC fields ( temperature ,top_p ,top_k ,stop_sequences ,service_tier ), reorders JSON fields to match Claude Code's exact field order, and normalizes system prompts to exactly 3 blocks. Every request is indistinguishable from real Claude Code traffic. - Adaptive thinking — matches Claude Code's { type: 'adaptive' } mode for optimal reasoning (auto-skipped for Haiku 4.5) - Effort control — injects output_config: { effort: 'medium' } matching Claude Code's default, or passes through client-specified effort level - Enriched 429 errors — rate limit errors include utilization %, limiting window, and reset time instead of Anthropic's default "Error" message - Auto CLI fallback — if the API returns 429 and Claude Code is installed, transparently retries through claude --print with SSE conversion - OpenAI-compatible ( /v1/chat/completions ) — works with any OpenAI SDK or tool - Streaming and non-streaming (both Anthropic and OpenAI SSE formats, including tool_use streaming) - Tool use / function calling - System prompts and multi-turn conversations - Prompt caching and extended thinking - Billable beta filtering — strips extended-cache-ttl from client betas (the only prefix requiring Extra Usage) - Beta deduplication — client-provided betas are deduplicated against the base set before appending - Orchestration tag sanitization — strips agent-injected XML ( , , , etc.) before forwarding - Token anomaly detection — warns on context spike (>60% input growth) or output explosion (>2x previous) - Concurrency control (max 10 concurrent upstream requests) - CORS enabled (works from browser apps on localhost) - All Claude models — including Opus when rate limited - Streaming via SSE conversion (client sends stream: true , CLI JSON response is converted to Anthropic or OpenAI SSE events) - OpenAI compatibility (translates OpenAI → Anthropic before CLI, Anthropic → OpenAI after) - System prompts and multi-turn conversations (via context injection) - Not affected by API rate limits - All Claude models with native streaming and tool use - OAuth token swap only — no billing tag, thinking, effort, or device identity injection - Minimal beta flags ( oauth-2025-04-20 + client betas only) - For tools that need exact Anthropic protocol fidelity with zero modificat

원문 보기 (hackernews)

Genesis Park 편집팀이 AI를 활용하여 작성한 분석입니다. 원문은 출처 링크를 통해 확인할 수 있습니다.

요약

본문

관련 저널 읽기