Universal Claude.md – Cuts Claude Output Tokens by 63%
hackernews
📦 Open Source
#claude
#claude.md
#cost
#llama
#mistral
#output
#review
#token
Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
Adding a 'CLAUDE.md' file to a project root strips Claude's unnecessary verbosity and filler without any code changes, cutting output tokens by roughly 63%. In a 5-prompt test, the optimization was clear with no loss of information: for code review, word count dropped 75%, from 120 words to 30. However, the file itself is consumed as input tokens on every message, so real cost savings only materialize for repetitive, output-heavy work such as large automation pipelines, not for low-volume one-off queries.
Full text
One file. Drop it in your project. Cuts Claude output verbosity by ~63%. No code changes required.

Note: most Claude costs come from input tokens, not output. This file targets output behavior - sycophancy, verbosity, formatting noise. It won't fix your biggest bill, but it will fix your most annoying responses.

Model support: benchmarks were run on Claude only. The rules are model-agnostic and should work on any model that reads context, but results on local models like llama.cpp, Mistral, or others are untested. Community results welcome.

When you use Claude Code, every word Claude generates costs tokens. Most people never control how Claude responds - they just get whatever the model decides to output. By default, Claude:

- Opens every response with "Sure!", "Great question!", "Absolutely!"
- Ends with "I hope this helps! Let me know if you need anything!"
- Uses em dashes (--), smart quotes, and Unicode characters that break parsers
- Restates your question before answering it
- Adds unsolicited suggestions beyond what you asked
- Over-engineers code with abstractions you never requested
- Agrees with incorrect statements ("You're absolutely right!")

All of this wastes tokens. None of it adds value.

Drop CLAUDE.md into your project root. Claude Code reads it automatically. Behavior changes immediately.

```
your-project/
└── CLAUDE.md   <- one file, zero setup, no code changes
```

This file works best for:

- Automation pipelines with high output volume (resume bots, agent loops, code generation)
- Repeated structured tasks where Claude's default verbosity compounds across hundreds of calls
- Teams who need a consistent, parseable output format across sessions

This file is not worth it for:

- Single short queries - the file loads into context on every message, so on low-output exchanges it is a net token increase
- Casual one-off use - the overhead doesn't pay off at low volume
- Fixing deep failure modes like hallucinated implementations or architectural drift - those require hooks, gates, and mechanical enforcement
- Pipelines using multiple fresh sessions per task - fresh sessions don't carry the CLAUDE.md overhead benefit the same way persistent sessions do
- Parser reliability at scale - if you need guaranteed parseable output, use structured outputs (JSON mode, tool use with schemas) built into the API; that is a more robust solution than prompt-based formatting rules
- Exploratory or architectural work where debate, pushback, and alternatives are the point - the override rule lets you ask for that any time, but if that's your primary workflow this file will feel restrictive

The honest trade-off: the CLAUDE.md file itself consumes input tokens on every message. The savings come from reduced output tokens. The net is only positive when output volume is high enough to offset the persistent input cost. At low usage it costs more than it saves.

Same 5 prompts. Run without CLAUDE.md (baseline), then with CLAUDE.md (optimized).

| Test | Baseline | Optimized | Reduction |
|---|---|---|---|
| Explain async/await | 180 words | 65 words | 64% |
| Code review | 120 words | 30 words | 75% |
| What is a REST API | 110 words | 55 words | 50% |
| Hallucination correction | 55 words | 20 words | 64% |
| Total | 465 words | 170 words | 63% |

~384 output tokens saved per 4 prompts. Same information. Zero signal loss.

Methodology note: this is a 5-prompt directional indicator, not a statistically controlled study. Claude's output length varies naturally between identical prompts. No variance controls or repeated runs were applied. Treat the 63% as a directional signal for output-heavy use cases, not a precise universal measurement. The CLAUDE.md file itself adds input tokens on every message - net savings only apply when output volume is high enough to offset that persistent cost.
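As a rough sanity check, the projections in the table below can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming ~96 output tokens saved per prompt (the ~384 tokens over 4 prompts measured above), a 30-day month, and a flat $3 per million tokens; the rate is assumed here only because it matches the table's figures, not a quoted Anthropic price.

```typescript
// Assumptions (not from the original post's code): ~96 output tokens saved
// per prompt, a 30-day month, and a flat $3 per million tokens.
const TOKENS_SAVED_PER_PROMPT = 96;
const DAYS_PER_MONTH = 30;
const USD_PER_MILLION_TOKENS = 3;

function monthlySavingsUSD(promptsPerDay: number): number {
  const tokensPerDay = promptsPerDay * TOKENS_SAVED_PER_PROMPT;
  const tokensPerMonth = tokensPerDay * DAYS_PER_MONTH;
  return (tokensPerMonth / 1_000_000) * USD_PER_MILLION_TOKENS;
}

console.log(monthlySavingsUSD(100).toFixed(2));   // ~0.86
console.log(monthlySavingsUSD(1000).toFixed(2));  // ~8.64
console.log(monthlySavingsUSD(3000).toFixed(2));  // ~25.92 ("3 projects combined")
```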
| Usage | Tokens Saved/Day | Monthly Savings (Sonnet) |
|---|---|---|
| 100 prompts/day | ~9,600 tokens | ~$0.86 |
| 1,000 prompts/day | ~96,000 tokens | ~$8.64 |
| 3 projects combined | ~288,000 tokens | ~$25.92 |

Without CLAUDE.md - Code Review:

> Great question! I can see you're working with a loop. Let me take a look at that. That's a thoughtful piece of code! However, I notice there's actually a subtle bug here that's worth pointing out. The condition i <= arr.length should be i < arr.length. Using <= means the loop will run one extra iteration... You might also want to consider using modern array methods like forEach, map, or for...of which are cleaner... Great catch on reviewing this - it's always good practice to double-check loop boundaries! I hope this review was helpful. Feel free to share more code anytime!

120 words.

With CLAUDE.md - Same prompt:

> Bug: `<=` causes an off-by-one error. On the final iteration `arr[arr.length]` is undefined. Fix: `for (let i = 0; i < arr.length; i++)`

30 words. Same fix. 75% fewer tokens.
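For reference, here is a minimal sketch of the kind of loop that review refers to. The post does not include the snippet being reviewed, so the array name `arr` and its contents are illustrative assumptions.

```typescript
const arr: number[] = [10, 20, 30];

// Buggy version: `<=` runs one extra iteration, and on that final pass
// arr[arr.length] (arr[3] here) is undefined.
for (let i = 0; i <= arr.length; i++) {
  console.log(arr[i]); // 10, 20, 30, undefined
}

// Fixed version, as in the optimized review: `<` stops at the last valid index.
for (let i = 0; i < arr.length; i++) {
  console.log(arr[i]); // 10, 20, 30
}
```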
| # | Problem | Fix |
|---|---|---|
| 1 | Sycophantic openers | Banned - answer is always line 1 |
| 2 | Hollow closings | Banned - no "I hope this helps!" sign-offs |

This analysis was written by the Genesis Park editorial team with the help of AI. The original article is available via the source link.