Ask HN: At ~165k tokens, does Opus 4.6 1M outperform Opus 4.6 200k?

hackernews | Machine Learning / Research
#anthropic #claude #claude-opus #machine-learning-research #model-size-estimation

Summary

NoLiMa [0] and "context rot" [1] suggest that on a ~165k-token request, Opus 200k would degrade while Opus 1M would do better (since a smaller fraction of its context window is used), yet Anthropic says they are the same model [2]. The question: at ~165k tokens, does the 1M variant actually outperform the 200k variant, or are there inference-deployment differences that change the picture?


Body

Here is a question for which I cannot find an answer, and cannot yet afford to answer myself:

NoLiMa [0] and "context rot" [1] suggest that on a ~165k-token request, Opus 200k would degrade and Opus 1M would hold up better (since a smaller fraction of the context window is used)... but they are the same model, right? Then again, there may be practical inference-deployment differences that change the whole picture. I am confused.

Anthropic says it's the same model [2]. But Claude Code's own source treats them as distinct variants with separate routing [3]. The closest test I found [4] asserts they're identical below 200k, but it never actually A/B tests them, correct?

Inside Claude Code it's probably not testable: according to this issue [5], the CLI is non-deterministic for identical inputs, and agent sessions branch on tool use. It would need a clean API-level test.

*The API-level test is what I really want, for the Claude-based features in my own apps. Is there a real benchmark for this?*

I have reached the limits of my understanding on this problem. If what I am trying to say makes any sense, any help would be greatly appreciated.

If anyone could help me ask the question better, that would also be appreciated.

[0] https://arxiv.org/abs/2502.05167
[1] https://research.trychroma.com/context-rot
[2] https://claude.com/blog/1m-context-ga
[3] https://github.com/anthropics/claude-code/issues/35545
[4] https://www.claudecodecamp.com/p/claude-code-1m-context-window
[5] https://github.com/anthropics/claude-code/issues/3370
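For what it's worth, the clean API-level test the post asks for could be sketched as a needle-in-a-haystack probe: build one identical ~165k-token prompt with a single retrievable fact buried in it, send it to each variant, and compare answers (and repeat to average over sampling noise). Below is a minimal sketch using the Anthropic Python SDK. The model ID `claude-opus-4-6` and the `context-1m-2025-08-07` beta header are assumptions, not confirmed values (the header mirrors the one Anthropic documented for Sonnet's 1M-context beta), so check the current API docs before running.

```python
# Needle-in-a-haystack A/B sketch: same ~165k-token prompt to the 200k
# and (assumed) 1M-context variants of the same model.
import os

FILLER = "The quick brown fox jumps over the lazy dog. "  # ~10 tokens
NEEDLE = "The secret codeword is BLUEBERRY-7421."

def build_haystack(target_tokens: int = 165_000, needle_pos: float = 0.5) -> str:
    """Build a long filler document with one needle sentence buried at
    roughly `needle_pos` (0.0 = start, 1.0 = end). Token counts are
    rough estimates (~10 tokens per filler sentence)."""
    n_sentences = target_tokens // 10
    parts = [FILLER] * n_sentences
    parts.insert(int(n_sentences * needle_pos), NEEDLE + " ")
    return "".join(parts)

def ask(client, model: str, haystack: str, extra_headers=None) -> str:
    """Send the same retrieval question to one model variant."""
    resp = client.messages.create(
        model=model,
        max_tokens=64,
        messages=[{"role": "user",
                   "content": haystack + "\n\nWhat is the secret codeword?"}],
        extra_headers=extra_headers or {},
    )
    return resp.content[0].text

if os.environ.get("ANTHROPIC_API_KEY"):
    import anthropic  # pip install anthropic
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env
    doc = build_haystack()
    # Hypothetical model ID and beta header -- verify against the docs.
    a = ask(client, "claude-opus-4-6", doc)
    b = ask(client, "claude-opus-4-6", doc,
            extra_headers={"anthropic-beta": "context-1m-2025-08-07"})
    print("200k variant:", a)
    print("1M variant:  ", b)
```

Sweeping `needle_pos` across 0.0–1.0 and running many trials per position would give a lost-in-the-middle curve per variant, which is closer to what NoLiMa-style results actually measure than a single probe.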