Ask HN: Pros and cons of switching to self-hosted inference?
hackernews
💼 Business
#compliance
#inference
#openai
#privacy
#self-hosting
Original source: hackernews · Summarized and analyzed by Genesis Park
Summary
A manager considering a move from API-based access to self-hosting open-weight models, driven by data-privacy compliance requirements, is asking for input from people who have made the transition. They specifically want concrete information on cost savings compared to API fees, technical issues such as performance management and hardware utilization, and methods for allocating costs across teams. They are also looking for practical advice, including unexpected problems and tips others wish they had known in advance, so they can prepare for the switch.
Original post
Management is pushing us toward running open-weight models in-house after some compliance conversations around data privacy. Before we commit, we'd love to hear from people who've made this transition.

Specifically curious about:

- Did it actually end up cheaper than paying for API access at your request volume?
- Were there any issues managing performance, specifically latency, throughput, or hardware utilization?
- How do you handle cost visibility and attribution across teams/workloads?

Also, super curious about other aspects: what worked, what didn't, and what do you wish you'd known before switching?

Thanks in advance!

PS: We are not seeking an absolute truth, we just want to be prepared if the transition does take place.
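For the cost question above, a back-of-envelope break-even comparison is easy to sketch. Every number here is an illustrative assumption (token volume, API price per million tokens, GPU hourly rate, flat ops overhead), not a real quote; plug in your own figures:

```python
# Rough break-even sketch: API vs. self-hosted inference.
# All prices below are illustrative assumptions, not real quotes.

def monthly_api_cost(tokens_per_month: float, price_per_mtok: float) -> float:
    """API spend: pay per million tokens processed."""
    return tokens_per_month / 1_000_000 * price_per_mtok

def monthly_selfhost_cost(gpu_count: int, gpu_hourly: float,
                          ops_overhead: float = 2000.0) -> float:
    """Self-hosting: GPUs are paid for 24/7 whether utilized or not,
    plus an assumed flat ops/engineering overhead per month."""
    return gpu_count * gpu_hourly * 24 * 30 + ops_overhead

# Assumed workload: 2B tokens/month at $0.50 per 1M tokens via API,
# versus 4 GPUs at $2.00/hour plus $2k/month ops overhead.
api = monthly_api_cost(2_000_000_000, 0.50)
selfhost = monthly_selfhost_cost(4, 2.0)

print(f"API: ${api:,.0f}/mo, self-hosted: ${selfhost:,.0f}/mo")
```

The point of the sketch is the shape of the curve, not the numbers: API cost scales linearly with volume, while self-hosted cost is mostly a fixed floor, so the break-even depends almost entirely on your sustained request volume and GPU utilization.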
This analysis was produced by the Genesis Park editorial team using AI. The original post is available via the source link.