Intel Delivers Open, Scalable AI Performance in MLPerf Inference v6.0 - 게임뷰
Source: [AI] ai performance optimization · Summary and analysis by Genesis Park
Summary
In the MLPerf Inference v6.0 benchmarks published by MLCommons, Intel demonstrated open, scalable AI performance with systems pairing Xeon 6 CPUs and Arc Pro B70 GPUs. The new Arc Pro B70 delivered up to 1.8x higher inference performance than its predecessor, and a four-GPU configuration provides 128GB of VRAM, enough to run a 120-billion-parameter model. Through software optimization alone, Intel achieved up to 1.18x higher performance on the same hardware compared with the previous benchmark round, offering a solution for high-performance AI workloads.
Full text
MLCommons released its latest MLPerf Inference v6.0 benchmarks, showcasing results across four key benchmarks for Intel's GPU systems. Intel's AI systems featured Intel® Xeon® 6 CPUs and Intel® Arc™ Pro B70 graphics, demonstrating accessible AI workload solutions across high-end workstations, datacenters, and edge applications.

The results show that a four-GPU Intel Arc Pro B70/B65 system delivers 128GB of VRAM to run 120B-parameter models with high concurrency, with the Arc Pro B70 providing up to 1.8x higher inference performance than the Arc Pro B60. Software optimizations, delivered in an open, containerized software stack, efficiently scale inference performance from single-node to multi-GPU enterprise deployments, yielding up to 1.18x higher performance on the same Intel Arc Pro B60 hardware versus MLPerf v5.1.

"The combination of Intel Xeon 6 and Intel's Arc Pro B-Series GPUs represents our investment to expand customer choice and value, offering real-world solutions that address both LLM models and traditional machine learning workloads, with leading performance and incredible value for graphics professionals and AI developers worldwide." - Anil Nanduri, Intel vice president, AI Products and GTM, Intel Data Center Group

Why It Matters: As demand for AI inference grows, the professional compute market is going through a major transition in which graphics creators and AI developers seek performance and value without compromising data privacy or incurring heavy subscription costs tied to proprietary AI models. Intel GPU systems, featuring the newly launched Intel Arc Pro B70/B65 GPUs, are designed to meet the needs of modern AI inference and provide an all-in-one inference platform combining full-stack validated hardware and software.
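The 128GB-VRAM figure can be sanity-checked with a back-of-envelope calculation. This sketch is illustrative only (it counts weight memory alone and ignores activations, KV cache, and runtime overhead, and the quantization formats shown are assumptions, not details from the article): a 120B-parameter model fits in 128GB only at roughly one byte per parameter or less.

```python
# Back-of-envelope weight-memory estimate for a 120B-parameter model.
# Illustrative only; not Intel's sizing methodology.

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the model weights, in GB."""
    return params_billion * 1e9 * bytes_per_param / 1e9

VRAM_GB = 128  # four Arc Pro B70/B65 cards, per the article

for label, bpp in [("FP16", 2.0), ("FP8/INT8", 1.0), ("INT4", 0.5)]:
    need = weight_memory_gb(120, bpp)
    verdict = "fits" if need <= VRAM_GB else "does not fit"
    print(f"{label}: ~{need:.0f} GB of weights -> {verdict} in {VRAM_GB} GB VRAM")
```

At FP16 the weights alone (~240 GB) exceed the pooled VRAM, while an 8-bit format (~120 GB) leaves a small margin, consistent with the article's "120B parameter models with high concurrency" claim being tied to the full four-GPU pool.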
With enhanced memory capacity, they aim to simplify adoption with a containerized solution built for Linux environments, optimized to deliver strong inference performance with multi-GPU scaling and PCIe P2P data transfers, and designed to include enterprise-class reliability and manageability features such as ECC, SR-IOV, telemetry, and remote firmware updates. For example, compared with comparable competitor GPU solutions, the Intel Arc Pro B70 can handle significantly larger models and context windows in multi-GPU setups, providing up to 1.6x as much KV cache capacity when running larger models.

AI inference is increasingly defined not only by GPU throughput but also by CPU-accelerated system performance. The CPU shapes overall cluster efficiency and total cost of ownership, and is responsible for critical functions such as memory management, task orchestration, and workload distribution, while ensuring the security, reliability, and operational continuity essential to modern AI infrastructure. Intel remains the only server processor vendor to submit stand-alone CPU results for MLPerf inference benchmarks, underscoring its leadership and commitment to advancing AI inference across both compute-centric and accelerator-centric platforms. As the most widely used host CPU in AI-accelerated systems, with over half of MLPerf 6.0 submissions powered by Xeon, Intel further reinforces its position at the core of the industry's AI infrastructure. This leadership extends to the silicon itself: Intel Xeon 6 processors with P-cores delivered up to a 1.9x generational performance gain in MLPerf Inference v5.1, while built-in AI acceleration technologies such as AMX and AVX-512 allow workloads like LLM inference, fine-tuning, and classical machine learning to run efficiently without dedicated accelerator hardware.

Copyright © 게임뷰. Unauthorized reproduction and redistribution prohibited.
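The KV-cache capacity claim above can be made concrete with a rough sizing sketch. The model dimensions below are hypothetical (the article does not specify an architecture); the point is only that cache size grows linearly with context length and concurrent sequences, so extra VRAM headroom translates directly into longer contexts or higher concurrency.

```python
# Rough KV-cache sizing sketch. Model shape is hypothetical,
# not taken from the article or any Intel benchmark submission.

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                tokens: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV-cache size in GB for one sequence.

    The leading factor of 2 accounts for separate K and V tensors
    stored per layer; bytes_per_elem=2 assumes FP16/BF16 cache.
    """
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_elem / 1e9

# Example: a 70B-class shape (80 layers, 8 KV heads of dim 128,
# i.e. grouped-query attention) at a 32k-token context.
print(f"{kv_cache_gb(80, 8, 128, 32768):.1f} GB")  # ≈ 10.7 GB per sequence
```

At roughly 10 GB of cache per long-context sequence, a 1.6x capacity advantage is the difference between serving, say, five concurrent long-context requests and eight, which is why the article frames KV-cache headroom as a concurrency and context-window benefit rather than raw throughput.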
This analysis was produced by the Genesis Park editorial team with the assistance of AI. The original article is available via the source link.