Show HN: LAPACK without Fortran 77; a C11 translation

Source: hackernews · Summarized and analyzed by Genesis Park

Summary

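A project has been released that removes the Fortran 77 dependency from the reference LAPACK linear algebra library by translating it line by line into C11 on top of the standard CBLAS interface. This avoids name mangling and the ABI complications of LP64/ILP64 builds, and a compile-time probe auto-detects the integer width of the linked BLAS. Routines in all four precisions (float, double, complex, double-complex) are 99% translated and pass all of the roughly 450,000 official test cases. A substantial part of the code was generated with a paid LLM subscription under a strict mechanical-translation rule, although some routines also include deliberate, human-made algorithmic improvements.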

Body

A C implementation of the LAPACK linear algebra library, removing the Fortran dependency while leveraging optimized vendor BLAS through the standard CBLAS interface.

BLAS and LAPACK libraries have been the foundation of numerical linear algebra for decades. Vendors have rewritten BLAS in C/Assembly for performance, but LAPACK remains in Fortran 77, and embedding Fortran into C projects brings a cascade of integration concerns, from name mangling and calling conventions to 1-based indexing and a Fortran runtime dependency. On top of that, the BLAS ecosystem is fragmented across providers (OpenBLAS, MKL, BLIS, Accelerate), each with its own symbol and linking conventions. A native C implementation built on the standard CBLAS interface sidesteps both problems.

This project is a line-by-line C translation of the reference LAPACK. Because we control the LAPACK layer code, our only external dependency is the CBLAS interface (about 150 functions). This drastically reduces the ABI surface compared to projects wrapping vendor Fortran LAPACK, and makes LP64/ILP64 support a clean dual build without symbol mangling. A compile-time probe auto-detects the linked BLAS integer width (via a clever trick we learned from libblastrampoline).

In practice, this means a full BLAS/LAPACK stack that needs only a C compiler. For example, you can build OpenBLAS with only its CBLAS option, link it against this project, and have a complete LAPACK stack built entirely with a C compiler.

All four precisions (double, single, complex, double-complex) are 99% translated (DMD and a few auxiliary routines are missing). Note that XBLAS extra-precision variants are out of scope. The tests are a full port of LAPACK's official test suite (~450K parametrized test cases).

This work is heavily assisted by a paid LLM subscription with personal financing.
While there is a lot of manual labor involved in porting the critical routines, a significant portion of the codebase was produced predominantly by the LLM, in particular:

- some of the C code for the double precision routines,
- porting the test code and the Meson/CMocka conversion,
- S, C, and Z precision generation from the D routines (via script),
- the .rst file generation script,
- generating C-style docstrings from the original Fortran comments.

This creates an obvious ethical concern; we acknowledge and share it, hence the disclosure. A lot of care has gone into ensuring the resulting C code is a mechanical translation with no algorithmic changes. If you notice something that looks like a breach of copyright, please let us know. That should not happen.

There are also intentional, human-made modifications. For example, the LU factorization in getrf incorporates a technique we learned from the excellent faer project. These are deliberate improvements, not accidental machine-generated code.

Building requires a C11 compiler, Meson >= 1.1.0, and a CBLAS implementation (OpenBLAS, MKL, ...). Tests require CMocka >= 2.0. See the building guide for full details, including BLAS vendor selection and ILP64 options.

    meson setup builddir
    ninja -C builddir
    meson test -C builddir

To build the documentation:

    pip install -r doc/requirements.txt
    cd doc && make html

See the contributing guide for details on the Doxygen + Sphinx pipeline.

License: BSD-3-Clause. See LICENSE.
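To make the indexing and dual-build points above concrete, here is a minimal sketch of what a line-by-line translation of a Fortran loop nest can look like in C. It is not taken from the project: `demo_int`, `DEMO_ILP64`, `A_AT`, and `demo_one_norm` are hypothetical names, and the routine only resembles the column-sum loop of a one-norm computation. The assumption is that the LP64/ILP64 choice reduces to one integer typedef selected at build time, and that Fortran's 1-based `A(I,J)` maps to 0-based column-major storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build switch: one typedef decides LP64 vs ILP64. */
#ifdef DEMO_ILP64
typedef int64_t demo_int;
#else
typedef int32_t demo_int;
#endif

/* Fortran A(I,J), 1-based and column-major, becomes a[i + j*lda]
 * with i = I - 1 and j = J - 1. */
#define A_AT(a, lda, i, j) ((a)[(size_t)(i) + (size_t)(j) * (size_t)(lda)])

/* Fortran shape being translated (illustrative only):
 *     VALUE = ZERO
 *     DO J = 1, N
 *        SUM = ZERO
 *        DO I = 1, M
 *           SUM = SUM + ABS( A( I, J ) )
 *        END DO
 *        IF( VALUE.LT.SUM ) VALUE = SUM
 *     END DO
 */
double demo_one_norm(demo_int m, demo_int n, const double *a, demo_int lda)
{
    assert(lda >= m);            /* leading dimension must cover a column */
    double value = 0.0;
    for (demo_int j = 0; j < n; j++) {
        double sum = 0.0;
        for (demo_int i = 0; i < m; i++) {
            double x = A_AT(a, lda, i, j);
            sum += (x < 0.0) ? -x : x;   /* ABS without needing libm */
        }
        if (value < sum)
            value = sum;
    }
    return value;
}
```

The loop bounds shift from `1..N` to `0..n-1` while the loop body stays in the same order as the Fortran, which is the sense in which such a translation stays mechanical. Compiling the same file with and without `-DDEMO_ILP64` gives the two integer widths without renaming any symbol.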
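The integer-width probe mentioned above can be illustrated with a small simulation. This is a sketch of the idea as libblastrampoline describes it, not the project's actual probe: the argument `n` is given the 64-bit pattern `0xffffffff00000003`, which an ILP64 library reads as a negative length (so an index-of-max routine returns 0) while an LP64 library sees only the low 32 bits, that is `3`. The two `mock_isamax_*` functions are hypothetical stand-ins for a real BLAS; a real probe would call the library's own `isamax` symbol instead, and how the project wires that into its build is not shown here.

```c
#include <stdint.h>

/* 1-based index of the first element with the largest absolute value,
 * or 0 when n < 1 (the behaviour of the reference BLAS ISAMAX). */
int64_t index_of_max_abs(int64_t n, const float *x)
{
    if (n < 1)
        return 0;
    int64_t best = 0;
    float best_val = (x[0] < 0.0f) ? -x[0] : x[0];
    for (int64_t i = 1; i < n; i++) {
        float v = (x[i] < 0.0f) ? -x[i] : x[i];
        if (v > best_val) {
            best_val = v;
            best = i;
        }
    }
    return best + 1;
}

/* Hypothetical ILP64 library: interprets all 64 bits of n. */
int64_t mock_isamax_ilp64(int64_t n_bits, const float *x)
{
    return index_of_max_abs(n_bits, x);
}

/* Hypothetical LP64 library: only the low 32 bits of n are seen. */
int64_t mock_isamax_lp64(int64_t n_bits, const float *x)
{
    int32_t n = (int32_t)((uint64_t)n_bits & 0xffffffffu);
    return index_of_max_abs(n, x);
}

typedef int64_t (*isamax_fn)(int64_t, const float *);

/* Returns 64 for an ILP64 library, 32 for LP64, 0 if inconclusive. */
int probe_blas_int_width(isamax_fn isamax)
{
    const float x[3] = {1.0f, 2.0f, 1.0f};
    /* Bit pattern 0xffffffff00000003: negative as 64-bit, 3 as 32-bit. */
    const int64_t n = INT64_C(-4294967293);
    int64_t idx = isamax(n, x);
    if (idx == 0)
        return 64;   /* n was read as negative */
    if (idx == 2)
        return 32;   /* n was read as 3; the largest element is the second */
    return 0;
}
```

With these mocks, `probe_blas_int_width(mock_isamax_lp64)` returns 32 and `probe_blas_int_width(mock_isamax_ilp64)` returns 64. The simulation passes `n` by value to stay within defined C behaviour; against a real Fortran-style symbol the argument would be passed by pointer.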

This analysis was written by the Genesis Park editorial team with the help of AI. The original can be found via the source link.
