lapack
CBLAS install step with USE_OPTIMIZED_BLAS=ON
Problem
I configure CBLAS (and LAPACKE) out of source with
cmake -DUSE_OPTIMIZED_BLAS=ON -DCBLAS=ON -DUSE_OPTIMIZED_LAPACK=ON -DLAPACKE=ON ..
Building (make) works without problems.
When I then use make install, cblas-targets.cmake does not get installed.
However, the installed cblas-config.cmake directly refers to it, causing errors when it is used by find_package.
I have tracked this down to the fact that ALL_TARGETS will be empty at the point when it is checked in
https://github.com/Reference-LAPACK/lapack/blob/8e1e16c6300861f096744a5ab75c3572c4b9b756/CBLAS/CMakeLists.txt#L51
Since this is my first time dealing with the LAPACK code, I don't quite understand why that variable is checked here in the first place.
In the comparable place of the LAPACKE code
https://github.com/Reference-LAPACK/lapack/blob/8e1e16c6300861f096744a5ab75c3572c4b9b756/LAPACKE/CMakeLists.txt#L129
no such check is performed.
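To make the mismatch concrete, here is a simplified sketch of the pattern as I understand it. It is not a copy of the upstream CBLAS/CMakeLists.txt: the destination paths are placeholders I made up, and only the names ALL_TARGETS, cblas-config.cmake and cblas-targets come from the report above.

```cmake
# Simplified sketch of the reported pattern, NOT the upstream file.
# The destination "lib/cmake/cblas" is a placeholder chosen for illustration.

# The generated config file is installed unconditionally ...
install(FILES ${CMAKE_CURRENT_BINARY_DIR}/cblas-config.cmake
  DESTINATION lib/cmake/cblas)

# ... while the export file it refers to is only installed when
# ALL_TARGETS is non-empty. If ALL_TARGETS is empty at this point,
# cblas-targets.cmake is never written to the install tree.
if(ALL_TARGETS)
  install(EXPORT cblas-targets
    DESTINATION lib/cmake/cblas)
endif()
```

If this sketch matches the real logic, an empty ALL_TARGETS leaves an installed cblas-config.cmake that points at a file that does not exist, which would explain the find_package error.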
Solution?
If I comment out the above-mentioned check, installation works as intended and I can use find_package(CBLAS) as usual.
Normally I would make a pull request, but I don't understand the purpose of ALL_TARGETS well enough to suggest proper changes right now.
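For reference, this is roughly how a downstream project would consume the installed package once cblas-targets.cmake is present. It is a minimal sketch: the project name, the demo executable and main.c are hypothetical, and the imported target name cblas is an assumption about what the export file defines.

```cmake
cmake_minimum_required(VERSION 3.12)
project(cblas_consumer C)  # hypothetical project name

# Config mode: looks for CBLASConfig.cmake or cblas-config.cmake.
find_package(CBLAS CONFIG REQUIRED)

# main.c is a hypothetical source file that includes cblas.h.
add_executable(demo main.c)

# Assumption: the exported target is named "cblas".
target_link_libraries(demo PRIVATE cblas)
```

If the installation lives in a non-standard prefix, passing -DCMAKE_PREFIX_PATH=<install prefix> at configure time lets find_package locate it.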
Environment
OS: Arch Linux (5.6.6-arch1-1)
CMake: 3.17.1
@hello, this is not directly related to the issue, but it is somewhat relevant: for people writing C/C++ programs, we (I) rewrote from scratch a full reference implementation of Fortran BLAS in vanilla C (no CBLAS, no compromise). The complex routines are not included yet; it is a work in progress.
Reference link: https://github.com/moe123/macadam/tree/master/macadam/details/numa/lapack/blas
It can be compiled for C-native row-major access as well as column-major access to mock up Fortran intrinsics (see mc_blas_access.h). However, everything is inlined in headers, so if binary size matters to you, make sure to bootstrap those calls within your program.
Note: we took the liberty of interfacing the long double type using the prefix l, and all families are implemented in a single file.
@moe123 thanks for the info! We mostly target supercomputers and rely on optimized BLAS/LAPACK installations, which is why we probably need to stick to CBLAS and LAPACKE for now...
I'm also currently working on a pull request for this issue, but am hoping to hear from some contributors before I post it.
@derpda yes, I understand. I am shaping the future (LOL). It is only a few weeks of spare-time work, but it is building up. I have been delayed by other matters in life (actual work), and I am also having trouble fully unit testing the DQZ implementation within LAPACK and its C counterpart. To me, the actual code addresses corner cases that are long gone, but I might be wrong.
I have seen some logs going on about your issue.
I would certainly hope that native C or better yet C++ implementations are in the near future... :D
For now, I have made changes to the CMake code in my fork that partially fix the issue, but I have to figure out a few more things. Hopefully I can finish my pull request early next week and get some feedback here.
Certainly is the word; that's the reason I am doing it. Even if an algorithm was designed and shaped to take advantage of column-major access, there is no comparison, in terms of processing, between changing/forcing the memory access on the one hand and transposing, copying, querying, and transposing the results back on the other, with all the possible man-made software bugs that could be introduced during those tedious operations.
Solved by #413, I believe. Let me know if not.