OpenBLAS
OpenBLAS link error to openmp functions
Hi Team,
I am trying to build OpenBLAS with OpenMP enabled on a Windows on ARM device, and the link step fails with the errors below. Do you know what causes them and how to fix them?
Thanks!
[Build commands]
cmake .. -G Ninja -DCMAKE_C_COMPILER=clang-cl -DCMAKE_Fortran_COMPILER=flang-new -DBUILD_SHARED_LIBS=TRUE -DUSE_THREAD=1 -DUSE_OPENMP=1 -DOpenMP_Fortran_FLAGS=-fopenmp -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release
[Error logs]
lld-link: error: undefined symbol: __kmpc_for_static_init_8
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas.omp_outlined)
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas.omp_outlined.1)
lld-link: error: undefined symbol: __kmpc_for_static_fini
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas.omp_outlined)
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas.omp_outlined.1)
lld-link: error: undefined symbol: __declspec(dllimport) omp_get_thread_num
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_threads)
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_threads)
lld-link: error: undefined symbol: __kmpc_master
referenced by CMakeFiles/LAPACK_OVERRIDES.dir/lapack-netlib/SRC/ssytrd_sb2st.F.obj:(ssytrd_sb2st_..omp_par)
referenced by CMakeFiles/LAPACK_OVERRIDES.dir/lapack-netlib/SRC/dsytrd_sb2st.F.obj:(dsytrd_sb2st_..omp_par)
referenced by CMakeFiles/LAPACK_OVERRIDES.dir/lapack-netlib/SRC/chetrd_hb2st.F.obj:(chetrd_hb2st_..omp_par)
referenced 1 more times
lld-link: error: undefined symbol: __kmpc_end_master
referenced by CMakeFiles/LAPACK_OVERRIDES.dir/lapack-netlib/SRC/ssytrd_sb2st.F.obj:(ssytrd_sb2st_..omp_par)
referenced by CMakeFiles/LAPACK_OVERRIDES.dir/lapack-netlib/SRC/dsytrd_sb2st.F.obj:(dsytrd_sb2st_..omp_par)
referenced by CMakeFiles/LAPACK_OVERRIDES.dir/lapack-netlib/SRC/chetrd_hb2st.F.obj:(chetrd_hb2st_..omp_par)
referenced 1 more times
lld-link: error: undefined symbol: omp_get_num_threads
referenced by CMakeFiles/LAPACK_OVERRIDES.dir/lapack-netlib/SRC/iparam2stage.F.obj:(iparam2stage_..omp_par)
ninja: build stopped: subcommand failed.
Unfortunately this looks like a problem in flang-new for Windows/Arm64 (maybe you can try a later patch release of LLVM 17, depending on which one you are using now).
Thanks for your reply!
Can I configure the build to link against libomp.lib to fix this issue? If so, how? BTW, I use LLVM for Windows on ARM (WoA) version 17.0.6 and CMake 3.28.
Or, how should I build OpenBLAS so that it runs with high performance?
Normally I would expect it to link libomp without any special configuration, just from having -fopenmp on the command line (which should be added automatically if you specified -DUSE_OPENMP=ON). I guess you could experiment with putting it on the target_link_libraries line (around line 308 of the toplevel CMakeLists.txt). Is it too slow for your needs when you don't use OpenMP? (There are some fixes to speed up Windows thread management in the current develop branch; it will also depend on your hardware whether the relevant BLAS functions have optimized implementations in OpenBLAS.)
BTW - unfortunately I do not have any Windows on Arm setup available to test, maybe @everton1984 can comment on the current status (I notice #3973 is still open) or @mmuetzel ?
I don't have access to Windows on ARM hardware either. Nor do I have any experience with clang-cl.
What I can tell is that OpenBLAS is built and distributed for Windows on ARM with OpenMP using clang (the MinGW version of it) by MSYS2:
https://github.com/msys2/MINGW-packages/blob/4c0259a04a205ae8175ece19fe7260be958cdf8c/mingw-w64-openblas/PKGBUILD#L98
I'm not aware of reports about issues with that version of OpenBLAS for Windows on ARM.
You have iomp5 symbols in your build output. You need to start with a clean source tree and make different builds in different (sub-)directories.
@brada4 what do you mean? The kmpc ones are definitely in LLVM's OpenMP runtime.
Heh, the same comes out if you mix up MKL linker commands... A clean rebuild is one possibility.
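To make it easier to see which kind of symbol the linker is missing, here is a minimal sketch that groups the undefined symbols from an lld-link log. The function names and the three labels are this sketch's own and not part of any tool. It assumes only what the thread states, namely that the `__kmpc_` entry points live in the OpenMP runtime library, and treats names starting with `omp_` as OpenMP API functions.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   lld-link: error: undefined symbol: __declspec(dllimport) omp_get_thread_num
UNDEFINED = re.compile(
    r"undefined symbol:\s+(?:__declspec\(dllimport\)\s+)?(\S+)"
)


def classify(symbol):
    """Return a coarse label for a symbol (the labels are this sketch's own)."""
    if symbol.startswith("__kmpc_"):
        return "compiler-generated OpenMP runtime call"
    if symbol.startswith("omp_"):
        return "OpenMP API function"
    return "other"


def group_undefined(log_text):
    """Map each label to the sorted, de-duplicated symbols found in the log."""
    groups = defaultdict(set)
    for match in UNDEFINED.finditer(log_text):
        symbol = match.group(1)
        groups[classify(symbol)].add(symbol)
    return {label: sorted(symbols) for label, symbols in groups.items()}


if __name__ == "__main__":
    sample = (
        "lld-link: error: undefined symbol: __kmpc_for_static_init_8\n"
        "lld-link: error: undefined symbol: __declspec(dllimport) omp_get_thread_num\n"
        "lld-link: error: undefined symbol: __kmpc_master\n"
    )
    for label, symbols in group_undefined(sample).items():
        print(label, symbols)
```

If every missing symbol falls into the first two groups, as in the logs in this thread, the objects were compiled with OpenMP enabled but no OpenMP runtime library reached the link line.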
I've downloaded the OpenMP files libomp.dll and libomp.a from https://packages.msys2.org/package/mingw-w64-clang-aarch64-openmp?repo=clangarm64. Where should I put them so that OpenBLAS can link against OpenMP?
Here is some more information.
OS: Windows 11 ARM64
OpenBLAS version: v0.3.24
Build tools: Visual Studio 2022 + CMake 3.28 + LLVM 17.0.6
Hi team,
What do the generation logs below mean?
Key logs:
Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
Looking for pthread_create in pthreads - not found
Detailed logs:
-- fortran lapack
-- Building deprecated routines
-- Building Single Precision
-- Building Double Precision
-- Building Single Precision Complex
-- Building Double Precision Complex
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Found Threads: TRUE
-- Generating openblas_config.h in include/openblas
-- Generating f77blas.h in include/openblas
-- Generating cblas.h in include/openblas
-- Copying LAPACKE header files to include/openblas
-- Configuring done (14.4s)
-- Generating done (0.6s)
No idea. If you mix MinGW and LLVM files, such link errors are expected. Just make the development tools install as clean as possible.
Hi team,
I've checked, and LLVM 17.0.6 doesn't ship libomp.lib or libomp.dll. Can OpenMP be enabled for OpenBLAS on a Windows x64 device?
Visual Studio includes both the old Microsoft OpenMP runtime (msomp) and LLVM OpenMP. If you are using clang-cl.exe, you need to select one with -openmp=llvm or ms.
What is the exact OpenMP flag to pass when using clang-cl.exe?
Thanks!
Use plain clang.exe; clang-cl is just a partial cl.exe replica. Or see: https://devblogs.microsoft.com/cppblog/improved-openmp-support-for-cpp-in-visual-studio/
Hi Team,
On a Windows on ARM device, I use the commands below to generate and build OpenBLAS. The generation logs contain "Found OpenMP_C: -Xclang -fopenmp (found version "5.1")", but the build still fails with OpenMP link errors.
I use LLVM 17.0.6 and CMake 3.28.
Do you know the reason? How can I link against the correct OpenMP functions?
Thanks a lot!
[Build commands]
cmake .. -G Ninja -DCMAKE_C_COMPILER=clang-cl -DNOFORTRAN=1 -DBUILD_SHARED_LIBS=TRUE -DUSE_OPENMP=1 -DCMAKE_BUILD_TYPE=Release -DBUILD_WITHOUT_LAPACK=0
cmake --build . --config Release
[Generation logs]
-- Running getarch
-- GETARCH results:
CORE=ARMV8
LIBCORE=armv8
NUM_CORES=12
MAKEFLAGS += -j 12

-- Compiling a 64-bit binary.
-- Found OpenMP_C: -Xclang -fopenmp (found version "5.1")
-- Found OpenMP: TRUE (found version "5.1")
-- Building Single Precision
-- Building Double Precision
-- Building Complex Precision
-- Building Double Complex Precision
[Link error logs]
[6754/6754] Linking C shared library lib\openblas.dll
FAILED: lib/openblas.dll lib/Release/openblas.lib
lld-link: error: undefined symbol: __declspec(dllimport) omp_get_max_threads
referenced by interface/CMakeFiles/interface.dir/CMakeFiles/saxpy.c.obj:(saxpy_)
referenced by interface/CMakeFiles/interface.dir/CMakeFiles/saxpy.c.obj:(saxpy_)
referenced by interface/CMakeFiles/interface.dir/CMakeFiles/sswap.c.obj:(sswap_)
referenced 503 more times
lld-link: error: undefined symbol: __declspec(dllimport) omp_in_parallel
referenced by interface/CMakeFiles/interface.dir/CMakeFiles/saxpy.c.obj:(saxpy_)
referenced by interface/CMakeFiles/interface.dir/CMakeFiles/saxpy.c.obj:(saxpy_)
referenced by interface/CMakeFiles/interface.dir/CMakeFiles/sswap.c.obj:(sswap_)
referenced 485 more times
lld-link: error: undefined symbol: __kmpc_global_thread_num
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas)
lld-link: error: undefined symbol: __kmpc_push_num_threads
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas)
lld-link: error: undefined symbol: __kmpc_fork_call
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas)
lld-link: error: undefined symbol: __kmpc_for_static_init_8
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas.omp_outlined)
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas.omp_outlined.1)
lld-link: error: undefined symbol: __kmpc_for_static_fini
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas.omp_outlined)
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_blas.omp_outlined.1)
lld-link: error: undefined symbol: __declspec(dllimport) omp_get_thread_num
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_threads)
referenced by driver/others/CMakeFiles/driver_others.dir/blas_server_omp.c.obj:(exec_threads)
ninja: build stopped: subcommand failed.
It links to the older OpenMP runtime provided by Microsoft instead of the one CMake detected. Using clang.exe as the C compiler should be the easy way around this.
With clang.exe I no longer get OpenMP link errors. However, the resulting openblas.lib is only about 1 KB, which is very small, and I can't find any exported functions in the built openblas.dll.
Is anything wrong? How should I configure OpenMP for OpenBLAS?
[Build commands]
cmake .. -G Ninja -DCMAKE_C_COMPILER=clang -DNOFORTRAN=1 -DBUILD_SHARED_LIBS=TRUE -DUSE_OPENMP=1 -DCMAKE_BUILD_TYPE=Release -DBUILD_WITHOUT_LAPACK=0
cmake --build . --config Release
[File openblas.dll without any export symbol]
dumpbin /exports .\openblas.dll
Microsoft (R) COFF/PE Dumper Version 14.38.33133.0
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file .\openblas.dll
File Type: DLL
Summary
1000 .00cfg
8000 .data
E000 .pdata
36000 .rdata
2000 .reloc
1000 .rsrc
8DD000 .text
10000 .tls
.text is roughly 9 MB (0x8DD000 bytes), which is reasonable for a single CPU type. Frankly, I have no idea how the Microsoft OpenMP runtime gets in the way; it is not in Visual Studio by default.
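A complementary check to the dumpbin output above is to load the DLL and ask it how it was built. This is a minimal sketch, assuming a default build without symbol prefixes or suffixes. The helper names `describe_parallel` and `inspect_library` are made up for this example, while `openblas_get_config`, `openblas_get_parallel` and `openblas_get_num_threads` are the runtime query functions declared in OpenBLAS's cblas.h.

```python
import ctypes

# Return values of openblas_get_parallel(), as defined in cblas.h
# (OPENBLAS_SEQUENTIAL, OPENBLAS_THREAD, OPENBLAS_OPENMP).
PARALLEL_MODES = {0: "sequential", 1: "built-in threading", 2: "OpenMP"}


def describe_parallel(code):
    """Translate the integer from openblas_get_parallel() into a readable label."""
    return PARALLEL_MODES.get(code, f"unknown ({code})")


def inspect_library(path):
    """Load an OpenBLAS shared library and report its build configuration.

    ctypes.CDLL raises OSError if the library or one of its dependencies
    (for example the OpenMP runtime DLL) cannot be found.
    """
    lib = ctypes.CDLL(path)
    try:
        get_config = lib.openblas_get_config
        get_parallel = lib.openblas_get_parallel
        get_num_threads = lib.openblas_get_num_threads
    except AttributeError as exc:
        # The library loaded, but the expected symbols are not exported.
        return {"exports_found": False, "detail": str(exc)}
    get_config.restype = ctypes.c_char_p
    get_parallel.restype = ctypes.c_int
    get_num_threads.restype = ctypes.c_int
    return {
        "exports_found": True,
        "config": get_config().decode("ascii", errors="replace"),
        "parallel": describe_parallel(get_parallel()),
        "num_threads": get_num_threads(),
    }

# Example call (the path is a placeholder for wherever the build put the DLL):
# print(inspect_library(r".\lib\openblas.dll"))
```

A result with `exports_found` set to False would match the empty dumpbin export listing, whereas a `parallel` value of "OpenMP" would confirm that the OpenMP threading backend was actually compiled in.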
On the Windows on ARM device, I can build OpenBLAS with the commands below, but OpenBLAS performance does not seem to improve.
Do you know the reason?
[Build commands]
cmake .. -G Ninja -DCMAKE_C_COMPILER=clang-cl -DNOFORTRAN=1 -DBUILD_SHARED_LIBS=TRUE -DCMAKE_BUILD_TYPE=Release -DPARALLEL=1 -DBUILD_WITHOUT_LAPACK=0 -DUSE_OPENMP=1 -DOpenMP_C_FLAGS="-fopenmp=libomp" -DOpenMP_C_LIB_NAMES="libomp" -DOpenMP_libomp_LIBRARY="libomp.lib"
cmake --build . --config Release -j32
Please present some measurements, for example by integrating the various OpenBLAS builds into Octave or R and running the same benchmark scripts over and over. You asked for OpenMP, which means you can call OpenBLAS from OpenMP parallel sections and manage the parallelism yourself across multiple, now single-threaded, OpenBLAS calls.
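For repeatable measurements, one option is a small harness that runs the same benchmark command several times and reports the spread. This is only a sketch: the `time_command` and `summarize` helpers are hypothetical, and the command in the `__main__` block is a placeholder that does not exercise OpenBLAS at all. It would need to be replaced by the real benchmark, for example an Octave or R script linked against the build under test.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, repeats=5):
    """Run argv `repeats` times and return the wall-clock durations in seconds.

    Process start-up time is included, so the workload should run long
    enough for that overhead to be negligible.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to a few robust summary numbers."""
    return {
        "runs": len(durations),
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Placeholder workload, NOT an OpenBLAS benchmark: swap in the real command.
    placeholder = [sys.executable, "-c", "sum(i * i for i in range(100000))"]
    print(summarize(time_command(placeholder, repeats=3)))
```

Running the harness once per build, and once per value of the standard OMP_NUM_THREADS environment variable, gives numbers that can be compared directly between the OpenMP and non-OpenMP builds.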
Hi,
How do I use multiple threads with OpenMP in OpenBLAS? Which configuration should I use?
Do you have a demo configuration or app?
Thanks!
Call OpenBLAS from top level, not from within extra OpenMP pragmas? Should be obvious if you program OpenMP.
You can try experimenting with the sources in the cpp_thread_test directory. If your code is calling BLAS functions from an OpenMP parallel region, OpenBLAS will currently use only one thread in each of the parallel calls.