qmcpack
qmcpack copied to clipboard
[WIP] Use HIP as language in CMake
Please review the developer documentation on the wiki of this project that contains help and requirements.
Proposed changes
Describe what this PR changes and why. If it closes an issue, link to it here with a supported keyword.
Fixes #3581
What type(s) of changes does this code introduce?
Delete the items that do not apply
- Code style update (formatting, renaming)
- Refactoring (no functional changes, no api changes)
- Build related changes
Does this introduce a breaking change?
- Yes (requires CMake 3.21 or later)
What systems has this change been tested on?
Checklist
Update the following with a yes where the items apply. If you're unsure about any of them, don't hesitate to ask. This is simply a reminder of what we are going to look for before merging your code.
- Yes. This PR is up to date with current the current state of 'develop'
- No. Documentation has been added (if appropriate)
Can one of the admins verify this patch?
@ye-luo I'm getting the following warnings when building and suspect it's related to CMAKE_HIP_ARCHITECTURES
.
clang-13: warning: argument unused during compilation: '--offload-arch=gfx906' [-Wunused-command-line-argument]
clang-13: warning: argument unused during compilation: '--offload-arch=gfx908' [-Wunused-command-line-argument]
Is this OK on spock?
@quantumsteve could you paste your cmake line here? I mean on a regular workstation not spock.
I got a strange error
cmake -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DQMC_MPI=0 -DENABLE_CUDA=ON -DQMC_CUDA2HIP=ON ..
...
-- Check for working HIP compiler: /opt/rocm-4.5.0/llvm/bin/clang++
-- Check for working HIP compiler: /opt/rocm-4.5.0/llvm/bin/clang++ - broken
CMake Error at /home/packages/spack/opt/spack/linux-ubuntu20.04-zen2/gcc-9.3.0/cmake-3.21.4-t47xgwhz7ql3mvemzhzc7zwsrly6na3s/share/cmake-3.21/Modules/CMakeTestHIPCompiler.cmake:65 (message):
The HIP compiler
"/opt/rocm-4.5.0/llvm/bin/clang++"
is not able to compile a simple test program.
It fails with the following output:
Change Dir: /home/yeluo/opt/cleanup/qmcpack/build_R7_aompbuild_offload_cuda2hip_real_check/CMakeFiles/CMakeTmp
Run Build Command(s):/usr/bin/make -f Makefile cmTC_90a18/fast && /usr/bin/make -f CMakeFiles/cmTC_90a18.dir/build.make CMakeFiles/cmTC_90a18.dir/build
make[1]: Entering directory '/home/yeluo/opt/cleanup/qmcpack/build_R7_aompbuild_offload_cuda2hip_real_check/CMakeFiles/CMakeTmp'
Building HIP object CMakeFiles/cmTC_90a18.dir/testHIPCompiler.hip.o
/opt/rocm-4.5.0/llvm/bin/clang++ -D__HIP_ROCclr__=1 -I/opt/rocm-4.5.0/hip/include -I/opt/rocm-4.5.0/include -isystem /opt/rocm-4.5.0/hip/../include -isystem /opt/rocm-4.5.0/llvm/lib/clang/13.0.0 -fopenmp--cuda-host-only --offload-arch=gfx906 -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -o CMakeFiles/cmTC_90a18.dir/testHIPCompiler.hip.o -x hip -c /home/yeluo/opt/cleanup/qmcpack/build_R7_aompbuild_offload_cuda2hip_real_check/CMakeFiles/CMakeTmp/testHIPCompiler.hip
clang-13: error: unknown argument: '-fopenmp--cuda-host-only'
make[1]: *** [CMakeFiles/cmTC_90a18.dir/build.make:78: CMakeFiles/cmTC_90a18.dir/testHIPCompiler.hip.o] Error 1
make[1]: Leaving directory '/home/yeluo/opt/cleanup/qmcpack/build_R7_aompbuild_offload_cuda2hip_real_check/CMakeFiles/CMakeTmp'
make: *** [Makefile:127: cmTC_90a18/fast] Error 2
no idea where -fopenmp--cuda-host-only
si from
@ye-luo I had some trouble getting the spacing right between arguments in CMAKE_HIP_FLAGS
. Expect it should be -fopenmp --cuda-host-only
Last commit fixes that issue on Nitrogen. @prckent I haven't tried building on spock.
Using the same cmake configuration as GitHub Actions:
cmake -GNinja -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DQMC_MPI=0 -DENABLE_CUDA=ON -DQMC_CUDA2HIP=ON -DCMAKE_PREFIX_PATH=/opt/OpenBLAS/0.3.18 -DQMC_COMPLEX=0 -DQMC_MIXED_PRECISION=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
I'm getting some linking errors
ld.lld: error: undefined symbol: __tgt_target_data_update_mapper
>>> referenced by test_ompBLAS.cpp
>>> CMakeFiles/test_omptarget_blas.dir/test_ompBLAS.cpp.o:(void qmcplusplus::test_gemv<float>(int, int, char))
This is on my desktop machine with the aomp 14.0 compiler and rocm 4.5.0. I did compile the aomp offload version successfully a few weeks ago.
Nevermind ... I was confused about what this support was for. It's not for the offload version.
Had no luck with gcc
cmake -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DENABLE_CUDA=ON -DQMC_CUDA2HIP=ON -DQMC_MPI=OFF ..
it tries to link programs with clang instead of gcc. Got messages like
[100%] Linking HIP executable test_CUDA
which is expected host code.
Had no luck with gcc
cmake -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DENABLE_CUDA=ON -DQMC_CUDA2HIP=ON -DQMC_MPI=OFF ..
it tries to link programs with clang instead of gcc. Got messages like
[100%] Linking HIP executable test_CUDA
which is expected host code.
This test comment seems to suggest that the link language for mixed targets should always be HIP 😕 https://gitlab.kitware.com/cmake/cmake/-/blob/master/Tests/HIP/MixedLanguage/CMakeLists.txt
HIP is only part of the program. Having HIP compiler (and its flags) to take over linking is just so wrong.
HIP is only part of the program. Having HIP compiler (and its flags) to take over linking is just so wrong.
HIP, CXX, and CUDA have values of 90, 30, and 15. https://gitlab.kitware.com/cmake/cmake/-/blob/a6fa3fa136c291c36aefbc79b62995a63bf9107b/Modules/CMakeHIPCompiler.cmake.in#L27
Unfortunately, I found the current ROCm 4.5 release is bad in many ways for adopting HIP as a language. I have to park this PR for the moment. https://github.com/RadeonOpenCompute/ROCm/issues/1259#issuecomment-991753189
@ye-luo Thank you for reviewing this. I'll move onto other issues while we wait for a more usable ROCm release.
temporarily close or mark this [WIP] it's clutter up the PR list.
Test this please
The latest clang 16 based AOMP 16.0-2 doesn't accept a clang_rt.builtins from rocm shipped clang 15. It seems that the cmake using hip as a language in rocm 5.3 picks up the clang_rt.builtins from the compiler rather than the rocm directory. This is actually the right thing to do because clang_rt.builtins is a compiler built-in library. So I think it is time to turn on HIP as a language in CMake. Thus @williamfgc requires an update to cmake 3.21 on the olcf CI.
Test this please
Test this please
Test this please
@quantumsteve @markdewing I think it is good now. We can use HIP as a cmake language. Since I did many changes. I should not approval my own changes. So please do another round of review.
@prckent please review the documentation update. crusher recipe has been updated to use rocm 5.3.0
@quantumsteve @markdewing I think it is good now. We can use HIP as a cmake language. Since I did many changes. I should not approval my own changes. So please do another round of review.
LGTM, though GitHub won't let me approve a PR I'm the original author of.
Confirmed good with rocm 5.4.0 as well.
Test this please