HIP
[Debian official packaging] How is ROCm LLVM fork still needed?
Hello, There is a team dedicated to packaging ROCm natively in Debian. As far as I can tell, our progress is pretty good and the lower third of the stack, up until rocr-runtime and comgr, before hipamd, is packaged without much issue and soon to be published. This is all working so far directly with upstream LLVM 13, without the ROCm llvm-project fork.
The ROCm CMake build system is still littered with HIP_CLANG_PATH and other flags. What is still preventing the stack from using upstream Clang 13 and its AMDGPU backend?
Quoting AMD fellow Cordell Bloor in https://lists.debian.org/debian-ai/2021/05/msg00034.html :
about the progress on upstreaming into LLVM. It seems that the core features required for building ROCm are now all in LLVM trunk. The main features specific to the AMD fork are [...]
- openmp offloading,
- __hip_atomic builtin functions,
- the -parallel-jobs flag,
- the option to delegate to another compiler for further CPU optimizations.
[...]
Most new development is upstreamed quickly.
I would appreciate an ELI5-style overview of the functionality listed above that you develop outside LLVM. I am also keen to know whether there is a realistic, short-term prospect of unbundling the AMD-flavored LLVM from the ROCm packages.
I might be diving into details too early, before a general conversation has happened, but here is a naive set of patches for ROCm-CompilerSupport (against the amd-stg-open branch) that already lets it compile with upstream LLVM.
This is working from the bottom of the stack up, and I have not even reached hipamd yet. The two repositories have diverged quite a lot. Most of the 5k commits are merge remnants, but over 900 files have still changed, even if many of those concern tests or OpenMP.
maxzor@it:~/rocm/llvm-project$ git remote -v
origin https://github.com/RadeonOpenCompute/llvm-project.git (fetch)
origin https://github.com/RadeonOpenCompute/llvm-project.git (push)
upstream https://github.com/llvm/llvm-project (fetch)
upstream https://github.com/llvm/llvm-project (push)
maxzor@it:~/rocm/llvm-project$ git diff --compact-summary upstream/main origin/amd-stg-open | tail -1
991 files changed, 53490 insertions(+), 14653 deletions(-)
maxzor@it:~/rocm/llvm-project$ git diff --dirstat=files,10,cumulative upstream/main origin/amd-stg-open
10.7% clang/include/clang/
28.6% clang/
12.0% llvm/lib/
10.2% llvm/test/Transforms/
23.7% llvm/test/
40.9% llvm/
10.3% openmp/libomptarget/
20.7% openmp/
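The one-line summary printed by the `git diff --compact-summary ... | tail -1` command above can be turned into numbers, for instance to track how the divergence evolves over time. The following is a minimal Python sketch and not part of the original workflow; the `DiffSummary` type and the `parse_diff_summary` helper are hypothetical names introduced here for illustration.

```python
import re
from typing import NamedTuple


class DiffSummary(NamedTuple):
    """Counts taken from the last line of a git diff summary."""
    files: int
    insertions: int
    deletions: int


# Matches e.g. "991 files changed, 53490 insertions(+), 14653 deletions(-)".
# git omits the insertions or deletions part when its count is zero,
# so both groups are optional.
_SUMMARY_RE = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
)


def parse_diff_summary(line: str) -> DiffSummary:
    """Parse the summary line; raise ValueError if it does not match."""
    match = _SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a git diff summary line: {line!r}")
    files, insertions, deletions = match.groups()
    return DiffSummary(int(files), int(insertions or 0), int(deletions or 0))


# The summary line reported in this thread.
summary = parse_diff_summary(
    " 991 files changed, 53490 insertions(+), 14653 deletions(-)"
)
net_growth = summary.insertions - summary.deletions
print(summary.files, net_growth)  # prints "991 38837"
```

Here the fork carries a net 38837 additional lines relative to upstream main across those 991 files.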
I will try to keep reporting while building the upper blocks, and again would enjoy having this discussion.
As expected, Cordell's script compiles fine with the AMD fork. But trying to switch from amd-llvm to upstream-llvm-13 first is a bit more involved. Will report soon.
A few hurdles already: https://github.com/ROCm-Developer-Tools/HIP/pull/2451 https://github.com/ROCmSoftwarePlatform/Tensile/issues/1455
But good news: with some tweaking, all rocrand tests pass on a Radeon VII (stupid build sandbox here).
Good news!
The last issue I had was that the llvm-13 packages in debian:testing were not yet complete. I updated the repo used to answer the question of this thread: https://salsa.debian.org/maxzor/rocm-builder-experimental/-/tree/main/ From what I've seen, ROCm compiles with vanilla llvm-13 on Debian!
The tweaks are detailed in this thread; from memory, they come down to:
- minor comgr patch
- minor hipcc patch
- disabling the -parallel-jobs clang flag, which AMD patched in and has not yet upstreamed; without it, device code compilation takes much longer.
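To illustrate the last tweak in this list, here is a minimal Python sketch of dropping the flag from a compiler command line before it is handed to vanilla clang. It assumes the flag is spelled `-parallel-jobs=N` (or a bare `-parallel-jobs`); the helper name `strip_parallel_jobs` and the sample command are hypothetical, and this is not the actual hipcc patch.

```python
from typing import List


def strip_parallel_jobs(args: List[str]) -> List[str]:
    """Return a copy of args without any -parallel-jobs flag.

    Assumption: the AMD-specific flag appears either as "-parallel-jobs"
    or as "-parallel-jobs=N". Every other argument is kept, in order.
    """
    return [
        arg
        for arg in args
        if arg != "-parallel-jobs" and not arg.startswith("-parallel-jobs=")
    ]


# Hypothetical device-code compile command, for illustration only.
command = ["clang++", "-x", "hip", "-parallel-jobs=4", "-c", "kernel.hip"]
cleaned = strip_parallel_jobs(command)
print(cleaned)  # prints ['clang++', '-x', 'hip', '-c', 'kernel.hip']
```

Filtering the argument list leaves the rest of the command untouched, so the same build scripts can keep working with either compiler.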
My benchmark results for rocTHRUST are:
AMD/Vanilla STL: 98.54%
AMD/Vanilla Thrust: 111.38%
So a rough 10% perf advantage for the AMD fork.
To be noted, 2 tests out of ~100 do not pass with vanilla LLVM. See the directory "results" in the above "rocm-builder-experimental" repo for full details.
@yxsamliu might be able to answer your LLVM questions. Most of my knowledge of the ROCm LLVM fork comes from him.
openmp offloading: @saiislam is working on upstreaming it. Do we have an ETA on it? Thanks.
__hip_atomic builtin functions: upstreamed.
the -parallel-jobs flag: I have a Phabricator review, https://reviews.llvm.org/D69582, but it is blocked on https://reviews.llvm.org/D52193.
the option to delegate to another compiler for further CPU optimizations: will not be upstreamed, since it depends on closed-source code for CPU optimizations.
@ronlieb, please have a look at the query.
We do not have an ETA for upstreaming OpenMP offloading. Rather, we are in the process of upstreaming various features as they are developed, as well as older functionality. From my perspective, LLVM 13 was barely functional for AMDGPU OpenMP offloading, as it was the first release to attempt to support it (a good first step). The upcoming LLVM 14 should be more robust for AMDGPU OpenMP offloading. However, within AMD we have used, and continue to use, our amd-stg-open branch of LLVM to produce a ROCm compiler product, for a variety of reasons including content not yet upstreamed and extensive qualification testing for our ROCm product.
For the record, the results of the HIP tests here are:
99% tests passed, 3 tests failed out of 403
Total Test time (real) = 3933.38 sec
The following tests FAILED:
103 - directed_tests/ipc/hipMultiProcIpcMem.tst (Timeout)
132 - directed_tests/runtimeApi/cooperativeGrps/cooperative_streams_half_capacity.tst (Subprocess aborted)
197 - directed_tests/runtimeApi/memory/hipIpcMemAccessTest.tst (Timeout)
===========================
94% tests passed, 1 tests failed out of 18
Total Test time (real) = 230.02 sec
The following tests FAILED:
418 - performance_tests/module/hipPerfModuleLoad.tst (Failed)
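The percentages in the two summaries above follow from the failure counts. As a quick check, here is a minimal Python sketch that recomputes them, assuming the percentage is simply the pass ratio rounded to the nearest integer (the test runner's exact rounding rule is not stated in this thread); `pass_rate` is a hypothetical helper name.

```python
def pass_rate(total: int, failed: int) -> int:
    """Return the percentage of passing tests, rounded to the nearest integer."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= failed <= total:
        raise ValueError("failed must be between 0 and total")
    return round(100 * (total - failed) / total)


# Figures reported in this thread for the two test runs.
directed = pass_rate(total=403, failed=3)
performance = pass_rate(total=18, failed=1)
print(directed, performance)  # prints "99 94"
```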
Jan 8th: currently fiddling with the dedicated hip-testsuite and hip-examples repositories.
@Maxzor Can you please test with latest ROCm 6.0.2 (HIP 6.0.32831)? If resolved, please close ticket. Thanks!
I think this question has been answered. The Debian packaging for ROCm is fairly mature now, and it does not use the ROCm LLVM fork.