HIP [Debian official packaging] How is ROCm LLVM fork still needed?

Hello, there is a team dedicated to packaging ROCm natively in Debian. As far as I can tell, progress is good: the lower third of the stack (up to rocr-runtime and comgr, just before hipamd) is packaged without much trouble and will soon be published. So far this all works directly with upstream LLVM 13, without the ROCm llvm-project fork.

The ROCm CMake build system is still littered with HIP_CLANG_PATH and other flags. What still prevents the stack from using upstream Clang 13 and its AMDGPU backend?

Dec 29 '21 01:12 Maxzor

Quoting AMD fellow Cordell Bloor in https://lists.debian.org/debian-ai/2021/05/msg00034.html :

about the progress on upstreaming into LLVM. It seems that the core features required for building ROCm are now all in LLVM trunk. The main features specific to the AMD fork are [...]

openmp offloading,

__hip_atomic builtin functions,

the -parallel-jobs flag,

the option to delegate to another compiler for further CPU optimizations.

[...]

Most new development is upstreamed quickly.

I would appreciate an ELI5-style overview of the functionality listed above that you develop outside LLVM. I am also keen to know whether there is a realistic, short-term prospect of unbundling the AMD-flavored LLVM from the ROCm packages.

Dec 30 '21 00:12 Maxzor

I might be diving into details too early, before a general conversation has happened, but here is a naive set of patches for ROCm-CompilerSupport (against the amd-stg-open branch) that already lets it compile with upstream LLVM.

This is working from the bottom of the stack up, and I have not even reached hipamd yet. The two repositories have diverged considerably. Most of the 5k commits are merge remnants, but more than 900 files still differ, even if many of those are tests or OpenMP.

maxzor@it:~/rocm/llvm-project$ git remote -v
origin	https://github.com/RadeonOpenCompute/llvm-project.git (fetch)
origin	https://github.com/RadeonOpenCompute/llvm-project.git (push)
upstream	https://github.com/llvm/llvm-project (fetch)
upstream	https://github.com/llvm/llvm-project (push)

maxzor@it:~/rocm/llvm-project$ git diff --compact-summary upstream/main origin/amd-stg-open | tail -1
 991 files changed, 53490 insertions(+), 14653 deletions(-)

maxzor@it:~/rocm/llvm-project$ git diff --dirstat=files,10,cumulative upstream/main origin/amd-stg-open
  10.7% clang/include/clang/
  28.6% clang/
  12.0% llvm/lib/
  10.2% llvm/test/Transforms/
  23.7% llvm/test/
  40.9% llvm/
  10.3% openmp/libomptarget/
  20.7% openmp/
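
The summary line above can be turned into numbers, which helps when tracking the divergence over time. The following is a minimal sketch that is not part of the original post: the helper name `diffstat_fields` is hypothetical, and it only assumes the usual `N files changed, N insertions(+), N deletions(-)` layout that git prints.

```shell
#!/bin/sh
# Hypothetical helper: parse a git diffstat summary line read from stdin
# and print the file count, added lines, removed lines and net growth.
diffstat_fields() {
    awk '
    /changed/ {
        files = 0; added = 0; removed = 0
        for (i = 2; i <= NF; i++) {
            if ($i == "file" || $i == "files") files   = $(i - 1)
            if (index($i, "insertion") == 1)   added   = $(i - 1)
            if (index($i, "deletion") == 1)    removed = $(i - 1)
        }
        printf "files=%d added=%d removed=%d net=%d\n", files, added, removed, added - removed
    }'
}

# Summary line copied from the post above.
echo " 991 files changed, 53490 insertions(+), 14653 deletions(-)" | diffstat_fields
# prints: files=991 added=53490 removed=14653 net=38837
```

In a checkout with both remotes configured as shown, the same function could be fed from `git diff --compact-summary upstream/main origin/amd-stg-open | tail -1`.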

I will try to keep reporting while building the upper blocks, and again would enjoy having this discussion.

Dec 30 '21 05:12 Maxzor

As expected, Cordell's script compiles fine with the AMD fork. But trying to switch from amd-llvm to upstream-llvm-13 first is a bit more involved. Will report soon.

Dec 30 '21 06:12 Maxzor

A few hurdles already: https://github.com/ROCm-Developer-Tools/HIP/pull/2451 https://github.com/ROCmSoftwarePlatform/Tensile/issues/1455

But good news: with some tweaking, all rocrand tests pass on a Radeon VII (stupid build sandbox here).

Dec 31 '21 08:12 Maxzor

Good news!

The last issue I had was that the llvm-13 packages in debian:testing were not yet complete. I updated the repo used to answer the question of this thread: https://salsa.debian.org/maxzor/rocm-builder-experimental/-/tree/main/ From what I have seen, ROCm compiles with vanilla llvm-13 on Debian!

The tweaks are detailed in this thread. From memory, they come down to:

a minor comgr patch
a minor hipcc patch
disabling the -parallel-jobs clang flag for device code compilation (the flag is patched in by AMD and not yet upstreamed), so device compilation takes much longer.

My benchmark results for rocTHRUST are:

AMD/Vanilla STL: 98.54%
AMD/Vanilla Thrust: 111.38%

So a rough 10% perf advantage for the AMD fork.

Note that 2 tests out of ~100 do not pass with vanilla LLVM. See the directory "results" in the above "rocm-builder-experimental" repo for full details.
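
To make the arithmetic behind the "rough 10%" explicit, here is a small sketch that converts an AMD/Vanilla ratio into a signed difference from parity. The helper name `relative_delta` is hypothetical, and the post does not say which metric the ratios are built from; this only assumes, following the author's reading, that a ratio above 100% favors the AMD fork.

```shell
#!/bin/sh
# Hypothetical helper: print how far a ratio (given in percent) is from 100%.
# Assumption: the ratio is AMD/Vanilla and higher means the AMD fork did better.
relative_delta() {
    awk -v ratio="$1" 'BEGIN { printf "%+.2f%%\n", ratio - 100 }'
}

relative_delta 98.54    # STL ratio from the post
relative_delta 111.38   # Thrust ratio from the post
```

Under that assumption, the Thrust figure is about 11% in favor of the fork and the STL figure is about 1.5% against it.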

Jan 01 '22 16:01 Maxzor

@yxsamliu might be able to answer your LLVM questions. Most of my knowledge of the ROCm LLVM fork comes from him.

Jan 04 '22 15:01 cgmb

openmp offloading: @saiislam is working on upstreaming it. Do we have an ETA on it? Thanks.

__hip_atomic builtin functions: upstreamed.

the -parallel-jobs flag: I have a Phabricator review, https://reviews.llvm.org/D69582, but it is blocked on https://reviews.llvm.org/D52193.

the option to delegate to another compiler for further CPU optimizations: will not be upstreamed, since it depends on closed-source code for CPU optimizations.

Jan 04 '22 15:01 yxsamliu

@ronlieb, please have a look at the query.

Jan 04 '22 16:01 saiislam

We do not have an ETA for upstreaming OpenMP offloading. Rather, we are in the process of upstreaming various features as they are developed, as well as older functionality. From my perspective, LLVM 13 was barely functional for AMDGPU OpenMP offloading, as it was the first release to attempt to support it (a good first step). The upcoming LLVM 14 should have increased robustness for AMDGPU OpenMP offloading. However, within AMD we have used, and continue to use, our amd-stg-open branch of LLVM to produce a ROCm compiler product for a variety of reasons, including content not yet upstreamed and extensive qualification testing for our ROCm product.

Jan 04 '22 16:01 ronlieb

For the record, the results of the HIP tests here are:

99% tests passed, 3 tests failed out of 403

Total Test time (real) = 3933.38 sec

The following tests FAILED:
103 - directed_tests/ipc/hipMultiProcIpcMem.tst (Timeout)
132 - directed_tests/runtimeApi/cooperativeGrps/cooperative_streams_half_capacity.tst (Subprocess aborted)
197 - directed_tests/runtimeApi/memory/hipIpcMemAccessTest.tst (Timeout)

===========================

94% tests passed, 1 tests failed out of 18

Total Test time (real) = 230.02 sec

The following tests FAILED:
	418 - performance_tests/module/hipPerfModuleLoad.tst (Failed)
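
For comparing runs like the two above between the fork and vanilla LLVM, the failing test names can be pulled out of a saved log. This is only a sketch: the helper name `failed_tests` is hypothetical, and it assumes the log has the same "The following tests FAILED:" layout as the output quoted here.

```shell
#!/bin/sh
# Hypothetical helper: read a CTest log on stdin and print the names of the
# tests listed under "The following tests FAILED:", one per line.
failed_tests() {
    awk '
    /The following tests FAILED:/ { in_list = 1; next }
    # Entries look like "<number> - <name> (<reason>)".
    in_list && $1 ~ /^[0-9]+$/ && $2 == "-" { print $3; next }
    # Any other non-empty line ends the list.
    in_list && NF > 0 { in_list = 0 }
    '
}

# Failure list copied from the first run in this post.
failed_tests <<'EOF'
The following tests FAILED:
103 - directed_tests/ipc/hipMultiProcIpcMem.tst (Timeout)
132 - directed_tests/runtimeApi/cooperativeGrps/cooperative_streams_half_capacity.tst (Subprocess aborted)
197 - directed_tests/runtimeApi/memory/hipIpcMemAccessTest.tst (Timeout)
EOF
```

Sorting the output of two such logs and diffing them shows which failures are specific to one compiler. When the build tree is still around, CTest's own `--rerun-failed` and `--output-on-failure` options are usually the simpler route.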

Jan 8th: currently fiddling with the dedicated hip-testsuite and hip-examples repositories.

Jan 07 '22 17:01 Maxzor

@Maxzor Can you please test with latest ROCm 6.0.2 (HIP 6.0.32831)? If resolved, please close ticket. Thanks!

Apr 03 '24 14:04 ppanchad-amd

I think this question has been answered. The Debian packaging for ROCm is fairly mature now, and it does not use the ROCm LLVM fork.

Apr 03 '24 16:04 cgmb
