AMDGPU.jl
AMDGPU.jl on rolling release distros (Arch): Libraries unavailable
Hello, just a quick report and temporary solution. I was getting this issue on Arch:
julia> using AMDGPU
┌ Warning: HSA runtime is unavailable, compilation and runtime functionality will be disabled.
│ Reason: Could not find `libhsa-runtime64` v1 library
└ @ AMDGPU ~/.julia/packages/AMDGPU/FXTo5/src/AMDGPU.jl:181
┌ Warning: LLD is unavailable, compilation functionality will be disabled.
│ Reason: unknown
└ @ AMDGPU ~/.julia/packages/AMDGPU/FXTo5/src/AMDGPU.jl:193
┌ Warning: Device libraries are unavailable, device intrinsics will be disabled.
│ Reason: unknown
└ @ AMDGPU ~/.julia/packages/AMDGPU/FXTo5/src/AMDGPU.jl:205
┌ Warning: HIP library is unavailable, HIP integration will be disabled.
│ Reason: unknown
└ @ AMDGPU ~/.julia/packages/AMDGPU/FXTo5/src/AMDGPU.jl:221
I think this is because the JLLs are still being built for Julia 1.9 and the latest AMDGPU package, so AMDGPU.jl falls back to my system-installed copies of these libraries, which are newer than the versions AMDGPU.jl supports. The quick fix is to downgrade the system ROCm packages to 5.4.3.
On Arch, this can be done with the downgrade package like so:
sudo downgrade rocsparse
sudo downgrade rocblas
sudo downgrade rocsolver
sudo downgrade rocfft
sudo downgrade rocrand
sudo downgrade miopen-hip
sudo downgrade hip-runtime-amd
sudo downgrade rocm-device-libs
sudo downgrade hsa-rocr
sudo downgrade hsakmt-roct
There is definitely a better script, but this one works. Each command opens an interactive prompt asking which version of the library to install; pick 5.4.3 every time. It's up to you whether you keep the setting that holds these packages back, so that pacman -Syu doesn't upgrade them again by default.
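The ten commands above can be collapsed into a loop. This is only a sketch: the names ROCM_PKGS, downgrade_rocm and DRY_RUN are made up for illustration and are not part of the downgrade tool. By default it only prints the commands it would run, and each real call is assumed to stay interactive, so you still pick 5.4.3 by hand for every package.

```shell
#!/bin/sh
# ROCm-related packages taken from the list in this report.
ROCM_PKGS="rocsparse rocblas rocsolver rocfft rocrand miopen-hip hip-runtime-amd rocm-device-libs hsa-rocr hsakmt-roct"

# Hypothetical helper: print each command by default,
# run it for real only when DRY_RUN=0 is set.
downgrade_rocm() {
    for pkg in $ROCM_PKGS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sudo downgrade $pkg"
        else
            sudo downgrade "$pkg"
        fi
    done
}
```

Call downgrade_rocm to preview the commands, then DRY_RUN=0 downgrade_rocm to actually run them.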
I decided to create an issue instead of updating the docs because:
- I figure people will just google the error message
- I think once the JLLs are built, we won't need to do this anymore.
Also: if anyone has a better downgrade script, feel free to post it.
If the JLLs are too old, this probably needs to be addressed in Yggdrasil?
Hi, is it safe to downgrade those packages? I'm on Manjaro and my ROCm version is currently 5.6.1. When I load the package with using AMDGPU, I get the following error:
[ Info: Precompiling AMDGPU [21141c5a-9bdb-4563-92ae-f87d6854732e]
julia: /usr/src/debug/hip-runtime-amd/clr-rocm-5.6.1/rocclr/os/os_posix.cpp:310: static void amd::Os::currentStackInfo(unsigned char**, size_t*): Assertion 'Os::currentStackPtr() >= *base - *size && Os::currentStackPtr() < *base && "just checking"' failed.
[4501] signal (6.-6): Aborted
in expression starting at /home/fra/.julia/packages/AMDGPU/bQD5E/src/AMDGPU.jl:61
unknown function (ip: 0x7f4a9168e83c)
raise at /usr/bin/../lib/libc.so.6 (unknown line)
abort at /usr/bin/../lib/libc.so.6 (unknown line)
unknown function (ip: 0x7f4a916263db)
__assert_fail at /usr/bin/../lib/libc.so.6 (unknown line)
unknown function (ip: 0x7f49e64ed8f4)
unknown function (ip: 0x7f49e64fb3e7)
unknown function (ip: 0x7f49e62c6421)
unknown function (ip: 0x7f4a9198e0fd)
unknown function (ip: 0x7f4a9198e1eb)
_dl_catch_exception at /lib64/ld-linux-x86-64.so.2 (unknown line)
unknown function (ip: 0x7f4a91994ad5)
_dl_catch_exception at /lib64/ld-linux-x86-64.so.2 (unknown line)
unknown function (ip: 0x7f4a91994e4b)
unknown function (ip: 0x7f4a916889eb)
_dl_catch_exception at /lib64/ld-linux-x86-64.so.2 (unknown line)
unknown function (ip: 0x7f4a9198a602)
unknown function (ip: 0x7f4a916884f6)
dlopen at /usr/bin/../lib/libc.so.6 (unknown line)
ijl_load_dynamic_library at /usr/bin/../lib/julia/libjulia-internal.so.1 (unknown line)
unknown function (ip: 0x7f4a77914556)
dlopen at ./libdl.jl:116 [inlined]
find_library at ./libdl.jl:206
find_library at ./libdl.jl:214 [inlined]
find_library at ./libdl.jl:214 [inlined]
find_rocm_library at /home/fra/.julia/packages/AMDGPU/bQD5E/src/discovery_utils.jl:64
#find_system_library!#9 at /home/fra/.julia/packages/AMDGPU/bQD5E/src/rocm_discovery.jl:52
find_system_library! at /home/fra/.julia/packages/AMDGPU/bQD5E/src/rocm_discovery.jl:49 [inlined]
macro expansion at /home/fra/.julia/packages/AMDGPU/bQD5E/src/rocm_discovery.jl:170 [inlined]
#10 at ./task.jl:514
unknown function (ip: 0x7f4a888c9161)
unknown function (ip: 0x7f4a90c6b17e)
Allocations: 1398669 (Pool: 1397552; Big: 1117); GC: 2
ERROR: Failed to precompile AMDGPU [21141c5a-9bdb-4563-92ae-f87d6854732e] to "/home/fra/.julia/compiled/v1.9/AMDGPU/jl_OmQaOz".
Do you think this is related to the ROCm version and if so do you think I can safely downgrade? Thanks!
Latest supported ROCm version is 5.4 for now.
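To check an installed package against that limit, a rough sketch along these lines may help. The function name rocm_supported is hypothetical. It only checks the upper bound of 5.4 mentioned above (it says nothing about whether older releases work), and it assumes a plain version string such as 5.6.1-1 with no epoch prefix.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version "$1" (e.g. 5.4.3-1) is 5.4.x or older.
rocm_supported() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text up to the next dot
    [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -le 4 ]; }
}
```

On Arch or Manjaro the installed version can be fed in with something like rocm_supported "$(pacman -Q hip-runtime-amd | awk '{print $2}')" && echo ok || echo "too new".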