Cannot set up AMDGPU.jl

Open wsmoses opened this issue 1 year ago • 4 comments

I got access to an AMD machine to help debug the Enzyme rules. Unfortunately, I can't get even a regular setup to work.

julia --project
julia> using AMDGPU
┌ Warning: HSA runtime is unavailable, compilation and runtime functionality will be disabled.
└ @ AMDGPU ~/git/AMDGPU.jl/src/AMDGPU.jl:205
ERROR: InitError: KeyError: key "JULIA_AMDGPU_CORE_MUST_LOAD" not found
Stacktrace:
  [1] (::Base.var"#717#718")(k::String)
    @ Base ./env.jl:156
  [2] access_env
    @ ./env.jl:61 [inlined]
  [3] getindex
    @ ./env.jl:156 [inlined]
  [4] __init__()
    @ AMDGPU ~/git/AMDGPU.jl/src/AMDGPU.jl:208
  [5] run_module_init(mod::Module, i::Int64)
    @ Base ./loading.jl:1134
  [6] register_restored_modules(sv::Core.SimpleVector, pkg::Base.PkgId, path::String)
    @ Base ./loading.jl:1122
  [7] _include_from_serialized(pkg::Base.PkgId, path::String, ocachepath::String, depmods::Vector{Any})
    @ Base ./loading.jl:1067
  [8] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt128)
    @ Base ./loading.jl:1581
  [9] _require(pkg::Base.PkgId, env::String)
    @ Base ./loading.jl:1938
 [10] __require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base ./loading.jl:1812
 [11] #invoke_in_world#3
    @ ./essentials.jl:926 [inlined]
 [12] invoke_in_world
    @ ./essentials.jl:923 [inlined]
 [13] _require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base ./loading.jl:1803
 [14] macro expansion
    @ ./loading.jl:1790 [inlined]
 [15] macro expansion
    @ ./lock.jl:267 [inlined]
 [16] __require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1753
 [17] #invoke_in_world#3
    @ ./essentials.jl:926 [inlined]
 [18] invoke_in_world
    @ ./essentials.jl:923 [inlined]
 [19] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1746
during initialization of module AMDGPU
wmoses@hydra:~/git/AMDGPU.jl$  JULIA_AMDGPU_CORE_MUST_LOAD=1  julia --project test/enzyme_tests.jl 
┌ Warning: HSA runtime is unavailable, compilation and runtime functionality will be disabled.
└ @ AMDGPU ~/git/AMDGPU.jl/src/AMDGPU.jl:205
Diagnostics:
-- permissions
crw-rw---- 1 root render 234, 0 Sep 26 16:02 /dev/kfd
total 0
drwxr-xr-x   3 root root        120 Sep 26 16:02 .
drwxr-xr-x  20 root root       4.9K Sep 26 16:02 ..
drwxr-xr-x   2 root root        100 Sep 26 16:02 by-path
crw-rw----+  1 root video  226,   1 Sep 26 16:02 card1
crw-rw----+  1 root video  226,   2 Sep 26 16:02 card2
crw-rw----+  1 root render 226, 128 Sep 26 16:02 renderD128
total 0
drwxr-xr-x 2 root root 100 Sep 26 16:02 .
drwxr-xr-x 3 root root 120 Sep 26 16:02 ..
lrwxrwxrwx 1 root root   8 Sep 26 16:02 pci-0000:e3:00.0-card -> ../card2
lrwxrwxrwx 1 root root  13 Sep 26 16:02 pci-0000:e3:00.0-render -> ../renderD128
lrwxrwxrwx 1 root root   8 Sep 26 16:02 pci-0000:e7:00.0-card -> ../card1
crw-rw----+ 1 root video 226, 1 Sep 26 16:02 /dev/dri/card1
crw-rw----+ 1 root video 226, 2 Sep 26 16:02 /dev/dri/card2
crw-rw----+ 1 root render 226, 128 Sep 26 16:02 /dev/dri/renderD128
uid=1000(wmoses) gid=1000 groups=1000,4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),100(users),114(lpadmin),1467337(sipb),3097103(halide),34299987(llvm),40290977(JuliaLabs),76709007(QuantumBFS),182163809(EnzymeAD),360224085(PRONTOLab)
ERROR: LoadError: InitError: Failed to load HSA runtime, but HSA must load, bailing out
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] __init__()
    @ AMDGPU ~/git/AMDGPU.jl/src/AMDGPU.jl:210
  [3] run_module_init(mod::Module, i::Int64)
    @ Base ./loading.jl:1134
  [4] register_restored_modules(sv::Core.SimpleVector, pkg::Base.PkgId, path::String)
    @ Base ./loading.jl:1122
  [5] _include_from_serialized(pkg::Base.PkgId, path::String, ocachepath::String, depmods::Vector{Any})
    @ Base ./loading.jl:1067
  [6] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt128)
    @ Base ./loading.jl:1581
  [7] _require(pkg::Base.PkgId, env::String)
    @ Base ./loading.jl:1938
  [8] __require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base ./loading.jl:1812
  [9] #invoke_in_world#3
    @ ./essentials.jl:926 [inlined]
 [10] invoke_in_world
    @ ./essentials.jl:923 [inlined]
 [11] _require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base ./loading.jl:1803
 [12] macro expansion
    @ ./loading.jl:1790 [inlined]
 [13] macro expansion
    @ ./lock.jl:267 [inlined]
 [14] __require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1753
 [15] #invoke_in_world#3
    @ ./essentials.jl:926 [inlined]
 [16] invoke_in_world
    @ ./essentials.jl:923 [inlined]
 [17] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1746
during initialization of module AMDGPU
in expression starting at /home/wmoses/git/AMDGPU.jl/test/enzyme_tests.jl:1

wsmoses avatar Sep 30 '24 18:09 wsmoses

What's your ROCm install, and on which OS are you?

luraess avatar Sep 30 '24 18:09 luraess

If you are using the pxl-th/enzyme branch, I've rebased it on master, which should get rid of the first error: KeyError: key "JULIA_AMDGPU_CORE_MUST_LOAD" not found
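
For context, the first stack trace goes through `getindex` in `env.jl`, which is what indexing `ENV` with an unset variable looks like. A minimal sketch of the difference between indexing and a defaulted lookup follows. The helper `must_load` is hypothetical and is not claimed to be how AMDGPU.jl reads the flag; only the variable name comes from the error message above.

```julia
# Variable name taken from the error message; everything else is illustrative.
const KEY = "JULIA_AMDGPU_CORE_MUST_LOAD"

# Hypothetical helper: treat the flag as off unless it is set to "1".
# `get` with a default never throws, unlike `env[KEY]`.
must_load(env = ENV) = get(env, KEY, "0") == "1"

# A stand-in for an environment where the variable is not set.
unset = Dict{String,String}()

# Direct indexing raises a KeyError when the key is missing.
threw_keyerror = try
    unset[KEY]
    false
catch err
    err isa KeyError
end

println("indexing threw KeyError: ", threw_keyerror)
println("must_load with variable unset: ", must_load(unset))
println("must_load with variable set to 1: ", must_load(Dict(KEY => "1")))
```

Passing the environment as an argument is only there so the sketch can be tried without changing the real `ENV`.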

Does rocminfo work for you?
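
One thing worth checking alongside `rocminfo`: the diagnostics above list `/dev/kfd` as `root:render` with mode `crw-rw----`, and the `groups` output does not show `render` or `video` by name. That may or may not be the cause here. A rough sketch for testing whether the current user can open the device nodes read/write, assuming the device paths from the diagnostics; `can_open_rw` is a hypothetical helper and not part of AMDGPU.jl:

```julia
# Hypothetical helper: returns true if `path` can be opened read/write
# by the current user, false on any error (missing file, permissions, ...).
function can_open_rw(path::AbstractString)
    try
        close(open(path, "r+"))  # "r+" opens for reading and writing without creating the file
        return true
    catch
        return false
    end
end

# Device paths copied from the diagnostics in the original post.
for dev in ("/dev/kfd", "/dev/dri/renderD128")
    println(dev, " => ", can_open_rw(dev) ? "ok" : "cannot open read/write")
end
```

If either device cannot be opened, comparing the output of `id` with the group that owns the device is a reasonable next step.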

pxl-th avatar Sep 30 '24 18:09 pxl-th

I have the same problem, and yes, rocminfo does work for me.

dvasiliu avatar Mar 07 '25 21:03 dvasiliu

Can this be closed?

luraess avatar Sep 05 '25 12:09 luraess