AMDGPU.jl

ROCclr segfault when running Julia with threads

Open leios opened this issue 7 months ago • 18 comments

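Nothing works when Julia is started with multiple threads. Single-threaded, AMDGPU.zeros(10) runs fine; with julia -t 2 the same call aborts on an assertion inside ROCclr: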

[leios@noema Fable.jl]$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.3 (2025-01-21)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using AMDGPU

julia> AMDGPU.zeros(10)
10-element ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

julia> 
[leios@noema Fable.jl]$ julia -t 2
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.3 (2025-01-21)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using AMDGPU

julia> AMDGPU.zeros(10)
10-element ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}:
julia: /usr/src/debug/hip-runtime/clr-rocm-6.2.4/rocclr/os/os_posix.cpp:321: static void amd::Os::currentStackInfo(unsigned char**, size_t*): Assertion `Os::currentStackPtr() >= *base - *size && Os::currentStackPtr() < *base && "just checking"' failed.

[7772] signal 6 (-6): Aborted
in expression starting at none:0
unknown function (ip: 0x7087c8970624)
gsignal at /usr/lib/libc.so.6 (unknown line)
abort at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x7087c88fe4ea)
unknown function (ip: 0x70874990c108)
unknown function (ip: 0x708749918dc7)
unknown function (ip: 0x708749706285)
macro expansion at /home/leios/.julia/packages/GPUToolbox/cZlg7/src/ccalls.jl:143 [inlined]
macro expansion at /home/leios/.julia/packages/AMDGPU/STpZC/src/utils.jl:122 [inlined]
hipGetDeviceCount at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/libhip.jl:42 [inlined]
ndevices at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/device.jl:103
TaskLocalState at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:11 [inlined]
TaskLocalState at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:11
TaskLocalState at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:11 [inlined]
#25 at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:27 [inlined]
get! at ./iddict.jl:171
task_local_state! at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:26
prepare_state at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:193 [inlined]
hipStreamQuery at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/libhip.jl:113 [inlined]
#11 at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/stream.jl:114
unknown function (ip: 0x7087bc5f29ff)
jl_apply at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
start_task at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/task.c:1202
Allocations: 24176043 (Pool: 24175407; Big: 636); GC: 17
Aborted (core dumped)

leios • May 25 '25 13:05

https://github.com/ROCm/clr/blob/204d35d16ef5c2c1ea1a4bb25442908a306c857a/rocclr/os/os_posix.cpp#L301-L323
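The assertion in the crash log requires the current stack pointer to lie in the half-open range [base - size, base). Below is a minimal sketch of that predicate in Julia. The helper name `stack_ptr_in_bounds` and all addresses are hypothetical and chosen only for illustration. One possible reason for a stack pointer to fall outside the range is that the call runs on a stack other than the one the OS thread reports (the backtrace enters through start_task), but that is an assumption and is not confirmed in this thread.

```julia
# Hypothetical helper mirroring the predicate of the failed assertion:
#   currentStackPtr() >= base - size && currentStackPtr() < base
# `stack_base` is the highest address of the stack, `stack_size` its length in bytes.
stack_ptr_in_bounds(sp::UInt, stack_base::UInt, stack_size::UInt) =
    (stack_base - stack_size <= sp) && (sp < stack_base)

# Made-up values: an 8 MiB stack ending at `stack_base`.
stack_base = UInt(0x7000_0080_0000)
stack_size = UInt(8 * 1024 * 1024)

# A pointer a few KiB below the top of that stack satisfies the check.
inside_sp = stack_base - UInt(0x1000)

# A pointer into some other memory region (made up) does not.
foreign_sp = UInt(0x7100_0000_0000)

println(stack_ptr_in_bounds(inside_sp, stack_base, stack_size))
println(stack_ptr_in_bounds(foreign_sp, stack_base, stack_size))
```

The upper bound is exclusive, so a pointer equal to `stack_base` is rejected as well.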

What do versioninfo() and AMDGPU.versioninfo() print?

vchuravy • May 25 '25 15:05

Right. That segfaults even when I am using Julia single threaded.

julia> AMDGPU.versioninfo()
[ Info: AMDGPU versioninfo
julia: /usr/src/debug/hip-runtime/clr-rocm-6.2.4/hipamd/src/hip_code_object.cpp:1152: hip::FatBinaryInfo** hip::StatCO::addFatBinary(const void*, bool): Assertion `err == hipSuccess' failed.

[10123] signal 6 (-6): Aborted
in expression starting at REPL[2]:1
unknown function (ip: 0x728308c69624)
gsignal at /usr/lib/libc.so.6 (unknown line)
abort at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x728308bf74ea)
unknown function (ip: 0x728290250954)
unknown function (ip: 0x727ff5b6b99c)
unknown function (ip: 0x728308e1e1d6)
unknown function (ip: 0x728308e1e2ac)
_dl_catch_exception at /lib64/ld-linux-x86-64.so.2 (unknown line)
unknown function (ip: 0x728308e24e78)
_dl_catch_exception at /lib64/ld-linux-x86-64.so.2 (unknown line)
unknown function (ip: 0x728308e25283)
unknown function (ip: 0x728308c639d3)
_dl_catch_exception at /lib64/ld-linux-x86-64.so.2 (unknown line)
unknown function (ip: 0x728308e1b558)
unknown function (ip: 0x728308c634b2)
dlopen at /usr/lib/libc.so.6 (unknown line)
ijl_load_dynamic_library at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/dlload.c:365
jl_get_library_ at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/runtime_ccall.cpp:45 [inlined]
jl_get_library_ at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/runtime_ccall.cpp:29
ijl_lazy_load_and_lookup at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/runtime_ccall.cpp:73
macro expansion at /home/leios/.julia/packages/AMDGPU/STpZC/src/utils.jl:122 [inlined]
miopenGetVersion at /home/leios/.julia/packages/AMDGPU/STpZC/src/dnn/libMIOpen.jl:29
version at /home/leios/.julia/packages/AMDGPU/STpZC/src/dnn/MIOpen.jl:62 [inlined]
_ver at /home/leios/.julia/packages/AMDGPU/STpZC/src/utils.jl:5 [inlined]
versioninfo at /home/leios/.julia/packages/AMDGPU/STpZC/src/utils.jl:6
unknown function (ip: 0x7283019193af)
jl_apply at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
do_call at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/interpreter.c:126
eval_value at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/interpreter.c:223
eval_stmt_value at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/interpreter.c:174 [inlined]
eval_body at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/interpreter.c:663
jl_interpret_toplevel_thunk at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/interpreter.c:821
jl_toplevel_eval_flex at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/toplevel.c:943
jl_toplevel_eval_flex at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/toplevel.c:886
ijl_toplevel_eval_in at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/toplevel.c:994
eval at ./boot.jl:430 [inlined]
eval_user_input at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:245
repl_backend_loop at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:342
#start_repl_backend#59 at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:327
start_repl_backend at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:324
#run_repl#72 at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:483
run_repl at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:469
jfptr_run_repl_10097.1 at /home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/REPL/u0gqU_XvZAg.so (unknown line)
#1150 at ./client.jl:446
jfptr_YY.1150_14693.1 at /home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/REPL/u0gqU_XvZAg.so (unknown line)
jl_apply at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
jl_f__call_latest at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/builtins.c:875
#invokelatest#2 at ./essentials.jl:1055 [inlined]
invokelatest at ./essentials.jl:1052 [inlined]
run_main_repl at ./client.jl:430
repl_main at ./client.jl:567 [inlined]
_start at ./client.jl:541
jfptr__start_73609.1 at /home/leios/builds/julia-1.11.3/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
true_main at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/jlapi.c:900
jl_repl_entrypoint at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/jlapi.c:1059
main at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/cli/loader_exe.c:58
unknown function (ip: 0x728308bf9487)
__libc_start_main at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 6746836 (Pool: 6746608; Big: 228); GC: 7
Aborted (core dumped)

leios • May 25 '25 15:05

rocminfo output:

[leios@noema Fable.jl]$ rocminfo 
ROCk module is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.1
Runtime Ext Version:     1.6
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             
Mwaitx:                  DISABLED
DMAbuf Support:          YES

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    AMD Ryzen 7 7700X 8-Core Processor 
  Uuid:                    CPU-XX                             
  Marketing Name:          AMD Ryzen 7 7700X 8-Core Processor 
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   5573                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            16                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Memory Properties:       
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    31989652(0x1e81f94) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    31989652(0x1e81f94) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    31989652(0x1e81f94) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 2                  
*******                  
  Name:                    gfx1031                            
  Uuid:                    GPU-XX                             
  Marketing Name:          AMD Radeon RX 6700 XT              
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                           
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
    L2:                      3072(0xc00) KB                     
    L3:                      98304(0x18000) KB                  
  Chip ID:                 29663(0x73df)                      
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          128(0x80)                          
  Max Clock Freq. (MHz):   2725                               
  BDFID:                   768                                
  Internal Node ID:        1                                  
  Compute Unit:            40                                 
  SIMDs per CU:            2                                  
  Shader Engines:          2                                  
  Shader Arrs. per Eng.:   2                                  
  WatchPts on Addr. Ranges:4                                  
  Coherent Host Access:    FALSE                              
  Memory Properties:       
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      TRUE                               
  Wavefront Size:          32(0x20)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        32(0x20)                           
  Max Work-item Per CU:    1024(0x400)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Packet Processor uCode:: 122                                
  SDMA engine uCode::      80                                 
  IOMMU Support::          None                               
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    12566528(0xbfc000) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                             
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    12566528(0xbfc000) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                             
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Recommended Granule:0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1031         
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*******                  
Agent 3                  
*******                  
  Name:                    gfx1036                            
  Uuid:                    GPU-XX                             
  Marketing Name:          AMD Radeon Graphics                
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                           
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    2                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
    L2:                      256(0x100) KB                      
  Chip ID:                 5710(0x164e)                       
  ASIC Revision:           1(0x1)                             
  Cacheline Size:          128(0x80)                          
  Max Clock Freq. (MHz):   2200                               
  BDFID:                   4608                               
  Internal Node ID:        2                                  
  Compute Unit:            2                                  
  SIMDs per CU:            2                                  
  Shader Engines:          1                                  
  Shader Arrs. per Eng.:   1                                  
  WatchPts on Addr. Ranges:4                                  
  Coherent Host Access:    FALSE                              
  Memory Properties:       APU
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      TRUE                               
  Wavefront Size:          32(0x20)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        32(0x20)                           
  Max Work-item Per CU:    1024(0x400)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Packet Processor uCode:: 21                                 
  SDMA engine uCode::      9                                  
  IOMMU Support::          None                               
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    15994824(0xf40fc8) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                             
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    15994824(0xf40fc8) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Recommended Granule:2048KB                             
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Recommended Granule:0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1036         
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*** Done *** 

leios • May 25 '25 15:05

> Right. That segfaults even when I am using Julia single threaded.

Yeah that's #767 :/

vchuravy • May 25 '25 15:05

Ah, oops. Duplicate then

leios • May 25 '25 15:05

What is the output of:

using AMDGPU
import Libdl
foreach(println, Libdl.dllist())
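
The full list from Libdl.dllist() is long, so it can help to narrow it to the GPU runtime libraries. Below is a minimal sketch of such a filter. The helper name `filter_loaded_libs` and the substring list are assumptions for illustration, not part of AMDGPU.jl, and the patterns are a guess that may need adjusting for a given ROCm install.

```julia
import Libdl

# Hypothetical helper: keep only the paths whose file name contains one of the
# given substrings (case-insensitive). The default patterns are a guess.
function filter_loaded_libs(paths::AbstractVector{<:AbstractString};
                            patterns = ["hip", "hsa", "roc", "miopen"])
    return filter(paths) do path
        name = lowercase(basename(path))
        any(pat -> occursin(lowercase(pat), name), patterns)
    end
end

# Print the matching libraries currently loaded into this Julia process.
# Run it after `using AMDGPU` (and after a first GPU call, if one succeeds)
# to see which runtime libraries were actually picked up.
foreach(println, filter_loaded_libs(Libdl.dllist()))
```

Only the file name is matched, not the directory, so a library is not selected merely because it lives under a ROCm install prefix.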

> Ah, oops. Duplicate then

No, on my Arch Linux system versioninfo crashes, but other things work.

vchuravy • May 25 '25 15:05

julia> using AMDGPU

julia> import Libdl

julia> foreach(println, Libdl.dllist())
linux-vdso.so.1
/usr/lib/libdl.so.2
/usr/lib/libpthread.so.0
/usr/lib/libc.so.6
/home/leios/builds/julia-1.11.3/bin/../lib/libjulia.so.1.11
/lib64/ld-linux-x86-64.so.2
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libgcc_s.so.1
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libopenlibm.so
/usr/lib/libstdc++.so.6
/usr/lib/libm.so.6
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libjulia-internal.so.1.11
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libunwind.so.8
/usr/lib/librt.so.1
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libz.so.1
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libatomic.so.1
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libjulia-codegen.so.1.11
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libLLVM-16jl.so
/home/leios/builds/julia-1.11.3/lib/julia/sys.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libpcre2-8.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libgmp.so.10
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libmpfr.so.6
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libgfortran.so.5
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libquadmath.so.0
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libopenblas64_.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libblastrampoline.so.5
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Base64/D7K0n_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Markdown/AREjX_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/InteractiveUtils/0TrXF_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/StyledStrings/UcVoM_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Unicode/E4Hzs_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/REPL/u0gqU_XvZAg.so
/home/leios/.julia/compiled/v1.11/Adapt/rUIgN_2wbLs.so
/home/leios/.julia/compiled/v1.11/CEnum/0gyUJ_4Hzk0.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Printf/3FQLY_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Dates/p8See_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/TOML/mjrwE_XvZAg.so
/home/leios/.julia/compiled/v1.11/Preferences/pWSk8_4Hzk0.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/NetworkOptions/J8H6s_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/MbedTLS_jll/u5NEn_XvZAg.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libmbedcrypto.so.7
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libmbedtls.so.14
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libmbedx509.so.1
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/LibSSH2_jll/K6mup_XvZAg.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libssh2.so.1
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/LibGit2_jll/nfCpg_XvZAg.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libgit2.so.1.7
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/LibGit2/xrYJZ_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/ArgTools/aGHFV_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/nghttp2_jll/KTGSA_XvZAg.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libnghttp2.so.14
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/LibCURL_jll/9JWaY_XvZAg.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libcurl.so.4
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/MozillaCACerts_jll/XKIUi_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/LibCURL/ht49g_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Downloads/eiA4B_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Tar/G9ZYP_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/p7zip_jll/dfuGM_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/UUIDs/SIw1t_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Logging/PWFjL_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Pkg/tUTdb_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/LazyArtifacts/MRP8l_XvZAg.so
/home/leios/.julia/compiled/v1.11/JLLWrappers/7Zgw7_4Hzk0.so
/home/leios/.julia/compiled/v1.11/LLVMExtra_jll/R9OeX_rURdT.so
/home/leios/.julia/compiled/v1.11/LLVM/e8NBy_rURdT.so
/home/leios/.julia/compiled/v1.11/LibTracyClient_jll/mti1A_rURdT.so
/home/leios/.julia/artifacts/8a696873fc2d7c6d28ccd099ccaaa8960691a0a0/lib/libTracyClient.so
/home/leios/.julia/compiled/v1.11/ExprTools/eM8wu_4Hzk0.so
/home/leios/.julia/compiled/v1.11/Tracy/QvZG9_rURdT.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Serialization/zGad9_XvZAg.so
/home/leios/.julia/compiled/v1.11/Scratch/ICI1U_4Hzk0.so
/home/leios/.julia/compiled/v1.11/PrecompileTools/AQ9Mk_4Hzk0.so
/home/leios/.julia/compiled/v1.11/GPUCompiler/yPwef_rURdT.so
/home/leios/.julia/compiled/v1.11/UnsafeAtomics/OuhNJ_4Hzk0.so
/home/leios/.julia/compiled/v1.11/Atomix/3LdQ4_ZduEO.so
/home/leios/.julia/compiled/v1.11/MacroTools/38lnR_NwWdq.so
/home/leios/.julia/compiled/v1.11/StaticArraysCore/Tzw28_4Hzk0.so
/home/leios/.julia/compiled/v1.11/StaticArrays/yY9vm_2wbLs.so
/home/leios/.julia/compiled/v1.11/AdaptStaticArraysExt/9bCdf_2wbLs.so
/home/leios/.julia/compiled/v1.11/KernelAbstractions/aywHT_NwWdq.so
/home/leios/.julia/compiled/v1.11/LinearAlgebraExt/1TyTB_NwWdq.so
/home/leios/.julia/compiled/v1.11/UnsafeAtomicsLLVM/yk2PZ_rURdT.so
/home/leios/.julia/compiled/v1.11/Reexport/bTpYr_4Hzk0.so
/home/leios/.julia/compiled/v1.11/GPUArraysCore/qiYUe_PQ5ag.so
/home/leios/.julia/compiled/v1.11/Statistics/ERcPL_4Hzk0.so
/home/leios/.julia/compiled/v1.11/StaticArraysStatisticsExt/EfhbW_2wbLs.so
/home/leios/.julia/compiled/v1.11/GPUArrays/v5u0T_rURdT.so
/home/leios/.julia/compiled/v1.11/ArgCheck/P66Js_rURdT.so
/home/leios/.julia/compiled/v1.11/ManualMemory/rywzg_4Hzk0.so
/home/leios/.julia/compiled/v1.11/ThreadingUtilities/FwPmW_rURdT.so
/home/leios/.julia/compiled/v1.11/ArrayInterface/7bROb_rURdT.so
/home/leios/.julia/compiled/v1.11/IfElse/YSZB7_4Hzk0.so
/home/leios/.julia/compiled/v1.11/CommonWorldInvalidations/rQYNM_4Hzk0.so
/home/leios/.julia/compiled/v1.11/Static/4nGFz_2wbLs.so
/home/leios/.julia/compiled/v1.11/Compat/GSFWK_4Hzk0.so
/home/leios/.julia/compiled/v1.11/CompatLinearAlgebraExt/Zxpzq_4Hzk0.so
/home/leios/.julia/compiled/v1.11/StaticArrayInterface/1FInX_rURdT.so
/home/leios/.julia/compiled/v1.11/SIMDTypes/NiIYy_4Hzk0.so
/home/leios/.julia/compiled/v1.11/LayoutPointers/SicMc_rURdT.so
/home/leios/.julia/compiled/v1.11/CloseOpenIntervals/eAH4s_rURdT.so
/home/leios/.julia/compiled/v1.11/StrideArraysCore/kWbGj_rURdT.so
/home/leios/.julia/compiled/v1.11/BitTwiddlingConvenienceFunctions/fzQ1O_2wbLs.so
/home/leios/.julia/compiled/v1.11/CpuId/vMZBF_4Hzk0.so
/home/leios/.julia/compiled/v1.11/CPUSummary/3IE2Z_2wbLs.so
/home/leios/.julia/compiled/v1.11/PolyesterWeave/XwY71_rURdT.so
/home/leios/.julia/compiled/v1.11/Polyester/V16F5_rURdT.so
/home/leios/.julia/compiled/v1.11/ArrayInterfaceGPUArraysCoreExt/sVb8r_rURdT.so
/home/leios/.julia/compiled/v1.11/ArrayInterfaceStaticArraysCoreExt/gqwrP_rURdT.so
/home/leios/.julia/compiled/v1.11/StaticArrayInterfaceStaticArraysExt/Nww1z_rURdT.so
/home/leios/.julia/compiled/v1.11/StableTasks/OT4TS_rURdT.so
/home/leios/.julia/compiled/v1.11/ChunkSplitters/DaaT8_rURdT.so
/home/leios/.julia/compiled/v1.11/TaskLocalValues/oQVnM_4Hzk0.so
/home/leios/.julia/compiled/v1.11/ScopedValues/fN9Bp_4Hzk0.so
/home/leios/.julia/compiled/v1.11/ConstructionBase/sBbW6_4Hzk0.so
/home/leios/.julia/compiled/v1.11/ConstructionBaseLinearAlgebraExt/mc2IG_4Hzk0.so
/home/leios/.julia/compiled/v1.11/InitialValues/djxcV_4Hzk0.so
/home/leios/.julia/compiled/v1.11/InverseFunctions/PkVmn_4Hzk0.so
/home/leios/.julia/compiled/v1.11/CompositionsBase/KqDTx_4Hzk0.so
/home/leios/.julia/compiled/v1.11/CompositionsBaseInverseFunctionsExt/WTKja_4Hzk0.so
/home/leios/.julia/compiled/v1.11/InverseFunctionsDatesExt/gjNlb_4Hzk0.so
/home/leios/.julia/compiled/v1.11/Accessors/XelUh_rURdT.so
/home/leios/.julia/compiled/v1.11/LinearAlgebraExt/Pm3AL_rURdT.so
/home/leios/.julia/compiled/v1.11/BangBang/Ovsha_rURdT.so
/home/leios/.julia/compiled/v1.11/OhMyThreads/2oy0C_rURdT.so
/home/leios/.julia/compiled/v1.11/ConstructionBaseStaticArraysExt/MmdaU_2wbLs.so
/home/leios/.julia/compiled/v1.11/StaticArraysExt/sIz4V_rURdT.so
/home/leios/.julia/compiled/v1.11/BangBangStaticArraysExt/I8ZlX_rURdT.so
/home/leios/.julia/compiled/v1.11/MarkdownExt/xNmCG_rURdT.so
/home/leios/.julia/compiled/v1.11/AcceleratedKernels/M6fRl_rURdT.so
/home/leios/.julia/compiled/v1.11/DataValueInterfaces/9Lpkp_4Hzk0.so
/home/leios/.julia/compiled/v1.11/DataAPI/3a8mN_4Hzk0.so
/home/leios/.julia/compiled/v1.11/IteratorInterfaceExtensions/N0h8q_4Hzk0.so
/home/leios/.julia/compiled/v1.11/TableTraits/I6SaN_4Hzk0.so
/home/leios/.julia/compiled/v1.11/OrderedCollections/LtT3J_Hzd3d.so
/home/leios/.julia/compiled/v1.11/Tables/Z804B_Hzd3d.so
/home/leios/.julia/compiled/v1.11/StringManipulation/4nJQd_rURdT.so
/home/leios/.julia/compiled/v1.11/Crayons/TXPcU_4Hzk0.so
/home/leios/.julia/compiled/v1.11/LaTeXStrings/H4HGh_4Hzk0.so
/home/leios/.julia/compiled/v1.11/PrettyTables/kRdcL_rURdT.so
/home/leios/.julia/compiled/v1.11/BangBangTablesExt/h92XF_rURdT.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/LLD_jll/ZHBMJ_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/Zlib_jll/xjq3Q_XvZAg.so
/home/leios/.julia/compiled/v1.11/ROCmDeviceLibs_jll/JXG1e_4Hzk0.so
/home/leios/.julia/compiled/v1.11/GPUToolbox/VNkSP_rURdT.so
/home/leios/.julia/compiled/v1.11/IrrationalConstants/ukdUG_4Hzk0.so
/home/leios/.julia/compiled/v1.11/DocStringExtensions/KRdZs_2wbLs.so
/home/leios/.julia/compiled/v1.11/LogExpFunctions/cmCYR_2wbLs.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/OpenLibm_jll/ToVO1_XvZAg.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/CompilerSupportLibraries_jll/iCwSB_XvZAg.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libgomp.so.1
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libssp.so.0
/home/leios/.julia/compiled/v1.11/OpenSpecFun_jll/TDl1L_4Hzk0.so
/home/leios/.julia/artifacts/09b351e89a85e07e957194a647765403d4ee1bcb/lib/libopenspecfun.so
/home/leios/.julia/compiled/v1.11/SpecialFunctions/78gOt_rURdT.so
/home/leios/.julia/compiled/v1.11/LogExpFunctionsInverseFunctionsExt/IXKft_PQ5ag.so
/home/leios/.julia/compiled/v1.11/RandomNumbers/pgCpR_4Hzk0.so
/home/leios/.julia/compiled/v1.11/Random123/1imiM_rURdT.so
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/SuiteSparse_jll/ME9At_XvZAg.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libamd.so.3
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libsuitesparseconfig.so.7
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libbtf.so.2
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libcamd.so.3
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libccolamd.so.3
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libcholmod.so.5
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libcolamd.so.3
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libklu.so.2
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libldl.so.3
/home/leios/builds/julia-1.11.3/bin/../lib/julia/librbio.so.4
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libspqr.so.4
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libumfpack.so.6
/home/leios/builds/julia-1.11.3/share/julia/compiled/v1.11/SparseArrays/P9ieR_XvZAg.so
/home/leios/.julia/compiled/v1.11/AdaptSparseArraysExt/7qfxl_2wbLs.so
/home/leios/.julia/compiled/v1.11/SparseArraysExt/TR6ym_NwWdq.so
/home/leios/.julia/compiled/v1.11/SparseArraysExt/k0lPI_4Hzk0.so
/home/leios/.julia/compiled/v1.11/ArrayInterfaceSparseArraysExt/3nhwj_rURdT.so
/home/leios/.julia/compiled/v1.11/AbstractFFTs/Di3HZ_4Hzk0.so
/home/leios/.julia/compiled/v1.11/AMDGPU/arpZD_rURdT.so
/opt/rocm/lib/libamdhip64.so
/opt/rocm/lib/librocprofiler-register.so.0
/opt/rocm/lib/libamd_comgr.so.2
/opt/rocm/lib/libhsa-runtime64.so.1
/usr/lib/libnuma.so.1
/usr/lib/libfmt.so.11
/usr/lib/libglog.so.2
/usr/lib/libzstd.so.1
/usr/lib/libncursesw.so.6
/usr/lib/libelf.so.1
/opt/rocm/lib/libhsakmt.so.1
/usr/lib/libdrm.so.2
/usr/lib/libgflags.so.2.2
/usr/lib/libdrm_amdgpu.so.1
/opt/rocm/lib/libhsa-amd-aqlprofile64.so

leios avatar May 25 '25 15:05 leios

vchuravy@odin ~> pacman -Qe | grep amd
amd-ucode 20250408.c1a774f3-1
amdmemorytweak-git 40.9a64ff1-1
amdvlk 2025.Q1.3-1
hip-runtime-amd 6.3.3-1
xf86-video-amdgpu 23.0.0-2
vchuravy@odin ~> pacman -Qe | grep roc
python-zeroconf 0.146.1-1
rocm-hip-sdk 6.3.3-1
roctracer 6.3.3-1

So it seems I am on 6.3 and you are likely on 6.0?

vchuravy avatar May 25 '25 15:05 vchuravy

6.2.4

I'll update?

[leios@noema ~]$ pacman -Qe | grep amd
amd-ucode 20250210.5bc5868b-1
amdvlk 2024.Q4.3-1
hsa-amd-aqlprofile-bin 6.2.4-1
xf86-video-amdgpu 23.0.0-2
[leios@noema ~]$ pacman -Qe | grep roc
rocblas 6.2.4-1
rocfft 6.2.4-1
rocm-hip-runtime 6.2.2-1
rocm-hip-sdk 6.2.2-1
rocm-opencl-sdk 6.2.2-1
rocm-smi-lib 6.2.4-1
rocrand 6.2.4-1
rocsolver 6.2.4-1
rocsparse 6.2.4-1
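
The package list above mixes 6.2.2 and 6.2.4 versions, which is easy to miss when scanning by eye. As a minimal sketch (the helper name rocm_versions and the roc/hip/hsa name prefixes are assumptions for illustration, not an official list of ROCm packages), the distinct upstream versions can be pulled out of pacman -Q style output like this:

```shell
# Hypothetical helper: reads "name version-pkgrel" lines on stdin and prints
# the distinct upstream versions of packages whose names start with
# roc, hip or hsa (an assumed heuristic for "ROCm-related").
rocm_versions() {
  grep -Ei '^(roc|hip|hsa)' |
    awk '{ sub(/-[^-]*$/, "", $2); print $2 }' |  # strip the "-1" pkgrel suffix
    sort -u
}

# Usage on an Arch system:
#   pacman -Q | rocm_versions
```

More than one line of output means the installed ROCm packages are not all at the same version; whether that matters for this crash is not established in the thread.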

leios avatar May 25 '25 15:05 leios

Good news, everything's still broken on 6.4 for me

[leios@noema ~]$ pacman -Qe | grep amd
amd-ucode 20250508.788aadc8-2
amdvlk 2025.Q2.1-1
hsa-amd-aqlprofile-bin 6.4.0-1
xf86-video-amdgpu 23.0.0-2

[leios@noema ~]$ pacman -Qe | grep roc
rocblas 6.4.0-1
rocfft 6.4.0-1
rocm-hip-runtime 6.4.0-1
rocm-hip-sdk 6.4.0-1
rocm-opencl-sdk 6.4.0-1
rocm-smi-lib 6.4.0-1
rocrand 6.4.0-1
rocsolver 6.4.0-1
rocsparse 6.4.0-1
[leios@noema ~]$ julia -t 2
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.3 (2025-01-21)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using AMDGPU

julia> AMDGPU.zeros(10)
10-element ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}:
julia: /usr/src/debug/hip-runtime/hip-runtime-clr/rocclr/os/os_posix.cpp:321: static void amd::Os::currentStackInfo(unsigned char**, size_t*): Assertion `Os::currentStackPtr() >= *base - *size && Os::currentStackPtr() < *base && "just checking"' failed.

[1364] signal 6 (-6): Aborted
in expression starting at none:0
unknown function (ip: 0x7fa23f6b374c)
gsignal at /usr/lib/libc.so.6 (unknown line)
abort at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x7fa23f6414e2)
unknown function (ip: 0x7fa1c0e660bb)
unknown function (ip: 0x7fa1c11437d3)
unknown function (ip: 0x7fa1c0ea5feb)
macro expansion at /home/leios/.julia/packages/GPUToolbox/cZlg7/src/ccalls.jl:143 [inlined]
macro expansion at /home/leios/.julia/packages/AMDGPU/STpZC/src/utils.jl:122 [inlined]
hipGetDeviceCount at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/libhip.jl:42 [inlined]
ndevices at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/device.jl:103
TaskLocalState at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:11 [inlined]
TaskLocalState at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:11
TaskLocalState at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:11 [inlined]
#25 at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:27 [inlined]
get! at ./iddict.jl:171
task_local_state! at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:26
prepare_state at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:193 [inlined]
hipStreamQuery at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/libhip.jl:113 [inlined]
#11 at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/stream.jl:114
unknown function (ip: 0x7fa2375f646f)
jl_apply at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
start_task at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/task.c:1202
Allocations: 23885857 (Pool: 23885222; Big: 635); GC: 16
Aborted (core dumped)

leios avatar May 25 '25 15:05 leios

Uhhh.

Works now.

I didn't change anything on my end, but I guess I am closing this for now?

leios avatar May 26 '25 10:05 leios

Reopening because it's still happening, but now seemingly at random. I can't really figure out how to create a MWE because it'll work sometimes and not at other times.
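
For a crash that only shows up some of the time, one rough way to put a number on it is to run the same command repeatedly and count the non-zero exits. This is only a sketch: the helper name count_failures is made up here, and the Julia one-liner in the usage comment is an assumed reproducer based on the session above, not a confirmed MWE.

```shell
# Hypothetical helper: run a command N times and report how many runs
# exited with a non-zero status (an abort or segfault counts as a failure).
count_failures() {
  runs=$1
  shift
  fails=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    "$@" >/dev/null 2>&1 || fails=$((fails + 1))
    i=$((i + 1))
  done
  echo "$fails/$runs runs failed"
}

# Assumed usage, mirroring the REPL session in this issue:
#   count_failures 20 julia -t 2 -e 'using AMDGPU; display(AMDGPU.zeros(10))'
```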

An example: https://youtube.com/clip/UgkxRc957OTXTE5V5cFjtL4to-w02-kP2vmf?feature=shared

This one is actually closer to #690, but I already had export UCX_ERROR_SIGNALS="SIGILL,SIGBUS,SIGFPE" set, which had solved the issue previously.

(also, yes, the video's unlisted precisely because I got a little animated)

leios avatar Jun 08 '25 14:06 leios

UCX_ERROR_SIGNALS is only applicable to MPI-based libraries that use the UCX library.

I agree that the error looks similar to that. I assume this is Julia 1.11?

The backtrace is kinda weird... It looks like you are encountering a segmentation fault in https://github.com/JuliaLang/julia/blob/760b2e5b7396f9cc0da5efce0cadd5d1974c4069/src/jlapi.c#L740 so ct-> might be an issue here, but that would mean something corrupted Julia's task-local state since we just loaded the jl_current_task from it...

vchuravy avatar Jun 09 '25 08:06 vchuravy

Should I create another issue somewhere else? I am happy to do so next time I get this error so we can get more info.

Again, like in #690, I was not explicitly loading MPI libraries.
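
One way to check whether UCX or MPI ended up in the process anyway (for example through a transitive dependency) is to scan the list of loaded libraries that Julia prints in the crash dump. A minimal sketch, assuming the usual shared-object names libucp, libucs, libuct, libucm and libmpi; the helper name ucx_or_mpi_libs is hypothetical:

```shell
# Hypothetical helper: reads library paths on stdin (one per line, e.g. the
# loaded-library list from a Julia crash dump) and prints any that look like
# UCX or MPI libraries, or a short message when none are found.
ucx_or_mpi_libs() {
  grep -E '/lib(ucp|ucs|uct|ucm|mpi)[^/]*\.so' ||
    echo "no UCX/MPI libraries found"
}

# Usage: save the library list from the crash output to a file, then
#   ucx_or_mpi_libs < loaded_libs.txt
```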

leios avatar Jun 09 '25 09:06 leios

Ah, for the record, the UCX issue is only one of the segfaults. I still regularly get the one associated with this issue and the one for versioninfo(). I still don't know how to diagnose this locally, as it seems to appear more or less at random.

leios avatar Jun 09 '25 09:06 leios

Whoops. I didn't mean to close it, and I don't have permission to reopen.

leios avatar Jun 09 '25 09:06 leios

The original error looks like: https://github.com/ROCm/clr/issues/36

I've seen this with a debug ROCm build.

pxl-th avatar Jun 12 '25 12:06 pxl-th

I had this issue and solved it by installing ROCm via the AUR instead of pacman.

So I removed every ROCm package I had with pacman -Rns (followed by every ROCm/HIP/MIOpen package you have installed), and then installed ROCm via yay -S opencl-amd-dev, which installs everything back. I didn't know that.

Now everything works like a charm.
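
The steps above could be sketched roughly as follows. The helper name rocm_removal_cmd and the roc/hip/hsa/miopen name prefixes are assumptions for illustration and may miss or over-match packages on a given system, so the helper only prints the removal command for review instead of running it.

```shell
# Hypothetical helper: reads installed package names on stdin (one per line,
# as printed by `pacman -Qq`) and prints the pacman -Rns command that would
# remove the ones that look ROCm-related. Nothing is removed by this function.
rocm_removal_cmd() {
  pkgs=$(grep -Ei '^(roc|hip|hsa|miopen)' | paste -sd' ' -)
  if [ -n "$pkgs" ]; then
    echo "sudo pacman -Rns $pkgs"
  else
    echo "no ROCm packages found"
  fi
}

# Usage on an Arch system:
#   pacman -Qq | rocm_removal_cmd   # review the printed command, then run it
#   yay -S opencl-amd-dev           # reinstall ROCm from the AUR, as described above
```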

FlenneSoyeux avatar Jun 25 '25 08:06 FlenneSoyeux