James Schloss

253 comments by James Schloss

Right. That segfaults even when I am using Julia single threaded.

```
julia> AMDGPU.versioninfo()
[ Info: AMDGPU versioninfo
julia: /usr/src/debug/hip-runtime/clr-rocm-6.2.4/hipamd/src/hip_code_object.cpp:1152: hip::FatBinaryInfo** hip::StatCO::addFatBinary(const void*, bool): Assertion `err == hipSuccess' failed.
[10123]...
```

rocm info:

```
[leios@noema Fable.jl]$ rocminfo
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version:         1.1
Runtime Ext Version:     1.6
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:...
```

```
julia> using AMDGPU
julia> import Libdl
julia> foreach(println, Libdl.dllist())
linux-vdso.so.1
/usr/lib/libdl.so.2
/usr/lib/libpthread.so.0
/usr/lib/libc.so.6
/home/leios/builds/julia-1.11.3/bin/../lib/libjulia.so.1.11
/lib64/ld-linux-x86-64.so.2
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libgcc_s.so.1
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libopenlibm.so
/usr/lib/libstdc++.so.6
/usr/lib/libm.so.6
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libjulia-internal.so.1.11
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libunwind.so.8
/usr/lib/librt.so.1
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libz.so.1
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libatomic.so.1
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libjulia-codegen.so.1.11
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libLLVM-16jl.so
/home/leios/builds/julia-1.11.3/lib/julia/sys.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libpcre2-8.so
/home/leios/builds/julia-1.11.3/bin/../lib/julia/libgmp.so.10...
```

6.2.4 I'll update?

```
[leios@noema ~]$ pacman -Qe | grep amd
amd-ucode 20250210.5bc5868b-1
amdvlk 2024.Q4.3-1
hsa-amd-aqlprofile-bin 6.2.4-1
xf86-video-amdgpu 23.0.0-2
[leios@noema ~]$ pacman -Qe | grep roc
rocblas 6.2.4-1
rocfft 6.2.4-1...
```

Good news, everything's still broken on 6.4 for me

```
[leios@noema ~]$ pacman -Qe | grep amd
amd-ucode 20250508.788aadc8-2
amdvlk 2025.Q2.1-1
hsa-amd-aqlprofile-bin 6.4.0-1
xf86-video-amdgpu 23.0.0-2
[leios@noema ~]$ pacman -Qe |...
```

Uhhh. Works now. I didn't change anything on my end, but I guess I am closing this for now?

Reopening because it's still happening, but now seemingly at random. I can't really figure out how to create a MWE because it'll work sometimes and not at other times. An...

Should I create another issue somewhere else? I am happy to do so next time I get this error so we can get more info. Again, like in #690, I...

Ah, for the record, the UCX issue is only one of the segfaults. I still get the one associated with this issue regularly and the one for `versioninfo()`. I still...