AMDGPU.jl
                                
Update ROCSparse for Julia v1.10
@pxl-th @dkarrasch
FYI, I have temporarily disabled the rocSPARSE tests since they were crashing my Navi 3 in CI and I didn't have time to investigate the root cause. Also, for some reason the rocBLAS tests segfault on ROCm 5.6 (@luraess).
@amontoison have you run the tests locally? We can of course re-enable rocSPARSE tests, but I'm not sure they will run successfully
@pxl-th The tests passed on our cluster.
@amontoison, I've sent you an invite so that you can merge PRs. I currently don't have access to AMD GPUs and am therefore not working on AMDGPU.jl. So feel free to merge PRs once they are in a good state (although I'd recommend merging them only when CI is green).
I can try running the tests on my system @pxl-th (now with ROCm 6.0.2 on Navi 3). On which system did they pass @amontoison ?
It was on Frontier. I need to check the ROCm version with @michel2323.
It was ROCm 6.0 on an MI250.
Running the ROCSparse tests on Navi 3 (gfx1101 - Radeon RX 7800 XT) with ROCm 6.0.2, I am getting the following test errors (along with an error in rocBLAS): test_log_out.txt.
@luraess Can you check whether the rocSPARSE tests also fail on the master branch?
Can you also give more details about the errors?
I suspect that something is not dispatched correctly, because all the unit tests for mv! and mm! passed.
Running only the rocSPARSE tests on master, I am getting some warnings but no errors. The BLAS test is still failing.
rocSaprse_out.txt
@luraess Can you just run include("test/rocarray/blas.jl")?
Yes, here is the output of running the test: test_out.txt
Thanks @luraess! But you need to import additional packages to isolate the issue:
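using AMDGPU
using LinearAlgebra
import GPUArrays
# Load the GPUArrays test suite, which defines the TestSuite module.
include(joinpath(pkgdir(GPUArrays), "test", "testsuite.jl"))
# Helper used by the tests: compare f on CPU arrays against ROCArray.
testf(f, xs...; kwargs...) =
    TestSuite.compare(f, AMDGPU.ROCArray, xs...; kwargs...)
# Relative path: run this from the root of the AMDGPU.jl repository.
include("test/rocarray/blas.jl")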
Thanks for the hints. Following them, I am getting a segfault on Navi 3 with ROCm 6.0.2 (blas_navi3.txt) and a bunch of errors on MI250X with ROCm 5.3.3 on LUMI (blas_lumi.txt).