Ludovic Räss

Results 170 comments of Ludovic Räss

> share the steps to test CUDA flow

If you want to test on a backend other than the CPU, you can do so by running the test on a machine where...

AMDGPU on Windows, right? Could one compute the sum differently to discern whether p or q is not executed?

Testing on an RX 7800 XT under Ubuntu 22.04, I can reproduce the issue. The following does not error for me, though:

```julia
@kernel function kernel_xx!(tensor, Nx::Int64, Ny::Int64, Nz::Int64)
    i, j, k...
```

Also the "plain" AMDGPU version works fine:

```julia
using AMDGPU

function compute_amdgpu(tensor, kernel_fun, Nx, Ny, Nz)
    groupsize = (16, 4, 2)                    # nthreads
    gridsize = cld.(size(tensor), groupsize)  # nblocks
    @roc...
```

- It seems that the `ubuntu-latest` test for `test_datatype.jl` fails on nightly with both default and openmpi-jll.
- CUDA tests on >= v1.9 struggle with `test_allreduce`, `test_allgather`, and more collectives (tried...

On CPU, we get a failing test on Ubuntu-latest, [EDIT] ~~Julia 1.9 and 1.10,~~ for `PrimitiveType = Primitive80` on:

- default: https://github.com/JuliaParallel/MPI.jl/actions/runs/10366144398/job/28694765294?pr=861#step:6:265
- openmpi-jll: https://github.com/JuliaParallel/MPI.jl/actions/runs/10366144398/job/28694768737?pr=861#step:7:268

Any hint what could go wrong...

> Where do you see failures with julia 1.9 and 1.10? It looks to me only Julia nightly is failing

Correct, only nightly is failing. 1.9 and 1.10 fail with...

Now CUDA tests segfault on `test_basic.jl` https://buildkite.com/julialang/mpi-dot-jl/builds/1520#01914b09-c528-4d9b-9c31-d8273912270d/286-489, which suggests the issue is not related to collectives but to something else that breaks CUDA-aware MPI in CI. I will revert the excluded tests...

Note that adding e.g. `@fastmath D = sqrt.(B .+ H)` makes it work.

From testing, it looks like the changes added in 0.13.35 (adapt to more recent GPUCompiler) cause this error: https://github.com/EnzymeAD/Enzyme.jl/pull/2338/files cc @wsmoses @giordano