Valentin Churavy

Results: 1415 comments by Valentin Churavy

> Is there a way to have this selected automatically for some use cases, or make it default? Any better strategy than the current one? I was thinking about this,...

Sorry, could we keep the conversation focused here? If there is an issue with atomic operations in CUDA.jl, please open an issue there. I assume that:

> My code speeds...

Can you split the issue into the device-side RNG and the host-side hang? They are unrelated and likely need different work to fix.

I don't think removing the artifacts is the right answer. This breaks the ability of downstream JLLs to build against ROCm and makes it harder for us to understand what version...

In the past I had success using SIMD.jl for this (packed ops on CUDA).

One could, but it would probably require quite a bit of work by someone. CUDA.jl WMMA support was the work of a full-time master's student.

> well it has to be done at some point. why only nvidia to get all the goodies? :P

Are you volunteering?

Will need to change https://github.com/JuliaGPU/KernelAbstractions.jl/blob/c5fe83c899b3fd29308564467c3a3722179bfe9d/Project.toml#L23 to only be `0.7.1`

Will need a rebase for #478.

The tests are a bit sparse. Shouldn't they be enabled for more than the CPU backend?