AMDGPU.jl
AMD GPU (ROCm) programming in Julia
With https://github.com/JuliaLang/julia/pull/53687, the ExaTron.jl tests run into [an assertion failure](https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/2918cd6_vs_a9611ce/ExaTron.primary.log) due to what I think is a bug in AMDGPU.jl. The problem lies in `free!(AMDGPU.Device.HostCallHolder)`, which compiles the following `ccall`: ```...
**MWE**
```julia
using AMDGPU

function main()
    data = rand(Float64, 1024, 1024)
    Threads.@threads for i in 1:1000
        sum(ROCArray(data))
    end
end

main()
```
**gdb**
```
(gdb) bt
#0  __futex_abstimed_wait_common64 (private=, cancel=true, abstime=0x0,...
```
This is related to https://github.com/JuliaGPU/CUDA.jl/issues/2280 When taking the 2-norm (or any p-norm apart from 1- and Inf-norm) of a CuArray view, the implementation errors due to scalar iteration. Interestingly, 1-norm...
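A minimal sketch of the pattern being reported, translated to AMDGPU.jl (hypothetical array sizes; assumes a working ROCm setup). One workaround, until the view-based `norm` path is fixed, is to materialize the view into a contiguous `ROCArray` with `copy` before taking the norm:

```julia
using AMDGPU, LinearAlgebra

A = ROCArray(rand(Float32, 100))
v = @view A[1:50]

norm(A, 2)        # works: dense array uses the GPU reduction path
# norm(v, 2)      # reportedly errors: falls back to scalar iteration on the view
norm(copy(v), 2)  # workaround: copy materializes the view into a dense ROCArray
```

This only sidesteps the issue at the cost of an extra allocation and device copy; the underlying fix would be a non-scalar `norm` implementation for array views.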
I think I'm missing something basic with synchronization. When using a simple `@roc` kernel launch inside a function we get an error in this [AMDGPU.synchronize() line](https://github.com/JuliaORNL/JACC.jl/blob/main/ext/JACCAMDGPU/JACCAMDGPU.jl#L10). The stacktrace can be...
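For reference, this is the basic launch-then-synchronize pattern I would expect to work (a sketch with a hypothetical `axpy!`-style kernel; the `groupsize`/`gridsize` values are placeholders, and note that the meaning of `gridsize` — workgroups vs. total workitems — has changed across AMDGPU.jl versions):

```julia
using AMDGPU

# Hypothetical element-wise kernel: y .+= a .* x
function axpy_kernel!(y, x, a)
    i = workitemIdx().x + (workgroupIdx().x - 1) * workgroupDim().x
    if i <= length(y)
        @inbounds y[i] += a * x[i]
    end
    return
end

x = ROCArray(ones(Float32, 1024))
y = ROCArray(zeros(Float32, 1024))

@roc groupsize=256 gridsize=cld(1024, 256) axpy_kernel!(y, x, 2.0f0)
AMDGPU.synchronize()  # block until the kernel has finished before reading y
```

If `AMDGPU.synchronize()` errors here, the failure is presumably surfacing an error from the kernel launch itself rather than from the synchronization call.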
On a system with ROCm 6.1.0 I fail to install AMDGPU.jl. The error is certainly at my end, and I am investigating.
```
julia> versioninfo()
Julia Version 1.11.0-beta1
Commit 08e1fc0abb9 (2024-04-10...
```
It would be nice to investigate wrapping the [hipTensor](https://github.com/ROCm/hipTensor) library for accelerated tensor operations.
Hi folks! I am working on the multiGPU support of JACC: https://github.com/JuliaORNL/JACC.jl/ For that, I would need to be able to use a single array of pointers that can store...
This small piece of code:
```
input = rand(Float32, 6, 6, 2, 1) |> gpu
output = nn(input) |> cpu
```
becomes slower and slower to compute on my computer. Namely, the...
I'm trying to get AMDGPU to work on Windows with an RX 7900 XT. The test output shows that both the GPU and my iGPU are found successfully. However, the tests hang. After interrupting them...
Similar to https://github.com/JuliaGPU/NCCL.jl/, it would be great to have wrappers for RCCL for training DL models on multiple AMD GPUs. (I know MPI.jl has ROCm support but we don't ship JLLs...