AMDGPU.jl
AMD GPU (ROCm) programming in Julia
Add a test for using `Mem.unsafe_copy3d!` to perform an asynchronous device-to-device copy.
This would reduce the need for using `rocprof` or the Profile stdlib to observe kernel execution ordering and latency hiding efficiency.
See JuliaGPU/Adapt.jl#52. Not functional like this yet: I don't know AMDGPU well enough and currently don't have a device to test on. See JuliaGPU/CUDA.jl#1520 for the CUDA.jl counterpart.
Hello @maleadt @jpsamaroo, on my side, this commit seems to add support for `AnyROCArray`s in `GPUArray.mapreducedim!`, and it assumes that the passed array is located on the default device. Please...
The following error appears when calling `AMDGPU.rand!` on a large `Float32` ROCArray, 50K×50K or larger. It points at [this line in random.jl](https://github.com/JuliaGPU/AMDGPU.jl/blob/master/src/rand/random.jl#L50) and might be related to using `Int32`. To reproduce:...
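A note on the `Int32` suspicion in the report above: a 50K×50K array holds 2.5 billion elements, which is more than the largest `Int32` value. The sketch below uses only Base Julia (no GPU required) to show that arithmetic. It is not the issue's reproducer, and whether this overflow is the actual cause in `random.jl` is an assumption, not something the report confirms.

```julia
# Element count of a 50K x 50K array, computed in 64-bit arithmetic.
n = Int64(50_000) * Int64(50_000)

# Largest value a 32-bit signed integer can hold.
limit = typemax(Int32)

println("elements:       ", n)
println("typemax(Int32): ", limit)
println("fits in Int32:  ", n <= limit)

# A truncating conversion wraps around to a negative value,
# while a checked conversion throws an InexactError.
wrapped = n % Int32
println("wrapped length: ", wrapped)

checked_ok = try
    Int32(n)
    true
catch err
    err isa InexactError || rethrow()
    false
end
println("checked conversion succeeded: ", checked_ok)
```

If a length or index of this size passes through an `Int32` anywhere on the way to the kernel, either failure mode (a wrapped negative value or a thrown `InexactError`) would be possible, depending on how the conversion is written.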
I have a small server (Minisforum HX90) with a Ryzen 5900HX (Cezanne) APU. I successfully installed AMDGPU ``` julia> AMDGPU.versioninfo() HSA Runtime (ready) - Path: /home/davidj/.julia/artifacts/b1aa837f69ba67b20f9654af56c818e6d9bfd262/lib/libhsa-runtime64.so - Version: 1.1.0 ld.lld (ready) -...
The following kernel produces a segfault on AMDGPU, though it works on CUDA. **MWE:** ```julia using AMDGPU using ROCKernels using KernelAbstractions Base.zeros(::ROCDevice, T, shape) = AMDGPU.zeros(T, shape) linear_threads(::ROCDevice) = 512 @kernel function...
Hello, `GPUArray.mapreducedim!` should be implemented for `AnyROCArray` instead of the more specific `ROCArray` type. Is there a way for `AnyROCArray` to be adapted to `ROCArray`, so that this function...
We should add CI for the different system ROCm versions people might encounter; 4.5 and 5.x are probably the most relevant.
Julia 1.7.3 and ROCm 5.0 ``` (@v1.7) pkg> build AMDGPU Building AMDGPU → `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/535df1a8570f08101f087fa80b7fd764da7759f2/build.log` Precompiling project... 1 dependency successfully precompiled in 14 seconds (65 already precompiled) julia> using AMDGPU julia>...