AMDGPU.jl issues

@atomic is slow within AMDGPU.jl

29

I have noticed that with KernelAbstractions the use of @atomic is very slow on AMDGPU compared to CUDA. I have a test example at https://github.com/CliMA/CGDycore.jl/tree/main/TestAMD with results for a kernel...

OsKnoth

rocWMMA support

6

Could you implement rocWMMA for use with Navi3 GPUs? From what I understood, it uses the AI accelerators present in them for faster matrix multiplication. I guess this could make...

radudiaconu0

Performance gap on a 7-point stencil Laplacian kernel on Frontier MI250x GPUs

1

In a recent [study](https://dl.acm.org/doi/10.1145/3624062.3624278) on Frontier, a 7-point stencil kernel underperforms at half the bandwidth (~300 GB/s) of its HIP counterpart (~600 GB/s) on a single MI250x. The behavior...

williamfgc

Invalid `Os::currentStackPtr()`

A debug HIP build triggers the following assertion:
```julia
julia> using AMDGPU
julia> ROCArray{Float32}(undef, 4)
julia: /home/pxl-th/code/clr/rocclr/os/os_posix.cpp:310: static void amd::Os::currentStackInfo(unsigned char**, size_t*): Assertion `Os::currentStackPtr() >= *base - *size && Os::currentStackPtr() < *base...
```

pxl-th

KA tests trigger assertion if julia is built with them

3

```julia
/home/gabrielbaraldi/julia4/src/llvm-late-gc-lowering.cpp:1029: void NoteDef(State&, BBState&, int, const std::vector&): Assertion `Num >= 0' failed.
[132189] signal (6.-6): Aborted
in expression starting at /home/gabrielbaraldi/.julia/dev/AMDGPU/test/ka_tests.jl:17
pthread_kill at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
raise at /lib/x86_64-linux-gnu/libc.so.6...
```

gbaraldi

error: ran out of registers during register allocation

2

Ok, I'm going to be honest, I cannot create a MWE for this, but I can provide a replicator. I'm not expecting this issue to be fixed soon, but I...

leios

AMDGPU.jl on rolling release distros (Arch): Libraries unavailable

3

Hello, just a quick report and temporary solution. I was getting this issue on Arch:
```
julia> using AMDGPU
┌ Warning: HSA runtime is unavailable, compilation and runtime functionality will...
```

leios

Investigate using MIOpen's immediate mode for conv algorithm search

KA 0.10 API changes

[Question] AI accelerators
