GPUArrays.jl

KA.jl-related slowdowns

Open • maleadt opened this issue 1 year ago • 6 comments

The switch to KA.jl (KernelAbstractions.jl) significantly slowed down several operations.


CUDA.jl: permutedims, broadcast, and many others

https://speed.juliagpu.org/changes/?tre=10&rev=6221589f5befec8f6f157a5a5271667dba09d0b6&exe=11&env=1


Metal.jl: permutedims

| Benchmark | After (ns) | Before (ns) | Ratio (after/before) |
|---|---|---|---|
| private array/permutedims/4d | 2911500 | 860084 | 3.39 |
| private array/permutedims/2d | 1065021 | 862229.5 | 1.24 |
| private array/permutedims/3d | 1629229 | 919520.5 | 1.77 |
| shared array/permutedims/4d | 2933000 | 858875 | 3.41 |
| shared array/permutedims/2d | 1054250 | 862292 | 1.22 |
| shared array/permutedims/3d | 1625958 | 923916.5 | 1.76 |
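
For anyone wanting to check numbers like these locally, here is a minimal timing sketch. It is not the benchmark suite's code: the helper name `time_permutedims`, the array sizes, the permutation, and the sample count are all assumptions chosen for illustration. It defaults to a CPU `Array` so it runs without a GPU. Pass a GPU array type and a synchronization function to time the device path.

```julia
# Hypothetical helper (not part of GPUArrays.jl or its benchmarks).
# dims, perm and samples are illustrative assumptions.
function time_permutedims(ArrayType=Array;
                          dims=(64, 64, 64, 16),
                          perm=(4, 3, 2, 1),
                          samples=10,
                          sync=() -> nothing)
    a = ArrayType(rand(Float32, dims...))

    # Warm-up run so compilation time is not measured.
    permutedims(a, perm)
    sync()

    best = Inf
    for _ in 1:samples
        # GPU operations are asynchronous, so synchronize inside the timed region.
        t = @elapsed begin
            permutedims(a, perm)
            sync()
        end
        best = min(best, t)
    end
    return best * 1e9  # minimum observed time, in nanoseconds
end

cpu_ns = time_permutedims()
println("permutedims (Array, 4d): ", cpu_ns, " ns")
```

With CUDA.jl loaded this could be called as `time_permutedims(CuArray; sync=CUDA.synchronize)`, and with Metal.jl as `time_permutedims(MtlArray; sync=Metal.synchronize)`. Running it on the package versions before and after the KA.jl switch should give a rough after/before ratio comparable to the last column of the table.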

maleadt • Oct 18 '24 06:10