Valentin Churavy

Results: 1413 comments by Valentin Churavy

What backend are you executing this on? Can you isolate this into an MWE?

Ah shoot. I had expected it to be related to https://github.com/JuliaGPU/CUDA.jl/pull/2336, but this looks much more like a core Julia bug.

So it does belong here since on a machine without CUDA it is not reproducible.

@kh-abd-kh I am sorry for your frustration. https://discourse.julialang.org is a better place to seek help than this issue tracker. For this particular case I am confused why it even downloaded...

> Could these things maybe live in https://github.com/anicusan/AcceleratedKernels.jl in the future?

There is a dependency ordering issue, GPUArrays is the common infrastructure and this would be the fallback...

Of course JLArrays doesn't work. That uses the CPU backend and this is `cpu=false`.

Does #436 speed things up for you? I haven't merged it since I haven't seen an impact on benchmarks.

So the question is where the overheads are coming from; maybe you can run a profile and compare where the time is being spent. You could also use static scheduling....
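As a rough illustration of that suggestion, here is a minimal sketch in plain Julia. It assumes that "static scheduling" is analogous to the `:static` option of `Threads.@threads`; the comment may instead refer to a backend-specific setting. The workload and the names `fill_dynamic!` and `fill_static!` are hypothetical and only there to have something to time and profile.

```julia
using Profile

# Hypothetical workload: a cheap per-element computation, so that
# scheduling overhead is visible relative to the useful work.
function fill_dynamic!(out, x)
    # Default scheduling of the threaded loop.
    Threads.@threads for i in eachindex(out, x)
        @inbounds out[i] = sqrt(x[i]) + 1
    end
    return out
end

function fill_static!(out, x)
    # Static scheduling: one contiguous chunk of iterations per thread.
    Threads.@threads :static for i in eachindex(out, x)
        @inbounds out[i] = sqrt(x[i]) + 1
    end
    return out
end

x = rand(10^6)
out_dynamic = similar(x)
out_static = similar(x)

# Warm up so that compilation time is not measured.
fill_dynamic!(out_dynamic, x)
fill_static!(out_static, x)

t_dynamic = @elapsed fill_dynamic!(out_dynamic, x)
t_static = @elapsed fill_static!(out_static, x)
println("default: ", t_dynamic, " s, static: ", t_static, " s")

# Collect a sampling profile of the default-scheduled version and
# print only the frames that were hit at least 10 times.
Profile.clear()
@profile for _ in 1:50
    fill_dynamic!(out_dynamic, x)
end
Profile.print(; mincount = 10)
```

Start Julia with several threads (for example `julia -t 4`) for the comparison to be meaningful. Comparing the profile of both variants shows whether the time goes into the loop body or into task and scheduling machinery.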

@avik-pal Isn't this just a normal Julia performance pitfall? ``` @kernel function batchnorm_kernel_act!(y, @Const(act), @Const(scale), @Const(bias), @Const(x), @Const(μ), @Const(σ²)) i, j = @index(Global, NTuple) y[i, j] = act(scale[i] * (x[i,...