Valentin Churavy

Results 435 issues of Valentin Churavy

We don't want people to use string interpolation in the kernel, à la `@print("ii = $ii; ij = $ij; ik = $ik; bi = $bi; groupsize() = $(groupsize())\n")`.

help wanted

So the idea is that we have a common output format for the CPU backend as well as the GPU backend. This implements NVTXT so that you can load these...

@dpsanders

```
using KernelAbstractions
using OffsetArrays

@kernel function update!(A, @Const(B))
    i, j = @index(Global, NTuple)
    acc = zero(eltype(A))
    for m in -1:1
        for n in -1:1
            acc += B[i+m, j+n]
...
```

In the example below, the stacktrace points to `src/macros.jl:212`, but it should point to the kernel function itself. In general, due to the code motion we are doing, the line...

help wanted

Still fails with an illegal memory access in the kernel.

x-ref: https://github.com/vchuravy/GPUifyLoops.jl/issues/91

enhancement
help wanted

https://github.com/JuliaGPU/KernelAbstractions.jl/blob/1497d4109857c239a3d407943a0f2323c2bfc396/src/backends/cuda.jl#L99 x-ref: https://github.com/vchuravy/GPUifyLoops.jl/issues/100

enhancement
help wanted

https://github.com/vchuravy/GPUifyLoops.jl/issues/103

enhancement
help wanted

https://github.com/vchuravy/GPUifyLoops.jl/issues/104

enhancement
help wanted

Since kernel launches are event-based rather than stream-based, we need a way to time the kernels themselves efficiently using the event system.

enhancement
help wanted