KernelAbstractions.jl
Heterogeneous programming in Julia
This is more for my own reference later. If I create the following kernel: ``` @kernel function check_kernel() cI = @index(Global) if cI == 2 @print("cI is: ", cI, '\n')...
When you call `wait` on both the CPU and the GPU, the synchronization can be unstable. The error message is ```julia any: Test Failed at /home/leo/jcode/lab/wait_fail.jl:34 Expression: Array(c1) ≈ ones(100) Evaluated: Float32[0.0,...
The following code ``` using KernelAbstractions using CuArrays struct S{FT} dummy1::FT dummy2::Int x::FT end @kernel function kernel!(s::S, a, b, dummy3, dummy4) sin(s.x) @inbounds a[1] = b[1] end let FT =...
We don't want people to use string interpolation in the kernel, à la `@print("ii = $ii; ij = $ij; ik = $ik; bi = $bi; groupsize() = $(groupsize())\n")`
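For reference, a minimal sketch of the non-interpolating form, where each value is passed to `@print` as a separate argument (the same style as the `@print("cI is: ", cI, '\n')` call in the first issue above). The kernel name `print_index_kernel` is hypothetical, and the launch and synchronization lines are an assumption about the installed release: older versions return an event to `wait` on, while newer ones return `nothing` and synchronize through the backend.

```julia
using KernelAbstractions

# Hypothetical kernel name, for illustration only.
@kernel function print_index_kernel()
    i = @index(Global)
    # Pass the pieces as separate arguments instead of interpolating
    # them into a single string.
    @print("i = ", i, "\n")
end

# CPU backend, workgroup size 4, static ndrange 4.
kernel! = print_index_kernel(CPU(), 4, 4)
event = kernel!()

# Assumption: older releases return an event; newer releases return nothing.
if event === nothing
    KernelAbstractions.synchronize(CPU())
else
    wait(event)
end
```

Passing the values as separate arguments keeps string construction out of the kernel body, which is what the interpolated form would otherwise require.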
``` using KernelAbstractions using KernelAbstractions.Extras: @unroll using CuArrays @kernel function kernel!(::Val{nreduce}, ::Val{Nq}) where {nreduce, Nq} s_MJQ = @localmem Float64 (Nq * Nq) i, j = @index(Local, NTuple) @inbounds begin @unroll...
Here is a snippet of code ``` using KernelAbstractions, Test, CUDAapi if CUDAapi.has_cuda_gpu() using CuArrays CuArrays.allowscalar(false) end @kernel function localmem_check!(a, @Const(TDIM)) T = eltype(a) #i = @index(Global) # Fails when...
So the idea is that we have a common output format for the CPU backend as well as the GPU backend. This implements NVTXT so that you can load these...
@dpsanders ``` using KernelAbstractions using OffsetArrays @kernel function update!(A, @Const(B)) i, j = @index(Global, NTuple) acc = zero(eltype(A)) for m in -1:1 for n in -1:1 acc += B[i+m, j+n]...
In the example below, the stacktrace points to `src/macros.jl:212`, but it should rather have pointed to the kernel function itself. In general, due to the code motion we are doing, the line...
``` using KernelAbstractions @kernel function kernel_index!(a) i = KernelAbstractions.@index(Global) @inbounds a[i] = i end let a = zeros(1) kernel! = kernel_index!(CPU(), 1, 1) event = kernel!(a) wait(event) end ``` fails...