KernelAbstractions.jl
Heterogeneous programming in Julia
I am trying to set up a dynamic kernel wherein a KA kernel launches a CUDA kernel. The final objective would be to have dynamic parallelism using only kernel abstractions....
Requires an API bump since backends need to implement this with `device_overlay`.
```
Test Failed at /home/runner/work/KernelAbstractions.jl/KernelAbstractions.jl/test/private.jl:98
  Expression: !(occursin("gcframe", IR))
   Evaluated: !(occursin("gcframe", "; Function Signature: cpu_reduce_private(KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.NoDynamicCheck, Base.IteratorsMD.CartesianIndex{1}, Base.IteratorsMD.CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.StaticSize{(8,)}, Base.IteratorsMD.CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, Nothing}}, Array{Float64, 1}, Array{Float64, 2})\n; @ none...
```
Int32 can be quite a bit faster and we should make sure that we use it where we can for our index calculations.
I have encountered what I think is a variable scoping issue that causes one of my KernelAbstractions kernels to fail when executing on the CPU. (GPU execution is fine.) I'm...
This PR tries to include offsets in kernel launches so that the `Global` indices returned by `@index(Global, NTuple)` and `@index(Global, Linear)` are offset by an `offset` argument. Example: ```julia julia>...
#382 added EnzymeRules support, but not yet full support for reverse mode on all backends. Currently only CPU is supported for reverse mode.
hello - I'm able to run a script based on an Oceananigans example in VSCode, but get an error while using the VSCode debugger on the same script that runs...