KernelAbstractions.jl

Heterogeneous programming in Julia

Results: 165 KernelAbstractions.jl issues, sorted by recently updated

A few backends have the option to create arrays using unified memory. For certain algorithms, unified memory can have a significant positive impact on performance by removing the need for...

enhancement
good first issue
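For CUDA.jl specifically, a unified-memory allocation could be sketched roughly as below. The memory-type parameter `CUDA.UnifiedMemory` follows recent CUDA.jl versions, but the exact spelling varies between releases and other backends differ, so treat this as an assumption rather than a settled KernelAbstractions API:

```julia
using CUDA

# Allocate through CUDA's unified (managed) memory, so host and device
# share one buffer and explicit host<->device copies become unnecessary.
A = CuArray{Float32,1,CUDA.UnifiedMemory}(undef, 1024)
A .= 1f0

# The host can touch the same buffer directly; scalar indexing is shown
# only for illustration (it is slow and usually disabled).
CUDA.@allowscalar A[1] == 1f0
```

A KernelAbstractions-level feature would presumably hide this behind a backend-agnostic allocation flag rather than a backend-specific type parameter.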

Reproducer: CUDA.jl:
```julia
using CUDA
using Adapt
CUDA.allowscalar(false)
# using KernelAbstractions
is_valid_index(meta, ui) =
    1 ≤ ui[1] ≤ params(meta)[4] &&
    1 ≤ ui[2] ≤ params(meta)[1] &&
    1 ≤ ui[3] ≤...
```

At the moment, `supports_atomics` returns a boolean, but different backends have different levels of support. For example, Metal essentially only supports 32-bit integers and floats, with 64-bit integer atomics being...
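A minimal sketch of what a finer-grained query could look like. `KernelAbstractions.supports_atomics(backend)` is the existing coarse API; the per-type `supports_atomics_for` method below is hypothetical and not part of KernelAbstractions today:

```julia
using KernelAbstractions

# Current API: a single Bool for the whole backend.
KernelAbstractions.supports_atomics(CPU())

# Hypothetical finer-grained query: answer per element type, so a backend
# like Metal could report true for Int32 but false for Int64.
supports_atomics_for(backend, ::Type) = KernelAbstractions.supports_atomics(backend)
supports_atomics_for(backend, ::Type{Int64}) = false  # e.g. a Metal-style restriction
```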

Implement reduction API. Supports two types of algorithms: - thread: reduction performed by threads; uses shmem of length `groupsize`, no bank conflicts, no divergence. - warp: reduction performed by `shfl_down`...
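A rough sketch of the "thread" variant: a tree reduction in local (shared) memory, one partial result per workgroup. The kernel name, the fixed workgroup size of 256, and the assumption that the input length is a multiple of 256 are all illustrative; `@synchronize` inside a loop is uniform here, though CPU-backend support for that pattern may vary:

```julia
using KernelAbstractions

@kernel function groupreduce!(out, a)
    i  = @index(Global, Linear)
    li = @index(Local, Linear)
    gi = @index(Group, Linear)
    shm = @localmem eltype(out) (256,)   # assumes workgroupsize == 256
    shm[li] = a[i]                       # assumes length(a) % 256 == 0
    @synchronize
    s = 128
    while s > 0                          # uniform tree reduction in shmem
        if li ≤ s
            shm[li] += shm[li + s]
        end
        @synchronize
        s >>= 1
    end
    if li == 1
        out[gi] = shm[1]                 # one partial sum per workgroup
    end
end

# Usage sketch: 1024 elements, 4 workgroups of 256.
backend = CPU()
a = KernelAbstractions.ones(backend, Float32, 1024)
partial = KernelAbstractions.zeros(backend, Float32, 4)
groupreduce!(backend, 256)(partial, a; ndrange = length(a))
KernelAbstractions.synchronize(backend)
```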

Kernel closures support passing `ndrange` and `workgroupsize` as keyword arguments: https://github.com/JuliaGPU/KernelAbstractions.jl/blob/97419620494baa2e45541a6f2015413d6fa9315b/src/KernelAbstractions.jl#L661 The kernel function itself probably should too, while it currently only accepts positional versions of these arguments (used to...

enhancement
good first issue
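For reference, a sketch of what works today, assuming a trivial illustrative kernel: the kernel constructor takes `workgroupsize`/`ndrange` positionally, while the resulting closure already accepts them as keywords; the proposal would make the constructor accept the keyword form too:

```julia
using KernelAbstractions

@kernel function scale!(a, s)
    i = @index(Global)
    a[i] *= s
end

backend = CPU()
a = ones(64)

# Today: positional on the constructor...
k = scale!(backend, 64, length(a))
# ...but keyword on the returned closure.
k(a, 2.0; ndrange = length(a))
KernelAbstractions.synchronize(backend)
```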

I'm not sure how to explain this behavior, it's like `@localmem` cancels the effects of the kernel:
```julia
julia> using Metal, KernelAbstractions

julia> backend = Metal.MetalBackend()
MetalBackend()

julia> @kernel function...
```

I had a request to do #429 for fastmath, so here is my attempt; #431 is also related. Two issues: 1. My only issue is that I don't know what...

To address the 0% code coverage, you may need to follow the steps outlined in https://discourse.julialang.org/t/psa-new-version-of-codecov-action-requires-additional-setup/109857 to use a Codecov token for report upload https://github.com/JuliaGPU/KernelAbstractions.jl/actions/runs/13809668007/job/38632985795#step:13:34

This is mainly to start a conversation around the KA kernel language, as it is currently accumulating more functionality / cruft; for example, if I want a high-performance kernel as...

While looking at #583 I noticed that the `aug_fwd` kernel looks like:
```
function aug_fwd(
    ctx, f::FT, ::Val{ModifiedBetween}, subtape, ::Val{TapeType}, args...,
) where {ModifiedBetween, FT, TapeType}
    # A2 = Const{Nothing}...
```

Enzyme