Valentin Churavy
You can use `@macroexpand` on the `@kernel` definition to see the code KA generates for the CPU.
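For example (a minimal sketch; `copy_kernel!` is just an illustrative kernel):

```julia
using KernelAbstractions

# Expand the @kernel macro and inspect the code KA generates:
expr = @macroexpand @kernel function copy_kernel!(dst, src)
    I = @index(Global)
    dst[I] = src[I]
end
println(expr)
```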
Thanks for the initial implementation; I will have to think about this a bit. I still feel like this may be better expressed as a projection `f(Idx) -> Idx`. ...
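Roughly, a sketch of what I mean (nothing here is an existing KA API; the kernel name and launch line are made up for illustration):

```julia
using KernelAbstractions

# Sketch: the caller passes a projection `f` mapping the launched index
# space onto the index space the kernel should actually touch.
@kernel function projected!(A, f)
    I = @index(Global, Cartesian)
    J = f(I)          # f(Idx) -> Idx
    A[J] += 1
end

# e.g. run over an (n-2)×(n-2) ndrange but write to the interior of A:
# projected!(CPU(), 64)(A, I -> I + CartesianIndex(1, 1); ndrange=(n-2, n-2))
```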
@timholy might also be able to offer advice. IIUC you are trying to implement an exterior/interior iteration split like `EdgeIterator` from https://github.com/JuliaArrays/TiledIteration.jl?
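For reference, a small sketch of `EdgeIterator` usage (assuming the TiledIteration.jl API as I remember it):

```julia
using TiledIteration

A = zeros(8, 8)
outer = CartesianIndices(A)
inner = CartesianIndices((2:7, 2:7))    # interior region

# EdgeIterator visits the indices in `outer` that are not in `inner`,
# i.e. the boundary of A here:
for I in EdgeIterator(outer, inner)
    A[I] = 1
end
```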
You can do this right now, as you would with CUDA.jl/AMDGPU.jl, by projecting a smaller ndrange onto your custom index space. This is more about whether we can do...
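Concretely, the manual version looks something like this (a sketch; the kernel name is made up):

```julia
using KernelAbstractions

# Launch over a smaller ndrange and shift the index by hand, exactly as
# you would in a raw CUDA.jl/AMDGPU.jl kernel:
@kernel function interior!(A)
    I = @index(Global, Cartesian)
    A[I + CartesianIndex(1, 1)] = 0   # maps onto A[2:end-1, 2:end-1]
end

# A = ones(n, n)
# interior!(CPU(), 64)(A; ndrange=(size(A, 1) - 2, size(A, 2) - 2))
```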
You need to be on the branch from https://github.com/JuliaGPU/CUDA.jl/pull/1772
AMDGPU support for KA 0.9: https://github.com/JuliaGPU/AMDGPU.jl/pull/398
Slightly confusing, so not expected. In my experience dynamic parallelism doesn't have the best performance, and of course we will need to figure out what it means for at least...
There is currently no support in KA for wavefront/warp-level programming. Two immediate questions: 1. What would the semantics be on the CPU? 2. Do Intel and Metal also...
Coming back to my question: What's the reason you want to access this functionality? Generally speaking, I don't think warpsize is something we should expose in KA, but there are...
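For context, this is the kind of thing you would write directly in CUDA.jl today, since KA has no portable counterpart (a device-side sketch; all lanes of the warp are assumed to participate):

```julia
using CUDA

# Warp-level sum reduction via shuffles; called from inside a kernel.
function warp_reduce(val)
    offset = CUDA.warpsize() ÷ 2
    while offset > 0
        val += CUDA.shfl_down_sync(CUDA.FULL_MASK, val, offset)
        offset ÷= 2
    end
    return val
end
```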
KernelAbstractions is not oneAPI, so the meaning of "subgroup" needs to be defined clearly and independently. It often comes down to whether we can expose these semantics without too much...