Valentin Churavy

Results 1415 comments of Valentin Churavy

We can do this in a backwards-compatible manner. Which functions do we need?
- `device(::Backend)`
- `device!(::Backend)`
- `devices` or `ndevices`?
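To make the proposal concrete, here is a hedged sketch of what that backward-compatible surface could look like. The function names come from the list above; the backend-extension pattern and the `CUDABackend` methods shown in comments are my assumptions, not an existing KernelAbstractions.jl interface.

```julia
# Hypothetical API sketch -- names from the comment above, not a shipped interface.
module DeviceAPI

abstract type Backend end

# Generic functions each backend package would extend:
function device end    # device(::Backend)      -> currently active device
function device! end   # device!(::Backend, i)  -> switch the active device
function ndevices end  # ndevices(::Backend)    -> number of visible devices

# A backend package could then add methods, e.g. (assumed, for illustration):
#   DeviceAPI.device(::CUDABackend)     = CUDA.device()
#   DeviceAPI.device!(::CUDABackend, i) = CUDA.device!(i)
#   DeviceAPI.ndevices(::CUDABackend)   = length(CUDA.devices())

end # module
```

Existing code that never calls these functions keeps working unchanged, which is what makes the addition backward compatible.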

> Those, as well as synchronization functions on a single device.

Can you expand on that? We already have `synchronize` w.r.t. the currently active device.

```julia
@kernel function kern()
    I = @index(Global, Cartesian)
    @show I
end
```

```julia
kern(CPU(), 64)(ndrange=(2, 3, 4))
I = CartesianIndex(1, 1, 1)
I = CartesianIndex(2, 1, 1)
I = CartesianIndex(1,...
```

One thing to note is that your `kern(CPU(), 64)` is equivalent to a workgroupsize of `(64, 1, 1)`. So I am not surprised that `R1 = CartesianIndices((1:1,1:N,1:N))` is slow. For that I would expect...

Well, I thought I had documented that clearly, but I can't seem to find it... Take a look at: https://juliagpu.github.io/KernelAbstractions.jl/stable/examples/performance/ The workgroupsize is also a tuple where you provide the...
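To illustrate the point from the two comments above, here is a minimal sketch of passing the workgroupsize as a tuple that matches the iteration layout, rather than a scalar (which is treated as `(64, 1, 1)`). The kernel name and array shapes are my own choices for the example.

```julia
using KernelAbstractions

# Simple elementwise copy kernel, indexed with Cartesian indices.
@kernel function copy_kernel!(dst, src)
    I = @index(Global, Cartesian)
    dst[I] = src[I]
end

src = rand(64, 64)
dst = similar(src)

# A tuple workgroupsize of (8, 8) tiles the 2D ndrange; a bare `64`
# would instead mean (64, 1, 1), which fits a 2D iteration space poorly.
copy_kernel!(CPU(), (8, 8))(dst, src; ndrange=size(src))
KernelAbstractions.synchronize(CPU())
```

Matching the shape of the workgroup to the shape of the `ndrange` is what the performance example in the docs is getting at.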

Yeah, this is expected and is the reason the `@uniform` macro is needed: https://juliagpu.github.io/KernelAbstractions.jl/api/#KernelAbstractions.@uniform
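For readers following along, a short sketch of what `@uniform` does: it marks a value as identical across the whole workgroup, so it is hoisted out of the per-work-item loop on the CPU backend. The kernel below is my own illustrative example.

```julia
using KernelAbstractions

@kernel function scale!(A, b)
    # Computed once per workgroup, not once per work-item:
    N = @uniform length(A)
    i = @index(Global, Linear)
    if i <= N
        A[i] *= b
    end
end

A = ones(32)
scale!(CPU(), 16)(A, 2.0; ndrange=length(A))
KernelAbstractions.synchronize(CPU())
```

Note that the value under `@uniform` must not depend on `@index`, which is why the two are incompatible by definition.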

Yeah, the CPU lowering is a bit tricky and doesn't produce the best error messages.

@ManuelCostanzo please make it easier to help you by formatting your post. I am unsure what you want to achieve. By definition, `@uniform` and `@index` are incompatible.

RGF is not supported on the GPU, and I don't see a way of supporting it.

Unexplored territory and I don't know if GPUCompiler can handle them yet.