Valentin Churavy comments

Results: 1415 comments of Valentin Churavy

Add query function for if we are executing on a GPU

Hm yeah, I was thinking `current_backend()` but Simon wanted the ability to not use it on the top-level. We could pre-empt some other work I am thinking about and expose...

StackOverflow for `get_device` on `ROCArray`

We should also finish #320

Possible error due to KernelAbstractions while using VSCode debugger

I am unsure what KernelAbstractions could do here. This sounds like a fault in the debugger.

Simple Block Reduce Fails when using `while` loops

Hm I need to think through the semantics of while loops on the CPU... #262 You can use `@macroexpand` to debug the lowering of the kernel. You should see two...

Simple Block Reduce Fails when using `while` loops

So one thing to think through is what a `while` loop with a `@synchronize` inside should look like.

Simple Block Reduce Fails when using `while` loops

Yeah, the `@synchronize` makes while loops hard...

```
s = MVector(Int, length(wkgrp))
mask = map(s->s>0, s)
while any(mask)
    for tid in wkgrp
        mask[tid] || continue
        if tid < s[tid]
            cache[ti]...
```

Simple Block Reduce Fails when using `while` loops

We could solve `break` by introducing a mask... I like this direction, but it is something that the current architecture doesn't easily support.
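To make the mask idea concrete, here is a minimal sketch in plain Julia of how a `while` loop containing a barrier might be emulated on the CPU. It is not KernelAbstractions' actual lowering: the function name `masked_block_reduce`, the per-work-item copy of the loop variable, and the two-phase split at the barrier are all assumptions made for illustration.

```julia
# Hypothetical CPU emulation of a block reduce whose `while` loop contains a barrier.
# Every work-item keeps its own copy of the loop variable `s`, and `mask` records
# which work-items are still inside the loop.
function masked_block_reduce(values::Vector{Int})
    n = length(values)
    @assert ispow2(n) "this sketch assumes a power-of-two workgroup size"
    cache = copy(values)          # stands in for the shared-memory buffer
    s = fill(n ÷ 2, n)            # per-work-item loop variable
    mask = map(>(0), s)           # true while a work-item's loop condition holds

    while any(mask)
        # Phase 1: loop body up to the barrier, run for every active work-item.
        for tid in 1:n
            mask[tid] || continue
            if tid <= s[tid]
                cache[tid] += cache[tid + s[tid]]
            end
        end
        # Finishing the loop above plays the role of the barrier.
        # Phase 2: code after the barrier, then re-evaluate the loop condition.
        for tid in 1:n
            mask[tid] || continue
            s[tid] ÷= 2
            mask[tid] = s[tid] > 0
        end
    end
    return cache[1]
end
```

Under these assumptions, `masked_block_reduce(collect(1:8))` returns 36, the sum of the inputs. A `break` inside the body could be modeled the same way, by clearing `mask[tid]` for the work-item that leaves the loop.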

How do I detect what GPU is installed on a host?

I often say: The choice is up to the user. Experience has shown that having GPU backends as dependencies can cause issues, when one backend is quicker to update than...

Index type

@luraess also mentioned that it would make sense to configure the hardware dimension index into the Kernel struct.

Index type

The maximum linear index with `UInt32` is 4,294,967,295, so an array of about 4 GB at one byte per element. With GPUs having upwards of 40 GB of memory in the data center, it's not unlikely...
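The arithmetic behind that limit can be checked with a few lines of Base Julia. This is only a sketch: the helper `array_bytes` is a hypothetical name introduced here, and the element types are arbitrary examples.

```julia
# Largest linear index representable with each candidate index type.
max_u32 = Int64(typemax(UInt32))
max_i32 = Int64(typemax(Int32))

# Bytes occupied by a dense array of `n` elements of type `T` (hypothetical helper).
array_bytes(::Type{T}, n::Integer) where {T} = Int64(n) * sizeof(T)

# Size of the largest array addressable with a UInt32 linear index.
for T in (UInt8, Float32, Float64)
    gib = array_bytes(T, max_u32) / 2^30
    println(rpad(string(T), 8), round(gib; digits = 1), " GiB")
end
```

With one-byte elements the largest addressable array is just under 4 GiB. Wider element types raise the byte size in proportion, but the cap on the element count stays the same.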