Valentin Churavy
I suspect that, as the test says, it will still be broken on 1.0-1.2
bors try
@tkf does this jibe with what you need for Loops/Executor?
Maybe instead of ComputingDevice we call it ComputeUnit? We are moving towards heterogeneous systems in general, and they might not be separate devices
Eventually :)
Yes, shared memory and scratch memory are both abstractions that currently rely on the GPU hardware to do the right thing. Shared memory is shared across a block and scratch...
> device array allocation

Do you mean device side allocation or the allocation of device memory on the host?

> multiple streams
> synchronizing streams

The launch function in #20...