
Zero-out Kernel

Open kab163 opened this issue 1 year ago • 2 comments

Is your feature request related to a problem? Please describe.

We need a fast way to zero out a large array of GPU (device) memory.

Describe the solution you'd like

We want to allocate a large array and zero it out in one step. Umpire could provide a built-in function for this, something like `malloc_zero_out_kernel(nbytes);`, so that users don't have to write their own kernel to zero the memory.
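To make the request concrete, here is a rough host-only sketch of the shape such a helper could take. Everything in it is hypothetical: `allocate_zeroed` and `HostAllocator` are illustration names, not existing Umpire functions, and `std::memset` stands in for the device-side zeroing kernel this issue asks for.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in allocator (host memory only), used so the sketch
// runs without a GPU. Any type with allocate/deallocate would work here.
struct HostAllocator {
  void* allocate(std::size_t nbytes) { return std::malloc(nbytes); }
  void deallocate(void* ptr) { std::free(ptr); }
};

// Hypothetical helper: allocate nbytes and zero them in a single call.
// On a device allocator, the memset below would be replaced by a kernel
// launch (or device memset) that writes zeros.
template <typename Allocator>
void* allocate_zeroed(Allocator& alloc, std::size_t nbytes) {
  void* ptr = alloc.allocate(nbytes);
  if (ptr != nullptr) {
    std::memset(ptr, 0, nbytes);
  }
  return ptr;
}
```

The point is only that the caller makes one call and gets back memory that is already zeroed; how the zeroing is done underneath is up to the implementation.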

Describe alternatives you've considered

Using the resource manager to do a memset takes too long. Allocating an array and then launching our own kernel to zero out the memory would work, but it adds extra code for every user who needs it.

Additional context

See the Teams conversation here.

kab163, Jan 18 '24 17:01

Another idea: instead of supporting only zero, allow a range of memory to be set to any value (or at least -1 and NaN, and maybe others).

kab163, Jan 19 '24 00:01