CHAI
Optionally allow allocations to be zeroed
We would like to be able to specify that an allocation be zeroed upon allocation, similar to calloc instead of malloc, but with all the host/device and pool usage that Umpire and CHAI provide. Since allocations can happen lazily, and on either the device or the host, (I think) CHAI has the best information about where the allocation is happening when it happens, so the zeroing can be done most efficiently without extra memory transfers.
@rchen20, @ajkunen, Verinder Rana, and John Loffeld also know the details. What other information is needed to flesh this request out?
Here's a comment from @rchen20 in our internal issue:
I don't think there are any plans to add zeroing of memory to RCU, other than what is being used in this MR (Umpire memset). Looking at the functor implementation of kConst, I would guess it would be faster than umpire::memset. I can bring this up in the next RAJA or Umpire meeting; maybe we can propose adding the Armus::kConst implementation to Umpire?
And here's a follow-up from @ajkunen:
I think this functionality should go in CHAI:
- RAJA doesn't do any memory management
- One could make an Umpire allocator that zeros new allocations... but you really just want the first allocation to be zeroed, not both the host and device allocations.
- CHAI is the only thing that knows that host and device allocations are associated. So, CHAI would be the right thing to trigger the zeroing... but I think Umpire should do the zeroing (like @rchen20 suggests)... so CHAI should tell Umpire to zero the first allocation for a ManagedArray, on whatever device that happens to be.
We could get this functionality now, by wiring this logic into our existing CHAI callback function... but it might be cleaner to just let the CHAI team implement it.
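To make the idea in the comment above concrete, here is a minimal, self-contained C++ sketch of "zero only the first allocation of a host/device pair". It is not CHAI or Umpire code: `ToyManagedArray`, `Space`, `data()`, `zero_fills()`, and `copies()` are hypothetical names invented for illustration, both "spaces" are plain host memory obtained from `std::malloc`, and `std::memset` stands in for whatever zeroing routine Umpire would provide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical execution spaces. In this toy both are backed by host memory.
enum class Space { Host = 0, Device = 1 };

// Hypothetical stand-in for a lazily allocated host/device array.
// This is NOT chai::ManagedArray; it only models the zeroing policy.
class ToyManagedArray {
public:
  ToyManagedArray(std::size_t count, bool zero_on_first_alloc)
      : m_count(count), m_zero(zero_on_first_alloc) {
    assert(count > 0);
  }

  ~ToyManagedArray() {
    std::free(m_ptr[0]);
    std::free(m_ptr[1]);
  }

  ToyManagedArray(const ToyManagedArray&) = delete;
  ToyManagedArray& operator=(const ToyManagedArray&) = delete;

  // Return a pointer usable in `space`, allocating lazily on first touch.
  double* data(Space space) {
    const int s = static_cast<int>(space);
    const int other = 1 - s;
    const std::size_t bytes = m_count * sizeof(double);

    if (m_ptr[s] == nullptr) {
      m_ptr[s] = static_cast<double*>(std::malloc(bytes));
      assert(m_ptr[s] != nullptr);
      // Zero only if the paired space has never been allocated, i.e. this
      // is the first allocation for the array. All-zero bytes represent
      // 0.0 for IEEE 754 doubles.
      if (m_zero && m_ptr[other] == nullptr) {
        std::memset(m_ptr[s], 0, bytes);
        ++m_zero_fills;
      }
    }

    // If the data was last touched in the other space, bring it over.
    // The second allocation receives the zeros through this copy, so it
    // never needs a zero fill of its own.
    if (m_last == other) {
      std::memcpy(m_ptr[s], m_ptr[other], bytes);
      ++m_copies;
    }
    m_last = s;
    return m_ptr[s];
  }

  int zero_fills() const { return m_zero_fills; }
  int copies() const { return m_copies; }

private:
  std::size_t m_count;
  bool m_zero;
  double* m_ptr[2] = {nullptr, nullptr};
  int m_last = -1;       // space that last touched the data, -1 if none
  int m_zero_fills = 0;  // how many zero fills were performed
  int m_copies = 0;      // how many cross-space copies were performed
};
```

In this sketch, an array first touched through `data(Space::Device)` is zeroed once in the "device" buffer, and a later `data(Space::Host)` call only copies, which is the behavior the comment asks for. An allocator-level approach that zeroed every new allocation would fill both buffers instead.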
@brunner6, is this still something you are interested in, and if so, can you provide some motivation for why you want to do this?
We have a pattern in our code where we do an allocation and then need to remember to zero the array. We've had several bugs where we forgot to do that. A feature where the zeroing happens on the device where the memory was allocated would have avoided them. (Contrast this with std::vector, say: even if you tricked out the allocator, you'd still zero-initialize on the host and then transfer the memory to the GPU.)
Clearly this is pretty low priority for us, since we're not banging at your door...