std-indices: Capture in offload-friendly way

Open illuhad opened this issue 1 year ago • 2 comments

Previously, std-indices captured by reference. In an offload scenario, capture by value is generally preferred: if the reference points to stack memory on the host, offloaded kernels will encounter illegal memory accesses. Compilers generally cannot remedy this with magic transformations that make data GPU-accessible, because those only work for heap memory.

The current code works in an offload scenario only because the stream class itself is allocated on the heap (which compilers can then make GPU-accessible), so the kernels can reach the std::vector objects that hold the data.

This relies on an implementation detail and may be brittle if the architecture ever changes, or if someone wishes to reuse BabelStream code in a different context.

This PR therefore attempts to make things more robust by directly capturing data pointers by value.
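To illustrate the difference, here is a minimal sketch of the two capture styles. `TriadSketch` and its members are hypothetical stand-ins, not the actual BabelStream code, and the execution policy is left out so the snippet builds without a parallel backend; a real offload build would pass something like `std::execution::par_unseq` as the first argument to `std::for_each`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a stream class; names are illustrative only.
struct TriadSketch {
  std::vector<double> a, b, c;
  std::vector<std::size_t> indices;  // 0, 1, ..., n-1

  explicit TriadSketch(std::size_t n)
      : a(n, 0.0), b(n, 1.0), c(n, 2.0), indices(n) {
    std::iota(indices.begin(), indices.end(), std::size_t{0});
  }

  // Fragile for offload: [&] captures `this` and `scalar` by reference.
  // The kernel must dereference `this` to find the std::vector objects,
  // and `scalar` lives on the host stack.
  void triad_by_reference(double scalar) {
    std::for_each(indices.begin(), indices.end(),
                  [&](std::size_t i) { a[i] = b[i] + scalar * c[i]; });
  }

  // More robust: the lambda holds copies of the raw data pointers and of
  // `scalar`, so it never touches `this` or the std::vector objects.
  void triad_by_value(double scalar) {
    assert(b.size() == a.size() && c.size() == a.size());
    double *pa = a.data();
    const double *pb = b.data();
    const double *pc = c.data();
    std::for_each(indices.begin(), indices.end(),
                  [pa, pb, pc, scalar](std::size_t i) {
                    pa[i] = pb[i] + scalar * pc[i];
                  });
  }
};
```

On the host both variants compute the same result. The difference only matters once the lambda body runs on a device that cannot see the host stack or the enclosing object.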

On Intel iGPU, I see no substantial performance difference between the two versions in an offload scenario.

illuhad avatar Jul 06 '23 21:07 illuhad

@illuhad Can you check if this is resolved in develop?

tom91136 avatar Sep 25 '23 00:09 tom91136

@tom91136 I think this is indeed resolved in develop. Now a, b, and c are pointers and are captured by value, which avoids capturing the std::vector object by reference and also avoids capturing a copy of the std::vector object itself.

gonzalobg avatar May 26 '24 16:05 gonzalobg