v4.1.x: ompi/coll/cuda: implement reduce_local

Open Akshay-Venkatesh opened this issue 6 months ago • 2 comments

The coll/cuda component is missing a reduce_local implementation, which causes failures in IMB. This implementation piggybacks on the existing CUDA reduce implementation to stage and unstage the send and receive buffers.
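The stage/unstage idea can be sketched roughly as follows. This is a minimal host-only illustration, not the code from this change: `fake_device_mem`, `is_device_ptr`, `stage_to_host`, `staged_reduce_local`, and `sum_int` are all hypothetical names, device memory is simulated with a static array, and plain `memcpy` stands in for the device-to-host and host-to-device copies a real component would issue. The only semantics taken from MPI are those of `MPI_Reduce_local`, where the result of combining `inbuf` with `inoutbuf` is left in `inoutbuf`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* ASSUMPTION: simulated "device" memory so the sketch runs without a GPU. */
unsigned char fake_device_mem[1024];

/* Hypothetical check; a real component would query the CUDA runtime. */
static int is_device_ptr(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)fake_device_mem;
    return a >= lo && a < lo + sizeof fake_device_mem;
}

/* Host-side reduction callback: inout[i] = in[i] op inout[i]. */
typedef int (*reduce_local_fn)(const void *in, void *inout,
                               int count, size_t elem_size);

/* Copy a buffer into a fresh host allocation (memcpy stands in for a
 * device-to-host copy). Returns NULL on allocation failure. */
static void *stage_to_host(const void *src, size_t bytes)
{
    void *host = malloc(bytes);
    if (host != NULL) {
        memcpy(host, src, bytes);
    }
    return host;
}

/* Stage device buffers to the host, run the host reduction, then unstage
 * the result back into the caller's inout buffer. Returns 0 on success. */
int staged_reduce_local(const void *inbuf, void *inoutbuf, int count,
                        size_t elem_size, reduce_local_fn host_reduce)
{
    void *in_tmp = NULL, *inout_tmp = NULL;
    size_t bytes;
    int rc = -1;

    assert(count >= 0 && host_reduce != NULL);
    if (count == 0) {
        return 0;                       /* nothing to reduce */
    }
    bytes = (size_t)count * elem_size;

    if (is_device_ptr(inbuf)) {
        in_tmp = stage_to_host(inbuf, bytes);
        if (in_tmp == NULL) goto out;
    }
    if (is_device_ptr(inoutbuf)) {
        inout_tmp = stage_to_host(inoutbuf, bytes);
        if (inout_tmp == NULL) goto out;
    }

    rc = host_reduce(in_tmp ? in_tmp : inbuf,
                     inout_tmp ? inout_tmp : inoutbuf, count, elem_size);

    /* Unstage: only the inout buffer carries a result back. */
    if (rc == 0 && inout_tmp != NULL) {
        memcpy(inoutbuf, inout_tmp, bytes);
    }
out:
    free(in_tmp);
    free(inout_tmp);
    return rc;
}

/* Example host reduction: element-wise integer sum. */
int sum_int(const void *in, void *inout, int count, size_t elem_size)
{
    const int *a = in;
    int *b = inout;
    if (elem_size != sizeof(int)) return -1;
    for (int i = 0; i < count; i++) b[i] += a[i];
    return 0;
}
```

Only the inout buffer is copied back after the reduction, since the input buffer is read-only; host buffers pass straight through without any staging.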

bot:notacherrypick

Akshay-Venkatesh • Aug 13 '24 18:08