Add hipstdpar support to BabelStream

Open gsitaram opened this issue 9 months ago • 3 comments

This PR adds support for offloading C++ standard parallel algorithms that use the par_unseq execution policy to AMD GPUs. To trigger GPU offload of all parallel algorithms, the --hipstdpar compilation flag must be provided. For GPU targets other than the current default of gfx906, the --offload-arch=<arch_string> option must also be provided at compile time.
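
For readers unfamiliar with the policy, a triad-style kernel written against par_unseq might look roughly like the sketch below. This is an illustration only, not code from this PR: the function name triad and its parameters are hypothetical, and the macro on the first line is a libstdc++-specific assumption used to keep the host build free of a TBB dependency.

```cpp
// Assumption: libstdc++ only. Selects the serial PSTL backend so this sketch
// links on a plain host compiler without TBB; not needed for a real build.
#define _GLIBCXX_USE_TBB_PAR_BACKEND 0

#include <algorithm>
#include <cassert>
#include <execution>
#include <vector>

// Hypothetical triad-style kernel: a[i] = b[i] + scalar * c[i].
// The par_unseq policy is what makes the call a candidate for offload
// when the translation unit is compiled with --hipstdpar.
void triad(std::vector<double>& a,
           const std::vector<double>& b,
           const std::vector<double>& c,
           double scalar)
{
  // All three arrays must have the same length.
  assert(a.size() == b.size() && b.size() == c.size());

  std::transform(std::execution::par_unseq,
                 b.begin(), b.end(),  // first input range
                 c.begin(),           // second input range
                 a.begin(),           // output range
                 [scalar](double bi, double ci) { return bi + scalar * ci; });
}
```

Built without --hipstdpar, the same source simply runs on the host, so the algorithm call itself does not change between the CPU and GPU builds.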

When using ROCm 6.1.0, the compilation commands may look like the following if compiling for an AMD Instinct MI200 series GPU:

cmake -Bbuild -H. -DMODEL=std-data -DCMAKE_CXX_COMPILER=hipcc -DCLANG_OFFLOAD=gfx90a
cmake --build build

Remember to set the environment variable to enable address translation and page migration (where applicable) when running std-data-stream or std-indices-stream:

export HSA_XNACK=1

gsitaram • May 02 '24 14:05

It's great to see hipstdpar working, so let's work to get this merged in. Thanks for the contributions.

tomdeakin • May 13 '24 17:05

Hi @tomdeakin, @afanfa and I have made the changes requested. Please check and approve if everything looks okay. Thanks!

gsitaram • Jun 20 '24 14:06

Added this PR, with some fixes, to #202.

gonzalobg • Aug 14 '24 07:08