BabelStream
Add hipstdpar support to BabelStream
This PR adds support for offload to AMD GPUs using the par_unseq execution policy in C++ standard parallelism algorithms. To trigger the GPU offload of all parallel algorithms, the --hipstdpar compilation flag must be provided. For GPU targets other than the current default of gfx906, the --offload-arch=<arch_string> option must also be provided at compile time.
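For readers unfamiliar with the pattern, here is a minimal sketch of the kind of call the description refers to: a triad-style update written as a standard algorithm with the par_unseq policy. It is not taken from the BabelStream sources; the function name `triad` and its signature are assumptions made for illustration. Built with an ordinary host compiler it simply runs on the CPU (some standard libraries need a parallel backend, for example libstdc++ with TBB). According to the description above, building with --hipstdpar is what would make such a call a candidate for GPU offload.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <vector>

// Illustrative only: a[i] = b[i] + scalar * c[i] expressed as a standard
// algorithm. The name and signature are hypothetical, not BabelStream's API.
void triad(std::vector<double>& a, const std::vector<double>& b,
           const std::vector<double>& c, double scalar)
{
  // All three arrays must have the same length.
  assert(a.size() == b.size() && b.size() == c.size());

  // par_unseq permits the implementation to parallelise and vectorise the
  // loop; the lambda must therefore be free of data races and locks.
  std::transform(std::execution::par_unseq,
                 b.begin(), b.end(), c.begin(), a.begin(),
                 [scalar](double bi, double ci) { return bi + scalar * ci; });
}
```

The lambda captures the scalar by value and touches only its arguments, so each element can be computed independently, which is what the par_unseq policy requires.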
When using ROCm 6.1.0, the compilation commands may look like the following if compiling for an AMD Instinct MI200 series GPU:
cmake -Bbuild -H. -DMODEL=std-data -DCMAKE_CXX_COMPILER=hipcc -DCLANG_OFFLOAD=gfx90a
cmake --build build
Remember to set the environment variable to enable address translation and page migration (where applicable) when running std-data-stream or std-indices-stream:
export HSA_XNACK=1
It's great to see hipstdpar working, so let's work to get this merged in. Thanks for the contributions.
Hi @tomdeakin, @afanfa and I have made the changes requested. Please check and approve if everything looks okay. Thanks!
Added this PR with some fixes to #202.