DRAFT: Adds support for C++ parallel algorithms

Open jefflarkin opened this issue 2 years ago • 2 comments

This adds support for the C++ standard parallel algorithms. The changes have been tested with NVHPC 21.11 on both multicore CPU and GPU. To build for GPU, add -DSTDPAR_FLAGS='-stdpar=gpu'. To build for multicore CPU, add -DSTDPAR_FLAGS='-stdpar=multicore'. I have not tested with other implementations of the C++ parallel algorithms.

The NVIDIA implementation for GPUs relies on unified virtual memory. Several functions had to be modified to take data structures as arguments or to make local copies of global scalars so that they work with both multicore and GPU targets. There are likely other (better?) ways to handle this.

Because the C++ cartesian product is not yet available, I had to pack and unpack the loop indices myself using a counting_iterator. When ranges and cartesian_product are fully supported, they will be a nicer solution.
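For anyone reading along, here is a minimal sketch of the pack/unpack idea, not the code from this PR. The names `Index2D`, `unpack`, and `fill_field` are hypothetical, a `std::vector` filled with `std::iota` stands in for the counting iterator, and the execution policy is left out so the sketch builds without a parallel backend. With nvc++ and -stdpar one would presumably pass `std::execution::par_unseq` as the first argument to `std::for_each`. The lambda captures `nx` and the output pointer by value, which mirrors the "local copies of global scalars" point above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper type: the (k, i) pair of a 2D loop nest.
struct Index2D { int k; int i; };

// Unpack a flat index into (k, i) for a row-major nz-by-nx loop nest.
inline Index2D unpack(int flat, int nx) {
    assert(nx > 0);
    return { flat / nx, flat % nx };
}

// Fill a row-major nz*nx field with 10*k + i using one flat loop
// instead of two nested loops.
std::vector<int> fill_field(int nz, int nx) {
    std::vector<int> field(static_cast<std::size_t>(nz) * nx);

    // Stand-in for a counting iterator: the flat indices 0 .. nz*nx-1.
    std::vector<int> flat(field.size());
    std::iota(flat.begin(), flat.end(), 0);

    // Capture by value only: a raw pointer and a local scalar copy.
    int* out = field.data();
    std::for_each(flat.begin(), flat.end(), [out, nx](int idx) {
        const Index2D p = unpack(idx, nx);
        out[p.k * nx + p.i] = 10 * p.k + p.i;
    });
    return field;
}
```

Calling `fill_field(2, 3)` walks the six flat indices once and writes each (k, i) cell exactly once, so the iterations are independent and the same loop body could be handed to a parallel policy.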

Note: I based this on the version from the C directory rather than a version from the C++ directory to avoid a dependency on YAKL.

jefflarkin, Dec 14 '21 00:12

Change CMake lists to use semicolon rather than space.

mrnorman, Dec 15 '21 15:12

Might want to consider https://thrust.github.io/doc/classthrust_1_1zip__iterator.html

mrnorman, Dec 15 '21 16:12