miniWeather
Port MiniWeather to the CUDASTF programming model
This PR introduces a new version of the MiniWeather benchmark based on the CUDASTF programming model.
CUDASTF ships as part of NVIDIA's CCCL project and implements task parallelism as a header-only C++ library.
This example shows how to use CUDA graphs to hide latencies on small problem sizes, and how to scale parallel_for kernels across multiple devices in the same machine (e.g. a DGX platform).
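To make the multi-device idea concrete, here is a minimal sketch in plain C++ of how a one-dimensional parallel_for index space could be cut into contiguous blocks, one per device. It does not use the CUDASTF API, and it is not necessarily how this PR or CUDASTF partitions work: `IndexRange` and `blocked_ranges` are hypothetical names introduced only for this illustration, and a blocked split is assumed as the partitioning strategy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open index range [begin, end) assigned to one device.
// Hypothetical type for illustration; not part of CUDASTF.
struct IndexRange {
    std::size_t begin;
    std::size_t end;
};

// Split [0, n) into num_devices contiguous blocks whose sizes differ
// by at most one element. Hypothetical helper; not part of CUDASTF.
std::vector<IndexRange> blocked_ranges(std::size_t n, std::size_t num_devices) {
    assert(num_devices > 0);

    std::vector<IndexRange> ranges;
    ranges.reserve(num_devices);

    const std::size_t base  = n / num_devices;  // minimum block size
    const std::size_t extra = n % num_devices;  // first `extra` blocks get one more element

    std::size_t begin = 0;
    for (std::size_t d = 0; d < num_devices; ++d) {
        const std::size_t len = base + (d < extra ? 1 : 0);
        ranges.push_back({begin, begin + len});
        begin += len;
    }
    return ranges;
}
```

For instance, `blocked_ranges(10, 4)` yields the ranges [0, 3), [3, 6), [6, 8) and [8, 10). Each device would then run the same kernel body over its own range. Contiguous blocks are assumed here because they keep each device's accesses local to one slab of the grid, which suits a stencil-style code such as MiniWeather.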