2decomp-fft

ability to perform runtime autotuning

Open slaizet opened this issue 2 years ago • 0 comments

Ability to perform runtime autotuning of the process grid dimensions used to partition the global domain, and of the communication backends used for transpose and/or halo communication. This feature enables users to run the library with the best-performing configuration for a given global domain size, number of tasks, and compute cluster topology. The autotuner aims to select the decomposition and communication backend options that minimize transpose and halo communication time.

See https://nvidia.github.io/cuDecomp/autotuning.html
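
To make the request concrete, here is a minimal sketch of what such an autotuner could do: enumerate every 2D process grid that factors the number of tasks, time a trial communication for each grid and backend, and keep the fastest combination. It is not the cuDecomp or 2decomp-fft API. The names `grid_candidates`, `autotune`, `run_trial`, and the backend labels are hypothetical, and `run_trial` stands in for whatever routine would actually perform and time a transpose or halo exchange.

```python
from itertools import product


def grid_candidates(nranks):
    """Return every (p_row, p_col) pair with p_row * p_col == nranks."""
    return [(p, nranks // p) for p in range(1, nranks + 1) if nranks % p == 0]


def autotune(nranks, backends, run_trial, n_trials=3):
    """Pick the (grid, backend) pair with the lowest average trial time.

    run_trial(p_row, p_col, backend) is a hypothetical user-supplied callable
    that performs one transpose and/or halo exchange and returns the elapsed
    time in seconds.
    """
    best = None
    for (p_row, p_col), backend in product(grid_candidates(nranks), backends):
        total = sum(run_trial(p_row, p_col, backend) for _ in range(n_trials))
        avg = total / n_trials
        # Strict comparison keeps the first candidate when timings tie.
        if best is None or avg < best[0]:
            best = (avg, (p_row, p_col), backend)
    return best


def toy_trial(p_row, p_col, backend):
    """Toy cost model (assumption): favors near-square grids and 'backend_a'."""
    return abs(p_row - p_col) + (0.5 if backend == "backend_b" else 0.0)


if __name__ == "__main__":
    print(autotune(12, ["backend_a", "backend_b"], toy_trial))
```

In a real MPI run the trial time would presumably need to be reduced across ranks (for example by taking the maximum) so that every rank selects the same configuration.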

slaizet, Jan 13 '23 09:01