returnn Optimal dim order for convolution - NHWC for some newer GPUs?

Open albertz opened this issue 2 years ago • 1 comments

Nvidia's 2018 blog post "Volta Tensor Core GPU Achieves New AI Performance Milestones" says that NHWC performs best.
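For reference, NHWC and NCHW describe the same 4-D tensor and differ only in how it is laid out in memory. A minimal sketch of the flat offsets, assuming plain row-major storage (the function names are made up for illustration and are not part of RETURNN or TensorFlow):

```python
def offset_nchw(n, c, h, w, C, H, W):
    """Flat row-major offset of element (n, c, h, w) in an NCHW tensor."""
    return ((n * C + c) * H + h) * W + w


def offset_nhwc(n, c, h, w, C, H, W):
    """Flat row-major offset of the same element in an NHWC tensor."""
    return ((n * H + h) * W + w) * C + c


# Distance in memory between two neighbouring channels of the same pixel.
C, H, W = 3, 2, 2
stride_c_nchw = offset_nchw(0, 1, 0, 0, C, H, W) - offset_nchw(0, 0, 0, 0, C, H, W)
stride_c_nhwc = offset_nhwc(0, 1, 0, 0, C, H, W) - offset_nhwc(0, 0, 0, 0, C, H, W)
```

With NHWC, all channels of one pixel are contiguous (stride 1). With NCHW, they are `H * W` elements apart.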

This statement seems to be specific to the Tensor Cores built into Volta GPUs. I assume it also holds for successor architectures such as Ampere, which would include the GeForce 30 series (3080, 3090, etc.). The 20 series (2080, etc.) also has Tensor Cores (second generation), but I'm not sure whether it applies to them as well.

I'm also not sure whether this carries over to TensorFlow, or whether we need anything special on the TensorFlow side, maybe a recent TensorFlow version plus a recent CUDA version.

In any case, the automatic selection of the optimal dim order in ConvLayer, PoolLayer and others, which currently depends only on GPU vs. CPU, should probably be extended to take this into account.
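A rough sketch of what such an extended selection could look like. Everything here is an assumption for illustration: `choose_conv_data_format` is a hypothetical helper, not existing RETURNN API, the CPU default is assumed, and the compute capability threshold of 7.0 (Volta, the first architecture with Tensor Cores) is only a placeholder until benchmarks confirm it.

```python
def choose_conv_data_format(device_type, compute_capability=None):
    """
    Hypothetical helper, not part of RETURNN.

    :param str device_type: "CPU" or "GPU"
    :param tuple[int,int]|None compute_capability: CUDA (major, minor), e.g. (8, 6)
    :return: "NHWC" or "NCHW"
    :rtype: str
    """
    if device_type.upper() != "GPU":
        # Assumption: channels-last on CPU.
        return "NHWC"
    if compute_capability is not None and tuple(compute_capability) >= (7, 0):
        # Assumption: GPUs with Tensor Cores prefer NHWC. Needs benchmarking.
        return "NHWC"
    # Older or unknown GPU: keep channels-first.
    return "NCHW"
```

For orientation, compute capability 7.0 corresponds to Volta, 7.5 to Turing (GeForce 20 series) and 8.6 to the GeForce 30 series. Whether Tensor Cores are used at all may also depend on the data type (e.g. float16 or mixed precision), so the threshold alone is probably not sufficient.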

Jul 08 '22 16:07 albertz

@curufinwe maybe you know some more about this? Or @JackTemaki? Or who else could know more?

Jul 08 '22 16:07 albertz

This needs some research, and also some benchmarking.
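As a starting point for the benchmarking, a minimal timing harness could look like the following. It is only a sketch using the standard library. `benchmark` and `compare_layouts` are hypothetical names, and the two callables passed in are assumed to run the same convolution once per layout.

```python
import statistics
import time


def benchmark(fn, warmup=3, repeats=10):
    """Median wall-clock time in seconds of fn() after some warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up: exclude one-time costs such as kernel autotuning
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return statistics.median(times)


def compare_layouts(run_nhwc, run_nchw, **kwargs):
    """Time both callables and return (timings dict, name of the faster layout)."""
    timings = {
        "NHWC": benchmark(run_nhwc, **kwargs),
        "NCHW": benchmark(run_nchw, **kwargs),
    }
    return timings, min(timings, key=timings.get)
```

With TensorFlow, `run_nhwc` and `run_nchw` could each wrap a `tf.nn.conv2d(..., data_format=...)` call on inputs already stored in the matching layout. The callable should also fetch the result to the host, since GPU execution can be asynchronous and the timing would otherwise miss the actual kernel run time.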

Oct 10 '22 22:10 albertz
