Optimal dim order for convolution - NHWC for some newer GPUs?

Open albertz opened this issue 2 years ago • 1 comments

In the Nvidia blog post "Volta Tensor Core GPU Achieves New AI Performance Milestones" (2018), NHWC is said to perform best.

This statement appears to apply specifically to the Tensor Core architecture built into Volta GPUs. I assume it also holds for successors, i.e. Ampere GPUs, which would include the GeForce 30 series (3080, 3090, etc.). The 20 series (2080 etc.) also has Tensor Cores (second generation), but I'm not sure whether it applies to them as well.

I'm also not sure whether this holds for TensorFlow as well, or whether we need anything special on the TensorFlow side, such as a recent TensorFlow version together with a recent CUDA version.

In any case, the automatic selection of the optimal dim order in ConvLayer, PoolLayer and others, which currently depends only on GPU vs. CPU, should probably be extended to take this into account.
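As a rough illustration of what such an extended selection could look like, here is a minimal sketch. It is not RETURNN code: the function name `pick_conv_data_format` and the constant `TENSOR_CORE_MIN_CC` are hypothetical, and the rule "NHWC on Tensor Core GPUs, NCHW on older GPUs, NHWC on CPU" is an assumption that would still need to be confirmed by benchmarking. The only hard fact used is that Tensor Cores first appeared with Volta, which corresponds to CUDA compute capability 7.0.

```python
from typing import Optional, Tuple

# Tensor Cores first appeared with Volta, i.e. CUDA compute capability 7.0.
# Turing (7.5) and Ampere (8.x) compare as greater than this tuple.
TENSOR_CORE_MIN_CC = (7, 0)


def pick_conv_data_format(
    on_gpu: bool,
    compute_capability: Optional[Tuple[int, int]] = None,
) -> str:
    """
    Hypothetical helper: choose a dim order for conv/pool layers.

    :param on_gpu: whether the layer will run on a GPU
    :param compute_capability: (major, minor) of the GPU, or None if unknown
    :return: "NHWC" or "NCHW"
    """
    if not on_gpu:
        # Assumed CPU default: channels last.
        return "NHWC"
    if compute_capability is not None and compute_capability >= TENSOR_CORE_MIN_CC:
        # Assumption to be benchmarked: Tensor Core GPUs prefer NHWC.
        return "NHWC"
    # Older or unknown GPU: keep the traditional channels-first GPU default.
    return "NCHW"


if __name__ == "__main__":
    for name, cc in [("Pascal", (6, 1)), ("Volta", (7, 0)), ("Turing", (7, 5)), ("Ampere", (8, 6))]:
        print(name, cc, pick_conv_data_format(on_gpu=True, compute_capability=cc))
```

In recent TensorFlow versions, the compute capability should be available via `tf.config.experimental.get_device_details(gpu)["compute_capability"]`; whether that is usable at the point where the layers pick their dim order is an open question here.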

albertz avatar Jul 08 '22 16:07 albertz

@curufinwe maybe you know some more about this? Or @JackTemaki? Or who else could know more?

albertz avatar Jul 08 '22 16:07 albertz

Some research should be done on this, and also some benchmarking.

albertz avatar Oct 10 '22 22:10 albertz