Conv3DTranspose with strides leads to wrong output dimensions if data format is channels_first

Open fthielke opened this issue 4 years ago • 2 comments

Describe the bug When converting a model containing Conv3DTranspose with strides > 1 and data_format='channels_first', the output of the resulting ONNX model has the wrong shape (it seems to be off by one).

Urgency Not very high.

Can be easily worked around by using data format channels_last and adding transpose operations, which the optimizer removes anyway. Adding the workaround each time is annoying, though.
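To make the workaround concrete, here is a minimal shape-level sketch in plain Python. It assumes a 5-D input laid out as (N, C, D, H, W) and 'same' padding, so that each spatial dimension is multiplied by its stride. The helper names (`permute`, `conv3d_transpose_same_shape`, `workaround_shape`) are hypothetical and only track shapes; in a real model the two permutations would be transpose operations placed around a channels_last Conv3DTranspose.

```python
# Axis permutations for the channels_last workaround on 5-D tensors.
TO_CHANNELS_LAST = (0, 2, 3, 4, 1)   # NCDHW -> NDHWC, applied before the layer
TO_CHANNELS_FIRST = (0, 4, 1, 2, 3)  # NDHWC -> NCDHW, applied after the layer


def permute(shape, perm):
    """Shape after transposing axes by `perm` (same convention as numpy.transpose)."""
    return tuple(shape[axis] for axis in perm)


def conv3d_transpose_same_shape(shape_ndhwc, filters, strides):
    """Hypothetical helper: output shape of a channels_last transposed conv.

    Assumes 'same' padding, so each spatial dim is multiplied by its stride.
    """
    n, d, h, w, _ = shape_ndhwc
    sd, sh, sw = strides
    return (n, d * sd, h * sh, w * sw, filters)


def workaround_shape(shape_ncdhw, filters, strides):
    """Track the shape through transpose -> channels_last layer -> transpose."""
    x = permute(shape_ncdhw, TO_CHANNELS_LAST)
    x = conv3d_transpose_same_shape(x, filters, strides)
    return permute(x, TO_CHANNELS_FIRST)


print(workaround_shape((1, 4, 8, 8, 8), filters=2, strides=(2, 2, 2)))
# (1, 2, 16, 16, 16)
```

The two permutations are inverses of each other, so the wrapped layer keeps a channels_first interface while the convolution itself runs in channels_last.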

System information

  • OS Platform and Distribution: Windows 10
  • Tensorflow Version: 2.6.0
  • Python version: 3.9.6

To Reproduce The attached Jupyter notebook test_convtranspose.ipynb.gz creates a simple model containing only a Conv3DTranspose with kernel size (3,3,3) and strides (2,2,2), using either data_format='channels_first' or 'channels_last'.

For the model using 'channels_last', the converted ONNX model correctly doubles its input shape. The other model, however, does not: for an input of size (8,8,8), the output size is (16,16,17).
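The numbers above can be checked against the output-size formula from the ONNX ConvTranspose operator specification. The sketch below is plain Python with a hypothetical helper name (`conv_transpose_out_dim`). It does not show where the converter goes wrong; it only illustrates that, for kernel size 3 and stride 2, a dimension of 17 is what the formula yields when no padding is subtracted on that axis, while a total padding of 1 gives the expected 16.

```python
def conv_transpose_out_dim(in_dim, kernel, stride,
                           pad_begin=0, pad_end=0,
                           output_padding=0, dilation=1):
    """Output size along one axis, per the ONNX ConvTranspose formula."""
    effective_kernel = (kernel - 1) * dilation + 1
    return (stride * (in_dim - 1) + output_padding
            + effective_kernel - pad_begin - pad_end)


# No padding removed on the axis: matches the unexpected last dimension.
print(conv_transpose_out_dim(8, kernel=3, stride=2))             # 17

# One element of total padding: the input size is doubled as expected.
print(conv_transpose_out_dim(8, kernel=3, stride=2, pad_end=1))  # 16
```

How the single padded element is split between `pad_begin` and `pad_end` is an assumption made here for illustration; only the total affects the output size.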

fthielke avatar Sep 14 '21 11:09 fthielke

Hi @fthielke,

Our unit tests don't always cover the channels_first case, since I think TF won't run it on CPU and our CI doesn't have a GPU, so it is quite likely that we have a bug. It would be fantastic if you were able to help track down where in the code the issue occurs (I'd recommend stepping through conversion with a Python debugger). Hopefully it is a simple fix.

TomWildenhain-Microsoft avatar Sep 22 '21 19:09 TomWildenhain-Microsoft

The debugger sadly was not too helpful for finding the bug, but I could easily spot it by comparing the resulting models with and without the workaround in Netron.

The fix is indeed quite simple, I've opened a PR: https://github.com/onnx/tensorflow-onnx/pull/1748

fthielke avatar Oct 19 '21 16:10 fthielke