PyTorch->MMIR: regex unable to extract output shape from a constant value

Open nickfraser opened this issue 3 years ago • 4 comments

Platform (like ubuntu 16.04/win10): Ubuntu 16.04

Python version: 3.6.8

Source framework with version (like Tensorflow 1.4.1 with GPU): PyTorch 1.5.1

Destination framework with version (like CNTK 2.3 with GPU): MMIR

Pre-trained model path (webpath or webdisk path): Nvidia's ResNet-18 example

Running scripts: PYTHONPATH=`readlink -f .` mmtoir -f pytorch -in resnet18/model_best_only.pth -o resnet18-mmir/ --inputShape 3,224,224

Note: the current working directory is the path given in the model path link. Also, 'model_best_only.pth' was re-saved to contain the entire model rather than only the state dict, as specified in the PyTorch instructions. I'm using the current master branch of MMdnn.

When running this script, the original output is:

Traceback (most recent call last):
  File "/opt/conda/bin/mmtoir", line 10, in <module>
    sys.exit(_main())
  File "/opt/conda/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 197, in _main
    ret = _convert(args)
  File "/opt/conda/lib/python3.6/site-packages/mmdnn/conversion/_script/convertToIR.py", line 97, in _convert
    parser = PytorchParser151(model, inputshape[0])
  File "/opt/conda/lib/python3.6/site-packages/mmdnn/conversion/pytorch/pytorch_parser.py", line 528, in __init__
    self.build_graph(input_shape)
  File "/opt/conda/lib/python3.6/site-packages/mmdnn/conversion/pytorch/pytorch_parser.py", line 92, in build_graph
    self.pytorch_graph.build(self.input_shape)
  File "/opt/conda/lib/python3.6/site-packages/mmdnn/conversion/pytorch/pytorch_graph.py", line 140, in build
    output_shape = [int(x.replace('!', '')) for x in output_shape_str.split(',')]
  File "/opt/conda/lib/python3.6/site-packages/mmdnn/conversion/pytorch/pytorch_graph.py", line 140, in <listcomp>
    output_shape = [int(x.replace('!', '')) for x in output_shape_str.split(',')]
ValueError: invalid literal for int() with base 10: ''

After some investigation, I traced the issue back to this regex, which returns the string ', scope: ResNet' when the input string is '%191 : Tensor = onnx::Constant[value={0}](), scope: ResNet'.

All other layers appear to be parsed correctly. Printing both node.__str__() and output_shape_str reveals the following:

%123 : Float(1, 64, 112, 112) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[7, 7], pads=[3, 3, 3, 3], strides=[2, 2]](%input.1, %1), scope: ResNet/Conv2d[conv1] # /opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py:350:0
1, 64, 112, 112

.....
More layers omitted for brevity
.....

%189 : Float(1, 512, 1, 1) = onnx::GlobalAveragePool(%188), scope: ResNet/AdaptiveAvgPool2d[avgpool] # /opt/conda/lib/python3.6/site-packages/torch/nn/functional.py:889:0
1, 512, 1, 1

%190 : Tensor = onnx::Shape(%189), scope: ResNet
%189

%191 : Tensor = onnx::Constant[value={0}](), scope: ResNet
, scope: ResNet
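
The printouts above can be reproduced in isolation. The sketch below is not taken from MMdnn: the pattern `[^()!]+` and the helper name `extract` are assumptions, chosen only because they yield the same three strings shown in this issue. The node strings are shortened versions of the ones printed above. The last step repeats the list comprehension from the traceback and captures the resulting error message.

```python
import re

# ASSUMPTION: this pattern is a guess that reproduces the strings printed in
# this issue; it is not confirmed to be the pattern MMdnn uses.
PATTERN = re.compile(r'[^()!]+')


def extract(node_str):
    """Hypothetical helper: return the second run of characters that
    contains no parentheses and no '!'."""
    return PATTERN.findall(node_str)[1]


# Shortened node strings modelled on the ones printed in this issue.
conv = ('%123 : Float(1, 64, 112, 112) = onnx::Conv[group=1](%input.1, %1), '
        'scope: ResNet/Conv2d[conv1]')
shape = '%190 : Tensor = onnx::Shape(%189), scope: ResNet'
const = '%191 : Tensor = onnx::Constant[value={0}](), scope: ResNet'

conv_str = extract(conv)    # the text inside Float(...)
shape_str = extract(shape)  # the argument list of onnx::Shape
const_str = extract(const)  # "()" is empty, so the trailing scope text is picked up

# Same conversion as in the traceback: the text before the first comma is
# empty, so int() is called on ''.
try:
    [int(x.replace('!', '')) for x in const_str.split(',')]
    error = None
except ValueError as exc:
    error = str(exc)

print(repr(conv_str), repr(shape_str), repr(const_str))
print(error)
```

For the constant node there is no parenthesised shape at all, so any approach that takes "the second chunk" ends up with text unrelated to a shape.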

What output shape should there be for this constant tensor? Should it be '0'?

nickfraser, Aug 05 '20 11:08

Note, I have a workaround which appears to work, replacing view with flatten here: https://github.com/NVIDIA/DeepLearningExamples/blob/4e00153ab5c4963c0505c6cebfc07730a42ef7dc/PyTorch/Classification/RN50v1.5/image_classification/resnet.py#L201

However, the original issue remains.

nickfraser, Aug 05 '20 11:08

Hi @nickfraser, thanks for your feedback. It seems torch cannot infer the output shape in some special cases. We will look into this issue.

XiaoXYe, Aug 06 '20 10:08

@XiaoXYe, do you have any updates on this issue?

rmzr7, Aug 20 '20 16:08

I have committed some code. If PyTorch cannot infer the output shape, MMdnn will raise a warning and set the output shape to None in the IR JSON file. You can set the output shape manually in the JSON file, but the output shape is not actually necessary in most cases.
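
The behaviour described here (warn and record no shape) could look roughly like the sketch below. It is not the committed MMdnn code: the function name `parse_output_shape` and the pattern `_SHAPE_RE` are hypothetical. The sketch assumes a shape is only present when the node string has a type annotation of the form `Float(1, 64, 112, 112)` directly before the `=` sign.

```python
import re
import warnings
from typing import List, Optional

# Hypothetical pattern: a type annotation such as ": Float(1, 64, 112, 112) ="
# at the start of a traced node string.
_SHAPE_RE = re.compile(r':\s*\w+\(([^()]*)\)\s*=')


def parse_output_shape(node_str: str) -> Optional[List[int]]:
    """Return the shape as a list of ints, or None if it cannot be read."""
    match = _SHAPE_RE.search(node_str)
    if match is None:
        # No "Type(dims) =" annotation, e.g. "%191 : Tensor = onnx::Constant...".
        warnings.warn('could not infer output shape from: ' + node_str)
        return None
    # Drop the '!' markers, as the original list comprehension does.
    dims = [d.strip().replace('!', '') for d in match.group(1).split(',')]
    try:
        return [int(d) for d in dims]
    except ValueError:
        # The annotation contains something other than plain integers.
        warnings.warn('could not parse output shape from: ' + node_str)
        return None


print(parse_output_shape(
    '%189 : Float(1, 512, 1, 1) = onnx::GlobalAveragePool(%188), scope: ResNet'))
print(parse_output_shape(
    '%191 : Tensor = onnx::Constant[value={0}](), scope: ResNet'))
```

Returning None instead of raising lets the conversion continue for nodes such as `onnx::Shape` and `onnx::Constant`, whose traced type is a bare `Tensor` with no dimensions.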

XiaoXYe, Aug 21 '20 08:08