ResNet101, OpenImages 5000 classes: Tensorflow -> ONNX

Open caponetto opened this issue 7 years ago • 1 comments

Platform (like ubuntu 16.04/win10): win10

Python version: 3.6.8

Source framework with version (like Tensorflow 1.4.1 with GPU): Tensorflow 1.13.0rc1 CPU

Destination framework with version (like CNTK 2.3 with GPU): ONNX

Pre-trained model path (webpath or webdisk path): https://storage.googleapis.com/openimages/2017_07/oidv2-resnet_v1_101.ckpt.tar.gz

Running scripts: mmconvert --srcFramework tensorflow -in oidv2-resnet_v1_101.ckpt.meta --weights oidv2-resnet_v1_101.ckpt --dstFramework onnx -om resnet_v1_101.onnx --inNodeName map/TensorArrayStack/TensorArrayGatherV3 --dstNodeName resnet_v1_101/logits/BiasAdd --inputShape 299,299,3

I'm trying to convert the TensorFlow checkpoint mentioned above to ONNX. I followed the steps described by @JiahaoYao in issue #206 and managed to generate the .onnx file. I inspected the file with Netron and everything looks fine: the input is [1,299,299,3] and the output is [1,1,5000] (I'd rather have [5000], though). However, when I load the model in either CNTK or ONNX, the output of the loaded model is [5000,1,10], which results in invalid predictions. Can someone give it a try? Thanks.

[Update 1] There is a ReduceMean op at the end of the network that takes two axes as parameters, but only one axis is being reduced. Just for testing, I added one ReduceMean op per axis instead, and the conversion worked as expected. Apparently, something is wrong with how this op is handled.
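To illustrate why the one-op-per-axis workaround gives the intended result, here is a minimal NumPy sketch. It does not use MMdnn or ONNX. The tensor shape (a 10x10 spatial map with 5000 channels in NHWC layout) is an assumption chosen to match the sizes reported above, not something read from the actual graph.

```python
import numpy as np

# Hypothetical stand-in for the tensor feeding the final ReduceMean:
# batch 1, 10x10 spatial map, 5000 class channels (NHWC layout assumed).
rng = np.random.default_rng(0)
x = rng.standard_normal((1, 10, 10, 5000)).astype(np.float32)

# Intended behaviour: a single reduction over both spatial axes.
both_axes = x.mean(axis=(1, 2), keepdims=True)

# Workaround from the update: one reduction per axis, chained.
per_axis = x.mean(axis=1, keepdims=True).mean(axis=2, keepdims=True)

# Symptom described above: only one of the two axes gets reduced,
# so a dimension of size 10 survives in the output.
one_axis_only = x.mean(axis=1, keepdims=True)

print(both_axes.shape)      # (1, 1, 1, 5000)
print(per_axis.shape)       # (1, 1, 1, 5000)
print(one_axis_only.shape)  # (1, 1, 10, 5000)

# The chained reductions agree with the single two-axis reduction.
same_values = np.allclose(both_axes, per_axis, atol=1e-5)
print(same_values)
```

Because every slice being averaged has the same size, chaining the two single-axis means gives the same values as the two-axis mean. The leftover dimension of size 10 in the one-axis case is consistent with the stray 10 in the reported [5000,1,10] output.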

[Update 2] The conversion also worked when I changed the opsetid to 8 or 9 in the following line: https://github.com/Microsoft/MMdnn/blob/f62a33a7d6834680537693c7fdc7e90e1b2382ef/mmdnn/conversion/onnx/onnx_emitter.py#L87

Feb 22 '19 18:02 caponetto

Hi @Etty-Cohen, it's been a while since I worked with this model, and I no longer have access to it. From what I remember, I managed to run the conversion successfully using the two approaches that I mentioned in the updates:

[Update 1] There is a ReduceMean op at the end of the network that takes two axes as parameters, but only one axis is being reduced. Just for testing, I added one ReduceMean op per axis instead, and the conversion worked as expected. Apparently, something is wrong with how this op is handled.

[Update 2] The conversion also worked when I changed the opsetid to 8 or 9 in the following line: https://github.com/Microsoft/MMdnn/blob/f62a33a7d6834680537693c7fdc7e90e1b2382ef/mmdnn/conversion/onnx/onnx_emitter.py#L87

I probably ended up using the second approach. Hope this helps.

Jul 13 '22 12:07 caponetto