MMdnn
Can't convert model from TensorFlow to Caffe with tf.layers.batch_normalization
Platform (like ubuntu 16.04/win10):
ubuntu 16.04
Python version:
python 2.7
Source framework with version (like Tensorflow 1.4.1 with GPU):
Tensorflow 1.12
Destination framework with version (like CNTK 2.3 with GPU):
caffe
Pre-trained model path (webpath or webdisk path):
Running scripts:
mmconvert -sf tensorflow -in model-lenet-17000.meta -iw model-lenet-17000 --dstNodeName dense2/outputs -df caffe -om caffe-lenet
Error Info:
TensorflowEmitter has not supported operator [Switch] with name [conv1/batch_normalization/cond/Switch].
TensorflowEmitter has not supported operator [Switch] with name [conv2/batch_normalization/cond/Switch].
TensorflowEmitter has not supported operator [Switch] with name [conv1/batch_normalization/cond/FusedBatchNorm/Switch_1].
TensorflowEmitter has not supported operator [Switch] with name [conv1/batch_normalization/cond/FusedBatchNorm_1/Switch_1].
TensorflowEmitter has not supported operator [Switch] with name [conv1/batch_normalization/cond/FusedBatchNorm/Switch_2].
TensorflowEmitter has not supported operator [Switch] with name [conv1/batch_normalization/cond/FusedBatchNorm_1/Switch_2].
TensorflowEmitter has not supported operator [Switch] with name [conv1/batch_normalization/cond/FusedBatchNorm_1/Switch_3].
TensorflowEmitter has not supported operator [Switch] with name [conv1/batch_normalization/cond/FusedBatchNorm_1/Switch_4].
TensorflowEmitter has not supported operator [Switch] with name [conv2/batch_normalization/cond/FusedBatchNorm/Switch_1].
TensorflowEmitter has not supported operator [Switch] with name [conv2/batch_normalization/cond/FusedBatchNorm_1/Switch_1].
TensorflowEmitter has not supported operator [Switch] with name [conv2/batch_normalization/cond/FusedBatchNorm/Switch_2].
TensorflowEmitter has not supported operator [Switch] with name [conv2/batch_normalization/cond/FusedBatchNorm_1/Switch_2].
TensorflowEmitter has not supported operator [Switch] with name [conv2/batch_normalization/cond/FusedBatchNorm_1/Switch_3].
TensorflowEmitter has not supported operator [Switch] with name [conv2/batch_normalization/cond/FusedBatchNorm_1/Switch_4].
TensorflowEmitter has not supported operator [Switch] with name [conv1/batch_normalization/cond/FusedBatchNorm/Switch].
TensorflowEmitter has not supported operator [Switch] with name [conv1/batch_normalization/cond/FusedBatchNorm_1/Switch].
Traceback (most recent call last):
File "/usr/local/bin/mmconvert", line 11, in <module>
sys.exit(_main())
File "/usr/local/lib/python2.7/dist-packages/mmdnn/conversion/_script/convert.py", line 102, in _main
ret = convertToIR._convert(ir_args)
File "/usr/local/lib/python2.7/dist-packages/mmdnn/conversion/_script/convertToIR.py", line 115, in _convert
parser.run(args.dstPath)
File "/usr/local/lib/python2.7/dist-packages/mmdnn/conversion/common/DataStructure/parser.py", line 22, in run
self.gen_IR()
File "/usr/local/lib/python2.7/dist-packages/mmdnn/conversion/tensorflow/tensorflow_parser.py", line 424, in gen_IR
func(current_node)
File "/usr/local/lib/python2.7/dist-packages/mmdnn/conversion/tensorflow/tensorflow_parser.py", line 800, in rename_FusedBatchNorm
self.set_weight(source_node.name, 'scale', self.ckpt_data[scale.name])
KeyError: u'conv1/batch_normalization/gamma/read'
The conversion succeeds if I restore this model and save it again, but I noticed a difference during the conversion: the re-saved model seems to have dropped some parameters.
Source model conversion:
Parse file [model-lenet-17000.meta] with binary format successfully.
Tensorflow model file [model-lenet-17000.meta] loaded successfully.
Tensorflow checkpoint file [model-lenet-17000] loaded successfully. [36] variables loaded.
Re-saved model conversion:
Parse file [model-lenet.meta] with binary format successfully.
Tensorflow model file [model-lenet.meta] loaded successfully.
Tensorflow checkpoint file [model-lenet] loaded successfully. [14] variables loaded.
22 variables dropped after the re-save.
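For reference, a minimal sketch (separate from the conversion itself) to list the variables in both checkpoints and see exactly which entries the re-save dropped; the prefixes are taken from the logs above:

import tensorflow as tf

# Compare the variable lists of the original and re-saved checkpoints.
for prefix in ['model-lenet-17000', 'model-lenet']:
    reader = tf.train.NewCheckpointReader(prefix)
    var_shapes = reader.get_variable_to_shape_map()
    print('%s: %d variables' % (prefix, len(var_shapes)))
    for name in sorted(var_shapes):
        print('  %s %s' % (name, var_shapes[name]))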
Hi @2h4dl, would you mind uploading the model so I can investigate the problem?
@rainLiuplus https://drive.google.com/file/d/1zizj49H0pdkWmEXshPaXI3qtUd7r3cbG/view?usp=sharing, please check it.
Hi @2h4dl, I think the first attempt failed because of the tf.cond in the batch norm. It might be using tf.contrib.layers.batch_norm.
Similar problems also happen when others try to import the frozen graph, as discussed here.
So, what is the difference when you re-save the model? How big is it? I think it is probably because of the batch norm.
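For what it's worth, here is a minimal sketch of when those Switch nodes appear: tf.layers.batch_normalization only builds a tf.cond when training is a tensor rather than a plain Python bool.

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 8])

# Python bool: a single FusedBatchNorm node, no control flow.
bn_plain = tf.layers.batch_normalization(x, training=False)

# Tensor flag: both branches are wrapped in tf.cond, which emits the
# Switch nodes that the TensorflowEmitter rejects.
is_training = tf.placeholder_with_default(False, [], name='istrain')
bn_cond = tf.layers.batch_normalization(x, training=is_training)

print([op.name for op in tf.get_default_graph().get_operations()
       if op.type == 'Switch'])  # non-empty only because of bn_cond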
Hi @JiahaoYao. Do you mean tf.layers.batch_normalization is the same as tf.contrib.layers.batch_norm? After re-saving the model, some variables are lost and the model is smaller than before. How should I use batch norm in TensorFlow to avoid this?
Hi @2h4dl, if you re-save the model and the tf.cond's are eliminated, it is possible for your model to be smaller because of the parameters inside the tf.cond. We have met this kind of issue before, as mentioned here. I think simply using tf.layers.batch_normalization is safe.
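One way to do that re-save, sketched below under the assumption of a hypothetical build_lenet function that reconstructs the same network: rebuild the graph with a plain Python training=False (so no tf.cond is created), restore only the model variables from the training checkpoint, and save them again. The optimizer slot variables in the old checkpoint are not part of the new graph, which would explain the 22 dropped variables.

import tensorflow as tf

# Hypothetical: build_lenet is the model-construction code from training;
# the input shape is assumed for a LeNet-style MNIST model.
x_input = tf.placeholder(tf.float32, [None, 28, 28, 1], name='input')
logits = build_lenet(x_input, istrain=False)  # plain bool, no tf.cond

# The new graph holds only the model variables; restoring ignores the
# extra optimizer variables stored in the training checkpoint.
saver = tf.train.Saver(tf.global_variables())
with tf.Session() as sess:
    saver.restore(sess, 'model-lenet-17000')
    saver.save(sess, 'model-lenet')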
Hi @JiahaoYao. As you mentioned, after re-saving the model, the tf.cond is gone. But I still have a question, sorry about that. In this wiki, it says tf.cond exists in slim, not in tf.layers. But I trained this model with tf.layers.batch_normalization, so is there something wrong with my model code?
Here is my code:

# get_weights, conv2d and activation are helpers defined elsewhere in my model.
weights = get_weights(w_shape, regularizer)
conv = conv2d(x_input, weights, padding)
# istrain is the training flag; when it is a tensor (e.g. a placeholder),
# TF wraps the batch norm in a tf.cond.
norm = tf.layers.batch_normalization(conv, training=istrain)
conv_relu = activation(norm)
Hi @2h4dl, I have also met this problem. How did you fix it in the end?