MMdnn
Converting Caffe Deconvolution layer with "group" parameter to TensorFlow
I am generally able to use the "mmconvert" program to successfully convert my Caffe models to TensorFlow. However, when I have a Caffe "Deconvolution" layer that includes a "group" parameter with value greater than 1, such as:
layer {
  bottom: 'input'
  top: 'output'
  name: 'mylayer'
  type: 'Deconvolution'
  param { lr_mult: 0 decay_mult: 0 }
  convolution_param {
    dilation: 1
    num_output: 4
    group: 4
    kernel_size: 2
    stride: 2
  }
}
then the TensorFlow result does not seem correct. In particular, I get a TensorFlow error message similar to:
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1241, in conv2d_transpose
    filter.get_shape()[2]))
ValueError: output_shape does not match filter's output channels, 4 != 1
which indicates that grouped channels are not being translated correctly.
Platform: Ubuntu 16.04
Python version: 2.7
Source framework with version: Caffe
Destination framework with version: Tensorflow 1.8
Running scripts: mmconvert
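For anyone wondering where the "4 != 1" comes from: as far as I understand, Caffe stores a Deconvolution weight blob as (input channels, num_output / group, kernel_h, kernel_w), while tf.nn.conv2d_transpose expects a filter laid out as [height, width, output_channels, in_channels]. The sketch below only does that shape arithmetic in plain Python. The helper names are made up for illustration, the axis permutation is my assumption about how a converter would rearrange the blob, and the input channel count of 4 is also an assumption, since the layer above does not state it.

```python
def caffe_deconv_blob_shape(in_channels, num_output, group, kernel_size):
    """Shape of a Caffe Deconvolution weight blob (hypothetical helper).

    Caffe keeps only num_output / group output channels per input channel.
    """
    return (in_channels, num_output // group, kernel_size, kernel_size)


def to_tf_transpose_filter_shape(caffe_shape):
    """Permute (in, out_per_group, kH, kW) to (kH, kW, out_per_group, in).

    Assumption: the converter applies a plain axis permutation and does
    nothing special for group > 1.
    """
    c_in, out_per_group, kh, kw = caffe_shape
    return (kh, kw, out_per_group, c_in)


# Values from the layer above; in_channels=4 is assumed.
num_output = 4
blob_shape = caffe_deconv_blob_shape(in_channels=4, num_output=num_output,
                                     group=4, kernel_size=2)
filter_shape = to_tf_transpose_filter_shape(blob_shape)

# conv2d_transpose compares the requested output channels with filter dim 2.
filter_out_channels = filter_shape[2]
shapes_agree = (filter_out_channels == num_output)
```

With group: 4 the filter ends up with a single output channel while output_shape asks for 4, which is consistent with the ValueError. With group: 1 the two values would agree.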
Hi @drkoller, could you please provide your model file?
This problem really exists:
layer {
  name: "Deconvolution1"
  type: "Deconvolution"
  bottom: "Convolution18"
  top: "Deconvolution1"
  param { lr_mult: 0 decay_mult: 0 }
  convolution_param {
    num_output: 192
    bias_term: false
    pad: 1
    kernel_size: 4
    group: 192
    stride: 2
    weight_filler { type: "bilinear" }
  }
}
It fails with the error: output_shape does not match filter's output channels, 192 != 1
I cannot provide my particular model file, due to proprietary concerns. However, it looks like any Caffe model with a grouped Deconvolution layer (i.e. a "group" value greater than one) is not going to convert to TensorFlow properly with the current MMdnn code. @GalacticF provides an additional example.
Grouped Convolution layers do convert properly, though (although not in the best-performing manner). When a grouped Convolution layer is converted to TensorFlow, the MMdnn translation outputs an initial Split operation into the individual groups, then a set of Convolutions on each group, and finally a Concat to merge the results. This functionality is in the TensorFlow emitter code for def _layer_Conv() in mmdnn/conversion/tensorflow/tensorflow_emitter.py. However, there is currently no similar functionality to support grouping in the case of Deconvolution layers.
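The Split, per-group operation, Concat pattern described above for Convolution should carry over to Deconvolution in principle. Below is a NumPy-only sketch of those semantics. It is not MMdnn emitter code: the function names are hypothetical, padding is ignored, and the weights are assumed to be in Caffe's Deconvolution layout (input channels, output channels per group, kH, kW).

```python
import numpy as np


def deconv2d(x, w, stride):
    """Naive transposed convolution without padding.

    x: (C_in, H, W) input feature map.
    w: (C_in, C_out, kH, kW) weights.
    """
    c_in, h, wd = x.shape
    _, c_out, kh, kw = w.shape
    out = np.zeros((c_out, (h - 1) * stride + kh, (wd - 1) * stride + kw))
    for ci in range(c_in):
        for i in range(h):
            for j in range(wd):
                # Each input pixel stamps a scaled copy of the kernel.
                out[:, i * stride:i * stride + kh,
                    j * stride:j * stride + kw] += x[ci, i, j] * w[ci]
    return out


def grouped_deconv2d(x, w, stride, group):
    """Split -> per-group deconv -> concat.

    w: (C_in, C_out // group, kH, kW), i.e. the grouped Caffe layout.
    """
    x_groups = np.split(x, group, axis=0)   # Split the input channels
    w_groups = np.split(w, group, axis=0)   # Matching weight slices
    outs = [deconv2d(xg, wg, stride) for xg, wg in zip(x_groups, w_groups)]
    return np.concatenate(outs, axis=0)     # Concat along channels


# Same hyperparameters as the first layer in this thread; 4 input channels assumed.
x = np.arange(16, dtype=float).reshape(4, 2, 2)
w = np.ones((4, 1, 2, 2))
y = grouped_deconv2d(x, w, stride=2, group=4)
```

Here y has 4 channels of size 4x4, and each output channel depends only on the matching input channel. An emitter-side version would presumably use tf.split, tf.nn.conv2d_transpose and tf.concat in the same arrangement.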
@drkoller @GalacticF I recently ran into the same problem. Is there any solution?
@rivergold I changed my model and retrained it.
I'm also having the same issue.
ValueError: output_shape does not match filter's output channels, 256 != 1
@namizzz Are there any plans to support grouped deconvolutions? I'm a bit out of my depth here, but with some guidance I might be able to implement a solution.
In grouped convolution, each group shares the weight, and the same holds for deconvolution/conv2d_transpose. Although TensorFlow does not support grouped deconvolution/conv2d_transpose, we can recover the full weight for TensorFlow's deconvolution/conv2d_transpose by copying the weight across the groups. Therefore, we can tile the Caffe weight "group" times before processing:
__weights_dict[deconvolution_name]['weights'] = np.tile(__weights_dict[deconvolution_name]['weights'], (1, 1, group, 1))
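One caveat that may be worth checking before relying on the tile: it does make the filter shape match, but it also fills the cross-group entries, so every output channel would sum over all input channels rather than only its own group. A block-diagonal expansion keeps the groups separate. The sketch below assumes the converted weights are stored as (kH, kW, output channels per group, input channels), which is what the error message and the tile axes suggest. The function name is hypothetical.

```python
import numpy as np


def expand_grouped_deconv_weights(w, group):
    """Build a full conv2d_transpose filter from grouped weights.

    w: (kH, kW, C_out // group, C_in)  (assumed layout)
    returns: (kH, kW, C_out, C_in) with zeros outside each group's block.
    """
    kh, kw, out_per_group, c_in = w.shape
    in_per_group = c_in // group
    full = np.zeros((kh, kw, out_per_group * group, c_in), dtype=w.dtype)
    for g in range(group):
        o = slice(g * out_per_group, (g + 1) * out_per_group)
        i = slice(g * in_per_group, (g + 1) * in_per_group)
        # Only input channels of group g feed output channels of group g.
        full[:, :, o, i] = w[:, :, :, i]
    return full


# Toy weights matching the first layer in this thread (4 input channels assumed).
w = np.arange(1, 17, dtype=float).reshape(2, 2, 1, 4)
tiled = np.tile(w, (1, 1, 4, 1))               # same shape, dense
block = expand_grouped_deconv_weights(w, 4)    # same shape, block-diagonal
```

Both tiled and block have shape (2, 2, 4, 4), so either one gets past the ValueError. Only block leaves the cross-group entries at zero. If the layout assumption holds, its result could be assigned to the weights entry in place of the np.tile call, at the cost of a dense filter that is mostly zeros.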