Converting Caffe Deconvolution layer with "group" parameter to TensorFlow

Open drkoller opened this issue 6 years ago • 7 comments

I am generally able to use the "mmconvert" program to successfully convert my Caffe models to TensorFlow. However, when I have a Caffe "Deconvolution" layer that includes a "group" parameter with value greater than 1, such as:

layer {
    bottom: 'input'
    top:  'output'
    name: 'mylayer'
    type: 'Deconvolution'
    param {lr_mult: 0 decay_mult: 0}
    convolution_param {
        dilation: 1
        num_output: 4
        group: 4
        kernel_size: 2
        stride: 2
    }
}

then the TensorFlow result does not seem correct. In particular, I get a TensorFlow error message similar to:

File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1241, in conv2d_transpose
    filter.get_shape()[2]))
ValueError: output_shape does not match filter's output channels, 4 != 1

which indicates that the grouped channels are not being translated correctly.

Platform: Ubuntu 16.04
Python version: 2.7
Source framework with version: Caffe
Destination framework with version: TensorFlow 1.8
Running scripts: mmconvert

drkoller avatar Aug 23 '18 21:08 drkoller
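
To see where the "4 != 1" in the error above can come from, here is a small shape walk-through. It is only a sketch under stated assumptions: that Caffe stores a Deconvolution weight blob as (input_channels, num_output / group, kernel_h, kernel_w), that the layer has 4 input channels (the original post does not say), and that the converter permutes the blob to the (height, width, output_channels, in_channels) layout that conv2d_transpose expects. The variable names are illustrative and not taken from MMdnn.

```python
import numpy as np

# Values from the layer definition in the original post.
num_output, group, kernel_size = 4, 4, 2
# Assumption: the bottom blob has 4 channels (not stated in the post).
in_channels = 4

# Assumed Caffe Deconvolution blob layout: (in, out / group, kh, kw).
caffe_weights = np.zeros((in_channels, num_output // group, kernel_size, kernel_size))

# Assumed permutation to the conv2d_transpose filter layout: (kh, kw, out, in).
tf_filter = caffe_weights.transpose(2, 3, 1, 0)

# conv2d_transpose compares the channel count in output_shape (num_output)
# against axis 2 of the filter.
filter_out_channels = tf_filter.shape[2]
print(num_output, filter_out_channels)  # 4 1
```

Under these assumptions the filter only carries num_output / group output channels, so a plain conv2d_transpose call sees 1 where it expects 4.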

Hi @drkoller, could you please provide your model file?

namizzz avatar Aug 24 '18 03:08 namizzz

This problem really does exist:

layer {
    name: "Deconvolution1"
    type: "Deconvolution"
    bottom: "Convolution18"
    top: "Deconvolution1"
    param { lr_mult: 0 decay_mult: 0 }
    convolution_param {
        num_output: 192
        bias_term: false
        pad: 1
        kernel_size: 4
        group: 192
        stride: 2
        weight_filler { type: "bilinear" }
    }
}

It fails with the error: output_shape does not match filter's output channels, 192 != 1

GalacticF avatar Sep 10 '18 07:09 GalacticF

I cannot provide my particular model file, due to proprietary concerns. However, it looks like any Caffe model with a grouped Deconvolution layer (i.e. a "group" value greater than one) is not going to convert to TensorFlow properly with the current MMdnn code. @GalacticF provides an additional example.

Grouped Convolution layers do convert properly, though (although not in the best-performing manner). When a grouped Convolution layer is converted to TensorFlow, the MMdnn translation outputs an initial Split operation into the individual groups, then a set of Convolutions on each group, and finally a Concat to merge the results. This functionality is in the TensorFlow emitter code for def _layer_Conv() in mmdnn/conversion/tensorflow/tensorflow_emitter.py. However, there is currently no similar functionality to support grouping in the case of Deconvolution layers.

drkoller avatar Sep 10 '18 18:09 drkoller
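
The Split / per-group operation / Concat pattern described above for grouped Convolution could in principle be applied to Deconvolution as well. The following NumPy sketch illustrates the idea; it is not MMdnn code. It assumes NHWC inputs, no padding, and a grouped filter laid out as (kh, kw, C_out / group, C_in), with the input-channel axis holding the groups back to back. The function names conv_transpose2d and grouped_conv_transpose2d are hypothetical.

```python
import numpy as np

def conv_transpose2d(x, w, stride):
    """Naive transposed convolution without padding.

    x: (N, H, W, C_in), w: (kh, kw, C_out, C_in).
    """
    n, h, wd, _ = x.shape
    kh, kw, c_out, _ = w.shape
    out = np.zeros((n, (h - 1) * stride + kh, (wd - 1) * stride + kw, c_out))
    for i in range(h):
        for j in range(wd):
            # Each input pixel scatters a (kh, kw, C_out) patch into the output.
            patch = np.einsum("nc,hwoc->nhwo", x[:, i, j, :], w)
            out[:, i * stride:i * stride + kh, j * stride:j * stride + kw, :] += patch
    return out

def grouped_conv_transpose2d(x, w, stride, group):
    """Split the channels into groups, deconvolve each group, concatenate."""
    x_parts = np.split(x, group, axis=3)  # split input channels
    w_parts = np.split(w, group, axis=3)  # matching slice of the filter
    outs = [conv_transpose2d(xg, wg, stride) for xg, wg in zip(x_parts, w_parts)]
    return np.concatenate(outs, axis=3)

# Shapes matching the first example: num_output 4, group 4, kernel 2, stride 2.
rng = np.random.default_rng(0)
x = rng.standard_normal((1, 3, 3, 4))
w = rng.standard_normal((2, 2, 1, 4))
y = grouped_conv_transpose2d(x, w, stride=2, group=4)
print(y.shape)  # (1, 6, 6, 4)
```

In an emitter, the same structure would presumably map onto a split op, one conv2d_transpose per group, and a concat, mirroring what _layer_Conv() is described as doing for grouped Convolution.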

@drkoller @GalacticF I recently ran into the same problem. Is there any solution?

rivergold avatar Oct 29 '18 07:10 rivergold

@rivergold I changed the model and retrained it.

GalacticF avatar Oct 29 '18 10:10 GalacticF

I'm also having the same issue.

ValueError: output_shape does not match filter's output channels, 256 != 1

@namizzz Are there any plans to support grouped deconvolutions? I'm a bit out of my depth here, but with some guidance I might be able to implement a solution.

KendallPark avatar Mar 17 '19 23:03 KendallPark

In grouped convolution, each group shares the weight, and the same holds for deconvolution/conv2d_transpose. Although TensorFlow does not support grouped deconvolution/conv2d_transpose, we can recover the full weight for the TensorFlow deconvolution/conv2d_transpose by copying the weight once per group. So the workaround is to tile the Caffe weight "group" times along the third axis before it is processed:

__weights_dict[deconvolution_name]['weights'] = np.tile(__weights_dict[deconvolution_name]['weights'], (1, 1, group, 1))

BasicCoder avatar Jun 01 '21 08:06 BasicCoder
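
For comparison, here is a sketch of another way to build a full-size filter from grouped weights. It rests on the assumption that, as in Caffe's grouped layers, each group of output channels should only see its own group of input channels. Under that assumption the expanded filter is block-diagonal, with zeros wherever an output channel and an input channel belong to different groups, whereas np.tile fills every position. The function name expand_grouped_deconv_weights and the (kh, kw, C_out / group, C_in) layout are assumptions made for illustration and are not part of MMdnn.

```python
import numpy as np

def expand_grouped_deconv_weights(w, group):
    """Expand (kh, kw, C_out // group, C_in) to a block-diagonal (kh, kw, C_out, C_in)."""
    kh, kw, out_per_group, c_in = w.shape
    in_per_group = c_in // group
    full = np.zeros((kh, kw, out_per_group * group, c_in), dtype=w.dtype)
    for g in range(group):
        out_slice = slice(g * out_per_group, (g + 1) * out_per_group)
        in_slice = slice(g * in_per_group, (g + 1) * in_per_group)
        # Copy this group's weights into its own block; other blocks stay zero.
        full[:, :, out_slice, in_slice] = w[:, :, :, in_slice]
    return full

# Toy weights for num_output 4, group 4, kernel 2 (all entries nonzero).
w = np.arange(1, 17, dtype=np.float32).reshape(2, 2, 1, 4)
full = expand_grouped_deconv_weights(w, group=4)
tiled = np.tile(w, (1, 1, 4, 1))

print(full.shape, tiled.shape)  # (2, 2, 4, 4) (2, 2, 4, 4)
# Nonzero entries at kernel position (0, 0): block-diagonal vs. tiled.
print(np.count_nonzero(full[0, 0]), np.count_nonzero(tiled[0, 0]))  # 4 16
```

Both arrays have the shape conv2d_transpose expects, so either one gets past the shape check. They only produce the same output when the off-group entries do not matter, so it is worth comparing the converted layer against Caffe's output on a test input before relying on either variant.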