Incorrect output shape when using dilations and strides > 1 with depthwise conv

Open Hrayo712 opened this issue 3 years ago • 3 comments

Issue Type

Bug

Source

binary

Tensorflow Version

tf 2.4 - v2.4.0-rc4-71-g582c8d236cb 2.4.0

Custom Code

No

OS Platform and Distribution

Linux Ubuntu 18.04.6 LTS

Mobile device

No response

Python version

3.8.0

Bazel version

No response

GCC/Compiler version

No response

CUDA/cuDNN version

11.0

GPU model and memory

No response

Current Behaviour?

I am building a model that uses strides > 1 together with dilation_rate > 1 in some DepthwiseConv2D layers.

However, for certain values the output shape appears to be incorrect.

Note: the documentation states that combining dilation_rate > 1 with strides > 1 is invalid. However, DepthwiseConv2D raises no runtime error for this combination, whereas Conv2D does.

Standalone code to reproduce the issue

import tensorflow as tf

inputs = tf.keras.layers.Input(shape=(136, 136, 72), batch_size=1)
x = tf.keras.layers.DepthwiseConv2D(
    kernel_size=3,
    strides=2,
    activation=None,
    use_bias=False,
    padding='valid',
    dilation_rate=(3, 3))(inputs)

print(f"output shape: {x.shape}")

The print statement shows:

output shape: (1, 64, 64, 72)

However, when asking the same layer for its output shape directly:

conv = tf.keras.layers.DepthwiseConv2D(
    kernel_size=3,
    strides=2,
    activation=None,
    use_bias=False,
    padding='valid',
    dilation_rate=(3, 3))

print(conv.compute_output_shape((1, 136, 136, 72)))

The output reported is:

(1, 65, 65, 72)

Interestingly, when using dilation_rate=(2, 2) instead, the shapes reported by these two examples match.

Following the equation in: https://www.tensorflow.org/api_docs/python/tf/nn#notes_on_padding_2

for valid padding, the output shape is calculated as:

output_height = ceil((in_height - effective_filter_height + 1) / stride_height)

where effective_filter_height, when using dilations, is computed as:

effective_filter_height = dilation * (filter_height - 1) + 1

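Following these equations, the output shape reported by compute_output_shape is correct: with an input height of 136, a filter height of 3, a dilation of 3 and a stride of 2, the dilated filter height is 3 * (3 - 1) + 1 = 7, so the output height is ceil((136 - 7 + 1) / 2) = 65. However, this doesn't match the shape produced when actually running a forward pass through the layer (64).

I also tried implementing this in PyTorch, and the output shape there matches the one reported by compute_output_shape.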
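The arithmetic above can be checked with a few lines of plain Python. This is only a sketch of the documented valid-padding formula, not TensorFlow's actual implementation; the helper names `effective_kernel_size` and `valid_output_size` are made up for illustration.

```python
import math


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel along one dimension."""
    return dilation * (kernel_size - 1) + 1


def valid_output_size(in_size: int, kernel_size: int, stride: int, dilation: int = 1) -> int:
    """Output size along one dimension for 'valid' padding, per the documented formula."""
    k_eff = effective_kernel_size(kernel_size, dilation)
    return math.ceil((in_size - k_eff + 1) / stride)


# Values from the reproduction: input 136, kernel 3, stride 2.
print(valid_output_size(136, 3, 2, dilation=3))  # 65, same as compute_output_shape
print(valid_output_size(136, 3, 2, dilation=2))  # 66
```

For dilation 3 the formula gives 65, which agrees with compute_output_shape and not with the 64 produced by the forward pass.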

This might be related to this issue: https://github.com/keras-team/keras/issues/16092

Is this a bug? I need to know how the output shape is computed, because I need to calculate how much padding to add so that the downsampling caused by the stride results in a specific output shape.
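For reference, this is roughly the padding calculation I have in mind. It is a sketch that assumes the documented valid-padding formula holds, which is exactly what the forward pass above seems to violate; `min_total_padding` and the other helper names are hypothetical, not part of any TensorFlow API.

```python
import math


def valid_output_size(in_size: int, kernel_size: int, stride: int, dilation: int = 1) -> int:
    """Output size along one dimension for 'valid' padding, per the documented formula."""
    k_eff = dilation * (kernel_size - 1) + 1
    return math.ceil((in_size - k_eff + 1) / stride)


def min_total_padding(in_size: int, target_out: int, kernel_size: int,
                      stride: int, dilation: int = 1) -> int:
    """Smallest total padding (both sides combined) along one dimension so that
    a 'valid' convolution yields at least target_out outputs.

    Derived from: out = floor((in + pad - k_eff) / stride) + 1 >= target_out.
    Returns 0 when the unpadded input is already large enough.
    """
    k_eff = dilation * (kernel_size - 1) + 1
    return max(0, (target_out - 1) * stride + k_eff - in_size)


# Example: halve a 136-wide input exactly (136 -> 68) with kernel 3, stride 2, dilation 3.
pad = min_total_padding(136, 68, kernel_size=3, stride=2, dilation=3)
print(pad)
print(valid_output_size(136 + pad, 3, 2, dilation=3))
```

The padding would then be applied explicitly (for example with a zero-padding layer) before a convolution with padding='valid', but that only works if the layer really follows this formula.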

Thanks in advance for your support!

Hrayo712 avatar Aug 23 '22 09:08 Hrayo712

@gadagashwini, I was able to reproduce the issue on tensorflow v2.8, v2.9 and nightly. Kindly find the gist of it here.

tilakrayal avatar Aug 24 '22 10:08 tilakrayal

Any update on this, team?

Hrayo712 avatar Sep 24 '22 14:09 Hrayo712

@Hrayo712, I confirm this is an issue. It has to do with the implementation of the TensorFlow tf.nn.*conv* ops, which Keras layers are based on. In fact, there is an inconsistency: on TPU you do get the expected output shape. The inconsistency has to do with how dilation is implemented on CPU and GPU. I reported this to the TensorFlow team, which owns the convolution ops. The original bug you created was actually in the right place.

hertschuh avatar Sep 26 '22 18:09 hertschuh