
Sparsification of Depthwise Separable Convolution

umeannthtome opened this issue 7 years ago · 2 comments

Hi,

I see in https://github.com/tidsp/caffe-jacinto/blob/caffe-0.16/src/caffe/net.cpp#L2078 that the sparsification process excludes thin layers and depthwise separable layers. I understand why thin layers are excluded as accuracy after sparsification may drop drastically. But why aren't DW separable layers sparsified?

Is there an easy workaround to make it work for DW separable layers? Or would the implementation be totally different from usual convolution layers?
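For context, a minimal sketch of how a depthwise convolution is usually recognized in Caffe-style frameworks: it is expressed as a grouped convolution in which the group count equals the number of input channels, so each input channel is filtered independently. The function name and signature below are illustrative, not caffe-jacinto's actual API.

```python
def is_depthwise(group, input_channels, num_output):
    """Heuristic: a conv layer is depthwise when group == input channels,
    i.e. each input channel gets its own spatial filter (no cross-channel mixing).
    num_output must be a multiple of group for the layer to be valid."""
    return group == input_channels and num_output % group == 0

# A 3x3 conv with 32 input channels, 32 outputs, group=32 is depthwise;
# an ordinary conv (group=1) is not.
print(is_depthwise(32, 32, 32))  # True
print(is_depthwise(1, 32, 64))   # False
```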

umeannthtome commented on Nov 24 '17 04:11

Hi,

I don't think the implementation needs to be changed. Feel free to change the code to include whichever layers you would like to include.

The depthwise convolution layers do not have many parameters or multiplications, so it doesn't look like you will save much by sparsifying them. Note that the point-wise (1x1) convolution that comes right after the depthwise convolution is already included - that layer has most of the parameters and compute of the two.
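The imbalance is easy to check with a quick parameter count. A sketch (layer sizes are example values, not taken from any particular network): a 3x3 depthwise conv over C channels has C·9 weights, while the following 1x1 pointwise conv mapping C to C' channels has C·C' weights.

```python
def conv_params(kernel, in_ch, out_ch, group=1):
    """Weight count (ignoring bias) of a grouped convolution:
    out_ch filters, each seeing in_ch/group input channels of size kernel x kernel."""
    return out_ch * (in_ch // group) * kernel * kernel

C = 256  # example channel count

# 3x3 depthwise: group == C, so each filter sees a single channel -> 256*1*3*3
depthwise = conv_params(3, C, C, group=C)
# 1x1 pointwise: ordinary conv mixing all channels -> 256*256*1*1
pointwise = conv_params(1, C, C)

print(depthwise, pointwise)  # 2304 65536
print(pointwise / depthwise)  # the pointwise layer dominates by ~28x
```

So sparsifying only the pointwise layer already covers the bulk of the parameters and multiply-accumulates in a depthwise separable block.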

mathmanu commented on Nov 24 '17 05:11

Thanks, that helps.

umeannthtome commented on Nov 24 '17 06:11