tsn-tensorflow

BNInception converted from caffe/pytorch

Open gongbudaizhe opened this issue 6 years ago • 6 comments

Hi,

Have you converted the BNInception model from TSN to TensorFlow format?

gongbudaizhe avatar Jul 25 '18 04:07 gongbudaizhe

Sorry, I have not. I am training the TensorFlow implementation. I currently get 82% accuracy on the RGB stream with 25 segments and one random crop. I am still debugging...

shuangshuangguo avatar Jul 25 '18 04:07 shuangshuangguo

I am also trying to reproduce TSN using TensorFlow. However, the performance is not satisfactory (~7% vs ~17% on the Something-Something V1 validation set).

I tried to keep all configurations the same as tsn-pytorch, except that I am using InceptionV2 instead of BNInception. I guess this might be the problem.

gongbudaizhe avatar Jul 25 '18 05:07 gongbudaizhe

As far as I know, BNInception is actually InceptionV2. But I found that the slim InceptionV2 implementation differs from the PyTorch BNInception.

shuangshuangguo avatar Jul 25 '18 05:07 shuangshuangguo

slim InceptionV2 implements the first layer as a separable convolution.
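
To give a rough sense of how much this changes the first layer, here is a small sketch that counts weights for a standard versus a depthwise-separable convolution. The 7x7 kernel, 3 input channels, 64 output channels, and depth multiplier of 8 are assumptions based on my reading of the slim code, not something confirmed in this thread; the helper names are made up for illustration, and biases are ignored.

```python
def conv_params(kernel, c_in, c_out):
    """Weight count of a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out, depth_multiplier):
    """Weight count of a depthwise-separable convolution (no bias).

    Depthwise step: one kernel x kernel filter per input channel,
    repeated depth_multiplier times.
    Pointwise step: a 1x1 convolution mixing the depthwise outputs.
    """
    depthwise = kernel * kernel * c_in * depth_multiplier
    pointwise = c_in * depth_multiplier * c_out
    return depthwise + pointwise


# Assumed first-layer shape: 7x7 kernel, RGB input, 64 output channels.
standard = conv_params(7, 3, 64)
# Assumed depth multiplier of 8 for the separable variant.
separable = separable_conv_params(7, 3, 64, 8)

print("standard:", standard)
print("separable:", separable)
```

Under these assumptions the two layers have different weight shapes, so weights from one cannot be loaded directly into the other.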

shuangshuangguo avatar Jul 25 '18 05:07 shuangshuangguo

You are right. In principle, InceptionV2 should work as well as BNInception, since BNInception is just a reproduction of InceptionV2 by Xiong.

The differences are:

  1. InceptionV2 inputs = 2.0 * (RGB_images / 255 - 0.5); BNInception inputs = BGR_images - [104, 117, 128]

  2. InceptionV2 has no biases in its convolutions and no gamma in BN.

  3. InceptionV2 weight decay is 4e-5; BNInception weight decay is 5e-4.

  4. InceptionV2 first layer: separable convolution

  5. InceptionV2 Mixed_4d branch 3 convolution has 96 channels; BNInception has 128 channels.
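
The input difference in item 1 can be sketched with NumPy as below. This is only an illustration of the two formulas, assuming uint8 images in channels-last RGB order; the function names are hypothetical and are not taken from slim or tsn-pytorch.

```python
import numpy as np

# Per-channel means in BGR order, as listed in item 1.
BGR_MEAN = np.array([104.0, 117.0, 128.0], dtype=np.float32)


def preprocess_inception_v2(rgb_images):
    """Scale RGB pixel values from [0, 255] to [-1, 1]."""
    x = rgb_images.astype(np.float32)
    return 2.0 * (x / 255.0 - 0.5)


def preprocess_bninception(rgb_images):
    """Reverse the channel axis (RGB to BGR), then subtract the BGR mean."""
    bgr = rgb_images[..., ::-1].astype(np.float32)
    return bgr - BGR_MEAN


# Dummy all-white batch of shape (N, H, W, 3); channels-last is an assumption.
images = np.full((1, 2, 2, 3), 255, dtype=np.uint8)
v2_inputs = preprocess_inception_v2(images)
bn_inputs = preprocess_bninception(images)

print(v2_inputs[0, 0, 0])
print(bn_inputs[0, 0, 0])
```

The two input ranges differ by roughly two orders of magnitude, so weights trained with one preprocessing would likely not transfer if fed inputs from the other.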

gongbudaizhe avatar Jul 25 '18 05:07 gongbudaizhe

Did you succeed? I also want to do that.

wqqLuke avatar Aug 09 '19 01:08 wqqLuke