SpatialTransformerLayer
Why are the results worse than the paper's?
Hi, @daerduoCarey, many thanks for your attention. Following your implementation, I have obtained some results, but the error percentages are roughly twice those reported in the paper.
The dataset generation follows appendix A of the paper: the rotated dataset (R) was generated by rotating MNIST training digits with a random rotation angle sampled uniformly between -90° and +90°.
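For reference, the rotated-dataset generation described above can be sketched in plain NumPy (this is my own minimal sketch with nearest-neighbour sampling, not the paper's or the repo's actual preprocessing script; the function names are hypothetical):

```python
import numpy as np

def rotate_digit(img, angle_deg):
    """Rotate a square image about its centre by angle_deg degrees
    (nearest-neighbour sampling, zero fill outside the source)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = np.deg2rad(angle_deg)
    cos_a, sin_a = np.cos(a), np.sin(a)
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, find its source location.
    sx = cos_a * (xs - cx) + sin_a * (ys - cy) + cx
    sy = -sin_a * (xs - cx) + cos_a * (ys - cy) + cy
    sxi = np.round(sx).astype(int)
    syi = np.round(sy).astype(int)
    valid = (sxi >= 0) & (sxi < w) & (syi >= 0) & (syi < h)
    out = np.zeros_like(img)
    out[valid] = img[syi[valid], sxi[valid]]
    return out

def make_rotated_set(images, seed=0):
    """Build the rotated dataset (R): each digit gets an angle
    drawn uniformly from [-90, +90] degrees, per appendix A."""
    rng = np.random.default_rng(seed)
    return np.stack([rotate_digit(im, rng.uniform(-90, 90)) for im in images])
```

A real pipeline would likely use bilinear interpolation (e.g. PIL's `Image.rotate` with `resample=Image.BILINEAR`), which can change results slightly compared with nearest-neighbour sampling.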
The network comes from your implementation.
The following are the error percentages on the rotated MNIST dataset:
| model  | baseline | mine |
|--------|----------|------|
| CNN    | 1.2      | 3.15 |
| FCN    | 2.1      | 5.91 |
| ST-CNN | 0.7      | 2.39 |
| ST-FCN | 1.2      | 3.05 |
Can you give me some suggestions for reproducing results close to the paper's?
Best Regards Kevin
@2502572025 Hi, I got an error when I ran the STN code with a new Caffe version: `Check error: unknown name: file`. Why?
Hello, @zengjianyou
You need to pay attention to the initialization of the regression layer bias. The author wrote his own custom initialization function.
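For context, a standard scheme for this initialization (the one used in the original STN paper) is to start the regression layer at the identity transform: zero weights and a bias of `[1, 0, 0, 0, 1, 0]`. A minimal NumPy sketch (the feature dimension and names here are hypothetical, not taken from the repo):

```python
import numpy as np

# Hypothetical shape: the localization net's final fully connected
# ("regress") layer maps a feature vector to 6 affine parameters theta.
feat_dim = 128
W = np.zeros((6, feat_dim), dtype=np.float32)       # zero weights
b = np.array([1, 0, 0, 0, 1, 0], dtype=np.float32)  # identity 2x3 affine

def regress_theta(features):
    """theta = W @ features + b, reshaped to a 2x3 affine matrix."""
    return (W @ features + b).reshape(2, 3)

theta = regress_theta(np.random.default_rng(0).random(feat_dim).astype(np.float32))
# With zero weights and an identity bias, the sampler starts as a
# no-op: every grid point maps to itself, and the localization net
# only gradually learns to deviate from the identity.
```

Initializing the bias to a plain constant (e.g. all zeros) collapses the initial transform, which may explain the worse results mentioned below.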
@2502572025 So you don't use the "file" type? I got worse results when I initialized the regression layer bias with "constant". What should I do?
@2502572025 Hi, Kevin,
Thank you for your interest in my code. When I ran my MNIST experiments with it, I got results pretty close to the original paper, but for the CUB dataset I could not. Unfortunately, the authors of the original paper do not seem to intend to release their code (correct me if they have), so I cannot check what the difference between my implementation and theirs is.
But you can try other implementations, e.g. the TensorFlow one at https://github.com/kevinzakka/spatial-transformer-network or the Torch one at http://torch.ch/blog/2015/09/07/spatial_transformers.html
Bests, Kaichun
@daerduoCarey I am very interested in your code. I always get unstable results when I run my experiments with it: the loss keeps oscillating toward the end of training. I hope you can give me some advice. Thank you!
@daerduoCarey
Hi, Kaichun. Many thanks for your attention.
I think the only difference between your setup and mine is the data preprocessing. I suppose this situation is caused by two factors: i. the data preprocessing (mine is based on Python's PIL); ii. the Caffe version (after I switched to a different Caffe version, the divergent training started to converge).
So, if possible, could you share your data or preprocessing script? I want to find out what's wrong with my processing.
Thank you very much again!
Best Regards Kevin
Hi @xyyu-kevin, can you tell me how to add an STN layer directly after the input image data layer and then pass the transformed image into a pre-trained VGG16? Do the localization network and the STN layer both need backward passes to learn their parameters? And how should I set the learning rate?