SpatialTransformerLayer
Why are the results worse than the paper's?
Hi, @daerduoCarey, many thanks for your attention. Following your implementation, I have obtained some results, but the error percentages are roughly twice those reported in the paper.
The dataset generation follows appendix A of the paper: the rotated dataset (R) was generated by rotating MNIST training digits with a random rotation angle sampled uniformly between -90° and +90°.
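For reference, the rotated-dataset generation described above can be sketched in plain NumPy (this is my own minimal sketch with nearest-neighbour sampling, not the paper's or the repo's actual preprocessing script; the function names are hypothetical):

```python
import numpy as np

def rotate_digit(img, angle_deg):
    """Rotate a square image about its centre by angle_deg degrees
    (nearest-neighbour sampling, zero fill outside the source)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = np.deg2rad(angle_deg)
    cos_a, sin_a = np.cos(a), np.sin(a)
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, find its source location.
    sx = cos_a * (xs - cx) + sin_a * (ys - cy) + cx
    sy = -sin_a * (xs - cx) + cos_a * (ys - cy) + cy
    sxi = np.round(sx).astype(int)
    syi = np.round(sy).astype(int)
    valid = (sxi >= 0) & (sxi < w) & (syi >= 0) & (syi < h)
    out = np.zeros_like(img)
    out[valid] = img[syi[valid], sxi[valid]]
    return out

def make_rotated_set(images, seed=0):
    """Build the rotated dataset (R): each digit gets an angle
    drawn uniformly from [-90, +90] degrees, per appendix A."""
    rng = np.random.default_rng(seed)
    return np.stack([rotate_digit(im, rng.uniform(-90, 90)) for im in images])
```

A real pipeline would likely use bilinear interpolation (e.g. PIL's `Image.rotate` with `resample=Image.BILINEAR`), which can change results slightly compared with nearest-neighbour sampling.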
The network comes from your implementation.
The following are the error percentages on the rotated MNIST dataset:
| model  | baseline | mine |
|--------|----------|------|
| CNN    | 1.2      | 3.15 |
| FCN    | 2.1      | 5.91 |
| ST-CNN | 0.7      | 2.39 |
| ST-FCN | 1.2      | 3.05 |
Can you give me some suggestions for reproducing results close to the paper's?
Best Regards Kevin
@2502572025 Hi, I got an error when I ran the STN code with a new Caffe version: `Check error: unknown name: file`. Why?
Hello, @zengjianyou
You need to pay attention to the initialization of the regression layer bias. The author wrote his own custom initialization function.
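For context, a standard scheme for this initialization (the one used in the original STN paper) is to start the regression layer at the identity transform: zero weights and a bias of `[1, 0, 0, 0, 1, 0]`. A minimal NumPy sketch (the feature dimension and names here are hypothetical, not taken from the repo):

```python
import numpy as np

# Hypothetical shape: the localization net's final fully connected
# ("regress") layer maps a feature vector to 6 affine parameters theta.
feat_dim = 128
W = np.zeros((6, feat_dim), dtype=np.float32)       # zero weights
b = np.array([1, 0, 0, 0, 1, 0], dtype=np.float32)  # identity 2x3 affine

def regress_theta(features):
    """theta = W @ features + b, reshaped to a 2x3 affine matrix."""
    return (W @ features + b).reshape(2, 3)

theta = regress_theta(np.random.default_rng(0).random(feat_dim).astype(np.float32))
# With zero weights and an identity bias, the sampler starts as a
# no-op: every grid point maps to itself, and the localization net
# only gradually learns to deviate from the identity.
```

Initializing the bias to a plain constant (e.g. all zeros) collapses the initial transform, which may explain the worse results mentioned below.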
@2502572025 So you don't use the "file" type? I got worse results when I initialized the regression layer bias with "constant". What should I do?
@2502572025 Hi, Kevin,
Thank you for your interest in my code. When I ran my MNIST experiments with it, I got results pretty close to the original paper, but for the CUB dataset I could not. Unfortunately, the authors of the original paper do not seem to intend to release their code (correct me if they have), so I cannot check what the difference between my implementation and theirs is.
But you can try other implementations, e.g. the TensorFlow one at https://github.com/kevinzakka/spatial-transformer-network or the Torch one at http://torch.ch/blog/2015/09/07/spatial_transformers.html
Bests, Kaichun
@daerduoCarey I am very interested in your code. I always get unstable results when I run my experiments with it: the loss keeps oscillating toward the end of training. I hope you can give me some advice. Thank you!
@daerduoCarey
Hi, Kaichun. Many thanks for your attention.
I think the only difference between your setup and mine is the data preprocessing. I suppose this situation is caused by two factors: i. the data preprocessing (mine is based on Python's PIL); ii. the Caffe version (after I switched to a different Caffe version, the divergent training started to converge).
So, if possible, could you share your data or preprocessing script? I want to find out what's wrong with my processing.
Thank you very much again!
Best Regards Kevin
Hi @xyyu-kevin, can you tell me how to add an STN layer directly after the input image data layer and then pass the transformed image into a pre-trained VGG16? Do the localization network and the STN layer both need backward passes to learn their parameters? And how should I set the learning rate?