DISE-Domain-Invariant-Structure-Extraction

Unable to reproduce the values in the paper

Open subeeshvasu opened this issue 4 years ago • 6 comments

Hello,

I ran the code "train_dise_gta2city.py" following the procedure explained on this project page. The only change I made was to set the batch_size to 1, to reduce the memory requirement. I got 38.4% mIoU on the val set. This is a big difference compared to the value of 45.4% reported in the paper. Can you please help me understand the potential reasons behind this performance drop?

Some possible reasons I can guess at are the following.

  1. The mIoU scores are computed at a resolution of 512 x 1024, while the original images are of size 1024 x 2048. The paper does not mention the resolution at which the values are reported. Maybe the authors reported the values at a resolution of 1024 x 2048? To check whether this is the reason, I used the pretrained weights provided by the authors and got a score of 44.2% for images at resolution 512 x 1024. Therefore, I am assuming that the resolution of the test images is not the reason behind the performance drop (a sketch of this kind of upsampled-evaluation check is given after this list).

  2. As per the paper, the authors used pretrained weights from the PASCAL VOC dataset to initialise the encoder. This could also be a reason for the performance drop. However, even when I start the training scheme from the pretrained weights provided by the authors, the performance eventually goes down and starts to fluctuate around 38-39%.
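For concreteness, here is a minimal sketch of the upsampled-evaluation check mentioned in point 1. The `model`/`image` names and the single-tensor output are assumptions for illustration, not the repo's actual interface:

```python
import torch
import torch.nn.functional as F

def predict_full_res(model, image, full_size=(1024, 2048)):
    # Upsample the segmentation logits to the full Cityscapes resolution
    # before taking the argmax, so that mIoU is computed against the
    # 1024 x 2048 ground-truth labels.
    model.eval()
    with torch.no_grad():
        logits = model(image)                      # e.g. [1, 19, h, w]
        logits = F.interpolate(logits, size=full_size,
                               mode='bilinear', align_corners=True)
        pred = logits.argmax(dim=1)                # [1, 1024, 2048]
    return pred.squeeze(0).cpu().numpy()
```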

Has anyone succeeded in getting values around 44% while experimenting with this code?

Regards, Subeesh

subeeshvasu avatar Aug 29 '19 16:08 subeeshvasu

Hi,

Unfortunately, training under the PyTorch framework is non-deterministic. A relevant issue is here: https://discuss.pytorch.org/t/random-seed-initialization/7854/18 Even when we re-run the code there is still some fluctuation, but it is not difficult to get a value higher than 44% (with batch size = 2).
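For what it's worth, here is a minimal sketch of the usual way to pin the seeds in PyTorch; this is not part of the released code, and enabling cuDNN determinism can slow training down:

```python
import random
import numpy as np
import torch

def set_seed(seed=1234):
    # Seed every RNG the training loop may touch.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade speed for reproducibility in cuDNN convolutions.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Even with all of this, some CUDA ops remain non-deterministic, so results can still differ slightly between runs.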

Regarding your guesses:

  1. During the final testing, we use 1024x2048. As you tested, we don't think it affects the results much.
  2. Our apologies, that's a typo. Following the setting of Tsai et al., we use exactly the same initial weights as DeepLabv2 used, so the code is correct. A relevant issue is here: https://github.com/wasidennis/AdaptSegNet/issues/5 (see the sketch below).
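A rough sketch of how such DeepLab initial weights are typically loaded; the checkpoint path and the shape-matching filter below are illustrative, not copied from this repo:

```python
import torch

def load_deeplab_init(model, path='deeplab_init_weights.pth'):
    # Illustrative only: `model` is the encoder and `path` a hypothetical
    # checkpoint file. Copy matching backbone parameters from the DeepLab
    # initial weights and skip layers whose shapes differ (e.g. a classifier
    # head trained for a different number of classes).
    saved_state_dict = torch.load(path)
    new_params = model.state_dict().copy()
    for name, param in saved_state_dict.items():
        if name in new_params and new_params[name].shape == param.shape:
            new_params[name] = param
    model.load_state_dict(new_params)
```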

A potential reason could be the versions of the libraries. I am re-running my code with batch size = 1. I hope I can come up with some good results and a random seed to help you reproduce the performance.

--- edit --- In total, the model is trained for 250,000 steps. The smaller the batch size you use, the fewer samples the model sees (i.e. with batch size = 1 you only see half as much data as I did). It could help to train for more steps and adjust the learning rate moderately; a rough sketch follows below.
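One way to write that adjustment down; the reference values (batch size 2, 250k steps, base LR 2.5e-4, poly power 0.9) are assumptions for illustration, not necessarily the repo's exact settings:

```python
# Stretch the schedule so the model sees the same number of samples,
# and scale the learning rate linearly with the batch size.
ref_batch, ref_steps, ref_lr = 2, 250_000, 2.5e-4

batch_size = 1
num_steps = ref_steps * ref_batch // batch_size   # 500,000 steps at batch size 1
base_lr = ref_lr * batch_size / ref_batch         # halved learning rate

def poly_lr(step, power=0.9):
    # DeepLab-style polynomial decay over the stretched schedule.
    return base_lr * (1 - step / num_steps) ** power
```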

Cheers, Hui-Po

a514514772 avatar Aug 29 '19 19:08 a514514772

Thank you for all the suggestions. I will try them out and see if I can improve the values. Please let me know if you are able to reach around 44% with batch size = 1.

subeeshvasu avatar Aug 29 '19 20:08 subeeshvasu

Hello, I am running the code, but my devices are two 1080 GPUs. What device did you use, and how long did it take to train the model? @subeeshvasu

crazygirl1992 avatar Oct 12 '19 02:10 crazygirl1992

@crazygirl1992 I was using a single GTX TitanX (12 GB). With this setting, at batch size = 1, training took approximately 2 hrs 20 mins per 1000 iterations.
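For context, a quick back-of-the-envelope calculation of what that rate implies for the full 250,000-step schedule, assuming the per-iteration cost stays constant:

```python
minutes_per_1000_iters = 2 * 60 + 20      # 2 hrs 20 mins, as measured above
total_iters = 250_000                     # schedule length used in the paper
total_hours = total_iters / 1000 * minutes_per_1000_iters / 60
print(f"~{total_hours:.0f} hours (~{total_hours / 24:.0f} days)")  # ~583 hours, ~24 days
```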

subeeshvasu avatar Oct 12 '19 11:10 subeeshvasu

Thank you very much. Can you achieve the paper's result now? The training is almost 250,000 iterations in the paper, so the total cost is roughly 250 x 2 hrs 20 mins, not just a couple of hours.

crazygirl1992 avatar Oct 12 '19 11:10 crazygirl1992

I couldn't get those values. With batch size = 4, one could probably reproduce them, I guess.

subeeshvasu avatar Oct 12 '19 11:10 subeeshvasu