
About learning rate setting

Open myCigar opened this issue 5 years ago • 9 comments

Hello @huochaitiantang

Recently, I retrained the model with your code. The learning rate in train.sh is 0.00001, while the default value in train.py is 0.001. Which learning rate did you use to train your current model? Thanks!
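For context, my understanding is that the value passed by train.sh overrides the default defined in train.py (a minimal sketch, assuming train.py reads its options with argparse and names the flag `--lr`):

```python
# Minimal sketch of the override behavior; the --lr flag name is an assumption.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--lr', type=float, default=0.001)  # default in train.py
args = parser.parse_args()  # "--lr 0.00001" from train.sh overrides the default
print(args.lr)
```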

myCigar avatar Nov 05 '19 13:11 myCigar

lr=1e-5, as in the paper.
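In practice that just means passing 1e-5 to the optimizer (a minimal sketch; the choice of Adam here is an assumption, not necessarily what this repo uses):

```python
# Minimal sketch of setting lr=1e-5; the Adam optimizer is an assumption.
import torch

model = torch.nn.Linear(3, 1)  # stand-in for the matting network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
```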

huochaitiantang avatar Nov 12 '19 14:11 huochaitiantang

Thanks.

myCigar avatar Nov 12 '19 15:11 myCigar

Hi, @huochaitiantang

I am wondering whether the training is stable (replicable). I ran the experiment using the default settings provided by train.sh (except that I changed the batch size to 25), but I got SAD=66.7 after epoch 25, while the best SAD=60.7 came after epoch 5. Do you have any ideas? Thanks!

hejm37 avatar Dec 02 '19 02:12 hejm37

The current training is sensitive to batch size, and a large batch size leads to lower performance. You could get a reasonable result with batch size = 1.
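Concretely, the batch size is set on the data loader (a minimal sketch; the `TensorDataset` with random data is a placeholder for the repo's actual matting dataset):

```python
# Minimal sketch of a batch-size-1 loader; TensorDataset with random
# tensors stands in for the repo's actual matting dataset.
import torch
from torch.utils.data import DataLoader, TensorDataset

train_dataset = TensorDataset(torch.randn(10, 3, 320, 320))  # placeholder data
train_loader = DataLoader(train_dataset, batch_size=1, shuffle=True)
```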

huochaitiantang avatar Dec 02 '19 03:12 huochaitiantang

Thanks! I will try a few more experiments (with batch size = 1) to see if I can get close to SAD=54.42.

hejm37 avatar Dec 02 '19 03:12 hejm37

Hi, @huochaitiantang

I tried batch size = 1 as you suggested and got reasonable results in 2 out of 3 experiments. However, the best performance came at epoch 11 and epoch 17 respectively, and then the SAD went up again (more precisely, it fluctuated in the range 55–70). Did you see the same behavior in your experiments? Thanks!

hejm37 avatar Dec 09 '19 07:12 hejm37

In my experiment, the SAD metric fluctuated in the range 54–64 after the 3rd epoch.
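Since the metric fluctuates rather than converging monotonically, it is common to keep the checkpoint with the lowest validation SAD (a minimal sketch; the per-epoch SAD values below are dummies standing in for real evaluation):

```python
# Minimal sketch of best-checkpoint saving; the SAD values are dummies.
import torch

model = torch.nn.Linear(3, 1)          # stand-in for the matting network
epoch_sads = [58.1, 55.3, 61.9, 54.8]  # dummy per-epoch validation SADs

best_sad = float('inf')
for epoch, sad in enumerate(epoch_sads):
    if sad < best_sad:
        best_sad = sad
        torch.save(model.state_dict(), 'best_model.pth')
```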

huochaitiantang avatar Dec 13 '19 08:12 huochaitiantang

Thanks for your response! I found the same in my experiment.

hejm37 avatar Dec 13 '19 17:12 hejm37

Hi, thank you for your excellent work! My question is: why does the performance get worse as the batch size increases? Is this phenomenon documented in the literature? I am very confused.

kfeng123 avatar Dec 25 '19 07:12 kfeng123