cvpr2018-SSAH
How much time does the training phase take?
I ran the code for a whole day on a K80 GPU, but it has only reached epoch 19. May I ask how long the training phase took for you?
@asiandragon My GPU is a 1080 Ti, and training is quick. If you use a K80, it will be relatively slow. Besides, I advise you to try a different training order, or, as a last resort, reload the pre-trained weights, discard the discriminator, and only train the generator. Good luck.
@zyfsa Thanks for your reply. What do you mean by a different training order? Does it mean changing the order of updating ImgNet, LabNet, and TxtNet? Also, roughly how long did training take for you? Could you finish the training within 1 day?
@asiandragon On the first question, you are right. On the second, too many epochs are useless. I run 60 epochs and then stop; the time is about 12 hours. You can try it.
@zyfsa I stopped the training after 50 epochs, but the accuracy is only around 60% and 55% for i-t and t-i, respectively. Could you kindly upload your pretrained model to the checkpoint directory? Thank you.
@asiandragon I think you are right. So I reload the pre-trained weights from checkpoint and only train the generator. This brings some improvement. Besides, training has a certain randomness; you can try it.
@asiandragon Namely, I do not train the discriminator; I only train lab_net, img_net, and text_net.
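To make that recipe concrete, here is a minimal TF1-style sketch of a generator-only loop that reloads pre-trained weights and skips the discriminator. The losses are toy stand-ins for the real lab_net/img_net/text_net objectives, and all names and paths are illustrative, not this repo's exact API:

```python
import os
import tensorflow as tf

# Toy stand-ins for the three generator losses; in the real repo these
# come from lab_net, img_net, and text_net. All names are illustrative.
w_lab = tf.Variable(1.0)
w_img = tf.Variable(1.0)
w_txt = tf.Variable(1.0)
loss_lab = tf.square(w_lab - 0.5)
loss_img = tf.square(w_img - 0.5)
loss_txt = tf.square(w_txt - 0.5)

opt = tf.train.AdamOptimizer(1e-4)
train_lab = opt.minimize(loss_lab, var_list=[w_lab])
train_img = opt.minimize(loss_img, var_list=[w_img])
train_txt = opt.minimize(loss_txt, var_list=[w_txt])
# Deliberately absent: any discriminator loss or discriminator update op.

saver = tf.train.Saver()
os.makedirs('checkpoint', exist_ok=True)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    ckpt = tf.train.latest_checkpoint('checkpoint/')  # reload pre-trained weights
    if ckpt:
        saver.restore(sess, ckpt)
    for epoch in range(60):
        # Generator-only updates, in LabNet -> ImgNet -> TxtNet order.
        sess.run([train_lab, train_img, train_txt])
    saver.save(sess, 'checkpoint/ssah_generator_only')
```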
@zyfsa Have you ever conducted this experiment on the NUS-WIDE dataset?
@FrankYufeng17 No; maybe that is the next work.
@zyfsa So does it mean that we can neglect the adversarial learning loss from Sec. 3.4 of the original paper? And according to your updated results, in that case we can obtain results similar to the original paper within one epoch?
@asiandragon Yes, you can run it directly and you will get the result.
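To spell out what neglecting Sec. 3.4 means for the objective, here is a rough sketch of the reduced training goal; $\mathcal{L}_{lab}$, $\mathcal{L}_{img}$, $\mathcal{L}_{txt}$, and $\mathcal{L}_{adv}$ are my own shorthand for the LabNet, ImgNet, TxtNet, and adversarial terms, not the paper's exact notation:

$$\min_{\theta_{lab},\,\theta_{img},\,\theta_{txt}}\; \mathcal{L}_{lab} + \mathcal{L}_{img} + \mathcal{L}_{txt} \qquad \text{(the min-max adversarial term } \mathcal{L}_{adv} \text{ from Sec. 3.4 is dropped)}$$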
@zyfsa I get the same result as you. But have you found out why the results decrease after adding the adversarial learning part? I cannot figure it out.
Sorry, I have not tried that experiment. In fact, I think adversarial learning is a bit of magic, and some papers also find that it brings little or no improvement. Besides, in this paper (SSAH), the self-supervised semantic network is the really important part.
@zyfsa Also, although I can obtain relatively high performance with only 1 epoch, when I continue to train the network the performance decreases (only 64% i-t and 55% t-i after 11 epochs). Did you encounter this problem?
Yes, I encountered this problem. Besides, I consulted the authors, and they reached a similar conclusion: adversarial learning is not as important as we think.
@zyfsa How do you solve this problem? Train only 1 epoch and use that result? I excluded the adversarial learning part, but the results still decrease over the training phase (when iterating more epochs).
@asiandragon I am also confused, because training converges so quickly. The authors show the training efficiency in Figure 5 of the paper, but they sidestep the problem we discussed. Perhaps this result demonstrates the superiority of the approach, and early stopping is a good way to prevent overfitting? You could also visualize the loss with TensorBoard.
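For reference, a minimal sketch of both suggestions together, early stopping on validation MAP plus TensorBoard loss logging, again in TF1 style to match the repo. `compute_map`, the loss variable, and all paths are hypothetical placeholders for the repo's real evaluation routine and tensors:

```python
import os
import tensorflow as tf

# Toy placeholder for the training loss tensor; compute_map stands in for
# the repo's i->t / t->i MAP evaluation on a validation split.
loss = tf.Variable(1.0)
loss_summary = tf.summary.scalar('train/loss', loss)

def compute_map():
    return 0.6  # placeholder; evaluate retrieval MAP on held-out data here

best_map, patience, bad_epochs = 0.0, 5, 0
saver = tf.train.Saver()
os.makedirs('checkpoint', exist_ok=True)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter('logs/', sess.graph)
    for epoch in range(60):
        # ... run the training ops for one epoch here ...
        writer.add_summary(sess.run(loss_summary), epoch)
        cur_map = compute_map()
        if cur_map > best_map:
            best_map, bad_epochs = cur_map, 0
            saver.save(sess, 'checkpoint/best_model')  # keep the best-MAP epoch
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # MAP stopped improving: early stop
                break
    writer.close()
```

Point TensorBoard at `logs/` to watch the loss curve, and evaluate with `checkpoint/best_model` rather than the last epoch, which sidesteps the late-epoch MAP drop discussed above.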