texture_nets
Training the model failed
Hi, I tried to train my model on the ImageNet validation set, which contains 50k images. Early in training, i.e. while the iteration count is still below some small number such as 8000, I get reasonable test results from the trained model. But as training goes on, I get a black image whose output pixel values are NaN! All the training parameters are unchanged, and the training image is "cezanne.jpg", which is included in the texture_nets_v1 branch. Should I change the learning rate? Could you please give me some advice about this problem? Thanks!
I typically use 1e-3 lr for Johnson's model and 1e-2 (or higher) for others.
What batch size did you use? There is probably a problem in the dataloader: it may not be changing the epoch properly. I will take a look.
The batch size is 1; all of the parameters are unchanged. Can you train any model correctly using this code? It seems that the training suddenly goes wrong.
When I change the pyramid model to Johnson's model, it works fine. So maybe there is some problem with the pyramid model.
It is probably something to do with the learning rate. I tested both models; I will take a look.
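For anyone debugging this in the meantime, here is a minimal sketch of how a too-large learning rate can run fine for a while and then suddenly produce non-finite values, and how to catch the iteration where it starts. It is a toy Python example on a one-dimensional quadratic, not texture_nets code (which is Torch/Lua); the function name `first_bad_iteration` and the learning rates used are illustrative assumptions, not the repository's settings.

```python
import math


def first_bad_iteration(lr, num_iters, x0=1.0):
    """Run plain gradient descent on f(x) = x^2 and return the index of the
    first iteration whose loss is NaN or inf, or None if every loss is finite.

    Hypothetical helper for illustration only.
    """
    x = x0
    for i in range(num_iters):
        loss = x * x          # toy loss; overflows to inf once |x| is huge
        if not math.isfinite(loss):
            return i          # stop here so the iteration can be reported
        grad = 2.0 * x        # derivative of x^2
        x = x - lr * grad     # update: x is scaled by (1 - 2 * lr) each step
    return None


if __name__ == "__main__":
    # Small step: |1 - 2 * lr| < 1, so x shrinks and the loss stays finite.
    print("lr=0.1 ->", first_bad_iteration(lr=0.1, num_iters=2000))
    # Large step: |1 - 2 * lr| > 1, so x grows until the loss overflows.
    print("lr=1.5 ->", first_bad_iteration(lr=1.5, num_iters=2000))
```

Adding the same kind of finiteness check on the loss in the real training loop, and logging the iteration and learning rate when it first fails, would make reports in this thread easier to compare.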
Same issue for me. After enough iterations, the model returns black images.
I cannot reproduce it. Can you please post the exact command you ran?