ssd_keras
In training, why the loss is decreasing, while the val loss is increasing?
I used the VOCtest_06-Nov-2007 dataset. First I used get_data_from_XML.py to convert the XML ground truth to VOC2007.pkl, then used that to train the network. During training I found the loss is decreasing while the val loss is increasing. Is it overfitting?
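For reference, what the get_data_from_XML.py conversion essentially does is parse each VOC XML annotation into class indices and box coordinates before pickling. A minimal sketch with Python's stdlib (function name and tuple layout are my assumptions, not the script's exact code):

```python
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_string, class_names):
    """Parse one VOC-style XML annotation into
    (class_index, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_string)
    boxes = []
    for obj in root.iter('object'):
        name = obj.find('name').text
        if name not in class_names:
            continue  # skip classes we are not training on
        bb = obj.find('bndbox')
        boxes.append((
            class_names.index(name),
            float(bb.find('xmin').text),
            float(bb.find('ymin').text),
            float(bb.find('xmax').text),
            float(bb.find('ymax').text),
        ))
    return boxes
```

The real script additionally one-hot encodes the class and writes everything into a dict keyed by image filename before pickling.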
I'm observing the same phenomenon. Is there a fix for this?
@kitterive - What initial weights are you using? Also how are you normalizing the coordinates?
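On the normalization point: the ground-truth boxes are expected in relative coordinates, i.e. pixel values divided by the image width and height so every coordinate lies in [0, 1]. A one-liner sketch (function name is mine, not the repo's):

```python
def normalize_box(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel box coordinates to relative [0, 1] coordinates."""
    return (xmin / img_w, ymin / img_h, xmax / img_w, ymax / img_h)
```

If the boxes are left in absolute pixel coordinates, the localization loss will be huge and training will not converge sensibly, so this is worth double-checking first.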
I also observe the same behavior. However, I was able to get a val loss of 1.4 after 20 epochs and afterwards the val loss started increasing.
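If the val loss bottoms out around epoch 20 and then climbs, one pragmatic fix is to stop training at (or restore the weights from) the best epoch. In Keras this is normally done with the EarlyStopping/ModelCheckpoint callbacks; the logic they implement is just a patience counter over the validation loss, sketched here in plain Python:

```python
def best_stopping_epoch(val_losses, patience=3):
    """Return the index of the best epoch, stopping the scan once the
    val loss has failed to improve for `patience` consecutive epochs."""
    best_epoch, best_loss, wait = 0, float('inf'), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, wait = epoch, loss, 0
        else:
            wait += 1
            if wait >= patience:
                break  # val loss has stopped improving
    return best_epoch
```

This does not cure the overfitting itself, but it at least gives you the best checkpoint instead of the last one.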
@oarriaga - In that case, which model weights did you use for fine-tuning: VGG16 (with the top removed) or the caffe-converted SSD weights?
I used the pre-trained weights provided in the README file, which I believe come from an older version of the original caffe implementation.
@oarriaga @rykov8 - Has anyone successfully trained SSD from scratch (i.e. using only the VGG16 weights) with this code? If not, then perhaps it would be wise to rethink the loss function.
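For anyone auditing the loss function: SSD's multibox loss is a confidence term (softmax cross-entropy) plus a localization term, where the localization part is the smooth L1 loss from Fast R-CNN. A numpy sketch of the smooth L1 piece, which is easy to check against the paper when comparing implementations:

```python
import numpy as np

def smooth_l1(pred, target):
    """Smooth L1: 0.5 * x^2 where |x| < 1, else |x| - 0.5, summed over coords."""
    diff = np.abs(pred - target)
    per_coord = np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)
    return per_coord.sum()
```

A subtle part worth verifying in any port is not this formula but the hard negative mining around the confidence term (the 3:1 negative-to-positive ratio), which is where implementations tend to diverge.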
Hi @meetshah1995
try adding a BN (BatchNormalization) layer after each Conv layer.
(The pre-trained weights won't be a perfect match anymore, but they can still be a good starting point for training.)
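For context, what the suggested BatchNormalization layer computes per channel is a standardization of the conv activations over the batch, followed by a learned scale and shift. A numpy sketch of the training-time forward pass (gamma/beta shown at their usual 1/0 initialization):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Training-time batch norm over the batch axis for one channel:
    standardize, then apply learned scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```

In a Keras model definition this corresponds to inserting a BatchNormalization layer between the Conv2D layer and its activation.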
I am seeing the same issue when training on the MS COCO dataset of images.
I was following the training example from SSD_training.ipynb.
@meetshah1995 I have trained SSD with only the VGG16 weights and it was overfitting after ~20 epochs; my lowest validation loss was 1.4. I believe better results could be obtained with a correct implementation of the random_size_crop function in the data augmentation part. Also, the architecture ported in this repository is not the newest model from the latest arXiv version, which might lead to significant differences between the implementation presented here and the other ones around, such as the TF, PyTorch, and original caffe ones.
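As a rough sketch of what a working random_size_crop needs to do: sample a crop window of random size and position, clip each ground-truth box to the window, re-express it in the crop's coordinates, and drop boxes that fall entirely outside. A minimal numpy version (names and behavior are my assumptions, not the repo's implementation; the SSD paper additionally constrains crops by IoU with the objects):

```python
import numpy as np

def random_size_crop(img_w, img_h, boxes, rng, min_scale=0.3):
    """Sample a random crop window and remap relative [0, 1] boxes into it.

    boxes: array of shape (N, 4) with (xmin, ymin, xmax, ymax) in [0, 1].
    Returns the crop window (in pixels) and the surviving remapped boxes.
    """
    scale = rng.uniform(min_scale, 1.0)
    cw, ch = int(img_w * scale), int(img_h * scale)
    cx = rng.randint(0, img_w - cw + 1)
    cy = rng.randint(0, img_h - ch + 1)
    # crop window in relative coordinates
    x0, y0 = cx / img_w, cy / img_h
    sx, sy = cw / img_w, ch / img_h
    kept = []
    for xmin, ymin, xmax, ymax in boxes:
        # clip the box to the crop window
        nxmin, nymin = max(xmin, x0), max(ymin, y0)
        nxmax, nymax = min(xmax, x0 + sx), min(ymax, y0 + sy)
        if nxmax <= nxmin or nymax <= nymin:
            continue  # box lies entirely outside the crop
        # re-express in the crop's own [0, 1] coordinates
        kept.append([(nxmin - x0) / sx, (nymin - y0) / sy,
                     (nxmax - x0) / sx, (nymax - y0) / sy])
    return (cx, cy, cw, ch), np.array(kept)
```

The zoom-in effect of such crops makes small-object detection much more robust, which is presumably why its absence hurts here.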
Hi @oarriaga, can you show your training log? I want to know the loss after 120k iterations. Thank you in advance!
I am seeing the same issue while training on my own dataset. Is it overfitting or not?
My minimum loss is also around 1.39-1.4. Would adding random_size_crop help?