fpn.pytorch
Training suddenly terminates after the first epoch. Looking for help, please
Here is my traceback:
[session 1][epoch 1][iter 0] loss: 4.0006, lr: 1.00e-02
fg/bg=(128/384), time cost: 7.218862
rpn_cls: 0.6919, rpn_box: 0.1386, rcnn_cls: 2.8319, rcnn_box 0.3382
Traceback (most recent call last):
File "trainval_net.py", line 330, in
I found that the code runs normally with faster-rcnn, but it fails when I use the fpn code. So I guess the problem is in fpn.py, but I still can't figure out why.
What's more, I am using this model to train on my own data. If I switch the data back to the original VOC2007, it works. That's strange, because I just converted my own data into the VOC2007 format.
Here is one of my annotation files:
And here is an annotation file from the original VOC2007:
@KevinQian97 I have encountered the same problem. Have you found out how to solve it?
@KevinQian97 @WangTianYuan did you solve this issue?
Have you solved the problem? I got the same error. @KevinQian97 @WangTianYuan
I found that if you train the model on your own dataset and it contains dirty data, this causes NaN values in roi_level in fpn.py. You can try the following modification. Change
roi_level[roi_level < 2] = 2
roi_level[roi_level > 5] = 5
to
roi_level[roi_level < 2] = 2
roi_level[roi_level > 5] = 5
roi_level[roi_level != roi_level] = 5
The extra line works because NaN is the only value that is not equal to itself, so roi_level != roi_level selects exactly the NaN entries, which the two range checks miss.
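To see why the two range checks alone let NaN through, here is a minimal sketch in plain Python (no PyTorch needed). `clamp_levels` is a hypothetical helper written for illustration, not code from the repo. It applies the same three rules to a list of floats. PyTorch tensors follow the same floating-point comparison rules, so a mask like `roi_level != roi_level` (or `torch.isnan(roi_level)`) should pick out the same entries.

```python
def clamp_levels(levels, lo=2, hi=5):
    """Clamp pyramid levels to [lo, hi] and send NaN entries to hi.

    Hypothetical helper for illustration only; it mirrors the three
    masked assignments suggested above, applied to a plain Python list.
    """
    out = []
    for lv in levels:
        if lv != lv:          # True only for NaN
            out.append(float(hi))
        elif lv < lo:
            out.append(float(lo))
        elif lv > hi:
            out.append(float(hi))
        else:
            out.append(float(lv))
    return out


nan = float("nan")

# Every ordered comparison with NaN is False, so the two range
# checks never match it; only the self-inequality test does.
print(nan < 2, nan > 5, nan != nan)

print(clamp_levels([1.0, 3.0, 7.0, nan]))
```

Note that this only keeps the indexing from breaking. If the NaN values come from bad boxes in the annotations, it is probably still worth checking the dataset itself.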