pytorch-faster-rcnn

Training on a new dataset: speed slows down after a certain iteration, then an error is raised

Open ymchen7 opened this issue 8 years ago • 4 comments

Hi, I tried to implement Faster R-CNN on another dataset with JSON annotations. It worked when I converted the JSON annotations to XML and followed the VOC dataset format. Now I am trying to use the JSON directly and have written a PyTorch dataset for it. The problem is that training slows down after a certain iteration and then an error is raised. I haven't changed much in the training code, especially network.py, so I don't know what I should fix.

Environment: Python 2.7.5, CUDA 8.0, Tesla M40, VGG16, 7560 roidb entries, batch size left at the default.

Some log output:


iter: 480 / 30000, total loss: 2.036504 rpn_loss_cls: 0.663102 rpn_loss_box: 0.285543 loss_cls: 0.645373 loss_box: 0.442487 lr: 0.001000 speed: 0.322s / iter
iter: 500 / 30000, total loss: 1.373579 rpn_loss_cls: 0.327806 rpn_loss_box: 0.184931 loss_cls: 0.655622 loss_box: 0.205220 lr: 0.001000 speed: 0.321s / iter
iter: 520 / 30000, total loss: 2.680187 rpn_loss_cls: 0.390241 rpn_loss_box: 0.156651 loss_cls: 1.424312 loss_box: 0.708983 lr: 0.001000 speed: 0.320s / iter
iter: 540 / 30000, total loss: 1.936543 rpn_loss_cls: 0.328758 rpn_loss_box: 0.171708 loss_cls: 1.037029 loss_box: 0.399048 lr: 0.001000 speed: 0.320s / iter
iter: 560 / 30000, total loss: 2.059262 rpn_loss_cls: 0.350970 rpn_loss_box: 0.098141 loss_cls: 1.055237 loss_box: 0.554913 lr: 0.001000 speed: 0.357s / iter
iter: 580 / 30000, total loss: 1.768692 rpn_loss_cls: 0.511994 rpn_loss_box: 0.590019 loss_cls: 0.494195 loss_box: 0.172484 lr: 0.001000 speed: 0.592s / iter
iter: 600 / 30000, total loss: 1.049125 rpn_loss_cls: 0.265679 rpn_loss_box: 0.031345 loss_cls: 0.579659 loss_box: 0.172441 lr: 0.001000 speed: 0.810s / iter
iter: 620 / 30000, total loss: 2.416317 rpn_loss_cls: 0.413436 rpn_loss_box: 0.700102 loss_cls: 0.684415 loss_box: 0.618365 lr: 0.001000 speed: 1.015s / iter
iter: 640 / 30000, total loss: 1.650612 rpn_loss_cls: 0.249911 rpn_loss_box: 0.438877 loss_cls: 0.689089 loss_box: 0.272735 lr: 0.001000 speed: 1.207s / iter
iter: 660 / 30000, total loss: 1.902411 rpn_loss_cls: 0.415372 rpn_loss_box: 0.171251 loss_cls: 0.832487 loss_box: 0.483302 lr: 0.001000 speed: 1.390s / iter
iter: 680 / 30000, total loss: 2.508811 rpn_loss_cls: 0.565063 rpn_loss_box: 0.264611 loss_cls: 1.119584 loss_box: 0.559552 lr: 0.001000 speed: 1.559s / iter
iter: 700 / 30000, total loss: 2.196737 rpn_loss_cls: 0.406803 rpn_loss_box: 0.110491 loss_cls: 1.045586 loss_box: 0.633857 lr: 0.001000 speed: 1.722s / iter
iter: 720 / 30000, total loss: 1.117774 rpn_loss_cls: 0.428924 rpn_loss_box: 0.198056 loss_cls: 0.389690 loss_box: 0.101104 lr: 0.001000 speed: 1.875s / iter
iter: 740 / 30000, total loss: 3.418862 rpn_loss_cls: 0.320257 rpn_loss_box: 1.186544 loss_cls: 1.309661 loss_box: 0.602400 lr: 0.001000 speed: 2.020s / iter
iter: 760 / 30000, total loss: 1.773256 rpn_loss_cls: 0.458810 rpn_loss_box: 0.155227 loss_cls: 0.826238 loss_box: 0.332982 lr: 0.001000 speed: 2.161s / iter
iter: 780 / 30000, total loss: 2.134902 rpn_loss_cls: 0.472494 rpn_loss_box: 0.458725 loss_cls: 0.873805 loss_box: 0.329878 lr: 0.001000 speed: 2.294s / iter
iter: 800 / 30000, total loss: 2.456363 rpn_loss_cls: 0.692896 rpn_loss_box: 1.136842 loss_cls: 0.377992 loss_box: 0.248633 lr: 0.001000 speed: 2.419s / iter
iter: 820 / 30000, total loss: 2.277763 rpn_loss_cls: 0.515828 rpn_loss_box: 0.368687 loss_cls: 0.889839 loss_box: 0.503409 lr: 0.001000 speed: 2.538s / iter
iter: 840 / 30000, total loss: 3.272842 rpn_loss_cls: 0.464894 rpn_loss_box: 1.077220 loss_cls: 1.147062 loss_box: 0.583667 lr: 0.001000 speed: 2.652s / iter
iter: 860 / 30000, total loss: 1.022741 rpn_loss_cls: 0.322352 rpn_loss_box: 0.068675 loss_cls: 0.437379 loss_box: 0.194335 lr: 0.001000 speed: 2.762s / iter

Traceback (most recent call last):
  File "./tools/trainval_net.py", line 137, in <module>
    max_iters=args.max_iters)
  File "/data1/ymchen/project/pytorch-faster-rcnn-master/tools/../lib/model/train_val.py", line 354, in train_net
    sw.train_model(max_iters)
  File "/data1/ymchen/project/pytorch-faster-rcnn-master/tools/../lib/model/train_val.py", line 256, in train_model
    self.net.train_step_with_summary(blobs, self.optimizer)
  File "/data1/ymchen/project/pytorch-faster-rcnn-master/tools/../lib/nets/network.py", line 470, in train_step_with_summary
    self.forward(blobs['data'], blobs['im_info'], blobs['gt_boxes'])
  File "/data1/ymchen/project/pytorch-faster-rcnn-master/tools/../lib/nets/network.py", line 395, in forward
    rois, cls_prob, bbox_pred = self._predict()
  File "/data1/ymchen/project/pytorch-faster-rcnn-master/tools/../lib/nets/network.py", line 366, in _predict
    rois = self._region_proposal(net_conv)
  File "/data1/ymchen/project/pytorch-faster-rcnn-master/tools/../lib/nets/network.py", line 258, in _region_proposal
    rpn_labels = self._anchor_target_layer(rpn_cls_score)
  File "/data1/ymchen/project/pytorch-faster-rcnn-master/tools/../lib/nets/network.py", line 136, in _anchor_target_layer
    rpn_cls_score.data, self._gt_boxes.data.cpu().numpy(), self._im_info, self._feat_stride, self._anchors.data.cpu().numpy(), self._num_anchors)
AttributeError: 'NoneType' object has no attribute 'data'

ymchen7, Nov 14 '17 12:11

I'll get to this later this week.

ruotianluo, Nov 14 '17 13:11

Did you fix this problem? I got the same one.

hao522, Jan 17 '18 09:01

I also have the same issue. Does anyone have ideas?

penguinshin, May 09 '18 22:05

AttributeError: 'NoneType' object has no attribute 'data'

That might be a problem with your dataset. Please check whether your dataset contains correct annotation data, for example whether every box satisfies conditions such as xmax > xmin.
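A minimal sketch of such a check, assuming each image's boxes are available as rows of [xmin, ymin, xmax, ymax, class] along with the image width and height. The function name find_invalid_boxes and the row layout are hypothetical and not part of pytorch-faster-rcnn; adapt them to however your JSON loader stores annotations.

```python
import numpy as np


def find_invalid_boxes(boxes, width, height):
    """Return the row indices of boxes that look malformed.

    boxes: array-like of shape (N, 4) or (N, 5); the first four columns are
    assumed to be xmin, ymin, xmax, ymax (hypothetical layout).
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    if boxes.size == 0:
        return []
    xmin, ymin, xmax, ymax = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    bad = (
        ~np.isfinite(boxes[:, :4]).all(axis=1)  # NaN or inf coordinates
        | (xmax <= xmin)                        # zero or negative width
        | (ymax <= ymin)                        # zero or negative height
        | (xmin < 0) | (ymin < 0)               # outside the image (top/left)
        | (xmax > width) | (ymax > height)      # outside the image (bottom/right)
    )
    return np.flatnonzero(bad).tolist()


if __name__ == "__main__":
    sample = [
        [10, 20, 50, 80, 1],   # fine
        [60, 20, 40, 80, 2],   # xmax < xmin
        [0, 0, 700, 100, 1],   # xmax beyond a 640-pixel-wide image
    ]
    print(find_invalid_boxes(sample, width=640, height=480))
```

Running this over every annotation before training, and printing the image ID for any non-empty result, should narrow down which entries to inspect. Whether a box touching the image border counts as valid depends on your coordinate convention, so the bounds here may need adjusting.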

devendraswamy, Feb 28 '20 04:02