cascade-rcnn

Train error: many params are -1, can't save the trained model

Open DetectionIIT opened this issue 6 years ago • 3 comments

I0806 23:44:24.048591 20123 solver.cpp:219] Iteration 9900 (2.14913 iter/s, 46.5305s/100 iters), loss = 0.440841
I0806 23:44:24.048627 20123 solver.cpp:238] Train net output #0: bbox_iou = -1
I0806 23:44:24.048635 20123 solver.cpp:238] Train net output #1: bbox_iou_2nd = -1
I0806 23:44:24.048638 20123 solver.cpp:238] Train net output #2: bbox_iou_3rd = -1
I0806 23:44:24.048641 20123 solver.cpp:238] Train net output #3: bbox_iou_pre = -1
I0806 23:44:24.048645 20123 solver.cpp:238] Train net output #4: bbox_iou_pre_2nd = -1
I0806 23:44:24.048648 20123 solver.cpp:238] Train net output #5: bbox_iou_pre_3rd = -1
I0806 23:44:24.048651 20123 solver.cpp:238] Train net output #6: cls_accuracy = 0.984375
I0806 23:44:24.048655 20123 solver.cpp:238] Train net output #7: cls_accuracy_2nd = 0.972656
I0806 23:44:24.048658 20123 solver.cpp:238] Train net output #8: cls_accuracy_3rd = 0.964844
I0806 23:44:24.048666 20123 solver.cpp:238] Train net output #9: loss_bbox = 0.0117847 (* 1 = 0.0117847 loss)
I0806 23:44:24.048671 20123 solver.cpp:238] Train net output #10: loss_bbox_2nd = 0.0129223 (* 0.5 = 0.00646114 loss)
I0806 23:44:24.048676 20123 solver.cpp:238] Train net output #11: loss_bbox_3rd = 0.00699362 (* 0.25 = 0.0017484 loss)
I0806 23:44:24.048681 20123 solver.cpp:238] Train net output #12: loss_cls = 0.0294972 (* 1 = 0.0294972 loss)
I0806 23:44:24.048686 20123 solver.cpp:238] Train net output #13: loss_cls_2nd = 0.0663875 (* 0.5 = 0.0331937 loss)
I0806 23:44:24.048689 20123 solver.cpp:238] Train net output #14: loss_cls_3rd = 0.0622066 (* 0.25 = 0.0155517 loss)
I0806 23:44:24.048696 20123 solver.cpp:238] Train net output #15: rpn_accuracy = 0.999953
I0806 23:44:24.048701 20123 solver.cpp:238] Train net output #16: rpn_accuracy = -1
I0806 23:44:24.048703 20123 solver.cpp:238] Train net output #17: rpn_bboxiou = -1
I0806 23:44:24.048708 20123 solver.cpp:238] Train net output #18: rpn_loss = 0.000343773 (* 1 = 0.000343773 loss)
I0806 23:44:24.048713 20123 solver.cpp:238] Train net output #19: rpn_loss = 0 (* 1 = 0 loss)
I0806 23:44:24.048717 20123 sgd_solver.cpp:105] Iteration 9900, lr = 0.0002
I0806 23:45:10.848093 20123 solver.cpp:587] Snapshotting to binary proto file /disk1/g201708021059/cascade-rcnn/examples/voc/res101-9s-600-rfcn-cascade/log/cascadercnn_voc_iter_10000.caffemodel
*** Aborted at 1533570310 (unix time) try "date -d @1533570310" if you are using GNU date ***
PC: @ 0x7f55674532e7 caffe::Layer<>::ToProto()
*** SIGSEGV (@0x0) received by PID 20123 (TID 0x7f55682b49c0) from PID 0; stack trace: ***
@ 0x7f5565dedcb0 (unknown)
@ 0x7f55674532e7 caffe::Layer<>::ToProto()
@ 0x7f55675d7533 caffe::Net<>::ToProto()
@ 0x7f55675f415f caffe::Solver<>::SnapshotToBinaryProto()
@ 0x7f55675f42f2 caffe::Solver<>::Snapshot()
@ 0x7f55675f7f7a caffe::Solver<>::Step()
@ 0x7f55675f8994 caffe::Solver<>::Solve()
@ 0x40d4c0 train()
@ 0x408d32 main
@ 0x7f5565dd8f45 (unknown)
@ 0x409442 (unknown)
@ 0x0 (unknown)

thanks a lot

DetectionIIT, Aug 07 '18 01:08

-1 is fine; it means there are no positive samples. I don't know why the model fails to save. Saving should be independent of what the model is.
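To illustrate the -1 convention, here is a minimal sketch, not the actual Cascade R-CNN layer code. It assumes a metric such as bbox_iou is averaged over positive samples only, and that -1 is reported as a sentinel when a batch contains none. The function name, the `NO_POSITIVES` constant, and the "label > 0 means positive" rule are all hypothetical, chosen only for this example.

```python
from typing import Sequence

# Hypothetical sentinel for this sketch: the value reported when a batch
# contains no positive samples, so the average is undefined.
NO_POSITIVES = -1.0


def mean_iou_over_positives(ious: Sequence[float], labels: Sequence[int]) -> float:
    """Average IoU over positive samples; return -1 if there are none.

    Assumption for illustration: a sample is positive when its label > 0
    (label 0 is treated as background).
    """
    positive_ious = [iou for iou, label in zip(ious, labels) if label > 0]
    if not positive_ious:
        # Nothing to average, so report the sentinel instead of dividing by zero.
        return NO_POSITIVES
    return sum(positive_ious) / len(positive_ious)
```

Under these assumptions, a batch made up entirely of background samples yields -1 for the metric, while the loss terms are still computed normally, which matches the pattern in the log above.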

zhaoweicai, Aug 13 '18 04:08

thanks a lot

DetectionIIT, Aug 14 '18 12:08

I tried the same model. Can you tell me the final loss you got? My loss didn't drop; it stayed at about 0.5 the whole time. Can you give me some advice?

xiaosibai, Aug 15 '19 01:08