Segmentation fault (core dumped)

Open Note-Liu opened this issue 4 years ago • 8 comments

When training with CUDA and tfboard enabled, I get a segmentation fault (core dumped).

Note-Liu avatar Dec 02 '19 14:12 Note-Liu

Please provide more information about your error. I have no idea what happened on your machine.

GOATmessi7 avatar Dec 02 '19 14:12 GOATmessi7
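A bare "core dumped" message carries no Python traceback, so one way to gather the requested detail is the standard-library `faulthandler` module, which prints the Python stack when the process receives a fatal signal such as SIGSEGV. The sketch below is only an illustration: the helper name `enable_crash_traceback` is hypothetical, and placing the call at the top of `main.py` is an assumption about where it would be convenient.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=sys.stderr):
    """Dump the Python traceback of all threads if the interpreter crashes.

    `stream` must be a real file object (it needs a file descriptor).
    """
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    # Hypothetical usage: call this before any CUDA or tensorboard import,
    # so that a crash inside a native extension still shows the Python frame.
    enable_crash_traceback()
```

The same effect is available without editing any code by starting the interpreter with `python -X faulthandler main.py ...` followed by the usual arguments.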

TRAIN:
  LR: 0.001
  MOMENTUM: 0.9
  DECAY: 0.0005
  BURN_IN: 5
  MAXEPOCH: 300
  COS: True
  SYBN: False #True
  MIX: True
  NO_MIXUP_EPOCHS: 30
  LABAL_SMOOTH: True
  BATCHSIZE: 1
  IMGSIZE: 608
  IGNORETHRE: 0.7

train script: python main.py --cfg config/yolov3_baseline.cfg -d VOC --tfboard --ngpu 1 --checkpoint weights/darknet53_feature_mx.pth --start_epoch 0 --half --log_dir log/VOC -s 608

Note-Liu avatar Dec 02 '19 14:12 Note-Liu

Please share the versions of your CUDA, PyTorch, apex, and so on, along with the full error output. I know you are using the default script, but the error is obviously not in the script.

GOATmessi7 avatar Dec 02 '19 15:12 GOATmessi7
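The version details requested above can be collected in one go. The following is a minimal sketch, not part of the repository: the function name `collect_versions` and the report keys are made up for illustration, and it assumes that `torch` and `apex` expose a `__version__` attribute when they import successfully (it reports "unknown" otherwise).

```python
import importlib
import platform


def collect_versions():
    """Gather version strings that are useful in a crash report."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in ("torch", "apex"):
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # missing package or broken native extension
            info[name] = f"not importable ({type(exc).__name__})"
            continue
        info[name] = getattr(module, "__version__", "unknown")
        if name == "torch":
            # CUDA version PyTorch was built against, not the driver version.
            info["cuda (torch build)"] = str(module.version.cuda)
            info["cuda available"] = str(module.cuda.is_available())
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Pasting this output together with the full console log gives maintainers the context they asked for.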

If I use only one GPU, is my training script right? Thanks.

Note-Liu avatar Dec 03 '19 01:12 Note-Liu

I didn't test the code without distributed training, so even with a single GPU I suggest you keep distributed training enabled. Also, your batch size is only 1, which could significantly hurt your performance.

GOATmessi7 avatar Dec 03 '19 13:12 GOATmessi7
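For a single GPU, keeping distributed training usually means starting the script through PyTorch's launcher with one process. The sketch below only assembles such a command from the arguments posted earlier in this thread. It assumes the `torch.distributed.launch` module that PyTorch shipped at the time; the helper name `build_launch_command` and the port value are illustrative, and whether `main.py` needs an additional flag to switch on its distributed mode is not stated in this thread.

```python
import shlex
import sys


def build_launch_command(script_args, nproc_per_node=1, master_port=29500):
    """Wrap a training command in the PyTorch distributed launcher.

    One process per GPU, so nproc_per_node=1 for a single-GPU machine.
    """
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}",
        f"--master_port={master_port}",
        *script_args,
    ]


if __name__ == "__main__":
    # Arguments copied from the training command posted in this thread.
    train_args = shlex.split(
        "main.py --cfg config/yolov3_baseline.cfg -d VOC --tfboard --ngpu 1 "
        "--checkpoint weights/darknet53_feature_mx.pth --start_epoch 0 "
        "--half --log_dir log/VOC -s 608"
    )
    print(" ".join(shlex.quote(part) for part in build_launch_command(train_args)))
```

The launcher passes a `--local_rank` argument to the script, so the script has to accept it.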

@Note-Liu have you solved it?

lxyyang avatar Dec 05 '19 12:12 lxyyang

@Note-Liu have you solved it?

No. [cry][cry]

Note-Liu avatar Dec 06 '19 12:12 Note-Liu

I think it may be caused by an incompatible gcc version when building the DCN module. The same thing happened to me once when I compiled the CenterNet code. You can update gcc to 5.0 or above and try the latest DCN module from the original GitHub repository.

eternalgogi92 avatar Mar 10 '20 03:03 eternalgogi92
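Before rebuilding the DCN extension, the installed compiler can be compared against the 5.0 minimum suggested above. This is a small sketch with hypothetical helper names (`parse_version`, `meets_minimum`). It relies on `gcc -dumpversion`, which prints either a full version such as `4.8.5` or only the major number such as `7`, depending on how gcc was built.

```python
import shutil
import subprocess


def parse_version(text):
    """Turn '5.4.0' or '7' into a tuple of ints, e.g. (5, 4, 0) or (7,)."""
    return tuple(int(part) for part in text.strip().split(".") if part.isdigit())


def meets_minimum(version_text, minimum=(5, 0)):
    """Return True if the version is at least `minimum`."""
    version = parse_version(version_text)
    # Pad short versions such as (7,) so that (5,) compares equal to (5, 0).
    version += (0,) * (len(minimum) - len(version))
    return version >= minimum


if __name__ == "__main__":
    gcc = shutil.which("gcc")
    if gcc is None:
        print("gcc not found on PATH")
    else:
        out = subprocess.run([gcc, "-dumpversion"], capture_output=True, text=True)
        status = "ok" if meets_minimum(out.stdout) else "older than 5.0"
        print(f"gcc {out.stdout.strip()}: {status}")
```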