M3DSSD
Test code bug in training
Thanks for your research!
When I run python scripts/train_rpn_3d.py --config=kitti_3d_base --exp_name base, training runs normally, but an error appears during testing. My torch version is 0.4.1. It looks like an object type error. Have you ever encountered this bug? Thank you!
Epoch:9
acc/fg: 0.950
acc/bg: 0.997
misc/z: 0.641
misc/ry: 0.311
acc/iou: 0.856
loss/ttloss: 0.551
testing
0%| | 0/3769 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "scripts/train_rpn_3d.py", line 323, in <module>
    main(args)
  File "scripts/train_rpn_3d.py", line 288, in main
    iou_3d = test_kitti_3d(dataset_val, rpn_net, conf, results_path, paths.data, writer=writer)
  File "/**/**/model/M3DSSD-master/lib/rpn_util.py", line 1794, in test_kitti_3d
    aboxes = im_detect_3d(im, net, rpn_conf, imobj)
  File "/**/**/model/M3DSSD-master/lib/rpn_util.py", line 1462, in im_detect_3d
    bbox_x3d = bbox_x3d * bbox_stds[0, 4] + bbox_means[0, 4]
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.DoubleTensor for argument #2 'other'
I just fixed the bug. Modify rpn_util.py at lines 1444, 1445 and 1446:
anchors = torch.from_numpy(rpn_conf.anchors).cuda().float()
bbox_means = torch.from_numpy(rpn_conf.bbox_means).cuda().float()
bbox_stds = torch.from_numpy(rpn_conf.bbox_stds).cuda().float()
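A minimal sketch of why the .float() cast matters, assuming the values stored in rpn_conf are NumPy arrays with NumPy's default float64 dtype. The array shape and contents below are made up for illustration, and the tensors stay on the CPU so the snippet runs without a GPU:

```python
import numpy as np
import torch

# Hypothetical stand-in for rpn_conf.bbox_stds; NumPy creates float64 by default.
bbox_stds_np = np.ones((1, 11))

# torch.from_numpy keeps the NumPy dtype, so this is a double tensor.
as_double = torch.from_numpy(bbox_stds_np)

# .float() converts to float32, the dtype of typical network outputs.
as_float = torch.from_numpy(bbox_stds_np).float()

print(as_double.dtype)  # torch.float64
print(as_float.dtype)   # torch.float32
```

Casting once when the tensors are created keeps every later arithmetic step in float32.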
Moreover, at line 1516, sorted_inds = (-aboxes[:, 4]).argsort() fails because the tensor doesn't support argsort(); torch.sort() should be used instead.
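A sketch of the torch.sort() replacement, assuming aboxes is a 2-D tensor whose column 4 holds the detection score (the box values here are invented). torch.sort with descending=True returns both the sorted values and their indices, so the indices can stand in for the result of (-aboxes[:, 4]).argsort():

```python
import torch

# Made-up detections: four box coordinates followed by a score in column 4.
aboxes = torch.tensor([
    [0.0, 0.0, 10.0, 10.0, 0.2],
    [1.0, 1.0, 12.0, 12.0, 0.9],
    [2.0, 2.0, 14.0, 14.0, 0.5],
])

# Sort scores from highest to lowest and keep only the index tensor.
_, sorted_inds = torch.sort(aboxes[:, 4], descending=True)

# Reorder the detections by descending score.
aboxes_sorted = aboxes[sorted_inds]

print(sorted_inds.tolist())  # [1, 2, 0]
```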
Finally, how can I train with multiple GPUs?
Thanks for contributing the fix. I haven't tested multi-GPU training; you can modify the code to support this feature.
Hi! To train with multiple GPUs, modify the code in lib/core/init_training_model as follows:
# Default to GPU 0 when no devices are specified.
if 'CUDA_VISIBLE_DEVICES' not in os.environ:
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
# Visible GPUs are renumbered from 0 inside the process.
device_ids = list(range(len(os.environ['CUDA_VISIBLE_DEVICES'].split(','))))
network = torch.nn.DataParallel(network, device_ids)
network.to('cuda')
Then run CUDA_VISIBLE_DEVICES='your gpu device ids' python scripts/train_rpn_3d.py --config=config --exp_name=exp_name to train on multiple GPUs.
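To make the device_ids step above easier to check without a GPU, here is a small standard-library sketch. The helper name visible_device_ids is hypothetical and not part of M3DSSD; unlike the snippet above, it reads the variable with a default of '0' instead of writing it back to the environment. It returns indices 0..n-1 because CUDA renumbers the visible devices from 0 inside the process:

```python
import os


def visible_device_ids(environ=os.environ):
    """Return indices 0..n-1 for the GPUs listed in CUDA_VISIBLE_DEVICES."""
    # Fall back to a single GPU when the variable is unset.
    visible = environ.get('CUDA_VISIBLE_DEVICES', '0')
    # Ignore empty entries such as a trailing comma.
    entries = [e for e in visible.split(',') if e.strip()]
    return list(range(len(entries)))


# Physical GPUs 2 and 3 appear as devices 0 and 1 to the process.
print(visible_device_ids({'CUDA_VISIBLE_DEVICES': '2,3'}))  # [0, 1]
```

The returned list is what would be passed as the second argument to torch.nn.DataParallel.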