Hi, thanks for your great work. I just started exploring MapTRv2, and when I started training I hit this error:
2023-09-19 17:19:58,978 - mmdet - INFO - workflow: [('train', 1)], max: 24 epochs
2023-09-19 17:19:58,979 - mmdet - INFO - Checkpoints will be saved to /home/guxunjia/project/MapTR_v2/work_dirs/maptrv2_nusc_r50_24ep_w_centerline by HardDiskBackend.
/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/models/utils/grid_mask.py:114: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:180.)
mask = torch.from_numpy(mask).to(x.dtype).cuda()
Traceback (most recent call last):
File "tools/train.py", line 259, in <module>
main()
File "tools/train.py", line 248, in main
custom_train_model(
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/bevformer/apis/train.py", line 27, in custom_train_model
custom_train_detector(
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/bevformer/apis/mmdet_train.py", line 199, in custom_train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True, **kwargs)
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
outputs = self.model.train_step(data_batch, self.optimizer,
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 237, in train_step
losses = self(**data)
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/maptr/detectors/maptrv2.py", line 197, in forward
return self.forward_train(**kwargs)
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
output = old_func(*new_args, **new_kwargs)
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/maptr/detectors/maptrv2.py", line 315, in forward_train
losses_pts = self.forward_pts_train(img_feats, lidar_feat, gt_bboxes_3d,
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/maptr/detectors/maptrv2.py", line 145, in forward_pts_train
outs = self.pts_bbox_head(
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
output = old_func(*new_args, **new_kwargs)
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/maptr/dense_heads/maptrv2_head.py", line 345, in forward
outputs = self.transformer(
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/maptr/modules/transformer.py", line 365, in forward
ouput_dic = self.get_bev_features(
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/maptr/modules/transformer.py", line 268, in get_bev_features
ret_dict = self.lss_bev_encode(
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/maptr/modules/transformer.py", line 230, in lss_bev_encode
encoder_outputdict = self.encoder(images,img_metas)
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/maptr/modules/encoder.py", line 1110, in forward
x, depth = super().forward(images, img_metas)
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
output = old_func(*new_args, **new_kwargs)
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/maptr/modules/encoder.py", line 282, in forward
geom = self.get_geometry_v1(
File "/home/guxunjia/anaconda3/envs/maptr_v2/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 214, in new_func
output = old_func(*new_args, **new_kwargs)
File "/home/guxunjia/project/MapTR_v2/projects/mmdet3d_plugin/maptr/modules/encoder.py", line 115, in get_geometry_v1
torch.inverse(post_rots)
RuntimeError: CUDA error: operation not supported when calling cusparseCreate(handle)
I have been running MapTR on an RTX 4090 for the past few weeks and everything has worked fine. The problem only shows up with the MapTR_v2 environment setup, so I suspect the new mmdetection folder in MapTR_v2 introduces some incompatibility.
I also think this problem is specific to the 4090 (https://github.com/facebookresearch/pytorch3d/issues/1399).
Could you look into this and make MapTR_v2 compatible with the RTX 4090? Thanks!
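In case it helps narrow this down, here is a small sketch that prints the PyTorch and CUDA details that usually matter for this kind of error. The helper name `describe_cuda_env` is my own and not part of MapTR_v2. My assumption is that the failure comes from a mismatch between the CUDA build of PyTorch and the GPU architecture (the RTX 4090 reports compute capability 8.9), but I have not confirmed that.

```python
import torch


def describe_cuda_env() -> dict:
    """Collect the PyTorch/CUDA facts relevant to a GPU-compatibility check."""
    info = {
        "torch": torch.__version__,
        # CUDA toolkit version PyTorch was built with (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
        # (major, minor) compute capability of GPU 0.
        info["capability"] = torch.cuda.get_device_capability(0)
        # Architectures this PyTorch binary was compiled for, e.g. "sm_86".
        info["compiled_archs"] = torch.cuda.get_arch_list()
    return info


if __name__ == "__main__":
    for key, value in describe_cuda_env().items():
        print(f"{key}: {value}")
```

If the GPU's capability is not covered by the architectures listed under `compiled_archs`, that would be consistent with the mismatch I suspect, though it would not prove it.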
An ugly workaround is to change every torch.inverse call so that the inversion runs on the CPU and the result is moved back to the GPU, as follows:
torch.inverse(lidar2ego_rots.to("cpu")).to("cuda:0")
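The same workaround can be wrapped in a small helper so that the device is not hardcoded to "cuda:0". This is only a sketch: `inverse_on_cpu` is a hypothetical name, not something that exists in MapTR_v2, and it assumes the input is a float32 or float64 tensor of square matrices (optionally batched).

```python
import torch


def inverse_on_cpu(mat: torch.Tensor) -> torch.Tensor:
    """Invert `mat` on the CPU, then move the result back to `mat`'s device.

    This sidesteps the GPU code path that fails inside torch.inverse, at the
    cost of a device round trip on every call.
    """
    return torch.inverse(mat.cpu()).to(mat.device)


if __name__ == "__main__":
    # A batch of two 3x3 matrices; works the same for CUDA tensors.
    rots = torch.eye(3).repeat(2, 1, 1) * 2.0
    print(inverse_on_cpu(rots))
```

Each call site such as `torch.inverse(post_rots)` in encoder.py would then become `inverse_on_cpu(post_rots)`. The matrices involved are small, so the extra copies should be cheap, but I have not measured the impact on training speed.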
useful!