MapTR
Multi-GPU training fails
I am training MapTR on four RTX 4090 GPUs with the nuScenes dataset and get the error below. What could be causing it?
Traceback (most recent call last):
  File "./tools/train.py", line 260, in <module>
    custom_train_detector(
  File "/workspace/code/projects/mmdet3d_plugin/bevformer/apis/mmdet_train.py", line 75, in custom_train_detector
    model = MMDistributedDataParallel(
  File "/usr/local/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 496, in __init__
    dist._verify_model_across_ranks(self.process_group, parameters)
RuntimeError: replicas[0][0] in this process with sizes [80, 128] appears not to match sizes of the same param in process 0.
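The message says that the first parameter on this rank has shape [80, 128] while the same parameter on rank 0 has a different shape, so the ranks appear to have built differently sized models. One way to narrow this down is to collect every rank's parameter names and shapes just before the model is wrapped in MMDistributedDataParallel and compare them against rank 0. Below is a minimal sketch of that comparison. The helper `find_shape_mismatches` and the parameter names in the demo are hypothetical and not part of MapTR; the commented-out gathering step assumes `torch.distributed` is already initialised and that `all_gather_object` is available in the installed PyTorch version.

```python
from typing import Dict, List, Optional, Tuple

Shape = Tuple[int, ...]
Mismatch = Tuple[str, int, Optional[Shape], Optional[Shape]]


def find_shape_mismatches(per_rank_shapes: List[Dict[str, Shape]]) -> List[Mismatch]:
    """Compare each rank's parameter shapes against rank 0.

    per_rank_shapes[i] maps parameter name -> shape for rank i.
    Returns (name, rank, shape_on_rank_0, shape_on_that_rank) for every
    parameter whose shape differs or that exists on only one side.
    """
    reference = per_rank_shapes[0]
    mismatches: List[Mismatch] = []
    for rank, shapes in enumerate(per_rank_shapes[1:], start=1):
        # Sorted union of names keeps the report order deterministic.
        for name in sorted(set(reference) | set(shapes)):
            ref_shape = reference.get(name)
            shape = shapes.get(name)
            if ref_shape != shape:
                mismatches.append((name, rank, ref_shape, shape))
    return mismatches


# Collecting the shapes inside the training process (assumption: run this
# right before the model is wrapped in MMDistributedDataParallel):
#
#   import torch.distributed as dist
#   local = {n: tuple(p.shape) for n, p in model.named_parameters()}
#   gathered = [None] * dist.get_world_size()
#   dist.all_gather_object(gathered, local)
#   if dist.get_rank() == 0:
#       for item in find_shape_mismatches(gathered):
#           print(item)

if __name__ == "__main__":
    # Hypothetical shapes, for illustration only.
    rank0 = {"example.embedding.weight": (100, 128), "example.fc.bias": (4,)}
    rank1 = {"example.embedding.weight": (80, 128), "example.fc.bias": (4,)}
    for item in find_shape_mismatches([rank0, rank1]):
        print(item)
```

If the report names a specific parameter, checking which config value or data-dependent quantity determines its first dimension on each rank would be the natural next step.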