Remote-Sensing-RVSA

mmcv version issue

Open libingDY opened this issue 2 years ago • 11 comments

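Hello, mmcv has been upgraded to its latest version, but the code in your mmcv_custom directory is still written against an older mmcv. Could you update the code?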

libingDY avatar Jul 04 '23 06:07 libingDY

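@libingDY Hello, this codebase is built entirely on the old versions of mmcv, mmdet, mmseg, and so on, and it runs fine on those versions. Because the old and new versions differ substantially, I have neither the time nor the need to port everything to the new versions, so I suggest running it with the old ones. As far as I know, the new versions may already support some of the functionality in mmcv_custom; you could check, and in that case you might not need mmcv_custom at all. In our future work we will use the full new-version stack. Thanks for your interest!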

DotWang avatar Jul 05 '23 08:07 DotWang

Thank you for your reply. I ran the code with older versions of mmcv/mmdet, but then hit the following problem:

Traceback (most recent call last):
  File "tools/train.py", line 153, in
    main()
  File "tools/train.py", line 142, in main
    train_detector(
  File "/data4/libing/bisai/OBBDetection/mmdet/apis/train.py", line 133, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/mmcv_full-1.7.1-py3.8-linux-x86_64.egg/mmcv/runner/epoch_based_runner.py", line 136, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/mmcv_full-1.7.1-py3.8-linux-x86_64.egg/mmcv/runner/epoch_based_runner.py", line 53, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/mmcv_full-1.7.1-py3.8-linux-x86_64.egg/mmcv/runner/epoch_based_runner.py", line 31, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/mmcv_full-1.7.1-py3.8-linux-x86_64.egg/mmcv/parallel/data_parallel.py", line 77, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/data4/libing/bisai/OBBDetection/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data4/libing/bisai/OBBDetection/mmdet/core/fp16/decorators.py", line 51, in new_func
    return old_func(*args, **kwargs)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/detectors/base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/detectors/obb/obb_two_stage.py", line 154, in forward_train
    x = self.extract_feat(img)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/detectors/obb/obb_two_stage.py", line 84, in extract_feat
    x = self.backbone(img)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/backbones/vitae_nc_win_rvsa_wsz7.py", line 777, in forward
    x = self.forward_features(x)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/backbones/vitae_nc_win_rvsa_wsz7.py", line 764, in forward_features
    x = blk(x, Hp, Wp)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/backbones/vitae_nc_win_rvsa_wsz7.py", line 540, in forward
    convX = self.drop_path(self.PCM(x_2d).permute(0, 2, 3, 1).contiguous().view(b, n, c))
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 532, in forward
    world_size = torch.distributed.get_world_size(process_group)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 711, in get_world_size
    return _get_group_size(group)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 263, in _get_group_size
    default_pg = _get_default_group()
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 347, in _get_default_group
    raise RuntimeError("Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

Have you run into a similar problem?

libingDY avatar Jul 05 '23 08:07 libingDY

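@libingDY Hello, I haven't run into this. It looks like a DDP (distributed training) issue. I didn't modify OBBDetection, so your launch command may be wrong.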

DotWang avatar Jul 05 '23 09:07 DotWang

OK, thank you very much for your reply.

libingDY avatar Jul 05 '23 10:07 libingDY

Hello, could you share the package versions of your old-version setup? Thanks.

zhongyas avatar Jul 07 '23 02:07 zhongyas

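@zhongyas If you mean mmcv-full, you can pin the version when installing it; I no longer have the old one on hand. If you mean OBBDetection and mmsegmentation, the complete frameworks are in the RSP repository; the RVSA repository only provides the relevant backbone, config, and related files.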

DotWang avatar Jul 07 '23 09:07 DotWang
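
Before launching training, it can help to confirm that the environment really holds an old-style mmcv. The following is a minimal sketch, not something taken from the RVSA repository: `installed_version` and `is_legacy_mmcv` are hypothetical helper names, and the cut-off at major version 2 is an assumption based on mmcv 2.0 moving the `mmcv.runner` and `mmcv.parallel` modules (both visible in the traceback above) out of the package. The thread does not state which exact version the author used; 1.7.1 is simply the mmcv-full version that appears in libingDY's traceback.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_legacy_mmcv(version: str) -> bool:
    """Assumption: any mmcv / mmcv-full release below 2.0 counts as 'old'."""
    major = int(version.split(".")[0])
    return major < 2


if __name__ == "__main__":
    # mmcv-full is the 1.x distribution name; mmcv is checked as a fallback.
    for name in ("mmcv-full", "mmcv"):
        found = installed_version(name)
        if found is not None:
            print(f"{name} {found}: legacy={is_legacy_mmcv(found)}")
```

If the check reports a 2.x release, the version can be pinned at install time with pip's `==` specifier, as DotWang suggests.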

Thank you very much for your reply.

zhongyas avatar Jul 07 '23 10:07 zhongyas

Hello, I'd like to ask: when this backbone is initialized, is the norm_cfg it uses SyncBN?

hhb442 avatar Oct 23 '23 12:10 hhb442

@hhb442 Yes, if you fine-tune ViTAE-RVSA on multiple GPUs. This is noted in the README.

DotWang avatar Oct 24 '23 12:10 DotWang
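
The multi-GPU versus single-GPU distinction in DotWang's reply can be sketched as a small config helper. This is only an illustration under stated assumptions: `pick_norm_cfg` is a hypothetical name, the launcher values follow the `--launcher` convention of mmdet's `tools/train.py` (`none` means non-distributed), and `BN` / `SyncBN` are the norm-layer type names mmcv uses in `norm_cfg`. Whether this particular backbone reads `norm_cfg` or hardcodes its normalization layers is not confirmed anywhere in the thread.

```python
def pick_norm_cfg(launcher: str) -> dict:
    """Return an mmcv-style norm_cfg suited to the given launcher.

    Assumption: launcher == "none" means single-process (non-distributed)
    training, where SyncBN has no process group to synchronize with.
    """
    distributed = launcher != "none"
    norm_type = "SyncBN" if distributed else "BN"
    return dict(type=norm_type, requires_grad=True)


# Single GPU, no distributed launcher: plain BN.
single_gpu_cfg = pick_norm_cfg("none")
# Multi-GPU run started through the PyTorch distributed launcher: SyncBN.
multi_gpu_cfg = pick_norm_cfg("pytorch")
```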

Could you provide an archive of your conda environment? I'd like to clone it to avoid version mismatch errors.

vxiaobai avatar Nov 30 '23 07:11 vxiaobai


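That "Default process group has not been initialized" error, if you are training on a single GPU, is most likely caused by nn.SyncBatchNorm; replacing it with nn.BatchNorm2d should fix it.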

regainOWO avatar Dec 14 '23 08:12 regainOWO
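
regainOWO's suggestion can be applied without editing the backbone source by swapping the layers on an already-built model. The sketch below assumes PyTorch is installed and that every `nn.SyncBatchNorm` in the model receives 4-D (N, C, H, W) input, as the convolutional branch in the traceback does; `sync_bn_to_bn2d` is a hypothetical helper, not part of the repository or of PyTorch.

```python
import torch
from torch import nn


def sync_bn_to_bn2d(module: nn.Module) -> nn.Module:
    """Recursively replace every nn.SyncBatchNorm with an equivalent nn.BatchNorm2d."""
    converted = module
    if isinstance(module, nn.SyncBatchNorm):
        converted = nn.BatchNorm2d(
            module.num_features,
            eps=module.eps,
            momentum=module.momentum,
            affine=module.affine,
            track_running_stats=module.track_running_stats,
        )
        # Both layer types share the same parameter and buffer names,
        # so learned weights and running statistics carry over directly.
        converted.load_state_dict(module.state_dict())
    # Copy the child list first so reassignment does not disturb iteration.
    for name, child in list(module.named_children()):
        converted.add_module(name, sync_bn_to_bn2d(child))
    return converted


# Toy stand-in for a conv branch; the real backbone would be passed instead.
toy = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.SyncBatchNorm(8), nn.ReLU())
toy = sync_bn_to_bn2d(toy)
output = toy(torch.randn(2, 3, 16, 16))  # runs on CPU with no process group
```

Calling the helper on the detector right after it is built, and before it is wrapped for training, would be the natural place for it in a single-GPU run. Multi-GPU runs should keep SyncBN.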