bevfusion
KeyError: "BEVFusion: 'encoders.camera.backbone.stages.0.blocks.0.attn.w_msa.relative_position_bias_table'"
I used my own lidar-only and camera-only .pth files to train the fusion model and encountered this problem. How can I solve it?
Please, did you solve it? I have the same problem.
I faced the same problem when trying to use the newly saved checkpoints from training a camera-only CenterHead detector. I am not sure whether it has anything to do with the CenterHead being different from TransFusion.
Let me be specific.
- I first train a new cam-only detector using the following:

```
torchpack dist-run -np 1 python tools/train.py \
  configs/nuscenes/det/centerhead/lssfpn/camera/256x704/swint/default.yaml \
  --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth
```
- The checkpoints of (1) are saved into the runs/ folder.
- Then I try to train a cam+lidar detector using one of the checkpoints saved in (2):
```
torchpack dist-run -np 1 python tools/train.py \
  configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml \
  --model.encoders.camera.backbone.init_cfg.checkpoint runs/run-531bf67d-d3138be2/epoch_20.pth \
  --load_from pretrained/lidar-only-det.pth
```
The errors are as follows:
```
Traceback (most recent call last):
  File "tools/train.py", line 68, in main
    model = build_model(cfg.model)
  File "/home/bevfusion/mmdet3d/models/builder.py", line 41, in build_model
    return build_fusion_model(cfg, train_cfg=train_cfg, test_cfg=test_cfg)
  File "/home/bevfusion/mmdet3d/models/builder.py", line 35, in build_fusion_model
    return FUSIONMODELS.build(
  File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 212, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/opt/conda/lib/python3.8/site-packages/mmcv/utils/registry.py", line 55, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "BEVFusion: 'encoders.camera.backbone.stages.0.blocks.0.attn.w_msa.relative_position_bias_table'"
```
The model you trained in (1) is a single-modality model that uses only camera images for object detection. That means you cannot use it as the camera backbone when training the fusion model.
Hi fdy61,
I notice that the same "pretrained/swint-nuimages-pretrained.pth" was used as the checkpoint for training the cam-only detector, and also as the checkpoint to train the cam+lidar detector.
This is why I was under the impression that the saved cam-only checkpoint "runs/run-531bf67d-d3138be2/epoch_20.pth" would be usable to train the cam+lidar detector.
May I have your kind advice: how then can I train an appropriate camera backbone for training the cam+lidar fusion model?
Thanks
pretrained/swint-nuimages-pretrained.pth is just an image backbone model which, if I remember correctly, has the same structure as SwinTransformer. But runs/run-531bf67d-d3138be2/epoch_20.pth is an entire model, and its weight parameters have changed completely. You can print out the model weight names to see.
Hi fdy61,
Thanks for your tips. Indeed, when I run "ls -l" on the two files, they are very different:
```
ls -l pretrained/swint-nuimages-pretrained.pth run/run-531bf67d-d3138be2/epoch_20.pth
-rw-r--r-- 1 root root 110370759 Sep 26  2022 pretrained/swint-nuimages-pretrained.pth
-rw-r--r-- 1 root root 523728374 May 27 03:45 run/run-531bf67d-d3138be2_epoch_20.pth
```
I am not sure how to print and examine their model weight names. Can you kindly advise me?
So the question then becomes: how do I re-train the cam+lidar fusion model using my new image data? I believe you may have the same goal, since you mentioned in your first post, "I use my own lidar-only and camera-only pth to train the fusion model...". Do you also have a camera-only .pth checkpoint? Did you manage to solve it? If so, can you advise me, please?
Thanks
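As a concrete starting point, here is a minimal sketch of how one could print the two checkpoints' state_dict keys, as fdy61 suggests. It is an assumption rather than code from the repo: it presumes both files are ordinary PyTorch checkpoints, and that the trained one wraps its weights under a "state_dict" key, as mmcv-based runs usually do.

```python
import torch

def load_state_dict(path):
    ckpt = torch.load(path, map_location="cpu")
    # mmcv-style training checkpoints wrap the weights under "state_dict";
    # bare backbone files may store them at the top level.
    return ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt

for path in ("pretrained/swint-nuimages-pretrained.pth",
             "runs/run-531bf67d-d3138be2/epoch_20.pth"):
    keys = list(load_state_dict(path).keys())
    print(f"{path}: {len(keys)} keys, first few:")
    for k in keys[:5]:
        print("   ", k)
```

Printing the keys side by side should make it visible whether the trained checkpoint's keys carry a model-level prefix that the bare backbone file lacks.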
Can I do the following instead:
```
torchpack dist-run -np 8 python tools/train.py \
  configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml \
  --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth \
  --load_from runs/run-531bf67d-d3138be2/epoch_20.pth \
  --load_from pretrained/lidar-only-det.pth
```
That is, have two "--load_from" options, loading both the cam-only checkpoint and the lidar-only checkpoint.
Thanks
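For what it's worth, --load_from is a single option, so specifying it twice would not load both checkpoints; the later value would typically just override the earlier one. A hedged workaround, if one really wanted weights from both files, would be to merge the two state_dicts offline into a single checkpoint and pass that one file to --load_from. This is a sketch only, not a tool from the repo: it assumes both checkpoints are mmcv-style dicts keyed in the full fusion-model namespace, and resolves collisions (e.g. the detection head) arbitrarily in favor of the LiDAR file.

```python
import torch

# Hypothetical merge -- not part of the bevfusion repo. Assumes both
# checkpoints are mmcv-style dicts with a "state_dict" entry keyed in the
# full fusion-model namespace (encoders.camera.*, encoders.lidar.*, ...).
cam = torch.load("runs/run-531bf67d-d3138be2/epoch_20.pth", map_location="cpu")
lidar = torch.load("pretrained/lidar-only-det.pth", map_location="cpu")

merged = dict(cam["state_dict"])    # start from the camera-only weights
merged.update(lidar["state_dict"])  # LiDAR wins on any shared key (e.g. heads)

# "cam_lidar_merged.pth" is an illustrative output name.
torch.save({"state_dict": merged}, "pretrained/cam_lidar_merged.pth")
print(f"wrote {len(merged)} merged keys")
```

The merged file could then be passed as a single --load_from; whether those weights are actually compatible with the fusion architecture is a separate question.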
```
torchpack dist-run -np 8 python tools/train.py \
  configs/nuscenes/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml \
  --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth \
  --load_from pretrained/lidar-only-det.pth
```
After some debugging, it seems that the issue is that the state_dict keys of pretrained/swint-nuimages-pretrained.pth differ from those of runs/run-531bf67d-d3138be2/epoch_20.pth by the prefix "encoders.camera.backbone". I am hoping that if I change the keys to skip this prefix, then this error will go away!
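To make that idea concrete, here is a minimal sketch of the prefix-stripping step. It is an untested assumption rather than a verified fix: it keeps only the keys under the prefix "encoders.camera.backbone.", strips that prefix so the remaining keys look like a bare SwinTransformer state_dict, and writes the result to a new file.

```python
import torch

# Hypothetical prefix-stripping script, not a verified fix. Assumes the
# trained checkpoint is an mmcv-style dict whose backbone weights live
# under keys prefixed with "encoders.camera.backbone.".
PREFIX = "encoders.camera.backbone."

ckpt = torch.load("runs/run-531bf67d-d3138be2/epoch_20.pth", map_location="cpu")
backbone = {
    k[len(PREFIX):]: v
    for k, v in ckpt["state_dict"].items()
    if k.startswith(PREFIX)
}

# "epoch_20_backbone_only.pth" is an illustrative output name.
torch.save({"state_dict": backbone}, "pretrained/epoch_20_backbone_only.pth")
print(f"kept {len(backbone)} backbone keys")
```

The resulting file could then be tried in place of swint-nuimages-pretrained.pth via --model.encoders.camera.backbone.init_cfg.checkpoint.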
Just use pretrained/swint-nuimages-pretrained.pth and it's done. As I said, runs/run-531bf67d-d3138be2/epoch_20.pth is the camera-only model, which was trained only on images, and its state_dict keys are totally different from those of pretrained/swint-nuimages-pretrained.pth.
Hi fdy61, thanks. The background is that I have trained the camera-only model, and using the "runs/run-531bf67d-d3138be2/epoch_20.pth" checkpoint I obtained mAP improvements over "pretrained/camera-only-det.pth".
Hence, my thinking is to use this "epoch_20.pth" checkpoint to train the cam+lidar fusion model.
OK, I will try to check/confirm whether the state_dict keys of "epoch_20.pth" are totally different from those of "camera-only-det.pth", or whether they only differ by the prefix "encoders.camera.backbone". I will post an update here.
Thanks
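One quick way to run that check is a key-set comparison, along the same lines as the earlier inspection sketch (again assuming mmcv-style checkpoints that nest their weights under a "state_dict" entry; the paths are the ones quoted above):

```python
import torch

def keys_of(path):
    ckpt = torch.load(path, map_location="cpu")
    # Unwrap mmcv-style checkpoints that nest weights under "state_dict".
    return set(ckpt.get("state_dict", ckpt).keys())

a = keys_of("runs/run-531bf67d-d3138be2/epoch_20.pth")
b = keys_of("pretrained/camera-only-det.pth")

print("identical key sets:", a == b)
print("keys only in epoch_20.pth:", len(a - b))
print("keys only in camera-only-det.pth:", len(b - a))
for k in sorted(a ^ b)[:10]:  # sample the symmetric difference
    print("  differs:", k)
```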
Hi fdy61,
Since we have to use pretrained/swint-nuimages-pretrained.pth to train the C+L fusion model, do you have any idea how this pretrained model was trained? Alternatively, is there a way we can replicate the training process for the pretrained models?
Thanks~