Clarification on Training of 'swint-nuimages-pretrained.pth'
I am currently working on training the C+L BEVFusion model and have encountered some confusion regarding the checkpoints being used during the process.
The training procedure appears to initialize the fusion model from two checkpoints, a lidar-only model and a pretrained camera model:
- Lidar-only model (lidar-only-det.pth)
- Pretrained camera model (swint-nuimages-pretrained.pth)
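However, I noticed that the camera-only detection model is not used alongside the lidar-only model. Pairing those two would seem like the natural choice for a fusion model, so I would like to understand why swint-nuimages-pretrained.pth is used for the camera branch instead.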
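For context, one way to check which parts of the network a checkpoint covers is to group its parameter names by prefix. The following is a minimal sketch of that idea, not code from the repository: `summarize_prefixes` is a hypothetical helper, and the key names in `example_keys` are made up for illustration rather than read from either .pth file.

```python
from collections import Counter


def summarize_prefixes(keys, depth=1):
    """Count parameter names by their first `depth` dot-separated components."""
    counts = Counter()
    for key in keys:
        prefix = ".".join(key.split(".")[:depth])
        counts[prefix] += 1
    return dict(counts)


# Hypothetical parameter names, for illustration only.
# With PyTorch installed, real names could be obtained with something like:
#   ckpt = torch.load(path, map_location="cpu")
#   keys = ckpt.get("state_dict", ckpt).keys()
# (assuming the file stores its weights under a "state_dict" entry).
example_keys = [
    "backbone.patch_embed.projection.weight",
    "backbone.stages.0.blocks.0.attn.w_msa.qkv.weight",
    "neck.lateral_convs.0.conv.weight",
    "roi_head.bbox_head.fc_cls.weight",
]

print(summarize_prefixes(example_keys, depth=1))
```

Running the same summary on both checkpoints would show whether the camera checkpoint carries only backbone weights or also task-specific heads, which is part of what I am trying to understand.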
Could you please explain how swint-nuimages-pretrained.pth was trained, for example which dataset, task, and training configuration were used? Knowing how this pretrained camera model was produced would help me understand how it is integrated into the C+L BEVFusion model.
Thanks!