
multimodal model / pipeline

AvivSham opened this issue 2 years ago · 11 comments

Hi all, thank you for this wonderful repo! Do you support multimodal models (point clouds + RGB)? If not, do you have a pipeline/dataloader we can use for that purpose?

Cheers, A

AvivSham avatar Aug 22 '22 15:08 AvivSham

Hi A, Currently, we only have one multimodal model for KITTI dataset, and you can get some information from this config: https://github.com/open-mmlab/OpenPCDet/blob/master/tools/cfgs/kitti_models/voxel_rcnn_car_focal_multimodal.yaml.

The multimodal models with multiple cameras are still not supported yet, but PRs related to this are welcome.

sshaoshuai avatar Aug 29 '22 22:08 sshaoshuai

Hi @sshaoshuai, thank you for your quick response. Does this model use RGB + point clouds as its input? Looking at the paper, it seems to use only point clouds as the input modality. Do you refer to multiple cameras as multiple modalities?

AvivSham avatar Aug 30 '22 08:08 AvivSham

If you run the model, you can find that it actually has the option to use the image, as shown here: https://github.com/open-mmlab/OpenPCDet/blob/master/pcdet/models/backbones_3d/focal_sparse_conv/focal_sparse_conv.py#L203-L205
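The idea behind that code path can be sketched roughly as follows (a minimal NumPy sketch, not the actual `focal_sparse_conv` implementation — all names here are hypothetical): project each voxel center into the image plane, sample image features at the projected pixel, and fuse them with the voxel features.

```python
import numpy as np

def fuse_image_features(voxel_feats, voxel_uv, image_feats):
    """Hypothetical sketch of LiDAR-image fusion: add image features
    (sampled at each voxel's projected pixel) to the voxel features.

    voxel_feats: (N, C) per-voxel features
    voxel_uv:    (N, 2) integer pixel coordinates of projected voxel centers
    image_feats: (H, W, C) image feature map
    """
    # Clamp projected coordinates to the image bounds
    u = np.clip(voxel_uv[:, 0], 0, image_feats.shape[1] - 1)
    v = np.clip(voxel_uv[:, 1], 0, image_feats.shape[0] - 1)
    sampled = image_feats[v, u]   # (N, C) nearest-pixel lookup
    return voxel_feats + sampled  # simple additive fusion

# Toy example: 3 voxels, 4-channel features, 8x8 feature map of ones
voxel_feats = np.zeros((3, 4))
voxel_uv = np.array([[0, 0], [5, 2], [100, 100]])  # last one gets clipped
image_feats = np.ones((8, 8, 4))
fused = fuse_image_features(voxel_feats, voxel_uv, image_feats)
```

The real backbone does this with learned image features and a focal-sparse-convolution design; the sketch only shows the projection-sample-fuse pattern.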

> Do you refer to multiple cameras as multiple modalities?

No, I mean LiDAR + multiple cameras, as in the NuScenes/Waymo setting.

sshaoshuai avatar Aug 31 '22 14:08 sshaoshuai

And what about the pipeline? What do I need to change in my config file to train this model in a multimodal setup?

AvivSham avatar Sep 02 '22 07:09 AvivSham

This model was committed by its authors, and by default it is trained with multimodal inputs (the training/inference is the same as for other models).

sshaoshuai avatar Sep 02 '22 09:09 sshaoshuai

I see. Can you please help me and write down which models receive point clouds/RGB as inputs?

AvivSham avatar Sep 03 '22 10:09 AvivSham

For now, only the model I mentioned above uses point clouds + RGB as inputs.

sshaoshuai avatar Sep 05 '22 17:09 sshaoshuai

Sorry if I was not clear. Can you please state which of the supported models receive point clouds as input and which receive RGB as input?

Thank you very much for your help.

AvivSham avatar Sep 06 '22 06:09 AvivSham

Hi,

The original Voxel R-CNN receives only point clouds as input. Its config is https://github.com/open-mmlab/OpenPCDet/blob/master/tools/cfgs/kitti_models/voxel_rcnn_car.yaml

In addition, we modified it to be multi-modal in this config: https://github.com/open-mmlab/OpenPCDet/blob/master/tools/cfgs/kitti_models/voxel_rcnn_car_focal_multimodal.yaml. It receives both point clouds and images. This improvement is from this paper.

yukang2017 avatar Sep 06 '22 13:09 yukang2017

What about the other models you support in this repo?

AvivSham avatar Sep 06 '22 13:09 AvivSham

You can easily adapt other config files by replacing their DATA_CONFIG and BACKBONE_3D sections with the ones from voxel_rcnn_car_focal_multimodal.yaml, so that they support multi-modal inputs.

For example, you can change https://github.com/open-mmlab/OpenPCDet/blob/master/tools/cfgs/kitti_models/pv_rcnn.yaml using the DATA_CONFIG and BACKBONE_3D information from voxel_rcnn_car_focal_multimodal.yaml.
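In practice you would edit the YAML file directly, but the merge can be sketched programmatically. This is a minimal sketch with toy stand-ins for the two parsed configs (`merge_multimodal_cfg` is a hypothetical helper, and the keys inside the toy dicts are placeholders, not the exact contents of the real YAML files); it assumes only the structure stated above, where DATA_CONFIG is a top-level section and BACKBONE_3D sits under MODEL:

```python
def merge_multimodal_cfg(base_cfg: dict, mm_cfg: dict) -> dict:
    """Hypothetical helper: copy the DATA_CONFIG section and the
    MODEL.BACKBONE_3D section from a multimodal config (e.g.
    voxel_rcnn_car_focal_multimodal.yaml) into another model's
    config (e.g. pv_rcnn.yaml)."""
    merged = dict(base_cfg)
    merged["DATA_CONFIG"] = mm_cfg["DATA_CONFIG"]
    merged["MODEL"] = dict(base_cfg["MODEL"])  # keep the rest of the model
    merged["MODEL"]["BACKBONE_3D"] = mm_cfg["MODEL"]["BACKBONE_3D"]
    return merged

# Toy stand-ins for the two parsed YAML files (placeholder keys/values)
pv_rcnn = {
    "DATA_CONFIG": {"_BASE_CONFIG_": "kitti_dataset.yaml"},
    "MODEL": {"NAME": "PVRCNN", "BACKBONE_3D": {"NAME": "VoxelBackBone8x"}},
}
multimodal = {
    "DATA_CONFIG": {"_BASE_CONFIG_": "kitti_dataset.yaml", "USE_IMAGE": True},
    "MODEL": {"NAME": "VoxelRCNN", "BACKBONE_3D": {"NAME": "FocalBackbone"}},
}

merged = merge_multimodal_cfg(pv_rcnn, multimodal)
```

The point of the sketch is just which sections move: the detection head and the rest of the model stay from pv_rcnn.yaml, while the data pipeline and 3D backbone come from the multimodal config.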

yukang2017 avatar Sep 06 '22 15:09 yukang2017

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Oct 07 '22 02:10 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Oct 22 '22 02:10 github-actions[bot]