mmdetection3d [Bug] Poor performance BEVFusion Demo

Prerequisite

[X] I have searched Issues and Discussions but cannot get the expected help.
[X] I have read the FAQ documentation but cannot get the expected help.
[X] The bug has not been fixed in the latest version (dev-1.x) or latest version (dev-1.0).

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmdetection3d

Environment

sys.platform: linux
Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0: Quadro RTX 4000 with Max-Q Design
CUDA_HOME: /home/s0001895/miniconda3/envs/bevfusion/
NVCC: Cuda compilation tools, release 11.3, V11.3.109
GCC: gcc (Ubuntu 10.5.0-1ubuntu1~22.04) 10.5.0
PyTorch: 2.1.2+cu118
PyTorch compiling details: PyTorch built with:

GCC 9.3
C++ Version: 201703
Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.8
NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_90,code=sm_90
CuDNN 8.7
Magma 2.6.1
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.16.2+cu118
OpenCV: 4.9.0
MMEngine: 0.10.2
MMDetection: 3.3.0
MMDetection3D: 1.4.0+fe25f7a
spconv2.0: True

Reproduces the problem - code sample

Basically just follow the readme on https://github.com/open-mmlab/mmdetection3d/tree/main/projects/BEVFusion

python projects/BEVFusion/setup.py develop

(Download provided checkpoint)

python projects/BEVFusion/demo/multi_modality_demo.py demo/data/nuscenes/n015-2018-07-24-11-22-45+0800__LIDAR_TOP__1532402927647951.pcd.bin demo/data/nuscenes/ demo/data/nuscenes/n015-2018-07-24-11-22-45+0800.pkl projects/BEVFusion/configs/bevfusion_lidar-cam_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d.py ${CHECKPOINT_FILE} --cam-type all --score-thr 0.2 --show

Reproduces the problem - error message

The resulting detections are as follows

I also get the following warnings/errors:

The model and loaded state dict do not match exactly

unexpected key in source state_dict: vtransform.dx, vtransform.bx, vtransform.nx, vtransform.frustum, vtransform.dtransform.0.weight, vtransform.dtransform.0.bias, vtransform.dtransform.1.weight, vtransform.dtransform.1.bias, vtransform.dtransform.1.running_mean, vtransform.dtransform.1.running_var, vtransform.dtransform.1.num_batches_tracked, vtransform.dtransform.3.weight, vtransform.dtransform.3.bias, vtransform.dtransform.4.weight, vtransform.dtransform.4.bias, vtransform.dtransform.4.running_mean, vtransform.dtransform.4.running_var, vtransform.dtransform.4.num_batches_tracked, vtransform.dtransform.6.weight, vtransform.dtransform.6.bias, vtransform.dtransform.7.weight, vtransform.dtransform.7.bias, vtransform.dtransform.7.running_mean, vtransform.dtransform.7.running_var, vtransform.dtransform.7.num_batches_tracked, vtransform.depthnet.0.weight, vtransform.depthnet.0.bias, vtransform.depthnet.1.weight, vtransform.depthnet.1.bias, vtransform.depthnet.1.running_mean, vtransform.depthnet.1.running_var, vtransform.depthnet.1.num_batches_tracked, vtransform.depthnet.3.weight, vtransform.depthnet.3.bias, vtransform.depthnet.4.weight, vtransform.depthnet.4.bias, vtransform.depthnet.4.running_mean, vtransform.depthnet.4.running_var, vtransform.depthnet.4.num_batches_tracked, vtransform.depthnet.6.weight, vtransform.depthnet.6.bias, vtransform.downsample.0.weight, vtransform.downsample.1.weight, vtransform.downsample.1.bias, vtransform.downsample.1.running_mean, vtransform.downsample.1.running_var, vtransform.downsample.1.num_batches_tracked, vtransform.downsample.3.weight, vtransform.downsample.4.weight, vtransform.downsample.4.bias, vtransform.downsample.4.running_mean, vtransform.downsample.4.running_var, vtransform.downsample.4.num_batches_tracked, vtransform.downsample.6.weight, vtransform.downsample.7.weight, vtransform.downsample.7.bias, vtransform.downsample.7.running_mean, vtransform.downsample.7.running_var, vtransform.downsample.7.num_batches_tracked

missing keys in source state_dict: view_transform.dx, view_transform.bx, view_transform.nx, view_transform.frustum, view_transform.dtransform.0.weight, view_transform.dtransform.0.bias, view_transform.dtransform.1.weight, view_transform.dtransform.1.bias, view_transform.dtransform.1.running_mean, view_transform.dtransform.1.running_var, view_transform.dtransform.3.weight, view_transform.dtransform.3.bias, view_transform.dtransform.4.weight, view_transform.dtransform.4.bias, view_transform.dtransform.4.running_mean, view_transform.dtransform.4.running_var, view_transform.dtransform.6.weight, view_transform.dtransform.6.bias, view_transform.dtransform.7.weight, view_transform.dtransform.7.bias, view_transform.dtransform.7.running_mean, view_transform.dtransform.7.running_var, view_transform.depthnet.0.weight, view_transform.depthnet.0.bias, view_transform.depthnet.1.weight, view_transform.depthnet.1.bias, view_transform.depthnet.1.running_mean, view_transform.depthnet.1.running_var, view_transform.depthnet.3.weight, view_transform.depthnet.3.bias, view_transform.depthnet.4.weight, view_transform.depthnet.4.bias, view_transform.depthnet.4.running_mean, view_transform.depthnet.4.running_var, view_transform.depthnet.6.weight, view_transform.depthnet.6.bias, view_transform.downsample.0.weight, view_transform.downsample.1.weight, view_transform.downsample.1.bias, view_transform.downsample.1.running_mean, view_transform.downsample.1.running_var, view_transform.downsample.3.weight, view_transform.downsample.4.weight, view_transform.downsample.4.bias, view_transform.downsample.4.running_mean, view_transform.downsample.4.running_var, view_transform.downsample.6.weight, view_transform.downsample.7.weight, view_transform.downsample.7.bias, view_transform.downsample.7.running_mean, view_transform.downsample.7.running_var
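Comparing the two lists above, the unexpected keys and the missing keys appear to differ only in their prefix (vtransform. in the checkpoint, view_transform. in the model). The following is a minimal sketch of how such a prefix could be renamed in a state dict, assuming the mismatch is purely a naming difference. The helper name rename_prefix and the toy dictionary are illustrative and not part of mmdetection3d, and renaming alone may not be enough if the checkpoint differs from the model in other ways.

```python
from collections import OrderedDict


def rename_prefix(state_dict, old="vtransform.", new="view_transform."):
    """Return a copy of state_dict where keys starting with `old` start with `new` instead.

    Hypothetical helper for illustration; values are passed through unchanged.
    """
    renamed = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(old):
            key = new + key[len(old):]
        renamed[key] = value
    return renamed


# Toy stand-in for a checkpoint's state dict (real values would be tensors).
toy_state_dict = {
    "vtransform.dx": 1,
    "vtransform.depthnet.0.weight": 2,
    "pts_backbone.conv.weight": 3,
}
renamed_keys = list(rename_prefix(toy_state_dict))
print(renamed_keys)

# With a real checkpoint one would, as an assumption about its layout,
# load it with torch.load(path, map_location="cpu"), apply rename_prefix
# to the dict holding the weights, and write it back with torch.save.
```

Keys without the old prefix are left untouched, so the rest of the model weights would load as before.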

Has anyone experienced something similar?

Thanks.

Additional information

No response

Jan 26 '24 01:01 antrose99

Try using the checkpoint 'bevfusion_lidar-cam_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d-5239b1af.pth'.

Feb 26 '24 05:02 JJangD

I met the same problem. I've tried two checkpoints: bevfusion_lidar-cam_voxel0075_second_secfpn_8xb4-cyclic-20e_nus-3d-5239b1af.pth and bevfusion_converter.pth. The results are not good. Can anyone help?

Mar 27 '24 02:03 SampWEI

https://github.com/open-mmlab/mmdetection3d/issues/2702#issuecomment-1828991207

Check the installed spconv version and install spconv as described in the comment linked above. With that change, the results seem better.
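As a rough sketch of the version check, the snippet below lists any installed distributions whose name starts with "spconv", using only the standard library. The function name installed_spconv_packages is hypothetical. Which version is the right one is stated in the linked comment, not here.

```python
from importlib import metadata


def installed_spconv_packages():
    """Return {distribution name: version} for installed packages named spconv*.

    Hypothetical helper; it inspects package metadata and does not import spconv.
    """
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name and name.lower().startswith("spconv"):
            found[name] = dist.version
    return found


packages = installed_spconv_packages()
if packages:
    for name, version in sorted(packages.items()):
        print(f"{name}=={version}")
else:
    print("No spconv distribution found in this environment.")
```

Matching on the name prefix is a deliberate choice, since spconv wheels are published under CUDA-specific names as well as the plain one.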

Mar 27 '24 03:03 SampWEI

How was the Bird's Eye View (BEV) image below obtained? When I run multi_modality_demo.py, I only get the top-down multi-view image.

Apr 01 '24 06:04 bikenan