
Training Pointpillar model for my own dataset but get 0 mAP

Open walliampeace opened this issue 3 years ago • 9 comments

Is my training config file wrong?

walliampeace avatar Nov 08 '21 23:11 walliampeace

Can you provide your training log, it contains the training loss and evaluation results

ZCMax avatar Nov 09 '21 03:11 ZCMax

The training log looks like this (image attached); everything seems to be 0. The training dataset was converted to KITTI format and preprocessed by running the create_data.py script, like this (image attached). The training config looks like this (image attached): I only changed some parameters, such as the number of classes, in the dataset config file; I kept the class name as Car and didn't modify it. One thing to mention: my dataset's training labels don't contain 2D bbox values, and I set them all to 0 (image attached). Does this affect training? Thank you very much!

walliampeace avatar Nov 10 '21 23:11 walliampeace

The 2D bboxes won't affect training. Does your training loss converge well? Also, check your calib file.

ZCMax avatar Nov 11 '21 06:11 ZCMax

The 2D bboxes won't affect training. Does your training loss converge well? Also, check your calib file.

Hi, I got 0 mAP results on my custom dataset too. How to check if my calib files are okay?
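For reference, one minimal way to sanity-check a calib file is to parse it and verify that the matrices the KITTI converter relies on are present with the right sizes. This is only a sketch: the key names follow the standard KITTI calib layout, and `parse_kitti_calib`/`check_calib` are illustrative helpers, not part of mmdetection3d.

```python
import numpy as np

def parse_kitti_calib(text):
    """Parse 'key: v1 v2 ...' calib lines into a name -> float array dict."""
    calib = {}
    for line in text.strip().splitlines():
        if ':' not in line:
            continue
        key, vals = line.split(':', 1)
        calib[key.strip()] = np.array([float(v) for v in vals.split()])
    return calib

def check_calib(calib):
    """Return a list of problems; [] means the expected matrices look sane."""
    problems = []
    for key, n in [('P2', 12), ('R0_rect', 9), ('Tr_velo_to_cam', 12)]:
        if key not in calib:
            problems.append(f'missing {key}')
        elif calib[key].size != n:
            problems.append(f'{key} has {calib[key].size} values, expected {n}')
    return problems

# Synthetic example of a well-formed calib file (values are made up).
example = """P2: 707.0 0.0 604.0 45.8 0.0 707.0 180.5 -0.35 0.0 0.0 1.0 0.005
R0_rect: 1 0 0 0 1 0 0 0 1
Tr_velo_to_cam: 0 -1 0 0 0 0 -1 0 1 0 0 0"""
print(check_calib(parse_kitti_calib(example)))  # [] when the file is well-formed
```

A broken or misaligned calib can silently place the converted ground-truth boxes outside the point cloud, so a per-frame check like this is a cheap first step.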

curiousboy20 avatar Nov 15 '21 22:11 curiousboy20

I want to bump this issue as a third person who has this problem, and to provide additional context in the proper format.

Describe the bug

mAP is 0 because the model is not returning any bounding boxes for this user's (as well as my own) custom dataset. I cannot speak for the other user, but I have verified that my dataset is in the correct KITTI format: the validation tools provided with this package show the 3D bboxes in the correct locations on the point cloud. No bboxes are output, either when training and validating or when testing.
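As an example of the kind of format verification I mean: the snippet below is a rough sketch that checks one label line against the standard 15-field KITTI layout (`check_label_line` is an illustrative helper, not an mmdetection3d API). Note that zeroed 2D bbox fields pass, matching the point above that they don't affect 3D training.

```python
KITTI_FIELDS = 15  # type, truncated, occluded, alpha, 4x bbox, 3x dim, 3x loc, ry

def check_label_line(line):
    """Return a list of problems with one KITTI label line; [] if it looks OK."""
    parts = line.split()
    problems = []
    if len(parts) != KITTI_FIELDS:
        problems.append(f'expected {KITTI_FIELDS} fields, got {len(parts)}')
        return problems
    h, w, l = map(float, parts[8:11])  # 3D dimensions in meters
    if min(h, w, l) <= 0:
        problems.append('non-positive h/w/l')
    return problems

# A valid line with zeroed 2D bbox fields (values are made up).
print(check_label_line('Car 0.00 0 0.00 0 0 0 0 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59'))
```

Running a check like this over every label file takes seconds and rules out one whole class of silent conversion failures.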

Reproduction

  1. What command or script did you run? This issue is not exclusive to PointPillars; SECOND also returns no predicted bboxes. I have not tried other models yet.
python tools/test.py configs/pointpillars/pp-custom-tree.py work_dirs/pp-custom-tree/epoch_23.pth --show --show-dir ./out/ --out ./out/test.pkl
  2. Did you make any modifications to the code or config? Did you understand what you modified?

Yes. A new config was created. However, it is identical to the template config hv_pointpillars_secfpn_6x8_160e_kitti-3d-car.py with the following exceptions: point_cloud_range was changed to reflect the range of my point cloud, which is point_cloud_range=[0, -20, -20, 85, 20, 20]. This was changed in all the config files referenced by this config file (hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class.py, as well as '../_base_/models/hv_pointpillars_secfpn_kitti.py', '../_base_/datasets/kitti-3d-3class.py', '../_base_/schedules/cyclic_40e.py', '../_base_/default_runtime.py'). In the _base_/models config, the Anchor3DRangeGenerator ranges were changed to these values as well, to reflect the new required range.
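One thing worth double-checking when point_cloud_range is customized (a hypothetical cause, not a confirmed diagnosis): the x/y extents should divide evenly by the voxel size, otherwise the voxel grid, the feature map, and the anchor grid can end up with mismatched sizes. A quick sketch, assuming the default PointPillars voxel_size of [0.16, 0.16, 4] (check the actual value in your own config):

```python
def voxel_grid_size(point_cloud_range, voxel_size):
    """Voxel count along x, y, z; a non-integer count signals a misaligned range."""
    return [(point_cloud_range[i + 3] - point_cloud_range[i]) / voxel_size[i]
            for i in range(3)]

pc_range = [0, -20, -20, 85, 20, 20]   # range from the config above
vs = [0.16, 0.16, 4]                   # assumed default PointPillars voxel size
for axis, n in zip('xyz', voxel_grid_size(pc_range, vs)):
    ok = abs(n - round(n)) < 1e-6
    print(f'{axis}: {n:.2f} voxels, {"OK" if ok else "NOT an integer multiple"}')
```

With this range, the x axis comes out to 531.25 voxels; if that voxel size is indeed in use, nudging the range (e.g. an x extent of 84.8, which gives exactly 530) may be worth trying.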

  3. What dataset did you use? A custom dataset that was verified to work with the visualization tools.

Environment
sys.platform: linux
Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3070
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.1.TC455_06.29190527_0
GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
PyTorch: 1.9.0
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.3-Product Build 20210617 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.0.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

TorchVision: 0.10.0
OpenCV: 4.5.3
MMCV: 1.3.9
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.14.0
MMSegmentation: 0.14.1
MMDetection3D: 0.17.0+26ab7ff

Error traceback

Here is the output of the above test.py call:

[                                                  ] 0/6, elapsed: 0s, ETA:####### Printing bbox_results from voxelnet.py ######

[{'boxes_3d': LiDARInstance3DBoxes(
    tensor([], size=(0, 7))), 'scores_3d': tensor([]), 'labels_3d': tensor([], dtype=torch.int64)}]
######################################

[... the same empty result is printed for samples 2-6 ...]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 6/6, 0.1 task/s, elapsed: 41s, ETA:     0s

Here is the output when I modify mmdet3d/apis/test.py to print the result after line 39:

Use load_from_local loader
[                                                  ] 0/6, elapsed: 0s, ETA:####### Printing bbox_results from voxelnet.py ######

[{'boxes_3d': LiDARInstance3DBoxes(
    tensor([], size=(0, 7))), 'scores_3d': tensor([]), 'labels_3d': tensor([], dtype=torch.int64)}]
######################################


 vvvvvvvvvvResult of single_gpu_test directly: vvvvvvvvvvvv

[{'boxes_3d': LiDARInstance3DBoxes(
    tensor([], size=(0, 7))), 'scores_3d': tensor([]), 'labels_3d': tensor([], dtype=torch.int64)}]
^^^^^^^^^^^^^^^ 

[... the same empty result (plus the single_gpu_test printout) repeats for samples 2-6 ...]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 6/6, 0.6 task/s, elapsed: 11s, ETA:     0s

What is especially strange is that it seems to be training, since losses are being calculated. But during validation everything is 0 and no boxes are output:

2021-11-24 16:10:39,106 - mmdet - INFO - workflow: [('train', 1)], max: 2 epochs
2021-11-24 16:10:57,081 - mmdet - INFO - Epoch [1][50/318]	lr: 3.243e-03, eta: 0:03:30, time: 0.359, data_time: 0.221, memory: 982, loss_cls: 0.4021, loss_bbox: 5.0106, loss_dir: 0.1053, loss: 5.5179, grad_norm: 51.9575
2021-11-24 16:11:07,830 - mmdet - INFO - Epoch [1][100/318]	lr: 7.151e-03, eta: 0:02:33, time: 0.215, data_time: 0.096, memory: 982, loss_cls: 0.5494, loss_bbox: 8.7669, loss_dir: 0.1209, loss: 9.4372, grad_norm: 35.0653
2021-11-24 16:11:16,864 - mmdet - INFO - Epoch [1][150/318]	lr: 1.208e-02, eta: 0:02:02, time: 0.181, data_time: 0.064, memory: 1212, loss_cls: 0.4207, loss_bbox: 5.5836, loss_dir: 0.0598, loss: 6.0641, grad_norm: 6.2871
2021-11-24 16:11:25,916 - mmdet - INFO - Epoch [1][200/318]	lr: 1.620e-02, eta: 0:01:41, time: 0.181, data_time: 0.062, memory: 1212, loss_cls: 0.6594, loss_bbox: 24.3916, loss_dir: 1.0133, loss: 26.0643, grad_norm: 50.4340
2021-11-24 16:11:36,663 - mmdet - INFO - Epoch [1][250/318]	lr: 1.798e-02, eta: 0:01:28, time: 0.215, data_time: 0.096, memory: 1212, loss_cls: 1.1591, loss_bbox: 19.5046, loss_dir: 0.6342, loss: 21.2979, grad_norm: 31.7429
2021-11-24 16:11:48,300 - mmdet - INFO - Epoch [1][300/318]	lr: 1.739e-02, eta: 0:01:17, time: 0.233, data_time: 0.116, memory: 1212, loss_cls: 0.5446, loss_bbox: 7.2424, loss_dir: 0.0700, loss: 7.8569, grad_norm: 8.0101
2021-11-24 16:11:50,842 - mmdet - INFO - Saving checkpoint at 1 epochs
[                                                  ] 0/6, elapsed: 0s, ETA:####### Printing bbox_results from voxelnet.py ######

[{'boxes_3d': LiDARInstance3DBoxes(
    tensor([], size=(0, 7))), 'scores_3d': tensor([]), 'labels_3d': tensor([], dtype=torch.int64)}]
######################################

[... the same empty result is printed for samples 2-6 ...]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 6/6, 17.2 task/s, elapsed: 0s, ETA:     0s
Converting prediction to KITTI format
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 6/6, 10800.8 task/s, elapsed: 0s, ETA:     0s
Result is saved to /tmp/tmpe4jv3ngx/results.pkl.
2021-11-24 16:11:56,046 - mmdet - INFO - 
Car AP@0.70, 0.70, 0.70:
bbox AP:0.0000, 0.0000, 0.0000
bev  AP:0.0000, 0.0000, 0.0000
3d   AP:0.0000, 0.0000, 0.0000
Car AP@0.70, 0.50, 0.50:
bbox AP:0.0000, 0.0000, 0.0000
bev  AP:0.0000, 0.0000, 0.0000
3d   AP:0.0000, 0.0000, 0.0000

As discussed, no boxes are drawn! However, Open3D loads and displays the point clouds without any problem, and they look correctly placed. Please advise on further steps. Thanks!!
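For anyone else landing here: a quick way to confirm it really is "zero boxes everywhere", rather than an evaluation problem, is to inspect the pickle written by tools/test.py (the ./out/test.pkl path comes from the command earlier in this thread; the per-sample dict layout matches the printouts above).

```python
import os
import pickle

def count_boxes(results):
    """Number of predicted 3D boxes per sample in a tools/test.py results list."""
    return [len(r['scores_3d']) for r in results]

# Guarded so the snippet runs even without the file present.
if os.path.exists('./out/test.pkl'):
    with open('./out/test.pkl', 'rb') as f:
        results = pickle.load(f)
    print(count_boxes(results))  # all zeros in the failing runs above
```

If the counts are all zero here, the 0 mAP is a detection failure, not an evaluation or format mismatch at scoring time.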

24spiders avatar Nov 24 '21 23:11 24spiders

@24spiders Which point clouds are you displaying on which the bounding boxes look correctly placed? @Tai-Wang How's the status on this issue?

holtvogt avatar Aug 01 '22 14:08 holtvogt

I have the same problem with MVXNet: 0 mAP when testing with checkpoints from my own training (epoch_40.pth), but reasonable mAP when I use the pretrained checkpoint (.pth) for MVXNet. Did you manage to find out the reason for this issue?

Joaovsky avatar Oct 06 '22 14:10 Joaovsky

Did you finally solve it?

Yyb-XJTU avatar Oct 12 '23 07:10 Yyb-XJTU