Open3D-ML icon indicating copy to clipboard operation
Open3D-ML copied to clipboard

No bounding boxes predicted on custom 3D dataset using PointPillars

Open brian-h-wang opened this issue 3 years ago • 4 comments

Checklist

My Question

I'm trying to train a PointPillars detector for predicting 3D bounding boxes on a custom dataset with a single object class. The dataset is in KITTI format, and I wrote a config file modified from the pointpillars_kitti.yml file to include just one class.

I'm able to successfully run the training using the code below, however the model gives 0 mAP for training and validation throughout the whole training process (I let it run for 70 epochs). The trained detector seems to never output any predicted bounding boxes when I visualize the results. However, I'm able to visualize my training data using the Open3D-ML point cloud visualization tools, and the ground truth bounding boxes seem to be placed correctly. Any advice on what the error might be?

Below you can find my code for training, and my config file. I also attached train and validation logs.

Thank you for your help, and your work in making this codebase available!

Training code

import logging
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d
from pathlib import Path

# from open3d.ml.vis import Visualizer, BoundingBox3D, LabelLUT
from pointpillars.dataset import ForestDataset

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dataset_path", help="Path to the KITTI-format dataset")
parser.add_argument('--device',
                    help='device to run the pipeline',
                    default='cuda')

args = parser.parse_args()

device = args.device
print("Using device '%s'" % device)

framework = 'torch'
kitti_path = args.dataset_path

cfg_file = "cfg/pointpillars_zed_forest.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

model = ml3d.models.PointPillars(device=device, **cfg.model)
cfg.dataset['dataset_path'] = kitti_path
dataset = ForestDataset(cfg.dataset.pop('dataset_path', None), **cfg.dataset)

pipeline = ml3d.pipelines.ObjectDetection(model, dataset=dataset, device=device,
                                          **cfg.pipeline)

max_epoch = cfg.pipeline.max_epoch

ckpt_path = Path("ckpt")
if not ckpt_path.exists():
    ckpt_path.mkdir(parents=True)

pipeline.run_train()

pipeline.save_ckpt(max_epoch)

Config .yml file

dataset:
  name: TreeDetection
  dataset_path: # path/to/your/dataset
  cache_dir: ./logs/cache
#  steps_per_epoch_train:

model:
  name: PointPillars
  ckpt_path: # path/to/your/checkpoint

  batcher: "ignore"

#  point_cloud_range: [0, -39.68, -3, 70.12, 39.68, 1]
  point_cloud_range: [0, -39.68, -3.0, 69.12, 39.68, 10.0]
  classes: ['Car']

  loss:
    focal:
      gamma: 2.0
      alpha: 0.25
      loss_weight: 1.0
    smooth_l1:
      beta: 0.11
      loss_weight: 2.0
    cross_entropy:
      loss_weight: 0.2

  voxelize:
    max_num_points: 32
    voxel_size: &vsize
      [0.16, 0.16, 4]
    max_voxels: [16000, 40000]

  voxel_encoder:
    in_channels: 4
    feat_channels: [64]
    voxel_size: *vsize

  scatter:
    in_channels: 64
    output_shape: [496, 432]

  backbone:
    in_channels: 64
    out_channels: [64, 128, 256]
    layer_nums: [3, 5, 5]
    layer_strides: [2, 2, 2]

  neck:
    in_channels: [64, 128, 256]
    out_channels: [128, 128, 128]
    upsample_strides: [1, 2, 4]
    use_conv_for_no_stride: false

  head:
    in_channels: 384
    feat_channels: 384
    nms_pre: 100
    score_thr: 0.1
    ranges: [
      [0, -39.68, -3.0, 70.4, 39.68, 10.0]
    ]
#    ranges: [
#      [0, -39.68, -0.6, 70.4, 39.68, -0.6],
#      [0, -39.68, -0.6, 70.4, 39.68, -0.6],
#      [0, -39.68, -1.78, 70.4, 39.68, -1.78]
#    ]
#    sizes: [[0.6, 0.8, 1.73], [0.6, 1.76, 1.73], [1.6, 3.9, 1.56]]
    sizes: [[1.6, 1.6, 3.9]]
#    rotations: [0, 1.57]
    rotations: [0.]
    iou_thr: [[0.45, 0.6]]

  augment:
    PointShuffle: True
    ObjectRangeFilter:
      point_cloud_range: [0, -39.68, -3.0, 69.12, 39.68, 10.0]
    ObjectSample:
      min_points_dict:
        Car: 5
#        Pedestrian: 10
#        Cyclist: 10
      sample_dict:
        Car: 15
#        Pedestrian: 10
#        Cyclist: 10


pipeline:
  name: ObjectDetection
  test_compute_metric: true
  batch_size: 2
  val_batch_size: 1
  test_batch_size: 1
  save_ckpt_freq: 5
  max_epoch: 100
  main_log_dir: ./logs
  train_sum_dir: train_log
  grad_clip_norm: 2
  num_workers: 0
  pin_memory: true

  optimizer:
    lr: 0.001
    betas: [0.95, 0.99]
    weight_decay: 0.01

  # evaluation properties
  overlaps: [0.5]
  similar_classes: {
#    Van: Car,
#    Person_sitting: Pedestrian
  }
  difficulties: [0]
  summary:
    record_for: []
    max_pts:
    use_reference: false
    max_outputs: 1

log_train_2022-02-12_18:45:43.txt log_valid_2022-02-13_10:41:42.txt

brian-h-wang avatar Feb 13 '22 16:02 brian-h-wang

@brian-h-wang I notice some problems in your config file

  1. voxel_size[2] should cover the whole z dimension. In your case, it should be 13.
  2. z_min and z_max should be same (or almost close) in model['head']['ranges']. You can use average value of z dimension across objects in whole dataset. We don't want anchors to float above/below the ground level. Also, please make sure the point_cloud_range is correct for your dataset.

sanskar107 avatar Feb 14 '22 14:02 sanskar107

Thank you for the prompt and very helpful response! I corrected these errors in my config, setting the point_cloud_range, model['head']['ranges'], model['head']['sizes'], and voxel_size[2] parameters correctly for my custom dataset.

When I train the detector, I'm now seeing values for the losses, for example:

INFO - 2022-02-14 19:14:30,328 - object_detection - === EPOCH 10/50 ===                                 
training -  loss_cls: 0.808 loss_bbox: 1.024 loss_dir: 0.000 > loss: 1.832: 100%|█| 1856/1856 [09:53<00:

Before making the changes to my config, during training the loss_cls, loss_bbox etc. values were always all 0.00.

However, the validation mAP still always shows up as zero (I'm attaching the logs). When I tried running inference using the trained detector, on point clouds from my training set, the detector still does not predict any bounding boxes.

Is it possible there are other errors in my config, or do you have suggestions for other possible causes for this issue? Thanks again for your help.

Here is my current config file:

dataset:
  name: TreeDetection
  dataset_path: # path/to/your/dataset
  cache_dir: ./logs/cache
#  steps_per_epoch_train:

model:
  name: PointPillars
  ckpt_path: # path/to/your/checkpoint

  batcher: "ignore"

#  point_cloud_range: [0, -39.68, -3, 70.12, 39.68, 1]
  point_cloud_range: [0.0, -12.0, -5.0, 15.0, 12.0, 10.0]
  classes: ['Car']

  loss:
    focal:
      gamma: 2.0
      alpha: 0.25
      loss_weight: 1.0
    smooth_l1:
      beta: 0.11
      loss_weight: 2.0
    cross_entropy:
      loss_weight: 0.2

  voxelize:
    max_num_points: 32
    voxel_size: &vsize
      [0.16, 0.16, 15.0]
    max_voxels: [16000, 40000]

  voxel_encoder:
    in_channels: 4
    feat_channels: [64]
    voxel_size: *vsize

  scatter:
    in_channels: 64
    output_shape: [496, 432]

  backbone:
    in_channels: 64
    out_channels: [64, 128, 256]
    layer_nums: [3, 5, 5]
    layer_strides: [2, 2, 2]

  neck:
    in_channels: [64, 128, 256]
    out_channels: [128, 128, 128]
    upsample_strides: [1, 2, 4]
    use_conv_for_no_stride: false

  head:
    in_channels: 384
    feat_channels: 384
    nms_pre: 100
    score_thr: 0.1
    ranges: [
      [0, -12.0, 0.145, 15.0, 12.0, 0.145]
    ]
    sizes: [[0.42, 0.42, 2.9]]
    rotations: [0.]
    iou_thr: [[0.45, 0.6]]

  augment:
    PointShuffle: True
    ObjectRangeFilter:
      point_cloud_range: [0.0, -12.0, -5.0, 15.0, 12.0, 10.0]
    ObjectSample:
      min_points_dict:
        Car: 5
#        Pedestrian: 10
#        Cyclist: 10
      sample_dict:
        Car: 15
#        Pedestrian: 10
#        Cyclist: 10


pipeline:
  name: ObjectDetection
  test_compute_metric: true
  batch_size: 2
  val_batch_size: 1
  test_batch_size: 1
  save_ckpt_freq: 5
  max_epoch: 10
  main_log_dir: ./logs
  train_sum_dir: train_log
  grad_clip_norm: 2
  num_workers: 0
  pin_memory: true

  optimizer:
    lr: 0.001
    betas: [0.95, 0.99]
    weight_decay: 0.01

  # evaluation properties
  overlaps: [0.5]
  similar_classes: {
#    Van: Car,
#    Person_sitting: Pedestrian
  }
  difficulties: [0]
  summary:
    record_for: []
    max_pts:
    use_reference: false
    max_outputs: 1

brian-h-wang avatar Feb 15 '22 04:02 brian-h-wang

I have noticed the similar behaviour with the default PointPillars model, configuration and KITTI dataset for the object detection task. mAP values and other indicators crash to 0 and losses to nan right after the very first training epoch.

TommiA avatar Feb 21 '22 12:02 TommiA

@brian-h-wang were you able to solve the issue?

ricardosanunes avatar Apr 04 '24 10:04 ricardosanunes