Open3D-ML
No bounding boxes predicted on custom 3D dataset using PointPillars
Checklist
- [X] I have searched for similar issues.
- [X] I have tested with the latest development wheel.
- [X] I have checked the release documentation and the latest documentation (for master branch).
My Question
I'm trying to train a PointPillars detector for predicting 3D bounding boxes on a custom dataset with a single object class. The dataset is in KITTI format, and I wrote a config file modified from the pointpillars_kitti.yml file to include just one class.
I'm able to run the training successfully using the code below; however, the model reports 0 mAP for training and validation throughout the entire training process (I let it run for 70 epochs). The trained detector never seems to output any predicted bounding boxes when I visualize the results. However, I'm able to visualize my training data using the Open3D-ML point cloud visualization tools, and the ground-truth bounding boxes appear to be placed correctly. Any advice on what the error might be?
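For reference, this is roughly how I'm visualizing the ground-truth boxes (a minimal sketch; ForestDataset is my custom KITTI-format reader, the dataset path is a placeholder, and I'm assuming its splits return the same 'point' and 'bounding_boxes' keys as the built-in KITTI dataset):

from open3d.ml.vis import Visualizer
from pointpillars.dataset import ForestDataset  # my custom KITTI-format dataset class

# Load one training frame and show its points together with the ground-truth boxes.
dataset = ForestDataset(dataset_path="/path/to/my/dataset")  # placeholder path
split = dataset.get_split("training")
data = split.get_data(0)

vis = Visualizer()
vis.visualize([{"name": split.get_attr(0)["name"], "points": data["point"]}],
              bounding_boxes=data["bounding_boxes"])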
Below you can find my code for training, and my config file. I also attached train and validation logs.
Thank you for your help, and your work in making this codebase available!
Training code
import logging
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d
from pathlib import Path
# from open3d.ml.vis import Visualizer, BoundingBox3D, LabelLUT
from pointpillars.dataset import ForestDataset
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--dataset_path", help="Path to the KITTI-format dataset")
parser.add_argument('--device',
                    help='device to run the pipeline',
                    default='cuda')
args = parser.parse_args()

device = args.device
print("Using device '%s'" % device)
framework = 'torch'
kitti_path = args.dataset_path

# Build the model, dataset, and pipeline from the config file.
cfg_file = "cfg/pointpillars_zed_forest.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

model = ml3d.models.PointPillars(device=device, **cfg.model)
cfg.dataset['dataset_path'] = kitti_path
dataset = ForestDataset(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
pipeline = ml3d.pipelines.ObjectDetection(model, dataset=dataset, device=device,
                                          **cfg.pipeline)

max_epoch = cfg.pipeline.max_epoch

ckpt_path = Path("ckpt")
if not ckpt_path.exists():
    ckpt_path.mkdir(parents=True)

# Train, then save a final checkpoint.
pipeline.run_train()
pipeline.save_ckpt(max_epoch)
Config .yml file
dataset:
  name: TreeDetection
  dataset_path: # path/to/your/dataset
  cache_dir: ./logs/cache
  # steps_per_epoch_train:

model:
  name: PointPillars
  ckpt_path: # path/to/your/checkpoint

  batcher: "ignore"

  # point_cloud_range: [0, -39.68, -3, 70.12, 39.68, 1]
  point_cloud_range: [0, -39.68, -3.0, 69.12, 39.68, 10.0]
  classes: ['Car']

  loss:
    focal:
      gamma: 2.0
      alpha: 0.25
      loss_weight: 1.0
    smooth_l1:
      beta: 0.11
      loss_weight: 2.0
    cross_entropy:
      loss_weight: 0.2

  voxelize:
    max_num_points: 32
    voxel_size: &vsize
      [0.16, 0.16, 4]
    max_voxels: [16000, 40000]

  voxel_encoder:
    in_channels: 4
    feat_channels: [64]
    voxel_size: *vsize

  scatter:
    in_channels: 64
    output_shape: [496, 432]

  backbone:
    in_channels: 64
    out_channels: [64, 128, 256]
    layer_nums: [3, 5, 5]
    layer_strides: [2, 2, 2]

  neck:
    in_channels: [64, 128, 256]
    out_channels: [128, 128, 128]
    upsample_strides: [1, 2, 4]
    use_conv_for_no_stride: false

  head:
    in_channels: 384
    feat_channels: 384
    nms_pre: 100
    score_thr: 0.1
    ranges: [
      [0, -39.68, -3.0, 70.4, 39.68, 10.0]
    ]
    # ranges: [
    #   [0, -39.68, -0.6, 70.4, 39.68, -0.6],
    #   [0, -39.68, -0.6, 70.4, 39.68, -0.6],
    #   [0, -39.68, -1.78, 70.4, 39.68, -1.78]
    # ]
    # sizes: [[0.6, 0.8, 1.73], [0.6, 1.76, 1.73], [1.6, 3.9, 1.56]]
    sizes: [[1.6, 1.6, 3.9]]
    # rotations: [0, 1.57]
    rotations: [0.]
    iou_thr: [[0.45, 0.6]]

  augment:
    PointShuffle: True
    ObjectRangeFilter:
      point_cloud_range: [0, -39.68, -3.0, 69.12, 39.68, 10.0]
    ObjectSample:
      min_points_dict:
        Car: 5
        # Pedestrian: 10
        # Cyclist: 10
      sample_dict:
        Car: 15
        # Pedestrian: 10
        # Cyclist: 10

pipeline:
  name: ObjectDetection
  test_compute_metric: true
  batch_size: 2
  val_batch_size: 1
  test_batch_size: 1
  save_ckpt_freq: 5
  max_epoch: 100
  main_log_dir: ./logs
  train_sum_dir: train_log
  grad_clip_norm: 2
  num_workers: 0
  pin_memory: true

  optimizer:
    lr: 0.001
    betas: [0.95, 0.99]
    weight_decay: 0.01

  # evaluation properties
  overlaps: [0.5]
  similar_classes: {
    # Van: Car,
    # Person_sitting: Pedestrian
  }
  difficulties: [0]
  summary:
    record_for: []
    max_pts:
    use_reference: false
    max_outputs: 1
Attached logs: log_train_2022-02-12_18:45:43.txt, log_valid_2022-02-13_10:41:42.txt
@brian-h-wang I notice some problems in your config file:
- `voxel_size[2]` should cover the whole `z` dimension. In your case, it should be 13.
- `z_min` and `z_max` should be the same (or nearly the same) in `model['head']['ranges']`. You can use the average `z` value of the objects across the whole dataset; we don't want anchors to float above or below the ground level.

Also, please make sure the `point_cloud_range` is correct for your dataset.
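A quick sketch of how these values relate (illustrative only; it uses your current point_cloud_range, and the object z values are made-up placeholders):

# Illustrative: deriving voxel_size[2] and the anchor z from point_cloud_range.
point_cloud_range = [0, -39.68, -3.0, 69.12, 39.68, 10.0]  # [x_min, y_min, z_min, x_max, y_max, z_max]

z_min, z_max = point_cloud_range[2], point_cloud_range[5]
voxel_size_z = z_max - z_min  # must span the whole z dimension -> 13.0 here
print("voxel_size[2] =", voxel_size_z)

# For model['head']['ranges'], pick a single anchor z (z_min == z_max), e.g. the
# average z of the ground-truth boxes across the whole dataset.
object_z_values = [0.12, 0.18, 0.14]  # placeholders: collect these from your labels
anchor_z = sum(object_z_values) / len(object_z_values)
print("head ranges z_min = z_max =", anchor_z)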
Thank you for the prompt and very helpful response! I corrected these errors in my config, setting the `point_cloud_range`, `model['head']['ranges']`, `model['head']['sizes']`, and `voxel_size[2]` parameters correctly for my custom dataset.
When I train the detector, I'm now seeing values for the losses, for example:
INFO - 2022-02-14 19:14:30,328 - object_detection - === EPOCH 10/50 ===
training - loss_cls: 0.808 loss_bbox: 1.024 loss_dir: 0.000 > loss: 1.832: 100%|█| 1856/1856 [09:53<00:
Before I made these changes to my config, the loss_cls, loss_bbox, etc. values were always 0.00 during training.
However, the validation mAP still always comes out as zero (I'm attaching the logs). When I run inference with the trained detector on point clouds from my own training set, it still does not predict any bounding boxes.
Is it possible there are other errors in my config, or do you have suggestions for other possible causes for this issue? Thanks again for your help.
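For reference, this is roughly how I'm running inference on a frame from my training split to check for predictions (a sketch; the dataset and checkpoint paths are placeholders):

import open3d.ml as _ml3d
import open3d.ml.torch as ml3d
from pointpillars.dataset import ForestDataset

# Rebuild the model, dataset, and pipeline from the same config used for training.
cfg = _ml3d.utils.Config.load_from_file("cfg/pointpillars_zed_forest.yml")
cfg.dataset['dataset_path'] = "/path/to/my/dataset"  # placeholder
model = ml3d.models.PointPillars(device="cuda", **cfg.model)
dataset = ForestDataset(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
pipeline = ml3d.pipelines.ObjectDetection(model, dataset=dataset, device="cuda",
                                          **cfg.pipeline)
pipeline.load_ckpt(ckpt_path="path/to/ckpt_00010.pth")  # placeholder checkpoint path

# Run inference on a single frame and inspect the predicted boxes.
data = dataset.get_split("training").get_data(0)
results = pipeline.run_inference(data)
print("predicted boxes:", results)  # comes back empty for every frame I try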
Here is my current config file:
dataset:
  name: TreeDetection
  dataset_path: # path/to/your/dataset
  cache_dir: ./logs/cache
  # steps_per_epoch_train:

model:
  name: PointPillars
  ckpt_path: # path/to/your/checkpoint

  batcher: "ignore"

  # point_cloud_range: [0, -39.68, -3, 70.12, 39.68, 1]
  point_cloud_range: [0.0, -12.0, -5.0, 15.0, 12.0, 10.0]
  classes: ['Car']

  loss:
    focal:
      gamma: 2.0
      alpha: 0.25
      loss_weight: 1.0
    smooth_l1:
      beta: 0.11
      loss_weight: 2.0
    cross_entropy:
      loss_weight: 0.2

  voxelize:
    max_num_points: 32
    voxel_size: &vsize
      [0.16, 0.16, 15.0]
    max_voxels: [16000, 40000]

  voxel_encoder:
    in_channels: 4
    feat_channels: [64]
    voxel_size: *vsize

  scatter:
    in_channels: 64
    output_shape: [496, 432]

  backbone:
    in_channels: 64
    out_channels: [64, 128, 256]
    layer_nums: [3, 5, 5]
    layer_strides: [2, 2, 2]

  neck:
    in_channels: [64, 128, 256]
    out_channels: [128, 128, 128]
    upsample_strides: [1, 2, 4]
    use_conv_for_no_stride: false

  head:
    in_channels: 384
    feat_channels: 384
    nms_pre: 100
    score_thr: 0.1
    ranges: [
      [0, -12.0, 0.145, 15.0, 12.0, 0.145]
    ]
    sizes: [[0.42, 0.42, 2.9]]
    rotations: [0.]
    iou_thr: [[0.45, 0.6]]

  augment:
    PointShuffle: True
    ObjectRangeFilter:
      point_cloud_range: [0.0, -12.0, -5.0, 15.0, 12.0, 10.0]
    ObjectSample:
      min_points_dict:
        Car: 5
        # Pedestrian: 10
        # Cyclist: 10
      sample_dict:
        Car: 15
        # Pedestrian: 10
        # Cyclist: 10

pipeline:
  name: ObjectDetection
  test_compute_metric: true
  batch_size: 2
  val_batch_size: 1
  test_batch_size: 1
  save_ckpt_freq: 5
  max_epoch: 10
  main_log_dir: ./logs
  train_sum_dir: train_log
  grad_clip_norm: 2
  num_workers: 0
  pin_memory: true

  optimizer:
    lr: 0.001
    betas: [0.95, 0.99]
    weight_decay: 0.01

  # evaluation properties
  overlaps: [0.5]
  similar_classes: {
    # Van: Car,
    # Person_sitting: Pedestrian
  }
  difficulties: [0]
  summary:
    record_for: []
    max_pts:
    use_reference: false
    max_outputs: 1
I have noticed similar behaviour with the default PointPillars model, configuration, and KITTI dataset for the object detection task: the mAP values and other metrics drop to 0 and the losses become NaN right after the very first training epoch.
@brian-h-wang were you able to solve the issue?