
About nuscenes open dataset

Open starnstar opened this issue 3 years ago • 6 comments

Hello,

Thanks for your great work! I have a question about the unified coordinate system:

In the "LIDAR" coordinate system of mmdetection3d, the relative coordinate of the bottom center of a LiDAR box is (0.5, 0.5, 0). [image]

However, no such translation seems to be applied for the nuScenes dataset. Please see the "TODO" at line 165 of mmdet3d/datasets/nuscenes_dataset.py:

        # the nuscenes box center is [0.5, 0.5, 0.5], we change it to be
        # the same as KITTI (0.5, 0.5, 0)
        # TODO: Unify the coordinates
        if self.load_type in ['fov_image_based', 'mv_image_based']:
            gt_bboxes_3d = CameraInstance3DBoxes(
                ann_info['gt_bboxes_3d'],
                box_dim=ann_info['gt_bboxes_3d'].shape[-1],
                origin=(0.5, 0.5, 0.5))
        else:
            gt_bboxes_3d = LiDARInstance3DBoxes(
                ann_info['gt_bboxes_3d'],
                box_dim=ann_info['gt_bboxes_3d'].shape[-1],
                origin=(0.5, 0.5, 0.5)).convert_to(self.box_mode_3d)

        ann_info['gt_bboxes_3d'] = gt_bboxes_3d

So I wonder whether there is really a difference between KITTI/Waymo and nuScenes. Looking forward to your reply, thanks a lot!

starnstar · Apr 12 '23 04:04

@starnstar Hi, we set the origin to (0.5, 0.5, 0.5) for nuScenes to match the original nuScenes box coordinates. After that, we can simply use the interface of the LiDARInstance3DBoxes class, such as 'center', 'corners', 'height' and so on, without dealing with the internal translation ourselves.

JingweiZhang12 · Apr 18 '23 02:04

Thanks for your reply! Can I understand it like this: as long as the boxes and the point clouds are aligned, there is no need to set the box center at the bottom? And will you integrate the BEVFusion model into mmdetection3d? Thanks a lot!

starnstar · Apr 18 '23 02:04

Sorry, I just saw that you have implemented BEVFusion in the latest version, v1.1.0. Thanks for your great work! Is this part written only for inference, not for training? https://github.com/open-mmlab/mmdetection3d/tree/main/projects/BEVFusion

I still have some questions about the nuScenes dataset. In https://github.com/open-mmlab/mmdetection3d/blob/main/mmdet3d/datasets/nuscenes_dataset.py#L172, the nuScenes box's origin is (0.5, 0.5, 0.5). Then in https://github.com/open-mmlab/mmdetection3d/blob/4ff136163ee4e6436e7f5c44f73c4a4932aa7657/mmdet3d/structures/bbox_3d/base_box3d.py#L68 you change it to (0.5, 0.5, 0), right?

starnstar · Apr 18 '23 05:04

Hello, I'm a beginner. Could someone tell me:

  1. Why are there three versions of the bbox bottom center coordinates in mmdet3d?
  • (0.5, 0.5, 0.5) in mmdet3d/datasets/nuscenes_dataset.py (parse_ann_info)
  • (0.5, 1.0, 0.5) in mmdet3d/structures/bbox_3d/box_3d_mode.py
  • (0.5, 0.5, 0) in mmdet3d/structures/bbox_3d/base_box3d.py
  2. What is the intrinsic relationship between them, and which coordinate convention should I follow when implementing my own data processing pipeline?

shliu0 · Feb 22 '24 14:02

Hi, in case it helps anyone:

Basically, LiDARInstance3DBoxes assumes it works with bottom_center rather than geometry_center. Here bottom_center means the box's reference point sits at (length/2, width/2, 0) within the box, and geometry_center means it sits at (length/2, width/2, height/2). That is why datasets annotated with geometry_center, for example nuScenes, are transformed to bottom_center by lowering the center's z coordinate by height/2.

The word "origin" is indeed a little confusing: it means the relative position of the center point within the bounding box. Please correct me if my understanding is incorrect.
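
To make the shift concrete, here is a minimal plain-Python sketch of the idea as I understand it. The helper name `to_bottom_center` and the example box are made up for illustration; this is not the mmdet3d API, only the arithmetic of moving the reference point from a relative `origin` to (0.5, 0.5, 0).

```python
# Illustrative sketch only: `to_bottom_center` is a hypothetical helper,
# not a function from mmdet3d.

def to_bottom_center(box, origin=(0.5, 0.5, 0.5)):
    """Move a box's reference point from `origin` to the bottom center.

    box:    [x, y, z, dx, dy, dz, yaw], where (x, y, z) is the point that
            sits at the relative position `origin` inside the box.
    origin: relative position in [0, 1] along each box dimension;
            (0.5, 0.5, 0.5) is the geometric center.
    """
    dst = (0.5, 0.5, 0.0)  # bottom center
    # Per-axis shift: size * (target relative position - source relative position).
    # This simple form is exact when only the z component differs, because the
    # yaw rotation is about the z axis and leaves the vertical offset unchanged.
    center = [c + size * (t - s)
              for c, size, s, t in zip(box[:3], box[3:6], origin, dst)]
    return center + list(box[3:])


# A geometric-center box (nuScenes-style annotation, values invented).
geometry_center_box = [10.0, 2.0, 1.0, 4.0, 2.0, 1.5, 0.3]
print(to_bottom_center(geometry_center_box))
# -> [10.0, 2.0, 0.25, 4.0, 2.0, 1.5, 0.3]
```

Only z changes (1.0 becomes 1.0 - 1.5 / 2 = 0.25); a box that is already given with origin (0.5, 0.5, 0) comes back unchanged.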

KSeangTan · Apr 09 '25 13:04

@KSeangTan Do you happen to know whether there is a mismatch between the code and the docs? The code below converts the boxes to the (x, y, z, l, w, h, yaw) convention of MMDet3D, but the values are still in the nuScenes LiDAR frame (i.e., x right, y front, z up).

https://github.com/open-mmlab/mmdetection3d/blob/fe25f7a51d36e3702f961e198894580d83c4387b/tools/dataset_converters/nuscenes_converter.py#L258

But the doc here claims that the boxes are converted to the MMDet3D LiDAR frame (i.e., x front, y left, z up).

Is this a mismatch?
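
For reference, this is the axis mapping I would expect between the two conventions, taking both frame descriptions exactly as written above. It is only a sketch with a made-up function name and says nothing about what the converter actually does.

```python
# Illustrative sketch only: the function name is hypothetical, and the two
# axis conventions are taken from the descriptions in this comment.

def right_front_up_to_front_left_up(point):
    """Re-express a point given in an (x right, y front, z up) frame
    in an (x front, y left, z up) frame."""
    x_right, y_front, z_up = point
    # New x is the old "front" axis; new y ("left") is the negated old "right" axis.
    return (y_front, -x_right, z_up)


# A point 1 m to the right, 2 m ahead and 0.5 m up.
print(right_front_up_to_front_left_up((1.0, 2.0, 0.5)))
# -> (2.0, -1.0, 0.5)
```

If the stored box centers had been converted, a point to the right of the sensor would end up with a negative y, which is one simple way to check which frame the values are in.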

YangyangFu · Apr 17 '25 15:04