About nuscenes open dataset
Hello,
Thanks for your great work! I have a question about the unified coordinate system.
In the "LIDAR" coordinate system of mmdetection3d, the relative origin of a LiDAR box is its bottom center, (0.5, 0.5, 0), but the nuScenes dataset does not seem to be translated accordingly. Please see the "TODO" at line 165 of mmdet3d/datasets/nuscenes_dataset.py:
# the nuscenes box center is [0.5, 0.5, 0.5], we change it to be
# the same as KITTI (0.5, 0.5, 0)
# TODO: Unify the coordinates
if self.load_type in ['fov_image_based', 'mv_image_based']:
    gt_bboxes_3d = CameraInstance3DBoxes(
        ann_info['gt_bboxes_3d'],
        box_dim=ann_info['gt_bboxes_3d'].shape[-1],
        origin=(0.5, 0.5, 0.5))
else:
    gt_bboxes_3d = LiDARInstance3DBoxes(
        ann_info['gt_bboxes_3d'],
        box_dim=ann_info['gt_bboxes_3d'].shape[-1],
        origin=(0.5, 0.5, 0.5)).convert_to(self.box_mode_3d)
ann_info['gt_bboxes_3d'] = gt_bboxes_3d
So I wonder whether there is really a difference between KITTI/Waymo and nuScenes. Looking forward to your reply, thanks a lot!
@starnstar Hi, we set the origin to (0.5, 0.5, 0.5) for nuScenes to match the original nuScenes box coordinates, whose centers are the geometric centers of the boxes. After that, we can simply use the interface of the LiDARInstance3DBoxes class, such as 'center', 'corners', 'height' and so on, without worrying about the internal translation.
Thanks for your reply! Can I understand it this way: as long as the boxes and the point clouds are aligned, there is no need to set the box center at the bottom? And will you integrate the BEVFusion model into mmdetection3d? Thanks a lot!
Sorry, I just saw that you have implemented BEVFusion in the latest version, v1.1.0. Thanks for your great work! Is this part written only for inference, not for training? https://github.com/open-mmlab/mmdetection3d/tree/main/projects/BEVFusion
I still have some questions about the nuScenes dataset. In https://github.com/open-mmlab/mmdetection3d/blob/main/mmdet3d/datasets/nuscenes_dataset.py#L172, the nuScenes box origin is (0.5, 0.5, 0.5). Then in https://github.com/open-mmlab/mmdetection3d/blob/4ff136163ee4e6436e7f5c44f73c4a4932aa7657/mmdet3d/structures/bbox_3d/base_box3d.py#L68 you change it to (0.5, 0.5, 0), right?
Hello, I'm a beginner. Could someone tell me:
- Why are there three versions of the bbox bottom center coordinates in mmdet3d?
  - (0.5, 0.5, 0.5) in parse_ann_info in mmdet3d/datasets/nuscenes_dataset.py
  - (0.5, 1.0, 0.5) in mmdet3d/structures/bbox_3d/box_3d_mode.py
  - (0.5, 0.5, 0) in mmdet3d/structures/bbox_3d/base_box3d.py
- What is the intrinsic relationship between them, and which coordinate convention should I follow when implementing my own data processing pipeline?
Hi, in case this helps anyone:
Basically, LiDARInstance3DBoxes assumes it works with bottom_center rather than geometry_center. In the box's own frame, measured from its lowest corner, bottom_center is the point (length/2, width/2, 0) and geometry_center is the point (length/2, width/2, height/2).
That is why datasets annotated with geometry_center, for example nuScenes, are transformed to bottom_center by subtracting half the box height from z.
The word origin is indeed a little confusing: it means the relative position, within the bounding box, of the point that the box center coordinates refer to. Please correct me if my understanding is wrong.
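To make the geometry_center to bottom_center shift concrete, here is a minimal NumPy sketch. It does not use mmdet3d itself; the function name `shift_box_origin` and the sample box are made up for illustration, and boxes are assumed to be rows of (x, y, z, l, w, h, yaw). It ignores yaw, so it is exact only when the two origins differ in the vertical component, as they do here.

```python
import numpy as np


def shift_box_origin(boxes, src=(0.5, 0.5, 0.5), dst=(0.5, 0.5, 0.0)):
    """Return a copy of `boxes` whose (x, y, z) refers to relative origin `dst`.

    `boxes` is an (N, 7) array of (x, y, z, l, w, h, yaw) rows whose
    (x, y, z) currently refers to relative origin `src`.
    Hypothetical helper for illustration, not part of mmdet3d.
    """
    boxes = np.asarray(boxes, dtype=np.float64).copy()
    # Relative offset between the two origins, in units of box size.
    offset = np.asarray(dst, dtype=np.float64) - np.asarray(src, dtype=np.float64)
    # Scale the relative offset by each box's (l, w, h) and move the center.
    boxes[:, :3] += boxes[:, 3:6] * offset
    return boxes


# A nuScenes-style box: center at the geometric center, height 1.5.
nus_box = np.array([[10.0, 2.0, 0.0, 4.0, 2.0, 1.5, 0.3]])
bottom_box = shift_box_origin(nus_box)
print(bottom_box[0, :3])  # x and y unchanged, z lowered by h / 2
```

With the defaults only z changes: it drops by half the box height, while the size and yaw columns are left untouched.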
@KSeangTan Do you happen to know whether this is a mismatch between the code and their docs? The code linked below converts the boxes to the (x, y, z, l, w, h, yaw) convention of MMDet3D, but the values are still in the nuScenes LiDAR frame (i.e., x right, y front, z up).
https://github.com/open-mmlab/mmdetection3d/blob/fe25f7a51d36e3702f961e198894580d83c4387b/tools/dataset_converters/nuscenes_converter.py#L258
But the doc here claims that the boxes are converted to the MMDet3D LiDAR frame (i.e., x front, y left, z up).
Is this a mismatch?
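For reference, the two frames mentioned above differ by a 90-degree rotation about z. Below is a minimal NumPy sketch of what converting a box between them would involve. It assumes yaw is measured counter-clockwise from +x about +z in both frames; the function name is made up for illustration, and the sketch says nothing about what the converter actually does.

```python
import numpy as np


def right_front_up_to_front_left_up(boxes):
    """Re-express (x, y, z, l, w, h, yaw) boxes in a rotated frame.

    Input frame:  x right, y front, z up.
    Output frame: x front, y left,  z up.
    Hypothetical helper for illustration, not part of mmdet3d.
    """
    boxes = np.asarray(boxes, dtype=np.float64).copy()
    x_old = boxes[:, 0].copy()
    y_old = boxes[:, 1].copy()
    boxes[:, 0] = y_old    # new x (front) is the old y
    boxes[:, 1] = -x_old   # new y (left) is the negated old x (right)
    # The new x axis sits 90 degrees counter-clockwise from the old one,
    # so every heading angle decreases by pi / 2.
    boxes[:, 6] -= np.pi / 2
    return boxes


# A box 1 m to the right and 5 m ahead, heading straight forward.
box = np.array([[1.0, 5.0, 0.0, 4.0, 2.0, 1.5, np.pi / 2]])
converted = right_front_up_to_front_left_up(box)
print(converted[0, [0, 1, 6]])
```

If the converted values of a known box match what the dataset pipeline produces, the frames agree. If only the unconverted values match, the boxes are still in the x-right, y-front frame.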