
Saving rendered 3d bounding box coordinates in the image space (pixel coordinates) from ZED SDK?

Open harishkool opened this issue 3 years ago • 3 comments

Preliminary Checks

  • [X] This issue is not a duplicate. Before opening a new issue, please search existing issues.
  • [X] This issue is not a question, bug report, or anything other than a feature request directly related to this project.

Proposal

I am following the example shown in zed-examples/object detection/image viewer at master · stereolabs/zed-examples · GitHub, which uses OpenGL to draw objects on the image. Right now the 3D bounding box coordinates from the ZED SDK are normalized. It would be great if the ZED SDK could return the 3D bounding box coordinates in pixel space, taking the projection matrix and image shape as input. This is how I currently compute the 3D bounding box coordinates in image space, i.e., pixel coordinates:

        bbox = objects.object_list[i].bounding_box
    #     _cam_mat = np.array(_cam, np.float32).reshape(4,4)
        N = 8
        hom_obj_coords = np.c_[bbox, np.ones(N)]
        proj3D_cam = np.matmul(hom_obj_coords, _cam_mat) # 8 x 4
        # proj3D_cam[1] = proj3D_cam[1] + 0.25

        # proj2D = [((proj3D_cam[0] / pt4d[3]) * _wnd_size.width) / (2. * proj3D_cam[3]) + (_wnd_size.width * 0.5)
        # , ((proj3D_cam[1] / pt4d[3]) * _wnd_size.height) / (2. * proj3D_cam[3]) + (_wnd_size.height * 0.5)]

        proj2D = [((proj3D_cam[:, 0] / hom_obj_coords[:, 3]) * 1920) / (2. * proj3D_cam[:, 3]) + (1920 * 0.5)
                , ((proj3D_cam[:, 1] / hom_obj_coords[:, 3]) * 1172) / (2. * proj3D_cam[:, 3]) + ((1172 * 0.5) + ((1172 - 1080)*0.5))]
        proj2D_x = proj2D[0]
        proj2D_y = proj2D[1]

where _cam_mat is the projection matrix I got from the OpenGL code. However, the projected boxes do not align properly with the objects, so it would be great if the ZED SDK supported this directly.
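
For reference, the usual mapping from OpenGL clip space to pixel coordinates divides by the clip-space w and flips the y axis, because normalized device coordinates have y pointing up while image rows grow downward. The sketch below is only an illustration under stated assumptions: the viewport is assumed to match the image size exactly, and the function name clip_to_pixel is hypothetical, not part of the ZED sample or SDK.

```python
def clip_to_pixel(clip, width, height):
    """Map one clip-space point (x, y, z, w) to pixel coordinates (u, v).

    Assumes the viewport covers the whole image (no letterboxing).
    """
    x, y, _, w = clip
    if w <= 0:
        return None  # point is at or behind the camera plane; no valid projection
    ndc_x, ndc_y = x / w, y / w        # perspective divide, giving values in [-1, 1]
    u = (ndc_x + 1.0) * 0.5 * width    # NDC x = -1 maps to the left edge
    v = (1.0 - ndc_y) * 0.5 * height   # flipped: NDC y = +1 maps to the top row
    return u, v


# A point at the centre of clip space lands at the centre of a 1920 x 1080 image.
center = clip_to_pixel((0.0, 0.0, 0.0, 1.0), 1920, 1080)
```

Whether the clip coordinates come from point · matrix or matrix · point depends on how the projection matrix is stored, so a transposed matrix is one possible cause of misaligned boxes worth ruling out.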

Use-Case

Saving the 3D bounding boxes in pixel space would help train custom 3D object detection networks without any associated point clouds.

Anything else?

No response

harishkool avatar Jan 13 '22 21:01 harishkool

Hi,

The best option is to use the OpenCV projectPoints function, which is made for this: https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html#ga1019495a2c8d1743ed5cc23fa0daff8c

The cameraMatrix is given by CameraInformation().calibration_parameters, and R, T is the pose of the camera (if necessary).
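
The pinhole model behind projectPoints can be sketched without OpenCV if lens distortion is ignored, which is reasonable only when the image is already rectified (an assumption here). The helper names and the intrinsic values below are hypothetical; fx, fy, cx, cy would come from the calibration parameters mentioned above. The second helper assumes the corners were retrieved in a right-handed, Y-up frame where the camera looks along -Z, as OpenGL viewers typically use; if the corners are already in an image-style frame (Y down, Z forward), it is not needed.

```python
def project_points(points_3d, fx, fy, cx, cy):
    """Project camera-frame points (X right, Y down, Z forward) to pixels.

    Distortion is ignored; points at or behind the camera yield None.
    """
    pixels = []
    for X, Y, Z in points_3d:
        if Z <= 0:
            pixels.append(None)
            continue
        u = fx * X / Z + cx  # horizontal pixel coordinate
        v = fy * Y / Z + cy  # vertical pixel coordinate (rows grow downward)
        pixels.append((u, v))
    return pixels


def y_up_to_image_frame(points_3d):
    """Convert right-handed Y-up points (camera looks along -Z) to the image frame."""
    return [(X, -Y, -Z) for X, Y, Z in points_3d]


# Hypothetical intrinsics and a single hypothetical box corner in a Y-up frame.
corners_y_up = [(0.5, 0.25, -2.0)]
corners_img = y_up_to_image_frame(corners_y_up)
pixels = project_points(corners_img, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
```

Skipping the axis conversion when the points are in a Y-up frame mirrors the projected box vertically and puts every corner behind the camera, so a mismatch between the coordinate frame of the points and the one the projection expects is worth checking when boxes look rotated. If the corners are expressed in a world frame rather than the camera frame, they would first need to be transformed by the camera pose (the R, T mentioned above).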

obraun-sl avatar Jan 14 '22 07:01 obraun-sl

@obraun-sl I have tried this, but the rotation of the resulting bounding boxes is weird. @harishkool please share if you've found a solution to fix the wonky boxes.

fennecinspace avatar Jan 27 '23 03:01 fennecinspace

@fennecinspace Did you eventually solve this?

tavasolireza avatar May 08 '23 20:05 tavasolireza