pyk4a

Depth Value Discrepancies

Open JJLimmm opened this issue 1 year ago • 3 comments

Hi all,

I have a question about depth information obtained through the Python wrapper compared with the Body Tracking SDK. For more context, this relates to issue #92, discussed in another Python wrapper repo.

Background: I am testing how to combine a 2D pose estimation model (such as OpenPose) with the depth sensor data to obtain accurate 3D coordinates for the detected keypoints, instead of using the official Azure Kinect Body Tracking SDK.

Before relying on this, I wanted to verify that my method for converting 2D keypoints into the 3D depth coordinate system gives results that match the Body Tracking SDK. For the comparison, I used the SDK's neck keypoint as the reference: I convert it to a 2D keypoint (x, y), and then transform those coordinates back into the 3D depth space (the same coordinate system that the Body Tracking SDK uses for its keypoint results).

I perform the following steps before comparing the xyz values obtained:

  1. Obtain the 2D keypoints by converting with convert_3d_to_2d(), so that the 2D keypoints are in the 2D RGB image space.
  2. Retrieve the depth image transformed into the RGB image space using capture.transformed_depth()
  3. Get the depth value at that 2D keypoint coordinate by indexing the transformed depth image
  4. Using the depth value and the 2D keypoint from the RGB image, call the API convert_2d_to_2d() to get the coordinates in the 2D depth image space.
  5. Get the depth value at that converted coordinate by indexing the 2D depth image (retrieved by capture.get_depth())
  6. Using that depth value and the converted coordinates, call the API convert_2d_to_3d() to obtain the xyz coordinates in the 3D depth space
  7. Compare the xyz for that coordinate to the xyz of the same keypoint from the Body Tracking SDK.
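
To make the indexing and back-projection in steps 3, 5 and 6 concrete, here is a minimal sketch using an ideal pinhole camera model in NumPy. It is not the wrapper's API: the intrinsics `FX`, `FY`, `CX`, `CY`, the helper names, and the synthetic depth image are all assumptions for illustration, and the real calibration also accounts for lens distortion.

```python
import numpy as np

# Hypothetical pinhole intrinsics, for illustration only. Real values come
# from the device calibration, which also models lens distortion.
FX, FY = 504.0, 504.0  # focal lengths in pixels (assumed)
CX, CY = 320.0, 288.0  # principal point in pixels (assumed)


def depth_at(depth_image, x, y):
    """Return the depth (mm) at pixel (x, y). Images are indexed [row, col]."""
    return float(depth_image[int(round(y)), int(round(x))])


def unproject(x, y, depth_mm):
    """Back-project pixel (x, y) and its depth to a 3D point in the camera frame."""
    return np.array([(x - CX) * depth_mm / FX,
                     (y - CY) * depth_mm / FY,
                     depth_mm])


def project(point_3d):
    """Project a 3D camera-frame point back to pixel coordinates."""
    px, py, pz = point_3d
    return np.array([FX * px / pz + CX, FY * py / pz + CY])


# Synthetic 576x640 depth image: a flat surface 624 mm from the camera.
depth = np.full((576, 640), 624, dtype=np.uint16)

keypoint_2d = (350.0, 200.0)           # (x, y) in the depth image
z = depth_at(depth, *keypoint_2d)      # steps 3 and 5: index the depth image
point_3d = unproject(*keypoint_2d, z)  # step 6: 2D pixel + depth -> xyz
```

Note that the image is indexed as `[y, x]` (row, then column). Swapping the two reads the depth of a different pixel, so this is worth ruling out first when values disagree.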

After performing these steps, I noticed a difference in the z value (depth) between the manual conversion and the Body Tracking SDK. As shown in the picture below, body_kps is the xyz coordinate obtained from the Body Tracking SDK, where the z value (depth) is 715.698... The converted xyz coordinate is the "3D point with converted 2D ......." line, where the z value is 624.

[Image: depth_value_difference]
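
One way to check whether the gap comes from a single invalid or noisy pixel, rather than from the conversion itself, is to take the median depth of a small window around the keypoint and compare that with the SDK's z value. The sketch below runs on a synthetic image. The helper `median_valid_depth` and the window radius are assumptions, and `sdk_z` is simply the value quoted above, which is truncated in the original output.

```python
import numpy as np


def median_valid_depth(depth_image, x, y, radius=2):
    """Median of the non-zero depths in a window around pixel (x, y)."""
    row, col = int(round(y)), int(round(x))
    window = depth_image[max(row - radius, 0): row + radius + 1,
                         max(col - radius, 0): col + radius + 1]
    valid = window[window > 0]  # a depth of 0 marks an invalid pixel
    return float(np.median(valid)) if valid.size else float("nan")


# Synthetic depth image: flat surface at 624 mm with one invalid pixel
# exactly at the keypoint location.
depth = np.full((576, 640), 624, dtype=np.uint16)
depth[200, 350] = 0

surface_z = median_valid_depth(depth, 350, 200)
sdk_z = 715.698                # z from the Body Tracking SDK, as quoted above
offset_mm = sdk_z - surface_z  # gap between the SDK joint and the measured depth
```

If the offset stays roughly constant when the window radius changes, the discrepancy is unlikely to be caused by pixel-level noise.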

Does anybody know if I am doing anything wrong, or has anyone faced similar issues? Am I supposed to use both the 2D depth and IR images to find the actual depth? (I saw that the Body Tracking SDK documentation uses the depth and IR images from the Capture object.) If so, how do I combine these two images to get the proper depth value for the coordinates converted from the 2D RGB space?

JJLimmm, Feb 10 '23 04:02