Azure-Kinect-Sensor-SDK

Pixel to world coordinates in Python, pixel_2d_to_point_3d.

Open rida-xavor opened this issue 2 years ago • 14 comments

Hey, I've been following the Python API to convert 2D pixel points to 3D, but I am having trouble calculating the distance in mm to pass to the function. If anyone could help, please let me know.

I know this is the function in the Python API: https://github.com/microsoft/Azure-Kinect-Sensor-SDK/blob/f22b76118c6f8a08d036af1c9f6a8abd3e82ea78/src/python/k4a/src/k4a/_bindings/transformation.py#L157

But I am unable to calculate the depth distance value using https://github.com/microsoft/Azure-Kinect-Sensor-SDK/blob/f22b76118c6f8a08d036af1c9f6a8abd3e82ea78/src/python/k4a/src/k4a/_bindings/transformation.py#L376

If someone could explain how to get the depth distance in Python, I'd appreciate it. Thank you.

rida-xavor avatar Oct 13 '21 09:10 rida-xavor

@qm13 @UnaNancyOwen Could you maybe have a look? I am stuck.

rida-xavor avatar Oct 13 '21 10:10 rida-xavor

@UnaNancyOwen Any leads? I am still struggling to get the depth distance through the functions mentioned above.

rida-xavor avatar Oct 14 '21 04:10 rida-xavor

Azure Kinect has a depth sensor. You can get the depth from the depth image.

UnaNancyOwen avatar Oct 14 '21 08:10 UnaNancyOwen

@UnaNancyOwen I know that; maybe my question is not clear enough. How exactly can I get it in code for a specific pixel on the corresponding color image?

rida-xavor avatar Oct 14 '21 08:10 rida-xavor

You can use color_2d_to_depth_2d to convert between coordinate systems. It converts any (cx, cy) in the color coordinate system to (dx, dy) in the depth coordinate system. Then you can get the depth by simply accessing the depth image at (dx, dy). I recommend reading the documentation to see which conversion APIs are available. https://github.com/microsoft/Azure-Kinect-Sensor-SDK/blob/develop/src/python/k4a/src/k4a/_bindings/transformation.py#L332-L374 https://microsoft.github.io/Azure-Kinect-Sensor-SDK/develop/structk4a_1_1calibration_ab72a37ed466a41d68c522678fa57ff4f.html#ab72a37ed466a41d68c522678fa57ff4f

UnaNancyOwen avatar Oct 14 '21 08:10 UnaNancyOwen

@UnaNancyOwen Thank you for your response. I am calculating the depth value, but my function 2d_pixels_to_3d_points returns None when the depth value is None. Can you please guide me?

rida-xavor avatar Oct 14 '21 12:10 rida-xavor

@rida-xavor It's not clear what you want to know. If you have simple code that reproduces your problem, you should post it. That is the first step toward solving your question.

UnaNancyOwen avatar Oct 15 '21 01:10 UnaNancyOwen

If you want to get a 3D point from a 2D pixel in the color coordinate system, you can get it as in the following sample.

# get depth pixel
color_pixel = (color_image.width_pixels / 2, color_image.height_pixels / 2) # e.g. center of color image
depth_pixel = transformation.color_2d_to_depth_2d(color_pixel, depth_image)
if not all(depth_pixel):
    print("depth pixel is invalid!")
    return

# get depth value
depth_pixel = tuple(map(int, depth_pixel)) # (float, float) -> (int, int)
depth_value = depth_image.data[depth_pixel[1], depth_pixel[0]] # numpy arrays are indexed (row, col) = (y, x)
if depth_value == 0: # compare with ==, not "is" (identity check on ints is a bug)
    print("depth value is invalid!")
    return

# convert to 3d point from 2d pixel
point = transformation.pixel_2d_to_point_3d(color_pixel, depth_value, k4a.ECalibrationType.COLOR, k4a.ECalibrationType.COLOR)
print(point) # (x, y, z)

If you want to get a 3D point from a 2D pixel in the depth coordinate system, you can get it as in the following sample.

# get depth pixel and value
depth_pixel = (self._depth_image.width_pixels / 2, self._depth_image.height_pixels / 2) # e.g. center of depth image
depth_pixel = tuple(map(int, depth_pixel)) # (float, float) -> (int, int)
depth_value = self._depth_image.data[depth_pixel[1], depth_pixel[0]] # numpy arrays are indexed (row, col) = (y, x)
if depth_value == 0:
    return

# convert to 3d point from 2d pixel
point = self._transformation.pixel_2d_to_point_3d(depth_pixel, depth_value, k4a.ECalibrationType.DEPTH, k4a.ECalibrationType.DEPTH)
print(point) # (x, y, z)

> 2d_pixels_to_3d_points returns None when the depth value is none.

That behavior is correct. You need to check for a valid depth value.

UnaNancyOwen avatar Oct 15 '21 02:10 UnaNancyOwen

Basically, I have a color image and a depth image. I use the color image initially to get a few coordinate points, and then I need to get their corresponding world coordinates in mm to perform calculations. I build a dictionary mapping each color pixel to its corresponding world coordinate, but my code crashes and the loop stops when the depth value is 0 and 2d_pixels_to_3d_points returns (None, None, None) for that 2D pixel.

I am not sure which transformation and calibration functions I need to use for my scenario. Do I need to perform an initial depth_to_color_image transformation to make their resolutions the same and avoid the "Invalid depth pixel" error?

My use case is basically mapping MediaPipe pose detection coordinates onto Kinect data. The points are quite specific, and 0 depth values or None pixels are creating a huge problem.

rida-xavor avatar Oct 15 '21 13:10 rida-xavor

@UnaNancyOwen Even after following your code snippet for "3D point from any 2D pixel in the color coordinate system", I am unable to achieve my goal. Also, the documentation is very confusing; maybe you need to work on it a bit. I was previously using an Intel RealSense for this use case and it worked like a charm. I am really not sure how exactly we can tackle the 0 depth value issue.

rida-xavor avatar Oct 15 '21 13:10 rida-xavor

First of all, the depth image retrieved from the sensor is not perfect. It will always contain some pixels with invalid depth. If you find a pixel with invalid depth, you have two choices. The first is to interpolate the depth from the surrounding pixels, which are expected to have a similar depth; perhaps the depth images you used with RealSense were already interpolated. The second is to treat this case as an exception in your app. You can't do what you can't do, so please handle this case as an error in your app.
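The first option, interpolating from surrounding pixels, is not provided by the SDK and would have to be written by hand. A minimal sketch with NumPy, assuming the depth image is a 2D uint16 array where 0 marks invalid depth (the helper name interpolate_depth is hypothetical, and a median of valid neighbours is just one possible strategy):

```python
import numpy as np

def interpolate_depth(depth: np.ndarray, x: int, y: int, radius: int = 2):
    """Return the depth at (x, y); if it is 0 (invalid), fall back to the
    median of the valid (non-zero) depths in a small window around it.
    Returns None when no valid neighbour exists either."""
    h, w = depth.shape
    value = depth[y, x]  # numpy arrays are indexed (row, col) = (y, x)
    if value != 0:
        return int(value)
    # clamp the window to the image bounds
    window = depth[max(0, y - radius):min(h, y + radius + 1),
                   max(0, x - radius):min(w, x + radius + 1)]
    valid = window[window != 0]
    if valid.size == 0:
        return None  # no valid depth nearby; treat as an error in the app
    return int(np.median(valid))
```

A wider radius fills more holes but risks mixing depths from different surfaces, e.g. across an object edge, so keep it small.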

Another thing you should know is that not all pixels of the color image have a corresponding depth. The two sensors (color camera and depth sensor) are in different locations and have different resolutions and viewing angles, so naturally they don't overlap perfectly. This happens even if you try to align the position and resolution using depth_to_color_image. You'll need to consider this case in your app as well.

UnaNancyOwen avatar Oct 15 '21 13:10 UnaNancyOwen

@UnaNancyOwen Thank you so much for the clarity. I tried treating it as an exception and skipping calculations for frames with a 0 depth value, but my use case almost never gives me results, so I guess I'll have to test the first option.

Do we have anything regarding interpolation in the k4a Python API? Does the depth_to_color_custom function help with interpolation, or will we have to write custom code?

rida-xavor avatar Oct 17 '21 11:10 rida-xavor

@UnaNancyOwen Also, regarding your second comment: currently I am using depth_to_color_image to transform my depth image to the geometry of the color image, and then using pixel_2d_to_point_3d(color_pixel, depth_value, k4a.ECalibrationType.COLOR, k4a.ECalibrationType.COLOR) to get the corresponding value in mm in world coordinates, where the depth value is read from the transformed depth image at color_pixel (a 2D pixel on the color camera).

I hope I am doing it the right way.
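The sampling step described above can be sketched as a small helper. This is an illustration only: depth_at_color_pixel is a hypothetical name, and it assumes the depth image has already been transformed to the color camera's geometry (e.g. via depth_to_color_image) into a 2D NumPy array where 0 marks invalid depth.

```python
import numpy as np

def depth_at_color_pixel(transformed_depth: np.ndarray, color_pixel):
    """Sample a transformed depth image at an (x, y) color pixel.
    Returns the depth in mm, or None if the pixel is out of bounds
    or has no valid depth (value 0)."""
    x, y = int(color_pixel[0]), int(color_pixel[1])
    h, w = transformed_depth.shape
    if not (0 <= x < w and 0 <= y < h):
        return None
    value = transformed_depth[y, x]  # numpy indexing is (row, col) = (y, x)
    return int(value) if value != 0 else None
```

Returning None instead of raising keeps a per-frame loop alive when a single pixel lacks depth, which matches the advice above to handle invalid depth explicitly rather than letting the loop crash.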

rida-xavor avatar Oct 17 '21 11:10 rida-xavor

The Azure Kinect SDK doesn't include an implementation of depth image interpolation. You will need to implement it yourself.

UnaNancyOwen avatar Oct 18 '21 00:10 UnaNancyOwen