
Is it possible to convert a 1024x1024 depth image into a 1024x1024x3 point cloud?

Open JeffR1992 opened this issue 2 years ago • 3 comments

I'm running a series of experiments with the Azure Kinect at a depth resolution of 1024x1024, and am trying to figure out if it's possible to convert a 1024x1024 depth image into a 1024x1024x3 point cloud in the form of a numpy array. That is, I'd like to convert each pixel in the 1024x1024 depth image from a depth/range value into an XYZ position.

Once I have a 1024x1024x3 numpy array of XYZ points, I'd then like to take only the z-coordinate of each point, and thus convert the 1024x1024x3 point cloud back into a 1024x1024 "z-depth" image instead of a vanilla depth image.

Any help would be appreciated. Thanks!

JeffR1992 avatar Jul 12 '22 03:07 JeffR1992

Each pixel in the depth image is the distance in mm along the z axis. When creating a point cloud image, only the X and Y values are derived, by multiplying the depth against the calibration table entry for each pixel; the Z value is copied over directly.

This example should help with understanding the depth data and creating a depth point cloud: https://github.com/microsoft/Azure-Kinect-Sensor-SDK/blob/develop/examples/fastpointcloud/main.cpp#L65
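
To make that concrete, here is a minimal numpy sketch of the same approach (the function name and the xy_table argument are illustrative assumptions, not SDK API): the per-pixel X/Y factors would be precomputed once from the depth-camera calibration, e.g. with k4a_calibration_2d_to_3d at unit depth as the fastpointcloud example does, and Z is simply the depth value itself.

```python
import numpy as np

def depth_to_point_cloud(depth_mm, xy_table):
    """Convert an (H, W) depth image in mm into an (H, W, 3) point cloud in mm.

    depth_mm : (H, W) uint16 depth image from the depth camera.
    xy_table : (H, W, 2) float array of per-pixel x/y factors, precomputed once
               from the depth-camera calibration (e.g. k4a_calibration_2d_to_3d
               at unit depth, as in the fastpointcloud example).
    """
    depth = depth_mm.astype(np.float32)
    points = np.empty(depth.shape + (3,), dtype=np.float32)
    points[..., 0] = depth * xy_table[..., 0]  # X: depth scaled by calibration factor
    points[..., 1] = depth * xy_table[..., 1]  # Y: depth scaled by calibration factor
    points[..., 2] = depth                     # Z: copied directly from the depth image
    points[depth == 0] = np.nan                # a depth of 0 marks an invalid pixel
    return points
```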

conatsera avatar Jul 12 '22 11:07 conatsera

Interesting, that makes my life easier. However, just so I understand correctly, if our coordinate axes are defined as follows (with y going into the page):

 z
 ^
 |
 ----> x

are you saying that by default the depth image reports its values as follows:

 ^      ^      ^
 |      |      |
 |      |      |
 |      |      |

That is, each pixel completely ignores the x and y coordinate values and only reports the z coordinate value, instead of reporting the depth as the range along each "pixel ray" (i.e. r = sqrt(x^2 + y^2 + z^2)), as follows:

 ^     ^     ^
  \    |    / 
    \  |  /
      \|/

Apologies for the crude drawings, but it seems I can't upload images to explain what I'm asking.

JeffR1992 avatar Jul 13 '22 06:07 JeffR1992

From the documentation (https://docs.microsoft.com/en-us/azure/kinect-dk/depth-camera#operating-principles): "These measurements are processed to generate a depth map. A depth map is a set of Z-coordinate values for every pixel of the image, measured in units of millimeters."

The depth image doesn't contain the distance from the sensor; it specifically provides z-coordinates.
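
To make the distinction concrete, here is a small self-contained numpy sketch; the depth image and calibration factors are synthetic stand-ins rather than real sensor data. It shows that the z channel of the point cloud reproduces the depth image exactly, so the "z-depth" image you want is just the depth image itself, whereas the radial range along each pixel ray would be a different (larger) quantity.

```python
import numpy as np

# Synthetic stand-ins: a depth image and a point cloud whose X/Y were derived
# from per-pixel calibration factors while Z was copied from the depth value.
depth_mm = np.random.randint(500, 4000, size=(1024, 1024)).astype(np.float32)
xy_table = np.random.uniform(-0.7, 0.7, size=(1024, 1024, 2)).astype(np.float32)
points = np.dstack((depth_mm * xy_table[..., 0],
                    depth_mm * xy_table[..., 1],
                    depth_mm))

# The z channel of the point cloud is identical to the original depth image,
# so no conversion back to a "z-depth" image is needed.
assert np.allclose(points[..., 2], depth_mm)

# The radial range along each pixel ray, r = sqrt(x^2 + y^2 + z^2), is a
# different quantity and is NOT what the depth image stores.
radial_range = np.linalg.norm(points, axis=-1)
assert np.all(radial_range >= depth_mm)
```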

conatsera avatar Jul 13 '22 08:07 conatsera