
[Calibration] Overlay point cloud from two cameras

Open guanming001 opened this issue 4 years ago • 39 comments

Hi

I have tried to modify the green_screen example to display the point clouds obtained from two synchronized cameras, but I couldn't get the point clouds to overlay correctly.

Is there any sample code on overlaying the point clouds from two cameras?

Below is the modified main.cpp for reference; util.h is adapted from AzureKinectSample.

Modified green screen code

// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.
#include <algorithm>
#include <iostream>
#include <vector>
#include <string>
#include <chrono>
#include <limits>

#include <k4a/k4a.hpp>
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>

#include <opencv2/viz.hpp>
#include "util.h" // Adapted from AzureKinectSample

using std::cerr;
using std::cout;
using std::endl;
using std::vector;

#include "transformation.h"
#include "MultiDeviceCapturer.h"

// Allowing at least 160 microseconds between depth cameras should ensure they do not interfere with one another.
constexpr uint32_t MIN_TIME_BETWEEN_DEPTH_CAMERA_PICTURES_USEC = 160;

// ideally, we could generalize this to many OpenCV types
static cv::Mat color_to_opencv(const k4a::image &im);
static cv::Mat depth_to_opencv(const k4a::image &im);
static cv::Matx33f calibration_to_color_camera_matrix(const k4a::calibration &cal);
static Transformation get_depth_to_color_transformation_from_calibration(const k4a::calibration &cal);
static k4a::calibration construct_device_to_device_calibration(const k4a::calibration &main_cal,
                                                               const k4a::calibration &secondary_cal,
                                                               const Transformation &secondary_to_main);
static vector<float> calibration_to_color_camera_dist_coeffs(const k4a::calibration &cal);
static bool find_chessboard_corners_helper(const cv::Mat &main_color_image,
                                           const cv::Mat &secondary_color_image,
                                           const cv::Size &chessboard_pattern,
                                           vector<cv::Point2f> &main_chessboard_corners,
                                           vector<cv::Point2f> &secondary_chessboard_corners);
static Transformation stereo_calibration(const k4a::calibration &main_calib,
                                         const k4a::calibration &secondary_calib,
                                         const vector<vector<cv::Point2f>> &main_chessboard_corners_list,
                                         const vector<vector<cv::Point2f>> &secondary_chessboard_corners_list,
                                         const cv::Size &image_size,
                                         const cv::Size &chessboard_pattern,
                                         float chessboard_square_length);
static k4a_device_configuration_t get_master_config();
static k4a_device_configuration_t get_subordinate_config();
static Transformation calibrate_devices(MultiDeviceCapturer &capturer,
                                        const k4a_device_configuration_t &main_config,
                                        const k4a_device_configuration_t &secondary_config,
                                        const cv::Size &chessboard_pattern,
                                        float chessboard_square_length,
                                        double calibration_timeout);
static k4a::image create_depth_image_like(const k4a::image &im);

// For plotting point cloud
k4a::image main_xyz_image;
k4a::image secondary_xyz_image;
cv::Mat main_xyz;
cv::Mat secondary_xyz;
// Viewer
cv::viz::Viz3d viewer;



int main(int argc, char **argv)
{
    float chessboard_square_length = 0.; // must be included in the input params
    int32_t color_exposure_usec = 8000;  // somewhat reasonable default exposure time
    int32_t powerline_freq = 2;          // default to a 60 Hz powerline
    cv::Size chessboard_pattern(0, 0);   // height, width. Both need to be set.
    uint16_t depth_threshold = 1000;     // default to 1 meter
    size_t num_devices = 0;
    double calibration_timeout = 60.0; // default to timing out after 60s of trying to get calibrated
    double greenscreen_duration = std::numeric_limits<double>::max(); // run forever

    vector<uint32_t> device_indices{ 0 }; // Set up a MultiDeviceCapturer to handle getting many synchronous captures
                                          // Note that the order of indices in device_indices is not necessarily
                                          // preserved because MultiDeviceCapturer tries to find the master device based
                                          // on which one has sync out plugged in. Start with just { 0 }, and add
                                          // another if needed

    if (argc < 5)
    {
        cout << "Usage: green_screen <num-cameras> <board-height> <board-width> <board-square-length> "
                "[depth-threshold-mm (default 1000)] [color-exposure-time-usec (default 8000)] "
                "[powerline-frequency-mode (default 2 for 60 Hz)] [calibration-timeout-sec (default 60)] "
                "[greenscreen-duration-sec (default infinity - run forever)]"
             << endl;

        cerr << "Not enough arguments!\n";
        exit(1);
    }
    else
    {
        num_devices = static_cast<size_t>(atoi(argv[1]));
        if (num_devices > k4a::device::get_installed_count())
        {
            cerr << "Not enough cameras plugged in!\n";
            exit(1);
        }
        chessboard_pattern.height = atoi(argv[2]);
        chessboard_pattern.width = atoi(argv[3]);
        chessboard_square_length = static_cast<float>(atof(argv[4]));

        if (argc > 5)
        {
            depth_threshold = static_cast<uint16_t>(atoi(argv[5]));
            if (argc > 6)
            {
                color_exposure_usec = atoi(argv[6]);
                if (argc > 7)
                {
                    powerline_freq = atoi(argv[7]);
                    if (argc > 8)
                    {
                        calibration_timeout = atof(argv[8]);
                        if (argc > 9)
                        {
                            greenscreen_duration = atof(argv[9]);
                        }
                    }
                }
            }
        }
    }

    if (num_devices != 2 && num_devices != 1)
    {
        cerr << "Invalid choice for number of devices!\n";
        exit(1);
    }
    else if (num_devices == 2)
    {
        device_indices.emplace_back(1); // now device indices are { 0, 1 }
    }
    if (chessboard_pattern.height == 0)
    {
        cerr << "Chessboard height is not properly set!\n";
        exit(1);
    }
    if (chessboard_pattern.width == 0)
    {
        cerr << "Chessboard width is not properly set!\n";
        exit(1);
    }
    if (chessboard_square_length == 0.)
    {
        cerr << "Chessboard square size is not properly set!\n";
        exit(1);
    }

    cout << "Chessboard height: " << chessboard_pattern.height << ". Chessboard width: " << chessboard_pattern.width
         << ". Chessboard square length: " << chessboard_square_length << endl;
    cout << "Depth threshold: " << depth_threshold << ". Color exposure time: " << color_exposure_usec
         << ". Powerline frequency mode: " << powerline_freq << endl;

    MultiDeviceCapturer capturer(device_indices, color_exposure_usec, powerline_freq);

    // Create configurations for devices
    k4a_device_configuration_t main_config = get_master_config();
    if (num_devices == 1) // no need to have a master cable if it's standalone
    {
        main_config.wired_sync_mode = K4A_WIRED_SYNC_MODE_STANDALONE;
    }
    k4a_device_configuration_t secondary_config = get_subordinate_config();

    // Construct all the things that we'll need whether or not we are running with 1 or 2 cameras
    k4a::calibration main_calibration = capturer.get_master_device().get_calibration(main_config.depth_mode,
                                                                                     main_config.color_resolution);

    // Set up a transformation. DO THIS OUTSIDE OF YOUR MAIN LOOP! Constructing transformations involves time-intensive
    // hardware setup and should not change once you have a rigid setup, so only call it once or it will run very
    // slowly.
    k4a::transformation main_depth_to_main_color(main_calibration);

    capturer.start_devices(main_config, secondary_config);
    // get an image to be the background
    vector<k4a::capture> background_captures = capturer.get_synchronized_captures(secondary_config);
    cv::Mat background_image = color_to_opencv(background_captures[0].get_color_image());
    cv::Mat output_image = background_image.clone(); // allocated outside the loop to avoid re-creating every time

    if (num_devices == 1)
    {
        std::chrono::time_point<std::chrono::system_clock> start_time = std::chrono::system_clock::now();
        while (std::chrono::duration<double>(std::chrono::system_clock::now() - start_time).count() <
               greenscreen_duration)
        {
            vector<k4a::capture> captures;
            // secondary_config isn't actually used here because there's no secondary device but the function needs it
            captures = capturer.get_synchronized_captures(secondary_config, true);
            k4a::image main_color_image = captures[0].get_color_image();
            k4a::image main_depth_image = captures[0].get_depth_image();

            // let's green screen out things that are far away.
            // first: let's get the main depth image into the color camera space
            k4a::image main_depth_in_main_color = create_depth_image_like(main_color_image);
            main_depth_to_main_color.depth_image_to_color_camera(main_depth_image, &main_depth_in_main_color);
            cv::Mat cv_main_depth_in_main_color = depth_to_opencv(main_depth_in_main_color);
            cv::Mat cv_main_color_image = color_to_opencv(main_color_image);

            // single-camera case
            cv::Mat within_threshold_range = (cv_main_depth_in_main_color != 0) &
                                             (cv_main_depth_in_main_color < depth_threshold);
            // show the close details
            cv_main_color_image.copyTo(output_image, within_threshold_range);
            // hide the rest with the background image
            background_image.copyTo(output_image, ~within_threshold_range);

            cv::imshow("Green Screen", output_image);
            cv::waitKey(1);
        }
    }
    else if (num_devices == 2)
    {
        // This wraps all the device-to-device details
        Transformation tr_secondary_color_to_main_color = calibrate_devices(capturer,
                                                                            main_config,
                                                                            secondary_config,
                                                                            chessboard_pattern,
                                                                            chessboard_square_length,
                                                                            calibration_timeout);

        k4a::calibration secondary_calibration =
            capturer.get_subordinate_device_by_index(0).get_calibration(secondary_config.depth_mode,
                                                                        secondary_config.color_resolution);
        // Get the transformation from secondary depth to secondary color using its calibration object
        Transformation tr_secondary_depth_to_secondary_color = get_depth_to_color_transformation_from_calibration(
            secondary_calibration);

        // We now have the secondary depth to secondary color transform. We also have the transformation from the
        // secondary color perspective to the main color perspective from the calibration earlier. Now let's compose the
        // depth secondary -> color secondary, color secondary -> color main into depth secondary -> color main
        Transformation tr_secondary_depth_to_main_color = tr_secondary_depth_to_secondary_color.compose_with(
            tr_secondary_color_to_main_color);

        // Construct a new calibration object to transform from the secondary depth camera to the main color camera
        k4a::calibration secondary_depth_to_main_color_cal =
            construct_device_to_device_calibration(main_calibration,
                                                   secondary_calibration,
                                                   tr_secondary_depth_to_main_color);
        k4a::transformation secondary_depth_to_main_color(secondary_depth_to_main_color_cal);

        // Pointcloud viewer
        const cv::String window_name = "point cloud";
        viewer = cv::viz::Viz3d(window_name);
        // Show Coordinate System Origin
        constexpr double scale = 100.0;
        viewer.showWidget("origin", cv::viz::WCameraPosition(scale));

        std::chrono::time_point<std::chrono::system_clock> start_time = std::chrono::system_clock::now();
        while (std::chrono::duration<double>(std::chrono::system_clock::now() - start_time).count() <
               greenscreen_duration)
        {
            vector<k4a::capture> captures;
            captures = capturer.get_synchronized_captures(secondary_config, true);
            k4a::image main_color_image = captures[0].get_color_image();
            k4a::image main_depth_image = captures[0].get_depth_image();

            // let's green screen out things that are far away.
            // first: let's get the main depth image into the color camera space
            k4a::image main_depth_in_main_color = create_depth_image_like(main_color_image);
            main_depth_to_main_color.depth_image_to_color_camera(main_depth_image, &main_depth_in_main_color);
            cv::Mat cv_main_depth_in_main_color = depth_to_opencv(main_depth_in_main_color);
            cv::Mat cv_main_color_image = color_to_opencv(main_color_image);

            k4a::image secondary_depth_image = captures[1].get_depth_image();
            k4a::image secondary_color_image = captures[1].get_color_image();

            // Get the depth image in the main color perspective
            k4a::image secondary_depth_in_main_color = create_depth_image_like(main_color_image);
            secondary_depth_to_main_color.depth_image_to_color_camera(secondary_depth_image,
                                                                      &secondary_depth_in_main_color);
            cv::Mat cv_secondary_depth_in_main_color = depth_to_opencv(secondary_depth_in_main_color);
            cv::Mat cv_secondary_color_image = color_to_opencv(secondary_color_image);


            // Transform Depth Image to Point Cloud
            main_xyz_image = main_depth_to_main_color.depth_image_to_point_cloud(main_depth_in_main_color,
                                                                                 K4A_CALIBRATION_TYPE_COLOR);
            secondary_xyz_image = secondary_depth_to_main_color.depth_image_to_point_cloud(secondary_depth_in_main_color,
                                                                                           K4A_CALIBRATION_TYPE_COLOR);
            // Get cv::Mat from k4a::image
            main_xyz = k4a::get_mat(main_xyz_image);
            secondary_xyz = k4a::get_mat(secondary_xyz_image);

            // Create Point Cloud Widget
            //cv::viz::WCloud cloud = cv::viz::WCloud(main_xyz, cv_main_color_image);
            cv::viz::WCloud cloud = cv::viz::WCloud(main_xyz, cv::viz::Color::blue());
            cv::viz::WCloud cloud2 = cv::viz::WCloud(secondary_xyz, cv::viz::Color::red());

            // Show Widget
            viewer.showWidget("cloud", cloud);
            viewer.showWidget("cloud2", cloud2);
            viewer.spinOnce();

            //// Now it's time to actually construct the green screen. Where the depth is 0, the camera doesn't know how
            //// far away the object is because it didn't get a response at that point. That's where we'll try to fill in
            //// the gaps with the other camera.
            //cv::Mat main_valid_mask = cv_main_depth_in_main_color != 0;
            //cv::Mat secondary_valid_mask = cv_secondary_depth_in_main_color != 0;
            //// build depth mask. If the main camera depth for a pixel is valid and the depth is within the threshold,
            //// then set the mask to display that pixel. If the main camera depth for a pixel is invalid but the
            //// secondary depth for a pixel is valid and within the threshold, then set the mask to display that pixel.
            //cv::Mat within_threshold_range = (main_valid_mask & (cv_main_depth_in_main_color < depth_threshold)) |
            //                                 (~main_valid_mask & secondary_valid_mask &
            //                                  (cv_secondary_depth_in_main_color < depth_threshold));
            //// copy main color image to output image only where the mask within_threshold_range is true
            //cv_main_color_image.copyTo(output_image, within_threshold_range);
            //// fill the rest with the background image
            //background_image.copyTo(output_image, ~within_threshold_range);

            //cv::imshow("Green Screen", output_image);
            cv::waitKey(1);
        }

        // Close Viewer
        viewer.close();
    }
    else
    {
        cerr << "Invalid number of devices!" << endl;
        exit(1);
    }
    return 0;
}

static cv::Mat color_to_opencv(const k4a::image &im)
{
    cv::Mat cv_image_with_alpha(im.get_height_pixels(), im.get_width_pixels(), CV_8UC4, (void *)im.get_buffer());
    cv::Mat cv_image_no_alpha;
    cv::cvtColor(cv_image_with_alpha, cv_image_no_alpha, cv::COLOR_BGRA2BGR);
    return cv_image_no_alpha;
}

static cv::Mat depth_to_opencv(const k4a::image &im)
{
    return cv::Mat(im.get_height_pixels(),
                   im.get_width_pixels(),
                   CV_16U,
                   (void *)im.get_buffer(),
                   static_cast<size_t>(im.get_stride_bytes()));
}

static cv::Matx33f calibration_to_color_camera_matrix(const k4a::calibration &cal)
{
    const k4a_calibration_intrinsic_parameters_t::_param &i = cal.color_camera_calibration.intrinsics.parameters.param;
    cv::Matx33f camera_matrix = cv::Matx33f::eye();
    camera_matrix(0, 0) = i.fx;
    camera_matrix(1, 1) = i.fy;
    camera_matrix(0, 2) = i.cx;
    camera_matrix(1, 2) = i.cy;
    return camera_matrix;
}

static Transformation get_depth_to_color_transformation_from_calibration(const k4a::calibration &cal)
{
    const k4a_calibration_extrinsics_t &ex = cal.extrinsics[K4A_CALIBRATION_TYPE_DEPTH][K4A_CALIBRATION_TYPE_COLOR];
    Transformation tr;
    for (int i = 0; i < 3; ++i)
    {
        for (int j = 0; j < 3; ++j)
        {
            tr.R(i, j) = ex.rotation[i * 3 + j];
        }
    }
    tr.t = cv::Vec3d(ex.translation[0], ex.translation[1], ex.translation[2]);
    return tr;
}

// This function constructs a calibration that operates as a transformation between the secondary device's depth camera
// and the main camera's color camera. IT WILL NOT GENERALIZE TO OTHER TRANSFORMS. Under the hood, the transformation
// depth_image_to_color_camera method can be thought of as converting each depth pixel to a 3d point using the
// intrinsics of the depth camera, then using the calibration's extrinsics to convert between depth and color, then
// using the color intrinsics to produce the point in the color camera perspective.
static k4a::calibration construct_device_to_device_calibration(const k4a::calibration &main_cal,
                                                               const k4a::calibration &secondary_cal,
                                                               const Transformation &secondary_to_main)
{
    k4a::calibration cal = secondary_cal;
    k4a_calibration_extrinsics_t &ex = cal.extrinsics[K4A_CALIBRATION_TYPE_DEPTH][K4A_CALIBRATION_TYPE_COLOR];
    for (int i = 0; i < 3; ++i)
    {
        for (int j = 0; j < 3; ++j)
        {
            ex.rotation[i * 3 + j] = static_cast<float>(secondary_to_main.R(i, j));
        }
    }
    for (int i = 0; i < 3; ++i)
    {
        ex.translation[i] = static_cast<float>(secondary_to_main.t[i]);
    }
    cal.color_camera_calibration = main_cal.color_camera_calibration;
    return cal;
}

static vector<float> calibration_to_color_camera_dist_coeffs(const k4a::calibration &cal)
{
    const k4a_calibration_intrinsic_parameters_t::_param &i = cal.color_camera_calibration.intrinsics.parameters.param;
    return { i.k1, i.k2, i.p1, i.p2, i.k3, i.k4, i.k5, i.k6 };
}

bool find_chessboard_corners_helper(const cv::Mat &main_color_image,
                                    const cv::Mat &secondary_color_image,
                                    const cv::Size &chessboard_pattern,
                                    vector<cv::Point2f> &main_chessboard_corners,
                                    vector<cv::Point2f> &secondary_chessboard_corners)
{
    bool found_chessboard_main = cv::findChessboardCorners(main_color_image,
                                                           chessboard_pattern,
                                                           main_chessboard_corners);
    bool found_chessboard_secondary = cv::findChessboardCorners(secondary_color_image,
                                                                chessboard_pattern,
                                                                secondary_chessboard_corners);

    // Cover the failure cases where chessboards were not found in one or both images.
    if (!found_chessboard_main || !found_chessboard_secondary)
    {
        if (found_chessboard_main)
        {
            cout << "Could not find the chessboard corners in the secondary image. Trying again...\n";
        }
        // Likewise, if the chessboard was found in the secondary image, it was not found in the main image.
        else if (found_chessboard_secondary)
        {
            cout << "Could not find the chessboard corners in the main image. Trying again...\n";
        }
        // The only remaining case is the corners were in neither image.
        else
        {
            cout << "Could not find the chessboard corners in either image. Trying again...\n";
        }
        return false;
    }
    // Before we go on, there's a quick problem with calibration to address.  Because the chessboard looks the same when
    // rotated 180 degrees, it is possible that the chessboard corner finder may find the correct points, but in the
    // wrong order.

    // A visual:
    //        Image 1                  Image 2
    // .....................    .....................
    // .....................    .....................
    // .........xxxxx2......    .....xxxxx1..........
    // .........xxxxxx......    .....xxxxxx..........
    // .........xxxxxx......    .....xxxxxx..........
    // .........1xxxxx......    .....2xxxxx..........
    // .....................    .....................
    // .....................    .....................

    // The problem occurs when this case happens: the find_chessboard() function correctly identifies the points on the
    // chessboard (shown as 'x's) but the order of those points differs between images taken by the two cameras.
    // Specifically, the first point in the list of points found for the first image (1) is the *last* point in the list
    // of points found for the second image (2), though they correspond to the same physical point on the chessboard.

    // To avoid this problem, we can make the assumption that both of the cameras will be oriented in a similar manner
    // (e.g. turning one of the cameras upside down will break this assumption) and enforce that the vector between the
    // first and last points found in pixel space (which will be at opposite ends of the chessboard) are pointing the
    // same direction- so, the dot product of the two vectors is positive.

    cv::Vec2f main_image_corners_vec = main_chessboard_corners.back() - main_chessboard_corners.front();
    cv::Vec2f secondary_image_corners_vec = secondary_chessboard_corners.back() - secondary_chessboard_corners.front();
    if (main_image_corners_vec.dot(secondary_image_corners_vec) <= 0.0)
    {
        std::reverse(secondary_chessboard_corners.begin(), secondary_chessboard_corners.end());
    }
    return true;
}

Transformation stereo_calibration(const k4a::calibration &main_calib,
                                  const k4a::calibration &secondary_calib,
                                  const vector<vector<cv::Point2f>> &main_chessboard_corners_list,
                                  const vector<vector<cv::Point2f>> &secondary_chessboard_corners_list,
                                  const cv::Size &image_size,
                                  const cv::Size &chessboard_pattern,
                                  float chessboard_square_length)
{
    // We have points in each image that correspond to the corners that the findChessboardCorners function found.
    // However, we still need the points in 3 dimensions that these points correspond to. Because we are ultimately only
    // interested in finding a transformation between two cameras, these points don't have to correspond to an external
    // "origin" point. The only important thing is that the relative distances between points are accurate. As a result,
    // we can simply make the first corresponding point (0, 0) and construct the remaining points based on that one. The
    // order of points inserted into the vector here matches the ordering of findChessboardCorners. The units of these
    // points are in millimeters, mostly because the depth provided by the depth cameras is also provided in
    // millimeters, which makes for easy comparison.
    vector<cv::Point3f> chessboard_corners_world;
    for (int h = 0; h < chessboard_pattern.height; ++h)
    {
        for (int w = 0; w < chessboard_pattern.width; ++w)
        {
            chessboard_corners_world.emplace_back(
                cv::Point3f{ w * chessboard_square_length, h * chessboard_square_length, 0.0 });
        }
    }

    // Calibrating the cameras requires a lot of data. OpenCV's stereoCalibrate function requires:
    // - a list of points in real 3d space that will be used to calibrate*
    // - a corresponding list of pixel coordinates as seen by the first camera*
    // - a corresponding list of pixel coordinates as seen by the second camera*
    // - the camera matrix of the first camera
    // - the distortion coefficients of the first camera
    // - the camera matrix of the second camera
    // - the distortion coefficients of the second camera
    // - the size (in pixels) of the images
    // - R: stereoCalibrate stores the rotation matrix from the first camera to the second here
    // - t: stereoCalibrate stores the translation vector from the first camera to the second here
    // - E: stereoCalibrate stores the essential matrix here (we don't use this)
    // - F: stereoCalibrate stores the fundamental matrix here (we don't use this)
    //
    // * note: OpenCV's stereoCalibrate actually requires as input an array of arrays of points for these arguments,
    // allowing a caller to provide multiple frames from the same camera with corresponding points. For example, if
    // extremely high precision was required, many images could be taken with each camera, and findChessboardCorners
    // applied to each of those images, and OpenCV can jointly solve for all of the pairs of corresponding images.
    // However, to keep things simple, we use only one image from each device to calibrate.  This is also why each of
    // the vectors of corners is placed into another vector.
    //
    // OpenCV's calibration functions also require that these points be F32 types, so we use those.
    // However, OpenCV still provides doubles as output, strangely enough.
    vector<vector<cv::Point3f>> chessboard_corners_world_nested_for_cv(main_chessboard_corners_list.size(),
                                                                       chessboard_corners_world);

    cv::Matx33f main_camera_matrix = calibration_to_color_camera_matrix(main_calib);
    cv::Matx33f secondary_camera_matrix = calibration_to_color_camera_matrix(secondary_calib);
    vector<float> main_dist_coeff = calibration_to_color_camera_dist_coeffs(main_calib);
    vector<float> secondary_dist_coeff = calibration_to_color_camera_dist_coeffs(secondary_calib);

    // Finally, we'll actually calibrate the cameras.
    // Pass secondary first, then main, because we want a transform from secondary to main.
    Transformation tr;
    double error = cv::stereoCalibrate(chessboard_corners_world_nested_for_cv,
                                       secondary_chessboard_corners_list,
                                       main_chessboard_corners_list,
                                       secondary_camera_matrix,
                                       secondary_dist_coeff,
                                       main_camera_matrix,
                                       main_dist_coeff,
                                       image_size,
                                       tr.R, // output
                                       tr.t, // output
                                       cv::noArray(),
                                       cv::noArray(),
                                       cv::CALIB_FIX_INTRINSIC | cv::CALIB_RATIONAL_MODEL | cv::CALIB_CB_FAST_CHECK);
    cout << "Finished calibrating!\n";
    cout << "Got error of " << error << "\n";
    return tr;
}

// The following functions provide the configurations that should be used for each camera.
// NOTE: For best results both cameras should have the same configuration (framerate, resolution, color and depth
// modes). Additionally, both the master and subordinate should have the same exposure and power line settings.
// Exposure settings can be different, but then the subordinate must have a longer exposure than the master. To
// synchronize a master and subordinate with different exposures the user should set
// `subordinate_delay_off_master_usec = ((subordinate exposure time) - (master exposure time))/2`.
//
static k4a_device_configuration_t get_default_config()
{
    k4a_device_configuration_t camera_config = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
    camera_config.color_format = K4A_IMAGE_FORMAT_COLOR_BGRA32;
    camera_config.color_resolution = K4A_COLOR_RESOLUTION_720P;
    camera_config.depth_mode = K4A_DEPTH_MODE_WFOV_UNBINNED; // No need for depth during calibration
    camera_config.camera_fps = K4A_FRAMES_PER_SECOND_15;     // Don't use all USB bandwidth
    camera_config.subordinate_delay_off_master_usec = 0;     // Must be zero for master
    camera_config.synchronized_images_only = true;
    return camera_config;
}

// Master customizable settings
static k4a_device_configuration_t get_master_config()
{
    k4a_device_configuration_t camera_config = get_default_config();
    camera_config.wired_sync_mode = K4A_WIRED_SYNC_MODE_MASTER;

    // Two depth images should be separated by MIN_TIME_BETWEEN_DEPTH_CAMERA_PICTURES_USEC to ensure the depth imaging
    // sensor doesn't interfere with the other. To accomplish this the master depth image captures
    // (MIN_TIME_BETWEEN_DEPTH_CAMERA_PICTURES_USEC / 2) before the color image, and the subordinate camera captures its
    // depth image (MIN_TIME_BETWEEN_DEPTH_CAMERA_PICTURES_USEC / 2) after the color image. This gives us two depth
    // images centered around the color image as closely as possible.
    camera_config.depth_delay_off_color_usec = -static_cast<int32_t>(MIN_TIME_BETWEEN_DEPTH_CAMERA_PICTURES_USEC / 2);
    camera_config.synchronized_images_only = true;
    return camera_config;
}

// Subordinate customizable settings
static k4a_device_configuration_t get_subordinate_config()
{
    k4a_device_configuration_t camera_config = get_default_config();
    camera_config.wired_sync_mode = K4A_WIRED_SYNC_MODE_SUBORDINATE;

    // Two depth images should be separated by MIN_TIME_BETWEEN_DEPTH_CAMERA_PICTURES_USEC to ensure the depth imaging
    // sensor doesn't interfere with the other. To accomplish this the master depth image captures
    // (MIN_TIME_BETWEEN_DEPTH_CAMERA_PICTURES_USEC / 2) before the color image, and the subordinate camera captures its
    // depth image (MIN_TIME_BETWEEN_DEPTH_CAMERA_PICTURES_USEC / 2) after the color image. This gives us two depth
    // images centered around the color image as closely as possible.
    camera_config.depth_delay_off_color_usec = MIN_TIME_BETWEEN_DEPTH_CAMERA_PICTURES_USEC / 2;
    return camera_config;
}

static Transformation calibrate_devices(MultiDeviceCapturer &capturer,
                                        const k4a_device_configuration_t &main_config,
                                        const k4a_device_configuration_t &secondary_config,
                                        const cv::Size &chessboard_pattern,
                                        float chessboard_square_length,
                                        double calibration_timeout)
{
    k4a::calibration main_calibration = capturer.get_master_device().get_calibration(main_config.depth_mode,
                                                                                     main_config.color_resolution);

    k4a::calibration secondary_calibration =
        capturer.get_subordinate_device_by_index(0).get_calibration(secondary_config.depth_mode,
                                                                    secondary_config.color_resolution);
    vector<vector<cv::Point2f>> main_chessboard_corners_list;
    vector<vector<cv::Point2f>> secondary_chessboard_corners_list;
    std::chrono::time_point<std::chrono::system_clock> start_time = std::chrono::system_clock::now();
    while (std::chrono::duration<double>(std::chrono::system_clock::now() - start_time).count() < calibration_timeout)
    {
        vector<k4a::capture> captures = capturer.get_synchronized_captures(secondary_config);
        k4a::capture &main_capture = captures[0];
        k4a::capture &secondary_capture = captures[1];
        // get_color_image is guaranteed to be non-null because we use get_synchronized_captures for color
        // (get_synchronized_captures also offers a flag to use depth for the secondary camera instead of color).
        k4a::image main_color_image = main_capture.get_color_image();
        k4a::image secondary_color_image = secondary_capture.get_color_image();
        cv::Mat cv_main_color_image = color_to_opencv(main_color_image);
        cv::Mat cv_secondary_color_image = color_to_opencv(secondary_color_image);

        vector<cv::Point2f> main_chessboard_corners;
        vector<cv::Point2f> secondary_chessboard_corners;
        bool got_corners = find_chessboard_corners_helper(cv_main_color_image,
                                                          cv_secondary_color_image,
                                                          chessboard_pattern,
                                                          main_chessboard_corners,
                                                          secondary_chessboard_corners);
        if (got_corners)
        {
            main_chessboard_corners_list.emplace_back(main_chessboard_corners);
            secondary_chessboard_corners_list.emplace_back(secondary_chessboard_corners);
            cv::drawChessboardCorners(cv_main_color_image, chessboard_pattern, main_chessboard_corners, true);
            cv::drawChessboardCorners(cv_secondary_color_image, chessboard_pattern, secondary_chessboard_corners, true);
        }

        cv::imshow("Chessboard view from main camera", cv_main_color_image);
        cv::waitKey(1);
        cv::imshow("Chessboard view from secondary camera", cv_secondary_color_image);
        cv::waitKey(1);

        // Get 20 frames before doing calibration.
        if (main_chessboard_corners_list.size() >= 20)
        {
            cout << "Calculating calibration..." << endl;
            return stereo_calibration(main_calibration,
                                      secondary_calibration,
                                      main_chessboard_corners_list,
                                      secondary_chessboard_corners_list,
                                      cv_main_color_image.size(),
                                      chessboard_pattern,
                                      chessboard_square_length);
        }
    }
    std::cerr << "Calibration timed out!\n";
    exit(1);
}

static k4a::image create_depth_image_like(const k4a::image &im)
{
    return k4a::image::create(K4A_IMAGE_FORMAT_DEPTH16,
                              im.get_width_pixels(),
                              im.get_height_pixels(),
                              im.get_width_pixels() * static_cast<int>(sizeof(uint16_t)));
}

guanming001 avatar Sep 27 '19 12:09 guanming001

@guanming001 Thank you for the feedback and sharing the sample code.

I had a quick look at the green screen sample code and also at the sample code showing how you leverage the green screen sample to calibrate your two cameras and transform the point clouds into the same camera space (in the sample code you shared here, they are all transformed to the "main color" camera). I could compile your code without changing anything, and it seems to align the two point clouds from the two cameras in the same coordinate space pretty well. See the image below; apologies for its simplicity, I did not want to include much private data in it (you are rendering blue and red to represent the point clouds from the two devices). (screenshot attached)

And yes, you are doing the right thing. Overlaying point clouds from two or more cameras is quite a common use case for multiple Azure Kinects in the community, and the green screen sample is one example. The sample code you shared looks reasonable. Essentially, the steps for this task are:

  • You need to calibrate the extrinsics (rotation/translation) between the two (or more) Kinect devices. Depending on your hardware setup (device count, location/pointing direction, distance between object and cameras, etc.), you may need to choose the best-fit calibration method and target calibration pattern to ensure subpixel reprojection accuracy. The green screen sample uses a simple stereo calibration from OpenCV as an example; there are many other approaches (multi-board, moving board, etc.) and many targets other than a chessboard that can help you calibrate, depending on your needs.
  • You need to generate point clouds from the two depth cameras, which the SDK supports.
  • You need to define the coordinate space in which you want to render them. For example, you can simply define one of the cameras as the coordinate space center (given the extrinsics, you can transform point clouds among all cameras by combining the calibration that comes from each device with the extrinsics you calibrated between devices). Then you can mimic how the green screen example creates a new calibration object from the source camera to the target camera (even across devices) and use the SDK transformation function to transform them; a sketch of how two such extrinsics compose is shown right after this list.
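
To make the composition step concrete: chaining two rigid transforms (secondary depth to secondary color, then secondary color to main color) multiplies the rotations and rotates-then-offsets the translations. The sketch below only illustrates that algebra; it assumes a struct with a cv::Matx33d rotation and a cv::Vec3d translation in millimeters, similar to the Transformation type used in the code above (the sample's actual compose_with lives in its transformation.h).

#include <opencv2/core.hpp>

// Illustrative rigid-transform composition: applying A first, then B, gives
// x_out = B.R * (A.R * x + A.t) + B.t = (B.R * A.R) * x + (B.R * A.t + B.t)
struct RigidTransform
{
    cv::Matx33d R; // rotation
    cv::Vec3d t;   // translation in millimeters
};

static RigidTransform compose(const RigidTransform &A, const RigidTransform &B)
{
    RigidTransform out;
    out.R = B.R * A.R;       // rotations multiply
    out.t = B.R * A.t + B.t; // A's translation is rotated into B's frame, then offset
    return out;
}

This mirrors the compose_with call in the code above, where depth secondary -> color secondary and color secondary -> color main are chained into depth secondary -> color main.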

The key to getting them to overlay accurately will be the calibration stage.

Now, you mentioned you cannot get a good overlay of the two point clouds. It would be good if you could share some more information:

  • Can you describe how large you think the alignment error is?
  • What type of target object is in these point clouds? (Some materials tend to cause depth accuracy issues, such as subsurface scattering on human skin, so you may see slight alignment errors due to the depth accuracy on such materials.)
  • Is the alignment issue happening for moving objects or for stationary objects as well? (If the point clouds align for a stationary object but show more error on a moving object, this may indicate a time synchronization issue.)

rabbitdaxi avatar Sep 30 '19 19:09 rabbitdaxi

Hi @rabbitdaxi Thank you for your time and detailed feedback!

Below is an overview of my setup with two Azure Kinects and a (10 x 7) calibration board (square size of 25 mm) placed around 1.0 m from both cameras (setup photo attached).

I ran the above modified green_screen code using the following settings: num-cameras = 2, board-height = 7, board-width = 10, board-square-length = 25, depth-threshold-mm = 1500, color-exposure-time-usec = 10000, powerline-frequency-mode = 1, calibration-timeout-sec = 60

>main.exe 2 7 10 25 1500 10000 1 60 

Below is some output from the calibration result:

Chessboard height: 7. Chessboard width: 10. Chessboard square length: 25 Depth threshold: : 1500. Color exposure time: 10000. Powerline frequency mode: 1
...
Calculating calibration... 
Finished calibrating! 
Got error of 0.112819

Chessboard view from main camera: (screenshot)

Chessboard view from secondary camera: (screenshot)

Below is the result I obtained. The object is a stationary 3D-printed Stanford Bunny, and the red point cloud (secondary camera) is offset to the left of the blue point cloud (main camera).

Front view: (screenshot)

Top view: (screenshot)

I have also recorded two .mkv files (output-main.mkv and output-secondary.mkv) using k4arecorder.exe for your reference. Below are the commands I used to log the 2-second recordings on a single computer. Start recorder on subordinate:

k4arecorder.exe --device 0 -l 2 -c 720p -d WFOV_UNBINNED -r 5 --imu OFF --external-sync subordinate --sync-delay 160 -e -8 output-secondary.mkv

Start master device:

k4arecorder.exe --device 1 -l 2 -c 720p -d WFOV_UNBINNED -r 5 --imu OFF --external-sync master -e -8 output-main.mkv

guanming001 avatar Oct 01 '19 02:10 guanming001

How many images of the calibration board did you take per camera? Did the images show the board from different perspectives?

mbleyer avatar Oct 02 '19 20:10 mbleyer

Hi @mbleyer, I have only taken one image of the calibration board per camera.

guanming001 avatar Oct 02 '19 21:10 guanming001

This might be the problem. Please take multiple images (>30) of the calibration board at different distances and rotations.

mbleyer avatar Oct 02 '19 22:10 mbleyer

Hi @mbleyer, thank you for your suggestion. I have tried taking 30 images of the calibration board at different distances and rotations by modifying the calibrate_devices() function to wait for user input before capturing the next frame:

static Transformation calibrate_devices(MultiDeviceCapturer &capturer,
                                        const k4a_device_configuration_t &main_config,
                                        const k4a_device_configuration_t &secondary_config,
                                        const cv::Size &chessboard_pattern,
                                        float chessboard_square_length,
                                        double calibration_timeout)
{
    k4a::calibration main_calibration = capturer.get_master_device().get_calibration(main_config.depth_mode,
                                                                                     main_config.color_resolution);

    k4a::calibration secondary_calibration =
        capturer.get_subordinate_device_by_index(0).get_calibration(secondary_config.depth_mode,
                                                                    secondary_config.color_resolution);
    vector<vector<cv::Point2f>> main_chessboard_corners_list;
    vector<vector<cv::Point2f>> secondary_chessboard_corners_list;
    std::chrono::time_point<std::chrono::system_clock> start_time = std::chrono::system_clock::now();
    int counter = 0;
    //while (std::chrono::duration<double>(std::chrono::system_clock::now() - start_time).count() < calibration_timeout)
    while (true)
    {
        vector<k4a::capture> captures = capturer.get_synchronized_captures(secondary_config);
        k4a::capture &main_capture = captures[0];
        k4a::capture &secondary_capture = captures[1];
        // get_color_image is guaranteed to be non-null because we use get_synchronized_captures for color
        // (get_synchronized_captures also offers a flag to use depth for the secondary camera instead of color).
        k4a::image main_color_image = main_capture.get_color_image();
        k4a::image secondary_color_image = secondary_capture.get_color_image();
        cv::Mat cv_main_color_image = color_to_opencv(main_color_image);
        cv::Mat cv_secondary_color_image = color_to_opencv(secondary_color_image);

        vector<cv::Point2f> main_chessboard_corners;
        vector<cv::Point2f> secondary_chessboard_corners;
        bool got_corners = find_chessboard_corners_helper(cv_main_color_image,
                                                          cv_secondary_color_image,
                                                          chessboard_pattern,
                                                          main_chessboard_corners,
                                                          secondary_chessboard_corners);
        if (got_corners)
        {
            main_chessboard_corners_list.emplace_back(main_chessboard_corners);
            secondary_chessboard_corners_list.emplace_back(secondary_chessboard_corners);
            cv::drawChessboardCorners(cv_main_color_image, chessboard_pattern, main_chessboard_corners, true);
            cv::drawChessboardCorners(cv_secondary_color_image, chessboard_pattern, secondary_chessboard_corners, true);

            std::stringstream ss;
            ss << std::setw(6) << std::setfill('0') << counter++;
            cv::imwrite(ss.str() + "_main.png", cv_main_color_image);
            cv::imwrite(ss.str() + "_seco.png", cv_secondary_color_image);
            cout << "Saved " << ss.str() << endl;
        }

        cv::imshow("Chessboard view from main camera", cv_main_color_image);
        //cv::waitKey(1);
        cv::imshow("Chessboard view from secondary camera", cv_secondary_color_image);
        cv::waitKey(0); // Wait for user input before capturing the next frame

        // Get 30 frames before doing calibration.
        if (main_chessboard_corners_list.size() >= 30)
        {
            cout << "Calculating calibration..." << endl;
            return stereo_calibration(main_calibration,
                                      secondary_calibration,
                                      main_chessboard_corners_list,
                                      secondary_chessboard_corners_list,
                                      cv_main_color_image.size(),
                                      chessboard_pattern,
                                      chessboard_square_length);
        }
    }
    std::cerr << "Calibration timed out!\n";
    exit(1);
}
Below are the images of the captured calibration board:

(combined screenshot of the captured calibration board views)

But the result I got is still not aligned:

Front view: (screenshot)

Side view: (screenshot)

However, when I tried placing both Kinects side by side and calibrating with one image per camera, I can get a good result:

Setup: (photo)

Front view: (screenshot)

Side view: (screenshot)

guanming001 avatar Oct 03 '19 03:10 guanming001

Hi @guanming001,

I have the same issue as you. Did you find a way to properly find the transformation between the two cameras? Could the problem come from the intrinsic and distortion coefficients of the Kinect cameras?

Thanks in advance!

valvador92 avatar Oct 07 '19 08:10 valvador92

Hi @valvador92 glad to know someone is experiencing the same issue as me!

I still could not solve the problem with Azure Kinect, but I have tried to use a similar method (stereo calibrate) for two Intel Realsense D415 cameras (without distortion) and the point clouds could overlay correctly.

guanming001 avatar Oct 07 '19 09:10 guanming001

Hi @guanming001, Would you mind trying my code? https://github.com/forestsen/KinectAzureDKProgramming — you can find the two-Kinect calibration using the aruco library there. And in this project, https://github.com/forestsen/K4aGrabber, you can find real-time point cloud stitching for two Kinects.

forestsen avatar Oct 07 '19 17:10 forestsen

Hi @forestsen

Thanks for your code, I am able to build your Aruco_TwoKinects_Calibration_Extrinsics example. With the printed arucopattern.png, I ran your code and was able to get the frame_marker_sub.csv and frame_sub_master.csv files.

Then I ran the PCL_Grabber_TwoKinects code to get the point cloud stitching for two Kinects, but below is the result I get (screenshot attached):

May I check whether you are able to successfully align the point clouds, and did I miss any steps?

Note: I tested on azure-kinect-sensor-sdk version 1.2.0 instead of 1.1.1 (by making some slight modifications to the CMakeList.txt and azure-kinect-sensor-sdk-config.cmake)

guanming001 avatar Oct 08 '19 04:10 guanming001

Hi, @guanming001 Thank you for your advice. I have changed the k4a version of my code. You can pull the new code of https://github.com/forestsen/KinectAzureDKProgramming and https://github.com/forestsen/K4aGrabber. This is my result (Point Cloud Viewer screenshot attached). I use parameters such as "frame_sub_master.csv frame_sub_marker.csv 0.25 1.0 0.25 1.0" in the PCL_Grabber_TwoKinects project.

forestsen avatar Oct 08 '19 07:10 forestsen

@guanming001 , which modifications have you made to the CMakeList.txt and the azure-kinect-sensor-sdk-config.cmake?

@mbleyer Hi, could the problem come from the intrinsic parameters of the camera and the distortion coefficients? How good is the calibration done by Microsoft?

valvador92 avatar Oct 08 '19 07:10 valvador92

Hi, @valvador92 You can check out the new code of https://github.com/forestsen/KinectAzureDKProgramming and https://github.com/forestsen/K4aGrabber I have changed the k4a version of my code.

forestsen avatar Oct 08 '19 07:10 forestsen

@forestsen Thank you for your message.

I am trying to build https://github.com/forestsen/K4aGrabber.

But I got this error. Do you know why?

(error screenshot attached) Is your code running on Ubuntu 18.04?

valvador92 avatar Oct 09 '19 08:10 valvador92

@forestsen Thank you for your message.

I am trying to build https://github.com/forestsen/K4aGrabber.

But I got this error. Do you know why?

(error screenshot attached) Is your code running on Ubuntu 18.04?

I commented out the template line. It runs now.

What are these parameters: master_znear master_zfar sub_znear sub_zfar?

valvador92 avatar Oct 09 '19 09:10 valvador92

@forestsen I tried it and I have the same problem as @guanming001

(screenshot)

I got this kind of image when I run Aruco_TwoKinects_Calibration_Extrinsics:

(screenshot)

valvador92 avatar Oct 09 '19 09:10 valvador92

Hi @valvador92 @guanming001 I refined the calibration result using Open3D 0.8.0. Using Open3D's ColoredICP, I refined the extrinsic matrix between the two Azure Kinects. This is the result of Aruco_TwoKinects_Calibration_Extrinsics (screenshot). This is the result based only on the calibration result (the extrinsic matrix between the two Kinects) from Aruco_TwoKinects_Calibration_Extrinsics (screenshots). This is the new result from PCL_Grabber_TwoKinects (screenshots). And this is another comparison result (screenshots).

Lastly, you just use the "frame_sub_master.csv" parameter in the new version of PCL_Grabber_TwoKinects.

forestsen avatar Oct 10 '19 10:10 forestsen

@valvador92: The accuracy of the intrinsic calibration is expected to be high. We are using the intrinsic parameters to convert from radial depth to Z values. If there were inaccuracies in the intrinsic calibration, flat surfaces would not appear flat in the 3d reconstruction.

My impression is that the problem is the extrinsic calibration using OpenCV. I am sure that there is plenty of information about camera calibration with OpenCV on the web. Oftentimes, the calibration does not give precise results, because the user took too few or too similar images of the calibration board, the calibration board was not flat enough or the size of the squares was not measured accurately. I would also make sure that you move the calibration board as close as possible to the cameras in some recordings and rotate the calibration board around all three axes. I would also check for situations where the order of corners is flipped across left and right images, which sometimes happened, at least in older versions of OpenCV.

To verify the calibration you might want to look at the translation vector and check whether its length matches the measured distance between your cameras.
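
As a rough illustration of that check (assuming the Transformation struct from the code shared earlier, with t stored as a cv::Vec3d in millimeters, and a hypothetical tape-measured baseline):

// Sanity-check the stereo calibration: the length of the translation vector should
// roughly match the physically measured distance between the two color cameras.
double baseline_mm = cv::norm(tr.t); // 'tr' is the Transformation returned by stereo_calibration()
std::cout << "Estimated camera baseline: " << baseline_mm << " mm" << std::endl;
double measured_mm = 750.0; // hypothetical tape measurement in millimeters
if (std::abs(baseline_mm - measured_mm) / measured_mm > 0.05)
    std::cout << "Warning: calibrated baseline differs from the measured distance by more than 5%." << std::endl;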

mbleyer avatar Oct 11 '19 16:10 mbleyer

@mbleyer

Thank you for your comment. I checked the translation vector. I took 50 images. The values seem correct when the 2 cameras are close to each other. But when I increased the distance, the values are too small. I tried stereoCalibrate from OpenCV with a chessboard. Do you know if it works better with an ArucoBoard? Otherwise, do you know a pipeline to find this transformation matrix?

Thank you in advance

valvador92 avatar Oct 17 '19 16:10 valvador92

Hi @mbleyer could the issue of systematic nonlinear depth distortion (that is inherent in time-of-flight depth cameras) also contribute to the misalignment in point clouds?

i.e. even though the extrinsic calibration between the cameras is performed properly, the point clouds may still not align correctly, since the checkerboard-based extrinsic calibration is only between the color cameras and does not account for the distortion in the point cloud.

guanming001 avatar Oct 18 '19 04:10 guanming001

Hi @valvador92 and @guanming001, I am running your modified green_screen code, but it seems like I am able to extract much more data from the primary camera compared to the secondary camera (i.e. my generated point cloud is 80% points from the primary camera). Did either of you experience this? Is this likely a limitation caused by my computer specs (i7-7820x and GTX 1080) or some other issue?

henryo12 avatar Nov 07 '19 16:11 henryo12

Hi @henryo12 the position of the cameras and the scene that they are looking at will affect the amount of generated point cloud data ... do you have any images of the camera positions and the captured scene?

guanming001 avatar Nov 08 '19 00:11 guanming001

Thanks @guanming001 , I think I realized part of my concern: does the viewer only display points with positive depth in the primary camera's coordinates? Regardless, you can see in my images that the blue point cloud data is denser than that from the other two cameras. The performance is also similar when using your unmodified code with only two cameras.

I saved a screenshot from each camera's perspective. I circled myself where I am standing in the middle of the cameras. There are obvious alignment issues which I have yet to address. (per-camera screenshots attached)

(overview screenshots attached)

henryo12 avatar Nov 08 '19 02:11 henryo12

Hi @valvador92 glad to know someone is experiencing the same issue as me!

I still could not solve the problem with Azure Kinect, but I have tried to use a similar method (stereo calibrate) for two Intel Realsense D415 cameras (without distortion) and the point clouds could overlay correctly.

@guanming001 Crazy that we had the same problem. I can align the point clouds successfully from the realsense camera but for some reason, not the Azure Kinect. It seems to me the only difference between them is one has distortion and one doesn't, so that leads me to think there's something related to the distortion that's causing the misalignment.

While finding the transformation between 2 kinects, I've tried

  1. undistorting (cv::undistort) the checkerboard images first and then stereo calibrate (cv::stereoCalibrate) with zero distortion

  2. direct stereo calibrate with captured checkerboard images with given distortion parameters

both yield similar transformation, so this leads me to believe the distortion parameters provided by the kinect should be right.

Next, I undistort the color image and depth image for both Kinects, convert the depth to a point cloud with the given intrinsics, and then transform them based on k4a_calibration_extrinsics_t; visually, it looks like the point cloud aligns correctly with the color image. After doing the same thing for the 2nd Kinect, when I apply the transformation from the stereo calibration to the point cloud from the secondary Kinect, I get similar results as @guanming001. Very puzzling.

One potential problem I could see is that when I undistort the depth image, does the depth value have to be adjusted somehow to reflect the change? Meaning, can I simply treat the depth image as a 32-bit float image, just call cv::undistort on it, and use the undistorted depth value directly? I think the answer should be yes, can @rabbitdaxi confirm this?
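
(Not an authoritative answer, just a sketch of one common approach: undistortion is a purely 2D re-mapping for a Z-depth image, so the stored depth values themselves should not need rescaling, but nearest-neighbor sampling is usually preferred so depth is not interpolated across object boundaries. camera_matrix and dist_coeffs below are assumed to already hold the intrinsics of the camera the depth image is registered to.)

// Rough sketch: undistort a 16-bit depth image without blending depth values
// across discontinuities (nearest-neighbor sampling instead of interpolation).
cv::Mat map1, map2;
cv::initUndistortRectifyMap(camera_matrix, dist_coeffs, cv::Mat(),
                            camera_matrix, depth.size(), CV_32FC1, map1, map2);
cv::Mat depth_undistorted;
cv::remap(depth, depth_undistorted, map1, map2, cv::INTER_NEAREST);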

jasjuang avatar Nov 17 '19 07:11 jasjuang

Hi @jasjuang, I had a similar problem with 3 Kinect DK devices I purchased. I was unable to accurately calculate the pose of the other 2 cameras and thus always had problems generating an accurate point cloud of my scene. I had to re-calibrate the color camera intrinsics and only after this procedure was I able to generate accurate camera poses.

All 3 devices had, let's say, 'less than optimal' intrinsics values out of the box.

I used a custom-made, highly accurate Charuco board with OpenCV 3+. I used 60 or so images to generate my own intrinsic values.

Please try calibrating your cameras and share your results with the community. If there is indeed an issue with factory calibration we should know. I wasted quite a bit of development time trying to figure out what the problem was and did not consider factory calibration issues until I had exhausted all the other possibilities.
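
For reference, a rough sketch of this kind of Charuco-based intrinsic calibration using OpenCV's aruco contrib module is below. The board geometry, dictionary, corner-count threshold, and image list are made-up placeholders, not the exact values used above.

#include <opencv2/aruco/charuco.hpp>
#include <opencv2/imgcodecs.hpp>
#include <iostream>
#include <string>
#include <vector>

// Sketch: estimate color camera intrinsics from ~60 views of a Charuco board.
static void calibrate_color_intrinsics(const std::vector<std::string> &image_files)
{
    cv::Ptr<cv::aruco::Dictionary> dict = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
    // squaresX, squaresY, square length (m), marker length (m) are placeholders
    cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(8, 11, 0.030f, 0.022f, dict);

    std::vector<std::vector<cv::Point2f>> all_corners;
    std::vector<std::vector<int>> all_ids;
    cv::Size image_size;
    for (const std::string &file : image_files)
    {
        cv::Mat img = cv::imread(file);
        image_size = img.size();
        std::vector<int> marker_ids;
        std::vector<std::vector<cv::Point2f>> marker_corners;
        cv::aruco::detectMarkers(img, dict, marker_corners, marker_ids);
        if (marker_ids.empty())
            continue;
        std::vector<cv::Point2f> charuco_corners;
        std::vector<int> charuco_ids;
        cv::aruco::interpolateCornersCharuco(marker_corners, marker_ids, img, board, charuco_corners, charuco_ids);
        if (charuco_ids.size() > 8) // keep views with enough detected corners
        {
            all_corners.push_back(charuco_corners);
            all_ids.push_back(charuco_ids);
        }
    }

    cv::Mat camera_matrix, dist_coeffs;
    double rms = cv::aruco::calibrateCameraCharuco(all_corners, all_ids, board, image_size,
                                                   camera_matrix, dist_coeffs);
    std::cout << "Reprojection RMS: " << rms << "\n" << camera_matrix << "\n" << dist_coeffs << std::endl;
}

Note that in newer OpenCV releases (4.7+) the aruco API moved and the board constructor changed, so the exact calls above may need adjusting.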

amonnphillip avatar Nov 18 '19 16:11 amonnphillip

@amonnphillip thanks for sharing your experience and I will give that a try. Did you only re-calibrate the intrinsics of the color camera? Did you redo the intrinsics for the depth camera and the extrinsics between the color and depth camera?

jasjuang avatar Nov 18 '19 17:11 jasjuang

I only calibrated the color camera and that gave me enough pose estimation accuracy for my use case (for now at least, we are still in early development). But yes, I do wonder about the accuracy of the depth intrinsics also. I guess you could calibrate the depth camera using the IR stream? Maybe someone from Microsoft can advise.

amonnphillip avatar Nov 18 '19 20:11 amonnphillip

@amonnphillip I re-calibrated the intrinsics of the color camera but not the depth camera, and to my great surprise, it actually produces better alignment results than the ones Azure Kinect provides, but there's still a little bit of misalignment that's not good enough for my purpose. I think this is most likely because the intrinsics of the depth camera are still the old ones, and the extrinsics between the color and depth camera are still the old ones. For the new intrinsic parameters I calculated for the color camera, the focal length and distortion parameters are wildly different, and only the principal points are somewhat similar.

I am trying to figure out how to capture the checkerboard images with the IR stream so that my checkerboard corner detection code will work like usual. The resolution for the IR stream is extremely low and some type of IR light has to be used to light up the image.
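
In case it helps, below is a rough, untested sketch of grabbing an IR frame with the C++ wrapper and stretching it to 8 bits so a standard corner detector can run on it. K4A_DEPTH_MODE_PASSIVE_IR gives an IR image without the active illumination dots (an external IR light source is still needed to illuminate the board); the function name and timeout are made up for illustration.

#include <k4a/k4a.hpp>
#include <opencv2/core.hpp>
#include <chrono>

// Sketch: grab one IR frame and convert it to an 8-bit image for corner detection.
static cv::Mat capture_ir_for_calibration(uint32_t device_index)
{
    k4a_device_configuration_t config = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
    config.depth_mode = K4A_DEPTH_MODE_PASSIVE_IR; // IR stream without the active dot pattern
    config.camera_fps = K4A_FRAMES_PER_SECOND_15;

    k4a::device device = k4a::device::open(device_index);
    device.start_cameras(&config);

    k4a::capture capture;
    cv::Mat ir8;
    if (device.get_capture(&capture, std::chrono::milliseconds(1000)))
    {
        k4a::image ir = capture.get_ir_image(); // K4A_IMAGE_FORMAT_IR16
        cv::Mat ir16(ir.get_height_pixels(), ir.get_width_pixels(), CV_16U,
                     (void *)ir.get_buffer(), static_cast<size_t>(ir.get_stride_bytes()));
        cv::normalize(ir16, ir8, 0, 255, cv::NORM_MINMAX, CV_8U); // stretch to the 8-bit range
    }
    device.stop_cameras();
    return ir8; // can be passed to cv::findChessboardCorners or aruco detection as usual
}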

jasjuang avatar Nov 18 '19 23:11 jasjuang

After a ton of experiments, I believe this is not a calibration problem. It's a depth accuracy problem instead, and I believe the accuracy of the depth is highly affected by the material. If you google "time of flight depth material" a lot of papers will pop up talking about this.

I was able to write the code to add depth information of the checkerboard into calibration as I mentioned in #937, and the calibration results were perfect because when I reconstruct our calibration tower that was being used for calibration from different views, the alignment was on point. However, when I try to reconstruct a human with the exact same calibration parameters, I see the same level of misalignment as shown in this thread. The only explanation here can only be that the depth accuracy varies based on different materials.

This has been extremely frustrating. Can anyone from Microsoft grab two Azure Kinects, try to align them themselves, and tell us whether you are able to successfully align the point clouds?

jasjuang avatar Dec 31 '19 23:12 jasjuang

Hello everyone! Just out of curiosity, you are all using the k4a_calibration_3d_to_3d() function to convert your depth data to the color camera 3D space, right? Because you are calibrating with the color camera but then trying to match point cloud data as if it were taken by the same sensor. If you are not doing that then I think this is your problem. Kinect Calibration Doc

EDIT: After a bit of reading of the green screen example and your code, I might have something you could try out... It's a guess, but it should be easy to try!

Your code:

// Transform Depth Image to Point Cloud
main_xyz_image = main_depth_to_main_color.depth_image_to_point_cloud(main_depth_in_main_color, K4A_CALIBRATION_TYPE_COLOR);
secondary_xyz_image = secondary_depth_to_main_color.depth_image_to_point_cloud(secondary_depth_in_main_color, K4A_CALIBRATION_TYPE_COLOR);
// Get cv::Mat from k4a::image
main_xyz = k4a::get_mat(main_xyz_image);
secondary_xyz = k4a::get_mat(secondary_xyz_image);

You call depth_image_to_point_cloud on a depth image that has just been converted to the color camera with the depth_image_to_color_camera function.

Instead, directly call the function on the original depth image: main_depth_to_main_color.depth_image_to_point_cloud(main_depth_image, K4A_CALIBRATION_TYPE_DEPTH)

I am not certain the K4A_CALIBRATION_TYPE_COLOR flag chooses the right conversion there. Let us know if that works!
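
For anyone who wants to try that suggestion, here is a minimal, untested sketch reusing the names from the modified green_screen code above: build the secondary point cloud in the depth camera's own geometry, then move it into the main color frame by applying the calibrated extrinsics (R as cv::Matx33d, t as cv::Vec3d, both in millimeters) manually.

// Sketch of the suggested alternative: point cloud from the original depth image,
// then a manual rigid transform into the main color camera frame.
k4a::transformation secondary_trans(secondary_calibration);
k4a::image secondary_xyz_depth_frame =
    secondary_trans.depth_image_to_point_cloud(secondary_depth_image, K4A_CALIBRATION_TYPE_DEPTH);

cv::Mat xyz = k4a::get_mat(secondary_xyz_depth_frame); // CV_16SC3, millimeters
for (int r = 0; r < xyz.rows; ++r)
{
    for (int c = 0; c < xyz.cols; ++c)
    {
        cv::Vec3s &p = xyz.at<cv::Vec3s>(r, c);
        if (p[2] == 0)
            continue; // invalid depth
        cv::Vec3d q = tr_secondary_depth_to_main_color.R * cv::Vec3d(p[0], p[1], p[2]) +
                      tr_secondary_depth_to_main_color.t;
        p = cv::Vec3s(static_cast<short>(q[0]), static_cast<short>(q[1]), static_cast<short>(q[2]));
    }
}
// xyz is now expressed in the main color camera frame and can be rendered alongside the main camera's cloud.

Equivalently, individual points could be converted with the calibration's convert_3d_to_3d (the C++ wrapper around k4a_calibration_3d_to_3d) if a per-point API is preferred.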

JulesThuillier avatar Jan 02 '20 05:01 JulesThuillier