
ICP different result depending if cloud is in [mm] or [m]

Open michhelle132 opened this issue 3 years ago • 21 comments

I have two identical clouds with a translation difference of about 200 mm (0.2 m) along the z-axis. When the clouds are in meters, pcl::IterativeClosestPointWithNormals gives me perfect alignment, which is expected since the clouds are the same. The problem occurs when the clouds are in millimeters: with the same scenario, I am not able to get alignment. The only parameters I am setting are:

setRANSACOutlierRejectionThreshold: half of the uniform-sampling leaf size in both cases (the leaf size is 0.05 for the meter cloud, 50 for the millimeter cloud)
setMaxCorrespondenceDistance: 0.5 for the meter cloud, 500 for the millimeter cloud

I would expect the results to be the same, especially since the clouds are identical.

I also tried increasing transformationEpsilon and euclideanFitnessEpsilon in the millimeter case, by multiplying the default values by 1000.

michhelle132 avatar Feb 22 '22 09:02 michhelle132

Do you have the same problem if you use IterativeClosestPoint instead of IterativeClosestPointWithNormals? Have you tried scaling the cloud up by 10 (decimeter) or 100 (centimeter)?

mvieth avatar Feb 22 '22 17:02 mvieth

With IterativeClosestPoint the result is quite similar; by "similar" I mean the resulting transformation differs by about 5 mm in translation.

I also encounter the same problem with GeneralizedIterativeClosestPoint.

I checked GeneralizedIterativeClosestPoint with different scalings: the higher the scaling, the worse the result.

IterativeClosestPointWithNormals behaves similarly to GeneralizedIterativeClosestPoint when applying different scalings.

michhelle132 avatar Feb 22 '22 23:02 michhelle132

I wonder if it's due to float? 1000 times might result in some small errors not resulting in any difference in float representation.

Eg: 9.9*10**15 + 1 = 9.9*10**15 (hence, 1 == 0 :) )

kunaltyagi avatar Feb 23 '22 04:02 kunaltyagi

I wonder if it's due to float? 1000 times might result in some small errors not resulting in any difference in float representation.

Maybe, although it doesn't sound like we're already in the 10^15 region here? @michhelle132 Could you provide your cloud(s) as PCD or PLY files, and some reproducible code so we can investigate this properly?

mvieth avatar Feb 23 '22 12:02 mvieth

Distance is computed as a sum of squares. With a 20 m cloud, that would be 20*20 = 400, but for a 20,000 mm cloud, it would be 4*10**8

This is already halfway there (logarithmically speaking), but yeah, +1 to getting an actual cloud posted here to see what the issue could be

kunaltyagi avatar Feb 23 '22 13:02 kunaltyagi

cloudTests.zip

Parameters to the script: ../data/cloud_meter.ply ../data/cloud_meter.ply 20 1, where the first number is the number of iterations and the second the scaling. The scaling for the meter cloud is 1. The attachment contains two clouds; I did the scaling in MeshLab (the same as in my tests).

michhelle132 avatar Feb 23 '22 23:02 michhelle132

I took a quick look at the program and activated more logging with pcl::console::setVerbosityLevel(pcl::console::L_VERBOSE);. The number of correspondences is the same for meter and mm (3912), but for the meter cloud it finishes in 3 iterations, for the mm cloud it runs 10 iterations and then quits. Most likely the problem is in the BFGS transformation estimation, but there everything is in double precision ... I will try to take a closer look in the next few days

mvieth avatar Feb 24 '22 20:02 mvieth

My current state of the investigation is that the function which the BFGS algorithm tries to minimize might be the problem. The transformation is described with 6 values: 3 for the translation in x, y, z, and 3 for rotation angles. These two triplets are totally different things: the rotation angles are somewhere between -pi and pi, while the value range of the translation depends on the unit of the point cloud (in the example here it would be ~200 if the cloud is in mm, and 0.2 if in m). If the value ranges of translation and rotation are very different, then convergence is not guaranteed.

The BFGS algorithm relies on the gradient of the "registration distance" (distance = 0 means perfect alignment) with respect to the 6 values of the transformation. If the cloud is in mm, then translating 1 unit barely changes the distance. However, if the cloud is in m, then translating 1 unit changes the registration distance a lot. The line search along the gradient then heavily favours the rotation if the unit is mm.

A possible solution might be to use Levenberg-Marquardt instead of BFGS for the optimization problem (or any other method), but I don't have high hopes that this is the solution. Another solution could be to optimize the translation and the rotation separately, maybe in an alternating way. I wonder if any of the other methods/classes in the registration module have similar problems, and whether they have any solution for it. Related issue: #1147

@keineahnung2345 Since you are currently active in GICP, feel free to take a look at this and let us know if you have any idea/suggestion

mvieth avatar Mar 05 '22 17:03 mvieth

I've tried testing with IterativeClosestPointWithNormals (point-to-plane ICP) using the following code:

pcl::IterativeClosestPointWithNormals<PointT, PointT> icp2;
icp2.setInputSource(cloud_template_normals);
icp2.setInputTarget(cloud_scene_normals);
icp2.setEnforceSameDirectionNormals(false);
icp2.setUseSymmetricObjective(false);

icp2.setRANSACOutlierRejectionThreshold(leaf * 0.5);
//icp2.setEuclideanFitnessEpsilon(icp2.getEuclideanFitnessEpsilon());
icp2.setTransformationEpsilon(scaling * icp2.getTransformationEpsilon());
icp2.setMaxCorrespondenceDistance(scaling * 0.5 * 0.8);
icp2.setMaximumIterations(1);
iterations = 0;

With clouds at different scales (meter, dm, cm, mm):

cloud_meter.ply # converges at around iteration 240
cloud_dm.ply    # converges at 760
cloud_cm.ply    # converges at 20
cloud_mm.ply    # converges at 390

At first they seem to translate and rotate randomly, but they all finally converged. From the data above, the number of iterations to converge does not seem to be positively correlated with the cloud scale.

keineahnung2345 avatar Mar 25 '22 08:03 keineahnung2345

@keineahnung2345 Thank you for investigating! I haven't looked much into IterativeClosestPointWithNormals, only GICP, but it is good to know that there does not seem to be a problem there. Although it is still a bit strange that the number of iterations is so different. The algorithm is purely deterministic, there is nothing random in it, right?

mvieth avatar Mar 25 '22 17:03 mvieth

Although it is still a bit strange that the number of iterations is so different.

That confuses me too, maybe I can add some debug message showing its loss in each iteration to see what happens.

keineahnung2345 avatar Mar 26 '22 01:03 keineahnung2345

The original radius for computing normals was:

float nnradius = scaling * 0.5;

The normals then look like this (all normals point in approximately the same direction, not reflecting the local shape):

[screenshot: normals computed with nnradius = scaling * 0.5]

After changing it to:

float nnradius = scaling * 0.1;

The normals now reflect the local shape:

[screenshot: normals computed with nnradius = scaling * 0.1]

Using pcl::IterativeClosestPointWithNormals<PointT, PointT> icp2 and:

icp2.setTransformationEpsilon(1e-10);
icp2.setEuclideanFitnessEpsilon(scaling * 1e-10);

point clouds of different scales need far fewer iterations to converge:

meter(scale 1)    5
dm(scale 10)      6
cm(scale 100)     5
mm(scale 1000)    10

For pcl::GeneralizedIterativeClosestPoint, it also "converges" quickly (fewer than 20 iterations), but the point clouds are actually not aligned.

keineahnung2345 avatar May 05 '22 03:05 keineahnung2345

Although it is still a bit strange that the number of iterations is so different.

That confuses me too, maybe I can add some debug message showing its loss in each iteration to see what happens.

@mvieth I've added debug message of loss for TransformationEstimationSVD and TransformationEstimationPointToPlaneLLS in https://github.com/keineahnung2345/pcl/tree/te_loss, if you find it helpful, I could create a PR.

keineahnung2345 avatar May 05 '22 06:05 keineahnung2345

I found some bugs in the code that @michhelle132 provided:

  1. Before doing the alignment with icp2, there should be a transformation_matrix.setIdentity();.
  2. For GICP, rotation_epsilon_ is used, so we need something like icp2.setRotationEpsilon(1e-10); (also note that setEuclideanFitnessEpsilon has no effect for GICP).

With float nnradius = scaling * 0.1;, the convergence iterations for GICP are:

meter(scale 1)       9
dm(scale 10)         6
cm(scale 100)        6
mm(scale 1000)       207
um(scale 1000000)    33

But for mm and um the clouds are not aligned perfectly; the estimated transformation matrices are:

0.999446  0.0200865   0.026555    -56.258
-0.0197015   0.999698  -0.014682    47.3637
-0.0268419  0.0141507    0.99954   -171.459
         0          0          0          1

and

    0.999964   0.00702394   0.00476756 -5.74474e-07
 -0.00753089     0.993183      0.11632  7.38101e-07
 -0.00391803    -0.116352       0.9932 -3.09468e-06
           0            0            0            1

keineahnung2345 avatar May 17 '22 09:05 keineahnung2345

Another solution could be to optimize the translation and the rotation separately, maybe in an alternating way.

I've implemented the alternating algorithm, and it works from megameter to micrometer (all clouds converged and aligned). Following are the convergence iterations:

km 0.1 67
m 0.1 41
dm 0.1 99
cm 0.1 99
mm 0.1 67
um 0.1 67

keineahnung2345 avatar May 20 '22 08:05 keineahnung2345

@keineahnung2345 Sorry for the late reply! It is good to hear that pcl::IterativeClosestPointWithNormals works fine if supplied with adequate normals (and parameters).

@mvieth I've added debug message of loss for TransformationEstimationSVD and TransformationEstimationPointToPlaneLLS in https://github.com/keineahnung2345/pcl/tree/te_loss, if you find it helpful, I could create a PR.

Do you have an estimate how long it takes to compute the values that are printed? If the functions run much slower because of these debug messages, I would rather not add them. A possibility would be to call pcl::console::isVerbosityLevelEnabled() first, before computing the values needed in the debug messages.

mvieth avatar May 29 '22 16:05 mvieth

I use PCL_DEBUG not PCL_VERBOSE, do we need another function pcl::console::isDebugLevelEnabled() or just using pcl::console::isVerbosityLevelEnabled() would suffice?

keineahnung2345 avatar May 30 '22 03:05 keineahnung2345

I use PCL_DEBUG not PCL_VERBOSE, do we need another function pcl::console::isDebugLevelEnabled() or just using pcl::console::isVerbosityLevelEnabled() would suffice?

Sorry for the confusion, here is the doc of that function: https://pointclouds.org/documentation/namespacepcl_1_1console.html#a59064b5f6807552c0bac04377b1d1d0c It takes the level as a parameter, in this case, L_DEBUG

mvieth avatar May 30 '22 06:05 mvieth

I misunderstood the VerbosityLevel in isVerbosityLevelEnabled as the L_VERBOSE in VERBOSITY_LEVEL. It seems it works for all values of VERBOSITY_LEVEL, thanks.

Do you have an estimate how long it takes to compute the values that are printed?

For a point cloud of size 3912, it takes about 137.8 ms per iteration on average.

keineahnung2345 avatar May 30 '22 07:05 keineahnung2345

For a point cloud of size 3912, it takes about 137.8 ms per iteration on average.

That's quite long, but if we check the verbosity level first, that should be okay

mvieth avatar May 30 '22 12:05 mvieth

@mvieth I've created a PR: https://github.com/PointCloudLibrary/pcl/pull/5279.

keineahnung2345 avatar May 31 '22 02:05 keineahnung2345