fast_gicp

FastVGICPCuda not working while FastVGICP does

Open Dysl3xik opened this issue 2 years ago • 8 comments

I am trying to test both variants of VGICP on the same data set. The CUDA variant seems to have a bug and is simply not returning real results.

When I use the CUDA variant, it seems to take some time to do the covariance calculations, but the LM optimization just returns all zeros and/or NaN. Switching to Gauss-Newton results in it immediately returning the same input transform (identity in this case).

If I test using FastVGICP on the same data and parameters, I get the expected results.

I am using CUDA 11.4, if that makes any difference.

Any ideas what may be wrong here?

Dysl3xik avatar Sep 07 '21 13:09 Dysl3xik

I actually tracked this down to the RegularizationMethod parameter. When it's set to Planar (the default), the code does not work on either my sample set or your provided data.

If I change to Frobenius it seems to work fine.

Dysl3xik avatar Sep 07 '21 17:09 Dysl3xik

I'm not really sure, but I guess the problem is with Eigen::SelfAdjointEigenSolver, which is used in the PLANE regularization and may misbehave on some GPUs. What GPU are you using? Can you insert the following test code just below eig.computeDirect(cov); to see whether the eigenvalue decomposition is working properly?

    // --- test code ---
    // Rebuild the covariance from its eigendecomposition: C_ = V * diag(values) * V^-1
    Eigen::Vector3f values = eig.eigenvalues();
    Eigen::Matrix3f v_diag = values.asDiagonal();
    Eigen::Matrix3f v_inv = eig.eigenvectors().inverse();

    Eigen::Matrix3f C_ = eig.eigenvectors() * v_diag * v_inv;

    // If the rebuilt matrix differs from the input, the decomposition is broken
    if((cov - C_).array().abs().maxCoeff() > 1e-3) {
      printf("wrong eigendecomposition result\n");
      // Original covariance
      printf("--- C ---\n");
      for(int i=0; i<3; i++) {
        for(int j=0; j<3; j++) {
          printf("%.6f ", cov(i, j));
        }
        printf("\n");
      }

      // Covariance rebuilt from eigenvalues and eigenvectors
      printf("--- C_ ---\n");
      for(int i=0; i<3; i++) {
        for(int j=0; j<3; j++) {
          printf("%.6f ", C_(i, j));
        }
        printf("\n");
      }
    }
    // ---
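
For readers who want to try the same reconstruction check on the host, without Eigen or a GPU, here is a minimal standalone sketch. The `Mat3` and `Vec3` aliases and the `reconstruct` and `max_abs_diff` helpers are hypothetical names and are not part of fast_gicp or Eigen. The sketch assumes a symmetric input with orthonormal eigenvectors stored as columns, so the transpose stands in for the inverse used above.

```cpp
#include <array>
#include <cmath>

// Hypothetical stand-ins for Eigen::Matrix3f and Eigen::Vector3f.
using Mat3 = std::array<std::array<float, 3>, 3>;
using Vec3 = std::array<float, 3>;

// Rebuilds V * diag(values) * V^T, where column k of `vectors` is the
// eigenvector that belongs to values[k]. Assumes orthonormal eigenvectors.
Mat3 reconstruct(const Vec3& values, const Mat3& vectors) {
  Mat3 out{};  // zero-initialized
  for (int i = 0; i < 3; i++) {
    for (int j = 0; j < 3; j++) {
      for (int k = 0; k < 3; k++) {
        out[i][j] += vectors[i][k] * values[k] * vectors[j][k];
      }
    }
  }
  return out;
}

// Largest absolute element-wise difference between two 3x3 matrices.
float max_abs_diff(const Mat3& a, const Mat3& b) {
  float worst = 0.0f;
  for (int i = 0; i < 3; i++) {
    for (int j = 0; j < 3; j++) {
      worst = std::fmax(worst, std::fabs(a[i][j] - b[i][j]));
    }
  }
  return worst;
}
```

A decomposition would be considered suspicious when `max_abs_diff(cov, reconstruct(values, vectors))` exceeds a small tolerance such as the 1e-3 used in the test code above.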

koide3 avatar Sep 10 '21 08:09 koide3

GPU is RTX8000

This code did not trigger or trap anything. I also went through all the covariance code looking for NaN or Inf values, and I am not seeing them show up anywhere...
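
A scan of this kind can be sketched roughly as follows. This is only an illustration, not the code that was actually run: it assumes the covariances have already been copied back from the GPU into host memory as row-major 3x3 float arrays, and the `Cov3` alias and both helper names are hypothetical, not part of fast_gicp.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical host-side copy of one 3x3 covariance (row-major).
using Cov3 = std::array<float, 9>;

// Returns true if every entry is a finite number (no NaN, no +/-Inf).
bool is_finite(const Cov3& cov) {
  for (float v : cov) {
    if (!std::isfinite(v)) {
      return false;
    }
  }
  return true;
}

// Returns the indices of all covariances that contain a NaN or Inf entry.
std::vector<std::size_t> find_non_finite(const std::vector<Cov3>& covs) {
  std::vector<std::size_t> bad;
  for (std::size_t i = 0; i < covs.size(); i++) {
    if (!is_finite(covs[i])) {
      bad.push_back(i);
    }
  }
  return bad;
}
```

An empty result from `find_non_finite` only shows that the covariances are finite at the point where they were copied; values could still become NaN or Inf later in the optimization.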

Dysl3xik avatar Sep 10 '21 18:09 Dysl3xik

I also tried reverting the plane code to the commented-out section so that it uses covariance_regularization_svd(), and I still get the same result. Maybe that provides some more useful information...

Dysl3xik avatar Sep 10 '21 18:09 Dysl3xik

I found a fix for this. https://github.com/rzhao88/fast_gicp/commit/2159f3942dbbd7795f0eceb8cdda57bd298a0a99

The comment is wrong; I was using CUDA 11.5.

rzhao88 avatar Jan 02 '22 22:01 rzhao88

> I found a fix for this. rzhao88@2159f39
>
> Comment is wrong, I was using CUDA 11.5

Hi @rzhao88, I have tested it on CUDA 11.5. It fixes the issue where the covariance computation sometimes returns NaN or Inf in the CUDA version. I think you could make a PR; the change in src/fast_gicp/cuda/covariance_regularization.cu fixes it. Thanks, @rzhao88. @koide3, I think you may be interested in it.

cdb0y511 avatar Jan 16 '22 04:01 cdb0y511

I don't currently have the time (due to job constraints) to do a clean fix for this. Feel free to take it and make a PR. @koide3 @cdb0y511

rzhao88 avatar Feb 03 '22 03:02 rzhao88

I have the same problem. Do you guys have any ideas?

JACKLiuDay avatar Mar 07 '23 09:03 JACKLiuDay