mmcv
Potential bug in calc_square_dist()
https://github.com/open-mmlab/mmcv/blob/f527e43c1a1e52e5903410255aea1f71f005a161/mmcv/ops/points_sampler.py#L12
This function can return NaN, because the squared distance can come out negative in some cases.
For example:
import torch
from mmcv.ops.points_sampler import calc_square_dist

a = torch.tensor([[[0.0000, 0.0000, 0.2188, 0.0000, 0.0000]]])
b = torch.tensor([[[0.0000, 0.0000, 0.2189, 0.0000, 0.0000]]])
calc_square_dist(a, b)
Please @ZCMax have a look.
I cannot reproduce the NaN result with the example you provided. It computes the squared distance correctly.
This is very strange. I cannot reproduce it either, but I did encounter NaN while using the function.

Hi, what is your mmcv version? I cannot reproduce the NaN result either.

nan_tensor.zip
I saved two tensors that cause NaN in the zip file above. My mmcv version is:
MMCV: 1.5.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.0
This seems to be caused by a subtle numerical problem. The result of dist = a_square + b_square - 2 * corr_matrix can contain slightly negative values near zero because of floating-point rounding, so computing sqrt(dist) produces NaN for those entries. A simple workaround is to add a small epsilon to dist before taking its square root. @ZCMax, please check whether this modification has any effect on current related models.
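To make the failure mode concrete, here is a minimal sketch of a guarded variant. It is not the mmcv implementation: the function name `calc_square_dist_clamped`, the `take_sqrt` and `eps` parameters, and the clamping step are assumptions for illustration, and any normalization the original function applies is left out. Only the names `a_square`, `b_square`, and `corr_matrix` come from the expression quoted above.

```python
import torch


def calc_square_dist_clamped(a, b, take_sqrt=True, eps=0.0):
    """Hypothetical guarded variant, not the mmcv function.

    a: (B, N, C), b: (B, M, C) -> pairwise distances of shape (B, N, M).
    """
    a_square = a.pow(2).sum(dim=-1).unsqueeze(2)      # (B, N, 1)
    b_square = b.pow(2).sum(dim=-1).unsqueeze(1)      # (B, 1, M)
    corr_matrix = torch.matmul(a, b.transpose(1, 2))  # (B, N, M)
    dist = a_square + b_square - 2 * corr_matrix
    # Rounding can leave tiny negative entries here; clamp them to zero
    # so the square root below never sees a negative input.
    dist = torch.clamp(dist, min=0)
    if take_sqrt:
        # eps is optional; a small positive value keeps the gradient of
        # sqrt finite when an entry is exactly zero.
        dist = torch.sqrt(dist + eps)
    return dist


# The square root of a slightly negative value is NaN; clamping first avoids it.
slightly_negative = torch.tensor(-1e-7)
print(torch.sqrt(slightly_negative))                       # NaN
print(torch.sqrt(torch.clamp(slightly_negative, min=0)))   # zero

a = torch.tensor([[[0.0000, 0.0000, 0.2188, 0.0000, 0.0000]]])
b = torch.tensor([[[0.0000, 0.0000, 0.2189, 0.0000, 0.0000]]])
out = calc_square_dist_clamped(a, b)
print(out.shape, torch.isnan(out).any())
```

Clamping differs from adding epsilon alone: an epsilon only hides negative values smaller in magnitude than the epsilon, while clamping removes them regardless of size. Whether either change affects trained models would still need the check requested above.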
Yep. After switching to torch.cdist instead, I have not seen NaN so far.
Great, torch.cdist would be a better solution for this situation. Contributions (a PR) are welcome if you have time, after checking how this modification affects performance.
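For reference, a minimal sketch of the torch.cdist approach on the tensors from the original report. How it would be wired into points_sampler.py is not shown in this thread, so the squaring step and variable names below are assumptions.

```python
import torch

a = torch.tensor([[[0.0000, 0.0000, 0.2188, 0.0000, 0.0000]]])  # (B=1, N=1, C=5)
b = torch.tensor([[[0.0000, 0.0000, 0.2189, 0.0000, 0.0000]]])  # (B=1, M=1, C=5)

# Pairwise Euclidean (p=2) distances between every point in a and every point in b.
dist = torch.cdist(a, b, p=2.0)  # shape (B, N, M)

# If the squared distance is what the caller needs, square the result afterwards.
square_dist = dist.pow(2)

print(dist.shape, torch.isnan(dist).any())
```

torch.cdist also accepts a compute_mode argument that controls whether a matrix-multiplication formulation is used for Euclidean distances, which may be worth comparing when checking the performance impact.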