
`L_total = L_reg + λL_inlier`: a little confused about this formula

Open · RichardTaoK opened this issue 4 years ago · 1 comment

  1. In your paper you have L_total = L_reg + λL_inlier and you use λ = 0.5^(N_i − i), but in compute_losses in src/train.py you seem to move λ outside the formula, turning it into L_total = λ(L_reg + L_inlier). Here is your code:
    # L_inlier: penalize the assignment mass each point sends to the slack
    # row/column, scaled by λ (_args.wt_inliers). The registration loss terms
    # are added to `losses` earlier in the function, so the discount loop
    # below applies to both.
    for i in range(num_iter):
        ref_outliers_strength = (1.0 - torch.sum(endpoints['perm_matrices'][i], dim=1)) * _args.wt_inliers
        src_outliers_strength = (1.0 - torch.sum(endpoints['perm_matrices'][i], dim=2)) * _args.wt_inliers
        if reduction.lower() == 'mean':
            losses['outlier_{}'.format(i)] = torch.mean(ref_outliers_strength) + torch.mean(src_outliers_strength)
        elif reduction.lower() == 'none':
            losses['outlier_{}'.format(i)] = torch.mean(ref_outliers_strength, dim=1) + \
                                             torch.mean(src_outliers_strength, dim=1)

    discount_factor = 0.5  # Early iterations will be discounted
    total_losses = []
    for k in losses:
        # Per-iteration discount: 0.5^(num_iter - i - 1), so the last
        # iteration is unscaled, the second last halved, and so on.
        discount = discount_factor ** (num_iter - int(k[k.rfind('_')+1:]) - 1)
        total_losses.append(losses[k] * discount)
    losses['total'] = torch.sum(torch.stack(total_losses), dim=0)
    return losses
  2. I can't understand why L_inlier alleviates the outlier problem. Is there any theoretical basis for this?

RichardTaoK avatar Jan 06 '21 09:01 RichardTaoK

Hi,

  1. You understood it wrongly; perhaps the description could have been written more clearly. We compute the loss for all (two) iterations, where each iteration's loss is the sum of both L_reg and L_inlier, with L_inlier weighted by λ. However, we weight later iterations more, i.e. the loss from the last iteration is kept unscaled, the second last is halved, and so on. In other words, λ stays inside each iteration's loss, and the 0.5^(num_iter − i − 1) factor you see in the code is the per-iteration discount, not λ (see the first sketch below).

  2. Since L_reg is trained using the groundtruth pose as supervision, there is nothing stopping the network from distributing more and more weight into the last (slack) row/column of the match matrix, i.e. declaring most points to be outliers. This is especially so since you only need a small number of points (just 3 will do) to compute the transform. L_inlier is a workaround to prevent this from happening (see the second sketch below). Alternatively, it may be better to provide supervision directly on the weight matrix, as done in https://arxiv.org/abs/2010.16085. That way you should be able to avoid the need for L_inlier.
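
To make point 1 concrete, here is a minimal sketch of how the two weightings combine (illustrative only: the function name and tensor shapes are placeholders, not the actual code in the repo). With N iterations, the total works out to L_total = Σ_i 0.5^(N−1−i) · (L_reg_i + λ·L_inlier_i), so λ weights L_inlier inside each iteration's loss, while the 0.5 factor discounts whole iterations:

    import torch

    def combine_losses(reg_losses, inlier_losses, wt_inliers, discount_factor=0.5):
        """Illustrative restatement of the weighting in compute_losses.

        reg_losses / inlier_losses: one scalar tensor per iteration.
        wt_inliers is the λ in L_reg + λ·L_inlier; the discount factor
        down-weights earlier iterations (the last iteration is unscaled).
        """
        num_iter = len(reg_losses)
        total = torch.zeros(())
        for i in range(num_iter):
            per_iter = reg_losses[i] + wt_inliers * inlier_losses[i]  # L_reg + λ·L_inlier
            total = total + (discount_factor ** (num_iter - i - 1)) * per_iter
        return total

    # With 2 iterations this expands to:
    #   0.5 * (L_reg_0 + λ·L_inlier_0) + 1.0 * (L_reg_1 + λ·L_inlier_1)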
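
For point 2, a toy example (a made-up 2×2 match matrix, batch dimension dropped) of why L_inlier resists that degenerate solution: when the network dumps assignment mass into the slack row/column, the row and column sums of the remaining J×K matrix fall below 1, so the outlier-strength terms in the snippet above grow:

    import torch

    # Soft match matrices with the slack row/column already removed,
    # analogous to endpoints['perm_matrices'] (minus the batch dimension).
    confident = torch.tensor([[0.90, 0.05],
                              [0.05, 0.90]])  # mass stays on real matches
    evasive = torch.tensor([[0.10, 0.05],
                            [0.05, 0.10]])    # most mass sent to slack

    def inlier_loss(m):
        # Outlier strength = 1 - total assignment mass per point, mirroring
        # ref_/src_outliers_strength in compute_losses (λ omitted here).
        return torch.mean(1.0 - m.sum(dim=0)) + torch.mean(1.0 - m.sum(dim=1))

    print(inlier_loss(confident))  # tensor(0.1000): calling points inliers is cheap
    print(inlier_loss(evasive))    # tensor(1.7000): calling them outliers is penalized

Only three correspondences are needed to recover a rigid transform, so without this term the network could satisfy L_reg while discarding almost every point; L_inlier makes that strategy costly.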

Zi Jian

yewzijian avatar Jan 07 '21 05:01 yewzijian