packnet-sfm
Automasking
Hi,
In your paper "3D Packing for Self-Supervised Monocular Depth Estimation", automasking is defined as masking out pixels "by removing those which have a warped photometric loss higher than their corresponding unwarped photometric loss".
However, from what I understand of your code, you actually compute the pixel-wise minimum photometric loss over the losses of all warped context images together with those of the original unwarped context images. So whenever the unwarped photometric loss is lower than every warped loss, the final (minimum) loss is simply the unwarped loss, which is not the same as masking those pixels out, right?
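To make the distinction concrete, here is a minimal PyTorch sketch of the two variants as I understand them (all tensor names and shapes are hypothetical, not taken from the repo):

```python
import torch

torch.manual_seed(0)

# Hypothetical per-pixel photometric losses for two context images:
# warped[i]   = loss w.r.t. context image i warped into the target view
# unwarped[i] = loss w.r.t. the raw (identity / unwarped) context image i
warped = torch.rand(2, 1, 4, 4)
unwarped = torch.rand(2, 1, 4, 4)

# Min-based variant (what the code seems to do): the unwarped losses simply
# participate in the pixel-wise minimum, so a pixel where the identity loss
# wins still contributes that (small) identity loss to the total.
min_loss = torch.cat([warped, unwarped], dim=0).min(dim=0).values

# Explicit-mask variant (what the paper describes): pixels whose best warped
# loss exceeds the best unwarped loss are removed from the average entirely.
warped_min = warped.min(dim=0).values
unwarped_min = unwarped.min(dim=0).values
mask = (warped_min < unwarped_min).float()
masked_loss = (warped_min * mask).sum() / mask.sum().clamp(min=1)
```

In the min-based variant the "masked" pixels still leak their identity loss into the objective, whereas in the explicit-mask variant they contribute nothing, so the two losses are not numerically equivalent in general.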
If my understanding is right, is there a specific reason why the implementation differs from what is described in the paper?
Cheers