Possible inaccuracy in metric calculation
Hello Zou, thanks for open-sourcing your code. I have a question about the metric calculation.
In the _evaluate_test_videos function in inference.py (lines 285 to 294), all the processed frames, the GTs, and the masks are converted to PIL format with the ToPILImage function from the torchvision package before the metrics (PSNR, SSIM, etc.) are calculated. However, ToPILImage converts the images from float to uint8, which can make the computed metrics inaccurate.
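To make the precision concern concrete, here is a minimal NumPy sketch. It is not the repository's code: `to_uint8` is a hypothetical stand-in that assumes the float-to-uint8 step scales by 255 and truncates (my understanding of how ToPILImage treats float tensors), and the pixel values are made up.

```python
import numpy as np

# Hypothetical stand-in for the float -> uint8 step (assumed: scale by 255, truncate).
def to_uint8(img):
    return (img * 255.0).astype(np.uint8)

def mse(a, b):
    # Cast to float64 first so the subtraction itself cannot overflow.
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

gt = np.array([0.5, 0.25])        # made-up ground-truth pixels in [0, 1]
pred = np.array([0.501, 0.2501])  # made-up output, slightly different from gt

mse_float = mse(gt, pred)                      # small but nonzero
mse_uint8 = mse(to_uint8(gt), to_uint8(pred))  # the difference vanishes after quantization

print(to_uint8(gt), to_uint8(pred))
print(mse_float, mse_uint8)
```

In this toy case both arrays quantize to the same uint8 values, so the MSE becomes zero and the PSNR would be reported as infinite even though the float images differ.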
For example, PSNR depends on the MSE between the GT and the processed image. If the images are converted from float to uint8, the limited range of uint8 can cause arithmetic overflow in the MSE calculation (for instance, when the difference is taken directly on uint8 values), which leads to an inaccurate PSNR value.
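The overflow I have in mind can be sketched as follows. This is an illustration with made-up 2x2 pixel values, not the code in inference.py, and the `psnr` helper is hypothetical. It only shows what happens if uint8 arrays are subtracted directly; I have not checked whether the metric functions in the repository cast to float first.

```python
import numpy as np

def psnr(mse, max_val=255.0):
    # Hypothetical helper: PSNR in dB for a given MSE and peak value.
    return float("inf") if mse == 0 else float(10.0 * np.log10(max_val ** 2 / mse))

# Made-up uint8 images.
gt_u8 = np.array([[10, 200], [50, 120]], dtype=np.uint8)
pred_u8 = np.array([[12, 190], [60, 119]], dtype=np.uint8)

# Naive: the subtraction happens in uint8, so negative results wrap modulo 256
# (10 - 12 becomes 254, 50 - 60 becomes 246).
wrapped = gt_u8 - pred_u8
mse_naive = np.mean(wrapped.astype(np.float64) ** 2)    # 31283.25

# Safe: cast to float64 before subtracting.
diff = gt_u8.astype(np.float64) - pred_u8.astype(np.float64)
mse_correct = np.mean(diff ** 2)                        # 51.25

print(psnr(mse_naive), psnr(mse_correct))
```

With the wrapped differences, the MSE is inflated by orders of magnitude and the PSNR drops accordingly, while casting before the subtraction gives the expected value.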
For reference, see https://stackoverflow.com/questions/40395657/psnr-of-image-in-matlab
I look forward to your reply and to discussing this problem. Thank you!