
About PSNR and l2

Open neouyghur opened this issue 5 years ago • 10 comments

Hi, I am checking the results you provide in Table 1, and I find that the PSNR values do not correspond to the l2 errors. For example, an l2 of 0.2465 corresponds to 25.60 PSNR, while an l2 of 0.0627 corresponds to only 24.83 PSNR, and the l2 error of Scene text eraser is very high. Could you check this, or could you offer your model for testing? Thanks.

neouyghur avatar Nov 26 '19 00:11 neouyghur

I plotted your PSNR and l2 scores in a figure. It clearly shows that your results are not consistent. Could you explain why? @naoto0804, did you get the same PSNR scores? Thanks.

[figure: plot of the PSNR and l2 scores reported for EnsNet]

neouyghur avatar Nov 26 '19 01:11 neouyghur

@neouyghur, both the PSNR and l2 scores are averages over all test images.

zhangshuaitao avatar Nov 27 '19 02:11 zhangshuaitao

@zhangshuaitao I am comparing my method with yours, and I follow the same protocol; however, my MSE and PSNR curves share the same trend. Besides that, as we know, the PSNR score is calculated directly from the MSE score.
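To make the point concrete, here is a minimal NumPy sketch (my own, not from the repo) of the standard relation PSNR = 10·log10(MAX² / MSE) for a single image pair; under this formula a lower MSE must always map to a higher PSNR, which is why the two curves should move together:

```python
import numpy as np

def psnr_from_mse(mse, max_val=1.0):
    """PSNR derived directly from MSE; lower MSE <=> higher PSNR."""
    return 10.0 * np.log10((max_val ** 2) / mse)

# For images in [0, 1], the two l2 values quoted above give:
print(psnr_from_mse(0.0627))   # smaller MSE -> higher PSNR (~12.0 dB)
print(psnr_from_mse(0.2465))   # larger MSE  -> lower PSNR  (~6.1 dB)
```

Note that averaging per-image PSNR over a test set is not the same as computing PSNR from the averaged MSE, but even the per-image averages should broadly follow the same trend.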

neouyghur avatar Nov 27 '19 03:11 neouyghur

@zhangshuaitao Is your l2 score RMSE or MSE? Thanks.

neouyghur avatar Nov 27 '19 22:11 neouyghur

@neouyghur, the l2 score is MSE. We use the compare_mse, compare_ssim, and compare_psnr functions from the skimage.measure module.

zhangshuaitao avatar Nov 28 '19 02:11 zhangshuaitao

@zhangshuaitao First of all, thanks again for releasing the code and answering a lot of questions patiently.

However, what you say above seems to be inconsistent with the README.md: "To evalution the model performace over a dataset, you can find the evaluation metrics in this website PythonCode.zip." Which is correct?

naoto0804 avatar Nov 28 '19 11:11 naoto0804

I would really appreciate it if you could provide the whole evaluation pipeline.

It might be hard to follow exactly the same evaluation protocol, since some parameters of each function are unknown. (e.g., compare_ssim has some optional parameters; how did you set them? What is the range of values in the images, 0.0~1.0 or 0~255?)
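The value-range question matters in practice: PSNR (and SSIM) depend on a data-range parameter, and scaling images without scaling that parameter changes the score. A small NumPy sketch with synthetic images (my own example, not the repo's code):

```python
import numpy as np

# Hypothetical 8-bit image pair, compared both at [0, 255] and [0, 1] scale.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, (64, 64)).astype(np.float64)
b = np.clip(a + rng.normal(0.0, 5.0, (64, 64)), 0.0, 255.0)

def psnr(x, y, data_range):
    """PSNR with an explicit data_range, as in skimage's compare_psnr."""
    mse = np.mean((x - y) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# Rescaling to [0, 1] divides the MSE by 255^2, so PSNR is unchanged
# *only if* data_range is rescaled consistently:
print(psnr(a, b, data_range=255.0))
print(psnr(a / 255.0, b / 255.0, data_range=1.0))  # same value
```

If one side of a comparison uses 0~255 images with a default data_range of 1.0, the reported PSNR is off by a constant (20·log10 255 ≈ 48 dB), which could easily explain inconsistent-looking tables.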

naoto0804 avatar Nov 28 '19 11:11 naoto0804

@naoto0804, sorry for not explaining it clearly. We use AGE, pEPs, and pCEPS from PythonCode.zip, and the compare_mse, compare_ssim, and compare_psnr functions from the skimage.measure module. The default parameters for those functions are fine.
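For readers without PythonCode.zip: as I understand them, these are the background-subtraction metrics (AGE = average gray-level error, pEPs = percentage of error pixels, pCEPs = percentage of clustered error pixels). A hedged NumPy sketch under common definitions; the error threshold of 20 and the 4-connected clustering rule are my assumptions, not confirmed by the authors:

```python
import numpy as np

def age_peps_pceps(gt, pred, threshold=20):
    """Sketch of AGE / pEPs / pCEPs on 8-bit grayscale images.

    AGE   : mean absolute gray-level difference.
    pEPs  : fraction of pixels whose absolute error exceeds `threshold`.
    pCEPs : fraction of error pixels whose 4-connected neighbours are
            all error pixels as well (clustered errors).
    NOTE: threshold=20 and the clustering rule are assumptions.
    """
    diff = np.abs(gt.astype(np.float64) - pred.astype(np.float64))
    age = diff.mean()
    ep = diff > threshold
    peps = ep.mean()
    # A clustered error pixel: it and its 4 neighbours are all errors.
    core = (ep[1:-1, 1:-1] & ep[:-2, 1:-1] & ep[2:, 1:-1]
            & ep[1:-1, :-2] & ep[1:-1, 2:])
    pceps = core.sum() / ep.size
    return age, peps, pceps
```

Knowing the exact threshold used in PythonCode.zip would be needed to reproduce the table numbers exactly.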

zhangshuaitao avatar Nov 29 '19 03:11 zhangshuaitao

@zhangshuaitao Thank you so much for making it much clearer.

To make sure I followed your instructions exactly, I computed all the metrics between the original input and ground-truth images in the test subset of the synthetic dataset. This is because I want to focus on differences in the evaluation phase alone, before reproducing the training phase.

The result is as follows; do you think it is reasonable? If possible, could you compute it with your dataset and evaluation code? (I suspect there are still bugs in my implementation, since these values are much better than those of the baseline methods in Table 1.)

mse 0.006965
ssim 0.933875
psnr 23.996012
AGE 5.851178
pEPs 0.064378
pCEPs 0.050264

naoto0804 avatar Nov 29 '19 06:11 naoto0804

@naoto0804 @zhangshuaitao I think your result is reasonable, since only a small part of each scene is text. However, I feel they did not fully train the baselines; with more training, the baselines should achieve better results than reported.

neouyghur avatar Nov 29 '19 06:11 neouyghur