MVSS-Net
About the optimal threshold
From this picture, most models make relatively clear judgments about the tampered area in most cases. Why, then, can the F1 scores of most models be doubled or even tripled just by switching to the optimal threshold? The effect of the threshold seems excessive.
I tried thresholds from 0.01 to 0.99 on the IMD2020 dataset and found that the highest F1 score was only 0.32, not the 0.757 reported in the paper.
Some say the authors may have tried thresholds from 0.01 to 0.99 and then taken the best precision and the best recall from the results to calculate F1. But this evaluation method makes no sense, because a model that outputs the same logit at every position would get the highest possible F1 (1.0). And even evaluated this way, MVSS-Net doesn't seem to live up to the performance reported in the paper.
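To make the difference between the two protocols concrete, here is a minimal sketch in NumPy. It is not the MVSS-Net evaluation code; the function names, the toy 8x8 mask, and the `empty_precision` convention (precision counted as 1.0 when no pixel is predicted positive, similar to the endpoint convention some libraries use) are all assumptions made for illustration. Under that convention, a constant prediction reaches F1 = 1.0 when the best precision and best recall are mixed across thresholds, while its best single-threshold F1 stays low.

```python
import numpy as np

def precision_recall(pred, gt, thr, empty_precision=1.0):
    """Pixel-level precision and recall at one threshold.

    empty_precision is an assumed convention for the 0/0 case
    (no pixel predicted positive); other conventions use 0.0.
    """
    mask = pred > thr
    tp = np.sum(mask & gt)
    fp = np.sum(mask & ~gt)
    fn = np.sum(~mask & gt)
    precision = tp / (tp + fp) if (tp + fp) > 0 else empty_precision
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return float(precision), float(recall)

def f1(p, r):
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

def sweep(pred, gt, thresholds):
    prs = [precision_recall(pred, gt, t) for t in thresholds]
    # Protocol A: F1 at the single best threshold.
    best_single = max(f1(p, r) for p, r in prs)
    # Protocol B: best precision and best recall taken from
    # different thresholds, then combined into one F1.
    mixed = f1(max(p for p, _ in prs), max(r for _, r in prs))
    return best_single, mixed

thresholds = np.linspace(0.01, 0.99, 99)

# Toy ground truth: 8 of 64 pixels are tampered.
gt = np.zeros((8, 8), dtype=bool)
gt[:2, :4] = True

# A "model" that outputs the same score everywhere.
constant_pred = np.full((8, 8), 0.5)

best_single, mixed = sweep(constant_pred, gt, thresholds)
print(f"best single-threshold F1: {best_single:.3f}")
print(f"mixed best-P / best-R F1: {mixed:.3f}")
```

Running the same `sweep` on real predictions and comparing both numbers with the paper's would show which protocol, if either, is consistent with the reported score.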