Accuracy is reported on the validation set of VisDA-2017 instead of the test set, at least in the code. Are the results reported in the BSP paper also on the validation set, using the same code output?

Open sobalgi opened this issue 4 years ago • 1 comments

The paper reports SOTA results compared to CDAN on the VisDA-2017 dataset. However, there might be some issues with reproducibility.

On a close look, the provided code only reports accuracy on the validation set and not on the test set. This raises the question of whether the results reported in the paper are on the same validation set or on the actual test set. If the reported results are indeed on the test set, then the current code might not be the most up-to-date version.

Any clarification on this would be very helpful. Also, running the current code on VisDA-2017 gives an average accuracy of 77.75%, which is better than the 75.0% reported in the paper. Please refer to the screenshot below for my result.

Screenshot_20200521-111902~01

Any thoughts on this would also be helpful, along with my original query.

Best regards, SB

sobalgi avatar May 21 '20 09:05 sobalgi

The validation set of VisDA-2017 is commonly used as the target-domain data, so I don't think evaluating on the validation set raises any issues. Nonetheless, some papers on powerful methods still report performance on the test set. Another question: do you think the accuracy converges too fast?

xfflzl avatar Apr 21 '21 13:04 xfflzl