ProtST

About data leakage on zero-shot classification?

Open LTEnjoy opened this issue 1 year ago • 2 comments

Hello!

Thanks for your great work! I tested zero-shot classification with your released checkpoint and it performed well. However, I am wondering whether there might be a data leakage problem. Your model was fine-tuned on the Swiss-Prot database, and the DeepLoc dataset was also constructed from the UniProt database. Did you apply any filtering when you evaluated zero-shot performance?
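To make the question concrete, the kind of filtering meant here could be sketched roughly as below. This is only an illustration, not the authors' pipeline: the function names and the accession lists are hypothetical, and it assumes both datasets expose UniProt accessions for their proteins.

```python
def normalize(accession):
    """Normalize an accession so that case and stray whitespace do not hide a match."""
    return accession.strip().upper()


def find_overlap(pretrain_ids, test_ids):
    """Return the test accessions that also appear in the pre-training set."""
    pretrain = {normalize(acc) for acc in pretrain_ids}
    return sorted({normalize(acc) for acc in test_ids} & pretrain)


def filter_test_set(pretrain_ids, test_ids):
    """Keep only test accessions that are absent from the pre-training set."""
    pretrain = {normalize(acc) for acc in pretrain_ids}
    return [normalize(acc) for acc in test_ids if normalize(acc) not in pretrain]


# Hypothetical accessions, for illustration only.
pretrain_ids = ["P12345", "Q9Y6K9", "O00115"]
test_ids = ["Q9Y6K9", "P99999", "o00115 "]

leaked = find_overlap(pretrain_ids, test_ids)
clean = filter_test_set(pretrain_ids, test_ids)
print(f"{len(leaked)} of {len(test_ids)} test proteins overlap with pre-training data")
```

Exact accession matching only catches identical entries. Close homologs would pass this check, so a stricter filter would also remove test proteins above some sequence-identity threshold to the pre-training set.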

Looking forward to your reply! Thanks in advance!

LTEnjoy, Nov 02 '23 04:11