Eff10k
Thanks to your help, I have re-trained the CONSCNN model on the ConSurf-DB dataset using ESM embeddings. Could you also provide the Eff10k dataset so I can re-train the 10-fold logistic regression? Thanks a lot. Happiness to you.
Hi :) I currently do not have access to the dataset, and it might take a while until I do. In the meantime, I would suggest working with the new version of vespa: https://academic.oup.com/bioinformatics/article/40/11/btae621/7907184 and the respective training data there: https://zenodo.org/records/11085958
Thank you for your response! Could you please share a link to the Eff10k dataset? I’d love to keep an eye on its accessibility over time. I’m also diving into the vespaG method—such an impressive piece of work! I have one question: is there a way to convert the vespaG training data into an Eff10k-style format with “effect” and “neutral” labels? Or perhaps you know of any other datasets, beyond Eff10k, already annotated with those two labels? I’ve been really impressed by vespa’s zero-shot performance and would love to reproduce it, and of course I’ll keep following vespaG as well. Thanks a lot. Wishing you a wonderful life!
Could you please reach out via the email provided here so I can send you the data? https://doi.org/10.1007/s00439-021-02411-y