Uni-Mol

Question Pre-Training Data Dependence

Open vthost opened this issue 1 year ago • 1 comment

Thank you for providing all the code; it's a very interesting model! Do you by any chance have insights into how Uni-Mol performs on molecular property prediction when it is pre-trained on the 2M molecules that many other experiments are run on or, more generally, how much pre-training data is needed to obtain decent performance?

vthost, Jul 04 '23 17:07

Hello vthost, I believe it's worth a try. We haven't experimented with a 2M-molecule dataset yet, but there is some task-specific pretraining work on datasets of roughly 2M molecules, which also shows some performance gain. Ref: https://chemrxiv.org/engage/chemrxiv/article-details/6412d142aad2a62ca1d86505

Naplessss, Aug 28 '23 05:08