retrosynthesis
About the USPTO-50K P2R model checkpoint
Dear authors,
Thanks for sharing the code and checkpoints for this great work. I am wondering how you trained these checkpoints (e.g., models/P2R/USPTO_50K_P2R.pt): via 1) pretrain-finetune, or 2) training from scratch?
I noticed in the README that there are two training modes available for producing a model checkpoint.
This checkpoint was obtained with the pretrain-finetune scheme. However, if you train from scratch, you should be able to get very close prediction results.
Thank you! May I ask which pretrained checkpoint you were using? I observed that the shared pretrained checkpoints are ~500 MB each, while the fine-tuned checkpoint here is only ~200 MB, so the sizes don't seem consistent.
I apologize for the confusion. The checkpoints in the "pretrained" folder are all pre-trained, while the remaining checkpoints are trained from scratch.
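In case it helps to verify this on your side, here is a minimal sketch for comparing what each .pt file actually contains (model weights only, or also optimizer state, vocabularies, etc.). It assumes the checkpoints are ordinary PyTorch files; the key names and the second path below are only placeholders, so please adapt them to your local files:

```python
import torch

# Quick sanity check of what each .pt file actually stores.
# The second path is only a placeholder; point it at one of the shared
# pretrained checkpoints. Key names ("model", ...) may differ in this codebase.
paths = [
    "models/P2R/USPTO_50K_P2R.pt",
    "pretrained/PLACEHOLDER_pretrained_checkpoint.pt",
]

for path in paths:
    ckpt = torch.load(path, map_location="cpu")
    print(path)
    if isinstance(ckpt, dict):
        print("  top-level keys:", list(ckpt.keys()))
        state = ckpt.get("model", ckpt)  # fall back to the whole dict if there is no "model" key
    else:
        state = ckpt.state_dict()
    n_params = sum(t.numel() for t in state.values() if torch.is_tensor(t))
    print("  tensor parameters:", f"{n_params:,}")
```

A large size gap can also come simply from a bigger model or from optimizer state being saved alongside the weights, so counting the parameters should make the source of the difference clear.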
If you need fine-tuned checkpoints, you may have to perform the fine-tuning yourself. After checking my local repository and disk storage, it appears the fine-tuned checkpoints have been inadvertently lost or misplaced. I apologize for any inconvenience this may have caused.
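For reference, the overall fine-tuning loop would look roughly like the sketch below. This is a generic PyTorch outline, not this repo's actual training script; the model class, data loader, loss, and hyperparameters are placeholders, and in practice you would run the repo's own training command starting from the pretrained checkpoint:

```python
import torch

def finetune(model, pretrained_path, train_loader, epochs=10, lr=1e-4, device="cuda"):
    """Generic fine-tuning outline (illustrative only)."""
    # Start from the pretrained weights instead of a random initialization.
    ckpt = torch.load(pretrained_path, map_location="cpu")
    state = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt.state_dict()
    model.load_state_dict(state, strict=False)  # strict=False tolerates renamed/replaced heads

    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    for _ in range(epochs):
        for src, tgt in train_loader:    # assumes (product, reactant) token batches
            src, tgt = src.to(device), tgt.to(device)
            optimizer.zero_grad()
            loss = model(src, tgt)       # assumes the model's forward returns the training loss
            loss.backward()
            optimizer.step()
    return model
```

Saving the result afterwards with `torch.save(model.state_dict(), "finetuned.pt")` would give you your own fine-tuned checkpoint.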
Thank you for your detailed reply; now I understand how the checkpoints were trained. May I ask one more question about the role of pretraining in R-SMILES? If the models trained from scratch are very close to those trained with the pretrain-finetune scheme, what is the point of pretraining models here?

Thanks
This is a good question. It comes down to two main points:

1. Pretraining does bring a certain performance improvement. In our local experimental environment, the model trained from scratch achieved top-1 and top-10 accuracies of 55.0% and 89.8%, while the model fine-tuned from the pretrained checkpoint achieved 56.3% and 91.0%.
2. It significantly reduces training time on large datasets. Training a model from scratch on a large dataset like USPTO-full is very time-consuming, whereas fine-tuning from a pretrained model takes much less time.
Since there have been no further questions for quite a while, I am closing this issue.