
About the uspto50k P2R model checkpoint

fiberleif opened this issue on Dec 11, 2023 · 5 comments

Dear authors,

Thanks for sharing the code and some checkpoints for this great work. I am wondering how you trained this checkpoint (i.e., models/P2R/USPTO_50K_P2R.pt): via 1) pretrain-finetune, or 2) training from scratch?


I noticed that the README describes two modes for training a model checkpoint.

fiberleif · Dec 11, 2023

This checkpoint was obtained with the pretrain-finetune approach.

However, if you train from scratch, you should be able to get a very similar prediction result.

otori-bird · Dec 11, 2023
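
For readers who want to see the difference between the two modes in code, here is a minimal sketch of the generic PyTorch pattern. The model, the checkpoint path, and the "model" key are placeholders for illustration, not the repository's actual architecture or checkpoint layout.

```python
import torch
import torch.nn as nn

def build_model() -> nn.Module:
    # Stand-in for the sequence-to-sequence model used for the P2R
    # (product-to-reactant) task; not the real R-SMILES architecture.
    return nn.Transformer(d_model=256, nhead=8,
                          num_encoder_layers=4, num_decoder_layers=4)

# Mode 1: train from scratch -- start from randomly initialised weights.
model_scratch = build_model()

# Mode 2: pretrain-finetune -- load pretrained weights first, then keep
# training on USPTO-50K with the same training loop as mode 1.
pretrained_path = "models/pretrained/pretrained.pt"   # hypothetical path
state = torch.load(pretrained_path, map_location="cpu")
model_finetune = build_model()
model_finetune.load_state_dict(state["model"])  # "model" key is an assumption
```

In both modes the subsequent training loop is identical; only the starting weights differ.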

Thank you! May I ask which pretrained checkpoint you were using? I observed that the shared pretrained checkpoints are about 500 MB each, but the fine-tuned checkpoint here is only about 200 MB, so the file sizes do not seem consistent.


fiberleif · Dec 12, 2023

I apologize for the confusion. The checkpoints in the "pretrained" folder are all pre-trained, while the remaining checkpoints are trained from scratch.

If you need fine-tuned checkpoints, you may have to perform the fine-tuning yourself. After checking my local repository and disk storage, it appears that the fine-tuned checkpoints have been inadvertently lost or misplaced. I apologize for any inconvenience this may have caused.

otori-bird · Dec 12, 2023
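
As an aside on the size discrepancy discussed above: a checkpoint that also stores optimizer state is typically much larger than one that stores only the model weights, which is one common reason for gaps like 500 MB vs. 200 MB. A quick way to see what a given .pt file actually contains is to open it directly; this is a generic PyTorch sketch, and the printed key names are simply whatever the file happens to store.

```python
import torch

# Inspect the top-level contents of a checkpoint file.
ckpt = torch.load("models/P2R/USPTO_50K_P2R.pt", map_location="cpu")

if isinstance(ckpt, dict):
    for key, value in ckpt.items():
        if isinstance(value, dict):
            n_tensors = sum(1 for v in value.values() if torch.is_tensor(v))
            print(f"{key}: dict with {len(value)} entries ({n_tensors} tensors)")
        else:
            print(f"{key}: {type(value).__name__}")
else:
    print(type(ckpt).__name__)
```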

Thank you for your detailed reply; now I understand in detail how the checkpoints were trained. May I ask one more question regarding the role of pretraining in R-SMILES? If the models trained from scratch are very close to those trained with the pretrain-finetune approach, what is the point of pretraining models here?

Thanks

fiberleif · Dec 12, 2023

This is a good question. It comes down to two main points: (1) pretraining does bring a certain performance improvement; in our local experimental environment, the model trained from scratch achieved top-1 and top-10 accuracies of 55.0% and 89.8%, respectively, while the model fine-tuned from the pretrained checkpoint achieved 56.3% and 91.0%. (2) It significantly reduces training time on large datasets; training a model from scratch on a large dataset like USPTO-full is very time-consuming, whereas fine-tuning from a pretrained model takes much less time.

otori-bird · Dec 13, 2023
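
For context on how top-1/top-10 figures like these are usually computed in retrosynthesis work: each product gets a ranked list of predicted reactant SMILES, and a prediction counts as correct if the canonicalized ground truth appears among the top k candidates. Below is a minimal sketch using RDKit for canonicalization; the function names are illustrative and not taken from the repository's scoring scripts.

```python
from rdkit import Chem

def canonicalize(smiles: str) -> str:
    """Return canonical SMILES, or an empty string if parsing fails."""
    mol = Chem.MolFromSmiles(smiles)
    return Chem.MolToSmiles(mol) if mol is not None else ""

def top_k_accuracy(predictions, targets, k=10):
    """predictions: one ranked candidate list per sample; targets: ground-truth SMILES."""
    hits = 0
    for candidates, target in zip(predictions, targets):
        gold = canonicalize(target)
        if gold and gold in {canonicalize(c) for c in candidates[:k]}:
            hits += 1
    return hits / len(targets)

# top_k_accuracy(preds, golds, k=1) and k=10 would yield numbers comparable
# to the 55.0% / 89.8% (from scratch) figures quoted above.
```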

Since the questioner has not asked any further questions for a long time, I believe it is appropriate to close this issue.

otori-bird · Jul 1, 2024