DeepPurpose
Training Configuration of pre-trained MPNN_CNN
Hi Kexin Huang,
I am using the provided pre-trained MPNN_CNN model. When I looked into its model configuration file, it looked weird to me:
{'input_dim_drug': 1024, 'input_dim_protein': 8420, 'hidden_dim_drug': 128, 'hidden_dim_protein': 256, 'cls_hidden_dims': [1024, 1024, 512], 'batch_size': 16, 'train_epoch': 1, 'LR': 0.001, 'drug_encoding': 'MPNN', 'target_encoding': 'CNN', 'result_folder': './result/', 'binary': False, 'mpnn_hidden_size': 128, 'mpnn_depth': 3, 'cnn_target_filters': [32, 64, 96], 'cnn_target_kernels': [4, 8, 12], 'num_workers': 0, 'decay': 0}
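For context, I loaded the model and printed its config roughly like this (a minimal sketch; the model name string 'MPNN_CNN_BindingDB' and the .config attribute are my assumptions about how the bundled Kd model is registered):

```python
from DeepPurpose import DTI as models

# Load the provided pre-trained model by name (name assumed here)
model = models.model_pretrained(model='MPNN_CNN_BindingDB')

# The loaded model object keeps its training configuration around;
# printing it produces the dict shown above.
print(model.config)
```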
Did you really train this model for only 1 epoch with a batch size of 16?
Best regards, Po-Yu Kao
That's weird, I must have stored the wrong model. Let me double-check and I will upload the correct model.
Hey, it seems this model is wrong. You can use "MPNN_CNN_BindingDB_IC50" instead. It is trained on a much larger training set (~10^5 -> ~10^6 points) and should be of higher quality. Do note that the label now switches from Kd to IC50.
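For reference, loading that model and running a quick prediction looks roughly like this (a sketch; the drug/target pair below is a made-up toy input, not anything from this thread):

```python
from DeepPurpose import utils
from DeepPurpose import DTI as models

# Load the IC50-trained model by its registered name
model = models.model_pretrained(model='MPNN_CNN_BindingDB_IC50')

# Toy drug/target pair (dummy values, for illustration only)
X_drug = ['CC(=O)OC1=CC=CC=C1C(=O)O']          # aspirin SMILES
X_target = ['MKKFFDSRREQGGSGLGSGSSGGGGSTSGLG']  # dummy amino-acid string

# y=[0] is a placeholder label; 'no_split' returns a single dataframe
df = utils.data_process(X_drug, X_target, [0],
                        drug_encoding='MPNN', target_encoding='CNN',
                        split_method='no_split')
y_pred = model.predict(df)  # predictions are on the IC50 label scale, not Kd
```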
Did you use the latest BindingDB to train this model?
Hey, it uses the older 2020m2 release. There should be some minor differences from the current, most up-to-date version in the number of training points.
Thank you for your reply 👍🏽 Please let me know if you want the MPNN_CNN trained on BindingDB with Kd.
No problem! Did you mean you managed to train the model? If so, it would be great to share it with me ([email protected]), thanks!
You can simply use the model.save('XXX') function and then send me the model file; I will upload it to the server and update the link, thanks again!
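For reference, the save-and-reload round trip looks roughly like this (a sketch; the directory names are placeholders, and note that in recent DeepPurpose versions the saving method is spelled save_model rather than save):

```python
from DeepPurpose import DTI as models

# After training finishes, persist the weights and config to a folder
model.save_model('./MPNN_CNN_BindingDB_Kd')

# Anyone with that folder can restore the model later
model = models.model_pretrained(path_dir='./MPNN_CNN_BindingDB_Kd')
```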
Hi Kexin,
It seems that the pre-trained MPNN_CNN model downloaded via pretrained_dir = download_pretrained_model('pretrained_models') in oneliner.py still shows the old configuration:
{'input_dim_drug': 1024, 'input_dim_protein': 8420, 'hidden_dim_drug': 128, 'hidden_dim_protein': 256, 'cls_hidden_dims': [1024, 1024, 512], 'batch_size': 16, 'train_epoch': 1, 'LR': 0.001, 'drug_encoding': 'MPNN', 'target_encoding': 'CNN', 'result_folder': './result/', 'binary': False, 'mpnn_hidden_size': 128, 'mpnn_depth': 3, 'cnn_target_filters': [32, 64, 96], 'cnn_target_kernels': [4, 8, 12]}
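This is roughly what I ran to check (a sketch; the import path and the subfolder name under pretrained_dir are my assumptions about the archive layout):

```python
from DeepPurpose.oneliner import download_pretrained_model
from DeepPurpose import DTI as models

# Re-download the bundled pre-trained models and inspect the MPNN_CNN one
pretrained_dir = download_pretrained_model('pretrained_models')
model = models.model_pretrained(path_dir=pretrained_dir + '/MPNN_CNN_BindingDB')
print(model.config)  # still prints the dict above
```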
Maybe you need to update the model file at https://dataverse.harvard.edu/api/access/datafile/
The config files corresponding to pretrained_dir = download_pretrained_model('models_configs') may also need an update.
Sounds good, do you want to contribute and train a new model for it?
I'd like to give it a try. Could you please point me to the BindingDB Kd dataset? And what preprocessing or data cleaning is needed before I start training? I may need some help along the way, since I am a complete newbie to ML :)
Sounds good, it should be the one in https://github.com/kexinhuang12345/DeepPurpose/blob/master/DEMO/Transformer%2BCNN_BindingDB.ipynb
Simply replacing the model and its parameters should be good; see the sketch below.
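Concretely, an end-to-end MPNN_CNN run on BindingDB Kd would look roughly like this (a sketch following the demo notebook; the epoch budget and batch size here are illustrative choices, not the official training settings):

```python
from DeepPurpose import utils, dataset
from DeepPurpose import DTI as models

# 1. Download and parse BindingDB, keeping Kd-labeled pairs
data_path = dataset.download_BindingDB('./data/')
X_drugs, X_targets, y = dataset.process_BindingDB(path=data_path,
                                                  y='Kd',
                                                  binary=False,
                                                  convert_to_log=True)

# 2. Encode drugs with MPNN and targets with CNN; random 70/10/20 split
train, val, test = utils.data_process(X_drugs, X_targets, y,
                                      drug_encoding='MPNN',
                                      target_encoding='CNN',
                                      split_method='random',
                                      frac=[0.7, 0.1, 0.2])

# 3. Mirror the architecture from the config above, but with a
#    realistic epoch budget instead of train_epoch=1
config = utils.generate_config(drug_encoding='MPNN',
                               target_encoding='CNN',
                               cls_hidden_dims=[1024, 1024, 512],
                               train_epoch=100,
                               LR=0.001,
                               batch_size=128,
                               mpnn_hidden_size=128,
                               mpnn_depth=3,
                               cnn_target_filters=[32, 64, 96],
                               cnn_target_kernels=[4, 8, 12])

# 4. Train, then save the folder to send back upstream
model = models.model_initialize(**config)
model.train(train, val, test)
model.save_model('./MPNN_CNN_BindingDB_Kd')
```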
Thank you for this fruitful discussion, and a big thank-you to the developers of this library. My question is: in the latest release of DeepPurpose, has the MPNN_CNN model been corrected so that it works fine now?