
converting other MPNN checkpoints for support in ColabDesign and JAX

Open adrienchaton opened this issue 2 years ago • 9 comments

Hi everyone and thanks for the great work you are sharing!

It is awesome that ProteinMPNN is integrated within ColabDesign. I assume you converted the original PyTorch checkpoints to JAX ... or did you retrain the models?

If you converted the checkpoints, could the conversion method be shared, please? That way I could also convert other ProteinMPNN models that were fine-tuned from the original checkpoint using the original PyTorch code.

I have this one in mind: https://zenodo.org/records/8164693. It would be awesome to convert the abmpnn.pt checkpoint to a .pkl compatible with ColabDesign ...

Any help would be much appreciated, thanks!

adrienchaton avatar Dec 12 '23 19:12 adrienchaton

@sokrypton trying my luck ... any chance one of you has a conversion script for ProteinMPNN checkpoints from the official PyTorch repo to JAX, for use in your (great) ColabDesign repo? Thanks!

adrienchaton avatar Dec 18 '23 15:12 adrienchaton

@adrienchaton I apologize for the delay. Was travelling. Here is the script we used to convert weights: https://github.com/sokrypton/ColabDesign/tree/main/mpnn/convert_weights

sokrypton avatar Dec 20 '23 16:12 sokrypton

@sokrypton thanks a lot, I will run that and upload the weights.

I didn't see this script when I was digging through https://github.com/sokrypton/ColabDesign/tree/main/colabdesign/mpnn

In the meantime I had figured out that it is a matter of matching dict keys (with nested w, b) and converting parameters to NumPy arrays. Nonetheless, you are saving me some work to get it right.
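For anyone reading along, the key-matching idea can be sketched roughly as follows. This is only an illustration, not the script linked above: the key names, the `weight`/`bias` to `w`/`b` renaming rule, the `/` separator, and the transpose of 2-D weights (which assumes the target stores dense weights as (in, out) while PyTorch `nn.Linear` stores (out, in)) are all assumptions to check against the real checkpoints.

```python
import pickle
import numpy as np

def to_nested_params(flat_state, transpose_linear=True):
    """Turn {'a.b.weight': array, ...} into {'a/b': {'w': array, 'b': array}}.

    Hypothetical helper: the renaming rule and the transpose are assumptions,
    not the mapping used by the actual conversion script.
    """
    suffix_map = {"weight": "w", "bias": "b"}
    nested = {}
    for key, value in flat_state.items():
        module, _, leaf = key.rpartition(".")
        # torch tensors expose .detach().cpu().numpy(); plain arrays pass through
        if hasattr(value, "detach"):
            value = value.detach().cpu().numpy()
        arr = np.asarray(value, dtype=np.float32)
        if transpose_linear and leaf == "weight" and arr.ndim == 2:
            arr = arr.T  # (out, in) -> (in, out)
        nested.setdefault(module.replace(".", "/"), {})[suffix_map.get(leaf, leaf)] = arr
    return nested

# Toy stand-in for a loaded state dict (made-up key names).
flat = {
    "encoder.dense.weight": np.ones((4, 3)),
    "encoder.dense.bias": np.zeros(4),
}
params = to_nested_params(flat)
blob = pickle.dumps(params)  # the nested dict of NumPy arrays pickles directly
```

With a real checkpoint, `flat` would come from loading the .pt file in PyTorch, and the resulting nested dict would be written to a .pkl file.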

Much appreciated and best end of year wishes to you.

adrienchaton avatar Dec 20 '23 17:12 adrienchaton

I just uploaded the script 🤪 Tell me if you run into any issues.

sokrypton avatar Dec 20 '23 19:12 sokrypton

@sokrypton thanks for clarifying. I didn't check the commit history, but I was surprised I hadn't seen the script before asking. The script worked flawlessly and the checkpoint runs correctly. I didn't run assertions against the original pt checkpoint and code, but sequence recovery for inverse folding (IF) on antibody (Ab) backbones is high (at low temperature it is mostly >80%), so the conversion should be correct. Would you like to add this checkpoint to the repo?
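As a reference for what "sequence recovery" means here, a minimal sketch: the fraction of positions where the designed sequence matches the native one. The function name and the example sequences are made up for illustration, and real evaluations may restrict the comparison to designed positions only.

```python
def sequence_recovery(native, designed):
    """Fraction of positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)

# Made-up toy sequences, not real antibody data.
recovery = sequence_recovery("ACDEFGHK", "ACDEFGHR")
print(f"{recovery:.1%}")
```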

AFAIK, there is only one other IF model for Abs, AntiFold, fine-tuned from the ESM one by the same lab (oxpig). AntiFold is maybe more interesting (possibly drifting a bit less towards the germline), but the MPNN and ColabDesign workflows are broader and much more interesting as protein design tools. Bottom line, having both is great for me, and I would imagine it's a relevant addition to the repo.

adrienchaton avatar Dec 21 '23 09:12 adrienchaton

Awesome! Tell me if you wanna contribute the converted weights, would be happy to add them to the repo :D

sokrypton avatar Dec 21 '23 15:12 sokrypton

Hi @sokrypton sorry for the delayed answer, sounds good!

FYI the fine-tuned pt weights are shared here https://zenodo.org/records/8164693 under the Creative Commons Attribution 4.0 license. The original MPNN code and models are under the MIT license. Both are quite permissive licenses, but we probably still want to mention them.

I have already used the converted checkpoint quite a bit and the outputs seem correct. How would you like me to contribute it?

  • just upload the .pkl in the "soluble" weight directory?
  • add a self-contained script which allows dl/conversion? (based on the script you kindly uploaded)
  • add an example script (with some public structure from e.g. SabDab)?
  • some test script with assertion against running the pt checkpoint with the original MPNN codes?
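The last option could look roughly like the sketch below: run the same input through the original PyTorch model and the converted one, then compare outputs numerically. The helper name, the tolerance, and the toy array shape are assumptions, and the random arrays only stand in for real model outputs.

```python
import numpy as np

def assert_outputs_match(reference, converted, atol=1e-4):
    """Check that two output arrays agree within an absolute tolerance.

    Returns the maximum absolute difference for reporting.
    """
    reference = np.asarray(reference, dtype=np.float32)
    converted = np.asarray(converted, dtype=np.float32)
    assert reference.shape == converted.shape, (reference.shape, converted.shape)
    np.testing.assert_allclose(converted, reference, atol=atol, rtol=0)
    return float(np.max(np.abs(converted - reference)))

# Toy stand-ins: in a real test these would be the logits from the PyTorch
# checkpoint and from the converted .pkl on the same backbone.
rng = np.random.default_rng(0)
ref_logits = rng.normal(size=(10, 21)).astype(np.float32)
new_logits = ref_logits + 1e-6
max_diff = assert_outputs_match(ref_logits, new_logits)
```

An absolute tolerance is used because small float32 differences between frameworks are expected; the right threshold depends on the model and should be chosen empirically.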

Let me know and I will prepare a pull-request. Cheers

adrienchaton avatar Dec 23 '23 08:12 adrienchaton

In case it's helpful to anyone, the converted checkpoint is attached here for now! Feel free to bring it into the repo however you wish, or let me know if you need anything from my end. Thanks again for the conversion script!

abmpnn.pkl.zip

adrienchaton avatar Dec 30 '23 09:12 adrienchaton

In case someone is interested in running a "high thermostability" MPNN model in ColabDesign:

https://github.com/meilerlab/HyperMPNN/issues/1#issuecomment-2545800754

adrienchaton avatar Dec 16 '24 14:12 adrienchaton