folding_tools
add protein sequence design
@ginaelnesr and I created a list of inverse folding models that I think might nicely complement this *fold list, similar to the pLM list already present in the repo.
Let me know if it is worthwhile to add it here, or if you think it would be better suited to Kevin's list.
I think it's a great idea... but I'm a bit concerned about the terminology used here. Calling this "inverse folding" is misleading. Inverse folding has a very specific meaning: you are designing a protein to fold into one structure and no other structure. Since there is no fold enumeration or any negative design against alternative conformations, this is not inverse folding. Inverting protein structure prediction models for protein design is one step closer to "inverse folding".
I would suggest renaming this document to "protein design" and grouping the design methods based on the probabilities they are approximating:
P(sequence) = various language models etc.
P(sequence | structure) = ESM-IF, ProteinMPNN, TERMinator, TrMRF etc.
P(structure | sequence) = AlphaDesign/AfDesign/TrDesign/RfDesign etc.
Mind if I accept and then rename to proteindesign.md? 😄
> there is no fold enumeration or any negative design against alternative conformations, this is not inverse folding. Inverting protein structure prediction models for protein design is one step closer to "inverse folding".
Hehe, I agree with this. I've seen the term used by a few ML people already, so I am not sure you will be able to stop its proliferation, but I guess that from a biophysical standpoint "protein sequence design" is more appropriate. Happy to remove the reference to "inversefolding" and call it proteindesign.md
or proteinsequencedesign.md
I might add some more models before merging, so I changed the pull request to a draft.
@duerrsimon great idea! Have you had a chat with @noeliaferruz about this?... 👀
I renamed the file and Gina fixed some formatting. I think it is ready to merge now.