Sequence first, then structure, or the other way around?
Thanks for making this comprehensive model open source.
I was wondering what the correct sequence of predictions is:
In your examples, e.g. the one with Carbonic Anhydrase (2vvb), the following order is used:
masked sequence prompt => predicted sequence track => predicted backbone structure
For the GFP evolution notebook gfp_design.ipynb (related to your publication), the order is:
heavily masked sequence & masked structure => structure tokens => generated sequence tokens => purged structure tokens => freshly generated structure tokens
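To make the comparison concrete, here is how I would write the two orders down as track schedules. This is only a sketch of the ordering logic: `run_schedule` and `stub_generate` are hypothetical helpers I made up for illustration and are not part of the esm API. The stub just records which other tracks were unmasked when a track was generated.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Track name -> content; None stands for a fully masked track.
Tracks = Dict[str, Optional[str]]
# (action, track) where action is "generate" or "mask".
Step = Tuple[str, str]

# Order used in the Carbonic Anhydrase (2vvb) example.
SEQUENCE_FIRST: List[Step] = [
    ("generate", "sequence"),
    ("generate", "structure"),
]

# Order used in gfp_design.ipynb.
STRUCTURE_FIRST: List[Step] = [
    ("generate", "structure"),
    ("generate", "sequence"),
    ("mask", "structure"),      # purge the intermediate structure tokens
    ("generate", "structure"),  # regenerate structure from the new sequence
]


def run_schedule(
    tracks: Tracks,
    schedule: List[Step],
    generate: Callable[[Tracks, str], str],
) -> Tracks:
    """Apply the steps in order and return the final track state (hypothetical helper)."""
    state = dict(tracks)  # work on a copy so the prompt is left untouched
    for action, track in schedule:
        if action == "mask":
            state[track] = None
        elif action == "generate":
            state[track] = generate(state, track)
        else:
            raise ValueError(f"unknown action: {action}")
    return state


def stub_generate(state: Tracks, track: str) -> str:
    """Placeholder for a real model call: reports which other tracks were available."""
    available = sorted(
        name for name, value in state.items() if value is not None and name != track
    )
    return f"{track} given [{', '.join(available)}]"


prompt: Tracks = {"sequence": "masked prompt", "structure": None}
print(run_schedule(prompt, SEQUENCE_FIRST, stub_generate))
print(run_schedule(prompt, STRUCTURE_FIRST, stub_generate))
```

In a real run, `stub_generate` would be replaced by the actual model call for the given track. The sketch only shows which tracks each generation step is conditioned on under the two orders.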
In my particular case, I have a heavily masked sequence track, with coordinates provided for the unmasked amino acids, and a secondary structure track with only a few masked positions. Is it better to predict the structure first or the sequence first?
Or, more generally: is there a rule for which track should be predicted first?
Thanks for any feedback.