esm

First sequence, then structure, or the other way around?

Open: l-i-g opened this issue 3 months ago • 1 comment

Thanks for making this comprehensive model open source.

I was wondering what the correct order of predictions is.

In your examples, e.g. the one with carbonic anhydrase (2vvb), the following order is used:

   masked sequence prompt => predicted sequence track => predicted backbone structure

For the GFP design notebook, gfp_design.ipynb (related to your publication), the order is:

  heavily masked sequence & masked structure => structure tokens => generated sequence tokens => purged structure tokens => freshly generated structure tokens
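
To make the two orders above concrete, here is a minimal sketch that writes each one as a list of steps. It does not call the real model: `Tracks`, `fill_track`, `purge_track` and `run` are hypothetical stand-ins, not part of the esm API, and the stub simply writes a placeholder into every masked position. In the actual library each "fill" step would be a model call for the corresponding track.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple

MASK = None  # hypothetical marker for a masked position


@dataclass(frozen=True)
class Tracks:
    """Hypothetical container for two prompt tracks (not the esm API)."""
    sequence: List[Optional[str]]
    structure: List[Optional[str]]


def fill_track(p: Tracks, track: str) -> Tracks:
    """Stand-in for a model call: fill masked positions of one track."""
    placeholder = "X" if track == "sequence" else "s"
    filled = [placeholder if tok is MASK else tok for tok in getattr(p, track)]
    return replace(p, **{track: filled})


def purge_track(p: Tracks, track: str) -> Tracks:
    """Re-mask every position of one track."""
    return replace(p, **{track: [MASK] * len(getattr(p, track))})


def run(p: Tracks, steps: List[Tuple[str, str]]) -> Tracks:
    """Apply (operation, track) steps in order."""
    for op, track in steps:
        p = fill_track(p, track) if op == "fill" else purge_track(p, track)
    return p


# Order used in the carbonic anhydrase example: sequence, then structure.
SEQUENCE_FIRST = [("fill", "sequence"), ("fill", "structure")]

# Order used in gfp_design.ipynb: structure, sequence, purge, structure again.
GFP_STYLE = [
    ("fill", "structure"),
    ("fill", "sequence"),
    ("purge", "structure"),
    ("fill", "structure"),
]

prompt = Tracks(sequence=["M", MASK, MASK, "K"], structure=[MASK] * 4)
seq_first_result = run(prompt, SEQUENCE_FIRST)
gfp_style_result = run(prompt, GFP_STYLE)
```

With this stub both orders end in the same state, because the placeholder ignores context. With a real model each step is conditioned on whatever the earlier steps filled in, so the order can change the result, which is what the question is about.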

In my particular case, I have a heavily masked sequence track (with coordinates provided for the unmasked amino acids) and a secondary structure track with only a few masked positions. Is it better to predict the structure first or the sequence first?

Or, more generally: is there a general rule for which track should be predicted first?

Thanks for any feedback.

l-i-g, Nov 15 '24 13:11