
ESM atlas length limit

Open y-hwang opened this issue 2 years ago • 4 comments

Thank you for developing this excellent tool! I keep getting the following error when attempting to fold sequences, even when they are shorter than 400 AAs: "Invalid entry. (Max length is 400)"

y-hwang avatar Feb 14 '23 23:02 y-hwang

Hmm, that's weird, but the error message is probably misleading. Does your input contain any non-standard amino acids? Other characters, e.g. spaces and line breaks, will throw us off too.

tomsercu avatar Feb 15 '23 23:02 tomsercu
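
A quick way to check an input for the kind of characters mentioned above is sketched below. It assumes the frontend accepts only the 20 standard one-letter amino acid codes in upper case; the thread does not state the exact accepted set, and the name find_nonstandard is made up for this illustration.

```python
# Assumption: only the 20 standard one-letter codes are accepted.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def find_nonstandard(sequence):
    """Return (index, character) pairs for every character outside the standard set."""
    return [
        (index, char)
        for index, char in enumerate(sequence)
        if char not in STANDARD_AMINO_ACIDS
    ]


# A space, an "X" and a trailing line break are all reported.
print(find_nonstandard("MKT XLV\n"))
```

An empty list means the sequence contains nothing outside the assumed standard set, so any remaining error would have another cause.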

Ah, it turned out to be due to "X", which codes for any amino acid. Is there a way to replace this with another "placeholder" amino acid code?

y-hwang avatar Feb 16 '23 02:02 y-hwang

In principle X could be allowed, since the model has been pre-trained with it. You could substitute a flexible amino acid like G if it makes sense in your setting.

tomsercu avatar Feb 16 '23 03:02 tomsercu
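
The substitution suggested above could be scripted roughly as follows. This is only a sketch: the function name replace_unknown and the choice to also strip whitespace and upper-case the input are assumptions made for illustration, and swapping X for G changes the sequence being folded, so it is only appropriate where that makes sense for the protein in question.

```python
def replace_unknown(sequence, placeholder="G"):
    """Strip whitespace, upper-case the sequence, and replace every X with a placeholder."""
    # str.split() with no arguments splits on any whitespace, including line breaks.
    cleaned = "".join(sequence.split()).upper()
    return cleaned.replace("X", placeholder)


print(replace_unknown("AAA XAAA\n"))
```

The placeholder parameter defaults to G, following the suggestion in the comment, but any other one-letter code can be passed instead.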

Thanks for noticing this bug. "X" is actually allowed in the API, but not in the ESMAtlas frontend. For now, please use curl -X POST --data "AAAXAAA" https://api.esmatlas.com/foldSequence/v1 in this case.

nikita-smetanin avatar Feb 21 '23 16:02 nikita-smetanin
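
For scripted use, the curl command above can be mirrored with Python's standard library. This is a minimal sketch: the endpoint URL and the raw-sequence request body come from the comment, while the helper names build_request and fold_sequence, the timeout, and the decision to return the response body as plain text are assumptions, since the thread does not describe the response format.

```python
import urllib.request

# Endpoint taken from the curl command in the thread.
API_URL = "https://api.esmatlas.com/foldSequence/v1"


def build_request(sequence):
    """Build a POST request whose body is the raw sequence, as curl --data sends it."""
    return urllib.request.Request(
        API_URL,
        data=sequence.encode("ascii"),
        method="POST",
    )


def fold_sequence(sequence, timeout=120):
    """Send the sequence to the API and return the response body as text.

    Assumption: the body is text; the thread does not say what the API returns.
    """
    with urllib.request.urlopen(build_request(sequence), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fold_sequence("AAAXAAA") would send the same request body as the curl example, including the "X" that the frontend rejects.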