Embedding multichain proteins with ESM3 and ESMC
Hi,
Many proteins naturally exist in pairs, often referred to as alpha and beta chains, such as TCRs, antibodies, and MHCs. I’m curious about how ESM3/ESMC processes these cases.
- Does ESM3/ESMC have a special separator or handling mechanism for paired chains? I've seen the '|' separator in the sequence tokenizer.
- When using ESMProtein.from_pdb(pdbID), how does ESM handle multi-chain proteins?
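
To make the first question concrete, here is a minimal sketch in plain Python of the kind of input I have in mind. It does not call the esm library at all: the helper names `join_chains` and `split_chains` and the toy sequences are hypothetical, and treating '|' as a chain-break marker is my assumption, not confirmed behavior.

```python
# Hypothetical helpers for illustration only; nothing here is part of the esm package.
# Assumption: "|" marks the boundary between two chains in a single sequence string.
CHAIN_BREAK = "|"


def join_chains(chains):
    """Concatenate per-chain amino-acid sequences into one string."""
    for chain in chains:
        if CHAIN_BREAK in chain:
            raise ValueError("a chain sequence must not contain the separator")
    return CHAIN_BREAK.join(chains)


def split_chains(sequence):
    """Recover the per-chain sequences from a joined string."""
    return sequence.split(CHAIN_BREAK)


# Toy sequences, not real TCR or antibody chains.
alpha = "MKTAYIAK"
beta = "GSHMLEDP"

paired = join_chains([alpha, beta])
print(paired)
print(split_chains(paired))
```

What I would like to know is whether passing a string like `paired` as the sequence is the intended way to represent a paired-chain protein, or whether another mechanism is expected.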
Would appreciate any insights on this!
Thanks!
I am also curious whether some examples (for instance, 3_gfp_design.ipynb) could include multimers, since I found that "esm3-medium-multimer-2024-09" is available, and working with multi-chain proteins is of much greater interest to researchers. Thanks!