
Unfolded regions in big multimers

Open abiadak opened this issue 2 years ago • 5 comments

Hi, I am trying to model protein complexes with 2 or more proteins, each with a different copy number. Usually everything goes very well when the number of proteins or their copy number is low, but as I increase the complexity, AlphaFold v2.2.0 begins to produce models in which some proteins that were previously perfectly folded become totally or partially unfolded. For example, these 2 proteins:

p51 MGFDALSVKKKNQSGSTRKLKKIQKSIDRINRVYPVCGYTDEGYVKTKFGLKEGYFEVFDVKHYDTNILDEKEFNFVTESYWKLQQIYSDPLKEVHMNLPEDNQLQQEYIKYKIERTNNFARLRVLNAELEKLKFIEKTYKSRRTYLIVFGRTAEELTKRIDDLTRFTNFLEPQPISIEKKIKILHGMNNFI p52 MEKIIKDLDFVYATQPMGGISFKDEFFNRTGDGYVACLHVYRYPENFTPYWLNKLTSIHNTIVTIDTYTQKDINYTDKVKSSTNEMKSRIRNAANETDADVAREELQTLRELGLAITKGGEVIKQIHVRIFLHGATQSELEKRISEVQKSIDSDGFKSKVFLDENKEEWQSLFLDYETQLTMPNKRIGNDMPAEAIGLGFAYDQTSLSDPTGVYYGYTSTRGTVYWDLFHKTTKRLYYNMFVAGDMGSGKSTLLKKILRDNASKGNFIRGFDKSGEFQSVTADMGGITIDLDGSNGRINLMQIFPSVTKKIGDKVVIDESASFRQHVSKLNSCYRIKNPKSDDNVLVQFDELVYGFYEKHNFWGEDARSNITQLPPEEYPLLSDFQAYCEERYHAEKDPNFKSRIGDIAMSIKNLVTQFGEIFDGITTIPDMVNEQIVFYDIGNLSQLSDQVKDIQIFNALSQIWGTMMNIGRKEKEAYDKGKIHWFDITRFLIILDECHNLLELKKAHTANFFVTLMSEARKFFGGLVLATQRIERMFPNTNTSDPDMAIAANKLREIFGLTQYKALFKQDQTSMKLIKNLFEDQMTDNEYALLPKFETGDCILSIAGDRNLVMHVEATQEELELFEGGA

On the left, the two proteins modelled as a dimer (p51 in bluish colors, p52 in red/orange colors); on the right, modelled as part of a hexamer (6 x p51 + 6 x p52). It's an ATPase, and the hexamer would be the biologically active form: image

In this example, the alpha helix at the N-terminus of p51 is what gets unfolded, and this is a mild example. In other cases, with other proteins, some of them become totally unfolded. For this complex, I've obtained models without problems with up to 4 copies of each protein (4 x p51 + 4 x p52), and it would be possible to build the full hexamer model using other programs, but I have other cases, with different stoichiometry and more proteins involved, that are not so easy to work around.

This was run on an NVIDIA A100-PCIE-40GB, with TensorFlow 2.5.0, jax 0.2.19, jaxlib 0.1.70 and CUDA 11.6.

Any suggestions on how to overcome this kind of issue, apart from the obvious one of dividing the problem into smaller ones?

I also have a question about GPU and main memory usage. I've seen comments stating that AlphaFold, for one multimer, can use only the memory of one GPU on the system plus main memory when TF_FORCE_UNIFIED_MEMORY=1 is set (and this is what I observe), and others saying that it's possible to use the aggregated memory of more than one GPU on the same server for big proteins or multimers (and this is what I haven't managed to achieve). Is this really possible?
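
For reference, the unified-memory setup mentioned above can be sketched as follows. This is a minimal illustration, not a confirmed recipe: `unified_memory_env` is a hypothetical helper, and `XLA_PYTHON_CLIENT_MEM_FRACTION` with the value 4.0 does not appear in this thread. It is an assumption based on what AlphaFold's Docker launcher is believed to set by default. The variables need to be in the environment before JAX is imported.

```python
import os
from typing import Dict


def unified_memory_env(mem_fraction: float = 4.0) -> Dict[str, str]:
    """Return the environment variables for unified (GPU + host) memory.

    mem_fraction is an assumed default, not a value taken from this thread.
    A fraction above 1.0 only makes sense together with unified memory.
    """
    if mem_fraction <= 0:
        raise ValueError("mem_fraction must be positive")
    return {
        # Let allocations spill from GPU memory into host RAM.
        "TF_FORCE_UNIFIED_MEMORY": "1",
        # Fraction of one GPU's memory that XLA is allowed to address.
        "XLA_PYTHON_CLIENT_MEM_FRACTION": str(mem_fraction),
    }


# Apply the settings; this has to happen before `import jax`.
os.environ.update(unified_memory_env())
```

Note that this sketch only configures a single process to oversubscribe one GPU into main memory, which matches the behaviour observed above. It does not attempt to pool the memory of several GPUs.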

Many thanks to the developers for this great software!

abiadak avatar May 26 '22 12:05 abiadak

Hi Abiadak,

Would it be possible for you to share the FASTA input format for predicting multimer structures? I get an error when I submit sequences separated by ':'. Thanks in advance!

Sameerpython avatar May 31 '22 07:05 Sameerpython

Hi Sameer,

The input format for multimers is very simple, it's just one sequence after another in FASTA format. For example:

>p64
MLTERQALQDRLEKIDKDEITLIKEYQKQRNQIFERLREIDREEYKNLPDLKQLASLEIHQKSKPERDIRKHVAVNILKVNPDGLSADELRSKIEKETNMQILNMTNFMRSIMEKNPSVKKPRRGYYRFEEM
>p64
MLTERQALQDRLEKIDKDEITLIKEYQKQRNQIFERLREIDREEYKNLPDLKQLASLEIHQKSKPERDIRKHVAVNILKVNPDGLSADELRSKIEKETNMQILNMTNFMRSIMEKNPSVKKPRRGYYRFEEM
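
For larger stoichiometries, the format described above (one FASTA record per chain copy, repeated as many times as the copy number) can be generated with a few lines of Python. This is a minimal sketch: `build_multimer_fasta` is a hypothetical helper, not part of AlphaFold, and the sequences in the example are short placeholders rather than real proteins.

```python
from typing import Dict, Tuple


def build_multimer_fasta(chains: Dict[str, Tuple[str, int]]) -> str:
    """Build multimer FASTA text from {name: (sequence, copy_number)}.

    Each chain is written once per copy, one record after another,
    reusing the same header for every copy as in the example above.
    """
    records = []
    for name, (sequence, copies) in chains.items():
        if copies < 1:
            raise ValueError(f"copy number for {name} must be at least 1")
        for _ in range(copies):
            records.append(f">{name}\n{sequence}")
    return "\n".join(records) + "\n"


if __name__ == "__main__":
    # Placeholder sequences; substitute the real ones.
    example = {"p51": ("MGFDALSV", 2), "p52": ("MEKIIKDL", 2)}
    print(build_multimer_fasta(example), end="")
```

Writing the returned text to a single `.fasta` file gives one input file for the whole complex.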

Happy modelling!

abiadak avatar May 31 '22 12:05 abiadak

Hi, I am planning to analyze 700 proteins using AlphaFold. Can I submit these proteins in batch mode on the server, or do I have to submit them one after another?
Thank you.

rajdeepjaswal52 avatar Jun 27 '22 14:06 rajdeepjaswal52

You could try to use ColabFold's AlphaFold2_batch notebook for this @rajdeepjaswal52.
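
Batch tools of this kind typically read a folder with one FASTA file per prediction target. That is an assumption here, so check the notebook's own instructions. If the 700 proteins are stored in a single multi-record FASTA file, a minimal sketch for splitting it could look like this; `split_fasta` and the output naming scheme are hypothetical.

```python
import re
from pathlib import Path
from typing import List


def split_fasta(multi_fasta: Path, out_dir: Path) -> List[Path]:
    """Write each record of a multi-record FASTA file to its own file."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # Parse the input into (header, sequence) pairs.
    records = []
    header, chunks = None, []
    for line in multi_fasta.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))

    written = []
    for index, (header, sequence) in enumerate(records, start=1):
        # Filesystem-safe name; the index prefix keeps duplicate IDs apart.
        first_token = header.split()[0] if header else "record"
        safe = re.sub(r"[^A-Za-z0-9_.-]", "_", first_token)
        path = out_dir / f"{index:04d}_{safe}.fasta"
        path.write_text(f">{header}\n{sequence}\n")
        written.append(path)
    return written
```

Calling `split_fasta(Path("all_proteins.fasta"), Path("batch_input"))` (both names are placeholders) would then produce one numbered file per protein in `batch_input`.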

atgctg avatar Jun 27 '22 17:06 atgctg

Thank you very much for the suggestion.

rajdeepjaswal52 avatar Jun 27 '22 18:06 rajdeepjaswal52