Protein-protein interaction IndexError

Open hellorp1990 opened this issue 2 years ago • 7 comments

Hi, I am new to the protein folding area and I am having an issue running a protein-protein interaction prediction with AlphaFold.

My inputs:

 >EUXXXX
MKMASNDATPSDGSTANLVPEVNNEVMALEPVVGAAIAAPVAGQQNVIDPWIRNNFVQAPGGEFTVSPRNAPGEILWSAPLGPDLNPYLSHLARMYNGYAGGFEVQVILAGNAFTAGKIIFAAVPPNFPTEGLSPSQVTMFPHIIVDVRQLEPVLIPLPDVRNNFYHYNQSNDPTIKLIAMLYTPLRANNAGDDVFTVSCRVLTRPSPDFDFIFLVPPTVESRTKPFSVPILTVEEMTNSRFPIPLEKLFTGPSSAFVVQPQNGRCTTDGVL
 >12H
MEWNWVVLFLLSLTAGVYAQGQMQQSGAELVKPKLSCKTSGF

I am getting an error: IndexError: list index out of range (I believe this error is related to the input file formatting).

Can anyone help with this issue?

hellorp1990 avatar Aug 29 '22 14:08 hellorp1990

Are EUXXXX and 12H sequence names? If so, the FASTA format requires the following formatting:

>EUXXXX
MKMASNDATPSDGSTANLVPEVNNEVMALEPVVGAAIAAPVAGQQNVIDPWIRNNFVQAPGGEFTVSPRNAPGEILWSAPLGPDLNPYLSHLARMYNGYAGGFEVQVILAGNAFTAGKIIFAAVPPNFPTEGLSPSQVTMFPHIIVDVRQLEPVLIPLPDVRNNFYHYNQSNDPTIKLIAMLYTPLRANNAGDDVFTVSCRVLTRPSPDFDFIFLVPPTVESRTKPFSVPILTVEEMTNSRFPIPLEKLFTGPSSAFVVQPQNGRCTTDGVL
>12H
MEWNWVVLFLLSLTAGVYAQGQMQQSGAELVKPKLSCKTSGF

See https://github.com/deepmind/alphafold#examples for more details.
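The formatting rule above can be checked before launching a run. What follows is only a minimal sketch in Python, not AlphaFold's own parser: `find_fasta_problems` is a hypothetical helper, and it assumes that every non-blank line must start in column 0 and that every `>` header must be followed by at least one sequence line.

```python
def find_fasta_problems(text):
    """Return a list of human-readable problems found in FASTA text.

    Hypothetical helper for illustration; it does not reproduce
    AlphaFold's FASTA parser. Blank lines are ignored here.
    """
    problems = []
    seq_len = None  # residues seen for the current record; None before any header
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        if line != line.lstrip():
            problems.append(f"line {lineno}: leading whitespace")
        stripped = line.strip()
        if stripped.startswith(">"):
            if seq_len == 0:
                problems.append(f"line {lineno}: previous header has no sequence")
            seq_len = 0
        elif seq_len is None:
            problems.append(f"line {lineno}: sequence data before any '>' header")
        else:
            seq_len += len(stripped)
    if seq_len == 0:
        problems.append("last header has no sequence")
    return problems


# Indented headers, as in the original post, are reported line by line.
print(find_fasta_problems(" >EUXXXX\nMKMASN\n >12H\nMEWNWV\n"))
```

An empty list means the text passed these particular checks; it does not guarantee that the pipeline will accept the file.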

Augustin-Zidek avatar Aug 29 '22 15:08 Augustin-Zidek

@Augustin-Zidek Yes, they start with > (I added them here as well, but for some reason the > was removed from the comment).

hellorp1990 avatar Aug 29 '22 15:08 hellorp1990

I see, > is a special character in Markdown so it didn't render properly -- I fixed that in your comment. Could you try without spaces before the > characters?

Augustin-Zidek avatar Aug 29 '22 15:08 Augustin-Zidek

Could you also post the full command you are using to launch this and the full error?

Augustin-Zidek avatar Aug 29 '22 15:08 Augustin-Zidek

@Augustin-Zidek When I ran the first sequence (> EUXXXX) alone with the monomer pipeline, it ran fine, but when I tried to run the multimer (protein-protein interaction) pipeline, it showed an error.

Full error: error

hellorp1990 avatar Aug 29 '22 15:08 hellorp1990

@Augustin-Zidek input command:

--fasta_paths=input.fasta
--max_template_date=2022-08-29
--model_preset=multimer \

hellorp1990 avatar Aug 29 '22 15:08 hellorp1990

The error is clearly coming from the FASTA parser, so there must be something wrong with the FASTA format.

  • Is the listing you provided above exactly what is in the FASTA file (including blank lines, leading/trailing spaces, etc.)?
  • Is the FASTA encoded as ASCII or UTF-8?
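Both questions can be answered by looking at the raw bytes of the file rather than at how an editor displays it. The following is a rough sketch under stated assumptions: `inspect_fasta_bytes` is a hypothetical helper, not part of AlphaFold, and the set of things it flags (byte order mark, carriage returns, non-ASCII bytes, blank lines, stray whitespace) is just a guess at what might upset a strict parser.

```python
def inspect_fasta_bytes(data):
    """Report byte-level oddities in a FASTA file read in binary mode.

    Hypothetical helper for illustration only.
    """
    findings = []
    if data.startswith(b"\xef\xbb\xbf"):
        findings.append("UTF-8 byte order mark at start of file")
    if b"\r" in data:
        findings.append("carriage returns (Windows or old Mac line endings)")
    try:
        data.decode("ascii")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that is not ASCII.
        findings.append(
            f"non-ASCII byte 0x{data[err.start]:02x} at offset {err.start}"
        )
    for lineno, line in enumerate(data.splitlines(), start=1):
        if not line.strip():
            findings.append(f"line {lineno}: blank")
        elif line != line.strip():
            findings.append(f"line {lineno}: leading or trailing whitespace")
    return findings


# Example with in-memory bytes: an indented header and a trailing blank line.
print(inspect_fasta_bytes(b" >EUXXXX\nMKMASN\n\n"))
```

For the file from this thread, the call would be `inspect_fasta_bytes(open("input.fasta", "rb").read())`; opening in binary mode matters, because text mode would silently decode and normalize some of the very things being checked.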

Augustin-Zidek avatar Aug 29 '22 15:08 Augustin-Zidek