DPPI

from protein sequence to training data

Open chelseaju opened this issue 6 years ago • 7 comments

Hi, we would like to try out your code to predict PPI; however, we are having trouble understanding the input format. Given two lists of protein sequences (positive and negative sets), how do we convert the primary protein sequences to the format you use in the myTrain/ folder? Thanks!

chelseaju avatar Oct 17 '18 04:10 chelseaju

Hi,

You can find the answer in the README. First prepare your csv file and the corresponding profiles, then run the scripts convert_csv_to_dat.lua and create_crop.lua. You can make the profile for each protein using blastpgp:

blastpgp -a 1 -F F -j 3 -b 3000 -e 1e-3 -h 1e-3 -d <path to non-redundant blast database> -I <path to fasta file containing exactly one target protein> -Q <path to output profile file>

hashemifar avatar Oct 17 '18 16:10 hashemifar
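
Since the command above expects a FASTA file containing exactly one target protein, a list of sequences first has to be split into one file per protein. The following is a minimal sketch of that step, not part of DPPI itself. The function names `split_fasta` and `write_single_fastas` and the `<identifier>.fasta` naming scheme are assumptions made for illustration, and identifiers are assumed to be safe to use as file names.

```python
from pathlib import Path


def split_fasta(text):
    """Parse FASTA text into a list of (identifier, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            tokens = line[1:].split()
            # Use the first whitespace-delimited token as the identifier;
            # fall back to a positional name when the header is empty.
            header = tokens[0] if tokens else f"seq{len(records) + 1}"
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def write_single_fastas(records, out_dir):
    """Write one <identifier>.fasta file per record and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, seq in records:
        path = out / f"{name}.fasta"
        path.write_text(f">{name}\n{seq}\n")
        paths.append(path)
    return paths
```

Each returned path can then be passed as the single-protein FASTA argument of the profile-building command, with the identifier reused to name the output profile.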

Thanks for the quick response. Since blastpgp is obsolete, how would you translate these parameters to psiblast? Thanks!

chelseaju avatar Oct 18 '18 06:10 chelseaju
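
One possible translation of the blastpgp flags to BLAST+ psiblast is sketched below. This mapping is an assumption based on the documented meaning of each option, not something confirmed by the DPPI authors: psiblast may produce slightly different profiles than blastpgp, and whether the DPPI Lua scripts accept the resulting ASCII PSSM unchanged is untested. The helper name `psiblast_command` is hypothetical.

```python
import shlex


def psiblast_command(fasta_path, db_path, pssm_path, threads=1):
    """Build a psiblast argument list mirroring the blastpgp call in this thread.

    The flag correspondence is an assumption, noted next to each option.
    """
    return [
        "psiblast",
        "-query", str(fasta_path),          # blastpgp: single-protein FASTA input
        "-db", str(db_path),                # blastpgp: -d
        "-num_iterations", "3",             # blastpgp: -j 3
        "-evalue", "1e-3",                  # blastpgp: -e 1e-3
        "-inclusion_ethresh", "1e-3",       # blastpgp: -h 1e-3
        "-num_alignments", "3000",          # blastpgp: -b 3000
        "-seg", "no",                       # blastpgp: -F F (no low-complexity filter)
        "-num_threads", str(threads),       # blastpgp: -a 1
        "-out_ascii_pssm", str(pssm_path),  # blastpgp: -Q
    ]


if __name__ == "__main__":
    # Placeholder paths; print the command so it can be inspected before running.
    cmd = psiblast_command("P1.fasta", "nr", "P1.profile")
    print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` once psiblast and a formatted database are available locally.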

Since running blastpgp or psiblast is computationally expensive for thousands of protein sequences, is there a way to download precomputed profiles online? Thanks!

chelseaju avatar Oct 22 '18 05:10 chelseaju

Hi, you may be able to find profiles for some of your proteins in existing databases. You do not have to use blastpgp; there are a number of other methods for computing profiles, and you may want to check whether any of them is faster than blastpgp.

Best, Somaye

hashemifar avatar Oct 22 '18 17:10 hashemifar

Hi. I have downloaded the human.profile and yeast.profile files. Can you share your code for converting these files into individual profiles like those in the myTrain folders? Or can you share more of the individual profile samples you used for your experiments? Thank you. My email: [email protected]

CindyHXH avatar Dec 01 '18 20:12 CindyHXH

Hey everyone.

How did you solve these problems? I'm trying to create the protein profiles, but I don't know how to replicate the blastpgp parameters in psiblast or where to download the profiles.

Thanks in advance.

andrea-mosk avatar Sep 10 '19 07:09 andrea-mosk

I don't think I understood your question. Would you please explain a little more? Do you mean that you cannot run blastpgp?

hashemifar avatar Sep 11 '19 21:09 hashemifar