Errors at PSIPRED step - Building customized databases for HHSuite

Open tamuanand opened this issue 6 years ago • 27 comments

Hi

I have an existing thread where my question was on obtaining the dsspcmbi binary : https://github.com/soedinglab/hh-suite/issues/72

I then downloaded HHsuite version 3.0.0 (15-03-2015) and I am trying to build a custom database. I updated the paths in HHPaths.pm to point to the PSIPRED instance and the legacy NCBI BLAST.

I am getting errors at the PSIPRED step (the errors are about failure to execute 'makemat')

I happened to see this thread and want to know what I should do to make sure that PsiPred will not use Legacy Blast (hence what changes would be needed on my psipred as well as what changes would be needed to addss.pl)

https://github.com/soedinglab/hh-suite/issues/36

Any help will be greatly appreciated.

Thanks, Anand

tamuanand avatar Oct 05 '17 22:10 tamuanand

Hi Anand,

Also responding to your comment at #36:

  1. For building the pdb70 database we are apparently currently using Psipred version 4.0 and BLAST version 2.2.24. We participated in the last CASP competition with HHpred running psipred version 2.61 and BLAST version 2.2.26.

  2. Please use the dssp-2.0.4-linux-amd64 binary from the dssp ftp and name it dsspcmbi.

I recommend not adding secondary structure information (addss.pl) if you are building a larger database (on the scale of UniProt), since this takes a lot of computational resources. We also dropped secondary structure information for our Uniclust databases.

You can also take a look at how we build the pdb70: https://github.com/soedinglab/hhdatabase_cif70 or the PFAM: https://github.com/soedinglab/hhdatabase_pfam

There is also the "hhsuitedb.py" script in the scripts folder, which might also help you in building a database (you will need the --force flag to produce a database, counterintuitively).

milot-mirdita avatar Oct 17 '17 12:10 milot-mirdita

Hi Milot,

Thanks for taking the time to respond. I have some follow up questions

  1. I am building a customized database with sequences from ~20 species. Do you still not recommend using addss.pl OR would you recommend it? IMO, based on the HHSearch user guide, I think it adds valuable information and it is at this step I am having issues with PSIPRED.

  2. You suggested looking at how you build the pdb70 or the PFAM databases - I checked the links provided. Each link has a bunch of scripts; a README file explaining the core steps/scripts would probably be beneficial.

Thanks

tamuanand avatar Oct 17 '17 14:10 tamuanand

20 species would be around 100k proteins? This is still quite manageable if you have the computational resources. Computing the secondary structure predictions won't hurt, obviously. If it was already a pain to generate the a3m with hhblits, then addss.pl will be an even larger pain.

Using psipred version 2.61 and BLAST version 2.2.26 and entering the correct paths in HHPaths.pm should work, though. This is how the relevant paths look in the CASP version of hhpred:

our $execdir = "/data/hhpredA/bin/psipred/bin";
our $datadir = "/data/hhpredA/bin/psipred/data";
our $ncbidir = "/data/hhpredA/bin/blast/bin";
our $pdbdir = "/data/databases/pdb/divided";
our $dsspdir = "/data/databases/dssp/data";
our $dssp = "/data/databases/dssp/bin/dsspcmbi";

You can also set $dssp to an empty string, to only run Psipred.
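
Before running addss.pl, it can help to confirm that every directory named in HHPaths.pm actually exists. The following is a minimal sketch, not part of HH-suite: the function name check_hhpaths is hypothetical, and it assumes each variable is assigned on its own line in the plain `our $name = "path";` form shown above, with no variable interpolation inside the quoted path.

```shell
#!/bin/sh
# Hypothetical helper (not part of HH-suite): list the *dir variables set in
# HHPaths.pm and report whether each directory exists.
# Assumes one plain assignment per line:  our $execdir = "/some/path";
check_hhpaths() {
    hhpaths_file="$1"
    sed -n 's/^[[:space:]]*our[[:space:]]*\$\([a-z]*dir\)[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1 \2/p' \
        "$hhpaths_file" | (
        status=0
        while read -r name path; do
            if [ -d "$path" ]; then
                echo "ok       $name = $path"
            else
                echo "MISSING  $name = $path"
                status=1
            fi
        done
        # The subshell's exit status becomes the function's return value.
        exit "$status"
    )
}
```

Calling `check_hhpaths /path/to/scripts/HHPaths.pm` prints one line per directory variable and returns non-zero if any directory is missing. It deliberately skips `$dssp`, which names a binary rather than a directory.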

The pdb70_update.sh script is the main entry point and linearly calls the other ones. It should be pretty easy to follow.

milot-mirdita avatar Oct 17 '17 14:10 milot-mirdita

Thanks Milot.

I have used this as the guide (Section 3.5 - Building Customized Databases): https://github.com/soedinglab/hh-suite/blob/master/hhsuite-userguide.pdf and I have been able to run the first 2 steps (split fasta file and build MSA with HHblits).

I have tried all possible combinations listed above (psipred261/psipred4 with blast 2.2.26) after updating the paths in psipred and HHPaths - I am just unable to get past the errors at this step.

tamuanand avatar Oct 17 '17 15:10 tamuanand

Could you upload the log somewhere please?

milot-mirdita avatar Oct 17 '17 16:10 milot-mirdita

This step worked:

mpirun -np 16 ffindex_apply_mpi ecoli.fa.ff{data,index} -i ecoli_a3m_wo_ss.ffindex -d ecoli_a3m_wo_ss.ffdata -- hhblits -d uniprot20_2016_02 -i stdin -oa3m stdout -n 3 -cpu 1 -v 0

This step errs out

mpirun -np 16 ffindex_apply_mpi ecoli_a3m_wo_ss.ff{data,index} -i ecoli_a3m.ffindex -d ecoli_a3m.ffdata -- addss.pl -v 0 stdin stdout

These are the error messages

*** Unable to open checkpoint file!

*** Unable to open checkpoint file!

Error: command '<path_to>src/psipred4.0/psipred/bin/chkparse /tmp/GOK7Zo2g_0/_jIG1Yb4Bq.chk > /tmp/GOK7Zo2g_0/_jIG1Yb4Bq.mtx' returned error code 255

Error: command '<path_to>src/psipred4.0/psipred/bin/chkparse /tmp/18_E2URTjN/Lfr53VCoEu.chk > /tmp/18_E2URTjN/Lfr53VCoEu.mtx' returned error code 255

Bad mtx file - no sequence length!Bad mtx file - no sequence length!

Error: command '<path_to>src/psipred4.0/psipred/bin/psipred /tmp/18_E2URTjN/Lfr53VCoEu.mtx <path_to>src/psipred4.0/psipred/data/weights.dat <path_to>src/psipred4.0/psipred /data/weights.dat2 <path_to>src/psipred4.0/psipred/data/weights.dat3 > /tmp/18_E2URTjN/Lfr53VCoEu.ss' returned error code 255

Error: command '<path_to>src/psipred4.0/psipred/bin/psipred /tmp/GOK7Zo2g_0/_jIG1Yb4Bq.mtx <path_to>src/psipred4.0/psipred/data/weights.dat <path_to>src/psipred4.0/psipred /data/weights.dat2 <path_to>src/psipred4.0/psipred/data/weights.dat3 > /tmp/GOK7Zo2g_0/_jIG1Yb4Bq.ss' returned error code 255

<path_to>src/psipred4.0/psipred/bin/psipass2: /lib64/libc.so.6: version GLIBC_2.14' not found (required by <path_to>src/psipred4.0/psipred/bin/psipass2) <path_to>src/psipred4.0/psipred/bin/psipass2: /lib64/libc.so.6: version GLIBC_2.14' not found (required by <path_to>src/psipred4.0/psipred/bin/psipass2)

Error: command '<path_to>src/psipred4.0/psipred/bin/psipass2 <path_to>src/psipred4.0/psipred/data/weights_p2.dat 1 0.98 1.09 /tmp/GOK7Zo2g_0/_jIG1Yb4Bq.ss2 /tmp/GOK7Zo2g_0/_jIG1Yb4Bq.ss > /tmp/GOK7Zo2g_0/_jIG1Yb4Bq.horiz' returned error code 1

Error: command '<path_to>src/psipred4.0/psipred/bin/psipass2 <path_to>src/psipred4.0/psipred/data/weights_p2.dat 1 0.98 1.09 /tmp/18_E2URTjN/Lfr53VCoEu.ss2 /tmp/18_E2URTjN/Lfr53VCoEu.ss > /tmp/18_E2URTjN/Lfr53VCoEu.horiz' returned error code 1

tamuanand avatar Oct 17 '17 17:10 tamuanand

/lib64/libc.so.6: version GLIBC_2.14' not found (required by <path_to>src/psipred4.0/psipred/bin/psipass2) 

That looks like the culprit. Please recompile psipred from source. The binary you downloaded does not work with your system's libc.
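
To confirm this kind of mismatch, one can compare the glibc symbol versions named in the error log with the version installed on the machine. A minimal sketch follows; both function names are hypothetical, and the parser only handles loader messages of the form quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read loader error messages on stdin and print each
# distinct GLIBC symbol version they mention, one per line.
missing_glibc_versions() {
    grep -o 'GLIBC_[0-9][0-9.]*' | sort -u
}

# Report the installed C library version for comparison. The GNU_LIBC_VERSION
# variable is only known to getconf on glibc-based systems, so print a
# fallback message anywhere else.
installed_glibc_version() {
    getconf GNU_LIBC_VERSION 2>/dev/null || echo "unknown (not a glibc system?)"
}
```

For example, `missing_glibc_versions < addss.log` (with addss.log standing in for wherever the errors were captured) lists the versions the binary asks for. If they are newer than what `installed_glibc_version` reports, the prebuilt binary cannot run on that machine and rebuilding from source is the usual remedy.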

milot-mirdita avatar Oct 18 '17 08:10 milot-mirdita

Hi Milot

I downloaded psipred from here: http://bioinfadmin.cs.ucl.ac.uk/downloads/psipred/old_versions/psipred.4.0.tar.gz

Then I ran tar -xzvf and changed the paths within runpsipred to point to my blast install.

All the errors in the above log occurred after I did the steps mentioned.

tamuanand avatar Oct 18 '17 14:10 tamuanand

Please go to the extracted psipred folder and execute:

cd src
make clean && make all && make install

Afterwards you should have a psipred that works with your system's libc. If libc errors persist, please turn to psipred support.

milot-mirdita avatar Oct 18 '17 14:10 milot-mirdita

Thanks Milot - that seemed to take care of the errors.

However, the *_a3m.ffdata file produced at this step is empty - is that expected? The *_a3m.ffindex seems to have data.

To reiterate, this is the step I am running:

mpirun -np 16 ffindex_apply_mpi ecoli_a3m_wo_ss.ff{data,index} -i ecoli_a3m.ffindex -d ecoli_a3m.ffdata -- addss.pl -v 0 stdin stdout

tamuanand avatar Oct 18 '17 15:10 tamuanand

Remove the -v 0 please and post a log of what errors it returns.
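
One way to capture such a log for a single alignment, without mpirun, is to pull one entry out of the ffindex database and feed it to addss.pl directly. The sketch below is an illustration, not an HH-suite tool: ffindex_extract is a hypothetical name, and it assumes the usual ffindex layout of tab-separated index lines (name, offset, length) where the length includes a terminating null byte.

```shell
#!/bin/sh
# Hypothetical helper: print one entry of an ffindex database to stdout.
# Assumes index lines of the form "name<TAB>offset<TAB>length", with the
# length counting a trailing null byte that is dropped here.
ffindex_extract() {
    data_file="$1"
    index_file="$2"
    entry="$3"
    # Look up "offset length" for the requested entry name.
    offset_len=$(awk -F'\t' -v name="$entry" \
        '$1 == name { print $2, $3; exit }' "$index_file")
    if [ -z "$offset_len" ]; then
        echo "entry not found: $entry" >&2
        return 1
    fi
    offset=${offset_len% *}
    length=${offset_len#* }
    # An entry of length 0 or 1 holds no data besides the null byte.
    [ "$length" -gt 1 ] || return 0
    # tail -c +N counts from 1, so skip "offset" bytes by starting at offset+1.
    tail -c "+$((offset + 1))" "$data_file" | head -c "$((length - 1))"
}
```

For example, `ffindex_extract ecoli_a3m_wo_ss.ffdata ecoli_a3m_wo_ss.ffindex ENTRY > one.a3m` followed by `addss.pl one.a3m one_ss.a3m 2> addss.log`, where ENTRY is a placeholder for a name from the first column of the index, would run a single alignment on one node and keep the messages in addss.log.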

milot-mirdita avatar Oct 18 '17 15:10 milot-mirdita

I keep getting errors about makemat. Do I have to download makemat separately? I cannot find it in my BLAST suite (2.2.25).

This is the log

<path_to>/CentOS_6.4/hhsuite/r-3.0-beta.3/scripts/reformat.pl -v 1 -r -noss a3m psi /tmp/weSY02eHSX/tZSFgL1lFv.in.a3m /tmp/weSY02eHSX/tZSFgL1lFv.in.psi
Predicting secondary structure with PSIPRED ...
$ <path_to>/ncbi-blast-2.2.25/bin/blastpgp -b 1 -j 1 -h 0.001 -d <path_to>/CentOS_6.4/hhsuite/r-3.0-beta.3/data/do_not_delete -i /tmp/weSY02eHSX/tZSFgL1lFv.sq -B /tmp/weSY02eHSX/tZSFgL1lFv.in.psi -C /tmp/weSY02eHSX/tZSFgL1lFv.chk 1> /tmp/weSY02eHSX/tZSFgL1lFv.blalog 2> /tmp/weSY02eHSX/tZSFgL1lFv.blalog
$ <path_to>/CentOS_6.4/hhsuite/r-3.0-beta.3/scripts/reformat.pl -v 1 -r -noss a3m psi /tmp/OPZnXRQvho/EoKc55doYv.in.a3m /tmp/OPZnXRQvho/EoKc55doYv.in.psi
Predicting secondary structure with PSIPRED ...
$ <path_to>/ncbi-blast-2.2.25/bin/blastpgp -b 1 -j 1 -h 0.001 -d <path_to>/CentOS_6.4/hhsuite/r-3.0-beta.3/data/do_not_delete -i /tmp/OPZnXRQvho/EoKc55doYv.sq -B /tmp/OPZnXRQvho/EoKc55doYv.in.psi -C /tmp/OPZnXRQvho/EoKc55doYv.chk 1> /tmp/OPZnXRQvho/EoKc55doYv.blalog 2> /tmp/OPZnXRQvho/EoKc55doYv.blalog
$ echo EoKc55doYv.chk > /tmp/OPZnXRQvho/EoKc55doYv.pn

$ echo EoKc55doYv.sq > /tmp/OPZnXRQvho/EoKc55doYv.sn

$ <path_to>/ncbi-blast-2.2.25/bin/makemat -P /tmp/OPZnXRQvho/EoKc55doYv

Error: failed to execute '<path_to>/ncbi-blast-2.2.25/bin/makemat -P /tmp/OPZnXRQvho/EoKc55doYv': No such file or directory

tamuanand avatar Oct 20 '17 13:10 tamuanand

Please do not mix NCBI BLAST+ with legacy BLAST.

You can download the legacy blast version from: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/legacy

We currently do not have the resources to support newer blast versions.
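
A quick way to tell whether a given bin directory holds legacy BLAST is to look for the two programs that appear in the addss.pl log, blastpgp and makemat. This is a minimal sketch: check_legacy_blast is a hypothetical helper name, and the check only looks at file names and execute permission, not at versions.

```shell
#!/bin/sh
# Hypothetical helper: verify that a directory contains the legacy BLAST
# programs that addss.pl calls (blastpgp and makemat).
check_legacy_blast() {
    bin_dir="$1"
    missing=0
    for tool in blastpgp makemat; do
        if [ ! -x "$bin_dir/$tool" ]; then
            echo "missing: $bin_dir/$tool" >&2
            missing=1
        fi
    done
    # psiblast is the BLAST+ counterpart of blastpgp; finding it here hints
    # that the directory belongs to a BLAST+ installation.
    if [ -x "$bin_dir/psiblast" ]; then
        echo "note: $bin_dir contains psiblast, which ships with BLAST+" >&2
    fi
    return "$missing"
}
```

Running it on the directory configured as `$ncbidir` in HHPaths.pm returns non-zero when either program is absent, which is the situation behind the "No such file or directory" error for makemat.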

milot-mirdita avatar Oct 20 '17 14:10 milot-mirdita

OK - do you recommend 2.2.24 or another version? Thanks in advance.

tamuanand avatar Oct 20 '17 17:10 tamuanand

Hi

I downloaded legacy blast, changed the relevant variables/paths etc

When I run

mpirun -np 16 ffindex_apply_mpi ecoli_a3m_wo_ss.ff{data,index} -i ecoli_a3m.ffindex -d ecoli_a3m.ffdata -- addss.pl -v 0 stdin stdout

these are the errors I get (summarized below):

[1] Error: command 'hhfilter -v 1 -neff 7 -i /tmp/qITiEkHKKC/dHCH5cnc6o.in.a3m -o /tmp/qITiEkHKKC/dHCH5cnc6o.in.a3m' returned error code 0

[2] Segmentation fault /r-2.2.26/bin/blastpgp -b 1 -j 1 -h 0.001 -d /hhsuite/r-3.0-beta.3/data/do_not_delete -i /tmp/lSBUottTXr/dqua6vWwfk.sq -B /tmp/lSBUottTXr/dqua6vWwfk.in.psi -C /tmp/lSBUottTXr/dqua6vWwfk.chk 1> /tmp/lSBUottTXr/dqua6vWwfk.blalog 2> /tmp/lSBUottTXr/dqua6vWwfk.blalog' returned error code 139

[3] /r-2.2.26/bin/makemat -P /tmp/CIbTmvlCpg/pPiCCNgc5N' returned error code 1

[4] /psipred4.0/psipred/bin/psipass2 /psipred4.0/psipred/data/weights_p2.dat 1 0.98 1.09 /tmp/XGp0iVDlXK/AP0NiWa8Sq.ss2 /tmp/XGp0iVDlXK/AP0NiWa8Sq.ss > /tmp/XGp0iVDlXK/AP0NiWa8Sq.horiz' returned error code 1

However, at the end of the run, the number of lines in the *ffindex file from this step (the addss.pl step) is the same as the number of lines in the *ffindex file from the "hhblits" step.

So the question I have is - I do not think this step ran correctly. Has anybody else had this issue?
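
Matching line counts may not by themselves show that the step succeeded, since the index can apparently contain an entry even when the corresponding data is empty (as with the empty ffdata file reported earlier in this thread). A minimal sketch for checking the recorded entry lengths instead, assuming the usual tab-separated ffindex layout of name, offset and length, where the length includes a terminating null byte; count_empty_entries is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: count index entries whose recorded length is 0 or 1,
# i.e. entries that hold nothing beyond the terminating null byte.
count_empty_entries() {
    awk -F'\t' '$3 <= 1 { n++ } END { print n + 0 }' "$1"
}
```

For example, `count_empty_entries ecoli_a3m.ffindex` printing a number close to the total line count would suggest that addss.pl produced no output for most alignments.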

Thanks

tamuanand avatar Oct 29 '17 02:10 tamuanand

Sorry for the delays. I hope you were able to resolve your issue in the meantime.

If not, I need the output of these commands for a possible diagnosis.

Could you please upload the contents of the log files (the filenames after >, 1>, 2> etc.)? Especially /tmp/lSBUottTXr/dqua6vWwfk.blalog.

milot-mirdita avatar Nov 16 '17 14:11 milot-mirdita

Hi Anand,

Did you get this resolved? I am currently trying to get this installed on an AWS Linux machine and have also run into problems getting addss.pl to work.

So I tried to get BLAST alone to work. There seems to be a problem with installing legacy BLAST, because blastpgp expects the BLASTMAT environment variable to be set correctly. (blastall, in contrast, runs without it, so if you only test with blastall you won't find the problem.)

Error: command '/home/ec2-user/blast-2.2.26/bin//blastpgp -b 1 -j 1 -h 0.001 -d /usr/share/hhsuite//data/do_not_delete -i /tmp/sTyBCTAxvQ/1eM9FOKdwv.sq -B /tmp/sTyBCTAxvQ/1eM9FOKdwv.in.psi -C /tmp/sTyBCTAxvQ/1eM9FOKdwv.chk 1> /tmp/sTyBCTAxvQ/1eM9FOKdwv.blalog 2> /tmp/sTyBCTAxvQ/1eM9FOKdwv.blalog' returned error code 139

To fix this, I did the following:

sudo cp blast-2.2.26/bin/* /usr/local/bin
sudo cp -r blast-2.2.26/data /usr/local/blast-data
BLASTMAT="/usr/local/blast-data/"
export BLASTMAT

I suspect that copying to /usr/local wasn't really required, but now my addss.pl works.
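
A small check along these lines can catch a missing or wrong BLASTMAT before addss.pl is started. It is only a sketch: check_blastmat is a hypothetical name, and it assumes the legacy BLAST data directory contains a BLOSUM62 matrix file, which is used here as a marker for a plausible directory.

```shell
#!/bin/sh
# Hypothetical helper: confirm that BLASTMAT is set and points at a directory
# that looks like the legacy BLAST data directory.
# Assumption: that directory contains a file named BLOSUM62.
check_blastmat() {
    if [ -z "${BLASTMAT:-}" ]; then
        echo "BLASTMAT is not set" >&2
        return 1
    fi
    if [ ! -f "$BLASTMAT/BLOSUM62" ]; then
        echo "no BLOSUM62 matrix found in $BLASTMAT" >&2
        return 1
    fi
    echo "BLASTMAT looks usable: $BLASTMAT"
}
```

Since the variable has to be visible to the processes that mpirun starts, it is worth running the check in the same environment as the job rather than only in an interactive shell.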

Best wishes

   Andrea

-- Dr. Andrea Schafferhans

I12, Chair of Prof. Rost Technische Universität München Fakultät für Informatik Boltzmannstraße 3, 85748 Garching, Germany

URL: http://www.rostlab.org/~andrea/ Mail: [email protected] Phone: 0049 89 289 17833 Fax: 0049 89 289 19414 Room: 01.09.055

"Those who would give up essential Liberty to obtain a little temporary Safety, deserve neither Liberty nor Safety." Benjamin Franklin (1775)

aschafu avatar Dec 05 '17 11:12 aschafu

Hi Andrea and Milot,

No, I have not been able to resolve this. I have paused work on this temporarily after several tries and will possibly resume in January.

In the meantime, I have a suggestion: would it be better if the HHSuite tarball included working versions of PSIPRED and DSSP along with the associated tweaks?

Best, Anand

tamuanand avatar Dec 10 '17 03:12 tamuanand

Hello Andrea,

I see that you mentioned that you have been trying to get HHSuite to work on AWS. If so, would you be willing to share the AWS blueprint? Is there a community AMI that I can look up, and how did you get mpirun for HHSuite, DSSP, legacy BLAST etc. to work with AWS?

Thanks in advance, Anand

tamuanand avatar Dec 11 '17 16:12 tamuanand

Hi Anand,

so far, I'm not quite finished with the setup (and only occasionally have time to work on this), but I will share my setup script once it's finished. The way I do it is to use a normal Amazon Linux machine and do all the installation in a user data script. The specifics of the script will have to be adapted depending on which machine type you choose and where you plan to store your data.

Best wishes

   Andrea

#!/bin/bash
set -x
exec > >(tee /var/log/user-data.log|logger -t user-data ) 2>&1

yum update -y
yum upgrade -y
yum groupinstall -y "Development Tools"
yum install -y python-pip lvm2 git
yum install -y cmake mysql

# currently in testing I'm reusing an EBS,
# but you'd have to make new EBSes for each instance you run and also set it up properly
mkdir /mnt/data
mkfs -t ext4 /dev/xvdc
mount /dev/xvdc /mnt/data/

chmod a+tw /mnt/data/
mkdir /mnt/data/hhblits/
chmod a+tw /mnt/data/hhblits/
REGION=$(wget -q 169.254.169.254/latest/meta-data/placement/availability-zone -O- | sed 's/.$//')
aws --recursive --region=$REGION s3 cp s3://pssh3cache/hhblits_dbs/ /mnt/data/hhblits/

mkdir /home/ec2-user/git
chmod a+tw /home/ec2-user/git
cd /home/ec2-user/git
git clone https://github.com/soedinglab/hh-suite.git
cd hh-suite/
git submodule init
git submodule update

# for using md5 sums as sequence identifiers:
sed -i 's/FFINDEX_MAX_ENTRY_NAME_LENTH 32/FFINDEX_MAX_ENTRY_NAME_LENTH 33/g' lib/ffindex/src/ffindex.h

mkdir build
cd build
mkdir /usr/share/hhsuite/
INSTALL_BASE_DIR='/usr/share/hhsuite/'
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=${INSTALL_BASE_DIR} ..
make
make install

cd /home/ec2-user/git
git clone https://github.com/aschafu/PSSH2.git

# from here on only needed for making your own hhblits databases
cd /home/ec2-user
wget http://bioinfadmin.cs.ucl.ac.uk/downloads/psipred/psipred.4.01.tar.gz
tar -xvzf psipred.4.01.tar.gz
cd psipred/src
make
make install
cd /home/ec2-user
wget ftp://ftp.ncbi.nih.gov/blast/executables/legacy/2.2.26/blast-2.2.26-x64-linux.tar.gz
tar -xvzf blast-2.2.26-x64-linux.tar.gz
cp blast-2.2.26/bin/* /usr/local/bin
cp -r blast-2.2.26/data /usr/local/blast-data

mkdir -p /mnt/data/pdb/divided
chmod a+tw /mnt/data/pdb
# I just prepare the directory, but then fetch only sequences I need on the node

mkdir -p /mnt/data/dssp/bin
mkdir -p /mnt/data/dssp/data
chmod a+tw /mnt/data/dssp
cd /mnt/data/dssp/bin
wget ftp://ftp.cmbi.ru.nl/pub/molbio/software/dssp-2/dssp-2.0.4-linux-i386
chmod a+rx dssp-2.0.4-linux-i386
ln -s dssp-2.0.4-linux-i386 dsspcmbi

# TODO get this from somewhere else (e.g. S3)
cp /mnt/data/hhblits/HHPaths.pm /usr/share/hhsuite/scripts/HHPaths.pm

aschafu avatar Dec 11 '17 18:12 aschafu

Thanks a lot Andrea

tamuanand avatar Dec 11 '17 19:12 tamuanand

Hi Anand,

meanwhile my setup has successfully run on AWS. My startup script can be found on GitHub, too: https://github.com/aschafu/PSSH2/blob/master/src/cloud/pssh2_init_aws_2017.sh This should contain everything you need to get PSIpred, DSSP etc running. -- I'd also be happy to share and explain the scripts that I use to generate the database (to be found in the same GitHub). However, if you are not working in academia, but a commercial environment, we could talk about a consulting contract, because we always try to find funding for our Aquaria (aquaria.ws) project... ;-)

 Best wishes
     Andrea

aschafu avatar Mar 06 '18 22:03 aschafu

Hi Milot,

I am responding to your message from Nov 2017.

I would like to give you these files, but when I run the command, the files exist on the nodes for a while and get deleted when the job ends. Is there a way to modify the script(s) to prevent deletion of these files so that I can send them to you for debugging?

One other question: is it possible to run the addss.pl command on a single node (or master node) without invoking mpirun? I am trying to see how best I can help you debug this.

I am first trying this pipeline (building a custom hhsuite database) with E. coli sequences (some 4500 sequences) and I am stuck with these errors.

tamuanand avatar Jun 05 '18 19:06 tamuanand

@aschafu

As per your previous post, you have successfully deployed HHSuite on AWS. Congratulations on that!

I would like to know a few things about this setup, so please help me clarify. What type of setup was it? Was it deployed on a single EC2 machine, or on EC2 behind autoscaling so that it can scale up? If it was a single EC2 deployment, what kind of data sets have you tested it with? Have you tested with big data sets, and how well does it cope with large inputs?

Regards, Srini

lucky295 avatar Oct 19 '18 01:10 lucky295

@lucky295

Hi Srini, my setup runs on regular EC2 nodes with the respective latest AWS linux images. And yes, I do use autoscaling to run it on multiple machines. These copy their output to S3 for storing. I use the SQS queuing mechanism to tell my instances what to do. My instances fetch their input sequences from a database that is running on RDS. The attached pdf illustrates a bit of what I do.

In my setup, I first produce profiles for all pdb sequences, then generate a database from that, and finally search all of Swissprot against that custom database. In that scenario I worked out that c4large machines were the most cost effective (last time I looked). I use an extra queue with larger machines and/or one with fewer parallel processes to run those sequences that fail in that setup (due to too little available memory).

I'd be happy to help you in getting this to work. -- As I wrote to Anand (tamuanand ) before: Since we're always looking for funding to maintain Aquaria, I'd be even happier if some kind of cooperation could arise from this. ;-)

Best wishes Andrea

AquariaCloudExcerpt.pdf

aschafu avatar Oct 21 '18 10:10 aschafu

Hello Milot,

Any suggestions on the error below, which occurs during addss.pl execution?

$ cp /tmp/wMsgY8WWvn/0hJy2eND32.stdin /tmp/wMsgY8WWvn/0hJy2eND32.in.a3m
Filtering alignment to diversity 7 ...
$ hhfilter -v 1 -neff 7 -i /tmp/wMsgY8WWvn/0hJy2eND32.in.a3m -o /tmp/wMsgY8WWvn/0hJy2eND32.in.a3m

  • 14:34:52.132 ERROR: In /database/executables/git/hh-suite/src/hhalignment.cpp:513: Read:

  • 14:34:52.132 ERROR: No sequences found in file /tmp/wMsgY8WWvn/0hJy2eND32.in.a3m

Error: command 'hhfilter -v 1 -neff 7 -i /tmp/wMsgY8WWvn/0hJy2eND32.in.a3m -o /tmp/wMsgY8WWvn/0hJy2eND32.in.a3m' returned error code 1

$ /database/executables/hhsuite/scripts/reformat.pl -v 1 -r -noss a3m psi /tmp/wMsgY8WWvn/0hJy2eND32.in.a3m /tmp/wMsgY8WWvn/0hJy2eND32.in.psi
Use of uninitialized value $seq in substitution (s///) at /database/executables/hhsuite/scripts/reformat.pl line 253.
Use of uninitialized value $seq in substitution (s///) at /database/executables/hhsuite/scripts/reformat.pl line 254.
Use of uninitialized value $seq in string ne at /database/executables/hhsuite/scripts/reformat.pl line 257.
Use of uninitialized value in transliteration (tr///) at /database/executables/hhsuite/scripts/reformat.pl line 261.
Use of uninitialized value in transliteration (tr///) at /database/executables/hhsuite/scripts/reformat.pl line 470.
Use of uninitialized value in transliteration (tr///) at /database/executables/hhsuite/scripts/reformat.pl line 471.
Use of uninitialized value in transliteration (tr///) at /database/executables/hhsuite/scripts/reformat.pl line 652.
Use of uninitialized value in transliteration (tr///) at /database/executables/hhsuite/scripts/reformat.pl line 655.
Use of uninitialized value in transliteration (tr///) at /database/executables/hhsuite/scripts/reformat.pl line 664.
Predicting secondary structure with PSIPRED ...
$ /database/executables/blast-2.2.26/bin/blastpgp -b 1 -j 1 -h 0.001 -d /database/executables/hhsuite/data/do_not_delete -i /tmp/wMsgY8WWvn/0hJy2eND32.sq -B /tmp/wMsgY8WWvn/0hJy2eND32.in.psi -C /tmp/wMsgY8WWvn/0hJy2eND32.chk 1> /tmp/wMsgY8WWvn/0hJy2eND32.blalog 2> /tmp/wMsgY8WWvn/0hJy2eND32.blalog
sh: line 1: 1155 Segmentation fault /database/executables/blast-2.2.26/bin/blastpgp -b 1 -j 1 -h 0.001 -d /database/executables/hhsuite/data/do_not_delete -i /tmp/wMsgY8WWvn/0hJy2eND32.sq -B /tmp/wMsgY8WWvn/0hJy2eND32.in.psi -C /tmp/wMsgY8WWvn/0hJy2eND32.chk > /tmp/wMsgY8WWvn/0hJy2eND32.blalog 2> /tmp/wMsgY8WWvn/0hJy2eND32.blalog

Thanks in advance. Srini.

lucky295 avatar Dec 04 '18 17:12 lucky295

What is the current state of things now in June 2019? Which versions of psipred, its BLAST dependency, and accompanying DSSP version and corresponding download links should we be using?

gnmcsbnfrmtcsclb avatar Jun 12 '19 13:06 gnmcsbnfrmtcsclb