diamond
Fails when masking queries
Hello,
I have been trying to run diamond without any success. I installed diamond from source and ran it on a cluster node with 64 GB of RAM, and this is what I get:
diamond v2.0.4.142 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
#CPU threads: 32
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory:
Opening the database... [0.179s]
#Target sequences to report alignments for: 25
Reference = ../../../Tools/blobtools/uniprot/reference_proteomes.dmnd
Sequences = 52962370
Letters = 19499493459
Block size = 2000000000
Loading taxonomy mapping... [1.025s]
Opening the input file... [0.125s]
Opening the output file... [0s]
Loading query sequences... [11.851s]
Masking queries... terminate called recursively
terminate called after throwing an instance of 'std::bad_alloc'
/var/slurmd-cm2_tiny/job123415/slurm_script: line 14: 24198 Abandon (core dumped) ../../../Tools/diamond-2.0.4/bin/diamond blastx --db ../../../Tools/blobtools/uniprot/reference_proteomes.dmnd -q assembly.fasta -f 6 qseqid staxids bitscore --threads 32 -o diamond.out
Please try using a smaller block size (like -b0.4) and also add --log. How long are your query sequences?
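For reference, the retry suggested above could look roughly like the sketch below. It is the command from the report with only -b0.4 and --log appended; the paths are copied from the original post and are assumptions about the layout on your system. The check for the binary is only there so the snippet fails gracefully if the path differs.

```shell
# Path to the diamond binary, copied from the command in the report.
DIAMOND=../../../Tools/diamond-2.0.4/bin/diamond

if [ -x "$DIAMOND" ]; then
  # Original command plus -b0.4 (smaller block size) and --log (extra diagnostics).
  "$DIAMOND" blastx \
    --db ../../../Tools/blobtools/uniprot/reference_proteomes.dmnd \
    -q assembly.fasta \
    -f 6 qseqid staxids bitscore \
    --threads 32 \
    -o diamond.out \
    -b0.4 \
    --log
else
  echo "diamond binary not found at $DIAMOND" >&2
fi
```

The startup log in the report shows a block size of 2000000000 letters, so -b0.4 reduces the block to a fifth of that.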
The largest one is 105 Mb. If sequence length is the problem, I can filter out the largest sequences; the remaining ones should then be at most 5 Mb.
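If filtering turns out to be necessary, one possible way to do it is a small awk filter like the sketch below. The function name, the 5000000 cutoff in the usage comment, and the output file name are only illustrative, not something diamond provides. Note that it writes each kept sequence on a single line.

```shell
# Keep only FASTA records whose sequence is at most max_len letters long.
# $1 = maximum sequence length, $2 = input FASTA file (hypothetical helper).
filter_fasta_by_length() {
  awk -v max_len="$1" '
    # Print the buffered record if it is short enough.
    function emit() {
      if (header != "" && length(seq) <= max_len) {
        print header
        print seq
      }
    }
    /^>/ { emit(); header = $0; seq = ""; next }   # new record starts
         { seq = seq $0 }                          # append sequence line
    END  { emit() }                                # last record
  ' "$2"
}

# Example usage (file names are assumptions):
# filter_fasta_by_length 5000000 assembly.fasta > assembly.max5Mb.fasta
```

The filtered file would then be passed to diamond with -q in place of assembly.fasta.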
It could be due to the length. You can also try turning off the masking using --masking 0.
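As a sketch, the run with masking turned off would be the command from the report with --masking 0 added. The paths are again copied from the original post and may need adjusting.

```shell
# Path to the diamond binary, copied from the command in the report.
DIAMOND=../../../Tools/diamond-2.0.4/bin/diamond

if [ -x "$DIAMOND" ]; then
  # Original command plus --masking 0 to skip the masking step.
  "$DIAMOND" blastx \
    --db ../../../Tools/blobtools/uniprot/reference_proteomes.dmnd \
    -q assembly.fasta \
    -f 6 qseqid staxids bitscore \
    --threads 32 \
    -o diamond.out \
    --masking 0
else
  echo "diamond binary not found at $DIAMOND" >&2
fi
```

Since the crash in the log happens during "Masking queries...", this should show whether that step is where the memory runs out.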