
cannot open temporary file

Open sapuizait opened this issue 5 years ago • 8 comments

I am trying to run diamond blastp using some really big protein FASTA files (300-900 MB) as queries. To do that, I usually split the files into smaller chunks and run the searches in parallel. I am using a server, and for each job I typically request 32 processors and 64 GB of RAM.
Diamond starts fine, but after a couple of hours it crashes, and almost always the problem is that it cannot find a temporary file (last few lines of the log file below). I have assumed that 64 GB of RAM is enough and that it simply runs out of disk space, since the enormous number of query sequences makes it generate so many temp files that they fill the hard drive. Is my assumption correct? Also, what can I do in my case, and what is the best strategy to overcome this? Is splitting into multiple query files and running in parallel the right way to go, or will it still throw an error if the multiple jobs use the same temporary directory? In another post I saw that you suggested limiting the query bins using the --bin option. Will that help avoid filling the temp directory, and will it increase the computing time? Thanks in advance for the help.


Loading reference sequences... [29.9942s]
Building reference histograms... [4.89397s]
Allocating buffers... [0.000117s]
Initializing temporary storage... No such file or directory [0.10247s]
Error: Error opening temporary file /storage/scratch/32173925/diamond-5fd2f410-381d.tmp
Stop job: ven. déc. 11 03:24:08 CET 2020
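On the question of several jobs sharing one temporary directory, a possible precaution is to give every chunk its own scratch directory and pass it to diamond with `--tmpdir`. The sketch below only builds the command line. The helper name `diamond_command`, the `.tsv` output naming, and the file names in the usage note are hypothetical, and nothing in this thread confirms that a shared directory was the cause of the error.

```python
import os
import tempfile


def diamond_command(query_chunk, db, out_dir, scratch_root, threads=32):
    """Build a diamond blastp command that uses a private temp directory.

    Hypothetical helper for illustration; it does not run diamond itself.
    """
    name = os.path.splitext(os.path.basename(query_chunk))[0]
    # mkdtemp creates a new, uniquely named directory, so the temp
    # directory is guaranteed to exist before diamond is started.
    tmp_dir = tempfile.mkdtemp(prefix=f"diamond-{name}-", dir=scratch_root)
    return [
        "diamond", "blastp",
        "--query", query_chunk,
        "--db", db,
        "--out", os.path.join(out_dir, name + ".tsv"),
        "--threads", str(threads),
        "--tmpdir", tmp_dir,
    ]
```

A job script could then call `subprocess.run(diamond_command("chunk_01.fasta", "nr.dmnd", "results", "/path/to/scratch"), check=True)` and remove the private directory once the run has finished.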

sapuizait avatar Dec 11 '20 09:12 sapuizait

This does not look like a disk space problem to me. Please try using the option --no-unlink and see if that helps.

bbuchfink avatar Dec 11 '20 10:12 bbuchfink

I am using version 0.9.8 and it does not have the "--no-unlink" option. Is it in the latest version?

sapuizait avatar Dec 11 '20 11:12 sapuizait

Not sure in which version it was added, but yes, it is in the latest version.

bbuchfink avatar Dec 11 '20 11:12 bbuchfink

Hi again. I'm a bit lost here: I downloaded and compiled the latest version, "diamond v2.0.5.143", but there is no "--no-unlink" option ("Error: Invalid option: no-unlink"). Not sure what went wrong...

sapuizait avatar Dec 13 '20 10:12 sapuizait

Have you compiled the latest master branch by any chance? If so, you need to add the CMake option -DEXTRA=ON to activate this option.

bbuchfink avatar Dec 13 '20 10:12 bbuchfink

Yes, thank you! It is working now, even though it threw me off a bit at first because the option does not appear in --help. I will submit the new jobs tonight and will know more in a few days. What does the "no-unlink" option do exactly?

sapuizait avatar Dec 13 '20 11:12 sapuizait

It prevents unlink from being invoked on the temp files. Normally that call causes these files to be deleted even if the program crashes, etc.
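For readers unfamiliar with the idiom, here is a minimal Python sketch of the general POSIX open-then-unlink pattern. It is not DIAMOND's actual code: the function name `open_scratch_file` and its `unlink` flag are hypothetical, and the behaviour shown assumes a POSIX system, where a file can be unlinked while it is still open.

```python
import os
import tempfile


def open_scratch_file(directory, unlink=True):
    """Create a temp file and optionally unlink it right away (POSIX only).

    Hypothetical illustration of the idiom, not DIAMOND's implementation.
    """
    fd, path = tempfile.mkstemp(prefix="scratch-", suffix=".tmp", dir=directory)
    handle = os.fdopen(fd, "w+b")
    if unlink:
        # The directory entry disappears immediately, but the data stays
        # readable and writable through the open handle. The OS frees the
        # space once the handle is closed or the process dies.
        os.unlink(path)
    return handle, path
```

With `unlink=True` nothing is left behind after a crash, but the file can no longer be reached by its path. With `unlink=False` the file stays visible in the directory and has to be removed explicitly.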

bbuchfink avatar Dec 14 '20 16:12 bbuchfink

It seems to be working without issues, and it is also twice as fast! Thank you for your help.

sapuizait avatar Dec 15 '20 15:12 sapuizait