cannot open temporary file
I am trying to run diamond blastp using some really big protein FASTA files (300-900 MB) as queries. To manage that, I usually split the files into smaller chunks and try to run the searches in parallel.
I am using a server and for each job I typically specify 32 processors and 64GB RAM.
Diamond starts fine, but after a couple of hours it crashes, and almost always the problem is that it cannot open a temporary file (last few lines of the log file below).
I have assumed that 64 GB of RAM is enough, and that it simply runs out of disk space because the enormous number of query sequences makes it generate so many temp files that they fill the hard drive. Is my assumption correct?
Also, what can I do in my case? What is the best strategy to overcome this? Is splitting into multiple query files and running them in parallel the right way to go, or will it still throw an error if the multiple jobs use the same temporary directory?
In another post I saw you suggested limiting the query bins using the --bin option. Will that help avoid filling the temp directory? And will it increase the computing time?
Thanks in advance for the help.
Loading reference sequences... [29.9942s]
Building reference histograms... [4.89397s]
Allocating buffers... [0.000117s]
Initializing temporary storage... No such file or directory [0.10247s]
Error: Error opening temporary file /storage/scratch/32173925/diamond-5fd2f410-381d.tmp
Stop job: ven. déc. 11 03:24:08 CET 2020
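On the question of several jobs sharing one temporary directory: one way to rule out any interference is to give each chunk its own directory through diamond's --tmpdir option. The following is only a rough sketch of how the per-chunk command lines could be assembled. The function name, the chunk file names, the `diamond_tmp_` directory prefix and the `.tsv` output naming are all hypothetical, and the thread does not establish that a shared directory was the cause of the error.

```python
from pathlib import Path


def build_diamond_commands(chunks, db, scratch_root, threads=32):
    """Return one diamond blastp command (as an argument list) per query chunk.

    Hypothetical helper: each chunk gets its own temp directory under
    scratch_root, so parallel jobs never share temporary storage.
    """
    commands = []
    for chunk in chunks:
        chunk = Path(chunk)
        # Assumed naming scheme: one temp directory per chunk.
        tmpdir = Path(scratch_root) / f"diamond_tmp_{chunk.stem}"
        # Assumed naming scheme: output next to the chunk, with a .tsv suffix.
        out = chunk.with_suffix(".tsv")
        commands.append([
            "diamond", "blastp",
            "--query", str(chunk),
            "--db", str(db),
            "--out", str(out),
            "--threads", str(threads),
            "--tmpdir", str(tmpdir),
        ])
    return commands
```

Each temp directory would need to exist before the job starts (for example via `Path(tmpdir).mkdir(parents=True, exist_ok=True)`), and each argument list can then be handed to the scheduler or to `subprocess.run`.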
This does not look like a disk space problem to me. Please try using the option --no-unlink and see if that helps.
I am using version 0.9.8 and it does not have the "--no-unlink" option. Is it in the latest version?
Not sure in which version it was added, but yes, it is in the latest version.
Hi again, I'm a bit lost here. I downloaded and compiled the latest version, "diamond v2.0.5.143", but there is no "--no-unlink" option: "Error: Invalid option: no-unlink". Not sure what went wrong...
Have you compiled the latest master branch by any chance? If so, you need to add the CMake option -DEXTRA=ON to activate this option.
YES, thank you! It is working now, even though it threw me off a bit in the beginning because the option does not appear in the --help output. I will submit the new jobs tonight and will know more in a few days. What does the "no-unlink" option do exactly?
It prevents unlink from being invoked on the temp files. Normally unlink is called on them so that these files are deleted even if the program crashes.
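To illustrate the general mechanism, here is a minimal sketch of the POSIX "unlink while open" pattern. It is not DIAMOND's actual code, and the function name is made up for the example. It assumes a POSIX system with a local filesystem: the file name is removed right after creation, but the open descriptor stays usable, and the storage is reclaimed once the descriptor is closed, even after a crash. Some scratch or network filesystems may treat unlinked-but-open files differently, which could be why disabling unlink helped here, though the thread does not confirm the cause.

```python
import os
import tempfile


def unlink_while_open(payload):
    """Create a temp file, unlink it immediately, then keep using it.

    Returns (still_listed, data_read_back). Assumes POSIX semantics;
    on Windows, unlinking an open file would raise an error instead.
    """
    fd, path = tempfile.mkstemp(suffix=".tmp")
    try:
        # Remove the directory entry; the open descriptor remains valid.
        os.unlink(path)
        still_listed = os.path.exists(path)
        # The file can still be written and read through the descriptor.
        os.write(fd, payload)
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))
    finally:
        # Closing the last descriptor lets the OS reclaim the space.
        os.close(fd)
    return still_listed, data
```

With --no-unlink, by contrast, the temp files would keep their names on disk while the program runs.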
It seems to be working without issues, and it is also twice as fast! Thank you for your help.