
delete_temp_files can take time

Open bryan-brancotte opened this issue 4 years ago • 7 comments

Hi Keith

We are experiencing some slowdown when scanning new alleles, probably related to delete_temp_files, which scans the secure_tmp directory. That directory currently contains around 43K files and is purged of any file older than a week. Would it be possible to create a subdirectory for each job and then only delete that job's directory? Something like: if $1 contains only [a-zA-Z_] and "$self->{'config'}->{'secure_tmp_dir'}/$1" is a directory, then delete it, otherwise do the usual foreach? Sorry for not making a PR, I am not fluent in Perl.
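
Something along these lines, as a rough sketch only (I am not fluent in Perl, and the real method's argument and behaviour may well differ):

```perl
# Hypothetical sketch - not the actual BIGSdb delete_temp_files.
use File::Path qw(remove_tree);

sub delete_temp_files {
    my ( $self, $prefix ) = @_;
    my $dir = "$self->{'config'}->{'secure_tmp_dir'}/$prefix";

    # If the prefix names a per-job subdirectory, remove that directory
    # in one go instead of globbing the flat secure_tmp_dir.
    if ( $prefix =~ /^[a-zA-Z_]+$/ && -d $dir ) {
        remove_tree($dir);
        return;
    }

    # Otherwise keep the existing behaviour: glob and unlink matching files.
    foreach my $file ( glob "$self->{'config'}->{'secure_tmp_dir'}/$prefix*" ) {
        unlink $file;
    }
}
```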

Best regards

Bryan

bryan-brancotte avatar Dec 03 '19 12:12 bryan-brancotte

Is this during web scanning or using the command line tool? As far as I can see this method is only called from offline scripts. Are you sure the slowdown is here? - running the glob command on a directory containing 47,000 files takes a few milliseconds on my machine.
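
If you want to check whether the glob is really where the time goes, you could time it directly against your secure_tmp_dir (the path below is just an example):

```
perl -MTime::HiRes=time -e 'my $t = time; my @files = glob "/var/tmp/*"; printf "%d files in %.3f s\n", scalar @files, time - $t;'
```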

kjolley avatar Dec 03 '19 13:12 kjolley

It is used with the command line tool scannew.pl. Note that in our setup secure_tmp_dir is on a shared file system, and a basic ls -la takes 196 seconds...

bryan-brancotte avatar Dec 03 '19 13:12 bryan-brancotte

How are you sharing it? I can list 100,000 files over an NFS share in <1 second. Is there a reason that you have to use a remote directory for the secure tmp directory, which will only be accessed by the local script - is this due to the way you run jobs on a cluster?

kjolley avatar Dec 03 '19 13:12 kjolley

Yes, it is because of the jobs running on the cluster

bryan-brancotte avatar Dec 03 '19 13:12 bryan-brancotte

Discussions with users indicate that in the last two months they have been using more CLI tools than before, plus the databases have been opened up. Both can result in more and more temp files, and that could explain why we are now seeing the slowdown. Temp files are only kept for seven days.

bryan-brancotte avatar Dec 03 '19 15:12 bryan-brancotte

This is going to be complicated to achieve as different temp files have different lifespans during a job. Since the autotagger or new allele definer can run for days on a large database, we delete files once they're finished with, but some files (e.g. if the reuse_blast option is set) will last the lifetime of the job before being deleted.

Offline jobs, however, don't need to share the same secure_tmp_dir as the web server. The offline jobs don't keep secure temp files very long, so if a machine is only running these jobs then temp files should not accumulate. For example, the machine running all user jobs, as well as autotagger and scannew jobs on PubMLST, currently only has 34 temp files (I actually use a ramdisk for this).

The web server does accumulate secure_tmp_dir files because these are needed to return the results of user jobs, web scans etc., and users may come back to them up to 7 days later (by default). Do you have a common secure_tmp_dir for both the web server and the server running jobs? If so, can you use different directories for this?
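
For example (paths purely illustrative), each machine's bigsdb.conf could point at its own location, with only the web server using the shared filesystem:

```
# Web server bigsdb.conf: shared directory, since users may retrieve
# results up to a week later.
secure_tmp_dir=/shared/bigsdb/secure_tmp

# Job server bigsdb.conf: local directory (or a ramdisk such as /dev/shm),
# since offline jobs clean up their secure temp files quickly.
secure_tmp_dir=/dev/shm/bigsdb_secure_tmp
```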

kjolley avatar Dec 03 '19 17:12 kjolley

Hi Keith, Indeed secure_tmp_dir is shared. Actually we do not have a single server running jobs, but a cluster that runs some binaries via wrappers, while others run directly inside the web server. This setup makes sharing the directory mandatory.

bryan-brancotte avatar Jan 14 '20 13:01 bryan-brancotte