MUM&Co thread and optimisation issue
I am writing about a persistent problem: my MUM&Co jobs keep being terminated, even though I submit them with up to 72 CPU cores and set "export OMP_NUM_THREADS=$PBS_NCPUS" in my PBS script.
My institution limits walltime to 48 hours, so I have tried several ways of speeding the jobs up, including raising the default CPU (thread) count inside the mumandco_v3.8.sh script by changing "threads" from "1" to "72".
Regrettably, the jobs are still terminated every time, and the program has no resume option. On closer inspection, the program uses a lot of memory (approximately 40 to 60GB), yet it does not make efficient use of the requested threads: in the job listing below, %CPU stays between 3 and 12 on 72 requested CPUs.
                                %CPU WallTime Time Lim    RSS    mem memlim cpus
98252891 R hj3792 te53 mumco1_n    5 21:08:13 48:00:00 64.4GB 64.4GB 80.0GB   72
98252892 R hj3792 te53 mumco2_n    6 21:06:57 48:00:00 64.5GB 64.5GB 80.0GB   72
98252899 R hj3792 te53 mumco1_n    3 21:06:05 48:00:00 65.6GB 65.6GB 90.0GB   72
98252926 R hj3792 te53 mumco2_n    3 21:05:27 48:00:00 65.0GB 65.0GB 90.0GB   72
98253022 R hj3792 te53 mumco2_n    3 21:06:40 48:00:00 64.9GB 64.9GB 90.0GB   72
98298196 R hj3792 te53 mumco2_n   12 04:01:25 48:00:00 64.3GB 64.3GB 80.0GB   72
Considering the circumstances, it appears increasingly likely that the MUM&Co program may require more than the allocated 48 hours to complete successfully. Without an extension of the walltime, these jobs remain trapped in an unproductive cycle of termination and restart.
Hence, I kindly request your valuable assistance in resolving this matter, which may entail optimizing thread utilization to accommodate the program's resource requirements. Your support in this regard would be invaluable and greatly appreciated.
Hi there, sorry for the delay in replying
These steps are essentially just the alignment done by MUMmer, so they are not really part of MUM&Co (although they are used by MUM&Co). So I don't think I can really help there, unfortunately.
One thing to try, as it may speed up the alignments, is to remove the --maxmatch option.
You can simply remove it with something like sed 's/--maxmatch//g' mumandco_v3.8.sh > mumandco_v3.8.no_maxmatch.sh
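To make it easier to confirm that the substitution did what was intended, the sed command above can be wrapped in a small helper. This is only a sketch: the function name strip_maxmatch is made up for illustration and is not part of MUM&Co or MUMmer.

```shell
#!/usr/bin/env bash
# Sketch: write a copy of a script with every "--maxmatch" removed,
# then print how many lines of the copy still contain it.
# strip_maxmatch is a hypothetical helper name, not part of MUM&Co.
strip_maxmatch() {
    local src="$1" dst="$2"
    # Same substitution as suggested above; the original file is left untouched.
    sed 's/--maxmatch//g' "$src" > "$dst" || return 1
    # grep -c prints the number of matching lines; it exits with status 1
    # when that number is 0, hence the "|| true".
    grep -c -e '--maxmatch' "$dst" || true
}

# Example with the file names from this thread:
#   strip_maxmatch mumandco_v3.8.sh mumandco_v3.8.no_maxmatch.sh
#   diff mumandco_v3.8.sh mumandco_v3.8.no_maxmatch.sh   # shows which lines changed
```

A file created through shell redirection does not inherit the executable bit, so the new copy needs to be run with bash or made executable with chmod +x first.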
Let me know if that helps.
Thanks,
Samuel