cmor
Improvement: parallelized PrePARE could use all available CPUs
This could be exposed as a 'no_limit' value for the max-threads argument, implemented with multiprocessing.cpu_count (https://docs.python.org/2/library/multiprocessing.html#multiprocessing.cpu_count).
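A minimal sketch of how such a value might be resolved, not PrePARE's actual code. The function name `resolve_max_threads` is hypothetical, and 'no_limit' is the value proposed in this issue rather than an existing option:

```python
import multiprocessing


def resolve_max_threads(value):
    """Turn a max-threads setting into a worker count.

    'no_limit' is the hypothetical value proposed in this issue; anything
    else is expected to be a positive integer (or its string form).
    """
    if value == "no_limit":
        # Number of CPUs on the machine running this process.
        return multiprocessing.cpu_count()
    count = int(value)
    if count < 1:
        raise ValueError("max-threads must be a positive integer or 'no_limit'")
    return count
```

Note that `multiprocessing.cpu_count()` reports the CPUs of the machine the process runs on, not the CPUs granted to the job by the scheduler, so the result may still need to be capped.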
However, PrePARE can exhaust the available memory of 64 GB nodes when launched with max-threads=150 in a job allocated 21 nodes with 40 CPUs each:
Checking data... /slurmstepd: Job 66629726 exceeded memory limit (65649604 > 62914560), being killed
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/scratch/CMIP6/V1/externals/miniconda2/envs/cmor/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/scratch/CMIP6/V1/externals/miniconda2/envs/cmor/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/scratch/CMIP6/V1/externals/miniconda2/envs/cmor/lib/python2.7/multiprocessing/pool.py", line 328, in _handle_workers
    pool._maintain_pool()
  File "/scratch/CMIP6/V1/externals/miniconda2/envs/cmor/lib/python2.7/multiprocessing/pool.py", line 232, in _maintain_pool
    self._repopulate_pool()
  File "/scratch/CMIP6/V1/externals/miniconda2/envs/cmor/lib/python2.7/multiprocessing/pool.py", line 225, in _repopulate_pool
    w.start()
  File "/scratch/CMIP6/V1/externals/miniconda2/envs/cmor/lib/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/scratch/CMIP6/V1/externals/miniconda2/envs/cmor/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
slurmstepd: Exceeded job memory limit
When launched with max-threads=100, it uses 25 GB on the node with the highest memory use.
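One possible safeguard is to bound the worker count by a memory budget as well. The sketch below is only illustrative: `cap_workers` and its parameters are hypothetical, not PrePARE options, and the per-worker figure has to be estimated by the user.

```python
def cap_workers(requested, mem_limit_gb, mem_per_worker_gb):
    """Return a worker count that fits a memory budget.

    All parameters are hypothetical; mem_per_worker_gb is a user-supplied
    estimate, not something PrePARE measures.
    """
    if mem_per_worker_gb <= 0:
        raise ValueError("mem_per_worker_gb must be positive")
    # Largest number of workers whose estimated memory fits in the budget.
    by_memory = int(mem_limit_gb // mem_per_worker_gb)
    # Always allow at least one worker so the check can still run.
    return max(1, min(requested, by_memory))
```

For example, `cap_workers(150, 60, 0.5)` returns 120. A linear per-worker estimate should be treated with caution: dividing the 25 GB observed at max-threads=100 by 100 workers gives 0.25 GB each, which would not have predicted the failure at max-threads=150.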
It would be useful to assess this during CMOR4 planning.