archivetar
dwalk OOM with big directories
I tried both a local install and the Singularity image. Both run out of memory at the end of dwalk on a directory with about 130 million files. The server has 32 cores and 384 GB of memory.
Any suggestions, or will I just need to break the directory up into smaller batches?
[2024-03-13T14:14:17] Walked 137236135 items in 4453.108998 secs (30818.049832 items/sec) ...
[2024-03-13T14:14:18] Walked 137241833 items in 4453.868931 secs (30814.070893 items/sec) ...
[2024-03-13T14:14:20] Walked 137241833 items in 4456.083929 seconds (30798.754062 items/sec)
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node pi3dc5-003 exited on signal 9 (Killed).
--------------------------------------------------------------------------
ERROR:root:Problem running: ['mpirun', '--oversubscribe', '-np', '12', '/archivetar/install/bin/dwalk', '--sort', 'name', '--distribution', 'size:0,1K,1M,10M,100M,1G,10G,100G,1T', '--progress', '10', '--output', '/scratch/CRISPRCasFinder-pi3dc5-003-singularity-2024-03-13-13-00-03.cache', '.'] and Command '['mpirun', '--oversubscribe', '-np', '12', '/archivetar/install/bin/dwalk', '--sort', 'name', '--distribution', 'size:0,1K,1M,10M,100M,1G,10G,100G,1T', '--progress', '10', '--output', '/scratch/CRISPRCasFinder-pi3dc5-003-singularity-2024-03-13-13-00-03.cache', '.']' returned non-zero exit status 137.
Traceback (most recent call last):
File "mpiFileUtils/__init__.py", line 45, in apply
File "subprocess.py", line 526, in run
subprocess.CalledProcessError: Command '['mpirun', '--oversubscribe', '-np', '12', '/archivetar/install/bin/dwalk', '--sort', 'name', '--distribution', 'size:0,1K,1M,10M,100M,1G,10G,100G,1T', '--progress', '10', '--output', '/scratch/CRISPRCasFinder-pi3dc5-003-singularity-2024-03-13-13-00-03.cache', '.']' returned non-zero exit status 137.
Traceback (most recent call last):
File "mpiFileUtils/__init__.py", line 45, in apply
File "subprocess.py", line 526, in run
subprocess.CalledProcessError: Command '['mpirun', '--oversubscribe', '-np', '12', '/archivetar/install/bin/dwalk', '--sort', 'name', '--distribution', 'size:0,1K,1M,10M,100M,1G,10G,100G,1T', '--progress', '10', '--output', '/scratch/CRISPRCasFinder-pi3dc5-003-singularity-2024-03-13-13-00-03.cache', '.']' returned non-zero exit status 137.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "archivetar.py", line 21, in <module>
File "archivetar/__init__.py", line 460, in main
File "archivetar/__init__.py", line 214, in build_list
File "mpiFileUtils/__init__.py", line 122, in scanpath
File "mpiFileUtils/__init__.py", line 48, in apply
mpiFileUtils.exceptions.mpiFileUtilsError: Problems Command '['mpirun', '--oversubscribe', '-np', '12', '/archivetar/install/bin/dwalk', '--sort', 'name', '--distribution', 'size:0,1K,1M,10M,100M,1G,10G,100G,1T', '--progress', '10', '--output', '/scratch/CRISPRCasFinder-pi3dc5-003-singularity-2024-03-13-13-00-03.cache', '.']' returned non-zero exit status 137.
[193100] Failed to execute script 'archivetar' due to unhandled exception!
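To make the "smaller batches" fallback concrete, here is a minimal sketch of how I would split the top level of the directory into groups. This is only an illustration: the helper names `top_level_entries` and `batches` and the batch size of 100 are hypothetical, and it groups by number of top-level entries, not by the number of files underneath them. How each group would then be handed to archivetar is left open.

```python
import os
from itertools import islice


def top_level_entries(root):
    """Return the sorted names of the immediate children of root."""
    with os.scandir(root) as it:
        return sorted(entry.name for entry in it)


def batches(items, size):
    """Yield consecutive lists of at most `size` items each."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Hypothetical batch size; it would need tuning so that each batch
    # stays well under the memory limit during the walk.
    for i, group in enumerate(batches(top_level_entries("."), 100)):
        print(f"batch {i}: {len(group)} entries, first={group[0]}")
```

Since only the top level is listed, this avoids walking all of the files just to plan the batches, but the batches can be very uneven if a few subdirectories hold most of the files.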