
MPI_Comm_spawn -- too many file descriptors

Open kmccall882 opened this issue 3 years ago • 1 comment

When a job spawns many subprocesses via MPI_Comm_spawn, the job can fail with the error below. MPICH currently launches an individual proxy for each spawn, which probably contributes to the flood of file descriptors on the server.

[[email protected]] HYDU_create_process (../../../../mpich-4.0.1/src/pm/hydra/utils/launch/launch.c:21): pipe error (Too many open files)
[[email protected]] HYDT_bscd_common_launch_procs (../../../../mpich-4.0.1/src/pm/hydra/tools/bootstrap/external/external_common_launch.c:296): create process returned error
free(): invalid pointer
/var/spool/slurm/job235999/slurm_script: line 296: 3778907 Aborted (core dumped)

kmccall882, Mar 22 '22 18:03

Thanks @kmccall882 for the report.

Currently, hydra launches a separate hydra_pmi_proxy for each new "spawn". This simplifies the code design, because each proxy only deals with local processes within a single "process group" or MPI_COMM_WORLD. However, applications that use MPI dynamic processes commonly spawn one process at a time. Launching proxies is a bootstrapping step, and it is not optimized for performance. In addition, the job scheduler mechanism that works for the initial bootstrap may not work for spawning additional processes later. And, as this ticket reports, another side effect is that the server ends up with too many file descriptors to keep track of.

We should allow an existing proxy on the node to launch additional processes. This will involve expanding the proxy's current role to track multiple process groups. It may not be trivial, but I believe the benefit will outweigh the effort.

hzhou, Mar 22 '22 19:03