
v4.1: Remove error message from --do-not-launch

Open jjhursey opened this issue 3 years ago • 3 comments

  • Fixes #10643

bot:notacherrypick

jjhursey avatar Aug 09 '22 19:08 jjhursey

Before this change:

shell$  mpirun --host f5n17:2 --do-not-launch hostname
 Data for JOB [17227,1] offset 0 Total slots allocated 2

 ========================   JOB MAP   ========================

 Data for node: f5n17	Num slots: 2	Max slots: 0	Num procs: 2
 	Process OMPI jobid: [17227,1] App: 0 Process rank: 0 Bound: N/A
 	Process OMPI jobid: [17227,1] App: 0 Process rank: 1 Bound: N/A

 =============================================================
[f5n18:3126629] LAUNCH MSG RAW SIZE: 783
--------------------------------------------------------------------------
An internal error has occurred in ORTE:

[[17227,0],0] FORCE-TERMINATE AT (null):0 - error base/plm_base_launch_support.c(595)

This is something that should be reported to the developers.
--------------------------------------------------------------------------

After this change:

shell$   mpirun --host f5n17:2 --do-not-launch hostname
 Data for JOB [5227,1] offset 0 Total slots allocated 2

 ========================   JOB MAP   ========================

 Data for node: f5n17	Num slots: 2	Max slots: 0	Num procs: 2
 	Process OMPI jobid: [5227,1] App: 0 Process rank: 0 Bound: N/A
 	Process OMPI jobid: [5227,1] App: 0 Process rank: 1 Bound: N/A

 =============================================================
[f5n18:3138117] LAUNCH MSG RAW SIZE: 783
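
For anyone who wants to check this on their own build, here is a minimal sketch of an automated check, not part of this PR. It only looks for the error banner text shown in the "Before this change" output. The helper names (`has_orte_internal_error`, `run_do_not_launch`) are hypothetical, and the runner assumes `mpirun` is on `PATH` and that the host from the example above is reachable.

```python
import subprocess

# Banner text copied from the "Before this change" output in this PR.
ORTE_ERROR_BANNER = "An internal error has occurred in ORTE:"


def has_orte_internal_error(output: str) -> bool:
    """Return True if the given mpirun output contains the ORTE error banner."""
    return ORTE_ERROR_BANNER in output


def run_do_not_launch(host_spec: str = "f5n17:2") -> str:
    """Run the command from this PR and return combined stdout and stderr.

    Assumption: mpirun is installed and host_spec names a usable host.
    """
    result = subprocess.run(
        ["mpirun", "--host", host_spec, "--do-not-launch", "hostname"],
        capture_output=True,
        text=True,
        check=False,  # we inspect the output text, not the exit status
    )
    return result.stdout + result.stderr
```

With the fix applied, `has_orte_internal_error(run_do_not_launch())` would be expected to return `False`. The check matches on text only and makes no claim about the exit status of `mpirun`.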

jjhursey avatar Aug 09 '22 19:08 jjhursey

Can this be backported to v4.0.x?

awlauria avatar Aug 09 '22 19:08 awlauria

Yeah, if the RMs want it. v4.0.x hits the same error:

shell$ ompi_info | head -n 4
                 Package: Open MPI jjhursey@f5n18 Distribution
                Open MPI: 4.0.7rc2
  Open MPI repo revision: v4.0.7-83-gd29fc9e3
   Open MPI release date: Unreleased developer copy
shell$  mpirun --host f5n17:2 --do-not-launch hostname
 Data for JOB [50808,1] offset 0 Total slots allocated 2

 ========================   JOB MAP   ========================

 Data for node: f5n17	Num slots: 2	Max slots: 0	Num procs: 2
 	Process OMPI jobid: [50808,1] App: 0 Process rank: 0 Bound: N/A
 	Process OMPI jobid: [50808,1] App: 0 Process rank: 1 Bound: N/A

 =============================================================
[f5n18:3420237] LAUNCH MSG RAW SIZE: 783
--------------------------------------------------------------------------
An internal error has occurred in ORTE:

[[50808,0],0] FORCE-TERMINATE AT (null):0 - error base/plm_base_launch_support.c(594)

This is something that should be reported to the developers.
--------------------------------------------------------------------------

jjhursey avatar Aug 09 '22 19:08 jjhursey