
Many tests fail on FreeBSD

Open yurivict opened this issue 2 years ago • 1 comment

      Start  1: pdtest_1x1_1_2_8_20_SP
 1/18 Test  #1: pdtest_1x1_1_2_8_20_SP ...........***Failed    2.19 sec
Time to read and distribute matrix 0.00
 ** On entry to DGEMM parameter number  8 had an illegal value
--------------------------------------------------------------------------
mpiexec has exited due to process rank 0 with PID 0 on
node yv exiting improperly. There are three reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
orte_create_session_dirs is set to false. In this case, the run-time cannot
detect that the abort call was an abnormal termination. Hence, the only
error message you will receive is this one.

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).

You can avoid this message by specifying -quiet on the mpiexec command line.
--------------------------------------------------------------------------

      Start  2: pdtest_1x1_3_2_8_20_SP
 2/18 Test  #2: pdtest_1x1_3_2_8_20_SP ...........***Failed    2.17 sec
Time to read and distribute matrix 0.00
 ** On entry to DGEMM parameter number  8 had an illegal value
[mpiexec reported process rank 0 exiting improperly; same Open MPI diagnostic as in Test #1]

      Start  3: pdtest_1x2_1_2_8_20_SP
 3/18 Test  #3: pdtest_1x2_1_2_8_20_SP ...........***Failed    2.17 sec
Time to read and distribute matrix 0.00
 ** On entry to DGEMM parameter number  8 had an illegal value
[mpiexec reported process rank 1 exiting improperly; same Open MPI diagnostic as in Test #1]

      Start  4: pdtest_1x2_3_2_8_20_SP
 4/18 Test  #4: pdtest_1x2_3_2_8_20_SP ...........***Failed    2.17 sec
Time to read and distribute matrix 0.00
 ** On entry to DGEMM parameter number  8 had an illegal value
[mpiexec reported process rank 1 exiting improperly; same Open MPI diagnostic as in Test #1]

      Start  5: pdtest_1x3_1_2_8_20_SP
 5/18 Test  #5: pdtest_1x3_1_2_8_20_SP ...........   Passed   59.46 sec
      Start  6: pdtest_1x3_3_2_8_20_SP
 6/18 Test  #6: pdtest_1x3_3_2_8_20_SP ...........   Passed  155.22 sec
      Start  7: pdtest_2x1_1_2_8_20_SP
 7/18 Test  #7: pdtest_2x1_1_2_8_20_SP ...........***Failed    2.19 sec
Time to read and distribute matrix 0.00
 ** On entry to DGEMM parameter number  8 had an illegal value
[mpiexec reported process rank 1 exiting improperly; same Open MPI diagnostic as in Test #1]

      Start  8: pdtest_2x1_3_2_8_20_SP
 8/18 Test  #8: pdtest_2x1_3_2_8_20_SP ...........***Failed    2.18 sec
Time to read and distribute matrix 0.00
 ** On entry to DGEMM parameter number  8 had an illegal value
[mpiexec reported process rank 0 exiting improperly; same Open MPI diagnostic as in Test #1]

      Start  9: pdtest_2x2_1_2_8_20_SP
 9/18 Test  #9: pdtest_2x2_1_2_8_20_SP ...........   Passed  246.40 sec
      Start 10: pdtest_2x2_3_2_8_20_SP
10/18 Test #10: pdtest_2x2_3_2_8_20_SP ...........   Passed  255.64 sec
      Start 11: pdtest_2x3_1_2_8_20_SP
11/18 Test #11: pdtest_2x3_1_2_8_20_SP ...........***Failed    0.05 sec
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 6
slots that were requested by the application:

  /disk-samsung/freebsd-ports/math/superlu-dist/work/.build/TEST/pdtest

Either request fewer slots for your application, or make more slots
available for use.

A "slot" is the Open MPI term for an allocatable unit where we can
launch a process.  The number of slots available are defined by the
environment in which Open MPI processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
     processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
     hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
     RM is present, Open MPI defaults to the number of processor cores

In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------

      Start 12: pdtest_2x3_3_2_8_20_SP
12/18 Test #12: pdtest_2x3_3_2_8_20_SP ...........***Failed    0.05 sec
[same Open MPI "not enough slots" diagnostic as in Test #11; 6 slots requested]

      Start 13: pdtest_5x1_1_2_8_20_SP
13/18 Test #13: pdtest_5x1_1_2_8_20_SP ...........***Failed    0.04 sec
[same Open MPI "not enough slots" diagnostic as in Test #11; 5 slots requested]

      Start 14: pdtest_5x1_3_2_8_20_SP
14/18 Test #14: pdtest_5x1_3_2_8_20_SP ...........***Failed    0.05 sec
[same Open MPI "not enough slots" diagnostic as in Test #11; 5 slots requested]

      Start 15: pdtest_5x2_1_2_8_20_SP
15/18 Test #15: pdtest_5x2_1_2_8_20_SP ...........***Failed    0.04 sec
[same Open MPI "not enough slots" diagnostic as in Test #11; 10 slots requested]

      Start 16: pdtest_5x2_3_2_8_20_SP
16/18 Test #16: pdtest_5x2_3_2_8_20_SP ...........***Failed    0.05 sec
[same Open MPI "not enough slots" diagnostic as in Test #11; 10 slots requested]

      Start 17: pdtest_5x3_1_2_8_20_SP
17/18 Test #17: pdtest_5x3_1_2_8_20_SP ...........***Failed    0.05 sec
[same Open MPI "not enough slots" diagnostic as in Test #11; 15 slots requested]

      Start 18: pdtest_5x3_3_2_8_20_SP
18/18 Test #18: pdtest_5x3_3_2_8_20_SP ...........***Failed    0.04 sec
[same Open MPI "not enough slots" diagnostic as in Test #11; 15 slots requested]


22% tests passed, 14 tests failed out of 18

Total Test time (real) = 730.19 sec

The following tests FAILED:
	  1 - pdtest_1x1_1_2_8_20_SP (Failed)
	  2 - pdtest_1x1_3_2_8_20_SP (Failed)
	  3 - pdtest_1x2_1_2_8_20_SP (Failed)
	  4 - pdtest_1x2_3_2_8_20_SP (Failed)
	  7 - pdtest_2x1_1_2_8_20_SP (Failed)
	  8 - pdtest_2x1_3_2_8_20_SP (Failed)
	 11 - pdtest_2x3_1_2_8_20_SP (Failed)
	 12 - pdtest_2x3_3_2_8_20_SP (Failed)
	 13 - pdtest_5x1_1_2_8_20_SP (Failed)
	 14 - pdtest_5x1_3_2_8_20_SP (Failed)
	 15 - pdtest_5x2_1_2_8_20_SP (Failed)
	 16 - pdtest_5x2_3_2_8_20_SP (Failed)
	 17 - pdtest_5x3_1_2_8_20_SP (Failed)
	 18 - pdtest_5x3_3_2_8_20_SP (Failed)
Errors while running CTest

Version: 8.1.0 clang-14 FreeBSD 13.1

yurivict commented Aug 27 '22 07:08

The first type of error: "** On entry to DGEMM parameter number 8 had an illegal value". This suggests the BLAS library is not properly installed.
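
For context, here is a minimal C sketch (mine, not from the SuperLU_DIST sources) of the reference BLAS DGEMM interface. Counting the arguments shows that "parameter number 8" is LDA, the leading dimension of A, so the failing call passed an illegal leading dimension, which is consistent with a broken or mismatched BLAS build:

    /* Sketch only: reference BLAS DGEMM argument order is
       (TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC),
       so "parameter number 8" is LDA. XERBLA prints the quoted message
       when LDA < max(1,M) for TRANSA = 'N' (or < max(1,K) if transposed). */
    #include <stdio.h>

    /* Fortran symbol; hidden character-length arguments omitted, which is
       the common convention but is compiler/ABI dependent. */
    extern void dgemm_(const char *transa, const char *transb,
                       const int *m, const int *n, const int *k,
                       const double *alpha, const double *a, const int *lda,
                       const double *b, const int *ldb,
                       const double *beta, double *c, const int *ldc);

    int main(void) {
        int m = 2, n = 2, k = 2, lda = 2, ldb = 2, ldc = 2;
        double alpha = 1.0, beta = 0.0;
        double a[4] = {1, 2, 3, 4}, b[4] = {1, 0, 0, 1}, c[4];
        /* lda >= max(1,m), so this call is legal; passing lda = 1 here
           would reproduce the "parameter number 8" complaint. */
        dgemm_("N", "N", &m, &n, &k, &alpha, a, &lda,
               b, &ldb, &beta, c, &ldc);
        printf("c[0] = %g\n", c[0]);   /* A * I, expect c[0] == 1 */
        return 0;
    }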

The second type of error: "There are not enough slots available in the system to satisfy the 6 slots that were requested by the application". The test case uses 6 MPI ranks; you need to open up more "slots" per the instructions in the message.
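
Both workarounds come straight from the Open MPI message quoted in the log; for example (sketch only: the pdtest arguments are elided, and "myhostfile" is a placeholder name):

    # Option 1: let Open MPI start more ranks than detected cores
    mpiexec --oversubscribe -np 6 ./pdtest ...

    # Option 2: declare the slot count explicitly in a hostfile
    echo "localhost slots=16" > myhostfile
    mpiexec --hostfile myhostfile -np 6 ./pdtest ...

Since these runs are driven by ctest rather than a hand-written mpiexec line, the flag has to reach the test harness; if the build honors CMake's standard MPIEXEC_PREFLAGS variable (not verified for this project), configuring with -DMPIEXEC_PREFLAGS=--oversubscribe would do that.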

xiaoyeli commented Aug 27 '22 18:08