Unit tests fail with mpich

Open hndgzkn opened this issue 3 years ago • 5 comments

The Mandrill example runs without problems; however, the unit tests fail with mpich.

When tests are run with:

$ pytest

Output is:

dicodile/tests/test_dicodile.py::test_dicodile [mpiexec@hande] match_arg (utils/args/args.c:160): unrecognized argument pmi_args
[mpiexec@hande] HYDU_parse_array (utils/args/args.c:175): argument matching returned error
[mpiexec@hande] parse_args (ui/mpich/utils.c:1603): error parsing input array
[mpiexec@hande] HYD_uii_mpx_get_parameters (ui/mpich/utils.c:1655): unable to parse user arguments
[mpiexec@hande] main (ui/mpich/mpiexec.c:128): error parsing parameters

This might be due to the missing singleton feature in mpich.

When tests are run with:

$ mpirun -np 1 pytest
  • dicodile/tests/test_dicodile.py runs without problems but hangs after running the tests and cannot stop the spawned process.
  • dicodile/update_z/tests/test_dicod.py hangs at first iteration.

When the tests are run with the mpirun command using the openmpi implementation, all tests run without problems, but the run also hangs after the last test, leaving all processes spawned by that test alive.

The problem with mpich seems to affect only the tests; for example, examples/plot_mandrill.py runs without problems with mpich.

hndgzkn avatar Mar 02 '21 10:03 hndgzkn

I am getting this when running examples/plot_gait.py:

[DEBUG:DICODILE] Lambda_max = 1.7133212673372127
[mpiexec@mendoza-PC] match_arg (utils/args/args.c:160): unrecognized argument pmi_args
[mpiexec@mendoza-PC] HYDU_parse_array (utils/args/args.c:175): argument matching returned error
[mpiexec@mendoza-PC] parse_args (ui/mpich/utils.c:1603): error parsing input array
[mpiexec@mendoza-PC] HYD_uii_mpx_get_parameters (ui/mpich/utils.c:1655): unable to parse user arguments
[mpiexec@mendoza-PC] main (ui/mpich/mpiexec.c:128): error parsing parameters

with mpich-3.4.2, mpi4py-3.1.1, and mpi-1.0-mpich.
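Since several MPI packages can coexist in one environment, it may help to check which implementation mpi4py is actually linked against. The following is a minimal sketch, assuming mpi4py is importable; mpi_vendor is a hypothetical helper name, not part of dicodile.

```python
def mpi_vendor():
    """Return (name, version) of the MPI library behind mpi4py, or None.

    Hypothetical helper for illustration; it is not part of dicodile.
    """
    try:
        # Importing mpi4py.MPI loads (and initializes) the linked MPI library.
        from mpi4py import MPI
    except ImportError:
        # mpi4py is missing or its MPI shared library could not be loaded.
        return None
    # get_vendor() returns a (name, version_tuple) pair.
    return MPI.get_vendor()


if __name__ == "__main__":
    print(mpi_vendor())
```

Running this with the same Python interpreter used for the example shows whether the environment resolves to mpich or openmpi.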

chmendoza avatar Oct 08 '21 14:10 chmendoza

Hi @chmendoza

What is the command that you are using to run the example?

hndgzkn avatar Oct 08 '21 16:10 hndgzkn

I don't have my laptop with me right now to recreate this and give more detail, but it was python plot_gait.py. I could post more details about the conda environment here if you think it is needed.

chmendoza avatar Oct 08 '21 18:10 chmendoza

Please try with the command:

mpirun -np 1 --host localhost:16 python -m mpi4py examples/plot_gait.py

My guess is that you are trying to run the notebook without the mpirun -np 1 ... part of the command. That is possible with openmpi but not with mpich. I hope this helps.
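As a rough sketch, a script could check whether it was started through an MPI launcher and print a hint otherwise. The environment variable names below are assumptions about what the openmpi and mpich launchers export, and launched_with_mpirun is a hypothetical helper, not something dicodile provides.

```python
import os

# Assumed launcher variables: OMPI_* for openmpi's mpirun,
# PMI_* for mpich's Hydra mpiexec. Other launchers may differ.
LAUNCHER_VARS = ("OMPI_COMM_WORLD_SIZE", "PMI_SIZE", "PMI_RANK")


def launched_with_mpirun(environ=None):
    """Return True if any known launcher variable is set.

    Hypothetical helper for illustration only.
    """
    environ = os.environ if environ is None else environ
    return any(var in environ for var in LAUNCHER_VARS)


if __name__ == "__main__":
    if not launched_with_mpirun():
        print("No MPI launcher detected; with mpich, try: "
              "mpirun -np 1 python -m mpi4py <script>")
```

This is only a heuristic: it says nothing about whether spawning workers will succeed, only whether the process appears to have been started by mpirun or mpiexec.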

hndgzkn avatar Oct 08 '21 19:10 hndgzkn

That worked @hndgzkn, thanks! Although I am using this package in my conda environment:

openmpi 4.1.1 hbfc84c5_0

so I am not using mpich.

Finally, I ran the tests with $ pytest . and 35 tests passed, with 10 warnings.

chmendoza avatar Oct 17 '21 22:10 chmendoza