quda

MPIEXEC_[PRE,POST]FLAGS introduces quotation marks which leads to ctest launch failure

Open kostrzewa opened this issue 3 years ago • 0 comments

I want to run ctest in parallel (-j 16) and replace the default binding, since otherwise all processes end up bound to two cores on my test machine. To do that, I tried passing --bind-to none as either ${MPIEXEC_POSTFLAGS} or ${MPIEXEC_PREFLAGS}. Unfortunately, CMake introduces quotation marks around the whole flag string, so it reaches mpiexec as a single argument and the launch fails (for me, with OpenMPI):

$ mpiexec --version
mpiexec (OpenRTE) 4.0.3

Report bugs to http://www.open-mpi.org/community/help/

The command that is run has quotes around "--bind-to none":

1: Test command: /usr/bin/mpiexec "-n" "1" "--bind-to none" "/home/bartek/build/quda-ndeg_twisted_clover_merge_develop_PR_fix_with_tests/tests/blas_test" "--bind-to none" "--dim" "2" "4" "6" "8" "--nsrc" "8" "--msrc" "9" "--solve-type" "direct-pc" "--gtest_output=xml:blas_test_parity.xml"
1: Test timeout computed to be: 1500
1: /usr/bin/mpiexec: Error: unknown option "--bind-to none"
1: Type '/usr/bin/mpiexec --help' for usage.
 1/21 Test  #1: blas_test_parity_wilson ...........................***Failed    0.01 sec

Removing the quotes manually and running

/usr/bin/mpiexec "-n" "1" --bind-to none "/home/bartek/build/quda-ndeg_twisted_clover_merge_develop_PR_fix_with_tests/tests/blas_test" --bind-to none "--dim" "2" "4" "6" "8" "--nsrc" "8" "--msrc" "9" "--solve-type" "direct-pc" "--gtest_output=xml:blas_test_parity.xml"

works instead. As far as I can tell this is "by design", but perhaps this is something worth thinking about.
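To illustrate the difference between the two commands, here is a minimal Python sketch of the argument vectors involved. It does not call mpiexec or CMake; the `./blas_test` path and the variable names are placeholders chosen for the illustration. The assumption is that the quoted form in the ctest log corresponds to a single argv entry containing a space, while the hand-edited command lets the shell split it into two entries.

```python
import shlex

# Flag string as supplied by the user (placeholder for the value given to
# MPIEXEC_PREFLAGS / MPIEXEC_POSTFLAGS).
flags = "--bind-to none"

# Failing case: the whole string is passed as ONE argv entry,
# which mpiexec reports as the unknown option "--bind-to none".
argv_single = ["mpiexec", "-n", "1", flags, "./blas_test"]

# Working case: the string is split on whitespace, as the shell does
# once the quotes are removed by hand.
argv_split = ["mpiexec", "-n", "1", *shlex.split(flags), "./blas_test"]

print(argv_single)
print(argv_split)
```

If the project's CMake files expand these variables unquoted, supplying the flags as a semicolon-separated CMake list (for example `--bind-to;none`) might produce the split form, but that is an untested assumption here.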

For this particular case I've worked around the problem by adjusting the OpenMPI MCA base parameters (/etc/openmpi/openmpi-mca-params.conf) to use a different binding.

kostrzewa, Nov 11 '21 16:11