arpack-ng
MATRIX_MARKET tests fail with parallel make -jN check
Expected behavior
All tests complete successfully.
Actual behavior
Two of the three tests (arpackmm, issue215 and issue401) fail when run with make -j2 or higher.
Where/how to reproduce the problem
- arpack-ng: 3.9.1
- OS: Fedora rawhide (but reproducible on 38 and 39, too)
- compiler: gcc version 13.2.1 20231011 (Red Hat 13.2.1-4) (GCC)
- environment:
FFLAGS='-O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules '
- configure:
./configure --build=x86_64-redhat-linux --host=x86_64-redhat-linux --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --runstatedir=/run --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-static --with-blas=-lflexiblas --with-lapack=-lflexiblas --enable-eigen --enable-icb
Steps to reproduce the problem
cd EXAMPLES/MATRIX_MARKET
make check -j2
make check -j3
Error message
With make -j2, the issue215 test passes and the other two fail:
$ make check -j2
make arpackmm \
arpackmm.sh issue401.sh issue215.sh An.mtx As.mtx Az.mtx B.mtx Bz.mtx issue401.mtx issue215.mtx
make[1]: Entering directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
make[1]: 'arpackmm' is up to date.
make[1]: Nothing to be done for 'arpackmm.sh'.
make[1]: Nothing to be done for 'issue401.sh'.
make[1]: Nothing to be done for 'issue215.sh'.
make[1]: Nothing to be done for 'An.mtx'.
make[1]: Nothing to be done for 'As.mtx'.
make[1]: Nothing to be done for 'Az.mtx'.
make[1]: Nothing to be done for 'B.mtx'.
make[1]: Nothing to be done for 'Bz.mtx'.
make[1]: Nothing to be done for 'issue401.mtx'.
make[1]: Nothing to be done for 'issue215.mtx'.
make[1]: Leaving directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
make check-TESTS
make[1]: Entering directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
make[2]: Entering directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
FAIL: issue401.sh
FAIL: arpackmm.sh
PASS: issue215.sh
============================================================================
Testsuite summary for ARPACK-NG 3.9.1
============================================================================
# TOTAL: 3
# PASS: 1
# SKIP: 0
# XFAIL: 0
# FAIL: 2
# XPASS: 0
# ERROR: 0
============================================================================
See EXAMPLES/MATRIX_MARKET/test-suite.log
Please report to https://github.com/opencollab/arpack-ng/issues/
============================================================================
make[2]: *** [Makefile:741: test-suite.log] Error 1
make[2]: Leaving directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
make[1]: *** [Makefile:849: check-TESTS] Error 2
make[1]: Leaving directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
make: *** [Makefile:936: check-am] Error 2
With make -j3 or higher, the arpackmm test passes and the other two fail:
$ make check -j3
make arpackmm \
arpackmm.sh issue401.sh issue215.sh An.mtx As.mtx Az.mtx B.mtx Bz.mtx issue401.mtx issue215.mtx
make[1]: Entering directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
make[1]: 'arpackmm' is up to date.
make[1]: Nothing to be done for 'arpackmm.sh'.
make[1]: Nothing to be done for 'issue401.sh'.
make[1]: Nothing to be done for 'issue215.sh'.
make[1]: Nothing to be done for 'An.mtx'.
make[1]: Nothing to be done for 'As.mtx'.
make[1]: Nothing to be done for 'Az.mtx'.
make[1]: Nothing to be done for 'B.mtx'.
make[1]: Nothing to be done for 'Bz.mtx'.
make[1]: Nothing to be done for 'issue401.mtx'.
make[1]: Nothing to be done for 'issue215.mtx'.
make[1]: Leaving directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
make check-TESTS
make[1]: Entering directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
make[2]: Entering directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
FAIL: issue215.sh
FAIL: issue401.sh
PASS: arpackmm.sh
============================================================================
Testsuite summary for ARPACK-NG 3.9.1
============================================================================
# TOTAL: 3
# PASS: 1
# SKIP: 0
# XFAIL: 0
# FAIL: 2
# XPASS: 0
# ERROR: 0
============================================================================
See EXAMPLES/MATRIX_MARKET/test-suite.log
Please report to https://github.com/opencollab/arpack-ng/issues/
============================================================================
make[2]: *** [Makefile:741: test-suite.log] Error 1
make[2]: Leaving directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
make[1]: *** [Makefile:849: check-TESTS] Error 2
make[1]: Leaving directory '/builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET'
make: *** [Makefile:936: check-am] Error 2
Traces
make -j2
$ tail -n 300 /builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET/test-suite.log
============================================================
ARPACK-NG 3.9.1: EXAMPLES/MATRIX_MARKET/test-suite.log
============================================================
# TOTAL: 3
# PASS: 1
# SKIP: 0
# XFAIL: 0
# FAIL: 2
# XPASS: 0
# ERROR: 0
.. contents:: :depth: 2
FAIL: arpackmm.sh
=================
./arpackmm --help
========================================================================================
./arpackmm --A As.mtx --slv BiCG --slvItrTol 1.e-06 --slvItrMaxIt 150 --nbCV 6 --maxIt 200 --verbose 3 --debug 3
========================================================================================
./arpackmm --A As.mtx --slv BiCG --slvItrTol 1.e-06 --slvItrMaxIt 150 --nbCV 6 --maxIt 200 --verbose 3 --debug 3 --restart
========================================================================================
./arpackmm --A As.mtx --slv BiCG --slvItrTol 1.e-06 --slvItrMaxIt 150 --simplePrec --nbCV 6 --maxIt 200 --verbose 3 --debug 3
========================================================================================
./arpackmm --A As.mtx --slv BiCG --slvItrTol 1.e-06 --slvItrMaxIt 150 --simplePrec --nbCV 6 --maxIt 200 --verbose 3 --debug 3 --restart
FAIL arpackmm.sh (exit status: 1)
FAIL: issue401.sh
=================
OPT: A issue401.mtx, B N.A., dense no, nbEV 1, nbCV 5, stdPb yes, symPb yes, cpxPb no, simplePrec no, mag LA
OPT: shiftReal no, sigmaReal 0, shiftImag no, sigmaImag 0, invert no, tol 1e-06, maxIt 100, Ritz vectors
OPT: slv BiCG, slvItrPC Diag, slvItrTol 1e-06, slvItrMaxIt 100
OPT: check yes, verbose 0, debug 0, restart no
INP: create A 0 s
OUT: mode 1, nb EV found 1, nb iterations 1
OUT: init mode solver 0 s, RCI time 0 s
OUT: full time 0 s
STAT: total number of user OP*x operation 9
STAT: total number of user B*x operation 0
STAT: total number of reorthogonalization steps taken 4
STAT: total number of it. refinement steps in reorthogonalization 8
STAT: total number of restart steps 3
OPT: A issue401.mtx, B N.A., dense no, nbEV 1, nbCV 5, stdPb yes, symPb yes, cpxPb no, simplePrec no, mag LA
OPT: shiftReal no, sigmaReal 0, shiftImag no, sigmaImag 0, invert no, tol 1e-06, maxIt 100, Ritz vectors
OPT: slv BiCG, slvItrPC Diag, slvItrTol 1e-06, slvItrMaxIt 100
OPT: check yes, verbose 0, debug 0, restart yes
INP: create A 0 s
OUT: mode 1, nb EV found 1, nb iterations 1
OUT: init mode solver 0 s, RCI time 0 s
OUT: full time 0 s
STAT: total number of user OP*x operation 10
STAT: total number of user B*x operation 0
STAT: total number of reorthogonalization steps taken 5
STAT: total number of it. refinement steps in reorthogonalization 10
STAT: total number of restart steps 4
OPT: A issue401.mtx, B N.A., dense no, nbEV 1, nbCV 5, stdPb yes, symPb yes, cpxPb no, simplePrec no, mag LA
OPT: shiftReal no, sigmaReal 0, shiftImag no, sigmaImag 0, invert no, tol 1e-06, maxIt 100, Ritz vectors
OPT: slv BiCG, slvItrPC Diag, slvItrTol 1e-06, slvItrMaxIt 100
OPT: check yes, verbose 0, debug 0, restart yes
INP: create A 0 s
Error: bad dim - restart KO
Error: bad restart (resid)
Error: arpack solve KO
Error: solve KO
Error: arpack solve KO
FAIL issue401.sh (exit status: 1)
make -j3
$ tail -n 300 /builddir/build/BUILD/arpack-3.9.1/src/EXAMPLES/MATRIX_MARKET/test-suite.log
============================================================
ARPACK-NG 3.9.1: EXAMPLES/MATRIX_MARKET/test-suite.log
============================================================
# TOTAL: 3
# PASS: 1
# SKIP: 0
# XFAIL: 0
# FAIL: 2
# XPASS: 0
# ERROR: 0
.. contents:: :depth: 2
FAIL: issue401.sh
=================
OPT: A issue401.mtx, B N.A., dense no, nbEV 1, nbCV 5, stdPb yes, symPb yes, cpxPb no, simplePrec no, mag LA
OPT: shiftReal no, sigmaReal 0, shiftImag no, sigmaImag 0, invert no, tol 1e-06, maxIt 100, Ritz vectors
OPT: slv BiCG, slvItrPC Diag, slvItrTol 1e-06, slvItrMaxIt 100
OPT: check yes, verbose 0, debug 0, restart no
INP: create A 0 s
OUT: mode 1, nb EV found 1, nb iterations 1
OUT: init mode solver 0 s, RCI time 0 s
OUT: full time 0.001 s
STAT: total number of user OP*x operation 9
STAT: total number of user B*x operation 0
STAT: total number of reorthogonalization steps taken 4
STAT: total number of it. refinement steps in reorthogonalization 8
STAT: total number of restart steps 3
OPT: A issue401.mtx, B N.A., dense no, nbEV 1, nbCV 5, stdPb yes, symPb yes, cpxPb no, simplePrec no, mag LA
OPT: shiftReal no, sigmaReal 0, shiftImag no, sigmaImag 0, invert no, tol 1e-06, maxIt 100, Ritz vectors
OPT: slv BiCG, slvItrPC Diag, slvItrTol 1e-06, slvItrMaxIt 100
OPT: check yes, verbose 0, debug 0, restart yes
INP: create A 0 s
Error: bad dim - restart KO
Error: bad restart (resid)
Error: arpack solve KO
Error: solve KO
Error: arpack solve KO
FAIL issue401.sh (exit status: 1)
FAIL: issue215.sh
=================
OPT: A issue215.mtx, B N.A., dense no, nbEV 1, nbCV 4, stdPb yes, symPb yes, cpxPb no, simplePrec no, mag LM
OPT: shiftReal yes, sigmaReal 0.1, shiftImag no, sigmaImag 0, invert no, tol 1e-06, maxIt 100, Ritz vectors
OPT: slv BiCG, slvItrPC Diag, slvItrTol 1e-06, slvItrMaxIt 100
OPT: check yes, verbose 0, debug 0, restart no
INP: create A 0 s
OUT: mode 1, nb EV found 1, nb iterations 1
OUT: init mode solver 0 s, RCI time 0 s
OUT: full time 0.001 s
STAT: total number of user OP*x operation 6
STAT: total number of user B*x operation 0
STAT: total number of reorthogonalization steps taken 4
STAT: total number of it. refinement steps in reorthogonalization 6
STAT: total number of restart steps 1
OPT: A issue215.mtx, B N.A., dense no, nbEV 1, nbCV 4, stdPb yes, symPb yes, cpxPb no, simplePrec no, mag LM
OPT: shiftReal yes, sigmaReal 0.1, shiftImag no, sigmaImag 0, invert no, tol 1e-06, maxIt 100, Ritz vectors
OPT: slv BiCG, slvItrPC Diag, slvItrTol 1e-06, slvItrMaxIt 100
OPT: check yes, verbose 0, debug 0, restart yes
INP: create A 0 s
Error: bad dim - restart KO
Error: bad restart (resid)
Error: arpack solve KO
Error: solve KO
Error: arpack solve KO
FAIL issue215.sh (exit status: 1)
Callstack
N/A
Notes, remarks
Running with make -j1, or without any -j option, works.
Is it a regression new with 3.9.1?
These tests didn't exist in 3.9.0, so yes, it's new.
These tests are meant to be run sequentially: restart information is stored in a file that does not support concurrent access.
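The failure mode described here can be pictured with a toy model. The sketch below is not arpack-ng code: the file name `restart.dat`, the helper names `save` and `restart`, and the idea that the file holds a single problem dimension are all assumptions made for illustration. It only shows why a state file shared between jobs could produce a "bad dim" style error when one job overwrites it before another reloads it.

```shell
#!/bin/sh
# Toy model of the suspected race; file name and contents are hypothetical.
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
state="$dir/restart.dat"

# A "test" stores its problem dimension in the shared state file.
save() { echo "$1" > "$state"; }

# A "test" reloads the state file and checks it against its own dimension.
restart() {
  saved=$(cat "$state")
  if [ "$saved" = "$1" ]; then
    echo "test dim=$1: restart OK"
  else
    echo "test dim=$1: bad dim (found $saved)"
  fi
}

# Sequential: each test saves and restarts before the next one starts.
save 5; restart 5
save 4; restart 4

# Interleaved, as parallel jobs may do: the second test overwrites the
# file before the first test reloads it.
save 5; save 4
restart 5
restart 4
```

In the sequential half both restarts succeed; in the interleaved half the first test finds the other test's data. The real interleaving under make -jN depends on scheduling, which may be why a different test passes with -j2 than with -j3.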
Ok. Could only those tests be run sequentially? make has special markers for targets that require sequential handling.
> Ok. Could only those tests be run sequentially?
Sure
> make has special markers for targets that require sequential handling.
No idea how
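One marker that GNU make provides for this is the `.NOTPARALLEL` special target: when it appears in a makefile, that make invocation runs its targets one at a time even if `-jN` was given. The sketch below uses a toy makefile, not arpack-ng's; the targets `t1`, `t2` and `t3` are hypothetical stand-ins for the three test scripts, and a `make` that supports `.NOTPARALLEL` (such as GNU make) is assumed to be installed.

```shell
#!/bin/sh
# Toy demonstration of .NOTPARALLEL; this is NOT arpack-ng's makefile.
# t1, t2 and t3 are hypothetical stand-ins for the three test scripts.
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

# Write the makefile with printf so the recipe line gets a real tab.
{
  printf '.NOTPARALLEL:\n'
  printf 'check: t1 t2 t3\n'
  printf 't1 t2 t3:\n'
  printf '\t@echo "start $@"; sleep 1; echo "end $@"\n'
  printf '.PHONY: check t1 t2 t3\n'
} > "$dir/Makefile"

# Ask for three jobs; .NOTPARALLEL still runs the targets one after another.
out=$(cd "$dir" && make -j3 check)
printf '%s\n' "$out"
```

Each `start` line is followed by its own `end` line before the next target begins. Without the `.NOTPARALLEL:` line, the three `start` lines would be expected to appear together. If the same line were added to EXAMPLES/MATRIX_MARKET/Makefile.am, Automake should copy it into the generated Makefile, so only that directory would be serialized; this has not been tested against arpack-ng here.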