Some tests fail for NVIDIA HPC SDK 20.7, 20.9, 20.11
Hi,
I just tried compiling reference LAPACK 3.9.0 using the newly released NVIDIA HPC SDK 20.7 on an AMD Zen2 processor (Ryzen 5 3600X). I noticed that some of the tests failed:
--> LAPACK TESTING SUMMARY <--
Processing LAPACK Testing output found in the TESTING directory
SUMMARY             nb test run  numerical error    other error
================    ===========  =================  ================
REAL                1107279      285 (0.026%)       0 (0.000%)
DOUBLE PRECISION    1221707      280 (0.023%)       0 (0.000%)
COMPLEX             641118       23 (0.004%)        0 (0.000%)
COMPLEX16           684278       140 (0.020%)       0 (0.000%)
--> ALL PRECISIONS  3654382      728 (0.020%)       0 (0.000%)
Testing REAL Singular-Value-Decomposition-ssvd.out
SBD drivers: 56 out of 14820 tests failed to pass the threshold
SBD drivers: 56 out of 14820 tests failed to pass the threshold
SBD drivers: 56 out of 14820 tests failed to pass the threshold
SBD drivers: 56 out of 14820 tests failed to pass the threshold
SBD drivers: 56 out of 14820 tests failed to pass the threshold
passed: 51300
failing to pass the threshold: 280
Testing REAL Linear-Equation-routines-stest.out
SLS drivers: 4 out of 105840 tests failed to pass the threshold
passed: 299334
failing to pass the threshold: 4
Testing REAL RFP-linear-equation-routines-stest_rfp.out
STFSM auxiliary routine: 1 out of 7776 tests failed to pass the threshold
passed: 5352
failing to pass the threshold: 1
Testing DOUBLE PRECISION Singular-Value-Decomposition-dsvd.out
DBD drivers: 56 out of 14820 tests failed to pass the threshold
DBD drivers: 56 out of 14820 tests failed to pass the threshold
DBD drivers: 56 out of 14820 tests failed to pass the threshold
DBD drivers: 56 out of 14820 tests failed to pass the threshold
DBD drivers: 56 out of 14820 tests failed to pass the threshold
passed: 51300
failing to pass the threshold: 280
Testing COMPLEX Linear-Equation-routines-ctest.out
CPB: 11 out of 3458 tests failed to pass the threshold
CPB drivers: 4 out of 4750 tests failed to pass the threshold
CLS drivers: 8 out of 105840 tests failed to pass the threshold
passed: 304541
failing to pass the threshold: 23
Testing COMPLEX16 Singular-Value-Decomposition-zsvd.out
ZBD drivers: 28 out of 14340 tests failed to pass the threshold
ZBD drivers: 28 out of 14340 tests failed to pass the threshold
ZBD drivers: 28 out of 14340 tests failed to pass the threshold
ZBD drivers: 28 out of 14340 tests failed to pass the threshold
ZBD drivers: 28 out of 14340 tests failed to pass the threshold
passed: 20425
failing to pass the threshold: 140
Attached is the full testing log: testing_results.txt
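For anyone who wants to pull the same failure lines straight from the test output instead of reading the whole summary, a minimal sketch follows. It assumes the `*.out` files sit in the TESTING directory mentioned in the summary header and contain the "failed to pass the threshold" lines quoted above; the function name `list_threshold_failures` is made up for illustration.

```shell
# Hypothetical helper (not part of LAPACK): print every
# "failed to pass the threshold" line found in the *.out files of a
# testing directory, prefixed with the file it came from.
# Assumption: the directory defaults to ./TESTING, as in the summary header.
list_threshold_failures() {
    testing_dir="${1:-TESTING}"
    # -H forces the file name prefix even if only one file matches
    grep -H 'failed to pass the threshold' "$testing_dir"/*.out
}
```

Calling `list_threshold_failures TESTING` from the top of the LAPACK source tree should list one line per failing test group; grep exits with a non-zero status when nothing matches.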
Edit: added processor name
Also, here is the make.inc that I used to compile. I roughly followed the steps listed in this page to build a shared library version of LAPACK by modifying Makefile and SRC/Makefile, but I think these modifications should be unrelated to the testing failures.
If nvfortran is in any way related to recent flang, you could check whether adding -Kieee to FFLAGS helps. (With the AMD AOCC flavor of flang, I found it necessary to add -fno-unroll-loops, so this could be another option to try in order to narrow it down.)
@martin-frbg I think it is more closely related to the PGI compiler than to AOCC flang (actually, the pgfortran alias is still there and now points to nvfortran), but I'll give it a try once I get back to my Zen2 workstation.
Edit: the -Kieee flag does the job! Now it's down to only 5 numerical errors:
--> LAPACK TESTING SUMMARY <--
Processing LAPACK Testing output found in the TESTING directory
SUMMARY             nb test run  numerical error    other error
================    ===========  =================  ================
REAL                1300419      1 (0.000%)         0 (0.000%)
DOUBLE PRECISION    1302223      4 (0.000%)         4 (0.000%)
COMPLEX             768366       0 (0.000%)         0 (0.000%)
COMPLEX16           769178       0 (0.000%)         0 (0.000%)
--> ALL PRECISIONS  4140186      5 (0.000%)         4 (0.000%)
Testing REAL RFP-linear-equation-routines-stest_rfp.out
STFSM auxiliary routine: 1 out of 7776 tests failed to pass the threshold
passed: 5352
failing to pass the threshold: 1
Testing DOUBLE PRECISION Nonsymmetric-Eigenvalue-ded.out
DDRVES: DGEES1 returned INFO= 6.
DDRVES: DGEES1 returned INFO= 6.
DES: 2 out of 3264 tests failed to pass the threshold
DGET24: DGEESX1 returned INFO= 6.
DGET24: DGEESX1 returned INFO= 6.
DSX: 2 out of 3494 tests failed to pass the threshold
passed: 6198
failing to pass the threshold: 4
Info Error: 4
Update: Building with NVIDIA HPC SDK versions 20.9 and 20.11 also results in some errors. As suggested by @martin-frbg (at least for building OpenBLAS with PGI compilers / NVIDIA HPC SDK), building reference LAPACK also requires the -Kieee compiler flag. Attached are the make.inc file (renamed to make.inc.nv.txt) that I use for reference LAPACK and the three full build logs (compressed as gzip files), built with NVIDIA HPC SDK 20.7, 20.9, and 20.11, respectively.
The commands that I use to build are:
$ make clean
$ make -j 12 blas_testing lapack_testing > build-nv20.11.log 2>&1
build-nv20.7.log.gz build-nv20.9.log.gz build-nv20.11.log.gz
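For reference, the -Kieee change discussed in this thread could be applied to an existing make.inc before running the build, roughly as sketched below. This is only an illustration under assumptions: it assumes make.inc sets the compiler options on a single line of the form `FFLAGS = ...`, and the function name `add_kieee` is hypothetical. It touches only FFLAGS, which is the variable named in the suggestion above; whether other variables in your make.inc need the flag is not covered here.

```shell
# Hypothetical helper (not part of LAPACK): append -Kieee to the FFLAGS
# assignment in a make.inc file, unless the flag is already there.
# Assumption: the assignment is a single line of the form "FFLAGS = ...".
add_kieee() {
    inc_file="$1"
    # Match only the FFLAGS assignment (not FFLAGS_DRV or FFLAGS_NOOPT);
    # skip the line if it already contains -Kieee, otherwise replace any
    # trailing whitespace with " -Kieee". A backup is kept as <file>.bak.
    sed -i.bak -E \
        '/^[[:space:]]*FFLAGS[[:space:]]*=/{/-Kieee/!s/[[:space:]]*$/ -Kieee/;}' \
        "$inc_file"
}
```

Running `add_kieee make.inc` followed by the make commands above would then rebuild with the flag; running it a second time leaves the file unchanged.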