flang
OpenBLAS LAPACK tests fail when compiled with flang at -O2 or above
https://github.com/xianyi/OpenBLAS/pull/2652
Here are two build logs. The first predates https://github.com/xianyi/OpenBLAS/pull/2652 and uses the standard flags (above -O1); search the log for FATAL ERROR to find the failing tests. The second uses -O1, and all tests pass.
Statement of Martin, OpenBLAS maintainer:
The fortran code in question is a direct copy of Reference-LAPACK (a.k.a "netlib LAPACK") and compiles fine with other compilers, so I'd think at best flang would have to be extra strict in its interpretation of the standard.
Maybe due to not setting -Kieee
https://github.com/flang-compiler/flang/issues/599#issuecomment-430639227 (via https://github.com/xianyi/OpenBLAS/issues/2650#issuecomment-641618709)
On taking a closer look this appears to be entirely due to a miscalculation of the machine precision in the failed tests, the actual results being identical for all practical purposes. More precisely, the tests in OpenBLAS were taken from an older Reference-LAPACK and so far nobody bothered to update them to reflect https://github.com/Reference-LAPACK/lapack/commit/a880822, which replaced calculating the machine precision with a call to the f90 EPSILON intrinsic. So it seems flang must be optimizing away something in our antiquated approach of "start from 1 and keep halving that value until 1+x becomes indistinguishable from 1". Unfortunately this is apparently not the only bug, as the corresponding CBLAS tests in the ctest subdirectory still fail when compiled with -O2, but pass at -O1.
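For readers unfamiliar with the halving approach mentioned above, here is a minimal sketch of the idea in Python. It is not the Reference-LAPACK or OpenBLAS test code; the function name `halving_epsilon` is made up for illustration. Python floats are IEEE 754 doubles and there is no optimizing compiler in the loop, so this only shows what the loop is meant to compute and does not reproduce the flang behaviour.

```python
import sys


def halving_epsilon():
    """Estimate double-precision machine epsilon by repeated halving.

    Hypothetical helper for illustration only; not taken from LAPACK.
    """
    eps = 1.0
    # Keep halving while 1 + eps/2 is still distinguishable from 1.
    # The loop's correctness depends on the sum being rounded to double
    # precision each time. A compiler that keeps the sum in a wider
    # register or simplifies the comparison can yield a different value.
    while 1.0 + eps / 2.0 != 1.0:
        eps /= 2.0
    return eps


if __name__ == "__main__":
    print("halving loop:", halving_epsilon())
    # Value reported by the language runtime, analogous in spirit to
    # asking Fortran's EPSILON intrinsic instead of computing it.
    print("runtime value:", sys.float_info.epsilon)
```

Querying the value from the runtime, as the Reference-LAPACK commit does with EPSILON, avoids depending on how the compiler evaluates the loop.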
Current understanding now is that after applying -Kieee (and correcting our mis-spelling of -Mrecursive) the only remaining issue is specific to the AMD AOCC fork of flang (which appears to have a problem with loop unrolling optimization at -O2 that manifests itself in the complex parts of the Reference-LAPACK testsuite). Sorry for the noise here.
Is this an actual bug or can this be closed?