OpenBLAS LAPACK Tests fail when compiled with flang at O2 or above

Open leezu opened this issue 4 years ago • 4 comments

https://github.com/xianyi/OpenBLAS/pull/2652

Here are two build logs. The first predates https://github.com/xianyi/OpenBLAS/pull/2652 and uses the standard flags (above -O1); search it for FATAL ERROR to find the failing tests. The second uses -O1, and all tests pass.

fatal-error.log o1.log

Statement of Martin, OpenBLAS maintainer:

The fortran code in question is a direct copy of Reference-LAPACK (a.k.a "netlib LAPACK") and compiles fine with other compilers, so I'd think at best flang would have to be extra strict in its interpretation of the standard.

leezu avatar Jun 09 '20 21:06 leezu

Maybe due to not setting -Kieee https://github.com/flang-compiler/flang/issues/599#issuecomment-430639227 (via https://github.com/xianyi/OpenBLAS/issues/2650#issuecomment-641618709)

leezu avatar Jun 09 '20 23:06 leezu

On closer inspection, this appears to be entirely due to a miscalculation of the machine precision in the failed tests; the actual results are identical for all practical purposes. More precisely, the tests in OpenBLAS were taken from an older Reference-LAPACK, and so far nobody has bothered to update them to reflect https://github.com/Reference-LAPACK/lapack/commit/a880822, which replaced calculating the machine precision with a call to the f90 EPSILON intrinsic. So it seems flang must be optimizing away something in our antiquated approach of "start from 1 and keep halving that value until 1+x becomes indistinguishable from 1". Unfortunately, this is apparently not the only bug, as the corresponding CBLAS tests in the ctest subdirectory still fail when compiled with -O2 but pass at -O1.
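
For readers unfamiliar with the halving approach, here is a minimal sketch of the idea in Python. It is not the Fortran code from the OpenBLAS or Reference-LAPACK test suites; the function name `halving_epsilon` is made up for illustration, and the sketch assumes IEEE 754 double precision with round-to-nearest. `sys.float_info.epsilon` stands in for what an intrinsic such as Fortran's EPSILON reports.

```python
import sys


def halving_epsilon() -> float:
    """Estimate machine epsilon by repeated halving (hypothetical helper).

    Start from 1 and keep halving x for as long as 1 + x/2 is still
    distinguishable from 1. The value returned is the last x for which
    1 + x differs from 1.
    """
    x = 1.0
    while 1.0 + x / 2.0 != 1.0:
        x /= 2.0
    return x


if __name__ == "__main__":
    # Print the loop-based estimate next to the value the runtime reports.
    print("halving loop:", halving_epsilon())
    print("runtime value:", sys.float_info.epsilon)
```

The loop depends on `1.0 + x / 2.0` being rounded to the working precision before the comparison. If a compiler were to evaluate that comparison differently under optimization, the loop could stop at a different value, which is why asking the language for the value directly is the more robust choice.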

martin-frbg avatar Jun 10 '20 11:06 martin-frbg

Current understanding now is that after applying -Kieee (and correcting our mis-spelling of -Mrecursive) the only remaining issue is specific to the AMD AOCC fork of flang (which appears to have a problem with loop unrolling optimization at -O2 that manifests itself in the complex parts of the Reference-LAPACK testsuite). Sorry for the noise here.

martin-frbg avatar Jun 14 '20 20:06 martin-frbg

Is this an actual bug or can this be closed?

pawosm-arm avatar Feb 20 '22 22:02 pawosm-arm