Use generic kernel on P6 & P7 so that lapack-test passes
When we ran lapack-test on Power7, we found that some Eigen tests failed. This PR switches back to the generic kernels for GEMV, GEMM, etc. With this change, all tests pass on Power7 AIX.
LGTM
Hmm, I'm not entirely sure this is necessary. Which compiler did you use?
We use IBM OpenXL C/C++ 17.1.2 and OpenXL Fortran 17.1.2 on AIX. No failure. I also tried GCC 11.3.1 on RHEL9.2. Only 1 failure in ssvd.out (SBD, M=30, N=40, type=10, test(9) returns 50.50 and thresh is 50.00).
Without the change, there are about 205 failures (AIX7.2, OpenXL) and 189 failures (RHEL9.2, GCC).
Thanks - the question for me is why this comes up only now, and if the errors are of any significance. The fundamental problem with the LAPACK testsuite is that it expects the exact same results as would be obtained from the unoptimized Reference LAPACK implementation compiled with the equally unoptimized Reference BLAS, and any hint of FMA operations or just a different order of summations can throw it off.
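To illustrate the point about summation order, here is a minimal sketch in C. The function names `sum_forward` and `sum_backward` are hypothetical and are not taken from OpenBLAS or Reference BLAS; they only show that adding the same values in a different order can round differently.

```c
#include <stddef.h>

/* Sum left to right, as a plain reference-style loop would. */
double sum_forward(const double *x, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += x[i];
    }
    return s;
}

/* Sum right to left: same values, different rounding steps. */
double sum_backward(const double *x, size_t n) {
    double s = 0.0;
    for (size_t i = n; i > 0; i--) {
        s += x[i - 1];
    }
    return s;
}
```

With the input `{1e17, -1e17, 1.0}` the forward sum cancels the large terms first and keeps the `1.0`, while the backward sum absorbs the `1.0` into `-1e17` before the cancellation, so the two results differ even though no operation is "wrong".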
Yes, we have started looking at the lapack-test failures on the Power platform. We worked on Power7 first because it is old, so changes there should have limited impact.
You are correct that these tests expect exactly the same results from the LAPACK subroutines, something close to bit-accurate. The current failures are caused not only by FMA operations but also by something else. What I found is that the complex GEMM implementation in kernel/power/zgemm_kernel_power6.S computes the complex multiply-add by accumulating all R*R and I*I products separately. Then, in the final step, it subtracts the sum of I*I from the sum of R*R. The imaginary part follows a similar idea: it adds the sum of R*I and the sum of I*R.
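The difference in accumulation order described above can be sketched in C for the real part of one output element. This is only an illustration under assumptions: `real_part_split` mimics the "sum R*R and I*I separately, subtract at the end" scheme, and `real_part_interleaved` is a hypothetical straightforward loop, not the actual OpenBLAS generic kernel. The arrays hold the real (`ar`, `br`) and imaginary (`ai`, `bi`) parts of the operands.

```c
#include <stddef.h>

/* Accumulate R*R and I*I in separate sums and subtract once at the end. */
double real_part_split(const double *ar, const double *ai,
                       const double *br, const double *bi, size_t n) {
    double srr = 0.0; /* running sum of ar[k] * br[k] */
    double sii = 0.0; /* running sum of ai[k] * bi[k] */
    for (size_t k = 0; k < n; k++) {
        srr += ar[k] * br[k];
        sii += ai[k] * bi[k];
    }
    return srr - sii;
}

/* Update a single accumulator term by term (hypothetical plain loop). */
double real_part_interleaved(const double *ar, const double *ai,
                             const double *br, const double *bi, size_t n) {
    double s = 0.0;
    for (size_t k = 0; k < n; k++) {
        s += ar[k] * br[k];
        s -= ai[k] * bi[k];
    }
    return s;
}
```

For example, with `ar = {1e17, 1}`, `ai = {1e17, 0}`, `br = {1, 1}`, `bi = {1, 0}`, the split version loses the small term inside the large partial sum, while the interleaved version cancels the large products first and keeps it. The imaginary part behaves analogously with the sums of R*I and I*R.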
By switching to the generic C implementation, all tests pass except for 1 failure in ssvd.out (Linux).
As @RajalakshmiSR suggested, I restored the Makefile.power for P6. The PR now only changes the KERNEL for Power6.
Can we target this for 0.3.28?
@martin-frbg Can we target this for 0.3.28? Thanks.
@RajalakshmiSR I'm not particularly happy with throwing out the GotoBLAS kernels when these are probably "minor errors" in the terminology of the netlib LAPACK FAQ. On the other hand, if I misunderstood and the errors are seen with the testsuite from Eigen ...
@martin-frbg Sorry for the confusion. The lapack-test failures on Power7 may not be "minor failures", because one of the failures is about 0.839E+07. I think some of the eigenvalue outputs are not in the same order.
Then how about changing the GEMM or GEMV assembly kernels instead? After looking into the assembly code, we found that some of the code changes the computation order. If that is the preferred way to fix the LAPACK test failures, I can close this PR and create another PR for that change.
Thank you for your review and comments.
Closing this PR; we will rework the GEMV/GEMM assembly kernels instead.