
Accuracy issue in SVD API "SGESDD"

Open srvasanth opened this issue 5 years ago • 4 comments

Hi, We are observing a few failures in one of our customer applications that uses libFLAME and BLIS through the SVD API "SGESDD". The outputs for the singular values S and the orthogonal matrix U differ from the expected output. The same tests pass when OpenBLAS or MKL is used for the same API.

Input matrix A size: 9 x 100
Input values: {1} -> all 1s
Parameters: JOBZ = 'O', M = 9, N = 100, LDA = 9, LDU = M, LDVT = 1
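
A minimal way to poke at the same matrix from Python is sketched below. Note this is not an exact reproducer of the call above: numpy routes its single-precision SVD through the LAPACK *gesdd driver, but it does not use the JOBZ = 'O' overwrite path, and it picks up whatever BLAS/LAPACK it was built against.

    import numpy as np

    # 9 x 100 single-precision matrix of all ones, as in the report above.
    a = np.ones((9, 100), dtype=np.float32)

    # numpy's SVD is backed by the LAPACK *gesdd driver (sgesdd for float32).
    u, s, vt = np.linalg.svd(a, full_matrices=False)

    print(s)                                  # mathematically: 3.0e+01 followed by (near-)zeros
    print(np.abs(u.T @ u - np.eye(9)).max())  # U should be orthogonal, so this should be ~0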

Outputs from libFLAME + BLIS

Singular values (S):
3.0000e+01 4.6447e-06 5.2638e-13 1.2358e-19 0.0000e+00 -0.0000e+00 -0.0000e+00 -0.0000e+00 -0.0000e+00

Orthogonal matrix (U):
 3.3333e-01 -9.4281e-01 -0.0000e+00 -0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00
 3.3333e-01  1.1785e-01 -9.3541e-01 -1.2102e-07  7.0755e-15  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00
 3.3333e-01  1.1785e-01  1.3363e-01 -9.2582e-01  1.3064e-08  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00
 3.3333e-01  1.1785e-01  1.3363e-01  1.5430e-01  9.1287e-01  0.0000e+00  0.0000e+00  0.0000e+00  0.0000e+00
 3.3333e-01  1.1785e-01  1.3363e-01  1.5430e-01 -1.8257e-01  4.4721e-01  4.4721e-01  4.4721e-01  4.4721e-01
 3.3333e-01  1.1785e-01  1.3363e-01  1.5430e-01 -1.8257e-01 -8.6180e-01  1.3820e-01  1.3820e-01  1.3820e-01
 3.3333e-01  1.1785e-01  1.3363e-01  1.5430e-01 -1.8257e-01  1.3820e-01 -8.6180e-01  1.3820e-01  1.3820e-01
 3.3333e-01  1.1785e-01  1.3363e-01  1.5430e-01 -1.8257e-01  1.3820e-01  1.3820e-01 -8.6180e-01  1.3820e-01
 3.3333e-01  1.1785e-01  1.3363e-01  1.5430e-01 -1.8257e-01  1.3820e-01  1.3820e-01  1.3820e-01 -8.6180e-01

Expected output:

Singular values (S):
3.0000e+01 4.4731e-06 2.8951e-12 3.2130e-18 2.3120e-24 1.2895e-30 1.3683e-36 1.6802e-42 2.9427e-44

Orthogonal matrix (U):
-3.3333e-01  9.4281e-01  6.4572e-07  9.9341e-09  9.9341e-09 -1.9868e-08 -1.9868e-08 -1.9868e-08  0.0000e+00
-3.3333e-01 -1.1785e-01  9.3541e-01 -1.0867e-06 -7.6012e-09 -1.5052e-08  1.5202e-08  5.1177e-09  0.0000e+00
-3.3333e-01 -1.1785e-01 -1.3363e-01  9.2582e-01  7.9038e-07 -1.7868e-08  3.0130e-10 -2.3329e-09  0.0000e+00
-3.3333e-01 -1.1785e-01 -1.3363e-01 -1.5430e-01  9.1287e-01 -5.9798e-07  2.1286e-08 -1.2825e-08  0.0000e+00
-3.3333e-01 -1.1785e-01 -1.3363e-01 -1.5430e-01 -1.8257e-01  8.9443e-01 -1.0503e-06 -1.6157e-08  0.0000e+00
-3.3333e-01 -1.1785e-01 -1.3363e-01 -1.5430e-01 -1.8257e-01 -2.2361e-01  8.6603e-01  1.3324e-06 -2.2352e-08
-3.3333e-01 -1.1785e-01 -1.3363e-01 -1.5430e-01 -1.8257e-01 -2.2361e-01 -2.8868e-01  8.1635e-01 -1.5403e-02
-3.3333e-01 -1.1785e-01 -1.3363e-01 -1.5430e-01 -1.8257e-01 -2.2361e-01 -2.8867e-01 -4.2152e-01 -6.9928e-01
-3.3333e-01 -1.1785e-01 -1.3363e-01 -1.5430e-01 -1.8257e-01 -2.2361e-01 -2.8867e-01 -3.9484e-01  7.1468e-01

Any analysis or help regarding this will be highly appreciated.

srvasanth avatar Dec 07 '20 09:12 srvasanth

We saw a couple of failing numpy tests across a variety of CPUs (Haswell, Skylake, Zen2) when building numpy 1.19.4 on top of the latest BLIS (0.8.0) and libFLAME (5.2.0) with GCC 10.2; an example is below.

We're not seeing those failing tests when using OpenBLAS (0.3.12) with the LAPACK it ships, or with Intel MKL (2020 update 4).

If we use BLIS 0.8.0 + reference LAPACK 3.9.0, then there are no failing tests, so the culprit must be libFLAME...

_________________________________________________________________________________________________________________ TestRandomDist.test_multivariate_normal[svd] _________________________________________________________________________________________________________________

self = <numpy.random.tests.test_generator_mt19937.TestRandomDist object at 0x14eac29cce50>, method = 'svd'

    @pytest.mark.parametrize("method", ["svd", "eigh", "cholesky"])
    def test_multivariate_normal(self, method):
        random = Generator(MT19937(self.seed))
        mean = (.123456789, 10)
        cov = [[1, 0], [0, 1]]
        size = (3, 2)
        actual = random.multivariate_normal(mean, cov, size, method=method)
        desired = np.array([[[-1.747478062846581,  11.25613495182354  ],
                             [-0.9967333370066214, 10.342002097029821 ]],
                            [[ 0.7850019631242964, 11.181113712443013 ],
                             [ 0.8901349653255224,  8.873825399642492 ]],
                            [[ 0.7130260107430003,  9.551628690083056 ],
                             [ 0.7127098726541128, 11.991709234143173 ]]])

>       assert_array_almost_equal(actual, desired, decimal=15)
E       AssertionError:
E       Arrays are not almost equal to 15 decimals
E
E       Mismatched elements: 12 / 12 (100%)
E       Max absolute difference: 3.98341847
E       Max relative difference: 2.2477228
E        x: array([[[ 1.994391640846581,  8.74386504817646 ],
E               [ 1.243646915006621,  9.657997902970179]],
E       ...
E        y: array([[[-1.747478062846581, 11.25613495182354 ],
E               [-0.996733337006621, 10.342002097029821]],
E       ...

actual     = array([[[ 1.99439164,  8.74386505],
        [ 1.24364692,  9.6579979 ]],

       [[-0.53808839,  8.81888629],
        [-0.64322139, 11.1261746 ]],

       [[-0.46611243, 10.44837131],
        [-0.46579629,  8.00829077]]])
cov        = [[1, 0], [0, 1]]
desired    = array([[[-1.74747806, 11.25613495],
        [-0.99673334, 10.3420021 ]],

       [[ 0.78500196, 11.18111371],
        [ 0.89013497,  8.8738254 ]],

       [[ 0.71302601,  9.55162869],
        [ 0.71270987, 11.99170923]]])
mean       = (0.123456789, 10)
method     = 'svd'
random     = Generator(MT19937) at 0x14EAC29CE040
self       = <numpy.random.tests.test_generator_mt19937.TestRandomDist object at 0x14eac29cce50>
size       = (3, 2)
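
For context: Generator.multivariate_normal with method='svd' factorizes the covariance matrix with an SVD and uses the factors to transform standard-normal draws, so any difference in the factors returned by the underlying LAPACK (signs, or the basis chosen for repeated singular values) changes the exact numbers that come out. That is why an SVD-level discrepancy shows up here as 100% mismatched elements. A rough sketch of that transformation (not numpy's literal implementation, just the idea):

    import numpy as np

    rng = np.random.default_rng(0)
    mean = np.array([0.123456789, 10.0])
    cov = np.array([[1.0, 0.0], [0.0, 1.0]])

    # Sketch of the 'svd' method: factor the covariance, then "colour"
    # standard-normal draws with sqrt(S) * V^T so they end up with covariance cov.
    u, s, vh = np.linalg.svd(cov)
    z = rng.standard_normal((3, 2, 2))
    samples = mean + z @ (np.sqrt(s)[:, None] * vh)
    print(samples.shape)  # (3, 2, 2), matching size=(3, 2) in the test above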

boegel avatar Mar 12 '21 10:03 boegel

Unfortunately SVD in libflame is not easy to debug because it uses a completely different algorithm than the one in LAPACK. So for now, I'll encourage you both to use netlib LAPACK + BLIS as your workaround. Apologies for the inconvenience.

fgvanzee avatar Mar 14 '21 19:03 fgvanzee

@fgvanzee That's indeed the alternative approach we're going forward with; we've also seen some other problems with libFLAME (like #46).
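
As a quick sanity check that a given numpy build actually picked up the intended BLAS/LAPACK, it's worth printing its build configuration:

    import numpy as np

    # Prints the BLAS/LAPACK libraries this numpy build was linked against.
    np.show_config()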

boegel avatar Mar 14 '21 19:03 boegel

Perhaps related to this: AMD has just released a new version of their libFLAME fork, see https://github.com/amd/libflame/releases/tag/3.0, which mentions in the release notes "Several bug fixes including handling denormal numbers in SVD functions".

I'm not sure those fixes are related to this issue, but it seems like they could be...

I'll try and find time to take that new AMD-libFLAME version for a spin, and see if I'm still running into problems with the numpy test suite.

boegel avatar Mar 15 '21 18:03 boegel