Fix documentation error and ordering bug in DLASD7
Description
For certain problems with many singular values that are close together, DLASD7 subtly mis-orders the resulting singular values, which causes cascading problems throughout the divide-and-conquer algorithm and eventually results in convergence failure.
There is also a documentation error: the documentation states that the deflated singular values are returned in D in increasing order, but in reality they are in decreasing order.
This fixes https://github.com/Reference-LAPACK/lapack/issues/1138. Please refer to https://github.com/OpenMathLib/OpenBLAS/issues/5333 for a detailed discussion of the issue and fix.
Checklist
- [x] The documentation has been updated.
- [x] If the PR solves a specific issue, it is set to be closed on merge.
It is trivial to predict that this fix (if accepted) will also be needed in slasd7.f - probably slasd2.f/dlasd2.f as well. And there is broadly similar code in ?LAED2 and ?LAED8...
Thanks for the feedback, I've pushed equivalent changes to SLASD7, SLASD2 and DLASD2.
Interestingly, SLAED8/DLAED8 appears to contain code specifically to reorder the indices, presumably in order to solve the same problem I encountered, though it does this in a far more convoluted way:
IF( K2+I.LE.N ) THEN
IF( D( JLAM ).LT.D( INDXP( K2+I ) ) ) THEN
INDXP( K2+I-1 ) = INDXP( K2+I )
INDXP( K2+I ) = JLAM
I = I + 1
GO TO 90
ELSE
INDXP( K2+I-1 ) = JLAM
END IF
ELSE
INDXP( K2+I-1 ) = JLAM
END IF
This is basically doing an insertion sort step to find the correct insertion position into the output array to avoid mis-ordering D. I think this is functionally equivalent to what I would have done:
DO 85 JP = JLAM, J - 1
INDXP( K2 + J - 1 - JP ) = JP
85 CONTINUE
The documentation of SLAED8/DLAED8 is still wrong though (increasing -> decreasing) so I will change that.
Regarding SLAED2/DLAED2/SLAED8/DLAED8, do you prefer to keep the existing (complicated) reordering code, or replace it with the new but simpler code?
Perhaps simplify the ..ED codes in a separate PR, if desired ? (Their rewrite is only tangentially related to the bug this one fixes)