lapack icon indicating copy to clipboard operation
lapack copied to clipboard

DTRSM inconsistent behavior with zeros on the diagonal and an all-zero right-hand side

Open chtobler opened this issue 4 years ago • 5 comments

Description

The reference BLAS implementation of [s/d/c/z]trsm seems to have inconsistent behavior between its varying branches when the right-hand side is all zero and the triangular matrix contains some zeros on the diagonal. Using MATLAB notation:

>> [0 1 1; 0 0 1; 0 0 0] \ [1 0; 1 0; 1 0]
Warning: Matrix is singular to working precision. 
 ans =
   NaN     0
  -Inf     0
   Inf     0
>> [0 1 1; 0 0 1; 0 0 0]' \ [1 0; 1 0; 1 0]
Warning: Matrix is singular to working precision. 
ans =
   Inf   NaN
  -Inf   NaN
   NaN   NaN
>> [0 0 0; 1 0 0; 1 1 0] \ [1 0; 1 0; 1 0]
Warning: Matrix is singular to working precision. 
 ans =
   Inf     0
  -Inf     0
   NaN     0
>> [0 0 0; 1 0 0; 1 1 0]' \ [1 0; 1 0; 1 0]
Warning: Matrix is singular to working precision. 
ans =
   NaN   NaN
  -Inf   NaN
   Inf   NaN

For consistency, it would seem that the second column should contain NaN in all four cases, so that A'\b and At = A'; At\b both agree on whether the result is all-zero or all-NaN. Note that, e.g., Intel MKL's implementation of the BLAS returns all NaN here.

In terms of implementation, I believe removing all branches in *trsm that check if a value of the right-hand side is exact zero would make all four cases above consistently return NaNs in the second column.

Note dtbsv for the case of banded matrices has similar behavior.

Checklist

  • [ ] I've included a minimal example to reproduce the issue
  • [ ] I'd be willing to make a PR to solve this issue

chtobler avatar Nov 04 '21 16:11 chtobler

If info is not 0 the result should not be trusted in my opinion.

ilayn avatar Nov 04 '21 16:11 ilayn

Hi @ilayn, there is no info parameter in TRSM.

langou avatar Nov 07 '21 11:11 langou

My bad, confused with gesv. Sorry for the noise.

ilayn avatar Nov 07 '21 11:11 ilayn

Hi @chtobler, This relates to Section 2.3.3 TRSV and TRSM of the document sent in August 2021 by Jim Demmel Proposed Consistent Exception Handling for the BLAS and LAPACK. The document is not yet public so I am copy-pasting the current version of the draft. Comments welcome. Julien.

The TRSV routine in the reference BLAS solves a triangular system of equations T x = b for x; T may be upper or lower triangular, and unit diagonal (T (i, i) = 1) or not. One may also ask TRSV to solve the transposed linear system T^T x = b. Thereference TRSV returns x= [1; 0] when asked to solve Ux=b with U=[ 1 NaN; 0 NaN ] and b= [ 1; 0] because it checks for trailing zeros in b and does not access the corresponding columns of U (which it would multiply by zero). More generally, TRSV overwrites b with x and checks for zeros appearing anywhere in the updated x, to avoid multiplying the corresponding columns of U by zero. This means solving U x = b with U = [ 1 Nan 1; 0 1 1; 0 0 1] and b = [ 2; 1; 1] yields x = [1 ; 0; 1]. So in both cases, the NaNs in U do not propagate to the result x (neither would an Inf, which if multiplied by zero should also create a NaN). However, if L is the 2-by-2 transpose of the first U above, then calling TRSV to solve L^T x = b, with b = [ 1; 0 ] returns x = [ NaN; NaN]. This is the same linear system as above, but the reference implementation does not check for zeros in this case; this is inconsistent. Again, one could imagine vendor TRSV behaving differently (in Matlab, the NaN does propagate to the solution in these examples). In these cases, if NaN were interpreted to mean “some unknown but finite number”, so that 0*NaN was always 0, then not propagating the NaN would be reasonable. But if NaN meant “some unknown, and possibly infinite number”, then 0*NaN should be a NaN (the default IEEE 754 behavior, which we assume), and not propagating the NaN is incorrect. Analogous comments apply to TRSM.

There are several ways to make exception handling consistent. The first and simplest way is to disallow all checking for zeros. The second way is to allow checking only for leading zeros in b when solving Lx = b (or U T x = b, or trailing zeros in b when solving U x = b, or LT x = b). The reason for the second option is that some users may expect the semantics of solving Lx = b to mean solving a smaller (and possibly much cheaper) linear system when b has trailing zeros, much as they expect alpha = 0 or beta = 0 to affect the semantics of C = alpha*A*B + beta*C. In contrast, zeros appearing after the first nonzero entry in x could be the result of cancellation, and so not something the user can expect in general. But they could also be a result of the sparsity pattern of L and b, for example, if L is block diagonal, and b(i) is nonzero only for indices i corresponding to a subset of the diagonal blocks. On the other hand, optimized versions of TRSM, that apply operations like GEMM to subblocks of matrices, may check for zeros in different (blocked) ways, or not at all (e.g., in MKL), for performance reasons.

We propose to take the first approach, disallowing all zero checking, because it is the simplest, and ensures consis- tency between TRSM and TRSV. This leaves the user the option of checking for leading or trailing zeros themselves and simply changing the size of the linear system in the call to TRSV or TRSM. To support this we will provide simple routines to return the index of the first or last nonzero entry of a 1D-array, or the first or last nonzero row of a 2D-array; the latter is most compatible with blocked optimizations of TRSM. Some versions of these already exist, for the last nonzero row or column of a 2D-array, ILA{S,D,C,Z}L{R,C}.

Analogous comments apply to TPSV and TBSV.

See Section 2.4.3 for an example where TRSV helps cause a NaN to fail to propagate in Gaussian elimination.

langou avatar Nov 07 '21 12:11 langou

Thank you @langou this describes exactly what I meant to describe. I'm happy to hear this is already being considered, and think the first approach will be best. Also because various performance-optimized versions (such as MKL) are already using this behavior, presumably because it will be most performant for the most common case where b doesn't contain any zeros.

chtobler avatar Nov 11 '21 13:11 chtobler