Tridiagonal matrices and associated spmv kernel
This PR is the first of a series to port what I had been doing in SpecialMatrices to stdlib. This first one provides the base type and spmv kernel for `Tridiagonal` matrices. At the moment, it is included in the `stdlib_sparse_kinds` and `stdlib_sparse_spmv` files for simplicity. It could also be (yet) another module; I don't have a strong opinion about that.
Proposed interface
- `A = Tridiagonal(dl, dv, du)` where `dl`, `dv` and `du` are rank-1 arrays describing the tridiagonal elements.
- `call spmv(A, x, y [, alpha] [, beta] [, op])` for the matrix-vector product.
- `dense(A)` to convert a `Tridiagonal` matrix to its standard rank-2 array representation.
- `transpose(A)` to compute the transpose of a `Tridiagonal` matrix.
- `hermitian(A)` to compute the Hermitian (conjugate transpose) of a `Tridiagonal` matrix.
- `+` and `-` are overloaded for addition and subtraction of two `Tridiagonal` matrices.
- `*` is overloaded for scalar-matrix multiplication.
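To illustrate the intended semantics of `dense(A)`, here is a NumPy sketch; Python is used only for illustration (the actual interface is the Fortran one above), and the array values are made up:

```python
import numpy as np

# Illustrative only: mimics what dense(Tridiagonal(dl, dv, du)) returns.
# dl: sub-diagonal (n-1 entries), dv: main diagonal (n), du: super-diagonal (n-1).
dl = np.array([1.0, 2.0, 3.0])
dv = np.array([4.0, 5.0, 6.0, 7.0])
du = np.array([8.0, 9.0, 10.0])

# Place the three rank-1 arrays on the corresponding diagonals.
A = np.diag(dv) + np.diag(dl, k=-1) + np.diag(du, k=1)
print(A)
```

Note that only `3n - 2` numbers are stored by the tridiagonal type itself; the dense form is just for interoperability with rank-2 array code.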
Key facts
- The `spmv` kernel uses the `lagtm` LAPACK backend under the hood.
- `Tridiagonal` matrices can be created for all the kinds provided by `stdlib`.
- The kind generality of the `spmv` kernel has the same limitations as `stdlib_linalg_lapack`.
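For illustration, the structure-exploiting product that `lagtm` computes (`y := alpha*A*x + beta*y` for tridiagonal `A`) boils down to three element-wise products, one per diagonal. A NumPy sketch with a hypothetical helper name:

```python
import numpy as np

def tridiag_matvec(dl, dv, du, x, alpha=1.0, beta=0.0, y=None):
    """O(n) tridiagonal mat-vec, y := alpha*A@x + beta*y, using only the
    three diagonals. Hypothetical helper sketching what lagtm computes."""
    if y is None:
        y = np.zeros_like(dv)
    Ax = dv * x                 # main-diagonal contribution
    Ax[1:] += dl * x[:-1]       # sub-diagonal contribution
    Ax[:-1] += du * x[1:]       # super-diagonal contribution
    return alpha * Ax + beta * y

# Cross-check against the dense product.
dl = np.array([1.0, 1.0]); dv = np.array([2.0, 2.0, 2.0]); du = np.array([3.0, 3.0])
x = np.ones(3)
A = np.diag(dv) + np.diag(dl, -1) + np.diag(du, 1)
assert np.allclose(tridiag_matvec(dl, dv, du, x), A @ x)
```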
Progress
- [x] Base implementation
- [x] Tests
- [x] Documentation
- [x] Examples
cc @jvdp1, @jalvesz, @perazz
I got one build to work, which is encouraging. Everything works fine locally, though, when using gfortran 14.2.0. I'll try to test with older versions of gfortran over the weekend to see exactly what's going on.
On a different note: the sparse matrix-vector product currently uses `call spmv`. What about exposing an extension to the intrinsic `matmul`? For many mundane tasks, it might be more natural to have `y = matmul(A, x)` in the code rather than `call spmv(A, x, y, ...)`.
> On a different note: the sparse matrix-vector product currently uses `call spmv`. What about exposing an extension to the intrinsic `matmul`? For many mundane tasks, it might be more natural to have `y = matmul(A, x)` in the code rather than `call spmv(A, x, y, ...)`.
Sure, it might be interesting to create a wrapper interface on top of the subroutine to have a `matmul` equivalent. Personally, though, I avoid using `matmul` and, for that matter, any function signature implying assignment to an array lhs, as it is difficult to control whether the compiler recycles the memory of the lhs or creates a temporary array. Subroutines allow better control in that regard. At the same time, having these interfaces might also be a good case study for such performance problems.
All tests are now passing. It was simply the `all(y1 == y2)` check which was too restrictive. Apparently, bit reproducibility is possible with gfortran 14.2.0 but not with the others; there is some floating-point error accumulation (which actually makes sense, I suppose).
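For context, here is the classic illustration of why exact equality on floating-point results is too strict, while a tolerance-based comparison (in the spirit of `all_close`) is robust to accumulation-order differences:

```python
# Results that are mathematically equal can differ in the last bits
# depending on how the rounding errors accumulate.
a = 0.1 + 0.2
b = 0.3
print(a == b)               # bitwise equality fails
print(abs(a - b) < 1e-12)   # tolerance-based comparison passes
```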
> Sure, it might be interesting to create a wrapper interface on top of the subroutine to have a `matmul` equivalent. Personally, though, I avoid using `matmul` and, for that matter, any function signature implying assignment to an array lhs, as it is difficult to control whether the compiler recycles the memory of the lhs or creates a temporary array. Subroutines allow better control in that regard.
So do I. I think adding a `matmul` is more of a sugar-coating thing than performance-related. I kind of see this as `np.dot(A, x)` vs `A @ x` in NumPy (although they point to the exact same function in this case): not quite recommended if you want performance, but more natural/readable for a simple piece of code.
PS: Now that the tests are passing, I'll start working on the docs and examples tomorrow.
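The sugar-coating analogy can be made concrete with a small Python sketch (hypothetical class, for illustration only): an operator form that merely wraps the subroutine-style kernel, which writes into a caller-provided output array.

```python
import numpy as np

class Tridiagonal:
    """Hypothetical sketch of the matmul-over-spmv sugar being discussed."""
    def __init__(self, dl, dv, du):
        self.dl, self.dv, self.du = map(np.asarray, (dl, dv, du))

    def spmv(self, x, y):
        # Subroutine-style kernel: writes into a caller-provided y,
        # giving the caller full control over the output's memory.
        y[:] = self.dv * x
        y[1:] += self.dl * x[:-1]
        y[:-1] += self.du * x[1:]

    def __matmul__(self, x):
        # Sugar: allocates the result, then calls the same kernel.
        y = np.empty_like(x)
        self.spmv(x, y)
        return y

A = Tridiagonal([1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0])
x = np.ones(3)
print(A @ x)   # same result as the explicit spmv call
```

This mirrors the `np.dot` vs `@` situation: the operator form is more readable, but the subroutine form avoids the extra allocation.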
> it might be more natural to have `y = matmul(A, x)`
I've started a discussion on this at #951, great work @loiseaujc!
Oh, I found the initial design, as an extension of the sparse module, to be the ideal proposal. It would make it possible to improve the global structure by having different use cases emerge from it.
> Oh, I found the initial design, as an extension of the sparse module, to be the ideal proposal. It would make it possible to improve the global structure by having different use cases emerge from it.
I had the same idea initially, but things started to get messy when working on the documentation. Moreover, if `Circulant`, `Toeplitz` and `Hankel` matrices eventually make their way into stdlib, these are not actually sparse matrices. I figured it made more sense to have a dedicated module grouping all the highly structured matrices. Additionally, very efficient linear solvers exist for all of these highly structured matrices, which is not the case for general sparse ones. I can still define `Tridiagonal` and the likes as an extension of `sparse_type`, though, to make them compatible.
Another possibility is to keep Tridiagonal, SymTridiagonal, etc in the sparse module, and only have Circulant, Toeplitz, and Hankel in the newly created stdlib_specialmatrices one. I'm all open.
> I had the same idea initially, but things started to get messy when working on the documentation.
I'm curious about this. Would you mind letting me know why you found it complicated? I would definitely like to help make it easier to contribute/build on top of it. If it can be improved, let's try to do it :)
> Another possibility is to keep Tridiagonal, SymTridiagonal, etc. in the sparse module, and only have Circulant, Toeplitz, and Hankel in the newly created stdlib_specialmatrices one. I'm all open.
Sounds about right!
Well, it is not so much the process of documenting as having a somewhat consistent documentation when you start planning for the next set of PRs. Take the `Tridiagonal` matrix as an example. Here is a near-complete list of mathematical features or fast algorithms that, to some extent, set them apart from general sparse matrices (in no particular order):
- The set of `n x n` tridiagonal matrices actually forms a `3n-2`-dimensional vector subspace, i.e. any linear combination of tridiagonal matrices results in a tridiagonal matrix. This is not the case for general sparse matrices, where the sparsity pattern of the result will be the union of the sparsity patterns of each element of the linear combination.
- The `spmv` kernel for the tridiagonal matrix-vector product actually relies on `lagtm`, provided directly by `lapack`, rather than hand-crafted kernels like the ones you implemented in `stdlib_sparse_spmv`.
- The determinant can be computed in $\mathcal{O}(n)$ operations using a three-term recurrence, rather than the standard $\mathcal{O}(n^3)$ for other matrices (including sparse ones, I believe).
- Linear systems $Ax = b$ can be solved exactly, again in $\mathcal{O}(n)$ operations, using the Thomas algorithm, compared to standard iterative solvers or specialized sparse direct ones for general sparse matrices.
- For symmetric tridiagonal matrices, eigenvalues and eigenvectors can be computed exactly using `steqr` in `lapack`. In contrast, it is rare to perform a complete spectral decomposition of a general sparse matrix (one typically uses some form of Arnoldi iteration instead to get only the leading eigenpairs).
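For instance, the $\mathcal{O}(n)$ determinant computation mentioned above is the continuant recurrence $f_k = d_k f_{k-1} - l_{k-1} u_{k-1} f_{k-2}$, which can be sketched as follows (NumPy is used only to cross-check against a dense determinant; the values are made up):

```python
import numpy as np

def tridiag_det(dl, dv, du):
    """O(n) determinant of a tridiagonal matrix via the three-term
    (continuant) recurrence: f_k = dv[k]*f_{k-1} - dl[k-1]*du[k-1]*f_{k-2}."""
    f_prev, f = 1.0, dv[0]
    for k in range(1, len(dv)):
        f_prev, f = f, dv[k] * f - dl[k-1] * du[k-1] * f_prev
    return f

dl = np.array([1.0, 2.0]); dv = np.array([3.0, 4.0, 5.0]); du = np.array([6.0, 7.0])
A = np.diag(dv) + np.diag(dl, -1) + np.diag(du, 1)
assert np.isclose(tridiag_det(dl, dv, du), np.linalg.det(A))
```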
Modulo some variations in the algorithms, pretty much the same statements apply directly to `SymTridiagonal`, `Circulant`, `Toeplitz` and `Hankel` matrices: they all form $m$-dimensional vector subspaces (with $m \ll n^2$), they all have specialized algorithms for computing matrix-vector products or solving linear systems, and most also have specialized eig/svd algorithms. On a longer horizon, the same would also apply to block variants of these types.
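As an illustration of the $\mathcal{O}(n)$ solver mentioned above, here is a textbook Thomas algorithm sketch (assuming no pivoting is needed, e.g. a diagonally dominant matrix; the example values are made up):

```python
import numpy as np

def thomas_solve(dl, dv, du, b):
    """O(n) solve of a tridiagonal system A x = b (Thomas algorithm).
    Assumes the forward elimination never needs pivoting."""
    n = len(dv)
    c = np.empty(n - 1); d = np.empty(n)
    c[0] = du[0] / dv[0]
    d[0] = b[0] / dv[0]
    for i in range(1, n):                  # forward sweep
        m = dv[i] - dl[i-1] * c[i-1]
        if i < n - 1:
            c[i] = du[i] / m
        d[i] = (b[i] - dl[i-1] * d[i-1]) / m
    x = np.empty(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):         # back substitution
        x[i] = d[i] - c[i] * x[i+1]
    return x

# Diagonally dominant example, cross-checked against the dense residual.
dl = np.array([1.0, 1.0]); dv = np.array([4.0, 4.0, 4.0]); du = np.array([1.0, 1.0])
b = np.array([6.0, 12.0, 14.0])
A = np.diag(dv) + np.diag(dl, -1) + np.diag(du, 1)
x = thomas_solve(dl, dv, du, b)
assert np.allclose(A @ x, b)
```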
Given the similarities between all these types, having them inside the same `stdlib_specialmatrices` module made sense to me. From a "math" point of view, I kind of see them as being closer to one another than any of them is to general sparse matrices. It is easy to have consistent documentation across all the provided types, and this module could be used by itself. In contrast, the way I envision `stdlib_sparse` (but I may be wrong) is that it'll be used in conjunction with a potential `stdlib_sparse_linalg` (kind of what scipy is doing) or `stdlib_sparse_graph` if graph-theoretical functionality is being implemented.
I see! Makes sense!! I would envision that higher-level APIs should be unified with proper interfaces. Say, if one wants to call a generic `matmul` or a `solve` procedure, versions should be available for any matrix type defined in stdlib (dense, sparse, special). Instead of making the matrix structure per se the center of focus, it becomes a helper concept for efficient storage and procedure implementation.
If some of the APIs can be brought close in naming convention and usage regardless of the internal implementation details that would already be a good step forward.
> I see! Makes sense!! I would envision that higher-level APIs should be unified with proper interfaces. Say, if one wants to call a generic `matmul` or a `solve` procedure, versions should be available for any matrix type defined in stdlib (dense, sparse, special).
I guess we're on the same page then! That is pretty much what I tried in my personal SpecialMatrices implementation. All the matrix types actually extend the generic stdlib interfaces (e.g. `x = solve(A, b)`, `call eig(A, lambda, right, left)`, etc.) so that they just work whether $A$ is a rank-2 array, a `Tridiagonal` matrix, or a `Toeplitz` one, for instance.
> Instead of making the matrix structure per se the center of focus, it becomes a helper concept for efficient storage and procedure implementation.
Indeed. Sure enough, using the correct matrix structure for computational efficiency can be critically important, but at the end of the day I simply want to solve $Ax = b$, and having an `x = solve(A, b)` syntax conveys exactly that message. It doesn't really matter to many users how $A$ is actually represented behind the scenes.
> If some of the APIs can be brought close in naming convention and usage regardless of the internal implementation details that would already be a good step forward.
That might be easy to do simply by extending already existing interfaces. In an ideal world, however, I'd like these different definitions of matrices to be interoperable with packages other than stdlib. Say you have your own implementation of a particular sparse type, while I have a package doing graph-theoretic operations. And say stdlib provides an `AbstractSparseMatrix` (possibly itself extended from a more fundamental `AbstractMatrix`) from which your type is extended and for which I provide interfaces. How cool would it be for
call my_arbitrary_graph_operation( your_specific_sparse_matrix )
to just work, even though, on my side, I've never actually imported your package anywhere? I think, to some extent, this is what drives a large part of the interoperability between Julia packages. I don't think it'll be too hard (technically) to partially emulate this behaviour with well-designed fundamental derived types (abstract or not). This discussion, however, is beyond the scope of this PR; let's continue it on the Discourse.
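The dispatch-on-an-abstract-type idea can be mimicked in a short Python sketch (all names hypothetical): a generic operation written once against an abstract base works for any extension of it, even one defined in a package its author never imported.

```python
from abc import ABC, abstractmethod

# Hypothetical sketch of the AbstractSparseMatrix idea discussed above.
class AbstractSparseMatrix(ABC):
    @abstractmethod
    def nnz(self):
        """Number of structural non-zeros."""

# "My" package: written only against the abstract interface, knows
# nothing about any concrete matrix type.
def my_arbitrary_graph_operation(A: AbstractSparseMatrix):
    return A.nnz()

# "Your" package: extends the abstract type; the function above
# works on it without ever having seen this definition.
class YourTridiagonal(AbstractSparseMatrix):
    def __init__(self, n):
        self.n = n
    def nnz(self):
        return 3 * self.n - 2   # three diagonals

print(my_arbitrary_graph_operation(YourTridiagonal(5)))
```

In Fortran, the analogous mechanism would be polymorphic `class(...)` dummy arguments against an abstract derived type with deferred type-bound procedures.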
I think it is pretty much ready for review. There are just a couple of things I'm not sure how to do properly with fypp or ford:
- The `spmv` driver relies on `lagtm` behind the scenes and thus inherits the fact that, by default, the `lapack` backend is not compiled for extended and quadruple precision. At the moment I have a simple `#:if k1 != "qp" and k1 != "xdp"` wherever needed, but I figure this is not the most `fypp`-esque way to do so.
- In the `ford` documentation, to link the overloaded operator interfaces to the corresponding section of the specification page, I have `[[stdlib_specialmatrices(module):operator(+)(interface)]]`, which does not work. I'm not sure what the correct syntax is, and the `ford` documentation was not very helpful for this particular feature.
Regarding `lagtm`: if you check here https://github.com/fortran-lang/stdlib/blob/2bdc50ee09c90919804ef5b4e24a1eef5faa1d9f/src/lapack/stdlib_lapack_blas_like_l3.fypp#L214 you'll see that actually the single and double precision versions are taken from the reference implementations, and the "if not" implies that the other kinds are generated by the fypp templating extension, which means that you can actually use those for `xdp` and `qp`. If you link against an optimized library, which will most likely only support single and double, then those will be accelerated; `xdp` and `qp` will simply continue using the internal implementation.
Alright. I don't know why I thought extended and quadruple precision were not supported by default. Maybe I read it in an old topic about stdlib. Anyway, I've changed the code and everything runs fine with `xdp` and `qp`.
I think it is pretty much ready now. Let me know if you feel like other modifications should be made.
I'm thrown off by the latest CI failures with test-drive ...
```
[76/869] D:\a\_temp\msys64\mingw64\bin\gfortran.exe -ID:\a\stdlib\stdlib\build\_deps\test-drive-src\test -ID:/a/stdlib/stdlib/build/_deps/test-drive-build/include -fimplicit-none -ffree-line-length-132 -Wall -Wextra -Wimplicit-interface -fPIC -g -fcheck=all -fbacktrace -J_deps\test-drive-build\test -fpreprocessed -c _deps/test-drive-build/test/CMakeFiles/test-drive-tester.dir/test_select.F90-pp.f90 -o _deps/test-drive-build/test/CMakeFiles/test-drive-tester.dir/test_select.F90.obj
FAILED: _deps/test-drive-build/test/CMakeFiles/test-drive-tester.dir/test_select.F90.obj _deps/test-drive-build/test/test_select.mod
D:\a\_temp\msys64\mingw64\bin\gfortran.exe -ID:\a\stdlib\stdlib\build\_deps\test-drive-src\test -ID:/a/stdlib/stdlib/build/_deps/test-drive-build/include -fimplicit-none -ffree-line-length-132 -Wall -Wextra -Wimplicit-interface -fPIC -g -fcheck=all -fbacktrace -J_deps\test-drive-build\test -fpreprocessed -c _deps/test-drive-build/test/CMakeFiles/test-drive-tester.dir/test_select.F90-pp.f90 -o _deps/test-drive-build/test/CMakeFiles/test-drive-tester.dir/test_select.F90.obj
D:/a/stdlib/stdlib/build/_deps/test-drive-src/test/test_select.F90:156:22:

  156 |     call run_selected(stub_collect, "not-available", unit, stat)
      |                      1
Error: Interface mismatch in dummy procedure 'collect' at (1): ALLOCATABLE mismatch in argument '_formal_0'
```
I suspect that it is because one argument might be missing in the `check` call:

```fortran
call check(error, all_close(y1, y2))
```

try with

```fortran
call check(error, all_close(y1, y2), .true.)
```
this error https://github.com/fortran-lang/stdlib/pull/957#issuecomment-3012418465 is weird ... maybe @jvdp1 @perazz @zoziha @ivan-pi @certik would any of you have an idea?
> this error #957 (comment) is weird ... maybe @jvdp1 @perazz @zoziha @ivan-pi @certik would any of you have an idea?
There is an open issue at https://github.com/fortran-lang/test-drive/issues/49 .
It looks like this is a GCC-15 bug; it is triggered by `-Wall`. Until that is solved in test-drive, for fpm (https://github.com/fortran-lang/fpm/pull/1150), we worked around it by setting `-Wall -Wno-external-argument-mismatch`.
It's a hack, but it works. GCC<15 versions don't have that flag, but the "negative" form of the warning flag never triggers an error, so it can safely be used on all GCC versions.
> this error #957 (comment) is weird ... maybe @jvdp1 @perazz @zoziha @ivan-pi @certik would any of you have an idea?

> There is an open issue at fortran-lang/test-drive#49. It looks like this is a GCC-15 bug; it is triggered by `-Wall`. Until that is solved in test-drive, for `fpm` (fortran-lang/fpm#1150), we worked around it by setting `-Wall -Wno-external-argument-mismatch`. It's a hack, but it works. GCC<15 versions don't have that flag, but the "negative" form of the warning flag never triggers an error, so it can safely be used on all GCC versions.
This is a well-known bug that has been fixed in the meantime; there are four duplicate bug reports at GCC, e.g.: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119928
@loiseaujc merging the latest master branch should help resolve the crashes with gfortran 15.1 thanks to the changes introduced by @perazz in #1008.
Ok. Will do later today or tomorrow. Thanks guys.
Since this PR looks pretty much ready, I've started to work on the symmetric tridiagonal stuff. Using this one as starting material, it should be pretty quick. I'll send a new PR a couple of days after this one has been merged.
Yep, I thought about it afterward and figured it would probably be best but I hacked this commit up quickly while commuting. Will definitely go this route next time.
On Tue., Jul. 8, 2025 at 08:50, Federico Perini wrote:

> Federico Perini approved this pull request.
>
> Great! Thank you for implementing the comments @loiseaujc, LGTM.
>
> For future reference, a perhaps easier way to implement error handling without too much verbosity is to have a pure subroutine backend; the two function versions are then essentially just wrappers over it.
With two approvals, I suggest waiting another couple of days and then merging this PR, if there are no further comments.
Thanks for this nice PR @loiseaujc. Since there are no more comments I'll merge this one.