igraph
igraph copied to clipboard
Replace ARPACK
Possible solutions to look at:
- arpack-ng https://github.com/pv/arpack-ng
- feast http://www.ecs.umass.edu/~polizzi/feast/features.htm
- eigen http://eigen.tuxfamily.org/index.php?title=Main_Page
- more ??? check http://www.grycap.upv.es/slepc/documentation/reports/str6.pdf
Cons:
- arpack-ng is the same codebase, just repackaged?
- feast does not have a non-symmetric solver
- eigen is C++ templates
- eigen is MPL 2.0, which is somewhat problematic. (If we do not change anything in the source, then eigen files will be dual licensed, if we change something in it, the same happens, but care must be taken not to put any (L)GPL code in these files.) Actually, eigen might not even have sparse eigensolvers....
The software from the paper:
- Anasazi, part of a big software package
- Blklan, not sure how this is relevant, but seems symmetric only
- Blopex, symmetric only
- Blzpack, symmetrix only
- Eigifp, symmetric only, plus code is in Matlab.
- Ietl, symmetric only
- Insynlan, Matlab code, seems symmetric only
- Irbleigs, does have a non-symmetric solver, but the code is Matlab.
- Jadamilu, symmetric only
- Jdcg, symmetric only, code in Matlab
- Mpb, eigensolver is a small part only, looks symmetric only.
- Pdacg, symmetric only.
- Primme, symmetric only.
- Propack, SVD only, but seems good for that. Not developed since 2004.
- PySparse, maintained (!), but symmetric only.
- SLEPc, this is a big package with many solvers, need more time to look over it. It can also call other solvers, e.g. ARPACK as well. It is too big to be included in igraph.
- Spam, cannot find it anywhere.....
- Trlan, symmetric only
- Arncheb, non-symmetric, specifically, but cannot find source code or license
- ARPACK, unstable for non-symmetric problems.
- ARPACK++, just an interface to ARPACK
- Dvdson, symmetric only
- Lanczos, symmetric only
- Lanz, symmetric only
- Laso, symmetric only
- Lopsi, this is for non-symmetric matrices and has unmaintained fortran code, without a license.
- Jdqr/Jdqz, also has Fortran code for the non-symmetric case, license not compatible, non-commercial.
- Na18, symmetric only.
- Napack, this does not seem to be sparse
- Qmrpack, this has a non-symmetric solver, but license is non-commercial only.
- Srrit, non-symmetric as well, approximate solution, only largest m eigenvalues.
- Svdpack, it license is not compatible.
- Underwood, symmetric only
I'll take a look at the LOPSI code. I doubt that it is a good idea to include it, though. I don't know what it does and it is not maintained.
LOPSI has a bug when multiple eigenvalues/vectors are requested, according to "An Evaluation of Software for Computing Eigenvalues of Sparse Nonsymmetric Matrices". The SRRIT code is also available, actually, and they say it is good for multiple/clustered eigenvalues, but it is very slow. So maybe we could just use LOPSI as a default if a single eigenvalue/vector is needed, with optional support for SRRIT.
Btw, there are problems even with the symmetric ARPACK solver: #589.
The ARPACK Fortran code in rARPACK seems to work better, so it is worth just trying to take their ARPACK version.
Also, http://www.math.uri.edu/~jbaglama/ has some Matlab code irlba is based on, so it is better to rewrite the original Matlab code into C. They also have a non-symmetric solver. The code is fairly involved, though.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Reasons for moving away from ARPACK: #544, #1469, #1641 .
FEAST now supports non-symmetric problems, and seems like a realistic alternative. http://www.feast-solver.org/
seems like a realistic alternative
But it is part of Intel oneMKL, so "not allowed" on non-Intel. Also is not it closed source, is not it a problem?
The one I linked to appears to be BSD-licensed and presumably portable Fortran code.
It seems promising, but FEAST does not seem to work on macOS (Apple M1) yet; I tried this in the src/ folder:
$ make ARCH=x64 F90=gfortran MKL=no feast
gfortran -O3 -fopenmp -ffree-line-length-none -ffixed-line-length-none -cpp -c /Users/tamas/Downloads/FEAST/4.0/src/kernel/dzfeast.f90 -o /Users/tamas/Downloads/FEAST/4.0/src/kernel/dzfeast.o
/Users/tamas/Downloads/FEAST/4.0/src/kernel/dzfeast.f90:917:82:
650 | call DGEMM('T','N',fpm(23),fpm(23),N,-DONE,work,N,q,N,DZERO,Aq,M0) ! projection
| 2
......
917 | call DGEMM('N','N',N,fpm(25),fpm(23),DONE,work(1,1),N,Aq(1,fpm(24)),M0,DZERO,workc(1,fpm(24)),N)
| 1
Error: Type mismatch between actual argument at (1) and actual argument at (2) (COMPLEX(8)/REAL(8)).
/Users/tamas/Downloads/FEAST/4.0/src/kernel/dzfeast.f90:944:87:
839 | call DLACPY( 'F', fpm(23), fpm(23),Bq, M0, Bqo, fpm(23) )
| 2
......
944 | if ((fpm(5)==1).and.(loop==0)) call DLACPY( 'F', N, fpm(25),work(1,fpm(24)) , N, workc(1,fpm(24)), N )
| 1
Error: Type mismatch between actual argument at (1) and actual argument at (2) (COMPLEX(8)/REAL(8)).
make: *** [/Users/tamas/Downloads/FEAST/4.0/src/kernel/dzfeast.o] Error 1
The default make invocation tries to use ifort (which I don't have), it may be the case that it works with ifort. I'm not well-versed in Fortran so I'd rather not attempt debugging what's going on yet.
ifort is Intel Fortran, I believe.
I did wonder about compatibility with arm64, precisely because of the association with Intel, so I checked if MacPorts has arm64 binaries, and it does: https://ports.macports.org/port/feast/details/ They don't seem to use any obvious patches either: https://github.com/macports/macports-ports/blob/master/math/feast/Portfile I can try to take a look at how it works another day, but not now.
ifort was renamed to ifx. https://en.wikipedia.org/wiki/Intel_Fortran_Compiler
Anyway, I think Intel compiler will not allow to compile FEAST for non intel, thus aarch64 is a no-no. Mathematica also had to remove FEAST from M1 Macbooks. https://mathematica.stackexchange.com/questions/256740/library-for-feast-method-is-missing
You can patch oneMKL to work on AMD and ARM...
@ntamas MacPorts uses the gfortran option -fallow-argument-mismatch, then it works. I edited the Makefile to add this option, but there should be better ways.
Here's a recent paper comparing various eigensolvers:
https://n.ethz.ch/~gtzounas/pap/eigmethods.pdf https://doi.org/10.3390/app10217592
Summary of features:
Based on this table, we should investigate SLEPc and z-PARES.
We absolutely need:
- Non-symmetric problems, and these all have it
- A reverse communication interface in order to use our own graph data structure directly. This is indicated as "RCI" in the table above.
SLEPc appears to be designed for massively parallel use, with MPI. It's unclear if it's also realistic for solving smaller problems without parallelization, and whether it can be compiled with MPI. If not, that would be a dealbreaker. Looking at how this is packaged in MacPorts, SLEPc depends on PETSc, which depends on MPI and BLAS/LAPACK, so there are not a lot of dependencies (apart from MPI), but it may still be a heavyweight, too large to vendor. It seems that this might be buildable without Fortran, but it's not completely clear.
Refs:
- https://slepc.upv.es/
- https://petsc.org/
- PETSc installation (hinting at dependencies): https://petsc.org/release/install/install_tutorial/#qqtw-quickest-quick-start-in-the-west
License is BSD.
Last release in 2023, it's actively developed.
Update:
From https://petsc.org/release/install/install/#external-packages :
BLAS/LAPACK is the only required external package (other than of course build tools such as compilers and make). PETSc may be built and run without MPI support if processing only in serial.
Presumably the same applies to SLEPc.
Still, it looks like a heavyweight. The question is if it can be made small enough when we only need the eigensolver.
z-PARES is also focused on parallelization with MPI, but it is stated on their homepage that serial operation is also supported. It is written in Fortran 90/95, so it cannot be f2c'd.
First and last release both in 2014, which is not encouraging.
It seems like adopting FEAST would basically mean committing ourselves to requiring a Fortran compiler at build time; FEAST does not seem to be packaged in Debian/Ubuntu yet and it has no formula in Homebrew either (but it's at least in MacPorts). SLEPc has Ubuntu packages, but it depends on ARPACK so I wonder whether we would gain anything by switching to SLEPc if the ultimate goal is to get rid of ARPACK.
I'd dismiss z-PARES as well due to lack of development activity.
It seems to me that all ARPACK alternatives come with significant drawbacks, and none are realistically vendorable. Replacement doesn't seem to make sense in the short term. But it may be a good to think about paving the way, i.e. making the interface such that multiple eigensolver back-ends could be supported. For now I'd just consider how such an interface would look like, but I wouldn't put too much time into it.
I consider it an important feature of igraph that it can be compiled without external dependencies, i.e. that all critical dependencies are vendored. We need to keep a close relationship to our true userbase, which is researchers, not software developers.
There is also https://spectralib.org/
Spectra claims to re-implement the algorithms of ARPACK as a header-only C++ library. It depends only on Eigen. It seems to support the operations we need.
Reasons against using it:
- The Eigen dependency makes it a heavyweight.
- Recent Eigen uses
aligned_allocwhich is not available on macOS 10.14 and earlier. There may be workarounds, this is unclear to me. I had a bad experience where I found out thatlibfivecannot be compiled with Eigen 3.4 on macOS 10.14 because of this limitation. - It is written and maintained by a single person, which makes it high-risk. It is included in Debian, but not in Arch or Fedora (most Linux distros are derived from one of these three). https://repology.org/project/spectra/versions
There are more features that ARPACK provides that we rely on, and should be available in any replacement library. For example, ARPACK allows for finding the eigenvalue with the largest real part, instead of the largest magnitude. We rely on this for PageRank to find the eigenvector corresponding to the eigenvalue 1, and not -1, even though both may be present.
Having a common interface to multiple solvers may be problematic, as selecting the desired eigenvalue would be achieved in a different manner with each.
It is more realistic to look for libraries that implement the same algorithm as ARPACK, or related algorithms, and are preferably written in C without heavy dependencies. Spectra would be a candidate if it didn't have the Eigen dependency.
Another candidate is ANASAZI, see https://github.com/trilinos/Trilinos/tree/master/packages/anasazi and https://trilinos.github.io/anasazi.html
It seems like it provides most features we need: symmetric and non-symmetric problems, the same kind of eigenvalue selection as ARPACK, written in C++. It's not entirely clear if it can be made dependency-free, but it looks promising.
It should be possible to operate with the graph data structure directly (like ARPACK's reverse communication interface):
Interoperability is enabled via the treatment of both matri- ces and vectors as opaque objects—only knowledge of the matrix and vectors via elementary operations is necessary. This permits algorithms to be implemented in a generic manner, requiring no knowledge of the underlying linear algebra types or their specific implementations.
However, it may be more trouble to set it up:
An inspiration for Anasazi is the ARPACK [Lehoucq et al. 1998] FORTRAN 77 software library. ARPACK implements one algorithm, namely an implicitly restarted Arnoldi method [Sorensen 1992]. In contrast, Anasazi provides a software framework, including the necessary infrastructure, to implement a variety of algo- rithms. Anasazi is an extensible framework because the necessary linear algebra infrastructure is made independent of the algorithms used for the numerical solution of large-scale eigenvalue problems. We justify our claims by implementing block variants of three popular algorithms: a Davidson [Davidson 1975] method, a Krylov- Schur [Stewart 2001a] method, and an implementation of LOBPCG [Knyazev 2001].
These quotes are from the paper, https://doi.org/10.1145/1527286.1527287
More info: https://docs.trilinos.org/dev/packages/anasazi/doc/html/index.html
It appears to continue to be maintained, though at a minimal level.