
Turn Scipy into a soft dependency

Open hameerabbasi opened this issue 7 years ago • 9 comments

Right now, we are only using Scipy in:

  • dot/tensordot
  • Conversion to/from scipy.sparse.spmatrix subclasses.

Is it still sensible to keep it as a hard dependency?

hameerabbasi avatar Jan 21 '18 19:01 hameerabbasi

In most applications that I care about, tensordot is critical. I'm happy to depend on scipy separately though.

Of course, we could always make our own CSR :)

mrocklin avatar Jan 21 '18 19:01 mrocklin

I'm not aware of how to do these things, so two questions come to mind:

  • CSR @ CSR, CSR @ Dense, and [svd|eig|solve](CSR) are usually BLAS accelerated. We can and should use those. I know those can be wrapped with Cython.
  • How do we support multiple separate BLAS libraries? Is it even possible/easy?

I have zero experience here.

hameerabbasi avatar Jan 29 '18 12:01 hameerabbasi

  • BLAS generally doesn't have L3 operations for sparse matrices. I notice that scipy.sparse makes reference to Eigen, so perhaps they're linking to a sparse implementation there. In practice though I'm not sure how much advantage you'll get from linking to an optimized library rather than writing it yourself, at least for matvec/matmul. Solve/svd/eig could easily be more complex though and might warrant relying on some other code. I recommend looking at the scipy.sparse solution, and then hunting around on a search engine for appropriate libraries.
  • If you do want to touch BLAS libraries then I recommend just using the Python linked functions in scipy.linalg.blas to start. These tend to link to a particular BLAS implementation. There is decent documentation online helping people to identify and change their linked implementation if desired.

mrocklin avatar Jan 29 '18 13:01 mrocklin

scipy.linalg.blas is great! It does exactly what I needed, without needing to link/compile BLAS myself!
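
For illustration, here is a minimal sketch of calling a BLAS routine through scipy.linalg.blas. The arrays and variable names are made up for the example; dgemm is the double-precision general matrix multiply that SciPy exposes from whichever BLAS it was built against.

```python
import numpy as np
from scipy.linalg.blas import dgemm  # double-precision general matrix multiply

# Two small dense matrices, chosen only for the example.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

# dgemm computes alpha * (a @ b) using the BLAS that SciPy is linked to,
# so no manual linking or compiling is needed on the caller's side.
c = dgemm(alpha=1.0, a=a, b=b)
```

The result should match a @ b from NumPy; the point is only that the BLAS entry point is reachable from Python.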

hameerabbasi avatar Jan 30 '18 09:01 hameerabbasi

SuiteSparse is generally considered to have state of the art implementations for sparse linear algebra, but it's GPL licensed so SciPy can't use it directly.

It looks like SciPy mostly uses SuperLU or ARPACK. For matmul, I agree with @mrocklin that you would probably do fine implementing it yourself with a fairly naive loop.
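
As a rough idea of what such a naive loop could look like, here is a pure-Python sketch of CSR @ dense. The function name csr_matmul_dense and the list-based layout are assumptions made for this example, not code from this library or from SciPy; a real implementation would likely be compiled (e.g. with Numba or Cython).

```python
def csr_matmul_dense(data, indices, indptr, dense):
    """Multiply a CSR matrix by a dense matrix given as a list of rows.

    data    -- nonzero values, stored row by row
    indices -- column index of each value in `data`
    indptr  -- indptr[i]:indptr[i + 1] is the slice of `data` for row i
    """
    n_rows = len(indptr) - 1
    n_cols = len(dense[0]) if dense else 0
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Visit only the stored (nonzero) entries of row i.
        for k in range(indptr[i], indptr[i + 1]):
            value = data[k]
            dense_row = dense[indices[k]]
            for c in range(n_cols):
                out[i][c] += value * dense_row[c]
    return out


# CSR form of [[1, 0, 2],
#              [0, 0, 3]]
data = [1.0, 2.0, 3.0]
indices = [0, 2, 2]
indptr = [0, 2, 3]
dense = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

result = csr_matmul_dense(data, indices, indptr, dense)
```

The work is proportional to the number of stored entries times the number of dense columns, which is why a simple loop is often good enough here.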

shoyer avatar Oct 15 '18 20:10 shoyer

The eventual plan here was to get CSF into this library, and turn it into just a container, then start adding algorithms like this to SciPy instead. :-)

hameerabbasi avatar Oct 15 '18 20:10 hameerabbasi

> I notice that scipy.sparse makes reference to Eigen, so perhaps they're linking to a sparse implementation there.

Nope, that's not the case. No external library is used there. And Eigen isn't used anywhere in the numpy/scipy ecosystem as far as I'm aware.

> scipy.linalg.blas is great! It does exactly what I needed, without needing to link/compile BLAS myself!

Yep, that's the way to go (or linalg.cython_blas / linalg.cython_lapack if needed). You definitely don't want to link directly to any BLAS.

> SuiteSparse is generally considered to have state of the art implementations for sparse linear algebra, but it's GPL licensed so SciPy can't use it directly.

scikit-umfpack does link to SuiteSparse, in case anyone needs that.

> The eventual plan here was to get CSF into this library, and turn it into just a container, then start adding algorithms like this to SciPy instead. :-)

Still +1 on that.

rgommers avatar Oct 15 '18 23:10 rgommers

This is a very old issue but I'll +1 it just for the signal boost. My use-case is that I'm trying to build a Docker image with minimal dependencies, and I was almost able to drop scipy (which is pretty big) but sparse pulled it back in.

jamestwebber avatar Aug 12 '22 16:08 jamestwebber

We have our own matmul routines now, so this should be fairly easy: just move the imports to where they're needed, drop the dependency, and add it to the test requirements.

hameerabbasi avatar Aug 12 '22 17:08 hameerabbasi