sparse
einsum implementation
Hi guys, do you have plans to add einsum to this library? I guess NumPy's einsum is implemented in C. There are einsum operations I cannot directly convert to tensordot calls. By the way, this library is really awesome and thanks for creating this.
I have no plans personally to add einsum, but it seems well within scope if you're interested in implementing it.
I realized that what I really need for my project is a highly parallelized einsum. I'm implementing it now as a stand-alone, and when I'm done I'll check whether it is a reasonable effort to integrate it with this library again. When it's done it will be open-sourced anyway. I'll report back when I've got something new.
@aremirata @bigerl I've found in my experience that a lot of the stuff that's done using einsum is doable using broadcasting, reductions and tensordot. Maybe you can try those now that they're added?
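As an illustrative sketch (my own examples, assuming only the public sparse and NumPy APIs), two common einsum patterns expressed with broadcasting, reductions and tensordot:
import numpy as np
import sparse

x = sparse.random((4, 5), density=0.3)
y = sparse.random((4, 5), density=0.3)
z = sparse.random((5, 3), density=0.3)

# 'ij,ij->i': elementwise (broadcasting) multiply followed by a reduction
row_dots = (x * y).sum(axis=1)
assert np.allclose(row_dots.todense(), np.einsum('ij,ij->i', x.todense(), y.todense()))

# 'ij,jk->ik': a plain contraction over the shared axis via tensordot
mat = sparse.tensordot(x, z, axes=([1], [0]))
assert np.allclose(mat.todense(), np.einsum('ij,jk->ik', x.todense(), z.todense()))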
I've written a very simple proof of concept for a sparse Einstein summation function: branch, diff
It's still very limited as it only works for two operands and doesn't support any of the fancy einsum formats like ellipses etc. Also, there are no tests yet.
Internally it uses tensordot and the fact that we can cheaply diagonalize axes on sparse arrays (see the docstrings for more explanation).
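As a rough illustration of why diagonalizing axes is cheap on sparse arrays (a hypothetical sketch, not the code from the branch; diagonal_axes is an invented helper): merging two axes of a COO array is just a mask over its stored coordinates.
import numpy as np
import sparse

def diagonal_axes(x, axis1, axis2):
    # Hypothetical helper: merge axis2 into axis1 by keeping only entries
    # whose coordinates agree on both axes -- a simple mask over the stored
    # coords, so the cost is O(nnz) regardless of the dense size.
    keep = x.coords[axis1] == x.coords[axis2]
    coords = np.delete(x.coords[:, keep], axis2, axis=0)
    shape = tuple(s for i, s in enumerate(x.shape) if i != axis2)
    return sparse.COO(coords, x.data[keep], shape=shape)

x = sparse.random((4, 4, 3), density=0.3)
out = diagonal_axes(x, 0, 1)
assert np.allclose(out.todense(), np.einsum('iij->ij', x.todense()))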
I have tested it against the following examples:
import numpy as np
import sparse

sA = sparse.random((5,), density=2/5)
sB = sparse.random((5,), density=2/5)
A = sA.todense()
B = sB.todense()
out = np.einsum('i,i->i', A, B)
sout = sparse.einsum('i,i->i', sA, sB)
assert np.allclose(out, sout.todense())
out = np.einsum('i,i->', A, B)
sout = sparse.einsum('i,i->', sA, sB)
assert np.allclose(out, sout)
sAA = sparse.random((3, 4, 5))
sBB = sparse.random((6, 4, 5))
AA = sAA.todense()
BB = sBB.todense()
out = np.einsum('fns,kns->fkns', AA, BB)
sout = sparse.einsum('fns,kns->fkns', sAA, sBB)
assert np.allclose(out, sout.todense())
I've written a few small issues and added some tests.
@nils-werner It'd be great to have this in, even with partial support. More features can always be added. We can document the missing features.
One minor thing: I'd really prefer to have this support any number of arrays rather than two.
I'm using opt_einsum with their sparse backend, but opt_einsum still wants an einsum operation in sparse. It works for particular einsum strings as per their sparse tests, but still relies on an einsum operation for higher dimensions / more inputs.
The einsum strings that are throwing errors are 'bz,az->abz', 'cz,bz,az->abcz' and 'dz,cz,bz,az->abcdz' (and so on; always N two-index operands contracted to one (N+1)-index output).
It'd be nice to have some concrete applications before we dive into this. I have a rough sketch of what a generalized implementation will look like, but it's way too complex for me to put into code at this point.
Here's a failing test: https://gist.github.com/stsievert/202a07432ce82579a449ad113b33ba4a. It takes the einsum string as input, but only fails for certain inputs. It's a modification of the opt_einsum sparse tests.
The application is with https://github.com/tensorly/tensorly/pull/64.
Okay, I’ll dive into it when I have time. This isn’t possible near-term though. You can look at the CSR matmul in the meantime and see if you can figure it out from that.
@stsievert your examples are mostly (all?) what I'd call a batched contraction - a 'normal' contraction with one index shared by every array. I wrote some code to do this for two arrays which I've just put in this gist here; the only additional backend requirement (on top of tensordot, transpose and reshape) is stack, which sparse indeed has.
To use it with many-arg optimized contractions with opt_einsum it would have to be added to the pairwise contractions opt_einsum identifies and dispatches to. This would be fairly simple I think (and a great addition!) but the details are probably best left to discuss in that repo.
Now, the implementation is a simple for loop, so if most of the computation is spent running over the batched dimension it might be pretty slow, but it might still be an alright starting point / stop-gap?
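For reference, here is a minimal sketch of the batched-contraction idea (my own illustration, not the code from the gist; batched_tensordot is a hypothetical helper), assuming the shared batch index is the leading axis of both operands:
import numpy as np
import sparse

def batched_tensordot(x, y, axes):
    # Sketch: treat axis 0 of both operands as a shared batch index,
    # contract each pair of slices with tensordot (``axes`` refers to the
    # slice axes, i.e. after the batch axis is removed), then stack the
    # per-batch results back along a new leading axis.
    slices = [sparse.tensordot(x[i], y[i], axes=axes) for i in range(x.shape[0])]
    return sparse.stack(slices, axis=0)

x = sparse.random((6, 3, 4), density=0.3)   # indices 'zij'
y = sparse.random((6, 4, 2), density=0.3)   # indices 'zjk'
out = batched_tensordot(x, y, axes=([1], [0]))   # 'zij,zjk->zik'
assert np.allclose(out.todense(),
                   np.einsum('zij,zjk->zik', x.todense(), y.todense()))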
@jcmgray I am doing scientific computation with an einsum contraction of the type ijsn,msk,tik,tjn->tim with dimensions t=10000, i=1600, j=1600, s=n=m=k=25. The number of elements in msk is huge, but when I use the sparse matrix there are only 19 or so nonzero components.
It would be very useful if your batched_contract code could be expanded to multiple tensor inputs and merged into sparse.
It would be great if a sparse einsum were implemented, or if any insight into this issue were given. Any news on the development of einsum?
I have recently stumbled across einops which already supports a wide range of frameworks. Maybe implementing an interface is a thing to consider?
None of those have sparse support, sadly.
I'd like to point out that without an einsum implementation one cannot directly leverage the full potential of this library when combined with xarray, even though xarray supports sparse containers. The reason is that xarray.dot internally uses an einsum dispatcher. In my opinion sparse and xarray are a match made in heaven, but the lack of support for xr.dot currently hampers one of the main uses of this wonderful combination.
I tried to work around this with xr.apply_ufunc with something like this:
xr.apply_ufunc(sparse.tensordot, a, b,
               input_core_dims=[['dim1', 'dim_0'], ['dim_0']],
               output_core_dims=[['dim_1']],
               dask='parallelized',
               output_dtypes=[b.dtype],
               kwargs=dict(axes=(-1, -1)),
               )
because sparse.dot uses the hardcoded tensordot(a, b, axes=(-1, -2)). But apparently tensordot silently squeezes dimensions, so this does not work either.
We are working on a re-write of PyData/Sparse in C++, that will, among other things, allow us to do blazing-fast einsums for sparse and dense arrays.
The only problem is that I can no longer do this at my day job, and it's highly theoretical, so progress is slow at the moment. Please follow along at https://github.com/hameerabbasi/xsparse.
Big :+1: to implementation of einsum in sparse. As pointed out by @smartass101, this would really help with a lot of use cases! I'm excited to see xsparse moving forward. 🚀
opt_einsum supports sparse
It does, but it still requires unary / binary (pairwise) einsum (rather than just tensordot) for some operations. If sparse supported einsum(eq, x) and einsum(eq, x, y) (which is much easier - indeed already implemented in an earlier comment?), then opt_einsum could handle all the higher order cases.
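To illustrate the division of labour, a hypothetical sketch (assuming opt_einsum's 'sparse' backend dispatches each pairwise step to sparse, and that sparse provides the pairwise einsum): opt_einsum breaks a many-operand expression into pairwise contractions, so sparse only ever sees the one- and two-operand cases.
import numpy as np
import sparse
import opt_einsum as oe

x = sparse.random((3, 10), density=0.4)   # 'cz'
y = sparse.random((4, 10), density=0.4)   # 'bz'
z = sparse.random((5, 10), density=0.4)   # 'az'

# opt_einsum finds a pairwise contraction order and hands each step to the
# sparse backend; 'cz,bz,az->abcz' is one of the strings above that needs
# a pairwise einsum rather than a plain tensordot.
out = oe.contract('cz,bz,az->abcz', x, y, z, backend='sparse')
dense = out.todense() if hasattr(out, 'todense') else out
assert np.allclose(dense,
                   np.einsum('cz,bz,az->abcz', x.todense(), y.todense(), z.todense()))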
I've put an einsum implementation in #564, if anyone is interested in testing it (either with regard to edge cases or performance).
I would also be interested in this feature, to be able to use sparse as an xarray backend. I have seen the pending pull request. Is there anything I can do to help get the pull request merged?
@HadrienNU I guess one thing would be to branch off the existing PR and help finish it off.
Sorry it fell to the bottom of my to-do list. As far as I'm aware all that's remaining is to add some missing tests for the added DOK.reshape method - the actual einsum is working well. I might get to it later this week.
Thanks for the quick answer. I have a bit of time next week, so I will try to see how to add those tests if you are not able to implement them before then.
#564 has been merged if anyone wants to try sparse.einsum out now. There are so many use cases (especially once you add sparsity and sparse format as additional parameters) that benchmarking is quite tricky. I'll just note, for possible future optimization, that an alternative method of performing the core pairwise contraction operation is to leverage batched matrix multiplication - I do have a general implementation of this that works for sparse too.
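For anyone trying it out, a quick sanity check along the lines of the earlier examples (sparse.einsum is the function added in #564; the arrays here are just illustrative):
import numpy as np
import sparse

x = sparse.random((3, 4, 5), density=0.2)
y = sparse.random((6, 4, 5), density=0.2)

# pairwise contraction keeping the shared 'n' and 's' indices in the output
out = sparse.einsum('fns,kns->fkns', x, y)
assert np.allclose(out.todense(),
                   np.einsum('fns,kns->fkns', x.todense(), y.todense()))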
Thank you very much, I will try this out
Resolved by #564, if anything is missing we can open new issues for it.