SparseArrays.jl Common abstractions for nonstructural indicies between Dense and Sparse Matrices

indices is our abstraction for getting the indices so one can iterate the index of a matrix. It works great for dense matrices.

However, for sparse matrices, it is almost never what you want to do. You instread only want to iterate the indicies of the nonstructural elements. (because there are less of them) That is done using rowvals, and nzrange

In a dense matrix one can say that all the elements are nonstructural. So in a circumstance when you only want to iterate through the nonstructural elements, you want to iterate through all the elements in a dense matrix. More generally with a matrix of unknown type, you want to iterate through all the elements.

I thus propose that we should have an abstraction for getting the nonstructural indicies, which falls back to getting all the indiices, to make it easier to write code that works efficiently on spare matricies, and also works (as fast as is possible) on other types.

Some thing like

const SparseMatrix = AbstractSparseArray{<:Any,<:Any,2}
colinds(A::AbstractMatrix)  = indices(A,2)
colinds(A::SparseMatrix) = rowvals(A)

rowinds(A::AbstractMatrix, col::Integer) = indices(A,1)
rowinds(A::SparseMatrix, col::Integer) = nzrange(A, col)

(bikeshed on names pending)

There is also a similar relationship between nonzeros (sparse) and vec (dense/fallback).

I was discussing this on slack with @mbauman and @StefanKarpinski the other day, and wanted to put it on GitHub before it was lost to the ages

Mar 26 '18 02:03 oxinabox

Somewhat related is https://github.com/JuliaLang/julia/issues/15648, but I like how this turns the problem on its head: instead of trying to wedge sparse matrices into a generic API, we simply "lift" dense array support onto abstract sparse APIs. This seems far more tractable.

Some of this abstract sparse API is already in place for broadcasting; it'd be interesting to see what the performance difference would be were we to emulate sparse APIs for structured and dense matrices instead of converting them to actual AbstractSparseArrays.

Mar 26 '18 04:03 mbauman

Oh I think this is row I think nzrange doesn't quiet do this? But the point stands

Mar 26 '18 04:03 oxinabox

I think what I ment to have is:

colinds(A::AbstractMatrix)  = indices(A,2)

rowinds(A::AbstractMatrix, col::Integer) = indices(A,1)
rowinds(A::SparseMatrixCSC, col::Integer) = rowvals(A)[nzrange(A, col)]

Mar 26 '18 05:03 oxinabox

That API seems to be tied pretty closely to CSC layout. How about an each_nonstructural_index (name subject to bikeshedding, of course)? Iterating over its return value would need to produce indices that allow efficient getindex but are still (reasonable efficiently) convertible to LinearIndex/CartesianIndex if they aren't one of those already.

So the following should be reasonably efficient and leave A==B:

B = zeros(size(A))
for i in each_nonstructural_index(A)
    B[LinearIndices(A)[i]] = A[i]
end

Mar 26 '18 09:03 martinholters

you're right, my proposal would not work for say CRC. Or any row major format really. But we do not have such a thing. So maybe it does not matter.

Take a look at the example of use in the Chinese Whisphers clustering. Does that translate sanely to something like colinds, rowinds = zip(each_nonstructural_index(A)...) ?

(I've not thought too hard if it does or not. I think it does not -- at least not that simply)

Mar 26 '18 10:03 oxinabox

it'd be interesting to see what the performance difference would be were we to emulate sparse APIs for structured and dense matrices instead of converting them to actual AbstractSparseArrays.

This was my eventual hope with the sparse broadcast infrastructure. Perhaps in a few months... :)

Mar 27 '18 19:03 Sacha0

Somewhat related is JuliaLang/julia#15648, but I like how this turns the problem on its head: instead of trying to wedge sparse matrices into a generic API, we simply "lift" dense array support onto abstract sparse APIs. This seems far more tractable.

For what it's worth, I came up with the same idea/need when trying to implement efficient sparse-sparse, dense-sparse, and sparse-dense outer products for my own needs (and tried to make an attempt at something committable in JuliaLang/julia#24980, but that's obviously fallen far behind in all the linear algebra and broadcasting changes since then).

So I guess with respect to

That API seems to be tied pretty closely to CSC layout. How about an each_nonstructural_index (name subject to bikeshedding, of course)?

I'm in favor of having a portable way of doing CSC-like operations on dense arrays given the fact that CSC-aware algorithms can easily consume dense arrays that way for free (and then a more generic sparse infrastructure can be added later).

Mar 29 '18 02:03 jmert

I guess we can have

Maximally abstract: indicies, and maybe a nonstructual_indicies like @martinholters suggests
Moderately abstract Column major methods that I proposed above (works for CSC/Dense)
- (Possibly could exend those with a flag that makes them work row major? Or just define then names to not say row or column...)
Minimally abstract: nzrange, rowvals (only CSC)

Mar 29 '18 04:03 oxinabox

SparseArrays.jl SparseArrays.jl copied to clipboard

Common abstractions for nonstructural indicies between Dense and Sparse Matrices

SparseArrays.jl
SparseArrays.jl copied to clipboard