AxisArrays.jl

Sparse Arrays

Open jakebolewski opened this issue 10 years ago • 6 comments

@mbauman have you given any thought to how sparse arrays would fit into this framework? I have a project I've been working on that overlaps with this concept, and I would like to help move it here so that people will actually use it :-)

jakebolewski, Feb 26 '15 03:02

Also, you should at least register this package. I didn't know about it until I looked into why you wanted the Colon change to happen...

jakebolewski, Feb 26 '15 03:02

Very interesting! No, I've not given sparse arrays any thought at all… but I'd be happy for AxisArrays to support such a use case, especially since I think they'll work out of the box as it is. I'd be interested in hearing about your project and how you're thinking you might use AxisArrays.

Yes, I'm planning on registering and tagging this soon! I want to get at least a few of the "core" items on the roadmap checked off before doing so… hopefully by the weekend. You're not too far out of date: this has really only been around for a little more than a week.

mbauman, Feb 26 '15 04:02

@mbauman I haven't looked at it in depth, but the main change would be to not assume dense labels for rows and columns. Some of these sparse matrices are huge (for graph processing), so you can't assume that the labels would have O(1) access. Another big thing would be the ability to iterate over and rename the row/column labels easily.

jakebolewski, Feb 26 '15 05:02

I think that is all very doable. Right now the axis selection is done by trait dispatch, so we could add a new "Sparse" trait that would find its indices in a different way.
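
A rough illustration of the idea, written in Python rather than Julia and not taken from AxisArrays: a trait is chosen from the label container, and the index lookup dispatches on that trait. The names `DenseLabels`, `SparseLabels`, `label_trait`, and `find_index` are hypothetical, and using a dictionary for the sparse case is only one assumed way of "finding indices differently".

```python
from bisect import bisect_left
from functools import singledispatch


class DenseLabels:
    """Hypothetical trait: every position has a label, stored in a sorted list."""


class SparseLabels:
    """Hypothetical trait: only the labels that are present are stored,
    as a label -> position mapping."""


def label_trait(labels):
    # Assumption for this sketch: the container type decides the trait.
    return SparseLabels() if isinstance(labels, dict) else DenseLabels()


@singledispatch
def _find(trait, labels, key):
    raise TypeError(f"no lookup defined for trait {type(trait).__name__}")


@_find.register(DenseLabels)
def _(trait, labels, key):
    # Binary search over the sorted label list.
    i = bisect_left(labels, key)
    if i == len(labels) or labels[i] != key:
        raise KeyError(key)
    return i


@_find.register(SparseLabels)
def _(trait, labels, key):
    # Direct lookup in the mapping; raises KeyError if the label is absent.
    return labels[key]


def find_index(labels, key):
    """Look up the position of `key`, dispatching on the label trait."""
    return _find(label_trait(labels), labels, key)
```

With this shape, supporting another kind of label storage means adding one trait class and one `_find` method; callers of `find_index` stay unchanged.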

mbauman, Feb 26 '15 05:02

This is a good place to mention https://github.com/JuliaComputing/NDSparseData.jl. It's basically the same as AxisArrays (or increasingly will be), except that it explicitly lists all indices, while AxisArrays gives you a product of index vectors. Basically just Zip vs. Product. It would be great to synchronize the APIs as much as possible, so you can just drop in one or the other based on sparsity. For example, NDSparseData could adopt the Axis type for dimension names; I currently have a Dimension type that plays a similar role. There is also an Interval type that does the same thing.
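
To make the Zip vs. Product distinction concrete, here is a small sketch in Python (not Julia, and not the API of either package). All names and data are made up for illustration. In the "product" layout every combination of axis labels is an implied key. In the "zip" layout only the listed index tuples exist, and treating a missing entry as `0.0` is an assumption of this sketch.

```python
from itertools import product

# Product layout: one label vector per dimension; the keys are implied
# by the Cartesian product of the axes, and every cell is stored.
rows = ["a", "b"]
cols = [10, 20, 30]
dense_values = [
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0],
]
product_keys = list(product(rows, cols))  # all 2 * 3 combinations


def product_lookup(r, c):
    # Find the position along each axis independently, then index.
    return dense_values[rows.index(r)][cols.index(c)]


# Zip layout: parallel index columns list only the entries that exist.
idx_rows = ["a", "b"]
idx_cols = [10, 30]
stored_values = [1.0, 2.0]
zip_keys = list(zip(idx_rows, idx_cols))  # one key per stored entry
zip_table = dict(zip(zip_keys, stored_values))


def zip_lookup(r, c, default=0.0):
    # Entries that are not listed fall back to the assumed default.
    return zip_table.get((r, c), default)
```

Both lookups answer the same question for a `(row, column)` pair, which is why a shared API across the two layouts is plausible; they differ in how many keys are stored and in how a key is located.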

JeffBezanson, Jul 27 '16 00:07

Some discussion here on aligning sparse and dense behaviors: https://github.com/JuliaComputing/NDSparseData.jl/issues/23

JeffBezanson, Jul 27 '16 01:07