AxisArrays.jl
Sparse Arrays
@mbauman have you given any thought as to how sparse arrays would fit into this framework? I have a project that I've been working on that overlaps with this concept and I would like to help move it here so that people will actually use it :-)
Also, you should at least register this package. I didn't know about it until I looked into why you wanted the Colon change to happen...
Very interesting! No, I've not given sparse arrays any thought at all… but I'd be happy for AxisArrays to support such a use case. Especially since I think they'll work out of the box as it is. I'd be interested in hearing about your project and how you're thinking you might use AxisArrays.
Yes, I'm planning on registering and tagging this soon! I want to get at least a few of the "core" items on the roadmap checked off before doing so… hopefully by the weekend. You're not too far out of date — this has really only been around for little more than a week.
@mbauman I haven't looked at it in depth, but the main change would be to not assume dense labels for rows and columns. Some of the sparse matrices used for graph processing are huge, so you can't assume that the labels would have O(1) access. Another big thing would be the ability to iterate over and rename the row / column labels easily.
I think that is all very doable. Right now the axis selection is done by trait dispatch, so we could add a new "Sparse" trait that would find its indices in a different way.
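To make the trait-dispatch idea concrete, here is a rough, language-neutral sketch in Python rather than Julia. It is not the AxisArrays implementation: the names `DenseLabels`, `SparseLabels`, `label_trait`, and `find_index` are all hypothetical, and the rule that a dict means "O(1) lookup" while a sorted list means "search for it" is an assumption made purely for illustration.

```python
from bisect import bisect_left
from functools import singledispatch


class DenseLabels:
    """Hypothetical trait: labels support O(1) lookup."""


class SparseLabels:
    """Hypothetical trait: labels must be searched (here, a sorted list)."""


def label_trait(labels):
    # Assumed rule for this sketch: a dict maps label -> position directly,
    # anything else is treated as a sorted sequence of labels.
    return DenseLabels() if isinstance(labels, dict) else SparseLabels()


@singledispatch
def _find(trait, labels, key):
    raise TypeError(f"no lookup strategy for trait {type(trait).__name__}")


@_find.register(DenseLabels)
def _(trait, labels, key):
    # Direct hash-table lookup.
    return labels[key]


@_find.register(SparseLabels)
def _(trait, labels, key):
    # Binary search over sorted labels: O(log n) instead of O(1).
    i = bisect_left(labels, key)
    if i == len(labels) or labels[i] != key:
        raise KeyError(key)
    return i


def find_index(labels, key):
    # The caller never picks a strategy; the trait of the labels does.
    return _find(label_trait(labels), labels, key)
```

The point of the design is that adding a new kind of axis only requires a new trait and one new `_find` method; the indexing entry point stays unchanged.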
This is a good place to mention https://github.com/JuliaComputing/NDSparseData.jl. It's basically the same (or increasingly will be) as AxisArrays, except that it explicitly lists all indices, while AxisArrays gives you a product of index vectors. Basically just Zip vs. Product. It would be great to synchronize the APIs as much as possible, so you can just drop in one or the other based on sparsity. For example, NDSparseData could adopt the Axis type for dimension names; I currently have a Dimension type that plays a similar role. There is also an Interval type that does the same thing.
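As a small illustration of the Zip vs. Product distinction, the following Python sketch builds both kinds of index set from plain lists. It does not use either package's API; the label values and variable names are made up for the example.

```python
from itertools import product

# Hypothetical axis labels for a 2-D labelled array.
rows = ["a", "b", "c"]
cols = [10, 20]

# Product-style (dense): every combination of a row label and a column
# label is a valid index, so only the two axis vectors need to be stored.
dense_index = list(product(rows, cols))

# Zip-style (sparse): only explicitly listed (row, col) pairs are valid
# indices, stored as parallel columns of equal length.
stored_rows = ["a", "c", "c"]
stored_cols = [20, 10, 20]
sparse_index = list(zip(stored_rows, stored_cols))

print(len(dense_index))   # 6 index tuples from 3 + 2 stored labels
print(len(sparse_index))  # 3 index tuples from 3 + 3 stored labels
```

Every sparse index here is also a member of the dense product, which is why a shared notion of named axes could plausibly serve both representations.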
Some discussion here on aligning sparse and dense behaviors: https://github.com/JuliaComputing/NDSparseData.jl/issues/23