core.matrix icon indicating copy to clipboard operation
core.matrix copied to clipboard

support the rank of a matrix

Open grtlr opened this issue 12 years ago • 13 comments

Maybe the API should support the rank of a matrix?

grtlr avatar Mar 08 '13 08:03 grtlr

Do you mean like the existing dimensionality function which is the same as "tensor rank"?

Or rank as in linear algenra: http://en.wikipedia.org/wiki/Rank_(linear_algebra)

If it is the latter then yes, I agree this is worth adding to the API. Though the name rank could cause confusion....

mikera avatar Mar 08 '13 09:03 mikera

Yes, I meant rank in terms of linear algebra. I agree that it could cause confusion with dimensionality. I think a detailed description in the documentation could take care of this.

grtlr avatar Mar 08 '13 10:03 grtlr

Cool, makes sense then. Yeah happy to include this as clojure.core.matrix/rank as long as the docs are clear

This will probably need:

  • A new protocol in clojure.core.matrix.protocols. Perhaps PMatrixRank
  • A "default" implementation in clojure.core.matrix.impl.default. Doesn't need to be fast, but this will ensure that implementations without specific support for the rank protocol will still fulfil the API contract.

mikera avatar Mar 08 '13 10:03 mikera

Ok cool. I can take a look into that topic and hopefully come up with rank implementation.

grtlr avatar Mar 08 '13 10:03 grtlr

BTW, the reason I've been holding off on stuff like this (which includes LU decomposition etc.) is that we haven't yet got a reasonably performant matrix implementation in core.matrix itself that we can use to build the default implementation.

Obviously something like vectorz-clj would make this easy, but we can't have that as a dependency for core.matrix itself.

So there are a few options around this to think through. I'll reference this issue on the Clojure Numerics discussion group to see if any of the others have good ideas.

mikera avatar Mar 08 '13 13:03 mikera

Should we think about writing an efficient matrix implementation? It feels like there is no real way around it?

grtlr avatar Mar 13 '13 12:03 grtlr

I'd be semi-tempted to extend the core.matrix protocols to double[][]. That would give us a performant matrix implementation that can be used for linear algebra algorithms plus some extra Java interop ability (for people who use double[][] in other code).

It would be mostly like the current double[] implementation in double-array.clj

mikera avatar Mar 13 '13 16:03 mikera

Other option would be something like:

(deftype DoubleMatrix [^int rows ^int columns ^doubles data]
  .....)

mikera avatar Mar 14 '13 01:03 mikera

I'm pretty sure that last option is the correct way of doing it. You want a single flat array of data -- that's the way numpy does it too. If you have a 2-d double array, what you've really got is an array of double arrays. That's an extra pointer to chase when you want to look up an entry. So I'd expect the long flat array to be more performant. I've started working on something along these lines, because I wanted a high-performance matrix implementation in a separate project.

One small wrinkle: in java arrays are int addressed, so using this method we're limited to m*n = 2^(32 -1) entries. For doubles, that works out to ~ 17 GB of memory in a single matrix. I think that's an acceptable upper limit, because really, what are you doing with a dense matrix that size on a single machine, anyway?

heffalump avatar Mar 14 '13 01:03 heffalump

A nice thing about core.matrix is that we can have both :-)

The key point is which one we use to provide a base for implementing the numerical algorithms (i.e. it needs to become part of core.matrix proper)

I think I agree that a flat array-backed version is nicer (if nothing else, it allows easy wrapping of the array to provide views representation in other formats). The 17Gb limit is acceptable in my view: we can offer other matrix formats if pwople need something bigger.

Happy to take any patches that get us there. It's not a big priority for me, so won't be coding it myself right now

mikera avatar Mar 14 '13 03:03 mikera

We are getting away from the rank discussion somewhat: suggest we continue on the Numerical Clojure list if needed.

mikera avatar Mar 14 '13 03:03 mikera

An efficient implementation for core.matrix is also not a big priority for me at the moment.

As for the rank, I still think we should add it to interface, similar to the inverse operation. inverse isn't implemented in core.matrix yet (or not that I'm aware of), but I think it helps to define the interface.

grtlr avatar Mar 23 '13 21:03 grtlr

OK, I added a rank function and PMatrixRank protocol in latest develop. Ready for when we implement them.

mikera avatar Mar 24 '13 00:03 mikera