core.matrix icon indicating copy to clipboard operation
core.matrix copied to clipboard

ndarray: mmul doesn't preserve numeric types

Open mars0i opened this issue 11 years ago • 7 comments

ndarray matrices can hold numbers that are not Doubles--such as Longs and BigDecimals. However, mmul on matrices containing only Longs or BigDecimals matrices loses the original numeric type, and produces a matrix containing Doubles. I'm not sure what other operations have this effect on ndarray matrices.

user=> (class (mget (matrix [[1 2][3 4]]) 0 0))
java.lang.Long
user=> (class (mget (matrix [[1.0 2.0][3.0 4.0]]) 0 0))
java.lang.Double
user=> (class (mget (matrix [[1.0M 2.0M][3.0M 4.0M]]) 0 0))
java.math.BigDecimal
user=> (class (mget (mmul (matrix [[1.0M 2.0M][3.0M 4.0M]]) (matrix [1.0M 1.0M]))))
java.lang.Double
user=> (class (mget (mmul (matrix [[1M 2M][3 4]]) (matrix [1 1]))))
java.lang.Double

(The same experiments with persistent-vector preserve the Longs and BigDecimals.)

mars0i avatar May 07 '14 17:05 mars0i

I thought I'd see if I could find the source of the issue. I don't think I'll be able to resolve it--too many layers of definitions, types, etc. for me to figure out. However, maybe I've narrowed the source down to point someone else in the right direction. (Or maybe I have no relevant understanding of the code at all!)

First, the problem doesn't occur with add, sub, or emul, so it might be local to the implementation of mmul, i.e. matrix-multiply in clojure.core.matrix.impl.ndarray.

Near the end of the definition of matrix-multiply, the expression (* t (aget-2d b-data b-strides b-offset k j)) returns a BigDecimal when BigDecimal matrices are multiplied. However, the surrounding expression (aadd-2d c-data c-strides c-offset i j ...) returns a Double. I don't fully understand what's happening in the definition aadd-2d, but nothing there seems like an obvious cause of the Double. I believe it's the way that expose-ndarrays defines c-data that is the problem. I think that c-data derives from c, several lines above, as (empty-ndarray-zeroed ...). empty-ndarray-zeroed returns a matrix of Doubles, and I think that (aadd-2d c-data ...) sums that the Double 0.0's from empty-ndarray-zeroed with data from the original matrix arguments to mmul. That would generate Double output.

If that's all correct, I'm sure that someone else would do a better job of determining the best place to fix the problem. I can think of a couple of solutions, but there are probably much better ones.

mars0i avatar May 08 '14 04:05 mars0i

Cool thanks for digging in - that sounds like a very plausible explanation.

It's not totally clear to me what is the best approach to fix this. The issue seems to be with creating the right type of zero to use as the initial accumulator. Maybe we need some kind of generic zeroing mechanism that produces zeros of the right type?

mikera avatar May 08 '14 05:05 mikera

To determine the right type, the code would just look at the type of the <0, 0> element? To look at the type of every element seems too expensive. It would be easy to check the type of one element, and then if it's not a Double, write the correct kind of zeros into the accumulator matrix. Then as long as the user sticks to Doubles, they don't incur that extra cost. But maybe the right way to do it would use the alternative definitions defined by magic? I still don't understand that method well enough to mess around with it, although I understand the motivation.

Or: One could a version of aadd-2d that starts from a matrix initialized with the first stride(?), and then start adding into that. i.e. never use the zeros.

mars0i avatar May 08 '14 22:05 mars0i

Here's a slightly kludgey fix. Probably not the right way to do it, but it works. The code below checks the class of the first element of the first multiplied matrix. If this isn't Double, it re-fills the result matrix with a zero of a class appropriate to that first multiplied element. I don't think there's a significant cost if the multiplied matrices use Doubles. Note that if other elements of the multiplied matrix have a different type, normal Clojure numeric type contagion rules will apply.

The code snippet below is from the second half of the definition of matrix-multiply in ndarray.clj. The added part is the when-let expression in the middle lines.

            (expose-ndarrays [a b c]
              (let [a-rows (aget a-shape (int 0))
                    a-cols (aget a-shape (int 1))
                    b-rows (aget b-shape (int 0))
                    b-cols (aget b-shape (int 1))]
                (do (iae-when-not (== a-cols b-rows)
                      (str "dimension mismatch: "
                           [a-rows a-cols] "x" [b-rows b-cols]))
                    (when-let [zero (condp instance? (mp/get-0d a) ; 1st element in non-empty matrix of any shape
                                      java.lang.Long       0   ; what about ints, floats?
                                      clojure.lang.BigInt  0N
                                      clojure.lang.Ratio   0N
                                      java.math.BigDecimal 0M
                                      nil)]
                      (mp/fill! c zero))
                    (c-for [i (int 0) (< i a-rows) (inc i)
                            k (int 0) (< k a-cols) (inc k)]
                      (let [t (aget-2d a-data a-strides a-offset i k)]
                        (c-for [j (int 0) (< j b-cols) (inc j)]
                          (aadd-2d c-data c-strides c-offset i j
                                   (* t (aget-2d b-data b-strides b-offset k j))))))
                    c))))))))

I can submit a pull request if you want, but you may want to wait until someone has time to do it the right way. (I'll be able to use it for my own purposes, in any event.)

mars0i avatar May 09 '14 04:05 mars0i

At the moment, mmul for persistent vectors also does not preserve BigDecimals because of a recent performance optimisation for dot-products (I think here: 8a4ee0e).

(dot [1.0M 1.0M] [0.1M 0.2M]) ; => 0.30000000000000004

I think we need to make a choice about whether we want to preserve types in general before we look at ndarray in particular.

My suggestion would be to do so as it gives the user control over whether to go for accuracy with BigDecimals or speed by using another implementation that is optimised for primitive doubles. That does mean we probably can't avoid boxing in persistent vector implementations (not sure ?).

dmarjenburgh avatar Nov 09 '15 12:11 dmarjenburgh

I am reluctant to guarantee preservation of types in the API becasue I think that is an implementation-specific decision.

Would be OK to have the persistent vector implementation preserve types however. The numbers are all boxed anyway so performance isn't such a bit concern, people who want speed should use vectorz-clj or a native implementation

mikera avatar Nov 09 '15 12:11 mikera

Happy to take a PR that preserves types for the Clojure implementation

mikera avatar Nov 09 '15 12:11 mikera