core.matrix

reshape performance very bad with large double arrays

Open cnuernber opened this issue 9 years ago • 2 comments

(defn reshape-time-test
  []
  (let [n-rows 100
        n-cols 1000
        src-array (double-array (* n-rows n-cols))]
    (println "reshape time")
    (time (dotimes [idx 100]
            (m/reshape src-array [n-rows n-cols])))
    (println "c-for time")
    (time (dotimes [idx 100]
            (let [^"[[D" dest (make-array Double/TYPE n-rows n-cols)]
              (c-for [row 0 (< row n-rows) (inc row)]
                     (java.lang.System/arraycopy src-array (* row n-cols) (get dest row) 0 n-cols)))))))


(reshape-time-test)
reshape time
"Elapsed time: 174760.275438 msecs"
c-for time
"Elapsed time: 19.301593 msecs"
nil

cnuernber, Oct 14 '16 18:10

For sanity's sake you may want to try with counts of 10 instead of 100.

I researched this a bit and found that the cause is likely two things:

First, aset-double is doing reflection ... and that is in core.clj of Clojure itself. Second, I believe (mp/get-2d data i j) is doing nth on an array, which apparently is quite slow.
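To illustrate the first point, here is a minimal sketch of element-wise copying with type hints, so that aget and aset compile to direct primitive array access rather than going through reflection. The helper name `copy-row!` is hypothetical and is not part of core.matrix. Setting `*warn-on-reflection*` to true at the REPL is one way to check whether a given call site is reflective.

```clojure
;; Sketch only: `copy-row!` is a hypothetical helper, not a core.matrix function.
(defn copy-row!
  "Copies n-cols doubles from the flat array src, starting at offset
  (* row n-cols), into dest-row. Returns dest-row."
  [^doubles src ^doubles dest-row ^long row ^long n-cols]
  (let [offset (* row n-cols)]
    (dotimes [j n-cols]
      ;; With both arrays hinted as ^doubles, aget and aset are inlined
      ;; to primitive array operations and need no reflection.
      (aset dest-row j (aget src (+ offset j))))
    dest-row))
```

For example, `(vec (copy-row! (double-array (range 6)) (double-array 3) 1 3))` copies the second row of a 2 x 3 layout out of the flat source.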

I am running into this while importing VGG16 into cortex from Keras.

cnuernber, Oct 14 '16 18:10

So the problem here is fundamentally that we don't yet have a reshape operation for the :double-array implementation. Hence it is falling back to a default implementation, which certainly isn't optimised for the double-array case.

I'll take a look and see if I can optimise this at all.
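As a rough idea of what a specialised path for flat double arrays could look like, here is a minimal sketch for the 2D case only. The function name `reshape-2d` is hypothetical, and this is not how core.matrix implements reshape. It assumes a row-major source and returns a plain double[][], copying one row at a time with System/arraycopy as in the c-for benchmark from the original report.

```clojure
;; Sketch only: `reshape-2d` is a hypothetical helper, not core.matrix's reshape.
(defn reshape-2d
  "Copies a flat, row-major double array into a new n-rows x n-cols double[][]."
  [^doubles src ^long n-rows ^long n-cols]
  (when-not (== (alength src) (* n-rows n-cols))
    (throw (IllegalArgumentException.
            "element count does not match target shape")))
  (let [^"[[D" dest (make-array Double/TYPE n-rows n-cols)]
    (dotimes [row n-rows]
      ;; Each destination row is one contiguous block of the source,
      ;; so a single bulk copy per row is enough.
      (System/arraycopy src (* row n-cols) (aget dest row) 0 n-cols))
    dest))
```

Unlike m/reshape, this sketch rejects a source whose element count does not match the target shape and handles only two dimensions; a real implementation would need to cover arbitrary shapes.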

In the meantime, the obvious solution is just to use an implementation that plays nicely with Java double arrays:

(defn reshape-time-test
     []
     (let [n-rows 100
           n-cols 1000
           src-array (double-array (* n-rows n-cols))]
       (println "reshape time")
       (time (dotimes [idx 100]
               (m/reshape src-array [n-rows n-cols])))
       (println "vectorz time")
       (time (dotimes [idx 100]
               (m/reshape (array :vectorz src-array) [n-rows n-cols])))))
#'mikera.vectorz.matrix-api/reshape-time-test
=> (reshape-time-test)
reshape time
"Elapsed time: 294872.994923 msecs"
vectorz time
"Elapsed time: 49.254672 msecs"

mikera, Oct 15 '16 07:10