The presence of large values causes slowdown even when identical

Open cjohansen opened this issue 7 months ago • 1 comments

I have a data structure like this:

(def data
  {:a {}
   :b {}})

The map in :a is enormous:

(count (pr-str (:a data))) ;;=> 4390778

Editscript quickly finds that there are no differences when diffing without changes:

(time
 (e/diff (into {} data) (into {} data)))

;; "Elapsed time: 3.500000 msecs"

However, if there are changes in :b, even minor ones, the presence of the enormous (but still identical) :a makes the diff extremely slow:

(def data {:a (vec (for [i (range 1000000)]
                     {:number i}))
           :b {:message "Hello"}})

(time (e/diff data data))
;; "Elapsed time: 1.400000 msecs"

(def data-b (assoc-in data [:b :message] "Hi!"))

(identical? (:a data-b) (:a data)) ;; true

(time (e/diff data-b data))
;; "Elapsed time: 2598.700000 msecs"

(time (e/diff (:a data-b) (:a data)))
;; "Elapsed time: 0.000000 msecs"

These numbers are from ClojureScript, but JVM Clojure exhibits the same kind of behavior, although with slightly more favorable numbers.
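
Until this is addressed in the library, one possible workaround is to drop the untouched top-level entries before diffing. The sketch below is only an illustration: `prune-identical` is a hypothetical helper, not part of editscript. It assumes both inputs are maps, since removing a key from a map leaves the paths to the remaining keys unchanged (this would not hold for vectors). Whether it actually recovers the lost time has not been measured here.

```clojure
(defn prune-identical
  "Hypothetical helper, not part of editscript.
  Returns [a' b'], where every top-level key whose values in maps a and b
  are the very same object (identical?) has been removed from both."
  [a b]
  (let [shared (filter #(and (contains? b %)
                             (identical? (get a %) (get b %)))
                       (keys a))]
    [(apply dissoc a shared)
     (apply dissoc b shared)]))

;; Intended use with the data from the report (assumes editscript is loaded
;; under the alias e, as in the snippets above):
;;
;; (let [[a b] (prune-identical data-b data)]
;;   (e/diff a b))
```

Because only keys are removed, any edits in the resulting script still refer to paths that exist in the original maps, such as [:b :message].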

cjohansen, May 21 '25 14:05

Thanks for reporting.

huahaiy, May 23 '25 17:05