
Levenshtein distance throws a StackOverflowError for longer strings

Open rbuchmann opened this issue 8 years ago • 2 comments

I used it to compare strings of around 130 characters each, and it failed with a stack overflow. The clj-fuzzy implementation handles the same input without crashing and yields a distance of around 40.

rbuchmann • Sep 11 '17 00:09

Can you provide a simple test case, please?

rm-hull • Sep 11 '17 08:09

Sure:

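;; str/join comes from clojure.string; `levenshtein` is assumed to be an
;; alias for the library's Levenshtein namespace (not shown in the report).
(require '[clojure.string :as str])

(def s "some pretty long string with ")
(def e " uuids in the middle")

;; Builds a string of s, n comma-separated random UUIDs, and e.
(defn t [n]
  (str s (str/join "," (repeatedly n #(str (java.util.UUID/randomUUID)))) e))

;; Compares two strings of about 120 characters each.
(levenshtein/distance (t 2) (t 2))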

rbuchmann • Sep 11 '17 21:09