de-dupe
Provide helpful examples showing compression
Following your README, I tried src/core.cljs on a couple of test inputs containing a lot of duplication.
In every case, the output of de-dupe had a larger (count (prn-str data)) than the original!
Can you post an example of de-dupe helping compress data?
Here's what I tried:
(def data1 {:contacts
            [{:first "Ben" :last "Bitdiddle" :email "[email protected]"}
             {:first "Alyssa" :middle-initial "P" :last "Hacker" :email "[email protected]"}
             {:first "Eva" :middle "Lu" :last "Ator" :email "[email protected]"}
             {:first "Louis" :last "Reasoner" :email "[email protected]"}
             {:first "Cy" :middle-initial "D" :last "Effect" :email "[email protected]"}
             {:first "Lem" :middle-initial "E" :last "Tweakit" :email "[email protected]"}]})
(def data2 {:contacts
            [{:first "Ben" :last "Bitdiddle" :email "[email protected]"}
             {:first "Alyssa" :middle-initial "P" :last "Hacker" :email "[email protected]"}
             {:first "Louis" :last "Reasoner" :email "[email protected]"}
             {:first "Cy" :middle-initial "D" :last "Effect" :email "[email protected]"}
             {:first "Lem" :middle-initial "E" :last "Tweakit" :email "[email protected]"}]})
(def data3 {:contacts []})
(def some-data [data1 data2 data3])
(def compressed (de-dupe some-data))
; if you now compare
(println "compressed:" (count (prn-str compressed)))
(println "original:" (count (prn-str some-data)))
; you will see the degree of compression
(println "compressed:" compressed)
(println "original:" some-data)
; to recover your original data
(def some-data2 (expand compressed))
(println "recovered" some-data2)
And I got:
compressed: 1132
original: 817
...