crouton
Stack overflow exception in AsClojure implementation.
I accidentally parsed a large GIF file (http://i.imgur.com/flMMU.gif), and this caused a StackOverflowError during the conversion from JSoup to Clojure data structures, in the implementation of the AsClojure protocol in crouton/html.clj.
I concede that this is not a usual case and rarely happens in practice, but HTML might be specifically constructed to exploit this error, or the content type detection could be incorrect. A StackOverflowError is also serious in that recovering from it is not recommended (the same applies to OutOfMemoryError).
It would be nice to limit conversion to some reasonable depth to avoid this exception.
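To illustrate the kind of limit I mean, here is a minimal sketch. It does not use crouton's actual code or JSoup: nodes are plain maps of the form {:tag ... :attrs ... :children [...]} standing in for parsed elements, and the names as-clojure-limited, nested and the default limit of 512 are all hypothetical. The recursion still uses the JVM stack, but it throws a regular, recoverable exception once the depth passes the limit instead of running into a StackOverflowError.

```clojure
;; Sketch only: plain maps stand in for JSoup nodes (assumption), and
;; `as-clojure-limited` is a hypothetical name, not part of crouton.

(defn as-clojure-limited
  "Convert `node` into a {:tag :attrs :content} map, refusing to descend
  more than `max-depth` element levels below the root."
  ([node] (as-clojure-limited node 512)) ; 512 is an arbitrary default
  ([node max-depth]
   (letfn [(convert [n depth]
             (cond
               ;; Text nodes are leaves and are returned unchanged.
               (string? n) n

               ;; Fail with an ordinary exception that callers can catch.
               (> depth max-depth)
               (throw (ex-info "Maximum nesting depth exceeded"
                               {:max-depth max-depth}))

               ;; mapv is eager, so no lazy seqs pile up between levels.
               :else
               {:tag     (:tag n)
                :attrs   (:attrs n {})
                :content (mapv #(convert % (inc depth)) (:children n))}))]
     (convert node 0))))

(defn nested
  "Build a test tree of `n` nested :div elements around a text leaf."
  [n]
  (reduce (fn [child _] {:tag :div :children [child]}) "leaf" (range n)))
```

With this, (as-clojure-limited (nested 3)) converts normally, while (as-clojure-limited (nested 1000) 100) throws an ExceptionInfo whose ex-data carries the limit, so the caller can reject the document and carry on.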
I have the same problem with big files (for example, this MP3 file: http://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3).
I use the following code to retrieve the file:
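(require 'clj-http.client)

;; Fetch `url` and return the clj-http response with the body as a byte array.
(defn send-request [url]
  (clj-http.client/get url {:insecure? true            ; skip SSL certificate checks
                            :follow-redirects true
                            :max-redirects 10
                            :socket-timeout 10000      ; milliseconds
                            :conn-timeout 10000        ; milliseconds
                            :decode-body-headers true
                            :as :byte-array}))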
Result:
StackOverflowError:
clojure.lang.PersistentArrayMap.asTransient  PersistentArrayMap.java: 29
clojure.core/transient  core.clj: 3060
clojure.core/into  core.clj: 6341
crouton.html/fn  html.clj: 37
crouton.html/fn/G  html.clj: 17
crouton.html/fn  html.clj: 27
crouton.html/fn/G  html.clj: 17
clojure.core/map/fn  core.clj: 2559
clojure.lang.LazySeq.sval  LazySeq.java: 40
clojure.lang.LazySeq.seq  LazySeq.java: 49
clojure.lang.RT.seq  RT.java: 484
clojure.core/seq  core.clj: 133
clojure.core/filter/fn  core.clj: 2595
clojure.lang.LazySeq.sval  LazySeq.java: 40
clojure.lang.LazySeq.seq  LazySeq.java: 49
clojure.lang.Cons.next  Cons.java: 39
clojure.lang.RT.next  RT.java: 598
clojure.core/next  ...