HTML-Renderer

Stack overflow when reading huge old HTML

Open • Mizutama opened this issue 8 years ago • 1 comment

I had a performance problem displaying huge HTML, so I went looking for a large, simple HTML document in English to test with. I found http://www.gutenberg.org/files/1661/1661-h/1661-h.htm, tried it, and got a StackOverflow error. I know this HTML is very old, so that may be what causes the error during parsing.

I suggest using SGMLReader for parsing any HTML. I use it in many scraping projects, and the results are quite reasonable.

Mizutama, Jul 26 '17 03:07

Thank you for the suggestion. However, it will take time to investigate.

Licshee, Jul 26 '17 09:07