James
I am going to close this issue since it has been inactive for a long time. If this issue is still relevant, feel free to re-open...
I am closing this issue since there has been no activity since 2018. Going forward, https://python.microformats.io will be the canonical, official hosted version of mf2py. This is now outlined in...
@snarfed I was thinking about this and my first intuition was to try another parser like `html5lib`. It seems like an lxml issue.
Wow. That is surprising. I think switching to a different parser sounds wise; the user shouldn't have to bear the burden of working around this, or see malformed markup.
You can use `html5lib` with BeautifulSoup for HTML5 parsing, but the BeautifulSoup documentation describes this parser as "Very slow": https://www.crummy.com/software/BeautifulSoup/bs4/doc/
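For reference, here is a minimal sketch of what selecting `html5lib` through BeautifulSoup could look like. The helper names `pick_parser` and `parse_html` are hypothetical and only for illustration; this is not how mf2py does it, and it assumes `beautifulsoup4` is installed (with `html5lib` optional).

```python
from importlib.util import find_spec


def pick_parser() -> str:
    """Return "html5lib" when it is installed, else the stdlib "html.parser".

    Hypothetical helper: the fallback choice is an assumption for this sketch.
    """
    return "html5lib" if find_spec("html5lib") is not None else "html.parser"


def parse_html(markup: str):
    """Parse markup with BeautifulSoup using the parser chosen above."""
    # Imported lazily so pick_parser() can be used without bs4 installed.
    from bs4 import BeautifulSoup

    # The second argument names the parser BeautifulSoup should use.
    return BeautifulSoup(markup, pick_parser())
```

Calling `parse_html("<p class='p-name'>James")` would return a `BeautifulSoup` object. When `html5lib` is available, the tree is built the way a browser would build it, which is the behaviour I would want for malformed markup, at the cost of speed.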
@jpcaruana Looks interesting!

> There is a benchmark script named [benchmark.py](https://github.com/kovidgoyal/html5-parser/blob/master/benchmark.py) that compares the parse times for parsing a large (~ 5.7MB) HTML document in html5lib and html5-parser. The results...
Following up on this!
NB: I have signed the CLA. I am unsure why the bot says I have not signed the document.
Following up!