Add support for HTML parsing libraries

Open • jjlee opened this issue 16 years ago • 1 comment

Python libraries for parsing HTML have improved. mechanize doesn't support three of the most popular choices of the current crop.

Expect: can use some mechanize API to request that one of these libraries is used to parse HTML:

- lxml.html
- BeautifulSoup 3
- html5lib

Got: can only use bundled BeautifulSoup v.2 or Python's sgmllib or SGMLParser modules.

Dec 30 '09 20:12 jjlee

I just hit:

ParseError('nested FORMs',)

An HTML parser that never raised exceptions would help in this case.

Barring that, is there some way for me to reach in and edit the HTML before .forms() tries to parse it?

Dec 01 '11 05:12 dckc
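
On the question of editing the HTML before `.forms()` parses it, one possible workaround is to pre-clean the markup yourself. The following is only a minimal sketch, not mechanize's own method: `flatten_nested_forms` is a hypothetical helper name, it uses a naive regular expression rather than a real HTML parser, and it assumes the error comes from forms that are genuinely nested. If the page is merely missing a `</form>`, this would merge two separate forms into one, so a different repair would be needed.

```python
import re

# Matches opening and closing <form> tags, case-insensitively.
# Naive: an attribute value containing ">" would confuse it.
_FORM_TAG = re.compile(r"<(/?)form\b[^>]*>", re.IGNORECASE)


def flatten_nested_forms(html):
    """Drop every <form> opened inside another form, plus its closing tag.

    Hypothetical helper for illustration; the inner form's controls are
    kept and end up belonging to the outermost form.
    """
    depth = 0   # how many <form> tags are currently open
    out = []    # pieces of the cleaned document
    pos = 0     # end of the previous tag match

    for match in _FORM_TAG.finditer(html):
        # Copy everything between the previous form tag and this one.
        out.append(html[pos:match.start()])
        pos = match.end()

        if not match.group(1):          # opening <form ...>
            depth += 1
            if depth == 1:
                out.append(match.group(0))  # keep only the outermost open
        else:                           # closing </form>
            if depth == 1:
                out.append(match.group(0))  # keep only the outermost close
            if depth > 0:
                depth -= 1
            # A stray </form> with no open form is dropped.

    out.append(html[pos:])
    return "".join(out)


# Assumed hook-up with mechanize (untested here; `br` is a mechanize.Browser
# that has already opened a page, and the data may need decoding first):
#   response = br.response()
#   response.set_data(flatten_nested_forms(response.get_data()))
#   br.set_response(response)
```

The commented lines at the end rely on `response.set_data()` and `Browser.set_response()` as described in mechanize's documentation; treat that part as an assumption to verify against the version you have installed.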