mechanize
mechanize copied to clipboard
some simple error recovering support + image support
This failed earlier:
In [6]: br = mechanize.Browser()
In [8]: br.open("http://9-eyes.com")
Out[8]: <response_seek_wrapper at 0x101c165a8 whose wrapped object = <closeable_response at 0x101c19440 whose fp = <socket._fileobject object at 0x101c07d70>>>
In [9]: br.title()
---------------------------------------------------------------------------
ParseError Traceback (most recent call last)
/Users/az/Programmierung/9eyes-fetcher/<ipython console> in <module>()
/Users/az/Programmierung/mechanize/mechanize/_mechanize.pyc in title(self)
458 if not self.viewing_html():
459 raise BrowserStateError("not viewing HTML")
--> 460 return self._factory.title
461
462 def select_form(self, name=None, predicate=None, nr=None):
/Users/az/Programmierung/mechanize/mechanize/_html.pyc in __getattr__(self, name)
537 elif name == "title":
538 if self.is_html:
--> 539 self.title = self._title_factory.title()
540 else:
541 self.title = None
/Users/az/Programmierung/mechanize/mechanize/_html.pyc in title(self)
285 return self._get_title_text(p)
286 except sgmllib.SGMLParseError, exc:
--> 287 raise _form.ParseError(exc)
288
289
ParseError: expected name token at '<!<!DOCTYPE html PUB'
Now it works:
In [5]: br.title()
parser exception: expected name token at '<!<!DOCTYPE html PUB'
Out[5]: 'Jon Rafman'
The "parser exception" debug print here is commented out in the commit.
Also, I added image support. I.e. you can iterate over all img tags via Browser.images.
Ping. What about it?
Thank you for your contribution to mechanize!
Following the process in #117, future work on mechanize will be occurring here: https://github.com/python-mechanize/mechanize.
Please re-file your PR there (where it will get attention, and hopefully merged)