feedparser

Crash when parsing nikolay.rocks/atom.xml

Open MarcMV opened this issue 6 years ago • 2 comments

Parsing the feed at nikolay.rocks/atom.xml crashes with the following output:

Traceback (most recent call last):
  File "/usr/lib/python3.5/base64.py", line 518, in _input_type_check
    m = memoryview(s)
TypeError: memoryview: a bytes-like object is required, not 'str'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/feedparser.py", line 878, in pop
    output = _base64decode(output)
  File "/usr/lib/python3.5/base64.py", line 552, in decodebytes
    _input_type_check(s)
  File "/usr/lib/python3.5/base64.py", line 521, in _input_type_check
    raise TypeError(msg) from err
TypeError: expected bytes-like object, not str

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "myScript.py", line 380, in <module>
    parsedFeed = feedparser.parse(content)
  File "/usr/local/lib/python3.5/dist-packages/feedparser.py", line 3956, in parse
    saxparser.parse(source)
  File "/usr/lib/python3.5/xml/sax/expatreader.py", line 110, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python3.5/xml/sax/xmlreader.py", line 125, in parse
    self.feed(buffer)
  File "/usr/lib/python3.5/xml/sax/expatreader.py", line 210, in feed
    self._parser.Parse(data, isFinal)
  File "../Modules/pyexpat.c", line 468, in EndElement
  File "/usr/lib/python3.5/xml/sax/expatreader.py", line 370, in end_element_ns
    self._cont_handler.endElementNS(pair, None)
  File "/usr/local/lib/python3.5/dist-packages/feedparser.py", line 2052, in endElementNS
    self.unknown_endtag(localname)
  File "/usr/local/lib/python3.5/dist-packages/feedparser.py", line 696, in unknown_endtag
    method()
  File "/usr/local/lib/python3.5/dist-packages/feedparser.py", line 1789, in _end_summary
    self.popContent(self._summaryKey or 'summary')
  File "/usr/local/lib/python3.5/dist-packages/feedparser.py", line 1003, in popContent
    value = self.pop(tag)
  File "/usr/local/lib/python3.5/dist-packages/feedparser.py", line 886, in pop
    output = _base64decode(output.encode('utf-8')).decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in position 1: invalid start byte

MarcMV, Jul 17 '19 00:07
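The chain of three exceptions can probably be reproduced without feedparser. The sketch below only imitates the two steps visible in the traceback (a base64 decode of a str, then a retry on the UTF-8 bytes followed by a UTF-8 decode). The helper names `decode_as_base64` and `try_decode` and the sample text are made up for illustration; this is not feedparser's actual code.

```python
import base64


def decode_as_base64(text):
    """Imitate the two decode attempts seen in the traceback (hypothetical helper)."""
    try:
        # On Python 3, passing a str here raises TypeError.
        raw = base64.decodebytes(text)
    except TypeError:
        # Retry on the UTF-8 bytes, as the traceback's line 886 does.
        raw = base64.decodebytes(text.encode("utf-8"))
    # Plain prose "decoded" as base64 yields arbitrary bytes,
    # which are usually not valid UTF-8.
    return raw.decode("utf-8")


def try_decode(text):
    """Return the decoded text, or a short description of the failure."""
    try:
        return decode_as_base64(text)
    except UnicodeDecodeError as exc:
        return "UnicodeDecodeError: {}".format(exc.reason)


if __name__ == "__main__":
    # Ordinary Markdown, not base64: the decode step fails.
    print(try_decode("Some **markdown** text"))
    # Genuine base64 round-trips fine.
    print(try_decode(base64.encodebytes(b"hi").decode("ascii")))
```

Genuine base64 content survives both steps, so the crash only shows up when plain text is mistaken for base64.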

Yep, that shouldn't be happening. Thanks for reporting this, @MarcMV!

kurtmckee, Jul 17 '19 13:07

For anyone else befuddled by this error message: the content type has probably been set (or parsed) incorrectly on the feed in question, which sends the content down the base64-decoding path in pop() that appears in the traceback.

As an example, the linked feed includes type="markdown", where feedparser instead expects type="text/markdown".

NPrescott, Aug 30 '20 01:08
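To illustrate why type="markdown" ends up on the base64 path, here is a rough sketch of the rule in the Atom spec (RFC 4287, section 4.1.3.3): content whose type is not "text", "html", "xhtml", an XML media type, or a type starting with "text/" is expected to be base64. The function `treated_as_base64` is an assumption about how a parser might apply that rule, not a copy of feedparser's code. `patch_markdown_type` is a hypothetical workaround that rewrites the attribute before the feed is handed to the parser.

```python
import re

# The three short type names that Atom defines, mapped to MIME types.
ATOM_SHORT_TYPES = {
    "text": "text/plain",
    "html": "text/html",
    "xhtml": "application/xhtml+xml",
}


def treated_as_base64(content_type):
    """Guess whether content of this type would be base64-decoded (assumed rule)."""
    content_type = ATOM_SHORT_TYPES.get(content_type, content_type)
    if content_type.startswith("text/"):
        return False
    if content_type.endswith(("+xml", "/xml")):
        return False
    # Anything else, including a bare "markdown", falls through to base64.
    return True


def patch_markdown_type(feed_xml):
    """Rewrite type="markdown" to type="text/markdown" (hypothetical workaround)."""
    return re.sub(
        r"""type=(["'])markdown\1""",
        r"type=\g<1>text/markdown\g<1>",
        feed_xml,
    )


if __name__ == "__main__":
    print(treated_as_base64("markdown"))       # bare name, no "text/" prefix
    print(treated_as_base64("text/markdown"))  # proper MIME type
    print(patch_markdown_type('<summary type="markdown">*hi*</summary>'))
```

If you don't control the feed, passing the patched string to feedparser.parse() may sidestep the crash. The regex is deliberately narrow and would also rewrite a matching attribute on any other element, so treat it as a stopgap.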