
Etree python 3.5 fix

Open 57uff3r opened this issue 8 years ago • 2 comments

I've found some problems with etree and Python 3. The etree component was returning bytes in Python 3 instead of a Unicode string, and I made a small change to fix this problem.

[XML etree documentation](https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.tostring)

Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated)
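To illustrate the quoted sentence, here is a minimal sketch using only the standard library's `xml.etree.ElementTree` (ebooklib itself uses lxml, so this is an illustration of the documented behaviour, not the project's code). The sample markup is made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample element, invented for this illustration.
body = ET.fromstring("<body><p>caf\u00e9</p></body>")

# Default call: the result is a bytestring.
as_bytes = ET.tostring(body)

# With encoding="unicode": the result is a str.
as_text = ET.tostring(body, encoding="unicode")

print(type(as_bytes).__name__, type(as_text).__name__)
```

Which of the two you want depends on what the rest of the code compares the result against.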

57uff3r, Mar 05 '16 11:03

I checked it out and I would say this should be the fix. The problem is that we are trying to find a 'str' in 'bytes', and that fails in Python 3. In the rest of the code we return 'bytes' all the time, so I assume we should do so here as well, and if you need a Unicode string, you should convert it manually. I will think about this issue a bit more.

tree_str = etree.tostring(body, pretty_print=True, encoding='utf-8', xml_declaration=False)

if tree_str.startswith(six.b('<body>')):
    n = tree_str.rindex(six.b('</body>'))
    return tree_str[7:n]
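As a rough, self-contained sketch of the point above (standard library only, no lxml or six): mixing `str` and `bytes` in `startswith` raises `TypeError` on Python 3, while keeping everything as bytes works, and the caller can decode afterwards. The name `strip_body` and the sample data are hypothetical, and the helper slices by the tag length rather than the fixed offset 7 used in the snippet.

```python
# Sample serialized output, invented for this illustration.
tree_bytes = b"<body>\n  <p>text</p>\n</body>\n"


def strip_body(data):
    """Hypothetical helper: return the bytes between <body> and </body>."""
    start, end = b"<body>", b"</body>"
    if data.startswith(start):
        return data[len(start):data.rindex(end)]
    return data


# Searching for a str inside bytes is rejected on Python 3.
try:
    tree_bytes.startswith("<body>")
    mixed_ok = True
except TypeError:
    mixed_ok = False

inner = strip_body(tree_bytes)      # still bytes
inner_text = inner.decode("utf-8")  # manual conversion to a Unicode string
```

Decoding at the call site keeps the function's return type consistent with the rest of the code, which returns bytes.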

aerkalov, Apr 02 '16 10:04

This appears to be a limitation of XML. So you may want to add your example of how to handle Unicode strings to the documentation.

highpost, Jan 11 '17 16:01