ebooklib
<head> element incomplete when generated via EpubHtml.get_content()
I should declare upfront that I encountered this issue using v0.15 installed via pip, but from looking at the master branch the issue appears to remain.
Perhaps this is a deliberate design choice (in which case it ought to be documented), but the <head> element will only contain CSS or JavaScript items as found in the EpubHtml.items list. Otherwise EpubHtml.get_content() will return HTML with an empty head element. The code that once populated the head element has been commented out but is still present in the source (https://github.com/aerkalov/ebooklib/blob/master/ebooklib/epub.py#L265-L271):
# this should not be like this
# head = html_root.find('head')
# if head is not None:
#     for i in head.getchildren():
#         if i.tag == 'title' and self.title != '':
#             continue
#         _head.append(i)
As it stands, if one uses EbookLib to load an epub and then writes it back out without any other manipulation, all elements within the original head tag are omitted and the resulting file has an empty <head>, although I suppose one could argue that's a different bug altogether.
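One way to check a given book for the round-trip symptom described above is to compare the children of the head element before and after. The following is only a sketch using the standard library: `head_children` and `local_name` are hypothetical helpers, not part of EbookLib, and the two XHTML strings are hand-written stand-ins for a chapter as stored in the source epub and as it might look after being written back out.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}title'."""
    return tag.rsplit("}", 1)[-1]


def head_children(xhtml):
    """Return the local tag names of the direct children of <head>, in order."""
    root = ET.fromstring(xhtml)
    for element in root.iter():
        if local_name(element.tag) == "head":
            return [local_name(child.tag) for child in element]
    return []  # no <head> element found


# Hand-written stand-ins for illustration, not real EbookLib output.
original = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head>'
    '<title>Chapter 1</title>'
    '<meta name="author" content="A. Writer"/>'
    '<link rel="stylesheet" href="style.css"/>'
    '</head><body><p>Text</p></body></html>'
)
rewritten = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head/>'
    '<body><p>Text</p></body></html>'
)

# Tags present in the original head but absent after the round trip.
kept = head_children(rewritten)
missing = [tag for tag in head_children(original) if tag not in kept]
print(missing)
```

In practice the two strings would come from the chapter content in the source epub and from the file written back out; a non-empty `missing` list indicates that head elements were dropped.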
Correct, this is a bug. If I recall correctly, the issue (in my project) was that some things would end up in <head> multiple times: they were parsed, and then they were also added by the system. Instead of this, some validation should be implemented for when additional resources or metadata are added to the item. I never really used the library for real EPUB=>EPUB conversion, so it went unnoticed.
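The duplication problem mentioned here could be avoided by checking each parsed head child against what has already been generated before appending it. The following is a minimal sketch of that idea, not EbookLib's implementation: it uses the standard library's ElementTree rather than lxml, `merge_head` and `element_key` are hypothetical names, and it assumes both head elements use the same (or no) XML namespace.

```python
import copy
import xml.etree.ElementTree as ET


def element_key(element):
    """Identity used for de-duplication: tag, sorted attributes, stripped text."""
    return (
        element.tag,
        tuple(sorted(element.attrib.items())),
        (element.text or "").strip(),
    )


def merge_head(generated_head, parsed_head, title=""):
    """Copy children of parsed_head into generated_head, skipping duplicates.

    As in the commented-out code, a parsed <title> is skipped when a
    title has been set explicitly.
    """
    seen = {element_key(child) for child in generated_head}
    for child in parsed_head:
        if child.tag == "title" and title != "":
            continue  # the explicitly set title takes precedence
        key = element_key(child)
        if key in seen:
            continue  # already added by the system, do not duplicate
        seen.add(key)
        generated_head.append(copy.deepcopy(child))
    return generated_head


# Head as generated by the system: a title plus one linked stylesheet.
generated = ET.fromstring(
    '<head><title>Chapter 1</title>'
    '<link href="style.css" rel="stylesheet" type="text/css"/></head>'
)
# Head as parsed from the chapter's own content.
parsed = ET.fromstring(
    '<head><title>Old title</title>'
    '<meta name="author" content="A. Writer"/>'
    '<link rel="stylesheet" type="text/css" href="style.css"/></head>'
)

merge_head(generated, parsed, title="Chapter 1")
print([child.tag for child in generated])
```

Comparing on tag, attributes and text means that the same stylesheet link is recognised as a duplicate even when its attributes appear in a different order, while the parsed meta element is carried over.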
@aerkalov Is this lib no longer being maintained?
Hey @andyroberts! No, it is being maintained, and I use it in a pretty big project myself. I will check these things over the weekend; at the moment my mind is occupied with some other things.
@aerkalov Great. I'm only checking so that I know it's worth opening pull requests.
Hi @aerkalov Would this be good to merge now? Just checking in case I need to update https://packages.debian.org/sid/python-ebooklib etc.