ebookmaker collectImagesFromEBookContents only checks for lower case html tags

Open progtologist opened this issue 9 years ago • 1 comments

Since HTML tag names may be written in either lower or upper case, I think a more general solution such as

    def collectImagesFromEBookContents(self, htmlFile):
        with open(htmlFile, encoding='utf-8', mode='r') as f:
            soup = BeautifulSoup(f.read())
            return [img.src for img in soup.body.findAll('img') if img.has_attr('src')]

would be better. In my case, I hardcoded the uppercase tags to get my ebook created, but I think this can be fixed easily ;-) Kudos for all your work!

Aug 19 '15 11:08 progtologist
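
For anyone checking the case-sensitivity concern, here is a minimal sketch that uses only the standard library's `html.parser` (one of the parsers BeautifulSoup can run on) rather than ebookmaker's own code. The class `ImageSrcCollector`, the function `collect_image_sources`, and the sample markup are hypothetical names made up for this illustration. `HTMLParser` hands tag and attribute names to `handle_starttag` already converted to lower case, so a single lowercase comparison should cover `<IMG>`, `<Img>`, and `<img>` alike.

```python
from html.parser import HTMLParser


class ImageSrcCollector(HTMLParser):
    """Hypothetical helper: collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names already lowercased,
        # so "IMG", "Img" and "img" all arrive here as "img".
        if tag == "img":
            attributes = dict(attrs)
            if attributes.get("src") is not None:
                self.sources.append(attributes["src"])


def collect_image_sources(html_text):
    """Return the src values of all <img> tags in html_text, in document order."""
    collector = ImageSrcCollector()
    collector.feed(html_text)
    collector.close()
    return collector.sources


# Made-up sample mixing uppercase, lowercase and mixed-case tags.
sample = '<BODY><IMG SRC="cover.png"><img src="page1.jpg"><Img alt="no source"></BODY>'
print(collect_image_sources(sample))
```

The third tag has no `src` attribute and is skipped, which mirrors the `has_attr('src')` filter in the function quoted above.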

Actually, after further research I found that the problem lies in the code itself. The return statement should be corrected to:

    return [img['src'] for img in soup.body.findAll('img') if img.has_attr('src')]

Aug 19 '15 13:08 progtologist
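
A small sketch of why the corrected line behaves differently, assuming BeautifulSoup 4 (the `bs4` package) with the built-in `html.parser` backend. The sample markup and variable names are made up for illustration. On a BeautifulSoup tag, dot access such as `img.src` searches for a child tag named `src`, while item access `img['src']` reads the attribute. The sketch uses `find_all`, the current spelling of `findAll`.

```python
from bs4 import BeautifulSoup

# Made-up sample with uppercase and lowercase tags.
html_text = (
    '<HTML><BODY>'
    '<IMG SRC="cover.png"><img src="page1.jpg"><img alt="no source">'
    '</BODY></HTML>'
)
soup = BeautifulSoup(html_text, "html.parser")

first_img = soup.body.find_all("img")[0]

# Dot access looks for a child tag called <src>; an <img> has none, so this is None.
attribute_style = first_img.src

# Item access reads the src attribute itself.
item_style = first_img["src"]

# The corrected list comprehension from the comment above.
images = [img["src"] for img in soup.body.find_all("img") if img.has_attr("src")]

print(attribute_style, item_style, images)
```

In this sketch the uppercase `<IMG SRC=...>` tag is found by the lowercase `'img'` lookup, which suggests the missing images came from the `img.src` access rather than from tag case.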
