obooks
Issues with the generated EPUB file (duplicate ch titles & toc, no cover img & css, toc anchor links not working)
Issues:
Running cli.js generates an EPUB file that:
1- Has an <h1> element right after <body> at the beginning of each chapter in the .xhtml files, which results in the title text being duplicated. The additional <h1> element is not part of the .xhtml file served by the O'Reilly website.
2- Doesn't download the cover image; instead, the cover.jpg file displays a generated table of contents.
3- The first page, toc.xhtml, is a table of contents that's not part of the book (it's a duplicate of the actual toc).
4- Doesn't download the CSS file.
5- Anchor links in the actual toc are not working; they point to a non-existent path, which results in ERR_FILE_NOT_FOUND when clicked (this is not related to the toc on the first page, which is expected to be removed).
Expected Behavior:
1- No duplicate titles in any chapter/section.
2- Display the correct cover image.
3- Remove the first page, toc.xhtml, since the book already contains its own table of contents.
4- Have the same page style from the CSS (fonts, italic text, etc.) as the book has when viewed on O'Reilly.
5- Clicking an anchor link in the toc should jump to the corresponding page and not redirect to an incorrect file location.
Steps To Reproduce:
- Run the following command
$ ./cli.js -b "9781800560871" -c "Your Cookies"
- After the tool completes successfully and outputs done 📚✨, you will be able to view the issues mentioned above by opening the .epub file in Calibre.
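As a complement to opening the file in Calibre, issues 4 and 5 can also be checked with a script, since an .epub file is a ZIP archive. The following is a minimal sketch, not part of obooks: the function name inspect_epub is hypothetical, and it assumes links are written as double-quoted href attributes. It only checks that each linked file exists inside the archive, not that the fragment id after "#" exists.

```python
import posixpath
import re
import zipfile
from urllib.parse import unquote, urlsplit

# Assumption: links appear as double-quoted href attributes.
HREF_RE = re.compile(r'href="([^"]+)"')


def inspect_epub(epub):
    """Return (css_files, broken_links) for an EPUB path or file-like object."""
    with zipfile.ZipFile(epub) as archive:
        names = set(archive.namelist())
        # Issue 4: list any stylesheets that made it into the archive.
        css_files = sorted(n for n in names if n.lower().endswith(".css"))
        # Issue 5: collect hrefs whose target file is missing from the archive.
        broken_links = []
        for name in sorted(names):
            if not name.lower().endswith((".xhtml", ".html")):
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            for href in HREF_RE.findall(text):
                parts = urlsplit(href)
                # Skip external links and same-file fragments such as "#sec1".
                if parts.scheme or parts.netloc or not parts.path:
                    continue
                # Resolve the link relative to the file that contains it.
                target = posixpath.normpath(
                    posixpath.join(posixpath.dirname(name), unquote(parts.path))
                )
                if target not in names:
                    broken_links.append((name, href))
    return css_files, broken_links
```

Calling inspect_epub with the path of the generated .epub file returns the list of CSS files found (empty if none were downloaded) and a list of (source file, href) pairs for links that do not resolve to a file in the archive.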
Environment:
- OS: Ubuntu 21.10 x86_64
- Kernel: 5.13.0-20-generic
- Shell: bash 5.1.8
- Node: v12.22.5
- npm: v8.1.1
- Python3: v3.9.7
Anything else:
I don't know whether these issues are reproducible only with specific books.