How to add new chapters to existing EPUB files?

Open edmund-zhao opened this issue 4 years ago • 11 comments

  • Read the book
book = epub.read_epub(file_path)  # reads an EPUB file and creates a book object

  • Create the chapter

addc = epub.EpubHtml(title='第二章', file_name='chapter_add.xhtml', lang='zh-CN')
addc.content = '<h1>第二章</h1><p>这是测试的第二章</p>'
  • Add The Chapter To Object
book.add_item(addc)
book.spine.extend((addc,))

Now, how do I add the chapter to book.toc?

edmund-zhao avatar Jan 20 '21 09:01 edmund-zhao

The TOC is really just a tuple/list of TOC elements. You can either add a chapter object to it (in your case addc) or add a custom link such as epub.Link('intro.xhtml', 'Introduction', 'intro'). In the first case the chapter's title in the TOC is taken from the chapter object; in the second case we define it ourselves.

You can see here how it is constructed - http://docs.sourcefabric.org/projects/ebooklib/en/latest/tutorial.html#creating-epub and you can find that example here - https://github.com/aerkalov/ebooklib/blob/master/samples/03_advanced_create/create.py

aerkalov avatar Feb 07 '21 23:02 aerkalov

If I add chapters to an existing EPUB file with code like this, the existing TOC is overwritten:

book = epub.read_epub(file_path)
addc = epub.EpubHtml(title='第二章', file_name='chapter_add.xhtml', lang='zh-CN')
addc.content = '<h1>第二章</h1><p>这是测试的第二章</p>'
# Add The Chapter To Object
book.add_item(addc)
book.spine.extend((addc,))
book.toc = (epub.Link('intro.xhtml', '简介', 'intro'),
                    (epub.Section('正文'),
                     (addc,))
                    )

Have a good day!

edmund-zhao avatar Feb 12 '21 16:02 edmund-zhao

Correct, it will overwrite it. book.toc is really a list of items, so just insert the chapter wherever you want it. Even book.toc.append(addc) would do.

aerkalov avatar Feb 13 '21 22:02 aerkalov
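
The difference between reassigning book.toc and extending it can be sketched without an EPUB file. This is a minimal illustration, not ebooklib code: append_to_toc is a hypothetical helper, and SimpleNamespace stands in for the book object returned by epub.read_epub. It assumes only that the book has a toc attribute holding a list or a tuple.

```python
from types import SimpleNamespace


def append_to_toc(book, entry):
    """Add entry to the end of book.toc while keeping the existing entries."""
    # book.toc may be a list or a tuple, so rebuild it as a list either way.
    book.toc = list(book.toc) + [entry]
    return book.toc


# Stand-in for a book whose TOC was previously assigned as a nested tuple.
book = SimpleNamespace(toc=("intro", ("Section", ("chapter_1",))))

append_to_toc(book, "chapter_add")
print(book.toc)
```

With a real book, the entry would be the chapter object (addc) or an epub.Link, as described above.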

Yes, that is useful. But there seems to be a bug: if I use book.toc.extend() in two separate runs, the second run overwrites the entry added by the first.

edmund-zhao avatar Feb 16 '21 03:02 edmund-zhao

First Run

from ebooklib import epub
book = epub.read_epub('./test.epub')
addc = epub.EpubHtml(title='The Second Chapter', file_name='chapter_add.xhtml', lang='zh-CN')
addc.content = '<h1>The Second Chapter</h1><p>This is The Second Chapter</p>'
book.add_item(addc)
book.spine.extend((addc,))
book.toc.extend((addc,))
epub.write_epub('./test.epub', book, {})

edmund-zhao avatar Feb 16 '21 04:02 edmund-zhao

Second Run

from ebooklib import epub
book = epub.read_epub('./test.epub')
addc = epub.EpubHtml(title='The Third Chapter', file_name='chapter_add2.xhtml', lang='zh-CN')
addc.content = '<h1>The Third Chapter</h1><p>This is The Third Chapter</p>'
book.add_item(addc)
book.spine.extend((addc,))
book.toc.extend((addc,))
epub.write_epub('./test.epub', book, {})

After the second run, The Second Chapter is missing from the TOC.

edmund-zhao avatar Feb 16 '21 04:02 edmund-zhao

The reason seems to be that the EPUB's content.opf is not refreshed on write, so in the second run book = epub.read_epub does not see the content.opf changes from the first run.

edmund-zhao avatar Feb 16 '21 05:02 edmund-zhao
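
One way to check what actually ended up in content.opf after a run is to read the package document straight out of the archive. The sketch below uses only the Python standard library, not ebooklib. manifest_hrefs is a hypothetical helper name, and the in-memory archive is a made-up stand-in for a real EPUB file. The helper finds the package document through META-INF/container.xml and lists the href of each manifest item.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"
OPF_NS = "{http://www.idpf.org/2007/opf}"


def manifest_hrefs(epub_file):
    """Return the href of every manifest item in the EPUB's package document."""
    with zipfile.ZipFile(epub_file) as archive:
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find(f".//{CONTAINER_NS}rootfile")
        package = ET.fromstring(archive.read(rootfile.attrib["full-path"]))
    return [item.attrib["href"] for item in package.iter(f"{OPF_NS}item")]


# Build a tiny stand-in archive so the sketch runs without a real EPUB.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr(
        "META-INF/container.xml",
        '<container xmlns="urn:oasis:names:tc:opendocument:xmlns:container" '
        'version="1.0"><rootfiles><rootfile full-path="EPUB/content.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles></container>',
    )
    archive.writestr(
        "EPUB/content.opf",
        '<package xmlns="http://www.idpf.org/2007/opf" version="3.0"><manifest>'
        '<item id="c1" href="chapter_add.xhtml" '
        'media-type="application/xhtml+xml"/></manifest></package>',
    )

hrefs = manifest_hrefs(demo)
print(hrefs)
```

Calling manifest_hrefs('./test.epub') after each run would show whether chapter_add.xhtml is still listed in the manifest.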

If I add epub.EpubNcx() and epub.EpubNav() again before writing, it works!

import ebooklib
from ebooklib import epub

book = epub.read_epub('./测试.epub')

print(book.get_metadata('DC','date'))
# nav_items = book.get_items_of_type(ebooklib.ITEM_IMAGE)
# # print(nav_items)
# # e = nav_items.get_name()
# # # print(e)
# # t = b'      <navPoint id="chapter_5">\n        <navLabel>\n          <text>2020\xe5\xb9\xb4\xe5\x9c\xa3\xe8\xaf\x9e\xe7\x95\xaa\xe5\xa4\x96</text>\n        </navLabel>\n        <content src="chapter3.xhtml"/>\n      </navPoint>\n'
# # a = e[:-35] + t + e[-35:]
# # nav_items.content = a
# # book.add_item(nav_items)


# all_items = book.get_items()
# u = []
# for item in book.get_items():
#     if item.get_type() == ebooklib.ITEM_NAVIGATION:
#         if 'chapter' in item.get_name():
#             print(item.file_name)
#             u.append(item)
#         e = item.get_content()
#         # print(e)
#         t = b'      <navPoint id="chapter_5">\n        <navLabel>\n          <text>2020\xe5\xb9\xb4\xe5\x9c\xa3\xe8\xaf\x9e\xe7\x95\xaa\xe5\xa4\x96</text>\n        </navLabel>\n        <content src="chapter3.xhtml"/>\n      </navPoint>\n'
#         a = e[:-35] + t + e[-35:]
#         print(a.decode('utf-8'))
#         item.content = a
#         book.add_item(item)
#         break



# index = book.get_item_with_href('chapter0.xhtml')
addc = epub.EpubHtml(title='第三章', file_name='chapter_add3.xhtml', lang='zh-CN',uid="chapter_add3")
addc.content = '<h1>第三章</h1><p>这是测试的第三章</p>'
addcLink = epub.Link('chapter_add3.xhtml','第三章',uid='chapter_add3')
book.add_item(addc)
# u.append(addc)
# book.toc = (epub.Link('intro.xhtml', '简介', 'intro'),
#                     (epub.Section('正文'),
#                      tuple(u))
#                     )
print(book.toc)
print(book.spine)
print("len of toc:", len(book.toc))
print("len of spine:", len(book.spine))
print("**************")
book.toc.append(addc)
book.spine.append(addc)
print(book.toc)
print(book.spine)
print("len of toc:", len(book.toc))
print("len of spine:", len(book.spine))
book.add_item(epub.EpubNcx())
book.add_item(epub.EpubNav())
epub.write_epub('./测试.epub', book, {})

edmund-zhao avatar Feb 16 '21 05:02 edmund-zhao

But it produces duplicate-name warnings like this:

C:\Python\Python37\lib\zipfile.py:1506: UserWarning: Duplicate name: 'EPUB/toc.ncx'
  return self._open_to_write(zinfo, force_zip64=force_zip64)
C:\Python\Python37\lib\zipfile.py:1506: UserWarning: Duplicate name: 'EPUB/nav.xhtml'
  return self._open_to_write(zinfo, force_zip64=force_zip64)

edmund-zhao avatar Feb 16 '21 05:02 edmund-zhao
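
The warning itself comes from Python's zipfile module, which emits it whenever the same entry name is written to an archive twice. A minimal reproduction using only the standard library (the entry name is taken from the warning above; the contents are placeholders):

```python
import io
import warnings
import zipfile

buffer = io.BytesIO()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("EPUB/toc.ncx", "old table of contents")
        archive.writestr("EPUB/toc.ncx", "new table of contents")
        names = archive.namelist()

messages = [str(w.message) for w in caught]
print(messages)
print(names)
```

Both entries are stored in the archive, which is why adding a second EpubNcx or EpubNav without removing the old one leaves two files with the same name in the output.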

OK, so an off-the-top-of-my-head solution would be to do this before you create the new NCX and nav: remove the old ones from the items (they contain the old table of contents anyway) and then add them again.

book.items.remove(book.get_item_with_id('ncx'))
book.items.remove(book.get_item_with_id('nav'))

book.add_item(epub.EpubNcx())
book.add_item(epub.EpubNav())

aerkalov avatar Feb 16 '21 23:02 aerkalov
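
A slightly more defensive variant of the removal step is sketched below. It is an assumption-laden illustration rather than ebooklib code: remove_items_by_id and FakeBook are hypothetical names, and FakeBook only mimics the two things the snippet above relies on, an items list and get_item_with_id returning None for an unknown id. The guard matters because list.remove(None) raises ValueError when the book has no item with the given id.

```python
from types import SimpleNamespace


def remove_items_by_id(book, ids):
    """Remove the items whose id is in ids; return the ids actually removed."""
    removed = []
    for item_id in ids:
        item = book.get_item_with_id(item_id)
        if item is not None:  # skip ids this book does not contain
            book.items.remove(item)
            removed.append(item_id)
    return removed


class FakeBook:
    """Hypothetical stand-in exposing only .items and .get_item_with_id."""

    def __init__(self, items):
        self.items = list(items)

    def get_item_with_id(self, uid):
        for item in self.items:
            if item.id == uid:
                return item
        return None


book = FakeBook(SimpleNamespace(id=i) for i in ("ncx", "nav", "chapter_1"))
removed = remove_items_by_id(book, ("ncx", "nav", "missing"))
print(removed, [item.id for item in book.items])
```

After the old items are removed, the new ones would be added with book.add_item(epub.EpubNcx()) and book.add_item(epub.EpubNav()) as in the snippet above.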

Thanks for your help. I will write a blog post introducing the EbookLib project.

—— Edmund Zhao

edmund-zhao avatar Feb 17 '21 09:02 edmund-zhao