epubcheck
Taking too long to load the epub books
We are trying to use epubcheck to check whether an EPUB has audio. However, it takes 8-10 seconds just to load the book, which is hurting our performance. Can you suggest a way to load the book quickly and get the hasAudio meta info?
This tool is a simple wrapper around https://github.com/w3c/epubcheck, which does an exhaustive analysis of the EPUB file. If the requirement is only to check whether the EPUB contains audio, that could indeed be done much faster with a custom script that does not load the entire EPUB. If you are interested in sponsoring such a feature, I am happy to look into it.
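As a rough illustration of what such a custom script could look like (this is not part of this package, and `epub_has_audio` is a hypothetical name), the sketch below reads only `META-INF/container.xml` and the package document from the EPUB zip, then looks for manifest items whose `media-type` starts with `audio/`. It assumes the audio resources are declared in the manifest and does no validation at all.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the container file and the package document (OPF).
CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"
OPF_NS = "{http://www.idpf.org/2007/opf}"


def epub_has_audio(path):
    """Return True if any manifest item declares an audio/* media type.

    Hypothetical helper: `path` may be a file path or a binary file-like
    object. Only two small XML files are read; nothing is validated.
    """
    with zipfile.ZipFile(path) as zf:
        # container.xml points at the package document(s).
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        for rootfile in container.iter(CONTAINER_NS + "rootfile"):
            opf_path = rootfile.get("full-path")
            if not opf_path:
                continue
            # The package manifest lists every resource with its media type.
            package = ET.fromstring(zf.read(opf_path))
            for item in package.iter(OPF_NS + "item"):
                media_type = (item.get("media-type") or "").lower()
                if media_type.startswith("audio/"):
                    return True
    return False
```

Because it never unpacks or parses the content documents, a check like this should be far cheaper than a full EpubCheck run, but it also says nothing about whether the book is valid.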
I have a similar performance issue. I have a Python script that processes EPUBs, and I use EpubCheck to verify the books both before and after changes. Until now I had been running the Java EpubCheck, installed on my Mac with Homebrew, from within my Python script using subprocess.run().
To simplify compatibility with Windows, I began using this EpubCheck integration. However, it is really slow compared to just running the Java EpubCheck with subprocess.
I measure EpubCheck's performance from within my Python script. On a particular book (one that generates no errors and no warnings), the Homebrew EpubCheck run with subprocess takes 2.2 seconds. The same book checked with the epubcheck module takes 10.3 seconds.
That's nearly five times slower, which seems like a lot to me. Any ideas as to what can be done to get it closer to the performance of subprocess?
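For anyone who wants to reproduce the subprocess side of this comparison, here is a minimal sketch of how such a timing could be taken. `timed_run` is a hypothetical helper, and the example command assumes an `epubcheck` executable is on the PATH (as with a Homebrew install) and that `book.epub` is a placeholder file name.

```python
import subprocess
import time


def timed_run(cmd):
    """Run a command and return (exit code, elapsed wall-clock seconds).

    Hypothetical helper for comparing start-up and validation cost.
    """
    start = time.perf_counter()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return proc.returncode, elapsed


# Example call (assumes `epubcheck` is on the PATH; `book.epub` is a placeholder):
# code, seconds = timed_run(["epubcheck", "book.epub"])
# print(f"exit code {code}, {seconds:.1f} s")
```

Timing the module call with the same `time.perf_counter()` pattern keeps the two measurements comparable.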