epub.js

locations.generate() slow. Can loading sections be made asynchronous?

Open scottanglr opened this issue 4 years ago • 2 comments

Hi,

First of all, thanks for putting together such a great reader library!

The reader application I'm putting together has to work with very large eBooks, some of them hundreds of megabytes in size. This is especially a problem when generating locations in the book. It can take up to a few minutes on one of the books, which has over 800 sections.

The biggest cause of slowdown I've noted is that the sections are loaded synchronously, one after another, through the queue:

locations.generate

this.spine.each(function(section) {
  if (section.linear) {
    this.q.enqueue(this.process.bind(this), section);
  }
}.bind(this));

In the process function it calls .load on the section

Instead I've tried loading all sections at the same time. My code is very rough and uses promises for my own familiarity.

locations.generate async version

const sections = [];
this.spine.each(function(section) {
  if (section.linear) {
    sections.push(section)
  }
}.bind(this));
this.q.enqueue(this.process.bind(this), sections);

The process function has been updated to asynchronously load the sections and then process them synchronously

locations.process async version

const sections = sectionOrSections;
// Load all sections concurrently. Promise.all resolves with the results
// in the same order as `sections`, and rejects if any load fails.
return Promise.all(sections.map((section) => section.load(this.request)))
  .then((loadedContents) => {
    // Process sequentially so this._locations stays in spine order
    for (let i = 0; i < loadedContents.length; i++) {
      const contents = loadedContents[i];
      const section = sections[i];
      const locations = this.parse(contents, section.cfiBase);
      this._locations = this._locations.concat(locations);
      section.unload();
    }
  });

Old sync code: 118 seconds
New async code: 18 seconds

Now, making code asynchronous which was once synchronous is a dangerous operation if you're not sure of the synchronous side effects, i.e. what the synchronous code was changing. For example, the sequential part annotated above must be left synchronous because otherwise this._locations could go out of order. So I'm wondering if this is a safe change, and if it is, is it the sort of change which could be integrated into the library?
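To illustrate the ordering concern, here is a minimal sketch that does not use epub.js at all. The `makeFakeSection` and `loadInSpineOrder` helpers and the delays are hypothetical stand-ins, not part of the library. It only shows that `Promise.all` hands back results in input order even when later loads finish first, which is what the sequential processing step relies on.

```javascript
// Hypothetical stand-in for a spine section: load() resolves after delayMs.
function makeFakeSection(index, delayMs) {
  return {
    load() {
      return new Promise((resolve) => {
        setTimeout(() => resolve(`contents-${index}`), delayMs);
      });
    },
  };
}

// Start every load at once, record the order in which they finish,
// and return the contents in the original (spine) order.
async function loadInSpineOrder(sections) {
  const completionOrder = [];
  const contents = await Promise.all(
    sections.map((section, index) =>
      section.load().then((loaded) => {
        completionOrder.push(index);
        return loaded;
      })
    )
  );
  return { contents, completionOrder };
}

// The last section is the fastest here, so it completes first.
const demoSections = [30, 20, 10].map((delay, i) => makeFakeSection(i, delay));
const demoResult = loadInSpineOrder(demoSections);
```

In this sketch `completionOrder` reflects when each load finished, while `contents` keeps the order of `demoSections`. Whether the real `section.load` has other side effects that depend on ordering is a separate question this does not answer.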

scottanglr avatar Aug 05 '20 04:08 scottanglr

@scottanglr I'm facing the same slow behavior that you explained here. I tried your solution and it certainly improves the loading process. Have you run into any errors with this implementation so far?

mfranceschit avatar Sep 08 '20 16:09 mfranceschit

It's been a while, but...

@scottanglr it seems like you can call the locations.process() function directly

@mfranceschit so far so good, the results match the official way using locations.generate, at least in my test case

const calculatePercentage = () => {
    const start = performance.now();
    let prms = [];
    book.current.locations.break = 450;
    book.current.spine.each((section) => {
      prms.push(book.current.locations.process(section));
    });

    Promise.all(prms).then((rs) => {
      const locations = rs.flat();
      const end = performance.now();
      console.log('[percentage calc]', locations);
      console.log('[percentage calc perf]', end - start);
    });
};

MillerDix avatar Oct 16 '23 02:10 MillerDix