react-pdf

pdfjs crashes on getDocument if worker is set using pdfjs.GlobalWorkerOptions.workerPort and second file is rendered

Open jkgenser opened this issue 1 week ago • 0 comments

Before you start - checklist

  • [X] I followed instructions in documentation written for my React-PDF version
  • [X] I have checked if this bug is not already reported
  • [X] I have checked if an issue is not listed in Known issues
  • [X] If I have a problem with PDF rendering, I checked if my PDF renders properly in PDF.js demo

Description

I am adding some defensive programming: my app loads the worker code in the background as it starts. One issue with the underlying pdfjs is that if the worker fetch fails, the app cannot recover.

The pdfjs maintainer recommends doing this in order to be robust to failure: https://github.com/mozilla/pdf.js/issues/14332#issuecomment-984764484

Here is a subset of my code. Note that I use the blob approach to get around CORS, as the worker file is loaded from a different URL. However, whether or not you use a blob to instantiate the worker makes no difference.

import { pdfjs } from 'react-pdf';

// Fetch the worker script, retrying on failure, and hand pdfjs a ready-made Worker.
export async function fetchWithRetry(
  url: string,
  retries: number = 2,
): Promise<void> {
  const attemptFetch = async (attempt: number): Promise<void> => {
    try {
      const response = await fetch(url);
      if (!response.ok) {
        throw new Error(`HTTP error! status: ${response.status}`);
      }
      // Wrap the script in a blob URL to get around CORS.
      const workerScript = await response.text();
      const blob = new Blob([workerScript], { type: 'application/javascript' });
      const workerUrlBlob = URL.createObjectURL(blob);
      pdfjs.GlobalWorkerOptions.workerPort = new Worker(workerUrlBlob, {
        type: 'module',
      });
      // pdfjs.GlobalWorkerOptions.workerSrc = workerUrlBlob;
    } catch (error) {
      if (attempt < retries) {
        console.log(`Retrying fetch... Attempt ${attempt + 1}`);
        await attemptFetch(attempt + 1);
      } else {
        // Out of retries: fall back to letting pdfjs load the worker from the URL.
        console.error('Error loading PDF worker script', error);
        pdfjs.GlobalWorkerOptions.workerSrc = url;
      }
    }
  };

  await attemptFetch(0);
}

Steps to reproduce

  1. Set pdfjs.GlobalWorkerOptions.workerPort to a web worker, using the snippet below.
  2. Render a PDF using file_a.
  3. Pass file_b to the component that renders the document.

pdfjs.GlobalWorkerOptions.workerPort = new Worker(workerUrlBlob, {
  type: 'module',
});

Expected behavior

PDF renders file_b

Actual behavior

Application crashes. Here is the relevant part of the stack trace:

Error: PDFWorker.fromPort - the worker is being destroyed.
Please remember to await `PDFDocumentLoadingTask.destroy()`-calls.
    at _PDFWorker.fromPort (api.js:2299:15)
    at getDocument (api.js:355:19)
    at loadDocument (Document.js:238:15)

Additional information

I suspect this may be related to the following issue raised in pdfjs:

https://github.com/mozilla/pdf.js/issues/16777

Here is the relevant line in the pdfjs pull request that addresses that issue:

https://github.com/mozilla/pdf.js/pull/16830/files#diff-082d6b37ad01db7ac97cc07c6ddb0dc52040484c5ef91b110b072f50144d9f39R2305

Long story short, I believe the crash comes from destroy() not being awaited when workerPort is used.
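
The ordering that the error message asks for can be sketched roughly as follows. This is only an illustration, not react-pdf's or pdfjs's actual code: createSerializedLoader and LoadingTaskLike are hypothetical names, and the only assumption about pdfjs is that a loading task exposes a promise and a destroy() that returns a promise. The helper makes sure the previous task's destroy() has settled before the next getDocument-style call is made.

```typescript
// Minimal shape of a loading task. pdfjs's PDFDocumentLoadingTask exposes
// `promise` and `destroy()` in this form; everything else here is hypothetical.
interface LoadingTaskLike<TDoc> {
  promise: Promise<TDoc>;
  destroy(): Promise<void>;
}

// Hypothetical helper: wraps a getDocument-style function so that a new load
// only starts after the previous task's destroy() has settled.
function createSerializedLoader<TSrc, TDoc>(
  getDocument: (src: TSrc) => LoadingTaskLike<TDoc>,
): (src: TSrc) => Promise<TDoc> {
  let current: LoadingTaskLike<TDoc> | null = null;
  // Tail of the chain of "start a load" steps; each step waits for the one before.
  let queue: Promise<unknown> = Promise.resolve();

  return (src: TSrc): Promise<TDoc> => {
    const started = queue.then(async () => {
      const previous = current;
      current = null;
      if (previous) {
        // The important part: wait for teardown before reusing the worker.
        await previous.destroy();
      }
      const task = getDocument(src);
      current = task;
      return task;
    });
    // Keep the chain usable even if this step fails.
    queue = started.catch(() => undefined);
    return started.then((task) => task.promise);
  };
}
```

With pdfjs this could be used as createSerializedLoader((src) => pdfjs.getDocument(src)). Since react-pdf's Document component manages its own loading task internally, this is not a drop-in workaround; it only shows the await that appears to be missing.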

Should we assume workerPort is not supported by this library, since it results in a crash when a different file is passed in while a worker is already running?

Environment

  • Browser (if applicable):
  • React-PDF version: 9.0.0
  • React version: 18.2.0
  • Bundler name and version (if applicable): vite

jkgenser, Jun 29 '24 14:06