pdf-lib TypeError: _this.catalog.Pages(...).traverse is not a function

What were you trying to do?

I am trying to load a 90 page PDF into the lib

How did you attempt to do it?

Here is a simple reproduction of the issue

const { PDFDocument } = require("pdf-lib");
const fs = require("fs");

const fileWithError = fs.readFileSync("./policy-doc-test.pdf");

async function main() {
  const parentPDFDoc = await PDFDocument.load(fileWithError);

  console.log(parentPDFDoc.getPageCount());
}

main();

What actually happened?

I get the error TypeError: _this.catalog.Pages(...).traverse is not a function any time I call an API that requires traversing the pages. This includes getPageCount, save, etc.

What did you expect to happen?

I expected these functions to work without throwing.

How can we reproduce the issue?

Run the above code snippet using node

Version

1.17.1

What environment are you running pdf-lib in?

Node

Checklist

[X] My report includes a Short, Self Contained, Correct (Compilable) Example.
[X] I have attached all PDFs, images, and other files needed to run my SSCCE.

Additional Notes

Above is the code snippet for reproducing the issue. The document is a somewhat sensitive PDF, so I'd prefer not to attach it here publicly. I can send the PDF via DM or email.

Some more context:

This is a 90 page document (3.8MB). Opening it in Acrobat causes an error in Acrobat. I'm not sure if it's related, but I suspect it could be.

Here's the fun part: re-exporting this file and opening it with pdf-lib works as expected, so Acrobat is doing something that fixes the issue. I'm just not sure what, and unfortunately re-exporting through Acrobat isn't an option given the task.

Here to see if anyone knows what may be going on and how to potentially fix this issue. Thanks!

Jan 05 '22 15:01 schester44

Facing the same issue. My PDF is around 60 to 70 pages

Jan 09 '22 12:01 kausthubmayuram

Same issue here. Is there any time frame for when this will be looked into or fixed?

Jan 19 '22 01:01 alecimackay

We are experiencing the same issue. Any news on this? Glad to help any way I can.

Dec 06 '22 19:12 gmayc

Same issue, any chance to fix this soon?

Dec 21 '22 11:12 kubarozycki

no fix

Feb 22 '23 16:02 StarNumber12046

We are also experiencing this issue with a specific PDF.

Aug 04 '23 08:08 msquitieri

I also stumbled over this.

In my case, the reason was that the /Pages dict doesn't have /Type set to /Pages. That caused the PDF parser to instantiate the object as a plain PDFDict instead of a PDFPageTree.
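
One way to check whether a given document is affected in this way is to look at what the parser produced for the catalog's /Pages entry. The following is only a minimal sketch: inspectPageTree is a hypothetical helper (not part of pdf-lib), it takes the pdf-lib module as a parameter, and it assumes /Pages is stored as an indirect reference.

```javascript
// Hypothetical diagnostic helper, not part of pdf-lib.
// `pdfDoc` is the result of PDFDocument.load(...); `pdfLib` is the module
// object, e.g. require("pdf-lib").
function inspectPageTree(pdfDoc, pdfLib) {
  const { PDFCatalog, PDFPageTree, PDFName } = pdfLib;

  // Follow the catalog's /Pages entry to the object the parser created.
  // Assumption: the entry is an indirect reference.
  const pagesRef = pdfDoc.catalog.get(PDFName.of("Pages"));
  const pagesObj = pdfDoc.context.indirectObjects.get(pagesRef);

  // Read /Type if the resolved object is dictionary-like.
  const typeEntry =
    pagesObj && typeof pagesObj.get === "function"
      ? pagesObj.get(PDFName.of("Type"))
      : undefined;

  return {
    catalogIsPDFCatalog: pdfDoc.catalog instanceof PDFCatalog,
    pagesIsPDFPageTree: pagesObj instanceof PDFPageTree,
    pagesType: typeEntry === undefined ? null : String(typeEntry),
  };
}
```

If pagesIsPDFPageTree is false and pagesType is null, the document likely matches the case described here. If catalogIsPDFCatalog is false, the problem sits one level higher, in the catalog itself.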

I was successful with the following workaround:

  const { PDFDocument, PDFName, PDFPageTree } = require('pdf-lib')

  const pdfDoc = await PDFDocument.load(bytes)

  // Find reference to the page tree
  const pagesRef = pdfDoc.catalog.get(PDFName.of('Pages'))

  // Get the page tree. This is a PDFDict.
  const oldPageTree = pdfDoc.context.indirectObjects.get(pagesRef)

  // Create a PDFPageTree with the same content.
  const newPageTree = new PDFPageTree(oldPageTree.dict, oldPageTree.context)

  // Set the correct `Type`.
  newPageTree.dict.set(PDFName.of('Type'), PDFName.of('Pages'));

  // Replace the PDFDict with the PDFPageTree in the document.
  pdfDoc.context.indirectObjects.set(pagesRef, newPageTree)

  // Save fixed document
  ...

Aug 17 '23 08:08 bspot

In my case the PDFDocument.catalog property was initialised with a PDFDict instead of a PDFCatalog. So here is my workaround for the bug:

import { PDFDocument, PDFCatalog, PDFDict } from "pdf-lib";

const doc = await PDFDocument.load(bytes, { ignoreEncryption: true });
if (!(doc.catalog instanceof PDFCatalog) && ((doc.catalog as any) instanceof PDFDict)) {
    (doc as any).catalog = PDFCatalog.fromMapWithContext(doc.catalog, doc.context);
}

Nov 09 '23 11:11 chebum

For me it wasn't working because the Catalog pointed to the wrong object. I did this to manually point the Catalog to a PDFPageTree:

let pdfPageTree;

for (const entry of pdfDoc.context.indirectObjects.entries()) {
  const [ref, obj] = entry;
  if (obj instanceof pdfLib.PDFPageTree) {
    pdfPageTree = obj;
    break;
  }
}

pdfDoc.catalog = pdfLib.PDFCatalog.withContextAndPages(pdfDoc.context, pdfPageTree);
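
The loop above takes the first PDFPageTree it finds. In a document whose page tree has intermediate nodes, that first match may not be the root. Here is a hedged variant that prefers a node without a /Parent entry, on the assumption that only the root node of a well-formed page tree lacks one. findRootPageTree is a hypothetical helper (not part of pdf-lib) and takes the pdf-lib module as a parameter.

```javascript
// Hypothetical helper, not part of pdf-lib.
// Returns the first PDFPageTree without a /Parent entry, or undefined.
function findRootPageTree(pdfDoc, pdfLib) {
  const { PDFPageTree, PDFName } = pdfLib;
  const parentKey = PDFName.of("Parent");

  for (const [, obj] of pdfDoc.context.indirectObjects.entries()) {
    // Assumption: intermediate page-tree nodes carry /Parent, the root does not.
    if (obj instanceof PDFPageTree && obj.get(parentKey) === undefined) {
      return obj;
    }
  }
  return undefined;
}

// Possible usage, mirroring the workaround above:
//   const pdfLib = require("pdf-lib");
//   const root = findRootPageTree(pdfDoc, pdfLib);
//   if (root) {
//     pdfDoc.catalog = pdfLib.PDFCatalog.withContextAndPages(pdfDoc.context, root);
//   }
```

A damaged file could contain more than one parentless page tree, so it is worth checking the resulting page count against what a PDF viewer reports.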

Feb 15 '24 15:02 nvutri
