pdf2image pdf page count

pdf page count

Open savgiannis opened this issue 3 years ago • 3 comments

I can specify the page number to be converted, or the number -1 to convert all pages. Is there any way to know the page count so I can set up a loop? For example, to await each page conversion, or to convert in batches until there are no pages left.

If not, then this is a feature request. If yes, could you tell me the trick?

By the way this plugin is awesome. Thank you.

Mar 12 '21 10:03 savgiannis

My team uses pdf.js from Mozilla to get the metadata. Here is an example that uses pdf.js to get the total count of pages:

https://github.com/mozilla/pdf.js/blob/master/examples/node/getinfo.js

Apr 20 '21 17:04 PaulMest

I used the pdf-lib package and its getPageCount method to get the number of pages. https://github.com/Hopding/pdf-lib/blob/d213f921237daca7a5e21fb38eb69d733ff34796/src/api/PDFDocument.ts#L539

Example:

const pdfLoadDoc = await PDFDocument.load(Buffer.from(pdfFile.Body, 'base64'))
const pageCount = pdfLoadDoc.getPageCount()
console.log('page count', pageCount)

Jul 08 '21 03:07 redstone78

import { getDocument } from "pdfjs-dist";

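/**
 * Returns the page count, the size of the first page (at scale 1),
 * and a target size scaled to a height of 1080 with the same aspect ratio.
 */
export async function getPdfFormatInfo(dataBuffer: Buffer): Promise<{
  numPages: number;
  width: number;
  height: number;
  finalWidth: number;
  finalHeight: number;
}> {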
  const pdfDocument = await getDocument({ data: dataBuffer }).promise;
  const page = await pdfDocument.getPage(1);
  const viewport = page.getViewport({ scale: 1 });

  const width = Math.floor(viewport.width);
  const height = Math.floor(viewport.height);
  const finalHeight = 1080;
  const finalWidth = (finalHeight / height) * width;

  return {
    numPages: pdfDocument.numPages,
    width,
    height,
    finalWidth,
    finalHeight,
  };
}

@savgiannis, here you go.

Also, have a look at this example: https://github.com/yakovmeister/pdf2pic-examples/blob/master/from-file-to-images.js

  const convert = fromPath(specimen1, baseOptions);
  
  return convert.bulk(-1);

It seems that -1 (convert all pages) is already implemented. However, I do not recommend it: in my case ImageMagick ran out of memory, because bulk starts a separate ImageMagick instance for each PDF page.
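Once the page count is known (for example from numPages above), one way to avoid converting everything at once is to work through the pages in small batches and await each batch before starting the next. Below is a minimal sketch of that idea, not something taken from the library. The names pageBatches, convertInBatches and convertBatch are hypothetical helpers introduced only for illustration; convertBatch stands in for whatever call actually converts a list of page numbers.

```typescript
/** Split pages 1..pageCount into consecutive batches of at most batchSize pages. */
function pageBatches(pageCount: number, batchSize: number): number[][] {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer");
  }
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: number[][] = [];
  for (let first = 1; first <= pageCount; first += batchSize) {
    const last = Math.min(first + batchSize - 1, pageCount);
    const batch: number[] = [];
    for (let page = first; page <= last; page++) {
      batch.push(page);
    }
    batches.push(batch);
  }
  return batches;
}

/**
 * Convert all pages batch by batch, awaiting each batch before the next one starts.
 * convertBatch is a hypothetical callback (an assumption, not a library API):
 * it receives 1-based page numbers and resolves with one result per page.
 */
async function convertInBatches<T>(
  pageCount: number,
  batchSize: number,
  convertBatch: (pages: number[]) => Promise<T[]>,
): Promise<T[]> {
  const results: T[] = [];
  for (const pages of pageBatches(pageCount, batchSize)) {
    // Only one batch is in flight at a time, which bounds peak memory use.
    results.push(...(await convertBatch(pages)));
  }
  return results;
}
```

With a page count of 5 and a batch size of 2, the callback would be invoked with pages [1, 2], then [3, 4], then [5]. A smaller batch size trades speed for lower peak memory use.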

Do you think it's ok to close this issue now?

Feb 04 '22 13:02 irgipaulius
