pdf2image

pdf page count

Open savgiannis opened this issue 3 years ago • 3 comments

I can specify the page number to convert, or -1 to convert all pages. Is there any way to get the page count so I can set up a loop, for example to await each page conversion or to convert in batches until no pages remain?

If not, then this is a feature request. If yes, could you tell me the trick?

By the way this plugin is awesome. Thank you.

savgiannis avatar Mar 12 '21 10:03 savgiannis

My team uses pdf.js from Mozilla to get the metadata. Here is an example that uses pdf.js to get the total count of pages:

https://github.com/mozilla/pdf.js/blob/master/examples/node/getinfo.js

PaulMest avatar Apr 20 '21 17:04 PaulMest

I used the pdf-lib package and its getPageCount method to get the number of pages. https://github.com/Hopding/pdf-lib/blob/d213f921237daca7a5e21fb38eb69d733ff34796/src/api/PDFDocument.ts#L539

Example:

const pdfLoadDoc = await PDFDocument.load(Buffer.from(pdfFile.Body, 'base64'))
const pageCount = pdfLoadDoc.getPageCount()
console.log('page count', pageCount)

redstone78 avatar Jul 08 '21 03:07 redstone78

import { getDocument } from "pdfjs-dist";

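/** Returns the page count, the first page's size at scale 1, and that size scaled to a height of 1080. */
export async function getPdfFormatInfo(dataBuffer: Buffer): Promise<{
  numPages: number;
  width: number;
  height: number;
  finalWidth: number;
  finalHeight: number;
}> {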
  const pdfDocument = await getDocument({ data: dataBuffer }).promise;
  const page = await pdfDocument.getPage(1);
  const viewport = page.getViewport({ scale: 1 });

  const width = Math.floor(viewport.width);
  const height = Math.floor(viewport.height);
  const finalHeight = 1080;
  const finalWidth = (finalHeight / height) * width;

  return {
    numPages: pdfDocument.numPages,
    width,
    height,
    finalWidth,
    finalHeight,
  };
}

savgiannis here you go.

Also, have a look at this example: https://github.com/yakovmeister/pdf2pic-examples/blob/master/from-file-to-images.js

  const convert = fromPath(specimen1, baseOptions);
  
  return convert.bulk(-1);

It seems -1 is already implemented, but I do not recommend it: for me, ImageMagick runs out of memory because bulk instantiates it for each PDF page.
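Once the page count is known, the batching idea from the original question could look roughly like the sketch below. This is only an illustration: pageBatches and convertInBatches are hypothetical helper names, not part of any library in this thread, and the actual conversion is passed in as a callback so the snippet does not assume any particular converter API.

```typescript
/** Split pages 1..pageCount into consecutive batches of at most batchSize pages. */
function pageBatches(pageCount: number, batchSize: number): number[][] {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer");
  }
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: number[][] = [];
  for (let first = 1; first <= pageCount; first += batchSize) {
    const last = Math.min(first + batchSize - 1, pageCount);
    const batch: number[] = [];
    for (let page = first; page <= last; page++) {
      batch.push(page);
    }
    batches.push(batch);
  }
  return batches;
}

/**
 * Convert all pages batch by batch, awaiting each batch before starting the next,
 * so that only batchSize pages are in flight at any time.
 * convertPages is a caller-supplied (hypothetical) callback that converts the given pages.
 */
async function convertInBatches<T>(
  pageCount: number,
  batchSize: number,
  convertPages: (pages: number[]) => Promise<T[]>
): Promise<T[]> {
  const results: T[] = [];
  for (const batch of pageBatches(pageCount, batchSize)) {
    const converted = await convertPages(batch);
    results.push(...converted);
  }
  return results;
}
```

With the library from this thread, the callback might be something like (pages) => convert.bulk(pages), assuming bulk accepts an array of page numbers as well as -1; check the README of the version you use. A smaller batch size trades speed for lower peak memory.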

Do you think it's ok to close this issue now?

irgipaulius avatar Feb 04 '22 13:02 irgipaulius