pdf2image
pdf page count
I can specify the page number to convert, or -1 to convert all pages. Is there any way to get the page count so I can set up a loop, for example to await each page conversion or to convert in batches until no pages remain?
If not, then this is a feature request. If so, could you tell me the trick?
By the way this plugin is awesome. Thank you.
My team uses pdf.js from Mozilla to get the metadata. Here is an example that uses pdf.js to get the total count of pages:
https://github.com/mozilla/pdf.js/blob/master/examples/node/getinfo.js
I used the pdf-lib package and its getPageCount method to get the number of pages: https://github.com/Hopding/pdf-lib/blob/d213f921237daca7a5e21fb38eb69d733ff34796/src/api/PDFDocument.ts#L539
Example:
    const pdfLoadDoc = await PDFDocument.load(Buffer.from(pdfFile.Body, 'base64'))
    const pageCount = pdfLoadDoc.getPageCount()
    console.log('page count', pageCount)
import { getDocument } from "pdfjs-dist";
/** Returns the page count, the size of the first page, and that size scaled to a height of 1080. */
export async function getPdfFormatInfo(dataBuffer: Buffer): Promise<{
  numPages: number;
  width: number;
  height: number;
  finalWidth: number;
  finalHeight: number;
}> {
  const pdfDocument = await getDocument({ data: dataBuffer }).promise;
  // Only the first page is measured; other pages may have different sizes.
  const page = await pdfDocument.getPage(1);
  const viewport = page.getViewport({ scale: 1 });
  const width = Math.floor(viewport.width);
  const height = Math.floor(viewport.height);
  // Scale to a fixed target height while preserving the aspect ratio.
  const finalHeight = 1080;
  const finalWidth = (finalHeight / height) * width;
  return {
    numPages: pdfDocument.numPages,
    width,
    height,
    finalWidth,
    finalHeight,
  };
}
savgiannis here you go.
also, have a look at this example: https://github.com/yakovmeister/pdf2pic-examples/blob/master/from-file-to-images.js
const convert = fromPath(specimen1, baseOptions);
return convert.bulk(-1);
It seems that -1 is already implemented, but I do not recommend it: for me, ImageMagick runs out of memory because bulk instantiates it for each PDF page.
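Once the page count is known from one of the approaches above, the batching the original question asks about could be sketched roughly as follows. This is only an illustration: `pageBatches`, `convertInBatches`, and the `convertBatch` callback are hypothetical names, not part of pdf2pic, and whether batching actually avoids the memory problem has not been verified here.

```typescript
/** Splits pages 1..pageCount into consecutive batches of at most batchSize pages. */
function pageBatches(pageCount: number, batchSize: number): number[][] {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer");
  }
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: number[][] = [];
  for (let start = 1; start <= pageCount; start += batchSize) {
    const end = Math.min(start + batchSize - 1, pageCount);
    const batch: number[] = [];
    for (let page = start; page <= end; page++) {
      batch.push(page);
    }
    batches.push(batch);
  }
  return batches;
}

/**
 * Converts the batches one after another, awaiting each batch before
 * starting the next. convertBatch is a hypothetical callback supplied by
 * the caller; it receives 1-based page numbers.
 */
async function convertInBatches<T>(
  pageCount: number,
  batchSize: number,
  convertBatch: (pages: number[]) => Promise<T[]>
): Promise<T[]> {
  const results: T[] = [];
  for (const pages of pageBatches(pageCount, batchSize)) {
    const converted = await convertBatch(pages);
    for (const item of converted) {
      results.push(item);
    }
  }
  return results;
}
```

With pdf2pic, the callback could presumably be something like `(pages) => convert.bulk(pages)`, assuming bulk accepts an array of page numbers in the installed version. Because each batch is awaited before the next starts, only one batch of pages is being converted at a time.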
Do you think it's ok to close this issue now?