PyMuPDF-Optional-Material
get_pixmap may cause large image size
import fitz  # PyMuPDF
doc = fitz.open(pdf_file)
for page in doc:
    pix = page.get_pixmap()  # default rendering: RGB at 72 DPI
    img_file = f'{img_file_prefix}-{page.number}.jpg'
    pix.save(img_file)
Will get_pixmap cause the generated JPG image to be too large in the above code? Is there a better way to convert every page in the PDF into a JPG image?
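The pixmap size is directly linked to the page size: in memory it takes roughly width * height * 3 bytes for an RGB image, where width and height are the pixel dimensions. You can reduce this in a number of ways, e.g. by using grayscale instead of RGB (a reduction by a factor of 3), or by downscaling with a lower DPI value. Note that get_pixmap renders at 72 DPI by default, so only values below 72 shrink the image relative to the code above.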
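To make the arithmetic concrete, here is a minimal sketch that estimates the uncompressed pixmap size from the page dimensions. The function name `estimate_pixmap_bytes` is hypothetical (it is not part of PyMuPDF), and the rounding of pixel dimensions is an assumption, so treat the result as an approximation of what get_pixmap allocates.

```python
def estimate_pixmap_bytes(width_pt, height_pt, dpi=72, channels=3):
    """Approximate the in-memory size of a rendered page in bytes.

    width_pt, height_pt: page size in points (1 point = 1/72 inch).
    dpi: rendering resolution; 72 means one pixel per point.
    channels: 3 for RGB, 1 for grayscale (no alpha channel assumed).
    """
    scale = dpi / 72
    # Assumption: pixel dimensions are the scaled page size, rounded.
    width_px = round(width_pt * scale)
    height_px = round(height_pt * scale)
    return width_px * height_px * channels


# An A4 page is about 595 x 842 points.
rgb_72 = estimate_pixmap_bytes(595, 842)               # about 1.5 MB
gray_72 = estimate_pixmap_bytes(595, 842, channels=1)  # a third of the RGB size
rgb_144 = estimate_pixmap_bytes(595, 842, dpi=144)     # four times the 72 DPI size
print(rgb_72, gray_72, rgb_144)
```

This is the size of the raw pixel data only. The saved JPG or PNG file is compressed and is usually much smaller.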
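When you save the JPEG you can also influence the quality: please see the jpg_quality parameter of Pixmap.save in the documentation. A lower quality value also reduces the file size. Also try PNG instead; it may compress better than JPG, particularly for pages that consist mostly of text or line art.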