pyvips
`.pagesplit()` not working with PDFs produced by iOS Quartz
Hi @jcupitt,
I am trying to split a multi-page image into a list of N separate images, one per page.
Code:
import pyvips

file_path = "/filesharemnt/testpdf.pdf"
DPI = float(150)

# Load every page (n=-1) as one tall image.
multi_page_image = pyvips.Image.pdfload(file_path, n=-1, dpi=DPI)
total_pages = multi_page_image.get_n_pages()
print("total_pages", total_pages)

# Dump all metadata fields of the loaded image.
fields = multi_page_image.get_fields()
for field in fields:
    print(f"{field}: {multi_page_image.get(field)}")

individual_pages = multi_page_image.pagesplit()
print("\nlen(individual_pages) =", len(individual_pages))
Output:
total_pages 925
width: 1275
height: 1622346
bands: 4
format: uchar
coding: none
interpretation: srgb
xoffset: 0
yoffset: 0
xres: 5.905511811023622
yres: 5.905511811023622
filename: /filesharemnt/testpdf.pdf
vips-loader: pdfload
page-height: 1650
pdf-n_pages: 925
n-pages: 925
pdf-producer: iOS Version 15.5 (Build 19F77) Quartz PDFContext; modified using iText® 5.4.1 ©2000-2012 1T3XT BVBA (AGPL-version)
len(individual_pages) = 1
Expected:
- `individual_pages` must contain a list of 925 individual pages.
Actual:
- `individual_pages` has only 1 element, which is the same as `multi_page_image` but with a temp filename.
I noticed that this happens with PDFs that have the producer shown in the output above. The other PDFs I tested have a different producer, and `.pagesplit()` works for them.
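One thing that may be worth checking is whether the pages are all the same size, since the reported total height is not 925 times the reported `page-height`. The sketch below is only a guess at the cause, not a confirmed diagnosis: it assumes `.pagesplit()` depends on `page-height` dividing the total height evenly. The helper names `pages_tile_evenly` and `load_pages_individually` are hypothetical, and the per-page loader relies on the `page` and `n` arguments of `pdfload`.

```python
def pages_tile_evenly(height, page_height, n_pages):
    """Return True if the joined image is exactly n_pages strips of page_height."""
    return page_height > 0 and height == page_height * n_pages


def load_pages_individually(path, dpi=150.0):
    """Possible workaround: load each PDF page as its own image.

    Hypothetical helper, not part of pyvips. It avoids pagesplit() by
    asking pdfload for one page at a time.
    """
    import pyvips  # imported lazily so the check above runs without pyvips

    # A default load reads only the first page, but the metadata still
    # carries the document's total page count.
    first = pyvips.Image.pdfload(path, dpi=dpi)
    n_pages = first.get_n_pages()
    return [
        pyvips.Image.pdfload(path, page=i, n=1, dpi=dpi)
        for i in range(n_pages)
    ]


# Values copied from the output above.
height, page_height, n_pages = 1622346, 1650, 925

print(pages_tile_evenly(height, page_height, n_pages))  # False
print(height % page_height)                             # 396
```

With the values from the output, 925 × 1650 = 1526250, which differs from the reported height of 1622346, so at least some pages would have to be taller than the first one if this reading is right.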
OS details: I have only tested this in Debian 11 and Ubuntu Docker containers.
`lsb_release -a`:
Distributor ID: Debian
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye
`uname -a`:
Linux SandboxHost-638582921772039215 5.10.102.2-microsoft-standard #1 SMP Mon Mar 7 17:36:34 UTC 2022 x86_64 GNU/Linux
Python version: 3.10.14
pyvips version: 2.2.3
Could you please help?