PIL.UnidentifiedImageError: cannot identify image file
I am trying to read a PNG image from a PDF file, but I get the following error:
Traceback (most recent call last):
  File "D:\workspace\pdfextraction\pdfextract.py", line 10, in <module>
    im = page.images[1].as_pil() # requires pillow
  File "C:\Users\Lenovo\AppData\Local\Programs\Python\Python39\lib\site-packages\minecart\content.py", line 368, in as_pil
    image = PIL.Image.open(io.BytesIO(image_data))
  File "C:\Users\Lenovo\AppData\Roaming\Python\Python39\site-packages\PIL\Image.py", lin
Here is my code:
import minecart

pdffile = open('RunScribe-330601.pdf', 'rb')
doc = minecart.Document(pdffile)
page = doc.get_page(1)

# for shape in page.shapes.iter_in_bbox((0, 0, 100, 200)):
#     print(shape.path, shape.fill.color.as_rgb())

im = page.images[1].as_pil()  # requires pillow; this is the call that raises
# im.show()

for image in page.images:
    print(image.as_pil())
I have tried several PDF files containing PNG and JPEG images. Pages with JPEG images work fine; the error only shows up with the PNG ones. Here is the PDF file that I tried:
https://drive.google.com/file/d/1i_ZY5JPYEfs_v43DFuHUEL0eUaeSIv0R/view?usp=sharing
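One way to narrow this down might be to look at the first few bytes of the data that ends up in PIL.Image.open. The sketch below is a hypothetical helper (sniff_image_format is not part of minecart or Pillow) that only checks the standard JPEG and PNG file signatures. How the raw bytes are pulled out of the minecart image object is deliberately left out, since that depends on minecart internals.

```python
def sniff_image_format(data: bytes) -> str:
    """Guess the file format of a byte string from its leading signature.

    Hypothetical diagnostic helper; it only knows the JPEG and PNG signatures.
    """
    if data.startswith(b"\xff\xd8\xff"):
        # Every JPEG file begins with the SOI marker followed by another marker.
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        # The fixed 8-byte PNG file signature.
        return "png"
    # Anything else: not a complete JPEG or PNG file.
    return "unknown"


if __name__ == "__main__":
    print(sniff_image_format(b"\xff\xd8\xff\xe0\x00\x10JFIF"))
    print(sniff_image_format(b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"))
    print(sniff_image_format(b"\x00\x01\x02\x03"))
```

If the failing images come back as "unknown", one possible explanation is that the PDF stores them as decompressed pixel samples rather than as a complete PNG file, which PIL.Image.open cannot identify on its own. That is a guess to check, not a confirmed diagnosis.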
Any pointers on what could be the reason?
Regards, Aravind.