
Error when extracting 'jpeg' image from pptx, "AttributeError: 'Part' object has no attribute 'image'"

Open wzp123123 opened this issue 2 years ago • 3 comments

I use the below code to extract images from pptx:

    if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
        image_bytes = shape.image.blob

For some images, this raises:

    File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/pptx/shapes/picture.py:195, in Picture.image(self)
        193 if rId is None:
        194     raise ValueError("no embedded image")
    --> 195 return slide_part.get_image(rId)

    File ~/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/pptx/parts/slide.py:30, in BaseSlidePart.get_image(self, rId)
         24 def get_image(self, rId):
         25     """
         26     Return an |Image| object containing the image related to this slide
         27     by rId. Raises |KeyError| if no image is related by that id, which
         28     would generally indicate a corrupted .pptx file.
         29     """
    ---> 30     return self.related_part(rId).image

    AttributeError: 'Part' object has no attribute 'image'


I found that all 'png' images extract successfully, while all 'jpeg' images fail.

wzp123123, Dec 04 '23 08:12

As a workaround, I can extract the images with zipfile from "ppt/media". Ref: https://github.com/madyel/extract_media_ppt
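A minimal sketch of that workaround, using only the standard library. The function name `read_media` is made up for illustration, and the code assumes the images live under "ppt/media/" in the package, as they do in the files discussed here:

```python
import zipfile
from pathlib import PurePosixPath


def read_media(pptx_file):
    """Return {filename: bytes} for every entry under ppt/media/.

    `pptx_file` may be a path or a file-like object. This bypasses
    python-pptx entirely, so the declared MIME type is never consulted.
    """
    media = {}
    with zipfile.ZipFile(pptx_file) as package:
        for name in package.namelist():
            # Skip everything outside the media folder, and any directory entries.
            if name.startswith("ppt/media/") and not name.endswith("/"):
                media[PurePosixPath(name).name] = package.read(name)
    return media
```

Calling `read_media("deck.pptx")` (a placeholder filename) would return the raw image bytes keyed by their names inside the package.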

wzp123123, Dec 04 '23 09:12

That, in fact, might be a faster way of doing it. But you lose the context of which slide they came from - which you might not care about.
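If the slide context does matter, it can probably be recovered from the package itself. The sketch below is an illustration, not part of python-pptx: it reads each slide's relationships file ("ppt/slides/_rels/slideN.xml.rels") and collects the image targets. The function name `images_by_slide` is hypothetical, and the slide part names it returns do not necessarily match the presentation order.

```python
import posixpath
import re
import zipfile
import xml.etree.ElementTree as ET

# Namespaces defined by the Open Packaging Conventions / OOXML.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
IMAGE_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"


def images_by_slide(pptx_file):
    """Map each slide part name to the media part names it references."""
    result = {}
    with zipfile.ZipFile(pptx_file) as package:
        for name in package.namelist():
            match = re.fullmatch(r"ppt/slides/_rels/(slide\d+\.xml)\.rels", name)
            if match is None:
                continue
            slide_name = "ppt/slides/" + match.group(1)
            root = ET.fromstring(package.read(name))
            targets = []
            for rel in root.iter(REL_NS + "Relationship"):
                # Keep only embedded images; linked (external) ones are not in the zip.
                if rel.get("Type") != IMAGE_REL or rel.get("TargetMode") == "External":
                    continue
                # Targets are usually relative to the slide folder, e.g. "../media/image1.jpg".
                joined = posixpath.join("ppt/slides", rel.get("Target"))
                targets.append(posixpath.normpath(joined).lstrip("/"))
            result[slide_name] = targets
    return result
```

The returned part names can then be passed to `ZipFile.read` to fetch the bytes for each slide's images.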

MartinPacker, Dec 04 '23 09:12

> That, in fact, might be a faster way of doing it. But you lose the context of which slide they came from - which you might not care about.

I hit the above error when I extract the 'jpeg' images directly 😂.

wzp123123, Dec 05 '23 03:12

This turns out to be caused by a package, perhaps Adobe PDF Converter, using an invalid MIME-type for a JPEG image. It uses image/jpg, which is not a valid MIME-type, where it should use image/jpeg.
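To check whether a given file is affected, one option is to list the content types the package declares. This is a rough diagnostic sketch, assuming the invalid type appears in "[Content_Types].xml", which is where OPC packages record part content types; the helper name `declared_content_types` is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the OPC content-types part.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"


def declared_content_types(pptx_file):
    """Return the set of content types declared in [Content_Types].xml."""
    with zipfile.ZipFile(pptx_file) as package:
        root = ET.fromstring(package.read("[Content_Types].xml"))
    # Both <Default> (per extension) and <Override> (per part) carry a ContentType.
    return {
        element.get("ContentType")
        for element in root
        if element.tag in (CT_NS + "Default", CT_NS + "Override")
    }
```

If "image/jpg" shows up in the result, the file most likely has the problem described here.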

The fix is to accept image/jpg as an "alias" for image/jpeg, which is consistent with PowerPoint's behavior in this situation.
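The alias idea can be illustrated roughly as follows. This is only a sketch of the concept, not the actual change made in python-pptx; `CONTENT_TYPE_ALIASES` and `normalize_content_type` are hypothetical names.

```python
# Hypothetical alias table; not python-pptx's actual implementation.
CONTENT_TYPE_ALIASES = {"image/jpg": "image/jpeg"}


def normalize_content_type(content_type):
    """Return the canonical MIME type for a declared content type.

    Known aliases are mapped to their canonical form; anything else is
    returned lowercased, since MIME types are case-insensitive.
    """
    key = content_type.strip().lower()
    return CONTENT_TYPE_ALIASES.get(key, key)
```

With a mapping like this applied before the content type is used to choose a part class, a part declared as image/jpg would be handled the same way as image/jpeg.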

scanny, Aug 03 '24 03:08