There is an issue with the image generated by the page.get_pixmap() function

Open 1339503169 opened this issue 1 year ago • 5 comments

Description of the bug

img_test.pdf: The image produced by page.get_pixmap() contains characters that are not present in the PDF. The source file contains text that appears to read 'From (Shipper) 发件人', but the rendered image does not match it. The converted image is shown below, with the red box marking the error. You can compare it with img_test.pdf.

[Image: converted output, with the error marked by a red box]

How to reproduce the bug

Here is the code I used to generate the image:

```python
import fitz  # PyMuPDF

document = fitz.open('./data/img_test.pdf')
page = document.load_page(0)  # first page

# Render at 2x zoom with no rotation
rotate = int(0)
zoom_x, zoom_y = 2, 2
trans = fitz.Matrix(zoom_x, zoom_y).prerotate(rotate)

pix = page.get_pixmap(matrix=trans, alpha=False)
pix.save('data/img_test.png')
```

What should I do to get the correct picture?

PyMuPDF version

1.23.7 or earlier

Operating system

Windows

Python version

3.8

1339503169 avatar Jan 03 '24 10:01 1339503169

Submitted a bug report at https://bugs.ghostscript.com/show_bug.cgi?id=707451.

JorjMcKie avatar Jan 04 '24 14:01 JorjMcKie

Just FYI, that file renders incorrectly in Evince on Fedora GNU/Linux (which is completely independent of PyMuPDF).

[Image: screenshot of the Evince rendering]

cbm755 avatar Jan 05 '24 02:01 cbm755

> Just FYI, that file renders incorrectly in Evince on Fedora GNU/Linux (which is completely independent of PyMuPDF).

Thanks for this, Colin. Yeah, maybe there is a general issue with these files. I am sure we will soon hear from our friends at MuPDF.

JorjMcKie avatar Jan 07 '24 12:01 JorjMcKie

The file does indeed look broken. We have a fix in 1.24 that improves it.

The text now says "1 Front(Shipper)", albeit with dodgy spacing.

Essentially, it's a broken file, and we're doing as well with it as we can.

The commit in question is:

https://git.ghostscript.com/?p=mupdf.git;a=commitdiff;h=0a5b60420

I'll see about pulling this back to 1.23.x so you can get access to it soon.

robinwatts avatar Jan 30 '24 18:01 robinwatts

The MuPDF team has developed a fix that will at least improve the rendering of this type of page.

JorjMcKie avatar Jan 31 '24 17:01 JorjMcKie

Fixed in 1.24.0.
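
For readers who want to check whether an installed version is at or above the release named in this thread, here is a minimal sketch. The helper names `parse_version` and `has_rendering_fix` are hypothetical and not part of PyMuPDF. It assumes a plain dotted version string such as "1.23.7" and treats anything from 1.24.0 onward as containing the fix. It does not account for a possible backport to 1.23.x, which the thread mentions only as a plan.

```python
import re

# Per the thread: "Fixed in 1.24.0". Backports to 1.23.x are not modelled here.
FIX_VERSION = (1, 24, 0)


def parse_version(version: str) -> tuple:
    """Turn a dotted version string into a 3-tuple of ints.

    Hypothetical helper: only the leading digits of each of the first
    three components are used, so a suffix like "rc1" is ignored.
    """
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad "1.24" to (1, 24, 0)
        parts.append(0)
    return tuple(parts)


def has_rendering_fix(version: str) -> bool:
    """Return True if the version is at or above the release named in the thread."""
    return parse_version(version) >= FIX_VERSION
```

One way to get the installed version string is `importlib.metadata.version("PyMuPDF")` from the standard library (Python 3.8+). Pass the result to `has_rendering_fix`. Even on a fixed version the file may not render perfectly, since the thread describes it as a broken file whose rendering is only improved.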