PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.

Results: 162 PyMuPDF issues

I got a PyMuPDF Pro key and tried to call `pymupdf4llm.mark_down(example.hwpx)`, then got a runtime error: `RuntimeError: code=7: cannot find entry Contents/Contents/header.xml`. But this did not happen when I executed the mark_down function...

upstream bug
fix developed

### Description of the bug `doc.set_toc` does not always set y-positions properly. In my example PDF, they are off by 90 pts. ### How to reproduce the bug [buggy_toc_positions.pdf](https://github.com/user-attachments/files/20021312/buggy_toc_positions.pdf)...

bug

### Description of the bug In a `.docx` file, PyMuPDF Pro did not display certain purple graphical elements: With PyMuPDF Pro: ![image](https://github.com/user-attachments/assets/d8c3c577-703a-456d-a69e-42810cb60eba) With Google Docs or Mac's Pages app: ![image](https://github.com/user-attachments/assets/f1693262-9c8f-4d83-b71d-5a1741ac60ea)...

upstream bug

### Description of the bug Not sure if this is documented behavior, but couldn't find it. To recreate: 1. Take a PNG file and rename it with a PDF extension....

documentation issue
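
The report above starts from a PNG file renamed with a PDF extension. As a hedged illustration of why an extension alone says little about a file, here is a minimal standard-library sketch that inspects the leading bytes instead. It does not use PyMuPDF and makes no claim about how PyMuPDF itself detects formats; `sniff_format` and `sniff_file` are hypothetical helper names introduced only for this example.

```python
# Minimal sketch (assumption: not PyMuPDF's own detection logic).
# Identify a file by its leading "magic" bytes rather than its extension.

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG file signature
PDF_SIGNATURE = b"%PDF-"              # PDF header marker


def sniff_format(data: bytes) -> str:
    """Return 'png', 'pdf', or 'unknown' based on the first bytes of a file."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    # Simplification: only a header at offset 0 is recognized here.
    if data.startswith(PDF_SIGNATURE):
        return "pdf"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as fh:
        return sniff_format(fh.read(8))
```

A PNG renamed to `example.pdf` would still be classified as `"png"` by this check, because only the content is examined.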

### Description of the bug For some documents, PyMuPDF Pro splits the document into many more pages than if I open the document with Google Docs (or Mac Pages/LibreOffice). This...

upstream bug

### Description of the bug Sometimes, an embedded image inside a `.doc` file overlaps with the text when creating an image of the document using `get_pixmap()`, although in other software...

upstream bug

### Description of the bug I am using the 940b: https://www.irs.gov/pub/irs-pdf/f940b.pdf The PDF file has identical pages, and each page has this specific dropdown: ![image](https://github.com/user-attachments/assets/af3ea015-5a18-48eb-bef8-8aaa8d5d350b) The choice_values variable is empty....

enhancement

### Description of the bug Based on my research, the MediaBox defines the size of the PDF page. The CropBox defines the rectangle of the page displayed by PDF viewers. Pixmap displays the...

bug
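
The MediaBox/CropBox relationship described in the report above can be sketched in plain Python. This is a hedged illustration, not PyMuPDF code: it assumes the PDF specification's rules that the CropBox defaults to the MediaBox when absent and is effectively clipped to the MediaBox when it extends beyond it. `Rect` and `effective_cropbox` are hypothetical names introduced for this example.

```python
# Minimal sketch (assumption: models the PDF spec's box rules, not PyMuPDF internals).
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned rectangle given by two opposite corners, in points."""
    x0: float
    y0: float
    x1: float
    y1: float


def effective_cropbox(mediabox: Rect, cropbox: Optional[Rect] = None) -> Rect:
    """Return the visible page region implied by the two boxes."""
    if cropbox is None:
        # With no CropBox entry, the whole MediaBox is the visible region.
        return mediabox
    # A CropBox reaching outside the MediaBox is reduced to their intersection.
    return Rect(
        max(mediabox.x0, cropbox.x0),
        max(mediabox.y0, cropbox.y0),
        min(mediabox.x1, cropbox.x1),
        min(mediabox.y1, cropbox.y1),
    )


letter = Rect(0, 0, 612, 792)           # US Letter MediaBox
oversized = Rect(-10, 50, 700, 700)     # CropBox partly outside the page
visible = effective_cropbox(letter, oversized)
```

Here `visible` is the part of `oversized` that lies inside `letter`; a renderer following these rules would draw only that region.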

### Description of the bug Cached data from one PDF file can cause incorrect colors and shapes when generating a Pixmap of a page of a different PDF file....

upstream bug

Is there any way this tool can be installed on Docker using Docker Compose? It's a great tool, and I would be very happy if there is a way to use...

enhancement