PyMuPDF
Redacting results are not as expected in 1.24.x.
Description of the bug
Hi there! Thanks for the excellent software for manipulating PDF files.
I have encountered the same issue as #3375. I am using pymupdf to remove some information from a commercial report through code, as shown below:
from typing import List
import os

import fitz
from fitz import Rect, Page


def handlePDF(doc):
    # Regions to blank out; coordinates assume a 595 x 842 pt page.
    content_footer_rect = Rect(0, 772, 595, 842)
    content_header_rect = Rect(0, 0, 595, 70)
    cover_footer_rect = Rect(0, 600, 595, 842)
    cover_upper_rect = Rect(0, 0, 595, 140)

    print("Handling PDF from Jianbo")
    print("Handling cover page")
    p0: Page = doc[0]
    p0.add_redact_annot(cover_upper_rect)
    p0.add_redact_annot(cover_footer_rect)
    p0.apply_redactions()

    # Strip header and footer from every page (the cover included).
    print("Handling body")
    for page in doc:
        page.add_redact_annot(content_header_rect)
        page.add_redact_annot(content_footer_rect)
        page.apply_redactions()

    # Remove a full-width band around each copyright notice on the last page.
    print("Handling tailpage")
    last_page = doc[len(doc) - 1]
    copyright_positions: List[Rect] = last_page.search_for("Copyright BGI All Rights Reserved")
    for rect in copyright_positions:
        last_page.add_redact_annot(Rect(0, rect.y0 - 10, 595, rect.y0 + 30))
    last_page.apply_redactions()
    return doc


folder = os.path.dirname(__file__)
handlePDF(fitz.open(os.path.join(folder, "test.pdf"))).save(os.path.join(folder, "output.pdf"))
How to reproduce the bug
- Download bug-report.tar.gz, extract it, and cd to bug-report.
- Create a virtual Python environment with pymupdf 1.24.1 or 1.24.0 installed.
- Activate the virtual environment and run python test.py. You will see the result I reported above.
PyMuPDF version
1.24.1
Operating system
Linux
Python version
3.11
This has nothing to do with #3375. It seems to be a MuPDF issue. I will submit a report to their bug tracker.
This 1-page file contains a redaction annotation which will ruin the page's appearance when applied: problem.pdf
MuPDF issue reference: https://bugs.ghostscript.com/show_bug.cgi?id=707733
The MuPDF team has resolved the problem. The fix will be rolled out with one of the next versions.
Wow, @JorjMcKie, thank you very much for your response and your efforts to solve this problem. I will wait for the next release.
BTW, I have edited my issue comment to fix its grammar and spelling errors.
This has been fixed in v1.24.2.
Hi, JorjMcKie. Thank you very much for your efforts in resolving this problem.
Unfortunately, after I upgraded pymupdf and reran the code in https://github.com/pymupdf/PyMuPDF/issues/3376#issue-2239590902, the PDF appearance is still ruined. :(
$ git clone https://github.com/pymupdf/PyMuPDF.git && cd PyMuPDF && pip install .
$ pip list | grep PyMuPDF
PyMuPDF 1.24.2
PyMuPDFb 1.24.1
$ cd bug-report && python test.py
Is it because the upstream fix has not been released yet?
Sorry - my bad: the MuPDF fix has not been included in this version. We will have to wait ...
Fixed in 1.24.3.
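For scripts that depend on correct redaction output, one option is to check the installed version against 1.24.3 before running. The following is only a minimal sketch: `parse_version` and `has_redaction_fix` are hypothetical helper names (not part of PyMuPDF), and it assumes the version string consists of plain dotted integers, so pre-release suffixes are not handled.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as "1.24.3" into (1, 24, 3).

    Assumption: every component is a plain integer.
    """
    return tuple(int(part) for part in text.strip().split("."))


def has_redaction_fix(installed: str, fixed_in: str = "1.24.3") -> bool:
    """Return True if `installed` is at least the release that carries the fix."""
    # Tuples compare component by component, so (1, 24, 10) > (1, 24, 3),
    # which a plain string comparison would get wrong.
    return parse_version(installed) >= parse_version(fixed_in)


if __name__ == "__main__":
    for candidate in ("1.24.1", "1.24.2", "1.24.3"):
        print(candidate, has_redaction_fix(candidate))
```

The installed version can be taken from the `pip list` output shown above and passed in as a string. Note that this thread shows why the PyMuPDF version number alone was not enough for 1.24.2, which was tagged before the MuPDF fix was included.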