Text removal not working in version 1.23.25 (works in 1.20.2)

Open bilykigor opened this issue 1 year ago • 6 comments

Description of the bug

Removing text from a PDF with PyMuPDF works correctly on Python 3.8 with PyMuPDF 1.20.2 (Python bindings for the MuPDF 1.20.3 library):

Screen Shot 2024-02-21 at 11 12 42 PM

However, on Python 3.12 with PyMuPDF 1.23.25, not all of the text is removed:

Screen Shot 2024-02-21 at 11 09 57 PM

How to reproduce the bug

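```python
import fitz  # PyMuPDF

pdf_doc = fitz.open('data/file.pdf')
page = pdf_doc.load_page(0)

# Mark the bounding box of every word on the page for redaction.
for block in page.get_text("words"):
    rect = fitz.Rect(block[:4])
    page.add_redact_annot(rect)

# Remove everything covered by the redaction annotations.
page.apply_redactions()

pdf_doc.save('data/file_1.pdf')
pdf_doc.close()
```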
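Since the screenshots only show the result visually, a numeric check may make the difference between versions easier to report. The following is a minimal sketch, not part of the original report: the helper names `redact_words_and_count_leftovers` and `check_file` are hypothetical, and it assumes `add_redact_annot` accepts a plain 4-number sequence as a rect-like argument. If that assumption does not hold for your version, wrap the tuple in `fitz.Rect` as the snippet above does.

```python
def redact_words_and_count_leftovers(page):
    """Redact every word on `page` and return how many words remain extractable.

    `page` is expected to behave like a PyMuPDF page object.
    """
    # Each entry is (x0, y0, x1, y1, text, block_no, line_no, word_no).
    for word in page.get_text("words"):
        # Assumption: a 4-number sequence is accepted as a rect-like argument.
        page.add_redact_annot(word[:4])
    page.apply_redactions()
    # Anything still returned here survived the redaction.
    return len(page.get_text("words"))


def check_file(path, page_number=0):
    """Hypothetical driver: report the leftover word count for one page."""
    import fitz  # PyMuPDF, imported lazily so the helper above stays dependency-free

    doc = fitz.open(path)
    try:
        leftovers = redact_words_and_count_leftovers(doc.load_page(page_number))
    finally:
        doc.close()
    # fitz.__doc__ holds the "PyMuPDF x.y.z: Python bindings ..." banner.
    print(f"{fitz.__doc__}\n{leftovers} word(s) still extractable")
    return leftovers
```

Calling `check_file('data/file.pdf')` under each PyMuPDF version would be expected to report zero leftover words where redaction works and a non-zero count where it does not. A zero count only shows that the text is no longer extractable, so a visual check is still worthwhile.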

PyMuPDF version

1.23.25

Operating system

MacOS

Python version

3.12

bilykigor • Feb 21 '24 22:02

Could you add the input file, data/file.pdf, to this issue page?

Unfortunately not, it is under NDA. Maybe I can share some metadata?

bilykigor • Feb 23 '24 19:02

The bug first appears in PyMuPDF 1.23.9; in 1.23.8 everything still works correctly.

bilykigor • Feb 23 '24 19:02

Sorry, but we need data to reproduce the problem. You can use direct e-mail addresses if there are data protection concerns, or else send a file with the confidential data removed.

JorjMcKie • Feb 23 '24 19:02

Is this a valid e-mail address? [email protected]

bilykigor • Feb 23 '24 20:02

@bilykigor yes - use it with confidence

JorjMcKie • Feb 23 '24 21:02

Fixed in 1.24.0.
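
For anyone pinning versions, the range reported in this thread (first broken in 1.23.9, fixed in 1.24.0) can be checked programmatically. This is only a sketch based on the version numbers mentioned above: the names `parse_version`, `is_affected`, `FIRST_BAD` and `FIRST_FIXED` are hypothetical, and it assumes plain dotted numeric version strings.

```python
def parse_version(text):
    """Turn '1.23.25' into (1, 23, 25) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Boundaries as reported in this thread, not taken from a changelog.
FIRST_BAD = parse_version("1.23.9")    # first release reported as broken
FIRST_FIXED = parse_version("1.24.0")  # release reported as fixed


def is_affected(version):
    """Return True if `version` falls in the reported broken range."""
    return FIRST_BAD <= parse_version(version) < FIRST_FIXED


print(is_affected("1.23.25"))  # the version from the report
print(is_affected("1.23.8"))   # last version reported as working
```

Comparing tuples of integers rather than raw strings matters here: as strings, "1.23.25" sorts before "1.23.9", which would wrongly place the reported version outside the broken range.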