PaddleOCR: compatibility problem with the declared PyMuPDF<1.21.0 requirement

https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/requirements.txt

The officially declared dependency is PyMuPDF<1.21.0, but in PyMuPDF 1.20 and later pageCount has already been renamed to page_count.
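To illustrate the rename, here is a minimal sketch of an accessor that works under either name. The helper name `get_page_count` is hypothetical and is not part of PaddleOCR or PyMuPDF; it only assumes the document object exposes `page_count` (new) or `pageCount` (old).

```python
def get_page_count(doc):
    """Return the number of pages of a PyMuPDF-style document.

    Hypothetical helper: tries the new snake_case attribute first and
    falls back to the old camelCase one if the new name is missing.
    """
    count = getattr(doc, "page_count", None)
    if count is None:
        # Older PyMuPDF releases only expose the camelCase name.
        count = doc.pageCount
    return count
```

With a real document this would be called as `get_page_count(fitz.open(path))`, where `path` is whatever PDF is being processed.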

Mar 07 '23 09:03 izerui

Thanks for your contribution!

Mar 07 '23 09:03 paddle-bot[bot]

All committers have signed the CLA.

Mar 07 '23 09:03 CLAassistant

I found a way that does not require modifying the paddleocr source: import fitz and call fitz.restore_aliases() before running OCR, so that fitz restores the old alias mappings.
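The workaround could be wrapped roughly as follows. This is a sketch only: the function name `enable_legacy_fitz_names` is hypothetical, and it assumes the installed PyMuPDF still ships `fitz.restore_aliases()`, which may not hold for every release. Whether restoring the aliases is sufficient depends on the installed versions.

```python
def enable_legacy_fitz_names():
    """Try to restore PyMuPDF's old camelCase aliases.

    Returns True if fitz.restore_aliases() was found and called,
    False if PyMuPDF is not importable or the function is absent.
    """
    try:
        import fitz  # PyMuPDF
    except ImportError:
        return False
    restore = getattr(fitz, "restore_aliases", None)
    if restore is None:
        # Assumption: some PyMuPDF versions may not provide this function.
        return False
    restore()
    return True
```

It would be called once, before the OCR pipeline is created, so that any later use of the old names goes through the restored aliases.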

Mar 13 '23 13:03 izerui

@izerui please sign the License Agreement first

May 25 '23 09:05 xxxpsyduck

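I found a way that does not require modifying the paddleocr source: import fitz and call fitz.restore_aliases() before running OCR, so that fitz restores the old alias mappings.

That does not seem to work either; it raises AttributeError: 'NoneType' object has no attribute 'copy'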

May 30 '23 09:05 nashlibby

This problem was introduced after PyMuPDF 1.18.14. The official documentation on the function renames: https://github.com/pymupdf/PyMuPDF/blob/main/docs/znames.rst#deprecated-names

All PaddleOCR issues related to this problem: https://github.com/PaddlePaddle/PaddleOCR/issues?q=is%3Aissue+ppocr%2Futils%2Futility.py

For example https://github.com/PaddlePaddle/PaddleOCR/issues/9401

Jul 31 '23 06:07 kerneltravel

@kerneltravel since you are fixing this issue, can you do a quick search in the file for any other lines using getPixmap() or pageCount()? I see that this line in the same utility.py file is also using getPixmap()

https://github.com/PaddlePaddle/PaddleOCR/blob/2f72a55cdbaa79651e027e5a84df9ff68ea901a1/ppocr/utils/utility.py#L100C31-L100C40
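A quick search of this kind could be sketched as below. The function `find_deprecated_names` and the `RENAMED` table are hypothetical; the table only lists the two old names mentioned in this thread and would need extending from the PyMuPDF deprecated-names list for a full audit.

```python
import re

# Old camelCase names mentioned in this thread, mapped to their replacements.
RENAMED = {"getPixmap": "get_pixmap", "pageCount": "page_count"}

_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMED)) + r")\b")


def find_deprecated_names(source):
    """Return (line_number, old_name, new_name) for each deprecated name found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMED[old]))
    return hits
```

Reading the contents of ppocr/utils/utility.py and passing them to the function would list every remaining use of the old names in that file.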

Aug 10 '23 09:08 Toefinder

The latest paddleocr cannot even be installed; I had to install pymupdf 1.20 first before the installation would go through. The system is Windows 10.

Nov 30 '23 06:11 cobaltautomationdev

#10181 fixes this, so this PR can be closed @shiyutang

Jan 13 '24 11:01 itasli

Why have the maintainers still not updated this to be compatible with pymupdf versions above 1.20?

Jan 18 '24 02:01 cobaltautomationdev
