
Compatibility problem with the declared PyMuPDF<1.21.0 version requirement

Open izerui opened this issue 1 year ago • 10 comments

https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/requirements.txt

The officially declared dependency is PyMuPDF<1.21.0, but in PyMuPDF 1.20 and later pageCount has been renamed to page_count.

izerui avatar Mar 07 '23 09:03 izerui

Thanks for your contribution!

paddle-bot[bot] avatar Mar 07 '23 09:03 paddle-bot[bot]

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Mar 07 '23 09:03 CLAassistant

Found a way that does not require modifying the paddleocr source: import fitz and call fitz.restore_aliases() before running OCR, so that fitz restores its old alias mappings.
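The workaround above can be sketched roughly as follows. The helper name `restore_legacy_fitz_names` is hypothetical and only wraps the `fitz.restore_aliases()` call mentioned in the comment; it assumes the installed PyMuPDF build exposes that function, and it does nothing if the function is missing. Note that a later reply in this thread reports an AttributeError from this approach, so treat it as something to try, not a guaranteed fix.

```python
def restore_legacy_fitz_names(fitz_module):
    """Call fitz_module.restore_aliases() if it exists.

    Returns True if the function was found and called, False otherwise.
    The helper name is hypothetical; only restore_aliases() comes from
    the comment above.
    """
    restore = getattr(fitz_module, "restore_aliases", None)
    if callable(restore):
        restore()  # re-register the old camelCase names (e.g. pageCount)
        return True
    return False


# Intended usage (assumes PyMuPDF and paddleocr are installed):
#
#   import fitz
#   restore_legacy_fitz_names(fitz)   # must run before any OCR call
#   from paddleocr import PaddleOCR
#   ...
```

The call has to happen before PaddleOCR touches a PDF, because the old names are looked up at the moment they are used.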

izerui avatar Mar 13 '23 13:03 izerui

@izerui please sign the License Agreement first

xxxpsyduck avatar May 25 '23 09:05 xxxpsyduck

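Quoting izerui: "Found a way that does not require modifying the paddleocr source: import fitz and call fitz.restore_aliases() before running OCR, so that fitz restores its old alias mappings."

That does not seem to work either; it raises AttributeError: 'NoneType' object has no attribute 'copy'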

nashlibby avatar May 30 '23 09:05 nashlibby

This problem was introduced after PyMuPDF 1.18.14. The official documentation on the function renames: https://github.com/pymupdf/PyMuPDF/blob/main/docs/znames.rst#deprecated-names

All PaddleOCR issues related to this problem: https://github.com/PaddlePaddle/PaddleOCR/issues?q=is%3Aissue+ppocr%2Futils%2Futility.py

For example: https://github.com/PaddlePaddle/PaddleOCR/issues/9401

kerneltravel avatar Jul 31 '23 06:07 kerneltravel

@kerneltravel since you are fixing this issue, can you do a quick search in the file for any other lines using getPixmap() or pageCount()? I see that this line in the same utility.py file is also using getPixmap()

https://github.com/PaddlePaddle/PaddleOCR/blob/2f72a55cdbaa79651e027e5a84df9ff68ea901a1/ppocr/utils/utility.py#L100C31-L100C40
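One way to handle both spellings at such call sites is a small compatibility shim. This is only a sketch, not the code from the PaddleOCR fix: the helper names `get_page_count` and `render_page` are hypothetical, and it assumes the document and page objects expose either the newer snake_case names (`page_count`, `get_pixmap`) or the older camelCase ones (`pageCount`, `getPixmap`).

```python
def get_page_count(doc):
    """Return the number of pages, preferring the newer attribute name."""
    count = getattr(doc, "page_count", None)
    if count is None:
        # Older PyMuPDF releases only have the camelCase property.
        count = doc.pageCount
    return count


def render_page(page, **kwargs):
    """Render a page to a pixmap using whichever method name exists."""
    render = getattr(page, "get_pixmap", None)
    if render is None:
        # Fall back to the old camelCase method name.
        render = page.getPixmap
    return render(**kwargs)
```

With helpers like these, the calling code would not need to change again if the pinned PyMuPDF version moves in either direction.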

Toefinder avatar Aug 10 '23 09:08 Toefinder

The latest paddleocr cannot be installed at all; PyMuPDF 1.20 has to be installed first before the installation goes through. The system is Windows 10.

cobaltautomationdev avatar Nov 30 '23 06:11 cobaltautomationdev

#10181 fixes this; the PR can be closed @shiyutang

itasli avatar Jan 13 '24 11:01 itasli

Why has the official project still not been updated to be compatible with PyMuPDF versions above 1.20?

cobaltautomationdev avatar Jan 18 '24 02:01 cobaltautomationdev