Nuitka packaging problem with paddlex and paddleocr 3.0

Open SHOUshou0426 opened this issue 6 months ago • 11 comments

🔎 Search before asking

  • [x] I have searched the PaddleOCR Docs and found no similar bug report.
  • [x] I have searched the PaddleOCR Issues and found no similar bug report.
  • [x] I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem description)

Packaged with Nuitka. Image

Image

🏃‍♂️ Environment (Runtime environment)

Windows 11 with CUDA 11.6 installed

🌰 Minimal Reproducible Example

@echo off
@REM set name=aoi_pyside6
set name=aoigpu
echo name=%name%
for /f "delims=" %%t in ('conda info --base') do set base=%%t
echo base=%base%
set conda_path=%base%\envs\%name%\Lib\site-packages
echo conda_path=%conda_path%
call %base%\Scripts\activate.bat %base%
call conda activate %name%
call nuitka --mingw64 ^
    --standalone ^
    --show-progress ^
    --enable-plugin=pyside6 ^
    --include-distribution-metadata=paddlex ^
    --include-distribution-metadata=paddleocr ^
    --include-distribution-metadata=paddlepaddle-gpu ^
    ./ocr_test.py

pause

SHOUshou0426 avatar Jun 18 '25 08:06 SHOUshou0426

You also need to include the distribution metadata of the other packages that PaddleOCR requires. The following code checks whether the dependencies are installed, so the metadata of all of them must be present:

def _get_extras():
    metadata = importlib.metadata.metadata("paddlex")
    extras = {}
    # XXX: The `metadata.get_all` used here is not well documented.
    for name in metadata.get_all("Provides-Extra", []):
        if name not in _COLLECTIVE_EXTRA_NAMES:
            extras[name] = defaultdict(list)
    for dep_spec in importlib.metadata.requires("paddlex"):
        extra_name, dep_spec = _get_extra_name_and_remove_extra_marker(dep_spec)
        if extra_name is not None and extra_name not in _COLLECTIVE_EXTRA_NAMES:
            dep_spec = dep_spec.rstrip()
            req = Requirement(dep_spec)
            assert extra_name in extras, extra_name
            extras[extra_name][req.name].append(dep_spec)
    return extras

You can find the metadata file in the paddlex dist folder. For the ocr extra these are the required dependencies:

Provides-Extra: ocr
Requires-Dist: ftfy; extra == "ocr"
Requires-Dist: imagesize; extra == "ocr"
Requires-Dist: lxml; extra == "ocr"
Requires-Dist: opencv-contrib-python==4.10.0.84; extra == "ocr"
Requires-Dist: openpyxl; extra == "ocr"
Requires-Dist: premailer; extra == "ocr"
Requires-Dist: pyclipper; extra == "ocr"
Requires-Dist: pypdfium2>=4; extra == "ocr"
Requires-Dist: scikit-learn; extra == "ocr"
Requires-Dist: shapely; extra == "ocr"
Requires-Dist: tokenizers==0.19.1; extra == "ocr"

Include the metadata of all these packages as well. That should resolve your issue.
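As a rough sketch of the step above, the package names for one extra can be pulled out of the `Requires-Dist` lines and turned into Nuitka options. The helper names `names_for_extra` and `nuitka_metadata_flags` are made up for illustration, the marker matching is a simple string pattern rather than a full parser, and the `numpy` entry is a hypothetical non-extra dependency added only for contrast. In a real environment the input would presumably come from `importlib.metadata.requires("paddlex")`.

```python
import re

# Sample Requires-Dist values copied from the list above, plus one
# hypothetical entry without an extra marker to show that it is skipped.
SAMPLE_REQUIRES = [
    'ftfy; extra == "ocr"',
    'opencv-contrib-python==4.10.0.84; extra == "ocr"',
    'pypdfium2>=4; extra == "ocr"',
    'numpy>=1.24',  # hypothetical, not taken from the thread
]

# A distribution name is the leading run of letters, digits, '.', '_' and '-'.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def names_for_extra(requires, extra):
    """Return the distribution names whose marker mentions the given extra."""
    marker = re.compile(r"extra\s*==\s*[\"']" + re.escape(extra) + r"[\"']")
    names = []
    for spec in requires or []:
        requirement, _, markers = spec.partition(";")
        match = _NAME.match(requirement)
        if match and marker.search(markers):
            names.append(match.group(1))
    return names


def nuitka_metadata_flags(names):
    """Build one --include-distribution-metadata option per name."""
    return ["--include-distribution-metadata=" + name for name in names]


if __name__ == "__main__":
    for flag in nuitka_metadata_flags(names_for_extra(SAMPLE_REQUIRES, "ocr")):
        print(flag)
```

The printed options can then be pasted into the build script. The list depends on the installed paddlex version, so it is worth regenerating after an upgrade.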

timminator avatar Jun 18 '25 14:06 timminator

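Is there a way to simplify packaging for PaddleOCR? Packaging takes a very long time, and when an error occurs I cannot debug it to find the cause. At the moment the build pulls in all dependencies.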

SHOUshou0426 avatar Jun 18 '25 14:06 SHOUshou0426

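I have confirmed that the packages listed above are installed, and I copied the environment's dependencies into the current directory. Thank you very much for your reply.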

SHOUshou0426 avatar Jun 18 '25 14:06 SHOUshou0426

Try it again by adding --include-distribution-metadata=ftfy, --include-distribution-metadata=imagesize and so on, that should resolve the error you reported. I did it myself and got it working.

timminator avatar Jun 18 '25 14:06 timminator

OK, thank you for your reply. I am trying it now and will let you know the result.

SHOUshou0426 avatar Jun 18 '25 14:06 SHOUshou0426

Sorry, one more question: how did you look up the dependency information you provided, for example the following? Provides-Extra: ocr Requires-Dist: ftfy; extra == "ocr"

SHOUshou0426 avatar Jun 18 '25 15:06 SHOUshou0426

You can find the metadata file in the paddlex dist folder.

Go into paddlex-3.0.0.dist-info; there's a metadata file. Open it. Then add the --include-distribution-metadata option for each of the packages I listed above.
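If it is not obvious where that dist-info folder lives, the standard library can usually locate it. The following is a minimal sketch; `metadata_file` is a hypothetical helper name, and it assumes the package was installed from a wheel so that its file list is recorded.

```python
from importlib import metadata


def metadata_file(dist_name):
    """Return the on-disk path of a distribution's METADATA file, or None."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # dist.files can be None when the installer did not record a file list.
    for entry in dist.files or []:
        if entry.name == "METADATA" and entry.parent.name.endswith(".dist-info"):
            return dist.locate_file(entry)
    return None


if __name__ == "__main__":
    path = metadata_file("paddlex")
    print(path if path else "paddlex METADATA not found in this environment")
```

Run it inside the same conda environment that is used for the Nuitka build, otherwise it may report a different installation.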

timminator avatar Jun 18 '25 15:06 timminator

OK, thank you very much.

SHOUshou0426 avatar Jun 18 '25 15:06 SHOUshou0426

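Hello, I tried the packaging approach you described. The paddlex error no longer appears and the output shows that the model has been loaded, but now the google package fails to import. Do I need --include-distribution-metadata=google? I also tried copying the google package into the current directory, but that did not work either.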

Image

Image

SHOUshou0426 avatar Jun 19 '25 02:06 SHOUshou0426

This is a new issue different from the missing metadata from before. Please tell me your Nuitka version and your Python version. If possible please also post your ocr_test python script.

timminator avatar Jun 19 '25 07:06 timminator

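Hello, thank you for your reply. The problem above is solved: I had to change the protobuf version to 3.20.2. The installed version was probably too new and its API had changed. Thank you very much for your replies and for the solution.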

SHOUshou0426 avatar Jun 19 '25 08:06 SHOUshou0426

Hello, I packaged with the script below and still hit the same problem as the original poster. What am I doing wrong? My Python version is 3.13.

@echo off
setlocal enabledelayedexpansion

rem Main parameters
set PYINSTALLER=pyinstaller
set SCRIPT=./png_to_txt.py
set ICON=app.ico
set NAME=PlateNumberAI
set PYTHON_DIR=C:/Users/26279/AppData/Local/Programs/Python/Python313

rem Data paths
set PADDLEX_PATH="%PYTHON_DIR%/Lib/site-packages/paddlex"
set PADDLE_BIN_PATH="%PYTHON_DIR%/Lib/site-packages/paddle/libs;."

rem Run the packaging command, using arguments compatible with older versions
%PYINSTALLER% --noconsole ^
    --icon=%ICON% ^
    --name=%NAME% ^
    --add-data %PADDLEX_PATH%;paddlex ^
    --add-binary %PADDLE_BIN_PATH% ^
    --copy-metadata ftfy ^
    --copy-metadata imagesize ^
    --copy-metadata lxml ^
    --copy-metadata opencv-contrib-python ^
    --copy-metadata openpyxl ^
    --copy-metadata premailer ^
    --copy-metadata pyclipper ^
    --copy-metadata pypdfium2 ^
    --copy-metadata scikit-learn ^
    --copy-metadata shapely ^
    --copy-metadata tokenizers ^
    --hidden-import scipy.special._cdflib ^
    --hidden-import paddlex.ocr ^
    %SCRIPT%

endlocal

The packaged exe fails as soon as execution reaches this point:

def png_to_txt(pdf_path, output_path, doc_path):
    pdf_name = os.path.basename(pdf_path).split(".")[0]
    # Initialize the PaddleOCR instance (modified version)
    ocr = PaddleOCR(
        use_doc_orientation_classify=False,
        use_doc_unwarping=False,
        text_detection_model_dir=resource_path("paddleocr_model/det"),
        text_recognition_model_dir=resource_path("paddleocr_model/rec"),
        use_textline_orientation=False)
    # Run OCR inference on the sample image
    result = ocr.predict(
        input=pdf_path
    )

ACatThatBitesDogs avatar Jul 17 '25 10:07 ACatThatBitesDogs

First of all, PaddleOCR does not officially support Python 3.13 yet, so I would recommend using a version such as 3.12. I also suspect that you are not using PaddleOCR 3.0 but something newer, like PaddleOCR 3.1.0, because the solution I described here definitely works with PaddleOCR 3.0.0. For PaddleOCR 3.1.0 you need to include the metadata of a few more packages. I answered that in another issue, #15918, specifically in this comment: https://github.com/PaddlePaddle/PaddleOCR/issues/15918#issuecomment-3023825346

Please use the instructions I specified there and try it again.

timminator avatar Jul 17 '25 12:07 timminator

nuitka --standalone main.py ^
    --include-distribution-metadata=paddlex ^
    --include-distribution-metadata=paddlepaddle-gpu ^
    --include-distribution-metadata=paddleocr ^
    --include-distribution-metadata=ftfy ^
    --include-distribution-metadata=imagesize ^
    --include-distribution-metadata=lxml ^
    --include-distribution-metadata=opencv_contrib_python ^
    --include-distribution-metadata=openpyxl ^
    --include-distribution-metadata=premailer ^
    --include-distribution-metadata=pyclipper ^
    --include-distribution-metadata=pypdfium2 ^
    --include-distribution-metadata=scikit_learn ^
    --include-distribution-metadata=shapely ^
    --include-distribution-metadata=tokenizers ^
    --include-distribution-metadata=einops ^
    --include-distribution-metadata=jinja2 ^
    --include-distribution-metadata=regex ^
    --include-distribution-metadata=tiktoken

Hello, I packaged it like this, but it does not seem to have worked.

Sam-gsj avatar Jul 17 '25 13:07 Sam-gsj

@Sam-gsj Same error as reported in the original post?

timminator avatar Jul 17 '25 14:07 timminator

@timminator Error is "Exception: The pipeline (OCR) does not exist! Please use a pipeline name or a config file path! "

Sam-gsj avatar Jul 17 '25 14:07 Sam-gsj

You need to add: --include-package-data=paddleocr --include-package-data=paddlex

That should resolve your error. And please include that error message right away, I cannot help you without knowing what's going on.

timminator avatar Jul 17 '25 14:07 timminator

@timminator When I add --include-package-data=paddleocr --include-package-data=paddlex

nuitka --standalone main.py ^
    --include-distribution-metadata=paddlex ^
    --include-distribution-metadata=paddlepaddle-gpu ^
    --include-distribution-metadata=paddleocr ^
    --include-distribution-metadata=ftfy ^
    --include-distribution-metadata=imagesize ^
    --include-distribution-metadata=lxml ^
    --include-distribution-metadata=opencv_contrib_python ^
    --include-distribution-metadata=openpyxl ^
    --include-distribution-metadata=premailer ^
    --include-distribution-metadata=pyclipper ^
    --include-distribution-metadata=pypdfium2 ^
    --include-distribution-metadata=scikit_learn ^
    --include-distribution-metadata=shapely ^
    --include-distribution-metadata=tokenizers ^
    --include-distribution-metadata=einops ^
    --include-distribution-metadata=jinja2 ^
    --include-distribution-metadata=regex ^
    --include-distribution-metadata=tiktoken ^
    --include-package-data=paddleocr ^
    --include-package-data=paddlex

the error is:

PS C:\Users\CUG\Desktop\PaddleOCR-main\main.dist> .\main.exe
Traceback (most recent call last):
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\main.py", line 2, in <module>
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddleocr\_pipelines\ocr.py", line 161, in __init__
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddleocr\_pipelines\base.py", line 66, in __init__
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddleocr\_pipelines\base.py", line 100, in _create_paddlex_pipeline
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddlex\inference\pipelines\__init__.py", line 166, in create_pipeline
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddlex\utils\deps.py", line 194, in _wrapper
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddlex\utils\deps.py", line 187, in require_extra
RuntimeError: OCR requires additional dependencies. To install them, run pip install "paddlex[ocr]==<PADDLEX_VERSION>" if you’re installing paddlex from an index, or pip install -e "/path/to/PaddleX[ocr]" if you’re installing paddlex locally.

Sam-gsj avatar Jul 17 '25 15:07 Sam-gsj

I'm not quite sure, but from your post you are writing the package names containing hyphens incorrectly:

--include-distribution-metadata=opencv_contrib_python ^

--include-distribution-metadata=scikit_learn ^

You are writing them with an underscore instead of hyphen. Did Nuitka not complain about that? I haven't tested or tried if it works with this wrong spelling.

I also did not include these 3 options in my compile process:

--include-distribution-metadata=paddlex ^
--include-distribution-metadata=paddlepaddle-gpu ^
--include-distribution-metadata=paddleocr ^

timminator avatar Jul 17 '25 16:07 timminator

Tested it! Underscores are fine. That's not it! You could try it again without the three options I mentioned earlier. If it is still not working, you can also give me an example script you are using.
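One plausible reason why both spellings are accepted (an assumption, not something confirmed in this thread) is that Python packaging tools compare project names after normalization, where runs of hyphens, underscores and dots count as the same separator. A small sketch of that rule as described in PEP 503; whether Nuitka applies exactly this rule internally is not verified here.

```python
import re


def normalize(name):
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    for spelling in ("opencv_contrib_python", "opencv-contrib-python", "scikit_learn"):
        print(spelling, "->", normalize(spelling))
```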

timminator avatar Jul 17 '25 16:07 timminator

Thanks!!!!!. I will test it again tomorrow.

Sam-gsj avatar Jul 17 '25 16:07 Sam-gsj

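Thank you very much. I downgraded Python to 3.10 and changed both paddleocr and paddlepaddle to 3.0, and now it works. Without your help, my day off would probably have been gone.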

ACatThatBitesDogs avatar Jul 18 '25 03:07 ACatThatBitesDogs

@timminator
Paddleocr == 3.1.0
Paddlex == 3.1.3
PaddlePaddle == 3.0.0

nuitka --standalone main.py ^
    --include-distribution-metadata=ftfy ^
    --include-distribution-metadata=imagesize ^
    --include-distribution-metadata=lxml ^
    --include-distribution-metadata=opencv-contrib-python ^
    --include-distribution-metadata=openpyxl ^
    --include-distribution-metadata=premailer ^
    --include-distribution-metadata=pyclipper ^
    --include-distribution-metadata=pypdfium2 ^
    --include-distribution-metadata=scikit-learn ^
    --include-distribution-metadata=shapely ^
    --include-distribution-metadata=tokenizers ^
    --include-distribution-metadata=einops ^
    --include-distribution-metadata=jinja2 ^
    --include-distribution-metadata=regex ^
    --include-distribution-metadata=tiktoken ^
    --include-package-data=paddlex ^
    --include-package-data=paddleocr

the error is:

PS C:\Users\CUG\Desktop\PaddleOCR-main\main.dist> .\main.exe
Traceback (most recent call last):
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\main.py", line 2, in <module>
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddleocr\_pipelines\ocr.py", line 161, in __init__
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddleocr\_pipelines\base.py", line 66, in __init__
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddleocr\_pipelines\base.py", line 100, in _create_paddlex_pipeline
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddlex\inference\pipelines\__init__.py", line 166, in create_pipeline
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddlex\utils\deps.py", line 194, in _wrapper
  File "C:\Users\CUG\Desktop\PADDLE~1\MAIN~1.DIS\paddlex\utils\deps.py", line 187, in require_extra
RuntimeError: OCR requires additional dependencies. To install them, run pip install "paddlex[ocr]==<PADDLEX_VERSION>" if you’re installing paddlex from an index, or pip install -e "/path/to/PaddleX[ocr]" if you’re installing paddlex locally.

Sam-gsj avatar Jul 18 '25 04:07 Sam-gsj

You can try copying the metadata of the paddle, paddleocr and paddlex packages from the conda env into the directory of the packaged exe file, so that it can be found there directly. Below is the command I used to complete the packaging successfully:

call nuitka --mingw64 ^
    --standalone ^
    --show-progress ^
    --include-distribution-metadata=ftfy ^
    --include-distribution-metadata=imagesize ^
    --include-distribution-metadata=lxml ^
    --include-distribution-metadata=opencv-contrib-python ^
    --include-distribution-metadata=openpyxl ^
    --include-distribution-metadata=premailer ^
    --include-distribution-metadata=pyclipper ^
    --include-distribution-metadata=pypdfium2 ^
    --include-distribution-metadata=scikit-learn ^
    --include-distribution-metadata=shapely ^
    --include-distribution-metadata=tokenizers ^
    --include-distribution-metadata=protobuf ^
    --include-distribution-metadata=paddlex ^
    --include-distribution-metadata=paddle ^
    --include-distribution-metadata=paddleocr ^
    --include-distribution-metadata=httpx ^
    --include-distribution-metadata=Pillow ^
    --include-distribution-metadata=decorator ^
    --include-distribution-metadata=astor ^
    --include-distribution-metadata=opt-einsum ^
    --include-distribution-metadata=networkx ^
    --include-distribution-metadata=typing-extensions ^
    ./ocr_test.py

SHOUshou0426 avatar Jul 18 '25 04:07 SHOUshou0426

@SHOUshou0426 --include-distribution-metadata=paddle ? FATAL: Error, could not find distribution 'paddle' for which metadata was asked to be included.
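The FATAL message suggests that the option expects the name of an installed distribution (for example paddlepaddle-gpu, as used earlier in this thread) rather than the import name paddle. A quick way to check a list of names before starting a long build is sketched below; `split_installed` is a hypothetical helper, and it only tells you what the current environment has installed.

```python
from importlib import metadata


def split_installed(names):
    """Split names into (found, missing) by looking up their distribution metadata."""
    found, missing = [], []
    for name in names:
        try:
            metadata.distribution(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
        else:
            found.append(name)
    return found, missing


if __name__ == "__main__":
    found, missing = split_installed(["paddle", "paddlepaddle-gpu", "paddlex", "paddleocr"])
    print("found:  ", found)
    print("missing:", missing)
```

Any name reported as missing would likely trigger the same FATAL error, so it can be dropped or replaced with the real distribution name before compiling.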

Sam-gsj avatar Jul 18 '25 04:07 Sam-gsj

You can try it without adding this option.

SHOUshou0426 avatar Jul 18 '25 04:07 SHOUshou0426

@SHOUshou0426 I tried to use --include-distribution-metadata=paddlepaddle-gpu.

Sam-gsj avatar Jul 18 '25 04:07 Sam-gsj

@Sam-gsj You can try leaving out the paddle option. When my colleague leaves it out, the packaging completes successfully.

SHOUshou0426 avatar Jul 18 '25 04:07 SHOUshou0426

@timminator @SHOUshou0426
I checked my main.dist folder and found that it contains no metadata, meaning the dist-info folders were not packaged. When I copied the metadata generated by PyInstaller into the Nuitka output, the problem was solved. Could you please check whether your generated main.dist contains these dist-info folders?

Sam-gsj avatar Jul 18 '25 08:07 Sam-gsj

Nuitka does not generate metadata. Instead, it copies the metadata from the env to the directory of the executable file. When the exe file is executed, the problem is solved. @Sam-gsj
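To answer the question of whether a given main.dist contains any dist-info folders, a directory scan is enough. This is a minimal sketch with a hypothetical helper name, `bundled_dist_info`; it only reports what is on disk and says nothing about metadata that a packaging tool might embed in the executable itself.

```python
from pathlib import Path

SUFFIX = ".dist-info"


def bundled_dist_info(dist_dir):
    """Return the project names of all *.dist-info directories under dist_dir."""
    names = set()
    for path in Path(dist_dir).rglob("*" + SUFFIX):
        if path.is_dir():
            # Directory names look like "<name>-<version>.dist-info".
            stem = path.name[: -len(SUFFIX)]
            names.add(stem.split("-", 1)[0])
    return sorted(names)


if __name__ == "__main__":
    # "main.dist" is the output folder name used in this thread.
    print(bundled_dist_info("main.dist"))
```

An empty list means no dist-info folders were copied next to the executable, which matches the situation described above.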

SHOUshou0426 avatar Jul 18 '25 08:07 SHOUshou0426