PaddleOCR: Segmentation fault when testing with paddleocr 3.0.2

🔎 Search before asking

[x] I have searched the PaddleOCR Docs and found no similar bug report.
[x] I have searched the PaddleOCR Issues and found no similar bug report.
[x] I have searched the PaddleOCR Discussions and found no similar bug report.

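🐛 Bug (Problem description)

Testing with paddleocr 3.0.2 produces a Segmentation fault.
Testing with paddleocr 3.0.1 does not produce a Segmentation fault.
Switching to the mobile models also avoids the Segmentation fault.
The machine I first tested on has 16 GB of RAM, so I suspected the crash was caused by insufficient memory. I then tested on another machine with 80 GB of RAM, and the problem still occurs.
The machine has no discrete GPU, only a CPU: Intel(R) Core(TM) i5-12400.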

🏃‍♂️ Environment

Python 3.12 + paddleocr 3.0.2 + Ubuntu 24.04. The machine has no discrete GPU, only a CPU: Intel(R) Core(TM) i5-12400.
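A report like the one above is easier to compare across machines when the versions are collected by a script. The following is a minimal sketch using only the standard library; the function name `collect_environment` is hypothetical, and the package list (`paddleocr`, `paddlex`, `paddlepaddle`) is an assumption about which distributions are relevant here.

```python
import platform
import sys
from importlib import metadata


def collect_environment(packages=("paddleocr", "paddlex", "paddlepaddle")):
    """Return the Python version, OS, CPU architecture and package versions."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing, so the output
            # can still be pasted into a bug report.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Running the file prints one `name: value` line per entry, which can be pasted into the Environment section of an issue.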

🌰 Minimal Reproducible Example

Attached files:
test-paddleocr-3.0.2.py: the Python test script
语文听写-026.jpg: the test image
requirements-for-paddleocr.txt: the installed packages and their versions
configure-paddleocr: the commands used to create the test environment with miniconda
test-log: the run log, in which the Segmentation fault error is visible

test-paddleocr-3.0.2-20250620.tar.gz

Segmentation fault output:

python3 test-paddleocr-3.0.2.py
/media/xxxyyy/miniconda3/envs/env-for-paddleocr-3.0.2/lib/python3.12/site-packages/paddle/utils/cpp_extension/extension_utils.py:711: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
Creating model: ('PP-LCNet_x1_0_doc_ori', None)
Using official model (PP-LCNet_x1_0_doc_ori), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models.
Creating model: ('UVDoc', None)
Using official model (UVDoc), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models.
Creating model: ('PP-LCNet_x1_0_textline_ori', None)
Using official model (PP-LCNet_x1_0_textline_ori), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models.
Creating model: ('PP-OCRv5_server_det', None)
Using official model (PP-OCRv5_server_det), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models.
Creating model: ('PP-OCRv5_server_rec', None)
Using official model (PP-OCRv5_server_rec), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models.

C++ Traceback (most recent call last):

0 paddle::AnalysisPredictor::ZeroCopyRun(bool)
1 paddle::framework::NaiveExecutor::RunInterpreterCore(std::vector<std::string, std::allocator<std::string > > const&, bool, bool)
2 paddle::framework::InterpreterCore::Run(std::vector<std::string, std::allocator<std::string > > const&, bool, bool, bool, bool)
3 paddle::framework::PirInterpreter::Run(std::vector<std::string, std::allocator<std::string > > const&, bool, bool, bool, bool)
4 paddle::framework::PirInterpreter::TraceRunImpl()
5 paddle::framework::PirInterpreter::TraceRunInstructionList(std::vector<std::unique_ptr<paddle::framework::InstructionBase, std::default_delete<paddle::framework::InstructionBase> >, std::allocator<std::unique_ptr<paddle::framework::InstructionBase, std::default_delete<paddle::framework::InstructionBase> > > > const&)
6 paddle::framework::PirInterpreter::RunInstructionBase(paddle::framework::InstructionBase*)
7 paddle::framework::PhiKernelInstruction::Run()
8 phi::KernelImpl<void (*)(phi::CPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::string const&, std::vector<int, std::allocator<int> > const&, int, std::string const&, phi::DenseTensor*), &(void phi::ConvKernel<float, phi::CPUContext>(phi::CPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::string const&, std::vector<int, std::allocator<int> > const&, int, std::string const&, phi::DenseTensor*))>::Compute(phi::KernelContext*)
9 void phi::ConvKernelImpl<float, phi::CPUContext>(phi::CPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::string const&, int, std::vector<int, std::allocator<int> > const&, std::string const&, phi::DenseTensor*)
10 phi::funcs::Im2ColFunctor<(phi::funcs::ColFormat)0, phi::CPUContext, float>::operator()(phi::CPUContext const&, phi::DenseTensor const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, phi::DenseTensor*, common::DataLayout)

Error Message Summary:

FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: *** Aborted at 1750421447 (unix time) try "date -d @1750421447" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x725e471f5300) received by PID 3995330 (TID 0x7264625ef740) from PID 1193235200 ***]

Killed
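When a crash like the one above has to be reproduced repeatedly (for example across package versions), it can help to run the test script in a child process and classify how it ended. The sketch below is illustrative and not part of the attached files; `run_and_classify` is a hypothetical helper, and it relies on the POSIX convention that `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys


def run_and_classify(argv):
    """Run a command and describe whether it exited or was killed by a signal."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode < 0:
        # On POSIX, a negative return code is the negated signal number.
        return "killed by " + signal.Signals(-proc.returncode).name
    return f"exited with code {proc.returncode}"


# Example invocation (not executed here); "-X faulthandler" asks CPython to
# dump the Python-level traceback if the interpreter receives SIGSEGV:
#   run_and_classify([sys.executable, "-X", "faulthandler",
#                     "test-paddleocr-3.0.2.py"])
```

A result of `killed by SIGSEGV` corresponds to the segmentation fault in the log, while a Python exception would instead show up as a non-zero exit code.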

Jun 20 '25 12:06 Miao367147258

Could you check whether the answer in this thread helps? https://github.com/PaddlePaddle/PaddleOCR/pull/15790

Jun 20 '25 14:06 Bobholamovic

It looks like the same symptom. I'll try again with the next release and keep using the previous version for now. Thanks for the reply.

Jun 21 '25 04:06 Miao367147258

OK, thanks for your understanding.

Jun 24 '25 12:06 Bobholamovic

With paddleocr 3.0.1 (CPU mode), the Segmentation fault occurs as well.

Environment:

Host OS: Linux (Ubuntu 22.04.4 LTS)
Docker Version: 27.1.2
CPU Info: Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz

Minimal reproducible example:

Terminal 1:

docker build -t test_paddlex_serving -f ./Dockerfile.cpu .

docker run -p 9090:8080 test_paddlex_serving

Terminal 2:

python PaddleOCR_API_test.py

test_paddleocr_3.0.1_20250625.zip

The attachment test_paddleocr_3.0.1_20250625.zip contains:
PaddleOCR_API_test.py: the test script
Dockerfile.cpu: used to build the Docker image
test_tags.png: the test image
docker_log.log: the Docker log (the Segmentation fault is visible in it)
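The client script itself is only in the attachment and is not shown in the thread. As a rough sketch of what a request to the container started above might look like, the code below builds a JSON POST carrying a base64-encoded image. The `/ocr` path and the `file` / `fileType` field names are assumptions based on PaddleX's basic serving interface, not something confirmed here; only port 9090 comes from the `docker run` command.

```python
import base64
import json
import urllib.request

# Assumption: host port 9090 is mapped to the container as in the docker run
# command above, and the server exposes an "/ocr" endpoint.
API_URL = "http://localhost:9090/ocr"


def build_request(image_bytes, url=API_URL):
    """Build (without sending) a JSON POST request carrying a base64 image."""
    payload = {
        "file": base64.b64encode(image_bytes).decode("ascii"),
        "fileType": 1,  # assumed convention: 1 = image, 0 = PDF
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=600):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `send(build_request(open("test_tags.png", "rb").read()))` would issue the call; if the server process segfaults while handling it, the client would typically see a connection error rather than a JSON body.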

Jun 26 '25 03:06 Betty2GitHub

"Host OS: Linux (darwin 24.3.0)"

Is that Darwin or Linux?

Jun 26 '25 04:06 Bobholamovic

It is Linux (Ubuntu 22.04.4 LTS); I have edited the comment.

Jun 26 '25 04:06 Betty2GitHub

OK. Please provide your paddlex version and the arguments used to instantiate the PaddleOCR object.

Jun 26 '25 05:06 Bobholamovic

paddlex 3.0.1, using the default OCR pipeline configuration:

SubModules:
  TextDetection:
    box_thresh: 0.6
    limit_side_len: 736
    limit_type: min
    max_side_limit: 4000
    model_dir: null
    model_name: PP-OCRv5_server_det
    module_name: text_detection
    thresh: 0.3
    unclip_ratio: 1.5
  TextLineOrientation:
    batch_size: 6
    model_dir: null
    model_name: PP-LCNet_x0_25_textline_ori
    module_name: textline_orientation
  TextRecognition:
    batch_size: 6
    model_dir: null
    model_name: PP-OCRv5_server_rec
    module_name: text_recognition
    score_thresh: 0.0
SubPipelines:
  DocPreprocessor:
    SubModules:
      DocOrientationClassify:
        model_dir: null
        model_name: PP-LCNet_x1_0_doc_ori
        module_name: doc_text_orientation
      DocUnwarping:
        model_dir: null
        model_name: UVDoc
        module_name: image_unwarping
    pipeline_name: doc_preprocessor
    use_doc_orientation_classify: true
    use_doc_unwarping: true
pipeline_name: OCR
text_type: general
use_doc_preprocessor: true
use_textline_orientation: true

Jun 26 '25 05:06 Betty2GitHub

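I see the same output. In PyCharm's output pane I waited a long time before the detection results were printed: it took more than 200 seconds, and on a weaker computer the program simply exits. The message shown is Creating model: ('PP-OCRv5_server_det', None) Using official model (PP-OCRv5_server_det), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models. My guess was that the model had not been found, but it must actually have been found, because the detection results were printed after the 200+ seconds. To avoid this message, you can specify the models explicitly in PaddleOCR; for example, I wrote: ocr = PaddleOCR( doc_orientation_classify_model_name='PP-LCNet_x1_0_doc_ori',......

The message UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md warnings.warn(warning_message) is probably the reason for the 200+ seconds. As for how to fix it, I spent a long time on it without success. As far as I remember, paddleocr V2 did not behave this way.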
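To see where the 200+ seconds go, it can help to time pipeline construction and inference separately instead of guessing from the log messages. The sketch below is a minimal illustration: `timed` and `profile_ocr` are hypothetical helper names, and `profile_ocr` assumes paddleocr 3.x is installed and that the pipeline object exposes `predict`.

```python
import time


def timed(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f} s")
    return result, elapsed


def profile_ocr(image_path):
    """Time pipeline construction and inference separately."""
    # Imported lazily so that timed() can be used without paddleocr installed.
    from paddleocr import PaddleOCR

    ocr, init_seconds = timed("create pipeline", PaddleOCR)
    _, predict_seconds = timed("predict", ocr.predict, image_path)
    return init_seconds, predict_seconds
```

If almost all of the time is reported under `predict`, the cost lies in model inference rather than in model lookup or the ccache warning.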

Jun 26 '25 08:06 AlbertMa123

After replacing PP-OCRv5_server_det and PP-OCRv5_server_rec with PP-OCRv5_mobile_det and PP-OCRv5_mobile_rec, the Segmentation fault no longer occurs. I still hope to be able to switch back to the server models soon. Thanks.
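The model swap described above can be applied mechanically to a pipeline configuration that has been loaded into a nested dictionary (for example from the YAML shown earlier). This is a sketch under that assumption; `use_mobile_models` is a hypothetical helper, and it relies on the naming pattern visible in this thread, where server and mobile variants differ only in the `_server_` / `_mobile_` part of `model_name`.

```python
import copy


def use_mobile_models(config):
    """Return a copy of a pipeline config with server models swapped for mobile ones."""
    swapped = copy.deepcopy(config)

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "model_name" and isinstance(value, str):
                    # Only names containing "_server_" change; others are kept.
                    node[key] = value.replace("_server_", "_mobile_")
                else:
                    walk(value)

    walk(swapped)
    return swapped
```

The original dictionary is left untouched, so the server configuration can be restored once the crash is fixed.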

Jun 26 '25 10:06 Betty2GitHub

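@Betty2GitHub The segmentation fault is most likely caused by a bug in Paddle's native CPU inference, which is triggered when the server_det model processes a large image. You can try working around it by specifying enable_mkldnn=True. PaddleOCR 3.0.1 should enable MKL-DNN by default, but I'm not sure why that did not take effect in your environment; it may be worth checking whether enable_mkldnn=False was specified at initialization.

@AlbertMa123 It takes long because versions from PaddleOCR 3.0.1 onward use the server-series models by default to ensure accuracy, and the server models also consume more memory, whereas paddleocr 2.x used the mobile models by default. You can choose the high-accuracy or the lightweight models according to your needs. We will add a note about this to the documentation. The warning UserWarning: No ccache found. is emitted by the Paddle 3.0 framework, probably because a dependency is missing, but it usually does not affect the results and can be ignored.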
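For reference, the workaround of passing enable_mkldnn=True explicitly could look like the sketch below. The helper names `build_ocr_kwargs` and `create_ocr` are hypothetical; the keyword names `device`, `text_detection_model_name` and `text_recognition_model_name` are assumed to match the paddleocr 3.x constructor, and only `enable_mkldnn` and the model names are taken from this thread.

```python
def build_ocr_kwargs(enable_mkldnn=True):
    """Assemble keyword arguments for PaddleOCR(...) on a CPU-only machine."""
    return {
        "device": "cpu",
        # Set explicitly instead of relying on the version-dependent default.
        "enable_mkldnn": enable_mkldnn,
        "text_detection_model_name": "PP-OCRv5_server_det",
        "text_recognition_model_name": "PP-OCRv5_server_rec",
    }


def create_ocr(**overrides):
    """Create the pipeline; paddleocr is imported lazily inside this function."""
    from paddleocr import PaddleOCR

    return PaddleOCR(**build_ocr_kwargs(**overrides))
```

Calling `create_ocr()` keeps the server models while forcing the MKL-DNN path; `create_ocr(enable_mkldnn=False)` reproduces the configuration that the comment above suspects of triggering the crash.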

Jun 26 '25 10:06 Bobholamovic

OK. I never specified enable_mkldnn=False, but after setting enable_mkldnn=True manually, the server models no longer raise the error. Thanks for the reply.

Jun 27 '25 07:06 Betty2GitHub

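With the same script as before and the latest paddleocr 3.0.3, everything works and the segmentation fault no longer occurs. I checked, and the default value of enable_mkldnn is True, so the problem appears to be solved. Can this issue be closed?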
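Scripts that must run on several installed versions could decide from the version string whether to pass enable_mkldnn=True explicitly. The sketch below encodes only what is reported in this thread (crashes on 3.0.1 and 3.0.2 without the flag, no crash on 3.0.3); the helper names are hypothetical and the cut-off is an assumption, not an official statement about the release.

```python
import re


def parse_version(text):
    """Turn '3.0.2' into (3, 0, 2), keeping at most the first three numbers."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def should_force_mkldnn(paddleocr_version, fixed_in="3.0.3"):
    """True when the given version predates the release reported as working."""
    return parse_version(paddleocr_version) < parse_version(fixed_in)
```

The installed version string can be obtained with `importlib.metadata.version("paddleocr")` and passed to `should_force_mkldnn`.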

Jun 27 '25 10:06 Miao367147258

OK, I'll close this issue for now. If anyone runs into other problems later, feel free to open a new issue.

Jun 27 '25 14:06 Bobholamovic

I still have the same problem. With

ocr = TextRecognition(model_dir="service/Meter_Reader_V1/paddle_models/PP-OCRv5_server_rec/", enable_mkldnn=True)

an error is raised after the program has been running for a long time:

In user code:

PreconditionNotMetError: Tensor holds no memory. Call Tensor::mutable_data firstly.
  [Hint: holder_ should not be null.] (at /paddle/paddle/phi/core/dense_tensor_impl.cc:43)
  [operator < onednn_kernel.phi_kernel > error]

Aug 06 '25 14:08 jiliangqian

This looks like a different problem; please open a separate issue.

Aug 07 '25 02:08 Bobholamovic