Segmentation fault when testing with paddleocr 3.0.2

Open Miao367147258 opened this issue 6 months ago • 1 comments

🔎 Search before asking

  • [x] I have searched the PaddleOCR Docs and found no similar bug report.
  • [x] I have searched the PaddleOCR Issues and found no similar bug report.
  • [x] I have searched the PaddleOCR Discussions and found no similar bug report.

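🐛 Bug (Problem description)

Testing with paddleocr 3.0.2 produces a Segmentation fault. Testing with paddleocr 3.0.1 does not. Switching to the mobile models also avoids the Segmentation fault. The test machine has 16 GB of RAM; I suspected the crash was caused by too little memory, so I repeated the test on another machine with 80 GB of RAM, and the problem still occurs. The machine has no discrete GPU, only a CPU: Intel(R) Core(TM) i5-12400.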

🏃‍♂️ Environment

Python 3.12 + paddleocr 3.0.2 + Ubuntu 24.04. The machine has no discrete GPU, only a CPU: Intel(R) Core(TM) i5-12400.

🌰 Minimal Reproducible Example

Attached files:

  • test-paddleocr-3.0.2.py: the Python test script
  • 语文听写-026.jpg: the test image
  • requirements-for-paddleocr.txt: the installed packages and their versions
  • configure-paddleocr: the commands used to create the test environment with miniconda
  • test-log: the run log, which shows the Segmentation fault error

test-paddleocr-3.0.2-20250620.tar.gz

Segmentation fault output:

python3 test-paddleocr-3.0.2.py
/media/xxxyyy/miniconda3/envs/env-for-paddleocr-3.0.2/lib/python3.12/site-packages/paddle/utils/cpp_extension/extension_utils.py:711: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
Creating model: ('PP-LCNet_x1_0_doc_ori', None)
Using official model (PP-LCNet_x1_0_doc_ori), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models.
Creating model: ('UVDoc', None)
Using official model (UVDoc), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models.
Creating model: ('PP-LCNet_x1_0_textline_ori', None)
Using official model (PP-LCNet_x1_0_textline_ori), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models.
Creating model: ('PP-OCRv5_server_det', None)
Using official model (PP-OCRv5_server_det), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models.
Creating model: ('PP-OCRv5_server_rec', None)
Using official model (PP-OCRv5_server_rec), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models.


C++ Traceback (most recent call last):

0 paddle::AnalysisPredictor::ZeroCopyRun(bool)
1 paddle::framework::NaiveExecutor::RunInterpreterCore(std::vector<std::string, std::allocator<std::string > > const&, bool, bool)
2 paddle::framework::InterpreterCore::Run(std::vector<std::string, std::allocator<std::string > > const&, bool, bool, bool, bool)
3 paddle::framework::PirInterpreter::Run(std::vector<std::string, std::allocator<std::string > > const&, bool, bool, bool, bool)
4 paddle::framework::PirInterpreter::TraceRunImpl()
5 paddle::framework::PirInterpreter::TraceRunInstructionList(std::vector<std::unique_ptr<paddle::framework::InstructionBase, std::default_delete<paddle::framework::InstructionBase> >, std::allocator<std::unique_ptr<paddle::framework::InstructionBase, std::default_delete<paddle::framework::InstructionBase> > > > const&)
6 paddle::framework::PirInterpreter::RunInstructionBase(paddle::framework::InstructionBase*)
7 paddle::framework::PhiKernelInstruction::Run()
8 phi::KernelImpl<void (*)(phi::CPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::string const&, std::vector<int, std::allocator<int> > const&, int, std::string const&, phi::DenseTensor*), &(void phi::ConvKernel<float, phi::CPUContext>(phi::CPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::string const&, std::vector<int, std::allocator<int> > const&, int, std::string const&, phi::DenseTensor*))>::Compute(phi::KernelContext*)
9 void phi::ConvKernelImpl<float, phi::CPUContext>(phi::CPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::string const&, int, std::vector<int, std::allocator<int> > const&, std::string const&, phi::DenseTensor*)
10 phi::funcs::Im2ColFunctor<(phi::funcs::ColFormat)0, phi::CPUContext, float>::operator()(phi::CPUContext const&, phi::DenseTensor const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, phi::DenseTensor*, common::DataLayout)


Error Message Summary:

FatalError: Segmentation fault is detected by the operating system.
  [TimeInfo: *** Aborted at 1750421447 (unix time) try "date -d @1750421447" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x725e471f5300) received by PID 3995330 (TID 0x7264625ef740) from PID 1193235200 ***]

Killed
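
The TimeInfo line in the log suggests running date -d @1750421447 to read the crash time. On systems without GNU date, the same conversion can be sketched in Python. This is only an illustration: crash_time_utc is a hypothetical helper name, and the regular expression assumes the "Aborted at <unix time>" wording shown in the log above.

```python
import re
from datetime import datetime, timezone

# Excerpt copied from the error summary in this report.
LOG_EXCERPT = (
    "[TimeInfo: *** Aborted at 1750421447 (unix time) "
    'try "date -d @1750421447" if you are using GNU date ***]'
)


def crash_time_utc(log_text: str) -> datetime:
    """Find the 'Aborted at <unix time>' stamp and return it as a UTC datetime."""
    match = re.search(r"Aborted at (\d+) \(unix time\)", log_text)
    if match is None:
        raise ValueError("no 'Aborted at ... (unix time)' stamp found")
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


# Prints the crash time in ISO 8601 form, in UTC.
print(crash_time_utc(LOG_EXCERPT).isoformat())
```

For this log the stamp decodes to 2025-06-20 12:10:47 UTC, which can be matched against the other entries in test-log.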

Miao367147258 avatar Jun 20 '25 12:06 Miao367147258

Could you check whether the answer in this issue helps? https://github.com/PaddlePaddle/PaddleOCR/pull/15790

Bobholamovic avatar Jun 20 '25 14:06 Bobholamovic

The symptoms look the same. I'll try again with the next release and keep using the previous version for now. Thanks for the reply.

Miao367147258 avatar Jun 21 '25 04:06 Miao367147258

OK, thanks for understanding.

Bobholamovic avatar Jun 24 '25 12:06 Bobholamovic

With paddleocr 3.0.1 (cpu mode), the same Segmentation fault occurs.

Environment:

Host OS: Linux (Ubuntu 22.04.4 LTS)
Docker Version: 27.1.2
CPU Info: Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz

Minimal reproducible demo:

Terminal 1:

docker build -t test_paddlex_serving -f ./Dockerfile.cpu .

docker run -p 9090:8080 test_paddlex_serving

Terminal 2:

python PaddleOCR_API_test.py

test_paddleocr_3.0.1_20250625.zip

The attachment test_paddleocr_3.0.1_20250625.zip contains:

  • PaddleOCR_API_test.py: the test script
  • Dockerfile.cpu: used to build the docker image
  • test_tags.png: the test image
  • docker_log.log: the Docker log (which shows the Segmentation fault)
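
For readers without the attachment, here is a minimal sketch of what a client such as PaddleOCR_API_test.py might do. The attached script is not shown in this thread, so everything below is an assumption: that the container runs PaddleX's basic serving API with an OCR route at /ocr, that the request body carries a base64-encoded file plus a fileType field (1 for an image), and that host port 9090 maps to the service as in the docker run command above. build_payload and run_ocr are hypothetical helper names.

```python
import base64
import json
import urllib.request

# Assumption: host port 9090 (from the docker run command) and an /ocr route.
API_URL = "http://localhost:9090/ocr"


def build_payload(image_bytes: bytes) -> dict:
    """Build the JSON body: base64 image data, with fileType 1 assumed to mean 'image'."""
    return {"file": base64.b64encode(image_bytes).decode("ascii"), "fileType": 1}


def run_ocr(image_path: str, url: str = API_URL) -> dict:
    """POST one image to the service and return the decoded JSON response."""
    with open(image_path, "rb") as f:
        body = json.dumps(build_payload(f.read())).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    # Generous timeout: the server models are reported to be slow on CPU.
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.load(response)


# Example call (needs the container from Terminal 1 to be running):
# result = run_ocr("test_tags.png")
```

If the server process crashes with the segmentation fault, a call like this would typically fail with a connection error or an HTTP 5xx status rather than return a result, so the Docker log remains the place to confirm the crash.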

Betty2GitHub avatar Jun 26 '25 03:06 Betty2GitHub

> Host OS: Linux (darwin 24.3.0)

Is it darwin or Linux?

Bobholamovic avatar Jun 26 '25 04:06 Bobholamovic

> Is it darwin or Linux?

It is Linux (Ubuntu 22.04.4 LTS). I have edited the comment.

Betty2GitHub avatar Jun 26 '25 04:06 Betty2GitHub

OK, please provide the paddlex version and the arguments used to instantiate the PaddleOCR object.

Bobholamovic avatar Jun 26 '25 05:06 Bobholamovic

> OK, please provide the paddlex version and the arguments used to instantiate the PaddleOCR object.

paddlex 3.0.1, using the default OCR pipeline configuration:

SubModules:
  TextDetection:
    box_thresh: 0.6
    limit_side_len: 736
    limit_type: min
    max_side_limit: 4000
    model_dir: null
    model_name: PP-OCRv5_server_det
    module_name: text_detection
    thresh: 0.3
    unclip_ratio: 1.5
  TextLineOrientation:
    batch_size: 6
    model_dir: null
    model_name: PP-LCNet_x0_25_textline_ori
    module_name: textline_orientation
  TextRecognition:
    batch_size: 6
    model_dir: null
    model_name: PP-OCRv5_server_rec
    module_name: text_recognition
    score_thresh: 0.0
SubPipelines:
  DocPreprocessor:
    SubModules:
      DocOrientationClassify:
        model_dir: null
        model_name: PP-LCNet_x1_0_doc_ori
        module_name: doc_text_orientation
      DocUnwarping:
        model_dir: null
        model_name: UVDoc
        module_name: image_unwarping
    pipeline_name: doc_preprocessor
    use_doc_orientation_classify: true
    use_doc_unwarping: true
pipeline_name: OCR
text_type: general
use_doc_preprocessor: true
use_textline_orientation: true 

Betty2GitHub avatar Jun 26 '25 05:06 Betty2GitHub

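I see the same output. In PyCharm's output pane it took a long time, 200+ seconds, before the detection result was printed, and on a lower-spec machine the run simply exits. The message Creating model: ('PP-OCRv5_server_det', None) Using official model (PP-OCRv5_server_det), the model files will be automatically downloaded and saved in /home/xxxyyy/.paddlex/official_models. appears. I guessed the cause was that the model was not found, but it must actually have been found, because the detection result was printed after 200+ seconds. To stop this message from appearing, you can specify the model in PaddleOCR; for example, I wrote: ocr = PaddleOCR( doc_orientation_classify_model_name='PP-LCNet_x1_0_doc_ori',......

The message UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md warnings.warn(warning_message) is probably the reason for the 200+ second run time. I spent a long time trying to fix it without success. As I recall, paddleocr V2 did not have this problem ......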

AlbertMa123 avatar Jun 26 '25 08:06 AlbertMa123

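> > OK, please provide the paddlex version and the arguments used to instantiate the PaddleOCR object.
>
> paddlex 3.0.1, using the default OCR pipeline configuration

After replacing PP-OCRv5_server_det and PP-OCRv5_server_rec with PP-OCRv5_mobile_det and PP-OCRv5_mobile_rec, the Segmentation fault no longer occurs. I still hope to switch back to the server models as soon as possible, though. Thanks.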
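
In the serving setup above the swap is made by editing the model_name fields in the pipeline configuration. For the Python API, the same swap can be sketched as below. This is a hedged illustration rather than the reporter's code: the keyword names text_detection_model_name and text_recognition_model_name are assumed from the PaddleOCR 3.x constructor (they follow the same pattern as doc_orientation_classify_model_name quoted earlier in this thread), and model_kwargs and create_ocr are hypothetical helper names.

```python
# Model names exactly as they appear in this thread.
SERVER_MODELS = {
    "text_detection_model_name": "PP-OCRv5_server_det",
    "text_recognition_model_name": "PP-OCRv5_server_rec",
}
MOBILE_MODELS = {
    "text_detection_model_name": "PP-OCRv5_mobile_det",
    "text_recognition_model_name": "PP-OCRv5_mobile_rec",
}


def model_kwargs(use_mobile: bool) -> dict:
    """Return constructor keyword arguments selecting the mobile or server models."""
    return dict(MOBILE_MODELS if use_mobile else SERVER_MODELS)


def create_ocr(use_mobile: bool = True):
    """Build a PaddleOCR object; the keyword names are assumptions, see the note above."""
    # Imported lazily so model_kwargs can be used without paddleocr installed.
    from paddleocr import PaddleOCR

    return PaddleOCR(**model_kwargs(use_mobile))
```

Keeping the choice behind a single flag makes it easy to return to the server models once the crash is resolved.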

Betty2GitHub avatar Jun 26 '25 10:06 Betty2GitHub

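@Betty2GitHub The segmentation fault is most likely caused by a bug in Paddle's native CPU inference: the server_det model hits this error when it processes large images. You can try working around it by specifying enable_mkldnn=True. PaddleOCR 3.0.1 should enable MKL-DNN by default, but I'm not sure why that did not take effect in your environment; it may be worth checking whether enable_mkldnn=False was specified at initialization.

@AlbertMa123 The long run time is because PaddleOCR versions from 3.0.1 onward use the server-series models by default to ensure accuracy. The server models are also more memory-hungry, whereas paddleocr 2.x defaults to the mobile models. You can choose the high-accuracy or the lightweight models depending on your needs. We will add a note about this to the documentation. The UserWarning: No ccache found. warning is raised by the Paddle 3.0 framework, possibly because a dependency is missing, but it usually does not affect the results and can be ignored.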

Bobholamovic avatar Jun 26 '25 10:06 Bobholamovic

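> @Betty2GitHub The segmentation fault is most likely caused by a bug in Paddle's native CPU inference: the server_det model hits this error when it processes large images. You can try working around it by specifying enable_mkldnn=True. PaddleOCR 3.0.1 should enable MKL-DNN by default, but I'm not sure why that did not take effect in your environment; it may be worth checking whether enable_mkldnn=False was specified at initialization.

OK. I never specified enable_mkldnn=False, but after manually specifying enable_mkldnn=True the server models no longer raise the error. Thanks for the reply.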
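
The workaround confirmed here can be sketched as follows for the Python API. Only enable_mkldnn=True comes from this thread; with_mkldnn and create_ocr are hypothetical helper names, and the sketch assumes the PaddleOCR constructor accepts enable_mkldnn as a keyword argument, as the maintainer's reply implies.

```python
def with_mkldnn(user_kwargs: dict) -> dict:
    """Return a copy of the keyword arguments with enable_mkldnn set to True explicitly."""
    kwargs = dict(user_kwargs)
    # Set the flag explicitly instead of relying on the version's default.
    kwargs["enable_mkldnn"] = True
    return kwargs


def create_ocr(**user_kwargs):
    """Build a PaddleOCR object with MKL-DNN explicitly enabled."""
    # Imported lazily so with_mkldnn can be used without paddleocr installed.
    from paddleocr import PaddleOCR

    return PaddleOCR(**with_mkldnn(user_kwargs))
```

Passing the flag explicitly costs nothing on versions where it is already the default, and it overrides an enable_mkldnn=False that may have been set elsewhere.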

Betty2GitHub avatar Jun 27 '25 07:06 Betty2GitHub

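With the same script and the latest paddleocr 3.0.3, everything works and the segmentation fault no longer appears. I checked, and enable_mkldnn defaults to True, so the problem seems to be resolved. Can this issue be closed?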
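
For scripts that must run across several installations, a small version check can decide whether to pass the flag explicitly. This sketch only encodes what was reported in this thread (segmentation faults on 3.0.1 and 3.0.2, none on 3.0.3); it is not an official compatibility list, and the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional

# Versions for which a segmentation fault was reported in this thread.
REPORTED_SEGFAULT = {"3.0.1", "3.0.2"}


def installed_paddleocr_version() -> Optional[str]:
    """Return the installed paddleocr version string, or None if it is not installed."""
    try:
        return metadata.version("paddleocr")
    except metadata.PackageNotFoundError:
        return None


def should_set_mkldnn_explicitly(version: Optional[str]) -> bool:
    """True when the given version is one that was reported to crash in this thread."""
    return version in REPORTED_SEGFAULT


# Prints True or False depending on the local installation.
print(should_set_mkldnn_explicitly(installed_paddleocr_version()))
```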

Miao367147258 avatar Jun 27 '25 10:06 Miao367147258

OK, I'll close this issue for now. If anyone runs into other problems later, feel free to open a new issue.

Bobholamovic avatar Jun 27 '25 14:06 Bobholamovic

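> With the same script and the latest paddleocr 3.0.3, everything works and the segmentation fault no longer appears. I checked, and enable_mkldnn defaults to True, so the problem seems to be resolved. Can this issue be closed?

I still have the same problem. After running for a long time, ocr = TextRecognition( model_dir="service/Meter_Reader_V1/paddle_models/PP-OCRv5_server_rec/", enable_mkldnn=True ) fails with the following error:

In user code: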

PreconditionNotMetError: Tensor holds no memory. Call Tensor::mutable_data firstly.
  [Hint: holder_ should not be null.] (at /paddle/paddle/phi/core/dense_tensor_impl.cc:43)
  [operator < onednn_kernel.phi_kernel > error]

jiliangqian avatar Aug 06 '25 14:08 jiliangqian

> > With the same script and the latest paddleocr 3.0.3, everything works and the segmentation fault no longer appears. I checked, and enable_mkldnn defaults to True, so the problem seems to be resolved. Can this issue be closed?
>
> I still have the same problem. After running for a long time, ocr = TextRecognition( model_dir="service/Meter_Reader_V1/paddle_models/PP-OCRv5_server_rec/", enable_mkldnn=True ) fails with the following error:
>
> In user code:
>
> PreconditionNotMetError: Tensor holds no memory. Call Tensor::mutable_data firstly.
>   [Hint: holder_ should not be null.] (at /paddle/paddle/phi/core/dense_tensor_impl.cc:43)
>   [operator < onednn_kernel.phi_kernel > error]

This looks like a different problem; please open a separate issue.

Bobholamovic avatar Aug 07 '25 02:08 Bobholamovic