
[BUG] 2.6: loading the official PaddleOCR v4 models fails with SIGILL; falling back to paddlepaddle-2.5.2 works.

Open gowy222 opened this issue 1 year ago • 29 comments

Describe the Bug

I was following https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_ch/quickstart.md to try things out. The environment is Linux inside Docker; I have tried various distributions and Python versions 3.8 through 3.10...

pip install --no-cache-dir paddlepaddle paddleocr

The installation check passed:
#7 50.04 I1229 18:18:25.484820 874 interpretercore.cc:237] New Executor is Running.
#7 50.07 I1229 18:18:25.518586 874 interpreter_util.cc:518] Standalone Executor is Used.
#7 50.08 Running verify PaddlePaddle program ...
#7 50.08 PaddlePaddle works well on 1 CPU.
#7 50.08 PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

After that succeeds, app.py initializes ocr = PaddleOCR(use_gpu=False,lang="ch"), which automatically downloads the official models:

download https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_det_infer.tar to /root/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer/ch_PP-OCRv4_det_infer.tar
100%|██████████| 4.89M/4.89M [00:01<00:00, 2.72MiB/s]
download https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_rec_infer.tar to /root/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.tar
100%|██████████| 11.0M/11.0M [00:02<00:00, 4.51MiB/s]
download https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar to /root/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer/ch_ppocr_mobile_v2.0_cls_infer.tar
100%|██████████| 2.19M/2.19M [00:01<00:00, 1.46MiB/s]

Immediately afterwards, loading the models fails! (It fails with every configuration; enabling or disabling cls makes no difference.)
`--------------------------------------
C++ Traceback (most recent call last):

0 paddle_infer::Predictor::Predictor(paddle::AnalysisConfig const&)
1 std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig, (paddle::PaddleEngineKind)2>(paddle::AnalysisConfig const&)
2 paddle::AnalysisPredictor::Init(std::shared_ptr<paddle::framework::Scope> const&, std::shared_ptr<paddle::framework::ProgramDesc> const&)
3 paddle::AnalysisPredictor::PrepareProgram(std::shared_ptr<paddle::framework::ProgramDesc> const&)
4 paddle::AnalysisPredictor::OptimizeInferenceProgram()
5 paddle::inference::analysis::Analyzer::RunAnalysis(paddle::inference::analysis::Argument*)
6 paddle::inference::analysis::IrAnalysisPass::RunImpl(paddle::inference::analysis::Argument*)
7 paddle::inference::analysis::IRPassManager::Apply(std::unique_ptr<paddle::framework::ir::Graph, std::default_delete<paddle::framework::ir::Graph> >)
8 paddle::framework::ir::Pass::Apply(paddle::framework::ir::Graph*) const
9 paddle::framework::ir::SelfAttentionFusePass::ApplyImpl(paddle::framework::ir::Graph*) const
10 paddle::framework::ir::GraphPatternDetector::operator()(paddle::framework::ir::Graph*, std::function<void (std::map<paddle::framework::ir::PDNode*, paddle::framework::ir::Node*, paddle::framework::ir::GraphPatternDetector::PDNodeCompare, std::allocator<std::pair<paddle::framework::ir::PDNode* const, paddle::framework::ir::Node*> > > const&, paddle::framework::ir::Graph*)>)


Error Message Summary:

FatalError: Illegal instruction is detected by the operating system. [TimeInfo: *** Aborted at 1703826924 (unix time) try "date -d @1703826924" if you are using GNU date ***] [SignalInfo: *** SIGILL (@0x7f421eaa186a) received by PID 1 (TID 0x7f422642b740) from PID 514463850 ***]`

Additional Supplementary Information

No response

gowy222 avatar Dec 29 '23 10:12 gowy222

Which CUDA version is the installed Paddle whl package built for? We will try to reproduce this.

tink2123 avatar Jan 03 '24 03:01 tink2123

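Which CUDA version is the installed Paddle whl package built for? We will try to reproduce this.

It is a cloud server... pure CPU build... nothing GPU-related is used.

I installed directly with pip install --no-cache-dir paddlepaddle paddleocr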

gowy222 avatar Jan 03 '24 05:01 gowy222

We could not reproduce the problem on pure CPU. Is this the prediction command you used?

paddleocr --image_dir=doc/imgs/1.jpg --use_gpu=False

tink2123 avatar Jan 03 '24 05:01 tink2123

We could not reproduce the problem on pure CPU. Is this the prediction command you used?

paddleocr --image_dir=doc/imgs/1.jpg --use_gpu=False

I ran the test inside Docker:

FROM python:3.10-slim-bullseye
ENV TZ=Asia/Shanghai
ENV DEBIAN_FRONTEND=noninteractive
COPY app.py /

# libgomp1 provides the OpenMP runtime
RUN apt-get update && apt-get install -y libgomp1
RUN pip install --no-cache-dir paddlepaddle paddleocr

The app.py code is copied from the example at https://pypi.org/project/paddleocr/ (PaddleOCR depends on PaddlePaddle):

from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True,use_gpu=False,lang="ch")
result = ocr.ocr(local_file_path, det=True, rec=True, cls=True)

The initialization line ocr = PaddleOCR(use_angle_cls=True,use_gpu=False,lang="ch") fails with:
` ppocr DEBUG: Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir=None, page_num=0, det_algorithm='DB', det_model_dir='/root/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='quad', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='/root/.paddleocr/whl/rec/ch/ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='/usr/local/lib/python3.10/dist-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=True, cls_model_dir='/root/.paddleocr/whl/cls/ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', show_log=True, use_onnx=False, output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', ocr_version='PP-OCRv4', structure_version='PP-StructureV2')


C++ Traceback (most recent call last):

0 paddle_infer::Predictor::Predictor(paddle::AnalysisConfig const&)
1 std::unique_ptr<paddle::PaddlePredictor, std::default_delete<paddle::PaddlePredictor> > paddle::CreatePaddlePredictor<paddle::AnalysisConfig, (paddle::PaddleEngineKind)2>(paddle::AnalysisConfig const&)
2 paddle::AnalysisPredictor::Init(std::shared_ptr<paddle::framework::Scope> const&, std::shared_ptr<paddle::framework::ProgramDesc> const&)
3 paddle::AnalysisPredictor::PrepareProgram(std::shared_ptr<paddle::framework::ProgramDesc> const&)
4 paddle::AnalysisPredictor::OptimizeInferenceProgram()
5 paddle::inference::analysis::Analyzer::RunAnalysis(paddle::inference::analysis::Argument*)
6 paddle::inference::analysis::IrAnalysisPass::RunImpl(paddle::inference::analysis::Argument*)
7 paddle::inference::analysis::IRPassManager::Apply(std::unique_ptr<paddle::framework::ir::Graph, std::default_delete<paddle::framework::ir::Graph> >)
8 paddle::framework::ir::Pass::Apply(paddle::framework::ir::Graph*) const
9 paddle::framework::ir::SelfAttentionFusePass::ApplyImpl(paddle::framework::ir::Graph*) const
10 paddle::framework::ir::GraphPatternDetector::operator()(paddle::framework::ir::Graph*, std::function<void (std::map<paddle::framework::ir::PDNode*, paddle::framework::ir::Node*, paddle::framework::ir::GraphPatternDetector::PDNodeCompare, std::allocator<std::pair<paddle::framework::ir::PDNode* const, paddle::framework::ir::Node*> > > const&, paddle::framework::ir::Graph*)>)


Error Message Summary:

FatalError: Illegal instruction is detected by the operating system. [TimeInfo: *** Aborted at 1704261377 (unix time) try "date -d @1704261377" if you are using GNU date ***] [SignalInfo: *** SIGILL (@0x7f4e1c49386a) received by PID 1 (TID 0x7f4e23e1d740) from PID 474560618 ***]`

gowy222 avatar Jan 03 '24 05:01 gowy222

On our side, cuda11.7 + paddle2.6 works without problems. Could you give us more specific environment information and the test command?

cuicheng01 avatar Jan 03 '24 07:01 cuicheng01

On our side, cuda11.7 + paddle2.6 works without problems. Could you give us more specific environment information and the test command?

It is pure CPU, so CUDA is not installed... the cloud server has no graphics card in the first place...

FROM python:3.10-slim-bullseye
# Disable the GPU at the environment level
ENV CUDA_VISIBLE_DEVICES=-1

Environment information:

CPU Architecture:
CPU Model: AMD EPYC 7K62 48-Core Processor
CPU Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca
GPU Details: No NVIDIA GPU detected
GCC Version: No GCC installed
GLIBC Version: ldd (Debian GLIBC 2.31-13+deb11u7) 2.31

Packages installed by pip:

anyio 4.2.0
astor 0.8.1
attrdict 2.0.1
Babel 2.14.0
bce-python-sdk 0.8.98
beautifulsoup4 4.12.2
blinker 1.7.0
cachetools 5.3.2
certifi 2023.11.17
charset-normalizer 3.3.2
click 8.1.7
contourpy 1.2.0
cssselect 1.2.0
cssutils 2.9.0
cycler 0.12.1
Cython 3.0.7
decorator 5.1.1
et-xmlfile 1.1.0
exceptiongroup 1.2.0
fire 0.5.0
Flask 3.0.0
flask-babel 4.0.0
fonttools 4.47.0
future 0.18.3
h11 0.14.0
httpcore 1.0.2
httpx 0.26.0
idna 3.6
imageio 2.33.1
imgaug 0.4.0
itsdangerous 2.1.2
Jinja2 3.1.2
kiwisolver 1.4.5
lazy_loader 0.3
lmdb 1.4.1
lxml 5.0.0
MarkupSafe 2.1.3
matplotlib 3.8.2
networkx 3.2.1
numpy 1.26.3
opencv-contrib-python 4.6.0.66
opencv-python 4.6.0.66
openpyxl 3.1.2
opt-einsum 3.3.0
packaging 23.2
paddleocr 2.7.0.3
paddlepaddle 2.6.0
pandas 2.1.4
pdf2docx 0.5.6
pillow 10.2.0
pip 23.3.2
premailer 3.10.0
protobuf 4.25.1
psutil 5.9.7
pyclipper 1.3.0.post5
pycryptodome 3.19.1
PyMuPDF 1.20.2
pyparsing 3.1.1
python-dateutil 2.8.2
python-docx 1.1.0
pytz 2023.3.post1
PyYAML 6.0.1
rapidfuzz 3.6.1
rarfile 4.1
requests 2.31.0
scikit-image 0.22.0
scipy 1.11.4
setuptools 65.5.1
shapely 2.0.2
six 1.16.0
sniffio 1.3.0
soupsieve 2.5
termcolor 2.4.0
tifffile 2023.12.9
tqdm 4.66.1
typing_extensions 4.9.0
tzdata 2023.4
urllib3 2.1.0
visualdl 2.5.3
Werkzeug 3.0.1
wheel 0.42.0

gowy222 avatar Jan 03 '24 19:01 gowy222

AMD CPUs may indeed have some problems here; we need to report this internally and take a look.

cuicheng01 avatar Jan 04 '24 15:01 cuicheng01

Got the same error with CUDA 11.8 on Ubuntu 20.04 (Intel i7-13700 CPU) when upgrading from 2.5.2 to 2.6.0.

nigue3025 avatar Jan 05 '24 06:01 nigue3025

I had the same issue, pip install --no-cache-dir paddlepaddle==2.5.1 paddleocr==2.7.0.3 this fixed it.

OttomanZ avatar Jan 10 '24 13:01 OttomanZ

Looks like an AVX512 instruction snuck into the paddlepaddle==2.6.0 build. Here's the problematic instruction according to gdb:

>0x7f48dcdde86a      vmovss (%rax),%xmm16

VMOVSS using an xmm16 register
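
For readers who want to check their own machine: registers xmm16 through xmm31 are only addressable with AVX-512 encodings, so a CPU without AVX-512 raises SIGILL on such an instruction. The following is a minimal sketch, assuming an x86 Linux host where /proc/cpuinfo exposes a "flags" line and assuming avx512f is the relevant feature flag; the helper names cpu_flags and has_avx512f are hypothetical and not part of Paddle.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. non-x86 or empty input)


def has_avx512f(cpuinfo_text):
    """True if the AVX-512 Foundation flag is listed (assumed to be the relevant one)."""
    return "avx512f" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    if path.exists():
        print("avx512f supported:", has_avx512f(path.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```

The CPU flags posted earlier in this thread for the AMD EPYC 7K62 list avx and avx2 but no avx512 entries, which would be consistent with this diagnosis.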

jamesdull avatar Jan 10 '24 20:01 jamesdull

Thanks @OttomanZ. But pip install --no-cache-dir paddlepaddle==2.5.2 paddleocr==2.7.0.3 solved my problem, according to https://github.com/PaddlePaddle/Paddle/issues/57493
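
A small guard can make the pin explicit at startup. This is only a sketch: it treats paddlepaddle 2.6.x as affected because reports in this thread name 2.6.0 and 2.6.1, not because of any official statement, and the function names is_reported_affected and installed_paddle_version are hypothetical.

```python
from importlib import metadata


def is_reported_affected(version):
    """True for paddlepaddle 2.6.x, the versions reported as crashing in this thread."""
    return version.split(".")[:2] == ["2", "6"]


def installed_paddle_version():
    """Return the installed paddlepaddle (or paddlepaddle-gpu) version, or None."""
    for dist in ("paddlepaddle", "paddlepaddle-gpu"):
        try:
            return metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    version = installed_paddle_version()
    if version is None:
        print("paddlepaddle is not installed")
    elif is_reported_affected(version):
        print(f"paddlepaddle {version} is in the range reported to crash with SIGILL")
    else:
        print(f"paddlepaddle {version} is outside the reported range")
```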

polym avatar Jan 11 '24 08:01 polym

Baidu, this is really disappointing. I hit this bug the first time I tried it; truly a hopeless case!

coderLinJ5945 avatar Jan 12 '24 09:01 coderLinJ5945

I had the same issue: Intel CPU, Ubuntu, x86.

lmyzd avatar Jan 15 '24 03:01 lmyzd

Thanks @OttomanZ. But pip install --no-cache-dir paddlepaddle==2.5.2 paddleocr==2.7.0.3 solved my problem, according to #57493

LGTM. I had the same issue: Intel CPU, Manjaro, x86, CPU only.

yiranzai avatar Jan 22 '24 05:01 yiranzai

I had the same issue, pip install --no-cache-dir paddlepaddle==2.5.1 paddleocr==2.7.0.3 this fixed it.

nice answer

taibaimoyu avatar Jan 27 '24 08:01 taibaimoyu

Same here, CPU version 2.6.

GuodongQi avatar Jan 29 '24 07:01 GuodongQi

Is there a fix for this? 2.5.2 works, but inference is much slower than with 2.6.0.

xiaofeicn avatar Feb 21 '24 07:02 xiaofeicn

I had the same issue, pip install --no-cache-dir paddlepaddle==2.5.1 paddleocr==2.7.0.3 this fixed it.

Thank you, this version solves the problem.

fanxing-6 avatar Feb 21 '24 19:02 fanxing-6

You can use paddle 2.6.0, but paddleocr has to use PP-OCRv3; PP-OCRv4 has the problem, and 2.5.2 is too slow.

qq70571382 avatar Mar 01 '24 02:03 qq70571382

ocr_object = PaddleOCR(use_angle_cls=True, lang="ch", enable_mkldnn=False,ocr_version='PP-OCRv3') # Chinese
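
The fallback could be made conditional rather than hard-coded. The sketch below assumes, based only on reports in this thread and not on any maintainer confirmation, that the crash needs both paddlepaddle 2.6.x and a CPU without the avx512f flag; choose_ocr_version is a hypothetical helper, and the PaddleOCR arguments in the comment are the ones already used above.

```python
def choose_ocr_version(cpu_flags, paddle_version):
    """Pick 'PP-OCRv3' when the reported crash conditions are met, else 'PP-OCRv4'.

    cpu_flags: iterable of CPU feature flag strings (e.g. from /proc/cpuinfo).
    paddle_version: installed paddlepaddle version string such as '2.6.0'.
    """
    affected_build = paddle_version.split(".")[:2] == ["2", "6"]  # assumption from this thread
    lacks_avx512 = "avx512f" not in set(cpu_flags)                # assumption from this thread
    if affected_build and lacks_avx512:
        return "PP-OCRv3"
    return "PP-OCRv4"


# Example wiring (requires paddleocr to be installed):
# from paddleocr import PaddleOCR
# version = choose_ocr_version(flags, "2.6.0")
# ocr_object = PaddleOCR(use_angle_cls=True, lang="ch",
#                        enable_mkldnn=False, ocr_version=version)
```

Keeping the decision in a pure function makes it easy to test without loading any model.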

qq70571382 avatar Mar 01 '24 02:03 qq70571382

Same problem with the pure CPU build: 2.6.0 fails like this, 2.5.2 works.

eritpchy avatar Mar 06 '24 06:03 eritpchy

I had the same issue, pip install --no-cache-dir paddlepaddle==2.5.1 paddleocr==2.7.0.3 this fixed it

For me, python 3.10 is not compatible but python 3.7 is working with this setup.

simonejiang7 avatar Mar 20 '24 10:03 simonejiang7

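You can use paddle 2.6.0, but paddleocr has to use PP-OCRv3; PP-OCRv4 has the problem. It has been several months; is this still not fixed?

ubuntu 18.04 x64 intel cpu

The newly released 2.6.1 does not work either. Is anyone at Baidu handling this issue?

With paddlepaddle==2.5.2, parsing the same image takes 20 seconds with v4 but only 2 seconds with v3; the gap is too large.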

cole-dda avatar Mar 20 '24 10:03 cole-dda

The same problem is waiting to be solved

zainzhoucom avatar Mar 29 '24 08:03 zainzhoucom

+1

pip install --no-cache-dir paddlepaddle-gpu==2.5.2 paddleocr==2.7.0.3 works fine.

hello2mao avatar Apr 18 '24 02:04 hello2mao

When will this problem be fixed?

da2vin avatar Apr 22 '24 03:04 da2vin

The same problem in Docker environment

ZhangGaoxing avatar Apr 22 '24 07:04 ZhangGaoxing

Thanks @OttomanZ. But pip install --no-cache-dir paddlepaddle==2.5.2 paddleocr==2.7.0.3 solved my problem, according to #57493

It does work, but inference is now twice as slow...

caicaicai avatar May 02 '24 02:05 caicaicai

this worked "pip install --no-cache-dir paddlepaddle==2.5.2 paddleocr==2.7.0.3"
but how come this problem still seems to be there after almost half a year!? why isn't it fixed by default!?

StuckInLoop avatar May 02 '24 14:05 StuckInLoop

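This command works: "pip install --no-cache-dir paddlepaddle==2.5.2 paddleocr==2.7.0.3", but why does this problem still seem to exist after almost half a year!? Why isn't it fixed by default!?

It seems to be because the newer versions use AVX512 acceleration; if your CPU supports AVX512, there should be no problem.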

Hisir0909 avatar Jul 26 '24 07:07 Hisir0909