PaddleX
Error when running the table_recognition pipeline on Ascend 910.
I am using the official Docker image ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-npu:cann800-ubuntu20-npu-910b-base-aarch64-gcc84 with PaddleX 3.0-RC1. The check python -c "import paddle; print(paddle.version)" runs without problems. After some investigation, I found that the error occurs when the SLANet-plus model is called.
root@ubuntu:/work# paddlex --pipeline layout_parsing \
--input 0011.jpeg \
--use_doc_orientation_classify False \
--use_doc_unwarping False \
--use_textline_orientation False \
--save_path ./output \
--device npu:0
Creating model: ('PP-LCNet_x1_0_doc_ori', None)
Using official model (PP-LCNet_x1_0_doc_ori), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
I0513 19:01:11.371599 51108 init.cc:237] ENV [CUSTOM_DEVICE_ROOT]=/usr/local/lib/python3.10/dist-packages/paddle_custom_device
I0513 19:01:11.371654 51108 init.cc:146] Try loading custom device libs from: [/usr/local/lib/python3.10/dist-packages/paddle_custom_device]
I0513 19:01:12.236369 51108 custom_device_load.cc:52] Succeed in loading custom runtime in lib: /usr/local/lib/python3.10/dist-packages/paddle_custom_device/libpaddle-custom-npu.so
I0513 19:01:12.236433 51108 custom_device_load.cc:59] Skipped lib [/usr/local/lib/python3.10/dist-packages/paddle_custom_device/libpaddle-custom-npu.so]: no custom engine Plugin symbol in this lib.
I0513 19:01:12.240267 51108 custom_kernel.cc:63] Succeed in loading 359 custom kernel(s) from loaded lib(s), will be used like native ones.
I0513 19:01:12.240479 51108 init.cc:158] Finished in LoadCustomDevice with libs_path: [/usr/local/lib/python3.10/dist-packages/paddle_custom_device]
I0513 19:01:12.240526 51108 init.cc:243] CustomDevice: npu, visible devices count: 2
Creating model: ('UVDoc', None)
Using official model (UVDoc), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('RT-DETR-H_layout_17cls', None)
Using official model (RT-DETR-H_layout_17cls), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('PP-OCRv4_server_det', None)
Using official model (PP-OCRv4_server_det), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('PP-OCRv4_server_rec', None)
Using official model (PP-OCRv4_server_rec), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('PP-OCRv4_server_seal_det', None)
Using official model (PP-OCRv4_server_seal_det), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('PP-OCRv4_server_rec', None)
Using official model (PP-OCRv4_server_rec), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('SLANet_plus', None)
Using official model (SLANet_plus), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Traceback (most recent call last):
File "/usr/local/bin/paddlex", line 8, in
--input 0011.jpg \
--use_doc_orientation_classify False \
--use_doc_unwarping False \
--use_textline_orientation False \
--save_path ./output \
--device npu:0
Creating model: ('PP-LCNet_x1_0_doc_ori', None)
Using official model (PP-LCNet_x1_0_doc_ori), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
I0513 19:01:45.694962 57513 init.cc:237] ENV [CUSTOM_DEVICE_ROOT]=/usr/local/lib/python3.10/dist-packages/paddle_custom_device
I0513 19:01:45.695019 57513 init.cc:146] Try loading custom device libs from: [/usr/local/lib/python3.10/dist-packages/paddle_custom_device]
I0513 19:01:46.575884 57513 custom_device_load.cc:52] Succeed in loading custom runtime in lib: /usr/local/lib/python3.10/dist-packages/paddle_custom_device/libpaddle-custom-npu.so
I0513 19:01:46.575968 57513 custom_device_load.cc:59] Skipped lib [/usr/local/lib/python3.10/dist-packages/paddle_custom_device/libpaddle-custom-npu.so]: no custom engine Plugin symbol in this lib.
I0513 19:01:46.589579 57513 custom_kernel.cc:63] Succeed in loading 359 custom kernel(s) from loaded lib(s), will be used like native ones.
I0513 19:01:46.589810 57513 init.cc:158] Finished in LoadCustomDevice with libs_path: [/usr/local/lib/python3.10/dist-packages/paddle_custom_device]
I0513 19:01:46.589886 57513 init.cc:243] CustomDevice: npu, visible devices count: 2
Creating model: ('UVDoc', None)
Using official model (UVDoc), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('RT-DETR-H_layout_17cls', None)
Using official model (RT-DETR-H_layout_17cls), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('PP-OCRv4_server_det', None)
Using official model (PP-OCRv4_server_det), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('PP-OCRv4_server_rec', None)
Using official model (PP-OCRv4_server_rec), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('PP-OCRv4_server_seal_det', None)
Using official model (PP-OCRv4_server_seal_det), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('PP-OCRv4_server_rec', None)
Using official model (PP-OCRv4_server_rec), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Creating model: ('SLANet_plus', None)
Using official model (SLANet_plus), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
.Call aclrtSynchronizeStream(reinterpret_cast<aclrtStream>(stream)) failed : 507053 at file /paddle/backends/npu/runtime/runtime.cc line 726
EZ1001: [PID: 57513] 2025-05-13-19:02:04.234.060 Shape of aclnnSliceV2 out should be [0,1,300,4], but current is [1,1,300,4].
TraceBack (most recent call last):
executor is nullptr.
Shape of aclnnSliceV2 out should be [0,1,300,17], but current is [1,1,300,17].
The error from device(chipId:6, dieId:0), serial number is 10, there is an aivec error exception, core id is 26, error code = 0x800000, dump info: pc start: 0x124c00000000, current: 0x124c00000bd8, vec error info: 0x13130c7994, mte error info: 0x9030000ba, ifu error info: 0x7036f3e0c7cc0, ccu error info: 0xecc8bec80d21453e, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd00028c, para base: 0x12c100531c00.[FUNC:ProcessStarsCoreErrorInfo][FILE:device_error_proc.cc][LINE:1417]
The extend info: errcode:(0x800000, 0, 0) errorStr: The DDR address of the MTE instruction is out of range.
fixp_error0 info: 0x30000ba, fixp_error1 info: 0x9 fsmId:0, tslot:0, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:ProcessStarsCoreErrorInfo][FILE:device_error_proc.cc][LINE:1429]
Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1356]
AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1124]
Aicore kernel execute failed, device_id=6, stream_id=42, report_stream_id=42, task_id=38087, flip_num=0, fault kernel_name=StridedSliceV3_a3763a988b9143d31d776fc7ba893cf4_high_performance__kernel0, fault kernel info ext=StridedSliceV3_a3763a988b9143d31d776fc7ba893cf4_high_performance__kernel0, program id=0, hash=18112778502496636264.[FUNC:GetError][FILE:stream.cc][LINE:1124]
[AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1124]
rtStreamSynchronize execute failed, reason=[device mem error][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
synchronize stream failed, runtime result = 507053[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
C++ Traceback (most recent call last):
0 paddle::AnalysisPredictor::ZeroCopyRun(bool)
1 paddle::framework::NaiveExecutor::RunInterpreterCore(std::vector<std::string, std::allocator<std::string > > const&, bool, bool)
2 paddle::framework::InterpreterCore::Run(std::vector<std::string, std::allocator<std::string > > const&, bool, bool, bool, bool)
3 paddle::framework::PirInterpreter::Run(std::vector<std::string, std::allocator<std::string > > const&, bool, bool, bool, bool)
4 paddle::framework::PirInterpreter::TraceRunImpl()
5 paddle::framework::PirInterpreter::TraceRunInstructionList(std::vector<std::unique_ptr<paddle::framework::InstructionBase, std::default_delete<paddle::framework::InstructionBase> >, std::allocator<std::unique_ptr<paddle::framework::InstructionBase, std::default_delete<paddle::framework::InstructionBase> > > > const&)
6 paddle::framework::PirInterpreter::RunInstructionBase(paddle::framework::InstructionBase*)
7 paddle::framework::PhiKernelInstruction::Run()
8 paddle::dialect::SliceOp::InferMeta(phi::InferMetaContext*)
9 paddle::experimental::IntArrayBase<phi::DenseTensor>::IntArrayBase(phi::DenseTensor const&)
10 void phi::Copy<phi::DeviceContext>(phi::DeviceContext const&, phi::DenseTensor const&, phi::Place, bool, phi::DenseTensor*)
11 phi::MemoryUtils::Copy(phi::Place const&, void*, phi::Place const&, void const*, unsigned long, void*)
12 void paddle::memory::Copy<phi::Place, phi::Place>(phi::Place, void*, phi::Place, void const*, unsigned long, void*)
13 void paddle::memory::Copy<phi::CPUPlace, phi::CustomPlace>(phi::CPUPlace, void*, phi::CustomPlace, void const*, unsigned long, void*)
14 phi::CustomDevice::MemoryCopyD2H(unsigned long, void*, void const*, unsigned long, phi::stream::Stream const*)
15 phi::CustomDevice::SynchronizeStream(unsigned long, phi::stream::Stream const*)
16 SyncStream(C_Device_st*, C_Stream_st*)
17 phi::DeviceManager::~DeviceManager()
18 std::_Hashtable<std::string, std::pair<std::string const, std::vector<std::unique_ptr<phi::Device, std::default_delete<phi::Device> >, std::allocator<std::unique_ptr<phi::Device, std::default_delete<phi::Device> > > > >, std::allocator<std::pair<std::string const, std::vector<std::unique_ptr<phi::Device, std::default_delete<phi::Device> >, std::allocator<std::unique_ptr<phi::Device, std::default_delete<phi::Device> > > > > >, std::__detail::_Select1st, std::equal_to<std::string >, std::hash<std::string >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::clear()
19 phi::CustomDevice::DeInitDevice(unsigned long)
20 ReleaseDevice(C_Device_st*)
21 std::__detail::_Map_base<int, std::pair<int const, std::__cxx11::list<void*, std::allocator<void*> > >, std::allocator<std::pair<int const, std::__cxx11::list<void*, std::allocator<void*> > > >, std::__detail::_Select1st, std::equal_to
Error Message Summary:
FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: *** Aborted at 1747134133 (unix time) try "date -d @1747134133" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0xe0a9) received by PID 57513 (TID 0xffff879e5950) from PID 57513 ***]
^C[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
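The log above is long, and the lines that matter most (the aclnnSliceV2 shape mismatch, the faulting kernel name, and the runtime result code) are easy to miss. Below is a minimal Python sketch, not part of PaddleX or CANN, that pulls those three items out of a captured log. The function name `summarize_npu_log` and the regular expressions are my own assumptions, modelled only on the message wording seen in this log; other CANN versions may format these lines differently.

```python
import re

# Patterns modelled on the message formats seen in the log above (assumption:
# other CANN/Paddle versions may word these lines differently).
SHAPE_RE = re.compile(
    r"Shape of (\w+) out should be (\[[\d,]*\]), but current is (\[[\d,]*\])"
)
KERNEL_RE = re.compile(r"fault kernel_name=([^,\s]+)")
RESULT_RE = re.compile(r"runtime result = (\d+)")

# A few fragments copied from the log in this issue, used as sample input.
SAMPLE = (
    "Shape of aclnnSliceV2 out should be [0,1,300,4], but current is [1,1,300,4]. "
    "fault kernel_name=StridedSliceV3_a3763a988b9143d31d776fc7ba893cf4_high_performance__kernel0, "
    "synchronize stream failed, runtime result = 507053[FUNC:ReportCallError]"
)


def summarize_npu_log(text):
    """Collect shape mismatches, fault kernel names, and runtime result codes."""
    mismatches = [
        {"op": op, "expected": expected, "actual": actual}
        for op, expected, actual in SHAPE_RE.findall(text)
    ]
    return {
        "shape_mismatches": mismatches,
        # De-duplicate, since the same kernel or code is often logged repeatedly.
        "fault_kernels": sorted(set(KERNEL_RE.findall(text))),
        "runtime_codes": sorted(set(RESULT_RE.findall(text))),
    }


if __name__ == "__main__":
    for key, value in summarize_npu_log(SAMPLE).items():
        print(key, value)
```

To use it on a real run, redirect the paddlex output to a file and pass the file contents to `summarize_npu_log` instead of `SAMPLE`.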
Checklist:
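- [ ] Searched existing issues for an answer
- [ ] Read the FAQ
- [ ] Read the PaddleX documentation
- [ ] Confirmed that the bug is still unfixed in the latest version
Problem description
Reproduction
- Have you successfully run the tutorials we provide?
- Did you modify the code from the tutorials? If so, please provide the code you ran.
- Which dataset are you using?
- Please provide the error message and the related logs.
Environment
- Please provide the versions of PaddlePaddle and PaddleX you are using.
- Please provide your operating system, e.g. Linux/Windows/MacOS.
- Which Python version are you using?
- Which CUDA/cuDNN versions are you using?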
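The environment questions above can be answered in one pass from inside the container. Here is a small standard-library sketch that does so. The helper names `package_version` and `collect_environment` are hypothetical, and the distribution names `paddlepaddle` and `paddlex` are assumptions about how the packages are registered in this image; a package installed under a different name is reported as "not installed".

```python
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a distribution, or a placeholder."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def collect_environment(packages=("paddlepaddle", "paddlex")):
    """Gather the OS, architecture, Python, and package versions in one dict."""
    return {
        "os": platform.platform(),
        "machine": platform.machine(),  # e.g. aarch64 on an Ascend host
        "python": platform.python_version(),
        "packages": {name: package_version(name) for name in packages},
    }


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

The CANN and NPU driver versions are not covered here, since they are not Python packages; they would have to be read from the Ascend toolkit installation itself.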