PaddleOCR: PP-DocBee2-3B and the doc_understanding pipeline fail on the Huawei Ascend 910B4 NPU and also cannot run on CPU

Open. Routin opened this issue 2 weeks ago • 1 comment

🔎 Search before asking

[x] I have searched the PaddleOCR Docs and found no similar bug report.
[x] I have searched the PaddleOCR Issues and found no similar bug report.
[x] I have searched the PaddleOCR Discussions and found no similar bug report.

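🐛 Bug (Problem description)

PP-DocBee2-3B and the doc_understanding pipeline raise errors on the Huawei Ascend 910B4 NPU. I have verified that PP-OCR and PP-Structure run normally on the same NPU.

On CPU, inference tries to allocate about 2 PB of memory.
On NPU without a card ID specified, it fails because the device cannot be resolved.
On NPU with a card ID specified, it fails with ACL error 500002.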

Error messages

CPU

root@301-dev-arm03:/work/demo# python ppdoc.py 
I1103 17:07:21.083683 1643333 init.cc:238] ENV [CUSTOM_DEVICE_ROOT]=/usr/local/lib/python3.10/dist-packages/paddle_custom_device
I1103 17:07:21.083738 1643333 init.cc:146] Try loading custom device libs from: [/usr/local/lib/python3.10/dist-packages/paddle_custom_device]
I1103 17:07:21.787149 1643333 custom_device_load.cc:51] Succeed in loading custom runtime in lib: /usr/local/lib/python3.10/dist-packages/paddle_custom_device/libpaddle-custom-npu.so
I1103 17:07:21.787206 1643333 custom_device_load.cc:58] Skipped lib [/usr/local/lib/python3.10/dist-packages/paddle_custom_device/libpaddle-custom-npu.so]: no custom engine Plugin symbol in this lib.
I1103 17:07:21.789336 1643333 custom_kernel.cc:68] Succeed in loading 359 custom kernel(s) from loaded lib(s), will be used like native ones.
I1103 17:07:21.789510 1643333 init.cc:158] Finished in LoadCustomDevice with libs_path: [/usr/local/lib/python3.10/dist-packages/paddle_custom_device]
I1103 17:07:21.789549 1643333 init.cc:244] CustomDevice: npu, visible devices count: 2
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-DocBee2-3B`.
The `unk_token` parameter needs to be defined: we use `eos_token` by default.
Loading configuration file /root/.paddlex/official_models/PP-DocBee2-3B/config.json
Loading weights file /root/.paddlex/official_models/PP-DocBee2-3B/model_state.pdparams
Loaded weights file from disk, setting weights to model.
`Qwen2_5_VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46
All model checkpoint weights were used when initializing PPDocBee2Inference.

All the weights of PPDocBee2Inference were initialized from the model checkpoint at /root/.paddlex/official_models/PP-DocBee2-3B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use PPDocBee2Inference for predictions without further training.
Loading configuration file /root/.paddlex/official_models/PP-DocBee2-3B/generation_config.json
/usr/local/lib/python3.10/dist-packages/paddle/tensor/creation.py:1088: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach(), rather than paddle.to_tensor(sourceTensor).
  return tensor(
Traceback (most recent call last):
  File "/work/demo/ppdoc.py", line 16, in <module>
    results = model.predict(
  File "/usr/local/lib/python3.10/dist-packages/paddleocr/_models/base.py", line 57, in predict
    result = list(self.predict_iter(*args, **kwargs))
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/base/predictor/base_predictor.py", line 219, in __call__
    yield from self.apply(input, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/base/predictor/base_predictor.py", line 277, in apply
    prediction = self.process(batch_data, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/predictor.py", line 144, in process
    preds = self.infer.generate(data, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/modeling/qwen2_5_vl.py", line 2999, in generate
    generated_ids = super().generate(
  File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/base.py", line 405, in _decorate_function
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/common/vlm/generation/utils.py", line 1136, in generate
    return self.sample(
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/common/vlm/generation/utils.py", line 1405, in sample
    model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/modeling/qwen2_5_vl.py", line 2844, in prepare_inputs_for_generation
    position_ids, rope_deltas = self.get_rope_index(
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/modeling/qwen2_5_vl.py", line 2532, in get_rope_index
    range_tensor = paddle.arange(end=llm_grid_t).reshape([-1, 1])
  File "/usr/local/lib/python3.10/dist-packages/paddle/tensor/creation.py", line 2157, in arange
    tensor = _C_ops.arange(
MemoryError: (ResourceExhausted) Fail to alloc memory of 2251744329974656 size, error code is 12.
  [Hint: Expected error == 0, but received error:12 != 0:0.] (at /paddle/paddle/phi/core/memory/allocation/cpu_allocator.cc:48)
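As a quick arithmetic check of the size in the MemoryError above, the sketch below converts the requested byte count to PiB and to an implied element count. The 8-byte element width is an assumption (the traceback does not state the dtype used by `paddle.arange`); if it holds, `llm_grid_t` would be on the order of 2.8e14, which may suggest a corrupted or misread value rather than a genuinely large input.

```python
# Size copied from the MemoryError message above.
requested_bytes = 2251744329974656

PIB = 2 ** 50  # bytes in one pebibyte
pib = requested_bytes / PIB

# Assumption: the arange output is int64, i.e. 8 bytes per element.
BYTES_PER_INT64 = 8
implied_elements = requested_bytes // BYTES_PER_INT64

print(f"{pib:.4f} PiB requested")
print(f"{implied_elements} elements if the dtype is int64")
```

Under that assumption, the element count is close to, but not exactly, 2**48.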

NPU, no card ID specified

root@301-dev-arm03:/work/demo# python ppdoc.py 
Creating model: ('PP-DocBee2-3B', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-DocBee2-3B`.
I1103 16:36:04.717118 1621556 init.cc:238] ENV [CUSTOM_DEVICE_ROOT]=/usr/local/lib/python3.10/dist-packages/paddle_custom_device
I1103 16:36:04.717156 1621556 init.cc:146] Try loading custom device libs from: [/usr/local/lib/python3.10/dist-packages/paddle_custom_device]
I1103 16:36:05.426925 1621556 custom_device_load.cc:51] Succeed in loading custom runtime in lib: /usr/local/lib/python3.10/dist-packages/paddle_custom_device/libpaddle-custom-npu.so
I1103 16:36:05.426990 1621556 custom_device_load.cc:58] Skipped lib [/usr/local/lib/python3.10/dist-packages/paddle_custom_device/libpaddle-custom-npu.so]: no custom engine Plugin symbol in this lib.
I1103 16:36:05.429311 1621556 custom_kernel.cc:68] Succeed in loading 359 custom kernel(s) from loaded lib(s), will be used like native ones.
I1103 16:36:05.429504 1621556 init.cc:158] Finished in LoadCustomDevice with libs_path: [/usr/local/lib/python3.10/dist-packages/paddle_custom_device]
I1103 16:36:05.429541 1621556 init.cc:244] CustomDevice: npu, visible devices count: 2
The `unk_token` parameter needs to be defined: we use `eos_token` by default.
Loading configuration file /root/.paddlex/official_models/PP-DocBee2-3B/config.json
Loading weights file /root/.paddlex/official_models/PP-DocBee2-3B/model_state.pdparams
Loaded weights file from disk, setting weights to model.
`Qwen2_5_VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46
.All model checkpoint weights were used when initializing PPDocBee2Inference.

All the weights of PPDocBee2Inference were initialized from the model checkpoint at /root/.paddlex/official_models/PP-DocBee2-3B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use PPDocBee2Inference for predictions without further training.
Loading configuration file /root/.paddlex/official_models/PP-DocBee2-3B/generation_config.json
.Traceback (most recent call last):
  File "/work/demo/ppdoc.py", line 4, in <module>
    output = pipeline.predict(
  File "/usr/local/lib/python3.10/dist-packages/paddleocr/_pipelines/doc_understanding.py", line 54, in predict
    return list(self.predict_iter(input, **kwargs))
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/pipelines/doc_understanding/pipeline.py", line 73, in predict
    yield from self.doc_understanding_model(input, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/base/predictor/base_predictor.py", line 219, in __call__
    yield from self.apply(input, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/base/predictor/base_predictor.py", line 277, in apply
    prediction = self.process(batch_data, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/predictor.py", line 140, in process
    data = self._switch_inputs_to_device(data)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/predictor.py", line 245, in _switch_inputs_to_device
    rst_dict = {
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/predictor.py", line 247, in <dictcomp>
    paddle.to_tensor(input_dict[k], place=self.device)
  File "/usr/local/lib/python3.10/dist-packages/paddle/tensor/creation.py", line 1088, in to_tensor
    return tensor(
  File "/usr/local/lib/python3.10/dist-packages/paddle/tensor/creation.py", line 965, in tensor
    place = _get_paddle_place(device)
  File "/usr/local/lib/python3.10/dist-packages/paddle/base/framework.py", line 8369, in _get_paddle_place
    device_id = place_info_list[1]
IndexError: list index out of range
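The IndexError above is raised at `device_id = place_info_list[1]`, which is consistent with a device string that has no `:<index>` suffix. The sketch below illustrates that reading and a possible caller-side workaround. Both the colon-split parsing and the `normalize_device` helper are assumptions for illustration, not Paddle's or PaddleX's actual code.

```python
def normalize_device(device: str, default_index: int = 0) -> str:
    """Hypothetical helper: append a card index when the device string has none."""
    device = device.strip()
    if device == "cpu" or ":" in device:
        return device
    return f"{device}:{default_index}"


# Assumed parsing, modelled on the `place_info_list[1]` line in the traceback:
# splitting "npu" on ":" gives a single-element list, so index 1 does not exist.
parts = "npu".split(":")
print(parts)

print(normalize_device("npu"))
print(normalize_device("npu:1"))
```

Passing an explicit value such as `device="npu:0"` avoids this particular error, as the next log shows, although inference then fails later for a different reason.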

NPU, card ID specified

root@301-dev-arm03:/work/demo# python ppdoc.py 
Creating model: ('PP-DocBee2-3B', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-DocBee2-3B`.
I1103 16:45:18.041335 1628373 init.cc:238] ENV [CUSTOM_DEVICE_ROOT]=/usr/local/lib/python3.10/dist-packages/paddle_custom_device
I1103 16:45:18.041374 1628373 init.cc:146] Try loading custom device libs from: [/usr/local/lib/python3.10/dist-packages/paddle_custom_device]
I1103 16:45:18.734818 1628373 custom_device_load.cc:51] Succeed in loading custom runtime in lib: /usr/local/lib/python3.10/dist-packages/paddle_custom_device/libpaddle-custom-npu.so
I1103 16:45:18.734863 1628373 custom_device_load.cc:58] Skipped lib [/usr/local/lib/python3.10/dist-packages/paddle_custom_device/libpaddle-custom-npu.so]: no custom engine Plugin symbol in this lib.
I1103 16:45:18.737126 1628373 custom_kernel.cc:68] Succeed in loading 359 custom kernel(s) from loaded lib(s), will be used like native ones.
I1103 16:45:18.737306 1628373 init.cc:158] Finished in LoadCustomDevice with libs_path: [/usr/local/lib/python3.10/dist-packages/paddle_custom_device]
I1103 16:45:18.737344 1628373 init.cc:244] CustomDevice: npu, visible devices count: 2
The `unk_token` parameter needs to be defined: we use `eos_token` by default.
Loading configuration file /root/.paddlex/official_models/PP-DocBee2-3B/config.json
Loading weights file /root/.paddlex/official_models/PP-DocBee2-3B/model_state.pdparams
Loaded weights file from disk, setting weights to model.
`Qwen2_5_VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46
.All model checkpoint weights were used when initializing PPDocBee2Inference.

All the weights of PPDocBee2Inference were initialized from the model checkpoint at /root/.paddlex/official_models/PP-DocBee2-3B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use PPDocBee2Inference for predictions without further training.
Loading configuration file /root/.paddlex/official_models/PP-DocBee2-3B/generation_config.json
./usr/local/lib/python3.10/dist-packages/paddle/tensor/creation.py:1088: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach(), rather than paddle.to_tensor(sourceTensor).
  return tensor(
/usr/local/lib/python3.10/dist-packages/paddle/utils/decorator_utils.py:420: Warning: 
Non compatible API. Please refer to https://www.paddlepaddle.org.cn/documentation/docs/en/develop/guides/model_convert/convert_from_pytorch/api_difference/torch/torch.max.html first.
  warnings.warn(
Traceback (most recent call last):
  File "/work/demo/ppdoc.py", line 4, in <module>
    output = pipeline.predict(
  File "/usr/local/lib/python3.10/dist-packages/paddleocr/_pipelines/doc_understanding.py", line 54, in predict
    return list(self.predict_iter(input, **kwargs))
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/pipelines/doc_understanding/pipeline.py", line 73, in predict
    yield from self.doc_understanding_model(input, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/base/predictor/base_predictor.py", line 219, in __call__
    yield from self.apply(input, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/base/predictor/base_predictor.py", line 277, in apply
    prediction = self.process(batch_data, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/predictor.py", line 144, in process
    preds = self.infer.generate(data, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/modeling/qwen2_5_vl.py", line 2999, in generate
    generated_ids = super().generate(
  File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/base.py", line 405, in _decorate_function
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/common/vlm/generation/utils.py", line 1136, in generate
    return self.sample(
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/common/vlm/generation/utils.py", line 1409, in sample
    outputs = self(**model_inputs)
  File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1576, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/modeling/qwen2_5_vl.py", line 2737, in forward
    image_embeds = self.visual(pixel_values, grid_thw=image_grid_thw)
  File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1576, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/modeling/qwen2_5_vl.py", line 2920, in forward
    hidden_states = self.patch_embed(hidden_states)
  File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1576, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/modeling/qwen2_5_vl.py", line 711, in forward
    hidden_states = self.proj(
  File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/layers.py", line 1576, in __call__
    return self.forward(*inputs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddle/utils/decorator_utils.py", line 183, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddle/nn/layer/conv.py", line 1158, in forward
    out = F.conv._conv_nd(
  File "/usr/local/lib/python3.10/dist-packages/paddle/nn/functional/conv.py", line 199, in _conv_nd
    pre_bias = _C_ops.conv3d(
OSError: (External)  ACL error, the error code is : 500002.  (at /paddle/backends/npu/kernels/funcs/npu_op_runner.cc:662)

🏃‍♂️ Environment

CPU: Kunpeng 920
NPU: Ascend 910B4
paddlex: 3.2.1
paddlepaddle: 3.2.0
paddleocr: 3.2.0
Docker container: started from the image provided by Paddle

🌰 Minimal Reproducible Example

from paddleocr import DocVLM

# Load PP-DocBee2-3B on the first NPU card.
model = DocVLM(model_name="PP-DocBee2-3B", device="npu:0")
results = model.predict(
    # The query means: "Recognize the contents of this table and output it in markdown format."
    input={"image": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png", "query": "识别这份表格的内容, 以markdown格式输出"},
    batch_size=1
)
for res in results:
    res.print()
    res.save_to_json("./output/res.json")

Some guesses

Does Qwen2.5-VL use FlashAttention? As far as I know, the Ascend environment does not support FlashAttention.

Nov 03 '25 09:11 Routin

The error code I ran into is 500001.

Loading configuration file /root/.paddlex/official_models/PP-DocBee2-3B/config.json
Loading weights file /root/.paddlex/official_models/PP-DocBee2-3B/model_state.pdparams
`Qwen2_5_VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46                                                                                                                  
Loaded weights file from disk, setting weights to model.
All model checkpoint weights were used when initializing PPDocBee2Inference.
                                                                                                                                          
All the weights of PPDocBee2Inference were initialized from the model checkpoint at /root/.paddlex/official_models/PP-DocBee2-3B.
If your task is similar to the task the model of the checkpoint was trained on, you can already use PPDocBee2Inference for predictions without further training.                                                                                                                    
Loading configuration file /root/.paddlex/official_models/PP-DocBee2-3B/generation_config.json
Traceback (most recent call last):
  File "/usr/local/bin/paddleocr", line 8, in <module>
    sys.exit(console_entry())
  File "/usr/local/lib/python3.10/dist-packages/paddleocr/__main__.py", line 26, in console_entry
    main()
  File "/usr/local/lib/python3.10/dist-packages/paddleocr/_cli.py", line 192, in main
    _execute(args)
  File "/usr/local/lib/python3.10/dist-packages/paddleocr/_cli.py", line 181, in _execute
    args.executor(args)
  File "/usr/local/lib/python3.10/dist-packages/paddleocr/_pipelines/doc_understanding.py", line 107, in execute_with_args
    perform_simple_inference(DocUnderstanding, params)
  File "/usr/local/lib/python3.10/dist-packages/paddleocr/_utils/cli.py", line 68, in perform_simple_inference
    for i, res in enumerate(result):
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/pipelines/doc_understanding/pipeline.py", line 73, in predict
    yield from self.doc_understanding_model(input, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/base/predictor/base_predictor.py", line 273, in __call__
    yield from self.apply(input, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/base/predictor/base_predictor.py", line 330, in apply
    prediction = self.process(batch_data, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/predictor.py", line 185, in process
    data = self.processor.preprocess(data)
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/processors/qwen2_5_vl.py", line 532, in preprocess
    rst_inputs = super().preprocess(
  File "/usr/local/lib/python3.10/dist-packages/paddlex/inference/models/doc_vlm/processors/qwen2_5_vl.py", line 153, in preprocess
    * int(image_grid_thw[index].prod() // merge_length),
  File "/usr/local/lib/python3.10/dist-packages/paddle/utils/decorator_utils.py", line 199, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/paddle/tensor/math.py", line 4649, in prod
    return _C_ops.prod(x, axis, keepdim, reduce_all)
OSError: (External)  ACL error, the error code is : 500001.  (at /paddle/backends/npu/kernels/funcs/npu_op_runner.cc:453)
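To compare the two NPU failures side by side, the sketch below pulls the ACL error code and source location out of the final `OSError` lines quoted in this thread. The regular expression and the `parse_acl_error` helper are illustrative assumptions based only on the message format seen here, not a documented Paddle log format.

```python
import re

# Pattern inferred from the two OSError lines in this thread.
ACL_ERROR_RE = re.compile(
    r"ACL error, the error code is : (\d+)\.\s+\(at (\S+):(\d+)\)"
)


def parse_acl_error(line: str):
    """Hypothetical helper: return (code, file, line_no) or None if no match."""
    match = ACL_ERROR_RE.search(line)
    if match is None:
        return None
    code, path, line_no = match.groups()
    return code, path, int(line_no)


issue_line = (
    "OSError: (External)  ACL error, the error code is : 500002.  "
    "(at /paddle/backends/npu/kernels/funcs/npu_op_runner.cc:662)"
)
comment_line = (
    "OSError: (External)  ACL error, the error code is : 500001.  "
    "(at /paddle/backends/npu/kernels/funcs/npu_op_runner.cc:453)"
)

for text in (issue_line, comment_line):
    print(parse_acl_error(text))
```

The two reports differ in both the code and the line inside `npu_op_runner.cc`, and they fail in different ops (`conv3d` in the issue, `prod` in this comment).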

Nov 05 '25 08:11 dpdpbj
