PaddleOCR `RuntimeError: Exception from the 'cv' worker: std::exception`

🔎 Search before asking

[x] I have searched the PaddleOCR Docs and found no similar bug report.
[x] I have searched the PaddleOCR Issues and found no similar bug report.
[x] I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (问题描述)

The following snippet fails to run:

from paddleocr import PaddleOCRVL
pipeline = PaddleOCRVL()
output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png")

Error output:

/usr/local/lib/python3.11/dist-packages/paddle/utils/cpp_extension/extension_utils.py:718: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
Creating model: ('PP-DocLayoutV2', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PP-DocLayoutV2`.
Creating model: ('PaddleOCR-VL-0.9B', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: `/root/.paddlex/official_models/PaddleOCR-VL`.
Loading configuration file /root/.paddlex/official_models/PaddleOCR-VL/config.json
Loading weights file /root/.paddlex/official_models/PaddleOCR-VL/model.safetensors
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
use GQA - num_heads: 16- num_key_value_heads: 2
/usr/local/lib/python3.11/dist-packages/paddle/utils/decorator_utils.py:420: Warning: 
Non compatible API. Please refer to https://www.paddlepaddle.org.cn/documentation/docs/en/develop/guides/model_convert/convert_from_pytorch/api_difference/torch/torch.split.html first.
  warnings.warn(
Loaded weights file from disk, setting weights to model.
All model checkpoint weights were used when initializing PaddleOCRVLForConditionalGeneration.

All the weights of PaddleOCRVLForConditionalGeneration were initialized from the model checkpoint at /root/.paddlex/official_models/PaddleOCR-VL.
If your task is similar to the task the model of the checkpoint was trained on, you can already use PaddleOCRVLForConditionalGeneration for predictions without further training.
Loading configuration file /root/.paddlex/official_models/PaddleOCR-VL/generation_config.json
Currently, the PaddleOCR-VL-0.9B local model only supports batch size of 1. The batch size will be updated to 1.
Connecting to https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png ...
Downloading paddleocr_vl_demo.png ...
[==================================================] 100.00%
CUDA error 209 [/paddle/third_party/cccl/cub/cub/util_device.cuh, 83]: no kernel image is available for execution on the device
CUDA error 101 [/paddle/third_party/cccl/cub/cub/util_device.cuh, 102]: invalid device ordinal
CUDA error 209 [/paddle/third_party/cccl/cub/cub/util_device.cuh, 308]: no kernel image is available for execution on the device
CUDA error 209 [/paddle/third_party/cccl/cub/cub/util_device.cuh, 391]: no kernel image is available for execution on the device
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_47/2521124400.py in <cell line: 0>()
      1 from paddleocr import PaddleOCRVL
      2 pipeline = PaddleOCRVL()
----> 3 output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png")
      4 for res in output:
      5     res.print()

/usr/local/lib/python3.11/dist-packages/paddleocr/_pipelines/paddleocr_vl.py in predict(self, input, use_doc_orientation_classify, use_doc_unwarping, use_layout_detection, use_chart_recognition, layout_threshold, layout_nms, layout_unclip_ratio, layout_merge_bboxes_mode, use_queues, prompt_label, format_block_content, repetition_penalty, temperature, top_p, min_pixels, max_pixels, **kwargs)
    132         **kwargs,
    133     ):
--> 134         return list(
    135             self.predict_iter(
    136                 input,

/usr/local/lib/python3.11/dist-packages/paddlex/inference/pipelines/_parallel.py in predict(self, input, *args, **kwargs)
    127             )
    128         else:
--> 129             yield from self._pipeline.predict(
    130                 input,
    131                 *args,

/usr/local/lib/python3.11/dist-packages/paddlex/inference/pipelines/paddleocr_vl/pipeline.py in predict(self, input, use_doc_orientation_classify, use_doc_unwarping, use_layout_detection, use_chart_recognition, layout_threshold, layout_nms, layout_unclip_ratio, layout_merge_bboxes_mode, use_queues, prompt_label, format_block_content, repetition_penalty, temperature, top_p, min_pixels, max_pixels, max_new_tokens, **kwargs)
    671                         continue
    672                     if not item[0]:
--> 673                         raise RuntimeError(
    674                             f"Exception from the '{item[1]}' worker: {item[2]}"
    675                         )

RuntimeError: Exception from the 'cv' worker: std::exception

🏃‍♂️ Environment (运行环境)

Kaggle notebook: https://gist.github.com/rozeappletree/1dc93b1b9a84ac74e9f6b18ccb6e44a8

Python 3.11.13
Tesla P100-PCIE-16GB  
System information (os.uname): posix.uname_result(sysname='Linux', nodename='4dd4ec48897f', release='6.6.105+', version='#1 SMP Sat Sep 27 10:16:09 UTC 2025', machine='x86_64')
Platform (platform.system): Linux
Release version (platform.release): 6.6.105+
Version details (platform.version): #1 SMP Sat Sep 27 10:16:09 UTC 2025
Machine architecture (platform.machine): x86_64

🌰 Minimal Reproducible Example (最小可复现问题的Demo)

Kaggle notebook: https://gist.github.com/rozeappletree/1dc93b1b9a84ac74e9f6b18ccb6e44a8

from paddleocr import PaddleOCRVL
pipeline = PaddleOCRVL()
output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png")

Nov 27 '25 04:11 rozeappletree

Related issues:

This was supposed to be solved in 3.1.1: https://huggingface.co/PaddlePaddle/PaddleOCR-VL/discussions/35
https://github.com/PaddlePaddle/PaddleOCR/issues/17124
https://github.com/PaddlePaddle/PaddleOCR/issues/17046

Nov 27 '25 04:11 rozeappletree

@Sunting78 any update on this, please?

Nov 27 '25 20:11 rozeappletree

Hello! Based on the information you provided, local inference with the PaddleOCR-VL-0.9B pipeline is hitting CUDA errors on a P100 machine. The `no kernel image is available for execution on the device` messages suggest that the installed build may not include GPU kernels the P100 can execute. We recommend deploying the PaddleOCR-VL-0.9B model with vLLM and calling it through its service interface to work around this limitation. Whether that works on your hardware depends on vLLM's own GPU support, so please test and adjust accordingly. We hope this helps you resolve the issue!
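To confirm the diagnosis above, a quick check is to print the GPU's compute capability and compare it with the architectures your installed Paddle build targets. The sketch below is only an illustration: `sm_tag` and `query_capability` are hypothetical helper names, not part of PaddleOCR, and it assumes `paddle.device.cuda.get_device_capability()` is available in your Paddle version. A Tesla P100 has compute capability 6.0, so it should print `sm_60` there.

```python
from typing import Optional, Tuple


def sm_tag(capability: Tuple[int, int]) -> str:
    """Format a (major, minor) compute capability as an 'sm_XY' tag."""
    major, minor = capability
    return f"sm_{major}{minor}"


def query_capability() -> Optional[Tuple[int, int]]:
    """Return the current GPU's compute capability, or None if unavailable.

    Hypothetical helper: returns None when Paddle is not installed,
    was built without CUDA, or sees no CUDA device.
    """
    try:
        import paddle
    except ImportError:
        return None
    if not paddle.device.is_compiled_with_cuda():
        return None
    if paddle.device.cuda.device_count() == 0:
        return None
    major, minor = paddle.device.cuda.get_device_capability()
    return (major, minor)


if __name__ == "__main__":
    cap = query_capability()
    if cap is None:
        print("No CUDA-enabled Paddle build or no GPU detected.")
    else:
        print(f"GPU compute capability: {sm_tag(cap)}")
```

If the printed architecture is not among those the installed wheel was compiled for, the "no kernel image" errors in the log are the expected symptom.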

Dec 03 '25 13:12 changdazhou