
RuntimeError: Exception from the 'vlm' worker: (Unimplemented) FlashAttention 2 is unsupported

Open salsasalbilabila5-ui opened this issue 1 month ago • 4 comments

🔎 Search before asking

  • [x] I have searched the PaddleOCR Docs and found no similar bug report.
  • [x] I have searched the PaddleOCR Issues and found no similar bug report.
  • [x] I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem Description)

Hi, I’m getting the following error when using PaddleOCR-VL:

RuntimeError: Exception from the 'vlm' worker: (Unimplemented) FlashAttention 2 is unsupported, please check the GPU compatibility and CUDA Version. (at ../paddle/phi/kernels/gpu/flash_attn_utils.h:393)

I am not using vLLM or any external acceleration backend — I’m just using the default PaddleOCR-VL pipeline. I followed the documentation exactly, but FlashAttention 2 still fails to load.

My questions:

  1. Does the default PaddleOCR-VL pipeline require FlashAttention 2?
  2. Is FlashAttention 2 supported on my GPU and CUDA version?
  3. Is there a way to disable FlashAttention 2 and fall back to standard attention?
  4. Did I install PaddleOCR incorrectly?

Any help would be greatly appreciated. Thank you!

🏃‍♂️ Environment (Runtime Environment)

Here is my environment:

GPU: NVIDIA RTX 5060 Ti
CUDA: CUDA 13
OS: Windows
Python: 3.11
PaddlePaddle-gpu: 3.2.1
PaddleX: 3.3.10
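
The versions listed above can also be read from the active Python environment rather than typed by hand, which helps rule out a mismatch between what was installed and what is actually imported. The following is a minimal sketch using only the standard library; the helper name installed_versions is hypothetical, and "paddleocr" is included only because the reproduction script imports it.

```python
from importlib import metadata

# Distribution names taken from the environment list in this report;
# "paddleocr" is an assumption based on the import in the reproduction script.
PACKAGES = ("paddlepaddle-gpu", "paddlex", "paddleocr")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the same interpreter that runs the pipeline shows which versions are really in use; it says nothing about GPU or CUDA compatibility.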

🌰 Minimal Reproducible Example (Minimal Demo That Reproduces the Issue)

from paddleocr import PaddleOCRVL

pipeline = PaddleOCRVL()
# pipeline = PaddleOCRVL(use_doc_orientation_classify=True) # Use use_doc_orientation_classify to enable/disable document orientation classification model
# pipeline = PaddleOCRVL(use_doc_unwarping=True) # Use use_doc_unwarping to enable/disable document unwarping module
# pipeline = PaddleOCRVL(use_layout_detection=False) # Use use_layout_detection to enable/disable layout detection module
output = pipeline.predict("1.jpg")
for res in output:
    res.print() ## Print the structured prediction output
    res.save_to_json(save_path="output") ## Save the current image's structured result in JSON format
    res.save_to_markdown(save_path="output") ## Save the current image's result in Markdown format

salsasalbilabila5-ui, Dec 09 '25 08:12

Hi, how did you install paddlepaddle-gpu?

Bobholamovic, Dec 09 '25 13:12

I installed it using the following command: python -m pip install paddlepaddle-gpu==3.2.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/

salsasalbilabila5-ui, Dec 10 '25 01:12

Please downgrade paddlex to version 3.3.9 (pip install paddlex==3.3.9) and check whether the issue still occurs.

Bobholamovic, Dec 10 '25 06:12

Thank you, the code is working now.

salsasalbilabila5-ui, Dec 10 '25 08:12
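
For anyone hitting the same error, a quick way to see whether the installed paddlex is newer than the version that worked in this thread is sketched below. It only encodes what was reported here (error with 3.3.10, fine after downgrading to 3.3.9) and does not claim to identify the root cause. The helper names are hypothetical, and the parser assumes plain dotted release numbers with no pre-release suffix.

```python
from importlib import metadata

# Reported in this thread: the error appeared with paddlex 3.3.10 and
# went away after downgrading to 3.3.9.
KNOWN_GOOD = "3.3.9"


def version_tuple(version):
    """Turn a plain dotted release string such as '3.3.10' into (3, 3, 10).

    Assumes purely numeric components; suffixes like 'rc1' are not handled.
    """
    return tuple(int(part) for part in version.split("."))


def is_newer_than_known_good(installed, known_good=KNOWN_GOOD):
    """Return True if `installed` is a later release than `known_good`."""
    return version_tuple(installed) > version_tuple(known_good)


if __name__ == "__main__":
    try:
        current = metadata.version("paddlex")
    except metadata.PackageNotFoundError:
        print("paddlex is not installed")
    else:
        if is_newer_than_known_good(current):
            print(f"paddlex {current} is newer than {KNOWN_GOOD}; "
                  "the workaround in this thread was to downgrade")
        else:
            print(f"paddlex {current} is at or below {KNOWN_GOOD}")
```

The versions are compared as integer tuples rather than strings because a plain string comparison would order "3.3.10" before "3.3.9".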