RuntimeError: Exception from the 'vlm' worker: (Unimplemented) FlashAttention 2 is unsupported
🔎 Search before asking
- [x] I have searched the PaddleOCR Docs and found no similar bug report.
- [x] I have searched the PaddleOCR Issues and found no similar bug report.
- [x] I have searched the PaddleOCR Discussions and found no similar bug report.
🐛 Bug (Problem Description)
Hi, I’m getting the following error when using PaddleOCR-VL:
RuntimeError: Exception from the 'vlm' worker: (Unimplemented) FlashAttention 2 is unsupported, please check the GPU compatibility and CUDA Version. (at ../paddle/phi/kernels/gpu/flash_attn_utils.h:393)
I am not using vLLM or any other external acceleration backend, only the default PaddleOCR-VL pipeline. I followed the documentation exactly, but the run still fails with the FlashAttention 2 error above.
My questions:
- Does the default PaddleOCR-VL pipeline require FlashAttention 2?
- Is FlashAttention 2 supported on my GPU and CUDA version?
- Is there a way to disable FlashAttention 2 and fall back to standard attention?
- Did I install PaddleOCR incorrectly?
Any help would be greatly appreciated. Thank you!
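Since the error message points at GPU compatibility and the CUDA version, it may help to check what the installed Paddle build itself reports. The following is a minimal sketch, not an official diagnostic: `gpu_report` is a hypothetical helper name, and it assumes that `paddle.version.cuda()`, `paddle.device.is_compiled_with_cuda()`, `paddle.device.cuda.device_count()` and `paddle.device.cuda.get_device_capability()` are available in the installed Paddle release. It only gathers information and does not decide whether FlashAttention 2 is supported.

```python
def gpu_report():
    """Collect what the installed Paddle build reports about CUDA and the GPU.

    Returns None when paddle cannot be imported at all.
    """
    try:
        import paddle
    except ImportError:
        return None

    report = {
        "paddle": paddle.__version__,
        # CUDA version the wheel was built against (assumed API, see lead-in)
        "cuda_build": paddle.version.cuda(),
        "compute_capability": None,
    }
    # Only query the device when this is a CUDA build and a GPU is visible
    if paddle.device.is_compiled_with_cuda() and paddle.device.cuda.device_count() > 0:
        report["compute_capability"] = paddle.device.cuda.get_device_capability()
    return report


if __name__ == "__main__":
    print(gpu_report())
```

Including this output in a bug report makes it easier to compare the wheel's CUDA build with the driver and GPU listed under Environment.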
🏃‍♂️ Environment (Runtime Environment)
Here is my environment:
GPU: NVIDIA RTX 5060 Ti
CUDA: CUDA 13
OS: Windows
Python: 3.11
PaddlePaddle-gpu: 3.2.1
PaddleX: 3.3.10
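The package versions listed above can be collected programmatically so that they reflect what is actually installed in the active interpreter. This is a rough sketch using only the standard library. The helper names `installed_versions` and `environment_report` are hypothetical, and the distribution names `paddlepaddle-gpu`, `paddlex` and `paddleocr` are assumed to match the names used with pip.

```python
import platform
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def environment_report(packages=("paddlepaddle-gpu", "paddlex", "paddleocr")):
    """Build a short plain-text summary of the OS, Python, and package versions."""
    lines = [
        f"OS: {platform.system()} {platform.release()}",
        f"Python: {platform.python_version()}",
    ]
    for name, version in installed_versions(packages).items():
        lines.append(f"{name}: {version or 'not installed'}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

GPU model and driver-level CUDA version are not covered here, since they are not visible through package metadata.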
🌰 Minimal Reproducible Example (Minimal demo that reproduces the problem)
from paddleocr import PaddleOCRVL

pipeline = PaddleOCRVL()
# pipeline = PaddleOCRVL(use_doc_orientation_classify=True)  # Use use_doc_orientation_classify to enable/disable the document orientation classification model
# pipeline = PaddleOCRVL(use_doc_unwarping=True)  # Use use_doc_unwarping to enable/disable the document unwarping module
# pipeline = PaddleOCRVL(use_layout_detection=False)  # Use use_layout_detection to enable/disable the layout detection module
output = pipeline.predict("1.jpg")
for res in output:
    res.print()  # Print the structured prediction output
    res.save_to_json(save_path="output")  # Save the current image's structured result in JSON format
    res.save_to_markdown(save_path="output")  # Save the current image's result in Markdown format
Hi, how did you install paddlepaddle-gpu?
I installed it using the following command:
python -m pip install paddlepaddle-gpu==3.2.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/
Please downgrade paddlex to version 3.3.9 (pip install paddlex==3.3.9) and check whether the issue still occurs.
Thank you, the code is working now.
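For anyone landing here with the same error, a quick way to see whether the installed paddlex matches either version mentioned in this thread is sketched below. The helper names `parse_version` and `paddlex_status` are hypothetical, and the only facts encoded are the ones reported above: 3.3.10 failed and 3.3.9 worked in this particular environment. Versions are compared as integer tuples because a plain string comparison would order "3.3.10" before "3.3.9".

```python
from importlib import metadata

# Versions as reported in this thread for one specific environment
REPORTED_FAILING = (3, 3, 10)
REPORTED_WORKING = (3, 3, 9)


def parse_version(text):
    """Parse the leading numeric components of a version string into a tuple."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def paddlex_status(version_text):
    """Describe how a paddlex version relates to the versions in this thread."""
    if version_text is None:
        return "paddlex is not installed"
    version = parse_version(version_text)
    if version == REPORTED_WORKING:
        return "matches the version reported working in this thread"
    if version == REPORTED_FAILING:
        return "matches the version reported failing in this thread"
    return "not covered by this thread"


if __name__ == "__main__":
    try:
        installed = metadata.version("paddlex")
    except metadata.PackageNotFoundError:
        installed = None
    print(paddlex_status(installed))
```

The result says nothing about other GPUs, CUDA versions, or later paddlex releases.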