Hey there,

Code that previously ran without any problems now gives me this error message:

Could not load the custom kernel for multi-scale deformable attention: /home/randbee/.cache/torch_extensions/py310_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so: cannot open shared object file: No such file or directory

Has anyone experienced the same thing?

It seems like the docling installation is missing a file related to PyTorch.

Since I am running the process on a CPU, I already tried:

pip install docling --extra-index-url https://download.pytorch.org/whl/cpu

Jan 24 '25 10:01 jmvial

I have the same issue when running docling in any environment. Oddly, the error only appears on the first call to docling's convert function and then never shows up again for the same docling instance. It also seems to prevent docling from running on the GPU (although, as jmvial said, it also appears when I tell docling to use only the CPU).

Jan 29 '25 09:01 LuRe97

Potential duplicate of #671

Jan 29 '25 14:01 cau-git

I had the same issue. It seems to be caused by the newer torch==2.6 and the torchvision==0.21.0 that comes with it. I downgraded to torch==2.5.1 and torchvision==0.20.1 and have had no issues so far. The exact version combination is taken from https://pytorch.org/get-started/previous-versions/

Extra information sources:

https://github.com/huggingface/transformers/issues/35349
https://github.com/huggingface/transformers/pull/35979
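
To confirm which versions are actually active in an environment before and after pinning, something like the following may help. It is only a sketch: the `PINS` mapping simply repeats the versions mentioned above, and `installed_version` and `find_mismatches` are hypothetical helpers, not part of torch or docling.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

# Versions reported as working in this thread (an assumption, not an official recommendation).
PINS = {"torch": "2.5.1", "torchvision": "0.20.1"}


def installed_version(package: str) -> str | None:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def find_mismatches(pins: dict[str, str]) -> dict[str, tuple[str, str | None]]:
    """Map each package whose installed version differs from its pin to (wanted, found)."""
    mismatches = {}
    for package, wanted in pins.items():
        found = installed_version(package)
        # Local build tags such as "+cu124" or "+cpu" are ignored for the comparison.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches(PINS).items():
        print(f"{package}: wanted {wanted}, found {found}")
```

Run inside the activated virtual environment, it prints nothing when both pins match what is installed.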

Feb 14 '25 13:02 sadaisystems

Pinning torch and torchvision worked for me as well.

Feb 14 '25 14:02 TroyWilliams3687

Hello @sadaisystems and @TroyWilliams3687, which version of Docling are you using? I tried pinning torch==2.5.1 and torchvision==0.20.1 with the latest Docling versions (2.24.0, 2.25.0), and the issue persists.

Feb 26 '25 15:02 jmmfcoutinho

@jmmfcoutinho I ended up moving away from Tesseract as I couldn't get it running properly on Windows 11. I use EasyOCR now and I don't have the issue. I no longer have torch and torchvision pinned.

I was using v2.20 and v2.21 of docling before I made the change.

Feb 26 '25 16:02 TroyWilliams3687

This is the setup to replicate the error. I'm working on Windows with WSL2 (Ubuntu 24.04.1 LTS, Python 3.12.3).

Repo structure

error
├── README.md
├── error.py
└── requirements.txt

`error.py`

from docling.document_converter import DocumentConverter
source = "https://arxiv.org/pdf/2408.09869"  # document per local path or URL
converter = DocumentConverter()
result = converter.convert(source)
print(result.document.export_to_markdown())
# output: "## Docling Technical Report [...]"

`requirements.txt`

torch==2.5.1 
torchvision==0.20.1
docling==2.25.0

Commands before running the script

python3.12 -m venv .venv
source .venv/bin/activate
pip install --no-cache-dir -r requirements.txt

Script running

# Run script
python error.py

Error logs

Could not load the custom kernel for multi-scale deformable attention: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
Could not load the custom kernel for multi-scale deformable attention: /home/jmmfcoutinho/.cache/torch_extensions/py312_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so: cannot open shared object file: No such file or directory
Could not load the custom kernel for multi-scale deformable attention: /home/jmmfcoutinho/.cache/torch_extensions/py312_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so: cannot open shared object file: No such file or directory
Could not load the custom kernel for multi-scale deformable attention: /home/jmmfcoutinho/.cache/torch_extensions/py312_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so: cannot open shared object file: No such file or directory
Could not load the custom kernel for multi-scale deformable attention: /home/jmmfcoutinho/.cache/torch_extensions/py312_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so: cannot open shared object file: No such file or directory
Could not load the custom kernel for multi-scale deformable attention: /home/jmmfcoutinho/.cache/torch_extensions/py312_cu124/MultiScaleDeformableAttention/MultiScaleDeformableAttention.so: cannot open shared object file: No such file or directory
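
One possible reading of these logs is that the first message (CUDA_HOME not set) means the kernel was never built, so the later attempts fail because the `.so` file is missing. I have not verified that this is what happens. A minimal sketch for checking both conditions, where `diagnose_kernel` is a hypothetical helper and the path is the one from the logs with the home directory generalized to `~`:

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from pathlib import Path


def diagnose_kernel(so_path: str, env: Mapping[str, str] | None = None) -> dict[str, bool]:
    """Report whether CUDA_HOME is set and whether the compiled kernel file exists."""
    env = os.environ if env is None else env
    path = Path(so_path).expanduser()
    return {
        "cuda_home_set": bool(env.get("CUDA_HOME")),
        "build_dir_exists": path.parent.is_dir(),
        "shared_object_exists": path.is_file(),
    }


if __name__ == "__main__":
    # Path taken from the error log above; adjust the pyXXX_cuXXX part to your setup.
    report = diagnose_kernel(
        "~/.cache/torch_extensions/py312_cu124/"
        "MultiScaleDeformableAttention/MultiScaleDeformableAttention.so"
    )
    for check, ok in report.items():
        print(f"{check}: {ok}")
```

If `build_dir_exists` is true but `shared_object_exists` is false, that would be consistent with a build that started and never finished, though other explanations are possible.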

Output from docling conversions

<!-- image -->

## Docling Technical Report

Version 1.0

Christoph Auer Maksym Lysak Ahmed Nassar Michele Dolfi Nikolaos Livathinos Panos Vagenas Cesar Berrospi Ramis Matteo Omenetti Fabian Lindlbauer Kasper Dinkla Lokesh Mishra Yusik Kim Shubham Gupta Rafael Teixeira de Lima Valery Weber Lucas Morin Ingmar Meijer Viktor Kuropiatnyk Peter W. J. Staar

AI4K Group, IBM Research R¨ uschlikon, Switzerland
...

# the output is truncated

`pip freeze`

annotated-types==0.7.0
attrs==25.1.0
beautifulsoup4==4.13.3
certifi==2025.1.31
charset-normalizer==3.4.1
click==8.1.8
dill==0.3.9
docling==2.25.0
docling-core==2.20.0
docling-ibm-models==3.4.0
docling-parse==3.4.0
easyocr==1.7.2
et_xmlfile==2.0.0
filelock==3.17.0
filetype==1.2.0
fsspec==2025.2.0
huggingface-hub==0.29.1
idna==3.10
imageio==2.37.0
Jinja2==3.1.5
jsonlines==3.1.0
jsonref==1.1.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
latex2mathml==3.77.0
lazy_loader==0.4
lxml==5.3.1
markdown-it-py==3.0.0
marko==2.1.2
MarkupSafe==3.0.2
mdurl==0.1.2
mpire==2.10.2
mpmath==1.3.0
multiprocess==0.70.17
networkx==3.4.2
ninja==1.11.1.3
numpy==2.2.3
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.4.127
opencv-python-headless==4.11.0.86
openpyxl==3.1.5
packaging==24.2
pandas==2.2.3
pillow==11.1.0
pyclipper==1.3.0.post6
pydantic==2.10.6
pydantic-settings==2.8.0
pydantic_core==2.27.2
Pygments==2.19.1
pypdfium2==4.30.1
python-bidi==0.6.6
python-dateutil==2.9.0.post0
python-docx==1.1.2
python-dotenv==1.0.1
python-pptx==1.0.2
pytz==2025.1
PyYAML==6.0.2
referencing==0.36.2
regex==2024.11.6
requests==2.32.3
rich==13.9.4
rpds-py==0.23.1
Rtree==1.3.0
safetensors==0.5.3
scikit-image==0.25.2
scipy==1.15.2
semchunk==2.2.2
setuptools==75.8.1
shapely==2.0.7
shellingham==1.5.4
six==1.17.0
soupsieve==2.6
sympy==1.13.1
tabulate==0.9.0
tifffile==2025.2.18
tokenizers==0.21.0
torch==2.5.1
torchvision==0.20.1
tqdm==4.67.1
transformers==4.49.0
triton==3.1.0
typer==0.12.5
typing_extensions==4.12.2
tzdata==2025.1
urllib3==2.3.0
XlsxWriter==3.2.2
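
The freeze above includes several `nvidia-*-cu12` packages and `triton`, which suggests that the default CUDA-enabled torch wheel was installed rather than a CPU-only build. That would be consistent with the `cu124` in the cache path, though I have not confirmed it is related to the error. A small sketch for spotting this in any `pip freeze` output (`cuda_wheels` is a hypothetical helper):

```python
def cuda_wheels(freeze_text: str) -> list[str]:
    """Return the names of `nvidia-*` packages found in `pip freeze` output."""
    names = []
    for line in freeze_text.splitlines():
        # Only "name==version" lines are considered; anything else is skipped.
        name, sep, _version = line.strip().partition("==")
        if sep and name.lower().startswith("nvidia-"):
            names.append(name)
    return names


if __name__ == "__main__":
    sample = """\
nvidia-cublas-cu12==12.4.5.8
numpy==2.2.3
torch==2.5.1
triton==3.1.0
"""
    print(cuda_wheels(sample))  # ['nvidia-cublas-cu12']
```

An empty list would suggest a CPU-only install, at least as far as the NVIDIA runtime wheels are concerned.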

Notes

It seems that even with pinned torch and torchvision versions, the error persists.

Nevertheless, the output appears to be fine, but I can't say for sure.

I have tested different docling versions, and the problem starts with v2.12.0, when support for GPU accelerators was introduced.
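
If the message really is harmless, one option is to hide it while the underlying cause is investigated. The sketch below assumes the message is emitted through Python's standard `logging` module by a logger whose handlers are attached to the `transformers` logger; that is an assumption I have not checked for every version. `KernelWarningFilter` and `hide_kernel_warning` are hypothetical names. This only silences the message, it does not fix anything.

```python
import logging

KERNEL_MESSAGE = "Could not load the custom kernel for multi-scale deformable attention"


class KernelWarningFilter(logging.Filter):
    """Drop log records that contain the custom-kernel message; keep everything else."""

    def filter(self, record: logging.LogRecord) -> bool:
        return KERNEL_MESSAGE not in record.getMessage()


def hide_kernel_warning(logger_name: str = "transformers") -> int:
    """Attach the filter to every handler of `logger_name`; return how many were patched.

    The default logger name is an assumption. A return value of 0 means no
    handlers were found, so nothing was changed.
    """
    logger = logging.getLogger(logger_name)
    for handler in logger.handlers:
        handler.addFilter(KernelWarningFilter())
    return len(logger.handlers)
```

The filter goes on the handlers rather than on the logger itself because a filter set on a logger is not applied to records coming from its child loggers. Calling `hide_kernel_warning()` would need to happen after the library has set up its logging and before `converter.convert(...)`.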

Feb 26 '25 16:02 jmmfcoutinho

@TroyWilliams3687 thanks for the reply! Unfortunately, as you can see in my example, I'm using the simplest possible setup with all defaults (EasyOCR is the default).

Feb 26 '25 16:02 jmmfcoutinho
