PaddleDetection
RuntimeError: (PreconditionNotMet) Cannot load cudnn shared library. Cannot invoke method cudnnGetVersion. [Hint: cudnn_dso_handle should not be null.] (at /paddle/paddle/phi/backends/dynload/cudnn.cc:60)
Search before asking
Bug Component
Installation
Describe the Bug
The error below appears when I run inference on the GPU using the configuration file from the examples.
Here are the full logs.
root@xxx:/PaddleDetection# python deploy/pipeline/pipeline.py --config deploy/pipeline/config/examples/infer_cfg_human_mot.yml --video_file=test.mp4 --device=gpu
/PaddleDetection/deploy/pipeline/pipeline.py:24: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
from collections import Sequence, defaultdict
----------- Running Arguments -----------
MOT:
batch_size: 1
enable: true
model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip
tracker_config: deploy/pipeline/config/tracker_config.yml
crop_thresh: 0.5
visual: true
warmup_frame: 50
------------------------------------------
Multi-Object Tracking enabled
100%|██████████| 186349/186349 [02:09<00:00, 1441.91KB/s]
MOT model dir: /root/.cache/paddle/infer_weights/mot_ppyoloe_l_36e_pipeline
----------- Model Configuration -----------
Model Arch: YOLO
Transform Order:
--transform op: Resize
--transform op: Permute
--------------------------------------------
video fps: 30, frame_count: 19854
Thread: 0; frame id: 0
Traceback (most recent call last):
File "/PaddleDetection/deploy/pipeline/pipeline.py", line 1103, in <module>
main()
File "/PaddleDetection/deploy/pipeline/pipeline.py", line 1090, in main
pipeline.run_multithreads()
File "/PaddleDetection/deploy/pipeline/pipeline.py", line 170, in run_multithreads
self.predictor.run(self.input)
File "/PaddleDetection/deploy/pipeline/pipeline.py", line 488, in run
self.predict_video(input, thread_idx=thread_idx)
File "/PaddleDetection/deploy/pipeline/pipeline.py", line 668, in predict_video
res = self.mot_predictor.predict_image(
File "/PaddleDetection/deploy/pptracking/python/mot_sde_infer.py", line 478, in predict_image
inputs = self.preprocess(batch_image_list)
File "/PaddleDetection/deploy/pptracking/python/det_infer.py", line 140, in preprocess
input_tensor.copy_from_cpu(inputs[input_names[i]])
File "/root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/paddle/fluid/inference/wrapper.py", line 38, in tensor_copy_from_cpu
self.copy_from_cpu_bind(data)
RuntimeError: (PreconditionNotMet) Cannot load cudnn shared library. Cannot invoke method cudnnGetVersion.
[Hint: cudnn_dso_handle should not be null.] (at /paddle/paddle/phi/backends/dynload/cudnn.cc:60)
Could you help me fix this?
It works when I remove "--device=gpu" from the command line.
Environment
OS
root@xxx:/PaddleDetection# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy
GPU and drivers
root@xxx:/PaddleDetection# nvidia-smi
Wed Jan 18 06:20:36 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.11 Driver Version: 525.60.11 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA A100 80G... Off | 00000000:00:10.0 Off | 0 |
| N/A 29C P0 43W / 300W | 4MiB / 81920MiB | 0% Default |
| | | Disabled |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
nvcc
root@xxx:/PaddleDetection# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
python
root@xxx:/PaddleDetection# python -V
Python 3.9.16
packages for python
root@xxx:/PaddleDetection# pip list
Package Version
------------------ -------------
aiofiles 22.1.0
aiohttp 3.8.3
aiosignal 1.3.1
altair 4.2.0
anyio 3.6.2
astor 0.8.1
async-timeout 4.0.2
attrdict 2.0.1
attrs 22.2.0
Babel 2.11.0
bce-python-sdk 0.8.74
Brotli 1.0.9
certifi 2022.12.7
charset-normalizer 2.1.1
click 8.1.3
contourpy 1.0.7
cycler 0.11.0
Cython 0.29.33
decorator 5.1.1
dill 0.3.6
entrypoints 0.4
fastapi 0.89.1
ffmpy 0.3.0
filterpy 1.4.5
Flask 2.2.2
flask-babel 3.0.0
fonttools 4.38.0
frozenlist 1.3.3
fsspec 2022.11.0
future 0.18.3
gevent 22.10.2
geventhttpclient 2.0.8
gradio 3.16.2
greenlet 2.0.1
grpcio 1.41.0
h11 0.14.0
httpcore 0.16.3
httpx 0.23.3
idna 3.4
importlib-metadata 6.0.0
itsdangerous 2.1.2
Jinja2 3.1.2
joblib 1.2.0
jsonschema 4.17.3
kiwisolver 1.4.4
lap 0.4.0
linkify-it-py 1.0.3
markdown-it-py 2.1.0
MarkupSafe 2.1.2
matplotlib 3.6.3
mdit-py-plugins 0.3.3
mdurl 0.1.2
motmetrics 1.4.0
mpmath 1.2.1
multidict 6.0.4
multiprocess 0.70.14
numpy 1.24.1
onnx 1.12.0
opencv-python 4.5.5.64
opt-einsum 3.3.0
orjson 3.8.5
packaging 23.0
paddle-bfloat 0.1.7
paddlepaddle-gpu 2.4.1.post117
pandas 1.5.2
Pillow 9.4.0
pip 22.3.1
protobuf 3.20.0
psutil 5.9.4
pyclipper 1.3.0.post4
pycocotools 2.0.6
pycryptodome 3.16.0
pydantic 1.10.4
pydub 0.25.1
pyparsing 3.0.9
pyrsistent 0.19.3
python-dateutil 2.8.2
python-multipart 0.0.5
python-rapidjson 1.9
pytz 2022.7.1
PyYAML 6.0
rarfile 4.0
requests 2.28.2
rfc3986 1.5.0
Bug description confirmation
- [X] I confirm that the bug reproduction steps, a description of code changes, and environment information have been provided, and that the problem can be reproduced.
Are you willing to submit a PR?
- [X] I'd like to help by submitting a PR!
According to the log message, this error is probably caused by the cuDNN path not being set correctly, or by cuDNN not being installed on your system.
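One way to test this hint directly, before changing any paths, is to try loading cuDNN from Python through the normal dynamic loader. The sketch below is only an illustration: the helper name `cudnn_version` and the candidate file names are assumptions and not part of Paddle; `cudnnGetVersion` is the cuDNN function named in the error message.

```python
import ctypes


def cudnn_version(candidates=("libcudnn.so", "libcudnn.so.8")):
    """Return the cuDNN version as an int, or None if nothing could be loaded.

    The candidate names are assumptions; adjust them to your installation.
    """
    for name in candidates:
        try:
            # Uses the regular loader search path (ld cache, LD_LIBRARY_PATH, ...).
            lib = ctypes.CDLL(name)
        except OSError:
            continue  # not found or not loadable, try the next name
        try:
            get_version = lib.cudnnGetVersion
        except AttributeError:
            continue  # library loaded but does not export the symbol
        get_version.restype = ctypes.c_size_t
        return int(get_version())
    return None


if __name__ == "__main__":
    print("cuDNN version:", cudnn_version())
```

If this prints `None` in the same shell where the pipeline fails, the loader cannot see cuDNN there, which is consistent with the "cudnn_dso_handle should not be null" hint.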
Thank you for your prompt reply. I'll try installing the proper cuDNN version and will report back.
Hello, did you solve the problem?
I am having the same issue.
The first step is to check whether libcudnn.so and libcublas.so are present in the shared library directory. Enter the command below in the terminal.
ls /usr/lib | grep -E 'libcudnn|libcublas'
If you don't have the libcudnn.so and libcublas.so files there, find their location with the commands below.
locate libcudnn.so
locate libcublas.so
In my case, libcudnn.so is located at /usr/local/cuda-12.1/targets/x86_64-linux/include/libcudnn.so.8.9.1
and libcublas.so is located at /usr/local/cuda-12.1/targets/x86_64-linux/lib/libcublas.so.12.1.3.1
Once you have located them, add them to the shared library directory by following the steps below.
Enter the /usr/lib folder:
cd /usr/lib
Create the libcudnn.so and libcublas.so symlinks:
sudo ln -s /usr/local/cuda-12.1/targets/x86_64-linux/include/libcudnn.so.8.9.1 libcudnn.so
sudo ln -s /usr/local/cuda-12.1/targets/x86_64-linux/lib/libcublas.so.12.1.3.1 libcublas.so
Now check whether they have been added to the shared library directory:
ls /usr/lib | grep -E 'libcudnn|libcublas'
If the command above lists libcudnn.so and libcublas.so, the issue should be gone.
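The locate-and-check steps above can also be sketched in Python, which helps when `locate` is not installed. This is a minimal sketch under stated assumptions: the helper name `find_shared_libs` is hypothetical, and the directories in the usage section are only examples taken from this thread.

```python
import glob
import os


def find_shared_libs(stems, search_dirs):
    """Map each library stem (e.g. 'libcudnn') to the matching files found.

    Only files named '<stem>.so' or '<stem>.so.<version>' are matched, so
    'libcudnn' does not match 'libcudnn_ops_infer.so.8'.
    """
    found = {stem: [] for stem in stems}
    for directory in search_dirs:
        for stem in stems:
            pattern = os.path.join(directory, stem + ".so*")
            found[stem].extend(sorted(glob.glob(pattern)))
    return found


if __name__ == "__main__":
    # Example directories only; replace them with the ones on your machine.
    dirs = ["/usr/lib", "/usr/local/cuda/lib64"]
    for stem, paths in find_shared_libs(["libcudnn", "libcublas"], dirs).items():
        print(stem, "->", paths if paths else "not found")
```

An empty list for a stem means that no file with that name exists in the directories searched, so either the library is installed elsewhere or it is not installed at all.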
Thanks @Gokulnath-V. Your advice resolved my issues with paddle running on gpu.
I've fixed the previous bug by following the instructions. However, a new error now appears: "Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory Please make sure libcudnn_ops_infer.so.8 is in your library path!". Any suggestions?
Same error.
Same error. Do you have a solution for this? Thanks.
Use CUDA 11 instead of 12.
We only release the CUDA 10.2 build of paddlepaddle-gpu on PyPI. If you want to install paddlepaddle-gpu for CUDA 10.2/11.2/11.6/11.7, the install commands are on our website.
https://pypi.org/project/paddlepaddle-gpu/
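To compare the CUDA version a wheel was built for with the installed toolkit, something like the sketch below may help. It assumes that the `.postNNN` suffix of the package version encodes the CUDA version with the last digit as the minor version; this matches the environment reported above (`2.4.1.post117` next to nvcc release 11.7) but is an assumption, not documented behaviour. Both helper names are hypothetical.

```python
import re


def cuda_from_paddle_version(version):
    """Guess the CUDA version from a wheel version such as '2.4.1.post117'.

    Assumption: the last digit of the suffix is the minor version.
    Returns None when the version has no usable '.postNNN' suffix.
    """
    match = re.search(r"\.post(\d+)$", version)
    if match is None or len(match.group(1)) < 2:
        return None
    digits = match.group(1)
    return f"{int(digits[:-1])}.{digits[-1]}"


def cuda_from_nvcc(output):
    """Extract 'major.minor' from the text printed by `nvcc -V`."""
    match = re.search(r"release (\d+\.\d+)", output)
    return match.group(1) if match else None


if __name__ == "__main__":
    wheel = cuda_from_paddle_version("2.4.1.post117")
    toolkit = cuda_from_nvcc("Cuda compilation tools, release 11.7, V11.7.99")
    print("wheel:", wheel, "toolkit:", toolkit, "match:", wheel == toolkit)
```

With the values from the original report both functions return "11.7", so in that environment the wheel and the toolkit agree, and the missing piece is cuDNN rather than a CUDA version mismatch.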
The CUDA version I have installed is 12.3. For this version, I downloaded the cuDNN package and extracted it to /usr/local/cuda/include and /usr/local/cuda/lib64.
cd /usr/lib
sudo ln -s /usr/local/cuda-12.3/targets/x86_64-linux/lib/libcudnn.so.8.9.1 libcudnn.so
sudo ln -s /usr/local/cuda-12.1/targets/x86_64-linux/lib/libcublas.so.12.1.3.1 libcublas.so
A new error has occurred:
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
C++ Traceback(most recent call last):
If you're missing libcudnn_ops_infer.so.8 or similar, you need to do the same thing to add it to your library path. There are a few libraries like this, so I found it easiest to just run sudo ln -s /usr/local/lib/python3.10/dist-packages/nvidia/cudnn/lib/* . (the path will vary depending on where you found the file, e.g. with locate libcudnn_ops_infer.so.8).
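The bulk symlink above can also be sketched in Python, with a dry run first so you can see what would be created. This is only an illustration: the helper name `link_shared_libs` is hypothetical, and the paths in the usage section are the ones mentioned in this thread and will differ on other machines.

```python
import glob
import os


def link_shared_libs(src_dir, dst_dir, dry_run=True):
    """Symlink every '*.so*' file from src_dir into dst_dir.

    Names that already exist in dst_dir are skipped. Returns the list of
    (source, link) pairs that were created, or would be created on a dry run.
    """
    planned = []
    for src in sorted(glob.glob(os.path.join(src_dir, "*.so*"))):
        dst = os.path.join(dst_dir, os.path.basename(src))
        if os.path.lexists(dst):
            continue  # never overwrite an existing file or link
        planned.append((src, dst))
        if not dry_run:
            os.symlink(src, dst)
    return planned


if __name__ == "__main__":
    # Example paths from this thread; adjust to where your cuDNN files live.
    src = "/usr/local/lib/python3.10/dist-packages/nvidia/cudnn/lib"
    for source, link in link_shared_libs(src, "/usr/lib", dry_run=True):
        print(link, "->", source)
```

Passing `dry_run=False` actually creates the links; writing into /usr/lib requires root, just like the `sudo ln -s` command.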
I tried the same, but the error still exists.
Same as you: I tried @JoshC8C7's tip, but it still does not work.
My problem is resolved. I installed cuDNN properly and am using paddlepaddle-gpu==2.6.0. My CUDA version is 12.2 and my NVIDIA driver version is 535.