[Build] Issues with CUDA 11.4 and ONNX Runtime 1.11.0

Open HShamimGEHC opened this issue 1 year ago • 8 comments
Describe the issue

onnxruntime:Default, provider_bridge_ort.cc:1022 Get] Failed to load library libonnxruntime_providers_cuda.so with error: libcublas.so.10: cannot open shared object file: No such file or directory

[W:onnxruntime:Default, onnxruntime_pybind_state.cc:552 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.

Urgency

Urgent

Target platform

Docker on NVIDIA Jetson AGX Xavier

Build script

RUN wget https://nvidia.box.com/shared/static/2sv2fv1wseihaw8ym0d4srz41dzljwxh.whl -O onnxruntime_gpu-1.11.0-cp38-cp38-linux_aarch64.whl && \
    pip3 install onnxruntime_gpu-1.11.0-cp38-cp38-linux_aarch64.whl

Install CUDA toolkit

RUN apt-get update && apt-get install -y cuda-toolkit-11-4 && rm -rf /var/lib/apt/lists/*

I was provided only a model.onnx file, which I am trying to load so that I can run inference.

Error / output

onnxruntime:Default, provider_bridge_ort.cc:1022 Get] Failed to load library libonnxruntime_providers_cuda.so with error: libcublas.so.10: cannot open shared object file: No such file or directory

[W:onnxruntime:Default, onnxruntime_pybind_state.cc:552 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.

Visual Studio Version

No response

GCC / Compiler Version

No response

HShamimGEHC avatar Feb 23 '24 20:02 HShamimGEHC

From the error message, the wheel was built with CUDA 10.

Please follow this page to install the proper version of JetPack (which will install the matching version of CUDA): https://elinux.org/Jetson_Zoo#ONNX_Runtime For example, 1.11 matches JetPack 4.4 / 4.4.1 / 4.5 / 4.5.1 / 4.6 / 4.6.1.

To build from source, please take a look at this document: https://onnxruntime.ai/docs/build/eps.html#nvidia-jetson-tx1tx2nanoxavier

tianleiwu avatar Feb 23 '24 21:02 tianleiwu
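
The diagnosis above reads the CUDA major version off the missing library name in the loader error. A minimal sketch of that reading in Python follows. The function name `missing_library` is hypothetical, and the idea that the cuBLAS soname major version tracks the CUDA major version is an assumption that holds for the CUDA 10 and 11 releases discussed here, not a general rule.

```python
import re

# Matches a versioned shared-object name right before the loader's
# "cannot open shared object file" text, e.g. "libcublas.so.10".
_MISSING = re.compile(
    r"(lib[\w.+-]+?\.so)\.(\d+)[\d.]*: cannot open shared object file"
)


def missing_library(log_line):
    """Return (library, major_version) for a loader error, or None.

    Hypothetical helper: it only parses the text of the message and
    does not inspect the system.
    """
    match = _MISSING.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


line = (
    "Failed to load library libonnxruntime_providers_cuda.so with error: "
    "libcublas.so.10: cannot open shared object file: No such file or directory"
)
print(missing_library(line))
```

For the log line in this issue, the helper reports `libcublas.so` with major version 10, which is why the wheel appears to expect CUDA 10 rather than the CUDA 11.4 toolkit installed in the container.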

Because this is running in Docker, is the following acceptable:

Without changing the JetPack version, install CUDA 10.0 through the Dockerfile?

My current JetPack version is 5.1.2 and the L4T version is 35.4.1, but I am trying to do all of this in a Docker container.

HShamimGEHC avatar Feb 23 '24 21:02 HShamimGEHC

The doc mentioned that CUDA version 11.8 with JetPack 5.1.2 has been tested on Jetson when building ONNX Runtime 1.16.

I guess the docker container r35.4.1 has CUDA 11.4. In that case, you can try onnxruntime-gpu 1.16 or 1.17.

tianleiwu avatar Feb 23 '24 22:02 tianleiwu

I see. I decided to use 1.16. My Dockerfile starts with:

FROM nvcr.io/nvidia/l4t-base:35.4.1

and that image should already contain CUDA 11.4, so why did I still have to run:

RUN apt-get update && apt-get install -y cuda-toolkit-11-4 && rm -rf /var/lib/apt/lists/*

to get past the first set of errors regarding libcublas?

HShamimGEHC avatar Feb 23 '24 23:02 HShamimGEHC

I should also have cuDNN 8.6.0, but my next error is this:

2024-02-23 22:55:09.263697843 [E:onnxruntime:Default, provider_bridge_ort.cc:1480 TryGetProviderInfo_CUDA] /home/ort/onnxruntime/onnxruntime/core/session/provider_bridge_ort.cc:1193 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcudnn.so.8: cannot open shared object file: No such file or directory

HShamimGEHC avatar Feb 23 '24 23:02 HShamimGEHC

@yf711 can you advise?

jywu-msft avatar Feb 24 '24 03:02 jywu-msft

I used to follow this post to deploy a Docker container for Jetson. Please let me know whether you can deploy an environment that fits your CUDA/cuDNN requirements. Thanks!

davidlee8086 avatar Feb 24 '24 03:02 davidlee8086

Hi @HShamimGEHC, https://github.com/dusty-nv/jetson-containers offers a wide variety of containers designed for Jetson. Feel free to pick one that works for your case.

yf711 avatar Feb 24 '24 21:02 yf711

Hi @yf711, sure, I can give that a try. Since I need ONNX Runtime, CUDA, and cuDNN, how can I use them in the Dockerfile that I am trying to create after downloading them? If you could provide some insight into that, I would greatly appreciate it.

HShamimGEHC avatar Feb 26 '24 15:02 HShamimGEHC

Hi @davidlee8086 - I was trying to follow this but if I plan to install each of the containers I need separately, I find myself running out of storage on my Jetson AGX Xavier.

HShamimGEHC avatar Feb 26 '24 16:02 HShamimGEHC

is libcudnn.so in your LD_LIBRARY_PATH?

jywu-msft avatar Feb 26 '24 16:02 jywu-msft
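
One way to answer the question above from inside the container is to look for the file in each `LD_LIBRARY_PATH` entry. The sketch below uses only the Python standard library; the function name `find_in_ld_library_path` is hypothetical. An empty result is not conclusive, because the dynamic loader also consults the `ldconfig` cache and its default directories.

```python
import os
from pathlib import Path


def find_in_ld_library_path(soname, ld_library_path=None):
    """List every LD_LIBRARY_PATH entry that contains `soname`.

    Hypothetical helper: it checks only the directories named in
    LD_LIBRARY_PATH (or in the string passed in), not the ldconfig
    cache or the loader's built-in default directories.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in ld_library_path.split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a trailing ':'
        candidate = Path(entry) / soname
        if candidate.exists():
            hits.append(str(candidate))
    return hits


if __name__ == "__main__":
    # The soname is taken from the error message in this thread.
    print(find_in_ld_library_path("libcudnn.so.8"))
```

If the list comes back empty, running `ldconfig -p` and searching its output for the same name would show whether the loader can find the library through its cache instead.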

I checked and it wasn't there. I decided to scroll through the NVIDIA NGC Page and stumbled on this dockerfile: https://gitlab.com/nvidia/container-images/l4t-jetpack/-/blob/master/Dockerfile.jetpack?ref_type=heads

(It includes commands for downloading CUDA and cuDNN.) This resolved the errors about missing CUDA and cuDNN libraries.

The last follow up I have is in regard to onnxruntime. How should I know which onnxruntime to download from the Jetson Zoo link: https://elinux.org/Jetson_Zoo#ONNX_Runtime? Should I just use the one that corresponds to the version of Jetpack SDK that my Jetson Xavier is on?

HShamimGEHC avatar Feb 26 '24 16:02 HShamimGEHC

Yes, using the package that corresponds to your JetPack version is the best option. Otherwise, you will need to bring in the appropriate dependencies yourself.

jywu-msft avatar Feb 27 '24 04:02 jywu-msft
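
Once a wheel matching the JetPack version is installed, a short check can confirm that the CUDA execution provider actually loads instead of silently falling back to CPU. The sketch below is an illustration, not part of the thread: `order_providers` and `open_session` are hypothetical names, while `get_available_providers`, `InferenceSession`, and `get_providers` are the ONNX Runtime Python calls it relies on.

```python
def order_providers(available):
    """Return the providers to request, CUDA first, CPU as fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [name for name in preferred if name in available]


def open_session(model_path):
    """Create a session and report which providers it really uses.

    onnxruntime is imported here so the helper above can be used
    without the package installed.
    """
    import onnxruntime as ort

    requested = order_providers(ort.get_available_providers())
    session = ort.InferenceSession(model_path, providers=requested)
    # get_providers() reflects what was registered for this session,
    # which can differ from the request if a provider library failed to load.
    return session, session.get_providers()


if __name__ == "__main__":
    session, active = open_session("model.onnx")
    print(active)
```

If `CUDAExecutionProvider` is missing from the printed list, the warnings quoted at the top of this issue are the place to look for the library that failed to load.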